Mark Lloyd of Seattle, Washington recently e-mailed me the following essay of his. I invited him to let me post it here on the blog as a guest contribution, and he agreed. Here it is, with light editing.
A couple of years ago I wrote a computer program simulating a basketball team whose players' shooting percentages varied from day to day according to distributions with a mean of 50 percent but with different variances. I then tested the strategy of giving the ball to players who had made two shots in a row against the strategy of giving the ball to players at random. In the long term, giving the ball to the hot hand was a winning strategy.
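The program itself isn't shown, but a minimal sketch conveys the idea. The roster size, per-player variances, clipping bounds, and the exact "two in a row" rule below are all illustrative assumptions, not details from the original program:

```python
import random

def play_game(sigmas, shots=100, use_hot_hand=True, rng=None):
    """Simulate one game and return the team's made shots.

    Each player's make probability for the day is drawn with mean 0.5 and a
    player-specific spread (clipped to [0.05, 0.95]), matching the essay's
    setup of 50-percent shooters with different variances.
    """
    rng = rng or random
    probs = [min(max(rng.gauss(0.5, s), 0.05), 0.95) for s in sigmas]
    streaks = [0] * len(sigmas)
    made = 0
    for _ in range(shots):
        if use_hot_hand:
            # Feed any player whose last two shots both went in; else pick randomly.
            hot = [i for i, s in enumerate(streaks) if s >= 2]
            shooter = rng.choice(hot) if hot else rng.randrange(len(sigmas))
        else:
            shooter = rng.randrange(len(sigmas))
        if rng.random() < probs[shooter]:
            made += 1
            streaks[shooter] += 1
        else:
            streaks[shooter] = 0
    return made

# Averaged over many simulated games, feeding the hot hand makes more shots
# per 100 attempts than distributing the ball at random.
rng = random.Random(0)
hot_avg = sum(play_game([0.05, 0.1, 0.2], use_hot_hand=True, rng=rng) for _ in range(500)) / 500
rand_avg = sum(play_game([0.05, 0.1, 0.2], use_hot_hand=False, rng=rng) for _ in range(500)) / 500
```

Players who have just made two shots are disproportionately players whose probability draw came out high that day, so the streak rule acts as an imperfect filter for daily skill, which is the essay's point.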
Thinking about this recently, I came up with an analogy that makes clear the difference between a casino streak and the hot hand, and the mistake many statisticians make in assuming that believing in the hot hand is just the gambler's fallacy of believing one should bet on a roulette wheel that has shown a streak of results.
Let's say we have an imperfect casino with five roulette tables that, if in perfect condition, would select red and black 50 percent of the time each. The casino's roulette tables, however, are in disrepair, and worse, the casino sits above a subway line that occasionally shakes the tables, causing their red/black percentages to drift anywhere between 40-60 and 60-40, with a mean of 50-50. There is no way to observe the condition of a roulette wheel except to watch the results of wagers at its table.
With most visitors not having the time for a long-term statistical analysis of the tables, is wagering on a table that has hit red or black twice in a row a better strategy than betting on a random color at a random table? Clearly yes: a table that is hitting red 60 percent of the time will hit red twice in a row 36 percent of the time, while a table hitting red 40 percent of the time will do so only 16 percent of the time. With the tables ranging anywhere between 40-60 and 60-40, this two-in-a-row percentage varies between 16 and 36 percent. More often than not, a table that has just hit the same color twice in a row is one currently favoring that color.
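These percentages can be turned into an explicit posterior. In a simplified two-state version of the model (an assumption for illustration: each table is either 40-60 or 60-40 on red, equally likely), seeing red twice in a row makes the table roughly twice as likely to be the red-favoring one:

```python
# Two equally likely states for a shaken table: red comes up 40% or 60% of the time.
states = (0.4, 0.6)
prior = 0.5

# Each state's chance of producing red twice in a row (the 16% and 36% above).
pair_prob = {p: p * p for p in states}   # {0.4: 0.16, 0.6: 0.36}

# Bayes: posterior probability the table favors red, given we just saw red-red.
post_hot = (pair_prob[0.6] * prior) / (pair_prob[0.6] * prior + pair_prob[0.4] * prior)
# post_hot = 0.36 / 0.52, about 0.69

# Chance the NEXT spin is red if we bet red after the red-red streak.
p_next_red = post_hot * 0.6 + (1 - post_hot) * 0.4
# about 0.54, better than the 0.50 of a blind bet
```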
If every day you walk into the casino, wait for a table to hit a color twice in a row, and then bet on that table and color, over time you will do better than if you bet on a random table and color.
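That strategy can be simulated directly. This is a sketch under assumed rules (each table's red probability redrawn uniformly between 0.4 and 0.6 daily, one even-money dollar bet per day, and the first two-in-a-row anywhere in the room taken as the signal):

```python
import random

def daily_bet(rng, tables=5, hot_hand=True):
    """One day at the imperfect casino; returns the payout of a $1 even-money bet."""
    # The subway leaves each table with some red probability between 0.4 and 0.6.
    probs = [rng.uniform(0.4, 0.6) for _ in range(tables)]
    if not hot_hand:
        # Control strategy: bet red at a random table (the colors are symmetric).
        return 2 if rng.random() < probs[rng.randrange(tables)] else 0
    # Watch the tables spin until one shows the same color twice in a row,
    # then bet that color on that table's next spin.
    last = [None] * tables
    while True:
        for i, p in enumerate(probs):
            color = 'R' if rng.random() < p else 'B'
            if color == last[i]:
                p_win = p if color == 'R' else 1 - p
                return 2 if rng.random() < p_win else 0
            last[i] = color
```

In this sketch the streak-following bet returns a little over a dollar per dollar wagered on average, while the random bet averages exactly a dollar; the edge is small but real.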
A real-life non-statistician watching for a hot hand in a basketball game may well be thinking something like: "He hit two in a row; I think he is shooting better than average today." The non-statistician has also made a correct observation: at least in the case of basketball, daily shooting percentages vary more than one would expect by chance. In the course of a game there is limited information with which to measure this variance, so looking for shooting streaks is an imperfect way to find players who have higher underlying skill that day. The statistician's job is to operationalize that observation.
What statisticians will observe is that in the short term, on any given day, if you measure any roulette table in the imperfect casino, the percentage of hitting a color after hitting the same color twice in a row will not be different than after any other sequence (unless a subway train passes underneath). This is the same as observing a basketball player on a given day (or a given hour, depending on that shooter's pattern of consistency). This observation misleads the statistician into believing the hot hand is no different from a winning streak in a perfect casino; but the statistician is asking a different question than the non-statistician is.
The statistician should instead be asking how the frequency of streaks varies from day to day; that question more closely operationalizes what the non-statistician is observing, and answering it would make the statistician wealthier as well.
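The two questions give different measurements, and the difference is easy to see in a sketch (the shot counts and daily percentages below are assumptions for illustration):

```python
import random

def day_stats(p, shots, rng):
    """One day of shooting at a fixed make probability p.

    Returns (frequency of back-to-back makes,
             observed make rate immediately after two straight makes).
    """
    makes = [rng.random() < p for _ in range(shots)]
    pairs = sum(makes[i] and makes[i + 1] for i in range(shots - 1))
    after_streak = [makes[i + 2] for i in range(shots - 2) if makes[i] and makes[i + 1]]
    cond = sum(after_streak) / len(after_streak) if after_streak else None
    return pairs / (shots - 1), cond

# Within any single day the shooter is, by construction, memoryless, so the
# post-streak make rate just hovers around that day's p: the within-day test
# finds "no hot hand" whether p is 0.4 or 0.6.
# The pair frequency, though, averages about p*p: roughly 0.16 on a 40% day
# and 0.36 on a 60% day. Tracking how streak frequency varies from day to day
# is what actually separates hot days from cold ones.
```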
The world of sports is a world of imperfect casinos. This confounds statisticians.
3 comments:
Statistics is certainly applicable to imperfect casinos, as long as "the average" is not the only statistic you use. In fact, statistics can tell you whether a given casino is perfect or not, as long as you have enough samples including some that cover the imperfect behavior (subway train).
"... if you measure any roulette table in the imperfect casino, the percentage of hitting a color after hitting the same color twice in a row will not be different than after any other sequence (unless a subway train passes underneath)."
But if your sample includes 100% of all rolls over many days, then it certainly will include some influenced by subway, in which case the statistician WILL find that the percentage of repeating a color is (slightly) higher after two in a row of that color than after any other sequence. If the statistician didn't find that result, then it is not true that "If every day you walk in to the casino, wait for a table to hit a color twice in a row, then bet on that table and color, over time you will do better than if you randomly bet on a random table and color." It's the exact same measurement, just expressed in different words.
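This point checks out under a simple two-state version of the shaken-table model (an illustrative assumption: on any given day a table's red probability is either 0.4 or 0.6, equally likely). Pooled across days the overall red rate is 50 percent, yet the red rate immediately after a red-red streak is higher, because streaks disproportionately come from red-favoring days:

```python
# Day-to-day mixture: a table's red probability is 0.4 or 0.6, equally likely,
# with spins independent given that day's probability.
states = (0.4, 0.6)

# Pooled long-run red rate is the plain mixture average: 0.5.
p_red = sum(states) / len(states)

# Pooled rate of red following red-red is P(RRR)/P(RR) = E[p^3] / E[p^2],
# since conditioning on a red-red streak re-weights toward red-heavy days.
e_p2 = sum(p ** 2 for p in states) / len(states)   # 0.26
e_p3 = sum(p ** 3 for p in states) / len(states)   # 0.14
p_red_after_rr = e_p3 / e_p2                       # about 0.538, above 0.5
```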
For purposes of this analogy, the subway shakes the tables in such a way that they all average 50 percent red/black over the long term. The percentage of hits after consecutive red or black hits should be the same as after other sequences in the long term.
In our example we have inside information that the roulette tables' performance varies from day to day more than chance would predict. We don't care about any SINGLE table's performance after a streak as compared to not after a streak; we are looking for indications of which table is hot. That players' shooting percentages vary on a daily basis more than expected by chance is not inside information, and it is the key to understanding the problem.
Let's imagine an even more poorly run casino further down the subway line. As in our original example, part of its problem is that it allows a 50/50 bet with no house advantage. It has two tables that, when shaken, end up either 60/40 (favoring a single color) or 50/50. Over 100 days you bet a dollar on the first table that hits a color twice in a row. Again you win, betting on the better table 69.4 percent of the time (25/36) and then winning 60 percent of the time on those bets.
On your 100 dollars bet you can expect about 83.3 dollars back from the 60/40 table and 30.6 dollars from the 50/50 table, for a total of about 114 dollars. Of course your results will vary over multiple 100-day sequences.
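Taking the comment's 25/36 figure at face value, the expected return is a two-line calculation (an even-money payout is assumed, so a winning $1 bet returns $2):

```python
p_good = 25 / 36          # fraction of bets placed at the 60/40 table
stake = 100               # one hundred $1 bets

money_back_good = stake * p_good * 0.60 * 2        # about 83.3 dollars
money_back_fair = stake * (1 - p_good) * 0.50 * 2  # about 30.6 dollars
total_back = money_back_good + money_back_fair     # about 113.9 dollars
```

So the strategy returns roughly 114 dollars per 100 wagered in expectation, a healthy edge on a bet with no house advantage.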
The notion that the hot hand is a naive belief is based on constructing the problem to match the conditions of the gambler's fallacy. It should be a trivial observation that a basketball player who is shooting with higher skill on a given day will have more streaks, and that a naive observer is trying to choose among players to take a shot with limited information; yet the consensus (if Wikipedia is to be believed) is that the hot hand is a fallacy.
Andy, you are correct that a statistician hired by a casino and uninterested in arcane hot hand debates would probably identify the defective tables.