The Hot Hand in Sports: August 2020

Friday, August 28, 2020

Review of the "Other" Hot Hand Book

I have just finished reading The Hot Hand, a March 2020 release by Wall Street Journal writer Ben Cohen. I had cornered the market on hot hand books from 2011 to 2020 (see image in right-hand column), but I welcome Mr. Cohen's book. Actually, I feel the two books are very different. Whereas mine was devoted entirely to sports and contained many statistical analyses, the new book delves into many areas outside of sports (e.g., art, the stock market, detective work) and, with one crucial exception (discussed later), is fairly light on mathematics, statistics, and probability.

I found Cohen's examination of where hot hands appear to exist (or not exist) to be fair and well-contextualized. That doesn't mean I agree with every conclusion in the book, but I think the overall tone was appropriate.

There are two main themes (as best I can tell) that tie together the sports and non-sports examples throughout the book. One is how the conventional wisdom on a given question can change at any point (e.g., is there or is there not a hot-hand effect beyond chance? is a particular painting truly from a great master or a fraud?), sometimes going back and forth multiple times. The second theme is how ongoing advances in technology (e.g., high-speed sky-cam videos of basketball games; x-rays of paintings) can contribute to changes in conventional wisdom.

From a sports perspective, the book ends with a one-two punch that argues for a hot hand in basketball shooting. The first is a 2014 study by three extremely savvy Harvard undergraduates, who used sky-cam video data of NBA games to take into account, with great specificity, the difficulty of each shot (distance from the hoop, defensive presence, etc.), which the earliest studies of NBA shooting could not do (here and here).

The Harvard study found a roughly two percent improvement in future shooting success coming off of a couple of makes. In Cohen's words, "While the result itself was modest, the meaning of it was monumental" (p. 204). I must demur. Over the past few decades, there has been a movement in psychology and related fields to emphasize effect sizes -- how much impact one variable has on another or how strongly two variables correlate. Hence, I'm inclined to assign modest importance to a finding of modest magnitude. The Harvard researchers themselves described their own results as a "small blaze" (p. 204).

Next to enter the scene were the young economists Josh Miller and Adam Sanjurjo. Miller and Sanjurjo's intellectual contribution -- discovering a counter-intuitive and heretofore undetected bias in a seemingly basic statistical calculation -- is certainly formidable. Yet, as I wrote in 2015, I find the practical magnitude of Miller and Sanjurjo's insight to be relatively modest, as well.

If a basketball shooter had a long-term track-record of 50% on three-pointers and then made a few in a row, the standard analytic approach pre-Miller-Sanjurjo would have been simply to assess whether the player hit shots at a greater clip than 50% for some number of shots after the initial set of consecutive makes. It turns out, however, that the proper baseline for judging success after a hit for a long-term 50% shooter is actually not 50%, but 42%. As Cohen succinctly puts it, "If a 50 percent shooter was shooting 50% [over the long term], he was actually beating the odds" (p. 227).
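For readers who want to see where a below-50% baseline comes from, here is a minimal sketch (my own illustration, not Miller and Sanjurjo's code) that lists every possible hit/miss sequence for a very short run of shots and averages, across sequences, the proportion of hits on shots that immediately follow a hit. The function name and its parameters are made up for this example, and the exact figure depends on how many shots are in the sequence and how long a streak is required.

```python
from fractions import Fraction
from itertools import product


def mean_proportion_after_streak(n, k, p=Fraction(1, 2)):
    """Exact expected proportion of hits on shots that follow k straight hits.

    Enumerates all 2**n hit/miss sequences, so it is only practical for small n.
    Sequences with no shot that follows k straight hits are left out.
    """
    total = Fraction(0)   # probability-weighted sum of per-sequence proportions
    weight = Fraction(0)  # probability of sequences with at least one opportunity
    for seq in product((1, 0), repeat=n):
        # Outcomes of the shots taken immediately after k consecutive hits.
        follow_ups = [seq[i] for i in range(k, n) if all(seq[i - k:i])]
        if not follow_ups:
            continue
        hits = sum(seq)
        prob = p ** hits * (1 - p) ** (n - hits)
        total += prob * Fraction(sum(follow_ups), len(follow_ups))
        weight += prob
    return total / weight


for shots in (3, 4):
    value = mean_proportion_after_streak(n=shots, k=1)
    print(shots, value, round(float(value), 3))
```

For a 50% shooter and a "streak" of just one hit, this gives 5/12 (about .417) with three shots and 17/42 (about .405) with four shots, rather than .50. The shortfall arises because the proportion is computed within each sequence and then averaged, which gives short, miss-heavy sequences the same weight as any other.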

The following graph from one of Miller and Sanjurjo's working papers conveys this idea visually (their derivations are far above my mathematical expertise). The graph is divided into three sections, one for when a player's true probability of success is .75 (top), one for when his or her true probability is .50 (middle), and one for when his or her true probability is .25 (bottom). The key thing to look at is the vertical discrepancy between a given dashed line (representing true probability) and the solid color lines. Each color line, which is a function of the total number (n) of shots in a sequence and the length (k) of a hot streak, tells us the new baseline to use in judging a player's future shooting success.

A concrete example should help. I have annotated the graph to highlight a particular point on the top red curve: a true .75 shooter who has made five shots in a row within a sequence of 20 shots. As shown at the end of the grey horizontal line I added from the target data-point to the y-axis, that player should be judged against a standard of .61 for whether he or she is "hot" over his or her next sequence of shots. Reiterating Cohen's explanation (above), a player who shot, say, .66 or .72 would be considered "hot" (i.e., above .61), even though both figures fall below the player's underlying true shooting percentage of .75.

Miller and Sanjurjo note that "as n gets larger, the difference between expected conditional relative frequencies and respective probabilities of success generally decrease..." In other words, the bias they demonstrated tends to diminish with an increasing number of shots and, under certain conditions, approaches zero. Note that, with a sequence of 100 shots and a p = .75 underlying probability, the colored lines start getting really close to the dashed line. Miller and Sanjurjo add, though, that if an athlete has compiled a very long streak (k = 5 straight hits, depicted above in red), even with a very long sequence (n = 100), a substantial bias can remain (e.g., for a p = .50 shooter, there is still a .15 bias, namely .50 on the dashed line, minus .35 on the red line).
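The shrinking of the bias with longer sequences can be checked numerically. Below is a rough sketch, assuming independent shots with a fixed success probability, that computes the expected after-streak proportion exactly by tracking three running quantities: the current streak (capped at k), the number of shots taken right after k straight hits, and the number of those shots that went in. This is my own bookkeeping approach rather than Miller and Sanjurjo's derivation, and the function name is hypothetical.

```python
from collections import defaultdict


def expected_proportion(n, k, p):
    """Expected proportion of hits on shots that follow k straight hits,
    for n independent shots that each succeed with probability p."""
    # State: (current streak capped at k, opportunities so far, hits on them).
    states = {(0, 0, 0): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (streak, opps, hits), prob in states.items():
            counts = 1 if streak == k else 0  # this shot follows k straight hits
            nxt[(min(streak + 1, k), opps + counts, hits + counts)] += prob * p
            nxt[(0, opps + counts, hits)] += prob * (1 - p)
        states = nxt
    # Average the per-sequence proportion over sequences with an opportunity.
    usable = sum(pr for (_, opps, _), pr in states.items() if opps > 0)
    total = sum(pr * hits / opps
                for (_, opps, hits), pr in states.items() if opps > 0)
    return total / usable


for n in (4, 10, 20, 50, 100):
    print(n, round(expected_proportion(n, k=1, p=0.5), 3))
```

Each printed value is the adjusted baseline for a 50% shooter after a single hit, and the values climb toward .50 as n grows. Calling the function with n=20, k=5, and p=0.75 lets you compare its answer against the value I read off the graph for the annotated point.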

One question I have about Miller and Sanjurjo's formulation is how it maps onto fan psychology. As we've seen, a 50% shooter who follows up a few straight hits with, say, a sequence of 47% success is mathematically hot (i.e., exceeding the adjusted baseline of 42%). Yet, I can't imagine fans following the reasoning (as correct as it is) that, "Hey, this 50% shooter is now hitting 47%. He (or she) is on fire!"

A quibble I have with Cohen is his lack of discussion of sports other than basketball. Solid evidence for a sports hot hand has been around since 2004, in the case of professional bowling (Dorsey-Palmateer & Smith, 2004). Cohen knew about this study, as it is included in his bibliography for The Hot Hand (p. 271).

Hot hand research has now been going on for 35 years, dating from the famous Gilovich, Vallone, and Tversky (1985) study. I can see research progressing in several directions, such as the aforementioned fan psychology and more research with actual game data. There is also research showing -- in the opposite direction from Miller and Sanjurjo -- that certain hot-hand estimation methods may overstate the extent of a hot hand (Cotton, McIntyre, & Price, 2016). Thus, some of the analytic formulations may have to be reconciled. Lastly, Cohen's epilogue followed Tom Gilovich as he was conducting some new studies. I eagerly await what Gilovich has to report. In short, there's plenty of grist for the hot-hand research mill. I'm currently 57 years old, so I would have to live to 92 to see if hot-hand research makes it another 35 years!