Saturday, March 29, 2014

Michigan's 3PT Shooting: An Illustration of Regression to the Mean

Despite holding a 60-45 lead over Tennessee with 10:57 left in last night's NCAA Sweet Sixteen game, the Michigan men's basketball team had to sweat things out for a 73-71 win (play-by-play sheet). One reason the Wolverines were unable to coast to a blow-out win over the Volunteers was a drop in Michigan's three-point shooting percentage from .778 (7-of-9) in the first half to .364 (4-of-11) in the second.

Although there could be substantive reasons for the Wolverines' second-half decline from behind the arc (e.g., fatigue, better Tennessee defense), the phenomenon of regression toward the mean almost certainly contributed as well. Regression toward the mean refers to the tendency of performers who exhibit extreme values on an initial set of measurements -- on either the high or low end -- to perform closer to an average level on later measurements. According to the Social Research Methods website, regression toward the mean:

will happen anytime you measure two measures! It will happen forwards in time (i.e., from pretest to posttest). It will happen backwards in time (i.e., from posttest to pretest)! It will happen across measures collected at the same time (e.g., height and weight)! It will happen even if you don't give your program or treatment. 
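To make the idea concrete, here is a minimal simulation sketch in Python. It is not based on Michigan's actual box scores: it assumes a hypothetical team whose every three-point attempt goes in with a fixed probability of .402, that takes 10 attempts per half (an assumed round number), and whose halves are completely independent of each other. The function names are made up for illustration. Because nothing in the simulation links the two halves, any "cooling off" after a hot first half is regression toward the mean and nothing else.

```python
import random


def simulate_halves(true_rate=0.402, attempts=10, games=20000, seed=0):
    """Return (first_half_pct, second_half_pct) pairs for simulated games.

    Assumption: every attempt is an independent coin flip with the same
    success probability, so the two halves of a game are unrelated.
    """
    rng = random.Random(seed)

    def half():
        made = sum(rng.random() < true_rate for _ in range(attempts))
        return made / attempts

    return [(half(), half()) for _ in range(games)]


def mean(values):
    return sum(values) / len(values)


pairs = simulate_halves()

# Second-half percentages in games with an extreme first half.
after_hot = [second for first, second in pairs if first >= 0.7]
after_cold = [second for first, second in pairs if first <= 0.1]

print(f"second-half mean after a hot first half (>= .700):  {mean(after_hot):.3f}")
print(f"second-half mean after a cold first half (<= .100): {mean(after_cold):.3f}")
```

Under these assumptions, both second-half averages should land near the underlying .402 rate, whether the first half was scorching or ice cold.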

Using box scores from all of Michigan's 2013-14 games to date (contained in UM's game notes in advance of Sunday's Elite Eight match-up with Kentucky), I plotted the Wolverines' team three-point shooting percentages for the first and second half of each game played this season. Each line in the graph links the two halves of the same game, with the Tennessee game depicted in orange, as one example (there were too many games, 36, to label each line).

Regression to the mean is indicated by lines that slope from very high to the middle, and lines that slope from very low to the middle. Also shown in the graph is Michigan's .402 three-point success rate for the season to this point. The Wolverines' pattern is a textbook example of regression toward the mean, as can be seen by comparing the above graph to this diagram from a textbook (Campbell and Kenny's A Primer on Regression Artifacts).

When Michigan (or any team) hits close to 80% of its treys in a half of one game, it is unlikely that it can match or exceed that rate in the other half. It is also true that a team shooting .100 or worse for a half will rarely* match or drop below that level in the other half.
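A rough back-of-the-envelope check of this claim can be done with the binomial distribution. The sketch below assumes (this is a simplification, not something established by the data) that each attempt is independent and goes in at Michigan's season rate of .402; the helper name `prob_at_least` is hypothetical. It computes the chance of making at least 7 of 9 attempts in a half, and the chance of matching that .778 rate on 11 attempts, which requires at least 9 makes.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


SEASON_RATE = 0.402  # Michigan's season three-point rate cited in this post

# Chance of a half at least as hot as 7-of-9.
p_hot = prob_at_least(7, 9, SEASON_RATE)

# Chance of matching .778 on 11 attempts: 8-of-11 is only .727,
# so at least 9 makes are needed.
p_match = prob_at_least(9, 11, SEASON_RATE)

print(f"P(at least 7 of 9)  = {p_hot:.4f}")
print(f"P(at least 9 of 11) = {p_match:.4f}")
```

Under these assumptions, both probabilities come out to only a few percent or less, which is why a follow-up half at the same level would be surprising.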

As noted above, regression to the mean is virtually certain to occur anytime multiple, imperfectly correlated measurements are obtained. The above depiction for Michigan is probably more dramatic than would be the case for most other teams, as most teams presumably are less capable than the Wolverines of reaching three-point shooting percentages of .600 or .700 within a half. Out of 351 NCAA Division I men's basketball teams, Michigan finished the regular season tied for seventh nationally in three-point shooting percentage.

*I inadvertently omitted the word "rarely" from the original version of this posting.
