Sunday, January 28, 2007



As can be seen from the little logo I made, today is the fifth anniversary of the launching of the Hot Hand website. Fittingly, as I'm putting the finishing touches on this write-up, the Phoenix Suns (whose game against Cleveland I've had on in the background) have just won their 17th straight game!

In addition to all the intrinsic fun, operating the site has brought me into contact with a number of sports-minded statisticians and decision researchers, as well as sports reporters, whom I don't think I would have met otherwise (see photo on this page). Numerous people have been very supportive of my efforts over the years, which I greatly appreciate. I would like to cite two individuals in particular, one from academia and the other from the baseball world, for lending their gifts to the page.

One is Professor Tom Gilovich of Cornell University, a co-founder of hot hand research (Gilovich, Vallone, & Tversky, 1985, Cognitive Psychology), who was the guest for the first-ever hot hand chat, in 2002. The other is prolific baseball analyst and writer Bill James, who contributed an original study on George Brett and Tony Gwynn to the site in 2005, and has sent along additional commentaries as well.

With my June 2006 switch to the present blog format, documents archived at my previous site (such as those related to Tom and Bill) are no longer available online; I've saved electronic copies, though, so if you're interested in these, please e-mail me via the link to my faculty webpage.

To mark this anniversary, I thought I'd summarize what I think are some of the biggest new developments in hot hand research over the past five years (the summaries below alternate in red and in black font, to set them apart visually). If you have additional suggestions or want to discuss one of the areas I've raised, please submit a comment via the link at the bottom of this write-up.

*Whereas my (and others') analyses of events such as the NBA All-Star Long-Distance Shootout (three-point shooting contest) and MLB All-Star Home Run Derby still have not found much evidence for the probability of a success following a success exceeding the probability of a success following a failure, there have been findings of streakiness in other sports.

The chances of detecting streakiness would seem to be enhanced when players could execute a short, simple motion (e.g., swing or stroke) in relation to the ball, which could be repeated often and in short succession. That way, a player could rehearse and remember how he or she executed a successful motion and apply it repeatedly. Consistent with this reasoning, recent studies have shown some evidence of hot hands in bowling (Dorsey-Palmateer & Smith, 2004, The American Statistician), tennis (Klaassen & Magnus, 2001, Journal of the American Statistical Association), and golf putting (Gilden & Wilson, 1995, Psychonomic Bulletin & Review).
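For readers who want to try the basic comparison themselves (probability of a success following a success vs. probability of a success following a failure), here is a minimal Python sketch. The function name and the short example sequence are made up purely for illustration; they are not data from any of the contests or studies mentioned above.

```python
def conditional_hit_rates(shots):
    """Return (P(hit | previous hit), P(hit | previous miss)) for a 0/1 sequence.

    Either value is None if no shot in the sequence follows that kind of outcome.
    """
    pairs = list(zip(shots, shots[1:]))  # (previous shot, current shot)
    after_hit = [cur for prev, cur in pairs if prev == 1]
    after_miss = [cur for prev, cur in pairs if prev == 0]
    p_after_hit = sum(after_hit) / len(after_hit) if after_hit else None
    p_after_miss = sum(after_miss) / len(after_miss) if after_miss else None
    return p_after_hit, p_after_miss


# Made-up sequence for illustration only (1 = make, 0 = miss).
example_shots = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
p_hit_hit, p_hit_miss = conditional_hit_rates(example_shots)
print(p_hit_hit, p_hit_miss)
```

A hot hand would show up as the first number reliably exceeding the second across many sequences; with a sequence this short, any difference is well within chance.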


*The realization that the athletes who are the most likely to go on hot streaks are those who already have among the highest percentages of success in their respective sports (think Joe DiMaggio's 56-game hitting streak or Tiger Woods's 142-tournament streak of always making the cut) has some interesting implications. One of them is that it may be optimal after all to pass the basketball to a player on a hot streak -- not because making shots increases that player's shooting percentage on future shots above his or her typical percentage, but because the player on a hot streak is likely to be the team's best shooter overall. Bruce Burns, who published a study making this point in Cognitive Psychology in 2004, was kind enough to elaborate upon this theme in an online chat with us a few years ago.

*Streakiness analyses have also provided a vehicle for advances in basic statistics. Klaassen and Magnus (2001) concluded their aforementioned analysis of tennis, which found a positive correlation between winning the previous point and the current one, thusly: "In addition to the empirical findings on tennis, the paper provides a theoretical contribution to the estimation of discrete dynamic panel data models."

*The 2001 baseball book Curve Ball, whose authors Jim Albert and Jay Bennett also participated in an online chat, introduced me to what, for me, was a non-traditional form of analysis, namely visual comparisons based on simulations. I've used this technique a few times over the years, including a 2002 examination of whether Natalie Ritchie and Amber Tarr, two women's basketball players from my home university, Texas Tech, showed any evidence of streakiness in their three-point shooting.

The main idea from Albert and Bennett was to use a spinner, similar to that in the old board game All-Star Baseball, to demonstrate that even an event-generator with the same underlying probability throughout can produce "streakiness" of hot and cold. An example for Ritchie is shown below. Because Ritchie was a 37% three-point shooter overall, I simulated a spinner by obtaining a series of random numbers from 1 to 100 (one random number for each shot she actually took during the season). Each random number was examined, in sequence. If a random number fell from 1 to 37 (corresponding to her 37% success rate), it was considered a made three-pointer, whereas a number from 38 to 100 was considered a miss.



As can be seen, even with the simulated spinner that we know to have a consistent 37% hit rate on each shot, the simulated sequence showed rises and falls, similar to Ritchie's actual shooting (although the rises and falls were not necessarily located in the same places in the two versions).
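The spinner procedure described above can be sketched in a few lines of Python. This is only a rough re-creation under stated assumptions, not the tool I originally used: the function names are my own, the seed is arbitrary, and the 150 shots is a placeholder rather than Ritchie's actual number of attempts.

```python
import random


def simulate_spinner(n_shots, hit_pct=37, seed=None):
    """Simulate shots with a constant hit rate, using random integers 1-100."""
    rng = random.Random(seed)
    # A draw from 1 to hit_pct is a make (1); hit_pct + 1 to 100 is a miss (0).
    return [1 if rng.randint(1, 100) <= hit_pct else 0 for _ in range(n_shots)]


def longest_run(shots, value):
    """Length of the longest unbroken run of `value` (1 = makes, 0 = misses)."""
    best = current = 0
    for s in shots:
        current = current + 1 if s == value else 0
        best = max(best, current)
    return best


shots = simulate_spinner(n_shots=150, seed=2002)  # 150 is a placeholder count
print("simulated 3PT%:", sum(shots) / len(shots))
print("longest make streak:", longest_run(shots, 1))
print("longest miss streak:", longest_run(shots, 0))
```

Re-running with different seeds shows how much the "hot" and "cold" stretches move around even though the underlying hit rate never changes.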

*The above illustration of visual analysis sets the context for what is perhaps my favorite streak-related story during my time operating this website. In the middle-late part of Big Ten play in 2006, Ohio State's Je'Kel Foster (isn't that a perfect name for someone who goes hot and cold in his shooting?) had an amazingly hot stretch, followed by an equally amazing cold stretch.

First, take a look at this lovely graph from Buckeye Commentary (below the two pie-charts on the new page that comes up). What you'll see is that Foster was consistently shooting (roughly) at a mind-boggling 80% clip during a three-game stretch. This was then followed by a six-game stint in which his average 3PT% was in the teens!

Given Foster's 40% overall three-point percentage last season, it certainly appears that his actual highs were higher than what would be expected by chance, and his actual lows were lower.


*Whereas the evidence for individual basketball players' streakiness (beyond chance) appears weak, a promising area that has some preliminary support is that of team runs. The term, inspired by then-Kansas coach Roy Williams's 2003 assertion that, "We are a team of runs," refers to instances where one team handily outscores the other during a stretch, such as the Purdue men going on a 21-0 run yesterday against Illinois (see my posting below from yesterday).

I again used the simulated-spinner approach, although somewhat differently than in the above example. I focused on the unique rivalry between Kansas and Arizona in '03, in which the teams met twice (once in the regular season and once in the NCAA tournament) and both games were laden with team runs. Using data from the entirety of both real games between the teams, I made "possession spinners" for each team (separately). For each team, I calculated the actual, empirical percentage of possessions on which they scored 0, 1, 2, or 3 points, in the two games combined. A team's "spinner" would thus have a no-point area exactly proportional to the team's actual frequency of scoring no points on a possession, a 1-point area proportional to the team's frequency of scoring one point, etc.

I then created some simulated games between the teams, where I would alternate spinning one team's spinner and then the other's, with the number of spins based on the actual number of possessions in the two KU-UA games. The idea, again, is that even though the spinner used for each team is consistent from possession to possession (and includes ample opportunity for scores and non-scores), random processes (e.g., unusually long streaks of scores by one team and non-scores by the other) could still cause team runs to occur. The key question is how these chance-based simulated team runs compared to the actual team runs in the two KU-UA games.
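For anyone wanting to experiment with the possession-spinner idea, here is a minimal Python sketch. Everything specific in it is an assumption for illustration: the function names, the spinner areas for each team (which are NOT the real KU or UA percentages), the 75 possessions per team, and the rule that a run ends as soon as the other team scores any points.

```python
import random


def simulate_game(dist_a, dist_b, possessions_each, rng):
    """Alternate spins of two possession spinners; return a list of (team, points)."""
    points = [0, 1, 2, 3]
    log = []
    for _ in range(possessions_each):
        log.append(("A", rng.choices(points, weights=dist_a)[0]))
        log.append(("B", rng.choices(points, weights=dist_b)[0]))
    return log


def shutout_runs(log, minimum=7):
    """Sizes of runs in which one team scores at least `minimum` unanswered points."""
    runs = []
    run_team, run_points = None, 0
    for team, pts in log:
        if pts == 0:
            continue  # an empty possession neither extends nor ends a run
        if team == run_team:
            run_points += pts
        else:
            if run_points >= minimum:
                runs.append(run_points)
            run_team, run_points = team, pts
    if run_points >= minimum:
        runs.append(run_points)
    return runs


rng = random.Random(2003)
# Hypothetical spinner areas for 0, 1, 2, and 3 points on a possession.
team_a = [0.55, 0.05, 0.30, 0.10]
team_b = [0.55, 0.05, 0.32, 0.08]
for game in range(5):
    log = simulate_game(team_a, team_b, possessions_each=75, rng=rng)
    print("simulated game", game + 1, "runs of 7-0 or more:", shutout_runs(log))
```

Comparing the run sizes printed here with the runs in a real play-by-play sheet is the same kind of eyeball test described above.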

I concluded that:

"Overall, 12 shut-out runs of 7-0 or greater were observed in the 5 simulated games, for an average of 2.4 per game. The frequency of these runs seemed pretty comparable to those observed in the actual KU-UA games. However, the magnitude levels of the runs were largely higher in the actual games (e.g., 11-0, 12-0, 13-0, and 16-0) than in the simulations of chance/independence."

*A final potential research area, still in its early stages, involves integrating streakiness with such potentially related topics as clutch performance and choking. Such was the basis for an online chat with Russ Clark, who had conducted extensive statistical studies of professional golf performance. During the preparations for Dr. Clark's chat, we were pleased to receive a question from Sian Beilock, a prominent scholar of "choking" and mental processes in skilled performance more generally. Dr. Beilock gave a colloquium at Texas Tech this past October, and I was able to go up afterwards and introduce myself.

*Lastly, I want to address a more epistemological point. Something for which I've occasionally been called on the carpet, and on which I'm trying to improve, is the need for greater contextualization of findings in terms of the opportunities for something to occur. Statistically analyzing events that captured our attention in the first place precisely because of their unusual nature, doing so after the fact, and increasing the number of preconditions (e.g., how likely was Event A to occur, given that unusual events B, C, and D had already occurred?) all have the potential to make the events I analyze seem rarer than they really are. When considering the large numbers of games played each year in a given sport or league, and the many years these leagues have been in existence, it sometimes turns out that what appears to be a highly rare occurrence really isn't, given the big picture of the numerous opportunities for such an event to occur.

On the other hand, sometimes the events we see really are inordinately rare, using any reasonable standard. After the Wake Forest men's basketball team made 50 straight free throws in January 2005, analyst Ken Pomeroy concluded the following:

Assuming Wake Forest shoots 25 free throws a game, you would expect this event to happen to the Deacons once in every 66,000 games...2,200 seasons...110 generations.

I hope you've found my analyses (as well as those by others) to be informative, thought-provoking, and entertaining. If so, I hope to continue my streak of writing up worthwhile analyses for as long as possible.

Saturday, January 27, 2007

This afternoon, the men's and women's basketball teams from my home university, Texas Tech, were playing at the same time, the men at Missouri and the women hosting Texas. Though both Texas Tech squads experienced substantial scoring droughts, the final outcomes were different for the Red Raiders (men) and Lady Raiders.

The Tech men entered their game against Missouri coming off back-to-back home wins against national top 10 teams Kansas and Texas A&M. The Red Raiders, who've consistently been among the nation's top 5 in 3PT% this season, had in fact just shot .556 (10 of 18) from behind the arc against the Aggies.

The opposition Tigers have a new coach this year, Mike Anderson, a disciple of former Arkansas coach Nolan Richardson and his "40 Minutes of Hell" style of defense. Lately, though, the Mizzou defense apparently hadn't been all that stifling, as the Tigers came into the game 1-4 in Big 12 play, including home losses to Iowa State and Kansas State.

Well, today, the Missouri defense turned up the heat and sidetracked the Red Raider offense. As shown in the TTU-Mizzou play-by-play sheet, the Red Raiders went scoreless for roughly the first 9 and 1/2 minutes of the second half (until 10:23 remained), ultimately falling in a 71-58 Tiger victory.

Interestingly, Tech's three-point shooting wasn't bad at all against Missouri, percentagewise (.538, 7 of 13); however, the Raiders got off fewer three-point attempts than against Texas A&M, made three fewer of them, and thus derived nine fewer points from long distance. Also, as the television announcers noted, Mizzou's defense forced Texas Tech's best outside shooter, Jay Jackson, to attempt some threes from really long distance.

The Lady Raiders likewise went through a major scoring slump, in their game against No. 24 Texas. Based on this article (a play-by-play sheet doesn't seem to be available yet), Texas Tech appears to have gone approximately 19 minutes with only two field goals to their credit ("In the final 11:53 of the [first] half, the Lady Raiders hit just two field goals..." and "...Tech failed to score a field goal until 13:23 [remained in the second half]").

Yet, this did not spell doom for the Lady Raiders. Down 48-41 with a little over two minutes remaining in the game, Texas Tech scored the final eight points of the contest. Alesha Robertson's three-pointer with 6.1 seconds left turned out to be the game winner, as the Lady Longhorns couldn't score on their last possession.

Considering that the two Texas Tech squads left enough minutes of (empty-scoring) gaps to remind one of the Watergate tapes (see also here), getting at least one victory is pretty good.

***

Warren Silver, a relative of mine, has a blog on University of Illinois sports and Chicago pro teams. I was just looking at Warren's blog and noticed from his synopsis of today's Illinois-Purdue men's basketball game that the Boilermakers had outscored the Illini 21-0 during one stretch. Here's the play-by-play sheet, where you can see the sequence in which a 4-4 tie later became a 25-4 Purdue lead.

Friday, January 26, 2007

This past Monday night, as described a few entries down from here, the Miami Heat unleashed a 27-0 spurt on the New York Knicks, en route to a 101-83 victory in south Florida.

Tonight, the teams went up to the Big Apple for a rematch. Not only did the Knicks avenge the earlier loss, beating the Heat 116-96; the hosts used a streaky performance of their own in doing so.

Specifically, New York's Jamal Crawford (a former Michigan Wolverine) made 16 straight field-goal attempts (three short of the team record, but highly impressive, nonetheless), went 8-of-10 on three-point attempts for the night overall, and ended up with 52 points (article). As I've excerpted from the play-by-play sheet, Crawford's 16-field goal sequence was as follows (broken up into sets of four for ease of viewing, with three-pointers highlighted in red):

Jamal Crawford makes three point jumper
Jamal Crawford makes driving layup
Jamal Crawford makes 19-foot jumper
Jamal Crawford makes layup

Jamal Crawford makes 27-foot three point jumper
Jamal Crawford makes 23-foot three point jumper
Jamal Crawford makes 23-foot three point jumper
Jamal Crawford makes 27-foot three point jumper


Jamal Crawford makes running jumper
Jamal Crawford makes 12-foot two point shot
Jamal Crawford makes 25-foot three point jumper
Jamal Crawford makes 26-foot three point jumper

Jamal Crawford makes 27-foot three point jumper
Jamal Crawford makes 21-foot jumper
Jamal Crawford makes 13-foot two point shot
Jamal Crawford makes driving layup

As can be seen, these were plenty challenging shots! For whatever reason, spectacular nights of shooting from behind the three-point arc are not as rare as some might imagine. Just last year, the Bulls' Ben Gordon went a perfect 9-of-9 on treys, tying the NBA record for most three-pointers made in a game without a miss, set by Latrell Sprewell in 2003.

Thursday, January 25, 2007

For an athlete to exhibit a "hot hand," say by making several basketball free throws in a row (which removes the elements of variation in shot distance and defense by the other team), one of the most rudimentary aspects would be his or her ability to remember, at some level, the motoric actions exhibited on previous successful shots and reproduce them (within some margin of error).

Short of videotaping athletes' repeated shots from different angles (and perhaps with some kinds of electrodes, computer microchips, or other detectors attached to their limbs), it would be helpful to have some type of easily recordable measure of shot intensity on repeated trials. From free throws, only the hit-or-miss outcome is easily obtainable, although trajectory and launch velocity could also be gleaned with greater effort.

One potentially informative solution to our problem comes from the annual National Hockey League All-Star SuperSkills Competition, held the night before the actual All-Star Game. Of particular interest to me is the hardest shot competition, where each of the eight participants gets to take two separate whacks at the puck, and the speeds in miles per hour (mph) are revealed instantly by the television crew.

Here we can get a quantitative look -- admittedly from a small sample of players, exhibiting a fairly basic technique -- at the reproducibility of a sports action. A video of the hardest-shot contest and a results page from all of the skills competitions are both available.

I plotted the speed of each player's first shot against the speed of his second shot, as shown below.


The linear correlation is very strong (.88, where 1.00 is the maximum), indicating that players who really sent the puck zipping along on one of their shots also did so on their other shot, whereas those with relatively slow-moving shots on one attempt also attained a similar speed on their other shot.

On this crude test, with shots taken in quick succession, slapshot speed seems highly reproducible.
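A linear (Pearson) correlation of this kind can be computed by hand in Python, as in the sketch below. The speeds listed are hypothetical placeholders I made up to show the mechanics for eight players; they are not the actual contest results, so the value printed will not be the .88 reported above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson linear correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical first- and second-shot speeds (mph) for eight players.
hypothetical_first = [95.0, 97.5, 99.0, 100.5, 101.0, 102.5, 98.0, 96.5]
hypothetical_second = [94.0, 98.5, 97.5, 101.5, 99.5, 103.0, 99.0, 95.5]
r = pearson_r(hypothetical_first, hypothetical_second)
print(round(r, 2))
```

With only eight pairs of shots, a single unusual attempt can move the correlation noticeably, which is one reason I call the test crude.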

Monday, January 22, 2007

There were streaky starts in two different basketball games tonight (actually, each was a combination of one team's "hotness" and the other's "coldness").

In the NBA, the Miami Heat -- playing without stars Dwyane Wade and Shaquille O'Neal -- outscored the New York Knicks 27-0 near the beginning of the game in taking a 29-3 lead. The Knicks closed the deficit to six points at one time, but Miami reasserted control to win 101-83.

Meanwhile, in women's college hoops, No. 1 Duke stunned a Knoxville crowd of 21,118, taking a 19-0 lead over No. 4 Tennessee. Paralleling the aforementioned Heat-Knicks game, the Lady Volunteers rallied to make the game competitive, losing by only 74-70.

Both of these games nicely illustrate the statistical principle of regression toward the mean. For series of observations, such as teams' play at various points in games, extreme initial performances (either extremely good or extremely bad) tend to come back toward the average. Thus, the teams with the torrid starts -- Miami shot at a .684 clip (13 of 19) and Duke made its first five shots -- would be expected to come down to earth a bit, whereas the teams in the deep freeze -- the Knicks missed 10 straight shots, Tennessee eight -- would be expected to start finding the basket.
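Regression toward the mean can be illustrated with a small simulation, sketched here in Python. The assumptions are mine and purely for illustration: a team whose true shooting percentage is a constant .45 on every shot, 60 further shots after the opening 19, and 20,000 simulated games. Only the "13 or more of the first 19" cutoff echoes Miami's start.

```python
import random


def rest_after_hot_start(p=0.45, first_n=19, hot_makes=13, rest_n=60,
                         n_games=20000, seed=1):
    """Average rest-of-game shooting pct in simulated games that began hot.

    Every shot has the same probability p of going in, so any drop-off after
    a hot start reflects chance alone, not a change in the team.
    """
    rng = random.Random(seed)
    rest_pcts = []
    for _ in range(n_games):
        start_makes = sum(rng.random() < p for _ in range(first_n))
        if start_makes >= hot_makes:  # keep only the games with torrid starts
            rest_makes = sum(rng.random() < p for _ in range(rest_n))
            rest_pcts.append(rest_makes / rest_n)
    if not rest_pcts:
        return None, 0
    return sum(rest_pcts) / len(rest_pcts), len(rest_pcts)


mean_rest_pct, n_hot_games = rest_after_hot_start()
print(n_hot_games, "hot starts; rest-of-game average:", round(mean_rest_pct, 3))
```

Under these assumptions, the games that open at .684 or better settle back to roughly the assumed .45 the rest of the way, which is all regression toward the mean amounts to.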

Another element that I think is important to note is the relatively narrow range of talent in these games. Duke and Tennessee, of course, were both in the top four of the national women's collegiate rankings. Also, considering the full spectrum of men's pro (or semi-pro) basketball teams around the world (including the various sub-NBA leagues in the U.S. such as the NBDL and "new" ABA, and leagues in numerous other nations), the differences between NBA teams' talent levels are indeed narrow, even if the best and worst of the 30 teams were to play each other (insert your own joke here about the Knicks being an NBA team).

Comebacks (albeit unsuccessful) of the kind seen tonight would seem to be much more likely when teams of relatively comparable ability are playing. When teams are really not comparable, such as nationally ranked men's NCAA Division I Air Force and Division III Colorado College (not to be confused with D-I University of Colorado), you're likely to witness an unmitigated pulverizing. Even here though, with Air Force taking a 50-6 halftime lead, the second half was bound to be less one-sided, and indeed it was.

Thursday, January 18, 2007

An ESPN television graphic, based on NBA results through last night's games, vividly demonstrates that the Dallas Mavericks (game-by-game log) and Phoenix Suns (log) are currently the hottest teams in the league.

The Mavs, after starting the season 0-4, have gone 32-4, for an overall 32-8 record entering tonight's game against the L.A. Lakers. Dallas has also won 18 of its last 19 games, the one loss coming January 7 in L.A. against the very same Lakers.

The Suns, meanwhile, after coming out of the gate 1-4, have gone 29-4, for an overall ledger of 30-8. They are 27-2 in their last 29 games.

Will these be the teams that play in the Western Conference final down the road?

Monday, January 15, 2007

This past Saturday, senior guard Lee Humphrey of the defending NCAA men's basketball champion Florida went 7-for-8 on three-point attempts in the Gators' 84-50 rout of conference rival South Carolina.

As I've done many times before, I want to conduct an analysis of the form, How likely is it that a player with a long-term prior success rate of X percent will proceed to make Y out of Z attempts in his or her next game? However, I want to go into a little more depth this time.

For Humphrey's prior probability of making threes, let's use .45. Looking at his career stats (which, of course, include only part of the current season), we see that, with the exception of a .370 percentage from behind the arc his sophomore season, his other yearly percentages have clustered around .45 (.439, .459, and for this season so far, .452).

Humphrey's recent 7-of-8 performance from three-point land (.875) certainly exceeds .45. However, due to random sampling error, he is unlikely to hit at exactly a .45 clip in every game. The logic is the same as saying that, even though we know the probability of a tossed coin coming up heads is .50, repeated sets of ten tosses would likely yield something other than five heads and five tails for many of the sequences (sometimes more than five heads, sometimes fewer than five heads). The question then becomes, how incompatible is a 7-of-8 performance with an underlying .45 prior probability?

At this time, I usually bring in an online binomial calculator from Vassar College, and I do so again. By plugging in just three values -- number of attempts, n; number of stipulated successes, k; and probability of a success, p -- we can answer questions such as what's the probability of a prior .45 shooter making exactly 7 three-point attempts out of 8, and what's the probability of him or her making 7 or more out of 8?

(Statisticians would generally be more interested in the latter type of question -- probability of a particular value or more extreme -- than the former, as chances tend to be very low for any single, particular number of successes. For purposes of our analyses, however, we will need to look at probabilities of particular numbers of successes.)

By plugging in the full range of values from 0 to 8 for k, we can see the probabilities of making exactly 0, 1, 2, 3, etc., three-point shots, up to 8. These probabilities, which must sum to 1.0, are illustrated in the figure below.



As can be seen, with a .45 prior shooting percentage from behind the arc and eight shots taken, the most likely outcomes would be either three or four made shots. However, more or fewer made shots than that also have some non-ignorable probabilities.
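If you'd rather not use an online calculator, the same nine probabilities can be reproduced with a few lines of Python. This is a minimal sketch using the standard binomial formula; the function name is my own.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent tries, each with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 8, 0.45  # eight attempts by a .45 three-point shooter
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, prob in enumerate(probs):
    print(f"{k} of {n}: {prob:.4f}")
print("sum:", round(sum(probs), 4))
```

Like the calculator, this assumes each shot is independent of the others.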

At the extremes, these probabilities are fairly simple to compute, but get a bit more complicated in the middle of the distribution. A simple analogy would be calculating the probability of double sixes on a roll of two dice by taking (1/6) squared, or 1/36 (all illustrations in this write-up assume independence of observations, as with dice, which has been shown to be a surprisingly reasonable assumption for sequential sports performances).

For a perfect 8-of-8 successes, the probability is simply (.45)^8, where ^ signifies raising to a power. Raising .45 to the eighth power yields .0017.

The basic probability of an exactly 7-of-8 sequence is computed according to...

(.45)^7 X (.55), which equals .0021 (.45 gets multiplied by itself seven times to represent the made shots, whereas the .55 represents the missed shot).

There are, however, eight different ways to make 7-of-8 shots. The one miss can occur on the first shot, the second shot, etc., up through the eighth shot. We thus multiply .0021 X 8, yielding .0164 (the product is computed from the unrounded value behind .0021, which is why it is not .0168).

The probability of making 7 or more out of 8 is thus .0017 + .0164 = .0181, or nearly 2 percent (1 in 50). If a .45 three-point shooter can play around 130 games over a four-year collegiate career, as Humphrey seems on pace to do, he or she might then be expected to have two or three games of making 7 or 8 threes in 8 attempts, purely on the basis of statistical fluctuation.
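The "7 or more" figure and the rough count of such games over a career can be checked with the short Python sketch below. The variable names are mine, and the career estimate rests on a big simplifying assumption: that the shooter takes exactly eight threes in each of the 130 games.

```python
from math import comb

p, n = 0.45, 8

# Probability of making 7 or 8 out of 8: add the two "exactly k" probabilities.
p_at_least_7 = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in (7, 8))

career_games = 130  # assumes eight three-point attempts in every one of them
expected_games = career_games * p_at_least_7

print(round(p_at_least_7, 4), round(expected_games, 1))
```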

The probability of making exactly 6 out of 8 is (.45)^6 X (.55)^2, multiplied by the number of ways to make six shots. The number of ways gets pretty large in a hurry (i.e., missing shots 1 & 2, 1 & 3, etc., up through 1 & 8; missing shots 2 & 3, 2 & 4, etc., up through 2 & 8; and so forth). Similar reasoning applies for calculating the probability of making 5 of 8, 4 of 8, etc. See my Intro Stats lecture on this topic for further detail.
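To see the "number of ways" concretely, the Python sketch below lists every pair of shots that could be the two misses in a 6-of-8 performance and then finishes the calculation. The variable names are my own.

```python
from itertools import combinations
from math import comb

shot_numbers = range(1, 9)  # shots 1 through 8

# Every way to choose which two of the eight shots are missed:
# (1, 2), (1, 3), ..., (1, 8), (2, 3), ..., (7, 8)
miss_pairs = list(combinations(shot_numbers, 2))
print(len(miss_pairs), "ways; first few:", miss_pairs[:3])

# Same count from the "n choose k" shortcut.
print(comb(8, 2))

# Probability of exactly 6 makes in 8 attempts for a .45 shooter.
p_exactly_6 = len(miss_pairs) * 0.45**6 * 0.55**2
print(round(p_exactly_6, 4))
```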

***

I also wanted to discuss, briefly, two other games from this past Saturday, one involving my undergraduate alma mater UCLA (vs. USC) and the other involving the university at which I'm on the faculty, Texas Tech (vs. Baylor).

In this year's first installment of the Battle of Los Angeles, USC got the ball with less than a minute remaining, trailing 63-57 (see play-by-play sheet). Under the most realistic scenario for the Trojans to tie the game, three things had to happen: they'd have to make a three, hold UCLA scoreless on its possession, then hit another three. Gabe Pruitt (whom we'll generously consider a .40 shooter from behind the arc, based mostly on previous seasons) and Nick Young (hitting about .45 from three-point land this season, but in the low .30s in previous years, so let's say .40 overall) did their part, hitting the two treys.

In between its two final possessions, USC fouled UCLA's Lorenzo Mata, a roughly .30 free-throw shooter this season, although a .50 and above shooter from the line in earlier seasons. Again for simplicity, let's assume a .40 FT% for Mata, which, conversely, is a .60 miss rate. There would thus be a .36 probability of Mata's missing both free throws. If you want to use .30 as his FT% and .70 as his miss rate, there would be a .49 probability of his missing both.

Mata indeed missed both free throws.

The probability of an 'SC three, Mata missing two from the stripe, and another 'SC three all happening in sequence would thus be .40 X .36 X .40 = .06 (or, if you prefer, .40 X .49 X .40 = .08).
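The chain of multiplications above can be written as a small Python sketch. The function name is mine, and, as in the text, the calculation assumes the three events are independent and uses the rounded .40 three-point figures for both USC shooters.

```python
def sequence_probability(*event_probs):
    """Probability that independent events all occur: multiply their probabilities."""
    result = 1.0
    for p in event_probs:
        result *= p
    return result


p_three = 0.40  # assumed chance of each USC three-pointer
for ft_pct in (0.40, 0.30):  # the two free-throw percentages considered for Mata
    p_miss_both = (1 - ft_pct) ** 2
    p_all = sequence_probability(p_three, p_miss_both, p_three)
    print(f"FT% {ft_pct:.2f}: miss both = {p_miss_both:.2f}, whole sequence = {p_all:.2f}")
```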

There was one more "shoe to drop," however. Young was fouled on his three-point attempt and made the free throw for a rare four-point play, putting the Trojans up 64-63. I don't know the frequency of fouls on three-point attempts -- which would also have to be incorporated into the calculation -- but I would imagine it's pretty rare. Thus, unless we find out how often fouls on three-point attempts occur, we can say that the probability of USC taking the lead was incalculably small.

Ultimately, the Bruins still had some time on the clock after falling behind by a point, and Arron Afflalo hit a Michael Jordan-esque clutch shot from near the top of the key with four seconds remaining, to give UCLA the win, 65-64.

Finally, a surprising offensive force for Texas Tech in its 73-70 loss to Baylor was 6-8 forward Jon Plefka, who had not made any more than four field goals in a game previously this season. In the second half of the Baylor game, he made seven straight field goal attempts, some from outside including a three (box score and play-by-play document). Plefka will probably be receiving more playing time, so we can track any tendency of his for streak shooting.

Thursday, January 11, 2007

The Texas Tech Lady Raider basketball team lost 49-47 to nationally ranked Texas A&M last night. Intensifying the frustration, no doubt, was Texas Tech's 9-of-22 performance at the free-throw line.

Including the A&M game, Texas Tech is 233-of-339 (.69) on free throws, but subtracting the 9-of-22 to get a "prior" estimate yields 224-of-317 (.71).

Using an online calculator for this type of problem (known as a binomial distribution), we find that for a team with a long-term percentage of hitting free throws at .71, its probability of then making nine (or fewer) out of 22 is only .003, or three in a thousand.
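For those who prefer to compute this directly, here is a minimal Python sketch of the "nine or fewer out of 22" calculation. The function name is my own; it simply adds up the binomial probabilities of 0 through 9 makes, assuming independent free throws.

```python
from math import comb


def binomial_cdf(k, n, p):
    """Probability of k or fewer successes in n independent tries with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Rounded prior of .71, as used in the text.
print(round(binomial_cdf(9, 22, 0.71), 3))

# Unrounded prior (224 of 317), for comparison.
print(round(binomial_cdf(9, 22, 224 / 317), 4))
```

The unrounded prior gives a slightly larger probability than the rounded .71, but either way the night at the line was a few-in-a-thousand event.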

One issue often raised in connection with this type of analysis is whether, perhaps, the team's poorest free-throw shooters got to the line disproportionately often. Thus, it would not be that the team got cold at the stripe across the board, but rather that each player shot to his or her normal level and it was only the poor free-throw shooters' increased attempts that knocked the team's average down.

A few things would argue against such an interpretation, in my view.

First, the two Lady Raiders who shot the most free throws against A&M were, respectively, first and (roughly) tied for second in this category for the season to this point.

Second, in an analysis of Kansas's 12-of-30 free-throw shooting in the 2003 NCAA men's championship game against Syracuse -- where I initially got the ball rolling and then Ken Pomeroy came along and did a much more elaborate study -- the finding that the Jayhawks had an excessively poor night from the stripe was pretty robust, regardless of whether adjustments were made for which individual players took precisely how many FT attempts in the title game. (Note: In Ken's analysis, you'll see where he put in a link to my initial study; mine is no longer available online, as it was on the old version of my Hot Hand page, before I switched to blog format. Ken's summary of my analysis should be sufficient, however.)

Tuesday, January 09, 2007

Revisiting the 1971-72 Lakers' 33-Game Winning Streak

Today is the 35th anniversary of the ending of the Los Angeles Lakers' 33-game winning streak, the longest winning streak in major American professional team sports. A game-by-game log of that season, from Basketball Reference, is available here, whereas a narrative of the games during the streak, from Sports Illustrated, is available here. To mark the occasion, let's look back at that Laker team, both historically and statistically. First, here's a commemorative team picture that I recently found in my room at my parents' home in Los Angeles:



In retrospect, it's hard to imagine that the 1971-72 Lakers would dominate the NBA the way they did, with their 33-game winning streak, 69-13 regular-season ledger (an NBA record at the time), and relatively easy march through the play-offs (with no series closer than 4-2).

The Lakers had lost the NBA finals in 1968, '69, and '70, and then were eliminated in the next year's Western Conference finals as the Milwaukee Bucks -- a relatively new franchise, now featuring young star center Lew Alcindor (later Kareem Abdul-Jabbar) -- romped to the '71 NBA title.

By the start of the 1971-72 season, then, the Lakers probably would have struck most observers as an over-the-hill team (I'm inferring this after the fact, as I was only 9 years old at the time of the streak and not very sophisticated regarding players' peak performance years). Although center Wilt Chamberlain and guard Jerry West were still productive, years of knee injuries appeared to be catching up with veteran forward Elgin Baylor. The Lakers did have one newcomer who had the potential to breathe new life into the team, Coach Bill Sharman.

According to Charley Rosen's (2005) book about the 1971-72 Lakers, entitled The Pivotal Season, the Lakers started out pretty well, but there was a feeling that Baylor was holding them back. Writes Rosen, "Baylor was selfish and defenseless... There was only one thing for Sharman to do -- arrange a retirement party for Baylor" (p. 97).

(I personally found the book useful for reminding me of key points in the streak, but according to a review at Amazon.com, the book appears to have quite a few factual errors in its details.)

In fact, it was immediately after Baylor's departure that the Lakers began their streak, beating Baltimore 110-106. Along the way, the Lakers surpassed the previous NBA record winning streak -- 20 games, set the year before by none other than Milwaukee -- and the previous pro sport record of 26 straight wins by the 1916 New York (Baseball) Giants.

In addition to being the previous year's NBA champion and holding the previous NBA record winning streak, the ubiquitous Milwaukee Bucks had another place in the story, spanking the visiting Lakers 120-104 on January 9, 1972 to end L.A.'s victory streak at 33 games.

As those of you who are longtime readers of the Hot Hand page know, to estimate the probability of a perfect sequential run, we multiply the probabilities of the individual components (wins). If there were a uniform probability of the Lakers' winning each game (the way a coin always has a .50 probability of being a head), we would raise that probability to the 33rd power.
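To make the uniform-probability case concrete, here is a minimal Python sketch. The function name and the sample per-game probabilities are my own illustrative choices, not estimates for the Lakers.

```python
def streak_probability(p_win: float, n_games: int = 33) -> float:
    """Probability of winning n_games in a row at a constant per-game probability."""
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    return p_win ** n_games


# Illustrative per-game win probabilities (assumed values, not Laker estimates).
for p in (0.50, 0.70, 0.80, 0.90):
    print(f"p = {p:.2f} -> P(33 straight wins) = {streak_probability(p):.2e}")
```

Even a team that wins 90% of the time is unlikely to string together 33 in a row, because the per-game probability gets multiplied by itself 33 times.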

However, the 33 games in the streak would obviously have varied in their degree of difficulty. To account for this, I adopted a very simple model that pegged the difficulty of each game on whether the Lakers were at home or away and on the opponent's winning percentage from the previous season (the streak occurred early in the 1971-72 season, so same-season record probably wouldn't have added much).

Based on opposing teams' 1970-71 winning percentages, I created four classes of difficulty. The Bucks' .805 percentage put them in a class by themselves, which I called Group A. Six teams' percentages clustered between .537 and .634, so I called this Group B. Another five teams' percentages ranged from .439 to .512, so they were Group C. Finally, three teams that were first-year expansion franchises in 1970-71 -- Buffalo (later the Clippers), Cleveland, and Portland -- had winning percentages ranging from .183 to .354, thus constituting Group D. The Lakers did not play the remaining team, Cincinnati (later Sacramento), during the streak.

Then what I did was assign (assumed) Laker win probabilities to the 33 games based on the following rules:

D opponent at home for Lakers ---> .90
D opponent on the road ---> .85
C opponent at home ---> .80
C on road or B at home ---> .75
B opponent on road ---> .70
A opponent at home ---> .65
A opponent on road ---> .60

I purposely tried to err in the direction of making these probabilities too high, so that the product of the 33 probabilities would not be overly small. For what it's worth, my estimate of the overall probability of the Lakers winning all 33 of the games they did during the streak is...

.0002, or 1 in 5,000.
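For readers who want to try the calculation themselves, the rules above can be sketched in Python as a lookup table whose values get multiplied together. The four-game sample below is hypothetical (it is not the Lakers' actual schedule, which would have to be read off the game-by-game log), and the names WIN_PROB and streak_probability_varied are just labels I made up for this sketch.

```python
import math

# Assumed win probabilities from the rules above, keyed by (opponent group, venue).
WIN_PROB = {
    ("D", "home"): 0.90,
    ("D", "road"): 0.85,
    ("C", "home"): 0.80,
    ("C", "road"): 0.75,
    ("B", "home"): 0.75,
    ("B", "road"): 0.70,
    ("A", "home"): 0.65,
    ("A", "road"): 0.60,
}


def streak_probability_varied(games):
    """Multiply the assumed win probability for each (group, venue) game."""
    return math.prod(WIN_PROB[game] for game in games)


# HYPOTHETICAL four-game stretch, for illustration only.
sample = [("D", "home"), ("C", "road"), ("B", "home"), ("A", "road")]
print(streak_probability_varied(sample))

# Constant per-game probability that would yield the same (rounded) .0002 over 33 games.
equivalent_p = 0.0002 ** (1 / 33)
print(round(equivalent_p, 3))
```

The last two lines work backward from the overall estimate: a product of .0002 over 33 games corresponds to an average (geometric mean) per-game win probability of roughly .77. Because .0002 is itself a rounded figure, that value is only approximate.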

Consider the following:

*The NBA has been around for about 60 years.

*There are currently 30 NBA teams, and there have been at least 22 teams during the past 30 years.

*For as long as I can remember, each team has played 82 games per season, which creates a lot of theoretical opportunities for a team to start a 33-game winning streak (such a streak could be started after each loss).

Without doing any more math, it looks to me as though, over the entire history of the NBA, there would probably be several thousand opportunities for such a streak. Thus, the Lakers' streak might not be that far out of line.
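For anyone who does want a bit more math, here is a rough Python sketch of the chance of seeing at least one such streak across many opportunities. It rests on two strong assumptions of mine: that the opportunities are independent, and that every one of them carries the same .0002 probability estimated for the Lakers (generous, since most teams are far weaker). The opportunity counts are illustrative round numbers, not a tally of actual NBA history.

```python
def prob_at_least_one(p_streak: float, n_opportunities: int) -> float:
    """Chance of at least one success in n independent tries, each with probability p_streak."""
    return 1.0 - (1.0 - p_streak) ** n_opportunities


# Illustrative numbers of opportunities (assumed, not counted from NBA records).
for n in (1000, 5000, 10000):
    print(n, round(prob_at_least_one(0.0002, n), 3))
```

Under these assumptions, several thousand opportunities would make at least one 33-game streak somewhere in league history quite plausible, which is in line with the informal argument above.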

Contemporary observers would probably cite travel as a factor for why a team would be unlikely to win 33 straight games today. However, if you look at the '71-'72 Lakers' game-by-game log at one of the above links, you'll see that from December 17-22, they played five games in six nights (including three straight nights), which is not done anymore. In fact, I don't believe the current NBA schedule allows a team to play any more than two nights in a row. And remember the Lakers' aging roster!

Another aspect to look at is the Lakers' margins of victory during the streak. They had one overtime game, December 10 against Phoenix. Other than that, the point differentials were distributed as follows:

*9 games won by 4-9 points
*15 games won by 10-19 points
*5 games won by 20-29 points
*3 games won by 30 or more

[A slight error in these margin-of-victory frequencies was corrected on 1/16/11.]

On the whole, the Lakers' victory margins were pretty healthy, so they may have been able to conserve some energy by blowing away some teams early.

Finally, if you want to see another perspective, I would recommend this piece by Gabe Farkas at Courtside Times. Although Farkas starts out discussing the super-streaky Laker squad, he ultimately uses the 1995-96 season, in which the Chicago Bulls surpassed the '71-'72 Lakers' 69-13 record by going 72-10, for his major analyses.

Sunday, January 07, 2007

The California Institute of Technology (CalTech) ended its 207-game, 11-year losing streak in men's NCAA Division III basketball with a win over Bard College on Saturday night.

One might expect people at CalTech to have thought a lot about the streak, from the odds of the team finally winning a game to the physics of how to launch a successful shot at the basket. Indeed, as I just found during some web searching, Professor Colin Camerer has done some research on hot and cold hands, although not necessarily related to his own school's team. To learn about this research, go to Professor Camerer's faculty webpage, then scroll down to the section entitled, "Research background and details," and, finally, click on "Field studies: Cabs and basketball."

Also, Dean Oliver, author of the book Basketball on Paper and a statistical consultant for the Seattle SuperSonics, once played point guard for CalTech.

In honor of this occasion, I'll end with a cheer I also just discovered on the web. Variations of this cheer are said to have been used by MIT, CalTech, and other quantitatively advanced schools:

E to the u du dx,
E to the x, dx.
Cosine, secant, tangent, sine,
3 point 1 4 1 5 9.
Integral, radical, mu, dv
Slipstick, sliderule, MIT!

Saturday, January 06, 2007

In my December 16, 2006 entry, I mentioned a game around that time in which the New Jersey Nets failed to take advantage of an 18-0 lead over Boston and ended up losing to the Celtics.

Well, last night, the Nets fell behind 18-0 to the Chicago Bulls and, you guessed it, came back and won.

We've all heard expressions such as, "Things even out in the end," and, "What goes around comes around." That's what's happened to the Nets, albeit with unusual exactitude!

Wednesday, January 03, 2007

With its loss (41-14 to LSU) in tonight's Sugar Bowl, Notre Dame has just set a new record by falling in its ninth straight football bowl game. A chart listing all Fighting Irish bowl games in school history is available on Wikipedia's Notre Dame football page.

Notre Dame had shared the record for consecutive bowl losses at eight with West Virginia and South Carolina. My graduate school alma mater, the University of Michigan, once lost seven straight bowls.

I think it's fair to say that, at least as a rough approximation, bowl match-ups are created to make the games competitive. Of this season's 32 bowl games, I count 19 in which the two teams either came in with the same number of losses or differed by only one loss.

If we assume each bowl game is a 50/50 proposition as to who will win, then the probability of a team losing nine straight is (1/2) raised to the 9th power, which is 1/512. It's the same logic by which the probability of rolling double sixes with dice is (1/6) X (1/6) or 1/36; the probability of a given outcome on one iteration is raised to the power corresponding to the length of the streak.

A theory that I (and others) have come up with is that Notre Dame bowl games often are not 50/50 propositions because the school's popularity and mystique (Knute Rockne, the Four Horsemen, the Golden Dome, the exclusive contract with NBC, etc.) get it into bowl games above its ability level. I did a little searching for articles on Notre Dame's recent bowl games and, indeed, the Irish have tended to be the underdog.

Even if we assume the Irish had only a 40% chance of winning any given bowl game (which translates into a 60% chance of losing a given game), the probability of nine straight bowl losses can be estimated at (.60) to the 9th power, or .01 (1 in 100).
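The two bowl-streak calculations can be checked with a few lines of Python. This is just a sketch of the arithmetic; the function name is my own, and the 50% and 60% per-game loss probabilities are the assumptions stated in the text, not measured values.

```python
from fractions import Fraction


def losing_streak_probability(p_loss, n_games=9):
    """Probability of losing n_games in a row at a constant per-game loss probability."""
    return p_loss ** n_games


# 50/50 assumption, kept as an exact fraction.
print(losing_streak_probability(Fraction(1, 2)))  # 1/512

# Underdog assumption: 60% chance of losing any given bowl game.
print(round(losing_streak_probability(0.60), 4))  # 0.0101
```

Moving the per-game loss probability from .50 to .60 raises the chance of nine straight losses roughly fivefold, from about 1 in 500 to about 1 in 100.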