Thursday, January 21, 2016

A Decade of UConn Women's Basketball Wins and Losses, At a Glance

A piece published last Friday documented the best 100-game stretches in U.S. college sports and major pro leagues such as the NBA, NFL, NHL, and MLB. At the time the article appeared, the University of Connecticut (UConn) women's basketball program had won 99 of its last 100 games (now 101 of its last 102). The legendary UCLA men's basketball program under coach John Wooden also recorded a 99-1 stretch, from 1971-74.

The UConn women under coach Geno Auriemma have been dominant since winning the first of their 10 NCAA national titles (and compiling the first of their five undefeated seasons) in 1995. I therefore wanted to look at the Huskies' long-term success beyond their past 100 games.

I decided to examine UConn's last 400 games, representing roughly the past decade of play. The Huskies' past 400 games span from the opening game of the 2005-06 season all the way to last night, when UConn routed Central Florida, 106-51. In these games, the Huskies are 377-23 (.942).

This record appears more compelling, in my view, when viewed in graphic form. I've thus created a diagram that shows 400 dots (one for each game), with wins depicted in blue and losses in red. The games are arranged in chronological sequence, from the first contest in the upper-left corner, advancing across each row of dots, until the 400th game in the lower-right corner. Here's the diagram:

Pretty blue, huh? I can't think of a way to convey the Huskies' dominance any more dramatically. The picture includes both a 90-game winning streak (the NCAA Division I record among men or women) and the current 101-1 stretch, as indicated by the side annotations. To help with interpretation of the diagram, I also created the following legend:

(You can click on the graphics to enlarge them.)

Friday, December 25, 2015

The Importance of Six-Minute Scoring Spurts in the Warriors’ Winning Streak

Merry Christmas and Happy Holidays!

As part of the NBA's package of Christmas telecasts, the champion Golden State Warriors (27-1) will host Cleveland (19-7) in a rematch of last season's finals. The Warriors have been the big story of the 2015-16 season, starting out 24-0 (a 28-game regular-season winning streak if one includes the last four games of the previous season) to threaten the 1971-72 Los Angeles Lakers' record 33-game winning streak. After a December 12 loss at Milwaukee to end the streak, Golden State has won three in a row.

On this holiday occasion, I'd like to look back on the Warriors' winning streak, using an unusual lens. Offense is the team's forte, as seen in the NBA team-scoring rankings. Some basketball analysts look at statistics such as teams’ points per game or points per 100 possessions. To understand the Golden State Warriors’ success over the past season and a half, in my view, we have to look at smaller segments of play. Not halves, not quarters, but six-minute “eighths” of games. When the first-quarter clock runs down from 12:00 to 6:01, that would be the first eighth; from 6:00 to 0:00, the second. The eighth and final eighth would run from 6:00 to 0:00 of the fourth quarter.
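For anyone who wants to carve up play-by-play data this way, here is a minimal sketch of the eighths scheme just described. The function name and the choice to express the clock in seconds are my own assumptions for illustration; the only rule taken from the text is that 12:00 through 6:01 is the first half of a quarter and 6:00 through 0:00 is the second.

```python
def eighth_of_game(quarter: int, seconds_left_in_quarter: int) -> int:
    """Return which six-minute eighth (1-8) a regulation moment falls in.

    Hypothetical helper: quarter is 1-4, and the clock is given as seconds
    remaining in that quarter (720 = 12:00, 360 = 6:00, 0 = 0:00).
    """
    if not 1 <= quarter <= 4:
        raise ValueError("overtime periods are outside the eighths scheme")
    if not 0 <= seconds_left_in_quarter <= 720:
        raise ValueError("a quarter runs from 12:00 (720 s) down to 0:00")
    # 12:00 down to 6:01 is the first half of the quarter; 6:00 to 0:00 the second.
    first_half_of_quarter = seconds_left_in_quarter > 360
    return 2 * (quarter - 1) + (1 if first_half_of_quarter else 2)


print(eighth_of_game(1, 720))  # opening tip: first eighth
print(eighth_of_game(1, 360))  # 6:00 left in the first quarter: second eighth
print(eighth_of_game(4, 0))    # final buzzer of regulation: eighth eighth
```

Overtime raises an error here simply because overtimes are left out of the analysis.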

In the 2015 playoffs, the Warriors played 21 games and thus 168 eighths of basketball (overtimes are not counted within my analyses). In 22 of these 168 eighths (13%), Golden State scored 18 or more points, which translates into 3 or more points per minute. If a team maintained a 3-points/minute pace for a full 48-minute game, it would score 144 points. Thus, I use 18+ point eighths as a marker of offensive explosiveness. The 22 eighths in which the Dubs scored 18+ points during these playoffs included three of 24 points (4 points per minute) and one of 25 points.

Through the Warriors’ first 25 games of the 2015-16 season – 24 wins followed by a loss – they have recorded 46 eighths of 18+ points in the 200 eighths they’ve played (23%). (Given that opposition is stronger in the playoffs than in the regular season, it’s not surprising that Golden State’s percentage of eighths with 18+ points is higher in the latter.)
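The arithmetic behind these playoff and regular-season percentages is easy to reproduce. The sketch below uses only the counts quoted above (22 explosive eighths in 21 playoff games, 46 in 25 regular-season games); the function and variable names are my own, chosen for illustration.

```python
THRESHOLD_POINTS = 18    # an "explosive" eighth is 18+ points
MINUTES_PER_EIGHTH = 6
EIGHTHS_PER_GAME = 8     # regulation only; overtimes are not counted


def explosive_share(explosive_eighths: int, games: int) -> tuple[int, float]:
    """Return (total eighths played, share of them with 18+ points)."""
    total_eighths = games * EIGHTHS_PER_GAME
    return total_eighths, explosive_eighths / total_eighths


# Pace implied by the threshold, and what it would mean over 48 minutes.
points_per_minute = THRESHOLD_POINTS / MINUTES_PER_EIGHTH
full_game_points = points_per_minute * 48
print(points_per_minute, full_game_points)  # 3.0 144.0

playoff_total, playoff_share = explosive_share(22, 21)
season_total, season_share = explosive_share(46, 25)
print(f"2015 playoffs: {playoff_total} eighths, {playoff_share:.0%} explosive")
print(f"2015-16 start: {season_total} eighths, {season_share:.0%} explosive")
```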

The Warriors’ best eighth of the current season, as far as I can tell, occurred in the last six minutes of the first quarter on December 8 at Indiana. After scoring 17 points in the first 6:00 of the quarter, the Dubs added 27 points in the latter half of the first quarter (the second eighth of the game). This explosion included four treys (plus Klay Thompson making all three free throws after being fouled behind the arc).

After the Warriors’ 114-98 win at Brooklyn to go 22-0, acting coach Luke Walton was quoted as saying that, "It's one of our biggest strengths, is that we're never out of a game and we're always one little run away from putting a game away."

The following graph plots Golden State's points scored in the final 6:00 (the eighth eighth) of their first 25 games this season, as a function of the number of points by which they were leading or trailing with 6:00 left in regulation. We see that the Warriors’ greatest scoring outbursts in the final eighth have occurred when they have trailed or been tied heading into them. (For those with some statistical training, the correlation between the Warriors’ margin entering the final eighth and their points scored in the final eighth is r = -.52, p < .01.)
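Readers who want to check the significance claim can do so from the two reported numbers alone (r = -.52 over 25 games). What follows is a rough sketch, assuming the usual two-tailed t-test for a Pearson correlation; the critical value of 2.807 for 23 degrees of freedom at the .01 level is taken from a standard t table, not computed here.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """Convert a Pearson correlation from n paired observations to a t statistic."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r, n = -0.52, 25
t = t_from_r(r, n)

# Assumed table value: two-tailed critical t at alpha = .01 with n - 2 = 23 df.
CRITICAL_T_01_DF23 = 2.807

print(round(t, 2))                   # -2.92
print(abs(t) > CRITICAL_T_01_DF23)   # True, consistent with p < .01
```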

In some ways, this finding is totally intuitive. Trailing or being tied should motivate a team (especially one, such as the Warriors, who were trying to maintain a long winning streak) to play extra hard; conversely, when a team is way ahead, it likely will put reserves in the game and run time off the clock, both resulting in lower offensive output. In another way, however, the finding is not so intuitive. If you’re trailing or tied late in the game, it could mean you are playing a tough opponent and/or having an off-night, which are not conducive to big scoring runs.

The above graph also shows that failure to respond as expected is what put the Warriors’ winning streak in jeopardy in Game 24 at Boston (a double-overtime Golden State win) and helped end it the next night in Milwaukee. According to the trend-line projection, the Dubs would have been expected to score 17 or 18 points in the final eighth of the Celtics game, but instead scored only 12 (this discrepancy is depicted by the red dashed vertical line). Trailing by 13 at Milwaukee, Golden State would have been expected to put up 20 in the final six minutes, but instead scored only 15.* Given that the Boston and Milwaukee games were the sixth and seventh of a seven-game road trip, the late-game loss of the Warriors’ explosiveness doesn’t seem surprising.

I don’t expect media outlets to replace the standard quarter-based line-score with one organized by eighths. For highly explosive teams such as the Warriors, however, I do believe eighths are a useful lens for statistical analysis.


 *The Warriors lost to Milwaukee by 13 (108-95), so strictly speaking, even if Golden State had scored the extra five points predicted by the correlational analysis, it still would have lost. Had the Warriors shown more offensive prowess in the final 6:00, however, the Bucks might have begun to feel pressure and perhaps the ending would have unfolded differently.

Saturday, December 12, 2015

It's Over! Warriors Lose to Bucks 108-95

Living members and fans of the 1971-72 Los Angeles Lakers can rest easy, as that team's 33-game winning streak will remain the NBA record for the foreseeable future. Whether you counted the Golden State Warriors' current win streak at 24 or 28 games (including the last four of the 2014-15 regular season), it doesn't matter. The Warriors' streak is now over, as moments ago, they fell at Milwaukee, 108-95. The Bucks held a double-digit lead for much of the contest. A few times late in the third quarter and early in the fourth, Golden State cut the deficit to three points or fewer, but never could tie the game or take the lead (play-by-play sheet). Playing the final game of a seven-game road-trip, just one night after a double-overtime win in Boston, the Warriors appeared spent.

Wednesday, December 09, 2015

Should Warriors' 4 Wins at End of 2014-15 Regular Season Count as Part of Current Winning Streak?

As virtually all readers of this blog would know, the Golden State Warriors have yet to lose in the 2015-16 NBA season, increasing their record to 23-0 with last night's win at Indiana. The NBA record for longest winning streak is, of course, the 1971-72 Los Angeles Lakers' 33-game stretch.

So the Warriors are 10 wins shy of the record. Well, not necessarily. Golden State won its last four regular-season games of the previous season, so it is technically accurate to say the "Dubs," as they're sometimes called, have won 27 straight regular-season games.

After I mentioned on Twitter the idea of counting the last four games of 2014-15, Lakers fan Len Lester tweeted at me that "if you're carrying over last season gotta include post season too." The Warriors won the NBA title last season, but lost a combined five games in the playoffs.

In thinking about Len's point, I have to admit that it's more than a little odd to claim a continuous win streak from April 2015 (when the regular season ended) to the present, when Golden State lost five games in between.

Hypothetically, if the Warriors get to 30-0 just in the current season and then lose -- giving them 34 straight wins over two regular seasons -- I suspect the NBA might create two entries in its record book: longest winning streak within a single regular season, and longest regular-season winning streak spanning multiple seasons.

I've added a poll in the right-hand column, so readers can vote on whether the last four wins of the previous season should be added to Golden State's current total.

Monday, October 26, 2015

A (Somewhat) Unusual Aspect of Daniel Murphy's Postseason Home-Run Streak

As we approach the opening of the 2015 World Series on Tuesday night, the dominant story line has been the home-run streak of Mets second-baseman Daniel Murphy. He has hit a home-run in each of his last six games, a playoff record.

One of the key principles I've gleaned through nearly 15 years of hot-hand research is that long streaks are most likely to be achieved by players and teams with very high baseline success rates.

The late Harvard paleontologist Stephen Jay Gould famously wrote that, "Long streaks always are, and must be, a matter of extraordinary luck imposed upon great skill." Thus, if you look at some of the longest streaks in American sports history -- 88 straight wins by John Wooden's UCLA men's basketball teams of the 1970s, and 90 by Geno Auriemma's UConn women of the 2000s; Joe DiMaggio's 56-game hitting streak; and Tiger Woods making the cut in 142 straight golf tournaments -- these are all athletes and teams that generally succeeded an enormously high percent of the time. Throw in a little luck to avoid a loss in a close game (or a hitless game or missed cut) and, voila, you have a long streak.

Murphy, however, does not have a high base rate of home-run production, never hitting more than 14 in any of his seven regular seasons, and averaging about nine per year. A home-run streak by Murphy is therefore more out-of-the-ordinary than, say, one by Mike Trout would be. (Click here for a statistical analysis taking this approach.)
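To see how strongly the base rate matters, consider a rough sketch in which each game is an independent trial with a fixed chance of at least one home run. The two per-game probabilities below are made-up illustrative values, not estimates for Murphy, Trout, or any real hitter, and independence across games is itself an assumption.

```python
def streak_probability(p_per_game: float, games: int) -> float:
    """Chance of homering in every one of `games` given games, assuming independence."""
    if not 0.0 <= p_per_game <= 1.0:
        raise ValueError("p_per_game must be a probability")
    return p_per_game ** games


# Hypothetical per-game chances of hitting at least one home run.
LOW_RATE_HITTER = 0.06
HIGH_RATE_HITTER = 0.20

for label, p in [("low-rate hitter", LOW_RATE_HITTER),
                 ("high-rate hitter", HIGH_RATE_HITTER)]:
    print(f"{label}: {streak_probability(p, 6):.2e}")
```

Under these assumed rates, the six-game streak is more than a thousand times likelier for the high-rate hitter, which is the sense in which a streak by a low-rate hitter is the bigger surprise.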

A second unusual feature of Murphy's streak -- and perhaps I'm being too picky -- is that, when a player goes on a monster home-run tear, there's a good chance he'll have one or more multi-homer games within the streak. Murphy has not, hitting exactly one homer per game during his streak.

In September 2010, for example, then-Colorado shortstop Troy Tulowitzki hit 14 home-runs in a 15-game span, including four multi-homer games. (Playing at homer-friendly Coors Field for many of those games probably helped.)

Then there's the case of Hee-Seop Choi, whose brief MLB career included a weekend in June 2005 in which, playing for the Dodgers, he belted six home-runs in a three-game series vs. the Twins (two, one, and three homers, respectively, on Friday, Saturday, and Sunday). He hit only three more homers the rest of the season and never played in the majors after 2005.

Finally, when we look at the players Murphy surpassed for most consecutive postseason games with a homer, the tendency for multi-homer games to be embedded within the streaks is there (highlighted in yellow). The number of homers a player hit in a given game is shown in parentheses. Click on the number in parentheses for the box-score.

Carlos Beltran, 2004, 5 straight games: NLDS Game 5 (2); NLCS Game 1 (1); NLCS Game 2 (1); NLCS Game 3 (1); NLCS Game 4 (1)

Evan Longoria, 2008, 4 straight: ALCS Game 2 (1); ALCS Game 3 (1); ALCS Game 4 (1); ALCS Game 5 (1)

Jim Thome, 1998-1999, 4 straight: '98 ALCS Game 5 (1); '98 ALCS Game 6 (1); '99 ALDS Game 1 (1); '99 ALDS Game 2 (1)

Juan Gonzalez, 1996, 4 straight: ALDS Game 1 (1); ALDS Game 2 (2); ALDS Game 3 (1); ALDS Game 4 (1)

Jeff Leonard, 1987, 4 straight: NLCS Game 1 (1); NLCS Game 2 (1); NLCS Game 3 (1); NLCS Game 4 (1)

Reggie Jackson, 1977-1978, 4 straight: '77 WS Game 4 (1); '77 WS Game 5 (1); '77 WS Game 6 (3); '78 ALCS (1)

Lou Gehrig, 1928-1932, 4 straight: '28 WS Game 2 (1); '28 WS Game 3 (2); '28 WS Game 4 (1); '32 WS Game 1 (1)

Don't get me wrong. Hitting one home-run in each of six straight games, as Murphy has done, is amazing. It's just that, if a player is seeing pitched balls really well or concentrating better than ever, wouldn't we expect such a mental state of being "in the zone" to manifest itself within the same game?

Thursday, September 24, 2015

This Week's Wall Street Journal Interview with Billy Beane and Bill James, and the 2003 Scottsdale, Arizona Informal Sports-Analytics Conference

In March 2003, I attended a small, informal conference in Scottsdale, Arizona (strategically selected to coincide with spring training) with sabermetrically inclined academicians and sports journalists/analysts. The conference was mentioned three days ago in a Wall Street Journal interview with Oakland A's General Manager Billy Beane and Boston Red Sox adviser Bill James (link). My friend Mike Gustafson, remembering that I had attended the conference, brought the WSJ article to my attention (Thanks, Mike!).

Reference to the Arizona meeting came up almost by accident in the interview, in response to a question about a different topic. Here's the relevant part of the article:

WSJ: Is there another sport that stands out to either of you as being most ripe for the kind of revolution baseball has undergone with analytics? 

 James: Football, from a popular perception angle, has lots of openings for analysts to rush in. There’s been this ongoing debate in football. Billy and I met in Arizona. When was that? 

 Beane: 2000, 2001 maybe. It was a little half-conference that Bill had and I joined him for. It was before the book. 

James: The guy who put it together was a distinguished economist from the University of Chicago. One thing we talked about then, I remember, there was a guy from AT&T who studies football, and he was arguing then that it’s foolish for football coaches to punt in many situations in which they actually do punt...

Beane was a couple of years off, and the football analyst mentioned by James is David Romer of UC Berkeley (not AT&T, unless he did research for the telecommunications giant at some other point in his career). Other than that, however, Beane and James were pretty accurate about the conference. Here's a link to Romer's paper, by the way.

My connection is that, along with Tom Gilovich, I was representing hot-hand research. Most of the participants in the meeting are shown in the following photo (with a legend to identify participants at the very bottom of this posting). You may click on the graphics to enlarge them.

The "distinguished economist" from Chicago is Richard Thaler (first row, white shirt). Among his other accomplishments, Thaler co-authored the book Nudge in 2008. Thaler co-organized the conference with Jim Sherman of Indiana University (standing next to Thaler). Beane is not shown in the photo, as he was busy overseeing the A's from their nearby spring-training complex and took time from his day to come by the conference to speak at one session.

One broad theme addressed at the conference was the future of systematic empirical research in guiding sports teams' decision-making. Twelve years post-conference, I think it's safe to say that the use of such research has expanded dramatically beyond the A's, Red Sox, and any other teams that were looking at sabermetrics in 2003.

In 2015, ESPN The Magazine, which has been issuing an annual Sports Analytics Issue for at least the past few years, ranked all MLB, NBA, NHL, and NFL franchises on their commitment to sabermetrics, using a five-level scale (All-In, Believers, One Foot In, Skeptics, and Non-Believers). With the exception of the NFL (where no teams were listed as All-In), the combined total for All-In and Believers exceeded that for Skeptics and Non-Believers in all sports. 

Another indicator is the annual MIT Sloan Sports Analytics Conference, the history of which is documented here. The inaugural meeting in 2007 was held on the MIT campus and attracted approximately 175 people. By 2015, attendance had jumped to 3,200, with the festivities now being held in a large convention center. Three league commissioners (NBA, MLB, and Major League Soccer) spoke at the 2015 conference.

Here's the legend identifying participants in the 2003 Scottsdale conference.

Saturday, August 01, 2015

New Study of NBA 3-PT Contest Heats Up Hot-Hand Debates

A new study of NBA All-Star Weekend three-point shooting contests by Joshua Miller and Adam Sanjurjo, posted to the Social Science Research Network (link), has re-ignited debates over the magnitude of hot-hand effects on basketball shooting. Miller and Sanjurjo have identified a bias in certain types of hot-hand calculations that appears to have led to underestimation of hot-hand effects in previous studies. While there appears to be a broad consensus (including the present writer) on the validity of Miller and Sanjurjo's point, numerous other issues are being debated among the lead writers and commentators on various sports blogs.

First off, let's review the aforementioned bias. Miller and Sanjurjo, as have others, compared basketball shooters' hit rates when hot (in this case, following three straight made shots) to their hit rates following three-shot sequences other than three straight hits (when players are less hot or even cold). The authors' SSRN paper notes that distortion stems from the fact that "conditioning on a streak of three or more hits creates a selection bias in which these hits are removed from the sample, leaving a smaller fraction of hits, thus driving conditional performance on the subsequent shot below the base rate" (p. 9). Here's a concrete illustration. Using part of an example Miller shared in an e-mail, where H = hit and M = miss, the sequence [HHHMHHHM] would yield the not-so-hot result that the player was 0-for-2 on shots following three straight hits, even though the player's overall shooting (6-of-8) was very hot. Further, with a correction formula devised by Miller and Sanjurjo, hot-hand effects now appear to be larger than previously thought (at least within this type of analysis).
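The bias can be seen directly with a small brute-force sketch. The code below is my own illustration, not Miller and Sanjurjo's correction formula: it tallies shots taken right after k straight hits, then averages each sequence's after-streak hit rate over every equally likely sequence of a coin-flip (50%) shooter, skipping sequences in which no shot follows a streak. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def after_streak(seq: str, k: int = 3) -> tuple[int, int]:
    """Return (hits, attempts) on shots taken right after k straight hits."""
    hits = attempts = 0
    for i in range(k, len(seq)):
        if seq[i - k:i] == "H" * k:
            attempts += 1
            hits += seq[i] == "H"
    return hits, attempts


def mean_rate_after_streak(n: int, k: int) -> Fraction:
    """Average per-sequence after-streak hit rate over all length-n H/M sequences.

    Every sequence is treated as equally likely (a 50% shooter with no hot hand);
    sequences with no shot following a k-hit streak are left out of the average.
    """
    rates = []
    for shots in product("HM", repeat=n):
        hits, attempts = after_streak("".join(shots), k)
        if attempts:
            rates.append(Fraction(hits, attempts))
    return sum(rates, Fraction(0)) / len(rates)


print(after_streak("HHHMHHHM"))                      # (0, 2): the 0-for-2 above
print(mean_rate_after_streak(3, 1))                  # 5/12, not 1/2
print(mean_rate_after_streak(8, 3) < Fraction(1, 2)) # True
```

Even though this imaginary shooter has no hot hand at all, the averaged after-streak rate comes out below 50%. An uncorrected comparison therefore starts with a built-in tilt against finding streakiness.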

This finding has sent statistically oriented bloggers to their keyboards with great urgency. Columbia University statistics professor Andrew Gelman headlined his July 9 posting "Hey -- guess what? There really is a hot hand!" The last I checked, Gelman's piece had received 105 comments! Then, on July 21, all-around sabermetrician Phil Birnbaum weighed in on his blog with a posting entitled "A 'hot hand' is found in the NBA three-point contest."

Though some of the commenters on these blogs have gone back and forth over the proper magnitude of the correction for the aforementioned bias and other methodological issues, I think it's easiest to take Miller and Sanjurjo's findings at face value. Table 1 of their paper is very informative, presenting results for 33 players, with and without bias-correction. The authors correctly note that, when averaging over players, those with negative results (shooting worse after a hot streak) can cancel out positive results. A simple look at the frequencies of different results therefore seems warranted, so I have summarized the results concisely from Miller and Sanjurjo's more-elaborate table. Even with the bias-correction (which enhances how streaky a player looks), here's how many players show different increases in shooting percentage conditional on making three straight shots:

Accuracy Gain After 3 Straight Hits ("Hotness")    No. of Players
.20 to .22
.14 to .18
.11 to .12
.05 to .07
.01 to .04

Miller and Sanjurjo's claim that some players exhibit quite appreciable streakiness is well-supported. What about the "typical" or "average" performance? The median for all players (which is unaffected by how far in a positive or negative direction the most extreme values sit) is a .05 or 5% improvement after making three straight shots (16 players above .05, 2 at .05, and 15 below it). This is indeed stronger evidence for basketball-shooting streakiness than we've seen before. For example, a Harvard study (Bocskocsky, Ezekowitz & Stein, 2014) found approximately a 2% hot-hand increase for NBA in-game shooting, using SportVU tracking-camera technology (here and here).

There are a couple of possible reasons why Miller and Sanjurjo's findings may overstate hot-hand effects. As Birnbaum notes, within the 25-shot sequence of the NBA three-point contest, there are five locations, from each of which the player attempts five straight shots. Thus, if a shooter hits his first shot from a given location, he can rely on the same motor/muscle memory in launching the next four shots.

Further, players invited to the NBA three-point-shooting contest are known to be great outside shooters, and players with high base rates of success appear more likely than those with lower base rates to go on hot streaks. Therefore, it would be interesting to see what would happen with a more representative cross-section of NBA players. (Both the motor/muscle-memory and base-rate issues are discussed in my book.)

Even if Miller and Sanjurjo's 5% median hot-hand effect is not inflated, it is still probably a smaller magnitude than most fans would associate with the term "hot hand," as the authors appear to acknowledge. The double-digit percentage-point increases some shooters exhibit after three straight hits, on the other hand, would seem to be closer to a lay characterization of a hot hand.

In addition, Miller, in comments on the Gelman blog, holds the Harvard study to a very high level of scrutiny, in my view. Arguably, too high. For example, Miller notes that it omitted some possible control variables, including "the quality and identity of the defender." However, the Harvard study did control for "Distance of Closest Defender, Angle of Closest Defender, Shooter-Defender Height Difference, and [whether the shooter was] Double Covered." Once all these facets of the defense are accounted for, I don't know how much incremental knowledge we gain from knowing the defensive-efficiency of the player guarding the shooter. I therefore take the Harvard study's 2% estimate of a hot-hand magnitude as having probative value.

In the end, I come down closer to Birnbaum's relatively skeptical view -- including his point that Miller and Sanjurjo's finding should be described as "a" hot hand, rather than "the" hot hand, because, like all studies, it is context-dependent -- than Gelman's more accepting position. Miller and Sanjurjo's hotness magnitudes for the hottest-shooting players are higher than I would have guessed. But the magnitudes for median shooters are only slightly higher than what I would have imagined.