Thursday, November 30, 2006

Being a faculty member at Texas Tech University, I periodically check out the Internet discussion boards related to the school's sports teams. It was there that I learned a few hours ago that the Red Raider men's basketball team is, at the moment, leading the NCAA in three-point shooting percentage.

Texas Tech has made 58 of 115 attempts from behind the arc (50.4%). While I was looking at the team statistics, I decided to peruse the individual three-point shooting statistics, as well.

Setting aside three players who are each 2-for-2 (100%) on three-pointers but have too few attempts to count, the current national leader among individuals is BYU's Austin Ainge, who has hit 12 of 17 (70.6%). (For those who are wondering, Austin is indeed the son of former NBA guard Danny.) I guess you can say the young Ainge has the range!

Neither Texas Tech's 50% success rate as a team, nor Ainge's 70% rate, is likely to hold up for the season. Last year's three-point percentage leaders at the end of the season were Southern Utah (team) at 42.9% and Northern Arizona's Stephen Sir (individual) at 48.9%.

The current season is about one-fourth of the way through. What we're likely seeing, therefore, is the extremity of results associated with small numbers of observations. This concept was first brought to my attention by Geoff Fong in the spring of 1984, when he was on the faculty at Northwestern and I was visiting during my tour of prospective graduate schools (I ultimately chose Michigan).

Geoff was telling me about his research on statistical reasoning, and he pointed out how, early in every Major League Baseball season, the list of batting leaders will tend to have several players hitting above .400, yet there would be virtually no chance of any player ending the season at that level (the last player to hit .400 or better for a season was, of course, Ted Williams in 1941).

This statistical document describes the small-numbers phenomenon a bit more technically:

...all other things being equal, variation is more pronounced with small samples than with large ones. The larger your sample, the more stable your results will be. They will be less subject to the possibility that another study would produce greatly different results. A corollary is that large samples are less likely to produce extreme results. For example, assuming that you have a fair coin, it's much more difficult to get all heads when you toss a coin 50 times than when you toss it only two or three times.
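The coin example in that passage is easy to check by hand. Here is a minimal sketch in Python; the function name prob_all_heads is my own, and the only assumptions are a fair coin and independent tosses.

```python
def prob_all_heads(n_tosses: int, p_heads: float = 0.5) -> float:
    """Probability that every one of n_tosses independent tosses lands heads."""
    return p_heads ** n_tosses


# Compare the short runs from the quoted passage with a 50-toss run.
for n in (2, 3, 50):
    print(f"{n:>2} tosses: P(all heads) = {prob_all_heads(n):.3g}")
```

With two or three tosses, all heads comes up a quarter or an eighth of the time; with 50 tosses the probability is vanishingly small.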

Let's use last year's Texas Tech three-point success rate of .390 as a baseline for this year's squad (though there has been some change in personnel, most of the Red Raiders' outside shooters are still on the team, including offensive stalwart Jarrius [Jay] Jackson).

Using an online calculator for what is known as a binomial probability, we can ask how likely it is that a .390 three-point shooting team (which is what this year's Red Raiders are assumed to be, based on last year) could make 58 (or more) treys in 115 attempts. The answer is .008, a little less than 1-in-100, so what the Red Raiders are doing is already very rare statistically. Eventually, we may have to reject our "null hypothesis" that Texas Tech really has an underlying .390 probability of making threes.
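For readers who would rather compute this than look it up, here is a rough sketch of the same kind of upper-tail binomial calculation in Python. It is not the online calculator itself; the function name binom_upper_tail is my own, and it assumes every shot is independent with the same .390 chance of going in.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) when X counts successes in n independent trials, each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Texas Tech so far: 58 or more made threes in 115 attempts,
# with last year's .390 rate taken as the assumed true rate.
p_hot_start = binom_upper_tail(58, 115, 0.390)
print(f"P(58 or more makes in 115 attempts) = {p_hot_start:.4f}")
```

The result should land a little under 1-in-100, in line with the figure quoted above.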

As noted above, however, the larger the sample, the less susceptible it is to unusually high or low success rates. To approximate a full season's worth of shots (i.e., a larger sample) instead of just a quarter season, I multiplied Texas Tech's current number of made threes (58 X 4 = 232) and number of attempts (115 X 4 = 460) by four. The ratio of 232/460 is the same as the Raiders' current three-point percentage of 50.4, but would be a much longer-term accomplishment. Again using .390 as a baseline, the team's probability of hitting 50.4% (or better) of 460 three-point attempts is far tinier than before: .0000004, or about 4 in 10 million.
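To see how quickly the probability shrinks as the sample grows, here is a small Python sketch that scales the current totals by 1, 2 and 4. The helper binom_upper_tail and the dictionary tail_by_multiplier are my own names; the sketch assumes independent shots at a constant .390 rate, and the doubled totals are only an intermediate illustration.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) when X counts successes in n independent trials, each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


BASELINE = 0.390           # last year's team rate, used as the assumed true rate
MADE, ATTEMPTS = 58, 115   # current totals, about a quarter of a season

tail_by_multiplier = {}
for multiplier in (1, 2, 4):
    made, attempts = MADE * multiplier, ATTEMPTS * multiplier
    tail_by_multiplier[multiplier] = binom_upper_tail(made, attempts, BASELINE)
    print(f"{made}+ of {attempts}: P = {tail_by_multiplier[multiplier]:.2g}")
```

The same 50.4% rate becomes far less likely for a .390 team each time the number of attempts grows.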

Another potentially relevant concept that I'd like to mention briefly is regression toward the mean, which Lady Raider basketball announcer Ryan Hyatt sometimes invokes in his radio broadcasts. Regression toward the mean simply refers to the tendency for extreme values in the early rounds of performance -- either extremely high or extremely low -- to be followed by values more in the center of the distribution.
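Regression toward the mean can be illustrated with a small simulation. The Python sketch below is purely hypothetical: it invents 5,000 teams that all have the same true .390 rate, gives each 115 early attempts and 345 later ones (the rest of a 460-attempt season), and then looks only at the teams that started as hot as Texas Tech. All names and settings are my own assumptions.

```python
import random


def simulate_makes(attempts: int, p: float, rng: random.Random) -> int:
    """Count made shots in a run of independent attempts, each made with probability p."""
    return sum(rng.random() < p for _ in range(attempts))


def hot_start_followup(n_teams=5000, p=0.390, early=115, later=345,
                       hot_makes=58, seed=2006):
    """Return (number of hot starters, their average early rate, their average later rate)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    early_rates, later_rates = [], []
    for _ in range(n_teams):
        early_made = simulate_makes(early, p, rng)
        later_made = simulate_makes(later, p, rng)
        if early_made >= hot_makes:  # keep only teams with a hot start
            early_rates.append(early_made / early)
            later_rates.append(later_made / later)
    n_hot = len(early_rates)
    if n_hot == 0:
        return 0, float("nan"), float("nan")
    return n_hot, sum(early_rates) / n_hot, sum(later_rates) / n_hot


n_hot, early_avg, later_avg = hot_start_followup()
print(f"hot starters: {n_hot}, early rate: {early_avg:.3f}, later rate: {later_avg:.3f}")
```

Because every simulated team has the same underlying rate, the hot starters' later shooting should fall back toward .390, even though their early rate was above .500.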

In conclusion, the statistical phenomena of small samples and regression toward the mean both suggest that the Texas Tech men will suffer some drop-off from their current 50.4% three-point shooting percentage. You probably don't need to have a statistics teacher tell you a 50% three-point shooting clip is unlikely to be maintained for a full season, any more than you need one to tell you that baseball players batting over .400 for the first month of the season will almost certainly fall off in their averages. If, however, you have some interest in the statistical concepts associated with teams' and players' fall-off after hot starts, you've visited the right place!

Saturday, November 25, 2006

In about a half-hour, the Utah Jazz will attempt to improve upon its 12-1 start to the current NBA season (game-by-game log; ignore the pre-season games that are listed first). The Jazz finished exactly at .500 last year (41-41), so such a torrid start this season comes as a surprise to most observers. This article provides some ideas of why Utah appears to be so improved.

Thursday, November 23, 2006

Happy Thanksgiving!

There was a men's college basketball game televised earlier today, in which Southern Illinois went scoreless in overtime in losing to Arkansas. Going scoreless in OT seems like an interesting type of cold hand.

Going scoreless for a five-minute stretch at any point in a game is probably fairly unusual. Further, if a game goes to overtime, that would seem to suggest the teams are pretty evenly matched (at least on that day or night). Therefore, one team shouldn't be able to shut out the opponent by sheer intimidation, for example by continually pressing and stealing the ball.

Shot clocks range from giving teams 24 seconds per possession to shoot in the NBA to 35 seconds in men's NCAA play (women's college ball uses 30 seconds, whereas the WNBA switched last season to 24 seconds from 30). Thus, unless both teams exhaust their full allotments of time to shoot, it would seem that teams could get about two possessions per minute, or 10 for an entire overtime. That's a lot of shots to miss (although a team could have fewer, due to turnovers), not to mention possible free throws.

One mechanism by which a team could go scoreless in OT -- of which you'll see some apparent evidence below -- is that it could get desperate after falling behind early in the extra period and then start jacking up threes.

I naturally wondered how often overtime shutouts have occurred. To get an estimate, I did some web searching using keywords such as overtime, scoreless, shut out and, to exclude other sports, basketball. It might not be the most scientific way to approach the problem, but it should provide a ballpark (or in this case, arena) figure. Below is a list of games I found from 2000 onward, complete with web links to game articles and box scores.

Men's College

Southern Illinois (vs. Arkansas), November 23, 2006
(SIU was 0-3 from the field in OT, no FT attempts)

George Mason (vs. James Madison), February 7, 2004
("The Patriots missed all six of their field goal tries, four from behind the arc, and went 0-for-2 at the free throw line in overtime.")

Women's College

Indiana (vs. Michigan State), February 29, 2004
(IU was 0-6 in OT field goal attempts, all from three-point land, no FT attempts)

Men's Pro

Boston (vs. Indiana), April 29, 2003
(first OT shutout in NBA play-off history; Celtics missed six shots from field and two FT attempts)

Vancouver (vs. Indiana), December 2, 2000

Women's Pro

None found.

Monday, November 13, 2006

With yesterday's 17-16 win over the Buffalo Bills, the Indianapolis Colts have gotten off to a 9-0 start this season. This makes them the only team in NFL history to start out 9-0 in two consecutive seasons. Last year, in fact, the Colts won their first 13 games of the season.

Obviously, winning regular-season games is not the problem for Indy - it's getting to the Super Bowl. We'll see if things are any better this season, come play-off time.

Friday, November 10, 2006

Some players in a given sport seem to perform at the same level night after night, whereas others show more variability from good to bad in how they do. Is it more advantageous to have one type of player than the other? Sal Baxamusa investigates this question with regard to selected MLB starting pitchers, in The Hardball Times.