FiveThirtyEight vs. The Oddsmakers
You come at the king, you best not miss. - Omar Little
At the start of this year’s NCAA tournament, FiveThirtyEight, the new website of reigning forecast champion Nate Silver, predicted each team’s chances of making it to different rounds of the tournament. In an update yesterday, FiveThirtyEight looked into how their forecasts were doing. Having made my own predictive bracket based on Las Vegas odds, I figured I’d do the same—and see who comes out on top.
How Did FiveThirtyEight Do?
Rather than simply forecasting winners, FiveThirtyEight’s predictions, like mine, calculate each team’s probability of winning every game. To assess how well these forecasts performed, it isn’t enough to count how many of their “favorites” won. Instead, it’s better to see whether favorites win more or less often than expected. In other words, if FiveThirtyEight identified 100 games in which the favorite had a 60% chance of winning, the favorite should actually win about 60 of them. If the results are substantially different from that, then it’s an indication that something’s wrong with the model.
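The check described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the charts in this post: the function name `calibration_table`, the input format (pairs of the favorite’s predicted probability and whether the favorite won), and the default bucket width are all assumptions made for the example.

```python
from collections import defaultdict


def calibration_table(games, bucket_width=0.1):
    """Compare predicted and observed win rates for favorites.

    `games` is an iterable of (favorite_prob, favorite_won) pairs, where
    favorite_prob is between 0.5 and 1.0 and favorite_won is a bool.
    (Hypothetical input format, chosen for this sketch.)
    """
    n_buckets = round(0.5 / bucket_width)
    buckets = defaultdict(list)
    for prob, won in games:
        # Round before truncating so that boundary values such as 0.6
        # are not pushed into the lower bucket by floating-point error.
        idx = int(round((prob - 0.5) / bucket_width, 9))
        # A probability of exactly 1.0 belongs in the top bucket.
        idx = min(idx, n_buckets - 1)
        buckets[idx].append((prob, won))

    rows = []
    for idx in sorted(buckets):
        entries = buckets[idx]
        n = len(entries)
        rows.append({
            "bucket_low": 0.5 + idx * bucket_width,
            "games": n,
            "mean_predicted": sum(p for p, _ in entries) / n,
            "observed": sum(1 for _, w in entries if w) / n,
        })
    return rows
```

For a well-calibrated model, `observed` should land near `mean_predicted` in each row, with more noise in buckets that contain few games.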
Over the last several years, Silver’s predictions have performed well. The chart below, reproduced using data provided by FiveThirtyEight, compares game results to FiveThirtyEight’s forecasts. As it shows, if you group games by the predicted odds of the favorite winning, the favorites’ actual win rate in each group falls close to the predicted range.
As Silver noted in his post, though the results in each bucket don’t precisely match the forecast, they fall reasonably close and well within his confidence intervals. Silver’s model, it appears, works reasonably well.
Silver vs. Vegas
While FiveThirtyEight’s bracket is based on team rankings and a few other factors, I based my bracket solely on Las Vegas odds. Though the predictions are different, our brackets’ favorites are all the same, except for the Championship game. FiveThirtyEight gave a slight edge to Louisville over Florida, while Vegas preferred Florida by a slim margin.
Unfortunately, though Silver’s predictions go back to 2011, I only made forecasts for the tournaments last year and this year. To make the comparison equal, I first trimmed Silver’s data to include only 2013 and 2014 results. The chart below shows the same calculations as above for FiveThirtyEight (the buckets were made larger to adjust for the smaller sample size).
Unsurprisingly, with a smaller sample (especially one that includes the chaos of last year’s tournament), Silver’s model looks tarnished (sorry). Still, the trend is generally in the right direction.
Compared to my predictions using Vegas odds, however, Silver regains his luster. Vegas (or my method of interpreting Vegas) performs worse than FiveThirtyEight. As the chart shows (it overlays my model’s results with Silver’s), my model does a particularly poor job of identifying solid but not overwhelming favorites: favorites won only half the games in which they were expected to win 70% to 80% of the time.
Why The Difference?
Before conceding to Nate Silver’s sterling record and accepting that he is just better at this than me, it’s worth looking into why our predictions came out so differently. Fortunately, there’s a fairly clear explanation. Silver recalculates his forecasts as the tournament progresses, updating the predictions after each game. This has two effects. First, his calculations respond to positive and negative signals from previous games. For instance, Virginia’s blowout win against Memphis could improve their odds against Michigan State, or Iowa State’s loss of Georges Niang to injury could lower their chances against Connecticut. My calculations were based on Vegas odds at the beginning of the tournament and did not respond to new results.
Second—and more importantly—for all the games beyond the first round (when matchups were unknown), I computed game odds by comparing each team’s odds of making the Final Four. This is an imperfect calculation, most notably because those odds are based on a team’s entire path to the Final Four. For teams that face very challenging first games, their odds of making the Final Four are quite low. However, my model doesn’t make any adjustments for teams that overcome this first game.
Florida Gulf Coast’s Cinderella run last year is a perfect example of both of these cases. Not only did Florida Gulf Coast demonstrate that they were a better team than many thought by beating Georgetown and San Diego State by a total of 20 points, but they also cleared two tough hurdles between them and the Final Four. In part because of both of these factors, Silver’s model gave Florida Gulf Coast a 5.8% chance against Florida. My model—which was still based on Florida Gulf Coast’s original 1% chance of making the Final Four—only gave them a 0.2% chance against Florida.
This problem, however, can be partially corrected by only looking at first round games. Because these match-ups are known, forecasts are based on actual game lines rather than derived matchups.
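As a rough illustration of how a game line can be turned into a win probability, here is a minimal sketch using the standard conversion for American moneylines. It is an assumption that the line is quoted as a moneyline (the lines behind this post may have been quoted differently, for example as point spreads), and the function names are hypothetical. The second function rescales the two sides so they sum to 1, which removes the bookmaker’s margin in the simplest possible way.

```python
def implied_probability(moneyline):
    """Convert an American moneyline to its raw implied win probability.

    Negative lines (favorites) state how much must be staked to win 100;
    positive lines (underdogs) state how much a 100 stake wins.
    """
    if moneyline < 0:
        return -moneyline / (-moneyline + 100)
    return 100 / (moneyline + 100)


def no_vig_probabilities(favorite_line, underdog_line):
    """Rescale both implied probabilities so they sum to 1.

    Raw implied probabilities usually add up to more than 1 because of
    the bookmaker's margin; proportional rescaling is one simple way to
    remove it (an assumption of this sketch, not the only option).
    """
    fav = implied_probability(favorite_line)
    dog = implied_probability(underdog_line)
    total = fav + dog
    return fav / total, dog / total
```

For example, `no_vig_probabilities(-150, 130)` starts from raw probabilities of 0.6 and roughly 0.435, then rescales them so the favorite’s share is a little under 0.58.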
As suspected, the differences in projections are less apparent in the first round. Charting the relative predictions for each game shows that, in the first round, the differences between models are more or less random and clustered around zero. In later rounds, when I’m deriving probabilities, Vegas odds almost universally overestimate the favorite’s chances. This makes sense, given the Florida Gulf Coast example above.
As this suggests, looking at only games in the first round, the models fare similarly. The chart below shows the same buckets as before but only includes first round games, and in this case, the model predictions are more closely aligned.
Based on this, when picking your bracket next year, it doesn’t really matter if you go with Vegas or Nate Silver in the first round. For later round games, Nate Silver has left me hiding behind a car as he whistles The Farmer in the Dell (WARNING: that link is a Wire spoiler). But this does raise an interesting question: If FiveThirtyEight predicted every matchup at the start of the tournament, how would their results look? And how do FiveThirtyEight’s forecasts compare to Vegas lines at the start of each game? In other words, if we level the playing field, should I bet on Nate Silver or the true kings of sports forecasting: the oddsmakers?
Data was collected from FiveThirtyEight and calculated using Vegas odds. All analysis, data, and visualization code can be found in Mode. The graphs are backed by Variance, an excellent new visualization library.
Benn Stancil is the chief analyst of Mode.