# NYT uses R to forecast Senate elections

April 25, 2014

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Nate Silver's departure to relaunch FiveThirtyEight.com left a bit of a hole at the New York Times, which The Upshot, the new data journalism practice at the Times, seeks to fill. And they've gotten off to a great start with the new Senate forecasting model, called Leo. Leo was created by Amanda Cox (longtime graphics editor at the NYT) and Josh Katz (creator of the Dialect Quiz), and uses a poll-aggregation methodology similar to Silver's. The model itself is implemented in the R language, and the R code is available for inspection on GitHub.

As of this writing, Leo suggests that the Democrats have a 51% chance of retaining control of the Senate in the 2014 elections. Now, probabilities are subtle things, and some commentators are prone to turn an estimate like this into "Leo predicts the Democrats will win the Senate in 2014". Of course, that would still be a risky bet to make, "essentially the same as a coin flip" as the Leo website says. As long as the probability isn't 0% or 100%, the actual outcome is always in doubt, dependent on factors we can't measure and over which we have no control. (In other words, luck.) The Upshot does a lovely job of demonstrating this variability with a feature that spins roulette wheels, loaded according to the data (mainly polls) for each race, and simulates one possible outcome.

This is a fantastic way to demonstrate the inherent variability in any statistical forecast, and it ties in with the underlying methodology as well. (For the statisticians out there, those wheels don't spin independently; correlations between individual races are taken into account.) Spin those wheels hundreds or thousands of times, count the number of times Democrats or Republicans win, and you have an estimate of the overall probability each party wins. Nicely done!
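To make the "spin the wheels many times" idea concrete, here is a minimal Python sketch of a simulation with correlated races. It is not Leo's code (that is written in R and lives on GitHub). The function name `simulate_chamber`, the polling leads, and the two standard deviations are all made up for illustration. The correlation between races is approximated with a single shared "national shift" drawn once per simulation, which is an assumption of this sketch and not necessarily how Leo handles it.

```python
import random


def simulate_chamber(margins, seats_needed, n_sims=10_000,
                     national_sd=3.0, race_sd=5.0, seed=0):
    """Estimate the chance a party wins at least `seats_needed` races.

    margins: dict of race name -> polling lead in percentage points
             (positive means the party is ahead). Hypothetical inputs.
    national_sd, race_sd: assumed error sizes, chosen for illustration.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # One shared draw per simulation makes all races move together,
        # so the "wheels" are not independent.
        national_shift = rng.gauss(0.0, national_sd)
        seats = 0
        for lead in margins.values():
            # Race-specific noise on top of the shared shift.
            outcome = lead + national_shift + rng.gauss(0.0, race_sd)
            if outcome > 0:
                seats += 1
        if seats >= seats_needed:
            wins += 1
    # Share of simulated elections in which the party reached the target.
    return wins / n_sims


# Made-up polling leads, not real data.
example_margins = {"State A": 4.0, "State B": -1.5, "State C": 0.5,
                   "State D": -3.0, "State E": 2.0}
prob = simulate_chamber(example_margins, seats_needed=3)
print(f"Estimated probability of winning 3+ of 5 races: {prob:.2f}")
```

Each pass through the outer loop corresponds to one spin of all the wheels; the returned fraction is the estimate of the overall probability. Setting `national_sd=0.0` would make the races independent, which is a quick way to see how much the shared shift changes the result.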

The Upshot: Senate Forecasts (via Sharon Machlis)
