As a discipline, Political Science, the analysis of the theory and practice of politics, has been around for quite a while. (Our own CEO here at Revolution Analytics, Norman Nie, has been a leading academic and author in the field for over 40 years.) But it’s only in recent years that a deluge of data about politics has erupted: detailed demographic information about constituents; tracking polls taken on a daily basis by dozens of polling firms; campaign donations; information from just about every walk of life that can give insight into a voter’s intentions or reactions to policy. Around election time in particular, new data is captured and published on a minute-by-minute basis. As a result, advanced statistical techniques that lend themselves to drawing nuances from disparate data streams are increasingly being used to forecast the results of elections.
Take one recent example: the British parliamentary elections. Professor Simon Hix and Nick Vivyan of the London School of Economics and Political Science used R to analyse polling data. Their Hix-Vivyan Prediction method pools data from numerous national polls to infer which party's candidate will be elected as MP in each constituency, and thereby predict the outcome of the election. R is an ideal system for this kind of analysis: not only does it provide the advanced statistical techniques to do the analysis and make the predictions, but because it’s a scripting language they were able to re-run the analysis on a day-by-day basis as new polling data was released and present the results as beautiful graphics like these:
On the day before the election, the Hix-Vivyan model predicted the Conservatives would win 293 seats, shy of the 326 required to avoid a hung parliament. (A hung parliament was indeed the result, with the Conservatives at 306 seats, eventually forming a coalition with the Liberal Democrats.)
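To make the general idea of pooling national polls and translating them into seats a little more concrete, here is a minimal Python sketch. It is not the Hix-Vivyan method, whose details aren't described in this post. It assumes a much simpler approach: a sample-size-weighted average of polls, followed by a uniform national swing applied to each constituency's previous result. All poll figures, constituency names and vote shares below are made up for illustration.

```python
from collections import Counter

def pool_polls(polls):
    """Pool national polls into one estimate, weighting each poll by its sample size."""
    total_n = sum(p["n"] for p in polls)
    parties = polls[0]["shares"].keys()
    return {
        party: sum(p["n"] * p["shares"][party] for p in polls) / total_n
        for party in parties
    }

def project_seats(previous_results, previous_national, pooled):
    """Apply a uniform national swing to each seat and count the projected winners."""
    swing = {party: pooled[party] - previous_national[party] for party in pooled}
    winners = Counter()
    for shares in previous_results.values():
        projected = {party: share + swing[party] for party, share in shares.items()}
        winners[max(projected, key=projected.get)] += 1
    return winners

# Hypothetical polls: vote shares in percent, n = sample size.
polls = [
    {"n": 1000, "shares": {"Con": 37, "Lab": 27, "LD": 28}},
    {"n": 3000, "shares": {"Con": 33, "Lab": 31, "LD": 24}},
]

# Hypothetical national shares at the previous election.
previous_national = {"Con": 32, "Lab": 35, "LD": 22}

# Hypothetical previous results in four invented constituencies.
previous_results = {
    "Seat A": {"Con": 40, "Lab": 45, "LD": 15},
    "Seat B": {"Con": 30, "Lab": 50, "LD": 20},
    "Seat C": {"Con": 35, "Lab": 25, "LD": 38},
    "Seat D": {"Con": 45, "Lab": 40, "LD": 15},
}

pooled = pool_polls(polls)
seats = project_seats(previous_results, previous_national, pooled)
print(pooled)
print(dict(seats))
```

Because everything lives in a script, re-running the projection when a new poll arrives is just a matter of appending one more entry to `polls` and running it again, which is the same day-by-day workflow described above.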
This is just one example of political scientists using advanced statistical techniques to predict election outcomes. Nate Silver at fivethirtyeight.com also tracked the UK election closely, and his in-depth analyses of the US House, Senate and Presidential elections are must-reads for any junkie of the US election system. (Incidentally, Nate has also recently branched out into ranking the soccer World Cup teams using statistical techniques.) Andrew Gelman regularly posts about political analysis (always with a Bayesian perspective, and often using R), for example on the recent primary elections in the US. And Boris Shor (from the University of Chicago) often publishes in-depth analysis of individual races in US elections at his blog (click here to download a case study on how he uses Revolution R Enterprise for the analysis). In fact, there’s so much going on in statistical analysis of US elections that I think we’ll come back to the topic in a follow-up post.
[Update: Corrected spelling of both Hix and Vivyan. Apologies to both.]
British politics and policy at LSE: One day to go: Hix-Vivyan Prediction up to 3 May