Germany most likely to win Euro 2016

June 13, 2016

(This article was first published on Rbloggers Quantifying Information, and kindly contributed to R-bloggers)

After World Cup 2014, we are finally facing the next spectacular football event: Euro 2016. With billions of football fans spread all over the world, football still seems to be the single most popular sport. This might have something to do with the fact that football is a game of underdogs: David could beat Goliath any day. Just take a look at the marvelous story of underdog Leicester City in this year’s Premier League season. It is this high uncertainty in future match outcomes that keeps everybody excited and puzzled about the one question: who is going to win?

A question, of course, that feels like a perfectly designed challenge for data science, with an ever-increasing wealth of football match statistics and other related data freely available nowadays. It comes as no surprise, then, that Euro 2016 also puts statistics and data mining into the spotlight. Next to “who is going to win?”, the second question is: “who is going to make the best forecast?”

Besides the big players of the industry, whose predictions traditionally get most of the media attention, there is also a lesser-known academic research group that has already had remarkable success in forecasting. Andreas Groll and Gunther Schauberger from Ludwig-Maximilians-University, together with Thomas Kneib from Georg-August-University, have again set out to forecast the next Euro champion, after correctly predicting the champions of Euro 2012 and World Cup 2014.

Based on publicly available data and the gamlls R package, they built a model to forecast the probabilities of win, tie and loss for any game of Euro 2016. (Actually, they even get probabilities at a finer level: the exact number of goals for both teams. For more details on the model, take a look at their preliminary technical report.)
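To illustrate how exact-score probabilities collapse into win, tie and loss probabilities, here is a minimal sketch. It assumes that the goals of the two teams follow independent Poisson distributions, which is a simplification chosen for illustration and not necessarily what the report's model does. The function names and the scoring rates are hypothetical.

```python
import math


def poisson_pmf(goals: int, rate: float) -> float:
    """Probability of scoring exactly `goals` goals under a Poisson(rate) model."""
    return math.exp(-rate) * rate ** goals / math.factorial(goals)


def outcome_probabilities(rate_a: float, rate_b: float, max_goals: int = 15):
    """Sum exact-score probabilities into (win, tie, loss) from team A's view.

    Assumption: goals of both teams are independent Poisson variables.
    Scores above `max_goals` are ignored and the result is renormalised.
    """
    win = tie = loss = 0.0
    for goals_a in range(max_goals + 1):
        p_a = poisson_pmf(goals_a, rate_a)
        for goals_b in range(max_goals + 1):
            p_score = p_a * poisson_pmf(goals_b, rate_b)
            if goals_a > goals_b:
                win += p_score
            elif goals_a == goals_b:
                tie += p_score
            else:
                loss += p_score
    total = win + tie + loss
    return win / total, tie / total, loss / total


# Hypothetical expected goals per game, not taken from the report.
win, tie, loss = outcome_probabilities(rate_a=1.8, rate_b=1.0)
print(f"win {win:.3f}, tie {tie:.3f}, loss {loss:.3f}")
```

The same summation works for any table of exact-score probabilities, whatever model produced it: everything above the diagonal counts as one result, the diagonal as a tie, and everything below as the other result.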

This is what their model deems the most likely course of the tournament this time:

[Figures: predicted group results and predicted knockout tree]

The model takes into account not only individual team strengths but also the group constellations, which were randomly drawn and also influence the tournament’s outcome. This is what their model predicts as probabilities for each team and each possible achievement in the tournament:


So good news for all Germans: after already winning World Cup 2014, “Die Mannschaft” seems to be getting its hands on the next big international football title!

Well, do they? Let’s see…

Mainstream media usually picks up only on the prediction of the Euro champion, the “point forecast”, so to speak. Keep in mind that although this single outcome may well be the most likely one, it is still quite unlikely in itself, with a probability of only 21.1%. So from a statistical point of view, you should not judge the model only on whether it is able to predict the champion again, as this would require a good portion of luck, too. Just imagine the probability of the most likely champion were 30% in each tournament: then getting it right three times in a row has a probability of merely 0.3^3 = 0.027, or 2.7%. So in order to really evaluate the goodness of the model, you need to check its forecasting power on a number of games and see whether it consistently does a good job, or even outperforms bookmakers’ odds. Although the report does not list the probabilities for each individual game, you can still get quite a good feeling for the goodness of the model, for example, by looking at the predicted group standings and playoff participants. Just compare them to what you would have guessed yourself. Who’s the better football expert?
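The arithmetic above, and the idea of judging forecasts over many games, can be sketched in a few lines. This is only an illustration. The mean log score used here is one common way to compare probabilistic forecasts and is an assumption of this sketch, not a method named in the report. The function names and all forecast numbers below are made up.

```python
import math


def prob_all_correct(top_probabilities):
    """Chance that the favourite wins every time, assuming independent tournaments."""
    return math.prod(top_probabilities)


def mean_log_score(forecasts, outcomes):
    """Average log probability assigned to what actually happened (higher is better).

    `forecasts` is a list of dicts such as {"win": 0.5, "tie": 0.3, "loss": 0.2},
    `outcomes` is the list of observed results, e.g. ["win", "tie", ...].
    """
    total = sum(math.log(f[o]) for f, o in zip(forecasts, outcomes))
    return total / len(outcomes)


# Three correct champion picks in a row, each with a 30% favourite.
three_in_a_row = prob_all_correct([0.3, 0.3, 0.3])
print(round(three_in_a_row, 3))  # 0.027

# Hypothetical forecasts for three games, compared on the same observed results.
observed = ["win", "tie", "win"]
model = [
    {"win": 0.60, "tie": 0.25, "loss": 0.15},
    {"win": 0.35, "tie": 0.35, "loss": 0.30},
    {"win": 0.55, "tie": 0.25, "loss": 0.20},
]
bookmaker = [
    {"win": 0.50, "tie": 0.28, "loss": 0.22},
    {"win": 0.45, "tie": 0.27, "loss": 0.28},
    {"win": 0.50, "tie": 0.27, "loss": 0.23},
]
print(mean_log_score(model, observed) > mean_log_score(bookmaker, observed))
```

A forecaster who puts more probability on the results that actually occur, game after game, ends up with the higher score, regardless of whether the single most likely champion happens to win.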

