(By Achim Zeileis)

From 10 June to 10 July 2016 the best European football teams will meet in France to determine the European Champion in the UEFA European Championship 2016 tournament. For the first time 24 teams compete, expanding the format from 16 teams as in the previous five Euro tournaments. To forecast the winning probability of each team, a predictive model based on bookmaker odds from 19 online bookmakers is employed.

The favorite is the host France with a forecasted winning probability of 21.5%, followed by the current World Champion Germany with 20.1%. The defending European Champion Spain follows after some gap with 13.7%, and all remaining teams are predicted to have lower chances, with England (9.2%) and Belgium (7.7%) being the “best of the rest”.

Furthermore, by complementing the bookmaker consensus results with simulations of the whole tournament, predicted pairwise probabilities for each possible game at the Euro 2016 are obtained, along with “survival” probabilities for each team proceeding to the different stages of the tournament. For example, it is much more likely that top favorites France and Germany meet in the semifinal (7.8%) than in the final at the Stade de France (4.2%). The latter would be a re-match of the friendly game that was played on 13 November 2015 during the terrorist attacks in Paris and that France won 2-0. Hence it is perhaps better that the tournament draw favors a match in the semifinal at Marseille (with an almost even winning probability of 50.1% for France). The most likely final is then that either of the two teams plays against the defending champion Spain, with a probability of 5.7% for France vs. Spain and 5.4% for Germany vs. Spain.

All of these forecasts are the result of a bookmaker consensus rating proposed in Leitner, Hornik, and Zeileis (International Journal of Forecasting, 26(3), 471-481, 2010).
This technique correctly predicted the winner of the FIFA World Cup 2010 and Euro 2012 tournaments, while missing the winner but correctly predicting the final for the Euro 2008 and three out of four semifinalists at the FIFA World Cup 2014. A new working paper about the UEFA Euro 2016, upon which this blog post is based, applies the same technique and is introduced here.

The core idea is to use the expert knowledge of international bookmakers. These have to judge all possible outcomes in a sports tournament such as the UEFA Euro and assign odds to them. Doing a poor job (i.e., assigning odds that are too high or too low) will cost them money. Hence, in our forecasts we rely solely on the expertise of 19 such bookmakers. Specifically, we (1) adjust the quoted odds by removing the bookmakers’ profit margins (or overround, typically around 15%), (2) aggregate and average these to a consensus rating, and (3) infer the corresponding tournament-draw-adjusted team abilities using the Bradley-Terry model for pairwise comparisons.

For step (1), it is assumed that the quoted odds are derived from the underlying “true” odds as: quoted odds = odds · α + 1, where + 1 is the stake (which is paid back to the bookmakers’ customers in case they win) and α is the proportion of the bets that is actually paid out by the bookmakers. The so-called overround is the remaining proportion 1 − α and the main basis of the bookmakers’ profits (see also Wikipedia and the links therein). For the 19 bookmakers employed in this analysis, the median overround is a sizeable 15.1%.

Subsequently, in step (2), the overround-adjusted odds are transformed to the log-odds (logit) scale, averaged for each team, and transformed back to winning probabilities (displayed in the barchart above).

Finally, step (3) of the analysis uses the following idea:
- If team abilities are available, pairwise winning probabilities can be derived for each possible match using a Bradley-Terry approach.
- Given pairwise winning probabilities, the whole tournament can be easily simulated to see which team proceeds to which stage in the tournament and which team finally wins.
- Such a tournament simulation can then be run sufficiently often (here 100,000 times) to obtain relative frequencies for each team winning the tournament.
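
To make the three steps more concrete, here is a minimal Python sketch. It is not the code behind the forecasts, and it rests on several assumptions. The post does not say how α is determined, so the sketch picks α per bookmaker such that the implied winning probabilities sum to 1. All function names, the quoted odds, the team abilities, and the four-team bracket are hypothetical. The sketch also does not perform the inverse step of inferring abilities from the consensus probabilities; it only shows how given abilities turn into Bradley-Terry match probabilities and simulated tournament frequencies.

```python
import math
import random


def remove_overround(quoted, tol=1e-12):
    """Return (alpha, probabilities) for one bookmaker's quoted decimal odds.

    Inverts quoted = odds * alpha + 1, i.e. odds = (quoted - 1) / alpha,
    so the implied winning probability is 1 / (1 + odds).
    ASSUMPTION: alpha is chosen so that the probabilities sum to 1.
    """
    def total(alpha):
        return sum(alpha / (alpha + q - 1.0) for q in quoted)

    lo, hi = 0.0, 1.0  # total() is increasing in alpha, so bisection works
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if total(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    alpha = (lo + hi) / 2.0
    return alpha, [alpha / (alpha + q - 1.0) for q in quoted]


def consensus(prob_rows):
    """Average each team's probability across bookmakers on the logit scale."""
    n_books = len(prob_rows)
    result = []
    for j in range(len(prob_rows[0])):
        mean_logit = sum(math.log(row[j] / (1.0 - row[j])) for row in prob_rows) / n_books
        result.append(1.0 / (1.0 + math.exp(-mean_logit)))
    return result


def bt_prob(ability_i, ability_j):
    """Bradley-Terry probability that team i beats team j."""
    return ability_i / (ability_i + ability_j)


def simulate_knockout(abilities, n_runs=10000, seed=1):
    """Simulate a hypothetical 4-team bracket: (0 vs 1), (2 vs 3), then a final.

    Returns the relative frequency with which each team wins the bracket.
    """
    rng = random.Random(seed)
    wins = [0] * len(abilities)

    def play(i, j):
        return i if rng.random() < bt_prob(abilities[i], abilities[j]) else j

    for _ in range(n_runs):
        champion = play(play(0, 1), play(2, 3))
        wins[champion] += 1
    return [w / n_runs for w in wins]


if __name__ == "__main__":
    # Hypothetical quoted odds for four teams from two bookmakers.
    quoted = [[2.2, 3.0, 5.0, 8.0], [2.3, 2.9, 4.8, 8.5]]
    fair = [remove_overround(row)[1] for row in quoted]
    print("consensus probabilities:", consensus(fair))
    # Hypothetical abilities, not derived from the odds above.
    print("simulated win frequencies:", simulate_knockout([4.0, 2.0, 1.5, 1.0]))
```

Averaging on the logit scale means the consensus probabilities need not sum exactly to 1. In the sketch, the overround of a bookmaker is `1 - alpha`, matching the definition above.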