UEFA, is that it?

[This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers].

Following my previous post, a few more things. As mentioned by Frédéric, it is – indeed – possible to compute the probability of all pairs. More precisely, not all pairs are equally likely to occur: some teams can play against (almost) everyone, while others cannot. From the previous table, it is possible to compute the probability that the last team plays against team 1, or team 2 (numbers are from the xls file mentioned previously). To keep it simple,

> table(M[,2*n])/length(M[,2*n])*100

       1        2        3        5        7       10       11 
11.82500 12.61212 12.61212 13.25279 19.31173 18.70767 11.67856

Here, the last team (as I ranked them) has an 11.8% chance of playing against team 1, and a 19.3% chance of playing against team 7. If we compute all the probabilities, we obtain

> S
       1     2     3     5     7    10    11    13
4   0.00 14.16 14.16  0.00 22.22 21.25 13.05 15.13
6  12.52 13.19 13.19 14.11 20.13  0.00 12.35 14.47
8  18.78  0.00 19.54 21.50  0.00  0.00 18.39 21.76
9  18.78 19.54  0.00 21.50  0.00  0.00 18.39 21.76
12 14.68 15.54 15.54 16.56  0.00 23.19 14.47  0.00
14 11.64 12.37 12.37 13.05 18.96 18.25  0.00 13.34
15 11.77 12.55 12.55  0.00 19.36 18.59 11.64 13.50
16 11.82 12.61 12.61 13.25 19.31 18.70 11.67  0.00

that can be visualized below

White areas cannot be reached, while red ones are more likely. Here, we compute the probability that the home team (given on the x-axis) plays against a given visitor team (on the y-axis). The fact that those probabilities are not uniform seems odd. But I guess it comes from those constraints…
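As a sanity check on the table S, here is a minimal sketch that re-enters the printed values and verifies that each row and each column sums to 100% up to rounding, as it must since every home team meets exactly one visitor and conversely. The heatmap call is only an assumption about how such a picture could be drawn (the palette and the `image()` layout are my choices, not necessarily those of the original figure).

```r
# Percentages copied from the table S printed above
home    <- c(4, 6, 8, 9, 12, 14, 15, 16)
visitor <- c(1, 2, 3, 5, 7, 10, 11, 13)
S <- matrix(c(
   0.00, 14.16, 14.16,  0.00, 22.22, 21.25, 13.05, 15.13,
  12.52, 13.19, 13.19, 14.11, 20.13,  0.00, 12.35, 14.47,
  18.78,  0.00, 19.54, 21.50,  0.00,  0.00, 18.39, 21.76,
  18.78, 19.54,  0.00, 21.50,  0.00,  0.00, 18.39, 21.76,
  14.68, 15.54, 15.54, 16.56,  0.00, 23.19, 14.47,  0.00,
  11.64, 12.37, 12.37, 13.05, 18.96, 18.25,  0.00, 13.34,
  11.77, 12.55, 12.55,  0.00, 19.36, 18.59, 11.64, 13.50,
  11.82, 12.61, 12.61, 13.25, 19.31, 18.70, 11.67,  0.00
), nrow = 8, byrow = TRUE,
   dimnames = list(home = as.character(home),
                   visitor = as.character(visitor)))

# Each home team gets exactly one visitor (rows), and each visitor
# exactly one home team (columns): both margins should be close to 100
row_totals <- rowSums(S)
col_totals <- colSums(S)
print(range(row_totals))
print(range(col_totals))

# Hypothetical heatmap: impossible pairs left blank, likelier pairs in red
if (interactive()) {
  Z <- S
  Z[Z == 0] <- NA
  image(seq_along(home), seq_along(visitor), Z,
        col = rev(heat.colors(20)), axes = FALSE,
        xlab = "home team", ylab = "visitor team")
  axis(1, at = seq_along(home), labels = home)
  axis(2, at = seq_along(visitor), labels = visitor)
}
```

The small gaps from 100 come only from the two-decimal rounding of the printed table.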

Another weird point: it is possible to reach a deadlock, at least with the technique I have been using. So far, I did not count them. But we can, simply with the following code

> # n, LISTEIMPOSSIBLE and %notin% are assumed to be defined as in the previous post
> U=c(4,6,8,9,12,14,15,16)
> a1=U[1]
> b1=U[2]
> c1=U[3]
> d1=U[4]
> e1=U[5]
> f1=U[6]
> g1=U[7]
> h1=U[8]
> a2=b2=c2=d2=e2=f2=g2=h2=NA
> # s counts completed draws, na counts deadlocks (both assumed to start at 0)
> s=0
> na=0
> posa2=(1:n)%notin%c(LISTEIMPOSSIBLE[,a1])
> if(length(posa2)==0){na=na+1}
> for(a2 in posa2){
+ posb2=(1:n)%notin%c(LISTEIMPOSSIBLE[,b1],a2)
+ if(length(posb2)==0){na=na+1}
+ for(b2 in posb2){
+ posc2=(1:n)%notin%c(LISTEIMPOSSIBLE[,c1],a2,b2)
+ if(length(posc2)==0){na=na+1}
+ for(c2 in posc2){
+ posd2=(1:n)%notin%c(LISTEIMPOSSIBLE[,d1],
+ a2,b2,c2)
+ if(length(posd2)==0){na=na+1}
+ for(d2 in posd2){
+ pose2=(1:n)%notin%c(LISTEIMPOSSIBLE[,e1],
+ a2,b2,c2,d2)
+ if(length(pose2)==0){na=na+1}
+ for(e2 in pose2){
+ posf2=(1:n)%notin%c(LISTEIMPOSSIBLE[,f1],
+ a2,b2,c2,d2,e2)
+ if(length(posf2)==0){na=na+1}
+ for(f2 in posf2){
+ posg2=(1:n)%notin%c(LISTEIMPOSSIBLE[,g1],
+ a2,b2,c2,d2,e2,f2)
+ if(length(posg2)==0){na=na+1}
+ for(g2 in posg2){
+ posh2=(1:n)%notin%c(LISTEIMPOSSIBLE[,h1],
+ a2,b2,c2,d2,e2,f2,g2)
+ if(length(posh2)==0){na=na+1}
+ for(h2 in posh2){
+ s=s+1
+ V=c(a1,a2,b1,b2,c1,c2,d1,d2,e1,e2,f1,f2,g1,g2,h1,h2)
+ }}}}}}}}
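The eight nested loops above follow one pattern: pick an admissible opponent for the next home team, and count a deadlock when nobody is left. A minimal sketch of the same idea written as a recursion is given below, on a hypothetical three-team toy example rather than the UEFA constraints. The function name `count_draws` and the list `forbidden` (which plays the role of LISTEIMPOSSIBLE) are assumptions made for illustration.

```r
# Count completed draws (s) and dead ends (na) for a given order of
# home teams. forbidden[[h]] lists the opponents home team h cannot meet
# (hypothetical stand-in for the columns of LISTEIMPOSSIBLE).
count_draws <- function(home_order, opponents, forbidden) {
  counts <- c(s = 0, na = 0)
  recurse <- function(k, used) {
    if (k > length(home_order)) {          # every home team has an opponent
      counts["s"] <<- counts["s"] + 1
      return(invisible(NULL))
    }
    h <- as.character(home_order[k])
    possible <- setdiff(opponents, c(forbidden[[h]], used))
    if (length(possible) == 0) {           # nobody left for team h: deadlock
      counts["na"] <<- counts["na"] + 1
      return(invisible(NULL))
    }
    for (o in possible) recurse(k + 1, c(used, o))
  }
  recurse(1, c())
  counts
}

# Toy constraints: A cannot meet 1, B can meet anyone, C can only meet 1
opponents <- 1:3
forbidden <- list(A = 1, B = integer(0), C = c(2, 3))

res_abc <- count_draws(c("A", "B", "C"), opponents, forbidden)
res_cab <- count_draws(c("C", "A", "B"), opponents, forbidden)
print(res_abc)  # s = 2, na = 2
print(res_cab)  # s = 2, na = 0
```

In this toy case the number of completed draws is the same for both orderings, but the number of dead ends is not: drawing the most constrained team first avoids them.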

With the initial ordering of the home teams, the number of deadlocks was

> na
[1] 657

The probability of obtaining a deadlock is then

> 657/(657+5463)
[1] 0.1073529

(657 scenarios ended in a dead end, while 5463 ended well). The worst case was obtained when we considered

 [1]    6    4   16   14   12   15    8    9

In that case, the probability of obtaining a deadlock was

> 4047/(4047+5463)
[1] 0.4255521

Here, it clearly depends on the ordering. So if we draw – randomly – the order of the home teams, i.e.

> Urandom=sample(U,size=8)

the distribution of the probability of having a deadlock is
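To get a feel for how such a distribution can be obtained, here is a minimal sketch on a hypothetical three-team toy example (not the UEFA constraints). For each random ordering of the home teams it enumerates the draws, and records the share of scenarios that end in a dead end, computed as na/(na+s) as above. The names `deadlock_share` and `forbidden` are assumptions made for illustration.

```r
# Share of dead ends, na / (na + s), for one ordering of the home teams.
# forbidden[[h]] lists the opponents that home team h cannot meet.
deadlock_share <- function(home_order, opponents, forbidden) {
  s <- 0
  na <- 0
  recurse <- function(k, used) {
    if (k > length(home_order)) {          # draw completed
      s <<- s + 1
      return(invisible(NULL))
    }
    possible <- setdiff(opponents, c(forbidden[[home_order[k]]], used))
    if (length(possible) == 0) {           # dead end
      na <<- na + 1
      return(invisible(NULL))
    }
    for (o in possible) recurse(k + 1, c(used, o))
  }
  recurse(1, c())
  na / (na + s)
}

# Toy constraints: A cannot meet 1, B can meet anyone, C can only meet 1
home      <- c("A", "B", "C")
opponents <- 1:3
forbidden <- list(A = 1, B = integer(0), C = c(2, 3))

# Draw the order of the home teams at random, many times
set.seed(1)
shares <- replicate(200, deadlock_share(sample(home), opponents, forbidden))

# Empirical distribution of the deadlock share across random orderings
print(table(round(shares, 3)))
```

With only three teams there are six possible orderings, so the share takes very few distinct values; with eight home teams the same loop would give a much richer distribution.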

All those computations were based on my understanding of the drawings. But Kristof (aka @ciebiera), on his blog krzysztofciebiera.blogspot.ca/…, obtained different results. For instance, based on my previous computations, the probability of obtaining identical pairs was 0.018349% (1 chance out of 5463), but Kristof obtained – based on the UEFA procedure (as he called it) – a probability of 0.00181337%. Which is not – strictly – the same, but both computations yield relatively close results…

Arthur Charpentier

Arthur Charpentier, professor at UQaM in Actuarial Science. Former professor-assistant at ENSAE Paristech, associate professor at Ecole Polytechnique and assistant professor in Economics at Université de Rennes 1. Graduated from ENSAE, Master in Mathematical Economics (Paris Dauphine), PhD in Mathematics (KU Leuven), and Fellow of the French Institute of Actuaries.
