(This article was first published on **Freakonometrics » R-english**, and kindly contributed to R-bloggers)

Following my previous post, a few more things. As mentioned by Frédéric, it is indeed possible to compute the probability of every pair. More precisely, not all pairs are equally likely to occur: some teams can play against (almost) everyone, while others cannot. From the previous table, it is possible to compute the probability that the last team plays against team 1, or team 2 (numbers are from the xls file mentioned previously). To make it simple,

    > table(M[,2*n])/length(M[,2*n])*100

           1        2        3        5        7       10       11
    11.82500 12.61212 12.61212 13.25279 19.31173 18.70767 11.67856

Here, the last team (as I ranked them) has an 11.8% chance of playing against team 1, and a 19.3% chance of playing against team 7. If we compute all the probabilities, we obtain

    > S
           1     2     3     5     7    10    11    13
    4   0.00 14.16 14.16  0.00 22.22 21.25 13.05 15.13
    6  12.52 13.19 13.19 14.11 20.13  0.00 12.35 14.47
    8  18.78  0.00 19.54 21.50  0.00  0.00 18.39 21.76
    9  18.78 19.54  0.00 21.50  0.00  0.00 18.39 21.76
    12 14.68 15.54 15.54 16.56  0.00 23.19 14.47  0.00
    14 11.64 12.37 12.37 13.05 18.96 18.25  0.00 13.34
    15 11.77 12.55 12.55  0.00 19.36 18.59 11.64 13.50
    16 11.82 12.61 12.61 13.25 19.31 18.70 11.67  0.00

that can be visualized below

White areas cannot be reached, while red ones are more likely. Here, we compute the probability that a home team (given on the *x*-axis) plays against some visitor team (on the *y*-axis). The fact that those probabilities are not uniform seems odd. But I guess it comes from those constraints…
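
The table `S` above amounts to repeating the earlier `table()` call once per home team. Here is a minimal sketch on a toy set of scenarios, not on the actual draw: the matrix `M` below is made-up data, and the layout (one complete draw per row, stored as home, visitor, home, visitor, … with the home teams always in the same order) is an assumption based on how `V` is built in this post.

```r
# Toy scenarios (hypothetical data, NOT the UEFA draw): each row is one
# complete draw stored as home1, visitor1, home2, visitor2, home3, visitor3.
M <- rbind(
  c(4, 1, 6, 2, 8, 3),
  c(4, 2, 6, 1, 8, 3),
  c(4, 2, 6, 3, 8, 1),
  c(4, 3, 6, 1, 8, 2)
)

n <- ncol(M) / 2                                  # number of pairs per draw
home     <- M[1, seq(1, 2 * n, by = 2)]           # home teams, in drawing order
visitors <- sort(unique(as.vector(M[, seq(2, 2 * n, by = 2)])))

# S[h, v] = percentage of scenarios in which home team h meets visitor v
S <- matrix(0, nrow = n, ncol = length(visitors),
            dimnames = list(as.character(home), as.character(visitors)))
for (i in seq_len(n)) {
  # factor() with fixed levels keeps a zero entry for pairs that never occur
  freq <- table(factor(M[, 2 * i], levels = visitors)) / nrow(M) * 100
  S[i, ] <- as.numeric(freq)
}
print(round(S, 2))
```

Each row and each column of `S` should sum to 100, since every team plays exactly once in every scenario; that is a quick sanity check to run on the real table as well.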

Another weird point: it is possible to reach a deadlock, at least with the technique I have been using. So far, I did not count them. But we can, simply with the following code

    > # n, LISTEIMPOSSIBLE and %notin% are assumed to be defined as in the previous post
    > na=0; s=0   # counters: dead ends (na) and completed draws (s)
    > U=c(4,6,8,9,12,14,15,16)
    > a1=U[1]
    > b1=U[2]
    > c1=U[3]
    > d1=U[4]
    > e1=U[5]
    > f1=U[6]
    > g1=U[7]
    > h1=U[8]
    > a2=b2=c2=d2=e2=f2=g2=h2=NA
    > posa2=(1:n)%notin%c(LISTEIMPOSSIBLE[,a1])
    > if(length(posa2)==0){na=na+1}
    > for(a2 in posa2){
    +  posb2=(1:n)%notin%c(LISTEIMPOSSIBLE[,b1],a2)
    +  if(length(posb2)==0){na=na+1}
    +  for(b2 in posb2){
    +   posc2=(1:n)%notin%c(LISTEIMPOSSIBLE[,c1],a2,b2)
    +   if(length(posc2)==0){na=na+1}
    +   for(c2 in posc2){
    +    posd2=(1:n)%notin%c(LISTEIMPOSSIBLE[,d1],
    +                        a2,b2,c2)
    +    if(length(posd2)==0){na=na+1}
    +    for(d2 in posd2){
    +     pose2=(1:n)%notin%c(LISTEIMPOSSIBLE[,e1],
    +                         a2,b2,c2,d2)
    +     if(length(pose2)==0){na=na+1}
    +     for(e2 in pose2){
    +      posf2=(1:n)%notin%c(LISTEIMPOSSIBLE[,f1],
    +                          a2,b2,c2,d2,e2)
    +      if(length(posf2)==0){na=na+1}
    +      for(f2 in posf2){
    +       posg2=(1:n)%notin%c(LISTEIMPOSSIBLE[,g1],
    +                           a2,b2,c2,d2,e2,f2)
    +       if(length(posg2)==0){na=na+1}
    +       for(g2 in posg2){
    +        posh2=(1:n)%notin%c(LISTEIMPOSSIBLE[,h1],
    +                            a2,b2,c2,d2,e2,f2,g2)
    +        if(length(posh2)==0){na=na+1}
    +        for(h2 in posh2){
    +         s=s+1
    +         V=c(a1,a2,b1,b2,c1,c2,d1,d2,e1,e2,f1,f2,g1,g2,h1,h2)
    + }}}}}}}}

With the initial ordering of the home teams, the number of deadlocks was

    > na
    [1] 657

The probability of obtaining a deadlock is then

    > 657/(657+5463)
    [1] 0.1073529

(657 scenarios ended in a dead end, while 5463 ended well). The worst case was obtained with the ordering

    [1]  6  4 16 14 12 15  8  9

In that case, the probability of obtaining a deadlock was

    > 4047/(4047+5463)
    [1] 0.4255521
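
The eight nested loops can also be written as one recursive function, which makes it easier to rerun the count for any ordering of the home teams. What follows is only a sketch under stated assumptions, not the code used for the figures above: `count_draws`, the team labels and the `forbidden` list are hypothetical names, and the three-team constraints are invented to keep the example small.

```r
# Count completed draws and dead ends when home teams are served in a given
# order. `forbidden[[h]]` lists the visitors that home team h may not meet
# (hypothetical toy constraints, not the UEFA ones).
count_draws <- function(home_order, visitors, forbidden) {
  completed <- 0
  deadlocks <- 0
  recurse <- function(k, used) {
    if (k > length(home_order)) {        # every home team has an opponent
      completed <<- completed + 1
      return(invisible(NULL))
    }
    # visitors still available for the k-th home team
    candidates <- setdiff(visitors, c(forbidden[[home_order[k]]], used))
    if (length(candidates) == 0) {       # nobody left: dead end
      deadlocks <<- deadlocks + 1
      return(invisible(NULL))
    }
    for (v in candidates) recurse(k + 1, c(used, v))
  }
  recurse(1, character(0))
  c(completed = completed, deadlocks = deadlocks)
}

visitors  <- c("V1", "V2", "V3")
forbidden <- list(H1 = "V1", H2 = c("V1", "V2"), H3 = character(0))

orders <- list(c("H1", "H2", "H3"), c("H2", "H1", "H3"), c("H3", "H1", "H2"))
for (o in orders) {
  res   <- count_draws(o, visitors, forbidden)
  share <- res[["deadlocks"]] / sum(res)   # same ratio as na/(na+s) above
  cat(paste(o, collapse = " "), ": completed =", res[["completed"]],
      "deadlocks =", res[["deadlocks"]], "share =", round(share, 3), "\n")
}
```

In this toy case there is exactly one complete draw whatever the ordering, while the number of dead ends is 1, 0 and 3 for the three orderings tried. That mirrors the pattern above, where the 5463 completed scenarios stay fixed and only the dead-end count moves.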

Here, it clearly depends on the ordering. So if we draw the order of the home teams randomly, i.e.

    > Urandom=sample(U,size=8)

the distribution of the probability of having a deadlock is

All those computations were based on my understanding of the drawings. But Kristof (aka @ciebiera), on his blog krzysztofciebiera.blogspot.ca/…, obtained different results. For instance, based on my previous computations, the probability of obtaining identical pairs was 0.018349% (1 chance out of 5463), but Kristof obtained, based on the UEFA procedure (as he called it), a probability of 0.00181337%. Which is not, strictly, the same, but both computations yield relatively close results…
