[This article was first published on free range statistics - R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Last week I wrote about the impact of seeding the draw in a tennis tournament. Seeding is one way to increase the chance of the top players making it to the final rounds of a single elimination tournament, leading to fairer outcomes and to a higher chance of the best matchups happening in the finals. Basically it’s to overcome this problem:
“At a Lawn Tennis Tournament, where I chanced, some while ago, to be a spectator, the present method of assigning prizes was brought to my notice by the lamentations of one of the Players, who had been beaten (and had thus lost all chance of a prize) early in the contest, and who had had the mortification of seeing the 2nd prize carried off by a Player whom he knew to be quite inferior to himself.”
Charles Dodgson (Lewis Carroll)
The incident related above led nineteenth century Oxford University mathematician Charles Dodgson to propose an alternative to the unseeded single elimination tournament that was standard at the time. His monograph on the subject, LAWN TENNIS TOURNAMENTS: The True Method of Assigning Prizes with a Proof of the Fallacy of the Present Method, can be found from page 1,082 of his complete works. Of course, Dodgson’s side hustle was as part time children’s author Lewis Carroll, of works including the astonishingly good Alice’s Adventures in Wonderland, Through the Looking Glass and The Hunting of the Snark; and the deservedly forgotton Sylvie and Bruno.
Here are the basic elements of Dodgson’s proposed system, which he describes for a tournament of 32 male players:
“A list is kept, and against each name is entered, at the end of each contest, the name of any one who has been superior to him – whether by actually beating him, or by beating some one who has done so (thus, if A beats B, and B beats C, A and B are both ‘superiors’ of C). So soon as any name has 3 ‘superiors’ entered against it, it is struck out of the list”
“only one contest that (first) day – the 32 Players being arranged in 16 pairs”
“For the 2nd day … the 16 unbeaten men are paired together, and similarly the 16 with 1 superior (the Losers in these last-named pairs will now have 3 superiors each, and will therefore be struck off the list). In all other contests they are paired in the same way; first pairing the unbeaten, then those with 1 superior, and so on, and avoiding, as far as possible, pairing two Players who have a common superior.”
“By the middle of the 3rd day the unbeaten are reduced to two… These two … have a whole-day match on the 4th day…”
“By the end of the 4th day, the ‘First-prize-man’ is known (by the very same process of elimination used in the existing method): and the remaining Players are paired by the same rules as before, for the 2 contests on the 5th day.”
The essence of the method is that no-one is knocked out until they are definitely not one of the three best players, as proven by having three or more “superiors” who have either beaten them in a match or beaten someone who beat them. By this method, the actual top three bubble to the top of the competition.
One interesting observation is that under Dodgson’s rules there is no requirement to have the number of players a power of two, as is the case in a standard single-elimination tournament if the organisers wish to avoid unbalanced byes.
Dodgson argues that his proposed method is guaranteed to allocate the first, second and third prizes accurately to the best three players in order. However, his monograph makes some key assumptions:
superiority is transitive eg if A is superior to B and B to C, then A is superior to C
superiority is deterministic, consistent and persistent
Obviously, the real world isn’t like this. How does his method stand up when we make the results of individual matches uncertain and inconsistent with eachother in line with the realistic Elo-based simulation of results I used last week?
To check this out, I simulated Dodgson-style tournaments with the same 128 top women players from 1990 that I used last week with more conventional tournaments. I built the program to do this so the user had a choice in how individual matchups between players resulted – either deterministically (higher rated player is guaranteed to win, as per Dodgson’s own 32 player demonstration) or realistically probabilistically (chances are random but still depend on the Elo ratings of the two players).
Results
If winners are deterministic, things work out as Dodgson claimed
Of course, Dodgson was a highly accomplished mathematician so no surprise that he is correct that his method correctly allocates prizes in a 32 player competition with deterministic match outcomes. I was curious to see if it would work in a larger competition. I found that his rules (with minimal modifications discussed later in this post) returned the top three players with the top three prizes, in correct order, 100 times out of 100 different runs.
A correctly seeded single-elimination tournament with deterministic match outcomes will also return the top 4 players accurately 100% of the time. I think that Dodgson assumed there was no reliable prior knowledge of players’ skill levels, ruling out the possibility of a seeded tournament; certainly he only compares his method to an unseeded draw.
Note that a Dodgson tournament will take around twice as many matches (varying according to the efficiency of the draw, but around 240 matches was normal in my experiment) as a single-elimination tournament (which needs 127 matches for 128 players).
Results are not so good when match wins are realistically probabilistic
Dodgson’s method is not as good as he would hope when match outcomes are not deterministic, but happen by chance in accordance with the skill differential indicated by players different Elo ratings. Of course, this is the realistic model; even at the height of her powers, Steffi Graf (the top rated player in our 1990 collection) had non zero chance of losing to her top opponents in any given match, as we saw in the graphic from last week:
The figure below shows the result of 1,000 simulated tournaments played out according to Dodgson’s rules, with realistic win and loss probabilities (ie not just 1 and 0). Some example conclusions from this chart include:
The top player wins the tournament 57% of the time
The top two players are the top two prize recipients 36% of the time, and in the right 1-2 sequence 23% of the time.
The top three players are awarded the top three prizes in the right sequence only 7% of the time
For comparison, all those figures are 100% in the deterministic model Dodgson planned on.
Under the probabilistic model, the outcomes of Dodgson’s tournament rules are similar to those from a seeded single elimination tournament as in my blog post last week. For example, in that seeded competition, top player Steffi Graf won 60% of tournaments; and the top two players were in the grand final 42% of the time. This is marginally better than the result under Dodgson’s rules with around half the number of matches, reflecting the effective use that the extra prior information on player ability is put to in the seeding.
Implementation
Approach
It was interesting and non-trivial to implement Dodgson’s rules in a way that would be robust to larger tournament sizes, random match outcomes, and inconsistent and non-transitive results. A few notable choices I had to make:
I allocated the initial pairs of players (and subsequent matchups) at random rather than alphabetically based on their names
I drop the idea of a “round” and instead run a loop looking for the next individual match to play, matching where possible players with the same number of games played and losses as eachother. I made up a concept of the “awkward player” at any moment – a player who has played as few games as anyone, and has the least number of legitimate opponents available with the same number of losses, avoiding rematches, etc. Finding a match for that awkward player becomes the priority in each iteration of my loop.
I had to allow, in some circumstances, matches between players who had played a different number of games to that point. There’s probably a solution that doesn’t require this but I couldn’t find it in my time budget.
I also couldn’t think of a practical way to implement the requirement “avoiding, as far as possible, pairing two Players who have a common superior”. I did filter out re-matches except in unusual situations to avoid the algorithm getting stuck.
With non-deterministic results and rematches allowed when unavoidable, a number of contradictions become possible and need to be carefully handled. For example, a player can become a superior of themselves (in the situation A is beaten by B; then B is beaten by A in a rematch – now future A is a ‘superior’ of past A, a situation that I ruled out by refusing to count).
It’s possible for the four remaining players all to graduate to having 3 superiors each as a result of a single match late in the tournament, leaving no clear placings. In this situation I made them joint first place, but more realistic would of course be a semi- and grand final series.
Similarly, it’s possible for players 2 and 3 (of three remaining) to be knocked out in one step. This means no grand final, but a play-off is needed for second place.
All up, this was a fun exercise. I’m glad this approach to running a tournament works fairly well, even at the cost of roughly twice as many matches needed as in a seeded single-elimination. And it stands up not too badly even with realistically uncertain and inconsistent match outcomes. But of course, it’s not as perfect as the ideal deterministic world described in Dodgson’s original monograph. Things can get pretty complicated, as the above list of decisions and challenges begins to show. There are some quite awkward edge cases that I didn’t stop to sort in detail.
I’m not aware Dodgson’s method has ever been used for a real life tournament schedule, although there have been a few simulation experiments similar to my own. I’m not sure how seriously he meant it to be taken; the same monograph proposing this method also suggests in passing tennis should get rid of sets altogether and replace them with a simpler method such as “he who first wins 14 games, or who gets 9 ahead, wins the match”. I doubt he seriously expected that proposal to be taken up. But it’s worth noting this monograph was published under his professional persona as recreational mathematics enthusiast Charles Dodgson, rather than as children’s author Lewis Carroll.
Let’s just say that, like most of his other work, it’s good he published this! But I would hesitate to recommend adopting it in toto.
Code
Here’s all the R code for these simulations in a single chunk. A carroll_tournament() function is the work horse, with a couple of helpers for very specific tasks such as one_match() which plays out a single match using either a deterministic or probabilistic method of allocating a winner.
The illustrations in this blog post are screenshots from Lewis Carroll’s complete works with original illustrations by John Tenniel.
Related
To leave a comment for the author, please follow the link and comment on their blog: free range statistics - R.