Geelong and the curse of the bye

[This article was first published on R – What You're Doing Is Rather Desperate, and kindly contributed to R-bloggers.]

This week we return to Australian Rules Football, the R package fitzRoy and some statistics to ask – why can’t Geelong win after a bye?

(with apologies to long-time readers who used to come for the science)

Code and a report for this blog post are available at GitHub.

First, some background. In 2011 the AFL expanded from 16 to 17 teams with the addition of the Gold Coast Suns. In the same year, a bye round (a week where some teams don’t play) was reintroduced to the competition. For the purposes of this discussion, we are interested only in bye rounds since 2011, and during the regular home/away season.

You will often hear footy fans claim – sometimes with very little evidence – that “we don’t go well after the bye.” For one team, this is certainly true. That team is Geelong, who have not won a game in the round following a bye since Round 7 in 2011.

Is this unusual? If so, does the available game data suggest any reason?

We start as ever with the excellent fitzRoy package and use get_match_results() to – well, get the match results.

Next, we can use some tidyverse magic to obtain all games in the round immediately before, and after, a bye. This looks long and complicated, so here’s a version with annotations in the comments to explain what’s going on:

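library(tidyverse)
library(fitzRoy)

# fetch the match results (requires an internet connection)
results <- get_match_results()

results_bye <- results %>% 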
  # choose the desired columns
  select(Season, Round, Date, Venue, Home.Team, Away.Team, Margin) %>% 
  # create one column for teams, another to indicate whether home or away
  gather(Status, Team, -Season, -Round, -Margin, -Date, -Venue) %>% 
  # filter for 2011 onwards and only home/away games
  filter(Season > 2010, grepl("^R", Round)) %>% 
  # create a column with the number of each round
  separate(Round, into = c("prefix", "suffix"), sep = 1) %>% 
  mutate(suffix = as.numeric(suffix)) %>% 
  # for each team's games in a season find games
  # the week before and after a bye
  arrange(Season, Team, suffix) %>%
  group_by(Season, Team) %>% 
  mutate(bye = case_when(
    suffix - lead(suffix) == -2 ~ "before",
    suffix - lag(suffix) == 2 ~ "after",
    TRUE ~ as.character(suffix)
  ),
  # margins are with respect to home team so negate them if away
  Margin = ifelse(Status == "Away.Team", -Margin, Margin)) %>% 
  ungroup() %>% 
  # filter for the pre- and post-bye games
  filter(bye %in% c("before", "after")) %>% 
  # calculate result
  mutate(Result = case_when(
    Margin > 0 ~ "W",
    Margin < 0 ~ "L",
    TRUE ~ "D"
  )) %>% 
  # recreate the Round column
  unite(Round, prefix, suffix, sep = "")
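
To see how the lead()/lag() step flags games, here is a minimal sketch on a made-up fixture. The team name and round numbers are invented for illustration, and only dplyr is assumed.

```r
library(dplyr)

# hypothetical fixture: one team plays rounds 1-3, has a bye in round 4,
# then plays rounds 5 and 6
toy <- tibble(
  Team   = "Example FC",
  suffix = c(1, 2, 3, 5, 6)
)

toy_bye <- toy %>%
  arrange(Team, suffix) %>%
  group_by(Team) %>%
  mutate(bye = case_when(
    # next game is two rounds later: last game before the bye
    suffix - lead(suffix) == -2 ~ "before",
    # previous game was two rounds earlier: first game after the bye
    suffix - lag(suffix) == 2 ~ "after",
    # everything else keeps its round number as a label
    TRUE ~ as.character(suffix)
  )) %>%
  ungroup()

toy_bye
```

Note that case_when() uses the first condition that matches, so a game sitting between two byes would only be labelled "before".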

Let’s confirm that Geelong have not won after a bye in a long time:

results_bye %>% 
  filter(Team == "Geelong", bye == "after")
Season  Round  Date        Venue          Margin  Status     Team     bye    Result
2011    R7     2011-05-07  Kardinia Park      66  Home.Team  Geelong  after  W
2011    R23    2011-08-27  Kardinia Park     -13  Home.Team  Geelong  after  L
2016    R16    2016-07-08  Kardinia Park     -38  Home.Team  Geelong  after  L
2019    R14    2019-06-22  Adelaide Oval     -11  Away.Team  Geelong  after  L

How does that compare with other teams?

We see all combinations: teams that seem to win more after a bye, as well as teams that win less and teams for which a bye makes no difference. However, Geelong certainly has the worst post-bye win/loss record.

We can ask: is the win/loss count in pre-bye games significantly different to those post-bye? One approach to this is to construct 2×2 contingency tables and perform Fisher’s exact test.
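
As a minimal sketch of that idea in base R, the table below uses invented win/loss counts for a single team. The numbers are assumptions for illustration and are not taken from the match data.

```r
# hypothetical counts for one team: rows are pre-/post-bye games,
# columns are losses and wins (numbers invented for illustration)
tab <- matrix(
  c(3, 6,   # before the bye: 3 losses, 6 wins
    7, 2),  # after the bye:  7 losses, 2 wins
  nrow = 2, byrow = TRUE,
  dimnames = list(bye = c("before", "after"), Result = c("L", "W"))
)

ft <- fisher.test(tab)

ft$p.value   # two-sided p-value for independence of bye and Result
ft$estimate  # conditional maximum likelihood estimate of the odds ratio
```

An odds ratio below 1 here means the odds of a loss are lower before the bye than after it.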

With some more tidyverse magic we can nest the data for each team, generate the tests and summarise the results. This approach is explained very nicely in “Running a model on separate groups” over at Simon Jackson’s blog.

Only Geelong has p < 0.05, suggesting that there is something interesting about the win/loss count after the bye. We’ll just show the first 5 teams here.

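library(broom)
library(pander)

results_bye %>% 
  # count results per team, before and after the bye
  count(Team, bye, Result) %>% 
  # one nested data frame per team
  nest(-Team) %>% 
  # reshape each to wide form, keep columns 2 and 3 (the counts per result),
  # then run Fisher's exact test and tidy the output
  mutate(data = map(data, . %>% spread(Result, n) %>% select(2:3)), 
         fisher = map(data, fisher.test), 
         summary = map(fisher, tidy)) %>% 
  select(Team, summary) %>% 
  unnest() %>% 
  select(-method, -alternative) %>% 
  # smallest p-values first
  arrange(p.value) %>% 
  pander(split.table = Inf)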
Team             estimate  p.value  conf.low  conf.high
North Melbourne    0.1736   0.2941  0.002835      2.438

We can extend the previous visualisation by further breaking down games into home and away:

Now we see that of Geelong’s 8 post-bye losses, 6 were away games. Port Adelaide have a similar record. Then again, Brisbane have not won an away game before the bye, but you don’t hear anyone talking about Brisbane “not going well before the bye”.
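
The tally behind a home/away breakdown like this can be sketched with count(). The rows below are invented and only mimic the shape of results_bye; the team name is hypothetical.

```r
library(dplyr)

# hypothetical rows in the same shape as results_bye (values invented)
toy_results <- tibble(
  Team   = "Example FC",
  bye    = c("before", "before", "after", "after", "after"),
  Status = c("Home.Team", "Away.Team", "Away.Team", "Away.Team", "Home.Team"),
  Result = c("W", "W", "L", "L", "W")
)

# one row per combination of team, pre/post bye, home/away and result
home_away <- toy_results %>%
  count(Team, bye, Status, Result)

home_away
```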

When we look at those 6 away post-bye losses, one was in Melbourne – which in terms of travel distance is not very far from Geelong. The other five were “genuine” away games in Sydney, Brisbane, Adelaide and Perth (2).

Season  Round  Date        Venue          Margin  Status     Team     bye    Result
2019    R14    2019-06-22  Adelaide Oval     -11  Away.Team  Geelong  after  L

In addition, three of the losses were against a side also coming off the bye, but playing at home.


What about away games before the bye? One loss in Melbourne, four wins in Melbourne and one win in Sydney, versus the GWS Giants who at that time were a new and struggling team.

Season  Round  Date        Venue              Margin  Status     Team     bye     Result
2011    R21    2011-08-14  Football Park          11  Away.Team  Geelong  before  W
2013    R11    2013-06-08  Sydney Showground      59  Away.Team  Geelong  before  W

Our last question: for games after a bye, what was the expected result? By expected we mean “according to the bookmakers”. We can join the match results with historical betting data, assign the expected result (win or loss) to Geelong according to their odds, then compare expected versus actual results. This reveals that six of the eight post-bye losses were unexpected – not surprising as Geelong has been a strong team in the period from 2011 to now.
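
The comparison of expected and actual results can be sketched as follows. The betting table here is a made-up stand-in: its column names (Odds, Opp.Odds), the team name and all values are assumptions for illustration, not the structure of any real betting data set.

```r
library(dplyr)

# hypothetical post-bye results for one team (values invented)
games <- tibble(
  Season = c(2015, 2016, 2017),
  Round  = c("R12", "R16", "R13"),
  Team   = "Example FC",
  Result = c("L", "L", "W")
)

# hypothetical bookmaker decimal odds; the lower price is the favourite
odds <- tibble(
  Season   = c(2015, 2016, 2017),
  Round    = c("R12", "R16", "R13"),
  Team     = "Example FC",
  Odds     = c(1.45, 2.60, 1.30),
  Opp.Odds = c(2.75, 1.50, 3.40)
)

expected_vs_actual <- games %>%
  left_join(odds, by = c("Season", "Round", "Team")) %>%
  # the side with the shorter odds is the expected winner
  mutate(Expected   = ifelse(Odds < Opp.Odds, "W", "L"),
         Unexpected = Expected != Result)

# cross-tabulate actual against expected results
expected_vs_actual %>%
  count(Result, Expected)
```

A loss counts as unexpected when the team started as favourite, which in this toy data happens once.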


In summary
Historically, Geelong do seem more prone to losing after a bye round than other teams, and those losses have been unexpected in terms of betting odds.

However, a large proportion of their post-bye losses have been interstate away games, versus strong opponents. Away games before the bye have been either in Melbourne, or versus weaker opponents.

Scheduling may therefore have played a role in Geelong’s post-bye win/loss record.
