Geelong and the curse of the bye

[This article was first published on R – What You're Doing Is Rather Desperate, and kindly contributed to R-bloggers.]

This week we return to Australian Rules Football, the R package fitzRoy and some statistics to ask – why can’t Geelong win after a bye?

(with apologies to long-time readers who used to come for the science)


Code and a report for this blog post are available at GitHub.

First, some background. In 2011 the AFL expanded from 16 to 17 teams with the addition of the Gold Coast Suns. In the same year, a bye round (a week where some teams don’t play) was reintroduced to the competition. For the purposes of this discussion, we are interested only in bye rounds since 2011, and during the regular home/away season.

You will often hear footy fans claim – sometimes with very little evidence – that “we don’t go well after the bye.” For one team, this is certainly true. That team is Geelong, who have not won a game in the round following a bye since Round 7 in 2011.

Is this unusual? If so, does the available game data suggest any reason?

We start as ever with the excellent fitzRoy package and use get_match_results() to – well, get the match results.

Next, we can use some tidyverse magic to obtain all games in the round immediately before, and after, a bye. This looks long and complicated, so here’s a version with annotations in the comments to explain what’s going on:

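# results holds the output of get_match_results(), as described above
library(dplyr)
library(tidyr)

results_bye <- results %>% 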
  # choose the desired columns
  select(Season, Round, Date, Venue, Home.Team, Away.Team, Margin) %>% 
  # create one column for teams, another to indicate whether home or away
  gather(Status, Team, -Season, -Round, -Margin, -Date, -Venue) %>% 
  # filter for 2011 onwards and only home/away games
  filter(Season > 2010, grepl("^R", Round)) %>% 
  # create a column with the number of each round
  separate(Round, into = c("prefix", "suffix"), sep = 1) %>% 
  mutate(suffix = as.numeric(suffix)) %>% 
  # for each team's games in a season find games
  # the week before and after a bye
  arrange(Season, Team, suffix) %>%
  group_by(Season, Team) %>% 
  mutate(bye = case_when(
    suffix - lead(suffix) == -2 ~ "before",
    suffix - lag(suffix) == 2 ~ "after",
    TRUE ~ as.character(suffix)
  ),
  # margins are with respect to home team so negate them if away
  Margin = ifelse(Status == "Away.Team", -Margin, Margin)) %>% 
  ungroup() %>% 
  # filter for the pre- and post-bye games
  filter(bye %in% c("before", "after")) %>% 
  # calculate result
  mutate(Result = case_when(
    Margin > 0 ~ "W",
    Margin < 0 ~ "L",
    TRUE ~ "D"
  )) %>% 
  # recreate the Round column
  unite(Round, prefix, suffix, sep = "")
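The bye-detection step is the least obvious part of that pipeline, so here is a minimal sketch of the same idea on a made-up vector of round numbers for one team in one season. It uses base R's diff() in place of dplyr's lead() and lag() so that it runs without any packages. The round numbers and the variable names (rounds, gap_next, gap_prev) are illustrative only.

```r
# Hypothetical rounds played by one team in one season; round 4 is missing,
# which is how a bye shows up in the match data.
rounds <- c(1, 2, 3, 5, 6)

# Gap to the next game played (NA for the last game of the season)
gap_next <- c(diff(rounds), NA)
# Gap from the previous game played (NA for the first game of the season)
gap_prev <- c(NA, diff(rounds))

# A gap of 2 means one round was skipped: label the games either side of it
# and keep the round number as the label for every other game.
bye <- ifelse(!is.na(gap_next) & gap_next == 2, "before",
       ifelse(!is.na(gap_prev) & gap_prev == 2, "after",
              as.character(rounds)))

print(data.frame(round = rounds, bye = bye))
```

In this toy season, round 3 is labelled "before" and round 5 is labelled "after"; every other game keeps its round number and is dropped by the later filter.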

Let’s confirm that Geelong have not won after a bye in a long time:

results_bye %>% 
  filter(Team == "Geelong", bye == "after")
Season  Round  Date        Venue          Margin  Status     Team     bye    Result
2011    R7     2011-05-07  Kardinia Park  66      Home.Team  Geelong  after  W
2011    R23    2011-08-27  Kardinia Park  -13     Home.Team  Geelong  after  L
2012    R13    2012-06-22  S.C.G.         -6      Away.Team  Geelong  after  L
2013    R13    2013-06-23  Gabba          -5      Away.Team  Geelong  after  L
2014    R9     2014-05-17  Subiaco        -32     Away.Team  Geelong  after  L
2016    R16    2016-07-08  Kardinia Park  -38     Home.Team  Geelong  after  L
2017    R13    2017-06-15  Subiaco        -13     Away.Team  Geelong  after  L
2018    R15    2018-06-29  Docklands      -2      Away.Team  Geelong  after  L
2019    R14    2019-06-22  Adelaide Oval  -11     Away.Team  Geelong  after  L

How does that compare with other teams?

We see all combinations: teams that seem to win more after a bye, as well as teams that win less and teams for which a bye makes no difference. However, Geelong certainly has the worst post-bye win/loss record.

We can ask: is the win/loss count in pre-bye games significantly different to those post-bye? One approach to this is to construct 2×2 contingency tables and perform Fisher’s exact test.

With some more tidyverse magic we can nest the data for each team, generate the tests and summarise the results. This approach is explained very nicely in “Running a model on separate groups” over at Simon Jackson’s blog.

Only Geelong has p < 0.05, suggesting that there is something interesting about the win/loss count after the bye. We’ll just show the first 5 teams here.

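library(purrr)
library(broom)
library(pander)

results_bye %>% 
  count(Team, bye, Result) %>% 
  nest(data = -Team) %>% 
  # one row per bye status, one column per result; fill = 0 avoids NA counts,
  # and selecting L and W by name keeps any draw (D) column out of the 2x2 table
  mutate(data = map(data, . %>% spread(Result, n, fill = 0) %>% select(L, W)), 
         fisher = map(data, fisher.test), 
         summary = map(fisher, tidy)) %>% 
  select(Team, summary) %>% 
  unnest(summary) %>% 
  select(-method, -alternative) %>% 
  arrange(p.value) %>% 
  pander(split.table = Inf)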
Team             estimate  p.value  conf.low  conf.high
Geelong          21.4      0.01522  1.533     1396
Sydney           5.43      0.1698   0.6027    79.83
North Melbourne  0.1736    0.2941   0.002835  2.438
Richmond         3.68      0.3469   0.4059    43.34
Collingwood      3.719     0.3498   0.4048    53.81
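
As a sanity check, Geelong's row can be reproduced by hand. The sketch below builds the 2×2 table from the win/loss counts reported in this post (1 win and 8 losses after the bye, 7 wins and 2 losses before it) and passes it to fisher.test() from base R. The object name geelong is just a label for this example.

```r
# Geelong's results either side of the bye, 2011-2019, as counted in this post.
# Rows are after/before the bye, columns are losses/wins.
geelong <- matrix(
  c(8, 1,
    2, 7),
  nrow = 2, byrow = TRUE,
  dimnames = list(bye = c("after", "before"), Result = c("L", "W"))
)

ft <- fisher.test(geelong)

ft$p.value    # two-sided p-value
ft$estimate   # conditional maximum-likelihood estimate of the odds ratio
ft$conf.int   # 95% confidence interval for the odds ratio
```

The p-value should agree with the 0.01522 shown for Geelong. The very wide confidence interval is a reminder of how few games are involved: 18 in total.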

We can extend the previous visualisation by further breaking down games into home and away:

Now we see that of Geelong’s 8 post-bye losses, 6 were away games. Port Adelaide have a similar record. Then again, Brisbane have not won an away game before the bye, but you don’t hear anyone talking about Brisbane “not going well before the bye”.
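
Geelong's split can be checked directly from the nine post-bye games listed earlier. This is a small base R sketch, with the Status and Result values typed in by hand from that table rather than pulled from results_bye; the names geelong_after and split_tab are made up for the example.

```r
# Geelong's nine post-bye games, 2011-2019, in the order listed earlier
geelong_after <- data.frame(
  Season = c(2011, 2011, 2012, 2013, 2014, 2016, 2017, 2018, 2019),
  Status = c("Home.Team", "Home.Team", "Away.Team", "Away.Team", "Away.Team",
             "Home.Team", "Away.Team", "Away.Team", "Away.Team"),
  Result = c("W", "L", "L", "L", "L", "L", "L", "L", "L")
)

# Cross-tabulate home/away status against result
split_tab <- table(geelong_after$Status, geelong_after$Result)
print(split_tab)
```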

When we look at those 6 away post-bye losses, one was in Melbourne – which in terms of travel distance is not very far from Geelong. The other five were “genuine” away games in Sydney, Brisbane, Adelaide and Perth (2).

Season  Round  Date        Venue          Margin  Status     Team     bye    Result
2012    R13    2012-06-22  S.C.G.         -6      Away.Team  Geelong  after  L
2013    R13    2013-06-23  Gabba          -5      Away.Team  Geelong  after  L
2014    R9     2014-05-17  Subiaco        -32     Away.Team  Geelong  after  L
2017    R13    2017-06-15  Subiaco        -13     Away.Team  Geelong  after  L
2018    R15    2018-06-29  Docklands      -2      Away.Team  Geelong  after  L
2019    R14    2019-06-22  Adelaide Oval  -11     Away.Team  Geelong  after  L

In addition, three of the losses were against a side also coming off the bye, but playing at home.

Season  Round  Date        Venue    Margin  Status     Team     bye    Result
2012    R13    2012-06-22  S.C.G.   -6      Away.Team  Geelong  after  L
2014    R9     2014-05-17  Subiaco  -32     Away.Team  Geelong  after  L
2017    R13    2017-06-15  Subiaco  -13     Away.Team  Geelong  after  L

What about away games before the bye? One loss in Melbourne, three wins in Melbourne, one win in Adelaide and one win in Sydney, the last of these versus the GWS Giants, who at that time were a new and struggling team.

Season  Round  Date        Venue              Margin  Status     Team     bye     Result
2011    R5     2011-04-26  M.C.G.             19      Away.Team  Geelong  before  W
2011    R21    2011-08-14  Football Park      11      Away.Team  Geelong  before  W
2012    R11    2012-06-08  Docklands          12      Away.Team  Geelong  before  W
2013    R11    2013-06-08  Sydney Showground  59      Away.Team  Geelong  before  W
2016    R14    2016-06-25  Docklands          -3      Away.Team  Geelong  before  L
2019    R12    2019-06-07  M.C.G.             67      Away.Team  Geelong  before  W

Our last question: for games after a bye, what was the expected result? By expected we mean “according to the bookmakers”. We can join the match results with historical betting data, assign the expected result (win or loss) to Geelong according to their odds, then compare expected versus actual results. This reveals that six of the eight post-bye losses were unexpected: the odds had Geelong as favourites. That in itself is not surprising, as Geelong has been a strong team in the period from 2011 to now.

bye     Result  Expected  n
after   L       L         2
after   L       W         6
after   W       W         1
before  L       L         1
before  L       W         1
before  W       L         1
before  W       W         6
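
The expected-versus-actual comparison can be sketched as follows. This is not the code behind the table above: the data frame, its column names (team_odds, opponent_odds) and the odds themselves are invented for illustration, and it assumes decimal odds, where the shorter price marks the bookmakers' favourite.

```r
# Made-up example games: decimal odds for the team of interest and its
# opponent, plus the final margin from that team's point of view.
games <- data.frame(
  bye           = c("after", "after", "before", "before"),
  team_odds     = c(1.45, 2.60, 1.30, 3.10),
  opponent_odds = c(2.75, 1.50, 3.40, 1.35),
  Margin        = c(-6, -32, 19, 12)
)

# Expected result: the side with the shorter odds is expected to win
games$Expected <- ifelse(games$team_odds < games$opponent_odds, "W", "L")

# Actual result from the margin, as in the main pipeline
games$Result <- ifelse(games$Margin > 0, "W",
                ifelse(games$Margin < 0, "L", "D"))

# Count each combination of bye, actual result and expected result
counts <- aggregate(n ~ bye + Result + Expected,
                    data = transform(games, n = 1), FUN = sum)
print(counts)
```

Rows where Result and Expected disagree are the upsets. With real betting data, equal odds would need a rule of their own; this sketch simply treats them as an expected loss.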

In summary
Historically, Geelong do seem more prone to losing after a bye round than other teams, and those losses have been unexpected in terms of betting odds.

However, a large proportion of their post-bye losses have been interstate away games, versus strong opponents. Away games before the bye have been either in Melbourne, or versus weaker opponents.

Scheduling may therefore have played a role in Geelong’s post-bye win/loss record.
