One-Sided Matches in the English Premier League

October 9, 2018

(This article was first published on World Soccer Analytics, and kindly contributed to R-bloggers)

While some English top-flight fixtures are closely contested, others have historically been one-sided. Using the engsoccerdata package in R, I looked at first-division results (the First Division and, from 1992, the Premier League) from 1888 to 2017 to answer the question: which fixtures in the English first division have been the most uneven?

Restricting the analysis to pairs of teams that have met at least 30 times, the most one-sided fixture in English first-division history is Manchester United – Luton Town, with Manchester United winning over 73% of the 30 games. Only one other fixture has a win rate above 70%: Liverpool – QPR, with Liverpool winning nearly 72% of 46 games.

An interactive version of the graph can be found at this link; the static version is below:

[Figure: static_plot_one_sided_fixtures.png]

Below is the code for the analysis (also on GitHub). There is plenty more analysis to be done on this dataset, so feel free to use my code to produce other interesting insights and visualizations.

Step 1: Initial Pre-Processing

library(dplyr)
library(ggplot2)
library(engsoccerdata)
library(ggiraph)

df <- engsoccerdata::england
#only matches in tier 1 (English First Division and subsequently EPL)
df <- df %>% filter(tier == 1)

#winner and loser of each game ("Draw" for both when the scores are level)
df <- df %>% mutate(
  winner = case_when(
    hgoal > vgoal ~ home,
    hgoal < vgoal ~ visitor,
    TRUE ~ "Draw"
  ),
  loser = case_when(
    hgoal < vgoal ~ home,
    hgoal > vgoal ~ visitor,
    TRUE ~ "Draw"
  )
)
                  
#teams involved
df <- df %>% 
  rowwise %>% 
  mutate(teams_involved = paste(sort(c(home,visitor)),collapse=" - ")) %>% 
  ungroup()

#total number of games played between each pair of teams
df <- df %>% 
  group_by(teams_involved) %>% 
  mutate(total_games_played = n()) %>% 
  ungroup()
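The teams_involved key built in Step 1 works because sorting the two team names before pasting them gives the same label whichever side played at home. The sketch below illustrates this on a made-up three-row data frame (the rows are hypothetical, not taken from engsoccerdata). It also shows a vectorised alternative using pmin() and pmax(), which is my own suggestion rather than part of the original analysis and which may avoid the row-by-row overhead of rowwise() on the full dataset.

```r
# Hypothetical toy fixtures, for illustration only
toy <- data.frame(
  home    = c("Arsenal", "Chelsea", "Arsenal"),
  visitor = c("Chelsea", "Arsenal", "Everton"),
  stringsAsFactors = FALSE
)

# Same idea as Step 1: sort the two names, then join them,
# so that home/away order no longer matters
toy$teams_involved <- mapply(
  function(h, v) paste(sort(c(h, v)), collapse = " - "),
  toy$home, toy$visitor,
  USE.NAMES = FALSE
)

# Vectorised alternative: the alphabetically first name, then the other one
key_vectorised <- paste(pmin(toy$home, toy$visitor),
                        pmax(toy$home, toy$visitor),
                        sep = " - ")

print(toy$teams_involved)
print(identical(toy$teams_involved, key_vectorised))
```

The first two rows end up with the same key even though home and visitor are swapped, which is what lets the later group_by(teams_involved) treat them as one fixture.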

Step 2: Count the number of wins per fixture and find the top 20 most one-sided fixtures

win_count <- df %>% 
  count(winner,
        teams_involved,
        total_games_played) %>% 
  mutate(win_perc = n/total_games_played) %>% 
  ungroup()

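#keep fixtures played at least 30 times; drop "Draw" rows so that only team wins are ranked
more_common_fixtures <- win_count %>% 
  filter(total_games_played >= 30,
         winner != "Draw")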

one_sided_fixtures <- more_common_fixtures %>% 
  ungroup() %>%
  top_n(20,
        wt = win_perc)
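To make the win_perc column from Step 2 concrete, here is a minimal sketch on a single made-up fixture of five games (the teams "A" and "B" and their results are hypothetical). It applies the same count() and mutate() calls as above and shows that each fixture produces one row per outcome, including a "Draw" row, with the shares adding up to 1.

```r
library(dplyr)

# Hypothetical fixture: five games, A wins three, B wins one, one draw
toy <- tibble(
  teams_involved     = "A - B",
  winner             = c("A", "A", "B", "Draw", "A"),
  total_games_played = 5
)

# Same pattern as Step 2: one row per (outcome, fixture) with its share of games
toy_win_count <- toy %>%
  count(winner, teams_involved, total_games_played) %>%
  mutate(win_perc = n / total_games_played)

print(toy_win_count)
```

Because draws appear as their own outcome, a fixture with many draws could in principle rank highly on win_perc, so it is worth checking the winner column when reading the top 20.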

Step 3: Graph using ggplot2 and ggiraph::geom_bar_interactive()

top_graph <- one_sided_fixtures %>% 
  ggplot(aes(x = reorder(teams_involved,win_perc),
             y = win_perc,
             tooltip = paste0(winner," (",n," of ",total_games_played,")")))+
  geom_bar_interactive(stat="identity", fill = "darkblue")+
  coord_flip()+
  ylim(0,1)+
  labs(title = "The Top 20 Most One-Sided Matches in English Premier League History",
       y = "Win Percentage",
       x = NULL,
       caption = "Data from engsoccerdata R package") + 
  theme(plot.title = element_text(hjust = 0.5))

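#render the interactive chart (newer ggiraph releases deprecate ggiraph() in favour of girafe(ggobj = top_graph))
ggiraph(code = print(top_graph),width = 0.8)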
