
While some English top-flight fixtures are closely contested, others have historically been far more one-sided. Using the engsoccerdata package in R, I looked at first-division results (the English First Division and, later, the Premier League) from 1888 to 2017 to answer one question: which fixtures in the English first division have been the most uneven?

Looking only at fixtures that the two teams have played at least 30 times, the most one-sided English first division fixture ever is Manchester United – Luton Town, with Manchester United winning over 73% of the 30 matches. That is a very high rate, and only one other fixture has a win rate greater than 70%: Liverpool – QPR, with Liverpool victorious nearly 72% of the time over 46 matches.

An interactive version of the graph can be found at this link; the static version is below:

Below is the code for the analysis (also on GitHub). There is certainly a lot more analysis to be done on this dataset, so feel free to use my code to produce other interesting insights or visualizations.

Step 1: Initial Pre-Processing

library(dplyr)
library(ggplot2)
library(engsoccerdata)
library(ggiraph)

df <- engsoccerdata::england
#only matches in tier 1 (English First Division and subsequently EPL)
df <- df %>% filter(tier == 1)

#winner and loser of each game ("Draw" in both columns when the scores are level)
df <- df %>% mutate(
  winner = case_when(
    hgoal > vgoal ~ home,
    hgoal < vgoal ~ visitor,
    TRUE ~ "Draw"
  ),
  loser = case_when(
    hgoal < vgoal ~ home,
    hgoal > vgoal ~ visitor,
    TRUE ~ "Draw"
  ))

#teams involved: sort the two names so home and away meetings share one fixture label
df <- df %>%
  rowwise() %>%
  mutate(teams_involved = paste(sort(c(home, visitor)), collapse = " - ")) %>%
  ungroup()

#total games played per fixture (df stays grouped by teams_involved)
df <- df %>%
  group_by(teams_involved) %>%
  mutate(total_games_played = n())
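
The rowwise() call above sorts the two team names one match at a time, which can be slow on a long results table. As a possible alternative, here is a minimal sketch that builds the same kind of fixture label with base R's vectorized pmin() and pmax(). The team names are made up for illustration, and this is not the approach used in the analysis itself.

```r
# Hypothetical toy data, not taken from engsoccerdata
home    <- c("Team B", "Team A", "Team C")
visitor <- c("Team A", "Team B", "Team A")

# pmin()/pmax() compare the two vectors element by element,
# so the alphabetically first name always ends up on the left
fixture_key <- paste(pmin(home, visitor), pmax(home, visitor), sep = " - ")

# Number of games per fixture label
games_per_fixture <- table(fixture_key)

print(fixture_key)
print(games_per_fixture)
```

Both orderings of Team A and Team B collapse to the single label "Team A - Team B", which is what lets the later steps count all meetings of a pair together.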

Step 2: Count the number of wins per fixture and find the top 20 most one-sided fixtures

win_count <- df %>%
  count(winner,
        teams_involved,
        total_games_played) %>%
  mutate(win_perc = n / total_games_played) %>%
  ungroup()

more_common_fixtures <- win_count %>%
  filter(total_games_played >= 30)

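#drop "Draw" rows so only actual wins are ranked, then keep the 20 highest win rates
one_sided_fixtures <- more_common_fixtures %>%
  ungroup() %>%
  filter(winner != "Draw") %>%
  top_n(20,
        wt = win_perc)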
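
To make the win-rate calculation easier to follow, here is a small sketch of the same idea on a hypothetical six-game table. The team names and results are invented, and add_count() is used here as a compact stand-in for the group_by() and mutate() step from Step 1; treat it as an illustration rather than the exact pipeline.

```r
library(dplyr)

# Hypothetical toy results, not taken from engsoccerdata
toy <- tibble(
  teams_involved = c(rep("Team A - Team B", 4), rep("Team A - Team C", 2)),
  winner         = c("Team A", "Team A", "Draw", "Team B", "Team C", "Team C")
)

toy_win_rates <- toy %>%
  # total games per fixture, attached to every row
  add_count(teams_involved, name = "total_games_played") %>%
  # wins per team within each fixture
  count(winner, teams_involved, total_games_played) %>%
  mutate(win_perc = n / total_games_played) %>%
  # a draw is not a win for either side
  filter(winner != "Draw") %>%
  arrange(desc(win_perc))

print(toy_win_rates)
```

Team C wins both of its two games against Team A (a win rate of 1), while Team A wins two of four against Team B (0.5). On real data the 30-game threshold keeps short series like the two-game one from dominating the ranking.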

Step 3: Graph using ggplot2 and ggiraph::geom_bar_interactive()

top_graph <- one_sided_fixtures %>%
  ggplot(aes(x = reorder(teams_involved, win_perc),
             y = win_perc,
             tooltip = paste0(winner, " (", n, " of ", total_games_played, ")"))) +
  geom_bar_interactive(stat = "identity", fill = "darkblue") +
  coord_flip() +
  ylim(0, 1) +
  labs(title = "The Top 20 Most One-Sided Matches in English Premier League History",
       y = "Win Percentage",
       x = NULL,
       caption = "Data from engsoccerdata R package") +
  theme(plot.title = element_text(hjust = 0.5))

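#ggiraph() has been deprecated in newer ggiraph releases; girafe() is its replacement
girafe(ggobj = top_graph, options = list(opts_sizing(width = 0.8)))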
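
If you want to share the interactive chart as a standalone page, one option is to save the widget to an HTML file. The sketch below assumes a recent ggiraph release (with girafe() and geom_col_interactive()) and the htmlwidgets package; the toy data and the output file name are hypothetical and only there to keep the example self-contained.

```r
library(ggplot2)
library(ggiraph)

# Hypothetical toy win rates, not taken from engsoccerdata
toy <- data.frame(
  fixture  = c("Team A - Team B", "Team A - Team C"),
  win_perc = c(0.6, 0.7)
)

toy_plot <- ggplot(toy, aes(x = fixture, y = win_perc, tooltip = fixture)) +
  geom_col_interactive(fill = "darkblue") +
  coord_flip() +
  ylim(0, 1)

# Turn the ggplot object into an interactive htmlwidget
widget <- girafe(ggobj = toy_plot)

# Write the widget to an HTML file in a temporary directory
out_file <- file.path(tempdir(), "one_sided_fixtures.html")
htmlwidgets::saveWidget(widget, out_file, selfcontained = FALSE)
```

With selfcontained = FALSE the JavaScript dependencies are written to a folder next to the HTML file, so both need to be uploaded together.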