While some English top-flight match-ups are closely contested, others have historically been far more one-sided. Using the engsoccerdata package in R, I looked at English first-division results (the First Division and, from 1992, the Premier League) from 1888 to 2017 to answer one question: which fixtures have been the most uneven in the English top flight?
Looking only at fixtures where the two teams have met at least 30 times, the most one-sided English first-division fixture ever is Manchester United – Luton Town, with Manchester United winning over 73% of the 30 meetings. Only one other fixture has a win rate above 70%: Liverpool – QPR, with Liverpool victorious in nearly 72% of 46 meetings.
An interactive version of the graph can be found at this link; the static version is below:
The code for the analysis is below (and also on GitHub). There is certainly a lot more analysis to be done on this dataset, so feel free to use my code to produce other interesting insights and visualizations.
Step 1: Initial Pre-Processing
library(dplyr)
library(ggplot2)
library(engsoccerdata)
library(ggiraph)

df <- engsoccerdata::england

# only matches in tier 1 (English First Division and subsequently EPL)
df <- df %>% filter(tier == 1)

# winner and loser of each game ("Draw" for both when the scores are level)
df <- df %>%
  mutate(
    winner = case_when(
      hgoal > vgoal ~ home,
      hgoal < vgoal ~ visitor,
      TRUE ~ "Draw"
    ),
    loser = case_when(
      hgoal < vgoal ~ home,
      hgoal > vgoal ~ visitor,
      TRUE ~ "Draw"
    )
  )

# teams involved: sort the two names so that home/away order does not matter
df <- df %>%
  rowwise() %>%
  mutate(teams_involved = paste(sort(c(home, visitor)), collapse = " - ")) %>%
  ungroup()

# total number of meetings per fixture
df <- df %>%
  group_by(teams_involved) %>%
  mutate(total_games_played = n()) %>%
  ungroup()
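To see what the pre-processing produces, here is a minimal sketch on a few made-up results. The `toy` data frame and its scores are hypothetical and are not taken from engsoccerdata; only the column names (home, visitor, hgoal, vgoal) follow the code above.

```r
library(dplyr)

# Hypothetical toy results, NOT real data from engsoccerdata
toy <- tibble(
  home    = c("Arsenal", "Chelsea", "Arsenal"),
  visitor = c("Chelsea", "Arsenal", "Everton"),
  hgoal   = c(2, 1, 0),
  vgoal   = c(0, 1, 3)
)

toy <- toy %>%
  # label the winner of each game, or "Draw" when the scores are level
  mutate(winner = case_when(
    hgoal > vgoal ~ home,
    hgoal < vgoal ~ visitor,
    TRUE ~ "Draw"
  )) %>%
  # build an order-independent fixture key by sorting the two team names
  rowwise() %>%
  mutate(teams_involved = paste(sort(c(home, visitor)), collapse = " - ")) %>%
  ungroup()

print(toy)
```

Because the names are sorted before pasting, Arsenal at home to Chelsea and Chelsea at home to Arsenal both end up under the same key, "Arsenal - Chelsea", which is what allows home and away meetings to be counted together.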
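Step 2: Count number of wins per fixture and find top 20 most one-sided fixtures
# number of games won by each team (or drawn) within each fixture
win_count <- df %>%
  count(winner, teams_involved, total_games_played) %>%
  mutate(win_perc = n / total_games_played) %>%
  ungroup()

# keep fixtures with at least 30 meetings; drop the "Draw" rows so that
# only actual wins are ranked
more_common_fixtures <- win_count %>%
  filter(total_games_played >= 30, winner != "Draw")

# top 20 by win percentage (ties at the cut-off are kept)
one_sided_fixtures <- more_common_fixtures %>%
  top_n(20, wt = win_perc)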
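The counting step can be checked on a small example. This is only a sketch: the `toy` fixtures and results below are invented for illustration, and the 30-game threshold is left out because the toy data is so small.

```r
library(dplyr)

# Hypothetical results: "A - B" met 4 times, "C - D" met twice
toy <- tibble(
  teams_involved = c(rep("A - B", 4), rep("C - D", 2)),
  winner         = c("A", "A", "A", "Draw", "C", "D")
)

toy_win_count <- toy %>%
  # total meetings per fixture
  group_by(teams_involved) %>%
  mutate(total_games_played = n()) %>%
  ungroup() %>%
  # one row per (winner, fixture) with the number of games n
  count(winner, teams_involved, total_games_played) %>%
  mutate(win_perc = n / total_games_played)

# rank actual wins only, then keep the single most one-sided fixture
toy_top <- toy_win_count %>%
  filter(winner != "Draw") %>%
  top_n(1, wt = win_perc)

print(toy_top)
```

Here team A wins 3 of the 4 meetings in "A - B", giving a win_perc of 0.75, while C and D each win 1 of 2 in "C - D", so "A - B" is the fixture that comes out on top.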
Step 3: Graph using ggplot2 and ggiraph::geom_bar_interactive()
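top_graph <- one_sided_fixtures %>%
  ggplot(aes(x = reorder(teams_involved, win_perc),
             y = win_perc,
             # tooltip shows the dominant team and its record, e.g. "Team (22 of 30)"
             tooltip = paste0(winner, " (", n, " of ", total_games_played, ")"))) +
  geom_bar_interactive(stat = "identity", fill = "darkblue") +
  coord_flip() +
  ylim(0, 1) +
  labs(title = "The Top 20 Most One-Sided Fixtures in English Top-Flight History",
       y = "Win Percentage",
       x = NULL,
       caption = "Data from engsoccerdata R package") +
  theme(plot.title = element_text(hjust = 0.5))

# Later ggiraph releases replace ggiraph() with girafe(); with an older
# version, the original call was ggiraph(code = print(top_graph), width = 0.8)
girafe(ggobj = top_graph, options = list(opts_sizing(width = 0.8)))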
