Visualizing the Premier League So Far, Part 1: Overview with xPts Tables and xG Plots

November 20, 2019

[This article was first published on R by R(yo), and kindly contributed to R-bloggers.]

It’s the November International Break and we’re around a third of the
season through in the Premier League. To break the monotony I decided to
do a blog post exploring how teams are doing so far according to a
variety of metrics such as shots, goals, xG, xPts, etc. both for and
against. For this post specifically I talk about some overall graphs and
tables based on xG and xPts. Later parts (Parts 2, 3, etc.) will look
much more closely at how teams are doing in every 15 minute interval of
games, how they’re performing offensively and defensively across
different game situations, and more. It may still be a little early
to do something like this, but I wanted a snapshot to compare against
later, showing how things stood before the hectic holiday schedule set in
(and personally, to get more practice writing about this stuff). In any
case keep in mind that there have only been 12 games so far.

Learning from some of my previous soccer-related blog posts (Exhibit A,
Exhibit B), many of which are lengthier than an A Song of Ice & Fire novel,
I’ve decided against doing another blog post that is simultaneously an
analysis and an R programming tutorial. I’ve elected to post the R
code to get the data and create the viz in a separate file this time. As
always all the data is gathered with R, tidied with the {tidyverse}, and
the viz are created with {ggplot2} and friends.

You can follow along with the code here.

Anyway, let’s get started!

xPts (Expected Points) Table

Expected points (xPts) are an extension of the expected goals (xG) models
you might be familiar with if you’ve at least scratched the surface of
footy analytics. If you’re not familiar with xG then take a quick look at
my Copa America blog post or this article from FBref for a primer. Given
the expected goals values each team created during a match, one can
simulate the match thousands of times to estimate the probabilities that
it ends in a win, draw, or loss. These probabilities (match odds) are then
converted into the number of points (anywhere between 0 and 3) that a team
could expect to win given the xG they and their opponents accumulated:
3 times the win probability plus 1 times the draw probability.
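The simulation idea can be sketched in a few lines. This is a minimal Python sketch rather than the R code used for this post (the data here comes from understat.com, and its exact model is not reproduced). It assumes each shot is an independent coin flip with probability equal to its xG; the function name `simulate_xpts` and the shot values are made up for illustration.

```python
import random


def simulate_xpts(home_shots, away_shots, n_sims=10_000, seed=42):
    """Estimate expected points for both teams from per-shot xG values.

    Assumption: every shot is an independent Bernoulli trial whose success
    probability is its xG. Real xPts models may differ in the details.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    home_points = 0
    away_points = 0
    for _ in range(n_sims):
        # Replay the match once: each shot either goes in or it doesn't.
        home_goals = sum(rng.random() < xg for xg in home_shots)
        away_goals = sum(rng.random() < xg for xg in away_shots)
        if home_goals > away_goals:
            home_points += 3
        elif home_goals < away_goals:
            away_points += 3
        else:
            home_points += 1
            away_points += 1
    # Average points per simulated match = expected points.
    return home_points / n_sims, away_points / n_sims


# Hypothetical match: the home side creates 1.0 xG, the away side 0.3 xG.
home_xpts, away_xpts = simulate_xpts(
    home_shots=[0.45, 0.30, 0.12, 0.08, 0.05],
    away_shots=[0.20, 0.07, 0.03],
)
print(f"home xPts: {home_xpts:.2f}, away xPts: {away_xpts:.2f}")
```

Summing these per-match values over a season gives the kind of xPts column shown in the table below. Because a draw hands out only two points in total, the two teams' xPts for a single match always add up to somewhere between 2 and 3.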

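League Table by xPts

Column groups: Points (xPts, Points, PD), Result (W, D, L), Goals (For, Against, GD), Expected Goals (xG, xGA, xGD). PD is xPts minus actual points.

Team (Actual Rank) xPts Points PD W D L For Against GD xG xGA xGD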
Manchester City (4) 28.3 25 3.3 8 1 3 35 13 22 36.1 13.5 22.6
Liverpool (1) 23.9 34 -10.1 11 1 0 28 10 18 22.1 11.7 10.4
Chelsea (3) 23.3 26 -2.7 8 2 2 27 17 10 25.2 12.7 12.5
Manchester United (7) 23.2 16 7.2 4 4 4 16 12 4 17.2 11.1 6.1
Everton (15) 20.2 14 6.2 4 2 6 13 18 -5 17.8 12.3 5.5
Wolverhampton Wanderers (8) 17.8 16 1.8 3 7 2 16 15 1 15.0 13.9 1.1
Leicester (2) 17.8 26 -8.2 8 2 2 29 8 21 14.9 11.6 3.3
Burnley (10) 17.7 15 2.7 4 3 5 17 18 -1 16.3 13.6 2.7
Brighton (11) 16.5 15 1.5 4 3 5 15 17 -2 16.7 19.6 -2.9
Watford (18) 16.1 8 8.1 1 5 6 8 23 -15 13.8 18.4 -4.6
Arsenal (6) 15.3 17 -1.7 4 5 3 16 17 -1 15.4 15.4 0.0
Tottenham (14) 15.3 14 1.3 3 5 4 18 17 1 13.2 16.3 -3.1
Bournemouth (9) 14.8 16 -1.2 4 4 4 15 15 0 15.7 18.1 -2.4
Sheffield United (5) 14.8 17 -2.2 4 5 3 13 9 4 14.6 16.4 -1.8
Southampton (19) 13.7 8 5.7 2 2 8 11 29 -18 15.0 21.0 -6.0
Aston Villa (17) 12.7 11 1.7 3 2 7 17 20 -3 15.4 20.9 -5.5
Crystal Palace (12) 12.4 15 -2.6 4 3 5 10 16 -6 9.7 19.4 -9.7
West Ham (16) 11.5 13 -1.5 3 4 5 14 20 -6 15.5 22.3 -6.8
Newcastle United (13) 9.5 15 -5.5 4 3 5 11 18 -7 10.0 20.2 -10.2
Norwich (20) 8.3 7 1.3 2 1 9 11 28 -17 12.7 23.5 -10.8
* All xG values are non-penalty xG.

At the top of the table, City are clearly in the lead and their expected
numbers match up well with their actual numbers, scoring a whopping 35
goals (pretty close to their 36.1 xG). Liverpool are overperforming
their xG by close to 6 goals, scoring 28 from 22.1 xG, and have the
biggest gap between xPts and actual points (10.1): they lead the Premier
League table with 34 points when xPts had them at 23.9. Meanwhile Chelsea
might rue their bad luck (or their opponents’ good finishing), having
conceded a bit over 4 goals more than their xGA. In terms of xGD per game,
City are at 1.9, far ahead of their rivals Chelsea and Liverpool (at 1 and
0.9 xGD per game respectively).
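The arithmetic behind these comparisons is simple enough to check by hand. Here is a minimal Python sketch (again, not the R code behind the post) using only the rounded values from the table above; the dictionary layout and key names are my own for illustration.

```python
GAMES_PLAYED = 12

# Rounded values copied from the xPts table above.
teams = {
    "Manchester City": {"goals": 35, "conceded": 13, "xg": 36.1, "xga": 13.5,
                        "points": 25, "xpts": 28.3},
    "Liverpool": {"goals": 28, "conceded": 10, "xg": 22.1, "xga": 11.7,
                  "points": 34, "xpts": 23.9},
    "Chelsea": {"goals": 27, "conceded": 17, "xg": 25.2, "xga": 12.7,
                "points": 26, "xpts": 23.3},
}

summary = {}
for name, t in teams.items():
    summary[name] = {
        # Positive = scoring more than the chances suggest.
        "goals_minus_xg": t["goals"] - t["xg"],
        # Positive = conceding more than the chances suggest.
        "conceded_minus_xga": t["conceded"] - t["xga"],
        # The PD column: positive = fewer real points than expected.
        "xpts_minus_points": t["xpts"] - t["points"],
        # Expected goal difference per game played.
        "xgd_per_game": (t["xg"] - t["xga"]) / GAMES_PLAYED,
    }

for name, s in summary.items():
    print(name, {key: round(value, 1) for key, value in s.items()})
```

Since the inputs are rounded to one decimal place, the results can differ slightly from figures computed on the unrounded data.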

The biggest gap between a club’s real rank and their xPts rank belongs to
Everton, who given their xG might be quite a lot higher up the table than
their current lowly 15th place. Another struggling club, Watford, could
also be quite comfortable in mid-table if results had gone the way their
xG performance merited (especially in attack, where they’ve been awful at
finishing their chances: only 8 goals from 13.8 xG).

The biggest movers downwards from the actual league table based on xPts
are newly promoted Sheffield United who tumble down from 5th to 14th.
Most of this may stem from the fact that they have conceded far fewer
goals than their xGA would suggest (9 conceded from 16.4 xGA). This gap
between xGA and goals conceded should catch up to them eventually and
although their 5th place position flatters them a bit, in my opinion
they’ve been doing well enough that they’ll end up in mid-table very
comfortably. They certainly are an interesting team tactically that you
might want to read up on (Edit: As I was typing this blog post StatsBomb
came out with an article regarding the Blades).
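To see where the biggest risers and fallers come from, you can compare each club’s actual rank with its position in the xPts table. Below is a minimal Python sketch (not the post’s R code) using the table above. It assumes the xPts rank is simply the row order of that table, so clubs tied on rounded xPts keep the order shown there; the variable names are my own.

```python
# (team, actual league rank), listed in xPts order exactly as in the table above.
XPTS_ORDER = [
    ("Manchester City", 4), ("Liverpool", 1), ("Chelsea", 3),
    ("Manchester United", 7), ("Everton", 15),
    ("Wolverhampton Wanderers", 8), ("Leicester", 2), ("Burnley", 10),
    ("Brighton", 11), ("Watford", 18), ("Arsenal", 6), ("Tottenham", 14),
    ("Bournemouth", 9), ("Sheffield United", 5), ("Southampton", 19),
    ("Aston Villa", 17), ("Crystal Palace", 12), ("West Ham", 16),
    ("Newcastle United", 13), ("Norwich", 20),
]

# Positive shift = the club would sit higher in an xPts table than it does
# in the real one; negative shift = the real table flatters the club.
rank_shift = {
    team: actual_rank - xpts_rank
    for xpts_rank, (team, actual_rank) in enumerate(XPTS_ORDER, start=1)
}

biggest_riser = max(rank_shift, key=rank_shift.get)
biggest_faller = min(rank_shift, key=rank_shift.get)

print(biggest_riser, rank_shift[biggest_riser])
print(biggest_faller, rank_shift[biggest_faller])
```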

Leicester’s actual numbers (29 goals scored, second most in the league),
while very good, do have to come with the caveat that a lot of their
goals have come against ten men (13 of 29, or about 45%). They
are around mid-table in terms of xG per game (see next few graphs below)
and they have been overperforming their xG by a massive 14 goals, so we
would expect the numbers to drop closer to their xG values sometime soon
(especially for Jamie Vardy, who’s scored 11 goals from just 4.67 xG
according to infogol).
However their good defensive record should keep them in fighting shape
for the Champions League spots (especially as the other
traditionally “top” teams look mediocre this season). If their numbers
do start declining it may be an interesting analysis to drill deeper to
figure out why, as the numbers in the above table only give a very
broad overview of performance.

non-Penalty xGD (Expected Goal Difference) & xG per Game

Goal difference as a stat is known to any fan and is also used as a
tie-breaker when teams are level on points in a league table. However, as
we know from the various “expected” stats, goal difference only looks at
the end results and not the actual quality of chances a team
created/conceded during games. We’ve seen many teams score lucky goals
and/or concede under very unlucky circumstances, so looking at a team’s
expected goal difference (xGD, their xG minus their xGA) can provide a
better understanding as to which teams are performing well overall
(offensively and defensively).

City are well ahead of the pack while Arsenal are about even so far,
around 0.015 xGD to be more precise. Like in the xPts league table,
Everton are doing fairly well despite what their actual results might
suggest along with Manchester United.

Norwich, Newcastle, and Crystal Palace are the three worst teams mainly
due to their atrocious attacking numbers. On the defensive side
Newcastle and Crystal Palace are slightly better (but not by much) while
Norwich are leaking chances on their end of the pitch. Although the
Canaries won praise for their adventurous style at the start of the
season, a huge injury crisis has brought them back down to earth: they
have conceded 28 goals (second worst in the league, one fewer than fellow
strugglers Southampton) from 23.5 xGA (the worst in the league).

Spurs are in the bottom half on xGD, mainly due to their awful attacking
numbers as they are fourth from the bottom for xG per game! Their
attacking numbers have steadily declined compared to two seasons ago and
last season’s Champions League final run slightly masked their poor
domestic record since the start of 2019 (11 Wins, 7 Draws, 12 Losses).
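The per-game rankings discussed in this section can be rebuilt from the xG and xGA columns of the xPts table. Here is a minimal Python sketch (the post’s plots are made in R with {ggplot2}; this only redoes the arithmetic). The values are the rounded ones from the table, so something like Arsenal’s 0.015 xGD shows up as 0.0 here.

```python
GAMES_PLAYED = 12

# (team, non-penalty xG, non-penalty xGA), copied from the xPts table.
TABLE = [
    ("Manchester City", 36.1, 13.5), ("Liverpool", 22.1, 11.7),
    ("Chelsea", 25.2, 12.7), ("Manchester United", 17.2, 11.1),
    ("Everton", 17.8, 12.3), ("Wolverhampton Wanderers", 15.0, 13.9),
    ("Leicester", 14.9, 11.6), ("Burnley", 16.3, 13.6),
    ("Brighton", 16.7, 19.6), ("Watford", 13.8, 18.4),
    ("Arsenal", 15.4, 15.4), ("Tottenham", 13.2, 16.3),
    ("Bournemouth", 15.7, 18.1), ("Sheffield United", 14.6, 16.4),
    ("Southampton", 15.0, 21.0), ("Aston Villa", 15.4, 20.9),
    ("Crystal Palace", 9.7, 19.4), ("West Ham", 15.5, 22.3),
    ("Newcastle United", 10.0, 20.2), ("Norwich", 12.7, 23.5),
]

per_game = [
    {
        "team": team,
        "xg_pg": xg / GAMES_PLAYED,            # chances created per game
        "xga_pg": xga / GAMES_PLAYED,          # chances conceded per game
        "xgd_pg": (xg - xga) / GAMES_PLAYED,   # net chance quality per game
    }
    for team, xg, xga in TABLE
]

# Three worst teams by xGD per game, worst first.
worst_xgd = [row["team"] for row in sorted(per_game, key=lambda r: r["xgd_pg"])[:3]]
# Four weakest attacks by xG per game, weakest first.
lowest_xg = [row["team"] for row in sorted(per_game, key=lambda r: r["xg_pg"])[:4]]

print("Worst xGD per game:", worst_xgd)
print("Lowest xG per game:", lowest_xg)
```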

Match Reports

If you follow me on Twitter you might’ve seen that I do xG timelines and
shot maps for certain games. I mainly do one for Liverpool as an
ongoing thread that I update every match day but I also do them for
other games (mainly Bundesliga) that I was able to watch that weekend.
You can find the Liverpool matchday thread here and you can find out how I
create these by checking out my StatsBomb Messi Data Viz blog post from a
few months back.

Here’s an example:

In recent games I also started taking notes during the match and then
posting an edited version under the report viz. Basically, a rough
bullet-point summary of what I saw, some random thoughts, and trying to
find patterns among my match notes. Here’s an example from the Liverpool vs. City game:

It’s weird, I’ve been watching all my life but never bothered taking
notes until I started doing all this data analysis stuff a year or so
ago. The closest I’ve come before was doing some self-analysis on my own
footage as well as a school project where I did play-by-play commentary
on a local game! It’s important to watch and learn from the game as well
as poring over the stats so it’s a nice weekly exercise for me.

In terms of tactical stuff (rather than stats) you should check out
spielverlagerung.com,
zonalmarking.net,
betweentheposts.net, and many more (the
first two were sites I used to read occasionally when I was a kid,
recommended by one of my coaches I think, and recently took up reading
again). Of course, there are tons of people on Twitter you might want to
follow as well (ask me or take a look at the people I follow).

Conclusion & Preview for Part 2

We are still only 12 games in and with the hectic holiday schedule
(especially Liverpool with the Club World Cup…) coming up a lot can
change in even a month from now. This was just a short appetizer for
what I have in store for this data that I got from
understat.com and I’ll be drilling deeper to
investigate some of the numbers that came up above in the next few blog
posts.

As mentioned, the next part will take a look at team stats based on
different game situations like from open play, free kicks, and corners.
Here’s a sample (still very much a work-in-progress):
