# Score Involvements

June 20, 2018

I got an email from someone who was reading the Footballistics book; they were really into it and had got up to chapter 3.

They must be a really big Western Bulldogs fan, with a lot of theories as to why the team seems to have dropped off so suddenly after its premiership year in 2016. Perhaps it has something to do with the spread of score involvements, or with who is involved in the scoring chain?

# Step one – recreate what is going on in the book

The graph I want to recreate, to check I understand what is going on, is the Adelaide Crows graph of % involvement in team scores in 2017.

``````library(fitzRoy)
library(tidyverse)``````
``## -- Attaching packages -------------------------------- tidyverse 1.2.1 --``
``````## v ggplot2 2.2.1     v purrr   0.2.5
## v tibble  1.4.2     v dplyr   0.7.5
## v tidyr   0.8.1     v stringr 1.3.0
## v readr   1.1.1     v forcats 0.3.0``````
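``## -- Conflicts ----------------------------------- tidyverse_conflicts() --``
``````# Keep Adelaide's 2017 games only (the "Adelaide" team label is an assumption)
fitzRoy::player_stats%>%
filter(Season==2017, Team=="Adelaide")%>%
select(Player, SI, G, B, Round)%>%
group_by(Round)%>%
# share of the team's player-credited scores (goals + behinds) in each game
mutate(SIT=SI/((sum(G)+sum(B))))%>%
group_by(Player)%>%
# average the per-game shares over the games each player played
summarise(averageSIT=mean(SIT))%>%
arrange(desc(averageSIT))``````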
``````## # A tibble: 31 x 2
##    Player          averageSIT
##
##  1 Taylor Walker        0.323
##  2 Tom Lynch            0.300
##  3 Matt Crouch          0.293
##  4 Eddie Betts          0.285
##  5 Rory Sloane          0.272
##  6 Josh Jenkins         0.254
##  7 Mitch McGovern       0.252
##  8 Richard Douglas      0.234
## 10 Sam Jacobs           0.224
## # ... with 21 more rows``````

Our leaderboard doesn’t quite align with what the book says. So what could be going on here?

It could be a coding issue, an interpretation issue, or something else entirely.
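One interpretation question is how a season figure gets built from game-level numbers. Here is a minimal sketch with made-up numbers (the `games` data frame and its values are hypothetical, not Adelaide data). It shows that averaging per-game shares and dividing season totals can give different answers. Which of the two the book uses is not something I can confirm here.

```r
# Hypothetical numbers for one player over three games (not real data)
games <- data.frame(
  SI          = c(10, 4, 9),    # player's score involvements per game
  team_scores = c(35, 20, 30)   # team goals + behinds per game
)

# Interpretation 1: average the per-game shares (what the script above does)
mean_of_shares <- mean(games$SI / games$team_scores)

# Interpretation 2: pool the whole season, then take a single share
pooled_share <- sum(games$SI) / sum(games$team_scores)

round(c(mean_of_shares = mean_of_shares, pooled_share = pooled_share), 3)
```

The two only agree when the team scores the same number of times in every game, so a gap between our numbers and the book’s could come from this choice alone.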

The first thing we can do is a quick manual check against the numbers on footywire.

To do this, we run just the first part of the script above, keeping the team totals as columns:

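``````# Keep Adelaide's 2017 games only (the "Adelaide" team label is an assumption)
fitzRoy::player_stats%>%
filter(Season==2017, Team=="Adelaide")%>%
select(Player, SI, G, B, Round)%>%
group_by(Round)%>%
# TG and TB: goals and behinds credited to Adelaide's players in that game
mutate(TG=sum(G), TB=sum(B))%>%
mutate(SIT=SI/(TG+TB))``````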
``````## # A tibble: 550 x 8
## # Groups:   Round [25]
##    Player             SI     G     B Round      TG    TB   SIT
##
##  1 Rory Laird         10     0     0 Round 1    22    13 0.286
##  2 Matt Crouch         8     0     0 Round 1    22    13 0.229
##  3 Richard Douglas     9     2     0 Round 1    22    13 0.257
##  4 Rory Sloane        10     0     0 Round 1    22    13 0.286
##  5 Charlie Cameron    11     2     3 Round 1    22    13 0.314
##  6 Wayne Milera        6     1     0 Round 1    22    13 0.171
##  7 David MacKay       15     1     0 Round 1    22    13 0.429
##  8 Josh Jenkins       12     3     3 Round 1    22    13 0.343
##  9 Brodie Smith       10     1     2 Round 1    22    13 0.286
## 10 Rory Atkins         8     3     0 Round 1    22    13 0.229
## # ... with 540 more rows``````

We need to check whether, in round 1 of 2017, Rory Laird was involved in 28.57% of Adelaide’s scores. We can see he had 10 SI, and Adelaide’s players scored 22 goals and 13 behinds in total.

``10/(22+13)``
``## [1] 0.2857143``

Looking at the page though, we didn’t include the rushed behinds! Perhaps this is the missing data that will get our numbers to align.
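To see why rushed behinds matter: they count towards the team’s score but are not credited to any player, so summing the players’ `B` column undercounts the team’s behinds. A small sketch of the arithmetic follows. The `team_behinds` value is a hypothetical placeholder, not a number taken from the footywire page.

```r
# Round 1 figures from the table above
si             <- 10   # Rory Laird's score involvements
player_goals   <- 22   # goals summed over Adelaide's players
player_behinds <- 13   # behinds summed over Adelaide's players

# Hypothetical: suppose the scoreboard showed 15 behinds for the team
team_behinds <- 15

# Behinds on the scoreboard that no player was credited with
rushed_behinds <- team_behinds - player_behinds

share_excluding_rushed <- si / (player_goals + player_behinds)
share_including_rushed <- si / (player_goals + team_behinds)

round(c(rushed    = rushed_behinds,
        excluding = share_excluding_rushed,
        including = share_including_rushed), 3)
```

Any rushed behinds make the denominator bigger, so every player’s share for that game can only go down once they are included.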

One last check before we test whether it’s the rushed behinds: let’s make sure we are looking at the right number of games.

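``````# Keep Adelaide's 2017 games only (the "Adelaide" team label is an assumption)
fitzRoy::player_stats%>%
filter(Season==2017, Team=="Adelaide")%>%
select(Player, SI, G, B, Round)%>%
group_by(Round)%>%
mutate(SIT=SI/((sum(G)+sum(B))))%>%
group_by(Player)%>%
# one row per player per game, so the row count is games played
tally()``````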
``````## # A tibble: 31 x 2
##    Player              n
##
##  1 Alex Keath          6
##  2 Andy Otten         19
##  4 Brodie Smith       23
##  5 Charlie Cameron    24
##  6 Curtly Hampton      9
##  7 Daniel Talia       24
##  8 David MacKay       22
##  9 Eddie Betts        24
## 10 Hugh Greenwood     15
## # ... with 21 more rows``````

Here it gives Tex (Taylor Walker) as having played 23 games. Let’s move on to see if it’s the rushed behinds!
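The same kind of check can be sketched on toy data. Here `toy` and its player names are hypothetical. The idea is that a player’s row count should equal the number of distinct rounds they appear in (otherwise some rows are duplicated), and comparing that with the team’s game count shows how many games they missed.

```r
library(dplyr)

# Hypothetical player-game rows: one row per player per game played
toy <- data.frame(
  Player = c("Player A", "Player A", "Player A", "Player B", "Player B"),
  Round  = c("Round 1", "Round 2", "Round 3", "Round 1", "Round 3"),
  stringsAsFactors = FALSE
)

# Number of games the team played
team_games <- n_distinct(toy$Round)

games_played <- toy %>%
  group_by(Player) %>%
  summarise(n = n(), rounds = n_distinct(Round)) %>%
  mutate(missed = team_games - rounds)

games_played
```

If `n` and `rounds` ever disagree, averaging per-game shares would weight some games twice.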

``````# Player stats and match results for the 2017 season
df<-fitzRoy::player_stats%>%filter(Season==2017)
df2<-fitzRoy::match_results%>%filter(Season==2017)

# Team totals from the match results include rushed behinds.
# Give the home and away columns the same names so they can be stacked.
df3<-select(df2, Date, Round, Team=Home.Team, Goals=Home.Goals, Behinds=Home.Behinds)
df4<-select(df2, Date, Round, Team=Away.Team, Goals=Away.Goals, Behinds=Away.Behinds)

# One row per team per game
df5<-rbind(df4,df3)

# Attach each team's game totals to its players' rows
df6<-inner_join(df,df5, by=c("Team","Date"))

# Adelaide only (the "Adelaide" team label is an assumption)
df6%>%
filter(Team=="Adelaide")%>%
select(Player, SI, Goals, Behinds, Round.x)%>%
group_by(Round.x)%>%
mutate(SIT=SI/(((Goals)+(Behinds))))%>%
group_by(Player)%>%
summarise(averageSIT=mean(SIT))%>%
arrange(desc(averageSIT))``````
``````## # A tibble: 31 x 2
##    Player          averageSIT
##
##  1 Taylor Walker        0.291
##  2 Tom Lynch            0.268
##  3 Matt Crouch          0.261
##  4 Eddie Betts          0.257
##  5 Rory Sloane          0.243
##  6 Mitch McGovern       0.234
##  7 Josh Jenkins         0.227
##  8 Richard Douglas      0.211
## 10 Sam Jacobs           0.199
## # ... with 21 more rows``````
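Since the join is where the two data sources meet, it is worth checking that it isn’t quietly losing rows. Below is a self-contained sketch of the same reshape and join on toy data. The `matches` and `players` tables, team names and numbers are all hypothetical, and `bind_rows()` with `transmute()` is just one way to stack the home and away sides.

```r
library(dplyr)

# Hypothetical match results, one row per game (home and away side by side)
matches <- data.frame(
  Date         = as.Date(c("2017-03-26", "2017-04-01")),
  Home.Team    = c("Team A", "Team B"),
  Home.Goals   = c(22, 10),
  Home.Behinds = c(15, 8),
  Away.Team    = c("Team B", "Team A"),
  Away.Goals   = c(14, 12),
  Away.Behinds = c(7, 11),
  stringsAsFactors = FALSE
)

# Hypothetical player rows, one per player per game
players <- data.frame(
  Date   = as.Date(c("2017-03-26", "2017-03-26", "2017-04-01")),
  Team   = c("Team A", "Team B", "Team A"),
  Player = c("Player 1", "Player 2", "Player 1"),
  SI     = c(10, 6, 8),
  stringsAsFactors = FALSE
)

# Stack home and away sides so each row is one team in one game
team_totals <- bind_rows(
  transmute(matches, Date, Team = Home.Team,
            Goals = Home.Goals, Behinds = Home.Behinds),
  transmute(matches, Date, Team = Away.Team,
            Goals = Away.Goals, Behinds = Away.Behinds)
)

joined <- inner_join(players, team_totals, by = c("Team", "Date")) %>%
  mutate(SIT = SI / (Goals + Behinds))

# An inner join silently drops unmatched rows, so compare row counts
nrow(joined) == nrow(players)

# Player rows with no matching team total (ideally none)
unmatched <- anti_join(players, team_totals, by = c("Team", "Date"))
unmatched
```

If team names or dates are recorded differently in the two tables, the unmatched rows show up in `unmatched` instead of vanishing from the averages.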

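OK, including the rushed behinds lowers everyone’s share but leaves the leaderboard in much the same order, so it still doesn’t seem as though we know what is up. Post to be updated!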
