Who Had the Best Fantasy Football Season Ever?

Posted on June 23, 2016 by Jesse in R bloggers | 0 Comments

[This article was first published on R – Jesse Piburn, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Who had the best fantasy football season ever? Well like most “best ever” lists, the real answer is… it depends. The results you get depend on how you define “best” and how you slice and dice the data. Want to claim James Brim is the best WR of all-time? We can do that… I mean come on if only he played a few more games.

Before we start ranking, we need the data. Pro Football Reference is a great source of football stats and all things pro football. They publish player stats for every season all the way back until 1970. As an added bonus, they also provide each player’s total fantasy points for the season for three different scoring methods; standard, FanDuel, and DraftKings, which as my friend Forrest Gump would say, “That’s good! one less thing”.

To get the data I wrote two functions getSeason and getPFR. getSeason does all the work and getPFR wraps getSeason in lapply and binds all the data frames together so we can pass a vector of years rather than just one at a time. Special thanks to Hadley Wickham for the great rvest, dplyr, and ggplot2 packages and Jeffrey Arnold for the fantastic ggthemes package.

library(rvest)
library(dplyr)

getSeason <- function(season) {
  
  url <- paste0("http://www.pro-football-reference.com/years/", season, "/fantasy.htm")
  df_list <- url %>% html() %>% html_table()
  
  if (length(df_list) == 0) stop("No html table found at ", url)
  
  df <- df_list[[1]]
  
  ## clean up ----
  
  ## names ----
  df_names <- c("yr_rk", "plyr", "tm", "age", "g", "gs", "pass_cmp", "pass_att",
                "pass_yds", "pass_tds", "pass_int", "rush_att", "rush_yds", 
                "rush_yds_per", "rush_td", "rec_tgt", "rec", "rec_yds", 
                "rec_yds_per", "rec_tds", "fant_pos", "fant_pt", "fant_dk_pt", 
                "fant_fd_pt", "fant_vbd", "fant_pos_rank", "fant_ov_rank")
  
  # targets not available before 1992, rbind_all will handle merging
  if (season < 1992) df_names <- setdiff(df_names, "rec_tgt")
  
  names(df) <- df_names
  
  ## headers ----
  # 1st row is sub headers so remove those
  df <- df[2:nrow(df), ]
  
  ## extra rows ----
  # remove rows without player info
  df[which(df$plyr == ""), "plyr"] <- NA
  df <- df[!is.na(df$plyr), ]
  
  df <- df[which(df$fant_pos != ""), ]
  

  ## plyr names ----
  # some names end in * or *+ or +
  df$plyr <- gsub("\\*|\\*\\+|\\+", "", df$plyr)
  
  # turn numeric cols into numeric, that also turns blanks into NAs
  num_cols <- setdiff(names(df), c("plyr", "tm", "fant_pos"))
  df[num_cols][] <- lapply(df[num_cols], as.numeric)
  
  df$yr <- season
  df$yr_rk <- NULL
  
  # make column for nice plotting
  df <- df %>% mutate(plyr_info = paste0(plyr, " (", tm, ") ", yr))
  
  df
}


getPFR <- function(seasons) {
  
  df_list <- lapply(seasons, FUN = getSeason)
 
  out <- rbind_all(df_list)

  out
}


df <- getPFR(2015:1970)

Now that we have the function to get the data we can start answering our question. For all of the charts below, the scoring is based on FanDuel scoring, because they use 0.5 PPR (points per reception) and ~~I feel that 0.5 PPR is a more accurate representation of the true performances of NFL players~~ my league, The Saturated Unicorn (no, it isn’t a sexual innuendo) is 0.5 PPR. But, of course you can just change the column name to the standard or DraftKings scoring for a quick look at the others.

Going Forward

The data from PFF is only available from 1970 onwards (sorry, Jim Brown fans)
Scoring is based on FanDuel
PFF only has the stats on QBs, RBs, WRs, and TEs. So this doesn’t consider Kickers, Defenses, or Individual Defensive Players

Most Total Points in a Single Season

Not too surprising, LaDainian Tomlinson’s (453.1) mythical 2006 MVP season is by far the most total points ever scored. LT’s 453.1 points are 31.1 points more than 2nd place, Peyton Manning’s (422.0) 2013 campaign. Of the top 50 performances, 26 are by QBs, 23 by RBs, and only 1 by a WR, Jerry Rice (356.0). Tom Brady and Aaron Rodgers are tied for the most appearances with 4 each. Probably the most impressive is that 22nd most points ever scored were by O.J. Simpson (369.3) in 1975, back when the NFL season was still 14 games.

library(dplyr)
library(ggplot2)
library(ggthemes)

# Most Points in a Single Season -----
top_season_df <- df %>% arrange(desc(fant_fd_pt)) %>% 
  mutate(rank = 1:n()) %>% select(plyr_info, plyr, fant_pos, tm, yr, fant_fd_pt, rank) %>%
  filter(rank <= 50)

ggplot(top_season_df, aes(x = reorder(plyr_info, fant_fd_pt), y = fant_fd_pt, fill = fant_pos)) + 
  geom_bar(stat = "identity", position = position_dodge(width=1)) + 
  theme_fivethirtyeight() + 
  theme(panel.grid.major.y = element_blank()) +
  scale_fill_manual(guide = guide_legend(title = "Position"),
                      values = c("QB" = "#F15A60","RB" = "#7AC36A", "WR" = "#5A9BD4", "TE" = "#FAA75B")) +
  ggtitle("Most Total Fantasy Points, Single Season") +
  theme(plot.title = element_text(size=14, hjust=0)) +
  coord_flip(ylim=c(325,450))

Most Points in a Single Season, per Games Played

Well look at that, maybe LT didn’t have the best season ever after all, OK well he did, but maybe Rams fans have a little ammunition to argue otherwise now. Marshall Faulk averaged 29.96 points in the 14 games he played in the 2000 regular season besting LTs 2006 per game by about 1.6 points per game. When you look at the points on a per game basis, it makes you think of oh what could have been, Priest Holmes in 2004, Jerry Rice in 1987, Jake Delhomme in 1999…

library(dplyr)
library(ggplot2)
library(ggthemes)

# Most Points in a Single Season, per Game Played -----
pg_avg_df <- df %>% group_by(plyr, tm, yr) %>%
  mutate(ppg = round(fant_fd_pt / g, 2),
         plyr_label = paste0(plyr, " (", tm, ") ", yr, " (",g,")")) %>%
  ungroup() %>%
  arrange(desc(ppg)) %>%
  mutate(rank = 1:n()) %>%
  select(plyr_label, ppg, rank, fant_pos) %>%
  filter(rank <= 50)

ggplot(pg_avg_df, aes(x = reorder(plyr_label, ppg), y = ppg, fill = fant_pos)) + 
  geom_bar(stat = "identity", position = position_dodge(width=1)) + 
  theme_fivethirtyeight() + 
  theme(panel.grid.major.y = element_blank()) +
  scale_fill_manual(guide = guide_legend(title = "Position"),
                    values = c("QB" = "#F15A60","RB" = "#7AC36A", "WR" = "#5A9BD4", "TE" = "#FAA75B")) +
  ggtitle("Most Fantasy Points in Single Season, per Games Played") +
  theme(plot.title = element_text(size=11, hjust=0)) +
  coord_flip(ylim=c(20,30))

Largest Single Season Value Over Replacement Player (VORP)

Another common way of looking fantasy scoring is Value Over Replacement Player (VORP). The idea is pretty simple, the value of a certain player is relative to a player that you could replace them with. If Dorial Green-Beckham is the best available WR for pickup, the value of Jarvis Landry, is not what Landry can score, it’s how many more points can he score than Green-Beckham. To use this metric, you have to set replacement levels for each position, this would vary by roster type and league size but common ones are QB-12, RB-24, WR-24, and TE-12 so we’ll use those for our rankings.

library(dplyr)
library(ggplot2)
library(ggthemes)

# Largest Single Season Value Over Replacement Player (VORP)
vorp_df <- df %>% group_by(fant_pos, yr) %>% arrange(desc(fant_fd_pt)) %>% 
  mutate(rank = 1:n(),
         replace_rank = ifelse(fant_pos == "QB", 12, 30),
         replace_rank = ifelse(fant_pos == "RB", 24, replace_rank),
         replace_rank = ifelse(fant_pos == "WR", 24, replace_rank),
         replace_rank = ifelse(fant_pos == "TE", 12, replace_rank),
         replace_value = nth(fant_fd_pt, median(replace_rank)),
         vor = fant_fd_pt - replace_value,
         vor_per = round( (vor/replace_value) * 100, 2)) %>% ungroup() %>% arrange(desc(vor)) %>% 
  mutate(vor_rank = 1:n()) %>% select(plyr_info, fant_pos, vor,vor_rank) %>%
  filter(vor_rank <= 50)

ggplot(vor_df, aes(x = reorder(plyr_info, vor), y = vor, fill = fant_pos)) + 
  geom_bar(stat = "identity", position = position_dodge(width=1)) + 
  theme_fivethirtyeight() + 
  theme(panel.grid.major.y = element_blank()) +
  scale_fill_manual(guide = guide_legend(title = "Position"),
                    values = c("QB" = "#F15A60","RB" = "#7AC36A", "WR" = "#5A9BD4", "TE" = "#FAA75B")) +
  ggtitle("Largest VORP, Single Season") +
  theme(plot.title = element_text(size=14, hjust=0)) +
  coord_flip(ylim=c(170,280))

Largest Single Season Value Over Positional Average (VOPA)

I thought I would jump in on the making up metrics party too. I’m sure someone has used this before, but I haven’t checked so I’ll just stick with I made it up. Instead of looking at replacement levels, Value Over Positional Average (VOPA) takes the points of a player and subtracts the average points scored by all qualifying players at that position. For qualifying players, I set the level to the top 32 players at QB, RB, and TE. For WR I went with 64 as there are normally at least two WRs on the field at most times. In the code I left it were you could set the qualifying number by position.

library(dplyr)
library(ggplot2)
library(ggthemes)

# value over positional average ----
vopa_df <- df %>% group_by(fant_pos, yr) %>% arrange(desc(fant_fd_pt)) %>% 
  mutate(rank = 1:n(),
         replace_rank = ifelse(fant_pos == "QB", 32, 32),
         replace_rank = ifelse(fant_pos == "RB", 32, replace_rank),
         replace_rank = ifelse(fant_pos == "WR", 64, replace_rank),
         replace_rank = ifelse(fant_pos == "TE", 32, replace_rank)) %>%
  filter(rank <= replace_rank) %>%
  mutate(pa = mean(fant_fd_pt),
         vopa = fant_fd_pt - pa,
         vopa_per = round( (vopa/pa) * 100, 2)) %>% 
  ungroup() %>% 
  arrange(desc(vopa)) %>% 
  mutate(vopa_rank = 1:n()) %>% 
  select(plyr_info, fant_pos, vopa, vopa_rank) %>%
  filter(vopa_rank <= 50)

ggplot(vopa_df, aes(x = reorder(plyr_info, vopa), y = vopa, fill = fant_pos)) + 
  geom_bar(stat = "identity", position = position_dodge(width=1)) + 
  theme_fivethirtyeight() + 
  theme(panel.grid.major.y = element_blank()) +
  scale_fill_manual(guide = guide_legend(title = "Position"),
                    values = c("QB" = "#F15A60","RB" = "#7AC36A", "WR" = "#5A9BD4", "TE" = "#FAA75B")) +
  ggtitle("Largest VOPA, Single Season") +
  theme(plot.title = element_text(size=14, hjust=0)) +
  coord_flip(ylim=c(140,240))

Bonus

Is James Brim the best WR of all-time?

On a per game basis… yes.

Most Total Career Fantasy Points

Go Vols!

To leave a comment for the author, please follow the link and comment on their blog: R – Jesse Piburn.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Who Had the Best Fantasy Football Season Ever?

Most Total Points in a Single Season

Most Points in a Single Season, per Games Played

Largest Single Season Value Over Replacement Player (VORP)

Largest Single Season Value Over Positional Average (VOPA)

Bonus

Is James Brim the best WR of all-time?

Most Total Career Fantasy Points

Related

Most Total Points in a Single Season

Most Points in a Single Season, per Games Played

Largest Single Season Value Over Replacement Player (VORP)

Largest Single Season Value Over Positional Average (VOPA)

Bonus

Is James Brim the best WR of all-time?

Most Total Career Fantasy Points

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)