Who Had the Best Fantasy Football Season Ever?

June 23, 2016
By

(This article was first published on R – Jesse Piburn, and kindly contributed to R-bloggers)

Who had the best fantasy football season ever? Well like most “best ever” lists, the real answer is… it depends. The results you get depend on how you define “best” and how you slice and dice the data. Want to claim James Brim is the best WR of all-time? We can do that… I mean come on if only he played a few more games.

Before we start ranking, we need the data. Pro Football Reference is a great source of football stats and all things pro football. They publish player stats for every season all the way back until 1970. As an added bonus, they also provide each player’s total fantasy points for the season for three different scoring methods; standard, FanDuel, and DraftKings, which as my friend Forrest Gump would say, “That’s good! one less thing”.

To get the data I wrote two functions getSeason and getPFR. getSeason does all the work and getPFR wraps getSeason in lapply and binds all the data frames together so we can pass a vector of years rather than just one at a time. Special thanks to Hadley Wickham for the great rvest, dplyr, and ggplot2 packages and Jeffrey Arnold for the fantastic ggthemes package.

library(rvest)
library(dplyr)

getSeason <- function(season) {
  
  url <- paste0("http://www.pro-football-reference.com/years/", season, "/fantasy.htm")
  df_list <- url %>% html() %>% html_table()
  
  if (length(df_list) == 0) stop("No html table found at ", url)
  
  df <- df_list[[1]]
  
  ## clean up ----
  
  ## names ----
  df_names <- c("yr_rk", "plyr", "tm", "age", "g", "gs", "pass_cmp", "pass_att",
                "pass_yds", "pass_tds", "pass_int", "rush_att", "rush_yds", 
                "rush_yds_per", "rush_td", "rec_tgt", "rec", "rec_yds", 
                "rec_yds_per", "rec_tds", "fant_pos", "fant_pt", "fant_dk_pt", 
                "fant_fd_pt", "fant_vbd", "fant_pos_rank", "fant_ov_rank")
  
  # targets not available before 1992, rbind_all will handle merging
  if (season < 1992) df_names <- setdiff(df_names, "rec_tgt")
  
  names(df) <- df_names
  
  ## headers ----
  # 1st row is sub headers so remove those
  df <- df[2:nrow(df), ]
  
  ## extra rows ----
  # remove rows without player info
  df[which(df$plyr == ""), "plyr"] <- NA
  df <- df[!is.na(df$plyr), ]
  
  df <- df[which(df$fant_pos != ""), ]
  

  ## plyr names ----
  # some names end in * or *+ or +
  df$plyr <- gsub("\\*|\\*\\+|\\+", "", df$plyr)
  
  # turn numeric cols into numeric, that also turns blanks into NAs
  num_cols <- setdiff(names(df), c("plyr", "tm", "fant_pos"))
  df[num_cols][] <- lapply(df[num_cols], as.numeric)
  
  df$yr <- season
  df$yr_rk <- NULL
  
  # make column for nice plotting
  df <- df %>% mutate(plyr_info = paste0(plyr, " (", tm, ") ", yr))
  
  df
}


getPFR <- function(seasons) {
  
  df_list <- lapply(seasons, FUN = getSeason)
 
  out <- rbind_all(df_list)

  out
}


df <- getPFR(2015:1970)

Now that we have the function to get the data we can start answering our question. For all of the charts below, the scoring is based on FanDuel scoring, because they use 0.5 PPR (points per reception) and I feel that 0.5 PPR is a more accurate representation of the true performances of NFL players my league, The Saturated Unicorn (no, it isn’t a sexual innuendo) is 0.5 PPR. But, of course you can just change the column name to the standard or DraftKings scoring for a quick look at the others.

Going Forward

  • The data from PFF is only available from 1970 onwards (sorry, Jim Brown fans)
  • Scoring is based on FanDuel
  • PFF only has the stats on QBs, RBs, WRs, and TEs. So this doesn’t consider Kickers, Defenses, or Individual Defensive Players

 

Most Total Points in a Single Season

Not too surprising, LaDainian Tomlinson’s (453.1) mythical 2006 MVP season is by far the most total points ever scored. LT’s 453.1 points are 31.1 points more than 2nd place, Peyton Manning’s (422.0) 2013 campaign. Of the top 50 performances, 26 are by QBs, 23 by RBs, and only 1 by a WR, Jerry Rice (356.0). Tom Brady and Aaron Rodgers are tied for the most appearances with 4 each. Probably the most impressive is that 22nd most points ever scored were by O.J. Simpson (369.3) in 1975, back when the NFL season was still 14 games.

top ff seasons

library(dplyr)
library(ggplot2)
library(ggthemes)

# Most Points in a Single Season -----
top_season_df <- df %>% arrange(desc(fant_fd_pt)) %>% 
  mutate(rank = 1:n()) %>% select(plyr_info, plyr, fant_pos, tm, yr, fant_fd_pt, rank) %>%
  filter(rank <= 50)

ggplot(top_season_df, aes(x = reorder(plyr_info, fant_fd_pt), y = fant_fd_pt, fill = fant_pos)) + 
  geom_bar(stat = "identity", position = position_dodge(width=1)) + 
  theme_fivethirtyeight() + 
  theme(panel.grid.major.y = element_blank()) +
  scale_fill_manual(guide = guide_legend(title = "Position"),
                      values = c("QB" = "#F15A60","RB" = "#7AC36A", "WR" = "#5A9BD4", "TE" = "#FAA75B")) +
  ggtitle("Most Total Fantasy Points, Single Season") +
  theme(plot.title = element_text(size=14, hjust=0)) +
  coord_flip(ylim=c(325,450))

Most Points in a Single Season, per Games Played

Well look at that, maybe LT didn’t have the best season ever after all, OK well he did, but maybe Rams fans have a little ammunition to argue otherwise now. Marshall Faulk averaged 29.96 points in the 14 games he played in the 2000 regular season besting LTs 2006 per game by about 1.6 points per game. When you look at the points on a per game basis, it makes you think of oh what could have been, Priest Holmes in 2004, Jerry Rice in 1987, Jake Delhomme in 1999…

top ff seaon per game

library(dplyr)
library(ggplot2)
library(ggthemes)

# Most Points in a Single Season, per Game Played -----
pg_avg_df <- df %>% group_by(plyr, tm, yr) %>%
  mutate(ppg = round(fant_fd_pt / g, 2),
         plyr_label = paste0(plyr, " (", tm, ") ", yr, " (",g,")")) %>%
  ungroup() %>%
  arrange(desc(ppg)) %>%
  mutate(rank = 1:n()) %>%
  select(plyr_label, ppg, rank, fant_pos) %>%
  filter(rank <= 50)

ggplot(pg_avg_df, aes(x = reorder(plyr_label, ppg), y = ppg, fill = fant_pos)) + 
  geom_bar(stat = "identity", position = position_dodge(width=1)) + 
  theme_fivethirtyeight() + 
  theme(panel.grid.major.y = element_blank()) +
  scale_fill_manual(guide = guide_legend(title = "Position"),
                    values = c("QB" = "#F15A60","RB" = "#7AC36A", "WR" = "#5A9BD4", "TE" = "#FAA75B")) +
  ggtitle("Most Fantasy Points in Single Season, per Games Played") +
  theme(plot.title = element_text(size=11, hjust=0)) +
  coord_flip(ylim=c(20,30))

 

Largest Single Season Value Over Replacement Player (VORP)

Another common way of looking fantasy scoring is Value Over Replacement Player (VORP). The idea is pretty simple, the value of a certain player is relative to a player that you could replace them with. If Dorial Green-Beckham is the best available WR for pickup, the value of Jarvis Landry, is not what Landry can score, it’s how many more points can he score than Green-Beckham. To use this metric, you have to set replacement levels for each position, this would vary by roster type and league size but common ones are QB-12, RB-24, WR-24, and TE-12 so we’ll use those for our rankings.

top vorp

library(dplyr)
library(ggplot2)
library(ggthemes)

# Largest Single Season Value Over Replacement Player (VORP)
vorp_df <- df %>% group_by(fant_pos, yr) %>% arrange(desc(fant_fd_pt)) %>% 
  mutate(rank = 1:n(),
         replace_rank = ifelse(fant_pos == "QB", 12, 30),
         replace_rank = ifelse(fant_pos == "RB", 24, replace_rank),
         replace_rank = ifelse(fant_pos == "WR", 24, replace_rank),
         replace_rank = ifelse(fant_pos == "TE", 12, replace_rank),
         replace_value = nth(fant_fd_pt, median(replace_rank)),
         vor = fant_fd_pt - replace_value,
         vor_per = round( (vor/replace_value) * 100, 2)) %>% ungroup() %>% arrange(desc(vor)) %>% 
  mutate(vor_rank = 1:n()) %>% select(plyr_info, fant_pos, vor,vor_rank) %>%
  filter(vor_rank <= 50)

ggplot(vor_df, aes(x = reorder(plyr_info, vor), y = vor, fill = fant_pos)) + 
  geom_bar(stat = "identity", position = position_dodge(width=1)) + 
  theme_fivethirtyeight() + 
  theme(panel.grid.major.y = element_blank()) +
  scale_fill_manual(guide = guide_legend(title = "Position"),
                    values = c("QB" = "#F15A60","RB" = "#7AC36A", "WR" = "#5A9BD4", "TE" = "#FAA75B")) +
  ggtitle("Largest VORP, Single Season") +
  theme(plot.title = element_text(size=14, hjust=0)) +
  coord_flip(ylim=c(170,280))

 

Largest Single Season Value Over Positional Average (VOPA)

I thought I would jump in on the making up metrics party too. I’m sure someone has used this before, but I haven’t checked so I’ll just stick with I made it up. Instead of looking at replacement levels, Value Over Positional Average (VOPA) takes the points of a player and subtracts the average points scored by all qualifying players at that position. For qualifying players, I set the level to the top 32 players at QB, RB, and TE. For WR I went with 64 as there are normally at least two WRs on the field at most times. In the code I left it were you could set the qualifying number by position.

top vopa

library(dplyr)
library(ggplot2)
library(ggthemes)

# value over positional average ----
vopa_df <- df %>% group_by(fant_pos, yr) %>% arrange(desc(fant_fd_pt)) %>% 
  mutate(rank = 1:n(),
         replace_rank = ifelse(fant_pos == "QB", 32, 32),
         replace_rank = ifelse(fant_pos == "RB", 32, replace_rank),
         replace_rank = ifelse(fant_pos == "WR", 64, replace_rank),
         replace_rank = ifelse(fant_pos == "TE", 32, replace_rank)) %>%
  filter(rank <= replace_rank) %>%
  mutate(pa = mean(fant_fd_pt),
         vopa = fant_fd_pt - pa,
         vopa_per = round( (vopa/pa) * 100, 2)) %>% 
  ungroup() %>% 
  arrange(desc(vopa)) %>% 
  mutate(vopa_rank = 1:n()) %>% 
  select(plyr_info, fant_pos, vopa, vopa_rank) %>%
  filter(vopa_rank <= 50)

ggplot(vopa_df, aes(x = reorder(plyr_info, vopa), y = vopa, fill = fant_pos)) + 
  geom_bar(stat = "identity", position = position_dodge(width=1)) + 
  theme_fivethirtyeight() + 
  theme(panel.grid.major.y = element_blank()) +
  scale_fill_manual(guide = guide_legend(title = "Position"),
                    values = c("QB" = "#F15A60","RB" = "#7AC36A", "WR" = "#5A9BD4", "TE" = "#FAA75B")) +
  ggtitle("Largest VOPA, Single Season") +
  theme(plot.title = element_text(size=14, hjust=0)) +
  coord_flip(ylim=c(140,240))

 

 

Bonus

Is James Brim the best WR of all-time?

On a per game basis… yes.

top ff game avg

 

Most Total Career Fantasy Points

Go Vols!

top ff career

 

 

To leave a comment for the author, please follow the link and comment on their blog: R – Jesse Piburn.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)