# Identify Sleepers in Fantasy Football using Statistics and Wisdom of the Crowd

June 24, 2014

(This article was first published on Fantasy Football Analytics » R | Fantasy Football Analytics, and kindly contributed to R-bloggers)

In this post, I demonstrate how to statistically identify sleepers in fantasy football using the wisdom of the crowd.

## The R Scripts

The R Script for the “Wisdom of the Crowd” section is at:

The R Script for the “Experts” section is at:

## What is a “Sleeper”?

To statistically identify sleepers, we must first define what we mean by a “sleeper”.  Using the definition from NFL.com, a sleeper is “a late round pick…who exceeds his statistical expectations and becomes a prominent [fantasy player]”.  Thus, we want to identify players who are likely to exceed their statistical expectations and have a breakout season.

You might wonder how much statistics can tell us about the likelihood that a player will have a breakout season.  A lot, I would argue.  People tend to think of players’ projections in terms of a single value, namely the most likely value (e.g., the point estimate or average).  For example, FantasyPros, which averages across sources of projections, provides one value of projected points for each player.  Thinking in terms of a single value is misleading because a) it suggests a higher level of accuracy and precision than actually exists, b) it falsely assumes that all players are equally predictable, and c) it ignores the fact that a player’s projections take the form of a distribution (not a single value).  Consider the following figure:

In the density plot above, there are three players: A, B, and C.  All three players have the same average projection: 150 points.  That is, if you average across all sources, each player is considered most likely to score 150 points.  This point estimate, however, ignores the different distributions for the different players.  We see that Player A, with the narrowest distribution, is likely to score between 140 and 160 points, whereas Player B is likely to score between 120 and 180 points, and Player C, with the widest distribution, is likely to score between 70 and 230 points.  We call these differences in the width of the distribution the “variability,” which can be quantified with the standard deviation.  By thinking in terms of an interval estimate (range) rather than a point estimate (average), we can more accurately assess the likelihood that a player will exceed expectations and have a breakout season.  In the example above, Player C would be the most likely sleeper because Player C has the highest potential upside (based on the highest standard deviation).  Thus, we can quantify sleepers as those players with high variability in their projections across sources, as measured by the standard deviation.  For more info, see here and here.
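To make the three-player example concrete, here is a minimal sketch in Python (the scripts behind this post are in R and are not reproduced here). It assumes each player's projection is normally distributed and that the ranges quoted above correspond to the mean plus or minus 2 standard deviations. The standard deviations (5, 15, 40) and the 180-point "breakout" threshold are illustrative assumptions, not values from the post.

```python
from statistics import NormalDist

MEAN = 150.0      # shared average projection from the example
BREAKOUT = 180.0  # hypothetical threshold for a "breakout" season

# Assumed SDs, chosen so that mean +/- 2 SD reproduces the quoted ranges
# (A: 140-160, B: 120-180, C: 70-230).
player_sd = {"A": 5.0, "B": 15.0, "C": 40.0}


def upside_probability(mean, sd, threshold):
    """P(points > threshold) if projections follow a normal distribution."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(threshold)


breakout_prob = {
    name: upside_probability(MEAN, sd, BREAKOUT)
    for name, sd in player_sd.items()
}

for name, sd in player_sd.items():
    low, high = MEAN - 2 * sd, MEAN + 2 * sd
    print(f"Player {name}: range {low:.0f}-{high:.0f}, "
          f"P(> {BREAKOUT:.0f}) = {breakout_prob[name]:.3f}")
```

Under these assumptions, Player C has roughly a 23% chance of topping 180 points, compared with about 2% for Player B and essentially zero for Player A, even though all three share the same point estimate.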

## Wisdom of the Crowd

Adapted from work by Drew Conway (see here), the script takes 10,000 mock drafts from Fantasy Football Calculator and computes a robust measure of variability, the median absolute deviation (MAD), for each player’s draft position.  We use the MAD rather than the ordinary standard deviation so that the variability estimate is not driven by outliers from a few crazy drafters.  This gives us a sense of which players the crowd considers riskiest.  In other words, it gives us the wisdom of the crowd about which players are the most variable in terms of ranking.  The riskiest players according to the wisdom of the crowd are labeled in the figure below:
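As a rough illustration of why a robust estimate helps, the Python sketch below compares the median absolute deviation with the ordinary standard deviation. The draft positions are made up for illustration. The 1.4826 scaling constant is the conventional factor that makes the MAD comparable to a standard deviation for normally distributed data; whether the original script applies it is an assumption.

```python
from statistics import median, stdev


def mad(values, constant=1.4826):
    """Median absolute deviation, scaled to be comparable to an SD
    when the data are roughly normal."""
    center = median(values)
    return constant * median([abs(v - center) for v in values])


# Hypothetical draft positions across eight mock drafts.
steady_player = [10, 11, 12, 11, 10, 12, 11, 60]    # one "crazy" drafter
volatile_player = [20, 35, 28, 44, 25, 39, 31, 48]  # real disagreement

for label, picks in [("steady", steady_player), ("volatile", volatile_player)]:
    print(f"{label}: SD = {stdev(picks):.1f}, MAD = {mad(picks):.1f}")
```

With these numbers, the single outlier inflates the steady player's ordinary standard deviation, while the MAD stays small and still flags the volatile player as the more variable of the two.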

## Experts

We can similarly calculate a standard deviation across rankings and projections by experts.  For info on how these are calculated, see here.

## Combining Variability of Rankings and Projections

After calculating the variability of players’ rankings (crowd and experts) and projections (experts), we can combine them.  To weight rankings and projections equally, I combine the two ranking variabilities (crowd and experts) before averaging them with the projection variability.  First, I z-score standardize each variability measure to put them on the same metric (mean = 0, SD = 1).  Then I average the variability of the crowd’s and the experts’ rankings to get an overall ranking variability.  Next, I average the standardized ranking variability with the standardized projection variability to get an overall risk variable.  Finally, I rescale the risk variable to have a mean of 5 and a standard deviation of 2.  Players with risk values above 7 are thus more than 1 SD above the mean in variability.
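The combination steps above can be sketched as follows. This is a Python illustration under stated assumptions, not the author's R code. The function names and input values are hypothetical. The averaged ranking variability is re-standardized before it is combined with the projection variability, which is one reading of "standardized ranking variability". The sample standard deviation is used throughout.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize values to mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def combine_risk(crowd_rank_var, expert_rank_var, expert_proj_var,
                 target_mean=5.0, target_sd=2.0):
    """Combine ranking and projection variability into one risk score."""
    crowd_z = zscore(crowd_rank_var)
    expert_z = zscore(expert_rank_var)
    # Average the two ranking variabilities, then re-standardize
    # (assumption: the post may or may not re-standardize at this step).
    rank_z = zscore([(c + e) / 2 for c, e in zip(crowd_z, expert_z)])
    proj_z = zscore(expert_proj_var)
    # Equal weight for rankings and projections.
    overall_z = zscore([(r + p) / 2 for r, p in zip(rank_z, proj_z)])
    # Rescale to the target mean and SD.
    return [target_mean + target_sd * z for z in overall_z]


# Hypothetical variability estimates for five players.
crowd = [2.0, 5.5, 3.1, 9.4, 4.2]
experts_rank = [1.8, 6.0, 2.5, 8.1, 5.0]
experts_proj = [12.0, 30.0, 18.0, 45.0, 22.0]

risk = combine_risk(crowd, experts_rank, experts_proj)
high_risk = [i for i, r in enumerate(risk) if r > 7]  # > 1 SD above the mean
print([round(r, 2) for r in risk], high_risk)
```

Because the final scores have a mean of 5 and an SD of 2 by construction, the cutoff of 7 flags exactly those players whose combined variability is more than 1 SD above average.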

## Who are the Sleepers?

Here are some notable players who have high upside and are potential sleepers.  Note that, by definition, they also have considerable downside, so they are best taken late in the draft as low-risk, high-reward picks.

• Doug Baldwin, WR, SEA
• Khiry Robinson, RB, NO
• Andrew Hawkins, WR, CLE
• Jace Amaro, TE, NYJ
• Devonta Freeman, RB, ATL

## Conclusion

We can use statistics and wisdom of the crowd to understand which players are most likely to have breakout seasons (based on the variability around their rankings and projections).

