# Comparing ESPN’s, CBS’s, and NFL.com’s Fantasy Football Projections using R

This article was first published on **Fantasy Football Analytics in R**, and kindly contributed to R-bloggers.


In the future, we will determine how to select the best possible team by maximizing your team’s projected points and minimizing its downside risk. But in order to do this, we will have to rely on our best guess of how many points each player will score. We will use 2012 projections from ESPN, CBS, and NFL.com and actual fantasy points from Yahoo. Our selected team, however, will only be as good as our projections. Garbage in, garbage out. Thus, it’s crucial to evaluate the accuracy of projections to know how much confidence to give them.

A couple of years ago, there was an article in the New York Times that compared the fantasy football projections from ESPN, CBS, and Yahoo. The authors found that Yahoo’s projections were more accurate than CBS’s, which were more accurate than ESPN’s. The website co-founded by the author, FantasyPros.com, has collected the historical accuracy of various projection sources dating back to 2009. The website uses this information to infer that some sources of projections are more accurate than others, and that the weight you should give to each source should depend on its prior performance. Sounds reasonable, right? The problem is that many of these “sources” are just individual so-called experts. Just like mutual fund managers trying to predict the stock market, these “experts” are not reliably able to outperform the average (see, e.g., here and here). That’s why even casual bloggers outperform the experts (see, e.g., here).

### If I shouldn’t trust the experts, whom should I trust?

To answer this question, it’s important to consider psychometrics. In classical test theory, any observed score (e.g., a fantasy football projection) is composed of two parts: “true score” (i.e., the signal) and error (i.e., the noise; e.g., bias on the part of any individual source). One of the easiest ways to maximize the signal-to-noise ratio is to **aggregate** information. In general, an average or latent variable will be more reliable and valid than the individual sources that compose it (assuming the sources are valid measures of the same thing). In other words, combining projections from various sources allows us to approximate the “true” projection for a player more accurately. In fact, one of the most reliable ways of getting accurate measurements in many domains is through the wisdom of the crowd (see, e.g., here), where the best guess is the average of many individuals’ responses. Unfortunately, I’m not aware of any sites that apply “wisdom of the crowd” calculations to fantasy football projections (but for rankings, see here and here). As a result, we will compare last year’s projections from ESPN, CBS, and NFL.com to see which was most accurate, and then compare those to the average and latent combinations of the three.

If you know of any sites with publicly available projections that aggregate across many sources, let me know.
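As a rough sketch of the two aggregation approaches discussed above, here is how you might combine projections in R. The data frame and its column names (`espn`, `cbs`, `nfl`) are made up for illustration, and the latent score here uses the first principal component as one simple way to estimate a shared latent variable; the exact method in the linked script may differ.

```r
# Hypothetical merged projections from the three sources (toy values)
proj <- data.frame(
  player = c("A", "B", "C", "D", "E"),
  espn   = c(250, 180, 120, 90, 60),
  cbs    = c(240, 190, 110, 95, 55),
  nfl    = c(260, 170, 125, 85, 65)
)

# Simple average across the three sources
proj$avg <- rowMeans(proj[, c("espn", "cbs", "nfl")])

# One way to form a "latent" combination: score each player on the
# first principal component of the standardized projections
proj$latent <- prcomp(proj[, c("espn", "cbs", "nfl")], scale. = TRUE)$x[, 1]
```

The average weights each source equally, while the principal-component score weights sources by how strongly they load on the shared dimension.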

### The R Script

https://github.com/dadrivr/FantasyFootballAnalyticsR/blob/master/R%20Scripts/Evaluate%20Projections.R

We will evaluate the accuracy of the projections using three metrics: R-squared (*R*^2), Harrell’s c-index, and intraclass correlation (ICC).

**1) R-squared** represents the proportion of variance in the outcome that is explained by the predictor. R-squared is better than the simple Pearson *r* correlation coefficient when evaluating predictions because R-squared is better able to detect shifts in the data.

**2) Harrell’s c-index** is a measure of concordance that is equivalent to the area under the curve (AUC) of a receiver operating characteristic (ROC) curve, which represents the tradeoff between a predictor’s sensitivity and specificity.

**3) ICC** is commonly used to assess inter-rater reliability. Because the projected and actual points are supposed to measure the same thing on the same metric, we are not only interested in determining how *predictive* the projections are, but also how *accurate* (i.e., similar in value) they are. We will use the absolute-agreement form of the ICC to determine the accuracy of the projections.
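For concreteness, here is a minimal sketch of how these three metrics might be computed in base R on made-up `projected` and `actual` vectors (the real script links each source’s projections to actual points by player). The c-index is computed by hand as the proportion of concordant pairs; in practice, packages such as `Hmisc` (c-index) and `psych` (ICC) handle ties and the absolute-agreement ICC properly.

```r
# Toy data standing in for one source's projections and actual points
set.seed(1)
projected <- c(250, 180, 120, 90, 60, 200, 140, 75)
actual    <- projected + rnorm(8, mean = 0, sd = 25)

# 1) R-squared: proportion of variance in actual points explained
r2 <- summary(lm(actual ~ projected))$r.squared

# 2) Harrell's c-index: proportion of pairs ordered the same way by
#    projections and outcomes (crude version ignoring ties;
#    Hmisc::rcorr.cens handles ties properly)
idx  <- combn(length(actual), 2)
conc <- mean(sign(projected[idx[1, ]] - projected[idx[2, ]]) ==
             sign(actual[idx[1, ]]    - actual[idx[2, ]]))

# 3) ICC (absolute agreement): e.g., psych::ICC(cbind(projected, actual)),
#    using the two-way, single-rater, absolute-agreement form
```

Both `r2` and `conc` fall between 0 and 1, with higher values indicating more accurate projections.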

Here’s a table of the accuracy of the predictions according to these 3 metrics (top 2 for each metric in bold):

| Source | R-squared | Harrell’s c | ICC |
|---|---|---|---|
| ESPN | .532 | .747 | .730 |
| CBS | **.584** | **.765** | **.753** |
| NFL.com | .484 | .725 | .686 |
| Average | .568 | .757 | .749 |
| Latent | **.569** | **.759** | **.754** |

Below is a scatterplot of the association between our latent projected points for 2012 and the actual fantasy points scored in 2012. The R-squared of .57 suggests that we are explaining about 57% of the variance in actual fantasy points scored, which suggests that, although fairly accurate, our projections have room for improvement. To improve our projections, we will want to incorporate other sources of projections.

```r
ggplot(data = projectedWithActualPts, aes(x = projectedPtsLatent, y = actualPts)) +
  geom_point() +
  geom_smooth() +
  xlab("Projected Fantasy Football Points") +
  ylab("Actual Fantasy Football Points") +
  ggtitle("Association Between Projected Fantasy Points and Actual Points") +
  annotate("text", x = 80, y = max(projectedWithActualPts$projectedPtsLatent),
           label = paste("R-Squared = ",
                         round(summary(lm(actualPts ~ projectedPtsLatent,
                                          data = projectedWithActualPts))$r.squared, 2),
                         sep = ""))
```

To **leave a comment** for the author, please follow the link and comment on their blog: **Fantasy Football Analytics in R**.