In a previous post, I showed how to determine the best starting lineup to draft using an optimizer tool. The optimizer identifies the players that maximize your projected points within your risk tolerance. However, it does not take into account the uncertainty of players' projections (except by excluding players above your risk tolerance). This post demonstrates how to run an optimization simulation that identifies the best possible starting lineup by taking into account players' uncertainty or risk.
Why relying on a single projection estimate is bad
One limitation of relying on a single projection estimate (a point estimate) is that it assumes that the projections are equally accurate for all players. This is a false assumption. Some players are more difficult to predict than others. Some players have a "high upside" whereas others are pretty reliable. These differences in predictability are not captured by a point estimate. They can be captured, however, by an interval estimate (e.g., a range of values or confidence interval). Nate Silver has written on the many advantages of using interval estimates rather than point estimates (see his book here).
How can we find the optimal starting lineup using interval estimates of players' projections?
The first thing we have to do is calculate an interval estimate for each player. This is the range of likely values for each player. We can construct an interval estimate from two parameters: 1) mean and 2) standard deviation. The mean is the average of all projections for a player. In other words, the projection from FantasyPros, which averages numerous sources of projections, can be our mean. For the standard deviation, we have to calculate the variability around the projections from the various sources for each player. For how we calculate the standard deviation of players' projections, see here. Now that we have the mean and standard deviation for each player, we can construct each player's distribution of possible points by drawing random values from a normal distribution with the same mean and standard deviation using the rnorm() function.
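To make the two parameters concrete, here is a minimal sketch of how a mean, standard deviation, and interval estimate could be computed from several sources and then fed to rnorm(). The player names, the four sources, and all point values are made up for illustration, and the 95% interval assumes projections are roughly normally distributed.

```r
# Hypothetical projections (fantasy points) for three players from four made-up sources
projections <- data.frame(
  player = rep(c("Player A", "Player B", "Player C"), each = 4),
  points = c(250, 255, 245, 250,   # sources agree closely on Player A
             250, 290, 210, 250,   # sources disagree a lot on Player B
             180, 185, 175, 180)
)

# Mean and standard deviation of each player's projections across sources
player_mean <- tapply(projections$points, projections$player, mean)
player_sd   <- tapply(projections$points, projections$player, sd)

# Approximate 95% interval estimate, assuming a normal distribution
lower <- player_mean - qnorm(0.975) * player_sd
upper <- player_mean + qnorm(0.975) * player_sd
print(round(cbind(mean = player_mean, sd = player_sd, lower, upper), 1))

# Draw 10,000 possible point totals for one player from his distribution
set.seed(1)
draws <- rnorm(10000, mean = player_mean["Player B"], sd = player_sd["Player B"])
```

Players A and B have the same mean here, but Player B's interval is much wider because the sources disagree about him.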
Here are the distributions of 3 "made up" players (I can't unambiguously say "fantasy players") with the same mean and different standard deviations:
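As a rough sketch of how distributions like these could be generated, the snippet below draws values for three made-up players who share a mean but differ in standard deviation. The shared mean of 150 and the standard deviations of 10, 25, and 50 are arbitrary values chosen for illustration, not the ones used in the original figure.

```r
set.seed(42)

n_draws     <- 10000
shared_mean <- 150                                               # hypothetical
player_sds  <- c(low_risk = 10, medium_risk = 25, high_risk = 50)  # hypothetical

# One column of simulated point totals per player
sim <- sapply(player_sds, function(s) rnorm(n_draws, mean = shared_mean, sd = s))

# Compare the 5th, 50th, and 95th percentiles of each player's distribution
spread <- apply(sim, 2, quantile, probs = c(0.05, 0.5, 0.95))
print(round(spread, 1))
```

The medians are nearly identical, while the 5th and 95th percentiles fan out as the standard deviation grows. Calling plot(density(sim[, "high_risk"])) would draw one of the curves.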
How the Optimization Simulation Works
The optimization simulation works by drawing a random value for each player from his distribution, finding the best lineup for those drawn values, and repeating this many, many times. We can then see how many times each player makes the best lineup. In the following example, the simulation iterates 100,000 times. This gives us a fairly reliable estimate of each player's projection distribution and, as a result, of the likelihood that the player is on the best starting lineup.
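The loop described above can be sketched in a few lines of base R. This is a simplified illustration, not the script from this post: the player pool, the lineup slots, and the best_lineup() helper are all hypothetical, and the helper simply takes the top scorers at each position instead of running the full optimizer with cost and risk constraints. It also uses 10,000 iterations rather than 100,000 to keep the run short.

```r
set.seed(123)

# Hypothetical player pool: names, positions, means, and SDs are made up
players <- data.frame(
  name     = c("QB1", "QB2", "RB1", "RB2", "RB3", "WR1", "WR2", "WR3"),
  position = c("QB", "QB", "RB", "RB", "RB", "WR", "WR", "WR"),
  mean     = c(300, 290, 220, 210, 200, 190, 185, 170),
  sd       = c(15, 40, 20, 35, 50, 10, 30, 45),
  stringsAsFactors = FALSE
)

# Assumed lineup requirements: 1 QB, 2 RB, 2 WR
slots <- c(QB = 1, RB = 2, WR = 2)

# Stand-in for the optimizer: pick the highest scorers at each position
best_lineup <- function(points, position, slots) {
  chosen <- logical(length(points))
  for (pos in names(slots)) {
    idx <- which(position == pos)
    top <- idx[order(points[idx], decreasing = TRUE)][seq_len(slots[[pos]])]
    chosen[top] <- TRUE
  }
  chosen
}

n_iter <- 10000
times_chosen <- numeric(nrow(players))

for (i in seq_len(n_iter)) {
  # Draw one possible point total per player from his own distribution
  drawn <- rnorm(nrow(players), mean = players$mean, sd = players$sd)
  # Count which players make the best lineup for this draw
  times_chosen <- times_chosen + best_lineup(drawn, players$position, slots)
}

# Share of iterations in which each player made the best lineup
players$prob_in_lineup <- times_chosen / n_iter
print(players[order(-players$prob_in_lineup), c("name", "position", "prob_in_lineup")])
```

With the real optimizer swapped in for best_lineup(), the same counting logic applies: a player's share of iterations in the best lineup is the estimate of how likely he is to belong there.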
The R Script
The R script for simulating the optimization to take into account uncertainty in players' projections is located here:
Here's the simulation syntax:
In summary, relying on a single projection value is bad because it assumes that all players have equal uncertainty/risk in their projections. We can get a better estimate of the likelihood that each player is on the best possible starting lineup by taking into account the uncertainty around their projections. To achieve this, we simulate the optimization many times by drawing random values from each player's distribution of likely point values. By doing this, we can get a better idea of who the best players to draft are.