The other day we had a fun little discussion in the comments section of the sister blog about the appropriateness of stating forecast probabilities to the nearest tenth of a percentage point.
It started when Josh Tucker posted this graph from Nate Silver:
My first reaction was: this looks pretty but it’s hyper-precise. I’m a big fan of Nate’s work, but all those little wiggles on the graph can’t really mean anything. And what could it possibly mean to compute this probability to that level of precision?
In the comments, people came at me from two directions. From one side, Jeffrey Friedman expressed a hard-core attitude that it’s meaningless to give a probability forecast of a unique event:
What could it possibly mean, period, given that this election will never be repeated? . . . I know there’s a vast literature on this, but I’m still curious, as a non-statistician, what it could mean for there to be a meaningful 65% probability (as opposed to a non-quantifiable likelihood) that a one-time outcome will occur. If 65.7 is too precise, why isn’t 65? My [Friedman] hypothesis is that we’re trying to homogenize inherently heterogenous political events to make them tractable to statistical analysis. But presidential elections are not identical balls drawn randomly from an urn, and they are not lottery numbers randomly picked by a computer.
This one was in my wheelhouse, and I responded:
Probabilities can have a lot of meaning, even for an event that will never be repeated. For example, suppose you want to make a decision where the outcome is contingent on who wins the election. It can make sense to quantify your uncertainty using probability. Von Neumann and Morgenstern wrote a book about this! But at some point the quantification becomes meaningless. “60%,” sure. “65%,” maybe. “65.7%,” no way. . . . The different events being analyzed (in this case, elections) are not modeled as identical balls drawn from an urn. A better analogy you might keep in your mind is business forecasting. You might have some uncertainty about the price of oil next year, or whether the price will exceed $X a barrel. It’s an uncertain event but you know something about it. Then you get some information, for example a new oil field is discovered or a new refinery somewhere is built. This changes your probability. A number such as 65% could express this. Similarly, I might describe someone as being 5 feet 8 inches tall, or even 5 feet 8 1/2 inches tall, but it would be silly to call him 5 feet 8.34 inches tall, given that his height changes by a large fraction of an inch during the day.
From the other direction, commenter Paul wrote:
I disagree with Andrew that 65.7% is too precise. One reason is that intrade prices are quoted to 3 digit precision — viz., ranging from $0.00 to $10.00, with bid-ask spreads as low as 1 cent. So, for example, if I believed Nate’s formula was more accurate than Intrade, and the current quote was $6.56 bid, $6.57 ask, I would like Nate to provide more precision in order to determine whether or not to buy ‘Barack Obama to be re-elected on 2012’. Even with today’s low interest rates, I would still need Nate to forecast at least a 65.74% probability in order to believe purchasing ‘Obama 2012’ at $6.57 would outperform a 0.90% money market rate over the next 18 days.
Paul makes a good point about market pricing. I can see why Intrade would want that precision. But I can’t see the point of Nate giving probabilities to that level of precision, given that he’s not working for Intrade or for a trader. Or, to put it another way, I can see why Nate might want to report fractional percentage-point probabilities on the New York Times website: more detail means more fluctuations, which means more news. (In the above image, that 65.7% was listed as “-2.2 since Oct 10.” I don’t think this “-2.2” means very much, but it represents a change, i.e. news.) But from a statistical standpoint I don’t see the value.
Crunching the numbers
Let’s do a quick calibration. Currently Nate gives Obama a 67.6% chance of winning, with a 50.0% to 48.9% lead in the popular vote. That’s a 50.55% share of the 2-party vote. Nate’s page doesn’t give a standard error, but let’s suppose that his forecast for Obama’s popular vote share is a normal distribution with mean 50.55% and standard deviation 1.5%. That is, there’s a 95% chance that Obama will get between 47.5% and 53.5% of the 2-party vote. That seems in the right ballpark to me. Then the probability Obama will win the popular vote is pnorm((50.55-50)/1.5) = 0.643. Not quite Nate’s 65.7%; I can attribute the difference to the electoral college.
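For anyone who wants to reproduce the calibration without R, here is a minimal Python sketch. It assumes the same normal forecast (mean 50.55, standard deviation 1.5) and uses the standard library's statistics.NormalDist in place of pnorm; the variable names are illustrative, not anything from Nate's model.

```python
from statistics import NormalDist

# Poll numbers quoted in the text; the opponent's share is the 48.9% figure.
obama_poll, opponent_poll = 50.0, 48.9

# Two-party share: drop third parties and undecideds, then renormalize.
two_party_share = 100 * obama_poll / (obama_poll + opponent_poll)

# Assumed forecast distribution for the two-party vote share (an assumption
# made in the text, not a number published on the forecast page).
forecast = NormalDist(mu=50.55, sigma=1.5)

# Probability that the share lands above 50%, the analogue of
# pnorm((50.55 - 50) / 1.5) in R.
p_win = 1 - forecast.cdf(50.0)

# Central 95% interval implied by the assumed standard deviation.
lower, upper = forecast.inv_cdf(0.025), forecast.inv_cdf(0.975)

print(f"two-party share: {two_party_share:.2f}%")
print(f"Pr(win popular vote): {p_win:.3f}")  # about 0.643
print(f"95% interval: {lower:.1f}% to {upper:.1f}%")
```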
This is getting interesting. A big lead in the probability (65%-35%) corresponds to a tiny lead in the vote (50.5%-49.5%). Now suppose that our popular vote forecast is off by one-tenth of a percentage point. Given all our uncertainties, it would seem pretty ridiculous to claim we could forecast to that precision anyway, right? If we bump Obama’s predicted 2-party vote share up to 50.65%, we get a probability Obama wins of pnorm((50.65-50)/1.5) = 0.668. If we ratchet Obama’s expected vote share down to 50.45%, his probability of winning goes down to pnorm((50.45-50)/1.5) = 0.618.
Thus, a shift of 0.1 percentage points in Obama’s expected vote share corresponds to a change of 2.5 percentage points in his probability of winning.
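The sensitivity can be checked with a few lines of Python. This is a sketch under the same assumed standard deviation of 1.5; win_probability is a helper name introduced here for illustration.

```python
from statistics import NormalDist

SD = 1.5  # assumed forecast standard deviation, in percentage points


def win_probability(expected_share, sd=SD):
    """Pr(two-party share > 50%) under a normal forecast."""
    return 1 - NormalDist(mu=expected_share, sigma=sd).cdf(50.0)


# Nudge the expected vote share down and up by a tenth of a point.
for share in (50.45, 50.55, 50.65):
    print(f"expected share {share:.2f}% -> Pr(win) {win_probability(share):.3f}")

# Local slope: change in win probability per percentage point of vote share,
# evaluated at the central forecast of 50.55%.
slope = NormalDist().pdf((50.55 - 50.0) / SD) / SD
print(f"about {100 * slope * 0.1:.1f} points of probability per 0.1 point of vote")
```

The slope is steep because the forecast sits close to the 50% line, where the normal density is near its peak.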
Now let’s do it the other way. If Obama’s expected vote share is 50.65%, his probability of winning is 0.6676 (keeping that extra digit to avoid roundoff issues). If his probability of winning goes up by 0.1 percentage points, then his expected percentage of the two-party vote must be qnorm(0.6686,50,1.5) = 50.654. That’s right: a change of 0.1 percentage points in win probability corresponds to a change of 0.004 percentage points in the expected share of the two-party vote. I can’t see that it can possibly make sense to imagine an election forecast with that level of precision. Even multiplying everything by ten (specifying win probabilities to the nearest percentage point) corresponds to specifying expected vote shares to within 0.04% of the vote, which remains ridiculous.
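Here is the inverse calculation as a short Python sketch, again assuming a standard deviation of 1.5 and using NormalDist.inv_cdf as the counterpart of qnorm. The function name share_for_probability is made up for this example.

```python
from statistics import NormalDist

SD = 1.5  # assumed forecast standard deviation, in percentage points


def share_for_probability(p_win, sd=SD):
    """Expected two-party share that yields a given win probability.

    Mirrors qnorm(p_win, 50, sd) in R.
    """
    return 50.0 + sd * NormalDist().inv_cdf(p_win)


base = share_for_probability(0.6676)
plus_tenth = share_for_probability(0.6686)  # win probability up 0.1 points
plus_one = share_for_probability(0.6776)    # win probability up 1 point

print(f"base expected share: {base:.3f}%")
print(f"+0.1 points of probability: {plus_tenth - base:.4f} points of vote")
print(f"+1 point of probability:    {plus_one - base:.3f} points of vote")
```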
Really, I think it would be just fine to specify win probabilities to the nearest 10%, which will register shifts of 0.4% in expected vote share. Probabilities to the nearest 10%: if it’s good enough for the National Weather Service, it’s good enough for me.
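To see what rounding to the nearest 10% buys, this sketch lists the expected vote share that corresponds to each tenth of win probability, under the same assumed standard deviation of 1.5. The dictionary layout is just one convenient way to hold the numbers.

```python
from statistics import NormalDist

SD = 1.5  # assumed forecast standard deviation, in percentage points
std_normal = NormalDist()

# Expected two-party share for win probabilities 10%, 20%, ..., 90%,
# keyed by tenths so the keys stay exact integers.
share_at_tenth = {
    tenth: 50.0 + SD * std_normal.inv_cdf(tenth / 10) for tenth in range(1, 10)
}

for tenth, share in share_at_tenth.items():
    print(f"Pr(win) = {tenth * 10}% -> expected share {share:.2f}%")

# Distance in vote share between the 60% and 70% marks.
gap_60_to_70 = share_at_tenth[7] - share_at_tenth[6]
print(f"60% to 70% spans about {gap_60_to_70:.2f} points of vote share")
```

The gaps widen toward the tails, so the 0.4-point figure applies to a close race like this one rather than to every forecast.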
P.S. Just to emphasize: I think Nate’s great, and I can understand the reasons (in terms of generating news and getting eyeballs on the webpage) that he gives probabilities such as “65.7%.” I just don’t think they make sense from a statistical point of view, any more than it would make sense to describe a person as 5 feet 8.34 inches tall. Nate’s in a tough position: on one hand, once you have a national and state-level forecast, there’s not much you can say, day-to-day or week-to-week. On the other hand, people want news, hence the pressure to report essentially meaningless statistics such as a change in probability from 65.7% to 67.6%, etc.
P.P.S. I think the above calculations are essentially valid even though Nate’s forecast is at a state-by-state level. See my comment here.
To see this in another way, imagine that your forecast uncertainty about the election is summarized by 1000 simulations of the election outcome, that is, a 1000 x 51 matrix of simulated vote shares by state. If Pr(Obama wins) = 0.657, this corresponds to 657 out of 1000 simulations adding up to an Obama win. Now suppose there is a shift of 1 percentage point in win probability; this bumps 657 up to 667. What shift in the vote would carry just 10 out of 1000 simulations over the bar? Given that vote swings are largely national, it will come to approximately 0.04% (that is, 4 hundredths of a percentage point).
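A rough way to check the simulation argument in Python is sketched below. It is a deliberate simplification: instead of a 1000 x 51 state matrix, each simulation is reduced to a single national two-party share, and the 1000 "draws" are evenly spaced quantiles of the assumed normal forecast (standard deviation 1.5) rather than random numbers, so the counts come out the same on every run.

```python
from statistics import NormalDist

SD = 1.5       # assumed forecast standard deviation, in percentage points
N_SIMS = 1000

# Choose the forecast mean so that Pr(share > 50%) is 0.657.
mu = 50.0 + SD * NormalDist().inv_cdf(0.657)
forecast = NormalDist(mu=mu, sigma=SD)

# Stand-in for simulation output: one national share per simulation,
# placed at evenly spaced quantiles of the forecast distribution.
sims = [forecast.inv_cdf((i + 0.5) / N_SIMS) for i in range(N_SIMS)]

wins = sum(share > 50.0 for share in sims)

# How far short of 50% each losing simulation falls, closest first.
deficits = sorted(50.0 - share for share in sims if share <= 50.0)

# A uniform national swing just larger than the 10th-smallest deficit
# carries exactly 10 more simulations over the bar.
shift_needed = deficits[9]
wins_after = sum(share + shift_needed + 1e-9 > 50.0 for share in sims)

print(f"wins before shift: {wins}")
print(f"uniform swing needed: {shift_needed:.3f} points")
print(f"wins after shift: {wins_after}")
```

With state-level simulations the same idea applies: add a uniform national swing to every row and find the smallest swing that flips ten additional rows to a win.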