Got a ticket for the runoff?

[This article was first published on » R, and kindly contributed to R-bloggers].

This is one of the very last posts before next Sunday's election. So far, the only certainty is that the incumbent, Dilma Rousseff (PT), has her ticket to the runoff. The runners-up, the environmentalist Marina Silva (PSB) and the Social Democrat Aecio Neves (PSDB), are heading into a neck-and-neck race over the final stretch.

Although polling houses show Marina's support falling away, and rapidly, it's fair to remember that pollsters did a very poor job of gauging her true vote share in the last election, in 2010. The main polling firms missed her result by nothing less than 6 percentage points: the final week's polls placed her at 12% to 13% of vote intentions, but she got 19.33%. Dilma, meanwhile, was expected to win an outright majority with room to spare, but the decision went to a runoff.

In fact, Brazilian pollsters seem to have trouble measuring the third-placed candidate's support, as this problem keeps recurring. The flip side of this argument is that, if it holds for this election, Aecio should by now have something around 25% of vote preferences instead of 20%.

The following is the forecast IF the election were held today, using the latest polling data. Because the polls show a high number of swing voters, so does my model, and we need to do some math to find the likely results. Considering only the valid vote (discarding wasted votes), the forecast is: Dilma (PT) 46%; Marina (PSB) 28%; Aecio (PSDB) 22%; Others 4%.

A simpler approach is to aggregate the latest polls, weighting each by its sample size to reflect how much data it carries. You could also weight by the time elapsed between polls, but I think the forecast model above, which includes all these features, does a better job. We distribute the undecideds proportionally to each candidate's current support. Then we discard the wasted vote share and re-weight again to get the likely outcome according to the polls alone, as shown below. The last line (6) contains the likely results.
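The poll-only calculation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model behind the forecast: the two polls, their sample sizes, and all function names are hypothetical, and time-based weighting is left out.

```python
from typing import Dict, List

# HYPOTHETICAL polls for illustration only; these are NOT real polling figures.
# Shares are percentages of all respondents and sum to 100 within each poll.
HYPOTHETICAL_POLLS: List[dict] = [
    {"n": 2000, "shares": {"Dilma": 40.0, "Marina": 25.0, "Aecio": 20.0,
                           "Others": 3.0, "wasted": 7.0, "undecided": 5.0}},
    {"n": 3000, "shares": {"Dilma": 38.0, "Marina": 27.0, "Aecio": 19.0,
                           "Others": 4.0, "wasted": 6.0, "undecided": 6.0}},
]


def weighted_average(polls: List[dict]) -> Dict[str, float]:
    """Step 1: average each category across polls, weighted by sample size."""
    total_n = sum(p["n"] for p in polls)
    categories = polls[0]["shares"].keys()
    return {c: sum(p["n"] * p["shares"][c] for p in polls) / total_n
            for c in categories}


def spread_undecided(shares: Dict[str, float],
                     undecided: str = "undecided",
                     wasted: str = "wasted") -> Dict[str, float]:
    """Step 2: hand the undecided share to candidates, proportionally to support."""
    candidates = {k: v for k, v in shares.items() if k not in (undecided, wasted)}
    total = sum(candidates.values())
    out = {k: v + shares[undecided] * v / total for k, v in candidates.items()}
    out[wasted] = shares[wasted]  # wasted votes are left untouched at this step
    return out


def valid_vote(shares: Dict[str, float], wasted: str = "wasted") -> Dict[str, float]:
    """Step 3: drop wasted votes and rescale the rest to sum to 100."""
    valid = {k: v for k, v in shares.items() if k != wasted}
    total = sum(valid.values())
    return {k: 100.0 * v / total for k, v in valid.items()}


def poll_only_forecast(polls: List[dict]) -> Dict[str, float]:
    """Chain the three steps into a valid-vote estimate."""
    return valid_vote(spread_undecided(weighted_average(polls)))


forecast = poll_only_forecast(HYPOTHETICAL_POLLS)
for name, share in forecast.items():
    print(f"{name}: {share:.1f}%")
```

Because the undecideds are allocated strictly in proportion to current support, the final valid-vote shares come out the same as each candidate's share of the decided candidate total; the steps are kept separate here only to mirror the description.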

[Forecasting charts: Dilma Rousseff (PT); Others]
