#PredictiveCOL – Forecasting Colombia’s peace plebiscite (final update)

October 4, 2016
By

(This article was first published on Data Literacy - The blog of Andrés Gutiérrez, and kindly contributed to R-bloggers)

For sure, this is the more exciting forecast I have ever done. On one hand, I am Colombian guy, and I really want to live in a peaceful country, and I do want a better place for raising my children. On the other hand, I am very serious when it comes to forecasting.

Maybe you have read this blog before, and you have realized that I love predictions cuz, as a data scientist, you can use your own set of statistical methodologies that relates to voting intention and turnout. If you think genuinely, everything counts while forecasting this kind of processes. For example, you may weekly track people’s security perception, how much the president is disliked, how many minutes newscast spend about the peace process, how many articles or opinion columns are written, how many tweets are sent per week about FARC, everything, everything counts and matter. Let us name these variables as contextual variables
Of course, polls and pollsters are also important. They are the only proxy of voting intention. However, polls get old, and they should be updated with new data; also some pollsters are not as confident as others (I really do not believe everything some pollsters claim), and sample size also matters cuz it improves the sampling error. 
Remember that this plebiscite has two options for Colombians to choose. The first option is Yes, that goes for Yes, I approve the agreement between Farc and government. The second option is No, meaning No, I don’t support that agreement. With those topics in mind, let me introduce you my (bayesian) predictions for the Colombian plebiscite. Firstly, let me exhibit the trend that the two options have shown over time:

Trendplot

As you can see, we have deflated the data by undecided voters. Note that we extracted the signal (bold lines) from the noise generated by polls. The following graph shows the less robust prediction based only on what polls have estimated.

Pollplot

Now, a more reasonable forecast based on prior information (2014 presidential elections and 2014 legislative elections) that tries to explain the outcomes of the polls by modeling the extracted signal (see trends in previous graphic) with contextual variables. After the model is fitted, we use a Bayesian setup that relates the prior information with the estimated response from this model. The following chart shows this forecast.

Bayesplot

Ok, it is clear that Colombian people will support this peace process. However, this election is legitimate if and only if Yes voter turnout is greater than 4.4 million. By using a similar Bayesian methodology, next graph shows the predicted turnout.

Turnoutplot

Also, we used a small area modeling to forecast the response of every Colombian department (equivalent to a state in the US). The following map shows that the majority of departments are supporting the agreement between FARC and government. However, there are some of them that will vote No. Dark areas are not supporting the deal, while light-gray areas will do support the agreement.
Plotcol1

Finally, the posterior probability that Yes defeats No is 98.8%.

To leave a comment for the author, please follow the link and comment on their blog: Data Literacy - The blog of Andrés Gutiérrez.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)