**» R**, and kindly contributed to R-bloggers)

Argentina’s 2015 presidential election, to be held on October 25th, is approaching, and the race has begun to take clearer shape since the major parties announced their potential candidates last June.

This Sunday, the political parties are holding their primaries for the upcoming presidential election. As in the US, primaries in Argentina are important for parties to settle internal disputes, so that the winning candidate can run more comfortably with a united party.

As usual, I set up a forecasting model to track voting intentions in my neighbouring country, the land of tango. I’m still consolidating opinion poll data while trying to get some clues about pollsters’ past performance, so I can account for the likely house effects.

The following graph was fitted with a simple loess smoother. Since I already have a reasonable number of polls, I could also fit a Dirichlet regression, which produces a more robust picture of the race (the second graph below), though at this stage the model is an oversimplification, as some pollsters are more reliable than others. I hope to post a sounder forecast next time.

From the figures above, we can see that some polls produce rather odd sinusoidal artifacts in the trend lines. If we assume those polls are wrong relative to the others, they can distort the trend line estimates whenever a deviating poll is the only measurement available on a particular day.
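As a rough illustration of that sensitivity (not the code behind the graphs in this post), here is a minimal Python sketch of a loess-style estimate: a local linear fit with tricube weights. The function names, the `span` value, and the poll numbers are all assumptions made up for the example.

```python
def tricube(u):
    """Tricube kernel used by loess: (1 - |u|^3)^3 inside (-1, 1), else 0."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess_point(x0, xs, ys, span=0.5):
    """Weighted local linear fit evaluated at x0.

    `span` is the fraction of points that defines the local neighbourhood.
    """
    n = len(xs)
    k = max(2, int(round(span * n)))
    # Bandwidth: distance from x0 to its k-th nearest neighbour.
    h = sorted(abs(x - x0) for x in xs)[k - 1]
    w = [tricube((x - x0) / h) if h > 0 else 0.0 for x in xs]
    sw = sum(w)
    if sw == 0:
        # Degenerate neighbourhood: fall back to the plain average.
        return sum(ys) / n
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Hypothetical series: one poll per day, flat at 30% except a stray 40% on day 5.
days = list(range(10))
shares = [30.0] * 10
shares[5] = 40.0

# The smoothed value at day 5 is pulled upwards by the single deviating poll.
estimate_day5 = loess_point(5, days, shares, span=0.5)
```

With several polls on the same day the stray value would be averaged down by its neighbours; when it is the only measurement, it carries the largest weight in the local fit.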

## House Effects

I want to improve the estimates over the coming weeks by using better priors for the house effect of each polling firm. For example, in the picture below, pollster “OPSM” polled more favourably for Mauricio Macri (blue line/dots) and worse for the government-backed candidate, Daniel Scioli. On the other hand, “Hugo Haime & Asc.” underestimated the main opposition candidate (M. Macri) while overestimating Daniel Scioli and the PJ dissident, Sergio Massa.

Let’s think about the implications of this for a moment. Some institutes published polls in which one candidate or another is, over a period of several months, placed on average two percentage points below or above the median of all the polls published in that period. This is hard to believe, given that the polling organizations all claim to interview a representative sample of the population. Even if we accept that polls are noisy measurements, noise alone would not produce this kind of persistent skew, unless there is a systematic error in the polls that occurs over and over.
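To make the "average deviation from the median" idea concrete, here is a minimal Python sketch. It is only one simple way to summarise a house effect, not the prior I will use in the model; the function name, the monthly grouping, the firm labels, and the numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def house_effects(polls):
    """Mean deviation of each pollster from the per-period median.

    `polls` is a list of (pollster, period, share) tuples for ONE candidate,
    with `share` in percentage points.
    """
    # Median of all polls published in each period.
    by_period = defaultdict(list)
    for _, period, share in polls:
        by_period[period].append(share)
    medians = {period: median(values) for period, values in by_period.items()}

    # Each pollster's deviations from the median of the matching period.
    deviations = defaultdict(list)
    for pollster, period, share in polls:
        deviations[pollster].append(share - medians[period])

    return {pollster: mean(devs) for pollster, devs in deviations.items()}


# Hypothetical polls for a single candidate over two months.
example_polls = [
    ("Firm A", "2015-06", 32.0),
    ("Firm B", "2015-06", 30.0),
    ("Firm C", "2015-06", 28.0),
    ("Firm A", "2015-07", 33.0),
    ("Firm B", "2015-07", 31.0),
    ("Firm C", "2015-07", 29.0),
]

effects = house_effects(example_polls)
```

A firm whose value stays near zero is tracking the consensus, while one that sits consistently two points above or below it shows the kind of systematic error discussed above. Using the median rather than the mean as the reference keeps a single extreme poll from shifting the baseline.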
