
Russian elections

[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Just a few words about the Russian election. I read this entry http://www.badscience.net/2012/03/is-there-statistical-evidence-of-fraud-in-the-russian-election-data/ and decided to look at the data myself. To me, the data do not seem good enough to answer the fraud question.


Download the data, read it in, and take a first look:
> library(gdata)  # provides read.xls
> r1 <- read.xls("xxxxxxxxxxxxxx")

> head(r1)
            projecturl id     updt region uik obstrusted INVALID VALID
1 http://sms.golos.org  1 38324.72     27 650          1       4   323
2 http://sms.golos.org  2 38689.09     25 216          0       9   927
3 http://sms.golos.org  3 38324.72     38 732          1       7  1282
4 http://sms.golos.org  4 38324.72     25 291          0      14  1185
5 http://sms.golos.org  5 38324.72     38 668          0      15  1510
6 http://sms.golos.org  6 38324.72     27 198          0      15  1889
  Zhirinovsky Zyuganov Mironov Prokhorov Putin
1          42       40       3        24   214
2          88      229      58        92   460
3          80      333      46       150   673
4         129      315      67       175   499
5          76      395      70       227   742
6         127      353     115       379   915


At first glance the data look fine. A few columns are unknown to me, but region, VALID and the contenders look pretty straightforward.


Some regions occur once, others quite often. Some are completely missing.

> regs <- xtabs(~ region,data=r1)
> names(regs[regs==1])
[1] "13" "32" "43" "65" "75" "86" "87"
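The tabulation above lists the regions that occur once, but not the ones that are missing. A minimal sketch of how both could be found follows. It uses only the region values of the six rows printed by head(r1), and it assumes that region codes are consecutive integers, so that a gap in the observed range means a missing region. The names once and missing_regions are mine, not part of the original analysis.

```r
# Toy data: the region column of the six rows printed by head(r1).
region <- c(27, 25, 38, 25, 38, 27)

# Number of reports per region, as in the post.
regs <- table(region)

# Regions that occur exactly once (none in this toy sample).
once <- names(regs[regs == 1])

# Region codes inside the observed range that never occur.
# Assumption: codes are consecutive integers, so a gap is a missing region.
missing_regions <- setdiff(seq(min(region), max(region)), unique(region))

print(once)
print(missing_regions)
```

On the full file one would replace region by r1$region. Codes below the smallest or above the largest observed value cannot be detected this way.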

There is quite some difference in the total number of valid votes per region, as the next plot shows. To someone who does not know this field, that looks very odd.
plot(xtabs(VALID ~ factor(region,levels=min(region):max(region)),data=r1))
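The plot sums VALID per region, so a low bar can mean few reports or small reports. A rough sketch of how to separate the two is given below. It assumes each row is one report, and it uses only the region and VALID values of the six rows printed by head(r1); the data frame r6 and the column names of per_region are hypothetical names chosen for illustration.

```r
# Toy data: region and VALID from the six rows printed by head(r1).
r6 <- data.frame(
  region = c(27, 25, 38, 25, 38, 27),
  VALID  = c(323, 927, 1282, 1185, 1510, 1889)
)

# Number of reports and total valid votes per region.
n_reports   <- tapply(r6$VALID, r6$region, length)
total_valid <- tapply(r6$VALID, r6$region, sum)

# One row per region; mean_valid shows the typical report size.
per_region <- data.frame(
  region      = names(n_reports),
  reports     = as.vector(n_reports),
  total_valid = as.vector(total_valid),
  mean_valid  = as.vector(total_valid / n_reports)
)

print(per_region)
```

If the totals differ mainly because reports differs, the odd picture may reflect uneven coverage rather than anything about the votes themselves.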

And if we expect VALID = Zhirinovsky + Zyuganov + Mironov + Prokhorov + Putin, that does not hold either.

r1$myValid <- with(r1, Zhirinovsky + Zyuganov + Mironov + Prokhorov + Putin)
plot(myValid ~ VALID,data=r1)
The data simply do not add up.
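A scatter plot shows that something is off, but not how often or by how much. The sketch below counts the mismatching rows instead. It is built from the six rows printed by head(r1), which happen to add up exactly, so the discrepancies seen in the plot must sit elsewhere in the file. The names r6, gap, n_mismatch and bad_rows are my own.

```r
# Toy data: the vote columns of the six rows printed by head(r1).
r6 <- data.frame(
  INVALID     = c(4, 9, 7, 14, 15, 15),
  VALID       = c(323, 927, 1282, 1185, 1510, 1889),
  Zhirinovsky = c(42, 88, 80, 129, 76, 127),
  Zyuganov    = c(40, 229, 333, 315, 395, 353),
  Mironov     = c(3, 58, 46, 67, 70, 115),
  Prokhorov   = c(24, 92, 150, 175, 227, 379),
  Putin       = c(214, 460, 673, 499, 742, 915)
)

candidates <- c("Zhirinovsky", "Zyuganov", "Mironov", "Prokhorov", "Putin")

# Sum of the candidate votes and its difference from the reported VALID.
r6$myValid <- rowSums(r6[, candidates])
r6$gap     <- r6$myValid - r6$VALID

# How many rows disagree, and which ones.
n_mismatch <- sum(r6$gap != 0)
bad_rows   <- r6[r6$gap != 0, ]

print(n_mismatch)
print(bad_rows)
```

Run on the full r1, a summary of gap (for instance whether it equals INVALID, or is always of one sign) might hint at whether the mismatch is a reporting convention or plain data entry error.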

Conclusion
Either the data are incomplete and raise too many questions to even think about looking for fraud, or these are the true data, they are as bad as seen here, and the fraud is obvious.

