Analysis of Iran absentee votes

June 20, 2009

(This article was first published on MLT thinks, and kindly contributed to R-bloggers)

The official Iranian election results from polling stations outside of Iran have been posted. Here is a bit of exploration of those results.

The graph shows the number of votes for Ahmadinejad (x-axis) vs. the number of votes for Moussavi (y-axis). Each point is a city (or a country). The diagonal line represents the ratio in the official total results for all of Iran. Any point above the line has more votes for Moussavi than the official ratio predicts; any point below the line has more for Ahmadinejad. (I excluded some regions where the reported total was less than the sum of the votes.) The total percentages in the data I looked at are 56% for Moussavi, 40% for Ahmadinejad, 2% for Karroubi and 2% for Rezaei.
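As a rough illustration of how the graph is read, here is a minimal Python sketch. It drops rows whose reported total is smaller than the sum of the candidate votes, checks which side of the official-ratio line a point falls on, and computes overall percentage shares. The rows, the function names and the `OFFICIAL_RATIO` value are all placeholders made up for this example; they are not the published figures and not necessarily how the original graph was produced.

```python
# Hypothetical rows: (location, ahmadinejad, moussavi, karroubi, rezaei, reported_total).
# The counts are invented for illustration only; they are NOT the published results.
rows = [
    ("city_a", 800, 350, 10, 20, 1300),
    ("city_b", 150, 900, 30, 15, 1100),
    ("city_c", 300, 200, 5, 5, 400),  # reported total < sum of votes
]

# Placeholder slope of the diagonal line (Moussavi votes per Ahmadinejad vote).
# Assumption: substitute the ratio taken from the official national totals.
OFFICIAL_RATIO = 0.5


def keep_consistent(rows):
    """Keep only rows whose reported total is at least the sum of candidate votes."""
    return [r for r in rows if r[5] >= sum(r[1:5])]


def side_of_line(ahmadinejad, moussavi, official_ratio):
    """Say whether a point lies above, below or on the line y = official_ratio * x."""
    expected = official_ratio * ahmadinejad
    if moussavi > expected:
        return "above"  # more votes for Moussavi than the official ratio predicts
    if moussavi < expected:
        return "below"  # more votes for Ahmadinejad than the official ratio predicts
    return "on"


def shares(rows):
    """Percentage share of each candidate, in column order, over all given rows."""
    totals = [sum(r[i] for r in rows) for i in range(1, 5)]
    grand_total = sum(totals)
    return [100.0 * t / grand_total for t in totals]


kept = keep_consistent(rows)
for name, ahmadinejad, moussavi, *_ in kept:
    print(name, side_of_line(ahmadinejad, moussavi, OFFICIAL_RATIO))
print(shares(kept))
```

Treating the line as a slope through the origin keeps the comparison independent of how many people voted at a given location, which matters because the locations differ widely in size.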

There are two main lines that the points lie on: one is very close to the official ratio, and the other is much steeper, with many more votes for Moussavi.
The points that lie on the less steep line, with more votes for Ahmadinejad, are Mecca, Medina, Damascus, Kuwait and Karbala.
The points that lie on the steep line, with more votes for Moussavi, are New York, London, Kuala Lumpur, Paris and Ottawa.
Dubai is somewhat of an outlier and sits between the two lines.
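One simple way to make the "two lines" observation concrete is to compute each location's slope (Moussavi votes divided by Ahmadinejad votes) and split the locations at the largest gap between consecutive log-slopes. The sketch below does that. The grouping in this post was done by eye, so the largest-gap rule is a swapped-in technique, and the location names and counts are hypothetical.

```python
import math


def split_by_largest_gap(points):
    """Split locations into a shallow and a steep group.

    points maps a location name to (ahmadinejad_votes, moussavi_votes).
    Both counts must be positive and at least two locations are needed.
    The cut is placed at the largest gap between consecutive sorted log-slopes.
    """
    slopes = sorted((math.log(m / a), name) for name, (a, m) in points.items())
    gaps = [slopes[i + 1][0] - slopes[i][0] for i in range(len(slopes) - 1)]
    cut = gaps.index(max(gaps)) + 1
    shallow = [name for _, name in slopes[:cut]]
    steep = [name for _, name in slopes[cut:]]
    return shallow, steep


# Hypothetical counts, invented for illustration only.
points = {
    "city_a": (800, 350),
    "city_b": (150, 900),
    "city_c": (500, 260),
    "city_d": (100, 700),
}

shallow, steep = split_by_largest_gap(points)
print("shallow line:", shallow)
print("steep line:", steep)
```

Working with log-slopes makes a ratio of 2 and a ratio of 1/2 equally far from 1. A rule like this forces every location into one of the two groups, so a point that sits between the lines, as Dubai does, would still need to be inspected separately.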

Looking at the ratio of votes for Ahmadinejad to those for the other two candidates, Karroubi and Rezaei, gives the same overall picture. The graphs are below.

When looking at these results, the first thing to remember is that they are not expected to represent the population in Iran. The Iranians who live in New York represent a very different socioeconomic sphere, and are exposed to very different media, than the people who live in Iran.

The results in these ballots definitely do not agree with the letter supposedly proving the election fraud in Iran. In no case do the percentages of votes for Karroubi and Rezaei come close to the numbers given in the letter. In the letter they get 32% and 9%; in these votes, even in the most “western” booths, they get 4% and 1% (and Moussavi gets 85%). So, either these results are also fabricated, or the letter is fraudulent.

From this I can infer the following possibilities:

1. If both the official results and these results are fabricated, then these ones are cleverly fabricated. Unlike the results in Iran, which show no trends whatsoever with respect to urban vs. non-urban areas, here there are very strong trends.

2. If just the official results are fabricated, and these results are real, then in at least some places – Mecca, Medina, Damascus, Kuwait, and Karbala, the results of the vote are very close to the official published results. This at least means that the results in Iran were not totally fabricated – they reflect the ratio of opinions in some places.

3. It could be that the official results are fabricated and that, at the same time, the results in some cities outside Iran, but not all of them, were fabricated as well, depending either on where the counting was done or on who counted.

4. It could be that both the official results and these results are real. When I started the analysis I expected much more discrepancy. The official results are not that far off the results in Mecca and Medina, which I would expect to represent the population in Iran much more than New York.

Dubai is especially interesting. It lies between the two slopes. This could reflect a mixture of people who live under more or less western influence, or who belong to two different social strata. It is interesting, because Nate’s analysis of urban vs. non-urban areas saw no such trend.

In short, after looking at these results, I still cannot be sure of what happened. I’m pretty sure that the “Fraud letter” is fake, though.

Here are the same graphs comparing Ahmadinejad to the other two candidates:
