For those of you used to reading about international relations, I apologize for the following brief foray into American politics. It appears that the American Enterprise Institute and David Frum have decided to (abruptly) part ways. Before David left, however, he and his team of interns provided some interesting statistical insight into the Tea Party movement, as he writes:

Over the course of our survey, FrumForum interviewed approximately 60 people of the estimated 300-500 protesters assembled on Capitol Hill to protest the healthcare bill currently before the House*. We asked them questions about their perception of current taxation rates and the economy.

*To be sure, this survey lacked a control group and a statistically significant sample, but based on our estimates, we surveyed between 11% and 19% of the protesters on Capitol Hill. It is perhaps more valid to treat the results as that of a focus group, and a general contribution to the understanding of the Tea Party movement.

Frum asserts that the survey, “shows that Tea Partiers tend to be more financially pessimistic than average Americans, and perceive the United States’ tax burden to be significantly higher than it actually is.” Given the small number of respondents (as noted in the highlighted footnote above), it is difficult to get a sense of the actual distribution of beliefs within the population of Tea Partiers. We can, however, bootstrap (simulate) these distributions by making parametric assumptions, and then test Frum's assertion more accurately.

A brief caveat: as Frum has pointed out, these data were by no means scientifically collected, and therefore any results generated will be biased in whatever direction the collection pointed them. In the following experiments, I will be treating these data as though they were legitimate, despite the dangers of doing so. That said, we will model the results of these questions using the Gamma distribution and approximate the shape ($k$) and scale ($\theta$) parameters using descriptive statistics from the data gathered in the survey.

To generate the approximate values for $k$ and $\theta$ I will use the mean and standard deviation values provided for each of the survey questions. From the definition of the Gamma distribution, the mean is equal to $k\theta$ and the variance is equal to $k\theta^2$. By substituting the survey mean for the former and the squared standard deviation of the survey results for the latter, we can easily approximate values of these parameters. To do so, I use sympy, an open source symbolic mathematics Python package, which among other things simply makes my life easier.
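The two moment equations can also be solved by hand: dividing the variance by the mean gives $\theta = \sigma^2/\mu$, and substituting back gives $k = \mu^2/\sigma^2$. The sketch below applies that closed form in plain Python rather than sympy. The function name `gamma_moments` and the input values are hypothetical and chosen only for illustration; they are not figures from the survey.

```python
def gamma_moments(mean, sd):
    """Method-of-moments estimates (k, theta) for a Gamma distribution.

    Solves the pair of equations
        mean  = k * theta
        sd**2 = k * theta**2
    for the shape k and the scale theta.
    """
    if mean <= 0 or sd <= 0:
        raise ValueError("mean and sd must both be positive")
    variance = sd ** 2
    theta = variance / mean     # scale: variance divided by mean
    k = mean ** 2 / variance    # shape: squared mean divided by variance
    return k, theta


# Hypothetical summary statistics, NOT taken from the survey.
k, theta = gamma_moments(mean=40.0, sd=20.0)
print(k, theta)
```

A quick sanity check is that `k * theta` recovers the mean and `k * theta ** 2` recovers the squared standard deviation.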

Once these values have been calculated it is very easy to generate simulated distributions. I use the following R code to do so:
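The same simulation step can also be sketched in Python using only the standard library. This is a rough illustration under stated assumptions: `simulate_gamma` is a hypothetical helper, the shape and scale values are placeholders rather than the survey estimates, and the number of draws and the seed are arbitrary.

```python
import random
import statistics


def simulate_gamma(k, theta, n=10000, seed=0):
    """Draw n values from a Gamma distribution with shape k and scale theta."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    # random.gammavariate takes the shape first and the scale second.
    return [rng.gammavariate(k, theta) for _ in range(n)]


# Placeholder parameters, NOT the values estimated from the survey.
draws = simulate_gamma(k=4.0, theta=10.0)
print(statistics.mean(draws), statistics.stdev(draws))
```

With enough draws, the sample mean and standard deviation should land close to $k\theta$ and $\sqrt{k}\,\theta$, which is a useful check that the parameters were passed in the right order.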

Next, we will visualize these distributions with ggplot2, noting where the actual values for these questions fall on the distributions using a vertical red line.

With the simulated distributions it is a bit easier to draw conclusions as to the level of “misinformation” within the population of the Tea Party movement. My initial reaction is that while the actual values for both questions are clearly in the tails of these distributions, they are not so far in these tails as to make them extreme. In fact, for the federal income tax question the actual value is well within one standard deviation of the mean. Perhaps the Tea Partiers are not as misinformed as many would presume from their rhetoric. At the same time, however, these are not necessarily difficult questions, so any deviation from the actual value could be viewed negatively.
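One way to make "how far into the tail" concrete is to measure the distance between the actual value and the simulated mean in standard deviations, along with the share of simulated draws that fall at or below the actual value. The sketch below does this in Python; `position_in_distribution` is a hypothetical helper, and the Gamma parameters and the "actual" value are placeholders, not the survey figures or the true tax rates.

```python
import random
import statistics


def position_in_distribution(draws, actual):
    """Locate `actual` within a set of simulated draws.

    Returns the distance from the sample mean in standard deviations
    and the fraction of draws at or below `actual`.
    """
    mu = statistics.mean(draws)
    sd = statistics.stdev(draws)
    z = (actual - mu) / sd
    share_below = sum(d <= actual for d in draws) / len(draws)
    return z, share_below


# Placeholder simulation and placeholder "actual" value, for illustration only.
rng = random.Random(0)
draws = [rng.gammavariate(4.0, 10.0) for _ in range(10000)]
z, share_below = position_in_distribution(draws, actual=25.0)
print(f"{z:.2f} SDs from the mean; {share_below:.1%} of draws at or below")
```

An absolute distance below 1 corresponds to "within one standard deviation of the mean" in the sense used above.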