Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

## Tasks: Compute the Effective Number of Parties by year and region

• Using the American National Election Studies, compute the Effective Number of Parties in the electorate across regions and years/waves.

## The data

The data come from The American National Election Studies (ANES). The ANES is a survey that covers voting behavior, public opinion, and political participation. Many other countries have their own version of this survey, for instance see here. While the primary mission of these studies is to answer questions about voting behavior, the wealth of variables collected amongst voters means that we can use these data to answer other questions too. If you would like to know about the other variables contained in the ANES questionnaires, you may want to read its codebook. You may also be interested in a post by Anthony Damico on this topic.

## Data chasing

I’m particularly interested in three variables from the list: party identification (VCF0301), originally measured on a 7-point scale; the census region code (VCF0112); and the year of the wave (VCF0004).

## The Effective Number of Parties

For those who do not know the concept of an “Effective Number of Parties”, you can read a post from 2014, or go to Wikipedia for a summary. In short, the effective number of parties is the number of viable or important political parties in a party system that includes parties of unequal size. This measure is given by the inverse of the Herfindahl-Hirschman Index (HHI) or the inverse participation ratio (IPR) in physics. The HHI is calculated by taking the share of each party in the electorate, squaring each share, and summing the results: $HHI = s_1^2 + s_2^2 + s_3^2 + \dots + s_n^2$, where $s_i$ is the share of party $i$. For the inverse to read as a number of parties, the shares must be expressed as proportions that sum to 1; with whole-number percentages the HHI instead ranges up to 10,000. In mathematical notation, the effective number of parties is:

$$N = \frac{1}{\sum_{i=1}^{n} s_i^2}$$
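To make the arithmetic concrete, here is a minimal sketch in base R. The helper `inv_hhi()` and the share vectors are hypothetical and chosen only for illustration; they are not taken from the ANES data. The function assumes the shares are proportions that sum to 1.

```r
# Hypothetical helper (not from the ANES or any package):
# inverse HHI for a vector of party shares given as proportions.
inv_hhi <- function(shares) {
  # Guard against percentages or shares that do not sum to 1
  stopifnot(all(shares >= 0), abs(sum(shares) - 1) < 1e-8)
  1 / sum(shares^2)
}

# Two parties of equal size count as exactly two effective parties
two_equal <- inv_hhi(c(0.5, 0.5))          # 1 / (0.25 + 0.25) = 2

# Three parties of unequal size count as fewer than three
unequal <- inv_hhi(c(0.6, 0.3, 0.1))       # 1 / 0.46, about 2.17

# One dominant party pushes the index toward 1
one_dominant <- inv_hhi(c(0.98, 0.02))     # 1 / 0.9608, about 1.04
```

The third case shows why the index is read as a measure of concentration: a tiny second party barely moves it above 1.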

For now, I’ll be using dplyr to estimate the effective number of parties by year and region, but the SciencesPo package has a function named politicalDiversity that can calculate several indices used by political science scholars, which I’ll be addressing in future posts.
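A grouped calculation along these lines could look like the sketch below. It is an illustration under stated assumptions, not the code behind the post: `anes_toy` is a made-up data frame standing in for the ANES file, and the `party` column is a hypothetical recoding of VCF0301 (how the 7-point scale should be collapsed into parties is left open here). Survey weights are also ignored.

```r
library(dplyr)

# Made-up rows standing in for the ANES data; only the column names
# VCF0004 (year) and VCF0112 (census region) follow the codebook.
anes_toy <- data.frame(
  VCF0004 = rep(1952, 8),
  VCF0112 = rep(c(1, 3), each = 4),
  # Hypothetical recoding of VCF0301 into party labels
  party   = c("Dem", "Dem", "Rep", "Rep", "Dem", "Dem", "Dem", "Rep")
)

enp_by_group <- anes_toy %>%
  filter(!is.na(party)) %>%                      # drop missing identification
  count(VCF0004, VCF0112, party, name = "n") %>% # respondents per party
  group_by(VCF0004, VCF0112) %>%
  # Shares within each year-region cell, then the inverse HHI
  summarise(enp = 1 / sum((n / sum(n))^2), .groups = "drop")

enp_by_group
```

In this toy example the first region splits evenly between two parties (index of 2), while the second leans three to one toward a single party (index of 1.6).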

## The plot

The index takes into consideration the relative size distribution of the parties (actually, declared partisanship) in an electorate. It approaches 1 when the distribution of preferences in a region is concentrated around only one party. Conversely, the index increases as preferences spread more evenly over more parties, and it equals the raw number of parties only when all of them have the same share.

The analysis suggests a highly concentrated political market in-the-electorate in the Southern states, at around one and a half parties in the 1950s, which then converges toward the national mean (invHHI = 2.349) after the 1960s.

Plurality (single-ballot) electoral systems tend to have a lower effective number of parties than two-round majority systems and proportional representation systems. With an average effective number of parties in-the-electorate of around 2, there is not much room for a third political force to emerge, and this pattern appears to hold across regions.

