(This article was first published on **Daniel's Blog**, and kindly contributed to R-bloggers)

## Tasks: Compute the Effective Number of Parties by year and region

- Using the American National Election Studies, compute the Effective Number of Parties in the electorate across regions and years/waves.

## The data

The data come from The American National Election Studies (ANES).

The ANES is a survey that covers voting behavior, public opinion, and political participation. Many other countries run their own version of this survey.

While the primary mission of these studies is to answer questions about voting behavior, the wealth of variables collected amongst voters means that we can use these data to answer other questions too. If you would like to know about the other variables contained in the ANES questionnaires, you may want to read its codebook. You may also be interested in a post by Anthony Damico on this topic.

There are many ways to import the version of the data I’ve prepared. For example, you can download the file and then read it into R.

## Data chasing

I’m particularly interested in three variables from the list: party identification (VCF0301), originally measured on a 7-point scale; the census region code (VCF0112); and the year of the wave (VCF0004).

## The Effective Number of Parties

For those who do not know the concept of an “Effective Number of Parties”, you can read a post from back in 2014, or go to Wikipedia for a summary. In short, the effective number of parties is a count of the viable or important political parties in a party system, adjusted for the fact that parties are of unequal size.

This measure is given by the inverse of the Herfindahl-Hirschman Index (HHI) or the inverse participation ratio (IPR) in physics.

The HHI is calculated by taking the share of each party in the electorate, squaring each share, and summing the results: $HHI = s_1^2 + s_2^2 + s_3^2 + \dots + s_n^2$, where $s_i$ is the share of party $i$ expressed as a proportion between 0 and 1, so that the shares sum to 1. In mathematical notation, the effective number of parties looks like:

$$N = \frac{1}{HHI} = \frac{1}{\sum_{i=1}^{n} s_i^2}$$

For now, I’ll be using *dplyr* to estimate the effective number of parties by year and region, but the **SciencesPo** package has a function named *politicalDiversity* that can calculate several indices used by political science scholars, which I’ll be addressing in future posts.
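As a rough illustration of the dplyr approach, here is a minimal sketch run on a small made-up data frame rather than the real ANES file. The column names `year`, `region`, and `party` are hypothetical stand-ins for VCF0004, VCF0112, and VCF0301, and the rows are invented purely to show the mechanics; the actual analysis may recode or filter the variables differently.

```r
library(dplyr)

# Toy data: one row per respondent (invented values, NOT ANES data).
# Column names are hypothetical stand-ins for VCF0004, VCF0112, VCF0301.
toy <- data.frame(
  year   = rep(1952, 8),
  region = c("South", "South", "South", "South",
             "West", "West", "West", "West"),
  party  = c("Dem", "Dem", "Dem", "Rep",
             "Dem", "Dem", "Rep", "Rep")
)

enp <- toy %>%
  filter(!is.na(party)) %>%               # drop respondents with no party recorded
  count(year, region, party) %>%          # respondents per party within year-region
  group_by(year, region) %>%
  mutate(share = n / sum(n)) %>%          # party shares as proportions
  summarise(invHHI = 1 / sum(share^2),    # inverse HHI = effective number of parties
            .groups = "drop")

enp
```

In this toy example the South has shares of 0.75 and 0.25, giving an inverse HHI of 1.6, while the West has an even split, giving exactly 2.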

## The plot

The index takes into consideration the relative size distribution of the parties (actually, declared partisanship) in an electorate. It approaches 1 when the distribution of preferences in a region is concentrated around only one party. Conversely, the index increases when the number of parties favored in the region increases.
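To make this behaviour concrete, the short sketch below evaluates the index on a few invented distributions. The helper name `inv_hhi` is hypothetical and only used for illustration; it is not a function from any package mentioned here.

```r
# Hypothetical helper: effective number of parties from counts or shares
inv_hhi <- function(x) {
  s <- x / sum(x)   # normalise to proportions that sum to 1
  1 / sum(s^2)      # inverse of the Herfindahl-Hirschman Index
}

one_party   <- inv_hhi(c(100, 0, 0))   # everyone declares the same party
two_equal   <- inv_hhi(c(50, 50))      # two parties of equal size
three_equal <- inv_hhi(c(1, 1, 1))     # three parties of equal size
unequal     <- inv_hhi(c(70, 20, 10))  # one dominant party, two small ones

round(c(one_party, two_equal, three_equal, unequal), 2)
```

With equally sized parties the index simply returns the number of parties (1, 2, and 3 here), while the unequal case gives about 1.85, well below the three parties nominally present.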

The analysis suggests a highly concentrated political market in-the-electorate in the Southern states, with around one and a half effective parties in the 1950s, followed by convergence towards the national mean (invHHI = 2.349) after the 1960s.

It’s true that plurality (single-ballot) electoral systems tend to have a lower number of effective parties than two-round majority systems and proportional representation systems. With an average count of effective parties in-the-electorate around 2, there is not much room for a third political force to emerge, and this pattern seems to be quite consolidated across regions.

`#reproducible`
