Filtering cases

June 26, 2009
(This article was first published on Learning R, and kindly contributed to R-bloggers)

Filtering cases is an essential part of data analysis and visualization. Say you want to run identical analyses on two different groups, or on one group and then on a subset of it. R approaches this a little differently: rather than merely filtering out cases, you create a new object that holds the subset, and then refer to that object whenever you need it.

Let's look at some data on the U.S. Congress. Keith Poole has developed a two-dimensional procedure that places members of Congress at specific points based on roll call votes. What we'll do now is compare Democrats and Republicans in the 110th Congress.

First, we load the data into R.

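# Read the CSV file; the first row holds the column names.
voteview <- read.csv ("C:/Data/HouseSmall.csv", header = TRUE)
# Optional here: the calls below pass the data frame explicitly.
attach (voteview)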

The voteview data frame contains data on all Congresses beginning with the 101st. That's more than we want to deal with, and we also need a way to look at Democrats and Republicans separately. We'll create one object just for Democrats in the 110th Congress, and then another for Republicans.

dems110 <- subset(voteview, party == 100 & cong == 110)

reps110 <- subset(voteview, party == 200 & cong == 110)
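
To see what subset() does without needing the voteview file, here is a minimal sketch on a made-up data frame. Every value in it is invented for illustration; only the column names (cong, party, dwnom1, dwnom2) and the party codes 100 and 200 are taken from the code above. It also shows bracket indexing, which selects the same rows here.

```r
# Toy stand-in for voteview; all values below are invented for illustration.
toy <- data.frame(
  cong   = c(109, 110, 110, 110, 110),
  party  = c(100, 100, 200, 200, 100),
  dwnom1 = c(-0.40, -0.35, 0.50, 0.62, -0.21),
  dwnom2 = c(0.10, -0.05, 0.20, -0.30, 0.15)
)

# subset() evaluates the condition inside the data frame,
# so column names can be used without the toy$ prefix.
toy_dems110 <- subset(toy, party == 100 & cong == 110)
toy_reps110 <- subset(toy, party == 200 & cong == 110)

# Bracket indexing: a logical vector picks the rows, the empty
# slot after the comma keeps every column.
toy_dems110_alt <- toy[toy$party == 100 & toy$cong == 110, ]

nrow(toy_dems110)  # 2 rows: the two Democrats in the 110th
nrow(toy_reps110)  # 2 rows: the two Republicans in the 110th
```

The original toy data frame is left untouched; each subset is a separate object that can be passed to any later analysis or plot.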


Now let's create a graph to compare them.

plot (c (-1.5, 1.5), c (-1.5, 1.5), type = 'n',
      xlab = "1st dimension",
      ylab = "2nd dimension",
      col.axis = "#777777",
      col.lab = "#777777",
      cex.axis = 0.75,
      cex.lab = 1.25,
      main = "DW-nominate scores, 110th Congress",
      col.main = "#444444")

abline (v = 0, col = "#cccccc")

points (dwnom2 ~ dwnom1, data = dems110, pch = "D", col = "blue", cex = 0.75)

points (dwnom2 ~ dwnom1, data = reps110, pch = "R", col = "red", cex = 0.75)


That's all a bit complicated, so next time I'll talk about what all those things do. But for now I'll just show what it looks like.

Figure 1. The polarization of Congress.
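
Because each subset is an ordinary object, the same comparison could be repeated for another Congress by wrapping the steps in a function. The sketch below is one possible way to do that, not part of the original post: plot_parties is a hypothetical helper name, and the toy data frame holds invented values so the example runs without the voteview file. It assumes both parties have at least one member in the chosen Congress.

```r
# Toy stand-in for voteview; all values below are invented for illustration.
toy <- data.frame(
  cong   = c(109, 110, 110, 110, 110),
  party  = c(100, 100, 200, 200, 100),
  dwnom1 = c(-0.40, -0.35, 0.50, 0.62, -0.21),
  dwnom2 = c(0.10, -0.05, 0.20, -0.30, 0.15)
)

# Hypothetical helper (not from the original post): draws the same
# party comparison for any Congress number present in `data`.
plot_parties <- function(data, congress) {
  dems <- subset(data, party == 100 & cong == congress)
  reps <- subset(data, party == 200 & cong == congress)

  # Draw an empty frame first, then add one layer of points per party.
  plot(c(-1.5, 1.5), c(-1.5, 1.5), type = "n",
       xlab = "1st dimension", ylab = "2nd dimension",
       main = paste0("DW-nominate scores, Congress ", congress))
  abline(v = 0, col = "#cccccc")
  points(dwnom2 ~ dwnom1, data = dems, pch = "D", col = "blue", cex = 0.75)
  points(dwnom2 ~ dwnom1, data = reps, pch = "R", col = "red", cex = 0.75)

  # Return the number of members plotted per party, invisibly.
  invisible(c(dems = nrow(dems), reps = nrow(reps)))
}

counts <- plot_parties(toy, 110)
```

With the real data, the call would be plot_parties(voteview, 110), and changing the second argument would redraw the figure for a different Congress.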
