R 101: The Subset Function

November 9, 2011
By

(This article was first published on Abraham Mathew » R, and kindly contributed to R-bloggers)

The subset function is available in base R and can be used to return subsets of a vector, martix, or data frame which meet a particular condition. In my three years of using R, I have repeatedly used the subset() function and believe that it is the most useful tool for selecting elements of a data structure. I assume that many of you are familiar with this function, so I will simply conclude this post by providing some brief examples of the subset function.

numvec = c(2,5,8,9,0,6,7,8,4,5,7,11)
charvec = c("David","James","Sara","Tim","Pierre",
        "Janice","Sara","Priya","Keith","Mark",
        "Apple","Sara")
gender = c("M","M","F","M","M","M","F","F","F","M","M","F")
state = c("CO","KS","CA","IA","MO","FL","CA","CO","FL","CA","WY","AZ")

subset(numvec, numvec > 7)
subset(numvec, numvec < 9 & numvec > 4)
subset(numvec, numvec < 3 |numvec > 9)

df = data.frame(var1=c(numvec), var2=c(charvec),
          gender=c(gender), state=c(state))

subset(df, var1 < 5)
subset(df, var2 == "Sara")
subset(df, var1==5, select=c(var2, state))
subset(df, var2 != "Sara" & gender == "F" & var1 > 5)

To leave a comment for the author, please follow the link and comment on his blog: Abraham Mathew » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , ,

Comments are closed.