R 101: The Subset Function

November 9, 2011

(This article was first published on Abraham Mathew » R, and kindly contributed to R-bloggers)

The subset function is available in base R and can be used to return subsets of a vector, martix, or data frame which meet a particular condition. In my three years of using R, I have repeatedly used the subset() function and believe that it is the most useful tool for selecting elements of a data structure. I assume that many of you are familiar with this function, so I will simply conclude this post by providing some brief examples of the subset function.

numvec = c(2,5,8,9,0,6,7,8,4,5,7,11)
charvec = c("David","James","Sara","Tim","Pierre",
gender = c("M","M","F","M","M","M","F","F","F","M","M","F")
state = c("CO","KS","CA","IA","MO","FL","CA","CO","FL","CA","WY","AZ")

subset(numvec, numvec > 7)
subset(numvec, numvec < 9 & numvec > 4)
subset(numvec, numvec < 3 |numvec > 9)

df = data.frame(var1=c(numvec), var2=c(charvec),
          gender=c(gender), state=c(state))

subset(df, var1 < 5)
subset(df, var2 == "Sara")
subset(df, var1==5, select=c(var2, state))
subset(df, var2 != "Sara" & gender == "F" & var1 > 5)

To leave a comment for the author, please follow the link and comment on their blog: Abraham Mathew » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , ,

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)