sapply is my new friend!

August 15, 2013

(This article was first published on Data and Analysis with R, at Work, and kindly contributed to R-bloggers)

I’ve written previously about how the apply function is a major workhorse in many of my work projects. What I didn’t know is how handy the sapply function can be!

There are a couple of cases so far where I’ve found that sapply really comes in handy for me:

1) If I want to quickly see some descriptive stats for multiple columns in my dataframe. For example,

sapply(mydf[,10:20], median, na.rm=TRUE)

would show me the medians of columns 10 through 20, displaying the column names above each median value.
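To see what that output looks like, here is a small sketch on a made-up data frame. The column names and values are invented for illustration, and the toy data frame has only three columns, so the index is 1:3 rather than 10:20.

```r
# Hypothetical data standing in for mydf; names and values are made up
mydf <- data.frame(
  visits = c(3, NA, 7, 5),
  calls  = c(10, 2, NA, 4),
  emails = c(1, 1, 2, NA)
)

# sapply calls median() on each column and simplifies the results
# into a named numeric vector; na.rm = TRUE is passed through to median()
col_medians <- sapply(mydf[, 1:3], median, na.rm = TRUE)
print(col_medians)
```

Any other summary function that returns a single number (mean, sd, max and so on) can be swapped in for median in the same way.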

2) If I want to apply the same function to multiple vectors in my dataframe, modifying them in place. I oftentimes have count variables that have NA values in place of zeros. I made a “zerofy” function that replaces the NA values in a vector with zeros. So, if I want to use my function to modify these count columns, I can do the following:

mydf[,30:40] = sapply(mydf[,30:40], zerofy)

This replaces the original data in columns 30 through 40 with the modified data! Handy!
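The post doesn't show the zerofy function itself, so the version below is only a guess at what it might look like: it assumes zerofy simply swaps every NA for 0. The data frame and column names are again made up.

```r
# Assumed definition of zerofy: replace NA with 0 (not the original code)
zerofy <- function(x) {
  x[is.na(x)] <- 0
  x
}

# Hypothetical data: one id column plus two count columns with NAs
mydf <- data.frame(
  id     = c("a", "b", "c"),
  visits = c(3, NA, 7),
  calls  = c(NA, 2, NA)
)

count_cols <- c("visits", "calls")

# sapply returns a matrix here (one column per input column),
# which is then assigned back over the original columns
mydf[, count_cols] <- sapply(mydf[, count_cols], zerofy)
print(mydf)
```

One thing to keep in mind: because sapply simplifies its result to a matrix, all the selected columns end up as a single type. That is fine when they are all numeric counts, but for columns of mixed types, lapply (as in `mydf[count_cols] <- lapply(mydf[count_cols], zerofy)`) keeps each column's own type.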

