Guess who wins: apply() versus for loops in R

April 28, 2012
By

(This article was first published on Data and Analysis with R, at Work, and kindly contributed to R-bloggers)

Yesterday I tried to do some data processing on my really big data set in MS Excel. Wow, did it not like handling all those data!! Every time I tried to click on a different ribbon, the screen didn’t even register that I had clicked on that ribbon. So, I took the hint, and decided to do my data processing in R.

One of the tasks that I needed to do was calculate a maximum value, in each row of the dataset, from multiple monetary values in 5 different fields. The first thing I noticed was that the regular max() function in R doesn’t quite like it when you try to calculate a maximum from a series of NA values (it returned an inf value for some reason…). So, I decided to create a “safe” max function:

Finding that it was working, I then constructed a simple for loop to iterate through my ~395,000 rows. As you could imagine, this was taking forever! After much looking around, I realized that the best solution was actually a base function, apply()!!

I constructed my “max” variable with one simple line of code: big.dataset\$max_money = apply(as.matrix(big.dataset[,214:218]), 1, function (x) safe.max(x))

Compared to the for loop, which was taking forever, this method was a breeze! It took less than a minute to get through the whole data set. Moral of the story? When you’re dealing with lots of data, code as efficiently as possible!

To leave a comment for the author, please follow the link and comment on their blog: Data and Analysis with R, at Work.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , , , ,

Comments are closed.

Sponsors

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts.(You will not see this message again.)

Click here to close (This popup will not appear again)