faster for() loops in R

November 13, 2013

(This article was first published on Ancient Eco, and kindly contributed to R-bloggers)

One of the most common ways people write for() loops is to create an empty results vector and then concatenate each new result onto the previous (and growing) results vector, as in the following.  (Note: wrapping an expression in system.time() evaluates that expression and returns a summary of how long it took, in seconds.)

x <- c()
system.time(
  for(i in 1:40000){
    x<-c(x,i) #here i is combined with previous contents of x
  }
)

   user  system elapsed 
  2.019   0.082   2.100 

It is MUCH faster to create the results vector at its final size up front and modify its elements in place.  This spares R from repeatedly copying an ever-growing object in memory. In short, it seems that what R is slow at is allocating memory for objects.

x<-numeric(40000) #preallocated numeric vector (all zeros) of the final length
system.time(
  for(i in 1:40000){
    x[i] <- i #changing value of particular element of x
  }
)
   user  system elapsed 
  0.066   0.001   0.067 

The second method is over 31 times faster on my machine.
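
To rerun the comparison on another machine, one option is to wrap each approach in a function, time both, and confirm they return the same values. The following is only a sketch: the function names grow_vector() and fill_vector() are made up for illustration, and the timings will differ with hardware and R version.

```r
# Hypothetical helper: grow the result by concatenation (the slow pattern)
grow_vector <- function(n) {
  x <- c()
  for (i in seq_len(n)) {
    x <- c(x, i)  # builds a new, longer vector on every iteration
  }
  x
}

# Hypothetical helper: preallocate, then fill in place (the fast pattern)
fill_vector <- function(n) {
  x <- numeric(n)  # vector of n zeros, allocated once
  for (i in seq_len(n)) {
    x[i] <- i      # overwrite one element at a time
  }
  x
}

n <- 40000
t_grow <- system.time(res_grow <- grow_vector(n))
t_fill <- system.time(res_fill <- fill_vector(n))

# Both approaches should hold the same values
stopifnot(all(res_grow == res_fill))

# Ratio of elapsed times; the exact figure varies by machine and R version
t_grow[["elapsed"]] / t_fill[["elapsed"]]
```

Using seq_len(n) rather than 1:n is a small safety choice here: it yields an empty sequence when n is 0, so the loop body is skipped instead of running over c(1, 0).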

PS.  This post was inspired by Hadley Wickham's much more technical and in-depth coverage of memory usage in R.
