faster for() loops in R

November 13, 2013

(This article was first published on Ancient Eco, and kindly contributed to R-bloggers)

One of the most common ways people write for() loops is to create an empty results vector and then concatenate each new result onto the previous (and growing) results vector, as in the following. (Note: wrapping an expression in system.time() evaluates the expression and returns a summary of how long it took, in seconds.)

system.time({
  x <- c()
  for(i in 1:40000){
    x <- c(x, i) # here i is combined with the previous contents of x
  }
})

   user  system elapsed 
  2.019   0.082   2.100 
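
A rough way to see why the concatenating loop gets slow is to count how many vector elements it writes. This is only a back-of-the-envelope sketch, assuming each call to c(x, i) builds a fresh vector one element longer than the last; the variable names are made up for illustration, and the counts are not a prediction of running time.

```r
n <- 40000

# Length of the vector built by c(x, i) at each iteration: 1, 2, ..., n
lengths_built <- seq_len(n)

# Total elements written across all iterations, i.e. n * (n + 1) / 2
total_written_grow <- sum(as.numeric(lengths_built))

# A vector of the final size only needs each element written once
total_written_fill <- n

print(total_written_grow)
print(total_written_grow / total_written_fill)
```

For 40000 iterations that is roughly 800 million element writes to produce a vector of 40000 elements, and the total grows with the square of the loop length.
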
It is MUCH faster to create the results vector at the correct size up front and modify its elements in place. This saves R from having to move an ever-growing object around in memory. In short, it seems that what R is slow at is allocating memory for objects: each call to c(x, i) builds a new, longer vector and copies the old contents into it.
system.time({
  x <- numeric(40000) # numeric vector of zeros, already the final length
  for(i in 1:40000){
    x[i] <- i # changing the value of a particular element of x
  }
})
   user  system elapsed 
  0.066   0.001   0.067 
The second method is over 31 times faster on my machine.
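
To try the same comparison at other sizes, the two loops above can be wrapped in functions. The following is only a sketch: the names grow_vector and fill_vector are hypothetical and do not come from the original post, n is set smaller here so it runs quickly, and timings will differ from machine to machine.

```r
# Hypothetical helper names, not from the original post.

# Grow the result by concatenation on every iteration.
grow_vector <- function(n) {
  x <- c()
  for (i in seq_len(n)) {
    x <- c(x, i)  # builds a new, longer vector each time
  }
  x
}

# Create the result at its final size, then modify elements in place.
fill_vector <- function(n) {
  x <- numeric(n)
  for (i in seq_len(n)) {
    x[i] <- i
  }
  x
}

n <- 10000
t_grow <- system.time(grown <- grow_vector(n))
t_fill <- system.time(filled <- fill_vector(n))

# Both approaches should produce the same values.
stopifnot(all(grown == filled))

print(t_grow["elapsed"])
print(t_fill["elapsed"])
```

Raising n should widen the gap between the two timings, since only the concatenating version recopies everything it has built so far.
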
PS.  This post was inspired by Hadley Wickham’s much more technical and in-depth coverage of memory usage in R.
