**R-posts.com**, and kindly contributed to R-bloggers)

It is often said that for- and while-loops should be avoided in R. I was curious how the different alternatives actually compare in terms of speed.

The first loop is perhaps the worst I can think of: the return vector is initialized without type or length, so a new, longer vector has to be allocated on every iteration.

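    use_for_loop <- function(x){
      y <- c()  # no type or length, so y is regrown on every iteration
      for(i in seq_along(x)){  # iterate over positions, not over the values of x
        y <- c(y, x[i] * 100)
      }
      return(y)
    }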

The second for-loop preallocates the return vector to its final length.

    use_for_loop_vector <- function(x){
      y <- vector(mode = "double", length = length(x))  # preallocated result
      for(i in seq_along(x)){  # iterate over positions, not over the values of x
        y[i] <- x[i] * 100
      }
      return(y)
    }
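Both loops index into `x`, so the loop variable has to run over positions rather than values. A minimal sketch of the difference, using a small hypothetical vector that is not part of the benchmark:

```r
x <- c(2.5, -0.3, 1.7)  # hypothetical values, chosen only for illustration

# Iterating over the vector itself yields its values
vals <- c()
for (i in x) vals <- c(vals, i)

# seq_along() yields the positions 1, 2, 3, which are safe to use as indices
idx <- c()
for (i in seq_along(x)) idx <- c(idx, i)

# Indexing by position visits every element exactly once
scaled <- vector(mode = "double", length = length(x))
for (i in seq_along(x)) scaled[i] <- x[i] * 100
```

Using a non-integer or negative value as an index does not raise an error in R, so a loop written as `for (i in x)` with `x[i]` inside can silently return the wrong result.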

I have noticed that I use sapply() quite a lot, but I don't think I have ever used vapply(). We will nonetheless look at both.

    use_sapply <- function(x){
      sapply(x, function(y){y * 100})
    }

    use_vapply <- function(x){
      # double(1L) declares that each call returns exactly one double
      vapply(x, function(y){y * 100}, double(1L))
    }
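Apart from speed, the two functions differ in how strictly they treat the return type. A small sketch of that difference, using an empty input and a deliberately mismatched return type (both chosen for illustration, not taken from the benchmark):

```r
empty <- numeric(0)

res_sapply <- sapply(empty, function(y) y * 100)
res_vapply <- vapply(empty, function(y) y * 100, double(1L))

class(res_sapply)  # a list, not a numeric vector
class(res_vapply)  # numeric, as declared by double(1L)

# vapply() also stops when the function returns an unexpected type
type_error <- tryCatch(
  vapply(1:3, function(y) as.character(y), double(1L)),
  error = function(e) conditionMessage(e)
)
```

Here `type_error` holds the error message rather than a result, because the function returned character values where a double was declared.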

And because I am a tidyverse fanboy, we also look at map_dbl().

    library(purrr)

    use_map_dbl <- function(x){
      map_dbl(x, function(y){y * 100})
    }

We test the functions using a vector of random doubles and evaluate the runtime with microbenchmark.

    x <- rnorm(100)

    mb_res <- microbenchmark::microbenchmark(
      `for_loop()` = use_for_loop(x),
      `for_loop_vector()` = use_for_loop_vector(x),
      `purrr::map_dbl()` = use_map_dbl(x),
      `sapply()` = use_sapply(x),
      `vapply()` = use_vapply(x),
      times = 500
    )
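Timings are only comparable if every variant computes the same thing, so a quick agreement check can be useful before benchmarking. The sketch below is an assumption-laden illustration: `loop_grow` and `loop_prealloc` are hypothetical, index-based stand-ins for the two loops, the vectorized `x * 100` serves only as a reference value, and map_dbl() is left out to avoid the purrr dependency.

```r
set.seed(1)  # fixed seed so the check is reproducible
x <- rnorm(100)

# Illustrative, index-based re-definitions (hypothetical names)
loop_grow <- function(x) {
  y <- c()
  for (i in seq_along(x)) y <- c(y, x[i] * 100)
  y
}
loop_prealloc <- function(x) {
  y <- vector(mode = "double", length = length(x))
  for (i in seq_along(x)) y[i] <- x[i] * 100
  y
}

expected <- x * 100  # vectorized reference, used only for the check

results <- list(
  loop_grow     = loop_grow(x),
  loop_prealloc = loop_prealloc(x),
  sapply        = sapply(x, function(y) y * 100),
  vapply        = vapply(x, function(y) y * 100, double(1L))
)

# TRUE for each variant whose output matches the reference
agree <- vapply(results, function(r) isTRUE(all.equal(r, expected)), logical(1L))
```

If any entry of `agree` is `FALSE`, the corresponding timing measures a different computation and should not be compared with the others.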

The results are listed in the table and figure below.

expr | min | lq | mean | median | uq | max | neval |
---|---|---|---|---|---|---|---|
for_loop() | 8.440 | 9.7305 | 10.736446 | 10.2995 | 10.9840 | 26.976 | 500 |
for_loop_vector() | 10.912 | 12.1355 | 13.468312 | 12.7620 | 13.8455 | 37.432 | 500 |
purrr::map_dbl() | 22.558 | 24.3740 | 25.537080 | 25.0995 | 25.6850 | 71.550 | 500 |
sapply() | 15.966 | 17.3490 | 18.483216 | 18.1820 | 18.8070 | 59.289 | 500 |
vapply() | 6.793 | 8.1455 | 8.592576 | 8.5325 | 8.8300 | 26.653 | 500 |
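A table with these columns can be obtained by calling summary() on a microbenchmark object. The following is a minimal sketch, assuming the microbenchmark package is installed; `mb_small` is a hypothetical, reduced benchmark used to keep the example short, and its timings will differ from the ones reported here.

```r
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  x <- rnorm(100)

  # Reduced benchmark (hypothetical name), only two of the variants
  mb_small <- microbenchmark::microbenchmark(
    `sapply()` = sapply(x, function(y) y * 100),
    `vapply()` = vapply(x, function(y) y * 100, double(1L)),
    times = 50
  )

  # summary() returns a data frame with one row per expression
  tbl <- summary(mb_small)
  print(tbl[, c("expr", "min", "lq", "mean", "median", "uq", "max", "neval")])
} else {
  tbl <- NULL  # package not available, nothing to summarize
}
```

summary() also accepts a `unit` argument (for example `unit = "us"`), which helps when tables from different runs need to be compared on the same scale.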

The clear winner is vapply(). The for-loops are slower, although in this run they still come in ahead of sapply() and map_dbl(). With a very low number of iterations, even the worst for-loop isn’t too bad:

    x <- rnorm(10)

    mb_res <- microbenchmark::microbenchmark(
      `for_loop()` = use_for_loop(x),
      `for_loop_vector()` = use_for_loop_vector(x),
      `purrr::map_dbl()` = use_map_dbl(x),
      `sapply()` = use_sapply(x),
      `vapply()` = use_vapply(x),
      times = 500
    )

expr | min | lq | mean | median | uq | max | neval |
---|---|---|---|---|---|---|---|
for_loop() | 5.992 | 7.1185 | 9.670106 | 7.9015 | 9.3275 | 70.955 | 500 |
for_loop_vector() | 5.743 | 7.0160 | 9.398098 | 7.9575 | 9.2470 | 40.899 | 500 |
purrr::map_dbl() | 22.020 | 24.1540 | 30.565362 | 25.1865 | 27.5780 | 157.452 | 500 |
sapply() | 15.456 | 17.4010 | 22.507534 | 18.3820 | 20.6400 | 203.635 | 500 |
vapply() | 6.966 | 8.1610 | 10.127994 | 8.6125 | 9.7745 | 66.973 | 500 |
