Avoiding unnecessary memory allocations in R

March 8, 2016

As a rule, everything I discover in R has already been discussed by Hadley Wickham. In this case, he writes:

The reason why the C++ function is faster is subtle, and relates to memory management. The R version needs to create an intermediate vector the same length as y (x - ys), and allocating memory is an expensive operation. The C++ function avoids this overhead because it uses an intermediate scalar.

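In my case, I want to count the number of items in a vector that fall below a certain threshold. In plain R, that means comparing the vector to the threshold and summing the result: R first allocates a new logical vector, as long as the input, to hold the comparison, and only then sums over it. Counting directly in C++ skips that intermediate vector, and for me it was about ten times faster.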
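Here is a minimal sketch of what such a counting loop might look like. The function name count_below is made up for illustration, and the sketch uses std::vector so that it compiles without R. With Rcpp, the same loop would presumably take an Rcpp::NumericVector and be marked with // [[Rcpp::export]], but that wiring is an assumption on my part and is not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: count the elements of x strictly below threshold.
// Only a single scalar counter is kept, so no intermediate vector of
// comparison results is ever allocated.
// Note: any comparison with NaN is false, so NaN entries are not counted,
// whereas R's sum(x < threshold) would return NA in that case.
std::size_t count_below(const std::vector<double>& x, double threshold) {
    std::size_t n = 0;
    for (double v : x) {
        if (v < threshold) {
            ++n;
        }
    }
    return n;
}
```

The point of the design is that the only extra state is the counter n, which mirrors the "intermediate scalar" in the quote above.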

Often this won’t be the bottleneck, but it may be useful at some point.

