# STL transform + remove_copy for subsetting

December 29, 2012

(This article was first published on Rcpp Gallery, and kindly contributed to R-bloggers)

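We have seen the use of the STL transform functions in the posts STL transform and Transforming a matrix. Here we use the same logic in conjunction with a logical (i.e. boolean) vector in order to subset an initial vector: `transform` replaces every value whose logical entry is false with a flag value, and `remove_copy` then copies everything except the flag value into the result.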

``````
#include <Rcpp.h>

using namespace Rcpp;
using namespace std;

// Sentinel that marks rejected elements. It must compare equal to itself,
// because remove_copy() matches elements with operator==.
const double flagval = __DBL_MIN__; // works
//const double flagval = NA_REAL;   // does not: NA_REAL is a NaN, and NaN == NaN is false

// simple double value 'flagging' function
inline double flag(double a, bool b) { return b ? a : flagval; }

// [[Rcpp::export]]
NumericVector subsetter(NumericVector a, LogicalVector b) {
    // We use the flag() function to mark values of 'a'
    // for which 'b' is false with the 'flagval'. The result goes
    // into a scratch vector so the caller's 'a' is not overwritten.
    NumericVector flagged(a.size());
    transform(a.begin(), a.end(), b.begin(), flagged.begin(), flag);

    // We use sugar's sum to compute how many true values to expect
    NumericVector res = NumericVector(sum(b));

    // And then copy the ones different from flagval from flagged into
    // res using the remove_copy function from the STL
    remove_copy(flagged.begin(), flagged.end(), res.begin(), flagval);
    return res;
}
``````
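
The commented-out `NA_REAL` line in the code above deserves a word. `NA_REAL` is a NaN, and `remove_copy` matches elements with `==`, which is never true for a NaN. With `NA_REAL` as the flag, nothing would be filtered out, and `remove_copy` would most likely write more elements than `res` has room for. The following is a minimal sketch in plain C++ (no Rcpp) that shows the comparison behaviour; `count_kept` is a hypothetical helper name used only for this illustration, and a quiet NaN stands in for `NA_REAL`.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <limits>
#include <vector>

// Hypothetical helper for this sketch: run std::remove_copy with the given
// flag value and report how many elements were copied to the output.
std::size_t count_kept(const std::vector<double>& v, double flagval) {
    std::vector<double> out;
    // remove_copy skips every element for which (element == flagval) is true
    std::remove_copy(v.begin(), v.end(), std::back_inserter(out), flagval);
    return out.size();
}

// Build a four-element vector in which two elements carry the flag value.
std::vector<double> flagged_sample(double flagval) {
    return std::vector<double>{1.0, flagval, 3.0, flagval};
}

// Stand-in for NA_REAL (assumption: any NaN behaves the same under ==).
inline double nan_flag() { return std::numeric_limits<double>::quiet_NaN(); }

// Portable spelling of the smallest positive normalised double.
inline double min_flag() { return std::numeric_limits<double>::min(); }
```

Under these assumptions, `count_kept(flagged_sample(nan_flag()), nan_flag())` gives 4 because no element ever compares equal to the NaN flag, whereas `count_kept(flagged_sample(min_flag()), min_flag())` gives 2. The same reasoning implies a caveat for the flag-value approach in general: an input value that happens to equal `flagval` would be dropped as well.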

We can illustrate this on a simple example or two:

``````
a <- 1:5
subsetter(a, a %% 2 == 0)
``````
```
[1] 2 4
```
``````
subsetter(a, a > 2)
``````
```
[1] 3 4 5
```
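
For readers who want to step through the two STL calls without an R session, here is a stand-alone sketch of the same flag-then-copy idea on `std::vector`. It is an illustration rather than the Rcpp code itself: `subset_sketch` and `kFlag` are hypothetical names, `std::count` stands in for sugar's `sum`, and the two input vectors are assumed to have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Sentinel marking rejected elements; plays the role of 'flagval'.
const double kFlag = std::numeric_limits<double>::min();

// Hypothetical stand-alone analogue of subsetter().
// Assumes keep.size() == a.size().
std::vector<double> subset_sketch(const std::vector<double>& a,
                                  const std::vector<bool>& keep) {
    // Step 1: the binary form of std::transform pairs a[i] with keep[i]
    // and writes either the value or the sentinel into a scratch vector.
    std::vector<double> flagged(a.size());
    std::transform(a.begin(), a.end(), keep.begin(), flagged.begin(),
                   [](double x, bool k) { return k ? x : kFlag; });

    // Step 2: size the result by counting the true entries.
    const std::size_t n_true = static_cast<std::size_t>(
        std::count(keep.begin(), keep.end(), true));
    std::vector<double> res(n_true);

    // Step 3: copy every element that is not the sentinel.
    std::remove_copy(flagged.begin(), flagged.end(), res.begin(), kFlag);
    return res;
}
```

Called with the values 1 to 5 and the mask `{false, true, false, true, false}`, the scratch vector holds the sentinel in positions 1, 3 and 5, and the result is `{2, 4}`, matching the first R example above.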

Casual benchmarking (not shown) suggests this is comparable to, and even slightly faster than, basic indexing in `R` itself.
