We can leverage small parts of R’s C API to infer the type of objects at runtime, inside a function call, and use this information to dynamically wrap objects as needed. We’ll also present an example of recursing through a list.
To get a basic familiarity with the main functions exported from the R API, I recommend first reading Hadley’s guide to R’s C internals, as we will be using some of these functions for navigating native R SEXPs. (Reading it will also give you an appreciation for just how much work Rcpp does in insulating us from the ugliness of the R API.)
From the R API, we’ll be using the TYPEOF macro, as well as referencing the internal R types:
- REALSXP for numeric vectors,
- INTSXP for integer vectors,
- VECSXP for lists.
We’ll start with a simple example: an Rcpp function that takes a list, loops through it, and:
- if we encounter a numeric vector, double each element in it;
- if we encounter an integer vector, add 1 to each element in it
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
List do_stuff( List x_ ) {
  List x = clone(x_);
  for( List::iterator it = x.begin(); it != x.end(); ++it ) {
    switch( TYPEOF(*it) ) {
      case REALSXP: {
        NumericVector tmp = as<NumericVector>(*it);
        tmp = tmp * 2;
        break;
      }
      case INTSXP: {
        if( Rf_isFactor(*it) ) break; // factors have internal type INTSXP too
        IntegerVector tmp = as<IntegerVector>(*it);
        tmp = tmp + 1;
        break;
      }
      default: {
        stop("incompatible SEXP encountered; only accepts lists with REALSXPs and INTSXPs");
      }
    }
  }
  return x;
}
A quick test:
dat <- list(
  1:5,             ## integer
  as.numeric(1:5)  ## numeric
)
tmp <- do_stuff(dat)
print(tmp)
[[1]]
[1] 2 3 4 5 6
[[2]]
[1] 2 4 6 8 10
Some notes on the above:
- We clone the list passed in to ensure we work with a copy rather than the caller’s original list.
- We switch over the internal R type using TYPEOF, handling numeric vectors (REALSXP) and integer vectors (INTSXP). Factors also have internal type INTSXP, so we check for them and leave them untouched.
- After we’ve figured out what kind of object we have, we can use Rcpp::as to wrap the R object with the appropriate container.
- Because Rcpp’s wrappers point to the internal R structures, any changes made to them are reflected in the R object wrapped.
- We use Rcpp sugar to easily add to and multiply each element in a vector.
- We throw an error if a non-numeric / non-integer object is encountered. One could instead leave the default: case empty so that it does nothing, or handle other SEXPs as needed.
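The interplay between clone and the reference-like wrappers can be illustrated without R at all. The sketch below is a plain C++ analogy, not Rcpp code: Wrapper, deep_clone and demo are hypothetical names invented for illustration, and a std::shared_ptr stands in for the shared R object. It assumes only that copies of a wrapper share storage, which is the behaviour the notes above describe.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vector wrapper: copies of a Wrapper share the
// same underlying storage, much like two wrappers built around one R object.
struct Wrapper {
    std::shared_ptr<std::vector<double>> data;

    explicit Wrapper(std::vector<double> values)
        : data(std::make_shared<std::vector<double>>(std::move(values))) {}

    // Element-wise scaling, written directly into the shared storage.
    void scale(double factor) {
        assert(data != nullptr);
        for (double& v : *data) v *= factor;
    }
};

// Deep copy, playing the role of clone: the result owns fresh storage.
Wrapper deep_clone(const Wrapper& w) {
    return Wrapper(*w.data);
}

// Each assertion documents the behaviour this sketch is meant to show.
void demo() {
    Wrapper original(std::vector<double>{1.0, 2.0, 3.0});

    Wrapper alias = original;  // shares storage with original
    alias.scale(2.0);          // the change is visible through original
    assert((*original.data)[0] == 2.0);

    Wrapper copy = deep_clone(original);
    copy.scale(2.0);           // only the copy changes
    assert((*original.data)[0] == 2.0);
    assert((*copy.data)[0] == 4.0);
}
```

In the Rcpp function above, clone plays the part of deep_clone, which is why the list passed in by the caller is left untouched.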
We also check that we fail gracefully when we encounter a non-accepted SEXP:
do_stuff( list(new.env()) )
Error: incompatible SEXP encountered; only accepts lists with REALSXPs and INTSXPs
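The error above arises because Rcpp::stop throws a C++ exception, which the code Rcpp generates around an exported function turns into an R error. The same reject-by-default pattern can be sketched in standalone C++. This is an analogy rather than Rcpp code: the Tag enum, handle, try_handle and demo are hypothetical names, and Tag merely stands in for R's type codes.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical type codes standing in for REALSXP, INTSXP and an
// unsupported type such as an environment.
enum class Tag { Real, Integer, Environment };

// Describe the action for each supported tag; reject anything else.
std::string handle(Tag tag) {
    switch (tag) {
        case Tag::Real:
            return "double each element";
        case Tag::Integer:
            return "add 1 to each element";
        default:
            // Plays the role of stop() in the Rcpp example.
            throw std::invalid_argument("incompatible type encountered");
    }
}

// Catch the exception at the boundary and report it as a message,
// so an unsupported type produces an error rather than a crash.
std::string try_handle(Tag tag) {
    try {
        return handle(tag);
    } catch (const std::invalid_argument& e) {
        return std::string("Error: ") + e.what();
    }
}

// Each assertion documents the expected behaviour.
void demo() {
    assert(try_handle(Tag::Real) == "double each element");
    assert(try_handle(Tag::Integer) == "add 1 to each element");
    assert(try_handle(Tag::Environment) == "Error: incompatible type encountered");
}
```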
However, this only operates on top-level objects within the list. What if your list contains other lists, and you want to recurse through those lists as well?
It’s actually quite simple: if the internal R type of the object encountered is a VECSXP, then we just call our recursive function on that element itself!
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
List recurse(List x_) {
  List x = clone(x_);
  for( List::iterator it = x.begin(); it != x.end(); ++it ) {
    switch( TYPEOF(*it) ) {
      case VECSXP: {
        *it = recurse(*it);
        break;
      }
      case REALSXP: {
        NumericVector tmp = as<NumericVector>(*it);
        tmp = tmp * 2;
        break;
      }
      case INTSXP: {
        if( Rf_isFactor(*it) ) break; // factors have internal type INTSXP too
        IntegerVector tmp = as<IntegerVector>(*it);
        tmp = tmp + 1;
        break;
      }
      default: {
        stop("incompatible SEXP encountered; only accepts lists containing lists, REALSXPs, and INTSXPs");
      }
    }
  }
  return x;
}
A test case:
dat <- list(
  x=1:5,              ## integer
  y=as.numeric(1:5),  ## numeric
  z=list(             ## another list to recurse into
    zx=10L,           ## integer
    zy=20             ## numeric
  )
)
out <- recurse(dat)
print(out)
$x
[1] 2 3 4 5 6
$y
[1] 2 4 6 8 10
$z
$z$zx
[1] 11
$z$zy
[1] 40
Note that all we had to do was add a VECSXP case in our switch statement. If we see a list, we call the same recurse function on that list, and then re-assign the result of that recursive call. Neat!
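For readers who want to experiment with the recursion pattern outside of R, here is a minimal standalone C++ sketch of the same idea. It is an analogy under stated assumptions, not Rcpp code: the Tag enum and Element struct are hypothetical stand-ins for R's type codes and list elements, and this recurse only mirrors the structure of the Rcpp function above. The data in demo reproduces the test case shown earlier.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type codes standing in for REALSXP, INTSXP and VECSXP.
enum class Tag { Real, Integer, List };

// Hypothetical stand-in for one list element; only the member that
// matches `tag` is meaningful.
struct Element {
    Tag tag;
    std::vector<double> reals;      // used when tag == Tag::Real
    std::vector<int> integers;      // used when tag == Tag::Integer
    std::vector<Element> children;  // used when tag == Tag::List
};

// The list is taken by value, so the caller's data is left untouched
// (the role clone plays in the Rcpp version). Nested lists are handled
// by calling the same function and re-assigning the result.
std::vector<Element> recurse(std::vector<Element> items) {
    for (Element& item : items) {
        switch (item.tag) {
            case Tag::List:
                item.children = recurse(item.children);
                break;
            case Tag::Real:
                for (double& v : item.reals) v *= 2;
                break;
            case Tag::Integer:
                for (int& v : item.integers) v += 1;
                break;
        }
    }
    return items;
}

// Mirrors the R test case: x = 1:5, y = as.numeric(1:5), z = list(10L, 20).
void demo() {
    std::vector<Element> dat = {
        Element{Tag::Integer, {}, {1, 2, 3, 4, 5}, {}},
        Element{Tag::Real, {1.0, 2.0, 3.0, 4.0, 5.0}, {}, {}},
        Element{Tag::List, {}, {}, {
            Element{Tag::Integer, {}, {10}, {}},
            Element{Tag::Real, {20.0}, {}, {}},
        }},
    };

    std::vector<Element> out = recurse(dat);

    assert(out[0].integers[0] == 2 && out[0].integers[4] == 6);
    assert(out[1].reals[0] == 2.0 && out[1].reals[4] == 10.0);
    assert(out[2].children[0].integers[0] == 11);
    assert(out[2].children[1].reals[0] == 40.0);
    assert(dat[0].integers[0] == 1);  // the input is unchanged
}
```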
Hence, by using TYPEOF to query the internal R type of each object before wrapping it, we can choose an appropriate container for it and then use Rcpp / C++ code as necessary to modify it.
