Can Rcpp fuse ?

July 8, 2014

(This article was first published on R Enthusiast and R/C++ hero, and kindly contributed to R-bloggers)

One of the features of Rcpp11 people seemed to like during useR is the fuse function. fuse is somewhat similar to the c function in R.

For the purpose of this post, let's simplify what fuse does, and just say that it takes several compatible vectors and fuses them together into one.

// some vectors
NumericVector a = {1.0, 2.3},  
              b = {2.5, 5.7}, 
              c = {4.2, 4.1, 1.4};

// fuse them into a vector of size 7
NumericVector d = fuse(a, b, c) ;  

One nice thing is that it also handles scalars, e.g.:

// some vectors
NumericVector a = {1.0, 2.3},  
              c = {4.2, 4.1, 1.4};

// fuse them into a vector of size 6
NumericVector d = fuse(a, 3.5, c) ;  

So people seem to like that and I don't blame them, that's a cool feature. Then what usually happens is that I get asked:

That's a cool feature you have in Rcpp11. Can I use it in Rcpp too ?

I have mixed feelings about that kind of question. It is nice that people acknowledge that this is a nice feature and that they want to use it, but at the same time I'd like people to stop thinking that Rcpp11 is some kind of lab for things that will later go into Rcpp. Some might, but it is unlikely that I will participate in that effort. I'd rather help people migrate to Rcpp11.

Conceptually, fuse is really simple. For each argument, it figures out whether it is a scalar or a vector, adds up the sizes to get the length of the output vector, creates that vector, and then loops over the inputs to copy their data into it. So fuse is just a glorified set of for loops.

In Rcpp11, because we can assume C++11, this can be written using variadic templates, i.e. template functions taking a variable number of arguments. Rcpp has to remain compatible with C++98, so if we wanted to port fuse to Rcpp, we would have to ship a C++98 compatible version: one overload for 2 parameters, one for 3 parameters, one for 4 parameters, ...
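
To make the C++11 side concrete, here is a minimal sketch of the idea using variadic templates. It is not the Rcpp11 implementation: it uses std::vector<double> as a stand-in for NumericVector, and the names fuse_sketch, piece_size, append_piece, total_size and append_all are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers (not Rcpp11 code).
// Size contributed by one argument: 1 for a scalar, size() for a vector.
inline std::size_t piece_size(double) { return 1; }
inline std::size_t piece_size(const std::vector<double>& v) { return v.size(); }

// Copy one argument to the end of the output.
inline void append_piece(std::vector<double>& out, double x) { out.push_back(x); }
inline void append_piece(std::vector<double>& out, const std::vector<double>& v) {
    for (double x : v) out.push_back(x);
}

// First pass: add up the sizes of all arguments.
inline std::size_t total_size() { return 0; }
template <typename First, typename... Rest>
std::size_t total_size(const First& first, const Rest&... rest) {
    return piece_size(first) + total_size(rest...);
}

// Second pass: copy every argument, in order.
inline void append_all(std::vector<double>&) {}
template <typename First, typename... Rest>
void append_all(std::vector<double>& out, const First& first, const Rest&... rest) {
    append_piece(out, first);
    append_all(out, rest...);
}

// One template covers any number of arguments.
template <typename... Args>
std::vector<double> fuse_sketch(const Args&... args) {
    std::vector<double> out;
    out.reserve(total_size(args...));           // size the output once
    append_all(out, args...);                   // the "glorified for loops"
    assert(out.size() == total_size(args...));  // both passes must agree
    return out;
}
```

With this, fuse_sketch(a, 3.5, c) on a two-element a and a three-element c gives a vector of size 6, and any number of arguments works without writing a single extra overload.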

This brings at least two problems to the table:

  • Code bloat. This would be one of those features that add thousands of lines of code. This is hard to maintain and slows the compiler down yet again.

  • Where to stop ? NumericVector::create arbitrarily stops at 20 parameters. Of course fusing 20 vectors is going to be enough for most use cases, but on occasion people will feel the desire to go beyond, and the compiler will welcome them with lovely multi-page errors.
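
For contrast, this is roughly what the C++98 route looks like. Again, this is only a sketch, with std::vector<double> standing in for NumericVector and hypothetical names (Vec, fuse98, append98). It handles vectors only, so supporting scalars in any position would need templated parameters or yet more overloads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;  // stand-in for NumericVector (assumption)

// Hypothetical helper: copy one input to the end of the output.
inline void append98(Vec& out, const Vec& v) {
    assert(&out != &v);  // a vector cannot insert its own range into itself
    out.insert(out.end(), v.begin(), v.end());
}

// Overload for 2 parameters
inline Vec fuse98(const Vec& a, const Vec& b) {
    Vec out;
    out.reserve(a.size() + b.size());
    append98(out, a);
    append98(out, b);
    return out;
}

// Overload for 3 parameters
inline Vec fuse98(const Vec& a, const Vec& b, const Vec& c) {
    Vec out;
    out.reserve(a.size() + b.size() + c.size());
    append98(out, a);
    append98(out, b);
    append98(out, c);
    return out;
}

// Overload for 4 parameters
inline Vec fuse98(const Vec& a, const Vec& b, const Vec& c, const Vec& d) {
    Vec out;
    out.reserve(a.size() + b.size() + c.size() + d.size());
    append98(out, a);
    append98(out, b);
    append98(out, c);
    append98(out, d);
    return out;
}

// ... and so on, one overload per parameter count, up to some arbitrary limit.
```

Calling fuse98 with five vectors simply fails to compile, which is the "where to stop" problem in miniature.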
