[This article was first published on A second megabyte of memory, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R is an Open Source project providing an interactive language and environment for statistical computing.  It has become the lingua franca for research in statistical methods.  Because R is an interpreted language it is comparatively easy to develop applications (called packages) for it.  However, the resulting code can sometimes be slow, which is a problem in compute-bound methods like Markov chain Monte Carlo

From its inception R, and its predecessor S, have allowed calls to compiled code written in C or Fortran but such programming is not for the faint-hearted.  There are many ways in which you can trip yourself up.  Over the past several years Dirk Eddelbuettel and Romain Francois (there should be a cedilla under the c but I don’t know how to create one) developed a package called Rcpp that provides C++ classes and methods for encapsulating R objects.  I recently started using these in the lme4 package for mixed-effects models that I develop with Martin Maechler and Ben Bolker.

High-performance numerical linear algebra is particularly important in the mixed-effects models calculations which use both dense and sparse matrices.  I ran across a wonderful linear algebra system called Eigen that is a templated library of C++ classes and methods and, with Dirk and Romain, wrote the RcppEigen package to interface with R.

This posting is to show an example of code that can be made to run much faster using RcppEigen than in the original R code.

The example, from Dongjun Chung, requires sampling from a collection of multinomial random variables as part of an iterative estimation method for parameter estimation in his R package for the analysis of ChIP-sequencing data.  There are generally a small number of categories – 5 is typical – and a relatively large number (say 10,000) instances that are available as a 10000 by 5 matrix of non-negative elements whose rows sum to 1.  At each iteration a sample of size 10000 consisting of indices in the range 1 to 5 is to be generated from the current matrix of probabilities.  Dongjun wrote an R function for this

Rsamp <- function(X) {
stopifnot(is.numeric(X <- as.matrix(X)),
(nc <- ncol(X)) > 1L,
all(X >= 0))
apply(X, 1L, function(x) sample(nc, size=1L, replace=FALSE, prob=x+1e-10))
}
which is careful R code (e.g. using apply instead of running a loop) but, even so, this function is the bottleneck in the method.

A method using RcppEigen requires a similar R function

RcppSamp <- function(X) {
stopifnot(is.numeric(X <- as.matrix(X)),
(nc <- ncol(X)) > 1L,
all(X >= 0))
.Call(CppSamp, X)
}

and a C++ function

SEXP CppSamp(SEXP X_) {
typedef Eigen::ArrayXd   Ar1;
typedef Eigen::ArrayXXd  Ar2;
typedef Eigen::Map MAr2;

const MAr2 X(as(X_));
int m(X.rows()), n(X.cols()), nm1(n – 1);
Ar1     samp(m);
RNGScope sc; // Handle GetRNGstate()/PutRNGstate()
for (int i=0; i < m; ++i) {
Ar1 ri(X.row(i));
std::partial_sum(ri.data(), ri.data() + n, ri.data())
ri /= ri[nm1];    // normalize to sum to 1
samp[i] = (ri < ::unif_rand()).count() + 1;
}
return wrap(samp);
}

The general idea is that the Eigen::ArrayXd class is a one-dimensional array of doubles and the Eigen::ArrayXXd class is a two-dimensional array of doubles.  Operations on array classes are component-wise operations or reductions.  There are corresponding classes Eigen::VectorXd and Eigen::MatrixXd that provide linear algebra operations.  A Eigen::Map of another structure has the corresponding structure but takes a pointer to the storage instead of allocating its own storage.  One of the idioms of writing with RcppEigen is to create const Eigen::Map objects from the input arguments, thus avoiding unnecessary copying.  The as templated function and the wrap function are part of RcppEigen that generalizes the methods in Rcpp for converting back and forth between the R objects and the C++ classes.

Here the approach is to find the cumulative sums in each row, normalize these sums by dividing by the last element and comparing them to a draw from the standard uniform distribution.  The number of elements less than the uniform variate is the 0-based index for the result and we add 1 to get the 1-based index.

Creating the RcppEigen code is more work than the pure R code but not orders of magnitude more work.  You can operate on entire arrays as shown here and, best of all, you don’t need to worry about protecting storage from the garbage collector.  And the RcppEigen code is much faster

> set.seed(1234321)
> X <- matrix(runif(100000 * 5), ncol=5)
> benchmark(Rsamp(X), RcppSamp(X), replications=5,
+           columns=c(“test”, “elapsed”, “relative”, “user.self”))
test elapsed relative user.self
2 RcppSamp(X)   0.058        1      0.06
1    Rsamp(X)   5.162       89      5.12

Update: I have corrected the spelling of Dongjun Chung’s name.  My apologies for mis-spelling it.

Update: The next posting discusses a Julia implementation of this function.  An initial version was somewhat slower than the RcppEigen version but then Stefan Karpinsky got to work and created an implementation that was over twice as fast as the RcppEigen version.  In what may seem like a bizarre approach to an R programmer, the trick is to de-vectorize the code.

To leave a comment for the author, please follow the link and comment on their blog: A second megabyte of memory.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts.(You will not see this message again.)

Click here to close (This popup will not appear again)