(This article was first published on **A second megabyte of memory**, and kindly contributed to R-bloggers.)

**R** is an Open Source project providing an interactive language and environment for statistical computing. It has become the *lingua franca* for research in statistical methods. Because **R** is an interpreted language it is comparatively easy to develop applications (called *packages*) for it. However, the resulting code can sometimes be slow, which is a problem in compute-bound methods like Markov chain Monte Carlo.

From its inception **R**, and its predecessor **S**, have allowed calls to compiled code written in **C** or **Fortran**, but such programming is not for the faint-hearted. There are many ways in which you can trip yourself up. Over the past several years Dirk Eddelbuettel and Romain François developed a package called *Rcpp* that provides **C++** classes and methods for encapsulating **R** objects. I recently started using these in the *lme4* package for mixed-effects models that I develop with Martin Maechler and Ben Bolker. High-performance numerical linear algebra is particularly important in the mixed-effects models calculations, which use both dense and sparse matrices. I ran across a wonderful linear algebra system called Eigen that is a templated library of **C++** classes and methods and, with Dirk and Romain, wrote the *RcppEigen* package to interface with **R**.

This posting shows an example of code that can be made to run much faster using *RcppEigen* than in the original **R** code. The example, from Dongjun Chung, requires sampling from a collection of multinomial random variables as part of an iterative method for parameter estimation in his **R** package for the analysis of ChIP-sequencing data. There are generally a small number of categories (5 is typical) and a relatively large number (say 10,000) of instances, available as a 10000 by 5 matrix of non-negative elements whose rows sum to 1. At each iteration a sample of size 10000, consisting of indices in the range 1 to 5, is to be generated from the current matrix of probabilities. Dongjun wrote an **R** function for this:

```r
Rsamp <- function(X) {
    stopifnot(is.numeric(X <- as.matrix(X)),
              (nc <- ncol(X)) > 1L,
              all(X >= 0))
    apply(X, 1L, function(x) sample(nc, size=1L, replace=FALSE, prob=x+1e-10))
}
```

which is careful **R** code (e.g. using apply instead of running a loop) but, even so, this function is the bottleneck in the method. A method using *RcppEigen* requires a similar **R** function

```r
RcppSamp <- function(X) {
    stopifnot(is.numeric(X <- as.matrix(X)),
              (nc <- ncol(X)) > 1L,
              all(X >= 0))
    .Call(CppSamp, X)
}
```

and a **C++** function

```cpp
SEXP CppSamp(SEXP X_) {
    typedef Eigen::ArrayXd  Ar1;
    typedef Eigen::ArrayXXd Ar2;
    typedef Eigen::Map<Ar2> MAr2;
    const MAr2 X(as<MAr2>(X_));
    int m(X.rows()), n(X.cols()), nm1(n - 1);
    Ar1 samp(m);
    RNGScope sc;                // Handles GetRNGstate()/PutRNGstate()
    for (int i = 0; i < m; ++i) {
        Ar1 ri(X.row(i));
        std::partial_sum(ri.data(), ri.data() + n, ri.data());
        ri /= ri[nm1];          // normalize so the cumulative sums end at 1
        samp[i] = (ri < ::unif_rand()).count() + 1;
    }
    return wrap(samp);
}
```

The general idea is that the Eigen::ArrayXd class is a one-dimensional array of doubles and the Eigen::ArrayXXd class is a two-dimensional array of doubles. Operations on array classes are component-wise operations or reductions. There are corresponding classes Eigen::VectorXd and Eigen::MatrixXd that provide linear algebra operations. An Eigen::Map of another structure has the corresponding structure but takes a pointer to the storage instead of allocating its own storage. One of the idioms of writing with *RcppEigen* is to create *const Eigen::Map* objects from the input arguments, thus avoiding unnecessary copying. The templated *as* function and the *wrap* function are part of *RcppEigen*, generalizing the methods in *Rcpp* for converting back and forth between **R** objects and **C++** classes.

Here the approach is to form the cumulative sums in each row, normalize them by dividing by the last element, and compare them to a draw from the standard uniform distribution. The number of cumulative sums less than the uniform variate is the 0-based index for the result, and we add 1 to get the 1-based index.

Creating the *RcppEigen* code is more work than the pure **R** code, but not orders of magnitude more work. You can operate on entire arrays, as shown here, and, best of all, you don't need to worry about protecting storage from the garbage collector. And the *RcppEigen* code is much faster:

```r
> set.seed(1234321)
> X <- matrix(runif(100000 * 5), ncol=5)
> benchmark(Rsamp(X), RcppSamp(X), replications=5,
+           columns=c("test", "elapsed", "relative", "user.self"))
         test elapsed relative user.self
2 RcppSamp(X)   0.058        1      0.06
1    Rsamp(X)   5.162       89      5.12
```

Update: I have corrected the spelling of Dongjun Chung’s name. My apologies for mis-spelling it.

Update: The next posting discusses a Julia implementation of this function. An initial version was somewhat slower than the RcppEigen version, but then Stefan Karpinski got to work and created an implementation that was over twice as fast as the RcppEigen version. In what may seem like a bizarre approach to an R programmer, the trick is to de-vectorize the code.
