# Another nice Rcpp example

April 23, 2011
By

(This article was first published on Thinking inside the box , and kindly contributed to R-bloggers)

While preparing my slides for the Rcpp workshop this Thursday,
So I posed a quick question on the rcpp-devel list.

And I received a few friendly answers. My favourite, so far, was a suggestion by Lance
Bachmeier
who sent me a short script which used both R and C++ (via Rcpp) to
simulate a first-order vector autoregressive process (and he ensured me that it worked well enough on his graduate students). It is indeed
a great example as it involves (simple) matrix multiplication in an iterative
fashion. Which makes it a great example not only for Rcpp but also for our
Armadillo C++ templated library for linear algebra and more). And at the same time, we can also add another look
at the new and shiny R compiler I also blogged about recently.

So Lance and I iterated over this a little more over email, and I now added this as a new (and initial) example file in the
RcppArmadillo has been sitting in incoming at
CRAN since earlier in the week while the archive maintainer takes a well-deserved vacation. It
should hit the public archive within a few days, and is otherwise available too from
my site.)

So let’s walk through the example:

```R> ## parameter and error terms used throughout
R> a <- matrix(c(0.5,0.1,0.1,0.5),nrow=2)
R> e <- matrix(rnorm(10000),ncol=2)
R> rSim <- function(coeff, errors) {
+   simdata <- matrix(0, nrow(errors), ncol(errors))
+   for (row in 2:nrow(errors)) {
+     simdata[row,] = coeff %*% simdata[(row-1),] + errors[row,]
+   }
+   return(simdata)
+ }
R> rData <- rSim(a, e)                     # generated by R
```

This starts with a simple enough loop. After skipping the first row, each iteration multiplies the previous row with the parameters and adds
error terms.

We can then turn to the R compiler:

```R> ## Now let's load the R compiler (requires R 2.13 or later)
R> suppressMessages(require(compiler))
R> compRsim <- cmpfun(rSim)
R> compRData <- compRsim(a,e)              # generated by R 'compiled'
R> stopifnot(all.equal(rData, compRData))  # checking results
```

Nice and easy: We load the compiler package, create a compiled function and use it. We check the results and surely
enough find them to be identical.

With that, time to turn to C++ using Armadillo via

```R> ## Now load 'inline' to compile C++ code on the fly
R> suppressMessages(require(inline))
R> code <- '
+   arma::mat coeff = Rcpp::as<arma::mat>(a);
+   arma::mat errors = Rcpp::as<arma::mat>(e);
+   int m = errors.n_rows; int n = errors.n_cols;
+   arma::mat simdata(m,n);
+   simdata.row(0) = arma::zeros<arma::mat>(1,n);
+   for (int row=1; row<m; row++) {
+     simdata.row(row) = simdata.row(row-1)*trans(coeff)+errors.row(row);
+   }
+   return Rcpp::wrap(simdata);
+ '
R> ## create the compiled function
R> rcppSim <- cxxfunction(signature(a="numeric",e="numeric"),
R> rcppData <- rcppSim(a,e)                # generated by C++ code
R> stopifnot(all.equal(rData, rcppData))   # checking results
```

short C++ function in the `code` variable, declare a signature taking `a` and `e` as before and ask
`cxxfunction()` to deploy the plugin for RcppArmadillo so
that it and Rcpp are found during build. With that, we have a compiled function
to generate data, and we once again check the result. The C++ code is pretty straightforward as well. We can instatiate Armadillo matrices
directly from the R objects we pass down; we then run a similar loop building the result row by row.

Now, with all the build-up, here is the final timing comparison, using the
rbenchmark package:

```R> ## now load the rbenchmark package and compare all three
R> suppressMessages(library(rbenchmark))
R> res <- benchmark(rcppSim(a,e),
+                  rSim(a,e),
+                  compRsim(a,e),
+                  columns=c("test", "replications", "elapsed",
+                            "relative", "user.self", "sys.self"),
+                  order="relative")
R> print(res)
test replications elapsed relative user.self sys.self
1  rcppSim(a, e)          100   0.038   1.0000      0.04        0
3 compRsim(a, e)          100   2.011  52.9211      2.01        0
2     rSim(a, e)          100   4.148 109.1579      4.14        0
```

So in a real-world example involving looping and some algebra (which is of course already done by BLAS and LAPACK libraries), the new R
compiler improves by more than a factor of two, cutting time from 4.14 seconds down to about 2 seconds. Yet, this still leaves the C++
solution, clocking in at a mere 38 milliseconds, ahead by a factor of over fifty relative to the new R compiler

And compared to just R itself, the simple solution involving
Rcpp and
RcppArmadillo is almost 110 times faster. As I mentioned, I quite like
this example ;-).

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...