# Transforming a Matrix in Parallel using RcppParallel

June 28, 2014
By

(This article was first published on Rcpp Gallery, and kindly contributed to R-bloggers)

The RcppParallel package includes
high level functions for doing parallel programming with Rcpp. For example,
the `parallelFor` function can be used to convert the work of a standard
RcppParallel to transform an R matrix in parallel.

### Serial Version

First a serial version of the matrix transformation. We take the square root
of each item of a matrix and return a new matrix with the tranformed values.
We do this by using `std::transform` to call the `sqrt` function on each
element of the matrix:

``````#include
using namespace Rcpp;

#include
#include

// [[Rcpp::export]]
NumericMatrix matrixSqrt(NumericMatrix orig) {

// allocate the matrix we will return
NumericMatrix mat(orig.nrow(), orig.ncol());

// transform it
std::transform(orig.begin(), orig.end(), mat.begin(), ::sqrt);

// return the new matrix
return mat;
}``````

### Parallel Version

Now we’ll adapt our code to run in parallel using the `parallelFor` function.
RcppParallel takes care of dividing up work between threads, our job is to
implement a “Worker” function object that is called by the RcppParallel
scheduler.

The `SquareRoot` function object below includes pointers to the input matrix
as well as the output matrix. Within it’s `operator()` method it performs a
`std::transform` with the `sqrt` function on the array elements specified by
the `begin` and `end` arguments:

``````// [[Rcpp::depends(RcppParallel)]]
#include
using namespace RcppParallel;

struct SquareRoot : public Worker
{
// source matrix
const RMatrix<double> input;

// destination matrix
RMatrix<double> output;

// initialize with source and destination
SquareRoot(const NumericMatrix input, NumericMatrix output)
: input(input), output(output) {}

// take the square root of the range of elements requested
void operator()(std::size_t begin, std::size_t end) {
std::transform(input.begin() + begin,
input.begin() + end,
output.begin() + begin,
::sqrt);
}
};``````

Note that `SquareRoot` derives from `RcppParallel::Worker`. This is required
for function objects passed to `parallelFor`.

Note also that we use the `RMatrix` type for accessing the matrix.
This is because this code will execute on a background thread where it’s not
safe to call R or Rcpp APIs. The `RMatrix` class is included in the
RcppParallel package and provides a lightweight, thread-safe wrapper around R
matrixes.

Here’s the parallel version of our matrix transformation function that makes
uses of the `SquareRoot` function object. The main difference is that rather
than calling `std::transform` directly, the `parallelFor` function is called
with the range to operate on (in this case based on the length of the input
matrix) and an instance of `SquareRoot`:

``````// [[Rcpp::export]]
NumericMatrix parallelMatrixSqrt(NumericMatrix x) {

// allocate the output matrix
NumericMatrix output(x.nrow(), x.ncol());

// SquareRoot functor (pass input and output matrixes)
SquareRoot squareRoot(x, output);

// call parallelFor to do the work
parallelFor(0, x.length(), squareRoot);

// return the output matrix
return output;
}``````

### Benchmarks

A comparison of the performance of the two functions shows the parallel
version performing about 2.5 times as fast on a machine with 4 cores:

``````# allocate a matrix
m <- matrix(as.numeric(c(1:1000000)), nrow = 1000, ncol = 1000)

# ensure that serial and parallel versions give the same result
stopifnot(identical(matrixSqrt(m), parallelMatrixSqrt(m)))

# compare performance of serial and parallel
library(rbenchmark)
res <- benchmark(matrixSqrt(m),
parallelMatrixSqrt(m),
order="relative")
res[,1:4]``````
```                   test replications elapsed relative
2 parallelMatrixSqrt(m)          100   0.294    1.000
1         matrixSqrt(m)          100   0.755    2.568
```

Note that performance gains will typically be 30-50% less on Windows systems
as a result of less sophisticated thread scheduling (RcppParallel does not
currently use TBB on Windows
whereas it does on the Mac and Linux).

https://github.com/RcppCore/RcppParallel.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...