**Rcpp Gallery**, and kindly contributed to R-bloggers)

The RcppParallel package includes

high level functions for doing parallel programming with Rcpp. For example,

the `parallelReduce`

function can be used aggreggate values from a set of

inputs in parallel. This article describes using RcppParallel to parallelize

the inner-product

example previously posted to the Rcpp Gallery.

### Serial Version

First the serial version of computing the inner product. For this we use

a simple call to the STL `std::inner_product`

function:

`#include `
using namespace Rcpp;
#include
// [[Rcpp::export]]
double innerProduct(NumericVector x, NumericVector y) {
return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
}

### Parallel Version

Now we adapt our code to run in parallel. We’ll use the `parallelReduce`

function to do this. This function requires a “worker” function object

(defined below as `InnerProduct`

). For details on worker objects see the

parallel-vector-sum

article on the Rcpp Gallery.

```
// [[Rcpp::depends(RcppParallel)]]
#include
```
using namespace RcppParallel;
struct InnerProduct : public Worker
{
// source vectors
const RVector<double> x;
const RVector<double> y;
// product that I have accumulated
double product;
// constructors
InnerProduct(const NumericVector x, const NumericVector y)
: x(x), y(y), product(0) {}
InnerProduct(const InnerProduct& innerProduct, Split)
: x(innerProduct.x), y(innerProduct.y), product(0) {}
// process just the elements of the range I've been asked to
void operator()(std::size_t begin, std::size_t end) {
product += std::inner_product(x.begin() + begin,
x.begin() + end,
y.begin() + begin,
0.0);
}
// join my value with that of another InnerProduct
void join(const InnerProduct& rhs) {
product += rhs.product;
}
};

Note that `InnerProduct`

derives from the `RcppParallel::Worker`

class. This

is required for function objects passed to `parallelReduce`

.

Note also that we use the `RVector`

type for accessing the vector.

This is because this code will execute on a background thread where it’s not

safe to call R or Rcpp APIs. The `RVector`

class is included in the

RcppParallel package and provides a lightweight, thread-safe wrapper around R

vectors.

Now that we’ve defined the function object, implementing the parallel inner

product function is straightforward. Just initialize an instance of

`InnerProduct`

with the input vectors and call `parallelReduce`

:

```
// [[Rcpp::export]]
double parallelInnerProduct(NumericVector x, NumericVector y) {
// declare the InnerProduct instance that takes a pointer to the vector data
InnerProduct innerProduct(x, y);
// call paralleReduce to start the work
parallelReduce(0, x.length(), innerProduct);
// return the computed product
return innerProduct.product;
}
```

### Benchmarks

A comparison of the performance of the two functions shows the parallel

version performing about 2.5 times as fast on a machine with 4 cores:

```
x <- runif(1000000)
y <- runif(1000000)
library(rbenchmark)
res <- benchmark(sum(x*y),
innerProduct(x, y),
parallelInnerProduct(x, y),
order="relative")
res[,1:4]
```

test replications elapsed relative 3 parallelInnerProduct(x, y) 100 0.035 1.000 2 innerProduct(x, y) 100 0.088 2.514 1 sum(x * y) 100 0.283 8.086

Note that performance gains will typically be 30-50% less on Windows systems

as a result of less sophisticated thread scheduling (RcppParallel does not

currently use TBB on Windows

whereas it does on the Mac and Linux).

You can learn more about using RcppParallel at

https://github.com/RcppCore/RcppParallel.

**leave a comment**for the author, please follow the link and comment on their blog:

**Rcpp Gallery**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...