Introduction to RcppNT2

January 31, 2016

(This article was first published on Rcpp Gallery, and kindly contributed to R-bloggers)

Modern CPUs are built with extended instruction sets that
optimize certain operations. One class of these, called
Single Instruction / Multiple Data (SIMD) instructions,
allows for vectorized operations: a single instruction
operates on several values at once.
Although modern compilers will use these instructions when
possible, they are often unable to prove that a particular
block of code can safely be executed using SIMD
instructions.
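
One common reason, offered here as an illustration rather than something taken from the original article: floating-point addition is not associative, so a compiler that follows strict IEEE semantics generally may not split a sum into independent lanes on its own. The sketch below is plain C++ with hypothetical names (`scalar_sum`, `four_lane_sum`) that contrasts the two orderings.

```cpp
#include <cstddef>
#include <vector>

// Straightforward scalar reduction: additions happen strictly left to right.
double scalar_sum(const std::vector<double>& x) {
  double total = 0.0;
  for (double value : x) total += value;
  return total;
}

// The same reduction with four independent accumulators. This mimics the
// order of operations a 4-wide SIMD sum would use (an assumption made for
// illustration; real hardware widths vary).
double four_lane_sum(const std::vector<double>& x) {
  double lanes[4] = {0.0, 0.0, 0.0, 0.0};
  const std::size_t n = x.size();
  const std::size_t packed = n - n % 4;

  // Each lane accumulates every fourth element.
  for (std::size_t i = 0; i < packed; i += 4)
    for (std::size_t j = 0; j < 4; ++j)
      lanes[j] += x[i + j];

  // Combine the lanes, then add any leftover elements one by one.
  double total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
  for (std::size_t i = packed; i < n; ++i) total += x[i];
  return total;
}
```

For small integer-valued inputs both functions agree exactly. For general inputs they can differ in the last few bits, which is why this kind of reordering usually has to be requested explicitly.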

The Numerical Template Toolbox (NT2)
is a collection of header-only C++ libraries that make it
possible to explicitly request the use of SIMD instructions
where they are available, while falling back to regular scalar
operations where they are not. NT2 itself is powered
by Boost, alongside two proposed
Boost libraries – Boost.Dispatch, which provides a
mechanism for efficient tag-based dispatch for functions,
and Boost.SIMD, which provides a framework for the
implementation of algorithms that take advantage of SIMD
instructions. RcppNT2
wraps and exposes these libraries for use with R.

The primary abstraction that Boost.SIMD uses under the
hood is the boost::simd::pack<> data structure. A pack
represents a small, contiguous group of numeric values
(e.g. doubles), and comes with a host of functions that
facilitate the use of SIMD operations on those values when
possible. Although you don’t need to know the details to use
the high-level functionality provided by Boost.SIMD, it’s
useful for understanding what happens behind the scenes.
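
To make the idea of a pack concrete, here is a minimal sketch of a pack-like type in plain C++. The name `toy_pack` and its members are hypothetical and are not part of Boost.SIMD; a real pack maps onto hardware registers, whereas this stand-in only shows the shape of the abstraction (a fixed number of contiguous lanes with element-wise arithmetic).

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a SIMD pack: N contiguous values of type T
// that are operated on together. Not the Boost.SIMD implementation.
template <typename T, std::size_t N>
struct toy_pack {
  std::array<T, N> lanes;

  // Load N contiguous values starting at 'source'.
  static toy_pack load(const T* source) {
    toy_pack result{};
    for (std::size_t i = 0; i < N; ++i) result.lanes[i] = source[i];
    return result;
  }
};

// Element-wise addition: lane i of the result is lhs lane i plus rhs lane i.
template <typename T, std::size_t N>
toy_pack<T, N> operator+(const toy_pack<T, N>& lhs,
                         const toy_pack<T, N>& rhs) {
  toy_pack<T, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result.lanes[i] = lhs.lanes[i] + rhs.lanes[i];
  return result;
}
```

Because `operator+` is defined for the pack type, generic code written as `lhs + rhs` works unchanged for both a single double and a pack of doubles.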

Here’s a quick example of how we might compute the sum of
elements in a vector, using NT2.

// [[Rcpp::depends(RcppNT2)]]
#include <RcppNT2.h>
using namespace RcppNT2;

#include <Rcpp.h>
using namespace Rcpp;

// Define a functor -- a C++ class which defines a templated
// 'function call' operator -- to perform the addition of 
// two pieces of data.
struct add_two {
  template <typename T>
  T operator()(const T& lhs, const T& rhs) {
    return lhs + rhs;
  }
};

// [[Rcpp::export]]
double simd_sum(NumericVector x) {
  // Pass the functor to 'simdReduce()'. This is an
  // algorithm provided by RcppNT2, which makes it
  // easy to apply nt2-style functor definitions
  // across a range of data.
  return simdReduce(x.begin(), x.end(), 0.0, add_two());
}

Behind the scenes, simdReduce() takes care of iteration
over the provided sequence, and ensures that we use optimized SIMD
instructions over packs of numbers when possible, and scalar
instructions when not. By passing a templated functor,
simdReduce() can automatically choose the correct template
instantiation depending on whether it’s working with a pack
or not. In other words, two instantiations of operator() will be
generated in this case: one with T = double, and another
with T = boost::simd::pack<double>.
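
The following plain C++ sketch shows how one templated functor can serve both a pack phase and a scalar phase. It is not RcppNT2's actual implementation: `pack4`, `load4`, and `toy_reduce` are hypothetical names, the pack width of four is an assumption, and the sketch assumes the functor is associative and commutative (as addition is, up to rounding).

```cpp
#include <cstddef>

// Hypothetical 4-wide pack of doubles, standing in for
// boost::simd::pack<double>.
struct pack4 {
  double lanes[4];
};

// Load four contiguous doubles into a pack.
pack4 load4(const double* source) {
  pack4 out;
  for (std::size_t i = 0; i < 4; ++i) out.lanes[i] = source[i];
  return out;
}

// Element-wise addition of two packs.
pack4 operator+(const pack4& lhs, const pack4& rhs) {
  pack4 out;
  for (std::size_t i = 0; i < 4; ++i)
    out.lanes[i] = lhs.lanes[i] + rhs.lanes[i];
  return out;
}

// One template; the compiler instantiates it for each T it is called with.
struct add_two {
  template <typename T>
  T operator()(const T& lhs, const T& rhs) const {
    return lhs + rhs;
  }
};

// Sketch of a pack-then-scalar reduction over [first, last).
template <typename F>
double toy_reduce(const double* first, const double* last, double init, F f) {
  const std::size_t n = static_cast<std::size_t>(last - first);
  const std::size_t packed = n - n % 4;
  double result = init;

  if (packed > 0) {
    // Pack phase: f is instantiated with T = pack4.
    pack4 acc = load4(first);
    for (std::size_t i = 4; i < packed; i += 4)
      acc = f(acc, load4(first + i));

    // Fold the lanes into the scalar result: f with T = double.
    for (std::size_t j = 0; j < 4; ++j) result = f(result, acc.lanes[j]);
  }

  // Scalar phase for the leftover elements: f with T = double.
  for (std::size_t i = packed; i < n; ++i) result = f(result, first[i]);
  return result;
}
```

Calling `toy_reduce(data, data + n, 0.0, add_two())` on an array of `n` doubles exercises both instantiations whenever `n` is at least four.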

Let’s confirm that this produces the correct output, and run
a small benchmark.

# helper function for printing microbenchmark output
printBm <- function(bm) {
  summary <- summary(bm)
  print(summary[, 1:7], row.names = FALSE)
}

# generate some data
data <- rnorm(1024 * 1000)

# verify that it produces the correct sum
all.equal(simd_sum(data), sum(data))
[1] TRUE
# compare results
library(microbenchmark)
bm <- microbenchmark(sum(data), simd_sum(data))
printBm(bm)
           expr     min       lq      mean    median       uq      max
      sum(data) 894.451 943.4145 1033.5598 1020.5000 1071.327 1429.533
 simd_sum(data) 280.585 293.6315  316.6797  307.8795  314.429  574.050

We get a noticeable gain by taking advantage of SIMD
instructions here: simd_sum() is roughly three times faster
than sum() at the median. However, it’s worth noting that
simd_sum() does not handle NA and NaN with the same
granularity as R.
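
As a rough illustration of why this matters: to C++ arithmetic, R's NA for doubles is just another NaN. The sketch below assumes that R encodes NA_real_ as a NaN whose low 32 bits hold the value 1954; the helper names (`from_bits`, `r_style_na`, `plain_sum`) are hypothetical. A plain sum simply propagates "not a number" without any special treatment of NA.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Build a double from a raw 64-bit pattern.
double from_bits(std::uint64_t bits) {
  double value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}

// A NaN carrying 1954 in its low 32 bits, which is assumed here to be
// how R marks NA_real_.
double r_style_na() {
  return from_bits(0x7FF00000000007A2ULL);
}

// Plain left-to-right sum with no NA or NaN checks.
double plain_sum(const double* first, const double* last) {
  double total = 0.0;
  for (const double* it = first; it != last; ++it) total += *it;
  return total;
}
```

Summing an array that contains either `r_style_na()` or `std::numeric_limits<double>::quiet_NaN()` yields a result for which `std::isnan()` is true. Whether R would then display that result as NA or NaN depends on which payload survives the arithmetic, and this sketch makes no promise about that.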

Learning More

This article provides just a taste of how RcppNT2 can be used.
If you’re interested in learning more, please check out the
RcppNT2 website.
