On Monday, we compared the performance of several different ways of calculating a distance matrix in R. Now there's another method to add to the list: using GPU acceleration in R.
A GPU is a dedicated, high-performance chip available on many computers today. Unlike the CPU, it's not used for general computations, but rather for specialized tasks that benefit from a massively multi-threaded architecture. Video-game graphics is the usual target for GPUs, but in recent years they've been used for certain high-performance computing tasks as well. The problem is that GPUs require specialized programming, and because they have limited access to RAM, they're generally not well suited to tasks that require a lot of data throughput. But for simulations and other tasks that require a lot of computing on limited data, they can offer huge performance benefits.
The rpud package for R implements a few algorithms that run their computations on a CUDA-compatible NVIDIA GPU. The algorithms include support vector machines, Bayesian classification, and hierarchical linear models. On the NVIDIA CUDA Zone blog, Gord Sissons tested the rpud package for hierarchical clustering, which involves calculating a distance matrix. Here's a comparison of the performance using regular R functions (blue) and GPU-accelerated functions (orange):
Note that the Y axis is on a log10 scale, so a gap of one unit between two bars is a tenfold difference: in most cases the GPU-based functions ran 10x faster than the standard CPU-based functions.
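If you'd like to try a comparison of this kind yourself, here is a minimal sketch rather than the benchmark used in the linked post. It times `dist` and `hclust` from base R on random data, then attempts the same steps on the GPU. It assumes that `rpuDist` and `rpuHclust` are rpud's GPU counterparts of those two functions and that they can be called with default arguments; check that against the version you have installed. The data size, seed, and variable names are made up for illustration.

```r
# Illustrative data: n random points in p dimensions (sizes are arbitrary)
set.seed(42)
n <- 500
p <- 20
m <- matrix(runif(n * p), nrow = n, ncol = p)

# CPU baseline: distance matrix plus hierarchical clustering in base R
cpu_time <- system.time({
  d_cpu  <- dist(m)        # Euclidean distance by default
  hc_cpu <- hclust(d_cpu)  # complete linkage by default
})

# GPU run: attempted only if rpud is installed; any error (for example,
# no CUDA device present) is caught and the GPU run is skipped.
# ASSUMPTION: rpuDist() and rpuHclust() mirror dist() and hclust().
gpu_time <- NULL
if (requireNamespace("rpud", quietly = TRUE)) {
  gpu_time <- tryCatch(
    system.time({
      d_gpu  <- rpud::rpuDist(m)
      hc_gpu <- rpud::rpuHclust(d_gpu)
    }),
    error = function(e) NULL
  )
}

cat("CPU elapsed (s):", cpu_time[["elapsed"]], "\n")
if (!is.null(gpu_time)) {
  cat("GPU elapsed (s):", gpu_time[["elapsed"]], "\n")
  cat("Speedup (CPU / GPU):", cpu_time[["elapsed"]] / gpu_time[["elapsed"]], "\n")
} else {
  cat("rpud not available (or no usable GPU); skipped the GPU run\n")
}
```

With only 500 points the CPU version finishes almost instantly, so any GPU advantage is likely to show up only as you increase `n`.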
GPU programming doesn't help with everything, but if your problem happens to be one that has a GPU-based implementation, and you have the appropriate GPU hardware, the results can be dramatic. Check the link below for details of the tests, and how you can spin up a cloud-based GPU server to run them on.
Parallel Forall: GPU-Accelerated R in the Cloud with Teraproc Cluster-as-a-Service