More BLAS, BLASter, BLAStest: Updates on gcbd

October 3, 2010

Following up on my
initial post announcing gcbd,
here is a brief note on a new version. The initial post announced version
0.2.2, which was the first
CRAN version of gcbd.
I updated to 0.2.3 when I made the
aforementioned first blog post
about gcbd with the lattice plot of the BLAS and GPU benchmark results across
six different implementations (from reference BLAS to two Atlas versions,
Goto, MKL and a GPU-based one).

There is now a new version 0.2.4 of gcbd on CRAN.
I revised the paper ever so slightly based on some more feedback, and
focussed the results sections by concentrating on just the log-axes lattice
plot and the corresponding lattice plot of raw results, where the y-axis is
capped at 30 seconds:

GPU/CPU Benchmark Results in levels

This chart, in levels rather than on logarithmic axes, illustrates just how
large the performance difference can be for matrix multiplication and LU
decomposition. QR and SVD are closer but accelerated
BLAS libraries still win. GPUs can be compelling for some tasks and large

More discussion is still available in the
paper which is
also included in the gcbd package
for R.

