Following up on my initial post announcing gcbd, here is a brief note on a new version. The initial post announced version 0.2.2, which was the first CRAN version of gcbd. I updated to 0.2.3 when I made the aforementioned first blog post about gcbd, with the lattice plot of the BLAS and GPU benchmark results across six different implementations (from reference BLAS to two Atlas versions, Goto, MKL and a GPU-based one).
There is now a new version 0.2.4 of gcbd on CRAN. I revised the paper ever so slightly based on some more feedback, and focussed the results sections by concentrating on just the log-axes lattice plot and the corresponding lattice plot of raw results, where the y-axis is capped at 30 seconds:
This chart, drawn in levels rather than on logarithmic axes, nicely illustrates just how large the performance difference can be for matrix multiplication and LU decomposition. QR and SVD are closer, but accelerated BLAS libraries still win. GPUs can be compelling for some tasks and large sizes.