Count Your BLAS-ings

[This article was first published on Mad (Data) Scientist, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

One nice thing about open-source software is that users often have a lot of choices.  Such is the case with R, for instance the thousands of contributed packages available on CRAN.  My focus here is on BLAS, the core of matrix operations in R, where again there are interesting choices available to users who wish to take advantage of them.

The Basic Linear Algebra Subroutines have been around for many decades, used throughout science and engineering, in various implementations.  A fairly efficient BLAS implementation is included with R, but for those with heavy linear algebra needs, several open-source alternatives are available, such as ATLAS, ACML and OpenBLAS, as well as the commercial Intel MKL.  (Recently Revolution Analytics announced an open version of their R platform that includes the MKL.)

Here I will discuss  OpenBLAS, a library currently attractive to many, due to its open-source nature and ability to make use of the multicore machines that are so common today.   I’ll focus on numerical accuracy.

This should not be considered a detailed tutorial on OpenBLAS, but here is a “hand-waving” overview of its usage and installation. Usage couldn’t be simpler, actually; you just continue business as usual,  with OpenBLAS transparently doing what base-R BLAS has always done for you.   Under more advanced usage, you might try to tweak things by setting the number of cores.

Installation is only a bit more elaborate, if you are comfortable building R from source.  At the configure stage, I ran

configure --prefix=/home/matloff/MyR311 --enable-BLAS-shlib

After running the usual make and make install.  I then needed to do a symbolic link of to the OpenBLAS library.  Since I’ve been doing timing comparisons for my book, I’ve made shell aliases to run either stock BLAS or OpenBLAS; a more sophisticated approach would have been to use update-alternatives.  

No doubt about it, OpenBLAS is fast, and many timing comparisons for R are to be found on the Web, including the Revo link above.  But what about numerical accuracy?  After seeing Mike Hannon’s recent post on R-help, along with Brian Ripley’s reply, I became curious, I searched the Web for information on this aspect, and came up empty-handed.  So, I  will present here the results of some simple experiments I’ve done as a result.

First, though, a disclaimer:  Although I know the basics of numerical analysis, I am not an expert in any sense, including the sense of being an expert on the various BLASes.  If anyone out there has more to add, that would be highly appreciated.

OpenBLAS derives its speed not just from making use of multiple cores, but also from various tweaks of the code, yielding a very fine degree of optimization.  One can thus envision a development team (which, by the way, took over the old Goto BLAS project) so obsessed with speed that they might cut some corners regarding numerical accuracy.  Thus the latter is a subject of legitimate concern.

For my little test here, I chose to compute eigenvalues, using R’s eigen() function.  I generated p x p unit covariance matrices (1s on the diagonal, ρ everywhere off the diagonal) for my test:

covrho <- function(p,rho) {
 m <- diag(p)
 m[row(m) != col(m)] <- rho

I tried this with various values of p and ρ; here I’ll show the results for 2500 and 0.95, respectively.  The machine used has 16 cores, plus a hyperthreading degree of 2; OpenBLAS likely used 32 threads.

With the standard R BLAS, the elapsed time was 57.407.  Under OpenBLAS, that time was reduced to 12.101.

But interestingly, the first eigenvalue was found to be 2375.05 in both cases.  (This was the exact value in eout$values[1], where eout was the return value from eigen().)

Changing ρ to 0.995, I got reported principal eigenvalues of 2487.505 in both cases.  (Timings were roughly as before.)

As another example, I also tried finding the matrix inverse for this last matrix, using solve().   Both versions of BLAS gave 199.92 as the [1,1] element of the inverse.  Interestingly, though, there was a wider time discrepancy, 53.955 seconds versus 0.933.

It is a little odd that the numbers come out with so few decimal places.  I wonder whether R is deliberately doing some rounding, based on estimates of accuracy.  In any event, I would generally caution against looking at too many decimal places, no matter how good the accuracy is, since typically the input data itself is not so accurate.

So, it seems, at first glance, that OpenBLAS is doing fine.  But Brian has an excellent point about the value of sticking with the tried-and-true, in this case meaning, R’s default BLAS implementation.

I invite you to try your own accuracy comparisons, and post them here.

To leave a comment for the author, please follow the link and comment on their blog: Mad (Data) Scientist. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.