Why is R slow? Some explanations and an MKL/OpenBLAS setup to try to fix this

[This article was first published on Pachá (Batteries Included), and kindly contributed to R-bloggers].

Introduction

Many users tell me that R is slow. For old R releases that is entirely true, because old R versions used their own numerical libraries instead of optimized ones.

But numerical libraries do not tell the complete story. In many cases slow execution can be attributed to inefficient code, and more precisely to not following one or more of these good practices:

  • Using byte-code compiler
  • Vectorizing operations
  • Using simple data structures (e.g. using matrices instead of data frames in large computations)
  • Re-using results

I would add another good practice: “Use the tidyverse”. Since tidyverse packages such as dplyr benefit from Rcpp, their C++ backend can make them faster than the equivalent operations in base (i.e. plain vanilla) R.
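As a minimal sketch of the practices listed above (the data and function names are made up for illustration and do not come from the benchmarks in this post), the following base R code compares a loop with its vectorized equivalent, byte-compiles the loop explicitly, and contrasts row access on a matrix and on a data frame. Timings depend on the machine, so none are shown here.

```r
# Hypothetical example data: one million uniform random values
set.seed(1)
x <- runif(1e6)

# Loop version: handles one element at a time in interpreted code
sum_sq_loop <- function(v) {
  total <- 0
  for (i in seq_along(v)) {
    total <- total + v[i]^2
  }
  total
}

# Vectorized version: the work happens inside compiled internals
sum_sq_vec <- function(v) sum(v^2)

# Explicit byte-code compilation with the compiler package shipped with R
sum_sq_loop_cmp <- compiler::cmpfun(sum_sq_loop)

# Re-use results: compute each value once and keep it
system.time(res_loop <- sum_sq_loop(x))
system.time(res_cmp <- sum_sq_loop_cmp(x))
system.time(res_vec <- sum_sq_vec(x))

# All three approaches should agree up to floating point tolerance
stopifnot(isTRUE(all.equal(res_loop, res_vec)),
          isTRUE(all.equal(res_cmp, res_vec)))

# Simple data structures: the same numbers as a matrix and as a data frame
m <- matrix(x, ncol = 100)
df <- as.data.frame(m)
system.time(for (i in 1:1000) m[i, ])   # row access on a matrix
system.time(for (i in 1:1000) df[i, ])  # row access on a data frame
```

Since R 3.4.0 the JIT compiler byte-compiles functions automatically, so the explicit cmpfun() call mostly matters on older releases.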

The idea of this post is to clarify some ideas. R does not compete with C or C++ because they are different kinds of languages. R with the data.table package may compete with Python with the numpy library. This does not mean that I’m defending R over Python or vice versa. The reason is that both R and Python are implemented as interpreters while C and C++ are compiled, and this means that C and C++ will generally be faster because, in really over-simplified terms, compiled code is closer to the machine.

Basic setup for general usage

As an Ubuntu user I can say the basic R installation from the Canonical or CRAN repositories works for most of the things I do on my laptop.

When I use RStudio Server Pro© that’s a different story: I really want to optimize things, because when I work with large data (e.g. 100GB in RAM) even a 3% gain in resource efficiency or execution time is valuable.

Installing R with OpenBLAS will give you a tremendous performance boost, and that will work for most laptop situations. I explain how to do that in detail for Ubuntu 17.10 and Ubuntu 16.04, but a general setup would be as simple as one of these two options:

# 1: Install from Canonical (default Ubuntu repository)
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install libopenblas-dev r-base

# 2: Install from CRAN mirror
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys  51716619E084DAB9
printf '#CRAN mirror\ndeb https://cran.rstudio.com/bin/linux/ubuntu artful/\ndeb-src https://cran.rstudio.com/bin/linux/ubuntu artful/\n' | sudo tee -a /etc/apt/sources.list.d/cran-mirror.list
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install libopenblas-dev r-base

# 3: Install RStudio (bonus)
cd ~/Downloads
wget https://download1.rstudio.org/rstudio-xenial-1.1.383-amd64.deb
sudo apt-get install gdebi
sudo gdebi rstudio-xenial-1.1.383-amd64.deb
printf '\nexport QT_STYLE_OVERRIDE=gtk\n' | tee -a ~/.profile

Option (1) is a substitute for option (2). It’s totally up to you which one to use, and both will give you a really fast R compared to installing it without OpenBLAS.
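To check that R actually picked up OpenBLAS after installing, a quick look from inside R can help. This is only a sketch and assumes R 3.4.0 or later, where sessionInfo() reports the BLAS and LAPACK shared libraries; on older releases those fields are missing, which the code handles by printing a placeholder.

```r
# Ask the running session which numerical libraries it loaded
info <- sessionInfo()

# The BLAS and LAPACK fields are only present in R >= 3.4.0
blas <- if (is.null(info$BLAS)) "(not reported)" else info$BLAS
lapack <- if (is.null(info$LAPACK)) "(not reported)" else info$LAPACK

cat("BLAS:          ", blas, "\n")
cat("LAPACK:        ", lapack, "\n")
cat("LAPACK version:", La_version(), "\n")
```

A path that mentions openblas suggests the optimized library is in use, although the exact file name depends on how the distribution packages it.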

Benchmarking different R setups

I already use R with OpenBLAS just like the setup above. I will compile parallel R instances to do the benchmarking.

Installing Intel© MKL numerical libraries

My benchmarks indicate that, in my case, it’s worth taking the time to install Intel© MKL. The execution time is strongly reduced for some operations when compared to R with OpenBLAS.

Run this to install MKL:

# keys taken from https://software.intel.com/en-us/articles/installing-intel-free-libs-and-python-apt-repo
wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB

sudo sh -c 'echo deb https://apt.repos.intel.com/mkl all main > /etc/apt/sources.list.d/intel-mkl.list'
sudo apt-get update && sudo apt-get install intel-mkl-64bit

Installing CRAN R with MKL

To compile it from source (in this case it’s the only option) run these lines:

# key added after sudo apt-get update returned a warning following this guide: https://support.rstudio.com/hc/en-us/articles/218004217-Building-R-from-source
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys  51716619E084DAB9
printf '#CRAN mirror\ndeb https://cran.rstudio.com/bin/linux/ubuntu artful/\ndeb-src https://cran.rstudio.com/bin/linux/ubuntu artful/\n' | sudo tee -a /etc/apt/sources.list.d/cran-mirror.list

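# you need to enable the multiverse repo or packages such as xvfb won't be found
# (this replaces /etc/apt/sources.list, so keep a backup first)
sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak
sudo rm -f /etc/apt/sources.list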
printf 'deb http://us.archive.ubuntu.com/ubuntu artful main restricted universe multiverse
deb-src http://us.archive.ubuntu.com/ubuntu artful main restricted universe multiverse\n
deb http://security.ubuntu.com/ubuntu artful-security main restricted universe multiverse
deb-src http://security.ubuntu.com/ubuntu artful-security main restricted universe multiverse\n
deb http://us.archive.ubuntu.com/ubuntu artful-updates main restricted universe multiverse
deb-src http://us.archive.ubuntu.com/ubuntu artful-updates main restricted universe multiverse\n' | sudo tee -a /etc/apt/sources.list

sudo apt-get update
sudo apt-get clean
sudo apt-get autoclean
sudo apt-get autoremove
sudo apt-get upgrade --with-new-pkgs

sudo apt-get build-dep r-base

cd ~/GitHub/r-with-intel-mkl
wget https://cran.r-project.org/src/base/R-3/R-3.4.2.tar.gz
tar xzvf R-3.4.2.tar.gz

cd R-3.4.2
source /opt/intel/mkl/bin/mklvars.sh intel64
MKL="-Wl,--no-as-needed -lmkl_gf_lp64 -Wl,--start-group -lmkl_gnu_thread  -lmkl_core  -Wl,--end-group -fopenmp  -ldl -lpthread -lm"
./configure --prefix=/opt/R/R-3.4.2-intel-mkl --enable-R-shlib --with-blas="$MKL" --with-lapack
make && sudo make install
printf '\nexport RSTUDIO_WHICH_R=/usr/local/bin/R\nexport RSTUDIO_WHICH_R=/opt/R/R-3.4.2-intel-mkl/bin/R\n' | tee -a ~/.profile

Installing CRAN R with OpenBLAS

So as not to interfere with my working installation, I decided to compile a parallel instance from source:

cd ~/GitHub/r-with-intel-mkl/
rm -rf R-3.4.2
tar xzvf R-3.4.2.tar.gz
cd R-3.4.2

./configure --prefix=/opt/R/R-3.4.2-openblas --enable-R-shlib --with-blas --with-lapack
make && sudo make install
printf 'export RSTUDIO_WHICH_R=/opt/R/R-3.4.2-openblas/bin/R\n' | tee -a ~/.profile

Installing CRAN R with no optimized numerical libraries

There is a lot of discussion and strong evidence from different stakeholders in the R community indicating that this is by far the most inefficient option. I compiled this just to make the benchmark complete:

cd ~/GitHub/r-with-intel-mkl/
rm -rf R-3.4.2
tar xzvf R-3.4.2.tar.gz
cd R-3.4.2

./configure --prefix=/opt/R/R-3.4.2-defaults --enable-R-shlib
make && sudo make install
printf 'export RSTUDIO_WHICH_R=/opt/R/R-3.4.2-defaults/bin/R\n' | tee -a ~/.profile

Installing Microsoft© R Open with MKL

This R version includes MKL by default and it’s supposed to be easy to install. I could not make it run, and that’s a pity because different articles (like this post by Brett Klamer) state that this R version is really efficient, although no different from standard CRAN R with MKL numerical libraries.

In any case here’s the code to install this version:

cd ~/GitHub/r-with-intel-mkl
wget https://mran.blob.core.windows.net/install/mro/3.4.2/microsoft-r-open-3.4.2.tar.gz
tar xzvf microsoft-r-open-3.4.2.tar.gz
cd microsoft-r-open
sudo ./install.sh
printf 'export RSTUDIO_WHICH_R=/opt/microsoft/ropen/3.4.2/lib64/R/bin/R\n' | tee -a ~/.profile

# it was not possible to start /opt/microsoft/ropen/3.4.2/lib64/R/bin/R
# the error is:
# *** caught segfault ***
# address 0x50, cause 'memory not mapped'

# removing Microsoft R
# https://mran.microsoft.com/documents/rro/installation#revorinst-uninstall steps did not work
sudo apt-get remove 'microsoft-r-.*'
sudo apt-get autoclean && sudo apt-get autoremove

Benchmark results

My scripts above edit ~/.profile. This lets me open RStudio and work with differently configured R instances on my computer.

I released the benchmark results and scripts on GitHub. The idea is to run the same scripts from ATT© and Microsoft© to see how different setups perform.

To work with CRAN R with MKL I had to edit ~/.profile because of how I configured the instances. So I ran nano ~/.profile and commented out every line at the end of the file except the MKL one to obtain this result:

#export RSTUDIO_WHICH_R=/usr/bin/R
export RSTUDIO_WHICH_R=/opt/R/R-3.4.2-intel-mkl/bin/R
#export RSTUDIO_WHICH_R=/opt/R/R-3.4.2-openblas/bin/R
#export RSTUDIO_WHICH_R=/opt/R/R-3.4.2-defaults/bin/R
#export RSTUDIO_WHICH_R=/opt/microsoft/ropen/3.4.2/lib64/R/bin/R

After that I logged out, logged back in and opened RStudio to run the benchmark.

The other two cases are similar: the benchmark results were obtained by editing ~/.profile, logging out and in, and opening RStudio with the corresponding instance.
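Before running a benchmark it may be worth confirming, from the RStudio console, which R instance was actually started. The sketch below only uses base R functions; RSTUDIO_WHICH_R is the environment variable set in ~/.profile above, and it will show as not set if the session was started some other way.

```r
# Collect basic facts about the running R instance
r_version <- R.version.string
r_home <- R.home()
which_r <- Sys.getenv("RSTUDIO_WHICH_R", unset = "(not set)")

cat("Version:        ", r_version, "\n")
cat("R home:         ", r_home, "\n")
cat("RSTUDIO_WHICH_R:", which_r, "\n")
```

If R home does not sit under the prefix of the instance you meant to test (for example /opt/R/R-3.4.2-intel-mkl), the edit to ~/.profile probably did not take effect yet.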

As an example, this result starts with the R version and the corresponding numerical libraries used in that session. All other results are reported in the same way.

And here are the results from the ATT© benchmarking script:

| Task | CRAN R with MKL (seconds) | CRAN R with OpenBLAS (seconds) | CRAN R with no optimized libraries (seconds) |
|---|---|---|---|
| Creation, transp., deformation of a 2500×2500 matrix (sec) | 0.68 | 0.68 | 0.67 |
| 2400×2400 normal distributed random matrix ^1000 | 0.56 | 0.56 | 0.56 |
| Sorting of 7,000,000 random values | 0.79 | 0.79 | 0.79 |
| 2800×2800 cross-product matrix (b = a’ * a) | 0.3 | 0.36 | 14.55 |
| Linear regr. over a 3000×3000 matrix (c = a \ b’) | 0.17 | 0.22 | 6.98 |
| FFT over 2,400,000 random values | 0.33 | 0.33 | 0.33 |
| Eigenvalues of a 640×640 random matrix | 0.22 | 0.49 | 0.74 |
| Determinant of a 2500×2500 random matrix | 0.2 | 0.22 | 2.99 |
| Cholesky decomposition of a 3000×3000 matrix | 0.31 | 0.21 | 5.76 |
| Inverse of a 1600×1600 random matrix | 0.2 | 0.21 | 2.79 |
| 3,500,000 Fibonacci numbers calculation (vector calc) | 0.54 | 0.54 | 0.54 |
| Creation of a 3000×3000 Hilbert matrix (matrix calc) | 0.23 | 0.24 | 0.23 |
| Grand common divisors of 400,000 pairs (recursion) | 0.27 | 0.29 | 0.3 |
| Creation of a 500×500 Toeplitz matrix (loops) | 0.28 | 0.28 | 0.28 |
| Escoufier’s method on a 45×45 matrix (mixed) | 0.22 | 0.23 | 0.28 |
| Total time for all 15 tests | 5.3 | 5.62 | 37.78 |
| Overall mean (weighted mean) | 0.31 | 0.32 | 0.93 |
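In the table above, the rows where the setups differ most are the linear algebra ones (cross-product, regression, determinant, Cholesky, inverse), while sorting, FFT and the loop-based tasks barely change. A rough way to see this on your own machine, without running the full script, is to time a few of those operations directly. The sketch below is not the ATT© script: the matrix size n and the variable names are arbitrary choices for illustration.

```r
# Hypothetical, much smaller stand-in for the linear algebra tasks
set.seed(42)
n <- 500
a <- matrix(rnorm(n * n), nrow = n, ncol = n)

# Each operation below is delegated to BLAS/LAPACK routines
t_cross <- system.time(s <- crossprod(a))[["elapsed"]]   # a' * a
t_chol <- system.time(r <- chol(s))[["elapsed"]]         # Cholesky factor of a' * a
t_det <- system.time(d <- determinant(a))[["elapsed"]]   # log-determinant
t_inv <- system.time(inv <- solve(a))[["elapsed"]]       # matrix inverse

timings <- c(crossprod = t_cross, cholesky = t_chol,
             determinant = t_det, inverse = t_inv)
print(timings)
```

Running the same lines under each R instance gives a quick sanity check of whether the optimized libraries are making a difference; larger values of n make the gap easier to see.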

And here are the results from the Microsoft© benchmarking script:

| Task | CRAN R with MKL (seconds) | CRAN R with OpenBLAS (seconds) | CRAN R with no optimized libraries (seconds) |
|---|---|---|---|
| Matrix multiply | 5.985 | 13.227 | 165.18 |
| Cholesky Factorization | 1.061 | 2.349 | 26.762 |
| Singular Value Decomposition | 7.268 | 18.325 | 47.076 |
| Principal Components Analysis | 14.932 | 40.612 | 162.338 |
| Linear Discriminant Analysis | 26.195 | 43.75 | 117.537 |
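The timings above are easier to compare as ratios. The sketch below copies the figures from the table into a data frame (the column names are chosen here for illustration) and divides each column by the MKL timing. On these five tasks OpenBLAS takes roughly 1.7 to 2.7 times as long as MKL.

```r
# Timings in seconds, copied from the table above
ms <- data.frame(
  task = c("Matrix multiply", "Cholesky Factorization",
           "Singular Value Decomposition", "Principal Components Analysis",
           "Linear Discriminant Analysis"),
  mkl = c(5.985, 1.061, 7.268, 14.932, 26.195),
  openblas = c(13.227, 2.349, 18.325, 40.612, 43.75),
  defaults = c(165.18, 26.762, 47.076, 162.338, 117.537)
)

# How many times longer each setup takes compared with MKL
ms$openblas_vs_mkl <- round(ms$openblas / ms$mkl, 2)
ms$defaults_vs_mkl <- round(ms$defaults / ms$mkl, 2)

print(ms[, c("task", "openblas_vs_mkl", "defaults_vs_mkl")])
```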

Actions after benchmarking results

I decided to go with Intel MKL, and I’ll write another post benchmarking things I do every day beyond what is covered by these scripts.

To clean my system I deleted all R instances except the MKL one:

sudo apt-get remove r-base r-base-dev
sudo apt-get remove 'r-cran.*'

sudo apt-get autoclean && sudo apt-get autoremove

sudo apt-get build-dep r-base

sudo rm -rf /opt/R/R-3.4.2-openblas
sudo rm -rf /opt/R/R-3.4.2-defaults
sudo ln -s /opt/R/R-3.4.2-intel-mkl/bin/R  /usr/bin/R

I edited ~/.profile so the final lines are:

export RSTUDIO_WHICH_R=/usr/bin/R
#export RSTUDIO_WHICH_R=/opt/R/R-3.4.2-intel-mkl/bin/R
#export RSTUDIO_WHICH_R=/opt/R/R-3.4.2-openblas/bin/R
#export RSTUDIO_WHICH_R=/opt/R/R-3.4.2-defaults/bin/R
#export RSTUDIO_WHICH_R=/opt/microsoft/ropen/3.4.2/lib64/R/bin/R

I also decided to set up my user packages directory from scratch:

# remove installed user packages
rm -rf ~/R

# create new user packages directory
mkdir -p ~/R/x86_64-pc-linux-gnu-library/3.4

# install common packages
R --vanilla << EOF
install.packages(c("tidyverse","data.table","dtplyr","devtools","roxygen2","bit64","pacman"), repos = "https://cran.rstudio.com/")
q()
EOF

# Export to HTML/Excel
R --vanilla << EOF
install.packages(c("htmlTable","openxlsx"), repos = "https://cran.rstudio.com/")
q()
EOF

# Blog tools
R --vanilla << EOF
install.packages(c("knitr","rmarkdown"), repos='http://cran.us.r-project.org')
q()
EOF
sudo apt-get install python-pip
sudo pip install --upgrade --force-reinstall markdown rpy2==2.7.8 pelican==3.6.3

# PDF extraction tools
sudo apt-get install libpoppler-cpp-dev default-jre default-jdk
sudo R CMD javareconf
R --vanilla << EOF
library(devtools)
install.packages(c("rJava","pdftools"), repos = "https://cran.rstudio.com/")
install_github("ropensci/tabulizer")
q()
EOF

# TTF/OTF fonts usage
R --vanilla << EOF
install.packages("showtext", repos = "https://cran.rstudio.com/")
q()
EOF

# Cairo for graphic devices
sudo apt-get install libgtk2.0-dev libxt-dev libcairo2-dev
R --vanilla << EOF
install.packages("Cairo", repos = "https://cran.rstudio.com/")
q()
EOF

Concluding remarks

The benchmarks presented here are in no way a definitive end to the long discussion on numerical libraries. My results show some evidence that, because of the higher speed for some operations, I should use MKL.

One of the advantages of the setup I explained is that you can use MKL with Python. In that case numpy calculations will be boosted.

Using MKL with AMD© processors might not provide an important improvement when compared to using OpenBLAS. This is because MKL uses specific processor instructions that work well with i3 or i5 processors but not necessarily with non-Intel models.
