R tips: Installing Rmpi on Fedora Linux

[This article was first published on CYBAEA Data and Analysis, and kindly contributed to R-bloggers.]

Somebody on the R-help mailing list asked how to get Rmpi working on his Fedora Linux machine so he could do high-performance computing on a cluster of machines (or a single multicore machine) using the R statistical computing and analysis platform. Since it is unusually painful to get working, I might as well copy the instructions here.

1. Install Open MPI on Fedora Core

First install the openmpi libraries using:

yum install openmpi openmpi-devel openmpi-libs

The default installation on Fedora still doesn’t quite work, so you need to run the following command as root (you only need to do this once, after installing the packages):

ldconfig /usr/lib64/openmpi/lib/
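
One way to confirm that the linker cache picked up the Open MPI libraries is to look for libmpi in the output of ldconfig -p. This is only a minimal sketch: it assumes the shared library's file name starts with libmpi.so, and the helper name check_mpi_in_cache is made up for this illustration.

```shell
# Hypothetical helper: reads `ldconfig -p` style output on stdin and
# reports whether any line mentions an MPI shared library.
# Assumption: the Open MPI library file name starts with "libmpi.so".
check_mpi_in_cache() {
    if grep -q 'libmpi\.so'; then
        echo "libmpi found in linker cache"
    else
        echo "libmpi NOT found in linker cache"
    fi
}
```

To use it, pipe the current cache listing into the helper: /sbin/ldconfig -p | check_mpi_in_cache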

You are not quite done: for R to work right with the libraries, you need to modify the LD_LIBRARY_PATH environment variable to include the path to the Open MPI libraries. I have the following in my ~/.bash_profile:

export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}${LD_LIBRARY_PATH:+:}/usr/lib64/openmpi/lib/"

Add the same line to your own ~/.bash_profile, then run it once at the command prompt so that it takes effect in your current session, and you are ready to continue.
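
If your profile gets sourced more than once, the plain export line appends the same directory each time. A possible variant, sketched below, only appends the directory when it is not already listed. The function name add_to_ld_library_path is hypothetical and not part of any package.

```shell
# Hypothetical helper: append a directory to LD_LIBRARY_PATH unless it is
# already listed, so sourcing the profile repeatedly adds no duplicates.
add_to_ld_library_path() {
    dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:-}${LD_LIBRARY_PATH:+:}${dir}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Same directory as in the export line above.
add_to_ld_library_path "/usr/lib64/openmpi/lib/"
```

The comparison is a plain string match, so a path written with and without a trailing slash counts as two different entries.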

2. Install the Rmpi package for R

Now that your Open MPI libraries are set up, what you do next depends on which version of Rmpi you are installing. Most likely you are installing the latest version, in which case the following section applies. The instructions for older versions are retained in a later section for reference.

2.1. Current versions of the Rmpi package

Make sure you have executed the ldconfig command and set the LD_LIBRARY_PATH environment variable as described in the previous section before you continue.

Since at least version 0.5-8 of the Rmpi library you can install it from the R command line after you have fixed the Open MPI install. At the R prompt do:

                 configure.args =

It should work and install OK. This is obviously quite a mouthful to remember, but help is at hand through the options() mechanism in R. In your ~/.Rprofile you can add something like:

    my.configure.args <-
        list("Rmpi" =
             ## Not needed for Rmpi but shown to illustrate the format
             "ncdf" =
    options("configure.args" = my.configure.args)

Then you can just type install.packages("Rmpi") at the R command prompt to install the package.
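
Configure arguments can also be passed from the shell, using R CMD INSTALL --configure-args on a downloaded source tarball. The sketch below only assembles and prints such a command rather than running it. Rmpi's configure script accepts --with-Rmpi-include, --with-Rmpi-libpath and --with-Rmpi-type, but the include directory and the tarball file name used here are assumptions for illustration, so check them against your own system first.

```shell
# Assumed locations; adjust to match your Open MPI installation.
MPI_LIB_DIR="/usr/lib64/openmpi/lib"           # library path used earlier in this article
MPI_INCLUDE_DIR="/usr/include/openmpi-x86_64"  # assumption: where openmpi-devel puts mpi.h
RMPI_TARBALL="Rmpi_0.5-8.tar.gz"               # hypothetical name of the downloaded tarball

CONFIGURE_ARGS="--with-Rmpi-include=${MPI_INCLUDE_DIR} --with-Rmpi-libpath=${MPI_LIB_DIR} --with-Rmpi-type=OPENMPI"

# Dry run: print the command instead of executing it.
echo "R CMD INSTALL --configure-args='${CONFIGURE_ARGS}' ${RMPI_TARBALL}"
```

Once the printed command looks right for your machine, run it by hand from the directory that holds the tarball.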

2.2. Older versions of the Rmpi package

The problem is the configuration file configure.ac which is, unfortunately, completely brain-damaged with hard-coded assumptions about which subdirectories should contain header and library files and no way of overriding it.

Download the Rmpi source package from CRAN and unpack it using tar zxvf Rmpi_0.5-7.tar.gz. Go to the new Rmpi directory and replace the file configure.ac with the one below (for an x86_64 system; for 32-bit you probably need to change -64 to -32):

# Process this file with autoconf to produce a configure script.



MPI_LIBS=`pkg-config --libs openmpi-1.3.1-gcc-64`
MPI_INCLUDE=`pkg-config --cflags openmpi-1.3.1-gcc-64`

AC_CHECK_LIB(util, openpty, [ MPI_LIBS="$MPI_LIBS -lutil" ])
AC_CHECK_LIB(pthread, main, [ MPI_LIBS="$MPI_LIBS -lpthread" ])




The number 1.3.1 may change in future releases of Fedora: see /usr/lib64/pkgconfig/openmpi-*.pc for the current value.
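
To find the module name to put in the pkg-config calls, you can list the matching .pc files and strip the extension. The helper below is a small sketch; its name, list_openmpi_modules, is made up for this example, and it defaults to the directory mentioned above.

```shell
# Hypothetical helper: print the pkg-config module names of all
# openmpi-*.pc files in a directory (default: /usr/lib64/pkgconfig).
list_openmpi_modules() {
    pc_dir="${1:-/usr/lib64/pkgconfig}"
    for pc in "${pc_dir}"/openmpi-*.pc; do
        [ -e "${pc}" ] || continue   # skip the unexpanded pattern when nothing matches
        basename "${pc}" .pc         # e.g. openmpi-1.3.1-gcc-64
    done
}
```

Calling list_openmpi_modules with no argument prints one module name per line. Each printed name can be passed directly to pkg-config --libs or pkg-config --cflags.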

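Still in the Rmpi directory, do the following in your shell (autoconf regenerates the configure script from the edited configure.ac, which is what R CMD INSTALL actually runs):

autoconf
cd ..
tar zcvf Rmpi_0.5-7-F11.tar.gz Rmpi
R CMD INSTALL Rmpi_0.5-7-F11.tar.gz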

3. Test it

Now Rmpi should be working in R:

> library("Rmpi")
> mpi.spawn.Rslaves(nslaves=2)
    2 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 3 is running on: server
slave1 (rank 1, comm 1) of size 3 is running on: server
slave2 (rank 2, comm 1) of size 3 is running on: server
> x <- c(10,20)
> mpi.apply(x,runif)
[[1]]
 [1] 0.25142616 0.93505554 0.03162852 0.71783194 0.35916139 0.85082154
 [7] 0.35404191 0.14221315 0.60063773 0.71805190

[[2]]
 [1] 0.84157864 0.63481773 0.38217188 0.67839089 0.27827728 0.35429266
 [7] 0.04898744 0.96601584 0.25687905 0.77381186 0.69011927 0.37391028
[13] 0.19017369 0.51196594 0.51970563 0.15791524 0.21358237 0.69642478
[19] 0.12690207 0.44177656
