by Joseph Rickert
A frequent question that we get here at Microsoft about MRO (Microsoft R Open) is: can it be used with RStudio? The short answer is absolutely yes! In fact, more than just being compatible, MRO is the perfect complement for the RStudio environment. MRO is a downstream distribution of open source R that supports multiple operating systems and provides features that enhance the performance and reproducible use of the R language. RStudio, being much more than a simple IDE, provides several features, such as the tight integration of knitr, RMarkdown and Shiny, that promote literate programming, the creation of reproducible code, and sharing and collaboration. Together, MRO and RStudio make a powerful combination.
Before elaborating on this theme, I should make it clear how to select MRO from the RStudio IDE. After you have installed MRO on your system, open RStudio, go to the “Tools” menu at the top, and select “Global Options”. You should see a couple of pop-up windows like the screen capture below. If RStudio is not already pointing to MRO (as it is in the screen capture), browse to it and click “OK”.
One feature of MRO that dovetails nicely with RStudio is the way that MRO is tied to a fixed repository. Every day, at precisely midnight UTC, the infrastructure that supports the MRO distribution takes a snapshot of CRAN and stores it on Microsoft’s MRAN site. (You can browse through the snapshots back to September 17, 2014 from the CRAN Time Machine.) Each MRO release is pre-configured to point to a particular CRAN snapshot. MRO 3.2.3, for example, points to CRAN as it was on January 1, 2016. Everyone who downloads MRO is guaranteed to start from a common baseline that reflects CRAN and all of its packages as they existed at a particular point in time. This provides an enormous advantage for corporations and collaborating teams of R programmers, who can be sure that they are at least starting off on the same page, all working with the same CRAN release and a consistent view of the universe of R packages.
However, introducing the discipline of a fixed repository into the RStudio workflow is not completely frictionless. Occasionally, the stars don’t line up perfectly and an RStudio user, or any other user who needs a particular version of a CRAN package for some reason, may have to take some action. For example, I recently downloaded MRO 3.2.3, fired up RStudio and thought “sure, why not” when reminded that a newer version of RStudio was available. Then I clicked to create a new rmarkdown file and was immediately startled by an error message saying that the available rmarkdown package was not the version required by RStudio. The easy fix, of course, was to point to a repository containing a more recent version of rmarkdown than the one associated with the default snapshot date. If this happens to you, either of the following will take care of things:
To get the latest version of the rmarkdown package, use:
install.packages("rmarkdown", repos = "https://cran.revolutionanalytics.com")
To get the 0.9.2 version of the rmarkdown package, use:
install.packages("rmarkdown", repos = "https://mran.revolutionanalytics.com/snapshot/2016-01-02")
Apparently, by chance, the snapshot date we set for MRO missed being convenient for RStudio users by one day.
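If more than one package needs to come from a later snapshot, it may be more convenient to switch the repository for the whole R session instead of passing repos to every call. The following is only a sketch: it assumes the snapshot URL pattern used in the commands above, and snapshot_url is a hypothetical helper written for this illustration, not something that ships with MRO.

```r
# Hypothetical helper (not part of MRO): build an MRAN snapshot URL
# from a date, following the URL pattern used in the commands above.
snapshot_url <- function(date) {
  paste0("https://mran.revolutionanalytics.com/snapshot/",
         format(as.Date(date), "%Y-%m-%d"))
}

# Remember the current setting so that it can be restored later.
old_repos <- getOption("repos")

# Point this session at the snapshot taken one day after the default.
options(repos = c(CRAN = snapshot_url("2016-01-02")))

# Show which repository install.packages() will now use by default.
print(getOption("repos"))

# To go back to the original setting, run:
# options(repos = old_repos)
```

The change only lasts for the current session, so the default snapshot that came with MRO is back the next time R starts.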
A second way that MRO fits into RStudio is the way that the checkpoint package, which installs with MRO, can enhance the reproducibility power of RStudio’s project directory structure. If you choose a new directory when you set up a new RStudio project, and then run the checkpoint() function from that project, checkpoint will set up a local repository in a subdirectory of the project directory. For example, executing the following two lines of code from a script in the MyProject directory will install all packages required by your project as they were at midnight UTC on the specified date.
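Assuming the January 29, 2016 snapshot date used in this example, the two lines would look like this:

```r
# Load the checkpoint package, which installs with MRO.
library(checkpoint)
# Snapshot date assumed from the example discussed here (January 29, 2016).
checkpoint("2016-01-29")
```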
Versions of all of the packages that are called out by scripts in your MyProject directory that existed on CRAN on January 29, 2016 will be installed in a subfolder of MyProject underneath ~/.checkpoint. Unless you use the same checkpoint date for other projects, the packages for MyProject will be independent of packages installed for those other projects. This kind of project-specific structure is very helpful for keeping things straight. It provides a reproducible code-sharing layer on top of (or maybe underneath) RStudio's GitHub integration and other reproducibility features. When you want to share code with a colleague, they don't need to manually install all of the packages ahead of time. Just have them clone your GitHub repository, or put your code into their own RStudio project in some other way, and then have them run checkpoint() from there. Checkpoint will search through the scripts in their project and install the versions of the packages they need.
Finally, I should mention that MRO can enhance any project by providing multi-threaded processing to the code underlying many of the R functions you will be using. R functions that make use of linear algebra operations under the hood, such as matrix multiplication or Cholesky decompositions, will get a considerable performance boost. (Look here for some benchmarks.) On Linux and Windows platforms, users can enable multi-threaded processing by downloading and installing the Intel Math Kernel Library (MKL) when they install MRO from the MRAN site. Mac OS X users automatically get multithreading because MRO comes pre-configured to use the Mac Accelerate Framework.
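To get a rough feel for the difference on your own machine, you could time a matrix multiplication and a Cholesky decomposition in each R installation. This is a minimal sketch using only base R functions; the matrix size and the seed are arbitrary choices made for illustration, and the timings will depend on your hardware and on the math library R is linked against.

```r
# Time two linear algebra operations that R hands off to its math library.
set.seed(42)                      # arbitrary seed, for repeatable input
n <- 500                          # arbitrary size; increase for longer runs
x <- matrix(rnorm(n * n), nrow = n)

# crossprod(x) is t(x) %*% x; adding the identity matrix keeps the
# result safely positive definite so that chol() succeeds.
spd <- crossprod(x) + diag(n)

mult_time <- system.time(xx <- x %*% x)     # matrix multiplication
chol_time <- system.time(r <- chol(spd))    # Cholesky decomposition

print(mult_time["elapsed"])
print(chol_time["elapsed"])
```

Running the same script once under a standard R installation and once under MRO with multi-threading enabled gives a simple side-by-side comparison.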
Let us know if you use RStudio with MRO.