Site icon R-bloggers

#28: Welcome RSPM and test-drive with Bionic and Focal

[This article was first published on Thinking inside the box , and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Welcome to the 28th post in the relatively random R recommendations series, or R4 for short. Our last post was a “double entry” in this R4 series and the newer T4 video series and covered a topic touched upon in this R4 series multiple times: easy binary install, especially on Ubuntu.

That post already previewed the newest kid on the block: RStudio’s RSPM, now formally announced. In the post we were only able to show Ubuntu 18.04 aka bionic. With the formal release of RSPM support has been added for Ubuntu 20.04 aka focal—and we are happy to announce that of course we added a corresponding Rocker r-rspm container. So you can now take full advantage of RSPM either via docker pull rocker/r-rspm:18.04 or via docker pull rocker/r-rspm:20.04 covering the two most recent LTS releases.

RSPM is a nice accomplishment. Covering multiple Linux distributions is an excellent achievement. Allowing users to reason in terms of the CRAN packages (i.e. installing xml2, not r-cran-xml2) eases use. Doing it from via the standard R command install.packages() (or wrapper around it like our install.r from littler package) is very good too and an excellent technical achievement.

There is, as best as I can tell, only one shortcoming, along with one small bit of false advertising. The shortcoming is technical. By bringing the package installation into the user application domain, it is separated from the system and lacks integration with system libraries. What do I mean here? If you were to add R to a plain Ubuntu container, say 18.04 or 20.04, then added the few lines to support RSPM and install xml2 it would install. And fail. Why? Because the system library libxml2 does not get installed with the RSPM package—whereas the .deb from the distribution or PPAs does. So to help with some popular packages I added libxml2, libunits and a few more for geospatial work to the rocker/r-rspm containers. Being already present ensures packages xml2 and units can run immediately. Please file issue tickets at the Rocker repo if you come across other missing libraries we could preload. (A related minor nag is incomplete coverage. At least one of my CRAN packages does not (yet?) come as a RSPM binary. Then again, CRAN has 16k packages, and the RSPM coverage is much wider than the PPA one. But completeness would be neat. The final nag is lack of Debian support which seems, well, odd.)

So what about the small bit of false advertising? Well it is claimed that RSPM makes installation “so much faster on Linux”. True, faster than the slowest possible installation from source. Also easier. But we had numerous posts on this blog showing other speed gains: Using ccache. And, of course, using binaries. And as the initial video mentioned above showed, installing from the PPAs is also faster than via RSPM. That is easy to replicate. Just set up the rocker/r-ubuntu:20.04 (or 18.04) container alongside the rocker/r-rspm:20.04 (or also 18.04) container. And then time install.r rstan (or install.r tinyverse) in the RSPM one against apt -y update; apt install -y r-cran-rstan (or ... r-cran-tinyverse). In every case I tried, the installation using binaries from the PPA was still faster by a few seconds. Not that it matters greatly: both are very, very quick compared to source installation (as e.g. shown here in 2017 (!!)) but the standard Ubuntu .deb installation is simply faster than using RSPM. (Likely due to better CDN usage so this may change over time. Neither method appears to do downloads in parallel so there is scope for both for doing better.)

So in sum: Welcome to RSPM, and nice new tool—and feel free to “drive” it using rocker/r-rspm:18.04 or rocker/r-rspm:20.04.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

To leave a comment for the author, please follow the link and comment on their blog: Thinking inside the box .

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.