Site icon R-bloggers

Installing spatial R packages on Ubuntu

[This article was first published on the Geocomputation with R website, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This post explains how to quickly get key R packages for geographic research installed on Ubuntu, a popular Linux distribution.

< !-- ![](/home/robin/geocompr/geocompr.github.io/static/img/geocompr-linux.png) -->

A recent thread on the r-spatial GitHub organization alludes to many considerations when choosing a Linux set-up for work with geographic data, ranging from the choice of Linux distribution (distro) to the use of binary vs or compiled versions (binaries are faster to install). This post touches on some of these things but, its main purpose is to provide advice on getting R’s key spatial packages up-and-running on a future-proof Linux operating system (Ubuntu).

Now is an excellent time to be thinking about the topic because changes are in the pipeline and getting set-up (or preparing to get set-up) now could save hours in the future. These imminent changes include:

To keep-up with these changes, this post will be updated in late April when some of the dust has settled around these changes. However, the advice presented here should be future-proof, including information on how to upgrade Ubuntu in section 3.

There many ways of getting Ubuntu set-up for spatial R packages. A benefit of Linux operating systems is that they offer choice and prevent ‘lock-in’. However, the guidance in the next section should reduce set-up time and improve maintainability (with updates managed by Ubuntu) compared with other ways of doing things, especially for beginners. If you’re planning to switch to Linux as the basis of your geographic work, this advice may be particularly useful. (The post was written in response to people asking how to set-up R on their new Ubuntu installations. For more on getting a computer running Ubuntu, check out companies that support open source operating systems and guides installing Ubuntu on an existing machine.

By ‘key packages’ I mean the following, which enable the majority of day-to-day geographic data processing and visualization tasks:

The focus is on Ubuntu because that’s what I’ve got most experience with and it is well supported by the community. Links for installing geographic R packages on other distros are provided in a subsequent.

1. Installing spatial R packages on Ubuntu

< !-- Of course, it depends on what Linux distribution you're running, and we'll cover installation on some of the most popular [distros](https://distrowatch.com/). --> < !-- ## Ubuntu -->

R’s spatial packages can be installed from source on the latest version of this popular operating system, once the appropriate repository has been set-up, meaning faster install times (only a few minutes including the installation of upstream dependencies). The following bash commands should install key geographic R packages on Ubuntu 19.10:

# add a repository that ships the latest version of R:
sudo add-apt-repository ppa:marutter/rrutter3.5
# update the repositories so the software can be found:
sudo apt update
# install system dependencies:
sudo apt install libudunits2-dev libgdal-dev libgeos-dev libproj-dev libconfig1-dev
# binary versions of key R packages:
sudo apt install r-base-dev r-cran-sf r-cran-raster r-cran-rjava

To test your installation of R has worked, try running R in an IDE such as RStudio or in the terminal by entering R. You should be able to run the following commands without problem:

library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
install.packages("tmap")

If you are using an older version of Ubuntu and don’t want to upgrade to 19.10, which will upgrade to (20.04) by the end of April 2020, see instructions at github.com/r-spatial/sf and detailed instructions on the blog rtask.thinkr.fr, which contains this additional shell command:

# for Ubuntu 18.04
sudo add-apt-repository ppa:marutter/c2d4u3.5

That adds a repository that ships hundreds of binary versions of R packages, meaning faster install times for packages (see the Binary package section of the open source book R Packages for more on binary packages). An updated repository, called c2d4u4.0 or similar, will be available for Ubuntu 20.04 in late April.

< !-- The c2d4u3.5 repository only supports LTS Ubuntu versions, meaning it is unavailable for Ubuntu 19.10, but will be available for Ubuntu 20.04, allowing hundreds of packages to be installed quickly from the system terminal. --> < !-- The following command, for example, will install **tmap** on LTS versions of Ubuntu that have the `c2d4u3.5` repository enabled much faster than the alternative `install.packages()` approach. -->

If you have issues with the instructions in this post here, you can find a wealth of answers on site such as StackOverflow, the sf issue tracker, r-sig-geo and Debian special interest group (SIG) email lists (the latter of which provided input into this blog post, thanks to Dirk Eddelbuettel and Michael Rutter).

2. Updating R packages and upstream dependencies

Linux operating systems allow you to customize your set-up in myriad ways. This can be enlightening but it can also be wasteful. It’s worth considering the stability/cutting-edge continuum before diving into a particular set-up and potentially wasting time (if the previous section hasn’t already made-up your mind).

< !-- For me, a good set-up, that means the latest version of Ubuntu plus CRAN versions of most R packages. --> < !-- For most people I recommend installing the release version as follows: -->

A reliable way to keep close (but not too close) to the cutting edge on the R side is simply to keep your packages up-to-date. Running the following command (or using the Tools menu in RStudio) every week or so will ensure you have up-to-date package versions:

update.packages()

The following commands will update system dependencies including the ‘OSGeo stack’ composed of PROJ, GEOS and GDAL:

sudo apt-get update # see if things have changed
sudo apt upgrade # install changes

If you want to update Ubuntu to the latest version, you can with the following command (also see instructions here):

apt-get dist-upgrade

To get more up-to-date upstream geographic libraries than provided in the default Ubuntu repositories, you can add the ubuntugis repository as follows. This is a pre-requisite on Ubuntu 18.04 and earlier but also works with later versions (warning, adding this repository could cause complications if you already have software such as QGIS that uses a particular version of GDAL installed):

sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
sudo apt update
sudo apt upgrade

That will give you more up-to-date versions of GDAL, GEOS and PROJ which may offer some performance improvements. Note: if you do update dependencies such as GDAL you will need to re-install the relevant packages, e.g. with install.packages("sf"). You can revert that change with the following little-known command:

sudo add-apt-repository --remove ppa:ubuntugis/ubuntugis-unstable

If you also want the development versions of key R packages, e.g. to test new features and support development efforts, you can install them from GitHub, e.g. as follows:

remotes::install_github("r-spatial/sf")
remotes::install_github("rspatial/raster")
remotes::install_github("mtennekes/tmaptools") # required for dev version of tmap
remotes::install_github("mtennekes/tmap")

3. Installing geographic R packages on other Linux operating systems

If you are in the fortunate position of switching to Linux and being able to choose the distribution that best fits your needs, it’s worth thinking about which distribution will be both user-friendly (more on that soon), performant and future-proof. Ubuntu is a solid choice, with a large user community and repositories such as ‘ubuntugis’ providing more up-to-date versions of upstream geographic libraries such as GDAL.

QGIS is also well-supported on Ubuntu.

However, you can install R and key geographic packages on other operating systems, although it may take longer. Useful links on installing R and geographic libraries are provided below for reference:

  • Installing R on Debian is covered on the CRAN website. Upstream dependencies such as GDAL can be installed on recent versions of Debian, such as buster, with commands such as apt-get install libgdal-dev as per instructions on the rocker/geospatial.

  • Installing R on Fedora/Red Hat is straightforward, as outlined on CRAN. GDAL and other spatial libraries can be installed from Fedora’s dnf package manager, e.g. as documented here for sf.

  • Arch Linux has a growing R community. Information on installing and setting-up R can be found on the ArchLinux wiki. Installing upstream dependencies such as GDAL on Arch is also relatively straightforward. There is also a detailed guide for installing R plus geographic packages by Patrick Schratz.

4. Geographic R packages on Docker

< !-- As with cars, ease of use is important for the popularity of computer technology. --> < !-- ^[ --> < !-- The history of cars can provide insight into the importance of ease of use of technologies today. --> < !-- Cars, have arguably transformed our settlements and lifestyles more than any other technology, were initially hard to use. --> < !-- Before they became a consumer product in the 1950s (by the end of which 1/6^th^ of jobs in the USA were in the [car industry](https://en.wikipedia.org/wiki/1950s_American_automobile_culture)) relied on a [hand cranks](https://www.youtube.com/watch?v=iFd8uo7ogpM) to start them until the proliferation of electric starter motors following U.S. Patent [1,150,523](https://patents.google.com/patent/US1150523), which was subsequently used by Cadillac in [1912](https://www.hemmings.com/blog/2012/02/27/the-accident-that-started-it-all/) and onwards. --> < !-- Like cars, people tend to go for computer technologies that are easy to use, that are 'plug and play', so it's important for the future of open-source software that the solutions I recommend are easy to set-up and use. --> < !-- ] -->

The Ubuntu installation instructions outlined above provide such an easy and future-proof set-up. But if you want an even easier way to get the power of key geographic packages running on Linux, and have plenty of RAM and HD space, running R on the ‘Docker Engine’ may be an attractive option.

Advantages of using Docker include reproducibility (code will always run the same on any given image, and images can be saved), portability (Docker can run on Linux, Windows and Mac) and scalability (Docker provides a platform for scaling-up computations across multiple nodes).

For an introduction to using R/RStudio in Docker, see the Rocker project.

Using that approach, I recommend the following Docker images for using R as a basis for geographic research:

  • rocker/geospatial which contains key geographic packages, including those listed above
  • robinlovelace/geocompr which contains all the packages needed to reproduce the contents of the book, and which you can run with the following command in a shell in which Docker is installed:
docker run -e PASSWORD=yourpassword --rm -p 8787:8787 robinlovelace/geocompr

To test-out the Ubuntu 19.10 set-up recommended above I created a Dockerfile and associated image on Dockerhub that you can test-out as follows:

docker run -it robinlovelace/geocompr:ubuntu-eoan
R
library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(raster)
library(tmap) 

The previous commands should take you to a terminal inside the docker container where you try out the Linux command line and R. If you want to use more cutting-edge versions of the geographic libraries, you can use the ubuntu-bionic image (note the more recent version numbers, with PROJ 7.0.0 for example):

sudo docker run -it robinlovelace/geocompr:ubuntu-bionic
R
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0

These images do not currently contain all the dependencies needed to reproduce the code in Geocomputation with R. < !-- , if you're looking for a production-ready Docker image that has both RStudio Server and a wide range of geographic packages pre-installed, building from the `rocker/geospatial` image, this could be a good place to start: -->

However, as documented in issue 476 in the geocompr GitHub repo, there is a plan to provide Docker images with this full ‘R-spatial’ stack installed, building on strong foundations such as rocker/geospatial and the ubuntugis repositories, to support different versions of GDAL and other dependencies. We welcome any comments or tech support to help make this happen. Suggested changes to this post are also welcome, see the source code here.

5. Fin

R is an open-source language heavily inspired by Unix/Linux so it should come as no surprise that it runs well on a variety of Linux distributions, Ubuntu (covered in this post) in particular. The guidance in this post should get geographic R packages set-up quickly in a future-proof way. A sensible next step is to sharpen you system administration (sysadmin) and shell coding skills, e.g. with reference to Ubuntu wiki pages and Chapter 2 of the open source book Data Science at the Command Line.

This will take time but, building on OSGeo libraries, a well set-up Linux machine is an ideal platform to install, run and develop key geographic R packages in a performant, stable and future-proof way. < !-- I hope that this tutorial provides some useful pointers and encourages more people to switch from proprietary software to open source solutions as the basis of their geographic and computational work. -->

Be the FOSS4G change you want to see in the world!

To leave a comment for the author, please follow the link and comment on their blog: the Geocomputation with R website.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.