This post explains how to quickly get key R packages for geographic research installed on Ubuntu, a popular Linux distribution.
A recent thread on the r-spatial GitHub organization alludes to many considerations when choosing a Linux set-up for work with geographic data, ranging from the choice of Linux distribution (distro) to the use of binary vs compiled versions of packages (binaries are faster to install). This post touches on some of these things, but its main purpose is to provide advice on getting R’s key spatial packages up and running on a future-proof Linux operating system (Ubuntu).
Now is an excellent time to be thinking about the topic because changes are in the pipeline and getting set-up (or preparing to get set-up) now could save hours in the future.
These imminent changes include:
- The next major release of R (4.0.0), scheduled for the 24th April (2020-04-24)
- The next major release of Ubuntu (20.04), a Long Term Support (LTS) version that will be used by millions of servers and research computers worldwide for years to come. Coincidentally, Ubuntu 20.04 will be released a day earlier than R 4.0.0, on 23rd April (2020-04-23).
- Ongoing changes to the OSGeo stack on which key geographic R packages depend, as documented in r-spatial repos and a recent blog post on how recent versions of PROJ enable more precise coordinate reference system definitions.
To keep up with these changes, this post will be updated in late April, when some of the dust has settled.
However, the advice presented here should be future-proof, including information on how to upgrade Ubuntu in section 2.
There are many ways of getting Ubuntu set-up for spatial R packages.
A benefit of Linux operating systems is that they offer choice and prevent ‘lock-in’.
However, the guidance in the next section should reduce set-up time and improve maintainability (with updates managed by Ubuntu) compared with other ways of doing things, especially for beginners.
If you’re planning to switch to Linux as the basis of your geographic work, this advice may be particularly useful.
(The post was written in response to people asking how to set-up R on their new Ubuntu installations.
For more on getting a computer running Ubuntu, check out companies that support open source operating systems and guides to installing Ubuntu on an existing machine.)
By ‘key packages’ I mean the following, which enable the majority of day-to-day geographic data processing and visualization tasks:
- sf for reading, writing and working with a range of geographic vector file formats and geometry types
- raster, a mature package for working with geographic raster data (see the terra package for an in-development replacement for raster)
- tmap, a flexible package for making static and interactive maps
The focus is on Ubuntu because that’s what I’ve got most experience with and it is well supported by the community.
Links for installing geographic R packages on other distros are provided in a subsequent section.
1. Installing spatial R packages on Ubuntu
R’s spatial packages can be installed as binaries on the latest version of this popular operating system, once the appropriate repository has been set up, meaning faster install times (only a few minutes, including the installation of upstream dependencies).
The following bash commands should install key geographic R packages on Ubuntu 19.10:
    # add a repository that ships the latest version of R:
    sudo add-apt-repository ppa:marutter/rrutter3.5
    # update the repositories so the software can be found:
    sudo apt update
    # install system dependencies:
    sudo apt install libudunits2-dev libgdal-dev libgeos-dev libproj-dev libfontconfig1-dev
    # binary versions of key R packages:
    sudo apt install r-base-dev r-cran-sf r-cran-raster r-cran-rjava
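If something goes wrong later, it can help to confirm that the system libraries named above really were installed. The following is a minimal sketch rather than part of the original instructions: the function name `check_deps` is made up for illustration, and it assumes a Debian-based system on which `dpkg-query` is available (on any other system every package is simply reported as missing).

```shell
#!/usr/bin/env bash
# check_deps is a hypothetical helper, not part of any package.
# It prints "<package> installed" or "<package> missing" for each argument.
check_deps() {
  local pkg status
  for pkg in "$@"; do
    # dpkg-query fails for unknown packages, so silence errors and carry on.
    status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null || true)
    if [ "$status" = "install ok installed" ]; then
      echo "$pkg installed"
    else
      echo "$pkg missing"
    fi
  done
}

# The system dependencies from the apt install command above.
check_deps libudunits2-dev libgdal-dev libgeos-dev libproj-dev libfontconfig1-dev
```

Any package reported as missing can be installed with `sudo apt install` as shown above.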
To test that your installation of R has worked, try running R in an IDE such as RStudio or in the terminal by entering R.
You should be able to run the following commands without problem:
    library(sf)
    #> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
    install.packages("tmap")
If you are using an older version of Ubuntu and don’t want to upgrade to 19.10 (which will itself be upgradable to 20.04 by the end of April 2020), see the instructions at github.com/r-spatial/sf and the detailed instructions on the blog rtask.thinkr.fr, which contains this additional shell command:
    # for Ubuntu 18.04
    sudo add-apt-repository ppa:marutter/c2d4u3.5
That adds a repository that ships hundreds of binary versions of R packages, meaning faster install times for packages (see the Binary package section of the open source book R Packages for more on binary packages).
An updated repository, called c2d4u4.0 or similar, will be available for Ubuntu 20.04 in late April.
If you have issues with the instructions in this post, you can find a wealth of answers on sites such as StackOverflow, the sf issue tracker, and the r-sig-geo and Debian special interest group (SIG) email lists (the latter of which provided input into this blog post, thanks to Dirk Eddelbuettel and Michael Rutter).
2. Updating R packages and upstream dependencies
Linux operating systems allow you to customize your set-up in myriad ways.
This can be enlightening but it can also be wasteful.
It’s worth considering the stability/cutting-edge continuum before diving into a particular set-up and potentially wasting time (if the previous section hasn’t already made up your mind).
A reliable way to keep close (but not too close) to the cutting edge on the R side is simply to keep your packages up-to-date.
Running update.packages() in R (or using the Tools menu in RStudio) every week or so will ensure you have up-to-date package versions.
The following commands will update system dependencies including the ‘OSGeo stack’ composed of PROJ, GEOS and GDAL:
    sudo apt-get update # see if things have changed
    sudo apt upgrade # install changes
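After an upgrade it is useful to see which versions of the OSGeo stack the system now provides. The sketch below is one way to do that, assuming the development packages installed earlier are present: `gdal-config` ships with libgdal-dev, `geos-config` with libgeos-dev, and `pkg-config` can report the PROJ version. The helper name `report_version` is hypothetical.

```shell
#!/usr/bin/env bash
# report_version is a hypothetical helper: it prints a label followed by the
# output of a command, or "not found" if the command is not on the PATH.
report_version() {
  local label="$1"
  shift
  if command -v "$1" >/dev/null 2>&1; then
    # The version is left blank if the tool exists but reports nothing.
    echo "$label $("$@" 2>/dev/null)"
  else
    echo "$label not found"
  fi
}

report_version GDAL gdal-config --version
report_version GEOS geos-config --version
report_version PROJ pkg-config --modversion proj
```

These are the versions that sf will link to if it is (re-)installed from source, so they can be compared with the "Linking to" message printed by library(sf).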
If you want to update Ubuntu to the latest version, you can with the following command (also see instructions here):
To get more up-to-date upstream geographic libraries than provided in the default Ubuntu repositories, you can add the ubuntugis repository as follows.
This is a pre-requisite on Ubuntu 18.04 and earlier but also works with later versions (warning, adding this repository could cause complications if you already have software such as QGIS that uses a particular version of GDAL installed):
    sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
    sudo apt update
    sudo apt upgrade
That will give you more up-to-date versions of GDAL, GEOS and PROJ which may offer some performance improvements.
Note: if you do update dependencies such as GDAL you will need to re-install the relevant packages, e.g. with install.packages("sf") in R.
You can revert that change with the following little-known command:
sudo add-apt-repository --remove ppa:ubuntugis/ubuntugis-unstable
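To check whether the repository is currently active, before or after removing it, you can search apt's source files for it. This is a rough sketch under stated assumptions: the function name `ppa_enabled` is made up, it relies on GNU grep, and it only looks for an uncommented line mentioning the PPA in `.list` or `.sources` files, so it is a heuristic rather than a definitive test.

```shell
#!/usr/bin/env bash
# ppa_enabled is a hypothetical helper. It returns 0 if an uncommented line
# in apt's source files mentions the given PPA (owner/name).
# Extra arguments are files or directories to search; the defaults are the
# usual apt locations.
ppa_enabled() {
  local ppa="$1"
  shift
  [ "$#" -gt 0 ] || set -- /etc/apt/sources.list /etc/apt/sources.list.d
  # --include skips backup files such as *.list.save; -s hides missing paths.
  grep -rqsE --include='*.list' --include='*.sources' "^[^#]*${ppa}" "$@"
}

if ppa_enabled ubuntugis/ubuntugis-unstable; then
  echo "ubuntugis-unstable is enabled"
else
  echo "ubuntugis-unstable is not enabled"
fi
```

Knowing this is handy before installing software such as QGIS that expects a particular version of GDAL.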
If you also want the development versions of key R packages, e.g. to test new features and support development efforts, you can install them from GitHub, e.g. as follows:
    remotes::install_github("r-spatial/sf")
    remotes::install_github("rspatial/raster")
    remotes::install_github("mtennekes/tmaptools") # required for dev version of tmap
    remotes::install_github("mtennekes/tmap")
3. Installing geographic R packages on other Linux operating systems
If you are in the fortunate position of switching to Linux and being able to choose the distribution that best fits your needs, it’s worth thinking about which distribution will be user-friendly (more on that soon), performant and future-proof.
Ubuntu is a solid choice, with a large user community and repositories such as ‘ubuntugis’ providing more up-to-date versions of upstream geographic libraries such as GDAL.
QGIS is also well-supported on Ubuntu.
However, you can install R and key geographic packages on other operating systems, although it may take longer.
Useful links on installing R and geographic libraries are provided below for reference:
- Installing R on Debian is covered on the CRAN website. Upstream dependencies such as GDAL can be installed on recent versions of Debian, such as buster, with commands such as apt-get install libgdal-dev, as per the instructions in rocker/geospatial.
- Arch Linux has a growing R community. Information on installing and setting-up R can be found on the ArchLinux wiki. Installing upstream dependencies such as GDAL on Arch is also relatively straightforward. There is also a detailed guide for installing R plus geographic packages by Patrick Schratz.
4. Geographic R packages on Docker
The Ubuntu installation instructions outlined above provide an easy and future-proof set-up.
But if you want an even easier way to get the power of key geographic packages running on Linux, and have plenty of RAM and HD space, running R on the ‘Docker Engine’ may be an attractive option.
Advantages of using Docker include reproducibility (code will always run the same on any given image, and images can be saved), portability (Docker can run on Linux, Windows and Mac) and scalability (Docker provides a platform for scaling-up computations across multiple nodes).
For an introduction to using R/RStudio in Docker, see the Rocker project.
Using that approach, I recommend the following Docker images for using R as a basis for geographic research:
- rocker/geospatial, which contains key geographic packages, including those listed above
- robinlovelace/geocompr, which contains all the packages needed to reproduce the contents of the book Geocomputation with R, and which you can run with the following command in a shell in which Docker is installed:
docker run -e PASSWORD=yourpassword --rm -p 8787:8787 robinlovelace/geocompr
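Because the command above uses `--rm`, anything saved only inside the container is lost when it stops. One way to keep your work is to mount a folder from the host. The sketch below only prints a suitable command rather than running it. The function name `rstudio_cmd` is hypothetical, and the mount point `/home/rstudio/work` is an assumption based on the default `rstudio` user in Rocker-based images, so check it against the image you use.

```shell
#!/usr/bin/env bash
# rstudio_cmd is a hypothetical helper that prints (but does not run) a
# docker command sharing a host directory with the container.
# Arguments: password, host directory (default: current directory),
# image (default: the image used above).
rstudio_cmd() {
  local password="$1"
  local host_dir="${2:-$PWD}"
  local image="${3:-robinlovelace/geocompr}"
  # /home/rstudio/work is an assumed mount point, not taken from the post.
  echo "docker run --rm -e PASSWORD=$password -p 8787:8787" \
       "-v $host_dir:/home/rstudio/work $image"
}

rstudio_cmd yourpassword
```

The printed command can be pasted into a shell once you are happy with it. Paths containing spaces would need quoting first.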
To test-out the Ubuntu 19.10 set-up recommended above I created a Dockerfile and associated image on Dockerhub that you can test-out as follows:
    docker run -it robinlovelace/geocompr:ubuntu-eoan
    R
    library(sf)
    #> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
    library(raster)
    library(tmap)
The previous commands should take you to a terminal inside the docker container where you can try out the Linux command line and R.
If you want to use more cutting-edge versions of the geographic libraries, you can use the ubuntu-bionic image (note the more recent version numbers, with PROJ 7.0.0 for example):
    sudo docker run -it robinlovelace/geocompr:ubuntu-bionic
    R
    library(sf)
    #> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
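When comparing images it can be handy to check a library version programmatically rather than by eye. The following sketch parses the "Linking to" message printed by sf and tests whether PROJ is at least a given version. The helper names `lib_version` and `version_ge` are made up, the 6.0.0 threshold is only an illustrative assumption, and the comparison relies on `sort -V` from GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; neither is part of sf or Docker.

# Extract one library's version from sf's "Linking to ..." message.
lib_version() {
  local lib="$1" message="$2"
  echo "$message" | sed -n "s/.*${lib} \([0-9][0-9.]*\).*/\1/p"
}

# Return 0 if version $1 is at least version $2 (sort -V orders versions).
version_ge() {
  [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

msg="Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0"
proj=$(lib_version PROJ "$msg")
if version_ge "$proj" 6.0.0; then
  echo "PROJ $proj is at least 6.0.0"
else
  echo "PROJ $proj is older than 6.0.0"
fi
```

With the message from the ubuntu-bionic image this prints "PROJ 7.0.0 is at least 6.0.0". With the message from the ubuntu-eoan image (PROJ 5.2.0) the other branch is taken.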
These images do not currently contain all the dependencies needed to reproduce the code in Geocomputation with R.
However, as documented in issue 476 in the geocompr GitHub repo, there is a plan to provide Docker images with this full ‘R-spatial’ stack installed, building on strong foundations such as rocker/geospatial and the ubuntugis repositories, to support different versions of GDAL and other dependencies.
We welcome any comments or tech support to help make this happen.
Suggested changes to this post are also welcome, see the source code here.
R is an open-source language heavily inspired by Unix/Linux so it should come as no surprise that it runs well on a variety of Linux distributions, Ubuntu (covered in this post) in particular.
The guidance in this post should get geographic R packages set-up quickly in a future-proof way.
A sensible next step is to sharpen your system administration (sysadmin) and shell coding skills, e.g. with reference to Ubuntu wiki pages and Chapter 2 of the open source book Data Science at the Command Line.
This will take time but, building on OSGeo libraries, a well set-up Linux machine is an ideal platform to install, run and develop key geographic R packages in a performant, stable and future-proof way.
Be the FOSS4G change you want to see in the world!