**Theory meets practice...**, and kindly contributed to R-bloggers)

## Abstract

We show how to create cartograms with R by illustrating the population and age-distribution of the planning regions of Berlin by static plots and animations.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. The markdown+Rknitr source code of this blog is available under a GNU General Public License (GPL v3) license from .

## Introduction

Every good lecture on sophisticated statistical modelling starts with underlining the importance of **data visualization** as the first step of an analysis. Choropleth maps are a common choice for visualizing the spatial distribution of a feature recorded in administrative regions, e.g., population density or the incidence rate of a disease. Here, each region is shaded with a color selected in accordance with the feature variable, e.g., higher hue if the feature value is higher. Choosing the right palette for such visualizations is a science of its own, see e.g. Zeileis, Hornik, and Murrell (2009) or the ColorBrewer project, which is available in R through the `RColorBrewer`

package. A nice way to further spice up your spatial visualizations are **area cartograms**, where the boundary shape of each region is warped such that its area becomes proportional to the value of the feature variable you want to illustrate. The difficult part here is to preserve the arrangement of the regions, see for example Gastner and Newman (2004) for the methodological challenges of this task.

In this post we show how such area cartograms can easily be created with R using the packages `Rcartogram`

and `getcartr`

together with the powerful packages `sp`

, `rgeos`

and `rgdal`

for the spatial data wrangling. Both `Rcartogram`

and `getcartr`

are only available from github, because the license of the underlying `Cart`

C fragment implementing the method of Gastner and Newman (2004) does not appear to be GPL (or the like) compatible.

# The Data

We use population numbers for the 447 planning regions of Berlin (Lebensweltlich orientierte Räume (LOR)). The boundaries of these regions are available as ESRI Shapefile through the open data portal of Berlin under the CC BY license. The 2015 population data of the LORs are available as CSV file through the same data portal.

```
tmpfile <- paste0(tempfile(),".zip")
download.file("https://www.statistik-berlin-brandenburg.de/opendata/RBS_OD_LOR_2015_12.zip",destfile=tmpfile)
unzip(tmpfile,exdir=file.path(filePath,"RBS_OD_LOR_2015_12"))
download.file("https://www.statistik-berlin-brandenburg.de/opendata/EWR201512E_Matrix.csv",destfile=file.path(filePath,"EWR201512E_Matrix.csv"))
```

With the help of the `rgdal`

, `sp`

and the `rgeos`

CRAN packages, R can be used as a geographic information system (GIS). This allows for easy merging of these two data sources together with a spatial aggregation to the **Prognoseräume** level, which is a slightly higher level of aggregation than the LORs (60 regions instead of 447). The output of these data wrangling steps will be a SpatialPolygonsDataFrame object `pgrs`

– see GitHub code for details.

```
library(rgdal)
library(sp)
library(rgeos)
```

```
##Read shapefile
lor <- readOGR(dsn=file.path(filePath,"RBS_OD_LOR_2015_12"),layer="RBS_OD_LOR_2015_12")
```

```
## OGR data source with driver: ESRI Shapefile
## Source: "/Users/hoehle/Sandbox/Blog/figure/source/2016-10-10-cartograms//RBS_OD_LOR_2015_12", layer: "RBS_OD_LOR_2015_12"
## with 447 features
## It has 8 fields
```

`proj4string(lor)`

`## [1] "+proj=utm +zone=33 +ellps=GRS80 +units=m +no_defs"`

```
##Compute area of each LOR in km^2 area (unit: meters -> convert to square km)
lor$area <- gArea(lor, byid=TRUE) / (1e6)
##Read population
pop <- readr::read_csv2(file=file.path(filePath, "EWR201512E_Matrix.csv"))
##Merge SpatialPolygonsDataFrame with population information
lor_pop <- merge(lor, pop, by.x="PLR",by.y="RAUMID")
```

Plotting the result of the `pgrs`

object as an instance of `SpatialPolygonsDataFrame`

can be done using the standard `Spatial*`

plotting routines documented extensively in, e.g, Bivand, Pebesma, and Gómez-Rubio (2008) and its comprehensive webpage.

```
######################################################################
## Plotting the result, see nice tutorial by
## http://www.nickeubank.com/wp-content/uploads/2015/10/RGIS3_MakingMaps_part1_mappingVectorData.html
## or the Bivand et al. (2008) book - a must read!
## Note: there is a 2nd edition available nowadays.
######################################################################
library(RColorBrewer)
my.palette <- brewer.pal(n = 6, name = "Purples")
##Helper function for making labels for each entry
sp.label <- function(x, label) {list("sp.text", coordinates(x), label,cex=0.5)}
borderCol <- "white"
#Plot choropleth map
spplot(pgrs, "density", col.regions = my.palette, cuts = length(my.palette)-1, col = borderCol,main="Choropleth map of Population Density",sp.layout=sp.label(pgrs, pgrs$EXTPGRNAME))
require(grid)
grid.text(expression("Population density (Persons / "~km^2~")"), x=unit(0.95, "npc"), y=unit(0.50, "npc"), rot=-90)
```

## Installing the Cartogram R packages

Two packages `Rcartogram`

and `getcartr`

make the functionality of the Gastner and Newman (2004) procedure available for working with objects of class `Spatial*`

. Installing `Rcartogram`

requires the `fftw`

library to be installed. How to best do that depends on your system, for Mac OS X the homebrew package system makes this installation easy.

```
##On command line in OS/X with homebrew. Wrapped in FALSE statement to not run system() unintentionally
if (FALSE) {
system("brew install fftw")
}
##Install the R implementation of Cart by Gastner and Newman (2004)
devtools::install_github("omegahat/Rcartogram")
devtools::install_github('chrisbrunsdon/getcartr',subdir='getcartr')
```

We are now ready to compute our first cartogram using the `getcartr::quick.carto`

function.

```
library(Rcartogram)
library(getcartr)
##Make a cartogram
pgrs_carto <- quick.carto(spdf=pgrs,v=pgrs$E_E,res=256)
##Display it using sp functionality
spplot(pgrs_carto, "area", col.regions = my.palette, cuts = length(my.palette)-1, col = borderCol,main="Population Cartogram as Choropleth of Area",sp.layout=sp.label(pgrs_carto, label=pgrs_carto$EXTPGRNAME))
grid::grid.text(expression("Area (km"^2*")"), x=unit(0.95, "npc"), y=unit(0.50, "npc"), rot=-90)
```

With the cartogram functionality now being directly available through R allows one to embedd cartogram making in a full R pipeline. We illustrate this by generating a sequence of cartograms into an animated GIF file using the `animation`

package. The animation below shows a cartogram for the population size for each of the 32 age group in the Berlin data set. One observes that the 25-45 year old tend to live in the city centre, while the 95-110 year old seem to concentrate in the wealthy regions in the south west.

# Outlook

While writing this posts some other useRs have posted on how to create interactive cartograms.

# References

Bivand, R. S., E. J. Pebesma, and V. Gómez-Rubio. 2008. *Applied Spatial Data Analysis with R*. Springer-Verlag.

Gastner, Michael T., and M. E. J. Newman. 2004. “Diffusion-Based Method for Producing Density-Equalizing Maps.” *Proceedings of the National Academy of Sciences of the United States of America* 101 (20): 7499–7504. doi:10.1073/pnas.0400280101.

Zeileis, Achim, Kurt Hornik, and Paul Murrell. 2009. “Escaping RGBland: Selecting Colors for Statistical Graphics.” *Computational Statistics & Data Analysis* 53 (9): 3259–70. doi:http://dx.doi.org/10.1016/j.csda.2008.11.033.

**leave a comment**for the author, please follow the link and comment on their blog:

**Theory meets practice...**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...