Simplifying polygon layers

[This article was first published on r.iresmi.net, and kindly contributed to R-bloggers].

The current 2021 French administrative boundaries database (Adminexpress from IGN) is far more detailed than the original version (from 50 MB zipped in 2017 to 500 MB zipped now), because its geometry is now based on the BDTOPO. However, we don't always need large-scale detail, especially for web applications. The commune layer alone is a huge 400 MB shapefile, not really usable in, for example, a small-scale Leaflet map.

Using sf::st_simplify() in R or a similar command in QGIS on these shapefiles would create holes or overlapping polygons, because shapefiles are not topologically aware: each polygon is simplified independently of its neighbours. We could probably convert to lines, build topology, simplify, clean, and rebuild polygons in GRASS or ArcGIS, but it's quite a hassle…
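To see the problem on a toy case, the following sketch simplifies two adjacent polygons with sf::st_simplify(). The coordinates and the tolerance are invented for illustration, and no CRS is set so that the planar GEOS routine is used. The two polygons only touch before simplification; afterwards a positive overlap area is expected, because a vertex of the shared border is dropped from one polygon only.

```r
library(sf)

# Two made-up polygons sharing the border (0,-1) -> (0.3,0) -> (0,1).
left <- st_polygon(list(rbind(
  c(-2, 0), c(0, -1), c(0.3, 0), c(0, 1), c(-2, 0)
)))
right <- st_polygon(list(rbind(
  c(0, -1), c(3, -1), c(3, 1), c(0, 1), c(0.3, 0), c(0, -1)
)))
pair <- st_sfc(left, right)  # no CRS: planar geometry

# Area shared by the two polygons (0 when they only touch along a line).
overlap_area <- function(x) {
  sum(st_area(st_intersection(x[1], x[2])))
}

# Each feature is simplified on its own, ignoring its neighbour.
pair_simpl <- st_simplify(pair, preserveTopology = FALSE, dTolerance = 0.5)

overlap_before <- overlap_area(pair)
overlap_after  <- overlap_area(pair_simpl)
overlap_before
overlap_after
```

Any non-zero value of overlap_after signals that the shared border no longer matches on both sides.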

A nice solution is to use Mapshaper on mapshaper.org or, better for reproducibility, {rmapshaper} in R. For such a large dataset, it is advisable to use a Node.js install instead of relying on the package's embedded version.
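As a hedged illustration, the sketch below runs ms_simplify() from {rmapshaper} on two made-up adjacent polygons. The coordinates and the keep value are invented, and the package's embedded engine is assumed to be available (no system Node.js is needed for such a tiny layer). Mapshaper simplifies a shared border once for both neighbours, so the two polygons should still only touch afterwards.

```r
library(sf)
library(rmapshaper)

# Two made-up polygons sharing the border (0,-1) -> (0.3,0) -> (0,1).
left <- st_polygon(list(rbind(
  c(-2, 0), c(0, -1), c(0.3, 0), c(0, 1), c(-2, 0)
)))
right <- st_polygon(list(rbind(
  c(0, -1), c(3, -1), c(3, 1), c(0, 1), c(0.3, 0), c(0, -1)
)))
pair <- st_sf(id = c("left", "right"), geometry = st_sfc(left, right))

# Topology-aware simplification with the embedded engine (sys = FALSE by default).
pair_simpl <- ms_simplify(pair, keep = 0.5, method = "vis", keep_shapes = TRUE)

# Area shared by the two simplified polygons; expected to stay at zero.
geoms <- st_geometry(pair_simpl)
overlap_area <- sum(st_area(st_intersection(geoms[1], geoms[2])))
overlap_area
```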

In red, the original; in black, the simplified version, with départements in bold.

On Debian-like systems:

> sudo apt-get install nodejs npm

Or on Windows: install Node.js from https://nodejs.org/. If needed, add C:\Users\xxxxxxxx\AppData\Roaming\npm to your PATH.

> npm config set proxy "http://login:password@proxy:8080" # if necessary
> npm install -g mapshaper
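Before going further, it may be worth confirming from R that the mapshaper executable is visible on the PATH. This is a minimal sketch using only base R; the messages are illustrative, and the script further down also calls check_sys_mapshaper() from {rmapshaper}.

```r
# Look for the mapshaper executable installed by `npm install -g mapshaper`.
mapshaper_path <- unname(Sys.which("mapshaper"))

if (nzchar(mapshaper_path)) {
  message("mapshaper found at: ", mapshaper_path)
  # Print the installed version.
  system2("mapshaper", "--version")
} else {
  message("mapshaper not found: check the Node.js install and your PATH.")
}
```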

For ms_simplify() we set sys = TRUE to take advantage of the Node.js executable. Experiment with the other parameters to get a resolution that suits you. Here we use the Visvalingam method and keep 3% of the vertices, squeezing the commune layer from 400 MB to 30 MB. From there we rebuild the département, région and EPCI layers with ms_dissolve() commands. Then we join the original attributes back and export to a GeoPackage with some metadata.
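The dissolve-then-join step can be tried on a toy layer first. In this sketch the four squares, the dep codes and the dep_names lookup table are all invented for illustration, and the embedded engine is used rather than sys = TRUE. ms_dissolve() merges the squares sharing a dep value, and the names are then joined back on that field.

```r
library(sf)
library(dplyr)
library(rmapshaper)

# Helper building a unit square whose lower-left corner is (x, y).
unit_square <- function(x, y) {
  st_polygon(list(rbind(
    c(x, y), c(x + 1, y), c(x + 1, y + 1), c(x, y + 1), c(x, y)
  )))
}

# Four made-up "communes" in two made-up "départements".
communes <- st_sf(
  commune = c("a", "b", "c", "d"),
  dep = c("01", "01", "02", "02"),
  geometry = st_sfc(
    unit_square(0, 0), unit_square(1, 0),
    unit_square(0, 1), unit_square(1, 1)
  )
)

# Hypothetical attribute table for the départements.
dep_names <- tibble(dep = c("01", "02"), name = c("South", "North"))

# Dissolve on the code, then bring the attributes back with a join.
deps <- communes %>%
  ms_dissolve(field = "dep") %>%
  left_join(dep_names, by = "dep")

deps
```

Dissolving the already simplified communes, rather than simplifying each level separately, keeps the borders of all levels consistent with one another.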

library(tidyverse)
library(sf)
library(rmapshaper)
library(geojsonio)
library(janitor)
library(fs)

# ADMIN EXPRESS COG, whole of France, 2021 edition (in WGS84)
# ftp://Admin_Express_ext:[email protected]/ADMIN-EXPRESS-COG_3-0__SHP__FRA_WM_2021-05-19.7z
# also available on :
# http://files.opendatarchives.fr/professionnels.ign.fr/adminexpress/ADMIN-EXPRESS-COG_3-0__SHP__FRA_WM_2021-05-19.7z


# originals ---------------------------------------------------------------

source_ign <- "~/sig/ADMINEXPRESS/ADMIN-EXPRESS-COG_3-0__SHP__FRA_2021-05-19/ADMIN-EXPRESS-COG/1_DONNEES_LIVRAISON_2021-05-19/ADECOG_3-0_SHP_WGS84G_FRA"

com <- source_ign %>% 
  path("COMMUNE.shp") %>% 
  read_sf() %>% 
  clean_names()

dep <- source_ign %>% 
  path("DEPARTEMENT.shp") %>% 
  read_sf() %>% 
  clean_names()

reg <- source_ign %>% 
  path("REGION.SHP") %>% 
  read_sf() %>% 
  clean_names()

epci <- source_ign %>% 
  path("EPCI.shp") %>% 
  read_sf() %>% 
  clean_names()

# simplify ---------------------------------------------------------------

check_sys_mapshaper()

# takes about 6 min
# convert to GeoJSON with geojson_json() first to avoid encoding problems
com_simpl <- com %>%
  geojson_json(lat = "lat", lon = "long", group = "INSEE_COM", geometry = "polygon", precision = 6) %>%
  ms_simplify(keep = 0.03, method = "vis", keep_shapes = TRUE, sys = TRUE)

dep_simpl <- com_simpl %>% 
  ms_dissolve(field = "insee_dep", sys = TRUE)

reg_simpl <- com_simpl %>% 
  ms_dissolve(field = "insee_reg", sys = TRUE)

epci_simpl <- com_simpl %>% 
  ms_dissolve(field = "siren_epci", sys = TRUE)


# add attributes and export ----------------------------------------------

destination  <- "~/donnees/ign/adminexpress_simpl.gpkg"

com_simpl %>% 
  geojson_sf() %>% 
  st_write(destination, layer = "commune",
           layer_options = c("IDENTIFIER=Communes Adminexpress 2021 simplifiées",
                             "DESCRIPTION=France WGS84 version COG (2021-05). Simplification mapshaper."))

dep_simpl %>% 
  geojson_sf() %>% 
  left_join(st_drop_geometry(dep), by = "insee_dep") %>% 
  st_write(destination, layer = "departement",
           layer_options = c("IDENTIFIER=Départements Adminexpress 2021 simplifiés",
                             "DESCRIPTION=France WGS84 version COG (2021-05). Simplification mapshaper."))

reg_simpl %>% 
  geojson_sf() %>% 
  left_join(st_drop_geometry(reg), by = "insee_reg") %>% 
  st_write(destination, layer = "region",
           layer_options = c("IDENTIFIER=Régions Adminexpress 2021 simplifiées",
                             "DESCRIPTION=France WGS84 version COG (2021-05). Simplification mapshaper."))

epci_simpl %>% 
  geojson_sf() %>% 
  mutate(siren_epci = str_remove(siren_epci, "200054781/")) %>% # remove Grand Paris
  left_join(st_drop_geometry(epci), by = c("siren_epci" = "code_siren")) %>% 
  st_write(destination, layer = "epci",
           layer_options = c("IDENTIFIER=EPCI Adminexpress 2021 simplifiés",
                             "DESCRIPTION=Établissement public de coopération intercommunale France WGS84 version COG (2021-05). Simplification mapshaper."))
  
