The rest of this note expands on some of the ideas outlined in the README file of the tidykml package.
Some of the maps available through Google My Maps contain extremely detailed information about things like non-Hispanic gangs in South Los Angeles. (If you are interested in the topic, Hispanic gangs are there too, and the relevant Reddit thread links to even more detailed maps covering even more territory.)
The problem, however, is that there is no straightforward way to reuse the data from these maps in R. Google My Maps exports its maps in KML, and while several packages, such as sf, can read the Simple Features that make up a KML file, these packages do not (yet) provide methods to easily pass the data to ggplot2.
The state of the art
To be clear, it has long been possible to plot spatial data with ggplot2, and the ggmap package makes several static map sources easily available from within R, for use with ggplot2.
Still, solutions for plotting spatial data with ggplot2 remain complex and somewhat experimental. One solution currently in development is the ggspatial package, which makes use of Michael D. Sumner and Kohske Takahashi’s ggpolypath package. Michael D. Sumner is also developing a suite of packages, including spdplyr, to turn spatial data into tidy data frames.
Spatial data are complex to plot: they do not fit nicely in rectangular datasets, they make use of several coordinate systems, and they involve mixes of raster and vector information more often than many other data. However, a lot of people are working on spatial data visualization, and places like R-sig-geo or GIS Stack Exchange contain many helpful threads on the topic.
A temporary solution
After checking the sources mentioned above, I decided that KML files downloaded from Google My Maps deserved their own little experimental package. The aim of the package would be to go as quickly as possible from the KML file to ggplot2, which meant coding some sort of fortify method for KML data.
Again, it is important to stress that there are already some methods to plot spatial data with ggplot2. However, I wanted something tailored to the kind of data provided on Google My Maps, and therefore ended up writing a handful of xml2 wrappers to read KML into tidy data frames.
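To give a rough idea of what such a wrapper can look like, here is a minimal sketch that turns the outer ring of each KML polygon into one row per vertex. It is not the package's actual code: the helper name `kml_polygons_sketch` and the inline KML document are made up for illustration, and the sketch assumes that every placemark holds a single, simple polygon.

```r
library(xml2)

# Made-up KML document with two simple polygons (illustration only).
kml <- '<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Area A</name>
      <Polygon><outerBoundaryIs><LinearRing><coordinates>
        0,0,0 1,0,0 1,1,0 0,0,0
      </coordinates></LinearRing></outerBoundaryIs></Polygon>
    </Placemark>
    <Placemark>
      <name>Area B</name>
      <Polygon><outerBoundaryIs><LinearRing><coordinates>
        2,2,0 3,2,0 3,3,0 2,2,0
      </coordinates></LinearRing></outerBoundaryIs></Polygon>
    </Placemark>
  </Document>
</kml>'

# Hypothetical helper: KML text (or file path) -> data frame of polygon vertices.
kml_polygons_sketch <- function(x) {
  doc <- read_xml(x)
  xml_ns_strip(doc)  # drop the default KML namespace to keep XPath short
  placemarks <- xml_find_all(doc, ".//Placemark[Polygon]")
  rows <- lapply(placemarks, function(p) {
    ring <- xml_find_first(p, "./Polygon/outerBoundaryIs/LinearRing/coordinates")
    # KML coordinates are whitespace-separated "lon,lat,alt" tuples.
    tuples <- strsplit(trimws(xml_text(ring)), "[[:space:]]+")[[1]]
    xyz <- do.call(rbind, strsplit(tuples, ","))
    data.frame(
      name = xml_text(xml_find_first(p, "./name")),
      longitude = as.numeric(xyz[, 1]),
      latitude = as.numeric(xyz[, 2]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

polygons <- kml_polygons_sketch(kml)
```

A data frame in this long format can then be handed to ggplot2 in the usual way, for instance with `ggplot(polygons, aes(longitude, latitude, group = name)) + geom_polygon()`.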
The result is the tidykml package, which reads basic KML geometries into tibbles, a.k.a. tidy data frames. The “Sherman’s March” example shown at the top of this note is an elaboration on one of the two examples featured in the README of the package, the other example being this map of L.A. non-Hispanic gangs:
Each map takes fewer than a dozen lines of code. The raw KML files used in the two examples are bundled with the package, in zipped KML format: see the ?states documentation pages for details and precise sources.
Some drastic limitations
As underlined in its README, the tidykml package is drastically limited in at least two ways. The first limitation is that the package was conceived for, and tested against, KML files from either GADM, a database of global administrative areas, or from Google My Maps. As a result, it might misbehave with KML files from other sources.
The second limitation is an even more drastic one: the tidykml package takes the easy way out of multi-geometries (such as multi-polygons) by only taking into account the first element of these geometries. This means, for instance, that a U.S. state that contains islands will lose these islands on import (provided that the first polygon of the state holds its mainland component, and all further elements hold its islands).
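The effect of that shortcut can be illustrated with a small sketch. Again, this is not the package's code: the KML below is invented, and `ring_to_df` is a hypothetical helper. The sketch compares keeping only the first polygon of a multi-geometry with keeping every polygon and tagging each one with a `part` identifier.

```r
library(xml2)

# Made-up placemark: a "mainland" polygon followed by an "island" polygon.
kml <- '<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>Some State</name>
    <MultiGeometry>
      <Polygon><outerBoundaryIs><LinearRing><coordinates>
        0,0,0 4,0,0 4,4,0 0,0,0
      </coordinates></LinearRing></outerBoundaryIs></Polygon>
      <Polygon><outerBoundaryIs><LinearRing><coordinates>
        6,6,0 7,6,0 7,7,0 6,6,0
      </coordinates></LinearRing></outerBoundaryIs></Polygon>
    </MultiGeometry>
  </Placemark>
</kml>'

doc <- read_xml(kml)
xml_ns_strip(doc)  # drop the default KML namespace to keep XPath short

# Hypothetical helper: outer ring of one polygon -> data frame of vertices.
ring_to_df <- function(polygon, part) {
  txt <- xml_text(
    xml_find_first(polygon, "./outerBoundaryIs/LinearRing/coordinates")
  )
  tuples <- strsplit(trimws(txt), "[[:space:]]+")[[1]]
  xyz <- do.call(rbind, strsplit(tuples, ","))
  data.frame(
    part = part,
    longitude = as.numeric(xyz[, 1]),
    latitude = as.numeric(xyz[, 2])
  )
}

parts <- xml_find_all(doc, ".//Placemark/MultiGeometry/Polygon")

# Shortcut: keep the first element only, so the island is silently dropped.
first_only <- ring_to_df(parts[[1]], part = 1L)

# Alternative: keep every element, with a part id to group on when plotting.
all_parts <- do.call(
  rbind,
  lapply(seq_along(parts), function(i) ring_to_df(parts[[i]], part = i))
)
```

With the second data frame, grouping on `part` (for instance `aes(group = part)` in `geom_polygon()`) would draw the mainland and the island as separate shapes. Holes in polygons would still need separate handling.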
Both limitations above have solutions, but these solutions are too complex with regard to the goal of the tidykml package, which is available on GitHub but will probably never be available on CRAN, given its experimental nature and limitations. In the future, I would rather trust other R packages to develop comprehensive and straightforward methods to visualize KML files and other forms of spatial data.
Update: the package has now been tested against GADM data. The limits of the package are very obvious: since it does not handle inner boundaries (holes in polygons), the map for France at level 0, for instance, is a complete failure. Similarly, very detailed maps (France at level 4, for instance) take a long time to process.