Introducing Tidygeocoder 1.0.0

[This article was first published on Jesse Cambon-R, and kindly contributed to R-bloggers.]

Tidygeocoder v1.0.0 is now live on CRAN. It brings numerous new features and improvements, including batch geocoding (submitting multiple addresses per query), returning full results from geocoder services (not just latitude and longitude), address component arguments (city, country, etc.), query customization, and reduced package dependencies.

For a full list of new features and improvements, refer to the release page on GitHub. For usage examples, see the Getting Started vignette.

To demonstrate a few of the new capabilities of this package, I decided to make a map of the stadiums for the UEFA Champions League Round of 16 clubs. To start, I looked up the addresses for the stadiums and put them in a data frame.

library(dplyr)
library(tidygeocoder)
library(ggplot2)
library(maps)
library(ggrepel)

# https://www.uefa.com/uefachampionsleague/clubs/
stadiums <- tibble::tribble(
~Club,                ~Street,   ~City,   ~Country,
"Barcelona",          "Camp Nou", "Barcelona", "Spain",
"Bayern Munich",      "Allianz Arena", "Munich", "Germany",
"Chelsea",            "Stamford Bridge", "London", "UK",
"Borussia Dortmund",  "Signal Iduna Park", "Dortmund", "Germany",
"Juventus",           "Allianz Stadium", "Turin", "Italy",
"Liverpool",          "Anfield", "Liverpool", "UK",
"Olympique Lyonnais", "Groupama Stadium", "Lyon", "France",
"Man. City",          "Etihad Stadium", "Manchester", "UK",
"Napoli",             "San Paolo Stadium", "Naples", "Italy",
"Real Madrid",        "Santiago Bernabéu Stadium", "Madrid", "Spain",
"Tottenham",          "Tottenham Hotspur Stadium", "London", "UK",
"Valencia",           "Av. de Suècia, s/n, 46010", "Valencia", "Spain",
"Atalanta",           "Gewiss Stadium", "Bergamo", "Italy",
"Atlético Madrid",    "Estadio Metropolitano", "Madrid", "Spain",
"RB Leipzig",         "Red Bull Arena", "Leipzig", "Germany",
"PSG",                "Le Parc des Princes", "Paris", "France"
  )

To geocode these addresses, you can use the geocode function as shown below. New in v1.0.0, the street, city, and country arguments specify the address. The Nominatim (OSM) geocoder is selected with the method argument. Additionally, the full_results and custom_query arguments (also new in v1.0.0) are used to return the full geocoder results and to set Nominatim’s “extratags” parameter, which returns extra columns.

stadium_locations <- stadiums %>%
  geocode(street = Street, city = City, country = Country, method = 'osm',
          full_results = TRUE, custom_query = list(extratags = 1))
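
For comparison with the component arguments used above, here is a minimal sketch of the single-line alternative: pasting the components into one free-text address column and passing it through geocode's address argument. The names stadiums_sample, single_line, and geocode_single_line are made up for this illustration, and the geocoding call is wrapped in a function so that no query is sent until you call it.

```r
library(dplyr)

# Two rows copied from the stadiums table above.
stadiums_sample <- tibble::tribble(
  ~Club,       ~Street,    ~City,       ~Country,
  "Barcelona", "Camp Nou", "Barcelona", "Spain",
  "Liverpool", "Anfield",  "Liverpool", "UK"
)

# Collapse the address components into one free-text address per row.
single_line <- stadiums_sample %>%
  mutate(address = paste(Street, City, Country, sep = ", "))

print(single_line$address)

# Hypothetical wrapper (not part of tidygeocoder): geocoding needs network
# access, so the query is only sent when this function is called.
geocode_single_line <- function(df) {
  tidygeocoder::geocode(df, address = address, method = "osm")
}
```

Whether components or a single-line address gives better matches probably depends on the geocoder service and on how clean the input is, so it may be worth trying both on a few rows.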

The geocode call above returns 40 columns, including the longitude and latitude. A few of the columns added by the extratags parameter are shown below.

stadium_locations %>%
  select(Club, City, Country, extratags.sport, extratags.capacity, extratags.operator, extratags.wikipedia) %>%
  rename_with(~gsub('extratags.', '', ., fixed = TRUE)) %>%  # strip the literal "extratags." prefix
  knitr::kable()

| Club | City | Country | sport | capacity | operator | wikipedia |
|---|---|---|---|---|---|---|
| Barcelona | Barcelona | Spain | soccer | NA | NA | en:Camp Nou |
| Bayern Munich | Munich | Germany | soccer | 75021 | NA | de:Allianz Arena |
| Chelsea | London | UK | soccer | 41837 | Chelsea Football Club | en:Stamford Bridge (stadium) |
| Borussia Dortmund | Dortmund | Germany | soccer | NA | NA | de:Signal Iduna Park |
| Juventus | Turin | Italy | soccer | NA | NA | it:Allianz Stadium (Torino) |
| Liverpool | Liverpool | UK | soccer | 54074 | Liverpool Football Club | en:Anfield |
| Olympique Lyonnais | Lyon | France | soccer | 58000 | Olympique Lyonnais | fr:Parc Olympique lyonnais |
| Man. City | Manchester | UK | soccer | NA | Manchester City Football Club | en:City of Manchester Stadium |
| Napoli | Naples | Italy | soccer | NA | NA | en:Stadio San Paolo |
| Real Madrid | Madrid | Spain | soccer | 85454 | NA | es:Estadio Santiago Bernabéu |
| Tottenham | London | UK | soccer;american_football | 62062 | Tottenham Hotspur | en:Tottenham Hotspur Stadium |
| Valencia | Valencia | Spain | NA | NA | NA | NA |
| Atalanta | Bergamo | Italy | soccer | NA | NA | NA |
| Atlético Madrid | Madrid | Spain | soccer | NA | NA | es:Estadio Metropolitano (Madrid) |
| RB Leipzig | Leipzig | Germany | NA | NA | NA | de:Red Bull Arena (Leipzig) |
| PSG | Paris | France | soccer | 48527 | Paris Saint-Germain | fr:Parc des Princes |
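
The extra columns can be used like any other data frame columns. As a rough sketch, the snippet below ranks stadiums by capacity using a few rows copied from the table above. It assumes the capacity values arrive as text, hence the conversion step, and the names capacities and largest_first are made up for this example.

```r
library(dplyr)

# A few rows copied from the table above. Capacity is stored as text here,
# on the assumption that extratags values come back as character strings.
capacities <- tibble::tribble(
  ~Club,           ~capacity,
  "Bayern Munich", "75021",
  "Chelsea",       "41837",
  "Napoli",        NA_character_,
  "Real Madrid",   "85454",
  "PSG",           "48527"
)

largest_first <- capacities %>%
  mutate(capacity = as.integer(capacity)) %>%  # text to integer; NA stays NA
  filter(!is.na(capacity)) %>%                 # drop stadiums without a capacity tag
  arrange(desc(capacity))                      # largest stadium first

print(largest_first)
```

Because the tags are contributed by OpenStreetMap users, missing values such as the one for Napoli are to be expected and are simply filtered out here.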

Below, the stadium locations are plotted on a map of Europe using the longitude and latitude coordinates and ggplot2.

# reference: https://www.datanovia.com/en/blog/how-to-create-a-map-using-ggplot2/
# A selection of European countries
some.eu.countries <- c(
  "Portugal", "Spain", "France", "Switzerland", "Germany",
  "Austria", "Belgium", "UK", "Netherlands",
  "Denmark", "Poland", "Italy", 
  "Croatia", "Slovenia", "Hungary", "Slovakia",
  "Czech republic"
)
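# Retrieve the map data for those countries
# (kept for reference; the plot below draws its outlines with borders() instead)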
some.eu.maps <- map_data("world", region = some.eu.countries)

# Plot
ggplot(stadium_locations, aes(x = long, y = lat)) +
  borders('world', xlim = c(-10, 10), ylim = c(40, 55)) +
  geom_label_repel(aes(label = Club), force = 2, segment.alpha = 0) + 
  geom_point() + theme_void() 
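
The plot above draws country outlines with borders(). As an alternative sketch, the polygons returned by map_data() can be drawn directly with geom_polygon(), which limits the map to the selected countries. The names countries, eu_map, base_map, and add_points are made up for this illustration, and add_points assumes a data frame with long and lat columns like the geocoder output.

```r
library(ggplot2)
library(maps)  # supplies the "world" database that map_data() reads

# A shorter country list than the one above, to keep the example small.
countries <- c("Portugal", "Spain", "France", "Germany", "Italy", "UK")
eu_map <- map_data("world", region = countries)

# Filled country polygons; group keeps each landmass a separate shape.
base_map <- ggplot(eu_map, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "grey95", colour = "grey60") +
  coord_quickmap() +
  theme_void()

# Hypothetical helper: overlay geocoded points on the base map.
# inherit.aes = FALSE is needed because the points have no group column.
add_points <- function(map, locations) {
  map +
    geom_point(data = locations, aes(x = long, y = lat), inherit.aes = FALSE)
}
```

With the geocoded data from earlier, add_points(base_map, stadium_locations) should place the stadiums on the filled map.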

Another great mapping option is the leaflet package, which was originally what I intended to use for the map above, but getting it to render on a Jekyll blog proved to be a bit involved.
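
For readers who want to try the leaflet route, here is a minimal sketch of what an interactive version might look like. The helper stadium_leaflet is made up for this example, and it assumes a data frame with long, lat, and Club columns like the stadium_locations result above.

```r
library(leaflet)

# Hypothetical helper: builds an interactive map with one marker per row.
# Expects columns named long, lat, and Club.
stadium_leaflet <- function(locations) {
  leaflet(data = locations) %>%
    addTiles() %>%  # default OpenStreetMap tiles
    addMarkers(lng = ~long, lat = ~lat, label = ~Club)
}
```

Calling stadium_leaflet(stadium_locations) in an interactive session should open the map in the viewer.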

If you find any issues with the package or have ideas on how to improve it, feel free to file an issue on GitHub. For reference, the RMarkdown file that generated this blog post can be found here.
