OSM Nominatim with R: getting Location’s Geo-coordinates by its Address

January 9, 2018

(This article was first published on R Programming – DataScience+, and kindly contributed to R-bloggers)

Data scraped from the web often includes addresses but not the geo-coordinates that further analysis, such as clustering, may require. Geocoding is therefore often needed to get a location’s coordinates from its address.

There are several options; one of the most popular is the Google Geocoding API. It is easy to use from R via the geocode function in the ggmap package. When used free of charge, it is limited to 2,500 requests a day (see Google's documentation for details).

To increase the number of free geocoding requests, the OpenStreetMap (OSM) Nominatim API can be used instead. OSM allows up to 1 request per second (see the usage policy), which gives about 35 times as many API calls per day as the free tier of the Google Geocoding API.
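The factor of roughly 35 follows from simple arithmetic. The sketch below assumes the one-request-per-second limit is sustained for a full day; the variable names are only illustrative.

```r
# Requests per day at one request per second
osm_per_day <- 24 * 60 * 60
# Free daily quota of the Google Geocoding API quoted above
google_per_day <- 2500
# Ratio of the two daily limits
ratio <- osm_per_day / google_per_day
round(ratio, 1)  # 34.6
```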

Here is one way to call the OSM Nominatim API from R:

## geocoding function using OSM Nominatim API
## details: http://wiki.openstreetmap.org/wiki/Nominatim
## made by: D.Kisler 

nominatim_osm <- function(address = NULL)
{
  # nothing to geocode: return an empty data.frame
  if(is.null(address)) return(data.frame())
  # put the address (whitespace encoded as %20) into the request URL;
  # any request or parsing error yields an empty data.frame
  d <- tryCatch(
    jsonlite::fromJSON( 
      gsub('\\@addr\\@', gsub('\\s+', '\\%20', address), 
           'http://nominatim.openstreetmap.org/search/@addr@?format=json&addressdetails=0&limit=1')
    ), error = function(c) return(data.frame())
  )
  # the API found no match
  if(length(d) == 0) return(data.frame())
  return(data.frame(lon = as.numeric(d$lon), lat = as.numeric(d$lat)))
}

The function requires the jsonlite package.

Function input: the location address as a string.
Function output: a data.frame with lon (longitude) and lat (latitude) of the input location, or an empty data.frame if the address is missing or invalid.
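The function encodes only whitespace, so an address containing other reserved characters (such as `&` or `#`) could produce a malformed URL. Here is a minimal sketch of a more defensive way to build the request, assuming the `q` query parameter of the Nominatim search endpoint; the helper name `build_nominatim_url` is hypothetical and not part of the original function.

```r
# Hypothetical helper (not from the original article): builds a Nominatim
# search URL for a free-form address using the `q` query parameter.
build_nominatim_url <- function(address,
                                base = "https://nominatim.openstreetmap.org/search") {
  # percent-encode everything except unreserved characters
  q <- utils::URLencode(address, reserved = TRUE)
  paste0(base, "?q=", q, "&format=json&addressdetails=0&limit=1")
}

url <- build_nominatim_url("Baker Street 221b, London")
url
```

The resulting string could then be passed to jsonlite::fromJSON in the same way as the URL built inside nominatim_osm.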

Let’s test the function.

#dplyr is used to stack the list of data.frames into one data.frame and provides the pipe operator '%>%'
library(dplyr)
#input addresses
addresses <- c("Baker Street 221b, London", "Brandenburger Tor, Berlin", 
               "Platz der Deutschen Einheit 1, Hamburg", "Arc de Triomphe de l’Etoile, Paris",
               "Дворцовая пл., Санкт-Петербург, Россия")
d <- suppressWarnings(lapply(addresses, function(address) {
  #record the start time
  t <- Sys.time()
  #calling the nominatim OSM API
  api_output <- nominatim_osm(address)
  #get the elapsed time in seconds
  t <- difftime(Sys.time(), t, units = 'secs')
  #return data.frame with the input address, output of the nominatim_osm function and elapsed time
  return(data.frame(address = address, api_output, elapsed_time = t))
  }) %>%
#stack the list output into data.frame
bind_rows() %>% data.frame())
#output the data.frame content into console
d
                                 address        lon      lat   elapsed_time
1              Baker Street 221b, London -0.1584945 51.52376 0.2216313 secs
2              Brandenburger Tor, Berlin 13.3777025 52.51628 0.1038268 secs
3 Platz der Deutschen Einheit 1, Hamburg  9.9842058 53.54129 0.1253307 secs
4     Arc de Triomphe de l’Etoile, Paris  2.2950372 48.87378 0.1097755 secs
5 Дворцовая пл., Санкт-Петербург, Россия 30.3151066 59.93952 0.1000750 secs
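The elapsed times of roughly 0.1 seconds mean that a plain loop can send several requests per second, which is more than the limit of 1 request per second mentioned earlier. One way to stay within it is to space out the calls. The sketch below is an illustration only: `throttle` is a hypothetical helper, and a stand-in geocoder with a shortened interval is used so that the example runs quickly and offline.

```r
# Hypothetical wrapper: returns a version of `f` whose consecutive calls
# start at least `min_interval` seconds apart.
throttle <- function(f, min_interval = 1) {
  last_call <- NULL
  function(...) {
    if (!is.null(last_call)) {
      # time still to wait since the previous call started
      wait <- min_interval -
        as.numeric(difftime(Sys.time(), last_call, units = "secs"))
      if (wait > 0) Sys.sleep(wait)
    }
    last_call <<- Sys.time()
    f(...)
  }
}

# Stand-in for a real geocoder so the sketch needs no network access
fake_geocoder <- function(address) data.frame(lon = 0, lat = 0)
slow_geocoder <- throttle(fake_geocoder, min_interval = 0.2)

t0 <- Sys.time()
res <- lapply(c("a", "b", "c"), slow_geocoder)
elapsed <- as.numeric(difftime(Sys.time(), t0, units = "secs"))
```

With the function from this article, throttle(nominatim_osm, min_interval = 1) would take the place of the stand-in.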
