Plane Crash Data – Part 2: Google Maps Geocoding API Request

August 16, 2017

(This article was first published on INWT-Blog-RBloggers, and kindly contributed to R-bloggers)

This is the second part of our series about plane crash data. To execute the code below, you’ll first need to execute the code from the first part of this series to obtain the prepared plane crash dataset.

In this part I’d like to get the geocoordinates from the Google Maps Geocoding API for the crash location and the point of departure as well as for the intended point of arrival. The location of the crash is contained in the location variable. The other two pieces of information are contained in the route variable, so we first need to extract them and store them in separate variables.

separators <- " - |- | -"
data <- data %>%
  # split route variable into "from" and "to"
  separate(route,
           sep = separators,
           into = c("from", "to"),
           extra = "merge") %>%
  # if there was a pit stop, "to" sometimes still contains two locations.
  # we need only the last one.
  separate(to,
           sep = separators,
           into = c("pitStop", "to"),
           fill = "left") %>%
  select(-pitStop)
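To see what the two separate() calls do, here is a minimal sketch on two made-up routes (toy values, not taken from the plane crash dataset). It assumes the tidyr and dplyr packages are installed; the object names toyRoutes and toySplit are only for illustration.

```r
library(dplyr)
library(tidyr)

separators <- " - |- | -"

# two invented routes: one direct, one with a pit stop
toyRoutes <- data.frame(route = c("Paris - London", "Rome - Athens - Cairo"),
                        stringsAsFactors = FALSE)

toySplit <- toyRoutes %>%
  # first split: everything after the first separator ends up in "to"
  separate(route, sep = separators, into = c("from", "to"), extra = "merge") %>%
  # second split: if "to" still holds two locations, keep only the last one
  separate(to, sep = separators, into = c("pitStop", "to"), fill = "left") %>%
  select(-pitStop)

print(toySplit)
```

For the direct route, fill = "left" puts an NA into pitStop, so the single destination stays in "to".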

To keep missing values from producing meaningless requests and results, we exclude incomplete cases, i.e. rows with an NA in any variable, right from the start.

# exclude observations with NA
data <- data[complete.cases(data), ]
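Note that complete.cases() looks at every column, so a row is dropped as soon as any one of its variables is missing. A small sketch on invented data (the toy data frame below is an assumption for illustration only):

```r
# invented example rows; NA marks a missing value
toy <- data.frame(location = c("Paris, France", NA, "Lima, Peru"),
                  from     = c("London", "Rome", NA),
                  to       = c("Paris", "Oslo", "Lima"),
                  stringsAsFactors = FALSE)

# TRUE only for rows without any NA
keep <- complete.cases(toy)

toyComplete <- toy[keep, ]
print(toyComplete)
```

Only the first row survives, because the second row lacks a location and the third lacks a departure point.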

To send requests to the Google Maps Geocoding API, which converts addresses into geocoordinates, you need an API key. You can obtain one from Google: Get Google API key

Let us store our key in an R object:

apiKey <- "ENTER_API_KEY_HERE"
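If you would rather not paste the key into a script, one option is to read it from an environment variable. The following is only a sketch: the helper readApiKey and the variable name GOOGLE_MAPS_API_KEY are assumptions made for this example, not part of the original code.

```r
# Hypothetical helper: read the key from an environment variable.
# The variable name GOOGLE_MAPS_API_KEY is an assumption for this sketch.
readApiKey <- function(varName = "GOOGLE_MAPS_API_KEY") {
  key <- Sys.getenv(varName, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", varName, " is not set.")
  }
  key
}

# usage, once the variable is set (for example in ~/.Renviron):
# apiKey <- readApiKey()
```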

Now we have almost everything ready: we have complete data containing the locations of departure, intended arrival, and crash as strings, and we have an API that converts these strings into geocoordinates. However, the API returns the geocoordinates in the form of a JSON string, which we can’t use right away. So we need a function that extracts the relevant information from this JSON string and stores it in our dataset. For that we load the jsonlite package.

library("jsonlite")
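To get a feeling for what fromJSON() produces, here is a sketch that parses a hand-written, heavily trimmed JSON string. It only mimics the fields used later (status, results, geometry, location); the coordinates are made up, and a real API response contains many more fields.

```r
library(jsonlite)

# hand-written sample, NOT a real API response; values are invented
sampleJson <- '{
  "status": "OK",
  "results": [
    {"geometry": {"location": {"lat": 1.5, "lng": 2.5}}},
    {"geometry": {"location": {"lat": 3.5, "lng": 4.5}}}
  ]
}'

parsed <- fromJSON(sampleJson)

# the status is a plain string
print(parsed$status)

# the matches are simplified into nested data frames;
# the first row of "location" is the first match
firstHit <- parsed$results$geometry$location[1, ]
print(firstHit)
```

Because jsonlite simplifies the array of results into nested data frames, the coordinates can be reached with a chain of $ accessors.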

Look at the following function. It takes two arguments: the location and the API key. The return value is a one-row data frame containing the geocoordinates of the location. If the status of the request is “OK”, the API returns the geocoordinates (latitude lat and longitude lng) of one or more matches, and our function keeps the first one. If the Google API cannot find any coordinates for the requested location, it returns the status “ZERO_RESULTS” and our function returns NAs instead. This can happen if the location is unknown (?) or given as Sightseeing, for example.

getGeoCoord <- function(loc, apiKey) {

  # create request
  request <- paste0("https://maps.googleapis.com/maps/api/geocode/json?",
                    "address=", gsub(" ", "+", loc), "&key=", apiKey)

  # send the request and parse the JSON response
  result <- fromJSON(request)

  if (result$status == "OK") {
    # keep only the first match
    result <- result$results$geometry$location[1, ]
  } else {
    # "ZERO_RESULTS" or any other non-OK status: return NAs
    # instead of the raw response
    result <- data.frame(lat = NA, lng = NA)
  }

  result %>% data.frame
}
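The logic of such a function can be tried out without network access or a valid key by separating URL construction from response handling. The sketch below is not the original implementation: buildGeocodeUrl and coordFromParsed are hypothetical helpers, the parsed responses are hand-built lists with made-up values, and URLencode() is swapped in for the gsub() call so that characters such as & in a location string are escaped as well.

```r
# Hypothetical helper: build the request URL, percent-encoding the address
buildGeocodeUrl <- function(loc, apiKey) {
  paste0("https://maps.googleapis.com/maps/api/geocode/json?",
         "address=", utils::URLencode(loc, reserved = TRUE),
         "&key=", apiKey)
}

# Hypothetical helper: turn an already parsed response into one row
coordFromParsed <- function(parsed) {
  if (identical(parsed$status, "OK")) {
    # keep only the first match
    data.frame(parsed$results$geometry$location[1, ])
  } else {
    # any non-OK status yields NAs
    data.frame(lat = NA_real_, lng = NA_real_)
  }
}

# hand-built stand-ins for parsed responses (made-up values)
okResponse <- list(
  status = "OK",
  results = list(geometry = list(
    location = data.frame(lat = c(1.5, 3.5), lng = c(2.5, 4.5))
  ))
)
deniedResponse <- list(status = "REQUEST_DENIED", results = list())

url   <- buildGeocodeUrl("Near Paris & Lyon", "DUMMY_KEY")
okRow <- coordFromParsed(okResponse)
naRow <- coordFromParsed(deniedResponse)

print(url)
print(okRow)
print(naRow)
```

Returning NAs for every non-OK status keeps the output shape identical for all locations, which matters when the rows are later bound together.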

Now let us use the function and extract geocoordinates for the plane crash locations, the departure locations and the locations of intended arrival. We first store them in objects called coordCrash, coordFrom and coordTo. Then we add them to our existing dataframe.

# send requests:
# crash location
coordCrash <- lapply(data$location, getGeoCoord, apiKey = apiKey) %>%
  bind_rows %>% setNames(paste0(names(.), "CrashLoc"))
# departure location
coordFrom <- lapply(data$from, getGeoCoord, apiKey = apiKey) %>%
  bind_rows %>% setNames(paste0(names(.), "From"))
# intended arrival location
coordTo <- lapply(data$to, getGeoCoord, apiKey = apiKey) %>%
  bind_rows %>% setNames(paste0(names(.), "To"))
# add the new columns to data
data <- cbind(data, coordCrash, coordFrom, coordTo)
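Since the same location string can appear in many rows, one way to reduce the number of API requests is to geocode each distinct string only once and then map the results back to the original order. This is a sketch of that idea, not part of the original code: geocodeUnique is a hypothetical helper, and fakeGeoCoord is a stand-in that returns invented numbers so the example runs offline.

```r
# Hypothetical helper: call geocodeFun once per distinct location,
# then expand the results back to the original order
geocodeUnique <- function(locations, geocodeFun, ...) {
  uniqueLoc <- unique(locations)
  coords <- do.call(rbind, lapply(uniqueLoc, geocodeFun, ...))
  out <- coords[match(locations, uniqueLoc), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# stand-in for a real geocoder: counts its calls and returns
# invented "coordinates" derived from the string length
counter <- new.env()
counter$n <- 0
fakeGeoCoord <- function(loc, apiKey) {
  counter$n <- counter$n + 1
  data.frame(lat = nchar(loc), lng = -nchar(loc))
}

locs   <- c("Paris", "Lima", "Paris")
coords <- geocodeUnique(locs, fakeGeoCoord, apiKey = "DUMMY_KEY")

print(coords)
print(counter$n)  # number of calls made to the stand-in geocoder
```

With a real geocoding function in place of fakeGeoCoord, the result has one row per input location and can be renamed and bound to the data just like the objects above.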

Now the data is ready to be visualised. This happens in the third part of the series.

