Google Geo Data – Data Access Without Restrictions

January 18, 2016

(This article was first published on ThinkToStart » R Tutorials, and kindly contributed to R-bloggers)

Geo-distances matter across many disciplines: health researchers use geographic data to analyze the spread of diseases, economists to evaluate the impact of transaction costs on human behavior, and sociologists to study interpersonal distances (shaped by external factors) in human interaction.

However, each query sent to the Google Maps Distance Matrix API (accessible from R, for example, via the ggmap package) is limited in the number of elements it may contain, where the number of origins times the number of destinations defines the number of elements. For users of the standard API, the following limits apply:

  • 2,500 free elements per day
  • 100 elements per query
  • 100 elements per 10 seconds
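To see how quickly these limits bite, consider a full pairwise distance matrix: the element count is simply origins times destinations, as the short back-of-the-envelope calculation below shows (plain arithmetic, no API call involved).

# A full 50 x 50 distance matrix already uses up the whole free daily quota
n_origins      <- 50
n_destinations <- 50
elements <- n_origins * n_destinations   # 2500 elements
elements / 100                           # = 25 separate queries (max. 100 elements each)
elements <= 2500                         # TRUE, but the free daily quota is exhausted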

Researchers therefore quickly hit a ceiling when requesting distances. This post proposes a work-around: the code below requests the driving distance and driving time between two geographical points directly from Google Maps, without the API restrictions. The code is quite flexible and could be adjusted to request straight-line distances and other quantities as well.

The example refers to the attached CSV file. The comments are part of the script.

[Attachment: example CSV file]
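The attached file itself is not reproduced here. Judging from the column names used below (lat1, lon1, lat2, lon2) and the semicolon separator, it presumably looks roughly like this (the coordinates are made up for illustration):

lat1;lon1;lat2;lon2
52.52;13.405;48.1351;11.582
53.5511;9.9937;50.1109;8.6821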

You need four R packages (data.table, httr, stringr, XML) to run the code.
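If any of them are missing, they can be installed from CRAN first:

install.packages(c("data.table", "httr", "stringr", "XML"))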

Remarks, hints and further modifications are welcome.

In the first step, load the relevant packages and the attached data file, which consists of four pairs of lat/lon coordinates.

library("data.table")
library("httr")
library("stringr")
library("XML")

# Read the example data (example CSV provided above)
newdata <- read.csv("D:/r_geocodes.csv", header = TRUE, sep=";")

Second, construct the Google Maps URLs used to request the distances.

# Build one directions URL per row: https://www.google.de/maps/dir/<origin>/<destination>
newdata$URL <- with(newdata, paste("https://www.google.de/maps/dir/", lat1, "+", lon1, "/", lat2, ",", lon2, sep = ""))
newdata$URL <- as.character(newdata$URL)
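For the first row of the sketched example data above (origin 52.52/13.405, destination 48.1351/11.582, purely illustrative values), the generated URL would look like this:

https://www.google.de/maps/dir/52.52+13.405/48.1351,11.582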

Next, define the relevant functions to download the data:

# Function: extract the last n characters from a string
substrRight <- function(x, n){
  substr(x, nchar(x) - n + 1, nchar(x))
}
#######################################################################
# Function: request the Google Maps driving distance and driving time
# for one URL. The German-language page is parsed, so the markers are
# "km" (distance), "Std." (hours) and "Min." (minutes).
download.maybe <- function(url, refetch = FALSE, path = ".") {
  # Fetch the page and collapse the response into one character string
  cnamet <- as.data.table(as.character(GET(url)))
  cnamet <- as.character(cnamet)

  # Distance: take the text just before the first "km" and keep the number
  dis <- substring((strsplit(substrRight(strsplit(cnamet, "km")[[1]][1], 9), ",")[[1]])[2], 2)

  # Driving time, minutes part: text just before the first "Min."
  dur_m <- as.numeric(gsub("[^[:alnum:],]", "", substrRight(strsplit(cnamet, "Min.")[[1]][1], 4)))

  # Driving time, hours part (only if "Std" appears shortly before "Min")
  durh_h_new <- as.numeric(gsub("[^[:alnum:]]", "",
                                ifelse(grepl("Std", substrRight(strsplit(cnamet, "Min")[[1]][1], 15)),
                                       str_extract_all(substrRight(strsplit(substrRight(strsplit(cnamet, "Std")[[1]][1], 3), "Std.")[[1]][1], 5), "\\(?[0-9,.]+\\)?")[[1]],
                                       "0")))

  # Total driving time in minutes
  dur_fin <- dur_m + (durh_h_new * 60)

  # Return "<distance> <minutes>" as one string
  fin <- as.character(paste(dis, dur_fin, sep = " ", collapse = NULL))
  fin
}
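Before processing the whole file, you may want to test the parser on a single URL. The returned string contains the distance followed by the driving time in minutes; the numbers shown below are only a hypothetical illustration and depend on the route and current traffic.

# Test on the first URL of the example data
download.maybe(newdata$URL[1])
# hypothetical result: "584 356"  (584 km, 356 minutes)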

Finally, run the function on your data (here: the example data set).

# The result is one string per URL ("<distance> <minutes>"), split here into two columns.
# Row names:     the Google URLs
# First column:  driving distance in km
# Second column: driving time in minutes (hint: always the current driving time, it may differ depending on traffic!)
files <- as.data.frame(t(as.data.frame(strsplit(sapply(newdata$URL, download.maybe), "\\, |\\,| "))))
colnames(files)[1] <- "Distance in km"
colnames(files)[2] <- "Driving Time in minutes"
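If you want to keep the results together with the input coordinates, you could bind them back to the original data and write them to a semicolon-separated file (the output path below is just a placeholder):

result <- cbind(newdata, files)
write.csv2(result, "D:/r_geocodes_result.csv", row.names = FALSE)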

That’s it. Now, you should get the following output data file.

[Screenshot of the output file: SC_R_Output_Geo_Code]
