Getting geo data into SQL Server using API and R

[This article was first published on R – TomazTsql, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R language (as well as Python) as comprehensive languages helping not only data analysis and data science tasks make it pretty easy, but is also a multiple-purpose language.

In this blog-post I will examine ways of getting data into SQL SERVER using API. I have briefly discussed this already in one of my previous blog poss, but this time, we can do it using SQL Server and R or SQL Server and Python. For the API Service I will call couple of different API services to demonstrate the ease of usage.

2018-09-02 18_06_59-Presentation1 - PowerPoint

Most common API calls are those, retrieving information from social media networks, such as Twitter, Facebook, Slack, LinkedIn and others. These information are great for network analysis, but business also tend to get data from Mailchimp (or Mandrill), Google API, Yahoo stocks, automatic translations sites, Jira, Jenkins and many more.

Calling API in R

Before we continue, We should test a simple

Google API

Using goople API for getting address from longitute and latitude.

DROP TABLE IF EXISTS dbo.[Address];
GO

CREATE TABLE [dbo].[Address](
	[street_number] [varchar](255) NULL,
	[street_name] [varchar](255) NULL,
	[city_name] [varchar](255) NULL,
	[ZIP] [varchar](255) NULL,
	[Country] [varchar](255) NULL,
	[ID] [int] NULL
);
GO

CREATE TABLE dbo.LAT_LNG
(
 LAT VARCHAR(100)
,LNG VARCHAR(100)
,ID INT
);
GO


INSERT INTO dbo.LAT_LNG(LAT,LNG,ID)
		  SELECT '46.080317','14.5178409',1 
UNION ALL SELECT '48.198726','16.396652', 2
UNION ALL SELECT '-34.3594725','18.470029',3;
GO

Now that we have some sample data in table, we can execute R script to get the corresponding addresses.

INSERT INTO dbo.[Address]
exec sp_execute_external_script 
	@language = N'R'
	,@script =  N'
		library(RODBC)
		#library(sqldf)
		library(RCurl)
		library(RJSONIO)
		library(plyr)

	Addresses <- InputDataSet

Addresses$x <- as.character(Addresses$x)
Addresses$y <- as.character(Addresses$y)
Addresses$latlng <- paste(as.character(Addresses$y),",",as.character(Addresses$x), sep ="")


url <- function(latlng, return.call = "json") {
  root <- "https://maps.googleapis.com/maps/api/geocode/"
  API_Key <- "AIzaSyDBzvtkyo1Q8uSC8oDu3dxxxxxxxxxxx"        
       #Taken from: https://developers.google.com/maps/documentation/geocoding/start#get-a-key
       #Bound to my personal email: [email protected]
  u <- paste(root, return.call, "?latlng=", latlng, "&key=",API_Key,sep = "")
  return(URLencode(u))
}


geoAddress <- function(latlng,verbose=FALSE) {
  if(verbose) cat(latlng,"\n")
  u <- url(latlng)
  doc <- getURL(u, .opts = list(ssl.verifypeer = FALSE)) #opts is RCurls additional options for undergo certiciates/SSL/VPN problems
  x <- fromJSON(doc,simplify = FALSE)
  if(x$status=="OK") 
  {
    street_number <- x$results[[1]]$address_components[[1]]$short_name
    street_name <- x$results[[1]]$address_components[[2]]$short_name
    city_name <- x$results[[1]]$address_components[[3]]$short_name
    ZIP <- x$results[[1]]$address_components[[5]]$short_name
    return(c(street_number, street_name, city_name, ZIP, formatted_address))
  } 
  else 
  {
    return(c(NA,NA,NA,NA,NA))
  }
}


# create empty data.frame
address_df <- data.frame(street_number=NA
                         ,street_name=NA
                         ,city_name=NA
                         ,ZIP=NA
                         ,Address=NA
                         ,ID=NA
)


for (i in 1:3) # 2500 API calls per day for free (until 18.7.2018)
{
  read_latlng <- geoAddress(Addresses[i,4])
  read_customer <- Addresses[i,3]
  
  address_df_temp <- data.frame(t(sapply(read_latlng,c)))
  names(address_df_temp)[1]<-paste("street_number")
  names(address_df_temp)[2]<-paste("street_name")
  names(address_df_temp)[3]<-paste("city_name")
  names(address_df_temp)[4]<-paste("ZIP")
  names(address_df_temp)[5]<-paste("Address")
  cus1 <- data.frame(read_customer)
  names(cus1)[1]<-paste("ID")
  address_df_temp <- data.frame(c(address_df_temp,cus1))
  address_df <- rbind(address_df, address_df_temp)
  
  Sys.sleep(1)
  rm(address_df_temp, cus1) 
}

OutputDataSet <- address_df'
	,@input_data_1 = N'SELECT  LNG AS y, LAT AS x,id FROM dbo.LAT_LNG'

And after running a select statement against the addresses table, I get the following:

2018-09-02 17_58_00-API_R.sql - TOMAZK_MSSQLSERVER2017.test (TOMAZK_Tomaz (69))_ - Microsoft SQL Ser

Google offers so many APIs, that I only choose google maps as shown above as an example. Another favorite Google API of mine is getting the distances from same API between different points.

Happy R-ing.

To leave a comment for the author, please follow the link and comment on their blog: R – TomazTsql.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)