Getting Historical Weather Data in R and SAP HANA

[This article was first published on All Things R, and kindly contributed to R-bloggers.]

For many of my recent data blogs, I needed historical weather data to mash up with other data sets and pinpoint the cause of what I was seeing.  For example, in my continued exploration of historical airline and airport data using SAP HANA and R, I wanted to find out whether the weather was behind the extreme delays out of a particular airport on a particular day and hour.  So I needed to mash up the weather data with the airlines data for this analysis.

I looked around but could not find a better way to get the weather data, so I turned to R.  To get historical weather data, I use Weather Underground's REST APIs, and I put together a simple R program that returns the weather data as a data.frame.  This R module gets called from SAP HANA and inserts a new table into HANA with the weather information.  Once I have the data in HANA, I perform the mash-ups there and off I go on my intellectual pursuit.
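To make the mash-up step concrete, here is a minimal sketch done in R rather than in HANA, purely for illustration. The weather rows are copied from the results table in this post; the `delays` data frame and its `AvgDepDelayMin` column are hypothetical, with invented values.

```r
# Weather rows copied from the results table in this post (first three days).
weather <- data.frame(
  Airport = "SFO",
  Date = as.Date(c("2006-01-01", "2006-01-02", "2006-01-03")),
  Rain = c(1, 1, 1),
  AvgVisibility = c(14, 11, 14),
  stringsAsFactors = FALSE
)

# Hypothetical delay data: invented values, for illustration only.
delays <- data.frame(
  Airport = "SFO",
  Date = as.Date(c("2006-01-01", "2006-01-02", "2006-01-04")),
  AvgDepDelayMin = c(42, 65, 8),
  stringsAsFactors = FALSE
)

# Inner join on the shared keys; days missing from either side are dropped.
mashup <- merge(delays, weather, by = c("Airport", "Date"))
print(mashup)
```

An inner join keeps only the airport and day combinations present in both tables; passing `all.x = TRUE` to `merge()` would keep every delay row and fill missing weather values with `NA`.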

Weather Underground returns the data in either XML or JSON format; the code below uses JSON.  The program logic is very simple (once you have spent hours cracking it, the end product looks simple anyway :-)), and the code below is commented for self-learning.

You are not limited to the historical view of weather data.  You can also get the weather forecast for the next 10 days, perform your analysis and predict the future!
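Since the history and forecast requests differ only in one URL segment, a small helper can build both. This is a sketch under assumptions: `buildWeatherUrl` and the `WUNDERGROUND_KEY` environment variable are hypothetical names, and `forecast10day` is the feature name as I recall it, so check it against the API documentation before relying on it.

```r
# Build a request URL for a given API feature and location.
# buildWeatherUrl and WUNDERGROUND_KEY are hypothetical names for this sketch.
buildWeatherUrl <- function(feature, location,
                            key = Sys.getenv("WUNDERGROUND_KEY", "{your key here}")) {
  paste0("http://api.wunderground.com/api/", key, "/", feature,
         "/q/", location, ".json")
}

# Historical data for one day, same URL shape as the listing in this post.
hist.day <- format(as.Date("2006-01-01"), "%Y%m%d")
hist.url <- buildWeatherUrl(paste0("history_", hist.day), "SFO")

# 10-day forecast; "forecast10day" is an assumed feature name.
fcst.url <- buildWeatherUrl("forecast10day", "SFO")

print(hist.url)
print(fcst.url)
```

Reading the key from an environment variable keeps it out of the script, which helps when the code is shared.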

Make sure to register with Weather Underground (API documentation link), comply with their rules and get your own key to access their APIs.
############################################################################
# fromJSON() is assumed to come from rjson and ldply() from plyr;
# the original listing did not show its library() calls.
library(rjson)
library(plyr)

# Fetch one day of history for an airport; date is a 'YYYYMMDD' string.
getHistoricalWeather <- function(airport.code="SFO", date=format(Sys.Date(), "%Y%m%d"))
{
  base.url <- 'http://api.wunderground.com/api/{your key here}/'
  # compose final url
  final.url <- paste(base.url, 'history_', date, '/q/', airport.code, '.json', sep='')

  # read the response as raw lines from the web service
  conn <- url(final.url)
  raw.data <- readLines(conn, n=-1L, ok=TRUE)
  close(conn)
  # convert the JSON text to a nested R list
  weather.data <- fromJSON(paste(raw.data, collapse=""))
  return(weather.data)
}

# get data for 10 days: restriction by Weather Underground for free usage
date.range <- seq.Date(from=as.Date('2006-01-01'), to=as.Date('2006-01-10'), by='1 day')

# Initialize a data frame
hdwd <- data.frame()

# loop over dates, and fetch weather data
for(i in seq_along(date.range)) {
  weather.data <- getHistoricalWeather('SFO', format(date.range[i], "%Y%m%d"))
  # c() starts with a string, so every value is stored as text and the Date
  # becomes its day count since 1970-01-01 (e.g. 13149 for 2006-01-01)
  hdwd <- rbind(hdwd, ldply(weather.data$history$dailysummary,
      function(x) c('SFO', date.range[i], x$fog, x$rain, x$snow, x$meantempi, x$meanvism, x$maxtempi, x$mintempi)))
}
colnames(hdwd) <- c('Airport', 'Date', 'Fog', 'Rain', 'Snow', 'AvgTemp', 'AvgVisibility', 'MaxTemp', 'MinTemp')

# save to CSV
write.csv(hdwd, file=gzfile('SFO-Jan2006.csv.gz'), row.names=FALSE)
############################################################################
Results:

The Date column holds the number of days since 1970-01-01, so 13149 is 2006-01-01.

Airport  Date   Fog  Rain  Snow  AvgTemp  AvgVisibility  MaxTemp  MinTemp
SFO      13149  0    1     0     55       14             62       47
SFO      13150  0    1     0     53       11             55       50
SFO      13151  0    1     0     51       14             56       46
SFO      13152  0    0     0     56       16             62       50
SFO      13153  0    0     0     54       14             60       48
SFO      13154  0    1     0     52       14             59       45
SFO      13155  0    1     0     56       14             61       50
SFO      13156  0    0     0     51       16             57       45
SFO      13157  0    0     0     49       16             56       41
SFO      13158  0    0     0     54       10             61       46
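Before analysis, the Date values in these results need to be turned back into real dates and the measurements into numbers. A minimal sketch, assuming the columns come back as text (or factors) as in the results above, and using only the first three rows and two of the measurement columns to keep it short:

```r
# First three rows of the results above, stored as text.
hdwd <- data.frame(
  Airport = c("SFO", "SFO", "SFO"),
  Date = c("13149", "13150", "13151"),
  Rain = c("1", "1", "1"),
  AvgTemp = c("55", "53", "51"),
  stringsAsFactors = FALSE
)

# as.character() first, so the conversion is also safe if a column is a factor.
to.number <- function(x) as.numeric(as.character(x))

# Day counts are relative to R's Date origin, 1970-01-01.
hdwd$Date <- as.Date(to.number(hdwd$Date), origin = "1970-01-01")

# Convert the measurement columns from text to numbers.
num.cols <- c("Rain", "AvgTemp")
hdwd[num.cols] <- lapply(hdwd[num.cols], to.number)

str(hdwd)
```

With proper types in place, the Date column can be joined directly against other date-keyed tables.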


Happy Analyzing!
