Getting Historical Weather Data in R and SAP HANA
[This article was first published on All Things R, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
For many of my recent data blogs, I needed historical weather data for data mash-ups that pinpoint the cause of an observed pattern. For example, in my continued exploration of historical airline and airport data using SAP HANA and R, I wanted to find out whether the weather was behind the extreme delays out of a particular airport on a particular day or hour. So I needed to mash up the weather data with the airline data for this analysis.
I looked around but could not find a better way to get the weather data, so I turned to R. To get historical weather data, I use Weather Underground's REST APIs, and I put together a simple R program that returns the weather data as a data.frame. This R module is called from SAP HANA and inserts a new table into HANA with the weather information. Once I have the data in HANA, I perform the mash-ups there, and off I go on my intellectual pursuit.
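The mash-up in this post happens inside HANA. Purely as an illustration, the same kind of join can be sketched in R with merge(). The column names and the delay figures below are made-up placeholders, not real airline data, and the weather table is only assumed to have this shape.

```r
# Hypothetical weather table with one row per airport and day
weather <- data.frame(
  Airport = c("SFO", "SFO"),
  Date    = as.Date(c("2006-01-01", "2006-01-02")),
  Rain    = c(1, 1),
  stringsAsFactors = FALSE
)

# Made-up placeholder delays (NOT real airline data), one row per airport and day
delays <- data.frame(
  Airport     = c("SFO", "SFO", "SFO"),
  Date        = as.Date(c("2006-01-01", "2006-01-02", "2006-01-03")),
  AvgDelayMin = c(42, 35, 8),
  stringsAsFactors = FALSE
)

# Left join: keep every delay row and attach weather where it exists
mashup <- merge(delays, weather, by = c("Airport", "Date"), all.x = TRUE)
print(mashup)
```

Days without a matching weather row end up with NA in the weather columns, which makes gaps in the weather download easy to spot.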
Weather Underground can return the data in either XML or JSON format; the program below requests JSON. The program logic is very simple [once you have spent hours cracking it, the end product looks simple anyway :-)], and the code below is commented for self-learning.
I want to mention that you are not limited to the historical view of weather data. You can also get the weather forecast for the next 10 days, perform your analysis and predict the future!
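As a rough sketch of how a forecast request might differ from a historical one, the helper below only swaps the feature segment of the URL. buildWeatherUrl() is a hypothetical helper written for this illustration, and "forecast10day" is assumed to be the feature name for the 10-day forecast; check the API documentation before relying on it.

```r
# Build a Weather Underground request URL for a given API feature.
# buildWeatherUrl() is a hypothetical helper, not part of any package.
buildWeatherUrl <- function(api.key, feature, query) {
  paste0("http://api.wunderground.com/api/", api.key, "/",
         feature, "/q/", query, ".json")
}

# Historical data for one day (same URL shape as the program in this post)
history.url <- buildWeatherUrl("YOUR_KEY", "history_20060101", "SFO")

# 10-day forecast; the feature name "forecast10day" is an assumption
forecast.url <- buildWeatherUrl("YOUR_KEY", "forecast10day", "SFO")

print(history.url)
print(forecast.url)
```

The structure of the forecast response is different from the historical one, so the field names used for the daily summary would need to be adapted.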
Make sure to register with Weather Underground (API documentation link), comply with their rules and get your own key to access their APIs.
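One way to keep your key out of the source code is to read it from an environment variable. This is a minimal sketch; getApiKey() and the variable name WUNDERGROUND_API_KEY are hypothetical choices made for this example.

```r
# Read the API key from an environment variable instead of hard-coding it.
# WUNDERGROUND_API_KEY is a hypothetical name; any variable name works.
getApiKey <- function(var.name = "WUNDERGROUND_API_KEY") {
  key <- Sys.getenv(var.name, unset = "")
  if (!nzchar(key)) {
    stop("Set the ", var.name, " environment variable first")
  }
  key
}

# Example: set the variable for the current session, then read it back
Sys.setenv(WUNDERGROUND_API_KEY = "your-key-here")
api.key <- getApiKey()
```

For a permanent setting, the same variable can be placed in your .Renviron file so that it is available in every R session.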
############################################################################
library(RJSONIO) # assumption: fromJSON() comes from RJSONIO (rjson also provides one)
library(plyr)    # provides ldply()

# Fetch one day of historical weather for an airport.
# date must be a string in YYYYMMDD format, as the API URL expects.
getHistoricalWeather <- function(airport.code="SFO", date=format(Sys.Date(), "%Y%m%d"))
{
  # replace {your key here} with your own API key
  base.url <- 'http://api.wunderground.com/api/{your key here}/'
  # compose final url
  final.url <- paste(base.url, 'history_', date, '/q/', airport.code, '.json', sep='')
  # reading in as raw lines from the web service
  conn <- url(final.url)
  raw.data <- readLines(conn, n=-1L, ok=TRUE)
  # parse the JSON text into a nested R list
  weather.data <- fromJSON(paste(raw.data, collapse=""))
  close(conn)
  return(weather.data)
}
# get data for 10 days – restriction by Weather Underground for free usage
date.range <- seq.Date(from=as.Date('2006-01-01'), to=as.Date('2006-01-10'), by='1 day')
# Initialize a data frame
hdwd <- data.frame()
# loop over dates, and fetch weather data
for(i in seq_along(date.range)) {
  weather.data <- getHistoricalWeather('SFO', format(date.range[i], "%Y%m%d"))
  # note: c() coerces the Date to its numeric value (days since 1970-01-01)
  hdwd <- rbind(hdwd, ldply(weather.data$history$dailysummary,
    function(x) c('SFO', date.range[i], x$fog, x$rain, x$snow, x$meantempi, x$meanvism, x$maxtempi, x$mintempi)))
}
colnames(hdwd) <- c('Airport', 'Date', 'Fog', 'Rain', 'Snow', 'AvgTemp', 'AvgVisibility', 'MaxTemp', 'MinTemp')
# save to CSV
write.csv(hdwd, file=gzfile('SFO-Jan2006.csv.gz'), row.names=FALSE)
############################################################################
Results (the Date column shows R's numeric date, i.e. the number of days since 1970-01-01, so 13149 is 2006-01-01):
Airport | Date | Fog | Rain | Snow | AvgTemp | AvgVisibility | MaxTemp | MinTemp |
SFO | 13149 | 0 | 1 | 0 | 55 | 14 | 62 | 47 |
SFO | 13150 | 0 | 1 | 0 | 53 | 11 | 55 | 50 |
SFO | 13151 | 0 | 1 | 0 | 51 | 14 | 56 | 46 |
SFO | 13152 | 0 | 0 | 0 | 56 | 16 | 62 | 50 |
SFO | 13153 | 0 | 0 | 0 | 54 | 14 | 60 | 48 |
SFO | 13154 | 0 | 1 | 0 | 52 | 14 | 59 | 45 |
SFO | 13155 | 0 | 1 | 0 | 56 | 14 | 61 | 50 |
SFO | 13156 | 0 | 0 | 0 | 51 | 16 | 57 | 45 |
SFO | 13157 | 0 | 0 | 0 | 49 | 16 | 56 | 41 |
SFO | 13158 | 0 | 0 | 0 | 54 | 10 | 61 | 46 |
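The Date values in this table are day counts since 1970-01-01, and the other columns may come back as text (character or factor, depending on the R version) because every row was assembled with c() alongside character values. A minimal clean-up sketch, using two rows copied from the results and only a few of the columns:

```r
# Two rows copied from the results above, stored as text
hdwd <- data.frame(
  Airport = c("SFO", "SFO"),
  Date    = c("13149", "13150"),
  AvgTemp = c("55", "53"),
  stringsAsFactors = FALSE
)

# Going through as.character() first also handles factor columns correctly
hdwd$Date    <- as.Date(as.numeric(as.character(hdwd$Date)), origin = "1970-01-01")
hdwd$AvgTemp <- as.numeric(as.character(hdwd$AvgTemp))

print(hdwd)
```

After the conversion, the first row carries the date 2006-01-01, which matches the start of the requested date range.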
Happy Analyzing!