SAP HANA OData and R

(This article was first published on Blag's bag of rants, and kindly contributed to R-bloggers)

As you might have discovered by now...I love R...it's just an amazing programming language...

By now...I have integrate R and SAP HANA via ODBC and via the SAP HANA-R integration...but I have completely left out the SAP HANA OData capabilities.

For this blog, we're going to create a simple Attribute View, expose it via SAP HANA and then consume it on R to display a nice and fancy graphic -;)

First, let's create an Attribute View and call it FLIGHTS. This Attribute View is going to be composed of the tables SPFLI, SCARR and SFLIGHT and will output the fields PRICE, CURRENCY, CITYFROM, CITYTO, DISTANCE, CARRID and CARRNAME. If you wonder why so many fields? Just so I can use it in another examples -;)


With the Attribute View ready, we can create a project in the repository and the necessary files to expose it as an OData service.

First, we create the .xsapp file...which should be empty -:P

Then, we create the .xsaccess file with the following code...

.xsaccess
{
"exposed" : true,
"authentication" : [ { "method" : "Basic" } ]
}

Finally, we create a file called flights.xsodata

flights.xodata
service {
"BlagStuff/FLIGHTS.attributeview" as "FLIGHTS" keys generate local "Id";
}

When everything is ready...we can call our service to test it...we can call it as either JSON or XML. For this example, we're going to call it as XML.


Now that we know its working...we can go and code with R -:D For this...we're going to need 3 packages (That you can install via RStudio or R itself), ggplot2, RCurl and XML.

HANA_OData_and_R.R
library("ggplot2")
library("RCurl")
library("XML")
web_page = getURL("XXX:8000/BlagStuff/flights.xsodata/FLIGHTS?$format=xml", userpwd = "SYSTEM:******")
doc <- xmlTreeParse(web_page, getDTD = F,useInternalNodes=T)
r <- xmlRoot(doc)

carrid<-list()
carrid_list<-list()
carrid_big_list<-list()
price<-list()
price_list<-list()
price_big_list<-list()
currency<-list()
currency_list<-list()
currency_big_list<-list()

for(i in 5:xmlSize(r)){
carrid[1]<-xmlValue(r[[i]][[5]][[1]][[2]])
carrid_list[i]<-carrid[1]
price[1]<-xmlValue(r[[i]][[5]][[1]][[8]])
price_list[i]<-price[1]
currency[1]<-xmlValue(r[[i]][[5]][[1]][[7]])
currency_list[i]<-currency[1]
}

carrid_big_list<-unlist(carrid_list)
price_big_list<-unlist(price_list)
currency_big_list<-unlist(currency_list)
flights_table<-data.frame(CARRID=as.character(carrid_big_list),PRICE=as.numeric(price_big_list),
CURRENCY=as.character(currency_big_list))
flights_agg<-aggregate(PRICE~.,data=flights_table, FUN=sum)
flights_agg<-flights_agg[order(flights_agg$CARRID),]

flights_table<-data.frame(CARRID=as.character(flights_agg$CARRID),PRICE=as.character(flights_agg$PRICE),
CURRENCY=as.character(flights_agg$CURRENCY))

ggplot(flights_table, aes(x=CARRID, y=PRICE, fill=CURRENCY)) + geom_histogram(binwidth=.5,
position="dodge", stat="identity")

Basically, we're are reading the OData service that comes in XML format and parsing it into a tree so we can extract it's components. One thing that might call your attention is that we're using xmlValue(r[[i]][[5]][[1]][[2]]) where i starts from 5.

Well...there's an easy explanation -:) if we access our XML tree...the first value it's going to be "feed", the second "id" and so on...the fifth is going to be "entry" which is what we need. Then for the next [[5]]...inside "entry", the first value it's going to be "id", the second "title" and so on...the fifth is going to be "content" which is what we need. Then for the next [[1]]...inside "content", the first value it's going to be "properties" which is what we need. And for the last [[2]]...inside "properties" the first value it's going to be "id" and the second it's going to "carrid" which is what we need. BTW, xmlValue will get the value of the XML tag -:P

In other words...we need to analyze the XML schema and determine what we need to extract...after that, we simply need to assign those values to variables and create our data.frame.

Then we create an aggregation to sum the PRICE values (In other words, we're going to have the PRICE grouped by CARRID and CURRENCY), then we sort the values and finally we create a new data.frame so we can present the PRICE as character instead of numeric...just for better presentation of the graphic...

Finally...we call the plot and we're done -:)


Happy plotting! -:)

Greetings,

Blag.

To leave a comment for the author, please follow the link and comment on his blog: Blag's bag of rants.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.