If you follow my blogs (I hope you do) then you know I really love the R programming language but I also love SAP HANA and in the past I have dealt with integration between those two:
You have to choose SUSE Linux Enterprise with 32 bit. I tried with 64 bit and it wasn’t funny…didn’t work and I lost a lot of time…32 bit for the win!
For the installation, you can follow this link SAP HANA Database Development Guide – Integration with R programming language, but at least in my case, I need to deal with a lot of difficulties, that gladly I’m going to write down in this blog, so you don’t have to deal with them
First, we need a compiler as we’re going to compile #R from it’s source.
sudo zypper install gcc gcc-c++ gcc-fortran
Then we need to get and extract the #R source code.
tar zxf R-2.13.0.tar.gz && cd R-2.13.0
./configure --enable-R-shlib --with-realine=no --with-x=no
This step really takes a long time…so you better go doing something more productive in the meantime…
When #R is finally installed, we need to download and install the Rserve package.
Now, we have to log into R and do the installation…
install.packages("/PATH_TO_FILE/Rserve.tar.gz", repos = NULL)
library("Rserve") #To test the installation. If there's no output, then it's working fine
If you get an error regarding a personal library…just say “y”. Once Rserve is install, we need to create a config file.
#Press ESC key
#Press ESC key
Now, we have to create a user that will run the Rserve so we can connect to it from SAP HANA.
useradd -m login_name
For some reason Amazon doesn’t provide the password for the root user…but we might need it eventually…so just do this…after all, if your user and you’re paying for it…
sudo passwd root
#Assign a password
R CMD Rserve --RS-port 6311 --no-save --RS-encoding "utf8"
Now…we’re ready to move to move to our SAP HANA server and keep configuring
Right click on your system node at the navigator tab
Select on the right hand side the Configuration tab
Select the indexserver.ini
Select the calcengine
#Add the following parameters...
cer_timeout - 300
cer_rserve_addresses - Our R Amazon server:6311
cer_rserve_maxsendsize - 0
Open a SQL Editor and copy the following code…
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110101',4195);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110201',4245);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110301',4971);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110401',4469);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110501',4257);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110601',4973);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110701',4470);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110801',4981);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20110901',4530);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20111001',4167);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20111101',4059);
insert into "SYSTEM"."TICKETS_BY_YEAR" values('20111201',1483);
This table is supposed to hold the tickets sales for a given company, during each month of the year 2011. What we want to do here is to determine or predict how are going to our sales on 2012. We’re going to use #R for that matter. Create the following script and call it “Predict_Tickets”. This script will have two Stored Procedures, called Prediction_Model and Get_Tickets.
CREATE TYPE T_PREDICTION_TICKETS AS TABLE (
DROP PROCEDURE Prediction_Model;
DROP PROCEDURE Get_Tickets;
CREATE PROCEDURE Prediction_Model(IN tickets_year TICKETS_BY_YEAR,OUT result T_PREDICTION_TICKETS)
LANGUAGE RLANG AS
prt.lm<-lm(tickets ~ period)
CREATE PROCEDURE Get_Tickets()
LANGUAGE SQLSCRIPT AS
Tickets = SELECT * FROM TICKETS_BY_YEAR;
INSERT INTO "TICKETS_BY_YEAR" SELECT * FROM :T_PREDICTION_TICKETS;
SELECT * FROM "TICKETS_BY_YEAR";