R – Analyze any data frame in Saiku

December 4, 2013
By

(This article was first published on R (en) - Analytik dat, and kindly contributed to R-bloggers)

In my previous article I have shown how R can be used to analyze PostgreSQL tables in Saiku using dynamically generated OLAP cubes. Today I will show you how you can analyze any R data frame in Saiku. WIth Saiku you can easily create excel-like pivot tables, charts, export PDF/pictures, zoom through tables and do other cool things.

 Pre-requisites

This is what you need to have installed to make it work:

  • R 3.01+
  • Saiku 2.5+
  • R packages sqldf and RSQLite
  • SQLite jdbc driver (do not forget to copy the driver to saiku-server/tomcat/webapps/saiku/WEB-INF/lib/)

Note: I have Ubuntu 13.04 and unfortunately I cannot test this on any other system right now.

Installation

Download the code and create function GenerateCubeSQLite function in R. Remember that you need packages sqldf and RSQLite installed.

Examples

First example uses already available data frame mtcars. Run the following code (you need to set valid values for CubeDestination and DataSourceDestination parameters):

mtcars2 = mtcars
mtcars2$id=seq(1:nrow(mtcars2))
mtcars2$vs=as.integer(mtcars2$vs)
mtcars2$am=as.integer(mtcars2$am)
mtcars2$gear=as.integer(mtcars2$gear)
mtcars2$carb=as.integer(mtcars2$carb)
# Do not forget to change CubeDestination and DataSourceDestination to valid values
GenerateCubeSQLite(mtcars2,PrimaryKey='id',CubeDestination='/home/tgr/Applications/Saiku/saiku-server/tomcat/webapps/saiku/WEB-INF/classes/foodmart/test.xml',DataSourceDestination='/home/tgr/Applications/Saiku/saiku-server/tomcat/webapps/saiku/WEB-INF/classes/saiku-datasources/test')
If everything goes well, you just have to reload cubes in Saiku, choose newly generated Cube named mtcars2:
b2ap3_thumbnail_mtcars-saiku-olap-screenshot.png
 
Second example is based on my own sample data frame named big_portfolio. You can download the dataset here. Store the dataset in your working directory, import the dataset and call the function GenerateCubeSQLite again:
# Example 2
big_portfolio=read.csv(file='big_portfolio.csv',header=TRUE,sep=';',stringsAsFactors=FALSE)
big_portfolio$origination_date=as.Date(big_portfolio$origination_date)
big_portfolio$repayment_date=as.Date(big_portfolio$repayment_date,format='%Y-%m-%d')
str(big_portfolio)
GenerateCubeSQLite(big_portfolio,PrimaryKey='id',CubeDestination='/home/tgr/Applications/Saiku/saiku-server/tomcat/webapps/saiku/WEB-INF/classes/foodmart/test.xml',DataSourceDestination='/home/tgr/Applications/Saiku/saiku-server/tomcat/webapps/saiku/WEB-INF/classes/saiku-datasources/test')
If everything goes well, refresh cubes in Saiku and play with data. You can get something like this for example:
b2ap3_thumbnail_analyze-any-r-data-frame-in-saiku-example.png
 
Note that this time we also have some date columns. GenerateCubeSQLite generated time dimension automatically for us, so we can easily group data by months/quarters/years

How it works

Saiku can work with data stored in SQLite database. The function GenerateCubeSQLite stores R data frame in temporary SQLite databases together with on-the-fly generated time dimension.

Conclusion

From now on it is easy to analyze any data frame you have in Saiku. Let me know if this works for you.

 

To leave a comment for the author, please follow the link and comment on his blog: R (en) - Analytik dat.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.