We’ve introduced R in the organization!
It now runs alongside the heavyweights of statistical analysis such as SAS, SPSS and MATLAB. Here’s what we did and how we did it…
HOW DID IT START?
The business need was to build a web-based tool for marketing budget optimization, i.e. marketing RoI (return on investment): how should a company with multiple advertising channels allocate its marketing budget across them to maximize profit, customer loyalty or customer lifetime value (LTV)?
1) Input: The input to the analysis is the company’s historical marketing budget allocation, profit, customer loyalty and LTV.
– Step 1) Our experts create a formula that relates the given inputs to RoI, LTV and the other outcomes, using econometric techniques.
– Step 2) The formula is optimized whenever the user runs a what-if analysis, varying the total budget and/or the spend on individual channels to see the effect on RoI and LTV. The desktop optimization model was written in Excel using a commercial Excel plugin.
3) Output: Optimized spend across advertising channels, and the ability to evaluate multiple scenarios to determine the optimum marketing mix.
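To make the what-if step concrete, here is a minimal sketch in Ruby. It is not our actual model: the square-root response curve, the channel coefficients and the method names are all made up for illustration, and the greedy allocation is a simple stand-in for a real optimizer. It only works here because the assumed curve has diminishing returns in every channel.

```ruby
# Hypothetical response curve: profit from a channel grows with the square
# root of spend (diminishing returns). The real formula comes from Step 1.
def channel_profit(coefficient, spend)
  coefficient * Math.sqrt(spend)
end

# Greedy allocation: hand out the budget in small increments, each time to
# the channel where the next increment adds the most profit.
def allocate_budget(coefficients, total_budget, step = 1.0)
  spend = Array.new(coefficients.size, 0.0)
  (total_budget / step).floor.times do
    gains = coefficients.each_with_index.map do |c, i|
      channel_profit(c, spend[i] + step) - channel_profit(c, spend[i])
    end
    best = gains.each_with_index.max_by { |gain, _| gain }.last
    spend[best] += step
  end
  spend
end

def total_profit(coefficients, spend)
  coefficients.zip(spend).sum { |c, x| channel_profit(c, x) }
end

coefficients = [3.0, 4.0]                        # made-up channel effectiveness
baseline = allocate_budget(coefficients, 100)    # current total budget
what_if  = allocate_budget(coefficients, 120)    # scenario: 20% more budget

puts "baseline: #{baseline.inspect} profit #{total_profit(coefficients, baseline).round(2)}"
puts "what-if:  #{what_if.inspect} profit #{total_profit(coefficients, what_if).round(2)}"
```

A what-if scenario is then just another call with a different total budget (or with some channels fixed), and the two results can be compared side by side.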
A) Web application: The web forms needed to allow users to input data and run scenarios were simple. We develop web applications using Ruby on Rails on LAMP internally. Ruby on Rails gives us an agile environment to develop software by taking care of routine web application tasks like database connectivity.
For this we had to prove a couple of things:
1) Optimization of formula from step 1
2) Integration with web application
Option 1: Commercial optimization engine
We did a quick spike to test optimization with the server version of the commercial optimization plugin, and to test its integration with our Ruby on Rails web application; both worked. We had to use JRuby to integrate Ruby with the plugin’s server edition, as it provides only Java and .NET APIs.
Option 2: R (Open source)
In parallel, we checked whether R could be used. R is a leading open source statistical environment.
– To solve the optimization problem in R, we found many optimization packages and started testing ones such as BB, since the formula (from step 1) was non-linear and had constraints and conditions. We tested BB’s SPG function and also tried other generic algorithms. We got good optimization results from R (similar to or better than those from the commercial optimization engine).
– Next we had to work out how to integrate R with our web application written in Ruby. We found several options, such as integrating R with Apache (rApache) or integrating R directly with Ruby (rsruby). We decided to use rsruby.
We ran a number of proofs of concept with R and shared the results with stakeholders. The results were positive in terms of both performance and the quality of the optimized output. So we got better results, and for free!
- Be careful when running it in a shared environment, where it can use all your CPU and memory if a job runs for a long time
- Don’t forget to write unit tests using RUnit for your R code
- Capture exceptions from R and deal with them properly (show an appropriate message to users)
- The rsruby installation documentation is good, but installation may take a few tries depending on your Linux distribution
- rsruby does not run on Windows (wasn’t a problem for us as we run our web applications on LAMP)
- User acceptance testing: If you are transforming an Excel-based model into a web version, it is critical to have a fully working example of the Excel model so you can replicate it in R or another statistical package
- Overcoming the challenges of using new open source software in the enterprise: Like most enterprise IT shops, we are used to commercial software, and the idea of using open source software for serious work is limited to the most popular projects such as Drupal, Ruby on Rails and Linux. We positioned R as an add-on to our LAMP environment and got a separate virtual server dedicated to it, as it is memory hungry.
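Two of the lessons above, guarding against long runs and turning errors into messages for users, can be sketched on the Ruby side roughly as follows. This is an illustration under assumptions, not our production code: `run_scenario` and `ScenarioOutcome` are hypothetical names, and the block stands in for whatever call goes into R. Note that Ruby’s `Timeout` may not be able to interrupt a call that is blocked inside native code, so a limit inside the R job itself may still be needed.

```ruby
require 'timeout'

# Outcome handed back to the web layer: either a value or a message for the user.
ScenarioOutcome = Struct.new(:ok, :value, :message)

# Runs the given block (hypothetically, the call into R) with a wall-clock
# limit and converts any failure into a user-facing message.
def run_scenario(limit_seconds)
  value = Timeout.timeout(limit_seconds) { yield }
  ScenarioOutcome.new(true, value, nil)
rescue Timeout::Error
  ScenarioOutcome.new(false, nil, 'The optimization took too long. Please try a simpler scenario.')
rescue StandardError => e
  # Keep the technical detail for the logs, not for the user.
  warn "optimization failed: #{e.class}: #{e.message}"
  ScenarioOutcome.new(false, nil, 'We could not optimize this scenario. Please check your inputs.')
end

fast   = run_scenario(1) { 42 }                                   # normal run
broken = run_scenario(1) { raise ArgumentError, 'negative budget' } # simulated failure
slow   = run_scenario(0.1) { sleep 1 }                            # simulated long run

[fast, broken, slow].each { |outcome| puts outcome.ok ? outcome.value : outcome.message }
```

The controller then only has to check `ok` and either render the result or show `message`, so raw error text never reaches the user.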