Predicting R models with PMML: Revolution R Enterprise and ADAPA

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

The recently announced Revolution Analytics / Zementis partnership goes a long way towards demonstrating how R fits into big-league production environments. A frequent complaint against R is that although it is a fine prototyping tool, it cannot handle production environments. Well, that’s just not true. In fact, it is straightforward to build a model in R, translate it into PMML using a standard R library, and then send the PMML file off to Zementis’ ADAPA scoring engine, where the model described in the PMML file can be used to score a new data set. Moreover, using Revolution’s RevoDeployR web services technology, it is relatively easy to set up an infrastructure in which Revolution R runs on one server (on site or in the cloud), the ADAPA scoring engine runs on another, and users access both through a light client, a browser or any BI tool.

The following code provides a simple example of splitting a file into training data and testing data, building a simple model and translating it to PMML.

# Load the required R libraries
library(pmml)
library(XML)

# Read in audit data and split into a training file and a testing file
auditDF <- read.csv("http://rattle.togaware.com/audit.csv")
auditDF <- na.omit(auditDF)              # remove NAs to make things easy

target <- auditDF$TARGET_Adjusted       # Target variable
N <- length(target)                     # Get number of observations
M <- N - 500                            # Hold out 500 observations for testing
i.train <- sample(N,M)                  # Get a random sample for training
audit.train <- auditDF[i.train,]
audit.test  <- auditDF[-i.train,]

# Build a logistic regression model
# (name the target directly so that "." expands to every other column only)
glm.model <- glm(TARGET_Adjusted ~ .,data=audit.train,family="binomial")

# Describe the model in PMML and save it in an XML file
glm.pmml <- pmml(glm.model,name="glm model",data=audit.train)
xmlFile <- file.path(getwd(),"audit-glm.xml")
saveXML(glm.pmml,xmlFile)
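
Before handing a model to a scoring engine, it can be useful to score the held-out rows locally so that there is a baseline to compare against. The following is a minimal sketch of that idea in base R. It uses a small synthetic data frame rather than the audit file, so the column names `Age` and `Hours`, the simulated target, and the 0.5 cutoff are all assumptions made for illustration only.

```r
# Synthetic stand-in for the audit data (hypothetical columns, not the real file)
set.seed(42)
n <- 1000
demoDF <- data.frame(
  Age   = round(runif(n, 18, 80)),
  Hours = round(runif(n, 10, 60))
)
# Simulate a binary target that depends on both predictors
p <- plogis(-4 + 0.05 * demoDF$Age + 0.03 * demoDF$Hours)
demoDF$TARGET_Adjusted <- rbinom(n, 1, p)

# Same split pattern as in the article: hold out 500 rows for testing
i.train    <- sample(n, n - 500)
demo.train <- demoDF[i.train, ]
demo.test  <- demoDF[-i.train, ]

# Fit the logistic regression and score the held-out rows locally
demo.model <- glm(TARGET_Adjusted ~ ., data = demo.train, family = "binomial")
demo.prob  <- predict(demo.model, newdata = demo.test, type = "response")

# Classify at a 0.5 cutoff (an arbitrary choice) and cross-tabulate
demo.class <- as.integer(demo.prob > 0.5)
print(table(predicted = demo.class, actual = demo.test$TARGET_Adjusted))
```

With the real data, the same `predict()` call on `audit.test` would give probabilities that could be compared with the scores returned for the same rows by the deployed model.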

The first few lines of the generated PMML hold the file header, which records a creation timestamp (2011-02-28 14:41:54 in the original run).

Once the PMML file is built it can be submitted to the ADAPA engine and used to score a new data set.
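
Before submitting the file, a quick sanity check is to read back its first lines and confirm that the header is in place. The sketch below does this with base R only. To stay self-contained it writes a tiny hand-written PMML skeleton to a temporary file; that skeleton is hypothetical and is not the actual output of `pmml()`, and the variable names `demoFile` and `pmmlLines` are illustrative.

```r
# Hand-written PMML skeleton (illustrative only, not actual pmml() output)
skeleton <- c(
  '<?xml version="1.0"?>',
  '<PMML version="3.2" xmlns="http://www.dmg.org/PMML-3_2">',
  ' <Header description="Hypothetical example header">',
  '  <Timestamp>2011-02-28 14:41:54</Timestamp>',
  ' </Header>',
  '</PMML>'
)
demoFile <- tempfile(fileext = ".xml")
writeLines(skeleton, demoFile)

# Print the first few lines of the file
pmmlLines <- readLines(demoFile)
cat(head(pmmlLines, 5), sep = "\n")

# Pull the timestamp out of the header with a regular expression
tsLine    <- grep("<Timestamp>", pmmlLines, value = TRUE)[1]
timestamp <- sub(".*<Timestamp>(.*)</Timestamp>.*", "\\1", tsLine)
print(timestamp)
```

For the file built earlier, pointing `readLines()` at `xmlFile` instead of `demoFile` would show the real header.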

The interactive demo on the Revolution site pulls all of this together and exercises the key moving parts that would be present in a production level scoring application.

Follow these steps to walk through the demo:

  1. Click on the appropriate link in the Example: Audit Data section to download the file audit_scoring.csv to your disk.
  2. In the Build Predictive model box on the left:
    1. Select a name for the model
    2. Choose a Data set (You only have one choice: Audit Data).
    3. Select a model technique.
    4. Select the explanatory variables for your model.
    5. Press the Train Model button.
  3. In the Evaluate Performance box on the right, press the Deploy Model button to have RevoDeployR send the PMML code over to the ADAPA engine.
  4. In the CSV Batch Scoring box:

    1. Select your model.
    2. Upload the audit_scoring.csv file (or any other file that you may have which would be appropriate for the model you just built).
    3. Watch for the results.
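
If you would rather upload your own file in the batch scoring step, one possible way to prepare it in R is sketched below. The data frame is synthetic, the file name `my_scoring.csv` is hypothetical, and dropping the target column before upload is an assumption about what a scoring file should contain rather than a documented requirement of the demo.

```r
# Synthetic stand-in for held-out rows (hypothetical columns and values)
demo.test <- data.frame(
  ID              = 1:5,
  Age             = c(23, 41, 35, 58, 64),
  Hours           = c(40, 35, 50, 20, 45),
  TARGET_Adjusted = c(0, 1, 0, 0, 1)
)

# Keep only the input columns (assumption: the target is not uploaded)
inputCols <- setdiff(names(demo.test), "TARGET_Adjusted")
scoring   <- demo.test[, inputCols, drop = FALSE]

# Write a plain CSV without row names
scoringFile <- file.path(tempdir(), "my_scoring.csv")
write.csv(scoring, scoringFile, row.names = FALSE)

# Read it back to confirm the layout
check <- read.csv(scoringFile)
print(names(check))
```

The column names in the uploaded file would need to match the explanatory variables chosen when the model was trained.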

Revolution Analytics: Using ADAPA & Revolution R Enterprise—Audit Data Demo
