
by Andrie de Vries

We have written on several occasions about AzureML, the Microsoft machine learning studio that is part of the Cortana Analytics suite.

In September we announced that the AzureML package for R allows you to publish R functions as Azure web services. This is a brilliantly easy way to deploy your functions to other users and clients. For example, you can publish a function from R, then consume that function from Excel!

I am pleased to announce that we have completed a significant rewrite of the AzureML package. This rewrite adds several enhancements. Specifically, AzureML now also allows you to interact with:

• Workspace: connect to and manage AzureML workspaces

We have also significantly enhanced the functionality to publish and consume models:

• Publish: define a custom function or train a model and publish it as an Azure Web Service
• Consume: use available web services from R in a variety of convenient formats

Interacting with datasets

This version of the AzureML package adds new functionality to interact with datasets and experiments.

The code to do this is very simple:

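library(AzureML)

# Create a workspace object
ws <- workspace()

# List datasets
datasets(ws, filter = "sample")

# Download a dataset as a data frame
# (the dataset name is an assumption, inferred from the output shown below)
frame <- download.datasets(ws, name = "Forest fires data")
head(frame)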

As expected, this displays the first few lines of the resulting data frame:

  X Y month day FFMC  DMC    DC  ISI temp RH wind rain area
1 7 5   mar fri 86.2 26.2  94.3  5.1  8.2 51  6.7  0.0    0
2 7 4   oct tue 90.6 35.4 669.1  6.7 18.0 33  0.9  0.0    0
3 7 4   oct sat 90.6 43.7 686.9  6.7 14.6 33  1.3  0.0    0
4 8 6   mar fri 91.7 33.3  77.5  9.0  8.3 97  4.0  0.2    0
5 8 6   mar sun 89.3 51.3 102.2  9.6 11.4 99  1.8  0.0    0
6 8 6   aug sun 92.3 85.3 488.0 14.7 22.2 29  5.4  0.0    0

Publishing an R function as a web service

We made many improvements to the mechanism for publishing a web service.

In particular, it is now very easy to provide a data frame as input to the publishing function. You no longer have to specify the class of every column. Instead, the publishWebService() function automatically determines the column classes of the inputs as well as the results.
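To give a feel for what "determining the column classes" means, here is a minimal sketch in base R. It is not the package's internal implementation, only an illustration of the kind of information a data frame already carries. The `input` data frame and its column names are hypothetical.

```r
# Hypothetical input data frame; the column names are illustrative only.
input <- data.frame(
  Days = c(0, 1, 2),
  Subject = factor(c("308", "308", "309")),
  Label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Look up the (first) class of each column. This is the sort of
# information an input schema needs, and it can be read directly
# from the data frame instead of being typed in by hand.
column_classes <- vapply(input, function(col) class(col)[1], character(1))
print(column_classes)
```

Because the classes can be read from an example data frame, passing the data itself as the schema is enough to describe the inputs.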

To illustrate, here is an example from the help:

ws <- workspace()

# Publish a simple model using the lme4::sleepstudy data

library(lme4)
set.seed(1)
train <- sleepstudy[sample(nrow(sleepstudy), 120), ]
m <- lm(Reaction ~ Days + Subject, data = train)

# Define a prediction function to publish based on the model:
sleepyPredict <- function(newdata) {
  predict(m, newdata = newdata)
}

ep <- publishWebService(ws, fun = sleepyPredict, name = "sleepy lm",
                        inputSchema = sleepstudy,
                        data.frame = TRUE)

# OK, try this out, and compare with raw data
ans <- consume(ep, sleepstudy)$ans
plot(ans, sleepstudy$Reaction)
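Before publishing, it can be useful to check locally that a prediction function takes a data frame and returns one value per row, since that is the shape the published service is expected to mirror. The sketch below uses the built-in mtcars data as a stand-in for sleepstudy so that it runs without lme4 or a workspace. The names `fit` and `carPredict` are hypothetical and not part of the AzureML package.

```r
# Fit a small model on a built-in dataset (mtcars stands in for sleepstudy).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical prediction function with the same shape as sleepyPredict:
# it takes a data frame and returns a numeric vector of predictions.
carPredict <- function(newdata) {
  predict(fit, newdata = newdata)
}

# Local check: a data frame in, one numeric prediction per row out.
local_ans <- carPredict(mtcars)
stopifnot(is.numeric(local_ans), length(local_ans) == nrow(mtcars))
```

If the local check fails, the problem lies in the function itself rather than in the publishing step.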

Installation instructions

Right now, the new version is only available on GitHub. To install the package, use:

if(!require("devtools")) install.packages("devtools")
devtools::install_github("RevolutionAnalytics/AzureML")