DeployR Data I/O

June 22, 2015

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

by Sean Wells, Senior Software Engineer, Microsoft and David Russell

DeployR exists to solve a number of fundamental R analytics integration problems faced by application developers. For example, have you ever wondered how you might execute an R script from within a Web-based dashboard, an enterprise middleware solution, or a mobile application? DeployR makes this simple. In fact, it lets any application, developed in any language:

  1. Provision one or more dedicated R sessions on demand
  2. Execute R scripts on those R sessions
  3. Pass inputs (data, files, etc.) when executing those R scripts
  4. Retrieve outputs (data, files, plots, etc.) following the execution of those R scripts

DeployR offers these features and many more through a set of Analytics Web services. These Web service interfaces isolate the application developer and the application itself from R code, from the underlying R sessions, and in fact from all the complexities typically associated with R integration. DeployR client libraries, currently available in Java, JavaScript and .NET, make integration a snap for application developers. As application developers, we can leave R coding and model building to the data scientists. Now we can also leave all aspects of R session management to the DeployR server. This frees us up to focus on simple integrations with DeployR services that deliver the power of R directly within our applications.

What we want to focus on in this post is the rich set of data inputs and data outputs that can be passed and retrieved when executing R scripts on DeployR. Rather than tell you how you can work with inputs and outputs, we think it's better to show you. That is why we have prepared extensive tutorials on DeployR Data I/O, which include full source code and run instructions and are available on GitHub.

Taking just one example application from these tutorials, RepoFileInEncodedDataOut, let's briefly explore what's going on. The application generates the following console output:

CONFIGURATION Using endpoint http://localhost:7400/deployr
CONFIGURATION Using broker config [ PooledBrokerConfig ]
CONNECTION Established authenticated pooled broker [ RBroker ]
TASK INPUT Repository binary file input set on task, [ PooledTaskOptions.preloadWorkspace ]
TASK OPTION DeployR-encoded R object request set on task [ PooledTaskOptions.routputs ]
EXECUTION Pooled task submitted to broker [ RTask ]
TASK RESULT Pooled task completed in 1833ms [ RTaskResult ]
TASK OUTPUT Retrieved DeployR-encoded R object output hip [ RDataFrame ]
TASK OUTPUT Retrieved DeployR-encoded R object output hipDim [ RNumericVector ]
TASK OUTPUT Retrieved DeployR-encoded R object hipDim value=[2719.0, 9.0]
TASK OUTPUT Retrieved DeployR-encoded R object output hipNames [ RStringVector ]
TASK OUTPUT Retrieved DeployR-encoded R object hipNames value=[HIP, Vmag, RA, DE, Plx, pmRA, pmDE, e_Plx, B.V]
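The last four TASK OUTPUT lines can be cross-checked against each other. Assuming hipDim holds the dimensions of the hip data frame (rows, then columns) and hipNames holds its column names, which the object names suggest but the output does not state, the second element of hipDim should equal the number of entries in hipNames. The following is a minimal sketch of that check in plain Java. It works only on the printed values shown above, and the class and method names are hypothetical helpers, not part of any DeployR client library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: works on the printed console values only,
// not on objects returned by a DeployR client library.
public class HipOutputCheck {

    // Splits a printed value such as "[2719.0, 9.0]" into its trimmed elements.
    public static List<String> parseBracketedList(String printed) {
        String s = printed.trim();
        if (!s.startsWith("[") || !s.endsWith("]")) {
            throw new IllegalArgumentException("Expected [a, b, ...] but got: " + printed);
        }
        String inner = s.substring(1, s.length() - 1).trim();
        List<String> items = new ArrayList<>();
        if (inner.isEmpty()) {
            return items;
        }
        for (String part : inner.split(",")) {
            items.add(part.trim());
        }
        return items;
    }

    // True when the column count in the dimension vector equals the number of names.
    // Assumes the dimension vector is printed as [rows, columns].
    public static boolean columnsMatchNames(String printedDim, String printedNames) {
        List<String> dim = parseBracketedList(printedDim);
        List<String> names = parseBracketedList(printedNames);
        if (dim.size() != 2) {
            return false;
        }
        int columns = (int) Double.parseDouble(dim.get(1));
        return columns == names.size();
    }

    public static void main(String[] args) {
        // Values copied from the console output above.
        String hipDim = "[2719.0, 9.0]";
        String hipNames = "[HIP, Vmag, RA, DE, Plx, pmRA, pmDE, e_Plx, B.V]";
        System.out.println("columns match names: " + columnsMatchNames(hipDim, hipNames));
    }
}
```

For the values shown, the nine column names agree with the column count of 9 reported in hipDim.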

This console output can be interpreted as follows:

  1. CONFIGURATION & CONNECTION – the application connects and provisions R sessions on DeployR
  2. TASK INPUT – the application identifies a repository-managed file for auto-loading prior to task execution
  3. TASK OPTION – the application identifies the set of R workspace objects to be returned following task execution
  4. EXECUTION and TASK RESULT – the application submits the task to the broker, and the task completes (here in 1833ms)
  5. TASK OUTPUT – the application retrieves the set of R workspace objects returned following task execution
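The mapping from console lines to steps can also be done mechanically. The sketch below is plain Java with no DeployR dependency: it reads the leading label of each line and the trailing bracketed tag, which appears to name the client-library type or option involved (that reading is an assumption based on the output above). The class and its methods are hypothetical and exist only to illustrate how the lines are structured.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the console output; not a DeployR API.
public class ConsoleLineParser {

    // Phase labels as they appear at the start of each console line.
    static final String[] PHASES = {
        "CONFIGURATION", "CONNECTION", "TASK INPUT", "TASK OPTION",
        "EXECUTION", "TASK RESULT", "TASK OUTPUT"
    };

    // Matches a trailing tag such as "[ RBroker ]". The spaces inside the
    // brackets keep values like "[2719.0, 9.0]" from being treated as tags.
    static final Pattern TAG = Pattern.compile("\\[ ([\\w.]+) \\]$");

    // Returns the phase label a line starts with, or null if none matches.
    public static String phaseOf(String line) {
        for (String phase : PHASES) {
            if (line.startsWith(phase + " ")) {
                return phase;
            }
        }
        return null;
    }

    // Returns the trailing bracketed tag, or null if the line has none.
    public static String tagOf(String line) {
        Matcher m = TAG.matcher(line.trim());
        return m.find() ? m.group(1) : null;
    }

    // Counts lines per phase, preserving the order in which phases first appear.
    public static Map<String, Integer> countByPhase(String[] lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String phase = phaseOf(line);
            if (phase != null) {
                counts.merge(phase, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // A few lines copied from the console output above.
        String[] lines = {
            "CONFIGURATION Using endpoint http://localhost:7400/deployr",
            "CONNECTION Established authenticated pooled broker [ RBroker ]",
            "TASK OUTPUT Retrieved DeployR-encoded R object output hip [ RDataFrame ]",
            "TASK OUTPUT Retrieved DeployR-encoded R object hipDim value=[2719.0, 9.0]"
        };
        for (String line : lines) {
            System.out.println(phaseOf(line) + " -> " + tagOf(line));
        }
        System.out.println(countByPhase(lines));
    }
}
```

Lines that carry a printed value rather than a tag, such as the hipDim and hipNames value lines, yield no tag, which separates "object retrieved" lines from "object value" lines.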

The Java source code for this particular example application is included with the tutorials. As with all examples, the source code is heavily documented to help you understand the implementation step by step.

These tutorials are currently written in Java but the capabilities and mechanisms demonstrated apply equally across all languages and platforms. Check out the tutorials and let us know what you think.
