Site icon R-bloggers

Publish R functions as stored procedures with the sqlrutils package

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

If you've created an R function (say, a routine to clean up missing values in a data set, or a function to make forecasts using a machine learning model), and you want to make it easy for DBAs to use it, it's now possible to publish R functions as a SQL Server 2016 stored procedure. The sqlrutils package provides tools to convert an existing R function to a stored procedure which can then be executed by anyone with authenticated access to the database — even if they don't know any R.

To use an R function as a stored procedure, you'll need SQL Server 2016 with R Services installed. You'll also need to use the sqlrutils package to publish the function as a stored procedure: it's included with both Microsoft R Client (available free) and Microsoft R Server (included with SQL Server 2016), version 9.0 or later.

With that set up, the process is fairly simple:

  • Using Microsoft R (Client or Server), call library(sqlrutils) to make the publishing functions available in R.
    • Call help(package="sqlrutils") to see the list of functions provided.
  • Create the R function you want to publish. It should include one data.frame argument (this will be the input from the database) and return a data.frame as the value of the stored procedure.
    • The function can also return NULL, if the goal of the function is to modify the database directly.
    • You can use functions from the RevoScaleR package. For example, use RxSqlServerData to access data in the database with a SQL query, use RxDataStep to transform data using R, or use functions like rxLogit to train a predictive model.
    • Note that the function will only access data passed in as inputs — don't try reaching out to global variables, as they won't be there when the function runs within SQL Server.
  • Declare the input parameter for the stored procedure with the InputData function. In addition to naming the parameter, you can specify a default query to use if none is provided.
  • Optionally, define one or more additional input parameters for the stored procedure. These will become inputs to the R function defined above.
    • Only R objects of type POSIXct, numeric, character, integer, logical, and raw are allowed as inputs.
  • Prepare the R function for publishing with the StoredProcedure function. This is where you name the stored procedure and define the input parameters declared above.
  • Publish the stored procedure to the database with registerStoredProcedure
    • You'll need to provide a connection string with location and authentication information for the database. You can also provide this in the previous step, when you create the stored procedure.
    • You can test out the stored procedure from your R console with executeStoredProcedure.

That's it! Now your R function is available as an ordinary stored procedure, and can be executed (with its provided input data and optional parameters) like any other stored procedure.  

This is just an outline, and I've omitted some of the options for clarity. For more details on the process, you can find complete documentation at the link below.

Microsoft Docs: Generating an R Stored Procedure for R Code using the sqlrutils Package

 

To leave a comment for the author, please follow the link and comment on their blog: Revolutions.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.