A scalable data science platform with Microsoft R Server and Spark

April 18, 2016

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

If you want to train a statistical model on very large amounts of data, you'll need three things: a storage platform capable of holding all of the training data, a computational platform capable of efficiently performing the heavy-duty mathematical computations required, and a statistical computing language with algorithms that can take advantage of the storage and computation power. Microsoft R Server, running on HDInsight with Apache Spark, provides all three.

As Mario Inchiosa and Roni Burd demonstrate in this recorded webinar, Microsoft R Server can now run within HDInsight Hadoop nodes running on Microsoft Azure. Better yet, the big-data-capable algorithms of ScaleR take advantage of the in-memory architecture of Spark, dramatically reducing the time needed to train models on large data sets. And if your data grows or you just need more power, you can dynamically add nodes to the HDInsight cluster using the Azure portal.
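To give a feel for why this kind of algorithm scales across nodes, here is a minimal sketch of the general idea of fitting a model from per-chunk summaries. It is an illustration only, written in plain Python, and is not ScaleR's actual implementation or API: the `Stats` class, the `summarize` and `fit` helpers, and the toy data are all hypothetical names made up for this example. Each chunk of rows is reduced to a handful of sums, the sums are added together, and a simple linear regression is solved from the totals.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Stats:
    """Sufficient statistics for simple linear regression y = a + b*x."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def merge(self, other):
        # Combining two partial summaries is plain addition, so the order
        # in which chunks (or cluster nodes) report back does not matter.
        return Stats(self.n + other.n, self.sx + other.sx, self.sy + other.sy,
                     self.sxx + other.sxx, self.sxy + other.sxy)


def summarize(chunk):
    """Reduce one chunk of (x, y) rows to its sufficient statistics."""
    s = Stats()
    for x, y in chunk:
        s.n += 1
        s.sx += x
        s.sy += y
        s.sxx += x * x
        s.sxy += x * y
    return s


def fit(stats):
    """Solve the normal equations from the totals; returns (intercept, slope)."""
    denom = stats.n * stats.sxx - stats.sx ** 2
    slope = (stats.n * stats.sxy - stats.sx * stats.sy) / denom
    intercept = (stats.sy - slope * stats.sx) / stats.n
    return intercept, slope


# Hypothetical data: y = 2 + 3x, split into chunks as if spread across nodes.
rows = [(float(x), 2.0 + 3.0 * x) for x in range(10)]
chunks = [rows[0:4], rows[4:7], rows[7:10]]

partials = [summarize(c) for c in chunks]   # one summary per chunk
total = reduce(Stats.merge, partials)       # add the summaries together
intercept, slope = fit(total)
```

Because only the small summaries are combined, no single machine ever has to hold all of the rows at once, and adding nodes simply means more chunks can be summarized in parallel.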

Many of the details are in the slides embedded above, but to see a demonstration of Microsoft R Server running on Spark with HDInsight, click on the link below for access to the recorded webinar.

Microsoft Azure On-Demand Webinar: Building A Scalable Data Science Platform with R and Hadoop

