Posts Tagged ‘Big Data’

Webinar Tomorrow: Big Data Trees and Hadoop Connection in Revolution R Enterprise 6.1

November 14, 2012

Tomorrow at 9AM Pacific, Revolution Analytics VP of Product Development Sue Ranney will introduce two key Big Data features of the new Revolution R Enterprise 6.1. Now you can train classification and regression trees on data sets of unlimited size, quickly and using the resources of multiple processors and clusters. (This white paper describes our implementation of tree models...

Read more »

Benchmarking bigglm

November 13, 2012

By Joseph Rickert

In a recent blog post, David Smith reported on a talk that Steve Yun and I gave at Strata in NYC about building and benchmarking Poisson GLM models on various platforms. The results presented showed that the rxGlm function from Revolution Analytics’ RevoScaleR package running on a five-node cluster outperformed a MapReduce/Hadoop implementation...

Read more »

What’s new in Revolution R Enterprise 6.1

November 8, 2012

We're pleased to announce that the latest update to Revolution R Enterprise is available today! Existing subscribers will soon receive an email with update instructions, and the free academic distribution will be updated later today. Version 6.1 adds a frequently requested big-data statistical modeling algorithm, adds a new connectivity option for Hadoop, improves performance, and provides new security and installation options...

Read more »

Webinar Thursday: How R is used to optimize tractor production at John Deere

November 6, 2012

I just sat in on the rehearsal for Thursday's webinar by John Deere's Derek Hoffman, Order Fulfillment Forecasting at John Deere: How R Facilitates Creativity and Flexibility. Derek will give a spirited argument for why R is critical to the farming equipment manufacturer's operations: from forecasting demand for equipment, forecasting crop yields (they produce forecasts for more than half...

Read more »

Slides and replay for "The Rise of Data Science"

November 2, 2012

I had a great time presenting my new webinar yesterday. Thanks to everyone who attended "The Rise of Data Science in the Age of Big Data Analytics", and especially to those who submitted questions. Sorry I didn't have time to get to them all, but feel free to ask here in the comments. There's been some discussion recently about whether...

Read more »

R among TechCrunch’s 5 Trendy Open-Source Techs for Big Data

October 30, 2012

Tim Gasper (Product Manager at Big Data platform Infochimps) has an informative article at TechCrunch that provides an overview of five open-source technologies trending now for Big Data applications. They are: Storm and Kafka (for processing stream data); Drill and Dremel (for ad-hoc queries of big data); R (for data science with big data); Gremlin and Giraph (for graph...

Read more »

Allstate compares SAS, Hadoop and R for Big-Data Insurance Models

October 25, 2012

At the Strata conference in New York today, Steve Yun (Principal Predictive Modeler at Allstate's Research and Planning Center) described the various ways he tackled the problem of fitting a generalized linear model to 150M records of insurance data. He evaluated several approaches: Proc GENMOD in SAS; installing a Hadoop cluster; using open-source R (both on the full data...

Read more »

Quick notes from Strata NYC 2012

October 24, 2012

The O'Reilly Strata conferences are always great fun to attend, and this latest installment in New York City is no exception. This one is super-busy though; the conference has been sold out for weeks -- and not just marketing-sold-out, it's fire-department-sold-out. It's non-stop conversations and presentations, and it's tough to move through the hallways in between. Nonetheless, I...

Read more »

Two Talks on Data Science, Big Data and R

October 23, 2012

On Thursday next week (November 1), I'll be giving a new webinar on the topic of Big Data, Data Science and R. Titled "The Rise of Data Science in the Age of Big Data Analytics: Why Data Distillation and Machine Learning Aren’t Enough", this is a provocative look at why data scientists cannot be replaced by technology, and why...

Read more »

Vendor news: TIBCO’s proprietary R runtime; Teradata’s appliance integrates R

October 17, 2012

In a webinar today previewing Spotfire 5 (scheduled for release this November), TIBCO announced that it will include TERR: the TIBCO Enterprise Runtime for R. TERR is a closed-source reimplementation of the R language engine, and is not based on the GPL-licensed R project from the R Foundation. Here's the relevant slide from the webinar: By making the TERR engine...

Read more »