At the Oracle OpenWorld conference in San Francisco today, Oracle announced the new Oracle Big Data Appliance, “a new engineered system that includes an open source distribution of Apache™ Hadoop™, Oracle NoSQL Database, Oracle Data Integrator Application Adapter for Hadoop, Oracle Loader for Hadoop, and an open source distribution of R.” Oracle's foray into the Hadoop and NoSQL spaces has captured most of the attention, but it's the inclusion of open source R that I find most interesting. Oracle is clearly seeing demand from its customers for advanced analytics on big data, and is looking to R to meet it. I agree with Edd Dumbill's assessment:
Big data isn't much use until you can make sense of it, and the inclusion of R in Oracle's big data appliance bears this out. It also sets up R as a new industry standard for analytics: something that will raise serious concern among vendors of established statistical and analytical solutions SAS and SPSS.
Of course, this isn't the first time that R has been embedded in a data warehousing appliance. IBM Netezza's iClass device integrates with Revolution R, and AsterData, the Teradata Data Warehouse Appliance, and Greenplum all provide connections to R as well. Here at Revolution Analytics, we think that such enterprise-level integrations grow the R ecosystem and validate R as a key platform for advanced analytics. As CEO Norman Nie said to GigaOm this weekend,
“Oracle’s announcement to embed R demonstrates validation for the leading statistics language and offers further evidence that R is a key weapon in advanced analytics today”, said Dr. Nie. “As the Enterprise R company, Revolution Analytics has seen an enormous demand for R solutions, so it’s no surprise that Oracle and other companies are looking to further evangelize and distribute R among the enterprise.”
Oracle also announced another new component today, “Oracle R Enterprise”. Details on exactly what this is are sparse so far, other than that it allows you to “run existing R applications and use the R client directly against data stored in Oracle Database”, presumably thanks to an R package created by Oracle. Timothy Prickett Morgan at The Register has a couple of extra details:
The Big Data Appliance also includes the R programming language, a popular open source statistical-analysis tool. This R engine will integrate with 11g R2, so presumably if you want to do statistical analysis on unstructured data stored in and chewed by Hadoop, you will have to move it to Oracle after the chewing has subsided.
This approach to R-Hadoop integration is different from that announced last week between Revolution Analytics, the so-called Red Hat for stats that is extending and commercializing the R language and its engine, and Cloudera, which sells a commercial Hadoop setup called CDH3 and which was one of the early companies to offer support for Hadoop.
We look forward to hearing more about support for R from Oracle and other enterprise vendors. As the R ecosystem continues to expand into the enterprise space, this can only mean more investment in the R project as a whole.
Oracle Press Releases: Oracle Unveils the Oracle Big Data Appliance