202 search results for "hadoop"

Revolution Newsletter: March 2013

March 25, 2013

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full March edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Get Results Fast with our Quick Start Programs: Need help getting value from predictive...

Read more »

Data Science Education gets personal

March 14, 2013

by Joseph B. Rickert. It is difficult to imagine that there is anyone on the planet with an internet connection and a desire to learn something new who has not at least looked into taking a massive open online course (MOOC). Last fall, in an 11/4/12 article, the New York Times declared the Year of the MOOC and quoted...

Read more »

In case you missed it: February 2013 Roundup

March 13, 2013

In case you missed them, here are some articles from February of particular interest to R users. How to resample from a large data set with RHadoop, and a video introduction to the RHadoop packages. A 90-second video explains: What is Revolution R Enterprise? Jeffrey Stanton has published a free e-book "An Introduction to Data Science" using R. I...

Read more »

Revolution Analytics News Roundup

March 4, 2013

Between the Strata conference and various announcements, last week was certainly a busy one for the crew here at Revolution Analytics. So I thought I'd take the opportunity to catch you up on some of the recent media articles you might have missed: The Wall Street Journal interviewed our new VP of Services Neera Talbert on the trend towards...

Read more »

Summary of My First Trip to Strata #strataconf

February 28, 2013

In this post I am going to summarize some of the things that I learned at Strata Santa Clara 2013. For now, I will only discuss the conference sessions as I have a much longer post about the tutorial sessions that I am still working on and will post at a later date. I will add to this post...

Read more »

Resampling data in Hadoop with RHadoop

February 27, 2013

On Revolution Analytics partner Cloudera's blog, Uri Laserson has posted an excellent guide to resampling from a large data set in Hadoop. Resampling is an important step in fitting ensemble models (including random forests and other bagging techniques), and Uri provides a step-by-step guide to implementing resampling methods using RHadoop. He provides the complete map-reduce code in the R...

Read more »

New ways to Hadoop with R

February 26, 2013

Today, there are two main ways to use Hadoop with R and big data: 1. Use the open-source rmr package to write map-reduce tasks in R (running within the Hadoop cluster - great for data distillation!) 2. Import data from Hadoop to a server running Revolution R Enterprise, via HBase, ODBC (for high-performance Hadoop/SQL interfaces), or streaming data direct...

Read more »

Revolution Newsletter: February 2013

February 25, 2013

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full February edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Case study: Real-Time Marketing Analytics. Online advertising technology company Exelate uses predictive models to...

Read more »

Free e-book on Data Science with R

February 22, 2013

A new book by Jeffrey Stanton from Syracuse University School of Information Studies, An Introduction to Data Science, is now available for free download. The book, developed for Syracuse's Certificate for Data Science, is available under a Creative Commons License as a PDF (20Mb) or as an interactive eBook from iTunes. The book begins with the following clear definition...

Read more »

Video: IBM Opinionated Infrastructure Hangout

February 22, 2013

Had a great time earlier this week on a Google Hangout as part of the IBM Opinionated Infrastructure series. Moderator James Governor (analyst from RedMonk) kept the conversation lively, with topics ranging from the value of information to the benefits of predictive analytics and the evolution of Hadoop. R gets a mention at several points in the conversation, which...

Read more »