336 search results for "hadoop"

Analyzing Your Data on the AWS Cloud (with R)

July 22, 2013

Guest post by Jonathan Rosenblatt. Disclaimer: This post is not intended to be a comprehensive review, but more of a “getting started guide”. If I did not mention an important tool or package, I apologize and invite readers to contribute in the comments. …

Read more »

Real-Time Big Data Analytics: Emerging Architecture

July 19, 2013

O'Reilly Media has published a new whitepaper, Real-Time Big Data Analytics: Emerging Architecture. This 32-page document describes the processes and components necessary for getting on-demand information from big-data stores such as Hadoop. It answers the questions "How fast is fast?", "How real is real-time?" and "How big is big?", and provides practical guidance for implementing real-time analytics systems....

Read more »

Connect R with Myrrix – Mahout & Cloudera’s real-time, scalable recommender system

(This article was first published on BNOSAC - Belgium Network of Open Source Analytical Consultants, and kindly contributed to R-bloggers) Myrrix is probably better known to Java developers and users of Mahout than to R users. This is because, most of the time, Java and R developers live in different communities. If you go to the website of Myrrix...

Read more »

Revolution Newsletter: July 2013

July 15, 2013

The most recent edition of the Revolution Newsletter came out a couple of weeks ago. In case you missed it, the news section is below, and you can read the full July edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Let’s Be Quick About...

Read more »

A Rough Guide to Data Science

July 9, 2013

If Big Data was last year's buzzword, Data Science may reach the same level of hype this year. There's no shortage of discussion about the high demand for data scientists, the term's usefulness as a designation, and even declarations of its "sexiness" as a career. And as with many terms that reach a critical mass on social media, data...

Read more »

In case you missed it: June 2013 Roundup

July 3, 2013

In case you missed them, here are some articles from June of particular interest to R users: You can create a Word document from a template and an R script with the R2DOCX package. Joe Rickert reviews books and other resources for learning about time series analysis in R. Timely Portfolio covers 15 years of history of time series...

Read more »

Predictive analysis on Web Analytics tool data

July 3, 2013

In our previous webinar, we discussed predictive analytics and the basics of performing predictive analysis. We also discussed an eCommerce problem and how it can be solved using predictive analysis. In this post, I will explain the R script that I used to perform predictive analysis during the webinar. Before I explain the R script,

Read more »

Revolution Newsletter: June 2013

June 27, 2013

The most recent edition of the Revolution Newsletter came out a couple of weeks ago. In case you missed it, the news section is below, and you can read the full June edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. R is for Analytics:...

Read more »

Time Is on My Side – A Small Example for Text Analytics on a Stream

June 23, 2013

Introduction and Background: While my last posting was about recommendation in the context of Location Based Social Networks, there are also other interesting topics regarding the analysis of unstructured data. The most established one is probably Text Analytics/Mining, focusing on all sorts of text data. For me, coming from spatial analysis, this topic is relatively new, but I couldn’t help noticing...

Read more »

PivotalR Improves the Scalability and Performance of In-Database Analytics

June 18, 2013

One of the greatest challenges while working with big datasets concerns the need to move information out of storage for analysis. To this end, the recent announcement of PivotalR 0.1 extends Pivotal HD's capabilities, allowing users of the statistical programming language R to perform in-database analytics without leaving the command line.

Read more »