487 search results for "hadoop"

You can Hadoop it! It’s elastic! Boogie woogie woog-ie!

February 16, 2010
By
You can Hadoop it! It’s elastic! Boogie woogie woog-ie!

I just came back from the future and let me be the first to tell you this: Learn some Chinese. And more than just cào nǐ niáng (肏你娘) which your friend in grad school told you means “Live happy with many blessings”. Trust me, I’ve been hanging with Madam Wu and she told me

Read more »

Streaming Hadoop Data Into R Scripts

March 23, 2009
By
Streaming Hadoop Data Into R Scripts

Along the lines of Mongo Measurement Requires Mongo Management, the HadoopStreaming package on CRAN provides utilities for applying R scripts to Hadoop streaming. Hadoop is used on Amazon's EC2.

Read more »

Exploring NYC Taxi Data with Microsoft R Server and HDInsight

April 19, 2016
By
Exploring NYC Taxi Data with Microsoft R Server and HDInsight

As I mentioned yesterday, Microsoft R Server now available for HDInsight, which means that you can now run R code (including the big-data algorithms of Microsoft R Server) on a managed, cloud-based Hadoop instance. Debraj GuhaThakurta, Senior Data Scientist, and Shauheen Zahirazami, Senior Machine Learning Engineer at Microsoft, demonstrate some of these capabilities in their analysis of 170M taxi...

Read more »

A scalable data science platform with Microsoft R Server and Spark

April 18, 2016
By

If you want to train a statistical model on very large amounts of data, you'll need three things: a storage platform capable of holding all of the training data, a computational platform capable of efficently performing the heavy-duty mathematical computations required, and a statistical computing language with algorithms that can take advantage of the storage and computation power. Microsoft...

Read more »

AirbnB uses R to scale data science

April 5, 2016
By
AirbnB uses R to scale data science

Airbnb, the property-rental marketplace that helps you find a place to stay when you're travelling, uses R to scale data science. Airbnb is a famously data-driven company, and has recently gone through a period of rapid growth. To accommodate the influx of data scientists (80% of whom are proficient in R, and 64% use R as their primary data...

Read more »

Help improve treatment for brain injuries using machine learning and R

April 4, 2016
By
Help improve treatment for brain injuries using machine learning and R

The field of neuroscience -- the study of brains and the nervous system -- has taken some major leaps in recent years. Scientists can now gather real-time electrical activity from the brain during actions and thoughts, which is helping to pinpoint the exact location of brain lesions caused by strokes, and is leading to promising treatments for epilepsy and...

Read more »

A bit on the F1 score floor

April 2, 2016
By
A bit on the F1 score floor

At Strata+Hadoop World “R Day” Tutorial, Tuesday, March 29 2016, San Jose, California we spent some time on classifier measures derived from the so-called “confusion matrix.” We repeated our usual admonition to not use “accuracy” as a project goal (business people tend to ask for it as it is the word they are most familiar … Continue reading...

Read more »

yorkr crashes the IPL party! – Part 2

April 2, 2016
By
yorkr crashes the IPL party! – Part 2

Most people say that it is the intellect which makes a great scientist. They are wrong: it is character. Albert Einstein *Science is organized knowledge. Wisdom is organized life.“* Immanuel Kant If I have seen further, it is by standing on the shoulders of giants Isaac Newton Valid criticism does you a favor. Carl Sagan

Read more »

Upcoming Win-Vector LLC appearances

March 23, 2016
By
Upcoming Win-Vector LLC appearances

Win-Vector LLC will be presenting on statistically validating models using R and data science at: Strata+Hadoop World “R Day” Tutorial 9:00am–5:00pm Tuesday, March 29 2016, San Jose, California. ODSC San Francisco Meetup, 6:30pm-9:00pm Thursday, March 31, 2016, San Francisco, California. We will share code and examples. Registration required (and Strata is a paid conference). Please … Continue reading...

Read more »

Bay Area R User Group at Strata and PAW

March 10, 2016
By

by Joseph Rickert I always think of Strata Hadoop World and Predictive Analytics World as initiating the Spring conference season here in the San Francisco Bay Area. The rainy season is usually over by the end of March and it is a perfect time to visit. If you are traveling to either of these conferences from out of town...

Read more »

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)