496 search results for "hadoop"

Exploring NYC Taxi Data with Microsoft R Server and HDInsight

April 19, 2016

As I mentioned yesterday, Microsoft R Server is now available for HDInsight, which means that you can now run R code (including the big-data algorithms of Microsoft R Server) on a managed, cloud-based Hadoop instance. Debraj GuhaThakurta, Senior Data Scientist, and Shauheen Zahirazami, Senior Machine Learning Engineer at Microsoft, demonstrate some of these capabilities in their analysis of 170M taxi...

Read more »

A scalable data science platform with Microsoft R Server and Spark

April 18, 2016

If you want to train a statistical model on very large amounts of data, you'll need three things: a storage platform capable of holding all of the training data, a computational platform capable of efficiently performing the heavy-duty mathematical computations required, and a statistical computing language with algorithms that can take advantage of the storage and computation power. Microsoft...

Read more »

Answers to FAQ about SparkR for R users

April 5, 2016

Many people keep asking me whether I have tried SparkR, whether it is worth using, whether it is sexy, or what it is at all. I felt that a set of frequently asked questions (FAQ) on what Spark/SparkR actually is would help many R scientists understand this big-data buzz-tool. I have gathered information from the...

Read more »

Airbnb uses R to scale data science

April 5, 2016

Airbnb, the property-rental marketplace that helps you find a place to stay when you're travelling, uses R to scale data science. Airbnb is a famously data-driven company, and has recently gone through a period of rapid growth. To accommodate the influx of data scientists (80% of whom are proficient in R, and 64% use R as their primary data...

Read more »

Help improve treatment for brain injuries using machine learning and R

April 4, 2016

The field of neuroscience -- the study of brains and the nervous system -- has taken some major leaps in recent years. Scientists can now gather real-time electrical activity from the brain during actions and thoughts, which is helping to pinpoint the exact location of brain lesions caused by strokes, and is leading to promising treatments for epilepsy and...

Read more »

A bit on the F1 score floor

April 2, 2016

At the Strata+Hadoop World “R Day” Tutorial (Tuesday, March 29, 2016, San Jose, California), we spent some time on classifier measures derived from the so-called “confusion matrix.” We repeated our usual admonition to not use “accuracy” as a project goal (business people tend to ask for it as it is the word they are most familiar …

Read more »

yorkr crashes the IPL party! – Part 2

April 2, 2016

“Most people say that it is the intellect which makes a great scientist. They are wrong: it is character.” (Albert Einstein)
“Science is organized knowledge. Wisdom is organized life.” (Immanuel Kant)
“If I have seen further, it is by standing on the shoulders of giants.” (Isaac Newton)
“Valid criticism does you a favor.” (Carl Sagan)

Read more »

Upcoming Win-Vector LLC appearances

March 23, 2016

Win-Vector LLC will be presenting on statistically validating models using R and data science at the Strata+Hadoop World “R Day” Tutorial, 9:00am–5:00pm Tuesday, March 29, 2016, San Jose, California, and at the ODSC San Francisco Meetup, 6:30pm–9:00pm Thursday, March 31, 2016, San Francisco, California. We will share code and examples. Registration required (and Strata is a paid conference). Please …

Read more »

Bay Area R User Group at Strata and PAW

March 10, 2016

by Joseph Rickert

I always think of Strata Hadoop World and Predictive Analytics World as initiating the Spring conference season here in the San Francisco Bay Area. The rainy season is usually over by the end of March and it is a perfect time to visit. If you are traveling to either of these conferences from out of town...

Read more »

Data Science Virtual Machine updated with Microsoft R Server

March 2, 2016

Microsoft has updated the Data Science Virtual Machine, a data science toolkit-in-a-box that you can easily spin up on the Microsoft Azure cloud service. The virtual machine now comes pre-configured with Microsoft R Server Developer Edition (upgraded from Microsoft R Open), Anaconda Python, Jupyter notebooks for Python and R, Visual Studio Community Edition, Power BI desktop, and SQL Server...

Read more »

