Blog Archives

Build a Gradient Boosted Trees Model with Microsoft R Server

May 3, 2016

by Yuzhou Song, Microsoft Data Scientist

R is an open-source statistical programming language with millions of users in its community. However, a well-known weakness of R is that it is both single-threaded and memory-bound, which limits its ability to process big data. With Microsoft R Server (MRS), the enterprise-grade distribution of R for advanced analytics,...

Reading Efron with R

May 2, 2016

by Joseph Rickert

When I first went to grad school, the mathematicians advised me to cultivate the habit of reading with a pencil. This turned into a lifelong habit and a useful skill for reading all sorts of things: literature, reports and newspapers, for example, not just technical papers. However, reading statistics and data science papers, or really anything that includes...

R Conferences: Europe 2016

April 28, 2016

by Joseph Rickert

Answering email queries from friends and acquaintances from around the world wanting to attend useR! 2016 has been painful. It is amazing that the conference sold out a full two months before its start, but upon reflection, not unbelievable. From its inception useR! has been an "academic" conference both in spirit and location. Including this year's...

A Data Scientist’s Perspective on Microsoft R

April 26, 2016

by Lixun Zhang, Data Scientist at Microsoft

As a data scientist, I have experience with R. Naturally, when I was first exposed to Microsoft R Open (MRO, formerly Revolution R Open) and Microsoft R Server (MRS, formerly Revolution R Enterprise), I wanted to know the answers to three questions: What do R, MRO, and MRS have in common? What’s...

Get ready for R/Finance 2016

April 21, 2016

by Joseph Rickert

R/Finance 2016 is less than a month away and, as always, I am very much looking forward to it. In past years, I have elaborated on what puts it among my favorite conferences even though I am not a finance guy. R/Finance is small, single-track, and intense, with almost no fluff. And scattered among the...

Get Involved with the R Consortium

April 14, 2016

by Joseph Rickert

The R Consortium, the non-profit trade organization formed under the Linux Foundation to support the R language and the R Community, is beginning to build real momentum. First of all, two new companies recently joined the Consortium: Avant, which provides online personal and auto loans, and Procogia, a consulting firm that helps companies make data-driven business...

Book Review: Graphical Data Analysis with R

April 7, 2016

by Joseph Rickert

Basically, there are two kinds of graphics or plots you can make from a data set: (1) those that allow you to see what is going on with the data, and (2) those you make to communicate what you have found to someone else. When making the first kind, you want to select plots that will...

An Analysis of Traffic Violation Data with SQL Server and R

April 6, 2016

by Srini Kumar, Director of Data Science at Microsoft

Who does not hate being stopped and given a traffic ticket? Invariably, we think it unfair that we got one and everyone else did not. I am no different, and living in the SF Bay Area, I have often wondered if I could get the data about...

What’s new on CRAN: March 2016

March 31, 2016

by Joseph Rickert

Packages continue to flood into CRAN at a rate that challenges the sanity of anyone trying to keep up with what's new. So far this month, more than 190 packages have been added. Here is my view of what's interesting in this March madness. The launch_tutorial() function from the RtutoR package by Anup Nair launches...

Learning from Learning Curves

March 29, 2016

by Bob Horton, Senior Data Scientist, Microsoft

This is a follow-up to my earlier post on learning curves. A learning curve is a plot of predictive error for training and validation sets over a range of training set sizes. Here we’re using simulated data to explore some fundamental relationships between training set size, model complexity, and prediction error. Start...
