Blog Archives

Looking forward to 2016

December 24, 2015

by Joseph Rickert

The following map of all of the R user groups listed in Microsoft's Local R User Group Directory is a good way to visualize the R world as we rocket into 2016. As a member of the useR!2016 planning committee, foremost in my mind right now is that in just a few months people will be coming...

Read more »

Trade-offs to consider when reading a large dataset into R using the RevoScaleR package

December 15, 2015

by Seth Mottaghinejad, Data Scientist at Microsoft

R and big data

There are many R packages dedicated to letting users (or useRs if you prefer) deal with big data in R. (We will intentionally avoid using proper case for 'big data', because (1) the term has been somewhat hackneyed, and (2) for the sake of this article we can...

Read more »

Wald’s graphical sequential inspection procedure

December 10, 2015

by John Mount, Ph.D., Data Scientist at Win-Vector LLC

Our most recent article was a dynamic programming solution to the A/B test problem. Explicitly solving such dynamic programs is a long and tedious process, so you are well served by finding and introducing clever invariants to track (something better than just raw win-rates). This clever idea, called "sequential analysis",...

Read more »

Fun with ddR: Using Distributed Data Structures in R

December 8, 2015

by Edward Ma and Vishrut Gupta (Hewlett Packard Enterprise)

A few weeks ago, we revealed ddR (Distributed Data-structures in R), an exciting new project started by R-Core, Hewlett Packard Enterprise, and others that provides a fresh new set of computational primitives for distributed and parallel computing in R. The package sets the seed for what may become a standardized...

Read more »

Feature Selection with caret’s Genetic Algorithm Option

December 3, 2015

by Joseph Rickert

If there is anything that experienced machine learning practitioners are likely to agree on, it would be the importance of careful and thoughtful feature engineering. The judicious selection of which predictor variables to include in a model often has a more beneficial effect on overall classifier performance than the choice of the classification algorithm itself. This...

Read more »

Exploring Recursive CTEs with sqldf

December 1, 2015

by Bob Horton, Sr. Data Scientist at Microsoft

Common table expressions (CTEs, or “WITH clauses”) are a syntactic feature in SQL that makes it easier to write and use subqueries. They act as views or temporary tables that are only available during the lifetime of a single query. A more sophisticated feature is the “recursive CTE”, which is a...

Read more »

R User Group Activity 2015

November 27, 2015

by Joseph Rickert

2015 has been a good year for R user groups, both in terms of activity and the number of new groups founded. The plot below, which runs from 12/30/2012 through the week beginning Monday 11/23/2015, shows that the number of weekly meetings continues to drift up and to the right. You can see the seasonal pattern of...

Read more »

Mapping out Marriott’s Starwood Acquisition

November 24, 2015

by Michael Helbraun

The software business includes travel, and that means hotels. The news that Marriott was acquiring Starwood was of particular interest to me – especially since more than 75% of my 95 nights on the road so far this year have been spent with one of those two companies. While other folks can evaluate if the deal...

Read more »

Fun with Simpson’s Paradox: Simulating Confounders

November 21, 2015

by Bob Horton, Sr. Data Scientist, Microsoft

Wikipedia describes Simpson’s paradox as “a trend that appears in different groups of data but disappears or reverses when these groups are combined.” Here is the figure from the top of that article (you can click on the image in Wikipedia then follow the “more details” link to find the R code used...

Read more »

Rated R: Recommended Reading

November 19, 2015

by Joseph Rickert

What are you reading? And what are you recommending to friends, colleagues, and students who want to learn something about R programming? A quick search of Amazon will show that there are several new R books proposed for 2016; but of course, new doesn't necessarily mean better. I fully expect that many new books in...

Read more »
