Blog Archives

Using Microsoft R Server on a single machine for experiments with 600 million taxi rides.

June 14, 2016

by Dmitry Pechyoni, Microsoft Data Scientist

The New York City taxi dataset is one of the largest publicly available datasets. It contains about 1.1 billion taxi rides in New York City. This dataset was previously explored and visualized in a number of blog posts, whose authors used various technologies (e.g., PostgreSQL and Elasticsearch). Moreover, in a...

Read more »

R Consortium and useR! 2016 News

June 9, 2016

by Joseph Rickert

IBM Joins the R Consortium

This past Monday at the Spark Summit in San Francisco, IBM announced that it had joined the R Consortium as a "Platinum" member. This is very good news for the development and growth of the R language, the health of the R community and the position of open source software...

Read more »

Bayesian Optimization of Machine Learning Models

June 7, 2016

by Max Kuhn: Director, Nonclinical Statistics, Pfizer

Many predictive and machine learning models have structural or tuning parameters that cannot be directly estimated from the data. For example, when using a K-nearest neighbor model, there is no analytical estimator for K (the number of neighbors). Typically, resampling is used to get good performance estimates of the model for a given...

Read more »

Using caret to compare models

June 2, 2016

by Joseph Rickert

The model table on the caret package website lists more than 200 variations of predictive analytics models that are available within the caret framework. All of these models may be prepared, tuned, fit and evaluated with a common set of caret functions. All on its own, the table is an impressive testament to the utility and...

Read more »

Principal Components Regression in R: Part 3

May 31, 2016

by John Mount, Ph.D., Data Scientist at Win-Vector LLC

In her series on principal components analysis for regression in R, Win-Vector LLC's Dr. Nina Zumel broke the demonstration down into the following pieces:

Part 1: the proper preparation of data and use of principal components analysis (particularly for supervised learning or regression).

Part 2: the introduction of y-aware...

Read more »

Some Impressions from R Finance 2016

May 27, 2016

by Joseph Rickert

R / Finance 2016 lived up to expectations and provided the quality networking and learning experience that longtime participants have come to value. Eight years is a long time for a conference to keep its sparkle and pizzazz. But the conference organizers and the UIC have managed to create a vibe that keeps people coming back....

Read more »

Principal Components Regression in R: Part 2

May 24, 2016

by John Mount, Ph.D., Data Scientist at Win-Vector LLC

In part 2 of her series on Principal Components Regression, Dr. Nina Zumel illustrates so-called y-aware techniques. These often-neglected methods use the fact that for predictive modeling problems we know the dependent variable, outcome or y, so we can use this during data preparation in addition to using...

Read more »

User Groups and R Awareness

May 19, 2016

by Joseph Rickert

For quite a few years now we have attempted to maintain the Revolution Analytics' Local R User Group Directory as the complete and authoritative list of R user groups. Meetup groups make this list in one of two ways: we discover the group because they have a web page of some sort proclaiming the group to...

Read more »

Principal Components Regression in R, an operational tutorial

May 17, 2016

by John Mount, Ph.D., Data Scientist at Win-Vector LLC

Win-Vector LLC's Dr. Nina Zumel has just started a two-part series on Principal Components Regression that we think is well worth your time. You can read her article here. Principal Components Regression (PCR) is the use of Principal Components Analysis (PCA) as a dimension reduction step prior to linear...

Read more »

Good R Packages

May 12, 2016

by Joseph Rickert

What makes for a good R package? With over 8,000 packages up on CRAN, the quantity of packages is clearly not an issue for R users. Developing an instinct to recognize quality, however, both requires and deserves some effort. I regularly spend time on Dirk Eddelbuettel’s CRANberries site investigating new packages and monitoring changes in old...

Read more »
