What’s New in 6.2: Stepwise Regression for Big Data

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

by Thomas Dinsmore

This is the third in a series of posts highlighting new features in Revolution R Enterprise Release 6.2, which is scheduled for General Availability on April 22. This week's post features our new Stepwise Regression capability.

The Stepwise process starts with a specified model and then, at each step, adds or removes the single variable that most improves the fit according to a selection criterion. It stops when no further improvement is possible or when it reaches a specified model boundary (the smallest or largest model it is allowed to consider). By automating the selection of candidate features for a predictive model, Stepwise Regression significantly accelerates the model-building process.
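
To make the loop concrete, here is a minimal sketch of a stepwise search in plain Python. It illustrates the general technique only and is not the Revolution R Enterprise implementation: the function names (`rss`, `aic`, `stepwise`), the simulated data, and the use of AIC in the form n*log(RSS/n) + 2k are all assumptions made for this example.

```python
import math
import random

def rss(columns, y):
    """Residual sum of squares from an OLS fit of y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] * n] + [list(c) for c in columns]
    p = len(X)
    # Augmented normal equations [X'X | X'y].
    A = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(p)]
         + [sum(a * b for a, b in zip(X[i], y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [x - f * z for x, z in zip(A[r], A[k])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((y[t] - sum(beta[i] * X[i][t] for i in range(p))) ** 2
               for t in range(n))

def aic(data, y, names):
    """AIC up to an additive constant; k counts the intercept as a parameter."""
    n = len(y)
    k = len(names) + 1
    return n * math.log(rss([data[v] for v in names], y) / n) + 2 * k

def stepwise(data, y, start=(), direction="both"):
    """Greedy search: apply the single add/drop that lowers AIC most, until none does."""
    current = list(start)
    best = aic(data, y, current)
    while True:
        moves = []
        if direction in ("forward", "both"):
            moves += [current + [v] for v in data if v not in current]
        if direction in ("backward", "both"):
            moves += [[u for u in current if u != v] for v in current]
        if not moves:  # model boundary reached: nothing left to add or drop
            return current, best
        score, model = min(((aic(data, y, m), m) for m in moves),
                           key=lambda s: s[0])
        if score >= best:  # no candidate improves the criterion
            return current, best
        current, best = model, score

# Simulated data (hypothetical): y depends on x1 only; x2 and x3 are pure noise.
rng = random.Random(0)
n = 60
data = {name: [rng.gauss(0, 1) for _ in range(n)] for name in ("x1", "x2", "x3")}
y = [3.0 * a + rng.gauss(0, 0.5) for a in data["x1"]]

selected, score = stepwise(data, y, start=(), direction="both")
print(selected, round(score, 2))
```

Passing `direction="forward"` with an empty starting model, or `direction="backward"` with all variables in `start`, turns the same loop into forward selection or backward elimination. The sketch refits every candidate model from scratch, which is fine for three variables but is exactly the cost that becomes prohibitive with hundreds of candidates.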

One of our customers, for example, builds more than a thousand models every week for targeted marketing.  At that scale of activity, traditional model-fitting techniques are simply too slow.  Starting with a feature set of more than 500 candidate variables, this customer runs fast feature selection techniques to reduce the number of variables, then runs Stepwise Regression to finalize the model.

In designing the Stepwise Regression capability, we relied on customer feedback, and also reviewed similar capabilities in open source R, such as stepAIC() in the MASS package.  Since many of our customers are actively converting from SAS, we looked at the Stepwise capabilities in SAS as well.

In Release 6.2, we support the following Stepwise methods for Linear Regression: 

  • Forward selection
  • Backward elimination
  • Bidirectional search

We support three different user-specifiable selection criteria:

  • AIC
  • BIC
  • Mallows' Cp
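
For readers who want to see how these criteria trade fit against model size, the sketch below computes each one from a model's residual sum of squares. It uses common textbook forms for Gaussian linear models. The function names and argument conventions are assumptions for illustration, and the exact definitions used by any particular product may differ by an additive constant or in how parameters are counted.

```python
import math

def aic_from_rss(rss, n, p):
    """AIC up to an additive constant: n*log(RSS/n) + 2p."""
    return n * math.log(rss / n) + 2 * p

def bic_from_rss(rss, n, p):
    """BIC up to an additive constant: n*log(RSS/n) + p*log(n)."""
    return n * math.log(rss / n) + p * math.log(n)

def mallows_cp(rss, n, p, sigma2_full):
    """Mallows' Cp: RSS/sigma2 - n + 2p, with sigma2 estimated from the full model."""
    return rss / sigma2_full - n + 2 * p

# Hypothetical numbers: n observations, p parameters (intercept included).
n = 100
p_full, rss_full = 6, 47.0                  # largest model under consideration
sigma2_full = rss_full / (n - p_full)       # its residual variance estimate

p_sub, rss_sub = 3, 50.0                    # a smaller candidate model
print(aic_from_rss(rss_sub, n, p_sub))
print(bic_from_rss(rss_sub, n, p_sub))
print(mallows_cp(rss_sub, n, p_sub, sigma2_full))

# Sanity check: for the full model itself, Cp equals p_full (up to rounding).
cp_full = mallows_cp(rss_full, n, p_full, sigma2_full)
```

In all three cases a lower value is better. BIC charges log(n) per parameter instead of 2, so once n is 8 or more it penalizes extra variables more heavily than AIC and tends to select smaller models.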

Coming up later this year in our Release 7.0, we plan to expand the Stepwise capabilities to Logistic Regression and General Linear Models.

Your comments and suggestions are welcome. If you like to use Stepwise Regression and you are interested in a feature that you don't see mentioned in this post, let us know what you think in the Comments section below.
