New caret version with adaptive resampling

[This article was first published on Blog - Applied Predictive Modeling.]

A new version of caret is on CRAN now.

There are a number of bug fixes:

  • A man page with the list of models available via train was added back into the package. See ?models.
  • Thoralf Mildenberger found and fixed a bug in the variable importance calculation for neural network models.
  • The output of varImp for pamr models was updated to clarify the ordering of the importance scores.
  • getModelInfo was updated to generate a more informative error message if the user looks for a model that is not in the package’s model library.
  • A bug was fixed related to how seeds were set inside of train.
  • The model “parRF” (parallel random forest) was added back into the library.
  • When case weights are specified in train, the hold-out weights are exposed when computing the summary function.
  • A check was added so that a data.table given to train is converted to a data frame (see this post).
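As an example of the model-library lookup mentioned above, getModelInfo returns the metadata that train uses for a given model code (a quick sketch; "parRF" is one of the codes listed on the ?models page):

```r
library(caret)

## Look up a model by its code; regex = FALSE requires an exact match
rf_info <- getModelInfo("parRF", regex = FALSE)[["parRF"]]

## The returned list describes the tuning parameters, the fitting and
## prediction functions, and so on
rf_info$parameters

## A model code that is not in the library now triggers a more
## informative error message:
## getModelInfo("not_a_model", regex = FALSE)
```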

One big new feature is support for adaptive resampling. I’ll be speaking about this at useR! this year. The manuscript is being submitted for publication; in the meantime, a pre-print is available on arXiv.

Basically, after a minimum number of resamples have been processed, the tuning parameter values are no longer treated equally: values that are unlikely to be optimal are dropped as resampling proceeds. This can produce substantial speed-ups, with a low probability of settling on a poor model. Here is a plot of the median speed-up (y axis) versus the estimated probability that a model at least as good as the one found using all of the resamples is identified.
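In caret, this is turned on through trainControl. The sketch below uses simulated data and boosted trees purely as an illustration; the adaptive list sets the minimum number of resamples before filtering starts (min), the confidence level used to drop parameter values (alpha), the filtering method ("gls" for a linear model or "BT" for a Bradley-Terry model), and whether the winning parameter combination is resampled completely at the end (complete):

```r
library(caret)

set.seed(1)
dat <- twoClassSim(500)   # simulated two-class data from caret

ctrl <- trainControl(method = "adaptive_cv",
                     number = 10, repeats = 5,
                     adaptive = list(min = 5,       # resamples before any filtering
                                     alpha = 0.05,  # confidence level for removal
                                     method = "gls",
                                     complete = TRUE),
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)

mod <- train(Class ~ ., data = dat,
             method = "gbm",
             metric = "ROC",
             trControl = ctrl,
             verbose = FALSE)
```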


The manuscript has more details about the other factors in the graph. One nice property of this methodology is that, when combined with parallel processing, the speed-ups could be as high as 30-fold (for the simulated example).
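Since train automatically uses any registered foreach backend, the adaptive filtering stacks with parallel processing; a minimal sketch (the worker count of 4 is an arbitrary assumption):

```r
library(doParallel)

cl <- makeCluster(4)     # 4 workers; adjust to your machine
registerDoParallel(cl)

## The same train() call as before now runs its resamples in
## parallel on the registered workers:
## mod <- train(Class ~ ., data = dat, method = "gbm",
##              metric = "ROC", trControl = ctrl, verbose = FALSE)

stopCluster(cl)
```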

These features should be considered experimental. Send me any feedback on them that you may have.
