Latest vtreat up on CRAN

[This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

There is a new version of the R package vtreat now up on CRAN.

vtreat is an essential data preparation system for predictive modeling that helps defend your predictive modeling work against real world data issues including:

  • High cardinality categorical variables
  • Rare levels (including new or novel levels during application) in categorical variables
  • Missing data (random or systematic)
  • Irrelevant variables/columns
  • Nested model bias, and other over-fit issues.

vtreat also includes excellent, citable, documentation: vtreat: a data.frame Processor for Predictive Modeling.

For this release I want to thank everybody who generously donated their time to submit an issue or build a git pull-request. In particular:

  • Vadim Khotilovich, who found and fixed a major performance problem in the y-stratified sampling.
  • Lawrence Wu, who has been donating documentation fixes.
  • Peter Hurford, who has been donating documentation fixes.

To leave a comment for the author, please follow the link and comment on their blog: R – Win-Vector Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)