# Re-Share: vtreat Data Preparation Documentation and Video

[This article was first published on **R – Win-Vector Blog**, and kindly contributed to R-bloggers].

I would like to re-share the vtreat (R version, Python version) data preparation documentation for machine learning tasks.

vtreat is a system for preparing messy real-world data for predictive modeling tasks (classification, regression, and so on). In particular, it is very good at re-coding high-cardinality string-valued (or categorical) variables for later use.
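
To make "re-coding a high-cardinality categorical variable" concrete, here is a minimal sketch of the general impact-coding idea: each level is replaced by how far its average outcome sits from the overall average. This is a toy illustration in plain Python, not vtreat's implementation. The function names `fit_impact_code` and `apply_impact_code` and the example data are hypothetical, and treating unseen levels as zero is an assumption made for the sketch.

```python
from collections import defaultdict
from statistics import mean


def fit_impact_code(levels, outcomes):
    """Learn a numeric code per level: mean outcome for the level minus the grand mean.

    Hypothetical helper for illustration only; this is not part of vtreat.
    """
    grand_mean = mean(outcomes)
    by_level = defaultdict(list)
    for level, y in zip(levels, outcomes):
        by_level[level].append(y)
    return {level: mean(ys) - grand_mean for level, ys in by_level.items()}


def apply_impact_code(codes, levels):
    """Replace each level with its learned code.

    Assumption: levels never seen during fitting get 0.0,
    meaning "no evidence of deviation from the grand mean".
    """
    return [codes.get(level, 0.0) for level in levels]


# Made-up training data: a string-valued variable and a numeric outcome.
zip_codes = ["94110", "94110", "10001", "10001", "60601"]
prices = [10.0, 12.0, 20.0, 22.0, 16.0]

codes = fit_impact_code(zip_codes, prices)

# New data, including a level ("99999") that was not present in training.
encoded = apply_impact_code(codes, ["94110", "60601", "99999"])
print(encoded)
```

A naive code like this, fit and applied on the same rows, can overfit when many levels are rare. Handling that safely is a large part of what a dedicated data preparation tool is for, so treat the sketch as intuition only.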

A nice introductory video lecture on vtreat can be found here, and the latest copy of the lecture slides here. Or, you can check out chapter 8, “Advanced data preparation”, of Zumel and Mount, *Practical Data Science with R*, 2nd Edition, Manning 2019, which covers the use of vtreat.

The vtreat documentation is organized by task (regression, classification, multinomial classification, and unsupervised), language (R or Python), and interface style (design/prepare or fit/prepare). In particular, the R code now supports both interface variations, allowing users to choose what works best with their coding style: either design/prepare (which is very fluid when combined with wrapr::unpack notation) or fit/prepare (which uses mutable state to organize steps).
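
The difference between the two interface styles can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not vtreat's API: every name here (`design_missing_value_plan`, `prepare`, `MissingValueTreatment`, the `_was_missing` column suffix) is hypothetical, and the "treatment" is a deliberately trivial mean-fill for missing values.

```python
from statistics import mean


# Style 1: design/prepare. The plan is an explicit value passed between functions.
def design_missing_value_plan(rows, column):
    """Learn a fill value for `column` from training rows (hypothetical helper)."""
    observed = [r[column] for r in rows if r[column] is not None]
    return {"column": column, "fill": mean(observed)}


def prepare(plan, rows):
    """Apply a previously designed plan to rows, returning new rows."""
    col = plan["column"]
    out = []
    for r in rows:
        new_row = dict(r)
        missing = r[col] is None
        # Record that the value was missing, then fill it in.
        new_row[col + "_was_missing"] = 1 if missing else 0
        new_row[col] = plan["fill"] if missing else r[col]
        out.append(new_row)
    return out


# Style 2: fit/prepare. An object holds the plan as mutable state.
class MissingValueTreatment:
    def __init__(self, column):
        self.column = column
        self.plan_ = None  # set by fit()

    def fit(self, rows):
        self.plan_ = design_missing_value_plan(rows, self.column)
        return self

    def prepare(self, rows):
        if self.plan_ is None:
            raise RuntimeError("call fit() before prepare()")
        return prepare(self.plan_, rows)


train = [{"x": 1.0}, {"x": None}, {"x": 3.0}]

# Functional style: design a plan, then apply it.
plan = design_missing_value_plan(train, "x")
functional = prepare(plan, train)

# Stateful style: fit the object, then ask it to prepare data.
stateful = MissingValueTreatment("x").fit(train).prepare(train)
```

Both routes produce the same prepared rows. The choice is about where the learned plan lives: in a value you pass around yourself, or inside an object that remembers it for you.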

* **Regression**: `Python` regression example, `R` regression example, fit/prepare interface, `R` regression example, design/prepare/experiment interface.
* **Classification**: `Python` classification example, `R` classification example, fit/prepare interface, `R` classification example, design/prepare/experiment interface.
* **Unsupervised tasks**: `Python` unsupervised example, `R` unsupervised example, fit/prepare interface, `R` unsupervised example, design/prepare/experiment interface.
* **Multinomial classification**: `Python` multinomial classification example, `R` multinomial classification example, fit/prepare interface, `R` multinomial classification example, design/prepare/experiment interface.
