More on preparing data

March 18, 2016

(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers)

The Microsoft Data Science User Group just sponsored Nina Zumel's presentation "Preparing Data for Analysis Using R". Microsoft saw Win-Vector LLC's ODSC West 2015 presentation "Prepping Data for Analysis using R" and generously offered to sponsor improving it and disseminating it to a wider audience.


We feel Nina really hit the ball out of the park with over 400 new live viewers. Read more for links to even more free materials!

Microsoft has generously sponsored the following:

These are really great materials and we will be promoting and distributing them widely.

Nina emphasized teaching the principles of data treatment and cleaning (frankly, an under-emphasized task). She also mentioned vtreat, a free R package supplied by Win-Vector LLC that automates a great number of these steps in a principled and statistically sound manner. Because her lecture is likely to attract more interest in vtreat, we have tuned up the vtreat documentation a bit and made it available as pre-rendered HTML (in addition to the normal vignette distribution). Of particular interest: we have finally enumerated all the variable types that vtreat uses to re-encode your data.
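For readers who have not seen vtreat before, here is a minimal sketch of what a treatment looks like in practice. It is not taken from the presentation: the data frame, its column names (`x_cat`, `x_num`, `y`) and the random seed are invented purely for illustration, and it assumes the vtreat package is installed. It designs a treatment plan for a numeric outcome, lists the derived variables along with the type code vtreat assigned to each, and then applies the plan.

```r
library(vtreat)

# Synthetic data (hypothetical, for illustration only): a categorical
# column and a numeric column, both with missing values, plus a numeric outcome.
set.seed(2016)
n <- 200
d <- data.frame(
  x_cat = sample(c("a", "b", "c", NA), n, replace = TRUE),
  x_num = ifelse(runif(n) < 0.1, NA, rnorm(n)),
  stringsAsFactors = FALSE
)
d$y <- ifelse(is.na(d$x_num), 0, d$x_num) +
  ifelse(d$x_cat %in% "a", 1, 0) +
  rnorm(n, sd = 0.1)

# Design a treatment plan for a numeric outcome.
plan <- designTreatmentsN(d,
                          varlist = c("x_cat", "x_num"),
                          outcomename = "y",
                          verbose = FALSE)

# The score frame lists each derived variable, the original column it
# came from, and the variable type code vtreat assigned to it.
print(plan$scoreFrame[, c("varName", "origName", "code")])

# Apply the plan. pruneSig = NULL keeps every derived variable
# (no significance-based pruning).
d_treated <- prepare(plan, d, pruneSig = NULL)
head(d_treated)
```

In real work you would design the plan on one set of rows and fit your model on another, since some of the derived variables are built using the outcome.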
