Meta-packages, nails in CRAN’s coffin
Derek Jones recently discussed a possible future for the R ecosystem in “StatsModels: the first nail in R’s coffin”. This got me thinking about the future of CRAN (which I consider vital to R, and vital in distributing our work) in the era of super-popular meta-packages. Meta-packages are convenient, but they have a profoundly negative impact on the packages they exclude.
For example: tidyverse advertises a popular R universe where the vital package data.table never existed. And now tidymodels is shaping up to be a popular universe where our own package vtreat never existed, except possibly as a footnote to embed.
Users currently (with some luck) discover packages like ours and then (because they trust CRAN) feel able to try them. With popular walled gardens, that becomes much less likely. It is one thing for a standard package to duplicate another package (that is hard to avoid, and it is how work legitimately competes); it is quite another for a big-brand meta-package to pre-pick winners (and losers).
All I can say is: please give vtreat a try. It is a package for preparing messy real-world data for predictive modeling. In addition to re-coding high-cardinality categorical variables (into what we call effect-codes after Cohen, or impact-codes), it deals with missing values, can be parallelized, can be run on databases, and has years of production experience baked in.
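To give a concrete sense of the workflow, here is a minimal sketch of vtreat's design/prepare pattern on a numeric outcome. The toy data frame and column names are invented for illustration; for real training data, vtreat's cross-frame functions are the preferred way to avoid nested-model bias.

```r
library(vtreat)

# toy data: a high-cardinality categorical, a numeric column with NAs, and a numeric outcome
set.seed(2019)
d <- data.frame(
  zip = sample(sprintf("z%03d", 1:50), 200, replace = TRUE),
  x   = ifelse(runif(200) < 0.2, NA, rnorm(200)),
  y   = rnorm(200)
)

# design a treatment plan: effect/impact codes for zip, NA indicators and imputation for x
plan <- designTreatmentsN(d, varlist = c("zip", "x"), outcomename = "y")

# apply the plan to get an all-numeric, NA-free frame ready for modeling
d_treated <- prepare(plan, d)
str(d_treated)
```

The resulting columns can be fed directly to any standard R modeling function.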
Some places to start with vtreat: