Model Homotopies in the Wild

[This article was first published on R – Win Vector LLC, and kindly contributed to R-bloggers.]

So are model homotopies commonly used?

Yes, they are.

As an example consider glmnet:

Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL http://www.jstatsoft.org/v33/i01/.

From help(glmnet):

library(glmnet)
x = matrix(rnorm(100 * 20), 100, 20)
g2 = sample(c(0,1), 100, replace = TRUE)
fit2 = glmnet(x, g2, family = "binomial")

fit2 isn’t a model. It is in fact a family of models indexed by a single parameter, in this case lambda, the degree of regularization. So it is a model homotopy parameterized by regularization instead of by prevalence.

Further, the predict(fit2, newx = x) call returns one prediction for each of these related models, not a prediction from any one model.
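A quick way to see this is to look at the shape of what predict() returns. The sketch below repeats the help(glmnet) example; the seed, the names preds, n_models and one_pred, and the choice of a lambda from the middle of the path are additions for illustration only.

```r
library(glmnet)

set.seed(1)  # seed added only so the sketch is reproducible
x <- matrix(rnorm(100 * 20), 100, 20)
g2 <- sample(c(0, 1), 100, replace = TRUE)
fit2 <- glmnet(x, g2, family = "binomial")

# Number of models in the family: one per lambda value on the fitted path.
n_models <- length(fit2$lambda)

# Without s, predict() returns a matrix: one row per observation,
# one column per lambda (that is, per model in the family).
preds <- predict(fit2, newx = x)
print(dim(preds))

# Supplying s picks out a single member of the family.
# The middle of the path is an arbitrary choice for this sketch.
s_pick <- fit2$lambda[ceiling(n_models / 2)]
one_pred <- predict(fit2, newx = x, s = s_pick)
print(dim(one_pred))
```

So a single prediction vector only appears once a value of lambda has been chosen, for example through the s argument.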

This model homotopy even includes a plot method showing the trajectory of the coefficients, parameterized by the L1 norm of the coefficients (which themselves are consequences of the regularization trajectory the model homotopy is parameterized by).

plot(fit2)

In principle this is a discrete approximation of a fully continuous model homotopy.
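The discrete grid can be inspected directly, and glmnet will also answer for values of lambda between grid points. The sketch below assumes the same simulated data as the help(glmnet) example; the seed and the names lam, s_mid and coef_mid are illustrative additions. It relies on the documented default exact = FALSE, under which coef() and predict() interpolate linearly between neighbouring fitted lambdas rather than refitting.

```r
library(glmnet)

set.seed(1)  # seed added only so the sketch is reproducible
x <- matrix(rnorm(100 * 20), 100, 20)
g2 <- sample(c(0, 1), 100, replace = TRUE)
fit2 <- glmnet(x, g2, family = "binomial")

# The family is stored on a finite, decreasing grid of lambda values.
lam <- fit2$lambda
print(length(lam))
print(head(lam))

# A lambda strictly between two grid points (midpoint chosen arbitrarily).
s_mid <- mean(lam[2:3])

# With the default exact = FALSE, the coefficients at s_mid are
# interpolated from the neighbouring grid fits, not refit from the data.
coef_mid <- coef(fit2, s = s_mid)
print(dim(coef_mid))  # intercept plus 20 coefficients, one column
```

In this sense the grid plus interpolation stands in for the continuous path, which is the discrete approximation described above.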

Also, in gradient boosting and deep learning, it is common to examine the performance of a family of related models indexed by training epoch or training generation number. In this case the model index is discrete, but we see the family of models is reasoned about as a collection. In my opinion this means having a general name for such a collection is of some value.
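As a small illustration of a family indexed by training epoch, here is a sketch in base R. It is neither gradient boosting nor deep learning: it is plain gradient descent on a logistic regression, used only because it is short and self-contained. The simulated data, the step size, the number of epochs, and all variable names are assumptions made for this example.

```r
set.seed(2024)  # seed added only so the sketch is reproducible

# Simulated data: an intercept column plus 3 features (hypothetical setup).
n <- 200
x <- cbind(1, matrix(rnorm(n * 3), n, 3))
beta_true <- c(-0.5, 1, -1, 0.5)
y <- rbinom(n, 1, plogis(as.vector(x %*% beta_true)))

n_epochs <- 50
step <- 0.5
beta <- rep(0, ncol(x))

# Row k of path holds the coefficients of the model after epoch k,
# so path as a whole is a family of models indexed by epoch.
path <- matrix(NA_real_, nrow = n_epochs, ncol = ncol(x))
deviance_by_epoch <- numeric(n_epochs)

for (epoch in seq_len(n_epochs)) {
  p <- plogis(as.vector(x %*% beta))
  grad <- as.vector(crossprod(x, p - y)) / n  # gradient of mean negative log-likelihood
  beta <- beta - step * grad
  path[epoch, ] <- beta
  p_new <- plogis(as.vector(x %*% beta))
  deviance_by_epoch[epoch] <- -2 * sum(y * log(p_new) + (1 - y) * log(1 - p_new))
}

print(head(deviance_by_epoch))
```

Plotting deviance_by_epoch against the epoch number, for example with plot(deviance_by_epoch, type = "l"), gives a performance curve over the whole family rather than a summary of any one model.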

An example of such a graph is given here:
