All your models belong to us: how to combine package archivist and function trace()

[This article was first published on SmarterPoland.pl » English, and kindly contributed to R-bloggers].

Let’s see how to collect all linear regression models that you will ever create in R.

It’s easy with the trace() function, a really powerful yet not that popular function that lets you inject arbitrary R code at any point in the body of any function.
It is useful for debugging and has other interesting applications.
Below I will show how to use it to store a copy of every linear model created with lm(). In the same way you can store copies of plots, other models, data frames, or anything else.
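To make the mechanism concrete, here is a minimal sketch in base R only. The function square_sum() and the helper record() are hypothetical names invented for this illustration; the point is that code passed via the exit argument runs inside the traced function and can see its local variables.

```r
# A toy function (hypothetical) whose result lives in the local variable `out`
square_sum <- function(x) {
  out <- sum(x^2)
  out
}

# An environment that collects whatever the tracer hands over
captured <- new.env()
captured$values <- list()
record <- function(value) {
  captured$values[[length(captured$values) + 1]] <- value
  invisible(NULL)
}

# Inject code that runs when square_sum() exits; `out` is resolved
# in the frame of the traced call
trace(square_sum, exit = quote(record(out)), print = FALSE)

square_sum(1:3)
square_sum(c(2, 2))

# Remove the injected code again
untrace(square_sum)

captured$values
```

Setting print = FALSE only silences the "Tracing ... on exit" message; the injected expression still runs on every call until untrace() is used.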

To store a persistent copy of an object one can simply use the save() function. But we are going to use the archivist package instead. It stores objects in a repository and gives you some nice features, like searching within the repository, sharing the repository with other users, checking session info for a particular object, or restoring packages to versions consistent with a selected object.
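For comparison, the plain save() route could look roughly like the sketch below. It uses only base R; the temporary file path and the object names are choices made for this illustration.

```r
# Fit a model and write it to a temporary .rda file with save()
model <- lm(Sepal.Length ~ Petal.Length, data = iris)
path <- tempfile(fileext = ".rda")
save(model, file = path)

# Load it back into a separate environment so nothing gets overwritten
restored <- new.env()
load(path, envir = restored)

# The restored object carries the same coefficients
same_coefs <- all.equal(coef(model), coef(restored$model))
same_coefs
```

This works, but you have to manage file names yourself and there is no built-in way to search the saved objects, which is the gap a repository fills.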

To use archivist with the trace() function you just need two lines of code. The first creates an empty repo, and the second executes saveToRepo(z) at the end of each call to lm(); z is the name of the local variable in which lm() holds the fitted model.

library(archivist)
# create an empty repo
createLocalRepo("allModels", default = TRUE)
# add tracing code
trace(lm, exit = quote(saveToRepo(z)))

Now, at the end of every lm() call the fitted model will be stored in the repository.
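The tracing code refers to z because that is the name lm() uses internally for the object it returns. A quick way to check this, assuming your version of R still uses that name, is to look for it in the function body:

```r
# List every symbol that appears in the body of lm() and look for `z`
symbols_in_lm <- all.names(body(stats::lm))
uses_z <- "z" %in% symbols_in_lm
uses_z

# When you no longer want every model archived, the injected code
# can be removed with:
# untrace(lm)
```

If you trace a different function, inspect its body in the same way to find the name of the variable that holds the result.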
Let’s see this in action.

> lm(Sepal.Length~., data=iris) -> m1
Tracing lm(Sepal.Length ~ ., data = iris) on exit 

> lm(Sepal.Length~ Petal.Length, data=iris) -> m1
Tracing lm(Sepal.Length ~ Petal.Length, data = iris) on exit 

> lm(Sepal.Length~-Species, data=iris) -> m1
Tracing lm(Sepal.Length ~ -Species, data = iris) on exit

All models are stored as rda files in a disk-based repository.
You can load them into R with the asearch() function.
Let’s get all lm objects, apply the AIC function to each of them, and sort them by AIC.

> asearch("class:lm") %>% 
    sapply(., AIC) %>% 
    sort
4c3ae060f3aaa2509b2faf63d857358e 5c5751e36b31b2251d2767d96993320a 
                        79.11602                        160.04042 
ed2f4d257fd568c5c6f231fadc7aa645 
                       372.07953
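The same apply-and-sort idiom can be tried without a repository. The sketch below refits the three models from this post into a plain named list (the names full, petal_only and minus_species are made up for this illustration) and ranks them by AIC; with a repository, the list would come from asearch() and the names would be MD5 hashes.

```r
# Refit the three models from the post and keep them in a named list
models <- list(
  full          = lm(Sepal.Length ~ ., data = iris),
  petal_only    = lm(Sepal.Length ~ Petal.Length, data = iris),
  minus_species = lm(Sepal.Length ~ -Species, data = iris)
)

# Compute AIC for each model and sort from best (lowest) to worst
aics <- sort(sapply(models, AIC))
aics
```

The values should agree with the output shown above, since the formulas and data are the same.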

The aread() function reads the selected model back from the repository, given its MD5 hash.

> aread("4c3ae060f3aaa2509b2faf63d857358e")

Call:
lm(formula = Sepal.Length ~ ., data = iris)

Coefficients:
      (Intercept)        Sepal.Width       Petal.Length        Petal.Width  
           2.1713             0.4959             0.8292            -0.3152  
Speciesversicolor   Speciesvirginica  
          -0.7236            -1.0235

Now you can create model after model, and any of them can be restored when needed.

Read more about archivist here: http://pbiecek.github.io/archivist/.
