Hadley Wickham (@hadleywickham) this week mentioned on Twitter his preference for saveRDS() over the more familiar save(). As saveRDS() was new to me, I thought I’d take a look…
save() and load() will be familiar to many R users. They allow you to save a named R object to a file or other connection and restore that object again. When loaded, the named object is restored to the current environment (in general use this is the global environment, i.e. the workspace) with the same name it had when saved. This is annoying, for example, when you have a saved model object from a previous fit and want to compare it with the model object returned when the R code is rerun. Unless you change the name of the model fit object in your script, you can’t have both the saved object and the newly created one available in the same environment at the same time.
Here’s an example of what I mean.
> require(mgcv)
Loading required package: mgcv
This is mgcv 1.7-13. For overview type 'help("mgcv-package")'.
> mod <- gam(Ozone ~ s(Wind), data = airquality, method = "REML")
> mod

Family: gaussian
Link function: identity

Formula:
Ozone ~ s(Wind)

Estimated degrees of freedom:
3.529  total = 4.529002

REML score: 529.4881
> save(mod, file = "mymodel.rda")
> ls()
[1] "mod"
> load(file = "mymodel.rda")
> ls()
[1] "mod"
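To make the overwrite itself visible, here is a minimal sketch. It assumes lm() in place of gam() so that only base R is needed, and rda_file is a hypothetical temporary file used for illustration.

```r
# Sketch: load() silently replaces an existing object of the same name.
# Assumption: lm() stands in for the gam() fit; rda_file is hypothetical.
rda_file <- tempfile(fileext = ".rda")

mod <- lm(Ozone ~ Wind, data = airquality)
save(mod, file = rda_file)

# Simulate rerunning the script with a changed model
mod <- lm(Ozone ~ Wind + Temp, data = airquality)
length(coef(mod))            # 3 coefficients in the refit

restored <- load(rda_file)   # load() returns the names it restored
restored                     # "mod"
length(coef(mod))            # 2: the refit has been replaced by the saved fit
```

The refitted model is gone after load(), and nothing warns you that it happened.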
saveRDS() provides a far better solution to this problem and to the general one of saving and loading objects created with R.
saveRDS() serializes an R object into a format that can be saved. Wikipedia describes serialization thus:
…serialization is the process of converting a data structure or object state into a format that can be stored (for example, in a file or memory buffer, or transmitted across a network connection link) and “resurrected” later in the same or another computer environment.
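The "memory buffer" case in that definition can be tried directly. The sketch below uses base R's lower-level serialize() and unserialize(), which are not part of the workflow discussed here and are shown only to illustrate the idea; the object x is made up for the example.

```r
# Sketch: serialize an object to an in-memory raw vector and restore it.
x <- list(a = 1:3, b = "text")            # hypothetical example object

bytes <- serialize(x, connection = NULL)  # NULL: return a raw vector
is.raw(bytes)                             # TRUE

y <- unserialize(bytes)                   # "resurrect" the object
identical(x, y)                           # TRUE
```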
save() serializes objects too, but with one important difference: saveRDS() doesn’t save both the object and its name; it just saves a representation of the object. As a result, the saved object can be restored under a name different from the one it had when originally serialized.
We can illustrate this using the model fitted earlier:
> ls()
[1] "mod"
> saveRDS(mod, "mymodel.rds")
> mod2 <- readRDS("mymodel.rds")
> ls()
[1] "mod"  "mod2"
> identical(mod, mod2, ignore.environment = TRUE)
[1] TRUE
(Note that the two model objects have different environments within their representations so we have to ignore this when testing their identity.)
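Returning to the motivating case of comparing a saved fit with a rerun, a minimal sketch might look like this. It assumes lm() in place of gam() so that only base R is needed; rds_file, mod_saved and cmp are hypothetical names chosen for illustration.

```r
# Sketch: keep a saved fit and a fresh fit side by side via readRDS().
# Assumption: lm() stands in for the gam() fit; names are hypothetical.
rds_file <- tempfile(fileext = ".rds")

mod <- lm(Ozone ~ Wind, data = airquality)
saveRDS(mod, rds_file)

# Later: the script is rerun and produces a new `mod`
mod <- lm(Ozone ~ Wind + Temp, data = airquality)

# Restore the earlier fit under a name of our choosing
mod_saved <- readRDS(rds_file)

# Both fits now live in the same environment and can be compared
cmp <- AIC(mod_saved, mod)
cmp
```

Because readRDS() returns the object rather than assigning it, the choice of name is left to the caller and the new fit is never touched.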
You’ll notice that in the call to saveRDS() I named the file with the extension .rds. This appears to be the convention used for serialized objects of this sort; R uses this representation often, for example for package meta-data and the databases used by help.search(). In contrast, the extension .rda is often used for objects serialized via save().
So there you have it; saveRDS() and readRDS() are the newest additions to my day-to-day workflow.