There is a saying that there are two groups of people: those who already do backups and those who will. So how is this linked with reproducible research and R?
If your work is to analyze data, you often need to restore, recreate, or update results that you generated some time ago.
You may think, “I have knitr reports for everything!” That’s great! It will save you a lot of trouble. But to guarantee exactly the same results, you need exactly the same environment and the same versions of packages.
Do you know how many R packages have been updated during the last 12 months?
Load this plot directly to R: archivist::aread('pbiecek/archivist/scripts/packDev/039745c40ab717f4459c5144343baca1')
How many of the current versions of the selected packages were on CRAN 12 months ago?
The ECDF of the release dates of the current versions.
Load this plot directly to R: archivist::aread('pbiecek/archivist/scripts/packDev/923ec99f79cce099408d4973471dd30d')
Around 50% of these packages were updated in the last 12 months. And sometimes these changes have a huge impact, as with version 2.0 of ggplot2.
To recreate exactly the same results, you need to keep a copy of either the important (all?) packages or the obtained results.
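To make the second option concrete, here is a minimal base R sketch (not the archivist approach) that stores a result together with the versions of the packages loaded when it was created. The file name and the structure of the snapshot list are illustrative assumptions.

```r
# Fit a model on a built-in dataset; this stands in for "an obtained result".
fit <- lm(dist ~ speed, data = cars)

# Bundle the result with a description of the environment that produced it.
# The field names below are hypothetical, chosen for this sketch only.
snapshot <- list(
  result   = fit,
  r        = R.version.string,
  packages = vapply(loadedNamespaces(),
                    function(p) as.character(packageVersion(p)),
                    character(1)),
  created  = Sys.time()
)

# Write the snapshot to disk and read it back.
path <- file.path(tempdir(), "fit_snapshot.rds")
saveRDS(snapshot, path)
restored <- readRDS(path)

# The restored coefficients match the original ones.
all.equal(coef(restored$result), coef(fit))
```

This keeps the result usable even if the packages change later, but all the bookkeeping is manual, which is the part archivist aims to automate.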
With the current version of archivist (2.0) you can easily archive all created objects, with just one line, and embed hooks to these objects into your report. It’s enough to call the addHooksToPrint() function at the beginning of your knitr script.
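As a rough sketch, the first chunk of a knitr script could look like the code below. The repository directory and the values "myRepo", "myUser", and "arepo" are placeholders, not names from this post, and the argument names follow the archivist 2.0 documentation as I understand it, so check them against your installed version.

```r
library(archivist)

# Hypothetical local repository that will hold the archived objects.
repo_dir <- file.path(tempdir(), "arepo")
createLocalRepo(repoDir = repo_dir, default = TRUE)

# From now on, printing an object of one of the listed classes also
# archives it in repo_dir and prints a hook for loading it back.
# repo/user/subdir are placeholders for the remote repository to which
# the local one would later be pushed.
addHooksToPrint(class   = c("data.frame", "lm"),
                repoDir = repo_dir,
                repo    = "myRepo",
                user    = "myUser",
                subdir  = "arepo")
```

The classes here are chosen so the sketch needs no extra packages; in a real report you would more likely list "ggplot" as well, with ggplot2 loaded first.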
How is it better than the simple save() function? It has lots of additional features; for example, you can ask for the session info for a given artifact:
   package    * version date
1  archivist  * 2.0     2016-02-12
2  assertthat   0.1     2013-12-06
3  bitops       1.0-6   2013-08-17
4  colorspace   1.2-6   2015-03-11
5  DBI          0.3.1   2014-09-24
6  devtools     1.9.1   2015-09-11
7  digest       0.6.9   2016-01-08
8  dplyr      * 0.4.3   2015-09-01
9  DT         * 0.1     2015-06-09
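A listing like the one above can be requested with asession(). The sketch below archives a model in a throwaway local repository and then asks for its session info; the repository path is an assumption made for this example, and calling asession() with a bare hash relies on the repository being set as the default one.

```r
library(archivist)

# Hypothetical local repository, set as the default one.
repo_dir <- file.path(tempdir(), "arepo_demo")
createLocalRepo(repoDir = repo_dir, default = TRUE)

# Archive an artifact; the returned value is its md5 hash.
fit  <- lm(dist ~ speed, data = cars)
hash <- saveToLocalRepo(fit, repoDir = repo_dir)

# Retrieve the artifact by its hash.
restored <- loadFromLocalRepo(hash, repoDir = repo_dir, value = TRUE)

# Ask for the session info recorded when the artifact was archived.
# A bare hash is looked up in the default local repository.
asession(hash)
```

Unlike a file written by save(), the artifact is addressed by its hash, so the same call works for objects shared by others once you know the hash.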
For more examples, see this knitr+archivist report to reproduce or retrieve all results presented here.