Recently I was talking about an R package I developed with a colleague. He repeatedly used the word library to refer to the R package, and I realized that many R users do not know that package and library are not synonyms in R.
The “Writing R Extensions” manual is clear: “A package is not a library”, although the same manual admits “this is a persistent mis-usage”.
What is a package
An R package is a directory of files which extend R. Some authors say that R packages are a good way to distribute R code, just as papers are a good way to disseminate scientific research. Rossi provides some good reasons to write an R package.
- We refer to the directory containing the files as a source package, the master files of a package. This directory can be compressed into a tarball, the .tar.gz version of the source package.
- An installed package is the result of running R CMD INSTALL at the command line, or install.packages() at the R console, on a source package.
- On some platforms (notably OS X and Windows) there are also binary packages, a zip file or tarball containing the files of an installed package which can be unpacked rather than installing from sources.
Summarizing: we can refer to the source package as the human readable version of the package and to the installed package or to the binary package as the computer readable version.
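To make the distinction concrete, here is a minimal sketch that inspects an installed package from the R console. It uses stats only because it ships with every R installation; the variable names are mine, and the exact values printed depend on your platform and R version.

```r
# Preferred package type for install.packages() on this platform,
# e.g. "source" on most Unix-alikes or a binary type on Windows.
pkg_type <- .Platform$pkgType
print(pkg_type)

# An installed package records how it was built in its DESCRIPTION file.
# A source package does not have this "Built" field.
built <- utils::packageDescription("stats")$Built
print(built)

# Top-level contents of the installed package directory.
installed_files <- list.files(find.package("stats"))
print(installed_files)
```

The listing shows the processed files R actually loads, which is why the installed form is the "computer readable" one.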
What is a library
In R, a library can refer to:
- A directory into which packages are installed, e.g.
- A shared, dynamic or static library or (especially on Windows) a DLL, where the second L stands for ‘library’. Installed packages may contain compiled code in what is known on Unix-alikes as a shared object and on Windows as a DLL.
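Both meanings can be explored from the R console. The following is a small sketch, again using stats as a stand-in for any installed package; the paths it prints are specific to your machine, and a package without compiled code will show no files in the last step.

```r
# Libraries (directories) that this R session searches for installed packages.
lib_dirs <- .libPaths()
print(lib_dirs)

# Directory of the installed "stats" package.
pkg_dir <- find.package("stats")
print(pkg_dir)

# The parent directory of an installed package is the library it lives in.
pkg_lib <- dirname(pkg_dir)
print(pkg_lib)

# Compiled code, if any, sits under the package's "libs" subdirectory
# (a shared object on Unix-alikes, a DLL on Windows).
compiled <- list.files(file.path(pkg_dir, "libs"), recursive = TRUE)
print(compiled)
```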
Origin of the misuse
The “Writing R Extensions” manual suggests that the misuse seems to stem from S, whose analogues of R’s packages were officially known as library sections and later as chapters, but were almost always referred to as libraries.
I would add that the name of the R function used to load packages, library(), does not help. Before loading a package we have to install it with install.packages(). Then we load the package from the directory into which it was installed, and that directory is a library.
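The install-then-load sequence can be sketched as follows. The package name is only a placeholder: stats already ships with R, so the install step is skipped here and the example runs without network access.

```r
pkg <- "stats"  # placeholder: any package name would do

# install.packages() puts a package INTO a library (a directory).
# It is needed once per library, so skip it when the package is already there.
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# library() loads and attaches the package FROM a library.
library(pkg, character.only = TRUE)

# The lib.loc argument makes the role of the library explicit:
# it names the directories to search for the installed package.
library(pkg, character.only = TRUE, lib.loc = .libPaths())

# The package now appears on the search path.
attached <- search()
print(attached)
```

Read this way, library(pkg) means "look in the library for pkg", not "pkg is a library".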
At the end of this post, what should a new user remember from all this?
Ramarro, a web book about advanced R programming written by Andrea Spanò, contains a useful summary.
Terms about R packages are often confused. This may help to clarify:
- Package: a collection of R functions, data, and compiled code in a well-defined format.
- Library: the directory where packages are installed.
- Repository: A website providing packages for installation.
- Source: The original version of a package with human-readable text and code.
- Binary: A compiled version of a package with computer-readable text and code, may work only on a specific platform.