Packages v. Libraries in R
In the past I’ve used the terms “R library” and “R package” synonymously (e.g. this blog post and this paper), but a careful reader has called me out. Mark Sharp notes that there are differences between libraries and packages.
Chapter one of the R Manual Writing R Extensions gives the details:
A package is a directory of files which extend R, either a source package (the master files of a package), or a tarball containing the files of a source package, or an installed package, the result of running R CMD INSTALL on a source package. On some platforms there are also binary packages, a zip file or tarball containing the files of an installed package which can be unpacked rather than installing from sources.
A package is not a library. The latter is used in two senses in R documentation. The first is a directory into which packages are installed, e.g. /usr/lib/R/library: in that sense it is sometimes referred to as a library directory or library tree (since the library is a directory which contains packages as directories, which themselves contain directories). The second sense is that used by the operating system, as a shared library or static library or (especially on Windows) a DLL, where the second L stands for ‘library’. Installed packages may contain compiled code in what is known on most Unix-alikes as a shared object and on Windows as a DLL (and used to be called a shared library on some Unix-alikes). The concept of a shared library (dynamic library on Mac OS X) as a collection of compiled code to which a package might link is also used, especially for R itself on some platforms.
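To see the first sense of "library" for yourself, here is a minimal sketch using base R only. The variable names (lib_dirs, stats_dir) are mine, and the paths you get will depend on your platform and installation.

```r
# List the library directories (library trees) this R session searches.
lib_dirs <- .libPaths()
print(lib_dirs)

# An installed package is a subdirectory of one of those libraries.
# "stats" ships with R, so find.package() should always locate it.
stats_dir <- find.package("stats")
print(stats_dir)

# The parent of the package directory is the library it was installed into.
print(dirname(stats_dir))

# .Library is the default library that holds the packages shipped with R.
print(.Library)
```

On a typical Linux installation the last two lines should point at something like the /usr/lib/R/library directory the manual mentions, but I would not rely on the exact path.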
However, the manual also gives me a little credit.
This is common mis-usage. It seems to stem from S, whose analogues of R’s packages were officially known as library sections and later as chapters, but almost always referred to as libraries.
Indeed, it seems like I’m not alone.
It is a little counter-intuitive that you load packages with the library() function. Perhaps this contributes to the persistence of the mis-usage. However, as someone else points out:
Even if we don’t like the current semantics, the *name* of library() in itself should not be a problem. After all, calling summary() does not imply that your primary argument is a summary, so why should calling library() imply that its primary argument is a “library”?
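The same point can be seen in the function's arguments. As a rough sketch (base R only; the choice of the tools package and the variable names are just for illustration), the first argument to library() names a package, while the lib.loc argument names the library to look in:

```r
# The first argument is a package; lib.loc is the library (a directory)
# that R searches for it. Here we search only R's default library.
library(tools, lib.loc = .Library)

# The package is now attached to the search path.
tools_attached <- "package:tools" %in% search()
print(tools_attached)

# installed.packages() reports, for each package, the library it lives in.
inst <- installed.packages(lib.loc = .Library)
print(head(inst[, c("Package", "LibPath")]))
```

So library() reads most naturally as "fetch this package from a library", which fits the manual's distinction.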
Even the Quick-R site makes a careful distinction:
Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library.
Thanks to Mark for pointing this out. In the future, I’ll definitely be more careful.
I encourage you to share this with others and contribute to the conversation at Packages v. Libraries in R, which first appeared at carlislerainey.com.