Packages v. Libraries in R

January 2, 2013
(This article was first published on Carlisle Rainey » R, and kindly contributed to R-bloggers)

In the past I've used the terms "R library" and "R package" synonymously (e.g. this blog post and this paper), but a careful reader has called me out. Mark Sharp notes that there are differences between libraries and packages.

Chapter one of the R manual "Writing R Extensions" gives the details:

A package is a directory of files which extend R, either a source package (the master files of a package), or a tarball containing the files of a source package, or an installed package, the result of running R CMD INSTALL on a source package. On some platforms there are also binary packages, a zip file or tarball containing the files of an installed package which can be unpacked rather than installing from sources.

A package is not a library. The latter is used in two senses in R documentation. The first is a directory into which packages are installed, e.g. /usr/lib/R/library: in that sense it is sometimes referred to as a library directory or library tree (since the library is a directory which contains packages as directories, which themselves contain directories). The second sense is that used by the operating system, as a shared library or static library or (especially on Windows) a DLL, where the second L stands for ‘library’. Installed packages may contain compiled code in what is known on most Unix-alikes as a shared object and on Windows as a DLL (and used to be called a shared library on some Unix-alikes). The concept of a shared library (dynamic library on Mac OS X) as a collection of compiled code to which a package might link is also used, especially for R itself on some platforms.
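Both senses of "library" can be inspected from an R session. The snippet below is only a rough sketch using base R functions; the variable names are my own, and the paths and package lists it reports will differ from one machine to the next.

```r
# Sense 1: a library is a directory into which packages are installed.
# .libPaths() returns the library trees known to this R session.
lib_dirs <- .libPaths()
print(lib_dirs)

# An installed package is a subdirectory of one of those libraries.
# With only `package` supplied, system.file() returns the package's directory.
stats_dir <- system.file(package = "stats")
stats_lib <- dirname(stats_dir)  # the library that contains the stats package

# The packages installed in that one library.
pkgs_in_lib <- rownames(installed.packages(lib.loc = stats_lib))
print(head(pkgs_in_lib))

# Sense 2: compiled code loaded as shared objects (DLLs on Windows).
# Make sure the stats namespace is loaded, then list the loaded DLLs by name.
loadNamespace("stats")
loaded_dll_names <- names(getLoadedDLLs())
print(loaded_dll_names)
```

Here stats is a package, stats_lib is a library in the first sense, and the "stats" entry among the loaded DLLs is the package's compiled code, a library in the second sense.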

However, the manual also gives me a little credit.

This is common mis-usage. It seems to stem from S, whose analogues of R's packages were officially known as library sections and later as chapters, but almost always referred to as libraries.

Indeed, it seems like I'm not alone.

It is a little counter-intuitive that you load packages with the library() function. Perhaps this contributes to the persistence of the mis-usage. However, as someone else points out:

Even if we don't like the current semantics, the *name* of library() in itself should not be a problem. After all, calling summary() does not imply that your primary argument is a summary, so why should calling library() imply that its primary argument is a "library"?
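The signature of library() reflects the same distinction: its first argument names a package, while the library to search is a separate argument, lib.loc. A minimal sketch, assuming a standard R installation in which the base packages stats and tools live in the default library .Library:

```r
# The first argument of library() names a package, not a library.
library(stats)

# The library (a directory) to search is given separately via lib.loc.
# .Library is the default library of the R installation.
library(tools, lib.loc = .Library)

# Attached packages appear on the search path as "package:<name>".
attached <- grep("^package:", search(), value = TRUE)
print(attached)
```

So library(tools, lib.loc = .Library) reads roughly as "attach the package tools, looking for it in the library .Library".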

Even the Quick-R site makes a careful distinction:

Packages are collections of R functions, data, and compiled code in a well-defined format. The directory where packages are stored is called the library.

Thanks to Mark for pointing this out. In the future, I'll definitely be more careful.

