analyze the public libraries survey (pls) with r

[This article was first published on asdfree by anthony damico, and kindly contributed to R-bloggers.]

each and every year, the institute of museum and library services coaxes librarians around the country to put down their handheld “shhhh…” signs and fill out a detailed online questionnaire about their central library, branch, even bookmobile.  the public libraries survey (pls) is actually a census: nearly every public library in the nation responds annually.  that microdata is waiting for you to check it out, no membership required.  the american library association estimates well over one hundred thousand libraries in the country, but fewer than twenty thousand of those participate in this survey, since most libraries in the nation are enveloped by some sort of school system.

laughably easy files to work with, these microdata do not require the r survey package or any of the batman-like statistical tools seen in the other public use file folders.  as confirmed by one of the administrators of this survey, your analysis can simply tabulate, sum, average, or whatever else, using the base commands in r rather than complex sample survey design commands.  since these data sets are the universe rather than a sample, i’ve forgone a set of analysis examples.  if you want to do something, search stackoverflow with an [r] tag.  no survey design assembly required.  this new github repository contains two scripts:

download all microdata.R
  • download each zipped year of data onto your local computer
  • load a trifecta of tables into RAM
  • save all three data.frame objects as an R data file (.rda)

replicate imls publications.R

click here to view these two scripts
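
the three steps of the download script (download, load, save) can be sketched roughly as below.  this is not the repository's actual code: the helper names, the object names `systems`, `outlets`, and `states`, and the assumption that each zipped year unpacks into csv files are all made up for illustration, and the demonstration runs on toy files so nothing gets downloaded.

```r
# hypothetical helper: download one zipped year and unzip it into a temporary folder.
# zip_url is a placeholder argument; the real urls live in the repository script.
download_and_unzip <- function( zip_url ){
    tf <- tempfile( fileext = ".zip" )
    td <- tempfile()
    dir.create( td )
    download.file( zip_url , tf , mode = "wb" )
    unzip( tf , exdir = td )
    # return the full paths of the extracted files
    list.files( td , full.names = TRUE , recursive = TRUE )
}

# hypothetical helper: read a set of csv files (an assumed file format)
# and save them together as named data.frame objects in one .rda file
csvs_to_rda <- function( csv_paths , object_names , rda_path ){
    tables <- lapply( csv_paths , read.csv , stringsAsFactors = FALSE )
    names( tables ) <- object_names
    save( list = object_names , envir = list2env( tables ) , file = rda_path )
    invisible( rda_path )
}

# toy demonstration with made-up files standing in for one year of data
toy_dir <- tempfile()
dir.create( toy_dir )
toy_names <- c( "systems" , "outlets" , "states" )
toy_paths <- file.path( toy_dir , paste0( toy_names , ".csv" ) )
for ( p in toy_paths ) write.csv( data.frame( id = 1:2 ) , p , row.names = FALSE )

rda_path <- csvs_to_rda( toy_paths , toy_names , file.path( toy_dir , "pls_toy.rda" ) )

# load the three saved objects back into a fresh environment
e <- new.env()
load( rda_path , envir = e )
ls( e )
```

with real data, the output of `download_and_unzip()` would feed `csvs_to_rda()` once per year, assuming the extracted files really are csv.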

for more detail about the public libraries survey (pls), visit:


plainly described at the bottom of pdf page 6 of the technical documentation, each year of microdata gets released as three tables: a table of library systems (where new york city public libraries would have one entry), a table of library buildings (where new york city public libraries have one entry per branch), and a table of states (where all libraries in new york state get collapsed into one).  imls takes care not to disclose stuff like salary information of individual employees, and the more-aggregated tables require less confidentiality-related-data-squelching.  if you need microdata sans suppression, apply for the restricted use files.
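
because the tables are a census rather than a sample, base r is enough to tabulate, sum, and average them.  here's a minimal sketch on toy data: the column names (`state`, `system_name`, `outlet_type`, `total_visits`) and every value are invented for illustration and are not the real pls variable names.

```r
# toy stand-in for the table of library systems (one row per system)
systems <-
    data.frame(
        state = c( "NY" , "NY" , "VT" ) ,
        system_name = c( "system a" , "system b" , "system c" ) ,
        total_visits = c( 1000 , 250 , 400 ) ,
        stringsAsFactors = FALSE
    )

# toy stand-in for the table of library buildings (one row per outlet)
outlets <-
    data.frame(
        system_name = c( "system a" , "system a" , "system a" , "system b" , "system c" ) ,
        outlet_type = c( "central" , "branch" , "bookmobile" , "central" , "central" ) ,
        stringsAsFactors = FALSE
    )

# tabulate: how many outlets of each type
table( outlets$outlet_type )

# sum and average: no weights or survey design needed
national_visits <- sum( systems$total_visits )
average_visits <- mean( systems$total_visits )

# collapse systems to one row per state, mimicking the state-level table
state_totals <- aggregate( total_visits ~ state , data = systems , FUN = sum )

# count buildings per system
outlets_per_system <- table( outlets$system_name )

state_totals
```

the `aggregate()` call shows how the three tables relate: summing the system-level rows within each state gives what a state-level table would hold, at least where nothing has been suppressed.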

confidential to sas, spss, stata, sudaan users: you are using the blockbuster video of statistical languages.  time to transition to r.  😀
