analyze the social security administration’s national beneficiary survey (nbs) with r

October 28, 2014

(This article was first published on asdfree by anthony damico, and kindly contributed to R-bloggers)

above and beyond its namesake roosevelt-era retirement safety net, the social security administration oversees two federal programs – social security disability insurance and supplemental security income.  with these programs currently covering more than ten million disabled americans (many far younger than retirement age), the quants in the woodlawn disability research office thought it might be smart to learn a little something something about the people they serve.  so they went ahead and asked them: the national beneficiary survey interviews the populations covered by these two programs with one giant heap of questions.

this microdata is for the serious disability researcher, not the dabbler.  although some of the variables in this data should reasonably compare to questions fielded by other federal surveys, the instrument goes deep into the inner workings of these public programs.  lemme say that another way: the current population survey (cps), a survey representative of the entire non-institutionalized united states population, includes a flag indicating ssi receipt as well as some disability payment values.  if you want to answer a broad question – “how does the self-reported health status of americans on disability compare to the self-reported health status of all other americans?” – use the two hundred thousand-respondent cps and not the two thousand-respondent nbs.  if you want the answer to a pinpointed question – “what share of disabled americans currently on a physical therapy regimen are out of work because they fear that they will lose their benefits should they gain employment?” – you’re in the right place.  once again, for the serious disability researcher.

this new github repository contains three scripts:

download all microdata.R

analysis examples.R


click here to view these three scripts

for more detail about the national beneficiary survey, visit:


the social security administration’s public use files exclude about half of all sampled individuals, specifically those targeted because of their enrollment in the ticket-to-work (ttw) program.  if you compare the unweighted record counts in the public use file to this table, your counts will match the representative beneficiary sample row rather than the total.
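
a minimal sketch of that record-count check, assuming the public use file is already loaded as a data frame.  the data frame `nbs_puf`, its weight column `wt`, and every value in it are hypothetical toy stand-ins, not actual nbs variable names or results:

```r
# toy stand-in for the public use file: five made-up records.
# `nbs_puf` and `wt` are hypothetical names, not actual nbs variables
nbs_puf <- data.frame(
    id = 1:5 ,
    wt = c( 1200 , 950 , 1800 , 1100 , 1450 )
)

# unweighted record count: the number of sampled individuals in the file.
# this is the figure to line up against a published sample-size table
unweighted_n <- nrow( nbs_puf )

# weighted count: the number of beneficiaries those records represent
weighted_n <- sum( nbs_puf$wt )

print( unweighted_n )
print( weighted_n )
```

the unweighted figure is the one to compare against a sample-size row.  the weighted figure estimates the size of the covered population and will not match any sample count.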

confidential to sas, spss, stata, and sudaan users: your statistical languages of choice are about as huggable as a cactus.  time to transition to r.  😀
