analyze the current population survey (cps) annual social and economic supplement (asec) with r

October 10, 2012

(This article was first published on asdfree by anthony damico, and kindly contributed to R-bloggers)

the annual march cps-asec has been supplying the statistics for the census bureau’s report on income, poverty, and health insurance coverage since 1948.  wow.  the us census bureau and the bureau of labor statistics (bls) tag-team on this one.  until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census – about two hundred thousand respondents.  this provides enough sample to conduct state- and a few large metro area-level analyses.  your sample size will vanish if you start investigating subgroups by state – consider pooling multiple years.  county-level is a no-no.

despite the american community survey’s larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance – and can be trended back to harry truman’s presidency.  aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics.  cps-asec generalizes to the united states non-institutional, non-active duty military population.

the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person).  to import these files into r, the parse.SAScii function reads nber’s sas code to work out the layout of the fixed-width file, then RSQLite puts everything into a schnazzy database.  you can try reading through the nber march 2012 sas importation code yourself, but it’s a bit of a proc freak show.  this new github repository contains three scripts:

2005-2012 asec – download all microdata.R

  • download the fixed-width file containing household, family, and person records
  • import by separating this file into three tables, then merge ’em together at the person-level
  • download the fixed-width file containing the person-level replicate weights
  • merge the rectangular person-level file with the replicate weights, then store it in a sql database
  • create a new variable – one – in the data table
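
the split-then-merge steps above can be sketched in base r.  this is a toy illustration, not the repository's script: the record layout, the column names (hhid, famid, pid, faminc) and the seven data lines are all made up, and the widths are typed by hand where the real script takes them from nber's sas code via parse.SAScii.  the sql storage step is left out.

```r
# toy hierarchical fixed-width file.  the first character is a hypothetical
# record type: 1 = household, 2 = family, 3 = person.  real asec layouts differ.
asec_lines <- c(
  "10001CA" ,        # household 0001, state CA
  "2000101052000" ,  # household 0001, family 01, family income 52000
  "300010101034" ,   # household 0001, family 01, person 01, age 34
  "300010102036" ,   # household 0001, family 01, person 02, age 36
  "10002NY" ,        # household 0002, state NY
  "2000201018000" ,  # household 0002, family 01, family income 18000
  "300020101071"     # household 0002, family 01, person 01, age 71
)

# keep only the lines of one record type, then parse them as fixed-width.
# a negative width tells read.fwf to skip those characters (the type flag).
read_records <- function( lines , type , widths , col_names ){
  tf <- tempfile()
  on.exit( unlink( tf ) )
  writeLines( lines[ substr( lines , 1 , 1 ) == type ] , tf )
  read.fwf( tf , widths = widths , col.names = col_names , colClasses = "character" )
}

household <- read_records( asec_lines , "1" , c( -1 , 4 , 2 ) , c( "hhid" , "state" ) )
family <- read_records( asec_lines , "2" , c( -1 , 4 , 2 , 6 ) , c( "hhid" , "famid" , "faminc" ) )
person <- read_records( asec_lines , "3" , c( -1 , 4 , 2 , 2 , 3 ) , c( "hhid" , "famid" , "pid" , "age" ) )

# attach family-level, then household-level, information to every person
rect <- merge( person , family , by = c( "hhid" , "famid" ) )
rect <- merge( rect , household , by = "hhid" )

# identifiers stay character to keep leading zeros; measures become numeric
rect$age <- as.numeric( rect$age )
rect$faminc <- as.numeric( rect$faminc )

# the `one` column makes weighted counts easy later on
rect$one <- 1

print( rect )
```

the result has one row per person, which is the rectangular shape described earlier.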

2012 asec – analysis examples.R

  • connect to the sql database created by the ‘download all microdata’ program
  • create the complex sample survey object, using the replicate weights
  • perform a boatload of analysis examples
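
to make "using the replicate weights" a little more concrete, here is a minimal base r sketch of how a replicate-weight variance gets computed, assuming fay's method.  the function name fay_variance, the four-person data and the four replicates are all hypothetical.  as far as i know, the census bureau's asec documentation describes 160 replicates with a multiplier of 4/160, which matches rho = 0.5 below.  for real analyses, the survey package's svrepdesign function builds the survey object and does this arithmetic for you.

```r
# variance of a weighted total under fay's method:
#   sum over replicates of ( theta_r - theta_0 )^2 , divided by R * ( 1 - rho )^2
# with rho = 0.5 the multiplier becomes 4 / R
fay_variance <- function( x , full_weight , rep_weights , rho = 0.5 ){
  theta_0 <- sum( x * full_weight )        # full-sample estimate
  theta_r <- colSums( x * rep_weights )    # one estimate per replicate
  sum( ( theta_r - theta_0 )^2 ) / ( ncol( rep_weights ) * ( 1 - rho )^2 )
}

# toy data: four people, a full-sample weight of 1 each (made up)
x <- c( 10 , 20 , 30 , 40 )
full_weight <- rep( 1 , 4 )

# four made-up replicates: each person's weight is scaled by 1.5 or 0.5
rep_weights <- cbind(
  rw1 = c( 1.5 , 0.5 , 1.5 , 0.5 ) ,
  rw2 = c( 0.5 , 1.5 , 0.5 , 1.5 ) ,
  rw3 = c( 1.5 , 1.5 , 0.5 , 0.5 ) ,
  rw4 = c( 0.5 , 0.5 , 1.5 , 1.5 )
)

v <- fay_variance( x , full_weight , rep_weights )
se <- sqrt( v )

print( c( total = sum( x * full_weight ) , variance = v , se = se ) )
```

the replicate totals here are 90, 110, 80 and 120 against a full-sample total of 100, so the squared differences sum to 1000 and the multiplier of 4/4 leaves the variance at 1000.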

replicate census estimates – 2011.R

  • connect to the sql database created by the ‘download all microdata’ program
  • create the complex sample survey object, using the replicate weights
  • match the sas output shown in the png file below

2011 asec replicate weight sas output.png

click here to view these three scripts

for more detail about the current population survey – annual social and economic supplement (cps-asec), visit:


interviews are conducted in march about experiences during the previous year.  the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011.  when you use the current population survey to talk about america, subtract a year from the data file name.

as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we’ve invented the butane lighter?  time to transition to r.  😀

