
the american time use survey collects information about how we spend our time.  it’s a pretty simple setup: sampled individuals write down everything they do for a single twenty-four hour period, in ten minute intervals.  those diaries are averaged across all respondents, and we end up with results like this genius nyt visualization by amanda cox.  most economists use atus to study uncompensated work (chores and childcare), but you can use it for all sorts of crazy stuff like learning that even in the dead of night, one-twentieth of us are awake.  or that we average 54 seconds of sex every day.  i cannot think of anything i would rather be doing than analyzing this survey dataset.

before you start crosstabbing and svymeaning, it’d be smart to spend ten minutes reading exhibit 6.2 of the user’s guide so you understand how all the data tables (..that the download automation script imports for you..) work together.  simpler analyses might only require the respondent and activity summary files, but at the point you want to determine who was with the respondent at soccer practice, you had better merge like a champ.  of course before any of that, you’ll need to decide which activity codes you actually want to capture.  time spent calf-roping or cattle-riding?  code 130121.  commuting to the vet?  code 180807.  pumping gas?  070102.  tired of me guessing for you? check out the activity coding lexicons.

this new github repository contains four scripts:

• decipher the bls ftp site to download each year-specific (or multi-year) table
• unzip whatcha need, then import the microdata in a jiffy with read.csv
• save each file as an r data file (.rda) into neatly-sorted atus directories
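to make those import-and-save steps concrete, here's a minimal sketch.  it is not the repository's download script: it skips the ftp and unzip parts entirely and writes a tiny made-up text file to a temporary folder instead.  the file name, the directory layout, and the lowercase column names (`tucaseid`, `tufinlwgt`) are all assumptions chosen for illustration.

```r
# toy stand-in for one unzipped atus table.  the real files come from the bls site
# and carry many more columns.  file and column names here are assumptions.
tmp_dir <- tempfile( "atus_" )
dir.create( tmp_dir )

txt_path <- file.path( tmp_dir , "atusresp_2012.dat" )

writeLines(
	c(
		"tucaseid,tufinlwgt" ,
		"20120101120001,1000.5" ,
		"20120101120002,2500.25"
	) ,
	txt_path
)

# comma-delimited text reads straight in with read.csv.
# colClasses keeps the long respondent id as character rather than numeric
atusresp <- read.csv( txt_path , colClasses = c( tucaseid = "character" ) )

# save the imported table as an r data file inside a year-specific folder
year_dir <- file.path( tmp_dir , "2012" )
dir.create( year_dir )

rda_path <- file.path( year_dir , "atusresp.rda" )
save( atusresp , file = rda_path )

# in a later session, one load() call brings the data.frame back under the same name
rm( atusresp )
load( rda_path )

str( atusresp )
```

saving one `.rda` per table per year means an analysis script only loads the pieces it actually needs.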

2012 single-year – analysis examples.R

replicate bls standard error – 2007.R

replicate bls example one – 2006.R
• load the activity and respondent data tables into working memory
• subset the activity table to only "care of household children" events (as prescribed by the 2006 lexicon)
• aggregate that activity table to the respondent-level, then merge those minutes onto the respondent data
• just run a weighted.mean that skips any variance calculation but hits the bls example one on the nose
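the four steps above can be sketched on toy data.  this is an illustration of the subset, aggregate, merge, weighted.mean pattern, not the replication script itself.  the six-row activity table is invented, the column names are modeled on atus variables but should be treated as assumptions, and so should the rule that "care of household children" means tier one code 3 with tier two codes 1 through 3 (check the lexicon for the year you're using).

```r
# toy activity table: one row per activity episode.  all values are made up
atusact <-
	data.frame(
		tucaseid = c( "a" , "a" , "a" , "b" , "b" , "c" ) ,
		tutier1code = c( 3 , 3 , 1 , 3 , 11 , 1 ) ,
		tutier2code = c( 1 , 2 , 1 , 4 , 1 , 1 ) ,
		tuactdur24 = c( 30 , 20 , 480 , 60 , 45 , 500 )
	)

# toy respondent table: one row per respondent, with a final weight
atusresp <-
	data.frame(
		tucaseid = c( "a" , "b" , "c" ) ,
		tufinlwgt = c( 1000 , 3000 , 6000 )
	)

# subset to child care episodes only (assumed coding rule, see above)
cohc <- subset( atusact , tutier1code == 3 & tutier2code %in% 1:3 )

# aggregate minutes to the respondent-level
cohc_minutes <- aggregate( tuactdur24 ~ tucaseid , data = cohc , FUN = sum )
names( cohc_minutes )[ 2 ] <- "cohc_minutes"

# merge onto the respondent table, keeping respondents with no matching episodes
x <- merge( atusresp , cohc_minutes , by = "tucaseid" , all.x = TRUE )

# respondents without any child care episodes spent zero minutes, not missing minutes
x$cohc_minutes[ is.na( x$cohc_minutes ) ] <- 0

# weighted average minutes per day.  no variance calculation here
avg_minutes <- weighted.mean( x$cohc_minutes , x$tufinlwgt )

avg_minutes
```

the `all.x = TRUE` plus zero-fill matters: dropping respondents with no child care episodes would turn the result into an average among participants only, which is a different statistic.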

for more detail about the american time use survey, visit:

notes:

just like the medical expenditure panel survey draws its sample from the national health interview survey, the american time use survey is a subsample of current population survey (cps) respondents.  in fact, the microdata include a handy atus-cps mergefile.  unlike the cps, it’s not a household survey – only one individual at least 15 years of age gets selected from each sampled household.  another important difference from the cps: the atus should not be used to draw state-level conclusions.  atus generalizes to the united states non-institutional population aged fifteen or older, excluding active-duty military, but don’t zoom in on geographies smaller than census regions.

when you see the svytotal function used in the analysis example script, you’ll notice overall sums around ninety billion.  that’s because the survey weights in this data set actually generalize to person-days.  divide by 365, and you’ll almost precisely hit the sixteen and older row of the 2010 column of table 1 on this census bureau age by sex table.  so at ease, everybody.  at ease.
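the person-days arithmetic is easy to check by hand.  here's a minimal sketch in base r, using three invented weights rather than real atus weights and a plain `sum` rather than `svytotal`:

```r
# three made-up weights, each representing a number of person-days
person_day_weights <- c( 365000 , 730000 , 182500 )

# summing the weights gives person-days, not people
total_person_days <- sum( person_day_weights )

# dividing by the days in a year converts person-days back to people
total_people <- total_person_days / 365

total_people

# same conversion on the rough "ninety billion" figure mentioned above
rough_people <- 9e10 / 365

# in millions, rounded to one decimal
round( rough_people / 1e6 , 1 )
```

so a total near ninety billion person-days corresponds to roughly 247 million people, which is why the sums look alarming until you divide.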

confidential to sas, spss, stata, and sudaan users: if you want to impress people at parties with an antiquated skill, learn morse code.  at least it’s rhythmic.  time to transition to r. 😀