as the oldest panel survey out there, the bureau of labor statistics’ national longitudinal surveys (nls) have been operating so long that justin bieber’s grandparents might be too young (and also too legally canadian) to have participated in the initial cohorts. don’t let the panel study of income dynamics people tell you otherwise: this is the ongoing study with the first interviews out of the gate. are the respondents from the 1966 cohorts still being interviewed? no. but that’s because this is an employment survey – each cohort lasts only a few decades before being re-spawned with a shiny new batch of respondents. this is not a study of retirement or of health; the panel periods are optimized to examine the relationship between teenage years and careers.
the irrefragable starting point is this bullet-pointed description of each panel’s sample universe. for example, nlsy97 is a nationally-representative sample of americans born during the first half of the reagan revolution who are still being asked about their, well, pretty much everything. once you pick a cohort, click the damn link and read their convenient introductions to exactly who you’ll have the pleasure of studying. and don’t forget, this is the wrong survey for cross-sectional analyses. this isn’t the place to assess the unemployment rate in 2011. but if you want to look at how many jobs the same individual has held across the past thirty years, lookie here. this new github repository contains three scripts:
download all microdata.R
- ignore osu’s offer to register and instead log in to nlsinfo.org as a guest, via code of course
- loop through every study, every survey round, every topic
- import each data file directly into an r data.frame object and save ’em all on your local disk
longitudinal analysis examples.R
- create a complex sample survey object across almost fifteen years of interviews, using a taylor-series linearization design and a delightful function that makes choosing weights easier than pie
- conduct a slew of analyses (is slew to analysis like gaggle to goose?) in an overwhelmingly successful demonstration of the power, brilliance, and mystique of this panel microdata
- create a complex sample survey object that uses variables from both round one and round fifteen but only weights from the round one interviews, which will bias your results so don’t do that irl okay?
- deftly match the bls-provided statistics, standard errors, deffs, and defts on this page
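to give a feel for what a taylor-series linearization design looks like in r, here is a minimal sketch using the survey package on made-up data. every column name below (stratum, psu, panel_weight, jobs_held, female) is a placeholder invented for illustration, not an actual nls variable, and this is not the code from the repository scripts.

```r
library(survey)

# entirely made-up toy data: these column names are placeholders,
# not actual nls variable names
set.seed(1)
toy <- data.frame(
  id = 1:200,
  stratum = rep(1:10, each = 20),          # ten strata of twenty people
  psu = rep(1:40, each = 5),               # four clusters inside each stratum
  panel_weight = runif(200, 500, 1500),    # stand-in for a panel weight
  jobs_held = rpois(200, 6),
  female = rbinom(200, 1, 0.5)
)

# svydesign() builds a linearization-based design object
toy_design <- svydesign(
  ids = ~psu,
  strata = ~stratum,
  weights = ~panel_weight,
  data = toy,
  nest = TRUE
)

# weighted mean with its standard error and design effect
svymean(~jobs_held, toy_design, deff = TRUE)

# the same statistic broken out by a grouping variable
svyby(~jobs_held, ~female, toy_design, svymean)
```

with real nls files, the weight column should match the set of rounds your variables come from, which is exactly the mistake the round one versus round fifteen example above warns against.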
click here to view these three scripts
for more detail about the national longitudinal surveys (nls), visit:
- the wikipedia page because well just because
- the osu-maintained bibliography to search what others have done
though the bls dot gov slash nls homepage and the ohio state-run nlsinfo might seem like disjoint systems at first, these microdata aren’t terribly challenging to analyze so long as you follow the r code i’ve provided. each new series of interviews gets loaded into their online investigator system as an independent data file, but i couldn’t figure out why we shouldn’t just download absolutely everything and make the panel-weighting a cinch. so i did. you can too.
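one reason having every round on disk helps: lining the same respondents up across rounds becomes a plain merge on the respondent identifier. a rough sketch in base r, assuming two tiny invented data.frames standing in for two interview rounds (the column names id, jobs_r1, and jobs_r15 are hypothetical):

```r
# made-up stand-ins for two interview rounds;
# real nls files have different columns
round_one <- data.frame(id = c(1, 2, 3, 4), jobs_r1 = c(1, 0, 2, 1))
round_fifteen <- data.frame(id = c(1, 2, 4), jobs_r15 = c(3, 1, 0))

# keep only respondents interviewed in both rounds
both <- merge(round_one, round_fifteen, by = "id")

# per-person total across the two rounds
both$total_jobs <- both$jobs_r1 + both$jobs_r15

both
```

respondent 3 drops out of the merged result because there is no round fifteen record for them, which is why the choice of weight depends on which rounds you combine.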
confidential to sas, spss, stata, and sudaan users: antiques roadshow has some bad news for you. you paid too much. time to transition to r. 😀