Blog Archives

analyze the youth risk behavior surveillance system (yrbss) with r

July 29, 2013

the youth risk behavior surveillance system is the high school edition of the behavioral risk factor surveillance system (brfss), a scientific study of good kids who do bad things.  questions are mostly about sex, drugs, rock and roll, and populat...

analyze the surveillance epidemiology and end results (seer) with r and monetdb

July 15, 2013

the surveillance epidemiology and end results program is the aggregation of all cancer registry statistics in the united states.  created by congressional decree, seer has captured a nationally-representative quarter of american cancer incidence s...

analyze the american time use survey (atus) with r

July 8, 2013

the american time use survey collects information about how we spend our time.  it's a pretty simple setup: sampled individuals write down everything they do for a single twenty-four hour period, in ten minute intervals.  those diaries are a...

analyze the united states decennial census public use microdata sample (pums) with r and monetdb

July 1, 2013

during his tenure as secretary of state, thomas jefferson oversaw the first american census way back in 1790.  some of my countrymen express pride that we're the oldest democracy, but my heart swells with the knowledge that we've got the world's o...

analyze the pesquisa de orcamentos familiares (pof) with r

June 17, 2013

for the unlucky among us born without a portuguese mother tongue, the pesquisa de orcamentos familiares (pof) translates to survey of household budgets.  this data set captures brazilian family consumption habits, allocation of expenses, and incom...

analyze the new york city housing and vacancy survey (nychvs) with r

May 19, 2013

for those interested in the real estate and rental markets of the big apple, the census bureau's nyc housing and vacancy survey might be your key to the city.  if you care about how many new york residents live more than one person per room (a lot...

analyze the social security administration public use microdata files (ssapumf) with r

May 5, 2013

the social security administration (ssa) must be overflowing with quiet heroes, because their public-use microdata files are as inconspicuous as they are thorough.  sure, ssa publishes enough great statistical research of their own that outside re...

analyze the medical large claims experience study (mlces) with r

April 21, 2013

not a survey, not even remotely current, the society of actuaries' medical large claims experience study (mlces) might be the best private health insurance claims data available to the public.  this data should be used to calibrate other data sets...

analyze the pesquisa nacional por amostra de domicilios (pnad) with r

April 7, 2013

think of the pesquisa nacional por amostra de domicilios (pnad) as the brazilian census for off-years - the ones that don't end in zero.  the principal household survey for the nation of brazil, pnad measures general education, labor, income, and ...

column-store R or: how i learned to stop worrying and love monetdb

March 18, 2013

"Combining R's sophisticated calculations and MonetDB's excellent data access performance is a no-brainer. One gets the best of two (open source) worlds with minimal hassle." - Dr. Hannes Mühleisen

"oh wow that was fast like a cheetah with a jetpack or something" - anthony damico

why try monetdb + r

a speed test of four analysis commands on sixty-seven million...
