laptop-friendly analysis of the census of 82 countries with r and monetdb

November 30, 2015

(This article was first published on asdfree, and kindly contributed to R-bloggers)

the integrated public use microdata series international (ipumsi) has been my white whale since i started in survey research.  non-demographers, perhaps think of this repository as a matryoshka varanasi-kaaba-ark of the covenant: nothing compares.  the minnesota population center amassed half a billion person-level records from national statistics offices across the globe.  it’s all free and ready for download, so long as you have a project idea and an institutional affiliation.  so my turn to talk?  because now the software needed for analysis is free as well, and markedly superior to anything that’s available for purchase.  277 censuses later, roll credits.  these tutorials maniacally document every step necessary to

click here to get started working with ipums international

notes: unless you plan to make severe edits to my example code, each individual extract must contain a single year and a single country and be formatted as a csv.  the actual extract link can simply be copied and pasted into your r script from the url highlighted in the screenshot below.  each extract should include the variables “serial”, “strata”, and “perwt” if you plan on calculating statistics to be shared anywhere beyond fingerpainting class.  these census files cannot be treated as simple random samples; those three columns contain the information necessary for my scripts to handle everything correctly.
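
to make the role of those three columns concrete, here is a minimal sketch on a made-up six-person extract.  it is not the code from my tutorials: `check_extract`, `required_cols`, and the `toy` data are hypothetical names invented for illustration, and lowercasing the column names is only an assumption about how your csv headers arrive.  the last step assumes the `survey` package is installed and treats “serial” as the cluster, “strata” as the stratum, and “perwt” as the person weight, which is one plausible way to declare the design.

```r
# hypothetical helper: confirm an extract carries the three design columns
required_cols <- c("serial", "strata", "perwt")

check_extract <- function(df) {
  # assumption: headers may arrive in upper case, so normalize them first
  names(df) <- tolower(names(df))
  missing_cols <- setdiff(required_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("extract is missing: ", paste(missing_cols, collapse = ", "))
  }
  df
}

# made-up data: four households (serial) spread across two strata
toy <- data.frame(
  SERIAL = c(1, 1, 2, 3, 4, 4),
  STRATA = c(1, 1, 1, 2, 2, 2),
  PERWT  = c(10, 10, 20, 15, 5, 5),
  age    = c(30, 5, 42, 67, 25, 28)
)

toy <- check_extract(toy)

# the point estimate only needs the person weight
weighted_age <- weighted.mean(toy$age, toy$perwt)

# the standard error needs the clusters and strata as well
if (requireNamespace("survey", quietly = TRUE)) {
  toy_design <- survey::svydesign(
    ids = ~serial,       # households as sampling clusters
    strata = ~strata,
    weights = ~perwt,
    data = toy,
    nest = TRUE
  )
  print(survey::svymean(~age, toy_design))
}
```

leave out “perwt” and the estimate itself is off; leave out “serial” or “strata” and the estimate may look fine while the standard error around it is not.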

confidential to sas, spss, stata, and sudaan users: neil armstrong would give pogo sticks the same look i’m giving your softwares right now.  time to reserve your spot on apollo eleven.  time to transition to r.  😀
