acs14lite: A lightweight R interface to the 2010-2014 ACS API

December 11, 2015
By Kyle Walker

(This article was first published on Kyle Walker, and kindly contributed to R-bloggers)

I use data from the US Census Bureau’s American Community Survey (ACS) all of the time. I also use R all of the time. Naturally, this means that I often use ACS data in R, which is pertinent given last week’s release of the new 2010-2014 ACS estimates. I wanted easy access to the data to support my ongoing research on demographic trends in US metros and my work at the TCU Center for Urban Studies, so I wrote a small R package, acs14lite (https://github.com/walkerke/acs14lite), to provide quick access to it. It is not intended to be comparable to, or a replacement for, the existing acs package in R; it is more for my personal convenience, but I thought it might be useful to others as well. This is mostly going to be a side project for me, so I don’t have plans for a CRAN submission at this time.

Install from GitHub with the following command in R:

devtools::install_github('walkerke/acs14lite')

Accessing the US Census Bureau’s API requires an API key, which you can request at http://api.census.gov/data/key_signup.html. Once you have it, you can set it globally for your R session:

library(acs14lite)

set_api_key('your API key here')
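
If you would rather not hard-code the key in a script you share, one option is to read it from an environment variable and pass the result to set_api_key. The sketch below is only an illustration: the helper get_census_key and the variable name CENSUS_API_KEY are hypothetical and are not part of acs14lite.

```r
# Hypothetical helper: fetch the key from an environment variable.
# The name CENSUS_API_KEY is an assumption, not something acs14lite defines.
get_census_key <- function(var = "CENSUS_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("No API key found in environment variable ", var, call. = FALSE)
  }
  key
}

# Usage, once the variable is set (for example in ~/.Renviron):
# library(acs14lite)
# set_api_key(get_census_key())
```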

There is one main function in the package: acs14. From here, you can request data for the following geographies: the entire US, regions, divisions, states, counties, Census tracts, and Census block groups. These are the geographies that I generally use, and I don’t have plans at the moment to add more; I would welcome pull requests, however.

The acs14 function has the following parameters:

  • api_key: If you’ve set your API key already with set_api_key, you don’t need to provide this.
  • geography: One of ‘us’ (the default), ‘region’, ‘division’, ‘state’, ‘county’, ‘tract’, or ‘block group’.
  • variable: A character string representing the Census variable name you want, or a vector of multiple variable names. Defaults to ‘B01001_001E’, which is total population. You can look up variable names with the acs.lookup function in the acs package; remember to append E (for the estimate) or M (for the margin of error) to the end of the variable name.
  • state: The name of the state for which you want data; applicable to counties, tracts, and block groups.
  • county: The name of the county for which you want data; applicable to tracts and block groups.

The function returns an R data frame with the data you want for your requested geography.
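
For readers curious about what a request of this kind looks like, here is a rough sketch that assembles a Census API query URL by hand. It is not the package's source code: the helper build_acs_url is hypothetical, the endpoint path is an assumption that may have changed since this was written, and the helper takes FIPS codes (such as '56' for Wyoming) rather than the state and county names that acs14 accepts.

```r
# Hypothetical helper, not acs14lite's implementation: assemble a query URL
# for the Census API. The endpoint path is an assumption and may have changed.
build_acs_url <- function(variables, geography = "state",
                          state_fips = NULL, county_fips = NULL,
                          api_key = NULL,
                          endpoint = "https://api.census.gov/data/2014/acs5") {
  # Comma-separated variable list, with NAME first for readable labels
  get_part <- paste(c("NAME", variables), collapse = ",")

  # Parent geographies go in the 'in' clause, joined with '+'
  parents <- c(
    if (!is.null(state_fips)) paste0("state:", state_fips),
    if (!is.null(county_fips)) paste0("county:", county_fips)
  )

  url <- paste0(endpoint, "?get=", get_part, "&for=", geography, ":*")
  if (length(parents) > 0) {
    url <- paste0(url, "&in=", paste(parents, collapse = "+"))
  }
  if (!is.null(api_key)) {
    url <- paste0(url, "&key=", api_key)
  }
  url
}

# Median household income for all counties in Wyoming (state FIPS code 56)
build_acs_url(c("B19013_001E", "B19013_001M"),
              geography = "county", state_fips = "56")
```

The sketch only covers the state, county, and tract cases; other geographies may need a different 'in' clause.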

Additionally, I’ve written a few functions to help users work with margins of error in the ACS. Margins of error for the raw data are provided from the API; however, we often calculate new variables based on the ACS estimates, which in turn will have their own respective margins of error. I’ve used the guidelines in Appendix 3 here: https://www.census.gov/content/dam/Census/library/publications/2008/acs/ACSGeneralHandbook.pdf to write the following functions:

  • moe_sum: calculates a margin of error for a derived sum of ACS estimates
  • moe_prop: calculates a margin of error for a proportion
  • moe_ratio: calculates a margin of error for a ratio
  • moe_product: calculates a margin of error for a product
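
To make the arithmetic behind these four functions concrete, the sketch below writes out the Census Bureau's standard approximation formulas in base R. These are illustrative re-implementations and not the package's code: the sketch_ names are hypothetical, and the argument order is an assumption chosen to mirror the moe_prop call used later in this post.

```r
# Illustrative versions of the Census Bureau approximation formulas.
# The 'sketch_' prefix marks these as hypothetical, not acs14lite exports.

# Sum (or difference) of estimates: square root of the summed squared MOEs
sketch_moe_sum <- function(...) {
  sqrt(Reduce(`+`, lapply(list(...), function(m) m^2)))
}

# Ratio, where the numerator is not a subset of the denominator
sketch_moe_ratio <- function(num, denom, moe_num, moe_denom) {
  r <- num / denom
  sqrt(moe_num^2 + r^2 * moe_denom^2) / denom
}

# Proportion, where the numerator is a subset of the denominator
sketch_moe_prop <- function(num, denom, moe_num, moe_denom) {
  p <- num / denom
  radicand <- moe_num^2 - p^2 * moe_denom^2
  # When the value under the square root is negative, the guidance is
  # to fall back to the ratio formula
  ifelse(radicand < 0,
         sketch_moe_ratio(num, denom, moe_num, moe_denom),
         sqrt(pmax(radicand, 0)) / denom)
}

# Product of two estimates a and b
sketch_moe_product <- function(a, b, moe_a, moe_b) {
  sqrt(a^2 * moe_b^2 + b^2 * moe_a^2)
}

# Made-up numbers: 1000 of 4000 people below poverty, MOEs of 200 and 300
sketch_moe_prop(1000, 4000, 200, 300)
```

All four functions are vectorised, so they can be applied to whole columns of a data frame at once.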

Below, I provide a couple of examples of how you can use the package.

Interactive dot plot of income by county in Wyoming with Plotly

library(ggplot2)
library(plotly)
library(dplyr)

# Median household income (estimate and margin of error) for Wyoming counties
wy_income <- acs14(geography = 'county', variable = c('B19013_001E', 'B19013_001M'), state = 'WY')

# Shorten the county names and compute the bounds of the margin of error
wy2 <- wy_income %>%
  mutate(name = gsub(" County, Wyoming", "", NAME),
         low = B19013_001E - B19013_001M,
         high = B19013_001E + B19013_001M) %>%
  select(name, low, high, estimate = B19013_001E) %>%
  arrange(desc(estimate))

g <- ggplot(wy2, aes(x = estimate, y = reorder(name, estimate))) +
  geom_point() +
  geom_errorbarh(aes(xmin = low, xmax = high)) +
  xlab("Median household income, 2010-2014 ACS estimate") +
  ylab("")


ggplotly(g) %>% layout(margin = list(l = 120))

Interactive map of poverty in Los Angeles County by Census tract with CartoDB and the tigris package

library(tigris)
library(CartoDB) # devtools::install_github("becarioprecario/cartodb-r/CartoDB", dep = TRUE)
library(rgdal)

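# Poverty status universe (_001) and population below poverty (_002),
# with margins of error, for Census tracts in Los Angeles County
la_poverty <- acs14(geography = 'tract', state = 'CA', county = 'Los Angeles',
                    variable = c('B17001_001E', 'B17001_001M', 'B17001_002E', 'B17001_002M'))

# Build the tract GEOID, then compute percent below poverty and its margin of error
la2 <- la_poverty %>%
  mutate(geoid = paste0(state, county, tract),
         pctpov = round(100 * (B17001_002E / B17001_001E), 1),
         moepov = round(100 * (moe_prop(B17001_002E, B17001_001E, B17001_002M, B17001_001M)), 1)) %>%
  select(geoid, pctpov, moepov)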

cdb_name <- 'your CartoDB username here'
cdb_key <- 'your CartoDB API key here'

cartodb(cdb_name, cdb_key)

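# Download cartographic boundary tract shapes for Los Angeles County
la_tracts <- tracts('CA', 'Los Angeles', cb = TRUE)

# Join the ACS data to the shapes by GEOID
la_tracts2 <- geo_join(la_tracts, la2, "GEOID", "geoid")

# Upload the joined layer to CartoDB as a table named 'la_poverty'
r2cartodb(la_tracts2, 'la_poverty')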

# Now, head to your CartoDB account to style your map!
