Using tidycensus and leaflet to map Census data

[This article was first published on Rstats on Julia Silge, and kindly contributed to R-bloggers.]

Recently, I have been following the development and release of Kyle Walker’s tidycensus package. I have been filled with amazement, delight, and well, perhaps another feeling…

But seriously, I have worked with US Census data a lot in the past and this package

  • is such a valuable addition to the R ecosystem and
  • would have saved me SO MUCH ENERGY, HEADACHE, and TIME.

I was working this weekend on a side project with an old friend about opioid usage in Texas and needed to download some Census data again. A perfect opportunity to give this new package a little run-through!

Exercising my joygret

Before running code like the following from tidycensus, you need to obtain an API key from the Census Bureau and then use the function census_api_key() to set it in R.
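As a minimal sketch, setting the key for the current session might look like this. The key string is a placeholder, not a real key, and the environment variable name reflects how current versions of tidycensus store it.

```r
library(tidycensus)

# Placeholder string: replace with your own Census API key
census_api_key("YOUR_API_KEY_HERE")

# tidycensus keeps the key in the CENSUS_API_KEY environment variable
Sys.getenv("CENSUS_API_KEY")
```

Passing install = TRUE to census_api_key() instead writes the key to your .Renviron file, so it is available in future sessions.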

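library(tidyverse)
library(tidycensus)

# B01003_001 is the ACS total population variable; geometry = TRUE also
# downloads the county boundaries as sf simple features
texas_pop <- get_acs(geography = "county",
                     variables = "B01003_001",
                     state = "TX",
                     geometry = TRUE)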

texas_pop
## Simple feature collection with 254 features and 5 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -106.6456 ymin: 25.83738 xmax: -93.50829 ymax: 36.5007
## epsg (SRID):    4269
## proj4string:    +proj=longlat +datum=NAD83 +no_defs
## # A tibble: 254 x 6
##    GEOID                    NAME   variable estimate   moe               geometry
##    <chr>                   <chr>      <chr>    <dbl> <dbl> <S3: sfc_MULTIPOLYGON>
##  1 48007   Aransas County, Texas B01003_001    24292     0 <S3: sfc_MULTIPOLYGON>
##  2 48025       Bee County, Texas B01003_001    32659     0 <S3: sfc_MULTIPOLYGON>
##  3 48035    Bosque County, Texas B01003_001    17971     0 <S3: sfc_MULTIPOLYGON>
##  4 48067      Cass County, Texas B01003_001    30328     0 <S3: sfc_MULTIPOLYGON>
##  5 48083   Coleman County, Texas B01003_001     8536     0 <S3: sfc_MULTIPOLYGON>
##  6 48091     Comal County, Texas B01003_001   119632     0 <S3: sfc_MULTIPOLYGON>
##  7 48103     Crane County, Texas B01003_001     4730     0 <S3: sfc_MULTIPOLYGON>
##  8 48139     Ellis County, Texas B01003_001   157058     0 <S3: sfc_MULTIPOLYGON>
##  9 48151    Fisher County, Texas B01003_001     3858     0 <S3: sfc_MULTIPOLYGON>
## 10 48167 Galveston County, Texas B01003_001   308163     0 <S3: sfc_MULTIPOLYGON>
## # ... with 244 more rows

There we go! The total population of each county in Texas, in a tidyverse-ready data frame. If you want information for multiple states, just use purrr to map over them. The US Census Bureau tabulates many important kinds of information about the United States, although there has been troubling uncertainty about its leadership and funding in recent months.
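The purrr approach to multiple states could be sketched roughly as follows. The choice of states and the helper name get_state_pop() are made up for illustration, and geometry is left out so that the results are plain tibbles that bind_rows() can stack. The download only runs if an API key has already been set.

```r
library(purrr)
library(dplyr)
library(tidycensus)

# Hypothetical choice of states, for illustration only
states <- c("TX", "OK", "LA")

# Hypothetical helper: county-level total population for one state
get_state_pop <- function(state) {
  get_acs(geography = "county",
          variables = "B01003_001",
          state = state)
}

# Only call the Census API when a key has been set with census_api_key()
if (nzchar(Sys.getenv("CENSUS_API_KEY"))) {
  multi_state_pop <- states %>%
    map(get_state_pop) %>%   # one tibble per state
    bind_rows()              # stacked into a single tibble
}
```

The NAME column already includes the state, so the stacked rows stay distinguishable without adding an extra identifier column.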

So we have this data in a form that will be easy to manipulate; what if we want to map it? Kyle Walker again has this taken care of with his tigris package (a dependency of tidycensus). If you set geometry = TRUE, the way I did when I downloaded the Census data above, tigris handles downloading the shapefiles from the Census, with support for sf simple features. Kyle has a vignette for mapping using ggplot2, but you can also pipe straight into leaflet.

library(leaflet)
library(stringr)
library(sf)

pal <- colorQuantile(palette = "viridis", domain = texas_pop$estimate, n = 10)

texas_pop %>%
    # leaflet expects WGS84 longitude/latitude (EPSG:4326); the data arrive in NAD83
    st_transform(crs = 4326) %>%
    leaflet(width = "100%") %>%
    addProviderTiles(provider = "CartoDB.Positron") %>%
    addPolygons(popup = ~ str_extract(NAME, "^([^,]*)"),
                stroke = FALSE,
                smoothFactor = 0,
                fillOpacity = 0.7,
                color = ~ pal(estimate)) %>%
    addLegend("bottomright", 
              pal = pal, 
              values = ~ estimate,
              title = "Population percentiles",
              opacity = 1)
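Two small pieces of that pipeline can be tried on their own, without an API key. The sketch below uses a handful of county names and estimates copied from the output above to show what the popup regex and the quantile palette each return. The variable names are just for illustration.

```r
library(stringr)
library(leaflet)

# A few values copied from the texas_pop output above
county_names <- c("Aransas County, Texas", "Comal County, Texas",
                  "Galveston County, Texas")
estimates <- c(24292, 32659, 17971, 30328, 8536,
               119632, 4730, 157058, 3858, 308163)

# "^([^,]*)" matches everything from the start of the string up to
# (but not including) the first comma: the county name without the state
popup_labels <- str_extract(county_names, "^([^,]*)")
popup_labels
## [1] "Aransas County"   "Comal County"     "Galveston County"

# colorQuantile() returns a function that maps each value to a hex color
# according to which of the n quantile bins it falls in
pal <- colorQuantile(palette = "viridis", domain = estimates, n = 10)
county_colors <- pal(estimates)
```

Because the palette is based on quantiles rather than raw values, a few very large counties do not wash out the color differences among the many small ones.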
