Easier Composite U.S. Choropleths with albersusa

March 29, 2016

(This article was first published on R – rud.is, and kindly contributed to R-bloggers)

Folks who’ve been tracking this blog on R-bloggers probably remember this post where I showed how to create a composite U.S. map with an Albers projection (which is commonly referred to as AlbersUSA these days thanks to D3).

I’m not sure why I didn’t think of this earlier, but you don’t need to do those geographical machinations every time you want a prettier & more inclusive map (Alaska & Hawaii have been states for a while, so perhaps we should make more of an effort to include them in both data sets and maps). After doing the map transformations, the composite shape can be saved out to a shapefile, preferably GeoJSON since (a) you can use geojsonio::geojson_write() to save it and (b) it’s a single file vs a ZIP/directory.

I did just that and saved both state and country maps out with FIPS codes and other useful data slot bits and created a small data package : albersusa : with some helper functions. It’s not in CRAN yet so you need to devtools::install_github("hrbrmstr/albersusa") to use it. The github repo has some basic examples, heres a slightly more complex one.

Mapping Obesity

I grabbed an obesity data set from the CDC and put together a compact example for how to make a composite U.S. county choropleth to show obesity rates per county (for 2012, which is the most recent data). I read in the Excel file, pull out the county FIPS code and 2012 obesity rate, then build the choropleth. It’s not a whole lot of code, but that’s one main reason for the package!

library(ggplot2)   # devtools::install_github("hadley/ggplot2") only if you want subtitles/captions
library(albersusa) # devtools::install_github("hrbrmstr/albersusa")
# get the data and be nice to the server and keep a copy of the data for offline use
URL <- "http://www.cdc.gov/diabetes/atlas/countydata/OBPREV/OB_PREV_ALL_STATES.xlsx"
fil <- basename(URL)
if (!file.exists(fil)) download.file(URL, fil)
# it's not a horrible Excel file, but we do need to hunt for the data
# and clean it up a bit. we just need FIPS & 2012 percent info
wrkbk <- read_excel(fil)
obesity_2012 <- setNames(wrkbk[-1, c(2, 61)], c("fips", "pct"))
obesity_2012$pct <- as.numeric(obesity_2012$pct) / 100
# I may make a version of this that returns a fortified data.frame but
# for now, we just need to read the built-in saved shapefile and turn it
# into something ggplot2 can handle
cmap <- fortify(counties_composite(), region="fips")
# and this is all it takes to make the map below
gg <- ggplot()
gg <- gg + geom_map(data=cmap, map=cmap,
                    aes(x=long, y=lat, map_id=id),
                    color="#2b2b2b", size=0.05, fill=NA)
gg <- gg + geom_map(data=obesity_2012, map=cmap,
                    aes(fill=pct, map_id=fips),
                    color="#2b2b2b", size=0.05)
gg <- gg + scale_fill_viridis(name="Obesity", labels=percent)
gg <- gg + coord_proj(us_laea_proj)
gg <- gg + labs(title="U.S. Obesity Rate by County (2012)",
                subtitle="Content source: Centers for Disease Control and Prevention",
           caption="Data from http://www.cdc.gov/diabetes/atlas/countydata/County_ListofIndicators.html")
gg <- gg + theme_map(base_family="Arial Narrow")
gg <- gg + theme(legend.position=c(0.8, 0.25))
gg <- gg + theme(plot.title=element_text(face="bold", size=14, margin=margin(b=6)))
gg <- gg + theme(plot.subtitle=element_text(size=10, margin=margin(b=-14)))



Note that some cartographers think of this particular map view the way I look at a pie chart, but it’s a compact & convenient way to keep the states/counties together and will make it easier to include Alaska & Hawaii in your cartographic visualizations.

The composite GeoJSON files are in:

  • system.file("extdata/composite_us_states.geojson", package="albersusa")
  • system.file("extdata/composite_us_counties.geojson", package="albersusa")

if you want to use them in another program/context.

Drop an issue on github if you want any more default fields in the data slot and if you “need” territories (I’d rather have a PR for the latter tho :-).

To leave a comment for the author, please follow the link and comment on their blog: R – rud.is.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)