A long time ago, in a GitHub repo far, far away, there lived a tiny package dubbed statebins that made it possible to create equal-area, square U.S. state cartograms in R. Three years have come and gone and, truth be told, I’ve never been happy with that package. It never felt “right”, and that gnawing feeling finally turned into action with a “re-imagining” of the API.
There were three different functions in the old-style package:
- one for discrete scales (it automated ‘cuts’)
- one for continuous scales
- one for manual scales
It also did some hack-y stuff with grobs to try to get things to look good without putting too much burden on the user.
All that “mostly” worked, but I always ended up doing some painful workaround when I needed more custom charts (not that I have to use this package much given the line of work I’m in).
Now, there’s just one function for making the cartograms, statebins(), and another for applying a base theme, theme_statebins(). The minimalisation has some advantages that we’ll take a look at now, starting with the most basic example (the one on the manual page):

    data(USArrests)
    USArrests$state <- rownames(USArrests)

    statebins(USArrests, value_col = "Assault", name = "Assault") +
      theme_statebins(legend_position = "right")

Two things should stand out there:
- you got scale_fill_distiller() for free (it is the default scale function)
- labels are dark/light depending on the tile color
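
To make the second point a bit more concrete, here is a minimal sketch of one way a dark or light label could be chosen from a tile's fill color. It is an illustration only and not necessarily how statebins does it internally: pick_label_color() is a hypothetical helper, and the luminance weights and the 0.5 threshold are assumptions.

```r
# Hypothetical helper (not part of statebins): pick a label color that
# contrasts with each tile fill, based on the fill's perceived brightness.
pick_label_color <- function(fill, dark = "black", light = "white") {
  rgb <- grDevices::col2rgb(fill) # 3 x n matrix of 0-255 channel values
  # Rec. 601 luma weights, scaled to 0-1 (an assumption; other weightings work)
  luma <- (0.299 * rgb[1, ] + 0.587 * rgb[2, ] + 0.114 * rgb[3, ]) / 255
  # bright tiles get the dark label, dark tiles get the light label
  ifelse(luma > 0.5, dark, light)
}

# a pale yellow tile and a dark blue tile
label_colors <- pick_label_color(c("#ffffcc", "#253494"))
```

The dark and light arguments mirror the dark_label and light_label parameters, which is why the labels stay readable whatever palette the fill scale ends up using.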
Before we go into ^^, it may be helpful to show the new function interface:

    statebins(
      state_data,
      state_col = "state", value_col = "value",
      dark_label = "black", light_label = "white", font_size = 3,
      state_border_col = "white", state_border_size = 2,
      ggplot2_scale_function = ggplot2::scale_fill_distiller,
      ...
    )

You pass in the state name/abbreviation & value columns like the old interface but also specify colors for the dark & light labels (use hex colors whose alpha value ends in 00 if you don’t want labels, but Muricans are pretty daft and generally need the abbreviations on the squares). You can set the font size, too (we’ll do that in a bit) and customize the border color (usually to match the background of the target medium). BUT, you also pass in the ggplot2 scale function you want to use and the named parameters for it (that’s what the ... is for).
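
As a rough sketch of those two ideas together, the call below hides the labels with fully transparent 8-digit hex colors and forwards a few named parameters to the scale function through the dots. The palette and direction values are just picks for the illustration, and p_no_labels is a throwaway name.

```r
library(statebins)

data(USArrests)
USArrests$state <- rownames(USArrests)

p_no_labels <- statebins(
  USArrests,
  value_col = "Assault",
  # the trailing 00 is the alpha channel, so both label colors are invisible
  dark_label = "#00000000", light_label = "#ffffff00",
  ggplot2_scale_function = ggplot2::scale_fill_distiller,
  # everything below is handed to scale_fill_distiller() via ...
  palette = "Reds", direction = 1, name = "Assault"
) +
  theme_statebins(legend_position = "right")

p_no_labels
```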
So, yes I’ve placed more of a burden on you if you want discrete cuts, but I’ve also made the package way more flexible and made it possible to keep the labels readable without you having to lift an extra coding finger.
theme()-ing is also moved out to a separate theme function which makes it easier for you to further customize the final output.
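
For instance, since the result is a regular ggplot2 object, extra theme() tweaks can be layered on after the base theme. This is a small sketch; the title styling and the p_themed name are arbitrary choices for the example.

```r
library(statebins)
library(ggplot2)

data(USArrests)
USArrests$state <- rownames(USArrests)

p_themed <- statebins(USArrests, value_col = "Murder", name = "Murder") +
  labs(title = "Murder arrests per 100,000") +
  theme_statebins(legend_position = "bottom") +
  # further customization goes after the base theme so it is not overwritten
  theme(plot.title = element_text(face = "bold", hjust = 0.5))

p_themed
```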
But that’s not all!
There are now squares for Puerto Rico, the Virgin Islands and New York City (the latter two were primarily for new features/data in cdcfluview but they are good to have available). Let’s build out a larger example with some of these customizations (we’ll make up some data to do that):

    library(statebins)
    library(tidyverse)
    library(viridis)

    data(USArrests)

    # make up some data for the example
    rownames_to_column(USArrests, "state") %>%
      bind_rows(
        tibble(
          state = c("Virgin Islands", "Puerto Rico", "New York City"),
          # each made-up row gets the maximum observed value
          # (length-1 values are recycled across the three rows)
          Murder = max(USArrests$Murder),
          Assault = max(USArrests$Assault),
          Rape = max(USArrests$Rape),
          UrbanPop = c(93, 95, 100)
        )
      ) -> us_arrests

    statebins(
      us_arrests,
      value_col = "Assault",
      ggplot2_scale_function = viridis::scale_fill_viridis
    ) +
      labs(title = "USArrests + made up data") +
      theme_statebins("right")

Cutting to the chase
I still think it makes more sense to use binned data in these cartograms, and while you no longer get that for “free”, it’s not difficult to do:

    adat <- suppressMessages(read_csv("http://www.washingtonpost.com/wp-srv/special/business/states-most-threatened-by-trade/states.csv?cache=1"))

    mutate(
      adat,
      share = cut(avgshare94_00, breaks = 4, labels = c("0-1", "1-2", "2-3", "3-4"))
    ) %>%
      statebins(
        value_col = "share",
        ggplot2_scale_function = scale_fill_brewer,
        name = "Share of workforce with jobs lost or threatened by trade"
      ) +
      labs(title = "1994-2000") +
      theme_statebins()

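
One thing worth knowing when you do your own binning: when cut() is given a single number for breaks, it splits the observed range of the data into that many equal-width bins, so hand-written labels only line up with the values if the data happens to span that range. The toy values below are made up to show the difference. Passing explicit break points is one option if you want the bins to match the labels exactly.

```r
x <- c(0.2, 1.1, 2.5, 3.9) # made-up values, not the trade data

bin_labels <- c("0-1", "1-2", "2-3", "3-4")

# four equal-width bins spanning range(x); 1.1 lands in the first bin
auto_bins <- cut(x, breaks = 4, labels = bin_labels)

# explicit bin edges at 0, 1, 2, 3, 4; each value lands in the bin its label names
explicit_bins <- cut(x, breaks = 0:4, labels = bin_labels)

data.frame(x, auto_bins, explicit_bins)
```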
More manual labor
You can also still use hardcoded colors, but it’s a little more work on your end (but not much!):

    election_2012 <- suppressMessages(read_csv("https://raw.githubusercontent.com/hrbrmstr/statebins/master/tmp/election2012.csv"))

    mutate(election_2012, value = ifelse(is.na(Obama), "Romney", "Obama")) %>%
      statebins(
        font_size = 4,
        dark_label = "white", light_label = "white",
        ggplot2_scale_function = scale_fill_manual,
        name = "Winner",
        values = c(Romney = "#2166ac", Obama = "#b2182b")
      ) +
      theme_statebins()

BREAKING NEWS: Rounded corners
A Twitter request ended up turning into a new feature this afternoon (after I made this post) => rounded corners:

    data(USArrests)
    USArrests$state <- rownames(USArrests)

    statebins(USArrests, value_col = "Assault", name = "Assault", round = TRUE) +
      theme_statebins(legend_position = "right")

It’ll be a while before this hits CRAN and I’m not really planning on keeping the old interface when the submission happens. So, it’ll be on GitHub for a bit to let folks chime in on what additional features they want and whether the deprecated functions really need to stay in the package.
So, kick the tyres and don’t hesitate to shoot over some feedback!