Mapping US Counties in R with FIPS

June 16, 2016
(This article was first published on R Tricks – Data Science Riot!, and kindly contributed to R-bloggers)

Anyone who’s spent any time around data knows primary keys are your friend. Enter the FIPS code. FIPS is the Federal Information Processing Standard and appears in most data sets published by the US government.
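For context, a county FIPS code is five characters long: a two-digit state code followed by a three-digit county code. A common snag is that the codes lose their leading zeros when a file is read in as numeric. Here is a minimal base R sketch of padding them back, assuming the codes arrived as numbers; the three sample codes are only illustrative.

```r
# County FIPS codes as they often look after a numeric import:
# 1001 (Autauga County, AL), 6037 (Los Angeles County, CA),
# 22071 (Orleans Parish, LA). The first two have lost a leading zero.
raw_fips <- c(1001, 6037, 22071)

# Pad back to five characters so the codes can serve as a join key.
fips <- sprintf("%05d", as.integer(raw_fips))

# The first two characters are the state, the last three the county.
state_fips  <- substr(fips, 1, 2)
county_part <- substr(fips, 3, 5)

print(data.frame(fips, state_fips, county_part))
```

Keeping the key as a character string from the start (for example with `colClasses = "character"` in `read.csv()`) avoids the problem entirely.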

Name Matching
The map below is an example of the “wrong way” to do something like this. It uses string matching to pair the US county names in the data with the county names in the maps package. The map can be replicated with this GitHub gist, but I don’t recommend it. I’m using an air quality data set from the Centers for Disease Control and Prevention.

Problems with string matching:
County names change
Louisiana has “parishes” instead of counties
Alaska has “boroughs” and “census areas”

str_map

As you can see from above, many counties are missing. It’s possible to fix this with some fancy regex work, but it may take quite some time before you realize why Oglala Lakota County is missing from your base map!
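To make the failure mode concrete, here is a small sketch in base R. It is not the code from the gist. The base-map names are a toy vector written in the lowercase "state,county" style the maps package uses, and it is assumed to still list Oglala Lakota County under its older name, Shannon. The data frame `aq` is a hypothetical stand-in for the air quality data.

```r
# Toy base-map names in the "state,county" style of the maps package.
map_names <- c("louisiana,orleans", "south dakota,shannon", "texas,travis")

# Hypothetical data set with county names as a publisher might write them.
aq <- data.frame(
  state  = c("Louisiana", "South Dakota", "Texas"),
  county = c("Orleans Parish", "Oglala Lakota County", "Travis County"),
  stringsAsFactors = FALSE
)

# Naive cleanup: lowercase everything and strip a trailing " county".
key <- paste(tolower(aq$state),
             sub(" county$", "", tolower(aq$county)),
             sep = ",")

# Flag which rows found a partner in the base map.
aq$matched <- key %in% map_names

# Rows that silently drop off the map.
print(aq[!aq$matched, c("state", "county")])
```

Only the Texas row survives. The parish could be rescued with one more regex, but the renamed county cannot be fixed by pattern matching at all.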

FIPS Matching
The maps package contains a built-in data set that you can call with `county.fips`. The only problem is that you still end up string matching against your own data set. I’ve found the best way to get a map with baked-in FIPS codes is to download one of the many shape files provided by the Census Bureau. NOTE: There are also shape files for ZIP codes, congressional districts, census tracts, etc. The shape file we’re using can be downloaded here. Just unzip it and place it in your working directory.

Note that these data don’t contain values for Alaska and Hawaii.
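The join itself can be sketched in a few lines of base R. This is an illustration under assumptions, not the exact code behind the map. `shape_attrs` is a toy stand-in for the shape file's attribute table, the five-character `GEOID` column is how Census county files usually carry the FIPS code, and the values in `aq` are made-up numbers.

```r
# Toy stand-in for the attribute table of a county shape file.
# Assumption: the file exposes the FIPS code in a GEOID column.
shape_attrs <- data.frame(
  GEOID = c("01001", "06037", "22071"),
  NAME  = c("Autauga", "Los Angeles", "Orleans"),
  stringsAsFactors = FALSE
)

# Hypothetical measurements keyed by numeric FIPS (made-up values).
aq <- data.frame(
  fips  = c(6037, 22071, 1001),
  value = c(12.1, 9.8, 11.3)
)

# Build the same five-character key on the data side.
aq$GEOID <- sprintf("%05d", as.integer(aq$fips))

# Left join on the key: every county in the shape file keeps its row,
# and counties with no data would get NA rather than disappearing.
joined <- merge(shape_attrs, aq[, c("GEOID", "value")],
                by = "GEOID", all.x = TRUE)

print(joined)
```

With a real shape file, the attribute table would come from a reader such as `sf::st_read()`, and the same key-based merge applies. No county names are compared at any point.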

fips_map

Results

The data cover 2000 to 2010, and the shape file we’re using is from 2013. But since FIPS codes generally stay the same even when county names change, everything matches up just fine.

Other Advantages
Leaflet, anyone? Another plus of shape files is that they are easily rendered on a Leaflet map. I threw the map below together “quick and dirty.” I’m not really pleased with the green-to-red color ramp, but I’m sure that could be fixed by manually assigning color buckets. ggplot2 seems to have a better handle on color ramping straight out of the box.
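For what manually assigned color buckets might look like, here is a rough base R sketch. The break points, hex colors, and sample values are all arbitrary choices for illustration, not the ones used in the map.

```r
# Made-up values standing in for the mapped variable.
values <- c(3.2, 7.5, 9.9, 12.4, 15.0)

# Hand-picked bucket edges and one fill color per bucket (illustrative).
breaks <- c(0, 5, 10, 15, Inf)
colors <- c("#ffffb2", "#fecc5c", "#fd8d3c", "#e31a1c")

# Assign each value to a left-closed bucket: [0,5), [5,10), [10,15), [15,Inf].
bucket <- cut(values, breaks = breaks, right = FALSE, include.lowest = TRUE)

# Look up the fill color for each value's bucket.
fill <- colors[as.integer(bucket)]

print(data.frame(values, bucket, fill))
```

The resulting `fill` vector could be passed as a polygon fill color. The leaflet package also provides `colorBin()`, which builds a similar lookup from a palette and a set of bins.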
