Following up on news stories with choroplethr and R

August 25, 2015
By

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

by Ari Lamstein, consultant specializing in software engineering and data analysis and author of the free email course Learn to Map Census Data in R.

One of my favorite things about R is that it allows me to follow up on interesting news stories. Consider this interview on EconTalk about the history of fracking in America. Russ Roberts interviewed Gregory Zuckerman about his book The Frackers. One thing that struck me were the stories of how North Dakota is being transformed by the fracking boom. North Dakota sits on the Bakken formation which, due to fracking, is now able to be monetized.

Here are two maps I made which demonstrate North Dakota’s recent demographic changes. The first shows that between 2010 and 2013 North Dakota’s Per Capita Income grew at a rate of 15%, significantly above any other US state. The second one shows that North Dakota’s Median Age decreased by 2%, significantly below any other US state. Today I will demonstrate how to create these maps in R.

AL1

                       

Step 1: Install and Load the Necessary Packages

We’ll use the choroplethr and choroplethrMaps packages to create these maps. To install and load them, type the following from an R console:

install.packages(c(“choroplethr”, “choroplethrMaps”))
library(choroplethr)
library(choroplethrMaps)

 

Step 2: Get and Register a US Census API Key

The data we’ll be mapping comes from the US Census Bureau’s American Community Survey (ACS). To access their API you need an API key, which you can get for free here. Once you have a key, type the following from an R console:

library(acs)
api.key.install(“”)

Step 3: Get the Data

The ACS began in 2005, but only data from 2010 thru 2013 is available via the API. We can get 2010 and 2013 data by typing the following:

demo_2010 = get_state_demographics(2010)
demo_2013 = get_state_demographics(2013)

To see the values that are available, look at the column names:

colnames(demo_2013)
[1] "region" "total_population" "percent_white" "percent_black"   
[5] "percent_asian" "percent_hispanic" "per_capita_income" 
[8]
"median_rent"  "median_age"

Before calculating the percent change, merge the two data frames together by region. Note that “region” here is just a state name:

demo_all = merge(demo_2010, demo_2013, by="region")
colnames(demo_all)

 [1] "region"              "total_population.x" "percent_white.x"   
 [4] "percent_black.x"     "percent_asian.x"    "percent_hispanic.x"
 [7] "per_capita_income.x" "median_rent.x"      "median_age.x"      
[10] "total_population.y"  "percent_white.y"    "percent_black.y"   
[13] "percent_asian.y"     "percent_hispanic.y" "per_capita_income.y"
[16] "median_rent.y"       "median_age.y" 

The 2010 values now have .x appended to them, and the 2013 values have .y appended to them.

Step 4: Map the Percentage Change for Per Capita Income

We’ll use the function state_choropleth to create choropleth maps of the demographic changes. First we need to create a column called value which represents the percent change in Per Capita Income between the two years:

demo_all$value = (demo_all$per_capita_income.y – demo_all$per_capita_income.x) / demo_all$per_capita_income.x * 100

Now we can create the map:

state_choropleth(demo_all)

  AL2

The main problem with this map is the scale: it does not allow us to distinguish between negative and positive values. A more appropriate scale is scale_fill_gradient2: it makes 0 white, negative values red and positive values blue. To change the scale we need to use the object oriented features of choroplethr. Here is code to change the scale as well as do other things, such as remove the state labels.

choro = StateChoropleth$new(demo_all)
choro$title = "State Per Capita Incomen2010-2013 Percent Change"
min = min(demo_all$value)
max = max(demo_all$value)
choro$show_labels = FALSE
choro$set_num_colors(1)
choro$ggplot_scale = scale_fill_gradient2(name="Percent Change", limits = c(min, max))
income_map = choro$render()
income_map

AL3

Step 5: Map the Percentage Change for Median Age

We can create a similar map for changes in the median age using the same pattern:

demo_all$value = (demo_all$median_age.y – demo_all$median_age.x) / demo_all$median_age.x * 100
choro = StateChoropleth$new(demo_all)
choro$title = "State Median Agen2010-2013 Percent Change"
choro$set_num_colors(1)
min = min(demo_all$value)
max = max(demo_all$value)
choro$show_labels = FALSE
choro$ggplot_scale = scale_fill_gradient2(name="Percent Change", limits=c(min, max))
age_map = choro$render()
age_map

  AL4

Step 6: Merge the Maps

Finally, we can merge the two maps using the gridExtra package:

library(gridExtra)
grid.arrange(income_map, age_map)

  AL5

Final Thoughts

In closing, I want to point out that the ACS provides estimates only. For more information on the reliability of the estimates, please see the acs package created by Ezra Haber Glenn.

Ari Lamstein is a consultant who specializes in software engineering and data analysis. He is the author of the free email course Learn to Map Census Data in R.

To leave a comment for the author, please follow the link and comment on their blog: Revolutions.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)