Visualizing Unemployment Data

December 3, 2011

(This article was first published on Lambda Omega Lambda » R, and kindly contributed to R-bloggers)

Recently the Bureau of Labor Statistics released the Oct. 2011 unemployment data. This is not a discussion of its validity or its impact; it is a post on how to visualize it. This post is also for my own future reference: I’ve wanted to be able to do this for a while, and it’ll serve as a record of how I did it. The map is my own, but the methods are pieced together from other sources.

The Data

So you can go over to the BLS Local Area Stats Page and get the data if you’d like to follow along.

First the data needs to be put into a form we can use in R, where we’ll create the map. I copied the table from the link into MacVim (there may be better ways). Then, with a couple[1] of s///g substitutions, I was able to get the file into CSV format, which means we’re ready to open R.
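If you would rather skip the editor step, the same conversion could probably be done in R itself. This is only a sketch: it assumes the pasted text has one state per line with a tab between the name and the rate (see the hint in [1]), and it uses six rows from the table as a stand-in for the full paste. The names raw, pasted and csv_path are made up for the example, and the capitalization of the state names is a guess.

```r
# Stand-in for the pasted table: state name, a tab, then the rate
# (six rows only; the capitalization is assumed)
raw <- paste(
  "North Dakota\t3.5",
  "Nebraska\t4.2",
  "South Dakota\t4.5",
  "New Hampshire\t5.3",
  "Vermont\t5.6",
  "Wyoming\t5.7",
  sep = "\n"
)

# read.delim() splits on tabs by default; there is no header row
pasted <- read.delim(text = raw, header = FALSE, stringsAsFactors = FALSE)

# Write comma-separated values with no header line and no row names,
# so the file can be read back with read.csv(..., header = FALSE)
csv_path <- tempfile(fileext = ".csv")
write.table(pasted, csv_path, sep = ",", row.names = FALSE, col.names = FALSE)

readLines(csv_path)
```

With a real paste saved to disk, read.delim() would take the file path in place of the text argument.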

Dat’ Map

There are two libraries we’ll be using to help us with this visualization, ggplot2[2] and maps.

So of course we’ll load them into our session:

library(ggplot2)
library(maps)

Now that we have the libraries loaded, we need to get the unemployment data into the session.

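# The file has no header row, so name the columns ourselves
unemp <- read.csv("data.csv", header = FALSE)
names(unemp) <- c("region", "percent")
# The map data uses lower-case state names, so match that
unemp$region <- tolower(unemp$region)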

> head(unemp)
         region percent
1  north dakota     3.5
2      nebraska     4.2
3  south dakota     4.5
4 new hampshire     5.3
5       vermont     5.6
6       wyoming     5.7
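Before going further it may be worth checking that the lower-cased names line up with the region names the map data uses, since rows without a match will not appear on the map. The sketch below uses a short, hypothetical vector of region names rather than the full data. The "state" database in maps should cover only the contiguous states and the District of Columbia, so names like Alaska and Hawaii are expected to come back unmatched.

```r
library(ggplot2)  # provides map_data()
library(maps)     # provides the "state" map database

# Hypothetical region names, as they might look after tolower()
data_regions <- c("north dakota", "nebraska", "alaska", "hawaii")

# Region names that have polygons in the map data
map_regions <- unique(map_data("state")$region)

# Names in the data with no matching polygons in the map
missing_from_map <- setdiff(data_regions, map_regions)
missing_from_map
```

With the real data, data_regions would be replaced by unemp$region.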

So what we’re going to do next is create a single data.frame from two merged ones. ggplot2 draws each state from a set of long and lat coordinates, so we’ll need to associate the unemployment numbers with those long and lat values.

state_df <- map_data("state")
merged <- merge(state_df, unemp, by="region")
merged <- merged[order(merged$order),]
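The re-sort on the order column matters because merge() sorts its result by the key, which can scramble the sequence in which the polygon vertices are drawn. Here is a small sketch of that on toy data. The coordinates are made up and are not real state outlines, and polys, rates and joined are names invented for the example; only the two rates come from the output above.

```r
# Toy polygon table: three vertices per "state", plus a drawing order column
# (made-up coordinates, not real state outlines)
polys <- data.frame(
  region = rep(c("north dakota", "nebraska"), each = 3),
  long   = c(-104, -97, -97, -104, -95, -95),
  lat    = c(49, 49, 46, 43, 43, 40),
  order  = 1:6
)

# Two rates taken from the head() output shown earlier
rates <- data.frame(
  region  = c("north dakota", "nebraska"),
  percent = c(3.5, 4.2)
)

# merge() attaches percent to every vertex row, but sorts rows by region
joined <- merge(polys, rates, by = "region")
joined$order

# Sorting on the original order column restores the drawing sequence
joined <- joined[order(joined$order), ]
joined$order
```

Note that merge() keeps only regions present in both tables by default, so a state missing from either side drops out of the result.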

Great, so now the only step left is to create the map.

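# Fill each state by its rate, then outline every state in white.
# Recent ggplot2 releases prefer linewidth over size for line thickness.
ggplot(merged, aes(long, lat, group = group)) +
  geom_polygon(aes(fill = percent), colour = alpha("white", 1/2), size = 0.2) +
  geom_polygon(data = state_df, colour = "white", fill = NA)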
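For anyone without the full CSV to hand, the steps above can be sketched end to end using only the six rows printed earlier. Treat this as an illustration rather than the map from this post: the object names (unemp_six, merged_six, p), the colours, the legend label and the title are all choices made for the example. Because only six states have a rate here, it also shows what the second layer is doing, since that layer outlines every state whether or not it has data.

```r
library(ggplot2)
library(maps)

# Six rows taken from the head() output shown earlier in the post
unemp_six <- data.frame(
  region  = c("north dakota", "nebraska", "south dakota",
              "new hampshire", "vermont", "wyoming"),
  percent = c(3.5, 4.2, 4.5, 5.3, 5.6, 5.7)
)

# Attach the rates to the polygon vertices and restore the drawing order
state_df <- map_data("state")
merged_six <- merge(state_df, unemp_six, by = "region")
merged_six <- merged_six[order(merged_six$order), ]

p <- ggplot(merged_six, aes(long, lat, group = group)) +
  # Fill only the states that have a rate
  geom_polygon(aes(fill = percent)) +
  # Outline every state, including those with no data
  geom_polygon(data = state_df, colour = "grey60", fill = NA) +
  # Arbitrary colour choice: light for low rates, dark for high
  scale_fill_gradient(low = "lightyellow", high = "darkred",
                      name = "Unemployment (%)") +
  labs(title = "Unemployment rate by state, Oct. 2011")

p
```

To write the plot to disk, ggsave("unemployment_map.png", p) should do it.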

And the finished product should be a map of the states, each one shaded by its unemployment rate.

[1] Hint: the space between the state name and the number is a tab, \t.
[2] I’ve been using ggplot2 for a couple weeks now, and it is awesome – highly recommended.
