Site icon R-bloggers

Lies, Damn Lies, “Data Journalism” and Charts That Don’t Start at 0

[This article was first published on rud.is » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This tweet by @moorehn (who usually is a superb economic journalist) really bugged me:

Alarming chart of employment for people between 25 and 54. It's like a ski jump. #SOTUecon pic.twitter.com/KNGYmwI88C

— Heidi N. Moore (@moorehn) January 29, 2014

I grabbed the raw data from EPI: (http://www.epi.org/files/2012/data-swa/jobs-data/Employment%20to%20population%20ratio%20(EPOPs).xls) and properly started the graph at 0 for the y-axis and also broke out men & women (since the Excel spreadsheet had the data). It’s a really different picture:

I’m not saying employment is great right now, but it’s nowhere near a “ski jump”. So much for the state of data journalism at the start of 2014.

Here’s the hastily crafted R-code:

library(ggplot2)
library(ggthemes)
library(reshape2)
 
a <- read.csv("empvyear.csv")
b <- melt(a, id.vars="Year")
 
gg <- ggplot(data=b, aes(x=Year, y=value, group=variable))
gg <- gg + geom_line(aes(color=variable))
gg <- gg + ylim(0, 100)
gg <- gg + theme_economist()
gg <- gg + labs(x="Year", y="Employment as share of population (%)", 
                title="Employment-to-population ratio, age 25–54, 1975–2011")
gg <- gg + theme(legend.title = element_blank())
gg

And, here’s the data extracted from the Excel file:

Year,Men,Women
1975,89.0,51.0
1976,89.5,52.9
1977,90.1,54.8
1978,91.0,57.3
1979,91.1,59.0
1980,89.4,60.1
1981,89.0,61.2
1982,86.5,61.2
1983,86.1,62.0
1984,88.4,63.9
1985,88.7,65.3
1986,88.5,66.6
1987,89.0,68.2
1988,89.5,69.3
1989,89.9,70.4
1990,89.1,70.6
1991,87.5,70.1
1992,86.8,70.1
1993,87.0,70.4
1994,87.2,71.5
1995,87.6,72.2
1996,87.9,72.8
1997,88.4,73.5
1998,88.8,73.6
1999,89.0,74.1
2000,89.0,74.2
2001,87.9,73.4
2002,86.6,72.3
2003,85.9,72.0
2004,86.3,71.8
2005,86.9,72.0
2006,87.3,72.5
2007,87.5,72.5
2008,86.0,72.3
2009,81.5,70.2
2010,81,69.3
2011,81.4,69
To leave a comment for the author, please follow the link and comment on their blog: rud.is » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.