Revisiting homicide rates
A pint of R plotted an interesting dataset: intentional homicides in South America. I thought the graphs were pretty, but I was unhappy with the way the plots conveyed information: what matters is relative risk, and the raw number of homicides is misleading because it also depends on each country's population (this problem often comes up in our discussions in Stats Chat).
Instead of just complaining, I decided to try a few alternatives (disclaimer: I do not have a good eye for colors or design; I am only looking for ways to show relative risk better). I downloaded the MS Excel file, which contained a lot of information for other countries, and extracted only the data relevant to these plots, which you can obtain here: homicides.csv (4 KB). Some quick code produces a heatmap of homicide rates by country and year:
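To make the point about counts versus rates concrete, here is a minimal sketch with made-up numbers. The two countries and all figures below are invented for illustration and do not come from the dataset.

```r
# Hypothetical counts and populations, invented purely for illustration
toy <- data.frame(
  country    = c('Large country', 'Small country'),
  homicides  = c(20000, 500),
  population = c(100e6, 1e6)
)

# Rate per 100,000 people puts both countries on the same scale
toy$rate <- toy$homicides / toy$population * 100000

# Sort by rate, highest first
toy[order(-toy$rate), c('country', 'homicides', 'rate')]
```

In this toy case the large country has 40 times as many homicides, yet the small country's rate (50 per 100,000) is 2.5 times the large country's (20 per 100,000).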
    library(ggplot2)
    setwd('~/Dropbox/quantumforest')

    kill = read.csv('homicides.csv', header = TRUE)
    kp = ggplot(kill, aes(x = year, y = country, fill = rate))

    # Colors coming from
    # http://learnr.wordpress.com/2010/01/26/ggplot2-quick-heatmap-plotting/
    # theme() and element_blank() replace opts() and theme_line(colour = NA),
    # which current ggplot2 releases no longer accept
    kp = kp + geom_tile() +
      scale_x_continuous(name = 'Year', expand = c(0, 0)) +
      scale_y_discrete(name = 'Country', expand = c(0, 0)) +
      scale_fill_gradient(low = 'white', high = 'steelblue', name = 'Homicide rate') +
      theme_bw() +
      theme(panel.grid.major = element_blank(),
            panel.grid.minor = element_blank())

    # Print while the device is open; assigning the plot alone draws nothing
    png('homicides-tile.png', width = 500, height = 500)
    print(kp)
    dev.off()
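One possible refinement, not part of the original plot, is to order the rows of the heatmap by each country's mean rate so that relative risk can be read top to bottom. The sketch below uses a toy data frame with the same column names as the code above (country, year, rate); the country labels and values are invented.

```r
# Toy stand-in for homicides.csv: same column names, invented values
toy <- data.frame(
  country = rep(c('A', 'B', 'C'), each = 2),
  year    = rep(c(2000, 2001), times = 3),
  rate    = c(10, 12, 40, 44, 3, 5)
)

# reorder() sorts the factor levels by each country's mean rate (ascending)
toy$country <- reorder(toy$country, toy$rate, FUN = mean)

levels(toy$country)  # "C" "A" "B"
```

Because ggplot2 draws the first level of a discrete y axis at the bottom, applying the same reorder() call to the real data before plotting would put the country with the highest mean rate on the top row.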
It is also possible to use a line graph, but with one line per country it quickly gets very messy, so I split the countries into totally arbitrary violence categories and drew one panel per category:
    # Totally arbitrary classification
    kill$type = ifelse(kill$country %in% c('Brazil', 'Colombia', 'Venezuela'),
                       'Freaking violent',
                       ifelse(kill$country %in% c('Ecuador', 'Surinam', 'Guyana'),
                              'Plain violent', 'Sort of quiet'))

    # theme() and element_text() replace opts() and theme_text(),
    # which current ggplot2 releases no longer accept
    kp2 = ggplot(kill, aes(x = year, y = rate, colour = country)) +
      geom_line() + facet_grid(. ~ type) +
      scale_y_continuous('Homicides/100,000 people') +
      scale_x_continuous('Year') +
      theme_bw() +
      theme(axis.text.x = element_text(size = 10),
            axis.text.y = element_text(size = 10),
            legend.position = 'none')

    # Explicit print() so the plot is also drawn when the script is source()d
    png('homicides-lines.png', width = 1000, height = 300)
    print(kp2)
    dev.off()
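The nested ifelse() can also be written as a named lookup vector, which may be easier to extend if the groups change. This is only a sketch of the same (arbitrary) grouping: the vector name violence and the toy countries vector are made up for the example, and 'Chile' simply stands in for any country not listed in the first two groups.

```r
# Same three arbitrary groups as above, as a named lookup vector
violence <- c(Brazil = 'Freaking violent', Colombia = 'Freaking violent',
              Venezuela = 'Freaking violent',
              Ecuador = 'Plain violent', Surinam = 'Plain violent',
              Guyana = 'Plain violent')

# Toy country column; 'Chile' is an example of an unlisted country
countries <- c('Brazil', 'Guyana', 'Chile')

# Look up each country; anything not in the table comes back as NA
type <- unname(violence[countries])
type[is.na(type)] <- 'Sort of quiet'

type  # "Freaking violent" "Plain violent" "Sort of quiet"
```

If the country column was read in as a factor, it should be wrapped in as.character() before the lookup, since indexing by a factor uses its integer codes rather than its labels.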
I hope others will download the data and provide much better alternatives to display violence. If you do, please add a link in the comments.