Revisiting homicide rates

February 10, 2012
(This article was first published on Quantum Forest » rblogs, and kindly contributed to R-bloggers)

A pint of R plotted an interesting dataset: intentional homicides in South America. I thought the graphs were pretty, but I was unhappy with the way the plots conveyed information. What matters here is relative risk, and the raw number of homicides is misleading because it also depends on each country's population (this problem often comes up in our discussions in Stats Chat).
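To make the counts-versus-rates point concrete, here is a minimal sketch with two made-up countries. The numbers are hypothetical and do not come from the dataset; they only show how a country with far fewer homicides can still carry a much higher risk per inhabitant.

```r
# Hypothetical figures for illustration only; not taken from homicides.csv
toy <- data.frame(
  country    = c('A', 'B'),
  homicides  = c(5000, 500),
  population = c(100e6, 2e6)
)

# Homicides per 100,000 inhabitants
toy$rate <- toy$homicides / toy$population * 1e5

# Sort from highest to lowest rate
toy[order(-toy$rate), ]
```

Country A has ten times as many homicides as country B, yet B's rate is five times higher, which is the comparison a plot of raw counts hides.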

Instead of just complaining I decided to try a few alternatives (disclaimer: I do not have a good eye for colors or design; I am only looking for ways to better show relative risk). I therefore downloaded the MS Excel file, which contained a lot of information for other countries, and extracted only the information relevant to these plots, which you can obtain here: homicides.csv (4 KB). Some quick code produces the following graph:

require(ggplot2)

setwd('~/Dropbox/quantumforest')
kill = read.csv('homicides.csv', header = TRUE)

kp = ggplot(kill, aes(x = year, y = country, fill = rate))

# Colors coming from
# http://learnr.wordpress.com/2010/01/26/ggplot2-quick-heatmap-plotting/
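# theme() and element_blank() replace opts() and theme_line(),
# which are no longer available in current versions of ggplot2
kp = kp + geom_tile() + scale_x_continuous(name = 'Year', expand = c(0, 0)) +
     scale_y_discrete(name = 'Country', expand = c(0, 0)) +
     scale_fill_gradient(low = 'white', high = 'steelblue', name = 'Homicide rate') +
     theme_bw() +
     theme(panel.grid.major = element_blank(),
           panel.grid.minor = element_blank())

# The assignment above draws nothing, so print the plot while the device is open
png('homicides-tile.png', width = 500, height = 500)
print(kp)
dev.off()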

Tile graph for homicides.
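One tweak that might make relative risk easier to read off a tile graph is to order the rows by average rate instead of alphabetically. The sketch below is not what produced the figure above; it uses a made-up data frame with the same column names as homicides.csv (country, year, rate) so that it runs on its own.

```r
library(ggplot2)

# Made-up stand-in for homicides.csv; the values are purely illustrative
toy <- expand.grid(country = c('A', 'B', 'C'), year = 2000:2004,
                   stringsAsFactors = FALSE)
base_rate <- c(A = 10, B = 40, C = 25)
toy$rate <- unname(base_rate[toy$country]) + (toy$year - 2000)

# reorder() sorts the factor levels by mean rate (ascending), so the
# country with the highest average rate is drawn at the top of the y axis
toy$country <- reorder(toy$country, toy$rate, FUN = mean)

p <- ggplot(toy, aes(x = year, y = country, fill = rate)) +
  geom_tile() +
  scale_fill_gradient(low = 'white', high = 'steelblue',
                      name = 'Homicide rate') +
  theme_bw()
print(p)
```

With the real data, the same reorder() call on kill$country before building the plot should be enough, assuming the rate column has no missing values.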

It is also possible to use a line graph, but it quickly gets very messy, so I split the countries into totally arbitrary violence categories and drew one panel per category:

# Totally arbitrary classification
kill$type = ifelse(kill$country %in% c('Brazil', 'Colombia', 'Venezuela'),
                   'Freaking violent',
                   ifelse(kill$country %in% c('Ecuador', 'Surinam', 'Guyana'),
                          'Plain violent',
                          'Sort of quiet'))

kp2 = ggplot(kill, aes(x = year, y = rate, colour = country))

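png('homicides-lines.png', width = 1000, height = 300)
# print() makes sure the plot is drawn even when the script is run via source();
# theme() and element_text() replace the older opts() and theme_text()
print(kp2 + geom_line() + facet_grid(. ~ type) +
      scale_y_continuous('Homicides/100,000 people') +
      scale_x_continuous('Year') + theme_bw() +
      theme(axis.text.x = element_text(size = 10),
            axis.text.y = element_text(size = 10),
            legend.position = 'none'))
dev.off()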

Another view, which still requires labeling countries.
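One possible way to add the missing labels is to write each country's name next to the last point of its line. This is only a sketch on made-up data with the same column names as homicides.csv; the helper data frame `last` is my own name, not part of the original code.

```r
library(ggplot2)

# Made-up stand-in for homicides.csv; the values are purely illustrative
toy <- expand.grid(country = c('A', 'B', 'C'), year = 2000:2004,
                   stringsAsFactors = FALSE)
base_rate <- c(A = 10, B = 40, C = 25)
toy$rate <- unname(base_rate[toy$country]) + (toy$year - 2000)

# One row per country: its most recent observation, used to place the label
last <- do.call(rbind, lapply(split(toy, toy$country),
                              function(d) d[which.max(d$year), ]))

p <- ggplot(toy, aes(x = year, y = rate, colour = country)) +
  geom_line() +
  # Label each line just to the right of its final point
  geom_text(data = last, aes(label = country),
            hjust = 0, nudge_x = 0.1, size = 3) +
  # Leave room on the right so the labels are not clipped
  expand_limits(x = max(toy$year) + 1) +
  scale_x_continuous('Year') +
  scale_y_continuous('Homicides/100,000 people') +
  theme_bw() +
  theme(legend.position = 'none')
print(p)
```

If the plot is faceted by type, the label data frame needs the type column as well; taking the subset after creating kill$type would carry it along.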

I hope others will download the data and provide much better alternatives to display violence. If you do, please add a link in the comments.
