Revisiting homicide rates

February 10, 2012

(This article was first published on Quantum Forest » rblogs, and kindly contributed to R-bloggers)

A pint of R plotted an interesting dataset: intentional homicides in South America. I thought the graphs were pretty, but I was unhappy with the way the plots conveyed the information: what matters is relative risk, and the raw number of homicides is misleading because it also depends on country population (this problem often comes up in our discussions in Stats Chat).
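To make the count-versus-rate point concrete, here is a minimal sketch with two made-up countries. The figures and the names `toy`, `homicides` and `population` are hypothetical and do not come from the dataset; they only illustrate how a rate per 100,000 people is computed.

```r
# Hypothetical figures, for illustration only (not taken from the dataset)
toy = data.frame(country = c('A', 'B'),
                 homicides = c(5000, 500),
                 population = c(50e6, 2e6))

# Homicides per 100,000 people put countries of different size on one scale
toy$rate = toy$homicides / toy$population * 1e5

# Sort from highest to lowest rate
toy[order(-toy$rate), ]
```

Country A has ten times as many homicides as country B, yet B has the higher rate (25 versus 10 per 100,000), which is the comparison a plot of raw counts hides.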

Instead of just complaining I decided to try a few alternatives (disclaimer: I do not have a good eye for colors or design; I am only looking at ways that could better show relative risk). I downloaded the MS Excel file, which contained a lot of information from other countries, and extracted only the information relevant to these plots, which you can obtain here: homicides.csv (4 KB). Some quick code produces the following graph:


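library(ggplot2)

# Read the extracted data; the plots below use the columns year, country and rate
kill = read.csv('homicides.csv', header = TRUE)

kp = ggplot(kill, aes(x = year, y = country, fill = rate))

# Colors coming from
kp = kp + geom_tile() + scale_x_continuous(name = 'Year', expand = c(0, 0)) +
     scale_y_discrete(name = 'Country', expand = c(0, 0)) +
     scale_fill_gradient(low = 'white', high = 'steelblue', name = 'Homicide rate') +
     theme_bw() +
     # Hide grid lines (theme() and element_blank() in current ggplot2 versions)
     theme(panel.grid.major = element_blank(),
           panel.grid.minor = element_blank())

# Draw the plot on the png device, then close the device so the file is written
png('homicides-tile.png', width = 500, height = 500)
print(kp)
dev.off()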

Tile graph for homicides.
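One possible refinement of the tile graph, offered only as a sketch: ordering the rows by mean rate, so that the most and least violent countries sit at opposite ends. The data frame `toy` below is made up and only mimics the three columns used above (year, country, rate); `ordered_tiles` is a hypothetical name.

```r
library(ggplot2)

# Toy data with the same three columns as homicides.csv (values are made up)
toy = expand.grid(year = 2000:2002, country = c('A', 'B', 'C'))
toy$rate = c(5, 6, 7, 30, 32, 31, 12, 11, 13)

# Reorder the country factor by mean rate, so rows run from lowest to highest
toy$country = reorder(toy$country, toy$rate, FUN = mean)

ordered_tiles = ggplot(toy, aes(x = year, y = country, fill = rate)) +
  geom_tile() +
  scale_fill_gradient(low = 'white', high = 'steelblue', name = 'Homicide rate') +
  theme_bw()
print(ordered_tiles)
```

Applying the same `reorder()` call to `kill$country` before building the plot should give the equivalent ordering for the real data, assuming the rate column has no missing values.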

It is also possible to use a line graph, but it quickly gets very messy, so I created totally arbitrary violence categories:

# Totally arbitrary classification
kill$type = ifelse(kill$country %in% c('Brazil', 'Colombia', 'Venezuela'),
                   'Freaking violent',
                   ifelse(kill$country %in% c('Ecuador', 'Surinam', 'Guyana'),
                          'Plain violent',
                          'Sort of quiet'))

kp2 = ggplot(kill, aes(x = year, y = rate, colour = country))

png('homicides-lines.png', width = 1000, height = 300)
# One panel per violence category; print() so the plot reaches the png device
print(kp2 + geom_line() + facet_grid(. ~ type) +
      scale_y_continuous('Homicides/100,000 people') +
      scale_x_continuous('Year') + theme_bw() +
      theme(axis.text.x = element_text(size = 10),
            axis.text.y = element_text(size = 10),
            legend.position = 'none'))
dev.off()

Another view, which still requires labeling countries.
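The missing country labels could be added in several ways; one rough sketch is to write each country's name next to its last observation with `geom_text()`. The example uses made-up data with the same columns as homicides.csv, and `toy`, `last_obs` and `labelled` are hypothetical names.

```r
library(ggplot2)

# Toy data with the same columns as homicides.csv (values are made up)
toy = expand.grid(year = 2000:2003, country = c('A', 'B', 'C'),
                  stringsAsFactors = FALSE)
toy$rate = c(5, 6, 7, 6, 30, 32, 31, 35, 12, 11, 13, 12)

# Keep each country's last observed year: one label position per line
last_obs = do.call(rbind, lapply(split(toy, toy$country),
                                 function(d) d[which.max(d$year), ]))

labelled = ggplot(toy, aes(x = year, y = rate, colour = country)) +
  geom_line() +
  # Left-align each label just to the right of the end of its line
  geom_text(data = last_obs, aes(label = country), hjust = 0, nudge_x = 0.1) +
  # Widen the x axis a little so the labels are not clipped
  expand_limits(x = max(toy$year) + 1) +
  scale_x_continuous('Year') +
  scale_y_continuous('Homicides/100,000 people') +
  theme_bw() +
  theme(legend.position = 'none')
print(labelled)
```

With the real data, building `last_obs` from `kill` after the `type` column has been created keeps that column in the subset, so the labels should fall into the matching facet.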

I hope others will download the data and provide much better alternatives to display violence. If you do, please add a link in the comments.
