Rock around the data clock

[This article was first published on Opiate for the masses, and kindly contributed to R-bloggers.]

I had to visualize the hourly distribution of a KPI (think logins, registrations or purchases per hour), so the idea was: why not build a clock? It’s a nice and intuitive way to present this data. The problem is that a clock has only 12 hours, which are used twice a day, so I have to assign two data points to every “hour” (a.m. and p.m.). So assume our data looks something like this:

df <- expand.grid(hour = c(12, 1:11), period = c("am", "pm"))
df$value <- dnorm(c(seq(0.6, 1.1, 0.1), seq(-1.1, 0.6, 0.1)), 0, 0.5)
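If your raw data arrives as timestamps rather than as a ready-made table, the same hour/period layout can be derived with base R. The following is only a sketch under stated assumptions: the `logins` vector is made-up illustration data, and the names `hour24` and `counts` are hypothetical helpers that do not appear in the original post.

```r
# Hypothetical event timestamps (made up for illustration)
logins <- as.POSIXct(c("2014-10-21 00:15:00", "2014-10-21 09:30:00",
                       "2014-10-21 09:45:00", "2014-10-21 13:05:00",
                       "2014-10-21 23:59:00"), tz = "UTC")

# Hour of day on the 24-hour clock (0-23)
hour24 <- as.integer(format(logins, "%H"))

# Count events per hour, keeping hours without any events as zeros
counts <- as.data.frame(table(hour24 = factor(hour24, levels = 0:23)),
                        responseName = "value", stringsAsFactors = FALSE)
counts$hour24 <- as.integer(counts$hour24)

# Map to the 12-hour clock: 0 and 12 both become 12
counts$hour   <- ifelse(counts$hour24 %% 12 == 0, 12, counts$hour24 %% 12)
counts$period <- ifelse(counts$hour24 < 12, "am", "pm")
```

The rows of `counts` come out in the same order as `df` (12 a.m., 1 a.m., ..., 11 a.m., then 12 p.m., 1 p.m., ..., 11 p.m.), so the plotting code should work on it unchanged.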

So with ggplot we can start with some bar plots:

library(ggplot2)
hour_plot <- ggplot(df, aes(x = factor(hour), y = value, fill = period)) +
  geom_bar(stat = "identity", position = "dodge")
hour_plot

[Figure: bars_only]

Neat, but doesn’t look like a clock to me, so let’s change the coordinates to polar:

hour_plot <- hour_plot + coord_polar(theta = "x")
hour_plot

[Figure: hour_ugly]

Well, better. It looks like a clock, but it won’t win any beauty contests. Let’s rotate it slightly so that 12 sits at the top, and remove some junk.

# start = 0.26 (about pi/12) rotates the clock so that 12 ends up at the top
hour_plot <- hour_plot + coord_polar(theta = "x", start = 0.26) +
  xlab("") + ylab("") +
  theme(axis.ticks = element_blank(), axis.text.y = element_blank(),
        panel.background = element_blank(), panel.grid.major.x = element_line(colour = "grey"),
        axis.text.x = element_text(size = 25), legend.title = element_blank())
hour_plot

[Figure: hour_better]
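The start value of 0.26 appears to be roughly half of one hour's slice, measured in radians. Here is a quick check of that arithmetic, assuming (as an interpretation, not something the post states) that ggplot2 centres the first category half a slice away from the top by default, so that shifting by another half slice puts 1 at the one o'clock position and 12 at the top. The names `n_hours`, `slice` and `offset` are made up for this sketch.

```r
n_hours <- 12
slice   <- 2 * pi / n_hours   # angle covered by one hour, in radians
offset  <- slice / 2          # half a slice, i.e. pi / 12

round(offset, 4)
# 0.2618
```

Writing `start = pi / 12` instead of `start = 0.26` should give practically the same picture and makes the intent a little more visible.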

Nice! That’s the data clock. From a design and readability point of view, the transitions from a.m. to p.m. (left bar to right bar) and from p.m. to a.m. (right bar to left bar) at 11-12 o’clock could be misleading or counterintuitive. My suggestion is to guide the viewer with a color transition instead of the binary a.m./p.m. (red/blue) colors. One way to do this is to add a new variable to the data and bind the fill color to it.

df$fill <- c(1:13,12:2)
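The hard-coded vector rises from 1 at midnight to 13 at noon and falls back to 2 at 11 p.m. If you prefer to derive it from the data, the same values can be computed from the hour and period columns. This is only a sketch: `clock` and `hour24` are hypothetical names, and `clock` simply rebuilds the same grid as `df` so the snippet runs on its own.

```r
# Same layout as df: a.m. rows first, then p.m. rows
clock <- expand.grid(hour = c(12, 1:11), period = c("am", "pm"))

# Hour on the 24-hour clock (0-23); 12 a.m. is 0 and 12 p.m. is 12
hour24 <- clock$hour %% 12 + ifelse(clock$period == "pm", 12, 0)

# 1 at midnight, rising to 13 at noon, then falling back to 2 at 11 p.m.
clock$fill <- 13 - abs(12 - hour24)

# Compare with the hard-coded version
identical(as.numeric(clock$fill), as.numeric(c(1:13, 12:2)))
# TRUE
```

The advantage of computing the values is that they stay correct even if the rows are reordered or filtered before plotting.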

Now we have to adjust the aesthetics of ggplot a bit and we’re done.

hour_plot <- ggplot(df, aes(x = factor(hour), y = value, fill = fill, group = period)) +
  geom_bar(stat = "identity", position = "dodge") +
  coord_polar(theta = "x", start = 0.26) +
  xlab("") + ylab("") +
  theme(axis.ticks = element_blank(), axis.text.y = element_blank(),
        panel.background = element_blank(), panel.grid.major.x = element_line(colour = "grey"),
        axis.text.x = element_text(size = 25), legend.title = element_blank())
hour_plot

[Figure: hour_even_better]

Delete the useless legend et voilà!

# guides(fill = FALSE) is deprecated in current ggplot2; "none" does the same
hour_plot <- hour_plot + guides(fill = "none")
hour_plot

[Figure: hour_best]

Note that within each hour the a.m. bar is on the left and the p.m. bar is on the right. Sure, it’s still not perfect, because the transition between a.m. and p.m. is still rough, so I would be delighted to read your suggestions for improvement in the comments ;)

Rock around the data clock was originally published by Kirill Pomogajko at Opiate for the masses on October 21, 2014.
