Plotting “time of day” data using ggplot2

[This article was first published on What You're Doing Is Rather Desperate » R, and kindly contributed to R-bloggers.]

William asks:

How can I make a graph that looks like this, “tweet density” style, showing time intervals?

He then helpfully describes his input data: a CSV file with headers “time started, time finished, date”.

Here’s a simple CSV file, tasks.csv:

task,date,start,end
task1,2010-03-05,09:00:00,13:00:00
task2,2010-03-06,10:00:00,15:00:00
task3,2010-03-06,11:00:00,18:00:00
task4,2010-03-07,08:00:00,11:00:00
task5,2010-03-08,14:00:00,17:00:00
task6,2010-03-09,12:00:00,16:00:00
task7,2010-03-10,14:00:00,19:00:00
task8,2010-03-11,09:30:00,13:30:00

Read the file into R, calculate the weekday, and order the weekday factor levels from Sunday through Saturday:

tasks <- read.csv("tasks.csv", header = TRUE)
# day of week
tasks$day <- weekdays(strptime(tasks$date, "%Y-%m-%d"))
week      <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
tasks$day <- factor(tasks$day, levels = week)
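
One caveat: weekdays() returns names in the current locale, so on a non-English system they would not match the English week vector and the factor would come out as NA. A locale-independent sketch, assuming it is acceptable to index into week with the wday field of POSIXlt (where 0 means Sunday); the dates vector is example data only:

```r
# English weekday names, Sunday first, matching POSIXlt's wday numbering (0 = Sunday)
week  <- c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")
dates <- c("2010-03-05", "2010-03-06", "2010-03-07")  # example data

# wday does not depend on the locale, so index into the English names directly
wday <- as.POSIXlt(as.Date(dates))$wday
day  <- factor(week[wday + 1], levels = week)
```

The same two lines should work on tasks$date in place of dates.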

Convert the start and end times to decimal hours. I’m not very familiar with the as.POSIX… functions, so I’m sure that there’s a more elegant way to do this:

# convert time to decimal hours
tasks$start.ct   <- as.POSIXct(paste(tasks$date, tasks$start, sep = " "))
tasks$end.ct     <- as.POSIXct(paste(tasks$date, tasks$end, sep = " "))
tasks$start.hour <- as.POSIXlt(tasks$start.ct)$hour + as.POSIXlt(tasks$start.ct)$min/60 + as.POSIXlt(tasks$start.ct)$sec/3600
tasks$end.hour   <- as.POSIXlt(tasks$end.ct)$hour + as.POSIXlt(tasks$end.ct)$min/60 + as.POSIXlt(tasks$end.ct)$sec/3600
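
A possibly simpler route is to skip the date-time classes and split the time strings directly, which also sidesteps time zones. This is only a sketch: to_decimal_hours is a hypothetical helper, not part of base R, and it assumes every value has the form HH:MM:SS.

```r
# Hypothetical helper: convert "HH:MM:SS" strings to decimal hours
to_decimal_hours <- function(x) {
  # as.character() in case the column was read as a factor
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)
    p[1] + p[2] / 60 + p[3] / 3600
  }, numeric(1))
}

to_decimal_hours(c("09:00:00", "09:30:00", "13:30:00"))
```

With this helper, tasks$start.hour could be computed as to_decimal_hours(tasks$start), and likewise for the end times.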

We’re going to plot each task's duration as a horizontal rectangle. If there is more than one task per day, we need to offset the rectangles vertically so that they don’t overlap.

# offset tasks if > 1 per day
tasks$ymin <- 0
counts <- table(tasks$day)
for (day in names(counts)) {
  if (counts[[day]] > 1) {
    # row positions of this day's tasks (does not rely on row names)
    idx <- which(tasks$day == day)
    # stack rectangles 1.2 units apart: height 1 plus a 0.2 gap
    tasks$ymin[idx] <- (seq_along(idx) - 1) * 1.2
  }
}
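
The same offsets can probably be computed without an explicit loop. A minimal sketch using ave() from base R, which applies a function within each group; the small data frame and the rank_in_day name are illustrative assumptions:

```r
# Example data: two tasks fall on Saturday, the rest on distinct days
tasks <- data.frame(
  task = c("task1", "task2", "task3", "task4"),
  day  = c("Friday", "Saturday", "Saturday", "Sunday")
)

# Number the tasks within each day (1, 2, ...), then space them 1.2 units apart
rank_in_day <- ave(seq_len(nrow(tasks)), tasks$day, FUN = seq_along)
tasks$ymin  <- (rank_in_day - 1) * 1.2
```

Days with a single task get an offset of 0, so no special case is needed for them.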

Finally, call ggplot with the rectangle geom plus a bunch of options to colour the rectangles (by task), facet the plot (by day) and clean up, rescale and label the axes:

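# plot
library(ggplot2)
png(filename = "tasks.png", width = 640, height = 480)
# theme() and element_blank() replace the removed opts() and theme_blank()
p <- ggplot(tasks, aes(xmin = start.hour, xmax = end.hour, ymin = ymin, ymax = ymin + 1, fill = factor(task))) +
  geom_rect() +
  facet_grid(day ~ .) +
  theme(axis.text.y = element_blank(), axis.ticks = element_blank()) +
  xlim(0, 23) + xlab("time of day")
print(p)  # explicit print so the plot is also drawn when the script is source()d
dev.off()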
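
Since the x axis is in decimal hours, the tick labels read as plain numbers. One way to show clock times instead is a small labelling function; hour_label below is a hypothetical helper written in base R, not something ggplot2 provides:

```r
# Hypothetical helper: format decimal hours as "HH:MM" for axis labels
hour_label <- function(h) {
  mins <- round(h * 60)  # work in whole minutes to avoid labels like "09:60"
  sprintf("%02d:%02d", as.integer(mins %/% 60), as.integer(mins %% 60))
}

hour_label(c(0, 9.5, 13.25, 23))
```

Using scale_x_continuous(limits = c(0, 23), labels = hour_label) in place of xlim(0, 23) should apply it to the plot.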

Result:


