Plot Weekly or Monthly Totals in R

[This article was first published on Mollie's Research Blog, and kindly contributed to R-bloggers.]

When plotting time series data, you might want to bin the values so that each data point corresponds to the sum for a given month or week. This post shows an easy way to use cut and ggplot2's stat_summary to plot weekly or monthly totals in R without reorganizing the data into a second data frame.

Let’s start with a simple sample data set with a series of dates and quantities:

library(ggplot2)
library(scales)

# load data:
log <- data.frame(Date = c("2013/05/25","2013/05/28","2013/05/31","2013/06/01","2013/06/02","2013/06/05","2013/06/07"), 
  Quantity = c(9,1,15,4,5,17,18))

log
str(log)

> log
        Date Quantity
1 2013/05/25        9
2 2013/05/28        1
3 2013/05/31       15
4 2013/06/01        4
5 2013/06/02        5
6 2013/06/05       17
7 2013/06/07       18

> str(log)
'data.frame': 7 obs. of  2 variables:
 $ Date    : Factor w/ 7 levels "2013/05/25","2013/05/28",..: 1 2 3 4 5 6 7
 $ Quantity: num  9 1 15 4 5 17 18


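Next, if the dates are not already stored as Date objects, we need to convert them. Here they are a factor (in R 4.0.0 and later, where data.frame no longer converts strings to factors by default, they would be a character vector instead); as.Date handles both: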

# convert date variable from factor to date format:
log$Date <- as.Date(log$Date,
  format = "%Y/%m/%d") # format string must match the input text, e.g. 2013/05/25
str(log)

> str(log)
'data.frame': 7 obs. of  2 variables:
 $ Date    : Date, format: "2013-05-25" "2013-05-28" ...
 $ Quantity: num  9 1 15 4 5 17 18
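
The format string passed to as.Date has to match the layout of the input text. As a rough sketch of a few common numeric layouts (the input strings below are made up for illustration and are not part of the sample data), the same date can be parsed like this:

```r
# Each call parses a different text layout of the same date, 25 May 2013.
# The input strings are illustrative only.
d1 <- as.Date("2013/05/25", format = "%Y/%m/%d") # 4-digit year, month, day
d2 <- as.Date("25.05.2013", format = "%d.%m.%Y") # day first, dot separators
d3 <- as.Date("05/25/13",   format = "%m/%d/%y") # month first, 2-digit year

# A format that does not match the text yields NA rather than an error.
bad <- as.Date("2013/05/25", format = "%d-%m-%Y")

print(c(d1, d2, d3))
print(bad)
```

Checking for NA values after the conversion is a cheap way to catch a format string that does not fit the data.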

Next we need to create variables stating the week and month of each observation. For weeks, cut's start.on.monday argument lets you begin each week on either Monday (the default) or Sunday.

# create variables of the week and month of each observation:
log$Month <- as.Date(cut(log$Date,
  breaks = "month"))
log$Week <- as.Date(cut(log$Date,
  breaks = "week",
  start.on.monday = FALSE)) # changes weekly break point to Sunday
log

> log
        Date Quantity      Month       Week
1 2013-05-25        9 2013-05-01 2013-05-19
2 2013-05-28        1 2013-05-01 2013-05-26
3 2013-05-31       15 2013-05-01 2013-05-26
4 2013-06-01        4 2013-06-01 2013-05-26
5 2013-06-02        5 2013-06-01 2013-06-02
6 2013-06-05       17 2013-06-01 2013-06-02
7 2013-06-07       18 2013-06-01 2013-06-02
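
To see what the start.on.monday choice changes, here is a small side-by-side sketch. It rebuilds the sample dates so that it runs on its own; the names dates, week_sun, week_mon, and comparison are introduced here for illustration only.

```r
# Same dates as the sample data, rebuilt so the snippet is self-contained.
dates <- as.Date(c("2013-05-25", "2013-05-28", "2013-05-31", "2013-06-01",
                   "2013-06-02", "2013-06-05", "2013-06-07"))

# Weeks beginning on Sunday versus the default, weeks beginning on Monday.
week_sun <- as.Date(cut(dates, breaks = "week", start.on.monday = FALSE))
week_mon <- as.Date(cut(dates, breaks = "week"))

comparison <- data.frame(Date = dates,
                         Weekday = format(dates, "%u"), # 1 = Monday ... 7 = Sunday
                         WeekSun = week_sun,
                         WeekMon = week_mon)
print(comparison)
```

In this data set only the Sunday observation (2013-06-02) moves between groups: with Sunday-based weeks it opens a new week, while with Monday-based weeks it closes the week that began on 2013-05-27.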

Finally, we can create either a line or bar plot of the data by month and by week, using stat_summary to sum up the values associated with each week or month:

# graph by month:
ggplot(data = log,
  aes(Month, Quantity)) +
  stat_summary(fun = sum, # adds up all observations for the month (fun.y before ggplot2 3.3.0)
    geom = "bar") + # or "line"
  scale_x_date(
    labels = date_format("%Y-%m"),
    breaks = "1 month") # custom x-axis labels

[Figure: Time series plot, binned by month]


# graph by week:
ggplot(data = log,
  aes(Week, Quantity)) +
  stat_summary(fun = sum, # adds up all observations for the week (fun.y before ggplot2 3.3.0)
    geom = "bar") + # or "line"
  scale_x_date(
    labels = date_format("%Y-%m-%d"),
    breaks = "1 week") # custom x-axis labels

[Figure: Time series plot, totaled by week]
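
The plots do not need a second data frame, but if you want to sanity-check the bar heights, the same sums can be computed directly with base R's aggregate. This is only a verification sketch; the name log_check is hypothetical and the sample data is rebuilt so the snippet runs on its own.

```r
# Rebuild the sample data; `log_check` is an illustrative name.
log_check <- data.frame(
  Date = as.Date(c("2013-05-25", "2013-05-28", "2013-05-31", "2013-06-01",
                   "2013-06-02", "2013-06-05", "2013-06-07")),
  Quantity = c(9, 1, 15, 4, 5, 17, 18))

# Same binning as in the post: first day of the month, Sunday-based weeks.
log_check$Month <- as.Date(cut(log_check$Date, breaks = "month"))
log_check$Week <- as.Date(cut(log_check$Date, breaks = "week",
  start.on.monday = FALSE))

# Sum Quantity within each bin; these should match the bar heights.
month_totals <- aggregate(Quantity ~ Month, data = log_check, FUN = sum)
week_totals <- aggregate(Quantity ~ Week, data = log_check, FUN = sum)

print(month_totals)
print(week_totals)
```

For this data the monthly totals are 25 for May and 44 for June, and the weekly totals are 9, 20, and 40.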


The full code is available in a gist.
