# Using Dates and Times in R

February 10, 2014
(This article was first published on Noam Ross - R, and kindly contributed to R-bloggers)

Today at the Davis R Users’ Group, Bonnie Dixon gave a tutorial on the various ways to handle dates and times in R. Bonnie provided this great script which walks through essential classes, functions, and packages. Here it is piped through `knitr::spin`. The original R script can be found as a gist here.

## Date/time classes

Three date/time classes are built into R: Date, POSIXct, and POSIXlt.

### Date

This is the class to use if you have only dates, but no times, in your data.

create a date:

``````dt1 <- as.Date("2012-07-22")
dt1``````
``## [1] "2012-07-22"``

non-standard formats must be specified:

``````dt2 <- as.Date("04/20/2011", format = "%m/%d/%Y")
dt2``````
``## [1] "2011-04-20"``
``````dt3 <- as.Date("October 6, 2010", format = "%B %d, %Y")
dt3``````
``## [1] "2010-10-06"``

see list of format symbols:

``?strptime``
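The same format symbols work in both directions: `as.Date()` uses them to parse text, and `format()` uses them to write a date back out. A minimal sketch in base R (the variable names are illustrative only, and locale-dependent symbols such as `%B` are avoided so the result does not depend on your system language):

```r
# Format symbols are shared by parsing (as.Date) and printing (format).
d <- as.Date("2012-07-22")

us.style    <- format(d, "%m/%d/%Y")  # month/day/four-digit year
day.of.year <- format(d, "%j")        # day of the year, zero-padded to 3 digits

# Parsing the formatted string with the same symbols recovers the date.
round.trip <- as.Date(us.style, format = "%m/%d/%Y")

print(us.style)
print(day.of.year)
print(round.trip == d)
```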

calculations with dates:

find the difference between dates:

``dt1 - dt2``
``## Time difference of 459 days``
``difftime(dt1, dt2, units = "weeks")``
``## Time difference of 65.57 weeks``

``dt2 + 10``
``## [1] "2011-04-30"``
``dt2 - 10``
``## [1] "2011-04-10"``
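Subtracting dates returns a `difftime` object rather than a plain number. If you need the number itself (for a model or a plot, say), one option is `as.numeric()` with an explicit `units` argument, so the result does not depend on the units R happened to pick. A small sketch with illustrative variable names:

```r
dt1 <- as.Date("2012-07-22")
dt2 <- as.Date("2011-04-20")

gap <- dt1 - dt2                               # a difftime object

# Ask for the units explicitly instead of relying on the automatic choice.
gap.days  <- as.numeric(gap, units = "days")
gap.weeks <- as.numeric(gap, units = "weeks")

print(gap.days)
print(gap.weeks)
```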

create a vector of dates and find the intervals between them:

``````three.dates <- as.Date(c("2010-07-22", "2011-04-20", "2012-10-06"))
three.dates``````
``## [1] "2010-07-22" "2011-04-20" "2012-10-06"``
``diff(three.dates)``
``````## Time differences in days
## [1] 272 535``````

create a sequence of dates:

``````six.weeks <- seq(dt1, length = 6, by = "week")
six.weeks``````
``````## [1] "2012-07-22" "2012-07-29" "2012-08-05" "2012-08-12" "2012-08-19"
## [6] "2012-08-26"``````
``````six.weeks <- seq(dt1, length = 6, by = 14)
six.weeks``````
``````## [1] "2012-07-22" "2012-08-05" "2012-08-19" "2012-09-02" "2012-09-16"
## [6] "2012-09-30"``````
``````six.weeks <- seq(dt1, length = 6, by = "2 weeks")
six.weeks``````
``````## [1] "2012-07-22" "2012-08-05" "2012-08-19" "2012-09-02" "2012-09-16"
## [6] "2012-09-30"``````
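`seq()` also accepts an end date in place of a length, and other calendar units such as `"month"`. The sketch below spells out `length.out` in full (the `length` used above works through partial argument matching); the dates and variable names are chosen only for illustration:

```r
# A sequence bounded by a start and an end date, stepping by calendar month.
month.starts <- seq(as.Date("2012-07-01"), as.Date("2012-12-01"), by = "month")

# A sequence with a fixed number of elements, stepping by two days.
every.other.day <- seq(as.Date("2012-07-22"), by = "2 days", length.out = 4)

print(month.starts)
print(every.other.day)
```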

see the internal integer representation

``unclass(dt1)``
``## [1] 15543``
``dt1 - as.Date("1970-01-01")``
``## Time difference of 15543 days``
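Because a Date is stored as a count of days since 1970-01-01, the conversion also runs the other way: a day count plus an origin gives back a date. A short sketch (variable names are illustrative):

```r
# Rebuild a date from its day count.
from.count <- as.Date(15543, origin = "1970-01-01")

# The origin itself is day 0; earlier dates are negative.
origin.count <- as.numeric(as.Date("1970-01-01"))
before.count <- as.numeric(as.Date("1969-12-31"))

print(from.count)
print(c(origin.count, before.count))
```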

### POSIXct

If you have times in your data, this is usually the best class to use.

create some POSIXct objects:

``````tm1 <- as.POSIXct("2013-07-24 23:55:26")
tm1``````
``## [1] "2013-07-24 23:55:26 PDT"``
``````tm2 <- as.POSIXct("25072013 08:32:07", format = "%d%m%Y %H:%M:%S")
tm2``````
``## [1] "2013-07-25 08:32:07 PDT"``

specify the time zone:

``````tm3 <- as.POSIXct("2010-12-01 11:42:03", tz = "GMT")
tm3``````
``## [1] "2010-12-01 11:42:03 GMT"``
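The time zone of a POSIXct object only affects how the instant is displayed; the stored value does not change. A sketch of viewing the same instant in another zone with base R, assuming your system has the standard Olson time zone database (the zone name `America/Los_Angeles` is just an example):

```r
tm <- as.POSIXct("2010-12-01 11:42:03", tz = "GMT")

# Display the same instant as a clock time in Los Angeles.
la.clock <- format(tm, tz = "America/Los_Angeles", usetz = TRUE)

# Changing the tzone attribute relabels the display, not the instant.
tm.la <- tm
attr(tm.la, "tzone") <- "America/Los_Angeles"

print(la.clock)
print(tm.la == tm)
```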

some calculations with times

compare times:

``tm2 > tm1``
``## [1] TRUE``

``tm1 + 30``
``## [1] "2013-07-24 23:55:56 PDT"``
``tm1 - 30``
``## [1] "2013-07-24 23:54:56 PDT"``

find the difference between times:

``tm2 - tm1``
``## Time difference of 8.611 hours``

automatically adjusts for daylight saving time:

``as.POSIXct("2013-03-10 08:32:07") - as.POSIXct("2013-03-09 23:55:26")``
``## Time difference of 7.611 hours``
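The result above depends on the time zone of the machine that ran the script (US Pacific, where clocks moved forward on 2013-03-10). To make the effect reproducible anywhere, the sketch below sets the zone explicitly and compares it with UTC, which has no daylight saving time. Variable names are illustrative:

```r
# Same clock times, interpreted in a zone with DST and in one without.
gap.pacific <- difftime(
  as.POSIXct("2013-03-10 08:32:07", tz = "America/Los_Angeles"),
  as.POSIXct("2013-03-09 23:55:26", tz = "America/Los_Angeles"),
  units = "hours"
)
gap.utc <- difftime(
  as.POSIXct("2013-03-10 08:32:07", tz = "UTC"),
  as.POSIXct("2013-03-09 23:55:26", tz = "UTC"),
  units = "hours"
)

print(gap.pacific)  # one hour shorter: the clocks skipped an hour overnight
print(gap.utc)
```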

Get the current time (in POSIXct by default):

``Sys.time()``
``## [1] "2014-02-10 18:26:01 PST"``

see the internal integer representation:

``unclass(tm1)``
``````## [1] 1.375e+09
## attr(,"tzone")
## [1] ""``````
``difftime(tm1, as.POSIXct("1970-01-01 00:00:00", tz = "UTC"), units = "secs")``
``## Time difference of 1.375e+09 secs``

### POSIXlt

This class enables easy extraction of specific components of a time. (“ct” stands for calendar time and “lt” stands for local time. “lt” also helps one remember that POSIXlt objects are lists.)

create a time:

``````tm1.lt <- as.POSIXlt("2013-07-24 23:55:26")
tm1.lt``````
``## [1] "2013-07-24 23:55:26"``
``unclass(tm1.lt)``
``````## $sec
## [1] 26
##
## $min
## [1] 55
##
## $hour
## [1] 23
##
## $mday
## [1] 24
##
## $mon
## [1] 6
##
## $year
## [1] 113
##
## $wday
## [1] 3
##
## $yday
## [1] 204
##
## $isdst
## [1] 1``````
``unlist(tm1.lt)``
``````##   sec   min  hour  mday   mon  year  wday  yday isdst
##    26    55    23    24     6   113     3   204     1``````

extract components of a time object:

``tm1.lt$sec``
``## [1] 26``
``tm1.lt$wday``
``## [1] 3``

truncate or round off the time:

``trunc(tm1.lt, "days")``
``## [1] "2013-07-24"``
``trunc(tm1.lt, "mins")``
``## [1] "2013-07-24 23:55:00"``
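Some POSIXlt components are offsets rather than calendar values: `mon` counts from 0 (January), `year` counts from 1900, `yday` counts from 0, and `wday` counts from 0 (Sunday). A sketch of converting them to the familiar values; the time zone is fixed to UTC here only to keep the example reproducible:

```r
lt <- as.POSIXlt("2013-07-24 23:55:26", tz = "UTC")

calendar.month <- lt$mon + 1      # months are stored as 0-11
calendar.year  <- lt$year + 1900  # years are stored as years since 1900
day.of.year    <- lt$yday + 1     # days of the year are stored as 0-365

print(c(calendar.year, calendar.month, day.of.year))
print(lt$wday)                    # 0 = Sunday, so 3 = Wednesday
```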

### chron

This class is a good option when you don’t need to deal with timezones. It requires the package `chron`.

``require(chron)``
``````## Loading required package: chron
##
## Attaching package: 'chron'
##
## The following objects are masked from 'package:lubridate':
##
##     days, hours, minutes, seconds, years``````

create some times:

``````tm1.c <- as.chron("2013-07-24 23:55:26")
tm1.c``````
``## [1] (07/24/13 23:55:26)``
``````tm2.c <- as.chron("07/25/13 08:32:07", "%m/%d/%y %H:%M:%S")
tm2.c``````
``## [1] (07/25/13 08:32:07)``

extract just the date:

``dates(tm1.c)``
``````##     day
## 07/24/13``````

compare times:

``tm2.c > tm1.c``
``## [1] TRUE``

``tm1.c + 10``
``## [1] (08/03/13 23:55:26)``

calculate the difference between times:

``tm2.c - tm1.c``
``## [1] 08:36:41``
``difftime(tm2.c, tm1.c, units = "hours")``
``## Time difference of 8.611 hours``

does not adjust for daylight saving time:

``as.chron("2013-03-10 08:32:07") - as.chron("2013-03-09 23:55:26")``
``## [1] 08:36:41``

Detach the `chron` package as it will interfere with `lubridate` later in this script.

``detach("package:chron", unload = TRUE)``

### Summary of date/time classes

• When you just have dates, use Date.
• When you have times, POSIXct is usually the best,
• but POSIXlt enables easy extraction of specific components
• and chron is simplest when you don’t need to deal with timezones and daylight saving time.

## Manipulating times and dates

### lubridate

This package is a wrapper for POSIXct with more intuitive syntax.

``require(lubridate)``

create a time:

``````tm1.lub <- ymd_hms("2013-07-24 23:55:26")
tm1.lub``````
``## [1] "2013-07-24 23:55:26 UTC"``
``````tm2.lub <- mdy_hm("07/25/13 08:32")
tm2.lub``````
``## [1] "2013-07-25 08:32:00 UTC"``
``````tm3.lub <- ydm_hm("2013-25-07 4:00am")
tm3.lub``````
``## [1] "2013-07-25 04:00:00 UTC"``
``````tm4.lub <- dmy("26072013")
tm4.lub``````
``## [1] "2013-07-26 UTC"``

some manipulations: extract or reassign components:

``year(tm1.lub)``
``## [1] 2013``
``week(tm1.lub)``
``## [1] 30``
``wday(tm1.lub, label = TRUE)``
``````## [1] Wed
## Levels: Sun < Mon < Tues < Wed < Thurs < Fri < Sat``````
``hour(tm1.lub)``
``## [1] 23``
``tz(tm1.lub)``
``## [1] "UTC"``
``````second(tm2.lub) <- 7
tm2.lub``````
``## [1] "2013-07-25 08:32:07 UTC"``

converting to decimal hours can facilitate some types of calculations:

``````tm1.dechr <- hour(tm1.lub) + minute(tm1.lub)/60 + second(tm1.lub)/3600
tm1.dechr``````
``## [1] 23.92``
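If lubridate is not available, the same decimal-hour conversion can be sketched in base R by going through POSIXlt. The helper name `decimal.hours` is hypothetical and not part of any package:

```r
# Hypothetical helper: clock time of a date-time as decimal hours.
decimal.hours <- function(x) {
  lt <- as.POSIXlt(x)
  lt$hour + lt$min / 60 + lt$sec / 3600
}

tm <- as.POSIXct("2013-07-24 23:55:26", tz = "UTC")
tm.dechr <- decimal.hours(tm)
print(round(tm.dechr, 2))
```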

Lubridate distinguishes between four types of objects: instants, intervals, durations, and periods. An instant is a specific moment in time. Intervals, durations, and periods are all ways of recording time spans.

Dates and times parsed in lubridate are instants:

``is.instant(tm1.lub)``
``## [1] TRUE``

round an instant:

``round_date(tm1.lub, "minute")``
``## [1] "2013-07-24 23:55:00 UTC"``
``round_date(tm1.lub, "day")``
``## [1] "2013-07-25 UTC"``

get the current time or date as an instant:

``now()``
``## [1] "2014-02-10 18:26:02 PST"``
``today()``
``## [1] "2014-02-10"``

Note that lubridate uses the UTC time zone by default when parsing.

see an instant in a different time zone:

``with_tz(tm1.lub, "America/Los_Angeles")``
``## [1] "2013-07-24 16:55:26 PDT"``

change the time zone of an instant (keeping the same clock time):

``force_tz(tm1.lub, "America/Los_Angeles")``
``## [1] "2013-07-24 23:55:26 PDT"``

some calculations with instants. Note that the units are seconds:

``tm2.lub - tm1.lub``
``## Time difference of 8.611 hours``
``tm2.lub > tm1.lub``
``## [1] TRUE``
``tm1.lub + 30``
``## [1] "2013-07-24 23:55:56 UTC"``

An interval is the span of time that occurs between two specified instants.

``````in.bed <- as.interval(tm1.lub, tm2.lub)
in.bed``````
``## [1] 2013-07-24 23:55:26 UTC--2013-07-25 08:32:07 UTC``

Check whether a certain instant occurred within a specified interval:

``tm3.lub %within% in.bed``
``## [1] TRUE``
``tm4.lub %within% in.bed``
``## [1] FALSE``

determine whether two intervals overlap:

``````daylight <- as.interval(ymd_hm("2013-07-25 06:03"), ymd_hm("2013-07-25 20:23"))
daylight``````
``## [1] 2013-07-25 06:03:00 UTC--2013-07-25 20:23:00 UTC``
``int_overlaps(in.bed, daylight)``
``## [1] TRUE``

A duration is a time span not anchored to specific start and end times. It has an exact, fixed length, and is stored internally in seconds.

create some durations:

``````ten.minutes <- dminutes(10)
ten.minutes``````
``## [1] "600s (~10 minutes)"``
``````five.days <- ddays(5)
five.days``````
``## [1] "432000s (~5 days)"``
``````one.year <- dyears(1)
one.year``````
``## [1] "31536000s (~365 days)"``
``as.duration(in.bed)``
``## [1] "31001s (~8.61 hours)"``

arithmetic with durations:

``tm1.lub - ten.minutes``
``## [1] "2013-07-24 23:45:26 UTC"``
``five.days + dhours(12)``
``## [1] "475200s (~5.5 days)"``
``ten.minutes/as.duration(in.bed)``
``## [1] 0.01935``

A period is a time span not anchored to specific start and end times, measured in units larger than seconds whose exact lengths can vary (a day or a month is not always the same number of seconds).

create some periods:

``````three.weeks <- weeks(3)
three.weeks``````
``## [1] "21d 0H 0M 0S"``
``````four.hours <- hours(4)
four.hours``````
``## [1] "4H 0M 0S"``

arithmetic with periods:

``tm4.lub + three.weeks``
``## [1] "2013-08-16 UTC"``
``````sabbatical <- months(6) + days(12)
sabbatical``````
``## [1] "6m 12d 0H 0M 0S"``
``three.weeks/sabbatical``
``````## estimate only: convert to intervals for accuracy

## [1] 0.108``````
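The difference between a duration and a period shows up most clearly across a daylight saving time change. The sketch below assumes lubridate is installed and uses the US Pacific zone, where clocks moved forward on 2013-03-10. Adding a one-day duration adds exactly 86400 seconds, while adding a one-day period keeps the same clock time on the next day. Variable names are illustrative:

```r
library(lubridate)

before <- ymd_hms("2013-03-09 12:00:00", tz = "America/Los_Angeles")

plus.duration <- before + ddays(1)  # exactly 24 hours of elapsed time
plus.period   <- before + days(1)   # same clock time on the following day

print(plus.duration)
print(plus.period)
print(difftime(plus.duration, plus.period, units = "hours"))
```

Which one you want depends on the question: durations suit elapsed-time calculations, periods suit schedules tied to the clock.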

### Calculating mean clock times

Say we have a vector of clock times in decimal hours, and we want to calculate the mean clock time.

``````bed.times <- c(23.9, 0.5, 22.7, 0.1, 23.3, 1.2, 23.6)
bed.times``````
``## [1] 23.9  0.5 22.7  0.1 23.3  1.2 23.6``
``mean(bed.times)  # doesn't work``
``## [1] 13.61``

The clock has a circular scale, which ends where it begins, so we need to use circular statistics. (For more info on circular statistics see http://en.wikipedia.org/wiki/Mean_of_circular_quantities.)

Load the `psych` package and use its `circadian.mean` function:

``````require(psych)
circadian.mean(bed.times)``````
``## [1] 23.9``
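To see what a circular mean involves, here is a minimal base-R sketch of the standard approach: convert each clock time to an angle on a 24-hour circle, average the sines and cosines, and convert the resulting angle back to hours. The function name `circular.mean.hours` is hypothetical, and this is not necessarily how `psych` implements it internally, though it should give a value close to 23.9 for these data:

```r
# Hypothetical helper: mean of clock times (in decimal hours) on a 24-hour circle.
circular.mean.hours <- function(hours) {
  angles <- hours / 24 * 2 * pi                       # hours -> radians
  mean.angle <- atan2(mean(sin(angles)), mean(cos(angles)))
  (mean.angle / (2 * pi) * 24) %% 24                  # radians -> hours in [0, 24)
}

bed.times <- c(23.9, 0.5, 22.7, 0.1, 23.3, 1.2, 23.6)
by.hand <- circular.mean.hours(bed.times)
print(round(by.hand, 1))
```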

### An example of using times and dates in a data frame

Here is a data frame with a week of hypothetical times of going to bed and getting up for one person, and the total amount of sleep time obtained each night according to a sleep monitoring device.

``````sleep <- data.frame(bed.time = ymd_hms("2013-09-01 23:05:24", "2013-09-02 22:51:09",
"2013-09-04 00:09:16", "2013-09-04 23:43:31", "2013-09-06 00:17:41", "2013-09-06 22:42:27",
"2013-09-08 00:22:27"), rise.time = ymd_hms("2013-09-02 08:03:29", "2013-09-03 07:34:21",
"2013-09-04 07:45:06", "2013-09-05 07:07:17", "2013-09-06 08:17:13", "2013-09-07 06:52:11",
"2013-09-08 07:15:19"), sleep.time = dhours(c(6.74, 7.92, 7.01, 6.23, 6.34,
7.42, 6.45)))
sleep``````
``````##              bed.time           rise.time           sleep.time
## 1 2013-09-01 23:05:24 2013-09-02 08:03:29 24264s (~6.74 hours)
## 2 2013-09-02 22:51:09 2013-09-03 07:34:21 28512s (~7.92 hours)
## 3 2013-09-04 00:09:16 2013-09-04 07:45:06 25236s (~7.01 hours)
## 4 2013-09-04 23:43:31 2013-09-05 07:07:17 22428s (~6.23 hours)
## 5 2013-09-06 00:17:41 2013-09-06 08:17:13 22824s (~6.34 hours)
## 6 2013-09-06 22:42:27 2013-09-07 06:52:11 26712s (~7.42 hours)
## 7 2013-09-08 00:22:27 2013-09-08 07:15:19 23220s (~6.45 hours)``````

We want to calculate sleep efficiency, the percent of time in bed spent asleep.

``````sleep$efficiency <- round(sleep$sleep.time/(sleep$rise.time - sleep$bed.time) *
100, 1)
sleep``````
``````##              bed.time           rise.time           sleep.time efficiency
## 1 2013-09-01 23:05:24 2013-09-02 08:03:29 24264s (~6.74 hours)       75.2
## 2 2013-09-02 22:51:09 2013-09-03 07:34:21 28512s (~7.92 hours)       90.8
## 3 2013-09-04 00:09:16 2013-09-04 07:45:06 25236s (~7.01 hours)       92.3
## 4 2013-09-04 23:43:31 2013-09-05 07:07:17 22428s (~6.23 hours)       84.2
## 5 2013-09-06 00:17:41 2013-09-06 08:17:13 22824s (~6.34 hours)       79.3
## 6 2013-09-06 22:42:27 2013-09-07 06:52:11 26712s (~7.42 hours)       90.9
## 7 2013-09-08 00:22:27 2013-09-08 07:15:19 23220s (~6.45 hours)       93.7``````
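The efficiency calculation divides a time asleep by a time in bed, so both need to be in the same units. The sketch below redoes the first two nights in base R with the units stated explicitly (seconds on both sides), which avoids relying on the units that date-time subtraction chooses automatically. Variable names are illustrative:

```r
bed  <- as.POSIXct(c("2013-09-01 23:05:24", "2013-09-02 22:51:09"), tz = "UTC")
rise <- as.POSIXct(c("2013-09-02 08:03:29", "2013-09-03 07:34:21"), tz = "UTC")

in.bed.secs <- as.numeric(difftime(rise, bed, units = "secs"))  # time in bed
asleep.secs <- c(6.74, 7.92) * 3600                             # hours -> seconds

efficiency <- round(asleep.secs / in.bed.secs * 100, 1)
print(efficiency)
```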

Now let’s calculate the mean of each column:

``colMeans(sleep)  # doesn't work``
``## Error: 'x' must be numeric``
``circadian.mean(hour(sleep$bed.time) + minute(sleep$bed.time)/60 + second(sleep$bed.time)/3600)``
``## [1] 23.6``
``circadian.mean(hour(sleep$rise.time) + minute(sleep$rise.time)/60 + second(sleep$rise.time)/3600)``
``## [1] 7.559``
``mean(sleep$sleep.time)/3600``
``## [1] 6.873``
``mean(sleep$efficiency)``
``## [1] 86.63``

We can also plot sleep duration and efficiency across the week:

``````par(mar = c(5, 4, 4, 4))
plot(round_date(sleep$rise.time, "day"), sleep$efficiency, type = "o", col = "blue",
xlab = "Morning", ylab = NA)
par(new = TRUE)
plot(round_date(sleep$rise.time, "day"), sleep$sleep.time/3600, type = "o",
col = "red", axes = FALSE, ylab = NA, xlab = NA)
axis(side = 4)
mtext(side = 4, line = 2.5, col = "red", "Sleep duration")
mtext(side = 2, line = 2.5, col = "blue", "Sleep efficiency")``````

## More resources on times and dates

date and time tutorials for R:

lubridate:

time zone and daylight saving time info:
