# Find the last day of the month

October 17, 2014
By

I have different sets of monthly data that I want to align and evaluate once a month. The different sources report the timestamp of the monthly data differently – one reports the date without the day, another as the last business day, and another as the last day of the month. For what I want to do, I’m content to align the data to the last day of the month. How do I do that in R?

The solution for this isn’t obvious in Excel or OpenOffice, either, but I thought it would be at least similarly simple in R. I’d looked around, and the best solution I’d found on r-help was something of a kludge:

```index(x) = as.POSIXct(as.Date(as.yearmon(index(x)),frac=1), tz="UTC")
```

I said it wasn’t pretty, didn’t I? You are reading that right – convert the index first to `yearmon`, then to `Date` (with `frac` representing the fraction of a month with a number between 1 and 0), then finally to `POSIXct`.

That’s worked fine for me until I recently found strange, eight-second misalignments between my data sets that seemed to be caused by the transformation above. Timezone issues, perhaps? On Jeff Ryan’s advice to keep all my date transformations in `POSIX`, I finally found a better solution in r-help that had been posted by Whit Armstrong years and years ago. I’m not sure why I didn’t find it earlier, but his solution was somewhat more general than what I have been looking for. I’ve stripped his answer down a bit to the following utility function:

```eom <- function(date) {
# date character string containing POSIXct date
date.lt <- as.POSIXlt(date) # add a month, then subtract a day:
mon <- date.lt\$mon + 2
year <- date.lt\$year
year <- year + as.integer(mon==13) # if month was December add a year
mon[mon==13] <- 1
iso = ISOdate(1900+year, mon, 1, hour=0, tz=attr(date,"tz"))
result = as.POSIXct(iso) - 86400 # subtract one day
result + (as.POSIXlt(iso)\$isdst - as.POSIXlt(result)\$isdst)*3600
}
```

This wasn’t as simple as I was hoping for, but once functionalized it becomes simple. The premise is the same as the spreadsheet solution – add a month then subtract a day.

```# Whit's example
x <- seq(as.POSIXct("2001-01-10"),as.POSIXct("2005-12-10"),by="months")
data.frame(before=x,after=eom(x))
before      after
1  2001-01-10 2001-01-31
2  2001-02-10 2001-02-28
3  2001-03-10 2001-03-31
4  2001-04-10 2001-04-30
5  2001-05-10 2001-05-31
6  2001-06-10 2001-06-30
... snip ...
```

Ah, much better. Thanks, Whit!

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...