**tradeblotter » R**, and kindly contributed to R-bloggers)

I have different sets of monthly data that I want to align and evaluate once a month. The different sources report the timestamp of the monthly data differently – one reports the date without the day, another as the last business day, and another as the last day of the month. For what I want to do, I’m content to align the data to the last day of the month. How do I do that in R?

The solution for this isn’t obvious in Excel or OpenOffice, either, but I thought it would be at least similarly simple in R. I’d looked around, and the best solution I’d found on r-help was something of a kludge:

index(x) = as.POSIXct(as.Date(as.yearmon(index(x)),frac=1), tz="UTC")

I said it wasn’t pretty, didn’t I? You are reading that right – convert the index first to `yearmon`

, then to `Date`

(with `frac`

representing the fraction of a month with a number between 1 and 0), then finally to `POSIXct`

.

That’s worked fine for me until I recently found strange, eight-second misalignments between my data sets that seemed to be caused by the transformation above. Timezone issues, perhaps? On Jeff Ryan’s advice to keep all my date transformations in `POSIX`

, I finally found a better solution in r-help that had been posted by Whit Armstrong years and years ago. I’m not sure why I didn’t find it earlier, but his solution was somewhat more general than what I have been looking for. I’ve stripped his answer down a bit to the following utility function:

eom <- function(date) { # date character string containing POSIXct date date.lt <- as.POSIXlt(date) # add a month, then subtract a day: mon <- date.lt$mon + 2 year <- date.lt$year year <- year + as.integer(mon==13) # if month was December add a year mon[mon==13] <- 1 iso = ISOdate(1900+year, mon, 1, hour=0, tz=attr(date,"tz")) result = as.POSIXct(iso) - 86400 # subtract one day result + (as.POSIXlt(iso)$isdst - as.POSIXlt(result)$isdst)*3600 }

This wasn’t as simple as I was hoping for, but once functionalized it becomes simple. The premise is the same as the spreadsheet solution – add a month then subtract a day.

# Whit's example x <- seq(as.POSIXct("2001-01-10"),as.POSIXct("2005-12-10"),by="months") data.frame(before=x,after=eom(x)) before after 1 2001-01-10 2001-01-31 2 2001-02-10 2001-02-28 3 2001-03-10 2001-03-31 4 2001-04-10 2001-04-30 5 2001-05-10 2001-05-31 6 2001-06-10 2001-06-30 ... snip ...

Ah, much better. Thanks, Whit!

**leave a comment**for the author, please follow the link and comment on their blog:

**tradeblotter » R**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...