Forecasting with long seasonal periods

September 28, 2010

(This article was first published on Research tips » R, and kindly contributed to R-bloggers)

I am often asked how to fit an ARIMA or ETS model with data having a long seasonal period such as 365 for daily data or 48 for half-hourly data. Generally, seasonal versions of ARIMA and ETS models are designed for shorter periods such as 12 for monthly data or 4 for quarterly data.

The ets() function in the forecast package restricts seasonality to be a maximum period of 24 to allow hourly data but not data with a larger seasonal frequency. The problem is that there are m-1 parameters to be estimated for the initial seasonal states where m is the seasonal period. So for large m, the estimation becomes almost impossible.

The arima() function will allow a seasonal period up to m=350 but in practice will usually run out of memory whenever the seasonal period is more than about 200. I haven’t dug into the code to find out why this is the case — theoretically it would be possible to have any length of seasonality as the number of parameters to be estimated does not depend on the seasonal order. However, seasonal differencing of very high order does not make a lot of sense — for daily data it involves comparing what happened today with what happened exactly a year ago and there is no constraint that the seasonal pattern is smooth.

For such data I prefer a Fourier series approach where the seasonal pattern is modelled using Fourier terms with short-term time series dynamics allowed in the error. For example, consider the following model:

displaystyle y_t = a + sum_{k=1}^K left[ alphasin(2pi kt/m) + betacos(2pi kt/m)right] + N_t

where N_t is an ARIMA process. The value of K can be chosen by minimizing the AIC.

This is easily fitted using R. Here is an example with m=200, K=4 and N_t an ARIMA(2,0,1) process:

n <- 2000
m <- 200
y <- ts(rnorm(n) + (1:n)%%100/30, f=m)
fourier <- function(t,terms,period)
  n <- length(t)
  X <- matrix(,nrow=n,ncol=2*terms)
  for(i in 1:terms)
    X[,2*i-1] <- sin(2*pi*i*t/period)
    X[,2*i] <- cos(2*pi*i*t/period)
  colnames(X) <- paste(c("S","C"),rep(1:terms,rep(2,terms)),sep="")

fit <- Arima(y, order=c(2,0,1), xreg=fourier(1:n,4,m))
plot(forecast(fit, h=2*m, xreg=fourier(n+1:(2*m),4,m)))

The advantages of this approach are:

  • it allows any length seasonality;
  • for data with more than one seasonal period, you can include Fourier terms of different frequencies;
  • the seasonal pattern is smooth for small values of K (but more wiggly seasonality can be handled by increasing K);
  • the short-term dynamics are easily handled with a simple ARMA error.

The only real disadvantage (compared to a seasonal ARIMA model) that I can think of is that the seasonality is assumed to be fixed — the pattern is not allowed to change over time. But in practice, seasonality is usually remarkably constant so this is not a big disadvantage except for very long time series.

The order of N_t can also be chosen automatically:

fit <- auto.arima(y, D=0, max.P=0, max.Q=0, xreg=fourier(1:n,4,m))

Note that the seasonal parts of the ARIMA model should all be set to zero as we do not want N_t to be a seasonal ARIMA model.

I’ll add the fourier() function to the forecast package on the next release.

It is much harder to do something like this for ETS models. One of our students has been working on this and our paper on complex seasonality describes the procedure. We plan to add this functionality to the forecast package soon, but the code is not quite ready yet.


To leave a comment for the author, please follow the link and comment on their blog: Research tips » R. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , ,

Comments are closed.


Mango solutions

RStudio homepage

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training




CRC R books series

Six Sigma Online Training

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)