Initializing the Holt-Winters method

November 29, 2010

(This article was first published on Research tips » R, and kindly contributed to R-bloggers)

The Holt-Winters method is a popular and effective approach to forecasting seasonal time series. But different implementations will give different forecasts, depending on how the method is initialized and how the smoothing parameters are selected. In this post I will discuss various initialization methods.

Suppose the time series is denoted by y_1,dots,y_n and the seasonal period is m (e.g., m=12 for monthly data). Let hat{y}_{t+h|t} be the h-step forecast made using data to time t. Then the additive formulation of Holt-Winters’ method is given by the following equations

begin{align*}<br data-recalc-dims= ell_t & = alpha(y_t – s_{t-m}) + (1-alpha)(ell_{t-1}+b{t-1})\
b_t & = beta(ell_t – ell_{t-1}) + (1-beta)b_{t-1}\
s_t &= gamma(y_t-ell_{t-1} – b_{t-1}) + (1-gamma)s_{t-m}\
hat{y}_{t+h|t} &= ell_t + b_th + s_{t+h-m},
end{align*}”title=”Rendered by” style=”vertical-align: 0px; border: none;”/>

and the multiplicative version is given by

begin{align*}<br data-recalc-dims= ell_t & = alpha(y_t/ s_{t-m}) + (1-alpha)(ell_{t-1}+b{t-1})\
b_t & = beta(ell_t – ell_{t-1}) + (1-beta)b_{t-1}\
s_t &= gamma(y_t/(ell_{t-1} – b_{t-1})) + (1-gamma)s_{t-m}\
hat{y}_{t+h|t} &= (ell_t + b_th)s_{t+h-m}.
end{align*}”title=”Rendered by” style=”vertical-align: 0px; border: none;”/>

In many books, the seasonal equation (with s_t on the LHS) is slightly different from these, but I prefer the version above because it makes it easier to write the system in state space form. In practice, the modified form makes very little difference to the forecasts.

In my 1998 textbook, the following initialization was proposed. Set

begin{align*}<br data-recalc-dims= ell_m & = (y_1+cdots+y_m)/m\
b_m &= left[(y_{m+1}+y_{m+2}+cdots+y_{m+m})-(y_1+y_2+cdots+y_{m})right]/m^2.
end{align*}”title=”Rendered by” style=”vertical-align: 0px; border: none;”/>

The level is obviously the average of the first year of data. The slope is set to be the average of the slopes for each period in the first two years:

[<br data-recalc-dims= (y_{m+1}-y_1)/m,quad (y_{m+2}-y_2)/m,quad dots,quad (y_{m+m}-y_m)/m.
]”title=”Rendered by” style=”vertical-align: 0px; border: none;”/>

Then, for additive seasonality set s_i=y_i-ell_m and for multiplicative seasonality set s_i=y_i/ell_m, where i=1,dots,m. This works pretty well, and is easy to implement, but for short and noisy series it can give occasional dodgy results. It also has the disadvantage of providing state estimates for period m, so that the first forecast is for period m+1 rather than period 1.

In some books (e.g., Bowerman, O’Connell and Koehler, 2005), a regression-based procedure is used instead. They suggest fitting a regression with linear trend to the first few years of data (usually 3 or 4 years are used). Then the initial level ell_0 is set to the intercept, and the initial slope b_0 is set to the regression slope. The initial seasonal values s_{-m+1},dots,s_0 are computed from the detrended data. This is a very poor method that should not be used as the trend will be biased by the seasonal pattern. Imagine a seasonal pattern, for example, where the last period of the year is always the largest value for the year. Then the trend will be biased upwards. Unfortunately, Bowerman, O’Connell and Koehler (2005) are not alone in recommending bad methods. I’ve seen similar, and worse, procedures recommended in other books.

While it would be possible to implement a reasonably good regression method, a much better procedure is based on a decomposition. This is what was recommended in my 2008 Springer book and is implemented in the HoltWinters and ets functions in R.

  1. First fit a 2times m moving average smoother to the first 2 or 3 years of data (HoltWinters uses 2 years, ets uses 3 years). Here is a quick intro to moving average smoothing.
  2. Then subtract (for additive HW) or divide (for multiplicative HW) the smooth trend from the original data to get de-trended data. The initial seasonal values are then obtained from the averaged de-trended data. For example, the initial seasonal value for January is the average of the de-trended Januaries.
  3. Next subtract (for additive HW) or divide (for multiplicative HW) the seasonal values from the original data to get seasonally adjusted data.
  4. Fit a linear trend to the seasonally adjusted data to get the initial level ell_0 (the intercept) and the initial slope b_0.

This is generally quite good and fast to implement and allows “forecasts” to be produced from period 1. (Of course, they are not really forecasts as the data to be forecast has been used in constructing them.) However, it does require 2 or 3 years of data. For very short time series, an alternative (implemented in the ets function in R) is to set the initial seasonal values to be the last year of available data with the mean subtracted (for additive HW) or divided out (for multiplicative HW). Then proceed with steps 3 and 4 as above.

Whichever method is used, these initial values should be seen as rough estimates only. They can be improved by optimizing them along with the smoothing parameters using maximum likelihood estimation, for example. The only implementation of the Holt-Winters’ method that does that, to my knowledge, is the ets function in R. In that function, the above procedure is used to find starting values for the estimation.

Some recent work (De Livera, Hyndman and Snyder, 2010) shows that all of the above may soon be redundant. In Section 3.1, we show that the initial state values can be concentrated out of the likelihood and estimated directly using a regression procedure. Although we present the idea in the context of complex seasonality, it is valid for any exponential smoothing model. I am planning on modifying the ets function to implement this idea, but it will probably have to a wait for a couple of months as my “to do” list is rather long.

To leave a comment for the author, please follow the link and comment on their blog: Research tips » R. offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , ,

Comments are closed.


Mango solutions

RStudio homepage

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training


CRC R books series

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)