Make it easy to make ensemble time series forecast
forecastHybrid is an R package to make it easier to use the average predictions of ‘ensembles’ (or ‘combinations’) of time series models from Rob Hyndman’s
forecast package. It looks after the averaging, and also calculates prediction intervals by a conservative method that aims to redress the general over-optimism in forecasting prediction intervals.
New in version 0.3.0 is:
- Following developments in the
forecastpackage, prediction intervals are now created for
nnetarobjects in the ensemble. This should address one aspect of incorrect prediction intervals (e.g. issue #37).
- theta models can be added (by including “
f” in the
models =argument for
hybridModel()) and are indeed part of the default – so unless set otherwise,
hybridModel()will now fit six models.
accuracy.cvts()is now exported. It returns ra ange of summary measures of the cross-validated forecast accuracy for objects created by
cvts. Note that development on
forecastHybridhas been independent of Hyndman’s work on a similarly purposed
forecast– be careful not to get the two confused.
ggplot2graphics when the argument
ggplot = TRUEis passed.
- Time series must be at least four observations long.
- Fixed an error where
e.argswas passed to
In some respects, version 0.2.0 was a more important upgrade because it introduced
cvts and the use of a weighted average forecast with weights for the component models chosen by cross-validation. I didn’t get around to writing a blog post when that happened.
hybridModel function fits between two and six time series models to a time series, with or without
xreg external regressor explanatory variables. The default, if the
models= argument is left blank, is to fit all six available models. The modelling process also determines a set of weights to be used in subsequent forecasts. The two options are:
- equal weights ie a simple average
- cross validation weights ie a weighted average, with models that performed better in cross-validation given more weight.
Generally, we expect cross-validation weights to perform better, but they are quite computationally intensive and we haven’t got round to implementing parallel processing yet.
Here’s a demo with the median duration of unemployment in weeks from the
economics dataset in the
ggplot2 package. This snippet demonstrates both ways of using weights and the default base plot method. This plot method is the one provided by
forecast for all objects of class
forecast, including those created by forecastHybrid.
The cross-validation errors should be better because they give less weight to poor performing forecasts. The forecasts from
nnetar for example are often not good, and in a set of experiments for a recent talk at the Australian Statistical Conference I showed that they are sufficiently poor that they often worsen ensemble forecasts when added to the mix. Setting weights by cross-validation should mitigate this.
All of the component models can be accessed individually if desired:
ggplot2::autoplot() method provided by the
forecast package also works:
We implemented the Theta forecast method based on
thetaf() in Hyndman’s
forecast (although there are alternatives). We had to split the process into a modelling function
thetam(), and an accompanying
forecast.thetam method. The results are identical to
forecast::thetaf(), as demonstrated below:
Conservative prediction intervals?
forecastHybrid method of setting prediction intervals is conservative. It takes the widest range of values covered by any of the component models. In a blog post on different ways of setting forecast combinations, Rob Hyndman points out that, using the monthly
co2 dataset of Mauna Loa Atmospheric CO2 Concentration distributed with R, the prediction intervals for the
forecastHybrid method with six component models look too conservative ie sufficiently wide that they look very likely to contain the correct values much more than aimed for. In this particular case I agree - the method should be used with a bit of caution.
In practice, I often use only the
ets forecasts in combination rather than all six possibilities. Even then, the method is sometimes too conservative with data that of monthly or higher frequency, particularly at the 80% level. Here’s the conclusions from my recent presentation on this topic (apologies for the screenshot from PowerPoint):
So for annual and quarterly real world data, the method is not too conservative at all; and if you want your prediction intervals to contain the true value 95% of the time, the method checks out ok even for monthly data.