The debate about business confidence as a possible leading indicator of economic growth reminded me of a question that has long been on my back-burner – what data make the best leading indicators (say, a quarter of a year ahead) of economic growth? There are a number of indicators that get a lot of attention in the media and from economic pundits of course, but I wanted to take a data-driven approach and see what actual evidence there is for the various candidate variables.
To cut to the chase, this graphic summarises my results:
The previous quarter’s GDP growth is, unsurprisingly, a strong indicator of the coming quarter’s. Food prices are strongly negatively related to economic growth (in New Zealand anyway – with the strong agriculture sector here food prices may be serving as a proxy for something else, but exactly what I’m not sure although I’ve got some ideas). Other indicators with fair evidence of value include visitor arrivals, car registrations, business confidence (the OECD measure, which David Hood points out is apparently derived from the NZIER’s survey, not the ANZ’s), and electronic card transactions.
One surprise for me is the complete absence of predictive power of merchandise goods exports for growth in GDP. Any suggestions why welcomed in the comments!
These confidence intervals come from what I believe to be a pretty robust frequentist modelling approach that combines multiple imputation, bootstrap and elastic net regularization. I have in the back of my head the idea of a follow-up project that uses Bayesian methods to provide actual forecasts of the coming GDP, but I may not get around to it (or if I do, it will probably be for Victorian or Australian data seeing as I’ll be living there from October and more motivated for forecasts in that space).
To address the question, I started with the Stats NZ data release calendar to identify official statistics that are available on a monthly basis and published within a month of the reference period. I counted seven such collections:
- Electronic card transactions
- Food price index
- Transport vehicle registrations
- International travel and migration
- Overseas merchandise trade
- Building consents issued
- Livestock slaughtering statistics
To these seven, I also added the:
- Reserve Bank’s Trade Weighted Index of the New Zealand dollar compared to a basket of currencies
- OECD business confidence series (I used this rather than the ANZ or NZIER original series as the OECD version is available freely for reproducibility purposes)
Some of these datasets are available programmatically, and some required manual steps to download from Infoshare. Slightly more detail on this is documented in the GitHub repo I set up for the project, as are all the scripts that import and combine the data. If you’re interested in that, fork the entire repository, as its a combined whole and needs to be run as such, via the
integrate.R script in the root directory of the project.
Here’s the original time series, plus seasonally adjusted versions of all of them except for the trade weighted index (currencies are so random I thought it made little sense to do a seasonally adjusted version of this; possible subject of future investigation):
I did my own seasonal adjustment because even when seasonally adjusted series are available in some of the Stats NZ collections, typically it goes back only to the 1990s or 2000s. I wanted data at least all the way back to 1987 when the quarterly GDP series starts; time series are difficult enough to analyse as it is without restricting ourselves to a decade or so of data (this is also why I went back to original data series rather than using the Treasury “leading economic indicator” Excel collections, as they cover only a relatively recent period).
Apart from business confidence (which by its nature is restricted to within a constant range), the original data are all clearly non-stationary, so I converted them to growth rates, which gives us the following chart summarising my candidate leading economic indicators:
Both seasonal adjustment and stationarity are important with time series so we can focus on the underlying relationships and avoid spurious correlations from the structural (growth and episodicity) elements of the series. An alternative approach would be to model the seasonality and growth directly in a full blown Bayesian model, but even in that case I’d want to look at de-seasoned, stationary versions for the exploratory stage.
Now that we have those de-seasoned, stationary versions of the data, we can look at them and see which ones are correlated one-on-one with economic growth for the quarter. Remember, the idea here is to use data that are available fairly promptly as leading indicators of economic growth in the same quarter, because economic growth is only reported quarterly and is hard to estimate, and hence is reported at a material lag. Here are the bivariate correlations:
I’ve sequenced the variables in order of strength of relationship with GDP growth in the quarter (code). We can see that lagged GDP growth, electronic card transactions growth, and OECD business confidence have the three strongest positive correlations with GDP growth. Food price growth has a strong negative correlation (ie high growth in food prices is a predictor of low growth in GDP).
The lower diagonal in that pairs plot shows connected scatter plots with a single-predictor linear model. This turned out to be pleasantly easy to do with
GGally::ggpairs and a simple user-defined function. Here’s the code for all three of the exploratory charts.
Minimalist regression (poor)
My starting point was a fairly naive out-of-the-box linear regression, just chucking all the data into
This gives me this result:
|Residual Std. Error||0.004 (df = 48)|
|F Statistic||5.215*** (df = 12; 48)|
|Note:||*p<0.1; **p<0.05; ***p<0.01|
There’s two big problems with this approach:
- By default, the model is fit to the minimal set of the data that is complete for all variables. Electronic card transactions is the shortest series, going back only to 2003 when it is treated as a growth rate. So all the information in the other variables before that time has been thrown out.
- A big amount of collinearity in the data is inevitable with this sort of collection. While it doesn’t bias the estimated coefficients, it can make a big increase to the variance, and basically adds a lot of noise to our estimated effect sizes. I think this is why we have the implausible result in this naive model of lagged GDP growth not being useful as an explanatory variable.
… and one small problem, which is that other than including the lagged value of GDP growth as an explanatory variable, I’m ignoring the time series nature of the data. More on that later.
Multiple imputation (better)
We could address the problem of the short electronic card transactions series either by eliminating that series from the data altogether (which would be a shame as it’s obviously useful), or by imputing it’s previous values. Conceptually this is a little dubious – what exactly are we saying growth in electronic card transactions in 1988 would mean? – but statistically it’s fine. The imputed values will capture the relationships between the various variables from 2003 onwards and avoid creating new artefacts, so in effect we will find the relationship between electronic cards and the others for the years we have the data, while still using the full 1987+ data for all other variables.
The appropriate way to do this is to create multiple complete data sets with different imputed values, fit our model to each, and pool the results. The
mice package makes this spectacularly easy:
Compared to the naive model on the cut-down dataset, we now have a much more plausible strong relationship with lagged GDP growth (which didn’t show up at all as significant in the first model). Food prices and car/stationwagon registrations the other two variables with “significant” evidence at this stage. The electronic card transactions (
ect_growth) effect is much smaller now, an unsurprising result from the “blurring” of the data introduced by the imputation approach and the fact that the other variables have long series of actual values to assert their relationships with the response of GDP growth.
So, we’ve dealt with the missing data, but not yet the multi-collinearity.
Multiple imputation, elastic net regularization, and bootstrap (best)
My preferred approach to reducing the random noise in effect sizes from having collinear explanatory variables is elastic net regularization, which I’ve written about a number of times in this blog.
Regularization can be combined with multiple imputation if we use a bootstrap. For each bootstrap re-sample, we perform a single imputation of missing variables, using just the data in the re-sample. This adds a little to the computational complexity, but this dataset isn’t large, and the more valid inferences possible make it worthwhile. In general, as much of the data processing stage as possible should be repeated in each bootstrap re-sample, for the re-sampling process to work as intended, ie as a mimic of “what would happen if we could repeat this experiment 999 times”, the fundamental principle of frequentist inference.
Bootstrap of a time series normally needs special methods. I’m not necessarily happy with the shortcut I’ve taken here, which is to say that I can treat the data as cross-sectional other than by including the lagged GDP growth variable. However, with all the variables in their stationary state, and with the residual series not showing signs of autocorrelation (more on this later), I think it’s fair enough in a pragmatic way.
I was able to cannibalise code from my previous posts with minimal amendments.
My first step is to estimate good values for the
alpha hyper-parameter in generalised linear regression. This is the value between 0 and 1 that lets you choose a combination of the type of penalty used to determine the amount that coefficients are shrunk towards zero, to counter the way collinearity adds random noise to their estimates. I do this by fitting models with a range of different values of
alpha, while using cross-validation (built into the
glmnet package) to get good values of
lambda is the hyper-parameter that indicates how much shrinkage there is.
This gives me a graphic like the following:
We can see that we can get low values of mean error with many combinations of
lambda, which is reassuring in suggesting that the choices aren’t critical. But I store the best value of alpha for use in the actual bootstrapped model fitting to follow.
Here’s the code that does the actual bootstrapped, mulitply-imputed, elastic net regularized fitting of my model. It has four steps:
- define a function
elastic()that takes a resampled set of data, performs a single imputation on it, determines a good value of lambda, fits the model and returns the values of the coefficients
- use the
boot::bootfunction to run that function on 999 different samples with replacement of the original data
- tidy up the resulting percentile confidence intervals into a neat data frame
- draw a nice graphic showing the confidence intervals
Most of the code below is actually in the third and fourth of those steps:
And I’m sticking to that (same graphic as I started the post with) as my best results.
I’ve mentioned a couple of times my nervousness about how I’ve inadequately treated the timeseries nature of these data. Here’s a quick look at the residuals of one version of my model (ie one set of imputed values). It assures me that there is no evidence of the residuals being autocorrelated, but I still think I’m cutting corners:
As an alternative approach, I used the
forecast::auto.arima() approach to model an imputed set of the data explicitly as a timeseries. This gets me some results that differ slightly from the above, but perhaps are broadly compatible:
The key difference is that business confidence (
bc_sa) and, to a lesser extent, electronic card transactions (
ect_growth) are more strongly significant in this version. Food price (
fpi_growth) is still strongly negative, and other confidence intervals all encompass zero. Note that lagged GDP growth doesn’t appear in this model explicitly – it comes in automatically as the
ar1 (auto-regression at lag 1) term of the algorithmically-chosen ARIMA model.
The results are similar enough that I decide I’m ok with the approach I’ve taken, so I don’t invest in what would be quite an effort (but eminently possible with time) to introduce multiple imputation and regularization into the time series model.
Here’s the code for that bit of time series modelling:
All the analysis code, as well as the data downloading and grooming code (which is the bulk of it…), is available in the GitHub repository for this project. The code excerpts above aren’t self-sufficient without the rest of the project, which has a folder system that doesn’t lend itself to representing in a simple blog post; you need to clone the project to get it to work. For the record, here are the R packages used: