If you have studied Vector Autorregressive (VAR) models you are probably familiar with the “curse of dimensionality” (CD). It is very frustrating to see how ordinary least squares (OLS) fails to produce reliable results even for moderate size VARs. For those who are new to VARs, the CD means that the number of parameters to estimate grow very fast with the size of the model. Consider the VAR(1):
where is a vector of dimension with all the variables in the system, is the matrix of coefficients, is a vector of intercepts and are the errors. The model above may be easily generalized to a VAR(p):
A VAR(1) with three variables has 9 coefficients to be estimated; a VAR(6) with 6 variables has 216; and a VAR(12) with 12 variables has 1728 coefficients. If you want to estimate the VAR(12) you will need at least 144 observations for the OLS to be feasible, and you will be estimating a lot of noise, forecasts and structural analysis will be a total mess.
Fortunately, there are several alternatives to OLS such as Bayesian VARs (BVAR). There are many forms of conjugated BVARs, these models don’t need algorithms, i.e. they have analytical solution that can be computed very fast even on very high dimension. This post will present our implementation of a BVAR proposed by Banbura et al. (2010).
Aplication: Panel of volatilities
In this application I’m going to estimate a very big VAR on the daily volatilities of the 30 stocks of the Dow. In order to access the data and replicate this example you are going to need to install the package HDeconometrics from github.
# = load package and data = # library(HDeconometrics) data("voldata") # = Break data into in and out of sample to test model accuracy= # Yin=voldata[1:5499,] Yout=voldata[-c(1:5499),]
The objective here is to forecast the volatilities from the Dow stocks up to one month ahead. This is more or less 22 work days. It is usual to set the VAR lags to one month in to estimate this type of model, i.e. a VAR(22) with 30 variables. We need to estimate 19800 coefficients WOW!!! Each equation has 660 variables.
Since we have more than 5000 observations, the OLS estimation is feasible and can be directly compared with the BVAR. The code below estimates a VAR(22) using OLS and BVAR and display some comparative plots.
# = Run models = # # = OLS = # modelols=HDvar(Yin,p=22) # takes a while to run predols=predict(modelols,h=22) # = BVAR = # modelbvar=lbvar(Yin, p = 22, delta = 0.5) predbvar=predict(modelbvar,h=22) # = Forecasts of the Coca-Cola volatility = # k="KO" plot(tail(voldata[,k],122),type="l", main="Coca-Cola forecasts") lines(c(rep(NA,100),predols[,k]),col=2) lines(c(rep(NA,100),predbvar[,k]),col=4) abline(v=100,lty=2,col=4) legend("topleft",legend=c("OLS","BVAR"),col=c(2,4),lty=1,lwd=1,seg.len=1,cex=1,bty="n")
# = Overall percentual error = # MAPEols=abs((Yout-predols)/Yout)*100 MAPEbvar=abs((Yout-predbvar)/Yout)*100 matplot(MAPEols,type="l",ylim=c(0,80),main="Overall % error",col="lightsalmon",ylab="Error %") aux=apply(MAPEbvar,2,lines,col="lightskyblue1") lines(rowMeans(MAPEols),lwd=3,col=2,type="b") lines(rowMeans(MAPEbvar),lwd=3,col=4,type="b") legend("topleft",legend=c("OLS","BVAR"),col=c(2,4),lty=1,lwd=1,seg.len=1,cex=1,bty="n")
The two figures above show that the BVAR is more accurate. If you replicated the example you probably noticed the BVAR also runs much faster. We can also use the VAR results to have some informal insights on which stocks are more influential and more influenced by other stocks. The first plot below shows how much each stock’s volatility influence the other stocks. The second plot shows how much each stock is influenced by others.
# = Influences = # aux=modelbvar$coef.by.block[2:23] impacts=abs(Reduce("+", aux )) diag(impacts)=0 I=colSums(impacts) R=rowSums(impacts) par(mfrow=c(2,1)) barplot(I,col=rainbow(30),cex.names = 0.3, main = "Most Influent") barplot(R,col=rainbow(30),cex.names = 0.3, main = "Most Influenced")
Why does the BVAR works?
The BVAR presented here assumes all coefficients are equal to 0 except by the first lag autorregressive coefficients, which are set to 0.5 trough the argument
delta. As the model receives the data, it starts to adjust the coefficients, but they only change if the data really carry useful information. The only coefficients that will deviate from zero are the ones that are in fact important. Although we are dealing with thousand potential parameters, the BVAR will pay much more attention to the ones that are really important.
Follow us for more information on high-dimensional VARs in the future. Here is some suggested reading:
- Banbura, Marta, Domenico Giannone, and Lucrezia Reichlin. “Large Bayesian VARs.” (2008).