# Volatility forecast evaluation in R

September 24, 2012
By

(This article was first published on Eran Raviv » R, and kindly contributed to R-bloggers)

In portfolio management, risk management and derivative pricing, volatility plays an important role. So important in fact that you can find more volatility models than you can handle (Wikipedia link). What follows is to check how well each model performs, in and out of sample. Here are three simple things you can do:

1. Regression based test – Mincer Zarnowitz regression.
The idea is simple, regress actual (realized) values on forecasts:

Now we jointly test the hypothesis: .
An intercept of zero means that your forecast in unbiased. Why? say by way of contradiction it is 0.02, this means that in order for the two sides to equate, we add to 0.02 on average to the forecast which means it is constantly under estimating the observed. The slope should be 1, as to say, your forecast “explains” the observed value perfectly.

2. Pairwise Comparison – Diebold Mariano test.
Say you have two models which produce two sets of forecasts. you therefore have two sets of errors. Call these errors

In the case where the two methods are the same, the difference between these two vectors is zero on average (Or a function of these vectors, e.g. e1^2 – e2^2). In the extreme case where you just replicate the forecasts using the same method, the difference is exactly zero. More importantly, we know how this difference is distributed (asymptotically..) so that we can test if it is indeed far from zero.

Students always have trouble understanding this point. The explanation I give is that it is impossible to measure the distance between 0 and 2 without knowing how likely a result of 2 really is! A result of 2 with a uniform distribution between {-3,3} is not as unlikely as a result of 2 with the standard normal distribution. I guess that is also the reason why we call the probability, a probability measure, since it measures..

The method that produces significantly smaller errors (typically squared errors or absolute errors) is preferred. You can easily extend it to more than one comparison (google “Tukey for all pairwise comparisons”, or look here for the multivariate extension of the Diebold Mariano test.)

3. Jarque–Bera test
In the case we have an accurate volatility forecast. We can center the series and scale it using our forecasts for the standard deviation. To be precise:

Should have mean zero and standard deviation of one. The new standardized series will NOT be in general normally distributed, but you can use this test as a measure for how well the skewness and kurtosis of the original series are captured by your model. In fact, the idea for this post came to me when I read Pat’s post: garch-and-long-tails where Pat was checking how Kurtosis is (unconditionally) captured when we use t-distribution instead of normal in the Garch model. The Jarque–Bera test is a natural extension since the higher moments, skewness and kurtosis, appear in the expression for the test statistic.

In practice
The first two options are valid for general forecasting evaluation, however, volatility is unobservable so it is unclear what we use as observed values. What we do is to replace the “observed” with a proxy, typically the squared return. This proxy is known to be not very accurate. here you can find more accurate alternative, however, they are based on intra-day information so you need access to intra-day data source.

Let’s see how this works in R. Most of the code is written for us, which should remind you how nice it is to belong with the R community. We need:

?View Code RSPLUS
 library(quantmod) ;library(rugarch) ; library(car) library(forecast) ; library(lawstat)

I pull the data from google, make the return series for the S&P for the most recent 5 years and estimate the standard garch(1,1) and another more accurate GJR-garch (asymmetric garch) and implement the three options.

?View Code RSPLUS
  end<- format(Sys.Date(),"%Y-%m-%d") start<-format(Sys.Date() - 5*365,"%Y-%m-%d") # Most recent 5 years dat0 = as.matrix(getSymbols('SPY', src="google", from=start, to=end, auto.assign = F, warnings = FALSE,symbol.lookup = F)) n = NROW(dat0) ret <- NULL ret[2:n] <- dat0[2:n,4]/dat0[1:(n-1),4] - 1 # close to close plot(ret, ty = "l") ret = ret[-1] #-1 to drop the NA gjrtspec <- ugarchspec(mean.model=list(armaOrder=c(1,0)),variance.model =list(model = "gjrGARCH"),distribution="std") normspec <- ugarchspec(mean.model=list(armaOrder=c(1,0)), distribution="norm") Tgjrmodel = ugarchfit(gjrtspec,ret) Nmodel = ugarchfit(normspec,ret) Tgjrfit = as.data.frame(Tgjrmodel)$sigma Nfit = as.data.frame(Nmodel)$sigma ret.sq = ret^2 N.sq = Nfit^2 Tgjr.sq = Tgjrfit^2 plot(ret.sq, ty = "l", ylim = c(0,.004)) ; lines(Tgjr.sq, col = 2) ; lines(N.sq, col = 3)

SPY volatility with two Garch model specifications

?View Code RSPLUS
 ######################### # option one - Mincer Zarnowitz #########################   N.mz = lm(ret.sq~N.sq) ; summary(N.mz)$coef T.mz = lm(ret.sq~Tgjr.sq) ; summary(T.mz)$coef   linearHypothesis(N.mz, c("(Intercept) = 0", "N.sq = 1")) Res.Df RSS Df Sum of Sq F Pr(>F) 1 1273 0.000951 2 1271 0.000948 2 2.4e-06 1.61 0.2   linearHypothesis(T.mz, c("(Intercept) = 0", "Tgjr.sq = 1")) Res.Df RSS Df Sum of Sq F Pr(>F) 1 1273 0.000885 2 1271 0.000885 2 2.06e-07 0.15 0.86   ######################### # option two - DM.test #########################   dm.test((ret.sq - N.sq), (ret.sq - Tgjr.sq), alternative = c("two.sided"), h = 1, power = 2) Diebold-Mariano Test data: (ret.sq - N.sq) (ret.sq - Tgjr.sq) DM = 5e-04, Forecast horizon = 1, Loss function power = 2, p-value = 0.9996 alternative hypothesis: two.sided # No difference between the two sets of forecasts for the squared loss function   ######################### # option three - Jarque Bera test ######################### standardN = scale(ret,scale = F)/Nfit standardTgjr = scale(ret,scale = F)/Tgjrfit scaleret = scale(ret) # no volatility model, just Unconditional volatility rjb.test(scaleret,"JB")$stat X-squared 5369 ## Very high :) rjb.test(standardN,"JB")$stat X-squared 58.79 rjb.test(standardTgjr,"JB")\$stat X-squared 50.62

Notes
1. I did not really produce forecasts here but fitted values. For forecasting you need to create forecasts recursively using a loop for h step ahead forecast.