An idea that I have been toying with for a while is to study the effect of a domain-specific optimization strategy in the ARMA+GARCH models. If you recall from this long tutorial, the implemented approach cycles through all models within the specified parameter ranges and chooses the best one based on the AIC statistic. What I have studied recently is whether model selection improves when a different, domain-specific criterion determines the “best” model.
Here is where greed enters the picture. Our domain is finance, and they claim greed is good, so what if we choose the model with the best in-sample performance?
There are a few practical details to settle first. It is probably a good idea to exclude from the return computation the initial days that fall within the largest lag of the specified range. For instance, if we consider models between (0,0,1,1) and (5,5,1,1), we do not include the first five days when computing the returns for each model. The reason is that the shorter models get the exact value for some of these days, which may be considered unfair. I don't feel strongly about this: one may argue that we'd like to prefer shorter models anyway, and counting all days naturally gives them an advantage, because they use the actuals for some of those days. In practice, it proves irrelevant.
I also want the model selection to be deterministic. Here is the full list of criteria when choosing a “winner” between two models:
- Choose the one with the higher in-sample return
- If the returns are the same, choose the one with fewer parameters
- If the number of parameters is the same, (3,5) and (5,3) for instance, choose the one with fewer AR parameters: (3,5) in the previous example
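The three rules above can be written as a small comparator. This is only a sketch in base R: the function name `isBetter` and the list fields `ret`, `p` and `q` are hypothetical names chosen for illustration, not something taken from the actual code base.

```r
# Returns TRUE when candidate `a` should win over candidate `b`.
# Each candidate is a list with (hypothetical) fields:
#   ret - the in-sample return of the model
#   p   - the AR order
#   q   - the MA order
isBetter <- function(a, b) {
  # Rule 1: the higher in-sample return wins
  if (a$ret != b$ret) return(a$ret > b$ret)

  # Rule 2: on equal returns, fewer parameters win
  nA <- a$p + a$q
  nB <- b$p + b$q
  if (nA != nB) return(nA < nB)

  # Rule 3: on equal size, fewer AR parameters win
  a$p < b$p
}

# (3,5) beats (5,3) when the returns are identical
isBetter(list(ret = 1.1, p = 3, q = 5), list(ret = 1.1, p = 5, q = 3))
```

Because the rules are checked in a fixed order and the last one is strict, two distinct (p,q) pairs never tie, which is what makes the selection deterministic.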
How do we compute the returns in-sample? This might be easier to explain with a piece of code:

    library( quantmod )
    library( fGarch )

    # Get the S&P 500
    getSymbols( "^GSPC", from="1900-01-01" )

    # Compute the daily log returns
    gspcRets = round( na.trim( diff( log( Cl( GSPC ) ) ) ), 6 )

    # The sample for the ARMA+GARCH model fit
    tt = tail( gspcRets["/2012-09-10"], 500 )

    # Use an arbitrary model to illustrate the idea
    fit = garchFit( formula=as.formula( "~ arma(1,3)+garch(1,1)" ), data=tt, trace=FALSE )

    # Assuming there was no exception on the previous line, compute the
    # in-sample returns. If there was an exception - fit another model.

    # Compute the indicator: short when the fitted value is negative, long otherwise
    ff = ifelse( fit@fitted < 0, -1, 1 )

    # Optionally, exclude the first few days
    ff[1:5] = 0

    # Compound the strategy returns (the log returns stand in for simple returns)
    ret = as.numeric( tail( cumprod( 1 + ff*fit@data ), 1 ) )

    # ret is the "greedy" metric for this model
    ret
    # 1.696181 - or 69%
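To show how this metric might drive the model search, here is a rough, self-contained sketch. The names `greedyMetric`, `fitArma` and `greedySearch` are hypothetical, and to keep it runnable without extra packages the fitter is a plain ARMA fit via `stats::arima`, so there is no GARCH component here. It is a stand-in, not the code behind the actual simulation. Candidates are visited smallest first, so a strict comparison is enough to keep the preferred model on ties.

```r
# Greedy metric: growth of one unit when trading the sign of the fitted
# values. The first `skip` observations are excluded (position set to 0).
greedyMetric <- function(fitted, actual, skip = 5) {
  pos <- ifelse(fitted < 0, -1, 1)
  if (skip > 0) pos[seq_len(skip)] <- 0
  prod(1 + pos * actual)
}

# Hypothetical stand-in fitter: ARMA(p,q) via stats::arima, no GARCH part.
# Returns the in-sample fitted values, or NULL on any error or warning.
fitArma <- function(x, p, q) {
  tryCatch({
    fit <- arima(x, order = c(p, 0, q))
    as.numeric(x - residuals(fit))
  }, error = function(e) NULL, warning = function(w) NULL)
}

# Cycle through all (p,q) pairs and keep the model with the best metric.
greedySearch <- function(x, maxP = 2, maxQ = 2, fitFun = fitArma,
                         skip = max(maxP, maxQ)) {
  grid <- expand.grid(p = 0:maxP, q = 0:maxQ)
  # Fewer parameters first, then fewer AR terms: with a strict ">" below,
  # the earlier (preferred) model survives whenever the metrics are equal
  grid <- grid[order(grid$p + grid$q, grid$p), ]

  best <- NULL
  for (i in seq_len(nrow(grid))) {
    fv <- fitFun(x, grid$p[i], grid$q[i])
    if (is.null(fv)) next   # failed fit - move on to the next model
    m <- greedyMetric(fv, x, skip)
    if (is.null(best) || m > best$metric) {
      best <- list(p = grid$p[i], q = grid$q[i], metric = m)
    }
  }
  best
}

# Toy data: simulated noise in place of real returns
set.seed(1)
x <- rnorm(300, sd = 0.01)
best <- greedySearch(x)
```

To plug in the real thing, one would presumably pass a `fitFun` that calls `garchFit` and returns `fit@fitted`, keeping the same error handling.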
After I added support to my code base for this type of metric, I reran the S&P 500 simulation. The result of this simulation is the greedy daily indicator. Now we need to compare it against the original, which uses the AIC statistic.

    library( quantmod )
    library( fGarch )

    # Get the S&P 500
    getSymbols( "^GSPC", from="1900-01-01" )

    # Compute the daily log returns
    gspcRets = round( na.trim( diff( log( Cl( GSPC ) ) ) ), 6 )

    # Load the indicator
    ind = as.xts( read.zoo(file="gspcInd3.csv", format="%Y-%m-%d", header=TRUE, sep=",") )

    # Load the greedy indicator
    greedyInd = as.xts( read.zoo(file="gspcGreedyInd.csv", format="%Y-%m-%d", header=TRUE, sep=",") )

    # Merge the dates
    mm = merge( ind$Indicator, greedyInd$Indicator, all=FALSE )

    # Compute a vector with 1s in the positions where the two indicators differ
    aa = ifelse( mm[,1] == mm[,2], 0, 1 )

    # Percentage of different days
    round( NROW( aa[aa != 0] ) / NROW( aa ) * 100, 2 )
    # 12.59% - good, looks like the difference may be significant

    # Merge everything together
    nn = merge( gspcRets, aa, ind$Indicator, greedyInd$Indicator, all=FALSE )

    # The performance of the original (based on AIC) indicator,
    # restricted to the days on which the two indicators differ
    indPerf = nn[,1]*nn[,2]*nn[,3]

    # The performance of the greedy indicator on the same days
    indGreedyPerf = nn[,1]*nn[,2]*nn[,4]

    # How many days were correctly predicted by each method
    NROW( indPerf[indPerf>0] )
    # 957
    NROW( indGreedyPerf[indGreedyPerf>0] )
    # 962
Going through the above code, I was excited to see how often the two methods disagree: 12.59% of the forecasts differ. Then, of course, I was put off to see no real advantage in either method: on the days where they differ, the AIC-based indicator predicted 957 correctly and the greedy one 962.