Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In this tutorial I am going to share my R&D and trading experience using the well-known from statistics Autoregressive Moving Average Model (ARMA). There is a lot written about these models, however, I strongly recommend Introductory Time Series with R, which I find is a perfect combination between light theoretical background and practical implementations in R. Another good reading is the online e-book Forecasting: principles and practice written by Rob Hyndman, an expert in statistical forecasting and the author of the excellent forecast R package.

## Getting Started

In R, I am mostly using the fArma package, which is a nice wrapper with extended functionality around the arima function from the stats package (used in the above-mentioned book). Here is a simple session of fitting an ARMA model to the S&P 500 daily returns:

```library( quantmod )
library( fArma )

# Get S&P 500
getSymbols( "^GSPC", from="2000-01-01" )

# Compute the daily returns
gspcRets = diff( log( Cl( GSPC ) ) )

# Use only the last two years of returns
gspcTail = as.ts( tail( gspcRets, 500 ) )

# Fit the model
gspcArma = armaFit( formula=~arma(2,2), data=gspcTail )
```

For more details, please refer to the literature and the packages, I just want to emphasize on a couple of points:

• We model the daily returns instead of the prices. There are multiples reasons: this way financial series usually become stationary, we need some way to “normalize” a series, etc.
• We use the diff and log function to compute the daily returns instead of percentages. Not only this is a standard practice in statistics, but it also provides a damn good approximation to the discrete returns.

The approach I will present here is a form of walk-forward backtesting. While walking the series day by day, we will use history of certain length to find the best model. Then we will use this model to predict the next day’s return. If the prediction is negative, we assume short position, otherwise we assume a long position.

An example will make things clearer: After the close of June 11th, 2012, we compute the last 500 daily returns. Using these returns we search through the space of ARMA models and select the best-fitting (with respect to some metric and some requirements) model. Finally, we use this model to compute the prediction for the tomorrow’s return and use the sign of the return to decide the appropriate position.

## Choosing a Good Model

The first obstacle for this method before it could be useful to us, is to select the model parameters. In the case of ARMA, there are two parameters. In other words, there is an infinite number of choices: (0,1), (1,0), (1,1), (2,1), etc. How do we know what parameters to use?

A common approach in statistics to quantify the goodness of fit test is the AIC (for Akaike Information Criteria) statistic. Once the fitting is done, the value of the aic statistics is accessible via:

```xxArma = armaFit( xx ~ arma( 5, 1 ), data=xx )
[email protected]\$aic
```

There are other statistics of course, however, typically the results are quite similar.

To summarize, all we need is a loop to go through all parameter combinations we deem reasonable, for instance from (0,0) to (5,5), inclusive, for each parameter pair fit the model, and finally pick the model with the lowest AIC or some other statistic.

```# https://gist.github.com/2913657

armaSearch = function(
xx,
minOrder=c(0,0),
maxOrder=c(5,5),
trace=FALSE )
{
bestAic = 1e9
len = NROW( xx )
for( p in minOrder:maxOrder ) for( q in minOrder:maxOrder )
{
if( p == 0 && q == 0 )
{
next
}

formula = as.formula( paste( sep="", "xx ~ arma(", p, ",", q, ")" ) )

fit = tryCatch( armaFit( formula, data=xx ),
error=function( err ) FALSE,
warning=function( warn ) FALSE )
if( !is.logical( fit ) )
{
fitAic = [email protected]\$aic
if( fitAic < bestAic )
{
bestAic = fitAic
bestFit = fit
bestModel = c( p, q )
}

if( trace )
{
ss = paste( sep="", "(", p, ",", q, "): AIC = ", fitAic )
print( ss )
}
}
else
{
if( trace )
{
ss = paste( sep="", "(", p, ",", q, "): None" )
print( ss )
}
}
}

if( bestAic < 1e9 )
{
return( list( aic=bestAic, fit=bestFit, model=bestModel ) )
}

return( FALSE )
}
```

Note that sometimes armaFit fails to find a fit and returns an error, thus quitting the loop immediately. armaSearch handles this problem by using the tryCatch function to catch any error or warning and return a logical value (FALSE) instead of interrupting everything and exiting with an error. Thus we can distinguish an erroneous and normal function return just by checking the type of the result. A bit messy probably, but it works.

Some R packages, forecast and rugarch for instance, provide a similar, auto.arima function out of the box. So one can build his infrastructure around one of these instead.

## Forecasting

Once the parameters are selected, it’s time to determine the position at the close. One way to do that is by a one day ahead prediction, if the prediction comes negative (remember the series we are operating on is the daily returns) then the desired position is short, otherwise it’s long.

```library( quantmod )
library( fArma )

getSymbols( "SPY", from="1900-01-01" )
spyRets = diff( log( Cl( SPY )["/2012-05-29"] ) )
spyArma = armaFit( ~arma(0, 2), data=as.ts( tail( spyRets, 500 ) ) )
as.numeric( predict( spyArma, n.ahead=1, doplot=F )\$pred )
# -0.0004558926
```

Now, to build an indicator for the back testing, one can walk the daily return series and at each point perform the steps we covered so far. The main loop looks like (shortened on purpose):

```# currentIndex is the index of the day we are making a forcast for
# xx is the return series
# history is look-back period to consider at each point
repeat
{
nextIndex = currentIndex + 1

# lags is how many days behind is the data, the default is 1,
# meaning use data up to yesterdays close
forecastLength = nextIndex - currentIndex + lags - 1

# Get the series
yy = xx[index(xx)[(currentIndex-history-lags+1):(currentIndex-lags)]]

# Find the best fit
bestFit = armaSearch(
yy,
minOrder,
maxOrder,
withForecast=TRUE,   # we want the model to have a valid forecast
forecastLength=forecastLength,   # 1 for a dialy forecast
trace=trace,
cores=cores )   # the number of cores to use

if( !is.null( bestFit ) )
{
# Forecast
fore = tryCatch( predict( bestFit, n.ahead=forecastLength, doplot=FALSE ),
error=function( err ) FALSE,
warning=function( warn ) FALSE )
if( !is.logical( fore ) )
{
# Save the forecast
forecasts[currentIndex] = tail( fore\$pred, 1 )

# Save the model order
ars[currentIndex] = order
mas[currentIndex] = order

forecasts[currentIndex] = 0
}

if( nextIndex > len ) break
currentIndex = nextIndex
}
}
```

Where history is the look-back period to consider at each point, I usually use 500, which is about two years of data. In other words, to determine the position at each individual day (previous day close to the current day close determines the return) we use history of 500 days, lagged by lags day. You will see later how lags comes into play in practice.

Notice, that predict has also to be surrounded by a tryCatch block. armaSearch also has the nice feature to determine whether a model has a forecast or not (predict succeeds or not, this test is controlled via the withForecast parameter).

## Improving Performance

The number of computations we have to do adds up quickly. For example, for 10 years of historic data we need to compute about 2,520 trading days. For each day we are going to fit and predict at least 35 (35=6*6-1, 0 to 5 both for the AR and MA component, but excluding the (0,0) combination) models. Multiplying the number of models by the number of days, and we are already looking at more than 88 thousand model fits – that’s a lot of computations.

One way to improve the performance of these necessary computations can be achieved by exploiting multi-core CPUs. My approach is to parallelize the model selection, the armaSearch function in the above code. Although this may not be the most efficient approach, it is certainly the more practical since it will also boost the performance of armaSearch when used independently.

I won’t post the final version of the code here due to it’s length. I will give you the GIST link instead!

## Modeling Volatility with GARCH

Financial time series are random in general. One of the few properties they exhibit is Volatility Clustering. This is typically achieved by extending the ARMA forecasting with a GARCH model. Sounds complex, and the theoretical details are complex indeed, but it turns out to be pretty straightforward in R:

```library(quantmod)
library(fGarch)

getSymbols("SPY", from="1900-01-01")
spyGarch = garchFit(~arma(0, 2) + garch(1, 1), data=as.ts(tail(spyRets, 500)))
# the actual forecasts are predict(spyGarch, n.ahead=1, doplot=F)[,1]
```

Of course, we also need to modify all relevant functions, like armaSearch. Calls to garchFit and predict also need to be handled via tryCatch. Notice also that predict returns a matrix for GARCH models.

If you are interested in the full source code, you can contact us.

## S&P 500 Performance

Let’s start with the equity curve of applying the ARMA+GARCH strategy over the full 60 years (since 1950) of S&P 500 historic data.

It looks fantastic! In fact, it impressed me so much that I looked for bugs in the code for quite some time. Even on a logarithmic chart the performance of this method is stunning – CAGR of 18.87%, and the ARMA+GARCH strategy achieves this performance with a comparable maximum drawdown of 56%.

To compute the ARMA strategy growth, we first need the daily indicator (this indicator takes about two days to compute with all optimizations I went through).

The first column is the date, the second the position for this day: 1 for long, -1 for short, 0 for none. Note, the position is already aligned with the day of the return (it is computed at the close of the previous day), in other words, the indicator is aligned properly with the returns – no need to shift right via lag. The indicator, the first column, needs to be multiplied with the S&P 500 daily returns. The rest of the columns are irrelevant and hopefully self-explanatory.

Let’s wrap up the post with the code that loads the indicator and plots the graphic:

```library(quantmod)
library(lattice)
library(timeSeries)

getSymbols("^GSPC", from="1900-01-01")

# The maximum draw down

# The largest dropdawn is:
#         From     Trough         To      Depth Length ToTrough Recovery
# 1 2007-10-10 2009-03-09 2012-09-28 -0.5677539   1255      355       NA

# Filter out only the common indexes
mm = merge( gspcArmaInd[,1], gspcRets, all=F )
gspcArmaRets = mm[,1] * mm[,2]

# The maximum draw down
# The largest dropdawn is:
#          From     Trough         To      Depth Length ToTrough Recovery
# 1  1987-10-26 1992-10-09 1997-10-27 -0.5592633   2531     1255     1276

gspcArmaGrowth = log( cumprod( 1 + gspcArmaRets ) )

gspcBHGrowth = log( cumprod( 1 + mm[,2] ) )

gspcAllGrowth = merge( gspcArmaGrowth, gspcBHGrowth, all=F )

xyplot( gspcAllGrowth,
superpose=T,
col=c("darkgreen", "darkblue"),
lwd=2,
key=list( x=.01,
y=0.95,