Lately I have been testing trading models based on methods from various fields: statistics, machine learning, wavelet analysis and others. And I have been doing all of it in R! In this series I will try to share some of these efforts, starting with a well-known model from statistics, the Autoregressive Moving Average (ARMA) model. A lot has been written about these models; I strongly recommend Introductory Time Series with R, which I find to be a perfect combination of light theoretical background and practical implementation in R.
In R, I am using the fArma package, which is a nice wrapper with extended functionality around the arima function from the stats package (used in the book). Here is a simple session of fitting an ARMA model to the S&P 500 daily returns:
library(quantmod)  # getSymbols, Cl
library(fArma)     # armaFit
# Get S&P 500
getSymbols( "^GSPC", from="2000-01-01" )
# Compute the daily returns
GSPC.rets = diff(log(Cl(GSPC)))
# Use only the last 500 trading days (roughly two years) of returns
GSPC.tail = as.ts( tail( GSPC.rets, 500 ) )
# Fit the model
GSPC.arma = armaFit( formula=~arma(2,2), data=GSPC.tail )
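Since fArma wraps arima from the stats package, roughly the same fit can be sketched with base R alone. The snippet below is only an illustration: it uses a simulated series as a stand-in for the S&P 500 returns (the coefficients, seed and variable names are assumptions of mine, not market data), fits an ARMA(2,2) by writing it as ARIMA(2,0,2), and produces a one-step-ahead forecast.

```r
# Simulated stand-in for 500 daily returns (assumption: not real market data)
set.seed(1)
x <- arima.sim(model = list(ar = c(0.5, -0.2), ma = c(0.4, 0.1)),
               n = 500, sd = 0.01)

# ARMA(2,2) is ARIMA(2,0,2): the middle zero means no differencing
fit <- arima(x, order = c(2, 0, 2), method = "ML")

# Estimated coefficients: ar1, ar2, ma1, ma2 and the intercept (mean)
print(coef(fit))

# One-step-ahead forecast ($pred) with its standard error ($se)
fc <- predict(fit, n.ahead = 1)
print(fc$pred)
print(fc$se)
```

The sign of the one-step forecast is the kind of quantity a simple trading rule could act on.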
The first obstacle is selecting the model order. An ARMA model has two order parameters, the autoregressive order p and the moving-average order q, each a non-negative integer. In other words, there is an infinite number of choices: (0,1), (1,0), (1,1), (2,1), etc. How do we know which parameters to use?
A naive approach would be to back-test strategies with all the different combinations over a period of time and pick the best. I have dubbed this a “hyper system” (i.e. a system of systems); it can be applied to any combination of indicators and comparison function.
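To make the brute-force idea concrete, here is a minimal sketch of it, under several assumptions of mine: the series is simulated rather than real S&P 500 data, the orders are capped at 2 so the search is finite, the look-back window is 250 observations, and the comparison function is the cumulative log return from trading the sign of each one-step-ahead forecast. It uses arima from the stats package; the names backtest, orders and window are purely illustrative.

```r
# Simulated stand-in for daily returns (assumption: not real market data)
set.seed(42)
rets <- as.numeric(arima.sim(model = list(ar = 0.3, ma = -0.2),
                             n = 300, sd = 0.01))

# Candidate orders, capped at 2 to keep the search finite; (0,0) is excluded
orders <- expand.grid(p = 0:2, q = 0:2)
orders <- orders[orders$p + orders$q > 0, ]

window <- 250                          # hypothetical look-back length
test.idx <- (window + 1):length(rets)  # days on which a forecast is traded

# Comparison function: cumulative log return from going long or short
# according to the sign of the one-step-ahead forecast
backtest <- function(p, q) {
  pnl <- 0
  for (t in test.idx) {
    history <- rets[(t - window):(t - 1)]  # only data known before day t
    fit <- tryCatch(arima(history, order = c(p, 0, q)),
                    error = function(e) NULL)
    if (is.null(fit)) next                 # skip windows where the fit fails
    forecast <- predict(fit, n.ahead = 1)$pred[1]
    pnl <- pnl + sign(forecast) * rets[t]
  }
  pnl
}

# Score every candidate and rank them, best first
orders$pnl <- mapply(backtest, orders$p, orders$q)
best <- orders[which.max(orders$pnl), ]
print(orders[order(-orders$pnl), ])
```

Even this small grid refits the model several hundred times, and the winning order is chosen on the very data used to score it, which is part of why the approach is naive.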
Fortunately, there are more robust statistical methods for doing that. More on that in the next post …