Backtesting a Simple Stock Trading Strategy: Part 3

Posted on October 17, 2011 by Zach Mayer in R bloggers | 0 Comments

[This article was first published on Modern Toolmaking, and kindly contributed to R-bloggers].

Note: This post is NOT financial advice! This is just a fun way to explore some of the capabilities R has for importing and manipulating data.

In a previous post, I examined a simple stock trading strategy: Find the high point over the last 200 days, and buy the stock if it’s been less than 100 days since that high. Otherwise, have no position.

What if we use parameters other than a 200-day high and a 100-day hold? How will that affect our strategy? First of all, we have to reload the data for the S&P 500 index and redefine the functions used to implement our strategy.
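The original code for this step is not reproduced here, so what follows is only a minimal sketch of what those functions might look like. It assumes a plain numeric vector of prices, and it uses a synthetic random walk in place of the real S&P 500 data; the name myStrat and the argument defaults are illustrative assumptions.

```r
# Assumption: synthetic prices stand in for the S&P 500 series.
set.seed(1)
prices <- 100 * cumprod(1 + rnorm(1000, mean = 0.0003, sd = 0.01))

# Days elapsed since the highest value in each trailing window of n days.
# embed() puts the most recent observation in column 1, so which.max() - 1
# counts back from "today" (0 means today is the high).
daysSinceHigh <- function(x, n) {
  apply(embed(x, n), 1, which.max) - 1
}

# Hypothetical helper: 1 (long) while the nHigh-day high is less than
# nHold days old, otherwise 0. The first nHigh - 1 days have no full
# window, so the position there is 0.
myStrat <- function(x, nHigh = 200, nHold = 100) {
  position <- ifelse(daysSinceHigh(x, nHigh) < nHold, 1, 0)
  c(rep(0, nHigh - 1), position)
}

position <- myStrat(prices)
table(position)
```

With real data, prices would be the closing prices of the index rather than simulated values.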

Next, we must decide the range of parameters we wish to test for our strategy. I’ve decided to use a “grid search” to thoroughly examine the parameter space. Somewhat arbitrarily, I’ve decided to test the values from 5 to 500, in steps of 5, for both parameters. This gives us 100 possible values for each parameter, or 10,000 combinations in total. Good thing the “daysSinceHigh” function is pretty fast!

Because my processing power is limited, I’m only going to look at every 5th value in this parameter space (hence the step of 5). The first order of business is to calculate a matrix containing each n-Day high series, where the first column is the number of days since the 5-day high, the second column is the number of days since the 10-day high, etc. This matrix has 100 columns.
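One way this matrix could be built is sketched below. It assumes synthetic prices and a simple daysSinceHigh helper, and it pads the start of each column with NA because an n-day window is not available for the first n - 1 days; that padding choice is an assumption, not necessarily what the original code did.

```r
# Assumption: synthetic prices stand in for the S&P 500 series.
set.seed(1)
prices <- 100 * cumprod(1 + rnorm(1500, sd = 0.01))

# Days since the high of each trailing n-day window (0 = today is the high).
daysSinceHigh <- function(x, n) apply(embed(x, n), 1, which.max) - 1

nHighs <- seq(5, 500, by = 5)  # 100 look-back windows

# One column per look-back window; the first n - 1 rows are NA
# because a full window is not yet available there.
highMatrix <- sapply(nHighs, function(n) {
  c(rep(NA, n - 1), daysSinceHigh(prices, n))
})
colnames(highMatrix) <- nHighs

dim(highMatrix)  # one row per day, one column per nHigh value
```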

Next, I make a list with 100 elements. Each element represents a holding period, which I will apply to a copy of the “n-Day high matrix” from the previous step. For example, the 1st element in the list is a matrix representing a 5-day holding period. The first column in this matrix represents buying at the 5-day high, and holding for 5 days. This is equivalent to buy-and-hold, because the 5-day high can never be more than 4 days old, so the position is always on. The second column represents buying at the 10-day high, and holding for 5 days. The third column represents buying at the 15-day high, and so on. I repeat this process for each element in the 100-matrix list, which gives us an object representing every combination of the two parameters on our grid.
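A rough sketch of this step, under the same assumptions as before (synthetic prices, NA padding treated as "no position"). To keep it quick it uses a reduced grid of 5 to 50 rather than the full 5 to 500, and the object names are illustrative.

```r
# Assumption: synthetic prices and a reduced 10 x 10 grid, for speed.
set.seed(1)
prices <- 100 * cumprod(1 + rnorm(600, sd = 0.01))
daysSinceHigh <- function(x, n) apply(embed(x, n), 1, which.max) - 1

nHighs <- seq(5, 50, by = 5)
nHolds <- seq(5, 50, by = 5)

# Days-since-high matrix: one column per look-back window, NA-padded.
highMatrix <- sapply(nHighs, function(n) c(rep(NA, n - 1), daysSinceHigh(prices, n)))
colnames(highMatrix) <- nHighs

# One position matrix per holding period: 1 while the high is less than
# h days old, 0 otherwise (including the NA warm-up rows).
positions <- lapply(nHolds, function(h) {
  pos <- (highMatrix < h) * 1
  pos[is.na(pos)] <- 0
  pos
})
names(positions) <- nHolds

# nHigh = 5 with nHold = 5 is always invested once the window is full.
mean(positions[["5"]][5:600, "5"])
```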

It is then relatively easy to calculate the returns associated with each combination of parameters, by using the “sweep” function to multiply each column of each matrix by the daily returns for our stock.
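The sweep step might look something like the sketch below, again on synthetic prices and a reduced grid. One assumption worth flagging: each day's position is paired with the following day's return, so that a signal computed from today's close does not earn today's move. The original code may have aligned these differently.

```r
# Assumption: synthetic prices and a reduced 10 x 10 grid, for speed.
set.seed(1)
prices <- 100 * cumprod(1 + rnorm(600, sd = 0.01))
daysSinceHigh <- function(x, n) apply(embed(x, n), 1, which.max) - 1

nHighs <- seq(5, 50, by = 5)
nHolds <- seq(5, 50, by = 5)
highMatrix <- sapply(nHighs, function(n) c(rep(NA, n - 1), daysSinceHigh(prices, n)))
colnames(highMatrix) <- nHighs

positions <- lapply(nHolds, function(h) {
  pos <- (highMatrix < h) * 1
  pos[is.na(pos)] <- 0
  pos
})
names(positions) <- nHolds

# Return from day t to day t + 1, aligned to day t (last day gets 0).
fwdRet <- c(diff(prices) / head(prices, -1), 0)

# MARGIN = 1 lines the return vector up with the rows, so every column
# of every position matrix is multiplied element-wise by the returns.
stratReturns <- lapply(positions, function(pos) sweep(pos, 1, fwdRet, "*"))
```

Each element of stratReturns has the same shape as the corresponding position matrix, with zeros on days when the strategy is out of the market.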

Now we have a list of matrices of returns. Each column of a matrix represents the returns of our strategy, using a different set of parameters. This allows us to calculate cumulative returns for each set of parameters, and make a nifty graph that shows the relationship between nHigh, nHold, and returns.
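As a sketch of the cumulative-return calculation, the code below compounds the daily returns of every column and collects the results into a matrix with one row per nHigh and one column per nHold. It assumes synthetic prices, a reduced grid, and next-day return alignment, none of which come from the original code.

```r
# Assumption: synthetic prices and a reduced 10 x 10 grid, for speed.
set.seed(1)
prices <- 100 * cumprod(1 + rnorm(600, sd = 0.01))
daysSinceHigh <- function(x, n) apply(embed(x, n), 1, which.max) - 1

nHighs <- seq(5, 50, by = 5)
nHolds <- seq(5, 50, by = 5)
highMatrix <- sapply(nHighs, function(n) c(rep(NA, n - 1), daysSinceHigh(prices, n)))
colnames(highMatrix) <- nHighs
fwdRet <- c(diff(prices) / head(prices, -1), 0)

stratReturns <- lapply(nHolds, function(h) {
  pos <- (highMatrix < h) * 1
  pos[is.na(pos)] <- 0
  sweep(pos, 1, fwdRet, "*")
})
names(stratReturns) <- nHolds

# Compound the daily returns in each column: prod(1 + r) - 1.
# Rows of cumRet are nHigh values, columns are nHold values.
cumRet <- sapply(stratReturns, function(m) apply(1 + m, 2, prod) - 1)

# Best-performing cell on this (synthetic) grid.
best <- which(cumRet == max(cumRet), arr.ind = TRUE)
```

The nHigh = 5, nHold = 5 cell is a useful sanity check, since it should match the buy-and-hold return over the same days.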

This graph uses a custom color ramp function, which was created by Andrie on StackOverflow. The color of each point in the graph corresponds to how high the returns are at that point. The X axis is the number of days to use for nHigh, and the Y axis is the number of days to use for nHold. As you can see, 100 days seems to be a solid holding period across many values of nHigh, but by using a different value of nHigh, we could increase returns substantially.
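A chart along these lines could be sketched as follows. This is not Andrie's color ramp function: it swaps in base R's colorRampPalette, and it runs on synthetic prices over a reduced grid, so the picture it draws says nothing about the real S&P 500 results.

```r
# Assumption: synthetic prices and a reduced 10 x 10 grid, for speed.
set.seed(1)
prices <- 100 * cumprod(1 + rnorm(600, sd = 0.01))
fwdRet <- c(diff(prices) / head(prices, -1), 0)

# NA-padded days since the high of each trailing n-day window.
daysSinceHigh <- function(x, n) {
  c(rep(NA, n - 1), apply(embed(x, n), 1, which.max) - 1)
}

# One row per parameter combination, with its cumulative return.
grid <- expand.grid(nHigh = seq(5, 50, by = 5), nHold = seq(5, 50, by = 5))
grid$ret <- mapply(function(nHigh, nHold) {
  pos <- daysSinceHigh(prices, nHigh) < nHold
  pos[is.na(pos)] <- FALSE
  prod(1 + pos * fwdRet) - 1
}, grid$nHigh, grid$nHold)

# Map each return onto a 100-step color ramp (low = red, high = green).
ramp <- colorRampPalette(c("red", "yellow", "darkgreen"))(100)
grid$col <- ramp[cut(grid$ret, breaks = 100, labels = FALSE)]

plot(grid$nHigh, grid$nHold, col = grid$col, pch = 15, cex = 3,
     xlab = "nHigh (days)", ylab = "nHold (days)",
     main = "Cumulative return by parameter pair")
```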

Of course, just because these values worked in the past doesn’t mean they will work in the future. Still, it’s good to see that our arbitrary parameters (which performed well in the last post) fall inside a wide range of parameters that yield a positive return for our strategy. This brings up an interesting question: how DO we select parameters for our strategy? How can we tell how well our parameter selection strategy would have performed in the past, given that we’ve optimized our selection based on our knowledge of the past?

For homework, think about how overfitting and cross-validation apply to this problem…

BONUS CODE: This creates some nifty 3D charts, using the rgl library.