Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This post is motivated by a discussion that arose when I tested a strategy by Frank of Trading The Odds (post here). One point, brought up by Tony Cooper of Double Digit Numerics, the original author of the paper that Trading The Odds now trades (I consider it a huge honor that my blog is read by authors of original trading strategies), is that my heatmap analysis only looked cross-sectional performance, as opposed to performance over time–that is, performance that could have been outstanding over the course of the entire backtest could have been the result of a few lucky months. This is a fair point, which I hope this post will address in terms of a heuristic using both visual and analytical outputs.

The strategy for this post is the following, provided to me kindly by Mr. Helmuth Vollmeier (whose help in all my volatility-related investigations cannot be understated):

Consider VXV and VXMT, the three month and six month implied volatility on the usual SP500. Define contango as VXV/VXMT < 1, and backwardation vice versa. Additionally, take an SMA of said ratio. Go long VXX when the ratio is greater than 1 and above its SMA, and go long XIV when the converse holds. Or in my case, get in at the close when that happens and exit at the next day's close after the converse occurs (that is, my replication is slightly off due to using some rather simplistic coding for illustrative purposes).

In any case, here's the script for setting up the strategy, most of which is just downloading the data–the strategy itself is just a few lines of code:

require(downloader)
require(PerformanceAnalytics)

destfile="vxvData.csv")
destfile="vxmtData.csv")

ratio <- Cl(vxv)/Cl(vxmt)

destfile="longXIV.txt")

xiv <- merge(xiv, ratio, join='inner')
vxx <- merge(vxx, ratio, join='inner')
colnames(xiv)[5] <- colnames(vxx)[5] <- "ratio"

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))

retsList <- list()
count <- 1
for(i in 10:200) {
ratioSMA <- SMA(ratio, n=i)
vxxSig <- lag(ratio > 1 & ratio > ratioSMA)
xivSig <- lag(ratio < 1 & ratio < ratioSMA)
rets <- vxxSig*vxxRets + xivSig*xivRets
colnames(rets) <- i
retsList[[i]]  <- rets
count <- count+1
}
retsList <- do.call(cbind, retsList)
colnames(retsList) <- gsub("X", "", colnames(retsList))
charts.PerformanceSummary(retsList)
retsList <- retsList[!is.na(retsList[,191]),]
retsList <- retsList[-1,]

retsList <- retsList[-c(1538, 1539, 1450),] #for monthly aggregation, remove start of Dec 2014


About as straightforward as things get (the results, as we'll see, are solid, as well). In this case, I tested every SMA between a 10 day and a classic 200 day SMA. And since this strategy is a single-parameter strategy (unless you want to get into adjusting the ratio critical values up and down away from 1), instead of heatmaps, we'll suffice with basic scatter plots and line plots (which make things about as simple as they come).

The heuristic I decided upon was to take some PerfA functions (Return.annualized, SharpeRatio.annualized, for instance), and compare the rank of the average of their monthly ranks (that is, a two-layer rank, very similar to the process in Flexible Asset Allocation) to the aggregate, whole time-period rank. The idea here is that performance based on a few lucky months may have a high aggregate ranking, but a much lower monthly ranking, which would be reflected in a scatter plot. Ideally, the scatter plot would go from lower left to upper right in terms of ranks comparisons, with a correlation of 1, meaning that the strategy with the best overall return will have the best average monthly return rank, and so on down the list. This can also apply to the Sharpe ratio, and so on.

Here's my off-the-cuff implementation of such an idea:

rankComparison <- function(rets, perfAfun="Return.cumulative") {
fun <- match.fun(perfAfun)
monthlyFun <- apply.monthly(rets, fun)
monthlyRank <- t(apply(monthlyFun, MARGIN=1, FUN=rank))
meanMonthlyRank <- apply(monthlyRank, MARGIN=2, FUN=mean)
rankMMR <- rank(meanMonthlyRank)

aggFun <- fun(rets)
aggFunRank <- rank(aggFun)
plot(aggFunRank~rankMMR, main=perfAfun)
print(cor(aggFunRank, meanMonthlyRank))
}


So, I get a chart and a correlation of average monthly ranks against a single-pass whole-period rank. Here are the results for cumulative returns and Sharpe ratio:

> rankComparison(retsList)
[1] 0.8485374


Basically, the interpretation is this: the outliers above and to the left of the main cluster can be interpreted as those having those “few lucky months”, while those to the lower right consistently perform somewhat well, but for whatever reason, are just stricken with bad luck. However, the critical results that we’re looking for is that the best overall performers (the highest aggregate rank) are also those with the most *consistent* performance (the highest monthly rank), which is generally what we see.

Furthermore, the correlation of .85 also lends credence that this is a robust strategy.

Here’s the process repeated with the annualized Sharpe ratio:

> rankComparison(retsList, perfAfun="SharpeRatio.annualized")
[1] 0.8647353


In other words, an even clearer relationship here, and again, we see that the best performers overall are also the best monthly performers, so we can feel safe in the robustness of the strategy.

So what’s the punchline? Well, the idea is now that we’ve established that the best results on aggregate are also the strongest results when analyzing the results across time, let’s look to see if the various rankings of risk and reward metrics reveal which configurations those are.

Here’s a chart of the aggregate rankings of annualized return (aka cumulative return), annualized Sharpe, MAR (return over max drawdown), and max drawdown.

aggReturns <- Return.annualized(retsList)
aggSharpe <- SharpeRatio.annualized(retsList)
aggMAR <- Return.annualized(retsList)/maxDrawdown(retsList)
aggDD <- maxDrawdown(retsList)

plot(rank(aggReturns)~as.numeric(colnames(aggReturns)), type="l", ylab="annualized returns rank", xlab="SMA",
main="Risk and return rank comparison")
lines(rank(aggSharpe)~as.numeric(colnames(aggSharpe)), type="l", ylab="annualized Sharpe rank", xlab="SMA", col="blue")
lines(rank(aggMAR)~as.numeric(colnames(aggMAR)), type="l", ylab="Max return over max drawdown", xlab="SMA", col="red")
lines(rank(-aggDD)~as.numeric(colnames(aggDD)), type="l", ylab="max DD", xlab="SMA", col="green")
legend("bottomright", c("Return rank", "Sharpe rank", "MAR rank", "Drawdown rank"), pch=0, col=c("black", "blue", "red", "green"))


And the resulting plot itself:

So, looking at these results, here are some interpretations, moving from left to right:

At the lower end of the SMA, the results are just plain terrible. Sure, the drawdowns are lower, but the returns are in the basement.
The spike around the 50-day SMA makes me question if there is some sort of behavioral bias at work here.
Next, there’s a region with fairly solid performance between that and the 100-day SMA, but is surrounded on both sides by pretty abysmal performance.
Moving onto the 100-day SMA region, the annualized returns and Sharpe ratios are strong, but get the parameter estimation incorrect going forward, and there’s a severe risk of incurring heavy drawdowns. The jump improvement in the drawdown metric is also interesting. Again, is there some sort of bias towards some of the round numbers? (50, 100, etc.)
Lastly, there’s nothing particularly spectacular about the performances until we get to the high 100s and the 200 day SMA, at which point, we see a stable region of configurations with high ranks in all categories.

Let’s look at that region more closely:

truncRets <- retsList[,161:191]
stats <- data.frame(cbind(t(Return.annualized(truncRets)),
t(SharpeRatio.annualized(truncRets)),
t(maxDrawdown(truncRets))))
colnames(stats) <- c("A.Return", "A.Sharpe", "Worst_Drawdown")
stats\$MAR <- stats[,1]/stats[,3]
stats <- round(stats, 3)


And the results:

> stats
A.Return A.Sharpe Worst_Drawdown   MAR
170    0.729    1.562          0.427 1.709
171    0.723    1.547          0.427 1.693
172    0.723    1.548          0.427 1.694
173    0.709    1.518          0.427 1.661
174    0.711    1.522          0.427 1.665
175    0.711    1.522          0.427 1.665
176    0.711    1.522          0.427 1.665
177    0.711    1.522          0.427 1.665
178    0.696    1.481          0.427 1.631
179    0.667    1.418          0.427 1.563
180    0.677    1.441          0.427 1.586
181    0.677    1.441          0.427 1.586
182    0.677    1.441          0.427 1.586
183    0.675    1.437          0.427 1.582
184    0.738    1.591          0.427 1.729
185    0.760    1.637          0.403 1.886
186    0.794    1.714          0.403 1.970
187    0.798    1.721          0.403 1.978
188    0.802    1.731          0.403 1.990
189    0.823    1.775          0.403 2.042
190    0.823    1.774          0.403 2.041
191    0.823    1.774          0.403 2.041
192    0.819    1.765          0.403 2.031
193    0.822    1.772          0.403 2.040
194    0.832    1.792          0.403 2.063
195    0.832    1.792          0.403 2.063
196    0.802    1.723          0.403 1.989
197    0.810    1.741          0.403 2.009
198    0.782    1.677          0.403 1.941
199    0.781    1.673          0.403 1.937
200    0.779    1.670          0.403 1.934


So starting from SMA 186 through SMA 200, we see some fairly strong performance–returns in the high 70s to the low 80s, and MARs in the high 1s to low 2s. And of course, since this is about a trading strategy, equity curves are of course, obligatory. Here is what that looks like:

strongRets <- retsList[,177:191]
charts.PerformanceSummary(strongRets)


Basically, on aggregate, some very strong performance. However, it is certainly not *smooth* performance. New equity highs are followed by strong drawdowns, which are then followed by a recovery and new, higher equity highs.

To conclude (for the moment, I’ll have a new post on this next week with a slight wrinkle that gets even better results), I hope that I presented not only a simple but effective strategy, but also a simple but effective (if a bit time consuming, to do all the monthly computations on 191 return streams) heuristic suggested/implied by Tony Cooper of double digit numerics for analyzing the performance and robustness of your trading strategies. Certainly, while many professors and theorists elucidate on robustness (with plenty of math that makes stiff bagels look digestible), I believe not a lot of attention is actually paid to it in more common circles, using more intuitive methods. After all, if someone would want to be an unscrupulous individual selling trading systems or signals (instead of worrying about the strategy’s capacity for capital), it’s easy to show an overfit equity curve while making up some excuse so as to not reveal the (most likely overfit) strategy. One thing I’d hope this post inspires is for individuals to ask not only look at equity curves, but also plots such as aggregate against average monthly (or higher frequencies, if the strategies are tested over mere months, for instance, such as intraday trading) metric rankings when performing parameter optimization.

Is this heuristic the most robust and advanced that can be done? Probably not. Would one need to employ even more advanced techniques if computing time becomes an issue? Probably (bootstrapping and sampling come to mind). Can this be built on? Of course. *Will* someone build on it? I certainly plan on revisiting this topic in the future.

Lastly, on the nature of the strategy itself: while Trading The Odds presented a strategy functioning on a very short time frame, I’m surprised that instead, we have a strategy whose parameters are on a much higher end of the numerical spectrum.