[This article was first published on QuantStrat TradeR » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

So, before revealing a slight wrinkle on the last strategy I wrote about, I’d like to clear up a bit of confusion regarding Jaekle and Tomasini’s idea of a stable region.

Essentially, the entire idea *is* that similar parameter configurations behave in very similar ways, and so, are supposed to be highly correlated. It does not mean the strategy may not be overfit in other ways, but that incremental changes to a parameter should mean incremental changes to performance, rather than seeing some sort of lucky spike in a sea of poor performance.

In any case, the one change to the strategy from last week is that rather than get in at the current close (aka observe close, execute at close), to get in at the next day’s close.

Again, here’s the strategy script:

require(downloader)
require(PerformanceAnalytics)
require(TTR)

destfile="vxvData.csv")
destfile="vxmtData.csv")

vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
vxmt <- xts(read.zoo("vxmtData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
ratio <- Cl(vxv)/Cl(vxmt)

destfile="longXIV.txt")

xiv <- merge(xiv, ratio, join='inner')
vxx <- merge(vxx, ratio, join='inner')
colnames(xiv)[5] <- colnames(vxx)[5] <- "ratio"

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))

retsList <- list()
count <- 1
for(i in 10:200) {
ratioSMA <- SMA(ratio, n=i)
vxxSig <- lag(ratio > 1 & ratio > ratioSMA, 2)
xivSig <- lag(ratio < 1 & ratio < ratioSMA, 2)
rets <- vxxSig*vxxRets + xivSig*xivRets
colnames(rets) <- i
retsList[[i]]  <- rets
count <- count+1
}
retsList <- do.call(cbind, retsList)
colnames(retsList) <- gsub("X", "", colnames(retsList))
charts.PerformanceSummary(retsList)
retsList <- retsList[!is.na(retsList[,191]),]
retsList <- retsList[-1,]


The one change I made is that rather than go with the default lag value, I went with a lag of 2. A lag of zero induces look-ahead bias. In any case, let’s run through the process again of analyzing for robustness.

rankComparison <- function(rets, perfAfun="Return.cumulative") {
fun <- match.fun(perfAfun)
monthlyFun <- apply.monthly(rets, fun)
monthlyRank <- t(apply(monthlyFun, MARGIN=1, FUN=rank))
meanMonthlyRank <- apply(monthlyRank, MARGIN=2, FUN=mean)
rankMMR <- rank(meanMonthlyRank)

aggFun <- fun(rets)
aggFunRank <- rank(aggFun)

bothRanks <- data.frame(cbind(aggFunRank, rankMMR, names(rankMMR)), stringsAsFactors=FALSE)
names(bothRanks) <- c("aggregateRank", "averageMonthlyRank", "configName")
bothRanks$aggregateRank <- as.numeric(bothRanks$aggregateRank)
bothRanks$averageMonthlyRank <- as.numeric(bothRanks$averageMonthlyRank)
bothRanks$sum <- bothRanks[,1] + bothRanks[,2] bothRanks <- bothRanks[order(bothRanks$sum, decreasing=TRUE),]

plot(aggFunRank~rankMMR, main=perfAfun)
print(cor(aggFunRank, meanMonthlyRank))
return(bothRanks)
}

retRank <- rankComparison(retsList)
sharpeRank <- rankComparison(retsList, perfAfun="SharpeRatio.annualized")


In this case, I added some functionality to not only do the plotting and correlation, but to spit out a table comparing both the aggregate metric along with the rank of the average monthly rank (again, dual ranking layer), and ordered the table by the sum of both the aggregate and the monthly metric, starting with the highest.

For instance, here’s the output from the returns comparison:

> retRank <- rankComparison(retsList)
[1] 0.736377

aggregateRank averageMonthlyRank configName sum
62            190                191         62 381
63            189                187         63 376
60            185                189         60 374
66            191                182         66 373
65            187                183         65 370
59            184                185         59 369
56            181                186         56 367
64            188                179         64 367
152           174                190        152 364
61            183                178         61 361
67            186                167         67 353
151           165                188        151 353
57            179                173         57 352
153           167                184        153 351
58            182                164         58 346
154           170                175        154 345
53            164                180         53 344
155           166                176        155 342
158           163                177        158 340
150           157                181        150 338


So, for this configuration, the correlation went down from above .8 to around .74…which is still strong and credence that the strategy configurations have validity outside some lucky months. The new feature I added was the data frame of the two ranks side by side, along with their configuration name (in this case, my names were simply the SMA parameter, but the names could be anything such as say, SMA_60_lag_2), and the sum of the two rankings, which orders the configurations. As there were 191 configurations (SMA ranging from 10 to 200), the best score that could be achieved was 382. Furthermore, note that although there seems to be a strong region from SMA 53 to SMA 67, there also seems to be another region, at least when it comes to absolute return, of an SMA parameter at SMA 150+.

Here’s the same table for annualized Sharpe (this variation takes a bit longer to compute due to the monthly annualized Sharpes).

> sharpeRank <- rankComparison(retsList, perfAfun="SharpeRatio.annualized")
[1] 0.5590881
aggregateRank averageMonthlyRank configName   sum
62            190              191.0         62 381.0
59            185              190.0         59 375.0
61            183              186.5         61 369.5
60            186              181.0         60 367.0
63            189              175.0         63 364.0
66            191              164.0         66 355.0
152           166              173.0        152 339.0
58            182              155.0         58 337.0
56            181              151.0         56 332.0
53            174              153.0         53 327.0
57            179              148.0         57 327.0
151           159              162.0        151 321.0
76            177              143.0         76 320.0
150           152              163.0        150 315.0
54            173              140.0         54 313.0
77            178              131.0         77 309.0
65            187              119.0         65 306.0
143           146              156.0        143 302.0
74            167              132.0         74 299.0
153           161              138.0        153 299.0


So, largely the same sort of results as we see with the annualized returns. A correlation of .5 gives some cause for concern, which will hopefully show up in the line plot of the rank of the four metrics (returns, Sharpe, drawdowns, and return to drawdown), which will reveal the regions with strong performance, and not-so-strong performances.

Here’s the ranking line plot.

aggReturns <- Return.annualized(retsList)
aggSharpe <- SharpeRatio.annualized(retsList)
aggMAR <- Return.annualized(retsList)/maxDrawdown(retsList)
aggDD <- maxDrawdown(retsList)

plot(rank(aggReturns)~as.numeric(colnames(aggReturns)), type="l", ylab="annualized returns rank", xlab="SMA",
main="Risk and return rank comparison")
lines(rank(aggSharpe)~as.numeric(colnames(aggSharpe)), type="l", ylab="annualized Sharpe rank", xlab="SMA", col="blue")
lines(rank(aggMAR)~as.numeric(colnames(aggMAR)), type="l", ylab="Max return over max drawdown", xlab="SMA", col="red")
lines(rank(-aggDD)~as.numeric(colnames(aggDD)), type="l", ylab="max DD", xlab="SMA", col="green")
legend("bottomright", c("Return rank", "Sharpe rank", "MAR rank", "Drawdown rank"), pch=0, col=c("black", "blue", "red", "green"))


And the resulting plot:

There are several regions that show similar, strong metrics for similar parameter choices for the value of SMA when we use a “delayed” entry. Namely, the regions around the 60 day SMA, the 150 day SMA, and the 125 day SMA.

Let’s look at those configurations.

truncRets <- retsList[,c(51, 116, 141)]
stats <- data.frame(cbind(t(Return.annualized(truncRets)),
t(SharpeRatio.annualized(truncRets)),
t(maxDrawdown(truncRets))))
colnames(stats) <- c("A.Return", "A.Sharpe", "Worst_Drawdown")
stats\$MAR <- stats[,1]/stats[,3]
stats <- round(stats, 3)


And the results:

> stats
A.Return A.Sharpe Worst_Drawdown   MAR
60     1.103    2.490          0.330 3.342
125    0.988    2.220          0.368 2.683
150    0.983    2.189          0.404 2.435


And the resulting performance, on both a regular, and log scale:

charts.PerformanceSummary(truncRets)

logRets <- log(cumprod(1+truncRets))
chart.TimeSeries(logRets)


Perfect strategies? There’s probably room for improvement. As good if not better than the volatility strategies posted elsewhere on the internet? Probably. Is there more investigation that can be done regarding the differences in signal delay? Yes.

So, in conclusion for this post, I’m hoping that the rank comparison heuristic and its new output gives people another tool to consider, along with another vol strategy to consider as well.