When is a Backtest Too Good to be True?

September 9, 2015

(This article was first published on Quintuitive » R, and kindly contributed to R-bloggers)

One statistic which I find useful to form a first impression of a backtest is the success/winning percentage. Since it can mean different things, let’s be more precise: for a strategy over daily data, the winning percentage is the percentage of the days on which the strategy had positive returns (in other words, the strategy guessed the sign of the return correctly on these days). Now the question: if I see a 60% winning percentage for an S&P 500 strategy, does/should my bullshit-alarm go off?
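
To make the definition concrete, here is a minimal sketch in base R. The return vector is made up for illustration, and `win_pct` is a hypothetical helper name, not something from a package.

```r
# Hypothetical daily strategy returns (made-up numbers, for illustration only)
strategy_rets <- c(0.012, -0.004, 0.007, -0.010, 0.003)

# Winning percentage: share of days with a strictly positive strategy return
win_pct <- function(rets) 100 * mean(rets > 0, na.rm = TRUE)

win_pct(strategy_rets)
```

Three of the five made-up days are positive, so this toy strategy has a 60% winning percentage. Days with a return of exactly zero count as non-winning here, which is a choice rather than a rule.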

That actually happened not too long ago while reading a paper. One of the best strategies in the paper reported a 60% winning percentage out-of-sample. My gut feeling was that it was an unrealistic, cherry-picked sample. I was pretty sure I was correct, but it raised other interesting questions, so I decided to investigate a bit more.

Let’s perform an experiment. Let’s take 20 years of daily returns on the S&P 500 and draw a number of samples. All samples have the same size – 60% of the data. For each sample, we assume that we guessed the sign on these days correctly, and we were wrong on the rest. One way to implement this in R is:

library(PerformanceAnalytics)   # provides Return.annualized

return.mc = function(rets, samples=1000, size=252) {
   # The annualized return for each sample
   result = rep(NA, samples)
   for(ii in 1:samples) {
      # Sample the indexes of the days on which we guessed the sign correctly
      aa = sample(1:NROW(rets), size=size)
      # Start by assuming we guessed wrong on all days
      bb = -abs(rets)
      # On the days in the sample we guessed correctly
      bb[aa] = abs(bb[aa])
      # Compute the annualized return for this sample.
      # Note, we convert bb to a vector, otherwise, the computation takes forever.
      result[ii] = Return.annualized(as.numeric(bb),scale=252)
   }
   return(result)
}

Now, let’s see what we get for a couple of success rates:

library(quantmod)   # provides getSymbols, Cl and (via TTR) ROC

gspc = getSymbols("^GSPC", from="1900-01-01", auto.assign=FALSE)
rets = ROC(Cl(gspc),type="discrete",na.pad=FALSE)["1994/2013"]

# 60% success rate
aa = return.mc(rets, size=as.integer(0.6*NROW(rets)))
summary(aa)
##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.3353  0.4518  0.4849  0.4860  0.5183  0.6666

# 55% success rate
aa = return.mc(rets, size=as.integer(0.55*NROW(rets)))
summary(aa)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.08744 0.17870 0.20410 0.20410 0.23130 0.33670
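
For anyone who wants to try the experiment without downloading data or loading packages, here is a rough base R sketch. It rests on stated assumptions: synthetic i.i.d. normal returns (mean 0, standard deviation 1% per day) stand in for the S&P 500, annualization uses a hand-written geometric formula, and `annualize` and `sign_guess_mc` are hypothetical helper names. The numbers it prints will therefore not match the ones above.

```r
# ASSUMPTION: synthetic returns, i.i.d. normal with sd of 1% per day; not S&P 500 data
set.seed(42)
sim_rets <- rnorm(252 * 20, mean = 0, sd = 0.01)

# Geometric annualized return of a vector of simple daily returns
annualize <- function(x, scale = 252) prod(1 + x)^(scale / length(x)) - 1

# Hypothetical helper: annualized return when the sign is guessed correctly on a
# random fraction `win_rate` of the days and incorrectly on all other days
sign_guess_mc <- function(rets, win_rate, samples = 200) {
  n <- length(rets)
  k <- as.integer(win_rate * n)
  vapply(seq_len(samples), function(i) {
    hit <- sample.int(n, size = k)   # days on which the guess was right
    pnl <- -abs(rets)                # wrong on every day by default
    pnl[hit] <- abs(rets[hit])       # right on the sampled days
    annualize(pnl)
  }, numeric(1))
}

mc_60 <- sign_guess_mc(sim_rets, 0.60)
mc_55 <- sign_guess_mc(sim_rets, 0.55)
summary(mc_60)
summary(mc_55)
```

Working on a plain numeric vector and using `vapply` avoids the xts overhead mentioned in the comment of the original function. The gap between the two summaries depends on the assumed daily volatility.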

For the 60% success rate, we have an average (over all samples) of 48% annually! It seems my suspicion of serious cherry-picking was completely justified. It’s either that, or the authors have discovered the ultimate trading strategy. Needless to say, it turned out to be the former …
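
A back-of-the-envelope check helps explain why a few percentage points of success rate matter so much. Because the winning days are drawn at random, the expected daily return is roughly (2p - 1) times the mean absolute daily return, where p is the success rate. The sketch below compounds that figure over 252 days. It is a first-order approximation that ignores the drag volatility puts on compounded returns, `approx_annual` is a hypothetical helper, and the 0.008 mean absolute return is a placeholder, not a number taken from the data above.

```r
# First-order approximation of the annualized return for a given success rate.
# ASSUMPTIONS: wins are independent of the size of the move; volatility drag is ignored.
approx_annual <- function(win_rate, mean_abs_ret, scale = 252) {
  daily <- (2 * win_rate - 1) * mean_abs_ret   # expected daily return
  (1 + daily)^scale - 1
}

# mean_abs_ret = 0.008 is a placeholder; with real data use mean(abs(rets))
approx_annual(c(0.50, 0.55, 0.60), mean_abs_ret = 0.008)
```

Under this approximation a 50% success rate yields exactly zero, and each extra point of success rate adds two percent of the mean absolute daily move to the expected daily return before compounding.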

The second result, the 55% success rate, is about where I draw the line for something plausible, although I’d be surprised if something of that quality were shared in public.

One last observation, which is important mostly for long-only strategies. The index had positive returns on 53.81% of the days overall, yet at the 50% success rate we still get a negative mean return! That’s a manifestation of the fact that the positive and negative market profiles are totally different. In any case, what it means is that for long-only strategies I might push the plausible percentage up a bit.
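
One way to look at the two profiles is to summarize up days and down days separately. The sketch below is a hypothetical helper (`day_profile` is not from any package), and the input vector is made up for illustration. Applied to the `rets` series used earlier, the `pct_up` entry would be expected to correspond to the share of positive days quoted above.

```r
# Hypothetical helper: share of up days and average size of up and down moves
day_profile <- function(rets) {
  rets <- as.numeric(rets)        # drop xts attributes, if any
  rets <- rets[!is.na(rets)]
  c(pct_up    = 100 * mean(rets > 0),
    mean_up   = mean(rets[rets > 0]),
    mean_down = mean(rets[rets < 0]))
}

# Made-up numbers for illustration only
day_profile(c(0.004, 0.006, -0.015, 0.005, -0.002))
```

In this toy input most days are up, but the average down move is larger than the average up move, which is the kind of asymmetry that lets a majority of winning days coexist with a poor overall return.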
