Can one beat a Random Walk? IMPOSSIBLE (you say?)

[This article was first published on Intelligent Trading, and kindly contributed to R-bloggers.]

First, apologies for the long absence; I’ve been busy with a few things. Second, apologies for the horrific use of caps in the title (for the grammar monitors). Certainly, you’ll gain something useful from today’s musing, as it’s a pretty profound insight for most (it was for me at the time). I also considered carefully whether to divulge this concept, but since it’s often overlooked and already in the public literature (I’ll even share a source), I decided to discuss it.

Fig 1. Random Walk and the 75% rule

I’ve seen the same debate launched over and over on various chat boards: whether it is theoretically impossible to beat a random walk. In this case, I am giving you the code to determine the answer yourself.
The requirements: 1) the generated data must come from an IID Gaussian distribution; 2) the series must be coaxed into a stationary form.
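
To see why the second requirement matters, here is a minimal sketch that applies the same "below the median, bet up; above the median, bet down" rule to a stationary Gaussian series and to the random walk built from it. The seed, the sample size and the helper name `rule_hit_rate` are my own assumptions for illustration, not part of the original script.

```r
set.seed(123)                      # arbitrary seed (assumption)
n <- 100000                        # arbitrary sample size (assumption)
x <- rnorm(n)                      # stationary IID Gaussian series
walk <- cumsum(x)                  # non-stationary random walk built from x

# Hypothetical helper: fraction of steps where "below the median -> up,
# above the median -> down" calls the direction of the next move correctly
rule_hit_rate <- function(s) {
  k <- length(s)
  prev <- s[-k]                    # value at step i
  nxt <- s[-1]                     # value at step i + 1
  predicted <- ifelse(prev < median(s), 1, -1)
  mean(predicted == sign(nxt - prev))
}

hit_stationary <- rule_hit_rate(x)     # rule applied to the stationary series
hit_walk <- rule_hit_rate(walk)        # rule applied to the raw random walk
print(c(stationary = hit_stationary, walk = hit_walk))
```

On the stationary series I would expect the hit rate to land near 0.75, while on the raw walk it should hover around a coin flip, since the next increment does not depend on the current level.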

The following script will generate a random series of data and follow the so-called 75% rule, which says:
Pr[(Price(n) > Price(n-1) and Price(n-1) < Price_median) or (Price(n) < Price(n-1) and Price(n-1) > Price_median)] = 75%. This very insightful rule (which is explained both mathematically and in layman’s terms in the book ‘Statistical Arbitrage’) shows that, given a stationary, IID, random sequence with an underlying Gaussian distribution, betting on a rise after a value below the median and on a fall after a value above it converges to a correct prediction rate of 75%!
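
For the curious, here is a short sketch of where the 75% comes from. The notation is my own, not the book's: X_n are IID draws from a continuous distribution with CDF F and median m, and U = F(X_{n-1}).

```latex
\begin{aligned}
U = F(X_{n-1}) &\sim \mathrm{Uniform}(0,1), \qquad
\Pr[X_n > X_{n-1} \mid U = u] = 1 - u, \\
\Pr[X_n > X_{n-1},\; X_{n-1} < m] &= \int_0^{1/2} (1 - u)\,du = \tfrac{3}{8}, \\
\Pr[X_n < X_{n-1},\; X_{n-1} > m] &= \int_{1/2}^{1} u\,du = \tfrac{3}{8}, \\
\Pr[\text{rule is correct}] &= \tfrac{3}{8} + \tfrac{3}{8} = \tfrac{3}{4}.
\end{aligned}
```

The Gaussian case discussed here is one instance; the sketch itself only leans on the draws being independent, identically distributed and continuous.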

Now, we all know that market data is not Gaussian (nor is it commission/slippage/friction free), and therein lies the rub. But hopefully it gives you some food for thought, as well as a bit of knowledge to retort with when you hear the debates about the impossibility of beating a random walk.
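
To make the friction point concrete, here is a rough sketch that charges a flat cost on every bet the rule makes. The value of `cost_per_trade` is a purely hypothetical number in the same units as the series, and the seed and sample size are arbitrary assumptions as well.

```r
set.seed(7)                         # arbitrary seed (assumption)
n <- 10000                          # arbitrary sample size (assumption)
x <- rnorm(n)                       # stationary IID Gaussian series
m <- median(x)

prev <- x[-n]                       # value at step i
nxt <- x[-1]                        # value at step i + 1

# +1 when the rule bets on a rise (below median), -1 when it bets on a fall
position <- ifelse(prev < m, 1, -1)
gross <- position * (nxt - prev)    # per-step result before costs

cost_per_trade <- 0.5               # hypothetical flat cost per bet
net <- gross - cost_per_trade       # per-step result after costs

print(c(gross = mean(gross), net = mean(net)))
```

The flat cost comes straight off the average result per bet, so an edge that looks comfortable on paper can shrink quickly once real frictions are included.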

The R code is below. Notice that the cumsum of the random variate x is just the integrated version (the random walk) of the underlying sequence. Given a random walk, you can simply take the first difference to coax it back into stationary form.
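
As a quick check of that round trip, the sketch below integrates a Gaussian sequence with `cumsum()` and recovers it with `diff()`. The seed and length are arbitrary assumptions.

```r
set.seed(1)                 # arbitrary seed (assumption) for repeatability
x <- rnorm(10)              # stationary IID Gaussian increments
walk <- cumsum(x)           # integrated series: the random walk

# diff() returns walk[i + 1] - walk[i], which drops the first increment,
# so prepend walk[1] (which equals x[1]) to recover the full series
recovered <- c(walk[1], diff(walk))

print(all.equal(recovered, x))
```

Note that `diff()` returns one fewer element than its input, which is why the first value has to be put back by hand.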

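# Generate a random sequence and apply the 75% rule
set.seed(1)        # assumption: any seed works; fixed here for repeatability
n <- 1000          # assumption: the original sample size is not given
x <- rnorm(n)      # stationary IID Gaussian sequence

m <- median(x)

# Pre-allocate so every element is defined before the loop tests it
rule_below <- rule_above <- change <- numeric(n - 1)

for (i in 1:(n - 1)) {
  # Below the median: bet on a rise; above the median: bet on a fall
  if (x[i] < m) rule_below[i] <- sign(x[i + 1] - x[i])
  if (x[i] > m) rule_above[i] <- sign(x[i + 1] - x[i])
  if (rule_below[i] != 0) change[i] <- x[i + 1] - x[i]
  if (rule_above[i] != 0) change[i] <- x[i] - x[i + 1]
}

plot(cumsum(x), type = 'l', main = 'random walk')
plot(cumsum(change), type = 'l', main = 'rule based equity curve')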
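
If you would rather see a number than a chart, the sketch below is a vectorised take on the same rule that reports the fraction of winning bets. The seed, sample size and variable names (`position`, `win_rate`) are my own assumptions. It differs from the loop only when a value sits exactly on the median, which has probability zero for continuous data.

```r
set.seed(42)                        # arbitrary seed (assumption)
n <- 100000                         # arbitrary sample size (assumption)
x <- rnorm(n)                       # stationary IID Gaussian sequence
m <- median(x)

prev <- x[-n]                       # value at step i
nxt <- x[-1]                        # value at step i + 1

# +1 when the rule bets on a rise (below median), -1 when it bets on a fall
position <- ifelse(prev < m, 1, -1)
change <- position * (nxt - prev)   # positive whenever the bet was right

win_rate <- mean(change > 0)        # fraction of correct bets
print(win_rate)
```

With a sample this large the printed value should sit close to 0.75; smaller samples will wander further from it.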
