# The ARORA guessing game

November 9, 2010
(This article was first published on Portfolio Probe » R language, and kindly contributed to R-bloggers)

## The game

ARORA (A random or real array) is a website that gives you two time series at a time. Your job is to guess which series is real market data and which is permuted data.  It’s fun — try it.

With some practice you will probably be able to guess which is which well above chance. I have a hypothesis or two about why. But before you read my hypotheses, you should try it out yourself without my contamination.

## Some hypotheses

The original paper describing the experiment is “Is it real or is it randomized?: A financial Turing test” by Jasmina Hasanhodzic, Andrew Lo and Emanuele Viola.

My hypotheses are explained in the working paper “Some hypotheses about ARORA, the financial Turing test”.

Do you think I’m right?

Do you have other hypotheses?

## Do it yourself in R

Generating a random series in (presumably) the way that ARORA does it is easy in R.  Here is some code that imitates a static version of a single ARORA test:

> par(mfcol=c(2,1))
> plot(priceSeries, type="l", axes=FALSE, xlab='',
+     ylab=''); box()
> plot(exp(c(0, cumsum(sample(diff(log(priceSeries)))))),
+     type="l", axes=FALSE, xlab='', ylab=''); box()

The par command sets up the graphics page to have two plots on it, one above the other.  (In this case it doesn’t matter if you use mfcol or mfrow.)

The first plot command plots the price series as a line with the usual labeling of the axes removed.  The box command draws a box around the plot. The plot function normally draws this box itself, but it omits the box when the axes are suppressed with axes=FALSE.

The second plot command contains all of the computation of the random series.  We can explain it by starting on the inside and working outwards.

diff(log(priceSeries)) computes log returns from the series (see “A tail of two returns”).  Then the sample command does a random permutation of those numbers. The cumsum function performs a cumulative sum of the permuted returns.  We add a zero onto the front of that vector of numbers, and finally use exp to go from cumulative log returns back to prices (starting at 1).
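The same computation can be unrolled one step at a time. The sketch below is only illustrative: priceSeries here is a made-up geometric random walk (the code above assumes you already have a real price vector), and the intermediate names logRet, shuffled and randomSeries are mine. It also checks one consequence of permuting the returns: the random series ends at the same relative price as the real one, because the permuted returns have the same sum.

```r
# Hypothetical stand-in for real data: a simulated price series.
set.seed(42)
priceSeries <- 100 * exp(cumsum(rnorm(250, mean = 0, sd = 0.01)))

logRet <- diff(log(priceSeries))             # log returns, one fewer than prices
shuffled <- sample(logRet)                   # random permutation of the returns
randomSeries <- exp(c(0, cumsum(shuffled)))  # back to prices, starting at 1

# Same length as the original, and same total return.
length(randomSeries) == length(priceSeries)
all.equal(randomSeries[length(randomSeries)],
          priceSeries[length(priceSeries)] / priceSeries[1])
```

Only the path between the two endpoints differs between the real and the random series, which is part of what makes the guessing game possible at all.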

A more polished version would add a mar argument to the call to par in order to not waste so much space in the resulting graphic.
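One possible version of that polish, as a sketch rather than anything ARORA itself does: the margin values are my guess at something reasonable, and priceSeries is simulated so that the block runs on its own. The mar setting gives the margins in lines of text, in the order bottom, left, top, right; the default is c(5, 4, 4, 2) + 0.1.

```r
# Hypothetical stand-in for real data: a simulated price series.
set.seed(1)
priceSeries <- 100 * exp(cumsum(rnorm(250, sd = 0.01)))

# Shrink the margins; save the old settings so they can be restored.
oldPar <- par(mfcol = c(2, 1), mar = c(0.5, 0.5, 0.5, 0.5))
plot(priceSeries, type = "l", axes = FALSE, xlab = "", ylab = ""); box()
plot(exp(c(0, cumsum(sample(diff(log(priceSeries)))))),
     type = "l", axes = FALSE, xlab = "", ylab = ""); box()
par(oldPar)  # put the graphics settings back
```

Since the axes and labels are suppressed anyway, nearly all of the default margin is empty space, so small values lose nothing.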

## The real question

Figure 1: Panthera onca.

Does it matter if you can tell a Panthera onca from a Panthera pardus?  Probably not (though knowing what continent you’re on might be useful).  What matters is if you can outrun her if she decides to eat you.  (Probably not.)

The ability to distinguish the real data series has been used to give credence to chartists.  While the opposite result would tend to rule out the efficacy of chart-reading, I’m not convinced that being able to spot the real series is especially supportive of chartism.

The real task is to tell where a price series is going.

A test of that is the Technical Analysis Challenge on the Burns Statistics website.  This is another multiple choice game that you can play yourself.  However, it isn’t as nicely presented as ARORA.  You are given a price series and four possible extensions of that series.  Only one of the four, of course, is the correct extension.

Among the few people who officially entered the challenge, there was no indication of skill at guessing the extensions.  (Except for a certain someone who industriously cheated.)

## Epilogue

Thanks to Lisa Goldberg for pointing out ARORA.

Photo from stock.xchng.
