
The article https://online.wsj.com/public/resources/documents/VirtuOverview.pdf is a neat little illustration of a simple asymptotic toy distribution given an initial probability of a win or loss per-trade. It is used as an example to illustrate the basic methodology behind the working market-maker business – develop a small edge and scale this up as cheaply as possible to maximise the probability of overall profit.

If we take $p=0.51$ as the probability of a win per trade, then after $n$ transactions we will have a number of 'wins' $k$ that varies from 0 to $n$. We model each trade as an independent 0-1 (Bernoulli) trial, so $k$ follows a binomial distribution.

In order to come out at breakeven or better, the number of wins k needs to be at least $\frac{n}{2}$. Using the binomial distribution this can be modelled as:

$P\left(k \ge \frac{n}{2}\right) = \sum_{k=\lceil n/2 \rceil}^{n} \frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}$
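
As a rough cross-check, the binomial tail can be evaluated exactly in R. This is a minimal sketch rather than part of the original analysis: the names `k_min`, `exact_tail` and `explicit_sum` are illustrative, and $n=100$ is just an example input.

```r
# Example inputs (assumed for illustration)
p <- 0.51
n <- 100

# Smallest number of wins that gives breakeven or better
k_min <- ceiling(n / 2)

# P(k >= k_min) via the binomial upper tail: P(k > k_min - 1)
exact_tail <- pbinom(k_min - 1, size = n, prob = p, lower.tail = FALSE)

# The same quantity as an explicit sum of binomial probabilities
explicit_sum <- sum(dbinom(k_min:n, size = n, prob = p))

print(exact_tail)
print(explicit_sum)
```

Both lines should print the same value. For small $n$ the exact figure can differ noticeably from a normal approximation, since the binomial is discrete.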

As the binomial distribution converges to a normal $\mathcal{N}(np, np(1-p))$ as n gets large, we can use the distribution below to model the win/loss probability over n:

$\int_{\frac{n}{2}}^\infty \mathcal{N}\left(np, np(1-p) \right) dx$

Which is

$\int_{\frac{n}{2}}^\infty \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$

Where $\mu=np$ and $\sigma^2=np(1-p)$
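
As a short worked step (standard normal algebra; $\Phi$ here denotes the standard normal CDF, a symbol not used in the original), standardising the normal approximation gives the breakeven probability in closed form:

$P\left(k \ge \frac{n}{2}\right) \approx 1-\Phi\left(\frac{\frac{n}{2}-np}{\sqrt{np(1-p)}}\right) = \Phi\left(\frac{\sqrt{n}\left(p-\frac{1}{2}\right)}{\sqrt{p(1-p)}}\right)$

For $p=0.51$ the argument is roughly $0.02\sqrt{n}$, i.e. about $0.20$ at $n=100$ and about $0.63$ at $n=1000$. The argument grows like $\sqrt{n}$ whenever $p>\frac{1}{2}$, which is why a small edge becomes more reliable as the number of trades increases.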

This can be modelled in R

> p <- 0.51
> n <- 100
> 1-pnorm(q=n/2, mean=n*p,sd=sqrt(n*p*(1-p)))
[1] 0.5792754
> n <- 1000
> 1-pnorm(q=n/2, mean=n*p,sd=sqrt(n*p*(1-p)))
[1] 0.7364967

This shows that with a win probability of 51%, 100 trades give roughly a 58% probability of breakeven or better, and 1000 trades give roughly a 74% chance of breakeven or better.
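
Going the other way, one can ask how many trades are needed to reach a target probability of breakeven. The following is a minimal sketch, assuming the normal approximation above holds and that $p>0.5$; the function name `trades_needed` and the `target` argument are hypothetical and not part of the original post.

```r
# Smallest n whose normal-approximation breakeven probability is >= target.
# Obtained by inverting 1 - pnorm(n/2, n*p, sqrt(n*p*(1-p))) >= target for n.
# Assumes p > 0.5 and 0.5 < target < 1.
trades_needed <- function(p, target) {
  stopifnot(p > 0.5, target > 0.5, target < 1)
  z <- qnorm(target)
  ceiling((z * sqrt(p * (1 - p)) / (p - 0.5))^2)
}

p <- 0.51
n_req <- trades_needed(p, target = 0.95)

# Check the result with the same formula used in the text
prob_at_n_req <- 1 - pnorm(q = n_req / 2, mean = n_req * p,
                           sd = sqrt(n_req * p * (1 - p)))
print(n_req)
print(prob_at_n_req)
```

The second printed value should sit at or just above the requested target, which confirms the inversion.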

We can plot the probability of breakeven holding p constant and changing n from 1 to 1000:

> n <- seq(1,1000)
> y <- 1-pnorm(q=n/2, mean=n*p,sd=sqrt(n*p*(1-p)))
> library(ggplot2)
> library(scales)
> qplot(n,y)+scale_y_continuous(labels=percent)

Which produces the following graph

The graph shows the probability of profit converging towards 100% as n gets large.

To make it more interesting, we can generate different paths for n from 1 to 10000 while also varying the win probability from, say, 45% to 55%, and look at how the paths change with n and p:

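n <- seq(1,10000)

# Baseline path for p = 0.5, drawn in black
p <- 0.5
y <- 1-pnorm(q=n/2, mean=n*p, sd=sqrt(n*p*(1-p)))
plot(n, y, type='l', ylim=c(0,1))

# Overlay one path per win probability: red below 50%, green above
probs <- seq(0.45, 0.55, length.out = 100)
for (p in probs){
  y <- 1-pnorm(q=n/2, mean=n*p, sd=sqrt(n*p*(1-p)))
  # lines() draws each path in a single colour, so pick it from p directly
  lines(x=n, y=y, col=if (p < 0.5) rgb(1,0,0,.5) else rgb(0,1,0,.5))
}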

Which shows the probabilities of breakeven or better given a number of different starting win/loss probabilities and a varying number of trades. The path with $p=0.5$ is shown in black.
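
For readers who prefer numbers to a plot, the same calculation can be tabulated on a coarse grid. This is a sketch only: the grid values and the names `p_grid`, `n_grid` and `tab` are chosen here for illustration and do not come from the original post.

```r
# Illustrative grids of win probabilities and trade counts (assumed values)
p_grid <- c(0.45, 0.49, 0.50, 0.51, 0.55)
n_grid <- c(10, 100, 1000, 10000)

# Normal-approximation probability of breakeven or better for every (n, p) pair
tab <- outer(n_grid, p_grid, function(n, p) {
  pnorm(n / 2, mean = n * p, sd = sqrt(n * p * (1 - p)), lower.tail = FALSE)
})
dimnames(tab) <- list(n = n_grid, p = p_grid)

print(round(tab, 3))
```

Each row is one trade count and each column one win probability. The $p=0.5$ column stays at 0.5 for every $n$, while the columns on either side drift towards 0 or 1 as $n$ grows.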