Cross validated question

[This article was first published on Xi'an's Og » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Another problem generated by X’validated (on which I spent much too much time!): given an unbiased coin that produced M heads in the first M tosses, what is the expected number of additional tosses needed to get N (N>M) consecutive heads?

Consider the preliminary question of getting a sequence of N heads out of k tosses, with probability 1-p(N,k). The complementary probability is given by the recurrence formula
p(N,k) = \begin{cases} 1 &\text{if } k<N\\  \sum_{m=1}^{N} \frac{1}{2^m}p(N,k-m) &\text{else}\\  \end{cases}
Indeed, my reasoning is that the event of no consecutive N heads out of k tosses can be decomposed according to the first occurrence of a tail out of the first N tosses. Conditioning on whether this first tail occurs at the first, second, …, nth draw leads to this recurrence relation. As I wanted to make sure, I rand the following R code

#no sequence of length N out of k draws
  if (k<N){
    for (j in 1:N)

and got the following check:

 > k=15
> #N=2
> 1-pnk(2,k)-sum(apply(vale[,-1]*vale[,-k],1,max))/10^6
[1] 6.442773e-05
> #N=3
> 1-pnk(3,k)-sum(apply(vale[,-(1:2)]*vale[,-c(1,k)]*vale[,-((k-1):k)],1,max))/10^6
[1] 0.0004090137

Next, the probability of getting the first consecutive N heads in m≥ N tosses is
q(N,m) =\begin{cases}  0 &\text{if }m<N\\  \frac{1}{2^N} &\text{if }m=N\\  \frac{p(N,m-N-1) }{2^{N+1}} &\text{if } N<m\\  \end{cases}
Both first cases are self-explanatory. the third case corresponds to a tail occurring at the m−N−1th draw, followed by N heads, and prohibiting N consecutive heads prior to the m−N−1th toss. When checking by

while (i<=length(SS)){
  if ((SS[i]==1)&&(SS[i-1]==1)){
for (t in 1:20) trobs[t]=qmn(2,t)

I however get a discrepancy shown in the above graph for the cases m=3,4, and N=2, which is be due to the pseudo-clever way I compute the waiting times, removing the extra 1′s… Because the probabilities to wait 3 and 4 times for 2 heads should really be both equal to 1/2³.

Now, the probability to get M heads first and N heads in m≥ N tosses (and no less) is
r(M,N,m) = \begin{cases}  \frac{1}{2^N} &\text{if }m=N\\  0 &\text{if } N<m<N+M\\  \sum_{r=M+1}^{N}\frac{q(N,m-r)}{2^{r}}&\text{if } m\ge N+M  \end{cases}

The third case is explained by the fact that completions of the first sequence of heads must stop (by a tail) before reaching N heads. Hence the conditional probability of waiting m tosses to get N consecutive heads given the first M consecutive heads is

s(M,N,m) = \begin{cases}  \frac{1}{2^{N-M}} &\text{if }m=N\\  0 &\text{if } N<m<N+M\\  \sum_{r=M+1}^{N}\frac{q(N,m-r)}{2^{r-M}}&\text{if } m\ge N+M  \end{cases}
The expected number can then be derived by

\mathfrak{E}(M,N)= \sum_{m=N}^\infty m\, s(M,N,m)



for the number of *additional* steps…

Checking for the smallest values of M and N, I got a reasonable agreement with the theoretical value of 2N+1-2M+1(established on Cross validated). (For larger values of M and N, I had to replace the recursive definition of pnk with a matrix computed once for all.)

Filed under: R, Statistics, University life Tagged: conditioning, heads and tails, R, recursion

To leave a comment for the author, please follow the link and comment on their blog: Xi'an's Og » R. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)