# another riddle with a stopping rule

May 26, 2016
(This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers)

A puzzle on The Riddler last week is rather similar to an earlier one. Given the probability (1/2,1/3,1/6) on {1,2,3}, what is the mean of the number N of draws to see all possible outcomes and what is the average number of 1’s in those draws? The second question is straightforward, as the proportions of 1’s, 2’s and 3’s in the sequence till all values are observed remain 3/6, 2/6 and 1/6, so the average number of 1’s is half the mean of N. The first question follows from the representation of the average

$$\mathbb{E}[N]=\sum_{n=0}^{\infty}\mathbb{P}(N>n)$$

as the probability to exceed n is the probability that at least one value is not observed by the n-th draw, namely

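$$\mathbb{P}(N>n)=(1/2)^n+(2/3)^n+(5/6)^n-(1/6)^n-(1/3)^n-(1/2)^n\qquad\text{for }n\ge 1$$

(the two (1/2)^n terms cancel, and this probability equals 1 for n=0,1,2, which accounts for the leading 3 in the sum),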

which leads to an easy summation for the expectation, namely

$$\mathbb{E}[N]=3+\frac{(2/3)^3}{1/3}+\frac{(5/6)^3}{1/6}-\frac{(1/3)^3}{2/3}-\frac{(1/6)^3}{5/6}=\frac{73}{10}$$
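As a quick numerical check of this summation, here is a minimal sketch in R that evaluates the tail probabilities by inclusion-exclusion and adds them up. The names `p`, `tail_prob` and `expected_n` are illustrative and not part of the original post, and the infinite sum is truncated at n=500 on the assumption that the remaining terms are negligible.

```r
# Probabilities of the values 1, 2, 3 as stated in the riddle
p <- c(1/2, 1/3, 1/6)

# P(N > n): at least one value is still missing after n draws (inclusion-exclusion).
# The (n == 0) term covers the event that all three values are missing.
tail_prob <- function(n) sum((1 - p)^n) - sum(p^n) + (n == 0)

# E[N] as the sum of the tail probabilities, truncated at n = 500
# (assumption: the neglected terms, of order (5/6)^500, do not matter)
expected_n <- sum(sapply(0:500, tail_prob))
expected_n  # about 7.3, i.e. 73/10

# Closed form of the same geometric sums, for comparison
1 + sum((1 - p) / p) - sum(p / (1 - p))
```

Both lines should agree with 73/10 up to floating-point error.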

Checking by simulation that the results hold is also straightforward:

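averages <- function(n=1){
# 100 draws from {1,2,3}, copied into the three columns of x
# (prob=1:3 means probabilities (1/6,1/3,1/2), the labels being reversed relative to the text)
x=matrix(sample(1:3,100,rep=TRUE,prob=1:3),100,3)
# columns 1, 3 and 2 become indicators of the values 1, 3 and 2
x[,1]=as.integer(x[,2]<2)
x[,3]=as.integer(x[,2]>2)
x[,2]=1-x[,1]-x[,3]
# the product of the running counts is zero until all three values have appeared
y=apply(apply(x,2,cumsum),1,prod)
m=1+sum(y==0)
# number of occurrences of each value among the first m draws
return(apply(x[1:m,],2,sum))}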


since this gives

mumbl=matrix(0,1e5,3)
for (t in 1:1e5) mumbl[t,]=averages()
> apply(mumbl,2,mean)
[1] 1.21766 2.43265 3.64759
> sum(apply(mumbl,2,mean))
[1] 7.2979
> apply(mumbl,2,mean)*c(6,3,2)
[1] 7.30596 7.29795 7.29518
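
For comparison, a more direct variant of the same check can be sketched as follows, drawing one value at a time until all three have been seen instead of truncating a fixed run of 100 draws. This is only an illustration under stated assumptions: `draw_until_complete` and `sims` are hypothetical names, not from the original post, and the probabilities are taken in the order given in the text, (1/2,1/3,1/6), rather than the reversed labelling implied by `prob=1:3`.

```r
# Hypothetical helper: draw from {1,2,3} until every value has appeared,
# and return how many times each value was drawn
draw_until_complete <- function(p = c(1/2, 1/3, 1/6)) {
  counts <- integer(length(p))
  while (any(counts == 0)) {
    v <- sample.int(length(p), 1, prob = p)  # one draw with probabilities p
    counts[v] <- counts[v] + 1L
  }
  counts
}

set.seed(1)  # arbitrary seed, for reproducibility only
sims <- t(replicate(2e4, draw_until_complete()))  # one row per replication

colMeans(sims)       # average number of 1's, 2's and 3's
sum(colMeans(sims))  # average number of draws N
```

With the stopping rule applied exactly, the column means should be close to 7.3 times (1/2,1/3,1/6), that is about 3.65, 2.43 and 1.22, and their sum close to 7.3.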


