Here you will find daily news and tutorials about R, contributed by over 750 bloggers.
There are many ways to follow us - By e-mail:On Facebook: If you are an R blogger yourself you are invited to add your own R content feed to this site (Non-English R bloggers should add themselves- here)

A question from X validated that took me quite a while to fathom and then the solution suddenly became quite obvious:

If a sample taken from an arbitrary distribution on {0,1}⁶ is censored from its (0,0,0,0,0,0) elements, and if the marginal probabilities are know for all six components of the random vector, what is an estimate of the proportion of (missing) (0,0,0,0,0,0) elements?

Since the censoring modifies all probabilities by the same renormalisation, i.e. divides them by the probability to be different from (0,0,0,0,0,0), ρ, this probability can be estimated by looking at the marginal probabilities to be equal to 1, which equal the original and known marginal probabilities divided by ρ. Here is a short R code illustrating the approach that I wrote in the taxi home yesterday night:

#generate vectors
N=1e5
zprobs=c(.1,.9) #iid example
smpl=matrix(sample(0:1,6*N,rep=TRUE,prob=zprobs),ncol=6)
pty=apply(smpl,1,sum)
smpl=smpl[pty>0,]
ps=apply(smpl,2,mean)
cor=mean(ps/rep(zprobs[2],6))
#estimated original size
length(smpl[,1])*cor

A broader question is how many values (and which values) of the sample can be removed before this recovery gets impossible (with the same amount of information).