This early morning, just before going out for my daily run around The Parc, I checked X validated for new questions and came upon that one. Namely, how to simulate X a Bin(8,2/3) variate and Y a Bin(18,2/3) such that corr(X,Y)=0.5. (No reason or motivation provided for this constraint.) And I thought the following (presumably well-known) resolution, namely to break the two binomials as sums of 8 and 18 Bernoulli variates, respectively, and to use some of those Bernoulli variates as being common to both sums. For this specific set of values (8,18,0.5), since 8×18=12², the solution is 0.5×12=6 common variates. (The probability of success does not matter.) While running, I first thought this was a very artificial problem because of this occurrence of 8×18 being a perfect square, 12², and cor(X,Y)x12 an integer. A wee bit later I realised that all positive values of cor(X,Y) could be achieved by randomisation, i.e., by deciding the identity of a Bernoulli variate in X with a Bernoulli variate in Y with a certain probability ϖ. For negative correlations, one can use the (U,1-U) trick, namely to write both Bernoulli variates as
in order to minimise the probability they coincide.
I also checked this result with an R simulation
> z=rbinom(10^8,6,.66) > y=z+rbinom(10^8,12,.66) > x=z+rbinom(10^8,2,.66) cor(x,y) > cor(x,y)  0.5000539
Filed under: Books, Kids, pictures, R, Running, Statistics, University life Tagged: binomial distribution, cross validated, inverse cdf, Jacob Bernoulli, Parc de Sceaux, R, random simulation, stackoverflow