# simulation under zero measure constraints

November 16, 2016
By

(This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers)

A theme that comes up fairly regularly on X validated is the production of a sample with given moments, either for calibration motives or from a misunderstanding of the difference between a distribution mean and a sample average. Here are some entries on that topic:

In most of those questions, the constraint in on the sum or mean of the sample, which allows for an easy resolution by a change of variables. It however gets somewhat harder when the constraint involves more moments or, worse, an implicit solution to an equation. A good example of the later is the quest for a sample with a given maximum likelihood estimate in the case this MLE cannot be derived analytically. As for instance with a location-scale t sample…

Actually, even when the constraint is solely on the sum, a relevant question is the production of an efficient simulation mechanism. Using a Gibbs sampler that changes one component of the sample at each iteration does not qualify, even though it eventually produces the proper sample. Except for small samples. As in this example

``````n=3;T=1e4
s0=.5 #fixed average
sampl=matrix(s0,T,n)
for (t in 2:T){
sampl[t,]=sampl[t-1,]
for (i in 1:(n-1)){
sampl[t,i]=runif(1,
min=max(0,n*s0-sum(sampl[t,c(-i,-n)])-1),
max=min(1,n*s0-sum(sampl[t,c(-i,-n)])))
sampl[t,n]=n*s0-sum(sampl[t,-n])}}``````

For very large samples, I figure that proposing from the unconstrained density can achieve a sufficient efficiency, but the in-between setting remains an interesting problem.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...