An email about mixtures

[This article was first published on Xi'an's Og » R, and kindly contributed to R-bloggers.]

As a coincidence, or not, I received the following email just before starting our mixture estimation workshop (the above is Ben Nevis on Monday, whose skyline really looks like a three-component mixture!) and giving a discussion on label switching:

I am implementing a Markov chain Monte Carlo method, namely Gibbs sampling, for a simple mixture of normals model. I am using a decision-theoretic approach per Stephens (2000), but am also looking at simpler methods. Specifically, I have used:
i) ordering by the component means
and also tried
ii) ordering by the mixture component weights.
I have gotten much better results with the former approach than the latter. I was wondering if (ii) was, in fact, an accepted approach? Per Stephens (2000), it seems ordering is only done by means or variances.

PS: By ordering by the cluster weights w (taken from a Dirichlet distribution, as I believe is standard), I mean:
i) discard the burn-in;
ii) for each iteration, check the order of the w's: if the order satisfies the constraint (e.g., w_1 < w_2 < … < w_n), keep the iteration; otherwise, discard it.

MIsha xxxx

Obviously, any ordering creates an identifiability constraint and is equally acceptable. Or not. Indeed, my opinion on ordering is now the same as it was at the time our 2000 JASA paper got published: the cut (or more exactly quotienting) of the parameter space created by the ordering is not tuned to the topology of the likelihood/posterior surface. Therefore, the resulting subset may well contain incomplete parts of several modes, instead of concentrating on one of the k! modes. The right identifiability constraint should provide a single modal region, but I am unclear whether or not this is at all possible… Anyway, with a large enough number of components, I believe any ordering device, whether it is on mean, variance or weight, will eventually fail.

Note that the above email implements the ordering by discarding wrongly ordered w's. This is unnecessary: when running the MCMC sampler, wrongly ordered w's can simply be reordered by the appropriate permutation. (In R, just use the function order().)
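As a minimal sketch of this relabelling step (the draws below are simulated here for illustration, not taken from the emailer's sampler), each iteration's weights can be sorted with order() and the same permutation applied to the other component parameters:

```r
# Sketch: relabel posterior draws of mixture weights by sorting,
# instead of discarding unordered iterations.
set.seed(42)

k <- 3        # number of mixture components (illustrative)
n_iter <- 100 # number of post-burn-in MCMC iterations (illustrative)

# Hypothetical draws of the weight vector: Dirichlet(1,...,1) via
# normalised Gamma variates; rows = iterations, columns = components
w <- matrix(rgamma(n_iter * k, shape = 1), n_iter, k)
w <- w / rowSums(w)

# Companion draws of the component means, one row per iteration
mu <- matrix(rnorm(n_iter * k), n_iter, k)

# For each iteration, permute components so that w_1 < w_2 < ... < w_k,
# applying the same permutation to the means to keep parameters aligned
for (t in 1:n_iter) {
  perm <- order(w[t, ])   # permutation giving increasing weights
  w[t, ]  <- w[t, perm]
  mu[t, ] <- mu[t, perm]
}

# Every row of w is now in increasing order; no iteration was discarded
all(apply(w, 1, function(x) all(diff(x) >= 0)))
```

The same permutation must be applied to every component-specific parameter (means, variances, allocations) within an iteration, otherwise the relabelled draws no longer correspond to coherent parameter vectors.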

Filed under: pictures, R, Statistics
