Since a new paper that I’ve co-written has appeared on arXiv, here is a quick post summarizing it. The paper is named:
and describes improvements over the Wang-Landau algorithm described by Atchadé and Liu, which is itself a generalization of the work of Wang and Landau (which itself is commonly used in computational physics and well deserves its Wikipedia article). It is joint work with Luke Bornn from UBC, Arnaud Doucet from Oxford and Pierre Del Moral from INRIA Bordeaux.
We focus on the utility of this kind of “exploratory” algorithms for statistical inference. More precisely, suppose that you want to explore (e.g. by simulating from) a complex probability distribution, complex in the sense that most standard algorithms would fail to sample precisely from it, either because it is defined on a very high dimensional state space, or because there are many local modes. The general idea of the Wang-Landau algorithm is that the parts of the state space that have been explored already (e.g. the Markov chain generated by the algorithm has spent some time there) are “penalized”, so that they are less likely to be explored further. Hence, the other parts of the space are “favoured”. That goes on until all the parts (in some sense) have been explored enough. A pretty natural idea!
Now defining these “parts” of the space is a bit tricky, and has proved to be a limit of the method: if the parts are not well designed, the Wang-Landau algorithm does not provide very useful results compared to classical MCMC methods. Part of our work was to design a somewhat “automatic” method to partition the space dynamically during the algorithm, so that the generated chains explore the whole space of interest indeed. Other improvements include the use of parallel chains and adaptive proposal distributions. On top of algorithmic improvements, we argue for a more general use of exploratory methods to prevent getting stuck in local modes without even noticing it.
We try our improved Wang-Landau on four models: a toy example involving a mixture of three bi-variate Gaussian distributions, a variable selection problem, a challenging mixture model with lots of well-separated modes (that’s the figure above), and an Ising model applied to image analysis, where the difficulty is to design efficient moves to travel around the very big space. In all cases we compare to adaptive MCMC with parallel chains, matching the computational cost in terms of target density evaluations.
To make things easier, we provide an R package (my very first!) here:
Hence the graphs of the article can be easily reproduced by launching a few files. We hope that it can also be easily adapted to a lot of target distributions, if you want to try the method on your model! We’ll probably submit to CRAN after some clean-up.
I’m also working on more theoretical aspects of the Wang-Landau algorithm with Robin Ryder, so I’ll blog more on that soon.