
As we were completing our arXiv summary about ABC model choice, we were helpfully pointed to a recent CRiSM tech. report by X. Didelot, R. Everitt, A. Johansen and D. Lawson on Likelihood-free estimation of model evidence. This paper is closely related to our study of the performance of the ABC approximation to the Bayes factor, deriving in particular the limiting behaviour for the ratio,

$B_{12}(x) = \dfrac{g_1(x)}{g_2(x)}\,B^S_{12}(x).$
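As a short reminder of where the factor $g_1/g_2$ comes from, assume that $S$ is sufficient for both models, and write $\pi_i$ for the prior and $m_i$ for the marginal likelihood under model $i$ (this notation is introduced here for illustration). Sufficiency gives the factorisation

$f_i(x\mid\theta_i) = g_i(x)\,f^S_i(S(x)\mid\theta_i), \qquad i=1,2,$

and integrating against the prior yields

$m_i(x) = \int f_i(x\mid\theta_i)\,\pi_i(\theta_i)\,\text{d}\theta_i = g_i(x)\int f^S_i(S(x)\mid\theta_i)\,\pi_i(\theta_i)\,\text{d}\theta_i = g_i(x)\,m^S_i(S(x)),$

so that the ratio $m_1(x)/m_2(x)$ is the Bayes factor based on $S(x)$ alone, $B^S_{12}(x)$, multiplied by $g_1(x)/g_2(x)$.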

However, Didelot et al. reach the opposite conclusion from ours, namely that the problem can be solved by a sufficiency argument. Their point is that, when comparing models within exponential families (which is the natural realm for sufficient statistics), it is always possible to build an encompassing model with a sufficient statistic that remains sufficient across models. This construction of Didelot et al. is correct from a mathematical perspective, as seen for instance in the Poisson versus geometric example we first mentioned in Grelaud et al. (2009): adding

$\prod_{i=1}^n x_i!$

to the sum of the observables, so as to form a larger sufficient statistic, produces a ratio $g_1/g_2$ that is equal to 1.
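The Poisson versus geometric case can be checked numerically on a toy sample. The sketch below is only an illustration under stated assumptions: the geometric distribution is taken on {0, 1, 2, ...} with mass $p(1-p)^x$, the function names and the values of `lam` and `p` are arbitrary choices made here, and the enlarged statistic is handled by brute-force enumeration, which is only feasible for very small samples.

```python
from itertools import product
from math import comb, exp, factorial, prod


def g_ratio_sum_only(x):
    """g1(x)/g2(x) when the statistic is the sum S(x) alone."""
    n, s = len(x), sum(x)
    # Poisson: f1(x|lam) / Poisson(n*lam) mass at s = s! / (n^s * prod x_i!)
    g1 = factorial(s) / (n**s * prod(factorial(xi) for xi in x))
    # Geometric on {0,1,...}: f2(x|p) / negative binomial mass at s
    g2 = 1 / comb(n + s - 1, s)
    return g1 / g2


def g_ratio_enlarged(x, lam=1.3, p=0.4):
    """g1(x)/g2(x) for T(x) = (sum x_i, prod x_i!), by enumeration.

    lam and p are arbitrary: by sufficiency the result does not depend on them.
    """
    n, s = len(x), sum(x)

    def stat(y):
        return (sum(y), prod(factorial(yi) for yi in y))

    def f1(y):  # Poisson likelihood of the sample y
        return exp(-n * lam) * lam ** sum(y) / prod(factorial(yi) for yi in y)

    def f2(y):  # geometric likelihood of the sample y
        return p**n * (1 - p) ** sum(y)

    # Every sample sharing T(x) has sum s, so entries never exceed s.
    level_set = [y for y in product(range(s + 1), repeat=n) if stat(y) == stat(x)]
    g1 = f1(x) / sum(f1(y) for y in level_set)
    g2 = f2(x) / sum(f2(y) for y in level_set)
    return g1 / g2


x = (0, 2, 1, 3)
print("sum only:", g_ratio_sum_only(x))
print("enlarged:", g_ratio_enlarged(x))
```

With the sum alone, the ratio generally differs from 1, while both likelihoods are constant on the level sets of the enlarged statistic, which is why the second ratio comes out as 1 up to rounding.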

Nonetheless, we do not think this encompassing property has a direct impact on the performance of ABC model choice. In practice, complex models do not enjoy sufficient statistics (if only because the overwhelming majority of them are not exponential families, with the notable exception of Gibbs random fields where the above agreement graph is derived). There is therefore a strict loss of information in using ABC model choice, due to the reliance on both insufficient statistics and non-zero tolerances. Looking at what happens in the limiting case when one is relying on a common sufficient statistic is a formal study that sheds light on the potentially huge discrepancy between the ABC-based Bayes factor and the true Bayes factor. This is why we consider that finding a solution in this formal case, while a valuable extension of the Gibbs random fields case, does not directly help in understanding the discrepancy found in non-exponential complex models.
