ABC model choice not to be trusted [2]

January 27, 2011

(This article was first published on Xi'an's Og » R, and kindly contributed to R-bloggers)

As we were completing our arXiv summary about ABC model choice, we were helpfully pointed to a recent CRiSM technical report by X. Didelot, R. Everitt, A. Johansen and D. Lawson on Likelihood-free estimation of model evidence. This paper is closely related to our study of the performance of the ABC approximation to the Bayes factor, deriving in particular the limiting behaviour for the ratio,

B_{12}(x) = \dfrac{g_1(x)}{g_2(x)}\,B^S_{12}(x).
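
For readers who want the symbols spelled out, the factorisation behind this ratio can be sketched as follows. The notation is an assumption on our part, since it is not defined in this post: S is a statistic that is sufficient for both models, f_i^S is the density of S(x) under model i, and \pi_i is the prior on \theta_i.

```latex
f_i(x \mid \theta_i) = g_i(x)\, f_i^S\bigl(S(x) \mid \theta_i\bigr), \qquad i = 1, 2,

B^S_{12}(x) =
  \dfrac{\int \pi_1(\theta_1)\, f_1^S\bigl(S(x) \mid \theta_1\bigr)\, \mathrm{d}\theta_1}
        {\int \pi_2(\theta_2)\, f_2^S\bigl(S(x) \mid \theta_2\bigr)\, \mathrm{d}\theta_2}.
```

Under these assumptions, integrating each factorised likelihood against its prior pulls g_i(x) out of the integral, which gives the ratio g_1(x)/g_2(x) multiplying the Bayes factor based on S alone.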

However, Didelot et al. reach the opposite conclusion from ours, namely that the problem can be solved by a sufficiency argument. Their point is that, when comparing models within exponential families (which is the natural realm for sufficient statistics), it is always possible to build an encompassing model with a sufficient statistic that remains sufficient across models. This construction of Didelot et al. is correct from a mathematical perspective, as seen for instance in the Poisson versus geometric example we first mentioned in Grelaud et al. (2009): adding

\prod_{i=1}^n x_i!

to the sum of the observables, so as to form a larger sufficient statistic, produces a ratio g_1/g_2 that is equal to 1.
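
The Poisson versus geometric case is small enough to check numerically. The following is a minimal Python sketch, not code from either paper. It assumes the geometric distribution is supported on {0, 1, 2, ...}, and the function names `g_ratio_sum` and `g_ratio_augmented` are hypothetical. The first function evaluates g_1/g_2 when only the sum is used; the second enumerates all samples sharing both the sum and the product of factorials, and so is only practical for tiny samples.

```python
from itertools import product
from math import exp, factorial, lgamma, log, prod


def g_ratio_sum(x):
    """g1(x)/g2(x) when the only summary is S = sum(x).

    Poisson:   g1(x) = S! / (n^S * prod(x_i!))   (multinomial conditional)
    Geometric: g2(x) = 1 / C(n + S - 1, S)       (uniform conditional)
    """
    n, s = len(x), sum(x)
    log_g1 = lgamma(s + 1) - s * log(n) - sum(lgamma(xi + 1) for xi in x)
    log_g2 = lgamma(s + 1) + lgamma(n) - lgamma(n + s)
    return exp(log_g1 - log_g2)


def g_ratio_augmented(x):
    """g1(x)/g2(x) when the summary is (sum(x), prod(x_i!)), by enumeration."""
    n, s = len(x), sum(x)
    t = prod(factorial(xi) for xi in x)
    # Every sample y that shares both summaries with x.
    same = [
        y
        for y in product(range(s + 1), repeat=n)
        if sum(y) == s and prod(factorial(yi) for yi in y) == t
    ]
    # Given the sum, Poisson weights are proportional to 1 / prod(y_i!) and
    # geometric weights are constant; both are constant over `same`.
    poisson_weights = [1.0 / prod(factorial(yi) for yi in y) for y in same]
    geometric_weights = [1.0 for _ in same]
    g1 = (1.0 / t) / sum(poisson_weights)
    g2 = 1.0 / sum(geometric_weights)
    return g1 / g2


if __name__ == "__main__":
    x = [0, 2]
    print(g_ratio_sum(x))        # ratio with the sum alone
    print(g_ratio_augmented(x))  # ratio with the augmented statistic
```

For x = (0, 2), the sum-only ratio can be checked by hand: the Poisson conditional probability is 1/4, the geometric one is 1/3, so g_1/g_2 = 3/4. With the augmented statistic both conditionals are uniform over the same set, hence the ratio of 1.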

Nonetheless, we do not think this encompassing property has a direct impact on the performance of ABC model choice. In practice, complex models do not enjoy sufficient statistics (if only because the overwhelming majority of them are not exponential families, with the notable exception of Gibbs random fields, for which the above agreement graph is derived). There is therefore a strict loss of information in using ABC model choice, due to the reliance both on insufficient statistics and on non-zero tolerances. Looking at what happens in the limiting case when one relies on a common sufficient statistic is a formal study that sheds light on the potentially huge discrepancy between the ABC-based Bayes factor and the true Bayes factor. This is why we consider that finding a solution in this formal case, while a valuable extension of the Gibbs random fields case, does not directly help in understanding the discrepancy found in non-exponential complex models.
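
To make the two sources of information loss concrete, here is a hedged Python sketch of ABC rejection model choice on the Poisson versus geometric pair. It is an illustration under stated assumptions, not the algorithm of any of the papers discussed: the priors (Exp(1) on the Poisson rate, uniform on the geometric probability), the equal prior model weights, the use of the sum as the only summary, and all function names are assumptions introduced here.

```python
import math
import random


def sample_poisson(lam, rng):
    # Knuth's multiplication method; adequate for the small rates used here.
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def sample_geometric(p, rng):
    # Number of failures before the first success, support {0, 1, 2, ...}.
    if p >= 1.0:
        return 0
    u = 1.0 - rng.random()  # in (0, 1], so log(u) is finite
    return int(math.log(u) / math.log1p(-p))


def abc_model_choice(x_obs, n_sims=20000, eps=0, seed=1):
    """Approximate P(model = Poisson | S(x_obs)) by ABC rejection.

    Assumed setup (not from the post): equal prior model weights,
    lambda ~ Exp(1), p ~ Uniform(0, 1], summary S = sum of the sample,
    acceptance when |S_sim - S_obs| <= eps.
    """
    rng = random.Random(seed)
    n, s_obs = len(x_obs), sum(x_obs)
    accepted = [0, 0]  # accepted[0]: Poisson, accepted[1]: geometric
    for _ in range(n_sims):
        m = rng.randrange(2)
        if m == 0:
            lam = rng.expovariate(1.0)
            s = sum(sample_poisson(lam, rng) for _ in range(n))
        else:
            p = 1.0 - rng.random()
            s = sum(sample_geometric(p, rng) for _ in range(n))
        if abs(s - s_obs) <= eps:
            accepted[m] += 1
    total = sum(accepted)
    return accepted[0] / total if total else float("nan")


if __name__ == "__main__":
    print(abc_model_choice([0, 2, 1], eps=0))
    print(abc_model_choice([0, 2, 1], eps=2))
```

Even with eps = 0, the output only targets the posterior probability based on the sum, that is, the Bayes factor without the g_1/g_2 correction. Raising eps adds a second layer of approximation on top of that.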

Filed under: R, Statistics Tagged: ABC, Bayesian model choice, encompassing model, sufficient statistics
