(This article was first published on **Xi'an's Og » R**, and kindly contributed to R-bloggers)

This paper by Weixuan Zhu, Juan Miguel Marín [from Carlos III in Madrid, not to be confused with Jean-Michel Marin, from Montpellier!], and Fabrizio Leisen proposes an alternative to our 2013 PNAS paper with Kerrie Mengersen and Pierre Pudlo on empirical likelihood ABC, or BC_{el}. The alternative is based on Davison, Hinkley and Worton’s (1992) bootstrap likelihood, which relies on a double bootstrap to produce a non-parametric estimate of the distribution of a given estimator of the parameter θ. The construction also includes a smooth curve-fitting step, for which not much description is available in the paper.
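To make the double-bootstrap idea concrete, here is a minimal Python sketch of a bootstrap likelihood in the spirit of Davison, Hinkley and Worton (1992). It is not the authors' implementation. Each first-level bootstrap estimate plays the role of a parameter value, and the second-level bootstrap estimates give a kernel density estimate of the estimator under that value, evaluated at the observed estimate. All function names, the rule-of-thumb bandwidth, the toy Gaussian data and the choice of the sample mean as estimator are my own assumptions. The Nadaraya-Watson smoother merely stands in for the curve-fitting step that the paper does not detail.

```python
import math
import random
import statistics

def resample(data, rng):
    """Draw one bootstrap sample: same size as data, with replacement."""
    return [rng.choice(data) for _ in data]

def gaussian_kde_at(x, points):
    """Gaussian kernel density estimate built from `points`, evaluated at x."""
    n = len(points)
    # Normal-reference rule-of-thumb bandwidth (an assumption of this sketch).
    h = max(1.06 * statistics.stdev(points) * n ** (-0.2), 1e-12)
    total = sum(math.exp(-0.5 * ((x - p) / h) ** 2) for p in points)
    return total / (n * h * math.sqrt(2.0 * math.pi))

def bootstrap_likelihood(data, estimator, n_outer=100, n_inner=100, seed=0):
    """Return the observed estimate and a list of (theta, likelihood) pairs."""
    rng = random.Random(seed)
    theta_hat = estimator(data)
    pairs = []
    for _ in range(n_outer):
        first = resample(data, rng)       # first-level bootstrap sample
        theta_star = estimator(first)     # plays the role of the parameter
        # Second-level bootstrap: distribution of the estimator "under theta_star".
        inner = [estimator(resample(first, rng)) for _ in range(n_inner)]
        pairs.append((theta_star, gaussian_kde_at(theta_hat, inner)))
    return theta_hat, pairs

def smooth_likelihood(pairs, bandwidth):
    """Nadaraya-Watson smoother over the pairs (stand-in for the paper's smoother)."""
    def lik(theta):
        weights = [math.exp(-0.5 * ((theta - t) / bandwidth) ** 2) for t, _ in pairs]
        total = sum(weights)
        if total == 0.0:
            return 0.0
        return sum(w * l for w, (_, l) in zip(weights, pairs)) / total
    return lik

# Toy usage: Gaussian data, sample mean as the (subjectively chosen) estimator.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(30)]
theta_hat, pairs = bootstrap_likelihood(data, statistics.fmean)
lik = smooth_likelihood(pairs, bandwidth=0.1)
print(theta_hat, lik(theta_hat), lik(theta_hat + 1.0))
```

The sketch makes the two calibration choices visible: the estimator passed to `bootstrap_likelihood` and the bandwidth of the smoother both have to be picked by the user.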

“…in contrast with the empirical likelihood method, the bootstrap likelihood doesn’t require any set of subjective constrains taking advantage from the bootstrap methodology. This makes the algorithm an automatic and reliable procedure where only a few parameters need to be specified.”

The spirit is indeed quite similar to ours in that a non-parametric substitute plays the role of the actual likelihood, with no correction for the substitution. Both approaches are convergent, with similar or identical convergence speeds. While the empirical likelihood relies on a choice of parameter-identifying constraints, the bootstrap version starts directly from the [subjectively] chosen estimator of θ. For it indeed needs to be *chosen*. And computed.

“Another benefit of using the bootstrap likelihood (…) is that the construction of bootstrap likelihood could be done once and not at every iteration as the empirical likelihood. This leads to significant improvement in the computing time when different priors are compared.”

This is an improvement that could apply to the empirical likelihood approach as well, once a large enough collection of likelihood values has been gathered. But only in small enough dimensions, where smooth curve-fitting algorithms can operate. The same criticism applies to the derivation of a non-parametric density estimate for the distribution of the estimator of θ. Critically, the paper only processes examples with a few parameters.
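The reuse argument can be illustrated with a short Python sketch, under loud assumptions. The tabulated curve below is a made-up bell-shaped stand-in for a likelihood surrogate that has already been fitted once; it is not output from either paper. The two priors and all function names are hypothetical as well. The point is only that, once the curve is stored, switching priors costs a new set of prior draws and weights, with no further bootstrap or empirical likelihood computation.

```python
import bisect
import math
import random

def tabulated_likelihood(grid, values):
    """Piecewise-linear interpolation of stored likelihood values; zero off the grid."""
    def lik(theta):
        if not grid[0] <= theta <= grid[-1]:
            return 0.0
        i = min(bisect.bisect_right(grid, theta), len(grid) - 1)
        a, b = grid[i - 1], grid[i]
        return values[i - 1] + (values[i] - values[i - 1]) * (theta - a) / (b - a)
    return lik

def weighted_posterior_mean(lik, prior_sampler, n_draws, rng):
    """Draw from the prior, weight each draw by the stored likelihood surrogate."""
    draws = [prior_sampler(rng) for _ in range(n_draws)]
    weights = [lik(t) for t in draws]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no prior draw received positive likelihood weight")
    return sum(w * t for w, t in zip(weights, draws)) / total

# Stand-in surrogate, computed "once": a bell curve centred at 2.0 (assumption).
grid = [1.0 + 0.05 * k for k in range(41)]
values = [math.exp(-0.5 * ((g - 2.0) / 0.2) ** 2) for g in grid]
lik = tabulated_likelihood(grid, values)

# Two hypothetical priors compared against the same stored surrogate.
rng = random.Random(42)
mean_flat = weighted_posterior_mean(lik, lambda r: r.uniform(0.0, 4.0), 20000, rng)
mean_informative = weighted_posterior_mean(lik, lambda r: r.gauss(1.5, 1.0), 20000, rng)
print(mean_flat, mean_informative)
```

In one dimension the table lookup is trivial. The dimension caveat above is precisely that building such a table, or any smooth fit, becomes hard as the number of parameters grows.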

In the comparisons between BC_{el} and BC_{bl} that are produced in the paper, the advantage indeed goes to BC_{bl}. Since this paper is mostly based on examples and illustrations, not unlike ours, I would like to see more details on the calibration of the non-parametric methods and of regular ABC, as well as on the computing time. And on the variability of both methods over more than a *single* Monte Carlo experiment.

I am however uncertain as to how the authors process the population genetics example. They refer to the composite likelihood used in our paper to set the moment equations. Since this is not the true likelihood, how do the authors select their parameter estimates in the double-bootstrap experiment? The inclusion of Crackel and Flegal’s (2013) bivariate Beta is somewhat superfluous, as this example sounds to me like an artificial setting.

In the case of the Ising model, maybe the pre-processing step in our paper with Matt Moores could be compared with the other algorithms. In terms of BC_{bl}, how does the bootstrap operate on an Ising model, i.e., (a) how does one subsample pixels and (b) what are the validity guarantees?

A test that would be of interest is to start from a standard ABC solution, use this solution as the reference estimator of θ, and then apply BC_{bl} to that estimator. Given that the reference table would have to be produced only once, this would not necessarily increase the computational cost by a large amount…

