Two methods of inferring (effective) population dynamics from genetic variation are compared: (i) Markov chain Monte Carlo (MCMC; using BEAST); and (ii) integrated nested Laplace approximation (INLA; using the R interface of that name). INLA runs more than 1000 times faster than MCMC and produces the same results in 7 of 10 tests, including the two shown in figure 4.
To convert effective population size to census population size, two further quantities must be known: (i) the generation time; and (ii) the variability in offspring number within the population. The [Kingman-]coalescent-based framework ignores the effects of population structure, recombination and selection.
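The conversion mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the paper: it assumes that the coalescent estimate is the product of effective size and generation time (N_e * tau, in calendar-time units) and that, as in an exchangeable Cannings-type model, N_e = N / sigma^2, where sigma^2 is the variance in offspring number. The function name `census_size` and its parameter names are hypothetical.

```python
def census_size(ne_tau, generation_time, offspring_variance):
    """Convert a scaled effective size (N_e * tau) to an approximate census size N.

    Assumptions (not taken from the paper):
      - ne_tau is expressed in the same time units as generation_time;
      - N_e = N / sigma^2, so N = N_e * sigma^2.
    """
    if generation_time <= 0 or offspring_variance <= 0:
        raise ValueError("generation_time and offspring_variance must be positive")
    ne = ne_tau / generation_time      # remove the generation-time scaling
    return ne * offspring_variance     # undo the offspring-variance reduction


# Example: N_e * tau = 1000 years, tau = 0.5 years, sigma^2 = 2
print(census_size(1000.0, 0.5, 2.0))
```

With an offspring variance of 1 (as in the Wright-Fisher model) the function reduces to dividing by the generation time, which is why both quantities are needed before an estimate can be read as a head count.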
BCI – Bayesian credible interval
CGGP – coalescent grid Gaussian process
EGP – exact Gaussian process
RGGP – regular grid Gaussian process
The goal of phylodynamics, an area on the intersection of phylogenetics and population genetics, is to reconstruct population size dynamics from genetic data. Recently, a series of nonparametric Bayesian methods have been proposed for such demographic reconstructions. These methods rely on prior specifications based on Gaussian processes and proceed by approximating the posterior distribution of population size trajectories via Markov chain Monte Carlo (MCMC) methods. In this paper, we adapt an integrated nested Laplace approximation (INLA), a recently proposed approximate Bayesian inference for latent Gaussian models, to the estimation of population size trajectories. We show that when a genealogy of sampled individuals can be reliably estimated from genetic data, INLA enjoys high accuracy and can replace MCMC entirely. We demonstrate significant computational efficiency over the state-of-the-art MCMC methods. We illustrate INLA-based population size inference using simulations and genealogies of hepatitis C and human influenza viruses.
Julia A. Palacios, Vladimir N. Minin
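To make concrete what "inferring population size from a genealogy" means, the sketch below evaluates the Kingman coalescent log-likelihood for the simplest possible case: a constant effective population size and samples taken at a single time point. This is a deliberate simplification and not the Gaussian-process or INLA machinery of the paper; the function names and the input layout (`intervals[i]` is the time during which `n - i` lineages coexist) are assumptions made for illustration. The effective size is measured in the same time units as the intervals.

```python
import math


def coalescent_loglik(intervals, ne):
    """Log-likelihood of intercoalescent times under a constant effective size ne.

    intervals[i] is the length of the interval with k = n - i lineages,
    where n = len(intervals) + 1 is the number of sampled individuals.
    With k lineages, coalescence occurs at rate C(k, 2) / ne.
    """
    n = len(intervals) + 1
    loglik = 0.0
    for i, t in enumerate(intervals):
        k = n - i
        rate = math.comb(k, 2) / ne
        loglik += math.log(rate) - rate * t  # exponential waiting-time density
    return loglik


def mle_constant_ne(intervals):
    """Closed-form maximum-likelihood estimate of a constant effective size."""
    n = len(intervals) + 1
    total = sum(math.comb(n - i, 2) * t for i, t in enumerate(intervals))
    return total / len(intervals)


# Hypothetical genealogy of 4 tips: intervals with 4, 3 and 2 lineages
intervals = [0.1, 0.2, 0.5]
ne_hat = mle_constant_ne(intervals)
print(ne_hat, coalescent_loglik(intervals, ne_hat))
```

The nonparametric methods described in the abstract replace the single constant `ne` with a time-varying trajectory that has a Gaussian-process prior, so the same likelihood terms appear with a rate that changes along the genealogy.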