Smooth skyride through a rough skyline: Bayesian coalescent-based inference of population dynamics

VN Minin, EW Bloomquist, MA Suchard
Molecular Biology and Evolution, 2008 - academic.oup.com
Abstract
Kingman's coalescent process opens the door for estimation of population genetics model parameters from molecular sequences. One paramount parameter of interest is the effective population size. Temporal variation of this quantity characterizes the demographic history of a population. Because researchers are rarely able to choose a priori a deterministic model describing effective population size dynamics for the data at hand, nonparametric curve-fitting methods based on multiple change-point (MCP) models have been developed. We propose an alternative to change-point modeling that exploits Gaussian Markov random fields to achieve temporal smoothing of the effective population size in a Bayesian framework. The main advantage of our approach is that, in contrast to MCP models, the explicit temporal smoothing does not require strong prior decisions. To approximate the posterior distribution of the population dynamics, we use efficient, fast-mixing Markov chain Monte Carlo algorithms designed for highly structured Gaussian models. In a simulation study, we demonstrate that the proposed temporal smoothing method, named Bayesian skyride, successfully recovers “true” population size trajectories in all simulation scenarios and competes well with the MCP approaches without invoking strong prior assumptions. We apply our Bayesian skyride method to two real data sets. We analyze sequences of hepatitis C virus contemporaneously sampled in Egypt, reproducing all key known aspects of the viral population dynamics. Next, we estimate the demographic histories of human influenza A hemagglutinin sequences, serially sampled throughout three flu seasons.
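The abstract does not spell out the model, so the following is only a minimal sketch of the general idea of temporal smoothing with a Gaussian Markov random field, not the authors' implementation. It assumes contemporaneously sampled sequences, one log effective population size per inter-coalescent interval, and a first-order random-walk prior with uniform weights. The function names, the `precision` parameter, and the uniform weighting are all illustrative assumptions.

```python
import math


def coalescent_log_likelihood(waiting_times, log_sizes):
    """Log-likelihood of inter-coalescent waiting times under Kingman's
    coalescent with a piecewise-constant population size.

    Assumption: waiting_times[0] is the most recent interval, all samples are
    contemporaneous, and log_sizes[k] is the log effective population size
    during interval k.
    """
    if len(waiting_times) != len(log_sizes):
        raise ValueError("need one log population size per interval")
    n = len(waiting_times) + 1  # number of sampled sequences
    total = 0.0
    for k, (w, gamma) in enumerate(zip(waiting_times, log_sizes)):
        lineages = n - k                      # lineages alive in interval k
        pairs = lineages * (lineages - 1) / 2.0
        rate = pairs / math.exp(gamma)        # coalescence rate in interval k
        total += math.log(rate) - rate * w    # exponential waiting-time density
    return total


def gmrf_log_prior(log_sizes, precision):
    """Log-density (up to an additive constant) of a first-order random-walk
    GMRF that penalizes differences between neighboring log population sizes.

    Assumption: uniform weights between neighbors; `precision` controls how
    strongly the trajectory is smoothed.
    """
    m = len(log_sizes)
    penalty = sum((b - a) ** 2 for a, b in zip(log_sizes, log_sizes[1:]))
    return 0.5 * (m - 1) * math.log(precision) - 0.5 * precision * penalty


def log_posterior(waiting_times, log_sizes, precision):
    """Unnormalized log-posterior of the trajectory for a fixed precision."""
    return (coalescent_log_likelihood(waiting_times, log_sizes)
            + gmrf_log_prior(log_sizes, precision))


if __name__ == "__main__":
    waits = [0.1, 0.2, 0.5, 1.0]          # hypothetical waiting times
    smooth = [0.0, 0.1, 0.2, 0.3]         # slowly varying trajectory
    rough = [0.0, 2.0, -2.0, 2.0]         # jagged trajectory
    print(gmrf_log_prior(smooth, 1.0) > gmrf_log_prior(rough, 1.0))
```

In a full analysis the precision would itself receive a prior and the trajectory would be sampled by Markov chain Monte Carlo; this sketch only evaluates the target density for a fixed trajectory.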
Oxford University Press