TY - JOUR
T1 - Estimating evolutionary rates using time-structured data
T2 - a general comparison of phylogenetic methods
AU - Duchêne, Sebastián
AU - Geoghegan, Jemma L.
AU - Holmes, Edward C.
AU - Ho, Simon Y. W.
PY - 2016/11/15
Y1 - 2016/11/15
AB - Motivation: In rapidly evolving pathogens, including viruses and some bacteria, genetic change can accumulate over short time-frames. Accordingly, their sampling times can be used to calibrate molecular clocks, allowing estimation of evolutionary rates. Methods for estimating rates from time-structured data vary in how they treat phylogenetic uncertainty and rate variation among lineages. We compiled 81 virus data sets and estimated nucleotide substitution rates using root-to-tip regression, least-squares dating and Bayesian inference. Results: Although estimates from these three methods were often congruent, this largely relied on the choice of clock model. In particular, relaxed-clock models tended to produce higher rate estimates than methods that assume constant rates. Discrepancies in rate estimates were also associated with high among-lineage rate variation, and phylogenetic and temporal clustering. These results provide insights into the factors that affect the reliability of rate estimates from time-structured sequence data, emphasizing the importance of clock-model testing.
UR - http://www.scopus.com/inward/record.url?scp=84995490382&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btw421
DO - 10.1093/bioinformatics/btw421
M3 - Article
C2 - 27412094
SN - 1367-4803
VL - 32
SP - 3375
EP - 3379
JO - Bioinformatics
JF - Bioinformatics
IS - 22
ER -