Abstract.
Algorithmic details to obtain maximum likelihood estimates of parameters on a large phylogeny are discussed. On a large tree, an efficient approach is to optimize branch lengths one at a time while updating parameters in the substitution model simultaneously. Codon substitution models that allow for variable nonsynonymous/synonymous rate ratios (ω=d N/d S) among sites are used to analyze a data set of human influenza virus type A hemagglutinin (HA) genes. The data set has 349 sequences. Methods for obtaining approximate estimates of branch lengths for codon models are explored, and the estimates are used to test for positive selection and to identify sites under selection. Compared with results obtained from the exact method estimating all parameters by maximum likelihood, the approximate methods produced reliable results. The analysis identified a number of sites in the viral gene under diversifying Darwinian selection and demonstrated the importance of including many sequences in the data in detecting positive selection at individual sites.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Author information
Authors and Affiliations
Additional information
Received: 25 April 2000 / Accepted: 24 July 2000
Rights and permissions
About this article
Cite this article
Yang, Z. Maximum Likelihood Estimation on Large Phylogenies and Analysis of Adaptive Evolution in Human Influenza Virus A. J Mol Evol 51, 423–432 (2000). https://doi.org/10.1007/s002390010105
Issue Date:
DOI: https://doi.org/10.1007/s002390010105