Abstract.
We present a method for estimating the most general reversible substitution matrix corresponding to a given collection of pairwise aligned DNA sequences. This matrix can then be used to calculate evolutionary distances between pairs of sequences in the collection. If only two sequences are considered, our method is equivalent to that of Lanave et al. (1984). The main novelty of our approach is in combining data from different sequence pairs. We describe a weighting method for pairs of taxa related by a known tree that results in uniform weights for all branches. Our method for estimating the rate matrix results in fast execution times, even on large data sets, and does not require knowledge of the phylogenetic relationships among sequences. In a test case on a primate pseudogene, the matrix we arrived at resembles one obtained using maximum likelihood, and the resulting distance measure is shown to have better linearity than is obtained in a less general model.
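The two-sequence case mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation, and it leaves out their scheme for combining and weighting several sequence pairs. It assumes the usual recipe for a single pair: count aligned base pairs, symmetrize the counts to enforce reversibility, row-normalize to a transition matrix P, and take the matrix logarithm to get a rate matrix scaled by the divergence time. All function names are hypothetical. The logarithm uses a power series that converges only for modest divergence, and every base is assumed to occur at least once.

```python
BASES = "ACGT"


def pair_counts(seq_a, seq_b):
    """Symmetrized 4x4 counts of aligned base pairs; gaps and ambiguity codes are skipped."""
    idx = {b: i for i, b in enumerate(BASES)}
    counts = [[0.0] * 4 for _ in range(4)]
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in idx and y in idx:
            i, j = idx[x], idx[y]
            # Half a count in each direction makes the matrix symmetric,
            # which is what imposes reversibility on the estimate.
            counts[i][j] += 0.5
            counts[j][i] += 0.5
    return counts


def transition_matrix(counts):
    """Return base frequencies pi and the row-normalized matrix P.

    Assumes every base occurs at least once, so no row sum is zero.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    pi = [r / total for r in row_sums]
    P = [[c / r for c in row] for row, r in zip(counts, row_sums)]
    return pi, P


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_log(P, terms=200):
    """log(P) = -sum_{k>=1} (I - P)^k / k, valid when I - P has spectral radius < 1."""
    n = len(P)
    A = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)] for i in range(n)]
    power = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        for i in range(n):
            for j in range(n):
                L[i][j] -= power[i][j] / k
        power = matmul(power, A)
    return L


def distance(pi, R):
    """Expected substitutions per site: -sum_i pi_i * R_ii."""
    return -sum(p * R[i][i] for i, p in enumerate(pi))
```

Here `matrix_log(P)` returns the rate matrix multiplied by the unknown divergence time, so it fixes the rates only up to a common scale; `distance` is the quantity that can be compared across pairs. With a numerical library available, an eigendecomposition or a library matrix logarithm would be the more robust choice than the truncated series.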
Cite this article
Arvestad, L., Bruno, W. Estimation of Reversible Substitution Matrices from Multiple Pairs of Sequences. J Mol Evol 45, 696–703 (1997). https://doi.org/10.1007/PL00006274