Summary
The course of evolutionary change in DNA sequences has been modeled as a Markov process. The Markov process was represented by discrete-time matrix methods. The parameters of the Markov transition matrices were estimated by least-squares direct-search optimization of the fit of the calculated divergence matrix to that observed for two aligned sequences. The Markov process corrected for multiple and parallel substitutions of bases at the same site. The method avoided the incorrect assumption of all previously described methods that the divergence between two present-day sequences is twice the divergence of either from the common and unknown ancestral sequence. The three previous methods were shown to be equivalent. The present method also avoided the undesirable assumptions that sequence composition has not changed with time and that the substitution rates in the two descendant lineages were the same. It permitted simultaneous estimation of ancestral sequence composition and, if applicable, of different substitution rates for the two descendant lineages, provided the total number of estimated parameters was less than 16. Properties of the Markov chain were discussed. It was proved for symmetric substitution matrices that all elements of the equilibrium divergence matrix equal 1/16, and that the total difference in the divergence matrix at epoch k equals the total change in the common substitution matrix at epoch 2k for all values of k. It was shown how to resolve an ambiguity in the assignment of two different substitution rates to the two descendant lineages when four or more similar sequences are available. The method was applied to the divergence matrix for codon site 3 for the mouse and rabbit beta-globins. This observed divergence matrix was significantly asymmetric and required at least two different substitution rates. This result could be achieved only by using different asymmetric substitution matrices for the two lineages.
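The summary's central objects can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the divergence matrix has the form D[a][b] = sum over ancestral bases i of pi[i] * (P1^k)[i][a] * (P2^k)[i][b], where pi is the ancestral composition and P1, P2 are row-stochastic substitution matrices for the two lineages. The function names, the one-parameter symmetric matrix, the fixed epoch count, and the crude grid search (standing in for the direct-search optimizer mentioned above) are all hypothetical choices made for illustration.

```python
# Sketch only: the model form, names, and parameter values are illustrative assumptions.

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(p, k):
    """Return p raised to the k-th power (k discrete epochs)."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, p)
    return result

def divergence(pi, p1, p2, k):
    """Assumed form: D[a][b] = sum_i pi[i] * (p1^k)[i][a] * (p2^k)[i][b]."""
    q1, q2 = matpow(p1, k), matpow(p2, k)
    n = len(pi)
    return [[sum(pi[i] * q1[i][a] * q2[i][b] for i in range(n))
             for b in range(n)] for a in range(n)]

def symmetric_matrix(rate):
    """Hypothetical one-parameter symmetric matrix: every off-diagonal entry is `rate`."""
    return [[1.0 - 3.0 * rate if i == j else rate for j in range(4)]
            for i in range(4)]

def sum_squares(d_calc, d_obs):
    """Least-squares misfit between calculated and observed divergence matrices."""
    return sum((c - o) ** 2
               for row_c, row_o in zip(d_calc, d_obs)
               for c, o in zip(row_c, row_o))

# Assumed, non-uniform ancestral composition.
pi = [0.1, 0.2, 0.3, 0.4]
p = symmetric_matrix(0.01)

# After many epochs every entry approaches 1/16 when the matrix is symmetric.
d_late = divergence(pi, p, p, 2000)
max_dev = max(abs(x - 1 / 16) for row in d_late for x in row)
print("max deviation from 1/16:", max_dev)

# Synthetic "observed" matrix generated with a known rate, then recovered by
# a crude grid search over candidate rates (epoch count fixed at 10 by assumption).
d_obs = divergence(pi, symmetric_matrix(0.02), symmetric_matrix(0.02), 10)
candidates = [i / 1000 for i in range(1, 101)]
best_rate = min(
    candidates,
    key=lambda r: sum_squares(
        divergence(pi, symmetric_matrix(r), symmetric_matrix(r), 10), d_obs),
)
print("recovered rate:", best_rate)
```

Under this assumed form, using the same matrix for both lineages always yields a symmetric divergence matrix, which is consistent with the summary's remark that an asymmetric observed matrix calls for different substitution matrices in the two lineages. The epoch count is held fixed here because rate and elapsed time cannot be separated from a single pairwise comparison in this toy setup.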
Cite this article
Blaisdell, B.E. A method of estimating from two aligned present-day DNA sequences their ancestral composition and subsequent rates of substitution, possibly different in the two lineages, corrected for multiple and parallel substitutions at the same site. J Mol Evol 22, 69–81 (1985). https://doi.org/10.1007/BF02105807
DOI: https://doi.org/10.1007/BF02105807