Abstract
Using a maximum-likelihood formalism, we have developed a method for reconstructing the sequences of ancestral proteins. Our approach allows the calculation not only of the most probable ancestral sequence but also of the probability of any amino acid at any given node in the evolutionary tree. Because we consider evolution at the amino acid level, we are better able to include the effects of evolutionary pressure and to take advantage of structural information about the protein through the use of mutation matrices that depend on secondary structure and surface accessibility. The computational complexity of this method scales linearly with the number of homologous proteins used to reconstruct the ancestral sequence.
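The idea of assigning a probability to every amino acid at an ancestral node can be illustrated with a minimal sketch for a single alignment column. This is not the authors' implementation. It assumes a fixed, rooted binary tree with known branch lengths, a uniform prior at the root, and an equal-rates (Poisson) substitution model standing in for the structure-dependent mutation matrices described above. The names `transition_prob`, `conditional_likelihood`, and `root_posterior`, the tree encoding, and the branch lengths are all hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = len(AMINO_ACIDS)


def transition_prob(i, j, t):
    """P(residue j at the child | residue i at the parent, branch length t).

    Assumption: equal-rates (Poisson) model, a stand-in for the
    structure-dependent mutation matrices mentioned in the abstract.
    """
    decay = math.exp(-t)
    if i == j:
        return 1.0 / K + (1.0 - 1.0 / K) * decay
    return (1.0 - decay) / K


def conditional_likelihood(node):
    """Return L, where L[a] = P(leaves below node | node has residue a).

    A leaf is a one-letter string; an internal node is a tuple
    (left, t_left, right, t_right). Each node is visited once, so the
    work grows linearly with the number of sequences in the tree.
    """
    if isinstance(node, str):
        return [1.0 if aa == node else 0.0 for aa in AMINO_ACIDS]
    left, t_left, right, t_right = node
    like_left = conditional_likelihood(left)
    like_right = conditional_likelihood(right)
    result = []
    for a in range(K):
        via_left = sum(transition_prob(a, b, t_left) * like_left[b] for b in range(K))
        via_right = sum(transition_prob(a, b, t_right) * like_right[b] for b in range(K))
        result.append(via_left * via_right)
    return result


def root_posterior(tree):
    """Posterior probability of each amino acid at the root (uniform prior)."""
    like = conditional_likelihood(tree)
    prior = 1.0 / K
    total = sum(prior * value for value in like)
    return {aa: prior * value / total for aa, value in zip(AMINO_ACIDS, like)}


# Hypothetical column: three homologs carry alanine, one carries serine.
tree = (("A", 0.1, "A", 0.1), 0.2, ("A", 0.1, "S", 0.1), 0.2)
posterior = root_posterior(tree)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

The most probable ancestral residue is the entry with the largest posterior, while the full dictionary gives the probability of every alternative at that node. Replacing `transition_prob` with a matrix chosen per site according to secondary structure and surface accessibility would move the sketch closer to the approach the abstract describes.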
Abbreviations
- ML: maximum likelihood
- MP: maximum parsimony
- PAM: point-accepted mutations
Cite this article
Koshi, J.M., Goldstein, R.A. Probabilistic reconstruction of ancestral protein sequences. J Mol Evol 42, 313–320 (1996). https://doi.org/10.1007/BF02198858
DOI: https://doi.org/10.1007/BF02198858