Summary
A codon-based approach to estimating the number of variable sites in a protein is presented. When first and second positions of codons are assumed to be replacement positions, a capture-recapture model can be used to estimate the number of variable codons from every pair of homologous and aligned sequences. The capture-recapture estimate is compared to a maximum likelihood estimate of the number of variable codons and to previous approaches that estimate the number of variable sites (not codons) in a sequence. Computer simulations are presented that show under which circumstances the capture-recapture estimate can be used to correct biases in distance matrices. Analysis of published sequences of two genes, calmodulin and serum albumin, shows that distance corrections that employ a capture-recapture estimate of the number of variable sites may be considerably different from corrections that assume that the number of variable sites is equal to the total number of positions in the sequence.
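The capture-recapture idea in the summary can be illustrated with a minimal sketch. It assumes, as one plausible reading of the summary rather than the authors' confirmed procedure, that a codon counts as "captured" when the two sequences differ at its first position and "recaptured" when they differ at its second position, and that the classical Lincoln-Petersen estimator and Chapman's bias-corrected variant from the capture-recapture literature are applied to those counts. The function names, and the requirement that the alignment be gap-free and in frame, are assumptions made for illustration.

```python
def codon_change_counts(seq_a, seq_b):
    """Count codons differing at position 1 (n1), position 2 (n2), and both (m).

    Assumes two aligned, gap-free, in-frame coding sequences of equal length.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal in length, and a multiple of 3")
    n1 = n2 = m = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        diff_first = codon_a[0] != codon_b[0]    # "captured" (assumed convention)
        diff_second = codon_a[1] != codon_b[1]   # "recaptured" (assumed convention)
        n1 += diff_first
        n2 += diff_second
        m += diff_first and diff_second
    return n1, n2, m


def lincoln_petersen(n1, n2, m):
    """Classical estimate of population size: N = n1 * n2 / m."""
    if m == 0:
        raise ValueError("no codon changed at both positions; estimate is undefined")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-corrected variant: N = (n1 + 1)(n2 + 1) / (m + 1) - 1."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Toy pair of six codons (hypothetical data, not from the paper).
seq_a = "AAAAAAAAAAAAAAAAAA"
seq_b = "CAACCAACACCAAAAAAC"
n1, n2, m = codon_change_counts(seq_a, seq_b)
estimated_variable_codons = lincoln_petersen(n1, n2, m)
```

In this toy pair, three codons differ at the first position, three at the second, and two at both, so the Lincoln-Petersen estimate of the number of variable codons is 4.5 out of 6. The estimate is undefined when no codon has changed at both positions, which is why the sketch raises an error in that case; Chapman's variant stays finite there.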
Additional information
Offprint requests to: A. Sidow
Cite this article
Sidow, A., Nguyen, T. & Speed, T.P. Estimating the fraction of invariable codons with a capture-recapture method. J Mol Evol 35, 253–260 (1992). https://doi.org/10.1007/BF00178601
DOI: https://doi.org/10.1007/BF00178601