Abstract
The analysis of DNA composition and codon usage reveals many factors that influence the evolution of genes and genomes. In this chapter, we show how to use CodonExplorer, a web tool and interactive database that contains millions of genes, to better understand the principles governing evolution at the single-gene and whole-genome levels. We present principles and practical procedures for using analyses of GC content and codon usage frequency to identify highly expressed or horizontally transferred genes and to study the relative contribution of different types of mutation to gene and genome composition. CodonExplorer’s combination of a user-friendly web interface and a comprehensive genomic database makes these diverse analyses fast and straightforward to perform. CodonExplorer is thus a powerful tool that facilitates and automates a wide range of compositional analyses.
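The compositional quantities named in the abstract (GC content and codon usage frequency) can be illustrated with a minimal Python sketch. This is not CodonExplorer's implementation. The function names and the toy sequence are hypothetical, chosen only for illustration, and the code assumes an in-frame coding sequence containing only the unambiguous bases A, C, G, and T.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def split_codons(seq: str) -> list[str]:
    """Split an in-frame sequence into codons, ignoring trailing partial codons."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def gc3(seq: str) -> float:
    """GC content at third codon positions only."""
    third_positions = [codon[2] for codon in split_codons(seq)]
    if not third_positions:
        return 0.0
    return sum(1 for base in third_positions if base in "GC") / len(third_positions)


def codon_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each codon observed in the sequence."""
    counts = Counter(split_codons(seq))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical toy gene: ATG GCG GCC AAA TAA
toy_gene = "ATGGCGGCCAAATAA"
print(f"GC  = {gc_content(toy_gene):.3f}")
print(f"GC3 = {gc3(toy_gene):.3f}")
print(codon_frequencies(toy_gene))
```

Comparing a gene's values against the genome-wide averages is one simple way to flag compositionally unusual genes, although a deviation on its own does not establish high expression or horizontal transfer.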
Acknowledgments
This work was supported in part by the Biophysics and SCR training grants T32GM08759 and T32GM065103 from NIH. CodonExplorer is hosted on the W.M. Keck Foundation Bioinformatics Facility at the University of Colorado, Boulder.
Copyright information
© 2009 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Zaneveld, J., Hamady, M., Sueoka, N., Knight, R. (2009). CodonExplorer: An Interactive Online Database for the Analysis of Codon Usage and Sequence Composition. In: Posada, D. (ed.) Bioinformatics for DNA Sequence Analysis. Methods in Molecular Biology, vol 537. Humana Press. https://doi.org/10.1007/978-1-59745-251-9_10
DOI: https://doi.org/10.1007/978-1-59745-251-9_10
Publisher Name: Humana Press
Print ISBN: 978-1-58829-910-9
Online ISBN: 978-1-59745-251-9
eBook Packages: Springer Protocols