Summary
We have investigated the relationship between the G + C content of silent (synonymous) sites in codons and the amino acid composition of encoded proteins for approximately 1,600 human genes. There are positive correlations between silent site G + C and the proportions of codons for Arg, Pro, Ala, Trp, His, Gln, and Leu and negative ones for Tyr, Phe, Asn, Ile, Lys, Asp, Thr, and Glu. The median proteins coded by groups of genes that differ in silent-site G + C content also differ in amino acid composition, as do some proteins coded by homologous genes. The pattern of compositional change can be largely explained by directional mutation pressure, the genetic code, and differences in the frequencies of accepted amino acid substitutions; the shifts in protein composition are likely to be selectively neutral.
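The two quantities being compared can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes that the third codon position of amino acids with more than one codon is a usable proxy for "silent sites", which may differ from the definition used in the paper. The function names (`gc3`, `amino_acid_proportions`, `pearson`) and the toy sequences in the usage note are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons(cds):
    """Split a coding sequence into complete, in-frame codons."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def gc3(cds):
    """Fraction of G or C at third codon positions (assumed silent-site proxy).

    Stop codons and the single-codon amino acids Met and Trp are skipped,
    because their third position cannot change silently.
    """
    third = [c[2] for c in codons(cds) if CODON_TABLE.get(c, "*") not in "*MW"]
    if not third:
        raise ValueError("no codons with a synonymous third position")
    return sum(base in "GC" for base in third) / len(third)


def amino_acid_proportions(cds):
    """Proportion of sense codons that encode each amino acid."""
    counts = Counter(CODON_TABLE[c] for c in codons(cds) if CODON_TABLE.get(c, "*") != "*")
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

As a usage illustration with made-up sequences (not data from the study), `gc3("ATGGCCCGCTAA")` scores the third positions of GCC and CGC, and `amino_acid_proportions("ATGGCCCGCTAA")` reports Met, Ala, and Arg in equal proportions. Over a set of genes, `pearson` applied to the list of `gc3` values and the list of proportions for one amino acid gives the kind of per-residue correlation the summary describes.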
Additional information
Offprint requests to: D.W. Collins
About this article
Cite this article
Collins, D.W., Jukes, T.H. Relationship between G + C in silent sites of codons and amino acid composition of human proteins. J Mol Evol 36, 201–213 (1993). https://doi.org/10.1007/BF00160475