Abstract
The natural circular code consists of 20 codons (X0) overrepresented in the coding frame of protein-coding genes as compared to remaining noncoding frames, and X1 and X2 (N1N2N3 → N3N1N2 and N1N2N3 → N2N3N1 permutations of X0, overrepresented in + 1 and − 1 frames of protein-coding genes, not self-complementary). X0, X1 and X2 detect ribosomal, + 1 and − 1 frames. X0 spontaneously emerges in the 25 theoretical minimal RNA rings, 22-nucleotide-long circular RNAs designed to code once for each of the genetic code’s coding signals (a start, a stop and each of the 20 amino acids) by three overlapping frames. RNA rings presumed ancient are biased for X1, and bias for X0 increases in presumed recent RNA rings, indicating an evolutionary X1-to-X0 switch. Here, analyses explore biases for X0, X1 and X2 in non-redundant nucleotide tetra- and pentamers, for different genetic codes. Biases for X0 occur in non-redundant nucleotide pentamers and seem stronger in nuclear than mitochondrial genetic codes; tendencies are opposite for X1. Strand-asymmetric replication presumably causes mitogenomes to escape Chargaff’s rule which expects ratios A/T = G/C = 1 in single-stranded sequences. Hence, presumably X1 emerged in ancient genetic codes used in single-stranded protogenomes/coding RNAs; the self-complementary X0 presumably evolved secondarily with double-stranded genomes and strand-symmetric replication. Results indicate that selection for non-redundant overlap coding in short nucleotide sequences produced the natural circular code.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
The genetic code, meaning the assignment of codons to 22 coding signals (20 amino acids, starts and stops), is not fixed, and different taxa have different genetic codes (Elzanowski and Ostell 2019, https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi). Some extensive recodings involve stop codons and occur in some individuals (mitochondria of an olive ridley (Lepidochelys olivacea, Seligmann 2012a, b) and of a stonefly (Aleurodicus dispersus, Seligmann 2018a); HIV inactivation in HIV-resistant patients (Colson et al 2014). In genetic code evolution, the most frequent codon reassignments involve start and stop codons, showing that punctuation codes within genetic codes (El Houmami and Seligmann 2017) evolve the most (Seligmann 2015, 2018b). Stop codon assignment is optimized for regulating frameshifted translation (Seligmann and Pollock 2004; Itzkovitz and Alon 2007; Tse et al 2010; Křížek and Křížek 2012; Seligmann 2012a, b; Abrahams and Hurst 2018). This involves also increased densities of off-frame stop codons in protein-coding genes immediately downstream of shifty codons (homopolymer codons AAA, CCC, GGG, TTT) (Seligmann 2019).
Note that the genetic code seems designed to enable a relative conservation of amino acid properties after frameshifts (Wang et al 2015, 2018; Geyer and Mamlouk 2018; Bartonek et al. 2019). Beyond this, overlapping frames of the same gene potentially code for many random combinations of proteins belonging to very different protein families (Opuu et al 2017).
The Natural Circular Code
Start and stop codons signal translational initiation and termination; hence, they signal frontiers between genes, and between genes and noncoding sequences. In addition, off-frame stops regulate frameshifted ribosomal translation after ribosomal slippage (Seligmann 2007, 2010). Other punctuation signals exist: Within genes, the most frequently observed frontiers between exons and introns are two heptamers, 3′-GGTAAGT-5′ and 5′-TTCA(G)GA-3′ (present, respectively, in the D-loop and Tψ-loop of many tRNAs Demongeot and Norris 2019). A punctuation system also regulates ribosomal frame before frameshifts: The natural circular code (Arquès and Michel 1996) enables detecting the ribosomal translational frame (Ahmed et al 2010).
The subsets X = X0, X1 and X2, of each 20 trinucleotides, are found with overrepresentation in, respectively, the frames 0 (reading frame), 1 (frame 0 shifted by one nucleotide) and 2 (frame 0 shifted by two nucleotides) in genes of both prokaryotes and eukaryotes. These groups of overrepresented codons are non-overlapping, meaning that no codon belonging to one of these groups is found in any of the two remaining groups. No homopolymer nucleotide triplet is included in any of these three groups of 20 nucleotide triplets. The set X0 contains, for example, the following 20 trinucleotides: X0 = {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC}.
These X0, X1 and X2 present peculiar mathematical properties. Each X0, X1 (overrepresented in frame 1) and X2 (overrepresented in frame 2) are circular codes, meaning that they enable detecting the frame of a sequence (Fimmel and Strüngmann 2016, 2018). There are only 64 nucleotide triplets, and homopolymer triplets cannot be included in circular codes. Hence, no circular code based on triplet codons includes more than 20 triplets. Therefore, X0, X1 and X2 are maximal circular codes (Michel and Pirillo 2010; Gonzalez et al 2011, 2017; Michel et al 2016).
Each codon from X0 has its reverse-complement codon among the remaining 19 codons belonging to X0; hence, X0 is a maximal self-complementary circular code (Fimmel et al 2018). A relation exists between X0, X1 and X2. Nucleotide triplets in X1 result from the N1N2N3 → N3N1N2 permutation of X0 codons, and those in X2 result from the N1N2N3 → N2N3N1 permutation of X0 codons. Hence, the natural circular code empirically discovered in protein-coding genes is a maximal, self-complementary C3 circular code (Michel 2012). The C3 indicates that the N1N2N3 → N3N1N2 and the N1N2N3 → N2N3N1 permutations of X0 are also maximal circular codes.
Comma-free codes are particularly efficient circular codes, as a single codon belonging to a comma-free code detects the coding frame. Regular circular codes require longer sequences for frame detection (Fimmel et al 2017). Early on, it was believed that the genetic code was a comma-free code (Crick et al 1957), but it seems the genetic code optimizes between frame disambiguation and coding flexibility. Nevertheless, the natural circular code includes a subset of four codons (CAG, CTG, CTC and GAG) that form a comma-free code (Ahmed et al 2010).
Bijective transformations of X0 produce also maximal circular codes, some among which are self-complementary (Fimmel et al 2015; Michel and Seligmann 2014). The molecular mechanisms by which X0 regulates ribosomal frame detection remain unknown; however, these probably involve conserved nucleotide motifs belonging to X0 found in tRNAs (Michel 2012, 2013) and rRNAs (El Soufi and Michel 2014, 2015). The natural circular code seems not vestigial and probably still functions in modern translation: X0 codons are more conserved than synonymous codons not belonging to X0 (Dila et al 2018). X0 also regulates frameshifting transcription (El Houmami and Seligmann 2017; Warthi and Seligmann 2019).
Theoretical Minimal RNA Rings
Though some variant natural circular codes occur (Michel 2015, 2017) or are suspected (Arquès and Michel 1997), X0 is a more conserved punctuation system than start and stop codon punctuations and seems near-universal. Some recent peculiar observations shed light on the evolution of the natural circular code (Demongeot and Seligmann 2019a).
A theoretical search for life’s primordial RNAs assumed that these would code over the shortest possible sequence once for each coding signal, a start, a stop and each of the 20 amino acids, and form a stem-loop hairpin preventing degradation (Demongeot 1978; Demongeot and Besson 1983). These constraints define 25 theoretical minimal RNA rings. These seem homologous to a consensual ancestral tRNA (Demongeot and Moreira 2007) as it was defined by Eigen and Winkler-Oswatitsch (1981). Concerning the barycenter (for Hamming and edits distance) of these 25 RNA rings, RNA ring homology with tRNAs results from heptamers GAAUGGU and UUCAAGA frequently found in D-loop and Tψ-loop of many tRNAs and at frontiers between exon/intron and between intron/exon. The tRNA homology defines anticodons for each RNA ring, and a presumed order of evolution according to the cognate amino acid (Trifonov 2000) defined by that predicted anticodon. This evolution order based on cognate amino acid identity is congruent with independent RNA ring properties derived from their peptide coding properties (Demongeot and Seligmann 2019b, c), their occurrences in tRNA synthetase genes (Demongeot and Seligmann 2019d) and their predicted anticodons functioning as binding sites for replication initiation by polymerases (Demongeot and Seligmann 2019e).
These observations suggest that theoretical minimal RNA rings are realistic primordial proto-life RNA sequences. Indeed, they also include information relevant to the evolution of the natural circular code. Codons belonging to X0 are overrepresented in RNA rings. Their numbers increase from presumed ancient to recent RNA rings [ranked according to the presumed genetic code integration order of their cognate amino acid (Trifonov 2000)]. X1 codon numbers decrease from ancient to recent RNA rings. Hence, prebiotic/early life sequences presumably switched from X1 to X0 and between coding frames, potentially explaining X1 and X2 occurrences in + 1 and − 1 frames of modern genes (Demongeot and Seligmann 2019a). This could explain biases for some specific codon structures in RNA rings (Demongeot and Seligmann 2019c).
Working Hypothesis: Non-redundant Coding and the Natural Circular Code
The most stringent constraint involved in the design of RNA rings is that overlapping frames cannot code for the same amino acid. This property resembles the property of single-frame motifs in dicodons, which defines dicodons where none of the codons occurring in any of the three overlapping frames of the dicodon occurs in any other frame of that dicodon (Michel 2019). Here, non-redundant coding excludes also nonidentical synonymous codons. We explore whether non-redundancy of coded amino acids between overlapping frames of pentamer nucleotide sequences biases populations of short coding nucleotide sequences toward codons belonging to X0, X1 and/or X2: Did the natural circular code evolve through selection for non-redundant coding? This would be in line with observations that the 20 biogenic amino acids are more diverse than would be expected from a random sample of likely available amino acids (Philip and Freeland 2011; Ilardo et al 2015).
Materials and Methods
Sequences
We examine non-redundant coding across frames for all tetra-nucleotide (64 × 4 = 256) and penta-nucleotide (64 × 4 × 4 = 1024) sequences. Pentamers are the shortest sequences for which a complete codon exists for each frame. (Complete codons occur only in two frames of tetramers.) Hence, pentamers consist of the simplest system in which coding non-redundancy can be examined across all three potential coding frames. Tetra- and pentamer sequences are produced along the following method. We added to the 5′ extremity of each of the 64 codons nucleotide A, producing 64 tetramers. This was repeated adding C, G and T, respectively, producing all 256 tetramers. The same procedure was applied adding A, C, G or T to the 3′ extremity of the 64 codons. The 1024 pentamers were produced by adding A, C, G or T to the 5′ extremity of the latter 256 tetramers. The amino acids coded by frames − 1, 0, and + 1 of these sequences were translated according to each of the known genetic codes (Elzanowski and Ostell 2019, https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi). (For tetramers, only frames 0 and + 1 exist.)
Non-redundant Coding
Sequences were classified into two groups, those lacking coding redundancy among frames versus those where the same amino acid is coded by more than one frame. These groups differ according to genetic codes used for translation. Hence, sequence translations and classifications were done separately for each alternative genetic code.
Biases
Abundances of all 64 codons were counted considering the nucleotides 2, 3 and 4 of pentamers, among non-redundant and redundant sequences, for all alternative genetic codes. Differences between X0 abundances vs of the remaining 44 codons are estimated by chi-square tests. This is also done for X1 and X2 codons. For example, pentamers where nucleotides 2, 3 and 4 are X2 codons are non-redundant and coding-redundant in 287 and 33 cases, respectively (88.5% non-redundant). Pentamers where nucleotides 2, 3 and 4 do not belong to X2 are non-redundant and coding-redundant in 590 and 114 cases, respectively (80.7%); a chi-square test estimates the statistical difference between these distributions. This procedure was done for the abundances of X0, X1 and X2 codons in non-redundant versus redundant sequences compared in each case to abundances of the 44 remaining codons, for each genetic code.
Abundances of redundant and non-redundant pentamers for each circular code and the remaining 44 codons can be obtained from data in Table 2. This requires to subtract numbers of non-redundant and redundant pentamers, respectively, found for a given circular code, from the total numbers of non-redundant and redundant pentamers, respectively.
Results
Table 1 presents the numbers of non-redundant tetra- and pentamers for each of the 64 codons, considering all possible tetramers (4 × 64 = 256, separately for − 1 and + 1 frameshifted tetramers) and pentamers (4 × 4 × 64 = 1024, all three frames considered, and excluding incomplete codons). Results clearly show a bias against homopolymer codons AAA, CCC, GGG and TTT. Hence, under selection for non-redundancy of coding among frames, primitive prebiotic sequences would have had a strong bias against homopolymer codons. Results for the + 1 frame of tetramers indicate positive biases for codons belonging to X0 and X1, in line with previous results from theoretical minimal RNA rings (Demongeot and Seligmann 2019e).
Results for pentamers indicate a stronger bias for X0 codons than observed in tetramers. From here on, analyses focus on pentamers as these have three overlapping frames. (Tetramers have no more than two overlapping coding frames.)
Table 2 presents the results from analyses similar to those presented in Table 2, for all known genetic codes. Overall, statistically significant bias for X0 occurs in all genetic codes. P values are lower (meaning stronger bias) for X0 than X1 and X2 in all but three and five cases, respectively (P = 1.8 × 10−5 and P = 0.0033, respectively, two-tailed sign tests). All exceptions to the main tendency are for mitochondrial genetic codes.
We used the Benjamini–Hochberg adjustment for test multiplicity to account for multiple tests (Benjamini and Hochberg 1995). After adjustment, biases for X0 and X1 remain significant in 20 among 24 genetic codes, and in 21 among 24 cases for X2. Hence, accounting for multiple tests does not qualitatively alter results: Assuming selection for non-redundancy between overlapping frames of pentamers biases for codons belonging to X0.
Reading Frame Retrieval of Circular Code Codons
The codons belonging to X0 vary in terms of their contribution to detect the coding frame. For sequences entirely constructed by codons from X0, some codons, such as CTG, can only belong to the coding frame, which is the “construction” frame. Hence, CTG has a reading frame retrieval (RFR) capacity of “100” (Ahmed et al 2010). The RFR values for X0 are indicated in Table 1. If coding non-redundancy in pentamers determined which codons would belong to X0, one expects correlations between RFR and the number of non-redundant pentamers to which that X0 codon belongs. Non-redundancy of X0 codons correlates negatively with RFR in each of the genetic codes examined. This correlation is statistically significant at P < 0.05 for three genetic codes, which are all the known genetic codes specific to yeast: the yeast mitochondrial genetic code, r = − 0.481, P = 0.032; the alternative yeast nuclear genetic code; and the Pachysolen tannophilus nuclear genetic code, r = − 0.527, P = 0.017 (two-tailed tests).
The biological meaning of the negative correlation between non-redundancy and RFR is not obvious. It probably suggests an optimization between two partially opposite functions: One is coding frame detection estimated by RFR, and the other is overlap coding in different frames for close, but nonidentical alternative proteins, promulgating redundancy between frames. These results suggest that overlap coding should be particularly strong in yeasts with unusual genetic codes. The coevolution of this property between nuclear and mitochondrial alternative yeast genetic codes is particularly remarkable. Overall, the result indicates that pentamer coding non-redundancy (and redundancy) among frames affected the detailed design of the natural circular code.
Genetic Code History and Non-redundancy in Codon Families
Several hypotheses suggest that there exist evolutionary orders of integration of the 20 biogenic amino acids in the genetic code (Trifonov 2000). The mean ranks of amino acids in modern proteins recapitulate these hypothetical evolutionary orders (Seligmann 2018c), reminding speculations about ontogeny recapitulating phylogeny. We added to the genetic code evolutionary hypotheses reviewed by Trifonov (2000) the order deduced from the self-referential hypothesis (Guimarães et al 2008; Guimarães 2017), the order deduced from preference of l- over d-amino acids for interacting with D-RNA (Michel and Seligmann 2014; deduced from data from Han et al 2010), the Rogers hypothesis (Rogers 2019) and the subtraction of preferences for interactions between codons and cognate amino acids in ribosomal structures from preferences for interactions between anticodons and cognate amino acids in ribosomes (Johnson and Wang 2010). Overall, ranks converge among many hypotheses, suggesting that structurally complex and chemically inactive amino acids integrated first the genetic code, and that complex, chemically reactive amino acids were last.
The mean non-redundancy of synonymous codons assigned to amino acids overall increases with these ranks of integration for most hypotheses (37 among 44, P = 0.0000027, two-tailed sign test). Two specific hypotheses produce correlations with P < 0.05 (two-tailed test) with non-redundancy: The hypothesis developed on the transition between self-replicating oligoribotides to peptide-assisted RNA replication (Ferreira and Coutinho 1993, r = 0.81, P = 0.000007, two-tailed test); and with the self-referential hypothesis (r = 0.708, P = 0.00033, two-tailed test). The latter hypothesis is notable because at this point it is the most complete hypothesis, meaning that it integrates elements from several components of the translation system: amino acid properties and interactions with RNA, tRNAs and tRNA synthetases. Overall, results suggest that early genetic code codon-amino acid assignments favored redundancy, indicating tolerance to low-accuracy processes, and high non-redundancy for codons assigned to late amino acids, suggesting error avoidance for these more complex and overall less mutable residues. It is also noteworthy that mean non-redundancy of pentamers is inversely correlated with Trifonov’s redundancy of tetracodons (r = − 0.803, one-tailed P = 0.000006) (Fig. 1).
Discussion and Conclusion
Homopolymer codons have very high coding redundancies. Hence, selection for non-redundant codons would have biased against these codons before the evolution of complex biomolecular machineries where homopolymers cause polymerase and ribosomal frameshifts in modern protein-coding genes.
The analysis of redundant and non-redundant amino acid coding in pentamers shows biases for non-redundancy in codons belonging to the reading frame circular code X0. Hence, prebiotic selection for maximal coding diversity would have contributed to the origin of the near-universal circular code X0. Biases regarding X1 and X2, the circular codes detecting the two noncoding frames of protein-coding genes, are much weaker and usually not statistically significant. This could suggest transitions from X1 and/or X2 to X0, which could also explain associations between genetic code and codon structures (Seligmann and Warthi 2017).
Non-redundant coding did not only constrain the general pool of codons, but also the detail of the codons belonging to X0. RFR, the capacity of specific codons belonging to X0 for reading frame retrieval, is inversely proportional to that codon’s non-redundancy in pentamers, in all genetic codes and, in particular, in alternative nuclear and mitochondrial genetic codes specific to yeasts. This suggests an optimization between reading frame retrieval and the capacity for overlap coding for relatively conserved protein variants, especially in yeasts.
Moreover, biases occur for the X1 circular code in some mitochondrial genetic codes. This is in line with previous observations that presumed early theoretical minimal RNA rings, 22-nucleotide-long RNAs designed for non-redundant overlap coding, are biased toward X1, and biases toward X0 increase for presumed more recent RNA rings. We suggest that non-self-complementary circular codes such as X1 are favored in single-stranded RNA and that the self-complementary X0 evolved upon transition to double-stranded RNA (or DNA). Mitochondrial genomes are among the rare exceptions to Chargaff’s rule (same-strand A/T and C/G ratios close to 1) (Nikolaou and Almirantis 2006; Fimmel et al 2019), probably because of their mainly strand-asymmetric, unidirectional replication mode (Xia 2012), hence reproducing or conserving prebiotic conditions that presumably favored non-self-complementary circular codes. Results indicate that circular codes evolved from selection for non-redundant overlap coding in short nucleotide sequences.
References
Abrahams L, Hurst LD (2018) Refining the Ambush hypothesis: evidence that GC- and AT-rich bacteria employ different frameshift defence strategies. Genome Biol Evol 10:1153–1173
Ahmed A, Frey G, Michel CJ (2010) Essential molecular functions associated with the circular code evolution. J Theor Biol 264:613–622
Arquès DG, Michel CJ (1996) A complementary circular code in the protein coding genes. J Theor Biol 182:45–58
Arquès DG, Michel CJ (1997) A circular code in the protein coding genes of mitochondria. J Theor Biol 189:273–290
Bartonek L, Braun D, Zagrovic B (2019) Invariants of frameshifted variants. bioRxiv, 27 June 2019. https://doi.org/10.1101/684076.
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B 57:289–300
Colson P, Ravaux I, Tamalet C, Glazunova O, Baptiste E, Chabriere E, Wiedemann A, Lacabaratz C, Chefrour M, Picard C, Stein A, Levy Y, Raoult D (2014) HIV infection en route to endogenization: two cases. Clim Microbiol Infect 20:1280–1288
Crick FH, Griffith JS, Orgel LE (1957) Codes without commas. Proc Natl Acad Sci USA 43:416–421
Demongeot J (1978) Sur la possibilité de considérer le code génétique comme un code à enchaînement. Rev Biomath 62:61–66
Demongeot J, Besson J (1983) Genetic-code and cyclic codes. Comptes R Acad Sci III Life Sci 296:807–810
Demongeot J, Moreira A (2007) A possible circular RNA at the origin of life. J Theor Biol 249:314–324
Demongeot J, Norris V (2019) Emergence of a “cyclosome” in a primitive network capable of building “infinite” proteins. Life 9:51
Demongeot J, Seligmann H (2019a) Spontaneous evolution of circular codes in theoretical minimal RNA rings. Gene 705:95–102
Demongeot J, Seligmann H (2019b) Theoretical minimal RNA rings recapitulate the order of the genetic code's codon-amino acid assignments. J Theor Biol 471:108–116
Demongeot J, Seligmann H (2019c) Bias for 3'-dominant codon directional asymmetry in theoretical minimal RNA rings. J Comput Biol. https://doi.org/10.1089/cmb.2018.0256
Demongeot J, Seligmann H (2019d) More pieces of ancient than recent theoretical minimal proto-tRNA-like RNA rings in genes coding for tRNA synthetases. J Mol Evol 87(4–6):152–174
Demongeot J, Seligmann H (2019e) Theoretical minimal RNA rings designed according to coding constraints mimic deamination gradients. The Science of Life / Die Naturwissenschaften 106:44
Dila G, Michel CJ, Poch O, Ripp R, Thompson J (2018) Evolutionary conservation and functional implications of circular code motifs in eukaryotic genomes. Biosystems 175:57–74
Eigen M, Winkler-Oswatitsch R (1981) Transfer-RNA, an early gene? Naturwissenschaften 68:282–292
El Houmami N, Seligmann H (2017) Evolution of nucleotide punctuation marks: from structural to linear signals. Font Genet 8:36
El Soufi K, Michel CJ (2014) Circular code motifs in the ribosome decoding center. Comput Biol Chem 52:9–17
El Soufi K, Michel CJ (2015) Circular code motifs near the ribosome decoding center. Comput Biol Chem 59A:158–176
Elzanowski A, Ostell J (2019) The genetic codes. https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi
Ferreira R, Coutinho KR (1993) Simulation studies of self-replicating oligoribotides, with a proposal for the transition to a peptide-assisted stage. J Theor Biol 164:291–305
Fimmel E, Strüngmann L (2016) Codon distribution in error-detecting circular codes. Life 6:e14
Fimmel E, Strüngmann L (2018) Mathematical fundamentals for the noise immunity of the genetic code. Biosystems 164:186–198
Fimmel E, Giannerini S, Gonzalez DL, Strüngmann L (2015) Circular codes, symmetries and transformations. J Math Biol 70:1623–1644
Fimmel E, Michel CJ, Strüngmann L (2017) Strong comma-free codes in genetic information. Bull Math Biol 79:176–1819
Fimmel E, Michel CJ, Starman M, Strüngmann L (2018) Self-complementary circular codes in coding theory. Theory Biosci 137:51–65
Fimmel E, Gumbel M, Karpuzoglu A, Petoukhov S (2019) On comparing composition principles of long DNA sequences with those of random ones. Biosystems 180:101–108
Geyer R, Mamlouk AM (2018) On the efficiency of the genetic code after frameshift mutations. PeerJ 6:e4825. https://doi.org/10.7717/peerj.4825
Gonzalez DL, Giannerini S, Rosa R (2011) Circular codes revisited: a statistical approach. J Theor Biol 275:21–28
Gonzalez DL, Giannerini S, Rosa R (2017) The non-power model of the genetic code: a paradigm for interpreting genomic information. Philos Trans A. https://doi.org/10.1098/rsta.2015.0062
Guimarães RC (2017) Self-referential encoding on modules of anticodon pairs-root of the biological flow system. Life (Basel) 7:16
Guimarães RC, Moreira CH, de Farias ST (2008) A self-referential model for the formation of the genetic code. Theory Biosci 127:249–270
Han DX, Wang HY, Ji ZL, Hu AF, Zhao YF (2010) Amino acid chirality may be linked to the origin of phosphate-based life. J Mol Evol 70:572–582
Ilardo M, Meringer M, Freeland S, Rasulev B, Cleaves HJ II (2015) Extraordinarily adaptive properties of the genetically encoded amino acids. Sci Rep 5:9414
Itzkovitz S, Alon U (2007) The genetic code is nearly optimal for allowing additional information within protein-coding sequences. Genome Res 17:405–412
Johnson DB, Wang L (2010) Imprints of the genetic code in the ribosome. Proc Natl Acad Sci USA 107:8298–8303
Křížek M, Křížek P (2012) Why has nature invented three stop codons of DNA and only one start codon? J Theor Biol 304:183–187
Michel CJ (2012) Circular code motifs in transfer and 16S ribosomal RNAs: a possible translation code in genes. Comput Biol Chem 37:24–37
Michel CJ (2013) Circular code motifs in transfer RNAs. Comput Biol Chem 45:17–29
Michel CJ (2015) The maximal C(3) self-complementary trinucleotide circular code X in genes of bacteria, eukaryotes, plasmids and viruses. J Theor Biol 380:156–177
Michel CJ (2017) The maximal C3 self-complementary trinucleotide circular code x in genes of bacteria, archaea, eukaryotes. Plasmids and Viruses. Life (Basel) 7:e20
Michel CJ (2019) Single-frame, multiple frame and framing motifs in genes. Life 9:18
Michel CJ, Pirillo G (2010) Identification of all trinucleotide circular codes. Comput Biol Chem 34:122–125
Michel CJ, Seligmann H (2014) Bijective transformation circular codes and nucleotide exchanging RNA transcription. Biosystems 118:39–50
Michel CJ, Pellegrini M, Pirillo G (2016) Maximal dinucleotide and trinucleotide circular codes. J Theor Biol 389:40–46
Nikolaou C, Almirantis Y (2006) Deviations from Chargaff's second parity rule in organellar DNA. Insights into the evolution of organellar genomes. Gene 381:34–41
Opuu V, Silvert M, Simonson T (2017) Computational design of fully overlapping coding schemes for protein pairs and triplets. Sci Rep 7:15873
Philip GK, Freeland SJ (2011) Did evolution select a nonrandom “alphabet” of amino acids? Astrobiology 11:235–240
Rogers SO (2019) Evolution of the genetic code based on conservative changes of codons, amino acids, and aminoacyl tRNA synthetases. J Theor Biol 466:1–10
Seligmann H (2007) Cost minimization of ribosomal frameshifts. J Theor Biol 249:162–167
Seligmann H (2010) The ambush hypothesis at the whole-organism level: off frame, 'hidden' stops in vertebrate mitochondrial genes increase developmental stability. Comput Biol Chem 34:80–85
Seligmann H (2012a) Overlapping genetic codes for overlapping frameshifted genes in Testudines, and Lepidochelys olivacea as special case. Comput Biol Chem 41:18–34
Seligmann H (2012b) Coding constraints modulate chemically spontaneous mutational replication gradients in mitochondrial genomes. Curr genomics 13:37–54
Seligmann H (2015) Phylogeny of genetic codes and punctuation codes within genetic codes. Biosystems 129:36–43
Seligmann H (2018a) Directed mutations recode mitochondrial genes: from regular to stopless genetic codes. In: Seligmann H and Warthi G eds, Mitochondrial DNA: New Insights. Intechopen, London. https://doi.org/10.5772/intechopen.80871
Seligmann H (2018b) Alignment-based and alignment-free methods converge with experimental data on amino acids coded by stop codons at split between nuclear and mitochondrial genetic codes. Biosystems 167:33–46
Seligmann H (2018c) Protein sequences recapitulate genetic code evolution. Comput Struct Biotechnol J 16:177–189
Seligmann H (2019) Localized context-dependent effects of the "ambush" hypothesis: more off frame stop codons downstream of shifty codons. DNA Cell Biol 38(8):786–795
Seligmann H, Pollock DD (2004) The ambush hypothesis: hidden stop codons prevent off frame gene reading. DNA Cell Biol 23:701–705
Seligmann H, Warthi G (2017) Genetic code optimization for cotranslational protein folding: codon directional asymmetry correlates with antiparallel betasheets, tRNA synthetase classes. Comput Struct Biotechnol 15:412–424
Seligmann H, Warthi G (2019) Transcripts with systematic nucleotide deletion of 1–12 nucleotide in human mitochondrion suggest potential non-canonical transcription. PLoS ONE 14:e0217356
Trifonov EN (2000) Consensus temporal order of amino acids and evolution of the triplet code. Gene 261:139–151
Tse H, Cai JJ, Tsoi HW, Lam EP, Yuen KY (2010) Natural selection retains overrepresented out-of-frame stop codons against frameshift peptides in prokaryotes. BMC Genom 11:491
Wang X, Wang X, Chen G, Zhang J, Liu Y, Yang C (2015) The shiftability of protein coding genes: the genetic code was optimized for frameshift tolerating. PeerJ. https://doi.org/10.7287/peerj.preprints.806v1
Wang X, Dong Q, Chen G, Zhang J, Liu Y, Zhao J, Peng H, Wang Y, Cai Y, Wang X, Yang C (2018) The universal genetic code, protein coding genes and genomes of all species were optimized for frameshift tolerance. bioRxiv:067736
Xia X (2012) DNA replication and strand asymmetry in prokaryotic and mitochondrial genomes. Curr Genom 13:16–27
Author information
Authors and Affiliations
Corresponding author
Additional information
Handling Editor: Michelle Meyer.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Authors order alphabetical.
Rights and permissions
About this article
Cite this article
Demongeot, J., Seligmann, H. Pentamers with Non-redundant Frames: Bias for Natural Circular Code Codons. J Mol Evol 88, 194–201 (2020). https://doi.org/10.1007/s00239-019-09925-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00239-019-09925-0