Introduction

The evolutionary order of the genetic code's codon–amino acid assignments has been studied using numerous independent approaches (Trifonov 2000; Guimarães 2017; Seligmann 2018; Rogers 2019). Though no definitive answer exists on that order, overall patterns obtained from chemical, observational, structural, and informational approaches converge. Miller's chemical experiment reconstructing prebiotic earth conditions (Miller 1953; Miller and Urey 1959) obtained amino acid orders (ranked by decreasing experimental yields) similar to those from meteorite composition (Kvenvolden et al. 1971) and amino acids ranked by increasing physicochemical structural complexity (Dufton 1997). A group of 20 codons forms the universal natural circular code for retrieving the ribosomal translation frame (Arquès and Michel 1996). This information-based system avoids redundancy among different overlapping frames formed by the 20 circular code codons (Ahmed et al. 2007, 2010; Michel 2019). These 20 codons code specifically for amino acids assumed most ancient according to the previous approaches (Miller's chemical experiments, meteorite composition, and Dufton's complexity score). This homogeneity among hypotheses from different backgrounds reinforces their local coherence within a common global explanatory framework.

Biomolecular structures also include fossil imprints of the genetic code's evolutionary process. Contact biases in the ribosome's 3D structure between nucleotide triplets from ribosomal RNAs and amino acids show preference for interactions between amino acids and triplets that are their genetic code cognate codons, specifically for the amino acids presumed most ancient by the other, previously mentioned hypotheses (Miller's experiments, meteorite composition, structural simplicity, and information-based circular code theory) for ranking genetic code amino acid inclusions. The presumed more recent amino acids have biased contacts with nucleotide triplets that are their anticodons (Johnson and Wang 2010). This putatively reflects a transition from direct codon–amino acid interactions to a more recent tRNA-based translation.

Hence, even when no definitive answer exists, congruence between several independent methods based on different premises and approaches produces a useful consensus body of knowledge and a workable basis for further research on that topic.

Genetic Code Evolution from tRNA Properties

Several tRNA properties have also been used to propose candidate evolutionary hypotheses on genetic code integration orders of their cognate amino acids. Nucleotide triplets in the 5′ acceptor stem of some prokaryote tRNAs code for the tRNA's cognate amino acid, suggesting a primitive code in acceptor stems of these tRNAs (Möller and Janssen 1990, 1992, as predicted by Hopfield 1978). This code occurs in tRNAs with cognates ranked as ancient amino acids by the above-mentioned hypotheses, potentially reflecting remnants of direct codon–amino acid interactions (Seligmann and Amzallag 2002).

Another approach assumed that prebiotic genes were multifunctional, combining proto-tRNA and peptide coding properties (Eigen and Winkler-Oswatitsch 1981a, b). Peptide amino acid compositions of translated ancestral tRNAs are also congruent with the hypotheses mentioned in the previous section: amino acids frequent in that composition are relatively ancient, and those rare or absent in that composition are relatively recent.

A third approach examined the diversity of isoacceptor tRNAs for each tRNA species, assuming that high diversity indicates ancient tRNAs and corresponding cognate amino acids (Chaley et al. 1999). The candidate genetic code integration order of amino acids that this method produces is the least congruent with the above-mentioned hypotheses, including the two other tRNA-derived hypotheses (Table 1).

Table 1 A ranks of amino acid integration in the genetic code according to selected hypotheses (column numbers follow Trifonov (2000)): 3-Miller experiment, 33-Murchison meteorite, 37-Dufton's size complexity index, $ contacts in ribosomal structure (Johnson and Wang 2010), & natural circular code (Arquès and Michel 1996), 24-tRNA stem primitive code, 25-isoacceptor tRNA diversity, and 32-tRNA Urgen composition; B non-parametric Spearman rank correlation coefficients between these ranks, * for one-tailed P < 0.05

Genetic Code Evolution and tRNA Secondary Structure

The previous section shows that two tRNA-derived hypotheses for the genetic code evolution are compatible with hypotheses derived from chemical and structural properties of amino acids. A third tRNA-derived hypothesis, based on isoacceptor tRNA diversity, seems unrelated. However, the genetic code evolutionary order based on tRNA diversity associates with the tRNA-rRNA secondary structure evolutionary score (Demongeot and Seligmann 2019a). This score estimates the relative similarities of tRNA cloverleaves with tRNA- vs rRNA-like secondary structure clusters. Results show that tRNAs with relatively rRNA-like secondary structures are recent (low isoacceptor tRNA diversity), and those more tRNA-like are relatively ancient (high isoacceptor tRNA diversity).

Hence, the isoacceptor tRNA diversity hypothesis reflects tRNA evolution, while the two other tRNA-derived hypotheses (acceptor stem primitive code (Möller and Janssen 1990, 1992), and peptide composition (Eigen and Winkler-Oswatitsch 1981a,b)) reflect evolution of pre-tRNA metabolites.

Structure-Derived tRNA Evolution Hypotheses

Some hypotheses used observations on symmetries in tRNA cloverleaf structures to reconstruct plausible scenarios on tRNA evolution, by assembly of two (Di Giulio 1992, 1995, 1999; Tanaka and Kikuchi 2001; Widmann et al. 2005; Branciamore and Di Giulio 2011; Di Giulio 2012, 2013; Tamura 2015) or three hairpin-like structures (Root-Bernstein et al. 2016; Kim et al. 2018). We have no specific preference for any of these models, yet agree that the former is more parsimonious than the latter model (Di Giulio 2019).

The senior protagonists of these hypotheses, M. Di Giulio and Z.F. Burton, comment that their structure-based hypotheses on hairpin assembly are better predictors of modern tRNAs than other hypotheses, specifically the Uroboros hypothesis (Demongeot and Seligmann 2019b), itself derived from the theoretical minimal RNA ring hypothesis for ancestral prebiotic RNAs. Both commentaries (in the form available to us at this point, their first drafts submitted for publication in this journal) fail to present the RNA ring hypothesis, and especially its premises. Hence, before replying to their comments, we describe the theoretical minimal RNA ring hypothesis and the evidence for it. This description is central to assessing the arguments behind their criticisms.

Theoretical Minimal RNA Rings

Theoretical minimal RNA rings are sequences designed in silico to match two specific constraints:

  1. The sequence should code over the shortest possible length for the highest possible diversity of genetic code signals. This means it should include a start and a stop codon, and a single codon coding for each of the 20 biogenic amino acids.

  2. The sequence should form the longest possible stem-loop hairpin, to avoid fast degradation in prebiotic conditions.
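The first constraint can be checked on a concrete case. The following Python sketch translates RNA ring 13 (the sequence given in the Fig. 1 caption, written with T for U) by reading the circular sequence three times around, starting at its ATG. The function name, the starting position, and the use of the standard genetic code table are assumptions made here for illustration; this is not the original design procedure.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# RNA ring 13 as printed in the Fig. 1 caption (22 nucleotides).
RING_13 = "ATGGTACTGCCATTCAAGATGA"


def translate_ring(ring, rounds=3):
    """Translate a circular sequence read `rounds` times around, from position 0."""
    unrolled = ring * rounds  # 3 x 22 = 66 nucleotides = 22 codons
    codons = [unrolled[i:i + 3] for i in range(0, len(unrolled) - 2, 3)]
    peptide = "".join(CODON_TABLE[codon] for codon in codons)
    return codons, peptide


codons, peptide = translate_ring(RING_13)
amino_acids = set(peptide) - {"*"}  # "*" marks a stop codon

print(peptide)
print(len(codons), "codons,", len(amino_acids), "distinct amino acids")
print("first codon:", codons[0], "| last codon:", codons[-1])
```

Under these assumptions, the three rounds use every nucleotide of the ring once in each codon position, which is what allows 22 nucleotides to carry 22 partially overlapping codons.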

These constraints define exactly 25 circular RNAs, each 22 nucleotides long, coding for 22 codons by three consecutive translation rounds of partially overlapping codons, ending with a stop codon and having an alternative hairpin form of nine paired nucleotides. Note that at no point is information from modern tRNAs or tRNA cloverleaf structures included in the underlying assumptions/constraints, a major difference from the two- and three-piece assembly hypotheses derived from observations of tRNA structures. Hence, similarities between RNA rings (Demongeot and Moreira 2007) and loops of ancestral (Eigen and Winkler-Oswatitsch 1981a, b) and modern (Mazauric et al. 1998; Michaud et al. 2011) tRNAs are particularly supportive of tRNAs evolving from RNA rings or RNA ring-like sequences, because RNA ring design does not include explicit information from tRNAs (Fig. 1). The reconstruction from hairpins formed by RNA ring 13 (barycenter for Table 2 distances) of the Arabidopsis thaliana tRNA-Gly (from Michaud et al. 2011) produces 41 matches for 70 nucleotides (P-value ≈ 0.3 × 10⁻¹⁰, one-tailed binomial test, with H0 equal to the random occurrence of matches in 70 choices of nucleotides with probability ¼).
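The one-tailed binomial test described above can be sketched in a few lines of Python. The helper name is hypothetical, and the sketch assumes the simplest reading of the test: 70 independent positions, each matching by chance with probability ¼, and the P-value taken as the probability of at least 41 matches. The figure it prints depends on exactly how positions and matches are counted, so it is not guaranteed to reproduce the value quoted above.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 41 matches observed among 70 nucleotides; chance match probability 1/4 under H0.
p_value = binomial_upper_tail(41, 70, 0.25)
print(f"one-tailed P = {p_value:.3g}")
```

Summing the exact tail avoids a normal approximation, which would be unreliable this far from the expected 17.5 matches.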

Fig. 1

a Consensus sequence of yeast tRNA-Gly(UCC), tRNA-Gly(GCC), and tRNA-Gly(CCC), and mammalian tRNA-Gly(GCC) and tRNA-Gly(CCC) (from Mazauric et al. 1998); b reconstructed tRNA-Gly(GCC) from hairpins of RNA ring 13 (ATGGTACTGCCATTCAAGATGA), where the nucleotides in red match the corresponding nucleotides of the tRNA in c; c Arabidopsis thaliana tRNA-Gly(GCC) (from Michaud et al. 2011); d consensus tRNA secondary structure among 6597 tRNAs from trnadb: 1 denotes the pivot of the three-dimensional tRNA L-shape, and 2 and 3 denote the bases linking the lateral loops in the tRNA tertiary structure (Sprinzl et al. 1998; Demongeot and Moreira 2007; Jühling et al. 2009); e consensus mitochondrial tRNA secondary structure from trnadb (Trnadb 2019) (Color figure online)

Table 2 Pairwise distances between RNA rings (numbered and cognate amino acid) and the first principal component of these pairwise distances, PC1 (36% of total variation) and code (% of tRNAs with acceptor stem code averaged from Table 1 in Möller and Janssen 1992)

This implies that the genetic code and coding non-redundancy for amino acids among codons used for obtaining RNA rings define both protein (or peptide) coding sequences and proto-tRNA-like sequences. This matches hypotheses and related evidence that ancestral genes were multifunctional (tRNA and coding sequences, Eigen and Winkler-Oswatitsch 1981a, b; rRNA, tRNA, and coding sequences, Root-Bernstein and Root-Bernstein 2015, 2016, 2019). Modern genes like tRNA synthetases and structural RNAs like 16S rRNAs contain significant (more than randomly expected) densities of n-mers that are sub-sequences of the RNA rings (Demongeot and Norris 2019). For the 16S rRNAs, this significant density begins at n = 9, which is in agreement with a possible dissociation of the RNA ring hairpin of length 9 (constrained by RNA ring design): the search for the n-mers (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch) from RNA ring 13 in the NCBI 16S rRNA sequences (Bacteria and Archaea) database finds 222 decamers among 20,829 sequences, significantly fewer than the expected 312 + 30*, and 1441 nonamers, significantly more than the expected 1363 + 59* (* for the 95%-confidence upper bound).
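To make the notion of RNA ring n-mers concrete, the sketch below enumerates the nonamers and decamers read around the circular RNA ring 13 and counts their occurrences in a target sequence. It is only an illustration: the function names are hypothetical, the target is a toy string built for the example (not a real 16S rRNA), and the sketch does not reproduce the BLAST search or the expected counts reported above.

```python
RING_13 = "ATGGTACTGCCATTCAAGATGA"


def circular_nmers(ring, n):
    """Return the set of length-n words read around a circular sequence."""
    wrapped = ring + ring[: n - 1]  # let windows run past the end of the ring
    return {wrapped[i:i + n] for i in range(len(ring))}


def count_hits(target, nmers):
    """Count start positions in `target` where one of the n-mers occurs."""
    n = len(next(iter(nmers)))
    return sum(target[i:i + n] in nmers for i in range(len(target) - n + 1))


nonamers = circular_nmers(RING_13, 9)
decamers = circular_nmers(RING_13, 10)

# Hypothetical toy target: one ring nonamer (spanning the ring's wrap-around)
# flanked by C runs. It is NOT taken from any database.
toy_target = "CCCC" + "GATGAATGG" + "CCCC"

print(len(nonamers), "nonamers,", len(decamers), "decamers")
print("nonamer hits:", count_hits(toy_target, nonamers))
print("decamer hits:", count_hits(toy_target, decamers))
```

A real analysis would also need the randomly expected counts for each n, which depend on the database composition and are outside the scope of this sketch.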

Multifunctionality holds also for many mitochondrial tRNAs, whose parts apparently code for alternative 3′ and 5′ extremities of neighboring protein coding genes: mitochondrial tRNAs include highly conserved nucleotide triplets corresponding to a stop and a start codon (nucleotides 8–10 and 47–49, Fig. 1 in Faure and Barthélémy 2018, 2019), suggesting they frequently code for the N- and carboxyl-termini of the proteins encoded upstream and downstream of the tRNA, respectively.

RNA rings mimic several properties of modern protein coding genes (Demongeot and Seligmann 2019c, d, e), including overrepresentation of the 20 codons forming the natural circular code for ribosomal translation frame retrieval (Arquès and Michel 1996). Similarities with tRNA loops also define a candidate anticodon for each RNA ring (Demongeot and Moreira 2007). The associated cognate amino acids make it possible to rank RNA rings according to the various hypotheses on the genetic code integration order of amino acids. Each RNA ring property examined until now coevolves with the genetic code integration ranks, mainly with the above-discussed hypotheses derived from tRNA properties. In addition, RNA ring pieces exist in modern protein coding genes, especially pieces from RNA rings with ancient cognates (Demongeot and Seligmann 2019f).

RNA rings also include deamination gradients starting at their presumed anticodon, as predicted by similarities with ancestral tRNA loops (Demongeot and Moreira 2007). In natural genomes such as mitogenomes, deamination gradients start at replication origins (Reyes et al. 1998). This is in line with several evidence-based hypotheses: 1. tRNAs derived from stem-loop hairpins that presumably resembled modern replication origins and were involved in prebiotic and/or early life replication (Weiner and Maizels 1987; Maizels and Weiner 1994), 2. mitochondrial tRNAs function as alternative mitochondrial light strand replication origins (Seligmann et al. 2006a, b; Seligmann and Krishnan 2006; Seligmann 2008, 2010, 2011; Seligmann and Labra 2014), 3. the loop of the mitochondrial light strand replication origin (OL) is homologous to parts of a neighboring tRNA (Seligmann 2016), and 4. the vertebrate mitochondrial gamma DNA polymerase evolved from a bacterial tRNA synthetase (Wolf and Koonin 2001).

Notably, the polymerase's site binding to the OL is homologous to the tRNA synthetase's site that interacts with the tRNA's anticodon loop (Fan et al. 1999; Carrodeguas and Bogenhagen 2000). Hence, RNA rings inherently mimic evolution and evolutionary functional transitions between replicational and translational biomolecules. This implies that the genetic code's codon–amino acid assignments, the main information used to design RNA rings, predetermine the evolutionary links between tRNAs and replication origins and between tRNA synthetases and polymerases.

Hypotheses on tRNA Evolution

In the case of amino acid integration ranks in the genetic code, the actual order is unknown. However, for tRNAs, the two- and three-piece aggregation hypotheses are designed to fit a known predetermined result, the tRNA. Moreover, these models are derived from observations on tRNAs. It is hence predictable that these models will predict tRNA structure fairly well. No information on tRNAs was used in the RNA ring design. Their design was not even aimed at mimicking tRNAs. The resulting RNA rings include several properties of protein coding genes and of tRNAs, including the evolution of properties as varied as the natural circular code, the tRNA–rRNA secondary structure evolutionary axis, the relation between tRNAs and replication origins, and specifically the recognition of tRNA anticodon loops as origins of replication (Seligmann 2010). None of the comments by Di Giulio and Burton addresses these points, nor do their respective two- and three-piece structural hypotheses integrate such various properties of the cell's replicational, translational, and coding biomolecules. We note here that Burton refers to RNA rings as an accretion and random sequence model. However, RNA rings result from a deterministic design that produces exactly 25 solutions to its underlying constraints, the opposite of a random process. Moreover, there is no solution if the ring length is strictly less than 22, which is per se an interesting deterministic result related to the combinatorial character of the minmax problem defined by the constraints imposed on the rings.

We indicate some possible caveats in the respective two- and three-piece hypotheses, hoping to contribute to the elaboration of more complete hypotheses.

We did not find mention of the primitive tRNA acceptor stem code (Möller and Janssen 1990, 1992) in any publication by Burton and coauthors on tRNA-related topics (Root-Bernstein et al. 2016; Pak et al. 2017, 2018a, b; Kim et al. 2018, 2019; Opron and Burton 2018). We did not find explanations for the primitive tRNA acceptor stem code in Di Giulio's two-piece hypothesis.

The association between acceptor stem and anticodon sequences fits the hypothesis that tRNAs result from the accretion of anticodon-like sequences (Seligmann and Amzallag 2002). It is compatible with the RNA ring hypothesis: Table 2 shows the pairwise distance matrix between the 25 RNA rings, which sums identical vs non-identical nucleotides in combinations of two RNA rings (nucleotide identity: 1; nonidentity: 0). The first principal component extracted from this distance matrix correlates well with the primitive tRNA acceptor stem code (r = − 0.554, two tailed P = 0.0061; Table 2, last columns).
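The scoring rule behind the Table 2 matrix (identity: 1; nonidentity: 0, summed over positions) can be sketched as follows. Only RNA ring 13 is a real RNA ring here; the two other sequences are hypothetical variants made up for the example, and the sketch assumes that rings are compared position by position from the same starting nucleotide. With this scoring, larger values mean more similar sequences. The extraction of the first principal component is not shown.

```python
RING_13 = "ATGGTACTGCCATTCAAGATGA"
# Hypothetical variants of ring 13, for illustration only (not actual RNA rings).
TOY_A = "ATGGTACTGCCATTCAAGATGG"  # one substitution, at the last position
TOY_B = "ATGCTACTGCCATTGAAGATGA"  # two substitutions, at positions 4 and 15


def identity_score(a, b):
    """Sum of position-wise identities (1 if identical, 0 otherwise)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


def pairwise_matrix(seqs):
    """Symmetric matrix of identity scores for every pair of sequences."""
    return [[identity_score(a, b) for b in seqs] for a in seqs]


matrix = pairwise_matrix([RING_13, TOY_A, TOY_B])
for row in matrix:
    print(row)
```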

Di Giulio's two-piece hypothesis seems compatible with all tRNAs, though evidence for split mitochondrial tRNA genes is in our view inconclusive. However, the three-piece hypothesis should be re-examined to see if it fits mitochondrial tRNA structures from the point of view of structural symmetries within tRNAs. Losses of mitochondrial tRNA sidearms (Fujishima and Kanai 2014) could be construed as indirect evidence for the three-piece hypothesis, but they result from secondary losses rather than reflecting ancestral states.

However, observations on mt tRNA sidearm loops, when their sequences are examined as if they were anticodon loops, are in line with the three-piece tRNA structure hypothesis: mt tRNA sidearm "anticodon" abundances coevolve with mt proteomic amino acid abundances (Seligmann 2013, 2014). This would fit the view that the three tRNA branches functioned independently in translation and accreted into modern mt tRNAs. Secondary losses of mitochondrial tRNA sidearms might recover an ancestral state.

An important point favoring Di Giulio's two-piece hypothesis is that tRNAs are split in the anticodon's midst according to that hypothesis. This matches split tRNA genes, usually split at the anticodon (Tanaka and Kikuchi 2001; Fujishima and Kanai 2014). The latter observation moves the discussion on the two-piece hypothesis of tRNA evolution to the issue of character polarization: Are split tRNAs ancestral or derived (Di Giulio 2008a, b, 2009, 2013)? Only the former fits with the two-piece hypothesis. Though arguments can be made for that scenario, overall, character polarizations are frequently debatable. Here too, RNA rings bring important evidence: secondary structures formed by RNA rings fit best the tRNA–rRNA secondary structure evolution hypothesis (Demongeot and Seligmann 2019a, b) when RNA rings are split in the midst of their predicted anticodon. Similarly, RNA ring analyses yield the strongest support for deamination gradients when spliced at the anticodon (Demongeot and Seligmann 2019g). Hence, RNA rings favor ancestral status of tRNAs split in the anticodon's midst, strengthening the two-piece hypothesis.

The version of Di Giulio's comment at our disposal suggests that accretions of three RNA ring hairpins could not have formed tRNAs, because all three hairpins are identical, while tRNA branches are not identical. This application of RNA rings to reconstruct tRNAs is inadequate: it uses only RNA ring 13 (AL) and ignores that there are 25 different RNA rings which could be combined. Combining hairpins formed by different RNA rings produces tRNA-like structures with non-identical branches. The criticism results from misunderstandings and/or ignorance of the RNA ring model, which in our view characterize the comments by Burton and by Di Giulio.

The authors of the two- and three-piece models each present their respective models as fail-proof holistic explanations. From that point of view, they are more in contradiction with each other than with the RNA ring approach, which (a) does not claim to explain everything about tRNAs, (b) was not designed to mimic tRNAs or their evolution, and (c) only secondarily happens to match some tRNA properties, among several other different properties of prebiotic and early life biomolecules and metabolism (Fig. 2).

Fig. 2

Scheme of biomolecule properties embedded in RNA rings. Major unexplained properties are ribosome formation (not in scheme) and genetic code codon–amino acid assignments (???). RNA rings contribute to coding and non-coding biomolecular machineries (vertical divide) (Color figure online)

None of the discussed models, including the RNA ring approach, addresses how tRNAs evolve from splicing/self-assembly, and by which catalysts. The two- and three-piece models do not address the origin of the (two or three) pieces that are being assembled; the RNA rings provide potential answers to this, through a deterministic process (contrary to Burton's claim that RNA rings are random sequences).

RNA rings are rationally designed. Hence, only rational/mathematical proofs that RNA rings are not the solutions to the constraints of their design can prove that the RNA rings are incorrect. Discussions of their relative relevancy to tRNAs, other biomolecules, and properties of the proto-metabolic world are interesting but do not address the truth that RNA rings solve the minmax problem constraining their design.

More importantly, arguments developed here show that integrating different hypotheses, rather than focusing on finding (and sometimes creating by error) incompatibilities between hypotheses, improves our understanding. In addition, the possibility that tRNAs are polyphyletic (Di Giulio 2006, 2008a, b, 2013) stresses that different tRNAs might have evolved through different pathways: more than one hypothesis might account for polyphyletic tRNA evolution.