Introduction and Hypothesis

Of the various models suggested to explain the origin of the tRNA molecule (Di Giulio 2009a), there are two that seem particularly interesting because of the simplicity with which they hypothesise that the molecule might have emerged. Both models postulate that the tRNA molecule had an RNA hairpin structure as its precursor (Fig. 1; Moller and Janssen 1990, 1992; Di Giulio 1992). Di Giulio’s model (1992, 1995, 1999, 2004) postulates that the assembly of two hairpin-like molecules (formally a direct duplication of a hairpin structure) created the conditions for the evolution of the tRNA molecule (Fig. 1). Indeed, if two hairpin structures are joined to create a cruciform structure (Fig. 1 top, Di Giulio 1992), then the double hairpin, i.e. the thermodynamically stable secondary structure (Fig. 1 top, Tanaka and Kikuchi 2001; Di Giulio 2004), might have been the intermediate evolutionary stage for the transition towards the cloverleaf secondary structure characterising the tRNA molecule (Fig. 1 top). Therefore, this model (Di Giulio 1992, 1995, 1999) implies a double hairpin as a transitional stage in the evolutionary pathway that presumably led to the final tRNA molecule (Tanaka and Kikuchi 2001; Di Giulio 2004). By contrast, Moller and Janssen’s model (1990, 1992), which is formally based on an indirect duplication of a hairpin structure (Fig. 1 bottom), is characterised by a ‘cigar’ structure, i.e. a long-stem structure with a small loop (Fig. 1 bottom), although a double hairpin structure is also present in this model even if it is not shown in Fig. 1 (Di Giulio 2009a). Nevertheless, the ‘cigar’ structure should characterise the latter model because it is the structure emerging from the replication–ligation reactions of a hairpin structure (Fig. 1 bottom, Moller and Janssen 1990, 1992).

Fig. 1

The figure shows (i) at the top, the assembly of two identical hairpin structures to create a cruciform structure with the relative double hairpin, partly open, and therefore with homologous loops; and (ii) at the bottom, the inverse duplication of a hairpin to create the cigar with the relative cruciform structure and complementary loops (two bases, A and T, are also indicated to show the similarity or complementarity of the loops). On the far right, the cloverleaf structure of tRNA is depicted.

Although these two models make specific predictions on what the evolutionary relations between the 5′ and 3′ halves of the tRNA molecule should be, it is difficult to distinguish between them because both models envisage the double hairpin structure albeit with different loops (Fig. 1). Indeed, while Di Giulio’s model envisages a homology relationship between the tRNA halves because it imposes a common origin of the various hairpin structures used to assemble the tRNA molecule (Di Giulio 1992, 1995, 1999, 2004, 2006a, 2008a, b, c, 2009a), Moller and Janssen’s model, in which the initial hairpin is replicated (indirect duplication), imposes a complementarity relationship between the two halves of the tRNA molecule and also, therefore, a cigar structure as the first evolutionary stage in the pathway towards the final form of the tRNA molecule (Fig. 1 bottom).

Therefore, although the two models are formally based on a direct or an indirect duplication and predict the existence of homology and complementarity relationships between the two halves of the tRNA molecule, it is not easy to distinguish between them because, as the precursor of these duplications is a hairpin structure, the products of these two duplications are indistinguishable except in the loop regions which are expected to be homologous (similar) in terms of their direct duplication and complementary for their indirect duplication (Fig. 1; Di Giulio 2009a). Furthermore, Di Giulio’s model predicts that although the two hairpin structures called upon to assemble the tRNA molecule share a common origin (which took place by direct duplication) they are not identical but, indeed, rather diversified (Di Giulio 1992, 1995, 1999, 2004, 2006a, 2008a, b, c, 2009a). With this further specification, Di Giulio’s model (1992, 1999) predicts the presence of the cigar only with a very low probability, unlike Moller and Janssen’s model (1990, 1992). However, despite this further discrimination, the predictions of the two models still overlap and cannot be easily discriminated/falsified. Nevertheless, research aiming to establish whether the tRNA sequences contain vestiges that substantiate a homology (similarity) or complementarity relationship between the two halves of tRNAs seems, on the whole, to be sensible and, in particular, could distinguish between the two models as the cigar structure seems to typify only Moller and Janssen’s and not Di Giulio’s model, and the loop region of tRNAs might also help in this discrimination (Fig. 1).

This study is also motivated by the observation that the double hairpin and the cigar are secondary structures that cannot have both been present in the pathway leading to the formation of the tRNA molecule because they are ‘incompatible’ secondary structures. Indeed, in passing from the double hairpin to the cloverleaf structure of tRNA, the pairings of 2 × 2 = 4 regions must be restructured, and the same is needed to pass from the cigar to the cloverleaf structure. By contrast, to pass from the cigar to the double hairpin, or vice versa, it would be necessary to re-pair twice as many regions, which would exclude the possibility of the latter structures being part of the evolutionary pathway leading to the cloverleaf structure because the transition from the cigar to the double hairpin, or vice versa, is not very parsimonious in evolutionary terms and would have greatly slowed down the appearance of the cloverleaf structure. Therefore, this incompatibility could be a further discriminating factor in performing such a study. (It should be pointed out that the transition from the cigar to the double hairpin could be ‘immediate’ as soon as the cigar is formed. Indeed, the cigar structure in Fig. 1 bottom could be represented, as already said, as a double hairpin (Di Giulio 2009a). However, as time passes and the divergence between regions increases, the cigar would no longer transform easily into a hairpin.)

In conclusion, we will try to establish whether the current tRNA sequences include indications on the existence of the double hairpin or the cigar structure as an evolutionary stage through which the evolution of the tRNA molecule passed. This will be performed both by calculating the free energy of the secondary structures of the double hairpin and the cigar that can be derived from tRNAs, and by analysing the sequences of tRNAs for similarity and complementarity, comparing the two whole halves of tRNAs and only regions of them (Di Giulio 1992, 1995, 1999; Tanaka and Kikuchi 2001; Widmann et al. 2005).

Materials and Methods

Source of tRNA Sequences and Construction of the Random Library

tRNA sequences were downloaded from tRNAdb (Juhling et al. 2009). The database provides sequences together with their secondary structures represented in dot-bracket notation.

Generation of Random tRNA Sequences

Each simulated tRNA sequence has the following properties: (i) it has the same base composition as the corresponding real tRNA, (ii) it can fold into the same cloverleaf structure as the corresponding real tRNA, and (iii) the anticodon is preserved.

The algorithm used to produce a simulated counterpart of a real tRNA sequence consists of four steps:

  1. Read the real tRNA sequence from the database and evaluate its base composition.

  2. Read the corresponding secondary structure in dot-bracket notation for the same tRNA from the database.

  3. Starting from the 5′ end, read the secondary structure of the tRNA at position i:

    • if position i is a dot ‘.’, assign a random nucleotide N with a probability proportional to the nucleotide frequencies in the real tRNA;

    • if position i is an open bracket ‘(’, assign a random nucleotide N with a probability proportional to the nucleotide frequencies in the real tRNA; then assign to its corresponding paired base the complementary nucleotide of N, N′.

    Repeat step 3 for the whole length of the sequence.

  4. Insert the anticodon in the same position as in the corresponding real tRNA.
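The four steps above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the `anticodon_start` parameter (a 0-based index of the first anticodon base) and the restriction to Watson-Crick pairs over the alphabet A, C, G, U are assumptions made here, since the text does not state how wobble pairs or modified bases were handled.

```python
import random
from collections import Counter

# Assumption: only Watson-Crick partners are used for paired positions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def pair_table(structure):
    """Map the index of each '(' to the index of its matching ')'."""
    stack, pairs = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced dot-bracket string")
            pairs[stack.pop()] = i
    if stack:
        raise ValueError("unbalanced dot-bracket string")
    return pairs


def simulate_trna(real_seq, structure, anticodon_start, rng=random):
    """Return one simulated sequence for a real tRNA (steps 1 to 4)."""
    if len(real_seq) != len(structure):
        raise ValueError("sequence and structure lengths differ")

    # Step 1: base composition of the real sequence.
    counts = Counter(real_seq)
    bases = sorted(counts)
    weights = [counts[b] for b in bases]

    # Step 2: pairing information from the dot-bracket string.
    pairs = pair_table(structure)

    # Step 3: walk from the 5' end; a ')' was already filled by its partner.
    simulated = [None] * len(real_seq)
    for i, symbol in enumerate(structure):
        if symbol == ")":
            continue
        base = rng.choices(bases, weights=weights)[0]
        simulated[i] = base
        if symbol == "(":
            simulated[pairs[i]] = COMPLEMENT[base]

    # Step 4: restore the real anticodon at its original position.
    anticodon = real_seq[anticodon_start:anticodon_start + 3]
    simulated[anticodon_start:anticodon_start + 3] = anticodon
    return "".join(simulated)
```

Calling `simulate_trna` 1,000 times with the same arguments would give an ensemble of the size used in the text. Note that sampling each position from the observed frequencies reproduces the base composition only on average, and that forcing complementary partners shifts it further; the sketch follows the steps as written rather than enforcing an exact composition.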

Ser, Leu, Tyr and Sec tRNAs were excluded from the analysis; after this exclusion, our database contained 7,558 tRNA sequences. tRNAs whose secondary structures, as reported in the database, are too different from the canonical tRNA secondary structure were also excluded. For example, all tRNAs shorter than 71 or longer than 78 nucleotides were excluded from further analysis. The final database consists of 6,810 tRNA sequences.

Evaluation of the Free Energy of DH and CI Structures

The Vienna RNA package was used to evaluate the free energy of secondary structures for a given sequence (Schuster et al. 1994).

For each particular real sequence in the database, we have calculated a Z score comparing its own free energy for a DH (double hairpin) (or CI (cigar)) secondary structure with the free energy of an ensemble of 1,000 simulated sequences generated as described above using:

$$ Z(\Updelta G) = {\frac{{\Updelta G^{\text{Real}} - \overline{\Updelta G}^{\text{Rand}} }}{{\sqrt {\sum\limits_{i} {{\frac{{\left( {\Updelta G_{i}^{\text{Rand}} - \overline{\Updelta G}^{\text{Rand}} } \right)^{2} }}{N - 1}}} } }}} $$

where \( \Updelta G^{\text{Real}} \), \( \Updelta G_{i}^{\text{Rand}} \) and \( \overline{\Updelta G}^{\text{Rand}} \) are the free energy of the real tRNA sequence, the free energy of the ith simulated tRNA and the average free energy of the simulated ensemble, respectively. N is the total number of simulated sequences, which was 1,000 in all experiments.

A Z test was performed in order to verify whether the DH (or CI) structure has a significantly lower free energy (i.e. is energetically more stable) in real tRNA sequences than expected for a random sequence. Under the null hypothesis, the Z scores for all tRNA sequences should be normally distributed with mean zero.
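As a minimal sketch of the Z score defined above, the following Python function compares one real value with an ensemble of simulated values, using the sample standard deviation (N − 1 in the denominator) as in the formula. The function name and the toy numbers in the usage lines are illustrative assumptions, not data from this study.

```python
import statistics


def z_score(real_value, simulated_values):
    """Number of sample standard deviations between a real value and the
    mean of the simulated ensemble (N - 1 in the variance denominator)."""
    mean_rand = statistics.mean(simulated_values)
    sd_rand = statistics.stdev(simulated_values)  # uses N - 1
    if sd_rand == 0:
        raise ValueError("simulated values have zero variance")
    return (real_value - mean_rand) / sd_rand


# Toy free energies in kcal/mol (hypothetical, for illustration only).
toy_real = -30.0
toy_simulated = [-20.0, -22.0, -18.0]
toy_z = z_score(toy_real, toy_simulated)
```

A negative value, as in this toy case, means that the real sequence has a lower free energy than the simulated ensemble. The same function applies unchanged to identity percentages, where a positive value indicates higher identity than in the simulated sample.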

Comparison of the CL Region

Alignments of different fragments were obtained using the Muscle software (Edgar 2004), integrated into custom Perl scripts. The identity percentage, calculated as the number of identical nucleotides in the alignment, was computed for real tRNAs from the database and compared with that observed in a simulated sample. For each real tRNA, 10,000 simulated sequences were generated as described in the previous section.

$$ Z_{\text{mn}} (I_{\text{mn}} ) = {\frac{{I_{\text{mn}}^{\text{Real}} - \overline{{I_{\text{mn}}^{\text{Rand}} }} }}{{\sqrt {\sum\limits_{i} {{\frac{{\left( {\left( {I_{\text{mn}}^{\text{Rand}} } \right)_{i} - \overline{{I_{\text{mn}}^{\text{Rand}} }} } \right)^{2} }}{N - 1}}} } }}} $$

m and n are the sequence fragments of the tRNA molecule as described in Figs. 2 and 3. \( I_{\text{mn}} \) is the identity percentage of the comparison mn. \( I_{\text{mn}}^{\text{Real}} \) is the identity percentage found in real tRNA sequences, \( \overline{{I_{\text{mn}}^{\text{Rand}} }} \) is the average identity percentage found in the simulated sample, \( \left( {I_{\text{mn}}^{\text{Rand}} } \right)_{i} \) is the identity percentage observed in the ith simulated sequence, and N is the number of simulated sequences, which was 1,000 in all experiments.

Fig. 2

Typical tRNA cloverleaf structure (CL). For convenience, segments of the tRNA are progressively labelled with letters from the 5′ end to the 3′ end to represent different regions of DH. Marks +1, +2, and +3 show the nucleotides added to the 5′ end of region I (TΨC-loop)

Fig. 3

Typical tRNA cloverleaf structure (CL). For convenience, segments of the tRNA are progressively labelled with letters from the 5′ end to the 3′ end to represent different regions of CI. Q and R fragments include regions from B to E and from H to J, respectively

In the intra-molecular analysis, we compare regions of tRNAs that belong to the same molecule. In the inter-molecular analysis, the fragment of the 5′ half of a molecule was compared with a fragment of the 3′ half of another tRNA molecule taken randomly from the remaining tRNAs of the database, with the exclusion of all tRNAs coding for the same amino acid.
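The partner selection for the inter-molecular analysis can be sketched as below. This is a hedged illustration only: the function name `pick_partner` and the representation of the database as a list of amino-acid labels are assumptions introduced here, not details given in the text.

```python
import random


def pick_partner(index, amino_acids, rng=random):
    """Return the index of a randomly chosen tRNA that codes for a different
    amino acid than the tRNA at `index`.

    `amino_acids` is a list with one amino-acid label per tRNA in the
    database (an assumed representation).
    """
    own = amino_acids[index]
    # Excluding the same amino acid also excludes the tRNA itself.
    candidates = [j for j, aa in enumerate(amino_acids) if aa != own]
    if not candidates:
        raise ValueError("no tRNA with a different amino acid is available")
    return rng.choice(candidates)
```

The 5′ fragment would then be taken from the tRNA at `index` and the 3′ fragment from the tRNA at the returned index.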

Results

We analyse two different hypothetical tRNA secondary structures, the double hairpin (DH) and cigar (CI) as possible evolutionary precursors of the modern cloverleaf (CL) secondary structure.

In order to evaluate the structural compatibility of the modern tRNA sequences with the DH and CI structures, we develop an algorithm that topologically transforms a given specific CL, as provided in the tRNA database (Juhling et al. 2009), into a DH or a CI structure. These topological transformations are shown in Figs. 4 and 5, respectively.

Fig. 4

a Representation of a typical tRNA cloverleaf secondary structure (CL); black and white solid circles represent the nucleotides unpaired in CL but that are tentatively paired in the double hairpin structure (DH). Grey solid circles represent nucleotides topologically equivalent in CL and DH. b DH secondary structure of Nanoarchaeum equitans Pro tRNA (anticodon GGG) ΔG = −31.00 kcal/mol

Fig. 5

a Representation of a typical tRNA cloverleaf secondary structure (CL). Black and white solid circles represent the nucleotides that are unpaired in CL but are tentatively paired in the cigar structure (CI). Grey solid circles represent nucleotides that are topologically equivalent in CL and CI. b CI secondary structure of Staphylothermus marinus Arg tRNA (anticodon GCG), ΔG = −28.24 kcal/mol

The D-loop and the TΨC-loop and their respective stems are structurally conserved in the DH (grey solid circles in Fig. 4a). Nucleotides that are tentatively paired in DH but were not paired in CL are marked as black and white solid circles for the regions at 5′ and 3′ of the anticodon loop, respectively (Fig. 4a).

In the case of the CI structure, the anticodon loop and the acceptor stem remain preserved during the topological rearrangement (grey solid circles in Fig. 5), whereas the other regions of the original CL are re-arranged to form the CI. Nucleotides that are paired in the CI but were not paired in the CL structure are marked as black and white solid circles (Fig. 5a).

Once the DH and CI are obtained, it is possible to estimate the energy associated with these structures. Figures 4b and 5b report, as examples, the putative DH and CI secondary structures obtained from real tRNA sequences present in the database. In the case of the DH structure, the lowest ΔG energy observed (−31.00 kcal/mol) was that of the tRNA sequence for proline (anticodon GGG) of Nanoarchaeum equitans (Fig. 4b). In the case of the CI structure, the lowest ΔG energy observed (−28.24 kcal/mol) was that of the tRNA sequence for Arg (anticodon GCG) of Staphylothermus marinus (Fig. 5b).

For each corresponding modern tRNA sequence, the energy associated with DH and CI was compared to what was expected by chance. For each single tRNA, we generated a library of 1,000 random sequences with the following characteristics: (i) the base composition is the same as the real sequence; (ii) the random sequences can fold in a CL that is the same as what is reported in the tRNA database (Juhling et al. 2009) for the corresponding sequence; and (iii) the anticodon and its position are preserved in the anticodon loop (see ‘Materials and Methods’ section for details).

For each tRNA, we calculated a Z(ΔG) value that represents the number of standard deviations by which the energy of the DH (or CI) secondary structure of the real tRNA sequence differs from the mean of the simulated sequences generated as described in the ‘Materials and Methods’ section. As reported by Chamary and Hurst (2005), Z(ΔG) can be interpreted as a measure of the ‘relative stability’ of a given secondary structure relative to what one would expect by chance alone, with a negative Z score implying a higher than expected stability (see ‘Materials and Methods’ section).

As shown in Table 1, real tRNA sequences have, on average, significantly more stable DH and CI secondary structures than expected by chance. In our case, because we are testing the hypothesis that the double hairpin and the cigar were evolutionary precursors of the modern cloverleaf structure, negative Z(ΔG) values can be interpreted as a structural echo that is still present in modern tRNA sequences (Table 1).

Table 1 ΔG free energy values relative to different secondary structures

A close inspection of the ΔG values associated with the DH and CI secondary structures of modern tRNA sequences shows that in some cases (see Tables 9 and 10 in the Appendix) the ΔG value for those hypothetical structures is comparable to, if not lower than, that of the biologically functional CL structure. This observation does not necessarily imply that these structures exist or that they have any biological function in modern cells, but it strongly indicates that present-day tRNA sequences are structurally compatible with the postulated DH and CI secondary structures.

The DH model predicts the association of two hairpin structures to generate the double hairpin structure (Di Giulio 1992, 1999, 2004; Tanaka and Kikuchi 2001). For this reason, we have constructed a DH structure by assembling two hairpins from two randomly chosen halves from the tRNA database. The DH was constructed such that the first hairpin and the anticodon loop in Fig. 4 belong to the first tRNA molecule, whereas the second hairpin belongs to a randomly chosen tRNA (with the exclusion of tRNAs that code for the same amino acid as the first half). We then compared the ΔG value associated with the DH structure made from real half tRNAs with what is expected by chance. The analysis shows that DHs constructed as described above are significantly more stable than expected by chance alone (ΔG mean = −9.67 kcal/mol, Z(ΔG) = −0.352, Z test: P < 2 × 10⁻¹⁶).

We apply the same procedure in order to evaluate the stability of the CI structure assembled from two halves of two different tRNA sequences. The resulting structures show, also in this case, a significantly lower ΔG energy than in a random sample (Z(ΔG) = −0.352, Z test: P < 2 × 10⁻¹⁶), although in this case the individual structures are not energetically stable (ΔG > 0), with a ΔG mean = +19.23 kcal/mol.

The observed difference in stability between the DH and CI structures constructed by inter-molecular assembly is probably attributable to the different constraints of the two structures. In the case of DH, the stem regions close to the D-loop and TΨC-loop are necessarily paired because they are paired in the CL and belong to the same half of the molecule. Instead, in the case of the inter-molecular assembly of CI structures, all putative stem portions belong to different halves of tRNAs and consequently are not necessarily paired in the inter-molecular chimera.

A complementary approach to testing the hypothesis that DH and CI were structural precursors of the modern CL structure is to check the relationships expected amongst different regions of the tRNA if DH and CI predate the appearance of the CL secondary structure in the evolution of the tRNA.

Double Hairpin Structure

In Fig. 2, we conveniently label different regions of a typical CL to match up to different segments of the DH structure. It is then possible to find the expected relationship between sequence fragments under the model of the assembly of two hairpins (Di Giulio 1992, 1999). For instance, we expect that regions A and E, and G and K, should show some degree of complementarity because they form the two stems of the DH structure. Regions A and G, and E and K, should instead be homologous because they are derived from the joining of two homologous hairpins (Di Giulio 1992, 1999). To analyse whether modern tRNA sequences still reflect the relationships predicted by the two different models, we compare the identity percentages of these regions with what is expected by chance. In the case of the similarity (homology) relationship, we directly align the two regions of interest, whereas in the case of complementarity we align one region with the reverse complement of the other and then estimate the identity percentage.
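The two comparisons described above can be illustrated with a short Python sketch. It is not the pipeline used in this study: the alignment step (Muscle) is not reproduced, so both functions assume that their inputs are already aligned strings of equal length with ‘-’ as the gap character. The use of the alignment length as the denominator of the identity percentage and the restriction to the bases A, C, G, U are also assumptions.

```python
# Assumption: plain RNA alphabet with Watson-Crick complements only.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA fragment written 5' to 3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same nucleotide.

    Gap columns never count as matches; the denominator is the
    alignment length (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned fragments must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Similarity: compare the two regions directly.
similarity = percent_identity("GCGGAUU", "GCGGAUU")

# Complementarity: compare one region with the reverse complement of the other.
complementarity = percent_identity("GCGGAUU", reverse_complement("AAUCCGC"))
```

In this toy example both values are 100, because the second fragment of the complementarity test is the exact reverse complement of the first. With real fragments, the reverse-complemented region would be aligned first and the identity percentage then computed on that alignment.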

A library of 1,000 simulated tRNAs was obtained as described in the ‘Materials and Methods’ section. For each comparison (mn) and tRNA in the database, we calculate a Z_mn(I) value that represents the number of standard deviations by which the identity percentage observed in the real tRNA sequences differs from that observed in the simulated sequences. Positive values of Z_mn(I) represent a similarity higher than is expected by chance alone.

We perform an intra- and an inter-molecular comparison. In the first case, the regions compared belong to the same tRNA sequence. In the latter case, the second region belongs to a second tRNA molecule chosen randomly from the database (see ‘Materials and Methods’ section for details).

Results show that regions A and E are significantly more complementary than expected by chance (Tables 2, 4). Similarly, regions A-G and E-K show a statistically significant degree of similarity, as predicted by the model of the assembly of two hairpins (Tables 2, 4). Contrary to what this model predicts, regions G and K do not show complementarity. Likewise, the regions of the D-loop and the TΨC-loop seem to show a complementarity relationship rather than the similarity relationship predicted by this model (Tables 2, 4).

Table 2 Intra-molecular comparison of different regions of tRNAs as labelled in Fig. 2 corresponding to the DH structure

Cigar Structure

Similarly to what is described in the above section, in Fig. 3 we have conveniently labelled different segments of the CL in order to outline the relationships predicted by the CI model. Regions A and K shown in Fig. 3 are complementary in both CI and CL secondary structures and are, thus, not interesting for the analysis, whereas regions B and J, which are not related in CL, are predicted to be complementary in the CI secondary structure. In Tables 6 and 7, we report the results of the identity percentages relative to the expected relationships of different tRNA fragments according to the CI model. The results seem contrasting because regions D and H show significant complementarity as predicted by the CI model, whereas fragments B and J are significantly less complementary than is expected only by chance (Tables 6, 7).

According to the model of evolution of the tRNA molecule, the joining point for the assembly of the two ancestral hairpins is represented by the position of the intron commonly found in tRNA genes of the three domains of life and depicted in Figs. 2 and 3 (Di Giulio 1992, 1995, 1999). The 5′ and 3′ halves were, for convenience, labelled X and Y, respectively. We analyse the relationships of homology and complementarity between these two halves by evaluating the identity percentages and comparing them with what is expected by chance. The results show that the homology relationship between these two halves is highly significant in both the intra- and the inter-molecular analysis (Table 8). The complementarity relationship, instead, is either not significant or significantly lower than expected by chance alone (Table 8).

Discussion

The presence of the double hairpin and the cigar is undeniable, given the high statistical and thermodynamic significance with which these structures are observed in tRNA sequences (Tables 1, 2, 3, 4, 5, 6, 7, 8; Figs. 4, 5; see also Appendix). These observations confirm and extend those of Tanaka and Kikuchi (2001), who identified the presence of the double hairpin in tRNA sequences. This could be interpreted both as a memory in the tRNA sequences of how the tRNA molecule originated and as evidence in favour of the hypothesis that these structures could actually be used in present-day organisms. Indeed, there is some experimental evidence in favour of the hypothesis that the double hairpin might be a structure existing in modern-day cells (Hori et al. 2000). However, while the presence of the double hairpin is significant in intra-molecular analysis (Table 1) and, consequently, also in inter-molecular analysis (P < 2 × 10⁻¹⁶, see ‘Results’ section), thus producing one of the greatest pieces of empirical evidence in favour of the hypothesis that the single hairpin was the precursor of the tRNA molecule (Hopfield 1978), the presence of the cigar is significant in intra-molecular analysis (Table 1) and only partially significant in inter-molecular analysis because the latter’s ΔG is much higher than zero (ΔG mean = +19.23 kcal/mol, see ‘Results’ section), suggesting that the presence of the cigar might be less ‘strong’ than that of the double hairpin. In other words, the presence of the cigar in intra-molecular analysis could be interpreted by maintaining that it is observed not because it reflects Moller and Janssen’s model (1990, 1992) of the complementarity relationship between the two tRNA halves but because it reflects the homology relationship between the two halves as predicted by Di Giulio’s model (1992, 1999). Indeed, the data seem to favour the homology more than the complementarity of the two halves of the tRNA molecule (Tables 2, 3, 4, 5, 6, 7, 8).
With the exception of the important loop region (Table 3), the individual regions are seen to be more homologous than complementary (Tables 2, 4). The other observation in favour of complementarity is that regions D and H (Fig. 3) seem complementary (Tables 6, 7) although regions B and J (Fig. 3) do not; indeed they are significantly less complementary than is expected by chance (Tables 6, 7). Furthermore, a similar observation holds for regions R and Q (which include regions H, I and J and B, C and D, respectively (Fig. 3)) which display a complementarity which is significant but negative (Tables 6, 7) and, therefore, seem indeed to indicate that the homology relationship is probably true. Finally, it must be remembered that in real tRNAs, the loop regions can interact with the tertiary structure to form hydrogen bonds between bases and this could favour complementarity between the two halves of tRNAs rather than their homology. All these observations favouring complementarity and not homology between the two halves of tRNAs contrast with the analyses regarding the ancestral sequences of tRNAs, which show that the loop regions are homologous and not complementary (Di Giulio 1992, 1995; Widmann et al. 2005). In particular, the invariance of some nucleotides in the loops and neighbouring regions (Eigen et al. 1989) is such as to make the similarity of these regions unquestionable (Di Giulio 1992) even if it was not easy to identify the truly homologous nucleotides in the two loops (Di Giulio 1992) and this might, at least partly, justify the contrasting behaviour of these regions (Tables 2, 3, 4, 5; regions C and I). Therefore, we verified the robustness of the complementarity in the loop regions and we find that it transforms into homology when the compared regions of the TΨC-loop are less rigid and allow two or three nucleotides from the nearby stem to be included in the analysis (Tables 3, 5; Fig. 3). 
Indeed, the loop regions now become similar and no longer complementary (Tables 3, 5), thus supporting a homology rather than a complementarity relationship between the two tRNA halves. This is confirmed by the analysis on the entire halves of tRNAs (Table 8). Indeed, the two halves of tRNAs seem homologous and not complementary (Table 8) and, in particular, in the intra-molecular comparison the complementarity is significantly lower than is expected by chance (Table 8).

Table 3 Intra-molecular comparison of the regions C and I (D-loop and the TΨC-loop, respectively) of tRNAs as labelled in Fig. 2
Table 4 Inter-molecular comparison of different regions of tRNAs as labelled in Fig. 2 corresponding to the DH structure
Table 5 Inter-molecular comparison of regions C and I (D-loop and the TΨC-loop, respectively) of tRNAs as labelled in Fig. 2
Table 6 Intra-molecular comparison of different regions of tRNAs as labelled in Fig. 3 corresponding to the CI structure
Table 7 Inter-molecular comparison of different regions of tRNAs as labelled in Fig. 3 corresponding to the CI structure
Table 8 Intra- and inter-molecular comparison of the two halves (X = 5′ half and Y = 3′ half) of tRNAs

We support the hypothesis that, on the whole, these data favour the two hairpin assembly model (Di Giulio 1992, 1995, 1999, 2004) and not the inverted duplication model (Moller and Janssen 1990, 1992) because a comparison of the two whole halves of tRNAs favours a homology relationship rather than a complementary one (Table 8). We believe that the analysis referring to the single regions (Tables 2, 3, 4, 5, 6, 7) and in particular to regions C and I (Tables 3, 5) may be subject to errors due to the inelastic, i.e. constrained, choice of the segments to be compared that cannot use any nucleotides from neighbouring regions, as we have shown above. We, therefore, suggest that the high statistical significance with which the cigar is observed (Table 1; Fig. 5; see Appendix) should be interpreted by saying that this secondary structure is expected because it actually reflects more the direct than the inverse duplication, given that it is almost impossible to distinguish between these duplications if they acted on a hairpin (Fig. 1, see ‘Introduction and Hypothesis’ section). In other words, the cigar ought to have been observed because it can be derived from the cloverleaf even if it is not its evolutionary stage but simply an effect of the quasi equivalence of the homology and complementarity relationships between the two tRNA halves when they stem from action on a hairpin structure (see also ‘Introduction and Hypothesis’ section). That is, the double hairpin and the cigar are equivalent structures simply because they are two faces of the same coin as a result of the quasi equivalence of the homology and complementarity relations if acting on a single hairpin. These structures thus represent and bear witness to the transition from the single hairpin to the cloverleaf of tRNA.

Nevertheless, we more generally favour the homology between the two halves of tRNA and not their complementarity, above all because Nanoarchaeum equitans was found to contain completely split tRNA genes (Randau et al. 2005) in which the tRNA molecule is encoded by two distant genes whose products are later joined to form the final tRNA molecule (Randau et al. 2005). These split tRNA genes are shown to be the ancestral form of tRNA genes (Di Giulio 2006a, 2006b, 2008c, 2009b) and also define their polyphyletic origin, which is contrary to the complementarity hypothesis of the two halves of tRNA (Di Giulio 2006a, 2006b, 2008a, 2008b). These genes also seem to have all the characteristics predicted and required by our model (Di Giulio 2009a).