Abstract
Transversion and transition mutations affect the stability of RNA secondary structure differently: transversions destabilize the double helix geometry to a greater extent because they introduce purine:purine (R:R) or pyrimidine:pyrimidine (Y:Y) base pairs. Therefore, transversion frequency is likely to be lower than transition frequency in the secondary structure regions of RNA genes. Here, we analyzed transition and transversion frequencies in tRNA genes, which have well-defined secondary structures, and compared them with those in the intergenic regions in five bacterial species, namely Escherichia coli, Klebsiella pneumoniae, Salmonella enterica, Staphylococcus aureus and Streptococcus pneumoniae, using a large genome sequence data set. In general, the transversion frequency was lower than the transition frequency in both tRNA genes and intergenic regions. The transition to transversion ratio was greater in tRNA genes than in the intergenic regions in all five bacteria that we studied. Interestingly, the intraspecies base substitution analysis in tRNA genes revealed that non-compensatory substitutions were more frequent than compensatory substitutions in the stem region. Further, the transition to transversion ratio in the loop region was significantly lower than that among the non-compensatory substitutions in the stem region. This indicated that transversions are more deleterious than transitions in the stem regions. In addition, substitutions from amino bases (A/C) to keto bases (G/T) were more frequent than the reverse substitutions in the stem region. Substitutions from amino bases to keto bases are likely to facilitate the stable G:U pairing, unlike the reverse substitutions, which facilitate the unstable A:C pairing in the stem region of tRNA. This work provides additional support for the view that the secondary structure of the tRNA molecule drives the different substitution frequencies in its gene sequence.
Introduction
In DNA there are four different transition (ti) substitutions, in which a purine (or a pyrimidine) base is replaced by another purine (or pyrimidine) base (R → R; Y → Y). Similarly, there are eight different transversion (tv) substitutions, in which a purine (or a pyrimidine) base is replaced by a pyrimidine (or a purine) base (R → Y; Y → R) (Fersht and Knill-Jones 1981) (Fig. 1). If all 12 substitutions occurred at equal rates, the ti/tv ratio would be 4/8 = 0.50. However, the ratio reported in different genomes is usually ~ 2.00 or more, which suggests that an individual transition is ~ four times more frequent than an individual transversion in DNA (Seplyarskiy et al. 2012; Duchêne et al. 2015; Stoltzfus and Norris 2016; Lyons and Lauring 2017). This bias towards transitions has been known since early comparisons of homologous DNA sequences (Gojobori et al. 1982; Wu and Maeda 1987). Purine:purine and pyrimidine:pyrimidine pairings distort the geometry of the double helix, which is one reason why transversion frequency is lower than transition frequency (Topal and Fresco 1976; Sinha and Haimes 1981). The lower transversion frequency can also be due to several other factors. One such factor is cytosine deamination in DNA during replication and transcription, which causes one of the most common transitions, C → T (G → A on the complementary strand) (Lewis et al. 2016; Tubbs and Nussenzweig 2017). In coding regions, transversions at the third and first codon positions are more often non-synonymous than transitions (Vogel and Kopun 1977; Abdullah et al. 2016).
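The arithmetic behind the 0.50 baseline and the roughly four-fold per-substitution bias can be checked with a minimal Python sketch. The function and variable names are illustrative and are not taken from the scripts used in this study; the observed ratio of 2.0 is simply the approximate value quoted above.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify(ref: str, alt: str) -> str:
    """Return 'ti' for a transition (R->R or Y->Y) and 'tv' for a transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "ti" if same_class else "tv"

# Enumerate all 12 directed substitutions among A, C, G and T.
kinds = [classify(ref, alt) for ref, alt in permutations("ACGT", 2)]
n_ti, n_tv = kinds.count("ti"), kinds.count("tv")

# With equal rates, the expected ti/tv ratio is the ratio of path counts.
equal_rate_ratio = n_ti / n_tv

# Per-substitution bias implied by an observed genome-wide ratio of ~2.0.
observed_ratio = 2.0
per_path_bias = observed_ratio / equal_rate_ratio
```

Here `n_ti` and `n_tv` are the numbers of distinct transition and transversion paths, and `per_path_bias` is the factor by which a single transition path exceeds a single transversion path under the assumed observed ratio.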
Though it is well known that ti is more frequent than tv in genomes, the factors influencing their occurrence remain to be explored further. Transitions and transversions were observed with similar frequencies at non-methylated cytosine sites in grasshopper pseudogenes (Keller et al. 2007). Selective constraints imposed by secondary structure can account for the relative enrichment of ti in tRNA and rRNA genes (Knies et al. 2008). In tRNA and rRNA genes of thermophilic bacteria, there is selection for G + C enrichment in the stem region but not in the loop region (Galtier and Lobry 1997; Wang and Hickey 2002; Hickey and Singer 2004). It is known that RNA secondary structure plays an important role in gene expression and regulation (Park et al. 2013). Transitions and transversions can have different impacts on the double helical regions of the encoded RNA, as explained with the schematic scenarios in Fig. 2. For example, starting from a G:C pair, the transition G → A results in A:C pairing, while starting from an A:U pair, the transition A → G results in G:U pairing in the encoded transcript. A transversion such as G → U (R → Y) results in U:C (Y:Y) pairing, while a U → G (Y → R) substitution results in G:A (R:R) pairing in the transcript. The different base pairings produced by ti and tv substitutions can cause different magnitudes of instability in the secondary structure of a transcript (Rossetti et al. 2015). The selection pressure on secondary structure acts in favor of transitions and against transversions, which is reflected in changes in free energy (Fig. 2). Therefore, it can be assumed that tv and ti mutations are fixed/selected according to evolutionary constraints associated with maintenance of tRNA secondary structure.
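The pairing outcomes described above can be reproduced with a small Python sketch that applies one substitution to a Watson-Crick pair and labels the result. The labels and the helper name `substitute_first` are illustrative assumptions; the sketch says nothing about free energies, which are the subject of Fig. 2.

```python
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = frozenset("GU")
PURINES = set("AG")

def pair_class(b1: str, b2: str) -> str:
    """Label an RNA base pair by geometry class (illustrative labels)."""
    pair = frozenset((b1, b2))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair == WOBBLE:
        return "G:U wobble"
    if b1 in PURINES and b2 in PURINES:
        return "R:R mismatch"
    if b1 not in PURINES and b2 not in PURINES:
        return "Y:Y mismatch"
    return "A:C mismatch"  # the only remaining purine:pyrimidine combination

def substitute_first(pair: tuple, new_base: str) -> tuple:
    """Replace the first base of a pair, leaving its partner unchanged."""
    return (new_base, pair[1])

# Transitions: G->A in a G:C pair, A->G in an A:U pair.
ti_results = [pair_class(*substitute_first(("G", "C"), "A")),
              pair_class(*substitute_first(("A", "U"), "G"))]

# Transversions: G->U in a G:C pair, U->G in a U:A pair.
tv_results = [pair_class(*substitute_first(("G", "C"), "U")),
              pair_class(*substitute_first(("U", "A"), "G"))]
```

A single transition in a paired position yields either an A:C mismatch or a G:U wobble, whereas a single transversion always yields an R:R or Y:Y mismatch.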
Earlier studies of ti and tv in tRNA genes have mainly been carried out by comparing genes across species (Higgs 2000; Savill et al. 2001; Jow et al. 2002; Hudelot et al. 2003). The main finding is that compensatory transition or transversion substitutions are more frequent than single-site independent substitutions in the stem regions of these genes. Further, the ti to tv ratio is observed to be higher in the stem region than in the loop region because compensatory tv is rarer than compensatory ti. However, a systematic study of ti and tv in tRNA based on comparing gene sequences within a species has not been reported in the literature to date. Because single substitutions are more frequent than double substitutions, non-compensatory substitutions in the stem region are expected to be observed when the study is carried out within a species. In addition, as G:U base pairing is often allowed in tRNA stems, a non-compensatory substitution from an amino base to a keto base might be preferred over the reverse substitution, which is yet to be explored. The availability of genome sequences from a large number of strains belonging to a species has opened up an avenue to address these questions. tRNA structures are relatively easy to survey for identifying the stem and the loop regions in different organisms. Therefore, in this study, we carried out a comparative analysis of ti and tv substitutions in tRNA genes using large whole genome datasets of five bacterial species: Escherichia coli (Ec), Klebsiella pneumoniae (Kp), Salmonella enterica (Se), Staphylococcus aureus (Sa) and Streptococcus pneumoniae (Sp). The ti to tv ratio was found to be higher in the stem regions than in the loop regions of tRNA genes. Further, transitions from amino to keto bases were found to be more frequent than the reverse transitions in the stem regions.
These observations indicate that secondary structure in tRNA differentially influences the transition and transversion frequencies in these organisms.
Materials and Methods
Extracting Intergenic Regions, tRNA Genes and Segregating Loop and Stem Regions
In this study, we considered five bacteria, Escherichia coli (Ec), Klebsiella pneumoniae (Kp), Staphylococcus aureus (Sa), Salmonella enterica (Se) and Streptococcus pneumoniae (Sp), for which a large number of genome sequences were available in public databases. To identify base substitutions, we carried out a detailed computational analysis of the tRNA genes in whole-genome alignments of 157 Ec strains (Thorpe et al. 2017), 208 Kp strains (Holt et al. 2015), 132 Sa strains (Reuter et al. 2016), 366 Se strains (Thorpe et al. 2017), and 264 Sp strains (Chewapreecha et al. 2014). Python scripts were written to extract alignments of the tRNA gene sequences from the alignments of genome sequences using coordinate information of the annotated tRNA genes. Intergenic regions (IRs), which are the stretches of DNA located between genes, were also extracted using the coordinates of the annotated genes (protein coding genes, tRNA genes and rRNA genes). Genomic compositional details of the tRNA genes and IRs analyzed are given in Table 1.
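The extraction step can be sketched as follows. This is a minimal illustration, not the scripts used in this study: it assumes the alignment is held as a dictionary of equal-length strings keyed by strain name, that coordinates are 1-based and inclusive on the alignment, and that strand orientation is handled elsewhere. The function names are hypothetical.

```python
def slice_alignment(alignment: dict, start: int, end: int) -> dict:
    """Return alignment columns start..end (1-based, inclusive) for every strain."""
    return {name: seq[start - 1:end] for name, seq in alignment.items()}

def intergenic_intervals(genes: list, genome_length: int) -> list:
    """Return the gaps between annotated genes as 1-based inclusive (start, end)."""
    gaps, cursor = [], 1
    for start, end in sorted(genes):
        if start > cursor:
            gaps.append((cursor, start - 1))
        # max() keeps the cursor correct when annotated genes overlap.
        cursor = max(cursor, end + 1)
    if cursor <= genome_length:
        gaps.append((cursor, genome_length))
    return gaps

# Toy alignment of two strains and two annotated genes (illustrative only).
toy_alignment = {"strain1": "AAACCCGGGTTT", "strain2": "AAACCTGGGTTT"}
toy_genes = [(4, 6), (10, 11)]

gene_columns = slice_alignment(toy_alignment, 4, 6)
ir_coordinates = intergenic_intervals(toy_genes, genome_length=12)
```

Passing each interval in `ir_coordinates` back to `slice_alignment` would give the intergenic alignments in the same form as the gene alignments.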
We obtained the predicted secondary structure of the tRNA genes using the tRNAscan-SE On-line webserver (Lowe and Chan 2016), available at http://lowelab.ucsc.edu/tRNAscan-SE/. The associated web database (GtRNAdb) is rich in genomic tRNA information. For a given query tRNA gene sequence, it confirms the cloverleaf-shaped tRNA secondary structure, gene size, anticodon position and anticodon. tRNA sequences that were extracted from the genome alignments but not confirmed in GtRNAdb were not analyzed in this study. The numbers of extracted genes of Ec, Kp, Sa, Se and Sp were 89, 86, 61, 88 and 59, respectively, whereas the numbers of genes confirmed in GtRNAdb were 87, 85, 61, 87 and 58, respectively. Using this webserver, we obtained the predicted secondary structures of the tRNAs in dot-bracket notation (Supplementary Table 1); the list of tRNA genes of the five bacteria is presented in Supplementary Table 2. For further analysis, we used these predicted secondary structures to classify nucleotides into two groups: (i) paired and (ii) unpaired. Paired and unpaired nucleotides are considered largely to be from stem and loop regions, respectively. In general, the paired nucleotides, or the stem region, are the bases that are paired in the acceptor stem, D-arm, anticodon-arm, variable region and T-arm. The unpaired nucleotides of the above four arms and the variable region are considered as the bases in the loop regions (Supplementary Fig. 1).
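The paired/unpaired classification can be derived from a dot-bracket string with a simple stack. The sketch below assumes the structure uses only `(`, `)` and `.` with no pseudoknot symbols; the function names and the toy structure are illustrative, and positions are 0-based.

```python
def pair_table(structure: str) -> dict:
    """Map each paired position to its partner (0-based indices)."""
    stack, partner = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            partner[i], partner[j] = j, i
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return partner

def split_regions(structure: str) -> tuple:
    """Return (stem_positions, loop_positions) as sorted lists."""
    partner = pair_table(structure)
    stem = sorted(partner)
    loop = [i for i in range(len(structure)) if i not in partner]
    return stem, loop

# Toy hairpin, not a real tRNA structure.
toy_partner = pair_table("((..)).")
toy_stem, toy_loop = split_regions("((..)).")
```

The partner map is also what a later step needs in order to decide whether a substitution in a paired position is accompanied by a substitution at the pairing position.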
Segregating Compensatory and Non-compensatory Substitutions in Stem Regions
For each tRNA gene, substitution positions were mapped to the secondary structure and segregated into loop and stem regions. Further, the substitutions in the stem region were marked as compensatory when both positions of a base pair carried substitutions, and as non-compensatory when only one position of a base pair carried a substitution (Fig. 3).
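This rule can be expressed as a short Python sketch. It assumes a partner map of the kind derived from dot-bracket notation (paired position to pairing position, 0-based) and a set of substituted positions for one sequence; both inputs and the function name are illustrative. The sketch only checks whether both members of a pair are substituted and does not test whether the new bases can actually pair.

```python
def label_substitutions(substituted: set, partner: dict) -> dict:
    """Label each substituted position as loop, compensatory or non-compensatory."""
    labels = {}
    for pos in sorted(substituted):
        if pos not in partner:
            labels[pos] = "loop"
        elif partner[pos] in substituted:
            labels[pos] = "compensatory"
        else:
            labels[pos] = "non-compensatory"
    return labels

# Toy hairpin with pairs (0, 5) and (1, 4); positions 2, 3 and 6 are unpaired.
toy_partner = {0: 5, 5: 0, 1: 4, 4: 1}
toy_labels = label_substitutions({0, 5, 1, 2}, toy_partner)
```

In the toy case, positions 0 and 5 change together and are labelled compensatory, position 1 changes without its partner and is labelled non-compensatory, and position 2 falls in the loop.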
Finding Substitutions from the Sequence Alignments
Taking the most frequent nucleotide at each position in the alignment of the nucleotide sequences of a tRNA gene, we computed a reference sequence and then used this reference sequence to identify substitutions in each sequence (Thorpe et al. 2017; Supplementary Document 1). Substitution frequencies were computed by dividing the total count of a given substitution by the total number of occurrences of the nucleotide that was substituted. For example, if the total number of C → U substitutions is 2 and the total number of C in a tRNA gene sequence is 10, then the normalized frequency is 2/10 = 0.2. We further wrote a Python script to classify these substitutions into transitions (ti) and transversions (tv). Observed substitutions in the secondary structure of a sample tRNA gene are shown in Supplementary Fig. 1. We further classified the substitutions in stem regions as compensatory and non-compensatory. For statistical analysis and determination of p-values for significance testing, the Mann-Whitney test was used (Mann and Whitney 1947).
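The reference construction and the normalization can be sketched in Python as below. This is an illustration under stated assumptions rather than the script used here: sequences are aligned DNA strings of equal length, gap characters are skipped, every strain carrying a difference from the reference contributes one count, and the denominator is the number of occurrences of the substituted base in the reference sequence. The toy data are constructed to reproduce the 2/10 example.

```python
from collections import Counter

def consensus(seqs: list) -> str:
    """Most frequent character in each alignment column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def substitution_counts(seqs: list, ref: str) -> Counter:
    """Count (ref_base, observed_base) differences over all sequences."""
    counts = Counter()
    for seq in seqs:
        for ref_base, base in zip(ref, seq):
            if base != ref_base and "-" not in (base, ref_base):
                counts[(ref_base, base)] += 1
    return counts

def normalized_frequencies(counts: Counter, ref: str) -> dict:
    """Divide each count by the number of times the substituted base occurs in ref."""
    base_totals = Counter(ref)
    return {sub: n / base_totals[sub[0]] for sub, n in counts.items()}

# Toy gene with ten C positions; two strains each carry one C->T change.
toy_seqs = [
    "CCCCCCCCCCAG",
    "CCCCCCCCCCAG",
    "CCCCCCCCCCAG",
    "TCCCCCCCCCAG",
    "CCCTCCCCCCAG",
]
toy_ref = consensus(toy_seqs)
toy_counts = substitution_counts(toy_seqs, toy_ref)
toy_freqs = normalized_frequencies(toy_counts, toy_ref)
```

With these toy sequences, the C → T entry of `toy_freqs` is 2/10, matching the worked example in the text (C → T in the gene corresponds to C → U in the transcript).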
Visualization of 2-D and 3-D Structures of tRNA Genes
For 2-D visualization of tRNA secondary structure, we used the tRNAscan-SE On-line software (Lowe and Chan 2016). To visualize the tRNA 3-D structures, we used two web servers. First, we gave the tRNA sequence and the secondary structure in dot-bracket notation obtained from tRNAscan-SE On-line as input to the Vfold3D webserver (Xu et al. 2014; Zhao et al. 2017) to obtain the 3-D structure in pdb format. Next, the 3-D structure of the tRNA was visualized from the pdb file using the iCn3D web server (Wang et al. 2020) (Supplementary Fig. 1).
Results
Higher Transition to Transversion Ratio in tRNA Genes than Intergenic Regions
Using multiple sequence alignments of hundreds of strains per species, the twelve possible substitutions were identified in intergenic regions (IRs) and tRNA genes of five bacterial species, namely Ec, Kp, Sa, Se and Sp (Table 1). Substitution frequency in tRNA genes was observed to be ten times lower than that in the IRs. The difference between IRs and tRNA genes was significant (p-value < 0.01) in all five bacteria. This finding was anticipated because the tRNA genes are transcribed to make functional tRNA structures that carry out the vital translation process inside the cell. Therefore, the low substitution frequency in tRNA genes is most likely due to stronger purifying selection on these genes than on IRs. We then compared ti and tv between the two regions. The ti/tv values were greater than 1.0 across the five bacteria, which suggested that the ti frequency is higher than that of tv in both genomic regions. The ti/tv values in tRNA genes were greater than those in the IRs (p-value < 0.05) (Table 1). There are two possible explanations for the higher ti/tv value in tRNAs: either low tv or high ti. We calculated the relative fold increase in ti and in tv in the IRs, separately, in comparison to the tRNA genes (Supplementary Table 3). The fold increase of tv was greater than that of ti in the IRs (p-value < 0.05), which suggested that the higher ti/tv in tRNA is due to lower tv occurrence in tRNA genes than in IRs. This observation supported the notion that the impact of tv on tRNA secondary structure is greater than that of ti (Fig. 2).
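The relationship between the ti/tv comparison and the fold-increase comparison can be illustrated with a few lines of Python. The frequencies below are invented purely for illustration and are not values from Table 1 or Supplementary Table 3; the function names are likewise hypothetical.

```python
def ti_tv_ratio(ti: float, tv: float) -> float:
    """Ratio of transition to transversion frequency within one region."""
    return ti / tv

def fold_increase(ir_value: float, trna_value: float) -> float:
    """How many times higher a frequency is in IRs than in tRNA genes."""
    return ir_value / trna_value

# Hypothetical per-site frequencies, invented for illustration only.
trna = {"ti": 0.002, "tv": 0.0005}
ir = {"ti": 0.02, "tv": 0.01}

ratio_trna = ti_tv_ratio(trna["ti"], trna["tv"])
ratio_ir = ti_tv_ratio(ir["ti"], ir["tv"])
fold_ti = fold_increase(ir["ti"], trna["ti"])
fold_tv = fold_increase(ir["tv"], trna["tv"])

# For a single species, ratio_trna > ratio_ir holds exactly when
# fold_tv > fold_ti, because both inequalities rearrange to the same product.
tv_more_depleted_in_trna = fold_tv > fold_ti
```

In this made-up case tv rises about twenty-fold from tRNA genes to IRs while ti rises about ten-fold, so the higher ti/tv ratio in tRNA genes corresponds to a proportionately stronger depletion of tv.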
Higher Transition to Transversion Ratio at the Stem Regions than the Loop Regions Within tRNA Genes
Transfer RNA genes have well-defined secondary structures: the double stranded helical stem regions, which constitute ~ 2/3rd of a tRNA gene, and the single stranded loop regions, which constitute ~ 1/3rd. We analyzed substitutions separately in stem and loop regions. The substitution frequencies in both loop and stem regions were observed to be significantly lower than those in the IRs. This difference indicated that both regions are under strong purifying selection, as mentioned above. It is known that the stem and the loop regions are functionally important and make contacts with translation factors.
To compare the stem and the loop regions with regard to ti and tv, we separated the substitutions in the stem regions into compensatory and non-compensatory substitutions (Table 1). Non-compensatory substitutions were clearly more numerous than compensatory substitutions (Supplementary Fig. 2). It is pertinent to note that previous studies compared tRNA genes across species, where the observed substitutions were often compensatory, unlike those observed here. Compensatory substitutions are likely to be older in evolution than non-compensatory substitutions, considering the higher stability of the former. We therefore used the ti/tv values of non-compensatory substitutions in the stem region for comparison with the substitutions in the loop regions. The ti/tv values in the stem region were observed to be significantly higher than those in the loop regions across the bacteria (p-value < 0.01) (Fig. 4). This suggested that tv is proportionately more frequent in the loop region than in the stem region. This observation was in concordance with our hypothesis that tv in stem regions is more deleterious than ti.
Biased Transition Substitution Towards keto Bases in the Stem Region of tRNA Genes
In tRNA stem regions, G:U pairing is accepted more favorably than A:C pairing. Therefore, non-compensatory substitutions from amino bases (A/C) to keto bases (G/T), which facilitate G:U pairing, are likely to be favorable in the stem region. However, the reverse transitions, that is, non-compensatory substitutions from keto bases (G/T) to amino bases (A/C), which facilitate A:C pairing, are likely to be less preferred in the stem region. These two pairings are described in Supplementary Fig. 3, and their stability in terms of interaction energy, calculated using the GAUSSIAN 09 software (Frisch et al. 2009), is presented in Supplementary Table 4. This encouraged us to compare the stem and the loop regions further in terms of substitutions from amino bases to keto bases and vice versa. In stem regions, substitutions from amino (A/C) to keto bases (G/T) were significantly more frequent than those from keto to amino bases (p-value < 0.05) (Fig. 5). This excess of amino to keto substitutions was consistently observed in all five bacteria studied. However, in loop regions, no such significant pattern was observed with regard to substitutions from keto to amino or amino to keto bases (Fig. 6). This provided additional support that secondary structure significantly influences the different substitutions in tRNA genes. Further, comparison of keto to amino and amino to keto transitions in IRs indicated no significant difference between these two transition substitutions (p > 0.01) (Supplementary Fig. 4).
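The direction of a substitution with respect to the amino/keto grouping can be assigned with a minimal Python sketch; the function name and labels are illustrative. Listing the four transitions also makes explicit that every transition switches between the amino and keto groups, so each transition has a well-defined direction.

```python
AMINO = set("AC")
KETO = set("GT")
TRANSITIONS = [("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")]

def direction(ref: str, alt: str) -> str:
    """Classify a substitution by the amino/keto group of its two bases."""
    if ref in AMINO and alt in KETO:
        return "amino->keto"
    if ref in KETO and alt in AMINO:
        return "keto->amino"
    return "same group"  # A<->C or G<->T, which are transversions

# Direction of each of the four transitions.
transition_directions = {sub: direction(*sub) for sub in TRANSITIONS}

def tally(substitutions: list) -> dict:
    """Count amino->keto and keto->amino events among the given substitutions."""
    counts = {"amino->keto": 0, "keto->amino": 0}
    for ref, alt in substitutions:
        label = direction(ref, alt)
        if label in counts:
            counts[label] += 1
    return counts
```

Applying `tally` separately to the non-compensatory stem transitions and to the loop transitions of a gene set would give the two pairs of counts that are compared in this section.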
Isoacceptor tRNAs have different anticodons but are charged with the same amino acid by the same aminoacyl tRNA synthetase. Therefore, we compared substitution frequencies among the isoacceptor tRNA genes. In the majority of cases, the number of substitutions per tRNA gene was 0, 1 or 2 (Supplementary Table 5). Therefore, an elaborate comparison of substitutions among these isoacceptor tRNA genes was not possible. However, there were a few cases where the number of substitutions among isoacceptor tRNA genes differed noticeably, as follows. In Se, there are five Ser tRNA genes, of which one with CGA anticodon had 9 substitutions, one with GCT anticodon had 1 substitution, while two genes with GGA anticodon and one gene with TGA anticodon had 0 substitutions. In Sa, isoacceptor tRNA genes for Ser as well as Gly were observed to have different numbers of substitutions. There are five Ser tRNA genes, of which one with GCT anticodon had no substitutions while one with GGA anticodon had 9 substitutions. There are seven Gly tRNA genes, of which two with GCC anticodon had no substitutions but five with TCC anticodon had 0 to 9 substitutions. In Sp, there are three Lys tRNA genes, of which one has the CTT anticodon and the other two have the TTT anticodon. Interestingly, the tRNA gene with CTT anticodon was observed with 10 substitutions while the tRNA genes with TTT anticodon had no substitutions. Future studies may elucidate these differences observed among the isoacceptor tRNA genes.
Discussion
Secondary structure in transcripts is important for their function, and intra-strand base pairing is important for their stability. Transfer RNA genes are known to have well-defined secondary structures, unlike IRs. Though it is known in the literature that ti frequency is higher than tv frequency, the role of RNA secondary structure in this bias has not been explored adequately at the species level. Our aim here was to study ti and tv in tRNA genes and to compare these substitutions between loop and stem regions. We observed that, in comparison with IRs, tv frequency is proportionately lower than ti frequency in tRNA genes. This observation is in concordance with the assumption that secondary structure regions are likely to have low tv frequency. Further, we compared ti and tv between the loop and the stem regions. In stem regions, ti was proportionately more frequent than tv when compared with the loop region. This is in concordance with the assumption made in this study that tv is more deleterious than ti in the stem regions. It is known that G:U is a more stable pair than A:C in tRNA stems. Therefore, a ti substitution from an amino base (A/C) to a keto base (G/T), which results in stable G:U pairing, is likely to be preferred over a ti substitution from a keto base (G/T) to an amino base (A/C), which results in unstable A:C pairing. In concordance with this hypothesis, transition substitutions from amino bases (A/C) to keto bases (G/T) were significantly more frequent than the reverse transitions in the tRNA stem region, whereas no such pattern was observed in the loop region. This further supports the notion that secondary structure in tRNA influences base substitutions. It may be postulated that tv and ti mutations are fixed/selected according to evolutionary constraints associated with maintenance of tRNA secondary structure.
Previous researchers studied ti and tv in stem and loop regions of tRNA genes by comparing sequences across species. They observed more compensatory than non-compensatory substitutions in the tRNA stem region. They attributed the high ti/tv values in tRNA stem regions to the low frequency of compensatory tv in comparison to the frequency of compensatory ti. In the present work, we analyzed ti and tv in tRNA genes by comparing sequences within a species. Consequently, we observed more non-compensatory than compensatory substitutions. It is likely that the non-compensatory substitutions are more recent in evolution than the compensatory substitutions. Among these substitutions, those having a strong deleterious impact on tRNA structure are removed faster from the population than those having a moderate impact. It is assumed that tv is likely to have a more deleterious impact on the secondary structure than ti. Accordingly, we observed that tv frequency relative to ti frequency is significantly lower in the stem region than in the loop regions. Further, keto to amino transitions result in A:C pairing in the stem, which destabilizes the secondary structure more than amino to keto transitions, which result in G:U pairing. G:U pairing has been reported to be present in tRNA stem regions (Holley et al. 1965; Hou and Schimmel 1988; McClain and Foss 1988), although there is a recent report that G:U pairing in a tRNA gene has a deleterious impact on mouse survivability (Ishimura et al. 2014). Therefore, keto to amino transitions should be under stronger purifying selection than amino to keto transitions. In concordance with this, amino to keto transitions were observed to be more frequent than keto to amino transitions in stem regions. However, no such pattern was observed in the loop region.
Unlike the stem region, the loop region in tRNA undergoes several post-transcriptional modifications that are important for its stability as well as for various functions such as recognition by aminoacyl tRNA synthetase, binding to translation factors and the ribosome, and codon-anticodon recognition (decoding the codons) (Agris 2008; Agris et al. 2017; Pereira et al. 2018). Therefore, base substitutions arising at modification sites are likely to be selected against in the loop regions. This might be the reason for the low substitution rates in the loop region. In future studies, such sequence alignment analyses might help to identify residues in the loop region that undergo post-transcriptional modifications. Nevertheless, the influence of secondary structure on different base substitutions is evident from this analysis. Future research may explore the possible implications of compensatory as well as non-compensatory substitutions in tRNA genes for phylogenetic studies.
We believe our findings on mutations in stem vis-à-vis loop regions in tRNA genes might be extended to understand regions in mRNA that are potentially involved in secondary structure formation and gene expression regulation. It is known that secondary structure in mRNA is important in rho-independent (Abe and Aiba 1996) and rho-dependent (Kriner and Groisman 2017) transcription termination, pausing of the ribosome (Bao et al. 2020) during translation, and protein folding. It is also known that secondary structure near the ribosome binding site or Shine-Dalgarno sequence plays an important role in translation regulation (Hall et al. 1982). The role of RNA secondary structure in riboswitches is well documented in the literature (Grundy and Henkin 2006; Montange and Batey 2008). Further, there are regions in other RNAs, such as miRNA (Maiti et al. 2010) and introns in eukaryotes (Libri et al. 1995), that are involved in secondary structure formation. The difference between amino to keto and keto to amino transitions in the stem regions of tRNA might help to identify potential stem regions in a transcript. In the stem regions of secondary structures, amino to keto transitions are expected to be more frequent than keto to amino transitions. Exploring this in the future might help to understand the contribution of RNA secondary structure to gene regulation. Further, it has been reported that the GC content of the paired stem regions of the 16S rRNA genes correlates positively with the optimal growth temperature of bacteria and archaea (Wang et al. 2006). Therefore, it might be anticipated that the difference between amino to keto and keto to amino transitions will be greater in thermophiles than in mesophiles or psychrophiles. In future it will be interesting to study this in bacteria.
Data Availability
The data underlying this article will be shared on request to the corresponding author S. S. Satapathy.
Code Availability
Not applicable.
References
Abdullah T, Faiza M, Pant P et al (2016) An analysis of single nucleotide substitution in genetic codons-probabilities and outcomes. Bioinformation 12:98
Abe H, Aiba H (1996) Differential contributions of two elements of rho-independent terminator to transcription termination and mRNA stabilization. Biochimie 78:1035–1042
Agris PF (2008) Bringing order to translation: the contributions of transfer RNA anticodon-domain modifications. EMBO Rep 9:629–635
Agris PF, Narendran A, Sarachan K et al (2017) The importance of being modified: the role of RNA modifications in translational fidelity. Enzyme 41:1–50
Bao C, Loerch S, Ling C et al (2020) mRNA stem-loops can pause the ribosome by hindering A-site tRNA binding. Elife 9:e55799
Chewapreecha C, Marttinen P, Croucher NJ et al (2014) Comprehensive identification of single nucleotide polymorphisms associated with beta-lactam resistance within pneumococcal mosaic genes. PLoS Genet 10:e1004547
Duchêne S, Ho SYW, Holmes EC (2015) Declining transition/transversion ratios through time reveal limitations to the accuracy of nucleotide substitution models. BMC Evol Biol 15:1–10
Fersht AR, Knill-Jones JW (1981) DNA polymerase accuracy and spontaneous mutation rates: frequencies of purine.purine, purine.pyrimidine, and pyrimidine.pyrimidine mismatches during DNA replication. Proc Natl Acad Sci 78:4251–4255
Frisch M, Trucks GW, Schlegel HB et al (2009) Gaussian 09, Revision D.01. Gaussian Inc, Wallingford, p 201
Galtier N, Lobry JR (1997) Relationships between genomic G+C content, RNA secondary structures, and optimal growth temperature in prokaryotes. J Mol Evol 44:632–636
Gojobori T, Li W-H, Graur D (1982) Patterns of nucleotide substitution in pseudogenes and functional genes. J Mol Evol 18:360–369
Grundy FJ, Henkin TM (2006) From ribosome to riboswitch: control of gene expression in bacteria by RNA structural rearrangements. Crit Rev Biochem Mol Biol 41:329–338
Gruber AR, Lorenz R, Bernhart SH, Neuböck R, Hofacker IL (2008) The Vienna RNA websuite. Nucleic Acids Res 36:W70–W74
Hall MN, Gabay J, Débarbouillé M, Schwartz M (1982) A role for mRNA secondary structure in the control of translation initiation. Nature 295:616–618
Hickey DA, Singer GAC (2004) Genomic and proteomic adaptations to growth at high temperature. Genome Biol 5:1–7
Higgs PG (2000) RNA secondary structure: physical and computational aspects. Q Rev Biophys 33:199–253
Holley RW, Apgar J, Everett GA et al (1965) Structure of a ribonucleic acid. Science 147:1462–1465
Holt KE, Wertheim H, Zadoks RN et al (2015) Genomic analysis of diversity, population structure, virulence, and antimicrobial resistance in Klebsiella pneumoniae, an urgent threat to public health. Proc Natl Acad Sci 112:E3574–E3581
Hou Y-M, Schimmel P (1988) A simple structural feature is a major determinant of the identity of a transfer RNA. Nature 333:140–145
Hudelot C, Gowri-Shankar V, Jow H et al (2003) RNA-based phylogenetic methods: application to mammalian mitochondrial RNA sequences. Mol Phylogenet Evol 28:241–252
Ishimura R, Nagy G, Dotu I et al (2014) Ribosome stalling induced by mutation of a CNS-specific tRNA causes neurodegeneration. Science 345:455–459
Jow H, Hudelot C, Rattray M, Higgs PG (2002) Bayesian phylogenetics using an RNA substitution model applied to early mammalian evolution. Mol Biol Evol 19:1591–1601
Keller I, Bensasson D, Nichols RA (2007) Transition-transversion bias is not universal: a counter example from grasshopper pseudogenes. PLoS Genet 3:e22
Knies JL, Dang KK, Vision TJ et al (2008) Compensatory evolution in RNA secondary structures increases substitution rate variation among sites. Mol Biol Evol 25:1778–1787
Kriner MA, Groisman EA (2017) RNA secondary structures regulate three steps of Rho-dependent transcription termination within a bacterial mRNA leader. Nucleic Acids Res 45:631–642
Lewis CA, Crayle J, Zhou S et al (2016) Cytosine deamination and the precipitous decline of spontaneous mutation during Earth’s history. Proc Natl Acad Sci 113:8194–8199
Libri D, Stutz F, McCarthy T, Rosbash M (1995) RNA structural patterns and splicing: molecular basis for an RNA-based enhancer. RNA 1:425–436
Lorenz R, Bernhart SH, Höner zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL (2011) ViennaRNA package 2.0. Algorithms Mol Biol 6:26
Lowe TM, Chan PP (2016) tRNAscan-SE On-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res 44:W54–W57
Lyons DM, Lauring AS (2017) Evidence for the selective basis of transition-to-transversion substitution bias in two RNA viruses. Mol Biol Evol 34:3205–3215
Maiti M, Nauwelaerts K, Lescrinier E et al (2010) Self-complementary sequence context in mature miRNAs. Biochem Biophys Res Commun 392:572–576
Mann HB, Whitney DR (1947) On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18:50–60
Mathews DH, Disney MD, Childs JL, Schroeder SJ, Zuker M, Turner DH (2004) Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc Natl Acad Sci USA 101:7287–7292
McClain WH, Foss K (1988) Changing the identity of a tRNA by introducing a G-U wobble pair near the 3′ acceptor end. Science 240:793–796
Montange RK, Batey RT (2008) Riboswitches: emerging themes in RNA structure and function. Annu Rev Biophys 37:117–133
Park C, Chen X, Yang J-R, Zhang J (2013) Differential requirements for mRNA folding partially explain why highly expressed proteins evolve slowly. Proc Natl Acad Sci 110:E678–E686
Pereira M, Francisco S, Varanda AS et al (2018) Impact of tRNA modifications and tRNA-modifying enzymes on proteostasis and human disease. Int J Mol Sci 19:3738
Reuter S, Török ME, Holden MTG et al (2016) Building a genomic framework for prospective MRSA surveillance in the United Kingdom and the Republic of Ireland. Genome Res 26:263–270
Rossetti G, Dans PD, Gomez-Pinto I et al (2015) The structural impact of DNA mismatches. Nucleic Acids Res 43:4309–4321
Savill NJ, Hoyle DC, Higgs PG (2001) RNA sequence evolution with secondary structure constraints: comparison of substitution rate models using maximum-likelihood methods. Genetics 157:399–411
Seplyarskiy VB, Kharchenko P, Kondrashov AS, Bazykin GA (2012) Heterogeneity of the transition/transversion ratio in Drosophila and Hominidae genomes. Mol Biol Evol 29:1943–1955
Sinha NK, Haimes MD (1981) Molecular mechanisms of substitution mutagenesis. An experimental test of the Watson-Crick and Topal-Fresco models of base mispairings. J Biol Chem 256:10671–10683
Stoltzfus A, Norris RW (2016) On the causes of evolutionary transition: transversion bias. Mol Biol Evol 33:595–602
Thorpe HA, Bayliss SC, Hurst LD, Feil EJ (2017) Comparative analyses of selection operating on nontranslated intergenic regions of diverse bacterial species. Genetics 206:363–376
Topal MD, Fresco JR (1976) Complementary base pairing and the origin of substitution mutations. Nature 263:285–289
Tubbs A, Nussenzweig A (2017) Endogenous DNA damage as a source of genomic instability in cancer. Cell 168:644–656
Vogel F, Kopun M (1977) Higher frequencies of transitions among point mutations. J Mol Evol 9:159–180
Wang H, Hickey DA (2002) Evidence for strong selective constraint acting on the nucleotide composition of 16S ribosomal RNA genes. Nucleic Acids Res 30:2501–2507
Wang H-C, Xia X, Hickey D (2006) Thermal adaptation of the small subunit ribosomal RNA gene: a comparative study. J Mol Evol 63:120–126
Wang J, Youkharibache P, Zhang D et al (2020) iCn3D, a web-based 3D viewer for sharing 1D/2D/3D representations of biomolecular structures. Bioinformatics 36:131–135
Wu C-I, Maeda N (1987) Inequality in mutation rates of the two strands of DNA. Nature 327:169–170
Xu X, Zhao P, Chen S-J (2014) Vfold: a web server for RNA structure and folding thermodynamics prediction. PLoS ONE 9:e107504
Zhao C, Xu X, Chen S-J (2017) Predicting RNA structure with Vfold. In: Kaufmann M, Klinger C, Savelsbergh A (eds) Functional genomics. Springer, New York, pp 3–15
Acknowledgements
We are highly grateful to anonymous reviewers and Prof. Paul Higgs for their kind suggestions that improved the quality of the manuscript significantly. We thank Dr. Harry Thorpe for providing the sequence data. PS is thankful to UGC, GoI New Delhi for the JRF. SSS and SKR are thankful to DBT, GoI for the twinning Grant (BT/PR16361/NER/95/192/2015 date 18-10-2016) to them. SKR and RCD are thankful to DBT, GoI for the twinning Grant BT/PR16182/NER/95/92/2015. SSS is thankful to DBT for the NE Overseas Associateship, which helped him to work in University of Bath. EF, SSS, RCD and SKR are thankful to the Society for Molecular Biology and Evolution (SMBE) for holding the satellite meeting at Kaziranga, Assam, India on Dec 14th–6th, 2017, which helped the authors to collaborate on this work.
Funding
No funding was received for this research.
Contributions
PS performed the work, analyzed data; PS wrote the computer programs; RCD performed energy calculation of different base pairs; SSS and SKR designed the experiment; PS, RA, SSS, EF and SKR critically analyzed the data; PS, RA, RCD, EF, SKR and SSS wrote the manuscript. All authors have read and approved the manuscript.
Ethics declarations
Conflict of interest
No author has conflict of interest.
Additional information
Handling editor: Arturo Becerra.
Cite this article
Sen, P., Aziz, R., Deka, R.C. et al. Stem Region of tRNA Genes Favors Transition Substitution Towards Keto Bases in Bacteria. J Mol Evol 90, 114–123 (2022). https://doi.org/10.1007/s00239-021-10045-x
DOI: https://doi.org/10.1007/s00239-021-10045-x