Introduction

At least 75% of the human genome is transcribed into RNAs [1], which fulfill complex functions in diverse cell types. Although novel classes of non-coding RNAs (ncRNAs) [2] continue to be discovered as technology advances, the coding and noncoding RNAs with functional annotations remain a relatively small fraction [3]. Identifying the shared features, if any, of previous discoveries facilitates the discovery of new functional RNAs, which helps to fill this knowledge gap and to reveal the whole picture of genome functionality.

RNAs are diverse and have refined roles. Besides canonical protein-encoding RNAs, various long and short ncRNAs have been continuously discovered with advances in the experimental and computational approaches of contemporary biology. We propose to classify RNAs with (partially) characterized functionalities into three categories (Table 1). The first class encompasses protein-coding associated RNAs, including the RNAs dedicated to protein synthesis, i.e., messenger RNA (mRNA), transfer RNA (tRNA) and ribosomal RNA (rRNA), and the RNAs responsible for their maturation. mRNAs, rRNAs and tRNAs are responsible for protein translation, whereas small nuclear RNAs (snRNAs) primarily function in pre-mRNA processing in the nucleus, ribonuclease P (RNase P) and ribonuclease MRP (RNase MRP) are responsible for tRNA and rRNA maturation, respectively, and small nucleolar RNAs (snoRNAs) possess impressively diverse functions including rRNA and snRNA modification. The second class comprises regulatory RNAs.
For example, microRNAs (miRNAs), short hairpin RNAs (shRNAs) and small interfering RNAs (siRNAs) are short ncRNAs that interfere with gene expression by degrading mRNA molecules post-transcriptionally; piwi-interacting RNAs (piRNAs) silence retrotransposons and other genetic elements in germline cells; antisense RNAs (aRNAs) are endogenous RNAs with partial or full sequence complementarity to other transcripts, which use diverse transcriptional and post-transcriptional gene regulatory mechanisms to carry out a wide variety of biological functions; competing endogenous RNAs (ceRNAs) regulate mRNA transcripts by competing for shared miRNAs; long noncoding RNAs (lncRNAs) are over 200 nucleotides in length, are transcribed in a tissue-specific and developmentally regulated pattern, and can mediate epigenetic alterations by recruiting chromatin remodeling complexes to specific genomic loci; enhancer RNAs (eRNAs) are short ncRNA molecules transcribed from enhancer regions that actively participate in transcriptional regulation in cis and in trans [4]. The last group comprises parasitic RNAs including, e.g., retrotransposons capable of self-propagation, RNA viruses, and CRISPR RNAs (crRNAs), which form part of the prokaryotic immune system by helping Cas proteins recognize and cut exogenous DNA. Such a classification is not mutually exclusive. Some RNAs were initially assigned to one class and later found to play roles in another. For example, circular RNAs (circRNAs) [5] were first characterized as ncRNAs with gene regulatory potential [6], and some were later shown to have protein-coding roles [7]. The precise roles of many ncRNAs are still under investigation.
For instance, sno-derived RNAs (sdRNAs) are miRNA-like RNAs originating from H/ACA box or C/D box snoRNAs, with hypothetical roles in the interplay between RNA silencing and snoRNA-mediated RNA processing systems; miRNA-offset RNAs (moRNAs) are produced from human miRNA precursors but have considerably lower expression levels than the corresponding miRNAs; transcription initiation RNAs (tiRNAs) map within −60 to +120 nucleotides of the transcription start site (TSS) and have been suggested to be a general feature of transcription in possibly all eukaryotes, although their exact functions remain uncharacterized.

Table 1 Classification and functionalities of currently known RNAs

The diverse RNA functionalities largely depend on interactions with macromolecules, which are often determined by the sequence and structural features of both interacting partners. This review classifies currently identified RNA interactions into six categories according to the interacting partner, outlines the capture technologies in each category as well as the derived RNA functionalities, and identifies understudied niches as future directions that warrant further investigation.

RNA interactions with macromolecules

RNA and DNA are nucleic acids which, along with proteins, lipids and carbohydrates, constitute the four major classes of macromolecules essential for known life forms. RNAs are known to interact with each of these, as well as with metabolites, to achieve diverse functionalities (Fig. 1).

Fig. 1

Conceptual scheme representing the functional RNA realm driven by multi-player interactions. The primary classes, interacting macromolecules and functionalities of RNAs are represented in blue, green and black, respectively, which collectively convey the message that interactions between RNAs and macromolecules manifest RNA associated functionalities

RNA interactions with DNA

RNA–DNA interactions have prominent roles in diverse biological processes through genetic and epigenetic regulation. NcRNAs are key regulators of chromatin states and gene expression in important biological processes such as dosage compensation [8], imprinting [9], development [10], lineage differentiation, and disease progression including carcinogenesis [11], where DNA–RNA hybrid formation plays an important role. RNAs provide the templates that orchestrate genome rearrangement in some species such as Oxytricha, and act as templates to facilitate DNA translocation in the ‘trans-splicing mediation’ model of non-canonical gene fusions resulting from intergenic splicing [12]. Further, the activity of the telomerase holoenzyme is determined by the binding affinity between the telomerase RNA template region and the DNA primer [13]; the CRISPR system against foreign nucleic acid invasion in bacteria and archaea involves DNA/RNA hybrids; steady-state circRNAs have been mapped to thousands of genomic loci in mammals and contribute to gene regulation [6]; and transcriptional pausing can be modeled using the sequence-dependent free energy of nucleic acid interactions, including DNA–RNA base pairing. While some lncRNAs work in cis on neighboring genes, others function in trans to regulate distantly located genes. For example, although they all play fundamental roles in X chromosome inactivation and dosage compensation, the human lncRNA Xist functions in cis [8], whereas the Drosophila ncRNAs roX1 and roX2 bind numerous regions in trans on the X chromosome of male cells.

Most evidence of interactions between RNA and DNA has been observed at the chromatin level. We extract RNA–DNA interaction modes from this context and summarize them into four types. We define ‘type 1’ as ‘RNA interaction with single DNA strand forming R loop’, ‘type 2’ as ‘RNA interaction with double DNA strand forming triple helix’, ‘type 3’ as ‘RNA interaction with single DNA strand without R loop’, and ‘type 4’ as ‘RNA interaction with double DNA strand forming tertiary structure’. Type 1 is the mode used for mRNA elongation by the transcription machinery, where the R loop comprises the nascent RNA hybridized with the DNA template strand and the single-stranded non-template DNA [14]. Recent advances have confirmed the role of the type 1 mode in regulating ncRNA expression. For instance, R loops are formed over the promoter region of the lncRNA COOLAIR and stabilized by AtNDX (a ssDNA-binding homeodomain protein) to suppress its transcription [15]. The sgRNAs in the CRISPR system function via the type 1 mode for genome editing, representing an exogenous example (Fig. 2a). The type 2 mode is adopted by many lncRNAs to regulate gene expression in cis or in trans [16]. For instance, promoter-associated RNAs (pRNAs) regulate rRNA gene expression by forming a stable RNA–DNA triple helix with the promoter sequence; the triplex is specifically recognized by the DNA methyltransferase DNMT3b, which results in de novo CpG methylation of rRNA genes (Fig. 2b). It has also been reported that specific endogenous miRNAs use the type 2 mode to fight viral invasion in eukaryotic cells by binding double-stranded viral DNA, such as that of HIV-1, and forming stable triplexes with viral DNA motifs [17]. The RNA template of telomerase uses the type 3 mode to interact with a single DNA strand for chromosome elongation (Fig. 2c). The type 4 mode is a recently proposed potential model in which RNA serves as a structural component to maintain the 3D genome conformation (Fig. 2d).

Fig. 2

Schematic representation of RNA-DNA interaction modes and classical examples. a Type 1 mode: RNA interaction with single DNA strand forming R loop; b Type 2 mode: RNA interaction with double DNA strand forming triple helix; c Type 3 mode: RNA interaction with single DNA strand without R loop; d Type 4 mode: RNA interaction with double DNA strand forming tertiary structure

A series of experimental approaches have been developed to characterize RNA–DNA interactions. RNA-centric biochemical purification techniques such as RNA antisense purification (RAP) have enabled comprehensive mapping of RNA–DNA interactions in vivo [8] (Table 2). Kingston et al. established CHART (capture hybridization analysis of RNA targets), a hybridization-based technique that specifically enriches endogenous RNAs along with their targets from reversibly cross-linked chromatin extracts, to map the genomic binding sites of endogenous RNAs [18] (Table 2). Chang et al. developed ChIRP (chromatin isolation by RNA purification) to allow unbiased high-throughput discovery of RNA-bound DNA and proteins in vivo; cultured cells are cross-linked, and the RNAs of interest are captured by hybridization to biotinylated complementary oligonucleotides followed by magnetic bead isolation (Table 2). While most efforts have been devoted to approaches utilizing cell extracts, Zhong et al. developed MARGI (mapping RNA-genome interactions) to reveal native RNA-chromatin interactions on a massive scale from unperturbed cells, which is achieved through RNA–DNA proximity ligation followed by paired-end sequencing of the resulting chimeric sequences (Table 2).
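The proximity-ligation idea behind chimeric sequences can be illustrated with a toy computation. The following is a minimal sketch under stated assumptions, not the MARGI analysis pipeline: it assumes each chimeric sequence has the layout RNA-derived part, linker, DNA-derived part, and the linker sequence, function name and length threshold are hypothetical choices made only for illustration. Real data are paired-end reads that are subsequently aligned to the genome.

```python
from typing import Optional, Tuple

# Hypothetical linker sequence, for illustration only (not a real protocol linker).
LINKER = "CTGCTGACGT"


def split_chimera(read: str, linker: str = LINKER, min_len: int = 5) -> Optional[Tuple[str, str]]:
    """Split a chimeric sequence into (RNA-derived part, DNA-derived part).

    Assumes the layout RNA part + linker + DNA part. Returns None when the
    linker is absent or when either flanking part is shorter than `min_len`.
    """
    idx = read.find(linker)
    if idx == -1:
        return None  # no ligation junction found
    rna_part = read[:idx]
    dna_part = read[idx + len(linker):]
    if len(rna_part) < min_len or len(dna_part) < min_len:
        return None  # too short to be mapped unambiguously
    return rna_part, dna_part
```

Each returned pair would then be mapped separately, so that one end reports the interacting RNA and the other end the genomic locus it contacts.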

Table 2 RNA capture interaction technologies

RNA interactions with RNA

RNA–RNA interactions have led to the discovery of many important functions. The milestone in this regard is perhaps the discovery of alanine transfer RNA and the decoding of the triplet system by Robert Holley, for which he was awarded the Nobel Prize. The interactions between tRNA and mRNA translate genetic information into protein functionality and fill a gap in the flow of genetic information. Another fundamental landmark recognized with a Nobel Prize is the regulatory role of RNAs in gene expression derived from siRNA-mRNA interactions, which has led to technical advances in gene expression modulation, namely RNA interference. RNA–RNA interactions represent a general strategy used by many ncRNAs to achieve complicated and diverse biological functionalities. For example, genome-wide RNA interactome analysis revealed that TINCR (terminal differentiation-induced ncRNA) interacts with a range of differentiation mRNAs through a 25-nucleotide ‘TINCR box’ motif, and is required for their high mRNA abundance.

Many ncRNAs interact with other RNAs either directly through base-pairing, as in miRNA-mRNA, lncRNA-miRNA or snRNA-mRNA hybridization, or indirectly via protein intermediates; e.g., the ribosome is composed of multiple ncRNA components, and numerous lncRNAs associate with proteins to regulate RNA processing (Supplementary Fig. 1a) [19]. The formation of structured RNA such as duplexes is a critical feature of RNA-mediated biological processes. The lead RNA can take diverse roles when interacting with its RNA partner, such as enzyme (e.g., ribozyme in mRNA degradation), sponge (e.g., circRNA in miRNA regulation), scaffold (e.g., lncRNA during nascent RNA production), and guide (e.g., snoRNA in rRNA maturation) (Fig. 3). Besides locally positioned RNA structures, long-distance intragenomic RNA–RNA interactions also exist.
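Direct base-pairing of the miRNA-mRNA kind lends itself to a simple computational illustration. The sketch below is a minimal example and not a target-prediction method: it assumes that the miRNA ‘seed’ spans nucleotides 2 to 8 (a widely used convention that is not specified in this review), considers only perfect Watson-Crick pairing, and uses hypothetical function names. The let-7a sequence and the short mRNA fragment in the usage note are example inputs only.

```python
from typing import List

# Watson-Crick partners for RNA bases (G-U wobble pairs are ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def find_seed_sites(mirna: str, mrna: str, seed_start: int = 2, seed_end: int = 8) -> List[int]:
    """Return 0-based positions in `mrna` that pair perfectly with the miRNA seed.

    The seed is taken as miRNA nucleotides seed_start..seed_end (1-based,
    inclusive); the default of 2-8 is an assumption of this sketch.
    """
    seed = mirna.upper()[seed_start - 1:seed_end]
    site = reverse_complement(seed)  # sequence the mRNA must contain
    mrna = mrna.upper()
    hits = []
    pos = mrna.find(site)
    while pos != -1:  # collect every occurrence, including overlapping ones
        hits.append(pos)
        pos = mrna.find(site, pos + 1)
    return hits
```

For example, `find_seed_sites("UGAGGUAGUAGGUUGUAUAGUU", "AAACUACCUCAGGG")` reports a single candidate site starting at position 3 of the fragment. Experimental methods such as those described below are needed to establish whether such a candidate site is actually bound in vivo.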

Fig. 3

Typical examples conveying diverse RNA functionalities through RNA–RNA or RNA–protein interactions. a Ribozyme-RNA interaction acts as an RNA enzyme for mRNA cleavage; the ribosome involves complex interactions between RNA and protein that collectively function as an enzyme to translate the message from mRNA to protein. b CircRNA-miRNA interaction acts as a sponge of miRNA and lncRNA-TF interaction functions as a sponge of TF, interfering with the regulatory roles of miRNA and TF, respectively, in gene expression. c During lncRNA-facilitated nascent RNA production, lncRNA-mRNA and lncRNA-protein interactions both function as the scaffold. d SnoRNAs interact with rRNAs, where the snoRNA functions as a guide. The referred RNA–RNA interactions are represented by red circles, and RNA–protein interactions are indicated by red squares. (Color figure online)

Classical RNA–RNA interactions, such as those of U1 snRNA with pre-mRNA, were identified through observations of sequence complementarity followed by targeted genetic modulation and in vitro affinity examination. Many methods have been developed for systematic global-scale mapping of RNA duplexes in vitro and in vivo that couple high-throughput sequencing with RNA footprinting strategies utilizing structure-sensitive chemicals (such as structure-seq [20]) or nucleases (such as FragSeq [21], short for fragmentation sequencing). Detection of long-range and other tertiary RNA interactions has been achieved by structure-based approaches including nuclear magnetic resonance, X-ray crystallography and cryo-electron microscopy [22]. RNA proximity ligation following immunoprecipitation of the complexes of interest, given a known interacting protein partner, has been used successfully to map several types of RNA interactions: AGO-CLIP (crosslinking and immunoprecipitation) for identifying Argonaute-bound miRNA-mRNA interactions [23], hiCLIP (RNA hybrid and individual-nucleotide resolution UV cross-linking and immunoprecipitation) for recognizing Staufen-bound structured RNAs [24], and CLASH (cross-linking, ligation, and sequencing of hybrids) for characterizing snoRNP-bound snoRNA-rRNA interactions and miRNA interactions [25]. A RAP-based method, RAP-RNA (RNA antisense purification-RNA), has been proposed to comprehensively characterize in vivo intermolecular RNA–RNA interactions of a target RNA and to distinguish direct from indirect interactions through differential use of crosslinking reagents with different specificities for proteins and nucleic acids [26] (Table 2). More sophisticated approaches have been continuously developed for globally detecting RNA–RNA interactions in vivo without prior knowledge of the interacting RNAs or their interacting proteins. Wan et al. proposed the SPLASH (sequencing of psoralen crosslinked, ligated, and selected hybrids) approach to map pairwise RNA interactions in vivo on a genome-wide scale, where EZ-Link-Psoralen-PEG3-Biotin is used to biotinylate RNA via UV-light-activated intercalation of the psoralen group with thymine- and other pyrimidine-containing bases to form covalent bonds, and the interacting RNA pairs are identified by sequencing (Table 2). Blencowe et al. proposed LIGR-seq (ligation of interacting RNA followed by high-throughput sequencing) to enable global-scale mapping of RNA–RNA duplexes cross-linked in vivo, where the modified psoralen derivative 4′-aminomethyltrioxsalen (AMT) is used to generate AMT-cross-linked RNA duplexes followed by high-throughput sequencing of RNA chimeras [27] (Table 2). Chang et al. developed the PARIS (psoralen analysis of RNA interactions and structures) method based on reversible AMT crosslinking to globally map RNA duplexes in living cells with near base-pair resolution and to capture the complexity of RNA structures through direct identification of base-paired helices [28] (Table 2). Sheng et al. presented the MARIO (mapping RNA interactome in vivo) technique to reveal RNA–RNA interactions on a massive scale from unperturbed cells by cross-linking RNAs to their bound proteins, ligating the interacting RNAs to a biotinylated RNA linker, and subjecting the RNA1-Linker-RNA2 chimeras to paired-end sequencing [29] (Table 2).

RNA interactions with protein

RNA-binding proteins (RBPs) and ribonucleoprotein (RNP) complexes are typical functional forms of RNA–protein interactions, which extensively control the expression of genes in eukaryotes at the post-transcriptional level. RBPs such as PRC2 (polycomb repressive complex 2), Argonaute (the key player in all small-RNA-guided gene-silencing processes [30], which also interacts with approximately 20% of lncRNAs), PUM2, QKI, IGF2BP1-3, TNRC6A-C and Nova interact with RNAs to drive mRNA processing, nuclear export, translation, localization, turnover, etc. For example, the interaction between the Ezh2 protein (an RNA-binding subunit of PRC2) and RepA (a 1.6 kb ncRNA residing in the lncRNA Xist) recruits PRC2 to the X chromosome, leading to X inactivation; and enforced expression of the lncRNA HOTAIR re-allocates PRC2 to form an occupancy pattern more closely resembling that of embryonic fibroblasts, leading to altered histone H3 lysine 27 methylation, altered gene expression and, ultimately, increased cancer metastasis [31]. RNPs are RNA–protein complexes that play an integral part in numerous biological processes, e.g., the ribosome in protein synthesis, telomerase in chromosome protection, ribonuclease P in small ncRNA transcription, hnRNP (heterogeneous nuclear ribonucleoprotein) in gene transcription and post-transcriptional modification, and snRNP in pre-mRNA splicing.

RNAs can function as the enzyme, sponge, scaffold or guide during RNA–protein interactions. For instance, ribosomal RNAs play enzymatic roles in protein production, such as the catalytic peptidyl transferase activity that links amino acids together (Fig. 3a); the lncRNA PANDAR acts as a sponge of the p53 protein to block its binding to the promoter region of the CDKN1A gene in gastric cancer (Fig. 3b) [32]; the lncRNA Malat1 regulates nascent pre-mRNA processing through recruitment or modification of serine/arginine (SR) proteins localized to these sites [26] (Fig. 3c); and several types of small RNAs function as guides that direct protein enzymes to their targets for cutting, digestion or modification, such as sgRNA in guiding Cas9/dCas9 to specific genomic loci and siRNA in guiding the Argonaute-containing RNA-induced silencing complex (RISC) to specific mRNAs (Fig. 3d).

Several approaches have been developed to identify RNA–protein interactions. Garcia-Blanco et al. developed the RIP (ribonucleoprotein immunoprecipitation) assay to study RNA–protein interactions in vivo (Table 2). This technique has been combined with high-throughput technologies and has evolved into RIP-Chip (RIP microarray) and RIP-seq (RIP sequencing) [33] to facilitate genome-wide studies of RNA–protein interactions (Table 2). Darnell et al. developed the CLIP (UV-crosslinking and immunoprecipitation) approach to reveal the interactions between RNAs and their binding proteins through in vivo UV crosslinking and immunoprecipitation (Table 2), and later proposed the HITS-CLIP technique, which combines CLIP with high-throughput sequencing to produce transcriptome-wide RNA binding maps with higher accuracy and resolution [34] (Table 2). Tuschl et al. developed PAR-CLIP (photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation), a step-by-step protocol for the transcriptome-wide isolation of RNA segments bound by the protein of interest from the background of un-crosslinked RNAs; it introduces photoreactive nucleosides that generate characteristic sequence alterations upon crosslinking, and converts the isolated crosslinked RNA fragments into a cDNA library for deep sequencing [35] (Table 2). Ule et al. developed individual-nucleotide resolution CLIP (iCLIP) to identify protein-RNA crosslink sites with nucleotide resolution; its key feature is the use of two cleavable adapter regions and a barcode, which enables PCR amplification of truncated cDNAs (the truncation sites mark the cross-linked nucleotides) after circularization and linearization, followed by high-throughput sequencing (Table 2).
TRAP/RAT (tandem RNA-affinity purification/RNA affinity in tandem) does not depend on formaldehyde or any other cross-linking; it is an RNA tag-based method for affinity purification of endogenously assembled RNP complexes followed by protein identification by mass spectrometry (Table 2). RiboTrap is another approach of this kind; it purifies targeted RNPs from cell lysates through immunoaffinity precipitation to describe the in vivo endogenous assembly of ribonucleoproteins (Table 2). MS2-BioTRAP (MS2 in vivo biotin tagged RNA affinity purification) co-expresses HB-tagged bacteriophage protein MS2 and stem-loop tagged target RNAs in cells, followed by HB-tag based affinity purification of authentic RNA–protein complexes; the proteins associated with the target RNAs are subsequently identified and quantified using SILAC-based quantitative mass spectrometry [36] (Table 2).

Various approaches have been proposed or have evolved from existing techniques to address specific scientific questions. CRAC (cross-linking and analysis of cDNAs) overcomes the restrictions imposed by the need for highly specific antibodies in CLIP when analyzing protein-RNA interactions in large RNPs containing many different proteins (Table 2). By adapting the RNA antisense purification (RAP) method to purify a specific lncRNA complex and identifying the interacting proteins using quantitative mass spectrometry, RAP-MS (RNA antisense purification by mass spectrometry) enables characterization of the proteins interacting with a given lncRNA [37] (Table 2). RaPID (RNA purification and identification) allows the isolation of specific mRNAs of interest and subsequent analysis of the associated proteins using mass spectrometry [38] (Table 2). Several methods have been developed to determine RNA–protein binding specificities. While RNA-Compete (competition between individual RNA sequences binding to proteins), SEQRS (in vitro selection, high-throughput sequencing of RNA and sequence specificity landscapes) [39], RBNS (RNA Bind-n-Seq) [40] and RNA-MITOMI (RNA-mechanically induced trapping of molecular interactions) [41] capture RNA–protein binding specificities in vitro, RNA-MaP (RNA on a massively parallel array) and HiTS-RAP (high-throughput sequencing-RNA affinity profiling) [42] achieve this in situ. RNA-Compete uses RNA–protein binding followed by microarray analysis to enable high-throughput identification of RNA binding motifs (Table 2). SEQRS is distinguished by its capacity to identify, in a single experiment, RNAs that interact with the protein of interest with low binding affinities and multiple binding modes [39] (Table 2). RBNS is novel in measuring binding affinities both quantitatively and in a high-throughput manner (Table 2).
RNA-MITOMI provides a microfluidic platform that allows the identification of new binding specificities by screening the interactions of an RNA mutant library with stem-loop-binding protein [41] (Table 2). RNA-MaP repurposes a high-throughput sequencing instrument to quantitatively measure the binding and dissociation of a fluorescently labeled protein to a library of RNA targets generated on a flow cell surface by in situ transcription [43] (Table 2). HiTS-RAP allows simultaneous transcription of hundreds of millions of DNA clusters, each derived from amplifying a single molecule with primers covalently linked to the glass flow cell; the binding of fluorescently labeled proteins to the transcribed RNA clusters is analyzed afterwards [42] (Table 2).
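Sequencing-based specificity assays of the kind listed above are often summarized by asking which short sequence motifs are over-represented among protein-bound RNAs relative to the input library. The sketch below illustrates one simple way to do this, a k-mer enrichment ratio; it is an assumption-laden toy and not the published analysis of any of the methods above. The function names and the pseudocount used to avoid division by zero are hypothetical choices.

```python
from collections import Counter
from typing import Dict, Iterable


def kmer_frequencies(reads: Iterable[str], k: int) -> Dict[str, float]:
    """Return the relative frequency of every k-mer observed in `reads`."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def kmer_enrichment(bound: Iterable[str], input_reads: Iterable[str],
                    k: int, pseudo: float = 1e-6) -> Dict[str, float]:
    """Ratio of k-mer frequency in bound reads to its frequency in the input.

    `pseudo` is an illustrative pseudocount added to the input frequency so
    that k-mers absent from the input do not cause a division by zero.
    """
    bound_freq = kmer_frequencies(bound, k)
    input_freq = kmer_frequencies(input_reads, k)
    return {kmer: freq / (input_freq.get(kmer, 0.0) + pseudo)
            for kmer, freq in bound_freq.items()}
```

K-mers with the highest ratios are candidate binding motifs; in practice, ratios would be compared across protein concentrations and replicates before any conclusion is drawn.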

RNA interactions with lipid

The primordial functions of RNAs and lipids have been hypothesized in the ‘RNA world’. The observation that the association of RNA molecules with cellular membranes is involved in forming signal recognition particles in Escherichia coli and in regulating cell membrane permeability in Saccharomyces cerevisiae, together with the fact that all positive-sense RNA (+RNA) viruses remodel intracellular membranes into unique structures for viral genome replication [44], supports the notion that RNA-lipid interactions exist and may be physiologically important. A recent report of the direct interaction between the lncRNA LINK-A (long intergenic noncoding RNA for kinase activation) and the phospholipid PI(3,4,5)P3, resolved at the single-nucleotide level, has begun to unravel this mystery and deciphered a signal transduction role of RNA-lipid interactions [45] (Supplementary Fig. 1a).

In fact, lipids play prominent roles in the life cycle of RNA viruses. Positive-strand RNA viruses such as HCV (hepatitis C virus), DENV (dengue virus), ChikV (chikungunya virus), CoV (coronavirus) and PV (picornavirus) replicate their genomic RNA on virus-modified intracellular membranes termed replication organelles (ROs) [44]. The fusion of RNA viruses with host cell membranes, during either entry or egress, may imply interactions of RNAs with lipids.

The sole lipid-lncRNA interaction reported so far was characterized using a lncRNA array on RNAs extracted from the lipid fraction of samples [45]. In that study, the MS2-TRAP (MS2-tagged RNA affinity purification) system (Table 2) was exploited to examine the LINK-A-PIP3 interaction in vivo by expressing MS2-tagged full-length LINK-A or the △PIP3 deletion mutant in cancer cells and analyzing the protein-RNA-lipid complex pulled down by GST antibodies [45]. Given the increasingly recognized importance of RNA-lipid interactions and the rapid development of this field, approaches that identify such interactions at the genome scale are urgently needed and can be anticipated.

RNA interactions with metabolite

Riboswitches are genetic control elements found in the 5′-untranslated region of certain mRNAs that exert gene regulatory functions through RNA-metabolite interactions. The RNA regulates gene expression through allosteric structural reorganization upon site-specific metabolite binding, without the aid of proteins (Supplementary Fig. 1b). Several distinct types of riboswitches have been discovered, such as the coenzyme B12-binding RNA that controls the btuB gene in Escherichia coli, the TPP (thiamine pyrophosphate) riboswitch that modulates operons in Escherichia coli and Bacillus subtilis, and the FMN-dependent riboswitch that controls FMN biosynthesis or import in Bacillus subtilis. Riboswitches were first identified as metabolite-mediated mRNA controls of gene expression in prokaryotes, and were later found to potentially exist in eukaryotes, given the presence of TPP-binding RNA elements in genes of plants such as Arabidopsis thaliana, Oryza sativa (rice) and Poa secunda (bluegrass), and of fungi such as Neurospora crassa and Fusarium oxysporum.

Metabolite binding triggers allosteric alterations in base-pairing arrangements near gene control elements in prokaryotes, and in eukaryotes rearranges the secondary structure of mRNAs in ways that may modulate a greater variety of RNA processing, transport and expression pathways. For example, TPP riboswitches function in Escherichia coli by sequestering the ribosome-binding site and by terminating transcription, and potentially regulate mRNA processing and stability in plants and guide RNA splicing in the fungus Neurospora crassa.

Current approaches to identifying riboswitches, and thereby inferring RNA-metabolite interactions, largely rely on sequencing techniques and genome comparison. NAIM-NAIS (nucleotide analog interference mapping and suppression) is an approach that probes RNA-metabolite interactions directly. By incorporating phosphorothioate-containing nucleotide analogs into the functional RNA of interest and optimizing a selective activity assay that physically separates functional RNA molecules from inactive ones, NAIM-NAIS has been used to investigate the interactions between the glmS riboswitch and its metabolite GlcN6P (Table 2). Computational methods have also been applied to study riboswitches, such as the modeling of co-transcriptional RNA-ligand interaction dynamics in the I-A 2′-deoxyguanosine (2′dG)-sensing riboswitch from Mesoplasma florum.

RNA interactions with carbohydrate

According to the ‘RNA world’ hypothesis, interactions between RNAs and carbohydrates are indispensable, yet relatively little has been reported in this regard. Transglycosylation is important for RNA modification and editing. Although this process largely relies on the activity of tRNA-guanine transglycosylase and thus represents an RNA–protein interaction, it points to the importance of carbohydrates in manifesting RNA functionalities. Heparan sulfate (HS) and other glycosaminoglycans are polysaccharides abundantly expressed on the surface of many cell types that mediate the entry of many RNA viruses into host cells [46], which is suggestive of RNA-carbohydrate interactions. However, direct evidence demonstrating RNA-carbohydrate interactions is still lacking, and this remains an understudied niche.

Functionalities derived from RNA interactions

Genetic information flow

The prominent role of RNAs in genetic information flow has long been recognized, as stated by the Central Dogma of molecular biology that the flow of genetic information moves from DNA to protein via RNA. A living entity is a complex adaptive system that differs from any chemical structure in being capable of self-information processing. Under this conception, RNA viruses have been computationally and experimentally selected to model the origin of genetic information, supporting the notion that RNAs are fundamental in genetic information flow.

As DNA and protein later evolved to play more specialized roles, RNAs differentiated into diverse types that interact with the macromolecules of the Central Dogma and are decisive in translating genotypes into phenotypes in a temporally and spatially controlled manner (Fig. 1).

The biological significance of mRNAs, tRNAs and rRNAs, the three primary types in the RNA realm, in genetic information flow is well acknowledged. Some RNAs form the critical catalytic part of the enzymes that trigger the maturation of these basic classes of RNAs. For instance, RNase P enzymes consist of both RNA and protein subunits, where the RNA moiety is indispensable for catalyzing the removal of the 5′ leader sequences from tRNA precursors; RNase MRP (mitochondrial RNA processing) enzymes cleave the A4 site in the ITS (internal transcribed spacer) region of pre-rRNAs to generate mature 5.8S rRNA. Some RNAs serve as guides that specify the sites of modification at which the enzyme executes its catalytic function. For instance, the site-specific synthesis of 2′-O-methylated nucleotides and pseudouridines in rRNAs and snRNAs is guided by snoRNAs through base-pairing, followed by snoRNP protein-catalyzed 2′-O-methyl transfer and uridine-to-pseudouridine isomerization reactions. Some RNAs direct their associated proteins to target sites to trigger a conformational change in the substrate for subsequent catalysis. For example, snRNAs constitute a large family with at least 10 members; they form ribonucleoprotein complexes (snRNPs) with their associated proteins and bind specific sequences on the pre-mRNA, resulting in the exposure of nucleotides favorable for splicing.

Signal transduction

Regulatory RNAs, e.g., lncRNAs and miRNAs, also participate in the signal transduction of many biological processes.

LncRNAs transduce signals via specific interactions with proteins such as transcription factors and RNP complexes. For instance, the lncRNA BCAR4 directly binds two transcription factors (SNIP1 and PNUTS) in response to the chemokine CCL21, resulting in the release of SNIP1-mediated inhibition of p300-dependent histone acetylation, which in turn enables the binding of BCAR4-recruited PNUTS to H3K18ac and relieves the inhibition of RNA Pol II by activating the PP1 phosphatase; this activates the Hedgehog/GLI2 transcriptional program that ultimately promotes cell migration [47]. The lncRNA lnc-DC binds directly to STAT3 and promotes STAT3 phosphorylation on tyrosine-705, resulting in activated STAT3 signaling that regulates dendritic cell differentiation [48]. The lncRNA MRHL (meiotic recombination hot spot locus) interacts with RNPs such as p68; down-regulation of MRHL results in cytoplasmic translocation of tyrosine-phosphorylated p68, β-catenin nuclear localization, β-catenin-TCF4 interaction, β-catenin occupancy at the promoter regions of Wnt target genes and, ultimately, activated Wnt signaling in mouse [49]. The lncRNA NKILA (NF-kB interacting lncRNA) directly interacts with the functional domains of the signaling proteins NF-kB/IkB to form a stable complex that prevents over-activation of the NF-kB pathway in inflammation-stimulated breast epithelial cells and ultimately suppresses tumor metastasis [50]. The cytoplasmic lncRNA LINK-A facilitates the recruitment of BRK to the EGFR:GPNMB complex and BRK kinase activation, leading to heterodimer-dependent HIF1α phosphorylation at Tyr565 and Ser797 by BRK and LRRK2, respectively, and in turn to HIF1α stabilization, HIF1α-p300 interaction, and activation of HIF1α transcriptional programs under normoxic conditions.

Physiological lncRNA-phospholipid interactions are implicated in mediating signal transduction important for homeostasis and disease. The lncRNA LINK-A has been recently shown to interact with PIP3, facilitating PIP3-AKT interactions that ultimately lead to hyperactivated AKT and poor cancer patient outcome [45].

MiRNAs are also integrated into signaling pathways, as exemplified by members of the miRNA-34 family: activated p53 directly trans-activates the miRNA-34 transcription units, whose products regulate BCL2 and induce apoptosis.

Regulation

The discovery that the genomes of complex organisms, including humans, are largely transcribed into ncRNAs has challenged our traditional conceptions of the functionalities of RNAs and their associated inter-molecular interactions.

LncRNAs have attracted much attention due to their large number and biological significance; they interact with proteins and act at specific genomic loci to achieve their regulatory activities. Many lncRNAs have been found to interact with regulatory genomic elements such as promoters, enhancers, ultraconserved regions and introns, and with epigenomic regulators such as miRNAs [19]. Because lncRNAs are over 200 nucleotides in length, they can fold into complex three-dimensional structures that allow them to achieve their regulatory functions by interacting with biomolecular partners such as transcription factors, histones and chromatin-modifying proteins. Consequently, altered lncRNA expression can affect the expression of a broad spectrum of genes via these protein partners and lead to profound phenotypic changes as well as severe pathological consequences.

Some small ncRNAs suppress gene expression by directing ribonucleases to target RNA or DNA sequences. While miRNAs are endogenous interfering RNAs that are processed by Dicer and guide the RNA-induced silencing complex to silence target mRNAs, shRNAs and siRNAs are exogenous RNAs that have become standard tools for gene modulation at the post-transcriptional level. Similarly, CRISPR RNAs (crRNAs) and trans-activating CRISPR RNAs (tracrRNAs) act together to guide the Cas9 endonuclease to the target site, defending against invasion by foreign genetic elements in the prokaryotic immune system. This system has been adapted into an effective genome editing tool in which the crRNA and tracrRNA are fused into a synthetic guide RNA (gRNA) that is delivered together with the Cas9 nuclease to modify a desired location in a cell's genome.

CircRNAs have diverse lengths and are more stable than linear RNAs. CircRNAs can modulate miRNA activities [51], regulate alternative splicing [52, 53], and sponge other factors such as RBPs or RNP complexes [51]. As regulators of miRNA activity, circRNAs can sponge miRNA complexes and potentially release them on cleavage. For instance, circRNA 0000515 sponges miRNA326 to promote cervical cancer progression via up-regulating ELK1 [54], circRNA TTN acts as a sponge of miRNA432 to facilitate the proliferation and differentiation of myoblasts through the IGF2/PI3K/AKT pathway [55], circRNA NRIP1 functions as a miRNA149-5p sponge to promote gastric cancer progression via AKT1/mTOR signaling [56], and circRNA TRIM33-12 acts as a sponge of miRNA191 to suppress hepatocellular cancer progression [57]. The regulatory role of circRNAs in alternative splicing is associated with circRNA biogenesis. Because pre-mRNA splicing is dictated by competition between alternative pairings of 5′ and 3′ splice sites, and because backsplicing during circRNA generation affects the splicing pattern of the remaining pre-mRNA, the process of circRNA production modulates alternative splicing regardless of the functions of the circRNAs produced [52]. CircRNAs can also sponge RBPs or RNP complexes to prevent them from acting, provide a protective reservoir of these factors, and deliver them to particular subcellular locations [58]. Lastly, a protein-coding function of circRNAs has recently been identified but still lacks sufficient evidence, and novel approaches such as the ‘intron-mediated enhancement system’ have been proposed to investigate circRNA protein-coding functions by increasing circRNA formation [59].

Perspectives

With advances in high-throughput technologies, novel classes of RNAs, their interactions with various macromolecules and the diverse RNA functions derived from these interactions have been continuously deciphered. This has led to a paradigm shift from our conception of the RNA world to the regulatory roles of RNAs in cellular homeostasis and pathogenesis.

Unlike those of DNAs and proteins, the functionalities of RNAs are largely derived from their interactions with other molecules, including DNAs, RNAs, proteins, lipids, carbohydrates and metabolites. This is especially true in eukaryotes and accords with the hypothesis that RNAs devolved some of their functions to DNAs and proteins and retained the mediating role. RNAs were selected as the primary regulators of many cellular events because of their rapid metabolism, which is required for dynamic regulation of cellular homeostasis. Though RNAs preserve the information storage capacity in prokaryotes, as evidenced by RNA viruses, there they are simply genetic information storage units and need to hijack host materials for propagation. This suggests a prominent role of macromolecular interactions in driving RNA functionalities in any organism. Importantly, many biological activities involve multi-player interactions; for example, RNA and protein act together in telomerase to maintain chromosome stability. Thus, we cannot dissect any function from its context and leave molecular interactions aside. RNAs interact with other molecules through base pairing, which underlies their multiple functionalities, such as information transduction and executioner guidance in the case of short regulatory RNAs, and tertiary structure formation in the case of long regulatory RNAs and mRNAs.

RNAs fulfill various roles that support the life cycle of parasites. RNA viruses carry RNA as their genetic material, which can be translated on infection through interactions with the host cellular machinery. For instance, the RNAs of positive-strand RNA viruses perform many functions during viral replication, such as viral RNA recruitment for replication, membrane-bound replicase complex assembly, replicase activity regulation, and minus- and plus-strand RNA synthesis, through interactions with various host RBPs; these RBPs include eEF1A, hnRNP proteins and the Lsm 1-7 complex, which are conserved across diverse positive-strand RNA viruses [60]. Viroids are small, non-encapsidated, circular ssRNAs that are not protein-coding but replicate in plants as free RNA molecules; they depend on RNA structure-based interactions with host cellular factors for their functions and rely on the host DNA-dependent RNA polymerase II for transcription. For instance, potato spindle tuber viroid modulates its replication by directly interacting with RPL5, a negative splicing regulator [61]. CrRNAs constitute the prokaryotic immune system by helping Cas proteins recognize and cut exogenous DNA. On phage infection, proto-spacers are cleaved by bacterial Cas protein complexes and integrated into the 5′ end of the CRISPR loci in the host genome; these are then transcribed into crRNAs, which function as templates for recognizing phage proto-spacers on re-infection [62,63,64].

Since proteins have long been conceived as the principal elements manifesting phenotypes, our focus has been drawn to protein-coding RNAs and to the interactions of RNAs with the primary elements of the Central Dogma. Recent discoveries of physiological RNA-phospholipid interactions have challenged our conceptions of RNA functionalities and raised interest in their involvement in signal transduction and essential cellular processes. Far less effort has been devoted to interactions between RNAs and carbohydrates, except for the glycosylation potential of tRNAs [65], or to interactions between RNAs and metabolites, except for riboswitches. The story does not stop here, and we cannot help asking whether other molecules, such as metal ions, hormones and ATP, interact with RNAs. What are the exact roles played by RNAs if such interplays exist? Do these interactions represent a distinct class of RNA function besides information flow, genetic and epigenetic regulation, and signal transduction, as discussed above?

Conclusion

RNAs have devolved most of their functionalities to other elements such as DNA and protein, and have retained the regulatory roles that derive from their interplay with diverse macromolecules. Thus, we can presume that while DNAs store heritable information and proteins take action, RNAs constitute the primary controller of cell fate, as the connectivity of a network determines its topology and consequently the cell state.

Despite the increasingly acknowledged importance of RNAs in life and disease control, and our accumulated understanding of RNA functionalities, much remains to be explored. This includes less well-studied RNA-macromolecule interactions, the working mechanisms of novel RNAs, and experimental techniques that facilitate these investigations.