1 Introduction

The “central dogma of biology” proposed by Francis Crick considered RNA molecules to be mere intermediates between DNA and proteins [1]. Surprisingly, it was later shown that less than 2% of the human genome is translated into proteins. Lacking protein-coding potential, the remaining genetic information was considered “junk” accumulated across evolution [2]. Curiously, recent studies highlight a stronger correlation between the size of these non-coding regions and biological/evolutionary complexity than that established for protein-coding genes. Intriguingly, genes containing large introns have significantly higher transcriptional activity in the nervous system and lower transcriptional activity in cancer. This suggests a regulatory potential of these non-coding regions coupled to tissue-specific gene expression patterns and corresponding mitosis rates [3]. Accordingly, the larger part of the human genome has been linked to active transcriptional loci under specific physiological contexts [4, 5]. Therefore, the genetic information enclosed in a particular sequence of DNA can be converted either into an RNA transcript that encodes a protein or into a transcript holding a different molecular function, which might be able to modulate the transcription (or translation) of other coding or non-coding genes [2, 6]. This complexity further increases when taking into account that a stretch of DNA can encode a protein while simultaneously holding a coding-independent function that affects other transcripts [7]. After transcription, the translation of mRNAs into proteins is assisted by ribosomal RNAs (rRNAs) and transfer RNAs (tRNAs). Both housekeeping transcripts belong to the group of non-coding RNAs (ncRNAs) and have well-established roles in protein synthesis. Additionally, the non-coding transcriptome comprises other functional RNA molecules of a different nature. 
Despite their myriad potential functions, they are roughly classified into two major classes according to their length: small non-coding RNAs (sncRNAs), shorter than 200 nucleotides, and long non-coding RNAs (lncRNAs), longer than 200 nucleotides. However, this classification is unrelated to their biogenesis, function, and cellular location [8].
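The length-based rule above can be written as a one-line test. The following is a minimal sketch, not a curated annotation pipeline: the function name is hypothetical, and the assignment of a transcript of exactly 200 nucleotides to the long class is an assumption, since the text only distinguishes shorter from longer.

```python
SIZE_CUTOFF_NT = 200  # length threshold quoted in the text


def classify_ncrna_by_length(length_nt: int) -> str:
    """Return 'sncRNA' or 'lncRNA' based solely on transcript length.

    Assumption: a transcript of exactly 200 nt is treated as 'lncRNA';
    the text does not specify the boundary case.
    """
    if length_nt <= 0:
        raise ValueError("transcript length must be a positive integer")
    return "sncRNA" if length_nt < SIZE_CUTOFF_NT else "lncRNA"


if __name__ == "__main__":
    # A ~22-nt miRNA falls in the small class; a multi-kilobase
    # transcript (hypothetical length) falls in the long class.
    print(classify_ncrna_by_length(22))
    print(classify_ncrna_by_length(2500))
```

As the text notes, the label produced this way says nothing about biogenesis, function, or cellular location.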

The conservation of the so-called junk genetic material across time suggests that it has been positively selected. Importantly, the introduction of parasitic sequences in the eukaryotic genome coincided evolutionarily with the appearance of epigenetic suppressive mechanisms, such as DNA methylation. These mechanisms prevent the mobility of transposable elements and restrict their expression, thus avoiding chromosome instability, chromosomal translocations, and gene disruptions [9,10,11]. Since epigenetic marks are dynamic, some of these epigenetically silenced regions have gradually evolved to create tissue-specific regulatory networks, exhibiting enhancer-like activities [12]. Moreover, DNA methylation is thought to have evolved simultaneously with X-chromosome inactivation and genomic imprinting. In the latter case, gene silencing occurs at very specific genomic regions [13, 14]. In this mechanism, allele-specific maternal and paternal transcription is commonly achieved by allele-specific DNA methylation. One such example shows that allele-specific methylation is coupled to allele-specific transcription of the ncRNA KCNQ1OT1, which is then involved in the transcriptional silencing of other specific genes [15, 16], indicating an effective cooperation between ncRNAs and epigenetics.

ncRNAs establish a complex layer of transcriptional and posttranscriptional regulation deeply shaped by an epigenetic landscape that is itself regulated by these molecules. This led to the revision of the concept of epigenetics, which was historically based on two layers of gene regulation: DNA methylation and histone posttranslational modifications. The field now recognizes the functional association of ncRNAs with proteins, DNA, and other RNAs, sculpting the epigenetic landscape [17]. Accordingly, ncRNAs are able to recruit and interact with histone-modifying complexes and modulate the activity of DNA methyltransferases, regulating the transcriptional activity across the genome. In summary, ncRNAs affect the gene expression of both coding and non-coding transcripts, either directly or indirectly through epigenetic changes, and are themselves epigenetically regulated [17, 18].

Utilizing various epigenetic mechanisms, ncRNAs mediate development and cellular differentiation and guarantee cell-specific transcriptional requirements. Recent experimental approaches have demonstrated that the complex network of gene regulation, comprising these non-coding molecules and epigenetics, is often disrupted by genetic and epigenetic events in cancer [19,20,21]. Simultaneously, ncRNAs have been annotated as cancer-related biomarkers for diagnosis, prognosis, and personalized medicine [22,23,24,25,26].

2 Non-coding RNAs as players in gene expression regulation

MicroRNAs (miRNAs) are single-stranded ncRNA species of ~22 nucleotides in length derived from hairpin structures. They mediate posttranscriptional gene silencing by partial complementarity with the 3′UTR of a target mRNA. As more than 60% of the protein-coding genes are regulated by miRNAs, they represent one of the most extensively studied subclasses of ncRNAs [27, 28]. Gene silencing by miRNAs is achieved by triggering endonucleolytic cleavage [29] or repressing translation [30]. Endogenous small interfering RNAs (siRNAs) have a similar length and biogenesis pathway to miRNAs, but while the latter are excised from stem-loop structures, siRNAs derive from long, fully complementary double-stranded RNA precursors primarily of exogenous origin, such as retrotransposons and viral sequences. Both siRNAs and miRNAs are associated with translational repression and mRNA cleavage, depending on the degree of complementarity to their mRNA target site. Perfect complementarity is commonly associated with mRNA degradation, typical of siRNAs, whereas imperfectly paired sites containing bulges generally lead to translational silencing, typical of miRNAs [30, 31]. By contrast, PIWI-interacting RNAs (piRNAs) direct transposon cleavage, protecting the genome against transposon-induced insertional mutagenesis and thereby guaranteeing genome integrity [32]. piRNAs are also implicated in epigenetic programming mechanisms [33, 34] and are involved in a miRNA-like posttranscriptional silencing of mRNAs through transposon sequences that overlap with the 3′ or 5′UTRs of mRNAs [35, 36]. Curiously, the 3′UTRs of more than one quarter of RefSeq transcripts overlap with retrotransposon sequences, and those transcripts have lower expression than retrotransposon-free transcripts [35]. piRNAs are mainly transcribed in germline cells, but they are also expressed in somatic tissues, where the large majority are encoded in known transcripts rather than in the piRNA clusters described in germline cells [37]. 
Small nucleolar RNAs (snoRNAs) are generally encoded in introns of host genes, guiding posttranscriptional modifications of spliceosomal and ribosomal RNAs [38,39,40]. Inherently, they guarantee the accurate assembly and function of ribosomes [41]. Recently, it was described that snoRNAs can modulate 3′ end processing of mRNAs, thus controlling the expression of a subset of mRNAs [42]. They are located in the nucleolus [43] and play an important role in direct and indirect cellular functions such as splicing and translation [44].
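The idea that miRNAs recognize targets through partial complementarity with the 3′UTR can be illustrated with a toy scan. This is a minimal sketch under stated assumptions: it uses the widely applied heuristic of requiring a Watson-Crick match to the miRNA "seed" (nucleotides 2 to 8), which is not described in this text; the function names are hypothetical; and both sequences are invented for illustration and do not correspond to any real miRNA or transcript. Established target-prediction tools weigh many additional features.

```python
# Watson-Crick complement table for RNA (G:U wobble pairs are ignored here).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna: str, utr: str, seed_start: int = 2, seed_end: int = 8) -> list[int]:
    """Return 0-based UTR positions where the seed could pair perfectly.

    Assumption: the seed spans miRNA positions seed_start..seed_end
    (1-based, inclusive); the rest of the miRNA is not evaluated, which
    mimics the 'partial complementarity' described for miRNA targeting.
    """
    seed = mirna[seed_start - 1:seed_end]
    site = reverse_complement(seed)  # sequence expected in the mRNA
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    toy_mirna = "AACGGUCAUUGCCAGUACGGAU"  # invented 22-nt sequence
    toy_utr = "CCAUGACCGUAAGG"            # invented 3'UTR fragment
    print(seed_sites(toy_mirna, toy_utr))
```

Restricting the match to a short seed is a design choice that keeps the sketch simple; extending the comparison to the full miRNA length would instead model the near-perfect pairing the text associates with siRNA-like cleavage.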

lncRNAs are a heterogeneous class of ncRNAs that control transcription, translation, and mRNA stability, functions which are being highlighted in cell differentiation and development [18, 45, 46]. Lacking an open reading frame (ORF) and typically located in the chromatin and nucleus, the transcription of lncRNAs is generally associated (to a lower extent) with the expression of their antisense protein-coding genes, presenting a similar tissue-specific expression pattern [47]. Enhancer RNAs (eRNAs) are typically unspliced ncRNAs that are transcribed from enhancer loci. Their median size places them in the group of lncRNAs, and their expression correlates positively with the transcriptional activation of nearby genes [48, 49]. Circular RNAs (circRNAs) are covalently closed single-stranded ncRNAs (without 5′ cap and 3′ tail), mainly derived from back-splicing circularization of protein-coding exons, with a length corresponding to the incorporated exon(s). Their co-transcriptional biogenesis mechanism competes with pre-mRNA splicing [50], occurring preferentially in genes containing long flanking intronic sequences, containing complementary ALU repeats [51, 52]. The majority of these transcripts have a scarce, although cell-type-specific, expression pattern which supports the notion they are derived as by-products of pre-mRNA splicing [53]. circRNAs are stable and frequently show a developmental-stage and tissue-specific expression, suggesting that the cellular content of these molecules should also vary according to the mitotic index of a particular cell [54]. Overall, they represent 5–10% of the linear expression; however, some have estimated that particular circular isoforms have higher expression than the linear counterparts [55]. Important transcriptional regulatory functions were already described for this class of ncRNA. 
For instance, the circRNA derived from the CDR1 antisense transcript (CDR1as, also called ciRS-7) holds 63 conserved binding sites for miR-7, acting as a miRNA sponge [54, 56]. In turn, it is also targeted by miR-671, whose higher complementarity promotes its cleavage, adding complexity to this regulatory scenario [57]. The unusual stability of circRNAs allows their exploration as potential biomarkers in cancer. Accordingly, a recent study analyzing serum exosomes was able to discriminate patients with colon cancer from healthy controls based on the expression of circRNAs. Moreover, the authors found that circRNAs were proportionally more represented in exosomes than their linear counterparts [58].

Taking into account the transcriptional and translational regulatory functions that ncRNAs exert by targeting other RNA transcripts, it is crucial to characterize not only the epigenetic layer behind their regulation but also their effects on the epigenetic landscape, establishing new bridges in this complex archipelago. Several studies highlighted the role of nuclear lncRNAs in guiding chromatin regulatory complexes to specific genomic loci, through multiple interactions between proteins, DNA, and RNAs [59]. In this context, transcriptional activation is achieved through the recruitment of chromatin modifiers such as the histone H3K4 methyltransferase complex [60], while transcriptional silencing is associated with H3K9me2/3, H3K27me3, and polycomb repressive complex 2 (PRC2) [61]. Some nuclear lncRNAs with these characteristics have already been described. For instance, HOTAIR is a lncRNA transcribed from the HOXC locus that interacts with PRC2 through its 5′ domain, mediating its occupancy at the HOXD locus. In turn, PRC2 promotes H3K27me3 deposition, repressing transcription in trans across 40 kb. Similarly, HOTAIR binds the LSD1/CoREST/REST complex through its 3′ domain, guiding the enzymatic demethylation of H3K4me2 [62, 63]. Nevertheless, there are lncRNAs with larger transcriptional silencing capacities. The ncRNA Xist is stably repressed in the active X chromosome in females (and in the unpaired X chromosome in males), being exclusively transcribed from the inactive one, which it coats and silences in cis [64]. X-inactivation maintenance is guaranteed not only by Xist expression but also by the support of other epigenetic mechanisms, namely the hypoacetylation of histone H4 and the hypermethylation of CpG islands (CGIs) [65]. 
Remarkably, the transcription of Xist is controlled by the methylation status of its own CGI, dependent on the activity of two other ncRNAs with antagonistic effects, Tsix promoting DNA methylation and Ftx associating with the lack of DNA methylation [66, 67].

3 Genetic variation of non-coding RNAs in cancer

The canonical studies in cancer biology highlighted genetic alterations in protein-coding genes such as mutations in TP53 [68], deletions of the RB1 locus [69], amplifications of MYC [70], and chromosomal rearrangements in the MLL locus [71]. Similarly, ncRNAs are also targeted by mutations and contain genomic variations associated with aberrant expression and activity. In chronic lymphocytic leukemia (CLL), patients frequently carry a deletion at the 13q14 region that encodes miR-15 and miR-16, which are implicated in apoptosis by targeting BCL2 [72, 73]. To a lesser extent, these patients carry an 11q deletion that comprises the miR-34b/34c cluster locus (a target of p53), promoting the downregulation of these miRNAs, which are known to cooperate in the repression of malignant growth [72, 74]. In contrast, the chromosomal region containing the miR-17~92 cluster (13q31-q32) is amplified in diffuse large B cell lymphoma patients [75]. In a synergistic scenario, the expression of this locus is upregulated by Myc, promoting enhanced tumor growth [76]. In addition, these ncRNAs are also involved in chemoresistance [77] and radioresistance [78]. Likewise, miR-30d, miR-21, miR-17, and miR-155 show high copy number variation (CNV) in non-small cell lung cancer (NSCLC), as do DICER1 and DROSHA, which encode proteins involved in their biogenesis [79]. ncRNAs are also affected by single-nucleotide polymorphisms (SNPs). A common G/C polymorphism within the pre-miR-146a sequence decreases the expression of the mature miRNA and is associated with a higher predisposition to papillary thyroid carcinoma [80]. Similarly, the SNP rs61764370 is linked to cancer by disrupting the binding site for let-7 in the 3′UTR of the KRAS oncogene, increasing its expression [81, 82]. 
Although several studies have interrogated the expression of piRNAs in cancer, little is known about the possible genetic variations affecting piRNAs or the genes encoding the proteins involved in their biogenesis/function. By analyzing the piRNA transcriptome of samples from The Cancer Genome Atlas (TCGA) consortium, several genetic variants were unveiled. The authors detected that 17 piRNA sequences overlap the position of already described mutations. Moreover, suggesting a possible oncogenic role of some piRNAs, a high expression of mitochondrial piRNAs was observed in tumor tissues, likely derived from an increased mitochondrial DNA (mtDNA) content in cancer [37]. snoRNAs are also affected by genetic alterations. SNORD50A and SNORD50B are recurrently deleted in different cancer types, and their deletion is correlated with reduced survival [83]. It was observed that the snoRNA U50 is targeted by a homozygous 2-bp (TT) deletion in prostate cancer and by a recurrent heterozygous deletion in breast cancer. Accordingly, its overexpression led to reduced colony-formation capacity in vitro, suggesting a tumor suppressor nature for this ncRNA [84, 85]. Curiously, U50 is located at the chromosome breakpoint t(3;6)(q27;q15) described in human B cell lymphoma [86]. Another study showed that the genomic region encoding SNORA42 is frequently amplified in NSCLC, which is correlated with its higher expression, while the expression of the host gene remains unchanged. Functional studies indicated that SNORA42 confers an advantage in cell proliferation, contrary to its host gene [87]. On the other hand, there are also genomic variations affecting proteins that interact with ncRNAs, further suggesting their possible role in cancer. For instance, mutations in the dyskerin (DKC1) gene are linked to cancer susceptibility [88]. The encoded enzyme associates with H/ACA box snoRNAs to catalyze the pseudouridylation of rRNAs. 
In turn, alterations in ribosome biogenesis are linked to tumor progression [89, 90].

It has been described that long intergenic non-coding RNA (lincRNA) loci hold cancer-associated SNPs and are affected by CNVs in cancer [91, 92]. One study showed that FAL1 is a lncRNA with frequent genomic amplification in epithelial tumors. The high genomic copy number was correlated with a higher RNA expression and associated with cancer progression in ovarian tumors [93]. Moreover, in prostate cancer, two lncRNAs, PCAN-R1 and PCAN-R2, were described to have higher expression in tumors correlated with a higher copy number. The oncogenic function of these lncRNAs was functionally revealed using knockdown experiments that exhibited a reduced cell proliferation [94]. Other genomic variations are associated with lncRNA transcriptional changes in cancer. For instance, the presence of the high-risk neuroblastoma-associated SNP rs6939340 is linked to a lower expression of the lncRNA NBAT-1 that is suggested to be implicated in metastatic progression and poor prognosis [95].

4 Epigenetic regulation of non-coding RNAs in cancer

Genetic and epigenetic changes cooperate to promote oncogenesis. While the former are implicated in the activation of oncogenes and inactivation of tumor suppressor genes, the latter guide their transcriptional regulation. In this regard, the mechanisms of tumor formation cannot be fully elucidated without mentioning the occurrence of “epimutations,” such as aberrant histone modifications and DNA hyper- and hypomethylation events across the entire genome [96]. CpG hypomethylation is associated with a specific chromatin conformation that makes the genetic information accessible to transcription factors, which promote the transcription of oncogenes such as BCL2 in leukemias [97]. Alternatively, CpG hypermethylation leads to the downregulation of important tumor suppressor genes, such as BRCA1 in breast cancer [98].

While cancer has been historically interrogated based on genetics and protein-coding genes, this view is now challenged by the realization that ncRNAs and epigenetics are indispensable to explain the entire tumorigenesis process. In this context, ncRNAs can be transcriptionally regulated by epigenetics and are themselves able to sculpt the epigenetic landscape of a normal or a malignant cell.

Several studies uncovered tumor suppressor and oncogenic ncRNAs epigenetically deregulated in cancer (Table 1) [128,129,130]. Genome-wide DNA hypomethylation is a common hallmark of cancer cells [131] and extends to the genomic loci of ncRNAs. Most commonly, studies have interrogated the existence of a transcriptional repression mediated by local CGI hypermethylation. Moreover, the discovery and scrutiny of epigenetic pathways altered during tumor formation and metastases are uncovering not only the complexity of a cancer cell but also possible biomarkers for diagnosis, prognosis, and targets for better therapies.

Table 1 Selected epigenetically deregulated ncRNAs in cancer and metastasis. Each referenced study is associated with at least one of the mentioned epigenetic marks

4.1 miRNAs

As miRNAs are a widely studied class of ncRNAs, several of them have been described to undergo transcriptional inactivation by CGI hypermethylation. miRNAs with CGIs overlapping their promoter region are silenced through the transcriptional repression mediated by methyl-CpG-binding domain (MBD) proteins in a chromatin context characterized by the absence of histone modifications linked to active transcription (e.g., H3K4me3) [105, 119, 132]. For instance, miR-124a is epigenetically silenced in colon cancer, which is associated with the posttranscriptional de-repression of the oncogene CDK6, promoting the phosphorylation and inactivation of the tumor suppressor RB1 (Fig. 1a). Despite this miRNA being encoded in three different genomic loci, all the corresponding CGIs are hypermethylated in the colon cancer cell line HCT-116 in comparison with normal colon [105]. The epigenetic silencing of this miRNA was also noticed in several other cancer types [106,107,108,109,110,111]. Another example is given by the miR-132 promoter-associated CGI, which was observed to be hypermethylated in ~40% of prostate cancer patients with corresponding transcriptional silencing. Functionally, this miRNA decreases cell adhesion, followed by death induction, and reduces cell migration and invasion. The pro-survival genes HB-EGF and TALIN2 were identified as direct targets of this miRNA, but their silencing did not entirely recapitulate the effects of miR-132 overexpression [112]. This miRNA is also epigenetically downregulated by CGI hypermethylation in colorectal [113] and pancreatic [114] cancers, and by SOX4/EZH2-mediated H3K27me3 in ovarian cancer [115]. In a later study, it was demonstrated that miR-132 silencing leads to a metabolic switch, increasing GLUT-1 protein expression, which is implicated in lactate production and glucose uptake [133]. Curiously, CGIs are regulatory elements that commonly direct the expression of more than one transcript. 
For instance, miR-34b/c and BTG4 transcription is regulated by a CGI overlapping a bidirectional promoter. Its hypermethylation and decreased H3K4me3 are associated with the repression of both non-coding and coding transcripts, both suggested to be tumor suppressor genes in colorectal cancer [119]. miR-34b is also silenced by CGI hypermethylation in HCC [120] and lung adenocarcinomas (mir-34b/c), in the latter case associated with metastasis [121]. Despite the fact that hypomethylation events directing gene transcription are less studied, they have a tremendous importance in cancer and metastases. For instance, miR-191 is highly expressed in hepatocellular carcinoma (HCC), which correlates with the hypomethylation of the associated locus. It was suggested that this miRNA represses TIMP3 protein expression, contributing to the epithelial-to-mesenchymal transition (EMT) associated with a poor prognosis [117]. Importantly, another study suggested the existence of a dynamic epigenetic regulation associated with EMT and mesenchymal-epithelial transitions (MET). A CGI hypermethylation-associated repression of miR-200 loci was observed in transformed cells with mesenchymal phenotype. The repression of the miR-200 family allowed the expression of ZEB1 and ZEB2 which are transcriptional repressors of cell adhesion and polarity genes. In vitro experiments showed that miR-200ba/429 and miR-200c/141 repress migration, reducing tumoral growth and metastasis in vivo. Importantly, authors showed that the transitory CGI hypermethylation-associated repression of miR-200 loci induced by TGFβ treatment (EMT) was reverted by its withdrawal [118]. There are other ncRNAs that play a key role in metastasis and are deregulated in cancer. For instance, miR-1 undergoes a CGI hypermethylation-associated silencing in HCC [100] and colorectal cancer [101, 102]. This repression is also associated with reduced H3K4me3 levels. 
In vitro experiments demonstrated that the overexpression of this miRNA prevents cell proliferation, colony formation, cell motility, and cell invasion [101]. Similarly, another study showed that miR-145 is also silenced by CGI hypermethylation in glioma cell lines. This miRNA inhibits cell proliferation and cell invasion in vitro, suppresses xenograft growth in vivo, and directly targets SOX9 and ADD3. The ectopic expression of miR-145 reduces the expression of c-myc, N-myc, cyclin D1, E-cadherin, and N-cadherin, implicating an important role in cell adhesion and invasion [116].

Fig. 1

Epigenetic regulation of ncRNAs in cancer. There are ncRNAs epigenetically deregulated in cancer that contribute directly or indirectly to cancer progression and metastasis. a In normal cells, miR-124 can target the 3′UTR of CDK6 leading to its posttranscriptional silencing. In cancer, its CGI hypermethylation-associated repression is responsible for CDK6 de-repression which phosphorylates and inactivates the tumor suppressor RB1 [105]. b NBAT-1 is expressed in normal brain but downregulated in high-risk neuroblastoma patients, associated with the hypermethylation of its promoter region and the presence of the SNP rs6939340. Through the interaction of NBAT-1 with EZH2, this lncRNA is linked to the epigenetic silencing of target genes, decreasing cell proliferation and invasion [95]. c The lncRNA TP53TG1 was demonstrated to be repressed by CGI promoter hypermethylation in primary gastric tumors. Functionally, TP53TG1 interacts with YBX1, preventing its nuclear localization and the activation of growth-promoting genes [124]

In cancer, while both hypo- and hypermethylation events can coexist in parallel at different genomic loci, few studies addressed these changes simultaneously. In this context, a high-throughput study depicted and compared the transcriptome and methylome of miRNAs in HCC. Hypermethylation of miR-148a, miR-375, miR-195, miR-497, and miR-378 correlated with their silencing, while hypomethylation of miR-106b, miR-25, miR-93, miR-23a, and miR-27a was associated with their expression. Curiously, in silico analysis suggested that miR-148a targets both DNA methyltransferases DNMT1 and DNMT3B, potentially establishing a negative feedback loop. It was also hypothesized that the repression of miR-195/497 resulted from the promoter hypermethylation of the miRNAs themselves and of the transcription factors NEUROG2 and DDIT3 needed for their expression [103].

Super-enhancers are transcriptional regulatory loci that drive the robust expression of certain genes, guaranteeing cell and tissue identity [134, 135]. Similar to what happens in proximal regulatory regions such as CGIs, super-enhancers are also targeted by cancer-related hypermethylation events that are correlated with the transcriptional silencing of their related genes. In lung and breast cancers, bioinformatic analysis showed that hypermethylation of the super-enhancer controlling the lncRNA MIRLET7BHG is coupled with the transcriptional repression of the corresponding encoded tumor suppressors let-7a-3 and let-7b [99]. Curiously, there are super-enhancers with a de novo regulatory role in malignant cells [135,136,137], having the potential to drive the expression of other coding and non-coding genes.

Finally, it is essential to mention the importance of the methylation at CGI shores and the existence of mirtrons which are intronic miRNA precursors processed in a Drosha-independent manner [138]. A screening performed in urothelial cell carcinoma (UCC) showed that both miRNAs and mirtrons are susceptible to hypermethylation-mediated silencing which was more common and dense in CpG shores than CGIs. Authors also explored the lower urinary expression of the epigenetically silenced miRNAs (miRs-152/328/1224) as potential biomarkers for the diagnosis of UCC [104].

4.1.1 piRNAs

Several studies were conducted to analyze the expression and potential role of both piRNAs and associated proteins in cancer. Functionally, the knockdown of PiwiL2 in murine bone marrow mesenchymal stem cells was found to reduce the expression of tumor suppressor genes, increasing cell proliferation [139]. Concordantly, lower expression of PIWIL1, PIWIL2, and PIWIL4 was associated with poor prognosis in renal cell carcinoma [140]. In soft tissue sarcoma patients, lower expression of PIWIL2 and PIWIL4 correlates with a worse prognosis [141]. These genes were also found to be downregulated in NSCLC, where patients with lower levels of PIWIL4 had shorter overall survival. In the same study, PIWIL1 was found to be expressed in a set of NSCLC cases with worse prognosis. Upon 5-Aza-dC treatment, authors showed a dose-dependent expression of PIWIL1, in two NSCLC cell lines. Moreover, they correlated the expression of PIWIL1 with a higher percentage of unmethylated CpGs within its CGI, suggesting that PIWIL1 expression could be partially regulated by DNA methylation [142]. Despite the fact that the expression of PIWI-proteins has been documented in somatic tissues, very few studies reported the existence of piRNAs in normal or cancer somatic tissues [143]. In primary testicular tumors, PIWIL1, PIWIL2, PIWIL4, and TDRD1 were found to be epigenetically silenced by CGI hypermethylation [144], which is associated with a loss of piRNAs [144, 145] and hypomethylation events at the LINE-1 loci [144, 146]. Curiously, the epigenetic silencing of the genes encoding for the PIWI proteins was also described in non-genetic male infertility syndromes [147] which are epidemiologically linked to testicular cancer [148, 149].

4.2 lncRNAs

There are also alterations in the methylation profile of the promoter regions of lncRNAs in cancer [92]. The comparison between the transcriptomes of low- and high-risk neuroblastomas has established a correlation between lower NBAT-1 expression and poor clinical outcomes. The authors discovered that the hypermethylation of the promoter region of NBAT-1 was linked to its lower expression in high-risk neuroblastoma patients. Later, NBAT-1 was functionally associated with the differentiation of neuronal tumor cells and with the decrease in cell proliferation and invasion, through the epigenetic silencing of target genes mediated by its interaction with EZH2 (Fig. 1b) [95]. In colon cancer, the promoter CGI hypermethylation of Vimentin (VIM) and its head-to-head antisense transcript was shown to lead to their transcriptional silencing. A detailed study suggested that the antisense transcription allows the formation of an R-loop structure, enhancing transcription by maintaining an open local chromatin conformation. In this context, both R-loop destabilization and antisense knockdown promote chromatin compaction and prevent the binding of transcriptional regulators of the NF-κB pathway [125]. Another example of epigenetic deregulation of lncRNAs is given by a recent study showing that the p53-induced lncRNA TP53TG1 is silenced by promoter hypermethylation in gastric and colon tumors with an associated poor prognosis. The authors showed that TP53TG1 interacts with the DNA/RNA binding protein YBX1, impeding its nuclear localization and preventing the activation of oncogenes. It was suggested that upon cancer-specific silencing of TP53TG1, YBX1 is able to activate growth-promoting genes and create chemoresistance (Fig. 1c) [124].

4.3 T-UCRs, snoRNAs, and snoRNA-host genes

Transcribed-ultraconserved regions (T-UCRs) are lncRNAs encoded from DNA sequences absolutely conserved between orthologous regions of the human, rat, and mouse genomes. It was described that the T-UCRs Uc.160+, Uc.283+A, and Uc.346+ undergo cancer-specific CGI hypermethylation-associated silencing, which is a common event in several tumor types [127]. Interestingly, Uc.283+A binds to pri-miR-195, impairing miR-195 maturation and function [150]. Additionally, snoRNAs are also affected by epigenetic mechanisms in cancer. For instance, three snoRNA host genes were found to undergo a cancer-specific promoter hypermethylation-associated silencing with an associated low expression of the mature snoRNAs: SNORD123, U70C, and ACA59B. Interestingly, the methylation of a single CGI was inversely correlated with the transcription of three different transcripts, namely SNORD123, its host gene LOC100505806, and SEMA5A, which is transcribed in the opposite direction. Curiously, SNORD123 and ACA59B are snoRNAs conserved across vertebrates without a known target (orphan snoRNAs), suggesting that these snoRNAs may have a role in cancer by an unknown mechanism unrelated to the guided modification of ribosomal and spliceosomal RNAs [126]. By contrast, U70C was found to be repressed in CLL patients [151] and deregulated in X-linked dyskeratosis congenita, a congenital disorder associated with cancer susceptibility [152]. In this case, this snoRNA directs a modification of 18S rRNA which is suggested to be associated with cancer [153,154,155]. The downregulation of snoRNAs was also described in acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) [156]. Importantly, apart from their classical functions, they may have a role in gene silencing acting through an antisense-like mechanism in the nucleus, promoting pre-mRNA degradation, preventing splicing, or inhibiting the export of the transcript [157, 158].

There are lncRNAs that can act as endogenous competitors for miRNAs, being able to de-repress their targets. Growth arrest specific 5 gene (GAS5) is a lncRNA and snoRNA host gene with tumor suppressor functions that is downregulated in several solid tumors [159,160,161,162,163,164,165,166,167]. In the liver, this lncRNA was described as a competing endogenous RNA (ceRNA) for miR-222, increasing the protein levels of the tumor suppressor p27, a target of this miRNA [168]. Moreover, GAS5 exerts its tumor suppressor functions by interacting with E2F1, promoting its binding to the P27Kip1 promoter and activating it [169]. This lncRNA may also act as an endogenous sponge for the oncogenic miR-21 [170]. Similarly, a recent study has described an antisense transcript encoded in the GAS5 genomic locus, GAS5-AS1, with reduced expression in NSCLC tumors. Functional assays were unable to show an effect of this lncRNA in terms of proliferation, cell cycle progression, or apoptosis. However, it was demonstrated that this antisense transcript directs a pronounced decrease in cell migration and invasion, reducing ZEB1, N-Cadherin, VIM, and/or SNAIL1, crucial for EMT. The authors have suggested that the epigenetic silencing of GAS5-AS1 in NSCLC is, at least partially, due to histone deacetylation, while GAS5 is silenced through DNA methylation [123].

4.4 Genomic-imprinted ncRNAs

Genomic imprinting is an epigenetic mechanism associated with DNA methylation and histone posttranslational modifications, through which genes are transcriptionally regulated in a parental allele-specific manner. Loss of imprinting with aberrant DNA methylation profiles at genomic-imprinted loci is associated with transcriptional changes of their encoded genes in cancer and other diseases [171]. The DLK1-DIO3-imprinted locus (14q32) is controlled by two intergenic, differentially methylated regions (DMRs). It encodes three protein-coding genes, DLK1, RTL1, and DIO3, from the paternal allele and several long and short ncRNAs from the maternally inherited allele [172]. The methylation signature of this locus was investigated by comparing lung cancer versus non-tumoral tissue, and an inverse correlation was observed between DNA methylation and expression of the corresponding genes. Accordingly, while the hypermethylation of DIO3 was associated with its transcriptional silencing, the hypomethylation of SNORD113-5, SNORD113-7, SNORD114-9, and miR-889 correlated with their higher expression in smoking-induced lung cancer. Concordantly, these methylome differences were extended to other protein-coding genes, snoRNAs, miRNAs, and lncRNAs [173]. miR-411, miR-370, and miR-376a, encoded in this locus, are upregulated in lung cancer and linked to a more aggressive phenotype, as well as poor survival. miR-411 downregulation decreased cell migration [174]. Independent of the methylation profile, other cancer types have shown an increase in the expression of the miRNA members of this imprinted locus, namely HCC [175] and uterine carcinoma [176]. In contrast, a downregulation of these miRNAs was observed in colorectal cancer [177], gastric adenocarcinoma [178], medulloblastoma [179], and papillary thyroid cancer [180]. Concordantly, a genome-wide analysis unveiled the silencing of the miR-379/miR-656 cluster across different human cancers.
This was observed in a high percentage of samples from glioblastoma multiforme, kidney renal clear cell carcinoma, breast invasive carcinoma, and ovarian serous cystadenocarcinoma [181]. Intriguingly, miRNAs expressed from this imprinted locus are preferentially exported in exosomes, being almost absent in the cells where they are produced, which suggests that the levels of the abovementioned miRNAs may not reflect their transcriptional activity [182]. Another study showed that this locus also encodes 138 piRNAs. Seven of these piRNAs are exclusively encoded within this imprinted locus and are somatically expressed in lung (non-malignant and tumor samples), suggesting their potential role in the anomalous methylation profile of the imprinted locus during lung cancer progression. By comparing paired tumor and non-malignant lung tissue, the authors noticed that four of these piRNAs were upregulated in lung adenocarcinoma and one in lung squamous cell carcinoma [183]. The lncRNAs of this locus are also deregulated in cancer. MEG3 encodes a lncRNA that is highly expressed in the brain. In vitro assays revealed that MEG3 has an anti-proliferative activity, inhibiting DNA synthesis, suppressing colony formation, and activating p53-mediated transactivation. Curiously, its transcriptional silencing is a common event in meningiomas, occurring through allelic loss (in higher-grade tumors) or through an increase in CpG methylation within its promoter or the imprinting control region [122].

5 Non-coding RNAs as epigenomic regulators

HOTAIR is a lncRNA upregulated in different types of cancer [184] and has important functions through its interactions with other RNA molecules or by recruiting the PRC2 and LSD1/CoREST/REST chromatin-modifying complexes. The transient LSD1-mediated demethylation of H3K4 promotes the assembly of the Myc-induced transcription initiation complex [185]. HBXIP, an oncoprotein that directly interacts with c-Myc, was suggested to recruit HOTAIR together with LSD1, mechanistically mediating c-Myc transcriptional activation through the c-Myc/HBXIP/HOTAIR/LSD1 complex, where HOTAIR serves as a scaffold, in breast cancer cells (Fig. 2a) [186]. In esophageal squamous cell carcinoma (ESCC), HOTAIR promotes H3K27me3 deposition at the promoter region of WIF-1, an inhibitor of the Wnt/β-catenin signaling pathway, leading to the epigenetic silencing of WIF-1 and consequent de-repression of Wnt target genes. This observation supports the use of HOTAIR expression as a prognostic factor for metastatic progression in ESCC (Fig. 2a) [187]. Moreover, HOTAIR upregulation was associated with a genome-wide retargeting of PRC2 linked to both breast and colorectal cancer metastases [195,196,197]. Besides the capacity to recruit two distinct chromatin-modifying complexes, HOTAIR can also function as a ceRNA. For instance, higher expression of HOTAIR was demonstrated to be associated with a malignant phenotype and poor prognosis in gastric cancer patients. HOTAIR promotes migration and invasion, and its knockdown restrains cell proliferation and induces apoptosis. In this context, HOTAIR upregulation is correlated with higher expression of HER2, due to its function as a sponge for miR-331-3p, de-repressing its target gene HER2 (Fig. 2b) [188]. Although miR-141 undergoes CGI hypermethylation-associated silencing in gliomas, it was also verified that HOTAIR acts as a sponge for this miRNA, positively regulating SKA2, which is implicated in cancer progression (Fig. 2b) [189].
Similarly, by sponging miR-152, HOTAIR induces HLA-G upregulation, pointing to a potential role of this lncRNA in the escape from cancer immune surveillance (Fig. 2b) [190, 198]. In esophageal cancer, HOTAIR sequesters miR-148a, which de-represses Snail2 and promotes EMT (Fig. 2b) [191]. In summary, HOTAIR acts not only by regulating gene expression through epigenetic changes but also as a sponge for miRNAs, allowing the expression of their targets. The complexity of this epigenetic scenario increases, taking into account that each chromatin-modifying complex may interact with various ncRNA molecules. HOTAIR binds EZH2, the enzymatic subunit of the repressive polycomb complex PRC2. However, since EZH2 is upregulated in several human cancer types and associated with aggressiveness and poor survival, it was essential to determine how polycomb complexes are guided to their specific target genomic locations. By cross-linking methods, several intronic RNA sequences were identified as being able to bind EZH2, regulating the transcriptional activity of their host gene. In this regard, the ectopic expression of the EZH2-bound intronic RNA sequence associated with SMYD3 resulted in a higher occupancy of EZH2 at the corresponding genomic locus, reducing the transcription of the host gene [199].

Fig. 2

ncRNAs as posttranscriptional and epigenetic modulators of gene expression in cancer. HOTAIR and other ncRNAs act as regulatory molecules in a wide variety of biological processes. a, Left: In cancer, HBXIP interacts with c-Myc and recruits HOTAIR together with LSD1 (which mediates the transient demethylation of H3K4me2). The c-Myc/HBXIP/HOTAIR/LSD1 complex is responsible for c-Myc transcriptional activation [186]. a, Right: HOTAIR can also serve as a scaffold for the PRC2 complex, promoting H3K27me3 in the promoter region of WIF-1, which is an inhibitor of the Wnt/β-catenin signaling pathway. The epigenetic silencing of WIF-1 is associated with the activation of the Wnt/β-catenin pathway [187]. b Additionally, HOTAIR not only regulates chromatin dynamics but also influences gene expression posttranscriptionally. This lncRNA acts as a ceRNA, sponging miR-331-3p, miR-141, miR-152, and miR-148a, de-repressing the cancer- and metastasis-related proteins HER2 [188], SKA2 [189], HLA-G [190], and Snail2 [191], respectively. c DNMT enzymes catalyze the conversion of cytosine to 5-mC, whereas TET enzymes catalyze the conversion of 5-mC to 5-hmC. DNA demethylation (loss of 5-mC) can be achieved either as a passive process through DNA replication in the absence of functional DNA methylation maintenance or actively through TET-mediated 5-mC oxidation. miR-29b targets DNMT3A, DNMT3B, and indirectly DNMT1, by targeting its transactivator SP1, promoting a global DNA hypomethylation [192]. Simultaneously, this miRNA is able to target the 3′UTR of TET1, TET2, and TET3, decreasing, as a consequence, the cellular 5-hmC content [193]. miR-29b was found to be downregulated in AML patients with balanced 11q23 translocation [194] but upregulated in TET2-wild-type AML patients [193], suggesting that this miRNA has an important role in the epigenetic profile of AML cells. Importantly, the oncogenic or tumor suppressor role of this miRNA in AML is controversial.

There are several other examples of ncRNAs that are able to modulate the epigenome in cancer and metastases. For instance, FAL1 is a lncRNA frequently amplified in epithelial tumors. Its association with the polycomb complex protein BMI1 promotes the binding of BMI1 to the CDKN1A promoter, repressing p21 expression in cancer [93]. By contrast, NBAT-1 can be epigenetically silenced in cancer and, per se, the expression of this lncRNA is biologically responsible for the epigenetic silencing of target genes such as SOX9, VCAN, and OSMR, through the interaction with EZH2, suggesting an important role in tumor progression [95]. ZFAS1 is a snoRNA host lncRNA described as a tumor suppressor gene in breast cancer [23, 200] and as an oncogene in colorectal cancer [201] and HCC [202]. The latter study showed an upregulation of ZFAS1 in primary tumors compared to adjacent non-malignant tissue. The oncogenic function was attributed to miR-150 sequestration, de-repressing ZEB1, MMP14, and MMP16, which may act to promote metastasis [202]. Mechanistically, it was also shown that ZFAS1 is associated with the inhibition of the CpG methylation of the miR-9 promoter-related CGI, through a DNMT1-dependent mechanism [203]. Accordingly, the upregulation of miR-9 in HCC was associated with an aggressive phenotype and poor prognosis [204]. In addition, it was shown that silencing of E-cadherin mediated by miR-9 leads to the activation of β-catenin signaling, activating pro-metastatic genes in breast cancer [205]. Other still uncharacterized snoRNA-host genes and “orphan” snoRNAs may hold unexpected functions in the cellular context with potential roles in cancer. As many snoRNAs are intronically encoded in genes coding for ribosomal proteins and themselves guide posttranscriptional modifications of rRNA, it is worth considering that some “orphan” snoRNAs might have a function somehow associated with their host gene.
Curiously, a recent study identified novel “orphan” snoRNAs, encoded within host genes with epigenetic functions [206], whose expression is deregulated in cancer, e.g., DNMT3A [207] and KAT6B [208], anticipating that they might have an epigenetic-associated role in cancer.

ncRNAs have different epigenetic roles in cancer, promoting and preventing specific epigenetic changes. For instance, a ncRNA encompassing the full mRNA sequence of CEBPA interacts with DNMT1, blocking the DNA methylation of the CEBPA locus [209]. In contrast, it was suggested that the lncRNA HNF1A-AS1, upregulated in lung adenocarcinomas, binds to DNMT1, mediating its binding to the E-cadherin promoter and sustaining its repression. Reduced levels of HNF1A-AS1 increased E-cadherin and decreased N-cadherin and β-catenin, showing the involvement of this lncRNA in EMT. The ectopic expression of HNF1A-AS1 promoted cell proliferation, and its downregulation inhibited cell migration and invasion in vitro and decreased tumor growth and metastasis in vivo [210]. Similarly, HNF1A-AS1 is upregulated in bladder cancer and HCC, where this lncRNA promotes proliferation by acting as a ceRNA, sponging miR-30b-5p and de-repressing its target gene Bcl-2 [211, 212]. HNF1A-AS1 also exerts its oncogenic functions by acting as a sponge for miR-34a, positively regulating SIRT1 in colon cancer [213]. Importantly, there are ncRNAs deregulated in cancer that directly target epigenetic genes, leading to global genomic effects. This is the case for miR-29b, which is downregulated in AML patients with balanced 11q23 translocation [194]. Taking into account that DNMTs catalyze the conversion of cytosine to 5-methylcytosine (5-mC), it was functionally shown that miR-29b targets DNMT3A, DNMT3B, and SP1, a transactivator of DNMT1, which promotes genome-wide DNA hypomethylation and induces the transcription of tumor suppressor genes in AML (Fig. 2c) [192]. By contrast, miR-29b, among other miRNAs, was shown to be upregulated in TET2-wild-type AML patients.
Considering that TET2 catalyzes the conversion of 5-mC to 5-hydroxymethylcytosine (5-hmC), being involved in active and passive DNA demethylation, functional assays have demonstrated that miR-29b targets and decreases TET2 expression (in addition to TET1 and TET3), and reduces the cellular levels of 5-hmC in hematopoietic cells (Fig. 2c) [193]. Thus, the expression levels of miR-29b should have an important role in the epigenetic profile of AML cells.

Recently, piRNAs were suggested to guide gene-specific CpG methylation at non-transposable element genetic loci, which may be partially mediated by their direct binding to genomic DNA or nascent mRNA transcripts, close to target CpG sites [214]. For instance, the piwi/piRNA complex was linked to the methylation of a CGI overlapping the promoter region of CREB2, in neurons [215]. On the other hand, the SNP rs1326306 G>T at the piR-021285 locus increased the risk of breast cancer, since the polymorphism within the mature sequence of a piRNA might alter its potential to methylate its target loci. Transfection of breast cancer cell lines with piRNA mimics showed the existence of significant methylation differences between wild-type and variant piR-021285 mimics. Among other genes, the variant piR-021285 mimic induced lower ARHGAP11A 5′UTR/first exon methylation, which was associated with a higher mRNA expression. This variant also increased cell invasiveness that might be associated with higher levels of ARHGAP11A, since the expression of this gene has been found to be upregulated in migratory breast cancer cells [216].

6 Future perspectives and conclusions

This review provides examples and new insights into the epigenetic regulation of ncRNAs and how ncRNAs on their own can produce important changes in the epigenetic landscape of a cancer cell, modulating the expression of other cancer-related coding and non-coding genes. The meticulous scrutiny of the transcriptional networks established in cancer by ncRNAs and epigenetics opens the opportunity to establish novel biomarkers and new candidate targets, resulting in a better and more personalized cancer treatment through new pharmacological approaches. Translationally, the use of the current epigenetic drugs is still very limited due to their genome-wide effects [217]. In this context, ncRNAs could be exploited not only as specific biomarkers for early diagnosis and personalized treatment but also as targets with specific downstream transcriptional and translational effects. RNA interference (RNAi) is a promising approach to functionally and specifically downregulate almost any RNA molecule without genome-wide side effects [218,219,220]. For instance, Miravirsen is a miRNA-targeting drug that employs locked nucleic acid-modified oligonucleotides to repress miR-122, which is required for hepatitis C virus infection [221, 222]. In addition, miRNA replacement using RNA mimics has also been explored, namely in cancer [223]. Considering the epigenetic roles of some RNA molecules and their ability to control gene expression, the establishment of mechanistic gene regulatory networks, supported by novel approaches to modulate the expression of some ncRNAs, could lead to the development of novel epigenetic and non-epigenetic therapeutic strategies in cancer.