1 Introduction

According to the central dogma of molecular biology, DNA is transcribed into messenger RNAs, which in turn serve as the guide for protein synthesis [1, 2]. Although exceptions to this rule, in the form of transfer RNAs and ribosomal RNAs, have long been known [3, 4], only over the last few years has evidence emerged that RNA has functional roles beyond that of messenger between DNA and protein. It is now widely accepted that RNA plays a key role in the regulation of genome organization and gene expression [5, 6]. Multiple studies have demonstrated that the vast majority of the human genome is dynamically and differentially transcribed to produce multiple and complex non-coding RNAs (ncRNAs), i.e., RNA transcripts that do not encode a protein but rather act as regulatory RNAs, in a phenomenon called pervasive transcription [2, 7–9].

The human genome comprises more than 3.2 billion nucleotides which, when unfolded, correspond to more than 2 m of linear DNA packed into three-dimensional structures in the nucleus of each cell [10]. However, protein-coding genes represent only approximately 2–3 % of the genome, while a vast and diverse plethora of ncRNAs originates from the remaining nucleotides [5, 10]. Current ENCODE predictions suggest that ~80 % of the genome’s DNA is transcribed into RNA, which contributes to the overall estimate that 80 % of the genome is biochemically active/functional [11]. Although these transcripts include some functionally well-characterized small and long ncRNAs [11–15], it has been hypothesized that their abundance and sheer complexity alone are reason enough to believe that they must play major regulatory roles in complex organisms. Furthermore, because such complexity is not the result of the amount of synthesized proteins, it should represent the extent and nature of genome regulation [12, 16]. However, caution should be taken when considering those transcribed DNA elements as functional players. Multiple lines of evidence indicate that this genome-wide transcription is a stochastic process rather than a sign of function per se [17, 18]. Indeed, any given DNA element can be transcribed when it is associated with specific histone marks, binds to transcription factors, and is located in an open chromatin area [11]. Transcription is certainly a prerequisite for a genetic element to be functional, but it is not synonymous with that condition [18].
Further, if one looks at the term “function” from an evolutionary standpoint and assumes that a putative undetected function of 98 % of the human genome is not a human-specific trait, it might be argued that a given DNA element with an important function would show significant signs of selective pressure (i.e., sequence conservation in related organisms) to maintain this functionality over evolutionary time, as demonstrated for a very limited number of long ncRNAs (lncRNAs) [19]. Comparative studies showed that broadly conserved lncRNAs share short, 5′-biased patches of conserved sequence [20]. Moreover, lncRNA structure is considerably renewed during evolution, in part due to exonization of transposable elements [20].

Nevertheless, an ever-increasing number of novel classes of small and long ncRNAs are being described, regardless of their homology to transcripts of related organisms or of any demonstrated function. Driven by recent paradigm shifts in the appreciation of genomic architecture, regulation, and transcriptional output, this seems a valid approach to many researchers. Many of these novel ncRNAs are able to interact with DNA, RNA, and proteins. Some take part in diverse structural, functional, and regulatory activities, controlling nuclear organization as well as transcriptional, post-transcriptional, and epigenetic regulation [10, 16]. This expanding inventory of ncRNAs is implicated in a broad spectrum of processes including organ homeostasis and pathogenesis, and it is fuelled by the discoveries of large-scale consortia such as ENCODE and FANTOM, which aim to dissect the functional genomic elements [5]. These projects exposed the complexity and plasticity of the genome: it encompasses not only protein-coding genes with multiple transcription start sites, alternative promoter and enhancer elements, splicing initiation and donor sites, and variable 3ʹ-untranslated regions (UTRs), but also an unpredictably large number of ncRNAs [5]. These ncRNAs display numerous regulatory functions and similarly serve as substrates for transcriptional and post-transcriptional diversification [12, 16]. The advent of sequencing technologies revealed that the vast majority of the genome is transcribed in either sense or antisense orientation and is expressed in a highly cell type-, subcellular compartment-, and developmental stage-specific manner [21]. The current view of RNA transcription is that each nucleotide can contribute to context-dependent transcription, mediated by specific RNA polymerases, ultimately giving rise to numerous and overlapping transcripts [21].

Several reports have shed light on the global deregulation of the non-coding transcriptome that occurs in cancer cells. One of the best examples is the overexpression of the lncRNA HOTAIR in breast cancer. HOTAIR reprograms the epigenome of breast cells in a Polycomb repressive complex 2 (PRC2)-dependent manner, contributing to increased invasiveness and metastasis [22]. Recently, it was shown that intronic RNA may serve as a molecular scaffold for epigenetic regulation through recruitment of PRC2 proteins to specific gene loci [23]. Misregulation of these RNA–protein interactions ultimately leads to tumor formation [23]. Interestingly, R-loop formation and head-to-head antisense transcription are known to be involved in transcriptional activation in cancer [24]. However, one of the first hints of ncRNA involvement in cancer was the deletion and concomitant downregulation of miR-15 and miR-16 in chronic lymphocytic leukemia [25]. The deregulation of some PIWI-interacting RNAs may also contribute to breast cancer-specific biology, possibly by remodeling the cancer epigenome [26]. Taken together, this body of evidence establishes ncRNAs as critical components of cancer biology: ncRNAs are cancer-related genes owing to their potential tumor-suppressive and/or oncogenic functions [27].

2 The diversity of non-coding RNAs in humans

According to their size, ncRNAs are classified into two main families: lncRNAs, corresponding to transcripts of over 200 nt that do not appear to contain a protein-coding sequence, and small ncRNAs (sncRNAs), whose RNA sequence contains fewer than 200 nt [12]. ncRNAs localize to both the nucleus and the cytoplasm and may be found in exosomes and other microvesicles present in bodily fluids such as urine, blood, and seminal fluid, although the abundance and activity of ncRNAs in exosomes remain unclear [28]. Exosomes are released from tumor cells and may transfer proteins and RNA across cells. Thus, it is tempting to speculate that in the PCa microenvironment, miRNAs (the most commonly studied ncRNAs in exosomes) may be transferred among stromal cells and cancer stem cells, although it is not clear how miRNAs reassemble into a functional miRISC upon import into other cells.

Moreover, circulating ncRNAs in serum, plasma or urine, although at low levels, may provide new opportunities for biomarker development [29].

2.1 Small non-coding RNAs

Small ncRNAs differ from lncRNAs in their length and are typically classified according to their different biogenesis pathways and genomic origins (Table 1). Classically, sncRNAs include all transfer RNAs (tRNAs), some ribosomal RNAs (rRNAs), small nuclear RNAs (snRNAs), small nucleolar RNAs and their derivatives, microRNAs (miRNAs), short interfering RNAs, and piwi-interacting RNAs [30]. More recently, several other small RNAs associated with protein-coding gene transcription and splicing regulation, such as transcription initiation RNAs (tiRNAs), promoter-associated short RNAs, termini-associated short RNAs, 3′-untranslated region-derived RNAs, and antisense termini-associated short RNAs, have been added to this class [30, 31].

Table 1 mRNA/lncRNA convergent and divergent features

2.1.1 microRNAs

Presently, sncRNAs involved in post-transcriptional regulation of target RNAs via the RNAi pathway, such as miRNAs, siRNAs, and piRNAs, are considered the most biologically relevant [32]. Owing to their involvement in human diseases such as cancer and their potential as disease biomarkers, miRNAs are undoubtedly the best-studied sncRNA class. Mature miRNAs are typically ~22-nucleotide single-stranded RNAs (ssRNAs), canonically derived from longer primary transcripts (pri-miRNAs) that are processed into intermediate precursor miRNAs (pre-miRNAs) by the DROSHA-DGCR8 microprocessor complex [33]. These hairpin precursors are exported from the nucleus to the cytoplasm via exportin 5 (XPO5), where the terminal loop region of the hairpin is removed by DICER/TRBP2 [34], resulting in a double-stranded mature/star RNA molecule (dsRNA) (Fig. 1) [35]. Both Drosha and Dicer are RNase III-type proteins and leave characteristic 2-nt offsets on their substrates that can be used for the bioinformatic description of miRNAs [36]. While the canonical models of microRNA biogenesis and action are constantly being refined, Drosha-independent [37] and Dicer-independent [38] biogenesis mechanisms have been described, but they represent rare exceptions. After the canonical, multistep processing, typically only one of the strands (the mature miRNA) is loaded by Argonaute proteins and coupled with diverse components of the RNA-induced silencing complex (RISC). Constrained by the structure of the Argonaute protein, only 7 nucleotides (positions 2–8) of the mature miRNA are exposed [39]. This so-called “seed region” defines the range of potential target RNAs through the usually perfect complementarity of these few nucleotides. In the vast majority of cases, target interaction of miRNAs occurs at the 3′UTR of protein-coding genes [40]. miRNA:mRNA interactions that include the RISC complex then lead to repression or degradation of those transcripts and, ultimately, to a moderate downregulation of the corresponding proteins (Fig. 2) [35].
While this seemed a straightforward model when miRNA function was first discovered, complexity was quickly added when researchers realized that the genome may contain multiple copies of nearly identical and evolutionarily related miRNAs that share seed sequences and, consequently, their range of targets (miRNA families). Moreover, a single miRNA has not one but hundreds of target RNAs, and a single protein-coding gene is targeted by multiple miRNAs (Fig. 3). Consequently, there is redundancy in microRNA targeting. Indeed, one miRNA may have several different MREs in the same target RNA. It is, thus, likely that most miRNAs act as rheostats, fine-tuning the expression of hundreds of genes in intricate gene networks [41]. miRNAs may, in fact, establish thresholds and increase the coherence of the expression of their targets, as well as reduce the cell-to-cell variability in target gene expression [42].
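The seed-based targeting rule described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which a candidate site is any stretch of a 3′UTR that is perfectly complementary to positions 2–8 of the mature miRNA. The function names and the toy sequences are hypothetical and chosen for illustration only; real target prediction also weighs site context, conservation, and pairing outside the seed.

```python
# Minimal sketch (simplified model, not a published prediction algorithm):
# find stretches of a 3'UTR that pair perfectly with the miRNA seed (positions 2-8).

RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(RNA_PAIRS[base] for base in reversed(rna))


def seed_of(mirna: str) -> str:
    """Return nucleotides 2-8 (1-based) of a mature miRNA written 5'->3'."""
    return mirna[1:8]


def seed_sites(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions in the UTR that match the seed perfectly."""
    site = reverse_complement(seed_of(mirna))
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Toy input sequences (assumptions for illustration only).
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAACUACCUCAAAGGGCUACCUCUU"
print(seed_sites(toy_mirna, toy_utr))  # [3, 16]
```

Because the match is only 7 nt long, such sites occur frequently by chance, which is consistent with a single miRNA having hundreds of candidate targets and with one transcript carrying several sites for the same miRNA.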

Fig. 1

MicroRNA biogenesis in human cells. In the canonical miRNA biogenesis pathway, primary transcripts are processed by Drosha in the nucleus and by Dicer in the cytoplasm. Both Drosha and Dicer are RNase III enzymes and produce a hairpin precursor with 2-nt offsets at the 5p and 3p arms. The lncRNA pri-miRNA displays a 7-methylguanosine cap (m7Gppp), ends with a 3ʹ poly(A) tail, and is transcribed by RNA polymerase II (Pol II). The pri-miRNA contains a stem–loop structure that is cleaved in the nucleus by the endonuclease Drosha together with its double-stranded RNA (dsRNA)-binding protein partner DGCR8, forming a complex called Microprocessor. The output of this trimming is a precursor miRNA (pre-miRNA). It is then exported to the cytoplasm by exportin 5 and further cleaved by the endonuclease Dicer, together with its dsRNA-binding partner TRBP, to produce a miRNA-miRNA* duplex. Further maturation steps reject miRNA* and incorporate the mature miRNA strand into the miRNA-induced silencing complex (miRISC). Alternative biogenesis pathways are also acknowledged. Mirtrons are short introns with hairpin potential that are spliced and debranched into pre-miRNAs and bypass the Drosha cleavage of the canonical miRNA pathway. They lack a lower stem and basal single-stranded segments, which are structural features of pri-miRNAs and mediate recognition/cleavage by the DGCR8/Drosha complex. In this pathway, the pre-miRNA is generated from a branched mirtron structure that undergoes lariat debranching. Another alternative biogenesis pathway involves a pre-miRNA that escapes Dicer processing after nuclear export and is loaded directly into the AGO2 protein. AGO2 is responsible for processing the pre-miRNA into a single-stranded miRNA (hsa-miR-451)

Fig. 2

miRNA-mediated gene regulation. Most plant and a few animal miRNAs direct endonucleolytic cleavage of their mRNA targets by perfect complementarity. However, highly complementary sites in animal transcriptomes are infrequent. Accordingly, miRNA-directed translational repression is indistinguishable from mRNA destruction via decapping and 5ʹ-to-3ʹ decay. Thus, it was suggested that miRNAs mainly direct target mRNAs for decay. Nonetheless, the predominant mode of miRNA-mediated repression may be context-dependent. The core RISC complex is formed by Argonaute proteins (1–4 in mammals) and GW182. When mRNA decay and subsequent target mRNA destabilization are observed, this suggests that miRISC interacts with the CCR4–NOT and PAN2 deadenylase complexes to facilitate deadenylation of the poly(A) tail. Following deadenylation, the 5ʹ-terminal cap (m7G) is removed by the DCP1–DCP2 decapping complex, and mRNA decay is effected by an exonuclease. The miRISC inhibits translation initiation by interfering with eIF4E-cap recognition and 40S small ribosomal subunit recruitment or by antagonizing 60S subunit joining and preventing 80S ribosomal complex formation. Furthermore, miRISC might inhibit translation at post-initiation steps by inhibiting ribosome elongation. MicroRNA-target interactions might additionally be mapped by ribosome profiling, which provides a “snapshot” of all the ribosomes active in a cell at a specific time point. MicroRNA manipulation would allow systematic monitoring of cellular translation processes and prediction of protein abundance. Coupled with bioinformatic target predictions, this would help determine which mRNA is being translated and which region of the mRNA is being targeted by the miRNA

Fig. 3

Magnitude of microRNA-mediated gene regulation. a Redundancy of microRNA targeting. Most miRNAs act as rheostats, fine-tuning the expression of hundreds of genes in intricate gene networks. A single miRNA can target multiple transcripts, and one specific mRNA can be targeted by several miRNAs. Indeed, one microRNA may have different MREs in the same target mRNA (each miRNA binds to the specific MRE shown in the same color). b Competing endogenous RNAs: naturally occurring miRNA decoys. Functionally relevant linear or circular lncRNAs may act as miRNA decoys that sequester miRNAs from their target mRNAs. Base pairing is also the mode of action of ceRNAs. Those lnc-ceRNAs can mask miRNA-binding sites on a target mRNA to block miRNA-induced silencing through the RNA-induced silencing complex (RISC). ceRNAs comprise circular RNAs (circRNAs), lncRNAs, and pseudogenes that compete for complementarity with miRNAs. The ultimate impact of these interactions is that protein-coding RNAs and non-coding RNAs may crosstalk by competing for miRNA binding through their miRNA recognition motifs, impairing cell homeostasis. (ORF, open reading frame)

miRNAs themselves are also subject to modifications, including post-transcriptional RNA editing (methylation, uridylation, and adenylation) [27, 43] and miRNA tailing [35]. Deep sequencing data revealed that the majority of miRNAs show length and sequence isoforms (isomiRs) with largely unknown functions in cancer biology [43], although it has been shown that they can selectively associate with specific RNA-binding proteins (e.g., Argonaute) or exosomes [44], implying context-dependent functional roles. These alterations may, thus, affect miRNA maturation and deregulate the stoichiometry of target gene:miRNA interactions.

Despite growing knowledge of miRNA biology and the ever-increasing amount of sncRNA sequencing data, defining what is and what is not a miRNA has become a challenge. This uncertainty has led to descriptions of several hundred to several thousand miRNAs in the human genome [45, 46]. Remarkably, when structural criteria for the annotation and nomenclature of human miRNA genes were recently updated, only a small proportion of the previously reported human miRNAs was recognized: about one third of the 1881 putative miRNA entries in the widely used online repository miRBase (last updated June 2014) were considered bona fide miRNAs [36]. Consequently, the 523 currently accepted human miRNAs represent a solid foundation for future studies, but a reassessment of all published miRNAs not yet listed in miRBase is desirable to further expand our understanding of the human microRNAome.

Box 1: miRNA structural features

• miRNAs are between 20 and 26 nt long

• They are genome-encoded and derive from a hairpin precursor that shows imperfect complementarity (~16 nt)

• Mature products of both hairpin arms are expressed (mature, co-mature or star sequence)

• Show a 5′ read homogeneity in 90 % of the reads

• Display a 2 nt offset on both ends which is a consequence of Drosha/Dicer processing

• Mature miRNA sequences usually start with A or U.

• Flanking region upstream shows UG motif at Position 14, loop shows UGU motif at the 3′ end of the 5′ arm, and flanking region downstream shows CNNC motif at Position 17–18.

• At least some miRNAs of any higher animal taxon are representatives of phylogenetically conserved miRNA families and show very high sequence similarities.
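Some of the criteria in Box 1 lend themselves to a simple automated check. The sketch below is a minimal illustration under stated assumptions: it covers only the length window, the preference for a 5′ A or U, and 5′ read homogeneity, and it interprets the last criterion as "at least 90 % of the reads share the same 5′ start position". The function names, the returned keys, and the example values are hypothetical; precursor structure, the 2-nt offsets, and the flanking motifs are not evaluated here.

```python
# Minimal sketch of three Box 1 criteria (an illustration, not the annotation pipeline).
from collections import Counter


def five_prime_homogeneity(read_starts: list[int]) -> float:
    """Fraction of reads that share the most frequent 5' start position."""
    if not read_starts:
        return 0.0
    most_common_count = Counter(read_starts).most_common(1)[0][1]
    return most_common_count / len(read_starts)


def check_candidate(mature: str, read_starts: list[int],
                    min_homogeneity: float = 0.9) -> dict[str, bool]:
    """Evaluate a candidate mature miRNA against three simplified criteria."""
    return {
        # Box 1: mature miRNAs are between 20 and 26 nt long.
        "length_20_26": 20 <= len(mature) <= 26,
        # Box 1: mature sequences usually start with A or U (a soft criterion).
        "starts_with_A_or_U": mature[:1].upper() in ("A", "U"),
        # Assumed reading of "5' read homogeneity in 90 % of the reads".
        "five_prime_homogeneous": five_prime_homogeneity(read_starts) >= min_homogeneity,
    }


# Toy example: 19 of 20 reads start at the same position.
print(check_candidate("UGAGGUAGUAGGUUGUAUAGUU", [10] * 19 + [11]))
```

Returning one flag per criterion, rather than a single verdict, keeps the soft criteria (such as the 5′ nucleotide) visible instead of silently rejecting a candidate.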

2.1.2 siRNAs

Endo-siRNAs are double-stranded RNAs (dsRNAs) of 21–26 nt that are cleaved from longer dsRNA intermediate precursors derived from repetitive sequences, sense–antisense pairs (derived from transposons), or long stem-loop structures [47, 48]. The endo-siRNA biogenesis pathway in humans is DICER-dependent (although Drosha-independent) and involves Argonaute proteins (AGO2) [47, 49]. Endo-siRNAs originate from diverse genomic locations and have been implicated in post-transcriptional (mRNA cleavage) and epigenetic silencing of protein-coding genes and transposon-derived ncRNAs, respectively, as well as other unclear functions [49]. In contrast to miRNAs, endo-siRNAs bind only RNA molecules containing perfectly complementary sequences [32].

2.1.3 PIWI-interacting RNAs

Another important family of RNAi comprises piRNAs [50], 24–30-nucleotide ssRNAs derived from single-stranded RNA precursors transcribed from intergenic repetitive elements, transposons, or large piRNA clusters [47]. Their biogenesis is Drosha-/DICER-independent and requires Piwi proteins of the Argonaute/Piwi family [50, 51]. During piRNA biogenesis, piRNA precursors undergo nuclear processing and export, primary or cyclic secondary processing (the ping-pong cycle, catalyzed by the PIWI proteins MILI and MIWI2), and PIWI ribonucleoprotein complex (piRNP) assembly [21, 52]. The ping-pong amplification cycle generates antisense piRNAs capable of suppressing the transcript of origin [52]. The assembly of the piRNP is essential to establish post-transcriptional regulation and transposon modulation. The functions of piRNAs are connected to their origin: if derived from transposons, piRNAs are implicated in regulating cognate transposon activity, whereas piRNAs resulting from piRNA clusters are involved in gene expression control [52, 53]. piRNAs were primarily found in germ cells, but recent studies have recognized that piRNAs are expressed in somatic cells, including non-tumorous and tumorous tissues from 11 organs [54].

Current views suggest that both endo-siRNAs and piRNAs are defensive mechanisms against nucleic acid-based parasites, acting as guardians of the genome. However, neither endo-siRNAs nor piRNAs are considered cancer-related genes, and consequently, additional data are needed to ascertain the true relevance of these ncRNA families in tumorigenesis [54, 55]. Remarkably, these three major families of sncRNAs associate with different AGO protein subclades to perform sequence-specific gene silencing [35].

2.1.4 snoRNAs

Small non-coding RNAs exert far more biological regulation than RNAi silencing alone. One of the first classes of small ncRNAs to be described was the snoRNAs. SnoRNAs are 60–300 nt long, mainly localized in the nucleolus, and encoded by introns of coding and non-coding genes [56, 57]. Their function is to guide the post-transcriptional modification of ribosomal RNAs and some spliceosomal RNAs, with a few others involved in nucleolytic processing of the original rRNA transcript [57, 58]. Two subdivisions of snoRNAs are known to exist and are involved in two different types of RNA post-transcriptional modification. The C/D box snoRNAs define the target sites for 2′-O-ribose methylation, and the H/ACA box snoRNAs demarcate the target sites for pseudouridylation [57, 59]. C/D box and H/ACA box snoRNAs are structurally distinct, and those differences determine the binding of specific proteins to form the small nucleolar ribonucleoprotein (snoRNP) complexes that identify and modify the cognate targets [60]. During processing of rRNA, snoRNA guide sequences hybridize to the target rRNA and lead the snoRNP to direct the modification of ribose 2′-hydroxyl groups or the isomerization of uridines to pseudouridines within pre-rRNAs [58, 61]. Dyskerin is the enzyme recruited by H/ACA box snoRNAs to catalyze pseudouridylation at specific ribonucleotides, whereas C/D box snoRNA activity requires the methyltransferase fibrillarin to mediate the 2′-O-methylation [59]. The reactions generally occur at conserved sites in nascent rRNAs [58, 61]. In addition to catalyzing nucleotide modification, snoRNP association with pre-rRNAs may also serve to chaperone correct RNA folding for rRNA processing and ribosome assembly [28].

Additionally, snoRNAs have other functions (e.g., small Cajal body RNAs [62]), and it has recently been found that snoRNA loci may also produce miRNA-like small RNAs [63, 64], uncovering a putative complex crosstalk between snoRNA-guided RNA processing and RNAi pathways. Strikingly, new evidence implicates snoRNAs as controllers of cell homeostasis, and snoRNA dysregulation may thus contribute to carcinogenesis [57, 65].

2.1.5 Small RNAs incertae sedis

Although the previously described sncRNA families are relatively well understood from a biological standpoint, others are still poorly characterized. These include sncRNAs resulting from gene regulatory regions and gene boundaries [subclasses of promoter-associated small RNAs, such as transcription initiation RNAs (tiRNAs)], termini-associated short RNAs, antisense termini-associated short RNAs, and splice-site RNAs (spliRNAs) [66]. Others are structural components of chromosomes: the centromere-associated RNAs and telomere small RNAs. Additionally, some small RNAs are cleavage sub-products of other ncRNAs [e.g., transfer RNA-derived RNA fragments (tRFs)] or are derived from different sources (mitochondrial ncRNAs and miRNA-offset RNAs) [12, 66]. tRFs are among the most abundant sncRNAs, thought to be present in most organisms and generated by ribonucleolytic processing of tRNAs by Dicer and RNase Z [67]. The multiple tRF classes are defined according to the position of the tRNA cleavage site that gives rise to each tRF. Among the known classes, the most prominent include 5′- and 3′-tRNA halves (cleaved in the anti-codon loop), 5′- and 3′-tRFs (also known as 3′CCA tRF), and 3′U tRFs [68]. The stress induction of tRFs results in stress granule assembly and inhibition of protein synthesis, linking tRFs to cell homeostasis through control of cell proliferation and mediating RNA inactivation through Argonaute engagement [68].

To further ascertain the specific biological roles of these enigmatic small RNAs, functional studies are needed. For instance, when deleting tiRNAs associated with binding sites for RNAPOLII CTCF binding factor, there is a dramatic alteration in CTCF binding and nucleosome density at genomic loci proximal to sites of tiRNA biogenesis [69]. Further research is thus required to dissect the evolution, biogenesis, and functions of these small ncRNA classes and to explore their potential connections with cancer.

2.2 Long non-coding RNAs

According to GENCODE annotation v7, there are 20,687 protein-coding genes, and in total, GENCODE-annotated exons of genes cover 2.94 % of the genome, or 1.22 % for protein-coding exons [5]. These data clearly point out that the vast majority of the human genome is transcribed not into a biochemically active RNA but rather into a structural ncRNA. Transcripts lacking the capacity to code for a protein are uniformly abundant in all organisms, from yeast to humans [16]. There is growing evidence that ncRNAs have biological functions and operate through defined mechanisms. However, this compelling abundance of ncRNAs triggered a discussion as to whether ncRNAs are a genuine output of transcription, ordinary by-products of the transcriptional system, or simply a methodological artifact [12, 70]. Thanks to global efforts, it has been possible to assign specific features that define lncRNAs as distinct transcripts: the vast majority of lncRNAs are generated by the same transcriptional machinery as mRNAs, as emphasized by RNA polymerase II occupancy and histone modifications associated with transcription initiation (promoter, H3K4me3) and elongation (H3K36me3 in the gene body) [16, 30]. lncRNAs possess a 5′ terminal methylguanosine cap and are often spliced via canonical genomic splice site motifs; some of them are polyadenylated whereas others are not. Alternative pathways also contribute to the generation of lncRNAs, such as non-polyadenylated lncRNAs that are likely expressed from RNA polymerase III promoters [16]. Not only are lncRNAs regulated by well-established transcription factors, but they are also frequently expressed in a tissue-specific manner (Table 1) [30].

Generally, lncRNAs are expressed in lower amounts than their protein-coding counterparts, making them difficult to detect robustly in clinical samples [12, 16]. Consistent with the many regulatory functions assigned to lncRNAs, the low expression may restrict these lncRNAs to subtle or redundant roles, or reflect incomplete repression in nonspecific cells [16, 66]. lncRNA expression shows higher cell specificity than that of protein-coding genes, consistent with the proposed role of lncRNAs in architectural regulation, in which each cell displays a unique transcriptome [16]. The organization of lncRNA loci in the genome revealed transcriptional complexity, as lncRNA genes often display large numbers of isoforms. Moreover, lncRNAs are often organized in association with protein-coding genes, and half of the protein-coding genes have complementary non-coding antisense transcription, further expanding the complexity of genome transcriptional dynamics. lncRNAs may be transcribed from intergenic regions [large intergenic ncRNAs (lincRNAs)]; in antisense, overlapping, intronic, and bidirectional orientations relative to protein-coding genes (Fig. 4); from gene regulatory regions (UTRs, promoters, and enhancers); and from specific chromosomal domains (telomeres), or they may be derived from the mitochondrial genome [12, 16, 66].

Fig. 4

Descriptive structure of a long non-coding RNA locus. Normally, lncRNAs are defined by their location relative to protein-coding genes in the vicinity. Antisense lncRNAs initiate transcription inside or 3′ of a protein-coding gene. They are transcribed in the opposite direction of protein-coding genes, overlapping any portion of an mRNA. Intronic lncRNAs initiate inside an intron of a protein-coding gene in either direction and terminate without overlapping exons. Bidirectional lncRNAs are transcripts that initiate in a divergent fashion from the promoter of a protein-coding gene; the precise distance cutoff that constitutes bidirectionality is not defined but is generally within ~100 base pairs. Intergenic lncRNAs (also termed large intervening non-coding RNAs or lincRNAs) are lncRNAs with separate transcriptional units from protein-coding genes. A key structural feature is that lincRNAs need to be 5 kb away from protein-coding genes. LincRNA genes are preferentially found within 10 kb of protein-coding genes. Overlapping lncRNAs are defined as lncRNA transcripts that encompass a protein-coding gene within the “intron” of a lncRNA or as lncRNAs that overlap the intron of a protein-coding gene

lncRNAs act through a multitude of regulatory mechanisms according to their specific location in the cell. lncRNAs play a role as organizing factors in the dynamic nuclear organization that shapes the cell nucleus through nucleosome remodeling [71, 72]. Nuclear lncRNAs might be involved in gene-to-gene interactions either locally or in the context of cross-chromosome interactions, i.e., cis- and trans-mediated regulatory roles, respectively [6, 10]. In most cases, nuclear lncRNAs function by recruiting chromatin-remodeling complexes to particular DNA loci [22], as they have been shown to form ribonucleoprotein (RNP) complexes by recruiting DNA methyltransferases, the Polycomb repressive complex (PRC) 2 (which promotes H3K27 trimethylation) [22], and H3K9 methyltransferases, resulting in the formation of repressive heterochromatin and transcriptional inhibition. However, lncRNAs are also associated with transcriptional activation through the engagement of chromatin-modifying complexes, including H3K4 methyltransferases and specific transcription factors, and the recruitment of POLII [73–75]. Nuclear lncRNAs may also bind and sequester transcription factors away from their target chromosomal regions, thus indirectly impairing gene expression [76].

Nonetheless, a significant number of lncRNAs are transferred to and lodged in the cytoplasm. Functions of cytoplasmic lncRNAs include the regulation of protein localization, mRNA translation, and mRNA stability. By recognizing the target through base pairing, they can modulate mRNAs at different levels: (a) stabilization (e.g., base pairing between BACE1 and BACE1-AS stabilizes the target mRNA and increases BACE1 protein expression) [77], (b) repression of translation (e.g., lincRNA-p21 suppresses target mRNA translation) [78], and (c) competition for miRNA binding as competing endogenous RNAs (ceRNAs) [79]. This regulatory system, in which multiple RNAs (coding genes, pseudogenes, and lncRNAs) may crosstalk and compete for shared miRNA binding, is thought to be relevant for many processes, including cancer [79]. Moreover, circular RNAs (circRNAs) also function as miRNA “sponges” [80]; given that linear ceRNAs have a short half-life, circRNAs provide superior stability, and their turnover can be controlled by the occurrence of a perfect miRNA binding site (Fig. 3b) [81, 82].

Another function of cytoplasmic lncRNAs is related to protein localization: lncRNAs contain distinct domains that interact with specific protein complexes and, through a combination of domains, bring specific regulatory components into proximity, resulting in the formation of a specific functional complex that coordinates gene expression [83]. Additionally, lncRNAs not only act as decoys for sncRNAs but may also function as precursors of sncRNAs, including small nucleolar RNAs (snoRNAs) and miRNAs [84].

2.2.1 Natural antisense transcripts

Natural antisense transcripts (NATs) are endogenous RNAs that partially or totally overlap transcripts (either coding or non-coding) originating from the opposite DNA strand [85]. NATs can originate from independent promoters, shared bidirectional promoters, or cryptic promoters situated within genes [85]. Depending on their orientation relative to the sense transcript, overlapping pairs are classified as head-to-head (HTH, 5′-regions overlap), tail-to-tail (TTT, 3′-regions overlap), embedded (EMB, one transcript is fully contained within the other), or intronic (INT) pairs [85, 86]. NATs function locally (preferentially in the nucleus) or distally (in the cytoplasm) [85] and are usually not abundant (around 10-fold lower abundance than the associated mRNA) [85].
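The pair classes above can be made concrete with a small coordinate-based sketch. This is an illustration only, not a method taken from the cited studies: the `Transcript` type, the `classify_nat_pair` function, the inclusive coordinate convention, and the rule that an INT pair is a contained transcript touching no exon of its host are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical minimal transcript record (inclusive genomic coordinates)."""
    start: int                                  # leftmost coordinate
    end: int                                    # rightmost coordinate
    strand: str                                 # "+" or "-"
    exons: Tuple[Tuple[int, int], ...] = ()     # optional exon intervals


def classify_nat_pair(a: Transcript, b: Transcript) -> Optional[str]:
    """Return "HTH", "TTT", "EMB", "INT", or None if the spans do not overlap."""
    if {a.strand, b.strand} != {"+", "-"}:
        raise ValueError("a sense/antisense pair must lie on opposite strands")
    if a.end < b.start or b.end < a.start:
        return None  # no overlap, so not a NAT pair

    # Containment: one transcript lies entirely within the other.
    for outer, inner in ((a, b), (b, a)):
        if outer.start <= inner.start and inner.end <= outer.end:
            touches_exon = any(s <= inner.end and inner.start <= e
                               for s, e in outer.exons)
            # Assumption: call the pair intronic only when exon annotation
            # exists for the host and none of its exons is overlapped.
            if outer.exons and not touches_exon:
                return "INT"
            return "EMB"

    # Partial overlap: the 5' end of a "+" transcript is its start, while the
    # 5' end of a "-" transcript is its end.
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    # If the "-" transcript begins to the left, the shared region holds both
    # 5' ends (head-to-head); otherwise it holds both 3' ends (tail-to-tail).
    return "HTH" if minus.start < plus.start else "TTT"
```

For example, a plus-strand transcript spanning 100–200 and a minus-strand transcript spanning 50–120 share their 5′ regions and are reported as HTH. Real annotations would call for per-isoform handling and a more careful definition of intronic pairs.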

Although NATs clearly typify cis-regulation, locally affecting alleles on the DNA strand from which they are produced, they also act in trans because they can interact with other loci by taking advantage of the 3-D organization of chromatin [86]. Cis-regulation is due to antisense transcription in a given locus, whereas trans-regulation is mediated by the antisense transcript corresponding to the RNA being transcribed [85]. Cis-acting NATs function either locally (e.g., in promoter–gene interactions) or distally (e.g., in enhancer–gene interactions). Local cis-regulation comprises epigenetic alterations proximal to a target gene (e.g., regulation of transcription initiation by affecting DNA methylation), whereas distal cis-regulation requires RNA-RNA interactions among transcripts originating from the same locus [85]. Moreover, when NATs remain at their loci of origin, they can mediate cis effects through the formation of R-loops, triple helices, or stalled polymerases. The functional output of cis-regulation by NATs is activation or silencing of the corresponding sense mRNA, via transcriptional activation or silencing, mRNA stabilization, alternative splicing, or post-translational regulation [86]. Because antisense and sense transcripts are transcribed from the same locus, it is suggested that antisense transcripts commonly function in cis whereas other ncRNAs commonly function in trans, although there is evidence for trans-acting antisense transcripts. Antisense transcription might be far more extensive than previously anticipated, with around 50 % of sense transcripts having antisense partners [86]. Interestingly, the genomic distribution of NATs suggests that they might act as self-regulatory loops that control their own expression.

2.2.2 Enhancer elements and RNAs

Enhancers are non-coding genomic regions that activate transcription of target genes at long distances. Mammalian genomes contain hundreds of thousands of putative enhancer elements, located upstream and downstream of coding target gene promoters, which are critical for cell-specific gene expression programs [87, 88]. Enhancers are also considered transcription units, giving rise to a class of lncRNAs, the eRNAs [88]. Histone modification signatures characterize enhancer-like regions, including enrichment of H3K4me1 and H3K4me2 and reduced levels of H3K4me3, compared to promoters [87, 88]. eRNAs exhibit a 5′ cap, are usually neither spliced nor polyadenylated (although polyadenylated eRNAs exist), and can be produced as unidirectional or bidirectional transcripts. A growing body of evidence suggests that eRNAs are functionally important per se and contribute to enhancer-mediated transcriptional activation of target genes. Primed enhancers are marked with H3K4me1 and H3K4me2 and lack histone acetylation. The repressive histone modification H3K27me3 marks enhancers that are considered to be poised. In contrast, active enhancer regions are enriched for H3K27ac and are bound by actively transcribing RNA Polymerase II (PolII) [88]. eRNA expression is a hallmark of active enhancers, and it has been used as a signature to identify those regions through transcriptomic profiling. The mechanisms by which eRNAs regulate gene expression are not completely clear, but it has been hypothesized that they may stabilize enhancer-promoter looping and facilitate PolII recruitment and its transition into productive elongation [87]. As such, eRNAs are likely to have important functions in many regulated programs of gene transcription, including those mediated by androgen receptor [89], p53, and ERα (ESR1) [90].
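The primed, poised, and active states described above amount to a simple decision rule over histone marks, sketched below. This is a deliberately simplified illustration: the function name `classify_enhancer`, the use of presence/absence calls instead of quantitative enrichment, the precedence of H3K27ac over H3K27me3 when both are called, and the omission of PolII occupancy are all assumptions made here rather than criteria stated in the cited work.

```python
from typing import Iterable


def classify_enhancer(marks: Iterable[str]) -> str:
    """Assign a coarse enhancer state from histone marks called as present."""
    present = set(marks)

    # Enhancer-like regions are enriched for H3K4me1 and/or H3K4me2.
    if not present & {"H3K4me1", "H3K4me2"}:
        return "not enhancer-like"

    # Active enhancers carry H3K27ac (assumed to take precedence here).
    if "H3K27ac" in present:
        return "active"

    # Poised enhancers carry the repressive mark H3K27me3.
    if "H3K27me3" in present:
        return "poised"

    # Remaining H3K4me1/2-marked regions lack acetylation: primed.
    return "primed"
```

In practice, mark calls would come from thresholded ChIP-seq signal, and eRNA expression could be added as further evidence for the active state.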

2.2.3 Long intervening non-coding RNAs

Long intervening non-coding RNAs (lincRNAs), also called long intergenic non-coding RNAs, are lncRNAs that do not overlap exons of either protein-coding or other non-lincRNA types of genes [19]. They are transcribed from multiple loci in the human genome and are located in the nucleus, although they are more frequently reported in the cytoplasm. lincRNAs lack defining sequence or structural characteristics, as they combine multiple classes of non-coding RNAs (such as intronic and intergenic genes) (Fig. 4). However, a few common features are observed, including a small number of exons (normally 2–3), which makes them shorter than mRNAs [19]. The average length of lincRNA exons is no larger than that of their counterparts in protein-coding genes (PCGs). Although transcriptional regulation, chromatin modifications, and splicing signals are similar to those of PCGs, lincRNAs seem to be less efficiently spliced. Interestingly, lincRNAs significantly overlap repetitive elements, probably because lincRNA functions are more tolerant to retrotransposon insertions. Repetitive elements were reported to play important mechanistic roles in lincRNA function, enabling base pairing with other RNAs containing repeats from the same family. Finally, median lincRNA levels are only about a tenth of those of mRNAs, and lincRNA expression is typically more variable among tissues and enriched in testis and brain. lincRNA functions include co-transcriptional regulation, regulation of gene expression (both in cis and in trans) by bridging proteins and chromatin, scaffolding of nuclear and cytoplasmic complexes, and RNA-RNA interactions. Consequently, lincRNAs are believed to play a widespread role in gene regulation and maintenance of cell homeostasis [19].

2.2.4 Pseudogenes

Pseudogenes are ancestral copies of protein-coding genes that arose from genomic duplication or retrotransposition of mRNA sequences into the genome, followed by accumulation of deleterious mutations due to loss of selection pressure [91]. A pseudogene shares an evolutionary history with a functional protein-coding gene, but it has been mutated through evolution to contain frameshifts and/or stop codon(s) that disrupt the open reading frame [92]. Pseudogenes pervade the genome and retain close sequence similarity with their cognate genes. There are three main types of pseudogenes: (a) unitary pseudogenes, which are species-specific unprocessed pseudogenes without a parent gene in the same species but with an active orthologue in another species; (b) processed pseudogenes, which appear to have been produced by integration of a reverse-transcribed mRNA into the genome; and (c) unprocessed pseudogenes, which may contain introns because they result from gene duplication [92]. Pseudogenes are of capital importance owing to their action as competing endogenous RNAs (ceRNAs), i.e., natural miRNA “sponges” [79]. Moreover, they may also regulate the expression of their parent gene by decreasing mRNA stability of the functional gene through their own overexpression [93]. Interestingly, pseudogenes are also a source of small interfering RNAs, impacting gene expression through the RNAi pathway [94] and by generating antisense transcripts [95].

3 Non-coding RNAs in prostate cancer

The constantly expanding inventory of ncRNAs has been implicated in a broad spectrum of processes, including prostate homeostasis and pathogenesis. The emergence of ncRNAs is of crucial importance for prostate biology because prostate cells are transcriptionally active, and numerous reports have documented the deregulation of ncRNAs in prostate cancer (PCa) [27, 30]. PCa is the most commonly diagnosed cancer among men worldwide and a major cause of morbidity and mortality [96]. Although radical prostatectomy reduces mortality among men with localized PCa, up to 40 % of patients experience disease progression and recurrence [97]. Numerous studies using mRNA-based techniques have contributed to a better understanding of the molecular pathways involved in prostate carcinogenesis [98]. It is likely, then, that unraveling the biological functions of ncRNAs in PCa will provide new insights into their mechanisms of action and potential usefulness as tools for PCa management [27]. Despite the myriad of ncRNA families described thus far, only a small proportion is known to be involved in PCa. The best examples include small RNA families, mainly miRNAs, and some lncRNAs, including eRNAs and antisense RNAs. Other classes are under active investigation, including pseudogenes, lincRNAs, and tRNAs. These ncRNAs not only control functional pathways of cell biology but may also constitute novel therapeutic targets or diagnostic biomarkers. In the following sections, we review the rapidly growing knowledge on ncRNAs as key players in prostate tumorigenesis and highlight their potential for translation into the clinic.

3.1 Small non-coding RNAs

To date, the most extensively studied sncRNAs in PCa are miRNAs. These are classified as oncomirs (when their expression favors tumor development) or tumor suppressor miRNAs (when their expression normally counteracts tumor initiation and/or development) and play a critical role in PCa [58]. Dysregulation of miRNAs in cancer may occur through epigenetic changes (commonly, promoter CpG island hypermethylation) or genetic alterations, as well as through dysfunction of the miRNA biogenesis machinery, which subsequently affects transcription of primary miRNAs, their processing into mature miRNAs, and/or their interactions with mRNA targets [59].

3.1.1 Dysregulation of microRNAs in PCa

NGS-based profiling, which enables high-throughput analysis of the miRNAome with single-base resolution [99], revealed common downregulation of miR-205, miR-143, and miR-145 and upregulation of miR-375 and miR-148, among others (Table 2). However, due to the inherent heterogeneity of PCa and to differences in sample selection and technological platforms, some discrepancies in results are apparent. Downregulation of the miR-15a-miR-16-1 cluster (putative tumor suppressors through targeting of BCL2, CCND1, and WNT3A) in PCa due to 13q14 deletion is commonly acknowledged [135]. Deletion of this cluster fuels survival, proliferation, and invasion of PCa cells, whereas in vivo overexpression results in growth arrest, apoptosis, and marked regression of PCa xenografts [135]. Strikingly, in vitro blockade of miR-15a and miR-16-1 promotes survival, proliferation, and invasiveness of previously untransformed prostate cells, which become tumorigenic in immunodeficient NOD-SCID mice [135]. On the other hand, loss of miR-101 contributes to overexpression of EZH2, linking PCa progression to altered epigenetic programming [136]. Indeed, one or both genomic loci encoding miR-101 [located in chromosome 1 (MIR101-1) and in chromosome 9 (MIR101-2)] are lost in a sizeable proportion of primary PCa and in up to 2/3 of metastatic PCa [136]. In DU145 cells, forced expression of miR-101 impairs cell invasion and reduces tumor growth in a mouse xenograft, whereas miR-101 re-expression globally decreases H3K27me3 histone mark levels at the promoters of PRC2 target genes, demonstrating that manipulation of miR-101 expression may be of therapeutic usefulness [136].

Table 2 Representative microRNA in PCa biology

miR-34a, a p53 target [137], is downregulated due to promoter methylation [138] and underexpressed in CD44+ PCa cells purified from xenografts and primary tumors [139]. Its overexpression in pooled or purified CD44+ PCa cells inhibits clonogenic expansion, tumor regeneration, and metastasis, whereas delivery of miR-34a antagomirs to CD44− PCa cells has the opposite effects [139]. These effects are mediated by CD44, a target of miR-34a, and CD44 silencing phenocopies miR-34a overexpression [139]. Additionally, miR-34 cooperates with p53 to counteract cancer progression and jointly regulates prostate stem/progenitor cell activity [139]. This action is also mediated by MET, a mutual p53/miR-34 downstream target and a critical regulator of the stem cell compartment [139]. This suggests a therapeutic potential for miR-34a against PCa by directly acting on cancer stem cells [139].

One of the main focuses of miRNA research in PCa is the AR signaling pathway. Not only is AR targeted by multiple miRNAs, but it also modulates miRNA expression through androgen-responsive elements within miRNA promoter regions [140]. miR-21 is overexpressed in primary PCa and in the DU145 and PC-3 PCa cell lines [141], and AR binding to its promoter enhances transcriptional activity, promoting hormone-dependent and hormone-independent PCa growth [142]. miR-21 inhibition using antisense oligonucleotides does not affect proliferation, although it increases sensitivity to apoptosis and inhibits cell motility and invasion, as miR-21 targets MARCKS, a gene with a role in cell motility [143]. In turn, miR-21 overexpression represses BTG2, thereby inducing expression of luminal markers and promoting epithelial-mesenchymal transition (EMT) [144]. Increased miR-21 expression is associated with shorter biochemical recurrence-free survival and predicts biochemical recurrence in PCa patients submitted to radical prostatectomy [145]. The list of AR-regulated miRNAs also includes miR-27a, miR-141, miR-101, and miR-125b [146]. Conversely, miR-135b, miR-185, miR-297, miR-34a, miR-34c, miR-421, miR-634, miR-654-5p, and miR-9 influence androgen signalling by targeting AR [140]. Thus, miRNAs are involved in hormone-dependent and hormone-independent PCa growth and constitute putative therapeutic tools to inhibit AR function and androgen-dependent cell growth in PCa.

PTEN is a tumor suppressor that antagonizes PI3K/AKT signalling, and its expression is frequently abrogated in PCa. Decreased PTEN abundance due to upregulation of the miR-106b~25 cluster (caused by genomic amplification) and of miR-22 in PCa is critical for malignant transformation of prostate cells [147]. In DU145 cells, stable overexpression of pri-miR-22 markedly increased colony formation, proliferation, and tumor growth, and caused over-stimulation of the AKT pathway in xenografts [147]. The same effects are apparent when the miR-106b~25 cluster is stably expressed in PCa cells, leading to decreased PTEN abundance and activity [147]. Strikingly, that miRNA locus also collaborates with its host gene, MCM7, to promote malignant transformation. In nude mice, cells overexpressing the miR-106b~25 cluster formed larger tumors than control cells [147]. Moreover, miR-22 and the miR-106b~25 cluster cooperate with c-MYC, further emphasizing their proto-oncogenic properties. Indeed, transcription of MCM7 and, consequently, of the miR-106b~25 cluster is enhanced by c-MYC, suggesting that the oncogenic activity of c-MYC may also involve transcriptional activation of PTEN-targeting miRNAs [147].

MicroRNAs might also be involved in the development of PCa bone metastasis, as loss of miR-15 and miR-16 and increased miR-21 expression stimulate dissemination and bone marrow colonization through aberrant TGF-β and Hedgehog signalling [141].

In exosomes derived from PCa bulk and cancer stem cells (CSC), miR-100-5p and miR-21-5p were the most abundant miRNAs in both cell types, among 1839 miRNAs isolated [148]. Strikingly, biological processes controlled by the differentially expressed miRNAs in bulk exosomes were related to fibroblast growth, epithelial proliferation, and EMT through MMPs activation, whereas those from CSCs exosomes controlled proliferation, epithelial differentiation, and angiogenesis [148]. Overexpression of miR-100-5p, miR-21-5p, and miR-139-5p in a normal prostate fibroblast cell line (WPMY-1) resulted in increased expression of MMPs [148], with a predominant effect of miR-21-5p on MMP9 and of miR-100-5p on MMP2 and MMP13, whereas miR-139 induced expression of all MMPs. Ultimately, transfection of those miRNAs significantly increased RANKL expression, which induces cell proliferation, emphasizing that miRNAs contained in exosomes may play a significant role in cancer invasion and metastasis [148].

It was recently demonstrated that PCa-derived adipose stem cells (pASCs) stimulated with conditioned media or exosomes (isolated from PC-3 and C4-2B cells) induced prostate-like neoplastic lesions in vivo [99]. The oncogenic stimulation of pASCs might be a consequence of RNA transfer by PCa-derived exosomes and activation of oncomiRNAs (e.g., miR-125b, miR-130b, and miR-155), along with oncogenic factors (e.g., H-RAS and K-RAS) [99]. In fact, expression of miR-125b and miR-130b promoted downregulation of the tumor suppressors Lats2 and PDCD4 in pASCs exposed to PCa-derived exosomes. Functionally, pASC tumors acquired cytogenetic aberrations and mesenchymal-to-epithelial transition features and expressed neoplastic markers reminiscent of the molecular features of PCa xenografts [99]. Due to their plasticity and cargo potential, PCa-derived exosomes might play a critical role in clonal expansion of tumors through neoplastic reprogramming of tumor-ASCs in cancer patients. This also emphasizes that deregulated expression of oncomiRs causes oncogenic transformation of pASCs due to disruption of transcriptional networks of tumor suppressor genes [99]. Further research on other ncRNA families and different prostate cells (e.g., basal, luminal, and fibroblasts) might help to understand how exosomes are involved in crosstalk between tumor and stromal cells to synergistically promote tumor progression and drug resistance.

3.1.2 Small nuclear and nucleolar RNAs in PCa

The role of other small ncRNAs in prostate tumorigenesis has also been investigated. snoRNA U50 is mutated and downregulated in PCa, and a homozygous 2-bp (TT) deletion was identified both in PCa cell lines and in primary tissues. Ectopic expression of snoRNA U50 abrogates colony formation, a feature associated with tumor suppression [135].

The nucleolar protein dyskerin (DKC1) catalyzes pseudouridylation of rRNA, and it is also required for the formation of hTR, the RNA component of telomerase. Compared to benign tissues, DKC1 mRNA levels were higher in PCa samples, especially in lymph node metastases [136]. siRNA-mediated depletion of DKC1 decreased proliferation of prostate cells [136], suggesting that deregulation of the snoRNA machinery is important for prostate carcinogenesis.

Using a deep sequencing approach to characterize the small non-coding RNA transcriptome, an increase in global snoRNA and tRNA expression was shown in PCa metastatic to lymph nodes compared with primary PCa, suggesting a possible oncogenic role for snoRNAs, particularly in more advanced tumors [138]. In addition, snoRNAs and tRNAs are strongly differentially expressed between PCa and normal prostate samples [138]. Furthermore, snoRNA-derived RNAs (sdRNAs) display higher differential expression than miRNAs and are greatly upregulated in PCa. Using qPCR, sdRNAs derived from SNORD44, SNORD78, SNORD74, and SNORD81 were shown to be upregulated in PCa. Higher expression levels of SNORD78 and its sdRNA (sd78-3′) were associated with metastatic PCa [138].

Ribosome biogenesis begins in the nucleolus [137]. Here, the ribosomal RNA (rRNA) is transcribed, processed, and assembled into ribosomal subunits [139]. The nucleolus hosts a transcriptional unit encoding a 45S ribosomal RNA precursor that is processed into the mature 18S, 5.8S, and 28S RNA species [140]. The 45S precursor rRNA and the mature 28S, 18S, and 5.8S rRNAs are overexpressed in PCa samples compared to morphologically normal prostate tissues [139]. The mechanism leading to this aberrant expression is not well characterized, but overexpression is apparently not associated with rDNA promoter hypomethylation [139]. In fact, 45S, 18S, and 5.8S rRNA expression levels, which alter nucleolar structure and function, are more closely associated with MYC mRNA levels [139], suggesting that MYC might be involved in rRNA biogenesis. In a different report, MYC was found to be required for rRNA transcription and processing [142]. In PCa cells, MYC binds to the 5′ upstream region of Fibrillarin (FBL), a gene required for rRNA production and processing [142]. FBL is overexpressed in PCa samples, and siRNA-mediated depletion of FBL suppressed cell proliferation and clonogenic survival. Moreover, FBL knockdown decreased the levels of 5.8S, 18S, and 28S rRNAs, whereas only a modest reduction in 45S pre-rRNA was observed [142]. Conversely, MYC knockdown was associated with decreased levels of pre-rRNA as well as of processed rRNAs, indicating that MYC is required for both rRNA transcription and processing [142]. Genome-wide analysis of MYC depletion revealed downregulation of 133 nucleolus-associated genes and of 64 genes associated with rRNA processing [142]. Those comprised fibrillarin, nucleolin, UBF, and nucleophosmin. In addition, overall nucleolar size was reduced after MYC depletion in vitro [142]. Considering these findings, MYC overexpression in PCa cells can drive enhanced de novo nucleolar and ribosomal gene expression, thus fostering the malignant phenotype.

rRNA is crucial for both androgen-dependent and androgen-independent growth of PCa cells. Androgen-AR signaling leads to the accumulation of rRNA in androgen-dependent prostate cells, and angiogenin (ANG) is upregulated in PCa cells, mediating androgen-stimulated rRNA transcription [141]. In androgen-dependent cells, androgen stimulation promotes nuclear translocation of ANG, which then binds to the rDNA promoter and stimulates rRNA transcription [141]. Blocking ANG inhibits androgen-induced rRNA transcription. Moreover, ANG signalling is critical not only for androgen-dependent growth but also for the castration-resistant phenotype. In androgen-insensitive cells, ANG undergoes constitutive nuclear translocation, resulting in continuous rRNA overproduction and thereby stimulating cell proliferation [141].

3.1.3 tRNA-derived RNA fragments

Global expression profile of prostate cell lines revealed that the second most abundant group of sncRNA was that of fragments derived from tRNA, the tRNA-derived RNA fragments (tRFs) [143]. Deep sequencing characterization of LNCaP and C4-2 cell lines disclosed 17 tRNA-related small RNAs, including the most abundant: tRF-1, tRF-3, and tRF-5. For downstream validation, tRF-1001, a member of tRF-1 series, was selected. tRF-1001 is derived from the 3′ end of a Ser-TGA tRNA precursor transcript, which is not retained in the mature tRNA [143].

tRF-1001 is expressed more abundantly in cell lines than in tissues, but its expression decreases upon either starvation or high cell density in DU145 and LNCaP cells. Reduction of cellular metabolism also decreased expression of the tRF-1001 precursor, but the corresponding mature tRNA levels were unaffected. The tRF-1 series of small RNAs are 3′ sequences from pre-tRNAs, released through cleavage by the tRNA endonuclease ELAC2 during the 3′-end maturation of tRNA. Knockdown of ELAC2 decreased tRF-1001 expression, leading to accumulation of the pre-tRNA [143]. tRF-1001 and its precursor tRNA are exclusively localized in the cytoplasm, providing evidence that biogenesis occurs in the cytoplasm, rather than in the nucleus as happens for tRNAs. These data support a functional role for tRFs, setting aside the idea that they are mere by-products of tRNA biochemical processing [143].

Recently, RNA sequencing was used to profile tRFs in fresh frozen tissue samples derived from normal adjacent prostate and PCa at different stages [144]. A total of 598 unique tRFs were identified, and several are deregulated in PCa. Strikingly, 5′-tRFs constitute approximately 75 % of all tRFs detected in prostate tissues. Notably, most of the identified tRFs are derived from the 5′ and 3′ ends of mature cytosolic tRNAs. Nonetheless, tRFs derived from different segments of tRNAs, including pre-tRNA trailers and leaders, as well as tRFs from mitochondrial tRNAs, were catalogued. Globally, 110 tRFs were found deregulated (72 upregulated, 24 downregulated, and 13 upregulated in one group but downregulated in the other group) [144]. Most of the upregulated tRFs were 5′-tRFs and most of the downregulated tRFs were 3′-tRFs. Downstream qPCR validation of 6 different tRFs revealed that 4 tRFs (three 5′-tRFs and one D-tRF) were upregulated (Fig. 3c–f) and 2 tRFs (3′-tRF class) were downregulated in PCa. tRF-544 (isotype Phe, anticodon GAA - tRNAPheGAA) is thought to be associated with aggressive forms or advanced stages of PCa. Interestingly, a high tRF-315/tRF-544 expression ratio was significantly associated with poorer progression-free survival and shorter time to disease relapse.

Sex hormone-dependent tRNA-derived RNAs (SHOT-RNAs) are commonly expressed in AR-positive PCa cell lines [145]. In LNCaP-FGC cells, both 5′- and 3′-tRNA halves from SHOT-RNAAspGUC and SHOT-RNAHisGUG are detected by northern blot, but not in DU145 or PC3 cells, and AR knockdown reduced tRNA expression levels. One of the most abundant SHOT-RNAs detected by Honda et al., 5′-SHOT-RNALysCUU, was knocked down using siRNAs in LNCaP-FGC cells, and the cell growth rate decreased compared to control siRNA [145]. Because levels of mature tRNA were not changed by siRNA transfection, reduced proliferation seems to be solely attributable to the change in SHOT-RNA levels. This strategy was also applied to SHOT-RNAAspGUC and SHOT-RNAHisGUG, and depletion of each SHOT-RNA impaired cell growth as well. Nevertheless, 3′-SHOT-RNAAspGUC depletion failed to impair cell growth [145]. Overall, these data support SHOT-RNAs as functional RNA molecules and indicate that different species of 5′-SHOT-RNA are involved in cell proliferation [145]. Additional studies are required to determine whether 3′-SHOT-RNAs hold functional relevance.

The current understanding of tRFs, however, suggests that they are not merely byproducts of random cleavage of tRNAs, but might act as mediators of translational and/or gene regulation. Although some isolated functions have been indicated, the vast majority of tRFs appear to operate via uncharacterized mechanisms. It has been proposed that 5′- but not 3′-derived tRFs play a role in stress granule assembly or inhibition of protein synthesis in vitro [146]. However, 3′-derived tRFs are able to repress their mRNA targets in a miRNA-like fashion and may exert tumor-suppressive functions [147].

3.1.4 Other short RNAs

Although no direct involvement of piRNAs in PCa has been reported, some genes implicated in piRNA biogenesis are deregulated in PCa. Defects in Tudor-domain proteins significantly impair the piRNA pathway, especially its ping-pong components, although they do not abolish it. Because multiple Tudor-domain-containing proteins exist, one may argue that they exhibit overlapping or redundant roles in the piRNA pathway, explaining the somewhat minor phenotypes of the individual mutants [148].

Tudor domain-containing protein 1 (TDRD1) is a direct target gene of ERG, and its expression strongly correlates with ERG overexpression [149]. Mechanistically, ERG is able to disrupt the tissue-specific DNA methylation pattern at the TDRD1 promoter, resulting in TDRD1 transcriptional activation [149]. Piwil2 has recently been described as an oncogene able to modulate invasion and metastasis, as well as EMT [150]. Of note, global piRNA levels were not assessed to quantify the deregulation caused by TDRD1 and Piwil2 aberrations.

3.2 Long non-coding RNAs

Although research on ncRNAs, and especially on lncRNAs, is still in its infancy, significant roles have been ascribed to some lncRNAs in PCa, and these are summarized in Table 3.

Table 3 lncRNA manipulation and consequential phenotypes in PCa

3.2.1 Antisense regulatory lncRNAs in PCa

The role of dysregulated antisense transcript expression is under investigation in PCa. The polyadenylated antisense transcript ANRIL (encoded by CDKN2B-AS1) is expressed from the tumor-suppressor locus INK4b-ARF-INK4a (9p21.3). ANRIL and CBX7 (a member of Polycomb repressive complex 1, PRC1) are both upregulated in PCa samples [162]. Furthermore, CBX7 is responsible for maintaining silenced chromatin states through recognition of H3K27me3 [162]. CBX7 binds to H3K27me3 and interacts with ANRIL at the INK4b-ARF-INK4a locus. CBX7 employs different regions within its chromodomain for binding to H3K27me3 and to ANRIL RNA, suggesting that both interactions are important for the sustained cis-repression of the locus [162]. Thus, an RNA–protein interaction underlies the ability of PRC1 to repress the INK4b-ARF-INK4a cluster, and its disruption contributes to PCa development by reducing senescence [162]. Interestingly, these data might indicate that the frequent promoter hypermethylation observed at this locus occurs as a secondary event after cell differentiation.

Another NAT with critical impact in PCa cells is CTBP1-AS, an androgen-responsive lncRNA that promotes PCa growth through sense-antisense repression of the transcriptional co-regulator CTBP1 and global epigenetic regulation of tumor suppressor genes [163]. CTBP1-AS upregulation is inversely correlated with CTBP1 expression in primary and metastatic PCa and is associated with high AR expression status. Depletion of the CTBP1-AS transcript abolished the androgen-dependent reduction of CTBP1, indicating that CTBP1-AS directly regulates CTBP1 at the RNA level [163]. Silencing CTBP1-AS reduced LNCaP cell proliferation and in vivo tumor growth, concomitantly with increased CTBP1 expression. Microarray analysis showed that transcriptional activation of androgen-induced genes was diminished by siCTBP1-AS [163]. Interestingly, CTBP1-AS overexpression stimulated cell growth and promoted resistance to growth inhibition by bicalutamide, ultimately enabling in vivo tumor growth after castration. Mechanistically, CTBP1-AS coordinates cis-repression of the CTBP1 promoter, reducing H3Ac and H4K4me levels but not altering repressive marks [163]. CTBP1-AS binds to the HDAC-Sin3A complex and coordinates HDAC-mediated repression by chromatin deacetylation within the CTBP1 promoter in the AR-dependent system. Moreover, CTBP1-AS also interacts with PSF, which binds at the CTBP1 promoter to induce histone deacetylation by HDACs and thereby promote transcriptional repression of CTBP1. Additionally, CTBP1-AS may also act as a trans-acting regulator of androgen-regulated genes by recruiting the HDAC/Sin3A repressor complex via PSF, prompting cell cycle progression by repressing cell cycle regulators (e.g., p53, SMAD3) and modulating global androgen signaling [163].

Another example of antisense gene regulation is the transcriptional control of the tumor-suppressor gene RASSF1A by RASSF1A-antisense RNA 1 (RASSF1A-AS1) [164]. RASSF1A-AS1 is upregulated in PCa cell lines, inversely correlating with RASSF1A expression [164]. RASSF1A-AS1 forms an RNA-DNA hybrid at the RASSF1A promoter and recruits the Polycomb repressive complex PRC2. PRC2 contributes to chromatin compaction by catalyzing the methylation of histone H3 at lysine 27, which is enriched at the RASSF1A promoter, and specifically blocks RASSF1A expression [164].

3.2.2 Enhancers and enhancer RNAs in PCa

Cancer cells display altered expression patterns and enhancer usage in comparison with their normal counterparts [165]. In PCa, enhancer RNAs (eRNAs) have been implicated in assisting AR-mediated signaling, as mediators of enhancer-promoter looping and in altering transcription factor binding (Fig. 5).

Fig. 5 Transcription derived from enhancers is important for long-range transcriptional control. eRNAs are lncRNAs derived from short regions of DNA that enhance the expression of genes at varying distances. Effects can be mediated by binding of transcription factors, such as the androgen receptor (AR), to these sites. AR controls PCa cell-specific gene expression programs through interactions with diverse co-activators and the transcription machinery. Gene activation may involve DNA loop formation between enhancer-bound AR and the transcription machinery at the core promoter. This interaction seems to be mediated by the Mediator complex and cohesin, as they have been reported to interact physically and to functionally connect the enhancers and core promoters of active genes. The eRNAs produced from AR-binding DNA segments facilitate the spatial interaction between enhancer and promoter, ultimately enhancing long-distance transcriptional regulation. Moreover, specific eRNAs might encompass androgen response elements (AREs), supporting AR and Mediator interactions. This mechanism is critical for PCa cells, as androgen-induced eRNAs scaffold the AR-associated protein complex that modulates chromosomal architecture and selectively enhances AR-dependent gene expression involved in PCa initiation and progression.

FoxA1 has been reported to contribute to the enhancer code in PCa cells, as FoxA1 regulates AR genomic targeting by simultaneously anchoring AR to cognate loci and restricting AR from other ARE-containing loci in the human genome [72]. In addition, knockdown of FoxA1 markedly elevated the dihydrotestosterone (DHT) response and caused AR binding to a distinct cohort of enhancers. Global nuclear run-on sequencing (GRO-seq) was applied to understand how differential AR binding is translated into hormonal gene response [72]. After DHT treatment, GRO-seq detected ncRNA expression from a subset of H3K4me1-positive and H3K4me3-negative regions. These differentially expressed eRNAs are largely symmetrical and bidirectional (as depicted for the KLK3 enhancer). Moreover, these AR-activated enhancers marked by increased eRNA expression are responsible for activation of nearby coding transcription units [72]. Chromosome conformation capture (3C) suggested that eRNA induction per se, rather than p300 or MED12 binding, is the most precise mark of functional looping between an activated enhancer and its regulated gene promoter [72]. Moreover, a strong H3K4me2-marked central nucleosome was observed after both DHT treatment and FoxA1 knockdown, suggesting that nucleosome remodeling is not required to induce specific enhancer-promoter looping and subsequent target gene activation [72].

Furthermore, it has been reported that Pol II binds to a large number of intergenic AR-bound enhancers, marked by H3K4me1 and H3K27ac, which produce eRNAs that may regulate neighboring or distantly located genes [89]. This evidence suggests that eRNAs may contribute to an AR-driven looping complex that enhances spatial communication between distal enhancers and target promoters, leading to transcriptional activation of specific genes [89]. The KLK3 enhancer is marked by AR binding, H3K27ac, and H3K4me1, and produces a bidirectional eRNA named KLK3e [89]. Both KLK3 and KLK3e expression are induced by DHT treatment and blocked by bicalutamide, indicating a high correlation of activity-dependent induction between eRNAs and adjacent protein-coding genes. The KLK3e sense strand gives rise to a >2 kb polyadenylated transcript that is substantially more expressed than the antisense transcript. KLK3e facilitates the spatial interaction of the KLK3 enhancer and the KLK2 promoter, enhancing long-distance KLK2 transcriptional activation [89]. KLK3e contains the core enhancer element derived from the androgen response element III (AREIII) required for the interaction of AR and Mediator 1 (MED1). Suppression of either KLK3e or MED1 reduced the interaction of the KLK3/2 loci, supporting a role for MED1 as a mediator of long-range chromatin looping that cooperates with KLK3e in the enhancer-target promoter interaction. Globally, these data suggest that KLK3e forms a functional complex with AR and MED1 that facilitates the association of AR-bound enhancers with promoters, resulting in transcriptional activation of target genes [89]. Supporting this hypothesis, KLK3e expression is significantly correlated with that of KLK3 and KLK2 (R² = 0.62 and 0.59, respectively). Further understanding of how AR-induced eRNAs act as a scaffold for AR-associated protein complexes that selectively modulate chromosomal architecture and gene expression may translate into new RNA-based therapies to improve response to androgen deprivation therapy [89].

Recently, a new role was identified for single-strand nicks, mediated by DNA topoisomerase 1 (TOP1), in relaxing supercoiled DNA at gene enhancers to promote enhancer-dependent transcription [166]. In LNCaP cells, TOP1 was recruited to AR-regulated enhancers in response to androgen treatment. By ChIP-seq, 96 % of the 6545 putative AR-bound enhancers were occupied by TOP1. Of these, 60 % showed an androgen-stimulated increase in TOP1 binding as well as in RNA Pol II occupancy, indicative of active transcription [166]. GRO-seq analysis of serum-starved LNCaP cells treated with DHT identified 644 putative enhancers with significantly upregulated eRNA expression, 74 % of which showed increased TOP1 occupancy. Knockdown of endogenous TOP1 resulted in decreased eRNA expression at 79 % of AR-regulated enhancers, accompanied by lower expression levels of 368 protein-coding mRNAs (including KLK3, KLK2, TMPRSS2, and NDRG1) [166]. Having shown that TOP1 depletion reduces both eRNA and mRNA production for most AR-regulated target genes, the authors found that prior binding by NKX3.1 was required to recruit TOP1 to enhancers following androgen treatment. siRNA depletion of NKX3.1 inhibited recruitment of TOP1 and reduced DHT-dependent upregulation of eRNA expression [166]. Strikingly, depletion of both TOP1 and NKX3.1 reduced DHT-mediated eRNA upregulation at the same AR-bound enhancers, apparently without affecting AR recruitment. This reveals that NKX3.1 and TOP1 occupy the same binding sites at enhancer elements and co-regulate an AR transcription program. The Y723F TOP1 mutant did not restore transcriptional activity in TOP1-depleted cells, suggesting that the nicking activity of TOP1 is required for its effects on enhancer activation [166]. Given that single-strand nicks might lead to the formation of DNA double-strand breaks, several components of the DNA damage response pathway (MRE11, RAD50, and ATR) are recruited to AR-regulated enhancers after DHT treatment and are required for eRNA and protein-coding mRNA transcription. Taken together, these data suggest a common usage of the DNA damage repair machinery to regulate AR-mediated gene transcription, highlighting the complexity of PCa [166].
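For orientation, the percentages reported for the TOP1 experiments can be converted into approximate enhancer counts. The short Python sketch below performs only this arithmetic; because the percentages quoted in the text are rounded, the resulting counts are estimates rather than values reported in [166], and all variable names are illustrative.

```python
# Approximate enhancer counts implied by the rounded percentages quoted in the text.
# Variable names are illustrative; the counts are estimates, not published values.

ar_bound_enhancers = 6545            # putative AR-bound enhancers (ChIP-seq)
top1_occupied_fraction = 0.96        # fraction of those occupied by TOP1
androgen_stimulated_fraction = 0.60  # fraction of TOP1-occupied enhancers with increased binding

top1_occupied = round(ar_bound_enhancers * top1_occupied_fraction)
androgen_stimulated = round(top1_occupied * androgen_stimulated_fraction)

erna_upregulated_enhancers = 644     # enhancers with DHT-upregulated eRNA (GRO-seq)
erna_with_top1_gain = round(erna_upregulated_enhancers * 0.74)

print(f"TOP1-occupied AR-bound enhancers: ~{top1_occupied}")
print(f"...with androgen-stimulated TOP1/Pol II gain: ~{androgen_stimulated}")
print(f"eRNA-upregulated enhancers with increased TOP1: ~{erna_with_top1_gain}")
```

Read this way, roughly 3800 AR-bound enhancers gain TOP1 upon androgen stimulation, whereas only a few hundred show significantly upregulated eRNA by GRO-seq, which helps put the two data sets in proportion.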

In a recent study, using Chem-seq [167], a compound that inhibits cell proliferation in vitro and tumor growth in vivo—SD70—was identified. SD70 binds to AR-bound functional enhancers, regulating DHT-induced gene transcriptional programs [167]. Moreover, it was found that KDM4C binds at AR-regulated enhancers and is recruited in a DHT-dependent fashion. In vitro, SD70 inhibits KDM4C demethylase activity causing elevated H3K9me2 levels at enhancer and promoter regions—a plausible component of the inhibitory effects on DHT target gene expression [167]. These results suggest that targeting enhancer regions has potential therapeutic value for PCa.

3.2.3 lncRNA as master regulators of alternative splicing and translation

Recently, it has been shown that lncRNAs are required to assemble nuclear domains specialized in RNA processing, such as the nuclear speckle and the paraspeckle. The oncogenic lncRNA MALAT1 (also known as NEAT2, located at 11q13.1) is present within the nuclear speckle and spatially re-organizes the actively transcribed genes close to the nuclear speckles, a domain known for its abundance in pre-mRNA splicing factors [10]. Knockdown of MALAT1 disclosed that this lncRNA regulates alternative splicing of multiple genes by controlling the availability of serine/arginine-rich splicing factors in active transcription sites [168]. Interestingly, during post-transcriptional processing of MALAT1, a conserved 3′ tRNA-like sequence generates a short tRNA-like ncRNA called MALAT1-associated small cytoplasmic RNA (MASCRNA), whose function is still unclear. MASCRNA is 61 nt long, is generated by RNase P cleavage, and is then exported to the cytoplasm [169].

On the other hand, the lncRNA NEAT1 is an essential structural element for initiating de novo assembly of paraspeckles, which are believed to be nuclear domains specialized in retention of adenosine-to-inosine edited mRNAs [170]. Inducing transcription of the NEAT1 locus is sufficient to form new paraspeckles at the integration locus. However, active transcription of NEAT1 is necessary to tether the lncRNA to its own transcription locus and carry out this role [170]. Taking into account that MALAT1 and NEAT1 are separated by approximately 70 kb, it is conceivable that coordinated deregulation of both loci may hinder alternative splicing by controlling the nuclear localization of splicing factors, as well as the control of RNA editing and export, further contributing to prostate carcinogenesis [10]. Indeed, both MALAT1 [171] and NEAT1 [172] are overexpressed and possess pro-tumorigenic activity in PCa. MALAT1 overexpression in primary PCa is associated with higher Gleason score, pathological stage, and serum PSA >20 ng/ml [173]. Besides its association with poor prognosis, MALAT1 expression is significantly increased in castration-resistant PCa (CRPC) compared to hormone-sensitive PCa [173]. siRNA-mediated knockdown of MALAT1 in 22RV1 and LNCaP-AI cells arrested the cell cycle at G0/G1 phase and inhibited migration and invasion [173]. RNAi silencing of MALAT1 in PCa xenografts of castrated male nude mice resulted in a significant reduction of tumor volume and number of metastases, increasing survival time [173]. Whether these alterations are specific to MALAT1 or reflect combined effects on downstream genes (e.g., RNA splicing deregulation) controlled by MALAT1 is still under study. Using EZH2 antibody-based RNA immunoprecipitation combined with next-generation sequencing (RIP-seq), EZH2 was found to bind to MALAT1 [171]. Both GST pull-down and RIP assays showed that the 3′ end of MALAT1 interacts with the N-terminus of EZH2. Moreover, MALAT1 and EZH2 expression levels are positively correlated in CRPC samples. Depletion of MALAT1 impaired EZH2 recruitment to its target loci (DAB2IP and BRACHYURY) and caused their upregulation, suggesting that MALAT1 mediates EZH2-enhanced migration and invasion in CRPC cell lines [171]. In addition, MALAT1 enhances expression of PRC2-independent target genes of EZH2 (TMEM48 and KIAA0101) both in vitro and in patient-derived xenografts [171].

NEAT1 is an ERα-regulated lncRNA, upregulated in PCa, producing two RNA isoforms that overlap completely at the 5′-end. The shorter isoform (NEAT1_1) is 3.7 kb in length and more abundant than the longer, 23 kb, isoform (NEAT1_2) [172]. NEAT1 expression is a prognostic biomarker for aggressive PCa independent of standard clinical and pathologic parameters [172]. Estrogen treatment upregulates NEAT1 transcript levels in a time-dependent manner and, in VCaP cells, results in re-distribution of NEAT1 from paraspeckles to a broader distribution throughout the nucleus [172]. Knockout of NEAT1 compromised the expression of ERα target genes, suggesting that NEAT1 is not only a downstream target but also a mediator of ERα signaling in PCa cells. NEAT1 transcriptionally regulates a compendium of genes known to be involved in PCa progression, including PSMA and GJB1 [172]. Overexpression of NEAT1_1 significantly increased the active chromatin marks H3K4Me3 and H3AcK9 at the PSMA promoter and induced subsequent recruitment of NEAT1_1 and ERα to the same promoter. RNA immunoprecipitation revealed that NEAT1 directly interacts with histone H3, favoring a chromatin landscape for active transcription through active histone marks [172]. Phenotypically, knockdown of NEAT1 in VCaP cells significantly decreased proliferation and the invasive properties of cells. Overexpression of NEAT1 resulted in a significantly higher number of viable colonies, establishing an oncogenic role for NEAT1 [172]. In athymic nude mice, injection of either VCaP or NCI-H660 cells overexpressing NEAT1 resulted in a significantly higher tumor growth rate compared to scrambled control cells. Moreover, in vitro NEAT1 expression is inhibited when cells are treated with ERα antagonists in combination with E2. Similar results observed with the AR antagonists enzalutamide and bicalutamide suggest that NEAT1 is associated with resistance to therapy [172]. Thus, these data suggest a role for paraspeckles in the lncRNA-mediated regulation of gene expression in PCa.

3.2.4 lincRNA deregulation in PCa

Long intervening non-coding RNAs (lincRNAs) are emerging as key regulators of diverse cellular processes, but determining their individual function remains a challenge. lincRNAs are also called long intergenic non-coding RNAs, although that name is imprecise: lincRNAs derive from genes, and are thus genic, but these genes do not overlap with exons of either protein-coding or other non-lincRNA types of genes [19].

Ab initio transcriptome sequencing of polyA+ RNA from 102 PCa tissues and cell lines revealed a total of 1859 unannotated lincRNAs throughout the human genome [174]. A set of 121 of those transcripts accurately distinguished benign, localized, and metastatic PCa by unsupervised clustering. PCAT-1 (located in the 8q24 gene desert) is predominantly cytoplasmic and was upregulated in PCa samples, especially in high-grade (GS ≥ 7) and metastatic tumors. Strikingly, PCAT-1 and EZH2 expression was nearly mutually exclusive, suggesting that their expression may define two subsets of high-grade disease. However, upregulation of PCAT-1 was not dependent on 8q24 amplification [174]. Inhibiting EZH2, using either shRNAs or DZNep, caused a dramatic upregulation of PCAT-1 in VCaP cells. ChIP assays showed that SUZ12, a core component of PRC2, directly binds to the PCAT-1 promoter ~1 kb upstream of the TSS [174]. By RNA immunoprecipitation, it was demonstrated that PCAT-1 binds to SUZ12 protein in VCaP cells, a feature that was abolished by RNase A, RNase H, or DNase I treatment. This suggests that PCAT-1 exists primarily as a single-stranded RNA and secondarily as an RNA/DNA hybrid. Moreover, stable PCAT-1 overexpression in RWPE cells promoted cell proliferation, and RNAi silencing decreased cell proliferation in LNCaP but not in DU145 cells (which lack PCAT-1 expression) or VCaP cells (in which PCAT-1 is repressed by PRC2) [174]. Genome-wide expression analysis of LNCaP cells after treatment with siRNAs against PCAT-1 disclosed upregulation of 255 genes and repression of 115 genes, revealing that PCAT-1 is predominantly repressive. Additionally, the upregulated genes showed enrichment for mitosis and cell cycle [174]. Specifically, PCAT-1 targets BRCA2, CENPE, and CENPF, whose expression is upregulated upon PCAT-1 silencing in LNCaP cells.
Further research demonstrated that, in PCa cells, PCAT-1 overexpression decreased the formation of foci of RAD51 (a component of homologous recombination, HR) after therapy with PARP1 inhibitors, whereas PCAT-1 knockdown increased foci formation upon therapy [175]. BRCA2 inactivation impairs both HR and double-strand DNA break (DSB) repair. PCAT-1 expression is correlated with decreased BRCA2 levels, and in vitro, the 5′ end of PCAT-1 is able to directly repress the activity of the BRCA2 3′UTR [175]. PCAT-1 overexpression produces a functional deficiency in HR through post-transcriptional repression of the BRCA2 tumor suppressor, which, in turn, confers high sensitivity to small-molecule inhibitors of PARP1, both in vitro and in vivo [175]. Whether PCAT-1 may act as a predictive biomarker for patient response to PARP1 inhibitor therapy remains to be proved.

PCAT-1 is located 725 kb upstream of the MYC oncogene [176]. Overexpression of PCAT-1 in DU145 and RWPE cells increased c-MYC protein levels, while silencing of PCAT-1 in LNCaP cells decreased c-MYC protein, suggesting a cis-regulation involving these loci [176]. Strikingly, c-MYC silencing fully abrogated the proliferative effects of PCAT-1 overexpression in DU145 and RWPE cells, indicating that PCAT-1-mediated cell proliferation is dependent on c-MYC overexpression. Luciferase assays revealed that PCAT-1 overexpression increased c-MYC 3′UTR activity, whereas silencing of PCAT-1 decreased c-MYC 3′UTR activity. Mechanistically, this suggests that PCAT-1 regulates c-MYC in a post-transcriptional manner by 3′UTR activation, which can result in gene activation and increased protein abundance [95].

Another important lincRNA in PCa is SChLAP1 (second chromosome locus associated with prostate-1; also designated LINC00913) [177]. SChLAP1 is located in a “gene desert” on chromosome 2q31.3 and is highly expressed in ~25 % of PCa, being more frequently expressed in metastatic compared to localized PCa. Its expression was associated with ETS gene fusions and PTEN deletions in localized PCa [177]. Moreover, SChLAP1 levels independently predict poor outcome, including metastasis and PCa-specific mortality [177]. Knockdown of SChLAP1 dramatically impaired cell invasion and proliferation in vitro and, in turn, overexpression of an siRNA-resistant SChLAP1 isoform rescued the in vitro invasive phenotype of 22Rv1 cells treated with siRNA. Overexpression of the three SChLAP1 isoforms in RWPE cells dramatically increased the ability of these cells to invade in vitro but did not affect cell proliferation. In vivo, SChLAP1 depletion impaired metastatic seeding and growth. Overall, SChLAP1 seems to control tumor invasion and metastasis by influencing cancer cell intravasation, extravasation, and subsequent tumor cell seeding [177]. Using Gene Set Enrichment Analysis of 22Rv1 and LNCaP cells with SChLAP1 knockdown, SChLAP1-regulated genes were correlated with the SWI/SNF complex, a multiprotein complex known to physically rearrange nucleosomes at gene promoters, thus controlling transcription [177]. Mechanistically, SChLAP1 co-immunoprecipitates with SNF5 and attenuates SNF5 genome-wide localization. Upon knockdown of SChLAP1, 9 of 12 target genes showed a substantial increase in SNF5 binding. These data support the view that oncogenic SChLAP1 overexpression antagonizes the tumor-suppressive function of the SWI/SNF complex by attenuating the genomic binding of this complex, thereby impairing its ability to properly regulate gene expression [177].

Prostate cancer antigen 3 (PCA3) is a spliced intronic antisense lncRNA embedded within intron 6 of the corresponding sense gene PRUNE2 and upregulated in PCa samples, holding promise as a biomarker for PCa detection [178]. PCA3 controls PRUNE2 levels via a unique regulatory mechanism involving formation of a PRUNE2/PCA3 double-stranded RNA that undergoes ADAR-dependent adenosine-to-inosine RNA editing [178]. Because the Drosophila behavior human splicing (DBHS) protein P54NRB binds to inosine-containing RNA (RNA-I), regulating gene expression, RNA-ChIP was used to show that PCA3 and PRUNE2 pre-mRNA species associate with P54NRB protein, suggesting that DBHS proteins also contribute to PRUNE2/PCA3 regulation [178]. In vitro stimulation with a synthetic testosterone homolog induced PCA3 expression and decreased PRUNE2 levels [97]. PCA3 silencing or ectopic PRUNE2 expression decreased cell proliferation and transformation in vitro; in contrast, PRUNE2 silencing or ectopic PCA3 expression increased cell proliferation and transformation [178]. PRUNE2-deficient PC3 cells stably expressing ectopic PRUNE2 display lower levels of proliferation and transformation in vitro, consistent with the negative regulation of PRUNE2 by PCA3 [178]. In SCID mice, PRUNE2 silencing and ectopic PCA3 expression yielded markedly larger tumor xenografts than controls; in contrast, tumor growth was significantly diminished compared to controls when PCA3 was silenced, further illustrating the oncogenic activity of PCA3 [178]. Serum PSA was increased in SCID mice injected with LNCaP cells with ectopic PCA3 expression or PRUNE2 silencing, compared to controls [178]. In human PCa samples, PCA3 and PRUNE2 levels inversely correlate. Moreover, A > G/T > C alterations were the most frequent substitutions, indicative of A-to-I editing in both PCA3 and PRUNE2 pre-mRNA strands [178]. These results establish PCA3 as a dominant-negative oncogene and PRUNE2 as a tumor suppressor gene in PCa, and their regulatory axis represents a putative target for clinical intervention [178].

3.2.5 Pseudogenes

CXADR-ψ, a processed pseudogene on chromosome 15 derived from the parental tumor-suppressor gene CXADR, was found to be overexpressed in PCa tissues compared to benign tissue samples [91]. cDNA cloning from two PCa samples positive for CXADR-ψ showed perfect sequence similarity to the pseudogene CXADR-ψ and only 84 % to the CXADR wild-type gene [91]. No correlation was found between CXADR and CXADR-ψ expression. Interestingly, CXADR-ψ expression was nearly restricted to PCa lacking an ETS gene fusion, with few ETS-positive samples exhibiting expression of this pseudogene [91]. On the other hand, CXADR gene expression was found in both ETS-positive and ETS-negative samples [91]. In the same study, a PCa-specific readthrough transcript involving KLK4, an androgen-induced gene, and KLKP1, an adjacent pseudogene, was identified. The KLK4-KLKP1 transcript was highly expressed in 30–50 % of PCa tissues, and this expression was lineage and cancer specific, with low expression detected in benign prostate and other tissues [91]. The KLK4-KLKP1 transcript was previously described in LNCaP cells as a cis sense-antisense chimeric transcript [91]. This chimeric transcript is composed of the first two exons of KLK4 and the last two exons of KLKP1. It retains an open reading frame incorporating 54 amino acids encoded by the KLKP1 pseudogene in the putative chimeric protein [91]. Additional studies are needed to understand the biological role of the chimeric transcript KLK4-KLKP1 in PCa biology.

Pseudogene transcription has also been shown to regulate cognate wild-type gene expression by sequestering miRNAs, with pseudogene transcripts acting as endogenous miRNA sponges, or competing endogenous RNAs (ceRNAs) [79]. ceRNAs communicate with and co-regulate each other by competing to bind to a common pool of miRNAs, thus altering miRNA availability and stoichiometry [79]. The PTENP1 pseudogene has been reported to regulate levels of its cognate gene, PTEN, by competing for shared miRNAs [79]. Both miR-19b and miR-20a (normally overexpressed in PCa) suppressed both PTEN and PTENP1 mRNA abundance. Blocking the miR-17 and miR-19 families increased PTEN/PTENP1 levels, highlighting a shared miRNA-mediated regulation between these two genes and the role of PTENP1 as a tumor suppressor that acts as a decoy for oncogenic miRNAs targeting PTEN [79]. Additionally, KRAS/KRAS1P transcript levels are positively correlated in PCa, and KRAS1P 3′UTR overexpression in DU145 cells resulted in increased KRAS mRNA abundance and cell growth. These data support a role for KRAS1P in PCa, being targeted by KRAS-targeting miRNAs. In silico analysis revealed that KRAS1P maintains the validated binding sites for miR-143 and the let-7 family previously reported for KRAS [44]. These data provide a framework for pseudogenes as natural miRNA decoys in PCa development.
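The stoichiometric logic of the decoy mechanism can be illustrated with a deliberately simplified calculation. The Python sketch below is a toy model, not a method from the cited studies: it assumes that target and decoy transcripts each carry a single binding site of equal affinity and that miRNA molecules bind until either miRNAs or sites run out. The function name and all copy numbers are hypothetical.

```python
def mirna_per_target(mirna_copies: float, target_copies: float, decoy_copies: float) -> float:
    """Average number of miRNA molecules bound per target transcript.

    Toy assumptions (not from the source): one binding site per transcript,
    equal affinity for target and decoy, and binding proceeds until either
    miRNA molecules or binding sites are exhausted.
    """
    total_sites = target_copies + decoy_copies
    if total_sites <= 0:
        raise ValueError("at least one binding site is required")
    bound = min(mirna_copies, total_sites)  # bound miRNA cannot exceed available sites
    return bound / total_sites              # sites are occupied evenly under equal affinity


if __name__ == "__main__":
    # Hypothetical copy numbers: a fixed miRNA pool and target level, rising decoy level.
    for decoy in (0, 100, 300):
        load = mirna_per_target(mirna_copies=100, target_copies=100, decoy_copies=decoy)
        print(f"decoy copies = {decoy:3d} -> miRNA bound per target transcript = {load:.2f}")
```

In this toy setting, raising the decoy level dilutes the miRNA load on the target without any change in miRNA abundance, which is the qualitative behavior attributed to PTENP1 and KRAS1P above. Real ceRNA crosstalk also depends on binding affinities, site numbers, and transcript turnover, none of which is modeled here.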

3.2.6 Transcribed ultraconserved region

Ultraconserved regions (UCRs) are genomic sequences with 100 % conservation between human and rodent genomes, more than 200 base pairs in length but not harboring any known gene [179]. Given their high level of sequence conservation, UCRs are presumed to have biological functions essential to mammalian cells, although these remain largely enigmatic. Some UCRs have been functionally implicated in transcriptional enhancement, alternative splicing, nonsense-mediated decay mechanisms, or miRNA-binding decoys [179]. There are 481 UCRs described, some of which overlap with coding exons, although it is believed that more than half of them do not encode any protein. Surprisingly, 68 % of UCRs (i.e., 325) are transcribed, defining a new class of long non-coding RNA: transcribed ultraconserved regions (T-UCRs) [180]. Many transcripts from T-UCRs are polyadenylated and enriched for H3K4me3 at the TSS [181]. Although UCRs range from 200 to 779 bp in length, the transcriptional units of T-UCRs (the non-spliced, full-length cDNAs) are usually up to 2 kb for known T-UCRs [180, 182]. T-UCRs are expressed in normal tissues either ubiquitously or in a tissue-specific pattern.

The expression profile of the 481 known UCRs revealed that particular T-UCRs are deregulated in PCa, including uc.106+, uc.477+, uc.363+A, and uc.454+A, associating with cancer progression, Gleason score, and extraprostatic extension [179]. Treatment with the epigenetic drugs TSA and 5-AzaC increased uc.283+A expression, while treatment with R1881 increased the expression of uc.287+ and repressed uc.283+A expression, indicating that both epigenetic factors and androgens are responsible for regulation of T-UCRs. Genome-wide expression analysis of LNCaP cells treated with a specific siRNA against uc.106+ or a control siRNA indicated that uc.106+ might impair transcription of genes involved in cell proliferation and cell death, as well as immune response. Although the experimental design of this work [179] was not entirely clear, it showed, for the first time, differential expression of T-UCRs in prostate tissue samples.

The SNP rs8004379 in the UCR uc.368 is significantly associated with BCR [183]. Interestingly, the variant allele, C, for rs8004379 indicates a decreased risk of BCR in a dose-dependent manner after adjusting for age, PSA level, pathologic Gleason score, and stage [183]. RNA secondary structure prediction reveals that rs8004379 has a marked effect on uc.368 RNA structure, with a slight reduction in the free energy of the C allele compared to the A allele. Moreover, this SNP is located in the intron of NPAS3 gene, and C allele in rs8004379 is correlated with increased NPAS3 expression [183].

More detailed investigation is needed to establish a role for T-UCR in PCa.

4 Clinical utility of ncRNA in PCa management

4.1 Diagnostic and prognostic biomarkers

The emergence of regulatory RNA offers several putative benefits due to its tissue- and cancer-specific expression and involvement in the regulation of PCa hallmarks (Fig. 6). Serum PSA is currently in widespread clinical use, increasing early detection of prostate cancer. However, its lack of specificity results in a high negative biopsy rate, overdiagnosis, and overtreatment of PCa [184]. NcRNAs may, thus, provide new biomarkers to accurately diagnose PCa, improve disease management, and reduce overtreatment. Given that sncRNAs are resistant to variations in temperature and pH as well as to endogenous RNase activity, they offer unprecedented potential to become blood/urine-based biomarkers [185]. Serum samples from men with low-risk, localized PCa and from men with metastatic CRPC have been shown to exhibit distinct circulating miRNA signatures [186]. Indeed, miR-21 [187], miR-141 [185, 186], and miR-375 [186] expression levels are increased in the plasma/sera and discriminate patients with advanced PCa from healthy controls, associating with poor prognosis. Moreover, miR-21 serum levels are particularly elevated in patients resistant to docetaxel-based chemotherapy [187]. In two independent cohorts, promoter hypermethylation of GABRE~miR-452~miR-224 predicted biochemical recurrence after radical prostatectomy [188]. Moreover, GABRE~miR-452~miR-224 methylation levels also accurately distinguished non-malignant from PCa samples (AUC: 0.98), suggesting that this locus might be suitable for urine-based PCa detection. Not only does GABRE~miR-452~miR-224 have biomarker potential, but re-expression of miR-224 and miR-452 also impaired cell viability, migration, and invasion capabilities [188].

Fig. 6

lncRNAs as master regulators of PCa phenotype. Alterations in genomic sequence and/or expression levels in PCa cells led to initial identification of PCa-associated lncRNAs. Subsequent functional studies directly connected some of the identified lncRNAs with prostate carcinogenesis. These lncRNAs not only control some of the hallmarks of cancer but also contribute to androgen-independent growth and transcriptional regulation, and may be of value for clinical management of PCa patients

The lncRNA PCA3 is markedly overexpressed in more than 95 % of primary PCa [189]. Due to its PCa specificity, urinary detection of PCA3 has been developed as a PCa detection test with superior tumor specificity compared to PSA [184]. The FDA approved this test for clinical use under the name Progensa PCA3, with the ultimate goal of aiding in the decision to repeat a prostate biopsy. However, reported correlations between PCA3 expression and clinical and pathological parameters are conflicting, although some studies reported that the PCA3 test is negative in men with indolent PCa [190]. To improve its performance as a prognostic biomarker, PCA3 was combined with other deregulated genes, such as TMPRSS2-ERG. In two independent prospective, multicenter evaluations, the panel composed of PCA3 and TMPRSS2-ERG showed superior PCa specificity over serum PSA. This finding might help reduce the number of unnecessary prostate biopsies [191] and could also have utility for risk stratification in an active surveillance setting [192].

ncRNAs may also be detected in exosomes secreted into the bloodstream or urine. Exosomes are membranous vesicles containing various biomolecules, including lncRNAs, involved in cellular communication and are secreted from many cells, including cancer cells. Combining sncRNA sequencing and qPCR validation in exosomes derived from CRPC patients, increased expression of miR-1290 and miR-375 was found in exosomes and associated with decreased overall survival in CRPC patients [193]. A multivariate model that included miR-1290 and miR-375 levels, ADT failure time, and PSA levels at the time of CRPC stage showed that patients with a high-risk score had a 2.58-fold higher risk of death than patients with a low-risk score (HR: 2.58; 95 % CI, 1.51–4.41) [193]. In exosomes purified from urine samples from either PCa patients or individuals with benign prostatic hyperplasia (BPH), the expression levels of lincRNA-p21 were significantly higher in PCa, discriminating it from BPH [194]. The biomarker performance of lincRNA-p21, however, was disappointing (67 % sensitivity and 63 % specificity). Combination with serum PSA increased specificity to 94 %, but sensitivity decreased to 52 %. Testing in larger cohorts is needed to fully disclose the biomarker potential of exosomal ncRNAs in PCa.
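As a reading aid for the hazard ratio quoted above, the following Python sketch shows how a reported HR and its 95 % confidence interval can be checked for internal consistency and converted into an approximate z statistic. It assumes a Wald-type interval that is symmetric on the log scale, as is common for Cox models; this is an assumption about how the interval in [193] was built, and the derived standard error, z value, and p value are back-of-the-envelope figures, not results reported in that study. The function name is illustrative.

```python
import math


def hr_summary(hr: float, ci_low: float, ci_high: float, z_crit: float = 1.96) -> dict:
    """Derive log-scale quantities from a hazard ratio and its confidence interval.

    Assumes a Wald interval: exp(log(HR) +/- z_crit * SE), symmetric on the log scale.
    """
    log_se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)  # SE of log(HR)
    z = math.log(hr) / log_se                                       # Wald z statistic
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))                  # normal two-sided p value
    geo_mean = math.sqrt(ci_low * ci_high)                          # should be close to HR
    return {"log_se": log_se, "z": z, "p": p_two_sided, "ci_geometric_mean": geo_mean}


if __name__ == "__main__":
    # HR and CI as quoted in the text for the high- vs low-risk score groups.
    stats = hr_summary(hr=2.58, ci_low=1.51, ci_high=4.41)
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

The geometric mean of the interval bounds falls very close to 2.58, so the reported interval is consistent with a log-symmetric construction, and the implied z statistic of roughly 3.5 indicates that the interval excludes an HR of 1 by a comfortable margin.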

4.2 ncRNAs as tools for genomic epidemiology and risk prediction

Over the last few years, genome-wide association studies have become a routine tool to identify germline SNPs and cancer-associated genetic variations that map to non-coding coordinates [195]. The vast majority of those SNPs are located within enhancers, but others are localized within ncRNA gene bodies [196]. Although PCa risk-related loci were enriched in lncRNAs, the SNP density in lncRNA regions was similar to that of protein-coding regions [197]. The 8q24 region has been identified as the most important susceptibility region for PCa [198]. This 1.2 Mb stretch of the genome is enriched for lncRNAs, including PCAT1, PRNCR1, and PVT1, and it also harbors the c-MYC gene. The eight SNPs detected at 8q24 account for approximately 8 % of the 2-fold increased risk of PCa in first-degree relatives of men with the disease [198]. The link between 8q24 SNPs and PCa risk is, however, not clear: the proximity to the c-MYC oncogene suggests that these SNPs might be involved in long-range control of MYC expression, but experimental data to support this speculation are lacking [198].

Mapping of DNase I hypersensitive sites identified a variant called rs378854, which is in complete linkage disequilibrium with rs620861, as a novel functional PCa-specific genetic variant [199]. In vitro, the risk allele (G) of rs378854 reduces binding of the transcription factor YY1 (a putative tumor suppressor in PCa). Chromatin conformation capture experiments showed that the region surrounding rs378854 interacts with the MYC and PVT1 promoters. Moreover, expression of the PVT1 oncogene in normal prostate tissue increased with the presence of the risk allele of rs378854, whereas expression of MYC was not affected [199].

Collectively, clinical use of some SNPs may help to identify patients at risk for PCa and may stratify patient phenotypes (such as clinically aggressive vs. indolent) and outcome. The use of specific SNPs may also be useful to predict patients’ response to therapy.

5 Discussion and conclusions

RNA is not only functional as a messenger between DNA and protein but is also involved in the regulation of genome organization and gene expression, which is extremely elaborate in complex organisms. Among the challenges of the coming years, elucidating the crosstalk between different types of structural RNAs, as well as the hierarchy of RNA- and protein-mediated regulation of gene expression that contributes to PCa, is paramount. Additionally, characterizing the mechanisms mediating RNA communication between PCa cells and mapping the genomic locations of RNA-binding sites [66] are mandatory to further understand how gene expression control and cell state decisions are accomplished in PCa. Will ncRNAs help achieve a better definition of PCa as a single pathological entity, or will ncRNA profiling lead to a subclassification of PCa?

Cellular RNAs contain more than a hundred structurally distinct post-transcriptional modifications at different sites [200]. These RNA modifications may play an adaptive role that can fine-tune the structures and functions of mature RNAs to influence gene expression [200]. Some post-transcriptional RNA modifications can be dynamic and might have regulatory roles equivalent to those of post-translational protein modifications. Therefore, RNA epigenetics will help determine both mechanisms and functions of these dynamic RNA modifications and ultimately define the “prostate cancer epitranscriptome.”

Genome editing using CRISPR approaches will offer the capability to dissect ncRNA functions. Moreover, it will provide the ability to directly modify or correct critical PCa-associated alterations by targeting a genomic locus with an engineered guide RNA, offering new therapeutic options for PCa.

During prostate epithelial transformation, the AR cistrome undergoes extensive reprogramming. Accordingly, androgen-induced eRNAs scaffold AR-associated protein complexes that modulate chromosomal architecture, suggesting that eRNAs are the most critical RNAs involved in PCa.

Translating the developments in RNA biology and technology updates into deeper understanding of prostate carcinogenesis may assist in the advance of precision medicine, providing not only new and more robust biomarkers (either single or panel ncRNA) but also paving the way for patient-tailored RNA-based therapies, as an alternative to currently available therapeutic strategies. The age of RNA has come.