Introduction

Alterations in gene expression that occur without a change in the nucleotide sequence are known as epigenetic changes. Epigenetic processes involve changes in chromatin structure and organization due to the incorporation of epigenetic modifications such as DNA methylation, histone post-translational modifications (PTMs), and histone variants [76]. These epigenetic modifications are heritable, as they are transmitted to daughter cells through cell divisions and can even be propagated to future generations [33]. DNA methylation involves adding a methyl group (-CH3) to the fifth carbon of the cytosine ring, creating 5-methylcytosine. DNA methylation represses gene expression and is carried out by enzymes known as DNA methyltransferases [83]. Histone modifications have characteristic effects on chromatin structure. Histone PTMs such as methylation, acetylation, phosphorylation, and ubiquitination are imparted by histone-modifying enzymes and can either cause DNA to become more tightly wrapped around histones, making it less accessible to transcription factors, or, conversely, cause DNA to become less tightly packed, making it more accessible to transcription factors [6, 77]. These changes in chromatin structure regulate gene expression. For example, trimethylation of lysine 9 of histone H3 (H3K9me3) leads to chromatin compaction and the repression of gene expression [91]. Conversely, histone H4 lysine 16 acetylation (H4K16ac) leads to chromatin decompaction and increased gene expression [157]. Centromere inheritance is a specialized epigenetic process in which a histone H3 variant, centromeric protein A (CENP-A), is considered to be the epigenetic mark of centromeres, the chromosomal sites that attach to the spindle to facilitate accurate chromosome segregation [12].

RNA serves as the intermediate between DNA sequence and protein, carrying genetic information. However, a subset of RNA, referred to as non-coding RNA (ncRNA), is not translated into protein but is instead involved in regulating gene expression (Table 1). ncRNAs longer than 200 nucleotides (nt) are called long ncRNAs (lncRNAs) [124]. Several lines of evidence suggest that ncRNAs are among the key factors influencing chromatin structure by regulating the recruitment of epigenetic marks [13, 58]. Additionally, emerging evidence suggests that ncRNAs are among the most versatile chromatin-modifying epigenetic factors [81]. ncRNAs serve various roles, such as recruiting and directing histone- and chromatin-modifying proteins to either repress or activate the expression of protein-coding genes. This regulation by ncRNAs is crucial, as aberrant gene expression is associated with a multitude of diseases, including cancers [69]. In this review, we discuss the roles of ncRNAs in the epigenetic regulation of heterochromatin silencing, dosage compensation in Drosophila and mammals, transgenerational memory in Caenorhabditis elegans, and centromere inheritance.

Table 1 Examples of ncRNAs involved in cellular processes

Roles of ncRNAs in heterochromatin formation and silencing

Heterochromatin is a chromatin state in which adjacent nucleosomes are densely arranged to form a highly compact structure. This dense packaging restricts the access of RNA polymerase (RNAP), leading to the repression of transcription [96]. Heterochromatin is classified into two types – constitutive and facultative. Constitutive heterochromatin is mainly formed on repetitive regions and regions poor in protein-coding genes [92]. Once established early in development, the majority of constitutive heterochromatin is maintained throughout the lifetime of an organism. Facultative heterochromatin is present in developmentally regulated genomic regions, with the potential to change into a more accessible structure to allow for transcription to occur [147]. Conversely, euchromatin consists of chromatin regions that are less densely packed and are typically more gene-rich [93]. Both facultative and constitutive heterochromatin contain post-translational histone modifications that influence their level of compaction [65]. Constitutive heterochromatin is marked by H3K9me3 and is mainly present at pericentromeric and subtelomeric regions [116]. Constitutive heterochromatin is also found on interspersed repeats such as transposons, where it plays a vital role in maintaining genome integrity by repressing transposons [145]. H3K9me3 serves as the template for the deposition of heterochromatin protein 1 (HP1), which further condenses H3K9me3 heterochromatin into highly compact structures [92]. H3K9me3-mediated silencing is important for ensuring that repetitive sequences, such as transposons, are not expressed at high levels [8]. The constitutive heterochromatin is epigenetically propagated through cell divisions [32, 104, 156]. The recruitment of H3K9me3 at constitutive heterochromatin has been shown to be dependent on ncRNAs in a variety of organisms [1, 30, 121, 142]. 
Additionally, ncRNAs recruit chromatin and histone-modifying complexes to establish and maintain H3K9me3, as well as engage in RNAi pathways that maintain heterochromatin stability and compactness [1, 22, 78, 86].

Role of ncRNA in constitutive heterochromatin formation in S. pombe

The mechanisms of formation and maintenance of constitutive heterochromatin are best studied in S. pombe, where it is found at regions containing highly repetitive sequences, such as the pericentric regions, telomeres, and the silent mating-type locus [2]. In addition to the H3K9me3 modification, S. pombe constitutive heterochromatin is characterized by histone hypoacetylation and very low rates of histone turnover [109]. Heterochromatic pericentric regions of S. pombe serve as binding sites for cohesin between sister chromatids and regulate proper chromosome segregation during cell division [87]. Similarly, heterochromatin at the mating-type region is necessary for the sexual life cycle, as well as the maintenance of cell-type identity in S. pombe [2].

Heterochromatin formation at pericentric regions

The formation of heterochromatin at pericentric regions in S. pombe occurs through a cis-acting posttranscriptional RNA interference (RNAi) pathway that involves the association of ncRNAs with an Argonaute family protein, leading to the silencing of a specific target locus. S. pombe pericentric regions consist of dg and dh repeats that are transcribed during S phase, when the underlying chromatin opens up for replication [3]. The ncRNAs transcribed from dg and dh repeats are first processed into double-stranded RNAs (dsRNAs), which are then further processed into small interfering RNAs (siRNAs) by the Dicer ribonuclease [21]. These siRNAs interact with a PIWI domain-containing Argonaute protein, a chromodomain protein known as Chp1, and an Argonaute-interacting protein known as Tas3 to form the RNA-induced transcriptional silencing (RITS) complex [143]. Using siRNA-DNA base pairing, siRNAs guide and recruit the RITS complex precisely to pericentric repeats [121]. The RITS complex recruits the H3K9 histone methyltransferase (HMT), Clr4, which then deposits H3K9me3 onto the pericentric chromatin. H3K9me3 is recognized by the S. pombe HP1 homologs, Swi6 and Chp2. Chp2 recruits the SHREC histone deacetylase complex, and the Clr3 subunit of SHREC deacetylates H3K14 on pericentric chromatin [126]. Histone acetylation is associated with chromatin decompaction of the target locus, and thus deacetylation of H3K14 at pericentric regions leads to further compaction of pericentric heterochromatin and repression of the underlying DNA repeats [43, 126]. Overall, maintenance of the pericentric heterochromatin structure is regulated by ncRNAs that directly recruit protein complexes to place epigenetic marks, leading to changes in chromatin structure [78]. The integrity of Swi6-bound pericentric H3K9me3 heterochromatin is crucial, as it facilitates cohesin binding and thereby prevents premature separation of sister chromatids during chromosome segregation.
The epigenetic pericentromeric heterochromatin maintains the stable silencing of underlying DNA repeats throughout successive cell divisions to allow for proper chromosomal segregation [144].

Heterochromatin formation at the mating-type locus

Heterochromatin formation at the mating-type locus in S. pombe involves two pathways: a TORC2-Gad8-mediated pathway and an RNA interference (RNAi) pathway [29, 121]. These two pathways overlap and interact with each other to establish stable silencing at the mating-type locus. TORC2 activates the protein kinase Gad8, and the TORC2-Gad8 complex then regulates the recruitment of H3K9me3 [29]. Removal of the TORC2-Gad8 complex results in a loss of H3K9me2 and an increase in H3K4me3, leading to chromatin decompaction, suggesting that TORC2-Gad8 is essential for promoting chromatin compaction and stable silencing. The RNAi pathway plays a redundant, yet important, role in specifying and maintaining heterochromatin structure at the mating-type locus, which contains a centromere-homologous repeat (cenH) region and flanking inverted repeats, known as IRCs [29, 50]. The mating-type locus produces ncRNAs that are processed into siRNAs. These siRNAs then trigger the activation of the effector complex, RITS, which leads to the formation of H3K9me3 heterochromatin [22, 29]. The resulting siRNAs also activate the TORC2-Gad8 complex, which then silences a reporter gene embedded within the mating-type locus by directing the recruitment of H3K9me3 [29].

In summary, these two pathways of heterochromatin formation demonstrate the overlap in the mechanisms of action of ncRNAs, as well as their reinforcing nature. Both pathways involve the recruitment of proteins and protein complexes by ncRNAs through RNAi pathways, leading to the formation of stable heterochromatic regions [99]. The RNAi pathway at pericentromeric regions produces siRNAs that activate the RITS complex, which recruits heterochromatic epigenetic marks to the mating-type locus through the action of Clr4 [2]. At the mating-type locus, ncRNAs direct the TORC2-Gad8 complex, which functions alongside an RNAi pathway to recruit H3K9me3 [29].

Role of ncRNA in constitutive heterochromatin formation at the pericentric regions in mouse

Mouse pericentric regions consist of several-megabase-long arrays of 234 bp major satellite repeat (MSR) units and flank centromeric regions that are composed of arrays of 120 bp minor satellite repeat units [150]. These satellite DNA (satDNA) repeat arrays replicate asynchronously. The major satellites replicate during the middle of S phase, whereas the minor satellites replicate during late S phase [73]. Mouse pericentric regions form chromocenters, which are heterochromatin clusters created by the coalescence of the major satellites [49]. The MSR units are transcribed to produce ncRNAs, and MSR transcription during early mouse development is required for the establishment of silencing in pericentric regions [23]. These ncRNAs are proposed to help recruit the histone lysine methyltransferases (HKMTs), Suv39h1 and Suv39h2, to the major satellites [68]. These HKMTs contain domains that preferentially bind single-stranded MSR ncRNAs, suggesting a mechanism for their recruitment. The formation of RNA-nucleosome scaffolds containing MSR RNA:DNA hybrids, as well as MSR-repeat RNA, helps mediate interactions between the HKMTs and the major satellites for the proper recruitment of H3K9me3 and subsequent chromatin compaction at pericentric regions [142]. Absence of the Suv39h HKMTs leads to a loss of sister chromatid cohesion at major satellites. The involvement of MSR ncRNAs at mouse pericentric regions demonstrates their ability to interact with histone-modifying proteins in order to direct the recruitment of repressive epigenetic marks, as well as to organize the heterochromatin through their scaffolding roles [142]. In addition to guiding the HKMTs to targeted genes for silencing, MSR ncRNAs have also been suggested to take part in forming RNA-nucleosome scaffolds that bring together chromatin regions that may lie far apart along the chromosome in order to coordinate the recruitment of epigenetic modifications.

The formation and maintenance of pericentric heterochromatin rely on ncRNAs in both S. pombe and the mouse (Fig. 1). These ncRNAs interact with similar chromatin-modifying complexes in both organisms, as Clr4 and Swi6/Chp2 are functional homologs of Suv39h and HP1, respectively. The difference between S. pombe and mouse in this pathway lies in the biogenesis of the RNA. In S. pombe, transcripts from pericentric DNA repeats are processed by the Dicer ribonuclease into siRNAs. Conversely, in mouse, ncRNAs are directly localized to pericentric satellite arrays following transcription of the MSR units that are much longer compared to S. pombe dg and dh repeats. S. pombe siRNAs coordinate with the RITS complex for the recruitment of chromatin-modifying complexes. It remains to be investigated whether a similar protein complex or set of proteins supports MSR ncRNA function at mouse pericentric regions. In both organisms, the pericentric silencing that is achieved is crucial for the maintenance of pericentric heterochromatin, which ensures proper chromosomal segregation.

Fig. 1

A comparison of ncRNA-mediated pericentric heterochromatin formation in S. pombe and mouse. Both S. pombe and mouse pericentric regions are assembled into highly condensed and silenced heterochromatin, marked by the H3K9me3 epigenetic modification. The mechanisms of ncRNA-dependent pericentric heterochromatin formation share similarities as well as differences in these two organisms. In S. pombe, pericentromeres are composed of dg and dh repeats that are transcribed and further processed by the Dicer ribonuclease into siRNAs (left). These siRNAs then recruit the RITS complex, which is composed of the proteins Tas3 GW, Ago1, and Chp1, to pericentric regions in a sequence-specific manner. The siRNAs and the RITS complex then recruit the HMT, Clr4, which deposits the H3K9me3 modification onto the chromatin. In mouse, the pericentric regions of the chromosome are composed of major satellite repeat units that are transcribed to form ncRNAs called MSR RNAs (right). MSR ncRNAs are involved in the recruitment of the HKMTs, Suv39h1 and Suv39h2, which deposit H3K9me3 on the chromatin. The H3K9me3 modification at the pericentric regions is recognized by HP1 homologs (Chp2 and Swi6 in S. pombe and HP1α in mouse), which causes chromatin compaction and stable silencing of pericentric DNA repeats. Pericentric heterochromatin compaction and silencing are required for proper chromosomal segregation during cell division.

Roles of ncRNAs in constitutive heterochromatin formation in human, plant, and Drosophila

Similar to S. pombe and mouse, pericentric heterochromatin in humans, Drosophila, and plants is also subjected to transcription to produce ncRNAs that contribute to the maintenance of H3K9me3 heterochromatin [90, 138, 149]. The expression levels of these ncRNAs and their biogenesis pathways vary among these species [36, 90, 149].

Human pericentromeric H3K9me3 heterochromatin is assembled on several different satDNA families, including satellites I, II, III, and the GC-rich beta and gamma satellites [133]. Transcripts produced from the human pericentromeric and centromeric regions show differential expression levels under changing environmental conditions and can therefore be seen as indicators of how a cell responds to its environment [36]. Human pericentromeric satellites undergo transcriptional activation in heat shock conditions that lead to cell stress and the activation of heat shock transcription factor 1 (HSF1), which then controls the further activation of heat shock genes [36]. HSF1 then binds to satellite III (SatIII) repeat arrays at the pericentric 9q12 locus, forming stress-induced nuclear structures known as nuclear stress bodies (nSBs) [111]. The formation of these nSBs leads to increased transcription of the SatIII repeats into large and stable RNAs [57]. A fraction of SatIII arrays has been found to normally exist in an open chromatin conformation that allows for low levels of transcription, but heat shock leads to a large increase in the expression of these RNAs [44]. Transcription of SatIII is highly asymmetrical, as most transcripts contain the G-rich strand of the repeat [139]. Upon heat shock and other stress treatments, including exposure to heavy metals, UV-C, oxidative, and hyperosmotic conditions, the amount of G-rich RNAs increases [139]. Additionally, several RNA-binding factors help SatIII RNAs associate with the nSBs at the SatIII locus [138]. The association of SatIII RNAs with the SatIII locus suggests that these RNAs might play a role in regulating further expression of these pericentromeric regions and might influence the chromatin structure of satellite regions.

Plant centromeric and pericentromeric regions also carry both satDNAs and retrotransposons that are largely heterochromatic in nature [42, 133]. For example, rice centromeres are enriched in centromeric retrotransposons of rice (CRR) that are marked with H3K9me3 [90]. Although CRR elements are constitutively transcribed in the tissues of the roots and leaves, the overall level of transcription is relatively low. Transcripts produced from CRR elements are processed into siRNAs via an RNAi pathway [11, 90]. Resulting siRNAs recognize and target the repetitive DNA sequences of the CRR elements to maintain their silenced heterochromatic nature [48].

The Drosophila genome contains transposons and several classes of complex DNA satellites (repeating units > 100 bp in length) that transcribe into different types of ncRNAs, such as endo-siRNAs and PIWI interacting RNAs (piRNAs) expressed within the germline [38, 115, 149]. Both piRNAs and the piRNA biogenesis pathway maintain the heterochromatic state of the underlying satDNA loci, regulating their transcription to preserve genome stability and integrity [15, 55]. Pericentromeric heterochromatin relies on PIWI-dependent H3K9me3 deposition at piRNA clusters [1]. Expression of piRNAs involves the Rhino-Deadlock-Cutoff (RDC) complex and the transcription factor Moonshiner [149]. Rhino, an HP1 variant, recruits Deadlock and Cutoff to form the RDC complex and to recognize H3K9me3-marked chromatin [158]. The transcription factor Moonshiner initiates transcription of dual-stranded piRNA clusters through its recruitment of TRF2 [4]. Disruption of the pathway producing these piRNAs leads to the loss of both satDNA-derived piRNAs and the repressive heterochromatin modifications found at these loci, suggesting that the piRNAs are necessary for regulating their own expression, as well as maintaining the pericentromeric heterochromatin through regulating the deposition of repressive epigenetic modifications [149].

Possible role of architectural RNA in the phase separation of pericentric H3K9me3 heterochromatin

Pericentromeric H3K9me3-marked heterochromatic domains are organized into large clusters known as chromocenters [49]. Chromocenters have recently been shown to undergo liquid–liquid phase separation (LLPS), owing to their inability to mix with the nucleoplasm [27]. HP1 exhibits LLPS behavior in vitro and plays an important role in the LLPS of pericentric heterochromatin [66, 88]. SAF-B also exhibits LLPS behavior, and its knockdown results in a dispersed localization of pericentromeric heterochromatin, suggesting that phase separation mediated by SAF-B is involved in forming chromocenter condensates [52]. Interestingly, both HP1 and SAF-B are RNA-binding proteins [52, 85]. The RNA-binding property of HP1 is required for H3K9me3 localization to the heterochromatin, as in situ depletion of RNA compromises the binding of HP1 to pericentromeric heterochromatin [85]. Furthermore, upon treatment with ribonuclease A (RNase A), SUV39H1 fails to bind to its target site, suggesting that SUV39H1-RNA interactions are necessary for recruiting, as well as retaining, SUV39H1 on the chromatin [56]. In the absence of RNA-dependent recruitment of SUV39H1 to the condensed pericentromeric heterochromatin, organization of this region is lost [131, 132]. Additionally, the specific depletion of mouse pericentromeric MSR transcripts leads to a decrease in chromocenters, suggesting a role for ncRNA in the maintenance and organization of these structures [52]. RNA plays an important role in the phase separation of other membraneless nuclear bodies, such as the nucleolus, due to multivalent weak interactions between RNA and RNA-binding proteins [14, 27, 28, 119, 120, 148]. Due to its ability to form diverse secondary structures, RNA can engage in a variety of multivalent interactions that are required for LLPS [27, 148].
Therefore, it has been proposed that RNA, especially architectural RNA, may play a crucial role in regulating the phase-separation behavior of chromocenters and can possibly contribute to the epigenetic nature of H3K9me3 heterochromatin [132].

Roles of lncRNAs in Dosage Compensation

The biological basis for the determination of sex lies in the inheritance of chromosomes. Among eukaryotic species, different sexes are characterized by their varying types and numbers of sex chromosomes. In both Drosophila melanogaster and eutherian mammals, females inherit two X chromosomes, while males inherit one X and one Y chromosome. Due to this unequal distribution of sex chromosomes, females carry twice as many copies of X-chromosome genes as their male counterparts [118]. While the X chromosome contains over 1,000 genes, many of which are essential for proper development in both sexes, the Y chromosome has lost most of its genes and has retained only those required for spermatogenesis and reproduction in males [19]. To ensure that males and females express equivalent amounts of genetic information and thereby avoid potential developmental defects, organisms undergo dosage compensation, an epigenetic process that equalizes the expression of genes on sex chromosomes between biological sexes. Although the mechanisms of dosage compensation differ between Drosophila and mammals, both require lncRNAs and serve as paradigms of lncRNA biology.

Dosage compensation in Drosophila

To balance the gene expression of sex chromosomes inherited by males and females, Drosophila males increase transcription of their X chromosome twofold. The upregulation of transcription on the male X chromosome is mediated by a ribonucleoprotein complex known as the dosage compensation complex (DCC), also known as the male-specific lethal (MSL) complex [75]. The DCC is composed of at least five proteins, MLE (maleless), MSL1 (male-specific lethal 1), MSL2, MSL3, and MOF (males absent on the first), and the lncRNAs roX1 and roX2, which are among the most well-studied lncRNAs in an epigenetic process. Mutations in the MSL proteins kill males during the late larval or early pupal stages but have no detectable effect in females, suggesting their role in regulating expression of the X chromosome [10]. The DCC is assembled in a stepwise manner: first, MSL1 and MSL2 bind to multiple high-affinity sites, two of which are the roX1 and roX2 loci located on the X chromosome. Next, roX1 and roX2 transcripts are integrated into the DCC by MLE, an RNA helicase. Subsequent recruitment and spreading of the DCC is highly dependent on roX RNAs. Insertion of either the roX1 or roX2 DNA sequence on an autosome as a transgene can recruit the MSL complex to the insertion site and result in the spreading of the complex into flanking autosomal DNA [60]. The roX1 and roX2 lncRNAs differ in both sequence and size (3.7 kb and 0.6 kb, respectively) and are redundant in their role in the assembly of a functional DCC, as demonstrated by the fact that mutations in either roX1 or roX2 alone have no effect on male viability. Both roX1 and roX2 lncRNAs contain evolutionarily conserved tandem stem-loops that bind to the DCC protein components MLE, an RNA helicase, and MSL2, a ubiquitin ligase [54].
The DCC component, MOF, a MYST family histone acetyltransferase that interacts with MSL1 and MSL3 as well as RNA, then adds the histone H4 lysine 16 acetylation (H4K16ac) modification to the male X chromosome [84, 137]. H4K16ac decompacts the chromatin on the male X chromosome, making it more accessible for transcription factors, leading to increased expression of the genes located on the male X chromosome. The mature DCC spreads in cis and coats the entire X chromosome [117]. Additionally, the roX lncRNAs and MSL complex allow for the enrichment of JIL-1, a tandem kinase, which phosphorylates histone H3 on Ser10 (H3S10) on the X chromosome [74]. The phosphorylation of H3S10 helps to further decompact the chromatin, leading to the activation of gene expression. Although H3S10 phosphorylation leads to decompaction in the male X chromosome of Drosophila, it has also been implicated in both compaction and decompaction, depending on the context of the conditions and modifications of surrounding chromatin [146]. Together, these activating histone modifications are necessary for the stable maintenance of chromatin structure to ensure dosage compensation and male viability in Drosophila throughout the entire lifetime.

Dosage compensation in mammals

In eutherian mammals, one of the two female X chromosomes is transcriptionally silenced via epigenetic mechanisms during early development to equalize gene expression between sexes [102]. The process by which one of the two female X chromosomes is chosen for silencing and the other is chosen for activation is random. In female mice, X chromosome inactivation (XCI) takes place in two stages [140]. The first, called imprinted XCI, preferentially silences the paternal X chromosome and occurs during the pre-implantation phase. Imprinted XCI is retained in placental tissues but reversed in the embryo proper. The embryo proper undergoes a second round of XCI, which is random in nature. Once the X chromosome to be inactivated is selected during the second wave, the same X chromosome is silenced during subsequent mitotic divisions in daughter cells [114].

Interactions of lncRNAs with PRC2

XCI is modulated by a pair of lncRNAs, Xist and Tsix, that are expressed from the inactive X chromosome (Xi) and the active X chromosome (Xa), respectively [95]. The Xist lncRNA is transcribed from the X-inactivation center and spreads along the entire length of the Xi in cis, excluding a small number of escapee genes that remain active [122]. Xist transcripts initially bind to the most gene-dense regions and then extend over the rest of the chromosome [24]. Xist then mediates the recruitment of the gene-silencing protein complex known as polycomb repressive complex 2 (PRC2) along the entire length of the Xi. The PRC2 complex coats the Xi with the repressive heterochromatin modification, H3K27me3, which silences the Xi by excluding RNAPII and preventing transcription from occurring [24]. Xist lncRNA is approximately 17 kb long and contains repeats named A–F, which are conserved between human and mouse and play important roles in recruiting the PRC2 complex to the Xi [18, 20]. Repeat A regulates Xist transcription to trigger gene silencing. Repeats C, E, and F together recruit the Xist transcript to the Xi, and repeats B and C are required for the recruitment of PRC1 to the Xi [9, 89, 153]. The Xist transcript also contains a region called RepA, a ~1.2 kb ncRNA within the 5’ end of the Xist transcript that overlaps with repeat A. PRC2 is initially recruited to the X chromosome by the RepA region of the Xist lncRNA via binding to the catalytic subunit of PRC2, the H3K27me3 HMT called EZH2. Additionally, it has been found that a PRC2 cofactor, Jarid2, mediates binding of PRC2 to the Xist transcript, and that ATRX (a chromatin remodeler) facilitates loading of PRC2 onto the RepA region [17]. RepA-dependent H3K27me3 modification of the Xi leads to chromatin compaction and silencing [159]. The H3K27me3-coated Xi appears as a condensed heterochromatic structure in the nucleus known as the Barr body [94].

The antisense Tsix lncRNA represses Xist transcription on the Xa chromosome and prevents Xist from accumulating and creating heterochromatic regions on the Xa, keeping the Xa available for expression [114]. The changes associated with XCI represent lncRNA-driven epigenetic modifications that lead to stable repression of the Xi that is sustained through cell divisions.

Interactions of lncRNAs with histone H2A variant MacroH2A1

The nucleosomes within the chromatin of the Xi of female mammals are distinct from other nucleosomes in that they are enriched with the histone H2A variant, macroH2A, which contains a C-terminal non-histone domain called a macro domain [128]. The presence of the macro domain makes macroH2A much larger than the typical H2A [130]. The macro domain is suggested to provide a binding site for chromatin regulators. The chromatin region containing macroH2A forms what is known as the macrochromatin body (MCB), which is found solely on the Xi, and not on the Xa [82]. MacroH2A physically interacts with the Xist transcript, allowing it to target Xist directly to the Xi [25]. This targeting to the Xi plays a role in directing silencing solely to the proper chromosome and ensuring that silencing is upheld. The variant histone portion of macroH2A contains the information necessary for directing silencing [25].

Interactions of lncRNAs with CTCF (CCCTC-binding factor)

A critical step in the process of XCI is the selection of which chromosome will be inactivated to ensure that the silencing only affects one of the two female X chromosomes. To achieve the differential designation as active Xa and inactive Xi, homologous X chromosomes communicate in trans through homologous pairing at the imprinting/choice center carrying the lncRNAs Tsix and Xite [155]. The homologous pairing of X chromosomes involves interactions of Tsix and Xite with a trans-acting factor, CTCF, which is a transcription factor and insulator protein carrying multiple zinc fingers [154]. The imprinting/choice center contains 60–70 base pair repeats that carry binding sites for CTCF [100]. CTCF interactions with both the Xite and Tsix lncRNAs guide CTCF in cis to the X chromosome inactivation center, where it then mediates transient long-range homologous pairing between Xs [64]. CTCF binds to Xa and Xi differentially, and the cross-talk between the two X chromosomes generates asymmetry in the chromatin states of the two Xs, establishing a regulatable epigenetic switch to facilitate the mutually exclusive silencing of the two X chromosomes [5, 64].

CTCF binding sites are also present in the Xist promoter, and CTCF binding to the Xist promoter represses Xist transcription in pre-XCI cells [127]. At the onset of XCI, Jpx lncRNA, which is also transcribed from the X chromosome inactivation center, relieves CTCF-mediated Xist repression by binding to CTCF and titrating it away from the Xist promoter [127, 134]. CTCF is also suggested to have a role in allowing a small subset of genes on the inactive X chromosome to escape the silencing process, so that they remain active throughout XCI. The type and number of specific escape genes vary among organisms and cell types [34]. Escape genes lack H3K27me3 and are enriched in the activating modification H3K36me3, as well as transcription elongation marks. CTCF clusters have been found to be enriched near escape regions, and CTCF binding is associated with increased levels of escape, suggesting that CTCF prevents the spreading of silencing into the escape region [39]. It remains to be seen whether interactions of CTCF with lncRNAs are involved in XCI escape.

Xist as a molecular scaffold

Multiple repeat sequences within the Xist lncRNA can engage in inter-repeat binding and fold in three-dimensional (3D) space [101]. This 3D structuring allows Xist to act as a molecular scaffold onto which several proteins are recruited. By scaffolding these proteins, Xist coordinates their function at targeted chromosomal regions. Additionally, Xist further organizes the Xi in 3D by bringing together specific, distant regions of the chromosome [160]. This allows Xist to target proteins within the scaffold to precise loci on the Xi for silencing and to maintain the organization of the silenced chromatin. Over thirty different RNA-binding proteins directly interact with Xist [101]. Xist is able to interact with each of these proteins, allowing for organized and highly regulated interactions with the Xi. Many of the proteins that direct how Xist interacts with the Xi are part of the insoluble nuclear scaffold, also known as the nuclear matrix [31]. Xist is structurally embedded with proteins of the nuclear matrix, which serve to direct its scaffolding activities. An important nuclear matrix protein that interacts with Xist is scaffold attachment factor-A (SAF-A), also known as heterogeneous nuclear ribonucleoprotein U (hnRNPU) [16]. SAF-A is enriched on the Xi and plays a role in anchoring Xist to the Xi, ensuring that it remains at the Xi to function [31]. SAF-A contains both RNA- and DNA-binding domains, allowing it to associate with both Xist and the Xi and thereby serve as a bridge that localizes the Xist transcript to the locations required for silencing [51].

Overall, the lncRNA-mediated epigenetic process of dosage compensation is essential for proper cell and organismal functioning and development, as it ensures equal gene dosage between the sexes in both mammals and Drosophila (Fig. 2). That two different phylogenetic groups use different modes of action to achieve the same functional outcome (i.e., balancing gene expression between the sexes) demonstrates the highly versatile nature of lncRNAs. Moreover, the ability of the lncRNAs Tsix and Xite to differentiate between the two Xs by creating a regulatable switch for XCI reveals that lncRNAs are involved in key regulatory decision-making cellular processes (e.g., the choice of X chromosome to be inactivated). Additionally, lncRNAs such as Xist can behave as architectural molecules by forming nuclear scaffolds involved in the 3D organization of the Xi.

Fig. 2

A comparison of the roles of lncRNAs in regulating dosage compensation of sex chromosomes in Drosophila and mammals. In both Drosophila and mammals, the principle of dosage compensation is to use lncRNAs to induce epigenetic modifications on sex chromosomes, so that they are expressed differentially in one sex in order to balance gene products. However, the manner in which Drosophila and mammals achieve this common goal varies greatly. Male Drosophila upregulate the expression of their one X chromosome twofold in order to match the expression level of the two female X chromosomes (top). To do this, Drosophila utilize the lncRNAs roX1 and roX2 to help assemble and recruit the DCC, composed of MLE, MSL1, MSL2, MSL3, and MOF, to the X chromosome. The DCC mediates the deposition of H4K16ac onto the chromatin. This modification leads to the recruitment of JIL-1 kinase, which phosphorylates H3S10. Together, H4K16ac and H3S10 phosphorylation lead to chromatin decompaction, thereby increasing gene expression. Female mammals silence one of their two X chromosomes in order to equalize their gene expression with males (bottom). On the female X chromosome that will be silenced, known as the Xi, the X inactivation center is transcribed by RNAPII to produce the lncRNA Xist. Xist then coats the entire length of the Xi, where it recruits PRC2, which deposits H3K27me3 onto the chromatin. This modification results in the compaction of the chromatin, leading to the formation of a condensed structure known as the Barr body. This results in gene repression and inactivation of the X chromosome.

Transgenerational epigenetic inheritance in C. elegans

Transgenerational inheritance involves the propagation of epigenetic changes acquired during the lifetime of an individual to offspring throughout multiple generations [98]. Transgenerational inheritance is involved in the regulation of traits associated with lifespan/longevity and immunity in C. elegans, a species of roundworm [47]. Transgenerational inheritance in C. elegans is a prime example of how ncRNAs exert their effects on organisms and cells in a very stable and long-term manner. When C. elegans take in foreign double-stranded RNA (dsRNA) by ingesting bacteria that express dsRNAs, RNA interference (RNAi)-mediated silencing is triggered in both somatic and germline cells [40]. The silencing is maintained in future generations (F3) in the absence of the initial dsRNA supply. Therefore, the inheritance of the RNAi response in the offspring of C. elegans is a classic example of transgenerational inheritance. In this RNAi pathway, the foreign dsRNAs taken in from the environment by C. elegans are processed into primary siRNAs [106]. This mechanism involves the generation and amplification of siRNAs via RNA-dependent RNA polymerase (RdRP), and their transfer from somatic cells to the germline. The RdRP-mediated amplification of siRNAs prevents their dilution when they are transferred across multiple generations. Classical RNAi silences the expression of the target locus post-transcriptionally: siRNAs bind to transcripts in the cytoplasm, thereby blocking protein synthesis. However, RNAi can also occur in the nucleus, where it involves a change in chromatin structure through the recruitment and action of chromatin-modifying complexes, leading to the regulation of transcription. Chromatin-modifying proteins are involved in long-term RNAi inheritance, but the exact mechanism by which chromatin modifiers affect siRNA biosynthesis to mediate transgenerational inheritance remains unclear [141].
RNAi triggers HMTs to engage in silencing by adding the repressive H3K9 trimethylation modification to target loci [123]. H3K9me3 modifications serve as the template for chromatin modifiers that induce chromatin compaction, making DNA less accessible to the transcription machinery, thereby leading to gene silencing [70]. In C. elegans, H3K9 trimethylation is a two-step process in which the HMT MET-2 deposits mono- and di-methyl groups on H3K9, and the third methyl group is subsequently added by SET-25. Another H3K9 HMT, SET-32, promotes the initial establishment of RNAi silencing, while all three HMTs (MET-2, SET-25, and SET-32) are involved in the maintenance of silencing [59]. Once this modification is established, the initial silencing trigger is no longer needed, which is one of the features of epigenetic marks. These epigenetic changes can last at least twenty generations. Upon exposure to high temperatures, the HMT SET-25 is inhibited, leading to derepression of target genes [37]. This derepression was found to be inherited even though subsequent generations were raised at normal temperatures, demonstrating that these epigenetic states are maintained despite environmental differences [37]. MET-2 has also been shown to be involved in the biogenesis of heritable siRNAs. Although RNAi is stably inherited transgenerationally in met-2 mutants, siRNAs targeting repetitive elements are dramatically reduced [71]. These repetitive elements, which are enriched for H3K9me2 in the wild type, are expressed in met-2 mutants [79]. It remains to be investigated whether chromatin modifications can themselves propagate transgenerationally once established after the initial RNAi response. Epigenetic silencing in C. elegans is suggested to depend upon the balance between heterochromatic H3K9 methylation and euchromatic H3K4 methylation, both of which impact transgenerational epigenetic inheritance. C. elegans lacking the H3K4me1/me2-specific demethylase spr-5 exhibit normal fertility in the P0 generation, but display increasing infertility in successive generations concomitant with global accumulation of H3K4me2 [45]. The increase in H3K4me2 in spr-5 mutants is associated with decreased H3K9me3 across multiple generations [45]. The H3K4me3 chromatin modification has also been found to be involved in the regulation of lifespan in C. elegans. Mutations in components of the H3K4me3-modifying Trithorax HMT complex in the parents increase the lifespan of offspring up until the third generation [46].

Transgenerational inheritance of silencing via chromatin modifiers in C. elegans demonstrates how ncRNAs are involved in inducing stable changes in chromatin structure that accumulate over the lifetime and can be passed to offspring (Fig. 3). The ability of ncRNAs to stably silence chromatin in future generations, even after the initial silencing signal has been removed, reveals the importance of ncRNAs in epigenetics. These findings have potentially far-reaching implications for human health and disease, given that these epigenetic modifications have the potential to affect the fertility, lifespan, and immunity of progeny. This view of ncRNAs provides insight into how the effects of the environment have the potential to shape gene expression in future generations.

Fig. 3

The role of ncRNAs in the regulation of epigenetic modifications in transgenerational inheritance in C. elegans. Transgenerational inheritance begins with C. elegans taking in foreign dsRNAs from the surrounding environment. Once taken in, these dsRNAs are processed into siRNAs, which are amplified by RNA-dependent RNA polymerase. These siRNAs recruit and direct the HMTs MET-2 and SET-25, which deposit the H3K9me3 modification, leading to compaction of the chromatin. Once recruited, the HMT SET-32 helps to further maintain compaction, thereby limiting the access of transcription factors to the chromatin and leading to decreased expression of these genes. This H3K9me3 modification and the subsequent chromatin compaction are propagated to future generations. Future generations experience inherited silencing and control of gene expression, even though the initial silencing signal was experienced only by their parents.

Role of ncRNAs in centromere function

Centromeres are constricted regions on chromosomes where a multiprotein structure called the kinetochore is assembled. The kinetochore connects to spindle microtubules, which pull the duplicated sister chromatids apart during cell division. Correct chromosomal segregation is important for preventing aneuploidy and ensuring that daughter cells receive the proper and equal amounts of genetic material. Centromeres contain a specialized chromatin in which histone H3 is replaced by its variant CENP-A, which is considered to be the epigenetic mark of centromeres [12]. The presence of CENP-A serves to physically distinguish centromeric chromatin from the rest of the chromosome. CENP-A chromatin acts as the foundation for the assembly of kinetochores and is stable across cell divisions, as this mark is maintained to specify centromeric chromatin in daughter cells as well [62]. Centromeric DNA evolves rapidly, even among closely related species, and in the absence of conserved DNA sequences, CENP-A is proposed to recognize centromeric regions epigenetically. However, the exact mechanism by which CENP-A is targeted to centromeric DNA remains unclear.

Although once assumed to be inactive and silent regions, centromeres have now been shown to be transcribed at low levels into cenRNAs in a variety of organisms [112, 135, 151]. Both sense and antisense cenRNAs are produced because the centromere satellite region is transcribed from both DNA strands [53]. Human RNAPII and its binding partner, TBP, localize to centromeres during the G1 phase of the cell cycle [103]. When RNAPI or RNAPII is inhibited from transcribing these centromeric regions, chromosomes mis-segregate and the kinetochore does not assemble properly, suggesting that the ncRNA transcripts produced from this region play a key role in cell division [151]. Additionally, both RNAPII and TBP interact with CENP-A and its binding partner CENP-C [103].

Drosophila repeat satellite III (SAT III) from the X chromosome produces a lncRNA (SAT III RNA), which localizes to the centromeres of all chromosomes, and its depletion causes mitotic defects [112]. SAT III lncRNA also interacts with CENP-C and is required for the localization of CENP-A and CENP-C, as well as outer kinetochore proteins [112]. Maize cenRNAs are also an integral component of the kinetochore, and maize CENP-C directly binds to single-stranded RNA, but does not show specificity for cenRNAs [35, 135].

Mammalian centromeric DNA consists of arrays of satellite repeats. In humans, centromeric DNA consists of arrays of tandemly repeated 171 bp α-satellites that can be up to several megabases long. Human centromeric α-satellites produce chromosome-specific RNAs and are transcribed during late mitosis and G1 into cenRNAs [26, 80, 103]. Human cenRNAs localize to the interphase nucleolus along with the centromeric proteins CENP-C and INCENP in an RNAPI-dependent manner. These cenRNAs are required for the nucleolar targeting of CENP-C and INCENP [151]. CenRNAs produced from functional centromeric α-satellite arrays (called active arrays) interact with CENP-A and CENP-C [80]. RNAs from non-functional α-satellite arrays (called inactive arrays) are less stable and interact with another centromeric protein called CENP-B [80]. The stability of cenRNAs is not affected by the depletion of CENP-A. However, when array-specific RNAs are depleted, both CENP-A and CENP-C decrease at the centromere, leading to cell arrest before mitosis, suggesting that cenRNAs act upstream of CENP-A chromatin assembly.

CENP-A is recruited to centromeres by the CENP-A-specific chaperone, Holliday junction recognition protein (HJURP) [7]. A 1.3 kb cenRNA physically associates with the soluble HJURP/CENP-A pre-assembly complex in early G1 (eG1), and inhibition of RNAPII leads to a reduction in the levels of CENP-A and HJURP at human centromeres, suggesting that cenRNAs play an important role in the formation of centromeric chromatin [103]. CenRNAs also interact with Aurora Kinase B, a protein kinase involved in cell division. Regulation by cenRNAs ensures proper functioning of Aurora Kinase B to promote error-free chromosomal segregation throughout mitosis [53]. Similarly, mouse cenRNAs transcribed from the minor satellites interact with CENP-A, as well as Aurora Kinase B, demonstrating that these specific associations are well conserved in mammals [53, 97]. Upon treatment with ribonuclease, an electron-dense layer between centromeric chromatin and the kinetochore is lost, suggesting that RNA also plays a structural role in maintaining centromere structure [110].

In summary, cenRNAs regulate the function of the kinetochore, and therefore play an important role in chromosomal segregation (Fig. 4) [12, 53, 103]. Studies suggest that cenRNAs are involved in recruiting, as well as maintaining, the CENP-A chromatin, which serves as the epigenetic mark of the centromere and is one of the key factors required for proper chromosomal segregation [12]. We speculate that these cenRNAs ensure that the specific mark of centromeric chromatin is recruited solely to the correct region, helping to distinguish centromeric chromatin from the rest of the chromosome.

Fig. 4

The role of cenRNAs in the maintenance of centromeric chromatin identity and function. Centromeric chromatin is distinct from other chromatin regions on the chromosome in that it contains the special epigenetic feature of CENP-A nucleosomes. Centromeric satellite repeats are transcribed into cenRNAs. These cenRNAs interact with the CENP-A-specific chaperone HJURP and contribute to recruiting and targeting CENP-A to the centromeric region for deposition, leading to the formation of specialized CENP-A chromatin and centromere specification. CenRNAs also interact with and help recruit CENP-C, a binding partner of CENP-A and an essential kinetochore protein, to centromeric chromatin. Both the role of cenRNAs in the formation of centromeric chromatin and their interactions with kinetochore proteins are important for proper chromosome segregation.

Conclusion and future perspective

Despite initially being considered to have little functional significance, ncRNAs have now been shown to regulate important epigenetic processes in a wide range of organisms. Through their regulation of chromatin structure by recruiting histone- and chromatin-modifying complexes, as well as acting as scaffolds, ncRNAs exert long-term effects on the regulation of gene expression by directing the recruitment of epigenetic marks. A comparison of the modes of action of ncRNAs in various epigenetic pathways, ranging from X chromosome regulation in mammals and Drosophila to transgenerational epigenetic inheritance in C. elegans, has revealed that not only do the ncRNAs involved in these pathways share striking similarities, but they also exhibit versatility by binding to a variety of proteins involved in species-specific processes. For example, the lncRNAs that regulate dosage compensation work through differing mechanisms and achieve varying end results in mammals and Drosophila under the common goal of maintaining equivalent expression levels of sex chromosomes between males and females. By interacting with specific histone- and chromatin-modifying proteins that recruit varying epigenetic modifications, lncRNAs achieve differential levels of transcription. In Drosophila, lncRNAs recruit chromatin-modifying proteins that facilitate decompaction of chromatin and an upregulation of gene expression on the male X chromosome [75]. Conversely, in mammals, lncRNAs recruit chromatin-modifying proteins that facilitate chromatin compaction and silencing of gene expression on one of the female X chromosomes [24]. The outcome of an ncRNA pathway is therefore intimately tied to the protein factors and complexes that the specific ncRNA is able to interact with and recruit to the target gene. The commonality in these pathways is evident: ncRNAs regulate chromatin structure through their ability to recruit, direct, and regulate chromatin-modifying complexes.

The field of ncRNAs and epigenetics has garnered more attention after the realization that changes to an organism’s environment can lead to the recruitment of epigenetic modifications that alter the levels of transcription of genes that have essential cellular functions [41, 61, 63, 105, 107]. Because ncRNAs play an important role in directing the recruitment of epigenetic marks that can be inherited across cell divisions and generations, they serve as a potential avenue for understanding how certain diseases begin and progress under particular environmental and dietary conditions. Understanding the intricacies by which ncRNAs are able to regulate chromatin states and how their misregulation can lead to the onset of disease serves as an important field for future study. Through this review, we have discussed how ncRNAs are incredibly versatile regulators of chromatin structure in epigenetic processes. Although tremendous diversity and versatility have already been uncovered for roles of ncRNAs, future studies are needed to understand how they maintain and regulate epigenetic states over such long periods. Comprehensive studies of these epigenetic pathways, including further details of the interactions between ncRNAs and chromatin-associated proteins, as well as the discovery of additional ncRNA-interacting chromatin regulators, will help shed more light on how relevant ncRNAs are to the overall field of biology.