Tandemly repetitive DNA

The genomes of eukaryotic species are made up of large amounts of repeated sequences. Among them, the most abundant fraction consists of satellite DNA (satDNA), whose monomeric units are organized in tandem in long arrays preferentially located at centromeric, pericentromeric, and subtelomeric regions but also at interstitial positions (Henikoff and Dalal 2005; Plohl et al. 2012). In each species, several satDNAs showing differences in monomer length, nucleotide sequence, and complexity may coexist. SatDNAs are characterized by a high evolutionary mutation rate, resulting in species-specific repetitive elements. However, this is not a feature of all satDNAs, and some of them persist in genomes for long evolutionary times. The preservation of some satDNAs or particular motifs in evolutionarily distant species suggests that the evolution of such elements is influenced by selective constraints (Mravinać et al. 2005; Petraccioli et al. 2015). For example, the CENP-B box is a 17 bp motif that binds the centromeric protein CENP-B and is present in human centromeric alpha satellite DNA (Masumoto et al. 1989). This motif has been found in other mammals and in satDNAs of a variety of organisms such as insects (Lorite et al. 2002), molluscs (Canapa et al. 2000), and nematodes (Meštrović et al. 2013). Moreover, another hallmark of centromeric satDNAs is a monomer length of 170 or 340 bp, suitable for wrapping one or two nucleosomes, respectively (Henikoff et al. 2001).

The repeats of some satDNAs are organized in higher-order repeat (HOR) units, structures in which different monomers merge to form a new, larger repeat unit; sequence similarity is high between HORs but not between the monomers within them (Heslop-Harrison et al. 2003; Plohl et al. 2012). Besides monomer length, nucleotide sequence, and complexity, satDNAs also differ in copy number as a result of unequal crossing over. Indeed, according to the “library” model (Fry and Salser 1977; Plohl et al. 2008) and the variant library model (Cesari et al. 2003; Plohl et al. 2012), satellite DNAs and/or monomer variants could be shared by related species and could undergo amplification, leading to a higher abundance in some of them. Evidence of a satDNA library has been reported in many species from invertebrates to vertebrates (Meštrović et al. 1998, 2006; Martìnez-Lage et al. 2005; Caraballo et al. 2010; Vittorazzi et al. 2014). Changes in copy number might influence karyotype stability, since chromosomal fusions seem to be related to contractions of centromeric satDNAs and translocations to expansions (Slamovits et al. 2001; Bulazel et al. 2007).

The functional significance of satDNA has long been debated. Because of its correspondence to the heterochromatic fraction, its apparent absence of transcriptional activity, and its high sequence divergence across species, this DNA was labeled as “junk” (Ohno 1972; Palazzo and Gregory 2014). However, several lines of evidence have challenged this view, assigning satDNA a role in genomic functions such as centromere structure, kinetochore assembly, chromosome pairing, and segregation. The evolution of repetitive DNA has been related to reproductive isolation and hence to the onset of new species; moreover, it has been suggested as a force driving genome integrity and karyotype evolution (reviewed in Shapiro et al. 2005 and Plohl et al. 2012).

Over the years, an increasing number of studies have provided evidence that satDNA is also transcriptionally active in vertebrates, invertebrates, and plants (Ugarković 2005). Indeed, in humans, 97–98 % of the genome produces stable RNAs, the so-called dark matter RNA (Pennisi 2010). Most of these transcripts are noncoding RNAs (ncRNAs) and are involved not only in heterochromatin maintenance and in centromere and kinetochore assembly but also in the regulation of gene expression in several biological contexts (reviewed in Pezer et al. 2012).

In particular, this review summarizes studies dealing with the functional importance of satellite ncRNAs describing their involvement in centromeric heterochromatin formation in yeasts and mammals, the multiple roles at telomeres and the effects in cellular stress and disease.

Pericentromeric, centromeric, and telomeric heterochromatin

Chromatin is cytologically distinguished into euchromatin and heterochromatin on the basis of chromosomal stains (Heitz 1928). The former is characterized by a low degree of condensation and is transcriptionally active; the latter maintains a condensed state throughout interphase and comprises facultative and constitutive heterochromatin. The activity of facultative heterochromatin depends on the developmental stage, while constitutive heterochromatin, previously thought to be transcriptionally silent throughout the cell cycle, is made up mainly of highly repetitive DNA (Choo 1997), replicates late in S phase, and is located at pericentromeric, centromeric, and telomeric regions (Fig. 1). Pericentromeric heterochromatin lies on both sides of the centromere, is essential for sister chromatid cohesion, and is marked by di- and tri-methylation of lysine residues 9 and 27 of histone H3 tails (H3K9me2, H3K9me3, H3K27me2, and H3K27me3) by lysine methyltransferases (Martin and Zhang 2005). Nonhistone proteins are also associated with pericentromeric heterochromatin, such as the chromodomain protein heterochromatin protein 1 (HP1) in humans or Swi6 in Schizosaccharomyces pombe, which specifically recognize the H3K9 modification (Kwon and Workman 2008; Lachner et al. 2001). At the centromere, heterochromatin contains the histone H3 variant CENP-A in mammals or CID in Drosophila, interspersed with histone H3 (Blower et al. 2002) marked with mono- and di-methylation of lysine 4 (H3K4me1 and H3K4me2) and di- and tri-methylation of lysine 36 (H3K36me2 and H3K36me3). The centromere plays a pivotal role in chromosome segregation during cell division and is associated with the kinetochore, a protein structure that mediates the attachment of spindle microtubules, ensuring the correct distribution of the replicated chromosomes. Although this function is evolutionarily conserved, the sequences present at this region are highly divergent among organisms, so that Henikoff et al. (2001) coined the term “centromere paradox.” The identity and functions of these regions seem to be regulated epigenetically by the presence of specific histone variants and by posttranslational modifications of the histone tails that may influence the activity of the chromatin.

Fig. 1

Schematic representation of centromeric and telomeric heterochromatin. a Centromeric heterochromatin: long tandem array of human higher-order repeats (multicolored arrows, HOR) made up of a set of 171 bp alpha satellite monomers, flanked by monomeric units arranged without periodicity; mouse centromere constituted by Major satellite (6 Mb of 234 bp units) located at the pericentromere and Minor satellite (600 kb of 120 bp units) located at the primary constriction; fission yeast centromere including a central region (cnt) flanked by large inverted innermost repeats (innr), which in turn are flanked by tandem copies of outermost elements composed of dg and dh repeats. b Telomeric heterochromatin: in vertebrate telomeres, a variable number of TTAGGG repeats constitute tandem arrays ending with a single-stranded tail termed the G-overhang. The Shelterin complex, composed of six proteins in humans (TRF1; TRF2; POT1, a single-stranded DNA-binding protein that directly recognizes TTAGGG repeats; TIN2; TPP1; and Rap1), associates with telomeres to form a protective structure called the T-loop (modified from O’Sullivan and Karlseder 2010). The symbol “\\” indicates that the cluster comprises more units than those represented. The schemes are not to scale

Like pericentromeres and centromeres, telomeres are heterochromatic structures, but they are located at the ends of linear eukaryotic chromosomes, where they protect against DNA damage by preventing, for example, inappropriate recombination events. In vertebrates, these regions are composed of a variable number of TTAGGG repeats constituting tandem arrays in which the G-rich leading strand extends in the 3′ direction, forming a single-stranded tail termed the G-overhang. To prevent this region from being misidentified as DNA damage by the DNA damage response machinery, the Shelterin complex, composed of six proteins in humans (TRF1, TRF2, POT1, TIN2, TPP1, and Rap1), associates with telomeres and interacts with other factors, allowing the generation of a protective structure called the “T-loop”: the single-stranded G-overhang invades the double-stranded TTAGGG repeats, and the resulting loop protects this region from DNA repair activities (Fig. 1b) (reviewed in O’Sullivan and Karlseder 2010). Telomeres are subject to progressive shortening (40–200 bp per cell division) during semiconservative DNA replication, since the DNA polymerase involved is unable to completely replicate the extremities of linear DNA. Telomere length may reach a critical size, which either represents a signal to stop dividing and initiate senescence or favors aberrant chromosomal rearrangements that are lethal to the cell. In most vertebrates, this problem is counteracted by the reverse-transcribing activity of telomerase, which adds species-specific telomeric sequences de novo onto chromosome ends. Telomerase is a ribonucleoprotein enzyme consisting of the protein telomerase reverse transcriptase (TERT), the RNA subunit telomerase template RNA (TER), which provides the template for repeat synthesis at chromosome ends, and the auxiliary protein dyskerin (DKC1). This complex is assembled in the Cajal body in the nucleus and is transferred to telomeres by the telomerase Cajal body protein 1 (TCAB1). The Shelterin complex is also required to recruit telomerase to telomeres.
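
As a rough illustration of the shortening arithmetic described above, the following Python sketch estimates how many cell divisions it would take for a telomere to reach a critical length in the absence of telomerase. Only the loss of 40–200 bp per division comes from the text; the function name, the starting length (10 kb), and the critical length (4 kb) are hypothetical values chosen purely for illustration, and the model ignores any cell-to-cell or chromosome-to-chromosome variation.

```python
def divisions_until_critical(initial_bp, critical_bp, loss_per_division_bp):
    """Estimate the number of divisions after which telomere length
    is at or below critical_bp, assuming a constant loss per division
    and no telomerase activity (a deliberately simplified model)."""
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    if initial_bp <= critical_bp:
        return 0
    span = initial_bp - critical_bp
    # Ceiling division: the last, partial step still requires a full division.
    return -(-span // loss_per_division_bp)


if __name__ == "__main__":
    # Hypothetical lengths; only the 40-200 bp range is taken from the text.
    INITIAL_BP = 10_000
    CRITICAL_BP = 4_000
    for loss in (40, 200):
        n = divisions_until_critical(INITIAL_BP, CRITICAL_BP, loss)
        print(f"{loss} bp lost per division: about {n} divisions")
```

With these assumed lengths, the estimate ranges from 30 divisions (200 bp lost per division) to 150 divisions (40 bp lost per division), which shows how strongly the replicative limit depends on the per-division loss.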

These chromosome regions are made up of heterochromatin, are condensed and inaccessible to the transcriptional machinery, and have been thought to be static and inert. However, an increasing number of studies are providing evidence of transcriptional activity in constitutive heterochromatin, challenging this view and suggesting an unexpected dynamism.

Roles of pericentromeric and centromeric ncRNAs

Transcripts homologous to centromeric and pericentromeric repetitive sequences have been identified in several organisms such as yeast (Ohkuni and Kitagawa 2011; Choi et al. 2012), mouse (Ferri et al. 2009), and humans (Saffery et al. 2003; Wong et al. 2007). The assembly and function of these chromosome regions are strictly linked to the transcription of repetitive sequences which they contain. Indeed defects in transcription at these loci lead to chromosome mis-segregation during cell division as it has been documented in yeast (Ohkuni and Kitagawa 2011), in HeLa cells (Chan et al. 2012), and in human artificial chromosomes (HACs) (Bergmann et al. 2012).

The RNAi-mediated heterochromatin assembly in fission yeast

In S. pombe, the RNAs transcribed from the pericentromeric repetitive elements have been proposed to be involved in heterochromatin formation and maintenance through the RNA interference (RNAi) machinery (Volpe et al. 2002; Verdel et al. 2004; Motamedi et al. 2004). RNAi is a conserved mechanism employed not only to silence target mRNAs at the transcriptional and posttranscriptional levels but also to counteract transposons and viral invasions (Ghildiyal and Zamore 2009). Double-stranded RNAs (dsRNAs) are cleaved by Dicer, a ribonuclease III, into small interfering RNAs (siRNAs). These RNA molecules are 22–24 nucleotides long and bind an Argonaute/PIWI family protein contained in the RNA-induced silencing complex (RISC) to target cognate mRNAs for silencing.

In fission yeast, RNA polymerase II (RNA Pol II) transcribes centromeric and pericentromeric repetitive sequences into long noncoding RNAs (lncRNAs). These molecules are converted into double-stranded RNA by Rdp1, an RNA-directed RNA polymerase that, together with the helicase Hrr1 and Cid12, a member of the poly(A) polymerase family of enzymes, forms the RNA-directed RNA polymerase complex (RDRC). Dicer cleaves the resulting dsRNAs into siRNAs, which recruit factors employed in heterochromatin assembly. siRNAs are loaded onto Ago1, a member of the Argonaute/PIWI family, which forms the ARC complex with two other proteins, Arb1 and Arb2 (Buker et al. 2007). Subsequently, Ago1 together with the chromodomain protein Chp1 and the targeting complex subunit 3 (Tas3) forms the RNA-induced initiation of transcriptional gene silencing (RITS) complex. This complex uses single-stranded siRNAs bound to Ago1 to recognize and target specific chromosome regions by a mechanism that probably involves either siRNA-DNA or siRNA-nascent transcript base-pairing interactions. Moreover, RITS acts as a “priming complex” that recruits RDRC, in which Rdp1 uses the single-stranded siRNA as a primer to synthesize dsRNAs, thereby completing the siRNA amplification circle.

The localization of RITS to centromeric DNA repeats and to centromeric transcripts depends on the Clr4 methyltransferase. Clr4 methylates histone H3 at lysine 9, providing a binding site for Chp1 and stabilizing the tethering of RITS. Clr4 is part of the Clr4-Rik1-Cul4 complex (CLRC) (Zhang et al. 2008), which is associated with RITS through the LIM-domain protein Stc1 (Bayne et al. 2010). Clr4 binds H3K9-methylated histones through its chromodomain, stabilizing the association of CLRC with heterochromatic loci. Moreover, Clr4 modifies adjacent histones, creating additional binding sites for CLRC and for other chromodomain proteins such as Swi6, which recruits SHREC, thus promoting heterochromatin spreading.

Recently, Kowalik and coworkers (2015) showed that RITS is negatively controlled by the highly conserved RNA polymerase-associated factor 1 complex (Paf1C), which promotes efficient transcription termination and rapid release of the RNA from the site of transcription, preventing targeting of nascent transcripts by RITS and CLRC. Consequently, stable heterochromatin cannot be established (Fig. 2a). siRNAs might also be generated from ectopic hairpin RNA, bypassing the requirement for the RITS-RDRC-dependent siRNA amplification circle (Fig. 2a) (Iida et al. 2008; Djupedal et al. 2009). Paradoxically, heterochromatin needs to be transcriptionally active to maintain its inactive state.

Fig. 2

Heterochromatin formation mediated by centromeric and pericentromeric satellite transcripts. a In S. pombe, the RNA-induced transcriptional silencing (RITS) complex is involved in heterochromatin formation by small interfering RNAs. RNA polymerase II (RNA Pol II) transcribes centromeric and pericentromeric repetitive sequences into long noncoding RNAs (red line), which are converted into double-stranded RNA by Rdp1, an RNA-directed RNA polymerase contained in the RDRC complex. This enzyme is recruited by RITS to the nascent transcript and uses the siRNA bound to Ago1 as a primer. The dsRNAs are cleaved by Dicer into new siRNAs, which are loaded onto the ARC complex, containing Ago1 and the two proteins Arb1 and Arb2, and then transferred to RITS to complete the siRNA amplification circle. Clr4-Rik1-Cul4 (CLRC) is a third complex, associated with RITS through the LIM-domain protein Stc1. Clr4 binds H3K9-methylated histones (red circle) through its chromodomain, stabilizing the association of CLRC with heterochromatic loci. Moreover, Clr4 modifies adjacent histones, creating additional binding sites for CLRC and for other chromodomain proteins such as Swi6, which recruits SHREC, thus promoting heterochromatin spreading. siRNAs might be generated from ectopic hairpin RNA, bypassing the requirement for the RITS-RDRC-dependent siRNA circle. The RNA polymerase-associated factor 1 complex (Paf1C) promotes efficient transcription termination and rapid release of the nascent transcript from the DNA template, hindering the binding of the siRNA-guided RITS complex to nascent transcripts and the recruitment of CLRC (modified from Iida et al. 2008 and Holoch and Moazed 2015). b In H. sapiens, telomerase reverse transcriptase (TERT) is involved in heterochromatin formation, acting as an RNA-dependent RNA polymerase (RdRP) in nontelomeric regions. TERT together with Brahma-related gene 1 (BRG1) and the nucleolar GTP-binding protein nucleostemin (NS) forms the TBN complex, which binds nascent RNA transcribed from heterochromatic regions. The antisense RNAs synthesized by TERT are targeted to the heterochromatic regions to suppress their expression in mitosis (modified from Maida et al. 2014)

The heterochromatin formation in human and mouse

The model described in S. pombe might be broadly present in eukaryotes (Lippman and Martienssen 2004; Neumann et al. 2007; Hsieh et al. 2011). However, the involvement of RNAi in heterochromatin formation in other eukaryotic organisms is still debated. In chicken, the accumulation of pericentromeric transcripts after downregulation of Dicer seems to indicate a similar mechanism (Fukagawa et al. 2004).

In humans, an RNAi-dependent mechanism leading to the establishment of the heterochromatic state at the centromere has recently been proposed (Maida et al. 2014). The human centromere is composed of alphoid DNA (α-satellite), made up of 171-bp AT-rich monomers tandemly arranged in long arrays of 3–5 Mb (Fig. 1a). This repetitive DNA exhibits a complex higher-order repeat pattern, in which adjacent monomers may share about 50 % sequence identity, but corresponding monomers among the HORs share more than 90 % identity. The region harboring HORs is flanked by monomers lacking periodicity (Aldrup-Macdonald and Sullivan 2014). Maida and colleagues (2014) report the involvement of telomerase reverse transcriptase (TERT), which might act in nontelomeric regions. According to their findings, TERT together with Brahma-related gene 1 (BRG1) and the nucleolar GTP-binding protein nucleostemin (NS) forms the TBN complex, which regulates the assembly of heterochromatin in mammalian cells during mitosis. The nascent RNAs transcribed from heterochromatic regions are bound by the TBN complex. TERT, acting as an RdRP, synthesizes dsRNA from which siRNAs are produced in an AGO2-dependent manner. The resulting siRNAs are targeted to the heterochromatic regions to suppress their expression in mitosis (Fig. 2b). Although previous work reported evidence of defects in heterochromatin maintenance in Dicer-deficient human and murine cells (Fukagawa et al. 2004; Kanellopoulou et al. 2005), the model proposed by Maida et al. (2014) is independent of Dicer activity. This suggests that other TBN-independent mechanisms involving siRNA generation may regulate heterochromatin maintenance in humans.
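
The higher-order repeat pattern described above (low identity between adjacent monomers, high identity between corresponding monomers of different HORs) can be made concrete with a small Python sketch. The function names and the 8-nucleotide toy "monomers" are hypothetical and chosen only so that the arithmetic is easy to follow; real α-satellite monomers are 171 bp long and would first need to be aligned, which this sketch does not attempt.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def hor_identity_summary(monomers, hor_size):
    """Return (mean identity of adjacent monomers within each HOR,
    mean identity of corresponding monomers between HORs).

    monomers: equal-length sequences listed in array order.
    hor_size: number of monomers per higher-order repeat (assumed known).
    """
    if hor_size < 2 or len(monomers) < 2 * hor_size:
        raise ValueError("need at least two complete HORs of two or more monomers")
    n_hors = len(monomers) // hor_size
    hors = [monomers[i * hor_size:(i + 1) * hor_size] for i in range(n_hors)]
    within = [percent_identity(h[j], h[j + 1])
              for h in hors for j in range(hor_size - 1)]
    between = [percent_identity(h1[j], h2[j])
               for h1, h2 in combinations(hors, 2) for j in range(hor_size)]
    return sum(within) / len(within), sum(between) / len(between)


if __name__ == "__main__":
    # Toy array: two HORs of two monomers each (illustrative sequences only).
    toy_array = ["AAAAAAAA", "AAAACCCC",   # HOR 1
                 "AAAAAAAT", "AAAACCCC"]   # HOR 2
    within, between = hor_identity_summary(toy_array, hor_size=2)
    print(f"adjacent monomers within HORs: {within:.2f} % identity")
    print(f"corresponding monomers between HORs: {between:.2f} % identity")
```

For this toy array the adjacent monomers within each HOR share 50 % identity, whereas corresponding monomers between the two HORs share 93.75 % on average, mirroring the qualitative pattern reported for α-satellite arrays.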

The mouse centromere is made up of two types of repetitive sequences: Major satellite (6 Mb of 234 bp units), located at the pericentromere, and Minor satellite (600 kb of 120 bp units), located at the primary constriction (Fig. 1a). The condensation of chromatin might be due to WD repeat and HMG-box DNA binding protein 1 (WDHD1), an acidic nucleoplasmic DNA-binding protein, which associates with the centromere in mid-to-late S phase and might be involved in an RNA-dependent epigenetic mechanism (Hsieh et al. 2011) analogous to the RNAi pathway in S. pombe. In the absence of this protein, HP1 localization is compromised and epigenetic silencing at pericentromeric and centromeric regions is altered, leading to an increase in the transcription of Major and Minor satellites with consequences for centromere integrity and genomic stability. Transcripts arising from centromeres and telomeres might also be used as guides to recruit the chromatin remodelling complex NoRC (known to silence a fraction of rRNA genes), which establishes a repressive heterochromatic environment through the formation of higher-order chromatin structures (Postepska-Igielska et al. 2013). The Major satellite transcripts may have a role in de novo heterochromatin formation during embryo development. Indeed, depletion of the Major satellite transcripts causes developmental arrest (Probst et al. 2010).

Satellite transcription and centromere identity

Transcription of pericentromeric and centromeric repetitive sequences also seems to have roles beyond heterochromatin establishment, notably in maintaining centromere identity. Active centromeric transcription has been related to CENP-A loading, a process probably linked to a specific size of the satDNA transcripts (Okada et al. 2009). At human centromeres, the targeting and subsequent loading of CENP-A require the transcription of a 1.3 kb lncRNA that directly binds the CENP-A protein and its chaperone before it can be assembled onto the centromeric DNA (Quénet and Dalal 2014).

The localization of kinetochore proteins such as CENP-C (Wong et al. 2007) and the activity of the mitotic kinase Aurora B (Ferri et al. 2009) depend on transcripts originating from the centromere. The chromosomal passenger complex (CPC), composed of Aurora B kinase and its three regulatory subunits, the inner centromere protein (INCENP), Survivin, and Borealin, is a key regulator coordinating chromosomal and cytoskeletal rearrangements during cell division. At the beginning of mitosis, this complex is associated with the transcripts originating from the murine centromeric Minor satellite, and the kinase activity of Aurora B is potentiated by and dependent on these transcripts (Ferri et al. 2009). Recent evidence has demonstrated that in HeLa cells, too, Satellite I RNA associates with Aurora B and INCENP (Ideue et al. 2014). Moreover, in the absence of SATIII pericentromeric transcripts, chromosome segregation defects at both the X chromosome and autosomes, as well as partial loss of kinetochore components, have been observed in Drosophila melanogaster. The RNA molecules, transcribed from a satellite located on the acrocentric X chromosome of the fruit fly, bind to the inner kinetochore protein CENP-C, stabilizing its centromeric positioning and therefore also that of CENP-A. This interaction may represent a platform for binding the outer kinetochore proteins involved in the attachment of the mitotic spindle to chromosomes, ensuring faithful segregation. These results suggest that centromeric RNAs play a widespread role in ensuring the formation of the centromere and its associated structures, essential for proper chromosome pairing and segregation, even though centromeric satellite sequences are not conserved (Rošić et al. 2014).

Roles of TERRA

Like centromeric and pericentromeric regions, eukaryotic telomeres are also transcribed by DNA-dependent RNA polymerase II into telomeric repeat-containing RNAs (TERRAs), lncRNAs that are an integral component of telomeric heterochromatin (Azzalin et al. 2007; Schoeftner and Blasco 2008, 2009). These transcripts are variable in length, for example from 100 bp to 9 kb in mammals (Azzalin et al. 2007), and contain both subtelomeric sequences and G-rich telomeric repeats (UUAGGG), being transcribed from the C-rich strand. TERRA promoters may be located in CpG islands or 5–10 kb away from telomeric repeats in some human chromosomes (Porro et al. 2014). The expression of TERRA is mainly cell cycle-dependent, showing the highest level in G1 and the lowest in late S/G2 (Porro et al. 2010), although several factors may affect their transcription. TERRAs are evolutionarily conserved, since they have been identified from yeast to human (reviewed in Cusanelli and Chartrand 2014).

TERRA in heterochromatin formation

The involvement of TERRA in heterochromatin formation has been suggested by its association with heterochromatic marks such as HP1 and H3K9me3. TERRAs act as scaffolds in recruiting chromatin remodelling factors, such as HP1, TRF1 and TRF2 (two components of the Shelterin complex), and the origin recognition complex (ORC). TERRA interacts with the amino-terminal GAR domain of TRF2, which in turn interacts with ORC1; hence, TERRAs mediate and stabilize this interaction. In the absence of TERRA and ORC, a loss of H3K9me3 is observed, suggesting that TERRA and ORC cooperate to maintain telomeric heterochromatin (Deng et al. 2009). Moreover, in humans, the regulation of TERRA expression depends on telomere length and feeds back negatively on heterochromatin formation. According to the negative-feedback-loop model, the over-elongation of telomeres leads to the synthesis of longer TERRAs, which in turn recruit histone methyltransferases whose activity results in increased trimethylation of H3K9. The abundance of TERRAs at telomeres is also correlated with the recruitment of HP1. The localization of this protein may be due both to direct recruitment by TERRAs and to the additional H3K9me3 sites provided. The recruitment of HP1 and the increase of H3K9me3 at telomeres induce the transcriptional repression of TERRAs, preventing heterochromatin hyperformation (Fig. 3a) (Arnoult et al. 2012). This might be the first example of the “telomere position effect” (TPE), the ability of telomeres to repress the expression of nearby genes. Moreover, a putative role of TERRAs in TPE at nontelomeric loci cannot be excluded.

Fig. 3

Roles of telomeric repeat-containing RNA (TERRA). a In heterochromatin formation: TERRA (red lines) may contribute to the recruitment of the heterochromatic protein HP1 (blue circle) at telomeres, thereby promoting telomeric histone H3K9 trimethylation (red circle) through the recruitment of histone methyltransferases (pink circle). The localization of HP1 may be due to direct recruitment by TERRAs (curved black arrows) and to the additional H3K9me3 sites provided (curved green arrows) (modified from Arnoult et al. 2012). Heterochromatin hyperformation is avoided by the repression of TERRA expression according to a negative-feedback loop. Moreover, TERRA interacts with the amino-terminal GAR domain of TRF2, which in turn interacts with ORC, cooperating in maintaining telomeric heterochromatin. b In telomere elongation: negative control through telomerase sequestration by TERRA, which tethers this enzyme through base-pairing interaction with the telomerase RNA template according to three proposed models: 1, TERRA bound to the telomere sequesters telomerase (blue circle); 2, released TERRA sequesters telomerase; 3, TERRA interacts with telomerase bound to the telomere, preventing it from reaching the 3′ G-overhang (modified from Redon et al. 2010); negative control promoting Exo1-dependent resections: TERRA bound to TRF2 prevents the formation of the T-loop, and the transcripts bound to the Ku70/80 dimer prevent this dimer from inhibiting exonuclease 1 (Exo1), which in turn shortens telomeres (modified from Wang et al. 2015); positive control: in short telomeres, TERRA expression is induced and the resulting transcripts accumulate into a focus. In S phase, the TERRA focus acts as a scaffold to nucleate the TLC1 RNA (contained in the budding yeast telomerase), forming telomerase recruitment clusters (T-Recs). In late S phase, T-Recs, recruited to the short telomeres from which TERRA originated, promote the elongation of these regions. c In telomere capping or replication: see text for details. Red curved line: TERRA; blue oval: Shelterin complex; green circle: RPA; blue rectangle: hnRNPA1; pink rectangle: POT1-TPP1 (modified from Flynn et al. 2011)

In the protist Plasmodium falciparum, too, a model has been proposed in which a family of long telomere-associated noncoding RNAs (lncRNA-TAREs), encoded in the telomere-associated repetitive element (TARE), facilitates telomeric heterochromatin assembly and/or interacts with telomere-associated proteins. Consistent with this role, the expression profile of lncRNA-TAREs is cell cycle-dependent, as observed for TERRAs (Broadbent et al. 2011). This evidence might suggest a conserved role of telomeric transcripts among organisms.

TERRA and telomere length

TERRAs may act on telomeres by influencing their length, either negatively or positively. The first model proposed involves the sequestration of telomerase, which might occur in three putative ways: TERRA bound to telomeric chromatin sequesters telomerase (Fig. 3b (1)); TERRA, released from telomeres, binds and inhibits telomerase, preventing its access to the chromosome ends (Fig. 3b (2)); TERRA interacts with telomerase bound to telomeric chromatin, preventing its access to the 3′ end (Fig. 3b (3)). The binding of TERRA to telomerase requires a base-pairing interaction with the telomerase RNA template as well as an interaction with the TERT polypeptide. The sequestration of telomerase by any of these mechanisms results in telomere shortening (Redon et al. 2010).

Another model suggests that TERRA exerts negative control on telomere length by promoting Exo1-dependent resections (Wang et al. 2015). The CCCTC-binding factor/chromatin organizing factor (CTCF) recruits RNA Pol II to the subtelomeric CpG-island promoter, inducing TERRA transcription (Deng et al. 2012). CTCF and cohesin stabilize the binding of TRF1 and TRF2 to the subtelomere, inhibiting telomere DNA damage signaling and allowing the formation of T-loops. The THO complex is involved in packaging TERRA into ribonucleoprotein particles, hindering the formation of R-loops, structures in which TERRA is base paired with one strand of duplex DNA. These RNA-DNA hybrid structures interfere with chromosome end cap formation, and consequently, the telomere is unprotected from the action of exonuclease 1 (Exo1), an enzyme that induces telomere shortening (Pfeiffer et al. 2013). Therefore, in the presence of TERRA transcription, the synthesized RNA molecules might bind TRF2, hampering T-loop formation, with the consequent activation of the ATM kinase pathway involved in DNA repair. Additionally, TERRA transcripts might associate with the Ku70/80 dimer, which normally inhibits the activity of Exo1; this association would relieve the inhibition and allow Exo1 to operate (Fig. 3b).

Other studies proposed TERRA as a positive regulator of telomere elongation. In the budding yeast Saccharomyces cerevisiae, TERRA arising from a short telomere coordinates the recruitment of telomerase to that region, leading to its elongation (Cusanelli et al. 2013). In normal telomeres, the inhibitory action of Rap1 on RNA Pol II and the degradation of TERRA by the RNA exonuclease Rat1 do not allow TERRA expression. In contrast, in short telomeres, TERRA expression is induced and the resulting transcripts accumulate into a focus. In S phase, the TERRA focus acts as a scaffold to nucleate the TLC1 RNA (contained in the budding yeast telomerase), forming telomerase recruitment clusters (T-Recs). In late S phase, the TERRA-telomerase complexes, recruited to the short telomeres from which TERRA originated, promote the elongation of these regions (Fig. 3b).

TERRA in telomere capping and replication

TERRA has a role in telomere capping and replication. The Shelterin component POT1 competes with the heterotrimeric replication protein A (RPA) to bind the telomeric single strand. The tethering of RPA to this locus is hindered by heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1). During early to mid S phase, TERRA sequesters hnRNPA1, allowing RPA to bind to telomeric ssDNA instead of POT1 and permitting telomere extension. In late S phase, TERRA levels decrease and hnRNPA1 displaces RPA from ssDNA at the telomere. In G2, TERRA re-accumulates, gradually sequestering hnRNPA1 from telomeric ssDNA. In this window of time, both RPA and POT1 might bind to this region; however, only the tethering of POT1 is stable, since RPA might be displaced by the hnRNPA1 not completely sequestered by TERRA. The presence of POT1 at the telomere promotes capping (Fig. 3c) (Flynn et al. 2011).

Centromeric transcription in cellular stress and disease

The transcription levels of pericentromeric and centromeric SatDNAs depend on developmental stages and tissue types (Pezer and Ugarković 2008; Probst et al. 2010). However, various cellular stresses, including heat shock, exposure to heavy metals, hazardous chemicals, and ultraviolet radiation, as well as hyperosmotic and oxidative conditions, may also influence the expression of these tandem repetitive elements (Jolly et al. 2004; Rizzi et al. 2004; Bouzinba-Segard et al. 2006; Valgardsdottir et al. 2008; Eymery et al. 2009a; Hsieh et al. 2011; Hall et al. 2012; Enukashvily and Ponomartsev 2013). This transcriptional response during stress is highly conserved since, besides in mammals, it has been described also in insects (Pezer and Ugarković 2012) and Arabidopsis (Pecinka et al. 2010; Tittel-Elmer et al. 2010). The expression levels vary according to the nature of the cellular stress; for example, with heat shock, the transcription is highly induced and transcripts in both orientations have been evidenced unlike other cellular stresses in which only sense transcripts have been observed (Valgardsdottir et al. 2008). In response to heat shock, distinct nuclear structures, termed “nuclear stress bodies” (nSBs), originate at pericentromeric regions of human cells (Denegri et al. 2002; Jolly et al. 2002). Within nSBs the epigenetic status of pericentromeric DNA changes assuming the characteristics of euchromatin, including histone hyper-acetylation and DNA demethylation. In addition, the transcripts produced from SATIII, a specific pericentromeric satellite of chromosome 9, result in highly polyadenylated RNAs that are associated with nSBs. The transcription of SATIII requires the recruitment of heat shock factor 1 (HSF1) and RNA Pol II to nSBs (Jolly et al. 2004; Rizzi et al. 2004). 
The role of SATIII transcripts in the organization of nSBs has been confirmed by knockdown experiments in which, using antisense oligonucleotides and RNAi, the recruitment of RNA processing factors to these nuclear bodies was impaired (Chiodi et al. 2004; Metz et al. 2004). The transcription of pericentromeric SATIII and the formation of nSBs have also been reported under other kinds of cellular stress; however, the transcription factors involved depend on the type of stress. Moreover, the SATIII expression levels and the number and size of nSBs are lower than those observed upon heat shock (Valgardsdottir et al. 2008; Saksouk et al. 2015).

Eymery and coworkers (2009a) compared satellite transcription during cellular stress in normal as well as in diseased human cells. Their findings showed that pericentromeric transcripts, but not centromeric transcripts, are upregulated during recovery from heat shock. This suggests that centromeric and pericentromeric transcripts are under different transcriptional controls. However, the function of this specific transcription, or of the resulting RNA, in the stress response remains unclear. Some hypotheses have been advanced: these transcripts could be involved in heterochromatin re-formation through an RNAi-dependent or RNAi-independent pathway; long noncoding pericentromeric transcripts might be required for the establishment or maintenance of a specific chromatin state; the SATIII transcripts could protect a fragile region of the genome from the damage induced by stress conditions; pericentromeric transcripts could activate genes located nearby, acting in cis or in trans (reviewed in Saksouk et al. 2015).

In mouse cells stressed by chemical exposure, the transcription of the centromeric Minor satellite is induced. The high levels of transcripts impair centromere function, affecting centromeric chromatin condensation and sister chromatid cohesion in mitosis and leading to aneuploidy (Bouzinba-Segard et al. 2006).

After long-term heat shock, the activity of genes associated with satellite DNA repeats may be affected, as has been observed in the beetle Tribolium castaneum (Feliciello et al. 2015), in which the major satellite DNA TCAST1 suppresses nearby genes. According to the proposed model, under stress conditions TCAST1 satellite DNA is transcribed and its transcripts are processed into siRNAs. These short RNA molecules bind, by base pairing, to dispersed TCAST1 repeats associated with genes. At these loci, TCAST1 siRNAs recruit chromatin modifiers, increasing the level of silent heterochromatin. This state negatively influences the expression of neighboring genes.

The formation of polyadenylated pericentromeric transcripts has also been observed in cellular senescence where, as a result of deep epigenetic changes, constitutive heterochromatin at pericentromeres becomes decondensed and shows lower DNA methylation levels (Enukashvily et al. 2007).

The decondensation of pericentromeric heterochromatin and the overexpression of pericentromeric satellite repeats have been reported in some pathological conditions, including several cancers and genetic disorders (Shumaker et al. 2006; Alexiadis et al. 2007; Enukashvily et al. 2007; Ehrlich et al. 2008; Eymery et al. 2009b; Ting et al. 2011; Zhu et al. 2011). Tumor suppressor proteins are known to target pericentromeres, ensuring a compact and condensed chromatin structure and playing a role in the repression of pericentromeric expression (Frescas et al. 2008; Zhu et al. 2011). The absence of these proteins leads to a strong increase in pericentromeric satellite transcripts and, consequently, cells are subject to segregation defects and overall genomic instability. The noncoding pericentromeric transcripts could therefore be a driving force for malignant transformation. In particular, DNA methylation, one of the main epigenetic marks of constitutive heterochromatin at pericentromeres, is impaired in neoplasia, which is characterized by global DNA demethylation as well as by localized hypomethylation of oncogenes and hypermethylation of tumor suppressor genes (Wilson et al. 2007; Sugimura et al. 2010; Ting et al. 2011). An exhaustive review of the relationship between ncRNAs transcribed from satDNA and cancer has been provided by Ferreira et al. (2015).

In addition to cancer, misregulation of DNA methylation at pericentromeric regions could also be linked to the immunodeficiency, centromere instability, and facial anomalies (ICF) syndrome, in which mutations in the DNA methyltransferase DNMT3B cause DNA hypomethylation of SATIII repeats on chromosome 9 and of SATII repeats on chromosomes 1 and 16 (Jeanpierre et al. 1993). Despite hypomethylation, SATII expression levels were low, suggesting that hypomethylation is not sufficient for transcriptional activation of these elements (Alexiadis et al. 2007). The hypomethylation of SATII and SATIII repeats has been proposed to be responsible, acting in trans, for the dysregulated expression of specific genes (Ehrlich et al. 2008).

Pericentromeric transcripts of SATIII (Shumaker et al. 2006), associated with a loss of constitutive heterochromatin, were also observed in Hutchinson-Gilford progeria syndrome, which is caused by mutations in the Lamin A gene (reviewed in Prokocimer et al. 2013). Lamins are structural components of the nucleoskeleton; they are implicated in the structural integrity of the nucleus and interact with pericentromeric heterochromatin, contributing to its organization (Towbin et al. 2012; Solovei et al. 2013; Saksouk et al. 2014).

The transcription of pericentromeric and centromeric heterochromatin may affect centromere stability and genome integrity during embryo development. Indeed, in mouse, the chromatin remodeling protein alpha thalassemia/mental retardation syndrome X-linked (ATRX) silences Major satellite transcripts in the maternal genome and Minor satellite transcripts in neurons, leading to a decrease in centromeric mitotic recombination, sister chromatid exchanges, and double-strand DNA breaks (De La Fuente et al. 2015; Noh et al. 2015).

Double-strand DNA breaks may also be induced by genotoxic stresses. Yang and colleagues (2015) have demonstrated that DNA damage induces siRNA production specifically from repetitive DNA loci, triggering the RNAi pathway to maintain genome stability by regulating the homologous recombination process. siRNAs induced by DNA damage have also been reported in Drosophila (Michalik et al. 2012) and mammals (Wei et al. 2012), suggesting that this is a widespread mechanism in eukaryotes.

The abnormal variation in centromeric and pericentromeric transcription observed across major eukaryotic lineages under stress conditions and in disease suggests that these transcripts may play a critical role, with potentially dire consequences for the organism when they are misregulated. However, the putative function of the transcription process, or of the resulting transcripts, remains to be clarified.

Conclusions

The studies summarized here show that tandemly repeated DNA, once called “junk DNA,” is an active part of the genome. Recently, thanks also to the development of next generation sequencing techniques, an increasing number of studies have reported the transcriptional activity of these elements, and more information is accumulating about their roles. Indeed, satellite DNA-derived transcripts play a structural role in heterochromatin formation and maintenance at both centromeres and telomeres, are involved in determining centromere identity by interacting with CENP-A and the kinetochore proteins, and control telomere length, capping, and replication in a cell cycle-dependent manner. However, to date, the small number of species analyzed does not allow the conservation of these functions among eukaryotes to be evaluated, and in many cases the fragmentary nature of the data does not allow comprehensive mechanisms to be traced.

Several of the examples mentioned above also highlight the importance of proper transcription of tandemly repeated elements, since an imbalance in the transcriptional activity of centromeric and pericentromeric elements under specific stress and/or biological conditions may affect chromosome and genome stability, with dire consequences for the organism. Understanding the precise mechanisms in which these transcripts are involved, in both normal and abnormal conditions, may help in the search for therapies. Future efforts should be directed toward investigating a larger number of organisms and toward understanding the transcription itself, as well as the biogenesis and processing of satellite ncRNAs.