Introduction

Centromeres play an essential role in kinetochore assembly and equal chromosome segregation, and are marked by a specific histone H3 variant (CENP-A in human and fission yeast; CENH3 in plants and Cse4 in budding yeast) (Henikoff et al. 2020; Dhatchinamoorthy et al. 2018). There are three major classes of centromeres: point centromeres, regional centromeres and holocentromeres. Point centromeres are 120–200 bp long and are found in budding yeast (Kobayashi et al. 2015). Regional centromeres are the most prevalent class, occurring in human, mouse, fission yeast, plants and other higher eukaryotes, and may reflect the ancestral centromere organization. Holocentromeres, found for example in Caenorhabditis elegans, encompass the length of a chromosome (Henikoff et al. 2020; Pluta 1995). Most plant and animal centromeres favor AT-rich DNA that comprises retrotransposons and tandemly repetitive DNA known as satellites (Fig. 1). Although centromere function is evolutionarily highly conserved, the underlying centromeric DNA is highly variable in sequence, evolves quickly and is not essential for centromere identity (Cleveland et al. 2003; Stimpson and Sullivan 2010), suggesting that epigenetic marks, such as associated RNAs, proteins and other epigenetic modifications, are involved in establishing the centromeric state.

Fig. 1

An optimal level of RNA transcription is required for maintaining the proper function of the centromere. Both sense and antisense cenRNAs, including circRNAs transcribed from centromeric repeats, are detected. A low level of cenRNAs leads to aberrant mitosis, micronuclei and an increase in Aurora B autophosphorylation. Conversely, a high level of cenRNAs causes mislocalization of centromere-associated proteins, centromere inactivation, and centromeric epigenetic changes.

Centromeres are dynamic rather than inert. A common feature of centromeres is that they occur in regions that are largely free of genes but still include genes that are transcribed at a very low level (Su et al. 2019; Henikoff and Talbert 2018). Although centromeric transcription is a conserved epigenetic mechanism regulating centromeres across species, the transcripts vary dramatically in size. CenRNAs are associated with a broad range of functions, including the regulation of chromosome behavior, gene transcription, and chromatin architecture (Arunkumar and Melters 2020).

Various kinds of RNAs in eukaryotes and prokaryotes are attracting considerable attention from researchers, including small RNAs, long noncoding RNAs (lncRNAs), circular RNAs (circRNAs), and RNA:DNA hybrids. Despite the growing number of studies on the role of cenRNAs, our current knowledge is largely derived from work in animals and yeasts, and their global functions are still enigmatic. We refer readers to a comprehensive summary that covers the possible biological roles of cenRNAs.

  1.

    A balanced level of cenRNA is essential for maintaining the proper function of centromeres

    Centromeres comprise two domains, the central core and the flanking pericentric heterochromatin, which serve for assembly of the kinetochore and for centromere cohesion, respectively (Corless et al. 2020). Centromeres are the most condensed region of a chromosome. However, a low level of transcription of the core and the flanking pericentric regions is detected in human cells (Wong et al. 2007), mouse cells (Ferri et al. 2009), fission yeast (Choi et al. 2011) and plants (Lv et al. 2020), suggesting that CENP-A chromatin contains some open chromatin. A human neocentromere contains 51 genes, of which about one-third are expressed (Saffery et al. 2003). Centromeric α-satellite transcript levels are estimated to be about 0.5% of those of a housekeeping gene (Chan et al. 2012). Gene transcription is also detected in de novo centromeric regions of maize (Su et al. 2016). In Arabidopsis, more than 47 expressed genes were found to be flanked by core centromeric repetitive sequences such as cen180 (Arabidopsis Genome 2000; May et al. 2005). In rice centromere 8, four of the genes present have normal transcripts (Nagaki et al. 2004). It has long been known that yeast centromere function can be switched off or on by inducing or repressing transcription from the GAL1 promoter, producing a conditional centromere (Hill et al. 1987). There is also evidence of direct links between transcription and centromere activation from yeast to human. In fission yeast, the centromeric histone H3 variant CENP-A (Cnp1) tends to associate with a subset of RNA polymerase II (RNA PolII) promoters where RNA PolII binding is high. Similar results were reported for S. cerevisiae CENP-A (Cse4) (Choi et al. 2011; Ólafsson et al. 2020). In Schizosaccharomyces pombe, Ams2 is a cell cycle-regulated GATA-like transcription factor, depletion of which reduces CENP-A binding to centromeres and thus causes chromosome missegregation. Conversely, when Ams2 accumulates, the association of a CENP-A mutant protein with the centromere is restored (Chen et al. 2003), implying that transcription acts in centromere function.

    Transcription of centromeres is largely dependent on activation of RNA PolII and varies between developmental stages and tissues (McNulty et al. 2017; Maison et al. 2010). Point centromere activity requires an optimal level of centromeric noncoding RNA. RNA PolII-mediated centromeric transcription that is excessively high or low leads to centromere inactivation and failures in segregation (Ling et al. 2019; Ohkuni et al. 2011). CenRNA over-expression in cbf1 (centromere-binding protein 1) and htz1 (histone H2A variant) deletion strains increases budding yeast minichromosome loss. Minichromosome loss was also significantly increased when all the cenRNAs were knocked down (Ling et al. 2019). Nakano et al. (2008) developed a human artificial chromosome with an operable epigenetic state and also found that only moderate levels of transcription are compatible with correct centromere function. Centromere activity was inactivated by strong transcription from an artificial promoter, and was restored when centromeric transcripts decreased (Collins et al. 2005; Ohkuni et al. 2011). Thus, there is an optimal level of RNA transcription required for centromere and kinetochore assembly (Fig. 1).

    Several transcriptional regulators, including RNA PolII, are essential for keeping cenRNA levels balanced, and have been described in yeast, mouse and human cells. The nuclear protein ZFAT binds to centromeres to control centromeric non-coding RNA transcription through a specific 8-bp DNA sequence in human and mouse cells (Ishikura et al. 2020). In budding yeast, centromeric transcription is suppressed to a low level by the kinetochore protein Cbf1 and the histone H2A variant H2A.Z (Htz1) (Ling et al. 2019). In mice, MIWI regulates mRNAs, lncRNAs and transposons post-transcriptionally. MIWI- and Dicer-mediated cleavage of the centromeric satellite RNAs prevents aneuploidy by preventing the over-expression of satellite RNAs (Hsieh et al. 2020). In human cells, alpha-satellite expression was repressed by centromere-nucleolar interactions (Bury et al. 2020). Other studies suggest that cenRNAs remain at centromeres. However, Bury et al. (2020) found that alpha-satellite RNA transcripts were broadly distributed within the cytoplasm during mitosis, which provides a different perspective on cenRNA function. This may be explained by the proposal of McNulty et al. (2017) that each human alpha-satellite array produces a unique set of non-coding transcripts to perform different functions (Fig. 2).

  2.

    CenRNAs are essential for CENP-A loading onto centromeres.

    Loading of CENP-A at centromeres occurs in a cell cycle-specific manner. Newly synthesized CENP-A is deposited in metaphase in D. melanogaster S2 cells (Mellone et al. 2011), during telophase and G1 in human (Jansen et al. 2007), and prior to mitosis in G2 in plants such as maize, barley, rye and Arabidopsis (Topp et al. 2004; Lermontova et al. 2007; Schubert et al. 2014; Lermontova et al. 2011). Consistently, cenRNA levels are also cell cycle-regulated. In recent years, a direct RNA–protein interaction between centromeric RNAs and CENP-A has been found in many eukaryotes. In humans and Drosophila, centromeres are actively transcribed by RNA polymerase II from late mitosis to early G1 (Jansen et al. 2007; Dunleavy et al. 2012). A 1.3-kb RNA that originates from centromeres was associated with CENP-A. Long-term loss of centromeric transcripts led to loss of recruitment of CENP-A and its chaperone HJURP to centromeres, whereas overexpression of these transcripts increased CENP-A and HJURP recruitment (Quénet et al. 2014). Likewise, there is ample evidence from human cells and Xenopus oocytes that knock-down of centromeric transcripts results in reduced CENP-A levels at centromeres (Quénet et al. 2014; Saffery et al. 2003; Bergmann et al. 2011; Grenfell et al. 2016). McNulty et al. (2017) also found that non-coding RNAs transcribed from human alpha satellite are complexed with CENP-A and CENP-C. Loss of CENP-A does not affect transcript abundance, but CENP-A and CENP-C at the targeted centromere are reduced when cenRNA is depleted. In mouse, minor repeats yield transcripts up to 4 kb long, which may impair centromeric architecture and function under stress (Bouzinba-Segard et al. 2006). In maize, nearly half of the RNAs from centromeric retrotransposons (CRMs) and satellite repeats (CentC), which are longer than 40 nt, were bound to CENH3, and siRNA-sized (22–30-nt) molecules were not detected (Topp et al. 2004).

    There has been great progress in understanding centromere and kinetochore function over the last few years. A key question that remains is how CENP-A recognizes DNA and targets the proper chromosomal location. Henikoff and colleagues have suggested that the replacement of histone H3 with CENH3 is often associated with active transcription, which can disrupt nucleosomes and open chromatin, similar to the role of human Xist RNA in regulating X-inactivation by facilitating the replacement of histone H2A with macroH2A (Boeger et al. 2003; Plath et al. 2002; Jiang et al. 2003; Sullivan et al. 2001; Choo et al. 2001). However, Nechemia-Arbely et al. (2019) found that CENP-A is also assembled into nucleosomes at more than ten thousand transcriptionally active sites on the chromosome arms, and that DNA replication acts as an error correction mechanism to remove this non-centromeric CENP-A.

    RNA is also an essential structural and functional component of neocentromere chromatin. In humans, the L1 retrotransposon (~6 kb in size) belongs to the only active subfamily of LINEs. A significant enrichment of FL-L1b RNA (one of the elements of L1 RNA) in the CENP-A-bound fractions at the 10q25 neocentromeric chromatin was observed by anti-CENP-A RNA ChIP-seq, indicating that RNA transcribed from the L1 retrotransposon of a neocentromere could be incorporated into the core neocentromere chromatin and serve as a critical epigenetic determinant in chromosome remodeling, leading to neocentromere formation (Chueh et al. 2009). Taken together, cenRNAs appear to assist in CENH3 loading.

  3.

    CenRNAs are required for kinetochore structure and accurate chromosome segregation.

    The kinetochore is a multiprotein complex that adheres to centromeric chromatin through the inner plate and binds to microtubules through the outer plate, and is essential for accurate chromosome segregation (Rošić et al. 2016; Yamagishi et al. 2014). RNA was first observed in kinetochores in the 1970s (Rieder 1979; Braseton 1975), and Topp et al. (2004) later found that cenRNAs play a role in the assembly and stabilization of kinetochore chromatin structure. Centromeric transcripts are bound by several kinetochore proteins that are involved in kinetochore assembly. Both sense and antisense cenRNAs interact with the inner kinetochore protein CENP-C, as found in D. melanogaster (Rošić et al. 2014), plants (Du et al. 2010) and human cells (Wong et al. 2007). In human, CENP-C binds three different cenRNAs (Henikoff et al. 2018). Aurora B kinase interacts with ncRNA transcribed from centromeric satellite I, and knockdown of satellite I RNA causes chromosome segregation errors in mitosis by inducing defective attachment of microtubules to kinetochores (Wong et al. 2007; Indue et al. 2014). CenRNAs are also required for activation of Aurora B kinase in X. laevis eggs and mouse cells (Ferri et al. 2009; Blower 2016). CenRNA processing contributes to proper spindle and kinetochore assembly in Xenopus egg extracts, where inhibition of transcription initiation or RNA splicing results in spindle defects (Grenfell et al. 2016). α-Satellite RNA is also a key component in the assembly of other kinetochore proteins such as Sgo1 (Talbert et al. 2018), CENP-A, and CENPC1 (Wong et al. 2007). The over-accumulation of major and minor satellite transcripts alters meiotic kinetochore assembly and causes chromosome mis-segregation (Table 1; Fig. 1). Cell cycle-regulated cenRNAs may stabilize the binding of CENP-C to DNA and help to promote CENP-A loading and kinetochore assembly, thereby regulating centromere function and facilitating accurate chromosome segregation.

  4.

    Cell-cycle-dependent cenRNAs act in pericentromeric heterochromatin assembly

    During mitosis, heterochromatin formation is essential for gene regulation and for maintaining centromere stability to ensure accurate chromosome segregation. As centromeres and pericentromeres are populated with enormous amounts of repeat sequence, the centromere is the most condensed and constricted region of a chromosome. However, a low level of transcription of the core and the flanking pericentric regions is detected. Chromatin remodeling mediated by centromeric transcription favors the incorporation of CENP-A into nucleosomes at the centromere (Georg et al. 2018). Transcription is actually required to initiate heterochromatin formation (Reinhart and Bartel 2002). In mouse, major satellite RNAs stabilize the retention of H3K9me3 methyltransferases at pericentromeric heterochromatin by forming an RNA:DNA hybrid (Camacho et al. 2017). In Drosophila S2 cells, repeat RNAs are principally derived from active retrotransposons, especially gypsy elements, and act in both cis and trans on chromatin to help maintain pericentromeric heterochromatin (Hao et al. 2020). Antisense transcripts can occur in the presence of heterochromatin, whereas sense transcripts are repressed by Clr6 complexes (Volpe et al. 2002; Nicolas et al. 2007). Pathways that establish centromeric and pericentromeric heterochromatin have been best described in the fission yeast S. pombe. Formation of heterochromatin at centromeres relies on the RNA interference (RNAi) machinery, which involves processing of centromeric noncoding RNAs (Verdel et al. 2004; Chen et al. 2008). Similar findings have also been reported in plants (Lippman et al. 2004; Neumann et al. 2007). Transcripts from both strands of centromeric DNA are cell cycle-regulated. The preferential accumulation of forward transcripts during S phase indicates that heterochromatin structures are accessible in this phase (Chen et al. 2008).

  5.

    CenRNAs are associated with centromeres via RNA: DNA hybrids.

    Topological organization of centromeric chromatin has recently gained increasing attention. R-loops are three-stranded nucleic acid structures consisting of an RNA:DNA hybrid and a displaced single-stranded DNA (Fig. 4). R-loops have been reported to be associated with DNA replication initiation (Yu et al. 2003), the DNA-damage response (Hamperl et al. 2017), gene transcription (Fang et al. 2019), DNA repair (Lu et al. 2018), and genome instability (Frederic and Craig 2020). The formation of RNA–DNA hybrids is also an important mechanism of sequence-specific targeting of RNA to chromatin (Maldonado et al. 2019). The role of non-coding RNAs as a structural chromatin component is well documented for telomeric heterochromatin, where they remain associated with telomeric chromatin by forming RNA:DNA hybrids that mediate telomere length and heterochromatin formation (Nakama et al. 2012; Schoeftner et al. 2009; Graf et al. 2017; Feretzaki et al. 2020). Research on centromeric ncRNAs, and on centromeric R-loops in particular, lags behind that on telomeres. In addition to budding yeast, centromeric R-loops have been identified in human, rice, Arabidopsis, maize and other eukaryotic organisms (Kabeche et al. 2018; Fang et al. 2019; Xu et al. 2017). Centromeric R-loops are generated through RNAPII-mediated transcription during mitosis (Mishra et al. 2020). In maize, high levels of R-loops in centromeric retrotransposons led to reduced localization of CENH3 (Liu et al. 2020). In mouse, major satellite RNAs stabilize the retention of H3K9me3 methyltransferases at pericentromeric heterochromatin by forming an RNA:DNA hybrid (Camacho et al. 2017). In human, R-loops are detected at centromeres in mitosis, and an R-loop-driven signaling pathway promotes faithful chromosome segregation and genome stability (Kabeche et al. 2018). Interestingly, opposite effects of centromeric R-loops on chromosomal instability were also reported. Using hpr1∆ strains that accumulate R-loops, Mishra and colleagues found that R-loops at centromeric chromatin contribute to defects in kinetochore integrity and to chromosomal instability. They also found that R-loops at centromeres did not accumulate when centromeric non-coding RNA was increased (Mishra et al. 2020; Unoki et al. 2020). These findings indicate that R-loops can have both negative and positive impacts on the function of kinetochores and centromeres. Importantly, R-loops are also observed in neocentromere regions in maize (Han et al. unpublished), which suggests a role of R-loops in neocentromere formation.

    Circular RNAs (circRNAs) are a novel class of noncoding RNAs that are involved in the regulation of gene expression, and have been extensively explored in worm, metazoans, fruit fly, mouse, monkey, and human (Ivanov et al. 2015; Westholm et al. 2014; Fan et al. 2015; Memczak et al. 2013; Salzman et al. 2012). Genome-wide circRNAs have also been identified in plants, including Arabidopsis thaliana, Oryza sativa, maize, wheat, barley, tomato and soybean (Ye et al. 2015; Chen et al. 2018; Wang et al. 2017a, b; Darbani et al. 2016; Zhao et al. 2017a, b; Zhou et al. 2016; Zeng et al. 2018; Zuo et al. 2016). However, owing to the limitations of bioinformatics tools for identifying circRNAs and the repetitive nature of centromeric DNA, centromeric circRNAs cannot be identified easily. Liu et al. (2020) first reported the role of centromeric circRNAs derived from retrotransposons in maize, which act by binding to the centromere through R-loops (Figs. 3, 4). The molecular features and function of these centromeric circRNAs are still being investigated. These clues shed new light on the function of centromeric circRNAs and R-loops, which are an appealing line of research on the function and stabilization of centromeres.

Fig. 2

Distribution of centromere-specific DNA repeats in plant chromosomes. a The distribution of wheat centromeric retrotransposon signals (red) along the wheat chromosomes. b The distribution of CRM1 (green) and CentC (red) signals along the maize chromosomes. DAPI-stained chromosomes are blue. Bar = 10 μm.

Table 1 Summary of known centromeric transcripts in various species
Fig. 3

AFM image of the circular CRM1 RNAs in maize. The white arrow indicates a 354-nt circular RNA. The scale bar is 800 nm. AFM, atomic force microscopy.

Fig. 4

Various types of cenRNAs play an important role in centromere function. The centromere-specific nucleosomes are distributed in a specific region of the chromosome. Centromeric transcripts are processed into forms including small RNAs, lncRNAs, circRNAs and DNA-RNA hybrids, which are associated with CENP-A, CENP-C, Aurora B, pericentric heterochromatin and other factors.

Conclusion and perspective

How does CENH3 recognize and target centromeric DNA? Which factors are involved in centromere assembly and function? These questions remain subjects for further investigation. In addition, the repetitive nature of centromeric DNA and the indefinite origin and length of centromeric RNAs will continue to challenge studies of centromere transcription and identity. Studying the function of centromeric circRNAs and R-loops appears to offer a breakthrough. Centromeric DNA transcription and RNA localization are independent of CENP-A (McNulty et al. 2017). Distinct centromeric transcripts interact with different centromere proteins. Dissecting the different centromeric RNA-binding proteins might bring new clues for determining the mechanisms of centromere formation and function, centromere inactivation and other biological processes.