Introduction

Ribosomes are non-membranous cellular organelles that perform the essential function of protein synthesis across taxonomic kingdoms. In eukaryotes, they are composed of four catalytic ribosomal RNAs (rRNAs) and approximately 80 ribosomal proteins (Pikaard 2018). Eukaryotic genomes encode hundreds to thousands of rRNA genes that are tandemly arrayed at chromosomal locations called nucleolus organizer regions (NORs) (Fig. 1). NORs are so named because rRNA transcription and processing cause the nucleolus to form during interphase of each cell cycle (McClintock 1934; Ritossa and Spiegelman 1965; Wallace and Birnstiel 1966; Raska et al. 2006a, b). NORs appear as secondary constrictions on metaphase chromosomes, the centromere being the primary constriction (McClintock 1934; Chen and Pikaard 1997b). The number and location of NORs vary among eukaryotes, and often within a species (Copenhaver and Pikaard 1996b; Kobayashi et al. 2004; Britton-Davidian et al. 2012; McStay 2016; Zhang et al. 2016). Each rRNA gene consists of a promoter, external transcribed spacers (5′ ETS and 3′ ETS), internal transcribed spacers (ITS1 and ITS2), and regions encoding the 18S, 5.8S, and 25S or 28S rRNAs (Fig. 1) (Copenhaver et al. 1995; Turowski and Tollervey 2015; Chandrasekhara et al. 2016). The tandem rRNA genes (rDNA repeats) are separated by untranscribed intergenic sequences (IGS), the length of which ranges from 2 to 3 kb in plants to 10–30 kb in mammals and several other vertebrates, and often varies within a species (3–9 kb in Xenopus laevis) (Erickson and Schmickel 1985; Hadjiolov 1985; Pontvianne et al. 2010; Shaw and Brown 2012). The rRNA genes are transcribed within the nucleolus by RNA polymerase I (Pol I) to yield single precursor RNAs, called 45S pre-rRNAs although their size varies from 35S to 48S depending on the species. These precursors are processed by excision of the ETS and ITS regions to generate the distinct 18S, 5.8S, and 25S or 28S rRNAs, which form the catalytic core of the ribosome (Fig. 1) (Long and Dawid 1980; Gerbi 1986; Hannan et al. 2013; Turowski and Tollervey 2015; Viktorovskaya and Schneider 2015). The fourth rRNA, 5S, is encoded by 5S rRNA genes, which are transcribed by RNA polymerase III (Pol III) elsewhere in the nucleus. In most eukaryotes, these genes occupy chromosomal locations different from those of the 45S rRNA genes but still occur as tandem repeats at one or more chromosomal loci (Fig. 1) (Highett et al. 1993; McStay 2016; Scaldaferro et al. 2016; Zhang et al. 2016).

Fig. 1

rRNA gene arrangement and ribosome biogenesis. A chromosome showing the location of the nucleolus organizer region (NOR), where an array of rDNA repeats is arranged in a head-to-tail fashion. Each rDNA unit has a coding region, and adjacent units are separated by intergenic spacers (IGS). Each rDNA unit is transcribed by RNA Pol I to generate a single pre-rRNA, which is processed to separate the 18S, 5.8S, and 25S or 28S rRNAs from the internal transcribed spacers (ITS-1 and ITS-2) and the external transcribed spacers (5′ETS and 3′ETS). (Right) 5S rRNA repeats are transcribed by RNA Pol III, and the transcripts are processed to form mature 5S rRNA. All the rRNAs and several ribosomal proteins are assembled into a ribonucleoprotein complex to form a mature ribosome

In Arabidopsis thaliana, the most studied diploid plant model species, rRNA genes are clustered at two loci, one on chromosome 2 called NOR2 and the other on chromosome 4 called NOR4. In the ecotype Columbia-0 (Col-0), each NOR consists of ~370 copies of 45S rRNA genes (~4 Mb), which are arranged as head-to-tail tandem repeats at the northern tips of chromosomes 2 and 4, respectively (Copenhaver et al. 1995; Copenhaver and Pikaard 1996a, b). Although the number of NOR loci appears to be two in the different ecotypes of A. thaliana, there is massive variation in 45S and 5S rRNA gene copy number, which is correlated with the similarly varying genome sizes (Davison et al. 2007; Long et al. 2013). Such a correlation also exists broadly among different plant and animal species (Prokopowich et al. 2003).
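
As a rough consistency check on the figures quoted above for Col-0, the two approximate values (~370 gene copies and ~4 Mb per NOR) can be combined to estimate the average length of one rDNA repeat, that is, one rRNA gene plus its intergenic sequence. The short Python sketch below does only this arithmetic; the function name is illustrative, and the result is no more precise than the two rounded inputs.

```python
# Approximate figures quoted in the text for ecotype Col-0.
COPIES_PER_NOR = 370        # ~370 copies of 45S rRNA genes per NOR
NOR_SIZE_BP = 4_000_000     # ~4 Mb per NOR


def mean_repeat_length_bp(nor_size_bp, copies):
    """Average length of one rDNA repeat (gene plus intergenic sequence)."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    return nor_size_bp / copies


repeat_bp = mean_repeat_length_bp(NOR_SIZE_BP, COPIES_PER_NOR)
print(f"~{repeat_bp / 1000:.1f} kb per repeat")
```

Under these inputs the estimate falls between 10 and 11 kb per repeat, which is the scale implied by the copy number and array size given in the text.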

Transcription of rRNA genes and modes of transcriptional regulation

RNA polymerase I (Pol I), which transcribes 45S rRNA genes in the nucleolus, is a multi-subunit enzyme complex that is highly conserved across eukaryotes (Russell and Zomerdijk 2006; Hannan et al. 2013; Ream et al. 2015; Turowski and Tollervey 2015). Although Pol I has been studied in greatest detail in yeast, the first analysis of the protein subunit composition of plant Pol I, along with that of Pol III, was recently carried out in Arabidopsis and showed that Pol I has 14 subunits (Ream et al. 2015). Although rRNA accounts for most of the RNA in eukaryotic cells, the number of active rRNA genes required to meet cellular demands for ribosomes and protein synthesis varies depending on cell type, developmental stage, and growth status (Chen and Pikaard 1997b; Zhou et al. 2015; Pikaard 2018). For example, in the plant genus Arabidopsis, demand for protein synthesis is higher during early stages of plant development than during later stages (Pontes et al. 2007). In the plant genus Brassica, by contrast, demand for protein synthesis is higher during flowering than during the vegetative phase (Chen and Pikaard 1997b). The phenomenon called ‘dosage control’ manifests because eukaryotic cells appear to have more rRNA genes than they need. Several lines of evidence indicate that only a subpopulation of rRNA genes is active at any given time (reviewed in Grummt and Pikaard 2003). Even in actively growing or proliferating cells, in which rRNA gene transcription can account for up to 80% of total transcription, the fraction of active genes has been found to be 50% or less (Jacob 1995; Moss and Stefanovsky 2002; Grummt and Pikaard 2003). Dosage control of rRNA genes was first observed in natural hybrids of plants and animals several decades ago, as described below.

Nucleolar dominance

In eukaryotic cells, the nucleolus is the largest structural component of the nucleus and is not enclosed by a membrane. Several cellular processes occur within the nucleolus, the most important being the assembly of ribosomes, the protein-synthesizing ribonucleoprotein organelles (Boisvert et al. 2007; Hernandez-Verdun et al. 2010; Grummt 2013). Transcription of rRNA genes by RNA polymerase I (Pol I) drives the formation of nucleoli (Tucker et al. 2010; Pikaard 2018). Interestingly, in natural hybrids of diverse eukaryotic species, the rRNA genes of only one progenitor species were found to be expressed, while the rRNA genes of the other progenitor species were silenced. This differential expression of rRNA genes was termed nucleolar dominance (ND) (Fig. 2) (Reeder 1985; Pikaard 1999; Neves et al. 2005; McStay 2006, 2016; Tucker et al. 2010; Goodfellow and Zomerdijk 2013; Hannan et al. 2013; for a detailed review, see Pikaard 2018). For example, in Arabidopsis suecica, a natural hybrid of A. thaliana × Arabidopsis arenosa, the rRNA genes of A. arenosa are preferentially expressed while the A. thaliana rRNA genes are selectively silenced during early developmental stages (Pontes et al. 2007; Pikaard 2018). ND involves uniparental silencing of rRNA genes on a multi-megabase scale, second in extent only to X chromosome inactivation in mammals (Earley et al. 2006). ND is a manifestation of dosage control of rRNA genes, similar to dosage control of X-linked genes in mammals, but unlike in X chromosome inactivation, the choice of the progenitor species for silencing in ND is not random, nor does it involve maternal or paternal effects. ND has been shown to be an epigenetic phenomenon involving cytosine methylation and various histone modifications (Santoro et al. 2002; Zhou et al. 2002; Lawrence et al. 2004; Lawrence and Pikaard 2004; Earley et al. 2006; Mayer et al. 2006; Preuss et al. 2008; Schmitz et al. 2010; Borowska-Zuchowska and Hasterok 2017).

Fig. 2

Nucleolar dominance in natural hybrids. Cartoon representing the phenomenon of nucleolar dominance as seen in the metaphase chromosomes of two progenitor species (A and B), each having a haploid set of three chromosomes, and their hybrid in the diploid state. The rRNA genes, shown located on chromosome III, are active in both species A and species B, forming individual secondary constrictions, but in the hybrid, the rRNA genes of only one progenitor species (species B) are expressed, so that a secondary constriction is formed only by the species B NOR

Similar dosage control of rRNA genes also manifests in diploid species of plants and animals (Zhou et al. 2002; Grummt and Pikaard 2003; Pontvianne et al. 2010). Findings from studies in yeast, plants, and mammals indicate that rRNA genes are regulated in at least two ways. The first is coarse control, which determines the number of genes that are on or off. Several epigenetic regulatory mechanisms are involved in this type of rRNA gene dosage control (Chen and Pikaard 1997a; Santoro and Grummt 2001; Sandmeier et al. 2002; Santoro et al. 2002; Grummt and Pikaard 2003; Pontvianne et al. 2012, 2013; Blevins et al. 2014; Holland et al. 2016). The second is fine-tuning of the rate of rRNA gene transcription, which is accomplished by regulating the number of RNA polymerase I enzymes engaged in transcription of each active gene (French et al. 2003; McStay and Grummt 2008; Grummt and Langst 2013).

Earlier hypotheses to explain dosage control of rRNA genes

It has long been known that rRNA genes are differentially regulated during growth and development, with some genes turned on while the rest are turned off. It is also well known that rRNA gene sequences within a species are nearly identical. The challenging question, then, has been how organisms choose which genes to silence and which to keep active when only some of them are required. Initially, it was hypothesized that in multicellular eukaryotes, dosage control would be accomplished by turning one gene on or off at a time, making use of sequence information inherent to each gene. Based on some evidence, it was thought that whatever sequence differences exist among rRNA genes could result in differential transcription factor binding affinities and thus in differential expression. Those hypotheses also presumed that, as for most eukaryotic genes, the default state of rRNA genes is silent and that they are selectively activated (Reeder 1985; Wong 2016). However, subsequent studies in plants showed that dosage control of rRNA genes involves selective silencing, not selective activation (Chen and Pikaard 1997b). Moreover, no differential transcription factor binding affinities for rRNA genes were detected in experiments involving transient expression or cell-free transcription systems (Saez-Vasquez and Pikaard 1997; Frieman et al. 1999). Subsequent models proposed that subtle sequence differences among rRNA genes could influence nucleosome positioning and/or base-pairing of non-coding RNAs, resulting in differential binding of regulatory proteins able to influence cytosine methylation and/or histone modifications (Chen and Pikaard 1997a; Santoro and Grummt 2001; Santoro et al. 2002, 2010; Zhou et al. 2002; Lawrence et al. 2004; Strohner et al. 2004; Earley et al. 2006; Li et al. 2006; Mayer et al. 2006; Preuss et al. 2008; Felle et al. 2010; Pontvianne et al. 2012). Such hypotheses are plausible for genetic hybrids exhibiting nucleolar dominance, in which the rRNA genes of the progenitors are differentially expressed, because the progenitors’ rRNA genes can differ substantially in their regulatory sequences (Tucker et al. 2010). However, it is difficult to conceive mechanistically how rRNA genes in a diploid, non-hybrid species can be discriminated from one another when they are identical in sequence.

A new hypothesis to explain dosage control of rRNA genes

Based on multiple lines of evidence from studies in Arabidopsis, a new hypothesis has been proposed to explain how dosage control of rRNA genes takes place. According to this hypothesis, rRNA genes are regulated at the subchromosomal (NOR) level, not at the level of individual genes. Further, the chromosomal context, not the gene sequence, determines the activity status of rRNA genes (Chandrasekhara et al. 2016; Mohannath et al. 2016). The sections below include the description and illustration of evidence that led to this new hypothesis.

The ecotype (strain) Col-0 of the diploid species A. thaliana contains ~1500 copies of 45S rRNA genes, distributed approximately equally between the two NORs (NOR2 and NOR4) (Fig. 3a) (Copenhaver and Pikaard 1996b). These genes are nearly identical in sequence except for the variable 3′ETS regions, which, upon amplification with flanking PCR primers, yield products that define the VAR1, VAR2, VAR3, and VAR4 rRNA gene subtypes (Pontvianne et al. 2010). Among these variants, VAR1 constitutes ~48% of the total rRNA genes, followed by VAR2 (~30%), VAR3 (~22%), and the least abundant VAR4 (~1% or less) (Fig. 3b) (Pontvianne et al. 2010). All four variants are expressed in newly germinated seeds, but by 10–14 d post-germination and throughout subsequent stages of development, approximately half of the rRNA genes (the VAR1 subtype and a subset of the VAR3 subtype) are selectively silenced (Earley et al. 2006, 2010; Pontvianne et al. 2010, 2013; Chandrasekhara et al. 2016). However, the same set of rRNA gene subtypes fails to be silenced when seeds are germinated on plant growth medium containing the cytosine methylation inhibitor 5-aza-2′-deoxycytidine or when plants are deficient for histone deacetylase 6 (HDA6), a repressive chromatin modifier (Earley et al. 2006, 2010; Pontvianne et al. 2013; Chandrasekhara et al. 2016; Mohannath et al. 2016). The expression of rRNA genes was analysed mainly by RT-PCR and Fluorescence-Activated Nucleolar Sorting (FANoS) (Pontvianne et al. 2010, 2013). Results from the FANoS experiments showed that in wild-type Col-0, 2 weeks post-germination, only the active rRNA variants (VAR2, VAR3, and VAR4) are found within the nucleolus, whereas the inactive variant (VAR1) is excluded (Pontvianne et al. 2013). However, in hda6 mutant plants at a similar growth stage, all rRNA variants, including VAR1, were found within the nucleolus, corroborating the results from the RT-PCR experiments (Pontvianne et al. 2013). 
These findings showed that by default, rRNA genes are active and they are selectively silenced.

Fig. 3

rRNA gene content of Col-0 NORs. (a) Cartoon depicting NOR2 and NOR4 of Arabidopsis thaliana (ecotype Col-0). (b) A diagram depicting the variable 3′ETS region of A. thaliana (ecotype Col-0) rRNA genes and the position of the primer pair used to amplify rRNA gene subtypes VAR1 through VAR4. (Right) Gel image shows genomic DNA PCR amplification products as well as RT-PCR products obtained by using leaf DNA or RNA of A. thaliana (ecotype Col-0)

A survey of about a dozen ecotypes of A. thaliana revealed that the ecotypes do not all share the same complement of rRNA gene subtypes. For example, the ecotype Bur-0 has only the VAR1 subtype, Sha has only the VAR3 subtype, and Ler has the VAR1 and VAR3 subtypes (Chandrasekhara et al. 2016). This variation proved useful in mapping the rRNA variants to their respective NOR(s), as described below. Further, deep sequencing analysis of the Col-0 genome identified ~15 single nucleotide polymorphisms (SNPs) and short insertions/deletions (indels) among the rRNA genes, which, in combination with the four length variants, define a total of ~20 rRNA gene subtypes (Chandrasekhara et al. 2016; Mohannath et al. 2016). The SNPs and indels were converted to either cleaved amplified polymorphic sequence (CAPS) or derived CAPS (dCAPS) markers. For each of these markers, the expression state (on/off) was determined by RT-PCR (Chandrasekhara et al. 2016). All ~20 subtypes were mapped to their respective NOR(s) using recombinant inbred lines (RILs) and/or F2 progeny, derived from crosses between Col-0 and the Sha, Bur-0, or Ler ecotypes, as mapping populations (Chandrasekhara et al. 2016). For example, VAR1, VAR2, and VAR4 were mapped using populations (RILs and/or F2) obtained from crosses between Col-0 and Sha. This mapping was possible because Sha has only the VAR3 subtype and lacks VAR1, VAR2, and VAR4. In other words, Col-0 and Sha are polymorphic for the VAR1, VAR2, and VAR4 subtypes and thus fulfil a basic requirement of genetic mapping (Chandrasekhara et al. 2016). The remaining subtypes were mapped in the same way, often using multiple mapping populations derived from parents that are polymorphic for the subtype in question (Chandrasekhara et al. 2016). 
The mapping results showed that all the silent rRNA subtypes map to NOR2 and all the active subtypes map to NOR4 (Fig. 4a) (Chandrasekhara et al. 2016). In some cases, only a single SNP in the 3′ ETS (the region most distant from the promoter) differentiates active and inactive rRNA subtypes, indicating that it is the chromosomal location, not the gene sequence, that determines the activity status of rRNA genes (Chandrasekhara et al. 2016).
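
The logic of mapping a subtype to a NOR by co-segregation can be illustrated with a small Python sketch. It is not the authors' analysis pipeline: the six recombinant inbred lines, their NOR genotypes, and the function names below are hypothetical toy data chosen only to show the reasoning. The sketch assumes that the subtype is present in Col-0 and absent in the other parent (as for VAR1, VAR2, and VAR4 in Col-0 × Sha crosses), and that the parental origin of each NOR in each line has already been scored.

```python
# Hypothetical toy data: for each RIL, the parental origin of NOR2 and NOR4
# and whether the Col-0-specific subtype was detected by PCR.
RILS = [
    {"NOR2": "Col", "NOR4": "Sha", "subtype_present": True},
    {"NOR2": "Sha", "NOR4": "Col", "subtype_present": False},
    {"NOR2": "Col", "NOR4": "Col", "subtype_present": True},
    {"NOR2": "Sha", "NOR4": "Sha", "subtype_present": False},
    {"NOR2": "Col", "NOR4": "Sha", "subtype_present": True},
    {"NOR2": "Sha", "NOR4": "Col", "subtype_present": False},
]


def cosegregation(rils, nor, donor="Col"):
    """Fraction of lines in which the subtype is present exactly when
    the given NOR was inherited from the donor parent."""
    matches = sum((line[nor] == donor) == line["subtype_present"] for line in rils)
    return matches / len(rils)


def assign_nor(rils, donor="Col"):
    """Return the NOR with the highest co-segregation score, plus all scores."""
    scores = {nor: cosegregation(rils, nor, donor) for nor in ("NOR2", "NOR4")}
    best = max(scores, key=scores.get)
    return best, scores


best, scores = assign_nor(RILS)
print(best, scores)
```

In this toy population the subtype is detected in every line carrying the Col-0 NOR2 and in no other line, so it would be assigned to NOR2. A subtype present on both NORs would not co-segregate perfectly with either one, which is one reason polymorphic parents and, where needed, several mapping populations are required.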

Fig. 4

Evidence underlying the formulation of a new hypothesis to explain rRNA gene dosage control. (a) Summary of mapping results of rRNA subtypes in the ecotype Col-0. (b) Summary of NOR-specific rRNA expression analysis in the A. thaliana introgression line named ColSf-NOR4. (c) Summary of NOR-specific rRNA expression analysis in the A. thaliana line named ASC1, which was isolated from atxr5 atxr6 double mutants

The second line of evidence in support of the new hypothesis comes from a study of an introgression line referred to as ColSf-NOR4, which carries NOR2 of the Col-0 ecotype and NOR4 of the Sf-2 ecotype (Chandrasekhara et al. 2016). ColSf-NOR4 was created by crossing Sf-2 with Col-0 and then recurrently backcrossing the progeny to Col-0 for ten generations (Lee and Amasino 1995). In every backcross generation, the progeny were selected for the FRIGIDA (FRI) gene of Sf-2, and in doing so the authors inadvertently also selected for NOR4 of Sf-2, which lies ~280 kb from the FRI locus on chromosome 4. Although NORs in Arabidopsis span a few Mb, they suppress meiotic crossing over; the FRI locus and NOR4 of Sf-2 were therefore inherited together, owing to tight linkage, in every generation of backcrossing and selection for the Sf-2 FRI locus. Although ColSf-NOR4 was created to study the effect of the Sf-2 FRI gene on cold-responsive flowering, it proved to be useful genetic material for testing the new hypothesis because Sf-2 NOR4 has only the VAR1 subtype of rRNA genes. The NOR2 of ColSf-NOR4, derived from Col-0, consists predominantly of the VAR1 subtype together with some copies of the VAR3 subtype. VAR1 genes of Sf-2 can be distinguished from VAR1 genes of Col-0 by a SNP that creates a HindIII restriction site in the amplified region in Sf-2 but not in Col-0. In the ColSf-NOR4 introgression line, analysis of tissue from 4-week-old plants showed that the Sf-2 VAR1 genes located on NOR4 are expressed, while the Col-0 VAR1 and VAR3 genes located on NOR2 are silenced (Fig. 4b) (Chandrasekhara et al. 2016). Comparison of the Sf-2 VAR1 and Col-0 VAR1 gene sequences revealed 99% identity, and if only the promoter regions are compared, the identity is even higher. 
These results collectively proved that NOR4 is active, and NOR2 is silenced during development irrespective of the type of rRNA variants that occupy NOR2 or NOR4 (Chandrasekhara et al. 2016).
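
The way a single SNP can distinguish the Sf-2 and Col-0 VAR1 genes, as in a CAPS assay, can be sketched in Python as follows. HindIII recognizes AAGCTT and cuts after the first A. The two 20-nt amplicons below are hypothetical placeholders that differ at one base, not the real 3′ ETS sequences, and the function name is illustrative; real amplicons and fragment sizes would differ.

```python
HINDIII_SITE = "AAGCTT"   # HindIII recognition sequence
CUT_OFFSET = 1            # HindIII cuts between the first A and the second A


def digest(amplicon, site=HINDIII_SITE, offset=CUT_OFFSET):
    """Return the fragment lengths obtained by cutting at every site."""
    seq = amplicon.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical amplicons differing by one base (C versus T).
COL0_VAR1 = "CTGACGAAGCTCGTTACGGA"   # no HindIII site
SF2_VAR1 = "CTGACGAAGCTTGTTACGGA"    # the SNP creates AAGCTT

print("Col-0 VAR1 fragments:", digest(COL0_VAR1))
print("Sf-2 VAR1 fragments:", digest(SF2_VAR1))
```

In this sketch the Col-0 product remains uncut while the Sf-2 product is split into two fragments, which is the kind of size difference that lets RT-PCR products from the two NORs be told apart on a gel after digestion.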

The last line of evidence for the new hypothesis comes from a study of a mutant line of Arabidopsis deficient for ATXR5- and ATXR6-dependent histone H3 lysine 27 monomethylation (H3K27me1) (Mohannath et al. 2016). The atxr5 atxr6 double mutant line lacks H3K27 monomethyltransferase activity owing to mutations disrupting the ATXR5 and ATXR6 genes (Jacob et al. 2009). In atxr5 atxr6 double mutants, heterochromatin over-replicates (Jacob et al. 2010), a potential source of chromosome instability at heterochromatin-euchromatin boundaries. In the line studied, at least 2–3 Mb of NOR4, including the adjoining telomere (TEL4N), had been converted to the corresponding sequences of NOR2 and its adjoining telomere (TEL2N) (Mohannath et al. 2016). This line is referred to as ASC1 (Altered Subtype Content 1) because most of the VAR2 and VAR3 genes and all the VAR4 genes of NOR4 have been lost and replaced by the VAR1 and VAR3 genes of NOR2. In ASC1, NOR2-specific rRNA genes, which are normally silenced, escape silencing upon translocation to NOR4 (Fig. 4c). This activation of NOR2-specific rRNA genes now located on NOR4 is independent of the atxr mutations. The chromosome 2 position effect on rRNA gene silencing is not explained by the NOR2-associated telomere, TEL2N, because TEL2N has also been translocated to NOR4 (Mohannath et al. 2016). Moreover, the intact NOR2 remains silenced in ASC1, as revealed by analysis of rRNA variants isolated using the FANoS approach (Mohannath et al. 2016). This study thus provided direct experimental support for the new hypothesis that, as part of dosage control, rRNA genes are regulated at the level of NORs, under the influence of chromosomal location, and not at the level of individual genes on the basis of sequence differences. 
An earlier study of the ATXR5 and ATXR6 H3K27 monomethyltransferases had concluded that these enzymes play a role in rRNA dosage control by somehow preferentially creating an epigenetic silencing mark on NOR2 chromatin, so that the loss of ATXR5 and ATXR6 leads to release of NOR2 silencing (Pontvianne et al. 2012). However, the study described above showed that the release of NOR2 silencing was due to the translocation event that occurred in the atxr5 atxr6 double mutants and was independent of the atxr5 atxr6 mutations themselves (Mohannath et al. 2016). In addition, analysis of rRNA variant expression in newly isolated atxr5 atxr6 double mutants with intact NORs (carrying no translocation events) revealed that the atxr5 and atxr6 mutations have only a modest effect on releasing NOR2 silencing, contrary to what was previously thought. Taking this study as a cautionary tale, the proposed effects of other characterized epigenetic modifications on rRNA gene silencing per se need to be revisited. Such efforts are currently being made by our laboratory. In summary, the new hypothesis proposes that in Arabidopsis, both NOR2 and NOR4 are active during the initial stages of development and that, ~2 weeks post-germination, NOR2 is selectively silenced as part of rRNA gene dosage control (Fig. 5) (Chandrasekhara et al. 2016; Mohannath et al. 2016). Lastly, existing evidence indicates that 5S rRNA genes are also selectively silenced and are regulated at the chromosomal level (Vaillant et al. 2007; Simon et al. 2018).

Fig. 5

Developmental regulation of 45S rRNA genes in Arabidopsis thaliana. A cartoon depicting activity status of NOR2 and NOR4 in A. thaliana during different stages of development

The proposition that regulation occurs at the NOR level is consistent with previous studies on nucleolar dominance (ND) in the allotetraploid species Arabidopsis suecica, a natural hybrid of A. thaliana × A. arenosa. In this hybrid, A. arenosa rRNA genes are preferentially expressed while A. thaliana rRNA genes are selectively silenced during early developmental stages (Pontes et al. 2007; Pikaard 2018). Similar preferential expression of the rRNA genes of one progenitor species has also been demonstrated in other tetraploids of Arabidopsis and Brassica (Chen and Pikaard 1997b; Lewis et al. 2007). Another study, which demonstrated that silencing in ND is restricted to rRNA genes within NORs and does not spread to adjacent genes, also indicated that NORs may be regulatory units (Lewis and Pikaard 2001). In that study, the authors showed that a protein-coding gene located ~3 kb from the silenced NOR is expressed, indicating that silencing does not spread from the NOR into the adjoining centromere-proximal region. Studies involving RILs derived from crosses between the Cvi-0 and Ler ecotypes of A. thaliana also provided some evidence in support of the hypothesis that NORs act as units of rRNA gene regulation (Lewis et al. 2004). Further, full-length A. thaliana rRNA transgenes ectopically inserted at locations outside the NORs in the A. thaliana genome escaped silencing in interspecific hybrids, indicating that localization of rRNA genes within a NOR is essential for their silencing (Lewis et al. 2007).

Some available evidence suggests that rRNA gene regulation at the level of NORs may also be common to other multicellular eukaryotes. Studies in fruit flies, barley, and wheat showed that changing the location of a NOR or deleting NOR-adjacent regions can disrupt NOR silencing (Durica and Krider 1978; Schubert and Künzel 1990; Viera et al. 1990; Greil and Ahmad 2012). An earlier study of a human cell line showed that, in most cells, only a subset of the NORs located on the five acrocentric chromosomes (chromosomes 13, 14, 15, 21, and 22) is associated with the RNA Pol I transcription machinery, indicating that active and inactive rRNA genes occupy different NORs (Roussel et al. 1996; McStay and Grummt 2008). However, this study was conducted in HeLa cells, an established human cancer cell line, and may therefore not reflect what happens in normal human cells, because dysregulation of rRNA genes is a hallmark of several human cancers (Bywater et al. 2012; Uemura et al. 2012; Zhou et al. 2016; Tsoi et al. 2017). In fact, rRNA gene transcription and rRNA levels have been targeted therapeutically to control cancers (Bywater et al. 2012; Tsoi et al. 2017; Low et al. 2019). It will therefore be important to learn how rRNA genes are regulated in humans, as the findings may provide additional therapeutic targets to control cancers (McStay 2016).

Future directions

Recent studies in Arabidopsis have answered some vital questions concerning rRNA gene regulation in eukaryotes but, at the same time, have given rise to interesting new questions. First, why is NOR2 selectively targeted for silencing? Physical mapping studies of the NORs of A. thaliana (ecotype Col-0) using pulsed-field gel electrophoresis (PFGE), a technique for analyzing large DNA fragments (up to 9 Mb) (Schwartz and Cantor 1984; Cox et al. 1990), indicated that both NOR2 and NOR4 are composed entirely of tandem head-to-tail rRNA gene repeats oriented in the same direction, with no intervening non-rRNA gene sequences (Copenhaver et al. 1995; Copenhaver and Pikaard 1996a, b). Moreover, physical mapping, Southern blot analysis, and cloning and sequencing of NOR-telomere junctions showed that the terminal rRNA genes of NOR2 and NOR4 are immediately bordered by telomere repeats, with no unique subtelomeric sequences in between (Copenhaver and Pikaard 1996a; Mohannath et al. 2016). Collectively, these studies rule out the presence of putative regulatory sequences that might distinguish the two NORs, either within the NORs or between each NOR and its adjoining telomere (Copenhaver and Pikaard 1996a; Mohannath et al. 2016). Any role for the NOR2-adjoining telomere (TEL2N) in NOR2 silencing has also been ruled out (Mohannath et al. 2016). However, the centromere-proximal region immediately adjacent to NOR2 is composed of transposable elements and transposon remnants that extend for ~75 kb before the first protein-coding genes are found (Chandrasekhara et al. 2016). This region, characterized by heavy cytosine methylation and histone post-translational modifications, appears to form condensed, transcriptionally repressed chromatin. In contrast, the corresponding NOR4-flanking region has few transposon-related sequences, with active protein-coding genes encountered ∼3 kb from the edge of NOR4 (Lewis and Pikaard 2001). 
Thus, one hypothesis is that heterochromatin formation initiated in the transposon-rich NOR2-flanking region could spread across the entire 4 Mb of NOR2, thereby condensing and inactivating the rRNA genes located within it. As possible supporting evidence, NOR2 has been shown to be hypermethylated compared with the hypomethylated NOR4 (Pontvianne et al. 2013; Mohannath and Pikaard 2016) (Mohannath et al. unpublished data). The Pikaard Lab at Indiana University is currently testing this hypothesis using chromosome engineering tools. A similar hypothesis has been proposed for the fruit fly Drosophila, in which the region adjacent to the Y chromosome rDNA array is suspected to be important in determining the dominance of the Y chromosome over the X chromosome (Greil and Ahmad 2012). However, how such sequences could play a role in nucleolar dominance remains to be understood.

Another observation of interest is that gene density is higher on chromosome 4 than on chromosome 2, indicating a corresponding enrichment of euchromatic regions on chromosome 4 (Lenoir et al. 2001). Could this difference in gene density predispose NOR4 to be active and NOR2 to be silenced in A. thaliana? This hypothesis remains to be tested. Further, is NOR4 always dominant over NOR2 in A. thaliana? Multiple studies indicate that all possible relationships between NOR2 and NOR4 exist. A study using genome sequencing data from the 1001 Genomes Consortium characterized rRNA gene sequence variation within and among several accessions of A. thaliana (Rabanal et al. 2017). The variants were called rDNA haplotypes, and those belonging to one NOR of an ecotype were called an rDNA cluster. Using linkage analysis among the rDNA clusters of various ecotypes of A. thaliana, the authors showed that rRNA gene cluster expression is controlled through complex epistatic and allelic interactions between rDNA haplotypes, which apparently regulate the entire rRNA gene cluster (Rabanal et al. 2017). The study also reported dominance of NOR2 over NOR4, dominance of NOR4 over NOR2, and codominance between NOR2 and NOR4 when NOR2 and NOR4 of different ecotypes are brought together (Rabanal et al. 2017). Such varied relationships between NOR2 and NOR4 had also been observed in studies involving Cvi-0 × Ler recombinant inbred lines (RILs) of A. thaliana (Lewis et al. 2004). In a related study, genome-wide Hi-C analysis in Arabidopsis revealed that NORs interact with each other but make no other genomic contacts (Feng et al. 2014). Even so, these findings do not entirely rule out the involvement of other genomic loci in determining which NOR(s) are silenced or kept active.

Lastly, it would be interesting to know how the NOR2s and NOR4s are regulated in the tetraploid species A. suecica, a natural hybrid of A. thaliana and A. arenosa, in which the rRNA genes of A. thaliana undergo selective silencing while those of A. arenosa remain active, as described above (Pikaard 1999, 2018). Similar questions concerning rRNA gene regulation can also be extended to other plant species and to non-photosynthetic eukaryotes that have more than one NOR.