Heterochromatin: the classical definition

Emil Heitz (1928) originally identified heterochromatin as the nuclear material distinguished by its dense pattern of staining throughout the cell cycle. This “constitutive” heterochromatin generally includes telomeric and pericentric regions, both critically important structural domains required for the maintenance of intact chromosomes and faithful transmission of their genetic information (Le et al. 2004). In the past 20 years, the cytological definition of heterochromatin has been expanded to include chromosomal regions that, while not visibly condensed, exert silencing on reporter genes. Repetitive blocks of ribosomal RNA genes are packaged in heterochromatic structures, which serve to silence some repeats and suppress recombination between them (Gottlieb and Esposito 1989). In fission yeast, the inactive mating-type loci are packaged as heterochromatin, which silences and suppresses these regions of the genome (Lorentz et al. 1992) while promoting the long-range chromatin interactions necessary for mating-type class switching (Jia et al. 2004). Budding yeast also silence inactive mating-type loci using a similar but evolutionarily distinct system which has been treated thoroughly in other recent reviews (Huang 2002; Moazed 2001) and will not be considered here. Most of these blocks of constitutive heterochromatin are characterized by a high proportion of repetitious sequences, late S-phase replication, and lack of recombination (Richards and Elgin 2002). It must be noted that these key features of constitutive heterochromatin are not present in every case but have sufficient breadth of applicability to be useful generalizations for the purposes of this discussion.

Constitutive heterochromatin exerts a repressive effect on the expression of most normally euchromatic genes placed in or near it; this is the basis of position effect variegation (PEV). Although PEV has been observed in many organisms from budding yeast to mammals (Dillon and Festenstein 2002), it was originally described in flies as a mottled red and white eye phenotype, the result of an X-ray-induced inversion of a large segment of the X chromosome (Muller 1930). The mutant phenotype arises from a rearrangement that moves the gene white (required for normal red eye pigmentation) from its wild-type distal position to a position proximal to a breakpoint in pericentric heterochromatin (Muller 1930; Panshin 1941). As a result of this change in position relative to the centromere and its associated heterochromatic mass, the white gene is silenced in a subset of the cells that pigment the eye. The activity state, either on or off, is clonally inherited, resulting in mosaic expression. The dependent pattern of silencing of a second gene, roughest, which is on the same chromosome arm but further from the centromeric heterochromatin than white in these rearrangements, provided evidence for a spreading mechanism that has a range of hundreds of kilobases (Schultz 1939). More recently, it has been noted that there are cases where roughest is silent in eye facets where white is expressed, indicating a more complex mechanism than the simple linear extension of pericentric heterochromatin initially envisioned, perhaps involving multiple centers of initiation to relay the signal for heterochromatin assembly (Henikoff 1996; Talbert and Henikoff 2000). The random occurrence of white expression in variegating mutants is thought to reflect the stochastic nature of the mechanism by which constitutive heterochromatin is established and maintained. The property of spreading is another hallmark of classically identified heterochromatin domains. An understanding of heterochromatin that explains these properties is the goal of current research in chromatin structure.
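
The combination of a stochastic on/off decision and clonal inheritance can be illustrated with a toy simulation. The sketch below is not taken from any of the cited studies. It assumes, purely for illustration, that each founder cell of the eye makes a single random decision about white and that every descendant inherits it; the function name `variegated_eye` and the parameters `n_founders`, `divisions`, and `p_silenced` are hypothetical.

```python
import random

def variegated_eye(n_founders=8, divisions=3, p_silenced=0.5, seed=0):
    """Toy model of PEV (illustrative assumption, not a published model).

    Each founder cell makes one random ON/OFF decision for the white gene,
    and all of its descendants inherit that decision unchanged.
    Returns one list per clone: "R" = white expressed (red), "w" = silenced.
    """
    rng = random.Random(seed)
    clones = []
    for _ in range(n_founders):
        silenced = rng.random() < p_silenced   # stochastic establishment
        clone_size = 2 ** divisions            # clonal expansion, state inherited
        clones.append(["w" if silenced else "R"] * clone_size)
    return clones

if __name__ == "__main__":
    # Each printed row is one clonal patch of the mosaic.
    for clone in variegated_eye():
        print("".join(clone))
```

Under these assumptions, every patch is uniformly red or white, and the eye as a whole is a mosaic whose proportions depend on `p_silenced`.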

Molecular revisions to the classical definition

The finding that an ectopic copy of white mobilized in a transposable element vector recapitulates the PEV phenotypes observed in classical genetic rearrangements when inserted in a heterochromatic domain (Hazelrigg et al. 1984) opened up the possibility of using molecular biological techniques to study PEV. In the past decade, molecular biologists have analyzed the chromatin packaging of such reporter transgenes using a variety of tools to refine the definition of heterochromatin. Reflecting the condensed state of heterochromatin observed microscopically, variegating transgenes are more resistant to DNase I digestion than uniformly expressed transgenes inserted in euchromatin. Heterochromatin formation has been found to result in regularly spaced nucleosome arrays; in some cases, the spacing reflects the repetitive nature of underlying DNA sequences. Perhaps as a consequence of regular spacing of nucleosomes, the formation of heterochromatin on an hsp26 reporter gene results in loss of the DNase I hypersensitive sites normally present at the 5′ regulatory sequences (Sun et al. 2001; Wallrath and Elgin 1995).

Molecular characterization has led to the identification of several key constituents of heterochromatin, which are conserved from fission yeast to humans. Extensive methylation of histone H3 at lysine 9 (H3K9me) is a characteristic of heterochromatin domains shared by most eukaryotes (Richards and Elgin 2002). This posttranslational modification is catalyzed by the evolutionarily conserved H3K9-specific histone methyltransferase SU(VAR)3-9 (Suv39h1&2 in mammals, Clr4 in fission yeast). Another conserved feature of heterochromatin is the presence of the non-histone chromosomal protein heterochromatin protein 1 (HP1, Swi6 in fission yeast). HP1 is a small protein with two conserved domains critical for its function in heterochromatin, the N-terminal chromodomain and the C-terminal chromoshadow domain, separated by a “hinge” domain. HP1 interacts stably with SU(VAR)3-9 via its chromoshadow and hinge domains, and with the H3K9 methyl mark via its chromodomain in flies and mammals (Aagaard et al. 1999; Bannister et al. 2001; Jacobs et al. 2001; Lachner et al. 2001; Nakayama et al. 2001). SU(VAR)3-9 and HP1 associate with chromatin interdependently (Schotta et al. 2002). Together, the biochemical triad of SU(VAR)3-9 enzyme, its catalytic product H3K9me, and HP1 forms the foundation for a self-assembly mechanism not unlike the SIR protein silencing system in budding yeast. Such assembly systems could permit heterochromatic structure to spread linearly on a chromosome from putative sites of initiation, as suggested by earlier studies of PEV (Locke et al. 1988).

Consistent with this proposal, loss-of-function mutations in, or depletion of the products of, genes encoding heterochromatin components such as SU(VAR)3-9 and HP1 result in suppression of the PEV phenotype (diminution of heterochromatin and loss of silencing) in flies, while overexpression leads to enhancement of PEV (increased heterochromatin formation and increased silencing) (Eissenberg et al. 1992; Schotta et al. 2002). DNase I hypersensitive sites lost upon heterochromatinization of an hsp26 transgene are regained in flies heterozygous for HP1 (Su(var)2-5) mutations (Cryderman et al. 1998). Cytological analysis of chromosomes from Su(var)3-9 mutants shows defects in chromocenter morphology that are consistent with the PEV results (Schotta et al. 2002).
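
The dosage dependence of a linear spreading mechanism can be sketched numerically. The example below is a deliberately simplified illustration, not a model from the cited work. It assumes that heterochromatin advances one nucleosome at a time from an initiation site, that each step succeeds with a probability proportional to the dose of a limiting component (standing in for SU(VAR)3-9 or HP1), and that a reporter is silenced only if spreading reaches it. The names `spread_extent` and `fraction_silenced`, and the parameters `base_p` and `dose`, are hypothetical.

```python
import random

def spread_extent(n_nucleosomes, p_step, rng):
    """Count consecutive nucleosomes converted, starting at the initiation site.

    Spreading stops at the first failed step (assumed linear, unidirectional).
    """
    extent = 0
    while extent < n_nucleosomes and rng.random() < p_step:
        extent += 1
    return extent

def fraction_silenced(reporter_pos, dose, base_p=0.6, n_cells=10000, seed=0):
    """Fraction of simulated cells in which spreading reaches the reporter.

    dose: relative amount of the limiting component (1.0 = wild type,
          0.5 = heterozygous loss of function, 1.5 = extra copy); an assumption.
    base_p: assumed per-nucleosome success probability at wild-type dose.
    """
    rng = random.Random(seed)
    p_step = min(1.0, base_p * dose)
    silenced = sum(
        spread_extent(reporter_pos, p_step, rng) >= reporter_pos
        for _ in range(n_cells)
    )
    return silenced / n_cells

if __name__ == "__main__":
    for dose in (0.5, 1.0, 1.5):
        print(dose, fraction_silenced(reporter_pos=5, dose=dose))
```

In this sketch, halving the dose lowers the silenced fraction (suppression of variegation) and raising it increases the silenced fraction (enhancement), which is the qualitative behavior described above. The numerical values carry no biological meaning.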

Other components of this heterochromatin-forming system, best studied in fission yeast and in flies, may be placed in a sequential pathway with SU(VAR)3-9 and HP1. Methylation of H3K9 necessarily requires prior deacetylation of H3K9, and RPD3 (HDAC1) is known to form a complex with SU(VAR)3-9 and to function with it to accomplish this transition (Czermin et al. 2001). Other components that reverse euchromatic marks, such as histone H3K4 demethylases, are likely to be involved (Shi et al. 2004). Methylation by SU(VAR)3-9 is required in turn for stable binding of HP1 and for the initiation of a spreading cascade of heterochromatin formation (Hall et al. 2002; Schotta et al. 2003). SU(VAR)3-9 is epistatic to the SU(VAR)4-20 enzyme, which binds to HP1 and methylates lysine 20 of histone H4, further promoting stable silencing (Schotta et al. 2004). Unlike SU(VAR)3-9 and HP1, RPD3 and SU(VAR)4-20 have a broad distribution on polytene chromosomes, indicating their participation in many euchromatic processes as well.

What makes heterochromatin heterochromatin?

H3K9me and abundant deposition of HP1 are conserved, key features of constitutive heterochromatin domains necessary for heterochromatin structure and function in most eukaryotes. However, although necessary, these marks do not themselves uniquely identify heterochromatin. It has been noted that both HP1 and H3K9me are found at a subset of sites in the euchromatic arms of polytene chromosomes of flies (Cowell et al. 2002; Fanti et al. 2003), particularly where clusters of repetitious sequence elements occur (de Wit et al. 2005). A similar situation obtains in other species. In mammals, methylation of H3K9 and recruitment of HP1 have also been observed at the promoters of repressed genes, where they appear to contribute to gene silencing (Nielsen et al. 2001; Ogawa et al. 2002; Roopra et al. 2004; Schultz et al. 2002). In these instances, there is no indication that a spreading mechanism (like that invoked to explain variegation at pericentric inversions) is at work. Still, in each of these cases, H3K9me and HP1 are associated with the formation of a closed chromatin structure that is repressive to transcription, which is also a feature of constitutive heterochromatin domains. It should be noted, however, that certain genes, termed heterochromatic genes, require the presence of HP1 for normal levels of expression (Lu et al. 2000); the mechanism involved is not yet known. Unlike the surrounding chromatin, the transcribed regions of heterochromatic genes do not exhibit the regular nucleosome spacing typically associated with heterochromatin structure but rather show the less regular packaging associated with euchromatin (Sun et al. 2001).

Harder to reconcile with the notion that H3K9me and HP1 constitute unique marks of transcriptionally silent domains is the fact that HP1 can also be found in association with transcriptionally active heat shock puffs in polytene chromosomes (Piacentini et al. 2003). More surprising still is the recent report of both H3K9me and recruited HP1 in the transcribed regions of activated genes (Vakoc et al. 2005). The presence of HP1 and H3K9me in these genes is coextensive with the active chromatin mark H3K4me and may be dependent upon elongation by RNA polymerase II. The generality and significance of these findings have not been determined, but they certainly call into question the simple model in which HP1 and H3K9me are exclusive marks of silent chromatin. Moreover, the presence of H3K9me and HP1, which are necessary for the formation of heterochromatin, in the bodies of transcribed genes raises questions about how their function is specified in each context. The existence of other H3K9 methyltransferases and of distinct isoforms of HP1 may go some way toward answering these questions.

Elaborating the molecular model of heterochromatin

While the chromatin of active genes apparently can contain H3K9me and HP1, the methylation of H3K9 in such regions is likely to be accomplished by an enzyme distinct from SU(VAR)3-9. Two other families of SET methyltransferases, which use H3K9 as their substrate, have been characterized in the last few years (Fig. 1). These are the G9a family and the SET domain bifurcated (SETDB) family (Schultz et al. 2002; Tachibana et al. 2002; Yang et al. 2002). The histone substrate specificity of these enzymes has been demonstrated in mammals, as has their preference for euchromatic substrates (Dodge et al. 2004; Tachibana et al. 2002). With few exceptions, enzymes from these two families are recruited by repressor complexes to perform promoter-specific histone H3K9 methylation (Table 1) (Ogawa et al. 2002; Roopra et al. 2004; Schultz et al. 2002), and it is expected that these enzymes will also be responsible for histone methylation in the body of active genes.

Fig. 1

Phylogenetic relationships among H3K9 methyltransferases in eukaryotes. An unrooted tree diagram generated on the basis of CLUSTALW alignment of the full-length peptide sequence of each protein represented. The tree has three main branches representing the three families of H3K9 methyltransferases: the Suv39h family [which includes the only representatives from the fission yeast (Clr4p) and bread mold (DIM5) genomes], the G9a family, and the SETDB family. Protein names designate species first (h, human; m, mouse; x, frog; d, fly) except in the case of Clr4p and DIM5.

Table 1 H3K9 methyltransferases and HP1 proteins recruited by repressor complexes (mammals)

Cytological studies have confirmed the distinct genomic targets of the G9a and Suv39h methyltransferases (Peters et al. 2003; Rice et al. 2003; Wu et al. 2005). Moreover, these studies indicate that the two enzymes are responsible for distinct degrees of methylation on their respective genomic substrates. In cultured mammalian cells, G9a is responsible for the bulk of H3K9 mono- and dimethylation in silent euchromatin, while the Suv39 enzymes catalyze trimethylation specifically in pericentric heterochromatin (Peters et al. 2003; Rice et al. 2003). The specific contribution of SETDB methyltransferases to wild-type histone methylation patterns in vivo has not been reported to date, but SETDB1 can methylate pericentric H3K9 in the absence of Suv39h1&2 activity in mammalian cells (Kourmouli et al. 2005). In vitro, SETDB1 is capable of producing mono- and dimethylated lysine 9 and can catalyze conversion of dimethyl to trimethyl lysine 9 in the presence of an additional cofactor (Wang et al. 2003). The histone substrate is potentially complex in terms of other modifications that may be present; whether the different enzymes have different preferences in this regard remains to be explored.

These results altogether suggest that H3K9 methylation in heterochromatin and euchromatin is regulated by distinct cellular systems, which utilize different methyltransferase enzymes. Questions still remain, however, regarding the broad distribution of HP1, the protein which binds methylated H3K9 with specificity and high affinity (Jacobs et al. 2001). It is important to note that HP1 is not a single protein, but a protein family in many eukaryotes (Fig. 2). Both mammals and flies have three distinct HP1 isoforms (in mammals, HP1α, β, and γ; in flies, HP1a, b, and c). While all of the isoforms share a common domain structure, having an N-terminal chromodomain and a C-terminal chromoshadow domain, they all differ from one another significantly at the level of peptide sequence. This variability is mirrored in functional differences between the isoforms within a species. All of the HP1 isoforms in insects and mammals are chromatin proteins, but each has a pattern of localization distinct from other family members. In mammals, HP1α and HP1β are found in overlapping domains, primarily in pericentric chromatin, as is Drosophila HP1a, whereas mammalian HP1γ is also found at discrete sites in euchromatin, as are Drosophila HP1b and HP1c (Minc et al. 1999; Smothers and Henikoff 2000). When HP1α is expressed in Drosophila melanogaster, its distribution on polytene chromosomes overlaps completely with fly HP1a, and it can rescue homozygous lethal Su(var)2-5 mutations (Ma et al. 2001; Norwood et al. 2004). Each of these HP1 isoforms presumably participates in a distinct set of interactions with other cellular factors at its sites of function in the genome, and these interactions may modify its function.

Fig. 2

Phylogenetic relationships among HP1-like proteins from mammals, flies, and fungi. An unrooted tree diagram generated on the basis of CLUSTALW alignment of the chromodomain peptide sequence of each protein represented. The mammalian HP1α, β, and γ are more similar to each other (paralogous) than to the fly HP1a, b, and c proteins or the fungal HP1-like proteins from bread mold (NcHP1) and fission yeast (Chp1&2p and Swi6p).

When mammalian HP1 is recruited to repressed promoters upon H3K9 methylation, HP1α is the predominant isoform utilized, although HP1β and γ are also detected in some cases (Ichimura et al. 2005; Nielsen et al. 2001; Roopra et al. 2004; Schultz et al. 2002). Interestingly, the only HP1 protein found by chromatin immunoprecipitation to be abundant in the transcribed region of a number of activated genes is HP1γ (Vakoc et al. 2005). Relevant to this finding is the fact that HP1γ overexpression seems to enhance gene expression at HP1 target genes, while HP1α and HP1β overexpression is repressive (Hwang and Worman 2002). Coimmunoprecipitation experiments have demonstrated that HP1γ is associated with the phosphorylated CTD of elongating RNA polymerase II, which may help to specify its unique pattern of deposition (Vakoc et al. 2005). The functional consequences of HP1α and β deposition in pericentric and telomeric regions of the genome include the establishment of a tightly condensed, well-ordered chromatin structure that is generally refractory to transcription. HP1γ may function in a distinct but related way to HP1α and β, perhaps stabilizing chromatin structure in the wake of transcription. Whether or not HP1γ interacts with H3K9me in this context (or, indeed, whether the total modification state of the H3 tails would lend itself to such interaction) remains to be investigated; such an analysis could illuminate the function of HP1 at transcribed genes. HP1 may alternatively play a role in transcription elongation by an entirely different mechanism related to the RNase-sensitive interactions of HP1 at sites of transcription (Piacentini et al. 2003). Another study has determined that the hinge region of HP1α confers RNA binding activity both in vitro and in vivo, indicating that RNase treatment may eliminate a direct target of HP1 binding (Muchardt et al. 2002).

A role for small RNAs in heterochromatin formation

Exciting studies over the past several years have uncovered a role for small RNAs and the proteins that interact with them in directing gene silencing at the genome level. The first hints of an RNA-related gene silencing mechanism came from cosuppression in plants, where a transgenic copy of a flower pigmentation gene (chalcone synthase) caused suppression of both the transgene and the endogenous gene (Napoli et al. 1990; van der Krol et al. 1990). At that time, it was not understood how this silencing occurred, although there was evidence that pointed to an RNA turnover mechanism (van Blokland et al. 1994). Key findings in other organisms shed light on the mechanism behind this phenomenon. The surprising discovery in Caenorhabditis elegans that injection of double-stranded RNA (dsRNA) caused sequence-specific silencing of a homologous gene expanded the known functions of RNA in a cell (Fire et al. 1998). This dsRNA-induced mode of silencing was termed RNA interference (RNAi) and can occur in a host of organisms including fission yeast, flies, mammals, and plants (Kennerdell and Carthew 1998; Ngo et al. 1998; Paddison and Hannon 2002; Raponi and Arndt 2003; Sanchez Alvarado and Newmark 1999; Shi 2003). One notable exception is the well-studied yeast Saccharomyces cerevisiae, which lacks the RNAi machinery (and lacks homologues of HP1 and H3K9-specific histone methyltransferases).

Investigation into the biological mechanism behind RNAi has generated insights into the pathways by which dsRNA-induced silencing occurs. Genetic and biochemical studies have identified the endogenous components that mediate the dsRNA-induced gene silencing. These components are collectively referred to as the RNAi machinery. Common components include the dsRNA endonuclease Dicer (Bernstein et al. 2001) and members of the Argonaute family of small RNA-binding proteins (Grishok et al. 2000; Hammond et al. 2001; Tabara et al. 1999), which are conserved across all organisms that undergo RNAi-dependent silencing. One or more RNA-dependent RNA polymerases (RdRp) also play a key role in RNAi-dependent gene silencing in several organisms, although their occurrence is not as widespread as that of Dicer and the Argonaute family (Dalmay et al. 2000; Sijen et al. 2001). In addition to protein components, small interfering RNAs (siRNAs) of 19–25 nt, which are homologous to regions of the target gene, were identified in cells challenged with dsRNA (Hamilton and Baulcombe 1999; Hammond et al. 2000; Zamore et al. 2000). The first characterized mode of RNAi-mediated gene silencing occurs post-transcriptionally (post-transcriptional gene silencing, PTGS), using a mechanism whereby the siRNAs, generated from long dsRNAs by Dicer, guide the destruction of the target mRNA by an Argonaute protein (Bernstein et al. 2001; Hammond et al. 2000, 2001; Liu et al. 2004; Song et al. 2004; Zamore et al. 2000).
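
The sequence logic of the PTGS pathway described above can be sketched in a few lines. The example below is a string-level illustration only and makes several simplifying assumptions that are not in the text: Dicer is modeled as cutting the sense strand of the trigger into consecutive 21-nt pieces (within the 19–25 nt range given above), duplex overhangs and strand selection are ignored, and an mRNA counts as targeted only if it contains a perfect match to a guide. All function names are hypothetical.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Return the reverse complement of an RNA string (5'->3')."""
    return rna.translate(COMPLEMENT)[::-1]

def dice(sense_strand, size=21):
    """Toy Dicer: cut the sense strand of a long dsRNA into consecutive
    size-nt pieces and return the antisense (guide) strand of each piece.
    A trailing fragment shorter than `size` is discarded (an assumption)."""
    pieces = [
        sense_strand[i:i + size]
        for i in range(0, len(sense_strand) - size + 1, size)
    ]
    return [reverse_complement(p) for p in pieces]

def is_targeted(mrna, guides):
    """Toy Argonaute: an mRNA is targeted if any guide is perfectly
    complementary to some stretch of it."""
    return any(reverse_complement(g) in mrna for g in guides)

if __name__ == "__main__":
    trigger = "AUGC" * 12                       # 48-nt stand-in for a dsRNA trigger
    guides = dice(trigger)
    print(len(guides), "guides of", len(guides[0]), "nt")
    print(is_targeted("GGG" + trigger + "AAA", guides))  # homologous mRNA
    print(is_targeted("A" * 60, guides))                  # unrelated mRNA
```

The point of the sketch is only that specificity comes from sequence homology between the trigger and the target; it says nothing about the kinetics or the protein machinery involved.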

While the initial role established for small RNAs and the RNAi machinery was in PTGS, their function does not end there. In 2002, the breakthrough discovery was made in Schizosaccharomyces pombe that several of the proteins required for RNAi-mediated PTGS were also involved in heterochromatin formation at the centromere and mating-type locus, which results in transcriptional gene silencing (TGS) (Hall et al. 2002; Volpe et al. 2002). Deletion of the genes for Dicer, Argonaute, or RdRp, each present in only one copy in S. pombe, causes loss of silencing of a reporter gene inserted in the outer repeat region surrounding the centromere and at an ectopic site that carries the minimal centromere sequence necessary to silence a reporter gene (Volpe et al. 2002, 2003). At the heterochromatic mating-type region in S. pombe, the RNAi machinery is required to establish, but not maintain, silencing (Hall et al. 2002). Small RNAs have been identified that are homologous to the outer repeats of the centromere and to a similar sequence, cenH, located at the mating-type region (Cam et al. 2005; Reinhart and Bartel 2002). Loss of any one of these RNAi components leads to a loss of the heterochromatic marks H3K9me and HP1 (Swi6 in S. pombe) at the pericentric heterochromatin or at the mating-type locus when a second, redundant pathway is also disrupted (Jia et al. 2004; Yamada et al. 2005). Subsequent studies have suggested that the RNAi machinery also plays a similar role in heterochromatin formation in other organisms, including plants, flies, and vertebrate cells (Table 2) (Fukagawa et al. 2004; Kanellopoulou et al. 2005; Pal-Bhadra et al. 2004; Zilberman et al. 2003).

Table 2 Factors implicated in heterochromatin formation

Most heterochromatic genomic regions, from a variety of organisms, are enriched for transposable elements and repetitive DNA. This enrichment implicates these sequences, and the small RNAs corresponding to them, in a possible functional role in targeting heterochromatin formation. While in some cases DNA binding proteins with a specific preference for the heterochromatic sequences have been identified (for example, the D1 protein of Drosophila specifically binds AT-rich satellite DNA (Aulner et al. 2002)), in most cases such proteins have not been identified, suggesting some other mode of recognition. The outer regions of the S. pombe centromere are composed of two different types of repetitive sequences, the dg and dh repeats (Takahashi et al. 1992). Heterochromatic regions in plants and Drosophila are also enriched for repetitive sequences, which are often transposable elements and related sequences (Hoskins et al. 2002; Lippman et al. 2004). Small RNAs [referred to as repeat-associated small interfering RNAs (rasiRNAs)] have been identified that match these regions. The list includes the outer repeats of S. pombe centromeres (Cam et al. 2005; Reinhart and Bartel 2002), several different transposable elements present in the pericentric and fourth chromosome heterochromatic regions in flies (Aravin et al. 2003), and the repetitious elements in the heterochromatic knob region in Arabidopsis (Lippman et al. 2004). In addition, small RNAs that are homologous to mammalian satellite repeats have been identified (Fukagawa et al. 2004; Kanellopoulou et al. 2005). However, deletion of RNAi components that results in changes in the presence of transcripts corresponding to satellite repeats is not necessarily sufficient to disrupt heterochromatin formation (Murchison et al. 2005). The strongest evidence supporting a role for rasiRNAs in heterochromatin formation comes from S. pombe, as explained in more detail below.

In plants, expression of a dsRNA hairpin is sufficient to induce transcriptional gene silencing, including the associated changes in chromatin structure, at a reporter homologous to the dsRNA (Mette et al. 2000). Initial studies indicated that this was also true in S. pombe, where a dsRNA ura4 hairpin was reported to trigger heterochromatin formation at an ectopic ura4 reporter (Schramke and Allshire 2003; Schramke et al. 2005b), but this work has since been retracted (Allshire 2005; Schramke et al. 2005a). Therefore, while it is clear that dsRNA enters an RNAi-mediated pathway, what determines whether it enters the PTGS or the TGS pathway is not understood at this point. Of course, these differences could be organism specific. A few studies in other organisms implicate specific repeats as targets for TGS, for example, the 1360 element (aka hoppel) on the Drosophila fourth chromosome (Sun et al. 2004).

By far, the best studied model system for investigations of this type of heterochromatin formation is the fission yeast S. pombe. Therefore, to diagram a model for RNAi-directed heterochromatin formation, we will turn to S. pombe pericentric heterochromatin as an example. The S. pombe centromeres consist of a non-repetitive central core (cnt) that is flanked by inverted repeat regions composed of two different types of repeats, the innermost (imr) and outer (otr) repeats. The cnt and imr make up the central domain where centromere-specific histone variant CENP-A is found. The outer repeat region is composed of two different tandem repeat sequences, dg and dh, and it is this region that is representative of classical constitutive heterochromatin (reviewed by Allshire 2004). In cells mutant for components of the RNAi machinery, the dgdh regions are transcribed and long pre-siRNA transcripts from both strands (designated forward and reverse) accumulate. However, in wild-type cells, one does not detect forward transcripts and only very low levels of the reverse transcripts are detected (Volpe et al. 2002).

Two different complexes that play an important role in heterochromatin formation at S. pombe pericentric heterochromatin have been identified (Fig. 3). The first complex, RNA-induced initiation of transcriptional silencing (RITS), contains the Argonaute family member Ago1, Tas3, and the chromodomain-containing protein Chp1, as well as siRNAs homologous to the dg–dh repeats (Verdel et al. 2004). The second complex, RNA-directed RNA polymerase complex (RDRC), is also composed of three proteins: Rdp1, the S. pombe RdRp (RNA-dependent RNA polymerase); Hrr1, an RNA helicase; and Cid12, a member of the polyA polymerase family (Motamedi et al. 2004). The two complexes interact with each other and localize to pericentric heterochromatin in a manner dependent upon each other, Dcr1 (having Dicer activity), and Clr4 (the H3K9 methyltransferase) (Motamedi et al. 2004; Noma et al. 2004; Sadaie et al. 2004; Sugiyama et al. 2005). Furthermore, RdRp activity is required for proper heterochromatin formation at centromeres (Sugiyama et al. 2005). It is thought that the RdRp might generate long dsRNAs from a ssRNA template, presumably the “reverse” transcript mentioned above. The resulting long pre-siRNAs are presumably diced into siRNAs by Dicer, which is required for their incorporation into RITS (Verdel et al. 2004). Deletion of components of RITS or RDRC causes the loss of H3K9me and Swi6 binding at pericentric heterochromatin (Volpe et al. 2002). The RITS and RDRC complexes as well as Clr4 are all required for the binding of the S. pombe HP1 homologue Swi6 to pericentric heterochromatin. The mutual interactions between H3K9me, Swi6, and possibly Clr4 would allow for the subsequent spreading of Swi6 and Clr4 across the region to create heterochromatin (Nakayama et al. 2001; Noma et al. 2001). Clr4 is also associated with additional factors required for proper heterochromatin formation, including Rik1, Cul4 (aka Pcu4), and the Rik1 associated factors, Raf1 and Raf2 (aka Dos1 and Dos2) (Horn et al. 2005; Jia et al. 2005; Li et al. 2005). It is interesting to note that Cul4 is a member of the cullin-dependent family of E3 ubiquitin ligases involved in protein ubiquitination, suggesting a possible role for ubiquitination in heterochromatin formation in S. pombe (Horn et al. 2005; Jia et al. 2005). As it stands now, the model outlined above and diagrammed in Fig. 3 addresses how heterochromatin formation is maintained in this system, but the mechanism leading to initiation of heterochromatin formation is still unclear.

Fig. 3

Schematic representation of the proposed self-reinforcing loop model of heterochromatin formation in fission yeast at pericentric repeats. RNA pol II is shown transcribing the “reverse” strand of the dg–dh repeats (gray box). Rdp1, a component of the RNA-directed RNA polymerase complex, could generate dsRNA from that ssRNA. This dsRNA is acted on by Dicer to generate siRNAs, which are incorporated into RITS. The RITS complex can interact with chromatin through the chromodomain of Chp1, which binds to H3K9me (blue triangles). RITS and RDRC interact with each other and localize to the repeats in a manner dependent upon the H3K9 methyltransferase Clr4. The dark gray arrows indicate the cyclical nature of this self-reinforcing loop. Clr4 associates with Cul4 (aka Pcu4), Rik1, and Raf1 and Raf2 (the latter two also known as Dos1 and Dos2). Additionally, Clr4 may interact with the S. pombe HP1 homologue, Swi6, facilitating the spreading of H3K9me, thereby creating additional binding sites for Chp1.

While there are similarities in heterochromatin formation across organisms, components that are key to heterochromatin in some organisms are missing in others (Table 2). Conserved aspects for eukaryotes other than S. cerevisiae include the presence of repetitive DNA and H3K9 methylation in heterochromatic regions. Factors critical for some systems but not for others include DNA methylation, which occurs at heterochromatic regions in plants and mammals; HP1, which is widely conserved as a heterochromatin protein but not found in plants (Libault et al. 2005; Nakahigashi et al. 2005); and the use of small RNAs and the RNAi machinery, which are evident in most systems, but do not play a role in heterochromatin formation in Neurospora (Freitag et al. 2004). In addition, two RNA polymerases involved in heterochromatin formation exhibit species specificity. RdRp, which can use ssRNA to generate a dsRNA, the required trigger for the RNAi cascade, plays a role in heterochromatin formation in S. pombe and plants and is needed for transcriptional silencing in worms (Grishok et al. 2005). However, no RdRp homologue has been identified in Drosophila or in mammals. These organisms either generate dsRNA without an RdRp (although it is not clear how this is accomplished) or an RdRp exists, but it is not easily recognized based on sequence homology. However, in Drosophila RNAi-mediated PTGS, a specific mRNA isoform can be targeted by using dsRNA complementary to isoform-specific exons; because an RdRp would be expected to extend silencing to exons shared among isoforms, this specificity argues against a functional RdRp (Celotto and Graveley 2002; Roignant et al. 2003). By contrast, C. elegans has a functional RdRp and exhibits a transitive response upon injection of a dsRNA (Sijen et al. 2001). More recently, a plant-specific RNA polymerase that plays a role in heterochromatin formation, RNA pol IV, has been identified (Herr et al. 2005; Kanno et al. 2005; Onodera et al. 2005; Pontier et al. 2005). The implications of this finding and a possible role for pol IV are discussed in a subsequent section. As more details about the mechanism of RNAi-directed heterochromatin formation are elucidated, the role of genus-specific factors should become clearer.

The current model of how RNAi-mediated TGS is maintained in S. pombe leaves several unanswered questions. First, the generality of the model is uncertain, especially given the species specificity of certain components. Second, the requirement for the Clr4 histone methyltransferase (HMT) to localize RITS and RDRC components to the pericentric heterochromatin is surprising. If the RNAi machinery “directs” heterochromatin formation to repeat-rich regions, one would not expect it to be dependent upon the HMT. The self-reinforcing loop model, in which Clr4 is recruited by RITS/RDRC and methylation of H3K9 by Clr4 in turn helps recruit RITS and RDRC (Noma et al. 2004; Sugiyama et al. 2005), helps to explain this apparent contradiction. However, this model does not explain how the loop is initiated. An alternative mechanism for targeting the HMT to heterochromatic regions, one that does not depend on RITS or RDRC, is specified by protein–DNA interactions. In fact, this does occur at the mating-type locus in fission yeast (Jia et al. 2004; Kim et al. 2004; Yamada et al. 2005). In addition, the model places the generation of low levels of reverse transcripts at the beginning of the pathway. How these “long” pre-siRNA precursor transcripts were generated was unclear until recent studies in S. pombe demonstrated that RNA polymerase II (pol II) plays a role. Intriguingly, this does not seem to be the only role for pol II in RNAi-directed heterochromatin formation.
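The initiation problem raised above can be made concrete with a toy calculation. The sketch below is purely illustrative and is not drawn from any of the cited studies: it assumes that RITS/RDRC occupancy simply tracks the fraction of H3K9-methylated nucleosomes, and the function name and the `recruit` and `turnover` parameters are hypothetical. Under these assumptions, a loop with no pre-existing methylation never starts, whereas even a small seed (for example, one deposited by a protein–DNA targeting pathway) is amplified to a stable plateau.

```python
def simulate_loop(seed_methylation, steps=200, recruit=0.5, turnover=0.1):
    """Toy positive-feedback model of the Clr4 / RITS-RDRC loop.

    seed_methylation: initial fraction of H3K9me nucleosomes (0..1).
    recruit, turnover: hypothetical rate constants, not measured values.
    Returns the methylated fraction after `steps` iterations.
    """
    m = seed_methylation
    for _ in range(steps):
        # Assumption: RITS/RDRC occupancy is proportional to H3K9me,
        # because Chp1 binds the mark.
        rits = m
        # Clr4 recruited by RITS/RDRC methylates unmarked nucleosomes;
        # a constant fraction of marks is lost each step.
        m = m + recruit * rits * (1.0 - m) - turnover * m
        m = min(max(m, 0.0), 1.0)
    return m


if __name__ == "__main__":
    print("no seed:   ", simulate_loop(0.0))
    print("small seed:", round(simulate_loop(0.01), 3))
```

In this sketch the unseeded state is a fixed point, which is the formal counterpart of the statement that the loop model by itself does not explain initiation.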

A role for RNA polymerase II in heterochromatin formation

While it seems counterintuitive to involve RNA polymerases in a silencing mechanism, this indeed seems to be the case for RNAi-based silencing. Three different mutations in RNA polymerase II that affect centromeric heterochromatin assembly have been identified in S. pombe (Table 3). Mutations in the Rpb7 or Rpb2 subunits, as well as a partial deletion of the C-terminal domain (CTD) of Rpb1, relieve centromeric heterochromatin-associated gene silencing (Djupedal et al. 2005; Kato et al. 2005; Schramke et al. 2005a,b). Importantly, none of these pol II mutations affects expression of other known components of the RNAi-directed silencing pathway when assayed by expression microarrays, making it unlikely that the silencing defects are indirect. Additional evidence for a direct role of pol II in pericentromeric heterochromatin formation includes the ability of Rpb1 to coimmunoprecipitate with the RITS component Ago1. The Rpb1–Ago1 interaction requires Dcr1, suggesting either that Dcr1-generated siRNAs mediate the interaction or that Dcr1 itself does so (Schramke et al. 2005b). In addition, chromatin immunoprecipitation experiments have demonstrated that pol II, along with other RNAi components, localizes to the centromeric repeats (Cam et al. 2005; Kato et al. 2005). Taken together, these data implicate pol II as a critical player in the RNAi-dependent silencing pathway.

Table 3 Impact of mutations in the RNAi machinery and RNA pol II on accumulation of siRNA transcripts and heterochromatin formation in S. pombe

It is interesting that while all three pol II mutants alleviate the silencing associated with pericentric heterochromatin formation, each exhibits a different profile with respect to its impact on production of small RNAs and their precursor transcripts (Table 3). As expected, the loss of silencing corresponds to a loss of H3K9 methylation and Swi6 binding at the centromere. The rpb7 mutation results in the disappearance of the pre-siRNA transcripts, as well as the siRNAs derived from them. Fusion constructs containing the “reverse” and forward promoters, as well as a positive control, demonstrated that the rpb7 mutation specifically compromises initiation of transcription from the “reverse” promoter (Djupedal et al. 2005). This result implicates pol II as the polymerase in S. pombe responsible for generating the long pre-siRNA transcripts from the reverse strand, which are then used as templates by RdRp to generate dsRNA for processing by Dicer into siRNAs. However, pol II’s role does not end there. The rpb2 mutation causes an accumulation of these pre-siRNA transcripts in a manner similar to, although less severe than, that seen in a Δdcr1 strain (Kato et al. 2005). The effects of this mutation suggest that pol II might have a role in coupling transcription of the pre-siRNAs to their processing. Finally, the CTD mutation is reported to cause a defect in silencing without affecting either the pre-siRNAs or the siRNAs. This finding suggests a role for pol II in a step downstream of siRNA production (Schramke et al. 2005b). Rpb1 and Ago1 coimmunoprecipitate from yeast cell extracts in a dcr1-dependent manner. While this does not demonstrate a direct interaction between pol II and Ago1, it suggests that the CTD might play a role in recruiting Ago1 (and possibly the whole RITS complex) to the outer repeats of the centromere where the siRNAs are being produced (Schramke et al. 2005b). It seems likely that all of these interactions occur at or near the dgdh repeats, as pol II and several other components of the pathway have been shown to localize to this region in chromatin immunoprecipitation experiments (Cam et al. 2005; Kato et al. 2005; Motamedi et al. 2004; Noma et al. 2004; Sugiyama et al. 2005; Volpe et al. 2003).

Although these experiments have addressed the question of what generates the pre-siRNA transcripts in yeast, it is still not clear at what stage in heterochromatin formation these pre-siRNAs are generated. Is transcription from both forward and reverse strands used to initiate silencing (producing the first siRNAs), with the signal then maintained by the RNA-dependent RNA polymerase working from occasional reverse transcripts? Or is there a constant low level of transcription that continuously generates dsRNA substrates for Dicer? The latter would imply that pol II is able to traverse a silenced region of chromatin, which runs counter to what is generally understood about the inhibitory effect of heterochromatin formation on pol II transcription.

A new RNA polymerase with a role in heterochromatin formation in plants

As mentioned earlier, the RNAi machinery plays a major role in heterochromatin formation in plants. The RNAi machinery here directs DNA methylation, which is enriched in heterochromatin in plants. Heterochromatic regions that have been closely examined show high congruence between the presence of methylated DNA and of H3 methylated on K9 (Lippman et al. 2004); however, HP1 is not found in these regions. HP1-like homologues in plants appear to have a role in down-regulating euchromatic genes and are perhaps more analogous to HP1γ (Nakahigashi et al. 2005).

A new plant-specific RNA polymerase that functions in the production of small RNAs, thereby affecting heterochromatin formation in Arabidopsis, has recently been identified by several groups (Herr et al. 2005; Kanno et al. 2005; Onodera et al. 2005; Pontier et al. 2005). This polymerase has been designated RNA polymerase IV (pol IV). A homology search indicates that pol IV is present only in plants. There are two forms of pol IV in Arabidopsis, RNAPIVa and RNAPIVb, which differ in at least the largest subunit they contain, NRPD1 (Onodera et al. 2005; Pontier et al. 2005). Two genes were identified as potential variants of pol IV’s second largest subunit, NRPD2, but only one of these, NRPD2a, is expressed and apparently encodes the second largest subunit of both complexes. Genetic screens for mutations that cause a loss of heterochromatin-induced silencing identified several distinct alleles that map to either the NRPD1a, NRPD1b, or NRPD2a genes (Herr et al. 2005; Kanno et al. 2005; Pontier et al. 2005).

Mutations in the NRPD1a and NRPD1b genes appear to affect two different steps in the RNAi-directed DNA methylation pathway. Both classes can affect heterochromatic gene silencing and cause changes in DNA methylation patterns. All loci examined show a depletion of siRNAs when NRPD1a is mutated (Herr et al. 2005; Onodera et al. 2005; Pontier et al. 2005), while NRPD1b mutations affect siRNA production from only a subset of these loci (Pontier et al. 2005). Mutations in the NRPD2 subunit affect both classes, as expected given its presence in both pol IV complexes. The manner in which NRPD1b was identified suggests that it might act in a step downstream of siRNA production. Mutations in this gene were found in a screen in which the silencing siRNAs were generated at a locus distinct from the reporter, from an inverted repeat under the control of a constitutive promoter (Kanno et al. 2005). Although pol IV mutations cause a loss of small RNAs, it is not clear that the enzyme is actually responsible for transcribing RNA from a chromosomal template. An appealing model is that pol IV is uniquely adapted to transcribe silent heterochromatin, generating the long pre-siRNA transcript which could serve as a template for RNA-dependent RNA polymerase (Herr et al. 2005; Zamore and Haley 2005). However, there is no convincing evidence that pol IV generates primary transcripts from silent heterochromatin or elsewhere. At one silenced region, AtSN1, the long pre-siRNA transcripts are more abundant in an nrpd1a mutant than in the wild type (Herr et al. 2005). This is the opposite of what one would expect if pol IV were the polymerase responsible for generating this transcript. Furthermore, biochemical analysis has not demonstrated any classical DNA-dependent RNA polymerase activity associated with pol IV purified by following the NRPD2 subunit, which presumably should represent both pol IV complexes (Onodera et al. 2005). What role does pol IV play? Does it generate transcripts? If so, what serves as the template if it is not DNA? Possibilities include using an RNA:DNA or RNA:RNA template and generating siRNAs in conjunction with the RNA-dependent RNA polymerase. Alternatively, this polymerase may require a methylated DNA template or a chromatin template similar to that present at heterochromatic loci. Another possibility is that the role of pol IV is not to transcribe DNA but instead to open up the DNA (by traversing it), a step that may be necessary for interactions with the siRNAs.

Mysteries remaining: peculiar features of heterochromatin

Heterochromatin formation appears to be a domain-based phenomenon. A unique feature of heterochromatin packaging is its ability to spread across breakpoints created by chromosomal rearrangements and to encompass euchromatic genes inserted as transgenes into the heterochromatic domain. Work from several organisms points to a mechanism in which heterochromatin is initiated at specific sites on the chromosome and can then spread (in the absence of barriers) to adjacent regions. Support for this model comes from work in S. pombe (Hall et al. 2002; Noma et al. 2004; Partridge et al. 2000) as well as the studies of PEV in Drosophila discussed above. A model of targeting and spreading has also been invoked to explain the action of the dosage compensation complexes in C. elegans (Csankovszki et al. 2004) and in Drosophila (Oh et al. 2004), as well as other chromatin-based events regulating large domains.

How does this spreading occur in heterochromatin? The data at hand can be invoked to support alternative mechanisms. A popular proposal focuses on interacting protein complexes that can both recognize and propagate a covalent mark. This is the case for HP1, which both interacts with the H3K9me mark and binds SU(VAR)3-9, the enzyme that methylates histone H3 on K9 and so creates the mark. Thus, HP1 facilitates the propagation of the modification, leading to the packaging of the region into heterochromatin. One problem with this model is that the process is not actually this simple. Approximately 20 genes showing Su(var) activity when mutant have been characterized at the molecular level, and many more candidate loci have been identified (Wustmann et al. 1989); a mutation in any one of these genes can lead to the loss of silencing. The loss of silencing could be the result of either a failure in initiation and stable maintenance of the heterochromatic state or a failure in spreading. Spreading might also occur through a transcription-based mechanism. This might be similar to the spreading of PTGS, which occurs as the RdRp copies messages. With a few notable exceptions, transcription is considered a highly processive reaction; once started, it will continue until the polymerase is removed from the DNA. Sense strand transcripts are processed and terminated, but perhaps antisense strand transcripts lack the signals necessary for this. Continuous transcription may generate long transcripts that end up as siRNAs, thereby propagating the trigger. However, of the organisms using RNAi in TGS, only a subset has an RdRp to contribute to this mechanism. Another factor that may impact spreading is the use of different HMTs and HP1 variants. Possibly only the HP1a–SU(VAR)3-9 interaction is capable of supporting the spread of heterochromatin.
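The recognize-and-propagate proposal described above can be sketched as a simple stochastic linear spread. The following is a minimal illustration under stated assumptions, not a model taken from the cited literature: heterochromatin is assumed to advance one nucleosome at a time from an initiation site, each step succeeding independently with a hypothetical probability `p_step`, and a reporter gene is scored as silenced in a cell only if the spread reaches it. All names and parameter values are invented for illustration.

```python
import random


def spread_extent(p_step, max_len, rng):
    """Number of nucleosomes covered before spreading stalls (capped at max_len)."""
    extent = 0
    while extent < max_len and rng.random() < p_step:
        extent += 1
    return extent


def silenced_fraction(distance, p_step=0.99, cells=10000, seed=0):
    """Fraction of simulated cells in which spreading reaches the reporter.

    distance: reporter position, in nucleosomes from the initiation site.
    p_step:   hypothetical per-nucleosome probability that the mark propagates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    silenced = sum(
        spread_extent(p_step, distance, rng) >= distance for _ in range(cells)
    )
    return silenced / cells


if __name__ == "__main__":
    for d in (50, 200):
        print(d, silenced_fraction(d))
```

In this sketch each cell is either fully on or fully off for the reporter, and the silenced fraction falls roughly as `p_step ** distance`. A strictly linear model of this kind cannot silence a distal gene while a proximal gene stays active, so it does not capture the exceptions that motivated proposals of multiple initiation centers.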

A second peculiar feature of heterochromatin is the variegating state induced at the reporter locus. The reporter will be off in some cells where expression is normally observed but on in others. The difference in transcription states reflects a difference in chromatin packaging, but why and how two stable alternatives are set (rather than a graded difference) is unclear. The idea that spreading is a stochastic process has been invoked, but it is not clear whether this explanation is sufficient or correct. Is it possible that the variegating state reflects a competition between transcription resulting in expression and transcription resulting in silencing? The involvement of siRNAs in heterochromatin formation has revealed an additional pathway in which RNAs and RNA polymerases play a critical role. While the role of polymerases in euchromatic gene expression has been well examined, the recent studies in S. pombe with pol II and in plants with pol IV have expanded the role for polymerases to include heterochromatin formation. Much remains to be discovered.