1.1 Introduction

Repetitive DNA makes up a large portion of the genomes of higher eukaryotes. Satellite DNA, composed of tandem repeats that assemble into constitutive heterochromatin, was first described when mouse DNA was subjected to density gradient centrifugation and “satellite bands” of different densities formed above or below the bulk of the genome (Kit 1961; reviewed in Garrido-Ramos 2017). Fifty years ago the pioneering technique of in situ hybridization to mitotic chromosomes demonstrated that mouse satellite DNA was strikingly localized around the centromere (Pardue and Gall 1970). Although satellites propagate and may be mobile, their expansion and movement are passive, relying on processes such as replication slippage, unequal crossing-over, or gene conversion. This distinguishes satellites from transposable elements, which typically insert as monomers and encode genes necessary for mobilization. However, this dichotomy is not clean. Some satellite repeats may be derived from transposable elements (Dias et al. 2015; Meštrović et al. 2015). Both transposable elements and satellite repeats are enriched in heterochromatic regions and are subject to silencing by heterochromatin formation, and satellites are often grouped with transposable elements for the purpose of analysis and discussion.

In spite of their abundance, satellite repeats are typically thought of as having few cellular functions besides contributing to the formation of heterochromatin, centromeres, and telomeres. But satellite DNA and RNA participate in a number of diverse processes, including gene regulation, stress response, and nuclear organization in Drosophila melanogaster and many other organisms. The mutability of satellites makes them prominent actors in the evolution of genomes. In accord with this, satellite repeats are a potent and adaptable weapon in genomic conflicts between species, and between chromosomes within a species. This review will focus on the functions of satellite repeats in Drosophila with particular attention to the properties that make satellites a versatile and powerful force in nuclear organization, gene regulation and evolution.

1.2 Seeing the Dark Matter of the Genome

Much of the eukaryotic genome is composed of vast, uncharted blocks of heterochromatin surrounding centromeres and telomeres. These regions, made up of satellite repeats and transposable elements, resist cloning and have posed an insurmountable barrier to traditional methods of genome sequencing and assembly. But advances in long-read sequencing of unamplified DNA have allowed the most challenging regions of genomes to be assembled. Nanopore sequencing of high molecular weight DNA was used to complete human centromeres (Miga et al. 2020). PacBio sequencing has similarly enabled Drosophila centromeres to be assembled (Chang et al. 2019). These methods avoid bias in library preparation but have high error rates, making assembly of repetitive regions challenging. The performance of correction and assembly methods must consequently be validated before being used to reconstruct repetitive regions (Khost et al. 2017). At present, sequencing and assembly of major repetitive regions remain labor intensive and technically challenging. In contrast, the diversity and abundance of different types of satellite repeats in the genome can be determined by sequencing of unamplified genomic DNA (Lower et al. 2018). This approach revealed differences in satellite composition between strains of Drosophila melanogaster, supporting the idea that satellites are a rapidly evolving portion of the genome (Wei et al. 2014). Interestingly, the satellite composition of different chromosomes is often distinct. This is observed in humans, where the variants of the α-satellite arrays that make up centromeres are chromosome specific (Rudd et al. 2006). It is also the rule in flies, where distinctive combinations of satellite repeats make up the pericentric heterochromatin of different chromosomes (Lohe et al. 1993; Blattes et al. 2006; Jagannathan et al. 2017; Chang et al. 2019).
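As a rough illustration of how repeat abundance might be tallied directly from unassembled reads, the Python sketch below counts perfect tandem runs of short monomers and pools rotations and reverse complements of the same unit. It is a toy example and not the pipeline used in the studies cited above. The function names, the 12 bp unit limit, and the three-copy threshold are assumptions chosen for illustration, and real reads would require tolerance for sequencing errors.

```python
import re
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical_monomer(monomer):
    """Return the smallest rotation of a monomer or of its reverse complement.

    Pooling rotations and strands lets AATAT, ATATA and ATATT all be
    counted as the same repeat family.
    """
    reverse_complement = monomer.translate(COMPLEMENT)[::-1]
    rotations = []
    for seq in (monomer, reverse_complement):
        rotations.extend(seq[i:] + seq[:i] for i in range(len(seq)))
    return min(rotations)

def count_tandem_monomers(read, max_unit=12, min_copies=3):
    """Count copies of perfectly repeated units (hypothetical thresholds)."""
    # A lazily matched unit followed by at least (min_copies - 1) exact copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    counts = Counter()
    for match in pattern.finditer(read):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        counts[canonical_monomer(unit)] += copies
    return counts

if __name__ == "__main__":
    read = "GG" + "AATAT" * 5 + "CC"
    print(count_tandem_monomers(read))
```

Because the pattern accepts only exact copies and prefers the shortest unit, a monomer that contains its own short run (for example three or more identical bases) can be split into separate matches. A practical tool would need a more careful repeat-detection step.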

1.3 Biophysical Properties of Satellites

Many satellites have interesting biophysical properties that are often commented on. How these contribute to function remains unclear in most instances. With the exception of Dodeca, D. melanogaster satellites are notably AT rich (Table 1.1). Indeed, AT richness is common in satellite DNA and may contribute to a curving of the duplex that enhances nucleosome stability (Fitzgerald et al. 1994; reviewed in Palomeque and Lorite 2008). Also suggestive is the observation that monomers of longer and more complex satellites often approximate the length of mono-, di-, or tri-nucleosomes, suggesting the potential for nucleosome phasing (Henikoff et al. 2001). For example, α-satellite repeats of human centromeres (171 bp) and the 359 bp satellite family of D. melanogaster suggest mono- and di-nucleosomes, respectively (Table 1.1). Nucleosomes are phased over the centromeric satellites of multiple species (reviewed in Heslop-Harrison and Schwarzacher 2013). The human centromere protein CENP-B enforces phasing by binding a 17 bp sequence in α-satellite repeats (Ando et al. 2002). The Responder (Rsp) repeats of D. melanogaster, composed of two similar 120 bp units, also enforce phasing as demonstrated by an extended nucleosome periodicity of 240 bp (Doshi et al. 1991). The Rsp locus is known for its role as the target of meiotic drive, but the potential role of nucleosome phasing in this process is unknown. Taken together, these observations suggest that satellites display intrinsic features that are expected to contribute to nucleosome stability and influence the biophysical properties of chromatin.

Table 1.1 Major D. melanogaster satellite repeats

1.4 How Do Satellites Expand, Move and Change?

D. melanogaster is rich in satellites, having approximately twice the diversity of humans (Shatskikh et al. 2020). The most abundant of these have repeat units of 12 bp or less. As satellites generally lack coding potential, their movement is nonautonomous. In spite of this limitation, they have been extraordinarily successful. Expansion and mobilization of satellite DNA reflect the propensities of replication and repair systems. For example, short tandem repeats are intrinsically unstable as they expand and contract by replication slippage (Fig. 1.1a) (Tautz et al. 1986; Bzymek and Lovett 2001; reviewed in Richards and Sutherland 1994; Levinson and Gutman 1987). Satellites also expand and contract by unequal crossing over during replication or repair (Fig. 1.1b). As longer, more complex monomers pose less of a challenge to the replication machinery, unequal crossing over is presumed to be a major factor in the variation of long repeats (Cabot et al. 1993; Southern 1975). A relevant question is why the expansion of noncoding sequence is tolerated. Some organisms have a considerably lower accumulation of satellite DNA, suggesting differences in susceptibility to slippage and unequal crossing over or in tolerance of repetitive sequence. A comparison of related organisms with dramatic differences in genome size supports the idea that tolerance of additional genetic material is species specific (Petrov et al. 2000; Hartl 2000).
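The bookkeeping behind unequal crossing over can be made concrete with a small Python sketch. It treats each array as a list of repeat units and exchanges two arrays that have paired out of register. The function name and the parameters shift and breakpoint are hypothetical, introduced only for this illustration, and the model ignores sequence divergence between repeats.

```python
def unequal_crossover(array_a, array_b, shift, breakpoint):
    """Exchange two tandem arrays that are misaligned by `shift` repeat units.

    `breakpoint` is the number of units of array_a lying to the left of the
    exchange. Returns the two recombinant arrays.
    """
    # Position in array_b that is paired with the breakpoint in array_a.
    partner_index = breakpoint - shift
    if not (0 <= breakpoint <= len(array_a) and 0 <= partner_index <= len(array_b)):
        raise ValueError("breakpoint falls outside the paired region")
    first = array_a[:breakpoint] + array_b[partner_index:]
    second = array_b[:partner_index] + array_a[breakpoint:]
    return first, second

if __name__ == "__main__":
    a = ["a%d" % i for i in range(1, 7)]  # six repeats on one chromatid
    b = ["b%d" % i for i in range(1, 7)]  # six repeats on its partner
    expanded, contracted = unequal_crossover(a, b, shift=2, breakpoint=4)
    print(len(expanded), expanded)
    print(len(contracted), contracted)
```

With two six-unit arrays misaligned by two units, one product gains two repeats and the other loses two, so the total number of repeats is conserved while array lengths diverge.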

Fig. 1.1

The repetitive structure of satellite repeats facilitates movement and change. (a) Replication stalling at repeats allows mispairing and template slippage. This produces contraction (top) or expansion (bottom) of tandem repeats. (b) Unequal crossing over leads to the expansion and contraction of tandem arrays. Cycles of unequal crossing over homogenize repeats at the center of an array. (c) Extrachromosomal loops generated by recombination within a tandem array can insert at a new site. (d) Gene conversion occurs when a related sequence serves as a template for recombination or repair

The movement of satellites may also occur by the formation of extrachromosomal loops that occur by recombination within an array. rDNA and noncoding tandem repeats are recovered as extrachromosomal loops in Drosophila and mammalian cells (Kiyama et al. 1986, 1987; Pont et al. 1987; Cohen et al. 2003, 2006; reviewed in Cohen and Segal 2009). This suggests a simple mechanism for movement to new sites. Extrachromosomal loops could undergo recombination with similar sequences or insert at random (Fig. 1.1c). The risk of extrachromosomal loops to genome integrity is moderated by the assembly of satellite repeats into heterochromatin. In accord with this idea, the loss of heterochromatin factors elevates the level of extrachromosomal loops and increases genomic damage (Larson et al. 2012; Peng and Karpen 2007). The erosion of heterochromatic silencing that is observed in aging and cancer is presumed to lead to the increase in extrachromosomal loops and contribute to genome instability in these cells (Sinclair and Guarente 1997; Larson et al. 2012; Turner et al. 2017; deCarvalho et al. 2018; Kim et al. 2020).

Tandem arrays are also subject to gene conversion by recombination within a cluster or with similar sequences from elsewhere in the genome (Fig. 1.1d). This general process is of interest for its role in the evolutionary divergence of duplicated genes (Osada and Innan 2008). The extreme abundance of satellite repeats that could be used as templates favors this process but raises the potential for large, damaging genome rearrangements. The idea that protection of repetitive DNA from inappropriate recombination is one of the functions of heterochromatin is supported by the behavior of repair foci in heterochromatic regions (Caridi et al. 2018). These foci move out of the nuclear territory occupied by heterochromatin before completion of the repair, supporting the idea that recombination and repair of repetitive DNA are potentially dangerous and under tight control.

1.5 Evolution of Satellite Repeats Is Rapid and Driven

Closely related species often display striking variations in satellite repeat composition and abundance (Bosco et al. 2007; Jagannathan et al. 2017; Lohe and Brutlag 1987). A comparative study of the satellite composition of four closely related species, D. melanogaster, D. sechellia, D. simulans, and D. mauritiana used hybridization to mitotic chromosomes to compare satellite composition and localization (Jagannathan et al. 2017). Some classes of satellites undergo complete replacement in closely related species. For example, Prodsat (AATAACATAG, Table 1.1) makes up 2% of the D. melanogaster genome but is not detected in the other three species (Török et al. 2000; Jagannathan et al. 2017). One caveat of this approach is that sites with low copy numbers of repeats are below the detection limit on mitotic chromosomes. For example, several megabases of 359 bp satellites in pericentromeric heterochromatin on the D. melanogaster X are detected by this method, but hundreds of closely related satellites dispersed throughout X euchromatin are not.

Dispersed satellites in euchromatin have also been subject to rapid, widespread changes in sequence, position, and abundance (Sproul et al. 2020). Although striking, this wholesale replacement is the natural outcome of two mutagenic processes with vastly different speeds. Point mutations diversify sequence, and the accumulation of mutations will eventually destroy the identity between sequences derived from the same progenitor. In contrast, gene conversion occurs orders of magnitude more rapidly and acts to homogenize repeats within an array, and between arrays at different sites in the genome (Ohta and Dover 1984). The outcome of these competing mutational processes is the replacement and homogenization of satellites throughout the genome, termed molecular drive (Dover 1982). This is particularly dramatic when comparing closely related species, such as the Drosophilids (Jagannathan et al. 2017; Sproul et al. 2020; de Lima et al. 2020; Larracuente 2014).
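The competition between slow diversification and fast homogenization can be sketched with a toy simulation in Python. Each repeat in an array carries an integer label, a point mutation assigns a new label, and a gene conversion event overwrites one repeat with a copy of another. The function names and all rates are assumptions made for illustration and are not estimates from the cited studies.

```python
import random

def evolve_array(array, steps, mutation_rate, conversion_rate, rng):
    """Apply random point mutation and gene conversion events to a repeat array.

    A mutation creates a new variant label; a conversion copies the label of
    a randomly chosen donor repeat onto a randomly chosen recipient.
    """
    array = list(array)
    next_label = max(array) + 1
    for _ in range(steps):
        if rng.random() < mutation_rate:
            array[rng.randrange(len(array))] = next_label
            next_label += 1
        if rng.random() < conversion_rate:
            donor = rng.randrange(len(array))
            recipient = rng.randrange(len(array))
            array[recipient] = array[donor]
    return array

def distinct_variants(array):
    """Number of different repeat variants present in the array."""
    return len(set(array))

if __name__ == "__main__":
    rng = random.Random(1)
    diverged = list(range(10))  # ten repeats, all different
    homogenized = evolve_array(diverged, 5000, 0.0, 1.0, rng)
    print(distinct_variants(diverged), "->", distinct_variants(homogenized))
```

When conversion acts without mutation, the array drifts toward a single variant. Raising mutation_rate relative to conversion_rate in this sketch allows more variants to persist, which is the balance the paragraph above describes.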

1.6 Satellites, Silencing, and Organization of Chromatin

One of the most prominent features of satellite repeats is their role in heterochromatin formation. Large arrays of tandem repeats trigger silencing through heterochromatin formation that is largely sequence independent (Henikoff 1998). This has bedeviled mouse genetic studies because random transgene insertions produce tandem arrays subject to silencing. Using Cre/LoxP to excise extra copies from a mouse transgene array, Garrick et al. (1998) demonstrated that chromatin compaction and transgene silencing were not intrinsic features of the insertion site or transgene sequence, but were instead induced by multicopy arrays. Silencing of tandem transgenes is also observed in flies and plants, indicative of a common strategy for inactivating repetitive DNA (Dorer and Henikoff 1994). As most repetitive sequences are potential threats to genome integrity, the recognition and silencing of repeats represent a triumph of genome defense.

RNA derived from transposable elements and satellites directs the chromatin modifications that initiate heterochromatin formation in fission yeast, and this serves as a useful model for the process (reviewed by Grewal and Elgin 2007). Transcription through repeats generates RNAs that are processed into siRNAs and loaded onto Argonaute effector complexes (Höck and Meister 2008). Nascent RNA from cognate regions of the genome is bound by these complexes, which recruit a histone methyltransferase that places the H3K9me mark. The heterochromatin protein Swi6, and a number of small RNA processing factors, are recruited by H3K9me to ensure maintenance of silencing (Zhang et al. 2008). While Drosophila heterochromatin is heterogeneous, evidence suggests that small RNA pathways also contribute to chromatin regulation in flies (Swenson et al. 2016; Cernilogar et al. 2011). Mislocalization of heterochromatin proteins and breakdown of silencing have been observed when Argonaute effectors or the genes necessary to produce small RNAs are inactivated (Fagegaltier et al. 2009). A genetically distinct silencing system in the germ line controls transposable elements by message destruction and transcriptional silencing (Khurana et al. 2010). This is directed by Piwi-interacting RNAs (piRNAs), generated from transposon sequences archived in piRNA clusters and expressed in the germ line (Brennecke et al. 2007). The resulting piRNAs enable Piwi, a germ line-specific Argonaute protein, to identify and bind nascent transcripts from mobile elements. Piwi recruits an H3K9 methyltransferase through an adapter protein to establish silencing (Sienski et al. 2015). Components of the Piwi system are also involved in chromatin compaction and silencing at later developmental stages. Maternal depletion of Piwi impairs heterochromatic silencing in the adult, a long-lasting effect that is observed as a reduction of Position Effect Variegation (PEV) (Gu and Elgin 2013).
PEV occurs when transgenes in repressive environments are silenced in some cells (Elgin and Reuter 2013). The majority of piRNAs have identity to transposons, consistent with their vital role in the repression of mobile elements (Brennecke et al. 2007). But satellite piRNAs are also present and may direct heterochromatin compaction of some repeats in the early embryo. Maternally deposited cues, possibly small RNAs, direct the formation of zygotic heterochromatin over a cluster of 359 bp satellites on the X chromosome (Ferree and Barbash 2009; Yuan and O’Farrell 2016). The 359 bp satellites are notable for their role in hybrid incompatibility between closely related species, discussed in a following section.

Heterochromatin itself displays remarkable biophysical properties. Visualization of D. melanogaster heterochromatin reveals a subnuclear compartment that is distinct from euchromatin and which may consolidate the major heterochromatic regions of all chromosomes (see Caridi et al. 2018). The discovery that fly and human HP1 phase separate in vitro, and that heterochromatin itself displays the properties of phase separation in cells, suggested a biophysical explanation for how segregation is achieved (Strom et al. 2017; Larson et al. 2017). Phase separation of subcellular bodies occurs by the self-association of disordered proteins (reviewed in Hall et al. 2019). Separation is favored by multivalent interactions, protein crowding, and assembly with a polymer, such as RNA or chromatin. HP1 has disordered domains, interacts with a large number of proteins and also binds RNA (Alekseyenko et al. 2014; Muchardt et al. 2002; Roach et al. 2020). A functional role for RNA in HP1 localization is suggested by the finding that HP1a is released from mouse nuclei by RNase and the association of fly HP1a with chromatin is also RNA dependent (Maison et al. 2002; Piacentini et al. 2003).

Many chromatin proteins in addition to HP1 have RNA-binding domains or interact with RNA-binding proteins, in a manner that suggests a structural role for RNA in chromatin organization. One of these, Decondensation factor 31 (Df31), a small, hydrophobic, and highly disordered RNA binding protein, also boasts a large protein–protein interaction network. The general distribution of Df31 in the nucleus suggests a role in maintaining chromosome territories (Rohrbaugh et al. 2013). In cultured Drosophila cells association of Df31 with RNA is necessary for accessible chromatin (Schubert et al. 2012). In vitro assays found that Df31 association with chromatin was RNA dependent and RNase treatment collapsed chromatin into a nuclease-resistant state. While Df31 shows hallmarks of a protein involved in phase separation, it is enriched in euchromatic regions.

Scaffold Attachment Factor A (SAF-A, HNRNPU in humans) and SAF-B have DNA binding domains that recognize AT-rich matrix or scaffold attachment sites (Fackelmayer et al. 1994; Göhring and Fackelmayer 1997; Nozawa et al. 2017; Fan et al. 2018). These similar proteins also have RNA binding domains and large disordered regions. Loss of these proteins disrupts chromatin structure and DNA accessibility, as does RNase digestion (Nickerson et al. 1989; Nozawa et al. 2017; Fan et al. 2018). Mouse SAF-B binds a variety of long noncoding RNAs, but transcripts from pericentric satellite repeats are its predominant partners (Huo et al. 2020). Depletion of mouse SAF-B allowed heterochromatin bodies in the nucleus to expand and make interchromosomal contacts. Imaging reveals that SAF-B coats the exterior of H3K9me3-rich heterochromatin bodies, suggesting a SAF-B shell that prevents inappropriate mingling of phase-separated heterochromatin domains from different chromosomes (Huo et al. 2020). Drosophila SAF-B binds chromatin and is also visualized as an extrachromosomal network (Alfonso-Parra and Maggert 2010). The association of fly SAF-B with chromatin responds to transcription and is differentially affected by mutation of its DNA binding domain and RNase treatment, but whether or not fly SAF-B interacts with specific RNAs is unknown.

Responses to heat shock and stress suggest that satellite RNA is situated in an interconnected web of RNA-binding proteins that organize chromatin and coordinate mRNA processing. When mammalian cells are subjected to stress, transcription of Sat III RNA is dramatically upregulated (Rizzi et al. 2004). SAF-B, and many RNA-binding factors involved in message processing, are recruited to nuclear stress bodies that form at sites of Sat III transcription (Valgardsdottir et al. 2008). Knockdown of Sat III RNA partially reversed the transcriptional repression induced by heat shock, suggesting a mechanism for rapidly restructuring chromatin and RNA processing pathways during stress (Goenka et al. 2016). In flies the Heat shock RNA omega (Hsrω) RNA serves a similar function. This noncoding transcript orchestrates stress response by sequestering splicing and RNA processing factors (reviewed by Jolly and Lakhotia 2006). D. melanogaster Hsrω includes 20 kb of AT-rich, 280 bp tandem repeats, thus conforming to the pattern of AT-rich satellites with a repeat length corresponding to multiples of nucleosome length.

1.7 Tandem Repeats in Euchromatin Modulate Nearby Genes

The role of satellite repeats in nucleating heterochromatin formation is well known, but tandem repeats of all types, including satellites, play interesting and surprising roles in gene regulation in euchromatin. A portion of satellite DNA is distributed throughout the euchromatic genome in tandem arrays. Changes in the number of repeats have created a wealth of genetic variation that has been exploited in forensic analysis, population genetics, and conservation. Although often considered neutral, microsatellites are highly represented in the promoters of human genes (Sawaya et al. 2013; Tomilin 2008). Dinucleotide repeats are enriched in fly enhancers, where they contribute to normal expression levels (Yanez-Cuna et al. 2014). These authors concluded that the association of short tandem repeats with regulatory regions is broadly conserved. Roughly 25% of the promoters in baker’s yeast, Saccharomyces cerevisiae, also contain tandem repeats (Vinces et al. 2009). These increase gene expression as the length of the repeats expands. Tandem repeats also mediate repression. In the beetle Tribolium castaneum a major satellite DNA family near euchromatic genes maintains repression after heat stress (Feliciello et al. 2015). The mutability of short tandem repeats suggests a potential source of phenotypic variation. In accord with this idea, variation in repetitive DNA has been linked to expression differences in plants and insects (Ranathunge et al. 2018; Brajković et al. 2012). Repeat length variations in developmental genes, coupled with selection by breeders, are responsible for rapid phenotypic evolution in dogs (Fondon and Garner 2004). Social behavior in voles is influenced by satellite polymorphisms in a vasopressin receptor, and length variants of repeats in the period (per) gene of D. melanogaster determine the male courtship song rhythm (Yu et al. 1987; Hammock and Young 2005).
In addition to providing a source of genetic diversity, satellite repeats have been recruited to wage genomic conflicts and enable sex chromosome dosage compensation, described in the following sections. Their usefulness in these contexts owes to the properties described above: mobility, rapid evolution, and multifaceted roles in the structure and regulation of chromatin.

1.8 Chromosome Identification During Dosage Compensation

Organisms with highly differentiated sex chromosomes, such as humans and Drosophila, must address the problem of sex chromosome gene dosage. Males are functionally hemizygous for X-linked genes. Flies meet this challenge by increasing expression from virtually every gene on the single male X chromosome to match that of the two female X chromosomes. The Male-Specific Lethal (MSL) complex, composed of five proteins and one of two redundant RNAs, is essential for this process (reviewed in Kuroda et al. 2016). The MSL complex is selectively recruited to actively expressed X-linked genes (Alekseyenko et al. 2006; Bell et al. 2008; Sural et al. 2008). One of the MSL proteins, Males absent on the first (Mof), is a histone acetyltransferase that deposits the H4K16ac mark within the gene body (Kind et al. 2008; Copur et al. 2018). Histone acetylation increases the likelihood that initiated transcripts will be completed, raising the level of transcripts approximately twofold (Larschan et al. 2011). A long noncoding RNA, roX1 or roX2, must be part of the complex for proper X localization (Meller and Rattner 2002). Severe roX1 roX2 mutants are male lethal, the expression of X-linked genes is reduced and MSL proteins localize to ectopic autosomal sites (Deng and Meller 2006). How the MSL complex identifies the X chromosome with the required selectivity is still unknown. Studies in a number of laboratories characterized Chromatin Entry Sites (CES) on the X chromosome that bind an adapter protein and recruit the MSL complex directly (Alekseyenko et al. 2008; Straub et al. 2008; Soruco et al. 2013). However, the adapter protein binds related sites on all chromosome arms but only recruits the MSL complex in the context of X-linked sites. This suggests the presence of additional X identity elements.

The striking enrichment of a clade of 359 bp repeats, termed the 1.688X repeats (Table 1.1), in X euchromatin pointed to a potential role in an X chromosome-specific process such as dosage compensation (Hsieh and Brutlag 1979; Waring and Pollack 1987; Dibartolomeis et al. 1992). The 1.688X repeats are enriched near genes, including promoters and introns, leading to the suggestion that they could modulate expression (Kuhn et al. 2012). Autosomal insertions of short clusters of these repeats induced recruitment of the MSL complex and partial compensation of genes as much as 140 kb away (Joshi and Meller 2017; Deshpande and Meller 2018). A clue to how these repeats function came from the discovery that mutations in the siRNA pathway enhanced the lethality of males with partial loss of function roX1 and roX2 chromosomes (Menon and Meller 2012). Furthermore, ectopic expression of siRNA from one 1.688X repeat partially restored MSL localization and rescued roX1 roX2 males (Menon et al. 2014). Taken together, these studies reveal that the 1.688X satellite repeats are X identity elements and suggest that the siRNA pathway mediates their function. Interestingly, other Drosophilid X chromosomes are highly enriched for chromosome-specific repeats, although the sequence of these repeats is not highly conserved (Gallach 2014). Particularly striking is the rapid acquisition of repeats by neo-X chromosomes that arise by the fusion of an autosome to a sex chromosome. These fusions also produce a neo-Y, fated to degenerate as it passes exclusively through males without recombination (reviewed by Wei and Barbash 2015). Degeneration of the neo-Y necessitates compensation of genes on the neo-X. The relative mobility of satellite repeats makes them well suited for marking a young X chromosome to enable it to capture the dosage compensation machinery.

The manner in which the 1.688X satellites identify the X chromosome remains unknown, but there are clues that the mechanism is very different from that of the CES. The 1.688X satellites on the X chromosome are not generally enriched for the MSL proteins, suggesting that they do not recruit directly but mark the X chromosome in some fashion (Deshpande and Meller 2018). Ectopic expression of 1.688X siRNA increased the repressive H3K9me2 mark on autosomal 1.688X satellite insertions (Deshpande and Meller 2018). Contrary to conventional expectations for a repressive mark, this increased the expression of nearby genes in males. The finding that HP1 is modestly enriched on the male X chromosome, and that mutations in several heterochromatin factors selectively disrupt the structure of the polytenized male X, support the idea that repressive marks are in some way linked to fly dosage compensation (Spierer et al. 2005, 2008; De Wit et al. 2005). How satellite repeats and repressive marks might accomplish this is speculative, but one possibility is through influencing chromatin organization in the nucleus. Nuclear organization is a factor in X chromosome compensation in other organisms. For example, the single male X chromosome of C. elegans is located at the periphery of the nucleus, but the two female X chromosomes are centrally located (Sharma et al. 2014). Interaction with nuclear pore proteins may elevate the expression of X-linked genes in this species. A role for nuclear pore proteins in MSL loading and activation has also been proposed in flies (Mendjan et al. 2006). More generally, the ability of X-linked genes to acquire dosage compensation during development is attributed to the three-dimensional organization of X chromosomes in the nucleus of mammals and flies (Engreitz et al. 2013; Schauer et al. 2017; Ramírez et al. 2015). Influencing the location or organization of the X chromosome is one way that 1.688X satellites might promote X recognition.

1.9 Satellites and Centromeres

The most widely appreciated function of satellite DNA is at centromeres. Human centromeres contain α-satellite arrays harboring a motif that interacts with the centromeric H3 variant CENP-A, suggesting determination by sequence (Masumoto et al. 1989; reviewed by McNulty et al. 2017; Willard 1985; Schueler et al. 2001). However, human centromeres occupy only part of the array and satellites are absent from some neo-centromeres, challenging the idea that sequence is the primary centromere determinant. Fly centromeres also form within extensive arrays of satellite repeats, but the centromere itself assembles at “islands” of transposons embedded in this sea of satellites (Chang et al. 2019). Fly centromeres are defined by the incorporation of an H3 variant called Cid. The importance of epigenetic information in specifying centromeres in flies is demonstrated by the fact that the transposons at the fly centromere are by no means limited to the centromere, appearing in both heterochromatic and euchromatic contexts throughout the genome (Chang et al. 2019). It is also reflected in the persistence of a functional centromere following transient anchoring of the centromere-specific chaperone CAL-1, which is capable of loading Cid at ectopic sites (Chen et al. 2014). These findings indicate that both fly and human centromeres are specified by a combination of DNA sequence, genomic context, and epigenetic marking. Centromeres are surrounded by heterochromatin that contributes to their function. A large deletion of heterochromatin flanking a fly centromere produced mitotic instability and premature sister chromatid separation (Wines and Henikoff 1992). This is consistent with the enrichment of cohesin in heterochromatin and suggests that the mitotic machinery is tuned to a certain arrangement of heterochromatin surrounding the centromere (Bernard and Allshire 2002).

RNA from satellites has also been found to localize to centromeres. Transcripts from a large block of pericentric 359 bp satellites on the D. melanogaster X chromosome bind in cis to the centromeric region (Bobkov et al. 2018). RNA from the mammalian α-satellite also binds to centromeric proteins and localizes in cis at centromeres (Wong et al. 2007; reviewed in Ideue and Tani 2020; McNulty et al. 2017). This RNA is also necessary for characteristic localization of centromeric proteins CENPC1 and INCENP, and so may function to recruit or stabilize components of the centromere.

1.10 Satellites Are the Ammunition of Genomic Conflicts

Chromosomes are fundamental units of inheritance and take an active role in biasing their own transmission to the next generation. Meiotic drive, when a genetic element manipulates reproduction to favor its own transmission and overthrow Mendel’s rules, is the outcome. A selfish chromosome able to accomplish this will increase in the population. Evolutionary theory posits that systems of meiotic drive emerge frequently and sweep through the population, driving enrichment of one chromosome and limiting genetic variation. In addition, unfavorable, genetically linked alleles are allowed to proliferate (Courret et al. 2019). This extracts a cost in fitness that enables suppressors of drive to emerge and restore Mendelian segregation. A history of recurring cycles of drive and suppression is revealed when wild-caught flies are outcrossed and suppressed drivers emerge (Hartl and Hartung 1975). These conflicts are often mediated through satellite repeats and the outcome shapes the genome.
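The expectation that a driver will increase in a population can be illustrated with a textbook-style recursion, sketched here in Python. The sketch assumes an infinite, randomly mating population in which heterozygotes transmit the driver with probability k. It deliberately ignores the fitness costs, sex limitation, and suppressors discussed in this section, and the function names are hypothetical. With these assumptions the driver frequency p becomes p² + 2p(1 − p)k in the next generation, so k = 0.5 recovers Mendelian segregation.

```python
def driver_frequency_next(p, k):
    """Driver frequency after one generation of random mating.

    Heterozygotes transmit the driver with probability k; k = 0.5 is
    Mendelian segregation. No fitness cost is modeled (an assumption).
    """
    q = 1.0 - p
    return p * p + 2.0 * p * q * k

def driver_trajectory(p0, k, generations):
    """Driver frequencies from generation 0 through `generations`."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(driver_frequency_next(freqs[-1], k))
    return freqs

if __name__ == "__main__":
    for k in (0.5, 0.6, 0.9):
        final = driver_trajectory(0.01, k, 50)[-1]
        print("k = %.1f, frequency after 50 generations: %.3f" % (k, final))
```

The change per generation in this model is p(1 − p)(2k − 1), which is zero under Mendelian transmission and positive for any k above 0.5. This is why even a modest transmission advantage is expected to sweep unless a cost or a suppressor intervenes.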

Sex differences in meiosis ensure that a strategy for biasing chromosome transmission can only function in one sex. While all products of male meiosis have the potential to develop into sperm, only one of the four products of female meiosis will become the egg. To gain an advantage in female meiosis the critical point is the alignment of homologs on the spindle at the first division (Fig. 1.2a). A centromere that attaches to the egg pole will escape elimination in the polar body (Rosin and Mellone 2017; Kursel and Malik 2018). To take advantage of this requires asymmetry in the meiotic spindle and a centromere able to exploit the asymmetry. The extraordinary reproductive advantage that a stronger centromere holds is thought to fuel an evolutionary race that drives rapid changes in centromeric DNA and proteins (Malik 2009). Predictions of this model are fulfilled in mice where an expansion of satellite repeats has produced a large centromere with an advantage over a homolog with fewer satellites (Iwata-Otsubo et al. 2017). Interestingly, the kinetochores of the larger centromere detach more frequently from cortical spindle fibers, providing an opportunity to reattach to the egg pole (Akera et al. 2019). Suppressors of centromere drive would benefit a population in which a chromosome had begun to cheat. The observation that the centromeric variants Cid and CENP-A are remarkably fast-evolving and divergent from other histones suggests their involvement (Black et al. 2004). Amino acid changes in Cid are concentrated in a region that interacts with H4 and an extended loop that contacts DNA (Vermaak et al. 2002). The rapid evolution of centromeres and the proteins that bind them seems at odds with the very conservative function of centromeres but is in accord with the idea that these structures are the site of an evolutionary battle (Malik 2009).

Fig. 1.2 Meiotic drive in female and male germlines. (a) Stronger centromeres (larger dot) gain an advantage in the female germline by avoiding the cortical spindle and becoming an egg nucleus more than 50% of the time. (b) Chromosomes that achieve drive in males do so by sabotaging their homolog. This may be direct, as implied in the cartoon, or indirect by the establishment of an environment that is toxic to cells inheriting the susceptible homolog. Left) The Paris sex ratio X chromosome produces a factor that blocks segregation of the Y at the second meiotic division. Failure to form Y-bearing sperm ensures predominantly female broods. Right) The Sd and Winters drivers sabotage the maturation of sperm carrying susceptible homologs. In both systems, arrest occurs before chromatin compaction and these malformed cells are eliminated (center). Sperm carrying the driver (right) develop normally.

All products of male meiosis have the opportunity to become sperm. To gain an advantage in the male germ line a chromosome must exert a negative effect on its homolog (Fig. 1.2b). Several examples of this are well known, including Segregation distorter/Responder (Sd/Rsp) in D. melanogaster. Rsp, the target of drive, is an array of two very similar 120 bp repeats present in dozens to thousands of copies (Khost et al. 2017). Larger arrays confer increased sensitivity to Sd (Larracuente and Presgraves 2012; Moschetti et al. 1996). Sd is a truncated but enzymatically active duplication of RanGAP that mislocalizes to the interior of the nucleus (Kusano et al. 2001). When a male has Sd on one homolog and a sensitive Rsp allele (RspS) on the other, maturation of sperm carrying RspS is arrested and these cells are eliminated. The sperm carrying Sd develop normally and are responsible for most or all fertilizations. The Sd phenotype is enhanced by a number of modifiers, all genetically linked to Sd on the second chromosome (reviewed by Larracuente and Presgraves 2012). Of course, Sd chromosomes must themselves carry insensitive Rsp arrays in order to escape elimination. Although the precise molecular defect that causes arrest is unclear, abnormal localization of mutant RanGAP is thought to disrupt the RanGTP/GDP gradient across the nuclear envelope and this may interfere with transport in and out of the nucleus (Kusano et al. 2003). In this environment, the expanded repeats on the RspS chromosome precipitate failure of sperm maturation. The process that is affected must be unique to sperm development, as Sd/RspS females produce normal offspring ratios. During sperm maturation chromatin is remodeled by the replacement of histones, a process requiring the import of protamine. This step is male-limited and appropriate to the stage of arrest, but it is unclear how expanded RspS arrays and defects in protamine levels would induce arrest. Disruption of small RNA import leading to a defect in repackaging Rsp chromatin is also possible. Small RNAs from Rsp have been identified in the germ line, and mutations in aubergine (aub), which encodes an Argonaute family protein that participates in piRNA production, enhance distortion by Sd (Gell and Reenan 2013; Nagao et al. 2010). This finding suggests that an additional role of germ line small RNA systems is to defend against meiotic drive.

Rsp provides an excellent example of satellite turnover. Two families of repeats are involved: the Rsp and Rsp-like family, and the 359 bp family, which includes both the extensive array of pericentromeric 359 bp repeats in X heterochromatin and the euchromatic 1.688X satellites. Both families are AT-rich, but the 359 bp repeats are older, more widespread, and more diversified in related species. Examination of satellite repeats in several Drosophila species revealed that 1.688X and Rsp-like satellites occupy many of the same euchromatic sites (Sproul et al. 2020). Sites in which Rsp-like repeats have been inserted into an existing 1.688X array, possibly in the process of replacing it, were identified in D. simulans and D. mauritiana. This suggests a model in which young Rsp-like repeats use homology with existing 1.688X repeats to enter these sites, a process that may be facilitated by long-range interactions in the nucleus. Extrachromosomal circular DNAs are another potential vehicle for movement, and a correlation between the abundance of one of the repetitive elements and extrachromosomal DNA is consistent with a role for these circles in Rsp-like invasion (Sproul et al. 2020).

D. simulans has at least three meiotic drive systems that bias sex chromosome inheritance and thus distort the sex ratio. All of these involve drivers on the X chromosome. In the Winters system, named for the location where the flies were collected, the X chromosome distorter prevents Y-bearing gametes from completing maturation. Failure occurs during condensation of the haploid nucleus, a timing that is similar to that observed in the D. melanogaster Sd/Rsp system (Tao et al. 2007b). The driver, Distorter on the X chromosome (Dox), is a partial duplication that produces an RNA with limited coding potential (Tao et al. 2007a). The mechanism of Dox action is unknown, but suppressors of Dox on the second chromosome generate siRNAs that reduce levels of the Dox transcript (Lin et al. 2018). As the Y chromosome is primarily composed of satellite repeats and transposons, it is likely that the toxic effect of Dox depends on the unique sequence and chromatin composition of this chromosome. A second D. simulans sex ratio distortion system, Paris, induces anaphase bridges and failure of Y chromosome disjunction during the second meiotic division (Fig. 1.2b; Cazemajor et al. 2000). One component of the X-linked driver was discovered to be a loss-of-function mutation in a rapidly evolving member of the HP1 family, HP1D2 (Helleu et al. 2016). Intriguingly, the HP1D2 protein is specifically enriched on the Y chromosome, suggesting that a defect in the organization or compaction of this chromosome prevents segregation. Sex ratio distortion leads to populations with unbalanced ratios of males and females and creates a strong selective advantage for an individual with a novel suppressor of drive. In accord with this, Y chromosomes that are resistant to the Paris or Winters driver have been discovered (Branco et al. 2013; Helleu et al. 2019). As the coding potential of the Y is limited, it is quite possible that these suppressors are changes in satellite or transposon content that make the Y chromosome insensitive to the X-linked driver.

1.11 Satellite Repeats Mediate Conflict Between Species

Hybrid incompatibilities enforce the reproductive isolation that defines species (Castillo and Barbash 2017). The rapid evolution of heterochromatin DNA and proteins is a potential source of incompatibilities that produce lethality or infertility upon hybridization of closely related Drosophilids (Presgraves 2010; Ferree and Barbash 2009; Gatti et al. 1976; Yunis and Yasmineh 1971; reviewed in Ferree and Prasad 2012). For example, when D. melanogaster males are mated to D. simulans females, male offspring emerge as sterile adults but females die as embryos. Early female lethality is attributable to the D. melanogaster X chromosome in D. simulans cytoplasm. Specifically, the large array of 359 bp repeats at the base of the D. melanogaster X chromosome fails to compact and X chromatids become entangled in anaphase bridges (Ferree and Barbash 2009). But when D. melanogaster males transmitted a Zygotic hybrid rescue (Zhr) chromosome that was deleted for the pericentric 359 bp repeats, female offspring survived (Sawamura et al. 1993). Small RNAs from the 359 bp satellite are present in oocytes from D. melanogaster females, and it is plausible that these direct heterochromatin formation over the 359 bp satellites in fertilized embryos (Ferree and Barbash 2009). The X chromosome of D. simulans lacks 359 bp repeats and the relevant class of siRNA is not present in D. simulans eggs. These ideas are supported by a study demonstrating that heterochromatin formation at the 359 bp satellites occurred with different timing than that of another large satellite array and required maternal factors missing from D. simulans ooplasm (Yuan and O’Farrell 2016).

The reciprocal mating, D. melanogaster females mated to D. simulans males, produces sterile female adults but no male offspring. The toxic interaction producing male lethality can be traced to two heterochromatin proteins, Lethal hybrid rescue (Lhr, D. simulans) and Hybrid male rescue (Hmr, D. melanogaster) (Maheshwari and Barbash 2012). Higher expression of Lhr from the D. simulans chromosome is the basis of hybrid lethality. Loss of D. simulans Lhr rescues hybrid lethality, but loss of D. melanogaster Lhr, which is expressed at only half the level of D. simulans Lhr, does not achieve rescue. Lhr encodes a rapidly evolving heterochromatin protein that interacts with HP1 (Brideau et al. 2006; Brideau and Barbash 2011; Thomae et al. 2013). Loss of Lhr results in bloated polytene chromosomes in D. simulans, a phenotype associated with a loss of chromosome structure (Pal Bhadra et al. 2006). The Hmr gene is also rapidly evolving and encodes a DNA binding protein that localizes to heterochromatin in a complex with Lhr and HP1a (Satyaki et al. 2014; Alekseyenko et al. 2014). Hybrid males, which die as larvae, display poorly condensed chromosomes and anaphase bridges between sister chromatids, consistent with the idea that a failure of heterochromatin assembly and compaction is the primary defect (Blum et al. 2017).

The conflicts between genetic elements within a species that produce meiotic drive, and between species that lead to hybrid incompatibility, rely on an overlapping cast of characters. In accord with this, it has been suggested that meiotic drive contributes to the genetic divergence that produces hybrid incompatibility (McDermott and Noor 2010). This notion is supported by the discovery that a single gene, Overdrive (Ovd), appears responsible for both meiotic drive and hybrid incompatibility in D. pseudoobscura (Phadnis and Orr 2009). Males from a mating between subspecies are sterile when young but become weakly fertile and produce almost exclusively daughters when aged. Although the molecular mechanisms at play are currently unknown, the finding that one gene is involved in both phenomena supports the idea that meiotic drive and hybrid incompatibility are produced by similar genetic conflicts.

Summary

Satellite repeats appeared both troublesome and singularly unpromising at the dawn of the genomics era. The typical concentration of satellites in vast, unclonable blocks of heterochromatin was an additional deterrent to their study. But the ability of satellites to move, expand, and undergo relatively rapid genome-wide replacement enables them to shape genomes and respond to evolutionary pressures. Satellite DNA, and small RNA pathways capable of directing modifications to chromatin, are a powerful combination that can be adapted to novel roles. This is illustrated by the dispersed, euchromatic 1.688X satellites that recruit dosage compensation, while very similar heterochromatic 359 bp satellites mediate hybrid incompatibility at a different life stage and in a different sex. In spite of the stark differences in the roles of these satellites, it is likely that small RNA normally directs chromatin modifications to both, and that these modifications are essential for normal function. When heterochromatin is compromised, satellites become unstable and devastating disruptions of nuclear organization result. This intrinsic risk is evident in the destruction unleashed by 359 bp satellites in a hybrid environment. The remarkable ability of heterochromatin to assemble satellite DNA into a nondestructive form can be credited with enabling satellite repeats to expand to their current position of prominence in higher eukaryotes. All of the properties described above, including the mutability of satellites and their intrinsic danger, put repetitive sequences at the leading edge of evolution.