Subtelomeres are an unusual part of primate genomes, enriched in genes, repetitive DNA, structural polymorphisms, and chromosome rearrangements. As with subtelomeres of other orders, such genomic variation in primates can lead to genetic diversity, the birth of new genes, and an explosion of gene families. However, rearrangements in human subtelomeres can also alter developmentally critical genes, causing intellectual disability and birth defects. Analysis of subtelomeric breakpoints has revealed “hot spots” of chromosome breakage that may be initiated by specific types of repetitive DNA abundant in subtelomeres. In most cases, subtelomeric breaks are repaired by non-homologous end-joining and DNA replication processes, rather than homologous recombination. Comparative genomic studies of orthologous subtelomeres in closely related primates show even greater diversity between species, consistent with the rapid evolution of chromosome ends.

8.1 Primate Subtelomere Organization

Primate subtelomeres are enriched in repetitive elements, including segmental duplications (SDs), satellite DNA, tandem repeats, and degenerate telomere repeats (Riethman et al. 2004; Linardopoulou et al. 2005) (Fig. 8.1). Though this repetitive structure may be important for subtelomere biology and evolution, it has made assembling these parts of the genome a challenge. Although the human genome assembly is more “complete” than other primate genomes, in the most recent build (GRCh37/hg19), the assemblies of only 17 of 46 chromosome ends traverse the subtelomeric sequences and reach the end of the chromosome, terminating in perfect telomeric repeats, (TTAGGG)n. Other primate genomes [chimpanzee (The Chimpanzee Sequencing and Analysis Consortium 2005), orangutan (Locke et al. 2011), and rhesus (Gibbs et al. 2007)] have been assembled using at least some comparisons to the human genome, so sequence gaps in the human reference genome as well as non-aligning regions between species are likely to remain as gaps in the assemblies of non-human primate genomes. In addition, subtelomeric sequences are highly polymorphic, and only a handful of subtelomeric alleles have been captured in the reference genome assembly (Trask et al. 1998; Linardopoulou et al. 2005). Thus, despite the success in assembling more and more primate genomes, the subtelomeric genome assemblies of human and non-human primates remain largely incomplete. Most comparative genomic studies of primate subtelomeres have therefore focused on particular chromosome ends. Here, we will describe human subtelomeric organization and discuss the limited non-human primate subtelomeric data for a subset of chromosome ends.

Fig. 8.1

Repeat content and gene density of human chromosome 9. Tandem repeats (green), percent GC (gray), segmental duplications (black), and genes (blue) are shown along the length of chromosome 9 as represented in the UCSC Genome Browser (http://genome.ucsc.edu/)

Human subtelomeres are made up of two major zones: a terminal region consisting of SDs and an adjacent region of chromosome-specific (non-duplicated) sequences (Fig. 8.2). SDs are operationally defined as DNA sequences 1 kb or larger that have another copy in the genome with ≥90 % identity. They make up more than 5 % of the human genome and are preferentially located at pericentromeres and subtelomeres (Bailey et al. 2002). In human subtelomeres, SDs occupy the terminal 5–300 kb of chromosomes. Each SD is shared among a subset of chromosome ends, and individual SDs range from 3 to 50 kb each (Linardopoulou et al. 2005). Copies of the same SD are 88–99.9 % identical and occupy between 2 and 18 different chromosome ends, consistent with recent duplications that have rapidly spread to multiple chromosomes. Fluorescence in situ hybridization (FISH) analyses have shown that subtelomeric SDs are highly polymorphic, varying in copy number and chromosomal location from person to person (Trask et al. 1998; Linardopoulou et al. 2005). Given the number of subtelomeric SDs in the genome and the degree of polymorphism, it is likely that each human has a unique repertoire of subtelomeric SD sequence.
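The operational SD definition given above (1 kb or larger, with another copy at ≥90 % identity) can be written as a simple filter. The following is only a minimal sketch: the PairwiseAlignment record and its field names are hypothetical, introduced here for illustration, and real SD detection pipelines apply additional steps (for example, repeat masking) that are not modeled.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 1_000   # "1 kb or larger", as stated in the text
MIN_IDENTITY = 0.90     # ">=90 % identity", as stated in the text


@dataclass(frozen=True)
class PairwiseAlignment:
    """Hypothetical record for an alignment between two genomic copies."""
    end_a: str               # e.g. "9q"
    end_b: str               # e.g. "22q"
    aligned_length_bp: int
    identity: float          # fraction of identical aligned bases, 0.0-1.0


def is_segmental_duplication(aln: PairwiseAlignment) -> bool:
    """Apply the two thresholds of the operational SD definition."""
    return aln.aligned_length_bp >= MIN_LENGTH_BP and aln.identity >= MIN_IDENTITY


def is_interchromosomal(aln: PairwiseAlignment) -> bool:
    """True when the two copies sit on different chromosome ends."""
    return aln.end_a != aln.end_b


if __name__ == "__main__":
    # Illustrative values only; these are not measurements from the chapter.
    candidates = [
        PairwiseAlignment("9q", "22q", 3_000, 0.95),
        PairwiseAlignment("9q", "22q", 800, 0.99),    # too short
        PairwiseAlignment("2q", "15q", 50_000, 0.85), # too diverged
    ]
    for aln in candidates:
        print(aln.end_a, aln.end_b, is_segmental_duplication(aln))
```

Both thresholds are inclusive, matching the "or larger" and "≥" wording of the definition.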

Fig. 8.2

Human subtelomere organization. The terminal 1 Mb of chromosome 9q has a distal SD zone and an adjacent chromosome-specific zone. Segmental duplications (orange, yellow, and gray), percent GC (gray), assembly gaps (black), tandem repeats (green), and genes (blue) are shown as in the UCSC Genome Browser (http://genome.ucsc.edu/). Interstitial telomere sequences (ITS, gray vertical lines) were identified by RepeatMasker (http://www.repeatmasker.org). Breakpoints of subtelomeric rearrangements that cause intellectual disability (black vertical lines) were fine-mapped by Luo et al. (2011)

Many of the genes in subtelomeric SDs are part of gene families, such as odorant and cytokine receptors, tubulins, and transcription factors (Linardopoulou et al. 2005). The redundancy of duplicated subtelomeric genes may allow some copies to acquire new functions and some copies to mutate, while other copies retain their original function. Frequent interchromosomal exchanges can also juxtapose parts of different subtelomeric genes, potentially creating novel hybrid genes. The olfactory receptors (ORs) are a striking example of a gene family that expanded in primate subtelomeres. There are over 900 OR genes in the human genome, a subset of which are found at subtelomeric locations (Glusman et al. 2001). Some subtelomeric ORs are no longer functional and have become pseudogenes, whereas other ORs are transcribed in certain tissues, such as olfactory epithelium and testis (Linardopoulou et al. 2001).

Just proximal to subtelomeric SDs begins a region of chromosome-specific DNA (Fig. 8.2). Some deletions and duplications of this region have been detected in phenotypically normal individuals, suggesting that, like in the SD zones, some variation in the chromosome-specific regions is tolerated (Ballif et al. 2000; Ravnan et al. 2006; Redon et al. 2006; Balikova et al. 2007; Mills et al. 2011). Nevertheless, larger rearrangements of the chromosome-specific subtelomeric regions are associated with intellectual disability and birth defects (Ravnan et al. 2006; Ballif et al. 2007; Martin et al. 2007; Shao et al. 2008). Such rearrangements were originally identified by chromosome banding and FISH (National Institutes of Health and Institute of Molecular Medicine Collaboration 1996; Knight et al. 2000) and are now detected via genomic microarrays (Rudd 2011). Studies of clinically relevant copy number variation (CNV) have shown that subtelomeric rearrangements are overrepresented among CNVs that cause intellectual disability. For example, microarray analysis of 15,749 developmentally disabled individuals revealed that 16.3 % of pathogenic chromosome anomalies lie within the terminal 5 Mb of chromosome ends (Kaminsky et al. 2011), which accounts for only 7 % of the human genome. These chromosome rearrangements include deletions, duplications, and unbalanced translocations that are typically hundreds of kb to several Mb in size and include tens to hundreds of genes.
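The overrepresentation described above can be checked with back-of-the-envelope arithmetic. The sketch below assumes 46 chromosome ends (the count used earlier in this chapter) and a genome size of about 3.1 Gb; the genome size is an assumption made here for illustration and is not taken from Kaminsky et al. (2011).

```python
CHROMOSOME_ENDS = 46                 # count of chromosome ends used in this chapter
TERMINAL_WINDOW_BP = 5_000_000       # "terminal 5 Mb" from the text
GENOME_SIZE_BP = 3_100_000_000       # assumption: approximate human genome size
PATHOGENIC_TERMINAL_FRACTION = 0.163 # 16.3 % of pathogenic anomalies (from the text)


def terminal_genome_fraction(
    ends: int = CHROMOSOME_ENDS,
    window_bp: int = TERMINAL_WINDOW_BP,
    genome_bp: int = GENOME_SIZE_BP,
) -> float:
    """Fraction of the genome that lies within the terminal windows."""
    return ends * window_bp / genome_bp


def fold_enrichment(observed_fraction: float, expected_fraction: float) -> float:
    """Ratio of the observed share of events to the share expected by size alone."""
    return observed_fraction / expected_fraction


if __name__ == "__main__":
    expected = terminal_genome_fraction()
    print(f"terminal 5 Mb windows cover {expected:.1%} of the genome")
    print(f"fold enrichment: {fold_enrichment(PATHOGENIC_TERMINAL_FRACTION, expected):.1f}")
```

Under these assumptions the terminal windows cover a little over 7 % of the genome, in line with the figure quoted above, and pathogenic anomalies are enriched roughly twofold relative to what the size of the region alone would predict.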

Loss, gain, and mutation of genes in the chromosome-specific zone of subtelomeres can cause a clinically recognized phenotype. Studies of patients with common phenotypic features and overlapping CNVs have pinpointed critical regions and genes associated with disease in a given subtelomere. The 9q subtelomeric deletion syndrome was first identified in patients with overlapping deletions that include the EHMT1 gene. EHMT1 is considered responsible for the phenotype because EHMT1 mutations alone cause a typical 9q deletion phenotype (Harada et al. 2004; Stewart et al. 2004; Kleefstra et al. 2006). Terminal deletions of chromosome 22q cause the 22q13 deletion syndrome, and mutations in the SHANK3 gene in the critical region also cause the language disorders associated with the syndrome (Phelan et al. 2001; Wilson et al. 2003; Durand et al. 2007). Given the gene density at chromosome ends (Fig. 8.1), a host of candidate genes could be responsible for other “subtelomeric syndromes.”

8.2 Subtelomeric Hot Spots and Rearrangement Mechanisms

Analysis of subtelomeric breakpoints has revealed recurrent sites of chromosome breakage. Given the enrichment of particular types of repeats in subtelomeres, such “hot spots” are likely related to DNA sequence and/or chromatin structure. Though not all types of repetitive DNA are linked to chromosome breakage, tandem repeats, trinucleotide repeats, satellite DNA, and G-rich sequences are known to underlie chromosomal fragility at other loci (Sutherland 2003; Bacolla et al. 2006; Zhao et al. 2010) and are strong candidates for DNA sequence-dependent causes of subtelomeric rearrangement. Uncovering how such sequences could form secondary structures that might interfere with cellular processes, including recombination and DNA replication, is crucial to untangling the molecular mechanisms that give rise to subtelomeric rearrangements.

One of the best examples of a subtelomeric hot spot lies in chromosome band 22q13.3. Rearrangements of this subtelomere have been independently identified in numerous studies, and fine-mapped breakpoints cluster between exons 8 and 9 of the SHANK3 gene (Wong et al. 1997; Anderlid et al. 2002; Bonaglia et al. 2006, 2011; Durand et al. 2007; Philippe et al. 2008; Dhar et al. 2010; Luo et al. 2011). At least 13 published terminal deletion breakpoints lie in this 1.2-kb hot spot, which is made up of G-rich tandem repeats that are predicted to form G-quadruplexes. G-rich sequences that contain four tracts of at least three guanines separated by other bases can form G-quadruplexes by pairing between the four G-rich tracts (Huppert and Balasubramanian 2005; Burge et al. 2006). Such G-rich sequences can assemble highly stable G-quadruplexes in vitro (Neaves et al. 2009; Sanders 2010), and without specific helicases to unwind them, G-quadruplexes can cause chromosome breakage and genomic instability in vivo (Kruisselbrink et al. 2008; Ribeyre et al. 2009). Human subtelomeres are G-rich (Fig. 8.1), and there are many subtelomeric loci that contain the G-quadruplex consensus sequence, G3–5 N1–7 G3–5 N1–7 G3–5 N1–7 G3–5, that is, four runs of three to five guanines separated by loops of one to seven nucleotides (Huppert and Balasubramanian 2005). Although functional studies of fragility at the 22q13.3 hot spot are still lacking, the recurrent breakpoints and predicted G-quadruplex motifs are suggestive of a region that is particularly susceptible to double-strand breaks (DSBs). It is likely that other subtelomeric rearrangement breakpoints are also caused by DSBs in G-rich sequences that assemble G-quadruplexes or other secondary structures.
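The G-quadruplex consensus quoted above maps directly onto a regular expression. The following is a minimal sketch of a motif scan, assuming an unambiguous A/C/G/T sequence as input; the function names are illustrative, and a sequence match only predicts the potential to form a G-quadruplex rather than demonstrating that one forms.

```python
import re

# Four runs of 3-5 guanines separated by loops of 1-7 nucleotides,
# following the consensus quoted in the text.
G4_PATTERN = re.compile(
    r"G{3,5}[ACGT]{1,7}G{3,5}[ACGT]{1,7}G{3,5}[ACGT]{1,7}G{3,5}"
)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq: str) -> list[tuple[int, int, str]]:
    """Return (start, end, strand) for consensus matches on both strands.

    Coordinates are 0-based, end-exclusive, and always refer to the
    input (forward) sequence. Matches on each strand are non-overlapping,
    so this undercounts in long G-rich arrays.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in G4_PATTERN.finditer(seq)]
    # A C-rich stretch on the forward strand is G-rich on the reverse strand.
    for m in G4_PATTERN.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


if __name__ == "__main__":
    # Four telomere-like repeats flanked by A's (illustrative input).
    print(find_g4_motifs("AAAGGGTTAGGGTTAGGGTTAGGGAAA"))
```

Scanning the reverse complement matters because a G-quadruplex can form on either strand of the duplex.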

Another indicator of elevated DSBs in subtelomeres comes from studies of sister chromatid exchange (SCE) in chromosome ends. The rate of SCE is significantly elevated in telomeres and subtelomeres, as demonstrated using a fluorescence method called chromosome orientation FISH (CO-FISH) (Cornforth and Eberle 2001; Londono-Vallejo et al. 2004; Rudd et al. 2007). Seventeen percent of all SCE occurs in the most terminal ~100 kb of chromosomes, translating to a 160-fold elevation of the rate of subtelomeric SCE compared with the rest of the genome (Rudd et al. 2007). More direct evidence of DSBs at chromosome ends comes from chromatin immunoprecipitation studies of the DSB-binding protein, γ-H2AX (d’Adda di Fagagna et al. 2003). In senescent primary cells, γ-H2AX is enriched 60 kb–1.5 Mb from the telomere, across different chromosome ends (Meier et al. 2007). These physical measurements of DSBs suggest that subtelomeres incur more breaks than other parts of the genome, consistent with the concentration of breakpoints in human subtelomeres.

DSBs in subtelomeres may be resolved via various DNA repair pathways, and analyses of breakpoint junctions in the chromosome-specific and SD zones provide insight into the rearrangement mechanisms that have shaped these regions. There are two major types of DNA repair, one that requires long stretches of sequence homology (homologous recombination) and one that does not (non-homologous repair). Comparing subtelomeric breakpoint junctions to the pre-rearrangement genomic state can distinguish the two types of DNA repair. A large-scale analysis of over 100 subtelomeric breakpoints in the chromosome-specific zone revealed that three of 21 sequenced breakpoint junctions were the product of homologous recombination between interspersed repeats, including LINE and Alu elements. The remaining 18 rearrangements did not involve significant sequence homology at the junctions and were formed via non-homologous end-joining (NHEJ) and DNA replication processes (Luo et al. 2011). Other studies of subtelomeric breakpoint junctions in chromosome-specific zones have also found a preponderance of NHEJ versus homologous recombination (Ballif et al. 2003, 2004; Gajecka et al. 2006, 2008; Bonaglia et al. 2006; Yatsenko et al. 2009).

A similar trend regarding homologous and non-homologous repair is evident in subtelomeric junctions in the SD zone. This part of the genome is organized as a patchwork of SDs shared between a subset of chromosome ends; however, subtelomeric SDs are not organized in a random manner. Instead, subtelomeric SDs are almost always in the same orientation and relative order, suggesting translocation, rather than transposition, as the mechanism of sequence transfer (Linardopoulou et al. 2005). The alignment of paralogous SDs in human subtelomeres highlights the interchromosomal sequence transfers responsible for the highly polymorphic organization of subtelomeric SDs. Forty-nine out of 53 SD breakpoint junctions are the product of NHEJ, while only four are mediated by homologous recombination (Linardopoulou et al. 2005). Thus, non-homologous DNA repair is the predominant mechanism underlying subtelomeric breakpoints in the chromosome-specific and SD zones.
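The junction counts reported for the two zones can be compared directly. This short sketch tabulates the share of junctions without significant homology in each dataset, using only the counts quoted in the text; pooling the two studies is done here purely for illustration, since they used different methods and sampled different regions.

```python
from fractions import Fraction

# Counts quoted in the text. "non_homologous" covers NHEJ and
# replication-based repair without significant junction homology.
JUNCTION_COUNTS = {
    "chromosome-specific zone (Luo et al. 2011)": {
        "homologous": 3,
        "non_homologous": 18,
    },
    "SD zone (Linardopoulou et al. 2005)": {
        "homologous": 4,
        "non_homologous": 49,
    },
}


def non_homologous_share(counts: dict[str, int]) -> Fraction:
    """Exact share of junctions repaired without long homology."""
    total = counts["homologous"] + counts["non_homologous"]
    return Fraction(counts["non_homologous"], total)


def pooled_counts(datasets: dict[str, dict[str, int]]) -> dict[str, int]:
    """Sum junction counts across datasets (illustrative pooling only)."""
    pooled = {"homologous": 0, "non_homologous": 0}
    for counts in datasets.values():
        for key in pooled:
            pooled[key] += counts[key]
    return pooled


if __name__ == "__main__":
    for name, counts in JUNCTION_COUNTS.items():
        print(f"{name}: {float(non_homologous_share(counts)):.1%}")
    pooled = pooled_counts(JUNCTION_COUNTS)
    print(f"pooled: {float(non_homologous_share(pooled)):.1%}")
```

In both zones the non-homologous share exceeds 85 %, which is the basis for the conclusion that non-homologous repair predominates.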

8.3 Subtelomere Evolution

Investigations into the subtelomeric differences between species have also given us insight into the DNA breakage and repair processes involved in this rapidly evolving part of the genome. Comparative genomic analyses of the great apes have shown that although most orthologous sequences are highly conserved, chromosome ends are far more diverse. Since most primate subtelomeres are not sequenced, comparative studies have relied on a combination of FISH, PCR, chromosome flow-sorting, and BAC sequencing to generate syntenic maps of these regions (Monfouilloux et al. 1998; Trask et al. 1998; Martin et al. 2002; Fan et al. 2002; Ventura et al. 2003, 2011; Linardopoulou et al. 2005; Rudd et al. 2009). Detailed analyses of several chromosome ends have found that subtelomeric sequences vary dramatically in copy number and genomic location between closely related species; however, the reticulate nature of subtelomeric DNA exchanges complicates the interpretation of the DNA sequence transfers that have shaped modern-day primate chromosome ends. Chromosome fissions and fusions that give rise to the birth and death, respectively, of subtelomeres are ideal for teasing apart the steps involved in subtelomere evolution. Fissions and fusions punctuate subtelomeric events, making it possible to track a given subtelomere before and after a major chromosomal change.

Human chromosome 2, for example, is the product of a head-to-head fusion of two ancestral chromosomes that remained separate in the other great apes (Yunis and Prakash 1982; Ijdo et al. 1991; Fan et al. 2002). The fused chromosome 2 inactivated one centromere and two telomeres, and the human 2q13–2q14.1 fusion site is marked by inverted telomere repeats and subtelomeric SDs that once resided at two independent chromosome ends (Fig. 8.3). These SDs are paralogous to several human subtelomeres, including 9p and 22q, consistent with multiple interchromosomal exchanges (Fan et al. 2002; Linardopoulou et al. 2005). The inverted telomere repeats at the fusion site are not perfect telomere arrays, but rather are 14 % diverged from the canonical telomere repeat, (TTAGGG)n. This could be due to the rapid divergence of perfect telomere repeats post-fusion, or it could indicate that the chromosomal fusion occurred at degenerate telomere repeats in the subtelomeres of the ancestral chromosomes, rather than as a fusion of the most terminal telomere sequences (Fan et al. 2002).
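One simple way to express how far a repeat array has diverged from the canonical telomere repeat is the fraction of mismatched bases against a perfect (TTAGGG)n array. The sketch below is an illustration only and is not the method used in the cited studies: it counts substitutions, ignores insertions and deletions, and tries all six repeat phases. Because the fusion site contains inverted repeats, one arm would need to be reverse-complemented before scoring.

```python
CANONICAL_UNIT = "TTAGGG"


def divergence_from_canonical(seq: str, unit: str = CANONICAL_UNIT) -> float:
    """Fraction of bases that mismatch the best-phased perfect repeat array.

    Substitution-only model: indels are not handled, so a single inserted
    or deleted base will inflate the estimate for the rest of the array.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    best = len(seq)
    for phase in range(len(unit)):
        # Compare each base with the perfect array starting at this phase.
        mismatches = sum(
            base != unit[(i + phase) % len(unit)] for i, base in enumerate(seq)
        )
        best = min(best, mismatches)
    return best / len(seq)


if __name__ == "__main__":
    perfect = "TTAGGG" * 5
    degenerate = "TTAGGGTTAGGCTTAGGG"  # one substitution in 18 bases (made-up input)
    print(divergence_from_canonical(perfect))
    print(round(divergence_from_canonical(degenerate), 3))
```

An alignment-based measure that tolerates indels would be more appropriate for real degenerate arrays; this version is kept deliberately small to show the idea.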

Fig. 8.3

Chromosome fission and fusion in primates. Centromeres are represented as circles, telomeres are represented as arrowheads, and segmental duplications are represented as colored rectangles. a The chromosome fusion that gave rise to human chromosome 2 resulted in inactivation of one centromere (open circle) and the fusion of two telomeres (gray). b The chromosome fission (red squiggly line) in the ancestor of the great apes resulted in the birth of three new telomeres and two new centromeres (red) and the inactivation of one centromere (open circle). New segmental duplications (green and blue) were transferred to the 15q subtelomere post-fission

A chromosomal fission in the ancestor of great apes gave rise to human chromosomes 14 and 15 (Fig. 8.3). Rhesus macaque chromosome 7 represents the ancestral locus, in which regions orthologous to human chromosomes 15 and 14 are arranged in a head-to-tail configuration. After the fission of the ancestral chromosome, one new pericentromere (on chromosome 14) and one new subtelomere (on chromosome 15) were created at the fission site (Wienberg et al. 1992; Ventura et al. 2003; Rudd et al. 2009). In addition, the ancestral centromere inactivated, two new centromeres activated, and both chromosomes 14 and 15 acquired acrocentric short arms with new telomeres (Fig. 8.3). Since its birth at the chromosome fission, the 15q subtelomere has engaged in rampant sequence transfers. The orthologous regions of the 15q subtelomere in four great apes and an Old World monkey exist as completely different genomic structures in each species (Rudd et al. 2009). Terminal deletions, interstitial deletions, duplications, and interchromosomal exchanges have created a unique subtelomeric configuration in the genomes of rhesus macaque, orangutan, gorilla, chimpanzee, and human. The fission site was home to at least 21 olfactory receptor (OR) genes in the ancestral chromosome, and since the fission, ORs have been gained and lost in a lineage-specific manner in the genomes of all the great apes (Rudd et al. 2009).

Like human subtelomeres, non-human primate subtelomeres are also enriched in satellite DNA and SDs. However, different classes of repetitive DNA have expanded in different species, typical of concerted evolutionary processes. Heterochromatic “caps” of chromosome ends have been seen in chimpanzee and gorilla, but not in human (Yunis and Prakash 1982; Royle et al. 1994). Recent sequence analyses of chimpanzee and gorilla subtelomeres have revealed that both species have a 32-bp satellite at chromosome ends, but SDs that make up the chimpanzee caps are derived from the chromosome 2 fusion site, whereas the gorilla subtelomeric SDs are derived from a chromosome 10 sequence (Ventura et al. 2011).

Analysis of the SDs in the human genome assembly also provides information on the evolutionary timing of primate subtelomeres. Fifty percent of human subtelomeric SD sequence is >98.7 % identical to another chromosome end, indicating that the sequence transfer occurred since human and chimpanzee diverged (Linardopoulou et al. 2005). Further, FISH analysis of a subset of human subtelomeric SDs revealed variation in copy number and genomic location between individuals and heterozygosity for subtelomeric SDs within a single individual (Trask et al. 1998; Linardopoulou et al. 2005). Such data are consistent with subtelomeric SDs being one of the most rapidly evolving regions of the human genome.

Rearrangements in primate subtelomeres are a source of variation and disease. Although small rearrangements represent normal polymorphism, larger gains and losses involving dosage-sensitive genes can cause intellectual disabilities and birth defects, making these regions particularly relevant to studies of human disease and diversity. Though subtelomeric variation is recognized in the human genome, the causes of DSBs in chromosome ends are unknown. Functional studies of the DNA sequences underlying subtelomeric breakpoints are a crucial next step to discovering the risk factors and mechanisms of subtelomeric rearrangements.