Introduction

The term centromere (kentron, center; meros, part), initially coined by Waldeyer in 1903 for the neck of sperm, was reinterpreted by Darlington in 1936 as the centric constriction on metaphase chromosomes to which spindle fibers attach during cell division [1]. Centromeres were distinguished cytologically by their constricted morphology and by C-banding, a Giemsa staining procedure that preferentially stains heterochromatic regions [2]. Now, fluorescence in situ hybridization (FISH) probes against the centromeric DNA of specific chromosomes and antibodies against centromere proteins are commonly used to localize centromeric regions [3–5].

The human centromere is a region on the chromosome consisting of an underlying alpha satellite repetitive DNA sequence that winds around nucleosomes containing centromere protein (CENP)-A, a histone H3 variant. Hence, CENP-A is the epigenetic mark of a centromere and is one of the 17 proteins forming the constitutive centromere-associated network (CCAN), which is crucial in marking and maintaining the active centromere throughout the cell cycle [6, 7]. The kinetochore, on the other hand, is important in providing an interface for spindle microtubule binding, stabilizing correct attachments and participating in the spindle assembly checkpoint (SAC), as well as in the movement of sister chromatids towards opposite poles during anaphase [8].

Together, the function of the centromere and kinetochore is to ensure high fidelity of chromosome segregation during cell division, because erroneous chromosome segregation can lead to cell arrest or cell death or, more dangerously, to chromosomal instability (CIN) and aneuploidy in the daughter cells. CIN, the rate of karyotypic change resulting in anomalous organization and/or number of chromosomes, has been reported as one of the key features of cancer cells and was postulated to precede aneuploidy. Aneuploidy, by contrast, is the karyotypic state characterized by an abnormal number of chromosomes and has long been associated with carcinogenesis and birth disorders. The first suggestion of a possible link between aneuploidy and cancer was in the monograph published by Boveri in 1914 [9–11].

Health

The main functional role of the centromere is to ensure that replicated chromosomes are distributed equally to daughter cells during cell division. These functions can be divided into the following classes: (1) genetic/epigenetic marking or identity of the locus along a specified region of each chromosome, (2) SAC and correct attachment of microtubules, (3) sister chromatid cohesion and release, (4) movement of chromosomes to opposing poles and (5) cytokinesis, where a group of transient proteins mark the site for the final separation of the daughter cells.

Centromere Structure

The centromere comprises three main zones (Fig. 9.1): (1) the cohesion zone that holds the two replicated sister chromatids together until the onset of anaphase, (2) the DNA interface where proteins directly interact with centromere DNA to mark and maintain the centromere site between cell divisions, and (3) a platform for the capture of spindle microtubules, commonly referred to as the kinetochore.

Fig. 9.1 Centromere structure during interphase and metaphase. (a) The centromere locus is marked by a group of proteins (green) known as the CCAN complex, which are found at the same chromosomal site throughout the cell cycle. (b) After DNA replication, the chromatin (purple) condenses to form the mature metaphase chromosome attached to spindle microtubules (blue). It is during this stage that the centromere attracts other proteins involved in microtubule spindle attachment (orange) and sister centromere cohesion (yellow).

Centromere DNA

Human centromere DNA is composed of a tandemly repeated AT-rich monomer of 171 bp commonly known as alpha satellite [12]. This repeat is organized into higher order repeats (HORs) ranging in size from 2 to 35 monomers, which are in turn organized into further tandem arrays spanning 250 kb to 3 Mb (Fig. 9.2) [13]. One feature of alpha satellite HORs is that they are chromosome specific [4]: differences in the primary sequence of the monomers within each HOR give it a sequence unique to its chromosome. This difference allows researchers and diagnostic scientists to use techniques such as FISH to identify single chromosomes such as the X or the Y. However, not all chromosomes can be distinguished by a single alpha satellite HOR class. This is nicely illustrated by the acrocentric chromosomes 13, 14, 15, 21 and 22. HORs from these chromosomes share such a high level of sequence homology that single chromosomes cannot be differentiated by hybridization techniques such as FISH or Southern blot.
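As a rough illustration of the scale these figures imply, the following Python sketch converts the sizes quoted above into monomer counts. Only the 171 bp monomer length, the 2 to 35 monomer HOR range and the 250 kb to 3 Mb array range come from the text; the function names are hypothetical, and because real monomers vary slightly in length the results should be read as approximations.

```python
MONOMER_BP = 171  # alpha satellite monomer length quoted in the text


def hor_length_bp(monomers_per_hor: int) -> int:
    """Approximate length of one higher order repeat (HOR) in base pairs."""
    return monomers_per_hor * MONOMER_BP


def whole_monomers_in_array(array_bp: int) -> int:
    """Number of complete 171 bp monomers that fit in an array of array_bp."""
    return array_bp // MONOMER_BP


# HOR sizes for the smallest and largest HORs mentioned in the text.
for n in (2, 35):
    print(f"{n}-monomer HOR: about {hor_length_bp(n)} bp")

# Monomer counts for the smallest and largest array sizes mentioned.
for array_bp in (250_000, 3_000_000):
    print(f"{array_bp} bp array: about {whole_monomers_in_array(array_bp)} monomers")
```

Under these assumptions a single HOR spans a few hundred base pairs to roughly 6 kb, and a centromeric array holds on the order of a thousand to several thousand monomers.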

Fig. 9.2 Schematic illustrating the genomic organization of human centromeres. A condensed metaphase chromosome showing centromeric alpha satellite (green) and pericentric DNA families (blue and pink). Note that the repetitive pericentromeric DNA is composed of different sized monomer sequences that are not similar to alpha satellite. The fundamental repeating unit of alpha satellite is an AT-rich 171 bp unit. This is organized into higher order repeats (HORs) that have a high level of sequence identity. Individual centromeres can be distinguished by their HOR type, which varies in length and sequence structure. Furthermore, the overall length of the alpha satellite domain is highly polymorphic between individuals. An alternative way to view the organization of centromeric DNA is to break it up into monomer units, shown as colored circles. The sequential order of each unit is linked with a line, and the HOR monomer units are linked with thicker lines which represent multiple HORs.

Alpha satellite DNA also contains a conserved 17-bp motif known as the CENP-B box. This sequence is present in varying frequencies in alpha satellite HORs, ranging from 50 to 0 % in chromosomes 21 and Y, respectively [14]. The CENP-B protein binds to the CENP-B box and is thought to be important for the de novo assembly of the centromere. The formation and stability of artificial chromosomes in the laboratory is dependent on CENP-B-box rich DNA [15]. Paradoxically, once the chromosome is in the cell it does not need the CENP-B protein for full centromere function, as shown by several lines of evidence: (1) mice with knockouts of the CENP-B gene exhibit full centromere function, develop normally and are fertile, (2) natural centromeres such as that of the Y chromosome do not contain the CENP-B box, and (3) human neocentromeres form on DNA that has no alpha satellite or CENP-B box motifs [14, 16–19].
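How the frequency of a short motif such as the CENP-B box might be estimated across a set of monomers can be sketched in Python as follows. This is a minimal sketch, not a method taken from the cited studies. The 17-bp consensus used here (YTTCGTTGGAARCGGGA, in IUPAC notation) is the sequence commonly quoted in the literature; it is not given in this chapter, so treat it as an assumption and substitute whichever definition a given analysis requires. The function names are hypothetical, and only the forward strand is scanned.

```python
import re
from typing import List

# IUPAC nucleotide codes used by the motif below, as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

# ASSUMPTION: commonly quoted 17-bp CENP-B box consensus; not from this chapter.
CENPB_BOX_CONSENSUS = "YTTCGTTGGAARCGGGA"


def motif_to_regex(motif: str) -> str:
    """Translate an IUPAC motif into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq: str, motif: str = CENPB_BOX_CONSENSUS) -> List[int]:
    """Return 0-based start positions of the motif on the forward strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


def fraction_with_motif(monomers: List[str], motif: str = CENPB_BOX_CONSENSUS) -> float:
    """Fraction of monomers that contain at least one forward-strand match."""
    if not monomers:
        return 0.0
    hits = sum(1 for monomer in monomers if find_motif(monomer, motif))
    return hits / len(monomers)
```

Applied to monomers extracted from a given HOR array, `fraction_with_motif` would give a figure comparable in spirit to the 50 to 0 % range quoted above, although a realistic analysis would also need to scan the reverse complement and tolerate sequence variation.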

Alpha Satellite DNA Mapping and Sequencing

Alpha satellite DNA was one of the first sequences to be identified and sequenced. However, the centromeric regions of the human reference genome still contain megabase-size gaps because of difficulties with contig assembly. While high-throughput genome sequencing has revolutionized the rapid identification of the molecular defects behind many human disorders, centromeres remain incomplete: because of the short read length of current parallel sequencing technologies, satellite DNA regions again suffer from poor assembly. Recently, novel computational methods involving unit monomer analysis have provided new ways of analyzing these regions. By grouping similar monomer units together, predictions can be made that reveal the overall array length for haploid centromeres such as those of the X and Y chromosomes (Fig. 9.2) [20].
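The idea of grouping similar monomer units can be illustrated with a toy Python sketch. It is not the method of reference [20]: it assumes the sequence is already cut cleanly into fixed-length units with no insertions or deletions, compares units by simple positional identity, and uses a greedy grouping with an arbitrary 95 % threshold. All function names and parameters are hypothetical.

```python
from typing import List


def split_monomers(seq: str, unit: int = 171) -> List[str]:
    """Cut a sequence into consecutive fixed-length units, dropping any remainder."""
    return [seq[i:i + unit] for i in range(0, len(seq) - unit + 1, unit)]


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length units agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_monomers(monomers: List[str], min_identity: float = 0.95) -> List[int]:
    """Greedily assign each monomer to the first group whose representative it matches.

    Returns one integer group label per monomer, in input order.
    """
    representatives: List[str] = []
    labels: List[int] = []
    for monomer in monomers:
        for index, rep in enumerate(representatives):
            if identity(monomer, rep) >= min_identity:
                labels.append(index)
                break
        else:
            # No existing group is similar enough, so start a new one.
            representatives.append(monomer)
            labels.append(len(representatives) - 1)
    return labels


def smallest_period(labels: List[int]) -> int:
    """Smallest p such that the label sequence repeats every p units."""
    n = len(labels)
    for p in range(1, n):
        if all(labels[i] == labels[i + p] for i in range(n - p)):
            return p
    return n
```

On an idealized array, the repeating pattern of group labels returned by `group_monomers` corresponds to the HOR, and `smallest_period` recovers the number of monomers per HOR. Real data would require alignment-based comparison and tolerance for divergent or truncated units.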

CENP-A and Alpha Satellite: Centromeric Chromatin

The centromere-specific histone H3 variant CENP-A (described in “Centromere Proteins” section below) is normally present within subsections of the HOR region of alpha satellite DNA. This has been shown with elegant anti-CENP-A and alpha satellite FISH experiments on extended chromatin fibers [21]. This subdomain structure of CENP-A and alpha satellite is considered to play a role in the three dimensional assembly of the mature mitotic centromere, since alpha satellite DNA is present within the inner (pairing) and outer (microtubule binding) regions of the centromere (Fig. 9.1).

Eviction of the Invaders

Unlike the centromeres of other multicellular eukaryotes, human centromeres are mostly made up of one class of DNA, alpha satellite DNA. LINE and SINE transposable elements (TEs) are rarely found within the HOR array. What is the possible mechanism that keeps the intruders at bay? Detailed sequence map analysis at the border regions of alpha satellite has shown that the age of TE insertion decreases as one goes from outer non-alpha satellite DNA to the inner higher order alpha arrays [22]. This suggests that TEs are rapidly pushed away from the HOR region to the periphery. A simple mechanism that would explain this is unequal crossing over between homologous chromosomes or sister chromatids, which also contributes to the evolution of centromere DNA.

Centromere Proteins

Classes of Centromere/Kinetochore Proteins

The centromere/kinetochore complex can be broadly classified into two main groups based on structure and function with respect to chromosome segregation. The centromere locus needs to be identified and maintained in one region per chromosome; this memory between cell cycles is maintained by a group of proteins that are present at the centromere throughout the cell cycle (Fig. 9.1). The second group is involved in the active process of preparing and executing chromosome segregation. These proteins are present at the centromere/kinetochore in a transient manner, from after DNA replication to the completion of telophase.

To date, over 100 proteins have been shown to locate to the centromere/kinetochore at some stage during the cell cycle. For the purpose of this chapter we are only including proteins that have multiple lines of evidence such as antibody and epitope fusion localization. Some proteins have been misclassified because of artifact signals from antibody staining experiments.

The first set of human centromere proteins discovered were identified using auto-immune sera from patients with scleroderma disease [5]. Protein immunoblotting uncovered three common antigens, named CENP-A, -B and -C in ascending molecular weight order [23]. Serendipitously, these three proteins bind to the centromere DNA and form the foundation platform onto which other centromere and kinetochore proteins assemble the mature, functional structure.

Centromere Function

Epigenetic Marking

Most eukaryotic centromeres are characterized by long tracts of repeat DNA, either satellite or transposable elements. Furthermore, this DNA is often specific to the centromeric locus, for example alpha satellite in humans. One popular hypothesis regarding the interaction between centromere DNA and protein was that the protein had specific DNA-binding affinity, such as the CENP-B protein binding to the CENP-B box motif in alpha satellite [14]. However, immunofluorescence analysis of variant chromosomes such as dicentrics or neocentromeres (described in “Disease” section below) showed that some centromere proteins were present only at functionally active centromeres, whether alpha satellite DNA was present or absent, while other proteins were present at both active and inactive centromeres [19, 24]. This line of evidence showed that centromeres have both genetic and epigenetic characteristics, unlike their telomere counterparts, which are strictly genetic.

CENP-A: The Primary Mark

CENP-A is a histone H3 variant that is only found at active centromeres [25]. It replaces both units of histone H3 in the histone octamer, providing the centromeric epigenetic mark and a chromatin platform onto which the constitutive centromere-associated network (CCAN) of proteins binds [6] (see Table 9.1). Further evidence for the foundation role of CENP-A comes from gene knockout/knockdown studies, which result in the loss of downstream centromere proteins and the absence of a functional kinetochore [26].

Table 9.1 Centromere and kinetochore proteins organized into functional classes

Spindle Assembly Checkpoint (SAC)

After chromosomes have replicated and condensed, they are captured by the mitotic spindle via interaction with the kinetochore. The chromatid pairs then shuffle between spindle poles to ensure that each sister centromere has attached to spindle microtubules emanating from one pole, thus achieving bi-oriented attachment. Once all chromosomes have acquired correct attachment and equal tension, the chromatids are ready to segregate to opposite poles. The cell is able to detect the tension and signal the beginning of anaphase. A group of proteins that are essential for the correct attachment of chromosomes were identified through elegant genetic screens in budding yeast [27, 28]. These spindle assembly checkpoint (SAC) proteins are conserved in humans; mutations in them elevate the rate of chromosome segregation errors and have a role in cancer predisposition (see “Disease” section).

Sister Centromere Cohesion

After DNA replication, sister chromatids need to be held together to prevent them from prematurely separating, which can result in mis-segregation. A conserved protein complex, known as cohesin, holds the sister chromatids together until the early stages of mitosis when cohesin is progressively removed from the arms and remains at the centromere region until the onset of anaphase. A protector protein, Shugoshin, binds to the centromeric pool of cohesin and thus prevents its premature removal [29]. So the last chromosomal region to be held together before anaphase is the inner centromere domain. In addition to mitosis, cohesin also plays an important role during meiosis I when homologous chromosomes are held together at the centromere by a meiotic-specific cohesin complex. It is hypothesized that weakening of this complex due to aging may contribute to higher rates of chromosomal non-disjunction in women of advanced maternal age (see “Disease” section).

Chromosome Movement

One of the key roles of the kinetochore is to capture the spindle microtubules, align the chromosomes at the midzone and then move them to opposite poles. Affinity biochemical experiments in yeast have shown that the budding yeast centromere comprises one super-complex that binds to one microtubule. Human kinetochores each attach to around 20 microtubules, so multiple subunits act together in concert [30]. Once each sister centromere has been captured by microtubules, the chromatids go through a pushing and pulling action between spindle poles to establish equal tension. This movement is partly triggered by motor, microtubule binding and checkpoint proteins. A protein complex at the heart of this process is the KMN network (Table 9.1). Like other complexes, it is conserved in a multitude of eukaryotic organisms and plays an essential role in chromosome segregation. Components of this complex are transiently present at the centromere and form a link between the centromeric chromatin and the outer kinetochore.

Cytokinesis

As described above, centromere cohesion plays an important role in holding the sister chromatids together until the beginning of anaphase. Additional roles include tension sensing and chromosome alignment or error correction. The complex of proteins at the heart of this region is the Chromosome Passenger Complex (CPC) [31], which comprises four subunits: Aurora B kinase, INCENP, SURVIVIN and BOREALIN. Furthermore, the CPC has an additional role once chromosomes begin to move to opposite poles. It is left behind at the spindle midzone, thus marking this region as the site of cellular/cytoplasmic constriction and eventual cleavage of the membranes and spindle microtubules to release the two daughter cells. Any defects in this later stage of mitosis can lead to cells with multiple copies of the genome (polyploidy) and are thought to be involved in tumour progression.

Disease

Structural abnormalities implicating the centromeric DNA, namely the presence of more than one centromere, repositioning of the centromere to a non-centromeric DNA site, prematurely separated centromeres, mutations and aberrant expression of centromere-associated kinetochore proteins, anomalous methylation and altered transcription of alpha satellite as well as pericentric regions, have all been associated with human diseases.

Chromosome Structural Abnormalities

Dicentric Chromosomes

Robertsonian translocations (ROBs) are the most common constitutional structural rearrangements in humans, observed at a rate of one in every thousand live births. ROBs involve whole-arm exchanges between two of the five non-homologous human acrocentric chromosomes (13, 14, 15, 21 and 22), giving rise to a karyotypically metacentric chromosome [32]. Carriers of balanced ROBs are generally normal but have an increased risk of infertility due to conception of non-viable fetuses, and an elevated chance of having offspring with Down syndrome.

The other commonly reported constitutional dicentric chromosomes are the isodicentric X chromosomes, especially idic(X)(p11), which can occur in either mosaic or non-mosaic form. Idic(X)(p11) cases account for about 18 % of Turner syndrome patients, amounting to an incidence rate of approximately 1 in 14,000 females. Other dicentric X chromosomes might include rearranged derivatives of X chromosomes or isodicentrics that have breakpoints at sites other than Xp11 [33].
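The two figures quoted above can be related by simple arithmetic, sketched here in Python. Taking both numbers at face value, a subtype that represents 18 % of cases and occurs in about 1 in 14,000 females implies an overall incidence for the parent condition of roughly 1 in 2,500 females. The source does not state this overall incidence directly, so the result is a derived estimate, and the function name is hypothetical.

```python
def implied_overall_one_in(subtype_one_in: float, subtype_fraction: float) -> float:
    """Return N such that the overall incidence is about 1 in N.

    If a subtype occurs in 1 in `subtype_one_in` individuals and makes up
    `subtype_fraction` of all cases, then
        overall incidence = (1 / subtype_one_in) / subtype_fraction,
    which corresponds to 1 in (subtype_one_in * subtype_fraction).
    """
    if not 0 < subtype_fraction <= 1:
        raise ValueError("subtype_fraction must be in the interval (0, 1]")
    return subtype_one_in * subtype_fraction


# Figures from the text: idic(X)(p11) is about 18 % of Turner syndrome
# cases and occurs in approximately 1 in 14,000 females.
estimate = implied_overall_one_in(14_000, 0.18)
print(f"Implied overall incidence: about 1 in {round(estimate)} females")
```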

Rarer non-homologous, non-Robertsonian translocations have also been reported to give rise to constitutional dicentric chromosomes. Thus far, only 27 cases have been reported since the 1970s. Most cases (23/27) involved an acrocentric chromosome, and 15 of the 19 cytogenetically distinguishable heterodicentric chromosomes had only one primary constriction; in 12 of these 15, the inactivated centromere was the acrocentric centromere. This is probably due to the relative stability of the dicentric formed, as p-arm deletion of acrocentric chromosomes is not embryonic lethal and the centromeres of acrocentrics have a higher tendency to become inactivated [34, 35].

Constitutional dicentric chromosomes are stably transmitted through cell divisions because one of the two centromeres is either inactivated via epigenetic mechanisms or deleted partially or fully (Fig. 9.3) [36, 37]. An inactivated centromere is positive for CENP-B but negative for the essential proteins CENP-A, -C and -E, and hence is distinguishable from functionally active centromeres [38]. Stability of a dicentric chromosome with two functional centromeres can also be achieved through close proximity of the centromeres, namely an intercentromeric distance of less than 12 Mb, as seen on isodicentric X chromosomes [39].

Fig. 9.3 Epigenetic status of the centromere in abnormal chromosomes. Replicated sister chromatids (black and grey) are shown aligned and attached to microtubules. The satellite-rich centromere DNA (orange and light grey shaded boxes) mark the centromere locus. Functionally active centromeres build a mature kinetochore (red and blue ovals) which capture spindle microtubules and move chromatids to opposite poles. (Ai and ii) Functional dicentric chromosomes with closely spaced centromeres act in unison to correctly segregate the chromatids. (Aiii and iv) Dicentric chromosomes with centromeres spaced further apart can also segregate correctly but (Av and vi) sister chromatids can twist between the two centromeres, resulting in single chromatids attached to both poles, which causes possible breakage of the chromosome. (Avii and viii) Epigenetic inactivation of one of the centromeres (loss of the kinetochore) resolves the conflict between the two active centromeres and thus chromosomes can correctly segregate. Neocentromeres form on non-alpha satellite DNA, often in euchromatic regions. (B) Two possible mechanisms of neocentromere formation: (Bi and ii) repositioning of the centromere to a new region along the chromosome. The old centromere is subsequently inactivated. (Biii and iv) Another mechanism shows a breakage and the formation of an acentric fragment. This chromosomal fragment is rescued by the formation of a kinetochore but the underlying alpha satellite DNA is absent.

In malignancies, dicentric chromosomes are generally an outcome of telomere fusion events due to the telomere instability of cancer cells, as observed in giant cell tumor of the bone, meningioma, chronic lymphocytic leukemia (CLL), pancreatic cancer and osteosarcoma [40, 41]. However, most dicentric chromosomes in hematological malignancies arise from a reciprocal translocation that produces a dicentric chromosome and an acentric chromosomal fragment, which might be lost in subsequent mitoses. Thus far, the mechanism of centromere inactivation in malignancies has not been well studied. Investigations into the dicentric chromosomes of acute myeloid leukemia (AML) and myelodysplastic syndromes indicated that a repertoire of strategies, namely functional (epigenetic) inactivation, intercentromeric deletion, inversion to reduce intercentromeric distance, and partial or full centromere excision, were deployed to produce a more stable chromosome [41].

Neocentromeres

Neocentromere is the term coined for an ectopic centromere which forms in a region of the chromosome outside the repetitive alpha satellite DNA [19]. It binds all known centromere proteins except CENP-B and functions similarly to the native centromere [42] although the level of CENP-A incorporation [43], cohesion [44, 45] and error correction by Aurora B [46] appear to be lowered. Neocentromeres have been found in euchromatic sites and the formation of neocentromeres does not seem to correlate with reduced expression of the genes in those regions [47].

The first report of a constitutional human neocentromere, in 1993, came from cytogenetic screening of a 4-year-old patient who presented with delayed speech development [19]. Subsequent discoveries were made in patients with a wide spectrum of clinical presentations, ranging from facial dysmorphism and growth retardation in younger patients to infertility and a high proportion of miscarriages in adult patients [47]. In children, several cancer types including retinoblastoma [48], Wilms tumor [49], cystic hygroma [50] and hemangioma [51] were reported as co-morbidities with the other developmental disorders.

In addition, neocentromeres have been specifically associated with a few cancers thus far, namely AML, atypical lipomas and well-differentiated liposarcomas (ALP-WDLPS), lung sarcomatoid carcinoma and T-cell non-Hodgkin lymphoma [52]. The presence of neocentromeres on either a supernumerary ring or a long marker chromosome, both derived from the long arm of chromosome 12, is a defining characteristic of ALP-WDLPS of borderline malignancy [53]. These chromosomes have amplification of the 12q14-15 region containing oncogenes that include MDM2 and CDK4 [54]. However, the same amplified region is also found in other, more aggressive liposarcomas, but on chromosome 12 with alpha satellites. This suggests that the neocentromere formed to stabilize the complex rearranged acentric chromosome containing amplified 12q14-15, which might confer a selective advantage within the tumor microenvironment, and it highlights the difference between the neocentromere and the native centromere with alpha satellites [47].

Premature Centromere Division

Premature centromere division (PCD; OMIM #212790) is a cytogenetically detectable trait where the X chromosome appears to have no discernible centromere resulting in a rod-shaped X chromosome. The frequency of lymphocytes showing PCD and DNA damage increases as we age but for sporadic Alzheimer’s disease patients, the increased frequency was even more significant when compared to their age-matched controls. In addition, PCD was shown to be consistently more prominent in females than males and was thought to be the cause of chromosomal instability resulting in tissue mosaicism and neuronal cell death in Alzheimer’s disease [55].

PCD is also found in older females who experience significantly higher chance of spontaneous abortion and bearing children with trisomies especially trisomy 21. In females, the immature oocytes arrest in prophase I and only proceed with meiosis upon hormonal stimulation during the period after puberty until menopause. Hence, the chiasmata between homologous chromosomes and cohesion of the sister chromatid arms in prophase I as well as the subsequent centromere cohesion between the sister chromatids in meiosis II have to be properly maintained by the cohesin complexes for many years before these oocytes are released and potentially fertilized [56]. This long period of arrest led to the postulation of an age-dependent ‘cohesin fatigue’ being a contributing factor to the much higher aneuploidy rate of oocytes in older women [57, 58].

Centromere Protein Genes

CENP-A and HJURP in Cancer

Studies in colorectal, testicular, liver, breast and lung cancers have reported elevated expression of CENP-A, while separate studies in lung, breast and brain cancers have reported overexpression of the CENP-A chaperone, HJURP [59, 60]. CENP-A and HJURP could potentially be used as prognostic markers for certain groups of cancer. CENP-A has been demonstrated to correlate positively with pathological grade and negatively with survival prognosis in lung adenocarcinoma [61], epithelial ovarian cancer [62] and estrogen-receptor positive breast cancers that were not treated with systemic therapy [63]. HJURP has shown a similar pattern of correlation in astrocytomas, the most common type of adult brain cancer [59]. In combination, upregulation of both CENP-A and HJURP at the mRNA level was found to be associated with decreased survival in breast cancer patients [64].

BUB1B, ESCO2, CASC5 and CENP-E in Developmental Disorders

Mosaic variegated aneuploidy syndrome (MVA; OMIM #257300) is a collective term for the cytogenetic characteristic where mosaic aneuploidies are commonly observed, with clinical features that include microcephaly, mental retardation and growth retardation. In a subset of MVA patients, premature chromatid separation (PCS; OMIM #176430) was evident [65]. PCS is another cytogenetic description for a spectrum of diseases in which a significant percentage of the mitotic lymphocytes appear to have separated centromeres and splayed chromatids. This is in contrast to the metaphase chromosomes of normal, colchicine or colcemid treated cells, where the two sister chromatids are linked at the centromere region [66].

The SAC gene BUB1B was not only the first gene found to be associated with MVA but also the first mitotic SAC gene whose germline mutations were linked to a human disease [67]. Monoallelic BUB1B mutations appeared to give rise to the most severe phenotype, including a high occurrence of PCS, cataracts, Dandy–Walker syndrome and cancer. Biallelic BUB1B mutations yielded a moderate phenotype, while MVA without BUB1B mutations rarely had PCS and showed no signs of cataracts, Dandy–Walker syndrome or cancer [68]. Hence, many have postulated that other mitotic SAC genes might have an important role in instigating the remaining forms of MVA.

The other cytogenetically observed trait around the centromere is heterochromatin repulsion, which is most noticeable on chromosomes with large tracts of heterochromatin, namely chromosomes 1, 9 and 16 [69]. This affects most metaphase chromosomes of patients with Roberts syndrome (RBS; OMIM #268300) and the milder SC phocomelia syndrome (SC; OMIM #269000). The causative gene for both of these syndromes was found to be Establishment of Cohesion 1 Homologue 2 (ESCO2), and these syndromes can be regarded as a spectrum depending on the variants of the mutated ESCO2. Clinical features of these patients include growth retardation, mental retardation and craniofacial abnormalities, with microcephaly being the most common alongside several others including hypertelorism, hypoplastic nasal alae and malar hypoplasia. The presence of cleft lip and palate was associated with the severity of limb malformations, while corneal opacities correlated with mental retardation and cardiac defects [70].

CASC5 (also known as KNL1) mutations were reported to cause autosomal recessive primary microcephaly (MCPH; OMIM #251200). CASC5 is a member of the conserved KMN (KNL1/MIS12 complex/NDC80 complex) network of proteins within the kinetochore that links the chromosome to the microtubules. CASC5, which localizes to the kinetochore from G2 until late anaphase, is also part of the SAC machinery, as it is known to bind to BUB1B [71]. Compound heterozygous variants of CENPE have recently been described in two siblings with microcephalic osteodysplastic primordial dwarfism (MOPD2; OMIM #210720), a disease previously reported to be linked to mutations in the centrosome-associated protein pericentrin (PCNT) [72]. CENP-E is a dimeric kinesin-like motor protein which was shown to be important for the stability of binding between the kinetochore and the dynamic microtubules, while PCNT is essential in the formation of microtubule arrays at the centrosome [73]. This suggests that the overlapping phenotype for both CENPE and PCNT mutations might be spindle-related [72].

Other Centromere Protein Genes in Cancers

Kinetochore protein genes that are crucial for the normal function of the centromere have been reported to be mutated or differentially expressed in various cancers. Mutations in BUB1 were implicated in colorectal, adult T-cell leukemia/lymphoma (ATLL), lung and pancreatic cancers while mutations in BUB1B had been reported in more cancer types including colorectal cancer, MVA, ATLL, glioblastomas, Wilms tumor and B-cell lymphoma [74].

In addition to mutation, the level of expression of SAC proteins appears to be important in tumorigenesis. BUB1, BUB1B and BUB3 were reported to be upregulated in gastric cancer [75]. However, in pediatric glioblastoma, expression of BUB1 and BUB1B was upregulated whereas BUB3 was downregulated [76]. In clear cell renal carcinomas investigated for the expression of their SAC genes, BUB1, BUB1B and MAD2L1 (MAD2 mitotic arrest deficient-like 1) were found to be overexpressed while MAD1 had decreased expression [77].

Epigenetics

Epigenetics is the study of the changes in gene expression or protein function that are not due to alterations in the DNA sequence of the gene, but are heritable through cell division. Such changes could occur at (1) the genome structural level, involving DNA methylation, histone modifications, nucleosome positioning, and histone variants, (2) the RNA level, which includes RNA splicing and RNA interference, and (3) the protein level in cases of prion formation, where the ‘infectious’ proteins are able to induce conformational change of native proteins, rendering them ‘infectious’ as well as dysfunctional [78, 79].

DNA Methylation

Cytosine residues outside CpG islands (non-CGIs), otherwise referred to as the global CpG dinucleotides within intronic and intergenic regions, especially in transposable elements and simple repeat sequences, are mostly methylated in somatic tissues. In contrast, cytosines in the CGIs, which are known to coincide with gene promoters or regulatory regions, are unmethylated [80].

Abnormal DNA methylation has been reported in immunodeficiency, centromeric instability, and facial anomalies (ICF) syndrome. ICF is a rare autosomal recessive disease which is currently categorized into three groups, namely ICF1 (OMIM #242860) with mutations found in the DNA methyltransferase 3B (DNMT3B) gene, ICF2 (OMIM #614069) with mutations in the zinc finger- and BTB domain-containing 24 (ZBTB24) gene, and a final group with currently unknown molecular etiology, provisionally designated ICFX [81]. Although all three groups exhibit hypomethylation of satellites 2 and 3, which are part of the constitutive heterochromatin, ICF2 and ICFX show additional hypomethylation at the alpha satellite [82]. In the heterochromatic regions that exhibit reduced DNA methylation, from an average level of 80 % in normal cells to 30 % in ICF cells, some heterochromatic genes were shown to have escaped silencing compared to the control, although each patient appeared to have an individual signature of heterochromatic genes that escaped silencing across different chromosomes [83].

Wilms tumor is the most common renal tumor in children under 5 years of age, accounting for 90 % of all pediatric renal cancer cases and approximately 7 % of all pediatric malignancies [84]. Hypomethylation of alpha satellite on chromosomes 1 and 10 was observed in Wilms tumor patient samples, but it was not correlated with aneuploidy. Satellite 2 was also hypomethylated on these chromosomes, although to a lesser extent and at a lower frequency [85]. These studies of ICF and Wilms tumor raise an interesting question about the mechanisms that lead to the differences in their hypomethylation profiles.

In cancer cells as well as aged cells, global hypomethylation and a concomitant increase in the methylation of promoters have been observed and are thought to contribute to genomic instability and gene silencing, respectively. The global non-CGIs can be further subcategorized and studied. Hypomethylation of Alu, LINE-1 and alpha satellite was examined in CLL patients, and alpha satellite hypomethylation was suggested to be a potential negative prognostic marker for CLL [86]. Separately, in a study of ovarian epithelial tumors, satellite 2 hypomethylation on chromosomes 1 and 16 was strongly correlated with both genome-wide hypomethylation and the degree of tumor malignancy. Extensive hypomethylation of chromosome 1 alpha satellite was also observed in a larger proportion of carcinomas than of the more benign forms [80]. Therefore, it appears that the level of DNA methylation at centromeric satellite sequences is a useful biomarker in the study and diagnosis of cancers.

Pericentric and Centromeric Transcription and Histone Modifications

Expression of pericentric and centromeric transcripts has been found to be altered in senescent cells and human cancer samples when compared to normal tissues, reflecting global epigenetic deregulation [87]. This could partly be facilitated by the altered DNA methylation in these regions. In addition to global DNA methylation changes, global histone marks have also been found to be altered. Loss of both H4K20me3, the pericentric constitutive heterochromatin mark, and H3K27me3, the facultative heterochromatin mark, was reported in lung cancer cells when compared with non-tumor cells [88].

Upregulation of pericentric satellite 3 was also observed in a Hutchinson-Gilford progeria syndrome (HGPS) patient. HGPS (OMIM #176670) is a disease of rapid aging due to the expression of mutant Lamin A, a developmentally regulated gene. However, the expression of alpha satellite was unaltered, suggesting that the expression of pericentric and centromeric sequences is controlled by different mechanisms. The upregulation of satellite 3 was accompanied by the loss of H3K27me3 and of the pericentric constitutive heterochromatin mark H3K9me3, but by an increase in another constitutive heterochromatin mark, H4K20me3 [89]. Hence, the relationships between the expression of repetitive satellites and both DNA methylation and histone modifications remain to be clarified.

Although the aforementioned cases were characterized by upregulation of centromeric and/or pericentromeric sequences, the right transcriptional balance between the sense and antisense strands of both pericentric and centromeric sequences appears to be crucial for the proper formation and function of a centromere [90, 91].

Evolution

Primate Centromere DNA

Alpha satellite DNA is a relatively conserved centromeric repeat family. It is found in great apes, Old World monkeys and New World monkeys, which span approximately 43 million years of evolution since the last common ancestor. In great apes it is organized into HOR structures, whereas in more distant species it is mainly found in divergent monomeric forms. One proposed hypothesis is that the HOR structure arose after the divergence of the great apes from the rest of the primate species [92]. Interestingly, HOR structures are not necessarily restricted to certain species or particular chromosomes. For example, the centromere array of the mouse Y chromosome contains a HOR, but the autosomes and X chromosome only have the monomeric form [93].

As described above, alpha satellite is found in many primate species across 43 million years of evolution. This is a rather long time when compared to other mammalian centromere satellites such as the mouse minor satellite DNA. The minor satellite has a monomer repeat unit of 120 bp and is found only in a subset of species in the Mus genus spanning about 5–7 million years [94]. The higher rate of centromere DNA evolution in the mouse may be related to a much shorter generation time, which increases the chance for the centromere array to rearrange and diverge during meiotic recombination. Even though minor and alpha satellite DNAs are quite diverged, they do share some features, such as a high AT content and the conservation of the CENP-B box motif.

The Rapidly Evolving Y Centromere

The human Y chromosome exists in a haploid state in males and offers a unique perspective on the evolution of centromere DNA within a species, since there is no homologous counterpart of this region with which meiotic recombination can occur. Like most other Y chromosome sequences that do not recombine with a homologue, the centromere DNA has undergone a rapid rate of sequence divergence. Features that mark the Y centromere as separate from other human centromeres include a diverged alpha satellite monomer, absence of the CENP-B box motif, a diverged HOR and a significantly smaller overall length [13, 14]. The rapid divergence of the Y alpha satellite sequence is nicely illustrated by the analysis of the HOR in humans and chimpanzees. The HORs of the X and 17 alpha satellites exhibit conserved co-linearity between the two species, whereas the Y alpha satellite has completely lost this conserved structure, and the length of its HOR also differs between humans and chimpanzees [93].

A rapidly diverging and smaller Y centromere may have functional consequences, and may be responsible for the Y chromosome’s partial instability during division of aging cells [95, 96]. Measurement of the CENP-A protein on Y centromeres shows that they contain around half the amount found on autosomal and X centromeres [43]. This lower amount is consistent with less alpha satellite DNA being present at Y centromeres. Y alpha satellite DNA ranges in length from 250 to 1500 kb, in contrast to the X centromere, which is megabases in size, ranging from 1300 to 3700 kb [13]. At the extreme end of low amounts of alpha satellite, neocentromeres are formed on non-alpha satellite regions of the genome and are found to contain even lower amounts of CENP-A than the Y centromere [43] (see “Disease” section).

Adaptive Evolution of Foundation Centromere Proteins

It has been hypothesized that rapidly evolving centromere DNA can expand and create larger centromeres that can bind more spindle microtubules [97]. Other evolutionary mechanisms can also increase the size of a centromere, such as translocations of acrocentric chromosomes in humans that generate metacentric (Robertsonian) chromosomes with two adjacent centromeres [98]. These chromosomes have a higher chance of being inherited during the asymmetric cell divisions of female meiosis, where the egg spindle pole releases more microtubules than the polar body pole to capture the bigger centromere. To prevent a complete runaway of chromosomes with larger and larger centromeres, the cell counters this expansion by epigenetic means, through the adaptive evolution of centromere chromatin proteins such as CENP-A and CENP-C [99]. Evidence for this hypothesis is now accumulating in many species groups, such as flies, plants and primates, in which these two proteins are under adaptive evolution [100–102]. In contrast, a similar sequence analysis across the primate group did not show any evidence of adaptive selection for the non-essential CENP-B protein, even though this protein directly binds to the alpha satellite DNA [102].

ZNF397, an Evolutionary New Centromere Protein

Many centromere proteins are conserved in eukaryotic species, ranging from the single-celled budding yeast to humans. In some instances in evolution, new proteins appear via a variety of molecular mechanisms. One example of this is zinc finger protein 397 (ZNF397), which presumably arose from a gene duplication event after the separation of placental and marsupial mammals. We had previously identified ZNF397 using anti-centromere antibodies from a patient with autoimmune disease [103]. Interestingly, this protein has a unique cell cycle localization pattern in which it is present from the end of telophase through to early prophase. Knockout experiments in mice showed that the protein is not essential for chromosome segregation. One attractive hypothesis is that the protein has acquired centromere targeting activity but has yet to become directly involved in full kinetochore function.

Karyotype Evolution and Meiotic Drive: Robertsonian Translocations

Mendel’s law of segregation implies that the two homologous chromosomes in a parent segregate at meiosis into the gametes so that the offspring acquire only one copy of each chromosome from each parent, thereby maintaining the proper chromosome number in sexually reproducing organisms. This law assumes that the process of segregation occurs in a random and non-biased manner. However, in humans, the most common ROBs, namely der(13q14q) and der(14q21q), arise mainly during oogenesis but not spermatogenesis [32]. It was postulated that the chromosomal rearrangements of ROBs cause functional heterozygosity at the centromeres of homologous chromosomes, leading to differential interactions with the meiotic spindle. This in turn contributes to preferential segregation of the rearranged chromosomes into the functional meiotic product instead of the polar bodies [104]. ROB chromosomes in male carriers are not subject to the same meiotic drive because polar bodies do not form during spermatogenesis, and hence there is no opportunity for preferential segregation.
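The effect of such a female-specific segregation bias can be illustrated with a short calculation. The following Python sketch is a deliberately simplified model and is not taken from the cited studies: the function name `rob_frequency` and the transmission parameter `k` are hypothetical, mating is assumed to be random, and any fitness cost to carriers is ignored. It tracks the frequency of a rearranged chromosome in eggs and sperm when heterozygous females transmit it with probability `k` and males transmit it with the Mendelian probability of 0.5.

```python
def rob_frequency(p0, k, generations):
    """Frequency of a rearranged chromosome after a number of generations.

    p0: starting frequency in both eggs and sperm (assumed equal).
    k:  hypothetical probability that a heterozygous female passes the
        rearranged chromosome to the egg (0.5 = Mendelian, > 0.5 = drive).
    Heterozygous males always transmit at 0.5, because spermatogenesis
    forms no polar bodies and so offers no route for biased segregation.
    """
    egg, sperm = p0, p0
    for _ in range(generations):
        # Zygote genotype frequencies under random union of gametes.
        hom = egg * sperm
        het = egg * (1 - sperm) + sperm * (1 - egg)
        # Gamete frequencies produced by the next generation of adults.
        egg = hom + k * het        # female meiosis, biased by k
        sperm = hom + 0.5 * het    # male meiosis, unbiased
    return (egg + sperm) / 2


if __name__ == "__main__":
    for k in (0.5, 0.55, 0.6):
        print(k, rob_frequency(p0=0.01, k=k, generations=50))
```

With `k = 0.5` the frequency stays at its starting value, as Mendel’s law predicts, whereas any `k` above 0.5 lets the rearranged chromosome spread in this toy model even though only one sex contributes the bias.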

Neocentromeres and Evolutionary New Centromeres

One of the hallmarks of speciation at the genetic level is the divergence of karyotypes between two newly-formed species. This separation can often produce a reproductive barrier between the two groups, which then further accelerates the rate of evolution. Chromosomal rearrangements, including translocations, inversions, deletions and duplications, are a driving force in the emergence of new karyotype configurations. As a consequence of genomic rearrangements, the centromere can also change position, either to rescue an acentric chromosomal fragment or to compensate for a partially deleted (inactivated) centromere, as has been observed in de novo clinical cytogenetic cases (Fig. 9.3) (see “Disease” section). Changes in centromere position in closely related species led to the concept of Evolutionary New Centromeres (ENCs) [105]. ENCs were initially thought to have arisen through the physical repositioning of an extant centromere. More accurate genome and cytogenetic mapping has since led to this view being replaced by the ENC hypothesis, in which the evolutionary timeline of the formation of an ENC is as follows: (1) chromosomal rearrangement, (2) neocentromere formation and (3) accumulation of satellite DNA at the neocentromeric locus [106].

A good example that illustrates this progression from neocentromere to ENC is found in the orangutan. Early cytogenetic analyses of orangutan centromeres using alpha satellite FISH showed that at least one chromosome was devoid of alpha satellite DNA [107]. It was not until the advent of high-throughput sequencing and CENP-A pulldown technologies that chromosome 12 was definitively shown to contain a neocentromere that had not yet acquired any alpha satellite DNA sequences [108]. The ENC chromosome 12 is present in two species of orangutan, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), which shared a common ancestor between 0.4 and 1 million years ago. This shows that ENCs can take a long time to acquire satellite sequences. Interestingly, the progenitor chromosome 12 with the alphoid centromere still exists together with the ENC form in the two orangutan species.

Conclusions

In the last few decades we have made rapid progress in the discovery of most of the genomic and protein elements that make up a functional centromere. Human centromere DNAs have been identified and mapped to each chromosome. Current sequencing methods have made some inroads towards completing the genome map of these repeat-rich regions. Furthermore, novel computational methods have allowed the interrogation of high-throughput genome sequencing results from individuals; however, gaps still remain. The next breakthrough in long-read single-molecule sequencing will allow these gaps to be closed and analyzed centromere by centromere. Insights will be gained into the rate of evolution across populations and within families. This will enable researchers to further understand the contribution of variation and mutation to centromere dysfunction in human chromosome instability disorders.