Introduction

The centromere is a specialized chromosomal locus required for accurate chromosome segregation. The kinetochore structure is composed of more than 100 protein components that assemble on the centromere locus during the mitotic phase. The kinetochore interacts with microtubules and regulates the chromosomal segregation process via interactions with the mitotic checkpoint proteins (Cleveland et al. 2003; Musacchio and Salmon 2007; Allshire and Karpen 2008).

Almost 34 years have passed since eukaryotic artificial chromosomes were first generated in budding yeast. The first artificial chromosomes were constructed by combining origins of DNA replication with the functional centromere DNA elements in circular plasmids. Currently, artificial chromosomes with functional centromeres can be generated in human cells. During this process, centromere-specific repetitive DNA sequences (α-satellite DNA) are required for de novo assembly. However, human artificial chromosome formation is also strongly correlated with epigenetic regulation (e.g., chromatin modifications), as it is at natural centromeres, where evidence for epigenetic regulation came from the discovery of centromere inactivation (Earnshaw and Migeon 1985) and from the formation of human neocentromeres on non-α-satellite DNA if the natural centromeres are lost or deleted (Fukagawa and Earnshaw 2014). In this review, we focus on current advancements in de novo formation and maintenance of functional human centromeres and discuss how epigenetic chromatin modifications regulate centromere assembly, activation, and inactivation, including neocentromere formation.

The first eukaryotic artificial chromosome with a point centromere in budding yeast

In budding yeast (Saccharomyces cerevisiae) cells, the first autonomously replicating sequence (ARS) was identified in part of the TRP1 gene as an element that enhanced the transformation efficiency of bacterial plasmid DNA in yeast cells (Stinchcomb et al. 1979; Struhl et al. 1979). Plasmid DNA containing an ARS element replicates efficiently, accumulating as multicopy plasmids without integrating into the host chromosomes. However, such ARS-containing plasmids, which cannot properly segregate at mitosis, accumulate in the mother cells and are rapidly lost from the cell population during growth without selection.

Clarke and Carbon (1980) identified the first functional centromere DNA element (CEN) near the CDC10 locus of chromosome III, which had been mapped genetically. Introduction of the cloned CEN DNA into a circular plasmid bearing an ARS element caused that plasmid to undergo classical Mendelian segregation, with equal distribution into daughter cells and maintenance of a copy number of one per cell. The centromere and replication origin were both required for stable maintenance of this circular minichromosome.

Normal eukaryotic chromosomes are linear in structure and have telomere repeats at both chromosomal ends. The first linear artificial chromosome was constructed by Murray and Szostak (1983) in budding yeast. A linear DNA fragment containing a centromere, ARS, and selectable marker was capped at both ends with Tetrahymena rDNA repeats containing telomere structures. After transformation into yeast cells, this synthetic DNA construct formed a minichromosome that segregated and was stably maintained as a linear yeast artificial chromosome (YAC) (Fig. 1a, left).

Fig. 1

Centromere DNAs, protein components, and artificial chromosomes. a Structures of yeast centromeres and yeast artificial chromosomes (YACs). b Structures of human chromosome X and 21 alphoid DNA. The higher order repeats (HOR/type I) and monomeric (type II) alphoid DNA repeats. The HOR unit of chromosome 21 is composed of 11 alphoid monomer units containing five CENP-B boxes and one pseudo-CENP-B box. Red-shaded sequences are critical for CENP-B binding. Blue-shaded sequences are changed in the pseudo CENP-B box. No CENP-B binding is present on the pseudo-box. In the CENP-B box mutants of 11mer, wild-type CENP-B boxes (white ovals) are substituted with pseudo-box sequences (black ovals). c The assembly of human artificial chromosomes (HACs). d Properties of CENP-B protein. e A possible pathway for the CENP-A nucleosome in kinetochore protein assembly

The minimal centromere DNA of S. cerevisiae, which consists of conserved DNA elements (CDEs), is quite short at ∼120 bp (Fitzgerald-Hayes et al. 1982). The essential centromere-specific histone H3 variant Cse4 assembles as a single nucleosome on CDEII with other core histones, including H2A, H2B, and H4 (Furuyama and Biggins 2007; Camahort et al. 2009). This centromere-specific histone H3, CENP-A (also known in some organisms as CenH3), is highly conserved among eukaryotes, from yeast to humans and plants (Earnshaw and Rothfield 1985; Talbert et al. 2012; Earnshaw et al. 2013). In S. cerevisiae, the Cse4/CENP-A nucleosome recruits kinetochore proteins cooperatively with the CDEI-binding protein CBF1 and the CDEIII-binding protein complex CBF3 (Kitagawa and Hieter 2001). The defining feature of the yeast point centromere is that it is specified solely by the DNA sequence. Any DNA molecule carrying the 125-bp CDEI, CDEII, and CDEIII sequences is capable of assembling a centromere, and a single DNA base mutation in CDEIII can inactivate the centromere (Hegemann et al. 1988).

A unique core centromere sequence surrounded by heterochromatin is required for YAC formation in fission yeast

In fission yeast (Schizosaccharomyces pombe), the centromere-competent DNA is considerably larger than in S. cerevisiae. S. pombe centromere DNA is 35–110 kb in length and consists of a unique central core sequence (cnt, 4–7 kb) surrounded by repetitive sequences (imr, dg, and dh) (Nakaseko et al. 1986; Chikashige et al. 1989). For de novo formation of an artificial minichromosome from the input naked DNA, both the unique core and surrounding repetitive sequences are required (Hahnenberger et al. 1989) (Fig. 1a, right).

The S. pombe centromere has a specialized chromatin structure: when S. pombe DNA is cleaved with micrococcal nuclease, the central core DNA exhibits a smear pattern rather than a nucleosomal ladder (Takahashi et al. 1992), and this is required for functional kinetochore assembly (Takahashi et al. 2000). Multiple nucleosomes containing Cnp1, the S. pombe CENP-A, and other centromere-related proteins are assembled at cnt and imr DNA (Takahashi et al. 2000; Allshire and Karpen 2008; Ogiyama and Ishii 2012). The extent to which these are interspersed with nucleosomes containing canonical histone H3 is not clear. In contrast, the outer repeat region (dg and dh) consists of H3 nucleosomes in which H3 is di-methylated at lysine 9 (H3K9me2), which recruits Swi6/HP1, Clr4/Suv39, and other heterochromatin factors (Grewal and Jia 2007). Although the central core including outer repeat sequences is required for de novo artificial chromosome formation and Cnp1/CENP-A assembly, the outer repeat sequences can be substituted with an artificially induced heterochromatin structure (Folco et al. 2008; Kagansky et al. 2009). Heterochromatin is therefore important for artificial chromosome formation in S. pombe. Heterochromatin may be required for the enrichment of cohesin between the sister chromatids through HP1 binding (Bernard et al. 2001; Nonaka et al. 2002) and/or the assembly or maintenance of CENP-A chromatin (Folco et al. 2008; Kagansky et al. 2009).

Human centromeric DNA and protein components

A characteristic feature of the centromere DNA of all normal human chromosomes is α-satellite DNA (alphoid DNA). Alphoid DNA is a highly repetitive sequence with a monomer unit length of approximately 171 bp (Willard and Waye 1987) and is clustered in regular and irregular arrays extending over 0.5–5 Mbp in all human centromeres.

In human cells, centromere proteins were identified before any human DNA sequence was shown to function as a centromere. Discovery of the first centromere proteins was made possible when it was shown that autoantigens recognized by sera from patients with scleroderma-spectrum disease were located at centromeres (Moroi et al. 1980). These antigens were subsequently identified as CENP-A, CENP-B, and CENP-C (Earnshaw and Rothfield 1985). CENP-B, the first centromere protein cloned in any species (Earnshaw et al. 1987), recognizes the 17-bp CENP-B box motif in alphoid DNA (Masumoto et al. 1989; Muro et al. 1992). It has a unique N-terminal DNA-binding domain (Pluta et al. 1992; Yoda et al. 1992) and a self-dimerization domain at the carboxyl terminus (Kitagawa et al. 1995). CENP-A, the essential centromeric H3, is highly conserved among eukaryotes and forms a nucleosome structure (Palmer et al. 1987; Howman et al. 2000; Tachiwana et al. 2011; though this is controversial, see discussion in Fukagawa and Earnshaw 2014). CENP-A is required for kinetochore assembly as an epigenetic marker of centromere chromatin (Black and Cleveland 2011). CENP-C is also an essential protein that bridges the inner and outer plates of the kinetochore (Saitoh et al. 1992; Przewloka et al. 2011; Screpanti et al. 2011).

With the development of proteomics approaches, many other CENPs have been identified. CENP-A chromatin-associated proteins, which have variously been termed the Interphase Centromere Complex (ICEN), CENP-A nucleosome-associated complex (NAC)/CENP-A nucleosome distal (CAD), or constitutive centromere-associated network (CCAN) proteins, form the centromere-specific chromatin structure (Obuse et al. 2004; Izuta et al. 2006; Foltz et al. 2006; Okada et al. 2006). Numerous kinetochore proteins, including the KNL-1/Mis12 complex/Ndc80 (KMN) complex, and other checkpoint proteins assemble on this centromere chromatin during the mitotic phase (Cheeseman et al. 2006; Musacchio and Salmon 2007).

The CENP-B box is found both in the centromeric repetitive DNA of human alphoid DNA and mouse centromeric minor satellite DNA, which has an otherwise unrelated DNA sequence (Masumoto et al. 1989; Kipling et al. 1995). When mapped in detail, the centromeric alphoid DNA of chromosome 21 and chromosome X was found to consist of two domains. The inner, centromere core, on which the kinetochore assembles, is composed of a homogeneous repeating array of higher-order repeat (HOR) units. This is flanked by highly divergent alphoid monomeric units (Fig. 1b). In the homogeneous HOR locus (also termed the type I alphoid locus), chromosome-specific multimers of the 171-bp unit constitute the HOR unit, and the HOR units extend over mega-base pairs of a homogeneous locus (Choo et al. 1991; Ikeno and Masumoto 1994; Schueler et al. 2001; Aldrup-Macdonald and Sullivan 2014; Miga et al. 2014). In the monomeric repeating locus (also termed the type II alphoid locus), 171-bp alphoid monomers exhibit a divergence of ∼85 % and have no regular repeat structure. All human chromosomes have slightly different alphoid HOR units and monomeric loci around the centromeres. CENP-B boxes appear in all type I/HOR loci except the Y chromosome alphoid HOR locus (Masumoto et al. 1989; Earnshaw et al. 1991). Interestingly, the absence of CENP-B box motifs in Y chromosome centromeres is common among humans, chimpanzees, and mice (Pertile et al. 2009).

Human artificial chromosome formation with centromeric repetitive DNA

It has been technically difficult to investigate whether the entire alphoid repeat loci are required for centromere function or if only a part of the alphoid repeat is sufficient for de novo centromere formation. However, top-down approaches for determining the functional centromeric DNA sequence have been carried out. The human X and Y chromosomes were minimized using the targeted telomere sequence-mediated chromosome fragmentation method until centromere function was lost. This top-down approach indicated that centromere function is associated with the alphoid repeats because the minimum stable X and Y chromosomes contained alphoid DNA (Brown et al. 1994; Farr et al. 1995). The smallest stable derivative of the X chromosome that was obtained by this approach had 1.8 Mb of alphoid DNA (Mills et al. 1999).

In a bottom-up approach, Willard et al. cloned a chromosome 17 alphoid DNA HOR unit in a BAC vector and extended the repeat structure in tandem using an in vitro ligation method up to several hundred kilobases in length. Extended alphoid DNAs with a selection marker DNA fragment were co-transfected with total human genomic DNA and telomere sequences as a mixture of DNAs into human fibrosarcoma HT1080 cells. A mega-base-sized minichromosome formed as a consequence of heterogeneous multimerization of input DNAs from this mixture. This minichromosome was stably maintained without selection as the first human artificial chromosome (HAC) (Harrington et al. 1997) (Fig. 1c, top).

In another approach, 80–100 kb of type I/HOR or type II/monomeric alphoid DNA fragments from chromosome 21 were cloned into a YAC vector. These alphoid YAC DNAs were retrofitted with human telomere sequences at both ends, purified from yeast cells and transfected into human HT1080 cells. HAC formation occurred efficiently with the type I/HOR alphoid YAC DNA as mega-base-sized multimers of the input DNA lacking detectable host chromosomal DNA. In contrast, no HAC formation occurred with the type II/monomeric alphoid YAC. CENP-A, CENP-B, CENP-C, and CENP-E (a kinetochore motor protein) assembled on the HACs, which segregated equally into daughter cells at mitosis (Ikeno et al. 1998; Masumoto et al. 1998; Tsuduki et al. 2006) (Fig. 1c bottom).

In all cases, the introduced alphoid HOR DNA multimerized and formed stable mega-base-sized HACs (Ikeno et al. 1998; Mejia et al. 2001; Grimes et al. 2002). Telomere sequences were not required for HAC formation when the input DNA was circular (Ebersole et al. 2000). In that case, the circular input DNA formed circular HACs by multimerization of the input DNA. Although HAC DNA exhibits normally controlled replication coordinated with the cell cycle, no requirement for specific origin of replication sequences has been identified in these experiments.

Thus, although YAC and HAC formation with de novo functional centromere assembly from introduced naked centromeric DNAs has occurred both in yeast and human cells, crucial differences exist. HAC formation is always accompanied by multimerization of input DNA, and the HAC acquires regulated replication without defined origin sequences. HAC formation is not the only fate of the input alphoid HOR DNA. The other fate is integration into the host chromosomes as arrays that lack centromere activity. One explanation is that a larger chromosome size is required for stable mitotic maintenance as a HAC. Such a size effect is also observed for YAC stability (Murray and Szostak 1983; Hahnenberger et al. 1989) and for truncated human X chromosomes in DT40 cells (Mills et al. 1999). A second possible explanation is that multiple internal initiations of replication may be necessary. Recent analysis has shown that alphoid DNAs have the potential to initiate replication. Endogenous alphoid HOR DNA is a mega-base homogeneous repeat locus, and this size is much larger than the conceivable distance for the progression of a single replication fork. In addition, the selectable marker gene (Bsr) on the HAC vector arm works efficiently as a replication origin sequence on the HAC (Erliandri et al. 2014). But why does the same input alphoid HOR DNA undergo different fates in human cells? This interesting point for the epigenetic regulation of repetitive DNA with chromatin modification is discussed later in this review.

Alphoid HOR DNA with CENP-B boxes is required for de novo centromere formation

To investigate the importance of alphoid HOR DNA and CENP-B boxes in HAC formation and de novo centromere assembly, point mutations that abolish CENP-B binding were introduced into all CENP-B boxes in a chromosome 21 alphoid HOR DNA array, which was then introduced into human cells. No HAC formation occurred when all CENP-B boxes were mutated, and CENP-A assembly was diminished on the input mutant alphoid HOR DNA (Ohzeki et al. 2002; Basu et al. 2005). Thus, the CENP-B box is required for de novo CENP-A assembly and HAC formation. However, neither HAC formation nor CENP-A assembly was observed using a synthetic repeat DNA composed of a CENP-B box and a pBR-based non-alphoid GC-rich DNA fragment (Ohzeki et al. 2002). Thus, some aspect of alphoid HOR DNA, in addition to the CENP-B box, is required for CENP-A assembly and HAC formation. The requirement for CENP-B boxes appears at first paradoxical, as the CENP-B gene is not essential for life in the mouse, possibly because de novo centromere formation does not normally occur outside laboratory conditions.

The CENP-B box in the chromosome 21 HOR unit appears once per alphoid dimer sequence (340 bp). When the density of the CENP-B box was reduced to once per 11mer from the HOR unit by nucleotide substitution, CENP-A assembly decreased and the HAC formation activity was lost (Okamoto et al. 2007). In contrast, artificially increasing the CENP-B box density in the chromosome 17 HOR (to once per alphoid monomer) increased the efficiency of HAC formation (Basu et al. 2005). These results indicate that the CENP-B binding somehow attracts CENP-A assembly in a quantitative manner. The CENP-A density may be a threshold for further centromere/kinetochore assembly.

HACs are multimers of input alphoid DNA, but a minimum length of the input alphoid DNA is also important for HAC centromere formation. When a series of YAC and BAC vectors containing several different lengths of the alphoid HOR DNA were created, efficient HAC formation was observed only when the alphoid HOR DNA was more than 30 kb in length. Shorter arrays of 10 kb did not form HACs with stable, functional centromeres, even though the 10-kb array had CENP-A assembly activity (Okamoto et al. 2007). On the other hand, extension of the alphoid HOR DNA array to 240 kb did not significantly improve the frequency of HAC formation. Continuous regional occupancy of CENP-A chromatin of at least 30 kb in length without interruption by vector DNA may also be a requirement for functional centromere assembly.
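The density and length figures above can be put side by side with some simple arithmetic. The following is a minimal sketch, assuming an idealized array of perfectly regular 171-bp monomers; the function name `boxes_in_array` and the dictionary labels are hypothetical, and real HOR arrays are less regular than this. It only counts how many CENP-B boxes would fit in 10-, 30-, and 240-kb arrays at each of the three box spacings discussed.

```python
# Idealized arithmetic only: real alphoid HOR arrays are not perfectly regular.
MONOMER_BP = 171  # approximate alphoid monomer length quoted in the text

# CENP-B box spacing, in monomers per box, for the three cases discussed.
SPACING_MONOMERS = {
    "one box per monomer (modified chr17 HOR)": 1,
    "one box per dimer (chr21 HOR)": 2,
    "one box per 11mer (reduced-density mutant)": 11,
}


def boxes_in_array(array_kb: float, monomers_per_box: int) -> int:
    """Whole number of CENP-B boxes that fit in an array of the given length."""
    bp_per_box = MONOMER_BP * monomers_per_box
    return int(array_kb * 1000) // bp_per_box


if __name__ == "__main__":
    for label, spacing in SPACING_MONOMERS.items():
        counts = {kb: boxes_in_array(kb, spacing) for kb in (10, 30, 240)}
        print(label, counts)
```

Under these assumptions, a 30-kb array holds 87 boxes at one box per dimer but only 15 at one box per 11mer, which gives a sense of how sharply the density reduction cuts the number of potential CENP-B binding sites.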

Involvement of CENP-B in de novo CENP-A and heterochromatin assembly

CENP-B protein is a key factor in de novo HAC formation. CENP-B has homology to transposases and is conserved at least between S. pombe and mammals, although its function is divergent among species. Three CENP-B homologs exist in S. pombe, and their mutants are implicated in heterochromatin formation at centromere repeats and transposons (Nakagawa et al. 2002; Cam et al. 2008).

Striking results were reported from three laboratories: CENP-B gene knockout mice are viable, centromere/kinetochore function is maintained, and chromosomes are segregated without CENP-B (Hudson et al. 1998; Perez-Castro et al. 1998; Kapoor et al. 1998). Thus, either CENP-B is not required for maintenance of an established centromere or its function is redundant. To further test the requirement for CENP-B in de novo centromere formation, human alphoid HOR DNA was transfected into mouse embryonic fibroblasts (MEFs), and de novo CENP-A assembly and artificial chromosome formation occurred (Okada et al. 2007). This de novo CENP-A assembly activity was lost when the CENP-B gene was knocked out. Significantly, CENP-A assembly on the introduced alphoid HOR DNA recovered when the CENP-B amino-terminal domain was expressed in the knockout MEFs in an add-back experiment (Okada et al. 2007). Thus, CENP-B protein contributes to de novo assembly of CENP-A on the input alphoid DNA. Surprisingly, add-back of full-length CENP-B induced strong heterochromatic modification (H3K9me3) at sites of ectopic integration of the input alphoid HOR DNA, and CENP-A assembly was suppressed (Okada et al. 2007). These observations suggest that CENP-B exerts a dual antagonistic role on centromeric satellite DNA, balancing de novo centromere assembly and heterochromatin-induced inactivation depending on the overall chromatin context of the surrounding environment (Fig. 1d).

Interactions between CENP-A, CENP-B, and CENP-C may explain why CENP-B is required for de novo centromere assembly but not for maintenance of established centromeres. All three proteins co-immunoprecipitate with alphoid DNA (Ando et al. 2002; Suzuki et al. 2004) and they interact with one another. For example, CENP-B interacts with CENP-C in yeast two-hybrid analysis (Suzuki et al. 2004). The situation with CENP-A is more complex. The highly conserved H3-like histone-fold domain of CENP-A is flanked by unique N-terminal and C-terminal residues (Black and Cleveland 2011). CENP-C binds the C-terminal residues of nucleosomal CENP-A (Carroll et al. 2010). Recently, a functional interaction between CENP-B and the CENP-A N-terminal tail domain was reported (Fachinetti et al. 2013).

Knockout of CENP-A is lethal (Oegema et al. 2001; Régnier et al. 2005). CENP-A is a relatively stable protein, and after its expression is shut off, pre-existing CENP-A persists at the centromeres and only gradually disappears, possibly by replicative dilution. These cells maintain kinetochore function for a few cell cycles, but then rapidly lose all kinetochore protein assembly when the CENP-A level reaches a lower threshold (Fachinetti et al. 2013). Expression of exogenous CENP-A can rescue this kinetochore loss; either the N-terminal tail or the C-terminal residues are required for suppression of this kinetochore loss. CENP-A lacking the N-terminal tail rescues kinetochore assembly, probably through CENP-C interactions, but in these cells CENP-B levels at centromeres fall by about a half. Consistently, CENP-A lacking the C-terminal residues rescues kinetochore assembly, but when CENP-B is depleted by RNAi, chromosome segregation errors increase dramatically (Fachinetti et al. 2013).
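The replicative dilution idea can be illustrated with a short calculation. The sketch below assumes the simplest possible model, in which the centromeric CENP-A pool is exactly halved at each division and no new CENP-A is deposited or actively removed. The function names and the 5 % threshold used in the usage note are hypothetical illustrations; the text does not give a numerical threshold.

```python
def cenpa_fraction(cycles: int) -> float:
    """Fraction of the starting CENP-A left after `cycles` divisions.

    Assumes pure halving per division with no new deposition (a simplification).
    """
    return 0.5 ** cycles


def cycles_until_below(threshold: float) -> int:
    """Smallest number of divisions after which the fraction is below `threshold`."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be between 0 and 1")
    cycles = 0
    while cenpa_fraction(cycles) >= threshold:
        cycles += 1
    return cycles


if __name__ == "__main__":
    # 0.05 is an arbitrary illustrative threshold, not a measured value.
    print(cycles_until_below(0.05))
```

With the illustrative 5 % threshold, the pool falls below the threshold after five divisions under this model, which is in line with the qualitative observation that kinetochore function persists for a few cell cycles before collapsing.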

These results suggest two possible kinetochore recruiting pathways: one leading from the CENP-A C-terminal tail to CENP-C, and the other from the CENP-A N-terminal tail to CENP-B. It is thought that CENP-B recruits CENP-C or other centromere proteins (Fig. 1e). These two possible kinetochore recruiting pathways provide an answer for why a protein required for de novo centromere and HAC formation on alphoid HOR DNA is not required for the function and the maintenance of the established centromere itself. Once assembled, the centromere maintains its function as the CENP-A C-terminal tail interacts with CENP-C without CENP-B. However, during de novo formation, kinetochore and centromere function require a strong link to the alphoid HOR DNA, which is provided by CENP-B binding to the CENP-B box and the subsequent interaction of CENP-B with the CENP-A N-terminal tail (Masumoto et al. 2004; Okada et al. 2007; Fachinetti et al. 2013).

The evolutionarily conserved absence of the CENP-B box from Y chromosome centromeres may reveal as-yet undiscovered aspects of the regulation of this sex-determining chromosome. In addition, CENP-B does not always interact with CENP-A and CENP-C, especially in heterochromatic regions of the inner centromere. How such interactions are controlled is an intriguing question for future study.

Epigenetic regulation at the centromere: neocentromere and dicentric inactivation

Although de novo centromere assembly and HAC formation under experimental conditions are highly dependent on the DNA sequence, in patients, rare examples can be found where the natural centromere is damaged or lost, and a new centromere (“neocentromere”) has formed on a chromosomal arm (du Sart et al. 1997; Fukagawa and Earnshaw 2014). Neocentromeres form on a wide range of DNA sequences and lack alphoid DNA or CENP-B. Nonetheless, CENP-A chromatin and functional kinetochores assemble and are maintained epigenetically at neocentromeres (Alonso et al. 2007). This is strong evidence that centromeres are determined epigenetically and not simply by the existence of the underlying DNA sequence. Further evidence supporting this was the finding that DNA cloned from a human neocentromere was not able to seed de novo HAC formation in HT1080 cells (Saffery et al. 2001).

Experimental formation of neocentromeres has also been demonstrated following deletion of the endogenous centromeres in S. pombe, Neurospora, Candida, and chicken DT40 cells. Deletion of the centromere using Cre-loxP recombination in S. pombe and DT40 cells yielded neocentromeres with low efficiency (frequency of approximately 10⁻⁴ and 10⁻⁶, respectively; Ishii et al. 2008; Shang et al. 2013).

Another epigenetic centromere event is dicentric inactivation. Typically, dicentric chromosomes resulting from breakage or other mechanisms are highly unstable. This is because the two centromeres of the resulting chromosome can attach to opposite spindle poles, resulting in the chromosome being stretched and breaking at anaphase (McClintock 1941; Stimpson et al. 2012). The broken chromosome end promotes chromosome fusion during repair, which generates a new dicentric chromosome and perpetuates a vicious cycle. Rarely, the dicentric chromosome is stabilized by inactivating one of the two centromeres (Distèche et al. 1972). Therefore, centromere inactivation is important for maintaining dicentric chromosome integrity after rearrangement. The known dicentric inactivation mechanisms are DNA deletion and epigenetic silencing of the centromere (Earnshaw and Migeon 1985; Merry et al. 1985; Earnshaw et al. 1989; Stimpson et al. 2012).

It is not yet known how CENP-A assembly is nucleated at the neocentromere, acquires and maintains a functional size, or how the dicentric centromere is inactivated epigenetically. Understanding such epigenetic centromere-regulating mechanisms may help to elucidate why and how species have different and divergent centromere DNA sequences.

Heterochromatin negatively regulates CENP-A and kinetochore assembly

Integration of input alphoid HOR DNA into ectopic chromosomal sites is another common fate of the transfected DNA during HAC formation assays. CENP-A and kinetochore assembly is generally suppressed at the ectopic sites, even though the same input alphoid HOR DNA has the capacity to undergo de novo centromere assembly and HAC formation in other cells in the same experiment. These ectopic integration sites may be a good model to understand epigenetic inactivation of centromere DNA.

A common chromatin modification observed at ectopic alphoid HOR integration sites is the constitutive heterochromatic modification H3K9me3 (Nakano et al. 2003, 2008; Nakashima et al. 2005; Okamoto et al. 2007; Okada et al. 2007). Interestingly, in normal human embryonic fibroblasts, centromeric H3K9me3 modification and CENP-B levels increase during the cellular senescence process, while at the same time, levels of CENP-A decrease at the centromeres (Maehara et al. 2010). This could suggest that increasing levels of H3K9me3 might inactivate centromere function and inhibit CENP-A assembly.

This possibility could be tested using the alphoidtetO-HAC with a conditional centromere, which was developed by a collaboration of three laboratories. It is based on synthetic alphoid dimer HOR repeats in which one monomer contains a CENP-B box while the adjacent monomer contains a tetracycline operator (tetO) at the corresponding position in the alphoid DNA sequence (Fig. 2). Any desired tet repressor (tetR) fusion protein can then be tethered at this tetO site. For example, the tet transcriptional silencer (tTS) is a tetR fusion protein containing the KRAB-AB silencing domain of the Kid1 protein (SDKid1). Tethering of the tTS to the tetO sites of the alphoidtetO-HAC recruits a repressive complex including the H3K9 methyltransferase SETDB1 through protein–protein interactions. Thus, tethering the tTS to the alphoidtetO-HAC causes an increase in the heterochromatic H3K9me3 modification at centromeres, accompanied by a decrease in CENP-A assembly, and eventual missegregation of the HAC (Nakano et al. 2008). A subsequent study revealed that tethering the full KAP1 protein reduces other centromere chromatin protein assembly in the order CENP-H → CENP-C → CENP-A (Cardinale et al. 2009). Thus, heterochromatin-associated modifications are involved in hierarchical negative regulation of centromere assembly (Bergmann et al. 2012b).

Fig. 2

Models of de novo centromere formation and epigenetic maintenance. Top, schematic diagram of de novo centromere formation and chromatin assembly balance. CENP-B nucleates de novo CENP-A assembly in a quantitative manner on input alphoid DNA. CENP-B or CENP-A also interacts with CENP-C, which may enhance centromere protein assembly. H3K9 trimethylase not only inhibits new CENP-A assembly but also disrupts CENP-C assembly. Middle, once established, centromere chromatin is maintained with epigenetic mechanisms. CCAN protein assembly including CENP-TWSX and CENP-C is a platform for further mitotic kinetochore protein assembly during metaphase. CENP-A chromatin is replenished in the G1 phase (Jansen et al. 2007). CENP-C interacts with M18BP1 protein, a subunit of the human Mis18 protein complex (Moree et al. 2011; Dambacher et al. 2012). The Mis18 complex may recruit the CENP-A deposition factor HJURP. CENP-A replenishment requires its conserved domain, the CENP-A targeting domain (CATD). During S-phase, the CENP-A density would be diluted by DNA replication. Bottom, artificial tethering of HAT and/or Suv39 as tetR fusion proteins opens the way to manipulate the assembly balance of the centromere chromatin on the alphoidtetO DNA

In mouse pericentromeric constitutive heterochromatin, the H3K9me3 modification is introduced by Suv39h1 and Suv39h2 (Peters et al. 2001). In MEFs with Suv39h1/2 knocked out, de novo CENP-A assembly on the transfected alphoid DNA is enhanced (Okada et al. 2007). Similarly, Suv39h1 depletion by siRNA in HeLa cells results in a low, but significant, level of CENP-A assembly at ectopically integrated alphoid HOR DNA (Ohzeki et al. 2012). Thus, the Suv39h1 H3K9 trimethylase and heterochromatin appear to be involved in an intrinsic centromere inactivation mechanism.

Acetylation positively regulates CENP-A assembly

Depletion of Suv39h1 by siRNA in HeLa cells was not sufficient to induce functional kinetochore assembly at the ectopic alphoid DNA integration sites. Thus, depletion of one of the factors repressing CENP-A assembly was apparently not enough to induce de novo functional centromere formation. It appeared that some positive regulation would be necessary.

In HT1080 cells, de novo centromere assembly and HAC formation is relatively efficient. However, neither de novo HAC formation nor stable CENP-A assembly on exogenous alphoid DNA occurs in many other commonly used human cell lines, including TIG7, U2OS, and HeLa cells. In exploring this phenomenon, we noticed that in HT1080 cells, the level of heterochromatic H3K9me3 at endogenous alphoid DNA loci is relatively low and histone H3K9 acetylation (H3K9ac) is increased. In contrast, in TIG7, U2OS, and HeLa cells, higher levels of H3K9me3 modification are observed at endogenous centromeric alphoid DNA loci, and H3K9ac is low (Ohzeki et al. 2012).

The importance of acetylation for de novo centromere assembly was subsequently demonstrated in HeLa cells. Tethering the histone acetyltransferase (HAT) domain of p300 or PCAF to an ectopically integrated alphoidtetO DNA array containing high levels of H3K9me3 induced de novo hyper-assembly of CENP-A (Ohzeki et al. 2012). This HAT-assisted CENP-A hyper-assembly occurred across the entire ∼5 Mb region of the ectopically integrated alphoidtetO array within 2 h during G1 phase. Later, during mitosis, de novo kinetochore proteins were observed to assemble on the array and interact with bundles of spindle microtubules.

De novo HAC formation has never been observed in HeLa or U2OS cells following simple transfection of input alphoidtetO DNA. Strikingly, transient tethering of the HAT domain of p300 or PCAF to the input alphoidtetO DNA following transfection did permit de novo formation of HACs with functional centromeres. It therefore appears that even transient HAT tethering can break an initial barrier (presumably due to high H3K9me3 levels at centromeres) to de novo centromere establishment on input alphoid DNA in HeLa cells. Once formed, these HACs were maintained stably without further assistance from the tethered HAT, even under strong centromeric heterochromatin pressure in HeLa cells (Ohzeki et al. 2012) (Fig. 2). Thus, once established, the centromere itself appears to acquire a system to overcome such strong heterochromatin pressure.

Involvement of HAT activity in the pathway depositing newly synthesized CENP-A at centromeres has been suggested in several reports. hMis18α is a subunit of the human Mis18 complex (Fujita et al. 2007), which assembles at centromeres only from telophase to G1. This complex functions upstream of the CENP-A/histone H4 chaperone, HJURP, in CENP-A targeting (Dunleavy et al. 2009; Foltz et al. 2009). New CENP-A deposition is lost following hMis18α depletion, but hyperacetylation induced by treatment with the histone deacetylase inhibitor TSA can suppress this hMis18α depletion phenotype (Fujita et al. 2007). Indeed, during mitotic exit, a brief pulse of histone acetylation can be detected on centromeric alphoid DNA (Ohzeki et al. 2012). The HATs PCAF and p300 (Craig et al. 2003) and MYST-HAT family complexes (Ohta et al. 2010) associate with endogenous centromeres. Intrinsic HAT activity may therefore cooperate with other factors during centromere assembly or functional maintenance.

Tethering of the CENP-A deposition factors from the Mis18 complex or HJURP can also induce CENP-A targeting and subsequent kinetochore assembly on the alphoidtetO DNA (Ohzeki et al. 2012). Indeed, tethering of HJURP can deposit CENP-A and induce subsequent kinetochore assembly, even on Lac operator (LacO) repeats in the absence of alphoid DNA sequence (Barnhart et al. 2011). CENP-A assembly on such LacO repeats can maintain centromere function without the tethering of additional initial factors (Hori et al. 2013). Established CENP-A chromatin is both a platform for kinetochore assembly and an epigenetic landmark for centromere maintenance.

Consistent with this view, the assembly of CENP-A chromatin can be bypassed if two proteins that link the inner and outer (microtubule-binding) elements of the kinetochore (CENP-T, and part of CENP-C) are tethered to ectopic arrays in vertebrate cells (Gascoigne et al. 2011; Hori et al. 2013). These synthetic kinetochores appear to be fully functional without the assembly of CENP-A or other inner kinetochore (CCAN) components.

Other epigenetic modifications and transcription for centromere/kinetochore assembly

Transcription-related chromatin modifications are also involved in centromere/kinetochore assembly. A euchromatic modification, H3K4me2, is associated with transcriptionally competent chromatin. H3K4me2 is also observed between blocks of CENP-A chromatin at endogenous centromeres (Sullivan and Karpen 2004). Tethering of LSD1, an H3K4me2 demethylase, on the tetO-HAC reduces assembly of HJURP, with a consequent loss of new CENP-A assembly (Bergmann et al. 2011).

Monomethylation on H4K20 (H4K20me1) is also associated with both transcriptional activity and repression. H4K20me1 is specifically associated with CENP-A nucleosomes, both in endogenous centromeres in HeLa and DT40 cells and at neocentromeres in the latter (Hori et al. 2014). This modification appears to somehow license CENP-A chromatin for subsequent kinetochore assembly, and the assembly of centromere proteins CENP-H and CENP-T is diminished when PHF8, a demethylase for H4K20me1, is tethered to endogenous centromeres by fusion to CENP-U in human cells (Hori et al. 2014).

Surprisingly, elongating RNA polymerase II has been detected at centromeres in metaphase, where its transcriptional activity may be important for CENP-C assembly (Chan and Wong 2012). The level of this transcriptional activity must be finely balanced, however. On the alphoidtetO-HAC centromere, induction of strong transcriptional activity by the tethering of VP16, a strong transcriptional activator, not only blocks new CENP-A assembly but also strips the centromere of pre-existing CENP-A. In this experiment, VP16 raised centromeric transcript levels by >100-fold, to the level of a housekeeping gene. In this context, the HAC was rapidly lost due to failure of mitotic segregation. In contrast, weak (∼10×) transcriptional activation of alphoid arrays caused by tethering the activation domain of NF-κB p65 is compatible with CENP-A assembly (Bergmann et al. 2012a). H3K9me3 and acetylation are also involved in transcriptional regulation. It is not clear whether weak transcription is required for CENP-A assembly. However, differential deposition of histone H3.3 or H3.1 is indeed coupled with the transcription or replication machinery, respectively (Tagami et al. 2004).

Other transcription-related factors, such as FACT or RSF1, also co-purify with CENP-A chromatin and are involved in centromere regulation. RSF1 appears to be required for stable incorporation of CENP-A into chromatin (Perpelescu et al. 2009). Tethering of RSF also recruits CENP-S and CENP-X (Helfricht et al. 2013). FACT in yeast cells prevents ectopic CENP-A assembly through a ubiquitin-related degradation pathway (Choi et al. 2012; Deyter and Biggins 2014).

Balance between centromere nucleation and suppression

The size of the CENP-A-bound region of alphoid DNA has been suggested to be 30–100 kb (Okamoto et al. 2007; Alonso et al. 2007; Shang et al. 2013), which is much smaller than the type I alphoid HOR. Thus, endogenous centromeres and HAC centromeres derived from a simple alphoid HOR DNA have a large number of non-CENP-A chromatin domains (Grimes et al. 2004; Okamoto et al. 2007), most of which are heterochromatin.

What is the role of heterochromatin in HAC formation? Heterochromatin assembly may act to enhance cohesion or to prevent CENP-A chromatin assembly from extending across the whole of a large repetitive DNA array. When heterochromatin is disrupted during HAC formation by embedding an additional promoter in the vector arm, de novo centromere assembly remains intact, but HAC formation is lost, presumably due to a loss of cohesin enrichment through HP1 binding (Nakashima et al. 2005). Another role of heterochromatin may be to control kinetochore size. Forced introduction of H3K9me3 via the tethering of Suv39h1 on the alphoidtetO-HAC centromere disrupts or cancels centromere protein assembly. In contrast, forced introduction of acetylation by tethering HATs results in extensive CENP-A chromatin and kinetochore assembly across an entire ∼5 Mb alphoidtetO DNA integration site (Ohzeki et al. 2012) (Fig. 2). Such a large kinetochore cannot satisfy the mitotic checkpoint, probably due to spindle abnormalities or formation of functional dicentric chromosomes. A balance between centromere nucleation and suppression by heterochromatin therefore appears to be important for controlling either kinetochore size or assembly.

S. pombe has tRNA genes, which act as barrier elements at the DNA level, between heterochromatin and centromeres. Deletion of these genes induces heterochromatin invasion into the centromere domain, and this results in meiotic chromosomal instability (Scott et al. 2006). However, in humans and many other species, centromere assembly occurs on homogeneous, simple, repetitive DNAs. Maintaining a mechanism that balances centromere nucleation against suppression by heterochromatin is therefore important. Centromere/kinetochore size can be adjusted by artificially embedding a boundary element between two different synthetic alphoid HOR DNAs. Boundary elements that function in this way include human HS4, human gamma satellite DNA, and tRNA genes, all of which prevent heterochromatin from spreading to adjacent marker genes (Kim et al. 2009; Ebersole et al. 2011). Development of conditional protein tethering to pericentromeric heterochromatin using combined tetO and/or LacO systems may increase our understanding of the interplay between centromeric chromatin and pericentromeric heterochromatin.

Concluding remarks

HAC formation is a powerful tool for understanding how chromosomes form and are organized. However, there is not a simple 1:1 correlation between input DNA and the generated HAC. More technical advancements and understanding of principal events that occur during dynamic HAC formation in vivo are necessary.

With the HAC formation assay, knowledge about centromere chromatin and heterochromatin has been accumulating. HACs can also be used for studies of replication, telomeres, and other chromosomal functions (Weuts et al. 2012; Wakai et al. 2014). The meiotic behavior of artificial mammalian chromosomes is also of interest.

HACs are also potentially useful as gene delivery vectors for transgenic animals and a variety of cells (Ikeno et al. 2009; Hasegawa et al. 2014). Genome editing technologies are rapidly advancing, and any genomic locus could be loaded into a HAC via direct cloning through homologous recombination. In particular, the conditional alphoidtetO-HAC with a controllable kinetochore (Nakano et al. 2008; Kim et al. 2011; Kononenko et al. 2014) is suited to hit-and-run gene expression, such as that needed during induction of pluripotency or direct reprogramming of cells (Hiratsuka et al. 2011). Advancement of chromosome transfer technology will accelerate such HAC applications.