Abstract
Centromeres can arise de novo from non-centromeric regions; such centromeres are often called “neocentromeres.” Neocentromere formation provides the best evidence for the concept that centromere function is not determined by the underlying DNA sequences, but controlled by poorly understood epigenetic mechanisms. Numerous neocentromeres have been reported in several plant and animal species. However, it has remained unclear how and why a specific chromosomal region is chosen to become a new centromere during neocentromere activation. We report recurrent establishment of neocentromeres in a pericentromeric region of chromosome 3 in maize (Zea mays). This latent region is located in the short arm and is only 2 Mb away from the centromere (Cen3) of chromosome 3. At least three independent neocentromere activation events, which were likely induced by different mechanisms, occurred within this latent region. We mapped the binding domains of CENH3, the centromere-specific H3 histone variant, of the three neocentromeres and analyzed the genomic and epigenomic features associated with Cen3, the de novo centromeres and an inactivated centromere derived from an ancestral chromosome. Our results indicate that lack of genes and transcription and a relatively high level of DNA methylation in this pericentromeric region may provide a favorable chromatin environment for neocentromere activation.
Background
Centromeres play a key role in chromosomal segregation and transmission. Centromeric chromatin is marked by the presence of CENH3 (CENP-A in humans), a centromere-specific H3 variant. In most eukaryotes, centromeres contain highly repetitive DNA sequences, mostly satellite repeats and transposable elements (Henikoff et al. 2001; Jiang et al. 2003). The repetitive DNA sequences, however, are not essential for centromere function because centromeres, often called “neocentromeres,” can be activated de novo in genomic regions devoid of centromeric repeats. Neocentromeres were first discovered in humans (Voullaire et al. 1993) and have since been reported in a number of plant and animal species (Fu et al. 2013; Liu et al. 2015; Marshall et al. 2008; Nasuda et al. 2005; Rocchi et al. 2012; Sullivan and Schwartz 1995; Tolomeo et al. 2017; Topp et al. 2009). Neocentromeres are functional centromeres and are marked by the presence of CENH3/CENP-A as well as all other essential centromeric proteins (Saffery et al. 2000; Scott and Sullivan 2014). It is hypothesized that repeat-based centromeres evolved from neocentromeres via invasion of repetitive DNA sequences (Gong et al. 2012; Zhang et al. 2014).
More than 100 neocentromeres have been reported in humans (Marshall et al. 2008; Scott and Sullivan 2014). Most human neocentromeres were associated with significantly rearranged “marker chromosomes” discovered in human clinical cases. However, a few human neocentromeres were found on normal chromosomes that also contained an inactivated native centromere (Marshall et al. 2008; Rocchi et al. 2012); such cases are also described as “centromere repositioning” events. Some genomic regions are particularly prone to human neocentromere activation (Marshall et al. 2008), suggesting that neocentromere seeding is not a random process, yet it is not clear what genomic or epigenomic feature(s), if any, of the cognate chromosomal regions determine the seeding of neocentromeres. Interestingly, some human neocentromeres are located in regions that are predicted to be ancestral centromeres. These ancestral centromeres were inactivated during mammalian chromosomal evolution (Rocchi et al. 2012). Thus, regions associated with inactivated centromeres may maintain intrinsic genomic/epigenomic features that are favorable for neocentromere formation.
Neocentromeres can be recovered when the native centromere of a eukaryotic chromosome is conditionally deleted. Artificial neocentromere activation experiments were conducted in several model species, including Schizosaccharomyces pombe (Ishii et al. 2008), Candida albicans (Ketel et al. 2009; Thakur and Sanyal 2013), and chicken (Gallus gallus) (Shang et al. 2013). Over 100 neocentromeres were induced at various locations on chicken chromosomes Z and 5, but no specific DNA sequences or motifs were associated with these neocentromeres (Shang et al. 2013). Induced neocentromeres on chromosome I of S. pombe were exclusively located near the ends of the chromosome, which represent the most distinct heterochromatic domains on the chromosome (Ishii et al. 2008). By contrast, in C. albicans and chicken, induced neocentromeres were mostly formed in close proximity to the native centromeres (Shang et al. 2013; Thakur and Sanyal 2013). Thus, results from induced neocentromeres in three different model species suggest that certain chromosomal regions have intrinsic genomic/epigenomic properties for neocentromere seeding.
Neocentromeres have been documented in several plant species, including barley (Nasuda et al. 2005), rice (Gong et al. 2009), oat (Topp et al. 2009), maize (Fu et al. 2013; Liu et al. 2015; Zhang et al. 2013), and wheat (Guo et al. 2016). Centromere repositioning events were also reported in plants (Han et al. 2009; Wang et al. 2014). Several centromeres of maize may have spontaneously shifted positions in different genetic backgrounds (Schneider et al. 2016). However, the sequence or epigenomic features that promote or sustain centromere movement are not known. Maize chromosome 3, including its centromere (Cen3), has been well sequenced (Jiao et al. 2017; Schnable et al. 2009). Evolutionarily, maize chromosome 3 was formed by fusion of two ancestral chromosomes and contains an inactivated ancestral centromere (Wang and Bennetzen 2012; Wei et al. 2007). Two independent neocentromeres associated with chromosome 3 were reported (Schneider et al. 2016; Wang et al. 2014). Here, we report another neocentromere associated with chromosome 3. Interestingly, all three neocentromeres were located within a specific chromosomal region that is located very close to the progenitor Cen3. We demonstrate that lack of genes and transcription and a relatively high level of DNA methylation in this region may provide a favorable chromatin environment for neocentromere activation.
Results
The origin of the centromere of maize chromosome 3
Maize (2n = 2x = 20) and sorghum (2n = 2x = 20) split from a common ancestor approximately 12 million years ago (Swigonova et al. 2004). The maize genome initially contained 40 chromosomes after a genome-wide duplication event and underwent dramatic intra- and inter-chromosomal rearrangements, which resulted in the current karyotype with 20 chromosomes (Paterson et al. 2004; Wei et al. 2007). Reconstruction of maize progenitor chromosomes based on physical mapping and genome sequence analysis showed that the sorghum genome has maintained its original chromosome number and structure (Schnable et al. 2011; Wei et al. 2007). Thus, sorghum chromosomes are considered to maintain their synteny with the “ancient chromosomes” that predate the divergence of maize and sorghum. Comparative sequence analysis between maize and sorghum genomes indicated that maize chromosome 3 was derived from two ancient chromosomes, which are homologous to sorghum chromosomes 3 and 8, respectively (Wang and Bennetzen 2012) (Fig. 1a). Ancient chromosome 8 (achro. 8) inserted in the pericentromeric region of ancient chromosome 3 (achro. 3) and the fused chromosome underwent several intrachromosomal rearrangements, which resulted in the current maize chromosome 3 (Wang and Bennetzen 2012) (Fig. 1b).
The DNA composition of maize chromosome 3 (version 4 of B73 maize reference genome (Jiao et al. 2017)) can be revealed by syntenic analysis with sorghum chromosomes 3 and 8 (Fig. 1a). Based on this analysis, the short arm of achro. 3 (block “a” in Fig. 1) was retained nearly intact and represented the short arm of maize chromosome 3 (Fig. 1b). Achro. 8 became scrambled after the fusion. However, seven syntenic blocks (“1–7” in Fig. 1) derived from achro. 8 can be identified. The centromere of achro. 8 is predicted to be located between blocks 3 and 4 based on the centromere position on sorghum chromosome 8 (Fig. 1b). These two blocks have maintained the synteny between maize chromosome 3 and sorghum chromosome 8. These results suggest that the centromere of maize chromosome 3 (Cen3) was derived from achro. 3 and the centromere of achro. 8 was inactivated after the fusion of the two chromosomes (Fig. 1b).
Confirmation of the chromosomal position of Cen3
The CENH3-binding domain of Cen3 was previously mapped to 99.8–100.8 Mb on chromosome 3 (version 3 of the B73 reference genome) (Wang et al. 2014). However, this domain has recently been moved to 85.8–86.9 Mb on the version 4 pseudomolecule (http://ensembl.gramene.org/Zea_mays/Info/Index) (AGPv4 maize assembly). We sought to determine whether the version 3 or the version 4 pseudomolecule has the correct position of the CentC repeat, the landmark for maize centromeres (Zhong et al. 2002), on maize chromosome 3. We designed two single-copy DNA probes, “L” and “R” (Table S1). Both probes were predicted to be located on the short arm and were 17 and 5 Mb, respectively, from the CentC array within Cen3 in the version 3 genome (Fig. 2a). However, probe R was predicted to be located on the long arm and the two probes were 2 and 17 Mb, respectively, from the CentC array in version 4 (Fig. 2a). Fluorescence in situ hybridization (FISH) using L, R, and CentC probes revealed that the L probe resided on the short arm and closely flanked the CentC repeats. In contrast, probe R resided on the long arm and was far away from the CentC repeats (Fig. 2b). Thus, the cytological positions of these three probes matched well with their relative positions on the version 4 pseudomolecule.
DNA compositions and genes flanking Cen3 and an inactive centromere
We performed a detailed analysis of the gene content within and surrounding Cen3 and the inactivated centromere derived from achro. 8. Based on the conservation of homologous maize and sorghum gene pairs, the 63.7–88.4 Mb region of maize chromosome 3 is syntenic to 17.1–45.3 Mb of sorghum chromosome 3 (Fig. 3b). Similarly, the 88.8–103.1 Mb region of maize chromosome 3 is syntenic to 12–39.7 Mb of sorghum chromosome 8 (Fig. 3b). The fusion junction between achro. 3 and achro. 8 is predicted to be located between 88.4 and 88.8 Mb. For convenience, these two regions were named P1 and P2, respectively (Fig. 3a).
The sorghum genomic region corresponding to P1 spans the centromere of sorghum chromosome 3 (Fig. 3b). Well aligned maize-sorghum gene pairs were identified along both sides of the two Cen3s, although the order of the gene pairs became more disrupted toward the centromeres (Fig. 3b, Fig. S1). These results support the conclusion that maize Cen3 is orthologous to sorghum Cen3.
The centromere position of achro. 8 is located between synteny blocks “3” and “4” (Fig. 1b). Thus, P2 is predicted to contain an inactivated centromere derived from achro. 8 (Fig. 3a). P2 and its corresponding genomic region of sorghum chromosome 8 contained a number of well conserved gene pairs (Fig. 3b). Interestingly, the order of these genes was reversed in the two genomes, suggesting that an inversion spanning the ancient Cen8 occurred during evolution. P2 included 14.3 Mb of DNA in maize. Strikingly, its corresponding sorghum genomic region included 27.7 Mb of DNA (Fig. 3b). These results suggested that the P2 region may have lost most of the repetitive DNA originally associated with the ancient Cen8. By contrast, P1 and its corresponding sorghum genomic region, which spans sorghum Cen3, included 24.7 and 28.2 Mb, respectively (Fig. 3b). Collectively, these results support the conclusion that maize Cen3 represents ancient Cen3 and that ancient Cen8 was inactivated and lost the cognate centromeric repeats during the evolution of maize chromosome 3 over the last 12 million years.
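The size comparison above can be checked with simple arithmetic on the coordinates reported for P1 and P2. The following Python sketch recomputes the span of each block and the maize-to-sorghum size ratio. The function and variable names are illustrative only and are not part of the original analysis.

```python
# Coordinates (Mb) as reported in the text; all names here are illustrative.
BLOCKS = {
    "P1": {"maize": (63.7, 88.4), "sorghum": (17.1, 45.3)},
    "P2": {"maize": (88.8, 103.1), "sorghum": (12.0, 39.7)},
}


def span_mb(interval):
    """Length of a (start, end) interval in Mb, rounded to 0.1 Mb."""
    start, end = interval
    return round(end - start, 1)


def size_ratio(block):
    """Maize span divided by the span of the syntenic sorghum region."""
    return span_mb(block["maize"]) / span_mb(block["sorghum"])


for name, block in BLOCKS.items():
    print(name, span_mb(block["maize"]), span_mb(block["sorghum"]),
          round(size_ratio(block), 2))
```

With these coordinates, P2 is roughly half the size of its sorghum counterpart, whereas P1 retains close to 90% of the corresponding sorghum span.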
Recurrent activation of a neocentromere in the pericentromeric region of chromosome 3
The CENH3-binding domain of maize Cen3 was mapped to 85.8–86.9 Mb (Fig. 4a). Maize chromosome 3 was transferred into the genetic background of oat through oat-maize cross and backcrosses (Kynast et al. 2001). Surprisingly, maize Cen3 in the oat-maize chromosome addition line OMA3.01 moved to a new position in the pericentromeric region toward the short arm (Topp et al. 2009; Wang et al. 2014) (Fig. 4b). The original maize Cen3 was inactivated although it still contained the CentC satellite repeats. The CENH3-binding domain of chromosome 3 moved from 85.8–86.9 to 79.6–84.7 Mb (Fig. 4b). The cause of this centromere repositioning event was not known. Interestingly, the CENH3-binding domains of all maize centromeres were expanded from ~ 1.8 to ~ 3.6 Mb in the oat background (Wang et al. 2014). It is likely that the expansion of Cen3 in oat was hindered by the presence of large transcription domains flanking the original CENH3-binding domain. Therefore, repositioning of Cen3 in the oat background became an alternative path to reach the expansion (Wang et al. 2014).
Recent mapping of centromere positions in 26 maize inbreds revealed a number of centromere repositioning events (Schneider et al. 2016). The position of the CENH3-binding domain of the Cen3 in most inbreds overlaps with that of B73. Interestingly, the CENH3-binding domain in one inbred, P39, was mapped to 80.6–82.8 Mb, which is approximately 3 Mb away from the Cen3 position in B73 (Fig. 4c). Strikingly, the profile of the CENH3-binding domain in P39 nearly perfectly overlaps with the center of the CENH3-binding domain in OMA3.01 (Fig. S2). The cause of Cen3 repositioning in P39 is not known. However, the authors suggested that many of the repositioning events were associated with the loss of CentC repeats (Schneider et al. 2016).
Two Cen3 repositioning events, which were likely induced by different mechanisms, resulted in formation of a neocentromere at the same location in P39 and OMA3.01. Intriguingly, P39 is an inbred sweet corn; and Seneca 60, which was the donor of maize chromosome 3 in the OMA3.01 line, was a hybrid sweet corn. We cannot exclude the possibility that P39 and Seneca 60 are related. Results from P39 and OMA3.01 suggest that maize chromosome 3 region around 80.6–82.8 Mb may provide an ideal chromatin environment for neocentromere activation.
Another independent neocentromere activation event
Searching of all previously published genome-wide CENH3 nucleosome mapping datasets in maize led to the discovery of another putative Cen3 repositioning event. Maize line Dp3a, originally derived from UV-treated materials (Stadler and Roman 1948), contains a small minichromosome derived from the long arm of chromosome 3. This small chromosome contains a de novo centromere over unique sequences that drives its transmission (Fu et al. 2013). Careful examination of data from chromatin immunoprecipitation (ChIP) followed by sequencing (ChIP-seq) from this line revealed that the CENH3-binding domain of Cen3 in the normal chromosome 3 was present at 83.3–85.3 Mb, immediately flanking the original Cen3 (Fig. 4d). This domain partially overlapped with the CENH3-binding domain of Cen3 in OMA3.01 (Fig. S2).
The minichromosome in the Dp3a line contains genes A1 and Sh2, which cause purple color of the aleurone layer and plump starchy kernels. Since the minichromosome has a low transmission rate, Dp3a with its minichromosome was maintained by crossing with recessive colorless, shrunken maintainer lines (a1, sh2), including ax-3, which contain a small deletion in this region. Thus, the minichromosome can be monitored based on morphological markers present on the kernels. We hypothesized that a repositioned Cen3 was present in maintainer lines. To test this hypothesis, we conducted a CENH3 ChIP-seq experiment using the most recent maintainer stock line ax-3. We generated 22 million paired-end ChIP-seq reads and mapped 6.5 million unique reads to the B73 genome. Interestingly, we detected two major CENH3-binding domains on chromosome 3 of ax-3. The first CENH3 domain overlapped with the Cen3 in B73. The second CENH3 domain was mapped to the same region as that of the Dp3a line (Fig. 4e).
The CENH3-binding domain of Cen3 in the Dp3a line was mapped exclusively to 83.3–85.3 Mb (Fig. 4d). This result indicated that both copies of Cen3 in this line are present at this site. We performed single nucleotide polymorphism (SNP) analysis using the ChIP-seq sequences from the Dp3a line. The sequences from the 83.3–85.3 Mb region were heterozygous. Phasing of the haplotypes revealed that one of the haplotypes is the same as that of the maintainer line ax-3, confirming that one of the repositioned Cen3s in the Dp3a line was received from ax-3. We then analyzed the SNPs using the ChIP-seq sequences, which were mapped to 83.3–85.3 Mb, from ax-3, OMA3.01, and P39. The DNA sequences associated with the second haplotype of Cen3s in the Dp3a line were identical to those from OMA3.01 and P39 (Fig. S3). Thus, the data indicate two repositioned Cen3s in the Dp3a material analyzed. The second repositioned Cen3 was potentially received from another parental line used in the pedigree of the Dp3a line.
Genomic and epigenomic features associated with Cen3, the de novo centromeres, and the inactivated centromere
The recurrent neocentromere activations in 79.6–85.3 Mb prompted us to investigate if this region contains unique genomic or epigenomic features that would be favorable for CENH3 deposition. For convenience, we refer to 79.6–85.3 Mb as the latent region and the two CENH3-binding domains of P39 (80.6–82.8 Mb) (Fig. 4c) and the Dp3a line (83.3–85.3 Mb) (Fig. 4d) as two subcenters within the latent region. We first analyzed the chromosomal distribution of the CentC repeat and the CRM1/CRM2 elements, which are associated with most maize centromeres (Ananiev et al. 1998; Jin et al. 2004; Zhong et al. 2002). Unambiguous CentC signals were only detected in the Cen3 in B73 (Fig. 2b). CRM2 represents a young CRM subfamily and the majority of the CRM2 elements were mapped to the B73-like Cen3 (Fig. 5a), which agreed with the analyses based on other maize centromeres (Wolfgruber et al. 2009). By contrast, CRM1 represents an old CRM subfamily and the CRM1 elements were more broadly distributed in the pericentromeric region (Fig. 5a, b). No CRM2 and only a few CRM1 elements were detected in the neocentromeres highlighted in dark blue (Fig. 5). Similarly, only a single CRM2 element and no CRM1 element were detected in the inactivated centromere highlighted in green (Fig. 5).
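The relationship between the reported CENH3-binding domains and the latent region can be made explicit with a small interval calculation. The Python sketch below uses the coordinates given in the text (converted to kb so that the arithmetic stays in integers). The helper names are hypothetical and chosen only for illustration.

```python
# CENH3-binding domains on chromosome 3 (AGPv4), in kb, taken from the text.
DOMAINS_KB = {
    "B73": (85800, 86900),
    "OMA3.01": (79600, 84700),
    "P39": (80600, 82800),
    "Dp3a": (83300, 85300),
}


def overlap_kb(a, b):
    """Length of the overlap between two (start, end) intervals; 0 if disjoint."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def union_span(intervals):
    """Smallest (start, end) interval that covers all given intervals."""
    starts, ends = zip(*intervals)
    return min(starts), max(ends)


neocentromeres = [DOMAINS_KB[k] for k in ("OMA3.01", "P39", "Dp3a")]
latent = union_span(neocentromeres)

print("latent region (kb):", latent)
print("P39 within OMA3.01 (kb):", overlap_kb(DOMAINS_KB["P39"], DOMAINS_KB["OMA3.01"]))
print("Dp3a vs OMA3.01 (kb):", overlap_kb(DOMAINS_KB["Dp3a"], DOMAINS_KB["OMA3.01"]))
print("P39 vs Dp3a (kb):", overlap_kb(DOMAINS_KB["P39"], DOMAINS_KB["Dp3a"]))
```

Under these coordinates, the union of the three repositioned domains reproduces the 79.6–85.3 Mb latent region, the P39 domain lies entirely inside the OMA3.01 domain, and the two subcenters do not overlap each other.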
The B73-like Cen3 and the two subcenters within the latent region represent the most gene-deficient regions on chromosome 3 (Fig. 5c, Fig. 6a). These three regions contained only 1, 10, and 9 genes, respectively. To compare the gene densities in other regions on the chromosome, we masked these three regions and divided chromosome 3 into 2-Mb windows. We found that only 5% of the 2-Mb windows contain fewer than 10 genes. By contrast, the inactivated centromere, spanning 96.4–99.3 Mb, contained 25 genes, a gene density about two-fold higher than that of the two subcenters. Mapping of RNA-seq datasets on chromosome 3 matched well with gene annotation results (Fig. 5c). The overall transcriptional activity within the latent region was lower than that in the inactive centromere. Analysis of the RNA-seq data indicated 8 expressed genes in the inactive centromere but only 3 and 2 in the two subcenters of the latent region, respectively.
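The windowed gene-density comparison can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the text does not specify how windows that overlap a masked region were handled, so the sketch assumes that any such window is simply dropped. The function names and the toy coordinates are invented for the example.

```python
def window_gene_counts(gene_starts, chrom_len, masked, window=2_000_000):
    """Count genes per fixed-size window, dropping windows that touch a masked region.

    gene_starts: iterable of gene start coordinates (bp)
    masked: list of (start, end) regions (bp) to exclude (assumed handling)
    """
    counts = {}
    for w_start in range(0, chrom_len, window):
        w_end = min(w_start + window, chrom_len)
        # Assumption: a window overlapping any masked region is excluded entirely.
        if any(w_start < m_end and m_start < w_end for m_start, m_end in masked):
            continue
        counts[(w_start, w_end)] = 0
    for pos in gene_starts:
        w_start = (pos // window) * window
        key = (w_start, min(w_start + window, chrom_len))
        if key in counts:
            counts[key] += 1
    return counts


def fraction_below(counts, threshold=10):
    """Fraction of retained windows with fewer than `threshold` genes."""
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c < threshold) / len(counts)


# Toy data (made up): a 10-Mb chromosome with one masked region.
toy_genes = (
    list(range(0, 1_200_000, 100_000))             # 12 genes in window 0-2 Mb
    + [2_100_000, 2_500_000, 3_900_000]            # 3 genes in window 2-4 Mb
    + [5_000_000]                                  # falls in the masked window
    + list(range(6_000_000, 7_500_000, 100_000))   # 15 genes in window 6-8 Mb
    + list(range(8_000_000, 10_000_000, 100_000))  # 20 genes in window 8-10 Mb
)
toy_counts = window_gene_counts(toy_genes, 10_000_000, [(4_500_000, 5_500_000)])
print(toy_counts)
print(fraction_below(toy_counts))
```

In the toy example, one of the four retained windows has fewer than 10 genes, so the reported fraction is 0.25; the 5% figure in the text comes from the real annotation, not from this sketch.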
We then analyzed the DNA methylation associated with the three chromosomal regions. The Cen3 and the two subcenters within the latent region showed the highest levels of DNA methylation on the chromosome, which were significantly higher than the DNA methylation level in the inactivated centromere (Fig. 6b). Interestingly, region 89.5–91.4 Mb, which immediately flanks Cen3 on the long arm, represented the most gene-deficient region (4 genes/Mb) on the chromosome (Fig. 6a). However, the methylation level in this region was lower than the two subcenters as well as the inactivated centromere. Taken together, the lack of genes and transcription and a relatively higher level of DNA methylation in the latent region may provide a favorable chromatin environment for neocentromere formation.
Discussion
Neocentromeres associated with inactivated ancient centromeres
Several human neocentromere “hotspots” were associated with ancient centromeres that were inactivated during mammalian chromosome evolution (Capozzi et al. 2009; Kalitsis and Choo 2012; Ventura et al. 2004). For example, repeated neocentromere activation was found to be associated with region 15q24–26 of human chromosome 15 (Marshall et al. 2008). Comparative cytogenetic and genomics analyses revealed that this region contains an ancestral centromere that became inactivated about 25 million years ago (Ventura et al. 2003). This region contains segmentally duplicated DNA clusters that typically reside around pericentromeric regions of native human centromeres. Thus, the association between neocentromere and this inactivated ancestral centromere was proposed to be due to the persistence of recombinogenic duplications accrued within the ancient pericentromere, rather than the retention of “centromere-competent” sequences per se (Ventura et al. 2003). Chromosome 6p22.1 represents another region of neocentromere formation that is associated with an inactivated ancient centromere. However, only one neocentromere was reported in this region (Capozzi et al. 2009). The CENP-A binding domain of this neocentromere was precisely mapped. No peculiar sequence features, except for a massive clustering of tRNA, were found in this neocentromere (Capozzi et al. 2009).
The maize B73-like Cen3 likely represents the original (ancestral) centromere of this chromosome, which is supported by the fact that this region contains a CentC array and is highly enriched with CRM2 (Fig. 5), which are specifically associated with maize centromeres (Ananiev et al. 1998; Jin et al. 2004; Zhong et al. 2002). Maize chromosome 3 contains an inactivated ancestral centromere (Fig. 4). Comparative sequence analysis indicates that this inactivated centromere, possibly including its surrounding regions, has lost the typical genomic and epigenomic features associated with most native centromeres, which is reflected by the fact that this region has lost most of the repetitive DNA sequences associated with all maize and sorghum centromeres (Fig. 3). In a similar case, human chromosome 2 arose from fusion of two chromosomes after the divergence of the hominid and chimp lineages (Yunis and Prakash 1982). Remnant alpha satellite repeats, which are specific to human centromeres, can be detected in the inactivated centromere located on the long arm of chromosome 2 (Baldini et al. 1993). However, close examination of this chromosomal region indicated that this ancient centromere has lost the typical DNA composition and organization, such as high-order repeat organization and recombinogenic duplicons, that are found in native human centromeres. No human neocentromeres have been reported to be associated with this inactivated centromere (Kalitsis and Choo 2012).
Results from both humans and maize suggest that the potential of neocentromere activation from an inactivated ancient centromere will depend on whether the inactivated centromere has maintained the intrinsic genomic and/or epigenomic features associated with the active centromeres in the same species. Such features may become decayed or lost completely during evolution.
Interpretation for neocentromere activation in regions at close proximity to native centromeres
Most induced neocentromeres on different C. albicans chromosomes were formed near the native centromeres (Thakur and Sanyal 2013). In chicken, 76% of the induced neocentromeres on the Z chromosome were close to the native centromere. Similarly, 97% of the neocentromeres induced on chromosome 5 were formed within 3 Mb of the native centromere (Shang et al. 2013). Shang et al. (2013) proposed that the preferential positions of chicken neocentromeres near the native centromeres are due to the fact that potential epigenetic marks, which would be favorable for centromere formation, are enriched around the original centromeres. In a striking contrast, induced neocentromeres on chromosome I of S. pombe were located exclusively near the telomeric ends, far away from the native centromere. However, this preference at the telomeric end may be also related to the distinct epigenetic marks associated with telomeric chromatin (Ishii et al. 2008).
Both the latent region and the inactivated centromere on maize chromosome 3 are located in close proximity to the native centromere (Fig. 6). Cen3 and the latent region share more similarities in genomic (genes and transcription) and epigenomic (DNA methylation) characteristics than Cen3 and the inactivated centromere do. Thus, the latent region provides a chromatin environment similar to that of the native centromere for CENH3 deposition. In addition, the latent region is closer to the native centromere than the inactivated centromere is (Fig. 6). Thus, both factors, distance to the native centromere and a favorable chromatin environment, may play a role in the recurrent establishment of de novo centromeres in the latent region.
It has been well documented that the pericentromeric regions of all maize chromosomes are nearly completely suppressed in meiotic recombination (Anderson et al. 2003). Neocentromeres may form at any site on the chromosomes with a favorable environment (Fu et al. 2013; Liu et al. 2015), but distally formed ones have the potential to form destructive anaphase bridges if recombination occurs between displaced centromeres in a heterozygote. By contrast, a chromosome with a neocentromere located close to the native centromere will be favored to survive because a heterozygote, including one normal chromosome and one neocentric chromosome, will not self-destruct, as crossovers between the two centromeres are unlikely to occur (Lamb et al. 2007).
Materials and methods
FISH
Immature tassels were harvested from B73 plants grown in the greenhouse and fixed using 3:1 ethanol:glacial acetic acid. FISH on meiotic pachytene chromosomes was performed following published procedures (Koo et al. 2011). DNA probes were labeled with DNP-11-dUTP (PerkinElmer, Waltham, MA), digoxigenin-11-dUTP, and biotin-16-dUTP (Roche Diagnostics USA, Indianapolis, IN). The hybridization signals were detected with Alexa Fluor 488 streptavidin (Invitrogen, Carlsbad, CA) for biotin-labeled probes, and rhodamine-conjugated anti-digoxigenin (Roche Diagnostics USA, Indianapolis, IN) for the digoxigenin-labeled probe. The DNP-labeled probe was detected with rabbit anti-DNP, followed by amplification with a chicken anti-rabbit Alexa Fluor 647 antibody. Chromosomes were counterstained with 4′,6-diamidino-2-phenylindole (DAPI) in VECTASHIELD antifade solution (Vector Laboratories, Burlingame, CA). The images were captured with a Zeiss Axioplan 2 microscope using a cooled CCD camera CoolSNAP HQ2 (Photometrics, Tucson, AZ) and AxioVision 4.8 software (Carl Zeiss Microscopy LLC, Thornwood, NY).
ChIP and ChIP-seq
Maize maintainer line “ax-3” was grown in the greenhouse under a photoperiod of 16/8 h light/dark. Leaf tissues from 2-week-old seedlings were collected and ground into fine powder in liquid nitrogen. The resulting powder was suspended in the nucleus extraction buffer [10 mM potassium phosphate, 100 mM NaCl, 0.1% β-mercaptoethanol, 1/10 (v/v) Hexylene glycol (Sigma Cat # 112100)] for nuclei isolation. ChIP was performed following a published protocol (Nagaki et al. 2003). An antibody against centromeric histone H3 (Nagaki et al. 2004) was used in the ChIP experiment. ChIP-seq libraries for Illumina sequencing were constructed according to the protocol of “Preparing samples for ChIP sequencing of DNA” provided by Illumina. Briefly, both ends of the ChIPed DNA fragments were repaired using an End-It DNA end repair kit (Epicentre, ER0720). The “dA” base was then added to the 3′ ends of the end-repaired DNA fragments using Klenow fragment (New England BioLabs, M0212S), followed by Illumina adapter ligation for paired-end sequencing, using a quick ligase (New England BioLabs, M2200). Adapter-ligated DNA fragments were purified by running a 2% agarose gel in TAE buffer and were size-selected from 200 to 300 bp. The resulting DNA fragments were enriched by PCR with 13 cycles. The purified ChIP-seq libraries were ready for Illumina sequencing after passing quality validation. The library was sequenced using the Illumina HiSeq 2000 platform.
Mapping of CENH3-binding domains and synteny blocks between maize and sorghum
Adapters and nucleotides with a quality score below 30 were removed from raw sequencing reads by Cutadapt (Martin 2011). Trimmed reads were mapped to the B73 reference genome (AGPv4) by BWA-MEM (Li 2013) using default parameters. Alignments with a mapping quality of at least 50 were retained and converted to BAM format using SAMtools (Li et al. 2009) for further analysis. CENH3 ChIP data of maize inbred line P39 (SRR3018404), maize line Dp3a (SRR639499), and oat-maize chromosome 3 addition line OMA3.01 (SRR867050) were downloaded from NCBI (http://www.ncbi.nlm.nih.gov/). CENH3-binding domains in each line were identified using SICER (Zang et al. 2009). The SICER parameters were a window size of 1000 bp, gap size of 3000 bp, effective genome fraction of 0.75, redundancy threshold of 1, fragment size of 150 bp, and FDR of 0.01. A synteny map between the maize version 3 reference genome (AGPv3) and sorghum (V2.0) was constructed using DAGchainer (Haas et al. 2004) at SynMap (Lyons et al. 2008) with parameters of MegaBlast, -D 20, and -A 5. Maize AGPv3 genes were converted into AGPv4 coordinates by mapping the genes to the AGPv4 genome assembly.
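The mapping-quality filter described above can be illustrated on SAM-formatted text, where the mapping quality (MAPQ) is the fifth tab-separated field of each alignment record. The Python sketch below is a stand-alone illustration with made-up records; it is not the command used in this study, and in practice a comparable filter is available through `samtools view -q 50`. The function name is hypothetical.

```python
def filter_sam_by_mapq(lines, min_mapq=50):
    """Yield SAM header lines unchanged and alignment lines with MAPQ >= min_mapq."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines carry no MAPQ and are passed through
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # skip records lacking the 11 mandatory SAM fields
        if int(fields[4]) >= min_mapq:  # fields[4] is the MAPQ column
            yield line


# Toy records, invented for illustration only.
SEQ, QUAL = "A" * 50, "I" * 50
toy_sam = [
    "@SQ\tSN:3\tLN:1000000",
    f"read1\t0\t3\t100\t60\t50M\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"read2\t0\t3\t200\t12\t50M\t*\t0\t0\t{SEQ}\t{QUAL}",
    f"read3\t16\t3\t300\t50\t50M\t*\t0\t0\t{SEQ}\t{QUAL}",
]
kept = list(filter_sam_by_mapq(toy_sam))
print([rec.split("\t")[0] for rec in kept])
```

In the toy input, the record with MAPQ 12 is discarded, while the records with MAPQ 60 and 50 pass the threshold.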
Analyses of transcriptome and DNA methylation
Transcriptome (SRR445382) and B73 DNA methylation data (SRR850328) (Li et al. 2015) were downloaded from NCBI (http://www.ncbi.nlm.nih.gov/). Transcriptional analyses followed published procedures (Zhao et al. 2016). Briefly, the transcriptome reads were mapped by TopHat (Trapnell et al. 2009) and analyzed with Cufflinks (Trapnell et al. 2010) using default parameters. DNA methylation reads were mapped by Bismark (Krueger and Andrews 2011) with parameters: -q -N 1. Methylation information of each cytosine was extracted by bismark_methylation_extractor in the Bismark tool. Each chromosome was divided into 50-kb windows. The percentage methylation level of CpG in each window was calculated by dividing the number of methylated CpG by the total number of CpG.
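The windowed CpG methylation calculation can be sketched as follows. This is a minimal illustration under stated assumptions: the input is taken to be a stream of per-cytosine CpG calls on a single chromosome, given as (position, methylated) pairs with 0-based positions, which is a simplification of the extractor output rather than its actual file format. The function name and toy data are invented for the example.

```python
from collections import defaultdict


def cpg_methylation_by_window(calls, window=50_000):
    """Percent CpG methylation per fixed-size window.

    calls: iterable of (position_bp, is_methylated) for CpG cytosines on one
           chromosome (assumed 0-based; shift by one for 1-based coordinates).
    Returns {window_start: 100 * methylated_calls / total_calls}.
    """
    methylated = defaultdict(int)
    total = defaultdict(int)
    for pos, is_meth in calls:
        w_start = (pos // window) * window
        total[w_start] += 1
        if is_meth:
            methylated[w_start] += 1
    # Windows without any CpG call are omitted to avoid division by zero.
    return {w: 100.0 * methylated[w] / total[w] for w in sorted(total)}


# Toy calls, made up for illustration.
toy_calls = [
    (10, True), (20_000, True), (49_999, False),  # window starting at 0
    (50_000, True),                               # window starting at 50,000
    (120_000, False), (149_999, False),           # window starting at 100,000
]
levels = cpg_methylation_by_window(toy_calls)
for w_start, pct in levels.items():
    print(w_start, round(pct, 1))
```

Windows with no CpG calls are left out of the result in this sketch; whether such windows were reported as zero or omitted in the original analysis is not stated.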
Data availability
The CENH3 ChIP-seq sequencing data of ax-3 is available from NCBI Sequence Read Archive (SRA) under SRR3709784.
Abbreviations
- ChIP: Chromatin immunoprecipitation
- DAPI: 4′,6-diamidino-2-phenylindole
- FISH: Fluorescence in situ hybridization
- OMA: Oat-maize chromosome addition
- SNP: Single nucleotide polymorphism
References
Ananiev EV, Phillips RL, Rines HW (1998) Chromosome-specific molecular organization of maize (Zea mays L.) centromeric regions. Proc Natl Acad Sci U S A 95:13073–13078
Anderson LK, Doyle GG, Brigham B, Carter J, Hooker KD, Lai A, Rice M, Stack SM (2003) High-resolution crossover maps for each bivalent of Zea mays using recombination nodules. Genetics 165:849–865
Baldini A, Ried T, Shridhar V, Ogura K, Daiuto L, Rocchi M, Ward DC (1993) An alphoid DNA sequence conserved in all human and great ape chromosomes—evidence for ancient centromeric sequences at human chromosomal regions 2q21 and 9q13. Hum Genet 90:577–583
Capozzi O, Purgato S, D'Addabbo P, Archidiacono N, Battaglia P, Baroncini A, Capucci A, Stanyon R, Della Valle G, Rocchi M (2009) Evolutionary descent of a human chromosome 6 neocentromere: a jump back to 17 million years ago. Genome Res 19:778–784
Fu SL, Lv ZL, Gao Z, Wu HJ, Pang JL, Zhang B, Dong QH, Guo X, Wang XJ, Birchler JA, Han FP (2013) De novo centromere formation on a chromosome fragment in maize. Proc Natl Acad Sci U S A 110:6033–6036
Gong ZY, Wu YF, Koblizkova A, Torres GA, Wang K, Iovene M, Neumann P, Zhang WL, Novak P, Buell CR, Macas J, Jiang JM (2012) Repeatless and repeat-based centromeres in potato: implications for centromere evolution. Plant Cell 24:3559–3574
Gong ZY, Yu HX, Huang J, Yi CD, Gu MH (2009) Unstable transmission of rice chromosomes without functional centromeric repeats in asexual propagation. Chromosom Res 17:863–872
Guo X, Su HD, Shi QH, Fu SL, Wang J, Zhang XQ, Hu ZM, Han FP (2016) De novo centromere formation and centromeric sequence expansion in wheat and its wide hybrids. PLoS Genet 12:e1005997
Haas BJ, Delcher AL, Wortman JR, Salzberg SL (2004) DAGchainer: a tool for mining segmental genome duplications and synteny. Bioinformatics 20:3643–3646
Han YH, Zhang ZH, Liu CX, Liu JH, Huang SW, Jiang JM, Jin WW (2009) Centromere repositioning in cucurbit species: implication of the genomic impact from centromere activation and inactivation. Proc Natl Acad Sci U S A 106:14937–14941
Henikoff S, Ahmad K, Malik HS (2001) The centromere paradox: stable inheritance with rapidly evolving DNA. Science 293:1098–1102
Ishii K, Ogiyama Y, Chikashige Y, Soejima S, Masuda F, Kakuma T, Hiraoka Y, Takahashi K (2008) Heterochromatin integrity affects chromosome reorganization after centromere dysfunction. Science 321:1088–1091
Jiang JM, Birchler JA, Parrott WA, Dawe RK (2003) A molecular view of plant centromeres. Trends Plant Sci 8:570–575
Jiao YP, Peluso P, Shi JH, Liang T, Stitzer MC, Wang B, Campbell MS, Stein JC, Wei XH, Chin CS, Guill K, Regulski M, Kumari S, Olson A, Gent J, Schneider KL, Wolfgruber TK, May MR, Springer NM, Antoniou E, McCombie WR, Presting GG, McMullen M, Ross-Ibarra J, Dawe RK, Hastie A, Rank DR, Ware D (2017) Improved maize reference genome with single-molecule technologies. Nature 546:524–527
Jin WW, Melo JR, Nagaki K, Talbert PB, Henikoff S, Dawe RK, Jiang JM (2004) Maize centromeres: organization and functional adaptation in the genetic background of oat. Plant Cell 16:571–581
Kalitsis P, Choo KHA (2012) The evolutionary life cycle of the resilient centromere. Chromosoma 121:327–340
Ketel C, Wang HSW, McClellan M, Bouchonville K, Selmecki A, Lahav T, Gerami-Nejad M, Berman J (2009) Neocentromeres form efficiently at multiple possible loci in Candida albicans. PLoS Genet 5:e1000400
Koo DH, Han FP, Birchler JA, Jiang JM (2011) Distinct DNA methylation patterns associated with active and inactive centromeres of the maize B chromosome. Genome Res 21:908–914
Krueger F, Andrews SR (2011) Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27:1571–1572
Kynast RG, Riera-Lizarazu O, Vales MI, Okagaki RJ, Maquieira SB, Chen G, Ananiev EV, Odland WE, Russell CD, Stec AO, Livingston SM, Zaia HA, Rines HW, Phillips RL (2001) A complete set of maize individual chromosome additions to the oat genome. Plant Physiol 125:1216–1227
Lamb JC, Meyer JM, Birchler JA (2007) A hemicentric inversion in the maize line knobless Tama flint created two sites of centromeric elements and moved the kinetochore-forming region. Chromosoma 116:237–247
Li H (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM http://arxiv.org/abs/1303.3997
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079
Li Q, Song J, West PT, Zynda G, Eichten SR, Vaughn MW, Springer NM (2015) Examining the causes and consequences of context-specific differential DNA methylation in maize. Plant Physiol 168:1262–1274
Liu YL, Su HD, Pang JL, Gao Z, Wang XJ, Birchler JA, Han FP (2015) Sequential de novo centromere formation and inactivation on a chromosomal fragment in maize. Proc Natl Acad Sci U S A 112:E1263–E1271
Lyons E, Pedersen B, Kane J, Freeling M (2008) The value of nonmodel genomes and an example using SynMap within CoGe to dissect the hexaploidy that predates the rosids. Trop Plant Biol 1:181–190
Marshall OJ, Chueh AC, Wong LH, Choo KHA (2008) Neocentromeres: new insights into centromere structure, disease development, and karyotype evolution. Am J Hum Genet 82:261–282
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet journal 17:10–12
Nagaki K, Cheng ZK, Ouyang S, Talbert PB, Kim M, Jones KM, Henikoff S, Buell CR, Jiang JM (2004) Sequencing of a rice centromere uncovers active genes. Nat Genet 36:138–145
Nagaki K, Talbert PB, Zhong CX, Dawe RK, Henikoff S, Jiang JM (2003) Chromatin immunoprecipitation reveals that the 180-bp satellite repeat is the key functional DNA element of Arabidopsis thaliana centromeres. Genetics 163:1221–1225
Nasuda S, Hudakova S, Schubert I, Houben A, Endo TR (2005) Stable barley chromosomes without centromeric repeats. Proc Natl Acad Sci U S A 102:9842–9847
Paterson AH, Bowers JE, Chapman BA (2004) Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc Natl Acad Sci U S A 101:9903–9908
Rocchi M, Archidiacono N, Schempp W, Capozzi O, Stanyon R (2012) Centromere repositioning in mammals. Heredity 108:59–67
Saffery R, Irvine DV, Griffiths B, Kalitsis P, Wordeman L, Choo KHA (2000) Human centromeres and neocentromeres show identical distribution patterns of >20 functionally important kinetochore-associated proteins. Hum Mol Genet 9:175–185
Schnable JC, Springer NM, Freeling M (2011) Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc Natl Acad Sci U S A 108:4069–4074
Schnable PS, Ware D, Fulton RS, Stein JC, Wei FS, Pasternak S, Liang CZ, Zhang JW, Fulton L, Graves TA et al (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326:1112–1115
Schneider KL, Xie ZD, Wolfgruber TK, Presting GG (2016) Inbreeding drives maize centromere evolution. Proc Natl Acad Sci U S A 113:E987–E996
Scott KC, Sullivan BA (2014) Neocentromeres: a place for everything and everything in its place. Trends Genet 30:66–74
Shang WH, Hori T, Martins NMC, Toyoda A, Misu S, Monma N, Hiratani I, Maeshima K, Ikeo K, Fujiyama A, Kimura H, Earnshaw WC, Fukagawa T (2013) Chromosome engineering allows the efficient isolation of vertebrate neocentromeres. Dev Cell 24:635–648
Stadler LJ, Roman H (1948) The effect of X-rays upon mutation of the gene A in maize. Genetics 33:273–303
Sullivan BA, Schwartz S (1995) Identification of centromeric antigens in dicentric robertsonian translocations: CENP-C and CENP-E are necessary components of functional centromeres. Hum Mol Genet 4:2189–2197
Swigonova Z, Lai JS, Ma JX, Ramakrishna W, Llaca V, Bennetzen JL, Messing J (2004) Close split of sorghum and maize genome progenitors. Genome Res 14:1916–1923
Thakur J, Sanyal K (2013) Efficient neocentromere formation is suppressed by gene conversion to maintain centromere function at native physical chromosomal loci in Candida albicans. Genome Res 23:638–652
Tolomeo D, Capozzi O, Stanyon RR, Archidiacono N, D'Addabbo P, Catacchio CR, Purgato S, Perini G, Schempp W, Huddleston J, Malig M, Eichler EE, Rocchi M (2017) Epigenetic origin of evolutionary novel centromeres. Sci Rep 7:41980
Topp CN, Okagaki RJ, Melo JR, Kynast RG, Phillips RL, Dawe RK (2009) Identification of a maize neocentromere in an oat-maize addition line. Cytogenet Genome Res 124:228–238
Trapnell C, Pachter L, Salzberg SL (2009) TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25:1105–1111
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28:511–515
Ventura M, Mudge JM, Palumbo V, Burn S, Blennow E, Pierluigi M, Giorda R, Zuffardi O, Archidiacono N, Jackson MS, Rocchi M (2003) Neocentromeres in 15q24-26 map to duplicons which flanked an ancestral centromere in 15q25. Genome Res 13:2059–2068
Ventura M, Weigl S, Carbone L, Cardone MF, Misceo D, Teti M, D'Addabbo P, Wandall A, Bjorck E, de Jong PJ, She XW, Eichler EE, Archidiacono N, Rocchi M (2004) Recurrent sites for new centromere seeding. Genome Res 14:1696–1703
Voullaire LE, Slater HR, Petrovic V, Choo KHA (1993) A functional marker centromere with no detectable alpha-satellite, satellite III, or CENP-B protein: activation of a latent centromere? Am J Hum Genet 52:1153–1163
Wang H, Bennetzen JL (2012) Centromere retention and loss during the descent of maize from a tetraploid ancestor. Proc Natl Acad Sci U S A 109:21004–21009
Wang K, Wu YF, Zhang WL, Dawe RK, Jiang JM (2014) Maize centromeres expand and adopt a uniform size in the genetic background of oat. Genome Res 24:107–116
Wei F, Coe E, Nelson W, Bharti AK, Engler F, Butler E, Kim H, Goicoechea JL, Chen M, Lee S, Fuks G, Sanchez-Villeda H, Schroeder S, Fang Z, McMullen M, Davis G, Bowers JE, Paterson AH, Schaeffer M, Gardiner J, Cone K, Messing J, Soderlund C, Wing RA (2007) Physical and genetic structure of the maize genome reflects its complex evolutionary history. PLoS Genet 3:1254–1263
Wolfgruber TK, Sharma A, Schneider KL, Albert PS, Koo DH, Shi JH, Gao Z, Han FP, Lee H, Xu RH, Allison J, Birchler JA, Jiang JM, Dawe RK, Presting GG (2009) Maize centromere structure and evolution: sequence analysis of centromeres 2 and 5 reveals dynamic loci shaped primarily by retrotransposons. PLoS Genet 5:e1000743
Yunis JJ, Prakash O (1982) The origin of man—a chromosomal pictorial legacy. Science 215:1525–1530
Zang C, Schones DE, Zeng C, Cui K, Zhao K, Peng W (2009) A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics 25:1952–1958
Zhang B, Lv ZL, Pang JL, Liu YL, Guo X, Fu SL, Li J, Dong QH, Wu HJ, Gao Z, Wang XJ, Han FP (2013) Formation of a functional maize centromere after loss of centromeric sequences and gain of ectopic sequences. Plant Cell 25:1979–1989
Zhang HQ, Koblizkova A, Wang K, Gong ZY, Oliveira L, Torres GA, Wu YF, Zhang WL, Novak P, Buell CR, Macas J, Jiang JM (2014) Boom-bust turnovers of megabase-sized centromeric DNA in Solanum species: rapid evolution of DNA sequences associated with centromeres. Plant Cell 26:1436–1447
Zhao HN, Zhu XB, Wang K, Gent JI, Zhang WL, Dawe RK, Jiang JM (2016) Gene expression and chromatin modifications associated with maize centromeres. G3 6:183–192
Zhong CX, Marshall JB, Topp C, Mroczek R, Kato A, Nagaki K, Birchler JA, Jiang JM, Dawe RK (2002) Centromeric retroelements and satellites interact with maize kinetochore protein CENH3. Plant Cell 14:2825–2836
Acknowledgements
We thank Dr. Patrick Schnable for providing the seeds of maize line ax-3 and Drs. Kelly Dawe and Jonathan Gent for valuable comments on the manuscript. This work was supported by the National Science Foundation (NSF) grant 1338897 to B.S.G. and NSF grant IOS-1444514 to J.A.B. and J.J.
Author information
Contributions
H.Z. and J.J. designed the research; Z.Z. and D.H.K. performed the experiments; H.Z., J.A.B., and J.J. analyzed the data; and H.Z., B.S.G., J.A.B., and J.J. wrote the article.
Additional information
Responsible Editor: Hans de Jong
About this article
Cite this article
Zhao, H., Zeng, Z., Koo, DH. et al. Recurrent establishment of de novo centromeres in the pericentromeric region of maize chromosome 3. Chromosome Res 25, 299–311 (2017). https://doi.org/10.1007/s10577-017-9564-x