1.1 Introduction to Chromatin Structure

The complete genetic contents of a human cell reside within three billion base pairs of deoxyribonucleic acid (DNA) distributed through 23 pairs of chromosomes. If this DNA were extended and lined up end to end, it would span over 2 m, and yet, must be organized within a cell’s nucleus with an average diameter of 5–10 μm (Nelson and Cox 2008). This is analogous to packing 30 miles of thread into a basketball. However, DNA cannot be stored away indiscriminately. Rather, it must be continuously accessed in a highly coordinated fashion to allow a cell to perform specialized functions and respond to a changing environment. To accomplish this task, the genome of all eukaryotic cells is organized in a dynamic polymeric complex called chromatin.
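The scale of this packaging problem can be checked with a short calculation. The sketch below assumes an axial rise of 0.34 nm per base pair for B-form DNA and a diploid cell carrying two copies of the three-billion-bp genome. Both are assumptions that the text does not state, and all names are illustrative.

```python
# Back-of-the-envelope estimate of the total DNA length in one human cell.
# Assumptions (not from the text): B-form rise of 0.34 nm per bp, diploid cell.
HAPLOID_GENOME_BP = 3_000_000_000   # three billion base pairs (from the text)
RISE_NM_PER_BP = 0.34               # assumed axial rise of B-form DNA
COPIES_PER_CELL = 2                 # assumed diploid: 23 pairs of chromosomes


def contour_length_m(base_pairs: int, rise_nm: float = RISE_NM_PER_BP) -> float:
    """Return the end-to-end contour length in metres."""
    return base_pairs * rise_nm * 1e-9


total_length_m = contour_length_m(HAPLOID_GENOME_BP * COPIES_PER_CELL)
nucleus_diameter_m = 10e-6          # upper end of the 5-10 um range in the text
compaction_ratio = total_length_m / nucleus_diameter_m

print(f"Total DNA length: {total_length_m:.2f} m")
print(f"Length / nuclear diameter: {compaction_ratio:.0f}")
```

Under these assumptions the estimate lands just above 2 m, in line with the figure quoted above, and the DNA is roughly five orders of magnitude longer than the nucleus is wide.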

The fundamental repeating unit of the chromatin polymer is the nucleosome (Fig. 1.1). The nucleosome contains a nucleosome core with 145–147 base pairs (bp) of DNA wrapped around an octamer of histone proteins, constructed from two copies of each of the core histones, histone H2A, histone H2B, histone H3, and histone H4. Each nucleosome core is connected to an adjacent nucleosome core through a segment of linker DNA to form the chromatin polymer with a repeat length ranging from 160 to 240 bp (McGhee and Felsenfeld 1980). Approximately 20 bp of this linker DNA is typically found in association with the linker histone H1 (also H5). The nucleosome core together with the linker histone is called the chromatosome. Adding the remaining linker DNA to the chromatosome completes the nucleosome.

Fig. 1.1

Scheme of the nucleosome core particle, chromatosome, and nucleosome. Histones are represented by circles, colored as shown. DNA is represented by light blue lines. Double lines between histones denote histone-fold pairs; single lines represent four-helix bundle motifs
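The nested definitions of core particle, chromatosome, and nucleosome amount to simple bookkeeping of DNA lengths. The following sketch is illustrative only: it assumes a 147 bp core (the upper end of the stated range), treats the repeat length as the DNA content of one complete nucleosome, and caps the H1-associated linker at the approximately 20 bp mentioned above. The function and key names are hypothetical.

```python
# DNA bookkeeping for core particle < chromatosome < nucleosome.
CORE_DNA_BP = 147            # assumed: upper end of the 145-147 bp range
LINKER_WITH_H1_BP = 20       # approximate linker DNA associated with H1 (from the text)


def nucleosome_parts(repeat_length_bp: int, core_bp: int = CORE_DNA_BP) -> dict:
    """Split one nucleosome repeat into core, chromatosome, and remaining linker DNA."""
    if repeat_length_bp < core_bp:
        raise ValueError("repeat length cannot be shorter than the core DNA")
    linker_bp = repeat_length_bp - core_bp
    # Simplification: H1 covers at most ~20 bp, or all of a shorter linker.
    h1_bound_bp = min(LINKER_WITH_H1_BP, linker_bp)
    return {
        "core_bp": core_bp,
        "chromatosome_bp": core_bp + h1_bound_bp,
        "remaining_linker_bp": linker_bp - h1_bound_bp,
        "nucleosome_bp": repeat_length_bp,
    }


# Repeat lengths spanning the 160-240 bp range quoted in the text.
for repeat in (160, 200, 240):
    print(repeat, nucleosome_parts(repeat))
```

With a 200 bp repeat, this accounting gives a chromatosome of about 167 bp and about 33 bp of remaining linker DNA.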

Chromatin is composed of long arrays of nucleosomes. These arrays are progressively condensed through a hierarchy of higher-order structures, starting with an extended conformation and ultimately generating two distinct cell-cycle-specific forms, interphase chromatin and metaphase chromosomes. Much remains to be elucidated with regard to these higher-order structures. This is highlighted by the controversy over not only the conformation of the first level of higher-order compaction, the 30 nm fiber, but also its very existence (Li and Reinberg 2011; Luger et al. 2012).

Importantly, chromatin is not simply a scaffold for DNA. On the contrary, it is an active signaling hub in all genome-templated processes, from gene expression to DNA replication and DNA damage repair. Chromatin assembly pathways and nucleosome remodeling complexes control nucleosome composition, occupancy, and positioning throughout the genome. The chemical landscape of nucleosomes is varied through an extensive network of histone posttranslational modifications and the incorporation of histone sequence variants, which carry variant-specific modifications. Moreover, DNA harbors chemical modifications of its own. Together, these features allow for specific recruitment and exclusion of downstream effectors, leading to direct and indirect control of chromatin structure and function. The complex and dynamic nature of chromatin is exemplified in the cell-cycle-regulated condensation of interphase chromatin into mitotic chromosomes, which then redistribute throughout the nucleus following mitosis.

1.2 A History of Chromatin Structure

The study of chromatin dates to the late nineteenth century with the biochemical and microscopic description of nuclear contents. In 1871, Friedrich Miescher discovered nucleic acids when he isolated a phosphorus-rich substance from leukocyte nuclei, which he called nuclein (Dahm 2005). Soon after, Albrecht Kossel (1884) extracted the proteinaceous component of nucleated erythrocytes, and named it histon, now called histones. In simultaneous efforts to describe nuclei by microscopic visualization, Walther Flemming named this nucleoprotein substance chromatin based on its tendency to strongly absorb basophilic dyes, a name that stands today (Paweletz 2001). Thus, at the turn of the century, chromatin was known to be composed of an acidic, phosphorus-rich component as well as a basic, proteinaceous component, yet the polymeric macromolecular form of these components remained obscure. What followed in the first half of the twentieth century was a dark age in the study of chromatin structure. During this time, key genetic principles were established, most notably the identification of nucleic acids as the transforming component of chromatin (Avery et al. 1944) and the structure of DNA (Franklin and Gosling 1953; Watson and Crick 1953; Wilkins et al. 1953), yet the understanding of histones remained mostly stagnant.

The latter half of the twentieth century witnessed a resurgence in the study of chromatin structure. Histones were fractionated into two categories, termed main and subsidiary, which later became known as the core and linker histones (Stedman and Stedman 1951). Heterogeneity observed within these isolated histones, likely due to contaminating protease activities, led to the erroneous interpretation that histones are diverse in composition and vary greatly across tissues and organisms. These ideas were largely dispelled by the late 1960s when acid extraction allowed intact fractionation and subsequent sequencing disclosed five histone classes with nearly invariant conservation (Van Holde 1989).

The 1970s ushered in a modern understanding of chromatin structure consisting of polymeric chains of nucleosomes. The first hints of this “simple, basic repeating structure” came from the examination of isolated chromatin digested with endogenous and exogenous nucleases that left roughly half of the DNA protected in small 100–200 bp fragments (Clark and Felsenfeld 1971; Hewish and Burgoyne 1973). Subsequent negative stain electron microscopy of chromatin fibers by Donald and Ada Olins and Chris Woodcock demonstrated a “beads on a string” structure—distinct particles (called ν bodies) of 60–100 Å in diameter linked by a thinner fibrous structure (Olins and Olins 1973, 1974; Woodcock 1973; Woodcock et al. 1976). The notion of a chromatin substructure was further established through discovery of interactions between the histone proteins, first for the H2A/H2B dimer (Kelley 1973; D’Anna and Isenberg 1974) and then for the (H3/H4)2 tetramer (Kornberg and Thomas 1974; Roark et al. 1974).

Along with the description of the (H3/H4)2 tetramer, Roger Kornberg presented an octamer model for the repeating unit of chromatin (Kornberg 1974) based on the following concepts: (1) The stoichiometry of the tetramer and the requirement for all four core histones to form the repeating unit, as assessed by X-ray diffraction, require an octameric structure with two copies of each of the histones. (2) Given the equal mass of DNA and histones in chromatin, each octamer interacts with approximately 200 bp of DNA. (3) The expected globular shape of the tetramer requires DNA to wrap around the periphery. (4) The existence of half the amount of linker histone compared to each of the core histones suggests that one linker histone binds per nucleosome, and given that the linker histone is not necessary to reproduce the X-ray diffraction pattern, it must bind the exterior of the particle. Further experiments consolidated this and other proposals (Van Holde et al. 1974), with the core particle wrapped by approximately 140 bp of DNA, and the linker DNA and associated linker histone completing the nucleosome and extending the DNA protection to 200 bp (Sollner-Webb and Felsenfeld 1975; Van Holde 1989). Within a decade, the nucleosome core particle crystal structure was determined to 7 Å resolution, providing structural information about the DNA path around the histone octamer (Richmond et al. 1984). The task of improving the resolution to near atomic level then spanned the next dozen years. The structure of the histone octamer was determined to 3.1 Å (Arents et al. 1991), followed by the nucleosome core particle to 2.8 Å (Luger et al. 1997), at last giving atomic detail to the fundamental unit of chromatin.

1.3 The Nucleosome Core Particle Structure

1.3.1 Overview of the High-Resolution Nucleosome Core Particle Structure

The 2.8 Å resolution structure of the nucleosome core particle, solved in 1997 by Richmond and colleagues, offers the first high-resolution depiction of the histone octamer bound to DNA (Luger et al. 1997). This was made possible at least in part through the reconstitution of core particles from recombinantly expressed histones (in this case from Xenopus laevis histone sequences) and a defined DNA sequence, thus eliminating heterogeneity existing in core particles isolated from endogenous sources. The 2.8 Å structure shows 146 bp of DNA wrapped in 1.65 turns around the histone octamer in a left-handed superhelix (Fig. 1.2). The histone octamer is generated from four “histone-fold” heterodimers, two each of H3/H4 and H2A/H2B (Fig. 1.3). Two H3/H4 dimers form a central (H3/H4)2 tetramer through a four-helix bundle mediated by the H3 histone folds (Fig. 1.4). Each half of the (H3/H4)2 tetramer interacts with one H2A/H2B dimer through a four-helix bundle between the H4 and H2B histone folds, completing the octamer. This octamer forms a ramp, or spool for wrapping the nucleosomal DNA. The resultant 200 kDa disk-shaped particle has a pseudo twofold symmetry centered on the dyad.

Fig. 1.2

Overview of the nucleosome core particle structure. Nucleosome core particle high-resolution structure (PDB ID: 1KX5). (a) Histones are depicted in cartoon representation and colored as shown. DNA is depicted in stick representation with the dyad marked by an arrow. (b) Nucleosome core particle shown in space-filling representation. All molecular graphics in this chapter were prepared using PyMOL software (The PyMOL Molecular Graphics System, Version 1.5 Schrödinger, LLC)

Fig. 1.3

The histone-fold and the histone-fold heterodimers. Histone-folds of (a) H3 and (b) H4 shown in cartoon representation. Heterodimeric histone-fold pairs for (c) H3/H4 and (d) H2A/H2B shown in cartoon representation. Schemes representing the secondary structure elements of (e) H3/H4 and (f) H2A/H2B (Histone structures from PDB ID: 1KX5)

Fig. 1.4

The histone-fold octamer is constructed using four-helix bundles. The histone-fold octamer (PDB ID: 1KX5) shown in cartoon representation from (a) the disk surface and (b) an orthogonal profile view looking directly at the dyad. (c) H3–H3 and (d) H4–H2B four-helix bundles, shown in cartoon representation

Subsequent work has improved the resolution of the core particle to 1.9 Å (Davey et al. 2002) and provided complementary structures containing histones from diverse species (Harp et al. 2000; White et al. 2001; Tsunaka 2005; Clapier et al. 2008), histone sequence variants (Suto et al. 2000; Chakravarthy and Luger 2006; Tachiwana et al. 2011), and different DNA sequences (Richmond and Davey 2003; Makde et al. 2010; Vasudevan et al. 2010; Chua et al. 2012). Moreover, the X-ray structure of a tetranucleosome has provided insight into the higher-order organization of chromatin (Schalch et al. 2005). Finally, recent structural studies of proteins bound to the nucleosome core particle have given atomic resolution detail of nucleosomal recognition (Makde et al. 2010; Armache et al. 2011). The following sections introduce the properties of histones and describe the octameric histone complex and its interactions with DNA to form the nucleosome core particle.

1.3.2 Primary Structure of Histones

Histones are small, basic proteins that form the scaffold for organizing DNA inside the eukaryotic nucleus. They can broadly be broken down into five classes: the four core histones, H2A, H2B, H3, and H4, contained in the nucleosome core particle; and the linker histones, H1 or H5, that interact with linker DNA and that are implicated in higher-order structures of chromatin. As the majority of DNA is packaged into nucleosomes, it follows that coincident with DNA replication in S phase, histones must be produced to package the duplicated genome. As such, histones can be further classified as replication dependent, also known as canonical or major histones, and replication independent, or variant histones. This chapter focuses on the canonical histones as the variant histones are discussed in detail in a later chapter.

Several generalizations can be drawn from the sequences of the core histones. (1) They are relatively small, ranging from 102 to 135 amino acids. (2) They each contain a central alpha-helical region, which forms a “histone-fold” motif (Fig. 1.3a, b). The histone-folds are flanked by N- and C-terminal extensions. Segments of these extensions are structured, notably the H3 αN helix and the H2B αC helix, but much of these extensions, especially in the N-terminal regions of all the core histones and the C-terminal region of H2A, exhibit more flexible conformations (Fig. 1.3e, f). These regions, called the histone “tails,” harbor an extraordinary density and diversity of posttranslational modifications and have been the focus of much of the research regarding signaling through chromatin. (3) Core histones possess a preponderance of the basic amino acids, arginine and lysine, as compared to acidic amino acids, resulting in a substantial net positive charge at physiological pH. This charge disparity is most notable within the N-terminal and C-terminal extensions from the histone-fold. (4) The core histones exhibit astonishing sequence conservation across evolutionarily distinct organisms, suggesting strong functional selective pressure. H3 and H4 are among the most highly conserved proteins, with greater than 90 % sequence identity for H4 between budding yeast and man. H2A and H2B are also highly conserved, though more divergent than H3 and H4, especially in their N- and C-terminal regions. (5) Multiple copies of each of the core histone genes are found clustered throughout the genomes of eukaryotic organisms. In budding yeast, two copies of each of the core histones are found (Osley 1991), whereas in man the complexity is increased with 10–20 functional copies (Marzluff et al. 2002). This allows for nonallelic variations. Strikingly, all 12 loci for H4 in the human genome encode identical protein sequences, again underscoring its functional conservation. 
In contrast, the human H2A and H2B loci include minor coding variations surrounding a strong consensus sequence. In many cases, these variations are conserved between mouse and man, suggesting functional selective pressure. To date, little is understood regarding the usage and consequences of this nonallelic variation.

The linker histones (H1/H5), which make up the fifth class of histones, are slightly larger than the core histones and far less conserved. Linker histones in metazoans have a tripartite structure with a central globular domain of ~80 amino acids flanked by unstructured N- and C-terminal domains of 13–40 and ~100 amino acids, respectively. The budding yeast H1 includes a second unique globular domain following the C-terminal domain. Similar to the core histone tails, the unstructured regions of linker histones contain a preponderance of basic amino acids. Invariably, the H1 C-terminal domains are rich in lysine, proline, and serine, a composition that has been shown to be critical for function (Lu et al. 2009). Much like the core histones, linker histones are found in increasing complexity in higher organisms. While one linker histone sequence exists in budding yeast, 11 distinct isoforms are found in man. Five of these isoforms, H1.1–H1.5, are cell cycle dependent similar to the canonical core histones. Others exhibit cell-cycle independence or tissue/germline specificity (Happel and Doenecke 2009).

1.3.3 Secondary Structure of Core Histones and the Architecture of the Histone Octamer

A single structural motif, the histone-fold, forms the foundation of the histone octamer. This fold, contained in all four core histones, comprises three α-helices connected by two intervening loops and is designated α1–L1–α2–L2–α3 (Fig. 1.3a, b). The two short α1 and α3 helices pack along roughly the same side of the long central α2 helix. Each histone-fold pairs with a nonidentical histone-fold (H3 pairs with H4 while H2A pairs with H2B) in an antiparallel arrangement. The resulting pseudosymmetric heterodimer forms a “handshake motif” (Fig. 1.3c, d). Pairing specificity is derived from the residues contributing to the heterodimeric interface, and this precludes formation of homodimers and other heterodimeric pairs. The antiparallel arrangement of the histone-folds places the L1 loop of one fold in proximity to the L2 loop of the symmetry-related fold, with one L1L2 pair occupying each end of the heterodimer. The α2–α2 interface is closer to the N-terminal end of the α2 helices, which juxtaposes the α1 helices and separates the α3 helices. This gives the heterodimer a crescent shape with a convex surface spanning the L1L2 loops and the α1 helices opposite a concave surface formed by the α3 and central portions of the α2 helices. The L1L2 and α1α1 regions constitute the major DNA binding surfaces of each heterodimer.

The core octamer is assembled from two H3/H4 and two H2A/H2B heterodimers using one common structural motif, the four-helix bundle (Fig. 1.4a, b). Each four-helix bundle is constructed from the α3 helix and the C-terminal half of the α2 helix from adjacent histone-folds as follows. Two H3/H4 dimers associate in a head-to-head arrangement to form a (H3/H4)2 tetramer mediated by a four-helix bundle between the H3 α2 and α3 helices (Fig. 1.4c). Similarly, two H2A/H2B dimers associate with this tetramer each through the formation of an additional four-helix bundle between the α2 and α3 helices of H4 and H2B (Fig. 1.4d). The final product is a left-handed histone supercoil with pseudo twofold symmetry (H2A–H2B–H4–H3–H3–H4–H2B–H2A) (Fig. 1.4a, b).
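The assembly logic described above can be written down as a small data structure. The sketch below takes the histone order along the supercoil from the text and labels each neighboring pair as either a histone-fold “handshake” heterodimer or a four-helix bundle. The variable and function names are illustrative and carry no structural information beyond what the paragraph states.

```python
# Histone order along the octamer supercoil, as given in the text.
OCTAMER_PATH = ["H2A", "H2B", "H4", "H3", "H3", "H4", "H2B", "H2A"]

# Histone-fold heterodimers ("handshake" pairs) and four-helix bundle partners.
HANDSHAKE_PAIRS = {frozenset({"H3", "H4"}), frozenset({"H2A", "H2B"})}
# frozenset({"H3"}) encodes the H3-H3 bundle; the other entry is H4-H2B.
FOUR_HELIX_BUNDLES = {frozenset({"H3"}), frozenset({"H4", "H2B"})}


def classify_contacts(path):
    """Label each neighboring pair along the path by the motif that joins it."""
    labels = []
    for left, right in zip(path, path[1:]):
        pair = frozenset({left, right})
        if pair in HANDSHAKE_PAIRS:
            labels.append((left, right, "handshake"))
        elif pair in FOUR_HELIX_BUNDLES:
            labels.append((left, right, "four-helix bundle"))
        else:
            labels.append((left, right, "unassigned"))
    return labels


contacts = classify_contacts(OCTAMER_PATH)
# Pseudo twofold symmetry: the path reads the same in both directions.
is_pseudosymmetric = OCTAMER_PATH == OCTAMER_PATH[::-1]

for left, right, motif in contacts:
    print(f"{left}-{right}: {motif}")
```

Walking the path gives four handshake heterodimers joined by three four-helix bundles (one H3–H3 and two H4–H2B), and the palindromic order reflects the pseudo twofold symmetry.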

1.3.4 Core Histone Tails and Extensions

The N- and C-terminal extensions from the histone-folds complete the protein content of the nucleosome core particle and contribute both to DNA binding and to several important solvent-exposed surfaces (Fig. 1.5a, b). Three of these regions warrant further discussion. The αN helix of H3 between the N-terminal tail and the α1 helix lies on top of the H4 histone-fold and organizes DNA at the entry and exit sites from the nucleosome. Meanwhile, C-terminal extensions of H2A and H2B each contribute substantially to the solvent-exposed surface of the nucleosome and further solidify the octameric structure. The H2A C-terminal extension docks against the H2A α3 helix before traversing the nucleosome surface to pack against the H3/H4 heterodimer on the opposite side of the octamer, ultimately terminating near the dyad. The αC helix of H2B extends to the edge of the nucleosome opposite the dyad, packing along the α2 and α3 helices of H2A and H2B, respectively, and represents the outermost margin of the disk surface of the nucleosome core particle.

Fig. 1.5

Histone extensions and tails. (a) The complete histone octamer with extensions and tails included, shown in cartoon representation. (b) The complete histone octamer with extensions and tails colored as shown to distinguish them from histone-folds, colored gray. (c) Profile of complete core particle in space filling representation, showing exit sites of histone tails and aligned grooves of DNA gyres. For orientation, the dyad is labeled. (d) Crystal packing within the 1.9 Å crystal (PDB ID: 1KX5). Histone tails (colored) exhibit conformations defined by crystal contacts with neighboring nucleosome core particles

The histone N-terminal tails exit the nucleosome core particle by two routes: (1) on top of the minor groove of the DNA, as is the case for H4 and H2A, or (2) through a channel created by aligned minor grooves from adjacent gyres of DNA, as is the case for H3 and H2B (Fig. 1.5c). The H3 N-terminal tails exit the particle near the entry/exit site of DNA proximal to the dyad. In contrast, the H2A and H2B N-terminal tails exit from the opposite side of the particle. The two H4 N-terminal tails exit from the particle in different locations. While these tails are not observed in most structures of the nucleosome core particle, it is important to note that the electron density was sufficient to model the entire length of all ten histone tails in the 1.9 Å structure. However, the positions of the tails are defined by crystal-packing contacts and may not reflect physiologically relevant conformations (Fig. 1.5d).

For decades, it has been clear that the core histone N-terminal tails, which constitute about 20 % of the octamer mass, exhibit dynamic and flexible structures. Thus, it is not surprising that a structure–function relationship cannot be inferred from static X-ray structures. Rather, biophysical approaches have been required to begin to elucidate the nature of the tails and their contribution to nucleosome and higher-order structures. While free core histone N-terminal tails form random coil conformations, it is increasingly clear that the tails can adopt defined structures within chromatin that are context dependent (Wang and Hayes 2006). The tails form specific contacts with DNA within the nucleosomal core particle in a salt-dependent manner (Lee and Hayes 1997), though they collectively contribute minimally to the stability of the core particle itself (Ausio et al. 1989). The specific contacts that the tails make are altered upon addition of linker DNA or linker DNA and linker histone. Building upwards toward higher-order structures, all of the tails contribute to the higher-order folding and/or oligomerization of chromatin through binding to sequentially and spatially adjacent nucleosomes. For example, a “basic patch” in the H4 N-terminal tail is required for compaction of a nucleosome array (Dorigo et al. 2003). This region interacts with an “acidic patch” on the H2A/H2B dimer of an adjacent nucleosome. Notably, acetylation of a single lysine in the H4 basic patch abrogates this interaction and the resultant chromatin compaction (Shogren-Knaak et al. 2006). The H3 N-terminal tail makes intranucleosomal interactions in an extended array of nucleosomes, but upon chromatin compaction, internucleosomal and interarray interactions are observed (Zheng et al. 2005; Kan et al. 2007). Similar to H4, these interactions are differentially affected by lysine acetylation as well as the linker histone, suggesting several potential levels of regulation. 
Taken together, it is likely that the core histone tails are capable of establishing a network of interactions with DNA and other histones that is both highly dependent on and contributes to local chromatin structure. By allowing the adoption of specific conformations in distinct contexts, the intrinsic structural flexibility of the N-terminal tails may impart functional flexibility in accessing and repressing higher-order chromatin structures and recruiting effector proteins to the chromatin template. Combined with the litany of posttranslational modifications of the N-terminal tails, this allows for tight regulation of chromatin structure and function. While much progress has been made in dissecting the functions of the histone tails, much remains to be elucidated in this complex and dynamic system.

1.3.5 The DNA Superhelix and Core Histone–DNA Interactions

Overall, the nucleosomal DNA wraps 1.65 turns around the histone octamer in a left-handed superhelix. DNA locations are described by the number of double-helical turns away from the dyad, which is defined as superhelix location 0 (SHL0), ranging from SHL−7 to SHL+7 (Fig. 1.6a). The DNA is bent in a nonuniform pattern owing to intrinsic constraints of DNA as well as of the surface of the underlying histone octamer. Notably, nucleosomal DNA has an increased twist relative to free B-form DNA (Richmond and Davey 2003). The register of the adjacent superhelical DNA gyres aligns the major and minor grooves as they traverse the octamer surface, creating the channels through which the H3 and H2B tails exit the core particle (Fig. 1.5c).
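The two length scales in this description, the superhelix and the SHL numbering, can be compared with a short calculation. The sketch below uses the 146 bp, 1.65-turn, and SHL−7 to SHL+7 values from the text and assumes, purely for illustration, that SHL units are evenly spaced along the DNA. The real path is nonuniform, as noted above, and the names are hypothetical.

```python
CORE_DNA_BP = 146            # DNA length in the 2.8 A structure (from the text)
SUPERHELICAL_TURNS = 1.65    # from the text
SHL_MIN, SHL_MAX = -7, 7     # SHL range (from the text)

# Average DNA length per turn of the superhelix around the octamer.
bp_per_superhelical_turn = CORE_DNA_BP / SUPERHELICAL_TURNS

# Average DNA length per SHL unit, assuming uniform spacing (a simplification).
bp_per_shl_unit = CORE_DNA_BP / (SHL_MAX - SHL_MIN)


def approximate_shl(offset_bp: float) -> float:
    """Map a signed bp offset from the dyad to an approximate SHL value."""
    return offset_bp / bp_per_shl_unit


print(f"bp per superhelical turn: {bp_per_superhelical_turn:.1f}")
print(f"bp per SHL unit: {bp_per_shl_unit:.1f}")
print(f"SHL at +73 bp from the dyad: {approximate_shl(73):.1f}")
```

On this rough estimate, one SHL unit spans a little over 10 bp, about one turn of the double helix, whereas one full turn of the superhelix spans nearly 90 bp.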

Fig. 1.6

Histone–DNA interactions in the nucleosome core particle. (a) Half of the pseudosymmetric nucleosome core particle (PDB ID: 1KX5). Histones are depicted in cartoon representation and colored as shown. DNA is depicted in sticks representation with the superhelical locations numbered (dyad = SHL0). Histone–DNA interactions for (b) H3/H4 dimer and (c) H2A/H2B dimer. Key histone side chains are shown as sticks. DNA phosphates at positions where the minor groove faces the histone dimers are shown as spheres. Hydrogen bonds with histone side chains and main chains are colored orange and red, respectively

A 146-bp palindromic DNA sequence was used to solve the 2.8 Å structure of the nucleosome core particle in anticipation that each half of the pseudosymmetric octamer might wrap an identical 73 bp DNA related through the twofold symmetry of the complex. Instead, the crystal structure showed that the histone octamer binds to the DNA sequence centered on a single base pair at the dyad, consistent with site-directed hydroxyl radical mapping studies (Flaus et al. 1996). Subsequent crystal structures and biochemical mapping studies confirm that the nucleosome dyad is centered on a base pair, not between two base pairs. A base pair at the dyad therefore splits the remaining DNA of the 146 bp sequence into 73 and 72 bp halves. Overwinding and stretching of a specific segment in the 72 bp half accommodates the difference in length of each half. This ability of the nucleosome core particle to accommodate stretching appears to be dependent on both DNA sequence and the architecture of the histone octamer and permits wrapping of 145–147 bp of DNA (Richmond and Davey 2003; Ong et al. 2007; Makde et al. 2010; Vasudevan et al. 2010).
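The consequence of centering the dyad on a base pair is easiest to see by splitting the DNA explicitly. The sketch below is a minimal illustration that assumes a single base pair always sits on the dyad and divides the remainder as evenly as possible. The function name is hypothetical.

```python
def split_at_dyad(total_bp: int):
    """Split DNA around one dyad base pair; return (longer_arm, dyad, shorter_arm)."""
    remaining = total_bp - 1          # one base pair sits on the dyad axis
    longer = (remaining + 1) // 2     # the extra base pair, if any, goes here
    shorter = remaining // 2
    return longer, 1, shorter


# Core particle DNA lengths in the 145-147 bp range mentioned in the text.
for n in (145, 146, 147):
    longer, dyad, shorter = split_at_dyad(n)
    print(f"{n} bp -> {longer} + {dyad} + {shorter} (asymmetry: {longer - shorter} bp)")
```

Only the even-length 146 bp fragment yields unequal arms (73 and 72 bp), which is the one-base-pair difference that the stretching described above absorbs.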

Contacts between the histones and DNA occur at regular intervals, once every helical turn, where the minor groove approaches the histone octamer. With few exceptions, the direct histone–DNA contacts involve the phosphodiester backbone rather than the pyrimidine and purine rings of the individual nucleotides. Each histone-fold pair organizes 27–28 bp of DNA (Fig. 1.6b, c). Two interface types define the histone-fold DNA interface. The α1α1 type interface utilizes the N-termini of both α1 helices to bind to the DNA backbone near the center of each segment. This is flanked by two L1L2 type interfaces employing the L1 and L2 loops and the C-terminal end of the α2 helix. In this manner, the histone-folds organize the central 121 bp of nucleosomal DNA. The remaining DNA, approximately 13 bp on either end of the nucleosome core particle, is organized by extensions from the histone-folds, most notably the αN helix of H3. In total, the octamer interfaces with the DNA in 14 discrete places where the minor groove faces the histone octamer: eight L1L2 type (two from each dimer), four α1α1 type (one from each dimer), and one each through the H3 αN helices.
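The site count and the DNA accounting in this paragraph can be tallied directly. The sketch below uses only the numbers given in the text; the dictionary keys and variable names are illustrative.

```python
# Tally of minor-groove contact sites, using the numbers in the text.
HISTONE_FOLD_DIMERS = 4                   # two H3/H4 and two H2A/H2B
SITES_PER_DIMER = {"L1L2": 2, "alpha1alpha1": 1}
H3_ALPHA_N_HELICES = 2                    # one contact through each H3 alphaN helix

sites = {kind: n * HISTONE_FOLD_DIMERS for kind, n in SITES_PER_DIMER.items()}
sites["H3_alphaN"] = H3_ALPHA_N_HELICES
total_sites = sum(sites.values())

# DNA accounted for: central stretch bound by histone-folds plus both ends.
CENTRAL_BP = 121
END_BP = 13                               # approximate, per end
total_bp = CENTRAL_BP + 2 * END_BP

print(sites, "->", total_sites, "contact sites")
print(f"{CENTRAL_BP} + 2 x {END_BP} = {total_bp} bp")
```

The tally reproduces the 14 contact sites and shows that the central and terminal segments together account for the full 147 bp upper limit of the core particle DNA.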

Several general features contribute to the histone–DNA interface. (1) Hydrogen bonds and salt bridges exist between the DNA phosphate groups and the basic guanidinium and amino moieties on arginine and lysine side chains, as well as side chain hydroxyl groups. (2) Roughly equal numbers of hydrogen bonds are direct versus mediated through structured water molecules. Interestingly, there are significantly more water-mediated hydrogen bonds with the DNA bases as compared to direct hydrogen bonds (Davey et al. 2002). (3) Arginine side chains penetrate the DNA minor groove at regular intervals when it faces the histone octamer, effectively narrowing the minor groove. (4) Widespread nonpolar contacts exist between histones and the deoxyribose rings. (5) Hydrogen bonds are found between the phosphate groups and main chain amides near the C-terminal ends of α1 and α2 helices. (6) The helix dipoles from the α1 helices of H3, H4, and H2B as well as all of the α2 helices are directed at single phosphate groups of the adjacent DNA backbone.

One implication of the lack of base specificity of the histone–DNA interaction network is the ability to accommodate almost any DNA sequence. However, the global determination of nucleosome positioning in vivo demonstrates several patterns including a high prevalence of TA base pairs and GC-rich sequences where the minor grooves and major grooves approach the histone octamer, respectively (Segal et al. 2006; Segal and Widom 2009). While this sequence specificity could result from direct recognition of the bases, the several direct interactions of this nature observed in the nucleosome core particle structure are inadequate for specific base pair recognition (Davey et al. 2002; Richmond and Davey 2003). Furthermore, the more abundant water-mediated hydrogen bonding to the bases allows for plasticity to accommodate variable sequences. Thus, much of the intrinsic sequence specificity for nucleosome formation and positioning likely results from the inherent ability of the sequence to contort to match the contour of the octamer surface. For example, the flexible TA sequence allows for maximal compression at the minor groove facing the histone octamer. Recent crystal structures of the nucleosome core particle with different DNA sequences (Luger et al. 2000; Richmond and Davey 2003; Ong et al. 2007; Makde et al. 2010; Vasudevan et al. 2010) collectively demonstrate invariant positioning of phosphate groups where the minor groove approaches the octamer surface. Sequence-dependent structural differences are reflected in DNA stretching that is accommodated by increasing DNA twist as well as variations in DNA bending between sites of interaction with the octamer.

1.3.6 The Nucleosome Core Particle Surface and Interactions

The 200 kDa nucleosome core particle is a disk-shaped complex with a diameter of approximately 100 Å. The height of the disk varies greatly, with a 25 Å minimum at the dyad and a maximum approaching 60 Å near the H2B αC helices. Varying contours furnish the core particle with a multifaceted, solvent accessible surface totaling 74,000 Å² (Fig. 1.7b). The exposed phosphodiester backbone at the perimeter of the disk presents a highly negative electrostatic surface (Fig. 1.7a). An additional negatively charged surface, often referred to as the “acidic patch,” is found on each H2A/H2B dimer (Fig. 1.7a). This acidic patch is important for higher-order chromatin compaction through binding to the H4 N-terminal tail and may be a hot spot for nucleosome recognition by chromatin-associated proteins. In contrast to the nucleosomal disk, the histone tails have a substantial positive electrostatic potential, owing to the density of basic amino acids (Fig. 1.7a). The length and conformational flexibility of the tails allow them to extend considerably from the disk surface. Maximally extended, the 36 amino acid H3 N-terminal tail can span 125 Å, a distance greater than the diameter of the disk itself.
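The reach of a fully extended tail follows from its residue count. The sketch below back-calculates the per-residue rise implied by the figures above (125 Å over 36 residues) and reuses it to estimate the span of a tail of arbitrary length. The helper function is hypothetical, and applying the same rise to other tails is an assumption rather than a result from the text.

```python
H3_TAIL_RESIDUES = 36            # from the text
H3_TAIL_MAX_SPAN_A = 125.0       # from the text, in angstroms
DISK_DIAMETER_A = 100.0          # approximate disk diameter (from the text)

# Rise per residue implied by the quoted numbers for a maximally extended chain.
rise_per_residue_a = H3_TAIL_MAX_SPAN_A / H3_TAIL_RESIDUES


def extended_span_a(n_residues: int, rise_a: float = rise_per_residue_a) -> float:
    """Estimate the span of a fully extended tail in angstroms (assumed uniform rise)."""
    return n_residues * rise_a


span_in_disk_diameters = extended_span_a(H3_TAIL_RESIDUES) / DISK_DIAMETER_A

print(f"Implied rise per residue: {rise_per_residue_a:.2f} A")
print(f"H3 tail span / disk diameter: {span_in_disk_diameters:.2f}")
```

The implied rise of roughly 3.5 Å per residue is in the range usually quoted for an extended polypeptide chain, and the resulting span is about 1.25 disk diameters.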

Fig. 1.7

Nucleosome core particle surface and interactions. (a) Electrostatic potential and (b) van der Waals surface representations of the nucleosome core particle. The H2A/H2B acidic patch is labeled. Electrostatic surface prepared using APBS (Baker et al. 2001). (c) Structures of nucleosome core particle in complex with LANA (left, PDB ID: 1ZLA), RCC1 (center, PDB ID: 3MVD) and the BAH domain of Sir3 (right, PDB ID: 3TUA). Arginine side chains interacting with the acidic patch of the H2A/H2B dimer are shown in spheres representation. Disk surface (top) and profile (bottom) views of each complex are shown

Interactions with the nucleosome core particle follow two paradigms: binding to histone tails and/or binding to the nucleosomal disk. The ten histone tails provide flexible platforms for nucleosomal interaction. A wealth of structural studies has illustrated the recognition of histone tails by enzymes that add and remove posttranslational modifications. Moreover, families of protein domains have been defined that bind histone tails in the context of specific types of posttranslational modifications (Taverna et al. 2007). Frequently, modification of adjacent positions enhances or abolishes binding (Winter and Fischle 2010). With multiple such domains in single proteins or within protein complexes, the recruitment of chromatin factors to genetic loci can be tuned based on a local subset of modifications (Ruthenburg et al. 2007).

In addition to binding to histone tails, numerous chromatin factors recognize surfaces of the nucleosomal disk. Recent advances in the structural characterization of the nucleosome core particle bound to peptides and proteins have shed light on several of these interactions (Fig. 1.7c). The Kaposi’s sarcoma-associated herpesvirus LANA (Latency-Associated Nuclear Antigen) peptide binds to the acidic patch of the H2A/H2B dimer to anchor its viral genome to host chromatin (Fig. 1.7c) (Barbera et al. 2006). Similarly, the β-propeller protein RCC1 (Regulator of Chromosome Condensation 1), an activator of the Ran small GTPase, uses one loop to engage the acidic patch while a second loop binds to nucleosomal DNA (Makde et al. 2010). In a third example, the BAH (Bromo-Associated Homology) domain of the yeast silencing protein Sir3 (Silent information regulator 3) binds to surfaces of the nucleosomal disk, including the acidic patch, and the H4 N-terminal tail (Armache et al. 2011). In each of these crystal structures, a single arginine side chain is inserted into the acidic patch of the H2A/H2B dimer (Fig. 1.7c). Additional interactions of the acidic patch with HMGN2 (high mobility group nucleosomal protein-2) (Kato et al. 2011) as well as the H4 N-terminal tail (Dorigo et al. 2004) raise the possibility that this represents a hot spot for recognition of the nucleosome core. For complexes eluding crystallization, multidisciplinary structural studies have been fruitful. Using such approaches, it was established that the chromatin remodeler Imitation SWitch 1a (ISW1a) binds to multiple DNA sites within two adjacent nucleosomes to effect nucleosome spacing (Yamada et al. 2011). While these studies represent substantial breakthroughs, the nucleosomal recognition of countless other chromatin-associated proteins remains obscure. Full characterization of these interactions will likely reveal new modes of chromatin interactions.

1.3.7 Nucleosome Core Dynamics: PTMs, Variants, DNA Breathing, and Suboctameric Particles

Much like the sequence of the histones themselves, the structure of the nucleosome core particle is highly conserved throughout eukaryotic organisms. Since the solution of the core particle containing Xenopus histones was reported, structures have been solved using histones from yeast, fly, and human (White et al. 2001; Tsunaka 2005; Clapier et al. 2008). While sequence differences result in minor changes to the composition of exposed surfaces and reveal complementary coevolution within the hydrophobic core, the architecture of the complexes remains remarkably similar (Fig. 1.8). The overwhelming similarity in all core particle structures to date might lead to the false assumption that the particle is an inert structure. To the contrary, the particle is quite dynamic, with variations in composition and conformation on three levels: (1) chemical composition of the histones; (2) association of DNA with histones; and (3) stoichiometry of the histone subunits. Additionally, several noncanonical architectures have been proposed that may replace canonical core particles in certain specific contexts.

Fig. 1.8
figure 8

Nucleosome core particle structures from different histone sequences. Nucleosome core particle structures using Xenopus laevis (PDB ID: 1KX5), Saccharomyces cerevisiae (PDB ID: 1ID3), Drosophila melanogaster (PDB ID: 2NQB), and Homo sapiens (PDB ID: 3AFA) histones. The structure containing the human centromeric H3 variant, CENP-A, is also shown (PDB ID: 3AN2)

The chemical composition of histones is dynamically controlled through the addition and removal of posttranslational modifications (PTMs) and the incorporation of histone variants. Histones harbor an extraordinary variety and density of posttranslational modifications (Kouzarides 2007; Bannister and Kouzarides 2011). At least nine distinct types of histone PTMs have been observed. Certain types have been well characterized, such as acetylation, methylation of lysines and arginines, phosphorylation, and ubiquitylation, while current understanding of other types, including sumoylation, ADP-ribosylation, deimination, proline isomerization, and proteolysis, is incomplete. It is postulated that these modifications work in a combinatorial manner to choreograph the recruitment of downstream effectors of genome-templated activities (Strahl and Allis 2000; Ruthenburg et al. 2007). Furthermore, the canonical histones can be replaced by sequence variants that carry variant-specific modifications (Henikoff et al. 2004). Collectively, these changes alter the electrostatic and van der Waals surfaces of the histones. This allows for differential association of chromatin factors that recognize specific modification states (Yun et al. 2011) or variants (Zhou et al. 2011). Modifications and variants can also lead to altered stability of histone–DNA (Neumann et al. 2009; Simon et al. 2011) and/or histone–histone interfaces within the core particle (Hoch et al. 2007) and with adjacent nucleosomes (Shogren-Knaak et al. 2006), thereby controlling chromatin stability and DNA accessibility on local and more global levels.

Crystallization of the nucleosome core particle locks the DNA in place, selecting for stable DNA and protein conformations (Andrews and Luger 2011). However, bulk and more recent single-molecule experiments demonstrate that nucleosomal DNA transiently detaches from the histone octamer (Anderson and Widom 2000; Anderson et al. 2002; Buning and van Noort 2010). Importantly, this unwrapping is seen in the more physiological context of nucleosome arrays in addition to single core particles, ruling out artifacts caused by DNA ends (Poirier et al. 2008, 2009). This phenomenon is primarily observed near the entry and exit sites of DNA (measured equilibrium constant of ~0.2–0.6) but can occur to a lesser degree elsewhere in the core particle (Buning and van Noort 2010). This asymmetry is consistent with the crystallographic observation of overall weaker histone–DNA contacts near the DNA ends than at more central DNA locations. One important implication of transient unwrapping of DNA is the ability of DNA- and histone-binding proteins to compete for access to buried sites within the core particle, which may be critical for transient disruption and reassembly of the nucleosome structure during transcription and DNA replication. Notably, posttranslational modifications of histones in positions underlying nucleosomal DNA (Neumann et al. 2009; Simon et al. 2011) and certain histone variants (Bao et al. 2004; Tachiwana et al. 2011) shift the equilibrium to a more unwrapped state. This is best characterized for the centromeric H3 variant. In the crystal structure of a nucleosome containing the human CENP-A variant, only the central 121 bp of DNA are organized, owing to a shorter αN helix (Fig. 1.8) (Tachiwana et al. 2011). This leads to increased accessibility of the terminal 13 bp of DNA at either end of the core particle (Dechassa et al. 2011; Tachiwana et al. 2011).
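To give a feel for what an equilibrium constant of ~0.2–0.6 implies, the sketch below converts it into the fraction of time the DNA end spends unwrapped. It assumes a simple two-state model in which K is defined as [unwrapped]/[wrapped]; the chapter does not state this definition, so both the convention and the function name are illustrative assumptions.

```python
def fraction_unwrapped(k_eq):
    """Two-state occupancy, assuming K = [unwrapped]/[wrapped].

    The fraction of particles (or of time) in the unwrapped state
    is then K / (1 + K).
    """
    if k_eq < 0:
        raise ValueError("equilibrium constant must be non-negative")
    return k_eq / (1.0 + k_eq)


# Range quoted in the text for DNA near the entry and exit sites.
for k in (0.2, 0.6):
    print(f"K = {k:.1f} -> unwrapped fraction = {fraction_unwrapped(k):.2f}")
```

Under this reading, DNA near the entry and exit sites would be unwrapped for roughly one-sixth to three-eighths of the time, which is consistent with the idea that binding proteins can compete for otherwise buried sites.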

It can also be inferred from the assembly and disassembly of nucleosomes that several intermediate structures with suboctameric stoichiometries (i.e., lacking one or more histone heterodimers) are likely to exist, even if transiently (Fig. 1.9) (Zlatanova et al. 2009). The hexasome and tetrasome, lacking one and two H2A/H2B dimers, respectively, are two such possible complexes. These suboctameric complexes have been proposed based on the faster turnover of H2A and H2B than of H3 and H4 within chromatin (Kimura and Cook 2001; Thiriet and Hayes 2005; Zlatanova et al. 2009). Significant evidence suggests that a hexasome structure exists in the wake of transcription (Hutcheon et al. 1980; Jackson and Chalkley 1985; Jackson 1990; Locklear et al. 1990). Structural analysis of reconstituted hexasomes using small angle X-ray scattering and nuclease protection confirms a standard nucleosome architecture that protects only 110 bp of DNA (Arimura et al. 2012).

Fig. 1.9
figure 9

Scheme of suboctameric nucleosome particles. Representation of octameric nucleosome core particle, hexasome, tetrasome, and hemisome. Histones are represented by circles, colored as shown. DNA is represented by light blue lines. Double lines between histones denote histone-fold pairs; single lines represent four-helix bundle motifs

Additional, noncanonical complexes have been proposed containing one copy of each of the core histones, termed a hemisome, incorporating nonhistone proteins, and/or reverse DNA supercoils. This is exemplified by the conformation and composition of the centromeric nucleosome, for which a myriad of canonical and noncanonical structures have been proposed (Fig. 1.9). Most reports, including the crystal structure of the human centromeric nucleosome (Tachiwana et al. 2011), favor a conventional octameric nucleosome with two copies of the centromeric H3 in place of major H3 and a left-handed DNA wrap. However, atomic force microscopy and supercoiling analysis of centromeric nucleosomes from fly could suggest a right-handed hemisome (Dalal et al. 2007; Furuyama and Henikoff 2009). Other models for centromeric nucleosomes include an octameric structure with a right-handed DNA wrap, a tetrasome containing two copies of the centromeric H3 and canonical H4, and a hexasome and trisome with the centromeric H3 chaperone replacing one or both copies of the H2A/H2B dimer, respectively (Black and Cleveland 2011). While consensus with regard to the centromeric nucleosome structure remains elusive, the controversy serves to highlight the potential dynamic and polymorphic nature of nucleosomes in vivo.

1.4 Linker Histone and the Chromatosome

In most eukaryotic organisms, the H1 family of linker histones exists in nearly equimolar amounts compared to the histone core, suggesting a 1:1 stoichiometry (Woodcock et al. 2006). However, the linker histone is unequally distributed in a cell, with higher levels in condensed heterochromatic regions than in more open euchromatic regions. A single linker histone associates with 15–20 bp of linker DNA, increasing the nuclease protection of the core particle to ~167 bp (Noll and Kornberg 1977; Hayes and Wolffe 1993; Hayes et al. 1994; An et al. 1998a, b). The resultant complex, containing ~167 bp of DNA, the core histone octamer, and the linker histone, is known as the chromatosome (Simpson 1978). Together with the remaining length of linker DNA, the chromatosome forms the fundamental repeating unit of chromatin, the nucleosome.
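The DNA bookkeeping behind the core particle, chromatosome, and nucleosome definitions can be written out explicitly. The following is a minimal sketch that assumes a 147 bp core and about 20 bp of linker-histone-protected DNA (the upper ends of the ranges quoted in this chapter); the function and key names are hypothetical.

```python
CORE_BP = 147        # upper end of the 145-147 bp core range
H1_PROTECTED_BP = 20  # approximate linker DNA protected by the linker histone


def partition_repeat(repeat_bp):
    """Split one nucleosome repeat into core, H1-protected, and free linker DNA."""
    if repeat_bp < CORE_BP:
        raise ValueError("repeat length cannot be shorter than the core DNA")
    linker_total = repeat_bp - CORE_BP
    h1_bound = min(linker_total, H1_PROTECTED_BP)
    return {
        "core": CORE_BP,
        "h1_protected_linker": h1_bound,
        "chromatosome": CORE_BP + h1_bound,
        "free_linker": linker_total - h1_bound,
    }


# Repeat lengths spanning the 160-240 bp range cited earlier in the chapter.
for repeat in (160, 200, 240):
    print(repeat, partition_repeat(repeat))
```

With these assumptions, a 200 bp repeat splits into a 167 bp chromatosome plus 33 bp of remaining linker DNA, while very short repeats leave less than the full 20 bp available to the linker histone.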

The linker histone globular domain has two known DNA binding motifs on opposing faces, a winged-helix motif and a conserved basic surface, which allow the bridging of two DNA strands (Clore et al. 1987; Graziano et al. 1990; Ramakrishnan et al. 1993). In the absence of a high-resolution structure of the chromatosome, several models of binding of the linker histone to a single nucleosome have been extrapolated from biochemical studies of chromatosomes reconstituted in vitro (Zhou et al. 1998; Syed et al. 2010) and the effects of mutagenesis in vivo (Brown et al. 2006). The leading model suggests that the globular domain binds asymmetrically to the outside of the nucleosome core, simultaneously engaging DNA near the dyad and either one or both linker DNA segments exiting the core particle (Brown et al. 2006; Syed et al. 2010). In a second model, the globular domain binds to DNA inside the core particle, displacing core histone–DNA interactions (Pruss et al. 1996). In either model, the globular domain should affect the trajectory of DNA entering and exiting the nucleosome core.

The C-terminal domain (CTD) of linker histones is unstructured in solution but assumes regional secondary structure upon DNA binding (Vila et al. 2000, 2001a). It is a major determinant of H1’s association with and consequent modulation of chromatin (Lu and Hansen 2004). Two critical subdomains of the CTD have been identified that mediate the functions of one isoform, H1.0. Remarkably, the role of these subdomains is linked to overall amino acid composition and location relative to the globular domain, rather than defined primary sequences (Hansen et al. 2006; Lu et al. 2009). The N-terminal domain of linker histones contributes only minimally to chromatin binding and its function is unclear at this time (Vila et al. 2001b; Th’ng et al. 2005). Analogous to the core histone tails, the linker histone N- and C-terminal domains can extend substantial distances from the globular domain. This feature may allow contacts to be made with adjacent nucleosomes in folded chromatin.

Despite remaining heavily bound to chromatin, linker histones are more mobile than the core histones (Lever et al. 2000; Misteli et al. 2000). This mobility is modulated by linker histone posttranslational modifications and competition for chromatin binding with other chromatin architectural proteins including the High Mobility Group (HMG) proteins (Catez et al. 2004). In addition to binding nucleosomal DNA, linker histones interact with a myriad of other chromatin-related proteins (McBryant et al. 2010). It is suggested that much like the core histone tails, the CTDs of linker histones can adopt diverse structures to allow binding to a multitude of protein and DNA platforms.

1.5 Higher-Order Chromatin Structure

The nucleosome accounts for a small fraction of the genomic compaction that occurs in interphase and mitotic chromatin. The remainder of the compaction results from a hierarchical organization, collectively known as higher-order chromatin structure. Much like protein structure, higher-order chromatin structure can be broken down into primary, secondary, and tertiary structures. Similar to the primary structure (i.e., sequence) of proteins, the primary structure of chromatin describes the linear arrangement of nucleosomes on the DNA template. The resultant nucleosomal array resembles “beads on a string” with a width of 11 nm. Improved sequencing technologies have allowed the precise genome-wide mapping of the linear organization of nucleosomes, as well as of many histone variants and posttranslational modifications.

Continuing the analogy, the secondary structure of chromatin defines the local compaction of a nucleosomal array into what most believe to be a coiled fiber, roughly 30 nm in diameter. Over three decades of research have failed to reach consensus on the structure of what is termed the 30 nm fiber. Currently, two models are favored based on thorough in vitro analysis of defined reconstituted arrays. However, recent investigations challenge even the existence of the 30 nm fiber in vivo (Eltsov et al. 2008; Maeshima et al. 2010; Joti et al. 2012; Nishino et al. 2012). In the next structural level, the tertiary structure of chromatin describes the interstrand contacts between secondary structural elements comparable to a protein fold. The dynamic nature and overall complexity of tertiary structure in interphase and mitotic chromatin has made its characterization challenging. Not surprisingly, the three levels of chromatin structure are interconnected. For example, the linear organization of nucleosomes imparts constraints on the 30 nm fiber structure. Structural details of chromatin secondary structure are discussed in the following section. Further details of genome-wide chromatin structure are addressed in a later chapter.

1.5.1 Secondary Structure of Chromatin

By 1980, the 30 nm fiber had been observed by thin section electron microscopy of metaphase chromosomes (Marsden and Laemmli 1979) and small-angle X-ray scattering in chicken erythrocytes (Langmore and Schutt 1980). The fiber was shown to relax into an 11 nm “beads on a string” conformation in subphysiologic ionic strengths and to a lesser degree upon depletion of linker histone (Thoma et al. 1979). Early studies of the 30 nm fiber confirmed side-to-side packing of nucleosomes oriented nearly parallel to the fiber axis (McGhee et al. 1983; Widom and Klug 1985). Subsequent studies aimed at determining the path of DNA within the 30 nm fiber led to proposed structures of two basic classes: (1) the one-start model consists of bent linker DNA connecting sequential nucleosomes along a helical path to form a solenoid structure (Finch and Klug 1976; Thoma et al. 1979; McGhee et al. 1983; Widom and Klug 1985) and (2) the two-start model is built from nucleosomes connected in a zigzag pattern by straight linker DNA in a radial (the crossed-linker model) or longitudinal (the helical ribbon model) arrangement (Thoma et al. 1979; Worcel et al. 1981; Woodcock et al. 1984; Williams et al. 1986). These models place linker DNA and the linker histone in the interior of the fiber. One characteristic difference between the models is the conformation of linker DNA, being straight in the two-start models and bent in the one-start model. For many years, differentiation between these models was hampered by heterogeneous arrays with mixed linker lengths and histone composition. More recent advances in the reconstitution of arrays with defined nucleosome positions (Dorigo et al. 2003; Huynh et al. 2005) have allowed for detailed structural characterizations of the 30 nm fiber, leading to two distinct models and continued controversy.

Richmond and colleagues observed a two-start organization in short model arrays compacted into a 30 nm fiber. The distribution of chromatin fragments following disulfide cross-linking of spatially adjacent nucleosomes and linker DNA digestion was consistent only with the two-start fiber (Dorigo et al. 2004). The two-start conformation was unaffected by nucleosome repeat lengths up to 208 bp and by the presence of linker histone. The group’s subsequent 9 Å crystal structure of a tetranucleosome with 167-bp repeat length and without linker histone showed nearly straight, zigzagging linker DNA between two nucleosome stacks, again suggesting a two-start conformation (Schalch et al. 2005). The tetranucleosome structure was used to build a model of the 30 nm fiber with characteristics similar to the aforementioned crossed-linker model (Fig. 1.10a). The resultant fiber has a diameter of approximately 25 nm. The crossed-linker arrangement places nucleosome N in proximity to nucleosomes N ± 2. Importantly, the model locates the H4 tail from one nucleosome in close proximity to the H2A/H2B acidic patch in a spatially adjacent nucleosome, consistent with cross-linking observed between this acidic patch and the H4 tail (Dorigo et al. 2004).
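The N ± 2 relationship in the two-start arrangement can be illustrated with a toy indexing scheme. This is only an idealized sketch of the connectivity, not a structural model: it assumes that nucleosomes alternate strictly between two stacks along the DNA, and the function names are hypothetical.

```python
def two_start_stacks(n_nucleosomes):
    """Assign nucleosomes 0..n-1 to two stacks by index parity (idealized zigzag)."""
    even = [i for i in range(n_nucleosomes) if i % 2 == 0]
    odd = [i for i in range(n_nucleosomes) if i % 2 == 1]
    return even, odd


def stack_neighbors(i, n_nucleosomes):
    """Nucleosomes stacked next to nucleosome i in the same stack (i - 2 and i + 2)."""
    return [j for j in (i - 2, i + 2) if 0 <= j < n_nucleosomes]


stack_a, stack_b = two_start_stacks(6)
print("stack A:", stack_a)
print("stack B:", stack_b)
print("stacking neighbors of nucleosome 3:", stack_neighbors(3, 6))
```

In this picture, nucleosomes that are sequential along the DNA (N and N + 1) sit in opposite stacks and are joined by the straight linker, whereas face-to-face contacts occur between N and N ± 2.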

Fig. 1.10
figure 10

Secondary structure of chromatin: the 30 nm fiber. (a) Two orthogonal views of a 25 nm diameter two-start model for the 30 nm fiber. Pairs of nucleosomes that are adjacent in the linear DNA sequence (the two-start repeat) are colored similarly. Linker DNA is present in this model. Coordinates kindly provided by Tim Richmond. (b) Two orthogonal views of a one-start model of the 30 nm fiber (33 nm diameter model corresponding to 178–197 bp nucleosome repeat length). Sets of nucleosomes in the same solenoid layer (also sequential in the linear DNA sequence) are colored similarly. Linker DNA is not shown in this model. Coordinates kindly provided by Phillip Robinson. In both models, nucleosomes are numbered starting from an arbitrarily labeled Nth nucleosome to aid in distinguishing the conformations of the one- and two-start fibers

Rhodes and colleagues reached a different conclusion using electron microscopy to measure physical parameters of long chromatin fibers containing stoichiometric linker histone (Robinson et al. 2006). They were able to distinguish two distinct fiber diameters dependent on nucleosome linker length. Linker lengths between 30 and 60 bp resulted in a 33 nm diameter, while fibers with longer linkers of 70–90 bp had a 43 nm diameter. The observation of similar fiber diameter over large ranges of linker DNA length is suggestive of a one-start helix. Subsequent modeling of the 30 nm fiber yielded a helical arrangement with interdigitation of nucleosomes from subsequent turns (Fig. 1.10b). Importantly, this model tolerates varying linker lengths without perturbation of fiber parameters. Further modeling of the 30 nm fiber using the same parameters suggested potential two-start solutions in addition to the one-start model (Wong et al. 2007).
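The step-like dependence of fiber diameter on linker length described above can be captured as a simple lookup. The sketch below encodes only the two ranges quoted in this paragraph and deliberately returns nothing for linker lengths outside them; the function name is hypothetical.

```python
def reported_fiber_diameter_nm(linker_bp):
    """Fiber diameter class (nm) for a given linker length, per the ranges in the text.

    Returns None for linker lengths outside the two ranges described,
    since the text gives no value for them.
    """
    if 30 <= linker_bp <= 60:
        return 33
    if 70 <= linker_bp <= 90:
        return 43
    return None


for linker in (30, 45, 60, 70, 90):
    print(f"{linker} bp linker -> {reported_fiber_diameter_nm(linker)} nm fiber")
```

The point of the plateau is that the diameter does not grow steadily with linker length, as might be expected if straight linkers crossed the fiber interior, but instead stays constant within each range.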

Subsequent single-molecule force spectroscopy measurements supported both one-start and two-start models, with a dependence on linker length (Kruithof et al. 2009). Additionally, limited formaldehyde cross-linking of compacted reconstituted chromatin fibers followed by decompaction in low ionic strength and electron microscopic visualization revealed heteromorphic fibers (Grigoryev et al. 2009). These fibers, while predominantly two-start in nature, contained intervening segments resembling solenoid conformations. Thus, at least in vitro, both one- and two-start conformations may contribute to secondary chromatin structure. The relative contributions may be tunable and, among other factors, depend on nucleosome repeat lengths. While substantial progress has been made, it is clear that much remains to be determined regarding chromatin higher-order structure in vivo.

1.6 Perspective

Over the past several decades, enormous strides have been made in the description of chromatin structure. The nucleosome core particle has been defined at atomic resolution alone and in complex with proteins. Interrogation of chromatin secondary structure and the function of the linker histone has led to models for the chromatosome and the 30 nm fiber. Despite these major advances, additional work is necessary to bring further clarity to the nature and regulation of chromatin structure. Future exploration of higher-order chromatin structures and the coordinated recruitment of chromatin-associated factors in genome-templated processes promises to heighten overall understanding of chromatin structure and function.