
1 Introduction

Transcription is a fundamental cellular process that controls gene expression. Precise regulation of transcription is essential for normal cell growth, differentiation, and function. Central players in this process are the general transcription machinery including RNA polymerase and its associated factors, DNA sequence-specific transcription factors, and a plethora of coregulators, which include coactivators that assist transcription activation and corepressors that repress transcription. Nuclear receptors are ligand-dependent transcription factors that are activated upon ligand binding. They interact with a variety of coactivators and recruit them to target gene promoter/enhancer regions to form large protein complexes and activate transcription. More than 300 coregulators have been identified so far (Nuclear Receptor Signaling Atlas, www.Nursa.org). They have diverse functions and are involved in different steps of transcription on different genes. These proteins can be chromatin remodeling enzymes, posttranslational modification enzymes, RNA splicing factors, or scaffolding proteins/bridging factors to bring other enzymatic coregulators to nuclear receptor complexes and stabilize general transcription machinery [1]. Understanding the structural basis of nuclear receptor/coactivator complexes provides valuable information on how different types of coactivators precisely contribute to nuclear receptor-mediated transcriptional activation.

Most nuclear receptor family members have generally similar domain structures. They have a conserved DNA-binding domain (DBD) in the central region that recognizes specific DNA-responsive elements, a C-terminal ligand-binding domain (LBD) that binds ligands and recruits ligand-dependent coactivators (AF-2), and an N-terminal variable region that often contains a constitutive activation function (e.g., AF-1) through which the receptor can also bind coactivators. Crystal structures of certain regions of nuclear receptors, especially the LBD, have attracted much attention. Such studies provide valuable insights for understanding ligand-activated receptor function and for the therapeutic design of nuclear receptor antagonists. The X-ray structural studies of nuclear receptor domains have been reviewed recently [2,3,4] and will not be discussed here. Crystal structural studies of coactivators, however, are very limited because of their intrinsically disordered regions, artifacts introduced by the extensive truncation that their large sizes require, and conformational changes caused by crystal packing under the various chemical conditions used for crystallization. Here we will focus on the current understanding of several coactivator structures in the context of our recent progress on the structural organization of nuclear receptor/coactivator complexes.

2 Structural Studies of Individual Coactivators

2.1 Steroid Receptor Coactivator (SRC)

The existence of common limiting intermediary factors shared by different steroid receptors was long speculated following the observation of a squelching effect between different receptors or different activation function domains [5, 6]. Steroid receptor coactivator-1 (SRC-1/NCOA1) was the first coactivator identified through a yeast two-hybrid screen using the progesterone receptor LBD as a bait. Its overexpression enhances the receptor activity without altering basal activity of the promoter and inhibits the squelching effect [7]. Two other steroid receptor coactivator family members were later identified as SRC-2 (TIF2/GRIP1/NCOA2) [8, 9] and SRC-3 (ACTR/AIB1/RAC3/pCIP/TRAM1/NCOA3) [10,11,12,13,14]. They have similar domain structures and are approximately 160 kDa in size, and thus are often referred to as the p160 family. These three coactivators interact with and activate many different nuclear receptors. They serve as primary coactivators and scaffolding proteins to recruit other secondary coactivators to nuclear receptor-targeted DNA-binding sites. The SRCs play important roles in regulating reproduction, metabolism, circadian biology, and cancer development [15,16,17,18].

The structure of SRCs can be divided into five domains (Fig. 1a). The N-terminus is a highly conserved bHLH-PAS (basic helix-loop-helix-Per Arnt Sim) domain. This domain is involved in the interaction between SRC and several secondary coactivators [19,20,21,22], as well as regulating SRC nuclear localization and protein turnover [23]. A Ser/Thr-rich region is targeted by many different posttranslational modifications (e.g., phosphorylation, monoubiquitination, and polyubiquitination) to control a SRC transcriptional time clock (activation and degradation) [24]. The central region is a RID domain (receptor-interacting domain); it interacts with a nuclear receptor LBD upon ligand activation. The C-terminal region of SRCs contains two activation domains: the CID domain (CBP/p300 interaction domain) that interacts with the histone acetyltransferase CBP/p300 to promote histone acetylation (AD1) and the HAT domain (AD2) that contains a weak acetyltransferase activity [13, 25] and later recruits histone methyltransferases CARM1 (coactivator-associated arginine methyltransferase) and PRMT1.

Fig. 1

SRC domain organization. (a) Schematic representation of SRCs. L represents LXXLL motif. (b) Crystal structure of SRC-2 LXXLL motif (NR box II, yellow) interacting with diethylstilbestrol (magenta)-bound ER LBD dimer (PDB 3ERD). The LXXLL motif binds to a hydrophobic groove in ER LBD formed by helices 3, 4, 5 (light blue) and helix 12 (green)

Most of the prior SRC structural studies have focused on the RID domain. There are three conserved LXXLL motifs (L represents leucine and X represents any amino acid) present in the RID domain. These motifs are also named NR boxes for their specific interactions with ligand-bound nuclear receptors [26]. Crystallographic studies of NR box peptides bound to various nuclear receptors demonstrate that these peptides form amphipathic α-helices with leucine residues lined up on one side to contact a hydrophobic groove formed at the surface of the agonist-bound receptor LBD [27] (Fig. 1b). The NR LBD is usually a three-layer sandwich-shaped structure consisting of 12 α-helices. Helix 12 is highly mobile in the absence of ligand binding (see review [4]). Agonist binding induces its transition from a disordered to an ordered structure [4], which then forms the SRC NR box-interacting hydrophobic groove together with helices 3, 4, and 5 [27,28,29]. Two highly conserved glutamate and lysine residues outside the hydrophobic groove also form a “charge clamp” with the LXXLL motifs to orient and pack the motifs into the coactivator-binding site [29].
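The LXXLL pattern defined above is simple enough to locate by pattern matching. The following is a minimal sketch, not a validated motif-prediction tool: the function name `find_lxxll` and the test sequence are hypothetical and chosen only for illustration, and the scan considers primary sequence alone, so it cannot tell whether a match actually forms an amphipathic helix or binds a receptor LBD.

```python
import re
from typing import List, Tuple

# L, any two residues, then LL. The lookahead lets overlapping
# candidates be reported instead of only the first of each pair.
LXXLL = re.compile(r"(?=(L[A-Z]{2}LL))")


def find_lxxll(seq: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched motif) for each LXXLL hit."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in LXXLL.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any real SRC protein.
    toy = "AAGSLTELLKNAAALLPSLQRLLGG"
    for pos, motif in find_lxxll(toy):
        print(pos, motif)
```

Applied to a real p160 sequence, a scan of this kind would only list candidate NR boxes; which candidates are functional still has to be established experimentally, as in the crystallographic studies cited above.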

The structures of other regions of SRCs remain undetermined. An NMR study on SRC-3 and CBP interaction domains indicates that both domains are intrinsically disordered when isolated [30]. However, they cooperate with each other to fold “synergistically” into a helical heterodimer [30]. This induced structure upon contact is not unique to SRC-CBP interaction domains. Many transcription factors interacting with CBP/p300 also have this structural feature to allow specific protein-protein interactions [31] (see below).

2.2 CBP/p300

CBP (CREB-binding protein [32]) and its paralog E1A-associated p300 proteins [33, 34] are essential coactivators for many transcription factors including nuclear receptors [35,36,37,38]. They play important roles in regulating cell growth, transformation, differentiation, apoptosis, and development [37, 39, 40]. The two proteins can function as bridging factors to connect transcription factors with basal transcriptional machinery, as protein scaffolds to build up multicomponent transcription factor complexes or mainly as protein and histone acetyltransferases to transfer an acetyl group from acetyl CoA to lysine residues in histones and their component substrates [39, 41] (Fig. 2a).

Fig. 2

CBP/p300 structures. (a) CBP/p300 catalyzes lysine acylation. R represents different acyl groups. (b) Schematic representation of CBP/p300 domains and folded domain structures. TAZ1 (PDB 1U2N); KIX (PDB 2LXT, KIX (blue) binds to MLL activation domain peptide (red) and CREB pKID peptide (purple)); catalytic core (PDB 4BHW); TAZ2 (PDB 1F81); NCBD (PDB 1KBH, NCBD (blue) binds to SRC-3 peptide (purple)). (c) Comparison of TAZ1 (left panel, white, PDB 2KA4) and TAZ2 (right panel, white, PDB 3T92) binding to TADs from different transcription factors. STAT2 (yellow, PDB 2KA4); HIF1α (purple, PDB 1L8C); RelA (blue, PDB 2LWW); CITED2 (green, PDB 1R8U); STAT1 (yellow, PDB 2KA6); p53 (green, PDB 2K8F); TCF3 (blue, PDB 2MH0)

CBP and p300 have a high degree of similarity and share 63% identical amino acids [39]. They are large 300 kDa proteins containing several folded functional domains (Fig. 2b) connected through regions predicted to be intrinsically disordered [31, 42]. The bromodomain, CH2 region (cysteine-histidine-rich region 2), and HAT domain constitute the catalytic core of CBP/p300. The CH1/TAZ1 (transcriptional adaptor zinc finger 1), KIX (CREB-binding domain), CH3/TAZ2, and NCBD (nuclear coactivator-binding domain) domains mainly mediate the interaction of CBP/p300 with a variety of transcription factors, viral oncoproteins, basal transcription machinery, and coactivators.

CBP and p300 are KAT3 (lysine acetyltransferase 3 family) enzymes, which are different from other KATs (HATs) in that they use a “hit-and-run” (Theorell-Chance) catalytic mechanism [43]. They do not form a stable ternary complex with substrates and acetyl CoA cofactors. After acetyl CoA binding, substrates associate with the CBP/p300 surface transiently to allow acetyl group transfer to lysine residues. This mechanism is proposed to contribute to a broad CBP/p300 substrate specificity unlike other KATs which require a more specific substrate-binding pocket [43, 44]. CBP and p300 acetylate both histones and nonhistone proteins. Histone tail acetylation neutralizes lysine-positive charges and decondenses chromatin; it is generally associated with transcriptional activation [45, 46]. CBP and p300 are able to acetylate all core histones [47]. Their HAT activity is essential for ligand-induced nuclear receptor-target gene transcription [48]. CBP and p300 also acetylate a number of transcription factors and coactivators, such as p53 [49], CREB [50], E2F [51, 52], GATA-1 [53], TFIIE, TFIIF [54], SRC-3 [55], and regulate their transcriptional activities.

In addition to catalyzing acetylation on a broad set of substrates, CBP and p300 also utilize a variety of acyl-CoAs as cofactors to mediate histone propionylation, butyrylation, crotonylation, succinylation, glutarylation, and β-hydroxybutyrylation [56] (Fig. 2a). These non-acetyl acylations are believed to be functionally different from acetylation and to exert unique regulation on gene transcription and chromatin structure [56]. p300-mediated histone butyrylation and crotonylation were also shown to strongly stimulate gene transcription in vitro [57, 58]. The relative concentrations of different acyl-CoAs, which are regulated by cellular metabolism, can determine which cofactor p300 preferentially uses [56]. For example, under low-glucose conditions, non-acetyl histone acylations are more common [59, 60]. Crystal structures of the p300 HAT domain in a complex with different acyl-CoA cofactors demonstrate that p300 has a deep aliphatic pocket in its active site that accommodates short-chain acyl groups and that is not present in other HATs such as GCN5. This unique feature also explains the broad acyltransferase activity of p300 [61]. The preferred HAT for ER complex and SRC-3 is p300.

The HAT domain contains 380 residues. The X-ray structure of the HAT domain and Lys-CoA inhibitor complex demonstrates that it consists of seven central β-strands surrounded by nine α-helices and several loops (Fig. 2b) [43]. The Lys moiety of Lys-CoA mimics the substrate Lys residue. An unusually long substrate-binding loop L1 in the HAT domain, which is found only in CBP/p300 and not in other HATs, covers the Lys-CoA and appears to influence substrate binding [43]. CBP/p300 HAT activity is regulated by its autoacetylation. Hyperacetylated CBP/p300 is much more active than the hypoacetylated form [62]. Hyperacetylation occurs in an autoregulatory loop, which is a lysine-rich intrinsically disordered region in the HAT domain [62, 63]. When hypoacetylated, the autoregulatory loop competes with substrate binding to the HAT active site. Its autoacetylation releases its binding and thus enhances the HAT activity [62].

In addition to the loop L1 and the autoregulatory loop, domains flanking the HAT domain also play an important role in regulating the HAT activity. X-ray structures of CBP/p300 HAT and flanking domains, bromodomain, and the CH2 region [63, 64] show that they form a compact module with intimate association between flanking domains and the HAT domain (Fig. 2b). The bromodomain recognizes acetylated substrates. It is a left-handed four-helix bundle linked by two interspersed loops, which form an active acetyl-lysine binding pocket [65, 66] (Fig. 2b). This domain is required for CBP/p300 binding to its substrate, chromatin binding, and its full HAT activity [63, 67,68,69,70]. The CH2 region contains a PHD (plant homeodomain) finger interrupted by a RING (Really Interesting New Gene) domain [64]. The PHD finger is connected to the HAT domain and also makes multiple contacts with the bromodomain through hydrogen bonds and hydrophobic interactions [63, 64]. It also plays a role in recruiting p300 to chromatin [70, 71]. RING domains are often found in E3 ubiquitin ligases to mediate substrate ubiquitination. The RING domain in p300, however, does not have ubiquitination activity [64]. Instead, it has an inhibitory role on the HAT activity. It contacts the loop L1 and is positioned over the HAT active site, partially blocking access to the HAT substrate-binding groove [64]. Deletion of the RING domain significantly increases p300 autoacetylation and substrate acetylation [64].

CBP/p300 serves as a docking platform for numerous other transcription factors, components of the general transcription machinery, and coactivators, through its transactivation domains engaging in protein-protein interaction. Many of their interaction partners contain intrinsically disordered transactivation domains and adopt folded structures upon binding to CBP/p300 [31, 42]. The KIX domain was originally identified based on its interaction with the KID (kinase-inducible domain) of CREB [72]. It is a 90-residue-long bundle of three α-helices and two additional 3₁₀ helices [73]. In addition to CREB, it also interacts with p53 [74], c-Myb [75], MLL [76], c-Jun [77], FOXO3a [78], BRCA-1 [79], SREBP [80], and STAT-1 [81] transcription factors. This domain has two binding surfaces for interacting with different transcription factors or with different transactivation domains in one protein simultaneously. Unstructured phosphorylated KID of CREB and the c-Myb activation domain fold into helical structures upon binding to a common binding site, a shallow hydrophobic groove formed by helices 1 and 3, at the KIX surface (Fig. 2b) [73, 75]. A second binding site at the opposite surface of KIX formed by helices 2, 3, and 3₁₀ is also a hydrophobic groove allowing the binding of the MLL or Jun activation domain [76, 77]. It was reported that MLL and c-Myb or MLL and p-KID form a stable ternary complex with KIX and the two binding events act cooperatively to enhance the protein-protein affinity [76, 82]. This interaction mechanism provides a structural basis for synergistic activation of transcription when CBP/p300 interacts with different transcription factors simultaneously. Some proteins, such as p53 and FOXO3a, have two disordered activation domains that each can interact with one of the KIX binding surfaces to enhance their binding affinities with the KIX domain [78, 83].

The TAZ1 at the CH1 region and TAZ2 at the CH3 region are also major domains interacting with transcription factors. They are zinc finger motifs with similar folds, each consisting of four amphipathic α-helices stabilized by binding of three zinc atoms. TAZ1 and TAZ2 differ in that their fourth helix adopts opposite orientations, resulting in different binding surfaces (Fig. 2b) [84]. They have different binding specificities for different subsets of intrinsically disordered transcription factor activation domains [42]. Comparison of structures of TAZ1 in a complex with transactivation domains from HIF1α, CITED2, STAT2, and NFκB reveals that these unstructured TADs usually have multiple amphipathic regions and fold into helical structures when interacting with TAZ1, but they do not have a fixed binding site. Instead, they wrap around the entire TAZ1 molecule along a hydrophobic groove depending on the amino acid sequences of the amphipathic regions [31] (Fig. 2c). TAZ2 is located close to the HAT domain. It interacts with numerous transcription factors [37]. Unlike TAZ1, TAZ2 has a hydrophobic docking site at the interface of helices 1, 2, and 3 for interacting with various disordered TADs and inducing helical structure folding [31] (Fig. 2c).

The NCBD domain at the C-terminus of CBP/p300 interacts with SRCs [30, 85], p53 [86] and IRF-3 [87]. Unlike other well-structured protein-protein interaction domains mentioned above, it has characteristics of a molten globule when not contacting its binding partners [30, 88]. NMR studies suggest that the free NCBD undergoes rapid reversible conformational exchange [89] and adopts different conformations upon binding to different proteins. It folds into a three-α-helix bundle when in contact with a SRC-3 CID region, which also transits from a disordered state into a three-helix structure. The two regions pack together to create an extensive leucine-rich hydrophobic core to stabilize the complex structure (Fig. 2b) [30]. When interacting with IRF-3, the NCBD folds into a three-helix structure, but contacts between these helices are different resulting in a different tertiary structure compared to SRC-bound NCBD [87, 89]. This feature of conformational flexibility could allow the NCBD to interact with different partners with optimized structural fit.

Since CBP/p300 interacts with numerous transcription factors and has a limited concentration in cells, it is important for a mechanism to exist that regulates its binding specificity with different proteins in response to external signals. The binding affinities of CBP/p300 with different partners can be positively or negatively regulated by partner protein phosphorylation, hydroxylation, and S-nitrosylation [42] as well as by PTMs on CBP/p300. For example, CARM1-mediated CBP/p300 methylation switches off its interaction with CREB and turns on a NR-activated gene transcription function [90, 91]. Similarly, phosphorylation of CBP S436 inhibits the interaction with CREB while enhancing its association with AP-1 and Pit-1 [92, 93]. Posttranslational modifications thus provide an important layer of regulation to control CBP/p300 specificity.

2.3 CARM1

CARM1 was originally identified in a yeast two-hybrid screen for proteins interacting with the AD2 domain of SRC-2/GRIP1 [94]. It synergizes with SRCs and CBP/p300 to activate NR-mediated target gene transcription [95, 96]. Loss of CARM1 in a mouse embryo significantly reduced estrogen-regulated gene transcription [97], indicating its important role in ER-mediated function. CARM1 belongs to a protein arginine methyltransferase (PRMT) family. It is a type I PRMT (PRMT4) that asymmetrically dimethylates arginines. It transfers methyl groups from S-adenosylmethionine (AdoMet) to a guanidino nitrogen of arginine leading to the formation of methylated arginine and S-adenosylhomocysteine (AdoHcy) (Fig. 3a). CARM1 methylates histones H3R17, H3R26 [98], and H3R42 [99], as well as nonhistone proteins including SRC [100], CBP/p300 [90, 91, 101], Sox2 transcription factor [102], Notch1 [103], several RNA-binding proteins [104, 105], and splicing and transcription elongation factors [106]. CARM1 knockout mice are smaller than wild-type littermates and die shortly after birth [97]. It plays an important role in T cell development [107], neural development [108], and proliferation and differentiation of adipocyte [109], chondrocyte [110], and pulmonary epithelial cells [111].

Fig. 3

CARM1 structures. (a) CARM1 catalyzes arginine mono- and dimethylation. (b) Schematic representation of CARM1 domains and structures. N-terminal domain dimer (PDB 2OQB); catalytic core dimer (PDB 5DX0)

CARM1 has three individual domains (Fig. 3b). The central region is a catalytic core that forms a head-to-tail dimer that is conserved in PRMTs. The catalytic PRMT core is folded into two domains that are connected by a conserved proline residue [112, 113]. The N-terminal part of the catalytic core is involved in cofactor binding. It contains a Rossmann fold structure [114], a sandwich structure consisting of four α-helices and five β-strands, and two terminal α-helices (αX and αY) (Fig. 3b) [112, 113]. This structure is conserved for AdoMet binding in SAM-dependent methyltransferases [115]. Cofactor binding induces a structural change of the αX region from a disordered structure to an α-helix, which then forms a deep binding pocket with other terminal helices and three β-strands in the Rossmann fold to bury the cofactor, restricting its accessibility only to the substrate arginine [113]. The C-terminal part of the catalytic core is a β-barrel (11 β-strands and 6 α-helices) and an arm (2 α-helices and 2 short 3₁₀ helices) involved in CARM1 dimerization. The interaction between the arm in one monomer and the Rossmann fold structure in the other monomer is important for the dimer formation. Both N- and C-domains of the catalytic core participate in the formation of an active arginine binding pocket which is located close to the cofactor-binding site. CARM1 has unique sets of substrates including histone H3R17. Structural comparison of CARM1 with PRMT1, PRMT3, and yeast Hmt1 catalytic cores demonstrates that CARM1 has a unique C-terminal extension (β16) that affects substrate-binding specificity [113]. Unlike PRMT1/PRMT3/Hmt1, CARM1 does not recognize a conserved substrate sequence motif. It does not have an acidic-rich area at the surface to provide initial binding affinity for basic-rich substrates. Rather, it is proposed that a narrow opening between the potential substrate-binding groove and the cofactor-binding site only accommodates a tight β-turn substrate conformation, which could explain the lack of flanking consensus sequences among CARM1 substrates [113]. Recent crystal structures of CARM1 in a complex with five different peptide substrates, including unmethylated and monomethylated H3R17 and nonhistone protein PABP1, indicate that all the substrates display a conserved core binding mode despite their different primary sequences [116]. The enzyme-substrate interactions are made primarily through hydrogen bonds between an Arg residue, the backbone of substrate flanking residues with a variety of sequences, and active site residues in CARM1. This unique backbone recognition may explain CARM1 substrate sequence diversity [116]. In addition to methyltransferase activity and dimerization, the catalytic core is also required for interacting with SRCs and its coactivator function [117].

Compared to other PRMTs, CARM1 has unique N- and C-terminal domains flanking the conserved catalytic core [118]. The N-terminal domain (28–140 aa) adopts a PH (pleckstrin homology) domain fold (two nearly perpendicular β-sheets capped by an amphipathic α-helix) and behaves as a dimer (Fig. 3b) [112]. The PH domain structure is found in a large family of proteins often involved in transient protein-protein interactions in response to upstream signals [119]. Interestingly, the density of this PH domain is not observed in a larger CARM1 protein structure (28–507 aa), suggesting that the PH domain is highly mobile [112]. We recently demonstrated that the N-terminal domain of CARM1 is mobilized upon formation of an estrogen receptor/coactivator complex and that it is involved in the interaction with p300 in the complex [120]. The C-terminal domain of CARM1 is intrinsically disordered [112]. It has a strong autonomous activation function [117]. Deletion of either the N- or the C-terminal domain abolishes CARM1 coactivator activity [117].

3 Structural Studies of Estrogen Receptor/Coactivator Complexes

Numerous crystallography and NMR studies described above shed light on how individual domains or motifs of coactivators interact with transcription factors and/or exert their enzymatic functions. How these domains cooperate with each other in full-length intact proteins and how receptors and coactivators function in the context of large protein complexes are less clear. Most coactivators and transcription factors have intrinsically disordered regions or flexible domains that only fold into a higher-order structure when interacting with their protein partners. Such a property limits structural studies on full-length coactivators, since it is nearly impossible to analyze such large complexes using X-ray crystallography and NMR due to limitations in protein molecular size and weight. Recent advances in single-particle electron cryomicroscopy (cryo-EM) now make solving large nuclear receptor/coactivator complexes possible.

Cryo-EM is a rapidly expanding methodology that is particularly well suited for studying three-dimensional structures of molecular machines in native solution or under chemically defined conditions without using negative stain or chemical fixatives. This method is ideal for specimens that are difficult to study by X-ray crystallography or NMR. Cryo-EM has been used to study macromolecular complexes of various sizes (50 kDa–30 MDa), shapes (spherical, filamentous, or amyloid), and symmetries, or even complexes that completely lack symmetry (e.g., ribosomes). In the last decade, cryo-EM has generated a large increase in the number of published macromolecular structures, as well as an ever-growing user base. This rapid growth has been due in part to improvements in instrumentation, particularly detectors that increase signal-to-noise ratios in the image data and highly stable microscopes that have pushed the limits of single-particle cryo-EM to sub-2 Å resolution [121]. This resolvability even enables the derived structural models to become usable for structure-based drug design.

Nuclear receptor coactivators act synergistically with complex partners to activate nuclear receptor-targeted transcription, but the molecular basis of this synergism is not completely understood. Our recent work on cryo-EM structures of large DNA-bound full-length estrogen receptor α (ER) and coactivator complexes provides new information that addresses this issue [21, 120].

Purified recombinant ERα, SRC-3, and p300 proteins were assembled on a biotinylated ERE (estrogen-responsive element) containing DNA in the presence of estrogen. The complex was then separated from unbound coactivators using magnetic streptavidin beads [21]. These purified proteins were intact and shown to activate target reporter transcription synergistically in vitro. The reconstituted cryo-EM structure of the complex is estimated to have a validated 25 Å resolution with a dimension of 220 × 260 × 320 Å. Using individual p300 cryo-EM structure, antibody labeling, and density map segmentation, the complex density was determined and segmented into four components: one ERα dimer, two distinct SRC-3s (SRC-3a and SRC-3b), and one p300. The structure shows that each of the ERα monomers independently recruits one SRC-3 and the two separate SRC-3s in turn lock one p300 in the complex through multiple contact points to form a more stable complex (Fig. 4). The quaternary structure of this full-length protein complex reveals an “adaptation and fit” assembly mechanism for coactivator recruitment by the nuclear receptor. The two SRC-3s adopt slightly different conformations although both interact with ERα and p300. SRC-3a has the strongest interaction with the p300 CID domain. It also appears to contact both the ERα N-terminal AF-1 domain and the C-terminal AF-2 domain. This observation provides a structural basis for cooperativity between AF-1 and AF-2 predicted previously [122,123,124]. SRC-3b, on the other hand, contacts different regions of p300 and appears to have a weaker interaction with ERα. It needs to adapt to a different conformation in order to fit into the position required to connect it with both ERα and p300.

Fig. 4

ERE/ERα/SRC-3/p300 complex structural organization. One ERα dimer recruits two distinct SRC-3s which in turn bring in one p300 through multiple contacts

Recruitment of p300 to the ERα complex is mediated through its association with SRC-3s. A conformational change was observed for p300 upon assembly into the complex. This conformational change not only allows p300 to fit into the center to contact the two SRC-3s but also increases its HAT activity toward histone H3. The intrinsically disordered, highly flexible ERα AF-1 region is mobilized upon binding to SRC-3. Nuclear receptors activate transcription in response to ligand stimulation and recruit different coactivators at different stages of transcription; transcription activation needs to be turned off when the stimulus is no longer present. The highly flexible and dynamic nature of nuclear receptor and coactivator interactions allows rapid assembly and disassembly of different complexes in response to signal stimulation.

SRC-3 can recruit not only CBP/p300 but also CARM1 to the ER complex. CARM1 recruitment follows later than SRC-3 and CBP/p300 recruitment [120, 125]. A cross talk between CBP/p300-mediated histone acetylation and CARM1-mediated histone methylation has been well documented [113, 125, 126]. Addition of CARM1 to the purified ERα/SRC-3/p300 complex brings in new heterogeneity to the complex structure. Using a multiple refinement algorithm, three different classes of complex structures were found in our analyses [120] (Fig. 5a). Surprisingly none of the classes generates an extra density in the complex upon the addition of CARM1 to the ERα/SRC-3/p300 mixture. One of the classes is essentially the same as the ERα/SRC-3/p300 complex, representing the group without CARM1 binding. Another class shows a CARM1 density replacing the density of SRC-3b; this was confirmed by CARM1-specific antibody labeling and represents the complex now containing CARM1. The third class has only one SRC-3a in the complex, leaving an unoccupied space where SRC-3b or CARM1 is located in the other two classes; this likely reflects a less stable intermediate state.

Fig. 5

CARM1 recruitment alters ERα/coactivator complex structural organization. (a) Three classes of ERα/coactivator complex structures were found in the mixture of ERα, SRC-3, p300, and CARM1. (b) Sequential CARM1 recruitment replaces SRC-3b in the complex and alters p300 conformation, leading to increased p300-mediated H3K18 acetylation and CARM1-mediated H3R17 methylation to activate transcription (adapted from [120] with modification)

Consistent with a previous observation [94], CARM1 does not directly interact with ERα. As a result, the density pertaining to the AF-1 region is missing in the ER monomer that does not contact SRC-3, probably due to its high mobility. Although CARM1 occupies the position of SRC-3b in the complex, it contacts different regions in p300 compared to SRC-3. Understandably, a further conformational change of p300 was observed to accommodate this change in binding partners (Fig. 5b). This sequentially occurring conformational change significantly increases p300 HAT activity on histone H3K18, which in turn promotes CARM1-mediated H3R17 dimethylation (Fig. 5b). Increased H3R17 methylation has been linked to active gene transcription [127,128,129]. Several reader proteins, including Tudor domain proteins and the PAF1 complex, which are involved in transcription elongation, were found to bind arginine-methylated motifs [130,131,132]. It is likely that CARM1 recruitment to the complex alters the complex structure to functionally prepare transcription for the transition from initiation to elongation. This structural impact of sequential coactivator recruitment also provides a general explanation for the synergistic transcriptional activation observed for different coactivators.

In the X-ray structural study of CARM1, the N-terminal PH domain was not visible due to high mobility [112]. It was proposed that this domain could be involved in protein-protein interaction [117]. Indeed, the N-terminal domain of CARM1 was found to connect CARM1 and p300 in the complex through N-terminal domain-specific antibody labeling [120]. Two antibodies bind to the CARM1 density in the complex, suggesting that CARM1 may exist as a dimer in the complex. This result is consistent with X-ray structural studies [112, 113]. Deletion of the CARM1 N-terminal domain abolishes the synergism between CBP/p300 and CARM1 [117] as well as the promotional effect of CARM1 recruitment on p300 HAT activity [120], highlighting the significance of the CARM1 PH domain in regulating the ERα/coactivator complex function.

4 Future Perspective

With recent advances in cryo-EM technology, we have now made substantial new progress in understanding assembly mechanisms of nuclear receptor and coactivator complexes. However, as pointed out above, nuclear receptors and coactivators are highly dynamic and have intrinsically disordered regions that suit their need to quickly assemble and disassemble into different protein complexes at different stages of transcription. Compositional heterogeneity, conformational flexibility, and dynamism are limiting factors for obtaining high resolutions for these complexes. Recent improvements in automated large-scale cryo-EM data collection [133,134,135] and in image processing workflows will help in part to address the difficulties in dealing with these structurally heterogeneous samples. With large-scale imaging data, unsupervised 3D classification algorithms will be able to categorize data with structural variability or reconstruct structures into multiple functional states that exist dynamically in one dataset, thereby improving the resolution for each state. A prominent structural feature of nuclear receptors and coactivators is that intrinsically disordered structures become structured and flexible regions become mobilized when interaction partners contact each other. In fact, ER/coactivator complexes become very stable (even resistant to urea denaturation) after forming a giant protein complex [136]. Building a much larger protein complex by including more coregulatory proteins in future structural studies might in itself improve resolution by limiting the conformational dynamics of the nuclear receptor and coactivator complex in ice on the cryo-EM grid.