1 Introduction

Intrinsically disordered proteins (IDPs) lack stable tertiary and/or secondary structure under physiological conditions in vitro [1]. A large fraction of proteins (between 25 and 41 %) are intrinsically disordered. If the dogma dictates that proteins need a structure to function, why do so many proteins exist in a disordered state [2]? In fact, intrinsically disordered proteins play key roles in regulation, signaling, and control by binding to multiple interaction partners [3]. These proteins have been given many names, including rheomorphic, flexible or highly flexible, natively denatured, natively unfolded, intrinsically unstructured, and intrinsically disordered. They exist as an ensemble of highly heterogeneous conformations. Statistics from disordered protein databases show that IDPs contain significantly higher levels of Glu, Lys, Arg, Gln, Ser, Asp and Pro (mostly polar or charged residues) and lower levels of Ile, Leu, Val, Trp, Phe, Tyr, Thr, Met, Cys, His and Asn (mostly hydrophobic residues) [4].

Furthermore, regions of disorder are abundant in proteins associated with signaling, cancer, cardiovascular disease, amyloidoses, neurodegenerative diseases, and diabetes [5]. Compared with structured proteins, IDPs as drug targets are associated with low binding affinity and few side effects. There are two strategies for drug design targeting IDPs. In the first, the drug binds to the structured partner, thereby preventing the binding of the disordered partner. In the second, the drug binds directly to the disordered partner, thereby preventing the association of the two proteins. In reported applications of the second approach both partners were disordered, but the small molecules bound to only one of them. For example, c-Myc-Max inhibitors bind to distinct intrinsically disordered regions of c-Myc [6, 7]. These binding sites are composed of short contiguous stretches of amino acids that can selectively and independently bind small molecules. Inhibitor binding induces only local conformational changes, preserves the overall disorder of c-Myc, and inhibits dimerization with Max.

Furthermore, many intrinsically disordered proteins undergo significant conformational transitions to well-folded forms only on binding to target ligands [8–11]. These experimental observations raise the question of whether intrinsically disordered proteins follow an induced-fit mechanism upon binding.

Coarse-grained simulations [12] and all-atom simulations at high temperature [13] have been used to study the folding of intrinsically disordered proteins coupled to partner binding. So far, the time scales accessible to all-atom MD simulations at room temperature (298 K) are restricted to the microsecond range, which is significantly shorter than the folding half-times of most proteins [14, 15]. To reveal conformational changes within a reasonable time, all-atom MD simulations in explicit solvent at high temperature have been widely used to monitor the unfolding pathways of proteins. Unfolding can occur on the nanosecond timescale at 498 K [14, 16]. Moreover, in accordance with the principle of microscopic reversibility, experiments have demonstrated that the transition state for folding and unfolding is expected to be the same [14]. Therefore, high-temperature MD simulations are widely used to study the folding of intrinsically disordered proteins coupled to partner binding.

2 Materials and Methods

The atomic coordinates of the intrinsically disordered proteins were obtained from the Protein Data Bank (PDB). Point mutants were modeled with SCWRL3 [17]. All hydrogen atoms were added using the LEAP module of AMBER [18]. Counter-ions were added to maintain system neutrality. All systems were solvated in a truncated octahedron box of TIP3P waters with a buffer of 10 Å [19]. Particle Mesh Ewald (PME) [20] was applied to handle long-range electrostatic interactions with the default settings in AMBER [18]. The parm99 force field was used to compute the interactions within the protein [21]. The SHAKE algorithm [22] was employed to constrain bonds involving hydrogen atoms. All solvated systems were first minimized by steepest descent to remove structural clashes, followed by heating and brief equilibration in the NPT ensemble at 298 K. The time step was 2 fs, with a friction constant of 1 ps⁻¹ used in the Langevin dynamics. To study the folded state of each solvated system, multiple independent trajectories in the NPT ensemble at 298 K were simulated with PMEMD of AMBER. Multiple independent unfolding trajectories were then run in the NVT ensemble to investigate the unfolding pathways of each solvated system.

3 Results

TIS11d, KID, LEF, p53, CBP, and Brinker are partially or fully intrinsically disordered proteins [13, 23–27]. As transcription factors, they play key roles in signal transduction. Upon binding to DNA, RNA, or other transcription factors, they fold into well-defined structures; these systems are discussed here. Their complex structures are illustrated in Fig. 9.1.

Fig. 9.1

Complex structures of intrinsically disordered proteins. a TIS11d/mRNA. b p53/MDM2. c pKID/KIX. d Brinker/DNA. e LEF/DNA. f p53/CBP

To capture the average properties of proteins, multiple MD trajectories (5–10) are necessary [28]. To study recognition by intrinsically disordered proteins, multiple independent trajectories of the apo states and of their complexes were simulated at room temperature (298 K). The Cα and φ/ψ fluctuations of the apo and bound states were then analyzed. In general, the Cα fluctuations of the bound state are significantly smaller than those of the apo state, especially in the region of the binding site. The results for apo and bound TIS11d are shown in Fig. 9.2 [26]. The Cα fluctuation of bound TIS11dTZF is much smaller than that of apo-TIS11dTZF, especially at the binding sites of mRNA and zinc. This suggests that TIS11dTZF becomes less flexible and more stable upon mRNA and zinc binding, which is consistent with experiment. However, the φ/ψ variation of bound TIS11dTZF is similar to that of apo-TIS11dTZF, suggesting that the secondary structure of TIS11dTZF does not change significantly upon mRNA and zinc binding. Indeed, helices α1, α3 and α4 are already stable in apo-TIS11dTZF.
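The Cα fluctuation discussed above is commonly quantified as a per-residue root-mean-square fluctuation (RMSF). The following is a minimal sketch of that calculation, not the analysis pipeline used for Fig. 9.2. It assumes that each trajectory frame has already been superimposed on a common reference, and the function name `ca_rmsf` and the toy coordinates are hypothetical.

```python
import math

def ca_rmsf(frames):
    """Per-residue RMSF of C-alpha positions.

    frames: list of frames; each frame is a list of (x, y, z) tuples,
    one per residue. Frames are assumed to be pre-aligned (assumption).
    """
    n_frames = len(frames)
    n_res = len(frames[0])
    rmsf = []
    for i in range(n_res):
        # Mean position of residue i over all frames
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared deviation of residue i from its mean position
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        rmsf.append(math.sqrt(msd))
    return rmsf

# Toy data (hypothetical): residue 0 is rigid in both states,
# residue 1 moves by +/-1 along x in the apo state only.
apo = [[(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
       [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)]]
bound = [[(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
         [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]]

rmsf_apo = ca_rmsf(apo)
rmsf_bound = ca_rmsf(bound)
```

In this toy case the second residue has a larger RMSF in the apo state than in the bound state, which is the kind of difference the comparison in the text looks for at the binding site.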

Fig. 9.2

Cα and φ/ψ variations for TIS11d

To illustrate the conformational differences clearly, Fig. 9.3 shows landscapes of the difference between the average pairwise intramolecular distances of the bound state and the corresponding average pairwise intramolecular distances of the apo state [24]. These landscapes reflect the relative conformational changes of the DNA and LEF backbones. For DNA, the deep red area indicates that the distance difference for bases 5–8 and 23–26 is positive. These bases correspond to the minor groove, which suggests that the minor groove is widened upon LEF binding. Furthermore, the disordered C-tail of LEF is located in the minor groove. This suggests that the disordered C-tail of LEF interacts with the DNA and opens its minor groove. The deep blue area indicates that the distance difference is negative, suggesting that the major groove is contracted. This is consistent with the experimental observation that DNA is bent upon LEF binding [29, 30]. For LEF, the deep red and blue areas are located at the disordered C-tail, which suggests that the C-tail of LEF undergoes a significant conformational change.
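The distance difference landscape described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used to produce Fig. 9.3: each frame is taken to be a list of representative atom coordinates (one per residue or base), and the function names and toy coordinates are hypothetical.

```python
import math

def mean_distance_matrix(frames):
    """Average pairwise intramolecular distance over all frames."""
    n = len(frames[0])
    total = [[0.0] * n for _ in range(n)]
    for frame in frames:
        for i in range(n):
            for j in range(n):
                total[i][j] += math.dist(frame[i], frame[j])
    return [[v / len(frames) for v in row] for row in total]

def distance_difference(bound_frames, apo_frames):
    """Bound minus apo average distances.

    Positive entries: the pair is farther apart in the bound state.
    Negative entries: the pair is closer together in the bound state.
    """
    b = mean_distance_matrix(bound_frames)
    a = mean_distance_matrix(apo_frames)
    n = len(b)
    return [[b[i][j] - a[i][j] for j in range(n)] for i in range(n)]

# Toy data (hypothetical): three sites on a line; the middle site
# shifts by 1 unit toward the third site upon binding.
apo_frames = [[(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]]
bound_frames = [[(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (6.0, 0.0, 0.0)]]

diff = distance_difference(bound_frames, apo_frames)
```

Here `diff[0][1]` is positive (the pair has moved apart, analogous to a widened groove) and `diff[1][2]` is negative (the pair has moved together, analogous to a contracted groove), matching the red and blue regions of the landscape.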

Fig. 9.3

Distance difference landscapes for DNA and LEF. a DNA. b LEF

To study the driving force for these conformational adjustments, the electrostatic, hydrophobic, and hydrogen-bonding interactions between the intrinsically disordered protein and its partner were analyzed and are shown in Fig. 9.4. Stable electrostatic interactions, hydrogen bonds, and hydrophobic interactions can be identified from this figure. In general, partner binding introduces more electrostatic interactions, native contacts and hydrogen bonds at the interface, which are responsible for the higher stability of the bound intrinsically disordered proteins.

Fig. 9.4

Interactions between LEF and DNA

3.1 Unfolding Kinetics

High-temperature simulations were used to study the unfolding kinetics of intrinsically disordered proteins, using the fraction of native tertiary contacts (Qf) and the fraction of native binding contacts (Qb) as parameters. The time evolution of Qb and Qf for the apo and bound states is shown in Fig. 9.5 [23]. Tertiary unfolding and unbinding can both be fitted well by a single exponential function, indicating first-order kinetics in the NVT ensemble at high temperature (498 K). Comparison of the apo and bound states suggests that partner binding significantly delays the tertiary unfolding of intrinsically disordered proteins. This is in agreement with experimental observations [8, 31].
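A single exponential fit of this kind can be sketched as below. The sketch assumes a decay of the form Q(t) = A·exp(−t/τ) that falls to zero, so that a linear least-squares fit of ln Q against t suffices; the fits behind Fig. 9.5 may use a different functional form (for example, one with a nonzero baseline). The function name and the synthetic data are hypothetical.

```python
import math

def fit_single_exponential(times, q_values):
    """Fit Q(t) = A * exp(-t / tau) by least squares on ln Q versus t.

    Returns (A, tau, half_time). Points with Q <= 0 are skipped because
    their logarithm is undefined.
    """
    pts = [(t, math.log(q)) for t, q in zip(times, q_values) if q > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx                  # equals -1 / tau
    intercept = mean_y - slope * mean_t  # equals ln A
    tau = -1.0 / slope
    return math.exp(intercept), tau, tau * math.log(2)

# Synthetic data (hypothetical): ideal first-order decay with tau = 2.5 ns
times_ns = [0.5 * k for k in range(21)]
q_f = [math.exp(-t / 2.5) for t in times_ns]

amplitude, tau_ns, half_time_ns = fit_single_exponential(times_ns, q_f)
```

Fitting Qf and Qb separately for the apo and bound trajectories and comparing the resulting half-times is one way to quantify how much partner binding delays tertiary unfolding.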

Fig. 9.5

Unfolding kinetics for bound pKID

3.2 Φ-Value Prediction

Φ values have been widely used in theoretical and experimental work to identify the key residues for protein folding [32–34]. The Φ values of pKID were predicted and are shown in Fig. 9.6. The highest Φ values are found for Asn139, Asp140 and Leu141, suggesting that these residues play a key role in the folding of pKID [23]. Leu141 is particularly critical: it extends deep into the hydrophobic groove of KIX and forms three hydrophobic contacts with KIX. The predicted Φ values can be verified by experiment.

Fig. 9.6

Φ-values for bound and apo pKID

3.3 Unfolding Pathway

Based on the unfolding kinetics analysis, the order of unfolding events for a bound intrinsically disordered protein is shown in Fig. 9.7 [13]. If folding is assumed to be the reverse of unfolding, the proposed folding pathway of the bound intrinsically disordered protein proceeds from the unfolded state through secondary structure folding, tertiary folding, and partner binding to the folded state.

Fig. 9.7

Unfolding pathway for bound p53. a folded state. b unbinding. c tertiary unfolding. d helix 3/5 unfolding. e helix 1/2/4 unfolding. f unfolded state

3.4 Recognition Mechanism

Conformational selection and induced fit are two widely used models for interpreting recognition between intrinsically disordered proteins and their partners [35]. According to the conformational selection paradigm, the unbound protein explores a free energy landscape with diverse stable states in equilibrium. During binding, the conformation compatible with the partner is selectively stabilized, and the population of the conformational ensemble shifts towards this stabilized state [36–39]. In contrast, the induced fit scenario proposes that the favorable conformation results from significant changes of the unbound ensemble upon allosteric binding [40–43]. It is worth pointing out that the conformational selection and induced fit models cannot be distinguished absolutely [44]. Indeed, some systems involve kinetic elements of both mechanisms [45, 46].

The possible magnitudes of conformational selection and induced fit [47] were calculated to reveal the recognition mechanism. The average RMSDs between bound and apo conformations were analyzed as a function of distance from the centroid of the binding partner and are shown in Fig. 9.8 [27]. The figure shows that the RMSD gradually increases with distance until it reaches the global level. This suggests that induced fit occurs far away from the binding site.

Fig. 9.8

Local conformational RMSD differences between bound and apo conformations as a function of distance from the centroid of the binding partner, and statistical significance of conformational selection in p53 and CBP binding. Average local RMSD for 10 pairs of bound conformations and the most similar apo conformation and for 90 pairs of bound NCBD and the other apo conformations, as a function of distance from the centroid of the binding partner. a CBP. b p53. c CBP. d p53

To assess the statistical significance of the differences in deviations between these two sets of pairs, the two-sample Kolmogorov-Smirnov test [48] was used to calculate the P value for each distance group. Figure 9.8c shows the median P value and the fraction of pairs with P < 0.1 for all 100 pairs of CBP conformations in each distance group. The median P values are smaller than 0.1 in most distance groups, and approach 0 in some of the larger distance groups. The fraction of conformations with P < 0.1 exceeds 50 % in most distance groups. This suggests that, far away from the binding site, bound CBP differs significantly from the apo conformation and that the differences are statistically significant. In summary, based on the RMSD and P-value analysis, recognition between intrinsically disordered CBP and p53 might follow an induced-fit mechanism.
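For one distance group, the two-sample Kolmogorov-Smirnov comparison can be sketched as follows. This is a self-contained illustration, not the procedure of [48] as applied in Fig. 9.8: the P value is computed from the standard asymptotic Kolmogorov series with a small-sample correction, which is only approximate for small samples, and the function names and RMSD values are hypothetical.

```python
import math

def kolmogorov_q(lam):
    """Asymptotic survival function Q(lam) = 2 * sum (-1)^(k-1) exp(-2 k^2 lam^2)."""
    if lam < 0.2:
        return 1.0  # Q is indistinguishable from 1 in this range
    total = 0.0
    for k in range(1, 101):
        term = 2.0 * (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return min(max(total, 0.0), 1.0)

def ks_two_sample(x, y):
    """Return (D, approximate P value) for the two-sample KS test."""
    xs, ys = sorted(x), sorted(y)
    n1, n2 = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Walk through the pooled values and track the largest gap
    # between the two empirical cumulative distribution functions.
    while i < n1 and j < n2:
        v = min(xs[i], ys[j])
        while i < n1 and xs[i] <= v:
            i += 1
        while j < n2 and ys[j] <= v:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    n_eff = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    return d, kolmogorov_q(lam)

# Hypothetical local RMSD values (in Å) for one distance group
rmsd_similar = [0.5 + 0.05 * k for k in range(10)]  # most similar apo pairs
rmsd_other = [2.0 + 0.05 * k for k in range(10)]    # other apo pairs

d_stat, p_value = ks_two_sample(rmsd_similar, rmsd_other)
d_same, p_same = ks_two_sample(rmsd_similar, rmsd_similar)
```

Repeating this for every distance group and every pair of conformations, then taking the median P value and the fraction of pairs with P < 0.1, would give summary quantities of the kind plotted in Fig. 9.8c.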

4 Conclusions and Remarks

Intrinsically disordered proteins lack stable tertiary and/or secondary structure under physiological conditions in vitro, and many of them undergo significant conformational transitions to well-folded forms only on binding to a partner. Molecular dynamics simulations were used to study the mechanism by which intrinsically disordered proteins fold upon partner binding. Room-temperature MD simulations suggest that intrinsically disordered proteins form both nonspecific and specific interactions with their partners. Kinetic analysis of high-temperature MD simulations shows that both the bound and apo states unfold via a two-state process. Φ-value analysis can identify the key residues of intrinsically disordered proteins. Kolmogorov-Smirnov (KS) test analysis indicates that the specific recognition between an intrinsically disordered protein and its partner might follow an induced-fit mechanism. Furthermore, these methods can be widely applied to the study of binding-induced folding of intrinsically disordered proteins.