1 Introduction

Molecular recognition by proteins is fundamental to almost every biological process, particularly those protein–ligand complexes underlying enzymatic catalysis. The process of molecular recognition has therefore been of central interest from the earliest days of enzymology. Various mechanistic frameworks have been constructed to describe the physical basis for specific high affinity interactions between protein molecules and their ligands. They have ranged from the simple “lock and key” interaction model [1] to the “induced fit” [2] and the more recent “conformational selection” models [3, 4]. Each of these derives from an increasing recognition of the plasticity and energetics of the ensemble of structures that proteins occupy [5, 6] and the potential role for travelling across this complex energy landscape in protein function [7, 8].

For the most part, the predominant interest has been in characterizing how motion between structural states correlates with protein function. There have been spectacular demonstrations of conformational selection (e.g., [9, 10]) and strong indications for the participation of “special” protein motions in catalysis by enzymes (e.g., [11–14]), though the latter role has been vigorously criticized on general principles [15] (but see [16]). What concerns us here is not that proteins move from one functional state to another but rather the thermodynamic basis of these functionally relevant states. This is a long-standing issue and is intimately related to the so-called protein folding problem and the free energy landscape that protein molecules explore [17]. Stated less obliquely, we wish to understand the physical basis for the (de)stabilization of a given state of a protein molecule. Perhaps the most primitive and potentially simplest function that a protein engages in is high affinity molecular recognition. The formation of protein complexes involves a manifold of diverse and complex interactions. This complexity is reflected in the difficulty of computing the energetics of interactions involving proteins using molecular structure alone [18–20]. Indeed, structure-based design of pharmaceuticals has been impeded by this barrier [21]. Here, we specifically focus on the role of protein conformational entropy in modulating the free energy of the association of a protein with a ligand.

Expression of the total binding free energy emphasizes that the entropy of binding comprises contributions from the protein, the ligand, and the solvent:

$$ \Delta {G_{\mathrm{ bind}}}=\Delta {H_{\mathrm{ bind}}}-T(\Delta {S_{\mathrm{ protein}}}+\Delta {S_{\mathrm{ ligand}}}+\Delta {S_{\mathrm{ solvent}}}) $$
(1)

The free energy of binding (ΔG bind) and the enthalpy of binding (ΔH bind) can in favorable cases be directly measured using isothermal titration calorimetry [22] and the associated total binding entropy (ΔS bind) obtained by arithmetic. Unfortunately, as (1) indicates, the molecular origins of these thermodynamic parameters remain obscure. Views of associations involving proteins have largely been seen through the lens of enthalpy owing to the richness of our knowledge of the structure of proteins and their complexes, which helps reveal the details of the interactions governing the enthalpy. In great contrast, the origin of the change in entropy remains difficult to grasp as it inherently involves a manifold of states that the protein, ligand, and solvent can occupy, each having its own probability for existence. Historically, the contributions by solvent entropy to binding thermodynamics have taken center stage and are usually framed in terms of the hydrophobic effect [23]. Hydrophobic solvation by water continues to be the subject of extensive analysis [24, 25].
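The "arithmetic" referred to above can be sketched in a few lines. This is a minimal illustration that assumes ΔG bind and ΔH bind are given in kJ mol−1 at a known temperature; the function name and the numerical inputs are hypothetical and are not taken from any measurement discussed here.

```python
def binding_entropy(delta_g, delta_h, temperature):
    """Return (dS_bind, -T*dS_bind) from dG = dH - T*dS.

    delta_g and delta_h are in kJ/mol, temperature in K,
    so dS_bind is returned in kJ/(K mol).
    """
    delta_s = (delta_h - delta_g) / temperature
    return delta_s, -temperature * delta_s


# Hypothetical calorimetric result, chosen only to illustrate the arithmetic.
dS, minus_TdS = binding_entropy(delta_g=-40.0, delta_h=-60.0, temperature=298.15)
print(f"dS_bind = {dS:.4f} kJ/(K mol); -T*dS_bind = {minus_TdS:.1f} kJ/mol")
```

Note that the result is only the total ΔS bind; nothing in this calculation separates the protein, ligand, and solvent terms of (1).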

In principle, the entropic contribution of a structured protein to the binding of a ligand (ΔS protein) includes both changes in its internal conformational (configurational) entropy (ΔS conf) and changes in rotational and translational entropy (ΔS RT) [26]. Equation (1) emphasizes that the measurement of the entropy of binding does not resolve contributions from internal protein conformational entropy. Arguments from fundamental theory [27] and observations from simulation (e.g., [28, 29]) and experiment (e.g., [30, 31]) in the 1970s indicated that proteins fluctuate about a structure closely similar to that observed by crystallography and that these fluctuations could reflect significant residual conformational entropy. Yet it is only recently that experimental methods and strategies have been created to assess this and related ideas quantitatively.

Experimental measurement of ΔS conf has been difficult. During the past decade, we and others have been developing NMR methods that provide measures of motion between different microscopic structural states and thus can act as an indirect measure of, or proxy for, conformational entropy [32]. The idea is simple but extremely tricky to implement. As outlined below, solution NMR spectroscopy has emerged as the most powerful experimental technique for accessing protein motion in a site-resolved, comprehensive manner. Motion expressed on the sub-nanosecond time scale corresponds to significant entropy [33, 34] and NMR relaxation methods are particularly well suited for its characterization [32]. Detection of motion thus segues into the issue of conformational entropy, though it is not obvious how to employ measures of motion as a quantitative measure of conformational entropy. Of course, one must first enable the site-resolved measurement of internal motion (disorder) of proteins.

2 Solution NMR Spectroscopy and Detection of Motion

Over the past two decades, solution NMR spectroscopy has emerged as a powerful means for the site-resolved measurement of motion on an impressive range of time scales in proteins of significant size [32, 35–37]. The breadth of available time scales is daunting and leaves one asking where, with limited resources, one should begin. Here we are ultimately most focused on those motions that express large contributions to protein conformational entropy. Simple arguments suggest that this should be largely manifested in the extremely fast motions corresponding to bond vibrations and torsional oscillations, which generally occur on the nanosecond and faster time scales [33]. Classical NMR relaxation phenomena allow access to this time scale.

Very briefly, an ensemble of nuclear spins can relax from a non-equilibrium state via a range of potential interactions. For example, the inter-nuclear dipole–dipole interaction between spatially proximal hydrogens gives rise to the nuclear Overhauser effect (NOE), which is likely familiar to the non-expert reader as a means to measure inter-atomic distances for the determination of molecular structure in solution. The strength of this interaction depends on the time average of the distance between the nuclei and of the angle of the inter-nuclear vector with the applied magnetic field. In favorable situations, such as when the two nuclei are bonded, the distance dependence is effectively constant (though see [38, 39]) and the interaction is temporally modified only by the change in the orientation of the bond vector with respect to the magnetic field. Examples include the 15N–1H amide bond and the 13C–1H bond in a variety of contexts such as in a methyl group. Other nuclear spin interactions are perhaps more obscure to the reader and include chemical shift anisotropy (CSA), which reflects the interaction of the nuclear spin with the asymmetric distribution of electrons about it, and the electrostatic interaction between the quadrupolar moment of a nucleus having a spin quantum number of one or greater and the surrounding electric field gradient. In some cases, the “interference” or “cross correlation” between different relaxation mechanisms can offer insight into motion (e.g., [40–45]), though these approaches are not common owing to a variety of technical limitations.

Relaxation of nuclear spins in liquid samples to an equilibrium distribution of spin states is mediated by the fluctuation of local fields. Rapid molecular motions impose time modulation on these local fields and it is through this dependence that information about motion can be obtained. The theoretical treatment of the connection between motion and NMR relaxation phenomena is complicated. The interested reader is referred to a recent monograph on this and related subjects [46]. A popular and very robust way of capturing the essential character of the motion that gives rise to NMR relaxation phenomena of the type considered here is the so-called model free approach of Lipari and Szabo [47, 48]. The various NMR observables that can be measured can be generally expressed as linear combinations of the so-called spectral density functions. The spectral densities are in turn defined by the motion of the “interaction” vector within the protein. Consider a 13C nucleus bonded to a single 1H, i.e., in a CHD2 isotopomer. The deuterium nucleus has a much weaker interaction than the 1H nucleus and its effects can be largely ignored (though they are not in the final analysis). The motion of the 13C–1H interaction vector (in this case along the rigid bond between them) can be described by an autocorrelation function, which is simply the dot product of the interaction vector’s orientation at some time t and its orientation at some time t′. The time dependence will have two components: a contribution from the slower global tumbling of the protein and an assumed faster component due to motion within the molecular frame.

In the simplest case, the Lipari–Szabo treatment leads to three parameters: a correlation time for isotropic macromolecular reorientation (τ m), an effective correlation time (τ e), and a measure of the angular disorder of the interaction vector termed the squared generalized order parameter (O 2). The order parameter by definition ranges from zero to one, corresponding to complete isotropic disorder and complete rigidity of the interaction vector within the molecular frame, respectively. It is this motional parameter that offers the potential to provide access to statements about conformational entropy. The effective correlation time has a strong technical definition that precludes its general use as a faithful descriptor of the time constant(s) for the underlying motion.

The Lipari–Szabo model-free spectral density has a very simple form:

$$ J(\omega )=\frac{{{O^2}{\tau_{\mathrm{ m}}}}}{{1+{\omega^2}\tau_{\mathrm{ m}}^2}}+\frac{{(1-{O^2})\tau }}{{1+{\omega^2}{\tau^2}}} $$
(2)

where \( {\tau^{-1 }}=\tau_{\mathrm{ m}}^{-1 }+\tau_{\mathrm{ e}}^{-1 } \). The spectral densities are linearly combined as required by the physics of the specific NMR relaxation mechanism to define an observable relaxation (see [32] for illustrative derivations). For example, the longitudinal relaxation rate (1/T 1 or R 1) of the 13C nucleus by a single bonded 1H nucleus is given by:

$$ {R_1}=\frac{{{d^2}}}{4}[J({\omega_{\mathrm{ H}}}-{\omega_{\mathrm{ C}}})+3J({\omega_{\mathrm{ C}}})+6J({\omega_{\mathrm{ H}}}+{\omega_{\mathrm{ C}}})] $$
(3)

where d 2 comprises fundamental constants and the effective C–H bond length, and ω H and ω C are the resonance frequencies of 1H and 13C nuclear spins, respectively. For each site of interest there are two unknowns (O 2 and τ e) plus one global variable (τ m) defining isotropic tumbling of the protein in solution. The situation can become more complicated, of course. For example, the character of the tumbling of the macromolecule may be anisotropic to some degree. This is easily handled using appropriate diffusion equations and data filtering [49–52].
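Equations (2) and (3) translate directly into code. The sketch below follows the two equations exactly as written above and treats d 2 as a caller-supplied constant, because its numerical value (and any normalization factor folded into it) depends on conventions not spelled out here. The function names, the 500 MHz field, the 8 ns tumbling time, and the 50 ps internal correlation time are illustrative assumptions.

```python
import math


def spectral_density(omega, o2, tau_m, tau_e):
    """Model-free spectral density in the form of Eq. (2).

    omega is an angular frequency (rad/s); tau_m and tau_e are in seconds.
    """
    tau = 1.0 / (1.0 / tau_m + 1.0 / tau_e)
    global_term = o2 * tau_m / (1.0 + (omega * tau_m) ** 2)
    internal_term = (1.0 - o2) * tau / (1.0 + (omega * tau) ** 2)
    return global_term + internal_term


def r1_carbon(d2, omega_h, omega_c, o2, tau_m, tau_e):
    """Longitudinal 13C relaxation rate from one bonded 1H, Eq. (3)."""
    def j(omega):
        return spectral_density(omega, o2, tau_m, tau_e)

    return 0.25 * d2 * (j(omega_h - omega_c) + 3.0 * j(omega_c)
                        + 6.0 * j(omega_h + omega_c))


# Assumed conditions: 500 MHz 1H frequency; the 13C/1H gyromagnetic
# ratio is about 0.2515. d2 is set to 1, so rates are in units of d2.
omega_h = 2.0 * math.pi * 500.0e6
omega_c = 0.2515 * omega_h
for o2 in (0.35, 0.65, 0.85):
    rate = r1_carbon(1.0, omega_h, omega_c, o2, tau_m=8.0e-9, tau_e=50.0e-12)
    print(f"O2 = {o2:.2f}: R1/d2 = {rate:.3e} s")
```

Setting O 2 to one removes the internal-motion term, so the spectral density reduces to that of a rigid body tumbling with correlation time τ m.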

Similarly, specific instances may require (justify) more or less complex forms of the model-free spectral density shown above [53]. Nevertheless, the experimental prescription is clear: resolve relaxation data at n individual sites in a protein and measure as many relaxation parameters (e.g., T 1, T 2, etc.) at as many magnetic fields (to vary ω H, ω C) as needed to provide a robust determination of the 2n + 1 (in the case of isotropic macromolecular tumbling) parameters.
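The prescription can be illustrated for a single site with a deliberately simple grid search. This sketch assumes that τ m is already known, uses only R 1 values at three fields, and fits noise-free synthetic data generated from (2) and (3); real analyses combine several relaxation parameters and proper error estimates [53, 68, 69]. All names, grids, and field strengths below are hypothetical.

```python
import math


def j(omega, o2, tau_m, tau_e):
    """Spectral density of Eq. (2)."""
    tau = 1.0 / (1.0 / tau_m + 1.0 / tau_e)
    return (o2 * tau_m / (1.0 + (omega * tau_m) ** 2)
            + (1.0 - o2) * tau / (1.0 + (omega * tau) ** 2))


def r1(omega_h, o2, tau_m, tau_e, d2=1.0, gamma_ratio=0.2515):
    """R1 of Eq. (3); d2 = 1 and the 13C/1H frequency ratio are assumptions."""
    omega_c = gamma_ratio * omega_h
    return 0.25 * d2 * (j(omega_h - omega_c, o2, tau_m, tau_e)
                        + 3.0 * j(omega_c, o2, tau_m, tau_e)
                        + 6.0 * j(omega_h + omega_c, o2, tau_m, tau_e))


def fit_site(fields_hz, observed, tau_m, o2_grid, tau_e_grid):
    """Return the (O2, tau_e) grid point with the smallest squared relative error."""
    best = None
    for o2 in o2_grid:
        for tau_e in tau_e_grid:
            cost = 0.0
            for field, obs in zip(fields_hz, observed):
                calc = r1(2.0 * math.pi * field, o2, tau_m, tau_e)
                cost += ((calc - obs) / obs) ** 2
            if best is None or cost < best[0]:
                best = (cost, o2, tau_e)
    return best[1], best[2]


# Search grids and synthetic "truth" (the truth is taken from the grids).
o2_grid = [i / 100 for i in range(5, 100, 5)]            # 0.05 ... 0.95
tau_e_grid = [i * 1.0e-12 for i in range(10, 210, 10)]   # 10 ... 200 ps
TAU_M = 8.0e-9                                           # assumed known
FIELDS = [500.0e6, 600.0e6, 750.0e6]                     # 1H frequencies, Hz
TRUE_O2, TRUE_TAU_E = o2_grid[11], tau_e_grid[3]
data = [r1(2.0 * math.pi * f, TRUE_O2, TAU_M, TRUE_TAU_E) for f in FIELDS]

fitted_o2, fitted_tau_e = fit_site(FIELDS, data, TAU_M, o2_grid, tau_e_grid)
print(f"fitted O2 = {fitted_o2:.2f}, fitted tau_e = {fitted_tau_e * 1e12:.0f} ps")
```

For n sites the same search would be repeated per site, with τ m shared by all of them, which is the origin of the 2n + 1 parameter count.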

Several key steps were required to enable the comprehensive use of NMR relaxation phenomena to characterize the internal motion of proteins of significant size. The advent of multidimensional NMR spectroscopy provided a means to resolve literally hundreds of probe sites for motion in proteins. Sophisticated isotopic labeling schemes have been introduced to simplify the complexity of the NMR relaxation as much as possible in order to make its measurement and subsequent analysis more robust (e.g., [54–60]). Generally, the strategy is to reduce the number of relaxation mechanisms (interactions) as much as possible. Finally, the cornerstone was the development of NMR experiments that prepared “pure” NMR observables of NMR relaxation that could be directly interpreted. Lewis Kay and colleagues are largely identified with the development of the NMR machinery necessary to measure 15N [61, 62], 2H [63, 64], and 13C [65] autorelaxation in proteins. More recently, Tugarinov and coworkers have extended the number of deuterium relaxation experiments that provide a context for extracting even more fundamental relaxation rate constants at a single magnetic field [66]. Notable contributions for implementing 13C relaxation in the context of proteins came from Torchia and colleagues in their unraveling of the complexity of this particular mechanism of relaxation [67]. Finally, computational strategies were needed to extract confidently the desired model-free parameters [53, 68, 69].

The basic experimental strategy is outlined for methyl carbon relaxation in Fig. 1. To make this particular situation as simple as possible, the interaction of the methyl carbon is restricted to a single bonded 1H. This is arranged by expression of the protein of interest during growth on randomly partially deuterated 13C3-pyruvate as a general carbon source [57] or with unlabeled glucose and appropriately labeled metabolic precursors for valine, leucine, and isoleucine [59]. Protein expression is carried out in “100%” D2O to ensure elimination of 1H spins at non-methyl sites in the protein [67].

Fig. 1

Observation of carbon relaxation in proteins. (a) Protein is expressed during growth on 13CHD2COOD pyruvate in D2O. Most methyl groups are selectively labeled with 13C [57] as a mixture of deuterium isotopomers [67]. (b) The appropriate 13CHD2 methyl isotopomer is selected during preparation of magnetization which (c) allows longitudinal and transverse relaxation to be measured in an otherwise deuterium background [65]. See [32] for further details

Proteins produced in this manner are largely deuterated with selective 1H and 13C labeling in methyl groups. The methyl groups are mixtures of isotopomers (i.e., CD3, CHD2, CH2D, CH3). Only the three carrying at least one hydrogen are observed in the two-dimensional 1H–13C chemical shift correlation spectrum (Fig. 1). The three observed isotopomers give crosspeaks that are at slightly different positions in the spectrum. The appropriate isotopomer, in this case the CHD2, is selected by spectroscopic manipulation [65]. The NMR relaxation experiment is designed to follow the return of a particular type of magnetization from a non-equilibrium state back to equilibrium. For carbon relaxation in proteins, longitudinal and transverse relaxation processes are most useful and are likely familiar to the non-specialist reader as “T 1” and “T 2” relaxation, respectively. The NOE is not so useful in the context of carbon relaxation in proteins of significant size since this observable reaches a limit as molecular tumbling slows. The relaxation process at a given methyl is quantified by the variation of the intensity of the corresponding 1H–13C cross peak with the relaxation time period (Fig. 1). The variation of the rate of relaxation across sites in the protein reveals corresponding differences in the underlying dynamics.
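How a rate is extracted from the variation of cross-peak intensity with the relaxation delay can be sketched as follows. The example assumes a single-exponential decay toward zero intensity and uses an unweighted straight-line fit to the logarithm of the intensities; practical analyses typically use nonlinear fitting with noise-based uncertainties. The function name and the synthetic intensities are hypothetical.

```python
import math


def fit_relaxation_rate(delays, intensities):
    """Fit I(t) = I0 * exp(-R * t) by least squares on ln(I) versus t.

    Returns (R, I0). All intensities must be positive.
    """
    n = len(delays)
    logs = [math.log(value) for value in intensities]
    mean_t = sum(delays) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


# Synthetic, noise-free cross-peak intensities for one methyl site.
delays = [0.02, 0.05, 0.10, 0.20, 0.40, 0.80]            # seconds
intensities = [1000.0 * math.exp(-2.5 * t) for t in delays]
rate, i0 = fit_relaxation_rate(delays, intensities)
print(f"R = {rate:.3f} 1/s, I0 = {i0:.1f}")
```

Repeating this fit for every resolved cross peak gives the site-by-site rates whose variation reflects the differences in the underlying dynamics.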

3 Fast Motion in Proteins Observed by NMR Relaxation

For technical reasons alluded to above, the primary probes of fast ps–ns motion in proteins have been the amide N–H bond and the C–H bond in methyl groups. Here “fast” is defined by processes occurring on time scales significantly shorter than the macromolecular tumbling time of the protein in solution, which is generally on the order of 50 ns or less in the context of current studies. The 15N experiments are relatively straightforward and there are literally hundreds of studies of proteins spanning a significant range of topologies and contexts [36]. The view of the backbone provided by amide 15N relaxation is generally unimpressive with uniformly high order parameters (rigidity) in regions of large elements of regular secondary structure (i.e., beta sheets, helices), which are bounded by the termini and intervening loops, turns, and sections of irregular structure having significantly larger amplitude motion [36]. Changes in functional state generally have small and often quite subtle effects on the backbone of the protein. Overall, the polypeptide chain appears largely to be a relatively rigid scaffold. Most of the dynamic response in the ps–ns time regime to changes in functional state is seen to reside in the side chains [32]. Observation of methine and methylene carbon hydrogens is complicated by the difficulty of appropriate labeling [55, 56, 70]. For probing the motion of methyl-bearing amino acid side chains, deuterium relaxation is preferred due to the purity of the source of its relaxation [32]. However, again for technical reasons, this approach is largely restricted to proteins smaller than ~25 kDa due to the effects of slow macromolecular reorientation in solution. For larger proteins, carbon relaxation methods prove to be the most robust though preparation of the samples and analysis of the data are somewhat more involved [67, 71].

The motion of methyl-bearing side chains in several dozen proteins has been studied in comprehensive detail using these methods [32]. Though sometimes obscured due to limited sampling in small proteins, three types or classes of motion of methyl-bearing side chains have been revealed by the distribution of methyl symmetry axis Lipari–Szabo order parameters [32]. The so-called J-class is centered around an O 2 axis value of ~0.35 and involves motion of the methyl group between rotameric wells, leading to averaging of the associated J-coupling. The α-class is centered around an O 2 axis value of ~0.65 and has smaller contribution from motions that lead to rotameric interconversion and generally reflects large amplitude motion within a single rotameric well. The ω-class is centered around an O 2 axis value of ~0.85 and has highly restricted motion within a single rotameric well that is somewhat reminiscent of the uniform rigidity of most backbone sites.
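As a rough illustration of the three classes, the sketch below assigns each O 2 axis value to the class whose quoted center is nearest. This is a crude stand-in for fitting the distribution itself; it ignores the widths and overlap of the classes, and the list of order parameters is invented for the example.

```python
# Class centers as quoted in the text; "alpha" and "omega" stand for
# the α- and ω-classes.
CLASS_CENTERS = {"J": 0.35, "alpha": 0.65, "omega": 0.85}


def nearest_class(o2_axis):
    """Return the name of the class whose center is closest to o2_axis."""
    return min(CLASS_CENTERS, key=lambda name: abs(CLASS_CENTERS[name] - o2_axis))


def class_counts(values):
    """Count how many order parameters fall nearest to each class center."""
    counts = {name: 0 for name in CLASS_CENTERS}
    for value in values:
        counts[nearest_class(value)] += 1
    return counts


# Hypothetical methyl symmetry-axis order parameters.
example_o2_axis = [0.28, 0.41, 0.62, 0.70, 0.83, 0.91]
print(class_counts(example_o2_axis))
```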

What is fascinating about these classes of motion is that their distribution in individual proteins can be quite variable (Fig. 2). The significant variation in the effective amplitude of motion through the protein matrix is perhaps somewhat counterintuitive. For example, restriction of motion is not strongly correlated with the depth of burial and other simple structural features [32]. Clearly, the “rules” for motion in proteins remain to be fully understood.

Fig. 2

Histograms of the distribution of squared generalized order parameters of methyl group symmetry axes (O 2 axis) in (a) the complex of calcium-saturated calmodulin and a peptide derived from the calmodulin-binding domain of the smooth muscle myosin light chain kinase [72], (b) flavodoxin [73], and (c) α3D, a protein of de novo design [74]. Lines are best fits to a sum of three Gaussians. Taken from Igumenova et al. [32]. Copyright American Chemical Society

4 Motional Proxy for Conformational Entropy

Some time ago Akke and coworkers [75] introduced a theoretical scheme that provided a parametric connection between motion captured by NMR relaxation and the thermodynamic parameters of an ensemble of motional probes. This is directly revealed by the formal definition of the Lipari–Szabo squared generalized order parameter [47]:

$$ {O^2}=\iint {{p_{\mathrm{ eq}}}({\varOmega_1}){P_2}(\cos {\theta_{12 }}){p_{\mathrm{ eq}}}({\varOmega_2})\mathrm{ d}{\varOmega_1}\mathrm{ d}{\varOmega_2}} $$
(4)

where Ω 1 and Ω 2 represent separate orientations (states) of the NMR interaction vector, p eq is the corresponding probability of each state, and P 2 the second order Legendre polynomial of the cosine of the angle θ 12 between the two states. Clearly, the explicit consideration of the probability of the various states accessible to the NMR relaxation probe provides a direct connection to the partition function governing the ensemble. Hence, in principle, one can have access to the fundamental thermodynamic parameters of the ensemble through the usual relations. However, this approach requires a specific model (potential energy) for the motion in order to make the parametric connection between what can be measured (O 2) and what is desired (S).
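Equation (4) can be evaluated numerically once a distribution p eq is chosen. The sketch below assumes an axially symmetric distribution, for which the double integral reduces to the square of the average of P 2(cos θ) measured from the symmetry axis (a standard consequence of the spherical harmonic addition theorem). It then checks the result against the closed form for uniform sampling within a cone of half-angle θ 0, O 2 = [cos θ 0 (1 + cos θ 0)/2]². The function names and the choice of a 1 radian cone are illustrative.

```python
import math


def p2(x):
    """Second-order Legendre polynomial."""
    return 0.5 * (3.0 * x * x - 1.0)


def o2_axial(weight, n=20000):
    """O^2 for an axially symmetric p_eq, by midpoint integration over theta.

    weight(theta) is an unnormalized probability density per unit solid
    angle that depends only on the polar angle theta.
    """
    num = 0.0
    den = 0.0
    step = math.pi / n
    for k in range(n):
        theta = (k + 0.5) * step
        w = weight(theta) * math.sin(theta)
        num += w * p2(math.cos(theta))
        den += w
    return (num / den) ** 2


def cone_weight(theta0):
    """Uniform probability inside a cone of half-angle theta0, zero outside."""
    return lambda theta: 1.0 if theta <= theta0 else 0.0


def o2_cone_analytic(theta0):
    """Closed-form O^2 for uniform sampling within a cone."""
    c = math.cos(theta0)
    return (0.5 * c * (1.0 + c)) ** 2


theta0 = 1.0  # radians
print(f"numerical O2 = {o2_axial(cone_weight(theta0)):.4f}")
print(f"analytic  O2 = {o2_cone_analytic(theta0):.4f}")
```

Because the probabilities enter explicitly, changing `weight` to a Boltzmann factor for some assumed potential gives O 2 and the partition function from the same integrals, which is the connection exploited in the next paragraphs.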

Adopting this idea, we [34] and Yang and Kay [76] used the simple harmonic oscillator and diffusion in an infinite square well potential, respectively, to illustrate the parametric relationship between the Lipari–Szabo squared generalized order parameter (O 2) and entropy (S). The use of motion as a proxy for or as an indirect measure of conformational entropy and some of the fundamental issues associated with this strategy are illustrated in Fig. 3. Consider an amino acid side chain with a single degree of motional freedom such as a point about which the side chain can pivot. Three different simple potentials based on the angle of the pivot are illustrated in Fig. 3. The square well potential has an infinite barrier set at some fluctuation angle. This potential corresponds to free diffusion in a cone, which has an analytical expression relating the Lipari–Szabo order parameter to the corresponding entropy [76]. This infinite square well potential is somewhat unrealistic as it does not express some thermodynamic properties such as heat capacity, which are essential descriptors of protein molecules [77].
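For the cone model the relationship can be written down from geometry alone. If the interaction vector samples a cone of half-angle θ 0 uniformly, the accessible solid angle is 2π(1 − cos θ 0) and O = cos θ 0 (1 + cos θ 0)/2; eliminating θ 0 gives S = R ln{π[3 − (1 + 8√O 2)^1/2]} per mole, up to an additive constant. The sketch below uses this expression, derived here under the assumptions of a classical probe and θ 0 ≤ 90°; it is expected to correspond to the analytical expression of [76], but that correspondence should be checked against the reference. Function names are hypothetical.

```python
import math

R_GAS = 8.314462618e-3  # molar gas constant, kJ K^-1 mol^-1


def cone_entropy(o2):
    """Entropy (kJ K^-1 mol^-1, arbitrary offset) for diffusion in a cone.

    Assumes a cone half-angle of at most 90 degrees, i.e. 0 <= o2 < 1.
    """
    if not 0.0 <= o2 < 1.0:
        raise ValueError("o2 must satisfy 0 <= o2 < 1")
    solid_angle = math.pi * (3.0 - math.sqrt(1.0 + 8.0 * math.sqrt(o2)))
    return R_GAS * math.log(solid_angle)


def delta_s_cone(o2_initial, o2_final):
    """Entropy change on going from o2_initial to o2_final."""
    return cone_entropy(o2_final) - cone_entropy(o2_initial)


def slope(o2, h=1.0e-4):
    """Central-difference estimate of dS/dO2."""
    return (cone_entropy(o2 + h) - cone_entropy(o2 - h)) / (2.0 * h)


for o2 in (0.3, 0.5, 0.7):
    print(f"O2 = {o2:.1f}: dS/dO2 = {slope(o2):.4f} kJ/(K mol)")
print(f"dS for O2 0.5 -> 0.7: {delta_s_cone(0.5, 0.7):.4f} kJ/(K mol)")
```

The entropy diverges as O 2 approaches one, which is one reason why only differences over a moderate range of order parameters are meaningful in this model.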

Fig. 3

The dynamical proxy for entropy. (a) Various simple potential energy functions governing the behavior of an attached NMR relaxation “spy”. Shown are the infinite square well potential (SW) with an angular barrier of 1 radian, the simple harmonic oscillator potential with a quadratic dependence on the excursion angle (θ2) and its sixth-power cousin (θ6). (b) Parametric relationships between the various potentials and the corresponding Lipari–Szabo squared generalized order parameter. See [34, 75, 76, 78] for further explanation and examples

Also illustrated in Fig. 3 are the classic quadratic harmonic oscillator and its stiffer sixth power colleague, which do have the ability to represent a broader range of thermodynamic attributes of proteins in this context [75, 78]. The family of potentials of Fig. 3 gives entropies that are offset from each other. Given the persistent uncertainty of the precise nature of the potential governing protein motion, the determination of absolute entropy is generally not possible. Fortunately, as Fig. 3 also indicates, the slope (i.e., dS/dO 2) is relatively insensitive to the nature of the underlying potential. Obtaining differences in entropy from differences in measures of motion therefore seems to be possible [34]. The “dynamical proxy” for entropy would thus appear to be useful for analysis of changes in motion corresponding to variation of the Lipari–Szabo order parameter between ~0.1 and ~0.9 even in the presence of a possible change in the underlying potential energy function upon a change in state (e.g., a binding event, pressure and temperature change, etc.).

In an approach that we term an “oscillator inventory,” a simple prescription is used to estimate changes in conformational entropy from changes in protein internal motion: (1) measure as many differences in Lipari–Szabo order parameters between two states of the protein as possible; (2) look up the corresponding changes in entropy (i.e., using the type of relationship shown in Fig. 3); and (3) take the differences and simply add them up. There are many legitimate objections to this approach, including concerns about correlated motion, incomplete sampling, validity of the potential, the dependence of the observed relaxation on the geometric details of the motion, etc., all of which can potentially compromise the interpretation [32].
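The three steps of the inventory can be sketched as a short routine. The look-up used here is the diffusion-in-a-cone relationship, chosen only because it has a simple closed form; it is a stand-in, not the harmonic oscillator model, and any other O 2-to-entropy relationship can be passed in its place. The site labels and order parameters are invented for illustration.

```python
import math

R_GAS = 8.314462618e-3  # molar gas constant, kJ K^-1 mol^-1


def cone_lookup(o2):
    """Stand-in O2 -> entropy relationship (diffusion in a cone, 0 <= o2 < 1)."""
    return R_GAS * math.log(math.pi * (3.0 - math.sqrt(1.0 + 8.0 * math.sqrt(o2))))


def oscillator_inventory(o2_state_a, o2_state_b, lookup=cone_lookup):
    """Sum per-site entropy changes (state b minus state a).

    Both arguments map a site label to its order parameter; only sites
    measured in both states contribute (step 1). Each order parameter is
    converted to an entropy (step 2), and the per-site differences are
    summed, treating the sites as independent (step 3).
    """
    shared = sorted(set(o2_state_a) & set(o2_state_b))
    per_site = {site: lookup(o2_state_b[site]) - lookup(o2_state_a[site])
                for site in shared}
    return sum(per_site.values()), per_site


# Hypothetical methyl order parameters in the free and bound states.
free = {"Leu18 CD1": 0.45, "Val35 CG2": 0.60, "Ile52 CD1": 0.30}
bound = {"Leu18 CD1": 0.62, "Val35 CG2": 0.58, "Ile52 CD1": 0.55}
total, details = oscillator_inventory(free, bound)
print(f"summed entropy change = {total:.4f} kJ/(K mol)")
```

The simple summation is exactly where the objections listed above enter: correlated motions would be double counted and unobserved sites are silently omitted.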

Despite the obvious limitations of the “oscillator inventory” approach, it has led to observations that promote the general idea that ligand binding to proteins produces significant changes in internal motion that, in turn, correspond to significant changes in conformational entropy (e.g., [72, 79–85]). We have used calmodulin and its multitude of binding partners as a model system to investigate the role of conformational entropy in high affinity association of proteins [86]. Calmodulin is central to calcium-mediated signal transduction pathways of eukaryotes [87] and interacts with hundreds of proteins with high affinity [88]. The structures of the calmodulin complexes generally follow a “hot dog in a bun” type of topology where the two globular domains of calmodulin collapse around the target calmodulin-binding domain, which forms an amphiphilic helix that is largely sequestered from solvent. Calmodulin-binding domains are generally 20–30 residue sequences characterized by enrichment in basic and hydrophobic residues [88]. This system is particularly amenable to deuterium methyl relaxation experiments [32, 72, 78, 86, 89, 90]. As shown in Fig. 4, methyl probes are distributed throughout the calmodulin molecule. The target domains also have strong representation by the methyl-bearing amino acids [89].

Fig. 4

Ribbon representation of the complex of calcium-saturated calmodulin and a peptide corresponding to the calmodulin-binding domain of the calmodulin kinase I. Methyl groups of calmodulin are represented as spheres and are shaded according to Lipari–Szabo order parameter of the methyl symmetry axis (O 2 axis) as determined by deuterium relaxation methods. Taken from Frederick et al. [91]. Copyright American Chemical Society

Using NMR relaxation measurements, we have shown that calcium-saturated calmodulin (CaM) is an unusually dynamic protein that is characterized by a broad range of the amplitudes (i.e., order parameters) of fast side-chain dynamics that are redistributed upon binding of a CaM-binding domain of a regulated protein [72]. Particularly intriguing is the observation that the conformational entropy estimated by the “oscillator inventory” approach correlated linearly with the overall entropy of binding of a series of CaM-binding domains to CaM (Fig. 5) [86]. It must be emphasized that there is no physical law that requires a linear correlation of a single component of the total entropy (i.e., the conformational entropy) with the total entropy. However, the persistence of such a linear correlation suggests a biological origin: Nature employs conformational entropy in evolving toward the optimal free energy of binding. To a first approximation this makes sense: if a primary selection pressure for evolution of a protein–ligand interaction is the free energy of association, then all sources of entropy will be invoked, to the extent possible, to satisfy this evolutionary demand.

Fig. 5

Application of the oscillator inventory approach to the calmodulin complexes. CaM complexes with six natural target domains were employed. The change in conformational entropy was estimated from the changes in methyl group symmetry axis squared generalized order parameters measured by deuterium relaxation [86]. The simple harmonic oscillator model was used. The linear correlation coefficient (R 2) of conformational entropy vs the entropy of binding is 0.78. Adapted from Frederick et al. [86] with permission. Copyright Nature Publishing

5 Creation and Calibration of an “Entropy Meter”

A more recent approach is perhaps less assailable and more convincing than the simple “oscillator inventory.” The idea is to subsume the various microscopic concerns enumerated above and find an empirical calibration. Thus measures of motion are used in a largely model-independent way, which thereby circumvents the microscopic details that are difficult to accommodate in a model-dependent calculation. In a sense, we have simply created an “entropy meter,” analogous to a thermometer. It is important to note that in this approach the motion of the methyl group is used to sense its local surroundings. This relies on the coupling of motion within the protein such that the “probe” methyl groups report on the local disorder [90]. This is a critical change in perspective where the dynamical probe is interpreted to reflect not only its own disorder but also the disorder of the surrounding non-probe protein matrix. Thus, for the approach to work there must be sufficient coupling between the motion of the probe and its surroundings and there must be a sufficient density of probes to provide adequate coverage of the protein.

5.1 Derivation of the Dynamical Proxy “Entropy Meter”

Making several simple assumptions regarding the nature of the free states so that the solvation entropy can be calculated, one can obtain in this case [90]

$$ \Delta {S_{\mathrm{ conf}}}=m\left[ {\left( {n_{\mathrm{ res}}^{\mathrm{ CaM}}\bullet {{{\left\langle {\Delta O_{\mathrm{ axis}}^2} \right\rangle}}^{\mathrm{ CaM}}}+n_{\mathrm{ res}}^{\mathrm{ target}}\bullet {{{\left\langle {\Delta O_{\mathrm{ axis}}^2} \right\rangle}}^{\mathrm{ target}}}} \right)} \right]+\Delta {S_{\mathrm{ other}}} $$
(5)
$$ \left( {\Delta {S_{\mathrm{ tot}}}-\Delta {S_{\mathrm{ sol}}}} \right)=m\left[ {\left( {n_{\mathrm{ res}}^{\mathrm{ CaM}}\bullet {{{\left\langle {\Delta O_{\mathrm{ axis}}^2} \right\rangle}}^{\mathrm{ CaM}}}+n_{\mathrm{ res}}^{\mathrm{ target}}\bullet {{{\left\langle {\Delta O_{\mathrm{ axis}}^2} \right\rangle}}^{\mathrm{ target}}}} \right)} \right]+\Delta {S_{\mathrm{ RT}}}+\Delta {S_{\mathrm{ other}}} $$
(6)

where ΔS tot, ΔS sol, ΔS conf, ΔS RT, and ΔS other are the changes in total system entropy, solvent entropy, conformational entropy, rotational-translational entropy, and undocumented entropy, respectively; the last is mostly solvent entropy from ion pair dissociation and solvation. The changes in side chain motion are assessed from changes in methyl order parameters weighted by the number of residues in calmodulin (\( n_{\mathrm{ res}}^{\mathrm{ CaM}} \)) and the target domains (\( n_{\mathrm{ res}}^{\mathrm{ target}} \)). By postulate, ΔS conf is linearly related to the residue-weighted change in the dynamics of the target domain and the protein (e.g., calmodulin) upon binding. Linearity is strongly supported by the simple simulations illustrated in Fig. 3 [78]; “m” is the desired empirical scaling factor relating changes in motion to changes in conformational entropy.

ΔS tot is obtained from isothermal titration calorimetry. ΔS sol is obtained from the change in accessible surface area revealed by the structures of free CaM and the complexes. Despite some limitations of this particular system (see below), the approach worked very well. Figure 6 shows the empirical calibration of the “entropy meter” using five calmodulin complexes. The five complexes give an excellent linear relationship (R = 0.95) and a slope (m) of −0.037 ± 0.007 kJ K−1 mol res−1.
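The calibration implied by (6) amounts to a straight-line fit of (ΔS tot − ΔS sol) against the residue-weighted change in order parameters, with the slope giving m and the intercept absorbing ΔS RT + ΔS other. The sketch below shows that fit on synthetic data generated from an arbitrary line. None of the numbers are taken from the measurements described here; the residue counts (148 for calmodulin, 20 for a target domain) and all function names are illustrative assumptions.

```python
def dynamics_term(n_res_protein, mean_do2_protein, n_res_target, mean_do2_target):
    """Residue-weighted change in order parameters, the bracket in Eq. (6)."""
    return n_res_protein * mean_do2_protein + n_res_target * mean_do2_target


def linear_fit(x, y):
    """Unweighted least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(records):
    """records: (dS_tot, dS_sol, dynamics_term) per complex; returns (m, offset)."""
    x = [rec[2] for rec in records]
    y = [rec[0] - rec[1] for rec in records]
    return linear_fit(x, y)


# Synthetic complexes built from an arbitrary "true" line (not real data).
M_TRUE, OFFSET_TRUE = -0.04, 0.2      # kJ K^-1 mol^-1 (per residue for M_TRUE)
DS_SOL = 1.0                          # invented solvent entropy, same units
changes = [(0.02, 0.10), (0.04, 0.05), (-0.01, 0.12), (0.06, 0.08), (0.00, 0.15)]
records = []
for d_cam, d_target in changes:
    x_value = dynamics_term(148, d_cam, 20, d_target)
    records.append((M_TRUE * x_value + OFFSET_TRUE + DS_SOL, DS_SOL, x_value))

m, offset = calibrate(records)
ds_conf_protein = m * 148 * 0.03      # entropy from a hypothetical <dO2> of 0.03
print(f"m = {m:.4f}, offset = {offset:.3f}, dS_conf(protein) = {ds_conf_protein:.3f}")
```

Once m is known, multiplying it by the residue-weighted change in order parameters converts a purely dynamical measurement into an entropy in thermodynamic units, which is the sense in which the method acts as a meter.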

Fig. 6

Calibration of the dynamical proxy for protein conformational entropy. Simple considerations predict a quantitative linear relationship between the difference of the total binding entropy and the solvent entropy and the conformational entropy reported by NMR relaxation parameters derived from methyl-bearing amino acids (see (6)). The dynamics of free CaM and six CaM complexes were determined by deuterium methyl relaxation [89]. The lower CaM:CaMKKα(p) datum is a clear outlier, likely due to residual structure in the free CaMKKα(p) peptide. The upper CaM:CaMKKα(p) point results from a simple correction. Excluding the CaM:CaMKKα(p) complex gives the regression line shown. The slope of −0.037 ± 0.007 kJ K⁻¹ mol res⁻¹ allows for empirical calibration of the conversion of changes in side-chain dynamics to a quantitative estimate of changes in conformational entropy. The ordinate intercept is 0.26 ± 0.18 kJ K⁻¹ mol res⁻¹. Reproduced with permission from Marlow et al. [89]. Copyright Nature Publishing

5.2 Quantitative Evaluation of Entropy in Molecular Recognition by Calmodulin

The calibration of the “entropy meter” allows the contribution of conformational entropy to the binding entropy to be evaluated quantitatively (Fig. 7). Scaling of observed changes in dynamics in calmodulin to real entropy units provides, for the first time, a quantitative statement of how changes in the conformational entropy of a protein contribute significantly to the overall binding free energy of a protein–ligand complex [90]. The basic result is striking: the conformational entropy of calmodulin is a significant component of the free energy of binding of the target domains. Indeed, the variation of the conformational entropy of calmodulin effectively “tunes” the binding entropy [89].

Fig. 7

Decomposition of the entropy of binding of target domains to calcium-saturated calmodulin. Based on (6) and the calibration of the dynamical proxy (see Fig. 6). Solid diamonds are the solvent entropies calculated from the changes in accessible surface area and include the correction resulting from the postulated hydrophobic cluster of the free CaMKKα(p) target domain. The uncorrected value is shown as an open diamond. No structure is available for the CaM:PDE(p) complex so its solvent entropy cannot be calculated. Solid circles and triangles are the contributions to the binding entropy by the conformational entropy of CaM and the target domains, respectively. Solid squares are the contributions to the binding entropy not reflected in the measured dynamics (see (6)), which are obtained from linear regression. Reproduced with permission from Marlow et al. [89]. Copyright Nature Publishing

It is important to point out that this first example of empirical calibration of a dynamical “entropy meter” used a less than ideal system. First, access to the solvent entropy of binding relied on the assumption that the free target domains (represented as peptides) are fully solvated, i.e., completely random coil. This enabled determination of the accessible surface area of the free peptide ligand. As the CaMKKα(p) example illustrated, this assumption is not always warranted. In addition, the calculation of solvent entropy does not consider the creation (or removal) of explicit charge. This is potentially an issue with the calmodulin complexes as each contains intimate and buried ion pairs bridging the components. It is also assumed that there are no differences in the contribution of rotational and translational entropy to the binding free energy across the various complexes. Another important assumption is that the entropy of the free ligand can be represented by the dynamics of an unhindered methyl group. Undocumented variation in \( \Delta S_{\mathrm{sol}} \), \( \Delta S_{\mathrm{RT}} \), or \( \Delta S_{\mathrm{conf}} \) of the ligand would tend to scatter the points. Fortunately, variation of these and other contributions assumed to be constant across the complexes was not a factor [except in the case of the CaMKKα(p) complex]. Nevertheless, it is clear that similar efforts are most ideally carried out on a series of complexes where ambiguity in solvation and rotational and translational entropy is less worrisome. An example of such an approach is shown in Fig. 6, where a point mutant in CaM perturbs the thermodynamics of binding to provide a useful calibration point.

Because the calmodulin complexes represent the first example of the empirical dynamical entropy meter approach, it remains unknown whether or not the empirical scaling constant is generally applicable. It would be surprising if the fundamental nature of the protein molecule giving rise to the motional coupling underpinning the approach would vary dramatically. Nevertheless, modest variation would seem entirely possible and this issue awaits further exploration of additional systems.

5.3 Future Strategies

Though the first application of the empirical “entropy meter” approach employing a dynamical proxy was apparently successful, the use of the calmodulin system highlights several potential weaknesses. In the case of calmodulin complexes, the predominant strategy was to compare complexes of different ligand target domains. Though the target domains are very similar in amino acid composition and size, this strategy introduces uncertainty in the assumption that changes in solvent entropy can be adequately calculated and that rotational-translational entropy and other sources of entropy changes remain constant across the complexes. An alternative and presumably less problematic approach is illustrated by the lone mutant CaM complex examined (Fig. 6). By using mutants distant from the protein–ligand interface to perturb the overall entropy of binding (in part through perturbations of conformational entropy), one can carry out a calibration with the same ligand. Furthermore, if mutants are carefully chosen to avoid significant variation in structure, then uncertainty in changes in rotational-translational and solvent entropy will be largely avoided. Finally, it is not yet clear how to combine different probe types. Though the simple averaging of the dynamics of different methyl-bearing amino acids gives impressive results, one anticipates that a more sophisticated averaging reflecting the variation in degrees of freedom associated with a given methyl group should be employed. This type of consideration will be even more important as quite distinct dynamic probes are introduced (e.g., the large aromatic ring systems) to ensure adequate coverage.

6 Implications for Enzyme Catalysis

Notwithstanding the uncertainty regarding the universality of the empirical linear scaling between protein motion and conformational entropy, it is interesting to speculate what the impact of conformational entropy might be in the catalytic cycle of enzymes. Here we draw two examples from the literature: hen egg white lysozyme (HEWL) and its interaction with a natural inhibitor, and E. coli dihydrofolate reductase (DHFR) and its transition from the binary complex to the ternary Michaelis complex.

6.1 Conformational Entropy and Inhibitor Binding to Lysozyme

HEWL was the first enzyme to have its three-dimensional structure determined by X-ray diffraction [92] and has since served as a paradigm for a wide range of biochemical and biophysical studies. The double-displacement catalytic mechanism proposed initially by Koshland [93], involving a covalent intermediate with the substrate, has supplanted [94] the long-held Phillips mechanism that centered on a long-lived oxocarbenium ion intermediate [92]. Here we examine the thermodynamics of the interaction of an inhibitory carbohydrate with HEWL. Lysozymes catalyze the hydrolysis of β-(1,4)-linkages between N-acetylmuramic acid and N-acetyl-D-glucosamine (GlcNAc) in peptidoglycans. Additionally, some lysozymes, including that from hen egg whites, can cleave between GlcNAc residues in chitodextrins such as chitin [95]. HEWL can accommodate up to six GlcNAc residues of a chitin polymer, each binding in one of six subsites along a cleft of the protein. Cleavage occurs at the linkage between the GlcNAc residues occupying the third and fourth subsites [94, 96] and thus does not readily occur with molecules consisting of only one (GlcNAc), two (chitobiose), or three (chitotriose) GlcNAc residues [97, 98]. These smaller molecules do, however, bind with reasonable affinity (\( K_{\mathrm{d}} \) ~ 10⁻⁴ to 10⁻⁶ M) [98, 99] and therefore act as natural competitive inhibitors. Using isothermal titration calorimetry, the binding of both chitobiose and chitotriose to HEWL has been found to be enthalpically driven over a wide range of temperatures [98]. The extent of the contribution from conformational entropy manifested as fast internal motion of the protein, however, is unclear, as indicated by (1). To examine this issue, we have recently carried out a comprehensive characterization of the sub-nanosecond time scale dynamics of the backbone and of the methyl-bearing side chains of HEWL in the apo state and in complex with chitotriose [100].

The fast sub-nanosecond dynamics of the backbone and the side chains were characterized using ¹⁵N and ²H-methyl relaxation, respectively [100]. From the point of view of fast methyl-bearing side chain dynamics, HEWL is an unusually rigid protein in both its free and chitotriose complexed states. Of proteins that have been examined in this way, only those having high affinity cofactors (flavodoxin) or covalently attached prosthetic groups (cytochrome c and c2) have comparable general rigidity [73, 101, 102]. There is no distinct spatial clustering of rigidity or flexibility in the molecular structure of either the free protein or its binary complex with chitotriose.

The response of the side chain motion to chitotriose binding is complex and heterogeneous, with some sites increasing the amplitude of their motion while others decrease it. Interestingly, a significant number of relatively rigid methyl-bearing side chains of the apo state effectively become completely immobile upon binding of the ligand. These residues form a contiguous grouping that spans the core of the protein including the two catalytic residues. This core of rigidification is capped by residues that are released from an effectively rigid state in the apo state to become more dynamic upon binding chitotriose (Fig. 8). This study appears to reveal a cooperatively formed rigidified core contacting HEWL’s catalytic residues and capped on either end by two sites that become markedly more flexible. The changes in methyl axis squared generalized order parameters across the molecule average to nearly zero (\( \langle \Delta O^2_{\mathrm{axis}} \rangle \) = +0.019 ± 0.004). If it is assumed that the empirical scaling between changes in motion and the corresponding changes in conformational entropy determined for the calmodulin complexes [89] is applicable, then averaging over the entire lysozyme protein and scaling the resulting average change in \( O^2_{\mathrm{axis}} \) with the empirical constant of −0.037 ± 0.007 kJ K⁻¹ mol res⁻¹ gives an estimate of the response of lysozyme to binding of chitotriose: an unfavorable contribution to the binding free energy (\( -T\Delta S_{\mathrm{conf}} \)) of +28 ± 8 kJ mol⁻¹ at 308 K.
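The arithmetic behind the lysozyme estimate can be retraced in a few lines. The sketch below assumes that the residue count used for the weighting is 129 (the length of HEWL, which is not stated explicitly above) and that the uncertainties in the scaling constant and in the average order parameter change are independent and combine in quadrature; both are assumptions made here for illustration, as are all variable and function names.

```python
import math

# Values quoted in the text
M_SCALE = -0.037       # kJ K^-1 mol res^-1, empirical scaling constant
M_SCALE_ERR = 0.007
D_O2 = 0.019           # average change in O2_axis upon chitotriose binding
D_O2_ERR = 0.004
TEMPERATURE = 308.0    # K

# Assumption (not stated in the text): HEWL has 129 residues
N_RES = 129


def conformational_entropy(m, m_err, d_o2, d_o2_err, n_res):
    """Return (dS_conf, uncertainty) in kJ K^-1 mol^-1.

    Uncertainty assumes independent relative errors added in quadrature.
    """
    ds = m * n_res * d_o2
    rel_err = math.sqrt((m_err / m) ** 2 + (d_o2_err / d_o2) ** 2)
    return ds, abs(ds) * rel_err


ds, ds_err = conformational_entropy(M_SCALE, M_SCALE_ERR, D_O2, D_O2_ERR, N_RES)
minus_t_ds = -TEMPERATURE * ds          # contribution to the binding free energy
minus_t_ds_err = TEMPERATURE * ds_err

print(f"dS_conf    = {ds:.3f} +/- {ds_err:.3f} kJ K^-1 mol^-1")
print(f"-T dS_conf = {minus_t_ds:+.0f} +/- {minus_t_ds_err:.0f} kJ mol^-1 at {TEMPERATURE:.0f} K")
```

Under these assumptions the sketch reproduces the quoted +28 ± 8 kJ mol⁻¹. The positive sign reflects a net loss of conformational entropy, which opposes binding.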

Fig. 8

Apparent cooperative rigidification of HEWL upon binding chitotriose. The backbone of the HEWL crystal structure [103] (PDB code 1LZB) is rendered as a ribbon and the chitotriose is shown as a stick figure. Atoms of residues whose methyl groups are effectively rigid in both the apo and complexed states (light) or become effectively rigid in the complexed state (dark) are shown as spheres. The atoms of residues whose methyl groups are effectively rigid in the apo state and become dynamic in the complex and cap the residues that are rigid in the bound state are indicated with arrows. The catalytic amino acids E35 and D52 are also labeled. Taken from Moorman et al. [100] with permission. Copyright Wiley-Blackwell and the Protein Society

6.2 Conformational Entropy and Dihydrofolate Reductase

Dihydrofolate reductase (DHFR) is a ubiquitous enzyme found in all organisms. It catalyzes the hydride transfer reaction converting dihydrofolate (DHF) to tetrahydrofolate (THF) using NADPH as its reducing cofactor. This enzyme is solely responsible for the cellular supply of THF, which serves as an essential metabolic precursor for DNA biosynthesis [104]. The catalytic cycle involves five intermediates, all of which involve at least one bound ligand. Following the hydride transfer step, the protein undergoes a conformational change in an active site loop referred to as the Met 20 loop (Fig. 9). This loop moves from the closed state, in which it shields the reactants from solvent, to the occluded state, where the nicotinamide ring of NADPH is blocked from the active site. Two other proximal loops, the F–G and G–H loops, stabilize the two states of the Met 20 loop through hydrogen bonding interactions [104]. The dynamics of these loops appear to play an important role in catalysis [105]. Studies of sub-nanosecond side chain dynamics of DHFR have also been performed and show significant changes in methyl order parameters during different stages of the catalytic cycle [106]. An interesting example impacts our understanding of the role of conformational entropy during the catalytic cycle. Wright and coworkers [106] determined the Lipari–Szabo methyl symmetry axis order parameters of the E. coli DHFR ternary complex with NADP+ and folate, which is a generally accepted model for the DHFR·NADPH·dihydrofolate Michaelis complex. Lee and coworkers [107] have carried out a similar study of the DHFR·NADPH binary complex. Though the two studies were done under slightly different experimental conditions (T, pH, buffer), the average change in \( O^2_{\mathrm{axis}} \) of about +0.02 on going from the binary to the ternary complex, combined with the scaling constant of Marlow et al. [89], suggests that an unfavorable reduction of ~0.1 kJ mol⁻¹ K⁻¹ in the conformational entropy of DHFR accompanies binding of dihydrofolate to form the ternary DHFR·NADPH·dihydrofolate complex. This would correspond to a very unfavorable contribution to the free energy of dihydrofolate binding of roughly +30 kJ mol⁻¹ at 300 K. This result begins to suggest that conformational entropy can have significant impact in the interconversion of kinetic intermediates during enzyme catalysis.
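The DHFR estimate can be written out as a worked equation. This is a sketch under two assumptions that are not stated above: that the average is taken over the whole protein, as for lysozyme, and that the residue count is 159 (the length of E. coli DHFR).

$$ \Delta S_{\mathrm{conf}} \approx m \cdot n_{\mathrm{res}} \cdot \left\langle \Delta O_{\mathrm{axis}}^2 \right\rangle = \left( -0.037\ \mathrm{kJ\,K^{-1}\,mol\ res^{-1}} \right)\left( 159 \right)\left( +0.02 \right) \approx -0.12\ \mathrm{kJ\,K^{-1}\,mol^{-1}} $$

$$ -T\Delta S_{\mathrm{conf}} \approx -\left( 300\ \mathrm{K} \right)\left( -0.12\ \mathrm{kJ\,K^{-1}\,mol^{-1}} \right) \approx +35\ \mathrm{kJ\,mol^{-1}} $$

Rounding the entropy change to one significant figure (about −0.1 kJ K⁻¹ mol⁻¹), which is all that the approximate value of the order parameter change supports, gives the figure of roughly +30 kJ mol⁻¹ quoted above.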

Fig. 9

Ribbon diagram of the E. coli DHFR enzyme in complex with NADP+ and folate. Drawn with PyMol (Schrödinger, Portland, Oregon). Based on PDB code 1RX2 [108]

7 Conclusions

The application of NMR relaxation to fundamental issues in enzymology is now reaching into the thermodynamic origins of enzyme catalysis. The development of powerful experimental strategies now allows for the measurement of fast internal dynamics of proteins in various functional states and contexts. The experimental and analytical strategies needed to employ measures of motion as a proxy for the underlying conformational entropy of proteins have begun to mature. Tantalizing new features of both protein motion and the role of conformational entropy in protein function are emerging. Indeed, in the few examples of ligand binding to enzymes studied to date, the role of conformational entropy in defining the energetics of product release appears significant and warrants further detailed examination of other systems. It seems clear that these types of insights are likely not only to impact our general understanding of enzymatic function but also to assist in the more robust design of pharmaceuticals directed against them.