Introduction

Protein denaturation is an important process which provides information on the relative stability of the folded and unfolded forms of proteins and thereby insights into the protein folding process [1]. It is well established that cosolvents (anything other than the primary solvent) can be used to alter protein solubility and stability [2]. Small molecules such as urea and guanidinium chloride (GdmCl) are known to destabilize ordered protein structures and are often referred to as protein denaturants. In contrast, many polyols, sugars and amino acids can help stabilize the native form and are often referred to as osmolytes. Other molecules such as 2,2,2-trifluoroethanol (TFE) can actually induce secondary structure. A summary of the general effects of different cosolvents can be found in several comprehensive reviews [3–8]. However, even though the use of cosolvents to induce protein denaturation has a long history, our knowledge of the structural and dynamic details of the resulting denatured state remains rather limited. The exact interactions between cosolvents such as urea and a native or denatured protein are also poorly understood. Furthermore, it is clear that cosolvents can also be used to manipulate protein–protein interactions and therefore provide information on peptide and protein aggregation [9–16]. Hence, a clear picture of cosolvent effects in biomolecular systems is highly desirable.

The desire to understand cosolvent induced protein denaturation [17, 18], equilibrium dialysis [19], osmotic stress [20], the Hofmeister series [5], and light scattering from protein solutions [21–23] has led to the theory of preferential binding and the concept of preferential interactions [19, 24]. In principle, preferential binding is a purely thermodynamic quantity. Traditionally, however, simple binding models have been applied to model and understand the preferential interactions of cosolvents with biomolecules [17, 25]. More recently, Kirkwood–Buff (KB) theory has been used to generate exact expressions for the preferential interactions of cosolvents with proteins which have provided a different view of the cosolvent distribution around a protein. Interestingly, this view is complementary to the original thermodynamic (dialysis) experiments on open systems. The KB approach is particularly well suited for a description of the weak binding exhibited by most denaturants, where common experimental structural techniques (X-ray crystallography and NMR) produce useful, but limited, data concerning cosolvent interactions with biomolecules [26–33].

Our interest in KB theory has stemmed from a desire to use computer simulations to understand the details of cosolvent effects at the atomic level. Our early studies, while providing some interesting observations, were rather qualitative in nature [34–36]. It became increasingly desirable to relate the computer simulation results to real experimental data. This is not a trivial problem as one cannot study the protein denaturation process in full atomic detail with current computational resources, and so a direct determination of the equilibrium constant is not possible. In addition to the sampling issue, we were also concerned as to the quality of the force fields being used for our studies and how accurately they mimic cosolvent effects [37–39]. It was at this stage that we turned to KB theory to help quantify the effects we were observing in our simulations. KB theory is particularly attractive as it is exact and can be applied to solutions containing molecules of any size and type. Initially, our studies focused on the theory and simulation of cosolvent effects on hydrophobic hydration [40, 41], but this has since developed into a general approach for biological systems [42]. Several other authors have pursued a similar approach. Here we review the progress to date.

Notation and Overview

Traditionally, the subscripts 1, 2, and 3 refer to the primary solvent (water), the biomolecule solute, and the cosolvent, respectively. We retain that notation here. However, there are several differences in the present notation from more traditional work [43]. This is primarily a reflection of our desire to use computer simulations to study cosolvent effects, and to apply KB theory to analyze the results. In doing so, we have embraced the pseudo chemical potential approach of Ben-Naim [44]. This is closely related to the excess chemical potential expressions obtained from statistical mechanics and used in previous simulation studies. However, it is different from the traditional excess chemical potential adopted by experimentalists. In particular, number density (or molarity) is the natural concentration unit for most formulations of the chemical potential used in simulation studies. In contrast, the majority of early experimental studies have focused on the molal concentration scale. Furthermore, the thermodynamics of cosolvent effects involve derivatives of the chemical potentials with respect to the cosolvent concentration. Again, there are multiple choices for the cosolvent concentration. Each one produces slightly different, but related, expressions. This is somewhat unfortunate and therefore one must be careful to distinguish between the different definitions, approaches, and notations being used.

The application of KB theory to understand cosolvent effects is the primary advance from existing traditional approaches that is presented here. In particular, the existing approaches are not well suited for the analysis of computer simulations, whereas the descriptions provided through KB theory are simple and easy to apply. Expressions for the preferential interactions and associated activity derivatives in terms of KB integrals have only appeared recently. We believe that the KB integrals, which can be obtained from experiment or through simulation, provide the most promising approach to improve our understanding of cosolvent effects. This is the focus of the current review.

The review is organized as follows. A general outline of the concept of preferential interactions is provided, including a simple derivation of the effects of a cosolvent on the chemical potential of a biomolecule. KB theory is introduced and applied to both open and closed binary and ternary systems. Specific applications of KB theory to biological systems that have appeared in the literature are then reviewed. Several currently available models for protein denaturation are outlined and discussed with reference to the preceding KB analysis. Finally, we review the use of computer simulations to determine preferential interactions, and the use of KB theory to analyze the results of these simulations.

Preferential Binding and Preferential Interactions

Historically, the effect of a cosolvent on the properties of a biomolecule has been quantified in terms of the concept of preferential binding. The general theory has been outlined in detail by Scatchard [45], Wyman [17, 46], Eisenberg [19, 43], Tanford [25, 47], Timasheff [7], Schellman [48–50], and Record [6, 51]. Preferential binding is a thermodynamic expression of the degree of cosolvent binding (over the solvent) derived for systems open to the solvent and cosolvent. The preferential binding parameter at a temperature T is defined by [2],

$$ \Upgamma _{23} =\left({\frac{\partial m_3 }{\partial m_2}} \right)_{T,\mu _1, \mu _3 } $$
(1)

where \(m_{i}=n_{i}/n_{1}\) is the species molality, \(n_{i}\) is the number of molecules of i, and μ is the chemical potential. In an equilibrium dialysis experiment the above quantity measures the increase or decrease in the number of cosolvent and water molecules in the system on changing the biomolecule concentration, as determined from the corresponding solution density changes [19]. The preferential binding parameter is the central property of interest in this work. While it is defined in an open system, we will see that it is just as important and relevant in closed systems. Furthermore, one can use alternative definitions to Eq. 1 to quantify preferential binding when the concentrations are expressed in molarity units and/or the system is open to just species 1 or 3. Relationships between these different definitions are also available [19, 43], and will be discussed later.

Wyman derived a general expression relating changes in the equilibrium constant for a biomolecular process on the addition of a cosolvent, to the binding of the cosolvent to the different species involved in the equilibrium [17]. This was then extended by Tanford to specifically include water binding and exchange [47]. Both Wyman and Tanford used a general binding polynomial approach and the concept of linked thermodynamic functions to provide,

$$ \left({\frac{\partial \ln\,K}{\partial \ln\,a_3 }} \right)_{T,P,\mu _1}^{\rm o} =\Updelta B_3-\frac{\rho _3 }{\rho _1 }\Updelta B_1,$$
(2)

where \(\Updelta B_{3}\) (and \(\Updelta B_{1}\)) describe the difference in cosolvent (and water) binding to each state, \(\rho_{i}\) is the bulk number density (=\(n_{i}/V\)) of each species, and the superscript o denotes the condition of an infinitely dilute protein. Timasheff and Tanford have emphasized that the B values are not independent thermodynamic quantities and so they cannot be separated on thermodynamic grounds [2, 47]. However, one can use models to evaluate the degree of cosolvent (or water) binding which allows one to determine both \(\Updelta B_{i}\) values, although this is not a true thermodynamic decomposition. We will see later that KB theory provides a rigorous thermodynamic relationship between \(B_{3}\) and \(B_{1}\).

The preferential binding parameter can be used to understand equilibria in closed systems by suitable thermodynamic transformations. One usually assumes that [19],

$$ \Upgamma _{23} =\left({\frac{\partial m_3}{\partial m_2}} \right)_{T,\mu _1, \mu _3 }\approx \left({\frac{\partial m_3}{\partial m_2}}\right)_{T,P,\mu_3 } $$
(3)

coupled with the thermodynamic relationship,

$$ \left({\frac{\partial m_3 }{\partial m_2 }} \right)_{T,P,\mu _3 } =-\left({\frac{\partial \mu _2 }{\partial \mu _3 }} \right)_{T,P,m_2 }$$
(4)

and reference to the Wyman linkage equation to obtain the following result for the effect of a cosolvent on the equilibrium constant (K) for denaturation,

$$ \left({\frac{\partial\ln\,K}{\partial \ln\,a_3 }} \right)_{T,P}^{\rm o} =\Upgamma _{D3}-\Upgamma _{N3} =\Updelta \Upgamma_{23} $$
(5)

where \(a_{3}\) is the cosolvent activity (on any scale). Comparison with Eq. 2 therefore suggests that,

$$ \Upgamma_{23} =B_3-\frac{\rho_3 }{\rho_1 }B_1 $$
(6)

Consequently, a cosolvent that displays a larger preferential binding to the denatured (over the native) form will tend to shift the equilibrium in favor of the denatured form and is therefore classified as a denaturant. Alternatively, a cosolvent that displays preferential exclusion from the denatured form, also referred to as preferential hydration, will tend to shift the equilibrium in favor of the native form and is typical of an osmolyte. The above equation is correct as we shall see in the following section. However, we could not find a rigorous transformation from Eq. 1 through Eqs. 2–5 in the literature, even though many of the required transformations are known [19, 43]. It involves both a change in ensemble and a change from the total to a standard chemical potential.

Gibbs–Duhem Approach for Formulating Cosolvent Effects

A more physical picture of the cosolvent effect can be obtained by considering the Gibbs–Duhem equations corresponding to an open system containing 1 and 3 with a fixed amount of 2 in equilibrium with a closed reference solution of just 1 and 3 at the same chemical potentials. This outline is based on an approach by Hall [52], which has also been used by Wyman and Gill [46, p. 299], Parsegian and coworkers [53], Record [51], and more recently by Shimizu [54]. The physical picture is clearly illustrated in Reisler et al. [55]. At constant T and external P one can write,

$$ N_2 d\mu _2 +N_1 d\mu _1+N_3 d\mu _3 -(V-N_2 \overline {V_2 })d\Uppi =0 $$
(7)
$$ n_1 d\mu _1 +n_3 d\mu _3=0 $$
(8)

for the biomolecule and reference solutions, respectively. Here, \(\Uppi\) is the osmotic pressure due to the presence of the biomolecule, V is the solution volume after addition of the biomolecule, and \(\overline {V_2 }\) is the partial molar volume (pmv) of the biomolecule. The values of \(N_{i}\) represent the number of molecules per biomolecule in the open system, while the \(n_{i}\) values are the corresponding numbers in the bulk reference solution at the same cosolvent and water chemical potentials. The \(N_{i}\) values can differ from the \(n_{i}\) values due to the perturbing effect of the biomolecule on the solvent and cosolvent chemical potentials, which requires a solution redistribution to maintain equilibrium with the reference solution. The two equations shown above can be rearranged to eliminate \(d\mu_{1}\) and provide, in the limit that \(N_{2}\) tends to zero and using the limiting van’t Hoff equation for the osmotic pressure \((\Uppi = RT \rho_{2})\), the final expression,

$$ -\left({\frac{\partial\mu_2^{\rm O}}{\partial \mu_3 }} \right)_{T,P}^{\rm o} =N_3-\frac{\rho _3}{\rho_1}N_1 $$
(9)

where the number densities refer to the bulk reference solution. Therefore, if \(N_{3}/N_{1}\) is greater than the bulk ratio of cosolvent to solvent (\(\rho_{3}/\rho_{1}\)) then the standard chemical potential of the biomolecule (using any concentration scale) is decreased upon increasing the cosolvent activity (or concentration). Alternatively, if it is less than the bulk ratio the standard chemical potential is increased, whereas if it is the same as the bulk ratio the cosolvent has no effect on the standard chemical potential. The effect on an equilibrium process is then simply,

$$ \left({\frac{\partial \ln\,K}{\partial \ln\,a_3}} \right)_{T,P}^{\rm o} =\Updelta N_3 -\frac{\rho_3 }{\rho _1 }\Updelta N_1 $$
(10)

where we have assumed a simple two state process.

The above equations provide a rigorous thermodynamic description of the effects of cosolvents on biomolecules. The equations are exact within the infinitely dilute biomolecule limit. The values of \(\Updelta N_{3}\) and \(\Updelta N_{1}\) have traditionally been estimated using simple binding polynomials or binding models which assume (or at least suggest) binding to the surface of the protein. It is clear that the values of \(\Updelta N_{i}\) measure changes in the number of cosolvent and water molecules over the whole solution and not just at the surface of the protein. We will see that KB theory is particularly well suited to this type of situation. Equation 10 is the same as Eq. 5 if we interpret \(\Updelta B_{i}\) to be the same as \(\Updelta N_{i}\). One will observe that many of the equations that result from the application of KB theory bear a striking resemblance to those already established using previous approaches. However, they are fundamentally different from the general binding polynomial models, as are the details they provide concerning the intermolecular distributions.

Kirkwood–Buff Theory

The original paper concerning KB theory was published in 1951 [56]. An expanded derivation was subsequently provided by Mazo and others [52, 57, 58]. However, the approach suffered from several drawbacks compared to other solution theories being developed at that time. First, KB theory requires radial distribution functions as input, whereas these are typically output data from other theories. Second, it cannot be applied directly to ionic solutions. It is fair to say that the theory lay relatively unused until 1977 when Ben-Naim illustrated how one can use KB theory to analyze experimental data on solution mixtures [59]. Since then it has been used extensively in the chemistry and chemical engineering fields to provide information on intermolecular distributions and preferential solvation in solution [60, 61]. The potential for understanding the properties of biomolecular solutions was recognized relatively early [44, 62], but very little analysis of real experimental data was forthcoming. Only recently has KB theory gained attention for its potential to rationalize the effects of both osmolytes and protein denaturants. The specific advantages of KB theory include:

  1. It is an exact theory.

  2. It can be applied to any stable solution mixture involving any number of components.

  3. It can be applied to molecules of all sizes and complexity.

  4. It does not assume pairwise additivity of interactions.

  5. It is well suited for the analysis of computer simulation data.

Several disadvantages will become apparent during the following discussion.

KB theory provides relationships between particle distribution functions in the grand canonical (μVT) ensemble and derivatives of the chemical potentials of all species involved in either open, semi-open or closed systems. The primary quantity of interest is the KB integral (\(G_{ij}\)) between species i and j given by [44],

$$ G_{ij} =G_{ji} =4\pi\int\limits_{0}^{\infty} \left[g_{ij}^{\mu VT} (r)-1\right]r^{2}\,dr $$
(11)

where \(g_{ij}\) is the corresponding center of mass based radial distribution function (rdf). The above integral quantifies the deviation in the distribution of j molecules around a central i molecule from that of a random distribution in an equivalent volume of the bulk solution. We note that the KB integrals are sensitive to small deviations from the bulk distribution at large separations due to the \(r^{2}\) weighting factor. The above integrals adopt a form similar to that of the Mayer f-functions used in classical solution theory. However, the above rdf should be interpreted as a potential of mean force (pmf), through the relationship \(W_{ij}(r) = - RT \ln\,g_{ij}(r)\), where one has averaged over the configurations of all other species in the solution including any other i and j molecules. The Mayer f-functions typically represent the pmf between pairs of solute molecules at infinite dilution in a system of solvent molecules [63]. The former is directly applicable at finite solute concentrations and is easily provided by computer simulations, whereas the latter forms part of a series expansion and can be statistically unreliable via simulation.
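As a minimal illustration of Eq. 11, the following Python sketch integrates a tabulated rdf by the trapezoidal rule. The function names (`kb_integral`, `excess_coordination`), the finite upper cutoff, and the hard-core test rdf are assumptions made purely for illustration and are not taken from any of the cited studies. The sketch also ignores the corrections that may be needed when the rdf comes from a closed (finite) simulation box rather than an open system.

```python
import math


def kb_integral(r, g):
    """Trapezoidal estimate of G_ij = 4*pi * int [g_ij(r) - 1] r^2 dr (Eq. 11).

    r : increasing list of separations (any length unit)
    g : rdf values tabulated at the same separations
    The result has units of volume (length unit cubed).
    """
    if len(r) != len(g) or len(r) < 2:
        raise ValueError("r and g must have the same length (at least 2)")
    total = 0.0
    for k in range(len(r) - 1):
        f0 = (g[k] - 1.0) * r[k] ** 2
        f1 = (g[k + 1] - 1.0) * r[k + 1] ** 2
        total += 0.5 * (f0 + f1) * (r[k + 1] - r[k])
    return 4.0 * math.pi * total


def excess_coordination(rho_j, G_ij):
    """Excess coordination number N_ij = rho_j * G_ij."""
    return rho_j * G_ij


# Hypothetical hard-core rdf: g = 0 inside a contact distance of 0.3 (arbitrary
# length units) and g = 1 beyond it, so that G should approach -(4/3)*pi*0.3^3.
h = 0.001
r = [k * h for k in range(2001)]
g = [0.0 if k < 300 else 1.0 for k in range(2001)]
G = kb_integral(r, g)
exact = -4.0 * math.pi * 0.3 ** 3 / 3.0
```

For this step-function rdf the integral is simply minus the excluded volume, which provides a convenient check on the numerical integration. Real rdfs oscillate about unity, and the \(r^{2}\) weighting amplifies any noise in the tail, so the choice of cutoff typically matters.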

An equivalent expression for the KB integral can be written in terms of particle number fluctuations,

$$ G_{ij} =V\left[{\frac{\langle N_i N_j \rangle -\langle N_i \rangle \langle N_j \rangle}{\langle N_i \rangle \langle N_j \rangle}-\frac{\delta _{ij} }{\langle N_i \rangle}} \right] $$
(12)

where \(\delta_{ij}\) is the Kronecker delta function. The above expression emphasizes the fact that the KB integrals are defined in an open system where the number of particles can fluctuate. Hence, KB theory is often referred to as the fluctuation theory of solutions. This is also apparent from the fact that the value of \(\rho_{j} G_{ij}\) equals 0 if \(i \neq j\) and \(-1\) if \(i = j\) for a closed system [44]. Only after suitable thermodynamic transformations do the integrals provide information on closed systems. In this case the KB integrals correspond to an equivalent open system in which the \(\langle N_{i} \rangle\) equals the fixed \(N_{i}\) in the closed system, etc.
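The fluctuation form in Eq. 12 can be sketched in a few lines of Python. The function name `kb_from_fluctuations` and the sample particle counts are hypothetical; the sketch simply evaluates the averages in Eq. 12 from a list of instantaneous particle numbers observed in a fixed volume.

```python
def kb_from_fluctuations(n_i, n_j, volume, same_species=False):
    """Evaluate Eq. 12 from paired samples of particle numbers N_i and N_j.

    n_i, n_j     : equal-length lists of instantaneous particle counts
    volume       : volume in which the particles were counted
    same_species : True when i and j are the same species (delta_ij = 1)
    """
    if len(n_i) != len(n_j) or len(n_i) == 0:
        raise ValueError("n_i and n_j must be non-empty and of equal length")
    m = len(n_i)
    mean_i = sum(n_i) / m
    mean_j = sum(n_j) / m
    mean_ij = sum(a * b for a, b in zip(n_i, n_j)) / m
    delta = 1.0 if same_species else 0.0
    return volume * ((mean_ij - mean_i * mean_j) / (mean_i * mean_j)
                     - delta / mean_i)


# Closed-system limit: the particle numbers do not fluctuate at all.
volume = 10.0
fixed_i = [100] * 5
fixed_j = [40] * 5
G_ii = kb_from_fluctuations(fixed_i, fixed_i, volume, same_species=True)
G_ij = kb_from_fluctuations(fixed_i, fixed_j, volume)
rho_i = 100 / volume
rho_j = 40 / volume
```

With fixed particle numbers the covariance term vanishes, so `rho_i * G_ii` reduces to −1 and `rho_j * G_ij` to 0, which is the closed-system result quoted above.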

An excess coordination number can be defined \((N_{ij}=\rho_{j}G_{ij} \neq N_{ji})\) which characterizes the number of j molecules observed around an i molecule in the open system, above that observed within an equivalent volume of a bulk reference solution at the same chemical potential [52]. It is a measure of how the addition of a single i molecule affects the distribution of i and j molecules around it in reference to the corresponding bulk distribution. For small molecules (but not proteins), a positive value of \(N_{ij}\) typically indicates an increase in the local density of j around i above that of their bulk ratios. This can be viewed as the result of some favorable net interaction or affinity between the two species.

As the KB integrals correspond to distributions in open systems the expressions for solution properties in open systems are generally rather simple. The KB integrals provide expressions for derivatives in the grand canonical ensemble [44],

$$ \frac{RT}{V}\left({\frac{\partial \langle N_i \rangle }{\partial \mu _j}} \right)_{T,V,\mu_{k\neq j}} =\rho _i \rho _j G_{ij} +\rho _i \delta _{ij}=\frac{\langle N_i N_j \rangle -\langle N_i \rangle \langle N_j \rangle }{V} $$
(13)

where R is the gas constant and the system is open to all components. The above set of equations can then be transformed to other ensembles. The expressions become more complicated as we move to closed systems and/or increase the number of components in the solution. A general matrix formulation is available for chemical potential derivatives in closed systems [44]. Recently, we have suggested a stepwise transformation process which provides expressions for preferential interaction parameters in semi-open and closed systems in a simple manner [64, 65].

There are many concentration scales which can be used to monitor changes in the chemical potentials. For instance,

$$ \begin{aligned} \mu _i= &\mu _i^{o,m} +RT \ln\,\gamma_i m_i\\ \mu _i = &\mu _i^{o,c} +RT \ln\,y_i c_i\\ \mu _i = &\mu _i^{o,x} +RT \ln\,f_i x_i = \mu _i^{o,x} +\mu_i^{ex} +RT \ln\,x_i\\ \mu _i = &\mu _i^{\ast} +RT \ln\,\Uplambda_i^3\rho_i\\ \end{aligned} $$
(14)

The first three represent common definitions used in experimental work based on whether one measures concentrations in units of molality (m), molarity (c), or mole fraction (x), respectively. The last expression is the result obtained from statistical mechanics [44]. Here, \(\Uplambda \) is the thermal deBroglie wavelength and the number density (proportional to molarity) is the natural concentration scale. The pseudo chemical potential \((\mu^\ast)\) is important in the current analysis. It captures the effect of the solution composition on the Gibbs free energy for transfer of a molecule of i from a fixed position in a vacuum to a fixed position in the solution [44]. The advantages of using this type of approach have been discussed extensively by Ben-Naim [44, 63]. Other concentration scales introduce additional (arguably unwanted) terms into the analysis which depend on properties of the solution and have nothing to do with the biomolecule itself [64]. We will focus on the molarity based activities and derivatives for this reason.

Finally, we note that derivatives of the above expressions for the chemical potentials depend on the species involved. If one considers the solvent or cosolvent then we have \(d\mu_{i} = RT d \ln\,a_{i}\), as the standard chemical potential is constant whether the biomolecule is at infinite dilution or not. However, the standard state or pseudo chemical potential of the biomolecule depends on the bulk solution composition, even at infinite dilution, and therefore one has \(d \mu_{2} = d \mu_{2}^{\rm o} + RT d \ln\,a_{2}\) unless the mole fraction scale (pure protein reference state) is adopted.

Application of KB Theory to Closed Binary Systems

Before applying KB theory to understand biomolecular equilibria in solution, it is informative to examine some of the results obtained for binary systems. This is particularly relevant for situations where the biomolecule concentration is low and therefore many thermodynamic transformations simply involve the properties of the pure cosolvent and water solutions. We will keep the solvent (1) and cosolvent (3) notation to remain consistent with the previous and following sections. The KB inversion procedure involves the analysis of experimental activity derivatives, pmvs, and the isothermal compressibility \((\kappa_{T})\) of solutions as a function of composition [59]. Using these data the three KB integrals (\(G_{11}\), \(G_{33}\), and \(G_{13}\)) can be obtained via the relationships [56],

$$ a_{33} =\beta \left({\frac{\partial \mu _3 }{\partial \ln\,\rho _3 }} \right)_{T,P}=\left({\frac{\partial \ln\,a_3}{\partial \ln\,\rho _3 }}\right)_{T,P} =\frac{1}{1+\rho _3 (G_{33} -G_{13})} $$
(15)

with,

$$ \overline {V_i }=\frac{1+\rho _j (G_{jj} -G_{ij})}{\rho _1 +\rho _3 +\rho _1 \rho_3 (G_{11} +G_{33} -2G_{13})} $$
(16)

and,

$$ RT\kappa _T =\frac{1+\rho_1 G_{11} +\rho _3 G_{33} +\rho _1 \rho _3 (G_{11} G_{33} -G_{13}^2)}{\rho _1 +\rho _3 +\rho _1 \rho _3 (G_{11} +G_{33} -2G_{13})}$$
(17)

where \(\beta = 1/RT\). The three equations allow one to obtain the three KB integrals as a function of composition. Alternatively, if the KB integrals can be determined from simulation then one can predict the above thermodynamic properties. In practice, the experimental analysis is relatively insensitive to the exact value of \(\kappa_{T}\) for most solutions. Therefore, one can safely set the value of \(\kappa_{T}\) equal to that of pure water and treat the resulting expression as a constraint equation for the three KB integrals without significantly affecting the accuracy of the results [66].
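The forward direction described above, predicting thermodynamic properties from known KB integrals, can be sketched as follows. The function name `binary_kb_properties`, the returned dictionary keys, and the numerical inputs are assumptions chosen for illustration; the inputs are arbitrary and do not correspond to any measured solution. Densities and KB integrals must be supplied in mutually consistent units (for example mol/L and L/mol).

```python
def binary_kb_properties(rho1, rho3, G11, G33, G13):
    """Evaluate Eqs. 15-17 for a binary solvent (1) / cosolvent (3) mixture.

    Returns the activity derivative a33, the partial molar volumes V1 and V3,
    and the product RT*kappa_T (which has units of volume per amount).
    """
    # Common denominator of Eqs. 16 and 17.
    eta = rho1 + rho3 + rho1 * rho3 * (G11 + G33 - 2.0 * G13)
    a33 = 1.0 / (1.0 + rho3 * (G33 - G13))                   # Eq. 15
    v1 = (1.0 + rho3 * (G33 - G13)) / eta                    # Eq. 16 with i = 1, j = 3
    v3 = (1.0 + rho1 * (G11 - G13)) / eta                    # Eq. 16 with i = 3, j = 1
    rt_kappa = (1.0 + rho1 * G11 + rho3 * G33
                + rho1 * rho3 * (G11 * G33 - G13 ** 2)) / eta  # Eq. 17
    return {"a33": a33, "V1": v1, "V3": v3, "RT_kappa_T": rt_kappa}


# Arbitrary illustrative inputs in which all three KB integrals are equal.
# Equal G33 and G13 give a33 = 1, i.e. ideal behaviour on the molarity scale.
ideal = binary_kb_properties(rho1=50.0, rho3=5.0,
                             G11=-0.017, G33=-0.017, G13=-0.017)
```

A useful consistency check on any implementation is that the volume fractions sum to one, \(\rho_{1}\overline{V_1} + \rho_{3}\overline{V_3} = 1\), which follows directly from Eq. 16 for any choice of KB integrals.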

A KB analysis of some common cosolvent solutions of biological interest (urea, GdmCl, NaCl, and TFE) has been provided [67]. Fitting equations and parameters for the required activity derivatives for common denaturants and osmolytes have also been determined [42, 68–70]. A KB analysis for the above cosolvents and water mixtures at 298 K and 1 atm is presented in Figs. 1 and 2 as an illustrative example. Here, it is observed that urea solutions are almost ideal on the molarity scale, a direct result of the similar values for \(G_{33}\) and \(G_{13}\), which lead to \(a_{33} \approx 1\). Urea and GdmCl display remarkably similar values for the excess coordination numbers and \(a_{33}\) values, especially considering one is a salt. TFE displays rather large values of \(N_{ij}\) and the value of \(a_{33}\) deviates significantly from that of an ideal solution. We will see that the value of \(a_{33}\) is of central importance in our understanding of cosolvent binding to, and exclusion from, biomolecules in solution [42].

Fig. 1 Experimental excess coordination numbers (\(N_{ij}\)) for different cosolvent (3) and water (1) mixtures at 298 K and 1 atm as a function of cosolvent molarity (\(c_{3}\)). Data taken from [67]. Note the change in scale for TFE. For comparison, 5 M TFE corresponds to a value of \(x_{3} = 0.12\) and a %v/v of 35%

Fig. 2 Experimental values of the activity derivative \(a_{33}\) for different cosolvent (3) and water (1) mixtures as a function of cosolvent molarity (\(c_{3}\)) at 298 K and 1 atm. Data taken from [67]

KB theory can also be used to provide additional relationships which can be used to simplify the general results. For example, an equivalent expression for the pmv of the cosolvent is available [52, 65, 67],

$$ \overline {V_3} =RT\kappa_T -N_{31} \overline {V_1} -N_{33} \overline {V_3} $$
(18)

and can be used to eliminate \(G_{13}\) from the above equation for \(a_{33}\) to generate [71],

$$ a_{33} =\frac{\phi _1}{1+\rho _3 (G_{33} -RT\kappa _T)}\approx \frac{\phi _1 }{1+N_{33}} $$
(19)

where \(\phi_{1}=\rho_{1}\overline {V_1}\) is the volume fraction of water and the approximation is very good (<1%) due to the low compressibility of solutions. Relationships between the molarity, molality, mole fraction, and activity derivatives are provided by the following thermodynamic expressions,

$$ \begin{aligned} &\left({\frac{\partial \ln\,\rho _3 }{\partial \ln\,m_3 }} \right)_{T,P}\,=\,\phi _1 \quad\left({\frac{\partial \ln\,x_3 }{\partial \ln\,m_3 }}\right)_{T,P}\,=\,x_1\quad \left({\frac{\partial \ln\,\rho _3 }{\partial\ln\,x_3 }} \right)_{T,P}\,=\,\frac{\phi _1 }{x_1 }\\ &\left({\frac{\partial \ln\,\rho _3 }{\partial \ln\,\rho _1 }} \right)_{T,P}\,=\,-\frac{\phi _1 }{\phi _3 }\quad \; -\beta \left({\frac{\partial \mu _1}{\partial \ln\,\rho _3 }} \right)_{T,P}\,=\,\frac{\rho _3 }{\rho _1}a_{33}\quad a_{33} \overline {V_1 }\,=\,a_{11} \overline {V_3 } \; \\ &\left({\frac{\partial \mu _3 }{\partial \mu _1 }} \right)_{T,P}\,=\,\left({\frac{\partial \ln\,a_3 }{\partial \ln\,a_1 }} \right)_{T,P}\,=\,-\frac{\rho _1 }{\rho _3 }\\ \end{aligned} $$
(20)

and so one can easily transform between concentration scales.
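As a small worked example of these transformations, the chain rule combined with the first row of Eq. 20 converts the molarity based activity derivative \(a_{33}\) into the corresponding molality and mole fraction based derivatives. The function name and the numerical inputs below are hypothetical and serve only to illustrate the arithmetic.

```python
def activity_derivatives(a33, phi1, x1):
    """Convert a33 = (d ln a3 / d ln rho3) to other concentration scales.

    Uses the chain rule with Eq. 20:
      d ln a3 / d ln m3 = a33 * (d ln rho3 / d ln m3) = a33 * phi1
      d ln a3 / d ln x3 = a33 * (d ln rho3 / d ln x3) = a33 * phi1 / x1
    phi1 is the volume fraction of water and x1 its mole fraction.
    """
    if not (0.0 < phi1 <= 1.0 and 0.0 < x1 <= 1.0):
        raise ValueError("phi1 and x1 must lie in the interval (0, 1]")
    return {
        "molarity": a33,
        "molality": a33 * phi1,
        "mole_fraction": a33 * phi1 / x1,
    }


# Arbitrary illustrative values, not taken from experiment.
scales = activity_derivatives(a33=0.9, phi1=0.8, x1=0.9)
```

Because \(d\mu_{3} = RT d \ln\,a_{3}\) for the cosolvent, the numerator is the same on every scale and only the concentration variable in the denominator changes.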

The application of KB theory to solutions of salts is complicated by correlations between the ions which are a consequence of electroneutrality constraints [72, 73]. Hence, one cannot treat salt solutions as ternary systems of solvent, cations, and anions, i.e., one cannot determine derivatives of the chemical potentials with respect to the cation concentration, for example. However, one can treat the solution as a binary system of solvent and indistinguishable ions [40, 42, 67, 72]. In this case, the chemical potential and concentration of component 3 are different from those used experimentally. If we consider a salt (s) which generates \(n_{+}\) cations and \(n_{-}\) anions and therefore a total of \(n_{\pm} = n_{+} + n_{-}\) ions in solution, then the relationships between the traditional salt properties and the properties of the indistinguishable species 3 used in our KB analysis are given by \(\rho_{3} = n_{\pm} \rho_{\rm s}, d \ln\,\rho_{3} = d \ln\,\rho_{\rm s}, n_{\pm} d\mu_{3} = d \mu_{s}\), and \(y_{3} = y_{\pm}\), etc. [42, 67]. Hence, the cosolvent concentration (\(\rho_{3}\)) becomes the total ion concentration, for example. There are also electroneutrality relationships available between the KB integrals (rdfs) for the indistinguishable ion species 3 and the corresponding KB integrals for the anions and cations [67, 73]. More details can be found in the literature [42, 67, 73].

Application of Kirkwood–Buff Theory to Open Ternary Systems

The following results all refer to a ternary system where component 2 is at infinite dilution. We will discuss the KB results for the preferential binding parameter first as this is also defined in open systems and hence the corresponding expression in terms of KB integrals is relatively simple. Several derivations of the expression for \(\Upgamma_{23}\) have appeared in the literature and differ in complexity depending on the starting ensemble [42, 74–77]. We have recently applied KB theory to understand the density changes observed in equilibrium dialysis experiments [64]. In our opinion this is the simplest and most direct approach. Starting from the Gibbs–Duhem relationship for the open and closed systems (Eqs. 7 and 8) and using Eq. 13 it can be shown that [64, 65],

$$ \Upgamma _{23} =\rho _3(G_{23} -G_{21})=N_{23} -\frac{\rho _3 }{\rho _1 }N_{21} $$
(21)

The above expression is exact and quantifies the cosolvent and solvent redistribution which occurs in an open system on introduction of a biomolecule. It is both visually and formally identical to the expression presented in Eq. 6. A few points concerning the above equation are worth noting. First, the KB integrals characterize changes in the molecule distributions over all distances away from the central biomolecule, not just the surface distribution. Second, the expression only contains KB integrals involving the biomolecule and not other thermodynamic properties of the solution mixture. Third, it includes changes in both the cosolvent and solvent distributions—primarily due to the use of molalities in the definition of the preferential binding parameter and the fact that the system is open to both species.
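Equation 21 is simple enough to evaluate directly, as the following Python sketch shows. The function names and the numerical values are hypothetical and chosen only so that the two equivalent forms of Eq. 21 can be compared; the last helper applies Eq. 5 to a two state process.

```python
def preferential_binding(rho3, G23, G21):
    """Gamma_23 = rho3 * (G23 - G21), the first form of Eq. 21."""
    return rho3 * (G23 - G21)


def preferential_binding_from_coordination(rho1, rho3, N23, N21):
    """Gamma_23 = N23 - (rho3 / rho1) * N21, the second form of Eq. 21."""
    return N23 - (rho3 / rho1) * N21


def linkage_slope(gamma_denatured, gamma_native):
    """d ln K / d ln a3 = Gamma_D3 - Gamma_N3 for a two state process (Eq. 5)."""
    return gamma_denatured - gamma_native


# Arbitrary illustrative numbers (densities and KB integrals in consistent units).
rho1, rho3 = 50.0, 5.0
G23, G21 = -2.0, -3.0
gamma_a = preferential_binding(rho3, G23, G21)
# Excess coordination numbers N_2j = rho_j * G_2j.
gamma_b = preferential_binding_from_coordination(rho1, rho3,
                                                 rho3 * G23, rho1 * G21)
```

Both routes give the same value by construction. In this example \(G_{23} > G_{21}\), so \(\Upgamma_{23}\) is positive and the cosolvent is preferentially bound even though both KB integrals are negative, which reflects the fact that only the difference between the cosolvent and solvent distributions matters.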

The above equation has to be modified when the cosolvent is a salt and the biomolecule releases ions on addition to the solution which are identical to one of the salt ions [19]. In this case one obtains a Donnan contribution which gives rise to the following expression [42],

$$ n_\pm \Upgamma _{23}=N_{23} -\frac{\rho _3 }{\rho _1 }N_{21} -Z $$
(22)

where Z is the absolute charge on the biomolecule, and the factor of \(n_{\pm}\) corrects the experimental data to fit the indistinguishable ion approximation. The most common example where the Z factor is important is for DNA. Hence, studies of Na\(_{Z}\)DNA polyelectrolytes in NaCl solution require the additional term. The physical picture is to reduce the preferential binding by subtracting the first Z cations which are closest to the DNA from the counting process as these are required for electroneutrality. The above expression agrees with that of Record and coworkers but is slightly different from that of Schellman due to a different definition of the excess chemical potential of the biomolecule in the latter case [49, 78, 79]. In the remaining discussion, we shall ignore the Donnan term.

The main source of preferential interaction data corresponding to the above equations is equilibrium dialysis. However, there are a variety of other sources which provide experimental data for systems that are open to only the solvent or only the cosolvent. Furthermore, one can measure the concentration changes using molarities or mole fractions. Hence, a series of preferential binding parameters are available which differ in the concentration scales and ensembles used [42, 49, 76, 77, 80]. Expressions for these parameters in terms of KB integrals are summarized in Table 1. We have argued that the simplest and most useful is the original definition contained in Eqs. 1 and 21 [64]. The main reason for this is the fact that the preferential interaction, as measured using molalities in a fully open system, only contains KB integrals involving the biomolecule. Other ensembles introduce contributions which involve properties of the solution mixture itself. For instance, the relationship between the molality based preferential interactions in the various ensembles is given by [19, 42],

$$ \left({\frac{\partial m_3}{\partial m_2}}\right)_{T,P,\mu _3}^{\rm o} =\Upgamma_{23}+\frac{\phi _3 }{a_{33} \phi _1 } $$
(23)

and,

$$ \left({\frac{\partial m_3 }{\partial m_2 }} \right)_{T,P,\mu _1 }^{\rm o} =\Upgamma _{23}-\frac{1}{a_{33} } $$
(24)
Table 1 Kirkwood–Buff derived expressions for preferential binding parameters defined in different ensembles and using various concentration scales

Other relationships are provided in Table 1. The last term in both equations involves properties of the reference solution. These correction terms may be small for urea and gdmcl as \(a_{33}\) is approximately unity, but can become significant for solutions such as TFE where \(a_{33}\) can be small (0.2 or so). In addition, only the molality based preferential binding parameters display the appealing property of being zero when there is no change from the bulk solution distributions. Several studies have demonstrated various relationships between the preferential interactions in different ensembles [6, 19, 42, 49, 51, 64, 75, 76, 80]. For instance, it can be shown that [64],

$$ \Upgamma _{23} =\phi _1\left({\frac{\partial m_3 }{\partial m_2 }} \right)_{T,P,\mu _3}^{\rm o} + \phi _3 \left({\frac{\partial m_3 }{\partial m_2 }}\right)_{T,P,\mu _1 }^{\rm o} $$
(25)
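Equation 25 follows directly from Eqs. 23 and 24 when the volume fractions satisfy \(\phi_{1}+\phi_{3}=1\). The sketch below checks this numerically; the function names and the input values are illustrative assumptions rather than data from the cited studies.

```python
def dm3_dm2_const_mu3(gamma23, a33, phi1, phi3):
    """Eq. 23: derivative at constant T, P and mu_3."""
    return gamma23 + phi3 / (a33 * phi1)


def dm3_dm2_const_mu1(gamma23, a33):
    """Eq. 24: derivative at constant T, P and mu_1."""
    return gamma23 - 1.0 / a33


def gamma23_from_ensembles(d_mu3, d_mu1, phi1, phi3):
    """Eq. 25: volume-fraction-weighted combination of the two derivatives."""
    return phi1 * d_mu3 + phi3 * d_mu1


# Hypothetical TFE-like case with a small a_33 (values for illustration only).
gamma23, a33, phi3 = 10.0, 0.2, 0.3
phi1 = 1.0 - phi3  # assumes phi_1 + phi_3 = 1 (biomolecule at infinite dilution)

d_mu3 = dm3_dm2_const_mu3(gamma23, a33, phi1, phi3)
d_mu1 = dm3_dm2_const_mu1(gamma23, a33)
print(d_mu3, d_mu1, gamma23_from_ensembles(d_mu3, d_mu1, phi1, phi3))
```

With a small \(a_{33}\) the two ensemble derivatives differ noticeably from \(\Upgamma_{23}\), yet their weighted combination recovers it.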

Expressions for the preferential binding and chemical potential derivatives in systems where the biomolecule is at finite concentrations are also available [65, 81].

Application of Kirkwood–Buff Theory to Closed Ternary Systems

Most experiments are performed in closed systems, usually with a fixed number of molecules at a constant T and P. Many equivalent expressions for derivatives of the total chemical potentials have been proposed for these systems. However, changes in the pseudo chemical potential are the most relevant for the present discussion. Ben-Naim originally determined the effect of a cosolvent on the pseudo chemical potential of an infinitely dilute solute \((d \mu_{2}^\ast/dn_{3})_{P,T}\) [44, 82]. Chitra and Smith then used this expression combined with the results for binary systems to formulate a simple expression for the effects of cosolvents on the solubility of molecules in solutions [41]. In the present notation this expression is,

$$ -\beta \left({\frac{\partial \mu _2^\ast }{\partial \ln\,\rho _3 }}\right)_{T,P}^{\rm o} =\frac{\rho _3 (G_{23} -G_{21})}{1+\rho _3(G_{33} -G_{31})}=\Upgamma _{23} a_{33} $$
(26)

The expression is exact and can be used to interpret the effect of a cosolvent on the solubility of a biomolecule, or for understanding transfer data on small molecules resembling the common functional groups of amino acids. The value of \(a_{33}\) must be positive for real solutions and so a positive preferential binding corresponds to a decrease in the pseudo chemical potential (salting in effect). The above equation involves four KB integrals. The values of \(G_{33}\) and \(G_{13}\) are relatively easy to determine. The values of \(G_{23}\) and \(G_{21}\) are more difficult. However, one can use the general expression valid for any component i at any concentration in any n-component solution [52, 64],

$$ RT\kappa _T =\sum_{j=1}^n{\overline {V_j } (N_{ij} +\delta _{ij})} $$
(27)

to obtain the KB result for the pmv of the biomolecule (i = 2) at infinite dilution,

$$ \overline {V_2}^{\rm o}=RT\kappa _T -N_{21} \overline {V_1 } -N_{23} \overline {V_3 }$$
(28)

which in combination with the approximation used in Eq. 19 provides [71],

$$ -\beta \left({\frac{\partial \mu _2^\ast }{\partial \ln\,\rho _3 }}\right)_{T,P}^{\rm o} =\frac{\rho _3 (G_{23} +\overline {V_2 } ^{\rm o}-RT\kappa _T)}{1+\rho _3 (G_{33} -RT\kappa _T)} $$
(29)

The above relationship reduces the problem to a determination of two KB integrals (\(G_{23}\) and \(G_{33}\)), one of which (\(G_{33}\)) should be relatively easy, and properties of the system (\(\overline {V_2}^{\rm O}\) and \(\kappa_{T}\)) which can be determined or approximated fairly accurately. The transfer free energy for a solute can then be obtained upon integration of the above equation. The difference between derivatives of the total and the pseudo chemical potential is given by,

$$ -\left({\frac{\partial\mu _2}{\partial \mu _3 }} \right)_{T,P,m_2 }^{\rm o} =-\left({\frac{\partial \mu _2^\ast }{\partial \mu _3 }} \right)_{T,P,m_2}^{\rm o} +\frac{\phi _3 }{a_{33} \phi _1 } $$
(30)

The same relationship is true for changes in the molar concentration standard state of an infinitely dilute biomolecule, but not for other concentration scale standard states. In particular, the use of the above relationship coupled with Eq. 23 provides an exact route from Eqs. 1 to 5.

We now turn to the thermodynamics of protein denaturation by cosolvents in closed systems. We will assume a simple equilibrium between a native (N) and denatured (D) state described by an equilibrium constant \((K=\rho_{D}/\rho_{N})\) where all biomolecule species are at pseudo infinite dilution, i.e., we can ignore the interactions between any biomolecule species. In principle, this is a quaternary solution of 1, 3, N, and D. However, the N and D components are not independent as their concentrations are related through \(\rho_{2}=\rho_{N}+\rho_{D}\). Ben-Naim has shown how one can treat such a system when the cosolvent concentration is small [62]. In applying the same procedure but using finite cosolvent concentrations (Smith, unpublished results), one obtains results equivalent to those obtained by treating both biomolecule states as independent species in a ternary solution of 1, 3, and N or D. This simpler approach was outlined by Smith using the pseudo chemical potential definitions [42]. Equating the total chemical potentials of N and D one finds,

$$ \ln\,K=-\beta (\mu_D^\ast-\mu_N^\ast)=-\beta \Updelta G^{\hbox{O}} $$
(31)

We note that one could use any of the concentration scales to express K in terms of the standard chemical potentials of the different forms (see Eq. 14). However, this is a unique situation for protein denaturation due to the unit stoichiometries of the process and the fact that all activity coefficients will be unity at infinite dilution. In general this will not be the case. Taking derivatives of both sides of Eq. 31 one obtains,

$$ \left({\frac{\partial\ln\,K}{\partial \ln\,\rho _3 }} \right)_{T,P}^{\rm o} =-\beta \left[{\left({\frac{\partial \mu _D^\ast }{\partial \ln\,\rho _3 }}\right)_{T,P}^{\rm o} -\left({\frac{\partial \mu _N^\ast}{\partial \ln\,\rho _3 }} \right)_{T,P}^{\rm o} } \right]=\Updelta\Upgamma _{23} a_{33} $$
(32)

where we have used the result from Eq. 26. Protein denaturation is therefore favored by an increase in binding to the denatured state over the native state. The overall picture is displayed in Fig. 3. Before leaving this section, we note that \(\Updelta \Upgamma_{23}\) can be determined using any of the expressions for \(\partial m_{3}/\partial m_{2}\) defined in the different ensembles (see Table 1) as they differ by KB integrals or thermodynamic properties which refer to the bulk solution composition. Hence, these differences disappear when determining changes in \(\Upgamma_{23}\), but only for processes with a 1:1 stoichiometry. Relationships to preferential interactions on other concentration scales are available and are also provided in Table 1 [43, 64].

Fig. 3

A schematic representation of protein denaturation due to the increased preferential binding of a cosolvent (larger shaded circles) over water (small open circles) to the denatured state compared to the native state. The dashed line represents the region over which the local solution distribution \((N_{3}/N_{1})\) differs from the bulk solution distribution \((\rho_{3}/\rho_{1})\). Although the above figure represents a closed system, the local region can be considered an open system in contact with a large excess of bulk solvent which maintains a constant chemical potential between the local and bulk regions. This description is only appropriate when the biomolecule is at infinite dilution

Smith and others have also used the KB expression for the pmv of an infinitely dilute solute (Eq. 28) and applied it to understand volume changes on protein denaturation, and to separate the cosolvent and solvent binding as described by the two KB integrals [15, 42, 83, 84]. One finds that,

$$ \Updelta \overline{V_2}^{\rm O}+\Updelta N_{23} \overline {V_3 } +\Updelta N_{21}\overline {V_1 } =0 $$
(33)

which is simply a statement that there is no change in the total volume of the system upon denaturation. The change in protein volume on denaturation is usually negative [85]. However, Smith has argued that the change in protein volume is small and can be neglected [42, 83], especially in comparison to the inherent errors in the experimental values of \(\Upgamma _{23}\). This allows one to write a thermodynamic relationship between the change in cosolvent and solvent distributions which can be used to relate the total preferential binding to just cosolvent binding (as described by the corresponding KB integral). In this case one finds,

$$ \Updelta N_{23} =\phi _1\Updelta \Upgamma _{23} -\rho _3 \Updelta \overline {V_2}^{\rm O}\approx\phi _1 \Updelta \Upgamma _{23} $$
(34)

which is a consequence of the fact that each cosolvent molecule which redistributes itself from the bulk solution to the vicinity of the protein must displace an equivalent volume of water. The ratio of pmvs for urea and water is ≈2.5 and hence each urea molecule entering the local region displaces approximately 2.5 water molecules.
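The volume constraint of Eq. 33 and the split of Eq. 34 can be illustrated with a short sketch. It assumes \(\phi_{i}=\rho_{i}\overline{V_i}\) with \(\phi_{1}+\phi_{3}=1\), uses a water pmv of 0.018 L/mol, and takes the urea pmv as 2.5 times that value; the function name and all numerical inputs are our own illustrative choices.

```python
def excess_number_changes(delta_gamma23, rho3, V1, V3, delta_V2=0.0):
    """Split Delta Gamma_23 into Delta N_23 and Delta N_21.

    Uses Eq. 34 for Delta N_23 and Eq. 33 (rearranged) for Delta N_21.
    Volumes in L/mol, concentrations in mol/L.
    """
    phi1 = 1.0 - rho3 * V3                       # assumes phi_1 + phi_3 = 1
    dN23 = phi1 * delta_gamma23 - rho3 * delta_V2
    dN21 = -(delta_V2 + dN23 * V3) / V1
    return dN23, dN21


V1 = 0.018        # water pmv, L/mol
V3 = 2.5 * V1     # urea pmv from the pmv ratio quoted in the text
rho3 = 5.0        # urea molarity, mol/L (illustrative)
rho1 = (1.0 - rho3 * V3) / V1

dN23, dN21 = excess_number_changes(10.0, rho3, V1, V3)
print(dN23, dN21)
# Consistency check against Eq. 21 applied to the differences.
print(dN23 - (rho3 / rho1) * dN21)
```

When the protein volume change is neglected, the ratio of the two results equals the negative pmv ratio, which is the displacement of about 2.5 water molecules per urea molecule.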

The majority of experimental protein denaturation studies involve determining a change in the equilibrium constant with denaturant concentration. This provides, through Eqs. 32 and 34, information on the difference in cosolvent affinity for the denatured and native states. When one also has access to equilibrium dialysis data for the same protein under the same conditions of T, P, and pH one can determine the individual cosolvent affinity to either state [42, 86]. For instance, the value of \(\Upgamma_{23}\) obtained from equilibrium dialysis is dependent on the cosolvent concentration. If the biomolecule exists as a mixture of different major forms (native and denatured for example), the dialysis experiment provides an average preferential interaction such that [86],

$$ \Upgamma _{23} =x_{D} \Upgamma_{D3} +x_{N} \Upgamma _{N3} $$
(35)

where \(x_{i}\) is the mole fraction of state i. Hence, the total preferential interaction is simply the mole fraction weighted sum of the individual preferential interactions. Using the fact that protein denaturation in closed systems provides information on the difference between the same two preferential binding parameters (Eq. 32) one finds that,

$$ \Upgamma _{D3} =\Upgamma_{23} +x_N \Updelta \Upgamma _{23} $$
(36)

which allows one to isolate both \(\Upgamma_{D3}\) and \(\Upgamma_{N3}\) at any cosolvent concentration if the composition dependence of \(x_{D}\) is known (with \(x_{N}=1-x_{D}\)). This will prove particularly useful for simulation studies as described below.
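Solving Eq. 35 together with \(\Updelta\Upgamma_{23}=\Upgamma_{D3}-\Upgamma_{N3}\) and \(x_{N}=1-x_{D}\) gives \(\Upgamma_{D3}=\Upgamma_{23}+x_{N}\Updelta\Upgamma_{23}\) and \(\Upgamma_{N3}=\Upgamma_{23}-x_{D}\Updelta\Upgamma_{23}\). A minimal sketch of this decomposition follows; the function name and the input numbers are hypothetical.

```python
def state_binding(gamma23, delta_gamma23, x_D):
    """Return (Gamma_N3, Gamma_D3) from the averaged Gamma_23 of Eq. 35.

    gamma23       : preferential binding from equilibrium dialysis
    delta_gamma23 : Gamma_D3 - Gamma_N3 from the denaturation equilibrium (Eq. 32)
    x_D           : mole fraction of the denatured state
    """
    x_N = 1.0 - x_D
    gamma_D = gamma23 + x_N * delta_gamma23
    gamma_N = gamma23 - x_D * delta_gamma23
    return gamma_N, gamma_D


# Illustrative values only.
gamma_N, gamma_D = state_binding(gamma23=6.0, delta_gamma23=10.0, x_D=0.4)
print(gamma_N, gamma_D)
# The weighted average of Eq. 35 recovers the dialysis value.
print(0.4 * gamma_D + 0.6 * gamma_N)
```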

In summary, the general consensus is, and always has been, that when a cosolvent preferentially binds to a particular state \((\Upgamma _{23} > 0)\) it will tend to shift the equilibrium in favor of that state. A cosolvent denatures a protein because it preferentially binds to the denatured over the native state. The difference in preferential binding reflects changes in both the cosolvent and solvent distributions, and includes changes in the solution composition over all distances away from the protein (see Fig. 3). Alternatively, a cosolvent that is excluded from the protein surface \((\Upgamma_{23} < 0)\) will tend to favor the native form, i.e., increased protein stabilization. Cosolvents can also change the solubility of a protein—with a positive preferential interaction leading to an increase in solubility. KB theory can be used to quantify these effects [87, 88].

Hydrostatic and Osmotic Pressure Studies

Many investigations have examined the effects of hydrostatic or osmotic pressure on biomolecular processes in both pure water and cosolvent solutions [20, 53, 85, 89]. The effects produced by changes in osmotic pressure are somewhat different from those of hydrostatic pressure. For closed systems one can write,

$$ \left({\frac{\partial\mu _2^\ast }{\partial P}} \right)_{T,N}^{\rm o} =\overline {V_2}^{\rm O}-RT\kappa _T =-N_{21} \overline {V_1 } -N_{23} \overline{V_3 } $$
(37)

which comes directly from Eqs. 14 and 28, and the relationship \((\partial \mu _i /\partial P)_{T,N} =\overline {V_i}\). Therefore, defining \(V_2^{\rm O} =\overline {V_2 }^{\rm O}-RT\kappa _T\), the effect of hydrostatic pressure on an equilibrium is given by,

$$ RT\left({\frac{\partial\ln\,K}{\partial P}} \right)_{T,N}^{\rm o} =-\Updelta V_2^{\rm O}=\Updelta N_{21} \overline {V_1 } +\Updelta N_{23} \overline {V_3 }$$
(38)

and forms the basis of pressure denaturation studies [90]. If there is no cosolvent present in the solution then the final terms in the two above equations disappear. An increase in pressure favors the form with the smallest volume. Typically, several hundreds of atmospheres are required to induce denaturation as the volume changes are usually small and negative (−50 cm\(^3\)/mol) [85, 91, 92], although this might not be the case at low pressures [93].
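The pressure scale implied by Eq. 38 can be estimated with a few lines of code. This is a rough sketch which assumes that the volume change is independent of pressure and uses the value of −50 cm\(^3\)/mol quoted above; the function names and the temperature of 298.15 K are our assumptions.

```python
R_CM3_ATM = 82.057  # gas constant in cm^3 atm / (mol K)


def dlnK_dP(delta_V2, T=298.15):
    """Eq. 38: d ln K / dP = -Delta V_2 / (R T), in 1/atm (Delta V_2 in cm^3/mol)."""
    return -delta_V2 / (R_CM3_ATM * T)


def pressure_for_shift(delta_lnK, delta_V2, T=298.15):
    """Pressure change (atm) needed for a chosen change in ln K at constant Delta V_2."""
    return delta_lnK / dlnK_dP(delta_V2, T)


# Pressure needed to raise ln K by one unit for Delta V_2 = -50 cm^3/mol.
print(pressure_for_shift(1.0, -50.0))
```

Under these assumptions a change of one unit in ln K requires close to 500 atm, in line with the several hundreds of atmospheres mentioned in the text.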

In closed systems, it is also possible to use the thermodynamic transformations outlined in Eq. 20 to obtain [2, 54],

$$ \left({\frac{\partial\ln\,K}{\partial \ln\,a_1 }} \right)_{T,P}^{\rm o} =-\frac{\rho _1}{\rho _3 }\Updelta \Upgamma _{23} =\Updelta N_{21} -\frac{\rho _1 }{\rho_3 }\Updelta N_{23} =\Updelta \Upgamma _{21} $$
(39)

Alternatively, one can start from the Gibbs–Duhem relations at constant pressure (Eqs. 7 and 8). Hence, an increase in water activity (concentration) shifts the equilibrium in favor of the state with the largest degree of preferential hydration. It should be noted that the above equation was generated for closed systems and, in principle, is not directly applicable to osmotic systems. It merely involves a change in focus to the primary solvent and can be generated by a simple index change from 3 to 1 and vice versa. One cannot, however, do this index change with the corresponding preferential binding parameters unless one redefines molality in terms of the cosolvent 3. The activity change denoted above is simply due to a change in water activity with concentration and does not include any changes in the activity of water with (osmotic) pressure.

Equations 7 and 8 describe the thermodynamic constraints for an osmotic system open to both solvent and cosolvent in equilibrium with a closed system of cosolvent and solvent. Here one switches the focus to that of changes in water concentration or activity with osmotic pressure. Starting from these equations one can show that,

$$ -\left({\frac{\partial\mu _2^\ast }{\partial \mu _1 }} \right)_{T,P}^{\rm o} =\Upgamma_{21} $$
(40)

Application of the above equation to an equilibrium process and use of the standard relationship between osmotic pressure and the activity of water \((RTd\ln\,a_1 =-\overline {V_1 } d\Uppi)\) provides,

$$ RT\left({\frac{\partial\ln\,K}{\partial \Uppi }} \right)_{T,P}^{\rm o} =-\overline {V_1 }\Updelta \Upgamma _{21} $$
(41)

which is formally the same result as that which would be obtained starting from Eq. 39. In this case the increase in osmotic pressure leads to an increase in the standard chemical potential of pure water, which requires a decrease in water activity to maintain a constant total chemical potential. Hence, an increase in osmotic pressure shifts the equilibrium in favor of the least hydrated form. If \(\Updelta \Upgamma _{21} \approx \Updelta N_{21}\) then one can view the above difference as purely a change in the associated volume of water, or hydration, during the process. This will be true if there is little or no cosolvent present or the value of \(\Updelta N_{23}\) is small, i.e., \(N_{23}\) is constant. However, in general, this will not be the case. Osmotic pressure changes are typically <1 atm and occur with a value of \(\Updelta \Upgamma _{21}\) that is larger and of the opposite sign to \(\Updelta \Upgamma _{23}\) [94, 95]. Therefore, an increase in osmotic pressure favors the denatured form if the cosolvent exhibits preferential binding, and vice versa. Compared to hydrostatic pressure effects, much lower osmotic pressures produce similar changes in the equilibrium as the value of \(\Updelta \Upgamma_{21}\) can be in the hundreds (or even thousands for polyols) [84, 96]. Equations 38 and 41 are clearly different when a cosolvent is present. The first probes changes in the protein volume. The second probes changes in the degree of preferential hydration. Consequently, osmotic pressure has a rather different effect on the equilibrium compared to hydrostatic pressure in the presence of cosolvents.
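The different sensitivities described by Eqs. 38 and 41 can be compared directly. The sketch below is illustrative only: the function names are ours, the water pmv is taken as 18 cm\(^3\)/mol, and the values \(\Updelta\Upgamma_{21}=-100\) and \(\Updelta V_{2}^{\rm O}=-50\) cm\(^3\)/mol are representative magnitudes taken from the ranges quoted in the text rather than data for a specific protein.

```python
R_CM3_ATM = 82.057  # gas constant in cm^3 atm / (mol K)


def dlnK_dPi(delta_gamma21, V1=18.0, T=298.15):
    """Eq. 41: d ln K / d Pi = -V_1 * Delta Gamma_21 / (R T), in 1/atm."""
    return -V1 * delta_gamma21 / (R_CM3_ATM * T)


def dlnK_dP(delta_V2, T=298.15):
    """Eq. 38: d ln K / dP = -Delta V_2 / (R T), in 1/atm."""
    return -delta_V2 / (R_CM3_ATM * T)


osmotic = dlnK_dPi(-100.0)      # preferential hydration decreases on unfolding
hydrostatic = dlnK_dP(-50.0)    # protein volume decreases on unfolding
print(osmotic, hydrostatic, osmotic / hydrostatic)
```

The ratio of the two slopes is \(\overline{V_1}\Updelta\Upgamma_{21}/\Updelta V_{2}^{\rm O}\), so for these inputs one atmosphere of osmotic pressure shifts ln K as much as several tens of atmospheres of hydrostatic pressure.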

Applications of KB Theory to Systems of Biological Interest

The ability to apply KB theory to any type of system, coupled with the fact that it involves no approximations, provides a solid foundation for the theory of biomolecular solutions and the analysis of experimental data. Early theoretical progress included the equations presented by Ben-Naim for multicomponent solutions which are also applicable to biological systems at very low cosolvent concentrations [62]. In addition, Pjura et al. used KB theory to interpret experimental data concerning partial specific volumes of proteins [97], and Hirata and coworkers have applied KB theory to understand changes in protein volumes on denaturation [98, 99]. Apart from these early studies, the application of KB theory to biological systems was rather limited. However, there has been a recent increase in the number of biologically relevant studies relying on results from KB theory, presumably in the hope that the use of KB theory will lead to an improved understanding of cosolvent effects in biological systems. Recent studies are encouraging.

One approach to understanding cosolvent effects is to study small molecule systems. Hydrocarbons represent simple models for the side chains of many amino acids. Urea is known to increase the solubility of hydrocarbons larger than ethane [100], although the exact reason for this behavior is unknown. Our early studies on hydrophobic hydration investigated the effects of several cosolvents on the solubility using KB theory, albeit with several approximations [40]. This was subsequently developed into a consistent picture of changes in solubility [41], including the ability to eliminate one of the KB integrals using the solute pmv [71]. Cosolvent effects on small hydrocarbons have also been investigated by van der Vegt and coworkers [101, 102]. Shimizu and also Shulgin and Ruckenstein have applied similar approaches to understand salting in and out effects on protein solubility [88, 103], which follow the framework developed for smaller solutes [87, 104]. It is clear that \(a_{33}\) is a property of solution mixtures that modulates the thermodynamic effects of cosolvents, although this is only one component of the overall thermodynamic effect. Hence, a KB analysis of common (urea, gdmcl, NaCl, and TFE) cosolvent solutions was also performed by Chitra and Smith and discussed in the context of changes in water structure [67]. Subsequent analysis has been performed for other osmolytes [70, 93].

Simultaneously, Smith and Shimizu applied KB theory to understand cosolvent mediated protein denaturation and protein stability by osmolytes [42, 54, 83, 84, 105]. Shimizu and coworkers have used KB theory to determine hydration changes for allosteric transitions and ligand binding, and to clarify the assumptions made in osmotic stress analysis [54, 84, 96, 103, 105]. Smith outlined a rigorous link between the results of computer simulations and the corresponding experimental thermodynamic data [42, 83]. Subsequently, Shulgin and Ruckenstein have applied KB theory to quantify the excess or deficiency of water around several proteins in the presence of both osmolytes and protein denaturants [74, 76, 88, 106]. As expected, an increase in hydration was observed for the osmolytes, while a decrease in hydration was found for the denaturants. Rosgen et al. have also formulated the effects of osmolytes in terms of KB integrals [70, 93]. Schurr et al. have expressed preferential interactions in terms of KB integrals and used these expressions to develop some simple models for the interaction of cosolvents with proteins [75]. The results suggest a significant excluded volume effect.

Our own work has focused on using KB theory to understand preferential interactions [40–42, 71], on developing a model of cosolvent effects based on KB theory [107], and on expressing the density changes observed in equilibrium dialysis experiments in terms of KB integrals [64, 65]. More recently, Schellman and others have compared the results from KB theory to the corresponding expressions obtained from thermodynamic binding models [77, 106]. A variety of studies have attempted to clarify the exact KB expressions for the different preferential binding parameters corresponding to the different concentration scales and ensembles, and to derive relationships between them [42, 64, 65, 75, 76]. Hence, it is clear there is considerable recent interest in analyzing cosolvent effects in terms of KB integrals.

Models for Protein Denaturation

In this section, we will review a variety of models that have been proposed for understanding protein denaturation. In doing so we will compare many of the predictions and parameters with the KB expressions presented in the previous sections which, being exact, provide a solid foundation for comparison. Protein denaturation by cosolvents is commonly used to determine the stability of the native state in the absence of cosolvent [108]. It provides an alternative to heat, pressure, or pH denaturation. Typically, a protein will denature over a relatively limited range of cosolvent concentration providing accurate values of \(\Updelta G^{\rm O}\) only in that region (see Fig. 4) [109]. A potentially dangerous extrapolation back to zero cosolvent concentration is then required to establish the relative stabilities of the native and denatured states. Fortunately, urea denaturation curves are typically linear in urea concentration [110]. GdmCl curves display more nonlinearity [111]. Hence, an increased understanding of protein denaturation would hopefully improve our ability to extrapolate the experimental data. All the models considered here assume an infinitely dilute protein. Other theoretical treatments of cosolvent effects are available in the literature [112–118].

Fig. 4

A schematic cosolvent promoted protein denaturation curve corresponding to the standard free energy for unfolding \((\Updelta G^{\rm O})\) as a function of cosolvent molarity \((c_{3})\). The solid line represents real experimental data centered on the midpoint cosolvent concentration \((c_{3}^\ast)\). The dashed lines represent the extrapolation back to zero cosolvent assuming linear or (somewhat exaggerated) nonlinear behavior

The m-Value Approach

The assumption of linear behavior for the denaturation free energy curve was originally proposed by Greene and Pace based on a purely empirical observation [110]. The popularity and simplicity of this approach provide an excellent reference point for other models. In this case one can write the change in the standard Gibbs free energy for denaturation \(\Updelta \Updelta G = \Updelta G^{\rm O}(\rho_{3}) - \Updelta G^{\rm O}(0)\) as,

$$ \beta \Updelta \Updelta G=-m\rho _3 $$
(42)

where m is a constant for a particular protein and cosolvent at a given T, P, and pH. Hence, we will compare the ability of other models to provide linear behavior as characterized by an apparent m-value \((m^{\rm app})\). However, it should be noted that linear behavior will be a recurring problem for all of the models discussed here as they involve more than one (unknown) parameter. There is an observed correlation between m-values for a series of proteins and the estimated changes in the accessible surface area upon denaturation [119, 120]. This will also be a useful reference in our following discussion. Combining with our KB derived results we find,

$$ m^{\rm app}=\left({\frac{\partial \ln\,K}{\partial \rho _3 }} \right)_{T,P}^{\rm o}=\frac{\Updelta \Upgamma _{23} a_{33} }{\rho _3 } $$
(43)

and \(m^{\rm app} = m\). Hence, linear behavior will be observed when the difference in preferential binding is proportional to \(\rho_{3}/a_{33}\). Alternatively, one can write that the following must be true,

$$ \Updelta \Upgamma _{23}=m\rho _3 [1+\rho _3 (G_{33} -G_{13})] $$
(44)

for exact linear behavior to be observed for all cosolvent concentrations. The difference in KB integrals \((G_{33}- G_{13})\) for urea is small and negative and therefore the change in preferential binding predicted by Eq. 44 resembles that of a Langmuir isotherm. Alternatively, if one assumes that the value of \(1/a_{33}\) is relatively constant and equal to unity (an ideal solution) then we expect \(\Updelta \Upgamma _{23}\) to be proportional to the bulk urea concentration.

At this stage it is informative to consider a model example. For a protein that has an m-value of 2 M\(^{-1}\) and a denaturation transition midpoint at \(\rho_{3}^\ast\) = 5 M urea, the standard free energy for unfolding in pure water (in units of RT) is given by \(m \rho_{3}^\ast\) = 10. Taking \(a_{33}\) as unity, Eq. 43 then gives a difference in preferential binding of urea at the midpoint of \(m \rho_{3}^\ast\) = 10. The change in excess urea binding \((\Updelta N_{23})\) follows from Eq. 34: with a water volume fraction of \(\phi_{1}\approx 0.78\) in 5 M urea (a urea pmv about 2.5 times that of water), one obtains \(\Updelta N_{23}\approx 8\) in this particular example. Clearly, these values are small compared to the number of urea molecules that would be in contact with the protein, and the number of peptide or side chain groups on a typical protein. This emphasizes the weak binding exhibited by most denaturants and highlights why such high concentrations of urea are required for denaturation.
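The arithmetic of this model example can be reproduced with a short sketch based on Eqs. 34, 42, and 43. The function name is ours, \(a_{33}\) is set to unity, the protein volume change is neglected, and the urea pmv of 0.045 L/mol is an assumption obtained from the pmv ratio of about 2.5 relative to water.

```python
def m_value_example(m, rho3_mid, a33=1.0, V3=0.045):
    """Quantities at the denaturation midpoint for a linear (m-value) model.

    m        : m-value in 1/M (free energies in units of RT)
    rho3_mid : midpoint cosolvent molarity, mol/L
    a33      : activity derivative of the cosolvent (assumed 1 here)
    V3       : cosolvent pmv in L/mol (assumed value for urea)
    """
    beta_dG_water = m * rho3_mid        # Eq. 42 with Delta G = 0 at the midpoint
    d_gamma23 = m * rho3_mid / a33      # Eq. 43 rearranged
    phi1 = 1.0 - rho3_mid * V3          # water volume fraction, phi_1 + phi_3 = 1
    d_N23 = phi1 * d_gamma23            # Eq. 34 with Delta V_2 neglected
    return beta_dG_water, d_gamma23, d_N23


print(m_value_example(m=2.0, rho3_mid=5.0))
```

Changing `a33` or `V3` shows how sensitive the estimated excess binding is to the solution properties, which are inputs here rather than predictions.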

Group Transfer Model

One of the earliest expressions used to model the denaturation process involved the group transfer concept proposed by Tanford [25]. Here, a series of chemically meaningful groups in the protein are considered and the change in the standard free energy of denaturation related to the exposure (or burial) of each group on denaturation such that,

$$ \beta \Updelta \Updelta G=\beta \sum_i {n_i \alpha _i } \Updelta g_{{\rm tr},i} (\rho _3) $$
(45)

where \(n_{i}\) is the number of groups of type i, \(\alpha_{i}\) is the average fractional increase in the accessible surface area on unfolding \((\hbox{ASA}_{i}^{D} -\hbox{ASA}_{i}^{N})/\hbox{ASA}_{i}^{\rm O}\), and \(\Updelta g_{{\rm tr},i}\) is the free energy for transfer of group i from pure water to a cosolvent concentration of \(\rho_{3}\). This can be related to small molecule transfer data via,

$$ -\beta \Updelta g_{{\rm tr},i} =a_{33} \Upgamma _{i3} $$
(46)

where \(\Upgamma _{i3}\) is given by Eq. 26 and we have assumed linear transfer free energies on the molar concentration scale. Experimental free energy transfer data are available [100, 121–129]. The model is simple and very intuitive. It is somewhat impractical as it requires a detailed knowledge of the denatured state in order to determine the \(\alpha_{i}\) values. However, Bolen and coworkers have illustrated that one can use the above model, together with the required transfer free energies (and a careful choice of the concentration scale), to reproduce experimental m-values [129, 130]. This type of predictive power promises to be extremely valuable. In addition, these studies provide the first evidence that the general assumption of additivity is reasonable for proteins, although it may not be universally true [131, 132]. Without this simplicity the role of a denaturant or osmolyte would become specific to each protein. The general conclusion is that urea denaturation is driven by the favorable interaction of urea with the peptide group, with some possible contribution from aromatic residues [130]. This is also in agreement with earlier conclusions [91, 133].
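The additive structure of Eq. 45 is easy to express in code. The sketch below only illustrates the bookkeeping: the function name is ours and the group counts, fractional exposures, and transfer free energies are hypothetical placeholders, not values from the transfer data cited above.

```python
def group_transfer_ddG(groups):
    """Eq. 45: sum of n_i * alpha_i * Delta g_tr,i over all group types.

    groups : iterable of (n_i, alpha_i, dg_tr_i) tuples, with dg_tr_i already
             evaluated at the cosolvent concentration of interest (units of RT).
    """
    return sum(n * alpha * dg for n, alpha, dg in groups)


# Hypothetical protein: (number of groups, fractional exposure, transfer free energy).
groups = [
    (100, 0.5, -0.04),   # placeholder for peptide backbone groups
    (20, 0.3, -0.02),    # placeholder for one class of side chains
]
print(group_transfer_ddG(groups))
```

A negative total corresponds to a denaturing cosolvent, since the denatured state is stabilized relative to the native state.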

Binding Site Model

The most common model used for the analysis of experimental thermodynamic data, with the exception of the empirical m-value model, is the binding site model [24, 134]. This is a simplification of the more general binding polynomial approach [17]. Here one assumes a series of equivalent independent binding sites on the surfaces of the native and denatured states which display the same (or an average) equilibrium constant \((K_{\rm b})\). The relative concentration of species such as 2, 2:3, 2:3\(_2\), etc., can then be determined. In this case, denaturation is favored by the presence of a larger number of binding sites for the denatured form which has a larger surface area. The binding model predicts that,

$$ \beta \Updelta \Updelta G=-\Updelta n\ln\,[1+K_{\rm b} \rho_3] $$
(47)

where \(\Updelta n\) is the difference in the number of binding sites. The number of sites and the equilibrium constant can also be estimated from calorimetry data [135]. The corresponding expression for the apparent linear behavior and the preferential binding parameter are provided in Table 2. The preferential binding parameter equation clearly resembles a Langmuir binding isotherm. Values of \(K_{\rm b}\) vary but are of the order of 0.04 and 0.6 M\(^{-1}\) for urea and gdmcl, respectively, while values of \(\Updelta n\) are typically 10–100 [135]. Linear behavior is predicted when the value of \(K_{\rm b}\rho_{3}\) is small.
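The departure from linearity in the binding site model can be examined by differentiating Eq. 47, which with \(\ln K\) changing by \(-\beta\Updelta\Updelta G\) gives an apparent m-value of \(\Updelta n K_{\rm b}/(1+K_{\rm b}\rho_{3})\). The sketch below is a minimal illustration: the function names are ours, \(K_{\rm b}\) = 0.04 M\(^{-1}\) is the order of magnitude quoted for urea, and \(\Updelta n\) = 50 is an arbitrary choice within the quoted range.

```python
import math


def binding_ddG(rho3, delta_n, K_b):
    """Eq. 47: beta * Delta Delta G = -Delta n * ln(1 + K_b * rho_3)."""
    return -delta_n * math.log(1.0 + K_b * rho3)


def binding_m_app(rho3, delta_n, K_b):
    """Apparent m-value, d ln K / d rho_3, obtained by differentiating Eq. 47."""
    return delta_n * K_b / (1.0 + K_b * rho3)


delta_n, K_b = 50, 0.04   # illustrative parameters for a urea-like cosolvent
for rho3 in (0.0, 2.5, 5.0):
    print(rho3, binding_ddG(rho3, delta_n, K_b), binding_m_app(rho3, delta_n, K_b))
```

The apparent m-value decreases as the cosolvent concentration increases, so the free energy curve is close to linear only while \(K_{\rm b}\rho_{3}\) remains small.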

Table 2 A summary of formulas for the apparent linearity in the denaturation free energy curves \((m^{\rm app})\) and the corresponding preferential binding parameter differences \((\Updelta \Upgamma _{23})\) for different models of protein denaturation

The simple binding model provides a convenient description of cosolvent effects on protein denaturation. However, it has several drawbacks, especially in comparison to the KB approach. First, the idea of simple binding sites is intuitively appealing but clearly incorrect for weak binding cosolvents. There is also no accounting for the exchange of solvent molecules. These are well-known problems. Furthermore, binding to sites on the surface does not directly account for possible changes in the cosolvent distribution in successive solvation shells away from the surface. The implied presence of distinct species such as 2:3 or 2:3\(_2\) is also incorrect. These problems are usually circumvented by interpreting the binding constant and number of sites in a rather loose manner [136, 137]. In this way the approach becomes more of a representation rather than a physical model of cosolvent binding. Unfortunately, it is essentially impossible to relate computer simulation data to this type of representation. Even so, the model is relatively simple and has been very useful in developing and understanding the basic thermodynamic principles of the denaturation process.

Exchange Models

The classic binding model was extended by Schellman in 1990 to include the exchange of water by cosolvent during the binding process [48, 49]. The resulting equation is,

$$ \beta \Updelta \Updelta G=-\Updelta n\ln\,[a_1 +K_{\rm e} a_3 ] $$
(48)

where \(K_{\rm e}\) is the corresponding intrinsic exchange binding constant. Schellman then defined an effective binding constant, \(K_{\rm e}^\prime = K_{\rm e}y_{3}/y_{1}\), although this is clearly not independent of concentration. The preferential binding predicted by the model is given in Table 2. Obviously, the same assumptions regarding the number and type of exchange sites are inherent in both the exchange and binding models. The model is also limited to the case of 1:1 exchange, whereas it is known that most denaturants are significantly larger than a single water molecule. The use of activities is also inconvenient and actually leads to less linearity for the denaturation profile. Hence, Schellman has suggested that the activities be replaced by concentrations in most practical applications [138]. The major advantage of the model is that the exchange process allows for negative preferential binding depending on the sign of \(K_{\rm e}^\prime - 1\), something that is not allowed in the simple binding model, and which is required to explain the effects of osmolytes and preferential hydration.
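The sign change permitted by the exchange model can be seen in a short numerical sketch of Eq. 48. The example assumes an ideal mixture on the mole fraction scale, so that \(a_{1}=1-x_{3}\), \(a_{3}=x_{3}\), and \(K_{\rm e}^\prime = K_{\rm e}\). This is a simplifying assumption made here, not part of the original model, and the function name and parameter values are hypothetical.

```python
import math

def exchange_free_energy(x3, delta_n, k_e):
    """beta*DeltaDeltaG = -delta_n * ln(a1 + K_e*a3), Eq. 48.

    ASSUMPTION: ideal mixture, so the activities equal the mole
    fractions (a1 = 1 - x3, a3 = x3). Real solutions require
    measured activities or activity coefficients.
    """
    a1 = 1.0 - x3
    a3 = x3
    return -delta_n * math.log(a1 + k_e * a3)

# Illustrative values only: 50 extra exchange sites, cosolvent mole fraction 0.1.
for k_e in (0.5, 1.0, 2.0):
    print(k_e, exchange_free_energy(0.1, 50, k_e))
```

Under this idealization the free energy change is negative for \(K_{\rm e}>1\), zero for \(K_{\rm e}=1\), and positive for \(K_{\rm e}<1\), which is the behavior needed to describe both denaturants and osmolytes.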

A simpler exchange model has been proposed by Jasanoff and Fersht to account for the helix inducing effects of TFE on peptide structure [139]. The corresponding equation is,

$$ \beta \Updelta \Updelta G=-m_{\rm e} \frac{\rho_3 }{\rho_1}$$
(49)

where \(m_{\rm e}\) is a constant. The above equation allows for increased nonlinearity in the free energy curve, such as that observed for helix induction by TFE. The resulting preferential binding expression is also provided in Table 2.
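The origin of the nonlinearity in Eq. 49 is that the water density falls as the cosolvent density rises. A minimal sketch follows, assuming constant molar volumes so that \(\rho_{1}V_{1}+\rho_{3}V_{3}=1\). The volumes used (0.018 L/mol for water and roughly 0.072 L/mol for TFE) are rough illustrative numbers chosen here, and the function names and the value of \(m_{\rm e}\) are hypothetical.

```python
def water_density(rho3, v1=0.018, v3=0.072):
    """Water molarity from rho1*V1 + rho3*V3 = 1.

    ASSUMPTION: constant molar volumes v1 and v3 (L/mol), i.e. no
    excess volume of mixing. The defaults are rough illustrative values.
    """
    return (1.0 - rho3 * v3) / v1

def jf_free_energy(rho3, m_e, v1=0.018, v3=0.072):
    """beta*DeltaDeltaG = -m_e * rho3 / rho1, Eq. 49."""
    return -m_e * rho3 / water_density(rho3, v1, v3)

# Doubling the cosolvent molarity more than doubles the effect.
m_e = 100.0  # hypothetical constant
for rho3 in (1.0, 2.0, 4.0):
    print(rho3, jf_free_energy(rho3, m_e))
```

Because \(\rho_{3}/\rho_{1}\) grows faster than \(\rho_{3}\) itself, the free energy curve bends away from a straight line as the cosolvent concentration increases.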

Local-Bulk Domain Model

A significant conceptual step forward was provided by Record and coworkers with the development of the local-bulk domain model [78, 140, 141]. This was the first model specifically designed to focus directly on changes in the solution composition in the local domain surrounding a biomolecule. In contrast to a binding constant, the local-bulk domain model characterizes the increase or decrease in cosolvent concentration in terms of a partition coefficient \(K_{\rm P}\). The corresponding decrease or increase in local water density is also included. The partition coefficient is defined by,

$$ K_{\rm P} =\frac{B_3 /B_1 }{\rho _3 /\rho _1 } $$
(50)

and is assumed to be independent of cosolvent concentration. Here, \(B_{3}\) and \(B_{1}\) are the numbers of cosolvent and water molecules observed in the local domain surrounding the protein. Hence, the local concentration ratio remains the same as in the bulk solution, but both species are increased in the case of denaturants. Using the above condition, coupled with an exchange coefficient ratio \((S_{3})\), and the surface hydration in the absence of cosolvent \((\hbox{ASA }b_{1}^{0})\) one finds,

$$ \beta \Updelta \Updelta G=\frac{(K_{\rm P} -1) b_1^0 \Updelta {\rm ASA}}{m_1 }I(\rho_3) $$
(51)

where \(I(\rho_{3})\) is an integral which depends on the properties of the pure solution. The corresponding slope and preferential binding parameters are provided in Table 2. Denaturation is therefore favored due to the larger surface area (local volume) surrounding the denatured state. The model is consistent with the concept of preferential binding in open systems, while still focusing on changes in the first solvation shell (surface) of the biomolecule. An analytical expression for the above integral has been provided for urea and gdmcl [140]. Typical values of \(K_{\rm P}\) are 1.12 and 1.16 for urea and gdmcl, while the surface hydration per unit surface area of protein \((b_{1}^{0})\) is usually taken to be 0.11 waters/Å\(^{2}\).
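The definition in Eq. 50 translates directly into a calculation on local and bulk compositions. The sketch below evaluates \(K_{\rm P}\) from hypothetical local counts, and also the number of local cosolvent molecules in excess of the bulk ratio, \(B_{3}-B_{1}\rho_{3}/\rho_{1}=(K_{\rm P}-1)B_{1}\rho_{3}/\rho_{1}\), which follows algebraically from Eq. 50. The function names and input numbers are illustrative assumptions, not values from the cited work.

```python
def partition_coefficient(b3, b1, rho3, rho1):
    """K_P = (B3/B1) / (rho3/rho1), Eq. 50."""
    return (b3 / b1) / (rho3 / rho1)

def local_excess(k_p, b1, rho3, rho1):
    """Local cosolvent count in excess of the bulk ratio.

    Rearranging Eq. 50 gives B3 = K_P * B1 * rho3/rho1, so that
    B3 - B1*rho3/rho1 = (K_P - 1) * B1 * rho3/rho1.
    """
    return (k_p - 1.0) * b1 * rho3 / rho1

# Hypothetical local domain: 100 waters and 11.2 cosolvents (time average),
# with a bulk cosolvent-to-water ratio of 1:10.
k_p = partition_coefficient(11.2, 100.0, 1.0, 10.0)
print(k_p, local_excess(k_p, 100.0, 1.0, 10.0))
```

A value of \(K_{\rm P}\) above unity corresponds to local accumulation of the cosolvent, and a value below unity to exclusion.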

LCPE Model

Recently, we have provided a relatively simple model based on a local chemical potential equalization (LCPE) principle which uses many of the concepts and equations of KB theory [107]. In particular, the model uses KB integrals to quantify the changes in cosolvent and water concentrations in the vicinity of the biomolecule. The model accounts for changes over all solvation shells and includes the concept of exchange as provided by Eq. 34. It differs from many of the other models by using the equations for the grand canonical ensemble to characterize the region of solution close to the biomolecule, which is then surrounded by bulk solvent. One can think of the biomolecule as being enclosed by a virtual dialysis membrane as illustrated in Fig. 3. If the cosolvent forms favorable interactions with the protein, then the chemical potential of the cosolvent in the vicinity of the protein will be reduced (to \(\mu_{3}^{\rm l}\)) and therefore the local cosolvent concentration will be increased in order to increase the cosolvent chemical potential and reestablish equilibrium. The change in local cosolvent density after introduction of the protein is then related, via a Taylor series expansion to second order, to changes in the chemical potential of the cosolvent due to interactions with the protein,

$$ \rho _3^{\rm l} =\rho _3 +\Updelta \mu _3 \left({\frac{\partial \rho _3 }{\partial \mu _3 }} \right)_{T,\mu _1 } + \frac{\Updelta \mu _3^2 }{2}\left({\frac{\partial ^{2}\rho _3 }{\partial \mu _3^2 }} \right)_{T,\mu _1 } + {\rm o}(\Updelta \mu _3^3) $$
(52)

where the derivatives can be found from Eq. 13. The change in cosolvent density is formally \(\Updelta N_{23}/V^{\rm l}\), which is related to the preferential binding parameter through Eq. 34. The model includes two parameters. The first is the volume of the local region \((V^{\rm l})\) around the protein where the cosolvent density differs from the bulk solution. It is dependent on the protein. The second parameter is the initial change in chemical potential of the cosolvent \((\Updelta \mu_{3}=\mu_{3}- \mu_{3}^{\rm l}),\) and is considered the same for all proteins and conformations. Manipulation of the above relationship provides,

$$ \beta \Updelta \Updelta G=-\rho _3 A [1+BN_{33} ] $$
(53)

where \(A = \Updelta V^{\rm l} \beta\Updelta \mu_{3} (1 + 1/2 \beta\Updelta \mu_{3})\) and \(B = 1/2 \beta\Updelta \mu_{3} / (1 + 1/2 \beta\Updelta \mu_{3})\) are constants. Analytical expressions giving \(N_{33}\) for urea, gdmcl, NaCl, and TFE have been provided [107].

The model is very similar in concept to the local-bulk domain model in that it focuses on local concentrations instead of binding sites. It differs from this model as it accounts for possible changes in the cosolvent distribution beyond the protein surface. The final equations differ due to the assumption of a partition coefficient that is independent of cosolvent concentration in the local-bulk domain model. Denaturation is favored due to the increase in local volume on denaturation. This local volume can be written as \(V^{\rm l} = \hbox{ASA} \times R^{\rm l}\) if the shell thickness \(R^{\rm l}\) is small compared to the protein size. Exact linear behavior is obtained upon truncation of the Taylor series expansion after the first derivative, or if \(N_{33}\) is independent of concentration.
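Equation 53 and its constants are simple to evaluate once \(N_{33}\) is known. The following sketch computes \(A\) and \(B\) from the two model parameters and then the free energy change. The parameter values are hypothetical and chosen only to exercise the formula, with \(\Updelta V^{\rm l}\) in L/mol and \(\rho_{3}\) in mol/L so that their product is dimensionless. The analytical expressions for \(N_{33}\) from [107] are not reproduced here, so \(N_{33}\) is passed in as a plain number.

```python
def lcpe_constants(delta_v_local, beta_dmu3):
    """Constants A and B of Eq. 53.

    delta_v_local: change in local volume on denaturation (L/mol)
    beta_dmu3: beta * Delta mu_3 (dimensionless)
    """
    a = delta_v_local * beta_dmu3 * (1.0 + 0.5 * beta_dmu3)
    b = 0.5 * beta_dmu3 / (1.0 + 0.5 * beta_dmu3)
    return a, b

def lcpe_free_energy(rho3, n33, delta_v_local, beta_dmu3):
    """beta*DeltaDeltaG = -rho3 * A * (1 + B*N33), Eq. 53."""
    a, b = lcpe_constants(delta_v_local, beta_dmu3)
    return -rho3 * a * (1.0 + b * n33)

# Hypothetical parameters; N33 would normally vary with rho3.
for rho3, n33 in ((1.0, -0.1), (4.0, -0.3), (8.0, -0.5)):
    print(rho3, lcpe_free_energy(rho3, n33, 2.0, 0.2))
```

Expanding the product shows that Eq. 53 is equivalent to \(-\rho_{3}\Updelta V^{\rm l}\beta\Updelta\mu_{3}[1+\tfrac{1}{2}\beta\Updelta\mu_{3}(1+N_{33})]\), so any curvature enters only through the concentration dependence of \(N_{33}\).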

Summary

The description of cosolvent effects that has developed according to the local-bulk domain model and the LCPE model is one of a general increase in cosolvent density in the vicinity of the protein surface. In this respect, one is returning to the picture provided by the original measures of preferential binding. It could be argued that, while one has changed the specifics of the physical picture of the cosolvent (and water) distribution, there are still typically two unknown parameters to fit a denaturation profile which is usually linear in cosolvent molarity. In the local-bulk domain model these are \(K_{\rm P}\) and \(\Updelta \hbox{ASA}\). In the LCPE model they are \(\Updelta \mu_{3}\) and \(\Updelta V^{\rm l}\). Hence, one has not gained significantly on the binding model where \(\Updelta n\) and \(K_{\rm b}\) are the unknowns. However, we will see that the recent models provide a more appropriate framework for the analysis of computer simulation data. Finally, we note that only the transfer model takes into account the specific amino acid composition of the protein concerned through the summation over groups. This also allows for specific differences between the native and denatured states. Other models generally involve a generic increase in the number of binding sites/surface area on forming the denatured state.

Computer Simulation of Cosolvent Effects

Computer simulations can provide valuable information concerning the interaction of cosolvents with a variety of solutes. Most previous simulations have focused on determining possible cosolvent binding sites, using radial distribution functions and coordination numbers between the cosolvent and different groups on the protein surface, or the number of hydrogen bonds a cosolvent makes with the protein or representative molecules, in an effort to probe the initial stages of protein denaturation [31, 128, 142–156]. Unfortunately, while these studies have provided useful insights into possible mechanisms of denaturation, they have not provided any data which can be directly related to the experimental thermodynamic data. Hence, we will focus on some recent results related to the direct calculation of preferential interactions.

Before doing so, however, we will discuss a few technical issues which arise during the analysis of computer simulations of preferential interactions. The vast majority of simulations (and experiments) are performed in closed systems. Therefore, the corresponding KB integrals have to be approximated by assuming that beyond some distance \(R^{\rm l}\) all the required rdfs are unity and therefore [39, 44],

$$ G_{ij} \approx 4\pi \int_{0}^{R^{\rm l}} {[g_{ij}^{NPT} (r)-1] r^{2}\,dr} $$
(54)

Numerical simulations of the rdfs in open and closed systems support this approximation [157]. The convergence properties of the above integral can be easily checked by examining the behavior of the KB integral as a function of integration distance. In practice, the integration does not have to be performed as one can simply count cosolvent or water molecules directly to give,

$$ N_{ij} (R)=n_{ij} (R)-\frac{4}{3}\pi R^{3}\rho _j $$
(55)

and hence \(G_{ij} = N_{ij}/\rho_{j}\). The \(n_{ij}(R)\) values represent the number of j molecules found within a distance R from a central i molecule. We note that for differences between KB integrals, such as that required in Eq. 21, the second term on the rhs of Eq. 55 will disappear.
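The convergence check described above can be sketched as follows. The first function accumulates the truncated KB integral of Eq. 54 as a function of the upper limit, and the second applies the counting form of Eq. 55. The rdf used in the demonstration is an analytical toy function chosen here only because its integral is known exactly. It is not simulation data, and the function names are hypothetical.

```python
import numpy as np

def running_kb_integral(r, g):
    """Running integral G(R) = 4*pi * int_0^R [g(r) - 1] r^2 dr (Eq. 54).

    r must start at 0 and be increasing. Returns an array with one
    value of G per grid point, so the plateau can be inspected.
    """
    r = np.asarray(r, dtype=float)
    g = np.asarray(g, dtype=float)
    integrand = 4.0 * np.pi * (g - 1.0) * r**2
    # Trapezoid rule on each interval, then accumulate.
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)
    return np.concatenate(([0.0], np.cumsum(steps)))

def excess_from_counts(n_ij, radius, rho_j):
    """N_ij(R) = n_ij(R) - (4/3)*pi*R^3*rho_j (Eq. 55)."""
    return n_ij - 4.0 / 3.0 * np.pi * radius**3 * rho_j

# Toy rdf (an assumption for illustration): g(r) = 1 - exp(-r).
r = np.linspace(0.0, 20.0, 20001)
g = 1.0 - np.exp(-r)
G = running_kb_integral(r, g)
print(G[5000], G[10000], G[-1])  # values at R = 5, 10 and 20
```

A KB integral can be considered converged when the running value no longer changes with the integration distance. For rdfs from closed systems the long range tail often drifts slightly, which is why inspecting the whole curve is more informative than quoting a single cutoff.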

The KB integrals traditionally use the center of mass (or geometry) as a reference. In our experience, it is more satisfactory to use the molecular surface as a reference for applications involving peptides and proteins [83, 158]. The two approaches should be identical. However, for the relatively small systems studied currently there can be significant differences. The convergence properties appear to be better when using the protein surface as a reference. However, it should be noted that this approach can only be used when determining differences in KB integrals to the protein \((G_{23} - G_{21})\). If one requires just \(G_{23}\), then the center of mass reference must be used. A further correction is sometimes required to account for small changes in the bulk solution distribution. As one counts cosolvent and water molecules away from the surface of the protein, one is effectively saying they become part of the local distribution and not the bulk solvent distribution. In computer simulations with a fixed number of cosolvent \((n_{3})\) and water \((n_{1})\) molecules the initial bulk distribution \((n_{3}/n_{1})\) has to be redefined to account for the local composition changes. Hence, the preferential binding parameter becomes,

$$ \Upgamma_{23} (R)=n_{23} (R)-\frac{n_3 -n_{23} (R)}{n_1 -n_{21} (R)}n_{21} (R) $$
(56)

The correction can be significant for protein systems, even though the bulk ratio will change only slightly, as the value of \(n_{21}\) can be large.
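The size of the correction can be judged with a small calculation based on Eq. 56. The sketch below compares the corrected expression with the uncorrected form that uses the overall ratio \(n_{3}/n_{1}\) as the bulk composition. The molecule counts are hypothetical numbers chosen for illustration, and the function names are not taken from any existing analysis package.

```python
def gamma23_corrected(n23, n21, n3, n1):
    """Preferential binding parameter from Eq. 56.

    n23, n21: cosolvent and water counts within distance R of the protein
    n3, n1:   total cosolvent and water counts in the simulation box
    The bulk ratio is taken from the molecules left outside R.
    """
    bulk_ratio = (n3 - n23) / (n1 - n21)
    return n23 - bulk_ratio * n21

def gamma23_uncorrected(n23, n21, n3, n1):
    """Same quantity using the overall box composition n3/n1 as the bulk ratio."""
    return n23 - (n3 / n1) * n21

# Hypothetical box: 100 cosolvents and 1000 waters in total, with
# 12 cosolvents and 100 waters inside the local region.
print(gamma23_corrected(12, 100, 100, 1000))
print(gamma23_uncorrected(12, 100, 100, 1000))
```

In this example the bulk ratio shifts only from 0.100 to about 0.098, yet the two estimates differ by roughly 10% because the ratio multiplies the large local water count.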

Using this type of approach we have used computer simulation to relate changes in hydrocarbon solubility to the preferential interaction of the cosolvent with the hydrocarbon [40, 41]. A direct correlation was observed, as predicted by Eq. 26. A decomposition into local (first shell) and distant (second shell, third shell, etc.) density changes demonstrated a degree of proportionality between the two for most cosolvents (but not TFE) [41]. Additional studies of the properties of hydrocarbons in urea and other cosolvent solutions have also been performed and analyzed using KB theory [101, 102]. A study of cavity formation in urea solutions indicated that the free energy for cavity formation is essentially independent of the urea model used [71]. However, the preferential exclusion of urea from the cavity was inversely proportional to the value of \(a_{33}\) displayed by the models. This suggests that urea models which do not accurately reproduce the experimental value of \(a_{33}\) may lead to inaccurate descriptions of the degree of cosolvent exclusion. We have also studied the ability of common force fields to reproduce the KB integrals for binary solution mixtures [37, 38]. In general, currently available force fields struggle to reproduce the experimental KB integrals, which appear to be a sensitive test of the quality of a force field [71, 159]. Hence, a major focus for us has been the development of improved cosolvent force fields using the experimental KB integrals as fitting data [159–164].

Applications of simulation to the study of preferential interactions in protein systems are relatively scarce. Tang and Bloomfield have used grand canonical Monte Carlo simulations of model systems to evaluate \(\Upgamma _{23}\) [165]. While they did not specifically use KB theory, the type of analysis performed is equivalent to Eq. 21. Baynes and Trout were the first to determine preferential binding parameters for a real protein from molecular simulations [158]. However, they too used simple counting techniques and did not directly invoke KB theory. Their results for the distribution of urea and glycerol around Ribonuclease A at low cosolvent concentrations were in good agreement with experiment, and a correlation was observed between the number of cosolvents and the number of water molecules in the vicinity of the protein. We have applied computer simulations and KB theory to study the effects of NaCl on the equilibrium thermodynamics of folded and unfolded forms of the leucine enkephalin pentapeptide [83, 166]. This study demonstrated that a combined simulation and KB approach is feasible for small systems. However, an attempt to decompose the overall effect into contributions from different groups was less successful.

As an example of a preferential binding parameter analysis we will discuss a recently performed simulation study of 8 M urea around native lysozyme [167]. The experimental data indicate that \(\Upgamma_{N3} = 16\) for this system at pH 7 and the protein remains folded even in 8 M urea [86]. The corresponding value at pH 2, where the protein denatures with a midpoint transition at 3.7 M urea, is \(\Upgamma _{N3} = -10\) in 8 M urea [86]. The latter value was obtained from a combination of dialysis and normal denaturation studies using Eqs. 36 and 44. This type of approach is particularly useful for simulations as we can study the native state (which is known for many proteins) under conditions of high cosolvent concentration, which provides good statistics for the corresponding distributions. Some of our results are displayed in Figs. 5 and 6. The preferential binding parameter was sensitive to the urea model used and neither model quantitatively reproduced the experimental data. We believe this is due to inaccuracies in current force fields. Urea clearly associates with lysozyme in both cases, in agreement with experiment. However, the OPLS model indicates a rather large region of influence of the protein, whereas the differences in the solution composition for the KBFF model are more local with the major changes occurring within 0.5 nm from the surface.

Fig. 5

The simulated preferential binding parameter \((\Upgamma _{N3})\) for 8 M urea (3) and native lysozyme (N) at 300 K and pH 7 as a function of integration distance from the protein surface (R). The simulated values have been obtained using the Gromos 45a3 parameters for lysozyme, the SPC water model, and either the KBFF or OPLS models for urea. Total simulation time was 6 ns. The experimental value is 16

Fig. 6

Simulated radial distribution functions \((g_{ij})\) between lysozyme and 8 M urea (3) or water (1) as a function of distance from the protein surface (r) at 300 K and pH 7. Results are shown for two urea models

In summary, it is now possible to calculate thermodynamic data from computer simulations of cosolvents around proteins. This is achieved by a simple counting procedure and linked to the thermodynamics through KB integrals. Simulations in this area are just beginning. There are no general conclusions to be drawn at present. However, the use of KB theory clearly provides a solid foundation for future studies. The only additional requirements for the simulations are the need for larger system sizes to ensure the distributions reach their bulk values, and the use of extended simulation times required to precisely determine the KB integrals.

Conclusions and Future Directions

There has been a recent resurgence in the use of KB theory for the analysis of biomolecular equilibria. This has helped to clarify the relationships between different preferential binding parameters and provide a clear picture of the effects of cosolvents which is consistent with many of the original experiments performed for open systems. At this point a theoretical analysis of the expressions describing preferential interactions in ternary systems in terms of KB integrals is essentially complete within the infinitely dilute solute approximation. While this has allowed a quantification of many cosolvent effects, and an effective decoupling of the contributions from both the cosolvent and the solvent, it has not resulted in a clear atomic picture of these effects. In our opinion, the only reasonable approach to this problem is the use of computer simulation. Again, KB theory can play an integral role in understanding and analyzing these data. However, it is clear that (in principle) one has to be able to rationalize changes in solution distributions over many solvation shells. While direct interactions between a cosolvent and the surface of a protein should be relatively easy to comprehend, these longer range packing effects are currently very difficult to understand. A further possible use of KB theory will involve studies of peptide and protein aggregation. These involve a peptide or protein at finite concentrations and will therefore involve some modification of the current equations. However, the ability to influence protein–protein interactions is of great interest in understanding the growing number of diseases which are in some way related to misfolding and aggregation. In our opinion, KB theory can provide a solid foundation for such studies.