Historical prolegomena

When I arrived at the John Curtin School of Medical Research at the Australian National University at the end of 1977, armed with nothing more than a PhD thesis about protein hydrodynamics and dynamic light scattering (DLS), along with a few equations describing a polymerizing system at thermodynamic and sedimentation equilibrium, the name of Sandy Ogston was still writ large over the portal of the Department of Physical Biochemistry and the incumbent Head, Laurie Nichol, was engraving his own tablets of stone with the laws of excluded volume and thermodynamic nonideality. But I soon discovered that another fellow called Don Winzor regularly came down from the University of Queensland at Brisbane to act as the immovable stone on which Laurie sharpened his intellectual chisels. The two sat together for hours going over pages of manuscripts and arguing about a multitude of thermodynamic expressions and their meaning, always trying to tease out some new experimental consequence of the eternally intractable relation between strictly defined theoretical quantities and their messy measurable counterparts. When I showed some propensity for mathematical reasoning I was invited to enter the inner sanctum and was admitted into the cabal of those initiated in the rites of “thermodynamic intuition”. The three of us ended up writing a joint paper about non-sigmoidal Scatchard plots (Nichol et al. 1979), but the main task I took on was to develop a more complete account of thermodynamic nonideality in terms of more accurately estimated activity coefficients of separate species in interacting systems, a study that brought McMillan–Mayer theory into focus and eventually gave what seemed like quite a good description of lysozyme’s self-association in mildly acidic solutions (Wills et al. 1980). 
After a couple of years I took off to a new job in Berlin, but having acquired the new skill of making linear and quadratic order corrections to the standard ideal-solution equations I penned a note with Nichol and Ogston about the effect of inert polymers on protein self-association (Nichol et al. 1981) and applied the technique to DLS data (Wills and Georgalis 1981)—all with a view to rationalizing the mostly impossible behavior of aggregation-prone ribosomal proteins. However, the completion of this last task had to wait another decade and a half, that is, until Don Winzor was prepared to slice through the experimental messiness of it all and convince us that the results really did mean something (Wills et al. 1995). In the meantime, the science of excluded volumes and their effect on nucleic acid melting was meted out in the dark, smoky atmosphere of a West Berlin Kneipe known as The Flying Dutchman, reminiscent of a locale in a Le Carré novel, conveniently located on the Richard-Wagner-Strasse in the Charlottenburg district, just over from the Deutsche Oper. The obliging barman Roland spirited documents back and forth between authors and journal editor, notifying us whenever a new package with mysteriously arcane contents was ready for pick-up (Woolley and Wills 1985). Such were the ups and downs of life for a young biochemist in those days, looking for meaning in activity coefficients without thinking hard enough to realize that there was more than one to be had …

And so it came to pass that when Laurie Nichol left science to become a university administrator his parting gift to me was to attach Don Winzor to my hip and convince us that we should (and could) find a proper solution to the problem of unifying the not-completely-rigorous biochemical description of macromolecular interactions in terms of excluded volume, electrostatic and molecular association effects, unaware as we were that the task had long been completed and that we had occasionally used bits of the relevant Hill papers without comprehending their full relevance. But first Don had an old score to settle. We would show that one could be protected from all of the complications of “preferential solvation” by staying under the aegis of excluded volume (Winzor and Wills 1986). And by then I had joined the molecular biological heretics who were interested in prions (Wills 1986) and strayed off on sabbatical to Carleton Gajdusek’s lab at the National Institutes of Health in Maryland. Those were days of glorious intellectual freedom when a scientist paid to study neurological disease could live a parallel life, not only incurring the wrath of the Pentagon and eventually the FBI by using the Freedom of Information Act to play amateur WikiLeaks, but also pursuing thermodynamic rigor, strolling over to the next building to say hello to Allen Minton only to be introduced to the legendary Terrell Hill. We now had a wise and patient guide to show us the way, someone who had blazed the trail a quarter century earlier. It took him no time to explain our naïvety in thinking that the expression for the thermodynamic activity of a single component could be transferred willy-nilly across to multicomponent systems. No, one had to decide how the system was constrained while the concentration of a component was varied. Nowhere was this more simply evident than in the definition of the osmotic pressure (Fig. 1). 
Armed with this new perspective and the help of Wayne Comper, who already knew how to do proper thermodynamics using differential quantities, we could finally reconcile measurements of virial coefficients made under alternative “osmotic” or isobaric conditions (Wills et al. 1993). We were then able to describe preferential solvation and the effect of small molecules on macromolecules rigorously (Wills and Winzor 1993), and this in turn took us back to the basic theory of sedimentation equilibrium and the problem of how the concentration dependence of the buoyancy term \( \left(1-\overline{v}\rho \right) \) was fudged during integration of \( d\ln c/d{r}^2 \) to obtain the equation of the radial concentration dependence \( c(r) \) of a nonideal solute. After years of claim and counterclaim we could finally resolve that as long as one stuck to the molar concentration scale, the equation for a nonideal solution could be obtained from the ideal equation simply by replacing \( c(r) \) with the osmotic activity \( z(r) \), at least in the case of an incompressible solution, and then the density appearing in the buoyancy factor is unequivocally that of the pure solvent, rendering the buoyancy term independent of concentration. At this point we really were in a position to entrench the antipodean perspective that nonideality could not be dispensed with by curve-fitting (Adams and Fujita 1963), but had to be dealt with prior to any realistic assessment of the extent of molecular association reactions. A strong collaboration was underway.

Fig. 1. Letter from teacher (Terrell Hill) to student (the author) outlining the derivation of the osmotic pressure equation expressed alternatively in terms of molar and molal quantities

Thermodynamic activity of a solute

The ideal equation for the chemical potential \( \mu_A \) of a solute A

$$ {\mu}_A={\mu}_{{}_A}^o+RT \ln {X}_A $$
(1)

is not particularly sensitive to the choice of the scale on which the concentration \( X_A \) is nominally measured, simply because the most commonly used scales (amount of solute per amount of solution or, alternatively, per amount of solvent) converge to within a linear correction as \( X_A \to 0 \). The use of Eq. (1) to describe the thermodynamic behavior of a solute therefore depends on the extent of the dilute regime in which corrections that are linear in \( X_A \) can be neglected. For example, as long as the specific volumes of both solute \( v_A \) and solvent \( v_s \) are effectively constant under the relevant experimental conditions, the molar concentration \( C_A \) (or its mass per unit volume equivalent \( c_A = C_A M_A \)) is related to the molal concentration \( m_A \) through the equation

$$ {C}_A={m}_A{\rho}_s/\left(1+{M}_A{v}_A{\rho}_s{m}_A\right) $$
(2)

where \( \rho_s = 1/v_s \) is the density of the solvent and \( M_A \) is the molar mass of the solute. Equation (1) can be modified to reflect the nonideal effects of molecular interactions simply by replacing the nominal concentration variable with a thermodynamic “activity”, which is usually considered to be the product of the concentration and an “activity coefficient”, the latter being a fudge factor that corrects the concentration to give the correct value of the chemical potential. However, to obtain a quantitatively useful expression involving a thermodynamic activity it is necessary to decide how the solution is to be constrained as the solute concentration is varied, a choice that corresponds to determining the standard state to which \( \mu_A^o \) refers (see Fig. 1). It turns out that the two most useful choices for the standard state of the solute are the infinite dilution limit under isothermal conditions at some specified value of either the chemical potential of solvent \( \mu_s \) or the pressure \( P \). In the first case the natural concentration scale on which to define the “osmotic activity” \( z_A = \gamma_A C_A \) is the molar scale (moles of solute per unit volume of solution)

$$ {\mu}_A\left(T,{\mu}_s,{C}_A\right)={\mu}_A^o\left(T,{\mu}_s\right)+RT \ln {z}_A $$
(3)

and in the second case the natural concentration scale on which to define the “isobaric activity” \( a_A = y_A m_A \) is the molal scale (moles of solute per unit mass of solvent)

$$ {\mu}_A\left(T,P,{m}_A\right)={\mu}_A^o\left(T,P\right)+RT \ln {a}_A $$
(4)

The rationale for these choices is that in each case Eq. (1) is preserved as the ideal expression for \( \mu_A \), at least for an incompressible solution (\( v_A \) and \( v_s \) constant), applicable to the constraint (constant \( \mu_s \) or \( P \)) under which the concentration is varied.
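The scale conversion of Eq. (2) is easy to get wrong in practice, so a minimal Python sketch may help. The function names, the unit choices (kg/mol, L/kg, kg/L) and the lysozyme-like numbers are illustrative assumptions, not values taken from the text; the inverse relation is simply Eq. (2) solved for \( m_A \).

```python
def molal_to_molar(m_A, M_A, v_A, rho_s):
    """Eq. (2): molar concentration C_A (mol/L) from molality m_A (mol/kg solvent).

    M_A   : solute molar mass, kg/mol
    v_A   : solute specific volume, L/kg (assumed constant)
    rho_s : solvent density, kg/L (assumed constant)
    """
    return m_A * rho_s / (1.0 + M_A * v_A * rho_s * m_A)


def molar_to_molal(C_A, M_A, v_A, rho_s):
    """Inverse of Eq. (2), obtained by solving it algebraically for m_A."""
    return C_A / (rho_s * (1.0 - M_A * v_A * C_A))


# Hypothetical example: a 14.3 kg/mol protein at 1 mmol per kg of solvent.
C_A = molal_to_molar(m_A=1.0e-3, M_A=14.3, v_A=0.703, rho_s=1.0)
m_back = molar_to_molal(C_A, M_A=14.3, v_A=0.703, rho_s=1.0)
```

The two scales differ only by the factor \( 1+M_Av_A\rho_sm_A \), which is the linear correction referred to above; it becomes significant for large solutes at high concentration.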

The statistical mechanics of solutions of macromolecules under osmotic and isobaric conditions was given a thorough treatment by Hill in the late 1950s (Hill 1956a, 1958, 1959). He provided a very general analysis, showing how the coefficients \( B_2 \), \( B_3 \), etc., appearing in the virial expansion for the osmotic pressure, that is, changes in \( P \) due to changes in \( C_A \) at constant \( \mu_s \),

$$ {\left[\varPi /(RT)\right]}_{T,{\mu}_s}={C}_A\left(1+{B}_2{C}_A+{B}_3{C}_A^2+\dots \right) $$
(5)

can be related to the coefficients \( C_2 \), \( C_3 \), etc. in the equivalent expansion that accounts for changes in \( \mu_s \) due to changes in \( m_A \) at constant \( P \),

$$ -{\left[\left({\mu}_s-{\mu}_s^o\right)/\left({\rho}_sRT\right)\right]}_{T,P}={m}_A\left(1+{C}_2{m}_A+{C}_3{m}_A^2+\dots \right) $$
(6)

The great advantage of the osmotic pressure expansion [Eq. (5)] is that any virial coefficient \( B_n \) can, in principle, be calculated directly by considering the energy of interaction between just \( n \) molecules of solute. However, either expansion can be used to define the activity coefficient appropriate to the conditions under which the corresponding activity is defined:

$$ \ln {\gamma}_A=2{B}_2{C}_A+\left(3/2\right){B}_3{C}_A^2+\dots $$
(7)

and

$$ \ln {y}_A=2{C}_2{m}_A+\left(3/2\right){C}_3{m}_A^2+\dots $$
(8)

whence a molecular interpretation can eventually be made of variations in the thermodynamic activity of a macromolecular solute.
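The link between the osmotic virial expansion of Eq. (5) and the activity coefficient of Eq. (7) is the Gibbs–Duhem relation at constant \( T \) and \( \mu_s \), which requires \( d\left[\varPi /(RT)\right]/d{C}_A={C}_A\,d\ln {z}_A/d{C}_A \). The following Python sketch checks this numerically for series truncated after \( B_3 \). The function names and the values of \( B_2 \), \( B_3 \) and \( C_A \) are illustrative assumptions only.

```python
import math


def osmotic_pressure_over_RT(C, B2, B3=0.0):
    """Eq. (5) truncated after B3; the result has the units of C."""
    return C * (1.0 + B2 * C + B3 * C**2)


def ln_gamma(C, B2, B3=0.0):
    """Eq. (7) truncated after B3."""
    return 2.0 * B2 * C + 1.5 * B3 * C**2


def osmotic_activity(C, B2, B3=0.0):
    """Osmotic activity z = gamma * C on the molar scale."""
    return C * math.exp(ln_gamma(C, B2, B3))


def gibbs_duhem_residual(C, B2, B3=0.0, rel_step=1.0e-4):
    """d(Pi/RT)/dC - C * d(ln z)/dC by central differences (should be ~0)."""
    h = rel_step * C
    d_pi = (osmotic_pressure_over_RT(C + h, B2, B3)
            - osmotic_pressure_over_RT(C - h, B2, B3)) / (2.0 * h)
    d_ln_z = (math.log(osmotic_activity(C + h, B2, B3))
              - math.log(osmotic_activity(C - h, B2, B3))) / (2.0 * h)
    return d_pi - C * d_ln_z


# Hypothetical values: C in mol/L, B2 in L/mol, B3 in L^2/mol^2.
residual = gibbs_duhem_residual(C=1.0e-3, B2=50.0, B3=1000.0)
```

The residual vanishes to within finite-difference error because the factors 2 and 3/2 in Eq. (7) are exactly those required by the coefficients of Eq. (5).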

These results, and their application to a wide variety of problems, were by no means the exclusive domain of Hill. Indeed, he stood on the shoulders of the likes of Scatchard (1946), Brinkman and Hermans (1949), Bird et al. (1950), Stockmayer (1950) and Kirkwood and Buff (1951), to name but a few. However, the most systematic and comprehensive development of McMillan–Mayer theory in relation to protein solutions was carried out by Hill (1954, 1955a, b, 1956a, b, 1957, 1958, 1959), with the main results of relevance to biochemists being incorporated into his specialized book (Hill 1968) and further elaborated in a later publication (Hill and Chen 1973).

Sedimentation equilibrium

With this clarification of what was meant by the term “thermodynamic activity” it was possible to take a new approach to the derivation of the expression for the concentration distribution of a solute in a system at sedimentation equilibrium, giving a clear comparison with the standard result for an ideal solution:

$$ {C}_A(r)={C}_A\left({r}_0\right)\psi (r) $$
(9a)

where

$$ \psi (r)= \exp \left[\phi \left({r}^2-{r}_0^2\right)\right] $$
(9b)

and

$$ \phi ={M}_A\left(1-{v}_A{\rho}_s\right){\omega}^2/(2RT) $$
(9c)

The two usual ways of obtaining Eq. (9), namely, the sedimentation–diffusion flux balance method and the thermodynamic equilibrium method, both involve an integration step in which the solution density ρ(r) is treated as a constant, independent of r, when it is not. For incompressible solutions,

$$ \rho (r)={\rho}_s+\left(1-{v}_A{\rho}_s\right){M}_A{C}_A $$
(10)

The mathematical sleight of hand involved is of little consequence in the analysis of data that can be considered to represent the behavior of an ideal solution and it accounts for versions of Eq. (9) in which \( \rho \) is said variously to be the solvent or the solution density. However, in deriving an expression for the exact magnitude of the first-order nonideality correction to the sedimentation equilibrium concentration profile, the dependence of \( \rho \) on \( C_A \) must be taken into account correctly. The exact result for incompressible solutions can be written simply by replacing \( C_A \) in Eq. (9) with the osmotic activity \( z_A \):

$$ {z}_A(r)={z}_A\left({r}_0\right)\psi (r) $$
(11)

We found a very general pathway to this result (Wills et al. 1996) and devised a direct method for the analysis of sedimentation equilibrium data based on a transformation to \( \psi (r) \) from Eq. (9b) as the independent variable. We also presented a derivation for the simplest case of a single solute (Wills et al. 2000a), as well as extending the \( \psi (r) \) analysis to mixtures of interacting solutes (Wills et al. 2000b).

To understand the relevance of the osmotic activity in sedimentation equilibrium it is important to realize that the standard chemical potential \( \mu_A^o\left(T,\mu_s\right) \) in Eq. (3) is a function of radial distance \( r \). Consider a solution of the same composition and at the same temperature and pressure as that existing at \( r_0 \). Remove all of the solute under conditions of constant chemical potential of solvent \( \mu_s(r) \), to give pure solvent at a pressure of \( P^o\left(r_0\right)=P\left(r_0\right)-\varPi \left(r_0\right) \). According to Eq. (3) this osmotic change has brought the solute to its standard state of infinite dilution, where its chemical potential is \( \mu_A^o\left(T,\mu_s\right) \) as at \( r_0 \) in the centrifuge cell. Carry out the same procedure for a solution corresponding to the conditions at \( r \). Now, the hypothetical change \( \Delta \mu_A^o\left(T,\mu_s\right) \) in the chemical potential of solute between these two states of the solvent is \( RT\phi \left(r^2-r_0^2\right) \), with the density in Eq. (9c) unequivocally \( \rho_s \) because the change of state corresponds to the ideal situation in which a single solute molecule in an infinite volume of solvent is transformed from the solvent conditions of \( P^o \) and \( \mu_s \) at \( r_0 \) to those at \( r \). Equation (11) follows immediately from the calculation of \( \mu_A(r)-\mu_A\left(r_0\right) \) for the conditions of the actual sedimentation equilibrium under consideration.
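To make Eq. (11) concrete, the following Python sketch computes a nonideal concentration profile for a single solute by evaluating \( \psi (r) \) from Eqs. (9b) and (9c) and then inverting \( z_A=C_A\exp \left(2B_2C_A\right) \), that is, Eq. (7) truncated at first order. The truncation, the bisection solver (which assumes \( B_2\ge 0 \)), the SI unit choices and all numerical values are assumptions made for illustration; this is not the fitting procedure of the cited papers.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def psi(r, r0, M_A, v_A, rho_s, omega, T):
    """Eqs. (9b) and (9c) in SI units (kg/mol, m^3/kg, kg/m^3, rad/s, K, m)."""
    phi = M_A * (1.0 - v_A * rho_s) * omega**2 / (2.0 * R_GAS * T)
    return math.exp(phi * (r**2 - r0**2))


def concentration_from_activity(z, B2, iterations=200):
    """Solve z = C * exp(2 * B2 * C) for C by bisection (assumes B2 >= 0)."""
    lo, hi = 0.0, z  # for B2 >= 0 the root lies between 0 and z
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(2.0 * B2 * mid) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def nonideal_profile(radii, r0, C0, B2, M_A, v_A, rho_s, omega, T):
    """Eq. (11): C_A(r) for a single nonideal solute, given C_A(r0) = C0."""
    z0 = C0 * math.exp(2.0 * B2 * C0)
    return [concentration_from_activity(
                z0 * psi(r, r0, M_A, v_A, rho_s, omega, T), B2)
            for r in radii]


# Hypothetical run: 14.3 kg/mol protein, 15,000 rpm, 293.15 K.
params = dict(M_A=14.3, v_A=0.703e-3, rho_s=1000.0,
              omega=2.0 * math.pi * 15000.0 / 60.0, T=293.15)
radii = [0.069, 0.070, 0.071]  # m
ideal = nonideal_profile(radii, 0.069, C0=0.1, B2=0.0, **params)      # mol/m^3
nonideal = nonideal_profile(radii, 0.069, C0=0.1, B2=0.05, **params)  # B2 in m^3/mol
```

With \( B_2=0 \) the routine reproduces the ideal profile of Eq. (9a). A positive \( B_2 \) flattens the gradient, because the activity rather than the concentration follows \( \psi (r) \).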

Molecular interactions

For molecules that interact through a spherically symmetric potential u(x) when separated by a distance x, the molar-scale second virial coefficient (L is Avogadro’s number) is given by McMillan and Mayer (1945) as

$$ {B}_2=-2\pi L{\displaystyle \underset{0}{\overset{\infty }{\int }}f(x){x}^2dx} $$
(12)

where the Mayer f-function is defined as

$$ f(x)= \exp \left[-u(x)/kT\right]-1 $$
(13)

and for the hard-sphere interaction between molecules of radius \( R_A \)

$$ u(x)=\left\{\begin{array}{ll}\infty \hfill & x<2{R}_A\hfill \\ {}0\hfill & x\ge 2{R}_A\hfill \end{array}\right\} $$
(14)

the van der Waals excluded volume per pair of molecules

$$ {B}_2=16\pi L{R}_A^3/3 $$
(15)

is obtained (Wills and Winzor 2005).
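The step from Eq. (12) to Eq. (15) can be checked numerically. The sketch below integrates Eq. (12) by the midpoint rule, taking the Mayer f-function as \( \exp \left[-u/kT\right]-1 \), the sign convention under which Eq. (12) gives a positive \( B_2 \) for purely repulsive potentials. The function names, the quadrature settings and the radius are illustrative assumptions.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def mayer_f(u_over_kT):
    """Mayer f-function exp(-u/kT) - 1; equals -1 where u is infinite."""
    return math.exp(-u_over_kT) - 1.0


def second_virial(u_over_kT, x_max, n=20000):
    """Eq. (12) by the midpoint rule on [0, x_max].

    u_over_kT : function returning u(x)/kT for a separation x (m)
    x_max     : cutoff beyond which the potential is taken to vanish (m)
    Returns B2 in m^3/mol.
    """
    dx = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += mayer_f(u_over_kT(x)) * x * x
    return -2.0 * math.pi * AVOGADRO * total * dx


def hard_sphere(R_A):
    """Eq. (14) as a function of separation, in units of kT."""
    return lambda x: math.inf if x < 2.0 * R_A else 0.0


R_A = 1.7e-9  # hypothetical protein radius, m
B2_numeric = second_virial(hard_sphere(R_A), x_max=4.0 * R_A)
B2_exact = 16.0 * math.pi * AVOGADRO * R_A**3 / 3.0  # Eq. (15)
```

The same routine accepts any spherically symmetric \( u(x)/kT \), provided the cutoff `x_max` is chosen large enough for the potential to have decayed.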

The contribution of excluded volume to thermodynamic nonideality had long been understood, and Ogston and Winzor (1975) had extended Eq. (15) to the case of ellipsoids and used Debye–Hückel theory to take account of charge–charge interactions. However, in the background were much earlier discussions in which Hill (1954, 1956b) had already provided a simple way of calculating the third virial coefficient \( B_3 \) as well as a more realistic estimate of charge–charge effects that could easily be adapted to macromolecular solutes. The form of the electrostatic potential adopted for molecules carrying a net surface charge \( Z_Ae \), where \( e \) is the electronic charge, was

$$ \begin{array}{ll}u(x)=\frac{Z_A^2{e}^2 \exp \left[-\kappa \left(x-2{R}_A\right)\right]}{\varepsilon x{\left(1+\kappa {R}_A\right)}^2}\hfill & x\ge 2{R}_A\hfill \end{array} $$
(16)

where \( \kappa \) is the Debye–Hückel inverse screening length of the supporting electrolyte-bearing medium and \( \varepsilon \) is its dielectric constant. This was demonstrated to give a good representation of the dependence of \( B_2 \) on ionic strength (Wills et al. 2000a) and provided confidence in what had become a common practice of using the relevant formulae to “calculate out” the effects of excluded volume and charge, thereby bringing into visibility molecular associations due to short-range attractive forces between proteins. Comparison of different ways of taking into account higher order effects of charge–charge interactions, through either extensions of Eq. (16) (Hill 1954, 1956b) or scaled particle theory, emphasized the need for independent information about the size and charge of proteins in attempting to investigate association reactions (Scott et al. 2010).

Once we had rediscovered the Hill–Chen opus (Hill and Chen 1973) and its theoretical underpinnings, Laurie Nichol’s “unification” problem was solved. Molecular associations could indeed be treated as just another form of nonideality, exactly in the manner envisaged by the van der Waals equation of state (Wills and Winzor 2005). In the case of a single solute we were not required to have separate thermodynamic equations for different oligomeric “components” as had been done previously (Wills et al. 1980); all we had to do was incorporate the dimerization constant \( K_2 \) into the relevant virial coefficients

$$ \begin{array}{lll}{B}_2={B}_{11}^{*}-{K}_2;\hfill & {B}_3={B}_{111}^{*}-2{K}_2\left(4{B}_{11}^{*}-{B}_{12}^{*}\right)+4{K}_2^2;\hfill & \mathrm{e}\mathrm{t}\mathrm{c}.\hfill \end{array} $$
(17)

and concomitantly more complex relationships for higher order coefficients. The star superscript in Eq. (17) denotes a virial coefficient that is calculated solely on the basis of the relatively long-range repulsive forces (excluded volume and electrostatic) between molecules according to the likes of Eqs. (15) and (16); and the string of subscripts on these quantities indicates a list of monomers (1), dimers (2), and others comprising a total of \( n \) monomer units.
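As a small worked illustration of Eq. (17), the Python sketch below evaluates the apparent virial coefficients for a dimerizing solute and inverts the first relation to estimate \( K_2 \) from a measured \( B_2 \). The function names and numbers are hypothetical, and the sketch assumes that every coefficient is expressed in the same molar-scale units (here L/mol and L²/mol²) and that the starred terms have been calculated independently.

```python
def apparent_B2(B11_star, K2):
    """First relation of Eq. (17): B2 = B11* - K2."""
    return B11_star - K2


def apparent_B3(B111_star, B11_star, B12_star, K2):
    """Second relation of Eq. (17)."""
    return B111_star - 2.0 * K2 * (4.0 * B11_star - B12_star) + 4.0 * K2**2


def dimerization_constant(B2_measured, B11_star):
    """Estimate K2 by inverting B2 = B11* - K2."""
    return B11_star - B2_measured


# Hypothetical numbers: calculated repulsive term 60 L/mol, measured B2 of 15 L/mol.
K2 = dimerization_constant(B2_measured=15.0, B11_star=60.0)
```

A measured \( B_2 \) smaller than the calculated repulsive term \( B_{11}^{*} \) signals net attraction. The estimate of \( K_2 \) is only as reliable as the independent information on size and charge used to calculate \( B_{11}^{*} \).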

With this basic insight we were able to extend the direct analysis of sedimentation equilibrium data using the ψ-function of Eq. (11) as far as the case of multiple experiments involving two separate components (ovalbumin and cytochrome c) under conditions in which they undergo an association reaction to form a hetero-dimer (Wills et al. 2000b). We also gave some consideration to \( B_3 \) and higher order effects (Wills and Winzor 2001, 2011; Wills et al. 2012). The ψ-function analysis predated other procedures for the direct calculation of association constants through statistical analysis of sedimentation equilibrium data. It emphasized the non-thermodynamic nature of the separation of the short-range molecular forces involved in molecular association reactions from others (Winzor and Wills 2007), thereby invalidating statistical curve fitting as a means of gaining access to equilibrium constants by eliminating the effects of other interactions. The direct use of Eq. (17) to extract protein dimerization constants from osmotic pressure data was also demonstrated (Wills and Winzor 2009).

Other extensions

The standard analysis (Casassa and Eisenberg 1964) of the thermodynamic effects exerted by the electrolyte routinely regarded as part of the solvent in which a protein is dissolved was legendary among biochemists for its rigor and complexity. However, our analysis of “preferential solvation” (Winzor and Wills 1986; Wills and Winzor 1993) and its application to sedimentation analysis (Jacobsen et al. 1996) had led us to an approach using molar-scale variables and osmotic considerations (Wills and Winzor 2002) that was much simpler than the corresponding approach based on the use of molal quantities. Provided the components M of the diffusible osmotic solvent could be regarded as a simple volume-filling ideal mixture, their only significant effect was to alter the effective density of the medium in which the macromolecules were immersed, giving rise to an altered solvent density

$$ {\rho}_d={\rho}_s+\left(1-{v}_M{\rho}_s\right){M}_M{C}_M $$
(18)

and an effective protein specific volume \( v_A^{*} \), defined in terms of an altered sedimentation buoyancy factor, which depended on the second virial coefficient \( B_{AM} \) for the interaction between molecules of A and M

$$ {M}_A\left(1-{v}_A^{*}{\rho}_d\right)={M}_A\left(1-{v}_A{\rho}_s\right)-\left(1-{v}_M{\rho}_s\right){B}_{AM}{M}_M{C}_M+\dots $$
(19)

This approach, which was found to be in good agreement with experimental results, is vindicated by the interpretation of Eq. (4.37) of Hill (1968). In the case of sedimentation equilibrium the derivation of Eq. (11) is still valid for a multicomponent solvent, provided one applies the osmotic condition of constant chemical potential to all such components. When they are all sufficiently small not to undergo significant sedimentation, then the buoyancy factor is a constant throughout the cell. The presence of larger, inert polymers added to mimic the crowded molecular environment inside a cell could be handled in a similar, very simple way, provided they made the overwhelmingly dominant contribution to the nonideality of the protein (Wills et al. 1995; Winzor and Wills 2006).
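A short Python sketch shows how Eqs. (18) and (19) might be evaluated for a small cosolute M. Eq. (19) is used here truncated at the term linear in \( C_M \) and written so that it reduces to the ordinary buoyancy factor \( M_A\left(1-v_A\rho_s\right) \) when \( C_M=0 \). The function names, the units (kg/mol, L/kg, kg/L, mol/L, L/mol) and the numerical values are assumptions for illustration only.

```python
def effective_solvent_density(rho_s, v_M, M_M, C_M):
    """Eq. (18): density rho_d of the solvent plus small cosolute M."""
    return rho_s + (1.0 - v_M * rho_s) * M_M * C_M


def effective_buoyant_mass(M_A, v_A, rho_s, v_M, M_M, C_M, B_AM):
    """Eq. (19), truncated at the term linear in C_M."""
    return M_A * (1.0 - v_A * rho_s) - (1.0 - v_M * rho_s) * B_AM * M_M * C_M


def effective_specific_volume(M_A, v_A, rho_s, v_M, M_M, C_M, B_AM):
    """Solve M_A * (1 - v_star * rho_d) = effective buoyant mass for v_star."""
    rho_d = effective_solvent_density(rho_s, v_M, M_M, C_M)
    buoyant = effective_buoyant_mass(M_A, v_A, rho_s, v_M, M_M, C_M, B_AM)
    return (1.0 - buoyant / M_A) / rho_d


# Hypothetical example: protein A in 0.5 mol/L of a sucrose-like cosolute M.
v_star = effective_specific_volume(M_A=14.3, v_A=0.703, rho_s=1.0,
                                   v_M=0.62, M_M=0.342, C_M=0.5, B_AM=20.0)
```

Setting `C_M=0` recovers \( \rho_d=\rho_s \) and \( v_A^{*}=v_A \), which is a convenient sanity check on the signs.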

Thermodynamic results are general, so insights in the context of thinking about what happens in a centrifuge cell can be transferred to an array of other experimental situations. The partitioning of a solute in a frontal chromatography experiment is clearly reminiscent of osmosis and lends direct access to the effects of nonideality. This had been exploited in the analysis of “preferential solvation” (Winzor and Wills 1986) and the use of inert polymers in the estimation of the size of proteins (Wills et al. 1995; Winzor and Wills 1995, 2006), but it was also applied in a careful analysis of hemoglobin self-association at high concentrations (Winzor and Wills 2003) and extended to the new technique of self-interaction chromatography (Winzor et al. 2007), once again proving how consideration of excluded volume could provide a parsimonious explanation of diverse phenomena, including the influence of an inert polymer on the kinetics of enzymic catalysis, an effect that had previously been ascribed to “osmotic stress” (Winzor and Wills 1995). Failure to take proper account of the definition of the thermodynamic activity, osmotic or isobaric [Eqs. (3) and (4)], and the concentration scale against which virial coefficients are measured, molar or molal [Eq. (2)], remains an impediment to the optimal interpretation of thermodynamic measurements on protein solutions (Wills and Winzor 2011). The interpretation of data from light scattering experiments (Winzor et al. 2007) remains incomplete.

Conclusion

The contribution of Don Winzor to more than half a century of physical and analytical biochemistry, starting from considerations of excluded volume and continuing to his demand that statistical mechanical expositions lead to experimentally relevant conclusions and interpretations, has been indispensable to the development of rigor and quantification in areas of science where formalism had traditionally been looked at with skepticism. The methodologies and interpretations of experimental data that were developed by him and his many collaborators continue to be applied to a wide range of problems of biological and medical significance, especially in the field of protein–protein interactions. A multitude of structure–function studies, recently ranging from the molecular basis of murine olfaction (Portman et al. 2014), through the hetero-dimeric character of a plant immune receptor (Williams et al. 2014), to a lethal mutation in the laminin alpha-1 gene (Patel et al. 2015), has demonstrated the importance of the thermodynamic approach to significant molecular biological problems. It can only be anticipated that the influence of Don’s work and its application will still be felt many decades into the future.