1 Surfactant Self Assembly: Morphology and Statistical Thermodynamics

Surfactant molecules are amphiphiles: they comprise different chemical moieties which are soluble in different solvents. Since these moieties are linked together chemically, nature has to grapple with an interesting problem: how best to lower the free energy, given that no matter what the solvent conditions are, some chemical moieties will likely be “unhappy.” Nature’s solution is self assembly, a process by which larger scale structures form cooperatively, such that unfavorable solvent contact is largely avoided. Self assembly is an amazing and hugely important example of an emergent phenomenon, in that it creates new physical entities (namely, the aggregates) which can be much bigger than the individual molecules they are made of. This transition in relevant scale is the primary reason why we can deal with these aggregates using physical tools that are quite removed from atomistic modeling, such as continuum elasticity. How self assembly works is the topic of this first section.

The classical work on this topic is the groundbreaking paper by Israelachvili et al. (1976), from which the present section picks some of the most beautiful results and elaborates them in a bit more detail. Good discussions can also be found in textbooks on soft condensed matter physics, such as Jones (2002) or Witten and Pincus (2004).

1.1 Morphology

Lipids, or more generally surfactants, are molecules which are typically divided into a “head” and a “tail.” The head is hydrophilic (water soluble), for instance because it has polar groups (e.g., hydroxyl or carbonyl groups), or because it is charged (e.g., amino, carboxyl, or phosphate groups). The tail, on the other hand, is hydrophobic (water insoluble), and for lipids generally consists of two aliphatic chains. These chains typically contain between 12 and 22 carbon atoms, usually connected by single bonds, but sometimes with one or more double bonds (in the latter case one speaks of “unsaturated lipids”). Figure 1 gives a simple illustration of this by showing pictures of lipids using some commonly employed computational models for studying them. Notice that only one of these models strives for a full chemical resolution. The others simplify the chemical architecture more or less drastically, but they all keep one key aspect: lipids are amphiphiles.

Fig. 1

Adapted from (Wang and Deserno 2016)

Illustration of the morphology of a lipid molecule. Panel a shows a typical physicist’s cartoon: a hydrophilic head group with two schematic tails; panel b takes this sketch seriously and translates it into a highly coarse grained model (Cooke et al. 2005); panel c illustrates a lipid on the MARTINI level (Marrink et al. 2007), where the number of beads is increased, but still each bead accounts for approximately 3–4 heavy atoms; and panel d displays a united-atom lipid model of DMPC (dimyristoylphosphatidylcholine) (Berger et al. 1997), in which every atom (except non-polar hydrogens) is explicitly accounted for.

The key effect on which self assembly relies is a cooperative aggregation of surfactants that tries to bury the water-insoluble tails in the interior of the aggregate, shielding them from the aqueous solvent by a layer of hydrophilic head groups. Interestingly, there are numerous different morphologies in which that could happen, and this depends on the shape of the surfactant. For instance, if the lipid has a relatively large head group and a thin tail—if it looks like an ice cream cone—then we can imagine these surfactants packing together to form little spheres. But if the shape of a lipid is less obviously pointed, then lower curved structures seem more likely—such as cylindrical aggregates or even planar sheets. As we will now see, Israelachvili et al. (1976) have developed a beautifully simple way to make this intuition quantitative.

Fig. 2

Simplified shape-description of a surfactant as a blunted cone

Let us represent a lipid schematically as a building block that is approximately cylindrical, but with a somewhat tapered tail region, as illustrated in Fig. 2, so that it looks like a blunted cone. The area of its head-group surface is \(a=\pi r^2\), its volume is v, and its length is l. Imagine we need N of these objects to piece together a sphere of radius \(R_\text {sph}\). It is then obvious that we must have

$$\begin{aligned} Nv&= V_\text {sph} = \frac{4}{3}\pi R_\text {sph}^3 \ ,\end{aligned}$$
(1a)
$$\begin{aligned} Na&= A_\text {sph} = 4\pi R_\text {sph}^2 \ . \end{aligned}$$
(1b)

Dividing these two equations, N cancels, and we get an equation for the radius of that sphere:

$$\begin{aligned} \frac{v}{a} = \frac{1}{3}R_\text {sph} \ . \end{aligned}$$
(2)

At the center of the sphere we cannot have any empty space. Hence the radius \(R_\text {sph}\) which we found cannot be larger than the length l of the amphiphile—imagine for instance that there is a largest length to which the tails can stretch, and that limits the sphere’s radius: \(R_\text {sph}\le l\). This results in the condition

$$\begin{aligned} \text {spheres:}\qquad \frac{v}{al} =: P \le \frac{1}{3} \ , \end{aligned}$$
(3)

where we defined the so-called packing parameter P. We hence find that if this condition on P is satisfied, these lipid building blocks will indeed like to aggregate into spherical objects, which go under the name spherical micelles.

We can repeat this argument, but now instead consider packing the building blocks into a cylinder of radius \(R_\text {cyl}\) and length \(L_\text {cyl}\). Assuming that \(L_\text {cyl}\) is large enough to ignore end effects, we then get

$$\begin{aligned} Nv&= V_\text {cyl} = \pi R_\text {cyl}^2L_\text {cyl} \ ,\end{aligned}$$
(4a)
$$\begin{aligned} Na&= A_\text {cyl} = 2\pi R_\text {cyl} L_\text {cyl} \ . \end{aligned}$$
(4b)

Again dividing these two equations yields

$$\begin{aligned} \frac{v}{a} = \frac{1}{2}R_\text {cyl} \ . \end{aligned}$$
(5)

Once more, requiring that the resulting value for the cylinder’s radius is not larger than the lipid length l leads to the condition

$$\begin{aligned} \text {cylinders:}\qquad \frac{1}{3} \le P \le \frac{1}{2} \ , \end{aligned}$$
(6)

where the lower cutoff comes from the previous case: if P is even smaller than \(\frac{1}{3}\), we already know that we get spheres.

We can again repeat this argument, but now we pack the amphiphiles into a planar bilayer structure of area \(A_\text {bil}\) and thickness \(b_\text {bil}\), leading to

$$\begin{aligned} Nv&= V_\text {bil} = b_\text {bil}A_\text {bil} \ ,\end{aligned}$$
(7a)
$$\begin{aligned} Na&= A = 2 A_\text {bil} \ , \end{aligned}$$
(7b)

and dividing these two equations gives

$$\begin{aligned} \frac{v}{a} = \frac{1}{2}b_\text {bil} \ . \end{aligned}$$
(8)

Again, the thickness of each individual leaflet (i.e., half the bilayer’s thickness) cannot exceed the length l to which the lipid can stretch, \(\frac{1}{2}b_\text {bil}\le l\), and so we find

$$\begin{aligned} \text {bilayers:}\qquad \frac{1}{2} \le P \le 1 \ . \end{aligned}$$
(9)
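The three conditions (3), (6), and (9) can be collected into a small lookup. The following is a minimal sketch, not part of the original derivation: the function names and the example numbers for v, a, and l are hypothetical and chosen only for illustration.

```python
def packing_parameter(v, a, l):
    """Return P = v / (a * l) for volume v, head-group area a, length l."""
    return v / (a * l)


def predicted_morphology(P):
    """Map a packing parameter onto the aggregate shape of Eqs. (3), (6), (9)."""
    if P <= 0:
        raise ValueError("packing parameter must be positive")
    if P <= 1 / 3:
        return "spherical micelle"
    if P <= 1 / 2:
        return "cylindrical micelle"
    if P <= 1:
        return "planar bilayer"
    return "outside the range covered by Eqs. (3), (6), (9)"


# Hypothetical values in nm^3, nm^2, and nm; they are not taken from the text.
P = packing_parameter(v=0.35, a=0.5, l=2.0)
print(P, predicted_morphology(P))
```

The boundaries are treated as belonging to the lower-curvature-limit case on their left, which is a convention of this sketch; the simple geometric argument does not decide what happens exactly at \(P=\frac{1}{3}\) or \(P=\frac{1}{2}\).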

The argument, as presented, is remarkably simple; Israelachvili et al. (1976) look at the situation in a fair bit more detail, but the key findings nevertheless hold up. In fact, this line of reasoning works well even for building blocks which are very simple and not very pliable, such as the lipid model from Fig. 1b. Cooke and Deserno (2006) showed that by simply changing the head-group size of the three-bead lipid, one can drive the entire morphological transition from spheres over cylinders to bilayers; if one pushes the packing parameter even larger, the lamellar phase becomes unstable. This is illustrated in Fig. 3.

Fig. 3

Reprinted from Cooke and Deserno (2006) with permission from Elsevier

The different morphologies of amphiphilic aggregates are controlled by amphiphile shape, even for models as simple as that from Fig. 1b.

Of course, the transitions themselves do not yet tell whether the simple packing-parameter theory works; but this theory makes a prediction that can be tested. Taking the area per lipid from a flat bilayer as the value for a, and using one of the transitions (say, spheres to cylinders) to pinpoint v / l, one can write the packing parameter as a function of the head-group size of the lipid. This then gives a prediction for the head-group size where the other transition (cylinders to bilayer) happens. Cooke and Deserno (2006) show that this prediction indeed works.

The geometrical picture we have in mind by now is that a smaller packing parameter P corresponds to a more cone-like shape, while for a larger P the lipid becomes more cylindrical. This intuition can be verified (and made more precise) by a simple calculation: if \(\Omega \) is the solid angle of the blunted cone and R is the distance from the (virtual) apex of the cone to its head surface, then its volume can be written as

$$\begin{aligned} v = \frac{1}{3}\Omega \left[ R^3-(R-l)^3\right] = \Omega \left[ R^2l-Rl^2+\frac{1}{3}l^3\right] \ . \end{aligned}$$
(10)

Since its head surface is \(a=\Omega R^2\), we find \(P = 1-\frac{l}{R}+\frac{1}{3}\left( \frac{l}{R}\right) ^2\), a quadratic equation that can be solved for R, from which we then get the solid angle. Since, furthermore, \(\Omega =2\pi \left( 1-\cos \frac{\varphi }{2}\right) \approx \frac{1}{4}\pi \varphi ^2\), where the last approximation is good for small \(\varphi \), we arrive at the opening angle

$$\begin{aligned} \frac{\varphi }{r/l} \approx 3\left[ 1-\sqrt{1-\frac{4}{3}(1-P)}\right] \ . \end{aligned}$$
(11)

This relation is illustrated in Fig. 4. The characteristic ratio r / l defines an angle, and the actual opening angle \(\varphi \) is some multiple of that—twice as big for cones at the boundary between spheres and cylinders, and about 1.3 times as big at the boundary between cylinders and planes. Of course, the angle vanishes at \(P=1\). Notice that we can alternatively also calculate the lipid spontaneous curvature, defined as \(K_\text {0,m}=2/R\). For P close to 1 we find for this parameter

$$\begin{aligned} K_\text {0,m}l \approx 2(1-P) + \frac{2}{3}(1-P)^2 + \cdots \end{aligned}$$
(12)

This provides a link between a parameter from continuum Helfrich theory, \(K_\text {0,m}\), and a parameter from the self assembly problem, P.
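Equations (11) and (12) are easy to evaluate numerically. The sketch below (function names are hypothetical) computes the scaled opening angle and compares the two-term series for \(K_\text {0,m}l\) with the closed form; note that \(K_\text {0,m}l=2l/R\) is given by the same expression as the right hand side of Eq. (11), since both equal \(2l/R\) within the small-angle approximation.

```python
import math


def scaled_opening_angle(P):
    """phi / (r/l) from Eq. (11); the square root is real for 1/4 <= P <= 1."""
    return 3.0 * (1.0 - math.sqrt(1.0 - 4.0 / 3.0 * (1.0 - P)))


def curvature_series(P):
    """Two-term expansion of K_0m * l from Eq. (12)."""
    e = 1.0 - P
    return 2.0 * e + 2.0 / 3.0 * e**2


for P in (1 / 3, 1 / 2, 0.9, 1.0):
    # The closed form doubles as the exact value of K_0m * l = 2 l / R.
    print(P, scaled_opening_angle(P), curvature_series(P))
```

The series is only meant for P close to 1; at the sphere-cylinder boundary it visibly deviates from the closed form.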

Fig. 4

Relation between the opening angle of the blunted cone from Fig. 2 (measured in units of r / l) and the packing parameter P. Around \(P=1\) we have \(\varphi \approx \frac{2r}{l}(1-P)\)

1.2 Statistical Thermodynamics

Knowing the shape of the aggregate is only the beginning. We surely also want to know under what conditions such aggregates form, and if they come in different sizes (say, what is the length of a cylindrical micelle?), we want to know the size distribution, too.

The problem is interesting, because entropy plays a key role. Were it only a matter of energy, any kind of amphiphile would aggregate to any other amphiphile, no matter how weak any attractive interaction is. But when we consider entropy, we realize that aggregation strongly reduces the translational entropy of amphiphiles. To understand this energy–entropy balance better we again follow Israelachvili et al. (1976). Let us therefore define

$$\begin{aligned} \varepsilon _n&: \text {energy per molecule in }n\text {-aggregate} \end{aligned}$$
(13a)
$$\begin{aligned} \phi _n&: \text {concentration of }n\text {-aggregates} \end{aligned}$$
(13b)
$$\begin{aligned} X_n&: \text {concentration of monomers in }n\text {-aggregates, } = n\phi _n \ , \end{aligned}$$
(13c)

where an “n-aggregate” is a self-assembled aggregate of molecules consisting exactly of n molecules (or monomers or 1-aggregates). You may think of \(X_n\) in the following way: consider only the n-aggregates in solution (mentally remove all the others) and now ask, what is the overall concentration of all amphiphiles left in the system?

The total energy of one n-aggregate is of course \(E_n=n\varepsilon _n\). Observe that this does not imply that \(E_n\propto n\), since \(\varepsilon _n\) also depends on n. The energy density due to n-aggregates is therefore

$$\begin{aligned} e_n = \phi _nE_n = \phi _n n \varepsilon _n = X_n \varepsilon _n \ . \end{aligned}$$
(14)

For the (translational) entropy density of n-aggregates we will simply assume an ideal gas law, so that we get

$$\begin{aligned} s_n = -k_{\text {B}}\,\phi _n\big (\log \phi _n-1\big ) \ . \end{aligned}$$
(15)

The total free energy density is then the sum of the energetic and entropic terms over all aggregate sizes:

$$\begin{aligned} f = \sum _{n=1}^N\left\{ e_n-T s_n\right\} = \sum _{n=1}^N\left\{ X_n\varepsilon _n + k_{\text {B}}T\,\frac{X_n}{n}\left( \log \frac{X_n}{n}-1\right) \right\} \ , \end{aligned}$$
(16)

where N is the total number of molecules, and hence also the biggest aggregate we can get.

We are interested in the distribution function of aggregate sizes, \(X_n\), subject to the constraint that the total amount of material in the system is fixed, meaning

$$\begin{aligned} \sum _{n=1}^N X_n =: X = \text {fixed} \ , \end{aligned}$$
(17)

where X is the total monomer concentration in the system. We can calculate this distribution function by minimizing Eq. (16) subject to the constraint, which we enforce by means of a Lagrange multiplier \(\mu \):

$$\begin{aligned} 0 \mathop {=}\limits ^{!} \frac{\partial }{\partial X_n}\left\{ f[X_n] + \mu \left[ X-\sum _{m=1}^NX_m\right] \right\} \ . \end{aligned}$$
(18)

This readily gives

$$\begin{aligned} \phi _n = \text {e}^{-\beta n(\varepsilon _n-\mu )} \ , \end{aligned}$$
(19)

where as usual \(\beta =1/k_{\text {B}}T\). From this we in particular also get the monomer concentration \(\phi _1\), and so we can eliminate the Lagrange multiplier \(\mu \) from the expression:

$$\begin{aligned} \phi _n = \left[ \phi _1\,\text {e}^{\beta (\varepsilon _1-\varepsilon _n)}\right] ^n \ . \end{aligned}$$
(20)

This is a very important general result. How it plays out in reality depends entirely on \(\varepsilon _n\), which in turn depends crucially on the geometry of the aggregate: spherical, cylindrical, or planar. Regardless, we see that if \(\varepsilon _n<\varepsilon _1\), meaning that it is favorable for a monomer to be in an n-aggregate compared to being isolated in solution, the exponential factor becomes large and the concentration of n-aggregates goes up. But let us now specifically look at the individual geometries.
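The equivalence of Eqs. (19) and (20) can be checked numerically in a few lines. This is a minimal sketch with hypothetical function names; the energies and the chemical potential are made-up numbers in units of \(k_{\text {B}}T\) (so \(\beta =1\)) and carry no physical meaning.

```python
import math


def phi_from_mu(n, eps_n, mu, beta=1.0):
    """Eq. (19): concentration of n-aggregates at chemical potential mu."""
    return math.exp(-beta * n * (eps_n - mu))


def phi_from_monomers(n, eps_n, eps_1, phi_1, beta=1.0):
    """Eq. (20): the same concentration expressed through phi_1."""
    return (phi_1 * math.exp(beta * (eps_1 - eps_n))) ** n


# Hypothetical energies in units of k_B T, for illustration only.
eps_1, eps_5, mu = 0.0, -2.0, -3.0
phi_1 = phi_from_mu(1, eps_1, mu)
print(phi_from_mu(5, eps_5, mu), phi_from_monomers(5, eps_5, eps_1, phi_1))
```

Both routes give the same number because the Lagrange multiplier enters only through \(\phi _1\).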

Spherical micelles. What is the energy of a monomer in a micelle consisting of n monomers? This is potentially a difficult question, but we will circumvent it by looking at the physics: packing monomers of some particular curvature into a spherical aggregate will likely result in some particular size—say, m—at which they fit best, and deviations away from that size will be suboptimal. Let us hence assume that, to lowest order, the energy is simply quadratic in the deviation from that particular optimal state:

$$\begin{aligned} \varepsilon _n = \varepsilon _m + \frac{1}{2}\varepsilon ^*(n-m)^2 \ . \end{aligned}$$
(21)

Inserting this into Eq. (19) leads to

$$\begin{aligned} \phi _n = \exp \left\{ -\beta n\left( \varepsilon _m + \frac{1}{2}\varepsilon ^*(n-m)^2 - \mu \right) \right\} \ , \end{aligned}$$
(22)

where \(\mu \) needs to be determined from the normalization condition (17). Notice that the exponent of this distribution is cubic in n. However, we can simplify it by expanding the exponent around its maximum, up to quadratic order, and hence find an approximate Gaussian distribution that describes \(\phi _n\) reasonably well. To do so, we need to calculate

$$\begin{aligned} 0\mathop {=}\limits ^{!} \frac{\partial }{\partial n}\left[ -\beta n\left( \varepsilon _m + \frac{1}{2}\varepsilon ^*(n-m)^2 - \mu \right) \right] \ , \end{aligned}$$
(23)

which leads to the solution \(n^*\) at which the function peaks:

$$\begin{aligned} n^*= \frac{m}{3}\left[ 2+\sqrt{1-\frac{6(\varepsilon _m-\mu )}{\varepsilon ^*m^2}}\right] \approx m - \frac{\varepsilon _m-\mu }{\varepsilon ^*m} \ , \end{aligned}$$
(24)

where the approximation results from expanding the square root to first order, since the term \(6(\varepsilon _m-\mu )/\varepsilon ^*m^2\) is small. We then find the quadratic expansion

$$\begin{aligned} n\left( \varepsilon _m + \frac{1}{2}\varepsilon ^*(n-m)^2 - \mu \right) \approx \text {const.} + \frac{1}{2}\varepsilon ^*m\sqrt{1-\frac{6(\varepsilon _m-\mu )}{\varepsilon ^*m^2}}(n-n^*)^2 \ . \end{aligned}$$
(25)

This shows that the micelle distribution can be approximated as a Gaussian,

$$\begin{aligned} \phi _n\approx \text {const.}\times \exp \left\{ -\frac{(n-n^*)^2}{2\sigma ^2}\right\} \ , \end{aligned}$$
(26)

with the mean value \(n^*\) given in Eq. (24) and the variance given by

$$\begin{aligned} \sigma ^2 = \frac{k_{\text {B}}T}{\varepsilon ^*m} \ . \end{aligned}$$
(27)

This shows that the distribution widens at larger temperature, and is narrower for bigger micelles.
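To get a feeling for Eqs. (24) and (27), one can plug in numbers. The sketch below uses hypothetical parameters (energies in units of \(k_{\text {B}}T\), with `delta` standing for \(\varepsilon _m-\mu \)); they are chosen for readability and are not taken from the text.

```python
import math


def peak_size(m, eps_star, delta):
    """Exact and first-order peak position n* from Eq. (24).

    delta stands for eps_m - mu; all energies are in units of k_B T.
    """
    root = math.sqrt(1.0 - 6.0 * delta / (eps_star * m**2))
    exact = m / 3.0 * (2.0 + root)
    approx = m - delta / (eps_star * m)
    return exact, approx


def variance(m, eps_star, kT=1.0):
    """Gaussian variance from Eq. (27)."""
    return kT / (eps_star * m)


# Hypothetical parameters, chosen only to make the numbers easy to read.
m, eps_star, delta = 60, 0.01, 0.5
exact, approx = peak_size(m, eps_star, delta)
print(exact, approx, math.sqrt(variance(m, eps_star)))
```

For these values the exact root and its first-order approximation differ by much less than one monomer, which illustrates why the expansion of the square root is harmless when \(6(\varepsilon _m-\mu )/\varepsilon ^*m^2\) is small.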

The effects on the structure of a single micelle are curious but minor in the spherical case; what is truly remarkable and very important is the overall aggregation thermodynamics which this model implies. In order to not get bogged down in tedious math (chiefly from dealing with the normalization condition (17)), let us instead look at a two-state system, in which we only have monomers coexisting with m-aggregates, and the normalization condition becomes \(X=\phi _1+m\phi _m\). Furthermore, we have

$$\begin{aligned} \phi _m \mathop {=}\limits ^{\text {(20)}} (\phi _1\text {e}^{\beta (\varepsilon _1-\varepsilon _m)})^m = (\phi _1\text {e}^\alpha )^m \ , \end{aligned}$$
(28)

where we defined \(\alpha =\beta (\varepsilon _1-\varepsilon _m)>0\) (we know the sign because we know that it is energetically favorable to form an m-aggregate). The normalization condition then becomes

$$\begin{aligned} X = \phi _1 + m\text {e}^{\alpha m}\;\phi _1^m \ . \end{aligned}$$
(29)

This must be solved for \(\phi _1\), but notice that this is an \(m^\text {th}\) order polynomial equation. This looks exceedingly troublesome, but it in fact becomes simple to get an approximate solution if we remember that m is likely large: recall from Sect. 1.1 and Fig. 2 that the number of surfactants in a spherical micelle can be written as \(N=4\pi l^2/a=4\pi l^2/\pi r^2=(2l/r)^2\), and with a reasonable estimate of \(a\approx 0.5\,\text {nm}^2\) (and hence \(r\approx 0.4\,\text {nm}\)) and \(l\approx 2\,\text {nm}\), we find \(N\approx 100\). We then see that the second term in Eq. (29) stays extremely small for small \(\phi _1\) and then very rapidly picks up and completely dominates the value of X (see the left hand graph in Fig. 5). The crossover happens where the two terms on the right hand side are approximately equal, leading to

$$\begin{aligned} \phi _1 = m\text {e}^{\alpha m}\phi _1^m \;\;\Longrightarrow \;\; \phi _1 = \left( \frac{1}{m}\right) ^{\frac{1}{m-1}}\text {e}^{-\frac{\alpha m}{m-1}} \approx \text {e}^{-\alpha } \ , \end{aligned}$$
(30)

where the approximation is very good because \(m\gg 1\) (recall in particular that \((1/m)^{1/m}\approx 1 - (\ln m)\,m^{-1}+\mathcal {O}(m^{-2})\)). This shows that a critical concentration exists, \(\phi _\text {cmc}=\text {e}^{-\alpha }\), at which something startling happens: up to that concentration, the normalization condition is dominated by \(\phi _1\), and this means that the solution consists almost exclusively of monomers. But at \(\phi _\text {cmc}\) the second term takes over, and from then on any extra material will almost exclusively go into aggregates. This is very visible if we plot the inverse of the normalization condition (see the right hand side of Fig. 5): the concentration of monomers initially grows linearly with the amount of added material, but it levels off quite abruptly at \(\phi _\text {cmc}\), meaning that from then on any additional material will form micelles, which so far did not exist. The concentration \(\phi _\text {cmc}\) is called the critical micelle concentration, usually abbreviated as “cmc,” and it is a fundamentally important quantity for any aggregation problem. We will soon see that the concept remains relevant beyond the case of spherical micelles we have discussed just now. Notice that \(\alpha =\beta (\varepsilon _1-\varepsilon _m)\) is not just positive, but can be a fair amount bigger than 1, since the energy which an amphiphile gains in an aggregate compared to being in isolation can be many \(k_{\text {B}}T\). This implies, in turn, that the cmc can be very low: not much material needs to be added before micelles form. For instance, the cmc for the standard surfactant sodium dodecyl sulfate (SDS) is about \(8\,\text {mM}\) in water at 25 \(^\circ \text {C}\), at which point the aggregation number of the micelles is \(m\approx 60\) (Turro and Yekta 1978).

Fig. 5

The left plot shows the total monomer concentration in all aggregates combined, X, as a function of the concentration of single monomers, \(\phi _1\). Since X emerges as a sum of \(\phi _1\) and a second term \(m(\phi _1/\phi _\text {cmc})^m\) with a large m (in the graph we chose \(m=50\)), there is a sharp crossover near \(\phi _1=\phi _\text {cmc}=\text {e}^{-\alpha }\). The right picture simply flips the axes and shows the monomer concentration \(\phi _1\) as a function of the total concentration of added amphiphiles. Initially, the monomer concentration grows linearly with the amount of added amphiphiles, up to the concentration \(\phi _\text {cmc}\), at which point it essentially stays constant

It should be noted that the micellization transition is not a phase transition in the classical sense: there is no discontinuity or non-analyticity in any of the thermodynamic functions; the transition is always rounded, since m is large but finite. Regardless, it is a very pronounced change in the system’s behavior, and as such it dominates aggregation physics.
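The rounded but pronounced crossover of the two-state model can be reproduced by solving Eq. (29) numerically. The following is a minimal sketch using plain bisection (a choice of this example, not of the text); the function name and the value \(\alpha =10\) are hypothetical, while \(m=50\) follows the choice mentioned for Fig. 5.

```python
import math


def monomer_concentration(X, m, alpha, iterations=200):
    """Solve X = phi_1 + m * (phi_1 * exp(alpha))**m for phi_1 by bisection."""
    lo, hi = 0.0, X  # phi_1 can never exceed the total concentration X
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        total = mid + m * (mid * math.exp(alpha)) ** m
        if total < X:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


m, alpha = 50, 10.0  # alpha is an illustrative value, m = 50 as in Fig. 5
phi_cmc = math.exp(-alpha)
for ratio in (0.1, 0.5, 1.0, 2.0, 10.0):
    phi_1 = monomer_concentration(ratio * phi_cmc, m, alpha)
    print(ratio, phi_1 / phi_cmc)
```

Bisection works here because the right hand side of Eq. (29) increases monotonically with \(\phi _1\). For finite m the plateau sits somewhat below \(\text {e}^{-\alpha }\) and keeps creeping upward slowly, consistent with the statement that the transition is rounded.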

Cylindrical micelles. The difference between the spherical and the cylindrical case enters via the energy per monomer in an aggregate, \(\varepsilon _n\). For spheres we made the reasonable assumption in Eq. (21) that there is a typical size for a micelle, and that the energy will deviate quadratically as we move away from that value. This cannot be true for cylindrical micelles, though, since they have an unspecified length: we can easily make cylindrical micelles longer by simply adding more amphiphiles to the linear part. The aggregation energy of these amphiphiles will always be the same, for they cannot know how long the cylindrical aggregate is of which they are a part. However, amphiphiles at the two end caps of the micelle must have a different energy, and it must be larger than the energy of amphiphiles in the wormlike middle, for if that were not so, spherical micelles would form in the first place. It is hence reasonable to write the total energy of a cylindrical micelle of n monomers as \(E_n=n\varepsilon _\infty +2E_\text {cap}\), and hence the energy per monomer is

$$\begin{aligned} \varepsilon _n = \varepsilon _\infty + \frac{2E_\text {cap}}{n} =: \varepsilon _\infty + \frac{\alpha \,k_{\text {B}}T}{n} \ . \end{aligned}$$
(31)

Notice that the dimensionless number \(\alpha \) must be large: it is the excess energy (in units of \(k_{\text {B}}T\)) of all end-cap monomers. Since these caps consist of two hemispheres, they together make up essentially one full spherical micelle, whose aggregation number is \(\mathcal {O}(100)\), and it seems fair to estimate that the excess energy for each monomer stuck in the wrong local geometry is at least a sizable fraction of \(k_{\text {B}}T\).

Inserting this ansatz for \(\varepsilon _n\) into Eq. (20), we get

$$\begin{aligned} \phi _n = \left[ \phi _1\,\text {e}^{\beta (\varepsilon _1-\varepsilon _\infty -\alpha \,k_{\text {B}}T/n)}\right] ^n = \left[ \phi _1\,\text {e}^{\beta (\varepsilon _1-\varepsilon _\infty )}\right] ^n\text {e}^{-\alpha } = \left[ \phi _1\,\text {e}^{\alpha }\right] ^n\text {e}^{-\alpha } \ , \end{aligned}$$
(32)

where the last step follows since Eq. (31) must also hold for \(n=1\), which gives \(\beta (\varepsilon _1-\varepsilon _\infty )=\alpha \).

It is now highly useful to define the scaled concentrations \(\tilde{\phi }_n=\phi _n\text {e}^\alpha \), because in these variables Eq. (32) becomes

$$\begin{aligned} \tilde{\phi }_n = \tilde{\phi }_1^n \ . \end{aligned}$$
(33)

The distribution of the \(\tilde{\phi }_n\) is exponential, which is remarkably wide (we will make this more precise below) and very different from the spherical case, where the distribution was sharply peaked around an optimal size. Notice that in order for it to be normalizable, we must have \(\tilde{\phi }_1<1\), implying that the monomer concentration can never exceed \(\text {e}^{-\alpha }\)—a concentration we will soon recognize as the cmc for the cylindrical case.

If we define the scaled total concentration of monomers as \(\tilde{X}=X\text {e}^\alpha \), the normalization condition (17) becomes

$$\begin{aligned} \tilde{X} = \sum _{n=1}^N n\,\tilde{\phi }_n = \sum _{n=1}^N n\,\tilde{\phi }_1^n \ . \end{aligned}$$
(34)

Sums of this type can be done by the following elegant trick:

$$\begin{aligned} \sum _{n=1}^N n^bx^n = \sum _{n=1}^N \left( x\frac{\partial }{\partial x}\right) ^bx^n = \left( x\frac{\partial }{\partial x}\right) ^b \sum _{n=1}^N x^n = \left( x\frac{\partial }{\partial x}\right) ^b\frac{x-x^{N+1}}{1-x} \ , \end{aligned}$$
(35)

where in the last step we summed the well-known geometric series. Moreover, since we know that in our case \(x<1\) and N is very large, we can drop the \(x^{N+1}\) term (or, equivalently, set \(N\rightarrow \infty \)), and so we for instance find

$$\begin{aligned} \sum _{n=1}^\infty n\,x^n&= \left( x\frac{\partial }{\partial x}\right) \frac{x}{1-x} =\frac{x}{(1-x)^2} \ , \end{aligned}$$
(36a)
$$\begin{aligned} \sum _{n=1}^\infty n^2\,x^n&= \left( x\frac{\partial }{\partial x}\right) ^2\!\!\frac{x}{1-x} = \frac{x(1+x)}{(1-x)^3} \ , \end{aligned}$$
(36b)
$$\begin{aligned} \sum _{n=1}^\infty n^3\,x^n&= \left( x\frac{\partial }{\partial x}\right) ^3\!\!\frac{x}{1-x} = \frac{x(1+x(4+x))}{(1-x)^4} \ . \end{aligned}$$
(36c)
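The closed forms (36a) to (36c) can be checked against brute-force partial sums. This is a small sketch with hypothetical function names; the value \(x=0.9\) and the number of terms are arbitrary choices.

```python
def partial_sum(power, x, terms=2000):
    """Brute-force sum of n**power * x**n for n = 1 .. terms."""
    return sum(n**power * x**n for n in range(1, terms + 1))


def closed_form(power, x):
    """Right-hand sides of Eqs. (36a), (36b), and (36c)."""
    if power == 1:
        return x / (1 - x) ** 2
    if power == 2:
        return x * (1 + x) / (1 - x) ** 3
    if power == 3:
        return x * (1 + x * (4 + x)) / (1 - x) ** 4
    raise ValueError("only powers 1, 2, 3 are tabulated here")


x = 0.9  # any 0 <= x < 1 works; closer to 1 needs more terms
for power in (1, 2, 3):
    print(power, partial_sum(power, x), closed_form(power, x))
```

With 2000 terms the neglected tail is far below floating-point resolution at \(x=0.9\), which mirrors the step of dropping the \(x^{N+1}\) term.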

Hence, using Eq. (36a), the normalization condition (34) becomes a quadratic equation for \(\tilde{\phi }_1\) that is easy to solve

$$\begin{aligned} \tilde{X} = \frac{\tilde{\phi }_1}{(1-\tilde{\phi }_1)^2} \;\;\;\Longrightarrow \;\;\; \tilde{\phi }_{1\pm } = \frac{1+2\tilde{X}\pm \sqrt{1+4\tilde{X}}}{2\tilde{X}} \ . \end{aligned}$$
(37)

Since we know \(\tilde{\phi }_1<1\), the minus sign is the correct choice. Expanding the solution for small and large \(\tilde{X}\), we find

$$\begin{aligned} \tilde{\phi }_1 = \left\{ \begin{array}{ccc} \tilde{X} + \mathcal {O}(\tilde{X}^2) &{} : &{} \tilde{X}\ll 1 \\ [0.5em] 1-1/\sqrt{\tilde{X}} +\mathcal {O}(\tilde{X}^{-1}) &{} : &{} \tilde{X}\gg 1 \end{array} \right. . \end{aligned}$$
(38)

As promised, we can again define a cmc, \(\phi _\text {cmc}=\text {e}^{-\alpha }\), such that below the cmc the monomer concentration in our solution is proportional to the amount of added material, while for concentrations larger than the cmc any added material goes into micelles, leaving the monomer concentration below \(\phi _\text {cmc}\), and approaching it with a very slow \(1/\sqrt{X}\) asymptotics. This is illustrated in Fig. 6.

Fig. 6

Monomer concentration for the case of a cylindrical micelle aggregation scenario. The dashed and dotted curves indicate the small- and large-concentration limits from Eq. (38). The full solution shows a crossover at the cmc
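The solution (37) and its limits (38) can be evaluated directly. The sketch below is an illustration with a hypothetical function name; it writes the minus-sign root in the algebraically equivalent form \(2\tilde{X}/(1+2\tilde{X}+\sqrt{1+4\tilde{X}})\), which follows because the two roots of the quadratic multiply to 1 and which avoids numerical cancellation at small \(\tilde{X}\).

```python
import math


def scaled_monomer_concentration(X_scaled):
    """Physical (minus-sign) root of Eq. (37), with X_scaled = X * exp(alpha).

    Written as 2X / (1 + 2X + sqrt(1 + 4X)), which equals the minus-sign
    root because the product of the two roots is 1.
    """
    return 2 * X_scaled / (1 + 2 * X_scaled + math.sqrt(1 + 4 * X_scaled))


for X_scaled in (1e-3, 1e-1, 1.0, 1e2, 1e4):
    phi = scaled_monomer_concentration(X_scaled)
    small_limit = X_scaled                       # leading term for X << 1
    large_limit = 1 - 1 / math.sqrt(X_scaled)    # leading terms for X >> 1
    print(X_scaled, phi, small_limit, large_limit)
```

The large-concentration limit is only meaningful for \(\tilde{X}\gg 1\); for small \(\tilde{X}\) it even turns negative, so the two columns should be read in their respective regimes.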

We already know that the distribution of micelle sizes is exponential, but we might also want to know what the mean and the variance are. These are easily calculated by working out (weight-averaged) moments of n. For the first one, we find

$$\begin{aligned} \langle n\rangle = \frac{\sum _{n=1}^\infty n\tilde{X}_n}{\sum _{n=1}^\infty \tilde{X}_n} = \frac{\sum _{n=1}^\infty n^2\tilde{\phi }_n}{\sum _{n=1}^\infty n \tilde{\phi }_n} \mathop {=}\limits ^{*} \frac{1+\tilde{\phi }_1}{1-\tilde{\phi }_1} \mathop {=}\limits ^{\#} \sqrt{1+4\tilde{X}} \ , \end{aligned}$$
(39)

where at \(*\) we used Eqs. (36a) and (36b) and at \(\#\) we inserted the solution (37). Hence, the average micelle length grows like the square root of the concentration: \(\langle n\rangle \approx 2\sqrt{X/\phi _\text {cmc}}\).

The second moment of n is given by

$$\begin{aligned} \langle n^2\rangle = \frac{\sum _{n=1}^\infty n^2\tilde{X}_n}{\sum _{n=1}^\infty \tilde{X}_n} = \frac{\sum _{n=1}^\infty n^3\tilde{\phi }_n}{\sum _{n=1}^\infty n \tilde{\phi }_n} \mathop {=}\limits ^{*} \frac{1+\tilde{\phi }_1(4+\tilde{\phi }_1)}{(1-\tilde{\phi }_1)^2} \ , \end{aligned}$$
(40)

where at \(*\) we used Eqs. (36a) and (36c). Hence, the variance of n is

$$\begin{aligned} \sigma _n^2 = \langle n^2\rangle - \langle n\rangle ^2 = \frac{2\tilde{\phi }_1}{(1-\tilde{\phi }_1)^2} \mathop {=}\limits ^{\#} 2\tilde{X} \ , \end{aligned}$$
(41)

where at \(\#\) we again used the solution (37). This answer is important, because it shows that the width of the distribution essentially scales with its mean, and hence

$$\begin{aligned} \frac{\sigma _n}{\langle {n}\rangle } = \sqrt{\frac{2\tilde{X}}{1+4\tilde{X}}} = \frac{1}{\sqrt{2}} - {\mathcal {O}}(\tilde{X}^{-1}) . \end{aligned}$$
(42)

Distributions of cylindrical micelles are hence “wide” no matter how large the micelles are; there is no “law of large micelles,” or a \(1/\sqrt{n}\) like asymptotics toward a sharp mean. Remarkable as this is, it is of course not unexpected, for that is what exponential distributions do.
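The results (39), (41), and (42) can be confirmed by computing the weight-averaged moments directly from the exponential distribution. This is a minimal sketch; the function name, the value \(\tilde{\phi }_1=0.99\), and the truncation of the sum are assumptions of the example.

```python
def size_statistics(phi1_scaled, terms=20000):
    """Weight-averaged mean and variance of n for phi_n proportional to phi1_scaled**n."""
    # X_n = n * phi_n, up to a constant factor that cancels in the averages
    weights = [n * phi1_scaled**n for n in range(1, terms + 1)]
    norm = sum(weights)
    mean = sum(n * w for n, w in enumerate(weights, start=1)) / norm
    second = sum(n * n * w for n, w in enumerate(weights, start=1)) / norm
    return mean, second - mean**2


x = 0.99  # scaled monomer concentration, just below its limiting value 1
mean, var = size_statistics(x)
X_scaled = x / (1 - x) ** 2  # Eq. (37) read from right to left
print(mean, (1 + 4 * X_scaled) ** 0.5)   # compare with Eq. (39)
print(var, 2 * X_scaled)                 # compare with Eq. (41)
print(var**0.5 / mean)                   # compare with Eq. (42)
```

The relative width stays close to \(1/\sqrt{2}\) even though the mean aggregation number is already large, in line with the absence of a "law of large micelles."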

Planar bilayers. Again, the first question to address is: what is \(\varepsilon _n\) for an aggregate that assembles in a planar fashion? To make headway, though, we need to make further assumptions about its geometry. We will assume that it stays flat, and that it will be circular. The latter follows because the amphiphiles at the bilayer disc’s edge will have a higher free energy per molecule than those in the flat region (for reasons analogous to the elevated free energy of monomers at the ends of cylindrical micelles). This excess free energy per unit length acts as a line tension (in this case usually called edge tension), and minimizing it at constant overall area of the aggregate means that the shape has to be a circle.

If the circular aggregate has area \(A=\pi R^2\), its circumference is \(C=2\pi R=2\sqrt{\pi A}\). The excess free energy of the edge is \(E_\text {edge}=2\pi R\gamma =2\sqrt{\pi A}\gamma \), with \(\gamma \) being the edge tension—a material parameter. Since the number of lipids in the aggregate is approximately \(n=2A/a_\ell \), with \(a_\ell \) being the area per lipid, we get \(A=\frac{1}{2}na_\ell \), and hence \(E_\text {edge}=\sqrt{2\pi na_\ell }\gamma \). The replacement for Eq. (31) is hence

$$\begin{aligned} \varepsilon _n = \varepsilon _\infty + \frac{E_\text {edge}}{n} = \varepsilon _\infty + \frac{\sqrt{2\pi na_\ell }\gamma }{n} = \varepsilon _\infty + \frac{\alpha \,k_{\text {B}}T}{\sqrt{n}} \ , \end{aligned}$$
(43)

where \(\alpha =\sqrt{2\pi a_\ell }\beta \gamma \) is a dimensionless number that’s again a fair bit larger than 1. To estimate it, let’s take the DOPC values of \(a_\ell \simeq 0.7\,\text {nm}^2\) (Kučerka et al. 2006) and \(\gamma \simeq 20\,\text {pN}\) (Portet and Dimova 2010), from which we get \(\alpha \approx 10\). Notice that the only difference between the cylindrical and the planar case is that in the latter the excess term is proportional to \(1/\sqrt{n}\) instead of 1 / n. We will see that this changes the physics in a big way.
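The estimate of \(\alpha \) is a short piece of arithmetic. In the sketch below, the area per lipid and the edge tension are the values quoted above; the temperature of 298.15 K is an assumption of this example, since the text does not state it.

```python
import math

k_B = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)
T = 298.15           # K; assumed room temperature, not stated in the text
a_lipid = 0.7e-18    # m^2, area per lipid quoted for DOPC
gamma = 20e-12       # N, edge tension quoted for DOPC

# alpha = sqrt(2 * pi * a_lipid) * gamma / (k_B * T), see Eq. (43)
alpha = math.sqrt(2 * math.pi * a_lipid) * gamma / (k_B * T)
print(round(alpha, 1))
```

The length \(\sqrt{2\pi a_\ell }\) is about 2 nm, so \(\alpha \) is essentially the edge energy of a 2 nm stretch of rim measured in units of \(k_{\text {B}}T\).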

Inserting this expression for the energy per monomer into the general form of the aggregate distribution, Eq. (20), we get

$$\begin{aligned} X_n = n\,\phi _n&= n\left[ \phi _1\,\text {e}^{\beta (\varepsilon _1-\varepsilon _\infty -\alpha \,k_{\text {B}}T/\sqrt{n})}\right] ^n \nonumber \\&= n\left[ \phi _1\,\text {e}^{\beta (\varepsilon _1-\varepsilon _\infty )}\right] ^n\!\text {e}^{-\alpha \sqrt{n}} \nonumber \\&= n\left[ \phi _1\,\text {e}^{\alpha }\right] ^n\text {e}^{-\alpha \sqrt{n}} \ , \end{aligned}$$
(44)

where the last step again follows because this equation must also be true for \(n=1\). The normalization condition (17) then becomes

$$\begin{aligned} X = \sum _{n=1}^N X_n = \sum _{n=1}^N n\left[ \phi _1\,\text {e}^{\alpha }\right] ^n\text {e}^{-\alpha \sqrt{n}} \ . \end{aligned}$$
(45)

The term \(\text {e}^{-\alpha \sqrt{n}}\) decreases with n, while for the term \([\phi _1\,\text {e}^{\alpha }]^n\) the asymptotic behavior depends on whether \(\phi _1\text {e}^\alpha \) is bigger or smaller than 1. Assume it is bigger than 1. Then this term grows with n, and it asymptotically grows faster than \(\text {e}^{-\alpha \sqrt{n}}\) decreases. This might get us worried, for if we again replace \(N\rightarrow \infty \) (because N will be macroscopically big), the sum in Eq. (45) would diverge. So let us assume that, instead, the expression \(\phi _1\text {e}^\alpha \) is smaller than 1. In that case, we can calculate

$$\begin{aligned} X = \sum _{n=1}^\infty n\left[ \phi _1\,\text {e}^{\alpha }\right] ^n\text {e}^{-\alpha \sqrt{n}} \le \sum _{n=1}^\infty n\,\text {e}^{-\alpha \sqrt{n}} \approx \int _0^\infty \text {d}n\; n\,\text {e}^{-\alpha \sqrt{n}} = \frac{12}{\alpha ^4} \ . \end{aligned}$$
(46)
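The value of the integral in Eq. (46) can be checked with the substitution \(n=u^2\) (so that \(\text {d}n=2u\,\text {d}u\)), which turns it into a standard integral of Gamma-function type:

$$\begin{aligned} \int _0^\infty \text {d}n\; n\,\text {e}^{-\alpha \sqrt{n}} = 2\int _0^\infty \text {d}u\; u^3\,\text {e}^{-\alpha u} = 2\cdot \frac{3!}{\alpha ^4} = \frac{12}{\alpha ^4} \ . \end{aligned}$$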

This is a pretty disastrous finding, though: apparently, the total amount of material we can add to the system is bounded from above. What if we wanted to add more material—who is going to stop us? (Not excluded volume—that was not part of the model!)

The solution to this conundrum is subtle: the assumption that N can be replaced by infinity is wrong—despite the fact that N could really be an Avogadro number of molecules. But large is not the same as infinite, and the normalization condition only enforces \(\phi _1\text {e}^\alpha \le 1\) if we really sum all the way up to infinity. If the sum is finite, there is no reason to demand that \(\phi _1\text {e}^\alpha \le 1\), because finite sums cannot diverge! More specifically, even if this term would ultimately outcompete \(\text {e}^{-\alpha \sqrt{n}}\), if \(\phi _1\text {e}^\alpha \) is only ever so slightly bigger than 1, this will only happen near the upper bound of the sum—showing us that the value of this sum will likely depend very critically on just how much \(\phi _1\text {e}^\alpha \) exceeds 1.

Unfortunately, it is quite tricky to see how this plays out analytically, because the normalization sum (45) turns out to be a very delicate interplay between very small and very large terms. To brace ourselves for what is actually happening here, we shall first look at a numerical example. Let us assume that \(\alpha =10\), that we have \(N=10,000\) molecules in the system (really an incredibly small number by experimental standards, but this might be a typical number to be used in a simulation), and let us demand that we want to ultimately gain a total concentration of \(X=10^{-2}\) (notice that this is larger than the erroneous upper bound of \(12/\alpha ^4=1.2\times 10^{-3}\)). If we abbreviate \(\tilde{\phi }_1=\phi _1\text {e}^\alpha \), then we have to numerically solve the following equation for \(\tilde{\phi }_1\):

$$\begin{aligned} 10^{-2} = \sum _{n=1}^{10,000} n\,\tilde{\phi }_1^n\,\text {e}^{-10\sqrt{n}} \ . \end{aligned}$$
(47)
Fig. 7 The solid curve is the right hand side of Eq. (47) as a function of the parameter \(\tilde{\phi }_1\), for \(\alpha =10\) and \(N=10,000\); the dashed curve is the large-n approximation from Eq. (49)

Figure 7 plots the right hand side of this equation as a function of the parameter \(\tilde{\phi }_1\) in the interesting range. Up to \(\tilde{\phi }_1\approx 1.1025\), the right hand side grows linearly (and extremely weakly) with \(\tilde{\phi }_1\), but at around this point a big change happens, and the sum picks up extremely rapidly, becoming a power law with an exponent of about 10,000. (This also shows why it is very hard to treat this problem numerically with even bigger values of N.) The value \(10^{-2}\) is reached at \(\tilde{\phi }_1\approx 1.10330764\) and hence \(X_1\approx 5.009\times 10^{-5}\).
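One possible way to obtain this number is sketched below. It is not necessarily how the quoted value was computed: the sketch assumes a plain bisection on \(\tilde{\phi }_1\) and evaluates the sum in Eq. (47) in logarithmic form, because \(\tilde{\phi }_1^n\) by itself would overflow double precision for \(n\) near 10,000. The function names and the bracketing interval [1.0, 1.2] are our own choices.

```python
import math

def log_normalization(phi_tilde, alpha=10.0, N=10_000):
    """Return ln of sum_{n=1}^{N} n * phi_tilde**n * exp(-alpha*sqrt(n))."""
    log_phi = math.log(phi_tilde)
    # Work with the logarithm of each summand to avoid overflow/underflow.
    log_terms = [math.log(n) + n * log_phi - alpha * math.sqrt(n)
                 for n in range(1, N + 1)]
    m = max(log_terms)  # log-sum-exp trick: factor out the largest term
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def solve_phi_tilde(X=1e-2, alpha=10.0, N=10_000, lo=1.0, hi=1.2, iters=60):
    """Bisection for phi_tilde; the sum grows monotonically with phi_tilde."""
    target = math.log(X)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_normalization(mid, alpha, N) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

phi_tilde = solve_phi_tilde()
X1 = phi_tilde * math.exp(-10.0)   # X_1 = phi_1 = phi_tilde * exp(-alpha)
print(f"phi_tilde = {phi_tilde:.8f}, X_1 = {X1:.3e}")
```

Bisection is chosen here over a Newton-type method because the sum is extremely steep beyond the uptake, which makes derivative-based steps unreliable unless the starting guess is already excellent.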

Fig. 8 The solid curve is the distribution function \(X_n=n\,\phi _n\) from Eq. (44), using the numerical parameters \(\alpha =10\), \(N=10,000\), and \(X=0.01\), which implies the numerical solution \(\tilde{\phi }_1\approx 1.10330764\) and hence \(X_1\approx 5.009\times 10^{-5}\). The dashed curve is the approximate distribution from Eq. (48), using the value for \(\tilde{\phi }_1\) determined via the first-order approximation in Eq. (52), \(\tilde{\phi }_1^{(1)}\approx 1.10330882\). Using \(\tilde{\phi }_1^{(1)}\) in the full distribution (instead of the exact \(\tilde{\phi }_1\)) leads to a curve that is indistinguishable from the exact one on this plot, with a normalization that is about 1% off

Inserting this value for \(\tilde{\phi }_1\) into the distribution function for \(X_n\) from Eq. (44), we can plot it over the entire range of permissible n values: from \(n=1\) to \(n=10,000\); this is done in Fig. 8. Initially, the distribution function drops precipitously: one finds \(X_2\approx 1.756\times 10^{-6}\approx X_1/30\) and \(X_3\approx 1.211\times 10^{-7}\approx X_1/400\). But at \(n=2566\) the function attains a minimum, after which it again begins to grow rapidly. At its largest n-value it becomes \(X_{10,000}\approx 4.696\times 10^{-4}\approx 10 X_1\), showing it is about 10 times more likely to find a lipid in that aggregate than to find it in isolation! Another way of looking at this is the following: 99% of all monomers are found in aggregates with a size of at least 9,890. And yet another illustration is the following: Look at the cumulative normalized distribution of \(X_n\), namely, \(f(m)=X^{-1}\sum _{n=1}^m X_n\). It rapidly rises from 0 to about 0.0052 when m rises from 1 to 10. However, after that it stays virtually constant, until about 9,800, when it begins to rise again. In other words, with the exception of about half a percent of small oligomers, virtually the whole system forms one giant aggregate.
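These numbers are easy to reproduce. The following sketch tabulates \(X_n\) from Eq. (44), taking the value \(\tilde{\phi }_1\approx 1.10330764\) quoted above as given input rather than recomputing it; the variable names are ours.

```python
import math

ALPHA, N, X_TOTAL = 10.0, 10_000, 1e-2
PHI_TILDE = 1.10330764   # numerical solution quoted in the text (taken as input)

def X_of_n(n):
    """X_n = n * phi_tilde**n * exp(-alpha*sqrt(n)), evaluated via its logarithm."""
    return math.exp(math.log(n) + n * math.log(PHI_TILDE) - ALPHA * math.sqrt(n))

X = [X_of_n(n) for n in range(1, N + 1)]           # X[0] holds X_1

n_min = 1 + min(range(N), key=X.__getitem__)       # location of the minimum
ratio_top = X[-1] / X[0]                           # X_N / X_1
f10 = sum(X[:10]) / X_TOTAL                        # cumulative fraction f(10)
frac_large = sum(X[9889:]) / X_TOTAL               # fraction in sizes n >= 9890

print(n_min, ratio_top, f10, frac_large)
```

Evaluating each \(X_n\) through its logarithm keeps the very large factor \(\tilde{\phi }_1^n\) and the very small factor \(\text {e}^{-\alpha \sqrt{n}}\) from overflowing or underflowing separately.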

With these observations we are now in a better position to develop a decent approximate solution for the normalization condition (45). Notice that we need to analytically describe the region in that sum which strongly increases (the “uptake” in Fig. 7), and that this comes from the aggregates—meaning, the large-n part of the distribution function. Hence it is probably a good idea to expand the summands in Eq. (45) around the upper end, \(n=N\), and preferably in such a fashion that we can perform the sum. But given the exponential variation of \(X_n\), it is wise to do that expansion in the exponent:

$$\begin{aligned} X_n = n\,\tilde{\phi }_1^n \text {e}^{-\alpha \sqrt{n}}&= \tilde{\phi }_1^n \exp \left\{ -\alpha \sqrt{n} + \ln n\right\} \nonumber \\&= \tilde{\phi }_1^n \exp \bigg \{ -\alpha \left[ \sqrt{N}+\frac{1}{2\sqrt{N}}(n-N)\right] \nonumber \\&\qquad \qquad + \ln N + \frac{1}{N}(n-N) + \mathcal {O}\Big ((n-N)^2\Big )\bigg \} \nonumber \\&\approx N\text {e}^{-\alpha \sqrt{N}/2}\left( \tilde{\phi }_1\,\text {e}^{-\alpha /2\sqrt{N}}\right) ^n \ . \end{aligned}$$
(48)

This expansion permits us to do the sum, since it turns into a simple geometric series:

$$\begin{aligned} \sum _{n=1}^N n\,\tilde{\phi }_1^n \text {e}^{-\alpha \sqrt{n}} \approx N\text {e}^{-\alpha \sqrt{N}/2}\frac{y^{N+1}-1}{y-1} \qquad \text {with}\quad y = \tilde{\phi }_1\,\text {e}^{-\alpha /2\sqrt{N}} \ . \end{aligned}$$
(49)

Since y is slightly larger than 1, but N is huge, \(y^{N+1}\) will be very large compared to 1. (In our above numerical example we would find \(y\approx 1.0494987\) and hence \(y^{N+1}\approx 1.0494987^{10,001}\approx 6.92\times 10^{209}\).) We can hence neglect the “\(-1\)” in the numerator, but of course not in the denominator.

The normalization condition now becomes

$$\begin{aligned} \Xi := \frac{X\,\text {e}^{\alpha \sqrt{N}/2}}{N} = \frac{y^{N+1}}{y-1} \ , \end{aligned}$$
(50)

but this is again impossible to solve analytically. However, we can get increasingly good approximations by iteration. First, recall that the right hand side really emerged as a geometric series, and so it is given by \(y^N+y^{N-1}+y^{N-2}+\cdots \). Let us take the dominant term, \(y^N\), and solve the equation. We then get

$$\begin{aligned} y = \Xi ^{\frac{1}{N}} \ . \end{aligned}$$
(51)

Even though only approximate, this already looks remarkably good, since it gives \(\tilde{\phi }_1=1.103645\) for our numerical example, about 0.03% off. And yet, inserting this value into the normalization condition gives a value about 20 times too big. We need to do better. In fact, we can improve the solution by iterating the defining equation, à la \(y^{(i+1)} = [\Xi (y^{(i)}-1)]^{1/(N+1)}\), where \(y^{(0)}=\Xi ^{1/N}\) is our initial simple result. At first order we get

$$\begin{aligned} \tilde{\phi }_1^{(1)}&= \text {e}^{\alpha /2\sqrt{N}} \; y^{(1)} \nonumber \\&= \text {e}^{\alpha /2\sqrt{N}}\,\left[ \Xi \left( \Xi ^{1/N}-1\right) \right] ^{1/(N+1)} \nonumber \\&= \text {e}^{\alpha /2\sqrt{N}}\left[ \frac{X}{N}\,\text {e}^{\alpha \sqrt{N}/2}\left( \left( \frac{X}{N}\right) ^{1/N}\!\!\!\text {e}^{\alpha /2\sqrt{N}}-1\right) \right] ^{1/(N+1)} \ . \end{aligned}$$
(52)

With the numerical example from above (\(X=0.01\), \(\alpha =10\), and \(N=10,000\)), this gives \(\tilde{\phi }_1^{(1)}= 1.10330882\), which differs from the exact numerical solution only by 1 part in \(10^6\), and now the normalization condition is only 1% off. Unfortunately, further iterations do not gain us much anymore, because we are still solving an approximate equation, not the exact one.
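The zeroth- and first-order estimates of Eqs. (51) and (52) can be evaluated with a few lines of code. The sketch below works with \(\ln \Xi \) instead of \(\Xi \) itself, an implementation choice of ours that keeps the intermediate numbers moderate even for much larger N; the function name is hypothetical.

```python
import math

def phi_tilde_estimates(X=1e-2, alpha=10.0, N=10_000):
    """Return (zeroth-order, first-order) estimates of phi_tilde, Eqs. (51)-(52)."""
    # ln(Xi) with Xi = X * exp(alpha*sqrt(N)/2) / N, see Eq. (50)
    log_xi = math.log(X / N) + alpha * math.sqrt(N) / 2.0
    shift = alpha / (2.0 * math.sqrt(N))   # phi_tilde = exp(shift) * y

    y0 = math.exp(log_xi / N)                                  # Eq. (51)
    y1 = math.exp((log_xi + math.log(y0 - 1.0)) / (N + 1))     # one iteration
    return math.exp(shift) * y0, math.exp(shift) * y1

phi0, phi1 = phi_tilde_estimates()
print(f"zeroth order: {phi0:.6f}, first order: {phi1:.8f}")
```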

There is more to be learned. First, even the simplest solution becomes exact in the thermodynamic limit \(N\rightarrow \infty \). Performing it, we get

$$\begin{aligned} \phi _1 = \text {e}^{-\alpha }\lim _{N\rightarrow \infty }\left\{ \text {e}^{\alpha /2\sqrt{N}}\left( \frac{X\,\text {e}^{\alpha \sqrt{N}/2}}{N}\right) ^{1/N}\right\} = \text {e}^{-\alpha } \ , \end{aligned}$$
(53)

showing that—again—we have a critical “micelle” concentration. Since bilayer patches are usually not viewed as “micelles,” this is more commonly called the critical aggregate concentration and abbreviated as “cac”: \(\phi _\text {cac}=\text {e}^{-\alpha }\).
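To see why the limit in Eq. (53) equals one at fixed total concentration X, it helps to look at the logarithm of the expression in braces:

$$\begin{aligned} \ln \left\{ \text {e}^{\alpha /2\sqrt{N}}\left( \frac{X\,\text {e}^{\alpha \sqrt{N}/2}}{N}\right) ^{1/N}\right\} = \frac{\alpha }{2\sqrt{N}} + \frac{1}{N}\left[ \ln X - \ln N + \frac{\alpha \sqrt{N}}{2}\right] = \frac{\alpha }{\sqrt{N}} + \frac{\ln X-\ln N}{N} \;\mathop {\longrightarrow }\limits ^{N\rightarrow \infty }\; 0 \ . \end{aligned}$$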

The scenario looks superficially similar to what we have seen in the spherical case: the normalization condition becomes a polynomial with a constant term, a linear term, and one term with a large power (compare Eqs. (29) and (50)), and the “largeness” of that power makes the transition. However, in the spherical micelle case that power was given by the micelle size, and hence it was mesoscopic—of order \(10^2\). In the bilayer case that power is macroscopic—the total number of molecules in the system, conceivably of the order of Avogadro’s number, but more importantly: extensive. It will by definition diverge in the thermodynamic limit. It hence follows that the aggregation transition for bilayers is a true phase transition—at least in the model we have studied here.

Alas, our model is defective. The \(1/\sqrt{n}\) correction to \(\varepsilon _n\) (see Eq. (43)), on which the whole scenario hinges, comes from the \(\sqrt{n}\) divergence of the edge energy for increasingly large flat circular aggregates. But bilayer patches do not have to stay flat. Once they exceed a critical size, it is preferable for them to close up, make an edgeless spherical vesicle, and pay bending energy instead, because bending energy does not scale with size. This was first discussed by Helfrich (1974). Hence, vesiculation caps the edge energy, moving the correction term back to a \(1/n\) form, for which we expect a wide exponential distribution function like in the case of cylindrical micelles. Unfortunately, in reality things are now a lot more complicated, because we can no longer ignore kinetics. In any case, we still encounter an aggregation transition once the amphiphile concentration in solution exceeds a critical aggregate concentration.

2 Fluid Elastic Sheets: From Three to Two Dimensions

The previous section has shown that there is something special about two-dimensional assemblies of amphiphiles. Spherical micelles are by construction microscopic, and cylindrical micelles are tenuous threads, constantly breaking and re-merging, with a corresponding wide length distribution. In contrast, two-dimensional amphiphilic sheets are endowed by thermodynamics with certain inalienable rights, among them extensivity, stability, and universal elasticity. They arise as macroscopic persistent entities, for which we therefore expect an effective large-scale theory to exist, whose key degrees of freedom are emergent and independent of the microscopic realization, and whose key physical parameters are functions of the underlying structure, but might as well be taken as fundamental at the emergent level.

This situation arises frequently in physics: a system is known to have an underlying structure, but we can describe it effectively (and very elegantly) at a level that completely ignores this structure. For instance, fluid dynamics need not know about atoms. Its laws follow from thermodynamics and symmetry, only its parameters (mass density and viscosity in the simplest case) reflect the details of the constituents. The same is true for elasticity theory, where we can see even more clearly how local microscopic symmetries leave traces in the macroscopic description (they dictate the number and type of elastic moduli).

In such a situation there are two ways to proceed, and they differ quite fundamentally in their “philosophy”:

  • Bottom-up approaches strive to reduce larger scale phenomenology to a microscopic description at a smaller scale that is considered more fundamental. In particular, they aim to elucidate the dependence of the larger scale parameters on the microscopic foundation.

  • Top-down approaches ignore the underlying structure and postulate a macroscopic theory from scratch, constrained only by symmetry. The parameters of this theory are not themselves predictable, for they depend on the microscopic structure which this approach purposefully ignores. But one can always measure them at the macroscopic level, so the endeavor is self-contained.

Both approaches are perfectly valid and have their own advantages and drawbacks. The top-down approach, for instance, need not wrestle with underlying microscopic degrees of freedom—say, trying to eliminate them by performing partial traces in phase space or other scale-bridging procedures. But decorating all symmetry-permissible terms with phenomenological parameters might be dangerous, for they need not be independent: a relation between them, enforced by subtleties of the underlying microphysics, could be missed. The bottom-up approach, in contrast, necessarily captures such effects, which is probably its biggest strength. But given our poor ability to actually do the math needed to rigorously coarse-grain a Hamiltonian, approximations along the way might cloud the path of emergence. Moreover, often we do not know the underlying microscopic theory all that well, and so we instead start with what we perceive to be a good model of the microphysics. This often works flawlessly, in the sense of giving a perfectly acceptable macroscopic theory—but this is to be expected: after all, hardly any microscopic details survive the emergence process. The macroscopic theory only depends on very generic symmetry considerations and the microscopic details matter only inasmuch as they predict macroscopic coefficients or produce correlations between them. If we cannot measure both the microscopic and the macroscopic parameters, it is very difficult to test whether these predicted connections are fulfilled, and hence it is usually impossible to be sure that our microscopic model was correct. This, of course, is the well-known bane of scientists looking for The Fundamental Laws: there is more than one way to skin a cat.

In our experience, three attitudes will deepen one’s understanding of the key physics: combining both approaches to elucidate the path of scale-bridging, being aware of what powers and limits each modus operandi, and being skeptical of too freely floating phenomenology as well as of suspiciously specific model-building. Indeed, one of the goals of the book you are holding is to explore this duality for lipid membranes, for which phenomenological geometric Hamiltonians can be written down, which in turn can also be motivated by underlying microphysics that considers the lipid constituents.

In this section we wish to discuss one particular connection between large-scale membrane theory and an underlying more microscopic model that is interesting because it is itself already coarse-grained. It is a description of a thin, but still three-dimensional, fluid elastic sheet, and the question is how to bootstrap ourselves up to larger scales and one dimension lower: large-scale two-dimensional curvature elastic surfaces. This program was proposed and worked through in a seminal paper by Hamm and Kozlov (2000). The goal of this section is to revisit their elegant derivation, while here and there keeping a few higher order terms that Hamm and Kozlov neglected but for which one can make good arguments.

Before we now dive into membrane elasticity—a little heads-up: unlike the previous section, this one will start to use numerous tools from surface differential geometry. The notation follows a recent review one of us has written (Deserno 2015), which introduces the basic formalism, derives most of the key identities, and also provides several applications to membrane elasticity. But then, our view and usage of differential geometry in this context has been very heavily influenced by Jemal Guven, who also has a chapter in this book. We hence strongly recommend that the reader also consults the master, not merely his apprentices.

2.1 The Starting Point: Thin Fluid Elastic Sheets

It is well-known that if \(u_{ij}\) is the Cauchy strain tensor, the most general quadratic expression for the elastic energy density we can write down is

$$\begin{aligned} e_\text {3d} = \frac{1}{2} \lambda _{ijkl} \, u_{ij} u_{kl} \ , \end{aligned}$$
(54)

where \(\lambda _{ijkl}\) is the elastic modulus tensor (Landau and Lifshitz 1986). Without loss of generality, the exchange symmetries \(i\leftrightarrow j\), \(k\leftrightarrow l\), and \(ij\leftrightarrow kl\) can be assumed, leaving at most 21 independent components. But we want to use this expression for the energy of a fluid lipid monolayer, and in that case additional symmetries reduce the number of components much further (Hamm and Kozlov 2000; Campelo et al. 2014).

Area strain. Assume the leaflet lies in the xy-plane. First note that the two reflection symmetries \((x,y,z)\rightarrow (-x,y,z)\) and \((x,y,z)\rightarrow (x,-y,z)\) imply that neither an x- nor a y-index can occur in \(\lambda _{ijkl}\) an odd number of times. Curiously, since the total number of indices is four, this implies that the same must hold for the z-index, even though a monolayer does not have an up-down reflection symmetry that would enforce this all by itself. Furthermore, one consequence of in-plane isotropy is that the x- and the y-directions are indistinguishable, and so their \(\lambda \)-coefficients must be equal. This already massively reduces the permissible terms to the following six:

$$\begin{aligned} e_\text {3d} =&\frac{1}{2}\lambda _{xxxx}\big (u_{xx}^2+u_{yy}^2\big ) + \lambda _{xxyy}u_{xx}u_{yy} +2\lambda _{xyxy}u_{xy}^2 \nonumber \\&+\lambda _{xxzz}\big (u_{xx}+u_{yy}\big )u_{zz} + 2\lambda _{xzxz}\big (u_{xz}^2+u_{yz}^2\big ) + \frac{1}{2}\lambda _{zzzz}u_{zz}^2 \ , \end{aligned}$$
(55)

where the prefactors account for obvious permutation multiplicities—such as \(\lambda _{xyxy}=\lambda _{yxyx}=\lambda _{xyyx}=\lambda _{yxxy}\). It is now useful to rework the quadratic strain expressions in the following way:

$$\begin{aligned} e_\text {3d} =&\frac{1}{2}\lambda _{xxxx}\big (u_{xx}+u_{yy}\big )^2 + \big (\lambda _{xxyy}-\lambda _{xxxx}\big )\big (u_{xx}u_{yy}-u_{xy}^2\big ) \nonumber \\&+\big (2\lambda _{xyxy}+\lambda _{xxyy}-\lambda _{xxxx}\big )\, u_{xy}^2 \nonumber \\&+\lambda _{xxzz}\big (u_{xx}+u_{yy}\big )u_{zz} + 2\lambda _{xzxz}\big (u_{xz}^2+u_{yz}^2\big ) + \frac{1}{2}\lambda _{zzzz}u_{zz}^2 \ . \end{aligned}$$
(56)

At this point we can exploit full in-plane rotational symmetry. The first two strain terms in Eq. (56) are quadratic invariants under in-plane rotation: they are (i) the square of the trace and (ii) the determinant of the strain tensor’s xy-subspace, respectively. But the term in the second line is not an invariant, and there is no term left to combine it with to remedy this flaw; hence, this term must vanish.

Next, let us make use of in-plane fluidity, which implies that the energy cannot change under in-plane shear deformations—meaning, in-plane shear stresses must vanish. One such deformation is a simple shear, \(u_{xy}\), and its associated shear stress is

$$\begin{aligned} 0 \mathop {=}\limits ^{!} \sigma _{xy}=\frac{\partial e_\text {3d}}{\partial u_{xy}} = -2\big (\lambda _{xxyy}-\lambda _{xxxx}\big )\,u_{xy} \ . \end{aligned}$$
(57)

Since this must hold for \(u_{xy}\ne 0\), we must have \(\lambda _{xxyy}=\lambda _{xxxx}\), and so the second term in Eq. (56) must vanish, too.

Finally, recall that we intend to describe a thin leaflet, which has the following consequence: the normal stress \(\sigma _{zz}\) at the leaflet’s upper and lower surface vanishes if the surface is free, but since the leaflet is thin, \(\sigma _{zz}\) does not have much opportunity to considerably grow anywhere within the material. We will hence assume that it vanishes throughout the material, and this implies

$$\begin{aligned} 0 \mathop {=}\limits ^{!} \sigma _{zz}=\frac{\partial e_\text {3d}}{\partial u_{zz}} = \lambda _{xxzz}\big (u_{xx}+u_{yy}\big )+\lambda _{zzzz}u_{zz} \ , \end{aligned}$$
(58)

and this means that the in-plane and transverse strains are related by

$$\begin{aligned} u_{zz} = -\frac{\lambda _{xxzz}}{\lambda _{zzzz}}\big (u_{xx}+u_{yy}\big ) =: -\tilde{\nu }\,\big (u_{xx}+u_{yy}\big ) \ . \end{aligned}$$
(59)

The dimensionless parameter \(\tilde{\nu }\) is related to the usual Poisson ratio \(\nu \) via \(\tilde{\nu }=\nu /(1-\nu )\). Inserting this into Eq. (56), what remains is

$$\begin{aligned} e_\text {3d} = \frac{1}{2} \tilde{E} (u_{xx} + u_{yy})^2 + 2 \lambda _{xzxz} (u_{xz}^2 + u_{yz}^2) \ , \end{aligned}$$
(60)

where we defined the effective modulus

$$\begin{aligned} \tilde{E} = \lambda _{xxxx} - \frac{\lambda _{xxzz}^2}{\lambda _{zzzz}} =\lambda _{xxxx}-\tilde{\nu }^2\lambda _{zzzz} \ . \end{aligned}$$
(61)

The first part in the energy (60) has now been recast in terms of a local area strain, which we will soon relate to the extent of bending.
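The elimination of \(u_{zz}\) that leads from Eq. (56) to Eqs. (60) and (61) can be checked numerically. The sketch below uses arbitrary illustrative moduli (they are not material values from the text) and only considers the part of the energy that depends on the trace \(s=u_{xx}+u_{yy}\) and on \(u_{zz}\), after the in-plane shear terms have been removed.

```python
import math

def trace_energy(s, u_zz, lam_xxxx, lam_xxzz, lam_zzzz):
    """Part of Eq. (56) depending on s = u_xx + u_yy and u_zz (shear terms dropped)."""
    return 0.5 * lam_xxxx * s**2 + lam_xxzz * s * u_zz + 0.5 * lam_zzzz * u_zz**2

# ASSUMPTION: arbitrary moduli in arbitrary units, chosen only for illustration.
lam_xxxx, lam_xxzz, lam_zzzz = 3.0, 1.0, 2.5
s = 0.02                                   # small in-plane area strain

nu_tilde = lam_xxzz / lam_zzzz             # Eq. (59)
u_zz = -nu_tilde * s                       # transverse strain slaved to s
sigma_zz = lam_xxzz * s + lam_zzzz * u_zz  # Eq. (58), should vanish

E_tilde = lam_xxxx - lam_xxzz**2 / lam_zzzz        # Eq. (61)
lhs = trace_energy(s, u_zz, lam_xxxx, lam_xxzz, lam_zzzz)
rhs = 0.5 * E_tilde * s**2                         # first term of Eq. (60)

nu = nu_tilde / (1.0 + nu_tilde)           # inverts nu_tilde = nu / (1 - nu)
print(sigma_zz, lhs, rhs, nu)
```

The two energies agree to rounding error, which confirms that the condition \(\sigma _{zz}=0\) simply renormalizes the stretching modulus from \(\lambda _{xxxx}\) to \(\tilde{E}\).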

Fig. 9 The material director \(\hat{\varvec{d}}\) in a flat thin plate is by construction aligned with the local surface normal \(\hat{\varvec{n}}\). But upon bending, \(\hat{\varvec{d}}\) may deviate from \(\hat{\varvec{n}}\) by an angle \(\theta \) due to transverse shear. (Figure adapted from Reddy (2006))

Lipid tilt. For the second term in Eq. (60), a connection to area strain is not possible, because the strains \(u_{xz}\) and \(u_{yz}\) correspond to a local transverse shear, i.e., a deformation related to the fact that the material director of a sheet need not coincide with the surface normal, even if it does so for the flat sheet (see Fig. 9). This term can instead be related to lipid tilt, provided we decide that a lipid’s orientation is the appropriate indicator for the local material director (see footnote 1). To do so quantitatively, it is useful to define a locally transverse tilt-field \({\varvec{T}}\) that measures the deviation between material director and surface normal (Hamm and Kozlov 2000):

$$\begin{aligned} {\varvec{T}}= T^l{\varvec{e}}_l = \frac{\hat{\varvec{d}}}{\hat{\varvec{n}}\cdot \hat{\varvec{d}}} - \hat{\varvec{n}}\ . \end{aligned}$$
(62)

This definition makes the transversality of \({\varvec{T}}\) manifest, since \({\varvec{T}}\cdot \hat{\varvec{n}}=0\) by construction (i.e., independent of any other conditions that would have to hold, such as \({\varvec{T}}\) being the solution of some Euler–Lagrange equation). Also, \({\varvec{T}}\) is not normalized; instead, its magnitude is \(|{\varvec{T}}|=\tan \theta \), where \(\theta \) is the tilt angle (i.e., the angle between \(\hat{\varvec{d}}\) and \(\hat{\varvec{n}}\)). Alternatively, we can write

$$\begin{aligned} \frac{1}{\cos \theta }=\frac{1}{\cos \arctan |{\varvec{T}}|} = \sqrt{1+{\varvec{T}}^2} = 1+\frac{1}{2}{\varvec{T}}^2+\mathcal {O}\big (({\varvec{T}}^2)^2\big ) \ . \end{aligned}$$
(63)
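The properties of the tilt field just stated are easy to verify for a concrete case. The following sketch builds \({\varvec{T}}\) from Eq. (62) for a director tilted by an arbitrarily chosen angle of 20° away from a normal along z; the helper name `tilt` and the example vectors are ours.

```python
import math

def tilt(d, n):
    """Tilt vector T = d / (n . d) - n for unit vectors d and n, Eq. (62)."""
    n_dot_d = sum(di * ni for di, ni in zip(d, n))
    return tuple(di / n_dot_d - ni for di, ni in zip(d, n))

theta = math.radians(20.0)                       # ASSUMPTION: example tilt angle
n = (0.0, 0.0, 1.0)                              # surface normal
d = (math.sin(theta), 0.0, math.cos(theta))      # director tilted in the xz-plane

T = tilt(d, n)
T_dot_n = sum(ti * ni for ti, ni in zip(T, n))   # transversality check
T_norm = math.sqrt(sum(ti * ti for ti in T))     # should equal tan(theta)

print(T_dot_n, T_norm, math.tan(theta))
print(1.0 / math.cos(theta), math.sqrt(1.0 + T_norm**2))   # Eq. (63)
```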

Since within first-order shear deformation plate theory \(2u_{xz}={\varvec{T}}\cdot {\varvec{x}}\) and \(2u_{yz}={\varvec{T}}\cdot {\varvec{y}}\) (Reddy 2006), this leads to

$$\begin{aligned} u_{xz}^2 + u_{yz}^2 = \textstyle \frac{1}{4}T_lT^l \ , \end{aligned}$$
(64)

and that permits us to replace the second term in Eq. (60) in a way that involves tilt:

$$\begin{aligned} e_\text {3d} = \frac{1}{2}\tilde{E}\,(u_{xx} + u_{yy})^2 + \frac{1}{2}\lambda _{xzxz} T_lT^l \ . \end{aligned}$$
(65)

We would now have to relate the two deformations—especially the area strain—to the geometry of a curved membrane. But before we do that, it is important to realize that we are not yet done with the energy density: a very crucial term is missing, because its origin requires us to think beyond a thin sheet of local moduli—i.e., we must go beyond Eq. (54).

Lateral prestress. Consider again what we are trying to model: a thin self-assembled in-plane fluid leaflet made up of amphiphilic molecules. Now focus on the fact that, unlike a homogeneous thin sheet, a lipid monolayer has internal structure that underlies its very reason of existence: the strongly positionally varying solubility of lipids—which gives rise to the self-assembly process that shields the tails from the embedding solvent by placing the head groups in between. One important consequence of this assembly-driven cohesion is that it leaves the membrane under internal pre-stresses—meaning, stresses that do not locally vanish in the equilibrium state, only globally. As we will soon see, they contribute to the deformation energy.

Let us first explore what kinds of residual stresses are permissible by symmetry. Evidently, for a flat membrane lying in the xy-plane the stress tensor \({\varvec{\Pi }}\) is diagonal in the \(\{x,y,z\}\) coordinate system. Due to translational symmetry, it can only depend on z, and due to rotational symmetry, the x- and y-components must be identical:

$$\begin{aligned} {\varvec{\Pi }}= \text {diag}\big (\Pi _{xx}({\varvec{r}}),\Pi _{yy}({\varvec{r}}),\Pi _{zz}({\varvec{r}})\big ) = \text {diag}\big (\Pi _{||}(z),\Pi _{||}(z),\Pi _{\perp }(z)\big )\ . \end{aligned}$$
(66)

In mechanical equilibrium \({\varvec{\Pi }}\) must be divergence free, \(\partial _i\Pi _{ij}=0\). This equation immediately implies that \(\Pi _{\perp }(z)=\Pi _{\perp }\) is a constant, and so it must be equal to the isotropic ambient pressure acting on the membrane. The tangential component \(\Pi _{||}(z)\), however, is not restricted by this argument and could be a pretty complicated function of z. As we will soon discover, it indeed is.

Now imagine that we place a small patch of membrane inside a cuboid box of area A and height z. How would the energy change if we (isothermally and reversibly) deform that box in a volume-preserving way such that \(A\rightarrow A+\delta A\) and \(z\rightarrow z-\delta z = z-(z/A)\delta A\)? Following Rowlinson and Widom (2002, Chap. 2.5), the vertical compression requires the work

$$\begin{aligned} \delta W_\perp = A\,\delta z\,\Pi _{\perp } = z\,\delta A\,\Pi _{\perp } \ . \end{aligned}$$
(67)

In contrast, the lateral expansion requires the work

$$\begin{aligned} \delta W_{||} = -\delta A\int _{-z/2}^{z/2}\text {d}z\; \Pi _{||}(z) \ . \end{aligned}$$
(68)

Hence, the total change in free energy is

$$\begin{aligned} \delta F = \delta W_{\perp }+\delta W_{||} = \delta A\int _{-z/2}^{z/2}\!\!\!\text {d}z \left\{ \Pi _\perp -\Pi _{||}(z)\right\} \ . \end{aligned}$$
(69)

The expression under the integral is the positionally resolved effective lateral mechanical tension acting in the membrane. It is often simply called the lateral stress profile:

$$\begin{aligned} \sigma _0(z) = \Pi _\perp -\Pi _{||}(z) \ . \end{aligned}$$
(70)

What does this function look like for a membrane?

First, consider that there is the equivalent of a hydrophilic–hydrophobic interface at the backbone of a lipid, and so roughly at that height in the monolayer we have a relatively large lateral tension. This is where the bilayer is being pulled together, where the effect is localized that gives rise to a membrane in the first place. As a consequence, both the tails and the heads of the lipids are now being compressed, leaving us with a positive pressure (or negative tension) in the tail and upper head region that strives to expand the leaflet. For a membrane that is not subject to a net lateral tension, these stresses must balance, such that the net total stress (the integral over \(\sigma _0(z)\)) vanishes, thereby setting the equilibrium area per lipid. Hence, we expect \(\sigma _0(z)\) to be a function that features (positive) peaks near the two hydrophilic/hydrophobic transition regions in a lipid bilayer, while being negative both in the center and further out beyond the transition regions, such that the overall positive and negative areas balance.

Fig. 10 Lateral stress profile \(\sigma _0(z)\) of a lipid bilayer, using a coarse-grained model of the lipid DMPC (MARTINI force field) at \(300\,\text {K}\). This profile is based on simulation results presented in Wang and Deserno (2015)

Figure 10 shows the function \(\sigma _0(z)\) as measured for a particular lipid membrane model (the MARTINI version of DMPC, at a temperature of \(300\,\text {K}\)). Our overall expectations are met, even though we could not have anticipated all the extra wiggles. What might look extremely surprising, though, is how very large the effective stresses are: hundreds of bars! However, upon second thought, this makes sense: a typical value for the oil-water surface tension is about \(50\,\text {mN}/\text {m}\) (Goebel and Lunkenheimer 1997). Chemistry and Fig. 10 suggest that the transition between the hydrophilic and hydrophobic environment occurs over a region of approximately \(1\,\text {nm}\) width, and hence the pressure we would expect at the peak is about

$$\begin{aligned} \sigma _0(z_\text {peak}) \sim \frac{50\,\text {mN}/\text {m}}{1\,\text {nm}} = 500\,\text {bar} \ . \end{aligned}$$
(71)

This is very close to what the simulation finds (fortuitously so, of course, but it is only the order of magnitude that counts).
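The unit conversion behind Eq. (71) is spelled out in the short sketch below, using the two numbers quoted above; the variable names are ours.

```python
# Order-of-magnitude estimate of the peak lateral stress, cf. Eq. (71).
surface_tension = 50e-3    # oil-water surface tension in N/m (50 mN/m)
width = 1e-9               # width of the transition region in m (1 nm)

pressure_pa = surface_tension / width   # N/m divided by m gives N/m^2 = Pa
pressure_bar = pressure_pa / 1e5        # 1 bar = 1e5 Pa

print(f"{pressure_pa:.1e} Pa = {pressure_bar:.0f} bar")
```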

Armed with the new insight that an in-plane lateral stress \(\sigma _0(z)\) exists in a lipid membrane, we should amend the monolayer elastic energy from Eq. (65) with a term that penalizes stretching or compression against that pre-existing stress, which leads to a term that is linear in the area strain:

$$\begin{aligned} e_\text {3d} = \sigma _0(z(\zeta ))\varepsilon (\zeta ) +\frac{1}{2}\tilde{E}\,\varepsilon (\zeta )^2 + \frac{1}{2}\lambda _{xzxz} T_lT^l \ . \end{aligned}$$
(72)

Here we also defined two more concepts:

  1. \(z(\zeta )\) is the transverse coordinate z of a piece of material in the flat monolayer as a function of its transverse position \(\zeta \) in the curved monolayer. Since curving leads to local lateral stretching or compression, this impacts the transverse coordinates, because the Poisson ratio generally does not vanish (see Eq. (58)). We will soon exploit this to connect z with \(\zeta \).

  2. \(\varepsilon (\zeta )\) is the lateral area strain as a function of the curved transverse coordinate \(\zeta \). To first order in \(\zeta \), it is equal to \(u_{xx}+u_{yy}\), but at next order it differs. Since this difference takes the form of a lateral shear, which meets no resistance in fluid leaflets, we can ignore it; this is how Hamm and Kozlov (2000) argue. One could also state, though, that the true area strain should linearly couple to the true area stress, and that is why \(\varepsilon \) should naturally multiply the stress profile \(\sigma _0\). Of course, the outcome is the same. Also, notice that the difference only matters in the linear (pre-stress) term, because it becomes higher than quadratic order in the already quadratic elastic term.

2.2 Decomposing the Membrane Deformation into Three Stages

It should now become quite evident that the reason curvature will enter our final expression for a surface energy density functional is that bending the leaflet will give rise to positionally varying strains. To describe them, we need to carefully distinguish coordinates in the flat and curved sheet. It turns out that a convenient way of doing this is to decompose the strain by defining an intermediate state between the original flat bilayer and the final curved one: a state where lipids have tilted, resulting in a change of thickness and area per lipid of the leaflet that is uniform throughout its width. From there, any further strain is now a function that depends at least linearly on the transverse position and thus describes higher order curvature-induced strains.

Fig. 11

The flat and untilted monolayer state in (a) is first transformed to a flat but tilted state (b) in which thickness and area per lipid have changed. From there a subsequent bending deformation, which leaves the area at the pivotal plane invariant, leads to the final curved state (c)

Figure 11 illustrates these stages, as well as the notation we will use to describe them: for coordinates or differentials in the initial, intermediate, and final state, we will use lower case roman, upper case roman, and lower case Greek letters, respectively. In particular, local area element and transverse height differential in these three states will be denoted as

$$\begin{aligned} \text {initial, (a):} \qquad&\; \{\text {d}a; \text {d}z\} \ , \nonumber \\ \text {intermediate, (b):} \qquad&\; \{\text {d}A; \text {d}Z\} \ , \nonumber \\ \text {final, (c):} \qquad&\; \{\text {d}\alpha ; \text {d}\zeta \} \ . \nonumber \end{aligned}$$

While in the first two states the transverse coordinates z or Z are perpendicular to the area element, this is not the case in the final state, for which the coordinate \(\zeta \) aligns with the local lipid direction. As a consequence, the volume element is not simply \(\text {d}\alpha \,\text {d}\zeta \) but instead \(\text {d}\alpha \,\text {d}\zeta \cos \theta \), where \(\theta \) is the tilt angle (see Footnote 2). This will become important below.

The area element \(\text {d}\alpha \) in the curved configuration (c) will generally be different from the area element \(\text {d}A\) of the intermediate configuration (b): further “outside” it will be stretched, while further “inside” it will be compressed. But there will exist one particular location in the leaflet, called the “pivotal plane,” at which the area element is unchanged. We will use this specific location as the reference surface for the curved configuration, the transverse location from which \(\zeta \) will be measured and to which all curvatures shall refer. Hence, we get \(\text {d}\alpha (\zeta = 0) = \text {d}A\), while away from the pivotal plane the changed area element leads to the higher order lateral strain

$$\begin{aligned} \epsilon _\zeta = \frac{\text {d}\alpha - \text {d}A}{\text {d}A} \ . \end{aligned}$$
(73)

Since \(\epsilon _\zeta \) is local, quantifying it requires not only the lateral location on the leaflet, but also the transverse coordinate \(\zeta \). In contrast, the strain leading from the initial state (a) to the intermediate state (b) is by construction independent of \(\zeta \). We will hence refer to it as the zeroth order strain, which we can express as

$$\begin{aligned} \epsilon _0 = \frac{ \text {d}A - \text {d}a}{\text {d}a} \, . \end{aligned}$$
(74)

Obviously, the total area strain upon transitioning from state (a) to state (c) can be expressed through \(\epsilon _0\) and \(\epsilon _\zeta \):

$$\begin{aligned} \epsilon ( \zeta ) = \frac{\text {d}\alpha - \text {d}a }{ \text {d}a } = ( 1 + \epsilon _0 ) ( 1 + \epsilon _\zeta ) - 1 \ . \end{aligned}$$
(75)

Observe that the decomposition through some intermediate state is not unique. Other states could have been chosen, and more than one intermediate state is possible. But since the final state of deformation and its associated elastic energy is indeed a thermodynamic state, it does not matter by what specific path it is reached. The particular sequence of strains we have chosen to get from the initial to the final state is motivated by convenience, but our final answer will not depend on it.

2.3 The Link Between Curvature and Local Area Strain

Let us henceforth assume that we describe the shape of a curved monolayer via the location of its pivotal plane (see Footnote 3). Any other surface, displaced from the pivotal plane by some amount, will generally not have a vanishing local area strain, and so there will be a local contribution to the elastic energy coming from (i) the local stress–strain work when stretching or compressing against the pre-existing stress \(\sigma _0\) and (ii) the elastic contribution quadratic in the local strain. We need to calculate how big this strain is.

More precisely, if \({\varvec{X}}\) is a point on the pivotal plane, we arrive at the new shifted point by displacing it a constant distance \(\zeta \) along the local material direction \(\hat{\varvec{d}}\):

$$\begin{aligned} {\varvec{X}}' = {\varvec{X}}+ \zeta \,\hat{\varvec{d}}\ . \end{aligned}$$
(76)

We need to know the area strain on the shifted surface \({\varvec{X}}'\), which in turn depends on the local displacement direction \(\hat{\varvec{d}}\).

Area strain for parallel surfaces. The easiest situation is if \(\hat{\varvec{d}}=\hat{\varvec{n}}\), in which case the shifted surface is called a parallel surface. To calculate the resulting area strain, we must compare an area element \(\text {d}A'\) on the parallel surface with its corresponding area element \(\text {d}A\) on the original parent surface. Recall that the tangent vectors on the parent surface are given by \({\varvec{e}}_i=\nabla _i{\varvec{X}}\), where \(\nabla _i\) is the metric-compatible covariant derivative. The tangent vectors on the parallel surface are hence

$$\begin{aligned} {\varvec{e}}_i'=\nabla _i({\varvec{X}}+\zeta \hat{\varvec{n}})={\varvec{e}}_i+\zeta K_i^j{\varvec{e}}_j \ , \end{aligned}$$
(77)

where \(K_{ij}\) is the curvature tensor and where we used the Weingarten equation \(\nabla _i\hat{\varvec{n}}=K_i^j{\varvec{e}}_j\). To get the area element, we need the metric determinant \(g'\) on the parallel surface, and that we get from the cross products of the two tangent vectors:

$$\begin{aligned} \sqrt{g'} = |{\varvec{e}}_1'\times {\varvec{e}}_2'|&= \Big |({\varvec{e}}_1+\zeta K_1^j{\varvec{e}}_j)\times ({\varvec{e}}_2+\zeta K_2^k{\varvec{e}}_k)\Big | \nonumber \\&= \Big |{\varvec{e}}_1\times {\varvec{e}}_2 + \zeta (K_1^1+K_2^2)\,{\varvec{e}}_1\times {\varvec{e}}_2 \nonumber \\&\quad \,\, + \zeta ^2(K_1^1K_2^2-K_1^2K_2^1)\,{\varvec{e}}_1\times {\varvec{e}}_2\Big | \nonumber \\&= \Big |(1+K\zeta +K_\text {G}\zeta ^2)\;(\sqrt{g}\,\hat{\varvec{n}})\Big | \nonumber \\&= \sqrt{g}\,(1+K\zeta +K_\text {G}\zeta ^2) \ , \end{aligned}$$
(78)

where K and \(K_\text {G}\) are trace and determinant of the curvature tensor \(K_{ij}\).

The maybe slightly unorthodox use of individual components can be avoided by proceeding a little bit more formally. The calculation is a bit longer, but it will turn out to be quite useful when we go beyond this simple case. Define the Levi–Civita symbol \(\epsilon _{ij}=\epsilon ^{ij}\) such that \(\epsilon _{11}=\epsilon _{22}=0\) and \(\epsilon _{12}=-\epsilon _{21}=1\). Furthermore, define the Levi–Civita tensor density \(\varepsilon _{ij}=\sqrt{g}\,\epsilon _{ij}\), which also implies \(\varepsilon ^{ij}=\epsilon ^{ij}/\sqrt{g}\). Now, the cross product between the two tangent vectors can be written as

$$\begin{aligned} {\varvec{e}}_1'\times {\varvec{e}}_2'&= \frac{1}{2}\sqrt{g}\,\varepsilon ^{ij}{\varvec{e}}_i'\times {\varvec{e}}_j' \nonumber \\&= \frac{1}{2}\sqrt{g}\,\varepsilon ^{ij}({\varvec{e}}_i+\zeta K_i^k{\varvec{e}}_k)\times ({\varvec{e}}_j+\zeta K_j^l{\varvec{e}}_l) \nonumber \\&\mathop {=}\limits ^{*} \frac{1}{2}\sqrt{g}\,\varepsilon ^{ij}\left[ \varepsilon _{ij} + \zeta (\varepsilon _{il}K_j^l+\varepsilon _{kj}K_i^k)+\zeta ^2\varepsilon _{kl}K_i^kK_j^l\right] \hat{\varvec{n}}\nonumber \\&= \frac{1}{2}\sqrt{g}\left[ \varepsilon ^{ij}\varepsilon _{ij} + \zeta (\varepsilon ^{ij}\varepsilon _{il}K_j^l+\varepsilon ^{ij}\varepsilon _{kj}K_i^k) + \zeta ^2 \varepsilon ^{ij}\varepsilon _{kl}K_i^kK_j^l\right] \hat{\varvec{n}}\ , \end{aligned}$$
(79)

where at \(*\) we used \({\varvec{e}}_i\times {\varvec{e}}_j=\varepsilon _{ij}\hat{\varvec{n}}\). If we now apply the identities \(\varepsilon ^{ij}\varepsilon _{ij}=2\), \(\varepsilon ^{ij}\varepsilon _{ik}=g_k^j=\delta _k^j\) (i.e., the Kronecker-\(\delta \)), and the definition of the determinant, \(\det (K_{ij})=\frac{1}{2}\varepsilon ^{ij}\varepsilon ^{kl}K_{ik}K_{jl}\), the last line immediately reproduces Eq. (78) when taking the modulus.

Since \(\text {d}A=\sqrt{g}\,\text {d}u^1\text {d}u^2\) and \(\text {d}A'=\sqrt{g'}\,\text {d}u^1\text {d}u^2\), we find the area strain

$$\begin{aligned} \epsilon _\zeta = \frac{\text {d}A'-\text {d}A}{\text {d}A} = \frac{\sqrt{g'}}{\sqrt{g}}-1 = K\zeta + K_\text {G}\zeta ^2 \ . \end{aligned}$$
(80)

This is quite remarkable because it is exact: No corrections beyond quadratic order in \(\zeta \) occur.
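The exactness of Eq. (80) can be checked numerically. The following minimal Python sketch assumes a local orthonormal frame, so that \(g_{ij}=\delta _{ij}\) and mixed components coincide with covariant ones. By Eq. (77), the ratio \(\sqrt{g'}/\sqrt{g}\) is then the determinant of \(\delta _i^j+\zeta K_i^j\), which the code compares with \(1+K\zeta +K_\text {G}\zeta ^2\). The function names and the sample curvature tensor are illustrative assumptions.

```python
# Check of Eq. (80) in a local orthonormal frame (g_ij = identity).
# K is a 2x2 curvature tensor given as nested lists; values are hypothetical.

def area_ratio_from_tangents(K, zeta):
    """det(identity + zeta*K): area ratio implied by the tangent vectors of Eq. (77)."""
    a = 1.0 + zeta * K[0][0]
    b = zeta * K[0][1]
    c = zeta * K[1][0]
    d = 1.0 + zeta * K[1][1]
    return a * d - b * c


def area_ratio_from_invariants(K, zeta):
    """1 + K*zeta + K_G*zeta^2, with K the trace and K_G the determinant."""
    total_curvature = K[0][0] + K[1][1]
    gaussian_curvature = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return 1.0 + total_curvature * zeta + gaussian_curvature * zeta**2


if __name__ == "__main__":
    K_sample = [[0.8, 0.3], [0.3, -0.5]]  # hypothetical curvature tensor
    for zeta in (0.01, 0.5, 2.0):         # deliberately includes large shifts
        print(zeta,
              area_ratio_from_tangents(K_sample, zeta),
              area_ratio_from_invariants(K_sample, zeta))
```

The two columns agree to rounding error even for large \(\zeta \), not only for small shifts. For a sphere of radius R, where \(K=2/R\) and \(K_\text {G}=1/R^2\), the formula reduces to the familiar ratio \((R+\zeta )^2/R^2\).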

Area strain for more general lipid-shifted surfaces. If the direction of shift, \(\hat{\varvec{d}}\), is not along the surface normal but along the lipid orientation, we instead have

$$\begin{aligned} \hat{\varvec{d}}= \frac{T^j {\varvec{e}}_j + \hat{\varvec{n}}}{\sqrt{1 + T_j T^j}} = T^j {\varvec{e}}_j + \left( 1 - \textstyle \frac{1}{2} T_j T^j \right) \hat{\varvec{n}}+ \mathcal {O} (|{\varvec{T}}|^3) \, . \end{aligned}$$
(81)

It is worthwhile to note that we deviate here from Hamm and Kozlov (2000), for these authors do not normalize the orientation vector. Clearly, this only matters at higher order, but the difference does have a physical interpretation. Recall that we want \(\zeta \) to measure a given distance along a lipid. If we do not normalize the lipid director, the displacement \(|\zeta {\varvec{d}}|\) along a tilted lipid is longer than for an untilted one, while this distance remains unchanged if we use the normalized director. Which one is correct hence depends on whether lipids stretch upon tilting. Hamm and Kozlov assumed that lipids stretch by laterally shearing, which exactly corresponds to not normalizing the orientation vector in the numerator of Eq. (81). However, more recently Kopelevich and Nagle (2015) showed in a simulation study that there is virtually no correlation between a lipid’s length and its orientation, suggesting that lipids rotate upon tilting. In that case the normalized orientation vector is the more appropriate choice, which leads to the lipid-shifted surface

$$\begin{aligned} {\varvec{X}}' = {\varvec{X}}+ \zeta \left[ T^j {\varvec{e}}_j + \left( 1 - \textstyle \frac{1}{2} T_j T^j \right) \hat{\varvec{n}}\right] + \mathcal {O}(T^3) \ . \end{aligned}$$
(82)

The remainder of the calculation follows the one for parallel surfaces, except that the form of \({\varvec{X}}'\) results in more complex expressions. To begin with, the tangent vectors are

$$\begin{aligned} {\varvec{e}}_j'&= \nabla _j\left\{ {\varvec{X}}+ \zeta \left[ T^k {\varvec{e}}_k + \left( 1 - \textstyle \frac{1}{2} T_l T^l \right) \hat{\varvec{n}}\right] \right\} \nonumber \\&= {\varvec{e}}_j + \zeta \left[ \left( \tilde{K}_j^{\;\,k}-\textstyle \frac{1}{2}K_j^k T_l T^l\right) {\varvec{e}}_k - \tilde{K}_{jl}T^l \hat{\varvec{n}}\right] \ , \end{aligned}$$
(83)

where we defined the effective curvature tensor

$$\begin{aligned} \tilde{K}_{ij} := K_{ij} + \nabla _i T_j \ . \end{aligned}$$
(84)

Warning: \(\tilde{K}_{ij}\) is generally not a symmetric tensor (unlike \(K_{ij}\)), because \(\nabla _iT_j\ne \nabla _jT_i\). This means we must be careful when contracting indices, or when raising one of them: \(\tilde{K}_i^{\;\,j}\) is not the same as \(\tilde{K}^j_{\;\,\,i}\).

Calculating the cross product of the tangent vectors is now a bit more tedious, but still straightforward. First,

$$\begin{aligned} {\varvec{e}}_1'\times {\varvec{e}}_2' =&\frac{1}{2}\sqrt{g}\,\varepsilon ^{ij}{\varvec{e}}_i'\times {\varvec{e}}_j' \nonumber \\ =&\frac{1}{2}\sqrt{g}\,\varepsilon ^{ij}\bigg \{ {\varvec{e}}_i + \zeta \left[ \left( \tilde{K}_i^{\;\,k}-\textstyle \frac{1}{2}K_i^k {\varvec{T}}^2\right) {\varvec{e}}_k - \tilde{K}_{im}T^m \hat{\varvec{n}}\right] \bigg \}\nonumber \\&\qquad \times \bigg \{ {\varvec{e}}_j + \zeta \left[ \left( \tilde{K}_j^{\;\,\,l}-\textstyle \frac{1}{2}K_j^{\,l} {\varvec{T}}^2\right) {\varvec{e}}_l - \tilde{K}_{jn}T^n \hat{\varvec{n}}\right] \bigg \} \ . \end{aligned}$$
(85)

Making use of \({\varvec{e}}_i\times {\varvec{e}}_j=\varepsilon _{ij}\,\hat{\varvec{n}}\) as well as \(\hat{\varvec{n}}\times {\varvec{e}}_i=\varepsilon _{ij}\,{\varvec{e}}^j\), all cross products can again be expressed as Levi–Civita tensor densities. Two of them contract either into a metric or create a determinant. In the one case where that does not happen, they form an expression that does not matter up to order \(\zeta ^2\). We then find

$$\begin{aligned} {\varvec{e}}_1'\times {\varvec{e}}_2' =&\sqrt{g}\,\bigg \{\hat{\varvec{n}}\left[ 1+\zeta (\tilde{K}-\textstyle \frac{1}{2}K{\varvec{T}}^2)+\zeta ^2(\tilde{K}_\text {G}-K_\text {G}{\varvec{T}}^2)\right] \nonumber \\&\quad + {\varvec{e}}^i\left[ \zeta \tilde{K}_{im}T^m + \zeta ^2(\text {irrelevant stuff})\right] \bigg \} \ , \end{aligned}$$
(86)

where the trace and determinant of the effective curvature tensor are

$$\begin{aligned} \tilde{K}= \text {Tr}(\tilde{K}_{ij}) = g^{ij}\tilde{K}_{ij} \quad ,\quad \tilde{K}_\text {G}= \text {det}(\tilde{K}_{ij}) = \textstyle \frac{1}{2}\varepsilon ^{ij}\varepsilon ^{kl} \tilde{K}_{ik}\tilde{K}_{jl} \ . \end{aligned}$$
(87)

Moreover, Eq. (86) already exploits the fact that this expansion is only supposed to be accurate up to order \(K^2T^2\) at most. This means, for instance, that terms like \(\tilde{K}^2 T^2\) can be replaced by \(K^2 T^2\), since the “extra T” in \(\tilde{K}\) would contribute at higher order.

Up to order \(\zeta ^2\), the square of Eq. (86) is hence given by

$$\begin{aligned} \frac{|{\varvec{e}}_1'\times {\varvec{e}}_2'|^2}{g} = \left[ 1+\zeta (\tilde{K}-\textstyle \frac{1}{2}K{\varvec{T}}^2)+\zeta ^2(\tilde{K}_\text {G}-K_\text {G}{\varvec{T}}^2)\right] ^2 +\zeta ^2 K_{im}K^i_nT^mT^n , \end{aligned}$$
(88)

and so the ratio of metric determinants is

$$\begin{aligned} \frac{\sqrt{g'}}{\sqrt{g}} = 1 + (\tilde{K}-\textstyle \frac{1}{2}K{\varvec{T}}^2)\zeta + (\tilde{K}_\text {G}- K_\text {G}{\varvec{T}}^2+\frac{1}{2}K_{im}K^i_nT^mT^n)\zeta ^2 \ . \end{aligned}$$
(89)

We hence find the following area strain:

$$\begin{aligned} \epsilon _\zeta = \frac{\text {d}\alpha -\text {d}A}{\text {d}A} = \frac{\sqrt{g'}}{\sqrt{g}} - 1 = \epsilon _1\,\zeta +\epsilon _2\,\zeta ^2 \ , \end{aligned}$$
(90)

with the first- and second-order contribution

$$\begin{aligned} \epsilon _1&= \tilde{K}-\textstyle \frac{1}{2}K T_iT^i \ , \end{aligned}$$
(91a)
$$\begin{aligned} \epsilon _2&= \tilde{K}_\text {G}- \big (K_\text {G}\, g_{mn} -\textstyle \frac{1}{2}K_{im}K^i_n\big )T^mT^n \ . \end{aligned}$$
(91b)
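As a plausibility check of Eqs. (91a) and (91b), the following Python sketch builds the tangent vectors of Eq. (83) explicitly, takes their cross product, and compares the modulus with \(1+\epsilon _1\zeta +\epsilon _2\zeta ^2\). It assumes a local orthonormal frame (\(g_{ij}=\delta _{ij}\), so index positions do not matter), and it stores \(\nabla _iT_j\) as `grad_T[i][j]`. All function names and numerical values are hypothetical and chosen only for illustration.

```python
import math

# Local orthonormal frame: e_1 = (1,0,0), e_2 = (0,1,0), n = (0,0,1).


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def effective_curvature(K, grad_T):
    """Effective curvature tensor of Eq. (84); grad_T[i][j] stands for nabla_i T_j."""
    return [[K[i][j] + grad_T[i][j] for j in range(2)] for i in range(2)]


def area_ratio_direct(K, grad_T, T, zeta):
    """|e_1' x e_2'| computed from the tangent vectors of Eq. (83)."""
    T2 = T[0] ** 2 + T[1] ** 2
    Kt = effective_curvature(K, grad_T)
    tangents = []
    for j in range(2):
        vec = [0.0, 0.0, 0.0]
        vec[j] = 1.0  # unshifted tangent vector e_j
        for k in range(2):
            vec[k] += zeta * (Kt[j][k] - 0.5 * K[j][k] * T2)
        vec[2] = -zeta * sum(Kt[j][l] * T[l] for l in range(2))
        tangents.append(vec)
    c = cross(tangents[0], tangents[1])
    return math.sqrt(sum(x * x for x in c))


def strain_coefficients(K, grad_T, T):
    """First- and second-order coefficients of Eqs. (91a) and (91b)."""
    T2 = T[0] ** 2 + T[1] ** 2
    Kt = effective_curvature(K, grad_T)
    trace_K = K[0][0] + K[1][1]
    det_K = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    trace_Kt = Kt[0][0] + Kt[1][1]
    det_Kt = Kt[0][0] * Kt[1][1] - Kt[0][1] * Kt[1][0]
    kktt = sum(K[i][m] * K[i][n] * T[m] * T[n]
               for i in range(2) for m in range(2) for n in range(2))
    eps1 = trace_Kt - 0.5 * trace_K * T2
    eps2 = det_Kt - det_K * T2 + 0.5 * kktt
    return eps1, eps2


if __name__ == "__main__":
    K = [[1.0, 0.2], [0.2, 0.5]]             # hypothetical curvature tensor
    grad_T = [[0.03, 0.01], [-0.02, 0.04]]   # hypothetical, not symmetric
    T = (0.1, 0.05)                          # small tilt
    zeta = 0.01                              # small shift from the pivotal plane
    eps1, eps2 = strain_coefficients(K, grad_T, T)
    print("direct    :", area_ratio_direct(K, grad_T, T, zeta))
    print("Eq. (89)  :", 1.0 + eps1 * zeta + eps2 * zeta ** 2)
```

Because the expansion drops terms of order \(\zeta ^3\) and higher orders in tilt, the two numbers are expected to agree only approximately, with a residual much smaller than the \(\epsilon _2\zeta ^2\) contribution itself.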

The transverse dimension: Poisson ratio effects. We have just seen how the lateral area element changes due to curvature—both with and without accounting for tilt. However, the transverse length element will change, too, and the extent to which this happens is dictated by the Poisson ratio. It will affect the zeroth order strain \(\epsilon _0\) as well as the connection between the differentials \(\text {d}z\), \(\text {d}Z\), and \(\text {d}\zeta \), which we have been careful to distinguish.

Let us begin with the zeroth order strain \(\epsilon _0\). Following the finding by Kopelevich and Nagle (2015) that lipids rotate upon tilting, we must have \(\text {d}Z = \text {d}z \, \cos \theta \), and so the zeroth order transverse strain is

$$\begin{aligned} u^0_{zz} = \frac{\text {d}Z - \text {d}z}{ \text {d}z} = \cos \theta -1 \mathop {=}\limits ^{*} \frac{1}{\sqrt{1+|{\varvec{T}}|^2}}-1 = - \frac{1}{2} T_l T^l + \mathcal {O}(|{\varvec{T}}|^4) \ , \end{aligned}$$
(92)

where at “\(*\)” we used \(|{\varvec{T}}|=\tan \theta \).
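To get a feeling for the quality of the small-tilt expansion in Eq. (92), the following Python sketch compares the exact transverse strain \(\cos \theta -1\) with the quadratic approximation \(-\frac{1}{2}|{\varvec{T}}|^2\), using \(|{\varvec{T}}|=\tan \theta \). The function names and the sample angles are illustrative assumptions.

```python
import math


def transverse_strain_exact(theta):
    """Exact zeroth-order transverse strain cos(theta) - 1 from Eq. (92)."""
    return math.cos(theta) - 1.0


def transverse_strain_quadratic(theta):
    """Leading-order approximation -T^2/2 with T = tan(theta)."""
    tilt = math.tan(theta)
    return -0.5 * tilt * tilt


if __name__ == "__main__":
    for degrees in (5, 10, 20):  # hypothetical tilt angles
        theta = math.radians(degrees)
        exact = transverse_strain_exact(theta)
        approx = transverse_strain_quadratic(theta)
        print(f"{degrees:2d} deg: exact {exact:.6f}, quadratic {approx:.6f}, "
              f"difference {exact - approx:.2e}")
```

The difference grows like \(\frac{3}{8}|{\varvec{T}}|^4\), the next term in the expansion of \(1/\sqrt{1+|{\varvec{T}}|^2}\), so the quadratic form is adequate for the moderate tilt angles to which the theory is meant to apply.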

Furthermore, recall that the normal stresses vanish at the top and bottom of the leaflet. If the leaflet is sufficiently thin, this implies that the normal stress vanishes throughout, since it has little opportunity to grow appreciably. This relates the transverse and lateral strains (Landau and Lifshitz 1986)

$$\begin{aligned} -\frac{\nu }{1-\nu } ( u_{xx}^0 + u_{yy}^0 ) =: -\tilde{\nu } ( u_{xx}^0 + u_{yy}^0 ) = u_{zz}^0 \mathop {=}\limits ^{\text {(92)}} - \frac{1}{2} T_l T^l \ , \end{aligned}$$
(93)

where \(\nu \) is Poisson’s ratio for the present anisotropic material, which in terms of the elastic tensor \(\lambda _{ijkl}\) is

$$\begin{aligned} \nu = \frac{\lambda _{xxzz} }{ \lambda _{xxzz} + \lambda _{zzzz}} \quad \text {or}\quad \tilde{\nu }:= \frac{\nu }{1-\nu } = \frac{\lambda _{xxzz}}{\lambda _{zzzz}} \ . \end{aligned}$$
(94)

These elastic coefficients could in general depend on their transverse position through the leaflet, but notice that the area strain cannot, because we assumed it to arise from a rigid rotation of the lipids. Hence, in Eq. (94) \(\tilde{\nu }\) must denote the average value across the leaflet.

Finally, since at \(\mathcal {O}(\zeta )\) the zeroth order area strain equals the sum of the in-plane diagonal components of the strain tensor, it is found to be

$$\begin{aligned} \epsilon _0 = \frac{1}{2 \tilde{\nu }} T_l T^l\, . \end{aligned}$$
(95)

Notice that \(\nu \) can in principle be zero, in which case \(\tilde{\nu }\) would also vanish, seemingly leading to a divergent strain. However, at a vanishing Poisson ratio a lateral surface stress would not lead to a reduction of thickness, and hence lipids cannot in fact tilt. Indeed, Eq. (93) shows that if \(\tilde{\nu }=0\) then the tilt vanishes as well, and hence the area strain remains finite. At any rate, the physically relevant situation for soft fluid leaflets is \(\tilde{\nu }\approx 1\), not a vanishing Poisson ratio.

Next, let us look at the connection between the transverse length elements in the initial and final configuration. To begin with, note that the difference in alignment between the Z- and \(\zeta \)-coordinate again implies a projection factor: the extent of \(\text {d}\zeta \) along the membrane normal is \(\text {d}\zeta \cos \theta \), and this is the quantity that must be compared with \(\text {d}Z\). Combining this with the usual Poisson relation between lateral and transverse strain, we get

$$\begin{aligned} \frac{ \text {d}\zeta \cos \theta - \text {d}Z}{\text {d}Z} = - \tilde{\nu } \epsilon _\zeta \ . \end{aligned}$$
(96)

Inserting \(\text {d}Z=\cos \theta \,\text {d}z\) from Eq. (92), \(\cos \theta \) cancels in the relation between the transverse differentials z and \(\zeta \):

$$\begin{aligned} \text {d}\zeta = \text {d}z \left[ 1 - \tilde{\nu } \left( \epsilon _1 \zeta + \epsilon _2 \zeta ^2 \right) \right] \ . \end{aligned}$$
(97)

The expansion coefficients \(\epsilon _1\) and \(\epsilon _2\) are those given in Eqs. (91a) and (91b). This connection constitutes a differential equation for \(\zeta (z)\), and it can be solved by a straightforward quadrature. Fortunately, though, we will only need the solution up to order \(z^2\):

$$\begin{aligned} \zeta (z) = z - \textstyle \frac{1}{2} \tilde{\nu } \epsilon _1 z^2 + \mathcal {O}(z^3) \ . \end{aligned}$$
(98)
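The truncated solution in Eq. (98) can be compared with a direct numerical integration of Eq. (97). The following Python sketch uses a classical fourth-order Runge–Kutta scheme for this purpose. The function names, the step count, and the parameter values (in arbitrary but consistent units) are assumptions made for illustration only.

```python
def integrate_zeta(z_end, nu_tilde, eps1, eps2, steps=1000):
    """Integrate d(zeta)/dz = 1 - nu_tilde*(eps1*zeta + eps2*zeta^2) from zeta(0) = 0."""

    def rhs(zeta):
        # Right-hand side of Eq. (97); it depends on z only through zeta.
        return 1.0 - nu_tilde * (eps1 * zeta + eps2 * zeta * zeta)

    h = z_end / steps
    zeta = 0.0
    for _ in range(steps):  # classical RK4 steps
        k1 = rhs(zeta)
        k2 = rhs(zeta + 0.5 * h * k1)
        k3 = rhs(zeta + 0.5 * h * k2)
        k4 = rhs(zeta + h * k3)
        zeta += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return zeta


def zeta_second_order(z, nu_tilde, eps1):
    """Truncated solution of Eq. (98)."""
    return z - 0.5 * nu_tilde * eps1 * z * z


if __name__ == "__main__":
    nu_tilde, eps1, eps2 = 1.0, 0.5, 0.05  # hypothetical values
    for z in (0.05, 0.1, 0.2):
        numeric = integrate_zeta(z, nu_tilde, eps1, eps2)
        truncated = zeta_second_order(z, nu_tilde, eps1)
        print(f"z = {z}: numeric {numeric:.8f}, Eq. (98) {truncated:.8f}, "
              f"difference {numeric - truncated:.2e}")
```

The difference between the two results should shrink like \(z^3\) as z decreases. Note that \(\epsilon _2\) first enters at that cubic order, which is why it does not appear in Eq. (98).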

Just as in the case of the zeroth order area strain, \(\tilde{\nu }\) in principle depends on the position within the leaflet, and since we do not know its functional form, we cannot in general integrate the differential equation. However, at the order in z that we strive for, all we could (and would need to) account for is a linear deviation away from its average value. Since \(\tilde{\nu }\) is most likely very close to 1 anyway, this extra work seems hardly justified, and in order to keep things simple we will again just take the average value of the Poisson ratio in Eq. (98).

The volume element. Having calculated the lateral and transverse coordinate differentials in the deformed configuration, we can now calculate the volume element in the coordinates we need, namely the transverse position z in the flat untilted state and the area element \(\text {d}A\) of the flat tilted state, which by definition is identical to the area element \(\text {d}\alpha (\zeta =0)\) of the curved leaflet at its pivotal plane. To calculate the volume element, we must also recall that the new coordinates are generally not orthogonal, since the \(\zeta \)-direction makes an angle \(\theta \) with the membrane normal, and so we get a projection factor \(\cos \theta \). Putting everything together, we find

$$\begin{aligned} \text {d}V = \text {d}A \, \text {d}z \big [1 - \textstyle \frac{1}{2} T_l T^l\big ] \big [1 + (1 -\tilde{\nu }) \epsilon _\zeta \; \underline{-\;\tilde{\nu }\epsilon _\zeta ^2\,}\big ] \ , \end{aligned}$$
(99)

where we used Eq. (73) and the Poisson ratio relation Eq. (96).

Notice that Eq. (99) has one disconcerting feature: in the incompressible limit, \(\tilde{\nu }=1\), the only contribution to area strain should come from tilt (namely, \(\epsilon _0\)), but here we get another contribution from geometry—the underlined term. This trouble is not specific to our particular problem but more generally reflects the fact that the Poisson ratio is a first-order concept. To see this, consider an area strain \(\epsilon _A\) and a transverse strain \(\epsilon _z\). Together, they result in a volume strain \(\epsilon _V=(1+\epsilon _A)(1+\epsilon _z)-1=\epsilon _A+\epsilon _z+\epsilon _A\epsilon _z\). And with the usual Poisson ratio connection \(\epsilon _z=-\tilde{\nu }\epsilon _A\), we get \(\epsilon _V=(1-\tilde{\nu })\epsilon _A-\tilde{\nu }\epsilon _A^2\). The last term does not vanish in the incompressible limit \(\tilde{\nu }=1\), and it is exactly the source of the underlined term in Eq. (99). To avoid this inconsistency, we will drop the underlined quadratic term.
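The origin of the underlined term can be made concrete with a few lines of Python. The sketch below evaluates the volume strain \(\epsilon _V=(1+\epsilon _A)(1+\epsilon _z)-1\) with the linear Poisson relation \(\epsilon _z=-\tilde{\nu }\epsilon _A\). The function name and the sample strain values are illustrative assumptions.

```python
def volume_strain(area_strain, nu_tilde):
    """Volume strain from an area strain and the linear relation eps_z = -nu_tilde*eps_A."""
    transverse_strain = -nu_tilde * area_strain
    return (1.0 + area_strain) * (1.0 + transverse_strain) - 1.0


if __name__ == "__main__":
    for eps_a in (0.001, 0.01, 0.1):  # hypothetical area strains
        # In the nominally incompressible limit nu_tilde = 1 the result
        # is -eps_A^2 rather than zero.
        print(eps_a, volume_strain(eps_a, nu_tilde=1.0))
```

At \(\tilde{\nu }=1\) the residual volume strain is \(-\epsilon _A^2\), which is second order in the strain. This illustrates why the linear Poisson relation cannot be trusted at quadratic order.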

2.4 From Three Dimensions to Two

Putting everything together, we then arrive at the following overall elastic energy, which is correct up to order \(\zeta ^2\), squared curvature, squared tilt, and biquadratic terms:

$$\begin{aligned} \mathcal {H}_\text {m}&= \int \text {d}A \, \text {d}z \left[ 1 - \textstyle \frac{1}{2} T_l T^l + (1 -\tilde{\nu }) \left( \tilde{K}- K T_l T^l\right) \zeta \right] \times \nonumber \\&\qquad \quad \bigg \{ \sigma _0(z(\zeta )) \bigg [\;\; \textstyle \frac{1}{2\tilde{\nu }} T_l T^l + \Big (\tilde{K}+ \textstyle \frac{ 1 - \tilde{\nu }}{2 \tilde{\nu }} K\,T_l T^l\Big )\zeta \nonumber \\&\qquad +\Big (\tilde{K}_\text {G}- \textstyle \frac{1}{2} \left( K_\text {G}g_{ij} - K_{ki} K^k_j \right) T^i T^j\Big )\zeta ^2\bigg ] \nonumber \\&\qquad \qquad + \textstyle \frac{1}{2} \tilde{E}\left( \tilde{K}^2 + \frac{1}{\tilde{\nu }} K_\text {G}T_l T^l \right) \zeta ^2 \nonumber \\&\qquad \qquad + \textstyle \frac{1}{2 \tilde{\nu }} \tilde{E} K \zeta T_l T^l + \frac{1}{2} \lambda _{xzxz} T_lT^l \bigg \} \ . \end{aligned}$$
(100)

Here we made one further approximation: we dropped biquadratic terms which exhibit an additional factor of \(1-\tilde{\nu }\). Since we will invariably be close to the incompressible limit, this multiplies the small biquadratics by yet another smallness parameter, which we will ignore for simplicity.

From this expression we get the elastic surface energy density by performing the integral over z, which also requires us to insert the functional dependence \(\zeta (z)\) from Eq. (98). Doing this integral, we arrive at the surface energy density

$$\begin{aligned} e_{\text {2d}} = \frac{1}{2} \kappa _\text {m}(\tilde{K}- K_{0,\text {m}})^2 + \overline{\kappa }_\text {m}\tilde{K}_\text {G}+ \frac{1}{2} \kappa _\text {t,m}M^{\prime }_{ij} T^i T^j \ . \end{aligned}$$
(101)

This expression now features numerous new elastic constants, but all of them have expressions in terms of the underlying elastic model:

$$\begin{aligned} \kappa _\text {m}&= \int \text {d}z \left[ \tilde{E}(z) - \tilde{\nu } \sigma _0(z) \right] z^2 \ , \end{aligned}$$
(102a)
$$\begin{aligned} \overline{\kappa }_\text {m}&= \int \text {d}z \, \sigma _0(z) \, z^2 \ , \end{aligned}$$
(102b)
$$\begin{aligned} \kappa _\text {t,m}&= \int \text {d}z \, \lambda _{xzxz}(z) \ , \end{aligned}$$
(102c)
$$\begin{aligned} \kappa _{\text {m}, \nu }&= \int \text {d}z \, \frac{1}{\tilde{\nu }} \left[ \tilde{E}(z) - \tilde{\nu }\sigma _0(z) \right] z^2 \ , \end{aligned}$$
(102d)
$$\begin{aligned} - \kappa _\text {m}K_{0,\text {m}}&= \int \text {d}z \, \sigma _0(z) \, z \ , \end{aligned}$$
(102e)
$$\begin{aligned} - \kappa _\text {m}K_{0,\text {t}}&= \int \text {d}z \, \lambda _{xzxz}(z) \, (1 - \tilde{\nu } ) \, z \ , \end{aligned}$$
(102f)
$$\begin{aligned} \kappa _\text {m}K_{0,\text {m}}'&= \int \text {d}z \, \frac{1}{\tilde{\nu }} \left[ \tilde{E}(z) + ( 2 - 3 \tilde{\nu } ) \sigma _0(z) \right] z \ , \end{aligned}$$
(102g)

The quadratic tilt term in Eq. (101) is not merely characterized by a scalar modulus but instead by a full tensor, which has the form

$$\begin{aligned} M_{ij}' = \Big [1 + \ell ^2 K \big (K_{0,\text {m}}' \!- K_{0,\text {t}} \big ) - \ell ^2K^2 + \big ( \ell _{\nu }^2 - r_\text {m}\ell ^2\big ) K_\text {G}\Big ] g_{ij} + r_\text {m}\ell ^2 K_{ki} K^k_j \ . \end{aligned}$$
(103)

Here, \(\ell \) is a characteristic length defined from bending and tilt moduli, while the other length scale \(\ell _\nu \) is defined via the new modulus \(\kappa _{\text {m},\nu }\):

$$\begin{aligned} \ell ^2 = \frac{\kappa _\text {m}}{\kappa _\text {t,m}} \qquad \text {,}\qquad \ell ^2_{\nu } = \frac{\kappa _{\text {m},\nu }}{\kappa _\text {t,m}} \ . \end{aligned}$$
(104)

Moreover, the dimensionless number \(r_\text {m}\) is given by

$$\begin{aligned} r_\text {m}= \frac{\overline{\kappa }_\text {m}}{\kappa _\text {m}} \ . \end{aligned}$$
(105)

In the absence of tilt, the stability of the quadratic curvature expression Eq. (101) requires \(-2\le r_\text {m}\le 0\) (Deserno 2015), and so \(r_\text {m}<0\).

Observe that in the absence of tilt Eq. (101) simplifies to the Helfrich Hamiltonian. Moreover, if the curvature radii are large compared to the characteristic scales \(\ell \) and \(\ell _{\nu }\), the tilt tensor approaches the metric, \(M_{ij}'\rightarrow g_{ij}\), and the expression for the surface energy density reduces to the original expression by Hamm and Kozlov (2000).

Disentangling Tilt and Curvature in \(\tilde{K}_\text {G}\). The effective curvature \(\tilde{K}=K+\nabla _lT^l\) is the sum of the total curvature and the divergence of the tilt. This separates tilt and curvature quite nicely, and shows for instance that the divergence of tilt can be viewed as a position-dependent dynamic spontaneous curvature. Unfortunately, it is not quite so easy to see how we can wrest the tilde from \(\tilde{K}_\text {G}\). But it is possible. To do so, recall the definition

$$\begin{aligned} \tilde{K}_\text {G}&= \frac{1}{2} \varepsilon ^{ij} \varepsilon ^{kl} \tilde{K}_{ik}\tilde{K}_{jl} \qquad \text {(now use }\varepsilon ^{ij}\varepsilon ^{kl}=g^{ik}g^{jl}-g^{il}g^{jk}\text {)} \nonumber \\&= \frac{1}{2} \left( \tilde{K}^2 - \tilde{K}_i^{\;\,k} \tilde{K}_k^{\;\,i}\right) \ , \nonumber \\&= \frac{1}{2} \left[ \Big (K_i^i+\nabla _iT^i\Big )\Big (K_j^j+\nabla _jT^j\Big )-\Big (K_i^j+\nabla _iT^j\Big )\Big (K_j^i+\nabla _jT^i\Big )\right] \nonumber \\&= \frac{1}{2}\left[ K^2-K_i^jK_j^i + 2\Big (K\nabla _iT^i-K_i^j\nabla _jT^i\Big ) + \nabla _iT^i\nabla _jT^j-\nabla _iT^j\nabla _jT^i\right] \nonumber \\&= K_\text {G}+ \Big (K\nabla _iT^i-K_i^j\nabla _jT^i\Big ) + \frac{1}{2}\Big (\nabla _iT^i\nabla _jT^j-\nabla _iT^j\nabla _jT^i\Big ) \ . \end{aligned}$$
(106)
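The purely algebraic content of Eq. (106) is the statement that the determinant of a sum of two \(2\times 2\) tensors splits into a pure curvature part, a mixed part, and a pure tilt part. The following Python sketch checks this numerically in a local orthonormal frame, where the mixed components \(K_i^j\) and \(\nabla _iT^j\) can be stored as plain matrices `K[i][j]` and `G[i][j]`. The function names and the sample matrices are illustrative assumptions.

```python
def trace(M):
    return M[0][0] + M[1][1]


def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def gaussian_split(K, G):
    """Return the three pieces in the last line of Eq. (106)."""
    pure_curvature = det(K)                                       # K_G
    mixed = trace(K) * trace(G) - trace(matmul(K, G))             # K div(T) - K_i^j nabla_j T^i
    pure_tilt = 0.5 * (trace(G) ** 2 - trace(matmul(G, G)))       # quadratic in nabla T
    return pure_curvature, mixed, pure_tilt


if __name__ == "__main__":
    K = [[1.0, 0.2], [0.2, 0.5]]         # hypothetical symmetric curvature tensor
    G = [[0.03, 0.01], [-0.02, 0.04]]    # hypothetical tilt gradient, not symmetric
    K_eff = [[K[i][j] + G[i][j] for j in range(2)] for i in range(2)]
    print("det of effective tensor:", det(K_eff))
    print("sum of the three pieces:", sum(gaussian_split(K, G)))
```

The identity holds for arbitrary \(2\times 2\) matrices, so it does not rely on the symmetry of either tensor.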

As the next step, recall that the above expression occurs under an integral. We aim to integrate the second and third parentheses by parts, which means “swapping one derivative and one sign,” as well as picking up one boundary term. Doing so, we find

$$\begin{aligned} \tilde{K}_\text {G}&= K_\text {G}+ \Big (\nabla _iK_j^i-\nabla _jK\Big )T^j + \frac{1}{2}T^i\Big (\nabla _j\nabla _i-\nabla _i\nabla _j\Big )T^j + \nabla _iB^i \ , \end{aligned}$$
(107)

where the last term is the total divergence of

$$\begin{aligned} B^i = KT^i - K_j^iT^j+\frac{1}{2}\Big (T^i\nabla _jT^j-T^j\nabla _jT^i\Big ) \ . \end{aligned}$$
(108)

Now notice that the expression in the first parenthesis of Eq. (107) vanishes due to the contracted Codazzi–Mainardi equation. The expression in the second parenthesis is more interesting: this is the commutator of covariant derivatives, and as is well known, it does not vanish in curved geometries. Instead, we have

$$\begin{aligned}{}[\nabla _a,\nabla _b]V_c = R_{abcd}V^d \ , \end{aligned}$$
(109)

where \(R_{abcd}\) is the Riemann tensor. In the present case we hence find

$$\begin{aligned} T^i[\nabla _j,\nabla _i]T^j = g^{jk}T^i[\nabla _j,\nabla _i]T_k = g^{jk}T^iR_{jikl}T^l = R_{il}T^iT^l = K_\text {G}{\varvec{T}}^2 \ , \end{aligned}$$
(110)

where \(R_{il}=g^{jk}R_{jikl}\) is the Ricci tensor, which in two dimensions is simply given by \(R_{il}=K_\text {G}\, g_{il}\). This shows that—up to a boundary term—we can disentangle the tilt from the effective Gaussian curvature, finding

$$\begin{aligned} \tilde{K}_\text {G}= K_\text {G}+ \frac{1}{2}K_\text {G}{\varvec{T}}^2 \ . \end{aligned}$$
(111)

As it turns out, the boundary term is in many cases irrelevant. Notice that we are writing down a theory for a monolayer. If this monolayer is part of a closed vesicle, it has no boundary. But even if we have a bilayer membrane with an open edge or a pore, the monolayer is continuous and boundary-free, since it wraps around the edges. A case where we cannot ignore the edge hence needs to actually provide an edge. One way in which this could happen is if a membrane contains transmembrane proteins, which locally provide an end to the monolayer. Now the boundary term will matter, but we will not look at this case here.

Observe what the disentanglement (111) does to our energy density from Eq. (101): removing the tilde from \(\tilde{K}_\text {G}\) creates the new term \(\frac{1}{2}\overline{\kappa }_\text {m}K_\text {G}{\varvec{T}}^2\), which we can incorporate into the effective tilt modulus tensor of Eq. (103), where it cancels the \(K_\text {G}\) part in its isotropic contribution.

The elastic parameters. The two-dimensional elastic functional (101) contains seven new parameters, and the set of Eq. (102) shows how they depend on the underlying elastic tensor \(\lambda _{ijkl}(z)\) and the pre-stress \(\sigma _0(z)\). Of these parameters, \(\kappa _\text {m}\), \(\overline{\kappa }_\text {m}\), \(\kappa _\text {t,m}\), and \(K_{0,\text {m}}\) already appear in the treatment by Hamm and Kozlov (2000), and in fact are given by the same microscopic expressions (if we specialize to the incompressible limit \(\tilde{\nu }=1\)). On the other hand, the three parameters \(\kappa _{\text {m},\nu }\), \(K_{0,\text {t}}\), and \(K_{0,\text {m}}'\) are new. They are related to the novel biquadratic terms, and in order to judge their relevance, we need to estimate their magnitude. To keep things simple, we will assume that the elastic tensor \(\lambda _{ijkl}\) is in fact constant throughout the leaflet, for this allows us to evaluate the moment-integrals analytically.

Let us start with the inverse length \(K_{0,\text {t}}\) from Eq. (102f), the form of which mimics the spontaneous curvature term \(K_{0,\text {m}}\). In contrast to the latter, however, \(K_{0,\text {t}}\) is usually negligible. To see this, consider the following:

$$\begin{aligned} - \kappa _\text {m}K_{0,\text {t}}&= \int _0^{d_\text {m}} \text {d}z \, \lambda _{xzxz}(z) \, (1 - \tilde{\nu } ) \, (z-z_0) \nonumber \\&\approx \lambda _{xzxz} \, (1 - \tilde{\nu } ) \int _0^{d_\text {m}} \text {d}z \, (z-z_0) \nonumber \\&= \lambda _{xzxz} \, (1 - \tilde{\nu } ) \frac{d_\text {m}^2-2d_\text {m}z_0}{2}\nonumber \\&\approx \frac{1}{2}\kappa _\text {t,m}(1-\tilde{\nu })(d_\text {m}-2z_0) \ , \end{aligned}$$
(112)

where \(d_\text {m}\) is the monolayer thickness and where we explicitly centered the trans-bilayer integral around \(z_0\). We also used Eq. (102c) to rewrite \(\kappa _\text {t,m}=\lambda _{xzxz} d_\text {m}\) in the constant-\(\lambda \)-approximation. Dividing out \(\kappa _\text {m}\) and using Eq. (104), we get

$$\begin{aligned} K_{0,\text {t}} \approx \frac{1- \tilde{\nu }}{2\ell ^2} ( 2 z_0 - d_\text {m}) \ . \end{aligned}$$
(113)

This is only nonzero if (i) the monolayer is compressible and (ii) its pivotal plane is not in the center of the leaflet. In a recent simulation study Wang and Deserno (2016) found \(z_0=1.32\,\text {nm}\) for a united-atom model of the lipid DMPC (Berger et al. 1997; Lindahl and Edholm 2000). Taking the monolayer thickness to be the distance between the bilayer midplane and the position of the phosphate atom (as a proxy for the Luzzati plane), the authors find \(d_\text {m}=1.80\,\text {nm}\). Also using the value \(\ell \approx 1.61\,\text {nm}\) determined in the same paper, we arrive at \(K_{0,\text {t}}\approx 0.16 (1- \tilde{\nu })\,\text {nm}^{-1}\), or a corresponding curvature radius of \(1/K_{0,\text {t}}\approx 6 / (1-\tilde{\nu })\,\text {nm}\). In practice we do not expect any strong deviation from incompressibility, and even if we assume \(\nu \approx 0.45\), we still find \(1/K_{0,\text {t}}\approx 33\,\text {nm}\), much larger than any of the other microscopic length scales (such as \(d_\text {m}\), \(z_0\), or \(\ell \)). It is hence a very good approximation to neglect the \(K_{0,\text {t}}\) term altogether.
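The arithmetic behind this estimate is easy to reproduce. The following minimal Python sketch evaluates Eq. (113) with the DMPC numbers quoted above. The function and constant names are ours, and the conversion \(\tilde{\nu }=\nu /(1-\nu )\) used for the compressible example is an assumption (it gives \(\tilde{\nu }=1\) at \(\nu =\frac{1}{2}\)); it should be checked against the definition of \(\tilde{\nu }\) given earlier.

```python
# Numerical check of Eq. (113) with the DMPC values quoted in the text.
Z0 = 1.32    # pivotal-plane position z_0 in nm (Wang and Deserno 2016)
D_M = 1.80   # monolayer thickness d_m in nm
ELL = 1.61   # length scale ell in nm, as defined via Eq. (104)


def k0t(nu_tilde, z0=Z0, d_m=D_M, ell=ELL):
    """Return K_{0,t} in nm^-1 according to Eq. (113)."""
    return (1.0 - nu_tilde) / (2.0 * ell**2) * (2.0 * z0 - d_m)


def nu_tilde_from_nu(nu):
    """ASSUMPTION (not stated in this section): nu_tilde = nu / (1 - nu)."""
    return nu / (1.0 - nu)


prefactor = k0t(0.0)                              # coefficient of (1 - nu_tilde)
radius_045 = 1.0 / k0t(nu_tilde_from_nu(0.45))    # curvature radius for nu = 0.45

print(f"K_0,t / (1 - nu_tilde) = {prefactor:.3f} nm^-1")
print(f"1/K_0,t at nu = 0.45   = {radius_045:.1f} nm")
```

Under these assumptions the prefactor rounds to the \(0.16\,\text {nm}^{-1}\) quoted above, and the radius for \(\nu =0.45\) agrees with the estimate in the text up to rounding of intermediate values.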

The other two new terms contain the Poisson ratio in a way that leaves their incompressible limit finite, and for the sake of estimating magnitudes, we will hence set \(\tilde{\nu }=1\). This immediately shows that \(\kappa _{\text {m},\nu }=\kappa _\text {m}\) and hence also \(\ell _\nu =\ell \). The final expression, \(\kappa _\text {m}K_{0,\text {m}}'\), is then found to be the first moment of \(\tilde{E}(z)-\sigma _0(z)\). With the approximation \(\tilde{E}(z)=\tilde{E}=\text {const.}\), we then find

$$\begin{aligned} \kappa _\text {m}K_{0,\text {m}}' \approx \int \text {d}z \, \left[ \tilde{E} (z)- \sigma _0(z) \right] z \approx \tilde{E} \int _0^{d_\text {m}} \text {d}z \, (z-z_0) + \kappa _\text {m}K_{0,\text {m}} \ , \end{aligned}$$

where we again explicitly centered the z-integral. We hence find

$$\begin{aligned} K_{0,\text {m}}' = K_{0,\text {m}} - \frac{\tilde{E}d_\text {m}}{2\kappa _\text {m}}(2z_0-d_\text {m}) \ . \end{aligned}$$
(114)

If the pivotal plane is in the middle of the leaflet, then \(K_{0,\text {m}}'=K_{0,\text {m}}\). However, usually the pivotal plane of a lipid monolayer is located closer to the headgroup region, often about \(\frac{2}{3}\) up along the lipid. Using this rule of thumb, we get

$$\begin{aligned} K_{0,\text {m}}' \approx K_{0,\text {m}} - \frac{\tilde{E}d_\text {m}^2}{6\kappa _\text {m}} \qquad \text {(if }z_0=\textstyle \frac{2}{3}d_\text {m}\text {)} \ . \end{aligned}$$
(115)

If we now apply the constant-\(\tilde{E}\)-approximation also to Eq. (102a), we get

$$\begin{aligned} \kappa _\text {m}\approx \tilde{E} \int _0^{d_\text {m}} \!\!\! \text {d}z (z-z_0)^2 - \overline{\kappa }_\text {m}= \frac{1}{3}\tilde{E}d_\text {m}(d_\text {m}^2-3d_\text {m}z_0+3z_0^2) -\overline{\kappa }_\text {m}\ . \end{aligned}$$
(116)

And if we again specialize to the good guess \(z_0=\frac{2}{3}d_\text {m}\), we find

$$\begin{aligned} \frac{\tilde{E}d_\text {m}^3}{9} = \kappa _\text {m}+\overline{\kappa }_\text {m}\qquad \text {(if }z_0=\textstyle \frac{2}{3}d_\text {m}\text {)} \ , \end{aligned}$$
(117)

which together with Eq. (115) leads to

$$\begin{aligned} K_{0,\text {m}}' \approx K_{0,\text {m}} - \frac{1+r_\text {m}}{z_0} \qquad \text {(if }z_0=\textstyle \frac{2}{3}d_\text {m}\text {)} \ . \end{aligned}$$
(118)

This expression is quite curious, because the “correction” part \((1+r_\text {m})/z_0\) can be anything between zero and very large. It vanishes for \(r_\text {m}=-1\), which is a perfectly permissible value for the Gaussian elastic ratio. On the other hand, it is equally possible that \(r_\text {m}\) is somewhere between \(-1\) and 0, say \(-\frac{1}{2}\), in which case the additional term is \(-1/(2z_0)\), and this is a very strong spontaneous curvature. Recall that Wang and Deserno (2016) found \(z_0=1.32\,\text {nm}\) for a united-atom model of DMPC, which gives \(-1/(2z_0)\approx -0.38\,\text {nm}^{-1}\), much larger (in magnitude) than typical lipid spontaneous curvatures. For comparison, the conventional spontaneous curvature \(K_{0,\text {m}}\) for DMPC is about \(0.025\,\text {nm}^{-1}\) (Venable et al. 2015), and lysophosphatidylcholine, one of the most strongly positively curved lipids, has a spontaneous curvature of about \(0.26\,\text {nm}^{-1}\) (Kooijman et al. 2005). The reason why such a potentially large \(K_{0,\text {m}}'\) does not majorly affect bilayer stability and morphology is that it does not directly enter the bending term; only the ordinary spontaneous curvature \(K_{0,\text {m}}\) does.
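The chain from Eqs. (115) and (117) to Eq. (118) can be verified numerically. The sketch below is only an illustration: the function names are ours, the value of \(\kappa _\text {m}\) is an arbitrary placeholder (it drops out of the result), and the rule of thumb \(z_0=\frac{2}{3}d_\text {m}\) is imposed by hand.

```python
import math


def correction_from_moments(kappa_m, r_m, d_m):
    """Correction to K_{0,m} from Eq. (115), with E~ taken from Eq. (117).

    Valid only under the rule of thumb z_0 = (2/3) d_m.
    """
    e_tilde = 9.0 * kappa_m * (1.0 + r_m) / d_m**3   # Eq. (117) solved for E~
    return -e_tilde * d_m**2 / (6.0 * kappa_m)        # second term of Eq. (115)


def correction_closed_form(r_m, z0):
    """Correction to K_{0,m} according to Eq. (118)."""
    return -(1.0 + r_m) / z0


z0 = 1.32            # nm, DMPC value quoted in the text
d_m = 1.5 * z0       # nm, chosen so that z_0 = (2/3) d_m holds exactly
kappa_m = 10.0       # arbitrary placeholder, cancels out

for r_m in (-1.0, -0.5, 0.0):
    a = correction_from_moments(kappa_m, r_m, d_m)
    b = correction_closed_form(r_m, z0)
    print(f"r_m = {r_m:+.1f}: {a:+.3f} nm^-1 (moments), {b:+.3f} nm^-1 (Eq. 118)")
```

Both routes agree for every \(r_\text {m}\), and for \(r_\text {m}=-\frac{1}{2}\) the magnitude of the correction exceeds the DMPC spontaneous curvature quoted above by more than an order of magnitude.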

Putting things together. We can finally write down a (slightly approximated) version of the surface energy functional, in which we ignore \(K_{0,\text {t}}\), identify \(\kappa _{\text {m},\nu }=\kappa _\text {m}\), wrest the tilde from the Gaussian curvature, and also disentangle the term \(K_{ki}K^k_j\) by virtue of the once-contracted Gauss equation \(K_{ki}K^k_j = KK_{ij}-K_\text {G}g_{ij}\):

$$\begin{aligned} e_{\text {2d}} = \frac{1}{2} \kappa _\text {m}(K + \nabla _iT^i - K_{0,\text {m}})^2 + \overline{\kappa }_\text {m}K_\text {G}+ \frac{1}{2} \kappa _\text {t,m}M_{ij} T^i T^j \end{aligned}$$
(119)

with

$$\begin{aligned} M_{ij} = \Big [1 + \ell ^2 \big ( K K_{0,\text {m}}'- K^2 + (1-r_\text {m}) K_\text {G}\big )\Big ] g_{ij} + r_\text {m}\ell ^2 KK_{ij} \ . \end{aligned}$$
(120)

2.5 Some Consequences of the Curvature-Tilt Functional

As stated before, the theory presented here follows the lead of Hamm and Kozlov (2000), but it retains some of the higher order terms which they have neglected—specifically biquadratic terms such as \(K_\text {G}T^2\), which is quadratic in both curvature and tilt. Hamm and Kozlov eliminate such terms in their treatment whenever they explicitly occur, on account of them being higher order than the usual quadratic terms. And yet, the tilde over \(K_\text {G}\), which they do not ignore, is effectively a biquadratic term.

To be consistent, two paths are possible. The simple one is to eliminate all biquadratics, including the tilde over \(K_\text {G}\). The perhaps more interesting one is to keep them all, because they are responsible for some fascinating new physics. However, one could object to this on the grounds that if we keep biquadratic terms, we should also keep quartic ones, such as \(K^4\), \(K_\text {G}^2\), or \(T^2(\nabla _kT^k)^2\). This is, in principle, a valid concern. However, there are good pragmatic reasons for working with a theory that drops these terms, despite the issue of a consistent order termination: the biquadratic terms create qualitative changes in the curvature-tilt theory, because they introduce a new mode of coupling between curvature and tilt that is absent on the quadratic level. In consequence, they spawn “new physics,” as we will soon see. The same cannot be said for the quartic terms, which (at least initially) only quantitatively change the physics, for instance by affecting the curvature energy and hence changing equilibrium shapes, while only indirectly affecting the partnering field. Of course, ultimately we would need all terms for truly quantitative predictions, but it is easier to investigate how a novel curvature-tilt coupling affects the basic physics without simultaneously having to deal with all other conceivable nonlinearities on the non-coupled side of the energy functional. We hence learn what new physics is in store, and so we can create hypotheses worthy of testing with more refined approaches. Incidentally, it is of note that the geometric transformations we have discussed above indeed create terms quartic in curvature, but they do not create purely quartic tilt terms.

As anticipated when we started, the new two-dimensional surface functional comes with a number of coupling coefficients in front of terms that are permitted by symmetry, but the underlying elastic theory predicts their values in terms of the underlying parameters, such as \(\lambda _{ijkl}\) or \(\sigma _0(z)\). Crucially, this is not only true for the “classical” parameters which Hamm and Kozlov (2000) already wrote down, but also for all higher order terms. This means that any ad hoc extension of their original functional by terms such as \(K_\text {G}T^2\) would likely miss the fact that the corresponding prefactors are not new coefficients but related to the existing ones, such as \(\overline{\kappa }_\text {m}\).

An important general finding is that all biquadratic terms act as position-dependent contributions to the tilt modulus. This is mathematically obvious, but then, it would be equally conceivable to have them enter as position-dependent contributions to the bending modulus. After all, the following (simplified toy) expressions are perfectly equivalent:

$$\begin{aligned} \frac{1}{2}\kappa _\text {m}K^2 + \frac{1}{2}\kappa _\text {t,m}\left[ 1+\frac{A}{\kappa _\text {t,m}}K^2\right] T^2 = \frac{1}{2}\kappa _\text {m}\left[ 1+\frac{A}{\kappa _\text {m}}T^2\right] K^2 + \frac{1}{2}\kappa _\text {t,m}T^2 \ . \end{aligned}$$
(121)

But while equivalent, from a practical point of view the notion that the tilt modulus gets modified is more useful. To begin with, at sufficiently large scale tilt becomes irrelevant (see footnote 4), and hence bending is all there is. It then makes sense to solve the problem iteratively by starting with the shape solution in the absence of tilt, and then take this to calculate the tilt field at a given shape background. Moreover, there are interesting cases where the shape is given and need not really be solved for, such as when we ask what the tilt field is at the edge of a membrane or within a small pore, where a monolayer tightly curves around to connect the two individual leaflets. In this case, again, it makes sense to solve for the tilt field in the presence of a shape, but not the other way around. Of course, should there ever be a situation where the opposite point of view is more useful, it is trivial to rewrite our equations to reflect this shift in philosophy.

Observe that the biquadratic terms do not merely amend the tilt modulus in a local curvature-dependent way; they amend it in an anisotropic way, because \(M_{ij}\) is not merely proportional to \(g_{ij}\): the last term in Eq. (120) involves the curvature tensor \(K_{ij}\). As a consequence, the eigenvectors of \(M_{ij}\) coincide with those of \(K_{ij}\), and so the principal curvature directions of the surface also play a special role for tilting. To make this more explicit, assume that \({\varvec{p}}=p^i{\varvec{e}}_i\) and \({\varvec{q}}=q^i{\varvec{e}}_i\) are the two principal directions of \(K_{ij}\) (at some local point), so that we can write it as \(K_{ij}=K_p p_ip_j+K_qq_iq_j\), where \(K_p\) and \(K_q\) are the principal curvatures. The anisotropic term in the tilt energy density can hence be written as

$$\begin{aligned} \frac{1}{2}\overline{\kappa }_\text {m}KK_{ij}T^iT^j&= \frac{1}{2}\overline{\kappa }_\text {m}K\Big [K_p \, p_ip_j+K_q \, q_iq_j\Big ]T^iT^j \nonumber \\&= \frac{1}{2}\overline{\kappa }_\text {m}K\Big [K_p\,T_p^2 + K_q\,T_q^2\Big ] \ , \end{aligned}$$
(122)

where \(T_p= T^ip_i={\varvec{T}}\cdot {\varvec{p}}\) is the p-component of the tilt field, and \(T_q\) is the q-component. For instance, imagine a straight membrane edge, where the p-direction points “around” the edge, and the q-direction points along the edge. In that case, \(K_q=0\) and \(K_p\approx 1/z_0\), giving the contribution

$$\begin{aligned} \text {straight edge:}\quad \frac{1}{2}\overline{\kappa }_\text {m}KK_{ij}T^iT^j = \frac{1}{2}\overline{\kappa }_\text {m}\frac{1}{z_0^2} T_p^2 \end{aligned}$$
(123)

This term leaves any tilt along the edge unaffected, but it lowers the cost for tilting around the edge—since \(\overline{\kappa }_\text {m}<0\). In fact, it is easy to see that the full edge tilt energy density is given by

$$\begin{aligned} e_\text {2d,edge} = \frac{1}{2}\kappa _\text {t,m}\left[ 1+\frac{\ell ^2}{z_0^2}\Big (K_{0,\text {m}}'z_0-1\Big )\right] T^2 + \frac{1}{2}\overline{\kappa }_\text {m}\frac{1}{z_0^2}T_p^2 \ . \end{aligned}$$
(124)

However, there is something quite disconcerting about this expression: the ratio \(\ell ^2/z_0^2\) can be bigger than 1. In fact, taking the numbers which Wang and Deserno (2016) found for DMPC (\(\ell =1.61\,\text {nm}\) and \(z_0=1.32\,\text {nm}\)) we get \(\ell ^2/z_0^2\approx 1.5\). Now, \(r_\text {m}<0\), and \(K_{0,\text {m}}'z_0\) is generally very negative; see Eq. (118). We hence must conclude that for curvatures as large as the ones we encounter at an open edge, the effective tilt energy density is negative, and this could in principle drive the tilt to grow beyond all bounds. For the tilt around the edge this cannot happen in practice over the short region of the edge, since the tilt divergence term in Eq. (119) prevents the tilt from changing too rapidly. But notice that Eq. (124) shows that the effective tilt modulus along the edge can also become negative (see footnote 5), and in that case the finite-region-argument does not save us. Hence, it truly is worrisome that the functional can cease to be bounded below. This, of course, is a direct consequence of us having neglected quartic terms, which would have to stabilize it (since the microscopic theory we started out with is clearly bounded below). We thereby have encountered a case where we are pushing our theory to its limits. But we also discover remarkable physics that is hidden at that border, for even if we catch the divergence by a quartic term, we have now run into a phase transition, and so it is conceivable that strongly curved regions create spontaneous tilt. A more refined theory is necessary to probe this, but even without such a better theory, the “circumstantial evidence” that exciting things can happen in highly curved regions might motivate us to look for them in experiments or simulations.
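To get a feeling for the signs involved, one can evaluate the brackets of Eq. (124) for a few trial values of \(r_\text {m}\). The following sketch is a rough illustration under stated assumptions: it uses the DMPC numbers quoted in the text, takes \(K_{0,\text {m}}'\) from Eq. (118) even though \(z_0=\frac{2}{3}d_\text {m}\) holds only approximately for these numbers, and rewrites \(\overline{\kappa }_\text {m}/\kappa _\text {t,m}=r_\text {m}\ell ^2\), which is the identification implied by comparing Eqs. (120) and (122). All names are ours.

```python
ELL = 1.61    # nm, Wang and Deserno (2016)
Z0 = 1.32     # nm, Wang and Deserno (2016)
K0M = 0.025   # nm^-1, DMPC spontaneous curvature quoted in the text


def edge_tilt_factors(r_m, k0m=K0M, ell=ELL, z0=Z0):
    """Effective tilt moduli at a straight edge, in units of kappa_t,m.

    Returns (along_edge, around_edge), i.e. the coefficients of
    (1/2) kappa_t,m T_q^2 and (1/2) kappa_t,m T_p^2 in Eq. (124).
    """
    ratio = ell**2 / z0**2
    k0m_prime = k0m - (1.0 + r_m) / z0             # Eq. (118), approximate here
    along = 1.0 + ratio * (k0m_prime * z0 - 1.0)   # isotropic bracket of Eq. (124)
    around = along + r_m * ratio                   # extra kappa_bar_m / z_0^2 term
    return along, around


for r_m in (-1.0, -0.5, -0.1):
    along, around = edge_tilt_factors(r_m)
    print(f"r_m = {r_m:+.1f}: along edge {along:+.2f}, around edge {around:+.2f}")
```

For these inputs both factors come out negative, which is the loss of boundedness discussed above; different parameter choices, or a more careful value of \(K_{0,\text {m}}'\), may of course change the picture.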

Clearly, the anisotropic term in \(M_{ij}\) vanishes if \(K=0\), meaning that on minimal surfaces the tilt modulus is always isotropic. This is curious, because minimal surfaces are anything but isotropic. The other possibility for \(M_{ij}\) being isotropic is if the curvature tensor is locally proportional to the metric, \(K_{ij}=c\,g_{ij}\) with some (possibly position-dependent) function \(c(u^1,u^2)\). What do such surfaces look like? Inserting this special form of \(K_{ij}\) in the contracted Gauss–Codazzi equation, we find

$$\begin{aligned} 0 = \nabla _jK-\nabla _iK_j^i = \nabla _j(2\,c)-\nabla _i (c\,g_j^i) = 2\nabla _jc - \nabla _jc = \nabla _j c \ , \end{aligned}$$
(125)

and hence \(c=\text {const}\). We then have \(K_{ij}=c\,g_{ij}\) with a constant prefactor c. Such surfaces are spheres (do Carmo 1976), and so the resulting isotropy of \(M_{ij}\) is much less mysterious.

3 Measuring the Bending Modulus

In the previous section we have derived a curvature-tilt functional, following the original treatment of Hamm and Kozlov (2000). The functional form of many terms in that theory is often highly intuitive, in the sense that we could have confidently predicted that these terms would show up; but there is of course nothing intuitive about their prefactors. Revisiting the bottom-up and top-down philosophies discussed at the beginning of Sect. 2, we now have two choices: either we derive the resulting moduli from the lower level theory, or we need to determine them on the level of the larger scale theory.

In the present case, the lower level theory was built on the notion of a pre-stressed thin fluid elastic sheet, quantified by the elastic modulus tensor \(\lambda _{ijkl}\) and the stress profile \(\sigma _0(z)\), and we know from Eqs. (102) how the parameters of the curvature-tilt functional relate to the lower level input. However, we have not yet addressed the question where we would get \(\lambda _{ijkl}\) and \(\sigma _0(z)\) from. Again, we have two choices here. One of them is that there could be an even lower level theory that predicts these objects, based on even more fundamental parameters. And yet, the reader might be wondering whether we are merely begging the question, for where would these parameters come from? An even lower level model? And where would its parameters come from? What saves us from an infinite regress? The answer is, usually, that at some point we declare that we know the theory and the parameters. We state that this is the most fundamental level we care about, and that on this level we happen to have a theory that we trust. For instance, we could state that the lowest level we care about is atomistic chemistry (meaning, we ignore nuclei, quarks, strings, ...), and that we are maybe even willing to trust the force fields of classical molecular dynamics to be applicable to this problem. Being poor calculators of such complex systems, we then most likely think hard what type of simulation would give us, say, a modulus tensor, and then we run such a simulation and “measure” that tensor.

The other choice is to forgo the hope of predicting the elastic modulus tensor \(\lambda _{ijkl}\) and the stress profile \(\sigma _0(z)\) from some underlying theory and instead measure them in experiment. Once we have them, we can then plug the results into Eqs. (102) and derive the curvature-tilt parameters, such as the bending modulus.

Thinking about the second option, the following question might stir: why not measure the parameters of the curvature-tilt theory directly? Why should we even take the detour over the lower level theory? Why not cut out the middle man?

The question is serious. After all, we have just noted that the form of the terms in the higher level theory is often very clear: symmetry principles usually go a long way in telling us which terms can or cannot appear. Hence, their presence in a theory rests on something stronger and more fundamental than the particular lower level model we have chosen to construct. Stated differently: if the higher level theory can be phrased completely in terms of observables that emerge on that higher level, the specific details for how that emergence happens need not concern us in order to have a perfectly workable theory on that level. We do not have to dig down into the details. But if we do care about emergence, then a key worry might be whether we got the underlying model right. It is then a good idea to measure things on both levels, followed by a series of tests that scrutinize the putative connections between the two theories. Or we might at least test whether connections predicted entirely on the level of emergent quantities, which are consequences of the model, turn out to be satisfied. If they hold up, the underlying model is promising. If they fail, it is (likely) wrong.

Notice that we are merely retracing the thoughts of the beginning of Sect. 2, with the specific issue of model parameters in mind. We hope the reader will not consider them trite, because in this interplay between tiers of modeling lies a core element of science and epistemology.

The purpose, then, of this last section is to discuss how some of the parameters entering the theories discussed so far can be measured. As we have argued above, determining parameters on two different levels, and then checking whether they connect according to some proposed model of emergence, is probably the most thorough way to probe nature. However, this is a rather extensive endeavor, and it would warrant a book of its own, just dealing with the special case of lipid membranes. We will hence restrict ourselves to some comments about measuring one parameter, one which happens to live at the emergent level, and illustrate the maybe unexpected richness of problems and opportunities that arise even in this narrow corner. Specifically, we will discuss using simulations to find a parameter. Purists do not call this “measurement,” and they are strictly right: we don’t query nature, we merely query a theoretical model invented to represent nature. In some sense, we use computers to solve a problem we are as yet incapable of tackling analytically, but whose answer follows inevitably from that model. And there it is again: tier-bridging.

Let us hence ask: how can we find the value of a membrane’s bending modulus \(\kappa \) in a simulation?

3.1 Active Versus Passive Strategies

When measuring spring constants, there are two conceptually different things one could do. First, one could simply deform the spring and monitor how much force is required for a given deformation. But if the spring is very soft, this requires measuring very small forces. In fact, the spring could be so soft that thermal fluctuations all by themselves already deform the spring. In that case we not only have to measure a presumably very tiny deformation force; we would also have to figure out how to correct for the effects of thermal noise. However, there is an opportunity here: if fluctuations alone deform the spring, maybe this suffices as a deformation? After all, we know the strength of thermal fluctuations, and if we can measure the spring’s stochastic response, we ought to be able to back out its stiffness.

This second approach, measuring fluctuations to infer rigidities, is very popular in many fields of soft matter physics. The reason is that soft matter has (almost by definition) small spring constants (read now: moduli), which can be inferred by the way they pit themselves against the thermal breeze. Lipid membranes are a good example of this, and we will begin with a discussion of how this connection works, before concluding that we can do better by actively deforming a membrane.

Membrane undulation spectrum. Consider a flat membrane patch of area \(L\times L\), and imagine it being subject to periodic boundary conditions. This is not only theoretically convenient; it is the most natural choice in simulations. Even if the membrane is on average flat, thermal fluctuations will roughen it up by adding stochastic undulations of the shape. However, these will be small, and so we can likely get away with a parametrization of the membrane that describes the geometry as a quadratic-level deviation from flatness: linear Monge gauge.

As a brief reminder: in Monge gauge, a membrane’s shape is described by a height function \(h({\varvec{r}})\) above a flat (horizontal) reference plane, with \({\varvec{r}}\) being the position within that plane. Let \(\nabla \) be the gradient operator in that base plane. If \(|\nabla h|\ll 1\), the expressions for area element and curvatures simplify significantly, and can be written as (Deserno 2015)

$$\begin{aligned} \text {d}A&= \sqrt{1+(\nabla h)^2}\,\text {d}^2 r \; \approx \; \big (1+\textstyle \frac{1}{2}(\nabla h)^2\big )\,\text {d}^2 r \ , \end{aligned}$$
(126a)
$$\begin{aligned} K&= -\nabla \cdot \left( \frac{\nabla h}{\sqrt{1+(\nabla h)^2}}\right) \; \approx \; -\nabla ^2 h \ , \end{aligned}$$
(126b)
$$\begin{aligned} K_\text {G}&= \frac{\text {det}(\partial _i\partial _j h)}{(1+(\nabla h)^2)^2} \; \approx \text {det}(\partial _i\partial _j h) \ . \end{aligned}$$
(126c)

Hence, the membrane Hamiltonian (including bending, but ignoring both spontaneous curvature and tilt for now, and adding a membrane tension \(\sigma \)) can be written as

$$\begin{aligned} E&= \int \text {d}A\left\{ \frac{1}{2}\kappa K^2 + \overline{\kappa }K_\text {G}+ \sigma \right\} \end{aligned}$$
(127a)
$$\begin{aligned}&\approx \int \text {d}^2r\left\{ \frac{1}{2}\kappa (\nabla ^2 h)^2 + \frac{1}{2}\sigma (\nabla h)^2\right\} + \text {const.} \end{aligned}$$
(127b)

where we eliminated the Gaussian term, because it vanishes under periodic boundary conditions—courtesy of the Gauss–Bonnet theorem.

The resulting Hamiltonian in Eq. (127b) is quadratic, but it contains gradients and Laplacians. These can be removed by going into Fourier space (since Fourier modes are the eigenfunctions of the gradient operator). Hence, let us Fourier expand the shape \(h({\varvec{r}})\) according to

$$\begin{aligned} h({\varvec{r}}) = \sum _{\varvec{q}}\tilde{h}_{\varvec{q}}\text {e}^{\text {i}{\varvec{q}}\cdot {\varvec{r}}} \quad \text {with}\;\; {\varvec{q}}= \frac{2\pi }{L}\genfrac(){0.0pt}0{n_x}{n_y} \quad \text {and}\;\; n_x,n_y\in \mathbb {Z} \ . \end{aligned}$$
(128)

Since we want this expansion to be real, we must require of the Fourier coefficients that \(\tilde{h}_{-{\varvec{q}}}=\tilde{h}_{\varvec{q}}^*\). Inserting this expansion into the quadratic Hamiltonian (127b), we find

$$\begin{aligned} E&= \frac{1}{2}\int \text {d}^2r\bigg \{ \sum _{{\varvec{q}},{\varvec{q}}'} \tilde{h}_{\varvec{q}}\tilde{h}_{{\varvec{q}}'}\Big [\kappa (-q^2)(-q'^2) + \sigma (\text {i}q)(\text {i}q')\Big ]\,\text {e}^{\text {i}({\varvec{q}}+{\varvec{q}}')\cdot {\varvec{r}}} \bigg \} \nonumber \\&= \frac{1}{2} \sum _{{\varvec{q}},{\varvec{q}}'} \tilde{h}_{\varvec{q}}\tilde{h}_{{\varvec{q}}'}\big (\kappa q^2q'^2 - \sigma q q'\big )\, \underbrace{\int \text {d}^2r \; \text {e}^{\text {i}({\varvec{q}}+{\varvec{q}}')\cdot {\varvec{r}}}}_{=L^2\,\delta _{{\varvec{q}},-{\varvec{q}}'}} \nonumber \\&= \frac{1}{2}L^2 \sum _{{\varvec{q}}} \big |\tilde{h}_{\varvec{q}}\big |^2\,\big (\kappa q^4 + \sigma q^2\big ) \ . \end{aligned}$$
(129)

This final form shows that if the membrane shape is expressed using the Fourier components \(\tilde{h}_{\varvec{q}}\) as degrees of freedom, then the Hamiltonian is not merely quadratic but diagonal—all degrees of freedom are independent. From the equipartition theorem we then immediately find that the mean squared amplitude of every Fourier mode is given by

$$\begin{aligned} \Big \langle \big |\tilde{h}_{\varvec{q}}\big |^2\Big \rangle \; = \; \frac{k_{\text {B}}T}{L^2(\kappa q^4+\sigma q^2)} \ . \end{aligned}$$
(130)

This formula, and variants of it, underlie a vast number of methods and papers for measuring the bending modulus \(\kappa \), both in simulation and, in fact, in experiment. The basic idea is that if we can access the fluctuation spectrum, we can fit to this equation and extract \(\kappa \).
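As an illustration of the fitting step, note that Eq. (130) becomes linear in \(\kappa \) and \(\sigma \) once it is inverted: \(k_{\text {B}}T/(L^2\langle |\tilde{h}_{\varvec{q}}|^2\rangle )=\kappa q^4+\sigma q^2\). The sketch below fits this form by ordinary least squares to synthetic, noise-free data generated from Eq. (130) itself. It is a minimal example, not a production analysis: all names and parameter values are ours, energies are measured in units of \(k_{\text {B}}T\), and real simulation data would additionally require error weighting and a sensible cutoff at large \(q\).

```python
import math


def predicted_spectrum(q, kappa, sigma, L, kT=1.0):
    """Mean squared mode amplitude <|h_q|^2> from Eq. (130)."""
    return kT / (L**2 * (kappa * q**4 + sigma * q**2))


def fit_kappa_sigma(qs, spectrum, L, kT=1.0):
    """Least-squares fit of y = kappa q^4 + sigma q^2, with y = kT / (L^2 <|h_q|^2>)."""
    ys = [kT / (L**2 * s) for s in spectrum]
    # Normal equations for the two basis functions q^4 and q^2.
    a11 = sum(q**8 for q in qs)
    a12 = sum(q**6 for q in qs)
    a22 = sum(q**4 for q in qs)
    b1 = sum(y * q**4 for q, y in zip(qs, ys))
    b2 = sum(y * q**2 for q, y in zip(qs, ys))
    det = a11 * a22 - a12**2
    kappa = (b1 * a22 - b2 * a12) / det
    sigma = (a11 * b2 - a12 * b1) / det
    return kappa, sigma


# Synthetic example (hypothetical numbers): box length in nm, moduli in kT units.
L = 20.0
kappa_true, sigma_true = 20.0, 0.5
qs = [2.0 * math.pi * n / L for n in range(1, 9)]
spectrum = [predicted_spectrum(q, kappa_true, sigma_true, L) for q in qs]

kappa_fit, sigma_fit = fit_kappa_sigma(qs, spectrum, L)
print(f"kappa = {kappa_fit:.4f} kT, sigma = {sigma_fit:.4f} kT/nm^2")
```

Fitting the inverted spectrum keeps the problem linear, at the price of giving large-\(q\) modes more weight; a weighted or logarithmic fit is a common alternative.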

But let’s now investigate how much of a membrane deformation we are talking about. First, notice that the bending rigidity will of course reduce the fluctuations, as will the tension. To get the biggest effect, let us imagine that we set the tension to zero (see footnote 6). Since the (root mean square) curvature will scale like \(q^2|\tilde{h}_{\varvec{q}}|\), the typical (root mean square) radius of curvature \(\overline{R}_{\varvec{q}}\) of any given Fourier mode \({\varvec{q}}\) is going to be

$$\begin{aligned} \overline{R}_{\varvec{q}}\sim \frac{1}{\sqrt{\langle K^2\rangle }} \sim \frac{1}{\sqrt{\langle (q^2|\tilde{h}_{\varvec{q}}|)^2\rangle }} \mathop {=}\limits ^{\sigma =0} L\sqrt{\frac{\kappa }{k_{\text {B}}T}} \ . \end{aligned}$$
(131)

Since for a typical bilayer membrane we have \(\kappa \sim 10\ldots 50\,k_{\text {B}}T\), we find \(\overline{R}_{\varvec{q}}\sim 3\ldots 7\,L\), showing that, independently of mode, the radius of curvature is several times bigger than the size of the bilayer. These are very weak curvatures! Not only are they hard to pick up in a simulation (see footnote 7), they are also much smaller than many curvatures we are likely to later impose on membranes (say, when we simulate vesicles), raising the question whether at much larger curvatures the quadratic theory assumed in Eq. (127a) actually holds.
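The mode independence of this estimate can be checked directly by inserting Eq. (130) at \(\sigma =0\) into the scaling of Eq. (131). The short sketch below does this for a few wave vectors; the function name and the box length are ours, and energies are measured in units of \(k_{\text {B}}T\).

```python
import math


def radius_over_box(q, kappa, L, kT=1.0):
    """Scaling estimate R_q / L from Eq. (131), using Eq. (130) at zero tension."""
    h2 = kT / (L**2 * kappa * q**4)            # <|h_q|^2> for sigma = 0
    return 1.0 / (L * math.sqrt(q**4 * h2))    # 1 / sqrt(<(q^2 |h_q|)^2>) / L


L = 20.0  # hypothetical box length; it drops out of the ratio
for kappa in (10.0, 50.0):
    ratios = [radius_over_box(2.0 * math.pi * n / L, kappa, L) for n in (1, 2, 5)]
    print(f"kappa = {kappa:4.0f} kT: R_q/L = " + ", ".join(f"{r:.2f}" for r in ratios))
```

Every mode returns the same ratio \(\sqrt{\kappa /k_{\text {B}}T}\), which for the quoted range of rigidities lies between roughly 3 and 7.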

The reason this happens is that the rigidity \(\kappa \) is actually not really small compared to thermal energy \(k_{\text {B}}T\). It is comfortably larger than thermal energy, ensuring that membranes do not fluctuate themselves into bits and pieces, and so while flickering of membranes is readily observed, it is still a small effect.

Force along a cylindrical membrane tube. The observations from the previous section suggest that we could instead look at an actively imposed deformation of a membrane and measure the force required to impose it. Several years ago, Harmandaris and Deserno (2006) proposed to study a cylindrical membrane tube (connected through periodic boundary conditions into one “infinitely long” cylinder) and measure the axial force along it. It is easy to see that such a force should exist: the fixed number of lipids in the simulation box will give rise to a membrane of some given overall area \(A=2\pi R L\), where R and L are cylinder radius and length, respectively. If we change the length of the cylinder, we change R (since A must stay constant), and so we change the bending energy E. This results in a force F, given by

$$\begin{aligned} F = \frac{\partial E}{\partial L}\bigg |_A&= \frac{\partial }{\partial L}\bigg |_A\bigg [\frac{1}{2}\kappa \left( \frac{1}{R}\right) ^2\times A\bigg ] \; = \; \frac{\partial }{\partial L}\bigg |_A\bigg [\frac{1}{2}\kappa \left( \frac{2\pi L}{A}\right) ^2\times A\bigg ] \nonumber \\&= \kappa \left( \frac{2\pi L}{A}\right) \left( \frac{2\pi }{A}\right) \times A \; = \; \frac{2\pi \kappa }{R} \ . \end{aligned}$$
(132)

Hence, measuring the force and the radius gives the rigidity: \(\kappa =FR/2\pi \). Moreover, we can impose much larger curvatures than would ever happen under passive undulation conditions, and so we can test how far the quadratic curvature Hamiltonian (127a) can be trusted. Harmandaris and Deserno (2006) found that—for the coarse grained model they studied (Cooke et al. 2005)—it worked with remarkable accuracy down to curvature radii equal to a few times the membrane thickness—much better than one would probably have any right to hope! Also, the measured rigidity was compatible with what was previously measured from monitoring membrane shape undulations (i.e., exploiting Eq. (130)), but it could be measured more precisely with the same simulation overhead.
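Equation (132) can be cross-checked by differentiating the tube energy numerically at fixed area. The sketch below is only a consistency check of the formula, not a description of how the force is obtained in an actual simulation; the function names and the numerical values are hypothetical.

```python
import math


def tube_energy(L, A, kappa):
    """Bending energy (1/2) kappa (1/R)^2 A of a cylinder with fixed area A = 2 pi R L."""
    R = A / (2.0 * math.pi * L)
    return 0.5 * kappa * A / R**2


def tube_force_numeric(L, A, kappa, dL=1e-4):
    """Central-difference estimate of F = dE/dL at constant area."""
    return (tube_energy(L + dL, A, kappa) - tube_energy(L - dL, A, kappa)) / (2.0 * dL)


def kappa_from_force(F, R):
    """Invert Eq. (132): kappa = F R / (2 pi)."""
    return F * R / (2.0 * math.pi)


# Hypothetical example: kappa in kT, lengths in nm.
kappa_true = 20.0
R, L = 5.0, 30.0
A = 2.0 * math.pi * R * L

F_num = tube_force_numeric(L, A, kappa_true)
F_formula = 2.0 * math.pi * kappa_true / R
print(f"F (numeric) = {F_num:.6f}, F (Eq. 132) = {F_formula:.6f}")
print(f"kappa recovered = {kappa_from_force(F_num, R):.6f}")
```

Since the energy is quadratic in \(L\) at fixed \(A\), the central difference reproduces \(2\pi \kappa /R\) up to rounding error.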

There is a big snag, though: as nice and intuitive as this method appears, it fundamentally relies on two conditions that are hardly ever met in a realistic simulation context, both of which are related to the equilibration of a chemical potential. First, the simulation setup divides the simulation box into a region inside the tube, and a region outside. These do not easily communicate, because the solvent (water, or a coarse grained version of it) usually does not diffuse fast enough through a bilayer (on the time scales relevant for the simulation). While in reality the chemical potential of water is equilibrated across the two sides, in a simulation it generally is not (we do not know ahead of time how much water we really need to put into the two environments), and it will not automatically equilibrate. Second, the chemical potential of the lipids in the two bilayer leaflets also has to be the same, since lipids can flip-flop between leaflets. But again, this typically is much too slow a process to significantly happen during the course of a simulation, so unless we set up the system already in equilibrium (and we cannot easily do that, because we do not know how many more lipids we would have to place in the outside leaflet), we have no chance of instead converging to it. Harmandaris and Deserno did not have these difficulties, since the highly coarse grained lipid model which they used (Cooke et al. 2005) (a) has no solvent and (b) has a sufficiently high flip–flop rate. But for any more highly resolved and not necessarily solvent free model, the “pulling-a-tube” method does not readily work.

And yet, this idea of an active deformation remains enticing—we just need to find a way to circumvent the unfortunate equilibration troubles. The path to glory exists, and it involves looking at a different deformation.

3.2 Buckling for Fluid Membranes

In a very important paper, Noguchi (2011) presented a method that solves this problem (without actually needing it for the model he used): if we place a membrane into a box that is too small for that membrane, it will buckle. Choosing a large aspect ratio, we end up with a very well-defined one-dimensional deformation, an example of which is shown in Fig. 12. Clearly, maintaining that shape requires a force, which ought to encode the stiffness of the membrane—buckling a more rigid membrane ought to be harder. In fact, it seems clear that this force ought to be proportional to the bending rigidity \(\kappa \). In the following we provide a solution to this problem, following Hu et al. (2013), which pushes the analytical treatment slightly farther than Noguchi did.

Fig. 12

Reprinted from Hu et al. (2013), with the permission of AIP Publishing

Geometry of a buckled membrane, and illustration of the angle-arclength parametrization that can be used to describe it: it gives the angle \(\psi (s)\) of the local profile with respect to the horizontal as a function of the arclength measured along the buckle.

The shape of a one-dimensional buckle. If we parametrize the membrane in the angle-arclength parametrization \(\psi (s)\) indicated in Fig. 12, the relevant curvature along the buckle is given by \(-\dot{\psi }\). Since the curvature in the perpendicular direction vanishes, we get \(K=-\dot{\psi }\) and \(K_\text {G}=0\). The curvature elastic Hamiltonian (again, without tilt) is then given by

$$\begin{aligned} E = L_y\int _0^L\text {d}s\left\{ \frac{1}{2}\kappa \dot{\psi }^2 + f_x\left[ \cos \psi -\frac{L_x}{L}\right] \right\} \ . \end{aligned}$$
(133)

The second term in the integrand enforces the constraint that the membrane fits into the box—meaning, that the total distance traversed horizontally equals \(L_x\). Physically, the associated Lagrange multiplier \(f_x\) is nothing but the force (per unit length) required to ensure that this constraint is satisfied.

A simple functional variation gives the Euler–Lagrange equation that \(\psi (s)\) needs to satisfy in order to minimize this energy:

$$\begin{aligned} \ddot{\psi }+\lambda ^{-2}\sin \psi = 0 \qquad \text {with}\quad \lambda =\sqrt{\frac{\kappa }{f_x}} \ , \end{aligned}$$
(134)

where we encounter a new characteristic length \(\lambda \). If we multiply this equation with \(\dot{\psi }\), we find

$$\begin{aligned} 0 = \dot{\psi }\ddot{\psi }+ \lambda ^{-2}\dot{\psi }\sin \psi \; = \; \frac{\text {d}}{\text {d}s}\left[ \frac{1}{2}\dot{\psi }^2-\lambda ^{-2}\cos \psi \right] \ , \end{aligned}$$
(135)

showing that the expression in square brackets is conserved and hence a first integral. We can make this constant more explicit by evaluating the expression at an inflection point of the buckle, where \(\dot{\psi }=0\). Calling the value of the angle at that point \(\psi _\text {i}\), we get

$$\begin{aligned} \frac{1}{2}\dot{\psi }^2-\lambda ^{-2}\cos \psi =-\lambda ^{-2}\cos \psi _\text {i}\ , \end{aligned}$$
(136)

a first-order differential equation whose quadrature can be found by separation of variables:

$$\begin{aligned} \frac{s}{\lambda } = \int _0^s\frac{\text {d}s'}{\lambda } = \int _0^\psi \!\!\frac{\text {d}\psi '}{\sqrt{2(\cos \psi '-\cos \psi _\text {i})}} = \text {F}\!\left[ \arcsin \frac{\sin (\psi /2)}{\sin (\psi _\text {i}/2)}\Big |\sin ^2\frac{\psi _\text {i}}{2}\right] \ . \end{aligned}$$
(137)

Here, \(\text {F}[z|m]\) is the incomplete elliptic integral of the first kind. (For all subsequent special functions, a veritable panoply of elliptic functions and integrals, see Abramowitz and Stegun (1970).) After defining the elliptic parameter

$$\begin{aligned} m = \sin ^2\frac{\psi _\text {i}}{2} \ , \end{aligned}$$
(138)

inverting Eq. (137) leads to the angle \(\psi (s)\)

$$\begin{aligned} \psi (s) = 2\,\arcsin \big \{\sqrt{m}\,\text {sn}\big [s/\lambda \,\big |\,m\big ]\big \} \ , \end{aligned}$$
(139)

and integrating the cosine and sine of that expression gives a parametric representation of the buckle:

$$\begin{aligned} x(s)&= 2\lambda \,\text {E}\big [\text {am}\big [s/\lambda \,\big |\,m\big ]\,\big |\,m\big ] - s \ , \end{aligned}$$
(140a)
$$\begin{aligned} z(s)&= 2\lambda \,\sqrt{m}\,\big (1-\text {cn}\big [s/\lambda \,\big |\,m\big ]\big ) \ . \end{aligned}$$
(140b)

For instance, the second equation (140b) shows that the buckle amplitude is \(z_\text {a}=z(L/4)=2\lambda \sqrt{m}\).
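As a quick numerical cross-check of Eqs. (139) and (140), the following Python sketch evaluates the buckle profile with SciPy's Jacobi elliptic functions. It assumes that the arclength is measured from a point where \(\psi =0\); the function name `buckle_shape` and the sample values of \(L\) and \(m\) are illustrative choices, not taken from Hu et al. (2013). The horizontal coordinate is computed as \(x(s)=\int _0^s\cos \psi \,\text {d}s'=2\lambda \,\text {E}[\text {am}[s/\lambda |m]|m]-s\), which is the form consistent with Eq. (144).

```python
import numpy as np
from scipy.special import ellipeinc, ellipj, ellipk


def buckle_shape(s, L, m):
    """Return (psi, x, z, lam) for an Euler buckle of contour length L.

    s   : arclength, measured from a point where psi = 0
    m   : elliptic parameter, m = sin^2(psi_i / 2), Eq. (138)
    lam : characteristic length sqrt(kappa / f_x), fixed by Eq. (143)
    """
    s = np.asarray(s, dtype=float)
    lam = L / (4.0 * ellipk(m))                 # Eq. (143)
    sn, cn, _dn, am = ellipj(s / lam, m)        # Jacobi sn, cn, dn and amplitude
    psi = 2.0 * np.arcsin(np.sqrt(m) * sn)      # Eq. (139)
    x = 2.0 * lam * ellipeinc(am, m) - s        # integral of cos(psi) from 0 to s
    z = 2.0 * lam * np.sqrt(m) * (1.0 - cn)     # integral of sin(psi) from 0 to s
    return psi, x, z, lam


# Illustrative values (assumptions): unit contour length, psi_i of about 66 degrees.
L, m = 1.0, 0.3
s = np.linspace(0.0, L, 2001)
psi, x, z, lam = buckle_shape(s, L, m)

print("amplitude z(L/4)      :", z[500])
print("expected 2*lam*sqrt(m):", 2.0 * lam * np.sqrt(m))
print("projected length x(L) :", x[-1])
```

The arrays `x` and `z` can be passed directly to a plotting routine to reproduce shapes of the kind sketched in Fig. 12; the value `x[-1]` is the box length \(L_x\) that belongs to the chosen \(m\).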

Fixing the constraints. The solutions (139) or (140) to the buckle’s differential equation contain two integration constants: first, \(\lambda \)—which really stands in for the unknown Lagrange multiplier \(f_x\); and second, m—which encodes the angle which the buckle makes at its inflection point. The first one is of great interest to us, the second one not so much—but it is the one that causes technical troubles, because in a simulation we do not fix the angle but the extent of a buckle’s compression—essentially, \(L_x\). Of course, we could always measure the inflection angle in our simulation, but this is laborious, for it would require us to explicitly determine the membrane shape. Instead, it is much more convenient to do a bit more work and re-express the constant m in terms of a more natural one, namely the compressional strain \(\gamma \), defined as

$$\begin{aligned} \gamma = \frac{L-L_x}{L} \ . \end{aligned}$$
(141)

To do so, recall that the two constants are fixed by the two boundary conditions of the problem, which are

$$\begin{aligned} \psi (L/4) = \psi _\text {i}\qquad \text {and}\qquad x(L/4) = L_x/4 \ . \end{aligned}$$
(142)

Using Eq. (137), the first condition implies

$$\begin{aligned} \frac{L}{4\lambda } = \text {F}\Big [\frac{\pi }{2}\,\big |\,m\Big ] = \text {K}[m] \ . \end{aligned}$$
(143)

And using Eq. (140a), the second one yields

$$\begin{aligned} L_x = 8\lambda \,\text {E}[m] - L \ . \end{aligned}$$
(144)

Between these two equations, the length \(\lambda \) can be eliminated, leading to the transcendental equation

$$\begin{aligned} \gamma (m) = 2\left( 1-\frac{\text {E}[m]}{\text {K}[m]}\right) \ , \end{aligned}$$
(145)

which we now “merely” have to invert for \(m(\gamma )\) in order to make the strain \(\gamma \) the independent variable. Unfortunately, this cannot be done in closed form. But it is quite easy to find an accurate series expansion solution, by making the ansatz

$$\begin{aligned} m(\gamma ) = \sum _{i=1}^\infty a_i\,\gamma ^i \ , \end{aligned}$$
(146)

inserting this into Eq. (145), again expanding the right hand side in a Taylor series in \(\gamma \), comparing equal powers of \(\gamma \) on both sides, and thus obtaining a set of equations that determine the coefficients \(a_i\). Most symbolic algebra packages do this in seconds, and one finds

$$\begin{aligned} m(\gamma ) = \gamma -\frac{1}{8}\gamma ^2-\frac{1}{32}\gamma ^3-\frac{11}{1024}\gamma ^4-\frac{17}{4096}\gamma ^5-\frac{55}{32\,768}\gamma ^6-\cdots \end{aligned}$$
(147)

Hu et al. (2013) tabulate the coefficients up to order \(\gamma ^{10}\) and show that the accuracy (compared to an “exact” numerical solution, and restricted to relevant values of \(\gamma \lesssim 0.5\)) is always better than \(2\times 10^{-9}\). In other words: we now have, to all intents and purposes, an analytical solution of the buckling problem. As an illustration, Fig. 13 shows a sequence of buckles for increasing strain \(\gamma \).

Fig. 13

Sequence of buckles, with the buckling strain \(\gamma \) (in percent) given below the arrow at the right end of the buckle. The buckle self-touches at \(\gamma \approx 84.87\%\); notice also that a strain of merely \(10\%\) already reaches about half the transverse amplitude of that final touching-state
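The inversion of Eq. (145) can also be carried out numerically, which offers an independent check on the series (147). The sketch below is an illustration only: it uses SciPy's `brentq` root finder (a choice made here, not necessarily the method of Hu et al. (2013)) and truncates the series after the six coefficients printed in Eq. (147), so its accuracy is lower than that of the tenth-order tabulation quoted above.

```python
from scipy.optimize import brentq
from scipy.special import ellipe, ellipk


def strain_of_m(m):
    """Eq. (145): compressional strain as a function of the elliptic parameter."""
    return 2.0 * (1.0 - ellipe(m) / ellipk(m))


def m_of_strain(gamma):
    """Numerically invert Eq. (145) by a bracketing root search on 0 <= m < 1."""
    if gamma == 0.0:
        return 0.0
    return brentq(lambda m: strain_of_m(m) - gamma, 0.0, 1.0 - 1e-12, xtol=1e-15)


# Coefficients a_1 ... a_6 as printed in Eq. (147).
A = [1.0, -1 / 8, -1 / 32, -11 / 1024, -17 / 4096, -55 / 32768]


def m_series(gamma):
    """Sixth-order truncation of the series (147)."""
    return sum(a * gamma ** (i + 1) for i, a in enumerate(A))


for gamma in (0.05, 0.1, 0.2, 0.3, 0.5):
    exact = m_of_strain(gamma)
    print(f"gamma={gamma:4.2f}  m_numeric={exact:.8f}  "
          f"m_series={m_series(gamma):.8f}  diff={m_series(gamma) - exact:+.1e}")
```

The difference column grows with \(\gamma \), as expected for a truncated power series; adding the higher-order coefficients tabulated by Hu et al. (2013) would reduce it further.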

Stress–strain relation. The stress \(f_x\) required to compress the buckle enters in the length scale \(\lambda \), and now that we know \(m(\gamma )\), Eq. (143) can be solved for the stress strain relation:

$$\begin{aligned} f_x(\gamma )&= \kappa \left( \frac{4}{L}\text {K}\big [m(\gamma )\big ]\right) ^2 \end{aligned}$$
(148a)
$$\begin{aligned}&= \kappa \left( \frac{2\pi }{L}\right) ^2\bigg [1+\frac{1}{2}\gamma +\frac{9}{32}\gamma ^2+\frac{21}{128}\gamma ^3+\frac{795}{8192}\gamma ^4+\cdots \bigg ] \ . \end{aligned}$$
(148b)

Notice that the stress is directly proportional to the rigidity (as expected) and inversely proportional to the square of the buckle’s contour length. Also, the stress does not vanish in the limit \(\gamma \rightarrow 0\): a finite stress \(\kappa (2\pi /L)^2\) is required to induce even an infinitesimal strain, which is the hallmark of a buckling transition. After onset of buckling, the stress continues to grow monotonically. The initial post-buckling slope is \(\frac{1}{2}\) (independent, in fact, of the details of the boundary conditions), and the remaining terms only provide a small correction to it: about \(7\%\) at \(\gamma =50\%\).

Of course, for compressible materials the initial rise cannot be discontinuous. Since a lipid membrane has a finite area compressibility \(K_A\), we would hence expect the initial rise to be linear, but with a much bigger slope. The crossover strain \(\gamma ^*\) occurs, roughly, where compression and buckling have equal stresses, leading to the condition

$$\begin{aligned} K_A\gamma ^*= \kappa \left( \frac{2\pi }{L}\right) ^2 \ . \end{aligned}$$
(149)

Using microscopic theories (such as the ones from Sect. 2), we can relate the area compressibility and the bending modulus. In our special case this is difficult, because the bending modulus also involves the stress profile. But mere scaling already suggests a relation \(\kappa \propto K_A d^2\), where d is the membrane thickness. Imagining lipid bilayers as two thin homogeneous slidable plates without internal prestress gives a constant of proportionality of \(\frac{1}{36}\) (for a Poisson ratio of \(\frac{1}{2}\)) (Deserno 2015), leading to

$$\begin{aligned} \gamma ^*= \frac{\kappa }{K_A}\left( \frac{2\pi }{L}\right) ^2 \sim \frac{\pi ^2}{9}\left( \frac{d}{L}\right) ^2 \approx \left( \frac{d}{L}\right) ^2 \ . \end{aligned}$$
(150)

For the systems studied by Hu et al. (2013), this is always smaller than about \(1\%\). Notice, however, that a finite compressibility also changes the buckling problem itself. The corrections are small if the membrane is only weakly compressible (in the sense that the length \(\sqrt{\kappa /K_A}\) is microscopic), but the resulting theory is extremely fascinating, as Oshri and Diamant (2016) show. For instance, while there is a well-known analogy between the one-dimensional Euler elastica studied here and the mathematical pendulum (observe that Eq. (134) is nothing but the pendulum equation), the compressible elastica can be exactly mapped to the relativistic pendulum.

Evidently, the idea is now to simulate buckles at various different strains (bigger at least than the crossover strain \(\gamma ^*\)) and fit the measured stress–strain relation to Eq. (148)—using \(\kappa \) as the sole fitting parameter. As Hu et al. (2013) demonstrate, this works very well for models all the way from strongly coarse grained to virtually fully atomistic.
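Because Eq. (148a) is linear in \(\kappa \), the one-parameter fit described above reduces to a linear least-squares problem with a closed-form solution. The following Python sketch illustrates this with synthetic data; the contour length, the “true” rigidity, and the noise level are made-up numbers, and the helper names are hypothetical. Statistical errors of the measured stresses are not modeled here.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import ellipe, ellipk


def m_of_strain(gamma):
    """Invert Eq. (145), gamma = 2 (1 - E[m]/K[m]), for the elliptic parameter m."""
    return brentq(lambda m: 2.0 * (1.0 - ellipe(m) / ellipk(m)) - gamma,
                  0.0, 1.0 - 1e-12, xtol=1e-15)


def stress_per_kappa(gamma, L):
    """f_x / kappa according to Eq. (148a)."""
    return (4.0 * ellipk(m_of_strain(gamma)) / L) ** 2


def fit_kappa(gammas, stresses, L):
    """Least-squares estimate of kappa for the linear model f_x = kappa * g(gamma)."""
    g = np.array([stress_per_kappa(x, L) for x in gammas])
    f = np.asarray(stresses, dtype=float)
    return float(g @ f / (g @ g))


# Synthetic stand-in for simulation output (all numbers are assumptions).
rng = np.random.default_rng(0)
L_contour, kappa_true = 40.0, 12.5           # arbitrary simulation units
gammas = np.linspace(0.1, 0.5, 9)            # strains above the crossover strain
clean = kappa_true * np.array([stress_per_kappa(x, L_contour) for x in gammas])
noisy = clean * (1.0 + 0.02 * rng.standard_normal(gammas.size))  # 2% noise

kappa_fit = fit_kappa(gammas, noisy, L_contour)
print("fitted kappa:", kappa_fit, " (input value:", kappa_true, ")")
```

Using the exact expression (148a) rather than the truncated series (148b) avoids any truncation error in the fit; the price is one numerical root search per strain value.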

The stress tensor for membrane buckles. We can learn more about the stress distribution in a buckle, and in particular the isotropic tension \(\sigma \) within it, by looking at the membrane stress tensor \({\varvec{f}}^a\) (Capovilla and Guven 2002, 2004; Guven 2004). Guven and Vázquez–Montejo provide a pedagogical introduction in this volume to the necessary mathematics, and it is also covered in a recent review by one of us (Deserno 2015). Briefly, if we draw a curve on a membrane surface with tangent vector \({\varvec{t}}=t^a{\varvec{e}}_a\), tangential co-normal \({\varvec{l}}=l^a{\varvec{e}}_a\), and membrane normal \({\varvec{n}}={\varvec{l}}\times {\varvec{t}}\), the traction \({\varvec{f}}\) acting onto the membrane side into which \({\varvec{l}}\) points is given by

$$\begin{aligned} {\varvec{f}}= l_a{\varvec{f}}^a = \left[ \frac{1}{2}\kappa \Big (K_\perp ^2-K_{||}^2\Big )-\sigma \right] {\varvec{l}}+ \kappa \,KK_{\perp ||}\,{\varvec{t}}- \kappa (\nabla _\perp K)\,{\varvec{n}}\ . \end{aligned}$$
(151)

Here, \(K_\perp =l^al^bK_{ab}\) and \(K_{||}=t^at^bK_{ab}\) are the normal curvatures into \({\varvec{l}}\) and \({\varvec{t}}\) direction, respectively, while \(K_{\perp ||}\) is the off-diagonal element of the curvature tensor in the \(({\varvec{l}},{\varvec{t}})\) basis; \(\nabla _\perp K = l^a\nabla _aK\) is the gradient of K along \({\varvec{l}}\). Let’s check the sign: if \(\kappa =0\) and we merely have surface tension (this would correspond for instance to a soap film), we have \({\varvec{f}}=-\sigma {\varvec{l}}\), showing that a surface tension of magnitude \(\sigma \) pulls (minus sign!) tangentially onto the side into which \({\varvec{l}}\) points.

Fig. 14

Cross cut through part of a buckle, defining the local \(({\varvec{l}},{\varvec{t}},{\varvec{n}})\) coordinate system, and the angle \(\psi \) which the buckle makes with the horizontal \({\varvec{x}}\). Notice that \({\varvec{l}}\cdot {\varvec{x}}=\cos \psi \) and \({\varvec{t}}={\varvec{y}}\)

Let us now specialize this to the case of a straight line which runs in the flat direction of the buckle (the y-direction in Fig. 12). The local geometry is sketched in Fig. 14. Since this line is straight, \(K_{||}=0\), and since it is also a line of curvature, \(K_{\perp ||}=0\). Hence, the traction \({\varvec{f}}\) is given by

$$\begin{aligned} {\varvec{f}}= \left[ \frac{1}{2}\kappa K_\perp ^2-\sigma \right] {\varvec{l}}- \kappa (\nabla _\perp K_\perp )\,{\varvec{n}}= \left[ \frac{1}{2}\kappa \dot{\psi }^2-\sigma \right] {\varvec{l}}+ \kappa \,\ddot{\psi }\,{\varvec{n}}\ , \end{aligned}$$
(152)

where in the second step we used \(\nabla _\perp =\frac{\text {d}}{\text {d}s}\) and \(K_\perp =-\dot{\psi }\).

Now, a crucial thing to realize is that \({\varvec{f}}\) must be constant and horizontal. It is constant because the stress tensor is divergence free wherever no external sources of stress act, \(\nabla _a{\varvec{f}}^a=0\), which in our one-dimensional case reduces to \(\text {d}{\varvec{f}}/\text {d}s=0\), and there are no such sources along the buckle. It is horizontal because the only sources sit at the ends, and they push the buckle horizontally; hence \({\varvec{f}}\propto {\varvec{x}}\). This means that there are two ways to get the magnitude of \({\varvec{f}}\): we could either project it onto \({\varvec{x}}\), or we could square it. This leads to the two equations

$$\begin{aligned} f_x&= {\varvec{f}}\cdot {\varvec{x}}= \left[ \frac{1}{2}\kappa \dot{\psi }^2-\sigma \right] \,\cos \psi - \kappa \,\ddot{\psi }\,\sin \psi \ , \end{aligned}$$
(153a)
$$\begin{aligned} f_x^2&= {\varvec{f}}\cdot {\varvec{f}}= \left[ \frac{1}{2}\kappa \dot{\psi }^2-\sigma \right] ^2 \!+ \kappa ^2\,\ddot{\psi }^2 \ . \end{aligned}$$
(153b)

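Between these two equations, we can eliminate the higher derivative \(\ddot{\psi }\): solving Eq. (153a) for \(\kappa \,\ddot{\psi }\,\sin \psi \), squaring, and inserting \(\kappa ^2\ddot{\psi }^2\) from Eq. (153b) leaves a perfect square, \(\big (\frac{1}{2}\kappa \dot{\psi }^2-\sigma -f_x\cos \psi \big )^2=0\). We thereby arrive at a differential equation that is one order lower: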

$$\begin{aligned} \frac{1}{2}\kappa \dot{\psi }^2 - \sigma = f_x\,\cos \psi \ . \end{aligned}$$
(154)

In other words, stress conservation has given us a first integral of the shape equation—and we did not even have to write down the shape equation. Observe that Eq. (154) is the analog of Eq. (136), but in this case we also get a mechanical interpretation of the constant of integration, not just a geometrical one. Picking the position such that we are at an inflection point—just as we had done in Eq. (136)—we find

$$\begin{aligned} \sigma = -f_x\,\cos \psi _\text {i}\ . \end{aligned}$$
(155)

Hence, the isotropic tension (which couples to the area per lipid) is not equal to the (negative of the) buckling stress, but equal to that stress times the cosine of the inflection angle. In particular, it vanishes if \(\psi _\text {i}=\frac{\pi }{2}\), which happens at \(m=\frac{1}{2}\) or \(\gamma \approx 0.543\).
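The statement about the vanishing tension is easy to verify numerically. Since \(\cos \psi _\text {i}=1-2\sin ^2(\psi _\text {i}/2)=1-2m\), Eq. (155) can be written as \(\sigma /f_x=2m-1\). The short Python sketch below (function names are illustrative) evaluates this ratio for a few strains and computes the strain at which it changes sign.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import ellipe, ellipk


def strain_of_m(m):
    """Eq. (145): strain gamma as a function of the elliptic parameter m."""
    return 2.0 * (1.0 - ellipe(m) / ellipk(m))


def m_of_strain(gamma):
    """Numerical inverse of Eq. (145), valid for 0 < gamma < 1.8."""
    return brentq(lambda m: strain_of_m(m) - gamma, 0.0, 1.0 - 1e-12, xtol=1e-15)


def tension_over_stress(gamma):
    """sigma / f_x = -cos(psi_i) = 2 m - 1, from Eqs. (138) and (155)."""
    return 2.0 * m_of_strain(gamma) - 1.0


for gamma in (0.1, 0.3, 0.5, 0.7):
    m = m_of_strain(gamma)
    psi_i = np.degrees(2.0 * np.arcsin(np.sqrt(m)))   # inflection angle, Eq. (138)
    print(f"gamma={gamma:.1f}  psi_i={psi_i:6.1f} deg  "
          f"sigma/f_x={tension_over_stress(gamma):+.3f}")

gamma_zero_tension = strain_of_m(0.5)   # psi_i = pi/2 corresponds to m = 1/2
print("tension vanishes at gamma =", round(gamma_zero_tension, 3))
```

For strains below this value the ratio is negative (the membrane is laterally compressed), and above it the ratio becomes positive.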

Advantages and drawbacks. Now that we have seen how buckling a membrane gives rise to an observable, \(f_x\), that will encode the bending modulus, \(\kappa \), let us briefly stop and ponder the benefits and limitations that come with this particular method of determining a membrane’s rigidity, especially in comparison with more traditional fluctuation approaches.

Advantages:

  • The signal we measure, \(f_x\), is directly proportional to the observable we care about, \(\kappa \). In the fluctuation case it was inversely proportional: \(|\tilde{h}_{\varvec{q}}|^2\propto \kappa ^{-1}\). Hence, the buckling method should become better if membranes get stiffer, and worse if they get softer. Since \(\kappa \) is on the order of a few tens of \(k_{\text {B}}T\), we already are in the limit where fluctuations are visible but weak. Moreover, fitting the \(q^{-4}\) dependence predicted in Eq. (130) requires a range of q-values, and if we want just one order of magnitude in q, we encounter a drop of four orders of magnitude in \(|\tilde{h}_{\varvec{q}}|^2\). Indeed, we are looking at very weak signals then.

  • In fluctuation methods, the fluctuations are the signal from which the observable \(\kappa \) is deduced, and hence we need to sample them adequately. In contrast, in the buckling protocol fluctuations are noise, an unwanted perturbation. Not sampling them properly affects the error of our result much less than in a fluctuation method. To see that fluctuations are indeed subdominant, consider the persistence length \(\ell _\text {p}\) of the equivalent one-dimensional “polymer,” which is given by \(\ell _\text {p}=\kappa L_y/k_{\text {B}}T\). This is typically several tens of times \(L_y\). For common situations this makes the persistence length substantially larger than the buckle’s length, and so its deformations are dominated by the ground state energy (see Footnote 8).

  • The method makes no strong assumptions about the microphysics that gives rise to a bending rigidity in the first place. It measures the emergent macroscopic modulus, not a microscopic object that is predicted to coincide with it within the framework of a particular scale-bridging theory. Hence, the bending modulus derived from buckling can serve as a reference which microscopic predictions must meet. It has this property in common with the classical fluctuation method based on Eq. (130), but not with every fluctuation method. For instance Watson et al. (2012) propose a method that extracts the bending rigidity from the orientation fluctuations of lipids; it works with significantly smaller membrane sizes than what Eq. (130) tends to need (and is hence much more efficient), but it relies on an underlying microscopic theory for how curvature and tilt couple (which, incidentally, is of similar nature as the one discussed in Sect. 2).

  • Fluctuation methods involve relatively weak curvatures for typical values of the bending rigidity, as Eq. (131) shows. In order to test whether quadratic curvature elasticity holds for curvatures beyond the weak fluctuation-induced ones, we have to impose them actively. For instance, by simulating tethers, Harmandaris and Deserno (2006) showed that within the statistics available at that time, Cooke-model membranes (panel (b) in Fig. 1) can be bent into curvature radii approaching the thickness of the membrane without significant deviations from quadratic curvature elasticity. The buckling method opens this possibility to membrane models for which tether pulling does not work (because, as discussed above, it is hard to equilibrate the chemical potential of solvent and lipids).

Drawbacks:

  • Studying buckles is technically more involved than studying a flat membrane. First, they must be created (see Footnote 9); and second, they require bigger simulation boxes in the z-direction, hence necessitating more solvent.

  • Buckled membranes are not stress free. This does not merely refer to the externally applied buckling stress \(f_x\), but also to the resulting tension \(\sigma =-f_x\,\cos \psi _\text {i}\) from Eq. (155). Since \(\sigma \) couples to the area per lipid, buckled membranes usually have their lipids under compression, and so they are not, strictly speaking, in the thermodynamically relaxed state that is probed with the fluctuation formula from Eq. (130). This matters in particular if the membrane is close to a phase transition for which the area per lipid could change. For instance, if a fluid membrane is close to its main phase transition temperature (below which it goes into a gel phase with a smaller area per lipid), the additional imposed compressive stress can drive (parts of) the membrane, via Le Châtelier’s principle, into a gel phase, thus invalidating the buckling protocol. An exception is the strain leading to \(\psi _\text {i}=\frac{\pi }{2}\), at which point \(\sigma =0\).

  • The buckling protocol cannot be applied to mixtures without some substantial extensions. The reason is that the buckle’s local geometry changes with position, and different lipid species could prefer different regions, for instance regions where the local monolayer curvature better matches their own spontaneous curvature. Hence, the nontrivial geometry constitutes a driving force for a nonuniform lipid distribution (and can even trigger demixing, in the most extreme case). One can of course account for these effects, and most likely even learn more about the mixture in that way, but this requires additional modeling.

Thermodynamics of the membrane bending modulus. Before we move on to some striking deviations from Euler buckling, let us conclude this section with a little detour through the thermodynamics of membrane bending. The buckling force \(f_x\) arises because the curved membrane has a higher energy than the flat one. Or to be more precise—and now we have to be—because it has a higher free energy: we compress the lipid bilayer at constant temperature. It is crucial to realize that even if we ignore large wavelength thermal undulations, we by no means study a system that microscopically sits in an energy ground state. The lipid constituents have a considerable number of degrees of freedom (translation, rotation, bond length and angle vibrations, dihedral rotations) which explore their permissible phase space and whose non-sharp distribution functions “store” a substantial amount of entropy. Of course, none of this is explicitly accounted for in the Helfrich Hamiltonian—so where did it go? The answer is that it went into the parameters—for instance the moduli. The microscopic wiggling of the molecular constituents is captured by effective parameters on the macroscale. If so, the Helfrich Hamiltonian really describes a free energy, and since the curvatures K and \(K_\text {G}\) merely capture the geometry, the subdivision into energetic and entropic contributions to the free energy happens at the level of the moduli. One might hence ask: is there a way to disentangle them?

If we integrate the stress strain relation \(f_x(\gamma )\) over \(\gamma \), we get back the free energy \(\mathcal {E}(\gamma )\), and it is easy to see that per unit area it is given by

$$\begin{aligned} \frac{\mathcal {E}(\gamma )}{LL_y} = \int _0^\gamma \text {d}\gamma '\;f_x(\gamma ') \mathop {=}\limits ^{\text {(148)}} \kappa \left( \frac{2\pi }{L}\right) ^2\left[ \gamma +\frac{1}{4}\gamma ^2+\frac{3}{32}\gamma ^3+\cdots \right] \ . \end{aligned}$$
(156)

Now, in a simulation we can also measure the plain energy—simply by evaluating the total microscopic Hamiltonian of the system. Even at zero strain it will have some nonzero value, but if we buckle the membrane, this energy changes. Let us define \(E(\gamma )=E_\text {sim}(\gamma )-E_\text {sim}(0)\), the excess energy which the buckled membrane has relative to the stress-free flat state. How does it compare to \(\mathcal {E}(\gamma )\)?

Fig. 15

Reprinted from Hu et al. (2013), with the permission of AIP Publishing

Free energy \(\mathcal {E}(\gamma )\) (solid) and energy \(E(\gamma )\) (dashed) in units of \(\epsilon =1.1\,k_{\text {B}}T\) as a function of strain \(\gamma \). The inset shows the ratio \(\mathcal {R}=E/\mathcal {E}\), which is largely independent of \(\gamma \) and hence a property of the modulus.

Figure 15 shows a plot of \(E(\gamma )\) and \(\mathcal {E}(\gamma )\) versus the strain \(\gamma \), for the Cooke model at standard conditions (see Hu et al. (2013) for details). The energy increases much more rapidly than the free energy, indicating that the entropic contribution will bring down the true cost of bending—or, in other words, entropy favors bending. The inset in Fig. 15 shows the ratio \(\mathcal {R}=E/\mathcal {E}\) of these two quantities. Notice that \(\mathcal {R}\) is remarkably constant, indicating that geometry “cancels” and all we see is the ratio between energy and free energy as captured in the bending modulus. It hence makes sense to talk of the energetic and entropic contribution of the bending modulus, and thus to “take it apart” as we do with any ordinary free energy:

$$\begin{aligned} \kappa = \kappa _E - T\kappa _S \ . \end{aligned}$$
(157)

Moreover, using well-known thermodynamic identities, we can write

$$\begin{aligned} \kappa _E = \kappa + T\kappa _S = \kappa - T\frac{\partial \kappa }{\partial T} = \kappa \left( 1 - \frac{T}{\kappa }\frac{\partial \kappa }{\partial T}\right) = \kappa \left( 1 - \frac{\partial \log \kappa }{\partial \log T}\right) \ , \end{aligned}$$
(158)

and hence

$$\begin{aligned} \mathcal {R} = \frac{\kappa _E}{\kappa } = 1 - \frac{\partial \log \kappa }{\partial \log T} \ . \end{aligned}$$
(159)

This is a differential equation for the temperature dependence of the bending modulus which we can integrate—provided we know \(\mathcal {R}(T)\). Assuming we can expand it as a series in the smallness parameter \(\log (T/T_0)\),

$$\begin{aligned} \mathcal {R}(T)&= \sum _{n=0}^\infty \frac{\mathcal {R}_n}{n!}\log ^n\frac{T}{T_0} \end{aligned}$$
(160a)
$$\begin{aligned}&= \mathcal {R}_0 + \mathcal {R}_1\frac{T-T_0}{T_0} + \frac{\mathcal {R}_2-\mathcal {R}_1}{2}\left( \frac{T-T_0}{T_0}\right) ^2 + \cdots \ , \end{aligned}$$
(160b)

the integration is trivially done, leading to

$$\begin{aligned} \log \frac{\kappa (T)}{\kappa _0}&= (1-\mathcal {R}_0)\log \frac{T}{T_0} - \sum _{n=2}^\infty \frac{\mathcal {R}_{n-1}}{n!}\log ^n\frac{T}{T_0} \ . \end{aligned}$$
(161)

This expresses the functional form of \(\kappa (T)\) in a log-log fashion. Notice that for T close to \(T_0\) this boils down to a simple power law, with corrections only at the quadratic level:

$$\begin{aligned} \kappa (T) \approx \kappa _0\left( \frac{T_0}{T}\right) ^{\mathcal {R}_0-1} \left[ 1-\frac{\mathcal {R}_1}{2}\left( \frac{T-T_0}{T_0}\right) ^2+\cdots \right] \ . \end{aligned}$$
(162)

By explicitly calculating \(\kappa (T)\) over the range \(0.95\le \frac{T}{T_0}\le 1.11\), Hu et al. (2013) have shown (using the standard Cooke model) that a simple power law relation indeed describes the data very well. This is quite advantageous, because it means that by also measuring \(\mathcal {R}\) from the buckling simulations (essentially at no extra cost), one can predict the bending modulus \(\kappa \) in the vicinity of the simulation temperature, not just at it.
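Equation (161) is straightforward to evaluate once the expansion coefficients \(\mathcal {R}_n\) are known. The Python sketch below truncates the sum after the supplied coefficients; the numerical values of \(\mathcal {R}_0\), \(\mathcal {R}_1\), \(\kappa _0\), and \(T_0\) are placeholders chosen for illustration and are not results from Hu et al. (2013).

```python
import math


def kappa_of_T(T, T0, kappa0, R):
    """Evaluate Eq. (161) with expansion coefficients R = [R_0, R_1, ...]."""
    u = math.log(T / T0)
    log_ratio = (1.0 - R[0]) * u
    for n in range(2, len(R) + 1):
        log_ratio -= R[n - 1] * u ** n / math.factorial(n)
    return kappa0 * math.exp(log_ratio)


def R_of_T(T, T0, R):
    """Eq. (160a), truncated after the supplied coefficients."""
    u = math.log(T / T0)
    return sum(R_n * u ** n / math.factorial(n) for n, R_n in enumerate(R))


# Placeholder inputs (assumptions): reduced units with T0 = 1.
T0, kappa0 = 1.0, 10.0
R_power_law = [3.0]          # only R_0: pure power law kappa ~ T^(1 - R_0)
R_corrected = [3.0, 0.5]     # adds the quadratic correction of Eq. (162)

for T in (0.95, 1.00, 1.05, 1.11):
    print(f"T/T0={T:.2f}  power law: {kappa_of_T(T, T0, kappa0, R_power_law):7.3f}"
          f"  with R_1: {kappa_of_T(T, T0, kappa0, R_corrected):7.3f}")
```

With only \(\mathcal {R}_0\) supplied, the function reduces to the pure power law \(\kappa _0(T_0/T)^{\mathcal {R}_0-1}\); supplying \(\mathcal {R}_1\) adds the quadratic correction in \(\log (T/T_0)\).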

Notice that if \(\mathcal {R}_0>1\), heating softens the membrane. We would probably have expected this to be true no matter what, but we now see that this occurs if and only if buckling increases the energy more rapidly than it increases the free energy. Interestingly, this need not always be true: \(\mathcal {R}(T)<1\) is thermodynamically possible and does in fact occur. Its hallmark is “anomalous swelling,” the phenomenon that the spacing in a multilamellar stack of membranes unexpectedly increases upon cooling, which happens for some lipids a few degrees above their main phase transition. Chu et al. (2005) have argued that this swelling is due to an increased Helfrich fluctuation repulsion between the lamellae, which indeed points toward a softening of the modulus.

3.3 Buckling for Gel-Phase Membranes

When fluid membranes are cooled, they ultimately reach a temperature at which they change into a new phase that is both more ordered and more rigid, the so-called “gel phase”; this is called the “main transition” of a membrane (see Nagle (1980) for a review of the theory). Many subtleties exist about this transition, and some membranes even change first into an intriguing corrugated phase (the so-called “ripple phase”), but none of this will concern us here. For now we are very modest and merely want to know how much stiffer a gel phase is, and how we can measure that.

Experiments indicate that gel-phase membranes are at least an order of magnitude stiffer than fluid-phase membranes (Lee et al. 2001; Dimova et al. 2000; Steltenkamp et al. 2006). Hence, the observable signal from fluctuation methods drops by at least an order of magnitude, while the signal from active methods increases by the same factor. Relatively speaking, active methods should therefore be about two orders of magnitude more sensitive for measuring the membrane bending modulus. Since, furthermore, the buckling method works even if the membrane becomes less fluid (the deformation is isometric and can be realized even for solid sheets, such as paper), the technique discussed in this chapter seems ideally suited to study the rigidity of gel phases. Indeed, Diggins et al. (2015) have done just that. What they found, though, was highly surprising: the theory developed so far, in particular the stress strain relation from Eq. (148), does not describe their simulation data at all, not even qualitatively. Figure 16 shows the stress–strain relation extracted from simulations of a gel-phase membrane. In contrast to the prediction from Eq. (148), which is clearly a monotonically increasing function, the opposite is true for the measured data: higher strains lead to smaller stresses, and thus the compressibility is negative.

Curvature softening. Based on a careful analysis of the resulting buckle shapes, which on average appear more “pointy” than classical Euler buckles, Diggins et al. (2015) conjecture that the reason for the discrepancy is a failure of quadratic curvature elasticity: assume that membranes soften upon bending, in the sense that their elastic energy does not keep growing quadratically with curvature but instead lags behind as one continues to increase the curvature. If so, it would be energetically advantageous to localize bending in small regions, rather than distributing it more evenly. This would explain the more “pointy” buckle shapes, but what would it predict for the stress–strain relation?

Fig. 16

Reprinted with permission from Diggins et al. (2015); copyright 2015 American Chemical Society

Stress–strain relation for a Cooke buckle at \(w_\text {c}/\sigma =1.6\) and \(k_{\text {B}}T/\epsilon = 0.85\). The open circles are the directly measured stress, the filled circles use additional information from the shape. The blue dashed line is a poor fit to Eq. (148), the solid line is the prediction from Eq. (164a) (surrounded by the 68 and 95% confidence bands). The bottom panel shows the inferred value of the bending rigidity.

In order to be quantitative about the stresses, we first need a quantitative theory of curvature softening. The probably easiest phenomenological approach would be to amend the quadratic curvature energy density e(K) by a quartic term that reduces the energy—in the spirit of adding a next order correction to Helfrich theory:

$$\begin{aligned} e(K) = \frac{1}{2}\kappa K^2 - \frac{1}{4}\kappa _4 K^4 +\mathcal {O}(K^6) \ . \end{aligned}$$
(163)

Unfortunately, this is an awkward theory to work with: for \(K>K^\star =\sqrt{\kappa /\kappa _4}\) this energy density decreases with curvature—all the way to minus infinity; the energy is not convex, not even bounded below. This will invariably create numerous artifacts and is hence ill-suited as an explanatory model for our findings. To fix this, Diggins et al. (2015) propose an alternative energy density which is both bounded below and in fact convex, but which up to quartic order coincides with the first guess from Eq. (163):

$$\begin{aligned} e(K)&= \frac{\kappa }{\ell ^2}\Big [\sqrt{1+K^2\ell ^2}-1\Big ] \end{aligned}$$
(164a)
$$\begin{aligned}&= \left\{ \begin{array}{lcc} \frac{1}{2}\kappa K^2 - \frac{1}{8}(\kappa \ell ^2)K^4 + \mathcal {O}(K^6) &{} , &{} K\ll \ell ^{-1} \\ \frac{\kappa }{\ell ^2}\Big (\big |K\ell \big |-1\Big ) + \mathcal {O}(K^{-1}) &{} , &{} K\gg \ell ^{-1} \end{array} \right. \, \end{aligned}$$
(164b)

where \(\ell \) is a new characteristic length scale, telling us where softening starts to set in. Notice that for sufficiently small K this looks like the curvature-softened first guess from Eq. (163), with \(\kappa _4=\frac{1}{2}\kappa \ell ^2\). But beyond \(K\sim \ell ^{-1}\) the initial quadratic increase turns into a mere linear one. Stronger bending still always costs more energy, but at large curvature the differential price is much less than at small curvature. What does this imply for the stress–strain relation?
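To make the comparison between Eqs. (163) and (164a) concrete, the following minimal Python sketch evaluates both energy densities. The function names and the sample values \(\kappa =\ell =1\) are illustrative assumptions, not taken from Diggins et al. (2015); the derivative \(\text {d}e/\text {d}K\) is included because it shows how the marginal cost of bending saturates at \(\kappa /\ell \).

```python
import math

def e_quartic(K, kappa, kappa4):
    """First guess, Eq. (163), truncated after the quartic term."""
    return 0.5 * kappa * K**2 - 0.25 * kappa4 * K**4

def e_softened(K, kappa, ell):
    """Bounded, convex energy density of Eq. (164a)."""
    return kappa / ell**2 * (math.sqrt(1.0 + (K * ell)**2) - 1.0)

def de_softened(K, kappa, ell):
    """Derivative de/dK of Eq. (164a); it approaches kappa/ell for large K."""
    return kappa * K / math.sqrt(1.0 + (K * ell)**2)

# Illustrative parameter values (assumed, in arbitrary units).
kappa, ell = 1.0, 1.0
kappa4 = 0.5 * kappa * ell**2      # matching condition quoted in the text

for K in (0.1, 1.0, 3.0, 10.0):
    print(K, e_quartic(K, kappa, kappa4), e_softened(K, kappa, ell),
          de_softened(K, kappa, ell))
```

For \(K\ell \ll 1\) the two columns agree closely, while for larger curvature the quartic form turns negative and the square-root form keeps growing, but only linearly.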

A new stress–strain relation. As it turns out, the Euler–Lagrange equation associated with this new energy density can still be turned into a first integral:

$$\begin{aligned} \frac{s}{\ell } = \int _{\psi _{\text {i}}}^{\psi (s)}\!\!\!\text {d}\psi \;\frac{1}{\sqrt{\Big \{1-\tilde{f}_x\big [\cos \psi -\cos \psi _{\text {i}}\big ]\Big \}^{-2}-1}} \ , \end{aligned}$$
(165)

where \(\tilde{f}_x=f_x\ell ^2/\kappa =\ell ^2/\lambda ^2\) is the scaled buckling force.
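A brief sketch of where this first integral comes from may be helpful. It assumes that the buckle shape extremizes a functional of the form \(\int \text {d}s\,\big [e(\dot{\psi })+f_x\cos \psi \big ]\), with the curvature \(K=\dot{\psi }=\text {d}\psi /\text {d}s\) and with the sign convention for \(f_x\) chosen such that Eq. (165) results; the dot notation is introduced here for convenience only. It further assumes that \(\psi _{\text {i}}\) is the angle at the inflection point, where \(\dot{\psi }=0\). Since the integrand does not depend explicitly on s, the combination \(\dot{\psi }\,\partial e/\partial \dot{\psi }-e-f_x\cos \psi \) is constant along the profile. Inserting Eq. (164a) gives

$$\begin{aligned} \frac{\kappa }{\ell ^2}\bigg [1-\frac{1}{\sqrt{1+\ell ^2\dot{\psi }^2}}\bigg ] - f_x\cos \psi&= -f_x\cos \psi _{\text {i}} \\ \Rightarrow \quad \ell ^2\dot{\psi }^2&= \Big \{1-\tilde{f}_x\big [\cos \psi -\cos \psi _{\text {i}}\big ]\Big \}^{-2}-1 \ . \end{aligned}$$

Separating variables, \(\text {d}s/\ell =\text {d}\psi /(\ell \dot{\psi })\), and integrating from the inflection point then yields the form quoted in Eq. (165).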

Using the same series-inversion techniques as in the ordinary case, Diggins et al. (2015) arrive at a revised stress–strain relation:

$$\begin{aligned} f_x(\gamma ,\delta ) = \kappa \left( \frac{2\pi }{L}\right) ^2\left[ 1+\frac{1}{2}\Big (1-3\delta ^2\Big )\gamma +\frac{9}{32}\Big (1-\frac{14}{3}\delta ^2+\frac{31}{3}\delta ^4\Big )\gamma ^2 + \cdots \right] \ , \end{aligned}$$
(166)

which features the new parameter \(\delta =\frac{2\pi \ell }{L}\) as a convenient dimensionless measure for exactly how strongly the situation deviates from the plain Euler case. Notice that in the limit \(\delta \rightarrow 0\) Eq. (166) reduces to the first terms of the Eulerian stress–strain relation (148), and that for any nonzero \(\delta \) the initial post-buckling slope of \(\frac{1}{2}\) is reduced. In fact, at \(\delta =\delta _{\text {c}}=1/\sqrt{3}\approx 0.577\) that slope vanishes, and for \(\delta >\delta _{\text {c}}\) the stress–strain relation starts with a negative slope. For the Cooke-model data in Fig. 16 Diggins et al. (2015) indeed find that \(\delta \) is much bigger than that critical value: \(\delta \approx 2.9\). Unfortunately, at these large values the series expansion from Eq. (166) no longer converges for all strains of interest, so a numerical solution needs to be sought. But this solution very nicely fits the measured data, hence supporting the contention that gel-phase membranes appear to soften upon bending, in a way that is captured reasonably well by the empirical energy density (164a).
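The following Python sketch illustrates both routes: the truncated series of Eq. (166) and a direct numerical evaluation of the first integral, Eq. (165). It is not the procedure of Diggins et al. (2015), whose numerical method is not described here. The function names, the midpoint quadrature, the bisection, the substitution \(\psi =\psi _{\text {i}}\sin \theta \), the picture of a buckle as four identical quarter segments, and the strain definition \(\gamma =(L-L_x)/L\) are all assumptions made for this illustration.

```python
import math

def series_stress(gamma, delta):
    """Eq. (166) up to order gamma^2, in units of kappa*(2*pi/L)**2."""
    d2 = delta * delta
    return (1.0 + 0.5 * (1.0 - 3.0 * d2) * gamma
            + (9.0 / 32.0) * (1.0 - 14.0 / 3.0 * d2 + 31.0 / 3.0 * d2 * d2) * gamma**2)

def quarter_integrals(f_tilde, psi_i, n=2000):
    """Arc length and projected length of a quarter buckle, in units of ell.

    The substitution psi = psi_i*sin(theta) removes the integrable
    endpoint singularity of the integrand in Eq. (165)."""
    h = 0.5 * math.pi / n
    arc = proj = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        psi = psi_i * math.sin(theta)
        # cos(psi) - cos(psi_i), written as a product to limit round-off
        dcos = 2.0 * math.sin(0.5 * (psi_i + psi)) * math.sin(0.5 * (psi_i - psi))
        d = 1.0 - f_tilde * dcos          # the curly bracket in Eq. (165)
        g = d / math.sqrt((1.0 - d) * (1.0 + d)) * psi_i * math.cos(theta)
        arc += g * h
        proj += g * math.cos(psi) * h
    return arc, proj

def numerical_stress(psi_i, delta, iters=60):
    """Return (gamma, f_x in units of kappa*(2*pi/L)**2) for inflection angle psi_i."""
    target = 0.5 * math.pi / delta        # quarter arc length L/(4*ell)
    lo, hi = 1e-9, (1.0 - 1e-9) / (1.0 - math.cos(psi_i))
    if quarter_integrals(hi, psi_i)[0] > target:
        raise ValueError("no solution for this combination of psi_i and delta")
    for _ in range(iters):                # arc length decreases with f_tilde
        mid = 0.5 * (lo + hi)
        if quarter_integrals(mid, psi_i)[0] > target:
            lo = mid
        else:
            hi = mid
    f_tilde = 0.5 * (lo + hi)
    arc, proj = quarter_integrals(f_tilde, psi_i)
    return 1.0 - proj / arc, f_tilde / delta**2

gamma, f_num = numerical_stress(psi_i=0.3, delta=0.3)
print(gamma, f_num, series_stress(gamma, 0.3))
```

At small strain and small \(\delta \) the two numbers printed last should agree closely; at large \(\delta \), where the series is no longer reliable, only the numerical route remains, and `numerical_stress` raises an error if no buckle of the requested amplitude exists.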

Fig. 17

Stress–strain relation for a buckle consisting of Cooke lipids (the data are from Hu et al. (2013)). The dashed curve is a fit to the classical Euler stress–strain relation from Eq. (148), the solid curve is a fit to the revised stress–strain relation from Eq. (166) that allows for curvature softening, using the bending rigidity \(\kappa \) and the new variable \(\delta \) as fitting parameters. The scaling of the vertical axis is such that the intercept will give the bending rigidity. Including curvature softening leads to a prediction for the value of \(\kappa \) that is about 10% bigger than what the classical fit yields.

Given that gel-phase membranes soften, it is fair to ask whether this is also true for fluid-phase membranes. If it is, the effect cannot be very large, for otherwise it would have been observed in many earlier studies. But if the effect is small, finding it requires both good statistics and a quantitative model capable of identifying the softening. Hence, one way to answer the question of fluid-phase curvature softening is to revisit the original buckling data from Hu et al. (2013) and fit them with the revised curvature-softened theory (164a). The result is shown in Fig. 17. While the classical Euler fit is not truly poor, it does seem to have a slight overall bias, in the sense that the fit is too large at high strains and too small at low strains. Given that softening will reduce the slope of the stress–strain relation, we can expect that this deficiency is resolved by the new theory. Indeed, the solid curve in Fig. 17 shows the fit to Eq. (166), which is overall a better description of the data. Notice that this implies the bending rigidity (which can be read off at the intercept) to be larger than what the classical Euler fit would predict. Indeed, the latter would give \(\kappa /k_\text {B}T=12.7\pm 0.3\), while the curvature-softened theory yields the larger rigidity \(\kappa /k_\text {B}T=13.8\pm 0.4\), with a value for the softening parameter of \(\delta =0.44\pm 0.08\), or a characteristic length of \(\ell /\sigma =4.7\pm 0.8\) (which is about the bilayer thickness).
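The quoted fit values can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers stated above together with the definition \(\delta =2\pi \ell /L\) and the linear coefficient of Eq. (166); the buckle length it prints is inferred from \(\delta \) and \(\ell \), not quoted from Hu et al. (2013), and the function names are illustrative.

```python
import math

def buckle_length(delta, ell):
    """Invert delta = 2*pi*ell/L for L (in the same units as ell)."""
    return 2.0 * math.pi * ell / delta

def initial_slope(delta):
    """Coefficient of the term linear in gamma in Eq. (166)."""
    return 0.5 * (1.0 - 3.0 * delta**2)

delta_fit, ell_fit = 0.44, 4.7            # fit values quoted in the text
kappa_euler, kappa_soft = 12.7, 13.8      # in units of k_B T

print("implied L/sigma:", buckle_length(delta_fit, ell_fit))
print("relative increase of kappa:", kappa_soft / kappa_euler - 1.0)
print("slope relative to Euler value 1/2:", initial_slope(delta_fit) / 0.5)
```

Since the fitted \(\delta \) lies below \(\delta _{\text {c}}=1/\sqrt{3}\), the initial slope stays positive but drops to less than half of its Euler value, consistent with the flatter solid curve described above.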