1 Introduction

This paper is intended to both clarify and raise issues related to approaches to hyperelasticity that are increasingly being adopted within the soft condensed matter physics community, primarily for their application to the incompatible elasticity of plate and shell structures. While aimed at this particular community, it might also stimulate some discussion with elasticians who employ approaches aligned with modern continuum mechanics, a field whose notation and attitude have diverged from those of physics since the mid-twentieth century. The scope is limited to a comparison of a few choices of small-strain elastic energies, and the application of one of these to derive equations of equilibrium and free boundary conditions for plates. The latter derivation is presented in a compact vector form that I believe is not to be found elsewhere, and which I hope some will find useful. The major points made, and their locations in the paper, are briefly listed at the end of this lengthy introduction.

I will approach these topics in a purposefully naive way, without invoking the deformation gradient or established stress tensors of continuum mechanics, and only occasionally referring to a few such strain tensors and their invariants (footnote 1) to indicate a correspondence with quantities at hand. An exception to this can be found in Appendix A, which translates between the physics and mechanics languages. The intent is to connect with the recent soft matter literature and its notation, so that the ideas expressed here can be adopted there. This is not to say that I favor such approaches over those developed in continuum mechanics over the last three quarters of a century; I think there are good arguments for both the physicists’ and the mechanicians’ perspectives (footnote 2). But I also think it should be possible to obtain results consistent with modern continuum mechanics by employing the physicists’ language and approach, already well entrenched in the context of other field theories such as electromagnetism and gravity.

Proceeding in this manner, certain aspects of the derivation and presentation may seem either obvious or ridiculous to some mechanicians. Many of the problems faced by the physicist working from his own conception of first principles, some such problems surely being artifacts of his approach, were already thought through and resolved, or avoided, in early foundational work in mechanics in the 1950s [2,3,4,5]. The physicist would no doubt find the style and notation of some of these works far more accessible than the more recent continuum mechanics literature, but all too often takes as authoritative reference the elasticity volume of the Landau–Lifshitz course of theoretical physics [6]. Though first published in 1959, it unfortunately does not incorporate the relevant material from the preceding decade and treats most of elasticity as a linear field theory.

In keeping with a primary audience in physics, I will assume basic knowledge of differential geometric symbols and index gymnastics in the early-twentieth-century style of “absolute differential calculus,” but no background in continuum mechanics. I will follow styles of notation similar to those found in the older elasticity literature [4, 7], some later mechanics literature derived therefrom [8,9,10,11], and recent physics literature [12,13,14], but will introduce a bit of new notation in an attempt to avoid ambiguity in certain common expressions. Even within this stylistic slice of the literature, it is impossible to be consistent with everybody, as things like capitalization or barring of quantities are used in opposing ways by different authors.

Continuum mechanical approaches often consider very general constitutive relations or stored energy functions. While this generality is powerful, and avoids certain problems encountered in physics approaches, it can also hide some important complexity, particularly if one wants to connect with a reduced-dimensional theory of a plate or rod. The alternate approach I borrow from physics is to construct a Lagrangian as an expansion in elastic strains, expressed as a functional of a position (embedding) vector and its derivatives with respect to material markers. To these must be added other fields describing a reference configuration. Compatibility equations are avoided because of the explicit use of position. Stresses are derived quantities arising naturally inside a divergence (footnote 3). The first derivative of position is a tangent vector to the body, a viewpoint that seems most natural when considering low-dimensional bodies such as surfaces, but need not be so restricted. We can think of the components of this vector as carrying one material and one spatial index. A related fundamental quantity in continuum mechanics is the deformation gradient, a two-point tensor with one foot in a reference configuration and one in the present configuration. Commonly used index-free notations, matrix representations, or treatments based on Cartesians rather than convected coordinates often obscure the special nature of this object that does not live in a single space.

The supposed justification for a small-strain expansion, other than physicists’ seeming compulsion to write everything that way, is that the desired end is a reduced energy for a plate or shell structure in which both mid-surface stretching and the product of curvature and thickness are small. The energy of a two-dimensional plate derived from three-dimensional bulk elasticity will contain bending terms associated with both extrinsic and intrinsic curvatures, as well as stretching terms describing in-plane deformations of the mid-surface. However, it will become clear in subsequent discussion that the definition of a “quadratic” energy is ambiguous and potentially misleading, as one is typically choosing among common strain measures that do not have equivalent dependencies on variables like displacement derivatives or stretch. Higher-order differences in bulk elasticity are not only important for understanding theories that extend to moderate strains, but actually have significant lower-order effects on derived bending energies. Surprisingly, the most common choices of bulk strain energies do not correspond to the simplest phenomenological direct theories of lower-dimensional objects. These points are illustrated in a simple example, that of extensible elastica, sketched in Sect. 3.2 and fleshed out in another publication [15].

One way to view elasticity is as a theory with multiple metrics. But a metric has two distinct uses. One is to construct geometric quantities like the integration measure and covariant derivative, and the other is to construct invariants, for example by applying the inverse metric to derivatives of position. As will be discussed, these are effectively choices about coordinate bases and constitutive laws. Theories will be displayed in which the metrics of the reference or present configuration are used for both purposes (Sects. 2.2–2.3), and in a hybrid approach in which the reference metric is used only for invariants and the present metric for everything else (Sect. 2.4). That we are free to make such choices was recognized in early work on nonlinear elasticity [3, 4, 16, 17] and has been recognized implicitly or explicitly in the literature on general relativistic elasticity [18] and fluid membranes [19, 20]. One finds divergence expressions employing the covariant derivatives of reference and present configurations in studies of classical [4, 7, 21,22,23] and general relativistic [24] elasticity. The present configuration is more common, likely because of its clearer connection with boundary loadings. However, the geometry of the reference configuration has been favored in recent physics literature such as [12] and other work it has influenced. A hybrid approach is most common in continuum mechanics. John’s paper [17] and Green and Zerna’s book [4] suggest that either metric could be used to construct invariants, but both only use the reference metric in practice. In an earlier paper, Green and Zerna [3] use both metrics to form mixed strain components, but use the reference metric to explicitly construct a Taylor expansion for an energy. In later papers with Naghdi and others [25,26,27,28], Green has dropped any mention of the possibility of using the present metric for invariants.

What does this actually mean? We have a reference (rest, target) metric corresponding to a basis of tangent vectors to the reference configuration, and a present (current, deformed, actual, realized) metric corresponding to a basis of tangent vectors to the present configuration. Both bases are coordinate bases corresponding to the same material coordinates, one basis being convected into the other by the deformation. With regard to geometry, it seems sensible to use each metric as a metric in its own configuration; outside of its configuration, the metric is just some tensor. With regard to invariants, there seem to be reasons to favor the reference configuration if it is indeed a “rest” configuration, but really this is just a constitutive assumption. While I reserve explicit definitions and discussion for Sect. 2.1, for now I point out that objects such as the metric differences \(\epsilon _{ij} = \frac{1}{2}\left( g_{ij} - {\bar{g}}_{ij} \right) \), formed by taking derivatives of present and reference positions, can be covariant components of either Green or Almansi strain tensors and can be acted on by either (inverse) metric to form invariants of these respective tensors. Thus, what a physicist might call a choice of metric would be more properly called a constitutive law for a hypothetical material. The choice of basis to accompany these components is a choice of tensors—component functions do not uniquely identify a tensor in elasticity, so a common physics shorthand may lead to serious confusion. The reference basis is usually favored in continuum mechanics, and typically researchers have a better intuition for the meaning of Green strain and its invariants than for other options. However, in proceeding from bulk elasticity to the inherited bending behavior of thin structures, there are clear advantages to using the present basis, because one gets to work with the proper geometric invariants of surface curvature tensors and the like. 
This, along perhaps with the general lack of reference bases in other fields of physics, and the enticing potential for parallels with geometric theories in so-called fundamental physics, has tended to favor the present basis in the work of many physicists. However, as discussed below, one needs to be careful making parallels, as elasticity is not a geometric theory in the sense intended in these other areas.

Using the present metric for invariants is rare in continuum mechanics. An early work on rods by Hay [29] uses the present metric for all purposes, although he does not ever discuss invariants of strain, only of stress. Volterra [30] uses the present metric and constructs a rod energy from a strain linear in covariant derivatives of displacement. Ericksen and his collaborators Rivlin [31] and Doyle [5] take it for granted that one should use the reference metric, as the strain measure they are applying its inverse to is simply the present metric, and the other option would just generate the identity. However, Doyle and Ericksen [5] also allow for the possibility of constructing energy expansions using quantities akin to those obtained using the present metric. In fact, algebraic relationships hold between the three invariants of Green and Almansi tensors [4, 5].

Maugin’s opinion [32] was that the deformed metric is the right metric to use “on the material manifold,” and Toupin [33] appears to be saying the same thing, but both authors are concerned with constructing theories consistent with relativistic physics. Rayner’s [34] use of a reference metric is atypical in this area. Oldroyd [35] uses the convected metric to construct invariants for general relativistic rheological equations of state, and Maugin [36] indicates use of the inverse deformation gradient to construct an energy in general relativistic magnetoelasticity (footnote 4). This is also implicit in the work of Hernandez [39], although he does not discuss invariants, and Beig and Schmidt [40], and explicit in papers by Kijowski and Magli [41, 42], who confusingly refer to the inverse deformation gradient as the relativistic deformation gradient. Carter and Quintana [18] use the present metric, which they consider to be a projection tensor of sorts, in their general treatment of general relasticity, but then use the reference metric in an isotropic Hookean approximation, and clearly recognize that one is free to choose when constructing expansions. Karlovini and Samuelsson [24] follow Carter and Quintana and use the present metric, while noting freedom of choice and low-order equivalence in an appendix.

In the context of bending of surfaces, Peterson [43] uses the reference metric as the metric and the integration measure, but constructs a phenomenological bending energy by using both inverse present metric and inverse reference metric to construct a difference between two curvatures. He indicates that this choice is purely for illustrative purposes, and indeed if one were to apply the same approach to the metric tensors, the bulk strains would vanish trivially. Similar approaches to bending can be found elsewhere [44, 45]. At first glance, the resulting object does not seem to correspond to an invariant of a single tensor. Similarly, Stumpf and Makowski [46] use the reference metric on the bulk and mid-surface strains in shells, while using the differences in mean and Gaussian curvature with one value scaled by the ratio of integration measures. Rosso and Virga [47], without mention of reference configurations, discuss general energies for lipid membranes that depend on curvature invariants, while Maleki et al. [48] discuss general energies for lipid membranes that may depend on curvature invariants for both present and referential surfaces. In contrast, despite their notation (following [49]) and some of the statements in the discussion around their equation (2.1), Pezzulla et al. [50] construct their shell bending energy using the reference metric and derivatives of present and reference positions and thus are not employing the present curvature tensor except in those cases, common in their work, where the reference and present metrics of their mid-surface coincide.

Different choices of linear constitutive relations between nonlinear stresses and strains lead to qualitatively different results in bulk elasticity at large strains [51, 52]. It will be shown in Sect. 3.2 and a separate publication [15] that constitutive choices have even more marked qualitative effects on derived bending elasticity. In particular, the Green form of the bulk energy favored in some recent works [12, 14, 50] appears to generate bending energies in qualitative contrast with intuitively sensible definitions even at moderate strains.

This paper is restricted to the relatively simple case of isotropic compatible or incompatible elasticity as considered by [12] and related works, where it is assumed that certain quantities, such as metrics, can still be defined even if no reference configuration exists in \({\mathbb {E}}^3\). Also notable are recent studies of nematic glass or nematic elastomer sheets that involve coupling between the deformation gradient and a director field [53, 54]. These are examples of a class of complicated systems including materials with defects and inclusions, and processes of possibly anisotropic growth or swelling and subsequent elastic deformation. Other tools have been developed, such as the “elastic metric” of Nardinocchi et al. [55], or the more complicated mathematical machinery for calculus on manifolds [56] employed by Yavari et al. [57, 58], who have also made an effort to relate such formalisms to existing approaches and notations. The paper [57] also discusses the distinction between the general covariance principles taken for granted in physics and the concept of objectivity under special Euclidean transformations that has taken prominence in some approaches to continuum mechanics. Other concepts I will not employ in the present work include the two-point “shifter” tensor of early continuum mechanics [59] for translation between two different coordinate systems, and the “vielbein” of general relativity [60] for relating a coordinate basis to a non-coordinate unit basis. While elasticity describes deformation of the same coordinate basis rather than a change of basis in the same configuration, the idea of treating one basis as an alternate non-coordinate non-unit basis seems an interesting possibility, although this would require introducing the concept of Cartan connection to construct covariant derivatives. The use of anholonomic components of tensors was also discussed in the early mechanics literature [59]. 
I will also ignore the third, simple, Euclidean metric of the embedding space that induces the metrics of the reference and present configurations and is implicitly present when taking dot products. More interesting background spaces are relevant to neutron stars [18], and even to a more sophisticated treatment of incompatible elasticity [58]. I note that in some solid mechanics literature, the present configuration is often loosely identified with the embedding space, and working with present objects is sometimes unfortunately referred to as the “Eulerian” picture, despite the common use of “Lagrangian” convected coordinates and coordinate bases in the sense familiar to a fluid mechanician.

Another notable feature of some physics literature on elasticity is its inheritance of the use of geometric (per volume or per area) rather than material (per reference volume, akin to per mass) energies. These geometric actions resemble those in high-energy and gravitational physics such as the Nambu–Goto action consisting of the area of the world-sheet of a “string,” and the Einstein–Hilbert action consisting of the Ricci scalar integrated over the volume of space–time. Despite names like “string” and “brane,” such objects are geometric rather than elastic bodies, and the concept of material coordinates does not apply. In the case of Einstein–Hilbert, there is no meaning to the position (embedding) vector either, and the variation of the volume form is performed with respect to the metric components, producing the Ricci scalar term in the associated field equations. Similarly, the variation of a volume or area form with respect to position in a geometric energy will contribute a term proportional to the Lagrangian density. There are physical situations where this is appropriate, but solid elasticity is not one of them, as it is described by a material energy defined per unit mass [43], an integral over a fixed set of material points. One expects a geometric energy for something like a soap film, where energy is proportional to surface area, and the film is able to pull material from a reservoir either at its boundaries, or within its own finite thickness. In some situations, one might seek a shape that extremizes a functional without requiring it to describe a fixed amount of material. This is quite distinct from the concept of virtual work in continuum mechanics, which relies on mass-conserving deformations [4, 61]. How these issues might arise in attempts to construct variational principles in the context of growth [55, 62, 63] is far beyond the scope of the current paper.
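To make the distinction concrete, the following sketch contrasts the first variations of a geometric and a material energy under \(\mathbf{R}\rightarrow \mathbf{R}+ \delta \mathbf{R}\). The energy density \(W\), the labels \(E_{\text {geo}}\) and \(E_{\text {mat}}\), and the use of the determinants \(g\) and \({\bar{g}}\) of the present and reference metrics as volume measures are assumptions introduced here for illustration, anticipating the notation of Sect. 2.1, with \(\partial ^i\mathbf{R}= g^{ij}\partial _j\mathbf{R}\). Boundary terms and the specific form of \(\delta W\) are left unspecified.

$$\begin{aligned} \delta \sqrt{g}&= \tfrac{1}{2}\sqrt{g}\,g^{ij}\delta g_{ij} = \sqrt{g}\,\partial ^i\mathbf{R}\cdot \partial _i\delta \mathbf{R}, \qquad \delta \sqrt{{\bar{g}}} = 0, \\ \delta E_{\text {geo}}&= \delta \int W \sqrt{g}\,\hbox {d}^3x = \int \left( \delta W + W\,\partial ^i\mathbf{R}\cdot \partial _i\delta \mathbf{R}\right) \sqrt{g}\,\hbox {d}^3x, \\ \delta E_{\text {mat}}&= \delta \int W \sqrt{{\bar{g}}}\,\hbox {d}^3x = \int \delta W \sqrt{{\bar{g}}}\,\hbox {d}^3x. \end{aligned}$$

The term proportional to \(W\) in the geometric case is the contribution described above. It is absent from the material energy because the reference measure does not change under a variation of position.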

An energy commonly used to model lipid membranes as two-dimensional fluids with bending elasticity is the per area Helfrich energy [64,65,66]. As this elasticity arises from different origins than does the bending elasticity of solid sheets, this paper does not cast any light on the appropriateness of such a choice. The use of such an energy leads to geometrically elegant results [67,68,69,70,71,72], and the advantage that terms arising from Gaussian curvature can be pushed out to the boundary conditions, or ignored in the case of closed membranes. Additionally, if the surface metric is constrained, differences in geometric and material energies can be absorbed into the corresponding multiplier, as shown in Sect. 2.5. Otherwise, such differences might be justifiable as a small-strain approximation [7, 13, 21]. Steigmann [73,74,75] distinguishes between per area and per reference area energies, discusses how material coordinates are still relevant for fluid membranes, and comments on how physicists’ conception of “reparameterization” may obscure the fact that variation of a material position vector means that actual material elements, not coordinates, are being moved around.

The remainder of the paper is broadly divided into the topics of bulk elasticity (Sect. 2), reduction and related issues pertaining to bending energies (Sect. 3), and plate elasticity (Sect. 4). A few additional points and suggestions of further avenues for exploration are discussed in Sect. 5. Notations, important concepts, and the definitions of fundamental geometric objects are provided in Sect. 2.1. A few additional tools are introduced in later sections as needed to perform operations.

The paper covers several major and minor topics. The consequences of the “choice of metric” on the forms of both the bulk quadratic energy, in terms of invariants of Green or Almansi strain, and of its resulting equilibrium equations and boundary conditions, are presented across Sects. 2.2–2.4. Some aspects of these derivations are interpreted in terms of standard continuum mechanical objects in Appendix A. Emphasis is laid on the divergence form of the equations, both in these sections and in a later discussion of plate equations (Sect. 4.4) that combine stretching or an in-plane metric constraint (Sect. 4.1) with quadratic mean (Sect. 4.2) and Gaussian (Sect. 4.3) curvature elasticity terms arising from an Almansi form of the bulk energy. Variation of these geometric quantities is presented in a compact vector form with intermediate steps made explicit. There are significant differences in definitions of bending elasticity arising from the choice of these strain measures, or other possible choices, and in Sect. 3.2.1 it is suggested that the Biot strain may be a more appropriate choice for thin structures, an idea already present in prior literature but not in wide circulation [76,77,78,79,80,81]. These points are illustrated in the simple context of extensible elastica in Sect. 3.2. Section 3.1 contains a brief sketch of reduction employing the Kirchhoff–Love assumptions to obtain a plate energy, and a discussion of some ambiguities in how terms are retained (Sect. 3.1.1). The difference between elastic and geometric energies arises in Sect. 2.5 with a discussion of a soap film energy, and the manner in which these differences need to be absorbed into multipliers in the inextensible limit. The distinction is reiterated in the contexts of extensible elastica and plate bending energies in Sect. 3.2.2. 
Its most glaring consequences are the non-Helfrich form and additional tangential component of the contribution to the equations from the elastic squared mean curvature energy (Sect. 4.2), and the fact that the variation of elastic Gaussian curvature energy is not a pure divergence and thus contributes to the Euler–Lagrange equations (Sect. 4.3). A discussion of small-strain approximations in the combined plate equations and a brief comparison with inextensible elastica can be found in Sect. 4.4.

2 Elasticity

In this section, after an extensive discussion of notation and fundamental objects for bulk and thin-body elasticity, I will compare several theories for an elastic energy “quadratic” in strain components and discuss differences between elastic and geometric energies.

2.1 Notation, coordinates, derivatives, bases, metrics, strains, invariants, and some geometric preliminaries

Let us take as fundamental objects the position vectors of present and reference configurations, their derivatives, and the inverses of these derivatives. These will be used to construct tensorial or other quantities used in calculations. A body is described using a position vector \(\mathbf{R}\left( \{x^i\}\right) \), \(i \in \{1, 2, 3\}\), in \({\mathbb {E}}^3\), where \(\{x^i\}\) are convected material coordinates. Derivatives will be denoted with subscripts, \(\partial _i \equiv \tfrac{\partial }{\partial x^i}\,\). A variation \(\delta \) will always mean a first-order change resulting from the operation \(\mathbf{R}\rightarrow \mathbf{R}+ \delta \mathbf{R}\) or, in later sections, its lower-dimensional analogue; no confusion should arise with the Kronecker delta \(\delta ^i_j\), which will always be written with indices. The present configuration \(\mathbf{R}\) has the metric \(g_{ij} = \partial _i\mathbf{R}\cdot \partial _j\mathbf{R}\) induced by the embedding space \({\mathbb {E}}^3\); the dot product will always mean the usual Euclidean operation. We will also employ a reference metric \({\bar{g}}_{ij}\), which for the purposes of this paper can be imagined as derived from a reference configuration \({\bar{\mathbf{R}}}\), \({\bar{g}}_{ij} = \partial _i{\bar{\mathbf{R}}}\cdot \partial _j{\bar{\mathbf{R}}}\). However, such an object is often used for systems without a reference configuration in Euclidean space, as might be appropriate to describe thermal expansion or chemical swelling in frustrated incompatible-elastic bodies that cannot “rest” peacefully [12]. The present and reference metrics have inverses, such that \(g^{ij}g_{jk} = \delta ^i_k\) and \(\left( {\bar{g}}^{-1}\right) ^{ij}{\bar{g}}_{jk} = \delta ^i_k\). 
Instead of the latter expression, we will often write the same functions as \({\bar{g}}^{IJ}{\bar{g}}_{jk}=\delta ^I_k\), with summation holding independent of capitalization, using capital raised indices to indicate raising with the inverse reference metric rather than the inverse present metric. Thus, the functions \(\left( {\bar{g}}^{-1}\right) ^{ij} = {\bar{g}}^{IJ} = {\bar{g}}^{IL}{\bar{g}}^{JK}{\bar{g}}_{LK}\) are distinct from the functions \({\bar{g}}^{ij} = g^{il}g^{jk}{\bar{g}}_{lk}\). The present and reference configurations have as natural coordinate basis vectors their tangent vectors \(\partial _i\mathbf{R}\) and \(\partial _i{\bar{\mathbf{R}}}\), respectively. The corresponding reciprocal basis vectors are \(\partial ^i\mathbf{R}= g^{ij}\partial _j\mathbf{R}\) and \(\left( \partial {\bar{\mathbf{R}}}^{-1}\right) ^i = \left( {\bar{g}}^{-1}\right) ^{ij}\partial _j{\bar{\mathbf{R}}} = \partial ^I{\bar{\mathbf{R}}} = {\bar{g}}^{IJ}\partial _j{\bar{\mathbf{R}}}\). Note that \(g^{ij} = \partial ^i\mathbf{R}\cdot \partial ^j\mathbf{R}\) and \({\bar{g}}^{IJ} = \partial ^I{\bar{\mathbf{R}}}\cdot \partial ^J{\bar{\mathbf{R}}}\). The deformation gradient can be written as \(\partial _i\mathbf{R}\partial ^I{\bar{\mathbf{R}}}\).
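As a concrete check of this index convention, the following minimal sketch builds both metrics for a homogeneous planar deformation and compares the two sets of raised components of the reference metric. The restriction to two dimensions, the particular tangent vectors, and all function and variable names are illustrative assumptions and are not taken from the text.

```python
# Planar sketch of the index conventions; vectors and names are hypothetical.

def dot(u, v):
    """Euclidean dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def metric(basis):
    """Metric components formed from a list of tangent vectors."""
    return [[dot(bi, bj) for bj in basis] for bi in basis]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Assumed tangent vectors: skewed material coordinates in the reference
# configuration, convected by an assumed homogeneous deformation.
ref_basis = [[1.0, 0.0], [1.0, 1.0]]   # partial_i Rbar
cur_basis = [[2.0, 0.0], [3.0, 1.0]]   # partial_i R

g = metric(cur_basis)            # g_ij
g_bar = metric(ref_basis)        # gbar_ij
g_inv = inverse_2x2(g)           # g^{ij}
g_bar_IJ = inverse_2x2(g_bar)    # gbar^{IJ}: inverse reference metric

# gbar^{ij} = g^{il} g^{jk} gbar_{lk}: reference metric functions raised
# with the inverse present metric (g_inv is symmetric).
g_bar_ij = matmul(matmul(g_inv, g_bar), g_inv)

# Reciprocal present basis, partial^i R = g^{ij} partial_j R.
recip_cur = [[sum(g_inv[i][j] * cur_basis[j][c] for j in range(2))
              for c in range(2)] for i in range(2)]

print("gbar^{IJ} =", g_bar_IJ)
print("gbar^{ij} =", g_bar_ij)
print("g^{ij} from reciprocal basis =", metric(recip_cur))
```

The two printed arrays for the reference metric differ, while the metric of the reciprocal present basis reproduces \(g^{ij}\), consistent with the relations stated above.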

A great potential source of confusion in elasticity is the fact that these functions, and others derived from them, can serve as covariant or contravariant components of different tensors by pairing them with either the present or reference bases (footnote 5). This is despite the fact that the indices always correspond to the same material coordinates; the reference basis is convected into the present basis by the deformation. It may be confusing if one is used to thinking of metrics in terms of line elements, as the expressions \(\hbox {d}l^2 = g_{ij}\hbox {d}x^i\hbox {d}x^j\) and \(\hbox {d}{\bar{l}}^2 = {\bar{g}}_{ij}\hbox {d}x^i\hbox {d}x^j\) involve the same material coordinate segments \(\hbox {d}x^i\) that are unchanged by deformations of the body [43]. But the change of basis matters. Unlike many other situations in physics, in elasticity one cannot unambiguously refer to a tensor by referring to its components.

The two notations I adopt here avoid potential ambiguities associated with the otherwise very compact and elegant formalism of Green and Zerna [4] that has been adopted with modifications by Steigmann [8] and others. They also obviate any need to remember the respective spaces in which different objects live. The clumsy inverse notation is necessary when one writes out specific components, such as \(\left( g^{-1}\right) ^{12}\), and will also be of conceptual help in Sect. 2.4 when both bases are to be used simultaneously. Interesting alternate approaches to notation may be found in [54, 82].

Paired with the (reciprocal) reference basis, the \(g_{ij} = g_{IJ}\) are covariant components of the right Cauchy–Green deformation tensor \(\mathbf{C}= g_{IJ}\partial ^I{\bar{\mathbf{R}}}\partial ^J{\bar{\mathbf{R}}}\), while paired with the (reciprocal) present basis they are covariant components of the identity \(\mathbf{I}= g_{ij}\partial ^i\mathbf{R}\partial ^j\mathbf{R}\) and colloquially said to be “the metric,” as it is implied that one is in the present configuration. Similarly, the \({\bar{g}}_{ij}={\bar{g}}_{IJ}\) are either covariant components of the identity \(\mathbf{I}= {\bar{g}}_{IJ}\partial ^I{\bar{\mathbf{R}}}\partial ^J{\bar{\mathbf{R}}}\), in which case they are “the metric” in the reference configuration (footnote 6), or they are covariant components of the inverse left Cauchy–Green tensor \(\mathbf{B}^{-1} = {\bar{g}}_{ij}\partial ^i\mathbf{R}\partial ^j\mathbf{R}\). For completeness and later use, let us also write out the corresponding inverse right Cauchy–Green tensor \(\mathbf{C}^{-1} = g^{ij}\partial _i{\bar{\mathbf{R}}}\partial _j{\bar{\mathbf{R}}}\) and the left Cauchy–Green tensor \(\mathbf{B}= {\bar{g}}^{IJ}\partial _I\mathbf{R}\partial _J\mathbf{R}\). Just as with the metric functions, capitalization only matters on the upper indices, so \(\partial _i\mathbf{R}= \partial _I\mathbf{R}\) and \(\partial _i{\bar{\mathbf{R}}}=\partial _I{\bar{\mathbf{R}}}\), and the important distinguishers are the bars or lack thereof on the \(\mathbf{R}\)s. This is simply a consequence of the fact that \(\partial _i\) and \(\partial _I\) are the same partial derivative, but more care will be needed when we employ covariant derivatives.
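The pairing of the same component functions with different bases can be checked numerically. The sketch below, again for an assumed homogeneous planar deformation with hypothetical names, assembles the deformation gradient \(\mathbf{F}\) (a symbol used here only for this illustration) from the dyads \(\partial _i\mathbf{R}\partial ^I{\bar{\mathbf{R}}}\), and compares \(\mathbf{F}^{\mathrm {T}}\mathbf{F}\) and \(\mathbf{F}\mathbf{F}^{\mathrm {T}}\) with the component expressions for \(\mathbf{C}\) and \(\mathbf{B}\) given above. All matrices hold Cartesian components in the embedding plane.

```python
# Planar check of C and B built from metric components; names are hypothetical.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def metric(basis):
    return [[dot(bi, bj) for bj in basis] for bi in basis]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def reciprocal(basis):
    """Reciprocal basis: inverse metric applied to the tangent vectors."""
    inv = inverse_2x2(metric(basis))
    return [[sum(inv[i][j] * basis[j][c] for j in range(2))
             for c in range(2)] for i in range(2)]

def dyadic_sum(comps, left, right):
    """Cartesian matrix of sum_ij comps[i][j] * left[i] (outer) right[j]."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            for r in range(2):
                for c in range(2):
                    total[r][c] += comps[i][j] * left[i][r] * right[j][c]
    return total

ref_basis = [[1.0, 0.0], [1.0, 1.0]]   # assumed partial_i Rbar
cur_basis = [[2.0, 0.0], [3.0, 1.0]]   # assumed partial_i R
recip_ref = reciprocal(ref_basis)      # partial^I Rbar

identity = [[1.0, 0.0], [0.0, 1.0]]
F = dyadic_sum(identity, cur_basis, recip_ref)   # partial_i R partial^I Rbar

# Same functions g_ij, paired with the reciprocal reference basis, give C.
C_components = dyadic_sum(metric(cur_basis), recip_ref, recip_ref)
# Inverse reference metric, paired with the present basis, gives B.
B_components = dyadic_sum(inverse_2x2(metric(ref_basis)), cur_basis, cur_basis)

print("C:", C_components, "vs", matmul(transpose(F), F))
print("B:", B_components, "vs", matmul(F, transpose(F)))
```

In this example the two printed pairs agree, which is the planar counterpart of the statements that the \(g_{IJ}\) are components of \(\mathbf{C}\) on the reciprocal reference basis and the \({\bar{g}}^{IJ}\) are components of \(\mathbf{B}\) on the present basis.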

An important derived quantity is the difference in two metrics

$$\begin{aligned} \epsilon _{ij} = \epsilon _{IJ} = \tfrac{1}{2}\left( g_{ij} - {\bar{g}}_{ij} \right) , \end{aligned}$$
(1)

which is a commonly used measure of strain. These functions are, again, components of two possible tensors: the Green(–Lagrange–St. Venant) strain tensor \(\varvec{\epsilon }_{\text {{G}}} = \epsilon _{IJ}\partial ^I{\bar{\mathbf{R}}}\partial ^J{\bar{\mathbf{R}}} = \tfrac{1}{2}\left( \mathbf{C}- \mathbf{I}\right) \) or the (Euler–)Almansi strain tensor \(\varvec{\epsilon }_{\text {{A}}} = \epsilon _{ij}\partial ^i\mathbf{R}\partial ^j\mathbf{R}= \tfrac{1}{2}\left( \mathbf{I}- \mathbf{B}^{-1}\right) \). With the bases in mind, it is clear how to form invariants of such tensors from their identical covariant components. For example, the traces of the Green and Almansi strains have the unambiguous definitions \(\mathrm {Tr}(\varvec{\epsilon }_{\text {{G}}}) = \mathbf{I}: \varvec{\epsilon }_{\text {{G}}} = {\bar{g}}^{IJ}\epsilon _{IJ} = \left( {\bar{g}}^{-1}\right) ^{ij}\epsilon _{ij}\) and \(\mathrm {Tr}(\varvec{\epsilon }_{\text {{A}}}) = \mathbf{I}: \varvec{\epsilon }_{\text {{A}}} = g^{ij}\epsilon _{ij}\), respectively. These are sometimes informally expressed as \(\mathrm {Tr}_{{\bar{g}}}(\epsilon )\) and \(\mathrm {Tr}_{g}(\epsilon )\), or by saying that one takes the trace of the strain by raising with one (inverse) metric or the other, but really this does not have a meaning. In elasticity, the component functions do not unambiguously define a tensor, so the same functions can have different interpretations. Thus, the question of which metric is used to define invariants is really a question of which energy or constitutive relation is being chosen, and this does not have a “right” answer except as it connects to experiments on particular materials.
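A short numerical sketch may help to show how the same functions \(\epsilon _{ij}\) yield different invariants. It assumes a planar uniaxial stretch by a factor \(\lambda \) with Cartesian material coordinates, for which the two traces reduce to \(\tfrac{1}{2}\left( \lambda ^2 - 1\right) \) and \(\tfrac{1}{2}\left( 1 - \lambda ^{-2}\right) \). The function names and the choice of deformation are hypothetical and serve only as an illustration.

```python
# Traces of Green and Almansi strain from identical components; names are hypothetical.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def metric(basis):
    return [[dot(bi, bj) for bj in basis] for bi in basis]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def contract(inv_metric, eps):
    """Full contraction inv_metric^{ij} eps_{ij}."""
    return sum(inv_metric[i][j] * eps[i][j]
               for i in range(2) for j in range(2))

def strain_traces(ref_basis, cur_basis):
    """Return (Tr of Green strain, Tr of Almansi strain)."""
    g = metric(cur_basis)
    g_bar = metric(ref_basis)
    eps = [[0.5 * (g[i][j] - g_bar[i][j]) for j in range(2)]
           for i in range(2)]
    tr_green = contract(inverse_2x2(g_bar), eps)   # gbar^{IJ} eps_IJ
    tr_almansi = contract(inverse_2x2(g), eps)     # g^{ij} eps_ij
    return tr_green, tr_almansi

def uniaxial_traces(stretch):
    """Assumed planar uniaxial stretch along the first coordinate."""
    ref_basis = [[1.0, 0.0], [0.0, 1.0]]
    cur_basis = [[stretch, 0.0], [0.0, 1.0]]
    return strain_traces(ref_basis, cur_basis)

for stretch in (1.01, 2.0):
    tr_green, tr_almansi = uniaxial_traces(stretch)
    print(stretch, tr_green, tr_almansi)
```

For the small stretch the two traces nearly coincide, while for \(\lambda = 2\) they differ by a factor of four, so an energy written in terms of one trace or the other describes a different material response at moderate strain.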

Each metric has its own compatible covariant derivative identifiable with its Levi-Civita connection, such that \({\bar{\nabla }}_K{\bar{g}}_{IJ} = 0\) and \(\nabla _k g_{ij} = 0\), with corresponding Christoffel symbols formed with the respective metrics and inverse metrics. Likewise, the metric determinants pass through the corresponding derivatives, \({\bar{\nabla }}_K\sqrt{{\bar{g}}}=0\) and \(\nabla _k\sqrt{g}=0\). A single partial derivative of a vector such as \(\mathbf{R}\), with no indices, may be replaced by a material covariant derivative, \(\partial _i\mathbf{R}= \nabla _i\mathbf{R}= {\bar{\nabla }}_I\mathbf{R}\), an object with one covariant material index which should subsequently be interpreted carefully depending on context. Another property of \({\mathbb {E}}^3\) is that \({\bar{\nabla }}_I {\bar{\nabla }}_J {\bar{\mathbf{R}}} = \mathbf{{0}}\) and \(\nabla _i\nabla _j\mathbf{R}= \mathbf{{0}}\), so we could define displacements \(\mathbf{u}= \mathbf{R}- {\bar{\mathbf{R}}}\) and write \({\bar{\nabla }}_I{\bar{\nabla }}_J \mathbf{u}= {\bar{\nabla }}_I{\bar{\nabla }}_J \mathbf{R}\) and \(\nabla _i\nabla _j \mathbf{u}= -\,\nabla _i\nabla _j{\bar{\mathbf{R}}}\). Note that

$$\begin{aligned} \partial _i\partial _j\mathbf{R}&= \partial _i\nabla _j\mathbf{R}= \partial _i\partial _j\mathbf{R}\cdot \nabla ^k\mathbf{R}\nabla _k\mathbf{R}= \Gamma ^k_{ji}\partial _k\mathbf{R}, \end{aligned}$$
(2)
$$\begin{aligned} \mathrm {so}\quad \nabla _i\nabla _j\mathbf{R}&= \partial _i\nabla _j\mathbf{R}- \Gamma ^k_{ij}\nabla _k\mathbf{R}= \mathbf{{0}}, \end{aligned}$$
(3)

using the fact that the Levi-Civita connection is torsion-free, so the Christoffel symbols \(\Gamma ^k_{ij}\) are symmetric in their lower indices. However, when we consider plates, we will use a two-dimensional description of the mid-surface of the body, with only the surface indices as material indices. Let \(\mathbf{X}\left( \{x^\alpha \}\right) \), \(\alpha \in \{1, 2\}\), be a two-dimensional body, a surface embedded in \({\mathbb {E}}^3\), with unit normal \(\varvec{\mathrm {\hat{N}}}\), metric \(a_{\alpha \beta } = \partial _\alpha \mathbf{X}\cdot \partial _\beta \mathbf{X}\), and symmetric curvature tensor components \(b_{\alpha \beta } = \partial _\beta \partial _\alpha \mathbf{X}\cdot \varvec{\mathrm {\hat{N}}} = {-\partial _\alpha \mathbf{X}\cdot \partial _\beta \varvec{\mathrm {\hat{N}}}}\). It is also convenient to define the components of the third fundamental form \({\partial _\alpha \varvec{\mathrm {\hat{N}}}\cdot \partial _\beta \varvec{\mathrm {\hat{N}}}} = b^\gamma _\alpha b_{\beta \gamma } = c_{\alpha \beta }\). Now

$$\begin{aligned} \partial _\alpha \partial _\beta \mathbf{X}&= \partial _\alpha \nabla _\beta \mathbf{X}= \partial _\alpha \partial _\beta \mathbf{X}\cdot \left( \nabla ^\gamma \mathbf{X}\nabla _\gamma \mathbf{X}+ \varvec{\mathrm {\hat{N}}}\varvec{\mathrm {\hat{N}}} \right) = \Gamma ^\gamma _{\beta \alpha }\partial _\gamma \mathbf{X}+ b_{\beta \alpha }\varvec{\mathrm {\hat{N}}}, \end{aligned}$$
(4)

and as \(\nabla _\alpha \nabla _\beta \mathbf{X}= \partial _\alpha \nabla _\beta \mathbf{X}- \Gamma ^\gamma _{\alpha \beta }\nabla _\gamma \mathbf{X}\), we have obtained the first part of the Gauss–Weingarten system

$$\begin{aligned} \nabla _\beta \nabla _\alpha \mathbf{X}&= b_{\alpha \beta }\varvec{\mathrm {\hat{N}}}, \end{aligned}$$
(5)
$$\begin{aligned} \nabla _\alpha \varvec{\mathrm {\hat{N}}}&= -b^\beta _\alpha \nabla _\beta \mathbf{X}. \end{aligned}$$
(6)

Applied to the position vector of the surface, two covariant derivatives give a term, symmetric in its two surface indices, that does not vanish but points entirely normal to, or rather out of, this lower-dimensional curved material space. Note that this is distinct from pointing normal to the boundary of the body—there is no corresponding concept for a space-filling body. More broadly, while covariant derivatives do not in general commute in a curved space, they will do so when acting on a scalar or something like a spatial vector \(\mathbf{X}\) that carries no material indices.

The mean and Gaussian curvatures of \(\mathbf{X}\) are invariants of the curvature tensor \(\mathbf{b}= b_{\alpha \beta }\partial ^\alpha \mathbf{X}\partial ^\beta \mathbf{X}\),

$$\begin{aligned} H&= \tfrac{1}{2}b^\alpha _\alpha = \tfrac{1}{2}\, \mathrm {Tr}\left( \mathbf{b}\right) , \end{aligned}$$
(7)
$$\begin{aligned} K&= \tfrac{1}{2}\left( b^\alpha _\alpha b^\beta _\beta - b^\alpha _\beta b^\beta _\alpha \right) = \tfrac{1}{2}\,\left( \left( \mathrm {Tr}\left( \mathbf{b}\right) \right) ^2 -\mathrm {Tr}\left( \mathbf{b}^2\right) \right) =\mathrm {Det}\left( \mathbf{b}\right) , \end{aligned}$$
(8)

where clearly the inverse metric of the surface \(a^{\alpha \beta }\) is used to form invariants from derivatives of position. Notation such as that in [49, 50] is vague, as it suggests that a curvature tensor is being used when it is not. The quadratic invariants relevant to bending energies can also be written directly in terms of the derivatives,

$$\begin{aligned} H^2&= \tfrac{1}{4}\nabla ^2\mathbf{X}\cdot \nabla ^2\mathbf{X}, \end{aligned}$$
(9)
$$\begin{aligned} K&= \tfrac{1}{2}\nabla ^2\mathbf{X}\cdot \nabla ^2\mathbf{X}- \tfrac{1}{2}\nabla ^\alpha \nabla _\beta \mathbf{X}\cdot \nabla _\alpha \nabla ^\beta \mathbf{X}, \end{aligned}$$
(10)

where \(\nabla ^2\) stands for the covariant Laplacian or Laplace–Beltrami operator on the surface \(\nabla ^\alpha \nabla _\alpha \).
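The role of the inverse surface metric in forming these invariants can be checked on a simple case. The sketch below assumes a sphere parametrized by polar and azimuthal angles with the outward normal, so that the components \(b_{\alpha \beta }\) are negative; the function name `sphere_curvatures` and this choice of normal are illustrative assumptions.

```python
import math

def sphere_curvatures(radius, theta):
    """Mean and Gaussian curvature of a sphere from a_{ab} and b_{ab}.

    Coordinates (theta, phi), X = radius*(sin t cos p, sin t sin p, cos t),
    with N taken as the outward normal (an assumption of this sketch).
    """
    s2 = math.sin(theta) ** 2
    a = [[radius**2, 0.0], [0.0, radius**2 * s2]]   # metric a_{ab}
    b = [[-radius, 0.0], [0.0, -radius * s2]]       # b_{ab} = d_a d_b X . N
    a_inv = [[1.0 / a[0][0], 0.0], [0.0, 1.0 / a[1][1]]]
    # Mixed components b^a_b, raised with the inverse surface metric.
    b_mixed = [[sum(a_inv[i][k] * b[k][j] for k in range(2))
                for j in range(2)] for i in range(2)]
    trace = b_mixed[0][0] + b_mixed[1][1]
    trace_sq = sum(b_mixed[i][j] * b_mixed[j][i]
                   for i in range(2) for j in range(2))
    H = 0.5 * trace                       # Eq. (7)
    K = 0.5 * (trace**2 - trace_sq)       # Eq. (8)
    return H, K

H, K = sphere_curvatures(2.0, 1.0)
print("H =", H, " K =", K)
```

With this normal, the result is independent of the polar angle, as expected for a sphere: the mean curvature is minus the inverse radius and the Gaussian curvature is the inverse radius squared.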

Finally, the compatibility relationships between metric and curvature for two-dimensional surfaces are contained in the Codazzi equations

$$\begin{aligned} \nabla _\alpha b_{\beta \gamma } = \nabla _\beta b_{\alpha \gamma }, \end{aligned}$$
(11)

and the Gauss equation

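$$\begin{aligned} R_{\alpha \beta \gamma \zeta }&= b_{\alpha \gamma }b_{\beta \zeta } - b_{\alpha \zeta }b_{\beta \gamma } = K\left( a_{\alpha \gamma }a_{\beta \zeta } - a_{\alpha \zeta }a_{\beta \gamma }\right) , \end{aligned}$$
(12)
$$\begin{aligned} \mathrm {so}\quad R_{\alpha \beta }&= K a_{\alpha \beta } \quad \mathrm {and}\quad R = 2K, \end{aligned}$$
(13)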

where \(R_{\alpha \beta \gamma \zeta }\) is the Riemann tensor, definable in terms of the metric via the usual mess of Christoffel symbols, \(R_{\alpha \beta }\) is the Ricci tensor, and R is the Ricci scalar.

2.2 Reference metric as the metric: invariants of Green strain in the reference configuration

This approach, in which the reference metric plays every metric role, is closest to the influential work of Efrati et al. [12], which has been adopted by several other groups. While not particularly traditional, there is precedent for this approach in the shell literature [83]. In Sect. 3.2, we will demonstrate that the bending energy for thin bodies derived from this bulk energy has undesirable features. I tentatively interpret what follows as performing all operations in the reference configuration. In Sect. 2.4, we will revisit the same energy in the present configuration. Appendix A will provide further clarification of these two treatments.

We write the energy as

$$\begin{aligned} {\bar{E}} = \int \hbox {d}{\bar{V}} \bar{\mathcal {E}}, \end{aligned}$$
(14)

where \(\hbox {d}{\bar{V}} = \sqrt{{\bar{g}}} \prod \nolimits _{i} \hbox {d}x^i\). The energy density \(\bar{\mathcal {E}}\) can be thought of as per unit unstrained (reference) volume, or equivalently as per unit mass if we assume a uniform mass density of unity in the reference configuration. For simplicity, the distinction will be ignored in this paper, and the phrase “per unit mass” used loosely. However, it is important in more general problems involving a combination of swelling and subsequent elastic deformation in which multiple reference states exist [54].

The strain components in (1) are taken as the relevant measure of strain. This is by no means the only or best choice, as will be discussed in Sect. 3.2.1. However, given this choice, the appropriate energy density for small strains is the quadratic expression

$$\begin{aligned} \bar{\mathcal {E}} = \tfrac{1}{2}{\bar{A}}^{IJKL}\epsilon _{IJ}\epsilon _{KL}, \end{aligned}$$
(15)

where the isotropic elasticity tensor has contravariant components \({\bar{A}}^{IJKL} = \lambda {\bar{g}}^{IJ}{\bar{g}}^{KL} + \mu \left( {\bar{g}}^{IK}{\bar{g}}^{JL} +\, {\bar{g}}^{IL}{\bar{g}}^{JK}\right) \), \(\lambda \) and \(\mu \) being constant moduli with units of energy per unit mass. Thus, \(2\bar{\mathcal {E}} = \lambda \epsilon _I^I\epsilon _J^J + 2\mu \epsilon _J^I\epsilon _I^J = \lambda \left( \mathrm {Tr}\left( \varvec{\epsilon }_{\text {{G}}}\right) \right) ^2 + 2\mu \mathrm {Tr}\left( \varvec{\epsilon }_{\text {{G}}}^2\right) \).
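The equivalence between the contraction with \({\bar{A}}^{IJKL}\) and the trace form of the energy density can be verified numerically. The sketch below is only an illustration: the diagonal inverse reference metric, the strain components, the moduli, and the function names are all hypothetical choices.

```python
def elasticity_tensor(g_inv, lam, mu):
    """Contravariant components A^{IJKL} built from an inverse metric."""
    n = range(3)
    return [[[[lam * g_inv[i][j] * g_inv[k][l]
               + mu * (g_inv[i][k] * g_inv[j][l] + g_inv[i][l] * g_inv[j][k])
               for l in n] for k in n] for j in n] for i in n]

def energy_density(A, eps):
    """Quadratic energy density (1/2) A^{IJKL} eps_{IJ} eps_{KL}, Eq. (15)."""
    n = range(3)
    return 0.5 * sum(A[i][j][k][l] * eps[i][j] * eps[k][l]
                     for i in n for j in n for k in n for l in n)

def energy_from_traces(g_inv, eps, lam, mu):
    """Same density written as (1/2)(lam Tr(eps)^2 + 2 mu Tr(eps^2))."""
    n = range(3)
    mixed = [[sum(g_inv[i][k] * eps[k][j] for k in n) for j in n] for i in n]
    tr = sum(mixed[i][i] for i in n)
    tr_sq = sum(mixed[i][j] * mixed[j][i] for i in n for j in n)
    return 0.5 * (lam * tr**2 + 2.0 * mu * tr_sq)

# Hypothetical data: a non-Cartesian but diagonal inverse reference metric
# and a small symmetric set of strain components.
g_bar_inv = [[1.0, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 1.0]]
eps = [[0.02, 0.01, 0.0], [0.01, -0.03, 0.005], [0.0, 0.005, 0.01]]
A_bar = elasticity_tensor(g_bar_inv, lam=1.0, mu=0.5)
print(energy_density(A_bar, eps))
print(energy_from_traces(g_bar_inv, eps, lam=1.0, mu=0.5))
```

Passing the inverse present metric instead of the inverse reference metric to the same functions would give the Almansi-based invariants of Sect. 2.3, which is the sense in which the choice of metric is a constitutive choice.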

The variation \(\mathbf{R}\rightarrow \mathbf{R}+ \delta \mathbf{R}\) passes through the partial derivative and transforms the present metric as \(\delta g_{ij} = \partial _i\delta \mathbf{R}\cdot \partial _j\mathbf{R}+ \partial _i\mathbf{R}\cdot \partial _j\delta \mathbf{R}\). It leaves \({\bar{g}}_{ij}\), \({\bar{g}}^I_j = \delta ^I_j\), and \({\bar{g}}^{IJ}\) unchanged, but transforms \(g^I_j = {\bar{g}}^{IK}g_{kj}\). To perform the variation, we use the equivalence \(\partial _i\mathbf{R}= {\bar{\nabla }}_i\mathbf{R}\), the symmetries of \({\bar{A}}^{IJKL}\) under \(I \leftrightarrow J\), \(K \leftrightarrow L\), and \(IJ \leftrightarrow KL\), and the following fact. Let there be a symmetric tensor with components \(S_{ij} = Q_{ij} + Q_{ji}\). Then \(S^i_i = 2Q^i_i\) and, for any symmetric \(T^{ij}\), \(T^{ij}S_{ij} = 2T^{ij}Q_{ij}\). Thus,

$$\begin{aligned} \delta {\bar{E}}&= \delta \int \hbox {d}{\bar{V}} \tfrac{1}{8}{\bar{A}}^{IJKL}\left( g_{IJ}-{\bar{g}}_{IJ}\right) \left( g_{KL}-{\bar{g}}_{KL}\right) , \nonumber \\&= \int \hbox {d}{\bar{V}} \tfrac{1}{2}{\bar{A}}^{IJKL}\left( g_{IJ}-{\bar{g}}_{IJ}\right) {\bar{\nabla }}_L\mathbf{R}\cdot {\bar{\nabla }}_K\delta \mathbf{R}, \nonumber \\&= -\int \hbox {d}{\bar{V}}\, {\bar{\nabla }}_K\left( {\bar{A}}^{IJKL}\epsilon _{IJ} {\bar{\nabla }}_L\mathbf{R}\right) \cdot \delta \mathbf{R}\, \nonumber \\&\quad \,\, + \oint \hbox {d}{\bar{S}}\, {\bar{n}}_K {\bar{A}}^{IJKL}\epsilon _{IJ} {\bar{\nabla }}_L\mathbf{R}\cdot \delta \mathbf{R}, \end{aligned}$$
(16)

assuming a smooth boundary \(\partial {\bar{V}}\). Here \(\hbox {d}{\bar{S}}\) is a surface element constructed with the reference metric and \({\bar{n}}_K\) is the covariant component, in the reference basis, of the unit normal to the boundary of the reference configuration, \(\varvec{\mathrm {\hat{{\bar{n}}}}} = {\bar{n}}^I\partial _I{\bar{\mathbf{R}}}\) and \(\varvec{\mathrm {\hat{{\bar{n}}}}}\cdot \partial _J{\bar{\mathbf{R}}} = {\bar{n}}_J\).

Thus, we are led to define stress components

$$\begin{aligned} {\bar{\Sigma }}^{KL} = {\bar{A}}^{IJKL}\epsilon _{IJ}, \end{aligned}$$
(17)

such that the equations of equilibrium are

$$\begin{aligned} -{\bar{\nabla }}_K\left( {\bar{\Sigma }}^{KL} {\bar{\nabla }}_L\mathbf{R}\right) = \mathbf{{0}}, \end{aligned}$$
(18)

and free boundary conditions on \(\partial {\bar{V}}\) are

$$\begin{aligned} {\bar{n}}_K {\bar{\Sigma }}^{KL} {\bar{\nabla }}_L\mathbf{R}= \mathbf{{0}}. \end{aligned}$$
(19)

These results are equivalent to those in [4, 8, 83] and differ from those of [12] only in that their expressions use the present normal vector.

It seems strange that the basis vectors of the present configuration \({\bar{\nabla }}_L\mathbf{R}= \partial _l\mathbf{R}\) are here playing the role of covariant components of objects in the reference configuration. I will revisit this thought in Sect. 2.4.

Throughout the paper, I will informally interpret (18) and other expressions like it as being vectors in the embedding space and scalars in the material space of the body. It is apparent that (18) is a divergence, as it should be. However, \({\bar{\nabla }}_K{\bar{\nabla }}_L \mathbf{R}\ne 0\) because the reference metric and connection are not constructed from the present configuration \(\mathbf{R}\). For this reason, the projection (by Euclidean dot product) of the divergence (18) onto the tangent vectors does not result in a divergence form of the resulting equations for material components. Indeed, the projection onto the inverse deformed tangent vector \({\bar{\nabla }}^J\mathbf{R}\) of the bulk Eq. (18) is cumbersome,

$$\begin{aligned} -g_L^J {\bar{\nabla }}_K{\bar{\Sigma }}^{KL} -{\bar{\Sigma }}^{KL}{\bar{\nabla }}_K{\bar{\nabla }}_L\mathbf{R}\cdot {\bar{\nabla }}^J\mathbf{R}= 0. \end{aligned}$$
(20)

This is accompanied by the projection of the boundary conditions (19)

$$\begin{aligned} g_L^J {\bar{n}}_K{\bar{\Sigma }}^{KL} = 0. \end{aligned}$$
(21)

We may abusively rewrite the second derivatives as

$$\begin{aligned} {\bar{\nabla }}_K{\bar{\nabla }}_L\mathbf{R}&= \nabla _K\nabla _L\mathbf{R}+ \left( \Gamma ^m_{KL} - {\bar{\Gamma }}^M_{KL}\right) \partial _M\mathbf{R}, \end{aligned}$$
(22)

where Christoffel symbols are defined using the corresponding unbarred or barred metrics, and upper indices raised following our capitalization convention. We may also eliminate stray mixed components of present metrics by returning to (18) and (19) and projecting them instead onto the present reciprocal basis vectors \(\nabla ^j\mathbf{R}\), for which \(\partial _L\mathbf{R}\cdot \nabla ^j\mathbf{R}= \delta ^j_L\), to obtain the projection of the bulk equation

$$\begin{aligned} -{\bar{\nabla }}_K{\bar{\Sigma }}^{KJ} - {\bar{\Sigma }}^{KL}\left( \Gamma ^j_{KL} - {\bar{\Gamma }}^J_{KL}\right) = 0, \end{aligned}$$
(23)

with boundary conditions

$$\begin{aligned} {\bar{n}}_K{\bar{\Sigma }}^{KJ} = 0. \end{aligned}$$
(24)

Note that (23) is typically written with some type of explicit inverse notation, akin to what was introduced in Sect. 2.1 but with metric roles reversed. I think we need to interpret the free index as corresponding to the same configuration for each term; this is likely the reference configuration, despite the fact that in one term the index was raised with the present metric. The representations (18) or (20) for the bulk equation are less ambiguous. Equations (23) and (24) can be found in [12], excepting notational differences and the disagreement of the choice of normal. Similar expressions appear in Sanders [84]. The difference in Christoffel symbols is a tensor that also appears in other works [17, 24, 85]. Steigmann [85] refers to it as a “strain measure” for shells, along with more typical strain and curvature differences.

Because \({\bar{\nabla }}_K{\bar{\nabla }}_L \mathbf{R}\) is \(O(\epsilon , \partial _i\epsilon )\) and \(g^J_L = \delta ^J_L + O(\epsilon )\), it may be quite sensible to write approximate projected bulk equations

$$\begin{aligned} -{\bar{\nabla }}_K{\bar{\Sigma }}^{KJ} + O(\epsilon ^2, \epsilon \partial _i\epsilon ) = 0. \end{aligned}$$
(25)

There seems little reason to include quadratic terms in the Euler–Lagrange equations while neglecting others that would have been generated by an energy expanded out to cubic order. However, in Sect. 2.4 I will present a better way to clean up these expressions that does not rely on such an argument and naturally provides a divergence form for both the equations and their projections.

2.3 Present metric as the metric: invariants of Almansi strain in the present configuration

This approach, in which the present metric plays every metric role, is a somewhat unconventional constitutive choice in continuum mechanics. However, this is essentially the default among physicists studying relativistic elasticity or the bending elasticity of fluid membranes (whether biological or cosmic), due in part to its natural production of geometric quantities in the equations. Such an approach is often applied in a manner inappropriate for describing a fixed quantity of elastic material; the distinction between geometric and material energies will be discussed in Sect. 2.5. The particular presentation here, as well as its further application to plates in Sect. 4, is, I believe, not to be found elsewhere. In Sect. 3.2, we will see that the qualitative properties of the bending energy derived from this approach are in direct opposition to those derived from the reference metric approach and that neither corresponds to the simplest director theories of rods. All operations are performed in the present configuration.

We write the energy as

$$\begin{aligned} E = \int \hbox {d}V \mathcal {E}, \end{aligned}$$
(26)

where \(\hbox {d}V = \sqrt{g} \prod \nolimits _{i} \hbox {d}x^i = \sqrt{g/{\bar{g}}}\,\hbox {d}{\bar{V}}\). This description now requires the density \(\mathcal {E}\) to be an energy per unit strained (present) volume. Yet we still expect the moduli to be constants per unit mass, not per unit volume. Thus, a sensible energy density is

$$\begin{aligned} \mathcal {E} = \tfrac{1}{2}\sqrt{{\bar{g}}/g}\,A^{ijkl}\epsilon _{ij}\epsilon _{kl} , \end{aligned}$$
(27)

where the isotropic elasticity tensor has contravariant components \(A^{ijkl} = \lambda g^{ij}g^{kl} + \mu \left( g^{ik}g^{jl} + g^{il}g^{jk}\right) \). Thus, \(2\mathcal {E} = \sqrt{{\bar{g}}/g}\left( \lambda \epsilon _i^i\epsilon _j^j + 2\mu \epsilon _j^i\epsilon _i^j \right) = \sqrt{{\bar{g}}/g}\left( \lambda \left( \mathrm {Tr}\left( \varvec{\epsilon }_{\text {{A}}}\right) \right) ^2 + 2\mu \mathrm {Tr}\left( \varvec{\epsilon }_{\text {{A}}}^2\right) \right) \). Note that while \(\sqrt{{\bar{g}}}\) and \(\sqrt{g}\) are scalar densities, a quantity like the scale factor \(\sqrt{{\bar{g}}/g}\) is a scalar because the two metric determinants transform equally under changes of the same material coordinates. The quantities \(\sqrt{g/{\bar{g}}}\) and \(\sqrt{{\bar{g}}/g}\) usually appear as J and \(J^{-1}\) in the continuum mechanics literature. For the purposes of the variation, the \(\sqrt{g}\) in the volume form \(\hbox {d}V\) and the \(1/\sqrt{g}\) multiplying the moduli cancel each other out. Physically, this just means that the energy for a fixed amount of material does not increase simply because the volume increases. This is a material energy, not a geometric energy like that of a soap film in contact with a reservoir (Sect. 2.5).

Although they look similar, \(\mathcal {E} \ne \bar{\mathcal {E}}\) and \(E \ne {\bar{E}}\), and these approaches represent two different theories of elasticity arising from two different constitutive stored energy functions. Either could be said to be “quadratic” in the strain components, but that is not enough to uniquely specify the theory, as the components can belong to either the Green or Almansi strain tensors. The distinction is of higher order than quadratic in strain, that is,

$$\begin{aligned} \mathcal {E}&= \bar{\mathcal {E}} + O(\epsilon ^3), \end{aligned}$$
(28)

and indeed this is also the argument for the rather unphysical neglect of \(\sqrt{{\bar{g}}/g}\) in [13]. However, as we will see in Sects. 3.1.1 and 4.1.1, such differences in energy actually have important lower-order consequences at the level of the equations of equilibrium for thin bodies, which might be an artifact of these choices of strain measure. To see that (28) is true, note that the elasticity tensor used in the reference energy in Sect. 2.2 would be written in the present configuration using components of the inverse reference metric \(\left( {\bar{g}}^{-1}\right) ^{ij} = {\bar{g}}^{IJ}\). We can explicitly relate the mixed components of the inverse reference metric and those of the present metric \(g^i_k = \delta ^i_k\):

$$\begin{aligned} 2\left( {\bar{g}}^{-1}\right) ^i_j\epsilon ^j_k&= \left( {\bar{g}}^{-1}\right) ^i_j\left( \delta ^j_k - {\bar{g}}^j_k\right) , \nonumber \\&= \left( {\bar{g}}^{-1}\right) ^i_k - \delta ^i_k. \end{aligned}$$
(29)
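The cubic scaling in (28) can be illustrated numerically. The sketch below assumes a uniform uniaxial stretch with the transverse directions unstrained and Cartesian reference coordinates, and compares the two energies per unit reference volume; the function name and parameter values are hypothetical. For this particular example, the scaled difference tends to \(-2(\lambda + 2\mu )\) as the strain vanishes.

```python
def scaled_energy_difference(stretch, lam=1.0, mu=1.0):
    """Return (W_present - W_reference) / eps**3 for a uniaxial stretch.

    Both energies are taken per unit reference volume: the factor
    sqrt(g_bar/g) in Eq. (27) cancels against dV = sqrt(g/g_bar) dV_bar.
    """
    g_bar, g = 1.0, stretch**2        # 11 components; other directions unstrained
    eps = 0.5 * (g - g_bar)           # the only nonzero strain component, Eq. (1)
    eps_green = eps / g_bar           # index raised with the reference metric
    eps_almansi = eps / g             # index raised with the present metric
    modulus = lam + 2.0 * mu          # a single nonzero mixed component
    w_reference = 0.5 * modulus * eps_green**2
    w_present = 0.5 * modulus * eps_almansi**2
    return (w_present - w_reference) / eps**3

for stretch in (1.1, 1.01, 1.001):
    print(stretch, scaled_energy_difference(stretch))
```

That the ratio remains bounded as the stretch approaches unity is the statement that the two energy densities agree through quadratic order in strain.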

Additional examples of such approximations appear in the early shell literature. Koiter [7] invokes small-strain arguments to replace the deformed surface metric tensor with the undeformed metric tensor in the compatibility equations for strains, thus only approximately satisfying Gauss and Codazzi. He also approximates the relevant scale factor as unity in writing a quadratic energy. Leonard [21] also discusses similar substitutions of metrics for small-strain theories. Relationships between the quadratic and cubic invariants of the relevant tensors have been known since the early days of nonlinear elasticity [4, 5].

To perform the variation \(\mathbf{R}\rightarrow \mathbf{R}+ \delta \mathbf{R}\), we will need

$$\begin{aligned} \delta g^{ij}&= -g^{ik}g^{jl}\delta g_{kl}, \nonumber \\&= -\,\nabla ^i\mathbf{R}\cdot \nabla ^j\delta \mathbf{R}- \nabla ^j\mathbf{R}\cdot \nabla ^i\delta \mathbf{R}, \end{aligned}$$
(30)

where we have used \(\delta \left( g^{jl}g_{lk}\right) = 0\), the symmetry of \(g_{ij}\), and the equivalence \(\partial _i\mathbf{R}= \nabla _i\mathbf{R}\). The variation still leaves the reference metric components \({\bar{g}}_{ij}\) and the Kronecker deltas \(g^i_j = \delta ^i_j\) unchanged, but \(\delta {\bar{g}}^i_j = \delta g^{ik} {\bar{g}}_{kj}\), and similarly \(\delta {\bar{g}}^{ij} \ne 0\). As mentioned above, there is no need to consider a contribution to \(\delta E\) from the variation of the volume form \(\hbox {d}V\), as this is compensated by the prefactor \(\sqrt{{\bar{g}}/g}\) in the energy density: \(\delta \left( \hbox {d}V\sqrt{{\bar{g}}/g}\right) = 0\). Using the same tricks as before, the variation is most easily performed thus:

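$$\begin{aligned} \delta E&= \delta \int \hbox {d}V \tfrac{1}{8}\sqrt{{\bar{g}}/g}\,A^{ijkl}\left( g_{ij}-{\bar{g}}_{ij}\right) \left( g_{kl}-{\bar{g}}_{kl}\right) , \nonumber \\&= \delta \int \hbox {d}V \tfrac{1}{8}\sqrt{{\bar{g}}/g}\left[ \lambda \left( \delta ^i_i - {\bar{g}}^i_i\right) \left( \delta ^j_j - {\bar{g}}^j_j\right) + 2\mu \left( \delta ^i_j - {\bar{g}}^i_j\right) \left( \delta ^j_i - {\bar{g}}^j_i\right) \right] , \nonumber \\&= \int \hbox {d}V \tfrac{1}{2}\sqrt{{\bar{g}}/g}\,A^{ijkl}\left( g_{ij}-{\bar{g}}_{ij}\right) {\bar{g}}^m_l\, \nabla _m\mathbf{R}\cdot \nabla _k\delta \mathbf{R}, \nonumber \\&= -\int \hbox {d}V\, \nabla _k\left( \sqrt{{\bar{g}}/g}\,A^{ijkl}\epsilon _{ij}\, {\bar{g}}^m_l \nabla _m\mathbf{R}\right) \cdot \delta \mathbf{R}\, \nonumber \\&\quad \,\, + \oint \hbox {d}S\, \sqrt{{\bar{g}}/g}\, n_k A^{ijkl}\epsilon _{ij}\, {\bar{g}}^m_l \nabla _m\mathbf{R}\cdot \delta \mathbf{R}, \end{aligned}$$
(31)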

assuming a smooth boundary \(\partial V\). Here \(\hbox {d}S\) is a surface element constructed with the present metric and \(n_k = \varvec{\mathrm {\hat{n}}}\cdot \partial _k\mathbf{R}\) is the covariant component, in the present basis, of the unit normal to the boundary of the present configuration.

Thus, we define stress components

$$\begin{aligned} \Sigma ^{km}&= A^{ijkl}\epsilon _{ij} {\bar{g}}^m_l, \end{aligned}$$
(32)
$$\begin{aligned}&= A^{ijkm}\epsilon _{ij} - 2A^{ijkl}\epsilon _{ij}\epsilon ^m_l, \end{aligned}$$
(33)

such that the equations of equilibrium are

$$\begin{aligned} -\,\nabla _k\left( \sqrt{{\bar{g}}/g}\, \Sigma ^{km} \nabla _m\mathbf{R}\right) = \mathbf{{0}}, \end{aligned}$$
(34)

and free boundary conditions on \(\partial V\) are

$$\begin{aligned} \sqrt{{\bar{g}}/g}\, n_k \Sigma ^{km} \nabla _m\mathbf{R}= \mathbf{{0}}. \end{aligned}$$
(35)

A nice thing about the divergence (34) is that its projection (onto \(\nabla ^j\mathbf{R}\)) is also a divergence,

$$\begin{aligned} -\,\nabla _k\left( \sqrt{{\bar{g}}/g}\, \Sigma ^{kj}\right) = 0, \end{aligned}$$
(36)

because \(\nabla _k\nabla _m\mathbf{R}= \mathbf{{0}}\). Similarly, we have the boundary conditions

$$\begin{aligned} \sqrt{{\bar{g}}/g}\, n_k \Sigma ^{kj} = 0. \end{aligned}$$
(37)

These results are equivalent to those in [4].

Again, as only quadratic strain terms were retained in the energy, it may be sensible to ignore some of the quadratic strain terms in the definition of the stress components,

$$\begin{aligned} \Sigma ^{kl} = A^{ijkl}\epsilon _{ij} + O(\epsilon ^2). \end{aligned}$$
(38)

Had we followed the more common practice of constructing the invariants with the reference metric, or rather using the invariants of the Green strain, the only difference in our result would have been a quadratic adjustment to the stress. For now we retain, rather than approximate, the prefactor \(\sqrt{{\bar{g}}/g}\,\) appearing in front of the stress, both to maintain some similarity with the continuum mechanics literature, and because it has a sensible physical meaning connected to the conservation of mass. Also, while \(\sqrt{g}\) passes through its connection \(\nabla \), it is simplest to keep the scalar factor \(\sqrt{{\bar{g}}/g}\) intact, because \(\sqrt{{\bar{g}}}\) is a tensor density and does not behave as a scalar under covariant differentiation. That is,

$$\begin{aligned} \nabla _i\sqrt{{\bar{g}}}&= \sqrt{g}\,\nabla _i\sqrt{{\bar{g}}/g}, \nonumber \\&= \sqrt{g}\, \partial _i\sqrt{{\bar{g}}/g}, \nonumber \\&= \partial _i\sqrt{{\bar{g}}}\, - \sqrt{{\bar{g}}/g}\, \partial _i \sqrt{g}, \nonumber \\&= \partial _i\sqrt{{\bar{g}}}\, - \sqrt{{\bar{g}}}\,\Gamma ^j_{ji}, \end{aligned}$$
(39)

and therefore even a simple Cartesian reference integration measure will not have a simple present covariant derivative. It is easier to deal with \(\nabla _i\sqrt{{\bar{g}}/g} = \partial _i\sqrt{{\bar{g}}/g}\,\). Of course, it is even easier to approximate \(\sqrt{{\bar{g}}/g}\) as unity and throw away the terms of higher order in strain.

2.4 Hybrid approach: invariants of Green (or rather some other unnamed) strain in the present configuration

This approach is somewhat more traditional than that of either of the two previous sections, but requires some care in interpretation. Let us work with the present metric, but rewrite the reference energy (14) as an integral over the present configuration

$$\begin{aligned} {\bar{E}} = \int \hbox {d}V \sqrt{{\bar{g}}/g}\, \bar{\mathcal {E}}. \end{aligned}$$
(40)

We define \(\bar{\mathcal {E}} = \tfrac{1}{2}{\bar{A}}^{ijkl}\epsilon _{ij}\epsilon _{kl}\) using the same functions as in (15), the difference being that we now consider the components of the elasticity tensor as referring to the present basis, even though they are computed using the inverse reference metric. We write \({\bar{A}}^{ijkl} = \lambda \left( {\bar{g}}^{-1}\right) ^{ij}\left( {\bar{g}}^{-1}\right) ^{kl} + \mu \left( \left( {\bar{g}}^{-1}\right) ^{ik}\left( {\bar{g}}^{-1}\right) ^{jl} + \left( {\bar{g}}^{-1}\right) ^{il}\left( {\bar{g}}^{-1}\right) ^{jk}\right) \), and generate the same quantities as in Sect. 2.2. We are obliquely using the fact that we can view the same component functions in two ways, and obtain the same invariants from the two different tensors \({\bar{g}}^{IJ}\partial _I{\bar{\mathbf{R}}}\partial _J{\bar{\mathbf{R}}} \cdot 2\epsilon _{KL}\partial ^K{\bar{\mathbf{R}}}\partial ^L{\bar{\mathbf{R}}} = \mathbf{I}\cdot \left( \mathbf{C}- \mathbf{I}\right) = \mathbf{C}- \mathbf{I}\) and \(\left( {\bar{g}}^{-1}\right) ^{ij}\partial _i\mathbf{R}\partial _j\mathbf{R}\cdot 2\epsilon _{kl}\partial ^k\mathbf{R}\partial ^l\mathbf{R}= \mathbf{B}\cdot \left( \mathbf{I}- \mathbf{B}^{-1}\right) = \mathbf{B}- \mathbf{I}\). While perhaps a bit confusing at first, this seems easier to interpret than the reference metric approach in the context of incompatible elasticity, when no reference basis exists.Footnote 7 Proceeding with the variation, we have

$$\begin{aligned} \delta {\bar{E}}&= \delta \int \hbox {d}V \tfrac{1}{8}\sqrt{{\bar{g}}/g}\,{\bar{A}}^{ijkl}\left( g_{ij}-{\bar{g}}_{ij}\right) \left( g_{kl}-{\bar{g}}_{kl}\right) , \nonumber \\&= \int \hbox {d}V \tfrac{1}{2}\sqrt{{\bar{g}}/g}\,{\bar{A}}^{ijkl}\left( g_{ij}-{\bar{g}}_{ij}\right) \nabla _l\mathbf{R}\cdot \nabla _k\delta \mathbf{R}, \nonumber \\&= -\int \hbox {d}V\, \nabla _k\left( \sqrt{{\bar{g}}/g}\,{\bar{A}}^{ijkl}\epsilon _{ij} \nabla _l\mathbf{R}\right) \cdot \delta \mathbf{R}\, \nonumber \\&\quad \,\, + \oint \hbox {d}S\, \sqrt{{\bar{g}}/g}\, n_k {\bar{A}}^{ijkl}\epsilon _{ij} \nabla _l\mathbf{R}\cdot \delta \mathbf{R}. \end{aligned}$$
(41)

In similar spirit to (17), we define

$$\begin{aligned} {\bar{\Sigma }}^{kl} = {\bar{A}}^{ijkl}\epsilon _{ij}, \end{aligned}$$
(42)

and write the equations of equilibrium

$$\begin{aligned} -\,\nabla _k\left( \sqrt{{\bar{g}}/g}\,{\bar{\Sigma }}^{kl} \nabla _l\mathbf{R}\right) = \mathbf{{0}}, \end{aligned}$$
(43)

and free boundary conditions on \(\partial V\)

$$\begin{aligned} \sqrt{{\bar{g}}/g}\,n_k {\bar{\Sigma }}^{kl} \nabla _l\mathbf{R}= \mathbf{{0}} . \end{aligned}$$
(44)

Projection of the divergence in Eq. (43) onto \(\nabla ^j\mathbf{R}\) is also clearly a divergence, as in Sect. 2.3, namely

$$\begin{aligned} -\,\nabla _k\left( \sqrt{{\bar{g}}/g}\,{\bar{\Sigma }}^{kj}\right) = 0, \end{aligned}$$
(45)

which comes with boundary conditions

$$\begin{aligned} \sqrt{{\bar{g}}/g}\,n_k {\bar{\Sigma }}^{kj} = 0. \end{aligned}$$
(46)

I leave it as a tedious exercise to the reader to use identities such as \(\sqrt{g}\,\Gamma ^i_{ij}=\partial _j\sqrt{g}\) to show that, with somewhat loose interpretations of indices, the clean divergence (45) is just the non-divergence (23) multiplied by a factor of \(\sqrt{{\bar{g}}/g}\). That is, the same functions are represented by these two expressions. This relationship can be inferred from the treatment in Green and Zerna’s foundational work from 1954 [4]. Concurrent and subsequent developments in continuum mechanics led to standard definitions of stress tensors and their interrelationships. A summary of these and their connection with the present approach can be found in Appendix A, which is probably best to read now.

2.5 Remarks on geometric and material energies

A few additional remarks are in order on the distinction between geometric energies measured per unit volume and material energies measured per unit mass or reference volume. Further related points will be made in Sect. 3.2.2.

A commonly encountered geometric energy is the “soap film” energy or surface energy per unit present area (two-dimensional volume). Consider a surface \(\mathbf{X}\) with normal \(\varvec{\mathrm {\hat{N}}}\), and the energy

$$\begin{aligned} E_{\mathrm {sf}} = \int \hbox {d}A = \int \sqrt{a}\, \hbox {d}x^1\hbox {d}x^2. \end{aligned}$$
(47)

For the variation we will need [86]

$$\begin{aligned} \delta \sqrt{a}&= \frac{1}{2}\frac{\delta a}{\sqrt{a}} = \frac{1}{2}\frac{1}{\sqrt{a}}\frac{\partial a}{\partial a_{\alpha \beta }} \delta a_{\alpha \beta } = \frac{1}{2}\frac{a\, a^{\alpha \beta }}{\sqrt{a}}\delta a_{\alpha \beta } =\sqrt{a}\, \nabla ^\alpha \mathbf{X}\cdot \nabla _\alpha \delta \mathbf{X}, \end{aligned}$$
(48)

so that

$$\begin{aligned} \delta E_{\mathrm {sf}}&= \int \delta \sqrt{a}\, \hbox {d}x^1\hbox {d}x^2, \nonumber \\&= \int \hbox {d}A\, \nabla ^\alpha \mathbf{X}\cdot \nabla _\alpha \delta \mathbf{X}, \nonumber \\&= - \int \hbox {d}A\, \nabla ^2\mathbf{X}\cdot \delta \mathbf{X}\, \nonumber \\&\quad \,\, + \oint \hbox {d}L\, n_\alpha \nabla ^\alpha \mathbf{X}\cdot \delta \mathbf{X}, \end{aligned}$$
(49)

where \(\hbox {d}L\) is a boundary element. The boundary term represents a tangential force resisting extension of the surface, while the bulk term is the Laplacian of the position vector \(\nabla ^2\mathbf{X}= 2H\varvec{\mathrm {\hat{N}}}\), which seeks, for example, to collapse a sphere to a point. Note that a similar energy for a three-dimensional body would have resulted in the vanishing bulk term \(\nabla ^2\mathbf{R}=\mathbf{{0}}\).
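The determinant identity underlying (48), \(\delta \sqrt{a} = \tfrac{1}{2}\sqrt{a}\, a^{\alpha \beta }\delta a_{\alpha \beta }\), can be checked against a finite perturbation of the metric components. In the sketch below, the metric, the perturbation, and the function names are arbitrary illustrative choices.

```python
import math

def sqrt_det(a):
    """Square root of the determinant of a 2x2 symmetric metric a_{ab}."""
    return math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])

def first_variation_sqrt_a(a, da):
    """Linearized change (1/2) sqrt(a) a^{ab} da_{ab} of the area element."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    a_inv = [[a[1][1] / det, -a[0][1] / det],
             [-a[1][0] / det, a[0][0] / det]]
    contraction = sum(a_inv[i][j] * da[i][j]
                      for i in range(2) for j in range(2))
    return 0.5 * math.sqrt(det) * contraction

# Hypothetical metric and a small symmetric perturbation of size h.
h = 1e-6
a = [[2.0, 0.3], [0.3, 1.5]]
da = [[1.0 * h, 0.5 * h], [0.5 * h, -2.0 * h]]
a_new = [[a[i][j] + da[i][j] for j in range(2)] for i in range(2)]

exact_change = sqrt_det(a_new) - sqrt_det(a)
linear_change = first_variation_sqrt_a(a, da)
print(exact_change, linear_change)
```

The two numbers agree up to terms quadratic in the perturbation, which is all that the first variation requires.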

While this type of energy is appropriate for describing a soap film, whose energy is contained in its soap–air interfaces, it is not suitable for describing kinetic or elastic energies. In mathematical models of lipid membranes, the surface area is often held fixed with a constraint, while a geometric energy is used to obtain simple and clean expressions [71, 72]. In this constrained limit, the difference between geometric and material energies arising from the variation of the area form can be absorbed into a Lagrange multiplier. Consider the soap film again, and let \(\gamma \) be a multiplier. Clearly,

$$\begin{aligned} \sqrt{g}\,\delta \left[ \mathcal {E}_0 + \gamma \left( \sqrt{g} - \sqrt{{\bar{g}}} \right) \right] = \sqrt{g}\left[ \delta \mathcal {E}_0 + \gamma \nabla ^i\mathbf{R}\cdot \nabla _i\delta \mathbf{R}\right] \end{aligned}$$
(50)

and

$$\begin{aligned} \delta \left( \sqrt{g}\left[ \mathcal {E}_0 + \gamma \left( \sqrt{g} - \sqrt{{\bar{g}}} \right) \right] \right) = \sqrt{g}\left[ \delta \mathcal {E}_0 + \left( \gamma +\mathcal {E}_0\right) \nabla ^i\mathbf{R}\cdot \nabla _i\delta \mathbf{R}\right] \end{aligned}$$
(51)

are equivalent under the redefinition \(\gamma \rightarrow \gamma + \mathcal {E}_0\). For elastic sheets, Guven and Müller [87] introduced something akin to the more restrictive rigid body constraint \(\mathrm{T}^{ij}\left( \nabla _i\mathbf{R}\cdot \nabla _j\mathbf{R}- {\bar{g}}_{ij}\right) \), with multipliers \(\mathrm{T}^{ij}\) for all components of the metric. Noting that \(\nabla ^i\mathbf{R}\cdot \nabla _i\delta \mathbf{R}= g^{ij}\nabla _i\mathbf{R}\cdot \nabla _j\delta \mathbf{R}\), the appropriate redefinition in this case is

$$\begin{aligned} \mathrm{T}^{ij} \rightarrow \mathrm{T}^{ij} + \mathcal {E}_0g^{ij}. \end{aligned}$$
(52)

The reader may find it illustrative to examine equations (122–123) of [71] in this light. In Sect. 4 we will see how, in the absence of such constraints on stretching, elastic bending energies differ in important ways from geometric bending energies.

3 Reduction

In this section, I will reduce the three-dimensional energy corresponding to the Almansi or present metric approach of Sect. 2.3 to a two-dimensional energy defined on the mid-surface of a plate. I will also consider the Biot strain in addition to the strains of Sect. 2 and use the simple case of an extensible elastica or unidirectionally bent plate to demonstrate that the choice of strain measure has significant qualitative effects on the definition of bending energies.

The reduction process is not the focus of this paper, so no attempt is made at rigor. It is included here for completeness of the plate equation derivations, but also in order to raise some important issues that are likely to be far less dependent on the specific reduction process than they are on the choice of strain measure. Thus, I will follow standard, if somewhat unjustified, procedures invoking the first and second Kirchhoff–Love assumptions [12], despite the fact that these are contradictory. For a different method, see Steigmann [88, 89].

3.1 A reduced energy for a plate

This derivation is based on that of [13], which follows that of [12]. Let \(\mathbf{X}\left( \{x^\alpha \} \right) \) be a two-dimensional mid-surface of a three-dimensional plate \(\mathbf{R}\left( \{x^i\}\right) \). Recall the convention, earlier implicitly introduced, for Latin indices \(i \in \{1, 2, 3\}\) and Greek indices \(\alpha \in \{1, 2\}\). The latter coordinates are a subset of the former. The Latin letter z will denote the specific coordinate transverse to the plate and should not be confused with a free index. We write \(\mathbf{R}\left( \{x^i\}\right) = \mathbf{R}\left( \{x^\alpha \}, z \right) \) and \(\mathbf{X}\left( \{x^\alpha \} \right) = \mathbf{R}\left( \{x^\alpha \}, 0 \right) \). The goal is to express the energy as an effective field theory

$$\begin{aligned} E(\mathbf{X}) = \int \hbox {d}V \mathcal {E}(\mathbf{R}) = \int \hbox {d}A \int \limits ^{t/2}_{-t/2} \hbox {d}z\, \mathcal {E}(\mathbf{X}, z) = \int \hbox {d}A\, \mathcal {E}_{2D}(\mathbf{X}), \end{aligned}$$
(53)

where \(\hbox {d}A = \sqrt{a} \prod \nolimits _{\alpha } \hbox {d}x^\alpha \), and t is the uniform thickness of the undeformed plate. For compactness, we suppress explicit dependence on the reference configuration or metric in these expressions.

Presume a reference metric for the plate of the form

$$\begin{aligned} {\bar{g}}_{ij} = \begin{pmatrix} \left( {\bar{g}}_{\alpha \beta }\right) _{2\times 2}&{}\bigcirc \\ \bigcirc &{}1\\ \end{pmatrix} = \begin{pmatrix} \left( {\bar{a}}_{\alpha \beta }\right) _{2\times 2}&{}\bigcirc \\ \bigcirc &{}1\\ \end{pmatrix}. \end{aligned}$$
(54)

The second equality simply expresses the fact that \({\bar{g}}_{\alpha \beta }\) is not a function of z for a plate, although it would be for a generic shell. We may, but need not, think of this as coming from a rest configuration for a simple undeformed plate, \({\bar{\mathbf{R}}}\left( \{x^i\}\right) = {\bar{\mathbf{X}}}\left( \{x^\alpha \} \right) + z\varvec{\mathrm {\hat{{\bar{N}}}}}\), where the reference unit normal \(\varvec{\mathrm {\hat{{\bar{N}}}}}\) is a constant vector. If this assumption is made, we can choose a very simple reference metric such as the two-dimensional Cartesian \({\bar{a}}_{11}={\bar{a}}_{22}=1\), \({\bar{a}}_{12}={\bar{a}}_{21}=0\). In the general case of an incompatible plate [12], the three-dimensional plate does not have a reference embedding, but the mid-surface is still most likely embeddable in \({\mathbb {E}}^3\).

The first Kirchhoff–Love assumption block-diagonalizes the present metric,

$$\begin{aligned} g_{ij} = \partial _i \mathbf{R}\cdot \partial _j \mathbf{R}= \begin{pmatrix} \left( g_{\alpha \beta }\right) _{2\times 2}&{}\bigcirc \\ \bigcirc &{}g_{zz}\\ \end{pmatrix}. \end{aligned}$$
(55)

We might think of this as arising from an assumed form for the position vector \(\mathbf{R}\left( \{x^i\}\right) = \mathbf{X}\left( \{x^\alpha \} \right) + z\partial _z\mathbf{R}\left( \{x^i\}\right) \), where \(\partial _z\mathbf{R}\) is normal to \(\mathbf{X}\) and has a magnitude independent of \(\{x^\alpha \}\). In order to integrate over z, we need to rewrite the energy density in terms of \(\mathbf{X}\) and z by getting rid of the third tensor index. We can use the boundary condition (37) on the top and bottom faces of the plate to obtain \(\Sigma ^{zj} = 0\) for any j. Recall that the stress was defined in (32) as \(\Sigma ^{km} = A^{ijkl}\epsilon _{ij} {\bar{g}}^m_l\) with \(A^{ijkl} = \lambda g^{ij}g^{kl} + \mu \left( g^{ik}g^{jl} + g^{il}g^{jk}\right) \). Using \(\Sigma ^{zz}=0\) and the presumed form (54), we find \(\lambda \epsilon ^i_i + 2\mu \epsilon ^z_z = 0\), and therefore

$$\begin{aligned} \epsilon ^z_z = - \frac{\lambda }{\lambda +2\mu }\epsilon ^\alpha _\alpha . \end{aligned}$$
(56)

The recasting of \(\sqrt{{\bar{g}}/g}\) is complicated,Footnote 8 but for the moment we need only know that \(\sqrt{{\bar{g}}/g} = \sqrt{{\bar{a}}/a} \left( 1+O(\epsilon )\right) \). Using the presumed forms (54) and (55), we can say that \(\epsilon ^i_j\epsilon ^j_i = \epsilon ^\alpha _\beta \epsilon ^\beta _\alpha + \left( \epsilon ^z_z\right) ^2\) and rewrite the elastic energy (27),

$$\begin{aligned} \mathcal {E}\left( \mathbf{X},z\right) = \tfrac{1}{2}\sqrt{{\bar{a}}/a}\, \mu \left( \frac{2\lambda }{\lambda +2\mu }g^{\alpha \beta }g^{\gamma \zeta } + g^{\alpha \gamma }g^{\beta \zeta } + g^{\alpha \zeta }g^{\beta \gamma }\right) \epsilon _{\alpha \beta }\epsilon _{\gamma \zeta } + O\left( \epsilon ^3\right) . \end{aligned}$$
(57)
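The elimination of the transverse strain leading to (57) can be checked numerically. The following is only a minimal sketch: it works directly with mixed strain components, omits the common prefactor \(\tfrac{1}{2}\sqrt{{\bar{a}}/a}\), and uses arbitrary illustrative values for the Lamé parameters and the in-plane strain. The function names are hypothetical and not taken from any cited work.

```python
def double_dot(eps2d):
    """eps^a_b eps^b_a for a 2x2 mixed strain given as nested lists."""
    return sum(eps2d[i][j] * eps2d[j][i] for i in range(2) for j in range(2))

def transverse_strain(lam, mu, eps2d):
    """eps^z_z from (56), i.e. the solution of lam*eps^i_i + 2*mu*eps^z_z = 0."""
    return -lam / (lam + 2.0 * mu) * (eps2d[0][0] + eps2d[1][1])

def bulk_bracket(lam, mu, eps2d, eps_zz):
    """lam*(eps^i_i)^2 + 2*mu*eps^i_j eps^j_i for a block-diagonal strain."""
    trace3d = eps2d[0][0] + eps2d[1][1] + eps_zz
    return lam * trace3d ** 2 + 2.0 * mu * (double_dot(eps2d) + eps_zz ** 2)

def reduced_bracket(lam, mu, eps2d):
    """mu*(2*lam/(lam+2*mu)*(eps^a_a)^2 + 2*eps^a_b eps^b_a), as in (57)."""
    trace2d = eps2d[0][0] + eps2d[1][1]
    return mu * (2.0 * lam / (lam + 2.0 * mu) * trace2d ** 2
                 + 2.0 * double_dot(eps2d))

# Illustrative (assumed) values only.
lam, mu = 1.3, 0.7
eps2d = [[0.02, 0.005], [0.005, -0.01]]

eps_zz = transverse_strain(lam, mu, eps2d)
full = bulk_bracket(lam, mu, eps2d, eps_zz)
reduced = reduced_bracket(lam, mu, eps2d)
print(eps_zz, full, reduced)
```

With the transverse strain fixed by (56), the three-dimensional quadratic form and the reduced two-dimensional form agree to rounding error, which is the content of the coefficient \(\tfrac{2\lambda }{\lambda +2\mu }\) in (57).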

Now we need to express off-mid-surface terms in terms of mid-surface terms. To simplify the calculations, we invoke the second Kirchhoff–Love assumption, which comes from a (clearly incorrect) presumed deformed configuration \(\mathbf{R}\left( \{x^i\}\right) = \mathbf{X}\left( \{x^\alpha \} \right) + z\varvec{\mathrm {\hat{N}}}\left( \{x^\alpha \}\right) \), such that

$$\begin{aligned} g_{ij} = \partial _i \mathbf{R}\cdot \partial _j \mathbf{R}= \begin{pmatrix} \left( g_{\alpha \beta }\right) _{2\times 2}&{}\bigcirc \\ \bigcirc &{}1\\ \end{pmatrix}, \end{aligned}$$
(58)

where

$$\begin{aligned} g_{\alpha \beta } = a_{\alpha \beta } - 2zb_{\alpha \beta } + z^2c_{\alpha \beta } . \end{aligned}$$
(59)

We introduce \(\varepsilon _{\alpha \beta }\left( \{x^\alpha \} \right) \equiv \epsilon _{\alpha \beta }\left( \{x^\alpha \}, 0 \right) \), so that \(2\varepsilon _{\alpha \beta } = a_{\alpha \beta }-{\bar{a}}_{\alpha \beta }\) while \(2\epsilon _{\alpha \beta } = g_{\alpha \beta }-{\bar{g}}_{\alpha \beta } = g_{\alpha \beta }-{\bar{a}}_{\alpha \beta } = 2\varepsilon _{\alpha \beta } - 2zb_{\alpha \beta } + z^2c_{\alpha \beta }\). Note that surface indices are raised with the inverse surface metric, so while the bulk mixed component \(\epsilon ^\alpha _\beta = g^{\alpha \gamma }\epsilon _{\gamma \beta }\), the surface mixed component \(\varepsilon ^\alpha _\beta = a^{\alpha \gamma }\varepsilon _{\gamma \beta }\), and similarly for \(b^\alpha _\beta \) and \(c^\alpha _\beta \).

Following Flügge [45], we approximate the inverse metric by first defining \(\lambda ^\beta _\alpha \equiv \delta ^\beta _\alpha + zb^\beta _\alpha + z^2c^\beta _\alpha \) so that \(g^{\alpha \beta } = \lambda ^\alpha _\gamma \lambda ^\beta _\zeta a^{\gamma \zeta } + O\left( (zb)^3\right) \) and \(g^{\alpha \beta }g_{\beta \gamma }=\delta ^\alpha _\gamma + O\left( (zb)^3\right) \). The mixed strains are

$$\begin{aligned} \epsilon ^\alpha _\beta&= g^{\alpha \gamma }\epsilon _{\gamma \beta }, \nonumber \\&= \left( \delta ^\alpha _\zeta + zb^\alpha _\zeta + z^2c^\alpha _\zeta \right) \left( \delta ^\gamma _\eta +zb^\gamma _\eta + z^2c^\gamma _\eta \right) a^{\zeta \eta }\left( \varepsilon _{\gamma \beta } -zb_{\gamma \beta } +\tfrac{z^2}{2}c_{\gamma \beta } \right) + O\left( (zb)^3\varepsilon \right) , \end{aligned}$$
(60)
$$\begin{aligned}&= \varepsilon ^\alpha _\beta -zb^\alpha _\beta + O\left( (zb)^2, zb\varepsilon \right) . \end{aligned}$$
(61)
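The order of the error in the approximate inverse metric can be illustrated with a small numerical experiment. This is a sketch under simplifying assumptions: the mid-surface metric is taken to be the identity at the point considered, so that \(c = b^2\) as matrices, and the curvature components are arbitrary illustrative numbers. The helper names are hypothetical.

```python
def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def inverse_metric_error(b, z):
    """Largest entry of (exact g^{-1}) - (lambda lambda a^{-1}), with a = identity."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    c = matmul(b, b)  # third fundamental form when a is the identity
    g = [[identity[i][j] - 2.0 * z * b[i][j] + z * z * c[i][j]
          for j in range(2)] for i in range(2)]
    lam = [[identity[i][j] + z * b[i][j] + z * z * c[i][j]
            for j in range(2)] for i in range(2)]
    approx = matmul(lam, lam)  # lam is symmetric here, so this is lam lam^T
    exact = inverse(g)
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(2) for j in range(2))

b = [[0.8, 0.3], [0.3, -0.5]]  # assumed curvature components
err_coarse = inverse_metric_error(b, 0.02)
err_fine = inverse_metric_error(b, 0.01)
print(err_coarse, err_fine, err_coarse / err_fine)
```

Halving \(z\) reduces the error by a factor close to eight, consistent with the stated \(O\left( (zb)^3\right) \) remainder.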

Following traditional practice, we will initially keep terms of orders \((zb)^2\), \(zb\varepsilon \), and \(\varepsilon ^2\) in our quadratic strain energy, although terms odd in z will integrate to zero. However, we will revisit this choice in Sect. 3.1.1. We write

$$\begin{aligned} \mathcal {E}\left( \mathbf{X},z\right) = \tfrac{1}{2}\sqrt{{\bar{a}}/a}\, \mathcal {A}^{\alpha \beta \gamma \zeta } \left( \varepsilon _{\alpha \beta } -zb_{\alpha \beta }\right) \left( \varepsilon _{\gamma \zeta } -zb_{\gamma \zeta }\right) + O\left( (zb)^3, (zb)^2\varepsilon , zb\varepsilon ^2, \varepsilon ^3\right) , \end{aligned}$$
(62)

where \(\mathcal {A}^{\alpha \beta \gamma \zeta }= \mu \left( \frac{2\lambda }{\lambda +2\mu }a^{\alpha \beta }a^{\gamma \zeta } + a^{\alpha \gamma }a^{\beta \zeta } + a^{\alpha \zeta }a^{\beta \gamma }\right) \), and we are now forming invariants of two-dimensional surface tensors. This approximate plate energy displays a clean separation between mid-surface stretching and bending energies,

$$\begin{aligned} \mathcal {E}_{2D}(\mathbf{X})&= \int \limits ^{t/2}_{-t/2} \hbox {d}z\, \mathcal {E}(\mathbf{X}, z) \, \nonumber \\&= \tfrac{1}{2}\sqrt{{\bar{a}}/a}\, \mathcal {A}^{\alpha \beta \gamma \zeta }\left( t\, \varepsilon _{\alpha \beta }\varepsilon _{\gamma \zeta } +\tfrac{t^3}{12}\,b_{\alpha \beta }b_{\gamma \zeta } \right) + O\left( t(tb)^3, t(tb)^2\varepsilon , t(tb)\varepsilon ^2, t\varepsilon ^3\right) , \end{aligned}$$
(63)

although the energies themselves are not entirely separate because of the compatibility constraints (11) and (13) relating metric and curvature tensors. The bending contribution to the energy, arising from asymmetric off-mid-surface stretching, is

$$\begin{aligned} \sqrt{{\bar{a}}/a}\,\tfrac{1}{2}\tfrac{t^3}{12} \mathcal {A}^{\alpha \beta \gamma \zeta }b_{\alpha \beta }b_{\gamma \zeta }&= \sqrt{{\bar{a}}/a}\,\tfrac{1}{2}\tfrac{t^3}{12}\,2\mu \left( \tfrac{2\left( \lambda +\mu \right) }{\lambda +2\mu }\,4H^2 - 2K\right) , \nonumber \\&\equiv \sqrt{{\bar{a}}/a}\,\left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) , \end{aligned}$$
(64)

where we define the moduli \({\scriptstyle B_H}\equiv \tfrac{4\left( \lambda +\mu \right) }{\lambda +2\mu }\,{\scriptstyle B_K}\) and \({\scriptstyle B_K}\equiv \tfrac{t^3\mu }{6}\) for compactness of subsequent presentation. These bending terms are distinct from those in Willmore [90, 91] or Helfrich [64,65,66] energies because of the prefactor \(\sqrt{{\bar{a}}/a}\,\) that produces an elastic energy per unit mass. The consequences of this distinction, including a contribution to the Euler–Lagrange equations from the elastic Gaussian term, will be shown in Sect. 4.
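Both the through-thickness integration in (63) and the moduli in (64) lend themselves to a quick numerical check. The sketch below makes several assumptions for illustration: a one-dimensional stand-in \(\left( \varepsilon - z\kappa \right) ^2\) for the integrand, an identity mid-surface metric so that \(H = \tfrac{1}{2}b^\alpha _\alpha \) and \(K = \det b\), and arbitrary parameter values. The prefactor \(\sqrt{{\bar{a}}/a}\) is omitted and the function names are hypothetical.

```python
def integrate_through_thickness(f, t, n=2000):
    """Midpoint-rule integral of f(z) over -t/2 <= z <= t/2."""
    dz = t / n
    return sum(f(-t / 2.0 + (k + 0.5) * dz) for k in range(n)) * dz

def bending_moduli(lam, mu, t):
    """B_H and B_K as defined after (64)."""
    b_k = t ** 3 * mu / 6.0
    b_h = 4.0 * (lam + mu) / (lam + 2.0 * mu) * b_k
    return b_h, b_k

def contracted_bending(lam, mu, t, b):
    """(1/2)(t^3/12) A^{abcd} b_ab b_cd with a = identity."""
    trace = b[0][0] + b[1][1]
    bb = sum(b[i][j] * b[j][i] for i in range(2) for j in range(2))
    a_bb = mu * (2.0 * lam / (lam + 2.0 * mu) * trace ** 2 + 2.0 * bb)
    return 0.5 * t ** 3 / 12.0 * a_bb

# Illustrative (assumed) values only.
lam, mu, t = 1.3, 0.7, 0.1
eps, kappa = 0.01, 2.0
b = [[0.8, 0.3], [0.3, -0.5]]

numeric = integrate_through_thickness(lambda z: (eps - z * kappa) ** 2, t)
closed_form = t * eps ** 2 + t ** 3 / 12.0 * kappa ** 2

b_h, b_k = bending_moduli(lam, mu, t)
mean_curv = 0.5 * (b[0][0] + b[1][1])
gauss_curv = b[0][0] * b[1][1] - b[0][1] * b[1][0]
lhs = contracted_bending(lam, mu, t, b)
rhs = b_h * mean_curv ** 2 - b_k * gauss_curv
print(numeric, closed_form, lhs, rhs)
```

The term odd in \(z\) drops out of the integral, leaving the factors \(t\) and \(t^3/12\), and the contraction with \(\mathcal {A}^{\alpha \beta \gamma \zeta }\) reproduces \({\scriptstyle B_H}H^2 - {\scriptstyle B_K}K\).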

3.1.1 Extra terms in the energy

Actually, things need not be as simple as the typical derivation above implies. Let us note an oddity. The nicely decoupled form (63) of the energy is the result of an approximation of the energy rather than the equations of equilibrium. It will be seen in Sect. 4 that the variation of the bending terms, which are quadratic in curvature, will lead to terms cubic in curvature in the equations of equilibrium. These are traditionally retained in the shape equation of Helfrich lipid membranes just as they are for elastica, where indeed they need to be in order to include non-inflectional (double-well) shapes among the solutions. Meanwhile, dropped terms of order \((zb)^2\varepsilon \) in (62) would have yielded comparable terms through the variation of the strain.Footnote 9 The same would be true for an energy constructed using the reference or hybrid approaches, and something similar can be said for the difference between geometric and material energies, as will be noted later in Sect. 3.2.2.Footnote 10 Recognizing this, Dias et al. [13] retained mixed stretching–bending terms in their derivation and showed that they could be absorbed, along with other higher-order terms arising from variation of the volume form in their geometric energy, into the definition of the stress. I will show the derivation of an augmented stress that includes “extra terms” in Sect. 4.1, which can be employed later in equations of equilibrium in Sect. 4.4, but will neglect these terms as being of higher order when discussing energies in the following section. For a plate energy with even higher-order curvature and mixed terms, see [92].

Overall, something seems fishy about the expansion, and a possible reason will become apparent shortly in Sect. 3.2.1.

3.2 Different bending energies for extensible elastica

At this point, a simple example will illustrate the differences in the approaches of Sects. 2.2–2.4 when used in conjunction with the Kirchhoff–Love assumptions. For simplicity of exposition, moduli, factors of two, or similar inessentials will be neglected in the descriptions of energies.

Consider an extensible variant of the classical planar elastica, endowed with a one-dimensional bending energy on a curve \(\mathbf{X}(x^1)\) with a single material coordinate. This can be thought of as the mid-line of a two-dimensional object, or of a three-dimensional rod confined in the plane, or as a representation of the mid-surface of a unidirectionally bent plate in which the second body coordinate is ignored. The only difference in these descriptions would appear in the bending modulus. The bending energy will arise from mean curvature, as the Gaussian curvature is zero. If \(x^1\) is the arc length of \(\mathbf{X}\) in the reference configuration, then \({\bar{a}}_{11} = \left( {\bar{a}}^{-1}\right) ^{11} = \sqrt{{\bar{a}}} = 1\) and \({\bar{\Gamma }}^1_{11} = 0\). In the present configuration, let the tangents be expressed in Cartesians as \(\partial _1\mathbf{X}= \varLambda \begin{pmatrix} \cos \theta \\ \sin \theta \end{pmatrix}\), where \(\varLambda \) is the stretch and \(\theta \) is the tangential angle in the plane. Thus, \(a_{11}=\varLambda ^2\), \(a^{11}=1/\varLambda ^2\), \(\sqrt{a}=\varLambda \), and \(\Gamma ^1_{11} = \partial _1\varLambda /\varLambda \).

Regardless of which metric is used to form invariants and determine an energy, the present—not reference—covariant derivative is what appears from the expansion of the metric (59). This derivative isolates the bending (normal) term from the stretching (tangential) term when taking two derivatives of \(\mathbf{X}\). That is, while \(\partial _1\partial _1\mathbf{X}= \partial _1\nabla _1\mathbf{X}= \partial _1\varLambda \begin{pmatrix} \cos \theta \\ \sin \theta \end{pmatrix} + \varLambda \partial _1\theta \begin{pmatrix} -\sin \theta \\ \cos \theta \end{pmatrix}\), the covariant derivative \(\nabla _1\partial _1\mathbf{X}= \nabla _1\nabla _1\mathbf{X}= \partial _1\partial _1\mathbf{X}- \Gamma ^1_{11}\partial _1\mathbf{X}= \varLambda \partial _1\theta \begin{pmatrix} -\sin \theta \\ \cos \theta \end{pmatrix}\). The term \(\nabla _1\nabla _1\mathbf{X}\cdot \nabla _1\nabla _1\mathbf{X}= b_{11}b_{11}\) will be acted on by either reference or present inverse metrics depending on the approach.
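The way the present covariant derivative removes the tangential part of \(\partial _1\partial _1\mathbf{X}\) can be seen in a finite-difference sketch. The stretch and angle profiles below are hypothetical choices made only for illustration, as are the function names; the Christoffel symbol is evaluated as \(\partial _1\varLambda /\varLambda \).

```python
import math

def stretch(x):
    """Hypothetical stretch profile Lambda(x^1)."""
    return 1.0 + 0.2 * math.sin(x)

def angle(x):
    """Hypothetical tangential angle theta(x^1); its derivative is x."""
    return 0.5 * x * x

def tangent(x):
    """d_1 X = Lambda (cos theta, sin theta)."""
    return (stretch(x) * math.cos(angle(x)), stretch(x) * math.sin(angle(x)))

def covariant_second_derivative(x, h=1e-5):
    """Central-difference d_1 d_1 X minus Gamma^1_11 d_1 X."""
    t_plus, t_minus, t_mid = tangent(x + h), tangent(x - h), tangent(x)
    d2 = ((t_plus[0] - t_minus[0]) / (2.0 * h),
          (t_plus[1] - t_minus[1]) / (2.0 * h))
    gamma = (stretch(x + h) - stretch(x - h)) / (2.0 * h) / stretch(x)
    return (d2[0] - gamma * t_mid[0], d2[1] - gamma * t_mid[1])

x0 = 0.7
v = covariant_second_derivative(x0)
theta0 = angle(x0)
tangential_part = v[0] * math.cos(theta0) + v[1] * math.sin(theta0)
normal_part = -v[0] * math.sin(theta0) + v[1] * math.cos(theta0)
expected_normal = stretch(x0) * x0  # Lambda * d_1 theta
print(tangential_part, normal_part, expected_normal)
```

Up to discretization error, the tangential component vanishes and the normal component equals \(\varLambda \partial _1\theta \), so that its square is \(b_{11}b_{11}\).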

The reference metric (Green strain) and hybrid approaches of Sects. 2.2 and 2.4, respectively, have the same bending contribution to the energy, taking the form

$$\begin{aligned} \int \sqrt{{\bar{a}}}\, \hbox {d}x^1\, \left( {\bar{a}}^{-1}\right) ^{11}\nabla _1\nabla _1\mathbf{X}\cdot \left( {\bar{a}}^{-1}\right) ^{11}\nabla _1\nabla _1\mathbf{X}= \int \sqrt{{\bar{a}}}\, \hbox {d}x^1 \left( \varLambda \partial _1\theta \right) ^2. \end{aligned}$$
(65)

Consider a ring of elastic plate material, that is, a piece of material that wants to be linear but has been bent into a circle and had its ends attached, rather than a circular shell with rest curvature. For purely topological reasons, the two-dimensional ring cannot achieve its rest configuration through its thickness, but its mid-line can achieve its one-dimensional rest configuration by, for example, choosing a circle of the right radius. Now examine the reference bending energy (65). Increasing the radius of the circle will make this energy increase, as \(\partial _1\theta \) will remain the same, but \(\varLambda \) will increase. One would need to conclude that given the same amount of material in a circular configuration, larger-radius circles have greater “bending energy.” A ring formed by inextensible bending of a plate would try to relax its bending energy by contracting to a smaller radius at the expense of some compression energy. This is quite a strange definition of bending energy. Something akin to this issue was observed in the context of shells by Pezzulla et al. [50], who then counteracted its effects by employing a prefactor related to through-thickness strains, but this prefactor does not exist for plates and thus does not resolve the problem.

Instead, consider the present metric (Almansi strain) approach of Sect. 2.3, which applies the Laplacian \(\nabla ^2 = \nabla ^1\nabla _1 = a^{11}\nabla _1\nabla _1\) to \(\mathbf{X}\) to obtain

$$\begin{aligned} \nabla ^2\mathbf{X}=\frac{\partial _1\theta }{\varLambda }\begin{pmatrix} -\sin \theta \\ \cos \theta \end{pmatrix}. \end{aligned}$$
(66)

Note that \(\nabla ^2\mathbf{X}= \tfrac{1}{\sqrt{a}}\partial _1\left( \sqrt{a}a^{11}\partial _1\mathbf{X}\right) =\tfrac{1}{\varLambda }\partial _1\left( \tfrac{1}{\varLambda }\partial _1\mathbf{X}\right) = \partial _s^2\mathbf{X}\), where \(\partial _s\) is the derivative with respect to the present arc length, such that \(\partial _1=\varLambda \partial _s\). (The first step uses a form of the covariant Laplacian valid only when acting on a “surface scalar.”) The squared magnitude of this expression is \(\nabla ^2\mathbf{X}\cdot \nabla ^2\mathbf{X}= \left( \partial _s\theta \right) ^2\), where \(\partial _s\theta \) is the Frenet curvature. This is truly a geometric object, but the integral is still over the material coordinate, making the energy itself material rather than geometric. The bending contribution to the energy is

$$\begin{aligned} \int \sqrt{a}\, \sqrt{{\bar{a}}/a}\, \hbox {d}x^1\, \nabla ^2\mathbf{X}\cdot \nabla ^2\mathbf{X}= \int \sqrt{{\bar{a}}}\, \hbox {d}x^1 \left( \frac{\partial _1\theta }{\varLambda }\right) ^2. \end{aligned}$$
(67)

Now the stretch appears in the denominator rather than the numerator, and an increase in the radius of a circle leads to a smaller present bending energy (67). A ring formed by inextensible bending of a plate would try to relax its bending energy by expanding to a larger radius at the expense of some stretching energy.

Let us also consider another operation on plate material, namely taking an open arc of a circle and extending it to make a longer arc of the same radius. This will lead to no change in (67) and is thus defined as a pure stretch in the present bending energy case. This conclusion might seem intuitively reasonable at first. However, we will revisit this issue shortly below in Sect. 3.2.1 with another strain measure, using again the two operations of expansion of a circle and extension of a circular arc.

Using the present metric, the stretching energy term \(\left( a^{11}\left( a_{11}-{\bar{a}}_{11}\right) \right) ^2 = \left( \frac{\varLambda ^2-1}{\varLambda ^2}\right) ^2\) will saturate at very large stretches but increase without bound in compression, while using the reference metric, the corresponding stretching energy term \(\left( \left( {\bar{a}}^{-1}\right) ^{11}\left( a_{11}-{\bar{a}}_{11}\right) \right) ^2 = \left( \varLambda ^2-1\right) ^2\) goes up quartically with stretch but is bounded in compression. Neither model is really intended to be used beyond small to moderate strains.
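The contrasting behavior of the two stretching terms is easy to tabulate. The following sketch simply evaluates the two expressions above at a few stretches, chosen arbitrarily for illustration; moduli and numerical factors are neglected as elsewhere in this section, and the function names are hypothetical.

```python
def present_metric_stretching_term(stretch):
    """((Lambda^2 - 1)/Lambda^2)^2, built with the present inverse metric."""
    return ((stretch ** 2 - 1.0) / stretch ** 2) ** 2

def reference_metric_stretching_term(stretch):
    """(Lambda^2 - 1)^2, built with the reference inverse metric."""
    return (stretch ** 2 - 1.0) ** 2

for stretch in (0.01, 0.5, 1.0, 2.0, 10.0, 1000.0):
    print(stretch,
          present_metric_stretching_term(stretch),
          reference_metric_stretching_term(stretch))
```

The present-metric term approaches unity as \(\varLambda \rightarrow \infty \) and diverges as \(\varLambda \rightarrow 0\), whereas the reference-metric term is bounded by unity in compression and grows without bound in tension.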

3.2.1 Other strain measures, particularly Biot

An important note is in order on treatments of extensible rods and sheets in the mechanics and physics literatures. A direct or Cosserat approach to such structures [27, 59, 93, 94] views them as low-dimensional bodies with phenomenological stretching, bending, and twisting energies or constitutive relations, without consideration of the higher-dimensional origins of such terms. Instead, simple kinematical variables and constitutive laws are postulated. Neither of the quadratic Green or Almansi bulk energies considered thus far corresponds to the simplest direct theory of rods.

Antman [95], Reissner [96], and Whitman and DeSilva [97] formulated equilibrium equations for rods using generalized strain variables, one of which is \(\partial _1\theta \). Huddleston [98] used similar variables, while Green and Laws [93] used \(\varLambda \partial _1\theta \), and other contemporaries developed different measures and approaches [99,100,101]. Adopting Reissner’s and Whitman and DeSilva’s simple assumption of a linear relation between moment and \(\partial _1\theta \) preserves the structure of the equations of equilibrium of the elastica, with the inextensible arc length derivative replaced by an extensible material derivative. Antman, without proposing any particular constitutive relation, favors the kinematical variable \(\partial _1\theta \) because it is in some sense a measure of curvature that does not include changes arising from dilation (see page 98 of [102]), such as expansion of a circle, which he sees as a “pure extension” of an initially curved rod [95].

A Reissner-like model will not arise if one performs a reduction of either Green or Almansi energies as in the preceding Sect. 3.2. An important, but apparently little known and frequently re-discovered, fact is that this model can arise from a reduction if the Biot strain \(\varLambda -1\), rather than the Green strain \(\tfrac{1}{2}\left( \varLambda ^2-1\right) \), is used [76,77,78,79]. The Biot strain is equivalent to the Green strain for small strains, \(\varLambda \approx 1\), and corresponds to bending terms of the form \(\left( \partial _1\theta \right) ^2\). Such a bending energy does not change when a circle is expanded to change its radius, in keeping with Antman’s preferred definition of a pure stretch, but increases when a circular arc is extended at fixed radius. Oshri and Diamant [81] also prefer this form for reasons similar to those of the early rod mechanicians, because it “decouples” the bending terms, so that one defines a moment linearly dependent on the kinematic bending variable and independent of stretch. While they consider pure bending and recognize that the Green energy [12] produces a bending energy dependent on stretch, they do not notice that this dependence is counterintuitive and do not consider extension of a bent arc. Irschik and Gerstmayr [79] explicitly show the nonlinear relationship between moment and stretch and bending variables corresponding to a reduction from Green strains.

One can also consider strains corresponding to the use of the present basis, which would provide Almansi \(\frac{1}{2}\frac{\varLambda ^2-1}{\varLambda ^2}\) and Swainger \(\frac{\varLambda -1}{\varLambda }\) strains as counterparts to the Green and Biot strains, respectively. Unlike the Green and Almansi cases, we do not currently have a covariant expression for Biot and Swainger energies in terms of derivatives for the general case. Oshri and Diamant [81] have formulated a non-covariant two-dimensional theory for axisymmetrically deformed plates that employs Biot strains. I will not attempt here to formulate a general covariant three-dimensional description in terms of Biot or Swainger strain, but one can infer that the Swainger bending energy will be qualitatively similar to the Almansi description, in which extending an arc with fixed radius is a pure stretch, whereas increasing a circle’s radius involves both stretching and bending. To illustrate this line of thinking, consider constructing a scalar energy in one dimension from a variable such as \(\left( \sqrt{a}-\sqrt{{\bar{a}}}\right) \), a tensor density equivalent to a component of the Biot strain. Curiously, the first work of which I am aware that demonstrates the Biot–Reissner correspondence also expresses the stretch in these terms [76], a trick that unfortunately does not generalize to higher dimensions. We can use other tensor densities to construct an invariant energy. In particular, consider the terms \(\left( \frac{1}{\sqrt{a}}\left( \sqrt{a}-\sqrt{{\bar{a}}}\right) \right) ^2 = \left( \frac{\varLambda -1}{\varLambda }\right) ^2\) and \(\left( \frac{1}{\sqrt{{\bar{a}}}}\left( \sqrt{a}-\sqrt{{\bar{a}}}\right) \right) ^2 = \left( \varLambda -1\right) ^2\) constructed using the present and reference integration measures, respectively. 
These Swainger (present) and Biot (reference) energy terms have the same bounded or unbounded character in tension and compression as the analogous quartic-in-stretch Almansi and Green energies, but are quadratic in stretch. A preference for blow-up in compression rather than tension to describe real materials, something that happens naturally when using the present basis, led Magnusson et al. [78] to contrive a more complicated nonlinear constitutive relation than that corresponding to Biot energy. In another publication [15], my co-author and I confirm that a Swainger energy leads to the same \(\left( \frac{\partial _1\theta }{\varLambda }\right) ^2\) bending term as that obtained from the Almansi approach.

Let us summarize here the behavior of bending energies, derived from quadratic bulk energies, associated with simple deformations of a plate. Using the Green strain energy ([12] and its descendants), both expanding a circle and extending a circular arc increase the “bending energy,” the former being a counterintuitive definition. Using the Biot strain energy ([79] and others) leads to a Reissner-like model in which expanding a circle is a pure stretch that preserves “bending energy,” while extending an arc increases it. Using the Almansi or Swainger [15] strain energies (Sect. 2.3), expanding a circle decreases the “bending energy,” while extending an arc is a pure stretch that preserves it. Of course, all of these are constitutive assumptions so, intuitive or not, cannot be right or wrong except as descriptions of a particular material’s experimentally determined behavior.
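The summary above can be reproduced with a few lines of arithmetic. The sketch below assumes uniform stretch, a reference arc length coordinate (\(\sqrt{{\bar{a}}}=1\)), and neglected moduli, so that each total bending energy is the rest length times a density: \(\left( \varLambda \partial _1\theta \right) ^2\) as in (65), \(\left( \partial _1\theta \right) ^2\) for the Biot case, and \(\left( \partial _1\theta /\varLambda \right) ^2\) as in (67). The names and numerical values are hypothetical.

```python
import math

BENDING_DENSITIES = {
    "Green": lambda stretch, dtheta: (stretch * dtheta) ** 2,
    "Biot": lambda stretch, dtheta: dtheta ** 2,
    "Almansi": lambda stretch, dtheta: (dtheta / stretch) ** 2,
}

def ring(radius, rest_length):
    """Closed ring: total turning 2*pi spread over the material length."""
    stretch = 2.0 * math.pi * radius / rest_length
    dtheta = 2.0 * math.pi / rest_length
    return stretch, dtheta

def arc(radius, stretch):
    """Open arc of fixed radius: d_s theta = 1/radius, d_1 theta = stretch/radius."""
    return stretch, stretch / radius

def bending_energy(name, state, rest_length):
    """Rest length times the uniform bending density for the named measure."""
    stretch, dtheta = state
    return rest_length * BENDING_DENSITIES[name](stretch, dtheta)

ring_length = 2.0 * math.pi  # ring is unstretched at unit radius
arc_length = 1.0
for name in BENDING_DENSITIES:
    expand = (bending_energy(name, ring(1.0, ring_length), ring_length),
              bending_energy(name, ring(1.2, ring_length), ring_length))
    extend = (bending_energy(name, arc(1.0, 1.0), arc_length),
              bending_energy(name, arc(1.0, 1.2), arc_length))
    print(name, "expand circle:", expand, "extend arc:", extend)
```

Expanding the circle raises the Green value, leaves the Biot value unchanged, and lowers the Almansi value, while extending the arc raises the Green and Biot values and leaves the Almansi value unchanged.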

However, the simplicity of the Biot result suggests that a more appropriate expansion around the mid-surface is not in terms of Green strain or similar variables, but in something akin to Biot strain. Some early workers sought to compare effective field theories and direct approaches through expansions of the position vector [25,26,27,28, 103]. The use of a strain linear in displacement derivatives might even resolve the issue of inconsistency between approximating the energy and approximating the equations of equilibrium discussed in Sect. 3.1.1. The choice of Green/Almansi components in models recently employed by the soft condensed matter community is likely driven both by the prominence of Green strain as the prototypical strain measure and by the ease with which such expansions can be constructed in terms of derivatives of position. However, the use of linearized strain measures based on linear springs and linear or nonlinear hinges in models of low-dimensional structures is also a popular approach in the same community. This conceptualization of a sheet as a continuum limit of a spring network, rather than a two-dimensional limit of a three-dimensional bulk elastic body, reflects the different origins of stretching and bending elasticity for molecular mesostructures and elastic solids and is likely the original driver behind Oshri and Diamant’s use of Biot strains. An influential paper by Seung and Nelson [104] introduced a computational model for elastic sheets that has been used to explore the nature of crumpling singularities [105], among other things. They define a linearized stretching energy that is effectively Biot in nature. Their discretization of bending energy is nonlinear and is intended to represent the continuum per area Helfrich energy, although the distinction between per mass, as naturally arises in the discretization, and per area, as treated in the theory, does not seem to be acknowledged. 
This is perhaps consistent with the Landau–Lifshitzian neglect of in-plane nonlinearity [6] in their theoretical discussion. There is still disagreement on the correct bending energy for such discrete structures [106, 107], which sometimes employ linear hinges. A theory with linear springs and linear hinges was applied to approximate extensible elastica by Oshri and Diamant [80], who derive a bending term of the form \(\left( \partial _1\theta \right) ^2\), like that of the Reissner model. In a study of buckling of axisymmetric shells, Knoche and Kierfeld [108] use Biot-like strains and an equivalent measure for bending strains to obtain simple constitutive relations.

3.2.2 Further remarks on geometric and material energies: elasticity is not geometry

While the quantity \(\frac{\partial _1\theta }{\varLambda }\) in (67) is the curvature \(\partial _s\theta \), the integral is distinct from the geometric integral that one would obtain by integrating an energy per length (one-dimensional volume). For example, let the reference and present configurations of a body be two concentric circles, both parameterized by the angle \(\theta \). Let the reference configuration be a circle of radius unity, such that the reference arc length coincides with this angle, \(x^1=\theta \), so that \({\bar{\mathbf{X}}}\left( \theta \right) = \begin{pmatrix} \cos \theta \\ \sin \theta \end{pmatrix}\) and the reference metric \({\bar{a}}_{\theta \theta } = \partial _\theta {\bar{\mathbf{X}}}\cdot \partial _\theta {\bar{\mathbf{X}}} = 1\). Let the present configuration be a circle of radius r given by \(\mathbf{X}\left( \theta \right) = r \begin{pmatrix} \cos \theta \\ \sin \theta \end{pmatrix}\) with present metric \(a_{\theta \theta } = \partial _\theta \mathbf{X}\cdot \partial _\theta \mathbf{X}= r^2\). The integral

$$\begin{aligned} \int \limits _0^{2\pi }\sqrt{{\bar{a}}}\,\hbox {d}\theta \, H^2&= \int \limits _0^{2\pi } 1\,\hbox {d}\theta \, \frac{1}{r^2} = \frac{2\pi }{r^2}, \end{aligned}$$
(68)

akin to (67), is clearly different from the integral

$$\begin{aligned} \int \limits _0^{2\pi }\sqrt{a}\,\hbox {d}\theta \, H^2&= \int \limits _0^{2\pi } r \,\hbox {d}\theta \, \frac{1}{r^2} = \frac{2\pi }{r}. \end{aligned}$$
(69)

The significance for surfaces is even more qualitatively striking. For a sphere, the geometric integral \(\int H^2 \hbox {d}A\) is independent of radius, while \(\int H^2 \hbox {d}{\bar{A}}\) is not. Another geometric quadratic bending energy term would be \(\int K \hbox {d}A\,\), which by the Gauss–Bonnet theorem is a combination of boundary terms and a topological invariant. But the elastic energy \(\int K \hbox {d}{\bar{A}}\) is not, with consequences that will be shown in the next section. Some of the elegant transformations, including exploitation of conformal invariance [109,110,111], that are possible with geometric energies are not possible with elastic energies.
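The radius dependence of these integrals can be made concrete with a short calculation. The sketch below evaluates the closed forms for a circle, following (68) and (69), and for a sphere. Taking the reference configuration of the sphere to be the unit sphere is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def circle_integrals(radius):
    """Curvature-squared integrals for a circle with the unit circle as reference."""
    curvature_sq = 1.0 / radius ** 2
    material = 2.0 * math.pi * 1.0 * curvature_sq       # reference measure, (68)
    geometric = 2.0 * math.pi * radius * curvature_sq   # present measure, (69)
    return material, geometric

def sphere_integrals(radius):
    """H^2 integrals for a sphere, assuming the unit sphere as reference."""
    mean_curvature_sq = 1.0 / radius ** 2
    material = 4.0 * math.pi * 1.0 * mean_curvature_sq
    geometric = 4.0 * math.pi * radius ** 2 * mean_curvature_sq
    return material, geometric

for radius in (1.0, 2.0, 3.0):
    print(radius, circle_integrals(radius), sphere_integrals(radius))
```

For the sphere, the geometric value is \(4\pi \) at every radius, whereas the material value decays as \(1/r^2\).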

4 Plates

In this section, I will perform the variation \(\delta E = \delta \int \hbox {d}A\, \mathcal {E}_{2D}\,\), with the density \(\mathcal {E}_{2D}\) defined in (63), piece by piece, with steps shown explicitly and quantities written using derivatives of position. While this particular form of the derivation is, to my knowledge, unique to this paper, the vector equations and boundary conditions are implicitly constructable from published results in component form [68,69,70], and other compact vector forms can be found in the literature [9,10,11, 74, 75]. Aside from notational compactness, vector approaches naturally give rise to different boundary conditions than those obtained by breaking the vector variation into normal and tangential components, a point discussed elsewhere by Steigmann [74].

First, we will require another basic tool. As mentioned in Sect. 2.2, the variation passes through the partial derivative, and thus through the first covariant derivative acting on \(\mathbf{X}\), but not subsequent covariant derivatives as are found in the curvature components. Recall the variations of the metric and inverse metric,

$$\begin{aligned} \delta a_{\alpha \beta }&= \nabla _\alpha \delta \mathbf{X}\cdot \nabla _\beta \mathbf{X}+ \nabla _\alpha \mathbf{X}\cdot \nabla _\beta \delta \mathbf{X}, \end{aligned}$$
(70)
$$\begin{aligned} \delta a^{\alpha \beta }&= -\,\nabla ^\alpha \mathbf{X}\cdot \nabla ^\beta \delta \mathbf{X}- \nabla ^\beta \mathbf{X}\cdot \nabla ^\alpha \delta \mathbf{X}. \end{aligned}$$
(71)

The variation of the curvature invariants or components uses the fact that \(\delta \left( \nabla _\beta \nabla _\alpha \mathbf{X}\right) = \delta \left( \partial _\beta \nabla _\alpha \mathbf{X}- \Gamma ^\gamma _{\beta \alpha }\nabla _\gamma \mathbf{X}\right) = \partial _\beta \nabla _\alpha \delta \mathbf{X}- \delta \left( \Gamma ^\gamma _{\beta \alpha }\right) \nabla _\gamma \mathbf{X}- \Gamma ^\gamma _{\beta \alpha }\nabla _\gamma \delta \mathbf{X}= \nabla _\beta \nabla _\alpha \delta \mathbf{X}- \delta \left( \Gamma ^\gamma _{\beta \alpha }\right) \nabla _\gamma \mathbf{X}\). This, along with \(\delta \varvec{\mathrm {\hat{N}}}\cdot \varvec{\mathrm {\hat{N}}}=0\), leads to the curious result that

$$\begin{aligned} \delta b_{\alpha \beta }&= \delta \left( \nabla _\beta \nabla _\alpha \mathbf{X}\right) \cdot \varvec{\mathrm {\hat{N}}} + \nabla _\beta \nabla _\alpha \mathbf{X}\cdot \delta \varvec{\mathrm {\hat{N}}}, \nonumber \\&= \nabla _\beta \nabla _\alpha \delta \mathbf{X}\cdot \varvec{\mathrm {\hat{N}}}, \end{aligned}$$
(72)

despite the variation not passing through both covariant derivatives. This may be used to write

$$\begin{aligned} \nabla ^2\mathbf{X}\cdot \delta \left( \nabla _\beta \nabla _\alpha \mathbf{X}\right) = \nabla ^2\mathbf{X}\cdot \nabla _\beta \nabla _\alpha \delta \mathbf{X}. \end{aligned}$$
(73)
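The result (72) can be checked numerically. The Python sketch below compares a central finite difference of \(b_{\alpha \beta }\) along a one-parameter family \(\mathbf{X}+ \epsilon \mathbf{W}\) with the expression \(\nabla _\beta \nabla _\alpha \mathbf{W}\cdot \varvec{\mathrm {\hat{N}}}\), where the second covariant derivative is built from partial derivatives and Christoffel symbols. The test surface, the perturbation `W`, the sample point, and all function names are hypothetical choices made only for this illustration.

```python
import math

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

def cross(p, q):
    return (p[1]*q[2] - p[2]*q[1], p[2]*q[0] - p[0]*q[2], p[0]*q[1] - p[1]*q[0])

def sub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def unit_normal(d1):
    n = cross(d1[0], d1[1])
    length = math.sqrt(dot(n, n))
    return tuple(c / length for c in n)

def derivs(u, v, eps):
    """Partial derivatives of X + eps*W for a hypothetical test surface.

    X = (u, v, u^2 - u v + v^3/3),  W = (u v, sin u, u^2 v).
    """
    d1 = ((1 + eps*v, eps*math.cos(u), 2*u - v + 2*eps*u*v),
          (eps*u, 1.0, -u + v*v + eps*u*u))
    d_uu = (0.0, -eps*math.sin(u), 2 + 2*eps*v)
    d_uv = (eps, 0.0, -1 + 2*eps*u)
    d_vv = (0.0, 0.0, 2*v)
    return d1, ((d_uu, d_uv), (d_uv, d_vv))

def b(u, v, eps):
    """Curvature components b_ij = (second partials) . N of the perturbed surface."""
    d1, d2 = derivs(u, v, eps)
    N = unit_normal(d1)
    return [[dot(d2[i][j], N) for j in range(2)] for i in range(2)]

def delta_b_numeric(u, v, h=1e-5):
    """Central finite difference of b_ij with respect to eps at eps = 0."""
    bp, bm = b(u, v, h), b(u, v, -h)
    return [[(bp[i][j] - bm[i][j]) / (2*h) for j in range(2)] for i in range(2)]

def delta_b_formula(u, v):
    """Right-hand side of (72): (second covariant derivative of W) . N."""
    d1X, d2X = derivs(u, v, 0.0)
    d1T, d2T = derivs(u, v, 1.0)
    # Derivatives of W follow from linearity of derivs() in eps.
    d1W = [sub(d1T[i], d1X[i]) for i in range(2)]
    d2W = [[sub(d2T[i][j], d2X[i][j]) for j in range(2)] for i in range(2)]
    a = [[dot(d1X[i], d1X[j]) for j in range(2)] for i in range(2)]
    det = a[0][0]*a[1][1] - a[0][1]*a[1][0]
    a_inv = [[a[1][1]/det, -a[0][1]/det], [-a[1][0]/det, a[0][0]/det]]
    N = unit_normal(d1X)
    out = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            cov = d2W[i][j]
            for g in range(2):
                # Christoffel symbol Gamma^g_{ij} = a^{gd} X_{,ij} . X_{,d}
                gamma = sum(a_inv[g][d] * dot(d2X[i][j], d1X[d]) for d in range(2))
                cov = sub(cov, tuple(gamma * w for w in d1W[g]))
            out[i][j] = dot(cov, N)
    return out

num, pred = delta_b_numeric(0.3, -0.7), delta_b_formula(0.3, -0.7)
print(max(abs(num[i][j] - pred[i][j]) for i in range(2) for j in range(2)))
```

The Christoffel contribution is nonzero at the chosen point, so the agreement is a test of the full covariant expression rather than of the partial derivatives alone.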

Note that in a recent work on lipid membrane bending elasticity, Capovilla [112] assumes a commutation (vanishing Lie bracket) between his variation and covariant derivative that allows for a compact general expression for Euler–Lagrange equations of energies depending on second covariant derivatives. This is a restriction on the variation that is apparently allowable for fluid membranes, in which any in-plane strains can be compensated by a reparameterization. Although results on variation of curvature invariants probably do not depend on this moot point, his equation (41) is not general and cannot be applied in the context of solid elastic membranes.

4.1 Stretching or constraint

The stretching term is \(\tfrac{t}{2}\sqrt{{\bar{a}}/a}\, \mathcal {A}^{\alpha \beta \gamma \zeta } \varepsilon _{\alpha \beta }\varepsilon _{\gamma \zeta }\). Following an identical procedure to that of Sect. 2.3, we can define stress components

$$\begin{aligned} \varsigma ^{\gamma \eta }&= t \mathcal {A}^{\alpha \beta \gamma \zeta } \varepsilon _{\alpha \beta } {\bar{a}}^\eta _\zeta , \end{aligned}$$
(74)
$$\begin{aligned}&= t \mathcal {A}^{\alpha \beta \gamma \eta } \varepsilon _{\alpha \beta } + O(\epsilon ^2), \end{aligned}$$
(75)

such that the term in the bulk A is

$$\begin{aligned} -\,\nabla _\gamma \left( \sqrt{{\bar{a}}/a}\, \varsigma ^{\gamma \eta } \nabla _\eta \mathbf{X}\right) , \end{aligned}$$
(76)

and the term on the boundary \(\partial A\) is

$$\begin{aligned} \sqrt{{\bar{a}}/a}\, n_\gamma \varsigma ^{\gamma \eta } \nabla _\eta \mathbf{X}. \end{aligned}$$
(77)

Terms of exactly the same form are obtained if we define \( \varsigma ^{\gamma \eta }\) as a tensor multiplier constraining the metric of the mid-surface [87] and, in lieu of a stretching energy, add a constraint term of the form \(\tfrac{1}{2}\varsigma ^{\gamma \eta }\left( a_{\gamma \eta }-{\bar{a}}_{\gamma \eta }\right) \) to a bending energy \(\mathcal {E}_0\). This constraint restricts the deformations to isometric deformations of the mid-surface. In this limit, \(\sqrt{a}=\sqrt{{\bar{a}}}\) and use of one or the other quantity will only redefine the multipliers in a manner akin to Eq. (52) in Sect. 2.5,

$$\begin{aligned} \varsigma ^{\alpha \beta } \rightarrow \varsigma ^{\alpha \beta } + \mathcal {E}_0a^{\alpha \beta }. \end{aligned}$$
(78)

4.1.1 Extra terms in the stress

Let us briefly see what would happen if we had retained extra terms, so as to approximate the equations of equilibrium rather than the energy, as discussed in Sect. 3.1.1. Returning to the expression for the mixed strains (60), and using the symmetry of the second and third fundamental forms,

$$\begin{aligned} \epsilon ^\alpha _\beta&= \varepsilon ^\alpha _\beta -zb^\alpha _\beta + 2zb^{\gamma \alpha }\varepsilon _{\gamma \beta } - \tfrac{3z^2}{2}c^\alpha _\beta + O\left( (zb)^3, (zb)^2\varepsilon \right) . \end{aligned}$$
(79)

The strain energy density is, using the symmetries of the elastic tensor,

$$\begin{aligned} \mathcal {E}\left( \mathbf{X},z\right)&= \tfrac{1}{2}\sqrt{{\bar{a}}/a}\, \mathcal {A}^{\alpha \beta \gamma \zeta } \left[ \left( \varepsilon _{\alpha \beta } -zb_{\alpha \beta }\right) \left( \varepsilon _{\gamma \zeta } -zb_{\gamma \zeta }\right) -3z^2c_{\alpha \beta }\varepsilon _{\gamma \zeta } - 4z^2b_{\alpha \beta } b^\eta _\gamma \varepsilon _{\eta \zeta } \right] \nonumber \\&\qquad + \mathrm {odd\,} (zb)^3\, \mathrm {terms} + O\left( (zb)^4, (zb)^3\varepsilon , (zb)^2\varepsilon ^2, \varepsilon ^3\right) , \end{aligned}$$
(80)

where the odd terms will be integrated away. We end up with

$$\begin{aligned} \mathcal {E}_{2D}(\mathbf{X})&= \tfrac{1}{2}\sqrt{{\bar{a}}/a}\, \mathcal {A}^{\alpha \beta \gamma \zeta }\left[ \left( t\,\varepsilon _{\alpha \beta } - \tfrac{t^3}{4}c_{\alpha \beta }\right) \varepsilon _{\gamma \zeta } - \tfrac{t^3}{3}b_{\alpha \beta } b^\eta _\gamma \varepsilon _{\eta \zeta } +\tfrac{t^3}{12}\,b_{\alpha \beta }b_{\gamma \zeta } \right] \nonumber \\&\qquad + O\left( t(tb)^4, t(tb)^3\varepsilon , t(tb)^2\varepsilon ^2, t\varepsilon ^3\right) , \end{aligned}$$
(81)

which contains, alongside the familiar stretching and bending energies, two extra terms linear in the mid-surface strain. Variation of the curvatures in these terms will lead to mixed higher-order terms, which we will neglect in the equations of equilibrium. However, variation of the strain in these terms will lead to terms that combine with the variation of the stretching energy in an augmented stress

(82)
(83)

The extra terms in this stress contribute terms to the equations of equilibrium similar to those arising from the variation of \(H^2\). While it has been preferable to neglect smaller terms in our prior discussion of energies, their variational offspring are not really “smaller” than other terms we consider important. We will return to this issue in Sect. 4.4 in an attempt to approximate the equations of equilibrium. Note that while some of these terms would not arise when using the reference metric instead of the present metric, some such terms would indeed remain, so these extra complications are relevant to either approach. The difference between expression (83) and its equivalent in [13] comes from not varying the volume form, or using its expansion off of the mid-surface, in the definition of a material energy.

4.2 Mean curvature

The squared mean curvature term is \(\sqrt{{\bar{a}}/a}\,{\scriptstyle B_H}H^2\), with \(H^2\) given by (9). Recall that \(\delta \left( \hbox {d}A\sqrt{{\bar{a}}/a}\right) = 0\). We will use the trick (73) and the symmetry of \(\nabla _\alpha \nabla _\beta \mathbf{X}\). Consider

$$\begin{aligned}&\delta \int \hbox {d}A\, \sqrt{{\bar{a}}/a}\, \tfrac{1}{4}\nabla ^2\mathbf{X}\cdot \nabla ^2\mathbf{X}\nonumber \\&= \int \hbox {d}A \sqrt{{\bar{a}}/a}\, \tfrac{1}{2} \nabla ^2\mathbf{X}\cdot \left[ a^{\alpha \beta }\delta \left( \nabla _\alpha \nabla _\beta \mathbf{X}\right) + \delta a^{\alpha \beta }\nabla _\alpha \nabla _\beta \mathbf{X}\right] , \nonumber \\&= \int \hbox {d}A\, \sqrt{{\bar{a}}/a}\, \left[ \tfrac{1}{2}\nabla ^2\mathbf{X}\cdot \nabla ^2\delta \mathbf{X}- \nabla ^2\mathbf{X}\cdot \nabla _\alpha \nabla _\beta \mathbf{X}\nabla ^\beta \mathbf{X}\cdot \nabla ^\alpha \delta \mathbf{X}\right] , \nonumber \\&= \int \hbox {d}A \left[ \nabla ^2\left( \sqrt{{\bar{a}}/a}\,\tfrac{1}{2}\nabla ^2\mathbf{X}\right) + \nabla ^\alpha \left( \sqrt{{\bar{a}}/a}\, \nabla ^2\mathbf{X}\cdot \nabla _\alpha \nabla _\beta \mathbf{X}\nabla ^\beta \mathbf{X}\right) \right] \cdot \delta \mathbf{X} \end{aligned}$$
(84)
$$\begin{aligned}&\quad \,\, + \oint \hbox {d}L\, n_\alpha \left[ \sqrt{{\bar{a}}/a}\,\tfrac{1}{2}\nabla ^2\mathbf{X}\cdot \nabla ^\alpha \delta \mathbf{X}- \nabla ^\alpha \left( \sqrt{{\bar{a}}/a}\,\tfrac{1}{2}\nabla ^2\mathbf{X}\right) \cdot \delta \mathbf{X}\right. \nonumber \\&\quad \,\, \left. -\, \sqrt{{\bar{a}}/a}\,\nabla ^2\mathbf{X}\cdot \nabla ^\alpha \nabla _\beta \mathbf{X}\nabla ^\beta \mathbf{X}\cdot \delta \mathbf{X}\right] , \nonumber \\&= \int \hbox {d}A \left[ \nabla ^2\left( \sqrt{{\bar{a}}/a}\,H\right) \varvec{\mathrm {\hat{N}}} - \sqrt{{\bar{a}}/a}\, H\nabla ^2\varvec{\mathrm {\hat{N}}} \right] \cdot \delta \mathbf{X}\nonumber \\&\quad \,\, - \oint \hbox {d}L\, n_\alpha \nabla ^\alpha \left( \sqrt{{\bar{a}}/a}\, H \right) \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X} \end{aligned}$$
(85)
$$\begin{aligned}&\quad \,\, + \oint \hbox {d}L\, \sqrt{{\bar{a}}/a}\, H n_\alpha \nabla ^\alpha \left( \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\right) \, , \end{aligned}$$
(86)
$$\begin{aligned}&= \int \hbox {d}A \left[ \nabla ^2\left( \sqrt{{\bar{a}}/a}\,H\right) \varvec{\mathrm {\hat{N}}} - \sqrt{{\bar{a}}/a}\, H\nabla ^2\varvec{\mathrm {\hat{N}}} \right] \cdot \delta \mathbf{X}, \end{aligned}$$
(87)
$$\begin{aligned}&\quad \,\, + \oint \hbox {d}L \left[ \sqrt{{\bar{a}}/a}\, H n_\alpha \nabla ^\alpha \varvec{\mathrm {\hat{N}}} -n_\alpha \nabla ^\alpha \left( \sqrt{{\bar{a}}/a}\, H \right) \varvec{\mathrm {\hat{N}}} \right] \cdot \delta \mathbf{X} \end{aligned}$$
(88)
$$\begin{aligned}&\quad \,\, + \oint \hbox {d}L\, \sqrt{{\bar{a}}/a}\, H \varvec{\mathrm {\hat{N}}} \cdot n_\alpha \nabla ^\alpha \delta \mathbf{X}, \end{aligned}$$
(89)

where \(n_\alpha \nabla ^\alpha \) is the projection of the covariant derivative normal to the boundary. Smooth boundaries have been assumed, although no corner terms would arise from this energy anyway. The two boundary terms correspond to forces and moments, the latter arising from the derivative normal to the boundary, and are in accordance with Steigmann’s separation of variations [74]. Had we instead kept \(\nabla ^\alpha \left( \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\right) \) intact on the boundary, as in lines (85–86), we would have obtained boundary terms like those [7, 13, 69, 113] that arise when variations are broken into tangential and normal components, or akin to geometrically linearized Föppl–von Kármán approaches such as that in [6]. While these provide a simpler force boundary condition, they implicitly view \(\nabla ^\alpha \left( \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\right) \) as somehow independent of \(\delta \mathbf{X}\).

We can rewrite the bulk term (87) using Gauss–Weingarten (56) and Codazzi (11) to evaluate the Laplacian of the normal,

$$\begin{aligned}&\left[ \nabla ^2\left( \sqrt{{\bar{a}}/a}\,H\right) + \sqrt{{\bar{a}}/a}\,2H\left( 2H^2 - K\right) \right] \varvec{\mathrm {\hat{N}}} + \sqrt{{\bar{a}}/a}\,\nabla _\alpha \left( H^2\right) \nabla ^\alpha \mathbf{X}, \end{aligned}$$
(90)
$$\begin{aligned}&= \left[ \nabla ^2\left( \sqrt{{\bar{a}}/a}\,H\right) + \sqrt{{\bar{a}}/a}\,2H\left( H^2 - K\right) \right] \varvec{\mathrm {\hat{N}}} + \sqrt{{\bar{a}}/a}\,\nabla _\alpha \left( H^2\nabla ^\alpha \mathbf{X}\right) , \end{aligned}$$
(91)

where the first line separates normal and tangential components, but the second line separates a Helfrich-like term from another term arising because of the lack of variation of the area form in the elastic integral.
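A simple consistency check of the bulk term (90) is provided by a sphere. The Python sketch below assumes a sphere of present radius `r` obtained by uniform dilation from a reference sphere of radius `r_bar`, a unit bending modulus, and an inward normal so that \(\nabla ^2\mathbf{X}= 2H\varvec{\mathrm {\hat{N}}}\) with \(H = 1/r\); these choices and the function names are assumptions made for illustration. It compares the derivative of the energy \(\int \hbox {d}{\bar{A}}\, H^2\) with respect to `r` against the bulk term integrated against a unit outward radial variation.

```python
import math

def elastic_H2_energy(r, r_bar):
    """Integral of H^2 over the reference area for a sphere (unit modulus)."""
    return 4.0 * math.pi * r_bar**2 / r**2

def bulk_term_work(r, r_bar):
    """Bulk term (90) integrated against a unit outward radial variation."""
    scale = (r_bar / r)**2          # sqrt(a_bar/a), uniform on the sphere
    H = 1.0 / r                     # N taken inward so that Laplacian(X) = 2 H N
    K = 1.0 / r**2
    # Laplacian(scale * H) vanishes because scale and H are uniform.
    normal_coeff = scale * 2.0 * H * (2.0 * H**2 - K)
    N_dot_rhat = -1.0               # inward normal against outward variation
    return normal_coeff * N_dot_rhat * 4.0 * math.pi * r**2

def helfrich_like_part(r):
    """Normal coefficient of the Helfrich-like term in (91), without the scale factor."""
    H, K = 1.0 / r, 1.0 / r**2
    return 2.0 * H * (H**2 - K)     # zero up to roundoff on a sphere

r, r_bar, h = 1.3, 1.0, 1e-5
dE_dr = (elastic_H2_energy(r + h, r_bar) - elastic_H2_energy(r - h, r_bar)) / (2 * h)
print(dE_dr, bulk_term_work(r, r_bar), helfrich_like_part(r))
```

In this example the Helfrich-like part of (91) vanishes, so the entire radial force comes from the term \(\nabla _\alpha \left( H^2\nabla ^\alpha \mathbf{X}\right) \) associated with the unvaried area form.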

4.3 Gaussian curvature

The Gaussian curvature term is \(-\sqrt{{\bar{a}}/a}\,{\scriptstyle B_K}K\), with K given by (10). This is an elastic form of a geometric energy whose variation would be a pure divergence, by the Gauss–Bonnet theorem. By contrast, the elastic energy contributes to the Euler–Lagrange equations.

We will require the following funny identity that makes use of Gauss (12), Weingarten (56), and Codazzi (11):

(92)

Along with this, we will use the symmetry of \(\nabla _\alpha \nabla _\beta \mathbf{X}\), the identity \(\nabla _\alpha \mathbf{X}\cdot \nabla _\beta \mathbf{X}\nabla ^\beta \mathbf{X}= \nabla _\alpha \mathbf{X}\), and the fact that \(\nabla _\alpha \nabla _\beta \mathbf{X}\cdot \nabla _\gamma \mathbf{X}=0\). Rather than evaluate the second term in K similarly to the first, consider the entire expression (10) together,

$$\begin{aligned} \delta&\int \hbox {d}A\, \sqrt{{\bar{a}}/a}\, \left[ \tfrac{1}{2}\nabla ^2\mathbf{X}\cdot \nabla ^2\mathbf{X}- \tfrac{1}{2}\nabla ^\alpha \nabla _\beta \mathbf{X}\cdot \nabla _\alpha \nabla ^\beta \mathbf{X}\right] \nonumber \\&= \int \hbox {d}A\, \sqrt{{\bar{a}}/a}\, \left[ \nabla ^2\mathbf{X}\cdot \nabla ^2\delta \mathbf{X}-2\nabla ^2\mathbf{X}\cdot \nabla _\alpha \nabla _\beta \mathbf{X}\nabla ^\beta \mathbf{X}\cdot \nabla ^\alpha \delta \mathbf{X}\right. \nonumber \\&\quad \,\, \left. - \nabla ^\alpha \nabla _\beta \mathbf{X}\cdot \nabla _\alpha \nabla ^\beta \delta \mathbf{X}+ 2\nabla ^\alpha \nabla _\beta \mathbf{X}\cdot \nabla _\alpha \nabla _\gamma \mathbf{X}\nabla ^\gamma \mathbf{X}\cdot \nabla ^\beta \delta \mathbf{X}\right] , \nonumber \\&= \int \hbox {d}A\, \sqrt{{\bar{a}}/a}\, \left[ \nabla ^2\mathbf{X}\cdot \nabla ^2\delta \mathbf{X}- \nabla ^\alpha \nabla _\beta \mathbf{X}\cdot \nabla _\alpha \nabla ^\beta \delta \mathbf{X}-2K\nabla _\alpha \mathbf{X}\cdot \nabla ^\alpha \delta \mathbf{X}\right] , \end{aligned}$$
(93)
$$\begin{aligned}&= \int \hbox {d}A\, \sqrt{{\bar{a}}/a}\, \left[ \nabla _\alpha \left( \nabla ^2\mathbf{X}\cdot \nabla ^\alpha \delta \mathbf{X}- \nabla ^\alpha \nabla _\beta \mathbf{X}\cdot \nabla ^\beta \delta \mathbf{X}\right) - K\nabla _\alpha \mathbf{X}\cdot \nabla ^\alpha \delta \mathbf{X}\right] , \nonumber \\&= \int \hbox {d}A\, \sqrt{{\bar{a}}/a}\, \left[ \nabla ^2\left( \nabla ^2\mathbf{X}\cdot \delta \mathbf{X}\right) - \nabla _\alpha \nabla ^\beta \left( \nabla ^\alpha \nabla _\beta \mathbf{X}\cdot \delta \mathbf{X}\right) +\nabla _\alpha \left( K\nabla ^\alpha \mathbf{X}\right) \cdot \delta \mathbf{X}\right] . \end{aligned}$$
(94)

This is tantalizingly close to a divergence, but misses on two counts. One is that the factor \(\sqrt{{\bar{a}}/a}\) does not pass through the covariant derivative. The other is that the \(\delta \mathbf{X}\) in the third term sits outside the divergence. This is simply the result of not varying the volume form (see 48). The term \(\delta \left( \sqrt{a}\,K\right) = \sqrt{a}\left( \delta K + K\nabla ^\alpha \mathbf{X}\cdot \nabla _\alpha \delta \mathbf{X}\right) \), where \(\delta K\) is just the bracketed quantity in (94), is indeed a divergence, and thus moveable to the boundary [13, 69, 72]. The first two terms of (94) simplify as follows,

$$\begin{aligned} \nabla ^2\left( \nabla ^2\mathbf{X}\cdot \delta \mathbf{X}\right) - \nabla _\alpha \nabla ^\beta \left( \nabla ^\alpha \nabla _\beta \mathbf{X}\cdot \delta \mathbf{X}\right)&= \nabla _\alpha \nabla ^\beta \left[ \left( \nabla ^2\mathbf{X}\delta ^\alpha _\beta - \nabla ^\alpha \nabla _\beta \mathbf{X}\right) \cdot \delta \mathbf{X}\right] , \nonumber \\&= \nabla _\alpha \nabla ^\beta \left[ \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\right] , \nonumber \\&= \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) \nabla _\alpha \nabla ^\beta \left( \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\right) , \end{aligned}$$
(95)

where we have used Codazzi (11) to see that \(\left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) \) is divergence-free.

There are several ways of manipulating the integrals. We choose to return to the calculation and proceed; thus,

$$\begin{aligned}&\int \hbox {d}A\, \sqrt{{\bar{a}}/a}\, \left[ \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) \nabla _\alpha \nabla ^\beta \left( \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\right) +\nabla _\alpha \left( K\nabla ^\alpha \mathbf{X}\right) \cdot \delta \mathbf{X}\right] \nonumber \\&= \int \hbox {d}A \left[ \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) \nabla ^\beta \nabla _\alpha \left( \sqrt{{\bar{a}}/a}\, \right) \varvec{\mathrm {\hat{N}}} +\sqrt{{\bar{a}}/a}\,\nabla _\alpha \left( K\nabla ^\alpha \mathbf{X}\right) \right] \cdot \delta \mathbf{X}\nonumber \\&\quad \,\, + \oint \hbox {d}L \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) \left[ \sqrt{{\bar{a}}/a}\,n_\alpha \nabla ^\beta \left( \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\right) - n^\beta \nabla _\alpha \left( \sqrt{{\bar{a}}/a}\,\right) \varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\right] , \nonumber \\&= \int \hbox {d}A \left[ \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) \nabla ^\beta \nabla _\alpha \left( \sqrt{{\bar{a}}/a}\, \right) \varvec{\mathrm {\hat{N}}} +\sqrt{{\bar{a}}/a}\,\nabla _\alpha \left( K\nabla ^\alpha \mathbf{X}\right) \right] \cdot \delta \mathbf{X}\nonumber \\&\quad \,\, + \oint \hbox {d}L \left[ l^\beta l_\gamma \nabla ^\gamma \left( b^\alpha _\beta n_\alpha \sqrt{{\bar{a}}/a}\, \right) - \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) n^\beta \nabla _\alpha \left( \sqrt{{\bar{a}}/a}\,\right) \right] \varvec{\mathrm {\hat{N}}}\cdot \delta \mathbf{X} \end{aligned}$$
(96)
$$\begin{aligned}&\quad \,\, + \oint \hbox {d}L\, \sqrt{{\bar{a}}/a}\,\left( 2H-n_\alpha b^\alpha _\beta n^\beta \right) n_\gamma \nabla ^\gamma \left( \varvec{\mathrm {\hat{N}}}\cdot \delta \mathbf{X}\right) , \end{aligned}$$
(97)
$$\begin{aligned}&= \int \hbox {d}A \left[ \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) \nabla ^\beta \nabla _\alpha \left( \sqrt{{\bar{a}}/a}\, \right) \varvec{\mathrm {\hat{N}}} +\sqrt{{\bar{a}}/a}\,\nabla _\alpha \left( K\nabla ^\alpha \mathbf{X}\right) \right] \cdot \delta \mathbf{X}\nonumber \\&\quad \,\, + \oint \hbox {d}L \left[ \sqrt{{\bar{a}}/a}\,\left( 2H-n_\alpha b^\alpha _\beta n^\beta \right) n_\gamma \nabla ^\gamma \varvec{\mathrm {\hat{N}}} - \left( 2H\delta ^\alpha _\beta - b^\alpha _\beta \right) n^\beta \nabla _\alpha \left( \sqrt{{\bar{a}}/a}\,\right) \varvec{\mathrm {\hat{N}}} \right. \nonumber \\&\quad \,\, \left. + \,l^\beta l_\gamma \nabla ^\gamma \left( b^\alpha _\beta n_\alpha \sqrt{{\bar{a}}/a}\, \right) \varvec{\mathrm {\hat{N}}} \right] \cdot \delta \mathbf{X} \end{aligned}$$
(98)
$$\begin{aligned}&\quad \,\, + \oint \hbox {d}L\, \sqrt{{\bar{a}}/a}\,\left( 2H-n_\alpha b^\alpha _\beta n^\beta \right) \varvec{\mathrm {\hat{N}}}\cdot n_\gamma \nabla ^\gamma \delta \mathbf{X}. \end{aligned}$$
(99)

The final lines use the decomposition \(\nabla ^\beta = \left( n^\beta n_\gamma + l^\beta l_\gamma \right) \nabla ^\gamma \), where \(n_\gamma \nabla ^\gamma \) and \(l_\gamma \nabla ^\gamma \) are the projections of the covariant derivative onto the unit normal to the boundary, and along the boundary, respectively, and \(n_\alpha n^\alpha = 1\) and \(n_\alpha l^\alpha = 0\). A quantity such as \(l^\beta l_\gamma \nabla ^\gamma \mathrm{T}_\beta \) is a divergence on the one-dimensional boundary. Smooth boundaries are assumed, so a corner term \(- l^\beta b^\alpha _\beta n_\alpha \sqrt{{\bar{a}}/a}\,\varvec{\mathrm {\hat{N}}} \cdot \delta \mathbf{X}\) is thrown away. Note that one of the terms in the corner jump condition of [13] is necessarily zero because \(n_\alpha l^\alpha = 0\), but was kept to show kinship with the term in the moment balance equation. Again, had we grouped terms as in the lines (96–97), we would have obtained boundary terms consistent with the form found in [6, 7, 13, 69, 113]. Other possibilities will be discussed in Sect. 4.4 just below.

4.4 Combining and dropping terms

The scale factor \(\sqrt{{\bar{a}}/a}\) adds complications by behaving essentially like a variable bending modulus throughout these expressions. We have retained it for a long time, primarily as a reminder that mass is conserved and that elasticity is per mass (material) and not per area (geometric). However, remembering that \(\sqrt{{\bar{a}}/a} = 1 + O(\varepsilon )\), let us consider whether we can justify dropping the additional ugly terms.

Here I offer a hand-waving argument that says we can do so, anticipating the final form of the equations of equilibrium, and focusing on the shape equation, the normal projection of the bulk terms. If the scale factor were unity, we would end up with terms of orders \((t^3\nabla ^2 b, t^3b^3, tb\varepsilon )\), the latter coming from the stretching term, while the augmentation of the stress adds nothing new. If these terms are retained in the final balance, it implies that we consider \(\varepsilon \), tb and \(t\nabla \) “small.” Inclusion of the scale factor gives us terms of the original orders multiplied by \(\varepsilon \), as well as new terms of orders \((t^3 b \nabla ^2\varepsilon , t^3\nabla b \nabla \varepsilon )\). According to our thinking, these are also smaller by an order of \(\varepsilon \) and can be dropped.

After this, the only remaining terms that distinguish between per mass and per area energies are terms \({\scriptstyle B_H}\nabla _\alpha \left( H^2\nabla ^\alpha \mathbf{X}\right) \) and \(-{\scriptstyle B_K}\nabla _\alpha \left( K\nabla ^\alpha \mathbf{X}\right) \) in the vector bulk equations. These can be grouped with the stress tensor, although this makes one boundary condition particularly long-winded. Note that the term involving K will disappear in the limit of a mid-surface isometry only when the reference configuration is a flat plate; for incompatible elasticity, this term is nontrivial. Had we approximated the scale factor in the energy, we could have gone further and constructed an approximate geometric energy, varied the area form \(\sqrt{a}\) along with the rest of the bending energy, and obtained a pure boundary term for the variation of the Gaussian curvature, as is done in the isometric treatment of Guven and Müller [87]. As discussed in Sects. 3.1.1 and 4.1.1, there is a difference between dropping terms in the energy and dropping them in the equations of equilibrium. The difference between the geometric \(\sqrt{a}\,K\) and the elastic \(\sqrt{{\bar{a}}}\,K\) is another example of a \((zb)^2\varepsilon \) term that could have been added to or dropped from the energy, but contributes terms at retained orders in the equations of equilibrium. Thus, particularly if we wish to retain the extra stress terms (see 83), we may wish to keep this term in the equations of equilibrium while dropping all other terms involving the scale factor. And even if we truncate the energy at quadratic order in mid-surface strain and curvature, it is most physical to write the energy as being per mass, which means retaining the bulk Gaussian term. Of course, if the energy is written up to higher orders, these distinctions become important, and any approximation as a per area energy becomes less accurate.

In the limit of a mid-surface isometry, the full expression for the variation of K simplifies considerably. It is then a matter of taste whether one uses the area form \(\sqrt{a}\) or the equivalent reference area form \(\sqrt{{\bar{a}}}\) as the measure of integration, the difference appearing only in the multipliers. The former choice, while unconventional from the point of view of the mechanics literature, is preferred by many physicists and leads to geometric integrals with elegant properties [87]. However, this hides the material nature of the elastic energy and does not cleanly link up with the material form required when mid-surface strains, particularly of higher orders, are included. Thus, despite some loss of elegance, we favor retaining the per mass form of the equations even in the isometric limit.

To derive the equations for the isometric limit, the stress must be replaced with a multiplier. Terms with a strain, and bending terms that arose from variation of a strain, no longer have meaning. The relevant energy density is then

$$\begin{aligned} \int \sqrt{{\bar{a}}}\,\hbox {d}x^1\hbox {d}x^2 \left[ \left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) + \tfrac{1}{2}{\tilde{\varsigma }}^{\alpha \beta }\left( \nabla _\alpha \mathbf{X}\cdot \nabla _\beta \mathbf{X}- {\bar{a}}_{\alpha \beta }\right) \right] , \end{aligned}$$
(100)

where \({\tilde{\varsigma }}^{\alpha \beta }\) is a tensor multiplier enforcing the constraint of isometric deformations of the mid-surface. This is an elastic (non-geometric) version of the Guven–Müller functional [87] that describes a metrically constrained Willmore–Helfrich energy. Because \(\sqrt{{\bar{a}}} = \sqrt{a}\) in the limit, the only difference between the resulting equations and those in [87] is a redefinition of the multipliers as in (78).

Let us now write the combined equations using the symbol \(\tau ^{\alpha \beta }\), which should be interpreted either as the Lagrange multiplier \({\tilde{\varsigma }}^{\alpha \beta }\) in the constrained theory (100), or as the stress from (83) in an elastic theory that retains terms up to \(O((t\nabla )^2(tb), (tb)^3, (tb)\varepsilon )\) in the normal projection of the bulk equations.

The combined bulk equations are

$$\begin{aligned} {\scriptstyle B_H}\left[ \nabla ^2 H + 2H\left( H^2 - K\right) \right] \varvec{\mathrm {\hat{N}}} - \nabla _\alpha \left( \left[ \tau ^{\alpha \beta } - \left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) a^{\alpha \beta } \right] \nabla _\beta \mathbf{X}\right) = \mathbf{{0}} , \end{aligned}$$
(101)

with free boundary conditions for forces

$$\begin{aligned} \left[ -{\scriptstyle B_H}n_\alpha \nabla ^\alpha H - {\scriptstyle B_K}l^\beta l_\gamma \nabla ^\gamma \left( b^\alpha _\beta n_\alpha \right) \right] \varvec{\mathrm {\hat{N}}} +\left[ {\scriptstyle B_H}H -{\scriptstyle B_K}\left( 2H-n_\alpha b^\alpha _\beta n^\beta \right) \right] n_\gamma \nabla ^\gamma \varvec{\mathrm {\hat{N}}} + n_\alpha \tau ^{\alpha \beta }\nabla _\beta \mathbf{X}= \mathbf{{0}}, \end{aligned}$$
(102)

and moments

$$\begin{aligned} \left[ {\scriptstyle B_H}H - {\scriptstyle B_K}\left( 2H - n_\alpha b^\alpha _\beta n^\beta \right) \right] \varvec{\mathrm {\hat{N}}} = \mathbf{{0}}. \end{aligned}$$
(103)

The bulk Eq. (101) and force boundary condition (102) correspond to the variation \(\delta \mathbf{X}\), and the moment boundary condition corresponds to the derivative of this variation normal to the boundary, \(n_\alpha \nabla ^\alpha \delta \mathbf{X}\). It seems clear that these can be taken as independent variations, in keeping with the grouping of terms in lines (88–89) and (98–99). Again, this is in contrast with those treatments [7, 13, 69, 113] that break the variation into tangential and normal components, and consider \(\delta \mathbf{X}\) and \(n_\alpha \nabla ^\alpha \left( \varvec{\mathrm {\hat{N}}}\cdot \delta \mathbf{X}\right) \) to be independent variations, in keeping with the alternate grouping of terms in lines (85–86) and (96–97). The present form is consistent with the approach of Steigmann [74] and contains additional \(n_\gamma \nabla ^\gamma \varvec{\mathrm {\hat{N}}}\) terms in the force boundary condition (102) that are absent in [6, 7, 13, 69, 113]. Note that the moment condition (103) can be used to replace the terms multiplying \(n_\gamma \nabla ^\gamma \varvec{\mathrm {\hat{N}}}\) in (102), which simply vanish for a moment-free boundary. We also did not group \(-\left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) a^{\alpha \beta }\) terms with the stress in (102) as was done in the bulk equation, because this merely complicates the force boundary condition; compare the tangential projection of (102) with equation (6) of [13], where additional terms appear outside the stress. For an elastic plate, \({\scriptstyle B_H}\) is linearly related to \({\scriptstyle B_K}\), so one can combine the mean curvature terms in the moment bracket, but we refrain from doing so in order to keep the results applicable to any general choice of coefficients for the two quadratic curvature energies.

The vector bulk equations (101) are a surface divergence, which can be seen by looking back to line (84),

$$\begin{aligned} \nabla _\alpha \left[ {\scriptstyle B_H}\left( \nabla ^\alpha H\varvec{\mathrm {\hat{N}}} - H\nabla ^\alpha \varvec{\mathrm {\hat{N}}}\right) - \tau ^{\alpha \beta }\nabla _\beta \mathbf{X}- {\scriptstyle B_K}K\nabla ^\alpha \mathbf{X}\right] = \mathbf{{0}}. \end{aligned}$$
(104)

Using Gauss (13) and some manipulations, we can write

$$\begin{aligned} n_\alpha K\nabla ^\alpha \mathbf{X}= -\left( 2H - n_\alpha b^\alpha _\beta n^\beta \right) n_\gamma \nabla ^\gamma \varvec{\mathrm {\hat{N}}} - l^\beta l_\gamma \nabla ^\gamma \left( b^\alpha _\beta n_\alpha \right) \varvec{\mathrm {\hat{N}}} + l^\beta l_\gamma \nabla ^\gamma \left( b^\alpha _\beta n_\alpha \varvec{\mathrm {\hat{N}}} \right) , \end{aligned}$$
(105)

so that (\(-n_\alpha \) times) the quantity inside the divergence in (104) differs from the force boundary terms in (102) only by a boundary divergence.

The normal and tangential projections of the bulk equations (101) are

$$\begin{aligned} {\scriptstyle B_H}\left[ \nabla ^2 H + 2H\left( H^2 - K\right) \right] - \left[ \tau ^{\alpha \beta } - \left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) a^{\alpha \beta } \right] b_{\beta \alpha }&= 0, \end{aligned}$$
(106)
$$\begin{aligned} -\,\nabla _\alpha \left[ \tau ^{\alpha \gamma } - \left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) a^{\alpha \gamma } \right]&= 0. \end{aligned}$$
(107)

The divergence form of (104) and its tangential projection (107) are clearly anticipated from the discussion in [87]. Note that terms \(\left[ \left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) a^{\alpha \beta }\right] b_{\beta \alpha } = 2H \left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) \) in (106) arising from the lack of variation of the area form are similar to, but not quite the same as, the cubic terms in the Helfrich contribution on the left. The normal and tangential projections of the force boundary condition (102) are

$$\begin{aligned} -{\scriptstyle B_H}n_\alpha \nabla ^\alpha H - {\scriptstyle B_K}l^\beta l_\gamma \nabla ^\gamma \left( b^\alpha _\beta n_\alpha \right)&= 0, \end{aligned}$$
(108)
$$\begin{aligned} n_\gamma \left( -\left[ {\scriptstyle B_H}H -{\scriptstyle B_K}\left( 2H-n_\alpha b^\alpha _\beta n^\beta \right) \right] b^{\gamma \eta } + \tau ^{\gamma \eta } \right)&= 0, \end{aligned}$$
(109)

where again we do not group the \(-\left( {\scriptstyle B_H}H^2 - {\scriptstyle B_K}K \right) a^{\alpha \beta }\) term with the stress \(\tau ^{\alpha \beta }\), to keep the expression (109) in a simpler form. The moment boundary condition (103) has only its normal projection,

$$\begin{aligned} {\scriptstyle B_H}H - {\scriptstyle B_K}\left( 2H - n_\alpha b^\alpha _\beta n^\beta \right) = 0 . \end{aligned}$$
(110)

We can use (110) in (109) to find that \(n_\gamma \tau ^{\gamma \eta } = 0\) for a moment-free boundary.

There are other ways to write these equations and conditions, and the related forms for lipid membranes, and no doubt some are nicer than what is found here. Notable examples include the use of force and moment [19, 74, 75, 85] or closely related stress and torque tensors [68, 69, 87, 114], and the normal and geodesic curvature and torsion of the boundary [72, 114].

Interpreting the equations as constrained elastic equations derived from (100), it is instructive to compare them with those derived in the arc length “gauge” for the inextensible elastica using the functional \(\tfrac{1}{2}\int \hbox {d}s \left[ {\scriptstyle B}\kappa ^2 + \varsigma \left( \partial _s\mathbf{X}\cdot \partial _s\mathbf{X}- 1\right) \right] \), where \(\kappa ^2 = \partial ^2_s\mathbf{X}\cdot \partial ^2_s\mathbf{X}\) is the squared Frenet curvature. Here s is both the rest and present arc length, no metric tensor or covariant derivative explicitly appears, and \(\hbox {d}s\) is not subject to variation. Comparison with the variation of this functional shows us that \(\tau ^{ss} = \varsigma + 2{\scriptstyle B}\kappa ^2 = \varsigma + 4{\scriptstyle B_H}H^2\), with \(\kappa = 2H\) and \(2{\scriptstyle B}= {\scriptstyle B_H}\).
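The relation between \(\tau ^{ss}\) and \(\varsigma \) can be checked componentwise. The Python sketch below assumes a planar curve with Frenet relations \(\partial _s\varvec{\mathrm {\hat{T}}} = \kappa \varvec{\mathrm {\hat{N}}}\), \(\partial _s\varvec{\mathrm {\hat{N}}} = -\kappa \varvec{\mathrm {\hat{T}}}\), for which the elastica equation \({\scriptstyle B}\,\partial _s^4\mathbf{X}- \partial _s\left( \varsigma \,\partial _s\mathbf{X}\right) = \mathbf{0}\) has normal and tangential components \({\scriptstyle B}(\kappa '' - \kappa ^3) - \varsigma \kappa \) and \(-3{\scriptstyle B}\kappa \kappa ' - \varsigma '\). These are compared with a one-dimensional reading of (106) and (107) with \(K = 0\) and \(b^s_s = \kappa \). The one-dimensional reduction, the sample functions `kappa` and `sigma`, and the value of `B` are assumptions introduced only for this illustration.

```python
import math

B = 0.7  # hypothetical bending modulus of the elastica

def kappa(s):
    """Hypothetical curvature profile with its first two derivatives."""
    return 1.0 + 0.3 * math.sin(s), 0.3 * math.cos(s), -0.3 * math.sin(s)

def sigma(s):
    """Hypothetical multiplier profile with its first derivative."""
    return 2.0 + math.cos(2.0 * s), -2.0 * math.sin(2.0 * s)

def elastica_components(s):
    """Normal and tangential parts of B X'''' - (sigma X')' for a planar curve."""
    k, k1, k2 = kappa(s)
    sg, sg1 = sigma(s)
    normal = B * (k2 - k**3) - sg * k
    tangential = -3.0 * B * k * k1 - sg1
    return normal, tangential

def plate_components(s):
    """One-dimensional reading of (106) and (107) with K = 0 and b = kappa."""
    k, k1, k2 = kappa(s)
    sg, sg1 = sigma(s)
    B_H = 2.0 * B                      # 2 B = B_H
    H, H1, H2 = k / 2.0, k1 / 2.0, k2 / 2.0   # kappa = 2 H
    tau = sg + 2.0 * B * k**2          # the relation being tested
    tau1 = sg1 + 4.0 * B * k * k1      # its arc length derivative
    normal = B_H * (H2 + 2.0 * H**3) - (tau - B_H * H**2) * k
    tangential = -(tau1 - 2.0 * B_H * H * H1)
    return normal, tangential

for s in (0.0, 0.4, 1.1, 2.5):
    print(s, elastica_components(s), plate_components(s))
```

Both components agree at every sample point only for the combination \(\tau ^{ss} = \varsigma + 2{\scriptstyle B}\kappa ^2\); a different coefficient in front of \({\scriptstyle B}\kappa ^2\) breaks the agreement in both the normal and the tangential parts.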

5 Additional discussion

The results derived here can just as well be thought of as applying to a plate with in-plane incompatibility, that is, an object with no through-thickness variation in rest metric. An interpretation in terms of a reference embedding in \({\mathbb {E}}^3\) is no longer possible, although one could likely construct a smooth or piecewise smooth [115] isometric embedding of the mid-surface. The present metric (Almansi) approach, in concert with the second Kirchhoff–Love assumption, leads to expressions in the Euler–Lagrange equations involving geometric quantities of the deformed surface. Using another approach that favors the reference metric, it should be possible to derive similarly simple expressions involving the geometric features of the reference configuration. This question would be better explored in the context of shells rather than plates. The retention of the divergence form of the equations after tangential projection, a structure made clear in early work on bulk elasticity [4] and in recent approaches to solid surfaces [87], likely reflects a connection between the statements of conservation of momentum and pseudomomentum [116].

The concept of an energy quadratic in strain has been ill-defined. If metrics or the \(\epsilon _{ij}\) derived from them are seen as fundamental fields, then a general quadratic energy would contain terms constructed from both \(g^{ij}{\bar{g}}_{jk}\) and \({\bar{g}}^{ij}g_{jk}=\left( {\bar{g}}^{-1}\right) ^{ij}g_{jk}\). Interestingly, one finds both \({\bar{g}}^{ij}g_{ij}\) and \(g^{ij}{\bar{g}}_{ij}\sqrt{g/{\bar{g}}}\) as commonly discussed invariants in early foundational work on elasticity [4, 5], because one can rewrite invariants of right and left Cauchy–Green tensors and their inverses, and by extension the Green and Almansi strains, in terms of each other. These translations involve the third (cubic) invariant, the Jacobian \(\sqrt{g/{\bar{g}}}\,\). But there are yet other measures of strain. The principal stretches are square roots of the eigenvalues of the right Cauchy–Green tensor, so something quadratic in the latter is quartic in the former. Commonly used rubber elasticity models such as neo-Hookean or Mooney–Rivlin energy densities explicitly contain terms quadratic in stretches, although they are not general expansions in stretch as might satisfy a physicist.
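
The statements in the preceding paragraph about stretches and invariants can be checked numerically. The following is a minimal sketch, not part of the derivations in this paper. It assumes Cartesian coordinates with an identity rest metric, so that the components of the present metric coincide with those of the right Cauchy–Green tensor, and it uses a deformation gradient `F` whose entries are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical deformation gradient (illustrative values only, det F > 0).
F = np.array([[1.3, 0.2, 0.0],
              [0.1, 0.9, 0.3],
              [0.0, 0.1, 1.1]])

C = F.T @ F   # right Cauchy-Green tensor
B = F @ F.T   # left Cauchy-Green tensor

# Principal stretches are the square roots of the eigenvalues of C.
eig_C = np.linalg.eigvalsh(C)
stretches = np.sqrt(eig_C)

# Principal invariants of C and the Jacobian J = det F.
I1 = np.trace(C)
I2 = 0.5 * (np.trace(C) ** 2 - np.trace(C @ C))
I3 = np.linalg.det(C)
J = np.linalg.det(F)

# A quantity quadratic in C is quartic in the stretches.
quad_in_C = np.trace(C @ C)
quartic_in_stretch = np.sum(stretches ** 4)

# An invariant of the inverse tensor, rewritten through invariants of C;
# the translation involves the third invariant I3 = J**2.
tr_Cinv = np.trace(np.linalg.inv(C))

print("tr(C^2) vs sum of stretch^4:", quad_in_C, quartic_in_stretch)
print("tr(C) vs sum of stretch^2:  ", I1, np.sum(stretches ** 2))
print("tr(C^-1) vs I2/I3:          ", tr_Cinv, I2 / I3)
print("det(C) vs J^2:              ", I3, J ** 2)
print("tr(B) vs tr(C):             ", np.trace(B), I1)
```

Each printed pair should agree to rounding error. The trace of the inverse is used here only as one simple instance of rewriting an invariant of an inverse tensor in terms of the invariants of the tensor itself.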

The difference in the importance or order of terms between the energy and equilibrium equations, leading to potential confusion as to when to drop terms, may indicate that an expansion in terms of mid-surface Green or Almansi strain and the product of curvature and thickness is simply not the natural one. Further indication of this possibility comes from the correspondence between Biot strain and the simplest direct theories of rods. Thus, developing a general covariant theory of Biot strain in the language of physics would be valuable. It would also connect with early treatments of low-dimensional bodies [25–28, 103] and recent developments in soft matter [80, 81, 108]. Many direct theories of rods and shells naively assume a simple relationship between a convenient kinematical variable and the stored energy function without determining what that possibly very complex function might be. Some comparisons have been made between direct and reduced theories [25–28, 103, 117–119], but it would be valuable to take a broader look at the simple generalized strains that naturally arise in Cosserat theories [25, 59, 93, 94] and how they do, or do not, correspond to simple stored energy functions. Linking a direct theory to a bulk elastic model not only justifies its employment, but could be a basis for derivation of better models for elastic strips or moderately thick structures.

6 Conclusions

I have compared and discussed some recent approaches to incompatible elasticity in the soft condensed matter physics community and presented the derivation of plate equations in a compact form. Among the issues raised were the meaning behind what physicists refer to as metric choices, the divergence form of equations of equilibrium, qualitative differences in derived bending energies and their predictions for a simple combined stretching and bending problem, the possible advantages of the Biot strain as an alternate measure to serve as a basis for expansions, and the differences between geometric and material bending energies.