1 Introduction

In 1982, Penrose [25] listed a set of major open problems, one of which was to find a suitable quasi-local definition of energy-momentum in general relativity. As is well known, the fundamental difficulty is that there is no natural notion of energy density for the gravitational field, due to Einstein’s principle of equivalence. This has led to a plethora of different mathematical formulations, including: the localization of ADM mass by Bartnik [2]; the twistor and spinor approaches of Dougan and Mason [8], Lott [21], Ludvigsen and Vickers [22], Penrose [26], and Zhang [41]; the Hawking mass [13]; and the Hamilton–Jacobi methods employed by Booth and Mann [3], Brown and York [6], Epp [10], Hawking and Horowitz [14], Kijowski [18], Liu and Yau [20], Tsang [36], and Wang and Yau [38]. For a detailed account of the various quasi-local masses, see the survey paper of Szabados [35].

There are several desirable properties that a quasi-local mass should have. For instance, like the positive mass theorem it should be nonnegative for a large class of surfaces in the presence of the dominant energy condition, and exhibit rigidity in the sense that it vanishes if and only if the surface arises from Minkowski space. Appropriate asymptotics are also important, in that the ADM and Bondi masses should be recovered in the large sphere limits at spatial and null infinities. Moreover, gauge independence and relation with a Hamilton–Jacobi analysis are preferable, in order to aid with physical relevance and interpretations. While many other properties of the quasi-local mass itself may be added to the list, a potentially advantageous characteristic of a proof of nonnegativity is that it be quasi-local. This refers to a proof strategy that appeals solely to a compact initial data set enclosed by the spacelike 2-surface in question, as opposed to the use of asymptotically flat (or other) extensions and the positive mass theorem. The latter strategy of extensions is essential in the work of Shi and Tam [32] on the Brown–York mass, as well as for the Liu-Yau [20] and Wang-Yau [37, 38] masses, and is even contained within the definition of the Bartnik mass [2]. In fact, Schoen asked in [30] whether it is possible to find a quasi-local proof of nonnegativity for the Brown–York and related quasi-local masses. An affirmative answer to this question for the Brown–York definition has recently been put forward by Montiel [23], utilizing a spinorial approach.

In the current paper, we introduce a new gauge-independent expression for quasi-local mass with a quasi-local proof of nonnegativity. This mass also comes with a Hamilton–Jacobi interpretation, and although it shares some similarities with the Wang-Yau definition, the new notion has a completely separate derivation and involves a different range of applicability. The derivation and basic motivation come from certain integral expressions associated with spacetime harmonic functions [15]. In [33], level set techniques for harmonic maps to \(S^1\) were introduced into the study of scalar curvature on closed 3-dimensional Riemannian manifolds. Inspired by this, a proof of the Riemannian positive mass theorem [5] was given based on level sets of asymptotically linear harmonic functions. Spacetime harmonic functions were then introduced [15] to treat the asymptotically flat spacetime version of the positive mass theorem, and have subsequently been used to prove the corresponding theorem in asymptotically (locally) hyperbolic settings [1, 4], as well as for comparison theorems in Riemannian geometry [16, 17]; surveys of some recent advancements may be found in [4, 34]. Throughout this work, unless specified otherwise, all manifolds will be assumed to be connected, oriented, and smooth.

Let \(\Sigma \) be a closed (compact without boundary) spacelike 2-surface having induced metric \(\sigma \) in spacetime \(N^{3,1}\) with metric \(\langle \cdot ,\cdot \rangle \) of signature \((-+++)\). Let \(\{e_3,e_4\}\) be an orthonormal frame for the (time oriented) normal bundle consisting of a spacelike and future directed timelike vector respectively; note that this bundle is topologically trivial since the classifying space for \(SO_+(1,1)\cong {\mathbb {R}}\) is a point. In this gauge, the normal bundle connection 1-form and mean curvature vector are given by

$$\begin{aligned} \alpha _{e_3}(\cdot )=\langle \nabla ^N_{(\cdot )}e_3,e_4\rangle ,\quad \quad \quad \vec {H}=(\textrm{div}_{\sigma }e_3)e_3 -(\textrm{div}_{\sigma }e_4)e_4. \end{aligned}$$
(1.1)

Consider an isometric embedding \(\iota :\Sigma \hookrightarrow {\mathbb {R}}^{3,1}\), and choose Cartesian coordinates \(({\textbf{t}},{\textbf{x}}^i)\), \(i=1,2,3\) for the target Minkowski space. One may then obtain a function \(u_{a}=\iota ^{*}\left( -{\textbf{t}}+a_i{\textbf{x}}^i\right) \) on \(\Sigma \) for any set of constants \({\textbf{a}}=(a_1,a_2,a_3)\), which will be restricted to satisfy \(|{\textbf{a}}|^2=\sum _i a_i^2=1\). Recall that in [37, 38], Wang-Yau parameterize a class of isometric embeddings into Minkowski space with a time function \(\tau \) on \(\Sigma \) satisfying the convexity condition

$$\begin{aligned} \left( 1+|\nabla _\partial \tau |^2\right) K_{{\tilde{\sigma }}}=K_{\sigma } +\frac{\text {det}(\nabla _\partial ^2\tau )}{(\det \sigma )\left( 1+|\nabla _\partial \tau |^2\right) }>0, \end{aligned}$$
(1.2)

where \(K_{\sigma }\) and \(K_{{\tilde{\sigma }}}\) denote the Gaussian curvatures of \(\sigma \) and \({\tilde{\sigma }}=\sigma +d\tau ^2\), and \(\nabla _\partial \) is the connection with respect to \(\sigma \). By the classical theorem of Nirenberg [24] and Pogorelov [27] there exists a unique isometric embedding up to rigid motion into \({\mathbb {R}}^3\), and from this one obtains an isometric embedding into \({\mathbb {R}}^{3,1}\). An alternative method to produce isometric embeddings, which does not rely on (1.2), is to embed \(\Sigma \) into a (hyperboloid) hyperbolic space \({\mathbb {H}}^3_{-\kappa }\subset {\mathbb {R}}^{3,1}\) with large \(\kappa >0\) to aid with ellipticity. For instance, every metric on the 2-sphere admits such an isometric embedding [28, 29], and a related result of Gromov [12, Section 3.2.4] shows that any closed 2-surface admits an isometric embedding into some complete 3-dimensional Riemannian target space of constant negative sectional curvature.

The quintuple \((\sigma ,\vec {H},\alpha ,\iota ,u_a)\) will be used to build the quasi-local energy, and may be referred to as a quasi-local data set for \(\Sigma \). If the mean curvature vector is spacelike, then for any constant \(\varepsilon >0\), there exists a unique orthonormal frame \(\{{\bar{e}}_3,{\bar{e}}_4\}\) for the normal bundle of \(\Sigma \) such that

$$\begin{aligned} \langle \vec {H},{\bar{e}}_3\rangle >0,\qquad \quad \langle \vec {H},{\bar{e}}_4\rangle =\frac{-\Delta _\partial u_a}{\sqrt{|\nabla _\partial u_a|^2+\varepsilon ^2}}. \end{aligned}$$
(1.3)

We define the quasi-local energy with respect to the observer determined by the pair \((\iota ,u_a)\) to be

$$\begin{aligned} E(\Sigma ,\iota ,u_a) =\lim _{\varepsilon \rightarrow 0}\frac{1}{8\pi }\int _{\Sigma }\left( {\mathcal {H}}_0({\hat{e}}_3,u_a) -{\mathcal {H}}({\bar{e}}_3,u_a)\right) dA, \end{aligned}$$
(1.4)

where the quasi-local Hamiltonian density is

$$\begin{aligned} {\mathcal {H}}({\bar{e}}_3,u_a) =\sqrt{|\nabla _\partial u_a|^2+\varepsilon ^2}\,\langle \vec {H},{\bar{e}}_3\rangle +\alpha _{{\bar{e}}_3}(\nabla _\partial u_a) \end{aligned}$$
(1.5)

and \({\mathcal {H}}_0\) represents the same quantity for \(\iota (\Sigma )\subset {\mathbb {R}}^{3,1}\) with \(\{{\hat{e}}_3,{\hat{e}}_4\}\) denoting the frame given by equations (1.3) in the Minkowski setting. In analogy with special relativity, the quasi-local mass is then set to be the infimum of energy over all admissible observers

$$\begin{aligned} \text {mass}(\Sigma )={\text {inf}}_{(\iota ,u_a)}E(\Sigma ,\iota ,u_a). \end{aligned}$$
(1.6)

Due to the null trajectory of the observers, this mass may be interpreted as the difference of energy and norm of linear momentum from the 4-momentum vector as opposed to its Lorentz length; see the remark in [7, p. 1496] for a related discussion of this discrepancy.

A pair \((\iota ,u_a)\) will be called admissible for \(\Sigma \subset N^{3,1}\) if the mean curvature vector of \(\iota (\Sigma )\) is spacelike, and the regular level sets of \(u_a\) are maximal in the following sense. Namely, there exists a compact spacelike hypersurface \({\hat{\Omega }}\subset {\mathbb {R}}^{3,1}\) with \(\partial {\hat{\Omega }}=\iota (\Sigma )\), such that if any regular s-level set of \(u_a\) has n components then the Euler characteristic of the s-level fill-in with respect to \({\hat{\Omega }}\) satisfies \(\chi ({\hat{\Sigma }}_s)=n\). Here, \({\hat{\Sigma }}_s\) denotes the set of points within \({\hat{\Omega }}\) satisfying the level set equation \(s=-{\textbf{t}}+a_i{\textbf{x}}^i\). Furthermore, we will say that the dominant energy condition holds for a spacetime if the Einstein equations are enforced with stress-energy tensor satisfying \(T({\textbf{v}},{\textbf{w}})\ge 0\) for all future-pointing causal vectors \({\textbf{v}}\) and \({\textbf{w}}\).

Theorem 1.1

Let \(\Sigma \) be a closed spacelike 2-surface in a 4-dimensional spacetime satisfying the dominant energy condition, and assume that \(\Sigma \) has an outward pointing spacelike mean curvature vector and bounds a compact spacelike hypersurface \(\Omega \) with trivial second homology \(H_2(\Omega ;{\mathbb {Z}})=0\). The following statements hold with \({\textbf{a}}\in {\mathbb {R}}^3\) satisfying \(|{\textbf{a}}|=1\).

  1. If an isometric embedding \(\iota :\Sigma \hookrightarrow {\mathbb {R}}^{3,1}\) together with function \(u_a\) is admissible for \(\Sigma \), then the quasi-local energy limit exists, is finite, and is nonnegative: \(E(\Sigma ,\iota ,u_a)\ge 0\).

  2. Under the same hypotheses, if \(E(\Sigma ,\iota ,u_a)=0\) for all \({\textbf{a}}\) then \((\Sigma ,\sigma ,\vec {H},\alpha )\) arises from Minkowski space.

  3. If \(\Sigma \subset {\mathbb {R}}^{3,1}\) is such that the inclusion map is admissible with some \(u_a\), then the quasi-local mass vanishes: \(\text {mass}(\Sigma )=0\).

Remark 1.2

Nonnegativity of the energy holds in the more general circumstance of a disconnected \(\Sigma \) and without the homology assumption on \(\Omega \), when a suitable modification of the admissibility condition is enforced. This is discussed in more detail below at the end of Sect. 4.

A model situation for an admissible pair \((\iota ,u_a)\) occurs when \(\Sigma \) is a topological 2-sphere, and its isometric image in Minkowski space bounds a 3-ball whose generic intersection with the null hyperplanes associated to \(u_a\) is in the form of a disc. Therefore, this theorem yields positivity under the dominant energy condition for a large class of surfaces. Moreover, because this admissibility condition is not based on a convexity property such as (1.2) which appears in [38, Definition 5.1], it is applicable beyond spheres to surfaces of positive genus. Beyond this difference with the Wang-Yau mass in terms of the admissibility conditions and range of applicability, there are two other immediate distinguishing characteristics between these two quasi-local masses. First, as explained in Sect. 3, a Hamilton–Jacobi analysis shows that energy (1.4) is measured by a null observer while the Wang-Yau energy is measured by a timelike observer. Secondly, while the proof of nonnegativity for the Wang-Yau mass relies on asymptotically flat extensions and ultimately the positive mass theorem, the proof here requires only quasi-local information.

A consequence of the proof of Theorem 1.1 is that the limit may be evaluated explicitly in definition (1.4). In particular, if \({\tilde{\Sigma }}\) denotes the open subset of \(\Sigma \) on which \(|\nabla _{\partial } u_a|\ne 0\) then we have

$$\begin{aligned} \begin{aligned} E(\Sigma ,\iota ,u_a)&=\frac{1}{8\pi }\int _{{\tilde{\Sigma }}}\left( \sqrt{|\vec {H}_0|^2|\nabla _\partial u_a|^2+(\Delta _\partial u_a)^2}+\nabla _{\partial }f_0\cdot \nabla _{\partial }u_a+\alpha _{\frac{\vec {H}_0}{|\vec {H}_0|}}(\nabla _{\partial }u_a)\right) dA\\&\quad -\frac{1}{8\pi }\int _{{\tilde{\Sigma }}}\left( \sqrt{|\vec {H}|^2|\nabla _\partial u_a|^2+(\Delta _\partial u_a)^2}+\nabla _{\partial }f\cdot \nabla _{\partial }u_a+\alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial }u_a)\right) dA, \end{aligned} \end{aligned}$$
(1.7)

where \(\vec {H}_0\) is the mean curvature vector of the isometric embedding \(\iota (\Sigma )\) and

$$\begin{aligned} f_{0}=\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}_0||\nabla _{\partial }u_a|}\right) ,\qquad f=\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}||\nabla _{\partial }u_a|}\right) . \end{aligned}$$
(1.8)

In addition to the attributes of positivity and rigidity, the new definition also behaves well with respect to asymptotic limits at spatial infinity. Let \((M,g,k)\) be a 3-dimensional initial data set for the Einstein equations, with g denoting a Riemannian metric and k a symmetric 2-tensor representing the extrinsic curvature in spacetime. These objects satisfy the constraint equations

$$\begin{aligned} \mu =\frac{1}{2}\left( R_g +(\textrm{Tr}_g k)^2 -|k|^2\right) ,\quad \quad J=\textrm{div}_g\left( k-(\textrm{Tr}_g k)g\right) , \end{aligned}$$
(1.9)

where \(R_g\) is the scalar curvature and \(\mu \) and J represent the energy and momentum density of matter fields multiplied by \(8\pi \). The data are asymptotically flat (with one end) if outside a compact set, M is diffeomorphic to the complement of a ball in Euclidean space \({\mathbb {R}}^3 \setminus B_1\), and in the coordinates provided by this diffeomorphism

$$\begin{aligned} |\partial ^l (g_{ij}-\delta _{ij})(x)|&=O(|x|^{-\tau -l}),\quad l=0,1,2,3,\\ |\partial ^l k_{ij}(x)|&=O(|x|^{-\tau -1-l}),\quad l=0,1, \end{aligned}$$
(1.10)

for some \(\tau >\tfrac{1}{2}\). An extra third derivative of g is included for control of isometric embeddings in the next result. The energy and momentum densities will be taken to be integrable \(\mu , J \in L^1(M)\) so that the ADM energy and linear momentum are well-defined and given by

$$\begin{aligned} {\mathcal {E}}= & {} \lim _{r\rightarrow \infty }\frac{1}{16\pi }\int _{S_{r}}\sum _i \left( g_{ij,i}-g_{ii,j}\right) \upsilon ^j dA,\nonumber \\ {\mathcal {P}}_i= & {} \lim _{r\rightarrow \infty }\frac{1}{8\pi }\int _{S_{r}} \left( k_{ij}-(\textrm{Tr}_g k)g_{ij}\right) \upsilon ^j dA, \end{aligned}$$
(1.11)

where \(\upsilon \) is the unit outer normal to the coordinate sphere \(S_r\) of radius \(r=|x|\) and dA denotes its area element. The ADM mass is the Lorentz length of the ADM energy-momentum vector \(({\mathcal {E}},{\mathcal {P}})\). If the dominant energy condition is satisfied, which implies that \(\mu \ge |J|\), then the spacetime positive mass theorem [9, 15, 31, 40] asserts that the ADM energy-momentum is nonspacelike, and characterizes Minkowski space as the unique spacetime having asymptotically flat initial data with vanishing mass; see [19] for a detailed account.

Theorem 1.3

Let \((M,g,k)\) be an asymptotically flat initial data set for the Einstein equations, and let \(\iota _r:S_r \hookrightarrow {\mathbb {R}}^{3,1}\) denote the (unique up to Euclidean motion) isometric embedding of the r-coordinate sphere into a constant time slice \({\mathbb {R}}^3\subset {\mathbb {R}}^{3,1}\). Then for any \({\textbf{a}}\), the asymptotic limit of quasi-local energies is given in terms of the ADM energy and linear momentum by

$$\begin{aligned} \lim _{r\rightarrow \infty }E(S_r,\iota _r,u_a)={\mathcal {E}}-\langle {\textbf{a}},{\mathcal {P}}\rangle . \end{aligned}$$
(1.12)
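As a small numerical illustration of (1.12), the following sketch evaluates the limiting value \({\mathcal {E}}-\langle {\textbf{a}},{\mathcal {P}}\rangle \) for unit vectors \({\textbf{a}}\) and compares it with \({\mathcal {E}}-|{\mathcal {P}}|\), which is the smallest value this expression can take over \(|{\textbf{a}}|=1\) by the Cauchy–Schwarz inequality. The function names and the sample values of the ADM energy and momentum are hypothetical and are chosen only to make the formula concrete.

```python
import math

def limiting_energy(adm_energy, adm_momentum, a):
    """Right-hand side of (1.12): E - <a, P> for a unit vector a."""
    norm_a = math.sqrt(sum(c * c for c in a))
    if abs(norm_a - 1.0) > 1e-9:
        raise ValueError("a must be a unit vector")
    return adm_energy - sum(ai * pi for ai, pi in zip(a, adm_momentum))

def smallest_limiting_energy(adm_energy, adm_momentum):
    """Infimum of E - <a, P> over |a| = 1, which equals E - |P|."""
    return adm_energy - math.sqrt(sum(p * p for p in adm_momentum))

# Hypothetical sample values, not taken from the paper.
E_ADM = 6.0
P_ADM = (3.0, 0.0, 4.0)

norm_P = math.sqrt(sum(p * p for p in P_ADM))
a_aligned = tuple(p / norm_P for p in P_ADM)   # unit vector in the direction of P
a_other = (1.0, 0.0, 0.0)                      # some other unit direction

print(limiting_energy(E_ADM, P_ADM, a_aligned))
print(limiting_energy(E_ADM, P_ADM, a_other))
print(smallest_limiting_energy(E_ADM, P_ADM))
```

In this sample the aligned direction attains \({\mathcal {E}}-|{\mathcal {P}}|\) up to rounding, while any other direction gives a larger value.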

This paper is organized as follows. In the next section the main ideas behind the derivation of the energy will be explained, while in Sect. 3 a physical interpretation will be given in terms of a Hamilton–Jacobi analysis. Section 4 is dedicated to the proof of nonnegativity, and Sect. 5 deals with the rigidity statement of Theorem 1.1. Moreover, the asymptotic behavior of Theorem 1.3 will be established in Sect. 6, while the equation associated with optimal isometric embedding is obtained in Sect. 7.

2 Derivation of the Energy

A motivation for the quasi-local energy comes from the level set technique and certain integral formulae involving spacetime harmonic functions [15]. Consider a compact initial data set \((\Omega ,g,k)\). Recall that a function \(u\in C^{2}(\Omega )\) is spacetime harmonic if it satisfies the equation

$$\begin{aligned} \Delta _g u+(\textrm{Tr}_g k)|\nabla u|=0, \end{aligned}$$
(2.1)

and note that this arises as the trace of the spacetime Hessian

$$\begin{aligned} {\overline{\nabla }}_{ij} u:=\nabla _{ij}u+k_{ij}|\nabla u|. \end{aligned}$$
(2.2)

The following inequality was established in [1, Proposition 3.1].

Proposition 2.1

Let \((\Omega ,g,k)\) be a 3-dimensional compact initial data set with smooth boundary \(\partial \Omega \), having outward unit normal \(\nu \). Let \(u:\Omega \rightarrow {\mathbb {R}}\) be a spacetime harmonic function which lies in \(C^{2,\varsigma }(\Omega )\), \(0<\varsigma <1\), and consider the open subset \({\bar{\partial }}\Omega \) of the boundary on which \(|\nabla _{\partial } u|\ne 0\), where \(\nabla _{\partial }u\) is the projection of the full gradient onto the boundary tangent space. If \({\overline{u}}\) and \({\underline{u}}\) are the maximum and minimum values of u and \(\Sigma _s =u^{-1}(s)\), then

$$\begin{aligned} \begin{aligned}&\int _{\partial \Omega }\left( k(\nabla _{\partial }u,\nu )-|\nabla u|H-\nu (u)\textrm{Tr}_{\partial \Omega }k\right) dA\\&\qquad +\int _{{\bar{\partial }} \Omega }\frac{|\nabla _{\partial }u|}{|\nabla u|}\nabla _{\partial }u\left( \frac{\nu (u)}{|\nabla _{\partial }u|}\right) dA+2\pi \int _{{\underline{u}}}^{{\overline{u}}}\chi (\Sigma _s)ds \\&\quad \ge \int _{\Omega }\left( \frac{1}{2}\frac{|{\overline{\nabla }}^2 u|^2}{|\nabla u|}+\mu |\nabla u|+J(\nabla u)\right) dV, \end{aligned} \end{aligned}$$
(2.3)

where \(\chi (\Sigma _s)\) denotes the Euler characteristic, and H is the mean curvature of the boundary with respect to \(\nu \).

The right-hand side of (2.3) is nonnegative if the dominant energy condition holds, and this suggests investigating the boundary terms in relation to a quasi-local energy. In order to better interpret the boundary expression with regard to spacetime geometry, assume that the data arise from a spacetime \((\Omega ,g,k)\hookrightarrow N^{3,1}\) and let \({\textbf{n}}\) be the associated unit timelike future directed normal vector field. Consider the frame \(\{\nu ,{\textbf{n}}\}\) for the SO(1, 1) normal bundle over \(\partial \Omega \), and observe that the mean curvature vector and an auxiliary vector field are given by

$$\begin{aligned} \vec {H}=H\nu -(\textrm{Tr}_{\partial \Omega }k){\textbf{n}},\quad \quad \quad \vec {w}=|\nabla u|\nu +\nu (u){\textbf{n}}, \end{aligned}$$
(2.4)

with \(|\vec {w}|^2 =|\nabla _{\partial }u|^2\). Therefore, two of the boundary terms combine to form

$$\begin{aligned} \langle \vec {H},\vec {w}\rangle = |\nabla u|H+\nu (u)\textrm{Tr}_{\partial \Omega }k. \end{aligned}$$
(2.5)
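The identities \(|\vec {w}|^2=|\nabla _{\partial }u|^2\) and (2.5) are pointwise algebra in the Lorentzian frame \(\{\nu ,{\textbf{n}}\}\), so they can be checked numerically. The sketch below does this for one set of hypothetical boundary values; the helper `normal_inner` and all numbers are illustrative assumptions, and the only inputs from the text are (2.4) and the decomposition \(|\nabla u|^2=|\nabla _{\partial }u|^2+\nu (u)^2\).

```python
import math

def normal_inner(v, w):
    """Inner product of normal vectors written as (nu-component, n-component).

    The frame is orthonormal with <nu, nu> = 1 and <n, n> = -1.
    """
    return v[0] * w[0] - v[1] * w[1]

# Hypothetical pointwise boundary values, not taken from the paper.
H = 2.0            # mean curvature of the boundary with respect to nu
tr_k = 0.5         # trace of k over the boundary tangent space
grad_tan = 1.2     # |grad_boundary u|
nu_u = -0.7        # nu(u)
grad_full = math.hypot(grad_tan, nu_u)   # |grad u|

H_vec = (H, -tr_k)           # H nu - (Tr k) n, as in (2.4)
w_vec = (grad_full, nu_u)    # |grad u| nu + nu(u) n, as in (2.4)

w_norm_sq = normal_inner(w_vec, w_vec)   # should equal |grad_boundary u|^2
pairing = normal_inner(H_vec, w_vec)     # should equal |grad u| H + nu(u) Tr k

print(w_norm_sq, grad_tan ** 2)
print(pairing, grad_full * H + nu_u * tr_k)
```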

The remaining two boundary terms may be interpreted as follows. In the gauge determined by the current frame, the connection 1-form for the normal bundle is

$$\begin{aligned} \alpha _{\nu }(X)=-k(X,\nu )=\langle \nabla _{X}^{N} \nu ,{\textbf{n}}\rangle , \end{aligned}$$
(2.6)

where X is any tangent vector field to the boundary surface. Notice that a change of gauge to the frame determined by \(e_3=a\nu + b{\textbf{n}}\), \(e_4 =b\nu +a{\textbf{n}}\) with \(a=\cosh f\), \(b=\sinh f\) yields

$$\begin{aligned} \alpha _{e_3}(X)= & {} \langle \nabla _{X}^{N}e_3,e_4\rangle =a^2 \langle \nabla _{X}^{N}\nu ,{\textbf{n}}\rangle +b^2 \langle \nabla _{X}^{N}{\textbf{n}},\nu \rangle +X(a)b-X(b)a\nonumber \\= & {} \alpha _{\nu }(X)-X(f), \end{aligned}$$
(2.7)

for any function \(f\in C^1(\partial \Omega )\). Below, for simplicity of the discussion, we will assume that all calculations occur away from critical points of u restricted to the boundary. Then choosing \(X=\nabla _{\partial }u\) and \(f=\sinh ^{-1}\left( \nu (u)/|\nabla _{\partial }u|\right) \) produces

$$\begin{aligned} X(f)=\nabla _{\partial }u\left( \sinh ^{-1}\left( \frac{\nu (u)}{|\nabla _{\partial }u|}\right) \right) =\frac{|\nabla _{\partial }u|}{|\nabla u|}\nabla _{\partial }u\left( \frac{\nu (u)}{|\nabla _{\partial }u|}\right) . \end{aligned}$$
(2.8)
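Both the gauge term \(X(a)b-X(b)a=-X(f)\) used in (2.7) and the chain rule computation (2.8) can be sanity-checked with finite differences along a curve tangent to X. The sketch below is only an illustration under assumed data: the functions `nu_u` and `grad_tan` are hypothetical smooth profiles for \(\nu (u)\) and \(|\nabla _{\partial }u|\) along such a curve, parameterized by s, and derivatives in s stand in for the action of X.

```python
import math

def central_difference(func, s, h=1e-5):
    """Second-order finite-difference approximation of func'(s)."""
    return (func(s + h) - func(s - h)) / (2.0 * h)

# Hypothetical smooth data along an integral curve of X (not from the paper).
def nu_u(s):
    return 0.4 * math.sin(s) + 0.2

def grad_tan(s):
    return 1.0 + 0.3 * math.cos(s)      # stays positive, so no critical points

def ratio(s):
    return nu_u(s) / grad_tan(s)

def f(s):
    return math.asinh(ratio(s))         # the hyperbolic angle chosen before (2.8)

def a(s):
    return math.cosh(f(s))

def b(s):
    return math.sinh(f(s))

s0 = 0.8

# Gauge term in (2.7): X(a) b - X(b) a should equal -X(f).
gauge_term = central_difference(a, s0) * b(s0) - central_difference(b, s0) * a(s0)
minus_Xf = -central_difference(f, s0)

# Identity (2.8): X(f) = (|grad_boundary u| / |grad u|) X(nu(u) / |grad_boundary u|).
grad_full = math.hypot(grad_tan(s0), nu_u(s0))
lhs_2_8 = central_difference(f, s0)
rhs_2_8 = grad_tan(s0) / grad_full * central_difference(ratio, s0)

print(gauge_term, minus_Xf)
print(lhs_2_8, rhs_2_8)
```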

Hence, combining (2.6), (2.7), and (2.8) shows that

$$\begin{aligned} \alpha _{e_3}(\nabla _{\partial }u)= & {} \alpha _{\nu }(\nabla _{\partial }u)-\nabla _{\partial }u\left( \sinh ^{-1}\left( \frac{\nu (u)}{|\nabla _{\partial }u|}\right) \right) \nonumber \\= & {} -k(\nabla _{\partial }u,\nu )-\frac{|\nabla _{\partial }u|}{|\nabla u|}\nabla _{\partial }u\left( \frac{\nu (u)}{|\nabla _{\partial }u|}\right) . \end{aligned}$$
(2.9)

Observe that the vector \(\vec {w}\) is proportional to a special case (when \(\varepsilon =0\)) of the first member of the level set frame

$$\begin{aligned} e'_3= & {} \frac{\sqrt{|\nabla u|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u|^2+\varepsilon ^2}}\,\nu + \frac{\nu (u)}{\sqrt{|\nabla _{\partial }u|^2+\varepsilon ^2}}\,{\textbf{n}},\nonumber \\ e'_4= & {} \frac{\nu (u)}{\sqrt{|\nabla _{\partial }u|^2+\varepsilon ^2}}\,\nu +\frac{\sqrt{|\nabla {u}|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u|^2+\varepsilon ^2}}\,{\textbf{n}}. \end{aligned}$$
(2.10)

Here, the parameter \(\varepsilon >0\) is included to avoid the technical issue of critical points for the restriction of u to \(\partial \Omega \). The computations (2.5) and (2.9) then motivate us to define the quasi-local Hamiltonian density as

$$\begin{aligned} {\mathcal {H}}(e'_3,u) =\sqrt{|\nabla _\partial u|^2+\varepsilon ^2}\,\langle \vec {H},e'_3\rangle +\alpha _{e'_3}(\nabla _\partial u), \end{aligned}$$
(2.11)

with respect to the level set frame. The expression for quasi-local energy \(E(\Sigma ,\iota ,u_a)\) in (1.4) follows by choosing an ‘optimal’ frame, and comparing to an appropriate Hamiltonian density in the ground state.

3 A Hamilton–Jacobi Interpretation of the Energy

In this section we will describe the relationship between the quasi-local energy defined in the introduction, and a Hamilton–Jacobi analysis. Recall that Brown and York [6] and Hawking and Horowitz [14] derived a Hamiltonian for closed spacelike 2-surfaces \(\Sigma \hookrightarrow N^{3,1}\) which enclose a compact initial data set \((\Omega ,g,k)\). As before let \({\textbf{n}}\) be the unit timelike future directed normal to the data, and let \(\nu \) be the unit outer normal to \(\Sigma \) with respect to \(\Omega \). If \(T=\varphi {\textbf{n}} +Y\) is a timelike vector field along \(\Sigma \), representing an observer with lapse \(\varphi \) and shift Y, then the surface Hamiltonian takes the form

$$\begin{aligned} {\textbf{H}}(\Sigma ,T,{\textbf{n}})=-\frac{1}{8\pi }\int _{\Sigma }\left( \varphi H-k(\nu ,Y)+(\text {Tr}_{g}k)g(\nu ,Y)\right) dA, \end{aligned}$$
(3.1)

where dA is the area element on \(\Sigma \). As pointed out by Wang and Yau [37], this Hamiltonian may be reexpressed with the aid of the vector field

$$\begin{aligned} P=H{\textbf{n}}+k(\nu )-(\text {Tr}_{g}k)\nu , \end{aligned}$$
(3.2)

so that

$$\begin{aligned} {\textbf{H}}(\Sigma ,T,{\textbf{n}})=\frac{1}{8\pi }\int _{\Sigma }\langle P,T\rangle dA. \end{aligned}$$
(3.3)

Note that P is perpendicular to the mean curvature vector \(\vec {H}=H\nu -(\text {Tr}_\Sigma k) {\textbf{n}}\). The associated energy is then defined by choosing a reference Hamiltonian, determined by an isometric embedding \(\iota :\Sigma \hookrightarrow {\mathbb {R}}^{3,1}\) and corresponding vector fields \(T_0\) and \({\textbf{n}}_0\) along the image in Minkowski space, namely

$$\begin{aligned} {\textbf{E}}(\Sigma )={\textbf{H}}(\Sigma ,T,{\textbf{n}})-{\textbf{H}}_0(\iota (\Sigma ),T_0,{\textbf{n}}_0). \end{aligned}$$
(3.4)

This notion of energy depends on the choices of T, \({\textbf{n}}\) and \(T_0\), \({\textbf{n}}_0\), as well as on the isometric embedding. Observe that if the isometric embedding lands in a time slice \({\mathbb {R}}^3\subset {\mathbb {R}}^{3,1}\) with normal \({\textbf{n}}_0\), and the choices \(T={\textbf{n}}\) and \(T_0={\textbf{n}}_0\) are made, then the typical expression for the Brown–York mass is recovered; it depends on the initial data \(\Omega \) and hence on \({\textbf{n}}\). The Liu-Yau mass is obtained with the same prescription, except that \({\textbf{n}}\) is taken to satisfy \(\langle \vec {H},{\textbf{n}}\rangle =0\). Furthermore, given an admissible time function \(\tau \) on \(\Sigma \) associated with an isometric embedding, the Wang-Yau energy is produced by setting

$$\begin{aligned} T_0 = \sqrt{1+|\nabla _{\partial }\tau |^2}{\textbf{n}}_0 -\nabla _{\partial }\tau , \end{aligned}$$
(3.5)

and choosing \({\textbf{n}}_0\) so that \(\{\nu _0,{\textbf{n}}_0\}\) is the unique frame for the normal bundle of \(\iota (\Sigma )\) satisfying

$$\begin{aligned} \langle \vec {H}_0,\nu _0\rangle >0,\qquad \quad \langle \vec {H}_0,{\textbf{n}}_0\rangle =\frac{-\Delta _\partial \tau }{\sqrt{1+|\nabla _\partial \tau |^2}}, \end{aligned}$$
(3.6)

while T is set to have the same lapse and shift as \(T_0\) and \({\textbf{n}}\) is chosen to satisfy conditions analogous to (3.6) in \(N^{3,1}\).

In order to apply these considerations to the quasi-local energy introduced in Sect. 1, let \(({\textbf{t}},{\textbf{x}}^i)\), \(i=1,2,3\) be coordinates for the reference Minkowski space, and pull back a linear null function \(u_a=\iota ^*(-{\textbf{t}}+a_i{\textbf{x}}^i)\) to \(\Sigma \). For each \(\varepsilon >0\) we then set

$$\begin{aligned} T_0 = \sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}{\textbf{n}}_0 +\nabla _{\partial }u_a, \end{aligned}$$
(3.7)

and choose \({\textbf{n}}_0\) so that \(\{\nu _0,{\textbf{n}}_0\}\) is the unique frame for the normal bundle of \(\iota (\Sigma )\) satisfying

$$\begin{aligned} \langle \vec {H}_0,\nu _0\rangle >0,\qquad \quad \langle \vec {H}_0,{\textbf{n}}_0\rangle =\frac{-\Delta _\partial u_a}{\sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}}. \end{aligned}$$
(3.8)

Moreover, T is set to have the same lapse and shift as \(T_0\), and \({\textbf{n}}\) is chosen to satisfy conditions analogous to (3.8) in \(N^{3,1}\). Note that T and \(T_0\) are timelike with \(|T|^2=|T_0|^2=-\varepsilon ^2\), and are approaching null vectors as \(\varepsilon \rightarrow 0\). Observe that writing \(\langle \vec {H},\nu \rangle =H\) and \(\langle \vec {H},{\textbf{n}}\rangle =\text {Tr}_\Sigma k\) gives rise to

$$\begin{aligned} \begin{aligned} {\textbf{H}}^{\varepsilon }(\Sigma ,T,{\textbf{n}})&:=\frac{1}{8\pi }\int _{\Sigma }\langle P,T\rangle dA\\&=\frac{1}{8\pi }\int _{\Sigma }\left( -H\sqrt{|\nabla _\partial u_a|^2+\varepsilon ^2}+k(\nu ,\nabla _\partial u_a)\right) dA\\&=\frac{1}{8\pi }\int _{\Sigma }\left( -\sqrt{|\nabla _\partial u_a|^2+\varepsilon ^2}\langle \vec {H},\nu \rangle -\alpha _{\nu }(\nabla _\partial u_a)\right) dA\\&=-\frac{1}{8\pi }\int _{\Sigma }{\mathcal {H}}(\nu , u_a) dA, \end{aligned} \end{aligned}$$
(3.9)

and similarly for the reference Hamiltonian. Therefore, the new quasi-local energy arises from Hamiltonians by taking a limit as the observer approaches a null direction

$$\begin{aligned} E(\Sigma ,\iota ,u_a)=\lim _{\varepsilon \rightarrow 0}\left( {\textbf{H}}^{\varepsilon }(\Sigma ,T,{\textbf{n}}) -{\textbf{H}}^{\varepsilon }_0(\iota (\Sigma ),T_0,{\textbf{n}}_0)\right) , \end{aligned}$$
(3.10)

where in terms of the notation of the introduction we have \(\{{\bar{e}}_3,{\bar{e}}_4\}=\{\nu ,{\textbf{n}}\}\) and \(\{{\hat{e}}_3,{\hat{e}}_4\}=\{\nu _0,{\textbf{n}}_0\}\).
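The two pointwise facts used in this section, namely \(|T|^2=-\varepsilon ^2\) for the observer built as in (3.7), and \(\langle P,T\rangle =-H\sqrt{|\nabla _\partial u_a|^2+\varepsilon ^2}+k(\nu ,\nabla _\partial u_a)\) from the second line of (3.9), can be checked in an orthonormal frame. The sketch below assumes a frame \((e_1,e_2,\nu ,{\textbf{n}})\) at a point with \(e_1,e_2\) tangent to \(\Sigma \); the helper names and all numerical values are hypothetical.

```python
import math

def inner(v, w):
    """Inner product in an orthonormal frame (e1, e2, nu, n) of signature (+,+,+,-)."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]

def observer(grad_u, eps):
    """T = sqrt(|grad u|^2 + eps^2) n + grad u, with grad u tangent to the surface."""
    lapse = math.sqrt(grad_u[0] ** 2 + grad_u[1] ** 2 + eps ** 2)
    return (grad_u[0], grad_u[1], 0.0, lapse)

def momentum_vector(H, k_nu, tr_g_k):
    """P = H n + k(nu, .) - (Tr_g k) nu, where k_nu = (k(nu,e1), k(nu,e2), k(nu,nu))."""
    return (k_nu[0], k_nu[1], k_nu[2] - tr_g_k, H)

# Hypothetical pointwise values, not taken from the paper.
grad_u = (0.6, -1.1)
H, k_nu, tr_g_k = 2.0, (0.3, -0.2, 0.4), 0.9
P = momentum_vector(H, k_nu, tr_g_k)

for eps in (1.0, 0.1, 0.001):
    T = observer(grad_u, eps)
    # Second line of (3.9): -H * lapse + k(nu, grad u).
    expected = -H * T[3] + k_nu[0] * grad_u[0] + k_nu[1] * grad_u[1]
    print(eps, inner(T, T), inner(P, T) - expected)
```

The printed norms approach zero from below as the parameter shrinks, which reflects the observer tending to a null direction.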

4 Proof of Nonnegativity

The purpose of the current section is to establish the inequality portion of Theorem 1.1. As before, given a linear null function \(u_a\) on a spacelike 2-surface \(\Sigma \) in spacetime \(N^{3,1}\), consider the normal bundle frame \(\{{\bar{e}}_3,{\bar{e}}_4\}\) defined by

$$\begin{aligned} \langle \vec {H},{\bar{e}}_3\rangle >0,\qquad \quad \langle \vec {H},{\bar{e}}_4\rangle =\frac{-\Delta _\partial u_a}{\sqrt{|\nabla _\partial u_a|^2+\varepsilon ^2}}, \end{aligned}$$
(4.1)

for \(\varepsilon >0\). We begin with a preliminary result, similar to [38, Proposition 2.1] for the Wang-Yau quasi-local energy, which demonstrates how this may be interpreted as an optimal frame.

Lemma 4.1

If the mean curvature vector \(\vec {H}\) of \(\Sigma \) is spacelike, and \(\{e_3,e_4\}\) is any frame for the normal bundle of \(\Sigma \) with the properties that \(e_3\) is spacelike and \(\langle \vec {H},e_3\rangle >0\), then

$$\begin{aligned} \int _{\Sigma }{\mathcal {H}}(e_3,u_a)dA\ge \int _{\Sigma }{\mathcal {H}}({\bar{e}}_3,u_a) dA \end{aligned}$$
(4.2)

for all \(\varepsilon >0\).

Proof

Consider the following functional which sends normal bundle frames to the real numbers

$$\begin{aligned} \{e_3,e_4\} \mapsto \int _{\Sigma }{\mathcal {H}}(e_3,u_a)dA =\int _{\Sigma }\left( \sqrt{|\nabla _\partial u_a|^2+\varepsilon ^2}\,\langle \vec {H},e_3\rangle +\alpha _{e_3}(\nabla _\partial u_a)\right) dA.\nonumber \\ \end{aligned}$$
(4.3)

Since the normal bundle of \(\Sigma \) is rank 2 with structure group SO(1, 1), each frame may be given by a hyperbolic angle function. In particular, because \(\vec {H}\) is spacelike and \(\langle \vec {H},e_3\rangle >0\), we may express the frame defined by the mean curvature vector as

$$\begin{aligned} {\tilde{e}}_3:=\frac{\vec {H}}{|\vec {H}|}=(\cosh f) e_3+(\sinh f) e_4,\qquad {\tilde{e}}_4:=(\sinh f) e_3+(\cosh f) e_4, \end{aligned}$$
(4.4)

for some \(f\in C^{\infty }(\Sigma )\). It follows that

$$\begin{aligned} \langle \vec {H},{e}_3\rangle =|\vec {H}|\cosh f,\qquad \langle \vec {H},{e}_4\rangle =-|\vec {H}|\sinh f. \end{aligned}$$
(4.5)

Furthermore, using Eqs. (2.7) and (4.4) produces

$$\begin{aligned} \alpha _{e_3}(\nabla _{\partial }u_a)=\alpha _{{\tilde{e}}_3}(\nabla _{\partial }u_a)+\nabla _{\partial }u_a\cdot \nabla _{\partial } f. \end{aligned}$$
(4.6)

Therefore, the functional of (4.3) can be rewritten as

$$\begin{aligned} \begin{aligned} f \mapsto&\int _{\Sigma }\left( \sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}|\vec {H}|\cosh f+\nabla _{\partial }u_a\cdot \nabla _{\partial } f+\alpha _{{\tilde{e}}_3}(\nabla _{\partial }u_a)\right) dA\\ =&\int _{\Sigma }\left( \sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}|\vec {H}|\cosh f-f\Delta _{\partial }u_a+\alpha _{{\tilde{e}}_3}(\nabla _{\partial }u_a)\right) dA. \end{aligned} \end{aligned}$$
(4.7)

Since \(|\vec {H}|>0\) this functional is convex, and it may be easily checked that the minimum occurs when

$$\begin{aligned} |\vec {H}|\sinh f=\frac{\Delta _{\partial }u_a}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}. \end{aligned}$$
(4.8)

Hence, (4.5) shows that the minimum is achieved at the frame \(\{{\bar{e}}_3,{\bar{e}}_4\}\). \(\square \)
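Because the second line of (4.7) contains no derivatives of f, the minimization can be examined pointwise. The sketch below illustrates this for the scalar function \(f\mapsto A\cosh f-Bf\), where A is a stand-in for \(\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}\,|\vec {H}|>0\) and B for \(\Delta _{\partial }u_a\) at a single point. The function names and sample values are hypothetical; the closed form \(\sqrt{A^2+B^2}-B\sinh ^{-1}(B/A)\) for the minimum is elementary calculus, and its first term has the same shape as the square root terms in (1.7).

```python
import math

def pointwise_integrand(f, A, B):
    """f-dependent part of the integrand in (4.7) at one point: A cosh f - B f."""
    return A * math.cosh(f) - B * f

def optimal_angle(A, B):
    """Critical point of A cosh f - B f, i.e. sinh f = B / A as in (4.8)."""
    return math.asinh(B / A)

def minimum_value(A, B):
    """Value at the critical point: sqrt(A^2 + B^2) - B asinh(B / A)."""
    return math.hypot(A, B) - B * math.asinh(B / A)

# Hypothetical pointwise values, not taken from the paper.
A, B = 1.7, -0.9
f_star = optimal_angle(A, B)

print(pointwise_integrand(f_star, A, B), minimum_value(A, B))
print(pointwise_integrand(f_star + 0.1, A, B), pointwise_integrand(f_star - 0.1, A, B))
```

Convexity of the hyperbolic cosine with A positive is what makes the critical point a global minimum, so perturbing the angle in either direction increases the value.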

Let \((\Omega , g,k)\) be the initial data set for a compact spacelike hypersurface in \(N^{3,1}\) which is enclosed by \(\Sigma \), and let \(u_a \in C^{\infty }(\Sigma )\). Consider the unique solution of the spacetime harmonic Dirichlet problem

$$\begin{aligned} \Delta _g u+(\text {Tr}_{g}k)|\nabla u|=0\quad \text { in }\Omega ,\qquad u=u_a\quad \text { on }\partial \Omega =\Sigma . \end{aligned}$$
(4.9)

The existence of a unique solution \(u\in C^{2,\varsigma }(\Omega )\) for any \(\varsigma \in (0,1)\) follows in a straightforward manner from the results of [15, Section 4.1]. We may then define a level set frame for the normal bundle of \(\Sigma \) by

$$\begin{aligned} {e}'_3= & {} \frac{\sqrt{|\nabla u|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\nu + \frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}{\textbf{n}},\nonumber \\ {e}'_4= & {} \frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\nu +\frac{\sqrt{|\nabla {u}|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}{\textbf{n}}, \end{aligned}$$
(4.10)

for each \(\varepsilon >0\) and where \(\{\nu ,{\textbf{n}}\}\) is the normal bundle frame determined by \(\Omega \) in which \(\nu \) is the outer normal to \(\partial \Omega \) and \({\textbf{n}}\) is future directed timelike. Note that since the mean curvature vector of \(\Sigma \) is outward pointing spacelike, we have

$$\begin{aligned} H=\langle \vec {H},\nu \rangle >|\langle \vec {H},{\textbf{n}}\rangle |=|\textrm{Tr}_{\Sigma }k| \end{aligned}$$
(4.11)

and therefore

$$\begin{aligned} \begin{aligned} \langle \vec {H},{e}'_3\rangle&=\frac{\sqrt{|\nabla u|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}H +\frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}} \text {Tr}_{\Sigma }k\\&> \frac{\sqrt{|\nabla u|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}} |\text {Tr}_{\Sigma }k| +\frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\text {Tr}_{\Sigma }k\\&\ge \frac{|\nu (u)|}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}} |\text {Tr}_{\Sigma }k| +\frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\text {Tr}_{\Sigma }k\\&\ge 0. \end{aligned} \end{aligned}$$
(4.12)

This allows for an application of Lemma 4.1 to conclude that

$$\begin{aligned} \int _{\Sigma }{\mathcal {H}}({e}'_3,u_a)\, dA\ge \int _{\Sigma }{\mathcal {H}}({\bar{e}}_3,u_a)\, dA. \end{aligned}$$
(4.13)
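
Two elementary facts are used implicitly in (4.10) and (4.12). The sketch below records them, assuming only that on \(\Sigma \) the gradient of \(u\) splits orthogonally into tangential and normal parts, so that \(|\nabla u|^2=|\nabla _{\partial }u_a|^2+\nu (u)^2\).

$$\begin{aligned} \frac{|\nabla u|^2+\varepsilon ^2}{|\nabla _{\partial }u_a|^2+\varepsilon ^2}-\frac{\nu (u)^2}{|\nabla _{\partial }u_a|^2+\varepsilon ^2}=1,\qquad \sqrt{|\nabla u|^2+\varepsilon ^2}\ge |\nu (u)|. \end{aligned}$$

The first identity shows that the coefficients in (4.10) have the form \(\cosh \) and \(\sinh \) of a single function, so \(\{e'_3,e'_4\}\) is obtained from \(\{\nu ,{\textbf{n}}\}\) by a boost and is again an orthonormal frame. The second inequality is what passes from the second to the third line of (4.12), while the final line of (4.12) is the elementary bound \(|x||y|+xy\ge 0\) for real numbers \(x\), \(y\).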

Inequality (4.13) and the following estimate will be used together to show nonnegativity of the quasi-local energy.

Lemma 4.2

Let \((\Omega , g,k)\) be initial data for a compact spacelike hypersurface with boundary \(\partial \Omega =\Sigma \) in spacetime \(N^{3,1}\), and let \(u_a \in C^{\infty }(\Sigma )\) and \(\varepsilon >0\). If \(u\in C^{2,\varsigma }(\Omega )\) is the spacetime harmonic solution of (4.9) with associated level set frame \(\{e'_3,e'_4\}\), then

$$\begin{aligned}{} & {} \lim _{\varepsilon \rightarrow 0}\int _{\Sigma }{\mathcal {H}}(e'_3,u_a) dA+\int _{\Omega }\left( \frac{1}{2}\frac{|{\nabla }^2 u+k|\nabla u||^2}{|\nabla u|}+\mu |\nabla u|+J(\nabla u)\right) dV\nonumber \\{} & {} \quad \le 2\pi \int _{{\underline{u}}}^{{\overline{u}}}\chi (\Sigma _s)ds, \end{aligned}$$
(4.14)

where \(\Sigma _s=u^{-1}(s)\) and \({\overline{u}}\), \({\underline{u}}\) represent the maximum and minimum values of u. Furthermore, the limit in this expression exists and is finite.

Proof

Let \(f_{\varepsilon }\in C^{\infty }(\Sigma )\) be such that

$$\begin{aligned} \cosh f_{\varepsilon }=\frac{\sqrt{|\nabla u|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}},\qquad \sinh f_{\varepsilon }=\frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}, \end{aligned}$$
(4.15)

then (2.6) and (2.7) imply

$$\begin{aligned} \alpha _{e'_3}(\nabla _{\partial }u_a)=\alpha _{\nu }(\nabla _{\partial }u_a)-\nabla _{\partial }u_a\cdot \nabla _{\partial } f_{\varepsilon } =-k(\nabla _{\partial }u_a,\nu )-\nabla _{\partial }u_a\cdot \nabla _{\partial } f_{\varepsilon }. \end{aligned}$$
(4.16)

It follows that

$$\begin{aligned} \begin{aligned} {\mathcal {H}}(e'_3,u_a)&=\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}\langle \vec {H},{e}'_3\rangle +\alpha _{e'_3}(\nabla _{\partial }u_a)\\&=\sqrt{|\nabla u|^2+\varepsilon ^2}H +\nu (u)\text {Tr}_{\Sigma }k\\&\quad -k(\nabla _{\partial }u_a,\nu )-\nabla _{\partial }u_a \left( \sinh ^{-1}\frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\right) \\&=\sqrt{|\nabla u|^2 \!+\!\varepsilon ^2}H \!+\!\nu (u)\text {Tr}_{\Sigma }k\\&\quad - k(\nabla _{\partial }u_a,\nu ) \!-\!\frac{\sqrt{|\nabla _{\partial }u_a|^2 \!+\!\varepsilon ^2}}{\sqrt{|\nabla {u}|^2 \!+\!\varepsilon ^2}}\nabla _{\partial }u_a\left( \!\frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2 \!+\!\varepsilon ^2}}\!\right) . \end{aligned} \end{aligned}$$
(4.17)

Since the last term in this expression may be estimated by

$$\begin{aligned}{} & {} \frac{\sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}}{\sqrt{|\nabla {u}|^2 +\varepsilon ^2}} \Bigg |\frac{\nabla _{\partial }u_a\cdot \nabla _{\partial }\left( \nu (u)\right) }{\sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}} -\frac{\nu (u)\nabla _{\partial }^2u_a(\nabla _{\partial }u_a,\nabla _{\partial }u_a)}{\left( |\nabla _{\partial }u_a|^2+\varepsilon ^2\right) ^{3/2}}\Bigg |\nonumber \\{} & {} \quad \le |\nabla ^2 u|+|II||\nabla _{\partial }u_a|+|\nabla _{\partial }^2 u_a|, \end{aligned}$$
(4.18)

where II is the second fundamental form of \(\Sigma \) as a submanifold of \((\Omega ,g)\), we may apply the dominated convergence theorem to conclude that the relevant limit exists, is finite, and satisfies

$$\begin{aligned} \begin{aligned} \lim _{\varepsilon \rightarrow 0}\int _{\Sigma }{\mathcal {H}}(e'_3,u_a) dA =&\int _{\partial \Omega }\left( |\nabla u|H+\nu (u)\text {Tr}_\Sigma k -k(\nabla _{\partial } u_a,\nu )\right) dA\\&-\int _{{\bar{\partial }}\Omega }\frac{|\nabla _{\partial }u_a|}{|\nabla u|}\nabla _{\partial }u_a\left( \frac{\nu (u)}{|\nabla _{\partial }u_a|}\right) dA. \end{aligned} \end{aligned}$$
(4.19)

Here \({\bar{\partial }}\Omega \) denotes the open set of points within \(\partial \Omega \) on which \(|\nabla _{\partial }u_a|\ne 0\). The desired result now follows from Proposition 2.1. \(\square \)
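
The passage to the last expression in (4.17) is a chain rule computation, which may be sketched as follows. The shorthand \(w_{\varepsilon }\) is introduced here for illustration only.

$$\begin{aligned} w_{\varepsilon }:=\frac{\nu (u)}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}},\qquad f_{\varepsilon }=\sinh ^{-1}w_{\varepsilon },\qquad \sqrt{1+w_{\varepsilon }^2}=\cosh f_{\varepsilon }=\frac{\sqrt{|\nabla u|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}. \end{aligned}$$

$$\begin{aligned} \nabla _{\partial }u_a\cdot \nabla _{\partial }f_{\varepsilon }=\frac{\nabla _{\partial }u_a(w_{\varepsilon })}{\sqrt{1+w_{\varepsilon }^2}} =\frac{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}{\sqrt{|\nabla u|^2+\varepsilon ^2}}\nabla _{\partial }u_a(w_{\varepsilon }). \end{aligned}$$

Expanding \(\nabla _{\partial }u_a(w_{\varepsilon })\) with the quotient rule then produces the two terms inside the absolute value of (4.18):

$$\begin{aligned} \nabla _{\partial }u_a(w_{\varepsilon })=\frac{\nabla _{\partial }u_a\cdot \nabla _{\partial }\left( \nu (u)\right) }{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}} -\frac{\nu (u)\nabla _{\partial }^2u_a(\nabla _{\partial }u_a,\nabla _{\partial }u_a)}{\left( |\nabla _{\partial }u_a|^2+\varepsilon ^2\right) ^{3/2}}. \end{aligned}$$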

Proof of Theorem 1.1: Nonnegativity

A direct application of (4.13), together with Lemma 4.2 and Lemma 5.1 (proved in the next section), yields

$$\begin{aligned} \begin{aligned} E(\Sigma ,\iota ,u_a) =&\lim _{\varepsilon \rightarrow 0}\frac{1}{8\pi }\int _{\Sigma }\left( {\mathcal {H}}_0({\hat{e}}_3,u_a)-{\mathcal {H}}({\bar{e}}_3,u_a)\right) dA\\ \ge&\lim _{\varepsilon \rightarrow 0}\frac{1}{8\pi }\int _{\Sigma }\left( {\mathcal {H}}_0({\hat{e}}_3,u_a)- {\mathcal {H}}({e}'_3,u_a)\right) dA\\ \ge&\frac{1}{8\pi }\int _{\Omega }\left( \frac{1}{2}\frac{|{\nabla }^2 u+k|\nabla u||^2}{|\nabla u|}+\mu |\nabla u|+J(\nabla u)\right) dV\\&+\frac{1}{4} \int _{{\underline{u}}_a}^{{\overline{u}}_a}\chi ({\hat{\Sigma }}_s)ds -\frac{1}{4} \int _{{\underline{u}}}^{{\overline{u}}}\chi (\Sigma _s)ds, \end{aligned} \end{aligned}$$
(4.20)

where \({\hat{\Sigma }}_s\) are the level sets of the relevant null linear function in Minkowski space restricted to the fill-in \({\hat{\Omega }}\) of \(\iota (\Sigma )\). By the maximum principle for (4.9) we find that \({\overline{u}}={\overline{u}}_a\) and \({\underline{u}}={\underline{u}}_a\). Furthermore, analogous arguments to those used in [15, Proposition 5.2] show that the trivial homology hypothesis \(H_2(\Omega ;{\mathbb {Z}})=0\) guarantees that each component of any regular level \(\Sigma _s\) for u must intersect the boundary \(\partial \Omega \). Hence, \(\chi (\Sigma _s)\le n\) where n is the number of components of the s-level set for \(u_a=u|_{\partial \Omega }\). On the other hand, the admissibility condition ensures that \(\chi ({\hat{\Sigma }}_s)=n\), so the difference of Euler characteristic integrals in (4.20) is nonnegative. The dominant energy condition \(\mu \ge |J|\) then gives \(E(\Sigma ,\iota ,u_a)\ge 0\).
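
For completeness, the two small steps behind the last sentences may be sketched as follows, using only the Cauchy-Schwarz inequality and the normalization of the energy by \(8\pi \).

$$\begin{aligned} \mu |\nabla u|+J(\nabla u)\ge \left( \mu -|J|\right) |\nabla u|\ge 0,\qquad \frac{1}{8\pi }\cdot 2\pi \int _{{\underline{u}}}^{{\overline{u}}}\chi (\Sigma _s)ds=\frac{1}{4}\int _{{\underline{u}}}^{{\overline{u}}}\chi (\Sigma _s)ds. \end{aligned}$$

The first inequality shows that the bulk integrand in (4.20) is pointwise nonnegative under the dominant energy condition, since the Hessian term is a square. The second identity explains the coefficient \(\tfrac{1}{4}\) multiplying the Euler characteristic integrals coming from (4.14) and (5.5).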

It remains to show that the limit defining the quasi-local energy exists and is finite. From the proof of Lemma 4.1 it follows that

$$\begin{aligned} \begin{aligned} {\mathcal {H}}({\bar{e}}_3,u_a)=&\sqrt{|\vec {H}|^2 (|\nabla _{\partial } u_a|^2 +\varepsilon ^2)+(\Delta _{\partial }u_a)^2}+\alpha _{{\tilde{e}}_3}(\nabla _{\partial }u_a)\\&+\nabla _{\partial }u_a \cdot \nabla _{\partial }\left( \sinh ^{-1}\frac{\Delta _{\partial }u_a}{|\vec {H}|\sqrt{|\nabla _{\partial } u_a|^2 +\varepsilon ^2}}\right) . \end{aligned} \end{aligned}$$
(4.21)

The last term in this expression may be estimated by

$$\begin{aligned} \begin{aligned}&\Bigg |\frac{\nabla _{\partial }u_a (\Delta _{\partial }u_a)-(\Delta _{\partial }u_a)\nabla _{\partial }u_a(\log |\vec {H}|)\!-\!(\Delta _{\partial }u_a)\nabla ^2_{\partial }u_a(\nabla _{\partial }u_a,\!\nabla _{\partial }u_a)(|\nabla _{\partial } u_a|^2 \!+\!\varepsilon ^2)^{-1}}{\sqrt{|\vec {H}|^2 (|\nabla _{\partial } u_a|^2 \!+\!\varepsilon ^2)\!+\!(\Delta _{\partial }u_a)^2}}\Bigg |\\&\qquad \le |\vec {H}|^{-1}|\nabla _{\partial }^3 u_a|+|\nabla _{\partial }u_a||\nabla _{\partial }\log |\vec {H}||+|\nabla _{\partial }^2 u_a|. \end{aligned} \end{aligned}$$
(4.22)

We may then apply the dominated convergence theorem to find that the limit exists, is finite, and satisfies

$$\begin{aligned} \begin{aligned} \lim _{\varepsilon \rightarrow 0}\int _{\Sigma }{\mathcal {H}}({\bar{e}}_3,u_a)dA =&\int _{\Sigma }\left( \sqrt{|\vec {H}|^2 |\nabla _{\partial } u_a|^2 \!+\!(\Delta _{\partial }u_a)^2}+\alpha _{{\tilde{e}}_3}(\nabla _{\partial }u_a)\right) dA\\&+\int _{{\tilde{\Sigma }}}\nabla _{\partial }u_a \cdot \nabla _{\partial }\left( \sinh ^{-1}\frac{\Delta _{\partial }u_a}{|\vec {H}||\nabla _{\partial } u_a|}\right) dA, \end{aligned} \end{aligned}$$
(4.23)

where \({\tilde{\Sigma }}\) denotes the open subset of \(\Sigma \) on which \(|\nabla _{\partial }u_a|\ne 0\). Similar arguments hold for the limit of the reference Hamiltonian. \(\square \)

Nonnegativity with a disconnected surface

In Remark 1.2, it was stated that nonnegativity of the energy holds in the more general circumstance of a disconnected \(\Sigma \) and without the homology assumption on \(\Omega \), when a suitable modification of the admissibility condition is enforced. More precisely, in this setting we may define a pair \((\iota ,u_a)\) to be admissible for \(\Sigma \subset N^{3,1}\) if the mean curvature vector of \(\iota (\Sigma )\) is outward pointing spacelike, and there exist compact spacelike hypersurfaces \({\hat{\Omega }}\subset {\mathbb {R}}^{3,1}\), \(\Omega \subset N^{3,1}\) with \(\partial {\hat{\Omega }}=\iota (\Sigma )\), \(\partial \Omega =\Sigma \) such that

$$\begin{aligned} \int _{{\underline{u}}_a}^{{\overline{u}}_a}\left( \chi ({\hat{\Sigma }}_s)-\chi (\Sigma _s)\right) ds \ge 0, \end{aligned}$$
(4.24)

where the level sets \({\hat{\Sigma }}_s\), \(\Sigma _s\) are defined as above. Since (4.13), Lemma 4.2, and Lemma 5.1 continue to hold under the more general hypotheses presented here, inequality (4.20) again implies that \(E(\Sigma ,\iota ,u_a)\ge 0\). \(\square \)

5 Proof of Rigidity

The purpose of the current section is to establish the rigidity statement of Theorem 1.1. We will first compute the surface Hamiltonian for surfaces in Minkowski space. Let \({\hat{\Sigma }}=\iota (\Sigma )\subset {\mathbb {R}}^{3,1}\) be a spacelike 2-surface, and assume that \((\iota ,u_a)\) is admissible. There is then a compact spacelike hypersurface \(({\hat{\Omega }},{\hat{g}},{\hat{k}})\) in Minkowski space with boundary \(\partial {\hat{\Omega }}={\hat{\Sigma }}\). Let \(\{{\hat{\nu }},\hat{{\textbf{n}}}\}\) be a normal bundle frame for \({\hat{\Sigma }}\), where \({\hat{\nu }}\) is the outer normal with respect to \({\hat{\Omega }}\) and \(\hat{{\textbf{n}}}\) is future directed timelike. For \(\varepsilon >0\) consider the level set frame

$$\begin{aligned} {\hat{e}}'_3= & {} \frac{\sqrt{|\nabla {\hat{u}}|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}{\hat{\nu }} + \frac{{\hat{\nu }}({\hat{u}})}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\hat{{\textbf{n}}},\qquad \nonumber \\ {\hat{e}}'_4= & {} \frac{{\hat{\nu }}({\hat{u}})}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}{\hat{\nu }} +\frac{\sqrt{|\nabla {\hat{u}}|^2+\varepsilon ^2}}{\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\hat{{\textbf{n}}}, \end{aligned}$$
(5.1)

where \({\hat{u}}=(-{\textbf{t}}+a_i {\textbf{x}}^i)|_{{\hat{\Omega }}}\) is the restriction of the null linear function to the hypersurface, \(u_a\) is the restriction of \({\hat{u}}\) to \({\hat{\Sigma }}\), with \(\nabla \) and \(\nabla _{\partial }\) denoting the connections on \({\hat{\Omega }}\) and \({\hat{\Sigma }}\) respectively. Note that since the spacetime gradient of the linear function is null it holds that \(\hat{{\textbf{n}}}(-{\textbf{t}}+a_i {\textbf{x}}^i)=-|\nabla {\hat{u}}|\). Moreover, linearity of this function implies that

$$\begin{aligned} \begin{aligned} 0={{\hat{\square }}}(-{\textbf{t}}+a_i {\textbf{x}}^i)&= \left( {\hat{\nabla }}_{{\hat{\nu }}{\hat{\nu }}}+{\hat{\nabla }}_{\hat{{\textbf{n}}}\hat{{\textbf{n}}}}+\vec {H}_0+\Delta _{\partial }\right) (-{\textbf{t}}+a_i {\textbf{x}}^i)\\&=\left( {\hat{H}}{\hat{\nu }}-(\textrm{Tr}_{{\hat{\Sigma }}}{\hat{k}})\hat{{\textbf{n}}} +\Delta _{\partial }\right) (-{\textbf{t}}+a_i {\textbf{x}}^i)\\&={\hat{H}}{\hat{\nu }}({\hat{u}})+(\textrm{Tr}_{{\hat{\Sigma }}}{\hat{k}})|\nabla {\hat{u}}| +\Delta _{\partial }u_a, \end{aligned} \end{aligned}$$
(5.2)

where \({{\hat{\square }}}\) and \({\hat{\nabla }}\) represent the wave operator and connection on Minkowski space, and \(\vec {H}_0={\hat{H}}{\hat{\nu }}-(\textrm{Tr}_{{\hat{\Sigma }}}{\hat{k}})\hat{{\textbf{n}}}\) is the mean curvature vector of \({\hat{\Sigma }}\). Since \(\hat{{\textbf{n}}}\) is future pointing timelike, the null condition for the linear function also gives \(|\nabla {\hat{u}}|>0\) on \({\hat{\Omega }}\). Therefore, an examination of the proof for Proposition 2.1 shows that the inequality of (2.3) is in fact an equality. This and (4.19), combined with the observation that \({\hat{u}}\) has vanishing spacetime Hessian [4, Section 5] (see also [15, Section 3]), and using that \(\mu =|J|=0\) in Minkowski space, yield a computation of the surface Hamiltonian

$$\begin{aligned} \begin{aligned} \lim _{\varepsilon \rightarrow 0}\int _{{\hat{\Sigma }}}{\mathcal {H}}_0({\hat{e}}'_3,u_a) d{\hat{A}}&=\int _{{\hat{\Sigma }}}\left( |\nabla {\hat{u}}|{\hat{H}}+{\hat{\nu }}({\hat{u}})\text {Tr}_{{\hat{\Sigma }}} {\hat{k}} -{\hat{k}}(\nabla _{\partial } u_a,{\hat{\nu }})\right) d{\hat{A}}\\&\quad -\int _{{\bar{\partial }}{\hat{\Omega }}}\frac{|\nabla _{\partial }u_a|}{|\nabla {\hat{u}}|}\nabla _{\partial }u_a\left( \frac{{\hat{\nu }}({\hat{u}})}{|\nabla _{\partial }u_a|}\right) d{\hat{A}}\\&=2\pi \int _{{\underline{u}}_a}^{{\overline{u}}_a}\chi ({\hat{\Sigma }}_s)ds, \end{aligned} \end{aligned}$$
(5.3)

where \({\hat{\Sigma }}_s\) and \({\overline{u}}_a\), \({\underline{u}}_a\) denote the level sets, maximum, and minimum of \({\hat{u}}\) respectively. The next result shows that the same value is achieved by evaluating at the optimal frame, which is uniquely determined by

$$\begin{aligned} \langle \vec {H}_0,{\hat{e}}_3\rangle >0,\qquad \quad \langle \vec {H}_0,{\hat{e}}_4\rangle =\frac{-\Delta _\partial u_a}{\sqrt{|\nabla _\partial u_a|^2+\varepsilon ^2}}. \end{aligned}$$
(5.4)
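
The identity \(\hat{{\textbf{n}}}(-{\textbf{t}}+a_i {\textbf{x}}^i)=-|\nabla {\hat{u}}|\) and the positivity \(|\nabla {\hat{u}}|>0\) used above can be checked directly. The following is a sketch, assuming the Minkowski metric \(\eta =-d{\textbf{t}}^2+\sum _i (d{\textbf{x}}^i)^2\) and the null normalization \(|{\textbf{a}}|=1\).

$$\begin{aligned} {\hat{\nabla }}(-{\textbf{t}}+a_i {\textbf{x}}^i)=\partial _{{\textbf{t}}}+a_i\partial _{{\textbf{x}}^i},\qquad \eta \left( {\hat{\nabla }}{\hat{u}},{\hat{\nabla }}{\hat{u}}\right) =-1+|{\textbf{a}}|^2=0. \end{aligned}$$

Decomposing this vector along \({\hat{\Omega }}\) into tangential and normal parts, and using \(\langle \hat{{\textbf{n}}},\hat{{\textbf{n}}}\rangle =-1\), gives

$$\begin{aligned} {\hat{\nabla }}{\hat{u}}=\nabla {\hat{u}}-\hat{{\textbf{n}}}({\hat{u}})\hat{{\textbf{n}}},\qquad 0=|\nabla {\hat{u}}|^2-\hat{{\textbf{n}}}({\hat{u}})^2. \end{aligned}$$

Since \({\hat{\nabla }}{\hat{u}}\) is a nonzero future directed null vector and \(\hat{{\textbf{n}}}\) is future directed timelike, their inner product \(\hat{{\textbf{n}}}({\hat{u}})\) is strictly negative. Hence \(\hat{{\textbf{n}}}({\hat{u}})=-|\nabla {\hat{u}}|\) with \(|\nabla {\hat{u}}|>0\).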

Lemma 5.1

Let \((\iota ,u_a)\) be an admissible pair for a spacelike 2-surface \(\Sigma \), then the reference Hamiltonian satisfies

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\int _{\Sigma }{\mathcal {H}}_0({\hat{e}}_3,u_a)dA=2\pi \int _{{\underline{u}}_a}^{{\overline{u}}_a}\chi ({\hat{\Sigma }}_s)ds. \end{aligned}$$
(5.5)

Proof

For convenience we will remove extraneous notation, including the subscript on the mean curvature vector and the hat notation from all objects except the optimal frame. Consider the frame associated with \(\Omega \), namely

$$\begin{aligned} {\tilde{e}}_3{} & {} =\frac{\vec {H}}{|\vec {H}|}=\frac{H}{|\vec {H}|}\nu -\frac{(\textrm{Tr}_{\Sigma }k)}{|\vec {H}|}{\textbf{n}}:=(\cosh \ell ) \nu -(\sinh \ell ) {\textbf{n}},\quad \quad \quad \nonumber \\ {\tilde{e}}_4{} & {} =-(\sinh \ell ) \nu +(\cosh \ell ){\textbf{n}}. \end{aligned}$$
(5.6)

This may be used as an intermediary frame to compute the relation between the optimal frame \(\{{\hat{e}}_3,{\hat{e}}_4\}\) and the \(\Omega \) frame \(\{\nu ,{\textbf{n}}\}\). In particular, since

$$\begin{aligned} {\hat{e}}_3=(\cosh f_{\varepsilon }){\tilde{e}}_3 -(\sinh f_{\varepsilon }){\tilde{e}}_4,\quad \quad \quad {\hat{e}}_4=-(\sinh f_{\varepsilon }){\tilde{e}}_3 +(\cosh f_{\varepsilon }){\tilde{e}}_4 \end{aligned}$$
(5.7)

where

$$\begin{aligned} \cosh f_{\varepsilon }{=}\frac{\sqrt{|\vec {H}|^2 (|\nabla _{\partial }u_a|^2+\varepsilon ^2){+}(\Delta _{\partial }u_a)^2}}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}},\quad \quad \sinh f_{\varepsilon }{=}\frac{\Delta _{\partial }u_a}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}},\nonumber \\ \end{aligned}$$
(5.8)

it follows that

$$\begin{aligned} {\hat{e}}_3=(\cosh q_{\varepsilon })\nu -(\sinh q_{\varepsilon }){\textbf{n}},\quad \quad \quad {\hat{e}}_4 =-(\sinh q_{\varepsilon })\nu +(\cosh q_{\varepsilon }){\textbf{n}} \end{aligned}$$
(5.9)

with \(q_{\varepsilon }=f_{\varepsilon }+\ell \) and

$$\begin{aligned} \begin{aligned} \cosh q_{\varepsilon } =&\frac{\sqrt{|\vec {H}|^2 (|\nabla _{\partial }u_a|^2+\varepsilon ^2)+(\Delta _{\partial }u_a)^2}}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\cdot \frac{H}{|\vec {H}|} +\frac{\Delta _{\partial }u_a}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\cdot \frac{\textrm{Tr}_{\Sigma }k}{|\vec {H}|}\\ \sinh q_{\varepsilon } =&-\frac{\Delta _{\partial }u_a}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\cdot \frac{H}{|\vec {H}|} -\frac{\sqrt{|\vec {H}|^2 (|\nabla _{\partial }u_a|^2+\varepsilon ^2)+(\Delta _{\partial }u_a)^2}}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\cdot \frac{\textrm{Tr}_{\Sigma }k}{|\vec {H}|}. \end{aligned} \end{aligned}$$
(5.10)

The Hamiltonian density function may then be rewritten with respect to the \(\Omega \) frame utilizing (2.6) and (2.7) to obtain

$$\begin{aligned} \begin{aligned}&{\mathcal {H}}_0({\hat{e}}_3,u_a)=\sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}\langle \vec {H},{\hat{e}}_3\rangle +\alpha _{{\hat{e}}_3}(\nabla _{\partial }u_a)\\&\quad =\sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}\left( H\cosh q_{\varepsilon } +(\textrm{Tr}_{\Sigma }k)\sinh q_{\varepsilon }\right) -k(\nabla _{\partial }u_a ,\nu )-\nabla _{\partial }u_a \cdot \nabla _{\partial } q_{\varepsilon }. \end{aligned} \end{aligned}$$
(5.11)

By employing the fact that the spacetime Hessian of u vanishes on \(\Omega \), or rather solving for the Laplacian in (5.2), we find that

$$\begin{aligned} \Delta _{\partial }u_a=-H\nu (u)-(\textrm{Tr}_{\Sigma }k)|\nabla u|. \end{aligned}$$
(5.12)

This expression may then be inserted into (5.10) to produce

$$\begin{aligned}{} & {} \sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}H\cosh q_{\varepsilon } \rightarrow H|\nabla u|,\quad \quad \nonumber \\{} & {} \quad \sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}(\textrm{Tr}_{\Sigma }k)\sinh q_{\varepsilon } \rightarrow (\textrm{Tr}_{\Sigma }k)\nu (u), \end{aligned}$$
(5.13)

as \(\varepsilon \rightarrow 0\). Moreover, similarly to (4.21) and (4.22) we can estimate

$$\begin{aligned} \begin{aligned} |\nabla _{\partial }u_a \cdot \nabla _{\partial } q_{\varepsilon }|&\le 2\left( \frac{|H|+|\textrm{Tr}_{\Sigma }k|}{|\vec {H}|}\right) (|\nabla _{\partial }^3 u_a|+|\nabla _{\partial }^2u_a|+|\nabla _{\partial }\log |\vec {H}|||\nabla _{\partial }u_a|)\\&\quad +2\left( |\nabla _{\partial }(|\vec {H}|^{-1}H)|+|\nabla _{\partial }(|\vec {H}|^{-1}\textrm{Tr}_{\Sigma }k)|\right) |\nabla _{\partial } u_a|. \end{aligned} \end{aligned}$$
(5.14)

Therefore the dominated convergence theorem may be employed to yield

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0}\int _{\Sigma }\left( \nabla _{\partial }u_a \cdot \nabla _{\partial } q_{\varepsilon } \right) dA =\int _{{\bar{\partial }}\Omega }\frac{|\nabla _{\partial }u_a|}{|\nabla u|}\nabla _{\partial }u_a\left( \frac{\nu (u)}{|\nabla _{\partial }u_a|}\right) dA. \end{aligned}$$
(5.15)

The desired result now follows from the second equality in (5.3), together with (5.11), (5.13), and (5.15). \(\square \)

Proof of Theorem 1.1: Rigidity

To establish part (3) of this theorem, it suffices to observe that if the inclusion map of a spacelike 2-surface in Minkowski space is admissible with some \(u_a\), then the physical and reference Hamiltonian densities are the same, so that the associated energy is zero. It then follows from nonnegativity in part (1) that the infimum of energies, and hence the mass of this surface, is zero.

Consider now part (2). Suppose that for some spacelike 2-surface the pair \((\iota ,u_a)\) is admissible, and \(E(\Sigma ,\iota ,u_a)=0\) for all \({\textbf{a}}\). Let \((\Omega ,g,k)\) be a compact initial data set satisfying the dominant energy condition, with trivial second homology, and boundary \(\partial \Omega =\Sigma \). The spacetime harmonic function on \(\Omega \) with boundary values \(u_a\) will also be denoted by \(u_a\). We will follow the general strategy of [15, Section 7]. First observe that for each \({\textbf{a}}\) the inequality (4.20) implies

$$\begin{aligned} {\nabla }^2 u_a+k|\nabla u_a|=0,\quad \quad \quad \mu =|J|=0 \quad \quad \text { on }\Omega , \end{aligned}$$
(5.16)

whenever \(|\nabla u_a|\ne 0\). Moreover, the Hopf lemma applied to a maximum point \(x_0 \in \Sigma \) of \(u_a\) shows that \(|\nabla u_a (x_0)|\ne 0\). Let \(\gamma \subset \Omega \) be a curve emanating from \(x_0\) parameterized by arclength, and observe that since

$$\begin{aligned} |\nabla |\nabla u_a||\le |\nabla ^2 u_a|\le |k||\nabla u_a| \end{aligned}$$
(5.17)

holds away from critical points, we have

$$\begin{aligned} |(\log |\nabla u_a|\circ \gamma )'|\le |\nabla \log |\nabla u_a||\circ \gamma \le C \end{aligned}$$
(5.18)

away from critical points, for some constant C. By integrating along arbitrary \(\gamma \), it follows that there is a constant \(C_1\) such that

$$\begin{aligned} C_1^{-1}|\nabla u_a(x_0)|\le |\nabla u_a(x)|\le C_1 |\nabla u_a(x_0)|,\quad \quad \quad x\in \Omega . \end{aligned}$$
(5.19)

Hence, \(|\nabla u_a|\) does not vanish globally for any \({\textbf{a}}\).
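
The integration step leading to (5.19) may be made explicit as follows. This is a sketch under the additional assumptions, stated here only for illustration, that \(\Omega \) is connected and that \(x\) is joined to \(x_0\) by an arclength parameterized curve \(\gamma :[0,s_x]\rightarrow \Omega \) of length \(s_x\le L\) along which \(|\nabla u_a|\ne 0\); the symbol \(L\) is not part of the original argument.

$$\begin{aligned} \left| \log \frac{|\nabla u_a(x)|}{|\nabla u_a(x_0)|}\right| \le \int _0^{s_x}\left| (\log |\nabla u_a|\circ \gamma )'(s)\right| ds\le CL,\qquad \text {so that }C_1=e^{CL}. \end{aligned}$$

Because this bound is uniform, \(|\nabla u_a|\) cannot tend to zero along such a curve. The set on which the gradient is nonzero is therefore both open and closed in \(\Omega \), and it is nonempty since it contains \(x_0\).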

We will now choose three spacetime harmonic functions \(u_1\), \(u_2\), and \(u_3\) on \(\Omega \) in the following way. Let \(u_3\) be the spacetime harmonic function associated with \({\textbf{a}}_3=(0,0,1)\), and consider a point \(p\in \Sigma \) at which \(u_3\) achieves its maximum; at this point \(|\nabla _{\partial }u_3(p)|=0\). In Minkowski space, the isometric image \(\iota (\Sigma )\) is tangent at \(\iota (p)\) to a hyperplane \(-{\textbf{t}}+{\textbf{x}}^3 =const\) and lies to one side of it. By computing the projection of Minkowski space gradients for the functions \(-{\textbf{t}}+a_i{\textbf{x}}^i\) onto the tangent space \(T_{\iota (p)}\iota (\Sigma )\), it is possible to find two choices for \({\textbf{a}}\) such that the corresponding spacetime harmonic functions \(u_1\), \(u_2\) have nonvanishing boundary gradients at p and satisfy \(\nabla _{\partial }u_1(p) \perp \nabla _{\partial }u_2(p)\). It follows that the vector fields \(\nabla u_l\), \(l=1,2,3\) are linearly independent on \(\Omega \).

Consider the quantities

$$\begin{aligned} \varphi =|\nabla u_1|+|\nabla u_2|+|\nabla u_3|, \quad \quad \quad Y=\nabla u_1 +\nabla u_2 +\nabla u_3, \end{aligned}$$
(5.20)

and build the stationary spacetime \(({\mathbb {R}}\times \Omega ,{\bar{g}})\) where

$$\begin{aligned} {\bar{g}}=-(\varphi ^2 -|Y|^2) dt^2+2Y_i dx^i dt +g. \end{aligned}$$
(5.21)

This is the Killing development of \((\Omega ,g,k,\varphi ,Y)\) with Killing initial data lapse-shift \((\varphi ,Y)\) decomposing the Killing vector \(\partial _t =\varphi {\textbf{n}}+Y\), where \({\textbf{n}}\) is the future pointing unit normal to the constant time slices. As is shown in [15, proof of Theorem 7.3] the initial data for these slices is \((\Omega ,g,k)\), and as a consequence of the vanishing spacetime Hessians (5.16) the function \(\varphi ^2-|Y|^2 =:c^2\) is constant. Furthermore, observe that

$$\begin{aligned} \varphi ^2-|Y|^2= & {} 2\sum _{l<m}\left( |\nabla u_l||\nabla u_m|-\nabla u_l \cdot \nabla u_m\right) \nonumber \\= & {} \sum _{l<m}|\nabla u_l||\nabla u_m|\left| \frac{\nabla u_l}{|\nabla u_l|}-\frac{\nabla u_m}{|\nabla u_m|}\right| ^2. \end{aligned}$$
(5.22)
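
Identity (5.22) follows by expanding both squares; the short verification is sketched here for convenience.

$$\begin{aligned} \varphi ^2=\sum _{l}|\nabla u_l|^2+2\sum _{l<m}|\nabla u_l||\nabla u_m|,\qquad |Y|^2=\sum _{l}|\nabla u_l|^2+2\sum _{l<m}\nabla u_l\cdot \nabla u_m, \end{aligned}$$

$$\begin{aligned} \left| \frac{\nabla u_l}{|\nabla u_l|}-\frac{\nabla u_m}{|\nabla u_m|}\right| ^2=2-\frac{2\nabla u_l\cdot \nabla u_m}{|\nabla u_l||\nabla u_m|}. \end{aligned}$$

Subtracting the first two expressions gives the first line of (5.22), and multiplying the last identity by \(|\nabla u_l||\nabla u_m|\) gives the second. In particular each summand is nonnegative, and it vanishes exactly when the unit vectors \(\nabla u_l/|\nabla u_l|\) and \(\nabla u_m/|\nabla u_m|\) coincide.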

Since \(|\nabla u_l|\) never vanishes, if \(c=0\) then \(\nabla u_l \parallel \nabla u_m\) for all \(l,m\). In particular, there exist constants \(c_{lm}>0\) such that at a given point \(x_1\in \Omega \) it holds that

$$\begin{aligned} \nabla u_l(x_1)-c_{lm} \nabla u_m(x_1)=0. \end{aligned}$$
(5.23)

We claim that these relations hold at all points. To see this, note that (5.16) implies

$$\begin{aligned} |\nabla |\nabla (u_l-c_{lm} u_m)||\le & {} |\nabla ^2(u_l -c_{lm} u_m)|\nonumber \\\le & {} |k|||\nabla u_l|-c_{lm}|\nabla u_m|| \le |k||\nabla (u_l -c_{lm}u_m)|. \end{aligned}$$
(5.24)

Then integrating along curves emanating from \(x_1\) produces

$$\begin{aligned} C_2^{-1}|\nabla (u_l -c_{lm}u_m)(x_1)|\le |\nabla (u_l -c_{lm}u_m)(x)|\le C_2 |\nabla (u_l -c_{lm}u_m)(x_1)|\nonumber \\ \end{aligned}$$
(5.25)

for some constant \(C_2>0\) and all \(x\in \Omega \), yielding the desired claim. However, (5.23) cannot hold at \(p\in \Sigma \) due to the properties of the boundary gradients \(\nabla _{\partial }u_l(p)\). We conclude that the constant \(c\ne 0\), and hence

$$\begin{aligned} {\bar{g}}=-\left( cdt-c^{-1}Y_i dx^i \right) ^2+(g_{ij}+c^{-2}Y_i Y_j )dx^i dx^j=-d{\bar{t}}^2 +(g+d{\textbf{u}}^2)\nonumber \\ \end{aligned}$$
(5.26)

where \({\bar{t}}=ct-c^{-1}{\textbf{u}}\) and \({\textbf{u}}=u_1+u_2+u_3\).
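
The first equality in (5.26) is a completion of the square, which may be sketched as follows using \(\varphi ^2-|Y|^2=c^2\).

$$\begin{aligned} -\left( cdt-c^{-1}Y_i dx^i \right) ^2=-c^2dt^2+2Y_i dx^i dt-c^{-2}Y_i Y_j dx^i dx^j. \end{aligned}$$

Adding \((g_{ij}+c^{-2}Y_i Y_j)dx^i dx^j\) cancels the last term and returns \(-c^2dt^2+2Y_i dx^i dt+g\), which is (5.21). Moreover, since \(Y=\nabla (u_1+u_2+u_3)\) by (5.20), we have \(Y_i dx^i=d{\textbf{u}}\), and therefore \(cdt-c^{-1}Y_i dx^i=d(ct-c^{-1}{\textbf{u}})=d{\bar{t}}\).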

We will now show that the Killing development is isometric to a portion of Minkowski space. Consider the null vector fields

$$\begin{aligned} X_l =\nabla {\tilde{u}}_l +|\nabla {\tilde{u}}_l|{\textbf{n}}, \quad \quad \quad l=1,2,3, \end{aligned}$$
(5.27)

where the functions \({\tilde{u}}_l\) are spacetime harmonic functions on \(\Omega \) extended trivially in the t-direction to all of \({\mathbb {R}}\times \Omega \). These functions are chosen in the following way. Let \(p_1 \in \Sigma \) be a maximum point for \(u_1+u_2+u_3\) on \(\Sigma \). Then in Minkowski space, the isometric image \(\iota (\Sigma )\) is tangent at \(\iota (p_1)\) to a hyperplane \(-{\textbf{t}}+b_i{\textbf{x}}^i =const\) and lies to one side of it, where \(|{\textbf{b}}|\le 1\). We may rotate this to a null hyperplane which is still tangent to \(\iota (\Sigma )\) at \(\iota (p_1)\), and which is defined by \(-{\textbf{t}}+{\tilde{a}}_i {\textbf{x}}^i =const\) with \(|\tilde{{\textbf{a}}}|=1\). We then set \({\tilde{u}}_3\) to be the associated spacetime harmonic function, and note that \(|\nabla _{\partial }{\tilde{u}}_3(p_1)|=0\). As above, we may also find two additional spacetime harmonic functions \({\tilde{u}}_1\), \({\tilde{u}}_2\) having nonvanishing boundary gradients at \(p_1\) and satisfying \(\nabla _{\partial }{\tilde{u}}_1(p_1) \perp \nabla _{\partial }{\tilde{u}}_2(p_1)\). Since

$$\begin{aligned} \sum _l d_l X_l + d_4 \partial _t= & {} \sum _l d_l \nabla {\tilde{u}}_l + d_4 \nabla (u_1 +u_2 +u_3)\nonumber \\{} & {} \quad +\left( \sum _l d_l |\nabla {\tilde{u}}_l| +d_4(|\nabla u_1|+|\nabla u_2|+|\nabla u_3|)\right) {\textbf{n}}\nonumber \\ \end{aligned}$$
(5.28)

for any constants \(d_1,\ldots ,d_4\), we find that setting this quantity to zero implies \(d_1=d_2=0\) by evaluating on \(T_{p_1}\Sigma \). Moreover, since \(X_3\) is null and \(\partial _t\) is timelike, it follows that \(\{X_1,X_2,X_3,\partial _t\}\) is linearly independent. Additionally, note that from (5.16) the functions \({\tilde{u}}_l\) have vanishing spacetime Hessians, which as in [15, proof of Theorem 7.3] implies that each of these four vector fields is covariantly constant in spacetime. Hence \(({\mathbb {R}}\times \Omega ,{\bar{g}})\) is flat, and consequently the Riemannian manifold \((\Omega ,g+d{\textbf{u}}^2)\) is flat. It remains to show that this manifold is isometric to a domain in Euclidean 3-space.

The vanishing quasi-local energy, the admissibility condition, and inequality (4.20) show that the Euler characteristics agree \(\chi (\Sigma _s)=\chi ({\hat{\Sigma }}_s)\) for regular values s. Here \(\Sigma _s\) is the s-level set of an arbitrary spacetime harmonic function \(u_a\), and \({\hat{\Sigma }}_s\) is the s-level set of the corresponding null linear function in \({\hat{\Omega }}\subset {\mathbb {R}}^{3,1}\) where \({\hat{\Omega }}\) is a spacelike fill-in for \(\iota (\Sigma )\). It follows that \(\Omega \) is diffeomorphic to \({\hat{\Omega }}\). Consider now the dual 1-forms to the vector fields \(X_l\) on spacetime, and restrict them to the \({\bar{t}}=0\) slice to obtain 1-forms \(\omega ^l\), \(l=1,2,3\) on \((\Omega ,g+d{\textbf{u}}^2)\). Since the \(\omega ^l\) are covariantly constant, they are closed. We claim that they are also exact. To see this, it will be shown that they integrate to zero on any closed curve \({\textbf{c}}\subset \Omega \). Indeed, such a \({\textbf{c}}\) is homologous to a closed curve \(\tilde{{\textbf{c}}}\) inside the \({\bar{t}}=-c^{-1}{\textbf{u}}\) slice, which coincides with the initial data \((\Omega ,g,k)\). Furthermore, according to (5.27) the restriction of the \(X_l\) dual 1-forms to this hypersurface is exact, and hence integrates to zero along \(\tilde{{\textbf{c}}}\). Hence, there exist functions \({\bar{u}}^l \in C^{2,\varsigma }(\Omega )\), \(l=1,2,3\) such that \(\omega ^l=d{\bar{u}}^l\). Since the \(\omega ^l\) are covariantly constant, by applying a Gram-Schmidt procedure we may assume that they are orthonormal, and in addition the Hessians of the \({\bar{u}}^l\) vanish. Therefore

$$\begin{aligned} g+d{\textbf{u}}^2 =(d{\bar{u}}^1)^2 +(d{\bar{u}}^2)^2 +(d{\bar{u}}^3)^2, \end{aligned}$$
(5.29)

and \(({\bar{u}}^1,{\bar{u}}^2,{\bar{u}}^3)\) can be used as a system of global coordinates on the \({\bar{t}}=0\) slice. It follows that the Killing development of \((\Omega ,g,k)\) is isometric to a portion of Minkowski space. \(\square \)

6 Asymptotics of the Energy

In this section we will establish Theorem 1.3, which shows that the quasi-local energy asymptotes to the appropriate ADM quantity along coordinate spheres in an asymptotically flat end. Let \((M,g,k)\) be an asymptotically flat initial data set for the Einstein equations, and consider a coordinate sphere \(S_r \subset M\). According to [11, Lemma 2.1] the mean and Gauss curvatures of these spheres satisfy the following expansions

$$\begin{aligned} H=\frac{2}{r}+O(r^{-1-\tau }),\quad \quad \quad \quad K=\frac{1}{r^2}+O(r^{-2-\tau }), \end{aligned}$$
(6.1)

where \(\tau >\tfrac{1}{2}\) is the asymptotic flatness parameter of (1.10). It follows that for sufficiently large r the Gauss curvature is positive, and hence from [24, p. 353] (see also [11, (2.18)]) there exists an isometric embedding into a constant time slice of Minkowski space \(\iota _r:S_r \hookrightarrow {\mathbb {R}}^3 \subset {\mathbb {R}}^{3,1}\) such that

$$\begin{aligned} |\nabla _{\partial }^l( \iota _r -\textrm{id}_r)|=O(r^{1-\tau -l}),\quad \quad \quad l=0,1,2, \end{aligned}$$
(6.2)

where \(\textrm{id}_r\) is the identity map on the sphere of radius r in \({\mathbb {R}}^3\). It follows that the null linear function pullback, used to define the quasi-local energy, may be approximated by a linear combination of asymptotically flat coordinates. In particular, if \((x^1,x^2,x^3)\) are coordinates from (1.10) in the asymptotic end of M and \(u_a=\iota _r^*(-{\textbf{t}}+a_i {\textbf{x}}^i)\) then

$$\begin{aligned} |\nabla _{\partial }^l (u_a -u)|=O(r^{1-\tau -l}),\quad \quad \quad l=0,1,2, \end{aligned}$$
(6.3)

where \(u=a_i x^i\). We will also use the notation \({\hat{u}}=-{\textbf{t}}+a_i {\textbf{x}}^i=a_i {\textbf{x}}^i\) on \({\mathbb {R}}^3\).

The pre-limit energy of coordinate spheres may be expressed using (4.7) as

$$\begin{aligned} \begin{aligned} \begin{aligned} E_{\varepsilon }(S_r,\iota _r,u_a)=&\frac{1}{8\pi }\int _{S_r}\left( \sqrt{|\vec {H}_0|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2} -f_0 \Delta _{\partial } u_a +\alpha _{\frac{\vec {H}_0}{|\vec {H}_0|}}(\nabla _{\partial }u_a)\right) dA\\&-\frac{1}{8\pi }\int _{S_r}\left( \sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2){+}(\Delta _\partial u_a)^2} {-}f_{\varepsilon }\Delta _{\partial } u_a +\alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial }u_a)\right) dA, \end{aligned} \end{aligned} \end{aligned}$$
(6.4)

where \(\vec {H}=H\nu -(\textrm{Tr}_{S_r}k){\textbf{n}}\) and \(\vec {H}_0={\hat{H}}{\hat{\nu }}\) are the mean curvature vectors of \(S_r \subset M\) and \(\iota _r(S_r)\subset {\mathbb {R}}^{3,1}\), and

$$\begin{aligned} f_{\varepsilon }=\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}}\right) ,\quad \quad \quad f_0=\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}_0|\sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}}\right) .\nonumber \\ \end{aligned}$$
(6.5)

In the decomposition of the mean curvature vectors, the normals \(\nu \) and \({\hat{\nu }}\) are tangent to M and the constant time slice of Minkowski space respectively. From (1.10), (5.2), and (6.2) we have

$$\begin{aligned} \Delta _{\partial }u_a=-{\hat{H}}{\hat{\nu }}({\hat{u}})=-|\vec {H}_0|{\hat{\nu }}({\hat{u}})=-|\vec {H}|{\hat{\nu }}({\hat{u}})+O(r^{-1-\tau }) =-H{\hat{\nu }}({\hat{u}})+O(r^{-1-\tau }),\nonumber \\ \end{aligned}$$
(6.6)

since \({\hat{\nu }}({\hat{u}})=O(1)\). In particular, observing that \(|\nabla _{\partial }u_a|^2 +{\hat{\nu }}({\hat{u}})^2 =1\) produces

$$\begin{aligned} |\nabla _\partial u_a|^2 +|\vec {H}|^{-2}(\Delta _\partial u_a)^2 =1 +O(r^{-\tau }),\quad \quad \quad |\nabla _\partial u_a|^2 +|\vec {H}_0|^{-2}(\Delta _\partial u_a)^2 =1.\nonumber \\ \end{aligned}$$
(6.7)
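
As a sanity check, the relations (6.6) and (6.7) can be verified by hand in the model case where the embedded surface is exactly the round sphere of radius \(r\) in \({\mathbb {R}}^3\). The computation below is only an illustrative sketch. It assumes \(|{\textbf{a}}|=1\) (which is what makes \(-{\textbf{t}}+a_i {\textbf{x}}^i\) null), the outward normal \({\hat{\nu }}={\textbf{x}}/r\), and the sign convention \({\hat{H}}=2/r\):

$$\begin{aligned} {\hat{\nu }}({\hat{u}})&=\frac{a_i {\textbf{x}}^i}{r},\quad \quad \quad |\nabla _{\partial }{\hat{u}}|^2=|{\textbf{a}}|^2-{\hat{\nu }}({\hat{u}})^2=1-\frac{(a_i {\textbf{x}}^i)^2}{r^2},\\ \Delta _{\partial }{\hat{u}}&=-\frac{2}{r^2}a_i {\textbf{x}}^i=-{\hat{H}}{\hat{\nu }}({\hat{u}}),\quad \quad \quad |\nabla _{\partial }{\hat{u}}|^2+{\hat{H}}^{-2}(\Delta _{\partial }{\hat{u}})^2=1. \end{aligned}$$

For a general surface in \({\mathbb {R}}^3\) the same identity holds because the Euclidean gradient of \({\hat{u}}\) is the constant unit vector \({\textbf{a}}\), which splits orthogonally into its tangential part \(\nabla _{\partial }{\hat{u}}\) and its normal part \({\hat{\nu }}({\hat{u}}){\hat{\nu }}\).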

With the help of \(|\nabla _{\partial }u_a|=O(1)\) it follows that

$$\begin{aligned}{} & {} \frac{\left( |\vec {H}_0|+|\vec {H}|\right) (|\nabla _\partial u_a|^2 +\varepsilon ^2)}{\sqrt{|\vec {H}_0|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2}+\sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2}}\nonumber \\{} & {} \quad =|\nabla _\partial u_a|^2+O(r^{-\tau } +\varepsilon ^2), \end{aligned}$$
(6.8)

and therefore

$$\begin{aligned} \begin{aligned} \begin{aligned}&\int _{S_r}\left( \sqrt{|\vec {H}_0|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2} -\sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2}\right) dA\\ {}&\quad = \int _{S_r}\left( |\vec {H}_0|-|\vec {H}|\right) \frac{\left( |\vec {H}_0|+|\vec {H}|\right) (|\nabla _\partial u_a|^2 +\varepsilon ^2)}{\sqrt{|\vec {H}_0|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2}+\sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2}} dA\\ {}&\quad = \int _{S_r}\left( |\vec {H}_0|-|\vec {H}|\right) |\nabla _\partial u_a|^2 dA +O(r^{1-2\tau }+\varepsilon ^2 r^{1-\tau }). \end{aligned} \end{aligned} \end{aligned}$$
(6.9)
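
The two steps in (6.9) can be unpacked as follows. The first equality is the elementary identity \(\sqrt{A}-\sqrt{B}=(A-B)/(\sqrt{A}+\sqrt{B})\) applied with \(A-B=(|\vec {H}_0|^2-|\vec {H}|^2)(|\nabla _\partial u_a|^2+\varepsilon ^2)\). The size of the remainder then follows from a power count. This sketch assumes \(|\vec {H}_0|-|\vec {H}|=O(r^{-1-\tau })\), which is consistent with the step from \(|\vec {H}_0|\) to \(|\vec {H}|\) in (6.6), and uses that the area of \(S_r\) is \(O(r^2)\); the constant \(C\) is introduced here only for illustration:

$$\begin{aligned} \left| \int _{S_r}\left( |\vec {H}_0|-|\vec {H}|\right) \cdot O(r^{-\tau }+\varepsilon ^2) dA\right| \le C r^{2}\cdot r^{-1-\tau }\cdot \left( r^{-\tau }+\varepsilon ^2\right) =C\left( r^{1-2\tau }+\varepsilon ^2 r^{1-\tau }\right) . \end{aligned}$$

Since \(\tau >\tfrac{1}{2}\), the first term decays as \(r\rightarrow \infty \), while the second term disappears when \(\varepsilon \rightarrow 0\) at fixed \(r\).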

Consider now the terms involving \(f_{\varepsilon }\) and \(f_0\). Notice that by setting \(\zeta =|\vec {H}|^{-1}|\vec {H}_0|-1=O(r^{-\tau })\) and applying the mean value theorem to the following function of \(\zeta \) we find

$$\begin{aligned} \begin{aligned} \begin{aligned}&\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\right) \\ {}&\quad = \sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}_0|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\cdot (1+\zeta )\right) \\ {}&\quad =\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}_0|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\right) +\left[ 1+\left( \frac{(1+O(r^{-\tau }))\Delta _{\partial }u_a}{|\vec {H}_0|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\right) ^2\right] ^{-1/2}\frac{\Delta _{\partial }u_a}{|\vec {H}_0|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\cdot \zeta \\ {}&\quad =\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}_0|\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}}\right) +\frac{\Delta _{\partial }u_a}{|\vec {H}_0|}\left( \frac{|\vec {H}_0|}{|\vec {H}|}-1\right) \left( 1+O(r^{-\tau }+\varepsilon ^2)\right) , \end{aligned} \end{aligned} \end{aligned}$$
(6.10)

where (6.7) has also been used. Hence (6.6) yields

$$\begin{aligned} \int _{S_r}(f_{\varepsilon }-f_0)\Delta _{\partial }u_a dA=\int _{S_r}{\hat{\nu }}({\hat{u}})^2\left( |\vec {H}_0|-|\vec {H}|\right) dA +O(r^{1-2\tau }+\varepsilon ^2 r^{1-\tau }),\nonumber \\ \end{aligned}$$
(6.11)

and combining this with (6.9) gives

$$\begin{aligned} \begin{aligned}&\int _{S_r}\left( \sqrt{|\vec {H}_0|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2} -f_0 \Delta _{\partial } u_a \right) dA\\&\qquad -\int _{S_r}\left( \sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2} -f_{\varepsilon }\Delta _{\partial } u_a \right) dA\\&\quad =\int _{S_r}\left( |\vec {H}_0|-|\vec {H}|\right) dA+O(r^{1-2\tau }+\varepsilon ^2 r^{1-\tau })\\&\quad =8\pi {\mathcal {E}}+o(1)+O(r^{1-2\tau }+\varepsilon ^2 r^{1-\tau }). \end{aligned} \end{aligned}$$
(6.12)

Here \({\mathcal {E}}\) is the ADM energy, and in the last step we utilize the fact that the Liu-Yau and Brown–York energies have the same large sphere limit [39, proof of Theorem 3.1], together with the convergence of the Brown–York energy to the ADM energy [11, Theorem 1.1].
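
The first equality of (6.12) can be made explicit by a short rearrangement; the sketch below only recombines formulas already displayed. The terms involving \(f_0\) and \(f_{\varepsilon }\) in (6.4) contribute \(\int _{S_r}(f_{\varepsilon }-f_0)\Delta _{\partial }u_a dA\), so adding the leading terms of (6.9) and (6.11) and using \(|\nabla _{\partial }u_a|^2 +{\hat{\nu }}({\hat{u}})^2 =1\) gives

$$\begin{aligned} \int _{S_r}\left( |\vec {H}_0|-|\vec {H}|\right) |\nabla _\partial u_a|^2 dA +\int _{S_r}{\hat{\nu }}({\hat{u}})^2\left( |\vec {H}_0|-|\vec {H}|\right) dA&=\int _{S_r}\left( |\vec {H}_0|-|\vec {H}|\right) \left( |\nabla _\partial u_a|^2+{\hat{\nu }}({\hat{u}})^2\right) dA\\&=\int _{S_r}\left( |\vec {H}_0|-|\vec {H}|\right) dA. \end{aligned}$$

The remainders of (6.9) and (6.11) are of the same order, which is why the error term in (6.12) is unchanged.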

Lastly, consider the connection 1-forms within (6.4). According to (2.6) the reference connection 1-form vanishes, and with (2.7) as well as (6.3) it follows that

$$\begin{aligned} \begin{aligned} \begin{aligned} \alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial } u_a)=&-k(\nabla _{\partial }u_a ,\nu )-\nabla _{\partial }u_a \cdot \nabla _{\partial }h\\ =&-k(\nabla _{\partial }u ,\nu )+O(r^{-1-2\tau })-\nabla _{\partial }u_a \cdot \nabla _{\partial }h \\ =&-\!\left( k\!-\!(\text {Tr}_g k)g\right) (\nabla u,\nu )\!-\!(\text {Tr}_{S_r} k)\nu (u)\!+\! h\Delta _{\partial }u_a \!-\!\text {div}_{\partial }\left( h\nabla _{\partial } u_a\right) +O(r^{-1-2\tau }), \end{aligned} \end{aligned} \end{aligned}$$
(6.13)

where

$$\begin{aligned} h=-\sinh ^{-1}\left( \frac{\textrm{Tr}_{S_r}k}{|\vec {H}|}\right) =-\left( 1+O(r^{-\tau })\right) \frac{\textrm{Tr}_{S_r}k}{|\vec {H}|} \end{aligned}$$
(6.14)

with the mean value theorem being used in the last equality. Furthermore (6.2) implies that \({\hat{\nu }}({\hat{u}})=\nu (u)+O(r^{-\tau })\), which together with (6.6) yields

$$\begin{aligned} \Delta _{\partial }u_a =-|\vec {H}|\nu (u)+O(r^{-1-\tau }), \end{aligned}$$
(6.15)

and hence

$$\begin{aligned} -(\textrm{Tr}_{S_r} k)\nu (u)+h\Delta _{\partial }u_a= (\textrm{Tr}_{S_r}k)\nu (u)\cdot O(r^{-\tau })=O(r^{-1-2\tau }). \end{aligned}$$
(6.16)
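
The cancellation in (6.16) can be checked term by term. The sketch below assumes the decay \(\textrm{Tr}_{S_r}k=O(r^{-1-\tau })\) implied by the asymptotic flatness condition (1.10), together with \(|\vec {H}|^{-1}=O(r)\) from (6.1). It combines (6.14) with (6.6) and the relation \({\hat{\nu }}({\hat{u}})=\nu (u)+O(r^{-\tau })\):

$$\begin{aligned} h\Delta _{\partial }u_a&=-\left( 1+O(r^{-\tau })\right) \frac{\textrm{Tr}_{S_r}k}{|\vec {H}|}\left( -|\vec {H}|{\hat{\nu }}({\hat{u}})+O(r^{-1-\tau })\right) \\&=\left( 1+O(r^{-\tau })\right) (\textrm{Tr}_{S_r}k)\left( \nu (u)+O(r^{-\tau })\right) \\&=(\textrm{Tr}_{S_r}k)\nu (u)+O(r^{-1-2\tau }). \end{aligned}$$

Integrated over \(S_r\), whose area is \(O(r^2)\), this remainder contributes \(O(r^{1-2\tau })\), while the divergence term in (6.13) integrates to zero on the closed surface \(S_r\).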

Moreover, since \(\nabla u=a_i \partial _{x^i} +O(r^{-\tau })\) we obtain

$$\begin{aligned} \int _{S_r}\left( \alpha _{\frac{\vec {H}_0}{|\vec {H}_0|}}(\nabla _{\partial } u_a)-\alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial } u_a)\right) dA =-8\pi \langle {\textbf{a}},{\mathcal {P}}\rangle +o(1)+O(r^{1-2\tau }),\nonumber \\ \end{aligned}$$
(6.17)

where \({\mathcal {P}}\) is the ADM linear momentum. By combining (6.4), (6.12), and (6.17) the desired result is achieved

$$\begin{aligned} \lim _{r\rightarrow \infty }E(S_r,\iota _r,u_a) =\lim _{r\rightarrow \infty }\lim _{\varepsilon \rightarrow 0} E_{\varepsilon }(S_r,\iota _r,u_a)={\mathcal {E}}-\langle {\textbf{a}},{\mathcal {P}}\rangle . \end{aligned}$$
(6.18)
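
The order of the two limits in (6.18) matters for the error terms. As a sketch, inserting (6.12) and (6.17) into (6.4) and dividing by \(8\pi \) gives, for each fixed \(r\) and \(\varepsilon \),

$$\begin{aligned} E_{\varepsilon }(S_r,\iota _r,u_a)={\mathcal {E}}-\langle {\textbf{a}},{\mathcal {P}}\rangle +o(1)+O(r^{1-2\tau }+\varepsilon ^2 r^{1-\tau }), \end{aligned}$$

where \(o(1)\) refers to \(r\rightarrow \infty \), and where it is assumed that the implied constants are uniform in \(\varepsilon \), as the displayed estimates suggest. Letting \(\varepsilon \rightarrow 0\) first removes the term \(\varepsilon ^2 r^{1-\tau }\) at each fixed \(r\), and the remaining \(O(r^{1-2\tau })\) tends to zero as \(r\rightarrow \infty \) because \(\tau >\tfrac{1}{2}\). If \(\tau <1\) the factor \(r^{1-\tau }\) grows with \(r\), so this estimate alone would not justify taking the limits in the opposite order.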

7 First Variation of the Energy

The purpose of this section is to derive the Euler-Lagrange equation for the function \(u_a\) at a critical point of the energy, under ideal conditions. In particular, it will be assumed that the critical pair \((\iota ,u_a)\) is admissible and has the following properties. The set of critical points for \(u_a\) is sufficiently mild to allow for the interchange of limits as \(\varepsilon \rightarrow 0\) with integration and variational differentiation, and the Euler characteristics \(\chi ({\hat{\Sigma }}_s)\) of regular level sets within the reference space fill-in \({\hat{\Omega }}\) take only the value 1. We will use the notation \(\delta \) to denote the operation of variation.

Consider the quantity

$$\begin{aligned} \begin{aligned} \begin{aligned}&E_{\varepsilon }(\Sigma ,\iota ,u_a)=\frac{1}{4}\int _{{\underline{u}}_a}^{{\overline{u}}_a}\chi ({\hat{\Sigma }}_s)ds\\ {}&\quad -\frac{1}{8\pi }\int _{\Sigma }\left( \sqrt{|\vec {H}|^2 (|\nabla _\partial u_a|^2+\varepsilon ^2)+(\Delta _\partial u_a)^2}+\nabla _{\partial }f_{\varepsilon }\cdot \nabla _{\partial }u_a+\alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial }u_a)\right) dA, \end{aligned} \end{aligned} \end{aligned}$$
(7.1)

where

$$\begin{aligned} f_{\varepsilon }=\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}|\sqrt{|\nabla _{\partial }u_a|^2 +\varepsilon ^2}}\right) . \end{aligned}$$
(7.2)

Observe that (4.21) together with Lemma 5.1 and the coarea formula show

$$\begin{aligned} E(\Sigma ,\iota ,u_a)=\lim _{\varepsilon \rightarrow 0}E_{\varepsilon }(\Sigma ,\iota ,u_a). \end{aligned}$$
(7.3)

Direct calculations yield

$$\begin{aligned}{} & {} \delta \int _{{\underline{u}}_a}^{{\overline{u}}_a}\chi ({\hat{\Sigma }}_s)ds=\delta {\overline{u}}_a -\delta {\underline{u}}_a, \end{aligned}$$
(7.4)
$$\begin{aligned}{} & {} \quad \delta \int _{\Sigma }\sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2} dA\nonumber \\{} & {} \quad =\int _{\Sigma }\frac{|\vec {H}|^2 \nabla _{\partial }u_a \cdot \nabla _{\partial }\delta u_a + \Delta _{\partial }u_a \Delta _{\partial }\delta u_a}{\sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2}} dA, \end{aligned}$$
(7.5)

and

$$\begin{aligned} \delta \int _{\Sigma }\alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial }u_a) dA=\int _{\Sigma }\alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial }\delta u_a) dA. \end{aligned}$$
(7.6)

Moreover since

$$\begin{aligned} \delta f_{\varepsilon }=\frac{\Delta _\partial \delta u_a -(|\nabla _\partial u_a|^2 +\varepsilon ^2)^{-1}(\Delta _\partial u_a)\nabla _\partial u_a\cdot \nabla _\partial \delta u_a}{\sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2}}, \end{aligned}$$
(7.7)

we have

$$\begin{aligned} \begin{aligned} \delta \int _{\Sigma }\nabla _\partial f_{\varepsilon }\cdot \nabla _{\partial } u_a dA =&-\delta \int _{\Sigma }f_{\varepsilon }\Delta _{\partial }u_a dA\\ =&-\int _{\Sigma }\left( (\delta f_{\varepsilon })\Delta _{\partial }u_a +f_{\varepsilon }\Delta _{\partial }\delta u_a \right) dA\\ =&\int _{\Sigma }\left( \frac{-\Delta _{\partial } u_a\Delta _\partial \delta u_a +(|\nabla _\partial u_a|^2 +\varepsilon ^2)^{-1}(\Delta _\partial u_a)^2\nabla _\partial u_a\cdot \nabla _\partial \delta u_a}{\sqrt{|\vec {H}|^2(|\nabla _\partial u_a|^2 +\varepsilon ^2)+(\Delta _\partial u_a)^2}}\right) dA\\&+\int _{\Sigma }\nabla _\partial f_{\varepsilon } \cdot \nabla _{\partial }\delta u_a dA. \end{aligned} \end{aligned}$$
(7.8)
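
Formula (7.7), used in the third line of (7.8), is a chain rule computation. As a sketch, write \(w=\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}\) and \(x=\Delta _{\partial }u_a/(|\vec {H}|w)\); this shorthand is introduced here only for illustration. It is assumed, as in (7.5), that \(|\vec {H}|\) is unaffected by the variation. Then \(f_{\varepsilon }=\sinh ^{-1}(x)\) and

$$\begin{aligned} \delta f_{\varepsilon }&=\frac{\delta x}{\sqrt{1+x^2}},\quad \quad \quad \delta x=\frac{\Delta _{\partial }\delta u_a}{|\vec {H}|w}-\frac{(\Delta _{\partial }u_a)\nabla _{\partial }u_a\cdot \nabla _{\partial }\delta u_a}{|\vec {H}|w^3},\\ \sqrt{1+x^2}&=\frac{\sqrt{|\vec {H}|^2 w^2+(\Delta _{\partial }u_a)^2}}{|\vec {H}|w}. \end{aligned}$$

Dividing \(\delta x\) by \(\sqrt{1+x^2}\) cancels the factor \(|\vec {H}|w\) and reproduces (7.7).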

It follows that

$$\begin{aligned} \begin{aligned}&\delta E_{\varepsilon }(\Sigma ,\iota ,u_a)= \frac{1}{4}(\delta {\overline{u}}_a -\delta {\underline{u}}_a)\\&\quad -\frac{1}{8\pi }\int _{\Sigma }\left( |\vec {H}|(\cosh f_{\varepsilon }) \frac{\nabla _{\partial }u_a\cdot \nabla _{\partial }\delta u_a}{\sqrt{|\nabla _{\partial } u_a|^2+\varepsilon ^2}}+\nabla _\partial f_{\varepsilon } \cdot \nabla _{\partial }\delta u_a +\alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial }\delta u_a)\right) dA. \end{aligned} \end{aligned}$$
(7.9)
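
To see how the \(\cosh f_{\varepsilon }\) term in (7.9) arises, one can add the integrands of (7.5) and (7.8). The following sketch uses the shorthand \(w=\sqrt{|\nabla _{\partial }u_a|^2+\varepsilon ^2}\), which is not part of the original notation. The two terms containing \(\Delta _{\partial }u_a \Delta _{\partial }\delta u_a\) cancel, leaving

$$\begin{aligned}&\frac{|\vec {H}|^2 \nabla _{\partial }u_a \cdot \nabla _{\partial }\delta u_a +w^{-2}(\Delta _\partial u_a)^2\nabla _\partial u_a\cdot \nabla _\partial \delta u_a}{\sqrt{|\vec {H}|^2 w^2+(\Delta _\partial u_a)^2}}\\&\quad =\frac{\sqrt{|\vec {H}|^2 w^2+(\Delta _\partial u_a)^2}}{w^2}\nabla _{\partial }u_a \cdot \nabla _{\partial }\delta u_a =|\vec {H}|(\cosh f_{\varepsilon })\frac{\nabla _{\partial }u_a\cdot \nabla _{\partial }\delta u_a}{w}, \end{aligned}$$

where the last step uses \(\cosh f_{\varepsilon }=\sqrt{1+\sinh ^2 f_{\varepsilon }}=\sqrt{|\vec {H}|^2 w^2+(\Delta _\partial u_a)^2}/(|\vec {H}|w)\), which follows from (7.2).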

Under the ideal conditions mentioned at the beginning of this section, we may interchange limit and variational derivative, and apply the dominated convergence theorem as in (4.23) to obtain

$$\begin{aligned} \begin{aligned} \delta E(\Sigma ,\iota ,u_a)&= \lim _{\varepsilon \rightarrow 0}\delta E_{\varepsilon }(\Sigma ,\iota ,u_a)\\&=\frac{1}{4}(\delta {\overline{u}}_a -\delta {\underline{u}}_a)\\&\quad -\frac{1}{8\pi }\int _{\Sigma }\left( |\vec {H}|(\cosh f) \frac{\nabla _{\partial }u_a\cdot \nabla _{\partial }\delta u_a}{|\nabla _{\partial } u_a|}+\nabla _\partial f \cdot \nabla _{\partial }\delta u_a +\alpha _{\frac{\vec {H}}{|\vec {H}|}}(\nabla _{\partial }\delta u_a)\right) dA, \end{aligned} \end{aligned}$$
(7.10)

where

$$\begin{aligned} f=\sinh ^{-1}\left( \frac{\Delta _{\partial }u_a}{|\vec {H}||\nabla _{\partial }u_a|}\right) . \end{aligned}$$
(7.11)

If the variations \(\delta u_a\) are plentiful enough so as to include all smooth functions, then we find that the critical isometric embedding pair gives rise to a weak solution of the 4th order equation

$$\begin{aligned} \text {div}_{\sigma }\left( |\vec {H}|(\cosh f)\frac{\nabla _\partial u_a}{|\nabla _\partial u_a|}+\nabla _{\partial } f+V\right) =2\pi (\pmb {\delta }_- -\pmb {\delta }_+), \end{aligned}$$
(7.12)

in which \(V\) is the dual vector field to the connection 1-form \(\alpha _{\frac{\vec {H}}{|\vec {H}|}}\) and \(\pmb {\delta }_{\pm }\) are Dirac delta distributions at the maximum and minimum points of \(u_a\), respectively.
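
The passage from (7.10) to (7.12) is an integration by parts. The sketch below assumes, for illustration, that \(u_a\) attains its maximum and minimum at unique points \(p_+\) and \(p_-\) of \(\Sigma \), so that \(\delta {\overline{u}}_a=\delta u_a(p_+)\) and \(\delta {\underline{u}}_a=\delta u_a(p_-)\). The symbols \(p_{\pm }\), \(X\), and \(\varphi \) are introduced here only for this purpose. Writing \(X\) for the vector field inside the divergence in (7.12) and \(\varphi =\delta u_a\) for a smooth test function, the vanishing of (7.10) reads

$$\begin{aligned} 0=\frac{1}{4}\left( \varphi (p_+)-\varphi (p_-)\right) -\frac{1}{8\pi }\int _{\Sigma }X\cdot \nabla _{\partial }\varphi dA =\frac{1}{4}\left( \varphi (p_+)-\varphi (p_-)\right) +\frac{1}{8\pi }\int _{\Sigma }\varphi \, \text {div}_{\sigma }X dA, \end{aligned}$$

with the last integral understood in the distributional sense. Multiplying by \(8\pi \) and rearranging gives \(\text {div}_{\sigma }X=2\pi (\pmb {\delta }_- -\pmb {\delta }_+)\), in agreement with (7.12).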