1 Introduction

We prove the nonlinear stability of \((3+1)\)-dimensional Minkowski space as a vacuum solution of Einstein’s field equation and obtain a precise full expansion of the solution, in a mildly generalized harmonic gauge, in all asymptotic regions, i.e. near spacelike, null, and timelike infinity. On a conceptual level, we show how some of the methods we developed for our proofs of black hole stability in cosmological spacetimes [53, 60] apply in this more familiar setting, studied by Christodoulou–Klainerman [27], Lindblad–Rodnianski [79, 80], and many others: this includes the use of an iteration scheme for the construction of the metric in which we solve a linear equation globally at each step, keeping track of the precise asymptotic behavior of the iterates by working on a suitable compactification M of the spacetime, and the implementation of constraint damping.

The estimates we prove for the linear equations (which arise as linearizations of the gauge-fixed Einstein equation around metrics which lie in the precise function space in which we seek the solution) are largely based on energy estimates and a version of the vector field method [64]. The estimates are rather refined in terms of a splitting of the symmetric 2-tensor bundle (different metric components behave differently at null infinity); the vector fields we use are closely related to those in [27, 64, 79, 80]. In our systematic approach, both the relevant notion of regularity (matching [74]) and the determination of the precise asymptotic behavior of the solution follow readily from an inspection of the geometric and algebraic properties of the linearized gauge-fixed (or ‘reduced’) Einstein equation; correspondingly, once M and the required function spaces are defined (§§2–3), the proof of stability itself is rather concise (§§4–6).

The weak null condition of Lindblad–Rodnianski [78] manifests itself in our linearization approach as a nilpotent coupling of certain metric components for a linear model operator at null infinity: the logarithmic growth (relative to the typical decay rate of \(r^{-1}\) of waves on \((3+1)\)-dimensional Minkowski space near null infinity) of one metric component is rendered harmless due to its coupling (to leading order) only to a metric component \(g_{0 0}\) which governs the ‘long range’ behavior of outgoing light cones and which decays faster than \(r^{-1}\) by a factor of \(r^{-\gamma }\) for some \(\gamma >0\) (see the discussions in §§1.1.2 and 3.3). For the reader already familiar with the weak null condition, we mention here that the better decay of \(g_{0 0}\) in [80] (corresponding, roughly, to \(g_{L L}\) in the reference) is a consequence of the harmonic gauge condition being satisfied by the nonlinear solution, while in the present paper we have decay of the (0, 0)-component of every iterate in our iteration scheme since we arrange constraint damping, which, roughly speaking, ensures that our gauge condition is satisfied to high accuracy (in the sense of decay) even though we are only solving ‘nongeometric’ (linear) equations. (This makes constraint damping attractive for numerical analysis, see [46, 95] and Remark 1.2 below).

We proceed to state a simple version of our main theorem, before returning to an in-depth discussion of our approach, the relevant estimates, and the structure of the Einstein equation in §1.1. Recall that in Einstein’s theory of general relativity, a vacuum spacetime is described by a 4-manifold \(M^\circ \) which is equipped with a Lorentzian metric g with signature \((+,-,-,-)\) satisfying the Einstein vacuum equation

$$\begin{aligned} \text {Ric}(g)=0. \end{aligned}$$
(1.1)

The simplest solution is the Minkowski spacetime \((M^\circ ,g)=(\mathbb {R}^4,\underline{g}{})\),

$$\begin{aligned} \underline{g}{}:=d t^2-d x^2,\quad \mathbb {R}^4=\mathbb {R}_t\times \mathbb {R}^3_x. \end{aligned}$$
(1.2)

The far field of an isolated gravitational system \((M^\circ ,g)\) with total (ADM) mass m is usually described by the Schwarzschild metric

$$\begin{aligned} g_m^S=\Bigl (1-\frac{2 m}{r}\Bigr )d t^2-\Bigl (1-\frac{2 m}{r}\Bigr )^{-1}d r^2-r^2\slashed{g}, \end{aligned}$$
(1.3)

where \(\slashed{g}\) denotes the round metric on \(\mathbb {S}^2\); the Minkowski metric \(\underline{g}{}=g_0^S\) differs from this by terms of size \(\mathcal {O}(m r^{-1})\). In the study of weak nonlinear gravity in vacuum (in particular, black holes are excluded), one then works with metrics g which are smooth extensions of (a short range perturbation of) \(g_m^S\) to all of \(\mathbb {R}^4\). Such spacetimes are asymptotically flat: letting \(|t|+|x|\rightarrow \infty \) in \(\mathbb {R}^4\), the metric g (in a suitable gauge) approaches the flat Minkowski metric \(\underline{g}{}\) in a quantitative fashion.

Suitably interpreted, the field equation (1.1) has the character of a quasilinear wave equation; in particular, it predicts the existence of gravitational waves, which were recently observed experimentally [70]. Correspondingly, the evolution and long time behavior of solutions of (1.1) can be studied from the perspective of the initial value problem: given a 3-manifold \(\Sigma ^\circ \) and symmetric 2-tensors \(\gamma ,k\in \mathcal {C}^\infty (\Sigma ^\circ ;S^2 T^*\Sigma ^\circ )\), with \(\gamma \) a Riemannian metric, one seeks a vacuum spacetime \((M^\circ ,g)\) and an embedding \(\Sigma ^\circ \hookrightarrow M^\circ \) such that

$$\begin{aligned} \text {Ric}(g) = 0\ \ \text{ on }\ M^\circ ,\quad g|_{\Sigma ^\circ }=-\gamma ,\ II_g=k\ \ \text{ on }\ \Sigma ^\circ , \end{aligned}$$
(1.4)

where \(II_g\) denotes the second fundamental form of \(\Sigma ^\circ \), and where we use the embedding \(\Sigma ^\circ \hookrightarrow M^\circ \) to identify the tensors \(\gamma ,k\) on \(\Sigma ^\circ \) with (tangential) tensors on the image of \(\Sigma ^\circ \) in \(M^\circ \). (The minus sign in (1.4) is due to our sign convention for Lorentzian metrics). A fundamental result due to Choquet-Bruhat and Geroch [17, 19] states that necessary and sufficient conditions for the well-posedness of this problem are the constraint equations for \(\gamma \) and k,

$$\begin{aligned} R_\gamma +({\text {tr}}_\gamma k)^2-|k|_\gamma ^2 = 0,\ \ \delta _\gamma k+d{\text {tr}}_\gamma k=0, \end{aligned}$$
(1.5)

where \(R_\gamma \) is the scalar curvature of \(\gamma \), and \(\delta _\gamma \) is the (negative) divergence. Concretely, if these are satisfied, there exists a maximal globally hyperbolic solution \((M^\circ ,g)\) of (1.4) which is unique up to isometries. By the future development of an initial data set \((\Sigma ^\circ ,\gamma ,k)\), we mean the causal future of \(\Sigma ^\circ \) as a Lorentzian submanifold of \((M^\circ ,g)\). Our main theorem concerns the long time behavior of solutions of (1.4) with initial data close to those of Minkowski space:

Theorem 1.1

Let \(b_0>0\). Suppose that \((\gamma ,k)\) are smooth initial data on \(\mathbb {R}^3\) satisfying the constraint equations (1.5) which are small in the sense that for some small \(\delta >0\), a cutoff \(\chi \in \mathcal {C}^\infty _{\text {c}}(\mathbb {R}^3)\) identically 1 near 0, and \(\widetilde{\gamma }:=\gamma -(1-\chi )(-g_m^S)|_{\{t=0\}}\),Footnote 1 where \(|m|<\delta \), we have

$$\begin{aligned} \sum _{j\le N+1} \Vert \langle r\rangle ^{-1/2+b_0} (\langle r\rangle \nabla )^j \widetilde{\gamma } \Vert _{L^2} + \sum _{j\le N} \Vert \langle r\rangle ^{1/2+b_0} (\langle r\rangle \nabla )^j k \Vert _{L^2} < \delta , \end{aligned}$$
(1.6)

where N is some large fixed integer (\(N=26\) works). Assume moreover that the weighted \(L^2\) norms in (1.6) are finite for all \(j\in \mathbb {N}\).

Then the future development of the data \((\mathbb {R}^3,\gamma ,k)\) is future causally geodesically complete and decays to the flat (Minkowski) solution. More precisely, there exist a smooth manifold with corners M with boundary hypersurfaces \(\Sigma \), \(I^0\), \(\mathscr {I}^+\), \(I^+\), and a diffeomorphism of the interior \(M^\circ \) with \(\{t>0\}\subset \mathbb {R}^4\), as well as an embedding \(\mathbb {R}^3\cong \Sigma ^\circ \) of the Cauchy hypersurface, and a solution g of the initial value problem (1.4) which is conormal (see below) on M and satisfies \(|g-\underline{g}{}|\lesssim (1+t+|r|)^{-1+\epsilon }\) for all \(\epsilon >0\). See Figure 1. For fixed ADM mass m, the solution g depends continuously on \(\widetilde{\gamma }\), k, see Remark 6.4.

If the normalized initial data \((\langle r\rangle \widetilde{\gamma },\langle r\rangle ^2 k)\) are in addition \(\mathcal {E}\)-smooth, i.e. polyhomogeneous at infinity with index set \(\mathcal {E}\) (see below), then the solution g is also polyhomogeneous on M, with index sets given explicitly in terms of \(\mathcal {E}\).

More precise versions will be given in Theorem 1.8 and in §6. The condition (1.6) allows for \(\widetilde{\gamma }\) to be pointwise of size \(r^{-1-b_0-\epsilon }\), \(\epsilon >0\); since \(b_0>0\) is arbitrary, this means that we allow for the initial data to be Schwarzschildean modulo \(\mathcal {O}(r^{-1-\epsilon })\) for any \(\epsilon >0\).
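
As a quick consistency check of this decay count (a sketch only, assuming a model tensor whose components are comparable to \(r^{-1-b_0-\epsilon }\) for \(r\ge 1\), and that each application of \(\langle r\rangle \nabla \) preserves this rate), the \(j=0\) term of the first sum in (1.6) is controlled at infinity by the radial integral

$$\begin{aligned} \int _1^\infty \bigl (r^{-1/2+b_0}\,r^{-1-b_0-\epsilon }\bigr )^2\,r^2\,d r=\int _1^\infty r^{-1-2\epsilon }\,d r=\frac{1}{2\epsilon }<\infty . \end{aligned}$$

The same count applies to k of size \(r^{-2-b_0-\epsilon }\), since \(r^{1/2+b_0}\,r^{-2-b_0-\epsilon }=r^{-3/2-\epsilon }\) as well. For \(\epsilon =0\) the integral diverges logarithmically, so the borderline rate \(r^{-1-b_0}\) is not covered by this computation.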

Fig. 1

Left: the compact manifold M (solid boundary), containing a compactification \(\Sigma \) of the initial surface \(\Sigma ^\circ \). The boundary hypersurfaces \(I^0\), \(\mathscr {I}^+\), and \(I^+\) are called spatial infinity, (future) null infinity, and (future) timelike infinity, respectively. One can think of M as the blow-up of a Penrose diagram at timelike and spatial infinity. A global compactification would extend across \(\Sigma \) to the past, with additional boundary hypersurfaces \(\mathscr {I}^-\) (past null infinity) and \(I^-\) (past timelike infinity). Right: for comparison, the Penrose diagram of Minkowski space

In Theorem 1.1, conormality is a (local) regularity notion on a manifold with corners \(\mathsf {M}\) which is equivalent to smoothness in \(\mathsf {M}^\circ \), but differs from it near \(\partial \mathsf {M}\): in the model case \(\mathsf {M}=[0,\infty )_x^p\times \mathbb {R}_y^q\), and with \(\alpha \in \mathbb {R}^p\), a function \(u\in x^\alpha L^\infty _{\text {loc}}(\mathsf {M})\) is called conormal relative to the space \(x^\alpha L^\infty _{\text {loc}}(\mathsf {M})\) if

$$\begin{aligned} V_1\cdots V_N u \in x^\alpha L^\infty _{\text {loc}}(\mathsf {M})\quad \forall \ N\in \mathbb {N}, \end{aligned}$$

where each \(V_j\) is one of the vector fields \(x_k\partial _{x_k}\), \(\partial _{y_l}\), \(1\le k\le p\), \(1\le l\le q\). (A typical example of a conormal function is \(x^\beta \), where \(\beta \in \mathbb {R}^p\), \(\beta \ge \alpha \) component-wise). We say that a distribution u is conormal if it is conormal relative to \(x^\alpha L^\infty _{\text {loc}}(\mathsf {M})\) for some vector \(\alpha \in \mathbb {R}^p\) of weights. In the context of Theorem 1.1, the weights are specified in Theorem 1.8 and Remark 1.9 below; at this point we simply content ourselves with taking them to be 0 at each hypersurface.
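
To illustrate the definition in the simplest case \(p=1\), \(q=0\), \(\alpha =0\) (the three functions below are chosen purely for illustration, with \(\beta \ge 0\) and \(\lambda \in \mathbb {R}\)), one computes for \(x>0\)

$$\begin{aligned} (x\partial _x)^N x^\beta =\beta ^N x^\beta ,\qquad (x\partial _x)^N x^{i\lambda }=(i\lambda )^N x^{i\lambda },\qquad x\partial _x\sin (1/x)=-x^{-1}\cos (1/x). \end{aligned}$$

The first two functions are therefore conormal relative to \(L^\infty _{\text {loc}}\), even though \(x^{i\lambda }=e^{i\lambda \log x}\) oscillates infinitely often as \(x\rightarrow 0\) and has no continuous extension to \(x=0\) when \(\lambda \ne 0\). By contrast, \(\sin (1/x)\) is bounded but not conormal: each application of \(x\partial _x\) costs a further power of \(x^{-1}\), so no fixed weight \(\alpha \) works for all N.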

Before continuing the discussion of Theorem 1.1, we remark that the assumption that all weighted norms in (1.6) are finite is only needed to conclude the conormality of g. If one is only interested in controlling a finite number of derivatives of g, we only need to require the finiteness of finitely many weighted norms (1.6) (as can be seen by inspecting the Nash–Moser theorem we use in our nonlinear iteration).

Next, \(\mathcal {E}\)-smoothness is a refinement of conormality: the assumption of \(\mathcal {E}\)-smoothness, i.e. polyhomogeneity with index set \(\mathcal {E}\subset \mathbb {C}\times \mathbb {N}_0\), means, roughly speaking, that \(\langle r\rangle \widetilde{\gamma }\) (similarly \(\langle r\rangle ^2 k\)) has a full asymptotic expansion as \(r\rightarrow \infty \) of the form

$$\begin{aligned} \langle r\rangle \widetilde{\gamma } \sim \sum _{(z,k)\in \mathcal {E}} r^{-i z}(\log r)^k\widetilde{\gamma }_{(z,k)}(\omega ),\ \ \omega =x/|x|\in \mathbb {S}^2,\ \widetilde{\gamma }_{(z,k)}\in \mathcal {C}^\infty (\mathbb {S}^2;S^2 T^*\mathbb {R}^3), \end{aligned}$$
(1.7)

with \({\text {Im}}z<-b_0\), where for any fixed C, the number of \((z,k)\in \mathcal {E}\) with \({\text {Im}}z>-C\) is finite. (That is, \(\langle r\rangle \widetilde{\gamma }\) admits a generalized Taylor expansion into powers of \(r^{-1}\), except the powers may be fractional or even complex—that is, oscillatory—and logarithmic terms may occur. A typical example is that all z are of the form \(z=-i k\), \(k\in \mathbb {N}\), in which case (1.7) is an expansion into powers \(r^{-k}\), with potential logarithmic factors). The polyhomogeneity of g on the manifold with corners M means that at each of the hypersurfaces \(I^0\), \(\mathscr {I}^+\), and \(I^+\), the metric g admits an expansion similar to (1.7), with \(r^{-1}\) replaced by a defining function of the respective boundary hypersurface (for example \(\mathscr {I}^+\)) such that moreover each term in the expansion (which is thus a tensor on \(\mathscr {I}^+\)) is itself polyhomogeneous at the other boundaries (that is, at \(\mathscr {I}^+\cap I^0\) and \(\mathscr {I}^+\cap I^+\)). We refer the reader to §2.2 for precise definitions, and to Examples 7.2 and 7.3 for the list of index sets for two natural classes of polyhomogeneous initial data.
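
For concreteness, the exponents in (1.7) unpack as follows (a direct computation, writing \(z=a-i b\) with \(a,b\in \mathbb {R}\), a parametrization introduced here only for illustration):

$$\begin{aligned} r^{-i z}\big |_{z=-i k}=r^{-k},\qquad r^{-i z}\big |_{z=a-i b}=r^{-b}\,e^{-i a\log r}. \end{aligned}$$

Thus \(-{\text {Im}}z=b\) is the decay rate and \({\text {Re}}z=a\) is the frequency of oscillation in \(\log r\); the condition \({\text {Im}}z<-b_0\) says that every term in the expansion decays faster than \(r^{-b_0}\).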

Christodoulou [24] showed that, generically, one can only expect the metric g, suitably rescaled to a non-degenerate metric on a compactification of \(\mathbb {R}^4\), to be of class \(\mathcal {C}^{1,\alpha }\), \(\alpha <1\), due to the presence of logarithmic terms in the expansion of certain geometric quantities at null infinity; polyhomogeneity of the metric (rather than smoothness of a conformal multiple down to \(\mathscr {I}^+\)) is thus the best one can hope for, and this is what we establish here. (We also prove that the metric is indeed conformal to a non-degenerate metric of class \(\mathcal {C}^{1,\alpha }\), \(\alpha <\min (b_0,1)\), down to \(\mathscr {I}^+\); see Remark 8.12).

If the initial data do not have a full polyhomogeneous expansion, but only a partial expansion (containing only finitely many terms) plus a sufficiently regular remainder decaying faster than the terms in the expansion, the solution g will itself have a finite partial expansion at each boundary hypersurface, plus a faster decaying remainder; we shall not, however, record results of this nature here.

Applying a suitable version of this theorem both towards the future and the past, we show that the maximal globally hyperbolic development is given by a causally geodesically complete metric g, with analogous regularity and polyhomogeneity statements as in Theorem 1.1, on a suitable manifold with corners whose interior is diffeomorphic to \(\mathbb {R}^4\) (and contains \(\Sigma ^\circ \)), which now has the additional boundary hypersurfaces \(\mathscr {I}^-\) and \(I^-\); see Theorem 6.7 and the end of §7.

Like many other approaches to the stability problem (see the references below), our arguments apply to the Einstein–massless scalar field system \(\text {Ric}(g)=|\nabla \phi |_g^2\), \(\Box _g\phi =0\), with small initial data for the scalar field in order to obtain global stability. They also give the stability of the far end of a Schwarzschild black hole spacetime with any mass \(m\in \mathbb {R}\), i.e. of the domain of dependence of the complement of a sufficiently large ball in the initial surface, without smallness assumptions on the data: in this case, we control the solution up to some finite point along the radiation face \(\mathscr {I}^+\). See Remark 6.6.

The compactification M only depends on the ADM mass m of the initial data set;Footnote 2 for the class of initial data considered here, the mass gives the only long range contribution to the metric that significantly (namely, logarithmically) affects the bending of light rays: for the Schwarzschild metric (1.3), radially outgoing null-geodesics lie on the level sets of \(t-r-2 m\log (r-2 m)\). Concretely, near \(I^0\cup \mathscr {I}^+\), M will be the Penrose compactification of the region \(\{t/r<2,\,r\gg 1\}\subset \mathbb {R}^4\) within the Schwarzschild spacetime, i.e. equipped with the metric \(g_m^S\), blown up at spacelike and future timelike infinity. As in our previous work [53, 60] on Einstein’s equation, we prove Theorem 1.1 using a Newton-type iteration scheme (more precisely: Nash–Moser) in which we solve a linear equation globally on M at each step. While this approach brings many advantages (cf. Remark 1.3), a disadvantage of using a Nash–Moser iteration is the typically rather large number of derivatives needed compared to other approaches.
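
The logarithmic correction to the outgoing light cones mentioned above can be checked directly. As a sketch, assume the standard form of the Schwarzschild metric in signature \((+,-,-,-)\), whose restriction to radial directions is \((1-\tfrac{2 m}{r})\,d t^2-(1-\tfrac{2 m}{r})^{-1}\,d r^2\). Along a radially outgoing null curve in \(r>2 m\), this expression vanishes, hence

$$\begin{aligned} \frac{d t}{d r}=\Bigl (1-\frac{2 m}{r}\Bigr )^{-1}=1+\frac{2 m}{r-2 m},\qquad \text {so}\qquad t-r-2 m\log (r-2 m)=\text {const}. \end{aligned}$$

For \(m=0\) this reduces to the Minkowski cones \(t-r=\text {const}\); for \(m\ne 0\) the two families of cones separate at a rate \(2 m\log r\) as \(r\rightarrow \infty \), which is the reason the compactification has to be adapted to m.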

We do not quite use the wave coordinate gauge as in Lindblad–Rodnianski [79, 80], but rather a wave map gauge with background metric given by the Schwarzschild metric with mass m near \(I^0\cup \mathscr {I}^+\), glued smoothly into the Minkowski metric elsewhere; this is a more natural choice than using the Minkowski metric itself as a background metric (which would give the standard wave coordinate gauge), as the solution g will be a short range perturbation of \(g_m^S\) there. This gauge, which can be expressed as the vanishing of a certain 1-form \(\Upsilon (g)\), fixes the long range part of g and hence the main part of the null geometry at \(\mathscr {I}^+\). In order to ensure the gauge condition to a sufficient degree of accuracy (i.e. decay) at \(\mathscr {I}^+\) throughout our iteration scheme, we implement constraint damping, first introduced in the numerics literature in [46], and crucially used in [60]. This means that we use the 1-form encoding the gauge condition in a careful manner when passing from the Einstein equation (1.1) to its ‘reduced’ quasilinear hyperbolic form: we can arrange that for each iterate \(g_k\) in our iteration scheme, the gauge 1-form \(\Upsilon (g_k)\) vanishes sufficiently fast at \(\mathscr {I}^+\) so as to fix the long range part of g. In order to close the iteration scheme and control the nonlinear interactions, we need to keep precise track of the leading order behavior of the remaining metric coefficients at \(\mathscr {I}^+\). We discuss this in detail in §1.2.

Remark 1.2

Fixing the geometry at \(\mathscr {I}^+\) in this manner, the first step of our iteration scheme, i.e. solving the linearized gauge-fixed Einstein equation with the given (nonlinear) initial data of size \(\delta \), produces a solution with the correct long range behavior and which is \(\delta ^2\) close to the nonlinear solution in the precise function spaces on M in which we measure the solution. (Subsequent iteration steps give much more accurate approximations since the convergence of the iteration scheme is exponential). This suggests that our formulation of the gauge-fixed Einstein equation could allow for improvements of the accuracy of post-Minkowskian expansions—which are iterates of a Picard-type iteration scheme as in [80, Equation (1.7)]—used to study gravitational radiation from isolated sources [9].

The global stability of Minkowski space was established, building in particular on [22, 64], in the monumental work of Christodoulou–Klainerman [27] for asymptotically Schwarzschildean data (similar to those in (1.6) but with \(b_0\ge \tfrac{1}{2}\), though requiring only \(N=3\) derivatives) and precise control at null infinity, with an alternative proof using double null foliations by Klainerman–Nicolò [66]; and more recently in [79, 80] using the wave coordinate gauge, for initial data as in Theorem 1.1 (but requiring only \(N=10\) derivatives on the initial data). Friedrich [42] (see [43] for the Einstein–Yang–Mills case) established non-linear stability, using a conformal method, for a restrictive class (shown to be nonempty in [31]) of initial data, but with precise information on the asymptotic structure of the spacetime. Bieri [16] studied the problem for a very general class of data which are merely decaying like \(\langle r\rangle ^{-1/2-\delta }\) for some \(\delta >0\)—thus more slowly even than the \(\mathcal {O}(r^{-1})\) terms of Schwarzschild—and even less regularity than Christodoulou–Klainerman; in this case, the ‘correct’ compactification on which the metric has a simple description will have to depend on more than just the ADM mass (this is clear e.g. for the initial data constructed by Carlotto–Schoen [33], which are nontrivial only in conic wedges); Bieri and Chruściel [8, 26] construct a piece of \(\mathscr {I}^+\) for the data considered in [16] but without a smallness assumption. Further works on the stability of Minkowski space for the Einstein equations coupled to other fields, in the wake of [27, 79, 80], include those by Speck [99] on (a generalization of) the Einstein–Maxwell system, Taylor [103], Lindblad–Taylor [81], and Fajman–Joudioux–Smulevici [38] for both the massless and the massive Einstein–Vlasov system. 
We also mention Keir’s very general quasilinear results [63] which in particular imply the global solvability for small data of the gauge-fixed Einstein equation in harmonic coordinates (but without constraint damping) even when the gauge condition is violated, albeit at the expense of losing the precise asymptotic control at null infinity. The global stability for a minimally coupled massive scalar field was proved by LeFloch–Ma [75] and Wang [112].

The present paper contains the first proof of full conormality and polyhomogeneity of small nonlinear perturbations of Minkowski space in \(3+1\) dimensions. Lindblad–Rodnianski also established high conormal regularity, see [80, Equation (1.14)], though, in the context of the present paper, on the compactification corresponding to Minkowski rather than on M, and hence with a loss in the decay rates. This was improved by Lindblad [74], who proved sharp decay rates for the metric at null infinity (albeit in a slightly different gauge) and used them to establish a relationship between the ADM mass and the total amount of gravitational radiation. The decay in [74] corresponds to the leading order decay which we prove at \(\mathscr {I}^+\); we improve this by proving definite decay rates towards the leading order terms at \(\mathscr {I}^+\), and we strengthen the decay rate towards \(I^+\) to \(t^{-1}\); in fact, we show decay at a faster rate to an \(\mathcal {O}(t^{-1})\) leading order term, see the proof of Theorem 8.14. (Neither improvement requires polyhomogeneous initial data).

Previously, polyhomogeneity was established in spacetime dimensions \(\ge 9\) for the Einstein vacuum and Einstein–Maxwell equations, for initial data stationary outside of a compact set, by Chruściel–Wafo [34]; this relied on earlier work by Chruściel–Łeski [29] on the polyhomogeneity of solutions of hyperboloidal initial value problems Footnote 3 for a class of semilinear equations, and Loizelet’s proof [76, 77] of the electrovacuum extension (using wave coordinate and Lorenz gauges) of [79]; see also [7]. Lengard [69] studied hyperboloidal initial value problems and established the propagation of weighted Sobolev regularity for the Einstein equation, and of polyhomogeneity for nonlinear model equations. In spacetime dimensions 5 and above, Wang [110, 111] obtained the leading term (i.e. the ‘radiation field’) of \(g-\underline{g}{}\) at \(\mathscr {I}^+\), and proved high conormal regularity. Baskin–Wang [15] and Baskin–Sá Barreto [11] defined radiation fields for linear waves on Schwarzschild as well as for semilinear wave equations on Minkowski space. For initial data which are exactly Schwarzschildean outside a compact set and in even spacetime dimensions \(\ge 6\), stability and smoothness of \(\mathscr {I}^+\) were proved by Choquet-Bruhat–Chruściel–Loizelet [18] using a simple conformal argument, which requires very little information on the structure of the Einstein(–Maxwell) equation; see also [3] for a different approach in the vacuum case. The construction of the required initial data sets as well as questions of their smoothness and polyhomogeneity were taken up in the hyperboloidal context by Andersson–Chruściel–Friedrich [4] and extended in [1, 2], see also [28]. Paetz and Chruściel [32, 93] studied this for characteristic data; we refer to Corvino [31], Chruściel–Delay [21], and references therein for the case of asymptotically flat data sets.

The backbone of our proof is a systematic treatment of the stability of Minkowski space as a problem of proving regularity and asymptotics for a quasilinear (hyperbolic) equation on a compact, but geometrically complete manifold with corners M. That is, we employ analysis based on complete vector fields on M and the corresponding natural function spaces, which in this paper are b-vector fields, i.e. vector fields tangent to \(\partial M\), and spaces with conormal regularity or (partial) polyhomogeneous expansions; following Melrose [85, 87], this is called b-analysis (‘b’ for ‘boundary’). The point is that the smooth structure (the manifold M) and the algebra of differential operators appropriate for the problem at hand give a simple background on which to do analysis;Footnote 4 we will give examples and details in §1.1. In this context, it is often advantageous to work on a more complicated manifold M if this simplifies the algebraic structure of the equation at hand. While this point of view has a long history in the study of elliptic equations, see e.g. [48, 84, 85, 88, 98], its explicit use in hyperbolic problems is, to a large part, rather recent [13, 14, 52, 58, 59, 60, 86, 89, 90, 104]. We also point out that fixing the smooth structure on M, one gains the

A (clean) description of polyhomogeneous expansions, in particular at the transitions between different regimes such as near \(I^0\cap \mathscr {I}^+\) or \(\mathscr {I}^+\cap I^+\), requires working on a manifold with corners. More generally, it is often easier to define function spaces on \(M^\circ \) by working uniformly up to \(\partial M\), and decay rates from the perspective of \(M^\circ \) can be encoded as orders of vanishing at \(\partial M\) (the latter making sense since M is equipped with a smooth structure).Footnote 5

Working in a compactified setting furthermore makes the structures allowing for global existence clearly visible in the form of linear model operators defined at the boundary hypersurfaces. Among the key structures for Theorem 1.1 are the symmetries of the model operator \(L^0\) at \(\mathscr {I}^+\), which is essentially the product of two transport ODEs, as well as constraint damping and a certain null structure, both of which are simply a certain Jordan block structure of \(L^0\), with the null structure corresponding to a nilpotent Jordan block. At \(I^+\), the model operator will be closely related (via a conformal transformation) to the conformal Klein–Gordon equation on static de Sitter space, which enables us to determine the asymptotic behavior of g there via resonance expansions from known results on the asymptotics of conformal waves on de Sitter space.
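
The link between a nilpotent Jordan block and logarithmic terms can be seen in a two-component toy model. This is only an illustration and not the operator \(L^0\) of this paper; the boundary defining function \(\rho \), the unknowns \(u_1,u_2\), and the constants \(c,c_1,c_2\) and \(\gamma >0\) are hypothetical names introduced for this sketch:

$$\begin{aligned} \rho \partial _\rho \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}=\begin{pmatrix} 0 &{} 1 \\ 0 &{} 0 \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}\quad \Longrightarrow \quad u_2=c_2,\ \ u_1=c_1+c_2\log \rho . \end{aligned}$$

A nilpotent coupling thus produces at most logarithmic growth as \(\rho \rightarrow 0\), rather than a change in the power of \(\rho \). If the component feeding into \(u_1\) decays, say \(u_2=c\rho ^\gamma \), then \(\rho \partial _\rho u_1=c\rho ^\gamma \) gives \(u_1=c_1+(c/\gamma )\rho ^\gamma \), and the logarithm disappears altogether.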

A closely related reason for viewing a global problem (i.e. to be solved, at first glance, on a noncompact set) as a (degenerate) problem on a compact manifold with boundary or corners is that asymptotic data of the solution become restrictions of the solution to boundary hypersurfaces: it was for the purpose of giving a simple and conceptually clean description of the radiation field of scalar, electromagnetic, or gravitational waves, and also of solutions of the full nonlinear Einstein equation, that Penrose introduced his compactifications and diagrams. (These restrictions may solve interesting equations by themselves, as is the case for the Bondi mass loss formula at \(\mathscr {I}^+\), and in the case of the scattering argument which we will use at \(I^+\) to prove the vanishing of the final Bondi mass at the future boundary of \(\mathscr {I}^+\)). While a compactified perspective is often not strictly necessary for the description of asymptotic data and relations between them, it is usually conceptually advantageous, and brings to light the key features of a PDE problem which may be difficult to detect from the noncompact point of view, cf. the references above. (For example, finding the linearized version of the weak null structure of Lindblad–Rodnianski does not require any careful inspection, but simply the calculation of a partial Jordan block decomposition of a coefficient of a model operator defined at null infinity).

We also note that the symmetries and dynamical/geometric features of (asymptotically) Minkowski metrics relevant in each of these regimes are different. Hence, we find it advantageous to adapt our descriptions of coordinates, operators, and function spaces to the various asymptotic regimes and symmetries of the problem, rather than e.g. working throughout with standard \((t,x)\)-coordinates on \(\mathbb {R}^4\): the latter seem to be most useful for capturing the (approximate) translation-invariance of wave equations on (asymptotically) Minkowski spacetimes (which does not play a role in the stability proof), whereas scaling, boosts and rotations, though of course expressible in \((t,x)\)-coordinates, become very simple on M: they are smooth vector fields on M with some extra properties, such as tangency to \(\partial M\).

While the manifold M is compact, our analysis of the linear equations (arising from a linearization of the gauge-fixed Einstein equation) on M lying at the heart of this paper is not a short-time existence/regularity analysis near the interiors of \(I^0\), resp. \(I^+\), but rather a global in space, resp. global in time analysis. (Conformal methods such as [44] bringing \(I^0\) to a finite place have the drawback of imposing very restrictive regularity conditions on the initial data). At \(\mathscr {I}^+\), we use a version of Friedlander’s rescaling [39] of the wave equation, which does give equations with singular (conormal or polyhomogeneous) coefficients; but since \(\mathscr {I}^+\) is a null hypersurface, conormality or polyhomogeneity—which are notions of regularity defined with respect to (b-)vector fields, which are complete—are essentially transported along the generators of \(\mathscr {I}^+\). At the past and future boundaries of \(\mathscr {I}^+\), i.e. at \(I^0\cap \mathscr {I}^+\) and \(\mathscr {I}^+\cap I^+\), the two pictures fit together in a simple and natural fashion. We discuss this in detail in §§1.1.1 and 1.1.3.

We reiterate that our goal is to exhibit the conceptual simplicity of our approach, which we hope will allow for advances in the study of related stability problems which have a more complicated geometry on the base, i.e. on the level of the spacetime metric, on the fibers, i.e. for equations on vector bundles, or both. In particular, we are not interested in optimizing the number of derivatives needed for our arguments based on Nash–Moser iteration.

Following our general strategy, one can also prove the stability of Minkowski space in spacetime dimensions \(n+1\), \(n\ge 4\), for sufficiently decaying initial data, with the solution conormal (or polyhomogeneous, if the initial data are such), thus strengthening Wang’s results [111]. There are a number of simplifications due to the faster decay of linear waves in \(\mathbb {R}^{1+n}\): the compactification M of \(\mathbb {R}^{1+n}\) does not depend on the mass anymore and can be taken to be the blow-up of the Penrose diagram of Minkowski space at spacelike and future timelike infinity; we do not need to implement constraint damping as metric perturbations no longer have a long range term which would change the geometry of \(\mathscr {I}^+\); and we do not need to keep track of the precise behavior (such as the existence of leading terms at \(\mathscr {I}^+\)) of the metric perturbation. We shall not discuss this further here.

1.1 Aspects of the systematic treatment; examples

Consider a nonlinear partial differential equation \(P(u)=0\), with P encoding boundary or initial data as well, whose global behavior one wishes to understand for high regularity data which have small norm; denote by \(L_u:=D_u P\) the linearized operators. In the present problem, P will be the map assigning to a metric the value of the (gauge-fixed) Einstein operator on it, as well as its pair of initial data. Our strategy, with references to the implementation of each step for the present problem, is:

  1.

    fix a \(\mathcal {C}^\infty \) structure, that is, a compact manifold M, with boundary or corners, on which one expects the solution u to have a simple description (regularity, asymptotic behavior)—see §2.1 for the definition of the compactification of \(\mathbb {R}^4\) on which we will work;

  2.

    choose an algebra of differential operators and a scale of function spaces on M, say \(\mathcal {X}^s,\mathcal {Y}^s\), encoding the amount \(s\in \mathbb {R}\) of regularity as well as relevant asymptotic behavior, such that for \(u\in \mathcal {X}^\infty :=\bigcap _{s>0}\mathcal {X}^s\) small in some \(\mathcal {X}^s\) norm, the operator \(L_u\) lies in this algebra and maps \(\mathcal {X}^\infty \rightarrow \mathcal {Y}^\infty :=\bigcap _{s>0}\mathcal {Y}^s\)—see §§2.2 and 3.1 for the function spaces we will use: conormal sections of certain vector bundles together with certain leading order terms at null infinity; and §3.2 for the verification of the mapping property;

  3.

    show that for such small u, the operator \(L_u\) has a (right) inverse

    $$\begin{aligned} (L_u)^{-1}:\mathcal {Y}^\infty \rightarrow \mathcal {X}^\infty \end{aligned}$$
    (1.8)

    on these function spaces—see §§4, 5, discussed below;

  4.

    solve the nonlinear equation using a global iteration scheme, schematically

    $$\begin{aligned} u_0=0;\quad u_{k+1}=u_k+v_k,\ v_k=-(L_{u_k})^{-1}(P(u_k));\quad u=\lim _{k\rightarrow \infty } u_k\in \mathcal {X}^\infty . \end{aligned}$$
    (1.9)

    See §6.

  5.

    (Optional.) Improve on the regularity of the solution \(u\in \mathcal {X}^\infty \), provided the data has further structure such as polyhomogeneity or better decay properties, by using the PDE \(P(u)=0\) directly, or its approximation by linearized model problems in the spirit of \(0=P(u)\approx L_0 u+P(0)\) and a more precise analysis of \(L_0\). See §7, where we prove the polyhomogeneity for asymptotically Minkowski metrics.

We stress that steps 1 and 2 are nontrivial, as they require significant insights into the geometric and analytic properties of the PDE in question, and are thus intimately coupled to step 3; the function spaces in step 2 must be large enough in order to contain the solution u, but precise (i.e. small) enough so that the nonlinearities and linear solution operators are well-behaved on them.

Note that if one has arranged 3, then the iteration scheme (1.9) formally closes, i.e. all iterates \(u_k\) lie in \(\mathcal {X}^\infty \) modulo checking their required smallness in \(\mathcal {X}^s\). Checking the latter, thus making (1.9) rigorous, is however easy in many cases, for example by using Nash–Moser iteration [51, 100], which requires \((L_u)^{-1}\) to satisfy so-called tame estimates; these in turn are usually automatic from the proof of (1.8), which is often ultimately built out of simple algebraic operations like multiplications and taking reciprocals of operator coefficients or symbols, and energy estimates, for all of which tame estimates follow from the classical Moser estimates. The precise bookkeeping, done e.g. in [59], can be somewhat tedious but is only of minor conceptual importance: it only affects the number of derivatives of the data which need to be controlled, i.e. the number N in (1.6); in this paper, we shall thus be generous in this regard.
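As a schematic illustration of why the scheme (1.9) gains at each step, suppose (an assumption made only for this sketch) that P is twice differentiable between the relevant function spaces. Taylor's formula with integral remainder, combined with the defining relation \(L_{u_k}v_k=-P(u_k)\), then gives

$$\begin{aligned} P(u_{k+1}) = P(u_k)+L_{u_k}v_k+\int_0^1(1-s)\,D^2_{u_k+s v_k}P(v_k,v_k)\,ds = \int_0^1(1-s)\,D^2_{u_k+s v_k}P(v_k,v_k)\,ds, \end{aligned}$$

so the error after one step is quadratic in the correction \(v_k\). Roughly speaking, tame estimates are what allow one to exploit this quadratic gain even though \((L_u)^{-1}\) loses derivatives.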

As a further guiding principle, which applies in the context of our proof of Theorem 1.1, one can often separate step 3, i.e. the analysis of the equation \(L_u v=f\), into two pieces:

  3.1.

    prove infinite regularity of v but without precise asymptotics—see §4, where we accomplish this using simple energy estimates;

  3.2.

    improve on the asymptotic behavior of v to show \(v\in \mathcal {X}^\infty \)—see §5, where we use integration along approximate characteristics as well as spectral theory/normal operator arguments for this purpose.

The point is that a ‘background estimate’ from step 3.1 may render many terms of \(L_u\) lower order, thus considerably simplifying the analysis of asymptotics and decay; see e.g. the discussion around (1.22).

Remark 1.3

Let us compare this strategy to proofs using bootstrap arguments, which are commonly used for global existence problems for nonlinear evolution equations as e.g. in [27, 80, 82]. The choice of bootstrap assumptions is akin to choosing the function space \(\mathcal {X}^\infty \) (and thus implicitly \(\mathcal {Y}^\infty \)) in step 2, while the consistency of the bootstrap assumptions, without obtaining a gain in the constants in the bootstrap, is similar to proving (1.8). However, note that the bootstrap operates on a solution of the nonlinear equation, whereas we only consider linear equations; the gain in the bootstrap constants thus finds its analogue in the fact that one can make the iteration scheme (1.9) rigorous, e.g. using Nash–Moser iteration, and keep low regularity norms of \(u_k\) bounded (and \(v_k\) decaying with k) throughout the iteration scheme. In the context in particular of Einstein’s equation, a bootstrap argument has the advantage that the gauge condition is automatically satisfied as one is dealing with solutions of the nonlinear equation; thus the issue of constraint damping does not arise, whereas we do have to arrange this. In return, we gain significant flexibility in the choice of analytic tools for the global study of the linearized equations (e.g. methods from microlocal analysis, scattering theory), as used extensively in [60]; bootstrap arguments on the other hand are strongly tied to the character of P(u) as a (nonlinear) hyperbolic (or parabolic) and differential operator, or at least to its locality in ‘time’, and it is much less clear how to exploit global information (e.g. resonances).

Before discussing Einstein’s equation in §1.2, we first describe this strategy for scalar nonlinear wave equations on Minkowski space. The most significant part of the work required to implement this strategy is the analysis of the linear operators called \(L_u\) above; we thus begin in §1.1.1 by explaining how we obtain estimates for solutions of linear wave equations on Minkowski space in a manner that will work for linearizations of the gauge-fixed Einstein equation in §4. In §1.1.2, we then put a few examples of nonlinear scalar equations into the abstract general framework described above, including a discussion of polyhomogeneity (step 5 above) in §1.1.3.

1.1.1 Linear waves in Minkowski space

For step 1, we seek a convenient compactification M of \(\mathbb {R}^4\). The goal, from the PDE perspective, is for the asymptotic behavior of linear waves on \(\mathbb {R}^4\) to have a simple description on M; closely related to this is that the asymptotic behavior of natural geometric objects such as (null) geodesics should be simple. Consider first ‘null infinity’: a (rescaled) linear wave on \(\mathbb {R}^4\) has a limit as \(r\rightarrow \infty \) along any null-geodesic, e.g. the one defined by \(t-r=s_0\), \(\omega =\omega _0\in \mathbb {S}^2\) (using polar coordinates on \(\mathbb {R}^3\)) for \((s_0,\omega _0)\in \mathbb {R}\times \mathbb {S}^2\). Thus, we want to define M in such a way that a sequence of points, with \(r\rightarrow \infty \), along such a ray has a unique limit in M; that is, one boundary hypersurface of M should be equal to (the closureFootnote 6 of) all such limiting points, with a bijection between \((s_0,\omega _0)\) and points in (the interior of) this boundary hypersurface, and such a boundary hypersurface then deserves the name \(\mathscr {I}^+\). (The interior of \(\mathscr {I}^+\) is thus \((\mathscr {I}^+)^\circ \cong \mathbb {R}\times \mathbb {S}^2\)). The radiation field is then the restriction of the rescaled wave, extended from \(\mathbb {R}^4\) to M by continuity, to \(\mathscr {I}^+\subset \partial M\) (or \((\mathscr {I}^+)^\circ \) in standard terminology).

For other asymptotic regimes, there are a number of choices one can make on Minkowski space: the Penrose diagram, or the conformal embedding of Minkowski space into the Einstein universe give two (closely related) compactifications of \(\mathbb {R}^4\) in which future timelike and spacelike geodesic rays have limit points. In order to facilitate the generalization to compactifications of asymptotically Minkowskian spacetimes in §2, we choose to work with a compactification in which the closure of the set of these limiting points, called future timelike infinity \(I^+\) and spacelike infinity \(I^0\), are 3-dimensional (rather than 2-dimensional, as in the Penrose compactification); coordinates in their interiors are x/t with \(|x/t|<1\), \(t^{-1}=0\) in \((I^+)^\circ \), and \((t/r,\omega )\) with \(|t/r|<1\), \(r^{-1}=0\) in \((I^0)^\circ \).

At future timelike infinity \(I^+\), the asymptotic behavior of waves is governed, quite generally on suitable asymptotically Minkowski spacetimes, by quantum resonances [13];Footnote 7 also, nonlinear interactions are much simpler to deal with than near \(\mathscr {I}^+\). (This is a further reason to keep \((\mathscr {I}^+)^\circ \) and \((I^+)^\circ \) separate: it keeps the delicate analysis at \(\mathscr {I}^+\) separate on M from the more straightforward analysis at \(I^+\). The analysis at \(I^0\) is even simpler). We also point out that it is a specific feature of exact Minkowski space that one can ‘blow down’ \(I^+\); that is, suitably rescaled linear waves are smooth directly on the Penrose compactification, and the blow-up of timelike infinity \(i^+\) and spacelike infinity \(i^0\) in the Penrose diagram, as in Figure 1, is not required; on more general asymptotically Minkowski spacetimes on the other hand, one needs to resolve \(i^+\) and \(i^0\) via real blow-up, obtaining \(I^+\) and \(I^0\), in order to exhibit linear waves as polyhomogeneous (read: having a simple asymptotic description) functions on the compactification.

Thus, we begin by defining \(\overline{\mathbb {R}^4}\):

Definition 1.4

The radial compactification of \(\mathbb {R}^4\) is defined as

$$\begin{aligned} \overline{\mathbb {R}^4}:=\mathbb {R}^4\sqcup ([0,1)_R\times \mathbb {S}^3)/\sim , \end{aligned}$$
(1.10)

where \(\sim \) identifies \((R,\omega )\), \(R>0\), \(\omega \in \mathbb {S}^3\), with the point \(R^{-1}\omega \in \mathbb {R}^4\). The quotient carries the smooth structure in which the smooth functions are precisely those which over \(\mathbb {R}^4\) (the interior of \(\overline{\mathbb {R}^4}\)) are smooth in the usual sense, and which over \([0,1)_R\times \mathbb {S}^3_\omega \) are smooth in \((R,\omega )\) down to \(R=0\).

The function \(\rho :=(1+t^2+r^2)^{-1/2}\in \mathcal {C}^\infty (\overline{\mathbb {R}^4})\) is a boundary defining function, i.e. \(\partial \overline{\mathbb {R}^4}=\rho ^{-1}(0)\) with \(d\rho \) nondegenerate everywhere on \(\partial \overline{\mathbb {R}^4}\). Letting \(v=(t-r)/r\) away from \(r=0\), all future null-geodesics tend to \(S^+=\{\rho =0,\,v=0\}\), and we then define M as the closure of \(t\ge 0\) within the blow-upFootnote 8\([\overline{\mathbb {R}^4};S^+]\) of \(\overline{\mathbb {R}^4}\) at \(S^+\) (see Figure 1), i.e. the smooth manifold obtained by declaring polar coordinates around \(S^+\) to be smooth down to the origin. We refer to the front face \(\mathscr {I}^+\) of this blow-up as null infinity or the radiation face; it has a natural fibration by the fibers of the map \(\mathscr {I}^+\rightarrow S^+\), which we call the fibers of the radiation face/null infinity/\(\mathscr {I}^+\). (The interior of a typical fiber is equal to \(\mathbb {R}_{s_0}\times \{\omega _0\}\) for some fixed \(\omega _0\in \mathbb {S}^2\)).

We can equivalently describe M by giving a list of local coordinate patches and how (pieces of) \(\mathbb {R}^4\) are glued to them. We describe two exemplary coordinate charts here: the first one is

$$\begin{aligned}{}[0,1)_{\rho _0} \times [0,1)_{\rho _I} \times \mathbb {S}^2_\omega , \end{aligned}$$

and we identify \((\rho _0,\rho _I,\omega )\) for \(\rho _0,\rho _I>0\) with the point \((t,x)\in \mathbb {R}\times \mathbb {R}^3\) for \(t=\rho _0^{-1}(\rho _I^{-1}-1)\), \(x=\rho _0^{-1}\rho _I^{-1}\omega \). Thus,

$$\begin{aligned} \rho _0 = (r-t)^{-1},\quad \rho _I = (r-t)/r; \end{aligned}$$
(1.11)

then \(I^0\), resp. \(\mathscr {I}^+\) is locally given by \(\rho _0=0\), resp. \(\rho _I=0\); thus, this chart describes a neighborhood of \(I^0\cap \mathscr {I}^+\), i.e. the transition from spacelike to null infinity. (For example, \(\{\rho _0=0,\ \rho _I=c\}\) for some fixed \(c\in (0,1)\) consists of the points ‘at (spacelike) infinity’ of a spacelike cone in Minkowski space, while \(\{\rho _0=c,\ \rho _I=0\}\) consists of the points ‘at (null) infinity’ of a null cone). See Figure 2.
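As a quick consistency check of the chart (1.11), one can invert the identification directly; the computation uses only the formulas for t and x stated above. Since \(|\omega |=1\),

$$\begin{aligned} r=|x|=\rho _0^{-1}\rho _I^{-1},\quad r-t=\rho _0^{-1}\rho _I^{-1}-\rho _0^{-1}(\rho _I^{-1}-1)=\rho _0^{-1},\quad \frac{r-t}{r}=\rho _I, \end{aligned}$$

in agreement with (1.11). In particular \(\rho _0\rho _I=r^{-1}\), so a level set \(\rho _0=c>0\) is the null cone \(r-t=c^{-1}\), along which \(\rho _I=(c r)^{-1}\rightarrow 0\) as \(r\rightarrow \infty \).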

Fig. 2

Illustration of the coordinate chart (1.11). Shown are a number of level sets of \(\rho _0\) (red dashed lines) and \(\rho _I\) (blue dashed lines) projected onto the \((t,r)\) plane. Indicated on the top right is the \((\rho _0,\rho _I,\omega )\) coordinate system including the boundary hypersurfaces \(I^0\) and \(\mathscr {I}^+\) which are glued onto \(\mathbb {R}^4\)

The second coordinate chart is

$$\begin{aligned}{}[0,1)_{\tilde{\rho }_I} \times [0,1)_{\rho _+} \times \mathbb {S}^2_\omega , \end{aligned}$$

and \((\tilde{\rho }_I,\rho _+,\omega )\) for \(\tilde{\rho }_I,\rho _+>0\) is identified with \((t,x)\) for \(t=\rho _+^{-1}(\tilde{\rho }_I^{-1}+1)\), \(x=\tilde{\rho }_I^{-1}\rho _+^{-1}\omega \); thus

$$\begin{aligned} \tilde{\rho }_I = (t-r)/r,\quad \rho _+=(t-r)^{-1}. \end{aligned}$$
(1.12)

(Now \(\{\tilde{\rho }_I=c,\ \rho _+=0\}\) for \(c\in (0,1)\) consists of the points ‘at (future timelike) infinity’ of a timelike cone in Minkowski space). When the coordinate system in which we work is clear, we simply write \(\rho _I\) instead of \(\tilde{\rho }_I\).

To motivate a preliminary choice of function spaces for step 2, recall that the behavior of solutions of \(\Box _{\underline{g}{}}u=0\), where \(\Box _{\underline{g}{}}u:=-u_{;\mu }{}^\mu \), near \(\mathscr {I}^+\) can be studied using the Friedlander rescaling

$$\begin{aligned} L:=\rho ^{-3}\Box _{\underline{g}{}}\rho . \end{aligned}$$
(1.13)

This operator has smooth coefficients down to the interior \((\mathscr {I}^+)^\circ \) of null infinity: it is equal to the conformal wave operator \(\Box _{\rho ^2\underline{g}{}}-\tfrac{1}{6}R_{\rho ^2\underline{g}{}}\), and \(\rho ^2\underline{g}{}\) is a smooth, nondegenerate Lorentzian metric down to \((\mathscr {I}^+)^\circ \): in local coordinates \(\rho =r^{-1}\ge 0\), \(x^1=t-r\in \mathbb {R}\), \(\omega \in \mathbb {S}^2\) near \((\mathscr {I}^+)^\circ \), we have \(\rho ^2\underline{g}{}=\rho ^2(dx^1)^2-2\,dx^1\,d\rho -g_{\mathbb {S}^2}\), with \(g_{\mathbb {S}^2}\) the round metric on \(\mathbb {S}^2\). Thus, solutions of \(L u=0\), with \(\mathcal {C}^\infty _{\text {c}}(\mathbb {R}^3)\) initial data, are smooth up to \(\mathscr {I}^+\) and typically nonvanishing there. We shall refer to this reasoning as Friedlander’s argument below. (A more sophisticated version of this observation lies at the heart of Friedrich’s conformal approach [40] to the study of Einstein’s equation). However, for more general initial data, and, more importantly, in many nonlinear settings (see §§1.1.2 and 1.2 below), smoothness will not be the robust notion, and we must settle for less: conormality at \(\partial M\). Namely, let \(\mathcal {V}_{\text {b}}(M)\) denote the Lie algebra of b-vector fields, i.e. vector fields tangent to the boundary hypersurfaces of M other than the closure \(\Sigma \) of the initial surface \(\Sigma ^\circ =\{t=0\}\); a function u on M is conormal iff it remains in a fixed weighted \(L^2\) space on M upon application of any finite number of b-vector fields. For M defined above, \(\mathcal {V}_{\text {b}}(M)\) is spanned over \(\mathcal {C}^\infty (M)\) by translations \(\partial _t\) and \(\partial _{x^i}\) as well as the scaling vector field \(t\partial _t+x\partial _x\), boosts \(t\partial _{x^i}+x^i\partial _t\), and rotation vector fields \(x^i\partial _{x^j}-x^j\partial _{x^i}\).Footnote 9 (Note however that the definition of \(\mathcal {V}_{\text {b}}(M)\) depends only on the smooth structure of M.Footnote 10)
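For the reader's convenience, here is a sketch of the computation of \(\rho ^2\underline{g}{}\) in the coordinates \((\rho ,x^1,\omega )\). We assume (this sign convention is not stated in the present section) that \(\underline{g}{}=dt^2-dr^2-r^2 g_{\mathbb {S}^2}\), where \(g_{\mathbb {S}^2}\) denotes the round metric; with the opposite signature, all signs below flip. Writing \(t=x^1+r\) and \(r=\rho ^{-1}\), so that \(dr=-\rho ^{-2}\,d\rho \), one finds

$$\begin{aligned} dt^2-dr^2&=(dx^1)^2+2\,dx^1\,dr=(dx^1)^2-2\rho ^{-2}\,dx^1\,d\rho ,\\ \rho ^2\underline{g}{}&=\rho ^2(dx^1)^2-2\,dx^1\,d\rho -g_{\mathbb {S}^2}. \end{aligned}$$

At \(\rho =0\) this equals \(-2\,dx^1\,d\rho -g_{\mathbb {S}^2}\), which is a nondegenerate Lorentzian metric; moreover the dual metric satisfies \((\rho ^2\underline{g}{})^{-1}(d\rho ,d\rho )=-\rho ^2\), which vanishes at \(\rho =0\), consistent with \(\mathscr {I}^+\) being a null hypersurface.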

Let us now explain how to obtain a background estimate, step 3.1 above, for the forcing problem \(L u=f\) with trivial initial data. First, we can estimate u in \(H^1\) on any compact subset of \(\mathbb {R}^4\cap \{t\ge 0\}\) by f on another compact set. Then, on a neighborhood of \((I^0)^\circ \) which is diffeomorphic to \([0,1)_{\rho _0}\times (0,1)_\tau \times \mathbb {S}^2\), where

$$\begin{aligned} \rho _0 := r^{-1},\ \ \tau := t/r, \end{aligned}$$

with \(\rho _0\) a local boundary defining function of \(I^0\), this problem roughly takes the form

$$\begin{aligned} \bigl (D_\tau ^2-(\rho _0 D_{\rho _0})^2-\Delta _{\mathbb {S}^2}\bigr )u = f, \end{aligned}$$
(1.14)

where we use the standard notation

$$\begin{aligned} D = \frac{1}{i}\partial ,\quad i=\sqrt{-1}. \end{aligned}$$
(1.15)

In (1.14), \(\Delta _{\mathbb {S}^2}\ge 0\) is the (nonnegative) Laplacian on \(\mathbb {S}^2\), and f has suitable decay properties making its norms in the estimates below finite. This is a wave equation on the (asymptotically) cylindrical manifold \([0,1)_{\rho _0}\times \mathbb {S}^2\). Let

$$\begin{aligned} U_0=\{0\le \tau \le c,\ \rho _0\le 1\},\quad c\in (0,1). \end{aligned}$$

For any weight \(a_0\in \mathbb {R}\), we can run an energy estimate using the vector field multiplier \(\rho _0^{-2 a_0}\partial _\tau \) and obtain

$$\begin{aligned} \Vert u \Vert _{\rho _0^{a_0}H_{{\text {b}}}^1(U_0)} \lesssim \Vert f \Vert _{\rho _0^{a_0}L^2_{\text {b}}(U_0)} \end{aligned}$$
(1.16)

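for f supported in \(U_0\); see Figure 3. Here \(L^2_{\text {b}}\) is the \(L^2\) space with respect to the b-density \(|\tfrac{d\rho _0}{\rho _0}\,d\tau \,dg_{\mathbb {S}^2}|\), the weighted \(L^2_{\text {b}}\) norm is defined by \(\Vert f\Vert _{\rho _0^{a_0}L^2_{\text {b}}}=\Vert \rho _0^{-a_0}f\Vert _{L^2_{\text {b}}}\), and \(H_{{\text {b}}}^1\) is the space of all \(u\in L^2_{\text {b}}\) such that \(V u\in L^2_{\text {b}}\) for all \(V\in \mathcal {V}_{\text {b}}(M)\); in \(U_0\), \(\mathcal {V}_{\text {b}}(M)\) is spanned (over \(\mathcal {C}^\infty (M)\)) by \(\partial _\tau \), \(\rho _0\partial _{\rho _0}\), and vector fields on \(\mathbb {S}^2\), so we let

$$\begin{aligned} \Vert u\Vert _{H_{{\text {b}}}^1(U_0)}^2 := \Vert u\Vert _{L^2_{\text {b}}}^2+\Vert \partial _\tau u\Vert _{L^2_{\text {b}}}^2+\Vert \rho _0\partial _{\rho _0}u\Vert _{L^2_{\text {b}}}^2+\Vert \nabla _{\mathbb {S}^2}u\Vert _{L^2_{\text {b}}}^2. \end{aligned}$$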

In order to obtain a higher regularity estimate, one can commute any number of b-vector fields through (1.14); the estimate (1.16) only relies on the principal (wave) part of L; lower order terms arising as commutators are harmless. Thus, \(f\in \rho _0^{a_0}H_{{\text {b}}}^\infty \) (weighted \(L^2_{\text {b}}\)-regularity with respect to any finite number of b-vector fields) implies \(u\in \rho _0^{a_0}H_{{\text {b}}}^\infty \), with estimates.
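The rough form of the equation near \((I^0)^\circ \) referred to in (1.14) can be checked by a direct computation, sketched here under two assumptions of ours: that \(\Box _{\underline{g}{}}=-\partial _t^2+\partial _r^2+2r^{-1}\partial _r-r^{-2}\Delta _{\mathbb {S}^2}\) in polar coordinates, with \(\Delta _{\mathbb {S}^2}\ge 0\) the nonnegative Laplacian on the unit sphere, and that one takes \(\rho =r^{-1}\) in (1.13). Since \((\partial _r^2+2r^{-1}\partial _r)(r^{-1}u)=r^{-1}\partial _r^2 u\), one gets \(L=r^2(-\partial _t^2+\partial _r^2)-\Delta _{\mathbb {S}^2}\). In the coordinates \(\rho _0=r^{-1}\), \(\tau =t/r\), we have \(r\partial _t=\partial _\tau \) and \(r\partial _r=-(\rho _0\partial _{\rho _0}+\tau \partial _\tau )\), hence, using \(r^2\partial _r^2=(r\partial _r)^2-r\partial _r\),

$$\begin{aligned} L=-\partial _\tau ^2+(\rho _0\partial _{\rho _0}+\tau \partial _\tau )^2+(\rho _0\partial _{\rho _0}+\tau \partial _\tau )-\Delta _{\mathbb {S}^2}. \end{aligned}$$

The coefficient of \(\partial _\tau ^2\) is \(-(1-\tau ^2)\), which is negative precisely for \(|\tau |<1\), i.e. in the region covered by this chart. At \(\tau =0\), and modulo first order terms, L reduces to \(-\partial _\tau ^2+(\rho _0\partial _{\rho _0})^2-\Delta _{\mathbb {S}^2}=D_\tau ^2-(\rho _0 D_{\rho _0})^2-\Delta _{\mathbb {S}^2}\), a wave operator on a cylinder with \(\tau \) playing the role of time.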

The same conclusion holds for the initial value problem for \(L u=0\) with initial data which near \(I^0\) are \((u|_{\tau =0},\partial _\tau u|_{\tau =0})=(u|_{t=0},r \partial _t u|_{t=0})=(u_0,u_1)\), \(u_j\in \rho _0^{a_0}H_{{\text {b}}}^\infty (\overline{\mathbb {R}^3})\), where \(\overline{\mathbb {R}^3}\) is the radial compactification of \(\mathbb {R}^3\), defined analogously to (1.10), which has boundary defining function \(\rho _0=r^{-1}\). The assumption (1.6) on the size of initial data is a smallness condition on \(\Vert \langle r\rangle \widetilde{\gamma }\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{N+1}}+\Vert \langle r\rangle ^2 k\Vert _{\rho _0^{b_0}H_{{\text {b}}}^N}\).

Fig. 3

The domain \(U_0\) on which the energy estimate (1.16) holds. Left: as a subset of M. Right: as a subset of the Penrose compactification

Re-defining \(\rho =r^{-1}\) near \(S^+\), a neighborhood of \(I^0\cap \mathscr {I}^+\) is diffeomorphic to \([0,1)_{\rho _0}\times [0,1)_{\rho _I}\times \mathbb {S}^2\), where (as in (1.11))

$$\begin{aligned} \rho _0 := -\rho /v = (r-t)^{-1},\ \ \rho _I := -v = (r-t)/r \end{aligned}$$
(1.17)

are boundary defining functions of \(I^0\) and \(\mathscr {I}^+\), respectively. (Thus, a function bounded by \(\rho _0^{a_0}\rho _I^{a_I}\) decays like \(r^{-a_0}\) near \((I^0)^\circ \) and like \(r^{-a_I}\) near \((\mathscr {I}^+)^\circ \)). The lift of L to M is singular as an element of \(\text {Diff}_{\text {b}}^2(M)\) but with a very precise structure at \(\mathscr {I}^+\): the equation \(L u=f\) is now of the form

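$$\begin{aligned} \bigl (2\partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})-\Delta _{\mathbb {S}^2}\bigr )u = f \end{aligned}$$
(1.18)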

modulo terms with more decay; here, ignoring weights, \(\rho _I\partial _{\rho _I}\sim \partial _t+\partial _r\) and \(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}\sim \partial _t-\partial _r\) are the radial null vector fields. Assuming f vanishes far away from \(\mathscr {I}^+\), we can run an energy estimate using \(V=\rho _0^{-2 a_0}\rho _I^{-2 a_I}V_0\) as a multiplier, where \(V_0=-c\rho _I\partial _{\rho _I}+\rho _0\partial _{\rho _0}\) is future timelike in \(M\setminus \mathscr {I}^+\) if we choose \(c<1\); note that \(V_0\) is tangent to \(I^0\) and \(\mathscr {I}^+\) (and null at \(\mathscr {I}^+\)); it is necessary to arrange this tangency for compatibility with our conormal function spaces, but it comes at the expense of giving control at \(\mathscr {I}^+\) that is weaker (but more robust, i.e. holds for a larger class of spacetimes) than the smoothness following from Friedlander’s argument. A simple calculation, cf. Lemma 4.4, shows that for \(a_I<a_0\) and \(a_I<0\),

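$$\begin{aligned} \Vert u\Vert _{\rho _0^{a_0}\rho _I^{a_I}L^2_{\text {b}}(U_I)}+\Vert (\rho _0\partial _{\rho _0},\rho _I\partial _{\rho _I},\rho _I^{1/2}\nabla _{\mathbb {S}^2})u\Vert _{\rho _0^{a_0}\rho _I^{a_I}L^2_{\text {b}}(U_I)}\lesssim \Vert f\Vert _{\rho _0^{a_0}\rho _I^{a_I-1}L^2_{\text {b}}(U_I)}; \end{aligned}$$
(1.19)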

see Figure 4, where \(L^2_{\text {b}}\) is the \(L^2\) space with integration measure \(|\tfrac{d\rho _0}{\rho _0}\,\tfrac{d\rho _I}{\rho _I}\,dg_{\mathbb {S}^2}|\). The assumptions on the weights are natural: since \(\partial _t-\partial _r\) transports mass from \(I^0\) to \(\mathscr {I}^+\), we certainly need \(a_I\le a_0\), while \(a_I<0\) is necessary since, in view of the behavior of linear waves discussed after (1.13), the estimate must apply to u which are smooth and nonzero down to \(\mathscr {I}^+\). In (1.19), derivatives of u along b-vector fields tangent to the fibers of the radiation face are controlled without a loss in weights, while general derivatives such as spherical ones lose a factor of \(\rho _I^{1/2}\).Footnote 11 When controlling error terms later on, we thus need to separate them into terms involving fiber-tangent b-derivatives and general b-derivatives, and check that the coefficients of the latter have extra decay in \(\rho _I\); see §2.4.

Fig. 4

The domain \(U_I\) on which the energy estimate (1.19) holds

From (1.18), \(L\in \rho _I^{-1}\,\text {Diff}_{\text {b}}^2(M)\) is equal to the model operator

$$\begin{aligned} L^0 := 2\partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}) \end{aligned}$$

modulo \(\text {Diff}_{\text {b}}^2(M)\) (i.e. ignoring second order differential operators, such as \(\Delta _{\mathbb {S}^2}\), which are sums of at most twofold products of b-vector fields). The commutation properties of this model are what allows for higher regularity estimates:Footnote 12 (\(\rho _I\) times) equation (1.18) commutes with \(\rho _0\partial _{\rho _0}\) (scaling), \(\rho _I\partial _{\rho _I}\) (roughly a combination of scaling and boosts), and spherical vector fields which are independent of \(\rho _0\) and \(\rho _I\).Footnote 13 In the end, we obtain \(u\in \rho _0^{a_0}\rho _I^{a_I}H_{{\text {b}}}^\infty \) when \(f\in \rho _0^{a_0}\rho _I^{a_I-1}H_{{\text {b}}}^\infty \).
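The statement that L agrees with \(L^0\) modulo \(\text {Diff}_{\text {b}}^2(M)\) can be made explicit. The following sketch assumes (our sign convention, not stated in this section) that \(L=-r^2(\partial _t+\partial _r)(\partial _t-\partial _r)-\Delta _{\mathbb {S}^2}\) with \(\Delta _{\mathbb {S}^2}\ge 0\), which is what (1.13) gives for \(\rho =r^{-1}\) when \(\Box _{\underline{g}{}}=-\partial _t^2+\sum _i\partial _{x^i}^2\). Differentiating the coordinates (1.17) gives the exact expressions

$$\begin{aligned} \partial _t+\partial _r=-\rho _0\rho _I\,\rho _I\partial _{\rho _I},\quad \partial _t-\partial _r=\rho _0\bigl (2(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})+\rho _I\,\rho _I\partial _{\rho _I}\bigr ), \end{aligned}$$

and since \(r^2=\rho _0^{-2}\rho _I^{-2}\) and \(\rho _0\) commutes with \(\partial _{\rho _I}\),

$$\begin{aligned} L=\partial _{\rho _I}\bigl (2(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})+\rho _I^2\partial _{\rho _I}\bigr )-\Delta _{\mathbb {S}^2}=L^0+(\rho _I\partial _{\rho _I})^2+\rho _I\partial _{\rho _I}-\Delta _{\mathbb {S}^2}. \end{aligned}$$

The last three terms are sums of at most twofold products of b-vector fields with smooth coefficients, whereas \(L^0\) carries the singular factor \(\rho _I^{-1}\) when written in terms of \(\rho _I\partial _{\rho _I}\). (Constant factors in formulas of this kind depend on the normalization chosen for the null vector fields.)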

Lastly, near \(I^+\), one can use an energy estimate with weight \(\rho _I^{-2 a_I}\rho _+^{-2 a_+}\), \(a_+<a_I\) large and negative, multiplying a timelike extension of the above \(V_0\); higher regularity follows by commuting with the scaling vector field \(\rho _+\partial _{\rho _+}\), where \(\rho _+\) is a defining function of \(I^+\), and elliptic regularity for \(C(\rho _+ D_{\rho _+})^2-L\), \(C>0\) large, in \(I^+\) away from \(\mathscr {I}^+\), which is a consequence of the timelike nature of the scaling vector field \(\rho _+\partial _{\rho _+}\) in \((I^+)^\circ \). See Figure 5. Note that it is only at this stage that one uses the asymptotically Minkowskian nature of the metric in a neighborhood of all of \(I^+\); when dealing with a more complicated geometry, as e.g. in the study of perturbations of a Schwarzschild black hole, establishing this part of the background estimate (as well as the more precise asymptotics at \(I^+\) discussed momentarily) becomes a major difficulty.

Fig. 5

The neighborhood (shaded) of \(I^+\) on which we use a global (in \(I^+\)) weighted energy estimate

Putting everything together, we find that

$$\begin{aligned} f\in \rho _0^{a_0}\rho _I^{a_I-1}\rho _+^{a_+}H_{{\text {b}}}^\infty (M),\ f\equiv 0\ \text{ near }\ \Sigma \ \Longrightarrow \ u\in \rho _0^{a_0}\rho _I^{a_I}\rho _+^{a_+}H_{{\text {b}}}^\infty (M), \end{aligned}$$
(1.20)

for \(a_I<\min (a_0,0)\) and \(a_+<a_I\).Footnote 14

For nonlinear applications, the information (1.20) on u is not sufficient: the decay rate at \(\mathscr {I}^+\) is limited, and we do not have a good decay rate at \(I^+\) either, cf. the discussion of \(\rho _0^{a_0}\rho _I^{a_I}\) following (1.17). Let us thus turn to step 3.2 and analyze \(L u=f\) for f, vanishing near \(\Sigma \), having more decay,

$$\begin{aligned} f\in \mathcal {Y}^\infty := \rho _0^{b_0}\rho _I^{-1+b_I}\rho _+^{b_+}H_{{\text {b}}}^\infty (M);\quad b_+<b_I<b_0,\ \ b_I\in (0,1). \end{aligned}$$
(1.21)

The background estimate (1.20) gives \(u\in \rho _0^{b_0}\rho _I^{-\epsilon }\rho _+^{a_+}H_{{\text {b}}}^\infty \) for all \(\epsilon >0\). Near \(I^0\cap \mathscr {I}^+\) then, the conormality of u allows for equation (1.18) to be written as

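$$\begin{aligned} \rho _I\partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})u = \tfrac{1}{2}\rho _I\bigl (f+\Delta _{\mathbb {S}^2}u\bigr )\in \rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty , \end{aligned}$$
(1.22)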

i.e. L effectively becomes the composition of (linear) transport equations along the two radial null directions. See Figure 6. Integration of \(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}\) is straightforward, while integrating \(\rho _I\partial _{\rho _I}\), which is a regular singular ODE with indicial root 0, implies that u has a leading order term at \(\mathscr {I}^+\); one finds that

$$\begin{aligned} u = u^{(0)} + u_{\text {b}};\ \ u^{(0)}\in \rho _0^{b_0}H_{{\text {b}}}^\infty (\mathscr {I}^+),\ u_{\text {b}}\in \rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty (M)\ \ \text{ near }\ I^0\cap \mathscr {I}^+, \end{aligned}$$

which implies the existence of the radiation field.Footnote 15 The procedure to integrate along (approximate) characteristics to get sharp decay is frequently employed in the study of nonlinear waves on (asymptotically) Minkowski spaces, see e.g. [80, §2.2], [74], and their precursors [71, 72].
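To illustrate how the indicial root 0 produces a leading term, consider the model ODE \(\rho _I\partial _{\rho _I}w=h\) on \((0,1]\), with \(\rho _0\) and \(\omega \) treated as parameters and with the pointwise bound \(|h|\le C\rho _I^{b}\), \(b>0\). (Pointwise bounds stand in here for the weighted \(L^2\)-based conormal spaces; this is a simplification made for the sketch.) Integrating in \(d\rho _I/\rho _I\) gives

$$\begin{aligned} w(\rho _I)=w^{(0)}+\int _0^{\rho _I}h(s)\,\frac{ds}{s},\quad w^{(0)}:=w(1)-\int _0^1 h(s)\,\frac{ds}{s},\quad \Bigl |\int _0^{\rho _I}h(s)\,\frac{ds}{s}\Bigr |\le \frac{C}{b}\,\rho _I^{b}, \end{aligned}$$

so w is the sum of a term \(w^{(0)}\) independent of \(\rho _I\) and a remainder of size \(\rho _I^{b}\), which is the structure of the decomposition \(u=u^{(0)}+u_{\text {b}}\) above.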

Fig. 6

The integral curves of the vector fields \(\partial _t+\partial _r\sim -\rho _I\partial _{\rho _I}\) and \(\partial _t-\partial _r\sim \rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}\). Integration along the former gives the leading term at \(\mathscr {I}^+\), while integration along the latter transports weights (and polyhomogeneity) from \(I^0\) to \(\mathscr {I}^+\)

At \(\mathscr {I}^+\cap I^+\), the same argument works, showing that \(u^{(0)}\) and \(u_{\text {b}}\) are bounded by \(C\rho _+^{a_+}\) and \(C\rho _I^{b_I}\rho _+^{a_+}\) near \(I^+\) (i.e. by \(t^{-a_+}\) as \(t\rightarrow \infty \) with r/t in compact subsets of [0, 1)). Improving this weight however does not follow from such a simple argument. Indeed, at \(I^+\), the behavior of u is governed by scattering theoretic phenomena: the asymptotics are determined by scattering resonances of a model operator at \(I^+\), namely the normal operator of the b-differential operator L at \(I^+\), obtained by freezing its coefficients at \(I^+\), see equation (2.2). We thus use the arguments introduced in [107], see also [58, Theorem 2.21], based on Mellin transform in \(\rho _+\), inversion of a ‘spectral family’ \(\widehat{L}(\sigma )\), which is the conjugation of the model operator (called ‘normal operator’ in b-parlance) of L at \(I^+\) by the Mellin transform in \(I^+\), with \(\sigma \) the dual parameter to \(\rho _+\), and contour shifting in the inverse Mellin transform to find the correct asymptotic behavior at \(I^+\): the resonances \(\sigma \in \mathbb {C}\), which are the poles of \(\widehat{L}(\sigma )^{-1}\), give rise to a term \(\rho _+^{i\sigma }v\), v a function on \(I^+\), in the asymptotic expansion of u. (See §§5.2 and 7 for details). The resonances can easily be calculated explicitly in the present context, and they all satisfy \(-{\text {Im}}\sigma \ge 1>b_+\). The upshot is that

$$\begin{aligned} f \in \mathcal {Y}^\infty \ \Rightarrow \ u \in \mathcal {X}^\infty&:= \bigl \{ \chi u^{(0)} + u_{\text {b}}:u^{(0)}\in \rho _0^{b_0}\rho _+^{b_+}H_{{\text {b}}}^\infty (\mathscr {I}^+),\nonumber \\&\qquad \qquad u_{\text {b}}\in \rho _0^{b_0}\rho _I^{b_I}\rho _+^{b_+}H_{{\text {b}}}^\infty (M) \bigr \}, \end{aligned}$$
(1.23)

where \(\chi \) cuts off to a neighborhood of \(\mathscr {I}^+\).
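For orientation, we sketch the contour shifting argument in a model situation. Normalizations of the Mellin transform vary between references; the one used below is an assumption of ours, chosen so as to match the form \(\rho _+^{i\sigma }v\) of the resonant terms. For a function u of \(\rho _+\) supported in \(\rho _+\le 1\) and bounded by \(C\rho _+^{a}\), set

$$\begin{aligned} \widehat{u}(\sigma )=\int _0^\infty \rho _+^{-i\sigma }u(\rho _+)\,\frac{d\rho _+}{\rho _+},\quad u(\rho _+)=\frac{1}{2\pi }\int _{{\text {Im}}\sigma =-\alpha }\rho _+^{i\sigma }\widehat{u}(\sigma )\,d\sigma ,\quad \alpha <a, \end{aligned}$$

so that \(\widehat{u}\) is holomorphic in \({\text {Im}}\sigma >-a\) and \(\rho _+D_{\rho _+}\) corresponds to multiplication by \(\sigma \). If \(\widehat{u}\) extends meromorphically to \({\text {Im}}\sigma \ge -\alpha '\) with a single simple pole \(\sigma _0\) in the strip \(-\alpha '<{\text {Im}}\sigma <-\alpha \), and decays suitably as \(|{\text {Re}}\sigma |\rightarrow \infty \), then the residue theorem gives

$$\begin{aligned} u(\rho _+)=\rho _+^{i\sigma _0}v+\frac{1}{2\pi }\int _{{\text {Im}}\sigma =-\alpha '}\rho _+^{i\sigma }\widehat{u}(\sigma )\,d\sigma ,\quad v=-i\,\mathrm {Res}_{\sigma =\sigma _0}\widehat{u}(\sigma ). \end{aligned}$$

Since \(|\rho _+^{i\sigma _0}|=\rho _+^{-{\text {Im}}\sigma _0}\), a resonance with \(-{\text {Im}}\sigma _0\ge 1\) contributes a term decaying at least like \(\rho _+\), while the remaining integral is of size \(\rho _+^{\alpha '}\). As a check of the normalization, \(u=\rho _+^{c}\) on \(\rho _+\le 1\) with \(c>0\) has \(\widehat{u}(\sigma )=(c-i\sigma )^{-1}\), with pole \(\sigma _0=-ic\), residue i, and thus \(v=1\).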

For later use as a simple model for constraint damping, consider a more general equation,

$$\begin{aligned} L_\gamma u \equiv \rho ^{-3}(\Box _{\underline{g}{}} - 2\gamma t^{-1}\partial _t)(\rho u) = f, \end{aligned}$$
(1.24)

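for \(\gamma \in \mathbb {R}\); near \(\mathscr {I}^+\) and \(I^0\), this now roughly takes the form

$$\begin{aligned} \bigl (2\rho _I^{-1}(\rho _I\partial _{\rho _I}-\gamma )(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})-\Delta _{\mathbb {S}^2}\bigr )u = f. \end{aligned}$$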

Once the conormality of u is known, integrating the first vector field on the left gives a leading term \(\rho _I^\gamma \), which is decaying for \(\gamma >0\). (One can show that the background estimate (1.20) holds for \(a_I<\gamma \), but even an ineffective bound \(a_I\ll 0\) would be good enough, as the transport ODE argument automatically recovers the optimal bound).
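The effect of shifting the indicial root from 0 to \(\gamma \) can be seen on the model ODE \((\rho _I\partial _{\rho _I}-\gamma )w=h\) on \((0,1]\), with \(|h|\le C\rho _I^{b}\) and \(0<b<\gamma \). (As a simplification for this sketch, pointwise bounds replace the conormal spaces.) Since \(\rho _I\partial _{\rho _I}(\rho _I^{-\gamma }w)=\rho _I^{-\gamma }h\), integration from \(\rho _I\) to 1 gives

$$\begin{aligned} w(\rho _I)=\rho _I^{\gamma }w(1)-\rho _I^{\gamma }\int _{\rho _I}^1 s^{-\gamma }h(s)\,\frac{ds}{s},\quad \Bigl |\rho _I^{\gamma }\int _{\rho _I}^1 s^{-\gamma }h(s)\,\frac{ds}{s}\Bigr |\le \frac{C}{\gamma -b}\,\rho _I^{b}. \end{aligned}$$

Thus w is of size \(\rho _I^{b}\), and the homogeneous solution \(\rho _I^{\gamma }w(1)\) is decaying for \(\gamma >0\); no a priori decay of w at \(\rho _I=0\) was used, in line with the remark that an ineffective background bound suffices.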

Remark 1.5

Note that for small \(\gamma \), the normal operator of \(L_\gamma \) at \(I^+\) is close to the normal operator for \(\gamma =0\), hence one would like to conclude that mild decay \(\rho _+^{b_+}\), \(b_+<1\), at \(I^+\) still holds in this case. This is indeed true, but the argument has a technical twist: \(L_\gamma \) does not have smooth coefficients at \(\mathscr {I}^+\) as a differential operator (unlike L in Friedlander’s argument) due to the presence of derivatives which are not tangential to \(S^+\). However, we still have \(L_\gamma \in \text {Diff}_{\text {b}}^2(\overline{\mathbb {R}^4})\); we thus deduce asymptotics at \(I^+\) via normal operator analysis on the blown-down space \(\overline{\mathbb {R}^4}\), analogously to [13, 14]. See §5.2.

Remark 1.6

The improved decay at \(\mathscr {I}^+\) translates into higher b-regularity of u on the blown-down space \(\overline{\mathbb {R}^4}\), as we will show in Lemma 5.7; in the language of [13, Proposition 4.4], this corresponds to a shift of the threshold regularity at the radial set by \(\gamma \) coming from the skew-symmetric part of \(L_\gamma \).

1.1.2 Non-linearities and null structure

Equipped with this understanding of linear waves, we now discuss steps 2–4 of the abstract strategy of §1.1. In particular, we will show how the absence of a ‘null structure’ for a semilinear wave equation well-known to exhibit finite-time blow-up manifests itself from the global, Newton iteration scheme perspective; we will also discuss examples of equations that do satisfy a null condition, of the type arising when studying the linearization of the gauge-fixed Einstein equation.

To begin, recall that if u is conormal on M, then its derivatives along \(\partial _0:=\partial _t+\partial _r\) or size 1 spherical derivatives have faster decay by one order at \(\mathscr {I}^+\), whereas its ‘bad’ derivative along \(\partial _1:=\partial _t-\partial _r\) does not gain decay there; indeed, modulo vector fields with more decay at \(\mathscr {I}^+\), we calculate near \(I^0\cap \mathscr {I}^+\) using (1.17)

$$\begin{aligned} \partial _0 = -\tfrac{1}{2}\rho _0\rho _I\,\rho _I\partial _{\rho _I},\ \ \partial _1 = \rho _0(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}); \end{aligned}$$

note the extra factor of \(\rho _I\) in \(\partial _0\). All these derivatives gain an order of decay at \(I^0\), hence the structure of nonlinearities is relevant mainly at \(\mathscr {I}^+\); let us thus restrict the discussion to a neighborhood of \(I^0\cap \mathscr {I}^+\). (Similar considerations apply to a neighborhood of \(I^+\cap \mathscr {I}^+\)). Consider the nonlinear equation \(\Box _{\underline{g}{}}u-(\partial _1 u)^2=f\), or rather the closely related equation

$$\begin{aligned} P(u) = L u - \rho ^{-1}(\partial _1 u)^2 - f,\ \ f\in \mathcal {Y}^\infty \ \ \text{ small }, \end{aligned}$$
(1.25)

with L given by (1.13); this is well-known to violate the null condition introduced by Christodoulou [22] and Klainerman [64]. From our compactified perspective, the issue is the following. For \(u\in \mathcal {X}^\infty \), the linearization \(L_u=L-2\rho ^{-1}(\partial _1 u)\partial _1\) is, to leading order as a b-operator,

$$\begin{aligned} 2\rho _I^{-1}(\rho _I\partial _{\rho _I}-\partial _1 u)(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}), \end{aligned}$$

so the indicial root at \(\mathscr {I}^+\) is shifted from 0 to \(\partial _1 u|_{\mathscr {I}^+}\). Therefore, a step \(L_u v=-P(u)\) in the Newton iteration scheme (1.9) does not give \(v\in \mathcal {X}^\infty \). A Picard iteration, solving \(L_0 v=-P(u)\), would, due to the leading term of \(\rho ^{-1}(\partial _1 u)^2\) of size \(\rho _I^{-1}\), cause v to have a logarithmic leading term when integrating the analogue of (1.22). Neither iteration scheme closes, and this will remain true for any modification of the space \(\mathcal {X}^\infty \), e.g. if one allowed elements of \(\mathcal {X}^\infty \) to have leading terms involving higher powers of \(\log \rho _I\). In fact, solutions of global versions of this equation blow up in finite time [62].
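Both failures can be seen in a schematic one-variable model. As a sketch only, freeze all variables other than \(\rho _I\), drop the second transport factor \(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}\), and treat \(a:=\partial _1 u|_{\mathscr {I}^+}\) and the coefficient c of the \(\rho _I^{-1}\) forcing term as constants (these simplifications are assumptions made for illustration). The Newton and Picard steps then reduce, respectively, to

$$\begin{aligned} (\rho _I\partial _{\rho _I}-a)w=0 \ \Longrightarrow \ w=w_0\,\rho _I^{a}, \qquad \rho _I^{-1}\,\rho _I\partial _{\rho _I}w=c\,\rho _I^{-1} \ \Longrightarrow \ w=c\log \rho _I+w_0. \end{aligned}$$

In the first case the exponent of the leading term depends on u itself, rather than being the fixed exponent 0 encoded in \(\mathcal {X}^\infty \); in the second, a \(\log \rho _I\) term appears which \(\mathcal {X}^\infty \) does not allow.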

Assuming initial data to have sufficient decay, the nonlinear system \(L u_1^c=0\), \(L u_1-\rho ^{-1}(\partial _1 u_1^c)^2=0\) on the other hand can be solved easily if we design the function space \(\mathcal {X}^\infty \) in step 2 to encode a \(\rho _I^0\) leading term for \(u_1^c\) at \(\mathscr {I}^+\), as in (1.23), and two leading terms, of size \(\log \rho _I\) and \(\rho _I^0\), for \(u_1\). Extending this model slightly, let \(\gamma >0\), recall \(L_\gamma \) from (1.24), and consider the system for \(u=(u_0,u_1^c,u_1)\),

$$\begin{aligned} P(u) = \bigl ( L_\gamma u_0,\ L u_1^c-\rho ^{-1}(\partial _1 u_0)^2,\ L u_1-\rho ^{-1}(\partial _1 u_1^c)^2 \bigr ) = 0; \end{aligned}$$
(1.26)

which is a toy model for the nonlinear structure of the gauge-fixed Einstein equation with constraint damping, as we will argue in §1.2. Only working in \((\mathscr {I}^+)^\circ \), i.e. ignoring weights at \(I^0\) and \(I^+\) for brevity, the above discussions suggest taking \(b_I\in (0,\gamma )\) and working with the space

$$\begin{aligned} \mathcal {X}^\infty = \{ u=(u_0,u_1^c,u_1) :(u_0,u_1^c-u_1^{c(0)},u_1-u_1^{(1)}\log \rho _I-u_1^{(0)}) \in \rho _I^{b_I}H_{{\text {b}}}^\infty (M) \}, \end{aligned}$$
(1.27)

where \(u_1^{c(0)}\), \(u_1^{(1)}\), \(u_1^{(0)}\in \mathcal {C}^\infty ((\mathscr {I}^+)^\circ )\) are the leading terms. Then

$$\begin{aligned} P :\mathcal {X}^\infty \rightarrow \mathcal {Y}^\infty = \{ f=(f_0,f_1^c,f_1) :(f_0,f_1^c,f_1-\rho _I^{-1}f_1^{(0)})\in \rho _I^{-1+b_I}H_{{\text {b}}}^\infty \}, \end{aligned}$$

where \(f_1^{(0)}\in \mathcal {C}^\infty ((\mathscr {I}^+)^\circ )\). The linearization \(L_u\) of P around \(u\in \mathcal {X}^\infty \) then has as its model operator at \(\mathscr {I}^+\)

$$\begin{aligned} L_u^0 = 2\rho _I^{-1}(\rho _I\partial _{\rho _I}-A_u)(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}), \quad A_u=\begin{pmatrix} \gamma &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \\ 0 &{} \partial _1 u_1^{c(0)} &{} 0 \end{pmatrix}, \end{aligned}$$
(1.28)

which has a (lower triangular) Jordan block structure, with all blocks either having positive spectrum (the upper \(1\times 1\) entry) or being nilpotent (the lower \(2\times 2\) block). Thus, by integrating \(\rho _I\partial _{\rho _I}-A_u\), we conclude that for \(L_u v=-P(u)\), we have \(v\in \mathcal {X}^\infty \), thus closing the iteration scheme (1.9). A background estimate as well as its higher regularity version, which is the prerequisite for \(L_u^0\) being of any use, can be proved as before. Error terms arising from commutation with \(A_u\) have lower differential order and can thus be controlled inductively; that is, only the commutation properties of the principal part of \(L_u^0\) matter for this.
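To see concretely why this Jordan structure is compatible with (1.27), one can solve the homogeneous model ODE \((\rho _I\partial _{\rho _I}-A_u)w=0\) explicitly. The following is a sketch under the simplifying assumptions that \(a:=\partial _1 u_1^{c(0)}\) is treated as a constant and that all variables other than \(\rho _I\) are frozen. Write \(A_u=D+N\) with \(D={\text {diag}}(\gamma ,0,0)\) and N the strictly lower triangular part; then \(D N=N D=0\) and \(N^2=0\), so the exponential series for N terminates after the linear term:

$$\begin{aligned} \rho _I^{A_u} = \rho _I^{D}\,(1+N\log \rho _I) = \begin{pmatrix} \rho _I^{\gamma } & 0 & 0 \\ 0 & 1 & 0 \\ 0 & a\log \rho _I & 1 \end{pmatrix}, \qquad w=\rho _I^{A_u}w^{(0)} = \begin{pmatrix} \rho _I^{\gamma }\,w_0^{(0)} \\ w_1^{c(0)} \\ a\,w_1^{c(0)}\log \rho _I+w_1^{(0)} \end{pmatrix}. \end{aligned}$$

The first component decays (and lies in the remainder space when \(b_I<\gamma \)), the second has a size 1 leading term, and the third has exactly one logarithmic and one size 1 leading term; no higher powers of \(\log \rho _I\) are generated because \(N^2=0\).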

Remark 1.7

A tool for the study of the long time behavior of nonlinear wave equations on Minkowski space introduced by Hörmander [54] is the asymptotic system, see also [55, §6.5] and [78]: this arises by making an ansatz \(u\sim \epsilon r^{-1} U(t-r,\epsilon \log r,\omega )\) for the solution and evaluating the \(\epsilon ^2\) coefficient, which gives a PDE in \(1+1\) dimensions in the coordinates \(t-r\) and \(\ell :=\epsilon \log r\) which one expects to capture the behavior of the nonlinear equation near the light cone; if the classical null condition is satisfied, the PDE is linear, otherwise it is nonlinear. The weak null condition [78] is the assumption that solutions of the asymptotic system grow at most exponentially in \(\ell \), and for the Einstein vacuum equation in harmonic gauge, solutions are polynomial (in fact, linear) in \(\ell \). The latter finds its analogue in our framework in the nilpotent structure of the coupling matrix in (1.28). (However, quasilinear equations with variable long-range perturbations, see the discussion around (1.35), cannot be treated directly with our methods, corresponding to the difficulty in assigning a geometric meaning to the asymptotic system in such situations). For works which establish global existence of nonlinear equations even when the asymptotic system has merely exponentially bounded (in \(\ell \)) solutions, we refer to Lindblad [72, 73] and Alinhac [6].

1.1.3 Polyhomogeneity

Consider again equation (1.14) near \((I^0)^\circ \), now assuming that f is polyhomogeneous. For simplicity, let \(f=\rho _0^{i z}f_z+\widetilde{f}\), where \(f_z\in \mathcal {C}^\infty (\partial \overline{\mathbb {R}^4})\), \(z\in \mathbb {C}\), and \(\widetilde{f}\) decays faster than the leading term, so \(\widetilde{f}\in \rho _0^{b_0}H_{{\text {b}}}^\infty \) with \(b_0>-{\text {Im}}z\). A useful characterization of the polyhomogeneity of f is that the decay of f improves upon application of the vector field \(\rho _0 D_{\rho _0}-z\) in the notation (1.15). The solution u satisfies \(u\in \rho _0^{a_0}H_{{\text {b}}}^\infty \) for any \(a_0<-{\text {Im}}z\); but \(u':=(\rho _0 D_{\rho _0}-z)u\) solves

$$\begin{aligned} L u' = (\rho _0 D_{\rho _0}-z)f = (\rho _0 D_{\rho _0}-z)\widetilde{f} \in \rho _0^{b_0}H_{{\text {b}}}^\infty , \end{aligned}$$

so \(u'\in \rho _0^{b_0}H_{{\text {b}}}^\infty \). This is exactly the statement that u has the form \(u=\rho _0^{i z}u_z+\widetilde{u}\) for some \(u_z\in \mathcal {C}^\infty (\partial \overline{\mathbb {R}^4})\), \(\widetilde{u}\in \rho _0^{b_0}H_{{\text {b}}}^\infty \). If f has a full polyhomogeneous expansion, an iteration of this argument shows that u has one too, with the same index set.
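The mechanism can be checked by hand. As a sketch, assume the convention \(D_{\rho _0}=-i\partial _{\rho _0}\) for the notation (1.15), extend \(f_z\) to be independent of \(\rho _0\) in a collar neighborhood of the boundary, and argue with pointwise bounds rather than in the \(L^2\)-based spaces used in the text. Then

$$\begin{aligned} (\rho _0 D_{\rho _0}-z)(\rho _0^{i z}f_z) = \bigl (-i\cdot i z-z\bigr )\rho _0^{i z}f_z = 0, \qquad |\rho _0^{i z}|=\rho _0^{-{\text {Im}}z}, \end{aligned}$$

so the vector field removes the leading term, whose size \(\rho _0^{-{\text {Im}}z}\) explains the condition \(b_0>-{\text {Im}}z\) on the remainder. Conversely, integrating \((\rho _0 D_{\rho _0}-z)u=u'\) gives

$$\begin{aligned} u = \rho _0^{i z}\Bigl (u_z + i\int _0^{\rho _0} s^{-i z}u'(s)\,\frac{d s}{s}\Bigr ), \end{aligned}$$

where the integrand is bounded by a multiple of \(s^{{\text {Im}}z+b_0}\) with \({\text {Im}}z+b_0>0\); the integral thus converges and contributes a term of size \(\rho _0^{b_0}\), which is the remainder \(\widetilde{u}\).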

Near the corner \(I^0\cap \mathscr {I}^+\) then, one can proceed iteratively as well, picking up the terms of the expansion at \(\mathscr {I}^+\) one by one, by analyzing the solution of the product of transport equations in equation (1.22) when the right hand side has a partial polyhomogeneous expansion at \(\mathscr {I}^+\): the point is that \(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}\) transports expansions from \(I^0\) to \(\mathscr {I}^+\), ultimately since it annihilates \(\rho _0^{i z}\rho _I^{i z}\). See Lemmas 7.5–7.7.
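The annihilation property is a one-line computation, recorded here as a worked check:

$$\begin{aligned} (\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})(\rho _0^{i z}\rho _I^{i z}) = (i z-i z)\,\rho _0^{i z}\rho _I^{i z} = 0, \qquad (\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})F(\rho _0\rho _I) = (\rho _0\rho _I-\rho _I\rho _0)F'(\rho _0\rho _I) = 0. \end{aligned}$$

Thus any function of the product \(\rho _0\rho _I\) is constant along the integral curves of this vector field, and a term \(\rho _0^{i z}\) at \(I^0\) is carried to a term with the same exponent, \(\rho _I^{i z}\), at \(\mathscr {I}^+\).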

To obtain the expansion at \(I^+\), we argue iteratively again, using the resonance expansion obtained via normal operator analysis as in the proof of [58, Theorem 2.21]. One needs to invert the normal operator family of L on spaces of functions which are polyhomogeneous at the boundary \(\partial I^+\), which is easily accomplished by solving away polyhomogeneous terms formally and using the usual inverse, defined on spaces of smooth functions, to solve away the remainder; see Lemma 7.8.

1.2 Analysis of Einstein’s equation

For Einstein’s equation, the strategy outlined in §1.1 needs to be supplemented by a preliminary step, the choice of the nonlinear operator P, which in particular means choosing a gauge, i.e. a condition on the solution g of \(\text {Ric}(g)=0\) which breaks the diffeomorphism invariance; by the latter we mean the fact that for any diffeomorphism \(\phi \) of M, \(\phi ^*g\) also solves \(\text {Ric}(\phi ^*g)=0\). Following DeTurck [36], the presentation by Graham–Lee [47], and [60], we consider the gauge-fixed Einstein equation

$$\begin{aligned} P_0(g) = \text {Ric}(g) - \widetilde{\delta }{}^*\Upsilon (g) = 0, \end{aligned}$$
(1.29)

where \(\widetilde{\delta }{}^*\) is a first order differential operator with the same principal symbol (which is independent of g) as the symmetric gradient \((\delta _g^*u)_{\mu \nu }=\tfrac{1}{2}(u_{\mu ;\nu }+u_{\nu ;\mu })\); we comment on the choice of \(\widetilde{\delta }{}^*\) below. Further, the gauge 1-form is

$$\begin{aligned} \Upsilon (g;g_m)_\mu := (g g_m^{-1}\delta _g G_g g_m)_\mu = g_{\mu \nu }g^{\kappa \lambda }(\Gamma (g)_{\kappa \lambda }^\nu - \Gamma (g_m)_{\kappa \lambda }^\nu ), \end{aligned}$$
(1.30)

where \(\delta _g\) is the adjoint of \(\delta _g^*\), i.e. the (negative) divergence, \(G_g=1-\tfrac{1}{2}g{\text {tr}}_g\) is the trace reversal operator, and \(g_m\) is a fixed background metric; we write \(\Upsilon (g)\equiv \Upsilon (g;g_m)\) from now on. This is a manifestly coordinate invariant generalization of the wave coordinate gauge, where one would choose \(g_m=\underline{g}{}\) to be the Minkowski metric on \(\mathbb {R}^4\) and demand that a global coordinate system \((x^\mu ):(M^\circ ,g)\rightarrow (\mathbb {R}^4,\underline{g}{})\) be a wave map. (Friedrich describes \(\Upsilon (g)=0\) and more general gauge conditions using gauge source functions, see in particular [41, Equation (3.23)]).
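The relation to wave coordinates can be made explicit by applying the scalar wave operator to the coordinate functions. The following sketch assumes the sign convention \(\Box _g=g^{\kappa \lambda }\nabla _\kappa \nabla _\lambda \) (the opposite convention only changes an overall sign) and takes \(g_m=\underline{g}{}\) in standard coordinates, so that \(\Gamma (g_m)=0\):

$$\begin{aligned} \Box _g x^\nu = g^{\kappa \lambda }\bigl (\partial _\kappa \partial _\lambda x^\nu - \Gamma (g)_{\kappa \lambda }^\sigma \partial _\sigma x^\nu \bigr ) = -g^{\kappa \lambda }\Gamma (g)_{\kappa \lambda }^\nu , \qquad \Upsilon (g;\underline{g}{})_\mu = -g_{\mu \nu }\,\Box _g x^\nu . \end{aligned}$$

Hence \(\Upsilon (g;\underline{g}{})=0\) holds if and only if each coordinate function \(x^\nu \) solves the wave equation with respect to g.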

Two fundamental properties of \(P_0(g)\) are: (1) \(P_0(g)\) is a quasilinear wave equation, hence has a well-posed initial value problem; (2) by the second Bianchi identity—the fact that the Einstein tensor \(\text {Ein}(g):=G_g\text {Ric}(g)\) is divergence-free—the equation \(P_0(g)=0\) implies the wave equation

$$\begin{aligned} \delta _g G_g\widetilde{\delta }{}^*\Upsilon (g)=0 \end{aligned}$$
(1.31)

for \(\Upsilon (g)\), which thus vanishes identically provided its Cauchy data are trivial; we call \(\delta _g G_g\widetilde{\delta }{}^*\) the gauge propagation operator. Therefore, solving (1.29) with Cauchy data for which \(\Upsilon (g)\) has trivial Cauchy data is equivalent to solving Einstein’s equation (1.1) in the gauge \(\Upsilon (g)=0\).
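For the reader's convenience, the derivation of (1.31) from (1.29) is the following short computation, which uses only the definitions above and the second Bianchi identity \(\delta _g\text {Ein}(g)=0\):

$$\begin{aligned} \delta _g G_g P_0(g) = \delta _g\text {Ein}(g) - \delta _g G_g\widetilde{\delta }{}^*\Upsilon (g) = -\delta _g G_g\widetilde{\delta }{}^*\Upsilon (g). \end{aligned}$$

If \(P_0(g)=0\), the left hand side vanishes and (1.31) follows; if \(P_0(g)\) is merely small, the same identity shows that \(\Upsilon (g)\) solves the gauge propagation equation with a small forcing term \(-\delta _g G_g P_0(g)\).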

Since we wish to solve the initial value problem (1.4), we need to choose the Cauchy data for g, i.e. the restrictions \(g_0\) and \(g_1\) of g and its transversal derivative to the initial surface \(\Sigma ^\circ \) as a Lorentzian metric on \(M^\circ \) such that \(\gamma \) is the pullback of \(g_0\) to \(\Sigma ^\circ \) and k is the second fundamental form of any metric with Cauchy data \((g_0,g_1)\); note that k only depends on up to first derivatives of the ambient metric, hence can indeed be expressed purely in terms of \((g_0,g_1)\). These conditions do not determine \(g_0,g_1\) completely, and one can arrange in addition that \(\Upsilon (g)\) vanishes at \(\Sigma ^\circ \) as a 1-form on M. Provided then that \(P_0(g)=0\), with these Cauchy data for g, holds near \(\Sigma ^\circ \), the constraint equations at \(\Sigma ^\circ \) can be shown to imply that also the transversal derivative of \(\Upsilon (g)\) vanishes at \(\Sigma ^\circ \) (see the proof of Theorem 6.3), and then the argument involving (1.31) applies.

If the initial data in Theorem 1.1 are exactly Schwarzschildean for \(r\ge R\gg 1\), the solution g is equal (i.e. isometric) to the Schwarzschild metric in the domain of dependence of the region \(r\ge R\); more generally, for initial data which are equal to those of mass m Schwarzschild modulo decaying corrections, we expect all outgoing null-geodesics to be bent in approximately the same way as for the metric \(g_m^S\). Thus, we should define the manifold M in step 1 so that \(\mathscr {I}^+\) is null infinity of the Schwarzschild spacetime. Now, along radial null-geodesics of \(g_m^S\), the difference \(t-r_*\) is constant, where

$$\begin{aligned} r_*=r+2 m\log (r-2 m) \end{aligned}$$
(1.32)

is the tortoise coordinate up to an additive constant, see [109, Equation (6.4.20)]. Correspondingly, we define the compactification \({}^m\overline{\mathbb {R}^4}\) near \(t\sim r_*\) such that \(\rho =r^{-1}\) is a boundary defining function, and \({}^m v:=(t-r_*)/r\) is smooth up to the boundary; \({}^mM\) is defined by blowing this up at \(S^+=\{\rho =0,\,{}^m v=0\}\). (This is smoothly extended away from \(t\sim r_*\) to a compactification of all of \(\mathbb {R}^4\)). Thus, \({}^m\overline{\mathbb {R}^4}\) and the Minkowski compactification \(\overline{\mathbb {R}^4}={}^0\overline{\mathbb {R}^4}\) are canonically identified by continuity from \(\mathbb {R}^4\), but have slightly different smooth structures; see §2.3 and [14, §7]. The interior of the front face \(\mathscr {I}^+\) of the blow-up is diffeomorphic to \(\mathbb {R}_s\times \mathbb {S}^2\), where \(s:={}^m v/\rho =t-r_*\) is an affine coordinate along the fibers of the blow-up. We denote defining functions of \(I^0\) (the closure of \(\{\rho =0,\,{}^m v<0\}\) in \({}^mM\)), \(\mathscr {I}^+\), and \(I^+\) (the closure of \(\{\rho =0,\,{}^m v>0\}\) in \({}^mM\)) by \(\rho _0\), \(\rho _I\), and \(\rho _+\), respectively.
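Both the constancy of \(t-r_*\) along radial null-geodesics and the difference between the two smooth structures can be verified directly. As a sketch, assume that \(g_m^S\) has the standard static form \(-(1-\tfrac{2 m}{r})\,d t^2+(1-\tfrac{2 m}{r})^{-1}d r^2+r^2\,d\omega ^2\), with \(d\omega ^2\) denoting the round metric on \(\mathbb {S}^2\) (the signature convention is an assumption and does not affect the computation). From (1.32),

$$\begin{aligned} \frac{d r_*}{d r} = 1+\frac{2 m}{r-2 m} = \Bigl (1-\frac{2 m}{r}\Bigr )^{-1}, \qquad -\Bigl (1-\frac{2 m}{r}\Bigr )d t^2+\Bigl (1-\frac{2 m}{r}\Bigr )^{-1}d r^2 = \Bigl (1-\frac{2 m}{r}\Bigr )\bigl (-d t^2+d r_*^2\bigr ), \end{aligned}$$

so radial curves along which \(t-r_*\) is constant are null. Moreover, with \(\rho =r^{-1}\),

$$\begin{aligned} {}^0 v-{}^m v = \frac{r_*-r}{r} = 2 m\rho \log (r-2 m) = -2 m\rho \log \rho +2 m\rho \log (1-2 m\rho ), \end{aligned}$$

which tends to 0 as \(\rho \rightarrow 0\) but, for \(m\ne 0\), is not a smooth function of \(\rho \) because of the term \(\rho \log \rho \); this is consistent with the two compactifications being identified by continuity while carrying different smooth structures.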

It is then natural to fix the background metric \(g_m\) to be equal to \(g_m^S\) near \(I^0\cup \mathscr {I}^+\) and smoothly interpolate with the Minkowski metric near \(r=0\) (which is nonsingular there, unlike \(g_m^S\)). We then work with the gauge \(\Upsilon (g;g_m)=0\), and seek the solution of

$$\begin{aligned} P(h) := \rho ^{-3} P_0(g) = 0,\ \ g = g_m + \rho h, \end{aligned}$$
(1.33)

with h to be determined; the factors \(\rho \) are introduced in analogy with the discussion of the scalar wave equation (1.13). Here, \(\rho \) is a global boundary defining function of \({}^m\overline{\mathbb {R}^4}\); one can e.g. take \(\rho =r^{-1}\) away from the axis \(r/t=0\), and \(\rho =t^{-1}\) near \(r/t=0\). Now, due to the quasilinear character of (1.29), the principal part of \(L_h:=D_h P\) depends on h: it is given by \(\tfrac{1}{2}\Box _g\). Thus, one needs to ensure that throughout the iteration scheme (1.9), the null-geometry of g is compatible with \({}^mM\), in the sense that the long range term of g determining the bending of light rays remains unchanged. To see what this means concretely, consider a metric perturbation h in (1.33) which is not growing too fast at \(\mathscr {I}^+\), say \(|h|\lesssim \rho _I^{-\epsilon }\) for \(\epsilon <1/2\) (that is, \(|h|\lesssim r^\epsilon \) when \(t-r_*\) remains in a bounded interval); one can then check that, modulo terms with faster decay at \(\mathscr {I}^+\),

$$\begin{aligned} \Box _g = 2\rho _I^{-1}\bigl (\rho _I\partial _{\rho _I}+2\rho _0 h_{0 0}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})\bigr )(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})\ \ \text{ near }\ I^0\cap \mathscr {I}^+, \end{aligned}$$
(1.34)

which identifies

$$\begin{aligned} h_{0 0}=h(\partial _0,\partial _0),\ \ \partial _0=\partial _t+\partial _{r_*}, \end{aligned}$$
(1.35)

as the (only) long range component of h; see the calculation (3.15). Indeed, the first vector field in (1.34) is approximately tangent to outgoing null cones, so for \(h_{0 0}\ne 0\) at \(\mathscr {I}^+\), outgoing null cones do not tend to \((\mathscr {I}^+)^\circ \). (Rather, if \(h_{0 0}>0\), say, they are less strongly bent, like in a Schwarzschild spacetime with mass smaller than m). Whether or not \(h_{0 0}\) vanishes at \(\mathscr {I}^+\) depends on the choice of gauge. A calculation, see (A.5), shows that the gauge condition \(\Upsilon (g)=0\) implies the constancy of \(h_{0 0}\) along \(\mathscr {I}^+\); but since \(h_{0 0}\) is initially 0 due to \(g_m\) already capturing the long range part of the initial data, this means that \(h_{0 0}|_{\mathscr {I}^+}=0\) indeed, provided that \(P(h)=0\) with Cauchy data satisfying the gauge condition, as we otherwise cannot conclude the vanishing of \(\Upsilon (g)\). We remark that \(\Upsilon (g)=0\) implies the vanishing of further components of h, namely \(r^{-1}h_{0 a}\equiv h(\partial _0,r^{-1}\partial _{\theta ^a})\) and , \(h_{a b}:=h(\partial _{\theta ^a},\partial _{\theta ^b})\), which we collectively denote by \(h_0\); see (3.4) and (3.11), where the notation \(h_0=:\pi _0 h\) is introduced.

As we are solving approximate (linearized) equations at each step of our Newton-type iteration scheme in step 4, we thus need an extra mechanism to ensure that \(\Upsilon (g)\), \(g=g_m+\rho h\), is decaying sufficiently fast at \(\mathscr {I}^+\) to guarantee the vanishing of \(h_{0 0}\) at \(\mathscr {I}^+\). This is where constraint damping comes into play. Roughly speaking, if one only has an approximate solution of \(P_0(g)\approx 0\), then we still get \(\delta _g G_g\widetilde{\delta }{}^*\Upsilon (g)\approx 0\); if one chooses \(\widetilde{\delta }{}^*\) carefully, solutions of this can be made to decay at \(\mathscr {I}^+\) sufficiently fast so as to imply the vanishing of \(h_{0 0}\). We shall show that the choice

$$\begin{aligned} \widetilde{\delta }{}^* u = \delta _{g_m}^*u + 2\gamma \tfrac{d t}{t}\otimes _s u - \gamma (\iota _{t^{-1}\nabla t}u)g_m,\ \ \gamma >0, \end{aligned}$$

accomplishes this. As a first indication, one can check that \(2\delta _{g_m}G_{g_m}\widetilde{\delta }{}^*\) has a structure similar to (1.24) with \(\gamma >0\), for which we had shown the improved decay at \(\mathscr {I}^+\).

Regarding steps 2 and 3 of our general strategy, the correct function spaces can now be determined easily (after some tedious algebra): solving \(L_0 u=0\), where \(L_h=D_h P\) as usual, one finds that \(u_0=\pi _0 u\), so in particular the long range component \(u_{0 0}\) of u decays at \(\mathscr {I}^+\), while the remaining components, denoted \(u_0^c\), have a size 1 leading term at \(\mathscr {I}^+\), just like solutions of the linear scalar wave equation. This follows from the schematic structure

$$\begin{aligned} \rho _I^{-1}\biggl (\rho _I\partial _{\rho _I}-\begin{pmatrix} \gamma &{} 0 \\ 0 &{} 0 \end{pmatrix}\biggr )(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}) \begin{pmatrix} u_0 \\ u_0^c \end{pmatrix} \end{aligned}$$

of the model operator at \(\mathscr {I}^+\) in this case. However, for such u then, solutions of \(L_u u'=-P(u)\) have slightly more complicated behavior. Indeed, the model operator at \(\mathscr {I}^+\) has a schematic structure similar to (1.28), acting on \((u'_0,(u')_{1 1}^c,u'_{1 1})\), where we separate the components of \((u')_0^c\) into two sets, one of which consists of the single component

$$\begin{aligned} u'_{1 1}=u(\partial _1,\partial _1),\ \ \partial _1=\partial _t-\partial _{r_*}, \end{aligned}$$
(1.36)

while \((u')_{1 1}^c\) captures the remaining components, which are \(u_{0 1}\), \(r^{-1} u_{1 b}\), and the part of the spherical part of u which is trace-free with respect to . Correspondingly, we need to allow \(u'_{1 1}\) to have a logarithmic leading order term, just like the component called \(u_1\) in the definition of the function space (1.27). In the next iteration step, \(L_{u'}u''=-P(u')\), no further adjustments are necessary: the structure of the model operator at \(\mathscr {I}^+\) is unchanged, hence the asymptotic behavior of \(u''\) does not get more complicated. We remark that due to our precise control over each iterate, encoded by membership in \(\mathcal {X}^\infty \), the relevant structure of the model operators and the regularity of the coefficients of the linearized equations are the same at each iteration step; in particular, the fact that equation (1.33) is quasilinear rather than semilinear does not cause any complications beyond the need for constraint damping.

The decoupling of the model operator at \(\mathscr {I}^+\) into three pieces (one for the decaying components \(u_0\), one for the components \(u_{1 1}^c\) which have possibly nontrivial leading terms at \(\mathscr {I}^+\), and one for the logarithmically growing component \(u_{1 1}\)) is the key structure making our proof of global stability work. The fact that the equation for the components \(u_0\) decouples is not coincidental, as they are governed by the gauge condition and thus are expected to decouple to leading order in view of the second Bianchi identity as around (1.31). The decoupling of \(u_{1 1}\) and \(u_{1 1}^c\) on the other hand is the much more subtle manifestation of the weak null condition, as discussed in Remark 1.7.

The solution h of (1.33) is a symmetric 2-tensor in \(M^\circ \); as part of step 1, we still need to specify the smooth vector bundle on M which h will be a section of. Consider first the Minkowski metric \(\underline{g}{}\) on the radial compactification \({}^0\overline{\mathbb {R}^4}\). In \(\mathbb {R}^4\), \(\underline{g}{}\) is a quadratic form, with constant coefficients, in the 1-forms dt and \(d x^i\), which extend smoothly to the boundary as sections of the scattering cotangent bundle \({}^{{\text {sc}}}T^*\,{}^0\overline{\mathbb {R}^4}\) first introduced in [86]; in a collar neighborhood \([0,1)_\rho \times \mathbb {R}^3_X\) of a point in \(\partial {}^0\overline{\mathbb {R}^4}\), the latter is by definition spanned by the 1-forms \(\frac{d\rho }{\rho ^2}\), \(\frac{d X^i}{\rho }\), which are smooth and linearly independent sections of \({}^{{\text {sc}}}T^*\,{}^0\overline{\mathbb {R}^4}\) down to the boundary. For instance, near \(r=0\), we can take \(\rho =t^{-1}\) and \(X=x/t\), in which case \(\frac{d\rho }{\rho ^2}=-d t\) and \(\frac{d X^i}{\rho }=d x^i-X^i\,d t\). Similarly then, \(g_m\) will be a smooth section of the second symmetric tensor power \(S^2\,{}^{{\text {sc}}}T^*\,{}^m\overline{\mathbb {R}^4}\). Since our nonlinear analysis takes place on the blown-up space \({}^mM\), we seek h as a section of the pullback bundle \(\beta ^*S^2\,{}^{{\text {sc}}}T^*\,{}^m\overline{\mathbb {R}^4}\), where \(\beta :{}^mM\rightarrow {}^m\overline{\mathbb {R}^4}\) is the blow-down map. For brevity, we shall suppress the bundle from the notation here.

Theorem 1.8

Suppose the assumptions of Theorem 1.1 are satisfied, i.e. for some small \(m\in \mathbb {R}\) and \(b_0>0\) fixed, the normalized data \(\rho _0^{-1}\widetilde{\gamma }\) and \(\rho _0^{-2}k\in \rho _0^{b_0}H_{{\text {b}}}^\infty (\overline{\mathbb {R}^3})\) are small in \(\rho _0^{b_0}H_{{\text {b}}}^{N+1}\) and \(\rho _0^{b_0}H_{{\text {b}}}^N\), respectively. Then there exists a solution g of the initial value problem (1.4) satisfying the gauge condition \(\Upsilon (g)=0\), see (1.30), which on \({}^mM\) is of the form \(g=g_m+\rho h\), \(h\in \rho _0^{b_0}\rho _I^{-\epsilon }\rho _+^{-\epsilon }H_{{\text {b}}}^\infty ({}^mM)\) for all \(\epsilon >0\); here \(\rho \) is a boundary defining function of \({}^m\overline{\mathbb {R}^4}\), and \(\rho _0\), \(\rho _I\), and \(\rho _+\) are defining functions of \(I^0\), \(\mathscr {I}^+\), and \(I^+\), respectively.

More precisely, near \(\mathscr {I}^+\) and using the notation introduced after (1.35) and (1.36), the components \(h_{0 0}\), \(r^{-1}h_{0 b}\), and lie in

$$\begin{aligned} \rho _0^{b_0}\rho _I^{b_I}\rho _+^{-\epsilon }H_{{\text {b}}}^\infty ({}^mM) \end{aligned}$$
(1.37)

for all \(b_I<\min (1,b_0)\) and \(\epsilon >0\), while \(h_{0 1}\), \(r^{-1}h_{1 b}\), and have size 1 leading terms at \(\mathscr {I}^+\) plus a remainder in the space (1.37) for all such \(b_I,\epsilon \), and \(h_{1 1}\) has a logarithmic and a size 1 leading term at \(\mathscr {I}^+\) plus a remainder in the space (1.37) for all such \(b_I,\epsilon \). At \(I^+\) on the other hand, h has a size 1 leading term: there exists \(h^+\in \rho _I^{-\epsilon }H_{{\text {b}}}^\infty (I^+)\) such that \(h-h^+\in \rho _I^{-\epsilon }\rho _+^{b_+}H_{{\text {b}}}^\infty ({}^mM)\) near \(I^+\) for any \(b_+<\min (b_0,1)\).

Remark 1.9

Near \({}^m\mathscr {I}^+\), and indeed for \(r\gg 1\) and \(t-r_*\le \tfrac{1}{2}r\), the membership \(u\in \rho _0^{b_0}\rho _I^{b_I}\rho _+^{b_+}H_{{\text {b}}}^\infty ({}^mM)\) (e.g. u being a metric coefficient of h, and \(b_+=-\epsilon \) as in (1.37)) is equivalent, up to arbitrarily small losses in decay (due to switching from \(L^2\) to \(L^\infty \) via Sobolev embedding), to

$$\begin{aligned} |V_1\cdots V_N u| \lesssim r^{-b_I}(1+(r_*-t)_+)^{-b_0+b_I} (1+(t-r_*)_+)^{-b_++b_I} \end{aligned}$$

for all \(N\in \mathbb {N}_0\), where each \(V_j\) is a rotation vector field in \(\mathbb {R}^3\) or one of the vector fields \(t\partial _t+r_*\partial _{r_*}\), \(t\partial _{r_*}+r_*\partial _t\), \(\partial _t\), \(\partial _x\).

See Theorem 6.3 for the full statement, which in particular allows for the decay rate \(b_0\) of the initial data to be larger and gives the corresponding weight at \(I^0\) for the solution. The final conclusion follows from resonance considerations, as indicated before (1.23), and will follow from the arguments used to establish polyhomogeneity in §7. We discuss continuous dependence on initial data in Remark 6.4. A typical example of a polyhomogeneous expansion of h arises for initial data which are smooth functions of 1/r in \(r\gg 1\): in this case, the leading terms of the expansion of h are schematically (and not showing the coefficients, which are functions on \(\mathscr {I}^+\))

$$\begin{aligned} h_0\sim \rho _I\log ^{\le 3}\rho _I,\ \ h_{1 1}^c\sim 1+\rho _I\log ^{\le 4}\rho _I,\ \ h_{1 1}\sim \log ^{\le 1}\rho _I+\rho _I\log ^{\le 6}\rho _I \end{aligned}$$
(1.38)

at \(\mathscr {I}^+\), and \(h\sim 1+\rho _+\log ^{\le 8}\rho _+\) at \(I^+\); see Example 7.3. Here, \(\log ^{\le k}\rho _I\) stands for functions which are sums of products \(|\log \rho _I|^\ell a_\ell \), \(0\le \ell \le k\), with \(a_\ell \) functions on \(\mathscr {I}^+\).

While a solution g of \(\text {Ric}(g)=0\) in the gauge \(\Upsilon (g)=0\) of course solves equation (1.29) for any choice of \(\widetilde{\delta }{}^*\), we argued why a careful choice is crucial to make our global iteration scheme work. Another perspective is the following: implementing constraint damping allows us to solve the gauge-fixed equation (1.29) for any sufficiently small Cauchy data; whether or not these data come from an initial data set satisfying the constraint equations is irrelevant. Only at the end, once one has a solution of (1.29), do we use the constraint equations and the second Bianchi identity to deduce \(\Upsilon (g)=0\).

In contrast, consider the choice \(\widetilde{\delta }{}^*=\delta _g^*\) in (1.29); the linearization of \(P_0(g)\) around the Minkowski metric \(g=\underline{g}{}\) is then equal to \(\tfrac{1}{2}\Box _{\underline{g}{}}\), which is \(\tfrac{1}{2}\) times the scalar wave operator acting component-wise on the components of a symmetric 2-tensor in the frame \(d x^\mu \otimes d x^\nu +d x^\nu \otimes d x^\mu \), where \(x^0=t\), \(x^i\), \(i=1,2,3\), are the standard coordinates on \(\mathbb {R}^{1+3}_{t,x}\). Solving \(\Box _{\underline{g}{}}(\rho h)=0\) with given initial data, which would be the first step in our iteration scheme for initial data with mass \(m=0\), does not imply improved behavior for any components of h, in particular \(h_{0 0}\); this means that constraint damping fails for this choice of \(\widetilde{\delta }{}^*\). Thus, the next iterate \(\underline{g}{}+\rho h\) in general has a different long range behavior, and correspondingly \({}^0M\) is no longer the correct place for the analysis of the linearized operator in the next iteration step—even though the final solution of Einstein’s equation is well-behaved on \({}^0M\) for such initial data. With constraint damping on the other hand, the linearized equation always produces behavior consistent with the qualitative properties of the nonlinear solution.

1.3 Bondi mass loss formula

The description of the asymptotic behavior of the metric \(g=g_m+\rho h\) in Theorem 1.8 on the compact manifold \({}^mM\) and in the chosen gauge allows for a precise description of outgoing light cones close to the radiation face \(\mathscr {I}^+\). Work on geometric quantities at \(\mathscr {I}^+\) started with the seminal works of Bondi–van der Burg–Metzner [10, 12], Sachs [96, 97], Newman–Penrose [92], and Penrose [94]; the precise decay properties of the curvature tensor—in particular ‘peeling estimates’ or their failure—were discussed in [24, 67], see also [35]. (For studies on conditions on initial data which ensure or prevent smoothness of the metric at \(\mathscr {I}^+\) in suitable coordinates, see [1, 30, 40, 41, 108] and [66, §8.2]).

As remarked before, the logarithmic bending of light cones is controlled by the ADM mass m, which measures mass on spacelike, asymptotically flat, Cauchy surfaces. A more subtle notion is the Bondi mass [12], see also [23], which is a function of retarded time \(x^1=t-r_*\) that can be defined as follows: let \(S(u)\subset \mathscr {I}^+\) denote the u-level set of \(x^1\) at null infinity; S(u) is a 2-sphere, and naturally comes equipped with the round metric. If \(C_u\) denotes the outgoing light cone which limits to S(u) at null infinity and which asymptotically approaches the radial Schwarzschild light cone \(\{x^1=u\}\), one can define a natural area radius \(\mathring{r}\) on \(C_u\), equal to the coordinate r plus lower order correction terms; the Bondi mass \(M_{\mathrm{B}}(u)\) is then the limit of the Hawking mass of the 2-sphere \(\{x^1=u,\ \mathring{r}=R\}\) as \(R\rightarrow \infty \). See §8 for the precise definitions. A change \(\frac{d}{d u}M_{\mathrm{B}}(u)\) of the Bondi mass reflects a flux of gravitational energy to \(\mathscr {I}^+\) along \(C_u\). We shall calculate these quantities explicitly and show:

Theorem 1.10

Suppose we are given a metric constructed in Theorem 1.8, and write \(h_{1 1}=h_{1 1}^{(1)}\log (r)+\mathcal {O}(1)\) near \(\mathscr {I}^+\), where \(h_{1 1}^{(1)}\in \rho _0^{b_0}\rho _+^{-\epsilon }H_{{\text {b}}}^\infty (\mathscr {I}^+)\) is the logarithmic leading term. Then the Bondi mass is equal to

(1.39)

The Bondi mass loss formula takes the form \(\frac{d}{d u}M_{\mathrm{B}}(u)=-E(u)\), where

is the outgoing energy flux. Finally, \(M_{\mathrm{B}}(-\infty )=m\) and \(M_{\mathrm{B}}(+\infty )=0\).

We prove this for all initial data which are small and asymptotically flat in the sense of (1.6). The Bondi mass was shown to be well-defined (and to satisfy a mass loss formula) for the weakly decaying initial data used in [16] by Bieri–Chruściel [8] in the geometric framework of [27], but the question of how to define Bondi–Sachs coordinates remained open. Our result is the first to accomplish this for a large class of initial data, and to identify the Bondi mass in a (generalized) wave coordinate gauge setting. (The \(\mathcal {C}^{1,\min (b_0,1)-0}\) regularity of a conformally rescaled non-degenerate metric down to \(\mathscr {I}^+\) is a by-product of our analysis). The key to establishing the first part of Theorem 1.10 is the construction and precise control of the aforementioned geometric quantities leading to the identification (1.39); the mass loss formula itself is then equivalent to the vanishing of the leading term of the (1, 1) component of the gauge-fixed Einstein equation at \(\mathscr {I}^+\). The convergence \(M_{\mathrm{B}}(u)\rightarrow m\) as \(u\rightarrow -\infty \) follows immediately from the decay properties of h there. On the other hand, the proof that the total radiated energy \(\int E(u)\,d u\) equals the initial mass m proceeds by studying the leading order term \(h|_{I^+}\) as the solution of a linear equation on \(I^+\) (obtained by restricting the nonlinear gauge-fixed Einstein equation to \(I^+\)), with a forcing term that comes from the failure of our glued background metric \(g_m\) to satisfy the Einstein equation and which is thus proportional to m. This equation now is closely related to the spectral family of exact hyperbolic space at the bottom of the essential spectrum; a calculation of the scattering matrix acting on the incoming data given by \(h_{1 1}^{(1)}\) and comparing the (0, 0) component of the outgoing data with \(h_{0 0}\) (which vanishes by construction!) then establishes the desired relationship.

Theorem 1.10 shows that the logarithmic term in the asymptotic expansion of \(h_{1 1}\) carries physical meaning. Its vanishing forces \(m=0\), which by the positive mass theorem means that the spacetime is exact Minkowski space. (The observation that \(\int E(u)\,d u\ge 0\) immediately implies the nonnegativity of the ADM mass of the small initial data under consideration here, which in this case was first proved by Choquet-Bruhat–Marsden [20]).

Further geometric properties of the vacuum metrics constructed in this paper, such as the identification of \((\mathscr {I}^+)^\circ \subset M\), resp. \((I^+)^\circ \), as the set of endpoints of future-directed null, resp. timelike, geodesics, will be discussed elsewhere.

1.4 Outline of the paper

In §§2 and 3, we set the stage for the analysis (steps 1 and 2): we give the precise definition of the compactification \(M={}^mM\) on which we will find the solution of (1.4) in §2.1; the relevant function spaces are defined in §2.2, and the relationships between different compactifications are discussed in §2.3. In §2.4, we prepare the invariant formulation of estimates such as (1.19); the results there are not needed until §4. In §3.1, we define the spaces \(\mathcal {X}^\infty \) and \(\mathcal {Y}^\infty \) on M in which we shall find the solution h in Theorem 1.8, and calculate the mapping properties and model operators of the (linearized) gauge-fixed Einstein operator in §§3.2 and 3.3, respectively. (The necessary algebra is moved to Appendix A). The key structures (constraint damping, null structure) critical for our proof will be discussed there as well. We accomplish part 3.1 of step 3 (the proof of a high regularity background estimate with imprecise weights) by exploiting these structures in §4. The recovery of the precise asymptotic behavior in §5 finishes step 3.2. Putting this into a Nash–Moser framework allows us to finish the proof of Theorem 1.8 in §6; polyhomogeneity, and thus the last part of Theorem 1.1, is proved in §7. Finally, a finer description of the resulting asymptotically flat spacetime near null infinity, leading to the proof of Theorem 1.10, is given in §8.

For the reader only interested in the key parts of the proof, we recommend reading §§2.1 and 2.2 for the setup, §3.1 for the form of metric perturbations we need to consider, and §3.2 for an explanation of the main features of the linearized problem; taking the background estimate, Theorem 4.2 (which uses material from §2.4, and whose proof roughly follows the steps outlined in §1.1.1), as a black box, the argument formally concludes in §5. (Getting the actual nonlinear solution in §6 is then routine).

2 Compactification

As explained in §1.2, we shall find the metric g in Theorem 1.8 as a perturbation of a background metric \(g_m\) which interpolates between mass m Schwarzschild in a neighborhood \(\{r\gg 1,\ |t|<2 r\}\) of \(I^0\cup \mathscr {I}\) and the Minkowski metric elsewhere. In §2.1, we define such a metric \(g_m\) as a smooth scattering metric on a suitable partial compactification \({}^m\overline{\mathbb {R}^4}\) of \(\mathbb {R}^4\) to a manifold with boundary which is closely related to the radial compactifications of asymptotically Minkowski spaces used in [13, 14]. The ideal boundaries \(I^0\), \(\mathscr {I}^+\), and \(I^+\) are then the boundary hypersurfaces of a manifold with corners obtained by blowing up \({}^m\overline{\mathbb {R}^4}\) at the ‘light cone at infinity.’ The spaces of conormal and polyhomogeneous functions on this manifold are defined in §2.2.

Let us recall the notion of the scattering cotangent bundle \({}^{{\text {sc}}}T^*X\) over an n-dimensional manifold X with boundary \(\partial X\). Over the interior \(X^\circ \), \({}^{{\text {sc}}}T^*_{X^\circ }X:=T^*_{X^\circ }X\) is the usual cotangent bundle. Near the boundary, let

$$\begin{aligned} \rho \ge 0,\quad y=(y^1,\ldots ,y^{n-1})\in \mathbb {R}^{n-1} \end{aligned}$$
(2.1)

denote local coordinates in which \(\partial X\) is given by \(\rho =0\); then the 1-forms \(\frac{d\rho }{\rho ^2}\), \(\frac{d y^j}{\rho }\) (\(j=1,\ldots ,n-1\)) are a smooth local frame of \({}^{{\text {sc}}}T^*X\), i.e. smooth scattering 1-forms are precisely the linear combinations \(a(\rho ,y)\frac{d\rho }{\rho ^2}+a_j(\rho ,y)\frac{d y^j}{\rho }\) with \(a,a_j\) smooth. (Equivalently, we can use \(d(1/\rho )\) and \(d(y^j/\rho )\) as a smooth local frame). The point is that, viewed from the perspective of \(X^\circ \), such 1-forms have a very specific behavior as one approaches \(\partial X\). Tensor powers and their symmetric versions \(S^k\,{}^{{\text {sc}}}T^*X\), \(k\in \mathbb {N}\), are defined in the usual manner; the dual bundle is denoted \({}^{{\text {sc}}}TX\) and called the scattering tangent bundle. If \(\partial X=Y\times Z\) and \(X=[0,1)_\rho \times \partial X\) are products, so that \(T^*Y\subset T^*X\) is a well-defined subbundle, then the rescaling \(\rho ^{-1}T^*Y\subset {}^{{\text {sc}}}T^*X\), spanned by covectors of the form \(\rho ^{-1}\eta \), \(\eta \in T^*Y\), is a smooth subbundle.

To give an example, calculations similar to the ones prior to Theorem 1.8 show that the differentials of the standard coordinates on \(\mathbb {R}^n\) extend to the radial compactification \(\overline{\mathbb {R}^n}\) as smooth scattering 1-forms; they are in fact a basis of \({}^{{\text {sc}}}T^*\overline{\mathbb {R}^n}\), and any metric on \(\mathbb {R}^n\) with constant coefficients, such as the Minkowski or Euclidean metric, is a scattering metric, i.e. an element of \(\mathcal {C}^\infty (\overline{\mathbb {R}^n};S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^n})\).

The b-cotangent bundle \({}^{{\text {b}}}T^*X\) is locally spanned by the 1-forms \(\frac{d\rho }{\rho }\), \(d y^j\) (\(j=1,\ldots ,n-1\)); its dual is the b-tangent bundle \({}^{{\text {b}}}TX\), spanned locally by \(\rho \partial _\rho \) and \(\partial _{y^j}\). The space \(\mathcal {V}_{\text {b}}(X)\) of b-vector fields on X, consisting of those vector fields V on X which are tangent to \(\partial X\), is then canonically identified with \(\mathcal {C}^\infty (X;{}^{{\text {b}}}TX)\). A b-metric is a nondegenerate section of \(S^2\,{}^{{\text {b}}}T^*X\). The space \(\text {Diff}_{\text {b}}^k(X)\) of b-differential operators of degree k consists of finite sums of up to k-fold products of b-vector fields. Fixing a collar neighborhood \([0,\epsilon )_\rho \times \partial X\) and choosing local coordinates \(y^j\) on \(\partial X\) as before, the normal operator of an operator \(L\in \text {Diff}_{\text {b}}^k(X)\) given in the coordinates (2.1) by \(L=\sum _{j+|\alpha |\le k} a_{j\alpha }(\rho ,y)(\rho D_\rho )^j D_y^\alpha \) is defined by freezing coefficients at \(\rho =0\),

$$\begin{aligned} N(L) := \sum _{j+|\alpha |\le k} a_{j\alpha }(0,y)(\rho D_\rho )^j D_y^\alpha \in \text {Diff}_{\text {b}}^k([0,\infty )_\rho \times \partial X). \end{aligned}$$
(2.2)

This depends on the choice of collar neighborhood only through the choice of normal vector field \(\partial _\rho |_{\partial X}\); see [85, §4.15] for an invariant description. The Mellin-transformed normal operator family \(\widehat{L}(\sigma )\), \(\sigma \in \mathbb {C}\), is the conjugation of N(L) by the Mellin transform in \(\rho \); thus, in view of \(\rho ^{-i\sigma }\rho D_\rho (\rho ^{i\sigma })=\sigma \), one obtains \(\widehat{L}(\sigma )\) by formally replacing \(\rho D_\rho \) by \(\sigma \):

$$\begin{aligned} \widehat{L}(\sigma ) := \sum _{j+|\alpha |\le k} a_{j\alpha }(0,y)\sigma ^j D_y^\alpha . \end{aligned}$$

This is a holomorphic family of elements of \(\text {Diff}^k(\partial X)\). Analogous constructions can be performed for b-operators acting on vector bundles.
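To illustrate (2.2) and the passage to \(\widehat{L}(\sigma )\), here is a small worked example. The operator below is a hypothetical one on \(X=[0,1)_\rho \times \mathbb {S}^1_y\), chosen only for this sketch; it does not appear elsewhere in the paper. Freezing coefficients at \(\rho =0\) discards the terms carrying an extra factor of \(\rho \), and replacing \(\rho D_\rho \) by \(\sigma \) then gives

$$\begin{aligned} L&= (\rho D_\rho )^2 + (1+\rho )D_y^2 + \rho \,(\rho D_\rho ), \\ N(L)&= (\rho D_\rho )^2 + D_y^2, \\ \widehat{L}(\sigma )&= \sigma ^2 + D_y^2. \end{aligned}$$

Since \(D_y^2 e^{i k y}=k^2 e^{i k y}\), the operator \(\widehat{L}(\sigma )\) in this example acts on the k-th Fourier mode as multiplication by \(\sigma ^2+k^2\), so it fails to be invertible exactly at \(\sigma =\pm i k\), \(k\in \mathbb {Z}\).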

2.1 Analytic structure

Fix the mass \(m\in \mathbb {R}\); for now, m does not have to be small. The Schwarzschild metric, written in polar coordinates on \(\mathbb {R}\times \mathbb {R}^3\), takes the form

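$$\begin{aligned} \bigl (1-\tfrac{2 m}{r}\bigr )dt^2 - \bigl (1-\tfrac{2 m}{r}\bigr )^{-1}dr^2 - r^2\slashed {g} = \bigl (1-\tfrac{2 m}{r}\bigr )ds^2 + 2\,ds\,dr - r^2\slashed {g}, \end{aligned}$$
(2.3)

where \(\slashed {g}\) denotes the round metric on \(\mathbb {S}^2\), and where we let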

$$\begin{aligned} s := t-r_*,\ \ r_* := r + 2 m\log (r-2 m), \end{aligned}$$
(2.4)

so \(dr_*=\tfrac{r}{r-2 m}dr\). Note that level sets of s are radial outgoing null cones. Define

$$\begin{aligned} \rho := r^{-1}, \ \ v := r^{-1}\bigl (t - r - \chi (t/r)2 m\log (r-2 m)\bigr ), \end{aligned}$$
(2.5)

where \(\chi (x)\equiv 1\), \(x<2\), and \(\chi (x)\equiv 0\), \(x>3\). Let then

$$\begin{aligned} C_1 := [0,\epsilon _0)_\rho \times (-\tfrac{7}{4},5)_v \times \mathbb {S}^2_\omega , \end{aligned}$$
(2.6)

where we shrink \(\epsilon _0>0\) so that t is well-defined and depends smoothly on \(\rho >0\) and v, via the implicit function theorem applied to (2.5). This will provide the compactification near the future light cone (and part of spatial infinity). Near future infinity, we use standard coordinates \((t,x)\in \mathbb {R}\times \mathbb {R}^3\) on \(\mathbb {R}^4\); define

$$\begin{aligned} \rho '_+=t^{-1}, \ \ X=x/|t|, \end{aligned}$$
(2.7)

and put

$$\begin{aligned} C_2 := [0,\epsilon _0)_{\rho '_+} \times \{ X\in \mathbb {R}^3 :|X|<\tfrac{1}{4} \}. \end{aligned}$$
(2.8)
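The claim that t is determined by \((\rho ,v)\) through (2.5) for small \(\rho \) can be checked numerically. The following is a minimal sketch, not part of the paper's argument: the concrete cutoff `chi` (built from the standard \(e^{-1/x}\) bump) is one admissible choice and an assumption of this example, as are the function names and the fixed-point iteration, which converges here because the \(\chi \)-dependent term has derivative of size \(O(m\rho \log \rho ^{-1})\).

```python
import math

def smooth_step(y):
    """Smooth function equal to 0 for y <= 0 and to 1 for y >= 1."""
    def f(z):
        return math.exp(-1.0 / z) if z > 0 else 0.0
    return f(y) / (f(y) + f(1.0 - y))

def chi(x):
    """One admissible cutoff (an assumption): 1 for x <= 2, 0 for x >= 3."""
    return 1.0 - smooth_step(x - 2.0)

def v_of(t, r, m):
    """The coordinate v from (2.5), as a function of (t, r)."""
    return (t - r - chi(t / r) * 2.0 * m * math.log(r - 2.0 * m)) / r

def solve_t(rho, v, m, iterations=60):
    """Recover t from (rho, v) by iterating t = r(1+v) + 2m chi(t/r) log(r-2m)."""
    r = 1.0 / rho
    log_term = 2.0 * m * math.log(r - 2.0 * m)
    t = r * (1.0 + v)  # exact where chi(t/r) = 0
    for _ in range(iterations):
        t = r * (1.0 + v) + chi(t / r) * log_term
    return t
```

For example, one can pick a point with \(t/r=2.5\) (inside the transition region of the cutoff), compute v from (2.5), and confirm that `solve_t` returns the original t.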

For \(\epsilon _0>0\) small enough, we can consider the interiors \(C_1^\circ \), \(C_2^\circ \) as smooth submanifolds of \(\mathbb {R}^4\) using the identifications (2.5) and (2.7). (Note in particular that the smooth structures agree with the induced smooth structure of \(\mathbb {R}^4\)). Let us consider the transition map between \(C_1^\circ \) and \(C_2^\circ \) in more detail: in \(C_1^\circ \cap C_2^\circ \) and for \(t^{-1}\) small enough, we have \(\chi (t/r)\equiv 0\) and \(\tfrac{r}{t}>\tfrac{1}{7}\), so the map

$$\begin{aligned} (\rho '_+,X) \mapsto (\rho =\rho '_+ / |X|,\ v=|X|^{-1}-1,\ \omega =X/|X|) \end{aligned}$$
(2.9)

extends smoothly (with smooth inverse) to \(\rho '_+=0\). We then let

$$\begin{aligned} \overline{\mathbb {R}^4} := \bigl ( \mathbb {R}^4 \sqcup C_1 \sqcup C_2 \bigr ) / \sim \end{aligned}$$

where \(\sim \) identifies \(C_1\) and \(C_2\) with subsets of \(\mathbb {R}^4\) as above, and the boundary points of \(C_1\) and \(C_2\) are identified using the map (2.9). This is thus a smooth manifold with boundary (see Footnote 24), though both \(\overline{\mathbb {R}^4}\) and \(\partial {\overline{\mathbb {R}^4}}=(\partial C_1\sqcup \partial C_2)/\sim \) are noncompact. In other words, \(\overline{\mathbb {R}^4}\) is only a compactification of the region \(v>-\tfrac{7}{4}\). See Figure 7.

Fig. 7. The partial compactification \(\overline{\mathbb {R}^4}\) of \(\mathbb {R}^4\), constructed from \(\mathbb {R}^4\), \(C_1\), and \(C_2\). Also shown is the hypersurface \(\Sigma \) from (2.15).
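The transition map (2.9) can be sanity-checked numerically in the region where \(\chi (t/r)=0\), so that (2.5) reduces to \(\rho =r^{-1}\), \(v=(t-r)/r\). The sketch below is illustrative only; the function names are hypothetical, and the sample point is an arbitrary choice with \(\tfrac{1}{6}<|X|<\tfrac{1}{4}\).

```python
import math

def future_chart(t, x):
    """Coordinates (2.7): rho'_+ = 1/t and X = x/|t|, for t > 0."""
    return 1.0 / t, tuple(xi / abs(t) for xi in x)

def cone_chart_no_log(t, x):
    """Coordinates (rho, v, omega) from (2.5) in the region where chi(t/r) = 0."""
    r = math.sqrt(sum(xi * xi for xi in x))
    return 1.0 / r, (t - r) / r, tuple(xi / r for xi in x)

def transition(rho_plus, X):
    """The map (2.9): (rho'_+, X) -> (rho, v, omega)."""
    norm_X = math.sqrt(sum(Xi * Xi for Xi in X))
    return (rho_plus / norm_X,
            1.0 / norm_X - 1.0,
            tuple(Xi / norm_X for Xi in X))

def max_discrepancy(t, x):
    """Largest componentwise difference between the two ways of computing (rho, v, omega)."""
    rho_a, v_a, om_a = transition(*future_chart(t, x))
    rho_b, v_b, om_b = cone_chart_no_log(t, x)
    diffs = [abs(rho_a - rho_b), abs(v_a - v_b)]
    diffs += [abs(a - b) for a, b in zip(om_a, om_b)]
    return max(diffs)
```

Note that the formulas in `transition` only involve \(\rho '_+/|X|\), \(|X|^{-1}\), and \(X/|X|\), which are smooth down to \(\rho '_+=0\) as long as \(|X|\) is bounded away from 0; this is the smooth extension asserted after (2.9).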

The scattering cotangent bundle of \(\overline{\mathbb {R}^4}\) near the light cone at infinity has a smooth partial trivialization \({}^{{\text {sc}}}T^*_{C_1}\overline{\mathbb {R}^4}=\langle dr\rangle \oplus \langle d(v/\rho )\rangle \oplus \rho ^{-1}T^*\mathbb {S}^2\), thus if \(\psi \) is a smooth function with \(\psi (v)\equiv 1\) for \(v<1\) and \(\psi (v)\equiv 0\) for \(v>2\), then

(2.10)

In \(v>3\) and for \(\epsilon _0>0\) small enough, we simply have \(g_{m,1}=dt^2-dx^2\), which is thus equal to

$$\begin{aligned} g_{m,2} := d(1/\rho '_+)^2 - d(X/\rho '_+)^2 \in \mathcal {C}^\infty (C_2;S^2\,{}^{{\text {sc}}}T^*_{C_2}\overline{\mathbb {R}^4}) \end{aligned}$$

on the overlap \(C_1\cap C_2\). Thus, we can glue \(g_{m,1}\) and \(g_{m,2}\) together to define a Lorentzian scattering metric \(\widetilde{g}_m\) on \(C_1\cup C_2\). We extend \(\widetilde{g}_m\) to a global metric:

Definition 2.1

Fix \(\phi \in \mathcal {C}^\infty (\overline{\mathbb {R}^4})\) such that \({\text {supp}}\phi \subset C_1\cup C_2\), and so that \(\phi \equiv 1\) near \(\partial \overline{\mathbb {R}^4}\). With \(\tilde{g}_m\) as above, we then define

$$\begin{aligned} g_m := \phi \widetilde{g}_m + (1-\phi )(dt^2-dx^2) \in \mathcal {C}^\infty (\overline{\mathbb {R}^4};S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}), \end{aligned}$$
(2.11)

thus gluing \(\widetilde{g}_m\) to the Minkowski metric away from \(C_1\cup C_2\).

By construction, \(g_m\) is equal to the Minkowski metric in a compact region of \(\mathbb {R}^4\) as well as in a closed subcone of the interior of the future light cone, which we glue together with the Schwarzschild metric near spacelike and null infinity.

Next, denote the light cone at future infinity by

$$\begin{aligned} S^+ := \{ \rho =0,\ v=0 \} \subset \partial \overline{\mathbb {R}^4} \end{aligned}$$
(2.12)

and let

$$\begin{aligned} M' := [\overline{\mathbb {R}^4};S^+] \end{aligned}$$

denote the blow-up of \(\overline{\mathbb {R}^4}\) at \(S^+\), see Figure 8. That is, as a set,

$$\begin{aligned} M'=\bigl (\overline{\mathbb {R}^4}\setminus S^+\bigr ) \sqcup \bigl ([-\pi /2,\pi /2]_\sigma \times \mathbb {S}^2_\omega \bigr ), \end{aligned}$$

which can be endowed with the structure of a smooth (noncompact) manifold with corners by writing it as

$$\begin{aligned} M' = \left( \bigl (\overline{\mathbb {R}^4}\setminus S^+\bigr ) \sqcup \bigl ([0,1)_{\rho _I}\times [-\pi /2,\pi /2]_\sigma \times \mathbb {S}^2_\omega \bigr )\right) / \sim , \end{aligned}$$

where we identify a point in \(\overline{\mathbb {R}^4}\) with coordinates \((\rho ,v,\omega )\), \((\rho ,v)\ne (0,0)\), with the point \((\rho _I=\sqrt{\rho ^2+v^2},\,\sigma =\arctan (v/\rho ),\,\omega )\). The map

$$\begin{aligned} \beta :M' \rightarrow \overline{\mathbb {R}^4}, \end{aligned}$$
(2.13)

equal to the identity map away from \(S^+\), and given by \(\beta (\rho _I,\sigma ,\omega )=(\rho =\rho _I\cos \sigma ,\,v=\rho _I\sin \sigma ,\,\omega )\), is called the blow-down map. Note that \(\beta \) is a local diffeomorphism away from \(S^+\), but is not injective at the front face

$$\begin{aligned} \text{ ff }([\overline{\mathbb {R}^4};S^+]) := \rho _I^{-1}(0) \end{aligned}$$

of the blow-up. The point of doing this blow-up is that curves tending to \(S^+\) but at different angles \(\sigma \) have distinct limiting points on the front face. Concretely, \(s=\tan (\sigma )=v/\rho =t-r_*\) is an affine parameter on the fibers \(\beta ^{-1}(p)\), \(p\in S^+\), of the blow-down map, so \(\beta ^{-1}(S^+)\) is the set of all endpoints of future-directed outgoing radial null-geodesics of mass m Schwarzschild, and radial null-geodesics with different \(t-r_*\) are separated all the way up to \(\beta ^{-1}(S^+)\). It is thus natural to define:

Definition 2.2

Null infinity \(\mathscr {I}^+\) is defined as the front face of the blow-up of \(S^+\subset \overline{\mathbb {R}^4}\),

$$\begin{aligned} \mathscr {I}^+ := \text{ ff }([\overline{\mathbb {R}^4};S^+]). \end{aligned}$$

Fig. 8. Left: the partial compactification \(\overline{\mathbb {R}^4}\) and its light cone at infinity \(S^+\). Right: the blow-up \(M'=[\overline{\mathbb {R}^4};S^+]\xrightarrow {\beta }\overline{\mathbb {R}^4}\), with front face \(\mathscr {I}^+\) (null infinity) and side faces \(I^0\) (spatial infinity), \(I^+\) (future timelike infinity).
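The effect of the blow-up can be seen in a few lines of code. The sketch below (with hypothetical function names, and ignoring the spherical variable \(\omega \)) implements the polar coordinates \((\rho _I,\sigma )\) and the blow-down map \(\beta \), and evaluates \(\sigma \) along curves \(v=s\rho \), i.e. along \(t-r_*=s\) in the region where \(\chi \equiv 1\): all such curves tend to \(S^+=\{\rho =0,\ v=0\}\), but the angle stays at \(\arctan s\).

```python
import math

def lift(rho, v):
    """Polar coordinates (rho_I, sigma) of a point with rho >= 0 and (rho, v) != (0, 0)."""
    return math.hypot(rho, v), math.atan2(v, rho)

def blow_down(rho_I, sigma):
    """The blow-down map beta in the (rho, v) variables."""
    return rho_I * math.cos(sigma), rho_I * math.sin(sigma)

def angles_along_null_ray(s, radii):
    """Values of sigma along the curve rho = 1/r, v = s/r for the given radii r."""
    return [lift(1.0 / r, s / r)[1] for r in radii]

if __name__ == "__main__":
    radii = [10.0 ** k for k in range(1, 7)]
    for s in (-1.0, 0.0, 2.0):
        print(s, angles_along_null_ray(s, radii)[-1], math.atan(s))
```

Since `math.atan2(v, rho)` agrees with \(\arctan (v/\rho )\) for \(\rho >0\) and equals \(\pm \pi /2\) at \(\rho =0\), the angle takes values in \([-\pi /2,\pi /2]\), matching the range of \(\sigma \) above.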

The side faces of the blow-up are the connected components of the lift of the original boundary hypersurface \(\partial \overline{\mathbb {R}^4}\), i.e. of the closure of the preimage of \(\partial \overline{\mathbb {R}^4}\setminus S^+\) under \(\beta \). In the present situation, there are two side faces:

Definition 2.3

The future temporal face is

$$\begin{aligned} I^+=\overline{\beta ^{-1}\bigl ((\partial C_2\cap \partial \overline{\mathbb {R}^4}) \cup \{v>0\}\bigr )}, \end{aligned}$$

whose image \(\beta (I^+)\) is a closed 3-ball with boundary \(S^+\). The spatial face (more precisely: the part of it that we chose to include in the compactification \(\overline{\mathbb {R}^4}\)) is defined by

$$\begin{aligned} I^0 := \overline{\beta ^{-1}\bigl (\partial \overline{\mathbb {R}^4}\cap \{v<0\}\bigr )}. \end{aligned}$$

Using \(\beta \), one can pull back natural vector bundles on \(\overline{\mathbb {R}^4}\) to \(M'\); for instance, the pullback \(\beta ^*g_m\), which we simply denote by \(g_m\) for brevity, is an element of \(\mathcal {C}^\infty (M';\beta ^* S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})\) (and constant along the fibers of \(\beta \)).

Let \(\rho _0=r^{-1}\) for \(|v+1|\le \tfrac{3}{4}\) and \(r>R\), \(R\gg 1\), and extend it to a smooth positive function on all of \(\mathbb {R}^4\). Denote then by \(t_{\text {b}}\) the smooth function

$$\begin{aligned} t_{\text {b}}= \rho _0(t - 2 m \chi _0(r) \log (r-2 m)), \end{aligned}$$
(2.14)

defined for \(|t|/\langle r\rangle <\tfrac{1}{2}\), where \(\chi _0\equiv 0\) for \(r<R\) and \(\chi _0\equiv 1\) for \(r>2 R\); this extends the function \(v+1\) smoothly into the interior \(\mathbb {R}^4\), and \(dt_{\text {b}}\) is timelike on

$$\begin{aligned} \Sigma :=t_{\text {b}}^{-1}(0). \end{aligned}$$
(2.15)

(The main point of this construction is to write the initial hypersurface \(\Sigma \) in a nondegenerate way, i.e. as the zero set of a function whose differential does not vanish anywhere on it). Note that the function \(\rho _0\) is, in a neighborhood of \(\Sigma \), a boundary defining function of \(I^0\); below, we shall use different boundary defining functions adapted to our needs, but keep the same notation. See also Remark 2.6.

We restrict our analysis from now on to the following smooth manifold with corners:

Definition 2.4

The compact manifold with corners M is defined by

$$\begin{aligned} M:=M'\cap \bigl (\{t_{\text {b}}\ge 0\}\cup \{t>\tfrac{1}{3} \langle r\rangle \}\bigr ). \end{aligned}$$

One should think of this as (the compactification of) the causal future of \(\Sigma \); and this is indeed what it is if we endow \(M^\circ \) with the Minkowski metric.

We regard the boundary \(\Sigma \subset M\) as ‘artificial,’ i.e. incomplete, from the point of view of b-analysis; recall Figure 1; abusing notation slightly, we shall denote the part \(I^0\cap M\) of spatial infinity contained in M again by \(I^0\). We denote by \(\rho _0\), \(\rho _I\), and \(\rho _+\in \mathcal {C}^\infty (M)\) defining functions of \(I^0\), \(\mathscr {I}^+\), and \(I^+\), respectively; we further let \(\rho \in \mathcal {C}^\infty (M)\) denote a total boundary defining function, e.g. \(\rho =\rho _0\rho _I\rho _+\). Defining functions are well-defined up to multiplication by smooth positive functions. We shall often make concrete choices to simplify local calculations; by a local defining function of \(I^0\), say, on some open subset \(U\subset M\) we then mean a function \(\rho _0\in \mathcal {C}^\infty (U)\) so that for any \(K\Subset U\), \(\rho _0|_K\) can be extended to a globally defined defining function of \(I^0\). We remark that \(\rho _0|_\Sigma \in \mathcal {C}^\infty (\Sigma )\) is a defining function of \(\partial \Sigma \) within \(\Sigma \).

Remark 2.5

The causal character (spacelike, null, timelike) of level sets of \(\rho _0\), i.e. of \(d\rho _0\), depends on the particular choice of \(\rho _0\). On the other hand, the vector field \(\rho _0\partial _{\rho _0}\), defined using any local coordinate system, is well-defined as an element of \({}^{{\text {b}}}T_{I^0}M\), and thus so is its causal character at \(I^0\) with respect to the b-metric \(\rho ^2 g_m\): it is the scaling vector field at infinity, see the discussion after equation (1.13), and spacelike away from the corner \(I^0\cap \mathscr {I}^+\). Likewise, \(\rho _+\partial _{\rho _+}\) is the scaling vector field at \(I^+\), which is timelike.

Let us relate \(\Sigma \) to the radial compactification \(\overline{\mathbb {R}^3}\) of Euclidean 3-space; recall that the latter is defined using polar coordinates \((r,\omega )\in (0,\infty )\times \mathbb {S}^2\) on \(\mathbb {R}^3\) as the closed 3-ball

$$\begin{aligned} \overline{\mathbb {R}^3} := \bigl (\mathbb {R}^3 \sqcup ([0,\infty )_{\rho _0}\times \mathbb {S}^2)\bigr ) / \sim ,\ \ \text{ where }\ (r,\omega ) \sim (\rho _0,\omega ),\ \rho _0=r^{-1},\ r>0. \end{aligned}$$

Consider the map \(\iota :\mathbb {R}^3\ni x=(r,\omega )\mapsto (2 m\chi _0(r)\log (r-2 m),x)\in \Sigma ^\circ \subset \mathbb {R}_t\times \mathbb {R}^3_x\), which is the projection along the flow of \(\partial _t\). Expressed near \(\partial \overline{\mathbb {R}^3}\), i.e. for small \(\rho _0\), this takes the form \(\iota (\rho _0,\omega )=(\rho ,v,\omega )\) for \(\rho =\rho _0\) and \(v=-1\); thus, \(\iota \) extends to a diffeomorphism

$$\begin{aligned} \Sigma \cong \overline{\mathbb {R}^3}. \end{aligned}$$
(2.16)

Whenever necessary, we shall make the mass parameter m in these constructions explicit by writing

$$\begin{aligned} {}^m\overline{\mathbb {R}^4},\ {}^mM',\ {}^mM,\ {}^m\Sigma ,\ {}^m t_{\text {b}},\ {}^m I^0,\ {}^m\mathscr {I}^+,\ {}^m I^+,\ {}^m\beta ,\ {}^m\rho ,\ \text {etc.} \end{aligned}$$
(2.17)

In particular, \({}^0\overline{\mathbb {R}^4}\) is the radial compactification of \(\mathbb {R}^4\) with the closed subset \(\{|t|^{-1}=0,\ t/r\le -\tfrac{3}{4}\}\) of the boundary removed; note here that on their respective domains of definition, \(r^{-1}\) and \(|t|^{-1}\) are indeed local boundary defining functions of \({}^0\overline{\mathbb {R}^4}\). Moreover, the metric \(g_m\) for \(m=0\) is equal to the Minkowski metric \(\underline{g}{}\). We shall explore the relationships between \({}^m\overline{\mathbb {R}^4}\) etc. for different values of m in §2.3.

Remark 2.6

For \(m=0\), it is easy to write down global expressions for boundary defining functions in \(t\ge \tfrac{1}{2}|r|\), for instance (using notation similar to [74])

$$\begin{aligned} {}^0\rho _0 = (1+q_-)^{-1}, \quad {}^0\rho _I = t^{-1}(1+q_-)(1+q_+),\quad {}^0\rho _+ = (1+q_+)^{-1}, \quad {}^0\rho = t^{-1}; \end{aligned}$$
(2.18)

here \(q_+=\phi _+(t-\langle r\rangle )\) and \(q_-=\phi _+(\langle r\rangle -t)\), where \(\phi _+(x)\) is a smooth function, \(\phi _+(x)=x\) for \(x\ge 1\), and \(\phi _+(x)=0\), \(x\le 0\). One can write down similar expressions for general m by using \(r_*\) instead of r near \(\mathscr {I}^+\cup I^0\), and inserting suitable partitions of unity to obtain expressions which are globally smooth. While expressions such as (2.18) offer a quick way to translate bounds by \(({}^0\rho _0)^{a_0}({}^0\rho _I)^{a_I}({}^0\rho _+)^{a_+}\) into bounds in terms of standard coordinates on \(\mathbb {R}^4\), they are of course cumbersome to work with if one uses them as parts of local coordinate systems on \({}^mM\). Furthermore, since we fixed a smooth structure of \({}^mM\), boundary defining functions on \({}^mM\) are well-defined up to multiplication by smooth, positive functions with smooth, positive reciprocals; therefore, decay rates, such as \(a_0,a_I,a_+\) above, with respect to one particular set of choices of boundary defining functions of \({}^mM\) are the same as for any other set of choices on the same manifold \({}^mM\). The advantage of defining \({}^mM\) is then that one can work with any convenient choices of (local) boundary defining functions for any particular local coordinate calculation or estimate for a PDE on \({}^mM\), and the decay rates in such an estimate, when expressed in terms of one’s chosen defining functions, make invariant sense.
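To make the translation between weights and standard coordinates concrete, the following sketch evaluates the \(m=0\) defining functions (2.18) at points \((t,r)\) with \(t\ge \tfrac{1}{2}r\). The particular function `phi_plus` is one admissible choice of \(\phi _+\) and an assumption of this example (any smooth function with the stated properties would do), and the function names are hypothetical.

```python
import math

def smooth_step(y):
    """Smooth function equal to 0 for y <= 0 and to 1 for y >= 1."""
    def f(z):
        return math.exp(-1.0 / z) if z > 0 else 0.0
    return f(y) / (f(y) + f(1.0 - y))

def phi_plus(x):
    """One admissible phi_+ (an assumption): equals x for x >= 1 and 0 for x <= 0."""
    return x * smooth_step(x)

def defining_functions(t, r):
    """The functions (rho_0, rho_I, rho_+) from (2.18) for m = 0, at a point with t > 0."""
    jr = math.sqrt(1.0 + r * r)          # Japanese bracket <r>
    q_plus = phi_plus(t - jr)
    q_minus = phi_plus(jr - t)
    rho_0 = 1.0 / (1.0 + q_minus)
    rho_plus = 1.0 / (1.0 + q_plus)
    rho_I = (1.0 + q_minus) * (1.0 + q_plus) / t
    return rho_0, rho_I, rho_plus

def weight(t, r, a_0, a_I, a_plus):
    """The bound rho_0^{a_0} rho_I^{a_I} rho_+^{a_+} in standard coordinates."""
    rho_0, rho_I, rho_plus = defining_functions(t, r)
    return rho_0 ** a_0 * rho_I ** a_I * rho_plus ** a_plus
```

For instance, on the hypersurface \(t=\langle r\rangle \) one has \(q_\pm =0\), so the weight reduces to \(t^{-a_I}\) regardless of \(a_0\) and \(a_+\), whereas at \(r=0\) and large t only the factor \(\rho _+^{a_+}\approx t^{-a_+}\) contributes.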

Working on \({}^m\overline{\mathbb {R}^4}\), the following coordinates are convenient for performing calculations near the light cone at infinity \(S^+\):

Definition 2.7

We define the coordinates \(q=x^0\) and \(s=x^1\) as follows:

$$\begin{aligned} q:=x^0:=t+r_*,\ \ s:=x^1:=t-r_*. \end{aligned}$$

Their level sets are null hypersurfaces for the mass m Schwarzschild metric. Using \(d q=d s+2 d r_*\) and (2.4),

$$\begin{aligned} {}^{{\text {sc}}}T^*\overline{\mathbb {R}^4} = \langle dq \rangle \oplus \langle ds \rangle \oplus r\,T^*\mathbb {S}^2 \end{aligned}$$
(2.19)

therefore defines a smooth partial trivialization near \(S^+\); recall that \(\rho =r^{-1}\) there. Similarly,

$$\begin{aligned} \partial _0 \equiv \partial _{x^0} = \partial _q = \tfrac{1}{2}(\partial _t+\partial _{r_*}),\ \ \partial _1 \equiv \partial _{x^1} = \partial _s = \tfrac{1}{2}(\partial _t-\partial _{r_*}) \end{aligned}$$

are smooth scattering vector fields on \(\overline{\mathbb {R}^4}\), and together with \(r^{-1} T\mathbb {S}^2\), they give a smooth partial trivialization of \({}^{{\text {sc}}}T\overline{\mathbb {R}^4}\) near \(S^+\) (see Footnote 25). Letting \(x^a\), \(a=2,3\), denote local coordinates on \(\mathbb {S}^2\), we will denote spherical indices by early alphabet Latin letters a, b, c, d, e, and general indices ranging from 0 to 3 by Greek letters. The components of a section \(\omega \) of \({}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\) in the splitting (2.19) are denoted with barred indices:

$$\begin{aligned} \omega _{\bar{0}}:=\omega (\partial _0),\ \ \omega _{\bar{1}}:=\omega (\partial _1),\ \ \omega _{\bar{a}}:=\omega (\rho \partial _a)=r^{-1}\omega (\partial _a). \end{aligned}$$
(2.20)

Thus, the components of a tensor with respect to this splitting have size comparable to the components in the coordinate basis of \(T^*\mathbb {R}^4\). The splitting (2.19) induces the splitting

$$\begin{aligned} S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}&= \langle dq^2 \rangle \oplus \langle 2 dq\,ds \rangle \oplus (2 dq\otimes _s r\,T^*\mathbb {S}^2) \nonumber \\&\quad \oplus \langle ds^2 \rangle \oplus (2 ds\otimes _s r\,T^*\mathbb {S}^2) \oplus r^2\,S^2 T^*\mathbb {S}^2, \end{aligned}$$
(2.21)

as well as the dual splittings of the dual bundles \({}^{{\text {sc}}}T\overline{\mathbb {R}^4}\) and \(S^2\,{}^{{\text {sc}}}T\overline{\mathbb {R}^4}\). We will occasionally use the further splitting

(2.22)

For calculations of geometric quantities associated with the metric, the bundle splittings induced by the coordinates \(q,s,x^2,x^3\), i.e.

$$\begin{aligned} T^*\mathbb {R}^4&= \langle dq\rangle \oplus \langle ds\rangle \oplus T^*\mathbb {S}^2, \nonumber \\ S^2 T^*\mathbb {R}^4&= \langle dq^2 \rangle \oplus \langle 2 dq\,ds \rangle \oplus (2 dq\otimes _s T^*\mathbb {S}^2) \nonumber \\&\quad \oplus \langle ds^2 \rangle \oplus (2 ds\otimes _s T^*\mathbb {S}^2) \oplus S^2 T^*\mathbb {S}^2, \end{aligned}$$
(2.23)

are more convenient. Components are denoted without bars, that is, for a 1-form \(\omega \) and for \(\mu =0,1\), we have \(\omega _\mu :=\omega (\partial _\mu )=\omega _{\bar{\mu }}\), while we let \(\omega _a:=\omega (\partial _a)=r\omega _{\bar{a}}\). In short, we have

$$\begin{aligned} \omega _{\bar{\mu }}=r^{-s(\mu )}\omega _\mu ,\quad s(\mu _1,\ldots ,\mu _N) := \#\{\lambda :\mu _\lambda \in \{2,3\} \}, \end{aligned}$$
(2.24)

likewise for tensors of higher rank.

On the resolved space M, the null derivatives \(\partial _0,\partial _1\) can be computed as follows: near \(I^0\cap \mathscr {I}^+\), we can take

$$\begin{aligned} \rho _0=-\rho /v=(r_*-t)^{-1},\ \ \rho _I=-v=(r_*-t)/r,\ \ \rho =\rho _0\rho _I=r^{-1}; \end{aligned}$$
(2.25)

then

$$\begin{aligned} \partial _0&= -\tfrac{1}{2}\rho _0\rho _I(1-2 m\rho )\rho _I\partial _{\rho _I}, \nonumber \\ \partial _1&= \rho _0\bigl (\rho _0\partial _{\rho _0} - (1-\tfrac{1}{2}\rho _I(1-2 m\rho ))\rho _I\partial _{\rho _I}\bigr ), \end{aligned}$$
(2.26)

and dually

$$\begin{aligned} \rho \,d q =-\tfrac{2}{1-2 m\rho }\bigl (\tfrac{d\rho _0}{\rho _0}+\tfrac{d\rho _I}{\rho _I}\bigr ) + \rho _I\tfrac{d\rho _0}{\rho _0}, \ \ \rho \,d s = \rho _I\tfrac{d\rho _0}{\rho _0}. \end{aligned}$$
(2.27)
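The expressions (2.26) can be verified numerically by applying \(\partial _0=\tfrac{1}{2}(\partial _t+\partial _{r_*})\) and \(\partial _1=\tfrac{1}{2}(\partial _t-\partial _{r_*})\), with \(\partial _{r_*}=(1-\tfrac{2 m}{r})\partial _r\) by (2.4), to the functions \(\rho _0,\rho _I\) from (2.25) and comparing with the coefficients read off from (2.26). The sketch below uses central finite differences; the function names, the step size, and the sample point are choices made for this illustration only.

```python
import math

def r_star(r, m):
    """Tortoise coordinate from (2.4)."""
    return r + 2.0 * m * math.log(r - 2.0 * m)

def rho_0(t, r, m):
    """rho_0 = (r_* - t)^{-1} from (2.25)."""
    return 1.0 / (r_star(r, m) - t)

def rho_I(t, r, m):
    """rho_I = (r_* - t)/r from (2.25)."""
    return (r_star(r, m) - t) / r

def null_derivatives(f, t, r, m, h=1e-4):
    """Finite-difference values of (d_0 f, d_1 f) at the point (t, r)."""
    f_t = (f(t + h, r, m) - f(t - h, r, m)) / (2.0 * h)
    f_r = (f(t, r + h, m) - f(t, r - h, m)) / (2.0 * h)
    f_rstar = (1.0 - 2.0 * m / r) * f_r
    return 0.5 * (f_t + f_rstar), 0.5 * (f_t - f_rstar)

def predicted(t, r, m):
    """(d_0 rho_0, d_1 rho_0) and (d_0 rho_I, d_1 rho_I) as given by (2.26)."""
    p0, pI = rho_0(t, r, m), rho_I(t, r, m)
    c = 1.0 - 2.0 * m / r                 # 1 - 2 m rho
    on_rho_0 = (0.0, p0 * p0)
    on_rho_I = (-0.5 * p0 * pI * c * pI, -p0 * (1.0 - 0.5 * pI * c) * pI)
    return on_rho_0, on_rho_I
```

Comparing `null_derivatives(rho_0, ...)` and `null_derivatives(rho_I, ...)` with the two pairs returned by `predicted` at a point with \(r_*>t\) shows agreement up to the finite-difference error.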

A similar calculation near \(I^+\cap \mathscr {I}^+\) yields

$$\begin{aligned} \partial _0 = f_0\rho _0\rho _I\rho _+\cdot \rho _I\partial _{\rho _I},\ \ \partial _1 \in \rho _0\rho _+\mathcal {V}_{\text {b}}(M), \end{aligned}$$
(2.28)

for some \(f_0\in \mathcal {C}^\infty (M)\), \(f_0>0\), depending on the choices of boundary defining functions.

2.2 Function spaces

We first recall the notion of b-Sobolev spaces on \(\mathbb {R}^{n,d}_+:=[0,\infty )^d_x\times \mathbb {R}^{n-d}_y\): first, we set \(H_{{\text {b}}}^0(\mathbb {R}^{n,d}_+)\equiv L^2_{\text {b}}(\mathbb {R}^{n,d}_+):=L^2(\mathbb {R}^{n,d}_+;|\frac{d x^1}{x^1}\ldots \frac{d x^d}{x^d}d y|)\); for \(k\in \mathbb {N}\) then, \(H_{{\text {b}}}^k(\mathbb {R}^{n,d}_+)\) consists of all \(u\in L^2_{\text {b}}\) such that \(V_1\ldots V_j u\in L^2_{\text {b}}\) for all \(0\le j\le k\), where each \(V_\ell \) is equal to either \(x^p\partial _{x^p}\) or \(\partial _{y^q}\) for some \(p=1,\ldots ,d\), \(q=1,\ldots ,n-d\). For general \(s\in \mathbb {R}\), one defines \(H_{{\text {b}}}^s(\mathbb {R}^{n,d}_+)\) by interpolation and duality. One can define b-Sobolev spaces on compact manifolds with corners by localization and using local coordinate charts; we give an invariant description momentarily. Note that the logarithmic change of coordinates \(t^j:=-\log x^j\), \(j=1,\ldots ,d\), induces an isometric isomorphism \(H_{{\text {b}}}^s(\mathbb {R}^{n,d}_+)\cong H^s(\mathbb {R}^n)\) with the standard Sobolev space on \(\mathbb {R}^n\).
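The isomorphism in the last sentence can be made explicit in the simplest case \(n=d=1\) (a worked special case included here only as an illustration). With \(t=-\log x\), one has \(\frac{d x}{x}=-d t\), and \((0,\infty )_x\) corresponds to \(\mathbb {R}_t\), so

$$\begin{aligned} \Vert u\Vert _{L^2_{\text {b}}}^2 = \int _0^\infty |u(x)|^2\,\frac{d x}{x} = \int _{\mathbb {R}} |u(e^{-t})|^2\,d t,\qquad x\partial _x = -\partial _t. \end{aligned}$$

Thus \(u\in H_{{\text {b}}}^k([0,\infty ))\) if and only if \(t\mapsto u(e^{-t})\) lies in \(H^k(\mathbb {R})\), with equal norms when both are defined using the vector fields \(x\partial _x\) and \(\partial _t\), respectively.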

Now on \(M'\), fix any smooth b-density, i.e. in local coordinates as above a smooth positive multiple of \(|\frac{d x^1}{x^1}\ldots \frac{d x^d}{x^d}d y|\); then the space \(L^2_{\text {b}}(M')\) with respect to this density is well-defined. The space \(L^2_{\text {b}}(M)\) of restrictions of elements \(u\in L^2_{\text {b}}(M')\) to M is similarly well-defined, and since M is compact, any two choices of b-densities on \(M'\) yield equivalent norms on \(L^2_{\text {b}}(M)\). More generally, if \(b_0,b_I,b_+\in \mathbb {R}\) are weights, we define the weighted \(L^2\) space

$$\begin{aligned} \rho _0^{b_0}\rho _I^{b_I}\rho _+^{b_+}H_{{\text {b}}}^0(M) \equiv \rho _0^{b_0}\rho _I^{b_I}\rho _+^{b_+}L^2_{\text {b}}(M) := \bigl \{ u :\rho _0^{-b_0}\rho _I^{-b_I}\rho _+^{-b_+}u\in L^2_{\text {b}}(M) \bigr \}. \end{aligned}$$

The b-Sobolev spaces of order \(k=0,1,2,\ldots \) are defined using a finite collection of vector fields \(\mathscr {V}\subset \mathcal {V}_{\text {b}}(M')\) such that at each point \(p\in M\), the collection \(\mathscr {V}_p\) spans \({}^{{\text {b}}}T_p M\), namely

$$\begin{aligned} H_{{\text {b}}}^k(M) := \{ u\in L^2_{\text {b}}(M) :V_1\dots V_j u \in L^2_{\text {b}}(M),\ 0\le j\le k,\ V_\ell \in \mathscr {V}\}; \end{aligned}$$

the norm on this space is the sum of the \(L^2_{\text {b}}(M)\)-norms of u and its up to k-fold derivatives along elements of \(\mathscr {V}\). One defines \(\rho _0^{b_0}\rho _I^{b_I}\rho _+^{b_+}H_{{\text {b}}}^k(M)\) and its norm correspondingly. Note that the vector fields in \(\mathscr {V}\) are required to be tangent to \(I^0\), \(\mathscr {I}^+\), and \(I^+\), but not to \(\Sigma \); thus, we measure standard Sobolev regularity near \(\Sigma \), and b- (conormal) regularity at \(I^0\), \(\mathscr {I}^+\), and \(I^+\). (Thus, our space \(H_{{\text {b}}}^k(M)\) would be denoted \(\bar{H}_{{\text {b}}}^k(M)\) in the notation of [56, Appendix B]). Due to the compactness of M, any two choices of collections \(\mathscr {V}\) and boundary defining functions \(\rho _0,\rho _I,\rho _+\) give rise to the same b-Sobolev space, up to equivalence of norms. (For instance, any other defining function \(\rho _0'\) of \(I^0\) is related to \(\rho _0\) by \(\rho _0'=a\rho _0\) where \(0<a\in \mathcal {C}^\infty (M)\) (and thus by compactness of M, \(C^{-1}\le a\le C\) for some \(C>1\)); the equality of the weighted spaces defined using \(\rho _0\) or \(\rho _0'\) is then a consequence of the fact that multiplication by \(a^{b_0}\), or in fact by any smooth nonzero function on M with smooth reciprocal, is an isomorphism on \(H_{{\text {b}}}^k(M)\)). The space \(H_{{\text {b}}}^\infty (M)=\bigcap _{k\ge 1}H_{{\text {b}}}^k(M)\) and its weighted analogues have natural Fréchet space structures; we refer to their elements as conormal functions. We shall also use function spaces with infinitely decaying weights, so for instance

$$\begin{aligned} \rho _I^\infty H_{{\text {b}}}^k(M) := \bigcap _{b_I\in \mathbb {R}}\rho _I^{b_I}H_{{\text {b}}}^k(M), \end{aligned}$$
(2.29)

as well as spaces of the form

$$\begin{aligned} \rho _I^{b_I-0}H_{{\text {b}}}^k(M) := \bigcap _{\epsilon >0}\rho _I^{b_I-\epsilon }H_{{\text {b}}}^k(M), \end{aligned}$$

similarly for spaces with more weights.

Weighted b-Sobolev spaces of sections of vector bundles on M are defined using local trivializations. We will in particular use the space

$$\begin{aligned} H_{{\text {b}}}^{k;b_0,b_I,b_+}(E)\equiv H_{{\text {b}}}^{k;b_0,b_I,b_+}(M;E) := \rho _0^{b_0}\rho _I^{b_I}\rho _+^{b_+}H_{{\text {b}}}^k(M;E), \end{aligned}$$
(2.30)

with E denoting the trivial bundle \(\underline{\mathbb {C}}{}:=M\times \mathbb {C}\rightarrow M\), or \(E=\beta ^*{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\), or \(E=\beta ^*S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\). When the bundle E is clear from the context, we will simply write \(H_{{\text {b}}}^{k;b_0,b_I,b_+}\). When estimating error terms, we will often use the inclusion

$$\begin{aligned} \mathcal {C}^\infty (\overline{\mathbb {R}^4}) \subset \mathcal {C}^\infty (M) \subset H_{{\text {b}}}^{\infty ;-0,-0,-0} := \bigcap _{\epsilon >0}H_{{\text {b}}}^{\infty ;-\epsilon ,-\epsilon ,-\epsilon }. \end{aligned}$$
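The loss of \(\epsilon \) in this inclusion cannot be removed, as a one-dimensional model computation suggests. As an illustration only, assume that near a boundary hypersurface \(\{\rho =0\}\) the space \(L^2_{\text {b}}\) is defined with respect to the b-density \(\frac{d\rho }{\rho }\) (the standard convention; the density is not restated in this section). For the constant function 1 on \([0,1)_\rho \), one then has

$$\begin{aligned} \Vert \rho ^\epsilon \cdot 1\Vert _{L^2_{\text {b}}([0,1))}^2 = \int _0^1 \rho ^{2\epsilon }\,\frac{d\rho }{\rho } = \frac{1}{2\epsilon },\qquad (\rho \partial _\rho )^j\rho ^\epsilon = \epsilon ^j\rho ^\epsilon , \end{aligned}$$

which is finite for every \(\epsilon >0\) but diverges as \(\epsilon \rightarrow 0\). Thus \(1\in \rho ^{-\epsilon }H_{{\text {b}}}^\infty \) for all \(\epsilon >0\) in this model, while \(1\notin L^2_{\text {b}}\).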

For the last part of Theorem 1.1, we need to define the notion of polyhomogeneity (or \(\mathcal {E}\)-smoothness) and discuss its basic properties; see [84, §2A] and [87, §4.15] for detailed accounts and proofs. An index set is a discrete subset \(\mathcal {E}\subset \mathbb {C}\times \mathbb {N}_0\) such that

$$\begin{aligned}&(z,j)\in \mathcal {E}\ \Longrightarrow \ (z,j')\in \mathcal {E}\ \ \forall \,j'\le j; \end{aligned}$$
(2.31a)
$$\begin{aligned}&(z_\ell ,j_\ell )\in \mathcal {E},\ |z_\ell |+j_\ell \rightarrow \infty \ \Longrightarrow \ {\text {Im}}z_\ell \rightarrow -\infty ; \end{aligned}$$
(2.31b)
$$\begin{aligned}&(z,j)\in \mathcal {E}\ \Longrightarrow \ (z-i,j)\in \mathcal {E}. \end{aligned}$$
(2.31c)

We shall write

$$\begin{aligned} {\text {Im}}\mathcal {E}< c\ :\Longleftrightarrow \ {\text {Im}}z<c\ \ \forall \,(z,k)\in \mathcal {E}, \end{aligned}$$
(2.32)

likewise for the nonstrict inequality sign. Note that by condition (2.31b), every index set \(\mathcal {E}\) has an upper bound \({\text {Im}}\mathcal {E}<C\) for some C; more precisely, if \(\mathcal {E}\) is an index set and \(C'\in \mathbb {R}\), then there are only finitely many points \((z,k)\in \mathcal {E}\) with \({\text {Im}}z>C'\).

Let now X denote a compact manifold with boundary \(\partial X\), and let \(\rho \in \mathcal {C}^\infty (X)\) be a boundary defining function. The choice of a collar neighborhood \([0,1)_\rho \times \partial X\) makes the vector field \(\rho D_\rho =\frac{1}{i}\rho \partial _\rho \) well-defined, and any two choices of collars give the same vector field \(\rho D_\rho \) modulo elements of \(\rho \mathcal {V}_{\text {b}}(X)\). Let \(\mathcal {E}\) be an index set. The space \(\mathcal {A}_{\text {phg}}^\mathcal {E}(X)\) then consists of all \(u\in \rho ^{-\infty }H_{{\text {b}}}^\infty (X)=\bigcup _{N\in \mathbb {R}}\rho ^NH_{{\text {b}}}^\infty (X)\) for which

$$\begin{aligned} \prod _{\genfrac{}{}{0.0pt}{}{(z,j)\in \mathcal {E}}{{\text {Im}}z\ge -N}} (\rho D_\rho -z)u\in \rho ^NH_{{\text {b}}}^\infty (X)\ \ \text{ for } \text{ all }\ N\in \mathbb {R}; \end{aligned}$$
(2.33)

equivalently, there exist \(a_{(z,j)}\in \mathcal {C}^\infty (X)\), \((z,j)\in \mathcal {E}\), such that

$$\begin{aligned} u - \sum _{\genfrac{}{}{0.0pt}{}{(z,j)\in \mathcal {E}}{{\text {Im}}z\ge -N}} \rho ^{i z}(\log \rho )^j a_{(z,j)} \in \rho ^NH_{{\text {b}}}^\infty (X). \end{aligned}$$
(2.34)
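The equivalence of (2.33) and (2.34) rests on an elementary computation for the model terms, recorded here as a sketch. Since \(\rho D_\rho =\frac{1}{i}\rho \partial _\rho \), one finds for \(z\in \mathbb {C}\) and \(j\in \mathbb {N}_0\)

$$\begin{aligned} (\rho D_\rho -z)\bigl (\rho ^{i z}(\log \rho )^j\bigr ) = -i\,j\,\rho ^{i z}(\log \rho )^{j-1} \end{aligned}$$

(the right hand side being 0 for \(j=0\)), so each factor lowers the power of the logarithm by one, and \((\rho D_\rho -z)^{j+1}\) annihilates \(\rho ^{i z}(\log \rho )^j\). By (2.31a), the product in (2.33) contains the factor \(\rho D_\rho -z\) once for each of \((z,0),\ldots ,(z,j)\in \mathcal {E}\), which is the number of factors needed to remove all terms \(\rho ^{i z}(\log \rho )^{j'}\), \(j'\le j\). If a coefficient \(a_{(z,j)}\) depends on \(\rho \), its Taylor expansion at \(\rho =0\) produces the exponents \(z-i n\), \(n\in \mathbb {N}_0\); this is one way to see why condition (2.31c) is natural.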

(Condition (2.31c) ensures that the space \(\mathcal {A}_{\text {phg}}^\mathcal {E}(X)\) thus defined is independent of the choice of \(\rho D_\rho \).) In particular, \(u\in \rho ^{-{\text {Im}}\mathcal {E}-0}H_{{\text {b}}}^\infty (X)\). When no confusion can arise, we write

$$\begin{aligned} (a,k) := \{(a-i n,j):n\in \mathbb {N}_0,\,0\le j\le k\},\ \ a:=(a,0). \end{aligned}$$
(2.35)

For example, \(\mathcal {A}_{\text {phg}}^{-i a}(X)=\rho ^a\mathcal {C}^\infty (X)\). We also recall the notion of the extended union of two index sets \(\mathcal {E}_1\), \(\mathcal {E}_2\), defined by

$$\begin{aligned} \mathcal {E}_1 \overline{\cup }\mathcal {E}_2 = \mathcal {E}_1 \cup \mathcal {E}_2 \cup \{ (z,k) :\exists \,(z,j_\ell )\in \mathcal {E}_\ell ,\ k\le j_1+j_2+1 \}, \end{aligned}$$

so e.g. \(0\overline{\cup }0=(0,1)\), as well as their sum

$$\begin{aligned} \mathcal {E}_1+\mathcal {E}_2 := \{ (z,j) :\exists \,(z_\ell ,j_\ell )\in \mathcal {E}_\ell ,\ z=z_1+z_2,\,j=j_1+j_2 \}; \end{aligned}$$

thus \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_1}(X)\cdot \mathcal {A}_{\text {phg}}^{\mathcal {E}_2}(X)\subset \mathcal {A}_{\text {phg}}^{\mathcal {E}_1+\mathcal {E}_2}(X)\). For \(j\in \mathbb {N}\) and an index set \(\mathcal {E}\), we define

$$\begin{aligned} j\mathcal {E}:=\mathcal {E}+\cdots +\mathcal {E}, \end{aligned}$$

with j summands.
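The operations on index sets above are purely combinatorial, so they can be experimented with on finite truncations. The following is a minimal sketch, not part of the original argument: the function names, the truncation parameters `N` and `im_min`, and the storage of a complex number \(z\) as an exact integer pair \(({\text {Re}}z,{\text {Im}}z)\) are all choices made here for illustration. The truncated sum is exact only for index sets with \({\text {Im}}z\le 0\) throughout, which is assumed.

```python
from itertools import product

def index_set(a, k, N):
    """Truncation of (a, k) from (2.35): all (a - i*n, j) with
    0 <= n <= N and 0 <= j <= k.  A complex number z is stored
    exactly as the integer pair (Re z, Im z)."""
    re, im = a
    return {((re, im - n), j) for n in range(N + 1) for j in range(k + 1)}

def extended_union(E1, E2):
    """Extended union: E1 | E2 together with all (z, k),
    k <= j1 + j2 + 1, whenever (z, j1) in E1 and (z, j2) in E2."""
    out = set(E1) | set(E2)
    for (z1, j1), (z2, j2) in product(E1, E2):
        if z1 == z2:
            out |= {(z1, k) for k in range(j1 + j2 + 2)}
    return out

def index_sum(E1, E2, im_min):
    """Sum E1 + E2, keeping only exponents z with Im z >= im_min.
    Assumes Im z <= 0 on both truncated sets, so nothing is lost."""
    return {((x1 + x2, y1 + y2), j1 + j2)
            for ((x1, y1), j1), ((x2, y2), j2) in product(E1, E2)
            if y1 + y2 >= im_min}

# The example from the text: the extended union of 0 with itself is (0, 1).
zero = index_set((0, 0), 0, 3)
print(extended_union(zero, zero) == index_set((0, 0), 1, 3))  # True

# Logarithms accumulate under sums: (0, 1) + (0, 1) = (0, 2).
one_log = index_set((0, 0), 1, 3)
print(index_sum(one_log, one_log, -3) == index_set((0, 0), 2, 3))  # True
```

The second check reflects the inclusion \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_1}\cdot \mathcal {A}_{\text {phg}}^{\mathcal {E}_2}\subset \mathcal {A}_{\text {phg}}^{\mathcal {E}_1+\mathcal {E}_2}\): multiplying two functions with one logarithm each can produce a squared logarithm.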

If X is a manifold with corners with embedded boundary hypersurfaces \(H_1,\ldots ,H_k\) to each of which is associated an index set \(\mathcal {E}_i\), we define \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_1,\ldots ,\mathcal {E}_k}(X)\) as the space of all \(u\in \rho ^{-\infty }H_{{\text {b}}}^\infty (X)\), with \(\rho \in \mathcal {C}^\infty (X)\) a total boundary defining function, such that for each \(1\le i\le k\), there exist weights \(b_j\in \mathbb {R}\), \(j\ne i\), such that, with \(\rho _i\in \mathcal {C}^\infty (X)\) denoting a defining function of \(H_i\),

$$\begin{aligned} \prod _{\genfrac{}{}{0.0pt}{}{(z,j)\in \mathcal {E}_i}{{\text {Im}}z\ge -N}} (\rho _i D_{\rho _i}-z)u \in \rho _i^N\prod _{j\ne i}\rho _j^{b_j} H_{{\text {b}}}^\infty (X)\ \ \text{ near }\ H_i. \end{aligned}$$

This is equivalent to u admitting an asymptotic expansion at each \(H_i\) as in (2.34), with each \(a_{(z,j)}\) polyhomogeneous with index set \(\mathcal {E}_j\) at each nonempty boundary hypersurface \(H_j\cap H_i\) of \(H_i\).

We shall also need spaces encoding polyhomogeneous behavior at one hypersurface but not others; for brevity, we only discuss this in the case of two boundary hypersurfaces \(H_1,H_2\): for an index set \(\mathcal {E}\) and \(\alpha \in \mathbb {R}\), \(\mathcal {A}^{\mathcal {E},\alpha }_{{\text {phg}},{\text {b}}}\) consists of all \(u\in \rho ^{-\infty }H_{{\text {b}}}^\infty \) such that

$$\begin{aligned} \prod _{\genfrac{}{}{0.0pt}{}{(z,j)\in \mathcal {E}}{{\text {Im}}z\ge -N}} (\rho _1 D_{\rho _1}-z)u\in \rho _1^N\rho _2^\alpha H_{{\text {b}}}^\infty \ \ \text{ near }\ H_1,\ \ \text{ for } \text{ all }\ N\in \mathbb {R}; \end{aligned}$$

this is equivalent to u having an expansion at \(H_1\) with terms \(a_{(z,j)}\in \rho _2^\alpha H_{{\text {b}}}^\infty (H_1)\).

We briefly discuss nonlinear properties of b-Sobolev and polyhomogeneous spaces; for brevity, we work on an n-dimensional compact manifold X with boundary \(\partial X\), and leave the statements of the obvious generalizations to the setting of manifolds with corners to the reader. Thus, if \(s>n/2\), then \(H_{{\text {b}}}^s(X)\) is a Banach algebra, and more generally \(u_1\cdot u_2\in \rho ^{a_1+a_2}H_{{\text {b}}}^s(X)\) if \(u_j\in \rho ^{a_j}H_{{\text {b}}}^s(X)\), \(j=1,2\). Regarding the interaction with polyhomogeneous spaces, if \(\mathcal {E}\) is an index set, then \(\mathcal {A}_{\text {phg}}^\mathcal {E}(X)\cdot \rho ^aH_{{\text {b}}}^s(X)\subset \rho ^{a-e}H_{{\text {b}}}^s(X)\) for all \(a,s\in \mathbb {R}\) when \(e>{\text {Im}}\mathcal {E}\); in the case that \(\mathcal {E}=(a_0,0)\cup \mathcal {E}'\) with \({\text {Im}}\mathcal {E}'<{\text {Im}}a_0\), we may take \(e={\text {Im}}a_0\). One can also take inverses, to the effect that \(u/(1-v)\in H_{{\text {b}}}^s(X)\) provided \(u,v\in H_{{\text {b}}}^s(X)\), \(s>n/2\), and \(v\le C<1\), which follows readily from the corresponding results on \(\mathbb {R}^n\), see e.g. [102, §13.10], by a logarithmic change of coordinates.
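As a simple instance of the multiplication statement (a sketch, restricted to integer \(s\ge 0\)): for \(\mathcal {E}=(0,1)\) in the notation (2.35), the function \(\log \rho \) lies in \(\mathcal {A}_{\text {phg}}^\mathcal {E}\) near \(\partial X\), and \({\text {Im}}\mathcal {E}\le 0\). For any \(e>0\) and \(k\in \mathbb {N}_0\),

$$\begin{aligned} (\rho \partial _\rho )^k(\rho ^e\log \rho ) = e^k\rho ^e\log \rho + k e^{k-1}\rho ^e \end{aligned}$$

is bounded near \(\rho =0\), so multiplication by \(\rho ^e\log \rho \) preserves \(H_{{\text {b}}}^s\), and therefore \(\log \rho \cdot \rho ^aH_{{\text {b}}}^s\subset \rho ^{a-e}H_{{\text {b}}}^s\). The value \(e=0\) is not admissible here since \(\log \rho \) is unbounded; this is consistent with the statement that \(e={\text {Im}}a_0\) is allowed only when the leading term carries no logarithm.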

For comparisons with the Minkowski metric, we study the regularity properties of \(t^{-1}\) on \({}^m\overline{\mathbb {R}^4}\). Define the index set

$$\begin{aligned} \mathcal {E}_{\mathrm{log}}:=\{(-i k,j):k\in \mathbb {N}_0,\ 0\le j\le k\},\ \ \mathcal {E}'_{\mathrm{log}} := \mathcal {E}_{\mathrm{log}}\setminus \{(0,0)\}. \end{aligned}$$
(2.36)

Lemma 2.8

Letting \(U=\overline{\{t>\tfrac{2}{3} r\}}\subset {}^m\overline{\mathbb {R}^4}\), we have

$$\begin{aligned} t^{-1} \in \rho \cdot \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}(U) \subset \rho \,\mathcal {C}^\infty (U)+\rho ^{2-0}H_{{\text {b}}}^\infty (U)\subset \rho ^{1-0}H_{{\text {b}}}^\infty (U), \end{aligned}$$
(2.37)

and \(t^{-1}/\rho \in \mathcal {C}^\infty (U\cap \partial \overline{\mathbb {R}^4})\) is everywhere nonzero.

Definition 2.9

We define \(\rho _t\in \mathcal {C}^\infty ({}^m\overline{\mathbb {R}^4})\) to be any boundary defining function satisfying \(\rho _t/\rho =t^{-1}/\rho \) at \(U\cap \partial {}^m\overline{\mathbb {R}^4}\).

By Lemma 2.8, this fixes \(\rho _t\) in U modulo \(\rho ^2\mathcal {C}^\infty ({}^m\overline{\mathbb {R}^4})\); away from U, \(\rho _t\) is merely well-defined modulo \(\rho \,\mathcal {C}^\infty ({}^m\overline{\mathbb {R}^4})\).

Proof of Lemma 2.8

Using the notation of §2.1, we have \(t^{-1}\in \mathcal {C}^\infty (C_2)\). Thus, it suffices to work in \(C_1\cap \{v>-\tfrac{1}{2}\}\), where we can take \(\rho =r^{-1}\); we then need to prove \(f:=\rho t\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\) and \(f|_{\partial \overline{\mathbb {R}^4}}\ne 0\) there, which implies the claim about \(t^{-1}/\rho =1/f\) as \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\) is closed under multiplication. Note that \(f\in \mathcal {C}^\infty (\mathbb {R}^4)\), and \(f>\tfrac{1}{4}\). Let \(\tilde{\chi }(x)=\chi (x^{-1})\in \mathcal {C}^\infty ((0,\infty );[0,1])\) in the notation (2.5), so \(\tilde{\chi }(x)\equiv 0\), \(x<\tfrac{1}{3}\), and \(\tilde{\chi }(x)\equiv 1\), \(x>\tfrac{1}{2}\), then

$$\begin{aligned} f = 1 + v - 2 m\rho \tilde{\chi }(f) \bigl (\log \rho -\log (1-2 m\rho )\bigr ). \end{aligned}$$
(2.38)

Note that near \(\rho =0\), \(f=\rho t\) is the unique positive function satisfying this equation: indeed, if \(f'\) is another such function, then \(|f-f'|\lesssim \rho |\log \rho |\,|f-f'|\), which forces \(f=f'\) for small \(\rho \). At \(\rho =0\), we have \(f=1+v\). Thus, let \(k\ge 2\) be an integer, and consider the map

$$\begin{aligned} T :\tilde{f} \mapsto - 2 m\rho (\log \rho -\log (1-2 m\rho ))\tilde{\chi }(1+v+\tilde{f}) \end{aligned}$$

on \(\rho ^{1-\delta }H_{{\text {b}}}^k([0,\epsilon _k)_\rho \times (-1/2,5)_v)\), where \(\delta \in (0,1)\) is fixed. Now

$$\begin{aligned} \Vert T(\tilde{f})-T(\tilde{f}')\Vert _{\rho ^{1-\delta }H_{{\text {b}}}^k}\le C_k\Vert \rho \log \rho -\rho \log (1-2 m\rho )\Vert _{H_{{\text {b}}}^k} \Vert \tilde{\chi }\Vert _{\mathcal {C}^k}\Vert \tilde{f}-\tilde{f}'\Vert _{\rho ^{1-\delta }H_{{\text {b}}}^k}; \end{aligned}$$

choosing \(\epsilon _k>0\) sufficiently small, the first norm on the right can be made arbitrarily small. By the contraction mapping principle, this gives \(f-1-v\in \rho ^{1-\delta }H_{{\text {b}}}^\infty \) since k was arbitrary. We can now improve the remainder term by plugging this into (2.38), which gives

$$\begin{aligned} f - \bigl (1 + v - 2 m\tilde{\chi }(1+v)\bigl (\rho \log \rho -\rho \log (1-2 m\rho )\bigr )\bigr ) \in \rho ^{2-\delta }H_{{\text {b}}}^\infty , \end{aligned}$$

so \(f\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}+\rho ^{2-\delta }H_{{\text {b}}}^\infty \). Using that \(\chi \circ (\cdot )\) maps \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\) into itself, as follows from the testing definition (2.33), the desired conclusion follows from an iterative argument. \(\square \)
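The contraction argument in this proof can be illustrated numerically at a single point \((\rho ,v)\). The following is a minimal sketch under stated assumptions: `cutoff` is a hypothetical smooth function standing in for \(\tilde{\chi }\) (the source only specifies where it equals 0 and 1), the iteration is done pointwise rather than in the weighted b-Sobolev space used in the proof, and the sample value of \(v\) is chosen so that \(1+v\) falls in the transition region of the cutoff, where the map \(T\) genuinely depends on \(\tilde{f}\); where \(\tilde{\chi }\equiv 1\), the map is constant and the iteration stops after one step.

```python
import math

def cutoff(x):
    """Hypothetical smooth cutoff standing in for chi-tilde:
    equal to 0 for x <= 1/3 and to 1 for x >= 1/2."""
    s = (x - 1 / 3) / (1 / 2 - 1 / 3)
    if s <= 0:
        return 0.0
    if s >= 1:
        return 1.0
    g = lambda y: math.exp(-1 / y)
    return g(s) / (g(s) + g(1 - s))

def T(f_tilde, rho, v, m):
    """The map T from the proof of Lemma 2.8, evaluated at one point."""
    log_term = math.log(rho) - math.log(1 - 2 * m * rho)
    return -2 * m * rho * log_term * cutoff(1 + v + f_tilde)

def solve_f(rho, v, m, steps=80):
    """Fixed-point iteration f_tilde -> T(f_tilde), started at 0.
    Returns f = 1 + v + f_tilde, an approximate solution of (2.38)."""
    f_tilde = 0.0
    for _ in range(steps):
        f_tilde = T(f_tilde, rho, v, m)
    return 1 + v + f_tilde

# Sample values (illustrative only): small rho makes T a contraction.
rho, v, m = 1e-3, -0.58, 1.0
f = solve_f(rho, v, m)
residual = abs(T(f - (1 + v), rho, v, m) - (f - (1 + v)))
print(f, residual)
```

For these sample values, the Lipschitz constant of \(T\) is of size \(2 m\rho |\log \rho |\) times the maximal slope of the cutoff, which is well below 1; this mirrors the smallness of the first norm on the right hand side of the estimate above for small \(\epsilon _k\).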

2.3 Relationships between different compactifications

The only difference between the compactifications \({}^m\overline{\mathbb {R}^4}\) for different values of m is the manner in which a smooth collar neighborhood of \(\partial {}^m\overline{\mathbb {R}^4}\) is glued together with \(\mathbb {R}^4\). Since this difference is small due to the logarithmic correction in (2.5) being only of size \(r^{-1}\log r\), different compactifications are closely related; see also [14, §7]. Indeed:

Lemma 2.10

The identity map \(\mathbb {R}^4\rightarrow \mathbb {R}^4\) induces a homeomorphism \(\phi :{}^m\overline{\mathbb {R}^4}\rightarrow {}^0\overline{\mathbb {R}^4}\), which in fact is a polyhomogeneous diffeomorphism with index set \(\mathcal {E}_{\mathrm{log}}\); that is, in smooth local coordinate systems near \(\partial {}^m\overline{\mathbb {R}^4}\) and \(\partial {}^0\overline{\mathbb {R}^4}\), the components of both \(\phi \) and \(\phi ^{-1}\) are real-valued functions on \([0,\infty )\times \mathbb {R}^3\) of class \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\). Moreover, \(\phi \) induces a smooth diffeomorphism \(\partial {}^m\overline{\mathbb {R}^4}\cong \partial {}^0\overline{\mathbb {R}^4}\), which restricts to \({}^m\beta ({}^m I^+)\cong {}^0\beta ({}^0 I^+)\), and also induces a smooth diffeomorphism \({}^m I^+\cong {}^0 I^+\).

Proof

We have \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\subset \mathcal {C}^\infty +\rho ^{1-0}H_{{\text {b}}}^\infty \subset \mathcal {C}^0\), so it suffices to prove the polyhomogeneity statement. Defining the smooth coordinates \(\rho \) and v as in (2.5), and the corresponding smooth coordinates \({}^0\rho =r^{-1}\) and \({}^0 v=r^{-1}(t-r)\) on \({}^0\overline{\mathbb {R}^4}\), we then observe that \({}^0\rho =\rho \), while in the notation of equation (2.38), we established that \(1+{}^0 v=f\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\) on \({}^m\overline{\mathbb {R}^4}\), giving the desired conclusion for \(\phi \). For \(\phi ^{-1}\), we write \(v={}^0 v-r^{-1}\chi (t/r)2 m\log (r-2 m)\) and note that \(t/r\in \mathcal {C}^\infty ({}^0\overline{\mathbb {R}^4})\). For the last claim, we observe that

$$\begin{aligned} v={}^0 v\ \ \text{ at }\ \partial {}^m\overline{\mathbb {R}^4} \end{aligned}$$
(2.39)

under the identification with \(\partial {}^0\overline{\mathbb {R}^4}\) given by \(\phi \). This also shows that the sets \({}^m\beta ({}^m I^+)=\{v\ge 0\}\) and \({}^0\beta ({}^0 I^+)=\{{}^0 v\ge 0\}\) are diffeomorphic. On \({}^mM\), resp. \({}^0M\) then, v, resp. \({}^0 v\), are local defining functions of the boundaries \(\partial {}^m I^+\), resp. \(\partial {}^0 I^+\), hence by (2.39), the identification \({}^m I^+\cong {}^0 I^+\) in the interior of \({}^m I^+\) indeed extends smoothly to its boundary. \(\square \)

In a similar vein, the scattering (co)tangent bundles can be naturally identified over the boundary:

Lemma 2.11

The identity map \(T^*\mathbb {R}^4\rightarrow T^*\mathbb {R}^4\) extends by continuity to a continuous bundle map \({}^{{\text {sc}}}T^*\,{}^m\overline{\mathbb {R}^4}\rightarrow {}^{{\text {sc}}}T^*\,{}^0\overline{\mathbb {R}^4}\) which restricts to a smooth bundle isomorphism over the boundary.

Proof

Since away from \(r=0\), \(\langle d r\rangle \) and \(r\,T^*\mathbb {S}^2\) are smooth subbundles of \({}^{{\text {sc}}}T^*\,{}^m\overline{\mathbb {R}^4}\) for any m, it suffices to show that \(d t\), which is a smooth section of \({}^{{\text {sc}}}T^*\,{}^0\overline{\mathbb {R}^4}\), extends by continuity from \(\mathbb {R}^4\) to \(\partial {}^m\overline{\mathbb {R}^4}\) and restricts to a smooth section of \({}^{{\text {sc}}}T^*_{\partial {}^m\overline{\mathbb {R}^4}}{}^m\overline{\mathbb {R}^4}\). By Lemma 2.8, we have \(t=\rho ^{-1}f\), \(f\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\), so \(d t=f\,d(\rho ^{-1})+\rho ^{-1}d f\); but \(f|_{\partial {}^m\overline{\mathbb {R}^4}}\) is indeed smooth, while in a local product neighborhood \([0,1)_\rho \times \mathbb {R}_X^3\) of a point in \(\partial {}^m\overline{\mathbb {R}^4}\), \(\rho ^{-1}d f=(\rho \partial _\rho f)\frac{d\rho }{\rho ^2}+(\partial _X f)\frac{d X}{\rho }\) restricts to the smooth scattering 1-form \((\partial _X f)\frac{d X}{\rho }\) on \(\partial {}^m\overline{\mathbb {R}^4}\). \(\square \)
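For orientation, the vanishing of the \(\frac{d\rho }{\rho ^2}\)-component at the boundary can be checked by hand in the region where \(\tilde{\chi }(f)\equiv 1\), so that (2.38) gives \(f=1+v-2 m\rho (\log \rho -\log (1-2 m\rho ))\) explicitly. In the coordinates \((\rho ,v,\omega )\), a direct computation (a sketch, valid only in this region) yields

$$\begin{aligned} \rho \partial _\rho f = -2 m\rho \Bigl (\log \rho -\log (1-2 m\rho )+\frac{1}{1-2 m\rho }\Bigr ),\qquad \partial _v f = 1,\qquad \partial _\omega f = 0, \end{aligned}$$

so \(\rho \partial _\rho f\) is of size \(\rho |\log \rho |\) and vanishes at \(\rho =0\), while \(\rho ^{-1}d f\) restricts to \(\frac{d v}{\rho }\) at the boundary.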

Let us discuss this on the level of function spaces. The map \(\phi \) in Lemma 2.10 induces \(\mathcal {C}^\infty ({}^m\overline{\mathbb {R}^4})\subset \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}({}^0\overline{\mathbb {R}^4})\) and vice versa. Moreover, it induces an isomorphism

$$\begin{aligned} ({}^m\rho )^\alpha H_{{\text {b}},{\text {loc}}}^s({}^m\overline{\mathbb {R}^4})\cong ({}^0\rho )^\alpha H_{{\text {b}},{\text {loc}}}^s({}^0\overline{\mathbb {R}^4}),\ \ s,\alpha \in \mathbb {R}, \end{aligned}$$
(2.40)

as follows from \(\phi \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\). The corresponding statement is not quite true on the blown-up spaces \({}^mM\), the failure happening at \({}^m\mathscr {I}^+\); there, let us use

$$\begin{aligned} {}^m\rho ={}^0\rho =r^{-1}; \quad {}^m v={}^0 v-2 m({}^0\rho )\log (({}^0\rho )^{-1}-2 m),\ {}^0 v=r^{-1}(t-r). \end{aligned}$$

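Now, the b-tangent bundle on \({}^0M\) is spanned near \({}^0\mathscr {I}^+\) by spherical derivatives together with the vector fields \({}^0\rho \partial _{{}^0\rho }\), \({}^0 v\partial _{{}^0 v}\), and \({}^0\rho \partial _{{}^0 v}\), where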

$$\begin{aligned} {}^0\rho \partial _{{}^0\rho } \in {}^m\rho \partial _{{}^m\rho } + \mathcal {A}_{\text {phg}}^{\mathcal {E}'_{\mathrm{log}}}\cdot \partial _{{}^m v},\ \ {}^0 v\partial _{{}^0 v} \in \bigl ({}^m v+\mathcal {A}_{\text {phg}}^{\mathcal {E}'_{\mathrm{log}}}\bigr )\partial _{{}^m v}, \end{aligned}$$

and \({}^0\rho \partial _{{}^0 v}={}^m\rho \partial _{{}^m v}\); due to the logarithmic loss at \(\mathscr {I}^+\), we thus only have

$$\begin{aligned} ({}^m\rho _0)^{b_0}({}^m\rho _I)^{b_I}({}^m\rho _+)^{b_+}H_{{\text {b}},{\text {loc}}}^s({}^mM) \subset ({}^0\rho _0)^{b_0}({}^0\rho _I)^{b_I-\epsilon }({}^0\rho _+)^{b_+}H_{{\text {b}},{\text {loc}}}^s({}^0M) \end{aligned}$$

for all \(\epsilon >0\), but the inclusion fails for \(\epsilon =0\). That is, conormal function spaces are the same on \({}^mM\) and \({}^0M\) up to an arbitrarily small loss in the weight at \(\mathscr {I}^+\).
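The relation for \({}^0\rho \partial _{{}^0\rho }\) stated above can be verified directly (a sketch using only the coordinate change given before it). Writing \(\rho :={}^m\rho ={}^0\rho \) and \({}^m v={}^0 v-2 m\rho \log (\rho ^{-1}-2 m)\), the chain rule gives \({}^0\rho \partial _{{}^0\rho }={}^m\rho \partial _{{}^m\rho }+\bigl (\rho \,\partial _\rho ({}^m v)\bigr )\partial _{{}^m v}\), where the derivative is taken at fixed \({}^0 v\) and

$$\begin{aligned} \rho \,\partial _\rho ({}^m v) = 2 m\rho \Bigl (\log \rho -\log (1-2 m\rho )+\frac{1}{1-2 m\rho }\Bigr ). \end{aligned}$$

The coefficient is the sum of \(2 m\rho \log \rho \) and \(\rho \) times a smooth function, hence lies in \(\mathcal {A}_{\text {phg}}^{\mathcal {E}'_{\mathrm{log}}}\); the unbounded factor \(\log \rho \) multiplying \(\rho \partial _{{}^m v}\) is the source of the loss in the weight.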

Polyhomogeneous spaces on \({}^m\overline{\mathbb {R}^4}\) for different values of m are related in a simple manner: if \(\mathcal {E}\subset \mathbb {C}\times \mathbb {N}_0\) is an index set and \(\mathcal {E}_{\mathrm{log}}\) is given by (2.36), then \(\phi \) induces inclusions

$$\begin{aligned} \mathcal {A}_{\text {phg}}^\mathcal {E}({}^m\overline{\mathbb {R}^4}) \hookrightarrow \mathcal {A}_{\text {phg}}^{\mathcal {E}+\mathcal {E}_{\mathrm{log}}}({}^0\overline{\mathbb {R}^4}),\ \ \mathcal {A}_{\text {phg}}^\mathcal {E}({}^0\overline{\mathbb {R}^4}) \hookrightarrow \mathcal {A}_{\text {phg}}^{\mathcal {E}+\mathcal {E}_{\mathrm{log}}}({}^m\overline{\mathbb {R}^4}); \end{aligned}$$
(2.41)

this is only nontrivial where the two compactifications differ, that is, away from \(r=0\), where \(r^{-1}\) serves as a boundary defining function for both \({}^0\overline{\mathbb {R}^4}\) and \({}^m\overline{\mathbb {R}^4}\). Considering a single term \(r^{-i z}(\log r)^k f({}^m v,\omega )\), with \(\omega \in \mathbb {S}^2\) and f smooth, in the expansion of an element of \(\mathcal {A}_{\text {phg}}^\mathcal {E}({}^m\overline{\mathbb {R}^4})\), the first inclusion in (2.41) follows from \(f\circ \phi \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}({}^0\overline{\mathbb {R}^4})\), which in turn can be seen by Taylor expanding \(f({}^0 v-2 m({}^0\rho )\log (({}^0\rho )^{-1}-2 m),\omega )\) in the first argument around \({}^0 v\). The proof of the second inclusion is similar. See [14, Proposition 7.8] for an alternative argument.

Polyhomogeneity on different spaces \({}^mM\) on the other hand is much less well-behaved: for instance, a function \(u\in \mathcal {C}^\infty ({}^mM)\) compactly supported near a point in \(({}^m\mathscr {I}^+)^\circ \), \(m>0\), so \(u\in \mathcal {A}_{\text {phg}}^{\emptyset ,0,\emptyset }({}^mM)\), is not polyhomogeneous on \({}^0M\): it vanishes near \(({}^0\mathscr {I}^+)^\circ \) and \(({}^0 I^+)^\circ \), but is nontrivial at the corner \({}^0\mathscr {I}^+\cap {}^0 I^+\).

2.4 Bundles and connections near null infinity

In the energy estimate (1.19) for the toy problem (1.18), derivatives of u along vector fields tangent to the fibers of \(\beta :\mathscr {I}^+\rightarrow S^+\) are better controlled than general b-derivatives. In this section, we introduce analytic structures on the blow-up M of \(\overline{\mathbb {R}^4}\) capturing this in an invariant manner.

Definition 2.12

For vector bundles \(E_j\rightarrow \overline{\mathbb {R}^4}\), \(j=1,2\), let

$$\begin{aligned} \mathcal {M}_{\beta ^*E_1,\beta ^*E_2} \subset \text {Diff}_{\text {b}}^1(M;\beta ^*E_1,\beta ^*E_2) \end{aligned}$$

denote the \(\mathcal {C}^\infty (M)\)-module of all first order b-differential operators A which satisfy the following condition near \(\mathscr {I}^+\): if \(E_j\cong \mathcal {U}\times \mathbb {C}^{k_j}\), \(j=1,2\), is a local trivialization of \(E_j\), with \(\mathcal {U}\subset \overline{\mathbb {R}^4}\) a neighborhood of \(S^+\), see (2.12), and we pull these trivializations back to \(\beta ^*E_j\cong \beta ^{-1}(\mathcal {U})\times \mathbb {C}^{k_j}\), then \(A=V+f\), where V is a \(k_2\times k_1\) matrix of vector fields \(V_{i j}\in \mathcal {V}_{\text {b}}(M)\) which are tangent to the fibers of \(\beta \), and \(f\in \mathcal {C}^\infty (M)^{k_2\times k_1}\). Let moreover

$$\begin{aligned} {}^0\mathcal {M}_{\beta ^*E_1,\beta ^*E_2}\subset \mathcal {M}_{\beta ^*E_1,\beta ^*E_2} \end{aligned}$$

denote the submodule for which \(f|_{\mathscr {I}^+}=0\).

For a single vector bundle \(E\rightarrow \overline{\mathbb {R}^4}\), we write \({}^{(0)}\mathcal {M}_{\beta ^*E}:={}^{(0)}\mathcal {M}_{\beta ^*E,\beta ^*E}\). Whenever the bundle E is clear from the context, we shall simply write \({}^{(0)}\mathcal {M}:={}^{(0)}\mathcal {M}_{\beta ^*E}\). For \(k\in \mathbb {N}\), we write \(\mathcal {M}^k\subset \text {Diff}_{\text {b}}^k\) for sums of k-fold products of elements of \(\mathcal {M}\).

It is easy to check that the definition of \(\mathcal {M}_{\beta ^*E_1,\beta ^*E_2}\) is independent of the choice of local trivializations; for \({}^0\mathcal {M}\), this is true as well, since vector fields tangent to the fibers of \(\beta \) annihilate the matrices for changes of frames of \(E_1\) and \(E_2\) which lift to be constant along the fibers of \(\beta \). We make some elementary observations:

Lemma 2.13

We have:

  (1) \(\rho _I\text {Diff}_{\text {b}}^1(M;\beta ^*E)\subset {}^0\mathcal {M}_{\beta ^*E}\subset \mathcal {M}_{\beta ^*E}\);

  (2) if \(A,B\in \mathcal {M}_{\beta ^*E}\), and A has a scalar principal symbol, then \([A,B]\in \mathcal {M}_{\beta ^*E}\). Strengthening the assumption to \(A,B\in {}^0\mathcal {M}_{\beta ^*E}\), we have \([A,B]\in {}^0\mathcal {M}_{\beta ^*E}\);

  (3) there is a well-defined map

    $$\begin{aligned} \mathcal {M}_{\underline{\mathbb {C}}{}}\ni A\mapsto A\otimes {\text {Id}}\in \mathcal {M}_{\beta ^*E}/\rho _I\,\mathcal {C}^\infty (M;{\text {End}}(\beta ^*E)). \end{aligned}$$

Proof

(1) and (2) are clear from the definition. The map in (3) is given in a local trivialization \(E\cong \mathcal {U}\times \mathbb {C}^k\) of E near \(S^+\) as \(A\cdot {\text {Id}}_{k\times k}\in \text {Diff}_{\text {b}}^1(M)^{k\times k}\); the transition function between two different trivializations is given by \(C\in \mathcal {C}^\infty (\mathcal {U};\mathbb {C}^{k\times k})\), which pulls back to M to be constant along the fibers of \(\beta \); but then \(C^{-1}(A\cdot {\text {Id}}_{k\times k})C-(A\cdot {\text {Id}}_{k\times k})=C^{-1}A(C)\in \mathcal {C}^\infty (M;\mathbb {C}^{k\times k})\), with A acting component-wise, vanishes on \(\mathscr {I}^+\) by definition of \(\mathcal {M}_{\underline{\mathbb {C}}{}}\). \(\square \)

In local coordinates \([0,\epsilon _0)_{\rho _0}\times [0,\epsilon _0)_{\rho _I}\times \mathbb {R}_{x^2 x^3}^2\) near \(I^0\cap \mathscr {I}^+\) as in (1.17), with \(\mathbb {R}^2\) a local coordinate patch on \(\mathbb {S}^2\), elements of \(\mathcal {M}_{\underline{\mathbb {C}}{}}\) are linear combinations of \(\rho _0\partial _{\rho _0}\), \(\rho _I\partial _{\rho _I}\), and \(\rho _I\partial _{x^a}\), \(a=2,3\), plus smooth functions. We thus see that \({}^{(0)}\mathcal {M}_{\underline{\mathbb {C}}{}}\) is generated over \(\mathcal {C}^\infty (M)\) by \((\rho _I)\mathcal {C}^\infty (M)\) and lifts of elements \(V\in \mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\) which vanish at \(S^+\) as incomplete vector fields, i.e. \(V|_{S^+}=0\in T_{S^+}\overline{\mathbb {R}^4}\). (This should be compared to the larger space \(\mathcal {V}_{\text {b}}(M)\), which is generated by lifts of elements \(V\in \mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\) which are merely tangent to \(S^+\)). Note that by (2.28), we have

$$\begin{aligned} \rho ^{-1}\partial _0,\ \rho _0^{-1}\rho _+^{-1}\partial _1 \in {}^0\mathcal {M}_{\underline{\mathbb {C}}{}}; \end{aligned}$$
(2.42)

for a fixed choice of \(\rho \), the operators \(\rho ^{-1}\partial _0\) and \(\partial _1\) acting on sections of any bundle \(\beta ^*E\) are therefore well-defined, modulo \(\rho _I\,\mathcal {C}^\infty \) and \(\rho _0\rho _I\rho _+\mathcal {C}^\infty \) valued in \({\text {End}}(\beta ^*E)\), respectively.
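In the local coordinates near \(I^0\cap \mathscr {I}^+\) used above, the commutator statement of Lemma 2.13(2) can be illustrated on the generators of \(\mathcal {M}_{\underline{\mathbb {C}}{}}\) (a sketch for scalar operators without zeroth order terms):

$$\begin{aligned} {[}\rho _I\partial _{\rho _I},\rho _I\partial _{x^a}]=\rho _I\partial _{x^a},\quad [\rho _0\partial _{\rho _0},\rho _I\partial _{x^a}]=0,\quad [\rho _0\partial _{\rho _0},\rho _I\partial _{\rho _I}]=0,\quad [\rho _I\partial _{x^a},\rho _I\partial _{x^b}]=0. \end{aligned}$$

Each right hand side again lies in \(\mathcal {M}_{\underline{\mathbb {C}}{}}\). By contrast, \(\partial _{x^a}\in \mathcal {V}_{\text {b}}(M)\) is not an element of \(\mathcal {M}_{\underline{\mathbb {C}}{}}\), since at \(\mathscr {I}^+\) it is not tangent to the fibers of \(\beta \).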

The modules defined above are closely related to a natural subbundle of \({}^{{\text {b}}}T_{\mathscr {I}^+}M\):

Definition 2.14

Denote by

$$\begin{aligned} {}^{\beta }T_{\mathscr {I}^+}M\subset {}^{{\text {b}}}T_{\mathscr {I}^+}M \end{aligned}$$

the rank 2 subbundle generated by all \(V\in {}^{{\text {b}}}T_{\mathscr {I}^+}M\) which are tangent to the fibers of \(\beta \), see (2.13), and let \({}^{\beta }TM\) be any smooth rank 2 extension of \({}^{\beta }T_{\mathscr {I}^+}M\) to a neighborhood of \(\mathscr {I}^+\). Let then

$$\begin{aligned} ({}^{\beta }TM)^\perp := \{ \alpha \in {}^{{\text {b}}}T^* M :\alpha (V)=0\ \text{ for } \text{ all }\ V\in {}^{\beta }TM \} \subset {}^{{\text {b}}}T^*M \end{aligned}$$

denote the annihilator of \({}^{\beta }TM\) in \({}^{{\text {b}}}T^*M\).

Near \(I^0\cap \mathscr {I}^+\), we can for instance take \({}^{\beta }TM\subset {}^{{\text {b}}}TM\) to be the subbundle whose fibers are spanned by \(\rho _I\partial _{\rho _I}\) and \(\rho _0\partial _{\rho _0}\).

Remark 2.15

Another equivalent characterization of \(\mathcal {M}\) is that the principal symbols of its elements vanish on \(({}^{\beta }T_{\mathscr {I}^+}M)^\perp \). We also note that for \(p\in \mathscr {I}^+\), there is a natural isomorphism

$$\begin{aligned} ({}^{\beta }TM)^\perp _p \cong T^*_{\beta (p)}S^+. \end{aligned}$$
(2.43)

Indeed, given \(V\in {}^{{\text {b}}}T_p M\), note that \(\beta _*V\in {}^{{\text {b}}}T_{S^+}\overline{\mathbb {R}^4}\) is tangent to \(S^+\), hence has a well-defined image in \(T_{\beta (p)} S^+\); and \(V\in {}^{\beta }T_p M\) is precisely the condition that this image be 0. Thus, the isomorphism (2.43) is obtained by mapping \(\eta \in T^*_{\beta (p)}S^+\) to \({}^{{\text {b}}}T_p M\ni V\mapsto \eta (\beta _*V)\).
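In the coordinates \((\rho _0,\rho _I,x^2,x^3)\) near \(I^0\cap \mathscr {I}^+\), and with \({}^{\beta }TM\) chosen to be spanned by \(\rho _I\partial _{\rho _I}\) and \(\rho _0\partial _{\rho _0}\), this can be made explicit. As a sketch, assume (as the coordinate description suggests) that \(x^2,x^3\) are pulled back from local coordinates on \(S^+\cong \mathbb {S}^2\). Since \({}^{{\text {b}}}T^*M\) is spanned by \(\frac{d\rho _0}{\rho _0}\), \(\frac{d\rho _I}{\rho _I}\), \(d x^2\), \(d x^3\), one finds

$$\begin{aligned} ({}^{\beta }TM)^\perp = {\text {span}}\{d x^2,d x^3\}, \end{aligned}$$

and under (2.43), \(d x^a\) corresponds to the coordinate differential \(d x^a\) on \(S^+\).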

Using this subbundle, we have

$$\begin{aligned} \mathcal {M}_{\underline{\mathbb {C}}{}} = \mathcal {C}^\infty (M;{}^{\beta }TM+\rho _I{}^{{\text {b}}}TM) + \mathcal {C}^\infty (M) \subset \text {Diff}_{\text {b}}^1(M), \end{aligned}$$

where we write

$$\begin{aligned} \mathcal {C}^\infty (M;{}^{\beta }TM+\rho _I{}^{{\text {b}}}TM) := \mathcal {C}^\infty (M;{}^{\beta }TM) + \rho _I\,\mathcal {C}^\infty (M;{}^{{\text {b}}}TM). \end{aligned}$$
(2.44)

Note here that the sum of the first two spaces on the right is globally well-defined on M even though we only defined \({}^{\beta }TM\) in a neighborhood of \(\mathscr {I}^+\): this is due to \({}^{\beta }TM\subset {}^{{\text {b}}}TM\). The general modules \(\mathcal {M}_{\beta ^*E_1,\beta ^*E_2}\) have a completely analogous description obtained by tensoring the bundles with \({\text {Hom}}(\beta ^*E_1,\beta ^*E_2)\).

We next prove some lemmas allowing us to phrase energy estimates for bundle-valued waves invariantly.

Lemma 2.16

Let \(E\rightarrow \overline{\mathbb {R}^4}\) be a vector bundle, and let \(d^E\in \text {Diff}^1(\overline{\mathbb {R}^4};E,T^*\overline{\mathbb {R}^4}\otimes E)\) be a connection. Then \(d^E\) induces a b-connection, i.e. a differential operator

$$\begin{aligned} d^E \in \text {Diff}_{\text {b}}^1(M;\beta ^*E,{}^{{\text {b}}}T^*M \otimes \beta ^*E), \end{aligned}$$
(2.45)

on \(\beta ^*E\rightarrow M\). If \(\tilde{d}^E\) is another connection on E, then, with notation analogous to (2.44),

$$\begin{aligned} d^E-\tilde{d}^E \in \mathcal {C}^\infty \bigl (M;(({}^{\beta }TM)^\perp +\rho _I{}^{{\text {b}}}T^*M)\otimes {\text {End}}(\beta ^*E)\bigr ). \end{aligned}$$
(2.46)

Proof

Fix a local frame \(e^i\) of E; then for \(u_i\in \dot{\mathcal {C}}^\infty (M)\subset \dot{\mathcal {C}}^\infty (\overline{\mathbb {R}^4})\), we have

$$\begin{aligned} d^E(u_i e^i) = d u_i \otimes e^i + u_i\,d^E e^i. \end{aligned}$$

Now the map \(u_i\mapsto d u_i\) extends to M as the map \(u_i\mapsto {}^{{\text {b}}}du_i\), with \({}^{{\text {b}}}d\in \text {Diff}_{\text {b}}^1(M;\underline{\mathbb {C}}{},{}^{{\text {b}}}T^*M)\); and \(f^i:=d^E e^i\in \mathcal {C}^\infty (\overline{\mathbb {R}^4};T^*\overline{\mathbb {R}^4}\otimes E)\) canonically induces \(\beta ^*f^i\in \mathcal {C}^\infty (M;{}^{{\text {b}}}T^*M\otimes \beta ^*E)\) by \(\beta ^*f^i(V)=f^i(\beta _*V)\), \(V\in {}^{{\text {b}}}TM\). Therefore, the expression \(d^E(u_i\cdot \beta ^*e^i)={}^{{\text {b}}}du_i\otimes \beta ^*e^i + u_i\cdot \beta ^*f^i\) proves (2.45).

Letting \(\tilde{f}^i:=\tilde{d}^E e^i\), we have \((d^E-\tilde{d}^E)(u_i\cdot \beta ^*e^i)=u_i\cdot (\beta ^*f^i-\beta ^*\tilde{f}^i)\). But \({}^{\beta }T_{\mathscr {I}^+}M\subset \ker \beta _*\), so the bundle map \(d^E-\tilde{d}^E\) annihilates \({}^{\beta }TM\) at \(\mathscr {I}^+\), giving (2.46). \(\square \)
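The membership (2.46) can be made concrete in coordinates. As a sketch only, assume that near \(I^0\cap \mathscr {I}^+\) the blow-down map is given by \(\rho =\rho _0\rho _I\), \(v=-\rho _I\), with \(x^a\) unchanged, where \((\rho ,v,x^a)\) are smooth coordinates on \(\overline{\mathbb {R}^4}\) near \(S^+=\{\rho =0,\ v=0\}\); this projective form is an assumption made here, the precise coordinates being those of (1.17). Then

$$\begin{aligned} \beta ^*(d\rho )=\rho _0\rho _I\Bigl (\frac{d\rho _0}{\rho _0}+\frac{d\rho _I}{\rho _I}\Bigr ),\qquad \beta ^*(d v)=-\rho _I\,\frac{d\rho _I}{\rho _I},\qquad \beta ^*(d x^a)=d x^a. \end{aligned}$$

The first two lie in \(\rho _I\,\mathcal {C}^\infty (M;{}^{{\text {b}}}T^*M)\), and \(d x^a\) annihilates \(\rho _0\partial _{\rho _0}\) and \(\rho _I\partial _{\rho _I}\). Hence the pullback of any smooth \({\text {End}}(E)\)-valued 1-form \(A_\rho \,d\rho +A_v\,d v+A_a\,d x^a\), which is the local form of \(d^E-\tilde{d}^E\), is a section of \((({}^{\beta }TM)^\perp +\rho _I{}^{{\text {b}}}T^*M)\otimes {\text {End}}(\beta ^*E)\).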

Lemma 2.17

In the notation of Lemma 2.16, suppose E is equipped with a fiber metric \(\langle \cdot ,\cdot \rangle _E\), and let

$$\begin{aligned} K \in \mathcal {C}^\infty \bigl (M;(S^2\,{}^{\beta }TM+\rho _I\,S^2\,{}^{{\text {b}}}TM)\otimes {\text {End}}(\beta ^*E)\bigr ). \end{aligned}$$
(2.47)

Moreover, let \(B\in \mathcal {C}^\infty (M;{\text {Hom}}({}^{{\text {b}}}TM,{}^{{\text {b}}}T^*M))\) denote a fiber metric on \({}^{{\text {b}}}TM\). Then, acting on sections of \(\beta ^*E\), we have

$$\begin{aligned} (d^E)^* B K d^E - (\tilde{d}^E)^*B K \tilde{d}^E \in \rho _I\text {Diff}_{\text {b}}^1(M;\beta ^*E), \end{aligned}$$
(2.48)

where we take adjoints with respect to the fiber metrics on \({}^{{\text {b}}}TM\) and E, and any fixed b-density on M. Moreover, if \((d^E)^\dag \) denotes the adjoint with respect to another fiber metric on E, then \((d^E)^\dag B K d^E-(d^E)^*B K d^E\in \rho _I\text {Diff}_{\text {b}}^1(M;\beta ^*E)\).

Note that for K as in (2.47) with both the \(S^2\,{}^{\beta }TM\) and the \(S^2\,{}^{{\text {b}}}TM\) summands positive definite, and adding weights, the pairing \(\langle (d^E)^*B K d^E u,u\rangle \) provides the control on fiber-tangential derivatives of u as in the toy model (1.19), but is weaker by \(\rho _I^{1/2}\) for general b-derivatives; we will take care of this in Definition 4.1. The space in (2.48) will be weak enough to be treated as an error term (similar to the \(\text {Diff}_{\text {b}}\) spaces arising as error terms in Lemma 3.8 below).

Proof of Lemma 2.17

We write the left hand side of (2.48) as

$$\begin{aligned} (d^E)^* B K (d^E-\tilde{d}^E) + (d^E-\tilde{d}^E)^* B K \tilde{d}^E, \end{aligned}$$

with one summand being the adjoint of the other. Now, \((d^E)^*B\in \text {Diff}_{\text {b}}^1(M;{}^{{\text {b}}}TM\otimes \beta ^*E,\beta ^*E)\), while Lemma 2.16 implies

$$\begin{aligned} K(d^E-\tilde{d}^E) \in \rho _I\,\mathcal {C}^\infty (M;{}^{{\text {b}}}TM\otimes {\text {End}}(\beta ^*E)). \end{aligned}$$

This proves (2.48). (Alternatively, one can analyze the second summand directly, using that over \(p\in M\), \(\langle (d^E-\tilde{d}^E)^*(B(V)\otimes e),e'\rangle _E = \langle e, (d^E-\tilde{d}^E)(V\otimes e')\rangle _E\) for \(V\in {}^{{\text {b}}}T_p M\), \(e,e'\in E_{\beta (p)}\)). For the second part, note that the two adjoints are related via \((d^E)^\dag =C^{-1}(d^E)^*C\) for some \(C\in \mathcal {C}^\infty (\overline{\mathbb {R}^4};{\text {End}}(E))\), hence \(\tilde{d}^E:=(d^E)^{\dag *}=d^E+C^*[d^E,(C^{-1})^*]\) is a connection on E, and therefore

$$\begin{aligned} ((d^E)^\dag -(d^E)^*)B K d^E=(\tilde{d}^E-d^E)^*B K d^E \in \rho _I\text {Diff}_{\text {b}}^1(M;\beta ^*E) \end{aligned}$$

by what we already proved. \(\square \)
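
For the reader's convenience, the two purely algebraic identities used in this proof can be spelled out as follows. This is only a sketch of the bookkeeping; it uses nothing beyond \(((d^E)^*)^*=d^E\) and \(C^*(C^{-1})^*=1\), and in the first line \(\tilde{d}^E\) is the connection from Lemma 2.16.

$$\begin{aligned} (d^E)^*BK\,d^E-(\tilde{d}^E)^*BK\,\tilde{d}^E&=(d^E)^*BK\,(d^E-\tilde{d}^E)+(d^E-\tilde{d}^E)^*BK\,\tilde{d}^E, \\ \bigl(C^{-1}(d^E)^*C\bigr)^*&=C^*\,d^E\,(C^{-1})^*=C^*\bigl((C^{-1})^*d^E+[d^E,(C^{-1})^*]\bigr)=d^E+C^*[d^E,(C^{-1})^*]. \end{aligned}$$

The first line follows by adding and subtracting \((d^E)^*BK\,\tilde{d}^E\). The second line shows that \((d^E)^{\dag *}\) differs from \(d^E\) by a zeroth order term, which is why it is again a connection.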

Lemma 2.18

Equip \(E\rightarrow \overline{\mathbb {R}^4}\) with a fiber metric and fix a b-density on \(\overline{\mathbb {R}^4}\). Then for principally scalar \(W\in {}^0\mathcal {M}_{\beta ^*E}\), with principal symbol equal to that of the real vector field \(W_1\in \mathcal {V}_{\text {b}}(M)\), we have \(W+W^*\in -{\text {div}}W_1+\rho _I\,\mathcal {C}^\infty (M;{\text {End}}(\beta ^*E))\).

Proof

In a local trivialization on E, we have \(W=W_1\otimes 1+W_0\), \(W_0\in \rho _I\,\mathcal {C}^\infty (M;{\text {End}}(\beta ^*E))\), while the fiber inner product k on E is related to the standard Euclidean fiber inner product \(\underline{k}{}\) in the trivialization by \(k(e,e')=\underline{k}{}(\widetilde{C} e,\widetilde{C} e')\) for some \(\widetilde{C}\) smooth on \(\overline{\mathbb {R}^4}\), hence fiber constant on M. Denoting adjoints with respect to \(\underline{k}{}\) by \(\dag \), and letting \(C:=\widetilde{C}^*\widetilde{C}\), we thus have

$$\begin{aligned} W+W^*&=\bigl (W_1\otimes 1+C^{-1}(W_1^\dag \otimes 1)C\bigr ) + (W_0+W_0^*) \\&\in -({\text {div}}W_1)\otimes 1 + C^{-1}[W_1^\dag \otimes 1,C] + \rho _I\,\mathcal {C}^\infty , \end{aligned}$$

with the second term also lying in \(\rho _I\,\mathcal {C}^\infty \) since C is fiber-constant. \(\square \)
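
The two ingredients of this computation can be made explicit as follows; this is a sketch under the assumption (suggested by the notation, but not restated here) that \({\text {div}}W_1\) denotes the divergence of \(W_1\) with respect to the fixed b-density, which we call \(\mu \) for the purpose of this remark. For scalar functions \(u,v\) with compact support in the interior, integrating the Lie derivative \(\mathcal {L}_{W_1}(u\bar{v}\mu )\) and using that \(W_1\) is real gives

$$\begin{aligned} 0=\int \mathcal {L}_{W_1}(u\bar{v}\,\mu )&=\int (W_1 u)\bar{v}\,\mu +\int u\,\overline{W_1 v}\,\mu +\int ({\text {div}}W_1)\,u\bar{v}\,\mu , \quad \text {hence}\quad W_1+W_1^\dag =-{\text {div}}W_1, \\ C^{-1}(W_1^\dag \otimes 1)C&=W_1^\dag \otimes 1+C^{-1}[W_1^\dag \otimes 1,C]. \end{aligned}$$

Since \(W_1^\dag \) differs from \(-W_1\) by a function, the commutator in the second line is the zeroth order operator \(-W_1(C)\), that is, minus the derivative of \(C\) along \(W_1\).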

3 Gauge-fixed Einstein equation

As motivated in §1.2, we work in the wave map gauge with respect to the background metric \(g_m\) constructed in §2.1, since we expect the solution g of the initial value problem (1.4) for the Einstein vacuum equation with initial data asymptotic to mass m Schwarzschild to be well-behaved on the space \({}^mM\). The gauge condition reads

$$\begin{aligned} \Upsilon (g;g_m)_\mu := (g g_m^{-1}\delta _g G_g g_m)_\mu = g_{\mu \nu }g^{\kappa \lambda }(\Gamma (g)_{\kappa \lambda }^\nu - \Gamma (g_m)_{\kappa \lambda }^\nu ) = 0, \end{aligned}$$
(3.1)

where we recall the notation \(G_g=1-\tfrac{1}{2}g{\text {tr}}_g\), and \((\delta _g u)_\mu =-u_{\mu \nu ;}{}^\nu \). For brevity, we shall write

$$\begin{aligned} \Upsilon (g) \equiv \Upsilon (g;g_m), \end{aligned}$$

when the background metric \(g_m\) is clear from the context. A simple calculation shows that if \(h\in H_{{\text {b}}}^{\infty ;-\epsilon ,-\epsilon ,-\epsilon }({}^mM)\), \(\epsilon >0\) small, is a metric perturbation, and \(g=g_m+\rho h\), then the gauge condition \(\Upsilon (g;g_m)=0\) implies that the \(\partial _1\)-derivatives of the good components \(h_{0 0}\), \(h_{0\bar{b}}\), and the spherical trace of \(h\) decay towards \(\mathscr {I}^+\). (See equation (A.5) for this calculation for \(h\) with special structure.) A key ingredient of our iteration scheme is therefore constraint damping, which ensures that the gauge condition, or, more directly, the improved decay of the good components at \(\mathscr {I}^+\), is satisfied to leading order for each iterate \(h\). We implement constraint damping by considering the gauge-fixed Einstein operator

$$\begin{aligned} P(h) := \rho ^{-3}P_0(g_m + \rho h),\ \ \ P_0(g) := \text {Ric}(g) - \widetilde{\delta }{}^*\Upsilon (g;g_m), \end{aligned}$$
(3.2)

where on 1-forms u

$$\begin{aligned} \widetilde{\delta }{}^*u = \delta _{g_m}^*u - 2\gamma \tfrac{d\rho _t}{\rho _t}\otimes _s u + \gamma (\iota _{\rho _t^{-1}\nabla ^{g_m}\rho _t}u)g_m \end{aligned}$$
(3.3)

is a modification of the symmetric gradient \(\delta _{g_m}^*\) by a 0-th order term; here \(\rho _t\) is fixed according to Definition 2.9. We discuss the effect of this modification in §3.3, see in particular (3.26a). From now on, the mass parameter \(m\) will be fixed and dropped from the notation whenever convenient.
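
As a quick consistency check of (3.1) and (3.2), added here for orientation only and not a step of the argument: for \(h=0\) the Christoffel symbols of \(g\) and \(g_m\) coincide, so

$$\begin{aligned} \Upsilon (g_m;g_m)=0,\qquad P_0(g_m)=\text {Ric}(g_m),\qquad P(0)=\rho ^{-3}\,\text {Ric}(g_m). \end{aligned}$$

In particular, the modification (3.3) of the symmetric gradient does not enter \(P(0)\), and \(P(0)\) vanishes wherever \(g_m\) is Ricci flat.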

3.1 Form of metric perturbations

One can easily establish the existence of a solution of (1.4) near \(I^0\setminus (I^0\cap \mathscr {I}^+)\) for normalized initial data (see Theorem 1.8) which lie merely in \(\rho _0^{1/2+0}H_{{\text {b}}}^\infty \); this is due to nonlinear interactions being weak at \(I^0\), which in turn can ultimately be traced back to the null derivatives (2.28) coming with extra factors of \(\rho _0\). [Footnote 27] However, we will use (and prove) the existence of leading terms of the perturbation h of \(g=g_m+\rho h\) at \(\mathscr {I}^+\); as discussed around (1.18), this requires the initial data to be decaying to mass m Schwarzschild data. At \(I^+\) however, weak control, i.e. \(h\in \rho _+^{-1/2+0}H_{{\text {b}}}^\infty \) away from \(\mathscr {I}^+\), suffices due to the nonlinear interactions being as weak there as they are at \(I^0\). (The decay of our initial data does imply the existence of a leading term at \(I^+\), see §7). Motivated by this and the discussion of constraint damping above, and recalling the notation (2.30) and the bundle splittings (2.19) and (2.21), we will seek the solution h of \(P(h)=0\) in the function space \(\mathcal {X}^{k;b_0,b_I,b'_I,b_+}\):

Definition 3.1

Let \(k\in \mathbb {N}_0\cup \{\infty \}\), and fix weights [Footnote 28]

$$\begin{aligned} -1< b_+< 0< b_I< b'_I < \min (\tfrac{1}{2},b_0); \end{aligned}$$

let further \(\chi \in \mathcal {C}^\infty (M)\) be identically 1 near \(\mathscr {I}^+\), with support in a small neighborhood of \(\mathscr {I}^+\) where the bundle splitting (2.19) is defined; different choices of \(\chi \) will produce the same function space, as we shall discuss below. The space \(\mathcal {X}^{k;b_0,b_I,b'_I,b_+}\) consists of all \(h\in H_{{\text {b}}}^{k;b_0,-1,b_+}(M;\beta ^*S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})\) such that

(3.4)
(3.5)
(3.6)

where the leading and remainder terms are

$$\begin{aligned}&h_{1 1}^{(\ell )},\ h_{0 1}^{(0)},\ h_{1\bar{b}}^{(0)},\ h_{\bar{a}\bar{b}}^{(0)} \in \rho _0^{b_0}\rho _+^{b_+}H_{{\text {b}}}^k(\mathscr {I}^+), \\&h_{0 1,{\text {b}}},\ h_{1 1,{\text {b}}},\ h_{1\bar{b},{\text {b}}},\ h_{\bar{a}\bar{b},{\text {b}}} \in H_{{\text {b}}}^{k;b_0,b_I,b_+}, \end{aligned}$$

the latter supported on \({\text {supp}}\chi \) and valued in the bundles \(\underline{\mathbb {C}}{}\) (for \(\ell =0,1\)), \(\underline{\mathbb {C}}{}\), \(\beta ^*(r\,T^*\mathbb {S}^2)\), and \(\beta ^*(r^2\,S^2 T^*\mathbb {S}^2)\), respectively; we describe the topology on \(\mathcal {X}^{k;b_0,b_I,b'_I,b_+}\) below. Here, we use a collar neighborhood to extend functions from \(\mathscr {I}^+\) to a neighborhood of \(\mathscr {I}^+\) in M, and to extend the relevant bundles from \(\mathscr {I}^+\) to smooth subbundles of \(\beta ^*S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\) near \(\mathscr {I}^+\); all choices of collar neighborhoods and extensions give the same function space. We shall suppress the parameters \(b_0,b_I,b'_I,b_+\) from the notation when they are clear from the context, so

$$\begin{aligned} \mathcal {X}^k := \mathcal {X}^{k;b_0,b_I,b'_I,b_+}. \end{aligned}$$

Remark 3.2

The partial expansions amount to a statement of partial polyhomogeneity: for example, the condition on \(h_{0 1}\) in (3.6) for \(k=\infty \) can be phrased as \(h_{0 1}\in \mathcal {A}_{{\text {b}},{\text {phg}},{\text {b}}}^{b_0,0,b_+}+H_{{\text {b}}}^{\infty ;b_0,b_I,b_+}\), and similarly for \(k<\infty \) if one replaces the first summand by a function space capturing the finite regularity of the leading term at \(\mathscr {I}^+\). In view of the existence of at most logarithmically growing leading terms of \(h\in \mathcal {X}^k\) at \(\mathscr {I}^+\), we automatically have \(h\in H_{{\text {b}}}^{k;b_0,-0,b_+}\).

Thus, \(h\in \mathcal {X}^k\) decays at \(I^0\), while (3.4) encodes the vanishing of the good components at \(\mathscr {I}^+\); (3.5) and (3.6) assert the existence of leading terms of the remaining components, in the case of \(h_{1 1}\) allowing for a logarithmic term [Footnote 29]; finally, at \(I^+\), \(h\) is allowed to have mild growth. The existence of leading terms of \(h\in \mathcal {X}^{k;b_0,b_I,b'_I,b_+}\) at \(\mathscr {I}^+\) implies in particular that

$$\begin{aligned}&\rho _I\partial _{\rho _I}h_{\bar{\mu }\bar{\nu }}\in H_{{\text {b}}}^{k-1;b_0,b_I,b_+},\ \ (\bar{\mu },\bar{\nu })=(0,1),(1,\bar{b}),(\bar{a},\bar{b}), \nonumber \\&\rho _I\partial _{\rho _I}h_{1 1}\in h_{1 1}^{(1)}+H_{{\text {b}}}^{k-1;b_0,b_I,b_+},\ (\rho _I\partial _{\rho _I})^2 h_{1 1}\in H_{{\text {b}}}^{k-2;b_0,b_I,b_+}, \end{aligned}$$
(3.7)

which we will frequently use without further explanation.
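
To illustrate how the second line of (3.7) arises, here is a short sketch. It assumes that the partial expansion of \(h_{1 1}\) near \(\mathscr {I}^+\) has the form \(h_{1 1}=\chi \bigl (h_{1 1}^{(1)}\log \rho _I+h_{1 1}^{(0)}\bigr )+h_{1 1,{\text {b}}}\) (our reading of the logarithmic leading term described above), and that the leading terms are extended off \(\mathscr {I}^+\) independently of \(\rho _I\) by means of the collar neighborhood. On the set where \(\chi \equiv 1\) one then finds

$$\begin{aligned} \rho _I\partial _{\rho _I}h_{1 1}&=h_{1 1}^{(1)}+\rho _I\partial _{\rho _I}h_{1 1,{\text {b}}}, \\ (\rho _I\partial _{\rho _I})^2h_{1 1}&=(\rho _I\partial _{\rho _I})^2h_{1 1,{\text {b}}}. \end{aligned}$$

Each application of the b-vector field \(\rho _I\partial _{\rho _I}\) to \(h_{1 1,{\text {b}}}\in H_{{\text {b}}}^{k;b_0,b_I,b_+}\) costs one order of regularity and preserves the weights, which accounts for the regularity orders \(k-1\) and \(k-2\) in (3.7); terms involving derivatives of \(\chi \) are supported away from \(\mathscr {I}^+\).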

For \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\), we describe P(h) using a closely related function space:

Definition 3.3

For \(k\in \mathbb {N}_0\cup \{\infty \}\) and weights \(b_0,b_I,b'_I,b_+\) as above, the function space \(\mathcal {Y}^{k;b_0,b_I,b'_I,b_+}\) consists of all \(f\in H_{{\text {b}}}^{k;b_0,-2,b_+}(M;\beta ^*S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})\) so that near \(\mathscr {I}^+\),

(3.8)

The shift by \(-1\) in the decay order at \(\mathscr {I}^+\) is due to the linearized gauge-fixed Einstein equation, or even the linear scalar wave equation, being \(\rho _I^{-1}\) times a b-differential operator at \(\mathscr {I}^+\), cf. (1.18). A calculation will show that for h as above, the gauge-fixed Einstein operator P(h) defined in (3.2) satisfies \(P(h)\in \mathcal {Y}^{\infty ;b_0,b_I,b'_I,b_+}\), see Lemma 3.5 for a more precise statement. Note here that P(h) is well-defined (i.e. \(g_m+\rho h\) is a nondegenerate symmetric 2-tensor, making P(h) computable) in a neighborhood of \(\partial M\) due to the decay (in \(L^\infty \)) of \(g=g_m+\rho h\) to \(g_m\). In order for P(h) to be defined globally, we need to assume \(\rho h\) to be small in \(L^\infty \).

Fixing a smooth cutoff \(\chi \) as in Definition 3.1, we can define a norm on \(\mathcal {Y}^{k;b_0,b_I,b'_I,b_+}\) using the notation of Definition 3.3 by setting

where the choice of \(\rho _I\)-weight in the remainder term is arbitrary (as long as it is fixed and less than \(-1\)). Equipped with this norm, \(\mathcal {Y}^{k;b_0,b_I,b'_I,b_+}\) is a Banach space. A completely analogous definition gives a norm \(\Vert \cdot \Vert _{\mathcal {X}^{k;b_0,b_I,b'_I,b_+}}\). The spaces \(\mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) and \(\mathcal {Y}^{\infty ;b_0,b_I,b'_I,b_+}\), equipped with the projective limit topologies, are Fréchet spaces.

In particular, using the Sobolev embedding \(H_{{\text {b}}}^3(M)\hookrightarrow L^\infty (M)\) (which uses that \(3>\dim (M)/2\)), we have an embedding \(\mathcal {X}^3\hookrightarrow \rho _0^{b_0}\rho _I^{-1}\rho _+^{b_+}L^\infty \); thus, P(h) is well-defined globally on M provided h is small in \(\mathcal {X}^3\).

It will occasionally be useful to write

$$\begin{aligned} \mathcal {X}^k = \mathcal {X}^k_{\text {phg}}\oplus \mathcal {X}^k_{\text {b}}, \ \ \mathcal {Y}^k = \mathcal {Y}^k_{\text {phg}}\oplus \mathcal {Y}^k_{\text {b}}, \end{aligned}$$
(3.9)

where \(\mathcal {Y}^k_{\text {phg}}=\{ \chi f_{1 1}^{(0)} :f_{1 1}^{(0)}\in \rho _0^{b_0}\rho _+^{b_+}H_{{\text {b}}}^k(\mathscr {I}^+) \}\) encodes the leading term of elements of \(\mathcal {Y}^k\), while \(\mathcal {Y}^k_{\text {b}}=\{f\in \mathcal {Y}^k:f_{1 1}^{(0)}=0\}\) captures the remainder terms (i.e. with vanishing leading terms at \(\mathscr {I}^+\)); the spaces \(\mathcal {X}^k_{\text {phg}}\) and \(\mathcal {X}^k_{\text {b}}\) are defined analogously.

In order to exhibit the ‘null structure,’ or upper triangular block structure, of the linearized gauge-fixed Einstein operator \(D_h P\) for \(h\in \mathcal {X}\) at \(\mathscr {I}^+\) in a compact fashion, we introduce subbundles of the symmetric 2-tensor bundle. We use the following notation: given a nowhere vanishing section e of a complex vector bundle \(E\rightarrow U\) over a base manifold U, we denote by \(\langle e\rangle \) the line subbundle of E whose fiber over \(p\in U\) is given by \(\{\lambda e(p):\lambda \in \mathbb {C}\}\).

Definition 3.4

Define the subbundles

of \(S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}|_{S^+}\), which we extend in a smooth but otherwise arbitrary fashion to a neighborhood of \(S^+\) as rank 5, resp. 6, subbundles of \(S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\), still denoted by \(K_{1 1}^c\) and \(K_0^c\). Furthermore, define near \(S^+\) the subbundles

(3.10)

The only property of \(K_0\) and \(K_{1 1}\) which we will need is

$$\begin{aligned} K_0^c \oplus K_0 = S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4},\ \ K_{1 1}^c \oplus K_{1 1} = K_0^c. \end{aligned}$$

Denote by

$$\begin{aligned}&\pi _0:S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\rightarrow S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}/K_0^c \cong K_0, \nonumber \\&\tilde{\pi }_{1 1}:K_0^c \rightarrow K_0^c/K_{1 1}^c\cong K_{1 1} \end{aligned}$$
(3.11)

the projections onto the quotient bundles,

$$\begin{aligned} \pi _0^c := 1-\pi _0 :S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\rightarrow K_0^c, \end{aligned}$$

and

$$\begin{aligned} \pi _{1 1}:=\tilde{\pi }_{1 1}\pi _0^c :S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\rightarrow K_{1 1},\ \ \pi _{1 1}^c := (1-\tilde{\pi }_{1 1})\pi _0^c :S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4} \rightarrow K_{1 1}^c. \end{aligned}$$
(3.12)
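
As a brief check that the projections just defined resolve the identity (a direct consequence of the definitions, recorded here for convenience):

$$\begin{aligned} \pi _0+\pi _{1 1}+\pi _{1 1}^c=\pi _0+\tilde{\pi }_{1 1}\pi _0^c+(1-\tilde{\pi }_{1 1})\pi _0^c=\pi _0+\pi _0^c=1. \end{aligned}$$

Thus every section \(h\) of \(S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\) near \(S^+\) splits as \(h=\pi _0h+\pi _{1 1}h+\pi _{1 1}^ch\), with the three summands valued in \(K_0\), \(K_{1 1}\), and \(K_{1 1}^c\), respectively.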

Writing

$$\begin{aligned} \beta ^*S^2 \equiv \beta ^*S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4} \end{aligned}$$
(3.13)

from now on, the improved decay (3.4) of the good components of \(h\in \mathcal {X}^{k;b_0,b_I,b'_I,b_+}\) can then be expressed, using local coordinates \((\theta ^2,\theta ^3)\) on \(\mathbb {S}^2\), as

similarly for (3.8). The refinement \(K_{1 1}^c\subset K_0^c\),

will be used to encode part of the ‘null structure’ of the linearized gauge-fixed Einstein equation at \(\mathscr {I}^+\), as discussed in §5; the component

$$\begin{aligned} \pi _{1 1} h = h_{1 1}\,d s^2 \end{aligned}$$

will capture the logarithmically growing (relative to \(r^{-1}\)) component at \(\mathscr {I}^+\).

Consider now a fixed \(h\in \mathcal {X}^\infty \) which is small in \(\mathcal {X}^3\) so that \(g:=g_m+\rho h\) is a Lorentzian metric on \(\mathbb {R}^4\). Working near \(\mathscr {I}^+\), we recall the barred index notation (2.20), so with \(\rho =r^{-1}\), the coefficients of g in the product splitting (2.23) are

(3.14)

the coefficients \(g^{\mu \nu }\) of the inverse metric \(g^{-1}=g_m^{-1}-r^{-1}g_m^{-1}h g_m^{-1}+r^{-2}g_m^{-1}h g_m^{-1}h g_m^{-1}+H_{{\text {b}}}^{\infty ;3+3 b_0,3-0,3+3 b_+}\) are

(3.15)

where we raise spherical indices using the round metric. Thus,

(3.16)

The calculation of the connection coefficients, components of Riemann and Ricci curvature, and other geometric quantities associated with the metric g is then straightforward; the results of these calculations are given in Appendix A.
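
The expansion of the inverse metric used above is the Neumann series for \(g=g_m+\rho h\). As a sketch, regarding \(g_m\) and \(h\) as bundle maps from vectors to covectors, and assuming \(\rho h\) is small in \(L^\infty \) so that the series converges:

$$\begin{aligned} g^{-1}=\bigl (g_m(1+\rho \,g_m^{-1}h)\bigr )^{-1}&=\sum _{j\ge 0}(-\rho \,g_m^{-1}h)^j\,g_m^{-1} \\&=g_m^{-1}-\rho \,g_m^{-1}h\,g_m^{-1}+\rho ^2\,g_m^{-1}h\,g_m^{-1}h\,g_m^{-1}-\rho ^3\,g_m^{-1}h\,g_m^{-1}h\,g_m^{-1}h\,g_m^{-1}+\cdots . \end{aligned}$$

With \(\rho =r^{-1}\), the first three terms are those displayed above. The cubic term carries \(\rho ^3\) and three factors of \(h\in H_{{\text {b}}}^{\infty ;b_0,-0,b_+}\) (see Remark 3.2), which is consistent with the stated remainder orders \((3+3b_0,3-0,3+3b_+)\) under the assumption that \(\rho \) vanishes simply at each of \(I^0\), \(\mathscr {I}^+\), and \(I^+\).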

3.2 Mapping properties of the gauge-fixed Einstein operator

Let \(h\in \mathcal {X}^\infty =\mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\). In order to compute the leading terms of the gauge-fixed Einstein operator \(P(h)=\rho ^{-3} P_0(g)\), \(g=g_m+\rho h\), see (3.2), we first use the definition (3.3) of \(2(\widetilde{\delta }{}^*-\delta _{g_m^S}^*)\) (given explicitly by (A.2) in the case \(m=0\)) and the observation, from (A.5), that \(\Upsilon (g)\in H_{{\text {b}}}^{\infty ;2+b_0,1+b_I',2+b_+}\) (note that the explicit terms given in (A.5) lie in this space in view of (2.28) and the decay of the coefficients of h in Definition 3.1), to deduce that

$$\begin{aligned} 2(\widetilde{\delta }{}^*-\delta _{g_m}^*)\Upsilon (g) \in H_{{\text {b}}}^{\infty ;3+b_0,2+b_I',3+b_+}. \end{aligned}$$
(3.17)

The decay rate at \(I^+\) holds globally there, not only near \(I^+\cap \mathscr {I}^+\) where \(g_m=g_m^S\). To see this, it suffices to show that \(\Upsilon (g)\in \rho _+^{2+b_+}H_{{\text {b}}}^\infty \) near \((I^+)^\circ \) (since \(\widetilde{\delta }{}^*-\delta _{g_m}^*\in \rho _+\text {Diff}_{\text {b}}^1\), cf. (3.3), then maps it into the stated space). [Footnote 30] But this follows from the fact that there g differs from the smooth scattering metric \(g_m\) by an element of \(\rho _+^{1+b_+}H_{{\text {b}}}^\infty \) (with values in \(S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\)). Concretely, choosing local coordinates \(y^1,y^2,y^3\) in \(\partial \overline{\mathbb {R}^4}\), near any point \(p\in (I^+)^\circ \), we can introduce coordinates \(z^0:=\rho _+^{-1}\), \(z^a=\rho _+^{-1}y^a\) (\(a=1,2,3\)), in a neighborhood of p intersected with \(\rho >0\), and \(\{\partial _{z^\mu }:\mu =0,\ldots ,3\}\) is a frame of \({}^{{\text {sc}}}T\overline{\mathbb {R}^4}\) there; but then, using \(\partial _{z^\mu }\in \rho _+\mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\), one sees that \(\Gamma (g_m+\rho h)_{\kappa \lambda }^\nu -\Gamma (g_m)_{\kappa \lambda }^\nu \) is a sum of terms of the form

$$\begin{aligned} ((g_m+\rho h)^{\mu \nu }{-}(g_m)^{\mu \nu })\partial _{z^\kappa }(g_m)_{\lambda \sigma }\in \rho _+^{1+b_+}H_{{\text {b}}}^\infty \cdot \rho _{+}\mathcal {C}^\infty (\overline{\mathbb {R}^4})\subset \rho _+^{2+b_+}H_{{\text {b}}}^\infty \ \ (\text{ near }\ p), \end{aligned}$$

and \((g_m+\rho h)^{\mu \nu }\partial _{z^\kappa }(\rho h_{\lambda \sigma })\), which likewise lies in \(\rho _+^{2+b_+}H_{{\text {b}}}^\infty \) near p. (The Christoffel symbols themselves satisfy \(\Gamma (g_m)_{\kappa \lambda }^\nu \in \rho _+\mathcal {C}^\infty (\overline{\mathbb {R}^4})\), \(\Gamma (g_m+\rho h)_{\kappa \lambda }^\nu \in \rho _+\mathcal {C}^\infty (\overline{\mathbb {R}^4})+\rho _+^{2+b_+}H_{{\text {b}}}^\infty \)).
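
For concreteness, the two types of terms just described come from the standard coordinate formula for Christoffel symbols. With \(g=g_m+\rho h\) and all derivatives taken in the coordinates \(z^\mu \), a sketch of the decomposition reads

$$\begin{aligned} \Gamma (g)^\nu _{\kappa \lambda }-\Gamma (g_m)^\nu _{\kappa \lambda }&=\tfrac{1}{2}\bigl (g^{\nu \sigma }-(g_m)^{\nu \sigma }\bigr )\bigl (\partial _{z^\kappa }(g_m)_{\lambda \sigma }+\partial _{z^\lambda }(g_m)_{\kappa \sigma }-\partial _{z^\sigma }(g_m)_{\kappa \lambda }\bigr ) \\&\quad +\tfrac{1}{2}g^{\nu \sigma }\bigl (\partial _{z^\kappa }(\rho h_{\lambda \sigma })+\partial _{z^\lambda }(\rho h_{\kappa \sigma })-\partial _{z^\sigma }(\rho h_{\kappa \lambda })\bigr ). \end{aligned}$$

The first line collects the terms in which the difference of inverse metrics multiplies a derivative of \(g_m\); the second line collects the terms in which a derivative falls on \(\rho h\).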

We can now prove:

Lemma 3.5

For any \(h\in \mathcal {X}^\infty \), the tensor P(h) is well-defined near \(\partial M\) (in the sense explained in the paragraph after Definition 3.3), and we have \(\chi P(h)\in \mathcal {Y}^\infty \) for any \(\chi \in \mathcal {C}^\infty (M)\) with support sufficiently close (depending on h) to \(\partial M\). We have \(P(h)\in \mathcal {Y}^\infty \) provided \(\Vert h\Vert _{\mathcal {X}^3}\) is small. More precisely, we have \(P(h)_{\bar{a}\bar{b}}\in H_{{\text {b}}}^{\infty ;b_0,-1+b'_I,b_+}\) and

$$\begin{aligned} P(h)_{1 1} \in -2\rho ^{-2}\partial _1\partial _0 h_{1 1} - \tfrac{1}{4} \rho ^{-1}\partial _1 h^{\bar{d}\bar{e}}\partial _1 h_{\bar{d}\bar{e}} + H_{{\text {b}}}^{\infty ;b_0,-1+b_I,b_+} \end{aligned}$$
(3.18)

when \(\rho =r^{-1}\) near \(\mathscr {I}^+\).

Proof

We use the calculations (near \(I^0\cup \mathscr {I}^+\)) of \(\delta _{g_m}^*\Upsilon (g)\) in (A.6) and of \(\text {Ric}(g)\) in (A.8); in view of the calculation (3.17), it suffices to prove that \(\rho ^{-3}(\text {Ric}(g)-\delta _{g_m}^*\Upsilon (g))\in \mathcal {Y}^\infty \) near \(\partial M\). In a neighborhood of \(I^0\cup \mathscr {I}^+\), this follows by subtracting (A.6) from (A.8) and dividing by \(\rho ^3\) (thus shifting the three orders down by 3); the expression (3.18) is a particular result of this subtraction.

It remains to justify the decay rate globally at \(I^+\), which is a slight extension of the calculations justifying (3.17) above. We use local coordinates near \(p\in (I^+)^\circ \) as above: firstly, the membership of \(\delta _{g_m}^*\Upsilon (g)\) follows directly from the above arguments. Secondly, the difference of curvature components \(R(g_m+\rho h)^\mu {}_{\nu \kappa \lambda }-R(g_m)^\mu {}_{\nu \kappa \lambda }\) is a sum of terms of the schematic forms \(\partial _\mu (\Gamma (g_m+\rho h)^\kappa _{\nu \lambda }-\Gamma (g_m)^\kappa _{\nu \lambda })\) and \((\Gamma (g_m+\rho h)_{\mu \nu }^\kappa -\Gamma (g_m)_{\mu \nu }^\kappa )\Gamma (g_m+\rho h)_{\kappa \lambda }^\nu \), both of which lie in \(\rho _+^{3+b_+}H_{{\text {b}}}^\infty \) by the calculations above. But by construction, see equations (2.10)–(2.11), \(g_m\) differs from a flat metric by a smooth symmetric scattering 2-tensor of class \(\rho _+\mathcal {C}^\infty (\overline{\mathbb {R}^4})\), which implies that \(R(g_m)^\mu {}_{\nu \kappa \lambda }\in \rho _+^3\mathcal {C}^\infty (\overline{\mathbb {R}^4})\) near p. Therefore, the Riemann curvature tensor satisfies

$$\begin{aligned} R(g_m+\rho h) \in \rho _+^{3+b_+}H_{{\text {b}}}^\infty \end{aligned}$$
(3.19)

as a section of \({}^{{\text {sc}}}T\overline{\mathbb {R}^4}\otimes ({}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})^{\otimes 3}\) near \((I^+)^\circ \), which a fortiori gives \(\text {Ric}(g)\in \rho _+^{3+b_+}H_{{\text {b}}}^\infty \), as desired. (The vanishing of P(h) modulo the faster decaying space \(\rho _0^{b_0}H_{{\text {b}}}^\infty \) near \((I^0)^\circ \) requires more structure of \(g_m\), namely the Ricci flatness of the background metric \(g_m\)). \(\square \)
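
The schematic form of the curvature difference used in this proof can be written out as follows. This is a sketch in one common sign and index convention for the Riemann tensor, \(R^\mu {}_{\nu \kappa \lambda }=\partial _\kappa \Gamma ^\mu _{\nu \lambda }-\partial _\lambda \Gamma ^\mu _{\nu \kappa }+\Gamma ^\mu _{\kappa \sigma }\Gamma ^\sigma _{\nu \lambda }-\Gamma ^\mu _{\lambda \sigma }\Gamma ^\sigma _{\nu \kappa }\), which may differ from the convention of the paper by signs or index ordering; we write \(C=\Gamma (g)-\Gamma (g_m)\) with \(g=g_m+\rho h\).

$$\begin{aligned} R(g)^\mu {}_{\nu \kappa \lambda }-R(g_m)^\mu {}_{\nu \kappa \lambda }&=\partial _\kappa C^\mu _{\nu \lambda }-\partial _\lambda C^\mu _{\nu \kappa } \\&\quad +C^\mu _{\kappa \sigma }\Gamma (g)^\sigma _{\nu \lambda }+\Gamma (g_m)^\mu _{\kappa \sigma }C^\sigma _{\nu \lambda }-C^\mu _{\lambda \sigma }\Gamma (g)^\sigma _{\nu \kappa }-\Gamma (g_m)^\mu _{\lambda \sigma }C^\sigma _{\nu \kappa }. \end{aligned}$$

Near \(p\), one has \(C\in \rho _+^{2+b_+}H_{{\text {b}}}^\infty \), \(\partial _{z^\mu }\in \rho _+\mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\), and the Christoffel symbols lie in \(\rho _+\mathcal {C}^\infty +\rho _+^{2+b_+}H_{{\text {b}}}^\infty \), so each term on the right lies in \(\rho _+^{3+b_+}H_{{\text {b}}}^\infty \); the quadratic contribution \(\rho _+^{4+2b_+}H_{{\text {b}}}^\infty \) is contained in this space because \(b_+>-1\).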

Note that one component of P(h) has a nontrivial leading term at \(\mathscr {I}^+\); in order for this to not create logarithmically growing terms in components (other than the (1, 1) component) of the next iterate of our Newton-type iteration scheme (which would cause the iteration scheme to not close), one needs to exploit the special structure of the operator \(D_h P\). See also the discussion around (1.26).

3.3 Leading order structure of the linearized gauge-fixed Einstein operator

For \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) small, write

$$\begin{aligned} L_h:=D_h P, \end{aligned}$$
(3.20)

and let \(g=g_m+\rho h\). We shall now calculate the structure of \(L_h\) ‘at infinity,’ that is, its leading order terms at \(I^0\), \(\mathscr {I}^+\), and \(I^+\): at \(\mathscr {I}^+\), we will find that the equation \(L_h u=f\) can be partially decoupled to leading order; this is the key structure for proving global existence for the nonlinear problem later. Recall from [47] that

$$\begin{aligned} D_g\text {Ric}&= \tfrac{1}{2}\Box _g - \delta _g^*\delta _g G_g + \mathscr {R}_g, \nonumber \\&\quad \mathscr {R}_g(u)_{\mu \nu }=(R_g)^\kappa {}_{\mu \nu \lambda }u_\kappa {}^\lambda + \tfrac{1}{2}(\text {Ric}(g)_\mu {}^\lambda u_{\lambda \nu }+\text {Ric}(g)_\nu {}^\lambda u_{\lambda \mu }), \nonumber \\ D_g\Upsilon (g)u&= -\delta _g G_g u - \mathscr {C}_g(u)+\mathscr {Y}_g(u), \end{aligned}$$
(3.21)

where (our notation differs from the one used in [47] by various signs)

$$\begin{aligned} \mathscr {C}_g(u)_\kappa = g_{\kappa \lambda }C^\lambda _{\mu \nu }u^{\mu \nu },\ C^\lambda _{\mu \nu }=\Gamma (g)^\lambda _{\mu \nu }-\Gamma (g_m)^\lambda _{\mu \nu }; \qquad \mathscr {Y}_g(u)_\kappa = \Upsilon (g)^\lambda u_{\kappa \lambda }. \end{aligned}$$

Here, index raising and lowering as well as covariant derivatives are defined using the metric g, and \((\Box _g u)_{\mu \nu }=-u_{\mu \nu ;\kappa }{}^\kappa \). Thus, recalling the definition (3.3) of \(\widetilde{\delta }{}^*\), we have

$$\begin{aligned} L_h = \rho ^{-3}\bigl (\tfrac{1}{2}\Box _g + (\widetilde{\delta }{}^*-\delta _g^*)\delta _g G_g + \widetilde{\delta }{}^*(\mathscr {C}_g-\mathscr {Y}_g) + \mathscr {R}_g\bigr )\rho ,\qquad g=g_m+\rho h, \end{aligned}$$
(3.22)

which has principal symbol

$$\begin{aligned} \sigma _2(L_h) = \tfrac{1}{2}G_{\text {b}}:= \tfrac{1}{2}(g_{\text {b}})^{-1},\ \ g_{\text {b}}:=\rho ^2 g, \end{aligned}$$
(3.23)

where \(G\in \mathcal {C}^\infty (T^*\mathbb {R}^4)\) is the dual metric function \(G(\zeta )=|\zeta |_G^2\). As a first step towards understanding the nature of \(L_h\) as a b-differential operator on M, we prove:

Lemma 3.6

We have \(L_0\in \rho _I^{-1}\text {Diff}_{\text {b}}^2(M;\beta ^*S^2)\) (see (3.13)).

Proof

Since \(g_m\) is a smooth scattering metric, we see, using local coordinates \(z^\mu \) and the membership \(\partial _{z^\mu }\in \rho \mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\) as in the discussion preceding Lemma 3.5 to compute Christoffel symbols, that

$$\begin{aligned} \mathscr {R}_{g_m}\in \rho ^2\,\mathcal {C}^\infty (\overline{\mathbb {R}^4};{\text {End}}(S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})),\ \ \delta _{g_m}\in \rho \,\text {Diff}_{\text {b}}^1(\overline{\mathbb {R}^4};S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4},{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}), \end{aligned}$$

and \(\Box _{g_m}\in \rho ^2\,\text {Diff}_{\text {b}}^2(\overline{\mathbb {R}^4};S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})\). This gives \(L_0\in \text {Diff}_{\text {b}}^2(\overline{\mathbb {R}^4};S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})\), and thus the desired conclusion away from \(\mathscr {I}^+\). Near \(\mathscr {I}^+\), any element of \(\text {Diff}_{\text {b}}^1(\overline{\mathbb {R}^4})\) lifts to an element of \(\rho _I^{-1}\text {Diff}_{\text {b}}^1(M)\); moreover, for \(V_1,V_2\in \mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\), the product \(V_1 V_2\) lifts to an element of \(\rho _I^{-1}\text {Diff}_{\text {b}}^2(M)\) provided at least one of the \(V_j\) is tangent to \(S^+\). Thus, expressing \(\Box _{g_m}\) in the null frame \(\partial _0,\partial _1,\partial _a\) (\(a=2,3\)), we merely need to check that the coefficient of \(\partial _1^2\) vanishes at \(S^+\); but this coefficient is \(g_m^{1 1}\equiv 0\). \(\square \)
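
The weight count behind the membership \(L_0\in \text {Diff}_{\text {b}}^2(\overline{\mathbb {R}^4};S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})\) can be sketched as follows. For \(h=0\) we have \(g=g_m\), so the tensor \(\Gamma (g)-\Gamma (g_m)\) and the gauge 1-form \(\Upsilon (g)\) vanish, and (3.22) reduces to

$$\begin{aligned} L_0=\rho ^{-3}\bigl (\tfrac{1}{2}\Box _{g_m}+(\widetilde{\delta }{}^*-\delta _{g_m}^*)\delta _{g_m}G_{g_m}+\mathscr {R}_{g_m}\bigr )\rho ,\qquad \rho ^{-3}\cdot \rho ^2\,\text {Diff}_{\text {b}}^2\cdot \rho =\text {Diff}_{\text {b}}^2. \end{aligned}$$

Here each of the three terms in parentheses lies in \(\rho ^2\,\text {Diff}_{\text {b}}^2\): this is stated in the proof for \(\Box _{g_m}\) and \(\mathscr {R}_{g_m}\), while for the middle term we assume, as the form of (3.3) suggests, that the zeroth order modification \(\widetilde{\delta }{}^*-\delta _{g_m}^*\) supplies one factor of \(\rho \) in addition to the one from \(\delta _{g_m}\). The last equality uses that conjugation by a power of \(\rho \) preserves b-differential operators, since \(\rho ^{-1}V\rho =V+\rho ^{-1}V(\rho )\) and \(V(\rho )\in \rho \,\mathcal {C}^\infty \) for \(V\in \mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\).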

As suggested by the toy estimate (1.19) and explained in §2.4, we need to describe lower order terms of \(L_h\) near \(\mathscr {I}^+\) in two stages, one involving the module \(\mathcal {M}\) from Definition 2.12, the other being general b-differential operators but with extra decay at \(\rho _I=0\). For illustration and for later use, we calculate the leading terms, i.e. the ‘normal operator,’ of the scalar wave operator:

Lemma 3.7

The scalar wave operator \(\Box _{g_{\text {b}}}\) (see (3.23)) satisfies

$$\begin{aligned} \Box _{g_{\text {b}}} \in -4\rho ^{-2}\partial _0\partial _1 +H_{{\text {b}}}^{\infty ;1+b_0,-1+b'_I,1+b_+}\mathcal {M}^2_{\underline{\mathbb {C}}{}} + (\mathcal {C}^\infty +H_{{\text {b}}}^{\infty ;1+b_0,-0,1+b_+})\text {Diff}_{\text {b}}^2(M). \end{aligned}$$
(3.24)
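
As an illustration of the leading term in (3.24), consider the exactly flat case. The following sketch assumes \(m=0\), \(h=0\), the signature convention \(g=dt^2-dx^2\), null coordinates \(x^0=q=t-r\), \(x^1=s=t+r\), and \(\rho =r^{-1}\) (valid for \(r>0\)); the symbols \(g_{\mathbb {S}^2}\) for the round metric and \(\Delta _{\mathbb {S}^2}\) for its nonpositive Laplacian are introduced here only for this remark. Then \(g_{\text {b}}=\rho ^2g\) is a product-type metric, and with the sign convention \(\Box =-(\cdot )_{;\kappa }{}^\kappa \) one computes

$$\begin{aligned} g_{\text {b}}=r^{-2}\,dq\,ds-g_{\mathbb {S}^2},\qquad \Box _{g_{\text {b}}}=-4r^2\,\partial _q\partial _s+\Delta _{\mathbb {S}^2}=-4\rho ^{-2}\partial _0\partial _1+\Delta _{\mathbb {S}^2}. \end{aligned}$$

Indeed, the volume density of the \((q,s)\) block is \(\tfrac{1}{2}r^{-2}\) and the inverse metric coefficient is \(g_{\text {b}}^{qs}=2r^2\), so their product is constant and no first order terms arise. In this model case the spherical Laplacian is a b-differential operator with smooth coefficients, hence belongs to the last summand of (3.24) and is not of leading order at \(\mathscr {I}^+\).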

For the linearized gauge-fixed Einstein operator \(L_h\), the analogous result is:

Lemma 3.8

For \(h\in \mathcal {X}^\infty \) small in \(\mathcal {X}^3\), we have

$$\begin{aligned} L_h = L_h^0 + \widetilde{L}_h \end{aligned}$$

where, using the notation (3.13) and fixing \(\rho =r^{-1}\) near \(\mathscr {I}^+\),

$$\begin{aligned} L_h^0&= -\rho ^{-1}\bigl ((2\rho ^{-1}\partial _0+A_h)\partial _1 - B_h\bigr ), \nonumber \\ \widetilde{L}_h&\in H_{{\text {b}}}^{\infty ;1+b_0,-1+b'_I,1+b_+}\mathcal {M}_{\beta ^*S^2}^2 + (\mathcal {C}^\infty +H_{{\text {b}}}^{\infty ;1+b_0,-0,1+b_+})\text {Diff}_{\text {b}}^2(M;\beta ^*S^2); \end{aligned}$$
(3.25)

here \(\rho ^{-1}\partial _0\) and \(\partial _1\) are defined using equation (2.42) and Lemma 2.13(3). In the refinement of the bundle splitting (2.21) by (2.22), \(A_h\) and \(B_h\) are given by

$$\begin{aligned} A_h=\left( \begin{array}{ccccccc} 2\gamma &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ -2\partial _1 h_{0 1} &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} \gamma &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -2\partial _1 h_1{}^{\bar{a}} &{} 0 &{} 0 &{} \gamma +2\partial _1 h_{0 1} &{} \tfrac{1}{2}\partial _1 h^{\bar{a}\bar{b}} \\ -2\partial _1 h_{1\bar{b}} &{} 0 &{} \gamma &{} 0 &{} 0 &{} 0 &{} 0 \\ 2\gamma &{} 0 &{} 0 &{} 0 &{} 0 &{} \gamma &{} 0 \\ -2\partial _1 h_{\bar{a}\bar{b}} &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{array}\right) \end{aligned}$$

and

$$\begin{aligned} B_h= \begin{pmatrix} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 2\partial _1\partial _1 h_{0 1} &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 2\partial _1\partial _1 h_{1 1} &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 2\partial _1\partial _1 h_{1\bar{b}} &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 2\partial _1\partial _1 h_{\bar{a}\bar{b}} &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{pmatrix}. \end{aligned}$$

The proofs of these lemmas only involve simple calculations and careful bookkeeping; they are given in Appendix B. We thus see that at \(\mathscr {I}^+\), \(L_h\) effectively becomes a differential operator in the null coordinates \(x^0=q\) and \(x^1=s\) only, as spherical derivatives have decaying coefficients; this is to be expected since \(r^{-1}V\), \(V\in \mathcal {V}(\mathbb {S}^2)\subset \mathcal {V}_{\text {b}}(M)\), is the naturally appearing (scattering) derivative just like \(\partial _0\) and \(\partial _1\). We point out that a number of terms of \(L_h\) which are not of leading order at \(\mathscr {I}^+\) do contribute to the normal operators at \(I^0\) and \(I^+\); this includes in particular the spherical Laplacian, which is crucial for proving an energy estimate.

For the analysis of the linearized operator \(L_h\), the structure of the leading term \(L_h^0\) will be key for obtaining the rough background estimate, Theorem 4.2, as well as the precise asymptotic behavior at \(\mathscr {I}^+\), as encoded in the space \(\mathcal {X}^\infty \). To describe this structure concisely, recall the projection \(\pi _0\) defined in (3.11) projecting a metric perturbation onto the bundle \(K_0\) encoding the components which we expect to be decaying from the gauge condition; and the projection \(\pi _{1 1}\) defined in (3.12) onto the bundle \(K_{1 1}\) encoding the (1, 1) component, which we allow to include a logarithmic term. Thus, in the splitting used in Lemma 3.8, \(\pi _0\) picks out components 1, 3, 6, \(\pi _{1 1}\) picks out component 4, and \(\pi _{1 1}^c\) picks out components 2, 5, 7. Suppose now \(h'\) satisfies the asymptotic equation \(L_h^0 h'=0\). Since \(\pi _0 A_h|_{K_0^c}=0\) and \(\pi _0 B_h|_{K_0^c}=0\), the components \(\pi _0 h'\), which we hope to be decaying, satisfy a decoupled equation

$$\begin{aligned} (2\rho ^{-1}\partial _0+A_{\text {CD}})\partial _1(\pi _0 h') = 0, \quad A_{\text {CD}}:=\begin{pmatrix} 2\gamma &{} 0 &{} 0 \\ 0 &{} \gamma &{} 0 \\ 2\gamma &{} 0 &{} \gamma \end{pmatrix}, \end{aligned}$$
(3.26a)

where \(A_{\text {CD}}\in \mathcal {C}^\infty (M;{\text {End}}(K_0))\) is the endomorphism induced by \(\pi _0 A_h\) on \(\beta ^*S^2/K_0^c\cong K_0\). (Thus, this matrix is the expression for \(A_{\text {CD}}\) in the splitting of \(K_0\cong \beta ^*S^2/K_0^c\) induced by the splittings (2.21)–(2.22) via the projection \(\pi _0\)). Note that by equation (2.28), \(\rho ^{-1}\partial _0\) is proportional to the dilation vector field \(-\rho _I\partial _{\rho _I}\) (which is the asymptotic generator of dilations on outgoing light cones), hence equation (3.26a) is, schematically, \((\rho _I\partial _{\rho _I}-A_{\text {CD}})(\pi _0 h')=0\). Choosing \(\gamma >0\), the spectrum of \(A_{\text {CD}}\) is positive, which will allow us to prove that \(\pi _0 h'\) decays at \(\mathscr {I}^+\), similarly to the discussion of the model equation (1.24); we will make this precise in §§4.1 and 5.1.
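
The positivity of the spectrum and its effect on the schematic equation can be verified by hand. The following sketch treats the model ordinary differential equation \(\rho _I\partial _{\rho _I}v=A_{\text {CD}}v\) for a function \(v\) of \(\rho _I\) with values in \(\mathbb {R}^3\); the symbols \(v\) and \(c_1,c_2,c_3\) are introduced here for illustration only. Since \(A_{\text {CD}}\) is lower triangular,

$$\begin{aligned} \det (\lambda -A_{\text {CD}})&=(\lambda -2\gamma )(\lambda -\gamma )^2, \\ A_{\text {CD}}\begin{pmatrix}1\\ 0\\ 2\end{pmatrix}&=2\gamma \begin{pmatrix}1\\ 0\\ 2\end{pmatrix},\qquad A_{\text {CD}}\begin{pmatrix}0\\ 1\\ 0\end{pmatrix}=\gamma \begin{pmatrix}0\\ 1\\ 0\end{pmatrix},\qquad A_{\text {CD}}\begin{pmatrix}0\\ 0\\ 1\end{pmatrix}=\gamma \begin{pmatrix}0\\ 0\\ 1\end{pmatrix}, \\ v(\rho _I)&=c_1\,\rho _I^{2\gamma }\begin{pmatrix}1\\ 0\\ 2\end{pmatrix}+\rho _I^{\gamma }\begin{pmatrix}0\\ c_2\\ c_3\end{pmatrix}. \end{aligned}$$

The matrix is diagonalizable, so no logarithmic factors appear in the model solution, and for \(\gamma >0\) every solution decays at least like \(\rho _I^\gamma \) as \(\rho _I\rightarrow 0\).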

Next, using that \(\pi _{1 1}^c A_h|_{K_{1 1}}=0\) and \(\pi _{1 1}^c B_h|_{K_{1 1}}=0\), i.e. the logarithmic component \(h_{1 1}\) does not couple into the other nondecaying components, we can obtain an equation for the nonlogarithmic components \(\pi _{1 1}^c h'\) which only couples to (3.26a), namely

$$\begin{aligned}&2\rho ^{-1}\partial _0\partial _1(\pi _{1 1}^c h') = (-A_{h,1 1}^c\partial _1 + B_{h,1 1}^c)(\pi _0 h'), \nonumber \\&\quad A_{h,1 1}^c= \left( \begin{array}{ccc} -2\partial _1 h_{0 1} &{} 0 &{} 0 \\ -2\partial _1 h_{1\bar{b}} &{} \gamma &{} 0 \\ -2\partial _1 h_{\bar{a}\bar{b}} &{} 0 &{} 0 \end{array}\right) ,\quad B_{h,1 1}^c= \begin{pmatrix} 2\partial _1\partial _1 h_{0 1} &{} 0 &{} 0 \\ 2\partial _1\partial _1 h_{1\bar{b}} &{} 0 &{} 0 \\ 2\partial _1\partial _1 h_{\bar{a}\bar{b}} &{} 0 &{} 0 \end{pmatrix}; \end{aligned}$$
(3.26b)

the precise form of \(A_{h,1 1}^c,B_{h,1 1}^c\), mapping sections of \(K_0\) to sections of \(K_{1 1}^c\), is irrelevant: only their boundedness matters (even mild growth towards \(\mathscr {I}^+\) would be acceptable). The operator on the left hand side of (3.26b) has the same structure as the model operator in (1.22); the fact that the forcing term in (3.26b) is decaying will thus allow us to prove that \(\pi _{1 1}^c h'\) is bounded at \(\mathscr {I}^+\), consistent with what the function space \(\mathcal {X}^\infty \) encodes.
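
To see why a decaying forcing term yields boundedness, consider the following scalar model, which is only a sketch: \(w\) stands in for \(\partial _1(\pi _{1 1}^c h')\), and the exponent \(b>0\), the constant \(C\), and the reference value \(\rho _*\in (0,1]\) are hypothetical quantities introduced for illustration. If \(\rho _I\partial _{\rho _I}w=f\) with \(|f|\le C\rho _I^b\), then for \(0<\rho _I\le \rho _*\),

$$\begin{aligned} w(\rho _I) = w(\rho _*) - \int _{\rho _I}^{\rho _*} f(s)\,\frac{ds}{s}, \quad \Bigl |\int _{\rho _I}^{\rho _*} f(s)\,\frac{ds}{s}\Bigr | \le C\int _{\rho _I}^{\rho _*} s^{b-1}\,ds \le \frac{C\rho _*^b}{b}. \end{aligned}$$

Hence \(w\) is bounded and has a limit as \(\rho _I\rightarrow 0\); the integrability of \(s^{b-1}\) at \(s=0\) is exactly what the decay of the forcing provides.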

Lastly, \(\pi _{1 1} h'\) couples to all previous quantities,

$$\begin{aligned}&2\rho ^{-1}\partial _0\partial _1(\pi _{1 1} h') = (-A_{h,1 1}\partial _1 + B_{h,1 1})\begin{pmatrix}\pi _0 h' \\ \pi _{1 1}^c h'\end{pmatrix}, \nonumber \\&\quad A_{h,1 1}=\begin{pmatrix} 0&-2\partial _1 h_1{}^{\bar{a}}&\gamma +2\partial _1 h_{0 1}&0&0&\tfrac{1}{2}\partial _1 h^{\bar{a}\bar{b}} \end{pmatrix}, \nonumber \\&\quad B_{h,1 1}=\begin{pmatrix} 2\partial _1\partial _1 h_{1 1}&0&0&0&0&0 \end{pmatrix}. \end{aligned}$$
(3.26c)

The logarithmic growth of the first component of \(B_{h,1 1}\) is more than balanced by the fast decay of the (0, 0)-component of \(h'\) that it acts on.

Remark 3.9

The fact that the logarithmic growth of \(h_{1 1}\) is rendered harmless due to its coupling only to the faster decaying \(\pi _0 h'\) is the manifestation of the weak null condition [78] in our framework. Here, the faster decay of \(\pi _0 h'\) is accomplished by means of constraint damping, whereas in [79, 80] the faster decay of \(\pi _0\) applied to the difference of the nonlinear solution and the background (Minkowski) metric follows from the gauge condition which the nonlinear solution verifies, cf. [80, Corollary 9.7].

More subtly, the \(\rho _I^{b'_I}\) decay of \(h'_{0 0}\) is required at this point to allow for an estimate of the remainder of \(h_{1 1}\) with weight \(\rho _I^{b_I}\) (\(\gg \rho _I^{b'_I}\log \rho _I\)). The last component of \(A_{h,1 1}\), acting on the trace-free spherical part of \(h'\), in general has a nonzero leading term at \(\mathscr {I}^+\) (see footnote 31); hence, solving the equation (3.26c), schematically \(\rho _I\partial _{\rho _I}(\partial _1\pi _{1 1} h')\approx \partial _1 h^{\bar{a}\bar{b}}(\partial _1 h')_{\bar{a}\bar{b}}\), requires \(\pi _{1 1} h'\) to have a \(\log \rho _I\) term.
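
The appearance of the logarithm can be illustrated by a scalar model; this is a sketch only, in which \(w\) stands in for \(\partial _1\pi _{1 1}h'\), the constant \(c\) for the nonzero leading term of the forcing, and \(b>0\), \(C\), \(\rho _*\in (0,1]\) are hypothetical. If \(\rho _I\partial _{\rho _I}w=c+f\) with \(|f|\le C\rho _I^b\), then

$$\begin{aligned} w(\rho _I) = c\log (\rho _I/\rho _*) + w(\rho _*) - \int _{\rho _I}^{\rho _*} f(s)\,\frac{ds}{s}, \end{aligned}$$

so \(w\) equals \(c\log \rho _I\) plus a bounded term, and the logarithm is absent only if \(c=0\). Regarding the comparison of weights, the elementary bound \(\sup _{0<\rho \le 1}\rho ^{\kappa }|\log \rho |=(e\kappa )^{-1}\) for \(\kappa >0\) (attained at \(\rho =e^{-1/\kappa }\)) gives, assuming \(b_I<b'_I\) as the comparison \(\gg \) in the text indicates,

$$\begin{aligned} \rho _I^{b'_I}|\log \rho _I| \le \frac{1}{e(b'_I-b_I)}\,\rho _I^{b_I}, \quad 0<\rho _I\le 1. \end{aligned}$$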

At the other boundaries \(I^0\) and \(I^+\), we only need crude information about \(L_h\) for the purpose of obtaining an energy estimate in §4:

Lemma 3.10

We have \(L_h-L_0\in H_{{\text {b}}}^{\infty ;1+b_0,-1-0,1+b_+}\text {Diff}_{\text {b}}^2(M;\beta ^*S^2)\).

Proof

Near \((I^+)^\circ \), the stated \(\rho _+^{1+b_+}\) decay is a consequence of the calculation of differences of Christoffel symbols and curvature components as in the proof of Lemma 3.5. Near \(\mathscr {I}^+\), we revisit the proof of Lemma 3.8: in the notation of equation (3.25), the expressions for \(A_h\) and \(B_h\) give \(L_h^0-L_0^0\in H_{{\text {b}}}^{\infty ;1+b_0,-0,1+b_+}\text {Diff}_{\text {b}}^1\). Regarding the second remainder term in \(\widetilde{L}_h\), we note that the leading order terms, captured by the \(\text {Diff}_{\text {b}}^2\) summand with \(\mathcal {C}^\infty \) coefficients, come from terms of the metric and the Christoffel symbols which do not involve h; thus, these are equal to the corresponding terms of \(L_0\). \(\square \)

In order to obtain optimal decay results at \(I^+\) in §5.2, we shall need the precise form of the normal operator of \(L_h\), which by Lemma 3.10 is the same as that of \(L_0\). Now, \(g_m\) is itself merely a perturbation of the Minkowski metric, pulled back by a diffeomorphism, see (2.10). It is convenient for the normal operator analysis at \(I^+\) in §§5.2 and 7 to relate this to the usual presentation of the Minkowski metric \(\underline{g}{}=dt^2-dx^2\) on \(\mathbb {R}^4\) in \(U=\{t>\tfrac{2}{3} r\}\):

Lemma 3.11

The metric \(\underline{g}{}\) lies in \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}(U;S^2\,{}^{{\text {sc}}}T^*\,{}^m\overline{\mathbb {R}^4})\) for the index set \(\mathcal {E}_{\mathrm{log}}\) defined in (2.36), and \(\underline{g}{}-g_m\in \mathcal {A}_{\text {phg}}^{\mathcal {E}'_{\mathrm{log}}}(U;S^2\,{}^{{\text {sc}}}T^*\,{}^m\overline{\mathbb {R}^4})\subset \rho ^{1-0}H_{{\text {b}}}^\infty (U;S^2\,{}^{{\text {sc}}}T^*\,{}^m\overline{\mathbb {R}^4})\).

The failure of smoothness (for \(m\ne 0\)) of \(\underline{g}{}\) is due to the logarithmic correction, see (2.5), in the definition of the compactification \({}^m\overline{\mathbb {R}^4}\). On the radial compactification \({}^0\overline{\mathbb {R}^4}\) on the other hand, \(\underline{g}{}\) is a smooth scattering metric.

Proof of Lemma 3.11

In the region \(C_2\) defined in (2.8), \(g_m=\underline{g}{}\) is smooth, see the discussion after equation (2.10). In the region \(C_1\), see equation (2.6), the spatial part is a smooth symmetric scattering 2-tensor on \({}^m\overline{\mathbb {R}^4}\). In the region \(t\ge \tfrac{2}{3} r\) and for large r, the claim follows from Lemma 2.8 in that region. \(\square \)

Define

$$\begin{aligned} \underline{L}{} := \tfrac{1}{2}\Box _{\underline{g}{}} + (\underline{\widetilde{\delta }{}}{}^*-\delta _{\underline{g}{}}^*)\delta _{\underline{g}{}}G_{\underline{g}{}}, \ \ (\underline{\widetilde{\delta }{}}{}^*-\delta _{\underline{g}{}}^*)u := 2\gamma t^{-1}\,dt\otimes _s u - \gamma t^{-1}(\iota _{\nabla ^{\underline{g}{}}t}u)\underline{g}{}, \end{aligned}$$
(3.27)

cf. the definition (3.3); this is the linearization of \(\text {Ric}(g)-\underline{\widetilde{\delta }{}}{}^*\underline{\Upsilon }{}(g)\) around \(g=\underline{g}{}\), where \(\underline{\Upsilon }{}(g)\) is defined like \(\Upsilon (g)\) in (3.1) with \(\underline{g}{}\) in place of \(g_m\). Using Lemma 3.11, one finds \(\underline{L}{}\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\cdot \text {Diff}_{\text {b}}^2(U;S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4})\). Furthermore,

$$\begin{aligned} \underline{L}{}-L_0\in \mathcal {A}_{\text {phg}}^{\mathcal {E}'_{\mathrm{log}}}(U)\cdot \text {Diff}_{\text {b}}^2(U;S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}); \end{aligned}$$
(3.28)

but \(\partial _v\in \rho _I^{-1}\mathcal {V}_{\text {b}}(M)\), while derivatives along b-vector fields tangent to \(S^+\) lift to elements of \(\mathcal {V}_{\text {b}}(M)\); thus,

$$\begin{aligned} \underline{L}{}-L_0\in \rho _I^{-1-0}\rho _+^{1-0}H_{{\text {b}}}^\infty \cdot \text {Diff}_{\text {b}}^2\ \ (\text{ near }\ I^+\subset M). \end{aligned}$$
(3.29)

4 Global background estimate

We prove a global energy estimate for solutions of the linearized equation \(L_h u=f\) with \(h\in \mathcal {X}^\infty \), and show that u lies in a weighted conormal space provided f does; recall here the definition (3.20) of \(L_h\). The weak asymptotics of u at the boundaries \(I^0\), \(\mathscr {I}^+\), and \(I^+\) can be improved subsequently using normal operator arguments in §5. At \(\mathscr {I}^+\), the estimate loses a weight of \(\rho _I^{1/2}\) for general b-derivatives, as we will explain in detail in §4.1. We capture this using the function space \(H_{\mathscr {I}}^1\):

Definition 4.1

Let \(E\rightarrow \overline{\mathbb {R}^4}\) be a smooth vector bundle. With \(\mathcal {M}_{\beta ^*E}\) defined in §2.4, let

$$\begin{aligned} H_{\beta }^1(M;\beta ^*E)&:= \{ u\in L^2_{\text {b}}(M;\beta ^*E) :\mathcal {M}_{\beta ^*E}u\subset L^2_{\text {b}}(M;\beta ^*E) \}, \\ H_{\mathscr {I}}^1(M;\beta ^*E)&:= \{ u\in H_{\beta }^1(M;\beta ^*E) :\rho _I^{1/2}\text {Diff}_{\text {b}}^1(M;\beta ^*E)u\subset L^2_{\text {b}}(M;\beta ^*E) \}. \end{aligned}$$

For \(k\in \mathbb {N}_0\) and \(\bullet =\beta ,\mathscr {I}\), define

$$\begin{aligned} H_{\bullet ,{\text {b}}}^{1,k}(M;\beta ^*E) := \{ u\in L^2_{\text {b}}(M;\beta ^*E) :\text {Diff}_{\text {b}}^k(M;\beta ^*E)u\subset H_{\bullet }^1(M;\beta ^*E) \}. \end{aligned}$$

If \(\{A_j\}\subset \mathcal {M}_{\beta ^*E}\) is a finite set spanning \(\mathcal {M}_{\beta ^*E}\) over \(\mathcal {C}^\infty (M)\), we define norms on these spaces by

$$\begin{aligned} \Vert u \Vert _{H_{\beta ,{\text {b}}}^{1,k}(M;\beta ^*E)}&:= \Vert u\Vert _{H_{{\text {b}}}^k(M;\beta ^*E)} + \sum _j \Vert A_j u\Vert _{H_{{\text {b}}}^k(M;\beta ^*E)}, \\ \Vert u \Vert _{H_{\mathscr {I},{\text {b}}}^{1,k}(M;\beta ^*E)}&:= \Vert u\Vert _{H_{\beta ,{\text {b}}}^{1,k}(M;\beta ^*E)} + \Vert \rho _I^{1/2}u\Vert _{H_{{\text {b}}}^{k+1}(M;\beta ^*E)}. \end{aligned}$$

Note that for \(u\in H_{\beta }^1\), we automatically have \(\rho _I\text {Diff}_{\text {b}}^1(M)u\subset L^2_{\text {b}}\) by Lemma 2.13(1), so the subspace \(H_{\mathscr {I}}^1\subset H_{\beta }^1\) encodes a \(\rho _I^{1/2}\) improvement over this. Away from \(\mathscr {I}^+\), the spaces \(H_{\beta ,{\text {b}}}^{1,k}\) and \(H_{\mathscr {I},{\text {b}}}^{1,k}\) are the same as \(H_{{\text {b}}}^{k+1}\).

Fix a vector field

$$\begin{aligned} \partial _\nu \in \mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4}) \end{aligned}$$
(4.1)

transversal to the Cauchy surface \(\Sigma \); we extend the action of \(\partial _\nu \) to sections u of a vector bundle E using an arbitrary fixed b-connection \(d^E\) on E, see (2.45), by setting \(\partial _\nu u:=(d^E u)(\partial _\nu )\).

Theorem 4.2

Fix weights \(b_0,b'_I,b_I,b_+\) as in Definition 3.1, let \(\gamma >b'_I\) in the definition (3.3) of \(\widetilde{\delta }{}^*\), and fix \(a_0,a_I,a'_I\in \mathbb {R}\) satisfying

$$\begin{aligned} a_I<a'_I<a_0, \quad a_I<0, \quad a'_I<a_I+b'_I. \end{aligned}$$

Then there exists \(a_+\in \mathbb {R}\) such that the following holds for all \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) which are small in \(\mathcal {X}^3\): for \(k\in \mathbb {N}\), \(u_j\in \rho _0^{a_0}H_{{\text {b}}}^{k-j}(\Sigma )\), \(j=0,1\), and \(f\in H_{{\text {b}}}^{k-1;a_0,a_I-1,a_+}(M;\beta ^*S^2)\) with \(\pi _0 f\in H_{{\text {b}}}^{k-1;a_0,a'_I-1,a_+}(M;\beta ^*S^2)\), the linear wave equation

$$\begin{aligned} L_h u = f, \ \ (u,\partial _\nu u)|_\Sigma = (u_0,u_1), \end{aligned}$$
(4.2)

has a unique global solution u satisfying

$$\begin{aligned}&\Vert u \Vert _{\rho _0^{a_0}\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}(M;\beta ^*S^2)} + \Vert \pi _0 u \Vert _{\rho _0^{a_0}\rho _I^{a'_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}(M;\beta ^*S^2)} \nonumber \\&\quad \le C\Bigl (\Vert u_0 \Vert _{\rho _0^{a_0}H_{{\text {b}}}^k} + \Vert u_1 \Vert _{\rho _0^{a_0}H_{{\text {b}}}^{k-1}} + \Vert f \Vert _{H_{{\text {b}}}^{k-1;a_0,a_I-1,a_+}} + \Vert \pi _0 f \Vert _{H_{{\text {b}}}^{k-1;a_0,a'_I-1,a_+}}\Bigr ). \end{aligned}$$
(4.3)

In particular, if the assumptions on \(u_j\) and f hold for all k, then

$$\begin{aligned} u\in H_{{\text {b}}}^{\infty ;a_0,a_I,a_+},\ \ \pi _0 u\in H_{{\text {b}}}^{\infty ;a_0,a'_I,a_+}. \end{aligned}$$
(4.4)

We refer the reader to Remark 1.9 for a translation of the memberships (4.4) to pointwise decay estimates. (For obtaining pointwise decay for any fixed number of derivatives of u, the estimate of (4.3) for sufficiently large k is of course sufficient).
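
For orientation, the weight conditions of Theorem 4.2 can be unpacked as follows. The numerical values below are hypothetical; in particular we assume, purely for illustration, that \(b'_I=\tfrac{1}{2}\) is admissible in Definition 3.1. Given \(a_0\) and \(a_I<\min (a_0,0)\), the admissible \(a'_I\) form the interval

$$\begin{aligned} a_I< a'_I < \min (a_0,\,a_I+b'_I), \end{aligned}$$

which is nonempty since \(a_I<a_0\) and \(b'_I>0\). For instance, with \(b'_I=\tfrac{1}{2}\),

$$\begin{aligned} a_I=-\tfrac{1}{4},\quad a'_I=0,\quad a_0=\tfrac{1}{4}: \qquad -\tfrac{1}{4}<0<\tfrac{1}{4},\quad -\tfrac{1}{4}<0,\quad 0<-\tfrac{1}{4}+\tfrac{1}{2}, \end{aligned}$$

and any \(\gamma >\tfrac{1}{2}\) then satisfies the requirement \(\gamma >b'_I\).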

For completeness, we prove a version of such a background estimate with an explicit weight \(a_+\) in §4.3. As we will see in §5.2, this allows us to give an explicit bound on the number of derivatives needed to close the nonlinear iteration in §6. A nonexplicit value of \(a_+\) as in Theorem 4.2 is sufficient to prove Theorem 1.1 if one is content with a nonexplicit value for N (see footnote 32). We will prove Theorem 4.2 by means of energy estimates, as outlined in §1.1.1. Microlocal techniques on \(\overline{\mathbb {R}^4}\) on the other hand, as employed in [13], would work well away from the light cone at infinity \(S^+\), but since the coefficients of \(L_h\) are singular at \(S^+\), it is a delicate question how ‘microlocal’ the behavior of \(L_h\) is at \(S^+\), i.e. whether or not and what strengths of singularities could ‘jump’ from one part of the b-cotangent bundle to another at \(S^+\); since we do not need precise microlocal control of \(L_h\) for present purposes, we do not study this further.

Since dt is globally timelike for \(g=g_m+\rho h\) provided \(\rho h\) is small in \(\rho \mathcal {X}^3\subset L^\infty \), existence and uniqueness of a solution \(u\in H_{{\text {loc}}}^k(M\cap \mathbb {R}^4;S^2 T^*\mathbb {R}^4)\) are immediate, together with an estimate for any compact set \(K\Subset M\cap \mathbb {R}^4\),

$$\begin{aligned} \Vert u \Vert _{H^k(K)} \le C_K ( \Vert u_0 \Vert _{\rho _0^{a_0}H_{{\text {b}}}^k} + \Vert u_1 \Vert _{\rho _0^{a_0}H_{{\text {b}}}^{k-1}} + \Vert f \Vert _{H_{{\text {b}}}^{k-1;a_0,a_I,a_+}} ), \end{aligned}$$
(4.5)

where one could equally well replace the norms on the right by standard Sobolev norms on sufficiently large compact subsets of \(M\cap \mathbb {R}^4\) depending on K, due to the domain of dependence properties of solutions of (4.2).

Using Lemma 3.10, it is straightforward to prove (4.3) near any compact subset of \((I^0)^\circ \), where \(H_{\mathscr {I},{\text {b}}}^{1,k-1}\) is the same as \(H_{{\text {b}}}^k\). Let us define \(\rho _0,\rho _I,\rho \) near \(I^0\) as in equation (2.25). Fix \(\epsilon >0\), and define for \(\delta ,\eta >0\) small

$$\begin{aligned} U := \{ \rho _I>\epsilon ,\ \rho _0-\eta \rho _I<\delta \} \subset M, \end{aligned}$$

which for \(\epsilon \) small is a neighborhood of any fixed compact subset of \(M\cap (I^0)^\circ \). (Since \(\rho _I\) is bounded from above, U can be made to lie in any fixed neighborhood \(\{\rho _0<\delta _0\}\) of \(I^0\) provided \(\delta \) and \(\eta \) are sufficiently small). In view of (3.15), we have , hence the calculation (2.26) gives

(4.6)

Thus, \(d\rho _I\) and \(d(\rho _0-\eta \rho _I)\) are timelike in U once we fix \(\delta ,\eta >0\) to be sufficiently small, and thus U is bounded by \(\Sigma \cap U\) and two spacelike hypersurfaces, \(U^\partial _1=\{\rho _I=\epsilon \}\) and \(U^\partial _2=\{\rho _0-\eta \rho _I=\delta \}\) (as well as by \(U\cap \partial M\) at infinity), see Figure 9.

Fig. 9

The domain U with its spacelike boundaries \(U_1^\partial \) and \(U_2^\partial \). We draw \(I^0\) at a 45 degree angle as the level sets of the chosen boundary defining function \(\rho _0\) are approximately null (namely, \(|d\rho _0|_{G_{0,{\text {b}}}}^2=0\)). The level sets of \(\rho _I\) are spacelike in \(\rho _I>0\), but not uniformly so as \(\rho _I\rightarrow 0\).

Proposition 4.3

Under the assumptions of Theorem 4.2, we have

$$\begin{aligned} \Vert u \Vert _{\rho _0^{a_0}H_{{\text {b}}}^k(U)} \le C\bigl (\Vert u_0\Vert _{\rho _0^{a_0}H_{{\text {b}}}^k(\Sigma \cap U)}+\Vert u_1\Vert _{\rho _0^{a_0}H_{{\text {b}}}^{k-1}(\Sigma \cap U)} + \Vert f\Vert _{\rho _0^{a_0}H_{{\text {b}}}^{k-1}(U)}\bigr ). \end{aligned}$$
(4.7)

Proof

We give a positive commutator proof of this standard estimate, highlighting the connection to the more often encountered fashion in which energy estimates are phrased [37]. Let us work in a trivialization \({}^{{\text {b}}}T^*\overline{\mathbb {R}^4}\cong \overline{\mathbb {R}^4}\times \mathbb {R}^4\), and fix the fiber inner product to be the Euclidean metric in this trivialization. For proving the case \(k=1\) of the proposition, we set \(L:=L_h\); it will be convenient however for showing higher regularity to allow \(L\in \text {Diff}_{\text {b}}^2+\rho _0^{1-0}H_{{\text {b}}}^\infty \text {Diff}_{\text {b}}^2\) to be any principally scalar operator with \(\sigma _{{\text {b}},2}(L)=\tfrac{1}{2}G_{\text {b}}\), acting on \(\mathbb {C}^N\)-valued functions for some \(N\in \mathbb {N}\); we equip \(\mathbb {C}^N\) with the standard Hermitian inner product. (One may also phrase the proof invariantly, i.e. not using global bundle trivializations, as we shall do in §§4.1 and 4.2 for conceptual clarity).

We will use a positive commutator argument: let \(V=-\nabla \rho _I\in \mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\), with \(\nabla \) defined with respect to \(g_{\text {b}}\); this is future timelike. For \(\digamma >0\) chosen later, let \(w=\rho _0^{-a_0}e^{\digamma \rho _I}\), and let \(\mathbb {1}_U\) denote the characteristic function of U. Put \(W=\mathbb {1}_U w^2 V\). Write \(L=L_2+L_1\), where \(L_2=\tfrac{1}{2}\Box _{g_{\text {b}}}\otimes 1_{10\times 10}\), \(L_1\in (\mathcal {C}^\infty +\rho _0^{1-0}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^1\). We then calculate the commutator

$$\begin{aligned} 2 {\text {Re}}\langle \mathbb {1}_U w f, \mathbb {1}_U w V u\rangle = 2 {\text {Re}}\langle L u, W u\rangle = \langle A u, u\rangle + 2{\text {Re}}\langle \mathbb {1}_U wL_1 u, \mathbb {1}_U w V u \rangle \end{aligned}$$
(4.8)

using the \(L^2_{\text {b}}\) inner product, where \(A=[L_2,W]+(W+W^*)L_2\). A simple calculation gives \(\sigma _{{\text {b}},2}(A)(\xi )=K_W(\xi ,\xi )\), where

$$\begin{aligned} K_W := -\tfrac{1}{2}(\mathcal {L}_W G_{\text {b}}+ ({\text {div}}_{g_{\text {b}}}W)G_{\text {b}}). \end{aligned}$$
(4.9)

(The K-current is often given in its covariant form \(\tfrac{1}{2}(\mathcal {L}_W g_{\text {b}}-({\text {div}}_{g_{\text {b}}}W)g_{\text {b}})\)). Therefore, \(A=d^*K_W d\), since the principal symbols of both sides agree, hence the difference is a scalar (see footnote 33) first order b-differential operator which has real coefficients and is symmetric, and is thus in fact of order zero; since it annihilates constant vectors in \(\mathbb {C}^N\), the difference vanishes. Differentiation of the exponential weight in W upon evaluating \(K_W\) will produce the main positive term into which all other terms can be absorbed. Indeed, the identity \(\mathcal {L}_{f V}G_{\text {b}}=f\mathcal {L}_V G_{\text {b}}- 2\nabla f\otimes _s V\) for \(V\in \mathcal {V}_{\text {b}}\) and \(f\in \mathcal {C}^\infty \) gives

$$\begin{aligned} K_{f V} = T(\nabla f,V) + f K_V, \end{aligned}$$
(4.10)

where

$$\begin{aligned} T(X,Y)=X\otimes _s Y-\tfrac{1}{2}g_{\text {b}}(X,Y)G_{\text {b}}\end{aligned}$$

denotes the (abstract) energy-momentum tensor. (The energy-momentum tensor of a scalar wave u, say, is given by \(T(X,Y)(du,du)\)). Therefore, \(K_W=w^2(2\digamma \mathbb {1}_U K_0+\mathbb {1}_U K_1+K_2)\), where

$$\begin{aligned} K_0 = T(\nabla \rho _I,V), \ \ K_1 = -2 a_0 T\bigl (\tfrac{\nabla \rho _0}{\rho _0},V\bigr ), \ \ K_2 = T(\nabla \mathbb {1}_U,V). \end{aligned}$$
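
For the reader's convenience, here is a sketch of how (4.10) follows from the definition (4.9) and the stated Lie derivative identity, using only \({\text {div}}_{g_{\text {b}}}(f V)=f\,{\text {div}}_{g_{\text {b}}}V+V f\) and \(g_{\text {b}}(\nabla f,V)=V f\):

$$\begin{aligned} K_{f V}&= -\tfrac{1}{2}\bigl (f\mathcal {L}_V G_{\text {b}}-2\nabla f\otimes _s V + (f\,{\text {div}}_{g_{\text {b}}}V + V f)G_{\text {b}}\bigr ) \\&= f K_V + \nabla f\otimes _s V - \tfrac{1}{2}g_{\text {b}}(\nabla f,V)G_{\text {b}} = f K_V + T(\nabla f,V). \end{aligned}$$

Applied with \(f=\mathbb {1}_U w^2\) and \(w=\rho _0^{-a_0}e^{\digamma \rho _I}\), the linearity of T in its first argument gives

$$\begin{aligned} \nabla (w^2) = w^2\Bigl (2\digamma \nabla \rho _I - 2 a_0\tfrac{\nabla \rho _0}{\rho _0}\Bigr ), \quad T(\nabla (\mathbb {1}_U w^2),V) = w^2\bigl (2\digamma \mathbb {1}_U K_0 + \mathbb {1}_U K_1 + K_2\bigr ). \end{aligned}$$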

Since \(\nabla \rho _I\) is past timelike, the main term \(K_0\) is negative definite; \(K_2\) has support in \(\partial U\setminus \partial M\), so \(\nabla \mathbb {1}_U\) being past timelike at \(U^\partial _1\) and \(U^\partial _2\), \(K_2\) has the same sign as \(K_0\) there. Lastly, \(K_1\) has no definite sign, but can be absorbed into \(K_0\) by choosing \(\digamma >0\) large: indeed, \(|T(\tfrac{\nabla \rho _0}{\rho _0},V)(\xi ,\xi )| \le -C T(\nabla \rho _I,V)\) for some constant C depending only on K, since \(g_{\text {b}}\) is a b-metric. Thus, (4.8) gives the estimate

$$\begin{aligned} \langle \mathbb {1}_U w (-2\digamma K_0-K_1)d u, \mathbb {1}_U d u\rangle&\le 2(\Vert \mathbb {1}_U w V u\Vert ^2+\Vert \mathbb {1}_U w L_1 u\Vert ^2) \nonumber \\&\quad + \Vert \mathbb {1}_U w f\Vert ^2 + C\Vert \mathbb {1}_U w(d u_0, u_1) \Vert ^2. \end{aligned}$$
(4.11)

In order to control u itself, consider the ‘commutator’

$$\begin{aligned} 2{\text {Re}}\langle \mathbb {1}_U w u,\mathbb {1}_U w V u\rangle = 2{\text {Re}}\langle u, W u\rangle = \langle -\mathbb {1}_U w({\text {div}}V)u, \mathbb {1}_U w u\rangle - \langle V(\mathbb {1}_U w^2)u,u\rangle , \end{aligned}$$
(4.12)

where \(V(\mathbb {1}_U w^2)=2\digamma \mathbb {1}_U w^2(V\rho _I)-2 a_0\mathbb {1}_U w^2\tfrac{V\rho _0}{\rho _0}+w^2 V(\mathbb {1}_U)\). In the first, main, term, \(V\rho _I=-|d\rho _I|_{g_{\text {b}}}^2\le -c_0<0\) has a strictly negative upper bound on U; the third term gives \(\delta \)-distributions at \(\partial U\) with the same sign as this main term at \(U^\partial _1\) and \(U^\partial _2\) since V is outward pointing there. Choosing \(\digamma \) large to absorb the contribution of the second term, we get

$$\begin{aligned} c_0\digamma \Vert \mathbb {1}_U w u\Vert ^2 \le \eta \digamma \Vert \mathbb {1}_U w u\Vert ^2 + C_\eta \digamma ^{-1}\Vert \mathbb {1}_U w V u\Vert ^2 + C\Vert \mathbb {1}_U w u_0 \Vert ^2, \end{aligned}$$

so fixing \(\eta =c_0/2\), this gives \(\Vert \mathbb {1}_U w u\Vert ^2 \le C\digamma ^{-2}\Vert \mathbb {1}_U w V u\Vert ^2+C_\digamma \Vert \mathbb {1}_U w u_0\Vert ^2\). Adding \(C'\) times this to (4.11) yields

$$\begin{aligned}&\langle \mathbb {1}_U w(-2\digamma K_0-K_1)d u,\mathbb {1}_U d u\rangle + C' \Vert \mathbb {1}_U w u\Vert ^2 \\&\quad \le (2+C C'\digamma ^{-2})\Vert \mathbb {1}_U w V u\Vert ^2 + 2\Vert \mathbb {1}_U w L_1 u\Vert ^2 \\&\qquad + C_\digamma \bigl (\Vert \mathbb {1}_U w f\Vert ^2+(C+C')\Vert \mathbb {1}_U w(u_0,d u_0,u_1)\Vert ^2\bigr ). \end{aligned}$$

Fixing \(C'\) sufficiently large and then \(\digamma >0\) large, we can absorb the first two terms on the right into the first term on the left hand side, using that \(-2\digamma K_0-K_1>-\digamma K_0\) for large \(\digamma \). This gives (4.7) for \(k=1\).
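
The absorption step behind the bound on \(\Vert \mathbb {1}_U w u\Vert ^2\) can be spelled out as follows. This is a sketch in which we take \(C_\eta =\eta ^{-1}\), one admissible choice coming from the Cauchy–Schwarz and Young inequalities, and C denotes the constant in front of the initial data term.

$$\begin{aligned}&2|\langle \mathbb {1}_U w u,\mathbb {1}_U w V u\rangle | \le \eta \digamma \Vert \mathbb {1}_U w u\Vert ^2 + \eta ^{-1}\digamma ^{-1}\Vert \mathbb {1}_U w V u\Vert ^2, \\&\eta =\tfrac{c_0}{2}: \quad \tfrac{c_0}{2}\digamma \Vert \mathbb {1}_U w u\Vert ^2 \le \tfrac{2}{c_0}\digamma ^{-1}\Vert \mathbb {1}_U w V u\Vert ^2 + C\Vert \mathbb {1}_U w u_0\Vert ^2, \\&\Vert \mathbb {1}_U w u\Vert ^2 \le \tfrac{4}{c_0^2}\digamma ^{-2}\Vert \mathbb {1}_U w V u\Vert ^2 + \tfrac{2 C}{c_0\digamma }\Vert \mathbb {1}_U w u_0\Vert ^2. \end{aligned}$$

The gain of \(\digamma ^{-2}\) in front of \(\Vert \mathbb {1}_U w V u\Vert ^2\) is what allows this term to be absorbed after multiplying by the large constant \(C'\).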

We now proceed by induction, assuming (4.7) holds for some value of k for all operators L of the form considered above. If \(L u=f\), let \(X\in (\text {Diff}_{\text {b}}^1(\overline{\mathbb {R}^4}))^N\) denote an N-tuple of b-differential operators which generate \(\text {Diff}_{\text {b}}^1(\overline{\mathbb {R}^4})\) over \(\mathcal {C}^\infty (\overline{\mathbb {R}^4})\); writing \([L,X]=L'\cdot X\) for \(L'\) an N-tuple of operators in \((\mathcal {C}^\infty +\rho _0^{1-0}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^1\), we then have \((L-L')(X u)=X f\). Applying (4.7) to this equation, we obtain the estimate (4.7) for \(L u=f\) itself with k replaced by \(k+1\). \(\square \)
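
The commutation step in the induction can be written out as follows; this is a sketch in which the indices i, j labeling the components of X and the entries \(L'_{i j}\) of \(L'\) are notation introduced here for illustration.

$$\begin{aligned} L(X_i u) = X_i(L u) + [L,X_i]u = X_i f + \sum _j L'_{i j}X_j u \quad \Longrightarrow \quad (L-L')(X u) = X f. \end{aligned}$$

Since \(L'\) is of first order with coefficients in \(\mathcal {C}^\infty +\rho _0^{1-0}H_{{\text {b}}}^\infty \), the operator \(L-L'\) acting on the vector \(X u\) has the same principal symbol as L and lower order terms of the admissible class, so the inductive hypothesis applies to it.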

Given the structure of the operator \(L_h\) on the manifold with corners M as described in §3.3, it is natural to prove the estimate (4.3) in steps: in §4.1, we propagate the control given by Proposition 4.3 uniformly up to a neighborhood of the past corner \(I^0\cap \mathscr {I}^+\) of null infinity and thus into \((\mathscr {I}^+)^\circ \). In §4.2, we prove the energy estimate uniformly up to \(I^+\); the last estimate cannot be localized near the corner \(\mathscr {I}^+\cap I^+\) since typically limits of future-directed null-geodesics tending to \(\mathscr {I}^+\cap I^+\) pass through points in \(I^+\) far from \(\mathscr {I}^+\).

4.1 Estimate up to null infinity

We work near the past corner \(I^0\cap \mathscr {I}^+\) of the radiation field; recall the definition of the boundary defining functions \(\rho _0\) and \(\rho _I\) of \(I^0\) and \(\mathscr {I}^+\) from (2.25), and let \(\rho =r^{-1}\). At \(\mathscr {I}^+\), we need to describe \(G_{\text {b}}\) more precisely than was needed near \((I^0)^\circ \); we make extensive use of the structures defined in §2.4. Equations (3.15) and (2.26) give

$$\begin{aligned} G_{\text {b}}= G_{0,{\text {b}}} + G_{1,{\text {b}}} + \widetilde{G}_{\text {b}},\ \ G_{1,{\text {b}}}:=\rho ^{-2}g_m^{-1}-G_{0,{\text {b}}}\in \mathcal {C}^\infty (M;S^2\,{}^{\beta }TM), \end{aligned}$$
(4.13)

with \(G_{0,{\text {b}}}\) as before, and

$$\begin{aligned} \widetilde{G}_{\text {b}}\in \rho _0^{1+b_0}\rho _I^{-1+b'_I}H_{{\text {b}}}^\infty (M;S^2\,{}^{\beta }TM+\rho _I\,S^2\,{}^{{\text {b}}}TM). \end{aligned}$$

Dually, equation (2.27) gives

$$\begin{aligned} g_{\text {b}}\in (\mathcal {C}^\infty +\rho _0^{1+b_0}\rho _I^{b'_I}H_{{\text {b}}}^\infty )(M;S^2({}^{\beta }TM)^\perp + \rho _I\,S^2\,{}^{{\text {b}}}T^*M) \end{aligned}$$
(4.14)

where the smooth term is .

Fix \(\beta \in (0,b'_I)\). For small \(\epsilon >0\), we define the domain

$$\begin{aligned} U_\epsilon := \{ \rho _I< \epsilon ,\ \rho _0-\rho _I^\beta<1 \} \subset M,\ \ U_\epsilon ^0 := U_\epsilon \cap \{ \tfrac{1}{2}\epsilon< \rho _I < \epsilon \}, \end{aligned}$$
(4.15)

see Figure 10. Thus, \(U_\epsilon \) is bounded by \(I^0\), \(\mathscr {I}^+\), \(\{\rho _I=\epsilon \}\), and \(U^\partial _\epsilon =\{\rho _0-\rho _I^\beta =1,\ \rho _I<\epsilon \}\). At \(U^\partial _\epsilon \), we use (4.6) and (4.13) to compute

$$\begin{aligned} |d(\rho _0-\rho _I^\beta )|_{G_{\text {b}}}^2 \in 2\beta \rho _I^{-1+\beta }(\rho _0+\beta \rho _I^\beta ) + \rho _I^{2\beta }\mathcal {C}^\infty + \rho _0^{1-0}\rho _I^{-1+b'_I}H_{{\text {b}}}^\infty , \end{aligned}$$
(4.16)

hence \(d(\rho _0-\rho _I^\beta )\) is timelike at \(U_\epsilon ^\partial \), i.e. \(U_\epsilon ^\partial \) is spacelike, for small enough \(\epsilon \). As in the proof of Proposition 4.3, the main term is the K-current of a timelike vector field with suitable weights:

Lemma 4.4

Fix \(c_V\in \mathbb {R}\), and let \(V:=-(1+c_V)\rho _I\partial _{\rho _I}+\rho _0\partial _{\rho _0}\) and \(W:=\rho _0^{-2 a_0}\rho _I^{-2 a_I} V\). Then

(4.17)

Furthermore,

$$\begin{aligned} {\text {div}}_{g_{\text {b}}}W&\in -2\rho _0^{-2 a_0}\rho _I^{-2 a_I}\bigl (1+2(a_0-a_I)+c_V(1-2 a_I) \bigr ) \nonumber \\&\quad + \rho _0^{-2 a_0}\rho _I^{-2 a_I+1}(\mathcal {C}^\infty +\rho _0^{1+b_0}\rho _I^{-1+b'_I}H_{{\text {b}}}^\infty ). \end{aligned}$$
(4.18)

Here, \(\rho _I^{-1}|V|_{g_{\text {b}}}^2\in 2 c_V+\rho _I\,\mathcal {C}^\infty +\rho _0^{1+b_0}\rho _I^{b'_I}H_{{\text {b}}}^\infty \), so V is timelike for \(c_V>0\). This calculation also shows that the level sets of \(\rho _I\) are spacelike in \(U_\epsilon \). The term \(\rho _I K_W(d u,d u)\) will provide control of u in \(\rho _0^{a_0}\rho _I^{a_I}H_{\mathscr {I}}^1\) (modulo control of \(|u|^2\) itself, which we obtain by integration), similarly to (1.19).

Remark 4.5

For easier comparison with energy estimates expressed in standard coordinates on \(\mathbb {R}^4\), consider the special case \(m=0\), so \(\rho _0=(r-t)^{-1}\) and \(\rho _I=(r-t)/r\); then \(\rho _0\partial _{\rho _0}=-(r\partial _r+t\partial _t)\) (scaling) and \(\rho _I\partial _{\rho _I}=-r(\partial _t+\partial _r)\) (weighted outgoing derivative). Thus, the multiplier vector field W in \(t<r\), \(r>0\), equals

$$\begin{aligned} W = r^{2 a_I+1}(r-t)^{2(a_0-a_I)}\bigl (c_V\partial _r + (c_V+\tfrac{r-t}{r})\partial _t\bigr ). \end{aligned}$$
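
The formula for W can be verified directly. In the case \(m=0\) we have \(r=\rho _0^{-1}\rho _I^{-1}\) and \(t=r-\rho _0^{-1}\), so that \(\rho _0\partial _{\rho _0}\) maps \((r,t)\) to \((-r,-t)\) and \(\rho _I\partial _{\rho _I}\) maps \((r,t)\) to \((-r,-r)\), which gives the two expressions stated above. Inserting them into the vector field \(V=-(1+c_V)\rho _I\partial _{\rho _I}+\rho _0\partial _{\rho _0}\) of Lemma 4.4:

$$\begin{aligned} V&= (1+c_V)r(\partial _t+\partial _r) - (r\partial _r+t\partial _t) = r\bigl (c_V\partial _r + (c_V+\tfrac{r-t}{r})\partial _t\bigr ), \\ \rho _0^{-2 a_0}\rho _I^{-2 a_I}&= (r-t)^{2 a_0}\Bigl (\frac{r-t}{r}\Bigr )^{-2 a_I} = r^{2 a_I}(r-t)^{2(a_0-a_I)}. \end{aligned}$$

Multiplying the two lines yields the stated expression for \(W=\rho _0^{-2 a_0}\rho _I^{-2 a_I}V\).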

Proof of Lemma 4.4

Recall that \(K_W=\tfrac{1}{2}(\pi -\tfrac{1}{2}({\text {tr}}_{g_{\text {b}}}\pi )G_{\text {b}})\), \(\pi :=-\mathcal {L}_W G_{\text {b}}\). Since \(V\in \mathcal {M}_{\underline{\mathbb {C}}{}}\), Lemma 2.13(2) shows that \(\widetilde{\pi }:=-\mathcal {L}_W \widetilde{G}_{\text {b}}\), expressed using vector field commutators, lies in the remainder space in (4.17); using (4.14), this implies \({\text {tr}}_{g_{\text {b}}}\widetilde{\pi } \in \rho _0^{-2 a_0+1+b_0}\rho _I^{-2 a_I+b'_I}H_{{\text {b}}}^\infty \), so \(({\text {tr}}_{g_{\text {b}}}\widetilde{\pi })G_{\text {b}}\) also lies in the remainder space. Similarly, \(G_{1,{\text {b}}}\) contributes a (weighted) smooth remainder term to \(K_W\). Lastly, for \(\pi _0=-\mathcal {L}_W G_{0,{\text {b}}}\), the term \(\tfrac{1}{2}(\pi _0-\tfrac{1}{2}({\text {tr}}_{g_{\text {b}}}\pi _0)G_{\text {b}})\) contributes the main term, i.e. the first line of (4.17) after a short calculation, as well as two more error terms, one from \(\widetilde{G}_{\text {b}}\), the other coming from the nonsmooth remainder term in (4.14). The calculation (4.18) drops out as a by-product of this, and can also be recovered by \({\text {div}}_{g_{\text {b}}}W=-{\text {tr}}_{g_{\text {b}}}K_W\). \(\square \)

In order to get the sharp weights (see footnote 34) for the decaying components \(\pi _0 u\) of u at \(\mathscr {I}^+\) in Theorem 4.2, we need to exploit the sign of the leading subprincipal part of \(L_h\) at \(\mathscr {I}^+\), given by the term involving \(\rho ^{-1}A_h\partial _1\) in Lemma 3.8, in the decoupled equation for \(\pi _0 u\), see (3.26a) for the model. We thus prove:

Lemma 4.6

Define \(W=\rho _0^{-2 a_0}\rho _I^{-2 a'_I}(\rho _0\partial _{\rho _0}-(1+c_V)\rho _I\partial _{\rho _I})\) similarly to Lemma 4.4. Let \(\gamma \in \mathbb {R}\), and fix \(a_0,a'_I\in \mathbb {R}\) such that \(a'_I<\min (\gamma ,a_0)\). Then for small \(c_V>0\), there exists a constant \(C>0\) such that

(4.19)

in the sense of quadratic forms, in \(U_\epsilon \), \(\epsilon >0\) small.

Proof

Using the expression (2.26) for \(\rho _0^{-1}\rho _I^{-1}\partial _1\), we have

$$\begin{aligned}&\rho _0^{2 a_0}\rho _I^{2 a'_I+1}W\otimes _s\rho ^{-1}\partial _1 \\&\quad \in (\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})^2-c_V\rho _I\partial _{\rho _I}\otimes _s(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}) + \rho _I\,\mathcal {C}^\infty (M;{}^{\beta }TM). \end{aligned}$$

We can then calculate the leading term of \(\rho _0^{2 a_0}\rho _I^{2 a'_I+1}\) times the left hand side of (4.19) by completing the square:

The first term is the negative of a square, and so is the second term if we choose \(c_V>0\) sufficiently small; reducing \(c_V\) further if necessary, the coefficient of the last term is negative as well, finishing the proof. \(\square \)

Remark 4.7

For the value of \(c_V\) determined in the proof, we have \({\text {div}}_{g_{\text {b}}}W\le -C\rho _0^{-2 a_0}\rho _I^{-2 a'_I}\) near \(\mathscr {I}^+\) by inspection of the expression (4.18).
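
The inspection of (4.18) amounts to the following elementary check, sketched here with \(a_I\) replaced by \(a'_I\) as appropriate for the vector field W of Lemma 4.6. Since \(a'_I<a_0\), and provided \(c_V>0\) is also small enough that \(c_V|1-2 a'_I|\le \tfrac{1}{2}\) (an extra smallness condition assumed here for definiteness),

$$\begin{aligned} 1+2(a_0-a'_I)+c_V(1-2 a'_I) \ge 1-\tfrac{1}{2} = \tfrac{1}{2}, \end{aligned}$$

so the leading term of \({\text {div}}_{g_{\text {b}}}W\) is bounded above by \(-\rho _0^{-2 a_0}\rho _I^{-2 a'_I}\). The remainder in (4.18) carries an additional factor of \(\rho _I\) or of \(\rho _0^{1+b_0}\rho _I^{b'_I}\) relative to this, hence is small compared to the leading term for small \(\rho _I\).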

Suppose now u solves \(L_h u=f\) with initial data \((u_0,u_1)\) as in (4.2). Note that the estimates (4.5) and (4.7) provide control of u on \(U_\epsilon ^0\) for any choice of \(\epsilon >0\); thus, it suffices to prove an estimate in \(U_\epsilon \) for any arbitrary but fixed \(\epsilon >0\). Let \(\chi \in \mathcal {C}^\infty (\mathbb {R})\) be a cutoff, \(\chi (\rho _I)\equiv 1\) for \(\rho _I<\epsilon /4\) and \(\chi (\rho _I)\equiv 0\) for \(\rho _I>\epsilon /2\), and put \(\widetilde{u}:=\chi u\), then \(\widetilde{u}\) solves the forward problem

$$\begin{aligned} L_h\widetilde{u} = \widetilde{f} := \chi f + [L_h,\chi ]u \end{aligned}$$
(4.20)

in \(U_\epsilon \), with \(\Vert \widetilde{f}\Vert _{\rho _0^{a_0}\rho _I^{a_I-1}H_{{\text {b}}}^{k-1}(U_\epsilon )}+\Vert \pi _0\widetilde{f}\Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}H_{{\text {b}}}^{k-1}(U_\epsilon )}\) controlled by the corresponding norm of f plus the right hand sides of (4.5) and (4.7). (Use Lemma 3.10 to compute the rough form of the commutator term). Note that \(\widetilde{u}=\chi u\) is the unique solution of \(L_h\widetilde{u}=\widetilde{f}\) vanishing in \(\rho _I>\tfrac{1}{2}\epsilon \). See Figure 10.

Fig. 10

The domain \(U_\epsilon \) and its subdomain \(U_\epsilon ^0\) where we have a priori control of u, allowing us to cut off and study equation (4.20) instead.

Thus, the estimate (4.3) of u in \(U_\epsilon \) is a consequence of the following result (dropping the tilde on \(\widetilde{u}\) and \(\widetilde{f}\)):

Proposition 4.8

For weights \(b_0,b'_I,b_I,a_0,a'_I,a_I\), and for \(h\in \mathcal {X}^\infty \), small in \(\mathcal {X}^3\), as in Theorem 4.2, and for \(k\in \mathbb {N}\), let \(f\in \rho _0^{a_0}\rho _I^{a_I-1}H_{{\text {b}}}^{k-1}(U_\epsilon )\), \(\pi _0 f\in \rho _0^{a_0}\rho _I^{a'_I-1}H_{{\text {b}}}^{k-1}(U_\epsilon )\); suppose f vanishes in \(\rho _I>\tfrac{1}{2}\epsilon \). Let u denote the unique forward solution of \(L_h u=f\). Then

$$\begin{aligned}&\Vert u\Vert _{\rho _0^{a_0}\rho _I^{a_I}H_{\mathscr {I},{\text {b}}}^{1,k-1}(U_\epsilon )} + \Vert \pi _0 u\Vert _{\rho _0^{a_0}\rho _I^{a'_I}H_{\mathscr {I},{\text {b}}}^{1,k-1}(U_\epsilon )} \nonumber \\&\quad \le C\Bigl (\Vert f\Vert _{\rho _0^{a_0}\rho _I^{a_I-1}H_{{\text {b}}}^{k-1}(U_\epsilon )} + \Vert \pi _0 f\Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}H_{{\text {b}}}^{k-1}(U_\epsilon )}\Bigr ). \end{aligned}$$
(4.21)

Proof

The idea is to exploit the decoupling of the leading terms of \(L_h\) at \(\mathscr {I}^+\) given by equations (3.26a)–(3.26c): this allows us to prove an energy estimate (for the case \(k=1\))

$$\begin{aligned} \Vert \pi _0 u\Vert _{\rho _0^{a_0}\rho _I^{a'_I}H_{\mathscr {I}}^1} \le C\bigl ( \Vert \pi _0 f\Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}L^2_{\text {b}}} + \Vert \pi _0^c u\Vert _{\rho _0^{a_0}\rho _I^{a_I-\delta }H_{\mathscr {I}}^1}\bigr ), \end{aligned}$$
(4.22)

where \(\delta >0\) is fixed such that

$$\begin{aligned} a'_I-b'_I<a_I-\delta ,\quad a_I<a'_I-\delta . \end{aligned}$$
(4.23)

The estimate (4.22) contains \(\pi _0^c u\) as an error term, but with a weaker weight due to the decay of the coefficients of the error term \(\widetilde{L}_h\)—which is dropped in (3.26a). On the other hand, \(\pi _0 u\) couples into \(\pi _0^c u\) via at most logarithmic terms, hence we can prove

$$\begin{aligned} \Vert \pi _0^c u\Vert _{\rho _0^{a_0}\rho _I^{a_I}H_{\mathscr {I}}^1} \le C\bigl ( \Vert \pi _0^c f\Vert _{\rho _0^{a_0}\rho _I^{a_I-1}L^2_{\text {b}}} + \Vert \pi _0 u\Vert _{\rho _0^{a_0}\rho _I^{a_I+\delta }H_{\mathscr {I}}^1}\bigr ) \end{aligned}$$
(4.24)

Close to \(\mathscr {I}^+\), the last term in the estimate (4.22), resp. (4.24), is controlled by a small constant times the left hand side of (4.24), resp. (4.22), hence summing the two estimates yields the full estimate (4.21). The proof of (4.24) and its higher regularity version will itself consist of two steps, corresponding to the weak null structure expressed by the decoupling of (3.26b) and (3.26c).
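To make the absorption explicit, here is a minimal bookkeeping sketch for \(k=1\). It assumes (as a reading of the localization, not a statement taken from the text) that \(\rho _I\le \epsilon \) on \(U_\epsilon \) and that weakening the \(\rho _I\)-weight by \(\kappa >0\) gains a factor \(C_1\epsilon ^\kappa \) in the \(H_{\mathscr {I}}^1\) norm; the symbols \(A,B,F_0,F^c,C_1,\kappa \) are introduced only for this illustration. Write \(A\) and \(B\) for the left hand sides of (4.22) and (4.24), \(F_0\) and \(F^c\) for the norms of \(\pi _0 f\) and \(\pi _0^c f\) on their right hand sides, and \(\kappa :=a'_I-a_I-\delta >0\), see (4.23). Then

$$\begin{aligned} A&\le C\bigl (F_0 + C_1\epsilon ^{\delta }B\bigr ),\qquad B\le C\bigl (F^c + C_1\epsilon ^{\kappa }A\bigr ), \\ A+B&\le C(F_0+F^c) + C C_1\max (\epsilon ^\delta ,\epsilon ^\kappa )(A+B). \end{aligned}$$

Once \(\epsilon \) is small enough that \(C C_1\max (\epsilon ^\delta ,\epsilon ^\kappa )\le \tfrac{1}{2}\), this gives \(A+B\le 2C(F_0+F^c)\). Since \(a_I<a'_I\), the norm of \(u=\pi _0 u+\pi _0^c u\) with weight \(\rho _I^{a_I}\) is bounded by a constant times \(A+B\), which is the shape of (4.21).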

All energy estimates will use the vector field

$$\begin{aligned} V_1=-(1+c_V)\rho _I\partial _{\rho _I}+\rho _0\partial _{\rho _0} \end{aligned}$$

from Lemma 4.4, with \(c_V>0\) chosen according to Lemma 4.6. Denote \(u_0:=\pi _0 u\), \(u_{1 1}:=\pi _{1 1}u\), \(u_{1 1}^c:=\pi _{1 1}^c u\), and \(u_0^c:=\pi _0^c u=u_{1 1}+u_{1 1}^c\). We expand \(L_h u=f\) as

$$\begin{aligned} \pi _0 L_h\pi _0 u_0&= \pi _0 f - \pi _0 L_h\pi _0^c u_0^c , \end{aligned}$$
(4.25a)
$$\begin{aligned} \pi _{1 1}^c L_h\pi _{1 1}^c u_{1 1}^c&= \pi _{1 1}^c f - \pi _{1 1}^c L_h\pi _0 u_0 - \pi _{1 1}^c L_h\pi _{1 1} u_{1 1} , \end{aligned}$$
(4.25b)
$$\begin{aligned} \pi _{1 1}L_h\pi _{1 1} u_{1 1}&= \pi _{1 1}f - \pi _{1 1}L_h\pi _0 u_0 - \pi _{1 1}L_h\pi _{1 1}^c u_{1 1}^c . \end{aligned}$$
(4.25c)
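For orientation, the system (4.25a)–(4.25c) is what one obtains by inserting the decomposition of \(u\) into \(L_h u=f\) and projecting. The following sketch assumes, consistently with \(u_0^c=u_{1 1}+u_{1 1}^c\), that \(\pi _0\), \(\pi _{1 1}\), \(\pi _{1 1}^c\) are complementary projections, so that \(\pi _0+\pi _{1 1}+\pi _{1 1}^c\) is the identity and each component is fixed by its own projection:

$$\begin{aligned} u&= u_0+u_{1 1}+u_{1 1}^c = \pi _0 u_0 + \pi _{1 1}u_{1 1} + \pi _{1 1}^c u_{1 1}^c, \\ \pi _0 f&= \pi _0 L_h\pi _0 u_0 + \pi _0 L_h\pi _0^c u_0^c, \\ \pi _{1 1}^c f&= \pi _{1 1}^c L_h\pi _0 u_0 + \pi _{1 1}^c L_h\pi _{1 1}u_{1 1} + \pi _{1 1}^c L_h\pi _{1 1}^c u_{1 1}^c, \\ \pi _{1 1} f&= \pi _{1 1} L_h\pi _0 u_0 + \pi _{1 1} L_h\pi _{1 1}u_{1 1} + \pi _{1 1} L_h\pi _{1 1}^c u_{1 1}^c. \end{aligned}$$

Moving the off-diagonal terms to the right hand side in each line reproduces the three equations above.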

Here, we regard \(\beta ^*K_0\rightarrow M\) as a vector bundle in its own right, and \(u_0\) as a section of \(\beta ^*K_0\): the inclusion \(K_0\hookrightarrow S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\) and the structures on the latter bundle induced by g or \(g_m\) play no role; likewise for \(K_{1 1}\) and \(K_{1 1}^c\).

Starting the proof of the estimate (4.22) using equation (4.25a), let us abbreviate \(L:=\pi _0 L_h\pi _0\). By Lemma 3.8 and recalling the definition of \(A_{\text {CD}}\) from equation (3.26a), we have

$$\begin{aligned} L = L^0 + \widetilde{L},\ \ L^0 = -2\rho ^{-2}\partial _0\partial _1 + L^0_1,\ L^0_1=-\rho ^{-1}A_{\text {CD}}\partial _1, \end{aligned}$$
(4.26)

with \(\widetilde{L}\) lying in the same space as \(\widetilde{L}_h\) in (3.25) with \(\beta ^*S^2\) replaced by \(\beta ^*K_0\). Here, \(L^0_1\) denotes a fixed representative in \(\rho _I^{-1}\cdot {}^0\mathcal {M}_{\beta ^*K_0}\), defined by fixing a representative of \(\rho _0^{-1}\partial _1\in {}^0\mathcal {M}_{\beta ^*K_0}\), see equation (2.42), in the image space of Lemma 2.13(3). Let \(w=\rho _0^{-a_0}\rho _I^{-a'_I}\); let further \(\mathbb {1}_{U_\epsilon }\) denote the characteristic function of \(U_\epsilon \). Fix \(V\in {}^0\mathcal {M}_{\beta ^*K_0}\), with scalar principal symbol equal to that of \(V_1\). Let

$$\begin{aligned} W:=\mathbb {1}_{U_\epsilon }W^\circ ,\ \ W^\circ :=w^2 V. \end{aligned}$$

Fix a positive definite fiber inner product \(B:{}^{{\text {b}}}TM\rightarrow {}^{{\text {b}}}T^*M\) on \({}^{{\text {b}}}TM\), a connection \(d\in \text {Diff}^1(\overline{\mathbb {R}^4};K_0,T^*\overline{\mathbb {R}^4}\otimes K_0)\) on \(K_0\), and a positive definite fiber metric \(k_0\) on \(K_0\) with respect to which \(A_{\text {CD}}=A_{\text {CD}}^*\); note here that \(A_{\text {CD}}\) is constant on the fibers of \(\mathscr {I}^+\), hence indeed descends to an endomorphism of \(K_0|_{S^+}\). Let \(\langle \cdot ,\cdot \rangle \) denote the \(L^2\) inner product with respect to \(k_0\) and the density ; defining the b-density to define \(L^2_{\text {b}}(M)\), we then have

$$\begin{aligned} \langle u,v\rangle = \langle \rho _I u,v\rangle _{L^2_{\text {b}}}. \end{aligned}$$
(4.27)

We shall evaluate

$$\begin{aligned}&2{\text {Re}}\langle w L u_0, \mathbb {1}_{U_\epsilon }w V u_0\rangle = \langle \mathcal {C}u_0, u_0\rangle , \nonumber \\&\quad \mathcal {C}:= L^*W+W^*L = [L,W]+(W+W^*)L+(L^*-L)W. \end{aligned}$$
(4.28)
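The operator identity in (4.28) is purely algebraic. The following formal check, a sketch which uses only \(W=\mathbb {1}_{U_\epsilon }w^2 V\) with \(w\) real valued and treats the adjoints distributionally, spells it out:

$$\begin{aligned} 2{\text {Re}}\langle w L u_0,\mathbb {1}_{U_\epsilon }w V u_0\rangle&= 2{\text {Re}}\langle L u_0, W u_0\rangle = \langle (W^*L+L^*W)u_0,u_0\rangle , \\ [L,W]+(W+W^*)L+(L^*-L)W&= (LW-WL) + (WL+W^*L) + (L^*W-LW) \\&= L^*W+W^*L. \end{aligned}$$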

Let \(K_W\) denote the current associated with the scalar principal part of W, see (4.9), now understood as taking values in the bundle \(S^2\,{}^{{\text {b}}}TM\otimes {\text {End}}(\beta ^*K_0)\), acting on \(\beta ^*K_0\) by scalar multiplication. While \(K_W\) provides positivity of \(\mathcal {C}\) near \(\mathscr {I}^+\) for suitable weights by Lemma 4.4—in particular, this would require \(a'_I<0\)—we will show around (4.35) below how to obtain a better result by exploiting the sign of \(A_{\text {CD}}\) entering through \((L^*-L)W\).

In the proof of Proposition 4.3, where we worked in a global trivialization, all terms of W and L other than the top order ones could be treated as error terms; we show that the same is true here by patching together estimates obtained from calculations in local coordinates and trivializations. Thus, let \(\{\mathcal {U}_j\}\) be a covering of a neighborhood of \(S^+\) containing \(U_\epsilon \) by open sets on which \(K_0\) is trivial, and let \(\{\chi _j\}\), \(\chi _j\in \mathcal {C}^\infty _{\text {c}}(\mathcal {U}_j)\), denote a subordinate partition of unity; let \(\widetilde{\chi }_j\in \mathcal {C}^\infty _{\text {c}}(\mathcal {U}_j)\), \(\widetilde{\chi }_j\equiv 1\) on \({\text {supp}}\chi _j\). Fix trivializations \((K_0)|_{\mathcal {U}_j}\cong \mathcal {U}_j\times \mathbb {C}^4\) and the induced trivializations of \(\beta ^*K_0\). Write

$$\begin{aligned} L=L_{j,2} + L_{j,1},\ \ W=W_{j,1} + W_{j,0}, \end{aligned}$$

where \(L_{j,2}:=\tfrac{1}{2}\Box _{g_{\text {b}}}\) acts component-wise as the scalar wave operator and \(L_{j,1}\) is a first order operator, while \(W_{j,1}:=\mathbb {1}_{U_\epsilon }w^2 V_1\) acts component-wise, and \(W_{j,0}\in \mathbb {1}_{U_\epsilon }w^2\rho _I\,\mathcal {C}^\infty (\mathcal {U}_j,{}^{\beta }TM)\), with the extra factor of \(\rho _I\) due to the choice of V. On \((K_0)|_{\mathcal {U}_j}\), let moreover \(d_j\) denote the standard connection, given component-wise as the exterior derivative on functions, and let \(k_j\) denote the standard Hermitian fiber metric; we denote adjoints with respect to \(k_j\) by \(\dag \). Now,

$$\begin{aligned} \langle \mathcal {C}u_0,u_0\rangle =\sum _j\langle \mathcal {C}_j u_0,\chi _j u_0\rangle , \end{aligned}$$
(4.29)

where

$$\begin{aligned} \mathcal {C}_j = \sum _{k,\ell } \mathcal {C}_{j,k\ell },\ \ \mathcal {C}_{j,k\ell } := L_{j,k}^*W_{j,\ell } + W_{j,\ell }^*L_{j,k}. \end{aligned}$$

The usual calculation in the scalar case, see the discussion around (4.8), gives

$$\begin{aligned} \underline{\mathcal {C}}{}_{j,2 1}:=L_{j,2}^\dag W_{j,1}+W_{j,1}^\dag L_{j,2} = d_j^\dag B K_W d_j, \end{aligned}$$

so

$$\begin{aligned}&\langle \mathcal {C}_{j,2 1}u_0,\chi _j u_0\rangle = \langle d^* B K_W d u_0, \chi _j u_0\rangle + \langle (\mathcal {C}_{j,2 1}-\underline{\mathcal {C}}{}_{j,2 1})u_0,\chi _j u_0\rangle \nonumber \\&\quad + \langle (d_j^\dag -d_j^*)B K_W d_j u_0,\chi _j u_0\rangle + \langle (d_j^* B K_W d_j-d^* B K_W d)u_0,\chi _j u_0\rangle . \end{aligned}$$
(4.30)
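As a consistency check on (4.30), the four operators paired with \(\chi _j u_0\) on the right hand side telescope to \(\mathcal {C}_{j,2 1}\). The sketch below uses only the definition \(\underline{\mathcal {C}}{}_{j,2 1}=d_j^\dag B K_W d_j\):

$$\begin{aligned}&d^* B K_W d + (\mathcal {C}_{j,2 1}-\underline{\mathcal {C}}{}_{j,2 1}) + (d_j^\dag -d_j^*)B K_W d_j + (d_j^* B K_W d_j-d^* B K_W d) \\&\quad = \mathcal {C}_{j,2 1}-\underline{\mathcal {C}}{}_{j,2 1} + d_j^\dag B K_W d_j = \mathcal {C}_{j,2 1}. \end{aligned}$$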

Summing the first term over j yields

$$\begin{aligned} \int _{U_\epsilon } \rho _I K_{W^\circ }(d u_0,d u_0)\,d\mu _{\text {b}}+ \int T(\rho _I\nabla \mathbb {1}_{U_\epsilon },W^\circ )(d u_0,d u_0)\,d\mu _{\text {b}}\end{aligned}$$
(4.31)

upon application of the formula (4.10). The first summand—after adding the term (4.35) below—is negative definite, controlling derivatives of \(u_0\) as in (4.22); the second term gives a contribution of the same sign: we have

$$\begin{aligned} T(\rho _I\nabla \mathbb {1}_{U_\epsilon },W^\circ ) = \delta _{U_\epsilon ^\partial }\otimes w^2 T^\partial , \end{aligned}$$

with \(T^\partial \le 0\) since \(-\nabla \mathbb {1}_{U_\epsilon }\) and \(W^\circ \) are future causal. The remaining terms in (4.30) are error terms: the second term is equal to

$$\begin{aligned} \langle W_{j,1}u_0,(L_{j,2}-L_{j,2}^{\dag *})\chi _j u_0\rangle + \langle L_{j,2} u_0, (W_{j,1}-W_{j,1}^{\dag *})\chi _j u_0\rangle . \end{aligned}$$

Now, \(k_0\) and \(k_j\) are related by \(k_j(\cdot ,\cdot )=k_0(\widetilde{Q}_j\cdot ,\widetilde{Q}_j\cdot )\), with \(\widetilde{Q}_j\in \mathcal {C}^\infty (\mathcal {U}_j;{\text {End}}(K_0))\) invertible, and then \(A^\dag =Q_j^{-1}A^*Q_j\) for \(Q_j:=\widetilde{Q}_j^*\widetilde{Q}_j\) when A is an operator acting on sections of \(K_0\). Thus, \(W_{j,1}-W_{j,1}^{\dag *}=[W_{j,1},Q_j^*](Q_j^{-1})^*\). On M, the constancy of \(Q_j\), and hence of \(Q_j^*\), along the fibers of \(\beta \) and \(V_1\in {}^0\mathcal {M}\) give the extra vanishing factor \(\rho _I\) in

$$\begin{aligned} W_{j,1}-W_{j,1}^{\dag *} = \mathbb {1}_{U_\epsilon }\rho _I w^2 q_{j,1},\ \ q_{j,1}\in \mathcal {C}^\infty (\beta ^{-1}(\mathcal {U}_j);{\text {End}}(\beta ^*K_0)), \end{aligned}$$

with \(q_{j,1}\) only depending on \(Q_j\). Similarly, \(L_{j,2}-L_{j,2}^{\dag *}=[L_{j,2},Q_j^*](Q_j^{-1})^*\); using Lemma 3.7 and \([\partial _1,Q_j^*]\in \rho \,\mathcal {C}^\infty \), we find (replacing the weight \(-0\) there by \(-1/2+b'_I\) for definiteness)

$$\begin{aligned}&L_{j,2}-L_{j,2}^{\dag *} \in \rho _0^{1+b_0}\rho _I^{-1+b'_I}H_{{\text {b}}}^\infty (M)\mathcal {M}_{\beta ^*K_0} \nonumber \\&\quad +\, (\mathcal {C}^\infty +\rho _0^{1+b_0}\rho _I^{-1/2+b'_I}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^1(\mathcal {U}_j;\beta ^*K_0). \end{aligned}$$
(4.32)
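The adjoint identity \(A-A^{\dag *}=[A,Q_j^*](Q_j^{-1})^*\), used above for \(A=W_{j,1}\) and \(A=L_{j,2}\), can be verified as follows. This is a sketch which assumes that both adjoints are taken with respect to the same density, so that only the fiber metrics differ; it uses \(k_j(u,v)=k_0(Q_j u,v)\) and \(Q_j^*=Q_j\), both of which follow from \(Q_j=\widetilde{Q}_j^*\widetilde{Q}_j\).

$$\begin{aligned} k_j(Au,v)&= k_0(Q_j A u,v) = k_0(u,A^*Q_j v) = k_0(Q_j u,Q_j^{-1}A^*Q_j v) = k_j(u,Q_j^{-1}A^*Q_j v), \\ A^{\dag *}&= (Q_j^{-1}A^*Q_j)^* = Q_j^* A (Q_j^{-1})^*, \\ A-A^{\dag *}&= (A Q_j^*-Q_j^*A)(Q_j^*)^{-1} = [A,Q_j^*](Q_j^{-1})^*. \end{aligned}$$

In particular, the difference is a commutator with \(Q_j^*\), which is one order lower than \(A\).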

Writing \(L_{j,2}u_0=L u_0-L_{j,1}u_0\) and using the relationship (4.27), we thus get

$$\begin{aligned}&|\langle (\mathcal {C}_{j,2 1}-\underline{\mathcal {C}}{}_{j,2 1})u_0,\chi _j u_0\rangle | \nonumber \\&\quad \le C\Vert \widetilde{\chi }_j w V_1 u_0\Vert _{L^2_{\text {b}}} \bigl (\Vert \widetilde{\chi }_j\rho _I^{b'_I}w u_0\Vert _{H_{\beta }^1} + \Vert \widetilde{\chi }_j\rho _I^{1/2+b'_I}w u_0\Vert _{H_{{\text {b}}}^1}\bigr ) \nonumber \\&\qquad + C\bigl (\Vert \widetilde{\chi }_j\rho _I w L u_0\Vert _{L^2_{\text {b}}} + \Vert \widetilde{\chi }_j\rho _I w L_{j,1}u_0\Vert _{L^2_{\text {b}}}\bigr ) \Vert \widetilde{\chi }_j \rho _I w u_0\Vert _{L^2_{\text {b}}}, \end{aligned}$$
(4.33)

where the norms are taken on \(U_\epsilon \). Note that in all terms on the right, at least one factor comes with an extra decaying power of \(\rho _I\) relative to \(w u_0\), hence is small compared to \(w u_0\) if we localize to \(U_\epsilon \) for small \(\epsilon >0\), i.e. to a small neighborhood of \(\mathscr {I}^+\). Next, we combine Lemmas 2.16 and 4.4 in the same fashion as in the proof of Lemma 2.17 to estimate the last two terms of (4.30) by

$$\begin{aligned}&C\bigl ( \Vert \widetilde{\chi }_j \rho _I w u_0 \Vert _{H_{{\text {b}}}^1} \Vert \chi _j w u_0 \Vert _{L^2_{\text {b}}} \nonumber \\&\quad + (\Vert \widetilde{\chi }_j\rho _I w T^\partial (d u_0,d u_0)^{1/2} \Vert _{L^2_{\text {b}}(U_\epsilon ^\partial )}+\Vert \widetilde{\chi }_j\rho _I w u_0\Vert _{L^2_{\text {b}}(U_\epsilon ^\partial )}) \Vert \chi _j w u_0\Vert _{L^2_{\text {b}}(U_\epsilon ^\partial )} \bigr ); \end{aligned}$$
(4.34)

where the second term in the inner parenthesis comes from the pointwise estimate \(T^\partial (d_j u_0,d_j u_0)^{1/2}\le C(T^\partial (d u_0,d u_0)^{1/2}+|u_0|)\).

The next interesting term in (4.29) is \(\mathcal {C}_{j,1 1}+\mathcal {C}_{j,1 0}\), specifically the term coming from the ‘constraint damping part’ \(L^0_1\) defined in (4.26). In a local trivialization, \(L^0_1=-\rho ^{-1}A_{\text {CD}}\partial _1+L^0_{1,j}\), \(L^0_{1,j}\in \mathcal {C}^\infty (\mathcal {U}_j)\) (using the discussion around (2.42) for this membership), so we have the pointwise equality

$$\begin{aligned} 2{\text {Re}}k_0(W u_0,L_1^0\chi _j u_0)&= -2{\text {Re}}k_0(W_{j,1}u_0,\rho ^{-1}A_{\text {CD}}\partial _1 \chi _j u_0) \\&\quad + 2{\text {Re}}k_0(W_{j,0}u_0,L_1^0\chi _j u_0) + 2{\text {Re}}k_0(W_{j,1} u_0,L^0_{1,j}\chi _j u_0); \end{aligned}$$

letting

$$\begin{aligned} K'&:= -2 w^2 (V_1\otimes _s \rho ^{-1}\partial _1) \otimes A_{\text {CD}}\\&\in \rho _0^{-2 a_0}\rho _I^{-2 a'_I-1}\mathcal {C}^\infty \bigl (U_\epsilon ; (S^2\,{}^{\beta }TM+\rho _I\,S^2\,{}^{{\text {b}}}TM)\otimes {\text {End}}(\beta ^*K_0)\bigr ), \end{aligned}$$

the first term integrates to \(\int \rho _I K'(d_j u_0,d_j\chi _j u_0)\,d\mu _{\text {b}}\), which equals

$$\begin{aligned} \int \rho _I K'(d u_0,d\chi _j u_0)\,d\mu _{\text {b}}\end{aligned}$$
(4.35)

plus error terms of the same kind as in the second line of (4.30). The extra factor of \(\rho _I\) in \(W_{j,0}\) and \(L^0_{1,j}\) (as compared to \(W_{j,1}\) and \(L^0_1\)) allows the remaining two terms to be estimated in a fashion similar to (4.33). The remaining contributions to \(\mathcal {C}_{j,1 1}+\mathcal {C}_{j,1 0}\) are error terms coming from \(\widetilde{L}\) in (4.26) and can be estimated as in (4.33).

Lastly, the terms of (4.29) involving \(\mathcal {C}_{j,2 0}\) can be rewritten and estimated as follows:

$$\begin{aligned}&\bigl |2{\text {Re}}\langle (L-L_{j,1})u_0, W_{j,0}\chi _j u_0\rangle + \langle W_{j,0}u_0,[L_{j,2},\chi _j]u_0\rangle \bigr | \\&\quad \le 2\bigl (\Vert \rho _I w L u_0\Vert _{L^2_{\text {b}}}+\Vert \widetilde{\chi }_j\rho _I w L_{j, 1}u_0\Vert _{L^2_{\text {b}}}\bigr ) \Vert \chi _j\rho _I w u_0 \Vert _{L^2_{\text {b}}} \\&\qquad + \Vert \widetilde{\chi }_j\rho _I w u_0\Vert _{L^2_{\text {b}}}\bigl (\Vert \widetilde{\chi }_j\rho _I^{b'_I}w u_0\Vert _{H_{\beta }^1}+\Vert \widetilde{\chi }_j\rho _I^{1/2+b'_I}w u_0\Vert _{H_{{\text {b}}}^1}\bigr ); \end{aligned}$$

the norms are taken on \(U_\epsilon \), and we use that \([L_{j,2},\chi _j]\) lies in the same space as (4.32). We note that by Lemma 3.8, the terms involving \(L_{j,1}\) here and in (4.33) can be estimated by

$$\begin{aligned} \Vert \widetilde{\chi }_j\rho _I w L_{j,1}u_0\Vert _{L^2_{\text {b}}} \le C\bigl (\Vert \bar{\chi }_j\rho _I^{b'_I}w u_0\Vert _{H_{\beta }^1}+\Vert \bar{\chi }_j\rho _I^{1/2+b'_I} w u_0\Vert _{H_{{\text {b}}}^1}\bigr ), \end{aligned}$$

where \(\bar{\chi }_j\in \mathcal {C}^\infty _{\text {c}}(\mathcal {U}_j)\) is identically 1 on \({\text {supp}}\widetilde{\chi }_j\).

This finishes the evaluation of (4.28); we now turn to the estimate of \(w u_0\) itself by \(w V u_0\). As in the proof of Proposition 4.3, this follows from integration along \(V\). Concretely, we consider a ‘commutator’ as in (4.12), that is,

$$\begin{aligned} 2{\text {Re}}\langle \mathbb {1}_{U_\epsilon }w V u_0, \rho _I^{-1}w u_0\rangle = -\langle \rho _I^{-1}{\text {div}}_{g_{\text {b}}}(\mathbb {1}_{U_\epsilon }w^2 V_1)u_0, u_0\rangle + E, \end{aligned}$$
(4.36)

where \(|E|\le C\Vert w u_0\Vert _{L^2_{\text {b}}}\Vert \rho _I w u_0\Vert _{L^2_{\text {b}}}\) by Lemma 2.18. Using the negativity of the divergence near \(\mathscr {I}^+\) due to Lemma 4.4 and Remark 4.7, and that \(V_1\) is outward pointing at \(U_\epsilon ^\partial \), so \(V_1(\mathbb {1}_{U_\epsilon })\) is a negative \(\delta \)-distribution at \(U_\epsilon ^\partial \), we get

$$\begin{aligned} \Vert w u_0\Vert _{L^2_{\text {b}}(U_\epsilon )} + \Vert w u_0\Vert _{L^2_{\text {b}}(U_\epsilon ^\partial )} \le C\Vert w V u_0\Vert _{L^2_{\text {b}}(U_\epsilon )}; \end{aligned}$$
(4.37)

recall here that \(u_0\) vanishes in \(\rho _I>\tfrac{\epsilon }{2}\), hence there is no a priori control term on the right. Subtracting this estimate from (4.28) (the latter having main terms which are negative definite in \(d u_0\)), the main terms are the left hand side of (4.37) and \(\int _{U_\epsilon } \rho _I(K'+K_{W^\circ })(d u_0,d u_0)\,d\mu _{\text {b}}\) from (4.31) and (4.35). By Lemma 4.6, they control \(\Vert w u_0 \Vert _{H_{\mathscr {I}}^1(U_\epsilon )}\): the error terms in \(U_\epsilon \) can be absorbed into this, while those at \(U_\epsilon ^\partial \) in (4.34) can be absorbed into the second terms of (4.31) and (4.37), due to the extra decaying weights on at least one of the factors in each of those error terms as discussed after (4.33). Thus, we have proved

$$\begin{aligned} \Vert u_0 \Vert _{\rho _0^{a_0}\rho _I^{a'_I}H_{\mathscr {I}}^1} \le C\bigl (\Vert \pi _0 f\Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}L^2_{\text {b}}} + \Vert \pi _0 L_h\pi _0^c u_0^c \Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}L^2_{\text {b}}}\bigr ), \end{aligned}$$
(4.38)

valid for \(a'_I<\min (a_0,\gamma )\). Since \(L_h\) is principally scalar, \(\pi _0 L_h\pi _0^c\) is a first order operator, and by Lemma 3.8, we have

$$\begin{aligned} \pi _0 L_h\pi _0^c \in \rho _0^{1-0}\rho _I^{-1+b'_I}\mathcal {M}_{\beta ^*K_0} + (\mathcal {C}^\infty +\rho _0^{1-0}\rho _I^{-0}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^1(M;\beta ^*K_0); \end{aligned}$$
(4.39)

since \(a'_I<a_I+b'_I<a_I+\tfrac{1}{2}\), the second term in (4.38) is bounded by \(\Vert u_0^c\Vert _{\rho _0^{a_0}\rho _I^{a_I-\delta }H_{\mathscr {I}}^1}\) for sufficiently small \(\delta >0\) (by the assumptions on the weights in Theorem 4.2), which establishes the estimate (4.22).

The proof of the estimate (4.24) proceeds along completely analogous lines, using the weight \(w=\rho _0^{-a_0}\rho _I^{-a_I}\) and positive commutator estimates for the equations (4.25b) and (4.25c). The main difference is that \(\pi _{1 1}L_h\pi _{1 1}\) and \(\pi _{1 1}^c L_h\pi _{1 1}^c\) have no leading order subprincipal terms like \(\pi _0 L_h\pi _0\) does, hence we need \(a_I<\min (a_0,0)\) for \(K_{w^2 V}\) to have a sign—this is the case \(a'_I=a_I\), \(\gamma =0\) in the notation of Lemma 4.6. In order to estimate the coupling terms on the right hand side of (4.25b), we use Lemma 3.8, so

$$\begin{aligned} \pi _{1 1}^c L_h\pi _0&\in (\rho _I^{-1}\mathcal {C}^\infty + \rho _0^{1-0}\rho _I^{-1-0}H_{{\text {b}}}^\infty )\mathcal {M}+ (\mathcal {C}^\infty +\rho _0^{1-0}\rho _I^{-0}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^1, \nonumber \\ \pi _{1 1}^c L_h\pi _{1 1}&\in \rho _0^{1-0}\rho _I^{-1+b'_I}\mathcal {M}+ (\mathcal {C}^\infty +\rho _0^{1-0}\rho _I^{-0})\text {Diff}_{\text {b}}^1, \end{aligned}$$
(4.40)

which gives

$$\begin{aligned} \Vert u_{1 1}^c \Vert _{\rho _0^{a_0}\rho _I^{a_I}H_{\mathscr {I}}^1} \le C\bigl (\Vert \pi _{1 1}^c f\Vert _{\rho _0^{a_0}\rho _I^{a_I-1}L^2_{\text {b}}} + \Vert u_0 \Vert _{\rho _0^{a_0}\rho _I^{a_I+\delta }H_{\mathscr {I}}^1} + \Vert u_{1 1} \Vert _{\rho _0^{a_0}\rho _I^{a_I-\delta }H_{\mathscr {I}}^1}\bigr ); \end{aligned}$$
(4.41)

for our choice (4.23) of \(\delta \), the second term is bounded by a small constant times the left hand side of (4.22). For analyzing the equation (4.25c) for \(u_{1 1}\), we observe that \(\pi _{1 1}L_h\pi _0\) lies in the space (4.40), while

$$\begin{aligned}&\pi _{1 1}L_h\pi _{1 1}^c \in \bigl (\rho _0^{1+b_0}\rho _I^{-1}H_{{\text {b}}}^\infty (\mathscr {I}^+\cap U_\epsilon )+\rho _0^{1-0}\rho _I^{-1+b_I}H_{{\text {b}}}^\infty \bigr )\mathcal {M}\\&\quad + (\mathcal {C}^\infty +\rho _0^{1-0}\rho _I^{-0}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^1, \end{aligned}$$

where we exploit that \(h^{\bar{a}\bar{b}}\) has a leading term at \(\mathscr {I}^+\). Thus,

$$\begin{aligned} \Vert u_{1 1} \Vert _{\rho _0^{a_0}\rho _I^{a_I}H_{\mathscr {I}}^1} \le C'\bigl (\Vert \pi _{1 1} f\Vert _{\rho _0^{a_0}\rho _I^{a_I-1}L^2_{\text {b}}} + \Vert u_0 \Vert _{\rho _0^{a_0}\rho _I^{a_I+\delta }H_{\mathscr {I}}^1} + \Vert u_{1 1}^c \Vert _{\rho _0^{a_0}\rho _I^{a_I}H_{\mathscr {I}}^1}\bigr ). \end{aligned}$$
(4.42)

In order to obtain the estimate (4.24), we add (4.41) and a small multiple, \(\eta \), of (4.42), so that \(\eta C'<1\) and \(u_{1 1}^c\) can be absorbed into the left hand side of (4.41); note that the \(u_{1 1}\) term in (4.41) is arbitrarily small compared to the left hand side of (4.42) when we localize sufficiently closely to \(\mathscr {I}^+\). As explained at the beginning of the proof, this establishes the desired estimate (4.21) for \(k=1\).
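The choice of \(\eta \) and of the localization can be tracked in a short sketch. It assumes that \(\rho _I\le \epsilon \) on \(U_\epsilon \) and that the \(u_{1 1}\) norm with the weaker weight \(\rho _I^{a_I-\delta }\) is bounded by \(C_1\epsilon ^\delta \) times the norm with weight \(\rho _I^{a_I}\); the symbols \(X,Y,R_1,R_2,C_1\) are introduced only for this illustration. Let \(X\) and \(Y\) denote the left hand sides of (4.41) and (4.42), and let \(R_1\), \(R_2\) collect the terms involving \(f\) and \(u_0\) on their right hand sides. Then

$$\begin{aligned} X&\le C\bigl (R_1 + C_1\epsilon ^\delta Y\bigr ),\qquad Y\le C'(R_2+X), \\ X+\eta Y&\le C R_1 + \eta C' R_2 + \eta C' X + C C_1\epsilon ^\delta Y, \\ (1-\eta C')X + (\eta -C C_1\epsilon ^\delta )Y&\le C R_1+\eta C' R_2. \end{aligned}$$

Taking for instance \(\eta =1/(2C')\) and then \(\epsilon \) so small that \(C C_1\epsilon ^\delta \le \tfrac{1}{2}\eta \) leaves \(\tfrac{1}{2}X+\tfrac{1}{2}\eta Y\le C R_1+\tfrac{1}{2}R_2\), so \(X+Y\) is controlled by \(R_1+R_2\).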

To prove (4.21) for \(k\ge 2\), we proceed by induction on the level of the hierarchy (4.25a)–(4.25c) and the corresponding estimates (4.22), (4.41), and (4.42). The key structures for obtaining higher regularity are the symmetries of the normal operators of \(\pi _0 L_h\pi _0\) etc. at \(\mathscr {I}^+\). Namely, \(-2\rho ^{-2}\partial _0\partial _1\in \partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})+\text {Diff}_{\text {b}}^2\) commutes (modulo \(\text {Diff}_{\text {b}}^2\)) with \(\rho _0\partial _{\rho _0}\), while for the vector field \(\rho _I\partial _{\rho _I}\) generating dilations along approximate (namely, Schwarzschildean) light cones, we have

$$\begin{aligned}{}[-2\rho ^{-2}\partial _0\partial _1,\rho _I\partial _{\rho _I}] \in -2\rho ^{-2}\partial _0\partial _1 + \text {Diff}_{\text {b}}^2. \end{aligned}$$
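Both commutation statements can be checked on the model operator alone. The following is a sketch under the assumptions that \(\rho _0,\rho _I\) are used as independent local coordinates (so that \(\rho _0\partial _{\rho _0}\) and \(\rho _I\partial _{\rho _I}\) commute) and that the \(\text {Diff}_{\text {b}}^2\) remainder has coefficients regular enough for its commutators with b-vector fields to remain in \(\text {Diff}_{\text {b}}^2\); the names \(P_0\) and \(A\) are introduced here for illustration only.

$$\begin{aligned} P_0&:= \partial _{\rho _I}A,\qquad A:=\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}, \\ [\partial _{\rho _I},\rho _I\partial _{\rho _I}]&= \partial _{\rho _I},\qquad [A,\rho _I\partial _{\rho _I}]=0, \\ [P_0,\rho _I\partial _{\rho _I}]&= [\partial _{\rho _I},\rho _I\partial _{\rho _I}]A + \partial _{\rho _I}[A,\rho _I\partial _{\rho _I}] = \partial _{\rho _I}A = P_0, \\ [P_0,\rho _0\partial _{\rho _0}]&= [\partial _{\rho _I},\rho _0\partial _{\rho _0}]A + \partial _{\rho _I}[A,\rho _0\partial _{\rho _0}] = 0. \end{aligned}$$

Thus the dilation \(\rho _I\partial _{\rho _I}\) reproduces the model operator, which is the origin of the coefficient \(c_j=1\) in (4.46), while \(\rho _0\partial _{\rho _0}\) commutes with it.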

Commutation with spherical vector fields is more subtle: we need to define rotation ‘vector fields’ somewhat carefully. We only define these on \(\beta ^*K_0\), the definition for the other bundles being analogous. Using the product splitting \(\mathbb {R}_q\times \mathbb {R}_s\times \mathbb {S}^2\) of \(\mathbb {R}^4\) near \(S^+\), denote by \(\{\Omega _{1,i}:i=1,2,3\}\subset \mathcal {V}(\mathbb {S}^2)\hookrightarrow \mathcal {V}_{\text {b}}(M)\) a spanning set of the space of vector fields on \(\mathbb {S}^2\), e.g. rotation vector fields, though the concrete choice or their (finite) number are irrelevant; we can then define elements \(\Omega _i\in \text {Diff}_{\text {b}}^1(M;\beta ^*K_0)\) with scalar principal symbols equal to those of \(\Omega _{1,i}\) such that

$$\begin{aligned}{}[\rho ^{-1}\partial _0,\Omega _i],\ [\rho _0^{-1}\partial _1,\Omega _i] \in \rho _I\text {Diff}_{\text {b}}^1(M;\beta ^*K_0), \end{aligned}$$
(4.43)

where \(\rho ^{-1}\partial _0,\rho _0^{-1}\partial _1\) denote elements in \({}^0\mathcal {M}_{\beta ^*K_0}\). (Note that the \(\rho _I\,\mathcal {C}^\infty \) indeterminacy of \(\rho ^{-1}\partial _0,\rho _0^{-1}\partial _1\) does not affect (4.43).) Here, it is crucial that we fix \(\rho _0\) and \(\rho \) to be given by (2.25) and thus rotationally invariant: \(\Omega _{1,i}\rho _0=0\), so \([\Omega _i,\rho _0]\in \rho _I\,\mathcal {C}^\infty \); we also have \([\Omega _i,\rho _I]\in \rho _I\,\mathcal {C}^\infty \) independently of choices. Regarding (4.43) then, we automatically have membership in \(\text {Diff}_{\text {b}}^1\) by principal symbol considerations; the additional vanishing at \(\rho _I=0\) is then exactly the statement that the normal operators of \(\rho ^{-1}\partial _0\), resp. \(\rho _0^{-1}\partial _1\), and \(\Omega _i\) commute. For \(\rho ^{-1}\partial _0\), whose normal operator is \(-\tfrac{1}{2}\rho _I\partial _{\rho _I}\), this is automatic, while for \(\rho _0^{-1}\partial _1\), we merely need to arrange \([\rho _0\partial _{\rho _0},\Omega _i]=0\) at \(\mathscr {I}^+\), which holds if we define \(\Omega _i\) in the decomposition (3.10) by . We therefore obtain

$$\begin{aligned} {[}-2\rho ^{-2}\partial _0\partial _1, \Omega _i ],\ [ L^0, \Omega _i ] \in \text {Diff}_{\text {b}}^2, \end{aligned}$$

with \(L^0\) given in (4.26), which improves over the a priori membership in \(\rho _I^{-1}\text {Diff}_{\text {b}}^2\). Let us now assume that for the solution of equation (4.25a), we have already established the estimate

$$\begin{aligned} \Vert u_0 \Vert _{\rho _0^{a_0}\rho _I^{a'_I}H_{\mathscr {I},{\text {b}}}^{1,k-1}} \le C\bigl (\Vert \pi _0 f\Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}H_{{\text {b}}}^{k-1}} + \Vert \pi _0^c u\Vert _{\rho _0^{a_0}\rho _I^{a_I-\delta }H_{\mathscr {I},{\text {b}}}^{1,k-1}}\bigr ). \end{aligned}$$
(4.44)

We use \(\{G_j\}:=\{\rho _0\partial _{\rho _0}, \rho _I\partial _{\rho _I},\ \Omega _1,\ \Omega _2,\ \Omega _3,\ 1\}\), which spans \(\text {Diff}_{\text {b}}^1(M;\beta ^*K_0)\) over \(\mathcal {C}^\infty (M)\), as a set of commutators. Writing \(L=\pi _0 L_h\pi _0\), we then have

$$\begin{aligned} L G_j u_0 = f_j+[L,G_j]u_0,\ \ f_j := G_j\pi _0 f - G_j\pi _0 L_h\pi _0^c u_0^c. \end{aligned}$$
(4.45)
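Equation (4.45) is obtained by commuting \(G_j\) through \(L\) and inserting (4.25a); as a one-line sketch:

$$\begin{aligned} L G_j u_0 = G_j L u_0 + [L,G_j]u_0 = G_j\bigl (\pi _0 f-\pi _0 L_h\pi _0^c u_0^c\bigr ) + [L,G_j]u_0 = f_j + [L,G_j]u_0. \end{aligned}$$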

We estimate the first term by

$$\begin{aligned} \Vert f_j\Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}H_{{\text {b}}}^{k-1}} \le C\bigl (\Vert \pi _0 f\Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}H_{{\text {b}}}^k} + \Vert \pi _0^c u\Vert _{\rho _0^{a_0}\rho _I^{a_I-\delta }H_{\mathscr {I},{\text {b}}}^{1,k}}\bigr ). \end{aligned}$$

For the second, delicate, term, we use the above discussion to see that

$$\begin{aligned} {[}L,G_j] \in c_j L + \rho _0^{1-0}\rho _I^{-1+b'_I}\mathcal {M}\circ \text {Diff}_{\text {b}}^1 + (\mathcal {C}^\infty +\rho _0^{1-0}\rho _I^{-0})\text {Diff}_{\text {b}}^2 \end{aligned}$$
(4.46)

with \(c_j=1\) if \(G_j=\rho _I\partial _{\rho _I}\), and \(c_j=0\) otherwise. Thus, \([L,G_j]=c_j L + C_j^\ell G_\ell \) with \(C_j^\ell \in \rho _0^{1-0}\rho _I^{-1+b'_I}\mathcal {M}+ (\mathcal {C}^\infty +\rho _0^{1-0}\rho _I^{-0})\text {Diff}_{\text {b}}^1\), and therefore

$$\begin{aligned} \Vert [L,G_j]u_0\Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}H_{{\text {b}}}^{k-1}} \le c_j\Vert L u_0 \Vert _{\rho _0^{a_0}\rho _I^{a'_I-1}H_{{\text {b}}}^{k-1}} + C \sum _\ell \Vert G_\ell u_0 \Vert _{\rho _0^{a_0}\rho _I^{a_I-\delta -\delta '}H_{\mathscr {I},{\text {b}}}^{1,k-1}} \end{aligned}$$
(4.47)

for \(\delta '>0\) small; recall that our choice (4.23) of \(\delta \) leaves some extra room. Now, applying (4.44) to \(G_j u_0\) in equation (4.45) and summing over j, we can absorb the term (4.47) into the left hand side of the estimate due to the weaker weight. This establishes (4.44) for k replaced by \(k+1\). The higher regularity analogues of the estimates (4.41) and (4.42) are proved in the same manner; as before, this then yields the estimate (4.21) for all k. \(\square \)

This proposition remains valid near any compact subset of \(\mathscr {I}^+\setminus I^+\): the proof only required localization near \(\mathscr {I}^+\). At this point, we therefore have quantitative control of the solution of the initial value problem for \(L_h u=f\) in any compact subset of \(M\setminus I^+\).

4.2 Estimate near timelike infinity

Near the corner \(I^+\cap \mathscr {I}^+\), fix the local defining functions

$$\begin{aligned} \rho _I := v = (t-r_*)/r,\ \ \mathring{\rho }_+:=(t-r_*)^{-1} \end{aligned}$$
(4.48)

of \(\mathscr {I}^+\) and \(I^+\), and let \(\rho :=\rho _I\mathring{\rho }_+=r^{-1}\); these only differ from the expressions for the defining functions \(\rho _I\) and \(\rho _0\) used in §4.1 by a sign. We thus have \(G_{\text {b}}=\rho ^{-2}G=G_{0,{\text {b}}}+G_{1,{\text {b}}}+\widetilde{G}_{\text {b}}\) for

(4.49)

and \(G_{1,{\text {b}}}\in \mathcal {C}^\infty (M;S^2\,{}^{\beta }TM)\), \(\widetilde{G}_{\text {b}}\in \rho _I^{-1+b'_I}\rho _+^{1+b_+}H_{{\text {b}}}^\infty (M;S^2\,{}^{\beta }TM+\rho _I\,S^2\,{}^{{\text {b}}}TM)\), while

$$\begin{aligned} g_{\text {b}}\in (\mathcal {C}^\infty +\rho _I^{b'_I}\rho _+^{1+b_+}H_{{\text {b}}}^\infty )(M;S^2({}^{\beta }TM)^\perp + \rho _I\,S^2\,{}^{{\text {b}}}T^*M) \end{aligned}$$

with smooth term given by \(\rho ^2 g_m=2\rho _I\tfrac{d\mathring{\rho }_+}{\mathring{\rho }_+}(\tfrac{d\mathring{\rho }_+}{\mathring{\rho }_+}+\tfrac{d\rho _I}{\rho _I})+\rho _I^2\,\mathcal {C}^\infty (M;S^2\,{}^{{\text {b}}}T^*M)\). In order to be able to work near all of \(I^+\), we first prove:

Lemma 4.9

There exists a defining function \(\rho _+\in \mathcal {C}^\infty (M)\) of \(I^+\) such that \(d\rho _+/\rho _+\) is past timelike near \(I^+\) for the dual b-metric \(\rho ^{-2}g_m^{-1}\). Moreover, if \(C>0\) is fixed, then for any \(h\in \mathcal {X}^\infty \) with \(\Vert h\Vert _{\mathcal {X}^3}<C\) and for any \(\epsilon >0\), there exists \(\delta >0\) such that \(d\rho _+/\rho _+\) is past timelike with \(|d\rho _+/\rho _+|_{G_{\text {b}}}^2>0\) in \(\{\rho _I\ge \epsilon ,\ \rho _+\le \delta \}\) for the dual b-metric \(G_{\text {b}}=\rho ^{-2}g^{-1}\), \(g=g_m+\rho h\).

Proof

For the second claim, note that in \(\rho _I\ge \epsilon >0\), we have \(G_{\text {b}}-\rho ^{-2}g_m^{-1}\in \rho _+^{1+b_+}L^\infty \) with norm controlled by \(\Vert h\Vert _{\mathcal {X}^3}\), so

$$\begin{aligned} |d\rho _+/\rho _+|_{G_{\text {b}}}^2 \in |d\rho _+/\rho _+|^2_{\rho ^{-2}g_m^{-1}} + \rho _+^{1+b_+}H_{{\text {b}}}^\infty \end{aligned}$$
(4.50)

is indeed positive near \(\rho _+=0\). To prove the first claim, we compute on Minkowski space \(|f_0^{-1}d f_0|^2\equiv 1\), \(f_0=t/(t^2-r^2)\) in \(t>r\), computed with respect to the dual metric of \(t^{-2}(dt^2-dr^2)\). Similarly, in \(r/t>\tfrac{1}{4}\), and \(t>r_*\) large, we have \(|f_*^{-1}d f_*|_{\rho ^{-2}g_m}^2>0\) for \(f_*=t/(t^2-r_*^2)\): this is a simple calculation where \(g_m=g_m^S\) is the Schwarzschild metric, and follows in general by an estimate similar to (4.50) since \(g_m\) differs from \(g_m^S\) by a scattering metric of class \(\rho ^{1-0}H_{{\text {b}}}^\infty \) in \(r/t<\tfrac{3}{4}\). Moreover, \(f_*\) is (apart from minor smoothness issues, which we address momentarily) a defining function of \(I^+\) near \(\mathscr {I}^+\). But \(f_0-f_*\in \rho ^{2-0}H_{{\text {b}}}^\infty \) for \(r/t\in (\tfrac{1}{4},\tfrac{3}{4})\), hence \(f':=\chi f_*+(1-\chi )f_0\) has \(|(f')^{-1}d f'|_{\rho ^{-2}g_m}^2>0\) near \(I^+\), where \(\chi =\chi (r/t)\) is smooth and identically 0, resp. 1, in \(r/t<\tfrac{1}{4}\), resp. \(r/t>\tfrac{3}{4}\). Fixing any defining function \(\rho '_+\) of \(I^+\), Lemma 2.8 implies \(f'\in \rho '_+\,\mathcal {C}^\infty (M)+(\rho '_+)^{2-0}H_{{\text {b}}}^\infty (M)\) (with the nonsmooth summand supported away from \(\mathscr {I}^+\) by construction), so we may take \(\rho _+\in \mathcal {C}^\infty (M)\) to be any defining function of \(I^+\) such that \(f'-\rho _+\in (\rho '_+)^{2-0}H_{{\text {b}}}^\infty \). \(\square \)
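The Minkowski identity \(|f_0^{-1}d f_0|^2\equiv 1\) used in the proof of Lemma 4.9 can be verified by hand. In the sketch below, \(D:=t^2-r^2\) is an abbreviation introduced here, and the dual metric of \(t^{-2}(dt^2-dr^2)\) is taken to be \(t^2(\partial _t^2-\partial _r^2)\) in the variables \((t,r)\), which is all that enters since \(f_0\) depends only on \((t,r)\).

$$\begin{aligned} \frac{d f_0}{f_0}&= d\log t - d\log D = \Bigl (\frac{1}{t}-\frac{2t}{D}\Bigr )dt + \frac{2r}{D}\,dr = -\frac{t^2+r^2}{tD}\,dt + \frac{2r}{D}\,dr, \\ \Bigl |\frac{d f_0}{f_0}\Bigr |^2&= t^2\Bigl (\frac{(t^2+r^2)^2}{t^2D^2} - \frac{4r^2}{D^2}\Bigr ) = \frac{(t^2+r^2)^2-4t^2r^2}{D^2} = \frac{(t^2-r^2)^2}{D^2} = 1. \end{aligned}$$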

For the remainder of this section, \(\rho _+\) will denote this particular defining function. Near \(I^+\cap \mathscr {I}^+\), we need to modify \(\rho _+\) in the spirit of (4.16) in order to get a timelike (but not quite smooth) boundary defining function. Thus, fix \(\beta \in (0,b'_I)\) and some small \(\eta >0\), and let \(p_\beta \in \rho _I^\beta \,\mathcal {C}^\infty (U)\) be a nonnegative function in a neighborhood U of \(I^+\) such that \(p_\beta \equiv \eta ^\beta \) in \(\rho _I\ge 2\eta \), \(p_\beta (\rho _I)=\rho _I^\beta \) in \(\rho _I\le \tfrac{1}{2}\eta \), and \(0\le p'_\beta \le \beta \rho _I^{\beta -1}\); let then

$$\begin{aligned} \widetilde{\rho }_+ := \rho _+(1+p_\beta ) \in \rho _+(1+\rho _I^\beta )\,\mathcal {C}^\infty (M). \end{aligned}$$
(4.51)

It is easy to see that \(\widetilde{\rho }_+^{a_+}H_{{\text {b}}}^k(M)=\rho _+^{a_+}H_{{\text {b}}}^k(M)\), likewise for weighted \(H_{\mathscr {I}}\) and \(H_{\mathscr {I},{\text {b}}}\) spaces.

Lemma 4.10

Fix \(C>0\). Then there exist \(\eta ,\delta >0\) such that for all \(h\in \mathcal {X}^\infty \) with \(\Vert h\Vert _{\mathcal {X}^3}<C\), we have \(|d\widetilde{\rho }_+/\widetilde{\rho }_+|_{G_{\text {b}}}^2>0\) in \(\rho _+\le \delta \).

Proof

We compute the \(G_{\text {b}}\)-norms

$$\begin{aligned} \Bigl |\frac{d\widetilde{\rho }_+}{\widetilde{\rho }_+}\Bigr |^2 = \Bigl |\frac{d\rho _+}{\rho _+}\Bigr |^2 + \frac{\rho _I p_\beta '}{1+p_\beta }\Bigl (2\Big \langle \frac{d\rho _+}{\rho _+},\frac{d\rho _I}{\rho _I}\Big \rangle + \frac{\rho _Ip_\beta '}{1+p_\beta }\Bigl |\frac{d\rho _I}{\rho _I}\Bigr |^2\Bigr ). \end{aligned}$$
(4.52)

In \(\rho _I<2\eta \) and thus near \(\mathscr {I}^+\), we first note that \(\rho _+=f\mathring{\rho }_+\) with \(f>0\) smooth; since df/f thus vanishes at \(\mathscr {I}^+\cap I^+\) as a b-1-form, we have

$$\begin{aligned} 2\Big \langle \frac{d\rho _+}{\rho _+},\frac{d\rho _I}{\rho _I}\Big \rangle \in (2+\rho _I\,\mathcal {C}^\infty +\rho _+\,\mathcal {C}^\infty )\rho _I^{-1} + \rho _I^{-1+b'_I}\rho _+^{1+b_+}H_{{\text {b}}}^\infty , \end{aligned}$$

thus the second summand of (4.52) is \(\gtrsim \rho _I^{-1+\beta }\) in \(\rho _I\le \tfrac{1}{2}\eta \) and \(\rho _+\) small. The first and third terms on the other hand are dominated by this, as they are bounded by \(\rho _I^{-1+b'_I}\) and \(\rho _I^{-1+2\beta }\), respectively. In \(\tfrac{1}{2}\eta<\rho _I<2\eta \) and \(\rho _+\) small, the parenthesis in (4.52) is positive, the second summand being bounded by \(\rho _I^{-1+\beta }\); the prefactor being positive due to \(p_\beta '\ge 0\), the claimed positivity thus follows from Lemma 4.9. \(\square \)

We also note that \(\rho _+\partial _{\rho _+}\), which is well-defined as a b-vector field at \(I^+\) and equals the scaling vector field in \((I^+)^\circ \), is past timelike in \((I^+)^\circ \). Let

$$\begin{aligned} U = \{ \widetilde{\rho }_+<\delta \} \subset M \end{aligned}$$

denote the neighborhood of \(I^+\subset M\) on which we will formulate our energy estimate. Near \(\mathscr {I}^+\), we need to exploit the weak null structure as in §4.1; thus, let

$$\begin{aligned} \chi \in \mathcal {C}^\infty _{\text {c}}([0,\infty )_{\rho _I}),\ \ \chi \equiv 1\ \ \text{ near }\ \rho _I=0, \end{aligned}$$
(4.53)

denote a smooth function on U localizing in a neighborhood of \(\mathscr {I}^+\) where the projections \(\pi _0\) etc. are defined, see the discussion around Definition 3.4.

Proposition 4.11

For weights \(b'_I,b_I,b_+,a'_I,a_I\) as in Theorem 4.2, there exists \(a_+\in \mathbb {R}\) such that for all \(h\in \mathcal {X}^\infty \) which are small in \(\mathcal {X}^3\), the following holds: Let \(f\in \rho _I^{a_I-1}\rho _+^{a_+}H_{{\text {b}}}^{k-1}(U)\), \(\chi \pi _0 f\in \rho _I^{a'_I-1}\rho _+^{a_+}H_{{\text {b}}}^{k-1}(U)\), and suppose f vanishes in \(\widetilde{\rho }_+>\tfrac{1}{2}\delta \). Let u denote the unique forward solution of \(L_h u=f\). Then

$$\begin{aligned}&\Vert u\Vert _{\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}(U)} + \Vert \chi \pi _0 u\Vert _{\rho _I^{a'_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}(U)} \nonumber \\&\quad \le C\Bigl (\Vert f\Vert _{\rho _I^{a_I-1}\rho _+^{a_+}H_{{\text {b}}}^{k-1}(U)} + \Vert \chi \pi _0 f\Vert _{\rho _I^{a'_I-1}\rho _+^{a_+}H_{{\text {b}}}^{k-1}(U)}\Bigr ). \end{aligned}$$
(4.54)

Proof

We first consider \(k=1\). Near \(\partial I^+\), we will make use of the vector field \(V_0'=(1-c_V)\rho _I\partial _{\rho _I}-\mathring{\rho }_+\partial _{\mathring{\rho }_+}\), \(c_V>0\) small, analogously to Lemma 4.4; away from \(\partial I^+\), the vector field \(V_0'':=-\nabla \rho _+/\rho _+\) is future timelike. Fix \(a^0_+\le -\tfrac{1}{2}\) and consider the vector field \(V_I:=\rho _I^{-2 a_I}\mathring{\rho }_+^{-2 a^0_+}V_0'\); then \(K_{V_I}\) is \(\lesssim -\rho _I^{-2 a_I-1}\rho _+^{-2 a^0_+}\) as a quadratic form, and \({\text {div}}_{g_{\text {b}}} V_I\lesssim -\rho _I^{-2 a_I}\rho _+^{-2 a^0_+}\). Analogously to Lemma 4.6, if \(V_I'=\rho _I^{-2 a'_I}\mathring{\rho }_+^{-2 a_+^0}V_0'\), then \(K_{V_I'}-2\gamma V_I'\otimes _s\rho ^{-1}\partial _1\) is negative definite near \(\partial I^+\) for \(c_V>0\) sufficiently small.

To explain the idea for obtaining a global (near \(I^+\)) negative commutator, consider the timelike vector field \(W_0:=\chi V_I+(1-\chi )\rho _+^{-2 a_+^0}V''_0\), and let \(W=\widetilde{\rho }_+^{-2 a_+^1}W_0\); then formula (4.10) gives

$$\begin{aligned} K_W = \widetilde{\rho }_+^{-2 a_+^1}K_{W_0} + 2 a_+^1\widetilde{\rho }_+^{-2 a_+^1} T(W_0,-\tfrac{\nabla \widetilde{\rho }_+}{\widetilde{\rho }_+}). \end{aligned}$$
(4.55)

Letting

$$\begin{aligned} a_+:=a_+^0+a_+^1, \end{aligned}$$

the first term gives control in \(\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I}}^1\) near \(\mathscr {I}^+\) in a positive commutator argument. On the other hand, its size is bounded by a fixed constant times \(\rho _+^{-2 a_+}\) in \(\rho _I\ge \epsilon >0\); but there, \(T(W_0,-\tfrac{d\widetilde{\rho }_+}{\widetilde{\rho }_+})\gtrsim \rho _+^{-2 a_+^0}\) in the sense of quadratic forms on \({}^{{\text {b}}}T^*M\) since \(W_0\) and \(-d\widetilde{\rho }_+/\widetilde{\rho }_+\) are both future timelike. Therefore, choosing \(a_+^1\) large and negative, we obtain

$$\begin{aligned} K_W \le -C\rho _I^{-2 a_I-1}\rho _+^{-2 a_+}K_W', \end{aligned}$$

where \(K_W'\) is positive definite on \({}^{{\text {b}}}T^*M\) in \(\rho _I\ge \epsilon >0\), while near \(\mathscr {I}^+\), we have \(K_W'=K_1+\rho _I K_2\), with \(K_1\), resp. \(K_2\), positive definite on \({}^{{\text {b}}}T^*M\), resp. \(({}^{\beta }TM)^\perp \). This gives global (near \(I^+\)) control in \(\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I}}^1\).

We now apply this discussion to the situation at hand. For brevity, let us use the same symbol to denote a b-vector field in \({}^0\mathcal {M}\) and an arbitrary but fixed representative in \({}^0\mathcal {M}_{\beta ^*E}\) according to Lemma 2.13(3), similarly for b-vector fields with weights (such as \(V_I\) and \(V'_I\)); the bundle \(E\rightarrow \overline{\mathbb {R}^4}\) will be clear from the context. For \(a^1_+\in \mathbb {R}\) chosen later, consider then the operator W acting on sections of \(\beta ^*S^2\),

$$\begin{aligned} W:=\widetilde{\rho }_+^{-2 a_+^1}W_0,\ \ W_0:=\chi \bigl (\pi _0 V'_I\pi _0 + \pi _{1 1}^c V_I\pi _{1 1}^c + \eta \pi _{1 1} V_I\pi _{1 1}\bigr ) + (1-\chi )\rho _+^{-2 a_+^0}V''_0, \end{aligned}$$
(4.56)

where \(\eta >0\) will be taken small, as in the discussion after (4.42). (Since u vanishes in \(\rho _+>\tfrac{1}{2}\delta \), we do not need to include a cutoff term here). ‘Integrating’ along W via a commutator calculation for \(2{\text {Re}}\langle W u,\rho _I^{-1} u\rangle \) as in (4.36) gives control on u in the function space appearing in (4.7) in terms of Wu. The evaluation of the commutator \(2{\text {Re}}\langle L_h u, W u\rangle =\langle \mathcal {C}u,u\rangle \), \(\mathcal {C}=[L_h,W]+(W+W^*)L_h+(L_h^*-L_h)W\), then combines the three separate calculations for the equations (4.25a)–(4.25c) into one: near \(\mathscr {I}^+\), one writes \(L_h\) in block form according to the bundle decomposition \(\beta ^*S^2=K_0\oplus K_{1 1}^c\oplus K_{1 1}\), with the diagonal elements \(\pi _0 L_h\pi _0\) etc. giving rise to the main terms of the commutator, while the off-diagonal terms can be estimated using Cauchy–Schwarz and absorbed into the main terms due to the weak null structure, as explained in detail in the proof of Proposition 4.8. Away from \(\mathscr {I}^+\), all error terms can be absorbed in the main term, corresponding to the second term in (4.55) upon choosing \(a_+^1<0\) negative enough. This proves the proposition for \(k=1\).

Suppose now we have proved (4.54) for some \(k\ge 1\). First, the b-operator \(L_h\) automatically commutes with \(\rho _+\partial _{\rho _+}\) to leading order at \(I^+\); concretely, Lemma 3.8 gives

$$\begin{aligned} {[}L_h,\rho _+\partial _{\rho _+}] \in \rho _I^{-1+b'_I}\rho _+^{1+b_+}\mathcal {M}_{\beta ^*S^2}^2 + (\rho _+\,\mathcal {C}^\infty +\rho _I^{-0}\rho _+^{1+b_+}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^2. \end{aligned}$$

Here, by an abuse of notation, \(\rho _+\partial _{\rho _+}\in {}^0\mathcal {M}_{\beta ^*S^2}\) is defined by first extending the vector field \(\rho _+\partial _{\rho _+}\in \mathcal {C}^\infty (I^+,{}^{{\text {b}}}T_{I^+}M)\) to an element of \({}^0\mathcal {M}_{\underline{\mathbb {C}}{}}\), and then taking a representative of the image space in Lemma 2.13(3); for this particular vector field, such a representative is in fact well-defined modulo \(\rho _I\rho _+\,\mathcal {C}^\infty (M;{\text {End}}(\beta ^*S^2))\), the extra vanishing at \(\rho _+\) being due to the special (b-normal) nature of \(\rho _+\partial _{\rho _+}\).

Therefore, commuting \(\rho _+\partial _{\rho _+}\) through the equation \(L_h u=f\), we have the estimate

$$\begin{aligned}&\Vert \rho _+\partial _{\rho _+}u\Vert _{\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} + \Vert \chi \pi _0\rho _+\partial _{\rho _+}u\Vert _{\rho _I^{a'_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} \nonumber \\&\quad \le C\Bigl ( \Vert f\Vert _{\rho _I^{a_I-1}\rho _+^{a_+}H_{{\text {b}}}^k} + \Vert \chi \pi _0 f\Vert _{\rho _I^{a'_I-1}\rho _+^{a_+}H_{{\text {b}}}^k} + \Vert u\Vert _{\rho _I^{a_I-\delta }\rho _+^{a_+-(1+b_+)}H_{\mathscr {I},{\text {b}}}^{1,k}}\Bigr ) \end{aligned}$$
(4.57)

by the inductive hypothesis, where we used \(a_I-\delta >a'_I-b'_I\) for \(\delta >0\) small to bound the forcing term \([L_h,\rho _+\partial _{\rho _+}]u\) by the third term on the right; see the related discussion around (4.39).

Second, the timelike character of \(\rho _+\partial _{\rho _+}\) at \((I^+)^\circ \) implies that, for any \(\epsilon >0\), the operator \(C(\rho _+ D_{\rho _+})^2-L_h\) is elliptic in \(\rho _I\ge \epsilon \) for large C (depending on \(\epsilon \)); therefore, letting \(\chi _j\in \mathcal {C}^\infty _{\text {c}}(U\setminus \mathscr {I}^+)\), \(j=1,2\), denote cutoffs with \(\chi _1\equiv 1\) on \({\text {supp}}(1-\chi )\) and \(\chi _2\equiv 1\) on \({\text {supp}}\chi _1\), we have an elliptic estimate away from \(\mathscr {I}^+\),

$$\begin{aligned} \Vert \chi _1 u \Vert _{\rho _+^{a_+}H_{{\text {b}}}^{k+1}} \le C\bigl ( \Vert \chi _2\rho _+\partial _{\rho _+}u \Vert _{\rho _+^{a_+}H_{{\text {b}}}^k} + \Vert \chi _2 u \Vert _{\rho _+^{a_+}H_{{\text {b}}}^k} + \Vert \chi _2 f \Vert _{\rho _+^{a_+}H_{{\text {b}}}^{k-1}}\bigr ), \end{aligned}$$
(4.58)

for u supported in \(\rho _+\le \tfrac{1}{2}\delta \). Near \(\mathscr {I}^+\) on the other hand, we have the symmetries of null infinity at our disposal, encoded by the operators \(\rho _I\partial _{\rho _I}\) and the spherical derivatives \(\Omega _j\), see the discussion around (4.43). Let \(\widetilde{\chi }\in \mathcal {C}^\infty (U)\) be identically 1 on \({\text {supp}}\chi \), and supported close to \(\mathscr {I}^+\). Defining the set of (cut-off) commutators \(\{\chi G_j\}:=\{\chi \rho _I\partial _{\rho _I},\ \chi \Omega _1,\ \chi \Omega _2,\ \chi \Omega _3\}\) which together with \(\rho _+\partial _{\rho _+}\) spans \(\mathcal {V}_{\text {b}}(M)\) near \(\mathscr {I}^+\), and recalling the commutation relations (4.46), we find

$$\begin{aligned}&\Vert \chi G_j u\Vert _{\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} + \Vert \chi \pi _0 G_j u\Vert _{\rho _I^{a'_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} \nonumber \\&\quad \le C\Bigl (\Vert f\Vert _{\rho _I^{a_I-1}\rho _+^{a_+}H_{{\text {b}}}^k} + \Vert \chi \pi _0 f\Vert _{\rho _I^{a'_I-1}\rho _+^{a_+}H_{{\text {b}}}^k} \nonumber \\&\qquad + \sum _\ell \Vert \widetilde{\chi } G_\ell u\Vert _{\rho _I^{a_I-\delta }\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} + \Vert \widetilde{\chi }\rho _+\partial _{\rho _+}u\Vert _{\rho _I^{a_I-\delta }\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} \Bigr ). \end{aligned}$$
(4.59)

But for any \(\eta >0\), we have the estimate

$$\begin{aligned} \Vert \widetilde{\chi } G_\ell u\Vert _{\rho _I^{a_I-\delta }\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} \le \eta \Vert \chi G_\ell u\Vert _{\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} + C_\eta \Vert \chi _1 u\Vert _{\rho _+^{a_+}H_{{\text {b}}}^{k+1}}, \end{aligned}$$

and the second term can in turn be estimated using (4.58). Summing the estimate (4.59) over j and fixing \(\eta >0\) sufficiently small, we can thus absorb the terms involving \(\widetilde{\chi } G_\ell u\) into the left hand side, getting control by the norm of f, plus a control term \(C\Vert \rho _+\partial _{\rho _+}u\Vert _{\rho _+^{a_+}H_{{\text {b}}}^k}\). Adding to this estimate 2C times (4.57), this control term can be absorbed in the left hand side of (4.57). This gives control of u as in the left hand side of (4.54) with k replaced by \(k+1\), but with an extra term on the right coming from the last term in (4.57); however, this term has a weaker weight at \(I^+\), \(\rho _+^{a_+-(1+b_+)}\gg \rho _+^{a_+}\), hence can be absorbed. This gives (4.54) for k replaced by \(k+1\). \(\square \)

Combining the estimate (4.5) in compact subsets of \(M^\circ \) with Proposition 4.3 near \((I^0)^\circ \), Proposition 4.8 near \(\mathscr {I}^+\setminus (\mathscr {I}^+\cap I^+)\), and Proposition 4.11 near \(I^+\) proves Theorem 4.2.

4.3 Explicit weights for the background estimate

We sketch the calculations needed to obtain explicit values for the weights in the background estimate. More precisely, we prove the following slight modification of Theorem 4.2:

Theorem 4.12

Let \(a_+=-\tfrac{3}{2}\). There exists an \(\epsilon >0\) such that for \(a_I<\bar{a}_I<a'_I<\min (0,a_0)\) with \(|a_I|\), \(|a'_I|\), \(|\bar{a}_I|\), \(b_I\), \(b'_I<\gamma <\epsilon \) subject to the conditions in Definition 3.1, as well as for \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) with \(\Vert h\Vert _{\mathcal {X}^3}<\epsilon \), the unique global solution of the linear wave equation

$$\begin{aligned} L_h u = f, \ \ (u,\partial _\nu u)|_\Sigma = (u_0,u_1) \end{aligned}$$

satisfies the estimate

$$\begin{aligned}&\Vert u\Vert _{\rho _0^{a_0}\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} + \Vert \pi _{1 1}^c u\Vert _{\rho _0^{a_0}\rho _I^{\bar{a}_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} + \Vert \pi _0 u\Vert _{\rho _0^{a_0}\rho _I^{a'_I}\rho _+^{a_+}H_{\mathscr {I},{\text {b}}}^{1,k-1}} \nonumber \\&\quad \le C\Bigl (\Vert u_0\Vert _{\rho _0^{a_0}H_{{\text {b}}}^k}+\Vert u_1\Vert _{\rho _0^{a_0}H_{{\text {b}}}^{k-1}} \nonumber \\&\qquad + \Vert f\Vert _{H_{{\text {b}}}^{k-1;a_0,a_I-1,a_+}} + \Vert \pi _{1 1}^c f\Vert _{H_{{\text {b}}}^{k-1;a_0,\bar{a}_I-1,a_+}} + \Vert \pi _0 f\Vert _{H_{{\text {b}}}^{k-1;a_0,a'_I-1,a_+}}\Bigr ). \end{aligned}$$
(4.60)

Proof

The usage of an intermediate weight \(\bar{a}_I\in (a_I,a'_I)\) allows for a small but useful modification of the argument following (4.42): namely, in the notation of that proof, we are presently estimating \(u_{1 1}\) with weight \(\rho _I^{a_I}\), while the term \(u_{1 1}^c\) coupling into the equation for \(u_{1 1}\) via \(\pi _{1 1}L_h\pi _{1 1}^c\) is estimated with weight \(\rho _I^{\bar{a}_I}\ll \rho _I^{a_I}\), hence automatically comes with a small prefactor if we work in a sufficiently small neighborhood of \(\mathscr {I}^+\). Correspondingly, in the proof of Proposition 4.11, we would replace the third inner summand in (4.56) by \(\pi _{1 1}\bar{V}_I\pi _{1 1}\), with \(\bar{V}_I=\rho _I^{-2\bar{a}_I}\mathring{\rho }_+^{-2 a_+^0}V'_0\) in order to obtain (4.60) (with \(a_+\ll 0\) not explicit at this point yet).

The only part of the proof of Theorem 4.2 in which we did not get explicit control on the weights is the energy estimate near \(I^+\). In order to obtain the explicit weights there, we note that for \(\gamma =0\), \(h=0\), and Schwarzschild mass \(m=0\), we simply have \(2 L_h=\Box _{\underline{g}{}}\), the wave operator of the Minkowski metric \(\underline{g}{}=dt^2-dx^2\), which acts component-wise on \(S^2 T^*\mathbb {R}^4\) in the trivialization given by coordinate differentials. Recalling from (2.17) that \({}^0M\) denotes the manifold with corners constructed in §2.1 for \(m=0\), we shall prove that the solution of the scalar wave equation \(t^3\Box _{\underline{g}{}}t^{-1}u=f\), with \(f\in \rho _I^{a_I-1}\rho _+^{a_+}L^2_{\text {b}}\) supported in \(\rho _+<1\), satisfies the estimate

$$\begin{aligned} \Vert u\Vert _{\rho _I^{a_I}\rho _+^{a_+}H_{\mathscr {I}}^1} \lesssim \Vert f\Vert _{\rho _I^{a_I-1}\rho _+^{a_+}L^2_{\text {b}}} \end{aligned}$$
(4.61)

for \(a_+=-\tfrac{3}{2}\) and \(a_I<0\) small, using a vector field multiplier argument; here, \(\rho _I={}^0\rho _I\) and \(\rho _+={}^0\rho _+\). But then, if the weights \(a_I,a'_I,\bar{a}_I\) etc. are very close to one another, the nonscalar commutant used in (4.56), modified as above, is very close to being principally scalar away from \(\mathscr {I}^+\); correspondingly, a slight modification of our arguments below for the Minkowski case (4.61) yields the estimate (4.60) for \(k=1\). Higher b-regularity follows as in the proof of Proposition 4.11.

In order to prove the estimate (4.61), we introduce explicit coordinates near the temporal face \(I^+\subset M\) within the blow-up of compactified Minkowski space. First of all, the calculations in §A.3 imply

$$\begin{aligned} t^3\Box _{\underline{g}{}}t^{-1}=\Box _{g_{\text {dS}}}-2, \end{aligned}$$
(4.62)

where

$$\begin{aligned} g_{\text {dS}}=t^{-2}(dt^2-dx^2) \end{aligned}$$
(4.63)

is the de Sitter metric; notice though that we are interested in \(t\gg 1\). Thus, consider the isometry

$$\begin{aligned} (t,x)\mapsto (\hat{\tau },\hat{x})=\frac{1}{t^2-r^2}(t,x)\in [0,\infty )_{\hat{\tau }}\times \mathbb {R}^3_{\hat{x}} \end{aligned}$$
(4.64)

of \(g_{\text {dS}}\), defined in \(t>r=|x|\): it maps \(I^+\) to (0, 0) and \(\mathscr {I}^+\) to \(\{\hat{\tau }=|\hat{x}|\}\), see Figure 11. (The map (4.64) is the change of coordinates between the upper half space models of de Sitter space associated with q on the one hand and its antipodal point on the future conformal boundary of de Sitter space on the other hand; see [61, §6.1] for the relevant formulas).
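Both (4.62) and the isometry property of (4.64) can be checked by hand; the following is a sketch. For (4.62) we assume the sign convention \(\Box _g=-|g|^{-1/2}\partial _\mu (|g|^{1/2}g^{\mu \nu }\partial _\nu )\), which is the one consistent with the sign of the constant in (4.62). Then \(\Box _{\underline{g}{}}=-\partial _t^2+\Delta _x\), and since \(|g_{\text {dS}}|^{1/2}=t^{-4}\),

$$\begin{aligned} t^3\Box _{\underline{g}{}}(t^{-1}u)&= -t^2\partial _t^2 u + 2 t\partial _t u - 2 u + t^2\Delta _x u, \\ \Box _{g_{\text {dS}}}u&= -t^4\partial _t(t^{-2}\partial _t u) + t^2\Delta _x u = -t^2\partial _t^2 u + 2 t\partial _t u + t^2\Delta _x u. \end{aligned}$$

For (4.64), write \(y=(t,x)\), \(s=\langle y,y\rangle =t^2-r^2\) with the Minkowski pairing, and \(\hat{y}=y/s\) (the symbols y, s are used here for illustration only). Using \(ds=2\langle y,dy\rangle \),

$$\begin{aligned} d\hat{y}&= \frac{dy}{s} - \frac{y\,ds}{s^2}, \\ \langle d\hat{y},d\hat{y}\rangle&= \frac{\langle dy,dy\rangle }{s^2} - \frac{2\langle y,dy\rangle \,ds}{s^3} + \frac{s\,ds^2}{s^4} = \frac{\langle dy,dy\rangle }{s^2}, \end{aligned}$$

and since \(\hat{\tau }^{-2}=s^2/t^2\), this gives \(\hat{\tau }^{-2}(d\hat{\tau }^2-d\hat{x}^2)=t^{-2}(dt^2-dx^2)\).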

Fig. 11

Left: part of the conformal embedding of Minkowski space into the Einstein universe \((E,dt^2-g_{\mathbb {S}^3})\), \(E=\mathbb {R}\times \mathbb {S}^3\). Right: conformal embedding of de Sitter space into E, and the backward light cone of a point q on its conformal boundary, whose interior is the domain of the upper half space model (4.63) of de Sitter space, which near q is equal to the static model of de Sitter space near its future timelike infinity, q. The coordinates \((\hat{\tau },\hat{x})\) are regular near \(q=(\hat{\tau }=0,\hat{x}=0)\)

Define the blow-up \(M':=\bigl [[0,\infty )_{\hat{\tau }}\times \mathbb {R}^3_{\hat{x}},\{(0,0)\}\bigr ]\) at the image of \(I^+\). Then the lift of \(\{\hat{\tau }\ge |\hat{x}|\}\) to \(M'\) is canonically identified with a neighborhood of \(I^+\subset M\). Concretely,

$$\begin{aligned} (\rho _+,Z):=(\hat{\tau },\hat{x}/\hat{\tau })=\bigl (t/(t^2-r^2),x/t\bigr ) \in [0,\infty )\times \mathbb {R}^3 \end{aligned}$$

gives coordinates on \(M'\), in which \(U:=[0,1)_{\rho _+}\times \{|Z|\le 1\}\) is identified with a collar neighborhood of \(I^+\subset M\) so that

$$\begin{aligned} g_{\text {dS}}=\hat{\tau }^{-2}(d\hat{\tau }^2-d\hat{x}^2) = (1-|Z|^2)\frac{d\rho _+^2}{\rho _+^2} - 2 Z\,dZ\frac{d\rho _+}{\rho _+} - dZ^2. \end{aligned}$$
(4.65)
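The second equality in (4.65) is a direct substitution; as a sketch, with \(\hat{\tau }=\rho _+\) and \(\hat{x}=\rho _+ Z\),

$$\begin{aligned} d\hat{x}&= Z\,d\rho _+ + \rho _+\,dZ, \\ d\hat{x}^2&= |Z|^2\,d\rho _+^2 + 2\rho _+\,(Z\,dZ)\,d\rho _+ + \rho _+^2\,dZ^2, \\ \hat{\tau }^{-2}(d\hat{\tau }^2-d\hat{x}^2)&= (1-|Z|^2)\frac{d\rho _+^2}{\rho _+^2} - 2 Z\,dZ\,\frac{d\rho _+}{\rho _+} - dZ^2. \end{aligned}$$

In particular, the coefficient of \(d\rho _+^2/\rho _+^2\) vanishes exactly at \(|Z|=1\).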

Furthermore, \(\rho _I:=1-|Z|^2=1-r^2/t^2\) is a defining function of \(\mathscr {I}^+\) in U. Let us write \(R:=|Z|\). Instead of the vector field \(V_{\text {loc}}=(1-c_V)\rho _I\partial _{\rho _I}-\rho _+\partial _{\rho _+}\), which is defined locally near \(\mathscr {I}^+\) and was used in the proof of Proposition 4.11, we use the global vector field

$$\begin{aligned} V_0 = -(1+R^2)\rho _+\partial _{\rho _+} - (1-c_V)(1-R^2)R\partial _R \end{aligned}$$

which is equal to \(V_{\text {loc}}\) near \(\mathscr {I}^+\), up to an overall scalar and modulo \(\rho _I\mathcal {V}_{\text {b}}+\rho _+\mathcal {V}_{\text {b}}\); moreover, \(V_0\) is timelike in \(U\setminus \mathscr {I}^+\) for small \(c_V\ge 0\). Considering the commutant/vector field multiplier \(W:=\rho _I^{-2 a_I}\rho _+^{-2 a_+}V_0\) with \(a_+=-\tfrac{3}{2}\) and \(a_I<0\) small, the expression for the K-current \(K_W\) is somewhat lengthy, so we merely list its main features in \(0\le R\le 1\), writing , with \(K_1\) a section of \(S^2\langle \rho _+\partial _{\rho _+},\partial _R\rangle \) (considered a \(2\times 2\) matrix in this frame) and a scalar:

  • \({\text {tr}}K_1|_{c_V=0}=-2(1-R^4-a_I R^2(4+R^2))<0\), which persists for small \(c_V>0\);

  • \(\det K_1|_{c_V=0}=-4 a_I(1+a_I)R^2(1-R^2)\ge 0\) and

    $$\begin{aligned} (\partial _{c_V}\det K_1)|_{c_V=0} = -16 a_I^2(R^2-\tfrac{1}{1+4 a_I})(R^2+\tfrac{3}{3+4 a_I}) > 0, \end{aligned}$$

    so \(\det K_1>0\) for small \(c_V>0\);

  • , which persists for small \(c_V>0\);

  • \(\rho _I^{2 a_I}\rho _+^{2 a_+}{\text {div}}_{g_{\text {dS}}}W|_{c_V=0}=6-(2-4 a_I)R^2>0\).
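As a sanity check of the signs claimed in this list, one can evaluate the stated expressions at the endpoints \(R=0\) and \(R=1\), assuming \(-\tfrac{1}{4}<a_I<0\):

$$\begin{aligned} {\text {tr}}K_1|_{c_V=0}&= -2 \ \ (R=0), \qquad {\text {tr}}K_1|_{c_V=0} = 10 a_I \ \ (R=1), \\ (\partial _{c_V}\det K_1)|_{c_V=0}&= \frac{48 a_I^2}{(1+4 a_I)(3+4 a_I)} \ \ (R=0), \\ (\partial _{c_V}\det K_1)|_{c_V=0}&= \frac{-64 a_I^3(6+4 a_I)}{(1+4 a_I)(3+4 a_I)} \ \ (R=1), \\ 6-(2-4 a_I)R^2&= 6 \ \ (R=0), \qquad 6-(2-4 a_I)R^2 = 4+4 a_I \ \ (R=1). \end{aligned}$$

The trace values are negative and the remaining ones positive. The two values of \(\partial _{c_V}\det K_1\) matter because \(\det K_1|_{c_V=0}\) vanishes precisely at \(R=0\) and \(R=1\).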

Thus, fixing \(c_V>0\) to be small, the main term arising in the evaluation of the commutator \(-2{\text {Re}}\langle (\Box _{g_{\text {dS}}}-2)u,W u\rangle \) is \(\int _U -K_W(d u,d u)+4({\text {div}}_{g_{\text {dS}}}W)|u|^2\,\frac{d\rho _+}{\rho _+}\,dZ\), which thus gives the desired control on u in \(H_{\mathscr {I}}^1\), except \(|u|^2\) itself is only controlled in \(\rho _I^{a_I-1/2}\rho _+^{a_+}L^2_{\text {b}}\) due to the weaker weight of \({\text {div}}_{g_{\text {dS}}}W\) at \(\mathscr {I}^+\); control in \(\rho _I^{a_I}\rho _+^{a_+}L^2_{\text {b}}\) is obtained by integrating \(\rho _+\partial _{\rho _+}u\in \rho _I^{a_I}\rho _+^{a_+}L^2_{\text {b}}\) from \(\rho _+=1\). This yields (4.61). \(\square \)

5 Newton iteration

Fix \(b_0\), \(b_I\), \(b'_I\), \(b_+\) and \(\gamma \) as in Theorem 4.2. Recall that we want to solve the symmetric 2-tensor-valued wave equation

$$\begin{aligned} P(h)=0,\ \ (h,\partial _\nu h)|_\Sigma = (h_0, h_1) \end{aligned}$$

for initial data \((h_0,h_1)\), \(h_j\in \rho _0^{b_0}H_{{\text {b}}}^\infty (\Sigma )\), small in a suitable high regularity norm, and we hope to find a solution \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\). Following the strategy, outlined in §1, of solving a linearized equation at each step of an iteration scheme, we consider, formally, the iteration scheme with initialization

$$\begin{aligned} L_0 h^{(0)}=0,\ \ (h^{(0)},\partial _\nu h^{(0)})|_\Sigma = (h_0, h_1), \end{aligned}$$

and iterative step \(h^{(N+1)}=h^{(N)}+u^{(N+1)}\), where

$$\begin{aligned} L_{h^{(N)}}u^{(N+1)} = -P(h^{(N)}),\ \ (u^{(N+1)},\partial _\nu u^{(N+1)})|_\Sigma = 0. \end{aligned}$$
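The reason for this choice of iterative step is the usual one for Newton's method. As a formal sketch, assuming \(L_h=D_h P\) and writing \(Q_h(u)\) for the remainder of the first order Taylor expansion of P at h (the symbol \(Q_h\) is introduced here only for illustration):

$$\begin{aligned} P(h+u)&= P(h) + L_h u + Q_h(u), \\ P(h^{(N+1)})&= P(h^{(N)}) + L_{h^{(N)}}u^{(N+1)} + Q_{h^{(N)}}(u^{(N+1)}) = Q_{h^{(N)}}(u^{(N+1)}). \end{aligned}$$

Since \(Q_h(u)\) is at least quadratic in u, the error \(P(h^{(N+1)})\) is formally quadratically small in the size of the correction \(u^{(N+1)}\).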

Assume that \(h^{(N)}\in \mathcal {X}^\infty \) has small \(\mathcal {X}^3\) norm. In order for this iteration scheme to close, we need to show that \(h^{(N+1)}\in \mathcal {X}^\infty \). Since \(P(h^{(N)})\in \mathcal {Y}^\infty \) by Lemma 3.5, this means that we need to prove:

Theorem 5.1

For weights as above, there exists \(\epsilon >0\) such that for \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) with \(\Vert h\Vert _{\mathcal {X}^3}<\epsilon \), the following holds: if \(f\in \mathcal {Y}^{\infty ;b_0,b_I,b'_I,b_+}\) and \(u_0,u_1\in \rho _0^{b_0}H_{{\text {b}}}^\infty (\Sigma )\), then the solution of the initial value problem

$$\begin{aligned} L_h u = f,\ \ (u,\partial _\nu u)|_\Sigma = (u_0,u_1), \end{aligned}$$

satisfies \(u\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\).

Remark 5.2

We recall that membership, of a scalar function u for simplicity, in \(\rho _0^{b_0}H_{{\text {b}}}^\infty (\overline{\mathbb {R}^3})\) is equivalent (up to an arbitrarily small loss in decay) to pointwise estimates \(|V_1\cdots V_N u|\lesssim \langle r\rangle ^{-b_0}\) where the \(V_i\) are translation, rotation, or scaling vector fields on \(\mathbb {R}^3\). The membership \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) means pointwise decay of various components of h towards leading order terms at \(\mathscr {I}^+\) or to zero; see Definition 3.1 and Remark 1.9.

According to Theorem 4.2, we have the background estimate

$$\begin{aligned} u\in H_{{\text {b}}}^{\infty ;b_0,-0,a_+}(M;\beta ^*S^2),\ \ \pi _0 u\in H_{{\text {b}}}^{\infty ;b_0,b'_I-0,a_+}(M;\beta ^*S^2), \end{aligned}$$
(5.1)

for suitable \(a_+\). We shall improve this to \(u\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) using normal operator analysis in several steps, which were outlined around (1.22): using the leading order form (3.25) of \(L_h\), or rather its decoupled versions (3.26a)–(3.26c), we obtain the precise behavior of u near \(\mathscr {I}^+\setminus (\mathscr {I}^+\cap I^+)\) in §5.1 by simple ODE analysis; the correct weight at \(I^+\) but losing some precision at \(\mathscr {I}^+\) near its future boundary in §5.2 by normal operator analysis and a contour shifting argument; and finally the precise behavior near \(\mathscr {I}^+\), uniformly up to \(\mathscr {I}^+\cap I^+\), again by ODE analysis in §5.3.

For later use, we record the mapping properties of P and its linearization on the polyhomogeneous and conormal parts of \(\mathcal {X}^\infty \)—recall (3.9).

Lemma 5.3

Let \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\), with \(\Vert h\Vert _{\mathcal {X}^3}\) small; write \(h=h_{\text {phg}}+h_{\text {b}}\), \(h_{\text {phg}}\in \mathcal {X}^\infty _{\text {phg}}\), \(h_{\text {b}}\in \mathcal {X}^\infty _{\text {b}}\). Then: (1) \(P(h_{\text {phg}})\in \mathcal {Y}^\infty \), (2) \(L_h^0:\mathcal {X}^\infty _{\text {phg}}\rightarrow \mathcal {Y}^\infty \), (3) \(L_h^0\), \(\widetilde{L}_h:\mathcal {X}^\infty _{\text {b}}\rightarrow \mathcal {Y}^\infty _{\text {b}}\), (4) \(\widetilde{L}_h:\mathcal {X}^\infty _{\text {phg}}\rightarrow \mathcal {Y}^\infty _{\text {b}}\).

The point is that the behavior (2)–(3) of the leading term \(L_h^0\) and simple information (1) on the nonlinear operator automatically imply precise mapping properties (4) of the error term \(\widetilde{L}_h\) which are not encoded in (3.25).

Proof of Lemma 5.3

Part (1) follows from Lemma 3.5. One obtains (2) by inspection of (3.25); note that \(L_h^0\) is only well-defined modulo terms in \((\mathcal {C}^\infty +\rho _0^{1+b_0}\rho _I^{-0}\rho _+^{1+b_+}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^1\) which always map \(\mathcal {X}^\infty _{\text {phg}}\rightarrow \mathcal {Y}^\infty \). Likewise, the first part of (3) follows from (3.25); the fact that the ‘good components’ (encoded by the bundle \(K_0\)) have a better weight \(b'_I\) than the weight \(b_I\) of the remaining components (in \(K_0^c\)) is again due to the structure of \(L_h^0\) discussed after Lemma 3.8. The second part of (3) is clear, since this concerns the remainder operator \(\widetilde{L}_h\), whose coefficients are decaying relative to \(\rho _I^{-1}\text {Diff}_{\text {b}}^2\), acting on \(\mathcal {X}^\infty _{\text {b}}\), which consists of tensors decaying at \(\mathscr {I}^+\).

Finally, to prove (4), we take \(u_{\text {phg}}\in \mathcal {X}^\infty _{\text {phg}}\) and write \(\widetilde{L}_h u_{\text {phg}}=D_h P(u_{\text {phg}})-L_h^0 u_{\text {phg}}\). The second term lies in \(\mathcal {Y}^\infty \) by (2), while the first term equals

$$\begin{aligned}&\frac{d}{d s}P(h_{\text {phg}}+s u_{\text {phg}}+h_{\text {b}})\big |_{s=0} \\&\quad =\frac{d}{d s}\biggl (P(h_{\text {phg}}+s u_{\text {phg}}) + \int _0^1 L^0_{h_{\text {phg}}+s u_{\text {phg}}+t h_{\text {b}}}(h_{\text {b}}) + \widetilde{L}_{h_{\text {phg}}+s u_{\text {phg}}+t h_{\text {b}}}(h_{\text {b}})\,dt \biggr )\biggr |_{s=0}; \end{aligned}$$

but each of the three terms in parentheses depends smoothly on s as an element of \(\mathcal {Y}^\infty \) by (1), (2), and (3), respectively. \(\square \)
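The identity used in the last display is the fundamental theorem of calculus in the direction of \(h_{\text {b}}\), combined with the splitting \(D_h P=L_h^0+\widetilde{L}_h\); writing \(h_s:=h_{\text {phg}}+s u_{\text {phg}}\) as a shorthand introduced only for this sketch,

$$\begin{aligned} P(h_s+h_{\text {b}}) - P(h_s)&= \int _0^1 \frac{d}{d t}P(h_s+t h_{\text {b}})\,dt = \int _0^1 D_{h_s+t h_{\text {b}}}P(h_{\text {b}})\,dt \\&= \int _0^1 L^0_{h_s+t h_{\text {b}}}(h_{\text {b}}) + \widetilde{L}_{h_s+t h_{\text {b}}}(h_{\text {b}})\,dt. \end{aligned}$$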

5.1 Asymptotics near \(I^0\cap \mathscr {I}^+\)

With conormal regularity of u at our disposal, all but the leading order terms of \(L_h\) can be regarded as error terms at \(\mathscr {I}^+\): from (5.1) and Lemma 3.8, we get

$$\begin{aligned} L_h^0 u \in \mathcal {Y}^{\infty ;b_0,b_I,b'_I,b_+} + H_{{\text {b}}}^{\infty ;b_0,-1+b'_I-0,a_+}. \end{aligned}$$

Let us now work in a neighborhood \(U\subset M\) of \(I^0\cap \mathscr {I}^+\) and drop the weight at \(I^+\) from the notation. To improve the asymptotics of \(u_{1 1}^c:=\pi _{1 1}^c u\), we use part (3.26b) of the constraint damping/weak null structure hierarchy as well as \(b'_I>b_I\): this gives

$$\begin{aligned} 2\rho ^{-2}\partial _0\partial _1 u_{1 1}^c \in \rho _0^{b_0}\rho _I^{b_I-1}H_{{\text {b}}}^\infty . \end{aligned}$$

Using the local defining functions \(\rho _0\) and \(\rho _I\) from (2.25) and multiplying by \(\rho _I\), this becomes

$$\begin{aligned} \rho _I\partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})u_{1 1}^c \in \rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty . \end{aligned}$$
(5.2)

We can integrate the second vector field from \(\rho _I\ge \epsilon \), where \(u_{1 1}^c\in \rho _0^{b_0}H_{{\text {b}}}^\infty \), obtaining \(\rho _I\partial _{\rho _I}u_{1 1}^c\in \rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty \); this uses \(b_I<b_0\) (see Lemma 7.7 for details). Integrating out \(\rho _I\partial _{\rho _I}\) (see Lemma 7.6) shows that \(u_{1 1}^c\) is the sum of a leading term in \(\rho _0^{b_0}H_{{\text {b}}}^\infty (\mathscr {I}^+\cap U)\) and a remainder in \(\rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty (U)\). This then couples into the equation for \(u_{1 1}=\pi _{1 1}u\), corresponding to part (3.26c) of the hierarchy:

$$\begin{aligned} \rho _I\partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})u_{1 1} \in \rho _I\pi _{1 1}f - \tfrac{1}{2}(\partial _1 h^{\bar{a}\bar{b}})\partial _1(u_{1 1}^c)_{\bar{a}\bar{b}} + \rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty . \end{aligned}$$
(5.3)

The first two summands lie in \(\rho _0^{b_0}H_{{\text {b}}}^\infty (\mathscr {I}^+\cap U)+\rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty \); integrating this along \(\rho _I\partial _{\rho _I}\) generates the logarithmic leading term of \(u_{1 1}\). Thus, \(u_{1 1}=u_{1 1}^{(1)}\log \rho _I+u_{1 1}^{(0)}+u_{1 1,{\text {b}}}\) with \(u_{1 1}^{(j)}\in \rho _0^{b_0}H_{{\text {b}}}^\infty (\mathscr {I}^+\cap U)\) and \(u_{1 1,{\text {b}}}\in \rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty \), as desired.
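The appearance of the logarithm can be seen in a scalar model, given here as a sketch in which \(\rho \) plays the role of \(\rho _I\), c is independent of \(\rho \), and \(|g(\rho )|\le C\rho ^b\) with \(b>0\) (the names v, c, g are illustrative). Integrating \(\rho \partial _\rho v=c+g\) from \(\rho =\epsilon \) gives

$$\begin{aligned} v(\rho )&= v(\epsilon ) - \int _\rho ^\epsilon \bigl (c+g(s)\bigr )\frac{d s}{s} \\&= c\log \rho + \Bigl (v(\epsilon ) - c\log \epsilon - \int _0^\epsilon g(s)\frac{d s}{s}\Bigr ) + \int _0^\rho g(s)\frac{d s}{s}, \end{aligned}$$

where the last integral is bounded by \(C\rho ^b/b\). The three terms correspond to the logarithmic leading term, the constant leading term, and the decaying remainder, respectively.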

It remains to improve \(u_0=\pi _0 u\). Write \(u=u_{\text {phg}}+u_{\text {b}}\), where \(u_{\text {phg}}\in \mathcal {X}_{\text {phg}}^\infty \) and \(u_{\text {b}}\in \rho _0^{b_0}\rho _I^{b_I}H_{{\text {b}}}^\infty \) according to what we have already established; note that the space \(\mathcal {X}_{\text {phg}}^\infty \) is independent of the choice of \(b_I,b'_I\in (0,1)\). Then

$$\begin{aligned} \pi _0 L_h^0\pi _0 u_0 = \pi _0 f - \pi _0\widetilde{L}_h(\pi _0 u_0) - \pi _0\widetilde{L}_h(\pi _0^c u_{\text {b}}) - \pi _0\widetilde{L}_h(\pi _0^c u_{\text {phg}}) \in \rho _0^{b_0}\rho _I^{b'_I-1}H_{{\text {b}}}^\infty : \end{aligned}$$

for the first summand, this follows from \(f\in \mathcal {Y}^{\infty ;b_0,b_I,b'_I,b_+}\), for the second summand from \(u_0\in \rho _0^{b_0}\rho _I^{b'_I-0}H_{{\text {b}}}^\infty \) and the decay of the coefficients of \(\widetilde{L}_h\), similarly for the third summand; and for the fourth summand, we use Lemma 5.3(4). Using the notation of part (3.26a) of the hierarchy, this means

$$\begin{aligned} (\rho _I\partial _{\rho _I}-A_{\text {CD}})(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})u_0 \in \rho _0^{b_0}\rho _I^{b'_I}H_{{\text {b}}}^\infty . \end{aligned}$$

Since we are taking \(\gamma >b'_I\), all eigenvalues of \(A_{\text {CD}}\) are \(>b'_I\), so integration of \(\rho _I\partial _{\rho _I}-A_{\text {CD}}\) and then of \(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I}\) (using \(b'_I<b_0\)) gives \(u_0\in \rho _0^{b_0}\rho _I^{b'_I}H_{{\text {b}}}^\infty \). We have thus shown that \(u\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) near \(I^0\cap \mathscr {I}^+\); in fact, this holds away from \(I^+\).
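The role of the eigenvalue condition can be illustrated in a scalar model, sketched here with a number \(\lambda \) standing in for an eigenvalue of \(A_{\text {CD}}\), \(\rho \) for \(\rho _I\), and a forcing term with \(|g(\rho )|\le C\rho ^b\), \(b<\lambda \) (all names illustrative). Since \(\rho \partial _\rho (\rho ^{-\lambda }v)=\rho ^{-\lambda }(\rho \partial _\rho -\lambda )v\), the equation \((\rho \partial _\rho -\lambda )v=g\) integrates to

$$\begin{aligned} v(\rho )&= \Bigl (\frac{\rho }{\epsilon }\Bigr )^\lambda v(\epsilon ) - \rho ^\lambda \int _\rho ^\epsilon s^{-\lambda }g(s)\frac{d s}{s}, \\ \Bigl |\rho ^\lambda \int _\rho ^\epsilon s^{-\lambda }g(s)\frac{d s}{s}\Bigr |&\le C\rho ^\lambda \int _\rho ^\epsilon s^{b-\lambda }\frac{d s}{s} \le \frac{C}{\lambda -b}\,\rho ^b, \end{aligned}$$

so \(v=\mathcal {O}(\rho ^b)\) for \(\rho \le \epsilon \): the decay rate of the forcing is inherited by v without loss precisely because \(\lambda >b\).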

5.2 Asymptotics at the temporal face

We work near \(I^+\) now and drop the weight at \(I^0\) from the notation. Recall from (3.27) the gauge-damped operator \(\underline{L}{}\) on Minkowski space; by Lemma 3.10 and (3.29), we have

$$\begin{aligned} L_h-\underline{L}{}\in \rho _I^{-1-0}\rho _+^{1+b_+}H_{{\text {b}}}^\infty (M)\cdot \text {Diff}_{\text {b}}^2(M;\beta ^*S^2). \end{aligned}$$
(5.4)

We shall deduce the asymptotic behavior of u at \(I^+\) from a study of the operator \(\underline{L}{}\) (and its resonances) on a partial radial compactification N of \(\mathbb {R}^4\) without blowing up the latter at the light cone at future infinity. Before making this precise, we study \(\underline{L}{}\) in detail as a b-operator on N. Let

$$\begin{aligned} \tau =t^{-1},\ \ X=x/t; \end{aligned}$$

these are smooth coordinates on the radial compactification

$$\begin{aligned} N:=[0,\infty )_\tau \times \mathbb {R}^3_X \end{aligned}$$

of \(\mathbb {R}^4\) in \(t>0\), see Figure 12. We have \(d_X=t d_x\), \(t\delta _e=\delta _X\), \(t\delta _e^*=\delta _X^*\), and \(t\partial _t=-\tau \partial _\tau -X\partial _X\). Thus, if we trivialize \(S^2\,{}^{{\text {sc}}}T^*\,{}^0\overline{\mathbb {R}^4}\) using coordinate differentials, the explicit expression of \(\underline{L}{}\) given in §A.3 shows that \(\underline{L}{}\) is a dilation-invariant element of \(\text {Diff}_{\text {b}}^2(N;\underline{\mathbb {C}}{}^{10})\), i.e. \(\underline{L}{}=N(\underline{L}{})\), recalling the definition (2.2) of the normal operator.
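The identities for \(t\partial _t\) and for the rescaling of spatial derivatives can be checked directly by the chain rule; the following verification uses only the definitions \(\tau =t^{-1}\), \(X=x/t\), with indices \(j,k\in \{1,2,3\}\) introduced here for illustration:

$$\begin{aligned} t\partial _t\tau&= -t^{-1} = -\tau ,&t\partial _t X^k&= -x^k/t = -X^k, \\ t\partial _{x^j}\tau&= 0,&t\partial _{x^j}X^k&= \delta _j^k. \end{aligned}$$

Hence \(t\partial _t=-\tau \partial _\tau -\sum _k X^k\partial _{X^k}\) and \(t\partial _{x^j}=\partial _{X^j}\), which is the source of the factors of t in the relations above.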

Note that \(L_h\) (and even \(L_0\)) has singular coefficients at \(\partial I^+\subset {}^mM\) due to the gauge/constraint damping term: the singular terms come from \(-\rho ^{-1}A_h\partial _1\) in Lemma 3.8. Likewise, \(\underline{L}{}\), on the blow-up of N at the light cone \(\{\tau =0,\,|X|=1\}\) at infinity, has coefficients with \(\rho _I^{-1}\) singularities, which would complicate the normal operator analysis at the temporal face \({}^0i^+\), the lift of

$$\begin{aligned} B:=\{\tau =0,\,|X|\le 1\}. \end{aligned}$$

On the other hand, \(\underline{L}{}\) does have smooth coefficients on the un-blown-up space N, and we recall its well-understood b- and normal operator analysis at \(\partial N\) momentarily. The discussion of the relation between the blown-up and the un-blown-up picture starts with Lemma 5.7 below.

Fig. 12. Illustration of the compactification N near its boundary at infinity \(\partial N=\{\tau =0\}\). Shown are future timelike infinity \(B={}^0\beta ({}^0 I^+)\), its boundary \(\partial B=S^+\), and, for illustration, the light cone \(|x|=t\) (dashed).

Conjugating \(\underline{L}{}\) by the Mellin transform in \(\tau \), thus formally replacing \(\tau \partial _\tau \) by \(i\sigma \), gives the Mellin-transformed normal operator family \(\widehat{\underline{L}{}}(\sigma )\in \text {Diff}^2(\partial N;\underline{\mathbb {C}}{}^{10})\), depending holomorphically on \(\sigma \in \mathbb {C}\); the principal symbol of \(\widehat{\underline{L}{}}\) is independent of \(\sigma \).
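To make the replacement of \(\tau \partial _\tau \) by \(i\sigma \) explicit, assume the normalization \(\widehat{u}(\sigma )=\int _0^\infty \tau ^{-i\sigma }u(\tau )\,\tfrac{d\tau }{\tau }\) of the Mellin transform; this normalization is an assumption made here, chosen to match the inversion formula with kernel \(\tau ^{i\sigma }\) used later in this section. For \(u\in \mathcal {C}^\infty _{\text {c}}((0,\infty ))\), integration by parts gives

$$\begin{aligned} \widehat{\tau \partial _\tau u}(\sigma ) = \int _0^\infty \tau ^{-i\sigma }\,\partial _\tau u(\tau )\,d\tau = i\sigma \int _0^\infty \tau ^{-i\sigma -1}u(\tau )\,d\tau = i\sigma \,\widehat{u}(\sigma ). \end{aligned}$$

Since the coefficients of a dilation-invariant operator do not depend on \(\tau \), the same computation applies to each term of \(\underline{L}{}\).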

We already control u in Theorem 5.1 away from \(I^+\subset M\), so only need to study u (and how \(\underline{L}{}\) relates to it) near \({}^m I^+\), whose image under the blow-down map \({}^m\beta \) on \({}^mM\) is identified with B, see Lemma 2.10. For \(s\in \mathbb {R}\), we then define the function space \(\dot{H}^s(B;\underline{\mathbb {C}}{}^{10})\) as the space of all \(v\in H_{{\text {loc}}}^s(\partial N;\underline{\mathbb {C}}{}^{10})\) which are supported in B. (We are using the notation of [56, Appendix B]). Let

$$\begin{aligned} \mathfrak {X}^s := \{ u\in \dot{H}^s(B;\underline{\mathbb {C}}{}^{10}) :\widehat{\underline{L}{}}(0)u \in \dot{H}^{s-1}(B;\underline{\mathbb {C}}{}^{10})\},\ \ \mathfrak {Y}^s := \dot{H}^s(B;\underline{\mathbb {C}}{}^{10}). \end{aligned}$$

Semiclassical Sobolev spaces are defined by \(\dot{H}_h^s=\dot{H}^s\) with h-dependent norm \(\Vert u\Vert _{\dot{H}_h^s}=\Vert \langle h D\rangle ^s u\Vert _{L^2}\) on \(\partial N\cong \mathbb {R}_X^3\). Let further \(\underline{\mathcal {M}}{}\subset \text {Diff}^1(\partial N;\underline{\mathbb {C}}{}^{10})\) denote the \(\mathcal {C}^\infty (\partial N)\)-module of first order operators with principal symbol vanishing on \(N^*\partial B\), and fix a finite set \(\{A_j\}\subset \underline{\mathcal {M}}{}\) of generators.\(^{36}\) For \(k\in \mathbb {N}_0\), we then define

$$\begin{aligned} \dot{H}^{s,k}(B;\underline{\mathbb {C}}{}^{10}) = \{ u\in \dot{H}^s :A_{j_1}\cdots A_{j_\ell }u\in \dot{H}^s,\,0\le \ell \le k \} \end{aligned}$$

and the semiclassical analogue \(\dot{H}_h^{s,k}=\dot{H}^{s,k}\) with norm

$$\begin{aligned} \Vert u\Vert _{\dot{H}_h^{s,k}}^2 = \Vert u\Vert _{\dot{H}_h^s}^2 + \sum _{0\le \ell \le k} \Vert (h A_{j_1})\cdots (h A_{j_\ell })u\Vert _{\dot{H}_h^s}^2. \end{aligned}$$

Lemma 5.4

Let \(C>0\), and fix \(s<\tfrac{1}{2}-C\). Then \(\widehat{\underline{L}{}}(\sigma ):\mathfrak {X}^s\rightarrow \mathfrak {Y}^{s-1}\) is an analytic family of Fredholm operators in \(\{\sigma \in \mathbb {C}:{\text {Im}}\sigma >-C\}\), with meromorphic inverse satisfying

$$\begin{aligned} \Vert \widehat{\underline{L}{}}(\sigma )^{-1}f\Vert _{\dot{H}_{\langle \sigma \rangle ^{-1}}^{s,k}} \le C'_k\langle \sigma \rangle ^{-1}\Vert f\Vert _{\dot{H}_{\langle \sigma \rangle ^{-1}}^{s-1,k}},\ \ |{\text {Im}}\sigma |\le C,\ |{\text {Re}}\sigma | \gg 1, \end{aligned}$$

for any \(k\in \mathbb {N}_0\).

Proof

For \(k=0\), this is almost the same statement as proved in [107, §5], see also [13] and the summary of the presently relevant results in [14, §6]; adding higher module regularity, i.e. \(k\ge 1\), follows by a standard argument, commuting (compositions of) a well-chosen spanning set of \(\underline{\mathcal {M}}{}\) through the equation \(\widehat{\underline{L}{}}(\sigma )u=f\); see [13, Proof of Proposition 4.4] and the discussion prior to [58, Theorem 5.4] for details in the closely related b-setting (i.e. prior to conjugation by the Mellin transform). We shall thus be brief.

The only two differences between the references and the present situation are: (1) \(\hat{\underline{L}{}}(\sigma )\) is an operator acting on a vector bundle; (2) we are working with supported function spaces in B, i.e. future timelike infinity, rather than globally on the boundary of the radial compactification of Minkowski space. Since \(\hat{\underline{L}{}}(\sigma )\) is principally scalar, (1) only affects the threshold regularity at the radial set \(N^*\partial B\). For \(\gamma =0\), \(\underline{L}{}\) is simply a conjugation of \(\tfrac{1}{2}\) times the scalar wave operator, acting diagonally on \(\underline{\mathbb {C}}{}^{10}\), and in this case the threshold regularity is given as \(s<\tfrac{1}{2}+{\text {Im}}\sigma \) in [14, §6], which is implied by our assumption \(s<\tfrac{1}{2}-C\). For small \(\gamma >0\) (depending on the choice of s), this assumption is still sufficient. A straightforward calculation (which we omit) shows that the eigenvalues of \(\sigma _1(t^3(\underline{\widetilde{\delta }{}}{}^*-\delta _{\underline{g}{}}^*)\delta _{\underline{g}{}}G_{\underline{g}{}}t^{-1})|_{N^*\partial B}\) are \(\ge 0\), hence the threshold regularity is \(s<\tfrac{1}{2}+{\text {Im}}\sigma \) for any \(\gamma \ge 0\). (This is closely related to the fact that the components of the solution of \(\underline{L}{} u=f\in \mathcal {C}^\infty _{\text {c}}(\mathbb {R}^4)\) do not grow at \(\mathscr {I}^+\); see Lemma 5.7 below for the relation between growth/decay on M and regularity on \(\overline{\mathbb {R}^4}\)).

In order to deal with (2), it is convenient to first study \(\widehat{\underline{L}{}}(\sigma )\) acting on supported distributions on a larger ball \(B_d:=\{|X|\le 1+d\}\). The only slightly delicate part of the argument establishing the Fredholm property of \(\widehat{\underline{L}{}}(\sigma )\) acting between \(\dot{H}^s(B_2;\underline{\mathbb {C}}{}^{10})\)-type spaces is the adjoint estimate: we need to show that \(\widehat{\underline{L}{}}(\sigma )^*\) satisfies an estimate

$$\begin{aligned} \Vert u\Vert _{\bar{H}^{1-s}(B_2^\circ )}\lesssim \Vert \widehat{\underline{L}{}}(\sigma )^*u\Vert _{\bar{H}^{-s}(B_2^\circ )}+\Vert u\Vert _{\bar{H}^{s_0}(B_2^\circ )} \end{aligned}$$
(5.5)

for some \(s_0<1-s\); here \(\bar{H}^s(B_2^\circ )\) denotes extendible distributions, i.e. restrictions of \(H_{{\text {loc}}}^s\) sections on \(\partial N\) to \(B_2^\circ \). This estimate however is straightforward to obtain by combining elliptic, real principal type, and radial point estimates in \(B_1\), as in the references, with energy estimates for \(\widehat{\underline{L}{}}(\sigma )^*\) which is a wave operator (on the principal symbol level) in \(B_2\setminus B_{1/2}\), see e.g. [114, §3.2] where our \(\widehat{\underline{L}{}}(\sigma )^*\) is denoted P. High energy estimates for \(\widehat{\underline{L}{}}(\sigma )\) on \(\dot{H}^s(B_2)\)-type spaces follow by similar arguments (using [107, Proposition 3.8] for the energy estimate).

Suppose now \(\widehat{\underline{L}{}}(\sigma )u=f\in \dot{H}^{s-1}(B)\) with \(u\in \dot{H}^s(B_2)\). Then energy estimates in \(B_2\setminus B\) imply \({\text {supp}}u\subset B\). This and the Fredholm property of \(\widehat{\underline{L}{}}\) on \(B_2\) yield the desired Fredholm property of \(\widehat{\underline{L}{}}:\mathfrak {X}^s\rightarrow \mathfrak {Y}^{s-1}\) (specifically, the finite codimensionality of the range). Similarly, the high energy estimates on \(B_2\) imply those on B, finishing the proof. \(\square \)

Lemma 5.5

For small \(\gamma \ge 0\), all resonances \(\sigma \in \mathbb {C}\) of \(\underline{L}{}\) satisfy \({\text {Im}}\sigma <0\).

Remark 5.6

One can in fact compute the divisor of \(\underline{L}{}\), i.e. the set of \((z,k)\in \mathbb {C}\times \mathbb {N}_0\) such that \(\widehat{\underline{L}{}}(\sigma )^{-1}\) has a pole of order \(\ge k+1\) at \(\sigma =z\), quite explicitly for any \(\gamma \): it is contained in \(-i \overline{\cup }-2 i \overline{\cup }-i(1+\gamma ) \overline{\cup }-i(1+2\gamma )\), using the shorthand notation (2.35).

Proof of Lemma 5.5

For \(\gamma =0\), and in the trivialization of \(S^2 T^*\mathbb {R}^4\) by coordinate differentials, \(\underline{L}{}\) acts, up to conjugation and rescaling, component-wise as the scalar wave operator on Minkowski space, for which the divisor is known to be \(-i\), see [13, §10.1]. For small \(\gamma \), \(\underline{L}{}\) is a small perturbation of this, and the lemma follows. (See also [107, §2.7]). \(\square \)
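The role of the sign of \({\text {Im}}\sigma \) can be seen from a formal contour shift. In the following sketch, the meromorphic family \(g(\sigma )\), the pole \(z\), the residue \(a(X)\), and the levels \(\alpha <\alpha '\) are hypothetical notation; we assume that \(g\) has a single simple pole at \(\sigma =z\) in the strip \(-\alpha '<{\text {Im}}\sigma <-\alpha \), with residue \(a(X)\), and that \(g\) decays sufficiently fast as \(|{\text {Re}}\sigma |\rightarrow \infty \) in this strip. Then the residue theorem gives

$$\begin{aligned} \frac{1}{2\pi }\int _{{\text {Im}}\sigma =-\alpha }\tau ^{i\sigma }g(\sigma )\,d\sigma = -i\,\tau ^{i z}a(X) + \frac{1}{2\pi }\int _{{\text {Im}}\sigma =-\alpha '}\tau ^{i\sigma }g(\sigma )\,d\sigma ,\qquad |\tau ^{i z}| = \tau ^{-{\text {Im}}z}. \end{aligned}$$

Thus a pole with \({\text {Im}}z<0\) contributes a term which decays as \(\tau \rightarrow 0\); for instance, \(z=-i\) gives \(\tau ^{i z}=\tau =t^{-1}\).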

Since by Eq. (3.28), \(L_0-\underline{L}{}\in \rho ^{1-0}H_{{\text {b}}}^\infty \text {Diff}_{\text {b}}^2({}^m\overline{\mathbb {R}^4})\), the normal operators as b-differential operators on \({}^m\overline{\mathbb {R}^4}\) are the same, \(N(L_0)=N(\underline{L}{})\), hence the above results hold for \(N(L_0)\) in place of \(\underline{L}{}\).

We next relate the relevant function spaces on \({}^mM\), \({}^m\overline{\mathbb {R}^4}\). We only need to consider supported distributions near \({}^mi^+\subset {}^mM\). We drop m from the notation. If \(\rho _+\in \mathcal {C}^\infty (M)\) denotes a defining function of \(I^+\) such that \(\rho _+>2\) at \(I^0\), let

$$\begin{aligned} U := \{ \rho _+ \le 1 \} \subset M. \end{aligned}$$

Let \(\mathcal {M}_{\text {b}}\subset \text {Diff}_{\text {b}}^1(\overline{\mathbb {R}^4})\) be the \(\mathcal {C}^\infty (\overline{\mathbb {R}^4})\)-module of b-differential operators with b-principal symbol vanishing on \({}^{{\text {b}}}N^*S^+\),\(^{37}\) and define \(H_{{\text {b}},{\text {loc}}}^{s,k}(\overline{\mathbb {R}^4})\) to consist of all \(u\in H_{{\text {b}},{\text {loc}}}^s(\overline{\mathbb {R}^4})\) for which \(A_1\cdots A_\ell u\in H_{{\text {b}},{\text {loc}}}^s(\overline{\mathbb {R}^4})\) for all \(0\le \ell \le k\), \(A_j\in \mathcal {M}_{\text {b}}\). Supported distributions on a compact set \(V\subset \overline{\mathbb {R}^4}\) are denoted \(\dot{H}_{{\text {b}}}^{s,k}(V)\).

Lemma 5.7

For \(a_+\in \mathbb {R}\), \(d>-\tfrac{1}{2}\), and \(k\in \mathbb {N}_0\), the map \(\beta |_{U\setminus \partial M}:U\setminus \partial M\xrightarrow {\cong }\beta (U)\setminus \partial \overline{\mathbb {R}^4}\) induces a continuous inclusion

$$\begin{aligned} \rho _I^{a_++d-1/2}\rho _+^{a_+} \dot{H}_{{\text {b}}}^{k+d}(U) \hookrightarrow \rho ^{a_+}\dot{H}_{{\text {b}}}^{d,k}(\beta (U)), \end{aligned}$$
(5.6)

and conversely

$$\begin{aligned} \rho ^{a_+}\dot{H}_{{\text {b}}}^{d,k}(\beta (U)) \hookrightarrow \rho _I^{a_++d-1/2}\rho _+^{a_+}\dot{H}_{{\text {b}}}^k(U). \end{aligned}$$
(5.7)

Thus, given the condition on supports, b-regularity near \(S^+\) is, apart from losses in module regularity, the same as decay at \(\mathscr {I}^+\). See Figure 13. A version of the inclusion (5.7) is (implicitly) a key ingredient of [14], see in particular §9.2 there.

Fig. 13. The neighborhood U of \(I^+\subset M\) as well as its image in \(\overline{\mathbb {R}^4}\) under the blow-down map \(\beta \).

Proof of Lemma 5.7

First consider (5.6). Dividing by \(\rho ^{a_+}=\rho _I^{a_+}\rho _+^{a_+}\), it suffices to prove this for \(a_+=0\). Furthermore, elements of \(\mathcal {M}_{\text {b}}\) lift to b-differential operators on M; in fact, \(\text {Diff}_{\text {b}}^1(M)\) is generated, over \(\mathcal {C}^\infty (M)\), by the lift of \(\mathcal {M}_{\text {b}}\) to M. Therefore, it suffices to consider the case \(k=0\) and prove

$$\begin{aligned} \rho _I^{d-1/2} \dot{H}_{{\text {b}}}^d(U) \hookrightarrow \dot{H}_{{\text {b}}}^d(\beta (U)),\ \ d>-\tfrac{1}{2}. \end{aligned}$$
(5.8)

For \(d=0\), this is a consequence of the fact that \(\rho _I\) times a b-density on M pushes forward to a b-density on \(\overline{\mathbb {R}^4}\), cf. (4.27). Next, note that \(\mathcal {V}_{\text {b}}(\overline{\mathbb {R}^4})\) lifts to \(\rho _I^{-1}\mathcal {V}_{\text {b}}(M)\) and thus maps \(\rho _I^\alpha H_{{\text {b}},{\text {loc}}}^s(M)\rightarrow \rho _I^{\alpha -1}H_{{\text {b}},{\text {loc}}}^{s-1}(M)\); the Leibniz rule thus reduces the case \(d\in \mathbb {N}\) to the already established case \(d=0\). For general \(d\ge 0\), (5.8) follows by interpolation; we discuss \(d\in (-\tfrac{1}{2},0]\) below.
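The density statement underlying the case \(d=0\) can be illustrated in model coordinates. The following is a sketch under assumptions: we take \(m=0\), work near \(I^+\cap \mathscr {I}^+\) where \(\tau \lesssim 1-R\) (with \(R=|X|\), \(\omega =X/|X|\)), and use \(\rho _I=1-R\), \(\rho _+=\tau /(1-R)\) as local defining functions on M, which is one possible choice and not necessarily the one fixed elsewhere in the paper. Up to smooth positive factors, a b-density on \(\overline{\mathbb {R}^4}\) is \(|\tfrac{d\tau }{\tau }\,dR\,d\omega |\), and since \(\tau =\rho _I\rho _+\),

$$\begin{aligned} \frac{d\tau }{\tau } = \frac{d\rho _I}{\rho _I}+\frac{d\rho _+}{\rho _+},\quad dR=-d\rho _I \quad \Longrightarrow \quad \Bigl |\frac{d\tau }{\tau }\,dR\,d\omega \Bigr | = \rho _I\,\Bigl |\frac{d\rho _+}{\rho _+}\,\frac{d\rho _I}{\rho _I}\,d\omega \Bigr |. \end{aligned}$$

Consequently \(\Vert u\Vert _{L^2_{\text {b}}(\beta (U))}\) is comparable to \(\Vert \rho _I^{1/2}u\Vert _{L^2_{\text {b}}(U)}\), that is, \(L^2_{\text {b}}(\beta (U))=\rho _I^{-1/2}L^2_{\text {b}}(U)\), in agreement with the weight \(\rho _I^{d-1/2}\) at \(d=0\).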

For (5.7), we again only need to consider \(a_+=0\), \(k=0\), and prove

$$\begin{aligned} \dot{H}_{{\text {b}}}^d(\beta (U)) \hookrightarrow \rho _I^{d-1/2}L^2_{\text {b}}(U) \cong \rho _I^d L^2_{\text {b}}(\beta (U)). \end{aligned}$$
(5.9)

For \(d=0\), this is clear; for \(d=1\), integrating the 1-dimensional Hardy inequality, \(\Vert x^{-1}u\Vert _{L^2(\mathbb {R}_+)}\lesssim \Vert u'\Vert _{L^2(\mathbb {R}_+)}\), \(u\in \mathcal {C}^\infty _{\text {c}}(\mathbb {R}_+)\), in fact gives \(\dot{H}_{{\text {b}}}^1(\beta (U))\hookrightarrow x L^2_{\text {b}}(\beta (U))\), where x is a defining function for \(\beta (\partial U)\) within \(\overline{\mathbb {R}^4}\). In particular, \(\beta ^*x\in \mathcal {C}^\infty (M)\) vanishes at \(\mathscr {I}^+\) and is hence a bounded multiple of \(\rho _I\), from which (5.9) follows. For general \(d\in \mathbb {N}\), we use the following generalization of the Hardy inequality: for \(u\in \mathcal {C}^\infty _{\text {c}}(\mathbb {R}_+)\),

$$\begin{aligned} \Vert x^{-d}u\Vert _{L^2}&= \biggl \Vert \int _0^1\int _0^{t_2}\cdots \int _0^{t_d} u^{(d)}(t x)\,dt\,dt_d\cdots dt_2\biggr \Vert _{L^2} \\&\le \int _0^1\int _0^{t_2}\cdots \int _0^{t_d} \Vert u^{(d)}(t\cdot )\Vert _{L^2}\,dt\,dt_d\cdots dt_2 \\&= \frac{2^{2 d}d!}{(2 d)!} \Vert u^{(d)}\Vert _{L^2}. \end{aligned}$$

For real \(d\ge 0\), (5.9) again follows by interpolation.
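For completeness, the constant in the generalized Hardy inequality can be verified as follows. By scaling, \(\Vert u^{(d)}(t\,\cdot )\Vert _{L^2(\mathbb {R}_+)}=t^{-1/2}\Vert u^{(d)}\Vert _{L^2(\mathbb {R}_+)}\), so it remains to integrate \(t^{-1/2}\) over the simplex \(1\ge t_2\ge \cdots \ge t_d\ge t\ge 0\); each of the d integrations raises the exponent by 1, which gives

$$\begin{aligned} \int _0^1\int _0^{t_2}\cdots \int _0^{t_d} t^{-1/2}\,dt\,dt_d\cdots dt_2 = \prod _{j=1}^d \frac{1}{j-\tfrac{1}{2}} = \frac{2^d}{(2 d-1)!!} = \frac{2^{2 d}d!}{(2 d)!}, \end{aligned}$$

using \((2 d-1)!!=(2 d)!/(2^d d!)\). For \(d=1\) this is the constant 2 of the classical Hardy inequality.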

For \(d\in (-\tfrac{1}{2},0]\), we dualize (5.8) with respect to \(L^2_{\text {b}}(\beta (U))\) and thus need to show \(\bar{H}_{{\text {b}}}^e(\beta (U))\hookrightarrow \rho _I^{e-1/2}\bar{H}_{{\text {b}}}^e(U)\), \(e=-d\in [0,1/2)\). But this follows from (5.7), as in this regularity range, supported and extendible Sobolev spaces are naturally isomorphic [102, §4.5]. Similarly, (5.9) for \(d\in (-\tfrac{1}{2},0]\) follows from (5.8) for \(d\in [0,\tfrac{1}{2})\) by dualization. \(\square \)

Returning to the proof of Theorem 5.1, we have already proved \((1-\chi )u\in \mathcal {X}^\infty \) where \(\chi =\chi (\rho _+)\) is identically 1 for \(\rho _+\le \tfrac{1}{2}\) and vanishes for \(\rho _+\ge 1\). Consider \(\chi u\in \rho _I^{-0}\rho _+^{a_+}\dot{H}_{{\text {b}}}^\infty (U)\), \(a_+<b_+\), which satisfies

$$\begin{aligned} L_h \chi u=f_1:=\chi f+[L_h,\chi ]u\in \rho _I^{-1-0}\rho _+^{b_+}\dot{H}_{{\text {b}}}^\infty (U), \end{aligned}$$

where we use that \([L_h,\chi ]u\) is supported away from \(I^+\). Let

$$\begin{aligned} a'_+=\min (a_+ +1+b_+,b_+)<0, \end{aligned}$$

and fix \(d\in (-\tfrac{1}{2},-\tfrac{1}{2}-a'_+)\). Then the membership \(L_h-N(L_0)\in \rho _I^{-1-0}\rho _+^{1+b_+}H_{{\text {b}}}^\infty (M)\cdot \text {Diff}_{\text {b}}^2(M)\) (see Lemma 3.10) together with Lemma 5.7 yields

$$\begin{aligned} N(L_0)\chi u =: f_2 \in \rho _I^{-1-0}\rho _+^{a'_+}\dot{H}_{{\text {b}}}^\infty (U) \hookrightarrow \rho ^{a'_+}\dot{H}_{{\text {b}}}^{d,\infty }(\beta (U)). \end{aligned}$$
(5.10)

Shrinking U if necessary, we may assume that \(t>1+r_*\) in U. It then suffices to use dilation-invariant operators on \({}^m\overline{\mathbb {R}^4}\) to measure module regularity at \({}^mS^+\). Indeed, for \(m=0\) and thus \(r_*=r\) (the discussion for general m being similar), recall that with \(R=|X|\), \(\omega =X/|X|\), we can take \(\tau \partial _\tau \), \((1-R)\partial _R\), \(\partial _\omega \), and \(\tau \partial _R\) as generators of \(\mathcal {M}_{\text {b}}\); but \(\tau \partial _R=c(1-R)\partial _R\) with \(c=\tau /(1-R)\in [0,1]\) bounded. Write (5.10) using the Mellin transform in \(\tau \) as

$$\begin{aligned} \chi u = \frac{1}{2\pi }\int _{{\text {Im}}\sigma =-\alpha } \tau ^{i\sigma }\widehat{\underline{L}{}}(\sigma )^{-1}\widehat{f_2}(\sigma )\,d\sigma , \end{aligned}$$

initially for \(\alpha =-a_+\); then \(\widehat{f_2}(\sigma )\) is holomorphic in \({\text {Im}}\sigma >-a'_+\) with values in \(\dot{H}^{d,\infty }(B;\underline{\mathbb {C}}{}^{10})\), and in fact extends by continuity to

$$\begin{aligned} \widehat{f_2}(\sigma ) \in L^2\Bigl (\{{\text {Im}}\sigma =-a'_+\}; \langle \sigma \rangle ^{-d-N}\dot{H}_{\langle \sigma \rangle ^{-1}}^{d,N}(B;\underline{\mathbb {C}}{}^{10})\Bigr ) \quad (\forall \,N). \end{aligned}$$
(5.11)

By Lemmas 5.4 and 5.5, \(\widehat{\underline{L}{}}(\sigma )^{-1}\widehat{f_2}(\sigma )\) is thus holomorphic in \({\text {Im}}\sigma >-a'_+\) as well, with values in \(\dot{H}^{d+1,\infty }\), extending by continuity to the space in (5.11) with d replaced by \(d+1\); therefore \(\chi u\in \rho ^{a'_+}\dot{H}_{{\text {b}}}^{d+1,\infty }(\beta (U))\), so \(\chi u\in \rho _I^{-0}\rho _+^{a'_+}\dot{H}_{{\text {b}}}^\infty (U)\) by Lemma 5.7, as we may choose d arbitrarily close to \(-\tfrac{1}{2}-a'_+\). This improves the weight of u at \(I^+\) by \(a'_+-a_+\); iterating the argument gives \(\chi u\in \rho _I^{-0}\rho _+^{b_+}\dot{H}_{{\text {b}}}^\infty (U)\).
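To illustrate the iteration, take the hypothetical values \(a_+=-\tfrac{3}{2}\) and \(b_+=-\tfrac{1}{10}\); these numbers are chosen here purely for illustration. Each step replaces the current weight \(a_+\) by \(a'_+=\min (a_++1+b_+,b_+)\), so

$$\begin{aligned} a_+=-\tfrac{3}{2} \ \longmapsto \ \min \bigl (-\tfrac{3}{2}+\tfrac{9}{10},-\tfrac{1}{10}\bigr )=-\tfrac{3}{5} \ \longmapsto \ \min \bigl (-\tfrac{3}{5}+\tfrac{9}{10},-\tfrac{1}{10}\bigr )=-\tfrac{1}{10}=b_+. \end{aligned}$$

In general the weight \(b_+\) is reached after \(\lceil (b_+-a_+)/(1+b_+)\rceil \) steps, which is finite since \(1+b_+>0\).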

5.3 Asymptotics near \(\mathscr {I}^+\cap I^+\)

It remains to show that the precise asymptotics at \(\mathscr {I}^+\) which we established away from \(I^+\) in §5.1 extend all the way up to \(I^+\), with the weight \(\rho _+^{b_+}\) at \(I^+\). This is completely parallel to the arguments in §5.1: working near \(I^+\), we now have \(L^0_h u\in \mathcal {Y}^{\infty ;b_0,b_I,b'_I,b_+}+H_{{\text {b}}}^{\infty ;b_0,-1+b'_I-0,b_+}\), so with coordinates \(\rho _I,\rho _+\) as in (4.48) (dropping the superscript ‘\(\circ \)’),

$$\begin{aligned} \rho _I\partial _{\rho _I}(\rho _+\partial _{\rho _+}-\rho _I\partial _{\rho _I})u_{1 1}^c\in \rho _I^{b_I}\rho _+^{b_+}H_{{\text {b}}}^\infty ; \end{aligned}$$

now, in \(\rho _+>0\) (and away from \(I^0\)), \(u_{1 1}^c\) has a leading term at \(\mathscr {I}^+\), plus a remainder in \(\rho _I^{b_I}H_{{\text {b}}}^\infty \), while in \(\rho _I>0\), \(u_{1 1}^c=\pi _{1 1}^c u\) lies in \(\rho _+^{b_+}H_{{\text {b}}}^\infty \). Using Lemma 7.6 to integrate the above equation for \(u_{1 1}^c\), we conclude that \(u_{1 1}^c\) is the sum of a leading term in \(\rho _0^{b_0}\rho _+^{b_+}H_{{\text {b}}}^\infty (\mathscr {I}^+)\) and a remainder in \(\rho _0^{b_0}\rho _I^{b_I}\rho _+^{b_+}H_{{\text {b}}}^\infty \), as desired. Similarly, we obtain the desired asymptotic behavior, uniformly up to \(I^+\), of \(u_{1 1}\) and then of \(u_0\). Therefore, \(u\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\), completing the proof of Theorem 5.1.

6 Proof of global stability

We now make Theorem 5.1 quantitative by keeping track of the number of derivatives used and proving tame estimates, the crucial ingredient in Nash–Moser iteration. Fix the mass m; for weights \(b_0,b_I,b'_I,b_+\) as in Definitions 3.1 and 3.3, let

$$\begin{aligned}&B^k := \mathcal {X}^{k;b_0,b_I,b'_I,b_+}; \quad \mathbf{B} ^k := \mathcal {Y}^{k;b_0,b_I,b'_I,b_+} \oplus D^{k;b_0},\\&D^{k;b_0}:=\rho _0^{b_0}H_{{\text {b}}}^{k+1}(\Sigma ) \oplus \rho _0^{b_0}H_{{\text {b}}}^k(\Sigma ). \end{aligned}$$

Let us write \(|\cdot |_s\), resp. \(\Vert \cdot \Vert _s\), for the norm on \(B^s\), resp. \(\mathbf{B} ^s\). Put

$$\begin{aligned} B^\infty = \bigcap _{k\in \mathbb {N}} B^k,\quad \mathbf{B} ^\infty = \bigcap _{k\in \mathbb {N}} \mathbf{B} ^k. \end{aligned}$$

We recall Saint-Raymond’s version [100] of the Nash–Moser inverse function theorem:

Theorem 6.1

(See [100]). Let \(\phi :B^\infty \rightarrow \mathbf{B} ^\infty \) be a \(\mathcal {C}^2\) map, and assume that there exist \(d\in \mathbb {N}\), \(\epsilon >0\), and constants \(C_1,C_2,(C_s)_{s\ge d}\) such that for any \(h,u,v\in B^\infty \) with \(|h|_{3 d}<\epsilon \),

$$\begin{aligned} \Vert \phi (h)\Vert _s&\le C_s(1+|h|_{s+d})\ \ \forall \,s\ge d, \end{aligned}$$
(6.1a)
$$\begin{aligned} \Vert \phi '(h)u\Vert _{2 d}&\le C_1|u|_{3 d}, \end{aligned}$$
(6.1b)
$$\begin{aligned} \Vert \phi ''(h)(u,v)\Vert _{2 d}&\le C_2|u|_{3 d}|v|_{3 d}. \end{aligned}$$
(6.1c)

Moreover, assume that for such h, there exists an operator \(\psi (h):\mathbf{B} ^\infty \rightarrow B^\infty \) satisfying \(\phi '(h)\psi (h)f=f\) and the tame estimate

$$\begin{aligned} |\psi (h)f|_s \le C_s(\Vert f\Vert _{s+d}+|h|_{s+d}\Vert f\Vert _{2 d}),\ \ \forall \,s\ge d,\ f\in \mathbf{B} ^\infty . \end{aligned}$$
(6.2)

Then if \(\Vert \phi (0)\Vert _{2 d}<c\), where \(c>0\) is a constant depending on \(\epsilon \) and \(C_s\) for \(s\le D\), where \(D=16 d^2+43 d+24\), there exists \(h\in B^\infty \), \(|h|_{3 d}<\epsilon \), such that \(\phi (h)=0\).

This uses a family of smoothing operators \((S_\theta )_{\theta >1}:B^\infty \rightarrow B^\infty \) satisfying the estimates

$$\begin{aligned} |S_\theta v|_s\le C_{s,t}\theta ^{s-t}|v|_t,\ \ s\ge t; \qquad |v-S_\theta v|_s\le C_{s,t}\theta ^{s-t}|v|_t,\ \ s\le t. \end{aligned}$$
(6.3)

Acting on standard Sobolev spaces \(H^s(\mathbb {R}^n)\), the existence of such a family is proved in [100, Appendix], and the extension to weighted b-Sobolev spaces on manifolds with corners is straightforward: the arguments on manifolds with boundary given in [60, §11.2] generalize directly to the corner setting. For the spaces \(B^s=\mathcal {X}^s\) at hand then, one writes \(h\in B^\infty \) as \(\chi _1 h+(1-\chi _1)h\), with \(\chi _j\in \mathcal {C}^\infty (M)\), \(j=0,1,2\), identically 1 in a small neighborhood of \(\mathscr {I}^+\), and \(\chi _{j+1}\equiv 1\) on \({\text {supp}}\chi _j\). We smooth out \((1-\chi _1)h\in \rho _0^{b_0}\rho _I^\infty \rho _+^{b_+}H_{{\text {b}}}^\infty (M)\) (see (2.29) for the notation \(\rho _I^\infty \)) as usual and cut the result off using \((1-\chi _0)\); since we are working away from \(\mathscr {I}^+\), the weight of \(\rho _I\) plays no role here. (The proof of [59, Lemma 5.9] shows that cutting off the smoothing of \((1-\chi _1)h\) away from its support does not affect the estimates (6.3)). Near \(\mathscr {I}^+\) on the other hand, we have \(\chi _1 h=(\chi _1 h_\alpha )\), where we denote by \(h_\alpha \) the components of h in the bundle splitting (2.21). The decaying components (3.4) as well as the remainder terms \(h_{\alpha ,{\text {b}}}\) in (3.5)–(3.6) can then be smoothed out and cut off using \(\chi _2\). To smooth out the leading terms, fix a collar neighborhood of \(\mathscr {I}^+\); considering for example \(\chi _1 h_{0 1}=\chi _0 h_{0 1}^{(0)} + \chi _1 h_{0 1,{\text {b}}}\), see (3.6), we smooth out \(h_{0 1}^{(0)}\) in the weighted b-Sobolev space \(\rho _0^{b_0}\rho _+^{b_+}H_{{\text {b}}}^\infty (\mathscr {I}^+)\), extend the result to the collar neighborhood, and cut off using \(\chi _0\); similarly for the other components of h.
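On \(H^s(\mathbb {R}^n)\), estimates of the form (6.3) can be realized by a standard Fourier cutoff; the following sketch is one such construction and is not claimed to be the one used in [100]. Fix \(\chi \in \mathcal {C}^\infty _{\text {c}}(\mathbb {R}^n)\) with \(0\le \chi \le 1\), \(\chi (\xi )=1\) for \(|\xi |\le 1\), and \(\chi (\xi )=0\) for \(|\xi |\ge 2\) (a hypothetical choice of cutoff), and set \(\widehat{S_\theta v}(\xi )=\chi (\xi /\theta )\widehat{v}(\xi )\). For \(\theta >1\),

$$\begin{aligned} \langle \xi \rangle ^{s-t}&\le 5^{(s-t)/2}\theta ^{s-t} \quad \text {on } {\text {supp}}\chi (\cdot /\theta )\subset \{|\xi |\le 2\theta \},\ s\ge t, \\ \langle \xi \rangle ^{s-t}&\le \theta ^{s-t} \quad \text {on } {\text {supp}}\bigl (1-\chi (\cdot /\theta )\bigr )\subset \{|\xi |\ge \theta \},\ s\le t. \end{aligned}$$

Multiplying by \(\langle \xi \rangle ^t|\widehat{v}(\xi )|\) and taking \(L^2\) norms yields the two estimates in (6.3) with \(C_{s,t}=\max (5^{(s-t)/2},1)\).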

Given initial data \((h_0,h_1)\in D^\infty \), we want to apply Theorem 6.1 to the map

$$\begin{aligned} \phi (h)=\bigl (P(h),\,(h,\partial _\nu h)|_\Sigma -(h_0,h_1)\bigr ), \end{aligned}$$
(6.4)

with P given in (3.2). Note that the smallness of \(\phi (0)\) in particular requires \(P(0)=\rho ^{-3}\text {Ric}(g_m)\) to be small. Now, P(0) is nonzero only in the region where we interpolate between the mass m Schwarzschild metric and the Minkowski metric (both of which are Ricci-flat!), i.e. on \({\text {supp}}d\psi \cup {\text {supp}}d\phi \) in the notation of (2.10)–(2.11); thus in fact \(P(0)\in \mathcal {A}_{\text {phg}}^{\emptyset ,\emptyset ,0}\). It is then easy to see that \(\Vert P(0)\Vert _{\mathcal {Y}^k}\le C_k m\) for all \(k\in \mathbb {N}\), which is the reason why we need to assume the ADM mass m to be small to get global solvability.

For \(h\in \mathcal {X}^\infty \) with \(|h|_3\) small, the tensor

$$\begin{aligned} g=g_m+\rho h \end{aligned}$$

is Lorentzian (by Sobolev embedding) and hence \(\phi (h)\) is defined; since P is a second order (nonlinear) differential operator with coefficients which are polynomials in \(g^{-1}\) and up to 2 derivatives of g, and since \(h\mapsto (h,\partial _\nu h)|_\Sigma \) is continuous as a map \(\mathcal {X}^k\rightarrow D^{k-3/2}\) for \(k\ge 2\), the estimate (6.1a) follows for \(d=3\). The estimate (6.1b) also holds for \(d=3\) and \(|h|_{3 d}<\epsilon \) small, since the first component of \(\phi '(h)u\), namely \(L_h u\), is a second order linear differential operator acting on u, with coefficients involving at most 2 derivatives of h; similarly for (6.1c).

The existence of the right inverse \(\psi (h):\mathbf{B} ^\infty \rightarrow B^\infty \) is the content of Theorem 5.1; we merely need to determine a value for d such that the tame estimate (6.2) holds. (As stressed in the introduction, the mere existence of such a d is clear, since the estimates on \(\psi (h)\) are obtained using energy methods, integration along approximate characteristics, and inversion of a linear, smooth coefficient, model operator in §4, §§5.1 and 5.3, and §5.2, respectively). Consider the first term on the right in (6.2): we need to quantify the loss of derivatives of the solution u of \(L_h u=f\), \((u,\partial _\nu u)|_\Sigma =(u_0,u_1)\), relative to the regularity \(k\ge 0\) of \((f,(u_0,u_1))\in \mathbf{B} ^k\).

Now, dropping the \(H_{\mathscr {I}}^1\) regularity part of Theorem 4.2, we obtain \(u\in \rho _0^{b_0}\rho _I^{a_I}\rho _+^{a_+}H_{{\text {b}}}^k\), \(\pi _0 u\in \rho _0^{b_0}\rho _I^{a'_I}\rho _+^{a_+}H_{{\text {b}}}^k\). The arguments near \(I^0\cap \mathscr {I}^+\) in §5.1 first express \(u_{1 1}^c\) as the solution of a transport equation (5.2), with the right hand side involving up to two derivatives of u; since integration of this equation does not regain full b-derivatives, the leading terms (and the remainder term) of \(u_{1 1}^c\) lie in \(H_{{\text {b}}}^{k-2}\), with the correct weight \(b_0\) at \(I^0\) (and \(b_I\) at \(\mathscr {I}^+\)); next, this couples into the transport equation (5.3) for \(u_{1 1}\), again with up to 2 derivatives of u, so integrating this yields leading and remainder terms of \(u_{1 1}\) in \(H_{{\text {b}}}^{k-4}\); and similarly then \(u_0\in \rho _0^{b_0}\rho _I^{b'_I}H_{{\text {b}}}^{k-6}\) near \(I^0\cap \mathscr {I}^+\).

On the other hand, improving the b-weight at \(I^+\) by \(1+b_+\), which we may take to be arbitrarily close to 1 by taking \(b_+<0\) close to 0, uses the rewriting (5.10), which due to the second order nature of \(L_h-N(L_0)\) involves an error term (subsumed into \(f_2\) there) with 2 derivatives on u. Passing to the blow-down using Lemma 5.7 loses at most 1 module derivative; inverting \(N(L_0)\) gains 1 b-derivative (which is used to recover the \(\rho _I^{-0}\) bound at \(\mathscr {I}^+\)), but no module derivatives, so passing back to the blow-up, we have lost at most 3 b-derivatives. Thus, improving the weight at \(I^+\) from \(a_+\) to \(b_+\approx 0\) loses at most \(d_+:=1+3\lceil a_+\rceil \) derivatives relative to \(H_{{\text {b}}}^k\).

These two pieces of information are combined near \(\mathscr {I}^+\cap I^+\) in §5.3, where we lose at most 6 derivatives, just as in the discussion near \(I^0\cap \mathscr {I}^+\), relative to the less regular of the two spaces \(H_{{\text {b}}}^{k-6}\) and \(H_{{\text {b}}}^{k-d_+}\) from above; we thus take \(d=6+\max (6,d_+)\). If we use the explicit background estimate, Theorem 4.12, so \(a_+=\tfrac{3}{2}\), this gives \(d_+=7\) and therefore

$$\begin{aligned} d=13. \end{aligned}$$
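For orientation, this value can be substituted into the quantities appearing in Theorem 6.1; the following is pure arithmetic and involves no new input:

$$\begin{aligned} d&= 6+\max (6,7) = 13,\qquad 2 d = 26,\qquad 3 d = 39, \\ D&= 16 d^2+43 d+24 = 2704+559+24 = 3287. \end{aligned}$$

Thus, in the notation of Theorem 6.1, the smallness of \(\phi (0)\) is measured in \(\Vert \cdot \Vert _{26}\), the smallness of h in \(|\cdot |_{39}\), and the constant c depends on \(C_s\) for \(s\le 3287\).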

For this value of d, one may then verify the tame estimate (6.2) by going through the proofs of Theorems 4.2 and 5.1 and proving tame estimates by exploiting Moser estimates; this is analogous to the manner in which the microlocal estimates for smooth coefficient operators in [107, §2], [58, §2.1] were extended to estimates for rough coefficient operators in [52, §§3–6], which were subsequently sharpened to tame estimates in [59, §§3–4]. In the present setting, obtaining tame estimates is much simpler than in the references, as the estimates in §§4–5 are based on standard energy estimates, so one can appeal directly to the Moser estimates; or, in view of the fact that our energy estimates can be proved using positive commutators (and are indeed phrased this way here), which also underlie the tame estimates in these references, the arguments given there (using vector fields instead of microlocal commutants) apply here as well. We omit the details, but we do point out that it is key that the proofs as stated only use pointwise control of up to 1 derivative of h (via causality considerations and deformation tensors, see e.g. the calculation (4.16) and Lemma 4.6) in order to obtain the main positive terms in the commutator arguments; thus, control of \(|h|_4\) suffices in this sense, that is, the constant in (4.3) for \(k=1\) only depends on \(|h|_4\). The proofs of higher b-regularity use commutation arguments, which do not affect the principal part of \(L_h\), as well as ellipticity considerations around (4.58) which only require pointwise control of h itself; correspondingly, at no point do we need to use the smallness of any higher regularity norms of h. (See the end of [55, §6.4] for a related discussion).

Next, we deal with a small technical complication stemming from the fact that for \(m\ne 0\), the closure of \(\{t=0\}\), on which in Theorem 1.1 we compare the initial data with those of the Schwarzschild metric in its standard form, inside of \({}^m\overline{\mathbb {R}^4}\) is not a smooth hypersurface, the issue being smoothness at \(\partial {}^m\overline{\mathbb {R}^4}\); furthermore, our discussion of linear Cauchy problems used \({}^m\Sigma \ne \overline{\{t=0\}}\) as the Cauchy surface. We resolve this issue by solving the initial value problem for a short amount of time in the radial compactification \({}^0\overline{\mathbb {R}^4}\), with initial surface \(\{t=0\}\) (whose closure is smooth in \({}^0\overline{\mathbb {R}^4}\)), pushing the local solution forward to \({}^m\overline{\mathbb {R}^4}\), and then solving globally from there. Recall the function \(t_{\text {b}}\) from (2.14), and the notation (2.17). (Thus, \({}^0 t_{\text {b}}\) is a rescaling of t, and \({}^0\Sigma =\{{}^0 t_{\text {b}}=0\}\)).

Lemma 6.2

Fix N large, and let \(b_0>0\), \(\epsilon >0\). Suppose \(\gamma ,k\in \mathcal {C}^\infty (\mathbb {R}^3;S^2 T^*\mathbb {R}^3)\) are vacuum initial data on \(\mathbb {R}^3\), that is, solutions of the constraint equations (1.5), such that for some \(m\in \mathbb {R}\),

(6.5)

and \(k\in \rho _0^{2+b_0}H_{{\text {b}}}^\infty (\overline{\mathbb {R}^3};S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^3})\) satisfy

$$\begin{aligned} |m|+\Vert \widetilde{\gamma }\Vert _{\rho _0^{1+b_0}H_{{\text {b}}}^{N+1}}+\Vert k\Vert _{\rho _0^{2+b_0}H_{{\text {b}}}^N}<\delta , \end{aligned}$$
(6.6)

where \(\delta >0\) is a sufficiently small constant; here \(\chi =\chi (r)\) is a cutoff, \(\chi \equiv 0\) for \(r<1\), \(\chi \equiv 1\) for \(r>2\). Then, identifying \(\overline{\mathbb {R}^3}\cong {}^0\Sigma \subset {}^0M\) via \(\mathbb {R}^3\ni x\mapsto (0,x)\in \mathbb {R}^4\), there exists a solution g of the Einstein vacuum equation \(\text {Ric}(g)=0\) in the neighborhood

$$\begin{aligned} U:=\{|{}^0 t_{\text {b}}|<\tfrac{1}{4}\}, \end{aligned}$$
(6.7)

attaining the data \((\gamma ,k)\) at \({}^0\Sigma \) (that is, (1.4) holds) and satisfying the gauge condition \(\Upsilon (g;g_m)=0\); moreover, \(g=g_m+\rho h\), where \(h\in \rho _0^{b_0}H_{{\text {b}}}^\infty (U;S^2\,{}^{{\text {sc}}}T^*\,{}^0\overline{\mathbb {R}^4})\) has norm \(\Vert h\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{N+1}(U)}<\epsilon \).

Proof

Note that the metric \(g_m\) is smooth on \(U\subset {}^0\overline{\mathbb {R}^4}\), as near \(I^0\) it is given by the Schwarzschild metric \(g_m^S\), see (1.3). Using the product decomposition \(\mathbb {R}^4=\mathbb {R}_t\times \mathbb {R}^3_x\), we define a Lorentzian signature metric over the interior \(({}^0\Sigma )^\circ =\{t=0\}\) by

$$\begin{aligned} g_0 := (1-\chi (r)\tfrac{2 m}{r})d t^2 - \gamma \in \mathcal {C}^\infty (({}^0\Sigma )^\circ ;S^2 T^*\mathbb {R}^4), \end{aligned}$$
(6.8)

whose pullback to \({}^0\Sigma \) is equal to \(-\gamma \). We next find \(g_1\in \mathcal {C}^\infty ({}^0\Sigma ;S^2 T^*\mathbb {R}^4)\) such that \(k=II_{g_0+t g_1}\); denoting by \(N=(1-\chi (r)\tfrac{2 m}{r})^{-1/2}\partial _t\) the future unit normal, this is equivalent, by polarization, to

$$\begin{aligned} g_0((\nabla _X^{g_0+t g_1}-\nabla _X^{g_0})X,N)=k(X,X)\ \ \forall \,X\in T({}^0\Sigma )^\circ ; \end{aligned}$$

here, we view \(g_0\) as a stationary metric near \(t=0\), which, due to its symmetry under time reversal \(t\mapsto -t\), has vanishing second fundamental form: \(g_0(\nabla _X^{g_0}X,N)\equiv 0\). A calculation in normal coordinates for \(g_0\) shows that this equation is uniquely solved by

$$\begin{aligned} g_1(X,X)=-2(N t)^{-1} k(X,X) = -2(1-\chi (r)\tfrac{2 m}{r})^{1/2}k(X,X). \end{aligned}$$
(6.9)

It remains to specify \(g_1(N,\cdot )\) and \(g_1(N,N)\), which involves the gauge condition at \(t=0\); that is, for all \(V\in T_{\{t=0\}}\mathbb {R}^4\), we require

$$\begin{aligned} -\Upsilon (g_0;g_m)(V)&= \bigl (\Upsilon (g_0+t g_1;g_m)-\Upsilon (g_0;g_m)\bigr )(V) \nonumber \\&= (G_{g_0}g_1)(V,\nabla ^{g_0}t) = (1-\chi (r)\tfrac{2 m}{r})^{-1/2} (G_{g_0}g_1)(V,N). \end{aligned}$$
(6.10)

For \(V\in T({}^0\Sigma )^\circ \), this determines \((G_{g_0}g_1)(V,N)=g_1(V,N)\). Lastly, if \(E_1,E_2,E_3\in T({}^0\Sigma )^\circ \) complete N to an orthonormal basis, this also determines \((G_{g_0}g_1)(N,N)=\tfrac{1}{2}(g_1(N,N)+\sum _j g_1(E_j,E_j))\) and thus \(g_1(N,N)\).

The assumption on \(\gamma \) gives

$$\begin{aligned} h_0:=\rho _0^{-1}(g_0-g_m)\in \rho _0^{b_0}H_{{\text {b}}}^\infty ({}^0\Sigma ;S^2\,{}^{{\text {sc}}}T^*_{{}^0\Sigma }{}^0\overline{\mathbb {R}^4}). \end{aligned}$$
(6.11)

We claim that likewise

$$\begin{aligned} h_1:=\rho _0^{-2}g_1\in \rho _0^{b_0}H_{{\text {b}}}^\infty ({}^0\Sigma ;S^2\,{}^{{\text {sc}}}T^*_{{}^0\Sigma }{}^0\overline{\mathbb {R}^4}). \end{aligned}$$
(6.12)

We introduce the extra factor of \(\rho _0^{-1}\) since \(\rho _0^{-1}\partial _t\) is a smooth b-vector field on \({}^0\overline{\mathbb {R}^4}\) near \({}^0\Sigma \) and transversal to it; that is, in (4.1), we can take

$$\begin{aligned} \partial _\nu =\rho _0^{-1}\partial _t. \end{aligned}$$

Now the restriction of \(h_1\) to \(S^2\,{}^{{\text {sc}}}T\,{}^0\Sigma \) lies in \(\rho _0^{b_0}H_{{\text {b}}}^\infty \), as follows from (6.9). (Recall that \({}^{{\text {sc}}}T\,{}^0\Sigma \) is spanned by coordinate vector fields on \(\mathbb {R}^3\)). To prove (6.12), it thus suffices to prove that \(\Upsilon (g_0;g_m)(V)\in \rho _0^{2+b_0}H_{{\text {b}}}^\infty \) for V equal to \(\partial _t\) or a coordinate vector field on \(\mathbb {R}^3\); this however follows from (6.11) and the local coordinate expression (3.1) of \(\Upsilon \), as such a vector field V is equal to \(\rho _0\) times a b-vector field on \({}^0\overline{\mathbb {R}^4}\).

This construction preserves smallness, i.e. we have \(\Vert h_0\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{N+1}}+\Vert h_1\Vert _{\rho _0^{b_0}H_{{\text {b}}}^N}<C\delta \) for some constant C. We can then solve the quasilinear wave equation \(P(h)=0\) in the neighborhood U of \({}^0\Sigma \), e.g. using Nash–Moser iteration as explained above. (Since we are not solving up to \(\mathscr {I}^+\) where our arguments in §5 lose derivatives, one can use a simpler iteration scheme here, see [102, §16.1]). The constraint equations then imply that \(\partial _\nu \Upsilon (g_m+\rho h;g_m)=0\) at \({}^0\Sigma \), see [60, §2.1]; since \(\Upsilon \) solves the wave equation (1.31), we have \(\Upsilon \equiv 0\). \(\square \)

To extend this to a global solution, we recall from Lemma 2.10 and the isomorphism (2.40) that h pushes forward to an element of \(\rho _0^{b_0}H_{{\text {b}}}^\infty (U')\), \(U':=\{|{}^m t_{\text {b}}|<\tfrac{1}{8}\}\), and satisfies a bound \(\Vert h\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{N+1}(U')}<C\epsilon \), with C a constant depending only on m. We can thus use \((h_0,h_1)=(h,\partial _\nu h)|_{{}^m\Sigma }\) as Cauchy data for the equation \(P(h)=0\). Note that the gauge condition \(\Upsilon (g)=0\), \(g=g_m+\rho h\), holds identically near \({}^m\Sigma \); by uniqueness of solutions of \(P(h)=0\) with Cauchy data \((h_0,h_1)\), a global solution h will automatically satisfy \(\Upsilon (g)\equiv 0\), as this holds near \({}^m\Sigma \), and then globally by the argument given around equation (1.31).

Theorem 6.3

Fix N large, \(b_0>0\), \(\epsilon >0\), and \(0<\eta <\min (\tfrac{1}{2},b_0)\). Then if \(m\in \mathbb {R}\) and \(h_0,h_1\in \rho _0^{b_0}H_{{\text {b}}}^\infty ({}^m\Sigma )\) satisfy

$$\begin{aligned} |m|+\Vert h_0\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{N+1}}+\Vert h_1\Vert _{\rho _0^{b_0}H_{{\text {b}}}^N}<\delta , \end{aligned}$$

where \(\delta >0\) is a small constant, then there exists a global solution h of

$$\begin{aligned} P(h)=0, \quad (h,\partial _\nu h)|_{{}^m\Sigma }=(h_0,h_1), \end{aligned}$$
(6.13)

that is,

$$\begin{aligned} \text {Ric}(g)-\widetilde{\delta }{}^*\Upsilon (g)=0,\ \ g=g_m+\rho h, \end{aligned}$$

which satisfies \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) for all weights \(b_I<b'_I<\min (1,b_0)\) and \(b_+<0\), and such that, moreover, \(\Vert h\Vert _{\mathcal {X}^{6;b_0,\eta ,\eta /2,-\eta }}<\epsilon \). If in addition \(\Upsilon (g_m+\rho h;g_m)=0\) and \(\partial _\nu \Upsilon (g_m+\rho h;g_m)=0\) at \({}^m\Sigma \), then g solves

$$\begin{aligned} \text {Ric}(g)=0 \end{aligned}$$

in the gauge \(\Upsilon (g)=0\).

As explained above, data for which the assumption in the second part of the theorem holds arise from an application of Lemma 6.2. This assumption is equivalent to the statement that the Riemannian metric and second fundamental form of \({}^m\Sigma \) induced by a metric \(g_m+\rho h\) with \((h,\partial _\nu h)|_{{}^m\Sigma }=(h_0,h_1)\) satisfy the constraint equations, and that the gauge condition \(\Upsilon (h;g_m)=0\) holds pointwise at \({}^m\Sigma \). These are assumptions only involving the data \((h_0,h_1)\); the vanishing of \(\partial _\nu \Upsilon (h)|_{{}^m\Sigma }\) for the solution h of \(P(h)=0\) with these data follows as in the proof of Lemma 6.2.

Proof of Theorem 6.3

This follows, with \(b_I<b'_I<\min (\tfrac{1}{2},b_0)\) at first, for \(N=2 d=26\), from Theorem 6.1 applied to the map in (6.4). The constant \(\delta >0\) depends in particular on the constants \(C_s\) in (6.1a) for \(s\le D=3287\); that is, \(\delta =\delta (\Vert h_0\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{D+1}}+\Vert h_1\Vert _{\rho _0^{b_0}H_{{\text {b}}}^D})\). Repeating the arguments in §§5.1 and 5.3 once more shows that one can take \(b_I<b'_I<\min (1,b_0)\); see also the proof of Theorem 7.1 below.

We remark that h is in fact small in \(\mathcal {X}^{3 d}=\mathcal {X}^{39}\), but if one is interested in the size of up to two derivatives (e.g. curvature) of h, control of its \(\mathcal {X}^6\) norm is sufficient by Sobolev embedding. \(\square \)

Remark 6.4

In other words, using the notation of the proof and \(d\ge 13\), \(N=2 d\), \(D=16 d^2+43 d+24=3287\), and fixing m and \(b_0\), we can solve the initial value problem (6.13) for data in the space \(\mathscr {D}:=\bigcup _C \mathscr {D}(C)\), where

$$\begin{aligned} \mathscr {D}(C) :=\Bigl \{(h_0,h_1) :{}&h_0,\,h_1\in \rho _0^{b_0}H_{{\text {b}}}^\infty ({}^m\Sigma ), \\&|m|+\Vert h_0\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{N+1}}+\Vert h_1\Vert _{\rho _0^{b_0}H_{{\text {b}}}^N}<\delta (C), \\&\Vert h_0\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{D+1}}+\Vert h_1\Vert _{\rho _0^{b_0}H_{{\text {b}}}^D}<C \Bigr \}. \end{aligned}$$

An inspection of the proof of Theorem 6.1 in [100] shows that \(\lim _{C\rightarrow 0}\delta (C)>0\), so \(\mathscr {D}\) in particular contains all conormal data \((h_0,h_1)\) for which \(|m|+\Vert h_0\Vert _{\rho _0^{b_0}H_{{\text {b}}}^{D+1}}+\Vert h_1\Vert _{\rho _0^{b_0}H_{{\text {b}}}^D}<\delta _0\), where \(\delta _0>0\) is a universal constant (i.e. depending only on m and \(b_0\)). Moreover, one also has a continuity statement: for any choice of weights \(b_I,b'_I,b_+\) as in Theorem 6.3, the solution \(h\in \mathcal {X}^{3 d;b_0,b_I,b'_I,b_+}\) of (6.13) depends continuously on \((h_0,h_1)\in \mathscr {D}\), the latter being equipped with the \(\rho _0^{b_0}H_{{\text {b}}}^{D+1}\oplus \rho _0^{b_0}H_{{\text {b}}}^D\) topology (see Footnote 38). Indeed, to obtain continuity at the Minkowski solution, note that the map \(\phi \) in (6.4) depends parametrically on the data \((h_0,h_1)\in \mathscr {D}\), but the constants appearing in the estimates in [100] can be taken to be uniform when \((h_0,h_1)\) varies in \(\mathscr {D}(C)\) with C fixed. Continuity at other solutions is similarly automatic, but the base point of the Nash–Moser iteration (called \(u_0\) in [100, Lemma 1]) should then be given by the solution one is perturbing around.
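The constants in this remark are explicit functions of d. As a quick sanity check, the following Python sketch evaluates them at \(d=13\), the value used in the proof of Theorem 6.3. The function name `regularity_constants` and the dictionary keys are ours and purely illustrative.

```python
def regularity_constants(d: int) -> dict:
    """Evaluate the derivative counts appearing in Remark 6.4 for a given d."""
    return {
        "N": 2 * d,                    # regularity in the smallness condition
        "D": 16 * d**2 + 43 * d + 24,  # regularity on which delta may depend
        "X_order": 3 * d,              # order of the X-norm in which h is small
    }


print(regularity_constants(13))
```

This prints `{'N': 26, 'D': 3287, 'X_order': 39}`, in agreement with the values \(N=26\), \(D=3287\) and \(\mathcal {X}^{39}\) quoted in the proof of Theorem 6.3.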

The solution h of (6.13) in fact has a leading term at \(I^+\), as will follow from the arguments in §7, see the discussion around (7.16); this precise information was not needed to close the iteration scheme, hence we did not encode it in the spaces \(\mathcal {X}^s\).

The conclusion in the form given in Theorem 1.1 can be obtained by combining Lemma 6.2 and Theorem 6.3: using the coordinate \(t_{\text {b}}\) on \({}^mM'\), the initial surface \({}^0\Sigma \) in Minkowski space is given by \(t_{\text {b}}=-2 m \rho _0\chi (r)\log (r-2 m)\). A diffeomorphism of \({}^m\overline{\mathbb {R}^4}\) which near \({}^m\Sigma \) is not smooth but rather polyhomogeneous with index set \(\mathcal {E}_{\mathrm{log}}\), and which is the identity away from \({}^m\Sigma \), can be used to map \(\{t_{\text {b}}\ge -2 m\rho _0\chi (r)\log (r-2 m)\}\subset {}^mM'\) to \({}^mM=\{t_{\text {b}}\ge 0\}\); pushing the solution g obtained from Lemma 6.2 and Theorem 6.3, which is defined on \(t\ge 0\), forward using this diffeomorphism produces the solution g as in Theorem 1.1. (The gauge condition satisfied by g is the wave map condition with respect to the background metric which is the pushforward of \(g_m\).) We omit the proof of future causal geodesic completeness of \((M,g)\), as one can essentially copy the arguments of Lindblad–Rodnianski [79, §16].

Remark 6.5

By Sobolev embedding, h obeys the pointwise bound

$$\begin{aligned} |h|\le C_\eta (1+t+r)^{-1+\eta }(1+(r_*-t)_+)^{b_0}\ \ \forall \,\eta >0 \end{aligned}$$
(6.14)

and is small for fixed \(\eta >0\) if \(\delta =\delta (\eta )>0\) in the theorem is sufficiently small; here, we measure the size of h using any fixed Riemannian inner product on the fibers of \(\beta ^*S^2\), equivalently, by measuring \(\sum _{i j}|h(Z_i,Z_j)|\), where \(\{Z_i\}=\{\partial _t,\partial _{x^1},\partial _{x^2},\partial _{x^3}\}\) are coordinate vector fields. The bound (6.14) also holds for all covariant derivatives of h along b-vector fields on \({}^mM\). In particular, by Lemma 3.11, \(|g-\underline{g}{}|\le C_\eta (1+t+r)^{-1+\eta }\), \(\eta >0\). The Riemann curvature tensor also decays to 0 as \(t+r\rightarrow \infty \), with the decay rate depending on the component: this follows from an inspection of the expressions in §A.2. Note however that the components in the frame (2.23) have no geometric meaning away from \(\mathscr {I}^+\). Geometric and more precise decay statements were obtained by Klainerman–Nicolò [66].

Remark 6.6

If the ADM mass m of the initial data is large, there does not exist a metric with the mass m Schwarzschild behavior near \(\mathscr {I}^+\) but Minkowski-like far from \(I^0\cup \mathscr {I}^+\) which is sufficiently close to being Ricci flat for an application of a small data nonlinear iteration scheme like Nash–Moser: this follows from work of Christodoulou [25], Klainerman–Rodnianski and Luk [65, 68], An–Luk [5], and (for the noncharacteristic problem) Li–Yu [83]. On the other hand, for arbitrary m, but without the smallness condition (1.6) on the data, one does obtain small data by restricting to the complement of a sufficiently large ball. Working on a suitable submanifold of \({}^mM\), defined near \(I^0\cap \mathscr {I}^+\) by \(\rho _0<\epsilon +\rho _I^\beta \) for \(\beta \in (0,b_0)\) and \(\epsilon >0\) sufficiently small, cf. (4.15), our method of proof then ensures the existence of a vacuum solution on this submanifold; in particular, the solution includes a piece of null infinity.

We can also solve towards the past: Lemma 6.2 produces a solution g of Einstein’s equation in the gauge \(\Upsilon (g;g_m)=0\) in a full neighborhood of \(\{t=0\}\), and we can then use the time-reversed analogue of Theorem 6.3 for solving backwards in time, obtaining a global solution g on \(\mathbb {R}^4\). Note here that by construction, the background metric \(g_m\) is invariant under the time reversal map \(\iota :t\mapsto -t\) on \(\mathbb {R}^4\), hence the gauge conditions of the future and past solutions match. To describe the behavior of g on a compact space, as illustrated in Figure 1, let us denote by \({}_m\overline{\mathbb {R}^4}\) the compactification defined like \({}^m\overline{\mathbb {R}^4}\) in §2.1 but with t replaced by \(-t\) everywhere. Thus, \(\iota \) induces diffeomorphisms \({}^m\overline{\mathbb {R}^4}\cong {}_m\overline{\mathbb {R}^4}\); denote by \(S^-\) the image of \(S^+\). The identity map on \(\mathbb {R}^4\) induces an identification of the interiors of \({}^m\overline{\mathbb {R}^4}\) and \({}_m\overline{\mathbb {R}^4}\) which extends to be polyhomogeneous of class \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\) on the maximal domain of existence by a simple variant of Lemma 2.10. We then define the compact topological space \({}^m_m\overline{\mathbb {R}^4}\) to be the union of \({}^m\overline{\mathbb {R}^4}\) and \({}_m\overline{\mathbb {R}^4}\) quotiented out by this identification; this is thus a manifold of class \(\mathcal {A}_{\text {phg}}^{\mathcal {E}_{\mathrm{log}}}\), and in fact of class \(\mathcal {C}^\infty \) away from \(\partial {}^m\overline{\mathbb {R}^4}\cap \partial {}_m\overline{\mathbb {R}^4}\), hence in particular near \(S^\pm \) as well as near \({}^m\beta ({}^m I^+)\) and its image under \(\iota \). Define the blown-up space

$$\begin{aligned} {}^m_m M := [{}^m_m\overline{\mathbb {R}^4};S^+,S^-], \end{aligned}$$

i.e. blow up both \(S^+\) and \(S^-\); these are closed and disjoint submanifolds, hence the order of blow-up does not matter. Then \({}^m_m M\) is a polyhomogeneous manifold, covered by the two smooth manifolds \({}^mM'\) and \({}_mM':=[{}_m\overline{\mathbb {R}^4};S^-]\), and with interior naturally diffeomorphic to \(\mathbb {R}^4_{t,x}\). We denote its boundary hypersurfaces by \(\mathscr {I}^\pm \) and \(i^\pm \) in the obvious manner, see Figure 1, and \(I^0\) is the closure of the remaining part of the boundary. In view of the isomorphism (2.40), weighted b-Sobolev spaces on \({}^m_mM\) are well-defined. For future use, we also note that polyhomogeneity at \(I^0\) with index set \(\mathcal {E}_0\) is well-defined provided

$$\begin{aligned} \mathcal {E}_0+\mathcal {E}_{\mathrm{log}}=\mathcal {E}_0, \end{aligned}$$
(6.15)

as follows from (2.41); note that given any index set \(\mathcal {E}_0^0\), the index set \(\mathcal {E}_0:=\mathcal {E}_0^0+\mathcal {E}_{\mathrm{log}}\) satisfies (6.15) (and is the smallest such index set which contains \(\mathcal {E}_0^0\)) since \(\mathcal {E}_{\mathrm{log}}+\mathcal {E}_{\mathrm{log}}=\mathcal {E}_{\mathrm{log}}\).
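The absorption property (6.15) can be checked on truncated index sets. The sketch below is an illustration under explicit assumptions: we take \(\mathcal {E}_{\mathrm{log}}=\bigcup _{j\in \mathbb {N}_0}(-i j,j)\), which is the form consistent with the identity \(\mathcal {E}_0^0+\mathcal {E}_{\mathrm{log}}=\bigcup _{j\in \mathbb {N}}(-i j,j-1)\) stated for \(\mathcal {E}_0^0=-i\) in Example 7.3 (the definition itself is not restated here), and we encode an index set with exponents in \(-i\mathbb {N}_0\) as a dictionary `{j: k}`, meaning that \((-i j,l)\) belongs to it for \(0\le l\le k\). The helper `index_sum` is hypothetical.

```python
def index_sum(E, F, J):
    """Sum of two index sets, truncated to exponents -i*j with j <= J.

    The sum adds exponents and logarithmic powers; since level j of the sum
    only involves levels <= j of the summands, the truncation is exact.
    """
    out = {}
    for j1, k1 in E.items():
        for j2, k2 in F.items():
            j = j1 + j2
            if j <= J:
                out[j] = max(out.get(j, -1), k1 + k2)
    return out


J = 8
E_log = {j: j for j in range(J + 1)}    # assumed form of E_log
E00 = {j: 0 for j in range(1, J + 1)}   # the index set "-i" of Example 7.3
E0 = index_sum(E00, E_log, J)           # E_0 := E_0^0 + E_log

print(index_sum(E_log, E_log, J) == E_log)  # E_log + E_log = E_log
print(index_sum(E0, E_log, J) == E0)        # (6.15) for E_0
```

Under these assumptions both comparisons evaluate to `True`, and `E0` equals `{j: j - 1}` for \(1\le j\le J\), matching the index set \(\bigcup _{j\in \mathbb {N}}(-i j,j-1)\) of Example 7.3 up to the truncation level.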

It is useful to describe \({}^m_mM\) as the union of three (overlapping) smooth manifolds, namely \({}^mM\), \({}_m M:=\iota {}^mM\), and the set U defined in (6.7). We can then define the function space

$$\begin{aligned} \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}_{\mathrm{global}} \end{aligned}$$

to consist of all distributions on \(\mathbb {R}^4\) which lie in \(\rho _0^{b_0}H_{{\text {b}}}^\infty \) on U, and such that their restriction as well as the restriction of their pullback by \(\iota \) to \({}^mM\) lie in \(\mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\).

Theorem 6.7

Given initial data \(\gamma ,k\) as in Lemma 6.2, there exists a global solution g of the Einstein vacuum equation \(\text {Ric}(g)=0\), attaining the data \(\gamma ,k\) at \(\{t=0\}\) and satisfying the gauge condition \(\Upsilon (g)=0\), which is of the form \(g=g_m+\rho h\) with \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}_{\mathrm{global}}\) for all \(b_I<b'_I<b_0\) and \(b_+<0\).

7 Polyhomogeneity

We state and prove a precise version of the polyhomogeneity statement, made in Theorem 1.1, about the solution of the initial value problem which we constructed in §6. We use the shorthand notations (2.32) and (2.35).

Theorem 7.1

Let \(b_0>0\), and let \(\mathcal {E}^0_0\subset \mathbb {C}\times \mathbb {N}_0\) be an index set with \({\text {Im}}\mathcal {E}^0_0<-b_0\). Suppose \(\gamma ,k\in \mathcal {C}^\infty (\mathbb {R}^3;S^2 T^*\mathbb {R}^3)\) are initial data such that \(m\in \mathbb {R}\), \(\widetilde{\gamma }\), defined in (6.5), and k satisfy the smallness condition (6.6), for N large and \(\delta >0\) small (see Footnote 39). Assume moreover that the initial data are polyhomogeneous (namely, \(\mathcal {E}_0^0\)-smooth):

$$\begin{aligned} \rho _0^{-1}\widetilde{\gamma },\,\rho _0^{-2}k \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0^0}(\overline{\mathbb {R}^3};S^2\,{}^{{\text {sc}}}T^*\overline{\mathbb {R}^3}). \end{aligned}$$
(7.1)

Let h denote the global solution of \(\text {Ric}(g)=0\), \(g=g_m+\rho h\), in M, satisfying the gauge condition \(\Upsilon (g;g_m)=0\). Then h is polyhomogeneous on M. More precisely, h is \(\mathcal {E}\)-smooth, \(\mathcal {E}=(\mathcal {E}_0,\mathcal {E}_I,\mathcal {E}_+)\):

$$\begin{aligned} h\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\mathcal {E}_I,\mathcal {E}_+}, \end{aligned}$$

with the refinements \(\pi _{1 1}^c h\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\bar{\mathcal {E}}_I,\mathcal {E}_+}\) and \(\pi _0 h \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\mathcal {E}'_I,\mathcal {E}_+}\) near \(\mathscr {I}^+\), where the index sets are the smallest ones satisfying (see Footnote 40)

$$\begin{aligned} \mathcal {E}_0\supset \mathcal {E}_0^0+\mathcal {E}'_{\mathrm{log}},\ \ \mathcal {E}_0\supset j(\mathcal {E}_0-i)+i\ \ \forall \,j\in \mathbb {N}\end{aligned}$$
(7.2a)

at \(I^0\), with \(\mathcal {E}'_{\mathrm{log}}\) defined in (2.36), while at \(\mathscr {I}^+\),

$$\begin{aligned} \mathcal {E}'_I&\supset \mathcal {E}_0\overline{\cup }(2\mathcal {E}_I-i) \end{aligned}$$
(7.2b)
$$\begin{aligned} \bar{\mathcal {E}}_I&\supset 0\cup \bigl (\mathcal {E}_0\overline{\cup }\bigl ((\bar{\mathcal {E}}_I+\mathcal {E}'_I)\cup (2\mathcal {E}_I-i)\bigr )\bigr ), \end{aligned}$$
(7.2c)
$$\begin{aligned} \mathcal {E}_I&\supset 0\overline{\cup }\mathcal {E}_0\overline{\cup }\bigl ((\mathcal {E}_I+\mathcal {E}'_I)\cup (2\bar{\mathcal {E}}_I)\bigr ), \end{aligned}$$
(7.2d)
$$\begin{aligned} \mathcal {E}_I&\supset j(\mathcal {E}_I-i)+i\ \ \forall \,j\in \mathbb {N}, \end{aligned}$$
(7.2e)

and finally at \(I^+\),

$$\begin{aligned} \mathcal {E}_+ \supset (-i\overline{\cup }0) \cup \bigl ((\mathcal {E}_+-i)\overline{\cup }-i\overline{\cup }(\mathcal {E}_I\setminus \{(0,1)\})\bigr ). \end{aligned}$$
(7.2f)

At \(I^0\), we only need to capture the index set arising from nonlinear terms in Einstein’s equation since the background metric \(g_m\) solves \(\rho ^{-3}\text {Ric}(g_m)=0\) identically near \(I^0\); the addition of the index set \(\mathcal {E}_{\mathrm{log}}\) arises when pushing the solution near \(\{t=0\}\subset {}^0\overline{\mathbb {R}^4}\) forward to \({}^mM\); see (6.15). We point out that the index sets we obtain are very likely to be nonoptimal due to our rather coarse analysis of nonlinear interactions.

Example 7.2

For data which are Schwarzschildean modulo Schwartz functions, i.e. \(\mathcal {E}_0^0=\emptyset \), the above gives \(\mathcal {E}_0=\emptyset \) and

$$\begin{aligned}&\mathcal {E}_I = \bigcup _{j\in \mathbb {N}_0} (-i j,3 j+1),\ \ \bar{\mathcal {E}}_I = 0 \cup \mathcal {E}'_I,\ \ \mathcal {E}'_I = \bigcup _{j\in \mathbb {N}} (-i j,3 j-1),\\&\mathcal {E}_+ = \bigcup _{j\in \mathbb {N}_0} \bigl (-i j,\tfrac{3}{2} j(j+3)\bigr ). \end{aligned}$$

Recalling the notation \(\log ^{\le k}\) introduced around (1.38), this gives, schematically, leading terms \(\pi _{1 1}h\sim \log ^{\le 1}\rho _I + \rho _I\log ^{\le 4}\rho _I\), \(\pi _{1 1}^c h\sim 1 + \rho _I\log ^{\le 2}\rho _I\), \(\pi _0 h\sim \rho _I\log ^{\le 2}\rho _I\) at \(\mathscr {I}^+\) (near the interior of which one can take \(\rho _I=r^{-1}\)), and \(h\sim 1+\rho _+\log ^{\le 6}\rho _+\) at \(I^+\) (near the interior of which one can take \(\rho _+=t^{-1}\)).

Example 7.3

Consider \(\mathcal {E}_0^0=-i\): this corresponds to initial data which have a full Taylor expansion in 1/r at infinity, beginning with \(\mathcal {O}(r^{-2})\) perturbations of the Schwarzschild metric. In this case, we get many additional logarithmic terms from \(\mathcal {E}_0=\mathcal {E}_0^0+\mathcal {E}_{\mathrm{log}}=\bigcup _{j\in \mathbb {N}}(-i j,j-1)\), namely

$$\begin{aligned} \mathcal {E}_I= & {} \bigcup _{j\in \mathbb {N}_0} \bigl (-i j,\tfrac{1}{2}j(3 j+7)+1\bigr ),\ \ \bar{\mathcal {E}}_I = 0 \cup \bigcup _{j\in \mathbb {N}} \bigl (-i j,\tfrac{1}{2}j(3 j+5)\bigr ), \\ \mathcal {E}'_I= & {} \bigcup _{j\in \mathbb {N}} \bigl (-i j,\tfrac{1}{2}j(3 j+3)\bigr ), \ \ \mathcal {E}_+ = \bigcup _{j\in \mathbb {N}_0} \bigl (-i j,\tfrac{1}{2}j(j^2+5 j+10)\bigr ), \end{aligned}$$

so \(\pi _{1 1}h\sim \log ^{\le 1}\rho _I+\rho _I\log ^{\le 6}\rho _I\), \(\pi _{1 1}^c h\sim 1+\rho _I\log ^{\le 4}\rho _I\), \(\pi _0 h\sim \rho _I\log ^{\le 3}\rho _I\) at \(\mathscr {I}^+\), and \(h\sim 1 +\rho _+\log ^{\le 8}\rho _+\) at \(I^+\).

Remark 7.4

Let us consider the index set \(\mathcal {E}_0^0=-i\) again. As indicated above, the addition of \(\mathcal {E}'_{\mathrm{log}}\) in (7.2a) is only due to an inconvenient choice of initial surface which produces logarithmic terms when passing from \({}^0\overline{\mathbb {R}^4}\) (of which the initial surface in Theorem 7.1 is a smooth submanifold) to \({}^m\overline{\mathbb {R}^4}\). If instead one is given the ADM mass m and initial data \((\gamma ,k)\) on \({}^m\Sigma \), with \((\gamma ,k)\) close to the data induced by \(g_m\) on \({}^m\Sigma \) (measured in \(\rho _0^{b_0}H_{{\text {b}}}^N({}^m\Sigma ;S^2\,{}^{{\text {sc}}}T^*\,{}^m\Sigma )\) for suitable N), then the index set at \(I^0\) can be defined as in (7.2a) but without \(\mathcal {E}'_{\mathrm{log}}\). Correspondingly, the index sets at the other boundary faces have fewer logarithms:

$$\begin{aligned} \mathcal {E}_I= & {} \bigcup _{j\in \mathbb {N}_0} \bigl (-i j,5 j+1\bigr ),\ \ \bar{\mathcal {E}}_I = 0 \cup \bigcup _{j\in \mathbb {N}} \bigl (-i j,5 j-1\bigr ), \\ \mathcal {E}'_I= & {} \bigcup _{j\in \mathbb {N}} \bigl (-i j,5 j-2\bigr ), \ \ \mathcal {E}_+ = \bigcup _{j\in \mathbb {N}_0} \bigl (-i j,\tfrac{1}{2}j(5 j+11)\bigr ), \end{aligned}$$

so \(\pi _{1 1}h\sim \log ^{\le 1}\rho _I+\rho _I\log ^{\le 6}\rho _I\), \(\pi _{1 1}^c h\sim 1+\rho _I\log ^{\le 4}\rho _I\), \(\pi _0 h\sim \rho _I\log ^{\le 3}\rho _I\) at \(\mathscr {I}^+\), and \(h\sim 1 +\rho _+\log ^{\le 8}\rho _+\) at \(I^+\). (The exponents in subsequent terms of the expansion are smaller than in Example 7.3).
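The logarithmic powers quoted in Examples 7.2 and 7.3 and in this remark can be read off mechanically from the displayed index sets: a pair \((-i j,k)\) allows terms \(\rho ^j\log ^l\rho \) with \(l\le k\). The following Python sketch tabulates the maximal power k for \(j\ge 1\); the dictionary names and the helper `log_powers` are ours, and the formulas are copied from the displays above without any further input.

```python
# Maximal power of log(rho) accompanying rho^j (j >= 1) at each boundary face.
EXAMPLE_7_2 = {
    "E_I": lambda j: 3 * j + 1,
    "Ebar_I": lambda j: 3 * j - 1,
    "E'_I": lambda j: 3 * j - 1,
    "E_+": lambda j: 3 * j * (j + 3) // 2,
}
EXAMPLE_7_3 = {
    "E_I": lambda j: j * (3 * j + 7) // 2 + 1,
    "Ebar_I": lambda j: j * (3 * j + 5) // 2,
    "E'_I": lambda j: j * (3 * j + 3) // 2,
    "E_+": lambda j: j * (j * j + 5 * j + 10) // 2,
}
REMARK_7_4 = {
    "E_I": lambda j: 5 * j + 1,
    "Ebar_I": lambda j: 5 * j - 1,
    "E'_I": lambda j: 5 * j - 2,
    "E_+": lambda j: j * (5 * j + 11) // 2,
}


def log_powers(index_sets, j):
    """Return the maximal log power at exponent -i*j for each index set."""
    return {name: power(j) for name, power in index_sets.items()}


for j in (1, 2, 3):
    print(j, log_powers(EXAMPLE_7_3, j), log_powers(REMARK_7_4, j))
```

At \(j=1\) both Example 7.3 and this remark give the powers 6, 4, 3, 8 for \(\mathcal {E}_I\), \(\bar{\mathcal {E}}_I\), \(\mathcal {E}'_I\), \(\mathcal {E}_+\), while for \(j=2\) the values are 14, 11, 9, 24 versus 11, 9, 8, 21, which makes the parenthetical comment above concrete. (The integer divisions are exact, since each product being halved is even.)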

The proof of Theorem 7.1 is straightforward but requires some bookkeeping: we will peel off the polyhomogeneous expansion at the various boundary faces iteratively, writing the nonlinear equation \(P(h)=0\) as a linear equation plus error terms with better decay, much like in §5. As a preparation, we prove a few lemmas for ODEs which were already used in §5:

Lemma 7.5

Let \(X:=[0,\infty )_\rho \), \(u\in \rho ^{-\infty }H_{{\text {b}}}^\infty (X)\), \({\text {supp}}u\subset [0,1]\), and \(f:=\rho D_\rho u\). Then:

  (1) \(f\in \rho ^aH_{{\text {b}}}^\infty (X)\), \(a<0\) \(\Rightarrow \) \(u\in \rho ^aH_{{\text {b}}}^\infty (X)\);

  (2) \(f\in \rho ^aH_{{\text {b}}}^\infty (X)\), \(a>0\) \(\Rightarrow \) \(u\in \mathcal {A}_{\text {phg}}^0(X)+\rho ^aH_{{\text {b}}}^\infty (X)\);

  (3) \(f\in \mathcal {A}_{\text {phg}}^\mathcal {E}(X)\), \(\mathcal {E}\) any index set \(\Rightarrow \) \(u\in \mathcal {A}_{\text {phg}}^{\mathcal {E}\overline{\cup }0}(X)\); if \((0,0)\notin \mathcal {E}\), then \(u\in \mathcal {A}_{\text {phg}}^{\mathcal {E}\cup 0}(X)\).

Proof

This follows immediately from the characterization of b-Sobolev and polyhomogeneous spaces using the Mellin transform [87, §4]. Alternatively, one can explicitly construct the unique solution of \(\rho D_\rho u=f\) with support in \(\rho \le 1\): part (1) follows easily from \(u=-i\int _\rho ^1 f\,\frac{d\rho }{\rho }\), while for part (2), \(u=-i\int _0^1 f\,\frac{d\rho }{\rho }+i\int _0^\rho f\,\frac{d\rho }{\rho }\) gives the decomposition into constant and remainder term. The appearance of the extended union in (3) is due to the fact that while \(\rho D_\rho u=\rho ^{i z}(\log \rho )^k\), \(k\in \mathbb {N}_0\), is solved to leading order by \(u=\tfrac{1}{z}\rho ^{i z}(\log \rho )^k\) for \(z\ne 0\), we need an extra logarithmic term for \(z=0\), as \(\rho D_\rho (\tfrac{1}{k+1}(\log \rho )^{k+1} a)=-i(\log \rho )^k a\) plus lower order terms. \(\square \)
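The two model computations in the last sentence of the proof can be confirmed numerically. The sketch below assumes the convention \(D_\rho =-i\partial _\rho \), which is the one consistent with the identity \(\rho D_\rho (\tfrac{1}{k+1}(\log \rho )^{k+1})=-i(\log \rho )^k\) used above, and approximates derivatives by central differences; all function names are ours. For \(z\ne 0\) it checks that \(u=\tfrac{1}{z}\rho ^{i z}(\log \rho )^k\) solves the equation up to the lower order term \(-\tfrac{i k}{z}\rho ^{i z}(\log \rho )^{k-1}\), and for \(z=0\) that \(u=\tfrac{i}{k+1}(\log \rho )^{k+1}\) solves \(\rho D_\rho u=(\log \rho )^k\) exactly.

```python
import cmath


def rho_D_rho(u, rho, step=1e-6):
    """Apply rho*D_rho, with D_rho = -i d/d(rho), via a central difference."""
    derivative = (u(rho + step) - u(rho - step)) / (2 * step)
    return -1j * rho * derivative


def power_log(z, k):
    """Return the function rho -> rho^{iz} (log rho)^k for rho > 0."""
    def u(rho):
        log_rho = cmath.log(rho)
        return cmath.exp(1j * z * log_rho) * log_rho ** k
    return u


def residual_nonzero(z, k, rho):
    """rho*D_rho(u) - f for u = f / z and f = rho^{iz} (log rho)^k, z != 0."""
    f = power_log(z, k)
    return rho_D_rho(lambda r: f(r) / z, rho) - f(rho)


def residual_zero(k, rho):
    """rho*D_rho(u) - (log rho)^k for u = i (log rho)^{k+1} / (k+1)."""
    u = lambda r: 1j * power_log(0, k + 1)(r) / (k + 1)
    return rho_D_rho(u, rho) - power_log(0, k)(rho)


z, k, rho = 0.7 - 0.4j, 2, 0.3
lower_order = (-1j * k / z) * power_log(z, k - 1)(rho)
print(abs(residual_nonzero(z, k, rho) - lower_order))  # small, finite-difference error only
print(abs(residual_zero(k, rho)))                      # small, finite-difference error only
```

The extra factor of \(\log \rho \) in the case \(z=0\) is precisely what the extended union in part (3) records.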

Adding more dimensions is straightforward:

Lemma 7.6

Let \(X=[0,\infty )_{\rho _1}\times [0,\infty )_{\rho _2}\times \mathbb {R}^n_\omega \), \(U=\{\rho _1<1,\,\rho _2<1\}\subset X\), \(\rho =\rho _1\rho _2\), and let \(\mathcal {E}_1,\mathcal {E}_2\) denote two index sets. Suppose \(u\in \rho ^{-\infty }H_{{\text {b}}}^\infty (X)\) has support in U, and let \(f:=\rho _1 D_{\rho _1}u\). Then:

  (1) \(f\in \rho _1^{a_1}\rho _2^{a_2}H_{{\text {b}}}^\infty (X)\), \(a_1\ne 0\) \(\Rightarrow \) \(u\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{0,a_2}(X)+\rho _1^{a_1}\rho _2^{a_2}H_{{\text {b}}}^\infty (X)\);

  (2) \(f\in \mathcal {A}_{{\text {b}},{\text {phg}}}^{a_1,\mathcal {E}_2}(X)\), \(a_1\ne 0\) \(\Rightarrow \) \(u\in \mathcal {A}_{\text {phg}}^{0,\mathcal {E}_2}(X)+\mathcal {A}_{{\text {b}},{\text {phg}}}^{a_1,\mathcal {E}_2}(X)\);

  (3) \(f\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1,\mathcal {E}_2}(X)\) \(\Rightarrow \) \(u\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1\overline{\cup }0,\mathcal {E}_2}(X)\); if \((0,0)\notin \mathcal {E}_1\), then \(u\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1\cup 0,\mathcal {E}_2}(X)\).

Lemma 7.7

In the notation of Lemma 7.6, with \(u\in \rho ^{-\infty }H_{{\text {b}}}^\infty (X)\) supported in \(\rho _1\le 1\), let \(f:=(\rho _1 D_{\rho _1}-\rho _2 D_{\rho _2})u\). Let \(\chi =\chi (\rho _1,\rho _2)\in \mathcal {C}^\infty _{\text {c}}([0,1)^2)\) denote a localizer, identically 1 in a neighborhood of the corner \(\rho _1=\rho _2=0\). See Figure 14. Then:

  (1) \(f\in \rho _1^{b_1}\rho _2^{b_2}H_{{\text {b}}}^\infty (X)\), \(b_2>b_1\) \(\Rightarrow \) \(\chi u\in \rho _1^{b_1}\rho _2^{b_2}H_{{\text {b}}}^\infty (X)\);

  (2) \(f\in \mathcal {A}_{{\text {b}},{\text {phg}}}^{b_1,\mathcal {E}_2}(X)\), \({\text {Im}}z\ne -b_1\) whenever \((z,0)\in \mathcal {E}_2\) \(\Rightarrow \) \(\chi u\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_2,\mathcal {E}_2}(X)+\mathcal {A}_{{\text {b}},{\text {phg}}}^{b_1,\mathcal {E}_2}(X)\);

  (3) \(f\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1,\mathcal {E}_2}(X)\) \(\Rightarrow \) \(\chi u\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1\overline{\cup }\mathcal {E}_2,\mathcal {E}_2}(X)\).

Fig. 14 Illustration of Lemma 7.7, which describes solutions of the transport equation along the vector field \(-\rho _1\partial _{\rho _1}+\rho _2\partial _{\rho _2}\); one integral curve of this vector field is shown here

Proof

We drop the \(\mathbb {R}^n_\omega \) factor from the notation for brevity. For (1), write \(u(\rho _1,\rho _2)=-i\int _{\rho _1}^1 f(t^{-1}\rho _1,t\rho _2)\,\frac{d t}{t}\) and \(f=\rho _1^{b_1}\rho _2^{b_2}\widetilde{f}\) with \(\widetilde{f}\in H_{{\text {b}}}^\infty \); then, for \(0<\epsilon <b_2-b_1\), Minkowski's integral inequality followed by the Cauchy–Schwarz inequality gives

$$\begin{aligned} \Vert \chi u\Vert _{\rho _1^{b_1}\rho _2^{b_2}L^2_{\text {b}}}^2&\le \int _0^1\int _0^1 \biggl |\int _{\rho _1}^1 t^{b_2-b_1}\widetilde{f}(t^{-1}\rho _1, t\rho _2)\,\frac{d t}{t}\biggr |^2 \frac{d\rho _1}{\rho _1}\frac{d\rho _2}{\rho _2} \\&\le \int _0^1 \biggl (\int _{\rho _1}^1 t^{b_2-b_1} \Bigl (\int _0^1 |\widetilde{f}(t^{-1}\rho _1,x_2)|^2\,\frac{d x_2}{x_2}\Bigr )^{1/2}\frac{d t}{t}\biggr )^2 \frac{d\rho _1}{\rho _1} \\&\le \biggl (\int _0^1 t^{2(b_2-b_1-\epsilon )}\,\frac{d t}{t}\biggr )\cdot \int _0^1\int _0^1\int _0^1 t^{2\epsilon }|\widetilde{f}(x_1,x_2)|^2\,\frac{d x_2}{x_2}\frac{d t}{t}\frac{d x_1}{x_1} \\&\le C\Vert f\Vert _{\rho _1^{b_1}\rho _2^{b_2}L^2_{\text {b}}}^2, \end{aligned}$$

as desired; higher b-regularity follows by commuting \(\rho _j D_{\rho _j}\) through the equation for u (see Footnote 41).
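To make the role of the condition \(b_2>b_1\) in part (1) concrete, consider the model right-hand side \(f=\rho _1^{a}\rho _2^{b}\) with real \(a\ne b\) (this choice, and all names in the code, are ours; we ignore the support condition and again assume \(D=-i\partial \)). The integral formula for u then evaluates in closed form to \(u=-i\rho _1^a\rho _2^b\,\tfrac{1-\rho _1^{b-a}}{b-a}\). The Python sketch below compares this with a midpoint-rule evaluation of the integral and checks the transport equation by central differences.

```python
import math


def f(rho1, rho2, a, b):
    """Model right-hand side f = rho1^a * rho2^b."""
    return rho1**a * rho2**b


def model_solution(rho1, rho2, a, b):
    """Closed form of u = -i * int_{rho1}^1 f(rho1/t, t*rho2) dt/t, a != b."""
    return -1j * rho1**a * rho2**b * (1 - rho1 ** (b - a)) / (b - a)


def quadrature_solution(rho1, rho2, a, b, n=2000):
    """Midpoint rule in s = log t (so dt/t = ds) for the same integral."""
    s0 = math.log(rho1)
    ds = -s0 / n
    total = 0.0
    for i in range(n):
        t = math.exp(s0 + (i + 0.5) * ds)
        total += f(rho1 / t, t * rho2, a, b) * ds
    return -1j * total


def transport(u, rho1, rho2, step=1e-6):
    """Apply rho1*D_{rho1} - rho2*D_{rho2} by central differences."""
    d1 = (u(rho1 + step, rho2) - u(rho1 - step, rho2)) / (2 * step)
    d2 = (u(rho1, rho2 + step) - u(rho1, rho2 - step)) / (2 * step)
    return -1j * (rho1 * d1 - rho2 * d2)


a, b, r1, r2 = 0.5, 1.5, 0.2, 0.3
u = lambda x, y: model_solution(x, y, a, b)
print(abs(u(r1, r2) - quadrature_solution(r1, r2, a, b)))  # quadrature error
print(abs(transport(u, r1, r2) - f(r1, r2, a, b)))         # finite-difference error
```

In this model, \(|u|\le \tfrac{1}{b-a}\rho _1^a\rho _2^b\) when \(b>a\), so u inherits the weights of f, whereas for \(b<a\) the term \(\rho _1^b\rho _2^b\) dominates as \(\rho _1\rightarrow 0\) and the weight at \(\rho _1=0\) is lost.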

For the proof of (2), it suffices to consider a single term

$$\begin{aligned} f_k=\rho _2^{i z}(\log \rho _2)^k a_k(\rho _1), \end{aligned}$$
(7.3)

with \(a_k\in \rho _1^{b_1}H_{{\text {b}}}^\infty (H_1)\) supported in \(\rho _1\le 1\). Let \(u_k=\rho _2^{i z}(\log \rho _2)^k b_k(\rho _1)\), where \(b_k=b_k(\rho _1)\) solves

$$\begin{aligned} (\rho _1 D_{\rho _1}-z)b_k=a_k \end{aligned}$$
(7.4)

and is supported in \(\rho _1\le 1\), then the error term

$$\begin{aligned} f_{k-1}&:= (\rho _1 D_{\rho _1}-\rho _2 D_{\rho _2})u_k-f_k = \bigl ((\rho _1 D_{\rho _1}-z)-(\rho _2 D_{\rho _2}-z)\bigr )u_k-f_k \\&= \rho _2^{i z}(\log \rho _2)^{k-1}a_{k-1}(\rho _1),\quad a_{k-1}:=i k b_k, \end{aligned}$$

is one power of \(\log \rho _2\) better than \(f_k\). Rewriting equation (7.4) as \(\rho _1 D_{\rho _1}(\rho _1^{-i z}b_k)=\rho _1^{-i z}a_k\in \rho _1^{b_1+{\text {Im}}z}H_{{\text {b}}}^\infty (H_1)\), we can use Lemma 7.5 to obtain \(b_k\in \mathcal {A}_{\text {phg}}^z(H_1)+\rho _1^{b_1}H_{{\text {b}}}^\infty (H_1)\); therefore \(u_k\in \mathcal {A}_{\text {phg}}^{z,(z,k)}(X)+\mathcal {A}_{{\text {b}},{\text {phg}}}^{b_1,(z,k)}(X)\). Proceeding iteratively, we next solve \((\rho _1 D_{\rho _1}-\rho _2 D_{\rho _2})u_{k-1}=f_{k-1}\) to leading order, etc., reducing k by 1 at each step, and picking up one extra power of \(\log \rho _1\) at each stage by Lemma 7.5(3) (conjugated by \(\rho ^{i z}\)). We obtain \(u=\sum _{j=0}^k u_j\in \mathcal {A}_{\text {phg}}^{(z,k),(z,k)}(X)+\mathcal {A}_{{\text {b}},{\text {phg}}}^{b_1,(z,k)}(X)\).

The proof of (3) proceeds in the same manner: if \(f_k\) is of the form (7.3), now with \(a_k\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1}(H_1)\), then \(b_k\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1\overline{\cup }z}(H_1)\), so \(u_k\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1\overline{\cup }z,(z,k)}(X)\) and \(f_{k-1}\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1\overline{\cup }z,(z,k-1)}(X)\). Iterating as before gives \(u\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_1\overline{\cup }(z,k),(z,k)}(X)\). \(\square \)

Proof of Theorem 7.1

We shall first prove that if the Cauchy data \((h_0,h_1)\) in the notation of Theorem 6.3 are polyhomogeneous at \(\partial {}^m\Sigma \),

$$\begin{aligned} h_0,\,h_1 \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0}({}^m\Sigma ), \end{aligned}$$
(7.5)

then the conclusion of Theorem 7.1 holds. Now, by Theorem 6.3, we have \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\) for all \(b_I<b'_I<b_0\) and \(b_+<0\). Note that since the gauge condition \(\Upsilon (g)=0\) is satisfied identically, h solves \(\text {Ric}(g)-\widetilde{\delta }{}^*\Upsilon (g)=0\) for any choice of \(\widetilde{\delta }{}^*\); this will be useful as it will allow us to work with simpler normal operator models.

For now, consider h as a solution of \(P(h)=0\) for \(\gamma >b'_I\) as in Theorem 4.2. We write

$$\begin{aligned} 0 = P(h) = p_0 + \int _0^1 L_{t h}(h)\,d t,\ \ p_0:=P(0)\in \mathcal {A}_{\text {phg}}^{\emptyset ,\emptyset ,0}. \end{aligned}$$
(7.6)

(In fact, \({\text {supp}}p_0\cap (I^0\cup \mathscr {I}^+)=\emptyset \) since \(g_m\) is the Schwarzschild metric near \(I^0\cap \mathscr {I}^+\)).

Let us first work near \(I^0\), away from \(I^+\). Suppose that for some \(c\ge b_0\), we already have \(h\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,-0}+\rho _0^c\rho _I^{-0}H_{{\text {b}}}^\infty \), \(\pi _0 h\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,b'_I-0}+\rho _0^c\rho _I^{b'_I-0}H_{{\text {b}}}^\infty \), with the exponents referring to the behavior at \(I^0\) and \(\mathscr {I}^+\), respectively. Then

$$\begin{aligned} L_0 h = -p_0 + \int _0^1(L_0-L_{t h})(h)\,d t; \end{aligned}$$
(7.7)

we have \(L_0-L_{t h}\in (\mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0-i,-1-0}+\rho _0^{c+1}\rho _I^{-1-0}H_{{\text {b}}}^\infty )\text {Diff}_{\text {b}}^2\) by an inspection of the proof of Lemma 3.8, and it respects the improved behavior of \(\pi _0 h\), so we find

$$\begin{aligned} L_0 h&\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{2\mathcal {E}_0-i,-1-0}+\rho _0^{c+1}\rho _I^{-1-0}H_{{\text {b}}}^\infty , \\ \pi _0 L_0 h&\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{2\mathcal {E}_0-i,-1+b'_I-0}+\rho _0^{c+1}\rho _I^{-1+b'_I-0}H_{{\text {b}}}^\infty . \end{aligned}$$

Denote by \(\mathcal {E}_1:=\{(z,j)\in \mathcal {E}_0:{\text {Im}}z\ge -c\}\) the (finite) set of exponents already captured, and let \(\mathcal {E}_2:=\{(z,j)\in \mathcal {E}_0:-c-1\le {\text {Im}}z<-c\}\). Let

$$\begin{aligned} R_j := \prod _{(z,k)\in \mathcal {E}_j} (\rho _0 D_{\rho _0}-z),\ \ R=R_2\circ R_1. \end{aligned}$$

Let \(N(L_0)\in \rho _I^{-1}\text {Diff}_{\text {b}}^2(M)\) denote the normal operator of \(L_0\) at \(I^0\), i.e. freezing the coefficients of \(L_0\) at \(\rho _0=0\) for a fixed choice of a collar neighborhood \([0,\epsilon )_{\rho _0}\times I^0\) of \(I^0\); thus \(N(L_0)\) commutes with \(\rho _0\partial _{\rho _0}\), and \(L_0-N(L_0)\in \rho _0\rho _I^{-1}\text {Diff}_{\text {b}}^2\). Then \(R h\in \rho _0^c\rho _I^{-0}H_{{\text {b}}}^\infty \) solves the equation \(N(L_0)(R h)=f\), where

$$\begin{aligned} f := -R(L_0-N(L_0))h + R L_0 h \in \rho _0^{c+1}\rho _I^{-1-0}H_{{\text {b}}}^\infty ,\ \ \pi _0 f\in \rho _0^{c+1}\rho _I^{-1+b'_I-0}H_{{\text {b}}}^\infty , \end{aligned}$$

due to \(2\mathcal {E}_0-i\subset \mathcal {E}_0\); the Cauchy data of \(R h\) lie in \(\rho _0^{c+1}H_{{\text {b}}}^\infty \) due to the polyhomogeneity of \(h_0\) and \(h_1\). Since the background estimate near \(I^0\) is sharp with regard to the weight at \(I^0\) (see Propositions 4.3 and 4.8), this gives \(R h\in \rho _0^{c+1}\rho _I^{-0}H_{{\text {b}}}^\infty \), \(\pi _0 R h\in \rho _0^{c+1}\rho _I^{b'_I-0}H_{{\text {b}}}^\infty \). Thus, \(h\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,-0}+\rho _0^{c+1}\rho _I^{-0}H_{{\text {b}}}^\infty \), \(\pi _0 h\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,b'_I-0}+\rho _0^{c+1}\rho _I^{b'_I-0}H_{{\text {b}}}^\infty \). Iterating this gives

$$\begin{aligned} h\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,-0},\ \ \pi _0 h\in \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,b'_I-0}\ \ \text{ near }\ I^0. \end{aligned}$$
(7.8)

Following the structure of the argument in §5, we next prove the polyhomogeneity at \(\mathscr {I}^+\setminus (\mathscr {I}^+\cap I^+)\) using Lemmas 7.6 and 7.7. We now take \(\gamma =0\) in the definition of P and its linearization. Thus, let us work near \(I^0\cap \mathscr {I}^+\), and assume that we already have

$$\begin{aligned} \pi _0 h\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\mathcal {E}'_I} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c'_I-0},\ \ \pi _{1 1}^c h\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\bar{\mathcal {E}}_I} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c_I-0},\ \ \pi _{1 1} h\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\mathcal {E}_I} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c_I-0},\ \ \end{aligned}$$
(7.9)

for some \(0\le c_I<c'_I\le c_I+1\). Using (7.6) and the structure of \(L_{t h}=L_{t h}^0+\widetilde{L}_{t h}\), we find

$$\begin{aligned} \pi _{1 1}^c L_0^0\pi _{1 1}^c h = -\pi _{1 1}^c p_0 - \int _0^1 \bigl (\pi _{1 1}^c\widetilde{L}_{t h}\pi _{1 1}^c h + \pi _{1 1}^c L_{t h}\pi _0 h + \pi _{1 1}^c L_{t h}\pi _{1 1}h\bigr )\,d t. \end{aligned}$$
(7.10)

The proof of Lemma 3.8, condition (7.2e), and the fact that \(\mathcal {E}_I\supset \bar{\mathcal {E}}_I\supset \mathcal {E}'_I\supset \mathcal {E}_I-i\) give

$$\begin{aligned}&\widetilde{L}_{t h}\in (\mathcal {C}^\infty +\mathcal {A}_{\text {phg}}^{\mathcal {E}_0-i,\mathcal {E}_I}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0-i,c_I-0})\text {Diff}_{\text {b}}^2, \nonumber \\&\pi _{1 1}^c L_{t h}^0\pi _0 \in \rho _I^{-1}(\mathcal {C}^\infty +\mathcal {A}_{\text {phg}}^{\mathcal {E}_0-i,\bar{\mathcal {E}}_I}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0-i,c_I-0})\text {Diff}_{\text {b}}^1, \end{aligned}$$
(7.11)

and \(\pi _{1 1}^c L_{t h}^0\pi _{1 1}=0\). Multiplying (7.10) by \(\rho _I\), grouping function spaces in the order of the summands in the integrand above, and simplifying using \(2\mathcal {E}_0-i\subset \mathcal {E}_0\) and \(0\subset \mathcal {E}_I\), this gives

$$\begin{aligned} \rho _I\partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})\pi _{1 1}^c h \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\mathcal {E}_I+\bar{\mathcal {E}}_I-i} + \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\bar{\mathcal {E}}_I+\mathcal {E}'_I} + \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,2\mathcal {E}_I-i} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c'_I-0}; \end{aligned}$$

the first space is contained in the second. In view of condition (7.2c) (note that the index sets in parentheses there lie in \({\text {Im}}z<0\)), we obtain

$$\begin{aligned} \pi _{1 1}^c h \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\bar{\mathcal {E}}_I}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c'_I-0}, \end{aligned}$$
(7.12)

which improves on the a priori weight of the remainder term at \(\mathscr {I}^+\). Next,

$$\begin{aligned} \pi _{1 1}L_0^0\pi _{1 1}h = -\pi _{1 1}p_0 - \int _0^1 \bigl (\pi _{1 1}\widetilde{L}_{t h}\pi _{1 1}h + \pi _{1 1}L_{t h}\pi _0 h+\pi _{1 1}L_{t h}\pi _{1 1}^c h\bigr )\,d t. \end{aligned}$$

Lemma 3.8 and the membership (7.12) imply

$$\begin{aligned} \pi _{1 1}L_{t h}^0\pi _0&\in \rho _I^{-1}(\mathcal {C}^\infty +\mathcal {A}_{\text {phg}}^{\mathcal {E}_0-i,\mathcal {E}_I}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0-i,c_I-0})\text {Diff}_{\text {b}}^1, \\ \pi _{1 1}L_{t h}^0\pi _{1 1}^c&\in \rho _I^{-1}(\mathcal {A}_{\text {phg}}^{\mathcal {E}_0-i,\bar{\mathcal {E}}_I}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0-i,c'_I-0})\text {Diff}_{\text {b}}^1, \end{aligned}$$

with \(\rho _I\) times the latter having a leading order term at \(\mathscr {I}^+\), cf. the discussion of (5.3); together with (7.11) and (7.12), and using \(\bar{\mathcal {E}}_I\subset \mathcal {E}_I\), one finds

$$\begin{aligned} \rho _I\partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})\pi _{1 1}h \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,2\mathcal {E}_I-i} + \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\mathcal {E}_I+\mathcal {E}'_I} + \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,2\bar{\mathcal {E}}_I} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c'_I-0}, \end{aligned}$$

with the first space again contained in the second. Condition (7.2d) then gives

$$\begin{aligned} \pi _{1 1} h \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\mathcal {E}_I} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c'_I-0}. \end{aligned}$$
(7.13)

Lastly then, we can improve on the asymptotics of \(\pi _0 h\) at \(\mathscr {I}^+\) by writing

$$\begin{aligned} \pi _0 L_0^0\pi _0 h = -\pi _0 p_0 - \int _0^1\bigl (\pi _0\widetilde{L}_{t h}\pi _0 h+\pi _0 L_{t h}\pi _{1 1}^c h + \pi _0 L_{t h}\pi _{1 1}h\bigr )\,d t; \end{aligned}$$

now \(\pi _0 L_{t h}^0\pi _{1 1}^c=0=\pi _0 L_{t h}^0\pi _{1 1}\) and \(\mathcal {E}'_I\subset \bar{\mathcal {E}}_I\subset \mathcal {E}_I\), so, since \(\gamma =0\),

$$\begin{aligned} \rho _I\partial _{\rho _I}(\rho _0\partial _{\rho _0}-\rho _I\partial _{\rho _I})\pi _0 h \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,2\mathcal {E}_I-i} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c'_I+1-0}; \end{aligned}$$

but condition (7.2b) and Lemma 7.7 imply

$$\begin{aligned} \rho _I\partial _{\rho _I}\pi _0 h \in \mathcal {A}_{\text {phg}}^{\mathcal {E}_0,\mathcal {E}'_I} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_0,c'_I+1-0}; \end{aligned}$$

an application of Lemma 7.6 gives the same membership for \(\pi _0 h\), since we already know that \(\pi _0 h\) has no leading term at \(\mathscr {I}^+\). This establishes (7.9) for \((c_I,c'_I)\) replaced by \((c'_I,c'_I+1)\), and we can iterate the procedure to establish the full polyhomogeneity away from \(I^+\). Near \(\mathscr {I}^+\cap I^+\), the arguments are completely analogous, except we only have conormal regularity \(\rho _+^{b_+}H_{{\text {b}}}^\infty \) at \(I^+\). Thus,

$$\begin{aligned} \pi _0 h\in \mathcal {A}_{{\text {phg}},{\text {phg}},{\text {b}}}^{\mathcal {E}_0,\mathcal {E}'_I,b_+},\ \ \pi _{1 1}^c h\in \mathcal {A}_{{\text {phg}},{\text {phg}},{\text {b}}}^{\mathcal {E}_0,\bar{\mathcal {E}}_I,b_+},\ \ \pi _{1 1} h\in \mathcal {A}_{{\text {phg}},{\text {phg}},{\text {b}}}^{\mathcal {E}_0,\mathcal {E}_I,b_+}. \end{aligned}$$

Next, we use this information to obtain an expansion at \(I^+\), similarly to the arguments around (5.10). We shall use the linearization \(L_0\), still defined using \(\gamma =0\), and its normal operator at \(I^+\subset M\), rather than its normal operator at the boundary of \(\overline{\mathbb {R}^4}\); this obviates the need to relate (partially) polyhomogeneous function spaces on \(\overline{\mathbb {R}^4}\) and M. Namely, fix a collar neighborhood

$$\begin{aligned} U:=[0,1)_{\rho _+}\times I^+,\ \ I^+=\{Z\in \mathbb {R}^3:|Z|\le 1\}, \end{aligned}$$

of \(I^+\) in M, and denote by \(\mathcal {V}_{{\text {b}},-}(U)\subset \mathcal {V}(U)\) the Lie subalgebra of vector fields tangent to \(I^+\) but with no condition at \(\mathscr {I}^+\). Then for \(\gamma =0\), we have \(L_0\in \text {Diff}_{{\text {b}},-}^2(U)\) (the algebra generated by \(\mathcal {V}_{{\text {b}},-}\)), acting on sections of \(\beta ^*S^2|_U\): by Lemma 3.8, \(\widetilde{L}_0\in \text {Diff}_{\text {b}}^2(M)\hookrightarrow \text {Diff}_{{\text {b}},-}^2(M)\) certainly has smooth coefficients, and the same is true for \(L_0^0=-2\rho ^{-2}\partial _0\partial _1=\partial _{\rho _I}(\rho _I\partial _{\rho _I}-\rho _+\partial _{\rho _+})+\text {Diff}_{\text {b}}^2(M)\), \(\rho _I=1-|Z|^2\). Furthermore, by Lemmas 2.10 and 3.10 as well as equation (3.29), the normal operator \(N(L_0)\) of \(L_0\) at \(I^+\) can be identified with \(N(\underline{L}{})\), so that in fact \(N(L_0)=\Box _{g_{\text {dS}}}-2\), defined using the expressions (4.62) and (4.65), acting component-wise on the fibers of the trivial bundle \(\underline{\mathbb {R}}{}^{10}\), where we use Lemma 2.11 to identify \(\beta ^*S^2|_{I^+}\cong {}^0\beta ^*(S^2\,{}^{{\text {sc}}}T^*{}^0\,\overline{\mathbb {R}^4})|_{{}^0 I^+}\cong \underline{\mathbb {R}}{}^{10}\) by means of coordinate differentials. By [107, §4] and the module regularity proved in [57],

$$\begin{aligned} \widehat{L_0}(\sigma )^{-1}:\bar{H}^{s-1,k}(I^+)\rightarrow \bar{H}^{s,k}(I^+) \end{aligned}$$
(7.14)

is meromorphic for \(\sigma \in \mathbb {C}\) with \(s>\tfrac{1}{2}-{\text {Im}}\sigma \), where the bar refers to extendibility at \(\partial I^+=\{|Z|=1\}\), while the parameter \(k\in \mathbb {N}_0\) measures the amount of regularity under the \(\mathcal {C}^\infty (I^+)\)-module \(\text {Diff}_{\text {b}}^1(I^+)\); that is, \(\bar{H}^{s,k}(I^+)\) consists of \(H^s\) functions on \(I^+\) which remain in \(H^s\) under application of any operator in \(\text {Diff}_{\text {b}}^k(I^+)\). (This is analogous to Lemma 5.4, except in the present de Sitter setting we work on high regularity spaces rather than the low regularity spaces in the Minkowski setting, see [107, §5]). Strictly speaking, the references only apply to the operator obtained from \(L_0\) by smooth extension across \(\partial I^+\) to an operator on a slightly larger space than \(I^+\); but (7.14) follows simply by using extension and restriction operators, and the choice of extensions is irrelevant since \(\widehat{L_0}(\sigma )\) is principally a wave operator beyond \(\partial I^+\).

The divisor \(\mathcal {R}\) of \(L_0\), see Remark 5.6, is then

$$\begin{aligned} \mathcal {R}=-i; \end{aligned}$$
(7.15)

indeed, using the relation between asymptotics on global de Sitter space and resonances on static de Sitter space as in [60, Appendix C], this follows from [106, Theorem 1.1] for \(n=4\), \(\lambda =2\), with the logarithmic terms absent: the indicial roots are 1 and 2, see [106, Lemma 4.13], and in the notation of (4.65), the difference of \(\Box _{g_{\text {dS}}}\) and its indicial operator \(-(\hat{\tau }\partial _{\hat{\tau }})^2+3\hat{\tau }\partial _{\hat{\tau }}\) is \(\hat{\tau }^2\Delta _{\hat{x}}\), thus vanishes quadratically in \(\hat{\tau }\) as a b-operator on \([0,\infty )_{\hat{\tau }}\times \mathbb {R}^3_{\hat{x}}\). Hence, for the formal solution \(u=\hat{\tau } v_-+\hat{\tau }^2 v_+\) constructed in [106, Lemma 4.13], the Taylor series of \(v_\pm \) only contain even powers of \(\hat{\tau }\); \(1-2\mathbb {N}_0\) and \(2-2\mathbb {N}_0\) being disjoint, there are no integer coincidences which would cause logarithmic terms.
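The indicial roots quoted above can be checked directly from the indicial operator; the following is a short verification under the stated form \(-(\hat{\tau }\partial _{\hat{\tau }})^2+3\hat{\tau }\partial _{\hat{\tau }}\) of the indicial operator of \(\Box _{g_{\text {dS}}}\), with \(\lambda \in \mathbb {C}\) an auxiliary exponent introduced here only for illustration:

$$\begin{aligned} \bigl (-(\hat{\tau }\partial _{\hat{\tau }})^2+3\hat{\tau }\partial _{\hat{\tau }}-2\bigr )\hat{\tau }^\lambda = (-\lambda ^2+3\lambda -2)\hat{\tau }^\lambda = -(\lambda -1)(\lambda -2)\hat{\tau }^\lambda , \end{aligned}$$

which vanishes precisely for \(\lambda =1\) and \(\lambda =2\). Since the remainder \(\hat{\tau }^2\Delta _{\hat{x}}\) raises the power of \(\hat{\tau }\) by 2, the formal series starting at \(\hat{\tau }^1\) and \(\hat{\tau }^2\) only involve the exponents \(1+2\mathbb {N}_0\) and \(2+2\mathbb {N}_0\), respectively, and these two sets of exponents do not meet.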

Now, consider again (7.7): if \(\chi =\chi (\rho _+)\) denotes a localizer near \(I^+\), identically 1 near \(I^+\) and vanishing near \(I^0\), we have

$$\begin{aligned} L_0(\chi h) = -\chi p_0 + [L_0,\chi ]h + \int _0^1 \chi (L_0-L_{t h})(h)\,d t. \end{aligned}$$
(7.16)

We have \(\chi p_0\in \mathcal {A}_{\text {phg}}^{\emptyset ,0}\), with the exponents now referring to the behavior at \(\mathscr {I}^+\) and \(I^+\), respectively. Suppose we already have

$$\begin{aligned} \pi _0 h\in \mathcal {A}_{\text {phg}}^{\mathcal {E}'_I,\mathcal {E}_+}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}'_I,c_+},\ \ \pi _{1 1}^c h\in \mathcal {A}_{\text {phg}}^{\bar{\mathcal {E}}_I,\mathcal {E}_+}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\bar{\mathcal {E}}_I,c_+},\ \ \pi _{1 1} h\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_I,\mathcal {E}_+}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\mathcal {E}_I,c_+}. \end{aligned}$$
(7.17)

Using that \(\mathcal {E}_+-i\) is closed under nonlinear operations, i.e. \(j(\mathcal {E}_+-i)+i\subset \mathcal {E}_+\), \(j\in \mathbb {N}\), we find \(L_0-L_{t h}\in \mathcal {A}_{\text {phg}}^{\mathcal {E}_+-i}+\rho _+^{c_+ +1}H_{{\text {b}}}^\infty \) near \((I^+)^\circ \); see also Lemma 3.10. Using the structure of \(L_{t h}\) near \(\mathscr {I}^+\cap I^+\) from Lemma 3.8 as above, and noting that \({\text {supp}}[L_0,\chi ]h\subset {\text {supp}}d\chi \) is disjoint from \(I^+\), we deduce that

$$\begin{aligned} L_0(\chi h) \in \mathcal {A}_{\text {phg}}^{\emptyset ,0} + \mathcal {A}_{\text {phg}}^{\widetilde{\mathcal {E}}_I+i,\emptyset } + \mathcal {A}_{\text {phg}}^{\widetilde{\mathcal {E}}_I+i,2\mathcal {E}_+-i} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\widetilde{\mathcal {E}}_I+i,c_+ +1},\ \ \widetilde{\mathcal {E}}_I:=\mathcal {E}_I\setminus \{(0,1)\}, \end{aligned}$$

where the weight of the remainder term is as stated since all \((z,k)\in \mathcal {E}_+\) except for (0, 0) have \({\text {Im}}z<0\). (Here \(\widetilde{\mathcal {E}}_I\supset \bar{\mathcal {E}}_I+i\supset \mathcal {E}'_I+i\) allows for a nonlogarithmic leading term at \(\mathscr {I}^+\), capturing the worst component of elements of the space \(\mathcal {Y}^\infty \) in Definition 3.3, and moreover captures all nonlinear terms of (7.16)). Replacing \(L_0\) by \(N(L_0)\) causes another error term, \((L_0-N(L_0))(\chi h)\in \mathcal {A}_{\text {phg}}^{\widetilde{\mathcal {E}}_I+i,\mathcal {E}_+-i}+\mathcal {A}_{{\text {phg}},{\text {b}}}^{\widetilde{\mathcal {E}}_I+i,c_++1}\), so

$$\begin{aligned} N(L_0)(\chi h) \in \mathcal {A}_{\text {phg}}^{\emptyset ,0} + \mathcal {A}_{\text {phg}}^{\widetilde{\mathcal {E}}_I+i,\mathcal {E}_+-i} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{\widetilde{\mathcal {E}}_I+i,c_++1}. \end{aligned}$$

Mellin transforming in \(\rho _+\) at \({\text {Im}}\sigma =-b_+\), inverting \(\widehat{L_0}(\sigma )\) on \(\mathcal {A}_{\text {phg}}^{\widetilde{\mathcal {E}}_I+i}(I^+)\) using Lemma 7.8 below, taking the inverse Mellin transform, and shifting the contour to \({\text {Im}}\sigma =-c_+-1\), we obtain

$$\begin{aligned} \chi h \in \mathcal {A}_{\text {phg}}^{0,\mathcal {R}\overline{\cup }0} + \mathcal {A}_{\text {phg}}^{0\overline{\cup }\widetilde{\mathcal {E}}_I,(\mathcal {R}\overline{\cup }\widetilde{\mathcal {E}}_I)\overline{\cup }(\mathcal {E}_+-i)} + \mathcal {A}_{{\text {phg}},{\text {b}}}^{0\overline{\cup }\widetilde{\mathcal {E}}_I,c_+ +1}. \end{aligned}$$

The index set at \(I^+\) is contained in \(\mathcal {E}_+\) by condition (7.2f), so this improves over (7.17) by the weight 1 in the remainder term; the index sets at \(\mathscr {I}^+\) on the other hand are automatically the ones stated (but now with the improvement at \(I^+\)), as the presence of a nonzero term in the expansion of \(\pi _{1 1} h\), say, at \(\mathscr {I}^+\) corresponding to some element in \((0\overline{\cup }\widetilde{\mathcal {E}}_I)\setminus \mathcal {E}_I\), would contradict our a priori knowledge (7.17). Iterating this gives the polyhomogeneity at \(I^+\), as claimed.

Next, let us show that the smallest sets satisfying conditions (7.2a)–(7.2f) are indeed index sets: we need to verify condition (2.31b). For \(\mathcal {E}_0\), this is clear since, letting \(\widetilde{\mathcal {E}}_0^0:=\mathcal {E}_0^0+\mathcal {E}'_{\mathrm{log}}\),

$$\begin{aligned} \mathcal {E}_0=\widetilde{\mathcal {E}}_0^0\cup \bigcup _{j\in \mathbb {N}}j(\widetilde{\mathcal {E}}_0^0-i)+i \end{aligned}$$

and \({\text {Im}}\widetilde{\mathcal {E}}_0^0<0\); note that this gives \({\text {Im}}\mathcal {E}_0<0\). At \(\mathscr {I}^+\), we take \(\mathcal {E}'_I=\bigcup _{k\in \mathbb {N}}\mathcal {E}'_{I,k}\), likewise for \(\bar{\mathcal {E}}_I\) and \(\mathcal {E}_I\), where we recursively define \(\mathcal {E}'_{I,0}=\bar{\mathcal {E}}_{I,0}=\mathcal {E}_{I,0}=\emptyset \) and

$$\begin{aligned} \mathcal {E}'_{I,k+1}&= \mathcal {E}_0 \overline{\cup }(2\mathcal {E}_{I,k}-i), \end{aligned}$$
(7.18a)
$$\begin{aligned} \bar{\mathcal {E}}_{I,k+1}&= 0 \cup \bigl (\mathcal {E}_0\overline{\cup }\bigl ((\bar{\mathcal {E}}_{I,k}+\mathcal {E}'_{I,k})\cup (2\mathcal {E}_{I,k}-i)\bigr )\bigr ), \end{aligned}$$
(7.18b)
$$\begin{aligned} \mathcal {E}_{I,k+1}&= \Bigl (0 \overline{\cup }\mathcal {E}_0\overline{\cup }\bigl ((\mathcal {E}_{I,k}+\mathcal {E}'_{I,k})\cup (2\bar{\mathcal {E}}_{I,k})\bigr )\Bigr ) \cup \bigcup _{j\in \mathbb {N}}\bigl (j(\mathcal {E}_{I,k}-i)+i\bigr ). \end{aligned}$$
(7.18c)

It is easy to see by induction that

$$\begin{aligned} {\text {Im}}\mathcal {E}'_{I,k}, {\text {Im}}\bigl (\bar{\mathcal {E}}_{I,k}{\setminus }(0,0)\bigr ),\ {\text {Im}}\bigl (\mathcal {E}_{I,k}{\setminus }(0,1)\bigr )\le -c,\ \ c := \min (1,-\sup {\text {Im}}\mathcal {E}_0)>0, \end{aligned}$$

for all k. Therefore, to compute the index sets in any fixed half space \({\text {Im}}z>-N\), it suffices to restrict to \(j\le N+1\) in (7.18c), which implies that the truncated sets \(\mathcal {E}'_{I,k;N}:=\mathcal {E}'_{I,k}\cap \{{\text {Im}}z>-N\}\) etc. are finite for all k; we must show that \(\mathcal {E}'_{I,k;N}\) etc. are independent of k for sufficiently large k (depending on N). Note then:

  • \(\mathcal {E}'_{I,k+1;N}\) only depends on \(\mathcal {E}_{I,k;(N-1)/2}\);

  • \(\bar{\mathcal {E}}_{I,k+1;N}\) only depends on \(\mathcal {E}_{I,k;(N-1)/2}\), \(\bar{\mathcal {E}}_{I,k;N-c}\), and \(\mathcal {E}'_{I,k;N}\);

  • \(\mathcal {E}_{I,k+1;N}\) only depends on \(\mathcal {E}_{I,k;N-c}\), \(\mathcal {E}_{I,k;(N-1)/2}\), \(\bar{\mathcal {E}}_{I,k;N}\), and \(\mathcal {E}'_{I,k;N}\).

Combining these, one finds that, a fortiori, \(\mathcal {E}'_{I,k+1;N}\), \(\bar{\mathcal {E}}_{I,k+1;N}\), and \(\mathcal {E}_{I,k+1;N}\) only depend on the sets \(\mathcal {E}'_{I,k-\ell ;N-c}\), \(\bar{\mathcal {E}}_{I,k-\ell ;N-c}\), \(\mathcal {E}_{I,k-\ell ;\max (N-c,(N-1)/2)}\), \(\ell =0,1,2\). Therefore, for \(N>0\), \(\mathcal {E}'_{I,k;N}\) etc. are independent of k for \(k>3 N/c\), as desired. An analogous argument implies that \(\mathcal {E}_+\) is an index set as well.
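The finiteness of truncations of a set of the form \(\widetilde{\mathcal {E}}\cup \bigcup _{j\in \mathbb {N}}\bigl (j(\widetilde{\mathcal {E}}-i)+i\bigr )\), as used for \(\mathcal {E}_0\) above, can be illustrated by a small computation. The following is a toy sketch only, not part of the proof: it tracks just the pairs \(({\text {Im}}z,k)\) for a finite, hypothetical set of generators with \({\text {Im}}z<0\), ignores any further closure properties required of index sets, and all function names are made up for this illustration.

```python
def shift(index_set, a):
    """Return index_set + a*i, tracking only (Im z, k): (Im z, k) -> (Im z + a, k)."""
    return {(im + a, k) for (im, k) in index_set}


def add(E, F):
    """Sum of two sets of exponents: imaginary parts add, and so do log orders."""
    return {(a + b, j + k) for (a, j) in E for (b, k) in F}


def truncate(index_set, N):
    """Keep only the elements with Im z > -N."""
    return {(im, k) for (im, k) in index_set if im > -N}


def nonlinear_closure(generators, N):
    """Truncation to Im z > -N of  E u U_{j>=1} (j(E - i) + i)  for E = generators."""
    # Assumption of this sketch: all generators lie strictly in Im z < 0.
    assert all(im < 0 for (im, _) in generators)
    base = shift(generators, -1)      # E - i
    power = set(base)                 # j(E - i), starting with j = 1
    result = truncate(generators, N)
    while True:
        term = truncate(shift(power, 1), N)   # j(E - i) + i, truncated
        if not term:
            # Imaginary parts only decrease with j, so no later j contributes.
            break
        result |= term
        # Elements with Im z <= -N - 1 can never re-enter the half space.
        power = truncate(add(power, base), N + 1)
    return result


# Hypothetical generators (Im z, k); integers keep the arithmetic exact.
example = nonlinear_closure({(-1, 0), (-2, 1)}, N=5)
```

In this example only \(j=1,2\) contribute in the half space \({\text {Im}}z>-5\), which mirrors the observation in the text that only finitely many values of j matter in any fixed half space.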

Finally, we show that the polyhomogeneity of the initial data \(\gamma \) and k in the sense of (7.1) implies that the solution in the neighborhood U, see (6.7), of \(\overline{\{t=0\}}\) constructed in Lemma 6.2 is indeed polyhomogeneous at \(I^0\cap U\) with index set \(\mathcal {E}_0\); this however follows from the same arguments used to prove (7.8) (and we can in fact ignore the weight at \(\mathscr {I}^+\)). In fact, working on \({}^0\overline{\mathbb {R}^4}\), we have \(h\in \mathcal {A}_{\text {phg}}^{\mathcal {E}'_0}(U)\) where \(\mathcal {E}'_0=\bigcup _{j\in \mathbb {N}_0}\bigr (j(\mathcal {E}_0^0-i)+i\bigr )\) does not include the extra logarithmic terms from \(\mathcal {E}_{\mathrm{log}}\); this relies on the observation that the gauged Cauchy data constructed in the proof of Lemma 6.2, see (6.11)–(6.12), lie in \(\mathcal {A}_{\text {phg}}^{\mathcal {E}'_0}({}^0\Sigma )\), which follows from an inspection of the proof. Upon pushing the local solution h in U forward to \({}^m\overline{\mathbb {R}^4}\), we incur the logarithmic terms encoded in the index set \(\mathcal {E}_{\mathrm{log}}\), see (6.15); this proves (7.5). \(\square \)

To complete the proof, we need to study the action of \(\widehat{L_0}(\sigma )^{-1}\) on polyhomogeneous spaces. Let \(\mathcal {E}\) be an index set, and let \(c\in \mathbb {R}\) be such that \({\text {Im}}z<-c\) for all \((z,0)\in \mathcal {E}\); then \(\mathcal {A}_{\text {phg}}^{\mathcal {E}+i}(I^+)\subset \rho _I^{c-1}H_{{\text {b}}}^\infty (I^+)\subset \bar{H}^{-1/2+c-0,\infty }(I^+)\).

Lemma 7.8

The operator \(\widehat{L_0}(\sigma )^{-1}\) in (7.14) extends from \({\text {Im}}\sigma >-c\) as a meromorphic operator family \(\widehat{L_0}(\sigma )^{-1}:\mathcal {A}_{\text {phg}}^{\mathcal {E}+i}(I^+)\rightarrow \mathcal {A}_{\text {phg}}^{0\overline{\cup }\mathcal {E}}(I^+)\) with divisor contained in \(\mathcal {R}\overline{\cup }\mathcal {E}\).

Proof

Given \(f\in \rho _I^{-1}\mathcal {A}_{\text {phg}}^\mathcal {E}(I^+)\), we shall explicitly construct a formal solution \(u_{\text {phg}}\) of \(\widehat{L_0}(\sigma )u_{\text {phg}}=f\) at \(\partial I^+\), which we then correct using the inverse (7.14) acting on \(\dot{\mathcal {C}}^\infty (I^+)\). The construction uses that

$$\begin{aligned} \widehat{L_0}(\sigma )=-D_{\rho _I}(\rho _I D_{\rho _I}-\sigma )+\text {Diff}_{\text {b}}^2(I^+), \end{aligned}$$
(7.19)

which follows from the form (4.49) of the dual metric of \(\rho ^{-2}g_m\). Thus, consider \((z,k)\in \mathcal {E}\), \(f_k\in \mathcal {C}^\infty (\partial I^+)=\mathcal {C}^\infty (\mathbb {S}^2)\), and suppose \(f=\rho _I^{i z-1}(\log \rho _I)^k f_k\in \rho _I^{-1}\mathcal {A}_{\text {phg}}^{(z,k)}(I^+)\) near \(\rho _I=0\). If \(z\ne 0\), we then have

$$\begin{aligned}&\widehat{L_0}(\sigma )\bigl (-z^{-1}(z-\sigma )^{-1}\rho _I^{i z}(\log \rho _I)^k f_k\bigr ) - f\\&\quad = (z-\sigma )^{-1}\rho _I^{i z-1}(\log \rho _I)^{k-1}f_{k-1} + (z-\sigma )^{-1}f' \end{aligned}$$

for some \(f_{k-1}\in \mathcal {C}^\infty (\partial I^+)\), and with \(f'\in \rho _I^{-1}\mathcal {A}_{\text {phg}}^{(z,k)-i}(I^+)\) holomorphic in \(\sigma \). We can iteratively solve away the first term, obtaining \(u_j\in \mathcal {C}^\infty (\partial I^+)\) such that

$$\begin{aligned} \widehat{L_0}(\sigma )\Biggl (\sum _{j=0}^k (z-\sigma )^{-j-1}\rho _I^{i z}(\log \rho _I)^{k-j}u_j\Biggr ) - f = \sum _{j=0}^k(z-\sigma )^{-j-1}f'_j, \end{aligned}$$

where \(f'_j\in \rho _I^{-1}\mathcal {A}_{\text {phg}}^{(z,k-j)-i}(I^+)\) is holomorphic in \(\sigma \) and has improved asymptotics at \(\partial I^+\). If on the other hand \(z=0\), \(f=\rho _I^{-1}(\log \rho _I)^k f_k\in \rho _I^{-1}\mathcal {A}_{\text {phg}}^{(0,k)}(I^+)\), we need an extra \(\log \rho _I\) term: there exist \(u_j\in \mathcal {C}^\infty (\partial I^+)\) such that

$$\begin{aligned}&\widehat{L_0}(\sigma )\Biggl (\sum _{j=0}^k\sigma ^{-j-1}(\log \rho _I)^{k+1-j}u_j\Biggr ) - f \\&\quad = \sum _{j=0}^k \sigma ^{-j-1}f'_j,\ \ f'_j\in \rho _I^{-1}\mathcal {A}_{\text {phg}}^{(0,k+1-j)-i}(I^+). \end{aligned}$$

(Note that there is no term on the left with \((\log \rho _I)^0\)). In general, given \(f\in \rho _I^{-1}\mathcal {A}_{\text {phg}}^{\mathcal {E}}(I^+)\), we can use these arguments and asymptotic summation to construct, locally in \(\sigma \), a family \(u_{\text {phg}}\in \mathcal {A}_{\text {phg}}^{0\overline{\cup }\mathcal {E}}(I^+)\), depending meromorphically on \(\sigma \) with divisor contained in \(\mathcal {E}\), such that

$$\begin{aligned} \widehat{L_0}(\sigma )u_{\text {phg}}- f =: f' \in \mathcal {A}_{\text {phg}}^\emptyset (I^+) = \dot{\mathcal {C}}^\infty (I^+) \end{aligned}$$

is meromorphic with divisor contained in \(\mathcal {E}\); applying \(\widehat{L_0}(\sigma )^{-1}\) to this gives an element of \(\mathcal {C}^\infty (I^+)=\mathcal {A}_{\text {phg}}^0(I^+)\), and

$$\begin{aligned} u:=\widehat{L_0}(\sigma )^{-1}f = u_{\text {phg}}- \widehat{L_0}(\sigma )^{-1}f' \end{aligned}$$

solves \(\widehat{L_0}(\sigma )u=f\), with divisor contained in \(\mathcal {R}\overline{\cup }\mathcal {E}\) due to the second term. \(\square \)
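As a consistency check on the prefactors appearing in this construction, one can apply only the model part \(-D_{\rho _I}(\rho _I D_{\rho _I}-\sigma )\) of (7.19) to a single term. The following sketch assumes the convention \(D_{\rho _I}=-i\partial _{\rho _I}\) and ignores both the \(\text {Diff}_{\text {b}}^2(I^+)\) remainder and the dependence of the coefficients on \(\partial I^+\):

$$\begin{aligned} -D_{\rho _I}(\rho _I D_{\rho _I}-\sigma )\bigl (\rho _I^{i z}(\log \rho _I)^k\bigr )&= -z(z-\sigma )\rho _I^{i z-1}(\log \rho _I)^k + i k(2 z-\sigma )\rho _I^{i z-1}(\log \rho _I)^{k-1} \\&\quad + k(k-1)\rho _I^{i z-1}(\log \rho _I)^{k-2}. \end{aligned}$$

For \(z\ne 0\), dividing by \(-z(z-\sigma )\) reproduces the leading term \(\rho _I^{i z-1}(\log \rho _I)^k\), which accounts for the prefactor \(-z^{-1}(z-\sigma )^{-1}\) and for the pole at \(\sigma =z\). For \(z=0\) the first term on the right vanishes; replacing k by \(k+1\) then gives \(-i(k+1)\sigma \rho _I^{-1}(\log \rho _I)^k+(k+1)k\rho _I^{-1}(\log \rho _I)^{k-1}\), which is why an extra power of \(\log \rho _I\) and the factor \(\sigma ^{-1}\) are needed in that case.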

The global solution \(g=g_m+\rho h\) constructed on the space \({}^m_m M\) in Theorem 6.7 is polyhomogeneous as well; the only place where this is not immediate is \(I^0\), where however polyhomogeneity is well-defined under the assumption (6.15) on the index set \(\mathcal {E}_0\), which is already satisfied for the set \(\mathcal {E}_0\) constructed in Theorem 7.1. Thus, the index sets of h at \(I^-\), \(\mathscr {I}^-\), \(I^0\), \(\mathscr {I}^+\), and \(I^+\) are \(\mathcal {E}_+\), \(\mathcal {E}_I\), \(\mathcal {E}_0\), \(\mathcal {E}_I\), and \(\mathcal {E}_+\), respectively, likewise for the refined asymptotics of \(\pi _{1 1}^c h\) and \(\pi _0 h\) near \(\mathscr {I}^\pm \).

8 Bondi mass and the mass loss formula

We shall first use a different characterization of the Bondi mass than the one outlined in §1.3: the Bondi mass can be calculated from the leading lower order terms of the metric g in a so-called Bondi–Sachs coordinate system in §8.2; in order to define these coordinates, we first need to study a special class of null-geodesics in §8.1, namely those which asymptotically look like outgoing radial null-geodesics in the Schwarzschild spacetime. For simplicity, we work with the infinite regularity solutions of Theorem 1.8, and we only control the Bondi–Sachs coordinates in a small neighborhood of \((\mathscr {I}^+)^\circ \), as this is all that is needed for deriving the mass loss formula. More precise estimates, including up to \(\mathscr {I}^+\cap I^+\), of this coordinate system, and a precise description of future-directed null-geodesics and other aspects of the geometry near (null) infinity will be discussed elsewhere.

8.1 Asymptotically radial null-geodesics

Suppose \(g=g_m+\rho h\), \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\), solves \(\text {Ric}(g)=0\) in the gauge \(\Upsilon (g;g_m)=0\), where the weights are as in Definition 3.1; by an inspection of the expressions in §A.2, the gauge condition implies improved decay of certain (sums and derivatives of) components of the metric perturbation h, for instance, \(\Upsilon (g)_0=0\) implies

$$\begin{aligned} \Gamma _0^{0 0}\in m r^{-2}+H_{{\text {b}}}^{\infty ;2+b_0,2+b_I,2+b_+}. \end{aligned}$$
(8.1)

We wish to study null-geodesics near \((\mathscr {I}^+)^\circ \). Introducing coordinates \(v^\mu \) on \(T\mathbb {R}^4\) by writing tangent vectors as \(v^\mu \partial _{x^\mu }\), the geodesic vector field \(H\in \mathcal {V}(T\mathbb {R}^4)\) takes the form

$$\begin{aligned} H = v^\mu \partial _{x^\mu } + \Gamma ^\mu _{\kappa \lambda }v^\kappa v^\lambda \partial _{v^\mu }. \end{aligned}$$

As usual, we will use \(x^0=t+r_*\), \(x^1=t-r_*\), and local coordinates \(x^2,x^3\) on \(\mathbb {S}^2\). Consider first the case that \(h=0\), so g is the Schwarzschild spacetime near \(\mathscr {I}^+\). Radial null-geodesics then have constant \(x^1\) and \(x^b\), \(b=2,3\), while \(v^0(s)=\dot{x}^0(s)\) satisfies the ODE \(\dot{v}^0=-m r(s)^{-2}(v^0)^2\), so \(\ddot{x}^0=-m r^{-2}(\dot{x}^0)^2\). We then use:

Lemma 8.1

We have \(r=r_*-2 m\log r_* + \mathcal {O}(r_*^{-1}\log r_*)\), and \(r_*=\tfrac{1}{2}(x^0-x^1)\).

Proof

Let \(r_0(r_*)\equiv r_*\) and

$$\begin{aligned} r_{k+1}(r_*)=r_*-2 m\log (r_k(r_*)-2 m)=r_*-2 m\log (r_k)-2 m\log (1-2 m r_k^{-1}), \end{aligned}$$

then \(|r_{k+1}-r_k|\le C r_*^{-1}|r_k-r_{k-1}|\), \(k\ge 1\), and the fact that \(|r_1-r_0|=\mathcal {O}(\log r_*)\) show that \(r-r_1=\mathcal {O}(r_*^{-1}\log r_*)\), hence evaluation of \(r_1\) gives the result. \(\square \)
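The fixed-point iteration in this proof is easy to try numerically. The sketch below is only an illustration: it assumes the relation \(r_*=r+2 m\log (r-2 m)\) that underlies the recursion for \(r_k\), the function names are hypothetical, and the constant 8 in the final comparison is an ad hoc choice for the sample values, not a claim about the optimal constant in the lemma.

```python
import math


def r_from_tortoise(r_star, m, iterations=50):
    """Iterate r_{k+1} = r_* - 2m log(r_k - 2m), starting from r_0 = r_*."""
    r = r_star
    for _ in range(iterations):
        r = r_star - 2.0 * m * math.log(r - 2.0 * m)
    return r


def leading_approximation(r_star, m):
    """The approximation r ~ r_* - 2m log r_* from Lemma 8.1."""
    return r_star - 2.0 * m * math.log(r_star)


# Sample values (assumptions for illustration): mass m = 1, large r_*.
m = 1.0
r_star = 1.0e4
r = r_from_tortoise(r_star, m)

# Residual of the defining relation r_* = r + 2m log(r - 2m).
residual = abs(r + 2.0 * m * math.log(r - 2.0 * m) - r_star)

# Error of the leading approximation, compared with r_*^{-1} log r_*.
error = abs(r - leading_approximation(r_star, m))
scale = math.log(r_star) / r_star
```

For large \(r_*\) the map being iterated is a contraction with factor of size roughly \(2 m/r_*\), so a handful of iterations already suffices; the fixed count of 50 is deliberately generous.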

Often, we will only need the consequence that

$$\begin{aligned} r=\tfrac{1}{2}x^0+\mathcal {O}(\log x^0) \end{aligned}$$
(8.2)

for bounded \(x^1\), suggesting the approximation \(\ddot{x}^0=-4 m(x^0)^{-2}(\dot{x}^0)^2\) for the geodesic equation. Solving this by Picard iteration with initial guess \(x^0_0(s)\equiv s\) gives

$$\begin{aligned} x^0_1(s)=s+4 m\log s,\ \ \dot{x}^0_1(s)=1+4 m s^{-1}, \end{aligned}$$

and subsequent iterations give \(\mathcal {O}(s^{-1}\log s)\), resp. \(\mathcal {O}(s^{-2}\log s)\), corrections to \(x^0_1(s)\), resp. \(\dot{x}^0_1(s)\). Let us generalize such radial null-geodesics:

Proposition 8.2

Fix a point \(p\in (\mathscr {I}^+)^\circ \) with coordinates \(x^i(p)=:\bar{x}^i\). Then there exists a future-directed null-geodesic \(\gamma :[0,\infty )\rightarrow M\), \(\gamma (s)=(x^\mu (s))\) such that \(\gamma (s)\rightarrow p\) in M and \(x^a(s)-\bar{x}^a=o(s^{-1})\) as \(s\rightarrow \infty \).

Proof

We will normalize \(\gamma \) by requiring that \(x^0(s)\sim s+4 m\log s\), and we shall seek \(\gamma :[s_0,\infty )\rightarrow M\) for \(s_0>0\) large. For weights , to be specified in (8.10) below, we will solve the geodesic equation on the level of the velocity \(v^\mu =\dot{x}^\mu \) using a suitable Picard iteration scheme on the Banach space

(8.3)

where we use the notation

$$\begin{aligned} \widetilde{v}^0(s):=v^0(s)-(1+4 m s^{-1}), \end{aligned}$$

and where \(\mathcal {C}^0\equiv \mathcal {C}^0([s_0,\infty ))\) is equipped with the \(\sup \) norm; as the norm on X, we then take the maximum of the weighted \(\mathcal {C}^0\) norms of \(\widetilde{v}^0\) and \(v^i\), \(i=1,2,3\). For \(v\in X\), we define its integral \(x=I(v)\), \(\dot{x}^\mu (s)=v^\mu (s)\), by

$$\begin{aligned} x^0(s)&:= s+4 m\log s - \int _s^\infty \widetilde{v}^0(u)\,d u, \nonumber \\ x^i(s)&:= \bar{x}^i - \int _s^\infty v^i(u)\,d u,\ \ i=1,2,3. \end{aligned}$$
(8.4)

As the first iterate, we take

$$\begin{aligned} \widetilde{v}_0^0(s),\, v_0^i(s)\equiv 0,\ \ x_0:=I(v_0); \end{aligned}$$

note that \(\Vert v_0\Vert _X=0\). For \(k\ge 0\), \(v_k\in X\), \(\Vert v_k\Vert _X\le 1\), and \(x_k=I(v_k)\), let then

$$\begin{aligned} v^\mu _{k+1}(s) := v_k^\mu (\infty ) + \int _s^\infty \Gamma ^\mu _{\kappa \lambda }|_{x_k(u)} v_k^\kappa (u)v_k^\lambda (u)\,d u,\ \ x_{k+1} := I(v_{k+1}). \end{aligned}$$
(8.5)

Note that for some fixed constant \(C>0\),

(8.6)

which in particular allows us to estimate the Christoffel symbols appearing in (8.5). For \(\mu =0\), writing \(r_k(s)=r(x_k(s))\), and using the improved decay of various Christoffel symbols due to the gauge condition \(\Upsilon (g)=0\), we have

(8.7)

with the integrals on the first line coming from terms with \((\kappa ,\lambda )=(0,0)\) and using (8.1), while the remaining terms come from \((\kappa ,\lambda )=(0,1)\), (0, b), (1, 1), (1, b), (a, b), in this order, using that \(v_k^0=\mathcal {O}(1)\), \(v_k^1=\mathcal {O}(s^{-1-\alpha _1})\), and . As for the notation, the constants implicit in the \(\mathcal {O}_{s_0}\) notation depend only on \(s_0\) and are nonincreasing with \(s_0\), as they come from the size of the Christoffel symbols along \(x_k(s)\), which satisfies (8.6). By (8.2) and (8.6), we have

$$\begin{aligned} \int _s^\infty m r_k(u)^{-2}\,d u=\int _s^\infty 4 m(u^{-2}+\mathcal {O}(u^{-3}\log u))\,d u = 4 m s^{-1} + \mathcal {O}(s^{-2}\log s). \end{aligned}$$
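The expansion of the integrand and the size of the remainder can be checked by hand. The following sketch assumes only \(r_k(u)=\tfrac{1}{2}u+\mathcal {O}(\log u)\), which appears to be what (8.2) and (8.6) are used for at this step.

$$\begin{aligned} r_k(u)^{-2}&=4 u^{-2}\bigl (1+\mathcal {O}(u^{-1}\log u)\bigr )^{-2}=4 u^{-2}+\mathcal {O}(u^{-3}\log u), \\ \int _s^\infty 4 m u^{-2}\,d u&=4 m s^{-1},\qquad \int _s^\infty u^{-3}\log u\,d u=\frac{2\log s+1}{4 s^2}=\mathcal {O}(s^{-2}\log s). \end{aligned}$$

The second integral follows from the antiderivative \(-\tfrac{1}{2}u^{-2}\log u-\tfrac{1}{4}u^{-2}\) of \(u^{-3}\log u\).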

Therefore, we have

which, for fixed \(\alpha _0<b_I\), is bounded by \(\frac{1}{10}s^{-1-\alpha _0}\) for large \(s_0\), provided ; in particular, this requires .

We obtain estimates on \(v_{k+1}^i(s)\), \(i=1,2,3\), in a similar manner. Namely,

(8.8)

satisfies , hence \(|v_{k+1}^1(s)|<\frac{1}{10}s^{-1-\alpha _1}\) provided the weights satisfy , and provided we increase \(s_0\), if necessary.

Lastly, using the precise form of the leading term of \(\Gamma _{0 b}^c\),

(8.9)

Integrating the first term in the second line gives a term bounded from above by

so we get provided (which is consistent with ). Thus, the iteration (8.5) maps the unit ball in X into itself, provided we fix weights

(8.10)

and choose \(s_0\) large; recall here that \(0<b_I<b'_I<1\). Moreover, taking \(s_0\) larger if necessary, \(v_k\mapsto v_{k+1}\) is a contraction; such an estimate is only nonobvious for the difference of quadratic terms in (8.5) involving the component \(v^0\); however, the corresponding terms come with a small prefactor due to the smallness of the relevant Christoffel symbols.

Let now \(v:=\lim _{k\rightarrow \infty } v_k\in X\) denote the limiting curve in \(T\mathbb {R}^4\), and integrate it by setting \(\gamma :=I(v)\). Then v satisfies the integral equation (8.5) with \(v_k\) and \(v_{k+1}\) replaced by v, so v is \(\mathcal {C}^1\), hence \(\gamma \) is a \(\mathcal {C}^2\) geodesic. In particular, \(|v(s)|_{g(s)}^2\) is constant, hence equal to its limit as \(s\rightarrow \infty \), which is

This proves that \(\gamma \) is a null-geodesic with the desired properties. \(\square \)

Note that \(\gamma \) is the unique null-geodesic, up to translation of the affine parameter, tending to p and such that \(\dot{\gamma }\in X\). (Indeed, for any such \(\gamma \), the velocity \(\dot{\gamma }\) has small norm in a space defined like X but with weights decreased by a small amount and for \(s_0\) large enough. The uniqueness then follows from the fixed point theorem.)

Definition 8.3

For \(p\in (\mathscr {I}^+)^\circ \), denote by \(\gamma _p(s)\) the maximal null-geodesic such that \(v=\dot{\gamma }_p\) and \(x=\gamma _p\) satisfy equation (8.4) and \(v\in X\), with X given in (8.3). We call \(\gamma _p\) a radial null-geodesic.

We record the following stronger regularity property of the geodesics \(\gamma _p\):

Lemma 8.4

In the notation of Proposition 8.2, let \(\gamma _p(s)=(x^\mu (s))\) denote a radial null-geodesic; then we have

for all weights \(\alpha _0<b_I\), \(\alpha _1<b'_I\), , where \(\widetilde{x}^0(s):=x^0(s)-(s+4 m\log s)\), \(\widetilde{x}^i(s):=x^i(s)-\bar{x}^i\), and where \(S^m([s_0,\infty ))\) denotes symbols of order m, i.e. functions \(u\in \mathcal {C}^\infty ([s_0,\infty ))\) such that for any \(k\in \mathbb {N}_0\), \(|u^{(k)}(s)|\le C_k\langle s\rangle ^{m-k}\).

Proof

Certainly \(x^\mu (s)\) is smooth as a geodesic in a spacetime with smooth metric tensor. The symbolic estimates for \(\partial _s^k\widetilde{x}^\mu (s)\) for \(k=0,1\) follow immediately from the construction of \(\gamma _p\) in the proof of Proposition 8.2; for \(k=2\), they follow from the proof as well, specifically, from the decay of the integrands in (8.7)–(8.9). Assuming that for some \(k\ge 1\) we have \(|\partial _s^j\widetilde{x}^0(s)|\lesssim \langle s\rangle ^{-\alpha _0-j}\), \(0\le j\le k+1\), with \(\alpha _0\) as in (8.10), likewise for \(\widetilde{x}^i\), \(i=1,2,3\), we have

$$\begin{aligned} \partial _s^k(\partial _s^2\widetilde{x}^0) = \partial _s^k\ddot{x}^0 - \partial _s^{k+2}(s+4 m\log s) = \partial _s^k\ddot{x}^0 + \partial _s^k(4 m s^{-2}), \end{aligned}$$

and \(\partial _s^k\ddot{x}^0=-\partial _s^k(\Gamma ^0_{\mu \nu }\dot{x}^\mu \dot{x}^\nu )\). Note that \(x^0(s)=\mathcal {O}(s)\), \(\partial _s x^0(s)=\mathcal {O}(1)\), and \(\partial _s^j x^0(s)=\mathcal {O}(s^{-j})\) for \(2\le j\le k+1\). Expanding the derivatives using the Leibniz and chain rules thus gives the following types of terms: for \((\mu ,\nu )=(0,0)\) and all derivatives falling on the Christoffel symbol,

$$\begin{aligned} (\partial _s^k\Gamma ^0_{0 0})(\dot{x}^0)^2&=\partial _s^k(4 m s^{-2}+\mathcal {O}(s^{-2-b_I}))(1+\mathcal {O}(s^{-1}\log s)) \\&=\partial _s^k(4 m s^{-2}) + \mathcal {O}(s^{-k-2-b_I}) \end{aligned}$$

by the inductive hypothesis and the b-regularity of the remainder term in \(\Gamma ^0_{0 0}\); the remaining \((\mu ,\nu )=(0,0)\) terms are, with \(\ell _1+\ell _2+\ell _3=k\) and \(\ell _2>0\),

$$\begin{aligned} (\partial _s^{\ell _1}\Gamma _{0 0}^0)(\partial _s^{\ell _2}\dot{x}^0)(\partial _s^{\ell _3}\dot{x}^0) = \mathcal {O}(s^{-2-\ell _1}\cdot s^{-1-\ell _2}\cdot s^{-\ell _3}) = \mathcal {O}(s^{-k-3}). \end{aligned}$$
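The exponent count in the last display, and its comparison with the target bound \(\mathcal {O}(s^{-k-2-\alpha _0})\), can be spelled out as follows. This sketch uses only \(\ell _1+\ell _2+\ell _3=k\) and \(\alpha _0<b_I<1\), and the explicit derivative of the leading term is included for convenience.

$$\begin{aligned} (-2-\ell _1)+(-1-\ell _2)+(-\ell _3)&=-k-3, \\ s^{-k-3}=s^{-(1-\alpha _0)}s^{-k-2-\alpha _0}&\le s_0^{-(1-\alpha _0)}s^{-k-2-\alpha _0},\quad s\ge s_0, \\ \partial _s^k(4 m s^{-2})&=(-1)^k(k+1)!\,4 m s^{-k-2}. \end{aligned}$$

In particular, these terms are smaller than the required bound by the factor \(s^{-(1-\alpha _0)}\).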

Estimating the terms with \((\mu ,\nu )\ne (0,0)\) does not require special care: derivatives falling on \(\dot{x}^\mu \) are estimated using the inductive hypothesis (thus every derivative gives an extra power of decay in s); a derivative falling on \(\Gamma _{\mu \nu }^0\) on the other hand either produces \((\partial _0\Gamma _{\mu \nu }^0)\dot{x}^0\), which gains an order of decay due to the Christoffel symbol (recall that \(\partial _0\) is a b-derivative which vanishes at \(\mathscr {I}^+\)), or \((\partial _i\Gamma _{\mu \nu }^0)\dot{x}^i\), which gains an order of decay due to \(\dot{x}^i=\mathcal {O}(s^{-1})\). Thus, the bound \(\partial _s^k(\partial _s^2\widetilde{x}^0)=\mathcal {O}(s^{-k-2-\alpha _0})\) follows from the same arithmetic of weights as used after (8.7).

The arguments for the other components \(\widetilde{x}^i\) are completely analogous, and in fact simpler as no terms need to be handled separately. This finishes the inductive step, and thus the proof of the lemma. \(\square \)
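For orientation, a model element of the symbol class appearing in Lemma 8.4 is the power \(u(s)=s^{-\alpha }\), \(\alpha >0\), on \([s_0,\infty )\) with \(s_0>0\). The check below is a sketch in which \(\langle s\rangle =(1+s^2)^{1/2}\) is assumed to be the usual bracket; this section of the text does not define it explicitly.

$$\begin{aligned} u^{(k)}(s)&=(-1)^k\alpha (\alpha +1)\cdots (\alpha +k-1)\,s^{-\alpha -k}, \\ |u^{(k)}(s)|&\le C_k\langle s\rangle ^{-\alpha -k},\qquad C_k:=\alpha (\alpha +1)\cdots (\alpha +k-1)\,(1+s_0^{-2})^{(\alpha +k)/2}, \end{aligned}$$

using \(\langle s\rangle /s\le (1+s_0^{-2})^{1/2}\) for \(s\ge s_0\). Thus \(s^{-\alpha }\in S^{-\alpha }([s_0,\infty ))\), and Lemma 8.4 states that the remainders \(\widetilde{x}^\mu \) satisfy the same derivative bounds as such powers.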

We further note that for any compact subset \(K\Subset (\mathscr {I}^+)^\circ \), there exists a uniform value \(s_0\in \mathbb {R}\) such that the null-geodesics \(\gamma _p\), \(p\in K\), are defined on \([s_0,\infty )\); since moreover \(\gamma _p\) arises, via \(\gamma _p=I(\dot{\gamma }_p)\) as in (8.4), from the Banach fixed point theorem for a smooth (in p) contraction, Lemma 8.4 holds smoothly in the parameter p, that is, making the dependence on p explicit as a subscript, we have \(\widetilde{x}^0_p(s)\in \mathcal {C}^\infty (K;S^{-\alpha _0}([s_0,\infty )))\) etc.

Consider now the union of radial null-geodesics tending to the points of particular \(\mathbb {S}^2\) sections of \(\mathscr {I}^+\). Concretely, for fixed \(\bar{x}^1\in \mathbb {R}\), denote

$$\begin{aligned} S(\bar{x}^1):=\{p\in \mathscr {I}^+:x^1(p)=\bar{x}^1\},\ \ C_{\bar{x}^1}:=\bigcup _{p\in S(\bar{x}^1)} \gamma _p((s_0,\infty )), \end{aligned}$$
(8.11)

where \(s_0\) is chosen sufficiently large, which will always be assumed from now on. See Figure 15. Thus, on the Schwarzschild spacetime, \(C_{\bar{x}^1}\) is the part of the null hypersurface \(x^1=\bar{x}^1\) on which \(x^0\gtrsim s_0\).

Fig. 15. The outgoing light cone \(C_{\bar{x}^1}\) limiting to the sphere \(S(\bar{x}^1)\subset (\mathscr {I}^+)^\circ \). Also shown are a number of radial null-geodesics.

Lemma 8.5

For \(\bar{x}^1\in \mathbb {R}\), the set \(C_{\bar{x}^1}\) is a smooth null hypersurface near \(\mathscr {I}^+\). Moreover, if \(I^1\Subset \mathbb {R}\) is a precompact open interval, then there exists a function u such that

$$\begin{aligned} u-x^1=:\widetilde{u}\in \rho _I^{b'_I-0}H_{{\text {b}}}^\infty (M); \quad C_{\bar{x}^1} = \{ u=\bar{x}^1 \},\ \ \bar{x}^1\in I^1. \end{aligned}$$
(8.12)

Proof

With coordinates \(x^a\), \(a=2,3\), on \(\mathbb {S}^2\), write \(\gamma (\bar{x}^1;s,\bar{x}^2,\bar{x}^3) := \gamma _{(\bar{x}^1,\bar{x}^2,\bar{x}^3)}(s)\). First, we shall prove that there exists a coordinate change of \(\mathbb {R}_{x^0}\times \mathbb {R}^2_{x^2,x^3}\),

$$\begin{aligned} \Phi (\bar{x}^1;x^0,x^2,x^3)=(x^0-4 m\log x^0+\widetilde{\Phi }^0,x^2+\widetilde{\Phi }^2,x^3+\widetilde{\Phi }^3)=:(\Phi ^0,\Phi ^2,\Phi ^3), \end{aligned}$$
(8.13)

depending parametrically on \(\bar{x}^1\in I^1\), and with \(\widetilde{\Phi }^0\in S^{-\alpha _0}\), for weights as in (8.10) (with the symbolic behavior in \(x^0\)), such that the map

$$\begin{aligned} \delta (x^0,\bar{x}^1,x^2,x^3) := \gamma (\bar{x}^1;\Phi (\bar{x}^1;x^0,x^2,x^3)) \end{aligned}$$

satisfies \(x^i\circ \delta =x^i\), \(i=0,2,3\). To do this, recall that, putting \(\gamma ^\mu :=x^\mu \circ \gamma \), we have \(\gamma ^0-(s+4 m\log s)=:\widetilde{\gamma }^0\in S^{-\alpha _0}\), \(\gamma ^1-\bar{x}^1=:\widetilde{\gamma }^1\in S^{-\alpha _1}\), and , so after some simplifications, our task becomes choosing \(\widetilde{\Phi }^i\) such that

$$\begin{aligned} \widetilde{\Phi }^0 = 4 m\log \bigl (1-4 m(x^0)^{-1}(\log x^0+\widetilde{\Phi }^0)\bigr )-\widetilde{\gamma }^0(\bar{x}^1;\Phi ), \quad \widetilde{\Phi }^a = -\widetilde{\gamma }^a(\bar{x}^1;\Phi ); \end{aligned}$$
(8.14)

this can be solved, first with \(\widetilde{\Phi }^0\in (x^0)^{-\alpha _0}\mathcal {C}^0\) etc. using the fixed point theorem, and then in symbol spaces using the smoothness of \(\widetilde{\Phi }^0\) (which follows from the implicit function theorem) and an iterative argument.

Let us drop \(x^0,x^2,x^3\) from the notation. The desired function u is then defined implicitly by \(u\circ \delta =\bar{x}^1\). Writing \(x^1(\delta (\bar{x}^1))=:\bar{x}^1+f\), where \(f\in S^{-\alpha _1}\) by Lemma 8.4, we see that \(\delta \) is one to one for large \(x^0\), as \(\bar{x}^1+f(\bar{x}^1)=\bar{y}^1+f(\bar{y}^1)\) implies \(0\ge |\bar{x}^1-\bar{y}^1|-C(x^0)^{-\alpha _1}|\bar{x}^1-\bar{y}^1|\), so \(\bar{x}^1=\bar{y}^1\) if \(x^0\) is large. Writing \(u=\bar{x}^1+\widetilde{u}\), we thus need to solve

$$\begin{aligned} (\bar{x}^1+\widetilde{u})+f(\bar{x}^1+\widetilde{u})=\bar{x}^1\ \ \Longleftrightarrow \ \ \widetilde{u} = -f(\bar{x}^1+\widetilde{u}), \end{aligned}$$

which by another application of the fixed point theorem has a solution \(\widetilde{u}\in S^{-\alpha _1}\). Lastly, note that the vector fields \(\partial _{x^i}\), \(i=1,2,3\), and \(x^0\partial _{x^0}\) span \(\mathcal {V}_{\text {b}}(M)\) near \((\mathscr {I}^+)^\circ \) in view of \(\rho _I=1/x^0\), hence \(S^{-\alpha _1}\subset \rho _I^{\alpha _1-0}H_{{\text {b}}}^\infty \) near \((\mathscr {I}^+)^\circ \). Since we can take \(\alpha _1\) arbitrarily close to \(b'_I\) by (8.10), the existence of u and the smoothness of \(C_{\bar{x}^1}\) follow.
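The fixed point step for \(\widetilde{u}\) can be sketched as follows, under the assumption (suggested by the smooth parameter dependence in Lemma 8.4, and already used in the injectivity argument) that f is Lipschitz in its argument with constant \(\theta :=C(x^0)^{-\alpha _1}<1\) for large \(x^0\). The iterates \(\widetilde{u}_j\) are introduced here only for illustration.

$$\begin{aligned} \widetilde{u}_0&:=0,\qquad \widetilde{u}_{j+1}:=-f(\bar{x}^1+\widetilde{u}_j), \\ |\widetilde{u}_{j+1}-\widetilde{u}_j|&\le \theta |\widetilde{u}_j-\widetilde{u}_{j-1}|\le \theta ^j|f(\bar{x}^1)|, \\ |\widetilde{u}|&\le \sum _{j\ge 0}\theta ^j|f(\bar{x}^1)|=(1-\theta )^{-1}|f(\bar{x}^1)|=\mathcal {O}((x^0)^{-\alpha _1}). \end{aligned}$$

This only gives the weighted sup norm bound; the symbolic bounds on derivatives would then be obtained by differentiating the equation \(\widetilde{u}=-f(\bar{x}^1+\widetilde{u})\).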

It remains to prove that \(C_{\bar{x}^1}\) is a null hypersurface. To this end, we sketch a different way of constructing \(C_{\bar{x}^1}\): let \(\bar{x}^0>0\), and consider the 2-sphere \(S_{\bar{x}^1\bar{x}^0}=\{x^0=\bar{x}^0,\ x^1=\bar{x}^1\}\). For sufficiently large \(\bar{x}^0\), \(S_{\bar{x}^1\bar{x}^0}\) is spacelike; hence, for any \(p\in S_{\bar{x}^1\bar{x}^0}\), there are precisely 4 rays of lightlike directions in \((T_p S_{\bar{x}^1\bar{x}^0})^\perp \), and there exists a unique \(v(p)\in (T_p S_{\bar{x}^1\bar{x}^0})^\perp \) which is future lightlike and outgoing (i.e. \(d r(v(p))>0\)), and for which \(v(p)^0=1+\frac{2 m}{r(p)}\). By writing out the condition \(g(v(p),\partial _a)=0\) using the form (3.14) of g, one obtains an expression for \(v(p)^a\) in terms of a small multiple of \(v(p)^1\) and certain metric coefficients, while using \(|v(p)|_g^2=0\) (and using the nonvanishing of \(g_{0 1}\)) gives an expression for \(v(p)^1\) in terms of a small multiple of \(v(p)^a\), plus certain metric coefficients. Solving this simple system, one finds that the components of v(p) satisfy \(v(p)^1=\mathcal {O}(r^{-1-b_I'})\) and \(v(p)^a=\mathcal {O}(r^{-2-b_I'})\); they are thus small when measured in the norm of X (restricted to a single point) in (8.3), cf. the upper bounds on the weights in (8.10).
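The count of four lightlike rays is linear algebra in the 2-dimensional Lorentzian plane \((T_p S_{\bar{x}^1\bar{x}^0})^\perp \). As a sketch, pick a basis \(e_0,e_1\) of this plane with \(g(e_0,e_0)=1\), \(g(e_1,e_1)=-1\), \(g(e_0,e_1)=0\), and \(e_0\) future-directed; the vectors \(e_0,e_1\) are auxiliary names introduced here, and the signature convention \((+,-,-,-)\) is the one suggested by the form of g used in this section.

$$\begin{aligned} |a e_0+b e_1|_g^2=a^2-b^2=0\ \ \Longleftrightarrow \ \ b=\pm a, \end{aligned}$$

so the nonzero lightlike vectors form the four rays \(\mathbb {R}_{>0}(\pm e_0\pm e_1)\). The two rays with \(a>0\) are future-directed; according to the text, the condition \(d r(v(p))>0\) selects one of them, and the normalization of \(v(p)^0\) then fixes the vector on that ray.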

A small modification of the fixed point argument in the proof of Proposition 8.2 shows that we can solve the geodesic equation with initial data v(p) in the backwards direction up to a fixed value of \(x^0\), say \(x^0=C\gg 1\); denote the union of these null-geodesic segments emanating from points on \(S_{\bar{x}^1\bar{x}^0}\) by \(C_{\bar{x}^1\bar{x}^0}\). Letting \(\bar{x}^0\rightarrow \infty \), it then follows that \(C_{\bar{x}^1\bar{x}^0}\) converges over every compact subset of \(\mathbb {R}^4\cap \{x^0>C\}\) to \(C_{\bar{x}^1}\) in the \(\mathcal {C}^1\) topology. By construction, every \(C_{\bar{x}^1\bar{x}^0}\) is a null hypersurface; thus, its \(\mathcal {C}^1\) limit \(C_{\bar{x}^1}\) is a null hypersurface as well. \(\square \)

The function u is uniquely defined by (8.12); thus, Lemma 8.5 shows the existence of a neighborhood

$$\begin{aligned} (\mathscr {I}^+)^\circ \subset U^+\subset M \end{aligned}$$
(8.15)

and a function \(u\in x^1+\rho _I^{b'_I-0}H_{{\text {b}},{\text {loc}}}^\infty (U^+)\) such that \(C_{\bar{x}^1}\cap U^+=\{u=\bar{x}^1\}\) for all \(\bar{x}^1\in \mathbb {R}\).

Remark 8.6

The weight in (8.12) is consistent with the choice of the domain (4.15) whose boundary component \(U_\epsilon ^\partial \) is spacelike, see (4.16).

Since \(|\nabla u|^2\equiv 0\) by construction, the vector field \(\nabla u\) consists of null-generators of its level sets \(C_u\); more precisely, we have \(\nabla _{\nabla u}\nabla u=0\), so restricted to the image of a radial null-geodesic \(\gamma _p\subset C_u\), we have \((\nabla u)|_{\gamma _p(s)}=c_p\dot{\gamma }_p(s)\) for some constant \(c_p\). Taking the inner product with \(\partial _1\) and using the form (3.14) of g yields \(1+\mathcal {O}(s^{-b'_I+0})=c_p(\tfrac{1}{2}+\mathcal {O}(s^{-1}))\), so letting \(s\rightarrow \infty \) gives \(c_p=2\) and thus

$$\begin{aligned} (\nabla u)|_{\gamma _p(s)} = 2\dot{\gamma }_p(s). \end{aligned}$$

We can then extract more information using \(r=\tfrac{1}{2}s+\mathcal {O}(\log s)\) and \(g_{0 1}=\tfrac{1}{2}+2 s^{-1}(h_{0 1}-m)+\mathcal {O}(s^{-2}\log s)\): Lemma 8.4 then gives \(2\langle \dot{\gamma }_p(s),\partial _1\rangle =1+4 s^{-1}h_{0 1}+\mathcal {O}(s^{-1-\alpha _0})\), so

$$\begin{aligned} \partial _1\widetilde{u}-2 r^{-1}h_{0 1}\in \rho _I^{1+b_I-0}H_{{\text {b}}}^\infty . \end{aligned}$$
(8.16)
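The passage from the inner product to (8.16) is a short computation. The following sketch assumes that only the term \(g_{0 1}\dot{\gamma }_p^0\) contributes at this order, the other components being absorbed in the remainder (as the decay rates in Lemma 8.4 suggest), and it uses \(\dot{\gamma }_p^0=1+4 m s^{-1}+\mathcal {O}(s^{-1-\alpha _0})\).

$$\begin{aligned} 2\langle \dot{\gamma }_p(s),\partial _1\rangle &=\bigl (1+4 s^{-1}(h_{0 1}-m)\bigr )\bigl (1+4 m s^{-1}\bigr )+\mathcal {O}(s^{-1-\alpha _0})=1+4 s^{-1}h_{0 1}+\mathcal {O}(s^{-1-\alpha _0}), \\ \partial _1\widetilde{u}&=\partial _1 u-1=2\langle \dot{\gamma }_p(s),\partial _1\rangle -1=4 s^{-1}h_{0 1}+\mathcal {O}(s^{-1-\alpha _0})=2 r^{-1}h_{0 1}+\mathcal {O}(s^{-1-\alpha _0}). \end{aligned}$$

The mass terms cancel in the first line, and the last equality uses \(r^{-1}=2 s^{-1}+\mathcal {O}(s^{-2}\log s)\).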

8.2 Bondi–Sachs coordinates; proof of the mass loss formula

The function u has nonvanishing differential everywhere on \(C_{\bar{x}^1}\) when \(x^0\) is large; we will use it as one coordinate of a Bondi–Sachs coordinate system \((u,\mathring{r},\mathring{x}^2,\mathring{x}^3)\), where the coordinates \(\mathring{r}\) and \(\mathring{x}^a\), \(a=2,3\), are geometrically defined and constructed below; with respect to such a coordinate system, the metric takes the form

$$\begin{aligned} g = g_{u u}\,d u^2 + 2 g_{u\mathring{r}}\,d u\,d\mathring{r} - \mathring{r}^2 q_{a b}(d\mathring{x}^a-\widetilde{U}^a\,d u)(d\mathring{x}^b-\widetilde{U}^b\,d u) \end{aligned}$$

for some \(g_{u u}\), \(g_{u \mathring{r}}\), \(q_{a b}\), and \(\widetilde{U}^a\), and quantities of geometric or physical interest such as the Bondi mass and the gravitational energy flux can be calculated in terms of certain lower order terms of these metric coefficients [12, 91]. We begin by defining \(\mathring{r}\). Introduce a projection \(\pi :U^+\rightarrow \mathbb {S}^2\) by

$$\begin{aligned} \pi (\gamma _{(\bar{x}^1,\theta )}(s)) := \theta ,\ \ \theta \in \mathbb {S}^2, \end{aligned}$$

which is well-defined due to Lemma 8.5; in fact, in the notation of its proof, using local coordinates \(x^a\), \(a=2,3\), on \(\mathbb {S}^2\), we have

$$\begin{aligned} \pi (x^0,x^1,x^2,x^3)=(\Phi ^a(x^1+\widetilde{u};x^0,x^2,x^3))_{a=2,3}, \end{aligned}$$
(8.17)

which in particular gives

(8.18)

The map \(\pi \) defines a fibration of every \(C_u\); these fibrations have natural sections, as we proceed to explain invariantly. Let \(N:=\ker \pi _*\) denote the subbundle (smooth in \(M^\circ \)) consisting of vectors tangent to the fibers of \(\pi \): this is the bundle of null generators of the null hypersurfaces \(C_u\), and therefore \(N \perp T C_u\). This implies that the spacetime metric g restricts to an element

$$\begin{aligned}{}[g]\in S^2(T C_u/N)^*. \end{aligned}$$
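That g descends to the quotient can be verified directly. In this sketch, \(X,Y\in T C_u\) and \(n,n'\in N\) are arbitrary (the names are introduced here), and the only inputs are \(N\subset T C_u\) and \(N\perp T C_u\).

$$\begin{aligned} g(X+n,Y+n')=g(X,Y)+g(X,n')+g(n,Y)+g(n,n')=g(X,Y), \end{aligned}$$

since each of the last three terms pairs an element of N with an element of \(T C_u\). Hence \(g(X,Y)\) depends only on the classes of X and Y in \(T C_u/N\).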

On the other hand, the pull-back induces a Riemannian metric on \(T C_u/N\), i.e. an isomorphism \(T C_u/N\rightarrow (T C_u/N)^*\), hence is well-defined. We then define the area radius \(\mathring{r}\) by the formula

Lemma 8.7

We have \(\mathring{r}-r\in \rho _I^{b'_I-0}H_{{\text {b}}}^\infty \) and \(\partial _0\mathring{r}=\tfrac{1}{2}-m r^{-1}+\rho _I^{1+b'_I-0}H_{{\text {b}}}^\infty \) near \((\mathscr {I}^+)^\circ \).

Proof

It suffices to prove the first claim. We start by finding representatives in \(T C_u\) of a basis of \(T C_u/N\) by considering the vector fields

$$\begin{aligned} V_a=f_a\partial _1+\partial _a,\ \ a=2,3, \end{aligned}$$
(8.19)

with \(f_a\) to be determined. Working over the image of a fixed geodesic \(\gamma _p:[s_0,\infty )\rightarrow M\), we use and the form of g to calculate

demanding this to vanish determines . Since is arbitrary, we conclude that

(8.20)

while the observation (8.18) implies that \(\pi _*(V_a)\in \partial _a+C_a^b\partial _b\), , hence

(8.21)

Therefore,

which is equal to \(r^4(1+\mathcal {O}(s^{-1-b'_I+0}))\) due to the decay of at \(\mathscr {I}^+\) coming from the membership \(h\in \mathcal {X}^{\infty ;b_0,b_I,b'_I,b_+}\), i.e. ultimately from the gauge condition. Taking fourth roots, carrying symbolic behavior in s through the argument, and noting that these calculations depend smoothly on the parameter \(p\in (\mathscr {I}^+)^\circ \) completes the proof. \(\square \)

Corollary 8.8

Define the punctured neighborhood \(\dot{U}^+:=U^+ \backslash (\mathscr {I}^+)^\circ \) of \((\mathscr {I}^+)^\circ \), see (8.15). Then if \(U^+\) is a sufficiently small neighborhood, \((u,\mathring{r},\pi ):\dot{U}^+\rightarrow \mathbb {R}\times \mathbb {R}\times \mathbb {S}^2\) is a coordinate system on \(\dot{U}^+\).

Proof

This follows from Lemma 8.7 and the asymptotics of u and \(\pi \) in (8.12) and (8.18). \(\square \)

Choosing local coordinates \(x^a\) on \(\mathbb {S}^2\) and letting \(\mathring{x}^a:=x^a\circ \pi =x^a+\rho _I^{1+b'_I-0}H_{{\text {b}}}^\infty \), we can introduce the Bondi–Sachs coordinates

$$\begin{aligned} (u,\mathring{r},\mathring{x}^2,\mathring{x}^3) \end{aligned}$$
(8.22)

on U; the metric g and its dual \(G=g^{-1}\) simplify in this coordinate system since, by construction,

$$\begin{aligned} G(d u,d u)\equiv 0, \quad G(d u,d\mathring{x}^a)=(\nabla u)(\mathring{x}^a)\equiv 0. \end{aligned}$$
(8.23)

Furthermore, using (8.16) and Lemma 8.7,

(8.24)

where the leading term in the first expression comes from \(g^{0 1}(\partial _1 u)(\partial _0\mathring{r})\). In order to calculate \(G(d\mathring{r},d\mathring{r})\) to the same level of precision, we need to sharpen Lemma 8.7.

Lemma 8.9

Near \((\mathscr {I}^+)^\circ \), we have

Note that in (8.20), we already control \(g(V_a,V_b)\) modulo terms more than two orders beyond the leading term, which suffices for present purposes. On the other hand, the remainder term in (8.21) is not precise enough.

Proof of Lemma 8.9

Put , so \((\mathring{r}/r)^4=\det A\), and Lemma 8.7 gives \(A_a^b=\delta _a^b-r^{-1}h_{\bar{a}}{}^{\bar{b}}+\rho _I^{1+b'_I-0}H_{{\text {b}}}^\infty \) and \((\det A)-1\in \rho _I^{1+b'_I-0}H_{{\text {b}}}^\infty \). Suppose now that

$$\begin{aligned} \partial _1(\det A)=r^{-2}\mu +o(r^{-2}), \end{aligned}$$
(8.25)

then \(\partial _1((\mathring{r}-r)/r)=\tfrac{1}{4}(\det A)^{-3/4}\partial _1(\det A)=\tfrac{1}{4} r^{-2}\mu +o(r^{-2})\), so expanding the left hand side as \(r^{-1}(\partial _1\mathring{r}+\tfrac{1}{2}-m r^{-1})+o(r^{-2})\) implies that

$$\begin{aligned} \partial _1\mathring{r} = -\tfrac{1}{2}+ r^{-1}(m+\tfrac{1}{4} \mu ) + o(r^{-1}). \end{aligned}$$
(8.26)
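The step from (8.25) to (8.26) can be written out as follows. This is a sketch assuming \(\partial _1 r=-\tfrac{1}{2}+m r^{-1}\), the value implicit in the expansion quoted above, together with \(\mathring{r}/r=(\det A)^{1/4}=1+o(r^{-1})\).

$$\begin{aligned} \partial _1\Bigl (\frac{\mathring{r}-r}{r}\Bigr )&=r^{-1}\Bigl (\partial _1\mathring{r}-\frac{\mathring{r}}{r}\,\partial _1 r\Bigr )=r^{-1}\bigl (\partial _1\mathring{r}+\tfrac{1}{2}-m r^{-1}\bigr )+o(r^{-2}), \\ r^{-1}\bigl (\partial _1\mathring{r}+\tfrac{1}{2}-m r^{-1}\bigr )&=\tfrac{1}{4}r^{-2}\mu +o(r^{-2})\ \ \Longrightarrow \ \ \partial _1\mathring{r}=-\tfrac{1}{2}+r^{-1}\bigl (m+\tfrac{1}{4}\mu \bigr )+o(r^{-1}). \end{aligned}$$

The error in the first line is \(r^{-1}(\mathring{r}/r-1)\partial _1 r\), which is \(o(r^{-2})\) because \(\partial _1 r\) is bounded.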

Our calculations will imply that the \(o(r^{-1})\) remainder is of size \(\mathcal {O}(r^{-1-b_I+0})\), but we shall stick to \(o(r^{-1})\) etc. for brevity. Trivializing \(T C_u/N\) locally using the frame \(\{V_a:a=2,3\}\), with \(V_a\) defined in (8.19), A becomes a \(2\times 2\) matrix-valued function. We can thus use the formula \(\partial _1(\det A)=(\det A){\text {tr}}(A^{-1}\partial _1 A)\), so it suffices to determine the function \(\mu \) in \({\text {tr}}(A^{-1}\partial _1 A)=r^{-2}\mu +o(r^{-2})\). One contribution comes from differentiating \([r^{-2}g]\), which by (8.20) and \(\Upsilon (g)_1=0\) yields

(8.27)

The remaining contribution to \({\text {tr}}(A^{-1}\partial _1 A)\) is (using the cyclicity of the trace). Let us work near a point \(z_0\in \mathbb {R}^4\), and suppose \(x^2,x^3\) are normal coordinates on \(\mathbb {S}^2\) centered at the point \(\pi (z_0)\). Then

Now \((\pi _*V_a)^c=\delta _a^c+\mathcal {O}(r^{-1-b'_I+0})\), whose derivative along \(\partial _1\) is of size \(\mathcal {O}(r^{-1-b'_I+0})\), so

(8.28)

Let us first calculate the contribution to this coming from the term \(\partial _a\) in \(V_a\). By (8.17) and recalling the form of the map \(\Phi \) from (8.13) as well as its defining relation (8.14), we have

$$\begin{aligned} \partial _1(\pi _*\partial _a)^b&= \partial _1\partial _a\widetilde{\Phi }^b(x^1+\widetilde{u}; x^0,x^2,x^3) \nonumber \\&= -\partial _1\partial _a\widetilde{\gamma }^b(x^1+\widetilde{u};x^0-4 m\log x^0+\widetilde{\Phi }^0,x^2+\widetilde{\Phi }^2,x^3+\widetilde{\Phi }^3); \end{aligned}$$
(8.29)

now \(\widetilde{\gamma }^b\), its \(x^c\)-derivatives (\(c=2,3\)), and \(\widetilde{\Phi }^b\) are of size \(\mathcal {O}((x^0)^{-1-b'_I+0})\), so dropping \(\widetilde{\Phi }^2\) and \(\widetilde{\Phi }^3\) gives an \(o(r^{-2})\) error; likewise, \(\partial _{x^0}\widetilde{\gamma }^b=\mathcal {O}(r^{-2-b'_I+0})\), so replacing the second argument by \(x^0\) gives another \(o(r^{-2})\) error.

To analyze this further, we need to digress: consider the 1-parameter family \(w(s;\epsilon ):=\gamma _{(x^1+\epsilon ,x^2,x^3)}(s)\) of null-geodesics, with \(x^2,x^3\) fixed, and let

$$\begin{aligned} Y(s):=\partial _\epsilon w(s;0)\equiv \partial _1\gamma _{(x^1,x^2,x^3)}(s) \end{aligned}$$

denote the Jacobi field along \(\gamma (s):=w(s;0)\). The asymptotics proved in Proposition 8.2 give the a priori information

$$\begin{aligned} Y(s)&= \mathcal {O}(s^{-b_I+0})\partial _0 + (1+\mathcal {O}(s^{-b'_I+0}))\partial _1 + \sum _c \mathcal {O}(s^{-1-b'_I+0})\partial _c, \nonumber \\ \partial _s Y(s)&= \mathcal {O}(s^{-1-b_I+0})\partial _0 + \mathcal {O}(s^{-1-b'_I+0})\partial _1 + \sum _c \mathcal {O}(s^{-2-b'_I+0})\partial _c. \end{aligned}$$
(8.30)

We shall determine the component \(Y(s)^b\) by solving the Jacobi equation

$$\begin{aligned} \bigl (\nabla _{\dot{\gamma }}\nabla _{\dot{\gamma }}Y(s) + R(Y,\dot{\gamma })\dot{\gamma }\bigr )^b = 0. \end{aligned}$$
(8.31)

Heuristically, it suffices to calculate this modulo \(o(s^{-4})\) errors, as the second integral of such error terms (integrating from infinity) is \(o(s^{-2})\); we will verify this heuristic in the course of our calculations. Using \(\dot{\gamma }^0=1+\mathcal {O}(s^{-1})\), \(\dot{\gamma }^1=\mathcal {O}(s^{-1-b'_I+0})\), \(\dot{\gamma }^c=\mathcal {O}(s^{-2-b'_I+0})\), the a priori information (8.30), and the expressions for the curvature tensor in (A.7), one finds

$$\begin{aligned} (R(Y,\dot{\gamma })\dot{\gamma })^b = R^b{}_{\lambda \mu \nu }\dot{\gamma }^\lambda Y^\mu \dot{\gamma }^\nu = -R^b{}_{0 0 1}(\dot{\gamma }^0)^2 Y^1 - R^b{}_{0 0 a}(\dot{\gamma }^0)^2 Y^a + o(s^{-4}). \end{aligned}$$

Now, using the gauge condition \(\Upsilon (g)_0=0\) and the expressions for Christoffel symbols given in (A.3), one finds that in fact \(R^b{}_{0 0 a}=o(s^{-3})\), rendering the second term size \(o(s^{-4})\). Let us calculate \(R^b{}_{0 0 1}=\partial _0\Gamma ^b_{0 1}-\partial _1\Gamma _{0 0}^b+\Gamma _{0 1}^\mu \Gamma _{0 \mu }^b-\Gamma _{0 0}^\mu \Gamma _{\mu 1}^b\) more accurately than in (A.7). In the third term, the only contribution which is not \(o(r^{-4})\) comes from \(\mu =2,3\), giving ; the fourth term is \(o(r^{-4})\). For the second term, we use

exploiting \(\Upsilon (g)_0=0\). In view of the leading order vanishing of \(h_0{}^{\bar{b}}\) and \(h_{0 0}\) at \(\mathscr {I}^+\), we have ; now \(\partial _1 h_0{}^{\bar{b}}\) can be rewritten, using \(\Upsilon (g)_b=0\), in terms of \(h_{0 1}\), \(h_{\bar{b}\bar{c}}\), and \(h_{1\bar{b}}\); since these have (size 1) leading terms at \(\mathscr {I}^+\), subsequent differentiation along \(\partial _0\) only produces nontrivial terms (i.e. not of size \(o(r^{-4})\)) when acting on the r-weights. On the other hand, \(\partial _1 h_{0 0}=-r^{-1}h_{0 1}+o(r^{-1})\) from \(\Upsilon (g)_0=0\). Arguing similarly for the computation of \(\partial _0\Gamma ^b_{0 1}\), one ultimately finds that all nontrivial terms cancel, so

$$\begin{aligned} R^b{}_{0 0 1} = o(r^{-4}). \end{aligned}$$

Thus, the curvature term of the Jacobi equation (8.31) is simply of size \(o(s^{-4})\). Regarding the first term of (8.31), the information (8.30) and a brief calculation give \((\nabla _{\dot{\gamma }}Y)^0=\mathcal {O}(s^{-1-b_I+0})\), \((\nabla _{\dot{\gamma }}Y)^1 = \mathcal {O}(s^{-1-b'_I+0})\), and, using \(r^{-1}=2 s^{-1}+\mathcal {O}(s^{-2}\log s)\),

with nontrivial contributions only from \((\mu ,\lambda )=(0,1)\), (0, c). In particular, \(\nabla _{\dot{\gamma }}Y\) satisfies the same rough asymptotics as \(\partial _s Y\) in (8.30). Since differentiation of \(h^{\bar{b}\bar{d}}\) and \(h_1{}^{\bar{b}}\) along \(\dot{\gamma }\) gains a weight \(s^{1+b_I}\) due to these components having a leading term, this and (8.31) imply

where

is the value of this combination of metric coefficients at \(\gamma (\infty )\in \mathscr {I}^+\). Since \(\lim _{s\rightarrow \infty }s^2\partial _s Y^b=0\) due to (8.30), we find \(\partial _s Y^b=-2\widetilde{\mu } s^{-3}+o(s^{-3})\) and thus

$$\begin{aligned} Y^b=\widetilde{\mu } s^{-2}+o(s^{-2}) \end{aligned}$$
(8.32)

since \(\lim _{s\rightarrow \infty } Y^b=0\).
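Both integrations from infinity used in this argument can be checked explicitly. In the following sketch, \(\sigma \) and \(\tau \) are dummy integration variables introduced here.

$$\begin{aligned} Y^b(s)&=-\int _s^\infty \partial _\sigma Y^b(\sigma )\,d\sigma =\int _s^\infty \bigl (2\widetilde{\mu }\sigma ^{-3}+o(\sigma ^{-3})\bigr )\,d\sigma =\widetilde{\mu } s^{-2}+o(s^{-2}), \\ \int _s^\infty \int _\sigma ^\infty \tau ^{-4}\,d\tau \,d\sigma &=\int _s^\infty \tfrac{1}{3}\sigma ^{-3}\,d\sigma =\tfrac{1}{6}s^{-2}. \end{aligned}$$

The first line uses \(\lim _{s\rightarrow \infty }Y^b=0\); the second shows that an \(o(s^{-4})\) error in the Jacobi equation contributes \(o(s^{-2})\) after two integrations, in line with the heuristic stated before the curvature computation.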

Returning to the expression (8.29), dropping \(\widetilde{u}\) gives an \(\mathcal {O}(r^{-2-b'_I+0})\) error term by Lemma 8.5; we thus conclude that

(8.33)

We have another term in (8.28) coming from the term \(f_a\partial _1\) in \(V_a\); but \(f_a\) and its derivative along \(x^1\) being of size \(\mathcal {O}(r^{-b'_I+0})\) (see the proof of Lemma 8.7), it suffices to show that \((\pi _*\partial _1)^c=\mathcal {O}(r^{-2})\) in order to conclude that \(\partial _1(\pi _*(f_a\partial _1))^c=o(r^{-2})\) is a lower order term. But we can simplify \((\pi _*\partial _1)^c|_{(x^0,x^1,x^2,x^3)}=\partial _1\Phi ^c=-\partial _1\widetilde{\gamma }^c(x^1;x^0,x^2,x^3)+o(r^{-2})=\mathcal {O}(r^{-2})\) (using (8.32)) in the same manner as we simplified (8.29).

Finally then, plugging (8.33) into (8.28), and adding the result to (8.27) yields (8.25) for

which by (8.26) proves the lemma. \(\square \)

We can also compute \(\partial _1\mathring{x}^b=\partial _1\pi ^b\) modulo \(o(r^{-2})\), as this is given by the component \(Y^b\) of the Jacobi vector field of the proof of Lemma 8.9, so . In summary, we have shown that

(8.34)

where the remainders are in fact more precise: \(o(r^{-k})\) can be replaced by \(\rho _I^{k+b_I-0}H_{{\text {b}}}^\infty \) near \((\mathscr {I}^+)^\circ \), so a fortiori by \(\mathcal {O}(r^{-k-b_I+0})\). We can now supplement (8.23)–(8.24) by

(8.35)

(Note that in the first line, the logarithmically divergent terms \(h_{1 1}\) from \(g^{0 0}(\partial _0\mathring{r})^2\) and \(g^{1 1}(\partial _1\mathring{r})^2\) cancel.) Let us summarize the calculations (8.23)–(8.24) and (8.35):

Proposition 8.10

In the Bondi–Sachs coordinates (8.22), the dual metric \(G=g^{-1}\) is

where . The metric g itself takes the form

The \(o(\mathring{r}^{-k})\) remainders can be replaced by \(\rho _I^{k+b_I-0}H_{{\text {b}}}^\infty =\mathcal {O}(r^{-k-b_I+0})\) near \((\mathscr {I}^+)^\circ \). Furthermore, the coordinate vector fields satisfy

(8.36)

Proof

The statement (8.36) on the dual basis of (8.34) follows by matrix inversion. \(\square \)

Remark 8.11

For comparison, the Bondi–Sachs coordinates on Schwarzschild are simply \(u=x^1\), \(\mathring{r}=r\), and spherical coordinates \(\mathring{x}^a=x^a\), and the metric takes the form

Remark 8.12

Near \((\mathscr {I}^+)^\circ \) and relative to the smooth structure on M, the conformally rescaled metric \(r^{-2}g\) is singular as an incomplete metric at \(\mathscr {I}^+\): indeed, \(r^2\partial _0\) is a nonzero multiple of \(\partial _{\rho _I}\) by (2.26), and \((r^{-2}g)(r^2\partial _0,r^2\partial _0)=r h_{0 0}=\mathcal {O}(\rho _I^{-1+b'_I})\). On the other hand, changing the smooth structure of M near \((\mathscr {I}^+)^\circ \) by declaring \((\mathring{r}^{-1},u,\mathring{x}^2,\mathring{x}^3)\) to be a smooth coordinate system, so \(\mathring{\rho }_I:=\mathring{r}^{-1}\) is a defining function of \(\mathscr {I}^+\), we have \(\mathring{r}^{-2} g\in \mathcal {C}^{1,b_I-0}\). Indeed, \(\partial _{\mathring{\rho }_I}=-\mathring{r}^2\partial _{\mathring{r}}\) is null, while \((\mathring{r}^{-2}g)(\partial _{\mathring{\rho }_I},\partial _u)=1+\mathcal {O}(\mathring{\rho }_I^{1+b_I-0})\) is \(\mathcal {C}^{1,b_I-0}\), and the remaining metric coefficients have at least this amount of regularity. Since by Theorem 6.3 one can take \(b_I\) arbitrarily close to \(\min (b_0,1)\), this gives

$$\begin{aligned} \mathring{r}^{-2}g \in \mathcal {C}^{1,\alpha }\quad \forall \,\alpha <\min (b_0,1), \end{aligned}$$
(8.37)

relative to the new smooth structure. As mentioned in §1.3, smoothness properties of conformal compactifications have been widely discussed, in particular from the point of view of asymptotic simplicity [94] and the decay properties of the curvature tensor [24, 67]; see also [45] for further references. Whether or not there exists a compactification with smooth (or at least highly regular) \(\mathscr {I}^+\), meaning that the conformally rescaled metric extends smoothly and nondegenerately across \(\mathscr {I}^+\), is a delicate issue as it depends very sensitively on the precise choice of the conformal factor and the smooth structure near \(\mathscr {I}^+\) and requires the identification of at least two ‘incommensurable’ geometric quantities (see footnote 42). The observation (8.37) shows that this cannot happen prior to the next-to-leading order terms in the expansion of g at \(\mathscr {I}^+\). Work by Christodoulou [24] on the other hand (see also [35, §1.5.3]) strongly suggests that the conformal compactification is generically at most of class \(\mathcal {C}^{1,\alpha }\).

Therefore, the mass aspect, see [91, Equation (37)], is \(-\tfrac{1}{2}\) times the \(\mathring{r}^{-1}\) coefficient of the \(d u^2\) component,

(8.38)

and the Bondi mass is

(8.39)

where we exploited that the divergence in the expression (8.38) integrates to zero.

Remark 8.13

Recall that near \((\mathscr {I}^+)^\circ \), \(h_{1 1}\) can be written as \(h_{1 1}^{(1)}\log \rho _I+h_{1 1}^{(0)}+\rho _I^{b_I}H_{{\text {b}}}^\infty \), with \(h_{1 1}^{(j)}\in \mathcal {C}^\infty ((\mathscr {I}^+)^\circ )\), \(j=0,1\), so \(r\partial _0 h_{1 1}|_{\mathscr {I}^+}=-\tfrac{1}{2}h_{1 1}^{(1)}\) picks out the logarithmic term.
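
The identity in this remark amounts to a one-line computation, sketched here under two assumptions: that \(r\partial _0\) acts as \(-\tfrac{1}{2}\rho _I\partial _{\rho _I}\) to leading order at \(\mathscr {I}^+\) (which is what the stated identity encodes), and that the remainder in \(\rho _I^{b_I}H_{{\text {b}}}^\infty \), with \(b_I>0\), remains \(\mathcal {O}(\rho _I^{b_I})\) after applying \(\rho _I\partial _{\rho _I}\):

$$\begin{aligned} \rho _I\partial _{\rho _I}\bigl (h_{1 1}^{(1)}\log \rho _I+h_{1 1}^{(0)}\bigr ) = h_{1 1}^{(1)}, \qquad \text {hence}\qquad r\partial _0 h_{1 1}|_{\mathscr {I}^+} = -\tfrac{1}{2}\bigl (\rho _I\partial _{\rho _I}h_{1 1}\bigr )|_{\mathscr {I}^+} = -\tfrac{1}{2}h_{1 1}^{(1)}. \end{aligned}$$

Here \(h_{1 1}^{(0)}\) is annihilated because it does not depend on \(\rho _I\), and the remainder term vanishes at \(\rho _I=0\).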

Theorem 8.14

The Bondi mass (8.39) satisfies the mass loss formula

(8.40)

Moreover, \(M_{\mathrm{B}}(-\infty )=m\) is the ADM mass of the initial data, while \(M_{\mathrm{B}}(+\infty )=0\).

Proof

The formula (8.40) is an immediate consequence of Lemma 3.5, and \(M_{\mathrm{B}}(-\infty )=m\) follows from the fact that \(r\partial _0 h_{1 1}\in \rho _0^{b_0}\rho _+^{b_+}H_{{\text {b}}}^\infty (\mathscr {I}^+)\) decays to 0 as \(\rho _0\rightarrow 0\).

Let us fix the boundary defining function \(\rho \) to be equal to \(r^{-1}\) near \(\mathscr {I}^+\), and fix \(\rho _I\) and \(\rho _+\) near \(I^+\) so that \(\rho _I\rho _+=\rho \). In order to prove \(M_{\mathrm{B}}(+\infty )=0\), we analyze the equation satisfied by \(h^+:=h|_{I^+}\). The existence of this leading term was proved in §7 starting with equation (7.16) (in which we do not use constraint damping); that is, restricting that equation to \(I^+\) and using the Mellin-transformed normal operators \(\widehat{L_0}(0)=\widehat{\underline{L}{}}(0)\in \rho _I^{-1}\text {Diff}_{\text {b}}^2(I^+)\) at frequency 0 (so this is the action of \(L_0\) on 2-tensors smooth down to \(I^+\) followed by restriction to \(I^+\)), we have

$$\begin{aligned} \widehat{\underline{L}{}}(0) h^+ = -P(0)|_{I^+} = -\rho ^{-3}\text {Ric}(g_m)|_{I^+}. \end{aligned}$$
(8.41)

Moreover, \(h^+_{1 1}\) has a logarithmic leading order term \(h^+_\ell \log \rho _I\),

$$\begin{aligned} h^+-h^+_\ell \log \rho _I\,(d x^1)^2 \in \mathcal {C}^\infty (I^+)+\rho _I^{b_I}H_{{\text {b}}}^\infty (I^+)\subset \bar{H}^{1/2+b_I-0}(I^+), \end{aligned}$$
(8.42)

where \(h^+_\ell =(\rho _I\partial _{\rho _I}h_{1 1})|_{\partial I^+}=(-2 r\partial _0 h_{1 1})|_{\partial I^+}\), so by Lemma 3.5

$$\begin{aligned} h_\ell ^+(\theta ) = \frac{1}{4} \int _{\beta ^{-1}(\theta )} |N|^2\,d x^1,\ \ \theta \in \partial I^+. \end{aligned}$$

Since \(\widehat{\underline{L}{}}(0)\) is injective on \(\bar{H}^{1/2+0}(I^+)\), the tensor \(h^+\) on \(I^+\) is uniquely determined by equation (8.41) and the ‘boundary condition’ (8.42). The strategy is to evaluate \(h^+_{0 0}|_{\partial I^+}\) in two ways: on the one hand, this quantity vanishes identically by construction of the metric h in our DeTurck gauge; on the other hand, we will show that solving (8.41) directly yields the relation

(8.43)

which thus gives the desired conclusion. For the proof of (8.43), let us split \(h^+=h'+h''\), where

$$\begin{aligned} h'\in \mathcal {C}^\infty (I^+;S^2\,{}^{{\text {sc}}}T^*_{I^+}\overline{\mathbb {R}^4}),\ \ h''\in h^+_\ell \log \rho _I\,(d x^1)^2 + \bar{H}^{1/2+0}(I^+;S^2\,{}^{{\text {sc}}}T^*_{I^+}\overline{\mathbb {R}^4}) \end{aligned}$$
(8.44)

are the unique solutions with these properties solving the equations

$$\begin{aligned} \widehat{\underline{L}{}}(0)h'&= -P(0)|_{I^+}, \end{aligned}$$
(8.45)
$$\begin{aligned} \widehat{\underline{L}{}}(0)h''&= 0; \end{aligned}$$
(8.46)

the first equation is uniquely solvable in this regularity class due to \(P(0)\in \mathcal {C}^\infty _{\text {c}}((I^+)^\circ )\). We first solve (8.46) with the boundary condition (8.44), to the extent that we can determine \(h''_{0 0}\). This can be viewed as a calculation of (a part of) the ‘scattering matrix’ of the operator \(\widehat{\underline{L}{}}(0)\) on \(I^+\) (see Footnote 43), which can be done explicitly: writing points in \(I^+\) using spherical coordinates as \(Z=R\omega \in \mathbb {R}^3\), \(R=r/t\in [0,1]\), \(\omega \in \mathbb {S}^2\), we have

acting component-wise on the coordinate trivialization of \({}^{{\text {sc}}}T^*\overline{\mathbb {R}^4}\); see (4.62) and (4.65). Since \(\widehat{\underline{L}{}}(0)\) is SO(3)-invariant, it suffices to calculate \(u_{0 0}|_{\partial I^+}\) for the solution of \(\widehat{\underline{L}{}}(0)u=0\) for which \(u-c\log \rho _I(d x^1)^2\in \bar{H}^{1/2+0}(I^+)\); recall that c was defined in (8.43). Now at \(I^+\),

$$\begin{aligned} (d x^1)^2 = d t^2 - 2\tfrac{Z^i}{|Z|}\,d t\,d x_i + \tfrac{Z^i Z^j}{|Z|^2}\,d x_i\,d x_j, \end{aligned}$$
(8.47)

where we write \(x_i\) for the Euclidean coordinates on \(\mathbb {R}^3\); observe then that if \(Y_\ell \in \mathcal {C}^\infty (\mathbb {S}^2)\), \(\ell =0,1,2\), denotes a spherical harmonic of degree \(\ell \), then \(\widehat{\underline{L}{}}(0)\bigl (u_\ell (R) Y_\ell (\omega )\bigr ) = 0\) holds for

$$\begin{aligned} u_0=R^{-1}\log \bigl (\tfrac{1-R}{1+R}\bigr ),\ \ u_1=R^{-2}\log \bigl (\tfrac{1-R}{1+R}\bigr )+2 R^{-1},\ \ u_2=\tfrac{3-R^2}{2 R^3}\log \bigl (\tfrac{1-R}{1+R}\bigr )+3 R^{-2}; \end{aligned}$$
(8.48)

Taylor expanding at \(R=0\), one sees that \(R^{-\ell }u_\ell \) is a smooth function of \(R^2\), hence \(u_\ell Y_\ell \) is smooth there; moreover, \(u_\ell \) satisfies the boundary condition \(u_\ell -\log \rho _I=\mathcal {O}(1)\), \(\rho _I=1-R\), at \(R=1\). (In fact, \(u_\ell \) is the unique solution with these two properties). Using (8.47), we find \(h''=c\cdot \bigl (u_0\,d t^2-2 u_1\,d t\,d r+u_2\,d r^2\bigr )\), where \(d r=\frac{Z^i}{|Z|}d x_i\), so writing \(d t=(d x^0+d x^1)/2\) and \(d r=(d x^0-d x^1)/2\) near \(\partial I^+\) within \(I^+\), this gives

$$\begin{aligned} h''_{0 0}|_{\partial I^+} = c\cdot \bigl (\tfrac{1}{4} u_0 - \tfrac{1}{2} u_1 + \tfrac{1}{4} u_2\bigr )\big |_{R=1} = -\tfrac{1}{4} c. \end{aligned}$$
(8.49)
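
The value \(-\tfrac{1}{4}\) in (8.49) can be checked by hand. The following is a sketch of that computation; the abbreviation \(\Lambda (R):=\log \bigl (\tfrac{1-R}{1+R}\bigr )\) is introduced here only for brevity. Since \(d t=(d x^0+d x^1)/2\) and \(d r=(d x^0-d x^1)/2\), each of \(d t^2\), \(d t\,d r\) and \(d r^2\) has \((0,0)\)-component \(\tfrac{1}{4}\), which explains the coefficients \(\tfrac{1}{4}\), \(-\tfrac{1}{2}\), \(\tfrac{1}{4}\). Inserting (8.48) and collecting the logarithmic terms gives

$$\begin{aligned} \tfrac{1}{4}u_0-\tfrac{1}{2}u_1+\tfrac{1}{4}u_2&= \Bigl (\frac{1}{4 R}-\frac{1}{2 R^2}+\frac{3-R^2}{8 R^3}\Bigr )\Lambda (R) - \frac{1}{R} + \frac{3}{4 R^2} \\&= \frac{(1-R)(3-R)}{8 R^3}\,\Lambda (R) - \frac{1}{R} + \frac{3}{4 R^2}. \end{aligned}$$

As \(R\rightarrow 1\), the product \((1-R)\Lambda (R)\) tends to 0, so the logarithmic term drops out and the limit is \(-1+\tfrac{3}{4}=-\tfrac{1}{4}\). The smoothness claim at \(R=0\) can be checked in the same elementary way: using \(\Lambda (R)=-2\bigl (R+\tfrac{1}{3}R^3+\tfrac{1}{5}R^5+\cdots \bigr )\), one finds

$$\begin{aligned} u_0=-2-\tfrac{2}{3}R^2+\mathcal {O}(R^4),\quad R^{-1}u_1=-\tfrac{2}{3}-\tfrac{2}{5}R^2+\mathcal {O}(R^4),\quad R^{-2}u_2=-\tfrac{4}{15}-\tfrac{8}{35}R^2+\mathcal {O}(R^4). \end{aligned}$$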

In order to solve (8.45), note first that the map \(\underline{h}{}\in \mathcal {C}^\infty (I^+)\mapsto \rho ^{-3}\text {Ric}(\underline{g}{}+\rho \underline{h}{})|_{I^+}\) is linear in \(\underline{h}{}\) (see Footnote 44), hence writing \(g_m=:\underline{g}{}+\rho \underline{h}{}\), we have

$$\begin{aligned} P(0)|_{I^+}=\rho ^{-3}(\text {Ric}(\underline{g}{}+\rho \underline{h}{})-\text {Ric}(\underline{g}{}))|_{I^+} = \widehat{\underline{L}{}}(0)\underline{h}{} - \rho ^{-3}\delta ^*_{\underline{g}{}}\delta _{\underline{g}{}}G_{\underline{g}{}}\rho \underline{h}{}; \end{aligned}$$

for later use, we note that in a neighborhood of \(\partial I^+\) in \(I^+\),

$$\begin{aligned} \underline{h}{}=-2 m \rho ^{-1}r^{-1}(d t^2+d r^2)=-m((d x^0)^2+(d x^1)^2). \end{aligned}$$
(8.50)

This suggests writing \(\rho h'\) as the sum of \(-\rho \underline{h}{}\) (to solve away the first term) and a pure gauge term, so we make the ansatz

$$\begin{aligned} h'=-\underline{h}{}+\rho ^{-1}\delta ^*_{\underline{g}{}}\omega +\widetilde{h}', \end{aligned}$$
(8.51)

where \(\omega \in \mathcal {C}^\infty ((I^+)^\circ ;{}^{{\text {sc}}}T^*_{I^+}\overline{\mathbb {R}^4})\) solves (see Footnote 45)

$$\begin{aligned} \rho ^{-2}\delta _{\underline{g}{}}G_{\underline{g}{}}\delta _{\underline{g}{}}^*\omega = \vartheta := \rho ^{-2}\delta _{\underline{g}{}}G_{\underline{g}{}}(\rho \underline{h}{}) \in \mathcal {C}^\infty (I^+;{}^{{\text {sc}}}T_{I^+}^*\overline{\mathbb {R}^4}), \end{aligned}$$
(8.52)

and \(\widetilde{h}'\) is a solution of \(\widehat{\underline{L}{}}(0)\widetilde{h}'=0\) which we will use to solve away any singular terms. We compute \(\vartheta \) to leading order at \(\partial I^+\) by using \(r^2\delta _{\underline{g}{}}r^{-1} d t^2=0\) and \(r^2\delta _{\underline{g}{}}r^{-1} d r^2=d r\), so

$$\begin{aligned} \vartheta = -2 m\,d r = -m\,d x^0+m\,d x^1 + \rho _I\,\mathcal {C}^\infty (I^+). \end{aligned}$$

Write \(\rho ^{-2}\delta _{\underline{g}{}}G_{\underline{g}{}}\delta _{\underline{g}{}}^*=\rho \underline{D}{}\rho ^{-1}\), where \(\underline{D}{}=\rho ^{-3}\delta _{\underline{g}{}}G_{\underline{g}{}}\delta _{\underline{g}{}}^*\rho \) is \(\tfrac{1}{2}\) times the wave operator on 1-forms on Minkowski space, re-weighted to a b-operator as usual; then equation (8.52) becomes

$$\begin{aligned} \widehat{\underline{D}{}}(i)(\rho _I^{-1}\omega ) = \rho _I^{-1}\vartheta . \end{aligned}$$
(8.53)

Now \(\rho _I^{-1}\vartheta \in \bar{H}^{-1/2-0,\infty }(I^+)\), while \(\widehat{\underline{D}{}}(i)^{-1}:\bar{H}^{s-1,\infty }(I^+)\rightarrow \bar{H}^{s,\infty }(I^+)\) for \(s>-\tfrac{1}{2}\), cf. (7.14). Therefore, the solution satisfies \(\omega \in \rho _I\bar{H}^{1/2-0,\infty }(I^+)\subset \rho _I^{1-0}H_{{\text {b}}}^\infty (I^+)\) (by Sobolev embedding for functions of the single variable \(\rho _I\)), which using the expression (A.1) implies that \(\omega \) does not contribute to \(h'_{0 0}|_{\partial I^+}\), namely \((\rho ^{-1}\delta _{\underline{g}{}}^*\omega )_{0 0}|_{\partial I^+} = (\rho ^{-1}\partial _0\omega _0)|_{\partial I^+}=0\), where we used that \(\rho ^{-1}\partial _0\) is a multiple of the b-vector field \(\rho _I\partial _{\rho _I}\) at \(\partial I^+\).

A careful inspection of the solution of (8.53) shows that \(\rho ^{-1}\delta _{\underline{g}{}}^*\omega \) is not smooth. Indeed, in the bundle splitting (2.19), we have \(\underline{D}{}\in -2\rho ^{-2}\partial _0\partial _1+\text {Diff}_{\text {b}}^2({}^0M)\), as follows from the same calculations as (B.13), so using the expression (7.19) for \(\sigma =i\), we have \(\widehat{\underline{D}{}}(i)\in \partial _{\rho _I}(\rho _I\partial _{\rho _I}+1)+\text {Diff}_{\text {b}}^2(I^+)\), which implies that (see Footnote 46) \(\omega =\rho _I\log \rho _I\,\vartheta |_{\partial I^+}+\bar{H}^{5/2-0,\infty }(I^+)\); therefore

$$\begin{aligned} (\rho ^{-1}\delta _{\underline{g}{}}^*\omega )|_{I^+} = \bigl (-d x^0\,d x^1+(d x^1)^2\bigr )m \log \rho _I + \bar{H}^{3/2-0,\infty }(I^+). \end{aligned}$$
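
The origin of the logarithm can be seen in a model computation. As a sketch only, keep the leading part \(\partial _{\rho _I}(\rho _I\partial _{\rho _I}+1)\) of \(\widehat{\underline{D}{}}(i)\), replace \(\vartheta \) by its boundary value \(\vartheta _0:=\vartheta |_{\partial I^+}\) (notation introduced here), and discard the \(\text {Diff}_{\text {b}}^2(I^+)\) terms; these simplifications are assumptions of the sketch and not part of the proof. Equation (8.53) for \(v:=\rho _I^{-1}\omega \) then reduces to \(\partial _{\rho _I}(\rho _I\partial _{\rho _I}+1)v=\rho _I^{-1}\vartheta _0\), and \(v=\vartheta _0\log \rho _I\) solves it:

$$\begin{aligned} (\rho _I\partial _{\rho _I}+1)\log \rho _I = 1+\log \rho _I, \qquad \partial _{\rho _I}\bigl (1+\log \rho _I\bigr ) = \rho _I^{-1}. \end{aligned}$$

The solutions of the homogeneous model equation are spanned by \(1\) and \(\rho _I^{-1}\), neither of which can cancel the logarithm, so \(\omega =\rho _I v\) has leading term \(\rho _I\log \rho _I\,\vartheta _0\), in agreement with the expansion of \(\omega \) stated above.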

Therefore, while we do have \(\widehat{\underline{L}{}}(0)(-\underline{h}{}+\rho ^{-1}\delta _{\underline{g}{}}^*\omega )=-P(0)\), we need to correct the 2-tensor on the left by adding the unique solution \(\widetilde{h}'\) of

$$\begin{aligned} \widehat{\underline{L}{}}(0)\widetilde{h}'=0,\ \ \widetilde{h}' \in \bigl (d x^0\,d x^1-(d x^1)^2\bigr )m\log \rho _I + \bar{H}^{1/2+0}(I^+) \end{aligned}$$

in order for \(h'\) in (8.51) to have regularity above \(\bar{H}^{1/2}(I^+)\), which, as remarked before, implies that it is the unique smooth solution of (8.45), as desired. Arguing similarly as around (8.47)–(8.48) and noting that \(d x^0\,d x^1=d t^2-d r^2=d t^2-\frac{Z^i Z^j}{|Z|^2}d x_i\,d x_j\), the solution is given by \(\widetilde{h}'=m(u_0\,d t^2-u_2\,d r^2)-m(u_0\,d t^2-2 u_1\,d t\,d r+u_2\,d r^2)\). This gives

$$\begin{aligned} \widetilde{h}'_{0 0}|_{\partial I^+} = \tfrac{1}{4} m (u_0-u_2)|_{\partial I^+} - m\cdot (-\tfrac{1}{4}) = -\tfrac{1}{2}m. \end{aligned}$$
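
For the reader who wishes to verify the value \(-\tfrac{1}{2}m\), here is a sketch of the arithmetic, again with the abbreviation \(\Lambda (R):=\log \bigl (\tfrac{1-R}{1+R}\bigr )\), which is introduced here for convenience. From (8.48),

$$\begin{aligned} u_0-u_2 = \Bigl (\frac{1}{R}-\frac{3-R^2}{2 R^3}\Bigr )\Lambda (R)-\frac{3}{R^2} = -\frac{3(1-R)(1+R)}{2 R^3}\,\Lambda (R)-\frac{3}{R^2} \longrightarrow -3 \quad (R\rightarrow 1), \end{aligned}$$

since \((1-R)\Lambda (R)\rightarrow 0\). The second bracket in \(\widetilde{h}'\) is the tensor from (8.49) with c replaced by 1, whose \((0,0)\)-component at \(\partial I^+\) is \(-\tfrac{1}{4}\). Altogether, \(\widetilde{h}'_{0 0}|_{\partial I^+}=\tfrac{1}{4}m\cdot (-3)+\tfrac{1}{4}m=-\tfrac{1}{2}m\).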

In view of (8.50), we conclude that

$$\begin{aligned} h'_{0 0}|_{\partial I^+} = -\underline{h}{}_{0 0}|_{\partial I^+} + \widetilde{h}'_{0 0}|_{\partial I^+} = \tfrac{1}{2}m. \end{aligned}$$

Adding this to (8.49) establishes the relation (8.43), and proves \(M_{\mathrm{B}}(+\infty )=0\). \(\square \)
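
The final bookkeeping step of the proof can be spelled out as follows; this is only a restatement of the displayed values above, using \(h^+=h'+h''\), the value \(\underline{h}{}_{0 0}|_{\partial I^+}=-m\) read off from (8.50), and (8.49):

$$\begin{aligned} h'_{0 0}|_{\partial I^+} = -(-m)+\bigl (-\tfrac{1}{2}m\bigr ) = \tfrac{1}{2}m, \qquad h^+_{0 0}|_{\partial I^+} = h'_{0 0}|_{\partial I^+}+h''_{0 0}|_{\partial I^+} = \tfrac{1}{2}m-\tfrac{1}{4}c. \end{aligned}$$

Since the left-hand side of the second identity vanishes by the gauge construction, the two evaluations are compatible only if \(c=2m\).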

Remark 8.15

The construction of Bondi–Sachs coordinates is local near \((\mathscr {I}^+)^\circ \) and as such did not rely on h being small. (The proof of Proposition 8.2 used the smallness of certain Christoffel symbols in a weighted \(\mathcal {C}^0\) space, but this is automatic for any fixed \(h\in \mathcal {X}^\infty \) if one relaxes the weights at \(\mathscr {I}^+\) by a little and works in a sufficiently small neighborhood of \(\mathscr {I}^+\)). Likewise, the proof of Theorem 8.14 did not require h to be small. Therefore, we in fact conclude that any (large) solution of the Einstein vacuum equation of the form \(g=g_m+\rho h\) (with m possibly large), \(h\in \mathcal {X}^\infty \)—which requires it to decay to the Minkowski solution at \(I^+\)—satisfies the conclusions of Theorem 8.14.

Let us connect this to the alternative definition of the Bondi mass and the mass loss formula used in §1.3, which has a more geometric flavor [23]. To describe this, consider an outgoing null cone \(C_u\) and let

$$\begin{aligned} S_{u,\mathring{r}}:=C_u\cap \{r=\mathring{r}\} \end{aligned}$$

denote the 2-sphere of constant area radius (which is a particular choice of transversal of \(C_u\)). Let \(L\in (T C_u)^\perp \) be a future-directed null normal vector field, i.e. a smooth positive multiple of \(\nabla u\); then the null second fundamental form is

$$\begin{aligned} \chi _L(X,Y) := g(\nabla _X L,Y),\ \ X,Y\in T S_{u,\mathring{r}}. \end{aligned}$$

Note that \(\chi _{a L}=a\chi _L\) for any function a. There exists a unique future-directed null vector field

$$\begin{aligned} \underline{L}{}\in ( T S_{u,\mathring{r}})^\perp \ \ \text {such that}\ \ g(L,\underline{L}{})=2. \end{aligned}$$
(8.54)

Define \(T\underline{C}{}_u:=T S_{u,\mathring{r}}\oplus \langle \underline{L}{}\rangle \), which is the tangent space (at \(S_{u,\mathring{r}}\)) of a null hypersurface \(\underline{C}{}_u\) which is the congruence of null-geodesics with initial condition on \(S_{u,\mathring{r}}\) and initial velocity \(\underline{L}{}\). (L and \(C_u\), resp. \(\underline{L}{}\) and \(\underline{C}{}_u\), are often called ‘outgoing’ and ‘ingoing,’ respectively). The conjugate null second fundamental form is then

$$\begin{aligned} \underline{\chi }{}_{\underline{L}{}}(X,Y) := g(\nabla _X\underline{L}{},Y)=-g(\nabla _X Y,\underline{L}{}),\ \ X,Y\in T S_{u,\mathring{r}}, \end{aligned}$$

with the second expression showing that this depends only on \(\underline{L}{}\) at \(S_{u,\mathring{r}}\). Letting \(\mathring{g}:=g|_{S_{u,\mathring{r}}}\) denote the induced metric, the trace-free parts of \(\chi \) and \(\underline{\chi }{}\) are

$$\begin{aligned} \hat{\chi }_L := \chi _L - \tfrac{1}{2}\mathring{g}{\text {tr}}_{\mathring{g}}(\chi _L), \quad \hat{\underline{\chi }{}}_{\underline{L}{}} := \underline{\chi }{}_{\underline{L}{}}-\tfrac{1}{2}\mathring{g}{\text {tr}}_{\mathring{g}}(\underline{\chi }{}_{\underline{L}{}}). \end{aligned}$$
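
As a quick consistency check (a sketch, using only that \(S_{u,\mathring{r}}\) is 2-dimensional, so \({\text {tr}}_{\mathring{g}}(\mathring{g})=2\)), these tensors are indeed trace-free:

$$\begin{aligned} {\text {tr}}_{\mathring{g}}(\hat{\chi }_L) = {\text {tr}}_{\mathring{g}}(\chi _L)-\tfrac{1}{2}\cdot 2\cdot {\text {tr}}_{\mathring{g}}(\chi _L) = 0, \end{aligned}$$

and the same computation applies verbatim to \(\hat{\underline{\chi }{}}_{\underline{L}{}}\).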

Rescaling L to aL, we must rescale \(\underline{L}{}\) to \(a^{-1}\underline{L}{}\), so the product \({\text {tr}}\chi _L{\text {tr}}\underline{\chi }{}_{\underline{L}{}}\) is well-defined, and we may drop the subscripts on \(\chi \) and \(\underline{\chi }{}\). The Hawking mass of \(S_{u,\mathring{r}}\) is defined as

$$\begin{aligned} M_{\mathrm{H}}(u,\mathring{r}) := \frac{\mathring{r}}{2}\biggl (1+\frac{1}{16\pi }\int _{S_{u,\mathring{r}}} {\text {tr}}\chi {\text {tr}}\underline{\chi }{}\,d S\biggr ), \end{aligned}$$
(8.55)

where dS is the induced surface measure. For a 1-form \(\omega \), let us write its components in Bondi–Sachs coordinates as \(\omega _u\), \(\omega _{\mathring{r}}\), \(\omega _{\mathring{a}}\), \(a=2,3\), and similarly for higher rank tensors.
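
Two elementary checks may help to fix the conventions in (8.55); both are sketches rather than part of the argument. First, the expression \(-g(\nabla _X Y,\underline{L}{})\) is linear in \(\underline{L}{}\) at each point, so \(\underline{\chi }{}_{a^{-1}\underline{L}{}}=a^{-1}\underline{\chi }{}_{\underline{L}{}}\), and together with \(\chi _{a L}=a\chi _L\) this gives

$$\begin{aligned} {\text {tr}}\chi _{a L}\,{\text {tr}}\underline{\chi }{}_{a^{-1}\underline{L}{}} = (a\cdot a^{-1})\,{\text {tr}}\chi _L\,{\text {tr}}\underline{\chi }{}_{\underline{L}{}} = {\text {tr}}\chi _L\,{\text {tr}}\underline{\chi }{}_{\underline{L}{}}. \end{aligned}$$

Second, consider exact Minkowski space with the signature \((+,-,-,-)\) implicit in (8.50) and (8.54), the round sphere of radius r in a slice \(t=\text {const}\), and the choice \(L=\partial _t+\partial _r\), \(\underline{L}{}=\partial _t-\partial _r\), so that \(g(L,\underline{L}{})=2\); this particular setup is an assumption made for illustration. Since \(\nabla _X\partial _r=r^{-1}X\) and \(\nabla _X\partial _t=0\) for X tangent to the sphere, one gets \(\chi =r^{-1}\mathring{g}\) and \(\underline{\chi }{}=-r^{-1}\mathring{g}\), hence

$$\begin{aligned} {\text {tr}}\chi = \frac{2}{r},\quad {\text {tr}}\underline{\chi }{} = -\frac{2}{r},\quad \int _{S} {\text {tr}}\chi \,{\text {tr}}\underline{\chi }{}\,d S = -\frac{4}{r^2}\cdot 4\pi r^2 = -16\pi ,\quad M_{\mathrm{H}} = \frac{r}{2}\bigl (1-1\bigr ) = 0, \end{aligned}$$

as expected for a spacetime without mass.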

Lemma 8.16

We have \(|M_{\mathrm{H}}(u,\mathring{r})-M_{\mathrm{B}}(u)|\lesssim \mathring{r}^{-b_I+0}\), hence

$$\begin{aligned} \lim _{\mathring{r}\rightarrow \infty }M_{\mathrm{H}}(u,\mathring{r})=M_{\mathrm{B}}(u). \end{aligned}$$

Proof

We work in Bondi–Sachs coordinates, so \(T S_{u,\mathring{r}}=\langle \partial _{\mathring{x}^2},\partial _{\mathring{x}^3}\rangle \), and

Let us take \(L=\partial _{\mathring{r}}\) and write \(\chi \equiv \chi _L\); then \(\chi _{\mathring{a}\mathring{b}}\) is the Christoffel symbol of the first kind, \(\Gamma _{\mathring{b}\mathring{a}\mathring{r}}=g(\nabla _{\partial _{\mathring{x}^a}}\partial _{\mathring{r}},\partial _{\mathring{x}^b})\). By Proposition 8.10, \(g(\partial _{\mathring{x}^a},\partial _{\mathring{r}})\equiv 0\), therefore

(8.56)

which due to gives

$$\begin{aligned} {\text {tr}}\chi = 2\mathring{r}^{-1} + o(\mathring{r}^{-2}), \quad \hat{\chi }_{\mathring{a}\mathring{b}} = -\tfrac{1}{2}h_{\bar{a}\bar{b}} + o(1). \end{aligned}$$
(8.57)

Next, a simple calculation shows that the unique future-directed null vector field \(\underline{L}{}\) defined in (8.54) is given by

(The spherical component is determined by \(g(\underline{L}{},\partial _{\mathring{c}})=0\), \(c=2,3\), the \(\partial _u\) component by \(g(\underline{L}{},L)=2\), and the \(\partial _{\mathring{r}}\) component by \(g(\underline{L}{},\underline{L}{})=0\)). Working in normal coordinates on \(\mathbb {S}^2\), using , , and \(\Gamma _{\mathring{c}\mathring{a}\mathring{b}} = o(\mathring{r}^2)\), the components of \(\underline{\chi }{}:=\underline{\chi }{}_{\underline{L}{}}\) are

(8.58)

which gives

(8.59)

Finally, the surface measure on \(S_{u,\mathring{r}}\) is , hence the Hawking mass is . (As usual, the o(1) remainder is really symbolic as \(\mathring{r}\rightarrow \infty \), namely of class \(S^{-b_I+0}\)). \(\square \)

With L and \(\underline{L}{}\) defined as in the proof of the lemma, consider the conjugate null vectors aL and \(a^{-1}\underline{L}{}\). By (8.57) and (8.59), there exists a unique \(a=1+\mathcal {O}(\mathring{r}^{-1})\) such that

$$\begin{aligned} {\text {tr}}\chi _{a L}+{\text {tr}}\underline{\chi }{}_{a^{-1}\underline{L}{}}=a^{-1}(a^2{\text {tr}}\chi +{\text {tr}}\underline{\chi }{})=0; \end{aligned}$$
(8.60)

thus \(\hat{\underline{\chi }{}}_{a^{-1}\underline{L}{}}=\mathring{r}\partial _u h_{\bar{a}\bar{b}}+\mathcal {O}(1)=\hat{\underline{\chi }{}}+\mathcal {O}(1)\), hence to leading order, the normalization (8.60) does not change \(\hat{\underline{\chi }{}}\). We can now calculate the outgoing energy flux through \(S_{u,\mathring{r}}\),

where \(N_{a b}=\partial _u h_{\bar{a}\bar{b}}\) is as in Theorem 8.14 (see Footnote 47). Clearly, \(\underline{E}{}\) has a limit \(E(u) = \lim _{\mathring{r}\rightarrow \infty } \underline{E}{}(u,\mathring{r})\) at null infinity, and the Bondi mass loss formula (8.40) then takes the equivalent form

$$\begin{aligned} \frac{d}{d u}M_{\mathrm{B}}(u) = -E(u). \end{aligned}$$