1 Introduction

In this article we prove global well-posedness and scattering of the energy critical Maxwell-Klein-Gordon equation on \({{\mathbb {R}}}^{1+4}\) for any finite energy initial data. In Sect. 1.1, we present some background material concerning the Maxwell-Klein-Gordon equation on \({{\mathbb {R}}}^{1+4}\). Readers already familiar with this equation may skip to Sect. 1.2, where we give a precise statement of the main theorem (Theorem 1.3). This paper is the main and logically the final part of the three-paper sequence [26, 27]. In Sects. 2 and 3 below, we provide an overview of the entire proof of Theorem 1.3 spanning the whole sequence.

1.1 \((4+1)\)-dimensional Maxwell-Klein-Gordon system

Let \({{\mathbb {R}}}^{1+4}\) be the \((4+1)\)-dimensional Minkowski space with the metric

$$\begin{aligned} \mathbf{m}_{\mu \nu } := \mathrm {diag}\,(-1,+1,+1,+1,+1) \end{aligned}$$

in the standard rectilinear coordinates \((t=x^{0}, x^{1}, \ldots , x^{4})\). Consider the trivial complex line bundle \(L = {{\mathbb {R}}}^{1+4} \times {{\mathbb {C}}}\) over \({{\mathbb {R}}}^{1+4}\) with structure group \(\mathrm {U}(1) = \left\{ e^{i \chi } \in {{\mathbb {C}}}\right\} \). Global sections of L may be identified with \({{\mathbb {C}}}\)-valued functions on \({{\mathbb {R}}}^{1+4}\). Using the identification \(u(1) \equiv i {{\mathbb {R}}}\) and taking the trivial connection \(\mathrm {d}\) as a reference, any connection \(\mathbf{D}\) on L takes the form

$$\begin{aligned} \mathbf{D}= \mathrm {d}+ i A \end{aligned}$$

for some real-valued 1-form A on \({{\mathbb {R}}}^{1+4}\). The Maxwell-Klein-Gordon system is a Lagrangian field theory for a pair \((A, \phi )\) of a connection on L and a section of L with the action functional

$$\begin{aligned} {{\mathcal {S}}}[A, \phi ] = \int _{{{\mathbb {R}}}^{1+4}} \frac{1}{4} F_{\mu \nu } F^{\mu \nu } + \frac{1}{2} \mathbf{D}_{\mu } \phi \overline{\mathbf{D}^{\mu } \phi } \, \mathrm {d}t \mathrm {d}x, \end{aligned}$$

where \(F_{\mu \nu } = (\mathrm {d}A)_{\mu \nu } = \partial _{\mu } A_{\nu } - \partial _{\nu } A_{\mu }\) is the curvature 2-form associated to \(\mathbf{D}\). We follow the usual convention of raising/lowering indices by the Minkowski metric \(\mathbf{m}\), and also of summing over repeated upper and lower indices. Computing the Euler-Lagrange equations, we arrive at the Maxwell-Klein-Gordon equations (MKG)

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial ^{\mu } F_{\nu \mu } = \mathrm {Im}(\phi \overline{\mathbf{D}_{\nu } \phi }) \\ \Box _{A} \phi = 0, \end{array}\right. } \end{aligned}$$
(MKG)

where \(\Box _{A} := \mathbf{D}^{\mu } \mathbf{D}_{\mu }\) is the (gauge) covariant d’Alembertian.
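As a quick check of the definition, the covariant d’Alembertian can be expanded directly from \(\mathbf{D}_{\mu } = \partial _{\mu } + i A_{\mu }\). The following is only a sketch; the symbol \(\Box := \partial ^{\mu } \partial _{\mu }\) for the flat d’Alembertian is a notational assumption made here.

$$\begin{aligned} \Box _{A} \phi&= (\partial ^{\mu } + i A^{\mu }) (\partial _{\mu } \phi + i A_{\mu } \phi ) \\&= \Box \phi + 2 i A^{\mu } \partial _{\mu } \phi + i (\partial ^{\mu } A_{\mu }) \phi - A^{\mu } A_{\mu } \phi . \end{aligned}$$

In particular, for a fixed connection A the equation \(\Box _{A} \phi = 0\) is a linear wave equation for \(\phi \) with variable lower order coefficients determined by A.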

A basic feature of (MKG) is gauge invariance. Geometrically, a gauge transform is a change of basis in the fiber \({{\mathbb {C}}}\) over each point on \({{\mathbb {R}}}^{1+4}\) by an element of the gauge group \(\mathrm {U}(1)\). Accordingly, we refer to a real-valued function \(\chi : {{\mathbb {R}}}^{1+4} \rightarrow {{\mathbb {R}}}\) (hence \(e^{i \chi } \in \mathrm {U}(1)\)) as a gauge transformation and define the corresponding gauge transform of a pair \((A, \phi )\) as

$$\begin{aligned} (A, \phi ) \mapsto (\widetilde{A}, \widetilde{\phi }) := (A - \mathrm {d}\chi , e^{i \chi } \phi ). \end{aligned}$$
(1.1)

Observe that \(\mathbf{D}\) and \(\Box _{A}\) are covariant under gauge transforms (i.e., \(e^{i \chi } \mathbf{D}\phi = \widetilde{\mathbf{D}} \widetilde{\phi }\), etc.), whereas F and \(\mathrm {Im}(\phi \overline{\mathbf{D}_{\mu } \phi })\) are invariant. Hence (MKG) is invariant under gauge transforms. Since \(\mathrm {U}(1)\) is an abelian group, (MKG) is said to be an abelian gauge theory.
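These covariance and invariance properties can be checked by a short computation. In the sketch that follows, \(\widetilde{\mathbf{D}} = \mathrm {d}+ i \widetilde{A}\) and \(\widetilde{F} = \mathrm {d}\widetilde{A}\) denote the connection and curvature of the transformed pair in (1.1); the symbol \(\widetilde{F}\) is introduced here for illustration.

$$\begin{aligned} \widetilde{\mathbf{D}}_{\mu } \widetilde{\phi }&= (\partial _{\mu } + i A_{\mu } - i \partial _{\mu } \chi ) (e^{i \chi } \phi ) = e^{i \chi } (\partial _{\mu } \phi + i (\partial _{\mu } \chi ) \phi + i A_{\mu } \phi - i (\partial _{\mu } \chi ) \phi ) = e^{i \chi } \mathbf{D}_{\mu } \phi , \\ \widetilde{F}_{\mu \nu }&= \partial _{\mu } (A_{\nu } - \partial _{\nu } \chi ) - \partial _{\nu } (A_{\mu } - \partial _{\mu } \chi ) = F_{\mu \nu }, \\ \mathrm {Im}(\widetilde{\phi } \, \overline{\widetilde{\mathbf{D}}_{\mu } \widetilde{\phi }})&= \mathrm {Im}(e^{i \chi } \phi \, e^{- i \chi } \overline{\mathbf{D}_{\mu } \phi }) = \mathrm {Im}(\phi \overline{\mathbf{D}_{\mu } \phi }). \end{aligned}$$

Applying the first identity twice gives \(\widetilde{\mathbf{D}}^{\mu } \widetilde{\mathbf{D}}_{\mu } \widetilde{\phi } = e^{i \chi } \Box _{A} \phi \), so both equations in (MKG) are preserved.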

We now formulate the initial value problem for (MKG), in a way that is consistent with the gauge invariance of the system. An initial data set for (MKG) consists of a pair of 1-forms \((a_{j}, e_{j})\) and a pair of \({{\mathbb {C}}}\)-valued functions \((f, g)\) on \({{\mathbb {R}}}^{4}\). We say that \((a, e, f, g)\) is the initial data for a solution \((A, \phi )\) at time \(t_{0}\) if

$$\begin{aligned} (A_{j}, F_{0j}, \phi , \mathbf{D}_{t} \phi ) \!\upharpoonright _{\left\{ t = t_{0}\right\} } = (a_{j}, e_{j}, f, g). \end{aligned}$$

We usually take the initial time \(t_{0}\) to be zero. Observe that the \(\nu = 0\) component of (MKG) imposes a constraint on any initial data for (MKG), namely

$$\begin{aligned} \partial ^{\ell } e_{\ell } = \mathrm {Im}(f \overline{g}). \end{aligned}$$
(1.2)

This equation is called the Gauss (or constraint) equation.
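For completeness, here is a sketch of how the Gauss equation follows from the \(\nu = 0\) component of (MKG). It uses only the antisymmetry of F (so that \(F_{00} = 0\)), the fact that spatial indices are raised with the identity, and the definition of the initial data at \(t = t_{0}\).

$$\begin{aligned} \partial ^{\mu } F_{0 \mu } = \partial ^{\ell } F_{0 \ell } = \mathrm {Im}(\phi \overline{\mathbf{D}_{t} \phi }) \quad \Longrightarrow \quad \partial ^{\ell } e_{\ell } = \partial ^{\ell } F_{0 \ell } \!\upharpoonright _{\left\{ t = t_{0}\right\} } = \mathrm {Im}(\phi \overline{\mathbf{D}_{t} \phi }) \!\upharpoonright _{\left\{ t = t_{0}\right\} } = \mathrm {Im}(f \overline{g}). \end{aligned}$$

No time derivative of \(F_{0 \ell }\) appears, which is why this component is a constraint on the data rather than an evolution equation.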

There is a conserved energy for (MKG), which is one of the basic ingredients of the non-perturbative analysis performed in this paper. We define the conserved energy of a solution \((A, \phi )\) at time t to be

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t\right\} \times {{\mathbb {R}}}^{4}}[A, \phi ] := \frac{1}{2} \int _{\left\{ t\right\} \times {{\mathbb {R}}}^{4}} \sum _{0 \le \mu < \nu \le 4} \vert F_{\mu \nu }\vert ^{2} + \sum _{0 \le \mu \le 4} \vert \mathbf{D}_{\mu } \phi \vert ^{2} \, \mathrm {d}x. \end{aligned}$$
(1.3)

For a suitably regular solution to (MKG) defined on a connected interval I, this quantity is constant. This conservation law is in fact a consequence of Nöther’s principle (i.e., continuous symmetry of the field theory corresponds to a conserved quantity) applied to the time translation symmetry of (MKG); we refer to Sect. 5 for further discussion and a proof.

Observe that the conserved energy is invariant under the scaling

$$\begin{aligned} (A, \phi )(t,x) \mapsto \left( \lambda ^{-1} A, \lambda ^{-1} \phi \right) \left( \lambda ^{-1} t, \lambda ^{-1} x\right) \quad \hbox { for any } \lambda > 0, \end{aligned}$$

which also preserves the system (MKG). Hence (MKG) on \({{\mathbb {R}}}^{1+4}\) is energy critical.
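The scaling invariance of the energy can be verified by a change of variables. In the following sketch, \((A_{\lambda }, \phi _{\lambda })(t,x) := (\lambda ^{-1} A, \lambda ^{-1} \phi )(\lambda ^{-1} t, \lambda ^{-1} x)\) is shorthand introduced here, and F, \(\mathbf{D}\) on the right-hand sides refer to the original pair \((A, \phi )\).

$$\begin{aligned} (\partial _{\mu } (A_{\lambda })_{\nu } - \partial _{\nu } (A_{\lambda })_{\mu })(t,x)&= \lambda ^{-2} F_{\mu \nu }(\lambda ^{-1} t, \lambda ^{-1} x), \qquad (\partial _{\mu } + i (A_{\lambda })_{\mu }) \phi _{\lambda }(t,x) = \lambda ^{-2} (\mathbf{D}_{\mu } \phi )(\lambda ^{-1} t, \lambda ^{-1} x), \\ {{\mathcal {E}}}_{\left\{ t\right\} \times {{\mathbb {R}}}^{4}}[A_{\lambda }, \phi _{\lambda }]&= \frac{1}{2} \int _{{{\mathbb {R}}}^{4}} \lambda ^{-4} \Big ( \sum _{0 \le \mu < \nu \le 4} \vert F_{\mu \nu }\vert ^{2} + \sum _{0 \le \mu \le 4} \vert \mathbf{D}_{\mu } \phi \vert ^{2} \Big )(\lambda ^{-1} t, \lambda ^{-1} x) \, \mathrm {d}x = {{\mathcal {E}}}_{\left\{ \lambda ^{-1} t\right\} \times {{\mathbb {R}}}^{4}}[A, \phi ]. \end{aligned}$$

The last equality uses the substitution \(x = \lambda y\), for which \(\mathrm {d}x = \lambda ^{4} \mathrm {d}y\). In spatial dimension d the same computation would leave a factor \(\lambda ^{d-4}\), so the exact cancellation is specific to \(d = 4\).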

1.2 Statement of the main theorem

Our goal now is to give a precise statement of the global well-posedness/scattering theorem proved in this paper. For this purpose, we first borrow some definitions from [19, 26].

We say that a (MKG) initial data set \((a, e, f, g)\) (i.e., a solution to the Gauss equation) is classical and write \((a, e, f, g) \in {{\mathcal {H}}}^{\infty }\) if each of a, e, f, g belongs to \(H^{\infty }_{x} := \cap _{n=0}^{\infty } H^{n}_{x}\). Correspondingly, we say that a smooth solution \((A, \phi )\) to (MKG) on \(I \times {{\mathbb {R}}}^{4}\) (where \(I \subseteq {{\mathbb {R}}}\) is an interval) is a classical solution if \(A_{\mu }, \phi \in \cap _{n, m=0}^{\infty } C_{t}^{m} (I; H^{n}_{x})\).

Define the space \({{\mathcal {H}}}^{1} = {{\mathcal {H}}}^{1}({{\mathbb {R}}}^{4})\) of finite energy initial data sets to be the space of (MKG) initial data sets for which the following norm is finite:

$$\begin{aligned} \Vert (a, e, f, g)\Vert _{{{\mathcal {H}}}^{1}} := \sup _{j=1, \ldots , 4} \Vert (a_{j}, e_{j})\Vert _{\dot{H}^{1}_{x} \times L^{2}_{x}({{\mathbb {R}}}^{4})} + \Vert (f, g)\Vert _{\dot{H}^{1}_{x} \times L^{2}_{x}({{\mathbb {R}}}^{4})}. \end{aligned}$$
(1.4)

Given a pair \((A, \phi )\) on \(I \times {{\mathbb {R}}}^{4}\), we define its \(C_{t} {{\mathcal {H}}}^{1}(I \times {{\mathbb {R}}}^{4})\) norm as

$$\begin{aligned} \Vert (A, \phi )\Vert _{C_{t} {{\mathcal {H}}}^{1}(I \times {{\mathbb {R}}}^{4})} := {{\mathrm{ess\,sup\,}}}_{t \in I} \Big ( \Vert A[t]\Vert _{\dot{H}^{1}_{x} \times L^{2}_{x}} + \Vert \phi [t]\Vert _{\dot{H}^{1}_{x} \times L^{2}_{x}} \Big ), \end{aligned}$$

where A[t] and \(\phi [t]\) are shorthands for \((A, \partial _{t} A)(t)\) and \((\phi , \partial _{t} \phi )(t)\), respectively. We then define the notion of an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to (MKG) via approximation by classical solutions as follows.

Definition 1.1

(Admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions to (MKG)) Let \(I \subseteq {{\mathbb {R}}}\) be an interval. We say that a pair \((A, \phi ) \in C_{t} {{\mathcal {H}}}^{1}(I \times {{\mathbb {R}}}^{4})\) is an admissible \(C_{t} {{\mathcal {H}}}^{1}(I \times {{\mathbb {R}}}^{4})\) solution to (MKG) if there exists a sequence \((A^{(n)}, \phi ^{(n)})\) of classical solutions to (MKG) on \(I \times {{\mathbb {R}}}^{4}\) such that

$$\begin{aligned} \Vert (A, \phi ) - (A^{(n)}, \phi ^{(n)})\Vert _{C_{t} {{\mathcal {H}}}^{1}(J \times {{\mathbb {R}}}^{4})} \rightarrow 0 \quad \hbox { as } n \rightarrow \infty , \end{aligned}$$

for every compact subinterval \(J \subseteq I\).

The necessity of restricting the class of energy solutions under consideration to the admissible ones as defined above is a relatively standard matter in the realm of low regularity solutions for nonlinear dispersive equations. Often uniqueness statements require additional regularity properties for solutions, which are then proved to hold for the solutions which are limits of smooth solutions, but might not be true or straightforward in general. In our case the difficulties are compounded by the need to have a good notion of finite energy solution which is gauge invariant.

Remark 1.2

The above definitions can be localized to an open subset \(O \subseteq {{\mathbb {R}}}^{4}\) or \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) in an obvious manner; see [26, Sects. 3 and 5].

Next, we recall the global Coulomb gauge condition

$$\begin{aligned} \partial ^{\ell } A_{\ell } = \sum _{\ell = 1, \ldots , 4} \partial _{\ell } A_{\ell } = 0. \end{aligned}$$
(1.5)

The role of this condition is to fix the ambiguity arising from the gauge invariance of (MKG), which is an immediate formal obstruction for well-posedness.

Finally, given an interval \(I \subseteq {{\mathbb {R}}}\), we borrow the space-time norms \(Y^{1}(I \times {{\mathbb {R}}}^{4})\) and \(S^{1}(I \times {{\mathbb {R}}}^{4})\) from [19, 26, 27]. We define the \(S^{1}\) norm of a solution \((A, \phi )\) on \(I \times {{\mathbb {R}}}^{4}\) to be

$$\begin{aligned} \Vert (A, \phi )\Vert _{S[I]} := \Vert A_{0}\Vert _{Y^{1}\left( I \times {{\mathbb {R}}}^{4}\right) } + \Vert A_{x}\Vert _{S^{1}\left( I \times {{\mathbb {R}}}^{4}\right) } + \Vert \phi \Vert _{S^{1}\left( I \times {{\mathbb {R}}}^{4}\right) }. \end{aligned}$$

In particular, the \(S^{1}\) norm captures the dispersive properties of \(A_{x}\) and \(\phi \). The precise definition of the \(S^{1}\) norm is rather intricate; instead of the full definition, in this paper we only rely on a few basic properties of the spaces \(Y^{1}\) and \(S^{1}\), such as those below (see also Remark 4.2).

$$\begin{aligned}&\Vert (\varphi , \partial _{t} \varphi )\Vert _{C_{t}(I; \dot{H}^{1}_{x} \times L^{2}_{x})} \lesssim \Vert \varphi \Vert _{S^{1}\left( I \times {{\mathbb {R}}}^{4}\right) }, \\&\Vert (\varphi , \partial _{t} \varphi )\Vert _{C_{t}(I; \dot{H}^{1}_{x} \times L^{2}_{x})} \lesssim \Vert \varphi \Vert _{Y^{1}\left( I \times {{\mathbb {R}}}^{4}\right) }. \end{aligned}$$

We are now ready to state our main theorem.

Theorem 1.3

(Main Theorem) Let \((a, e, f, g) \in {{\mathcal {H}}}^{1}\) be a finite energy initial data set for (MKG) obeying the global Coulomb gauge condition \(\partial ^{\ell } a_{\ell } = 0\). Then there exists a unique admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution \((A, \phi )\) to the initial value problem defined on the whole \({{\mathbb {R}}}^{1+4}\) which satisfies the global Coulomb gauge condition \(\partial ^{\ell } A_{\ell } = 0\). Moreover, the \(S^{1}\) norm of \((A, \phi )\) is finite, i.e.,

$$\begin{aligned} \Vert A_{0}\Vert _{Y^{1}\left( {{\mathbb {R}}}^{1+4}\right) } + \Vert A_{x}\Vert _{S^{1}\left( {{\mathbb {R}}}^{1+4}\right) } + \Vert \phi \Vert _{S^{1}\left( {{\mathbb {R}}}^{1+4}\right) } < \infty . \end{aligned}$$
(1.6)

Remark 1.4

The a-priori bound above implies scattering as \(t \rightarrow \pm \infty \); see Theorem 4.8. It also implies continuity of the data-to-solution map on compact time intervals, though not on the full real line.

Remark 1.5

We do not lose any generality by restricting to initial data sets in the global Coulomb gauge, since any finite energy initial data set can be gauge transformed to obey the condition \(\partial ^{\ell } a_{\ell } = 0\). See [26, Sect. 3].
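At a formal level, the gauge transformation in question is obtained by solving a linear Poisson equation. The following sketch ignores all function space issues and only records the algebra; here \(\chi \) denotes the restriction of the gauge transformation to \(\left\{ t = 0\right\} \) and \(\widetilde{a} = a - \mathrm {d}\chi \) is the transformed 1-form, as in (1.1).

$$\begin{aligned} \partial ^{\ell } \widetilde{a}_{\ell } = \partial ^{\ell } a_{\ell } - \triangle \chi , \qquad \hbox {so that} \qquad \partial ^{\ell } \widetilde{a}_{\ell } = 0 \quad \Longleftrightarrow \quad \triangle \chi = \partial ^{\ell } a_{\ell }. \end{aligned}$$

The equation for \(\chi \) is linear because the gauge group is abelian.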

Remark 1.6

We note that an independent proof of global well-posedness and scattering of MKG-CG has been recently given by Krieger-Lührmann [16], following a version of the Bahouri-Gérard nonlinear profile decomposition [1] and Kenig-Merle concentration compactness/rigidity scheme [8, 9] developed by Krieger-Schlag [17] for the energy critical wave maps. We refer to Sect. 3.2 for a brief comparison between our work and [16].

1.3 A brief history and broader context

A natural point of view is to place the present papers and results within the larger context of nonlinear wave equations, of which the starting point is the semilinear wave equation \(\Box u = \pm |u|^{p} u\). More accurately, the (MKG) equation belongs to the class of geometric wave equations, which includes wave maps (WM), Yang-Mills (YM), Einstein equations, as well as many other coupled models. Two common features of all these problems are that they admit a Lagrangian formulation, and have some natural gauge invariance properties. Following are some of the key developments that led to the present work.

1. The null condition  A crucial early observation in the study of both long range and low regularity solutions to geometric wave equations was that the nonlinearities appearing in the equations have a favorable algebraic structure, which was called the null condition, and which can be roughly described as a cancellation condition in the interaction of parallel waves. In the low regularity setting, this was first explored in work of Klainerman and Machedon [10], and by many others later on.

2. The \(X^{s,b}\) spaces  A second advance was the introduction of the \(X^{s,b}\) spaces,\(^{1}\) also first used by Klainerman and Machedon [13] in the context of the wave equation. Their role was to provide enough structure in order to be able to take advantage of the null condition in bilinear and multilinear estimates. Earlier methods, based on energy bounds, followed by the more robust Strichartz estimates, had proved inadequate to the task.

3. The null frame spaces  To study nonlinear problems at critical regularity one needs to work in a scale invariant setting. However, it was soon realized that the homogeneous \(X^{s,b}\) spaces are not even well defined, not to mention suitable for this. The remedy, first introduced in work of the second author [41] in the context of wave maps, was to produce a better description of the fine structure of waves, combining frequency and modulation localizations with adapted frames in the physical space. This led to the null frame spaces, which played a key role in subsequent developments for wave maps. We remark that another scale invariant alternative to \(X^{s,b}\) spaces are the \(U^p\) and \(V^p\) spaces, also originally developed by the second author; while these played a role in the study of other nonlinear dispersive problems at critical regularity, they play no role in the present story.

4. Renormalization  A remarkable feature of all semilinear geometric wave equations is that while at high regularity (and locally in time) the nonlinearity is perturbative, this is no longer the case at critical regularity. Precisely, isolating the non-perturbative component of the nonlinearity, one can see that this is of paradifferential type; in other words, the high frequency waves evolve on a variable low frequency background. To address this difficulty, the idea of Tao [34], also in the wave map context, was to renormalize the paradifferential problem, i.e., to find a suitable approximate conjugation to the corresponding constant coefficient problem.

5. Induction of energy  The ideas discussed so far seem to suffice for small data critical problems. Attacking the large data problem generates yet another range of difficulties. One first step in this direction is Bourgain’s induction of energy idea [2], which is a convenient mechanism to transfer information to higher and higher energies. We remark that an alternative avenue here, which sometimes yields more efficient proofs, is the Kenig-Merle idea [9] of constructing minimal blow-up solutions. However, the implementation of this method in problems which require renormalization seems to cause considerable trouble.

For a further discussion on this issue, we refer to the work of Krieger-Schlag [17], where this method was carried out in the case of energy critical wave maps into the hyperbolic plane. We also mention the recent paper [16], where an independent proof of global well-posedness and scattering for (MKG) in the Coulomb gauge was given following the above-mentioned ideas of Kenig-Merle and Krieger-Schlag. See Sect. 3.2 for a brief comparison between the approaches in this paper and [16].

6. Energy dispersion  One fundamental goal in the study of large data problems is to establish a quantitative dichotomy between dispersion and concentration. The notion of energy dispersion, introduced in joint work [32, 33] of the second author and Sterbenz in the wave map context, provides a convenient measure for pointwise concentration. Precisely, at each energy there is an energy dispersion threshold below which dispersion wins. We remark that, when it can be applied, the Kenig-Merle method [9] yields more accurate information; for instance, see [17]. However, the energy dispersion idea, which is what we follow in the present series of papers, is much easier to implement in conjunction with renormalization.

7. The frequency gap  One obstacle in the transition from small to large data in renormalizable problems is that the low frequency background may well correspond to a large solution. Is this fatal to the renormalized solution? The answer to that, also originating in [32, 33], is that there may be a second hidden source of smallness, namely a large frequency gap between the high frequency wave and the low frequency background it evolves on.

8. Morawetz estimates  The outcome of the ideas above is a dichotomy between dispersion and scattering on one hand, and very specific concentration patterns, e.g., solitons, self-similar solutions on the other hand. The Morawetz estimates, first appearing in this role in the work of Grillakis [6], are a convenient and relatively simple tool to eliminate such concentration scenarios.

We now recall some earlier developments on geometric wave equations related to the present paper. We start our discussion with the (MKG) problem above the scaling critical regularity. In the two and three dimensional cases, which are energy subcritical, global regularity of sufficiently regular solutions was shown in the early works [4, 5, 23]. The former two in fact handled the more general Yang-Mills-Higgs system. In dimension \(d=3\), this result was greatly improved by [11], which established global well-posedness for any finite energy data. In this work, the quadratic null structure of (MKG) in the Coulomb gauge was uncovered and used for the first time. Subsequent developments were made by [3] and more recently [22], where an essentially optimal local well-posedness result was established. An important observation in [22] is that (MKG) in Coulomb gauge exhibits a secondary multilinear cancellation feature. The related paper [7] is concerned with global well-posedness of the same problem at low regularity. We also mention the work [30], in which finite energy global well-posedness was established in the Lorenz gauge. In the higher dimensional case \(d \ge 4\), an essentially optimal local well-posedness result for a model problem closely related to (MKG) was obtained in [15]. This was followed by further refinements in [29, 31].

The progress for the closely related Yang-Mills system (YM) in the subcritical regularity has largely paralleled that of (MKG), at least for small data. Indeed, (YM) exhibits a null structure in the Coulomb gauge which is very similar to (MKG). In particular, the aforementioned work [15] is also relevant for the small data problem for (YM) in the Coulomb gauge at an essentially optimal regularity.

However, a new difficulty arises in the large data\(^{2}\) problem for (YM): namely, the gauge transformation law is nonlinear due to the non-abelian gauge group. In particular, gauge transformations into the Coulomb gauge obey a nonlinear elliptic equation, for which no suitable large data regularity theory is available. Note, in comparison, that such gauge transformations obey a linear Poisson equation in the case of (MKG). In [12], where finite energy global well-posedness of the \(3+1\) dimensional (YM) problem was proved, this issue was handled by localizing in space-time via the finite speed of propagation to gain smallness, and then working in local Coulomb gauges. An alternative, more robust approach without space-time localizations to the same problem has been put forth by the first author in [24, 25], inspired by [36–40]. The idea is to use an associated geometric flow, namely the Yang-Mills heat flow, to select a global-in-space Coulomb-like gauge for data of any size.

Before turning to the (MKG) and (YM) problems at critical regularity, we briefly recall some recent developments on the wave map equation (WM), where many of the methods we implement here have their roots. We confine our discussion to the energy critical problem in \(2+1\) dimensions, which is both the most difficult and the most relevant to our present paper. For the small data problem, global well-posedness was established in [34, 35, 42]. More recently, the threshold theorem for large data wave maps, which asserts that global well-posedness and scattering hold below the ground state energy, was proved in [32, 33] when the target \({{\mathcal {N}}}\) is a general compact manifold. Independently, global well-posedness and scattering for (WM) were established in [17] when \({{\mathcal {N}}}\) is the hyperbolic plane and in [36–40] when \({{\mathcal {N}}}\) is a general hyperbolic space. In both cases, \({{\mathcal {N}}}\) is a non-compact manifold which does not admit any nontrivial finite energy harmonic maps from \({{\mathbb {R}}}^{2}\); moreover, an a-priori bound of the scattering norm in terms of the energy followed immediately from the proofs of [17, 36–40]. In fact, [17] also established a nonlinear profile decomposition for bounded (in energy) sequences of wave maps. See also [20] for a sharp refinement in the case of a two-dimensional target, taking into account an additional topological invariant (namely, the degree of the wave map). Our present strategy was strongly influenced by [32, 33], which can be seen as the first predecessor of this work.

Despite the many similarities, there is a key structural difference between (WM) on the one hand and (MKG), (YM) on the other, whose understanding is crucial for making progress on the latter two problems. Roughly speaking, all three equations can be written in a form where the main ‘dynamic variables’, which we denote by \(\phi \), obey a possibly nonlinear gauge covariant wave equation \(\Box _{A} \phi = \cdots \), and the associated curvature F[A] is determined by \(\phi \). In the case of (WM), this dependence is simply algebraic, whereas for (MKG) and (YM) the curvature F[A] obeys a wave equation with a nonlinearity depending on \(\phi \). This difference manifests in the renormalization procedure for each equation: For (WM) it suffices to use a physical space gauge transformation, whereas for (MKG) and (YM) it is necessary to use a microlocal (more precisely, pseudo-differential) gauge transformation that exploits the fact that A solves a wave equation in a suitable gauge.

The first (MKG) renormalization argument appeared in [28], in which global regularity of (MKG) for small critical Sobolev data was established in dimensions \(d \ge 6\). This work was followed by a similar high dimensional result for (YM) in [18]. Finally, the small data result in the energy critical dimension \(4+1\) was obtained in [19], which may be viewed as the second direct predecessor to the present work. In particular we borrow a good deal of notation, ideas and estimates from [19].

We end our introduction with a few remarks on the energy critical (YM) problem in \(4+1\) dimensions, which is a natural next step after the present work. The issue of non-abelian gauge group for the large data problem has already been discussed. Another important difference between (MKG) and (YM) in \(4+1\) dimensions is that the latter problem admits instantons, which are nontrivial static solutions with finite energy. Therefore, in analogy with (WM), it is reasonable to put forth the threshold conjecture for the energy critical (YM) problem, namely that global well-posedness and scattering hold below the energy of the first instanton. Finally, (YM) is more ‘strongly coupled’ as a system compared to (MKG), in the sense that the connection A itself obeys a covariant wave equation. This feature seems to necessitate a more involved renormalization procedure compared to (MKG).

2 Overview of the proof I: summary of the first two papers

The basic strategy for proving Theorem 1.3 is by contradiction, following the scheme successfully developed in [32, 33] in the setting of energy critical wave maps. In the first two papers of the sequence [26, 27] we establish successively stronger continuation and scattering criteria, whose contrapositives provide precise information about the nature of a finite time blow-up (i.e., failure of global well-posedness) or non-scattering. In the present paper, we use this information, as well as conservation laws and Morawetz-type monotonicity formulae for (MKG), to perform a blow-up analysis and show that the failure of Theorem 1.3 implies the existence of a nontrivial finite energy stationary or self-similar solution to (MKG). Since such a solution does not exist (see Sect. 7 below), Theorem 1.3 must hold.

In this section we review the main results and ideas of the earlier two papers in the sequence [26, 27]. In Sect. 3 we summarize the argument given in the present paper. To steer away from unnecessary technical details we only consider smooth data and solutions; however we remark that the results also apply to merely finite energy data and admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions. For the notation, we refer to Sect. 4.

2.1 Local well-posedness in the global Coulomb gauge and non-concentration of energy

The main result of the first paper [26] of the sequence is local well-posedness of (MKG) in the global Coulomb gauge with a lower bound on the lifespan in terms of the energy concentration scale

$$\begin{aligned} r_{c} = r_{c}(E)[a, e, f, g] := \sup \left\{ r > 0 : \forall x \in {{\mathbb {R}}}^{4},{{\mathcal {E}}}_{B_{r}(x)}[a, e, f, g] < \delta _{0}(E, \epsilon _{*}^{2})\right\} , \end{aligned}$$

where \(B_{r}(x)\) denotes the open ball of radius r with center \(x\), \(\delta _{0}(E, \epsilon _{*}^{2}) = c^2 \epsilon _{*}^{2} \min \left\{ 1, \epsilon _{*}^{4} E^{-2}\right\} \), \(c\) is an absolute constant and \(\epsilon _{*}^{2}\) is the threshold for the small energy global well-posedness result in [19] (see Theorem 4.1). A simplified version of the main theorem of [26] is as follows:

Theorem 2.1

Given any \(E > 0\) let \(\delta _{0}(E,\epsilon _{*}^{2}) > 0\) be as above. Let \((a, e, f, g)\) be a smooth finite energy initial data set for (MKG) satisfying the global Coulomb gauge condition \(\sum _{j} \partial _{j} a_{j} = 0\). Then there exists a unique smooth solution \((A, \phi )\) to (MKG) in the global Coulomb gauge on \([-r_{c}, r_{c}] \times {{\mathbb {R}}}^{4}\).

Theorem 2.1 implies that finite time blow-up is always accompanied by concentration of energy (i.e., \(r_{c} \rightarrow 0\) at the end of the maximal lifespan). For a precise statement, see Theorem 4.3. In what follows we explain the ideas involved in the proof of local existence, which lies at the heart of Theorem 2.1.

2.1.1 Strategy of proof in model cases

For many other semi-linear equations, such as \(\Box u = \pm u^{\frac{d+2}{d-2}}\) or the wave map equation, a result analogous to Theorem 2.1 is a rather immediate consequence of small energy global well-posedness and finite speed of propagation. Roughly speaking, the proof (of local existence) proceeds in the following three steps:

  • Step A. One truncates the initial data locally in space to achieve small energy.

  • Step B. By small energy global well-posedness, the truncated data give rise to global solutions. Restricting these global solutions to the domain of dependence of the truncated regions, one obtains a family of local-in-spacetime solutions that agree with each other on the intersection of their domains by finite speed of propagation.

  • Step C. One patches together these solutions to obtain a local-in-time solution to the original initial data.

In particular, the lifespan of the solution constructed by this scheme depends on the size of spatial truncation in Step A, which in turn is dictated by the energy concentration scale \(r_{c}\) of the initial data.

2.1.2 Non-locality of (MKG) in the global Coulomb gauge

When carrying out the above strategy in our setting, however, we face difficulties arising from non-local features of (MKG) in the global Coulomb gauge. One source of non-locality is the Gauss (or the constraint) equation

$$\begin{aligned} \partial ^{\ell } e_{\ell } = \mathrm {Im}(f \overline{g}), \end{aligned}$$
(2.1)

which must be satisfied by every (MKG) initial data set. Another source is the presence of the elliptic equation for \(A_{0}\) in the global Coulomb gauge (cf. (2.4)); in particular, finite speed of propagation fails in the global Coulomb gauge.
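To see where the elliptic equation for \(A_{0}\) comes from, one can expand the \(\nu = 0\) component of (MKG), as written in Sect. 1.1, under the Coulomb condition (1.5). The following is only a sketch derived from those two formulas; the form used in the rest of the paper is the one recorded in (2.4).

$$\begin{aligned} \partial ^{\mu } F_{0 \mu } = \partial ^{\ell } (\partial _{t} A_{\ell } - \partial _{\ell } A_{0}) = \partial _{t} (\partial ^{\ell } A_{\ell }) - \triangle A_{0} = - \triangle A_{0}, \qquad \hbox {hence} \qquad - \triangle A_{0} = \mathrm {Im}(\phi \overline{\mathbf{D}_{t} \phi }). \end{aligned}$$

At each fixed time, \(A_{0}\) is thus recovered by inverting the Laplacian on all of \({{\mathbb {R}}}^{4}\), so its value at a point depends on \(\phi \) and \(\mathbf{D}_{t} \phi \) over the entire time slice. This is the mechanism behind the failure of finite speed of propagation in this gauge.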

In the remainder of this subsection, we give an overview of the techniques developed in [26] for overcoming these issues, and explain how these can be used to essentially execute Steps A-C above to obtain Theorem 2.1 from the small energy global well-posedness theorem proved in [19] (see Theorem 4.1).

2.1.3 Execution of step A: initial data excision and gluing

Consider the problem of truncating a (MKG) initial data set\(^{3}\) \((a, e, f, g)\) to a ball B. A naive way to proceed would be to apply a smooth cutoff to each of a, e, f, g. However, integrating the Gauss equation (2.1) by parts over balls of large radius, we see that \(e_{j}\) must in general be nontrivial on the boundary spheres outside B, even if f and g are supported in B.

Instead, the idea of initial data excision and gluing\(^{4}\) is as follows: rather than just excising the unwanted part, we glue it to another initial data set (i.e., solution to the Gauss equation) which has an explicit description, so that the Gauss equation is still satisfied. For example, in the exterior of a ball B we may glue to the data

$$\begin{aligned} \left( e_{(q)j} = \frac{q}{2 \pi ^{2}} \frac{x_{j}}{\vert x\vert ^{4}}, 0, 0, 0\right) \end{aligned}$$

with an appropriate q. Note that \(e_{(q)}\) is precisely the electric field of an electric monopole of charge q placed at the origin.
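The two properties that make \(e_{(q)}\) suitable for gluing can be checked by hand. The following sketch uses the fact that the unit sphere in \({{\mathbb {R}}}^{4}\) has surface area \(2 \pi ^{2}\), and writes \(\mathrm {d}S\) for the surface measure on \(\partial B_{r}(0)\) (notation introduced here).

$$\begin{aligned} \partial _{j} \Big ( \frac{x_{j}}{\vert x\vert ^{4}} \Big )&= \frac{4}{\vert x\vert ^{4}} - 4 \frac{x_{j} x_{j}}{\vert x\vert ^{6}} = 0 \quad \hbox { for } x \ne 0, \\ \int _{\partial B_{r}(0)} e_{(q) j} \frac{x_{j}}{\vert x\vert } \, \mathrm {d}S&= \frac{q}{2 \pi ^{2}} \, r^{-3} \cdot 2 \pi ^{2} r^{3} = q. \end{aligned}$$

Hence \(e_{(q)}\) solves the Gauss equation (2.1) with \(f = g = 0\) away from the origin, and its flux through every sphere centered at the origin equals q. By the divergence theorem, the flux of \(e\) through a sphere enclosing the support of f and g equals \(\int \mathrm {Im}(f \overline{g}) \, \mathrm {d}x\), which suggests (as a reading of the text, not a statement taken from it) that q is chosen to match this quantity.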

Using this idea we may truncate \((a, e, f, g)\) to balls to make the energy sufficiently small. The minimum size of these balls, which later dictates the lifespan of the solution, can be chosen to be proportional to the energy concentration scale. This procedure is our analogue of step A.

2.1.4 Execution of step B: geometric uniqueness of admissible solution to (MKG)

Though finite speed of propagation fails for (MKG) in certain gauges such as the global Coulomb gauge, it is still true up to gauge transformations. We refer to this statement as local geometric uniqueness for (MKG), and use it as a substitute for the usual finite speed of propagation property.

Applying a suitable gauge transformation to each truncated initial data set to impose the global Coulomb gauge condition, we are in position to apply the small energy global well-posedness theorem (Theorem 4.1) and construct a family of global smooth solutions. Restricting these solutions to the domain of dependence of the truncated regions and appealing to local geometric uniqueness, we obtain local-in-spacetime Coulomb solutions (i.e., solutions obeying \(\partial ^{\ell } A_{\ell } = 0\) on their domains) which are gauge equivalent to each other on the intersection of their domains. We refer to such solutions as compatible pairs\(^{5}\); geometrically, these are precisely local descriptions of a globally defined pair of a connection and a section on local trivializations of the bundle L.

2.1.5 Execution of step C: patching local Coulomb solutions

The final task is to patch together the local-in-spacetime descriptions of a solution (i.e., compatible pairs) to produce a global-in-space solution \((A, \phi )\) in the global Coulomb gauge. We first adapt a patching argument of Uhlenbeck [43] to produce a single global-in-space solution \((A', \phi ')\) obeying an appropriate \(S^1\) norm bound. The fact that a gauge transformation \(\chi \) between Coulomb gauges obeys the Laplace equation \(\triangle \chi = 0\), and hence possesses improved regularity, is important for this step. The solution \((A', \phi ')\) obtained by this patching process is not necessarily in the global Coulomb gauge; it is however approximately Coulomb (i.e., \(\partial ^{\ell } A'_{\ell }\) obeys an improved bound), since it arose from patching together local Coulomb solutions. It is thus possible to find a nicely behaved gauge transformation into the exact global Coulomb gauge, leading us to the desired local-in-time solution.
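To see why the Laplace equation appears, recall that a gauge transformation changes the connection 1-form by an exact form. The following one-line computation is a sketch under the convention \(\tilde{A} = A - \mathrm {d}\chi \); the sign is an assumption made here for concreteness and does not affect the conclusion. If both A and \(\tilde{A}\) obey the Coulomb condition on a common domain, then

$$\begin{aligned} 0 = \partial ^{\ell } \tilde{A}_{\ell } = \partial ^{\ell } A_{\ell } - \partial ^{\ell } \partial _{\ell } \chi = - \triangle \chi . \end{aligned}$$

Thus \(\chi \) is harmonic on each time slice of the common domain, and interior estimates for harmonic functions control its higher derivatives on slightly smaller domains.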

2.2 Continuation of energy dispersed solutions

We now describe the content of [27]. The main theorem of [27] is a continuation/scattering criterion in the global Coulomb gauge for a large energy solution \((A, \phi )\) to (MKG) in terms of its energy dispersion \(ED[\phi ](I)\), defined as

$$\begin{aligned} ED[\phi ](I) = \sup _{k} \Big ( 2^{-k} \Vert P_{k} \phi \Vert _{L^{\infty }_{t,x}(I \times {{\mathbb {R}}}^{4})} + 2^{-2k} \Vert \partial _{t} P_{k} \phi \Vert _{L^{\infty }_{t,x}(I \times {{\mathbb {R}}}^{4})} \Big )\quad \quad \end{aligned}$$
(2.2)

for any time interval \(I \subseteq {{\mathbb {R}}}\). A simple version is as follows:

Theorem 2.2

Given any \(E > 0\), there exist positive numbers \(\epsilon = \epsilon (E) > 0\) and \(F = F(E)\) such that the following holds. Let \((A, \phi )\) be a smooth solution to (MKG) in the global Coulomb gauge (MKG-CG) on \(I \times {{\mathbb {R}}}^{4}\) with energy \(\le E\). If \(ED[\phi ](I) \le \epsilon (E)\), then the following a-priori \(S^{1}\) norm bound holds:

$$\begin{aligned} \Vert A_{0}\Vert _{Y^{1}[I]} + \Vert A_{x}\Vert _{S^{1}[I]} + \Vert \phi \Vert _{S^{1}[I]} \le F(E). \end{aligned}$$
(2.3)

Moreover, \((A, \phi )\) extends as a smooth solution past finite endpoints of I.

Theorem 2.2 is analogous to the main result in [32] for energy critical wave maps. Thanks to the a-priori bound (2.3), the solution \((A, \phi )\) scatters towards each infinite endpoint in the sense of Remark 1.4. For a more precise formulation, see Theorems 4.7 and 4.8.
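As a consistency check on the weights in (2.2), one can verify that the energy dispersion is invariant under dyadic rescaling. The following is a sketch under two assumptions that are not spelled out above: \(P_{k}\) is a Littlewood-Paley projection with symbol \(\eta (2^{-k} \xi )\) for a fixed bump function \(\eta \), with k ranging over \({{\mathbb {Z}}}\), and \(\phi _{(m)}(t, x) := 2^{-m} \phi (2^{-m} t, 2^{-m} x)\) for \(m \in {{\mathbb {Z}}}\) denotes the energy critical rescaling (the notation \(\phi _{(m)}\) is introduced here only for illustration). Then

$$\begin{aligned} P_{k} \phi _{(m)}(t, x) = 2^{-m} (P_{k+m} \phi )(2^{-m} t, 2^{-m} x), \quad \partial _{t} P_{k} \phi _{(m)}(t, x) = 2^{-2m} (\partial _{t} P_{k+m} \phi )(2^{-m} t, 2^{-m} x), \end{aligned}$$

so that

$$\begin{aligned} 2^{-k} \Vert P_{k} \phi _{(m)}\Vert _{L^{\infty }_{t,x}(2^{m} I \times {{\mathbb {R}}}^{4})}&= 2^{-(k+m)} \Vert P_{k+m} \phi \Vert _{L^{\infty }_{t,x}(I \times {{\mathbb {R}}}^{4})}, \\ 2^{-2k} \Vert \partial _{t} P_{k} \phi _{(m)}\Vert _{L^{\infty }_{t,x}(2^{m} I \times {{\mathbb {R}}}^{4})}&= 2^{-2(k+m)} \Vert \partial _{t} P_{k+m} \phi \Vert _{L^{\infty }_{t,x}(I \times {{\mathbb {R}}}^{4})}. \end{aligned}$$

Taking the supremum over \(k \in {{\mathbb {Z}}}\) gives \(ED[\phi _{(m)}](2^{m} I) = ED[\phi ](I)\), consistent with the scale invariance of the energy.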

We now describe the main ideas of the proof of Theorem 2.2. In what follows, we only consider solutions to (MKG) in the global Coulomb gauge.

2.2.1 Decomposition of the nonlinearity

We begin by describing the structure of the Maxwell-Klein-Gordon system in the global Coulomb gauge (MKG-CG), which takes the form

$$\begin{aligned} \left\{ \begin{aligned} \triangle A_{0} =\,&\mathrm {Im}(\phi \overline{\partial _{t} \phi })+ \hbox {(cubic terms)} \\ \Box A_{j} =\,&{{\mathcal {P}}}_{j} \mathrm {Im}(\phi \overline{\partial _{x} \phi }) + \hbox {(cubic terms)} \\ \Box \phi = \,&- 2 i A_{\mu } \partial ^{\mu } \phi + \hbox {(cubic terms)} \end{aligned} \right. \end{aligned}$$
(2.4)

where \({{\mathcal {P}}}\) is the Leray \(L^{2}\)-projection to the space of divergence-free vector fields. We omitted cubic terms as they are strictly easier to handle. The elliptic equation for \(A_{0}\) allows us to obtain the appropriate \(Y^{1}\) bound once we establish \(S^{1}\) bounds for \(A_{x}\) and \(\phi \); henceforth we focus on the wave equations for \(A_{x}\) and \(\phi \).
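The leading term in the \(\phi \)-equation of (2.4) can be read off by expanding the covariant wave operator. The following is a sketch, assuming the convention \(\Box = \partial ^{\mu } \partial _{\mu }\) and writing the scalar equation of (MKG) as \(\mathbf{D}^{\mu } \mathbf{D}_{\mu } \phi = 0\) with \(\mathbf{D}_{\mu } = \partial _{\mu } + i A_{\mu }\):

$$\begin{aligned} \mathbf{D}^{\mu } \mathbf{D}_{\mu } \phi = \Box \phi + 2 i A_{\mu } \partial ^{\mu } \phi + i (\partial ^{\mu } A_{\mu }) \phi - A^{\mu } A_{\mu } \phi . \end{aligned}$$

In the Coulomb gauge \(\partial ^{\ell } A_{\ell } = 0\), we have \(\partial ^{\mu } A_{\mu } = - \partial _{t} A_{0}\), and therefore

$$\begin{aligned} \Box \phi = - 2 i A_{\mu } \partial ^{\mu } \phi + i (\partial _{t} A_{0}) \phi + A^{\mu } A_{\mu } \phi . \end{aligned}$$

Under these conventions, the last two terms are the ones abbreviated as cubic terms in the \(\phi \)-equation.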

As in the case of small energy global well-posedness [19], the null structure of (MKG) in the global Coulomb gauge plays an essential role in the proof of Theorem 2.2. All quadratic terms in the wave equations exhibit null structure, i.e., cancellation in the angle between inputs in Fourier space. There is also a secondary multilinear null structure in the term \(2 i A_{\mu } \partial ^{\mu } \phi \) which arises by plugging in the equations for \(A_{0}, A_{j}\). All of this structure is necessary for controlling the \(S^{1}\) norm of \((A, \phi )\), but it is by no means sufficient as we discuss below.

2.2.2 Renormalization for large energy

Even in the case of small energy global well-posedness [19], the null structure alone is not enough to bound the \(S^{1}\) norm of \((A, \phi )\) due to the paradifferential term in the \(\phi \)-equation

$$\begin{aligned} - \sum _{k} 2i P_{<k} A^{\mathrm {free}} \cdot \partial _{x} P_{k} \phi . \end{aligned}$$

Here \(A_{j}^{\mathrm {free}}\) is the free wave evolution of \(A_{j}[0] := (A_{j}, \partial _{t} A_{j}) \!\upharpoonright _{\left\{ t = 0\right\} }\). As in [19, 28], we handle this term by a renormalization argument. More precisely, we treat the problematic term as a part of the linear operator and construct a paradifferential parametrix. The construction in [19, 28], however, relied on smallness of the energy, which we lack in our setting. Instead we consider the linear operator with a frequency gap m

$$\begin{aligned} \Box _{A^{\mathrm {free}}}^{p, m} \psi := \Box \psi + \sum _{k} 2 i P_{<k-m} A_{x}^{\mathrm {free}} \cdot \partial _{x} P_{k}\psi , \end{aligned}$$

and gain smallness by taking m sufficiently large. This idea is akin to the gauge renormalization procedure for wave maps in [32], where a large frequency gap was used to control the large paradifferential term.

2.2.3 Role of energy dispersion

We now describe the role of small energy dispersion \(ED[\phi ]\). Roughly speaking, small energy dispersion allows us to gain in transversal balanced frequency interactions. This complements the gain in parallel interactions, due to the null condition, and the gain in the \(high \times high \rightarrow low\) interactions due to the favorable frequency balance. For instance, by interpolation with (non-sharp) Strichartz norms controlled by the \(S^{1}\) norm, we have (see Footnote 6)

$$\begin{aligned} \Vert P_{k}(P_{k_{1}} \phi P_{k_{2}} \psi )\Vert _{L^{2}_{t,x}(I \times {{\mathbb {R}}}^{4})} \lesssim 2^{-\frac{1}{2} \min \left\{ k_{1}, k_{2}\right\} } ED[\phi ]^{\theta } \Vert P_{k_{1}} \phi \Vert _{S^{1}[I]}^{1-\theta } \Vert P_{k_{2}} \psi \Vert _{S^{1}[I]},\nonumber \\ \end{aligned}$$
(2.5)

which is useful when \(k_{1} = k+O(1), k_{2} = k + O(1)\) and \(\phi , \psi \) are at a large angle so that the output modulation is high.

To see how this gain is useful, we return to the full nonlinear system (MKG) in the global Coulomb gauge. Upon decomposing the inputs and output into Littlewood-Paley pieces, most of the nonlinearity exhibits an off-diagonal exponential decay in frequency. For example, the nonlinearity in the \(A_{x}\)-equation obeys

$$\begin{aligned} \Vert P_{k} {{\mathcal {P}}}_{x} ( P_{k_{1}} \phi \partial _{x} \overline{P_{k_{2}} \phi })\Vert _{N[I]} \lesssim 2^{-\delta (\vert k-k_{1}\vert + \vert k - k_{2}\vert )} \Vert P_{k_{1}} \phi \Vert _{S^{1}[I]} \Vert P_{k_{2}} \phi \Vert _{S^{1}[I]}. \end{aligned}$$

Introducing again a large frequency gap m, we gain smallness except when \(k_{1} = k + O_{m}(1)\) and \(k_{2} = k + O_{m}(1)\). Furthermore, thanks to the null structure, we also gain extra smallness except for angled interaction; then we are precisely in position to use \(ED[\phi ]\). In conclusion, we gain smallness from \(ED[\phi ] \le \epsilon \) for the nonlinearity in the \(A_{x}\)-equation.
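The role of the frequency gap can be illustrated by a geometric sum. Schematically, and ignoring how the \(S^{1}\) norms of the individual pieces are summed, the off-diagonal factor \(2^{-\delta \vert k - k_{1}\vert }\) restricted to \(\vert k - k_{1}\vert \ge m\), for an integer \(m \ge 1\) and fixed \(\delta > 0\), contributes

$$\begin{aligned} \sum _{k_{1} : \vert k - k_{1}\vert \ge m} 2^{-\delta \vert k - k_{1}\vert } = 2 \sum _{j \ge m} 2^{-\delta j} = \frac{2 \cdot 2^{-\delta m}}{1 - 2^{-\delta }}, \end{aligned}$$

which tends to zero as \(m \rightarrow \infty \). The same holds for the sum over \(k_{2}\), so only the nearly diagonal range \(k_{1}, k_{2} = k + O_{m}(1)\) fails to be small for this reason alone.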

2.2.4 Linear well-posedness of \(\Box _{A} \psi = f\)

Unfortunately the a-priori estimate (2.3) does not close yet, as there exists a nonlinear term in the \(\phi \)-equation with no off-diagonal exponential decay. This part is precisely the \(low \times high \rightarrow high\) frequency and \(high \times low \rightarrow low\) modulation interaction (see Footnote 7) in the term \(-2 i A \cdot \partial _{x} \phi \), i.e.,

$$\begin{aligned} -2 i \sum _{\begin{array}{c} k_{1} < k \\ k_{2} = k+O(1) \end{array}} \sum _{j < k_{1}} P_{k} Q_{<j}( P_{k_{1}} Q_{j} A \cdot \partial _{x} P_{k_{2}} Q_{<j} \phi ). \end{aligned}$$
(2.6)

Nevertheless, this term has the redeeming feature that it can be bounded by a divisible norm: given any \(\varepsilon > 0\), the interval I can be split into smaller pieces \(I_{k}\) on each of which the N norm of the above expression is bounded by \(\varepsilon ^{2} \Vert P_{k_{2}} \phi \Vert _{S^{1}[I]}\), where the number of such intervals is \(O_{\Vert \phi \Vert _{S^{1}[I]}, \varepsilon }(1)\). For a solution \((A, \phi )\) to (MKG), this observation leads to linear well-posedness of the magnetic wave equation \(\Box _{A} \psi = f\) (see Footnote 8) with the bound

$$\begin{aligned} \Vert \psi \Vert _{S^{1}[I]} \lesssim _{\Vert (A_{x}, \phi )\Vert _{S^{1}[I]}} \Vert \psi [0]\Vert _{\dot{H}^{1}_{x} \times L^{2}_{x}} + \Vert f\Vert _{N[I]}, \end{aligned}$$
(2.7)

where \(\psi [0] := (\psi , \partial _{t} \psi ) \!\upharpoonright _{\left\{ t = 0\right\} }\). The bound (2.7) allows us to set up an induction on energy scheme to establish (2.3), which we now explain.

2.2.5 Induction on energy

The starting point of our induction is the small energy global well-posedness theorem [19], which implies that (2.3) holds with \(F(E) = C \sqrt{E}\) when the energy E is sufficiently small. Our goal is to show the existence of a non-increasing positive function \(c_{0}(\cdot )\) on the whole interval \([0, \infty )\) such that if the conclusion of Theorem 2.2 holds for energy up to E, then it also holds for energy up to \(E + c_{0}(E)\). Monotonicity of \(c_{0}(\cdot )\) implies that it has a uniform positive lower bound on every finite interval; thus the continuous induction works for all energy.
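The last step can be made quantitative as follows. In this sketch, \(E_{0} > 0\) denotes an energy below which the small energy theorem applies and \(E_{*} > E_{0}\) is an arbitrary target energy; both symbols, and the sequence \(E_{n}\), are introduced here only for illustration. Define \(E_{n+1} := E_{n} + c_{0}(E_{n})\). Since \(c_{0}\) is non-increasing, \(c_{0}(E_{n}) \ge c_{0}(E_{*})\) whenever \(E_{n} \le E_{*}\), and an induction on n gives

$$\begin{aligned} E_{n} \ge \min \left\{ E_{*}, \, E_{0} + n \, c_{0}(E_{*})\right\} . \end{aligned}$$

Hence \(E_{n} \ge E_{*}\) as soon as \(n \ge (E_{*} - E_{0}) / c_{0}(E_{*})\), so the conclusion of Theorem 2.2 reaches any prescribed energy after finitely many steps.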

In what follows, we describe the construction of \(c_{0}(E), F := F(E + c_{0}(E))\) and \(\epsilon := \epsilon (E + c_{0}(E))\) under the induction hypothesis that Theorem 2.2 holds up to energy E for some F(E) and \(\epsilon (E)\). For the scheme to work, it is crucial to let \(c_{0}(E)\) depend only on E and not on F(E) or \(\epsilon (E)\). On the other hand, F and \(\epsilon \) may depend on F(E) and \(\epsilon (E)\).

Let \((A, \phi )\) be a solution on \(I \times {{\mathbb {R}}}^{4}\) with energy \( E + c_{0}(E)\) and \(ED[\phi ] \le \epsilon \). To prove (2.3) for \((A, \phi )\), we compare it with another solution \((\tilde{A}, \tilde{\phi })\) with frequency truncated initial data (see Footnote 9)

$$\begin{aligned} (\tilde{A}_{j}[0], \tilde{\phi }[0]) = (P_{\le k^{*}} A_{j}[0], P_{\le k^{*}} \phi [0]) \end{aligned}$$

where the ‘cut frequency’ \(k^{*} \in {{\mathbb {R}}}\) is chosen so that \((\tilde{A}, \tilde{\phi })\) has energy E. By taking \(c_{0}(E)\) and \(\epsilon \) sufficiently small, we aim for the following two goals:

  • Goal A. The energy dispersion \(ED[\tilde{\phi }](I)\) is sufficiently small so that the induction hypothesis applies to \((\tilde{A}, \tilde{\phi })\). Hence

    $$\begin{aligned} \Vert \tilde{A}_{0}\Vert _{Y^{1}[I]} + \Vert \tilde{A}_{x}\Vert _{S^{1}[I]} + \Vert \tilde{\phi }\Vert _{S^{1}[I]} \le F(E). \end{aligned}$$
    (2.8)
  • Goal B. The difference \((B^{high}, \psi ^{high}) := (A_{\mu } - \tilde{A}_{\mu }, \phi - \tilde{\phi })\) obeys

    $$\begin{aligned} \Vert B^{high}_{0}\Vert _{Y^{1}[I]} + \Vert B^{high}_{x}\Vert _{S^{1}[I]} + \Vert \psi ^{high}\Vert _{S^{1}[I]} \le C_{E, F(E)}. \end{aligned}$$
    (2.9)

Adding (2.8) and (2.9), the desired bound (2.3) would follow with \(F := F(E) + C_{E, F(E)}\).

Goal A is accomplished by showing that if \(\epsilon \) is sufficiently small, then \((\tilde{A}, \tilde{\phi })\) is arbitrarily close (i.e., within \(\epsilon ^\delta \)) to the frequency truncated solution \((P_{\le k^{*}} A, P_{\le k^{*}} \phi )\) which has small energy dispersion. For Goal B, the idea is to view \((B^{high}, \psi ^{high})\) as a perturbation around \((\tilde{A}, \tilde{\phi })\). To ensure that \(c_{0}(E)\) is independent of F(E), we rely on two observations: First, by the weak divisibility (see Footnote 10) of the \(S^{1}\) norm, the interval I can be split into \(O_{F(E)}(1)\) many subintervals \(I_{k}\) on each of which we have

$$\begin{aligned} \Vert \tilde{A}_{0}\Vert _{Y^{1}[I_{k}]} + \Vert \tilde{A}_{x}\Vert _{S^{1}[I_{k}]} + \Vert \tilde{\phi }\Vert _{S^{1}[I_{k}]} \lesssim _{E} 1. \end{aligned}$$
(2.10)

Second, by conservation of energy for \((A, \phi )\) and \((\tilde{A}, \tilde{\phi })\), as well as the approximation \((\tilde{A}, \tilde{\phi }) \approx (P_{\le k^{*}} A, P_{\le k^{*}} \phi )\), it follows that the \(\dot{H}^{1}_{x} \times L^{2}_{x}\) norm of the data for \((B^{high}, \psi ^{high})\) can be reinitialized to be of size \(\lesssim c_{0}(E)\) on each \(I_{k}\).

With these two observations in hand, we claim that \((B^{high}, \psi ^{high})\) obeys the following \(S^{1}\) norm bound on each \(I_{k}\):

$$\begin{aligned} \Vert B^{high}_{0}\Vert _{Y^{1}[I_{k}]} + \Vert B^{high}_{x}\Vert _{S^{1}[I_{k}]} + \Vert \psi ^{high}\Vert _{S^{1}[I_{k}]} \lesssim _{E} c_{0}(E) + O_{F}\left( \epsilon ^{\delta }\right) .\nonumber \\ \end{aligned}$$
(2.11)

Indeed, in the equation for \((B^{high}, \psi ^{high})\), all nonlinear terms in \((B^{high}, \psi ^{high})\) can be handled by taking \(c_{0}(E) \ll _{E} 1\) and \(\epsilon \ll _{F} 1\). Furthermore, exploiting small energy dispersion, all linear terms can be made appropriately small except \(- 2 i A_{\mu } \partial ^{\mu } \psi ^{high}\). Nevertheless, the \(S^{1}\) norm of \((A, \phi )\) on I can be assumed to be \(\lesssim _{E} 1\) by (2.10) and a bootstrap assumption (see Footnote 11); hence we can group this term with \(\Box \) and use (2.7) (linear well-posedness of \(\Box _{A} \, \psi ^{high}\)) to arrive at (2.11). Goal B now follows by summing up this bound on \(O_{F(E)}(1)\) intervals.

3 Overview of the proof II: content of the present paper

This section is a continuation of the previous section. Section 3.1 provides an overview of the argument in the present paper, thereby completing the summary of our entire proof of Theorem 1.3. In Sect. 3.2, we give a brief comparison of our approach with that of the recent work [16] of Krieger-Lührmann. Finally, Sect. 3.3 contains an outline of the structure of the remainder of the paper.

3.1 Blow-up analysis

Here we give an overview of the final blow-up analysis of (MKG), which is carried out in the present paper. This part is analogous to [33] for energy critical wave maps. We refer to Sect. 4 for the notation used below.

3.1.1 Main ingredients

In addition to the continuation/scattering criteria established in [26, 27] (see Theorems 2.1 and 2.2), our blow-up analysis of (MKG) relies on the following three key ingredients:

  • Monotonicity formula for (MKG)  Besides the conservation of energy, we use the following monotonicity (or Morawetz) formula for (MKG). Let \(\rho := \sqrt{t^{2} - \vert x\vert ^{2}}\) and

    $$\begin{aligned} X_{0} := \frac{1}{\rho } (t \partial _{t} + x \cdot \partial _{x}) \end{aligned}$$

    be the normalized scaling vector field. To avoid the degeneracy of \(\rho \) on \(\partial C = \left\{ t = \vert x\vert \right\} \), we also define the translates

    $$\begin{aligned} \rho _{\varepsilon } := \sqrt{(t+\varepsilon )^{2} - \vert x\vert ^{2}}, \quad X_{\varepsilon } := \frac{1}{\rho _{\varepsilon }} ((t + \varepsilon ) \partial _{t} + x \cdot \partial _{x}). \end{aligned}$$

    Given a smooth solution \((A, \phi )\) to (MKG) on the truncated cone \(C_{[\varepsilon , 1]}\) satisfying

    $$\begin{aligned} {{\mathcal {E}}}_{S_{1}}[A, \phi ] \le E, \quad {{\mathcal {F}}}_{\partial C_{[\varepsilon , 1]}} [A, \phi ] \le \varepsilon ^{\frac{1}{2}} E, \quad {{\mathcal {G}}}_{S_{1}} [\phi ] \le \varepsilon ^{\frac{1}{2}} E, \end{aligned}$$

    where \({{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}} := {{\mathcal {E}}}_{S_{t_{1}}} - {{\mathcal {E}}}_{S_{t_{0}}}\) is the energy flux through \(\partial C_{[t_{0}, t_{1}]}\) and \({{\mathcal {G}}}_{S_{t}} := \frac{1}{t}\int _{S_{t}} \vert \phi \vert ^{2}\), we have

    $$\begin{aligned} \begin{aligned}&\int _{S_{1}} {}^{(X_{\varepsilon })} P_{T}[A, \phi ] \, \mathrm {d}x + \iint _{C_{[\varepsilon , 1]}} \frac{1}{\rho _{\varepsilon }} \vert \iota _{X_{\varepsilon }} F\vert ^{2} + \frac{1}{\rho _{\varepsilon }} \left| \left( \mathbf{D}_{X_{\varepsilon }} + \frac{1}{\rho _{\varepsilon }}\right) \phi \right| ^{2} \, \mathrm {d}t \mathrm {d}x \\&\quad \lesssim \int _{S_{\varepsilon }} {}^{(X_{\varepsilon })} P_{T}[A, \phi ] \, \mathrm {d}x + E. \end{aligned} \end{aligned}$$
    (3.1)

    Here \({}^{(X_{\varepsilon })} P_{T}[A, \phi ]\) is a non-negative weighted energy density; we refer to Lemma 5.10 for an explicit formula for \({}^{(X_{\varepsilon })} P_{T}[A, \phi ]\). We remark that the entire right-hand side of (3.1) is bounded by \(\lesssim E\). Finiteness of the space-time integral term ‘breaks the scaling’ and implies that \(\iota _{X_{\varepsilon }} F\) and \((\mathbf{D}_{X_{\varepsilon }} + \frac{1}{\rho _{\varepsilon }}) \phi \) decay near the tip of the cone C.

  • Strong local compactness result  Given a sequence \((A^{(n)}, \phi ^{(n)})\) of solutions whose energy is uniformly small and \(\iota _{X} F^{(n)} \rightarrow 0\) and \((\mathbf{D}_{X}^{(n)} + b) \phi ^{(n)} \rightarrow 0\) in \(L^{2}_{t,x}\) on a space-time cube for some smooth time-like vector field X and smooth function b, we show that there exists a subsequence which converges strongly in (essentially) \(H^{1}_{t,x}\) in a smaller subcube; see Proposition 6.1 for more details. The proof relies on the initial data excision/gluing technique and the small energy global well-posedness theorem.

  • Triviality of finite energy stationary/self-similar solutions  We say that \((A, \phi )\) is a stationary solution to (MKG) if for some constant time-like vector field Y

    $$\begin{aligned} \iota _{Y} F = 0, \quad \mathbf{D}_{Y} \phi = 0, \end{aligned}$$

    and that \((A, \phi )\) is a self-similar solution if

    $$\begin{aligned} \iota _{X_{0}} F = 0, \quad \left( \mathbf{D}_{X_{0}} + \frac{1}{\rho }\right) \phi = 0. \end{aligned}$$

    Using the method of stress tensor, we show that every smooth stationary or self-similar solution with finite energy is trivial (i.e., \(F = 0\) and \(\phi = 0\)); see Propositions 7.1 and 7.2. We also establish a regularity result (Proposition 7.3), which says that all stationary and self-similar solutions arising from the above strong local compactness result (Proposition 6.1) are smooth.
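Two elementary facts about the vector field \(X_{0}\) from the first item above may help orient the reader. First, \(X_{0}\) is a unit time-like vector field in the interior of the cone \(\left\{ t > \vert x\vert \right\} \):

$$\begin{aligned} \mathbf{m}(X_{0}, X_{0}) = \frac{1}{\rho ^{2}} \left( - t^{2} + \vert x\vert ^{2}\right) = -1. \end{aligned}$$

Second, the self-similarity condition encodes scaling homogeneity. The following is a sketch under the additional assumption, made here only for illustration, that the gauge is chosen so that \(A_{\mu } X_{0}^{\mu } = 0\). Writing \(S := \rho X_{0} = t \partial _{t} + x \cdot \partial _{x}\), the condition \((\mathbf{D}_{X_{0}} + \frac{1}{\rho }) \phi = 0\) then reduces to \(S \phi = - \phi \), and

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}\lambda } \Big ( \lambda \, \phi (\lambda t, \lambda x) \Big ) = \big ( \phi + S \phi \big )(\lambda t, \lambda x) = 0, \quad \hbox { i.e., } \quad \phi (\lambda t, \lambda x) = \lambda ^{-1} \phi (t, x) \quad \hbox { for } \lambda > 0. \end{aligned}$$

This is the homogeneity of \(\phi \) that matches the energy critical scaling.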

With these in mind, we now sketch the blow-up analysis of (MKG), which is performed in full detail in Sect. 8.

3.1.2 Finite time blow-up/non-scattering scenarios and initial reduction

Suppose that the conclusion of Theorem 1.3 fails for a smooth finite energy data \((a, e, f, g)\) in the forward time direction. Then the corresponding smooth solution either blows up in finite time, or does not scatter as \(t \rightarrow \infty \). The first step of the blow-up analysis is to construct in both scenarios a sequence of global Coulomb solutions \((A^{(n)}, \phi ^{(n)})\) on \([\varepsilon _{n}, 1] \times {{\mathbb {R}}}^{4}\) (where \(\varepsilon _{n} \rightarrow 0\)) obeying the following properties:

  • Bounded energy in the cone  \({{\mathcal {E}}}_{S_{t}} [A^{(n)}, \phi ^{(n)}] \le E\) for every \(t \in [\varepsilon _{n}, 1]\),

  • Small energy outside the cone  \({{\mathcal {E}}}_{(\left\{ t\right\} \times {{\mathbb {R}}}^{4}) \setminus S_{t}} [A^{(n)}, \phi ^{(n)}] \ll E\) for every \(t \in [\varepsilon _{n}, 1]\),

  • Decaying flux on \(\partial C\)  \({{\mathcal {F}}}_{[\varepsilon _{n}, 1]} [A^{(n)}, \phi ^{(n)}] + {{\mathcal {G}}}_{S_{1}}[\phi ^{(n)}] \le \varepsilon _{n}^{\frac{1}{2}} E\),

  • Pointwise concentration at \(t = 1\)  There exist \(k_{n} \in {{\mathbb {Z}}}\) and \(x_{n} \in {{\mathbb {R}}}^{4}\) such that

    $$\begin{aligned} 2^{-k_{n}} \vert \zeta _{2^{-k_{n}}} *\phi ^{(n)}(1,x_{n})\vert + 2^{-2k_{n}} \vert \zeta _{2^{-k_{n}}} *\mathbf{D}_{t}^{(n)} \phi ^{(n)}(1, x_{n})\vert > \mathbf{e}\end{aligned}$$
    (3.2)

    for some \(\mathbf{e}= \mathbf{e}(E) > 0\).

Here \(\zeta \) is a smooth function supported in the unit ball \(B_{1}(0)\) and \(\zeta _{2^{-k}}(x) := 2^{4k} \zeta (2^{k} x)\). In view of the next step, we require \(\zeta \) to be non-negative. See Lemma 8.4 for details.

Key to this construction are Theorems 2.1 and 2.2, which provide detailed information about finite time blow-up or non-scattering scenarios. In particular, the tip of the cone C is the point of energy concentration (which exists by Theorem 2.1) in the finite time blow-up case. Pointwise concentration at \(t=1\) follows from the failure of the energy dispersion bound in Theorem 2.2. Decaying flux on \(\partial C\) is a consequence of the local conservation of energy and localized Hardy’s inequality; see Lemma 5.2 and Corollary 5.3. Smallness of the energy outside the cone is achieved using the initial data excision/gluing technique in the finite time blow-up case; in the non-scattering case, this property is trivial to establish.

3.1.3 Elimination of the null concentration scenario

Thanks to the above properties, we may apply the monotonicity formula (3.1) to each solution in the sequence \((A^{(n)}, \phi ^{(n)})\). Using the weighted energy term (i.e., the first term on the left-hand side) in (3.1), we show in Lemma 8.7 that the null concentration scenario (i.e., \(\vert x_{n}\vert \rightarrow 1\) and \(k_{n} \rightarrow \infty \)) is impossible. Unlike in the case of wave maps [33], however, the weighted energy involves the covariant derivatives \(\mathbf{D}^{(n)}_{\mu } \phi ^{(n)} = \partial _{\mu } \phi ^{(n)} + i A^{(n)}_{\mu } \phi ^{(n)}\), and the term involving \(A^{(n)}\) could be problematic. We avoid this issue by first working with the gauge invariant amplitude \(\vert \phi ^{(n)}\vert \), for which we have the diamagnetic inequality

$$\begin{aligned} \vert X^{\mu } \partial _{\mu } \vert \phi ^{(n)}\vert \vert \le \vert \mathbf{D}_{X} \phi ^{(n)}\vert \end{aligned}$$

in the sense of distributions, for any smooth vector field X. We then transfer the bound to \(\phi ^{(n)}\) using the inequality

$$\begin{aligned} \vert \zeta _{2^{-k}} *\phi ^{(n)}\vert \le \zeta _{2^{-k}} *\vert \phi ^{(n)}\vert , \end{aligned}$$

which holds if \(\zeta \) is chosen to be non-negative.
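Both inequalities follow from short computations, sketched here for a smooth pair \((A, \phi )\) with the superscript (n) dropped. The regularized amplitude \(\vert \phi \vert _{\delta } := \sqrt{\vert \phi \vert ^{2} + \delta ^{2}}\) with \(\delta > 0\) is auxiliary notation introduced for illustration. Since A is real-valued, \(\overline{\phi } \, (i A_{\mu } X^{\mu }) \phi = i A_{\mu } X^{\mu } \vert \phi \vert ^{2}\) is purely imaginary, so

$$\begin{aligned} X^{\mu } \partial _{\mu } \vert \phi \vert _{\delta } = \frac{\mathrm {Re}(\overline{\phi } \, X^{\mu } \partial _{\mu } \phi )}{\vert \phi \vert _{\delta }} = \frac{\mathrm {Re}(\overline{\phi } \, \mathbf{D}_{X} \phi )}{\vert \phi \vert _{\delta }}, \quad \hbox { hence } \quad \vert X^{\mu } \partial _{\mu } \vert \phi \vert _{\delta } \vert \le \frac{\vert \phi \vert }{\vert \phi \vert _{\delta }} \vert \mathbf{D}_{X} \phi \vert \le \vert \mathbf{D}_{X} \phi \vert . \end{aligned}$$

Letting \(\delta \rightarrow 0\) yields the diamagnetic inequality in the sense of distributions. The convolution inequality is the triangle inequality for integrals, which applies because the weight is non-negative:

$$\begin{aligned} \vert \zeta _{2^{-k}} *\phi (x)\vert = \left| \int \zeta _{2^{-k}}(x - y) \phi (y) \, \mathrm {d}y\right| \le \int \zeta _{2^{-k}}(x - y) \vert \phi (y)\vert \, \mathrm {d}y = \zeta _{2^{-k}} *\vert \phi \vert (x). \end{aligned}$$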

3.1.4 Nontrivial energy in a time-like region

The absence of the null concentration scenario implies the following uniform lower bound for \(\phi ^{(n)}\) away from the boundary at \(t = 1\): There exist \(E_{1} = E_{1}(E) > 0\) and \(\gamma _{1} = \gamma _{1}(E) \in (0, 1)\) such that

$$\begin{aligned} \int _{S_{1}^{1-\gamma _{1}}} \sum _{\mu = 0}^{4} \vert \mathbf{D}_{\mu }^{(n)} \phi ^{(n)}\vert ^{2} + \frac{1}{r^{2}} \vert \phi ^{(n)}\vert ^{2} \, \mathrm {d}x \ge E_{1}. \end{aligned}$$
(3.3)

See Lemma 8.9. Using a localized version of the monotonicity formula (3.1), this lower bound can be propagated towards \(t=0\). More precisely, there exist \(E_{2} = E_{2}(E) > 0\) and \(\gamma _{2} = \gamma _{2}(E) \in (0, 1)\) such that

$$\begin{aligned} \int _{S^{(1-\gamma _{2}) t}_{t}} {}^{(X_{0})} P_{T}\left[ A^{(n)}, \phi ^{(n)}\right] \, \mathrm {d}x \ge E_{2} \quad \hbox { for all } t \in \left[ \varepsilon _{n}^{\frac{1}{2}}, \varepsilon _{n}^{\frac{1}{4}}\right] . \end{aligned}$$
(3.4)

3.1.5 Final rescaling

Thanks to the space-time integral term in (3.1), \((A^{(n)}, \phi ^{(n)})\) obeys

$$\begin{aligned} \iint _{C_{[\varepsilon _{n}, 1]}} \frac{1}{\rho _{\varepsilon _{n}}} \vert \iota _{X_{\varepsilon _{n}}} F^{(n)}\vert ^{2} + \frac{1}{\rho _{\varepsilon _{n}}} \left| {\left( \mathbf{D}_{X_{\varepsilon _{n}}}^{(n)} + \frac{1}{\rho _{\varepsilon _{n}}}\right) \phi ^{(n)}}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \lesssim E, \end{aligned}$$

which implies an integrated decay of \(\iota _{X_{\varepsilon _{n}}} F^{(n)}\) and \((\mathbf{D}_{X_{\varepsilon _{n}}}^{(n)} + \frac{1}{\rho _{\varepsilon _{n}}}) \phi ^{(n)}\) near the tip of the cone C. Applying the pigeonhole principle and rescaling, we obtain a new sequence of solutions which is asymptotically self-similar. More precisely, there exists a sequence of solutions on \([1, T_{n}] \times {{\mathbb {R}}}^{4}\) (where \(T_{n} \rightarrow \infty \)) to (MKG), which we still denote by \((A^{(n)}, \phi ^{(n)})\), obeying the following properties (see Lemma 8.11):

  • Bounded energy in the cone  \({{\mathcal {E}}}_{S_{t}}[A^{(n)}, \phi ^{(n)}] \le E\) for every \(t \in [1, T_{n}]\),

  • Small energy outside the cone  \({{\mathcal {E}}}_{\left\{ t\right\} \times {{\mathbb {R}}}^{4} {\setminus } S_{t}}[A^{(n)}, \phi ^{(n)}] \ll E\) for every \(t \in [1, T_{n}]\),

  • Nontrivial energy in a time-like region  For every \(t \in [1, T_{n}]\) we have

    $$\begin{aligned} \int _{S^{(1-\gamma _{2}) t}_{t}} {}^{(X_{0})} P_{T}\left[ A^{(n)}, \phi ^{(n)}\right] \, \mathrm {d}x \ge E_{2}, \end{aligned}$$
    (3.5)
  • Asymptotic self-similarity  For every compact subset K of the interior of \(C_{[1, \infty )}\), we have

    $$\begin{aligned} \iint _{K} \vert \iota _{X_{0}} F^{(n)}\vert ^{2} + \left| \left( \mathbf{D}_{X_{0}}^{(n)} + \frac{1}{\rho }\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty . \end{aligned}$$
    (3.6)

3.1.6 Extraction of concentration scales and compactness/rigidity argument

Let \((A^{(n)}, \phi ^{(n)})\) be a sequence obtained by the final rescaling argument. Using a combinatorial argument, we show in Lemma 8.12 that one of the following two scenarios holds:

  A. Either we can identify a sequence of points and decreasing scales at which energy concentrates, or

  B. There is a uniform non-concentration of energy.

In Scenario A we obtain a fixed number \(r > 0\) and a sequence of times \(t_{n} \rightarrow t_{0}\), points \(x_{n} \rightarrow x_{0}\) and scales \(r_{n} \rightarrow 0\) such that

$$\begin{aligned} \sup _{x \in B_{r}(x_{n})} {{\mathcal {E}}}_{\left\{ t_{n}\right\} \times B_{r_{n}}(x)}\left[ A^{(n)}, \phi ^{(n)}\right] \end{aligned}$$

is uniformly small but nontrivial, and

$$\begin{aligned} \frac{1}{4 r_{n}} \int _{t_{n}-2r_{n}}^{t_{n}+2r_{n}} \int _{B_{r}(x_{n})} \vert \iota _{Y} F^{(n)}\vert ^{2} + \vert \mathbf{D}_{Y}^{(n)} \phi ^{(n)}\vert ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty , \end{aligned}$$

where \(Y = X_{0}(t_{0}, x_{0})\). Applying Proposition 6.1, we obtain as a limit a nontrivial finite energy solution to (MKG) which is stationary with respect to Y. As discussed above, however, such solutions do not exist.

In Scenario B we can cover each truncated cone \(\widetilde{C}_{j} := C^{1/2}_{[1/2, \infty )} \cap \left\{ 2^{j} \le t < 2^{j+1}\right\} \) with spatial balls of radius \(r = r(j)\), on each of which the energy of \((A^{(n)}, \phi ^{(n)})\) is uniformly small and

$$\begin{aligned} \iint _{\widetilde{C}_{j}} \vert \iota _{X_{0}} F^{(n)}\vert ^{2} + \left| \left( \mathbf{D}_{X_{0}}^{(n)} + \frac{1}{\rho }\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty . \end{aligned}$$

Hence we are again in position to apply Proposition 6.1 and extract a finite energy self-similar solution to (MKG) on \(C_{[1/2, \infty )}^{1/2}\). By self-similarity, this limit easily extends to the whole forward cone C. By (3.5) this limit is necessarily nontrivial, which contradicts the triviality of finite energy self-similar solutions.

In conclusion, we have seen that neither of the two scenarios can hold, which is a contradiction. This completes the proof of the main theorem.

3.2 Comparison with the approach of [16]

The principal difference between the present work and [16] is that the latter follows the Kenig-Merle concentration compactness/rigidity scheme [9] for establishing global well-posedness and scattering. Roughly, this scheme consists of two steps: First, assuming that the conclusion fails, one constructs a blow-up solution with the minimal energy. Second, one derives a contradiction by playing various conservation laws and monotonicity formulae of (MKG) against special compactness properties of the minimal blow-up solution. As an immediate corollary, this approach yields some additional information about the solutions, such as an a-priori bound on the \(S^{1}\) norm in terms of the energy. On the other hand, as we explain below, the execution of this scheme in the presence of a non-perturbative paradifferential nonlinearity faces considerable difficulty, a large part of which is avoided in the present work.

The main ingredient for construction of a minimal blow-up solution is the concentration compactness phenomenon, or nonlinear profile decomposition, for (MKG) in the global Coulomb gauge, whose proof takes up the majority of [16]. This concept was first introduced in the context of elliptic PDEs by Lions [21] and was adapted to the semilinear wave equation \(\Box u = \vert u\vert ^{4} u\) on \({{\mathbb {R}}}^{1+3}\) by Bahouri-Gérard [1]. It roughly states that any sequence of solutions with uniformly bounded energy can be decomposed (after passing to a subsequence, on a suitable time interval) into the superposition of solutions modulated by the non-compact symmetries of the problem (called profiles) and an error which can be made arbitrarily small in an appropriate norm weaker than energy.

Key to the proof of concentration compactness in [1] is the asymptotic vanishing of the interaction among different profiles, whose frequency and space-time supports are diverging from each other. However, such a statement partly fails for (MKG), due to the non-perturbative effect of the low frequency part of the solution on the high frequency part through the paradifferential nonlinearity. In [16], this difficulty is overcome by performing an induction on frequency, where one carefully builds a profile decomposition in the order of increasing frequency. This delicate procedure, which originated from the earlier work of Krieger-Schlag [17], necessitates several ideas not used in our approach, such as a process for extracting linear profiles using the paradifferential magnetic wave equation and a uniform dispersive estimate for such an equation (see [16, Sect. 7.5]).

Another notable difference between this paper and [16] is the conservation laws and monotonicity formulae used in the proof. While the present paper relies only on the energy conservation and the Morawetz-type monotonicity formula (3.1), [16] uses in addition momenta conservation and a virial-type monotonicity formula for (MKG), which are of independent interest.

3.3 Structure of the present paper

The remainder of the paper is structured as follows.

3.3.1 Section 4

We provide the setup for our arguments to follow. In particular, we precisely state the results that we need from the other papers of the series [26, 27] in Sect. 4.5.

3.3.2 Section 5

We state and prove all the conservation laws and monotonicity formulae that are used in this paper.

3.3.3 Section 6

We use the small energy global well-posedness theorem (Theorem 4.1) and the technique of initial data excision/gluing to prove a strong local compactness statement (Proposition 6.1) that we rely on in our blow-up analysis. We also formulate a notion of weak solutions to (MKG) and their local descriptions (weak compatible pairs), which naturally arise as limits from Proposition 6.1.

3.3.4 Section 7

We show that there do not exist any nontrivial stationary or self-similar solutions to (MKG) with finite energy. We also prove regularity theorems for weak stationary or self-similar solutions to (MKG) considered in Sect. 6.

3.3.5 Section 8

We finally carry out the blow-up analysis as outlined in Sect. 3.1, thereby completing the proof of global well-posedness and scattering of (MKG).

4 Preliminaries

4.1 Notation for constants and asymptotics

Throughout the paper we use C for a general positive constant, which may vary from line to line. For a constant C that depends on, say, E, we write \(C = C(E)\). We write \(A \lesssim B\) when there exists a constant \(C > 0\) such that \(A \le C B\). When the implicit constant should be regarded as small, we write \(A \ll B\). The dependence of the constant is specified by a subscript, e.g., \(A\lesssim _{E} B\). We write \(A \approx B\) when both \(A \lesssim B\) and \(B \lesssim A\) hold.

4.2 Coordinate systems on \({{\mathbb {R}}}^{1+4}\)

Several different coordinate systems on \({{\mathbb {R}}}^{1+4}\) will be used in this paper. A basic choice, which has already been mentioned in the introduction, is the rectilinear coordinates \((x^{0}, x^{1}, \ldots , x^{4})\) on \({{\mathbb {R}}}^{1+4}\), in which the Minkowski metric takes the diagonal form \(\mathbf{m}= - (\mathrm {d}x^{0})^{2} + (\mathrm {d}x^{1})^{2} + \cdots + (\mathrm {d}x^{4})^{2}\). We will also often write \(t = x^{0}\) and \(x = (x^{1}, \ldots , x^{4})\). We reserve the greek indices \(\mu , \nu , \ldots \) for expressions in the rectilinear coordinates, and the latin indices \(j, k, \ell , \ldots \) for expressions involving only the spatial coordinates \(x^{1}, x^{2}, x^{3}, x^{4}\).

We also introduce the polar coordinates \((t, r, \Theta )\) on \({{\mathbb {R}}}^{1+4}\), where

$$\begin{aligned} r = \vert x\vert , \quad \Theta = \frac{x}{\vert x\vert } \in {{\mathbb {S}}}^{3}, \end{aligned}$$

and the null coordinates \((u, v, \Theta )\), defined by

$$\begin{aligned} u = t - r, \quad v = t + r. \end{aligned}$$

We can furthermore specify a spherical coordinate system for \(\Theta \), but it will not be necessary. We also define the null vector fields \(L, \underline{L}\) as

$$\begin{aligned} L = \partial _{t} + \partial _{r} = 2 \partial _{v}, \quad \underline{L}= \partial _{t} - \partial _{r} = 2 \partial _{u}. \end{aligned}$$

In these coordinates, the metric takes the form

$$\begin{aligned} \mathbf{m}= - \mathrm {d}t^{2} + \mathrm {d}r^{2} + r^{2} g_{{{\mathbb {S}}}^{3}} = - \mathrm {d}u \mathrm {d}v + r^{2}(u, v) g_{{{\mathbb {S}}}^{3}}, \end{aligned}$$

where \(g_{{{\mathbb {S}}}^{3}}\) is the standard metric on \({{\mathbb {S}}}^{3}\) in the coordinates \(\Theta \).
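
As a quick consistency check on the null coordinates (a routine computation, sketched here only for convenience), inverting the definitions gives \(t = \frac{1}{2}(u + v)\) and \(r = \frac{1}{2}(v - u)\), so that

$$\begin{aligned} \partial _{v} = \frac{1}{2} (\partial _{t} + \partial _{r}), \quad \partial _{u} = \frac{1}{2} (\partial _{t} - \partial _{r}), \quad \mathrm {d}u \, \mathrm {d}v = (\mathrm {d}t - \mathrm {d}r)(\mathrm {d}t + \mathrm {d}r) = \mathrm {d}t^{2} - \mathrm {d}r^{2}. \end{aligned}$$

This recovers \(L = 2 \partial _{v}\), \(\underline{L}= 2 \partial _{u}\) and the two expressions for \(\mathbf{m}\). In particular, \(\mathbf{m}(L, L) = \mathbf{m}(\underline{L}, \underline{L}) = 0\) and \(\mathbf{m}(L, \underline{L}) = -2\).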

Finally, we will also use the hyperbolic polar coordinates (in short, hyperbolic coordinates) \((\rho , y, \Theta )\) on the future light cone \(C_{(0, \infty )} = \left\{ (t, r, \Theta ) : 0 \le r < t\right\} \) (see below), where

$$\begin{aligned} \rho = \sqrt{t^{2} - r^{2}}, \quad y = \tanh ^{-1} (r/t). \end{aligned}$$

The Minkowski metric takes the form

$$\begin{aligned} \mathbf{m}= - \mathrm {d}\rho ^{2} + \rho ^{2} \left( \mathrm {d}y^{2} + \sinh ^{2} y \, g_{{{\mathbb {S}}}^{3}}\right) . \end{aligned}$$

Every constant \(\rho \) hypersurface \({{\mathcal {H}}}_{\rho }\) is isometric to the simply connected space of constant sectional curvature \(-\frac{1}{\rho ^{2}}\); in particular, \({{\mathcal {H}}}_{1}\) is the hyperboloidal model for the hyperbolic 4-space \({{\mathbb {H}}}^{4}\). Using the coordinates \((y, \Theta )\), the metric on \({{\mathbb {H}}}^{4}\) can be written as

$$\begin{aligned} g_{{{\mathbb {H}}}^{4}} = \mathrm {d}y^{2} + \sinh ^{2} y \, g_{{{\mathbb {S}}}^{3}}. \end{aligned}$$
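
For the reader's convenience, we sketch the routine computation behind the hyperbolic form of \(\mathbf{m}\); it uses nothing beyond the definitions of \(\rho \) and \(y\). Inverting these definitions gives \(t = \rho \cosh y\) and \(r = \rho \sinh y\), hence

$$\begin{aligned} \mathrm {d}t =&\cosh y \, \mathrm {d}\rho + \rho \sinh y \, \mathrm {d}y, \qquad \mathrm {d}r = \sinh y \, \mathrm {d}\rho + \rho \cosh y \, \mathrm {d}y, \\ - \mathrm {d}t^{2} + \mathrm {d}r^{2} =&- (\cosh ^{2} y - \sinh ^{2} y) \, \mathrm {d}\rho ^{2} + \rho ^{2} (\cosh ^{2} y - \sinh ^{2} y) \, \mathrm {d}y^{2} = - \mathrm {d}\rho ^{2} + \rho ^{2} \mathrm {d}y^{2}, \end{aligned}$$

where the cross terms \(\pm 2 \rho \sinh y \cosh y \, \mathrm {d}\rho \, \mathrm {d}y\) cancel. Since \(r^{2} = \rho ^{2} \sinh ^{2} y\), adding \(r^{2} g_{{{\mathbb {S}}}^{3}}\) yields the stated expression. The same substitution shows that \(\partial _{\rho } = \cosh y \, \partial _{t} + \sinh y \, \partial _{r} = \rho ^{-1} (t \partial _{t} + r \partial _{r})\).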

4.3 Geometric notation

To ease the transition from one coordinate system to another, we shall use the tensor formalism. We will denote by \(\nabla \) the Levi-Civita connection on \({{\mathbb {R}}}^{1+4}\) to distinguish it from the coordinate vector fields \(\partial _{\mu }\). The gauge covariant connection associated to A for \({{\mathbb {C}}}\)-valued tensors takes the form \(\mathbf{D}= \nabla + i A\). Similarly, we shall denote the Levi-Civita connection on \({{\mathbb {H}}}^{4}\) by \(\nabla _{{{\mathbb {H}}}^{4}}\), and the gauge covariant connection by \(\mathbf{D}_{{{\mathbb {H}}}^{4}} = \nabla _{{{\mathbb {H}}}^{4}} + i A\). We use the bold latin indices \(\mathbf{a}, \mathbf{b}, \ldots \) for expressions in a general coordinate system. We also employ the usual convention of raising and lowering indices using the Minkowski metric \(\mathbf{m}\), and summing up repeated upper and lower indices.

We now introduce some notation for geometric subsets of \({{\mathbb {R}}}^{1+4}\) and \({{\mathbb {R}}}^{4}\). The forward light cone

$$\begin{aligned} C := \left\{ (t,x) : 0 < t < \infty , \vert x\vert \le t\right\} \end{aligned}$$

will play a central role in this paper. For \(t_{0} \in {{\mathbb {R}}}\) and \(I \subset {{\mathbb {R}}}\), we define

$$\begin{aligned} C_{I} :=&\left\{ (t, x) : t \in I, \vert x\vert \le t\right\} ,&\partial C_{I} :=&\left\{ (t, x) : t \in I, \vert x\vert = t\right\} , \\ S_{t_{0}} :=&\left\{ (t, x) : t = t_{0}, \vert x\vert \le t\right\} ,&\partial S_{t_{0}} :=&\left\{ (t, x) : t = t_{0}, \vert x\vert = t\right\} . \end{aligned}$$

For \(\delta \in {{\mathbb {R}}}\), we define the translated cones

$$\begin{aligned} C^{\delta } :=&\left\{ (t,x) : \max \left\{ 0,\delta \right\} \le t < \infty , \vert x\vert \le t-\delta \right\} . \end{aligned}$$

The corresponding objects \(C^{\delta }_{I}, \partial C^{\delta }_{I}, S^{\delta }_{t_{0}}\) and \(\partial S^{\delta }_{t_{0}}\) are defined in the obvious manner.
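
One natural reading of "the obvious manner", which we record as an assumption rather than as a quotation of the intended definitions, is

$$\begin{aligned} C^{\delta }_{I} :=&\left\{ (t, x) : t \in I, \vert x\vert \le t - \delta \right\} ,&\partial C^{\delta }_{I} :=&\left\{ (t, x) : t \in I, \vert x\vert = t - \delta \right\} , \\ S^{\delta }_{t_{0}} :=&\left\{ (t, x) : t = t_{0}, \vert x\vert \le t - \delta \right\} ,&\partial S^{\delta }_{t_{0}} :=&\left\{ (t, x) : t = t_{0}, \vert x\vert = t - \delta \right\} . \end{aligned}$$

With \(\delta = 0\) these reduce to \(C_{I}, \partial C_{I}, S_{t_{0}}\) and \(\partial S_{t_{0}}\), respectively.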

We also define \(B_{r}(x)\) to be the ball of radius r centered at x in \({{\mathbb {R}}}^{4}\).

4.4 Frequency projections and function spaces

Let \(m_{\le 0}\) be a smooth cutoff that equals 1 on \(\left\{ r \le 1\right\} \) and 0 on \(\left\{ r \ge 2\right\} \). For \(k \in {{\mathbb {Z}}}\), we define

$$\begin{aligned} m_{\le k}(r) := m_{\le 0}(r/2^{k}), \quad m_{k}(r) := m_{\le k}(r) - m_{\le k-1}(r), \end{aligned}$$

so that \({\mathrm {supp}}\, m_{k} \subseteq \left\{ 2^{k-1} \le r \le 2^{k+1}\right\} \) and \(\sum _{k} m_{k}(r) = 1\). We introduce the Littlewood-Paley projections \(P_{k}, Q_{j}\) and \(S_{\ell }\), which are used in this paper:

$$\begin{aligned} P_{k} \varphi= & {} {{\mathcal {F}}}^{-1}[m_{k}(\vert \xi \vert ) {{\mathcal {F}}}[\varphi ]], \\ Q_{j} \varphi= & {} {{\mathcal {F}}}^{-1}[m_{j}(\vert \vert \tau \vert - \vert \xi \vert \vert ) {{\mathcal {F}}}[\varphi ]], \\ S_{\ell } \varphi= & {} {{\mathcal {F}}}^{-1}[m_{\ell }(\vert (\tau , \xi )\vert ) {{\mathcal {F}}}[\varphi ]], \end{aligned}$$

where \({{\mathcal {F}}}\) [resp. \({{\mathcal {F}}}^{-1}\)] is the [resp. the inverse] space-time Fourier transform.
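
The support and summation properties of \(m_{k}\) can be checked directly from the definition; the following short verification is only a sketch. Since \(m_{\le k}(r) = 1\) for \(r \le 2^{k}\) and \(m_{\le k}(r) = 0\) for \(r \ge 2^{k+1}\), the difference \(m_{k} = m_{\le k} - m_{\le k-1}\) vanishes both for \(r \le 2^{k-1}\) (where both terms equal 1) and for \(r \ge 2^{k+1}\) (where both terms equal 0). Moreover, for each fixed \(r > 0\) the partial sums telescope:

$$\begin{aligned} \sum _{k = -K}^{K} m_{k}(r) = m_{\le K}(r) - m_{\le -K-1}(r) = 1 - 0 = 1 \quad \hbox { as soon as } 2^{-K} \le r \le 2^{K}. \end{aligned}$$

Consequently, at least formally, \(\varphi = \sum _{k} P_{k} \varphi \), and the analogous decompositions hold for \(Q_{j}\) and \(S_{\ell }\).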

Given a normed space X of functions on \({{\mathbb {R}}}^{1+4}\), we define the restriction space \(X({{\mathcal {O}}})\) on a measurable subset \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) by the norm

$$\begin{aligned} \Vert \varphi \Vert _{X({{\mathcal {O}}})} := \inf _{\psi = \varphi \hbox { on } {{\mathcal {O}}}} \Vert \psi \Vert _{X({{\mathbb {R}}}^{1+4})}. \end{aligned}$$

In application, the set \({{\mathcal {O}}}\) is often an open set with (piecewise) smooth boundary, and hence there exists a bounded linear extension operator from \(X({{\mathcal {O}}})\) to \(X({{\mathbb {R}}}^{1+4})\) for many standard function spaces X (e.g., \(X = H^{1}\)).

4.5 Results from previous papers

Here we give precise statements of results from [19] and the first two papers in the sequence [26, 27], which are used in the present paper. Given a measurable subset \(S \subseteq \left\{ t\right\} \times {{\mathbb {R}}}^{4}\) for some t, we define the energy of a pair \((A, \phi )\) on S by

$$\begin{aligned} {{\mathcal {E}}}_{S}[A, \phi ] := \int _{S} \frac{1}{2} \sum _{0 \le \mu < \nu \le 4} \vert F_{\mu \nu }\vert ^{2} + \frac{1}{2} \sum _{\mu = 0}^{4} \vert \mathbf{D}_{\mu } \phi \vert ^{2} \, \mathrm {d}x. \end{aligned}$$

Accordingly, for a measurable subset \(S \subseteq {{\mathbb {R}}}^{4}\), we define

$$\begin{aligned} {{\mathcal {E}}}_{S}[a, e, f, g]:= & {} \int _{S} \frac{1}{2} \sum _{1 \le j < k \le 4} \vert (\mathrm {d}a)_{j k}\vert ^{2} + \frac{1}{2} \sum _{j=1}^{4} \vert e_{j}\vert ^{2}\\&+ \frac{1}{2} \sum _{j = 1}^{4} \vert \mathbf{D}_{j} f\vert ^{2} + \frac{1}{2} \vert g\vert ^{2} \, \mathrm {d}x. \end{aligned}$$

The following is the main theorem of [19].

Theorem 4.1

(Small energy global well-posedness in global Coulomb gauge) There exists \(\epsilon _{*}> 0\) such that the following holds. Let \((a, e, f, g)\) be an \({{\mathcal {H}}}^{1}\) initial data set on \({{\mathbb {R}}}^{4}\) satisfying the global Coulomb gauge condition \(\partial ^{\ell } a_{\ell } = 0\), whose energy does not exceed \(\epsilon _{*}^{2}\), i.e.,

$$\begin{aligned} {{\mathcal {E}}}_{{{\mathbb {R}}}^{4}} [a, e, f, g] \le \epsilon _{*}^{2}. \end{aligned}$$
(4.1)
  1. (1)

    Then there exists a unique \(C_{t} {{\mathcal {H}}}^{1}\) admissible solution \((A, \phi )\) to (MKG) on \({{\mathbb {R}}}^{1+4}\) satisfying the global Coulomb gauge condition \(\partial ^{\ell } A_{\ell } = 0\) with \((a, e, f, g)\) as its initial data at \(t = 0\), i.e., \((A_{j}, F_{0j}, \phi , \mathbf{D}_{t} \phi ) \!\upharpoonright _{\left\{ t=0\right\} } = (a_{j}, e_{j}, f, g)\).

  2. (2)

    Moreover, \((A, \phi )\) obeys the \(S^{1}\) norm bound

    $$\begin{aligned} \Vert A_{0}\Vert _{Y^{1}({{\mathbb {R}}}^{1+4})} + \Vert A_{x}\Vert _{S^{1}({{\mathbb {R}}}^{1+4})} + \Vert \phi \Vert _{S^{1}({{\mathbb {R}}}^{1+4})} \lesssim \Vert (a, e, f, g)\Vert _{{{\mathcal {H}}}^{1}}.\quad \quad \end{aligned}$$
    (4.2)
  3. (3)

    If the initial data set \((a, e, f, g)\) is more regular, then so is the solution \((A, \phi )\); in particular, if \((a, e, f, g)\) is classical, then \((A, \phi )\) is a classical solution to (MKG).

  4. (4)

    Finally, given a sequence \((a^{(n)}, e^{(n)}, f^{(n)}, g^{(n)}) \in {{\mathcal {H}}}^{1}({{\mathbb {R}}}^{4})\) of Coulomb initial data sets such that \({{\mathcal {E}}}[a^{(n)}, e^{(n)}, f^{(n)}, g^{(n)}] \le \epsilon _{*}^{2}\) and \((a^{(n)}, e^{(n)}, f^{(n)}, g^{(n)}) \rightarrow (a, e, f, g)\) in \({{\mathcal {H}}}^{1}({{\mathbb {R}}}^{4})\), we have

    $$\begin{aligned} \Vert A_{0}^{(n)} - A_{0}\Vert _{Y^{1}(I \times {{\mathbb {R}}}^{4})} + \Vert A_{x}^{(n)} - A_{x}\Vert _{S^{1}(I \times {{\mathbb {R}}}^{4})} + \Vert \phi ^{(n)} - \phi \Vert _{S^{1}(I \times {{\mathbb {R}}}^{4})} \rightarrow 0\nonumber \\ \end{aligned}$$
    (4.3)

    as \(n \rightarrow \infty \), for every compact interval \(I \subseteq {{\mathbb {R}}}\).

The first statement can be found directly in the main theorem of [19]. For the proof of the remaining statements, see [19, Sect. 5].

Remark 4.2

For the purpose of the present paper, the precise structure of the norm \(S^{1}\) is not necessary. Instead, we rely on the following embedding property:

$$\begin{aligned} \Vert \partial _{t,x} \phi \Vert _{L^{\infty }_{t} L^{2}_{x}}+\Vert \Box \phi \Vert _{L^{2}_{t} \dot{H}^{-\frac{1}{2}}_{x}} \lesssim&\Vert \phi \Vert _{S^{1}}, \end{aligned}$$

where all norms are taken on \({{\mathbb {R}}}^{1+4}\); see [19, Sect. 3] or [27, Sect. 3]. On the other hand, the definition of the \(Y^{1}\) norm is rather simple:

$$\begin{aligned} \Vert A\Vert _{Y^{1}}^{2} := \Vert \partial _{t,x} A\Vert _{L^{\infty }_{t} L^{2}_{x}}^{2} + \Vert \partial _{t,x} A\Vert _{L^{2}_{t} \dot{H}^{\frac{1}{2}}_{x}}^{2}, \end{aligned}$$

where all norms are again taken on \({{\mathbb {R}}}^{1+4}\). Furthermore, \(S^{1}\) and \(Y^{1}\) are closed under multiplication by \(\eta \in C^{\infty }_{0}({{\mathbb {R}}}^{1+4})\), i.e., \(\eta S^{1}({{\mathbb {R}}}^{1+4}) \subseteq S^{1}({{\mathbb {R}}}^{1+4})\) and \(\eta Y^{1}({{\mathbb {R}}}^{1+4}) \subseteq Y^{1}({{\mathbb {R}}}^{1+4})\); we refer to [26, Sects. 6 and 7].

Given a positive number \(E \gtrsim \epsilon _{*}\) and an \({{\mathcal {H}}}^{1}\) initial data set \((a, e, f, g)\) on \({{\mathbb {R}}}^{4}\) with energy \({{\mathcal {E}}}[a, e, f, g] \le E\), we define its energy concentration scale \(r_{c} = r_{c}[a, e, f, g]\) (with respect to energy E), in terms of the function \(\delta _{0}(E, \epsilon _{*}^{2}) = c \epsilon _{*}^{2} \min \left\{ 1, \epsilon _{*}^{4} E^{-2}\right\} \) with a small universal constant c, by

$$\begin{aligned} r_{c}= & {} r_{c}(E)[a, e, f, g] := \sup \left\{ r \ge 0 : \forall x \in {{\mathbb {R}}}^{4}, \ {{\mathcal {E}}}_{B_{r}(x)}[a, e, f, g]\right. \nonumber \\&\left. < \delta _{0}\left( E, \epsilon _{*}^{2}\right) \right\} . \end{aligned}$$
(4.4)

The following is the main result of [26].

Theorem 4.3

(Large energy local well-posedness theorem in global Coulomb gauge)  Let \((a, e, f, g)\) be an \({{\mathcal {H}}}^{1}\) initial data set satisfying the global Coulomb gauge condition \(\partial ^{\ell } a_{\ell } = 0\) with energy \({{\mathcal {E}}}[a, e, f, g] \le E\). Let \(r_{c} = r_{c}[a, e, f, g]\) be defined as above. Then the following statements hold:

  1. (1)

    Existence and uniqueness.  There exists a unique admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution \((A, \phi )\) to (MKG) on \([-r_{c}, r_{c}] \times {{\mathbb {R}}}^{4}\) satisfying the global Coulomb gauge condition with \((a, e, f, g)\) as its initial data.

  2. (2)

    A-priori \(S^{1}\) regularity.  We have the additional regularity properties

    $$\begin{aligned} A_{0} \in Y^{1}[-r_{c}, r_{c}], \quad A_{x}, \phi \in S^{1}[-r_{c}, r_{c}]. \end{aligned}$$
  3. (3)

    Persistence of regularity.  If the initial data set \((a, e, f, g)\) is more regular, then so is the solution \((A, \phi )\); in particular, the solution \((A, \phi )\) is classical if \((a, e, f, g)\) is classical.

  4. (4)

    Continuous dependence.  Consider a sequence \((a^{(n)}, e^{(n)}, f^{(n)}, g^{(n)})\) of \({{\mathcal {H}}}^{1}\) Coulomb initial data sets such that \((a^{(n)}, e^{(n)}, f^{(n)}, g^{(n)}) \rightarrow (a, e, f, g)\) in \({{\mathcal {H}}}^{1}\). Then the lifespan of \((A^{(n)}, \phi ^{(n)})\) eventually contains \([-r_{c}, r_{c}]\), and we have

    $$\begin{aligned}&\Vert A_{0} - A^{(n)}_{0}\Vert _{Y^{1}[-r_{c}, r_{c}]} + \Vert \left( A_{x} - A^{(n)}_{x}, \phi - \phi ^{(n)}\right) \Vert _{S^{1}[-r_{c}, r_{c}]} \rightarrow 0 \\&\quad \hbox { as } n \rightarrow \infty . \end{aligned}$$

We also state the initial data excision/gluing result from [26], which is used in several places in the present paper. Given a measurable subset \(O \subseteq {{\mathbb {R}}}^{4}\), the \({{\mathcal {H}}}^{1}(O)\) norm is defined as the restriction of the \({{\mathcal {H}}}^{1}({{\mathbb {R}}}^{4})\) norm to O, and the space \({{\mathcal {H}}}^{1}(O)\) consists of all initial data sets on O with finite \({{\mathcal {H}}}^{1}(O)\) norm.

Proposition 4.4

(Excision and gluing of initial data sets) Let \(B = B_{r_{0}}(x_{0}) \subseteq {{\mathbb {R}}}^{4}\). Then there exists an operator \(E^{\mathrm {ext}}\) from \({{\mathcal {H}}}^{1}(2 B {\setminus } \overline{B})\) to \({{\mathcal {H}}}^{1}({{\mathbb {R}}}^{4} {\setminus } \overline{B})\) satisfying the following properties.

  1. (1)

    Extension property:

    $$\begin{aligned} E^{\mathrm {ext}}[a, e, f, g] = (a, e, f, g) \quad \hbox { on the annulus } \frac{3}{2} B {\setminus } \overline{B}. \end{aligned}$$
  2. (2)

    Uniform bounds:

    $$\begin{aligned}&\Vert E^{\mathrm {ext}}[a, e, f, g]\Vert _{{{\mathcal {H}}}^{1}({{\mathbb {R}}}^{4} {\setminus } \overline{B})} \lesssim \Vert (a, e, f, g)\Vert _{{{\mathcal {H}}}^{1}(2 B {\setminus } \overline{B})} \end{aligned}$$
    (4.5)
    $$\begin{aligned}&{{\mathcal {E}}}_{{{\mathbb {R}}}^{4} {\setminus } \overline{B}}[E^{\mathrm {ext}}[a, e, f, g]] \lesssim \Vert \frac{1}{\vert x - x_{0}\vert } f\Vert _{L^{2}_{x}(2 B {\setminus } \overline{B})}^{2} + {{\mathcal {E}}}_{2 B {\setminus } \overline{B}}[a, e, f, g]. \end{aligned}$$
    (4.6)
  3. (3)

    Regularity: The operator \(E^{\mathrm {ext}}\) is continuous from \({{\mathcal {H}}}^{1}(2 B {\setminus } \overline{B})\) to \({{\mathcal {H}}}^{1}({{\mathbb {R}}}^{4} {\setminus } \overline{B})\). Moreover, if \((a, e, f, g)\) is classical, then so is \(E^{\mathrm {ext}}[a, e, f, g]\).

In order to gain control of the first norm on the right-hand side of (4.6), we will repeatedly use the following improvement of the classical Hardy inequality, which is a consequence of [26, Lemma 6.5]:

Lemma 4.5

Let \(\sigma \ge 2\). Then for any ball B of radius r in \({{\mathbb {R}}}^4\) we have the bounds

$$\begin{aligned} r^{-1} \Vert f \Vert _{L^2_{x}(2B)}\lesssim & {} \Vert \mathbf{D}_x f \Vert _{L^2_{x}(\sigma B)} + \sigma ^{-1} \Vert \mathbf{D}_x f \Vert _{L^2_{x}({{\mathbb {R}}}^4{\setminus } \overline{\sigma B})} \end{aligned}$$
(4.7)
$$\begin{aligned} r^{-1} \Vert f \Vert _{L^2_{x}(2B {\setminus } \overline{B})}\lesssim & {} \Vert \mathbf{D}_x f \Vert _{L^2_{x}(\sigma B {\setminus } \overline{B})} + \sigma ^{-1} \Vert \mathbf{D}_x f \Vert _{L^2_{x}({{\mathbb {R}}}^4{\setminus } \overline{\sigma B})} \end{aligned}$$
(4.8)

Furthermore, we state the local geometric uniqueness result from [26], which we use in this paper to construct compatible pairs. For a ball \(B = \left\{ t_{0}\right\} \times B_{r_{0}}(x_{0}) \subseteq \left\{ t_{0}\right\} \times {{\mathbb {R}}}^{4}\), we define its future domain of dependence \({{\mathcal {D}}}^{+}(B)\) to be the set

$$\begin{aligned} {{\mathcal {D}}}^{+}(B) := \left\{ (t, x) \in {{\mathbb {R}}}^{1+4} : t_{0} \le t < t_{0} + r_{0}, \ \vert x - x_{0}\vert < r_{0} - (t - t_{0})\right\} . \end{aligned}$$

Given a measurable subset \(O \subseteq {{\mathbb {R}}}^{4}\), the space \({{\mathcal {G}}}^{2}(O)\) consists of locally integrable gauge transformations such that the following semi-norm is finite:

$$\begin{aligned} \Vert \chi \Vert _{{{\mathcal {G}}}^{2}(O)} := \Vert \partial _{x} \chi \Vert _{L^{4}_{x}(O)} + \Vert \partial _{x}^{(2)} \chi \Vert _{L^{2}_{x}(O)}. \end{aligned}$$

Given a measurable subset \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\), define \({{\mathcal {O}}}_{t} := {{\mathcal {O}}}\cap (\left\{ t\right\} \times {{\mathbb {R}}}^{4})\) and \(I({{\mathcal {O}}}) := \left\{ t \in {{\mathbb {R}}}: {{\mathcal {O}}}_{t} \ne \emptyset \right\} \). Note that \(I({{\mathcal {O}}})\) is measurable and \({{\mathcal {O}}}_{t}\) is measurable for almost every t. Accordingly, we define the space \(C_{t} {{\mathcal {G}}}^{2}({{\mathcal {O}}})\) by the semi-norm

$$\begin{aligned} \Vert \chi \Vert _{C_{t} {{\mathcal {G}}}^{2}({{\mathcal {O}}})}:= & {} \mathop {{{\mathrm{ess\,sup\,}}}}_{t \in I({{\mathcal {O}}})} \Big ( \Vert \chi \Vert _{\dot{H}^{2}_{x} \cap \dot{W}^{1, 4}_{x} \cap \mathrm {BMO} ({{\mathcal {O}}}_{t})} \\&+ \Vert \partial _{t} \chi \Vert _{\dot{H}^{1}_{x} \cap L^{4}_{x}({{\mathcal {O}}}_{t})} + \Vert \partial _{t}^{2} \chi \Vert _{L^{2}_{x}({{\mathcal {O}}}_{t})} \Big ). \end{aligned}$$

Proposition 4.6

(Local geometric uniqueness among admissible solutions)  Let \(T_{0} > 0\) and let \(B \subset {{\mathbb {R}}}^{4}\) be an open ball. Consider \(C_{t} {{\mathcal {H}}}^{1}\) admissible solutions \((A, \phi ), (A', \phi ')\) on the region

$$\begin{aligned} {{\mathcal {D}}}:= {{\mathcal {D}}}^{+}(\left\{ 0\right\} \times B) \cap ( [0, T_{0}) \times {{\mathbb {R}}}^{4}). \end{aligned}$$

Suppose that the respective initial data \((a, e, f, g)\) and \((a', e', f', g')\) are gauge equivalent on B, i.e., there exists \(\underline{\chi } \in {{\mathcal {G}}}^{2}(B)\) such that \((a, e, f, g) = (a' - \mathrm {d}\underline{\chi }, e', e^{i \underline{\chi }} f', e^{i \underline{\chi }} g')\). Then there exists a unique gauge transformation \(\chi \in C_{t} {{\mathcal {G}}}^{2}({{\mathcal {D}}})\) such that \(\chi \!\upharpoonright _{\left\{ 0\right\} \times B} = \underline{\chi }\) and

$$\begin{aligned} (A, \phi ) = (A' - \mathrm {d}\chi , e^{i \chi } \phi ') \quad \hbox { on } {{\mathcal {D}}}. \end{aligned}$$
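
The transformation law in Proposition 4.6 is consistent with the form \(\mathbf{D}= \mathrm {d}+ i A\) of the covariant derivative. As a sketch for smooth \(\chi \), writing \(\mathbf{D}' = \mathrm {d}+ i A'\) and \(F' = \mathrm {d}A'\) (notation introduced here only for this check), the relation \((A, \phi ) = (A' - \mathrm {d}\chi , e^{i \chi } \phi ')\) gives

$$\begin{aligned} \mathbf{D}_{\mu } \phi =&(\partial _{\mu } + i A'_{\mu } - i \partial _{\mu } \chi ) (e^{i \chi } \phi ') = e^{i \chi } (\partial _{\mu } \phi ' + i \partial _{\mu } \chi \, \phi ' + i A'_{\mu } \phi ' - i \partial _{\mu } \chi \, \phi ') = e^{i \chi } \mathbf{D}'_{\mu } \phi ', \\ F_{\mu \nu } =&\partial _{\mu } (A'_{\nu } - \partial _{\nu } \chi ) - \partial _{\nu } (A'_{\mu } - \partial _{\mu } \chi ) = F'_{\mu \nu }. \end{aligned}$$

Hence \(\vert \mathbf{D}_{\mu } \phi \vert = \vert \mathbf{D}'_{\mu } \phi '\vert \) and \(F = F'\) pointwise, so the energy density is gauge invariant. This also explains why \(e = e'\) and \(g = e^{i \underline{\chi }} g'\) in the gauge equivalence of the initial data.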

We now pass to results from [27]. Given an interval \(I \subseteq {{\mathbb {R}}}\), we define the energy dispersion of a function \(\phi \) on \(I \times {{\mathbb {R}}}^{4}\) by

$$\begin{aligned} ED[\phi ](I) := \sup _{k \in {{\mathbb {Z}}}} \Big ( 2^{-k} \Vert P_{k} \phi \Vert _{L^{\infty }_{t,x}(I \times {{\mathbb {R}}}^{4})} + 2^{-2k} \Vert P_{k} (\partial _{t} \phi )\Vert _{L^{\infty }_{t,x}(I \times {{\mathbb {R}}}^{4})} \Big )\quad \quad \quad \end{aligned}$$
(4.9)
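
The weights \(2^{-k}\) and \(2^{-2k}\) in (4.9) are dictated by scaling. As a sketch, which uses only Bernstein's inequality on \({{\mathbb {R}}}^{4}\) and is not quoted from [27], note that the symbol of \(P_{k}\) depends only on \(\xi \), so \(P_{k}\) acts on each time slice. For each fixed \(t \in I\) we then have

$$\begin{aligned} 2^{-k} \Vert P_{k} \phi (t)\Vert _{L^{\infty }_{x}} \lesssim&2^{-k} \cdot 2^{2k} \Vert P_{k} \phi (t)\Vert _{L^{2}_{x}} \approx \Vert P_{k} \partial _{x} \phi (t)\Vert _{L^{2}_{x}}, \\ 2^{-2k} \Vert P_{k} \partial _{t} \phi (t)\Vert _{L^{\infty }_{x}} \lesssim&\Vert P_{k} \partial _{t} \phi (t)\Vert _{L^{2}_{x}}. \end{aligned}$$

Taking the supremum over \(k\) and \(t\) gives \(ED[\phi ](I) \lesssim \Vert \partial _{t,x} \phi \Vert _{L^{\infty }_{t} L^{2}_{x}(I \times {{\mathbb {R}}}^{4})}\). In other words, smallness of the energy dispersion is a weaker requirement than smallness of \(\partial _{t,x} \phi \) in \(L^{\infty }_{t} L^{2}_{x}\).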

The main theorem of [27] is as follows.

Theorem 4.7

(Energy dispersed regularity theorem)  For each \(E > 0\) there exist positive numbers \(\epsilon = \epsilon (E)\) and \(F = F(E)\) such that the following holds. Let \(I \subseteq {{\mathbb {R}}}\) be an open interval, and let \((A, \phi )\) be an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to (MKG) on \(I \times {{\mathbb {R}}}^{4}\) in the global Coulomb gauge \(\partial ^{\ell } A_{\ell } = 0\) with energy not exceeding E, i.e.,

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t\right\} \times {{\mathbb {R}}}^{4}}[A, \phi ] \le E \quad \hbox { for every } t \in I. \end{aligned}$$
(4.10)

If, furthermore, the energy dispersion of \(\phi \) on \(I \times {{\mathbb {R}}}^{4}\) is less than or equal to \(\epsilon (E)\), i.e.,

$$\begin{aligned} ED[\phi ](I) \le \epsilon (E), \end{aligned}$$
(4.11)

then the following a-priori estimate for \((A, \phi )\) on \(I \times {{\mathbb {R}}}^{4}\) holds:

$$\begin{aligned} \Vert A_{0}\Vert _{Y^{1}[I]} + \Vert A_x\Vert _{S^{1}[I]} + \Vert \phi \Vert _{S^{1}[I]} \le F(E). \end{aligned}$$
(4.12)

We also state a continuation and scattering result for Coulomb solutions with finite \(S^{1}\) norm, which is proved in [27].

Theorem 4.8

(Continuation and scattering of solutions with finite \(S^{1}\) norm)  Let \(0 < T_{+} \le \infty \) and let \((A, \phi )\) be an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to (MKG) on \([0, T_{+}) \times {{\mathbb {R}}}^{4}\) in the global Coulomb gauge which obeys the bound

$$\begin{aligned} \Vert A_{0}\Vert _{Y^{1}([0, T_{+}) \times {{\mathbb {R}}}^{4})} + \sup _{j=1, \ldots , 4} \Vert A_{j}\Vert _{S^{1}([0, T_{+}) \times {{\mathbb {R}}}^{4})} + \Vert \phi \Vert _{S^{1}([0, T_{+}) \times {{\mathbb {R}}}^{4})} < \infty . \end{aligned}$$

Then the following statements hold.

  1. (1)

    If \(T_{+} < \infty \), then \((A, \phi )\) extends to an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution with finite \(S^{1}\) norm past \(T_{+}\).

  2. (2)

    If \(T_{+} = \infty \), then \((A_{x}, \phi )\) scatters as \(t \rightarrow \infty \) in the following sense: There exists a solution \((A^{(\infty )}_{x}, \phi ^{(\infty )})\) to the system

    $$\begin{aligned} \left\{ \begin{aligned}&\Box A^{(\infty )}_{j} = 0, \\&\left( \Box + 2 i A^{free}_{\ell } \partial ^{\ell }\right) \phi ^{(\infty )} = 0, \end{aligned} \right. \end{aligned}$$

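    with initial data \(A^{(\infty )}_{x}[0], \phi ^{(\infty )}[0] \in \dot{H}^{1}_{x} \times L^{2}_{x}\) such that

    $$\begin{aligned} \Vert (A_{x} - A^{(\infty )}_{x})[t]\Vert _{\dot{H}^{1}_{x} \times L^{2}_{x}} + \Vert (\phi - \phi ^{(\infty )})[t]\Vert _{\dot{H}^{1}_{x} \times L^{2}_{x}} \rightarrow 0 \quad \hbox { as } t \rightarrow \infty . \end{aligned}$$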

    The above statement holds with \(A^{free}_{x}\) the solution to the homogeneous wave equation with any of the data \(A_{x}[0]\) or \(A_{x}^{(\infty )}[0]\).

Analogous statements hold in the past time direction as well.

5 Conservation laws and monotonicity formulae

In this section, we derive key conservation laws and monotonicity formulae that will serve as a basis for proving regularity and scattering. We begin by describing the main results, deferring their proofs until later in the section. We emphasize that all statements in this section apply to admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions to (MKG), unless otherwise stated.

One of the fundamental conservation laws for (MKG) is that of the standard energy: Given an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution \((A, \phi )\) to (MKG) on \(I \times {{\mathbb {R}}}^{4}\), for \(t_{0}, t_{1} \in I\) we have

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t_{0}\right\} \times {{\mathbb {R}}}^{4}}[A, \phi ] = {{\mathcal {E}}}_{\left\{ t_{1}\right\} \times {{\mathbb {R}}}^{4}}[A, \phi ]. \end{aligned}$$
(5.1)

For self-similar solutions, the finite energy condition translates into a weighted \(L^{2}\) estimate on \({{\mathcal {H}}}_{\rho }\). This estimate will be used to show that they must in fact be trivial.

Proposition 5.1

Let \((A, \phi )\) be a smooth solution to (MKG) on \(C_{(0, \infty )}\) with finite energy, i.e., there exists \(E > 0\) such that

$$\begin{aligned} \sup _{t \in (0, \infty )} {{\mathcal {E}}}_{S_{t}}[A, \phi ] \le E <\infty . \end{aligned}$$

Suppose furthermore that \((A, \phi )\) is self-similar, i.e., \(\iota _{X_{0}} F = 0\) and \((\mathbf{D}_{X_{0}} + \frac{1}{\rho }) \phi = 0\), where \(X_{0} = \partial _{\rho }\) in the hyperbolic coordinates \((\rho , y, \Theta )\). Then we have

$$\begin{aligned} \int _{{{\mathcal {H}}}_{\rho }} \frac{1}{2} \left( \frac{\cosh y}{\rho ^{2}}\vert \phi \vert ^{2} + 2 \frac{\sinh y}{\rho ^{2}} \mathrm {Re}(\phi \overline{\mathbf{D}_{y} \phi })+ \cosh y \left( \vert \mathbf{D}\phi \vert _{{{\mathcal {H}}}_{\rho }}^{2} + \vert F\vert _{{{\mathcal {H}}}_{\rho }}^{2}\right) \right) \le E,\nonumber \\ \end{aligned}$$
(5.2)

where \(\vert \mathbf{D}\phi \vert _{{{\mathcal {H}}}_{\rho }}^{2}, \vert F\vert _{{{\mathcal {H}}}_{\rho }}^{2}\) are to be defined in (5.27).

The next statement concerns the quantities

$$\begin{aligned} {{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}}[A, \phi ] := {{\mathcal {E}}}_{S_{t_{1}}}[A, \phi ] - {{\mathcal {E}}}_{S_{t_{0}}}[A, \phi ], \quad {{\mathcal {G}}}_{\partial S_{t_{1}}}[\phi ] := \frac{1}{t_{1}} \int _{\partial S_{t_{1}}} \vert \phi \vert ^{2}.\nonumber \\ \end{aligned}$$
(5.3)

Here, \({{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}}\) is the energy flux of \((A, \phi )\) through \(\partial C_{[t_{0}, t_{1}]}\). For \(\phi \in C_{t} (I; \dot{H}^{1}_{x})\) and \(t_{1} \in I\), observe that \({{\mathcal {G}}}_{\partial S_{t_{1}}}[\phi ]\) is well-defined by the trace theorem. In fact, \(\phi \!\upharpoonright _{\partial S_{t_{1}}} \in H^{1/2}(\partial S_{t_{1}})\).

Lemma 5.2

Let \((A, \phi )\) be an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to (MKG) on \(I \times {{\mathbb {R}}}^{4}\) where \(I \subset {{\mathbb {R}}}\) is an open interval. Then for every \(t_{0}, t_{1} \in I\) with \(t_{0} \le t_{1}\), the following statements hold:

  1. (1)

    The energy flux \({{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}} [A, \phi ]\) is non-negative and additive, i.e.,

    $$\begin{aligned} {{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}} [A, \phi ] = {{\mathcal {F}}}_{\partial C_{[t_{0}, t']}} [A, \phi ] + {{\mathcal {F}}}_{\partial C_{[t', t_{1}]}} [A, \phi ] \quad \hbox { for } t' \in [t_{0}, t_{1}].\nonumber \\ \end{aligned}$$
    (5.4)
  2. (2)

    The following local Hardy’s inequality holds on \(\partial C_{[t_{0}, t_{1}]}\):

    $$\begin{aligned} {{\mathcal {G}}}_{\partial S_{t_{0}}}[\phi ] + \int _{t_{0}}^{t_{1}} {{\mathcal {G}}}_{\partial S_{t}} [\phi ] \, \frac{\mathrm {d}t}{t} \le {{\mathcal {G}}}_{\partial S_{t_{1}}}[\phi ] + {{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}}[A, \phi ]. \end{aligned}$$
    (5.5)

    Moreover, we also have

    $$\begin{aligned} {{\mathcal {G}}}_{\partial S_{t_{1}}}[\phi ] \le {{\mathcal {E}}}_{(\left\{ t_{1}\right\} \times {{\mathbb {R}}}^{4}) \setminus S_{t_{1}}}[A, \phi ]. \end{aligned}$$
    (5.6)

A consequence of Lemma 5.2 is a simple but crucial decay result for the two quantities defined in (5.3).

Corollary 5.3

Let \((A, \phi )\) be an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to (MKG) on \(I \times {{\mathbb {R}}}^{4}\) where \(I \subset {{\mathbb {R}}}\) is an open interval. Then the following statements hold.

  1. (1)

    If \((0, \delta ] \subseteq I\) for some \(\delta > 0\), then we have

    $$\begin{aligned} \lim _{t_{1} \rightarrow 0} {{\mathcal {F}}}_{\partial C_{(0, t_{1}]}}[A, \phi ] = 0, \quad \lim _{t_{1} \rightarrow 0} {{\mathcal {G}}}_{\partial S_{t_{1}}}[\phi ] = 0. \end{aligned}$$
    (5.7)

    where \({{\mathcal {F}}}_{\partial C_{(0, t_{1}]}}[A, \phi ] := \lim _{t_{0} \rightarrow 0} {{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}}[A, \phi ]\).

  2. (2)

    If \([\delta , \infty ) \subseteq I\) for some \(\delta > 0\), then we have

    $$\begin{aligned} \lim _{t_{0}, t_{1} \rightarrow \infty } {{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}}[A, \phi ] = 0, \quad \lim _{t_{1} \rightarrow \infty } {{\mathcal {G}}}_{\partial S_{t_{1}}}[\phi ] = 0. \end{aligned}$$
    (5.8)

The statements concerning \({{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}}\) follow from the monotonicity and boundedness of \({{\mathcal {E}}}_{S_{t}}\), whereas those concerning \({{\mathcal {G}}}_{\partial S_{t_{1}}}\) follow from (5.5), (5.6); we omit the straightforward details.
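
For the reader's convenience, the flux part of this argument can be sketched as follows. By (5.3) and the non-negativity of the flux in Lemma 5.2, the function \(t \mapsto {{\mathcal {E}}}_{S_{t}}[A, \phi ]\) is non-decreasing; it is bounded below by 0 and above by the conserved energy (5.1). Hence it has finite limits as \(t \rightarrow 0\) and as \(t \rightarrow \infty \), whenever these endpoints are accessible within I, and therefore

$$\begin{aligned} \lim _{t_{1} \rightarrow 0} {{\mathcal {F}}}_{\partial C_{(0, t_{1}]}}[A, \phi ] =&\lim _{t_{1} \rightarrow 0} \Big ( {{\mathcal {E}}}_{S_{t_{1}}}[A, \phi ] - \lim _{t_{0} \rightarrow 0} {{\mathcal {E}}}_{S_{t_{0}}}[A, \phi ] \Big ) = 0, \\ \lim _{t_{0}, t_{1} \rightarrow \infty } {{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}}[A, \phi ] =&\lim _{t_{0}, t_{1} \rightarrow \infty } \Big ( {{\mathcal {E}}}_{S_{t_{1}}}[A, \phi ] - {{\mathcal {E}}}_{S_{t_{0}}}[A, \phi ] \Big ) = 0. \end{aligned}$$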

The decay statements (5.7) and (5.8) imply that the energy flux and the quantity \({{\mathcal {G}}}_{\partial S_{t}}[\phi ]\) vanish as one approaches (0, 0) or \(t \rightarrow \infty \). In the ideal case when \({{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}} = 0\) and \({{\mathcal {G}}}_{\partial S_{t_{1}}} = 0\), the solution \((A, \phi )\) enjoys an additional monotonicity formula, namely

$$\begin{aligned}&\int _{S_{t_{1}}} {}^{(X_{0})} P_{T}[A, \phi ] \, \mathrm {d}x + \iint _{C_{[t_{0}, t_{1}]}} \frac{1}{\rho } \vert \iota _{X_{0}} F\vert ^{2}\nonumber \\&\quad + \frac{1}{\rho } \left| {\left( \mathbf{D}_{X_{0}} + \frac{1}{\rho }\right) \phi }\right| ^{2} \, \mathrm {d}t \mathrm {d}x = \int _{S_{t_{0}}} {}^{(X_{0})} P_{T}[A, \phi ] \, \mathrm {d}x \end{aligned}$$
(5.9)

where \(X_{0} = \partial _{\rho }\) in the hyperbolic coordinate system \((\rho , y, \Theta )\), \(\vert \iota _{X_{0}} F\vert ^{2} := \mathbf{m}(\iota _{X_{0}} F, \iota _{X_{0}} F)\) (observe that \(\vert \iota _{X_{0}} F\vert ^{2} \ge 0\), since \(\iota _{X_{0}} F\) annihilates the time-like vector \(X_{0}\) by the antisymmetry of F), and \({}^{(X_{0})} P_{T}[A, \phi ]\) is defined below in Lemma 5.10. It turns out that the right-hand side is uniformly bounded by the conserved energy as \(t_{0} \rightarrow 0\), thereby breaking the scaling invariance. More precisely, the first term on the left-hand side precludes null concentration of energy, whereas the second term implies that rescalings of \((A, \phi )\) are asymptotically self-similar.

In application, however, the quantities \({{\mathcal {F}}}\) and \({{\mathcal {G}}}\) will be small but not necessarily zero. Hence we will rely on the following approximate version of (5.9) instead. Define

$$\begin{aligned}&\rho _{\varepsilon } = \sqrt{(t+\varepsilon )^{2} - r^{2}}, \quad X_{\varepsilon } = \rho _{\varepsilon }^{-1} ((t + \varepsilon ) \partial _{t} + r \partial _{r}), \\&\vert \iota _{X_{\varepsilon }} F\vert ^{2} := \mathbf{m}(\iota _{X_{\varepsilon }} F, \iota _{X_{\varepsilon }} F). \end{aligned}$$
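
Two elementary properties of \(X_{\varepsilon }\), which follow directly from the definitions just given, are worth recording as a sketch. First, \(X_{\varepsilon }\) is a unit time-like vector field wherever \(\rho _{\varepsilon } > 0\):

$$\begin{aligned} \mathbf{m}(X_{\varepsilon }, X_{\varepsilon }) = \rho _{\varepsilon }^{-2} \left( - (t + \varepsilon )^{2} + r^{2}\right) = -1. \end{aligned}$$

Second, on the cone \(C_{[\varepsilon , 1]}\), where \(r \le t\), we have

$$\begin{aligned} \rho _{\varepsilon }^{2} = (t + \varepsilon )^{2} - r^{2} \ge (t + \varepsilon )^{2} - t^{2} = \varepsilon (2 t + \varepsilon ) > 0, \end{aligned}$$

so that, in contrast to \(X_{0} = \rho ^{-1} (t \partial _{t} + r \partial _{r})\), the vector field \(X_{\varepsilon }\) remains smooth up to the lateral boundary \(\partial C_{[\varepsilon , 1]}\). Note that \(X_{\varepsilon }\) is simply \(X_{0}\) translated by \(\varepsilon \) in time.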

Proposition 5.4

Let \((A, \phi )\) be an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to (MKG) on \([\varepsilon , 1] \times {{\mathbb {R}}}^{4}\), where \(\varepsilon \in (0, 1)\). Suppose furthermore that \((A, \phi )\) satisfies

$$\begin{aligned} {{\mathcal {E}}}_{S_{1}}[A, \phi ] \le E, \quad {{\mathcal {F}}}_{\partial C_{[\varepsilon , 1]}} [A, \phi ] \le \varepsilon ^{\frac{1}{2}} E, \quad {{\mathcal {G}}}_{\partial S_{1}}[\phi ] \le \varepsilon ^{\frac{1}{2}} E. \end{aligned}$$
(5.10)

Then

$$\begin{aligned}&\int _{S_{1}} {}^{(X_{\varepsilon })} P_{T}[A, \phi ] \, \mathrm {d}x + \iint _{C_{[\varepsilon , 1]}} \frac{1}{\rho _{\varepsilon }} \vert \iota _{X_{\varepsilon }} F\vert ^{2} + \frac{1}{\rho _{\varepsilon }} \left| \left( \mathbf{D}_{X_{\varepsilon }} + \frac{1}{\rho _{\varepsilon }}\right) \phi \right| ^{2} \, \mathrm {d}t \mathrm {d}x \lesssim E\nonumber \\ \end{aligned}$$
(5.11)

where the implicit constant is independent of \(\varepsilon , E\). We refer to Lemma 5.10 for the computation of \({}^{(X_{\varepsilon })} P_{T}[A, \phi ]\).

Using Proposition 5.4, we can also establish a version of (5.9) that is localized away from the boundary of the cone. This statement will be useful for propagating lower bounds in a time-like region towards (0, 0).

Proposition 5.5

Let \((A, \phi )\) be an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to (MKG) on \([\varepsilon , 1] \times {{\mathbb {R}}}^{4}\), where \(\varepsilon \in (0, 1)\). Suppose furthermore that \((A, \phi )\) satisfies (5.10). Then for \(2 \varepsilon \le \delta _{0} < \delta _{1} \le t_{0} \le 1\), we have

$$\begin{aligned} \int _{S_{1}^{\delta _{1}}} {}^{(X_{0})} P_{T}[A, \phi ] \, \mathrm {d}x\le & {} \int _{S_{t_{0}}^{\delta _{0}}} {}^{(X_{0})} P_{T}[A, \phi ] \, \mathrm {d}x \nonumber \\&+\,C\Big ( (\delta _{1} / t_{0})^{\frac{1}{2}}+ \vert \log (\delta _{1} / \delta _{0})\vert ^{-1} \Big ) E. \end{aligned}$$
(5.12)

The rest of this section is devoted to the proofs of the above statements, and is organized as follows. In Sect. 5.1, we discuss ways of generating divergence identities for proving the above conservation laws and monotonicity formulae. We also introduce the null decomposition, which will assist our computations below. In Sect. 5.2, we use them to prove (5.1) and Proposition 5.1. In Sect. 5.3, we introduce and prove a local version of Hardy’s inequality and use it to establish Lemma 5.2. Lastly, Sect. 5.4 is devoted to the proof of (5.9) and Propositions 5.4 and 5.5.

5.1 Divergence identities and null decomposition

The goal of this subsection is two-fold. First, we introduce methods for generating useful divergence identities for solutions to (MKG) that essentially arise from Nöther’s principle. Second, we define the notion of a null frame and the associated null decomposition of F and \(\mathbf{D}\phi \), which will be useful for the computations below.

We first present the energy-momentum tensor formalism for generating divergence identities. This formalism is a way to exploit Nöther’s principle (continuous symmetries lead to conserved quantities in a Lagrangian field theory) for external symmetries, i.e., symmetries of the base manifold \({{\mathbb {R}}}^{1+4}\) of (MKG). Let \((A, \phi )\) be a smooth solution to (MKG) on an open subset \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\). We define the energy-momentum tensor associated to \((A, \phi )\) as

$$\begin{aligned} {\mathcal {Q}}_{\mathbf{a}\mathbf{b}}[A, \phi ] = {}^{(\mathrm {M})}{\mathcal {Q}}_{\mathbf{a}\mathbf{b}}[A] + {}^{(\mathrm {KG})}{\mathcal {Q}}_{\mathbf{a}\mathbf{b}}[A, \phi ] \end{aligned}$$
(5.13)

where

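$$\begin{aligned} {}^{(\mathrm {M})}{\mathcal {Q}}_{\mathbf{a}\mathbf{b}}[A] = F_{\mathbf{a}}{}^{\mathbf{c}} F_{\mathbf{b}\mathbf{c}} - \frac{1}{4} \mathbf{m}_{\mathbf{a}\mathbf{b}} F_{\mathbf{c}\mathbf{d}} F^{\mathbf{c}\mathbf{d}}, \end{aligned}$$
(5.14)

$$\begin{aligned} {}^{(\mathrm {KG})}{\mathcal {Q}}_{\mathbf{a}\mathbf{b}}[A, \phi ] = \mathrm {Re}(\mathbf{D}_{\mathbf{a}} \phi \overline{\mathbf{D}_{\mathbf{b}} \phi }) - \frac{1}{2} \mathbf{m}_{\mathbf{a}\mathbf{b}} \mathbf{D}^{\mathbf{c}} \phi \overline{\mathbf{D}_{\mathbf{c}} \phi }. \end{aligned}$$
(5.15)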

Note that \({\mathcal {Q}}\) is a symmetric 2-tensor, which is gauge invariant at each point. Moreover, since \((A, \phi )\) is a smooth solution to (MKG), the energy-momentum tensor satisfies

$$\begin{aligned} \nabla ^{\mathbf{a}} {\mathcal {Q}}_{\mathbf{a}\mathbf{b}}[A, \phi ] = 0. \end{aligned}$$
(5.16)

Given a vector field X on \({{\mathcal {O}}}\), we define its deformation tensor to be the Lie derivative of the metric with respect to X, i.e., \({}^{(X)} \pi := {{\mathcal {L}}}_{X} \mathbf{m}\). Using covariant derivatives, \({}^{(X)} \pi \) also takes the form

$$\begin{aligned} {}^{(X)} \pi _{\mathbf{a}\mathbf{b}} = \nabla _{\mathbf{a}} X_{\mathbf{b}} + \nabla _{\mathbf{b}} X_{\mathbf{a}}. \end{aligned}$$

We will denote the metric dual of \({}^{(X)} \pi \) by \({}^{(X)} \pi ^{\sharp }\), i.e., \(({}^{(X)} \pi ^{\sharp })^{\mathbf{a}\mathbf{b}} = \mathbf{m}^{\mathbf{a}\mathbf{c}} \mathbf{m}^{\mathbf{b}\mathbf{d}} {}^{(X)} \pi _{\mathbf{c}\mathbf{d}}\). From its Lie derivative definition, the following formula for \({}^{(X)} \pi _{\mu \nu }\) in coordinates can be immediately derived:

$$\begin{aligned} {}^{(X)} \pi _{\mu \nu } = X(\mathbf{m}_{\mu \nu }) + \partial _{\mu } (X^{\alpha }) \mathbf{m}_{\alpha \nu } + \partial _{\nu } (X^{\alpha }) \mathbf{m}_{\alpha \mu } \end{aligned}$$
(5.17)

Using the deformation tensor, we now define the associated 1- and 0-currents of \((A, \phi )\) as

$$\begin{aligned} \begin{aligned} {}^{(X)} J_{\mathbf{a}} [A, \phi ] :=\,&{\mathcal {Q}}_{\mathbf{a}\mathbf{b}}[A, \phi ] X^{\mathbf{b}}, \\ {}^{(X)} K[A, \phi ] :=\,&{\mathcal {Q}}_{\mathbf{a}\mathbf{b}}[A, \phi ] \left( \frac{1}{2} {}^{(X)} \pi ^{\sharp }\right) ^{\mathbf{a}\mathbf{b}}. \end{aligned} \end{aligned}$$
(5.18)

Then by (5.16) and the symmetry of \({\mathcal {Q}}[A, \phi ]_{\mathbf{a}\mathbf{b}}\), we obtain

$$\begin{aligned} \nabla ^{\mathbf{a}} \left( {}^{(X)} J_{\mathbf{a}}[A,\phi ]\right) = {}^{(X)} K[A, \phi ]. \end{aligned}$$
(5.19)
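For the reader's convenience, here is a sketch of the computation behind (5.19). It assumes only (5.16), the symmetry of \({\mathcal {Q}}\), and the covariant expression \({}^{(X)} \pi _{\mathbf{a}\mathbf{b}} = \nabla _{\mathbf{a}} X_{\mathbf{b}} + \nabla _{\mathbf{b}} X_{\mathbf{a}}\) for the deformation tensor:

$$\begin{aligned} \nabla ^{\mathbf{a}} \left( {\mathcal {Q}}_{\mathbf{a}\mathbf{b}} X^{\mathbf{b}}\right)&= \left( \nabla ^{\mathbf{a}} {\mathcal {Q}}_{\mathbf{a}\mathbf{b}}\right) X^{\mathbf{b}} + {\mathcal {Q}}_{\mathbf{a}\mathbf{b}} \nabla ^{\mathbf{a}} X^{\mathbf{b}} \\&= \frac{1}{2} {\mathcal {Q}}_{\mathbf{a}\mathbf{b}} \left( \nabla ^{\mathbf{a}} X^{\mathbf{b}} + \nabla ^{\mathbf{b}} X^{\mathbf{a}}\right) = {\mathcal {Q}}_{\mathbf{a}\mathbf{b}} \left( \frac{1}{2} {}^{(X)} \pi ^{\sharp }\right) ^{\mathbf{a}\mathbf{b}}. \end{aligned}$$

The first term on the right-hand side of the first line vanishes by (5.16), and the symmetrization in the second line uses \({\mathcal {Q}}_{\mathbf{a}\mathbf{b}} = {\mathcal {Q}}_{\mathbf{b}\mathbf{a}}\).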

Remark 5.6

Taking \(X = T = \partial _{t}\) in the rectilinear coordinates \((t, x^{1}, \ldots , x^{4})\), we have \({}^{(T)} \pi = 0\) (in other words, T is a Killing vector field) and hence \({}^{(T)} K = 0\). In fact, (5.19) is a local form of the standard conservation of energy (5.1). We refer to Sect. 5.2 for more details.

For a (smooth) scalar field \(\phi \) satisfying the gauge covariant wave equation \(\Box _{A} \phi = 0\), we introduce another way of generating divergence identities. This method corresponds to using Nöther’s principle for the symmetry of the equation under the action of \({{\mathbb {C}}}\) viewed as the complexification of the gauge group U(1). Given a \({{\mathbb {C}}}\)-valued function w on an open subset of \({{\mathbb {R}}}^{1+4}\), we define its associated 1- and 0-currents by

$$\begin{aligned} \begin{aligned} {}^{(w)} J_{\mathbf{a}}[A, \phi ] =\,&(\mathrm {Re}\, w) \mathrm {Re}(\phi \overline{\mathbf{D}_{\mathbf{a}} \phi }) - (\mathrm {Im}\, w) \mathrm {Im}(\phi \overline{\mathbf{D}_{\mathbf{a}} \phi }) - \frac{1}{2} \nabla _{\mathbf{a}} (\mathrm {Re}\, w) \vert \phi \vert ^{2}, \\ {}^{(w)} K[A, \phi ] =\,&(\mathrm {Re}\, w) \mathbf{D}_{\mathbf{a}} \phi \overline{\mathbf{D}^{\mathbf{a}}\phi } - \frac{1}{2} \Box (\mathrm {Re}\, w) \vert \phi \vert ^{2} - \nabla _{\mathbf{a}} (\mathrm {Im}\, w) \mathrm {Im}(\phi \overline{\mathbf{D}^{\mathbf{a}} \phi }). \end{aligned}\nonumber \\ \end{aligned}$$
(5.20)

A simple computation (see Footnote 12) shows that the following conservation law holds:

$$\begin{aligned} \nabla ^{\mathbf{a}} \left( {}^{(w)} J_{\mathbf{a}}[A, \phi ]\right) = {}^{(w)} K[A, \phi ]. \end{aligned}$$
(5.21)
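As an illustration, (5.21) can be checked directly in the special case of a real-valued w. The following is only a sketch; it assumes the Leibniz rules \(\nabla _{\mathbf{a}} \vert \phi \vert ^{2} = 2 \mathrm {Re}(\phi \overline{\mathbf{D}_{\mathbf{a}} \phi })\) and \(\nabla ^{\mathbf{a}} (\phi \overline{\mathbf{D}_{\mathbf{a}} \phi }) = \mathbf{D}^{\mathbf{a}} \phi \overline{\mathbf{D}_{\mathbf{a}} \phi } + \phi \overline{\Box _{A} \phi }\), the convention \(\Box = \nabla ^{\mathbf{a}} \nabla _{\mathbf{a}}\), and the equation \(\Box _{A} \phi = 0\):

$$\begin{aligned} \nabla ^{\mathbf{a}} \left( {}^{(w)} J_{\mathbf{a}}\right)&= \nabla ^{\mathbf{a}} w \, \mathrm {Re}(\phi \overline{\mathbf{D}_{\mathbf{a}} \phi }) + w \, \mathbf{D}^{\mathbf{a}} \phi \overline{\mathbf{D}_{\mathbf{a}} \phi } - \frac{1}{2} \Box w \, \vert \phi \vert ^{2} - \frac{1}{2} \nabla _{\mathbf{a}} w \, \nabla ^{\mathbf{a}} \vert \phi \vert ^{2} \\&= w \, \mathbf{D}_{\mathbf{a}} \phi \overline{\mathbf{D}^{\mathbf{a}} \phi } - \frac{1}{2} \Box w \, \vert \phi \vert ^{2} = {}^{(w)} K. \end{aligned}$$

The first and last terms on the first line cancel, which is the reason for including the correction \(- \frac{1}{2} \nabla _{\mathbf{a}} (\mathrm {Re}\, w) \vert \phi \vert ^{2}\) in the definition of \({}^{(w)} J\).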

Remark 5.7

Taking \(w = -i\), we have

$$\begin{aligned} {}^{(w)} J_{\mathbf{a}} = \mathrm {Im}(\phi \overline{\mathbf{D}_{\mathbf{a}} \phi }), \quad {}^{(w)} K = 0, \end{aligned}$$

and (5.21) reduces to the well-known local conservation of charge.

Finally, we introduce the notion of a null frame and the associated null decomposition of \(\mathbf{D}\phi \) and F, which are useful for computations concerning the energy-momentum tensor. At each point \(p = (t_{0}, x_{0}) \in {{\mathbb {R}}}^{1+4}\), consider orthonormal vectors \(\left\{ e_{{\mathfrak {a}}}\right\} _{{\mathfrak {a}}=1,\ldots , 3}\) which are orthogonal to L and \(\underline{L}\). Observe that each \(e_{{\mathfrak {a}}}\) is tangent to the sphere \(\partial B_{t_{0}, r_{0}} := \left\{ t_{0}\right\} \times \partial B_{r_{0}}(0)\) where \(r_{0} = \vert x_{0}\vert \). The set of vectors \(\left\{ L, \underline{L}, e_{1}, e_{2}, e_{3}\right\} \) at p is called a null frame at p associated to \(L, \underline{L}\).

The \({{\mathbb {C}}}\)-valued 1-form \(\mathbf{D}\phi \) can be decomposed with respect to the null frame \(\left\{ L, \underline{L}, e_{{\mathfrak {a}}}\right\} \) as \(\mathbf{D}_{L} \phi , \mathbf{D}_{\underline{L}} \phi \) and \({\not \!\! \mathbf{D}}_{{\mathfrak {a}}} \phi := \mathbf{D}_{e_{{\mathfrak {a}}}} \phi \), which is the null decomposition of \(\mathbf{D}\phi \). A simple computation shows that

$$\begin{aligned} {}^{(\mathrm {KG})}{\mathcal {Q}}[A, \phi ](L, L)= & {} \vert \mathbf{D}_{L} \phi \vert ^{2}, \nonumber \\ \ {}^{(\mathrm {KG})}{\mathcal {Q}}[A, \phi ](\underline{L}, \underline{L})= & {} \vert \mathbf{D}_{\underline{L}} \phi \vert ^{2}, \ {}^{(\mathrm {KG})}{\mathcal {Q}}[A, \phi ](L, \underline{L}) = \vert \not \!\! \mathbf{D}\phi \vert ^{2} \end{aligned}$$
(5.22)

where \(\vert \not \!\! \mathbf{D}\phi \vert ^{2} := \sum _{{\mathfrak {a}}=1, \ldots , 3} \vert \not \!\! \mathbf{D}_{{\mathfrak {a}}} \phi \vert ^{2}\).

Next, we define the null decomposition of the 2-form F with respect to \(\left\{ L, \underline{L}, e_{{\mathfrak {a}}}\right\} \) as

$$\begin{aligned} \alpha _{{\mathfrak {a}}} := F(L, e_{{\mathfrak {a}}}), \quad \underline{\alpha }_{{\mathfrak {a}}} := F(\underline{L}, e_{{\mathfrak {a}}}), \quad \varrho := \frac{1}{2} F(L, \underline{L}), \quad \sigma _{{\mathfrak {a}}{\mathfrak {b}}} := F(e_{{\mathfrak {a}}}, e_{{\mathfrak {b}}}). \end{aligned}$$

Note that \(\varrho \) is a function, \(\alpha _{{\mathfrak {a}}}, \underline{\alpha }_{{\mathfrak {b}}}\) are 1-forms on \(\partial B_{t_{0}, r_{0}}\) and \(\sigma _{{\mathfrak {a}}{\mathfrak {b}}}\) is a 2-form on \(\partial B_{t_{0}, r_{0}}\). We define their pointwise absolute values as

$$\begin{aligned} \vert \alpha \vert ^{2} := \sum _{{\mathfrak {a}}=1, \ldots , 3} \alpha _{{\mathfrak {a}}}^{2}, \quad \vert \underline{\alpha }\vert ^{2} := \sum _{{\mathfrak {a}}=1, \ldots , 3} \underline{\alpha }_{{\mathfrak {a}}}^{2}, \quad \vert \sigma \vert ^{2} := \sum _{1 \le {\mathfrak {a}}< {\mathfrak {b}}\le 3} \sigma _{{\mathfrak {a}}{\mathfrak {b}}}^{2}. \end{aligned}$$

This decomposition leads to the following simple formulae for the \(L, \underline{L}\) components of \({}^{(\mathrm {M})} {\mathcal {Q}}\):

$$\begin{aligned}&{}^{(\mathrm {M})}{\mathcal {Q}}[A](L, L) = \vert \alpha \vert ^{2}, \quad {}^{(\mathrm {M})}{\mathcal {Q}}[A](\underline{L}, \underline{L}) = \vert \underline{\alpha }\vert ^{2}, \nonumber \\&{}^{(\mathrm {M})}{\mathcal {Q}}[A](L, \underline{L}) = \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2}. \end{aligned}$$
(5.23)

5.2 The standard energy identity and proof of Proposition 5.1

Consider the vector field T, which is equal to the coordinate vector field \(\partial _{t}\) in the rectilinear coordinates \((t, x^{1}, \ldots , x^{4})\). It can be easily checked that T is Killing, i.e., \({}^{(T)} \pi = 0\). Contracting T with the energy-momentum tensor \({\mathcal {Q}}[A, \phi ]\), we then obtain the local conservation of energy, i.e., given a smooth solution \((A, \phi )\) to (MKG) on an open subset \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\), we have

$$\begin{aligned} \nabla ^{\mathbf{a}} ({}^{(T)} J_{\mathbf{a}}[A, \phi ] )= 0 \quad \hbox { on } {{\mathcal {O}}}. \end{aligned}$$
(5.24)

Since \(T = \frac{1}{2}(L + \underline{L})\), we have

$$\begin{aligned} {}^{(T)} J_{L} [A, \phi ]= & {} {\mathcal {Q}}[A, \phi ](T, L) = \frac{1}{2} \left( \vert \mathbf{D}_{L} \phi \vert ^{2} + \vert \not \!\! \mathbf{D}\phi \vert ^{2}\right) \nonumber \\&+ \frac{1}{2} \left( \vert \alpha \vert ^{2} + \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2}\right) , \end{aligned}$$
(5.25)
$$\begin{aligned} {}^{(T)} J_{\underline{L}} [A, \phi ]= & {} {\mathcal {Q}}[A, \phi ](T, \underline{L}) = \frac{1}{2} \left( \vert \mathbf{D}_{\underline{L}} \phi \vert ^{2} + \vert \not \!\! \mathbf{D}\phi \vert ^{2}\right) \nonumber \\&+ \frac{1}{2} \left( \vert \underline{\alpha }\vert ^{2} + \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2}\right) . \end{aligned}$$
(5.26)

Given a (measurable) subset \(S \subseteq \left\{ t\right\} \times {{\mathbb {R}}}^{4}\) for some \(t \in {{\mathbb {R}}}\), the above computation implies

$$\begin{aligned} {{\mathcal {E}}}_{S}[A, \phi ] = \int _{S} {}^{(T)} J_{T}[A, \phi ]\, \mathrm {d}x. \end{aligned}$$

We are now ready to give a quick proof of (5.1). For a classical solution \((A, \phi )\) in the class \(C_{t} {{\mathcal {H}}}^{1} ([t_{0}, t_{1}] \times {{\mathbb {R}}}^{4})\), the standard energy conservation (5.1) follows by integrating (5.24) over \((t_{0}, t_{1}) \times {{\mathbb {R}}}^{4}\) and applying the divergence theorem. The case of an admissible solution then easily follows by approximation.

We conclude this subsection with a proof of Proposition 5.1.

Proof of Proposition 5.1

Note that \(X_{0} = \partial _{\rho }\) and \(T = \cosh y \partial _{\rho } - \sinh y (\rho ^{-1} \partial _{y})\) in the hyperbolic coordinates \((\rho , y, \Theta )\). In the following computation, we use the orthonormal frame \(\left\{ \partial _{\rho }, \rho ^{-1} \partial _{y}, e_{{\mathfrak {a}}}\right\} \) at each point, where \(\left\{ e_{{\mathfrak {a}}}\right\} _{{\mathfrak {a}}=1,2,3}\) is an orthonormal frame tangent to the constant \(\rho , y\) sphere as before. Then we compute

$$\begin{aligned} {}^{(\mathrm {KG})}{\mathcal {Q}}[A, \phi ](\partial _{\rho }, \partial _{\rho })= & {} \frac{1}{2} \Big ( \vert \mathbf{D}_{\rho } \phi \vert ^{2} + \vert \rho ^{-1} \mathbf{D}_{y} \phi \vert ^{2} + \vert \not \!\! \mathbf{D}\phi \vert ^{2} \Big ) \\ {}^{(\mathrm {KG})}{\mathcal {Q}}[A, \phi ](\partial _{\rho }, \rho ^{-1} \partial _{y})= & {} \mathrm {Re}(\mathbf{D}_{\rho } \phi \overline{\rho ^{-1} \mathbf{D}_{y} \phi }), \\ {}^{(\mathrm {M})} {\mathcal {Q}}[A, \phi ](\partial _{\rho }, \partial _{\rho })= & {} \frac{1}{2} F(\partial _{\rho }, \rho ^{-1} \partial _{y})^{2} + \frac{1}{2} \sum _{{\mathfrak {a}}= 1, \ldots , 3} F(\partial _{\rho }, e_{{\mathfrak {a}}})^{2} \\&+ \frac{1}{2} \sum _{{\mathfrak {a}}= 1, \ldots , 3} \rho ^{-2} F(\partial _{y}, e_{{\mathfrak {a}}})^{2}\nonumber \\&+ \frac{1}{2} \sum _{1 \le {\mathfrak {a}}< {\mathfrak {b}}\le 3} F(e_{{\mathfrak {a}}}, e_{{\mathfrak {b}}})^{2}, \\ {}^{(\mathrm {M})} {\mathcal {Q}}[A, \phi ](\partial _{\rho }, \rho ^{-1} \partial _{y})= & {} \sum _{{\mathfrak {a}}= 1, \ldots , 3} F(\partial _{\rho }, e_{{\mathfrak {a}}}) F(\rho ^{-1} \partial _{y}, e_{{\mathfrak {a}}}). \end{aligned}$$

By the self-similarity conditions \(\iota _{\partial _{\rho }} F = F(\partial _{\rho }, \cdot ) = 0\) and \((\mathbf{D}_{\rho } + \frac{1}{\rho }) \phi = 0\), we have

$$\begin{aligned} {}^{(T)} J_{\rho }[A, \phi ]= & {} \cosh y {\mathcal {Q}}[A, \phi ](\partial _{\rho }, \partial _{\rho }) - \sinh y {\mathcal {Q}}[A, \phi ](\rho ^{-1} \partial _{y}, \partial _{\rho }) \\= & {} \frac{1}{2} \left( \frac{\cosh y}{\rho ^{2}}\vert \phi \vert ^{2} + 2 \frac{\sinh y}{\rho ^{2}} \mathrm {Re}(\phi \overline{\mathbf{D}_{y} \phi })\right. \\&\left. + \cosh y \left( \vert \mathbf{D}\phi \vert _{{{\mathcal {H}}}_{\rho }}^{2} + \vert F\vert _{{{\mathcal {H}}}_{\rho }}^{2}\right) \right) \end{aligned}$$

where

$$\begin{aligned} \vert \mathbf{D}\phi \vert _{{{\mathcal {H}}}_{\rho }}^{2} := \left( g_{{{\mathcal {H}}}_{\rho }}^{-1}\right) ^{\mathbf{a}\mathbf{b}} \mathbf{D}_{\mathbf{a}} \phi \overline{\mathbf{D}_{\mathbf{b}} \phi }, \vert F\vert _{{{\mathcal {H}}}_{\rho }}^{2} := \frac{1}{2} \left( g_{{{\mathcal {H}}}_{\rho }}^{-1}\right) ^{\mathbf{a}\mathbf{c}} \left( g_{{{\mathcal {H}}}_{\rho }}^{-1}\right) ^{\mathbf{b}\mathbf{d}} F_{\mathbf{a}\mathbf{b}} F_{\mathbf{c}\mathbf{d}},\quad \quad \quad \end{aligned}$$
(5.27)

and \(g_{{{\mathcal {H}}}_{\rho }}^{-1} = \rho ^{-2} \partial _{y} \cdot \partial _{y} + \sum _{{\mathfrak {a}}=1,2,3} e_{{\mathfrak {a}}} \cdot e_{{\mathfrak {a}}}\) is the inverse of the induced metric on \({{\mathcal {H}}}_{\rho }\).

Fig. 1 Domain of integration for the proof of Proposition 5.1

We are ready to complete the proof. Denote by \({{\mathcal {H}}}_{>\rho }\) the region \(\{(\rho ', y', \Theta ') : \rho ' > \rho \}\). Integrate (5.24) over the region \({{\mathcal {R}}}:= C_{(0, t)} \cap {{\mathcal {H}}}_{> \rho }\), whose boundary is \((S_{t} \cap {{\mathcal {H}}}_{> \rho }) \cup ({{\mathcal {H}}}_{\rho } \cap C_{(0, t)})\), and apply the divergence theorem; see Fig. 1. Then taking \(t \rightarrow \infty \), the desired estimate (5.2) on \({{\mathcal {H}}}_{\rho }\) follows. \(\square \)

5.3 A localized Hardy’s inequality and proof of Lemma 5.2

We begin by stating a very general identity (valid for any dimension \(d \ge 3\)), which can be thought of as Hardy’s inequality with all the error terms explicit.

Lemma 5.8

Let \(\phi \) be a smooth \({{\mathbb {C}}}\)-valued function and A be a smooth 1-form on \({{\mathbb {R}}}^{d}\) \((d \ge 3)\). Then for \(0 < r_{1} < r_{2}\), we have

$$\begin{aligned}&\int _{r_{1}}^{r_{2}} \int \frac{1}{r^{2}} \vert \phi \vert ^{2} r^{d-1} \, \mathrm {d}\sigma _{{{\mathbb {S}}}^{d-1}} \, \mathrm {d}r\nonumber \\&\qquad + \int _{r_{1}}^{r_{2}} \int \left| \frac{2}{d-2} \mathbf{D}_{r} \phi + \frac{1}{r} \phi \right| ^{2} r^{d-1} \, \mathrm {d}\sigma _{{{\mathbb {S}}}^{d-1}} \, \mathrm {d}r \nonumber \\&\quad = \left( \frac{2}{d-2} \right) ^{2} \int _{r_{1}}^{r_{2}} \int \vert \mathbf{D}_{r} \phi \vert ^{2} r^{d-1} \, \mathrm {d}\sigma _{{{\mathbb {S}}}^{d-1}} \, \mathrm {d}r \nonumber \\&\qquad + \frac{2}{d-2} \int \vert \phi \vert ^{2} r^{d-2} \, \mathrm {d}\sigma _{{{\mathbb {S}}}^{d-1}} \Big \vert _{r = r_{1}}^{r_{2}} . \end{aligned}$$
(5.28)

We omit the proof, which is simple algebra plus an application of the fundamental theorem of calculus in r. Specializing to \(d = 4\) and rearranging some terms, we obtain

$$\begin{aligned}&\int _{\left\{ r=r_{1}\right\} } \frac{\vert \phi \vert ^{2}}{r} r^{3} \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} + \int _{r_{1}}^{r_{2}} \int \frac{1}{r^{2}} \vert \phi \vert ^{2} r^{3} \, \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} \, \mathrm {d}r\nonumber \\&\qquad + \int _{r_{1}}^{r_{2}} \int \vert r^{-1} \mathbf{D}_{r} (r \phi )\vert ^{2} r^{3} \, \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} \, \mathrm {d}r \nonumber \\&\quad = \int _{\left\{ r=r_{2}\right\} } \frac{\vert \phi \vert ^{2}}{r} r^{3} \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} + \int _{r_{1}}^{r_{2}} \int \vert \mathbf{D}_{r} \phi \vert ^{2} r^{3} \, \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} \, \mathrm {d}r. \end{aligned}$$
(5.29)
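The omitted computation behind (5.28) can be sketched as follows, writing \(a := \frac{2}{d-2}\) (a shorthand introduced only for this sketch) and using \(\partial _{r} \vert \phi \vert ^{2} = 2 \mathrm {Re}(\phi \overline{\mathbf{D}_{r} \phi })\). Pointwise, we have

$$\begin{aligned} \left| a \mathbf{D}_{r} \phi + \frac{1}{r} \phi \right| ^{2} r^{d-1}&= a^{2} \vert \mathbf{D}_{r} \phi \vert ^{2} r^{d-1} + a \, r^{d-2} \partial _{r} \vert \phi \vert ^{2} + \vert \phi \vert ^{2} r^{d-3}, \\ a \, r^{d-2} \partial _{r} \vert \phi \vert ^{2}&= a \, \partial _{r} \left( r^{d-2} \vert \phi \vert ^{2}\right) - 2 \vert \phi \vert ^{2} r^{d-3}, \end{aligned}$$

where the second line uses \(a (d-2) = 2\). Adding \(\frac{1}{r^{2}} \vert \phi \vert ^{2} r^{d-1} = \vert \phi \vert ^{2} r^{d-3}\) to the first line, the zeroth order terms cancel and we are left with

$$\begin{aligned} \frac{1}{r^{2}} \vert \phi \vert ^{2} r^{d-1} + \left| a \mathbf{D}_{r} \phi + \frac{1}{r} \phi \right| ^{2} r^{d-1} = a^{2} \vert \mathbf{D}_{r} \phi \vert ^{2} r^{d-1} + a \, \partial _{r} \left( r^{d-2} \vert \phi \vert ^{2}\right) . \end{aligned}$$

Integrating over \(r \in [r_{1}, r_{2}]\) and over \({{\mathbb {S}}}^{d-1}\) yields (5.28). For \(d = 4\) we have \(a = 1\) and \(\mathbf{D}_{r} \phi + \frac{1}{r} \phi = r^{-1} \mathbf{D}_{r}(r \phi )\), which gives (5.29).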

The last term on the left-hand side of (5.29) is always non-negative; moreover, for \(\phi \in {{\mathcal {S}}}({{\mathbb {R}}}^{4})\), the first term on the right-hand side vanishes as \(r_{2} \rightarrow \infty \). By approximation, the following gauge invariant version of Hardy’s inequality on \({{\mathbb {R}}}^{4}\) follows.

Corollary 5.9

Let \(\phi , A \in \dot{H}^{1}({{\mathbb {R}}}^{4}) \cap L^{4}({{\mathbb {R}}}^{4})\). Then \(r^{-1} \phi \in L^{2}({{\mathbb {R}}}^{4})\) and \(\phi \!\upharpoonright _{\partial B_{r}} \in L^{2}(\partial B_{r})\) for every \(r >0\). Moreover, we have

$$\begin{aligned} \left\Vert \frac{\phi }{r}\right\Vert _{L^{2}({{\mathbb {R}}}^{4})}^{2} + \sup _{r > 0} \frac{1}{r} \Vert \phi \Vert _{L^{2}(\partial B_{r})}^{2} \le \Vert \mathbf{D}_{r} \phi \Vert _{L^{2}({{\mathbb {R}}}^{4})}^{2}. \end{aligned}$$
(5.30)

We are ready to establish Lemma 5.2.

Proof of Lemma 5.2

We first consider the case when \((A, \phi )\) is smooth. Then by local conservation of energy, we have

$$\begin{aligned} {{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}} = \frac{1}{2} \int _{\partial C_{[t_{0}, t_{1}]}} {}^{(T)} J_{L}[A, \phi ] r^{3} \, \mathrm {d}v \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} \end{aligned}$$

and hence the non-negativity and additivity are obvious. The first local Hardy’s inequality (5.5) is a consequence of (5.29) applied to the hypersurface \(\partial C_{[t_{0}, t_{1}]} = \left\{ u = 0, \, r \in [t_{0}, t_{1}]\right\} \) in the coordinate system \((u, r, \Theta )\), whereas the second local Hardy’s inequality (5.6) follows from a similar argument used to derive Corollary 5.9.

Now we turn to the general case. Since \((A, \phi )\) is an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution, there exists a sequence of smooth solutions converging to \((A, \phi )\) in \(C_{t} {{\mathcal {H}}}^{1}(I \times {{\mathbb {R}}}^{4})\). Since all quantities in the conclusions of the lemma are continuous with respect to the \(C_{t} {{\mathcal {H}}}^{1}(I \times {{\mathbb {R}}}^{4})\) topology, the general case follows from the smooth case by approximation. \(\square \)

5.4 Monotonicity formulae and proofs of Propositions 5.4, 5.5

Here we derive monotonicity formulae associated with the vector fields \(X_{\varepsilon }\), which are defined in the polar coordinates as

$$\begin{aligned} X_{\varepsilon } = \frac{1}{\rho _{\varepsilon }} ((t+\varepsilon ) \partial _{t} + r \partial _{r}), \quad \rho _{\varepsilon } = \sqrt{(t+\varepsilon )^{2} - r^{2}}, \end{aligned}$$
(5.31)

where \(\varepsilon \ge 0, t > -\varepsilon \).

The starting point for the derivation of the monotonicity formula (5.9), as well as Propositions 5.4 and 5.5, is to contract the energy-momentum tensor \({\mathcal {Q}}\) with one of the vector fields \(X_{\varepsilon }\). Due to the unfavorable contribution of \({}^{(\mathrm {KG})} {\mathcal {Q}}\), however, several additional modifications are necessary. To simplify the discussion, we first restrict to the case \(\varepsilon = 0\). The reader should keep in mind that the general case follows simply by translating in time by \(\varepsilon \).

Using the formula (5.17), we compute

$$\begin{aligned} \frac{1}{2} {}^{(X_{0})} \pi ^{\sharp } = \frac{1}{\rho ^{3}} \Big (\partial _{y} \cdot \partial _{y} + \frac{1}{\sinh ^{2} y} \left( g^{-1}_{{{\mathbb {S}}}^{3}}\right) \Big ) = \frac{1}{\rho } \left( \mathbf{m}^{-1} + X_{0} \cdot X_{0}\right) . \end{aligned}$$
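One way to verify this formula is sketched here, under the assumption that the hyperbolic coordinates are normalized by \(t = \rho \cosh y\), \(r = \rho \sinh y\), so that \(X_{0} = \partial _{\rho }\). In these coordinates the Minkowski metric and its inverse read

$$\begin{aligned} \mathbf{m}= - \mathrm {d}\rho ^{2} + \rho ^{2} \left( \mathrm {d}y^{2} + \sinh ^{2} y \, g_{{{\mathbb {S}}}^{3}}\right) , \quad \mathbf{m}^{-1} = - \partial _{\rho } \cdot \partial _{\rho } + \frac{1}{\rho ^{2}} \Big ( \partial _{y} \cdot \partial _{y} + \frac{1}{\sinh ^{2} y} g^{-1}_{{{\mathbb {S}}}^{3}} \Big ). \end{aligned}$$

Since the components of \(X_{0} = \partial _{\rho }\) are constant in these coordinates, only the first term in (5.17) contributes, and

$$\begin{aligned} \frac{1}{2} {}^{(X_{0})} \pi = \frac{1}{2} \partial _{\rho } \mathbf{m}= \rho \left( \mathrm {d}y^{2} + \sinh ^{2} y \, g_{{{\mathbb {S}}}^{3}}\right) . \end{aligned}$$

Raising both indices with \(\mathbf{m}^{-1}\) contributes a factor \(\rho ^{-4}\), which gives the first expression above; the second follows by comparing with the formula for \(\mathbf{m}^{-1}\) and using \(X_{0} \cdot X_{0} = \partial _{\rho } \cdot \partial _{\rho }\).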

Hence we have

$$\begin{aligned} {}^{(X_{0})} K= & {} {}^{(\mathrm {M})}{\mathcal {Q}}_{\mathbf{a}\mathbf{b}} \left( \frac{1}{2} {}^{(X_{0})} \pi ^{\sharp }\right) ^{\mathbf{a}\mathbf{b}} + {}^{(\mathrm {KG})}{\mathcal {Q}}_{\mathbf{a}\mathbf{b}} \left( \frac{1}{2} {}^{(X_{0})} \pi ^{\sharp }\right) ^{\mathbf{a}\mathbf{b}} \nonumber \\= & {} \frac{1}{\rho } \vert \iota _{X_{0}} F\vert ^{2} + \frac{1}{\rho } \vert \mathbf{D}_{X_{0}} \phi \vert ^{2} - \frac{1}{\rho } \mathbf{D}_{\mathbf{a}} \phi \overline{\mathbf{D}^{\mathbf{a}} \phi }. \end{aligned}$$
(5.32)

where \(\vert \iota _{X_{0}} F\vert ^{2} = \mathbf{m}(\iota _{X_{0}} F, \iota _{X_{0}} F) \ge 0\), since \(X_{0}\) is time-like. The first term in (5.32) is satisfactory in view of our goal (5.9), but the rest is not. To remove the last term, we use the currents \({}^{(w_{0})} J\) and \({}^{(w_{0})} K\) with \(w_{0} = \frac{1}{\rho }\) and compute

$$\begin{aligned} {}^{(X_{0})} K + {}^{(w_{0})} K =&\frac{1}{\rho } \vert \iota _{X_{0}} F\vert ^{2} + \frac{1}{\rho } \vert \mathbf{D}_{X_{0}} \phi \vert ^{2} - \frac{1}{\rho ^{3}} \vert \phi \vert ^{2}. \end{aligned}$$
(5.33)
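The only nontrivial input in (5.33) is the value of \(\Box w_{0}\). As a sketch, assume the hyperbolic form \(\mathbf{m}= - \mathrm {d}\rho ^{2} + \rho ^{2} (\mathrm {d}y^{2} + \sinh ^{2} y \, g_{{{\mathbb {S}}}^{3}})\) of the metric, whose volume density is proportional to \(\rho ^{4}\) in the \(\rho \) variable. Then for a function \(f = f(\rho )\) we have \(\Box f = - \rho ^{-4} \partial _{\rho } (\rho ^{4} \partial _{\rho } f)\), and hence

$$\begin{aligned} \Box \Big ( \frac{1}{\rho } \Big ) = - \frac{1}{\rho ^{4}} \partial _{\rho } \Big ( \rho ^{4} \cdot \Big ( - \frac{1}{\rho ^{2}} \Big ) \Big ) = \frac{2}{\rho ^{3}}, \quad {}^{(w_{0})} K = \frac{1}{\rho } \mathbf{D}_{\mathbf{a}} \phi \overline{\mathbf{D}^{\mathbf{a}} \phi } - \frac{1}{\rho ^{3}} \vert \phi \vert ^{2}. \end{aligned}$$

The first term of \({}^{(w_{0})} K\) cancels the last term in (5.32), which leaves exactly the right-hand side of (5.33).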

Now we introduce an auxiliary divergence identity, which is related to Hardy’s inequality in the \(\rho \) variable. Define \({}^{(\mathcal {{H}}_{0})} J[\phi ]\) in the hyperbolic coordinates \((\rho , y, \Theta )\) by

$$\begin{aligned} {}^{(\mathcal {{H}}_{0})} J_{\rho }[\phi ] := - \frac{\vert \phi \vert ^{2}}{\rho ^{2}}, \end{aligned}$$
(5.34)

where the remaining components are set to be zero. Define also

$$\begin{aligned} {}^{(\mathcal {{H}}_{0})} K[\phi ] := \frac{2}{\rho ^{3}} \vert \phi \vert ^{2} + \frac{1}{\rho ^{2}} \partial _{\rho } \vert \phi \vert ^{2}. \end{aligned}$$
(5.35)

Then a simple computation shows that

$$\begin{aligned} \nabla ^{\mathbf{a}} ({}^{(\mathcal {{H}}_{0})} J_{\mathbf{a}}[\phi ]) = {}^{(\mathcal {{H}}_{0})} K[\phi ]. \end{aligned}$$
(5.36)
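A sketch of this computation, again assuming the hyperbolic form \(\mathbf{m}= - \mathrm {d}\rho ^{2} + \rho ^{2} (\mathrm {d}y^{2} + \sinh ^{2} y \, g_{{{\mathbb {S}}}^{3}})\) of the metric: for a 1-form J whose only nonzero component is \(J_{\rho }\), we have \(J^{\rho } = - J_{\rho }\) and \(\nabla ^{\mathbf{a}} J_{\mathbf{a}} = \rho ^{-4} \partial _{\rho } (\rho ^{4} J^{\rho })\). Therefore

$$\begin{aligned} \nabla ^{\mathbf{a}} \left( {}^{(\mathcal {{H}}_{0})} J_{\mathbf{a}}[\phi ]\right) = \frac{1}{\rho ^{4}} \partial _{\rho } \Big ( \rho ^{4} \frac{\vert \phi \vert ^{2}}{\rho ^{2}} \Big ) = \frac{2}{\rho ^{3}} \vert \phi \vert ^{2} + \frac{1}{\rho ^{2}} \partial _{\rho } \vert \phi \vert ^{2}, \end{aligned}$$

which is (5.35).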

Since \(\partial _{\rho } \vert \phi \vert ^{2} = 2 \mathrm {Re}(\phi \overline{\mathbf{D}_{\rho } \phi })\) and \(X_{0} = \partial _{\rho }\), we arrive at

$$\begin{aligned} {}^{(X_{0})} K + {}^{(w_{0})} K + {}^{(\mathcal {{H}}_{0})} K = \frac{1}{\rho } \vert \iota _{X_{0}} F\vert ^{2} + \frac{1}{\rho } \vert \left( \mathbf{D}_{X_{0}} + \frac{1}{\rho }\right) \phi \vert ^{2}, \end{aligned}$$
(5.37)

which is precisely the integrand in the space-time integral in (5.9).
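In more detail, adding (5.35) to (5.33) and using \(\partial _{\rho } \vert \phi \vert ^{2} = 2 \mathrm {Re}(\phi \overline{\mathbf{D}_{X_{0}} \phi })\), the terms involving \(\phi \) combine into a perfect square:

$$\begin{aligned} \frac{1}{\rho } \vert \mathbf{D}_{X_{0}} \phi \vert ^{2} - \frac{1}{\rho ^{3}} \vert \phi \vert ^{2} + \frac{2}{\rho ^{3}} \vert \phi \vert ^{2} + \frac{2}{\rho ^{2}} \mathrm {Re}(\phi \overline{\mathbf{D}_{X_{0}} \phi })&= \frac{1}{\rho } \Big ( \vert \mathbf{D}_{X_{0}} \phi \vert ^{2} + \frac{2}{\rho } \mathrm {Re}(\phi \overline{\mathbf{D}_{X_{0}} \phi }) + \frac{1}{\rho ^{2}} \vert \phi \vert ^{2} \Big ) \\&= \frac{1}{\rho } \left| \left( \mathbf{D}_{X_{0}} + \frac{1}{\rho }\right) \phi \right| ^{2}. \end{aligned}$$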

The preceding computation suggests that we should define new 1- and 0-currents by \({}^{(X_{0})} J + {}^{(w_{0})} J + {}^{(\mathcal {{H}}_{0})} J\) and \({}^{(X_{0})} K + {}^{(w_{0})} K + {}^{(\mathcal {{H}}_{0})} K\), respectively. To make the L and \(\underline{L}\) components of the 1-current look more favorable, however, it turns out to be convenient to add in an auxiliary current \({}^{(\mathcal {{N}}_{0})} J\) defined by

$$\begin{aligned} {}^{(\mathcal {{N}}_{0})} J_{L}[\phi ]= & {} \frac{1}{2 r^{3}} L\left( r^{3} \frac{t}{\rho r} \vert \phi \vert ^{2} \right) , \nonumber \\ {}^{(\mathcal {{N}}_{0})} J_{\underline{L}}[\phi ]= & {} - \frac{1}{2 r^{3}} \underline{L}\left( r^{3} \frac{t}{\rho r} \vert \phi \vert ^{2} \right) ,\nonumber \end{aligned}$$
(5.38)

where the remaining components are set to be zero. By equality of mixed partials \(L \underline{L}= 4 \partial _{v} \partial _{u} = 4 \partial _{u} \partial _{v} = \underline{L}L\), it follows that

$$\begin{aligned} \nabla ^{\mathbf{a}} \left( {}^{(\mathcal {{N}}_{0})} J_{\mathbf{a}}[\phi ]\right) = 0. \end{aligned}$$
(5.39)
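To spell this out, here is a sketch in the null coordinates \(u = t - r\), \(v = t + r\), for which \(L = 2 \partial _{v}\), \(\underline{L}= 2 \partial _{u}\) and \(\mathbf{m}= - \mathrm {d}u \, \mathrm {d}v + r^{2} g_{{{\mathbb {S}}}^{3}}\). For a 1-form J with vanishing angular components, the divergence takes the form

$$\begin{aligned} \nabla ^{\mathbf{a}} J_{\mathbf{a}} = - \frac{1}{2 r^{3}} \left( L (r^{3} J_{\underline{L}}) + \underline{L}(r^{3} J_{L}) \right) . \end{aligned}$$

Writing \(f := r^{3} \frac{t}{\rho r} \vert \phi \vert ^{2}\) (a shorthand used only here), the definition (5.38) reads \(r^{3} \, {}^{(\mathcal {{N}}_{0})} J_{L} = \frac{1}{2} L f\) and \(r^{3} \, {}^{(\mathcal {{N}}_{0})} J_{\underline{L}} = - \frac{1}{2} \underline{L}f\), so that

$$\begin{aligned} \nabla ^{\mathbf{a}} \left( {}^{(\mathcal {{N}}_{0})} J_{\mathbf{a}}[\phi ]\right) = - \frac{1}{4 r^{3}} \left( - L \underline{L}f + \underline{L}L f \right) = 0. \end{aligned}$$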

For \({}^{(X_{0})} P := {}^{(X_{0})} J + {}^{(w_{0})} J + {}^{(\mathcal {{H}}_{0})} J + {}^{(\mathcal {{N}}_{0})} J\), we claim that

$$\begin{aligned} {}^{(X_{0})} P_{L}= & {} \frac{1}{2} \left( \frac{v}{u}\right) ^{\frac{1}{2}} \left( \vert r^{-1}\mathbf{D}_{L}(r \phi )\vert ^{2} + \vert \alpha \vert ^{2}\right) \nonumber \\&+ \frac{1}{2} \left( \frac{u}{v}\right) ^{\frac{1}{2}} \left( \vert \not \!\! \mathbf{D}\phi \vert ^{2} + \frac{\vert \phi \vert ^{2}}{r^{2}} + \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2} \right) , \end{aligned}$$
(5.40)
$$\begin{aligned} {}^{(X_{0})} P_{\underline{L}}= & {} \frac{1}{2} \left( \frac{u}{v}\right) ^{\frac{1}{2}} \left( \vert r^{-1}\mathbf{D}_{\underline{L}}(r \phi )\vert ^{2} + \vert \underline{\alpha }\vert ^{2}\right) \nonumber \\&+ \frac{1}{2} \left( \frac{v}{u}\right) ^{\frac{1}{2}} \Big ( \vert \not \!\! \mathbf{D}\phi \vert ^{2} + \frac{\vert \phi \vert ^{2}}{r^{2}} + \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2} \Big ). \end{aligned}$$
(5.41)

We will prove (5.40), leaving the task of verifying (5.41) to the reader. Using the relations

$$\begin{aligned} \rho ^{2} = uv, \quad X_{0} = \frac{1}{2} \left( \frac{v}{\rho } L + \frac{u}{\rho } \underline{L}\right) , \end{aligned}$$

and the null decomposition formulae (5.22), (5.23), we have

$$\begin{aligned} {}^{(X_{0})} J_{L}[A, \phi ]= & {} \frac{1}{2} \left( \frac{v}{\rho } \vert \mathbf{D}_{L} \phi \vert ^{2} + \frac{u}{\rho } \vert \not \!\! \mathbf{D}\phi \vert ^{2} \right) + \frac{1}{2} \left( \frac{v}{\rho } \vert \alpha \vert ^{2} + \frac{u}{\rho } \left( \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2}\right) \right) . \end{aligned}$$

On the other hand, we compute

$$\begin{aligned} {}^{(w_{0})} J_{L}[A, \phi ] = \frac{1}{\rho } \mathrm {Re}(\phi \overline{\mathbf{D}_{L} \phi }) + \frac{1}{2} \frac{1}{\rho v} \vert \phi \vert ^{2}, \quad {}^{(\mathcal {{H}}_{0})} J_{L}[\phi ] = - \frac{1}{\rho v} \vert \phi \vert ^{2}. \end{aligned}$$
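Both formulae rest on the derivative of \(\rho \) along L. As a sketch, with \(u = t - r\), \(v = t + r\) we have \(L u = 0\), \(L v = 2\) and \(\rho = \sqrt{uv}\), hence

$$\begin{aligned} L \rho = \frac{u}{\rho }, \quad L \Big ( \frac{1}{\rho } \Big ) = - \frac{u}{\rho ^{3}} = - \frac{1}{\rho v}, \end{aligned}$$

where the last equality uses \(\rho ^{2} = uv\). Consequently, the term \(- \frac{1}{2} L(w_{0}) \vert \phi \vert ^{2}\) in \({}^{(w_{0})} J_{L}\) equals \(\frac{1}{2} \frac{1}{\rho v} \vert \phi \vert ^{2}\), while \({}^{(\mathcal {{H}}_{0})} J_{L} = {}^{(\mathcal {{H}}_{0})} J_{\rho } \, L \rho = - \frac{\vert \phi \vert ^{2}}{\rho ^{2}} \frac{u}{\rho } = - \frac{1}{\rho v} \vert \phi \vert ^{2}\).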

To prove (5.40), it suffices to verify

$$\begin{aligned}&\frac{1}{2} \frac{v}{\rho } \vert \mathbf{D}_{L} \phi \vert ^{2} + {}^{(w_{0})} J_{L}[A, \phi ] + {}^{(\mathcal {{H}}_{0})} J_{L}[\phi ] + {}^{(\mathcal {{N}}_{0})} J_{L}[\phi ]\nonumber \\&\quad = \frac{1}{2} \frac{v}{\rho } \vert r^{-1} \mathbf{D}_{L} (r \phi )\vert ^{2} + \frac{1}{2} \frac{u}{\rho } \frac{\vert \phi \vert ^{2}}{r^{2}}. \end{aligned}$$
(5.42)

For this purpose, it is convenient to work with \(\psi = r \phi \). We have

$$\begin{aligned} \hbox {LHS of }(5.42)= & {} \frac{1}{2} \frac{v}{\rho } \vert \mathbf{D}_{L}(\psi /r)\vert ^{2} + \frac{1}{\rho r} \mathrm {Re}(\psi \overline{\mathbf{D}_{L}(\psi /r)}) \\&+ \frac{1}{2} \frac{1}{\rho v} \frac{\vert \psi \vert ^{2}}{r^{2}} - \frac{1}{\rho v} \frac{\vert \psi \vert ^{2}}{r^{2}} + \frac{1}{2 r^{3}} L\left( \frac{t}{\rho } \vert \psi \vert ^{2}\right) \\= & {} \frac{1}{2} \frac{v}{\rho } \vert r^{-1} \mathbf{D}_{L} \psi \vert ^{2} + \frac{1}{2} \left( \frac{v}{\rho r^{2}} - \frac{2}{\rho r} - \frac{1}{\rho v} + \frac{1}{r} L(t/\rho ) \right) \frac{\vert \psi \vert ^{2}}{r^{2}}. \end{aligned}$$
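The second equality can be checked as follows (a sketch, using \(L r = 1\) and \(v = t + r\)). Since \(\mathbf{D}_{L}(\psi /r) = r^{-1} \mathbf{D}_{L} \psi - r^{-2} \psi \), the terms containing \(\mathrm {Re}(\psi \overline{\mathbf{D}_{L} \psi })\) arise from the first, second and last terms, and their coefficients add up to

$$\begin{aligned} - \frac{v}{\rho r^{3}} + \frac{1}{\rho r^{2}} + \frac{t}{\rho r^{3}} = \frac{t + r - v}{\rho r^{3}} = 0. \end{aligned}$$

The remaining terms are \(\frac{1}{2} \frac{v}{\rho } \vert r^{-1} \mathbf{D}_{L} \psi \vert ^{2}\) and the multiples of \(\vert \psi \vert ^{2}\), whose coefficients are

$$\begin{aligned} \frac{1}{2} \frac{v}{\rho r^{4}} - \frac{1}{\rho r^{3}} - \frac{1}{2} \frac{1}{\rho v r^{2}} + \frac{1}{2 r^{3}} L(t/\rho ) = \frac{1}{2 r^{2}} \left( \frac{v}{\rho r^{2}} - \frac{2}{\rho r} - \frac{1}{\rho v} + \frac{1}{r} L(t/\rho ) \right) . \end{aligned}$$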

Since \(r^{-1} L(t/\rho ) = 1/(\rho r) - t/(\rho r v) = 1/(\rho v)\), we see that

$$\begin{aligned} \frac{v}{\rho r^{2}} - \frac{2}{\rho r} - \frac{1}{\rho v} + \frac{1}{r} L(t/\rho ) = \frac{v}{\rho r^{2}} - \frac{2}{\rho r} = \frac{u}{\rho r^{2}}, \end{aligned}$$

which establishes (5.42), and hence (5.40).

We now return to the general case \(\varepsilon \ge 0\). Define \({}^{(X_{\varepsilon })} J, {}^{(w_{\varepsilon })} J, {}^{(\mathcal {{H}}_{\varepsilon })} J, {}^{(\mathcal {{N}}_{\varepsilon })} J\) and their 0-current counterparts by pulling back the \(\varepsilon = 0\) versions defined above along the map \((t, r, \Theta ) \mapsto (t+\varepsilon , r, \Theta )\). For \({}^{(X_{\varepsilon })} J, {}^{(w_{\varepsilon })} J, {}^{(X_{\varepsilon })} K\) and \({}^{(w_{\varepsilon })} K\), note that this definition agrees with that from Sect. 5.1 using \(X_{\varepsilon }\) as in (5.31) and \(w_{\varepsilon } := 1/\rho _{\varepsilon }\). Let

$$\begin{aligned} {}^{(X_{\varepsilon })} P[A, \phi ]:= & {} {}^{(X_{\varepsilon })} J[A, \phi ] + {}^{(w_{\varepsilon })} J[A, \phi ] + {}^{(\mathcal {{H}}_{\varepsilon })} J[\phi ] + {}^{(\mathcal {{N}}_{\varepsilon })} J[\phi ], \nonumber \\ {}^{(X_{\varepsilon })} Q[A, \phi ]:= & {} {}^{(X_{\varepsilon })} K[A, \phi ] + {}^{(w_{\varepsilon })} K[A, \phi ] + {}^{(\mathcal {{H}}_{\varepsilon })} K[\phi ]. \end{aligned}$$
(5.43)

We summarize the discussion so far in the following lemma, which follows easily by pulling back the above computations along \((t, r, \Theta ) \mapsto (t+\varepsilon , r, \Theta )\).

Lemma 5.10

Let \((A ,\phi )\) be a smooth solution to (MKG) on an open subset \({{\mathcal {O}}}\subseteq C_{(0, \infty )}\). The 1- and 0-currents \({}^{(X_{\varepsilon })} P[A, \phi ]\) and \({}^{(X_{\varepsilon })} Q\) obey the divergence identity

$$\begin{aligned} \nabla ^{\mathbf{a}} \left( {}^{(X_{\varepsilon })} P_{\mathbf{a}}[A, \phi ]\right) = {}^{(X_{\varepsilon })} Q[A, \phi ], \end{aligned}$$
(5.44)

where \({}^{(X_{\varepsilon })} Q = {}^{(X_{\varepsilon })} Q[A, \phi ]\) takes the form

$$\begin{aligned} {}^{(X_{\varepsilon })} Q = \frac{1}{\rho _{\varepsilon }} \vert \iota _{X_{\varepsilon }} F\vert ^{2} + \frac{1}{\rho _{\varepsilon }} \vert \left( \mathbf{D}_{X_{\varepsilon }} + \frac{1}{\rho _{\varepsilon }}\right) \phi \vert ^{2}. \end{aligned}$$
(5.45)

Here, \(\vert \iota _{X_{\varepsilon }} F\vert ^{2} = \mathbf{m}(\iota _{X_{\varepsilon }} F, \iota _{X_{\varepsilon }} F) \ge 0\). Moreover, the L and \(\underline{L}\) components of \({}^{(X_{\varepsilon })} P = {}^{(X_{\varepsilon })} P[A, \phi ]\) take the form

$$\begin{aligned} {}^{(X_{\varepsilon })} P_{L}= & {} \frac{1}{2} \left( \frac{v_{\varepsilon }}{u_{\varepsilon }}\right) ^{\frac{1}{2}} \left( \vert r^{-1}\mathbf{D}_{L}(r \phi )\vert ^{2} + \vert \alpha \vert ^{2}\right) \nonumber \\&+ \frac{1}{2} \Big (\frac{u_{\varepsilon }}{v_{\varepsilon }}\Big )^{\frac{1}{2}} \left( \vert \not \!\! \mathbf{D}\phi \vert ^{2} + \frac{\vert \phi \vert ^{2}}{r^{2}} + \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2} \right) , \end{aligned}$$
(5.46)
$$\begin{aligned} {}^{(X_{\varepsilon })} P_{\underline{L}}= & {} \frac{1}{2} \left( \frac{u_{\varepsilon }}{v_{\varepsilon }}\right) ^{\frac{1}{2}} \left( \vert r^{-1}\mathbf{D}_{\underline{L}}(r \phi )\vert ^{2} + \vert \underline{\alpha }\vert ^{2}\right) \nonumber \\&+ \frac{1}{2} \left( \frac{v_{\varepsilon }}{u_{\varepsilon }}\right) ^{\frac{1}{2}} \left( \vert \not \!\! \mathbf{D}\phi \vert ^{2} + \frac{\vert \phi \vert ^{2}}{r^{2}} + \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2} \right) , \end{aligned}$$
(5.47)

where \(v_{\varepsilon } := (t+\varepsilon ) + r\) and \(u_{\varepsilon } := (t+\varepsilon ) - r\).

Here we give a quick proof of (5.9) for a smooth solution \((A, \phi )\) on \({{\mathbb {R}}}^{1+4}\). By \({{\mathcal {F}}}_{\partial C_{[t_{0}, t_{1}]}} = 0, {{\mathcal {G}}}_{\partial S_{t_{1}}} = 0\) and Lemma 5.2, note that \(\phi \) and the tangential components of F (i.e., \(\alpha , \varrho , \sigma \)) vanish on the boundary \(\partial C_{[t_{0}, t_{1}]}\). Integrate (5.44) with \(\varepsilon = 0\) over \(C_{[t_{0}, t_{1}]}\) and apply the divergence theorem. The boundary term on \(\partial C_{[t_{0}, t_{1}]}\) vanishes thanks to \(\phi , \alpha , \varrho , \sigma = 0\), and thus (5.9) follows.

In the preceding proof, however, note from (5.40) that there is a weight \((\frac{v}{u})^{1/2}\) in the boundary term, which would blow up if \(\mathbf{D}_{L}(r\phi )\) and \(\alpha \) were not exactly zero on \(\partial C_{[t_{0}, t_{1}]}\). We now turn to the proof of Proposition 5.4, whose goal is exactly to deal with this issue.

Proof of Proposition 5.4

As the hypothesis (5.10) and the conclusion (5.11) only involve quantities which are continuous with respect to the \(C_{t} {{\mathcal {H}}}^{1}(I \times {{\mathbb {R}}}^{4})\) topology, it suffices to consider the case when \((A, \phi )\) is smooth. Integrating (5.44) with \(\varepsilon > 0\) over \(C_{[\varepsilon , 1]}\) and integrating by parts, we obtain

$$\begin{aligned}&\int _{S_{1}} {}^{(X_{\varepsilon })} P_{T}[A, \phi ] \, \mathrm {d}x + \iint _{C_{[\varepsilon , 1]}} \frac{1}{\rho _{\varepsilon }} \vert \iota _{X_{\varepsilon }} F\vert ^{2} + \frac{1}{\rho _{\varepsilon }} \left| \left( \mathbf{D}_{X_{\varepsilon }} + \frac{1}{\rho _{\varepsilon }}\right) \phi \right| ^{2} \, \mathrm {d}t \mathrm {d}x \nonumber \\&\quad = \int _{S_{\varepsilon }} {}^{(X_{\varepsilon })} P_{T}[A, \phi ] \, \mathrm {d}x + \frac{1}{2} \int _{\partial C_{[\varepsilon , 1]}} {}^{(X_{\varepsilon })} P_{L} [A, \phi ] r^{3} \, \mathrm {d}v \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}}. \end{aligned}$$
(5.48)

We claim that the right-hand side of (5.48) is bounded by \(\lesssim E\). We begin with the first term. On \(S_{\varepsilon }\), we have the pointwise bound

$$\begin{aligned} {}^{(X_{\varepsilon })} P_{T}[A, \phi ] \lesssim {}^{(T)} P_{T}[A, \phi ] + \frac{1}{r^{2}} \vert \phi \vert ^{2}, \end{aligned}$$

since \(u_{\varepsilon } / v_{\varepsilon } \simeq 1\) and \(v_{\varepsilon } / u_{\varepsilon } \simeq 1\) on \(S_{\varepsilon }\). By (5.10), Lemma 5.2 and (5.29) applied to \(\phi \) on \(S_{\varepsilon }\) with \(r_{1} = 0, r_{2} = \varepsilon \), it follows that the first term on the right-hand side of (5.48) is bounded by \(\lesssim E\).
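The comparability of the weights can be made explicit by a short check. It assumes, as the application of (5.29) with \(r_{1} = 0, r_{2} = \varepsilon \) suggests, that \(S_{\varepsilon }\) is the slice \(\left\{ t = \varepsilon , \ 0 \le r \le \varepsilon \right\} \):

$$\begin{aligned} u_{\varepsilon } = 2 \varepsilon - r \in [\varepsilon , 2 \varepsilon ], \quad v_{\varepsilon } = 2 \varepsilon + r \in [2 \varepsilon , 3 \varepsilon ], \quad \hbox { so that } \quad \frac{1}{3} \le \frac{u_{\varepsilon }}{v_{\varepsilon }} \le 1 \quad \hbox { on } S_{\varepsilon }. \end{aligned}$$

Under this assumption, the weights \((u_{\varepsilon }/v_{\varepsilon })^{1/2}\) and \((v_{\varepsilon }/u_{\varepsilon })^{1/2}\) appearing in (5.46) and (5.47) lie between \(3^{-1/2}\) and \(3^{1/2}\) on \(S_{\varepsilon }\), uniformly in \(\varepsilon \).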

We now consider the last term in (5.48). On \(\partial C_{[\varepsilon , 1]}\), we have

$$\begin{aligned} {}^{(X_{\varepsilon })} P_{L}[A, \phi ] \lesssim \varepsilon ^{-\frac{1}{2}} \Big ( \vert \mathbf{D}_{L}\phi \vert ^{2} + \frac{1}{r^{2}}\vert \phi \vert ^{2} + \vert \alpha \vert ^{2} \Big ) + {}^{(T)} J_{L}[A, \phi ]. \end{aligned}$$

Then by (5.10), Lemma 5.2 and the fact that \(t = r\) on \(\partial C\), the last term in (5.48) is bounded by \(\lesssim E\) as desired.

We end this section with a proof of Proposition 5.5.

Proof of Proposition 5.5

As before, by approximation, it suffices to consider the case when \((A, \phi )\) is smooth. Let \(\delta \in [\delta _{0}, \delta _{1}]\) be a number to be determined below. Integrating (5.44) with \(\varepsilon = 0\) over \(C^{\delta }_{[t_{0}, 1]}\) and using the divergence theorem, we see that (5.12) would follow if there exists \(\delta \in [\delta _{0}, \delta _{1}]\) such that

$$\begin{aligned} \int _{\partial C^{\delta }_{[t_{0}, 1]}} {}^{(X_{0})} P_{L}[A, \phi ] \, r^{3} \, \mathrm {d}v \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} \lesssim \Big ( (\delta _{1}/t_{0})^{\frac{1}{2}} + \vert \log (\delta _{1}/\delta _{0})\vert ^{-1} \Big ) E.\quad \quad \quad \end{aligned}$$
(5.49)

The contribution of the term with the weight \((u_{0} / v_{0})^{1/2}\) in (5.46) is easy to treat; indeed, using localized Hardy’s inequality and local conservation of energy, we have

$$\begin{aligned}&\int _{\partial C^{\delta }_{[t_{0}, 1]}} \frac{1}{2} \left( \frac{u}{v}\right) ^{\frac{1}{2}} \left( \vert \not \!\! \mathbf{D}\phi \vert ^{2} + \frac{\vert \phi \vert ^{2}}{r^{2}} + \vert \varrho \vert ^{2} + \vert \sigma \vert ^{2} \right) \, r^{3} \, \mathrm {d}v \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} \\&\quad \lesssim \left( \frac{\delta _{1}}{t_{0}} \right) ^{1/2}\left( \int _{\partial C^{\delta }_{[t_{0}, 1]}} {}^{(T)} J_{L}[A, \phi ] \, r^{3} \, \mathrm {d}v \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} \right. \\&\left. \qquad + \,{{\mathcal {E}}}_{S_{1} {\setminus } S_{1}^{\delta }}[A, \phi ] + {{\mathcal {G}}}_{S_{1}}[\phi ] \right) \lesssim \left( \frac{\delta _{1}}{t_{0}} \right) ^{1/2} E. \end{aligned}$$

It remains to treat the term with the weight \((v_{0} / u_{0})^{1/2}\) in (5.46). Note that

$$\begin{aligned} r^{-1} \mathbf{D}_{L}(r \phi )= & {} \left( \mathbf{D}_{L} + \frac{1}{r}\right) \phi = 2 \left( \frac{u_{\varepsilon }}{v_{\varepsilon }} \right) ^{\frac{1}{2}} \left( \mathbf{D}_{X_{\varepsilon }} + \frac{1}{\rho _{\varepsilon }}\right) \phi \\&\quad - \,\left( \frac{u_{\varepsilon }}{v_{\varepsilon }} \right) \mathbf{D}_{\underline{L}} \phi + \left( \frac{u_{\varepsilon }}{v_{\varepsilon }} \right) \frac{1}{r} \phi ,\\ \alpha _{\mathfrak {a}}= & {} F(L, e_{\mathfrak {a}}) = 2 \left( \frac{u_{\varepsilon }}{v_{\varepsilon }} \right) ^{\frac{1}{2}} F(X_{\varepsilon }, e_{\mathfrak {a}}) - \left( \frac{u_{\varepsilon }}{v_{\varepsilon }} \right) F(\underline{L}, e_{\mathfrak {a}}). \end{aligned}$$

Note that \(u \le u_{\varepsilon }\) and \(v \le v_{\varepsilon }\). Furthermore \(u_{\varepsilon } \le 2 u\) on \(\partial C^{\delta }_{[t_{0}, 1]}\) since \(2 \varepsilon \le \delta _{0}\). Hence,

$$\begin{aligned}&\int _{\partial C^{\delta }_{[t_{0}, 1]}} \frac{1}{2} \left( \frac{v}{u}\right) ^{\frac{1}{2}} \left( \vert r^{-1} \mathbf{D}_{L} (r \phi )\vert ^{2} + \vert \alpha \vert ^{2} \right) \, r^{3} \, \mathrm {d}v \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}} \nonumber \\&\quad \lesssim \int _{\partial C^{\delta }_{[t_{0}, 1]}} \frac{u}{\rho _{\varepsilon }} \left( \left| \left( \mathbf{D}_{X_{\varepsilon }} + \frac{1}{\rho _{\varepsilon }}\right) \phi \right| ^{2} + \vert \iota _{X_{\varepsilon }} F\vert ^{2} \right) \nonumber \\&\qquad + \, \frac{u^{\frac{3}{2}}}{v^{\frac{3}{2}}} \left( \vert \mathbf{D}_{\underline{L}} \phi \vert ^{2} + \frac{1}{r^{2}} \vert \phi \vert ^{2} + \vert \underline{\alpha }\vert ^{2} \right) \, r^{3} \, \mathrm {d}v \mathrm {d}\sigma _{{{\mathbb {S}}}^{3}}. \end{aligned}$$
(5.50)

We claim that the integral of the right-hand side over \(\delta _{0} \le u \le \delta _{1}\) with respect to \(u^{-1} \mathrm {d}u\) is bounded by E. Then by the pigeonhole principle, there would exist \(\delta \in [\delta _{0}, \delta _{1}]\) such that the left-hand side of (5.50) is bounded by \(\lesssim \vert \log (\delta _{1} / \delta _{0})\vert ^{-1} E\), as desired.
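The pigeonhole step can be spelled out as follows. Here \(G(\delta )\) is a notation introduced only for this remark, denoting the right-hand side of (5.50) viewed as a function of \(\delta \), and we assume that the claimed bound reads \(\int _{\delta _{0}}^{\delta _{1}} G(\delta ) \, \delta ^{-1} \mathrm {d}\delta \le C E\) for some constant C:

$$\begin{aligned} \inf _{\delta \in [\delta _{0}, \delta _{1}]} G(\delta ) \le \left( \int _{\delta _{0}}^{\delta _{1}} \frac{\mathrm {d}\delta }{\delta } \right) ^{-1} \int _{\delta _{0}}^{\delta _{1}} G(\delta ) \, \frac{\mathrm {d}\delta }{\delta } \le \frac{C E}{\log (\delta _{1}/\delta _{0})}. \end{aligned}$$

Indeed, if \(G(\delta )\) exceeded the weighted average for every \(\delta \), then integrating against \(\delta ^{-1} \mathrm {d}\delta \) would give a contradiction, so some \(\delta \in [\delta _{0}, \delta _{1}]\) attains the bound.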

For the contribution of the first term, the claim follows directly from Proposition 5.4. For the second term, we have

$$\begin{aligned}&\iint _{C^{\delta _{0}}_{[t_{0}, 1]} {\setminus } C^{\delta _{1}}_{[t_{0}, 1]}} \frac{u^{\frac{1}{2}}}{v^{\frac{3}{2}}} \Big ( \vert \mathbf{D}_{\underline{L}} \phi \vert ^{2} + \frac{1}{r^{2}} \vert \phi \vert ^{2} + \vert \underline{\alpha }\vert ^{2} \Big ) \, \mathrm {d}t \mathrm {d}x \\&\quad \lesssim \iint _{C^{\delta _{0}}_{[t_{0}, 1]} {\setminus } C^{\delta _{1}}_{[t_{0}, 1]}}\frac{\delta _{1}^{\frac{1}{2}}}{t^{\frac{3}{2}}} {}^{(T)} J_{T}[A, \phi ] \, \mathrm {d}t \mathrm {d}x \lesssim \Big (\frac{\delta _{1}}{t_{0}} \Big )^{\frac{1}{2}} E, \end{aligned}$$

which is sufficient to prove the claim since \(\delta _{1} \le t_{0}\). \(\square \)

6 Local strong compactness and weak solutions to (MKG)

The first goal of this section is to establish the following local strong compactness result for asymptotically stationary (see (6.2) below) sequences of solutions to (MKG) with small energy.

Proposition 6.1

There exists a universal constant \(\epsilon _{0} > 0\) such that the following holds. Let \(B = B_{1}(x_{0}) \subseteq {{\mathbb {R}}}^{4}\) be an open ball of unit radius centered at \(x_{0}\), and let \((A^{(n)}, \phi ^{(n)})\) be a sequence of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions to (MKG) in \((-2, 2) \times 8B\) such that

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ 0\right\} \times 8B}\left[ A^{(n)}, \phi ^{(n)}\right] + \Vert \phi ^{(n)}(0, x)\Vert _{L^{2}_{x}(8 B)}^{2} \le \epsilon _{0}^{2}. \end{aligned}$$
(6.1)

Suppose furthermore that \((A^{(n)}, \phi ^{(n)})\) is asymptotically stationary in the sense that

$$\begin{aligned} \iint _{(-2, 2) \times 2 B} \vert \iota _{X} F^{(n)}\vert ^{2} + \vert (\mathbf{D}^{(n)}_{X} + b) \phi ^{(n)}\vert ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty ,\quad \quad \end{aligned}$$
(6.2)

where X is a smooth time-like vector field and b is a smooth real-valued function. Then there exists a pair \((A, \phi )\) in \(L^{2}_{t,x}((- 1, 1) \times B)\) such that the following statements hold:

  1. (1)

    There exists a sequence of gauge transforms \(\chi ^{(n)} \in C_{t} {{\mathcal {G}}}^{2}((-1, 1) \times B)\) such that, after passing to a subsequence, we have

    $$\begin{aligned}&\left( A_{\mu }^{(n)} - \partial _{\mu } \chi ^{(n)}, e^{i \chi ^{(n)}} \phi ^{(n)} \right) \rightarrow (A_{\mu }, \phi ) \quad \hbox { strongly in } L^{2}_{t,x}((-1, 1) \times B), \end{aligned}$$
    (6.3)
    $$\begin{aligned}&\left( F_{\mu \nu }^{(n)}, e^{i \chi ^{(n)}} \mathbf{D}_{\mu }^{(n)} \phi ^{(n)}\right) \rightarrow (F_{\mu \nu }, \mathbf{D}_{\mu } \phi ) \quad \hbox { strongly in } L^{2}_{t,x}((-1, 1) \times B), \end{aligned}$$
    (6.4)

    where \(F_{\mu \nu } = \partial _{\mu } A_{\nu } - \partial _{\nu } A_{\mu }\) and \(\mathbf{D}_{\mu } \phi = \partial _{\mu } \phi + i A_{\mu } \phi \) are defined in the sense of distributions.

  2. (2)

    The limiting pair \((A, \phi )\) is a weak solution to (MKG) on \((-1, 1) \times B\), in the sense of Definition 6.6 below. The connection 1-form A obeys, in the sense of distributions, the Coulomb gauge condition

    $$\begin{aligned} \partial ^{\ell } A_{\ell } = 0 \quad \hbox { on } (-1, 1) \times B. \end{aligned}$$
    (6.5)
  3. (3)

    The pair \((A ,\phi )\) possesses the following additional regularity:

    $$\begin{aligned}&A \in H^{1}_{t,x}((-1, 1) \times B), \quad F_{\mu \nu } \in H^{\frac{1}{2}}_{t,x}((-1, 1) \times B), \nonumber \\&\phi \in H^{\frac{3}{2}}_{t,x}((- 1, 1) \times B). \end{aligned}$$
    (6.6)
  4. (4)

    Moreover, the pair \((A, \phi )\) is stationary with respect to X, in the sense that

    $$\begin{aligned} \iota _{X} F = 0, \quad (\mathbf{D}_{X} + b) \phi = 0 \quad \hbox { on } (-1, 1) \times B. \end{aligned}$$
    (6.7)

As a result of taking limits, the notion of weak solutions to (MKG) arises naturally from Proposition 6.1. For our application in Sect. 8, we also need to formulate the notion of locally defined weak solutions \((A_{[\alpha ]}, \phi _{[\alpha ]})\) that can be pieced together to form a global pair (weak compatible pairs). Developing a theory of these objects is another goal of this section.

Remark 6.2

We remark that weak solutions and their gauge structure play only an auxiliary role in our work. Indeed, the stationarity equation (6.7), combined with (MKG) and the additional regularity (6.6) of \((A, \phi )\), allows us to infer smoothness of \((A, \phi )\) via elliptic regularity. This issue is considered in Sect. 7, where we study stationary and self-similar solutions to (MKG).

Remark 6.3

It is in fact possible to obtain stronger convergence than (6.3), namely \(A_{\mu }^{(n)} - \partial _{\mu } \chi ^{(n)} \rightarrow A_{\mu }\) and \(e^{i \chi ^{(n)}} \phi ^{(n)} \rightarrow \phi \) in \(H^{1}_{t,x}((-1, 1) \times B)\). Moreover, the limit \(A_{\mu }\) obeys the additional regularity \(H^{3/2-\varepsilon }_{t,x}((-1, 1) \times B)\) for any \(\varepsilon > 0\). As these facts are not necessary for the proof of our main theorem, we omit their proofs to avoid lengthening the paper.

The rest of this section is structured as follows. We first give a proof of Proposition 6.1 in Sect. 6.1, except for the statement that the limit \((A, \phi )\) is a weak solution to (MKG). In Sect. 6.2, we formulate a notion of weak solutions to (MKG) that will be used in our proof. Finally, in Sect. 6.3, we introduce and discuss the notions of smooth and weak compatible pairs, which are local descriptions of smooth and weak solutions to (MKG), respectively.

6.1 Proof of Proposition 6.1

Here we prove Proposition 6.1 modulo the assertion that the limit \((A, \phi )\) is a weak solution to (MKG), which would be clear once we define the notion of a weak solution in Definition 6.6 below.

Proof

The basic idea behind the proof is as in [33, Proposition 5.1]: Small energy (6.1) implies a local uniform \(S^{1}\) bound on \((-2, 2) \times 2B\), which can be combined with asymptotic stationarity (6.2) via a microlocal decomposition to conclude strong convergence in \((-1, 1) \times B\). In implementing this strategy, we need to take into account the presence of the constraint equation and the system nature of (MKG) (especially the Maxwell part). Our proof proceeds in several steps.

Step 1 In this step, we use the excision and gluing technique to produce gauge equivalent Coulomb solutions on the smaller region \((-2, 2) \times 2B\), which enjoy a uniform \(S^{1}\) bound.

Let \((a_{j}^{(n)}, e_{j}^{(n)}, f^{(n)}, g^{(n)}) = (A_{j}^{(n)}, F_{0j}^{(n)}, \phi ^{(n)}, \mathbf{D}_{t}^{(n)} \phi ^{(n)}) \!\upharpoonright _{\left\{ t=0\right\} }\) be the data for \((A^{(n)}, \phi ^{(n)})\) on \(\left\{ t = 0\right\} \). Applying Proposition 4.4 to \(8 B \setminus 4\overline{B}\), we obtain an initial data set \((\widetilde{a}^{(n)}, \widetilde{e}^{(n)}, \widetilde{f}^{(n)}, \widetilde{g}^{(n)}) \in {{\mathcal {H}}}^{1}({{\mathbb {R}}}^{4})\) such that \((\widetilde{a}^{(n)}, \widetilde{e}^{(n)}, \widetilde{f}^{(n)}, \widetilde{g}^{(n)}) = (a^{(n)}, e^{(n)}, f^{(n)}, g^{(n)})\) on \(4B\) and

$$\begin{aligned} {{\mathcal {E}}}\left[ \widetilde{a}^{(n)}, \widetilde{e}^{(n)}, \widetilde{f}^{(n)}, \widetilde{g}^{(n)}\right] \lesssim \epsilon _{0}^{2} \end{aligned}$$

by (4.6) and (6.1). Choosing \(\epsilon _{0}\) appropriately, we may ensure that the left-hand side is smaller than \(\epsilon _{*}^{2}\), which is the threshold for Theorem 4.1.

To pass to the global Coulomb gauge, consider the gauge transformation \(\underline{\chi }^{(n)} \in {{\mathcal {G}}}^{2}({{\mathbb {R}}}^{4})\) defined by \(\underline{\chi }^{(n)} = \triangle ^{-1} \partial ^{\ell } \widetilde{a}_{\ell }^{(n)}\) and let

$$\begin{aligned} \left( \check{a}^{(n)}, \check{e}^{(n)}, \check{f}^{(n)}, \check{g}^{(n)}\right) := \left( \widetilde{a}^{(n)} - \mathrm {d}\underline{\chi }^{(n)}, \widetilde{e}^{(n)}, e^{i \underline{\chi }^{(n)}} \widetilde{f}^{(n)}, e^{i \underline{\chi }^{(n)}} \widetilde{g}^{(n)}\right) . \end{aligned}$$

This initial data set agrees with \((a^{(n)}, e^{(n)}, f^{(n)}, g^{(n)})\) on 4 B up to a gauge transformation, i.e.,

$$\begin{aligned} \left( \check{a}^{(n)}, \check{e}^{(n)}, \check{f}^{(n)}, \check{g}^{(n)}\right) = \left( a^{(n)} - \mathrm {d}\underline{\chi }^{(n)}, e^{(n)}, e^{i \underline{\chi }^{(n)}} f^{(n)}, e^{i \underline{\chi }^{(n)}} g^{(n)}\right) \quad \hbox { on } 4B,\nonumber \\ \end{aligned}$$
(6.8)

and furthermore obeys the small energy condition

$$\begin{aligned} {{\mathcal {E}}}\left[ \check{a}^{(n)}, \check{e}^{(n)}, \check{f}^{(n)}, \check{g}^{(n)}\right] < \epsilon _{*}^{2}. \end{aligned}$$
(6.9)
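As a quick check of the two properties just stated (a routine computation, recorded here only for convenience), the choice \(\underline{\chi }^{(n)} = \triangle ^{-1} \partial ^{\ell } \widetilde{a}^{(n)}_{\ell }\) puts \(\check{a}^{(n)}\) in the Coulomb gauge, and the gauge transformation intertwines the covariant derivatives while leaving the curvature unchanged:

$$\begin{aligned} \partial ^{\ell } \check{a}^{(n)}_{\ell }&= \partial ^{\ell } \widetilde{a}^{(n)}_{\ell } - \triangle \underline{\chi }^{(n)} = \partial ^{\ell } \widetilde{a}^{(n)}_{\ell } - \triangle \triangle ^{-1} \partial ^{\ell } \widetilde{a}^{(n)}_{\ell } = 0, \\ \left( \mathrm {d}+ i \check{a}^{(n)} \right) \left( e^{i \underline{\chi }^{(n)}} \widetilde{f}^{(n)} \right)&= e^{i \underline{\chi }^{(n)}} \left( \mathrm {d}+ i \widetilde{a}^{(n)} \right) \widetilde{f}^{(n)}, \qquad \mathrm {d}\check{a}^{(n)} = \mathrm {d}\widetilde{a}^{(n)}. \end{aligned}$$

Assuming, as is standard, that the energy \({{\mathcal {E}}}\) is built only from the curvature, the electric field and the moduli of \(\check{g}^{(n)}\) and of the covariant derivatives of \(\check{f}^{(n)}\), these identities show that it is unchanged by the gauge transformation, which is how (6.9) is inherited from the bound on the tilded data.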

By small energy global well-posedness (Theorem 4.1), it follows that there exists a unique \(C_{t} {{\mathcal {H}}}^{1}\) admissible solution \((\check{A}^{(n)}, \check{\phi }^{(n)})\) on \({{\mathbb {R}}}^{1+4}\) with initial data \((\check{a}^{(n)}, \check{e}^{(n)}, \check{f}^{(n)}, \check{g}^{(n)})\), which obeys

$$\begin{aligned} \Vert \check{A}^{(n)}_{0}\Vert _{Y^{1}({{\mathbb {R}}}^{1+4})} + \Vert \check{A}^{(n)}_{x}\Vert _{S^{1}({{\mathbb {R}}}^{1+4})} + \Vert \check{\phi }^{(n)}\Vert _{S^{1}({{\mathbb {R}}}^{1+4})} \lesssim \epsilon _{*}. \end{aligned}$$
(6.10)

Moreover, by geometric uniqueness (Proposition 4.6) and the simple fact that

$$\begin{aligned} (-2, 2) \times 2 B \subseteq {{\mathcal {D}}}^{+}(\left\{ 0\right\} \times 4B) \cup {{\mathcal {D}}}^{-}(\left\{ 0\right\} \times 4B), \end{aligned}$$

there exists \(\chi ^{(n)} \in C_{t} {{\mathcal {G}}}^{2}((-2, 2) \times 2B)\) such that

$$\begin{aligned} \left( \check{A}^{(n)}, \check{\phi }^{(n)}\right) = \left( A^{(n)} - \mathrm {d}\chi ^{(n)}, e^{i \chi ^{(n)}} \phi ^{(n)}\right) \quad \hbox { on } (-2, 2) \times 2B. \end{aligned}$$
(6.11)
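The inclusion used above can be verified directly, assuming the standard description of the domain of dependence of a ball in Minkowski space (this description is our assumption about the notation \({{\mathcal {D}}}^{\pm }\)):

$$\begin{aligned} {{\mathcal {D}}}^{\pm }(\left\{ 0\right\} \times 4B)&= \left\{ (t, x) : \pm t \ge 0, \ \vert x - x_{0}\vert < 4 - \vert t\vert \right\} , \\ \vert t\vert< 2, \ \vert x - x_{0}\vert< 2&\ \Longrightarrow \ \vert x - x_{0}\vert< 2 < 4 - \vert t\vert . \end{aligned}$$

Hence every point of \((-2, 2) \times 2B\) lies in the future or past domain of dependence of \(\left\{ 0\right\} \times 4B\), where the two data sets agree up to gauge by (6.8).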

Let \(\eta _{0}, \ldots , \eta _{3} \in C^{\infty }_{0}({{\mathbb {R}}}^{1+4})\) be such that

$$\begin{aligned} \eta _{j} = 1 \hbox { on } (-1, 1) \times B, \quad {\mathrm {supp}}\, \eta _{j} \subseteq (-2, 2) \times 2B, \quad \eta _{j} \eta _{j+1} = \eta _{j} \end{aligned}$$

for \(j=0, 1, 2, 3\) (except for the last property, for which \(j = 0, 1, 2\)), which will be fixed for the rest of the proof. We will also often write \(\eta = \eta _{0}\) and \(\widetilde{\eta } = \eta _{3}\). By (6.10) and Remark 4.2, the solution \((\check{A}^{(n)}, \check{\phi }^{(n)})\) satisfies

$$\begin{aligned}&\Vert \partial _{t,x} \left( \eta _{j} \check{A}^{(n)}\right) \Vert _{L^{\infty }_{t} L^{2}_{x}} + \Vert \partial _{t,x} \left( \eta _{j} \check{\phi }^{(n)}\right) \Vert _{L^{\infty }_{t} L^{2}_{x}} \lesssim _{\eta _{j}} \epsilon _{0}, \end{aligned}$$
(6.12)
$$\begin{aligned}&\Vert \partial _{t,x} \left( \eta _{j} \check{A}^{(n)}_{0}\right) \Vert _{L^{2}_{t} \dot{H}^{\frac{1}{2}}_{x}} + \Vert \Box \left( \eta _{j} \check{A}^{(n)}_{x}\right) \Vert _{L^{2}_{t} \dot{H}^{-\frac{1}{2}}_{x}} \nonumber \\&\quad +\, \Vert \Box \left( \eta _{j} \check{\phi }^{(n)}\right) \Vert _{L^{2}_{t} \dot{H}^{-\frac{1}{2}}_{x}} \lesssim _{\eta _{j}} \epsilon _{0} \end{aligned}$$
(6.13)

for any \(j=0,1,2,3\). In particular, in view of (6.12) and Hölder’s inequality, the sequence \((\widetilde{\eta } \check{A}^{(n)}, \widetilde{\eta } \check{\phi }^{(n)})\) is uniformly bounded in \(H^{1}_{t,x}\). By the Rellich-Kondrachov theorem, there exists a subsequence, which we still denote by \((\widetilde{\eta } \check{A}^{(n)}, \widetilde{\eta } \check{\phi }^{(n)})\), and a pair \((A, \phi ) \in H^{1}_{t,x}\) such that

$$\begin{aligned} \left( \widetilde{\eta } \check{A}^{(n)}, \widetilde{\eta } \check{\phi }^{(n)}\right) \rightharpoonup (A, \phi ) \quad \hbox { in } H^{1}_{t,x}, \quad \left( \widetilde{\eta } \check{A}^{(n)}, \widetilde{\eta } \check{\phi }^{(n)}\right) \rightarrow (A, \phi )\quad \hbox { in } L^{2}_{t,x},\nonumber \\ \end{aligned}$$
(6.14)

as \(n \rightarrow \infty \), where the notation \(\rightharpoonup \) refers to weak convergence.

Step 2 In this preparatory step, we make a microlocal decomposition of \(\eta \) that will allow us to combine (6.2) with the bound (6.13) on the sequence; see (6.15).

We use the classical pseudo-differential calculus. Let \(q_{0}(\tau , \xi ) \in S^{0}\) be a smooth cutoff such that \(q_{0}=1\) on the region \(\left\{ (\tau , \xi ) : \vert \tau \vert \le (1-\delta ) \vert \xi \vert \right\} \) in Fourier space and \({\mathrm {supp}}\, q_{0} \subseteq \left\{ (\tau , \xi ) : \vert \tau \vert \le (1-\delta /2) \vert \xi \vert \right\} \), where \(\delta > 0\) is to be chosen shortly. On the support of \(q_{0}\), the norm on the left-hand side of (6.13) is effective. On the other hand, since \(X = X^{\mu } \partial _{\mu }\) is a time-like vector field, we have \(\vert X^{0}(t,x)\vert ^{2} > \sum _{j=1}^{4} \vert X^{j}(t,x)\vert ^{2}\) everywhere. As \({\mathrm {supp}}\, \eta \) is compact, we may choose \(\delta > 0\) sufficiently small so that

$$\begin{aligned} (1-\delta )^{2} \vert X^{0}(t,x)\vert \ge \left( \sum _{j=1}^{4} \vert X^{j}(t,x)\vert ^{2} \right) ^{\frac{1}{2}} \quad \hbox { for } (t,x) \in {\mathrm {supp}}\, \eta . \end{aligned}$$

With such a choice of \(\delta > 0\), the symbol \(X^{0}(t,x) \tau + X^{\ell }(t,x) \xi _{\ell } \in S^{1}\) is elliptic on the phase space support of \(\eta (t,x) (1-q_{0})(\tau , \xi )\), in the sense that

$$\begin{aligned} \vert X^{0}(t,x) \tau + X^{\ell }(t,x) \xi _{\ell }\vert \ge \vert X^{0}(t,x) \tau \vert - \vert X^{\ell }(t,x) \xi _{\ell }\vert \ge c_{\delta , \eta , X^{0}} (\vert \tau \vert + \vert \xi \vert ) \end{aligned}$$

for \((t,x) \in {\mathrm {supp}}\, \eta \) and \((\tau , \xi ) \in {\mathrm {supp}}\, (1-q_{0})\), where we may take

$$\begin{aligned} c_{\delta , \eta , X^{0}} = \frac{\delta (1-\delta )}{2} \inf _{{\mathrm {supp}}\, \eta } \vert X^{0}\vert > 0. \end{aligned}$$
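The value of the constant can be traced as follows; this is a sketch of the elementary computation, assuming \(0< \delta < 1\) and using only that \(\vert \tau \vert \ge (1-\delta ) \vert \xi \vert \) on \({\mathrm {supp}}\, (1-q_{0})\) together with the choice of \(\delta \) made above. By the Cauchy-Schwarz inequality,

$$\begin{aligned} \vert X^{\ell } \xi _{\ell }\vert&\le \left( \sum _{j=1}^{4} \vert X^{j}\vert ^{2} \right) ^{\frac{1}{2}} \vert \xi \vert \le (1-\delta )^{2} \vert X^{0}\vert \vert \xi \vert \le (1-\delta ) \vert X^{0}\vert \vert \tau \vert , \\ \vert X^{0} \tau \vert - \vert X^{\ell } \xi _{\ell }\vert&\ge \delta \vert X^{0}\vert \vert \tau \vert , \\ \vert \tau \vert + \vert \xi \vert&\le \left( 1 + \frac{1}{1-\delta } \right) \vert \tau \vert \le \frac{2}{1-\delta } \vert \tau \vert . \end{aligned}$$

Combining the last two lines gives \(\vert X^{0} \tau \vert - \vert X^{\ell } \xi _{\ell }\vert \ge \frac{\delta (1-\delta )}{2} \vert X^{0}\vert (\vert \tau \vert + \vert \xi \vert )\), and taking the infimum of \(\vert X^{0}\vert \) over \({\mathrm {supp}}\, \eta \) yields the stated constant.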

Using the standard construction of a pseudo-differential elliptic parametrix, we may write

$$\begin{aligned} \eta (1-q_{0})(D_{t,x}) = q_{-1} (t,x, D_{t,x}) \, \eta X^{\mu } \partial _{\mu } + \widetilde{r}_{-1}(t,x,D_{t,x}) \end{aligned}$$

where \(q_{-1}, \widetilde{r}_{-1} \in S^{-1}\). Rearranging the terms, commuting \(\eta (t,x)\) with \(q_{0}\) and applying multiplication by \(\eta _{1}\) on the right, we arrive at the decomposition

$$\begin{aligned} \eta = q_{-1}(t,x, D_{t,x}) \eta X^{\mu } \partial _{\mu } + q_{0} \eta + r_{-1}(t,x,D_{t,x}) \eta _{1}, \end{aligned}$$
(6.15)

where \(r_{-1} \in S^{-1}\) is the sum of \(\widetilde{r}_{-1}\) and the commutator between \(\eta \) and \(q_{0}\).

Step 3 Here we show the strong convergence \(\eta F^{(n)}_{\mu \nu } \rightarrow \eta F_{\mu \nu }\) in \(L^{2}_{t,x}\), where we remind the reader that \(F^{(n)}_{\mu \nu } = \check{F}^{(n)}_{\mu \nu }\) by gauge invariance of the curvature 2-form. By (6.15), we may write

$$\begin{aligned} \eta F^{(n)}_{\mu \nu } = q_{-1}(t,x, D_{t,x}) \eta X^{\lambda } \partial _{\lambda } F^{(n)}_{\mu \nu } + q_{0}(D_{t,x}) \eta F^{(n)}_{\mu \nu } + r_{-1}(t,x,D_{t,x}) \eta _{1} F^{(n)}_{\mu \nu }. \end{aligned}$$

Using \(\mathrm {d}F^{(n)} = 0\), we rewrite \(\eta X^{\lambda } \partial _{\lambda } F^{(n)}_{\mu \nu }\) as

$$\begin{aligned} \eta X^{\lambda } \partial _{\lambda } F_{\mu \nu }^{(n)}= & {} \partial _{\mu } \left( \eta X^{\lambda } F^{(n)}_{\lambda \nu }\right) - \partial _{\nu } (\eta X^{\lambda } F^{(n)}_{\lambda \mu })\\&- \,\partial _{\mu } (\eta X^{\lambda }) F^{(n)}_{\lambda \nu } + \partial _{\nu } \left( \eta X^{\lambda }\right) F^{(n)}_{\lambda \mu }, \end{aligned}$$

and hence we arrive at

$$\begin{aligned} \eta F^{(n)}_{\mu \nu }= & {} q_{-1} (t,x, D_{t,x}) \big [ \partial _{\mu } (\eta (\iota _{X} F^{(n)})_{\nu }) - \partial _{\nu } (\eta (\iota _{X} F^{(n)})_{\mu }) \big ] + \,R_{\mathrm {M}}[F^{(n)}]_{\mu \nu }\nonumber \\ \end{aligned}$$
(6.16)

where

$$\begin{aligned} R_{\mathrm {M}}[F^{(n)}]_{\mu \nu }= & {} \ q_{0} (D_{t,x}) \eta F^{(n)}_{\mu \nu } - q_{-1}(t,x, D_{t,x}) \big [ \partial _{\mu } (\eta X^{\lambda }) F^{(n)}_{\lambda \nu } \\&- \partial _{\nu }(\eta X^{\lambda }) F^{(n)}_{\lambda \mu } \big ] + r_{-1}(t, x, D_{t,x}) \eta _{1} F^{(n)}_{\mu \nu } . \end{aligned}$$
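For convenience, the rewriting of \(\eta X^{\lambda } \partial _{\lambda } F^{(n)}_{\mu \nu }\) used above can be checked in two lines; the following is a sketch of the routine computation, with the superscript (n) dropped. In components, \(\mathrm {d}F = 0\) reads \(\partial _{\lambda } F_{\mu \nu } + \partial _{\mu } F_{\nu \lambda } + \partial _{\nu } F_{\lambda \mu } = 0\), so by the antisymmetry of F and the Leibniz rule,

$$\begin{aligned} X^{\lambda } \partial _{\lambda } F_{\mu \nu }&= X^{\lambda } \partial _{\mu } F_{\lambda \nu } - X^{\lambda } \partial _{\nu } F_{\lambda \mu }, \\ \eta X^{\lambda } \partial _{\mu } F_{\lambda \nu }&= \partial _{\mu } \left( \eta X^{\lambda } F_{\lambda \nu } \right) - \partial _{\mu } (\eta X^{\lambda }) F_{\lambda \nu }. \end{aligned}$$

Applying the second identity to both terms of the first, with the convention \((\iota _{X} F)_{\nu } = X^{\lambda } F_{\lambda \nu }\), recovers the four terms of the rewriting. The two terms without derivatives falling on F are exactly those absorbed into \(R_{\mathrm {M}}\).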

By (6.2), it follows that

$$\begin{aligned} \Vert q_{-1} (t,x, D_{t,x}) \left[ \partial _{\mu } \left( \eta (\iota _{X} F^{(n)})_{\nu }\right) - \partial _{\nu } \left( \eta (\iota _{X} F^{(n)})_{\mu }\right) \right] \Vert _{L^{2}_{t,x}} \rightarrow 0. \end{aligned}$$

Moreover, we claim that \(R_{\mathrm {M}}[F^{(n)}]_{\mu \nu }\) enjoys improved regularity, i.e.,

$$\begin{aligned} \Vert R_{\mathrm {M}}[F^{(n)}]_{\mu \nu }\Vert _{H^{\frac{1}{2}}_{t,x}} \lesssim \epsilon _{0} \quad \hbox { uniformly in } n. \end{aligned}$$
(6.17)

By the Rellich-Kondrachov theorem, after passing to a subsequence of \((\check{A}^{(n)}, \check{\phi }^{(n)})\), the sequence \(\widetilde{\eta } R_{\mathrm {M}}[F^{(n)}]_{\mu \nu }\) is strongly convergent in \(L^{2}_{t,x}\); moreover, we can also ensure that the limit belongs to \(H^{\frac{1}{2}}_{t,x}\). Combining these facts, as well as the identity \(\eta \widetilde{\eta } = \eta \), we see that \(\eta F^{(n)}_{\mu \nu }\) is strongly convergent in \(L^{2}_{t,x}\) to a limit that belongs to \(H^{\frac{1}{2}}_{t,x}\). Since \(\widetilde{\eta } \check{A}^{(n)}_{\mu } \rightarrow A_{\mu }\) in \(L^{2}_{t,x}\), the limit is equal to \(\eta F_{\mu \nu }\). Hence the statements regarding F in (6.4) and (6.6) follow.

It remains to verify the claim (6.17); it is at this point that we use the uniform bounds (6.12) and (6.13). Using the formula \(\eta F^{(n)} = \eta (\mathrm {d}\check{A}^{(n)}) = \mathrm {d}(\eta \check{A}^{(n)}) - \mathrm {d}\eta \wedge \check{A}^{(n)}\) and the support property of the symbol \(q_{0}\), we obtain

$$\begin{aligned} \Vert q_{0}(D_{t,x}) \eta F^{(n)}\Vert _{H^{\frac{1}{2}}_{t,x}}&\lesssim \Vert q_{0}(D_{t,x}) \mathrm {d}\left( \eta \check{A}^{(n)}\right) \Vert _{H^{\frac{1}{2}}_{t,x}} \\&\quad + \Vert q_{0}(D_{t,x}) \left( \mathrm {d}\eta \wedge \check{A}^{(n)}\right) \Vert _{H^{\frac{1}{2}}_{t,x}}. \end{aligned}$$

The second term on the right-hand side is bounded by \(\epsilon _{0}\) thanks to (6.12). To handle the first term, we divide the space-time Fourier space into the regions \(\left\{ \vert \tau \vert + \vert \xi \vert \le 1\right\} \) and \(\left\{ \vert \tau \vert + \vert \xi \vert > 1\right\} \). Also distinguishing the temporal and spatial components of \(\check{A}^{(n)}\), we may estimate

$$\begin{aligned} \Vert q_{0}(D_{t,x}) \mathrm {d}\left( \eta \check{A}^{(n)}\right) \Vert _{H^{\frac{1}{2}}_{t,x}}\lesssim & {} \Vert \partial _{t,x} \left( \eta \check{A}^{(n)}_{0}\right) \Vert _{L^{2}_{t} \dot{H}^{1/2}_{x}} \\&+\, \Vert \Box (\eta \check{A}^{(n)}_{x})\Vert _{L^{2}_{t} \dot{H}^{-1/2}_{x}} + \Vert \eta \check{A}^{(n)}\Vert _{L^{2}_{t,x}}. \end{aligned}$$

Using (6.13) for the first two terms and (6.12) for the last, the entire right-hand side is bounded by \(\epsilon _{0}\).

For the remainder \(R_{\mathrm {M}}[F^{(n)}]_{\mu \nu } - q_{0}(D_{t,x}) \eta F^{(n)}_{\mu \nu }\), we begin by observing that \(\Vert \eta _{2} F^{(n)}\Vert _{L^{2}_{t,x}} \lesssim \epsilon _{0}\) by the formula \(F^{(n)} = \mathrm {d}\check{A}^{(n)}\) and (6.12). Then we have

$$\begin{aligned} \Vert R_{\mathrm {M}}[F^{(n)}]_{\mu \nu } - q_{0}(D_{t,x}) \eta F^{(n)}_{\mu \nu }\Vert _{H^{1}_{t,x}} \lesssim \Vert \eta _{2} F^{(n)}\Vert _{L^{2}_{t,x}} \lesssim \epsilon _{0}, \end{aligned}$$

which proves the claim (6.17).

Step 4 In this intermediate step, we use strong \(L^{2}_{t,x}\) convergence of \(F^{(n)}_{\mu \nu }\) to prove

$$\begin{aligned} \eta \check{A}^{(n)}_{\mu } \rightarrow \eta A_{\mu } \quad \hbox { strongly in } L^{2}_{t} H^{1}_{x}, \end{aligned}$$
(6.18)

as \(n \rightarrow \infty \), up to a subsequence. We also prove improved regularity for the limit \(A_{\mu }\), i.e.,

$$\begin{aligned} \partial _{x} (\eta A_{\mu }) \in H^{\frac{1}{2}}_{t,x}. \end{aligned}$$
(6.19)

To begin with, observe that \(\triangle \check{A}^{(n)}_{\mu } = \partial ^{\ell } F^{(n)}_{\ell \mu }\) by the Coulomb gauge condition. Therefore, for each spatial component \(\mu = k \in \left\{ 1,2,3,4\right\} \), we have

$$\begin{aligned} \eta \check{A}^{(n)}_{k} = \triangle ^{-1} \left( \partial ^{\ell } \left( \eta F^{(n)}_{\ell k}\right) + [\triangle , \eta ] \check{A}^{(n)}_{k} + \left[ \eta , \partial ^{\ell }\right] F^{(n)}_{\ell k} \right) . \end{aligned}$$
(6.20)
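The derivation of (6.20) is a short commutator computation, sketched here for convenience. Since \(F^{(n)} = \mathrm {d}\check{A}^{(n)}\) and \(\partial ^{\ell } \check{A}^{(n)}_{\ell } = 0\), we have

$$\begin{aligned} \partial ^{\ell } F^{(n)}_{\ell \mu }&= \partial ^{\ell } \partial _{\ell } \check{A}^{(n)}_{\mu } - \partial _{\mu } \partial ^{\ell } \check{A}^{(n)}_{\ell } = \triangle \check{A}^{(n)}_{\mu }, \\ \triangle \left( \eta \check{A}^{(n)}_{k} \right)&= \eta \, \partial ^{\ell } F^{(n)}_{\ell k} + [\triangle , \eta ] \check{A}^{(n)}_{k} = \partial ^{\ell } \left( \eta F^{(n)}_{\ell k} \right) + \left[ \eta , \partial ^{\ell }\right] F^{(n)}_{\ell k} + [\triangle , \eta ] \check{A}^{(n)}_{k}, \end{aligned}$$

and (6.20) follows upon applying \(\triangle ^{-1}\). Explicitly, \([\triangle , \eta ] = 2 (\partial ^{\ell } \eta ) \partial _{\ell } + (\triangle \eta )\) and \([\eta , \partial ^{\ell }] = - (\partial ^{\ell } \eta )\), so both commutator terms are of lower order, with coefficients supported in \({\mathrm {supp}}\, \eta \).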

For any \(j \in \left\{ 1,2,3,4\right\} \), note that \(\partial _{j} \triangle ^{-1} \partial ^{\ell } (\eta F^{(n)}_{\ell k})\) is strongly convergent in \(L^{2}_{t,x}\), thanks to the previous step. Writing out \(F^{(n)} = \mathrm {d}\check{A}^{(n)}\) and using the strong \(L^{2}_{t,x}\) convergence of \(\widetilde{\eta } \check{A}^{(n)}_{k}\), it follows that the remainder \(\partial _{j} \triangle ^{-1} ([\triangle , \eta ] \check{A}^{(n)}_{k} + [\eta , \partial ^{\ell }] F^{(n)}_{\ell k} )\) is strongly convergent in \(L^{2}_{t,x}\) as well. Hence (6.18) holds for \(\mu \in \left\{ 1,2,3,4\right\} \).

In the case \(\mu = 0\), note that (6.12) and (6.13) already imply

$$\begin{aligned} \Vert \partial _{x} \left( \widetilde{\eta } \check{A}^{(n)}_{0}\right) \Vert _{H^{\frac{1}{2}}_{t,x}} \lesssim \epsilon _{0} \quad \hbox { uniformly in } n. \end{aligned}$$
(6.21)

Therefore, after taking a suitable subsequence, the desired convergence (6.18) for \(\mu = 0\) follows from the Rellich-Kondrachov theorem, and the improved regularity (6.19) for \(\mu = 0\) follows as well.

It only remains to prove the improved regularity (6.19) for \(\mu = k \in \left\{ 1,2,3,4\right\} \). First, passing to the limit in (6.20) and using the regularity \(\eta F \in H^{\frac{1}{2}}_{t,x}\) and \(A \in H^{1}_{t,x}\) of the limits, it follows that \(\eta A_{k} \in L^{2}_{t} H^{\frac{3}{2}}_{x}\). Then using the identity

$$\begin{aligned} \partial _{t} (\eta A_{k}) - \partial _{k} ( \eta A_{0})= \eta F_{0 k} + [\partial _{t}, \eta ] A_{k} - [\partial _{k}, \eta ] A_{0}, \end{aligned}$$

and the improved regularity \(\partial _{x} (\eta A_{0}) \in H^{\frac{1}{2}}_{t,x}\), as well as \(\eta F \in H^{\frac{1}{2}}_{t,x}\) and \(A \in H^{1}_{t,x}\), we have \(\partial _{t} (\eta A_{k}) \in H^{\frac{1}{2}}_{t,x}\). It follows that \(\eta A_{k} \in H^{\frac{3}{2}}_{t,x}\), which is better than what we need.

Step 5 In this step, we show that \(\eta \check{\mathbf{D}}^{(n)} \check{\phi }^{(n)} \rightarrow \eta \mathbf{D}\phi \) in \(L^{2}_{t,x}\) and \(\eta \phi \in H^{\frac{3}{2}}_{t,x}\). For the former, from the decomposition

$$\begin{aligned} \eta \check{\mathbf{D}}^{(n)}_{\mu } \check{\phi }^{(n)} = \eta \partial _{\mu } \check{\phi }^{(n)} + i \eta \check{A}^{(n)}_{\mu } \check{\phi }^{(n)}, \end{aligned}$$

the convergence \(\eta \check{A}^{(n)}_{\mu } \rightarrow \eta A_{\mu }\) in \(L^{2}_{t} H^{1}_{x}\) and (6.12), we see that it suffices to prove

$$\begin{aligned} \eta \partial _{\mu } \check{\phi }^{(n)} \rightarrow \eta \partial _{\mu } \phi \quad \hbox { in } L^{2}_{t,x}. \end{aligned}$$
(6.22)

By (6.15), we have

$$\begin{aligned} \eta \check{\phi }^{(n)} = q_{-1}(t,x, D_{t,x}) \eta X^{\mu } \partial _{\mu } \check{\phi }^{(n)} + q_{0} \eta \check{\phi }^{(n)} + r_{-1}(t,x,D_{t,x}) \eta _{1} \check{\phi }^{(n)}. \end{aligned}$$

To use (6.2), we rewrite \(\eta X^{\mu } \partial _{\mu } \check{\phi }^{(n)}\) as

$$\begin{aligned} \eta X^{\mu } \partial _{\mu } \check{\phi }^{(n)} = \eta (\check{\mathbf{D}}^{(n)}_{X} +b)\check{\phi }^{(n)} - i X^{\nu } \check{A}^{(n)}_{\nu } \eta \check{\phi }^{(n)} - \eta b \check{\phi }^{(n)}, \end{aligned}$$

where \(\check{\mathbf{D}}^{(n)} = \mathrm {d}+ i \check{A}^{(n)}\). Expanding \(\eta \check{A}^{(n)} = \eta (\check{A}^{(n)} - A) + \eta A\), where A is the limit in (6.14), we arrive at

$$\begin{aligned} \eta \check{\phi }^{(n)}= & {} q_{-1}(t, x, D_{t,x}) \eta \left( \check{\mathbf{D}}^{(n)}_{X} + b\right) \check{\phi }^{(n)}\nonumber \\&-\, i q_{-1} (t,x, D_{t,x}) X^{\nu } \eta \left( \check{A}^{(n)}_{\nu } - A_{\nu }\right) \check{\phi }^{(n)} \nonumber \\&-\, i q_{-1} (t,x, D_{t,x}) X^{\nu } \eta A_{\nu } \check{\phi }^{(n)} + R_{\mathrm {KG}}\left[ \check{\phi }^{(n)}\right] \end{aligned}$$
(6.23)

where

$$\begin{aligned} R_{\mathrm {KG}}[\check{\phi }^{(n)}] := q_{0} \eta \check{\phi }^{(n)} + r_{-1}(t, x, D_{t,x}) \eta _{1} \check{\phi }^{(n)} - q_{-1}(t, x, D_{t,x}) b \eta \check{\phi }^{(n)}. \end{aligned}$$
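The bookkeeping leading to (6.23) can be summarized as follows; this is only a restatement of the steps above, with A denoting the limit in (6.14). Starting from the definition of the covariant derivative,

$$\begin{aligned} \check{\mathbf{D}}^{(n)}_{X} \check{\phi }^{(n)}&= X^{\mu } \partial _{\mu } \check{\phi }^{(n)} + i X^{\nu } \check{A}^{(n)}_{\nu } \check{\phi }^{(n)}, \\ \eta X^{\mu } \partial _{\mu } \check{\phi }^{(n)}&= \eta \left( \check{\mathbf{D}}^{(n)}_{X} + b \right) \check{\phi }^{(n)} - i X^{\nu } \eta \left( \check{A}^{(n)}_{\nu } - A_{\nu } \right) \check{\phi }^{(n)} - i X^{\nu } \eta A_{\nu } \check{\phi }^{(n)} - \eta b \check{\phi }^{(n)}. \end{aligned}$$

Applying \(q_{-1}(t, x, D_{t,x})\) to the second line and inserting the result into the decomposition of \(\eta \check{\phi }^{(n)}\) given by (6.15) yields the first three terms of (6.23), while the term with b joins \(q_{0} \eta \check{\phi }^{(n)}\) and \(r_{-1}(t,x,D_{t,x}) \eta _{1} \check{\phi }^{(n)}\) in \(R_{\mathrm {KG}}\).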

As in Step 3, for the first term we have

$$\begin{aligned} \Vert q_{-1}(t, x, D_{t,x}) \eta (\check{\mathbf{D}}^{(n)}_{X} + b) \check{\phi }^{(n)}\Vert _{H^{1}_{t,x}} \rightarrow 0 \end{aligned}$$

as \(n \rightarrow \infty \), thanks to (6.2). For the second term, we have

$$\begin{aligned}&\Vert q_{-1} (t,x, D_{t,x}) X^{\nu } \eta (\check{A}^{(n)}_{\nu } - A_{\nu }) \check{\phi }^{(n)} \Vert _{H^{1}_{t,x}}\\&\quad \lesssim \Vert \eta (\check{A}^{(n)}_{\nu } - A_{\nu })\Vert _{L^{2}_{t} L^{4}_{x}} \Vert \check{\phi }^{(n)}\Vert _{L^{\infty }_{t} L^{4}_{x}} \rightarrow 0 \end{aligned}$$

as \(n \rightarrow \infty \), by Hölder, Sobolev in \(x\), the \(L^{2}_{t} H^{1}_{x}\) convergence of \(\eta \check{A}^{(n)}_{\nu }\) to \(\eta A_{\nu }\) and (6.12). On the other hand, for the third term, we have

$$\begin{aligned} \Vert q_{-1} (t,x, D_{t,x}) X^{\nu } \eta A_{\nu } \check{\phi }^{(n)}\Vert _{H^{\frac{3}{2}}_{t,x}} \lesssim \epsilon _{0} \Vert \langle D_{x}\rangle \langle D_{t,x}\rangle ^{\frac{1}{2}} (\eta A)\Vert _{L^{2}_{t,x}} \hbox { uniformly in } n, \end{aligned}$$

where we used Lemma 6.4 below with \(f = \eta A_{\nu }\) and \(g = \check{\phi }^{(n)}\). We also used the obvious bound \(\Vert \eta A_{\nu } \check{\phi }^{(n)}\Vert _{L^{2}_{t,x}} \lesssim \epsilon _{0} \Vert \langle D_{x}\rangle (\eta A)\Vert _{L^{2}_{t,x}}\), which follows from Hölder, Sobolev in x and (6.12), to control the \(L^{2}_{t,x}\) norm of the left-hand side. Finally, for \(R_{\mathrm {KG}}[\check{\phi }^{(n)}]\) we have, as in Step 3,

$$\begin{aligned} \Vert R_{\mathrm {KG}}\left[ \check{\phi }^{(n)}\right] \Vert _{H^{\frac{3}{2}}_{t,x}} \lesssim \epsilon _{0} \quad \hbox { uniformly in } n. \end{aligned}$$

By the Rellich-Kondrachov theorem, there exists a subsequence (which we still denote by \(\check{\phi }^{(n)}\)) such that

$$\begin{aligned} \widetilde{\eta } \left( - i q_{-1} (t,x, D_{t,x}) X^{\nu } \eta A_{\nu } \check{\phi }^{(n)} + R_{\mathrm {KG}}\left[ \check{\phi }^{(n)}\right] \right) \end{aligned}$$

is strongly convergent in \(H^{1}_{t,x}\) to a limit that belongs to \(H^{\frac{3}{2}}_{t,x}\). As a consequence of these facts, as well as the identity \(\eta \widetilde{\eta } = \eta \), it follows that \(\eta \check{\phi }^{(n)}\) is strongly convergent in \(H^{1}_{t,x}\) to a limit in \(H^{\frac{3}{2}}_{t,x}\). Finally, since \(\widetilde{\eta } \check{\phi }^{(n)} \rightarrow \phi \) in \(L^{2}_{t,x}\), the limit is equal to \(\eta \phi \). \(\square \)

Lemma 6.4

For \(f, g \in {{\mathcal {S}}}({{\mathbb {R}}}^{1+4})\), we have

$$\begin{aligned} \Vert fg\Vert _{\dot{H}^{\frac{1}{2}}_{t,x}} \lesssim \Vert \vert D_{t,x}\vert ^{\frac{1}{2}} f\Vert _{L^{2}_{t} \dot{H}^{1}_{x}} \Vert D_{t,x} g\Vert _{L^{\infty }_{t} L^{2}_{x}} . \end{aligned}$$
(6.24)

Proof

We use the Littlewood-Paley projections \(\left\{ S_{j}\right\} \) in \({{\mathbb {R}}}^{1+4}\). For every \(j \in {{\mathbb {Z}}}\), we decompose

$$\begin{aligned} S_{j} (fg) = S_{j}\left( \left( S_{> j - 10} f\right) g\right) + S_{j}\left( S_{\le j - 10} f S_{[j-5, j+5]} g\right) \end{aligned}$$

Using Sobolev and Hölder, we estimate each term on the right-hand side as follows:

$$\begin{aligned}&\Vert S_{j}((S_{> j - 10} f) g)\Vert _{\dot{H}^{\frac{1}{2}}_{t,x}}\nonumber \\&\quad \lesssim \sum _{j_{1} > j - 10} 2^{\frac{1}{2} j} \Vert S_{j_{1}} f\Vert _{L^{2}_{t} L^{4}_{x}} \Vert g\Vert _{L^{\infty }_{t} L^{4}_{x}} \\&\quad \lesssim \Vert D_{t,x} g\Vert _{L^{\infty }_{t} L^{2}_{x}}\sum _{j_{1} > j - 10} 2^{\frac{1}{2} (j - j_{1})} \Vert \vert D_{t,x}\vert ^{\frac{1}{2}} S_{j_{1}} f\Vert _{L^{2}_{t} \dot{H}^{1}_{x}}, \\&\Vert S_{j}(S_{\le j - 10} f S_{[j-5, j+5]} g)\Vert _{\dot{H}^{\frac{1}{2}}_{t,x}} \\&\quad \lesssim \sum _{j_{1} \le j-10} 2^{\frac{1}{2} j} \Vert S_{j_{1}} f\Vert _{L^{2}_{t} L^{\infty }_{x}} \Vert S_{[j-5, j+5]} g\Vert _{L^{\infty }_{t} L^{2}_{x}} \\&\quad \lesssim \Vert D_{t,x} g\Vert _{L^{\infty }_{t} L^{2}_{x}} \sum _{j_{1} \le j-10} 2^{\frac{1}{2}(j_{1} - j)} \Vert \vert D_{t,x}\vert ^{\frac{1}{2}} S_{j_{1}} f\Vert _{L^{2}_{t} \dot{H}^{1}_{x}} . \end{aligned}$$

Thanks to the exponential gain \(2^{- \frac{1}{2}\vert j - j_{1}\vert }\), we have

$$\begin{aligned} \sum _{j} \Vert S_{j}(fg)\Vert _{\dot{H}^{\frac{1}{2}}_{t,x}}^{2} \lesssim \Vert D_{t,x} g\Vert _{L^{\infty }_{t} L^{2}_{x}}^{2} \sum _{j_{1}} \Vert \vert D_{t,x}\vert ^{\frac{1}{2}} S_{j_{1}} f\Vert _{L^{2}_{t} \dot{H}^{1}_{x}}^{2}. \end{aligned}$$

The desired estimate is now a consequence of almost orthogonality of \(\left\{ S_{j}\right\} _{j \in {{\mathbb {Z}}}}\) in \(L^{2}_{t,x}\). \(\square \)
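For the reader's convenience, the exponent bookkeeping in the proof above can be sketched as follows. This is only a sketch under assumptions not spelled out in the text: the Sobolev embedding \(\dot{H}^{1}_{x} \subseteq L^{4}_{x}\) on \({{\mathbb {R}}}^{4}\), and Bernstein's inequality in x for functions whose space-time frequency is of size \(\lesssim 2^{j_{1}}\). For the high-frequency term, these give

$$\begin{aligned} \Vert S_{j_{1}} f\Vert _{L^{2}_{t} L^{4}_{x}} \lesssim \Vert S_{j_{1}} f\Vert _{L^{2}_{t} \dot{H}^{1}_{x}} \lesssim 2^{-\frac{1}{2} j_{1}} \Vert \vert D_{t,x}\vert ^{\frac{1}{2}} S_{j_{1}} f\Vert _{L^{2}_{t} \dot{H}^{1}_{x}}, \quad \Vert g\Vert _{L^{\infty }_{t} L^{4}_{x}} \lesssim \Vert D_{t,x} g\Vert _{L^{\infty }_{t} L^{2}_{x}}, \end{aligned}$$

so that the prefactor \(2^{\frac{1}{2} j}\) combines into \(2^{\frac{1}{2}(j - j_{1})}\). For the low-frequency term, we would similarly have

$$\begin{aligned} \Vert S_{j_{1}} f\Vert _{L^{2}_{t} L^{\infty }_{x}} \lesssim 2^{j_{1}} \Vert S_{j_{1}} f\Vert _{L^{2}_{t} L^{4}_{x}} \lesssim 2^{\frac{1}{2} j_{1}} \Vert \vert D_{t,x}\vert ^{\frac{1}{2}} S_{j_{1}} f\Vert _{L^{2}_{t} \dot{H}^{1}_{x}}, \quad \Vert S_{[j-5, j+5]} g\Vert _{L^{\infty }_{t} L^{2}_{x}} \lesssim 2^{-j} \Vert D_{t,x} g\Vert _{L^{\infty }_{t} L^{2}_{x}}, \end{aligned}$$

which yields \(2^{\frac{1}{2} j} \cdot 2^{\frac{1}{2} j_{1}} \cdot 2^{-j} = 2^{\frac{1}{2}(j_{1} - j)}\). Writing \(a_{j} := \Vert S_{j}(fg)\Vert _{\dot{H}^{\frac{1}{2}}_{t,x}}\) and \(b_{j_{1}} := \Vert \vert D_{t,x}\vert ^{\frac{1}{2}} S_{j_{1}} f\Vert _{L^{2}_{t} \dot{H}^{1}_{x}}\) (notation introduced only for this sketch), the square summation is then an instance of Young's inequality for sequences:

$$\begin{aligned} a_{j} \lesssim \Vert D_{t,x} g\Vert _{L^{\infty }_{t} L^{2}_{x}} \sum _{j_{1}} 2^{-\frac{1}{2} \vert j - j_{1}\vert } b_{j_{1}} \quad \Longrightarrow \quad \Vert a\Vert _{\ell ^{2}} \lesssim \Vert D_{t,x} g\Vert _{L^{\infty }_{t} L^{2}_{x}} \Big ( \sum _{k \in {{\mathbb {Z}}}} 2^{-\frac{1}{2} \vert k\vert } \Big ) \Vert b\Vert _{\ell ^{2}}. \end{aligned}$$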

6.2 Weak solutions to (MKG)

We first define a function space that is suitable for a weak formulation of (MKG).

Definition 6.5

Let \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) be an open set. We define \({{\mathcal {X}}}^{w}({{\mathcal {O}}})\) to be the linear space of pairs \((A, \phi )\), where A is a real-valued 1-form and \(\phi \) is a \({{\mathbb {C}}}\)-valued function on \({{\mathcal {O}}}\), such that

$$\begin{aligned} A_{\mu }, \phi \in L^{2}_{t,x}({{\mathcal {O}}}), \ F_{\mu \nu }, \mathbf{D}_{\mu } \phi \in L^{2}_{t,x}({{\mathcal {O}}}) \quad \hbox { for all } \mu , \nu = 0, 1, \ldots , 4,\quad \quad \quad \end{aligned}$$
(6.25)

where \(F_{\mu \nu } = \partial _{\mu } A_{\nu } - \partial _{\nu } A_{\mu }\) and \(\mathbf{D}_{\mu } \phi = \partial _{\mu } \phi + i A_{\mu } \phi \) in the sense of distributions.

We may now define a notion of weak solutions to (MKG) as follows.

Definition 6.6

(Weak solutions to (MKG)) Let \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) be an open set, and let \((A, \phi ) \in {{\mathcal {X}}}^{w}({{\mathcal {O}}})\). We say that \((A, \phi )\) is a weak solution to (MKG) on \({{\mathcal {O}}}\) if for every real-valued 1-form \(\omega \in C^{\infty }_{0}({{\mathcal {O}}})\) and complex-valued function \(\varphi \in C^{\infty }_{0}({{\mathcal {O}}})\), we have

$$\begin{aligned}&\iint _{{{\mathcal {O}}}} F_{\nu \mu } \partial ^{\mu } \omega ^{\nu } + \mathrm {Im}(\phi \overline{\mathbf{D}_{\nu } \phi }) \omega ^{\nu } \, \mathrm {d}t \mathrm {d}x = 0, \end{aligned}$$
(6.26)
$$\begin{aligned}&\iint _{{{\mathcal {O}}}} \mathrm {Re}(\mathbf{D}_{\mu } \phi \overline{\partial ^{\mu } \varphi }) + \mathrm {Im}(A^{\mu } \mathbf{D}_{\mu } \phi \overline{\varphi }) \, \mathrm {d}t \mathrm {d}x = 0. \end{aligned}$$
(6.27)

By an integration by parts argument, it may be readily verified that admissible and classical solutions to (MKG) are indeed weak solutions. In the converse direction, if \((A, \phi )\) is a weak solution to (MKG) that is furthermore smooth, then \((A, \phi )\) solves (MKG) in the usual, classical sense.

Next, we discuss the gauge structure of weak solutions to (MKG). We first define the space of gauge transformations between pairs in \({{\mathcal {X}}}^{w}\).

Definition 6.7

Given an open set \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\), let \({{\mathcal {Y}}}^{w}({{\mathcal {O}}})\) be the space of real-valued functions \(\chi \) on \({{\mathcal {O}}}\) such that \(\chi \in H^{1}_{t,x}({{\mathcal {O}}})\).

Indeed, note that if \((A, \phi ) \in {{\mathcal {X}}}^{w}\) and \(\chi \in {{\mathcal {Y}}}^{w}\), then the gauge transform \((\widetilde{A}, \widetilde{\phi }) := (A - \mathrm {d}\chi , e^{i \chi } \phi )\) also belongs to \({{\mathcal {X}}}^{w}\). Moreover, if \((A, \phi )\) is a weak solution to (MKG) then so is \((\widetilde{A}, \widetilde{\phi })\), as the next lemma demonstrates.

Lemma 6.8

Let \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) be an open set, and let \((A, \phi ) \in {{\mathcal {X}}}^{w}({{\mathcal {O}}})\) be a weak solution to (MKG). Then for every \(\chi \in {{\mathcal {Y}}}^{w}({{\mathcal {O}}})\), the gauge transform \((\widetilde{A}, \widetilde{\phi }) := (A - \mathrm {d}\chi , e^{i \chi } \phi )\) also belongs to \({{\mathcal {X}}}^{w}({{\mathcal {O}}})\) and is a weak solution to (MKG).

Proof

We need to verify (6.26) and (6.27) for \((\widetilde{A}, \widetilde{\phi })\). For (6.26) there is nothing to verify, as both F and \(\mathrm {Im}(\phi \overline{\mathbf{D}\phi })\) are invariant under gauge transformation. For (6.27), we have

$$\begin{aligned}&\iint _{{{\mathcal {O}}}} \mathrm {Re}\left( \widetilde{\mathbf{D}}_{\mu } \widetilde{\phi } \, \overline{\partial ^{\mu } \varphi }\right) + \mathrm {Im}\left( \widetilde{A}^{\mu } \widetilde{\mathbf{D}}_{\mu } \widetilde{\phi } \, \overline{\varphi }\right) \, \mathrm {d}t \mathrm {d}x \\&\quad = \iint _{{{\mathcal {O}}}} \mathrm {Re}\left( \mathbf{D}_{\mu } \phi \overline{\partial ^{\mu } (e^{-i \chi } \varphi )}\right) + \mathrm {Im}\left( A^{\mu } \mathbf{D}_{\mu } \phi \, \overline{e^{-i \chi } \varphi }\right) \, \mathrm {d}t \mathrm {d}x. \end{aligned}$$

Observe that if \(\chi \in C^{\infty }({{\mathcal {O}}})\), then the last line would be equal to zero by (6.27) for \((A, \phi )\). Considering a sequence \(\chi ^{(n)} \in C^{\infty }({{\mathcal {O}}})\) such that \(\chi ^{(n)} \rightarrow \chi \) in the \(H^{1}_{t,x}({{\mathcal {O}}})\) topology and also pointwise almost everywhere, it can be seen that the last line is indeed zero, by the dominated convergence theorem, Leibniz’s rule and Hölder’s inequality. \(\square \)
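The identity displayed in the proof above rests on the gauge covariance of \(\mathbf{D}\). As a sketch for smooth \(\chi \) (the case to which the approximation argument reduces), and writing \(\widetilde{\mathbf{D}} = \mathrm {d}+ i \widetilde{A}\), one may compute

$$\begin{aligned} \widetilde{\mathbf{D}}_{\mu } \widetilde{\phi } = (\partial _{\mu } + i A_{\mu } - i \partial _{\mu } \chi ) (e^{i \chi } \phi ) = e^{i \chi } \mathbf{D}_{\mu } \phi , \quad \overline{\partial ^{\mu } (e^{-i \chi } \varphi )} = e^{i \chi } \left( \overline{\partial ^{\mu } \varphi } + i \partial ^{\mu } \chi \, \overline{\varphi }\right) . \end{aligned}$$

Since \(\mathrm {Re}(i c z) = - c \, \mathrm {Im}(z)\) for real c, it would follow pointwise that

$$\begin{aligned} \mathrm {Re}\left( \mathbf{D}_{\mu } \phi \overline{\partial ^{\mu } (e^{-i \chi } \varphi )}\right)&= \mathrm {Re}\left( e^{i \chi } \mathbf{D}_{\mu } \phi \, \overline{\partial ^{\mu } \varphi }\right) - \partial ^{\mu } \chi \, \mathrm {Im}\left( e^{i \chi } \mathbf{D}_{\mu } \phi \, \overline{\varphi }\right) , \\ \mathrm {Im}\left( A^{\mu } \mathbf{D}_{\mu } \phi \, \overline{e^{-i \chi } \varphi }\right)&= A^{\mu } \, \mathrm {Im}\left( e^{i \chi } \mathbf{D}_{\mu } \phi \, \overline{\varphi }\right) . \end{aligned}$$

Adding the two lines and using \(\widetilde{A}^{\mu } = A^{\mu } - \partial ^{\mu } \chi \) recovers the integrand \(\mathrm {Re}(\widetilde{\mathbf{D}}_{\mu } \widetilde{\phi } \, \overline{\partial ^{\mu } \varphi }) + \mathrm {Im}(\widetilde{A}^{\mu } \widetilde{\mathbf{D}}_{\mu } \widetilde{\phi } \, \overline{\varphi })\). The same covariance gives \(\mathrm {Im}(\widetilde{\phi } \overline{\widetilde{\mathbf{D}}_{\nu } \widetilde{\phi }}) = \mathrm {Im}(\phi \overline{\mathbf{D}_{\nu } \phi })\), which is the invariance used for (6.26).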

6.3 Local description of solutions to (MKG)

Here we discuss how to describe a solution to (MKG) by local data. More precisely, given an open cover \({\mathcal {Q}}= \left\{ Q_{\alpha }\right\} \) of an open set \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\), we would like to describe a solution to (MKG) on \({{\mathcal {O}}}\) by local solutions on \(Q_{\alpha }\) satisfying certain compatibility conditions, which ensure that the local solutions combine to form a single solution on \({{\mathcal {O}}}\). This idea is made precise by the ensuing definition.

Definition 6.9

(Smooth compatible pairs) Let \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) be an open set and let \({\mathcal {Q}}= \left\{ Q_{\alpha }\right\} \) be a locally finite open covering of \({{\mathcal {O}}}\). For each index \(\alpha \), consider a pair \((A_{[\alpha ]}, \phi _{[\alpha ]}) \in C^{\infty }_{t,x}(Q_{\alpha })\), where \(A_{[\alpha ]}\) is a real-valued 1-form and \(\phi _{[\alpha ]}\) is a \({{\mathbb {C}}}\)-valued function on \(Q_{\alpha }\). We say that \((A_{[\alpha ]}, \phi _{[\alpha ]})\) are smooth compatible pairs if for every \(\alpha , \beta \), there exists a gauge transformation \(\chi _{[\alpha \beta ]} \in C^{\infty }_{t,x}(Q_{\alpha } \cap Q_{\beta })\) such that the following properties hold:

  1. (1)

    For every \(\alpha \), we have \(\chi _{[\alpha \alpha ]} = 0\).

  2. (2)

    For every \(\alpha , \beta \), we have

    $$\begin{aligned} (A_{[\beta ]}, \phi _{[\beta ]}) = (A_{[\alpha ]} - \mathrm {d}\chi _{[\alpha \beta ]}, e^{i \chi _{[\alpha \beta ]}} \phi _{[\alpha ]}) \quad \hbox { on } Q_{\alpha } \cap Q_{\beta }. \end{aligned}$$
    (6.28)
  3. (3)

    For every \(\alpha , \beta , \gamma \), the following cocycle condition is satisfied:

    $$\begin{aligned} \chi _{[\alpha \beta ]} + \chi _{[\beta \gamma ]} + \chi _{[\gamma \alpha ]} \in 2 \pi {{\mathbb {Z}}}\quad \hbox { on } Q_{\alpha } \cap Q_{\beta } \cap Q_{\gamma }. \end{aligned}$$
    (6.29)

The notion of (gauge-)equivalence of compatible pairs is defined as follows.

Definition 6.10

(Equivalence of smooth compatible pairs) Let \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) be an open set, and let \({\mathcal {Q}}= \left\{ Q_{\alpha }\right\} , {\mathcal {Q}}' = \left\{ Q_{\beta }'\right\} \) be locally finite open coverings of \({{\mathcal {O}}}\). Consider two sets of smooth compatible pairs \((A_{[\alpha ]}, \phi _{[\alpha ]})\) and \((A'_{[\beta ]}, \phi '_{[\beta ]})\) on \({\mathcal {Q}}\) and \({\mathcal {Q}}'\), respectively. When \({\mathcal {Q}}'\) is a refinement of \({\mathcal {Q}}\) (i.e., for every \(\beta \) there exists \(\alpha = \alpha (\beta )\) such that \(Q'_{\beta } \subseteq Q_{\alpha }\)), we say that \((A_{[\alpha ]}, \phi _{[\alpha ]})\) and \((A'_{[\beta ]}, \phi '_{[\beta ]})\) are (gauge-)equivalent if for every \(\beta \) there exists \(\chi _{[\beta ]} \in C^{\infty }_{t,x}(Q'_{\beta })\) such that \((A'_{[\beta ]}, \phi '_{[\beta ]}) =(A_{[\alpha ]} - \mathrm {d}\chi _{[\beta ]}, \phi _{[\alpha ]} e^{i \chi _{[\beta ]}})\) on \(Q'_{\beta }\), where \(\alpha = \alpha (\beta )\). In the general case, we say that \((A_{[\alpha ]}, \phi _{[\alpha ]})\) and \((A'_{[\beta ]}, \phi '_{[\beta ]})\) are (gauge-)equivalent if there exists a common refinement \({\mathcal {Q}}''\) of \({\mathcal {Q}}, {\mathcal {Q}}'\) and a set of smooth compatible pairs \((A''_{[\gamma ]}, \phi ''_{[\gamma ]})\) on \({\mathcal {Q}}''\) which is equivalent to both \((A_{[\alpha ]}, \phi _{[\alpha ]})\) and \((A'_{[\beta ]}, \phi '_{[\beta ]})\).

Remark 6.11

In more geometric terms, compatible pairs \((A_{[\alpha ]}, \phi _{[\alpha ]})\) on \(Q_{\alpha }\) are precisely expressions of a connection A and a section \(\phi \) of a complex line bundle L in local trivializations \(L \!\upharpoonright _{Q_{\alpha }} \simeq Q_{\alpha } \times {{\mathbb {C}}}\). Moreover, equivalent sets of compatible pairs are alternative expressions of the same global pair \((A, \phi )\).

In fact, expressing connections and sections in local trivializations in the fashion of Definition 6.9 is necessary if the complex line bundle L under consideration is topologically nontrivial (i.e., L is not homeomorphic to the product of \({{\mathbb {C}}}\) and the base space). In our setting, however, there is no loss of generality in simply identifying connections and sections of L with real-valued 1-forms and complex-valued functions, respectively, as all base spaces we consider (e.g., \({{\mathcal {O}}}= I \times {{\mathbb {R}}}^{4}\) or \(C^{T}_{[T, \infty )}\) for some \(T > 0\)) are contractible and hence all complex line bundles over such spaces are topologically trivial. In this case, every set of smooth compatible pairs on \({{\mathcal {O}}}\) is equivalent to a global smooth pair \((A, \phi )\) on \({{\mathcal {O}}}\).

Remark 6.12

We emphasize that no delicate patching is needed for smooth compatible pairs in this paper, since all we need is merely the soft fact that the energy argument in Sect. 5 and the stress tensor argument in Sect. 7 (which are both gauge invariant) can be justified. In contrast, in [26] an elaborate patching argument had to be developed in order to control the \(S^{1}\) norm of the equivalent global pair in the Coulomb gauge.

Based on the spaces introduced for the weak formulation of (MKG) discussed above, we can also formulate the notion of weak compatible pairs.

Definition 6.13

(Weak compatible pairs)  Let \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) be an open set and let \({\mathcal {Q}}= \left\{ Q_{\alpha }\right\} \) be a locally finite covering of \({{\mathcal {O}}}\). For each index \(\alpha \), consider a pair \((A_{[\alpha ]}, \phi _{[\alpha ]}) \in {{\mathcal {X}}}^{w}(Q_{\alpha })\). We say that \((A_{[\alpha ]}, \phi _{[\alpha ]})\) are weak compatible pairs if for every \(\alpha , \beta \), there exists a gauge transformation \(\chi _{[\alpha \beta ]} \in {{\mathcal {Y}}}^{w}(Q_{\alpha } \cap Q_{\beta })\) such that the properties (1)–(3) in Definition 6.9 hold almost everywhere.

The notion of equivalent sets of weak compatible pairs is defined as in Definition 6.10, where the space \(C^{\infty }_{t,x}(Q'_{\beta })\) is replaced by \({{\mathcal {Y}}}^{w}(Q'_{\beta })\).

Geometrically, weak compatible pairs \((A_{[\alpha ]}, \phi _{[\alpha ]})\) may be thought of as local descriptions of a connection and a section defined on a rough complex line bundle L. A simple but crucial observation is that smoothness of the pairs \((A_{[\alpha ]}, \phi _{[\alpha ]})\) implies smoothness of the gauge transformations \(\chi _{[\alpha \beta ]}\). Indeed, simply note that \(\mathrm {d}\chi _{[\alpha \beta ]} = A_{[\alpha ]} - A_{[\beta ]}\) by the property (2) in Definition 6.9. As this fact will play an important role in our argument (see Proposition 7.3), we record it as a separate lemma.

Lemma 6.14

Let \({\mathcal {Q}}= \left\{ Q_{\alpha }\right\} \) be an open cover of \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\), and let \((A_{[\alpha ]}, \phi _{[\alpha ]})\) on \(Q_{\alpha }\) be weak compatible pairs. If \(A_{[\alpha ]}, \phi _{[\alpha ]} \in C^{\infty }(Q_{\alpha })\) for every \(\alpha \), then \((A_{[\alpha ]}, \phi _{[\alpha ]})\) form smooth compatible pairs in the sense of Definition 6.9.

We end this subsection with another simple lemma, which will be used later to show that the local solutions obtained from Proposition 6.1 in the limit form weak compatible pairs.

Lemma 6.15

Let \(Q_{1}, Q_{2} \subseteq {{\mathbb {R}}}^{1+4}\) be open sets such that \(Q_{1} \cap Q_{2} \ne \emptyset \) is an open bounded set with a piecewise smooth boundary. Consider sequences \((A_{[\alpha ]}^{(n)}, \phi _{[\alpha ]}^{(n)}) \in {{\mathcal {X}}}^{w}(Q_{\alpha })\) (\(\alpha = 1,2\)) and \(\chi _{[12]}^{(n)} \in {{\mathcal {Y}}}^{w}(Q_{1} \cap Q_{2})\) such that

$$\begin{aligned} \left( A_{[2]}^{(n)}, \phi _{[2]}^{(n)}\right) = \left( A_{[1]}^{(n)} - \mathrm {d}\chi _{[12]}^{(n)}, \phi _{[1]}^{(n)} e^{i \chi _{[12]}^{(n)}}\right) \quad \hbox { a.e. on } Q_{1} \cap Q_{2}.\quad \quad \end{aligned}$$
(6.30)

In other words, \((A_{[\alpha ]}^{(n)}, \phi _{[\alpha ]}^{(n)})\) are weak compatible pairs for each n. Suppose furthermore that each sequence \((A_{[\alpha ]}^{(n)}, \phi _{[\alpha ]}^{(n)})\) has a limit \((A_{[\alpha ]}, \phi _{[\alpha ]})\) in \({{\mathcal {X}}}^{w}(Q_{\alpha })\) as \(n \rightarrow \infty \). Then the limits \((A_{[\alpha ]}, \phi _{[\alpha ]})\) (\(\alpha =1,2\)) also form weak compatible pairs, i.e., there exists \(\chi _{[12]} \in {{\mathcal {Y}}}^{w}(Q_{1} \cap Q_{2})\) such that

$$\begin{aligned} (A_{[2]}, \phi _{[2]} ) = (A_{[1]} - \mathrm {d}\chi _{[12]}, \phi _{[1]} e^{i \chi _{[12]}}) \quad \hbox { a.e. on } Q_{1} \cap Q_{2}. \end{aligned}$$
(6.31)

Moreover, there exists a subsequence of \(\chi _{[12]}^{(n)}\) that converges (see Footnote 13) to \(\chi _{[12]}\) in \({{\mathcal {Y}}}^{w}(Q_{1} \cap Q_{2})\) up to integer multiples of \(2 \pi \).

Proof

Let \(\overline{\chi }_{[12]}^{(n)} := \frac{1}{\vert Q_{1} \cap Q_{2}\vert } \int _{Q_{1} \cap Q_{2}} \chi _{[12]}^{(n)}\) denote the mean of \(\chi _{[12]}^{(n)}\). By Poincaré’s inequality, the identity \(\mathrm {d}\chi _{[12]}^{(n)} = A_{[1]}^{(n)} - A_{[2]}^{(n)}\) and the \(L^{2}_{t,x}\) convergence of \(A_{[\alpha ]}^{(n)}\) (\(\alpha = 1, 2\)), the mean-zero part \(\hat{\chi }_{[12]}^{(n)} := \chi _{[12]}^{(n)} - \overline{\chi }_{[12]}^{(n)}\) converges to a limit \(\hat{\chi }_{[12]}\) in \({{\mathcal {Y}}}^{w}(Q_{1} \cap Q_{2}) = H^{1}_{t,x}(Q_{1} \cap Q_{2})\). On the other hand, we can easily extract a convergent subsequence from the bounded sequence \(e^{i \overline{\chi }_{[12]}^{(n)}}\); abusing the notation a bit, we denote the subsequence still by \(e^{i \overline{\chi }_{[12]}^{(n)}}\), and the limit by \(e^{i \overline{\chi }_{[12]}}\) for some \(\overline{\chi }_{[12]} \in {{\mathbb {R}}}\). It follows that \(\chi _{[12]}^{(n)}\) converges to \(\chi _{[12]} := \hat{\chi }_{[12]} + \overline{\chi }_{[12]}\) in \({{\mathcal {Y}}}^{w}(Q_{1} \cap Q_{2})\) as \(n \rightarrow \infty \) up to integer multiples of \(2 \pi \). The desired gauge equivalence in the limit (6.31) is now an easy consequence of (6.30) and the above convergences. \(\square \)

7 Stationary/self-similar solutions with finite energy

In the context of the blow-up analysis to be performed in Sect. 8, the local strong compactness result (Proposition 6.1) will give rise to two types of solutions to (MKG):

  • A stationary solution \((A, \phi )\), which is defined by the property

    $$\begin{aligned} \iota _{Y} F = 0, \quad \mathbf{D}_{Y} \phi = 0 \end{aligned}$$
    (7.1)

    for some constant time-like vector field Y; or

  • A self-similar solution \((A, \phi )\), defined by the property

    $$\begin{aligned} \iota _{X_{0}} F = 0, \quad \left( \mathbf{D}_{X_{0}} + \frac{1}{\rho }\right) \phi = 0. \end{aligned}$$
    (7.2)

In Sects. 7.1 and 7.2, we show that such solutions must be trivial under the finite energy assumption. We use the stress tensor method, which is the elliptic version of the energy-momentum-stress tensor argument considered in Sect. 5. In Sect. 7.3, we establish an elliptic regularity result for these solutions under the improved regularity assumption (6.6) ensured by Proposition 6.1.

7.1 Triviality of finite energy stationary solutions

As any unit constant time-like vector field Y can be Lorentz transformed to the vector field \(T = \partial _{t}\) in the rectilinear coordinates, we may assume that \(Y = T\). Our main result in this case is as follows.

Proposition 7.1

Let \((A, \phi )\) be a smooth solution to (MKG) on \({{\mathbb {R}}}^{1+4}\) with \(\iota _{T} F = 0\) and \(\mathbf{D}_{T} \phi = 0\). Suppose furthermore that \((A, \phi )\) has finite energy, i.e., \({{\mathcal {E}}}_{\left\{ 0\right\} \times {{\mathbb {R}}}^{4}}[A, \phi ] < \infty \). Then \({{\mathcal {E}}}_{\left\{ 0\right\} \times {{\mathbb {R}}}^{4}}[A, \phi ] = 0\).

Proof

We use the rectilinear coordinates \((t= x^{0}, x^{1}, \ldots , x^{4})\), in which \(T = \partial _{t}\). By the stationarity assumptions \((\iota _{T} F)(\partial _{j}) = F_{0j}= 0\) and \(\mathbf{D}_{T} \phi = \mathbf{D}_{0} \phi = 0\), (MKG) reduces to the following elliptic system on each constant t hypersurface:

$$\begin{aligned} \left\{ \begin{aligned}&\partial ^{\ell } F_{j \ell } = \mathrm {Im}(\phi \overline{\mathbf{D}_{j} \phi }), \\&\mathbf{D}^{\ell } \mathbf{D}_{\ell } \phi = 0. \end{aligned} \right. \end{aligned}$$
(7.3)

Henceforth, we work with \(F, \phi \) restricted to the hypersurface \(\left\{ t = 0\right\} \).

For the purpose of showing \({{\mathcal {E}}}[A, \phi ] = 0\), consider the following stress tensor associated to (7.3):

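$$\begin{aligned} {\mathcal {Q}}_{jk}[A, \phi ] := F_{j}{}^{\ell } F_{k \ell } - \frac{1}{4} \delta _{jk} F_{\ell m} F^{\ell m} + \mathrm {Re}(\mathbf{D}_{j} \phi \overline{\mathbf{D}_{k} \phi }) - \frac{1}{2} \delta _{jk} \mathbf{D}^{\ell } \phi \overline{\mathbf{D}_{\ell } \phi }. \end{aligned}$$
(7.4)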

Given a vector field S on \({{\mathbb {R}}}^{4}\), we define as before the associated 1- and 0-currents

$$\begin{aligned} {}^{(S)} J_{j}[A, \phi ]:= & {} {\mathcal {Q}}_{jk}[A,\phi ] S^{k}, \\ {}^{(S)} K[A, \phi ]:= & {} {\mathcal {Q}}_{jk}[A, \phi ] {}^{(S)} \pi ^{jk} \end{aligned}$$

which, thanks to (7.3), satisfy the divergence identity

$$\begin{aligned} \nabla ^{\mathbf{a}} ({}^{(S)} J[A, \phi ]_{\mathbf{a}}) = {}^{(S)} K[A, \phi ]. \end{aligned}$$
(7.5)

Choosing S to be the scaling vector field on \({{\mathbb {R}}}^{4}\) so that, in the rectilinear coordinates

$$\begin{aligned} S^{k} = x^{k}, \quad {}^{(S)} \pi ^{jk} = 2 \delta ^{jk}, \end{aligned}$$

we have

$$\begin{aligned} {}^{(S)} K[A, \phi ] = - 2 \vert \mathbf{D}\phi \vert ^{2}, \quad \vert {}^{(S)} J_{j}[A, \phi ]\vert \lesssim \vert x\vert \vert \mathbf{D}\phi \vert ^{2} + \vert x\vert \vert F\vert ^{2}. \end{aligned}$$

where \(\vert \mathbf{D}\phi \vert ^{2} = \sum _{j=1}^{4} \vert \mathbf{D}_{j} \phi \vert ^{2}\) and \(\vert F\vert ^{2} = \sum _{1 \le j < k \le 4} \vert F_{jk}\vert ^{2}\).
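The formula for \({}^{(S)} K\) can be checked by taking the trace of the stress tensor. The following sketch assumes that \({\mathcal {Q}}_{jk}\) has the standard form, consisting of a Maxwell part \(F_{j}{}^{\ell } F_{k \ell } - \frac{1}{4} \delta _{jk} F_{\ell m} F^{\ell m}\) and a scalar part \(\mathrm {Re}(\mathbf{D}_{j} \phi \overline{\mathbf{D}_{k} \phi }) - \frac{1}{2} \delta _{jk} \mathbf{D}^{\ell } \phi \overline{\mathbf{D}_{\ell } \phi }\); this form is an assumption of the sketch. In dimension 4 we would then have

$$\begin{aligned} \delta ^{jk} \Big ( F_{j}{}^{\ell } F_{k \ell } - \frac{1}{4} \delta _{jk} F_{\ell m} F^{\ell m} \Big )&= F^{k \ell } F_{k \ell } - \frac{4}{4} F_{\ell m} F^{\ell m} = 0, \\ \delta ^{jk} \Big ( \mathrm {Re}(\mathbf{D}_{j} \phi \overline{\mathbf{D}_{k} \phi }) - \frac{1}{2} \delta _{jk} \mathbf{D}^{\ell } \phi \overline{\mathbf{D}_{\ell } \phi } \Big )&= \vert \mathbf{D}\phi \vert ^{2} - \frac{4}{2} \vert \mathbf{D}\phi \vert ^{2} = - \vert \mathbf{D}\phi \vert ^{2}, \end{aligned}$$

so that \({\mathcal {Q}}_{jk} \, {}^{(S)} \pi ^{jk} = 2 \delta ^{jk} {\mathcal {Q}}_{jk} = - 2 \vert \mathbf{D}\phi \vert ^{2}\). The Maxwell part drops out precisely because it is trace-free in dimension 4. The bound for \({}^{(S)} J_{j}\) follows since each component of \({\mathcal {Q}}\) is quadratic in \((F, \mathbf{D}\phi )\) and \(\vert S\vert = \vert x\vert \).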

We now integrate (7.5) by parts on a ball \(B_{R} \subseteq {{\mathbb {R}}}^{4}\) of radius \(R > 1\) centered at 0. Then we see that

$$\begin{aligned} - 2 \int _{B_{R}} \vert \mathbf{D}\phi \vert ^{2} \, \mathrm {d}x = \int _{\partial B_{R}} {}^{(S)} J[A, \phi ]_{\mathbf{a}} \mathbf{n}^{\mathbf{a}}, \quad \hbox { where } \mathbf{n}= \frac{x^{\ell }}{\vert x\vert } \partial _{\ell }. \end{aligned}$$
(7.6)

By the finite energy condition, we have \(\vert \mathbf{D}\phi \vert , \vert F\vert \in L^{2}({{\mathbb {R}}}^{4})\); this fact is enough to deduce the existence of a sequence of radii \(R_{n} \rightarrow \infty \) along which the boundary integral vanishes. Hence it follows that \(\mathbf{D}_{x} \phi = 0\).
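The choice of radii can be sketched as follows, where the function h is notation introduced only for this illustration. Set

$$\begin{aligned} h(R) := \int _{\partial B_{R}} \vert \mathbf{D}\phi \vert ^{2} + \vert F\vert ^{2}, \quad \hbox { so that } \int _{1}^{\infty } h(R) \, \mathrm {d}R \le \Vert \mathbf{D}\phi \Vert _{L^{2}({{\mathbb {R}}}^{4})}^{2} + \Vert F\Vert _{L^{2}({{\mathbb {R}}}^{4})}^{2} < \infty . \end{aligned}$$

If we had \(R \, h(R) \ge c > 0\) for all large R, then \(h(R) \ge c / R\) would fail to be integrable at infinity. Hence \(\liminf _{R \rightarrow \infty } R \, h(R) = 0\), and there exists \(R_{n} \rightarrow \infty \) such that

$$\begin{aligned} \Big \vert \int _{\partial B_{R_{n}}} {}^{(S)} J[A, \phi ]_{\mathbf{a}} \mathbf{n}^{\mathbf{a}} \Big \vert \lesssim R_{n} \, h(R_{n}) \rightarrow 0, \end{aligned}$$

where we used the pointwise bound \(\vert {}^{(S)} J_{j}\vert \lesssim \vert x\vert \vert \mathbf{D}\phi \vert ^{2} + \vert x\vert \vert F\vert ^{2}\). Taking \(R = R_{n}\) in (7.6) and letting \(n \rightarrow \infty \) then gives \(\int _{{{\mathbb {R}}}^{4}} \vert \mathbf{D}\phi \vert ^{2} \, \mathrm {d}x = 0\).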

It only remains to show that \(F = 0\). Note that F is now a harmonic 2-form in \(L^{2}({{\mathbb {R}}}^{4})\), as \(\mathrm {d}F = \mathrm {d}^{2} A = 0\) and the right-hand side of the first equation in (7.3) vanishes. Therefore, each component \(F_{jk}\) is a harmonic function. By the non-existence (see Footnote 14) of nontrivial harmonic functions in \(L^{2}({{\mathbb {R}}}^{4})\), it follows that \(F =0\), which completes the proof. \(\square \)

7.2 Triviality of finite energy self-similar solutions

In the case of a self-similar solution with finite energy, our main result is as follows.

Proposition 7.2

Let \((A, \phi )\) be a smooth solution to (MKG) on the forward light cone \(C_{(0, \infty )}\) with \(\iota _{X_{0}} F = 0\) and \(\mathbf{D}_{X_{0}} \phi + \frac{1}{\rho } \phi = 0\). Suppose furthermore that \((A, \phi )\) has finite energy, i.e., \(\sup _{t \in (0, \infty )} {{\mathcal {E}}}_{S_{t}}[A, \phi ] < \infty \). Then \({{\mathcal {E}}}_{S_{t}}[A, \phi ] = 0\) for all \(t > 0\).

Proof

We use the hyperbolic coordinates \((\rho , y, \Theta )\), in which \(X_{0} = \partial _{\rho }\). By the self-similarity assumption \(\iota _{X_{0}} F (\cdot ) = F(\partial _{\rho }, \cdot ) = 0\) and \(\mathbf{D}_{\partial _{\rho }} \phi = - \frac{1}{\rho } \phi \), it follows that the pullback of \((A, \phi )\) to \({{\mathcal {H}}}_{1} = \left\{ \rho = 1\right\} = {{\mathbb {H}}}^{4}\), which we still denote by \((A, \phi )\), solves the system

$$\begin{aligned} \left\{ \begin{aligned}&- \mathrm {div}_{{{\mathbb {H}}}^{4}} F = \mathrm {Im}( \phi \overline{\mathbf{D}_{{{\mathbb {H}}}^{4}} \phi }), \\&(-\triangle _{{{\mathbb {H}}}^{4}, A} - 2) \phi = 0, \end{aligned} \right. \end{aligned}$$
(7.7)

where \(F = \mathrm {d}A, (\mathrm {div}_{{{\mathbb {H}}}^{4}} F)_{\mathbf{a}} = \nabla _{{{\mathbb {H}}}^{4}}^{\mathbf{b}} F_{\mathbf{b}\mathbf{a}}, \mathbf{D}_{{{\mathbb {H}}}^{4}} = \nabla _{{{\mathbb {H}}}^{4}} + i A\) and \(\triangle _{{{\mathbb {H}}}^{4}, A} = \mathbf{D}_{{{\mathbb {H}}}^{4}}^{\mathbf{a}} \mathbf{D}_{{{\mathbb {H}}}^{4}, \mathbf{a}}\). Furthermore, by Proposition 5.1 applied to \({{\mathcal {H}}}_{1} = {{\mathbb {H}}}^{4}\), we have

$$\begin{aligned}&\displaystyle \int _{{{\mathbb {H}}}^{4}} \frac{1}{2} \cosh y \, \vert F\vert _{{{\mathbb {H}}}^{4}}^{2} \, \mathrm {d}\sigma _{{{\mathbb {H}}}^{4}} < \infty ,&\end{aligned}$$
(7.8)
$$\begin{aligned}&\displaystyle \int _{{{\mathbb {H}}}^{4}} \frac{1}{2} \Big [ \cosh y \vert \phi \vert ^{2} + 2 \sinh y \mathrm {Re}[ \phi \overline{\mathbf{D}_{y} \phi } ] + \cosh y \vert \mathbf{D}\phi \vert _{{{\mathbb {H}}}^{4}}^{2} \Big ] \mathrm {d}\sigma _{{{\mathbb {H}}}^{4}} < \infty .&\qquad \end{aligned}$$
(7.9)

where \(\vert F\vert _{{{\mathbb {H}}}^{4}}^{2} = \frac{1}{2} (g_{{{\mathbb {H}}}^{4}}^{-1})^{\mathbf{a}\mathbf{c}} (g_{{{\mathbb {H}}}^{4}}^{-1})^{\mathbf{b}\mathbf{d}} F_{\mathbf{a}\mathbf{b}} F_{\mathbf{c}\mathbf{d}}\) and \(\vert \mathbf{D}\phi \vert _{{{\mathbb {H}}}^{4}}^{2} = (g_{{{\mathbb {H}}}^{4}}^{-1})^{\mathbf{a}\mathbf{b}} \mathbf{D}_{\mathbf{a}} \phi \overline{\mathbf{D}_{\mathbf{b}} \phi }\).

In order to proceed, we reformulate the system on \({{\mathbb {D}}}^{4}\) using the conformal equivalence of \({{\mathbb {D}}}^{4}\) and \({{\mathbb {H}}}^{4}\). Consider the following map from \({{\mathbb {D}}}^{4}\) to \({{\mathbb {H}}}^{4}\):

$$\begin{aligned} \Phi : {{\mathbb {D}}}^{4} \rightarrow {{\mathbb {H}}}^{4}, \quad (r, \Theta ) \mapsto (y, \Theta ) = \left( 2 \tanh ^{-1} r, \Theta \right) \end{aligned}$$

The map \(\Phi \) is a conformal isometry, i.e.,

$$\begin{aligned} \Phi ^{*} g_{{{\mathbb {H}}}^{4}} = \Phi ^{*} \left( \mathrm {d}y^{2} + \sinh ^{2} y \, g_{{{\mathbb {S}}}^{3}}\right) = \Omega ^{2} \left( \mathrm {d}r^{2} + r^{2} \, g_{{{\mathbb {S}}}^{3}}\right) = \Omega ^{2} g_{{{\mathbb {D}}}^{4}}, \end{aligned}$$

where \(\Phi ^{*}\) denotes the pullback along \(\Phi \) to \({{\mathbb {D}}}^{4}\), and \(\Omega := \frac{2}{1-r^{2}}\). For the pulled-back pair \((\Phi ^{*} A, \Omega \, \Phi ^{*} \phi )\) on \({{\mathbb {D}}}^{4}\), which (slightly abusing the notation) we will denote by \((A, u)\), we have

$$\begin{aligned} \left\{ \begin{aligned}&\partial ^{\ell } F_{j \ell } =\mathrm {Im}(u \overline{\mathbf{D}_{j} u}) \\&\mathbf{D}^{\ell } \mathbf{D}_{\ell } u = 0. \end{aligned} \right. \end{aligned}$$
(7.10)

where \(F = \mathrm {d}A\) and \(\mathbf{D}= \nabla + i A\). Moreover, the bounds (7.8) and (7.9) then translate to

$$\begin{aligned}&\int _{{{\mathbb {D}}}^{4}} \frac{1}{2} \frac{1+r^{2}}{1-r^{2}} \vert F\vert _{{{\mathbb {D}}}^{4}}^{2} \, \mathrm {d}\sigma _{{{\mathbb {D}}}^{4}} < \infty , \end{aligned}$$
(7.11)
$$\begin{aligned}&\int _{{{\mathbb {D}}}^{4}} \frac{1}{2} \left[ \frac{1}{1-r^{2}} \vert r \mathbf{D}_{r} u + 2 u\vert ^{2} + \frac{1}{1-r^{2}} \vert \mathbf{D}_{r} u\vert ^{2}\right. \nonumber \\&\left. \quad + \frac{1+r^{2}}{(1-r^{2})r^{2}} \vert \not \!\! \mathbf{D}u\vert ^{2} \right] \mathrm {d}\sigma _{{{\mathbb {D}}}^{4}} < \infty . \end{aligned}$$
(7.12)

where \(\vert \not \!\! \mathbf{D}u\vert ^{2} = (g_{{{\mathbb {S}}}^{3}}^{-1})^{\mathbf{a}\mathbf{b}} \mathbf{D}_{\mathbf{a}} u \overline{\mathbf{D}_{\mathbf{b}} u}\). Indeed, note that

$$\begin{aligned} \Phi ^{*} \mathrm {d}\sigma _{{{\mathbb {H}}}^{4}} = \Omega ^{4} \mathrm {d}\sigma _{{{\mathbb {D}}}^{4}}, \quad \Phi ^{*} (\cosh y) = \frac{1 + r^{2}}{1 - r^{2}}, \quad \Phi ^{*} (\sinh y) = \frac{2 r}{1 - r^{2}}. \end{aligned}$$

From these identities and (7.8), we immediately see that (7.11) holds. Moreover, (7.12) follows from (7.9) and the following computation:

$$\begin{aligned}&\int _{{{\mathbb {H}}}^{4}} \frac{1}{2} \Big [ \cosh y \vert \phi \vert ^{2} + 2 \sinh y \mathrm {Re}[ \phi \overline{\mathbf{D}_{y} \phi } ] + \cosh y \vert \mathbf{D}\phi \vert _{{{\mathbb {H}}}^{4}}^{2} \Big ] \mathrm {d}\sigma _{{{\mathbb {H}}}^{4}} \\&\quad = \int _{{{\mathbb {D}}}^{4}} \frac{1}{2} \Big [ \frac{1+r^{2}}{1-r^{2}} \Omega ^{2} \vert u\vert ^{2} + \frac{4 r}{1-r^{2}} \mathrm {Re}[\Omega u \overline{\Omega \mathbf{D}_{r} (\Omega ^{-1} u)}]\\&\qquad + \frac{1+r^{2}}{1-r^{2}} \Big ( \vert \Omega \mathbf{D}_{r} (\Omega ^{-1} u)\vert ^{2} + \frac{1}{r^{2}} \vert \not \!\! \mathbf{D}u\vert ^{2} \Big ) \Big ] \mathrm {d}\sigma _{{{\mathbb {D}}}^{4}} \\&\quad = \int _{{{\mathbb {D}}}^{4}} \frac{1}{2} \Big [ \frac{4}{1-r^{2}} \vert u\vert ^{2} + \frac{4r}{1-r^{2}} \mathrm {Re}[u \overline{\mathbf{D}_{r} u}] + \frac{r^{2}+1}{1-r^{2}} \vert \mathbf{D}_{r} u\vert ^{2}\\&\qquad + \frac{1+r^{2}}{(1-r^{2})r^{2}} \vert \not \!\! \mathbf{D}u\vert ^{2} \Big ] \mathrm {d}\sigma _{{{\mathbb {D}}}^{4}} \\&\quad = \int _{{{\mathbb {D}}}^{4}} \frac{1}{2} \Big [ \frac{1}{1-r^{2}} \vert r \mathbf{D}_{r} u + 2 u\vert ^{2}+ \frac{1}{1-r^{2}} \vert \mathbf{D}_{r} u\vert ^{2}\\&\qquad + \frac{1+r^{2}}{(1-r^{2})r^{2}} \vert \not \!\! \mathbf{D}u\vert ^{2} \Big ] \mathrm {d}\sigma _{{{\mathbb {D}}}^{4}}. \end{aligned}$$
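The algebra behind the last two equalities can be unpacked as follows. This is a sketch of the bookkeeping only, with the pullback \(\Phi ^{*}\) suppressed and the relations \(\phi = \Omega ^{-1} u\), \(\mathbf{D}_{y} = \Omega ^{-1} \mathbf{D}_{r}\) used as working conventions. Since \(\partial _{r} \Omega = \frac{4r}{(1-r^{2})^{2}} = r \Omega ^{2}\), we have

$$\begin{aligned} \Omega \mathbf{D}_{r} (\Omega ^{-1} u) = \mathbf{D}_{r} u - r \Omega u = \mathbf{D}_{r} u - \frac{2r}{1-r^{2}} u. \end{aligned}$$

Substituting this into the first expression on \({{\mathbb {D}}}^{4}\) and collecting the coefficients of \(\vert u\vert ^{2}\) and \(\mathrm {Re}[u \overline{\mathbf{D}_{r} u}]\), respectively, one finds

$$\begin{aligned} \frac{4(1+r^{2})}{(1-r^{2})^{3}} - \frac{16 r^{2}}{(1-r^{2})^{3}} + \frac{4 r^{2} (1+r^{2})}{(1-r^{2})^{3}}&= \frac{4 (1-r^{2})^{2}}{(1-r^{2})^{3}} = \frac{4}{1-r^{2}}, \\ \frac{8r}{(1-r^{2})^{2}} - \frac{4r(1+r^{2})}{(1-r^{2})^{2}}&= \frac{4r}{1-r^{2}}, \end{aligned}$$

while the coefficient of \(\vert \mathbf{D}_{r} u\vert ^{2}\) is unchanged. The final equality is the completion of the square \(\vert r \mathbf{D}_{r} u + 2 u\vert ^{2} = r^{2} \vert \mathbf{D}_{r} u\vert ^{2} + 4 r \, \mathrm {Re}[u \overline{\mathbf{D}_{r} u}] + 4 \vert u\vert ^{2}\).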

We will now show that (7.10), (7.11) and (7.12) imply \(u = 0\) on \({{\mathbb {D}}}^{4}\). Since the system (7.10) coincides with (7.3) restricted to \({{\mathbb {D}}}^{4}\), the divergence identity (7.5) can be used in the present context as well. Integrating (7.5) by parts on a ball \(B_{R} \subseteq {{\mathbb {D}}}^{4}\) of radius \(R < 1\) centered at 0, we see that

$$\begin{aligned} - 2 \int _{B_{R}} \vert \mathbf{D}u\vert ^{2} \, \mathrm {d}\sigma _{{{\mathbb {D}}}^{4}} = \int _{\partial B_{R}} {}^{(S)} J[A, u]_{\mathbf{a}} \mathbf{n}^{\mathbf{a}}, \quad \hbox {where } \mathbf{n}= \frac{x^{\ell }}{\vert x\vert } \partial _{\ell }. \end{aligned}$$
(7.13)

Observe that (7.11) and (7.12) imply the existence of a sequence \(R_{n} \rightarrow 1\) such that

$$\begin{aligned} \int _{\partial B_{R_{n}}} \vert {}^{(S)} J[A, u]_{\mathbf{a}} \mathbf{n}^{\mathbf{a}} \vert \rightarrow 0, \end{aligned}$$

which shows that \(\mathbf{D}u = 0\) on \({{\mathbb {D}}}^{4}\). Plugging this information into (7.12), it follows that \(u = 0\) on \({{\mathbb {D}}}^{4}\), as desired.

To complete the proof, it only remains to show that \(F = 0\). As before, F is now a harmonic 2-form in \(L^{2}({{\mathbb {D}}}^{4})\) by (7.7); hence each component \(F_{j k}\) is a harmonic function on \({{\mathbb {D}}}^{4}\). Fix \(j, k \in \left\{ 1,2,3,4\right\} \) and observe that \(\varphi := F_{j k}\), viewed as a real-valued function, obeys the following monotonicity property:

$$\begin{aligned} \frac{1}{r_{1}^{3}} \int _{\partial B_{r_{1}}} \vert \varphi \vert ^{2} \le \frac{1}{r_{2}^{3}} \int _{\partial B_{r_{2}}} \vert \varphi \vert ^{2} \quad \hbox { where } 0 < r_{1} < r_{2} < 1. \end{aligned}$$
(7.14)

Indeed, (7.14) is a consequence of interpolating the inequalities

$$\begin{aligned} \frac{1}{r_{1}^{3}} \int _{\partial B_{r_{1}}} \vert \varphi \vert \le \frac{1}{r_{2}^{3}} \int _{\partial B_{r_{2}}} \vert \varphi \vert , \quad \sup _{\partial B_{r_{1}}} \vert \varphi \vert \le \sup _{\partial B_{r_{2}}} \vert \varphi \vert \quad \hbox { where } 0 < r_{1} < r_{2} < 1, \end{aligned}$$

which follow from the mean-value property and the weak maximum principle for the subharmonic function \(\vert \varphi \vert \) on \({{\mathbb {D}}}^{4}\), respectively. By (7.11), it follows that \(F_{jk} = \varphi = 0\) on \({{\mathbb {D}}}^{4}\). \(\square \)
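The last step of the proof above can be spelled out as follows; this is a sketch in which the function m is notation introduced here, and in which we assume \(\vert \varphi \vert ^{2} = \vert F_{jk}\vert ^{2} \le \vert F\vert _{{{\mathbb {D}}}^{4}}^{2}\). Let \(m(r) := r^{-3} \int _{\partial B_{r}} \vert \varphi \vert ^{2}\), which is nondecreasing on (0, 1) by (7.14). By (7.11) and \(\frac{1+r^{2}}{1-r^{2}} \ge \frac{1}{1-r^{2}}\), we have

$$\begin{aligned} \int _{0}^{1} \frac{r^{3} m(r)}{1-r^{2}} \, \mathrm {d}r = \int _{{{\mathbb {D}}}^{4}} \frac{1}{1-r^{2}} \vert \varphi \vert ^{2} \, \mathrm {d}\sigma _{{{\mathbb {D}}}^{4}} < \infty . \end{aligned}$$

If \(m(r_{1}) > 0\) for some \(r_{1} \in (0, 1)\), then monotonicity would give

$$\begin{aligned} \int _{r_{1}}^{1} \frac{r^{3} m(r)}{1-r^{2}} \, \mathrm {d}r \ge r_{1}^{3} m(r_{1}) \int _{r_{1}}^{1} \frac{\mathrm {d}r}{1-r^{2}} = \infty , \end{aligned}$$

which is a contradiction. Hence \(m \equiv 0\), i.e., \(\varphi = 0\) on \({{\mathbb {D}}}^{4}\).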

7.3 Regularity of stationary and self-similar weak solutions to (MKG)

We end this section with a regularity result, which applies to weak solutions obtained by Proposition 6.1.

Proposition 7.3

Let \((A, \phi )\) be a weak solution to (MKG) on an open set \({{\mathcal {O}}}\subseteq {{\mathbb {R}}}^{1+4}\) such that

$$\begin{aligned} A_{\mu } \in H^{1}_{t,x}({{\mathcal {O}}}), \quad \phi \in H^{\frac{3}{2}}_{t,x}({{\mathcal {O}}}). \end{aligned}$$
(7.15)

Suppose furthermore that one of the following holds:

  1. (1)

    Either \((A, \phi )\) is stationary on \({{\mathcal {O}}}\) in the sense of (7.1); or

  2. (2)

    The set \({{\mathcal {O}}}\) is a subset of the cone \(C_{(0, \infty )} = \left\{ 0 \le r < t\right\} \) and \((A, \phi )\) is self-similar on \({{\mathcal {O}}}\) in the sense of (7.2).

Then for every \(p \in {{\mathcal {O}}}\), there exists an open neighborhood \(p \in Q_{p} \subseteq {{\mathcal {O}}}\) and a gauge transformation \(\chi _{[p]} \in {{\mathcal {Y}}}^{w}(Q_{p})\) such that \((A_{[p]}, \phi _{[p]}) = (A - \mathrm {d}\chi _{[p]}, \phi e^{i \chi _{[p]}})\) is smooth on \(Q_{p}\).

Proof

The idea is to derive an elliptic system as in (7.3) [resp. (7.7)] using stationarity [resp. self-similarity], and then use its regularity theory. To get rid of the non-local operator \(\langle D_{t,x}\rangle ^{\frac{3}{2}}\) in the norm, we begin with the following simple maneuver: For any open bounded subset \(Q \subseteq {{\mathcal {O}}}\) with smooth boundary, by Sobolev and (7.15), we have

$$\begin{aligned} A_{\mu } \in H^{1}_{t,x}(Q), \quad \phi \in W^{1, q}_{t,x}(Q) \end{aligned}$$
(7.16)

where \(q = \frac{5}{2}\). The important point is that \(q > 2\), which makes this bound subcritical. Hence we will be able to conclude regularity via a simple elliptic bootstrap argument.
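As a quick check of the exponent in (7.16), assuming the standard Sobolev embedding \(H^{s} \subseteq W^{1, q}\) on a bounded smooth domain of dimension \(d\) (here \(d = 5\), counting the time variable, and \(s = \frac{3}{2}\); the letters \(s\) and \(d\) are our notation):

$$\begin{aligned} \frac{1}{q} = \frac{1}{2} - \frac{s - 1}{d} = \frac{1}{2} - \frac{1}{10} = \frac{2}{5}, \quad \hbox { i.e., } \quad q = \frac{5}{2} > 2. \end{aligned}$$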

We first treat Case 1. Applying a suitable Lorentz transformation, it suffices to consider the case \(Y = \partial _{t}\) in the rectilinear coordinates \((t = x^{0}, x^{1}, \ldots , x^{4})\). Moreover, applying an appropriate space-time translation, we may assume that p is the origin. Let \(Q_{p} := (-\delta , \delta ) \times \delta B\), where \(\delta B\) is the open ball of radius \(\delta \) centered at the origin. Choosing \(\delta > 0\) small enough, we have \(Q_{p} \subseteq {{\mathcal {O}}}\). By (7.16) and Fubini, there exists \(\overline{t} \in (-\delta , \delta )\) such that

$$\begin{aligned} A \!\upharpoonright _{\overline{t} \times \delta B} \in H^{1}(\delta B), \quad \phi \!\upharpoonright _{\overline{t} \times \delta B} \in W^{1, q}(\delta B), \end{aligned}$$
(7.17)

where the shorthand \(\overline{t} = \left\{ \overline{t}\right\} \) is used for simplicity. We claim that there exists \(\chi _{[p]} \in {{\mathcal {Y}}}^{w}((-\delta , \delta ) \times \delta B)\) so that \(\chi _{[p]} \!\upharpoonright _{\overline{t} \times \delta B} \in H^{2}(\delta B)\) and

$$\begin{aligned} \partial _{t} \chi _{[p]} = A_{0} \hbox { in } (-\delta , \delta ) \times \delta B, \quad \triangle \chi _{[p]} \!\upharpoonright _{\overline{t} \times \delta B}= \partial ^{\ell } (A \!\upharpoonright _{\overline{t} \times \delta B})_{\ell }. \end{aligned}$$
(7.18)

Indeed, we may simply define \(\underline{\chi }_{[p]} = \triangle ^{-1} \partial ^{\ell } (\eta A \!\upharpoonright _{\left\{ t = \overline{t} \right\} })_{\ell }\), where \(\eta \in C^{\infty }_{0}({{\mathbb {R}}}^{4})\) satisfies \(\eta = 1\) on \(\delta B\) and \({\mathrm {supp}}\, \eta \subseteq {{\mathcal {O}}}\), then solve the transport equation \(\partial _{t} \chi _{[p]} = A_{0} \hbox { in } (-\delta , \delta ) \times \delta B\) with initial data \(\chi _{[p]} \!\upharpoonright _{\overline{t} \times \delta B} = \underline{\chi }_{[p]}\). That this \(\chi _{[p]}\) belongs to \({{\mathcal {Y}}}^{w}((-\delta , \delta ) \times \delta B)\) and \(\chi _{[p]} \!\upharpoonright _{\overline{t} \times \delta B} \in H^{2}(\delta B)\) easily follow from the bounds for A in (7.16) and (7.17).

Consider now the gauge transform \((A_{[p]}, \phi _{[p]}) = (A - \mathrm {d}\chi _{[p]}, \phi e^{i \chi _{[p]}})\). By (7.18), we have

$$\begin{aligned} A_{[p] 0} = 0 \hbox { in } (-\delta , \delta ) \times \delta B, \quad \partial ^{\ell } \left( A_{[p]} \!\upharpoonright _{\overline{t} \times \delta B}\right) _{\ell } = 0 \hbox { in } \delta B. \end{aligned}$$
(7.19)

By the stationarity assumption \(\iota _{\partial _{t}} F = 0\) and \(\mathbf{D}_{\partial _{t}} \phi = 0\), it follows that

$$\begin{aligned} \partial _{t} A_{[p] j} = F_{0j} = 0, \quad \partial _{t} \phi _{[p]} = 0 \hbox { in } (-\delta , \delta ) \times \delta B. \end{aligned}$$

Hence to prove that \((A_{[p]}, \phi _{[p]})\) is smooth in \(Q_{p}\), it suffices to show that \((A_{[p]}, \phi _{[p]}) \!\upharpoonright _{\overline{t} \times \delta B}\) is smooth. Abusing the notation slightly for simplicity, we will henceforth write \(A = A_{[p]} \!\upharpoonright _{\overline{t} \times \delta B}\) and \(\phi = \phi _{[p]} \!\upharpoonright _{\overline{t} \times \delta B}\). By (7.3) and (7.19) (in particular, the Coulomb condition for A), \((A, \phi )\) satisfies an elliptic system on \(\delta B\) of the schematic form

$$\begin{aligned} \triangle A = \phi \partial \phi + \phi A \phi , \\ \triangle \phi = A \partial \phi + A A \phi . \end{aligned}$$

Moreover, \((A, \phi )\) satisfies \(A \in H^{1}(\delta B)\) and \(\phi \in W^{1, q}(\delta B)\), thanks to (7.17) and \(\chi _{[p]} \!\upharpoonright _{\overline{t} \times \delta B}\in H^{2}(\delta B)\). As this system is \(H^{1}\)-critical and every nonlinear term has at least one factor of \(\phi \), which obeys the subcritical bound \(\phi \in W^{1, q}(\delta B)\), we can perform a standard elliptic bootstrap argument to conclude that \((A, \phi )\) is smooth on \(\delta B\) with uniform bounds on compact subsets. This concludes the proof in Case 1.

The proof in Case 2 is entirely analogous to Case 1, so we only give a brief outline. Here, instead of the rectilinear coordinates, we use the hyperbolic coordinates \((\rho , y, \Theta )\), in which \(X = \partial _{\rho }\). Applying a suitable Lorentz transformation and scaling transformation, we may assume that p coincides with the point \(\rho = 1, y = 0\). Let \(Q_{p} = (-\delta , \delta ) \times D_{\delta }\), where \(D_{\delta } := \left\{ (y, \Theta ) : \vert y\vert < \delta \right\} \), which is contained in \({{\mathcal {O}}}\) if \(\delta > 0\) is sufficiently small. By (7.16) and Fubini, there exists \(\overline{\rho } \in (-\delta , \delta )\) such that

$$\begin{aligned} A \!\upharpoonright _{\overline{\rho } \times D_{\delta }} \in H^{1}(D_{\delta }), \quad \phi \!\upharpoonright _{\overline{\rho } \times D_{\delta }} \in W^{1, q}(D_{\delta }). \end{aligned}$$
(7.20)

Proceeding as before, we can find \(\chi _{[p]} \in {{\mathcal {Y}}}^{w}((-\delta , \delta ) \times D_{\delta })\) so that \(\chi _{[p]} \!\upharpoonright _{\overline{\rho } \times D_{\delta }} \in H^{2}(D_{\delta })\) and

$$\begin{aligned} \partial _{\rho } \chi _{[p]} = A(\partial _{\rho }) \quad \hbox { in } (-\delta , \delta ) \times D_{\delta }, \quad \triangle _{{{\mathcal {H}}}_{\overline{\rho }}} \, \chi _{[p]} \!\upharpoonright _{\overline{\rho } \times D_{\delta }} = \nabla _{{{\mathcal {H}}}_{\overline{\rho }}}^{\mathbf{a}} \left( A \!\upharpoonright _{\overline{\rho } \times D_{\delta }}\right) _{\mathbf{a}} \, . \end{aligned}$$

Then the gauge transform \((A_{[p]}, \phi _{[p]}) = (A - \mathrm {d}\chi _{[p]}, \phi e^{i \chi _{[p]}})\) obeys

$$\begin{aligned} A_{[p]}(\partial _{\rho }) = 0 \hbox { in } (-\delta , \delta ) \times D_{\delta }, \quad \nabla _{{{\mathcal {H}}}_{\overline{\rho }}}^{\mathbf{a}} \left( A_{[p]} \!\upharpoonright _{\overline{\rho } \times D_{\delta }}\right) _{\mathbf{a}} = 0 \hbox { in } D_{\delta }. \end{aligned}$$

By self-similarity, we have \({{\mathcal {L}}}_{\partial _{\rho }} A_{[p]} = 0\) and \(\partial _{\rho } (\rho \phi _{[p]}) = 0\), so it only remains to prove that the pullback of \((A_{[p]}, \phi _{[p]})\) on \(\overline{\rho } \times D_{\delta }\), which we will refer to as \((A, \phi )\), is smooth. As in the previous case, this is a consequence of the fact that \((A, \phi )\) obeys an elliptic system (thanks to (7.7) and the Coulomb gauge condition on \({{\mathcal {H}}}_{\overline{\rho }}\)), the bounds \(A \in H^{1}(D_{\delta })\) and \(\phi \in W^{1, q}(D_{\delta })\) with \(q > 2\) (by (7.20) and \(\chi _{[p]} \!\upharpoonright _{\overline{\rho } \times D_{\delta }} \in H^{2}(D_{\delta })\)), and a standard elliptic bootstrap argument. \(\square \)

8 Proof of global well-posedness and scattering

Here we carry out the proof of Theorem 1.3 using the tools developed in the earlier parts.

8.1 Finite time blow-up/non-scattering scenarios and initial reduction

Our overall strategy for proving Theorem 1.3 is by contradiction. Suppose that Theorem 1.3 fails for an initial data set \((a, e, f, g) \in {{\mathcal {H}}}^{1}\) in the global Coulomb gauge. By time reversal symmetry, it suffices to consider the forward evolution. Let \((A, \phi )\) be the admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to the Cauchy problem in the global Coulomb gauge defined on the maximal forward time interval \(I = [0, T_{+})\) for some \(T_{+} > 0\) constructed by Theorem 4.3. By Theorem 4.8, the solution \((A, \phi )\) exhibits one of the following behaviors:

  1. (1)

    Finite time blow-up  We have \(T_{+} < \infty \) and

    $$\begin{aligned} \Vert A_{0}\Vert _{Y^{1}[0, T_{+})} + \Vert A_{x}\Vert _{S^{1}[0, T_{+})} + \Vert \phi \Vert _{S^{1}[0, T_{+})} = \infty . \end{aligned}$$
    (8.1)
  2. (2)

    Non-scattering  We have \(T_{+} = \infty \), but

    $$\begin{aligned} \Vert A_{0}\Vert _{Y^{1}[0, \infty )} + \Vert A_{x}\Vert _{S^{1}[0, \infty )} + \Vert \phi \Vert _{S^{1}[0, \infty )} = \infty . \end{aligned}$$
    (8.2)

In the case of finite time blow-up, we may use the energy concentration scale \(r_{c}\) in Theorem 4.3 to show that the energy must concentrate at a point.

Lemma 8.1

Let \((A, \phi )\) be an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to  (MKG) on \([0, T_{+}) \times {{\mathbb {R}}}^{4}\) with \(T_{+} < \infty \) in the global Coulomb gauge. Then either \((A, \phi )\) can be continued past \(T_{+}\) as an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution in the global Coulomb gauge (as in Theorem 4.3), or there exists a point \(x_{0} \in {{\mathbb {R}}}^{4}\) such that

$$\begin{aligned} \limsup _{t \rightarrow T_{+}} {{\mathcal {E}}}_{\left\{ t\right\} \times B_{(T_{+} - t)}(x_{0})} [A, \phi ] > 0. \end{aligned}$$
(8.3)

Proof

For \( t < T_+\) and \(x \in {{\mathbb {R}}}^4\) we define the function

$$\begin{aligned} E(t,x) = {{\mathcal {E}}}_{\left\{ t\right\} \times B_{(T_{+} - t)}(x)} [A, \phi ] . \end{aligned}$$

This is continuous in x, and, by the non-negativity of the flux in the energy relation (5.3), it is non-increasing in t. Further, by the same relation, we have

$$\begin{aligned} \lim _{\vert x\vert \rightarrow \infty } E(t,x) = 0, \qquad \text {uniformly in } t \in [0,T_+). \end{aligned}$$
(8.4)

Then we have two alternatives:

  1. (i)

    Either \(\lim _{t \rightarrow T_+} \sup _{x\in {{\mathbb {R}}}^4} E(t,x) < \delta _{0}(E, \epsilon _{*}^{2})\), which implies that there exists \(t_0\) such that the energy concentration scale \(r_{c}\) at \(t = t_{0}\) as in (4.4) is greater than \(T_{+} - t_{0}\). By Theorem 4.3 we can then extend \((A, \phi )\) past \(T_{+}\), as claimed.

  2. (ii)

    Or, \(\lim _{t \rightarrow T_+} \sup _{x\in {{\mathbb {R}}}^4} E(t,x) \ge \delta _{0}(E, \epsilon _{*}^{2})\). Then the sets \(D_t = \{ x \in {{\mathbb {R}}}^4: E(t,x) \ge \frac{1}{2} \delta _{0}(E, \epsilon _{*}^{2})\}\) are nonempty and decreasing in t. Moreover, they are compact by (8.4). Thus they must intersect. Any \(x_0\) in the intersection will provide the second alternative in the lemma.\(\square \)

Theorem 4.7 provides additional information about the nature of the singularity in both scenarios, which is crucial to our proof of Theorem 1.3. To utilize this information, we introduce a smooth function \(\zeta \) satisfying the following properties:

  • \(\displaystyle {{\mathrm {supp}}\, \zeta \subseteq B_{1}(0)}\) and \(\int \zeta = 1\).

  • There exists a function \(\widetilde{\zeta } \in C^{\infty }_{0}({{\mathbb {R}}}^{4})\) with \(\widetilde{\zeta } \ge 0\) such that \(\zeta = \widetilde{\zeta } *\widetilde{\zeta }\).

Then we define the physical space version of energy dispersion as follows:

$$\begin{aligned} \mathbf{E}\mathbf{D}[A, \phi ](I) := \sup _{k \in {{\mathbb {Z}}}} \Big ( 2^{-k} \Vert \zeta _{2^{-k}} *\phi (t,x)\Vert _{L^{\infty }_{t,x}(I \times {{\mathbb {R}}}^{4})} + 2^{-2k} \Vert \zeta _{2^{-k}} *\mathbf{D}_{t} \phi (t,x)\Vert _{L^{\infty }_{t,x}(I \times {{\mathbb {R}}}^{4})} \Big )\nonumber \\ \end{aligned}$$
(8.5)

where \(\zeta _{2^{-k}} := 2^{4k} \zeta (2^{k} \cdot )\), and the convolution \(*\) is defined with respect to only the spatial variables \((x^{1}, \ldots , x^{4})\). The first property makes \(\mathbf{E}\mathbf{D}[A, \phi ]\) simpler to use in physical space arguments; on the other hand, the second property is helpful in connection with the diamagnetic inequality, which we state here.

Lemma 8.2

(Diamagnetic inequality)  Let \(O \subseteq {{\mathbb {R}}}^{4}\) be an open set and \(\phi , A \in H^{1}(O)\). Then for any smooth vector field \(X\), we have \(\vert \partial _{X} \vert \phi \vert \vert \le \vert \mathbf{D}_{X} \phi \vert \) in the sense of distributions. More precisely, for any smooth \(\eta \ge 0\) with \({\mathrm {supp}}\, \eta \, \subseteq O\), we have

$$\begin{aligned} \int \eta \vert \partial _{X} \vert \phi \vert \vert \, \mathrm {d}x \le \int \eta \vert \mathbf{D}_{X} \phi \vert \, \mathrm {d}x. \end{aligned}$$

The key to the proof is the formal computation \(\vert \partial _{X} \vert \phi \vert \vert = \vert \vert \phi \vert ^{-1} \langle \phi , \mathbf{D}_{X} \phi \rangle \vert \le \vert \mathbf{D}_{X} \phi \vert \); we omit the standard details. We fix the choice of functions \(\zeta , \widetilde{\zeta }\) here, and henceforth we will suppress the dependence of constants on these functions for simplicity.
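For orientation, the formal computation behind Lemma 8.2 can be spelled out as follows. This is only a sketch for smooth \(\phi \) at points where \(\phi \ne 0\), writing \(A_{X} := A(X)\) (our shorthand) and using nothing beyond \(\mathbf{D}_{X} = \partial _{X} + i A_{X}\) with \(A\) real-valued:

$$\begin{aligned} \partial _{X} \vert \phi \vert ^{2} = 2 \, \mathrm {Re} \left( \overline{\phi } \, \partial _{X} \phi \right) = 2 \, \mathrm {Re} \left( \overline{\phi } \, \mathbf{D}_{X} \phi \right) - 2 \, \mathrm {Re} \left( i A_{X} \vert \phi \vert ^{2} \right) = 2 \, \mathrm {Re} \left( \overline{\phi } \, \mathbf{D}_{X} \phi \right) . \end{aligned}$$

Since \(\partial _{X} \vert \phi \vert ^{2} = 2 \vert \phi \vert \, \partial _{X} \vert \phi \vert \), dividing by \(2 \vert \phi \vert \) and using \(\vert \mathrm {Re} ( \overline{\phi } \, \mathbf{D}_{X} \phi ) \vert \le \vert \phi \vert \vert \mathbf{D}_{X} \phi \vert \) gives the pointwise bound.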

The physical space version \(\mathbf{E}\mathbf{D}[A, \phi ]\) is related to the earlier Littlewood-Paley version \(ED[\phi ]\) defined in (4.9) as follows.

Lemma 8.3

Let \((A, \phi )\) be an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to (MKG) on \(I \times {{\mathbb {R}}}^{4}\) in the global Coulomb gauge with \({{\mathcal {E}}}_{\left\{ t\right\} \times {{\mathbb {R}}}^{4}}[A, \phi ] \le E\). Then there exists \(C = C(E)\) such that

$$\begin{aligned} ED[\phi ](I) \le C \, \mathbf{E}\mathbf{D}[A, \phi ](I) + \frac{1}{100} \epsilon (E), \end{aligned}$$

where \(\epsilon (E)\) is as in Theorem 4.7.

Proof

All norms in this proof will be taken over \(I \times {{\mathbb {R}}}^{4}\). The following estimates are straightforward to establish:

$$\begin{aligned}&\sup _{k} 2^{-k} \Vert P_{k} \phi \Vert _{L^{\infty }_{t,x}} \lesssim \sup _{k} 2^{-k} \Vert \zeta _{2^{-k}} *\phi \Vert _{L^{\infty }_{t,x}}, \end{aligned}$$
(8.6)
$$\begin{aligned}&\sup _{k} 2^{-2k} \Vert P_{k} (\mathbf{D}_{t} \phi )\Vert _{L^{\infty }_{t,x}} \lesssim \sup _{k} 2^{-2k} \Vert \zeta _{2^{-k}} *(\mathbf{D}_{t} \phi )\Vert _{L^{\infty }_{t,x}}. \end{aligned}$$
(8.7)

As these two estimates are proved in the same manner, we only consider (8.6). Fix \(k \in {{\mathbb {Z}}}\), and let \(m_{0} > 0\) be an absolute constant to be chosen. We have

$$\begin{aligned} 2^{-k} \Vert P_{k} \phi \Vert _{L^{\infty }_{x}}\le & {} 2^{-k} \Vert \zeta _{2^{-k-m_{0}}} *P_{k} \phi \Vert _{L^{\infty }_{x}} + 2^{-k} \Vert (1 - \zeta _{2^{-k-m_{0}}} *) P_{k} \phi \Vert _{L^{\infty }_{x}} \\\lesssim & {} 2^{m_{0}} 2^{-k-m_{0}} \Vert \zeta _{2^{-k-m_{0}}} *\phi \Vert _{L^{\infty }_{x}} + 2^{-m_{0}} 2^{-2k} \sup _{j=1, \ldots , 4}\Vert \partial _{j} P_{k} \phi \Vert _{L^{\infty }_{x}} \end{aligned}$$

The last term can be bounded by \(\lesssim 2^{-m_{0}} 2^{-k} \Vert P_{k} \phi \Vert _{L^{\infty }_{x}}\), which can be absorbed into the left-hand side by taking \(m_{0}\) sufficiently large. Hence (8.6) follows.
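The bound used above for the term involving \(1 - \zeta _{2^{-k-m_{0}}} *\) relies only on \(\int \zeta = 1\), on \(\zeta \ge 0\) (which holds since \(\zeta = \widetilde{\zeta } *\widetilde{\zeta }\) with \(\widetilde{\zeta } \ge 0\)), and on \({\mathrm {supp}}\, \zeta _{r} \subseteq B_{r}(0)\). A sketch, for a generic bounded smooth function \(f\) and radius \(r > 0\) (our notation; in the proof \(f = P_{k} \phi \) and \(r = 2^{-k-m_{0}}\)):

$$\begin{aligned} \vert f(x) - \zeta _{r} *f(x)\vert = \Big \vert \int \zeta _{r}(y) \big ( f(x) - f(x-y) \big ) \, \mathrm {d}y \Big \vert \le \int \zeta _{r}(y) \vert y\vert \, \mathrm {d}y \, \Vert \nabla f\Vert _{L^{\infty }_{x}} \le r \Vert \nabla f\Vert _{L^{\infty }_{x}}. \end{aligned}$$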

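In view of (8.6) and (8.7), the lemma would follow once we prove that, for any \(m_{1} > 10\),

$$\begin{aligned} \sup _{k} 2^{-2k} \Vert P_{k} \partial _{t} \phi \Vert _{L^{\infty }_{t,x}} \lesssim \sup _{k} 2^{-2k} \Vert P_{k} (\mathbf{D}_{t} \phi )\Vert _{L^{\infty }_{t,x}} + 2^{m_{1}} E^{\frac{1}{2}} \sup _{k} 2^{-k} \Vert P_{k} \phi \Vert _{L^{\infty }_{t,x}} + 2^{-m_{1}} \left( E+E^{\frac{3}{2}}\right) . \end{aligned}$$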

By the relation \(\partial _{t} = \mathbf{D}_{t} - iA_{0}\), it suffices to show that

$$\begin{aligned} \sup _{k} 2^{-2k} \Vert P_{k} (A_{0} \phi )\Vert _{L^{\infty }_{t,x}} \lesssim 2^{m_{1}} E^{\frac{1}{2}} \sup _{k} 2^{-k} \Vert P_{k} \phi \Vert _{L^{\infty }_{t,x}} + 2^{-m_{1}} \left( E+E^{\frac{3}{2}}\right) .\nonumber \\ \end{aligned}$$
(8.8)

Thanks to the global Coulomb condition, we have

$$\begin{aligned} \Vert A_{0}\Vert _{L^{\infty }_{t} \dot{H}^{1}_{x}} \lesssim E^{1/2}, \quad \Vert \phi \Vert _{L^{\infty }_{t} \dot{H}^{1}_{x}} \lesssim E^{1/2} + E. \end{aligned}$$

For each \(k \in {{\mathbb {Z}}}\), we split \(\phi = P_{\le k + m_{1}} \phi + P_{> k + m_{1}} \phi \). For the former, we have

$$\begin{aligned} 2^{-2k} \Vert P_{k} (A_{0} P_{\le k+m_{1}}\phi )\Vert _{L^{\infty }_{t,x}}&\lesssim \sum _{\ell \le k + m_{1}} 2^{\ell - k} \Vert A_{0}\Vert _{L^{\infty }_{t} L^{4}_{x}} 2^{-\ell } \Vert P_{\ell }\phi \Vert _{L^{\infty }_{t,x}} \\&\lesssim 2^{m_{1}} E^{\frac{1}{2}} \sup _{\ell } 2^{-\ell } \Vert P_{\ell }\phi \Vert _{L^{\infty }_{t,x}}. \end{aligned}$$

For the latter, by the properties of frequency supports, note that

$$\begin{aligned} P_{k} (A_{0} P_{>k + m_{1}} \phi ) = \sum _{\ell > k+m_{1}} P_{k} ( P_{[\ell -3, \ell +3]} A_{0} P_{\ell } \phi ). \end{aligned}$$

Hence (8.8) follows from the estimate

$$\begin{aligned} 2^{-2k} \Vert P_{k} (A_{0} P_{>k + m_{1}}\phi )\Vert _{L^{\infty }_{t,x}}\lesssim & {} \sum _{\ell > k + m_{1}} 2^{2k} \Vert P_{[\ell -3, \ell +3]}A_{0}\Vert _{L^{\infty }_{t} L^{2}_{x}} \Vert P_{\ell }\phi \Vert _{L^{\infty }_{t} L^{2}_{x}} \\\lesssim & {} 2^{-2m_{1}} (E + E^{3/2}). \end{aligned}$$

\(\square \)
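For the reader's convenience, the last summation in the proof above can be made explicit. This sketch assumes that \(m_{1}\) is an integer and uses the bounds \(\Vert P_{\ell } A_{0}\Vert _{L^{\infty }_{t} L^{2}_{x}} \lesssim 2^{-\ell } E^{\frac{1}{2}}\) and \(\Vert P_{\ell } \phi \Vert _{L^{\infty }_{t} L^{2}_{x}} \lesssim 2^{-\ell } (E^{\frac{1}{2}} + E)\), which follow from the \(\dot{H}^{1}_{x}\) bounds stated there:

$$\begin{aligned} \sum _{\ell > k + m_{1}} 2^{2k} \cdot 2^{-\ell } E^{\frac{1}{2}} \cdot 2^{-\ell } \left( E^{\frac{1}{2}} + E\right) = \left( E + E^{\frac{3}{2}}\right) 2^{-2 m_{1}} \sum _{j \ge 1} 4^{-j} = \frac{1}{3} \, 2^{-2 m_{1}} \left( E + E^{\frac{3}{2}}\right) . \end{aligned}$$

Since \(2^{-2m_{1}} \le 2^{-m_{1}}\), this is consistent with the last term in (8.8).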

As a result, there exists a function \(\mathbf{e}= \mathbf{e}(E) > 0\) such that Theorem 4.7 holds with the condition (4.11) replaced by

$$\begin{aligned} \mathbf{E}\mathbf{D}[A, \phi ](I) \le \mathbf{e}(E). \end{aligned}$$
(4.11$^\prime $)

Let \(\varepsilon > 0\) be a small parameter to be chosen below. We have the following result, which unifies the proof of Theorem 1.3 in both finite time blow-up and non-scattering scenarios from here on.

Lemma 8.4

Suppose that Theorem 1.3 fails for some initial data \((a, e, f, g)\) of energy E. Then for every \(\varepsilon > 0\) there exist a sequence \(\varepsilon _{n} \rightarrow 0\) and a sequence of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions \((A^{(n)}, \phi ^{(n)})\) on \([\varepsilon _{n}, 1] \times {{\mathbb {R}}}^{4}\) in the global Coulomb gauge that satisfy the following properties:

  1. (1)

    Bounded energy in the cone

    $$\begin{aligned} {{\mathcal {E}}}_{S_{t}} \left[ A^{(n)}, \phi ^{(n)}\right] \le 2E \quad \hbox { for every } t \in \left[ \varepsilon _{n}, 1\right] , \end{aligned}$$
    (8.9)
  2. (2)

    Small energy outside the cone

    $$\begin{aligned} {{\mathcal {E}}}_{(\left\{ t\right\} \times {{\mathbb {R}}}^{4}) {\setminus } S_{t}} \left[ A^{(n)}, \phi ^{(n)}\right] \le \varepsilon ^{8} E \quad \hbox { for every } t \in \left[ \varepsilon _{n}, 1\right] , \end{aligned}$$
    (8.10)
  3. (3)

    Decaying flux on \(\partial C\)

    $$\begin{aligned} {{\mathcal {F}}}_{[\varepsilon _{n}, 1]} \left[ A^{(n)}, \phi ^{(n)}\right] + {{\mathcal {G}}}_{S_{1}}\left[ \phi ^{(n)}\right] \le \varepsilon _{n}^{\frac{1}{2}} E, \end{aligned}$$
    (8.11)
  4. (4)

    Pointwise concentration at \(t = 1\)

    $$\begin{aligned} 2^{-k_{n}} \vert \zeta _{2^{-k_{n}}} *\phi ^{(n)}(1,x_{n})\vert + 2^{-2k_{n}} \vert \zeta _{2^{-k_{n}}} *\mathbf{D}_{t}^{(n)} \phi ^{(n)}(1, x_{n})\vert > \mathbf{e}(E)\nonumber \\ \end{aligned}$$
    (8.12)

    for some \(k_{n} \in {{\mathbb {Z}}}\) and \(x_{n} \in {{\mathbb {R}}}^{4}\).

Remark 8.5

The small parameter \(\varepsilon > 0\) will be specified near the end of the proof of Theorem 1.3, precisely in Lemma 8.11, depending only on E.

Remark 8.6

By the global Coulomb gauge condition \(\partial ^{\ell } A^{(n)}_{\ell } = 0\), the following gauge dependent uniform bounds for \(A^{(n)}\) and \(\phi ^{(n)}\) hold:

$$\begin{aligned} \Vert \partial _{t,x} A^{(n)}\Vert _{L^{\infty }_{t} ([\varepsilon _{n}, 1]; L^{2}_{x})} \lesssim E^{\frac{1}{2}}, \quad \Vert \partial _{t,x} \phi ^{(n)}\Vert _{L^{\infty }_{t} ([\varepsilon _{n}, 1]; L^{2}_{x})} \lesssim \left( 1+E^{\frac{1}{2}}\right) E^{\frac{1}{2}}.\nonumber \\ \end{aligned}$$
(8.13)

Proof

Suppose that Theorem 1.3 fails. Then by the discussion at the beginning of the section, there exists an admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution \((A, \phi )\) of energy E to (MKG) on \([0, T_{+}) \times {{\mathbb {R}}}^{4}\) which satisfies either \(0 < T_{+} < \infty \) and (8.1) (finite time blow-up) or \(T_{+} = \infty \) and (8.2) (non-scattering). We treat these two cases separately.

Case 1: Finite time blow-up By Lemma 8.1, there exists a point \(x_{0} \in {{\mathbb {R}}}^{4}\) such that (8.3) holds. By translation in space-time and reversing time, we may assume that \(x_{0} = 0\) and we have energy concentration at the space-time origin as \(t \rightarrow 0\), i.e.,

$$\begin{aligned} \limsup _{t \rightarrow 0} {{\mathcal {E}}}_{S_{t}}[A, \phi ] > 0. \end{aligned}$$
(8.14)

Our next course of action is to use the excision and gluing technique (Proposition 4.4) to cut away the part of \((A, \phi )\) outside the cone of influence of (0, 0). In what follows, we denote the ball \(B_{1}(0)\) by B, so that \(rB = B_{r}(0)\) for any \(r > 0\).

By Corollary 5.3 there exists \(t_{0} > 0\) such that

$$\begin{aligned} {{\mathcal {F}}}_{\partial C_{(0, t_{0}]}}[A, \phi ] \ll \min \left\{ \delta _{0}(E, \epsilon _{*}^{2}), \varepsilon ^{8} E\right\} \end{aligned}$$

where \(\delta _{0}(E, \epsilon _{*}^{2})\) is as in (4.4). Furthermore, we can find a collar of radius \(r_{0} > 0\) around \(S_{t_{0}} = \left\{ t_{0}\right\} \times t_{0} B\) with small energy, i.e.,

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t_{0}\right\} \times ((t_{0} + r_{0}) B {\setminus } {t_{0} B})}[A, \phi ] \ll \min \left\{ \delta _{0}(E, \epsilon _{*}^{2}), \varepsilon ^{8} E\right\} . \end{aligned}$$

By local conservation of energy, we then have

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t\right\} \times ((t + r_{0}) B {\setminus } {t B})}[A, \phi ] \ll \min \left\{ \delta _{0}\left( E, \epsilon _{*}^{2}\right) , \varepsilon ^{8} E\right\} \quad \hbox { for every } t \in (0, t_{0}]. \end{aligned}$$

Observe that the ratio \((t+r_0)/t\) goes to \(\infty \) as \(t \rightarrow 0\). Hence, by the improved Hardy estimate in Lemma 4.5, for sufficiently small \(0 < \bar{t} < r_0\) we also obtain

$$\begin{aligned} \Vert \frac{1}{\vert x\vert } \phi (\bar{t}, \cdot )\Vert _{L^{2}_{x} (2\bar{t} B {\setminus } \bar{t} B)}^{2} \ll \min \left\{ \delta _{0}\left( E, \epsilon _{*}^{2}\right) , \varepsilon ^{8} E\right\} . \end{aligned}$$

We may now apply Proposition 4.4 to \((a, e, f, g) = (A_{j}, F_{0j}, \phi , \mathbf{D}_{t} \phi ) \!\upharpoonright _{\left\{ t = \bar{t}\right\} }\) to obtain a new data set \((\widetilde{a}, \widetilde{e}, \widetilde{f}, \widetilde{g})\) that coincides with \((a, e, f, g)\) on \(\bar{t} B\) and obeys

$$\begin{aligned} {{\mathcal {E}}}_{{{\mathbb {R}}}^{4} {\setminus } \bar{t} B}[\widetilde{a}, \widetilde{e}, \widetilde{f}, \widetilde{g}] \le \frac{1}{2} \min \left\{ \delta _{0}\left( E, \epsilon _{*}^{2}\right) , \varepsilon ^{8} E\right\} . \end{aligned}$$

To pass to the global Coulomb gauge, we define the gauge transformation \(\underline{\chi } \in {{\mathcal {G}}}^{2}({{\mathbb {R}}}^{4})\) by \(\underline{\chi } = \triangle ^{-1} \partial ^{\ell } \widetilde{a}_{\ell }\) and let \((\check{a}, \check{e}, \check{f}, \check{g})\) be the gauge transform of \((\widetilde{a}, \widetilde{e}, \widetilde{f}, \widetilde{g})\) by \(\underline{\chi }\). Let \((\check{A}, \check{\phi })\) be the admissible \(C_{t} {{\mathcal {H}}}^{1}\) solution to the Cauchy problem in the global Coulomb gauge given by Theorem 4.3, defined on the maximal time interval \(I \ni \bar{t}\).

As a consequence of the construction and local conservation of energy, the energy outside the cone C is always tiny, i.e.,

$$\begin{aligned} {{\mathcal {E}}}_{(\left\{ t\right\} \times {{\mathbb {R}}}^{4}) \setminus S_{t}}\left[ \check{A}, \check{\phi }\right] \le \frac{1}{2} \min \left\{ \delta _{0}\left( E, \epsilon _{*}^{2}\right) , \varepsilon ^{8} E\right\} \quad \hbox { for every } t \in I.\quad \quad \end{aligned}$$
(8.15)

Then by an argument similar to the proof of Lemma 8.1, it follows that \((\check{A}, \check{\phi })\) can always be continued to the past until 0, i.e., \((0, \bar{t}] \subseteq I\). Furthermore, there exist sequences \((t_{n}, x_{n}) \in I \times {{\mathbb {R}}}^{4}\) and \(k_{n} \in {{\mathbb {Z}}}\) with \(t_{n} \rightarrow 0\) such that

$$\begin{aligned} 2^{-k_{n}} \vert \zeta _{2^{-k_{n}}} *\check{\phi }(t_{n}, x_{n})\vert + 2^{-2k_{n}} \vert \zeta _{2^{-k_{n}}} *\check{\mathbf{D}}_{t} \check{\phi }(t_{n}, x_{n})\vert > \mathbf{e}(E). \end{aligned}$$
(8.16)

For otherwise, there exists \(\delta > 0\) such that (4.11’) holds on \((0, \delta )\). Then by Theorem 4.7 (with (4.11) replaced by (4.11’)) and Theorem 4.8, the solution \((\check{A}, \check{\phi })\) can be extended past \(t=0\). Hence \(\limsup _{t \rightarrow 0} {{\mathcal {E}}}_{S_{t}}[\check{A}, \check{\phi }] = 0\), but this fact contradicts (8.14) as \({{\mathcal {E}}}_{S_{t}}[\check{A}, \check{\phi }] = {{\mathcal {E}}}_{S_{t}}[A, \phi ]\) for every \(t \in I\).

Applying Corollary 5.3 to \((\check{A}, \check{\phi })\), we may choose a sequence \(\varepsilon _{n} \rightarrow 0\) such that

$$\begin{aligned} {{\mathcal {F}}}_{[\varepsilon _{n} t_{n}, t_{n}]} [\check{A}, \check{\phi }] + {{\mathcal {G}}}_{S_{t_{n}}}[\check{\phi }] \le \varepsilon _{n}^{\frac{1}{2}} E. \end{aligned}$$

By the scaling properties of \({{\mathcal {E}}}, {{\mathcal {F}}}\) and \({{\mathcal {G}}}\), as well as scale invariance of (8.16) (which is immediate from definition), it follows that the sequence of rescaled solutions

$$\begin{aligned} \left( A^{(n)}, \phi ^{(n)}\right) (t,x) := t_{n} \left( \check{A}, \check{\phi }\right) \left( t_{n} t, t_{n} x\right) \end{aligned}$$

obeys the desired properties.

Case 2: Non-scattering  This case follows by a simple rescaling argument. Let \(R_{0} > 0\) be a large radius such that \({{\mathcal {E}}}_{\left\{ 0\right\} \times ({{\mathbb {R}}}^{4} {\setminus } B_{R_{0}}(0))} [A, \phi ] \le \varepsilon ^{8} E\). Translating in time by \(R_{0}\) and using the local conservation of energy, we may assume that \((A, \phi )\) obeys

$$\begin{aligned} {{\mathcal {E}}}_{(\left\{ t\right\} \times {{\mathbb {R}}}^{4}) {\setminus } S_{t}} [A, \phi ] \le \varepsilon ^{8} E \quad \hbox { for every } t \in [R_{0}, \infty ). \end{aligned}$$

By Theorem 4.7 with (4.11) replaced by (4.11’) and (8.2), there exist sequences \((t_{n}, x_{n}) \in [R_{0}, \infty ) \times {{\mathbb {R}}}^{4}\) and \(k_{n} \in {{\mathbb {Z}}}\) with \(t_{n} \rightarrow \infty \) such that

$$\begin{aligned} 2^{-k_{n}} \vert \zeta _{2^{-k_{n}}} *\phi (t_{n}, x_{n})\vert + 2^{-2k_{n}} \vert \zeta _{2^{-k_{n}}} *\mathbf{D}_{t} \phi (t_{n}, x_{n})\vert > \mathbf{e}(E). \end{aligned}$$

By Corollary 5.3, we may then choose a sequence \(\varepsilon _{n} \rightarrow 0\) such that \(\varepsilon _{n} t_{n} \rightarrow \infty \) and

$$\begin{aligned} {{\mathcal {F}}}_{[\varepsilon _{n} t_{n}, t_{n}]} [A, \phi ] + {{\mathcal {G}}}_{S_{t_{n}}}[\phi ] \le \varepsilon _{n}^{\frac{1}{2}} E. \end{aligned}$$

Defining \((A^{(n)}, \phi ^{(n)})(t,x) := t_{n} (A, \phi )(t_{n} t, t_{n} x)\), we obtain the desired sequence. \(\square \)

8.2 Elimination of the null concentration scenario

Using Proposition 5.4, in particular the weighted energy estimate on \(S_{1}\), we show that null concentration cannot happen. The precise statement is as follows.

Lemma 8.7

(No null concentration)  Let \((A^{(n)}, \phi ^{(n)})\) be a sequence of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions to (MKG) satisfying the conclusions of Lemma 8.4 with the sequences \(\varepsilon _{n}, k_{n}\) and \(x_{n}\). There exist \(K = K(E) > 0\) and \(\gamma = \gamma (E) \in (0, 1)\) such that if \(k_{n} > K(E)\) and \(\vert x_{n}\vert > \gamma (E)\) for all sufficiently large n, and \(\varepsilon > 0\) is sufficiently small depending on E, then

$$\begin{aligned} \limsup _{n \rightarrow \infty } \, 2^{-k_{n}} \vert \zeta _{2^{-k_{n}}} *\phi ^{(n)}(1, x_{n})\vert + 2^{-2k_{n}} \vert \zeta _{2^{-k_{n}}} *\mathbf{D}_{t}^{(n)} \phi ^{(n)} (1,x_{n})\vert \le \mathbf{e}(E).\nonumber \\ \end{aligned}$$
(8.17)

Remark 8.8

Note that K(E) in Lemma 8.7 can be replaced a posteriori by any number greater than K(E). Hence given any \(m = m(E)\) depending only on E, we may assume in addition to the statement of Lemma 8.7 that

$$\begin{aligned} 2^{-K} \le \frac{1}{100 \, m(E)} (1 - \gamma ). \end{aligned}$$
(8.18)

This observation will be useful in the proof of Lemma 8.9 below.

Proof

The idea of the proof is similar to that of [33, Lemma 6.2] with additional ideas to deal with the presence of covariant derivatives.

Step 1 The starting point is Proposition 5.4 applied to \((A, \phi ) = (A^{(n)}, \phi ^{(n)})\) with \(\varepsilon = \varepsilon _{n}\), more precisely the first term on the left-hand side of (5.11). Using Lemma 5.10 to write out \({}^{(X_{\varepsilon _{n}})} P_{T}\), we see that the following a-priori estimate holds on \(S_{1}\):

$$\begin{aligned} \int _{S_{1}} \frac{1}{(1 - \vert x\vert +\varepsilon _{n})^{\frac{1}{2}}} \Big ( \vert \mathbf{D}_{L}^{(n)} \phi ^{(n)}\vert ^{2} + \vert \not \!\! \mathbf{D}^{(n)} \phi ^{(n)}\vert ^{2} \Big ) \, \mathrm {d}x \lesssim E. \end{aligned}$$
(8.19)

By the smallness of the energy outside \(S_{1}\), we then obtain the global bound

$$\begin{aligned} \int _{\left\{ t = 1\right\} } \frac{1}{((1 - \vert x\vert )_{+} + \varepsilon _{n})^{\frac{1}{2}} + \varepsilon ^{8}} \Big ( \vert \mathbf{D}_{L}^{(n)} \phi ^{(n)}\vert ^{2} + \vert \not \!\! \mathbf{D}^{(n)} \phi ^{(n)}\vert ^{2} \Big ) \, \mathrm {d}x \lesssim E,\nonumber \\ \end{aligned}$$
(8.20)

where \((\cdot )_{+} := \max \left\{ \cdot , 0\right\} \).
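The passage from (8.19) to (8.20) can be sketched as follows. Inside \(S_{1}\), the weight in (8.20) is no larger than the weight in (8.19), so that part of the integral is controlled by (8.19). Outside \(S_{1}\), the weight is at most \(\varepsilon ^{-8}\); assuming, as is standard, that the energy density controls \(\vert \mathbf{D}_{L} \phi \vert ^{2} + \vert \not \!\! \mathbf{D}\phi \vert ^{2}\) up to a constant, (8.10) gives

$$\begin{aligned} \int _{\left\{ t = 1\right\} {\setminus } S_{1}} \frac{1}{\varepsilon ^{8}} \Big ( \vert \mathbf{D}_{L}^{(n)} \phi ^{(n)}\vert ^{2} + \vert \not \!\! \mathbf{D}^{(n)} \phi ^{(n)}\vert ^{2} \Big ) \, \mathrm {d}x \lesssim \frac{1}{\varepsilon ^{8}} \, {{\mathcal {E}}}_{(\left\{ 1\right\} \times {{\mathbb {R}}}^{4}) {\setminus } S_{1}} \left[ A^{(n)}, \phi ^{(n)}\right] \le E. \end{aligned}$$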

Step 2  We claim that for any non-negative \(k \in {{\mathbb {Z}}}\) the following estimate holds:

$$\begin{aligned} \limsup _{n \rightarrow \infty } \, 2^{-k} \vert \zeta _{2^{-k}} *\vert \phi ^{(n)}\vert (1,x)\vert \lesssim \Big ( 2^{-\frac{3}{8} k} + \big ( (1-\vert x\vert )_{+} + 2^{-k} \big )^{\frac{1}{4}} + \varepsilon ^{4} \Big ) E^{\frac{1}{2}}.\nonumber \\ \end{aligned}$$
(8.21)

The point of (8.21) is that \(\vert \phi ^{(n)}\vert \) is gauge invariant, and hence we can avoid estimating A. Henceforth, we will denote \(\psi ^{(n)} := \vert \phi ^{(n)}\vert (1, \cdot )\). We use the rotational symmetry to bring x to the \(x^{1}\)-axis, so that \(x = (\vert x\vert , 0, 0, 0)\). Henceforth we will write \(x = (x^{1}, x')\) where \(x' = (x^{2}, x^{3}, x^{4})\).

By the diamagnetic inequality (Lemma 8.2), conservation of energy implies

$$\begin{aligned} \int \vert \nabla \psi ^{(n)}\vert ^{2} \, \mathrm {d}x \lesssim E, \end{aligned}$$
(8.22)

where \(\vert \nabla \psi \vert ^{2} := \sum _{\ell =1}^{4} \vert \partial _{\ell } \psi \vert ^{2}\). Note that (8.22), together with the Sobolev embedding and Young’s inequality, implies the trivial bound \(2^{-k} \Vert \zeta _{2^{-k}} *\psi ^{(n)}\Vert _{L^{\infty }_{x}} \lesssim E^{1/2}\), which allows us to restrict our attention to \(x = (x^{1}, 0, 0, 0)\) with \(1/2 < x^{1} < 2\).
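The trivial bound mentioned above can be checked in one line. This is a sketch using the scaling identity \(\Vert \zeta _{2^{-k}}\Vert _{L^{4/3}_{x}} = 2^{k} \Vert \zeta \Vert _{L^{4/3}_{x}}\), which follows from \(\zeta _{2^{-k}} = 2^{4k} \zeta (2^{k} \cdot )\), and the Sobolev embedding \(\dot{H}^{1}({{\mathbb {R}}}^{4}) \subseteq L^{4}({{\mathbb {R}}}^{4})\):

$$\begin{aligned} 2^{-k} \Vert \zeta _{2^{-k}} *\psi ^{(n)}\Vert _{L^{\infty }_{x}} \le 2^{-k} \Vert \zeta _{2^{-k}}\Vert _{L^{4/3}_{x}} \Vert \psi ^{(n)}\Vert _{L^{4}_{x}} \lesssim \Vert \nabla \psi ^{(n)}\Vert _{L^{2}_{x}} \lesssim E^{\frac{1}{2}}. \end{aligned}$$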

We claim that for n sufficiently large so that \(\varepsilon _{n}^{1/2} \le \frac{1}{100} \varepsilon ^{8}\), the directional derivatives other than \(\partial _{1}\) obey an improved estimate

$$\begin{aligned} \sum _{j=2}^{4} \int w_{k} \vert \partial _{j} \psi ^{(n)}\vert ^{2} \, \mathrm {d}x \lesssim E, \end{aligned}$$
(8.23)

where \(w_{k} > 0\) is defined as

$$\begin{aligned} w_{k}(x) := \frac{1}{(\vert 1-x^{1}\vert + \vert x'\vert ^{2} + 2^{-k})^{\frac{1}{2}} + \varepsilon ^{8}}. \end{aligned}$$
(8.24)

To prove (8.23) under the assumption \(\varepsilon _{n}^{1/2} \le \frac{1}{100} \varepsilon ^{8}\), it suffices to prove

$$\begin{aligned} \sum _{j=2}^{4} \int \frac{1}{(\vert 1 - x^{1}\vert + \vert x'\vert ^{2} + \varepsilon _{n})^{\frac{1}{2}} + \varepsilon ^{8}} \vert \partial _{j} \psi ^{(n)}\vert ^{2} \, \mathrm {d}x \lesssim E. \end{aligned}$$
(8.25)

The estimate (8.25) is a consequence of (8.20). Indeed, the latter estimate combined with the diamagnetic inequality implies

(8.26)

At \(x = (1, 0, 0, 0)\) we have \(\frac{1}{r^{2}} g_{{{\mathbb {S}}}^{3}}^{-1} = \sum _{j=2}^{4} \partial _{j} \cdot \partial _{j}\). Therefore, by smoothness, we have

On the other hand, \((1-\vert x\vert )_{+} \lesssim \vert 1 - x^{1}\vert + \vert x'\vert ^{2}\).

Combining these facts, we may now bound the left-hand side of (8.25) by

Then using (8.22) and (8.26), the desired estimate (8.25) follows.

Compared to the weight in the preceding expression, observe that we have absorbed \(\varepsilon _{n}\) into \(\varepsilon ^{8}\) and added \(2^{-k}\) in \(w_{k}\). This maneuver ensures that \(w_{k}\) is slowly varying at scale \(2^{-k} \times 2^{-k/2} \times \cdots \times 2^{-k/2}\), i.e., for any \(x, y \in {{\mathbb {R}}}^{4}\) we have

$$\begin{aligned} \vert \frac{w_{k}(x)}{w_{k}(x-y)}\vert \lesssim e^{\sum _{j=1}^{4} \vert y^{j}\vert \Vert \partial _{j} \log w_{k}\Vert _{L^{\infty }}} \lesssim e^{2^{k} \vert y^{1}\vert + 2^{k/2} \vert y'\vert }. \end{aligned}$$
(8.27)
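For the reader's convenience, we sketch the elementary computation behind the second inequality in (8.27). The shorthand \(s\) is introduced only for this sketch and is not part of the original argument. Writing \(s := \vert 1-x^{1}\vert + \vert x'\vert ^{2} + 2^{-k}\), so that \(\log w_{k} = - \log (s^{1/2} + \varepsilon ^{8})\), we have away from \(\left\{ x^{1} = 1\right\} \)

$$\begin{aligned} \vert \partial _{1} \log w_{k}\vert&= \frac{1}{2 s^{\frac{1}{2}} (s^{\frac{1}{2}} + \varepsilon ^{8})} \le \frac{1}{2 s} \le 2^{k-1}, \\ \vert \partial _{j} \log w_{k}\vert&= \frac{\vert x^{j}\vert }{s^{\frac{1}{2}} (s^{\frac{1}{2}} + \varepsilon ^{8})} \le \frac{\vert x'\vert }{\vert x'\vert ^{2} + 2^{-k}} \le 2^{\frac{k}{2}-1} \quad \hbox { for } j = 2, 3, 4, \end{aligned}$$

where the last bound uses \(\vert x'\vert ^{2} + 2^{-k} \ge 2 \cdot 2^{-k/2} \vert x'\vert \). Since \(\log w_{k}\) is Lipschitz, integrating these bounds along the segment from \(x-y\) to \(x\) gives the exponent \(2^{k} \vert y^{1}\vert + 2^{k/2} \vert y'\vert \) up to a constant factor.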

We now turn to the task of deriving (8.21) from (8.22) and (8.23). We introduce the notation \(Z_{k} \psi := \zeta _{2^{-k}} *\psi \) and write \(z_{k}(\xi )\) for the symbol of the integral operator \(Z_{k}\); of course, \(z_{k}\) is nothing but the (spatial) Fourier transform of \(\zeta _{2^{-k}}\). Given a smooth cut-off \(\eta \) on \({{\mathbb {R}}}^{3}\) adapted to the unit ball, we furthermore decompose

$$\begin{aligned} Z_{k} = Z_{k}^{1} \partial _{1} + Z_{k}^{2} \partial _{2} + \cdots + Z_{k}^{4} \partial _{4} \end{aligned}$$

where the symbols \(z_{k}^{j}(\xi )\) of \(Z_{k}^{j}\) are given by

$$\begin{aligned} z_{k}^{1} (\xi )= & {} z_{k}(\xi ) \eta \left( 2^{-\frac{k}{2}} \xi '\right) \frac{1}{i \xi _{1}}, \\ z_{k}^{j} (\xi )= & {} z_{k}(\xi ) \left( 1 - \eta \left( 2^{-\frac{k}{2}} \xi '\right) \right) \frac{\xi _{j}}{i \vert \xi '\vert ^{2}} \quad \hbox { for } j = 2, 3, 4, \end{aligned}$$

where \(\xi ' = (\xi _{2}, \xi _{3}, \xi _{4})\).
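As a quick consistency check (a sketch, under the assumption that \(\eta \) equals 1 near the origin, so that \(1 - \eta (2^{-k/2} \xi ')\) vanishes near \(\xi ' = 0\)), the symbols above do sum to \(z_{k}\) once the derivatives are taken into account:

$$\begin{aligned} z_{k}^{1}(\xi ) \, i \xi _{1} + \sum _{j=2}^{4} z_{k}^{j}(\xi ) \, i \xi _{j}&= z_{k}(\xi ) \eta \left( 2^{-\frac{k}{2}} \xi '\right) + z_{k}(\xi ) \left( 1 - \eta \left( 2^{-\frac{k}{2}} \xi '\right) \right) \frac{\xi _{2}^{2} + \xi _{3}^{2} + \xi _{4}^{2}}{\vert \xi '\vert ^{2}} \\&= z_{k}(\xi ). \end{aligned}$$

In this splitting, the frequency region \(\left\{ \vert \xi '\vert \lesssim 2^{k/2}\right\} \) is paired with \(\partial _{1}\), while the complementary region is paired with the derivatives \(\partial _{2}, \partial _{3}, \partial _{4}\), for which the improved estimate (8.23) is available.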

The contribution of \(Z_{k}^{1} \partial _{1}\) to (8.21) is easy to treat. Observe that \(z_{k}^{1} (\xi ) i \xi _{1}\) is a smooth symbol which is rapidly decaying at scale \(2^{k}\) in the \(\xi _{1}\)-direction, i.e., for every \(N \ge 0\) we have

$$\begin{aligned} \vert (\xi _{1} \partial _{\xi _{1}})^{N} \left( z_{k}^{1} i \xi _{1}\right) \vert \lesssim _{N} \left( 1+2^{-k} \vert \xi _{1}\vert \right) ^{-100}. \end{aligned}$$

Moreover, \(z_{k}^{1} (\xi ) i \xi _{1}\) is compactly supported in the set \(\left\{ \vert \xi '\vert \lesssim 2^{k/2}\right\} \) in the other directions. For any \(N \ge 0\), these facts immediately imply the kernel bound

$$\begin{aligned} \vert {{\mathcal {F}}}_{x}^{-1} (z_{k}^{1} i \xi _{1})\vert \lesssim _{N} 2^{\frac{5}{2} k} (1+2^{k} \vert x^{1}\vert )^{-N} (1 + 2^{\frac{k}{2}} \vert x'\vert )^{-N}, \end{aligned}$$

where \({{\mathcal {F}}}_{x}^{-1}\) denotes the inverse (spatial) Fourier transform. In particular, \(\Vert {{\mathcal {F}}}_{x}^{-1} (z_{k}^{1} i \xi _{1}) \Vert _{L^{\frac{4}{3}}_{x}} \lesssim 2^{\frac{5}{8}k}\). By Young’s inequality and the Sobolev embedding \(\dot{H}^{1}_{x} \subseteq L^{4}_{x}\), we have

$$\begin{aligned} 2^{-k} \vert Z_{k}^{1} \partial _{1} \psi ^{(n)}(x)\vert \lesssim 2^{-\frac{3}{8} k} \Vert \psi ^{(n)}\Vert _{\dot{H}^{1}_{x}} \lesssim 2^{-\frac{3}{8} k} E^{\frac{1}{2}}, \end{aligned}$$

which is acceptable.
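The exponent \(\frac{5}{8} k\) can be checked directly from the kernel bound. The following is a sketch, taking any fixed \(N \ge 3\) so that the integrals below converge:

$$\begin{aligned} \Vert {{\mathcal {F}}}_{x}^{-1} (z_{k}^{1} i \xi _{1}) \Vert _{L^{\frac{4}{3}}_{x}}&\lesssim 2^{\frac{5}{2} k} \left( \int _{{{\mathbb {R}}}} (1 + 2^{k} \vert x^{1}\vert )^{-\frac{4N}{3}} \, \mathrm {d}x^{1} \int _{{{\mathbb {R}}}^{3}} (1 + 2^{\frac{k}{2}} \vert x'\vert )^{-\frac{4N}{3}} \, \mathrm {d}x' \right) ^{\frac{3}{4}} \\&\lesssim 2^{\frac{5}{2} k} \left( 2^{-k} \cdot 2^{-\frac{3}{2} k} \right) ^{\frac{3}{4}} = 2^{\frac{5}{2} k - \frac{15}{8} k} = 2^{\frac{5}{8} k}. \end{aligned}$$

Multiplying by the prefactor \(2^{-k}\) then yields the gain \(2^{-\frac{3}{8} k}\) in the preceding estimate.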

It remains to treat the contribution of \(Z_{k}^{j} \partial _{j}\) for \(j = 2,3, 4\). Denote by \(\zeta _{k}^{j}(x)\) the integral kernel of \(Z_{k}^{j}\), which is simply the inverse Fourier transform of \(z_{k}^{j}\). A straightforward computation shows that \(\Vert z^{j}_{k}\Vert _{L^{2}_{\xi }} \lesssim 2^{k}\). Therefore, by Plancherel,

$$\begin{aligned} \Vert \zeta ^{j}_{k}\Vert _{L^{2}_{x}} \lesssim 2^{k}. \end{aligned}$$
(8.28)

Next, for any \(N, M_{2}, M_{3}, M_{4} \ge 0\), we have

$$\begin{aligned}&\left| (\xi _{1} \partial _{\xi _{1}})^{N} (\xi _{2} \partial _{\xi _{2}})^{M_{2}} (\xi _{3} \partial _{\xi _{3}})^{M_{3}} (\xi _{4} \partial _{\xi _{4}})^{M_{4}} \left( \sum _{j=2}^{4} i \xi _{j} z_{k}^{j}\right) \right| \\&\quad \lesssim _{N, M_{2}, M_{3}, M_{4}} \left( 1 + 2^{-k} \vert \xi \vert \right) ^{-100}. \end{aligned}$$

In addition, the left-hand side is supported in \(\left\{ \vert \xi '\vert \lesssim 2^{k/2}\right\} \) if \(M_{2} + M_{3} + M_{4} \ne 0\). Taking the inverse Fourier transform, we obtain

$$\begin{aligned} \left| \sum _{j=2}^{4} \partial _{j} \zeta _{k}^{j}(x)\right| \lesssim _{N} \min \left\{ 2^{\frac{3}{2} k}, \left( 2^{\frac{k}{2}} \vert x'\vert \right) ^{-N}\right\} \left( 1 + 2^{k} \vert x^{1}\vert \right) ^{-N} 2^{\frac{5}{2} k}\quad \quad \quad \end{aligned}$$
(8.29)

where the implicit constant is independent of k.

Hence we can split \(\zeta _{k}^{j} = \zeta _{k, \mathrm {near}}^{j} + \zeta _{k, \mathrm {far}}^{j}\), where

$$\begin{aligned} \zeta _{k, \mathrm {near}}^{j}(x) := \zeta _{k}^{j} (x) 1_{\left\{ x : 2^{k} \vert x^{1}\vert \le L, \ 2^{\frac{k}{2}} \vert x'\vert \le L \right\} }(x), \end{aligned}$$

and \(L > 0\) is chosen large enough (independent of k) so that, by (8.29), we have

$$\begin{aligned} \left\| {\sum _{j=2}^{4} \partial _{j} \zeta _{k, \mathrm {far}}^{j}}\right\| _{L^{\frac{4}{3}}_{x}} \le 2^{k} \varepsilon ^{4}. \end{aligned}$$
(8.30)

We denote the corresponding splitting of \(Z_{k}^{j}\) by \(Z_{k, \mathrm {near}}^{j} + Z_{k, \mathrm {far}}^{j}\).

We are now ready to complete the proof of (8.21). The contribution of \(Z_{k, \mathrm {far}}^{j} \partial _{j}\) is acceptable, thanks to (8.22), (8.30) and the Sobolev embedding \(\dot{H}^{1}_{x} \subseteq L^{4}_{x}\). For \(\sum _{j=2}^{4} Z_{k, \mathrm {near}}^{j} \partial _{j}\), we have

$$\begin{aligned} 2^{-k} \left| \sum _{j=2}^{4} Z_{k, \mathrm {near}}^{j} \partial _{j} \psi ^{(n)}(x)\right|\le & {} 2^{-k} \sum _{j=2}^{4} \int \vert \zeta ^{j}_{k, \mathrm {near}}(y)\vert \vert \partial _{j} \psi ^{(n)}(x-y)\vert \, \mathrm {d}y \\\lesssim & {} M w_{k}^{-\frac{1}{2}}(x) \sum _{j=2}^{4} \Vert w_{k}^{\frac{1}{2}} \partial _{j} \psi ^{(n)}\Vert _{L^{2}_{x}} \end{aligned}$$

where, by (8.27), (8.28) and the definition of \(\zeta _{k, \mathrm {near}}^{j}, M\) obeys the bound

Note that \(w_{k}^{-\frac{1}{2}}(x) \lesssim ((1-\vert x\vert )_{+} + 2^{-k})^{1/4} + \varepsilon ^{4}\), since we have chosen \(x = (x^{1}, 0, 0, 0)\); this proves (8.21).

Step 3  In this step we upgrade (8.21) to the following gauge dependent estimate:

$$\begin{aligned} \limsup _{n \rightarrow \infty } 2^{-2k} \vert \zeta _{2^{-k}} *\mathbf{D}_{j} \phi ^{(n)}(1, x)\vert \lesssim \Big ( 2^{-\frac{3}{8} k} + \big ( (1-\vert x\vert )_{+} + 2^{-k} \big )^{\frac{1}{4}} + \varepsilon ^{4} \Big ) E^{\frac{1}{2}}.\nonumber \\ \end{aligned}$$
(8.31)

The idea is that (8.21) has already broken the scaling invariance, so we can easily incorporate A using the trivial bound \(\Vert A\Vert _{L^{\infty }_{t} \dot{H}^{1}_{x}} \lesssim E^{1/2}\).

We begin by applying Step 2 to \(\widetilde{\zeta }_{2^{-k}}\), where we recall that \(\zeta = \widetilde{\zeta } *\widetilde{\zeta }\). We again introduce the shorthand \(\widetilde{Z}_{k} (\cdot ):= \widetilde{\zeta }_{2^{-k}} *(\cdot )\). By the simple pointwise inequality \(\vert \widetilde{Z}_{k} \phi ^{(n)}\vert \le \vert \widetilde{Z}_{k} \vert \phi ^{(n)}\vert \vert \), which holds since \(\widetilde{\zeta } \ge 0\), we have

$$\begin{aligned} \limsup _{n \rightarrow \infty } \, 2^{-k} \vert \widetilde{Z}_{k} \phi ^{(n)}(1,x)\vert \lesssim \Big ( 2^{-\frac{3}{8} k} + \big ( (1-\vert x\vert )_{+} + 2^{-k} \big )^{\frac{1}{4}} + \varepsilon ^{4} \Big ) E^{\frac{1}{2}}.\nonumber \\ \end{aligned}$$
(8.32)

Note furthermore that \(Z_{k} = \widetilde{Z}_{k}^{2}\). For \(j= 1, \ldots , 4\), we may write

$$\begin{aligned} 2^{-2k} \vert Z_{k} \mathbf{D}_{j}^{(n)} \phi ^{(n)}(1,x)\vert&\le 2^{-2k} \vert Z_{k} \partial _{j} \phi ^{(n)}(1,x)\vert \\&\quad +2^{-2k} \vert Z_{k} (A_{j}^{(n)} \phi ^{(n)})(1,x)\vert \\&\lesssim 2^{-k} \sup _{\vert x - x'\vert \lesssim 2^{-k}} \vert \widetilde{Z}_{k} \phi ^{(n)}(1,x')\vert \\&\quad + 2^{-2k} \vert Z_{k} (A_{j}^{(n)} \phi ^{(n)})(1,x)\vert . \end{aligned}$$

The first term on the last line is acceptable, thanks to (8.32). To treat the second term, we insert \(1 = (1 - \widetilde{Z}_{k+m}) + \widetilde{Z}_{k+m}\) in front of both \(A^{(n)}\) and \(\phi ^{(n)}\) for some \(m > 0\) to be determined. By the simple inequalities \(\vert Z_{k} f(x)\vert \lesssim 2^{3k} \Vert f\Vert _{L^{4/3}_{x}}\) and \(\Vert (1 - \widetilde{Z}_{k+m}) f\Vert _{L^{2}_{x}} \lesssim 2^{-k-m} \Vert f\Vert _{\dot{H}^{1}_{x}}\), as well as the Sobolev embedding \(\dot{H}^{1}_{x} \subseteq L^{4}_{x}\), we have

$$\begin{aligned} 2^{-2k} \vert Z_{k}((1-\widetilde{Z}_{k+m}) A_{j}^{(n)} \cdot \phi ^{(n)} ) (1,x)\vert \lesssim 2^{-m} \Vert A^{(n)}(1, \cdot )\Vert _{\dot{H}^{1}_{x}} \Vert \phi ^{(n)}(1, \cdot )\Vert _{\dot{H}^{1}_{x}} \end{aligned}$$

which can be made \(\le \varepsilon ^{4} E^{\frac{1}{2}}\) by choosing m large enough. Proceeding similarly, the same upper bound can be proved for \(Z_{k}(\widetilde{Z}_{k+m} A_{j}^{(n)} (1-\widetilde{Z}_{k+m}) \phi ^{(n)})\).
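For the reader's convenience, the bookkeeping of powers of 2 in the preceding estimate can be sketched as follows, where the shorthand \(g := (1 - \widetilde{Z}_{k+m}) A_{j}^{(n)}(1, \cdot )\) is introduced only for this sketch. By the two simple inequalities above, Hölder's inequality with \(\frac{3}{4} = \frac{1}{2} + \frac{1}{4}\) and the Sobolev embedding,

$$\begin{aligned} 2^{-2k} \vert Z_{k} (g \, \phi ^{(n)})(1,x)\vert&\lesssim 2^{-2k} \cdot 2^{3k} \Vert g \, \phi ^{(n)}(1, \cdot )\Vert _{L^{\frac{4}{3}}_{x}} \le 2^{k} \Vert g\Vert _{L^{2}_{x}} \Vert \phi ^{(n)}(1, \cdot )\Vert _{L^{4}_{x}} \\&\lesssim 2^{k} \cdot 2^{-k-m} \Vert A^{(n)}(1, \cdot )\Vert _{\dot{H}^{1}_{x}} \Vert \phi ^{(n)}(1, \cdot )\Vert _{\dot{H}^{1}_{x}}. \end{aligned}$$

The powers of \(2^{k}\) cancel exactly, reflecting the scaling invariance of the bilinear term, and only the gain \(2^{-m}\) survives.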

For the remaining term, we have

$$\begin{aligned}&2^{-2k} \vert Z_{k} \left( \widetilde{Z}_{k+m} A_{j}^{(n)} \widetilde{Z}_{k+m} \phi ^{(n)}\right) (1,x)\vert \\&\quad \lesssim 2^{-k} \Vert \widetilde{Z}_{k+m} A_{j}^{(n)}(1, \cdot )\Vert _{L^{\infty }_{x}} 2^{-k} \sup _{\vert x - x'\vert \lesssim 2^{-k}} \vert \widetilde{Z}_{k+m} \phi ^{(n)}(1,x')\vert \\&\quad \lesssim _{E, m} 2^{-k+m} \sup _{\vert x - x'\vert \lesssim 2^{-k}} \vert \widetilde{Z}_{k+m} \phi ^{(n)}(1,x')\vert \end{aligned}$$

which is acceptable in view of (8.32).

Step 4  We are ready to conclude the proof of the lemma. By (8.21) and the pointwise inequality \(\vert \zeta _{2^{-k}} *\phi \vert \le \zeta _{2^{-k}} *\vert \phi \vert \), we can achieve the desired smallness of \(\phi ^{(n)}\) as in (8.17) by taking K very large, \(\gamma \) close enough to 1 and \(\varepsilon > 0\) sufficiently small. For \(\mathbf{D}^{(n)}_{t}\), we have

$$\begin{aligned} 2^{-2k} \vert \zeta _{2^{-k}} *\mathbf{D}_{t}^{(n)} \phi ^{(n)}(1, x)\vert\le & {} 2^{-2k} \vert \zeta _{2^{-k}} *\mathbf{D}_{L}^{(n)} \phi ^{(n)}(1, x)\vert \nonumber \\&+ \sum _{j=1}^{4} 2^{-2k} \vert \zeta _{2^{-k}} *\mathbf{D}_{j}^{(n)} \phi ^{(n)}(1, x)\vert .\quad \quad \end{aligned}$$
(8.33)

For the first term, we begin by estimating

$$\begin{aligned} 2^{-2k} \vert \zeta _{2^{-k}} *\mathbf{D}_{L}^{(n)} \phi ^{(n)}(1, x)\vert \lesssim \left( \int _{\left\{ y : \vert y - x\vert \lesssim 2^{-k}\right\} } \vert \mathbf{D}_{L}^{(n)} \phi ^{(n)}(1, y)\vert ^{2} \, \mathrm {d}y \right) ^{1/2}. \end{aligned}$$

Then by (8.20), the right-hand side is bounded by

$$\begin{aligned} \Big ( \big ( (1 - \vert x\vert )_{+} + 2^{-k} \big )^{\frac{1}{4}} + \varepsilon ^{4} \Big ) E^{1/2} \end{aligned}$$

provided that \(\varepsilon _{n}^{1/2} \le \frac{1}{10} \varepsilon ^{8}\). Using (8.31) to estimate the second term in (8.33), (8.17) now follows after adjusting \(K, \gamma \) and \(\varepsilon \) if necessary. \(\square \)

8.3 Nontrivial energy in a time-like region

An important consequence of Lemma 8.7 is that there is a uniform lower bound for \(\phi ^{(n)}\) in a time-like region at \(t = 1\).

Lemma 8.9

Let \((A^{(n)}, \phi ^{(n)})\) be a sequence of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions to (MKG) satisfying the conclusions of Lemma 8.4. Let \(K(E) > 0\) and \(\gamma (E) \in (0, 1)\) be as in Lemma 8.7 and Remark 8.8. Assume that either (1) \(k_{n} \le K(E)\) or (2) \(k_{n} > K(E)\) and \(\vert x_{n}\vert \le \gamma (E)\). Then there exist \(E_{1} = E_{1}(E) > 0\) and \(\gamma _{1} = \gamma _{1} (E) \in (0, 1)\) such that if \(\varepsilon > 0\) is sufficiently small depending on E, then

$$\begin{aligned} \int _{S^{1-\gamma _{1}}_{1}} \sum _{\mu =0}^{4} \vert \mathbf{D}_{\mu }^{(n)} \phi ^{(n)}\vert ^{2} + \frac{1}{r^{2}} \vert \phi ^{(n)}\vert ^{2} \, \mathrm {d}x \ge E_{1}(E). \end{aligned}$$
(8.34)

Proof

Since the whole proof will take place on \(\left\{ t = 1\right\} \), we will ignore the difference between \(\left\{ t =1\right\} \) and \({{\mathbb {R}}}^{4}\). In this case, note that \(S_{1}^{1-\gamma } = \left\{ 1\right\} \times B_{\gamma }(0)\) for any \(\gamma \in (0, 1)\). Furthermore, as the argument is the same for each n, we will henceforth suppress n for simplicity.

There are two scenarios to consider:

  1. A.

    Nontrivial kinetic energy. \(2^{-2k} \vert \zeta _{2^{-k}} *\mathbf{D}_{t} \phi (x)\vert \ge \frac{1}{2} \mathbf{e}(E)\), or

  2. B.

    Nontrivial potential energy. \(2^{-k} \vert \zeta _{2^{-k}} *\phi (x)\vert \ge \frac{1}{2} \mathbf{e}(E)\).

We first treat Scenario A. By Cauchy-Schwarz,

$$\begin{aligned} \frac{1}{2} \mathbf{e}\le \int 2^{-2k} \zeta _{2^{-k}}(y) \vert \mathbf{D}_{t} \phi (x-y)\vert \, \mathrm {d}y \lesssim \left( \int _{B_{2^{-k}}(x)} \vert \mathbf{D}_{t} \phi \vert ^{2} \, \mathrm {d}y \right) ^{1/2}, \end{aligned}$$

where we also used \({\mathrm {supp}}\, \zeta \subseteq B_{1}(0)\). Hence in Case 2, (8.34) immediately follows by taking \(\gamma _{1} \ge \gamma + 2^{-k}\) so that \(\left\{ 1\right\} \times B_{2^{-k}}(x) \subseteq S^{1-\gamma _{1}}_{1}\). Note that we may still ensure that \(\gamma _{1} < 1\) thanks to (8.18).

Now assume that Case 1 holds, i.e., \(k \le K\). Splitting the convolution integral into \(\int _{S_{1}^{1-\gamma _{1}}} + \int _{S_{1} \setminus S^{1-\gamma _{1}}_{1}} + \int _{{{\mathbb {R}}}^{4} \setminus S_{1}}\), applying Cauchy-Schwarz and using (8.9), (8.10), we have

$$\begin{aligned} \mathbf{e}\lesssim \left( \int _{S_{1}^{1-\gamma _{1}}} \vert \mathbf{D}_{t} \phi \vert ^{2} \, \mathrm {d}y \right) ^{1/2} + c_{0}(\gamma _{1}) E^{1/2} + \varepsilon ^{8} E^{1/2}, \end{aligned}$$

where

$$\begin{aligned} c_{0}(\gamma _{1}):= & {} \left( \int _{S_{1} {\setminus } S_{1}^{1-\gamma _{1}}} \vert \zeta (2^{-k} y)\vert ^{2} 2^{-4k} \, \mathrm {d}y \right) ^{1/2}\\\lesssim & {} 2^{-2k} \vert \left( S_{1} {\setminus } S_{1}^{1-\gamma _{1}}\right) \cap B_{2^{-k}}(x)\vert ^{1/2}. \end{aligned}$$

By elementary geometry and the assumption \(k \le K\), it follows that the last term is bounded by \(\lesssim (1-\gamma _{1})^{1/2} 2^{-K/2}\) uniformly in x. Taking \(\gamma _{1}\) sufficiently close to 1, the desired conclusion follows.

We now consider Scenario B. We repeat the above argument with \(\mathbf{D}_{t} \phi \) replaced by \(\phi \), while putting \(\zeta _{2^{-k}}\) [resp. \(\phi \)] in \(L^{4/3}\) [resp. \(L^{4}\)] instead of \(L^{2}\) [resp. \(L^{2}\)]. Then in Case 2,

$$\begin{aligned} \mathbf{e}\lesssim \left( \int _{B_{2^{-k}}(x)} \vert \phi \vert ^{4} \, \mathrm {d}y \right) ^{1/4}, \end{aligned}$$
(8.35)

whereas in Case 1,

$$\begin{aligned} \mathbf{e}\lesssim \left( \int _{S_{1}^{1-\gamma _{1}}} \vert \phi \vert ^{4} \, \mathrm {d}y \right) ^{\frac{1}{4}} + c_{1}(\gamma _{1}) \Vert \phi \Vert _{L^{4}_{x}({{\mathbb {R}}}^{4})} + \Vert \phi \Vert _{L^{4}_{x}({{\mathbb {R}}}^{4} \setminus S_{1})}, \end{aligned}$$
(8.36)

with \(c_{1}(\gamma _{1}) \lesssim (1-\gamma _{1})^{3/4} 2^{-3K/4}\). The desired conclusion then follows from (8.9), (8.10), the diamagnetic inequality (Lemma 8.2) and the localized Sobolev inequalities

$$\begin{aligned} \Vert f\Vert _{L^{4}_{x}(B_{r}(0))}\lesssim & {} \left( \sum _{j=1}^{4} \Vert \partial _{j} f\Vert _{L^{2}_{x}(B_{r}(0))}^{2} \right) ^{1/2} + \left\| {\frac{1}{\vert x\vert } f}\right\| _{L^{2}(B_{r}(0))}, \\ \Vert f\Vert _{L^{4}_{x}({{\mathbb {R}}}^{4} {\setminus } B_{r}(0))}\lesssim & {} \left( \sum _{j=1}^{4} \Vert \partial _{j} f\Vert _{L^{2}_{x}({{\mathbb {R}}}^{4} {\setminus } B_{r}(0))}^{2} \right) ^{1/2}, \end{aligned}$$

which hold with a uniform constant for any \(r > 0\).

To prove the preceding two inequalities, it suffices to consider the case \(r = 1\) by scaling invariance. The first inequality is an immediate consequence of the standard inequality \(\Vert f\Vert _{L^{4}(B_{1}(0))} \lesssim \Vert f\Vert _{\dot{H}^{1}(B_{1}(0))} + \Vert f\Vert _{L^{2}(B_{1}(0))}\). To prove the second inequality, we extend f to \({{\mathbb {R}}}^{4}\). Using the standard extension operator from f on \(B_{2}(0) {\setminus } B_{1}(0)\), the global extension \(\widetilde{f}\) on \({{\mathbb {R}}}^{4}\) can be chosen so that \(\widetilde{f} = f\) on \({{\mathbb {R}}}^{4} {\setminus } B_{1}(0)\) and

$$\begin{aligned} \Vert \tilde{f}\Vert _{H^{1}(B_{1}(0))} \lesssim \Vert f\Vert _{H^{1}(B_{2}(0) {\setminus } B_{1}(0))} \lesssim \Vert f\Vert _{\dot{H}^{1}({{\mathbb {R}}}^{4} {\setminus } B_{1}(0))} + \Vert \frac{1}{\vert x\vert } f\Vert _{L^{2}({{\mathbb {R}}}^{4} {\setminus } B_{1}(0))}. \end{aligned}$$

Using the localized Hardy inequality of Lemma 5.8, the second term on the right-hand side may be bounded by the first term. The desired localized Sobolev inequality now follows from the usual Sobolev embedding \(\dot{H}^{1} \subseteq L^{4}\). \(\square \)

Combining Lemmas 8.7 and 8.9, it follows that any sequence \((A^{(n)}, \phi ^{(n)})\) of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions to (MKG) constructed by Lemma 8.4 obeys the uniform lower bound (8.34) in a time-like region \(S^{1-\gamma _{1}}_{1}\) for some \(\gamma _{1} = \gamma _{1}(E) \in (0,1)\). The uniform lower bound in a time-like region can be propagated towards \(t = 0\) using the localized monotonicity formula in Proposition 5.5.

Lemma 8.10

Let \((A^{(n)}, \phi ^{(n)})\) be a sequence of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions to (MKG) satisfying the conclusions of Lemma 8.4. Assume furthermore that each \((A^{(n)}, \phi ^{(n)})\) obeys (8.34). Then there exist \(E_{2} = E_{2}(E) > 0\) and \(\gamma _{2} = \gamma _{2}(E) \in (0, 1)\) such that

$$\begin{aligned} \int _{S^{(1-\gamma _{2}) t}_{t}} {}^{(X_{0})} P_{T}[A^{(n)}, \phi ^{(n)}] \, \mathrm {d}x \ge E_{2}(E) \quad \hbox { for every } t \in \left[ \varepsilon _{n}^{\frac{1}{2}}, \varepsilon _{n}^{\frac{1}{4}}\right] .\quad \quad \quad \end{aligned}$$
(8.37)

Proof

Fix n and \(t_{0} \in [\varepsilon _{n}^{1/2}, \varepsilon _{n}^{1/4}]\). Applying Proposition 5.5 with \(\varepsilon = \varepsilon _{n}, \delta _{0} = (1-\gamma _{2}) t_{0}\) and \(\delta _{1} = M \delta _{0}\), where \(\gamma _{2} \in (0, 1)\) and \(M > 1\) will be chosen below, we obtain

$$\begin{aligned} \int _{S_{1}^{M (1-\gamma _{2}) t_{0}}} {}^{(X_{0})} P_{T}[A, \phi ] \, \mathrm {d}x\le & {} \int _{S_{t_{0}}^{(1-\gamma _{2})t_{0}}} {}^{(X_{0})} P_{T}[A, \phi ] \, \mathrm {d}x \nonumber \\&+\, C\Big ( (M (1-\gamma _{2}))^{\frac{1}{2}}+ \vert \log M\vert ^{-1} \Big ) E.\quad \quad \quad \quad \end{aligned}$$
(8.38)

On the other hand, by Lemma 5.10 (in particular, the expression for \({}^{(X_{0})} P_{T} = \frac{1}{2} ({}^{(X_{0})} P_{L} + {}^{(X_{0})} P_{\underline{L}})\)) and (8.34), we have

$$\begin{aligned} E_{1} \lesssim (1-\gamma _{1})^{-\frac{1}{2}} \int _{S_{1}^{1 - \gamma _{1}}} {}^{(X_{0})} P_{T}[A, \phi ] \, \mathrm {d}x. \end{aligned}$$

Hence choosing M sufficiently large and \(\gamma _{2}\) close enough to 1 to make the last term in (8.38) small, (8.37) follows with \(E_{2} = c E_{1} (1-\gamma _{1})^{\frac{1}{2}}\) for some \(c > 0\).

\(\square \)

8.4 Final rescaling

So far, under the assumption that Theorem 1.3 fails, we have shown the existence of a sequence of solutions \((A^{(n)}, \phi ^{(n)})\) that satisfies the conclusions of Lemma 8.4 and a uniform lower bound (8.37) in a time-like region. By Proposition 5.4, the sequence moreover obeys the uniform space-time bound

$$\begin{aligned} \iint _{C_{[\varepsilon _{n}, 1]}} \frac{1}{\rho _{\varepsilon _{n}}} \vert \iota _{X_{\varepsilon _{n}}} F^{(n)}\vert ^{2} + \frac{1}{\rho _{\varepsilon _{n}}} \left| \left( \mathbf{D}^{(n)}_{X_{\varepsilon _{n}}} + \frac{1}{\rho _{\varepsilon _{n}}}\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \lesssim E.\quad \quad \quad \end{aligned}$$
(8.39)

Our next goal is to upgrade (8.39) to asymptotic self-similarity by a rescaling argument.

Lemma 8.11

Suppose that Theorem 1.3 fails. Then there exists a sequence of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions \((A^{(n)}, \phi ^{(n)})\) on \([1, T_{n}] \times {{\mathbb {R}}}^{4}\) with \(T_{n} \rightarrow \infty \) satisfying the following properties:

  1. (1)

    Bounded energy in the cone

    $$\begin{aligned} {{\mathcal {E}}}_{S_{t}}\left[ A^{(n)}, \phi ^{(n)}\right] \le E, \quad \quad \hbox { for every } t \in [1, T_{n}], \end{aligned}$$
    (8.40)
  2. (2)

    Small energy outside the cone

    $$\begin{aligned} {{\mathcal {E}}}_{\left\{ t\right\} \times {{\mathbb {R}}}^{4} \setminus S_{t}}\left[ A^{(n)}, \phi ^{(n)}\right] \le \frac{1}{100} E \quad \quad \hbox { for every } t \in [1, T_{n}], \end{aligned}$$
    (8.41)
  3. (3)

    Nontrivial energy in a time-like region

    $$\begin{aligned} \int _{S^{(1-\gamma _{2})t}_{t}} {}^{(X_{0})} P_{T}\left[ A^{(n)}, \phi ^{(n)}\right] \, \mathrm {d}x \ge E_{2} \quad \hbox { for every } t \in [1, T_{n}],\quad \quad \quad \end{aligned}$$
    (8.42)
  4. (4)

    Asymptotic self-similarity

    $$\begin{aligned} \iint _{K} \vert \iota _{X_{0}} F^{(n)}\vert ^{2} + \left| \left( \mathbf{D}^{(n)}_{X_{0}} + \frac{1}{\rho }\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty \quad \quad \quad \end{aligned}$$
    (8.43)

    for every compact subset K of the interior of \(C_{[1, \infty )}\).

Proof

Let \((A^{(n)}, \phi ^{(n)})\) be a sequence of solutions satisfying the conclusions of Lemmas 8.4 and 8.10. Consider the time interval \([\varepsilon _{n}^{1/2}, \varepsilon _{n}^{1/4}]\), on which (8.37) applies. Given \(T_{n} > 1\), we partition this time interval into dyadic intervals of the form \(I_{n}^{j} = [T_{n}^{j} \varepsilon _{n}^{1/2}, T_{n}^{j+1} \varepsilon _{n}^{1/2}]\); there are roughly \(\vert \log \varepsilon _{n}\vert / \log T_{n}\) many such intervals. We choose \(T_{n}\) so that \(\log T_{n} \sim \vert \log \varepsilon _{n}\vert ^{1/2}\). Observe that \(T_{n} \rightarrow \infty \). Also, by the pigeonhole principle applied to (8.39), there exists j(n) such that

$$\begin{aligned}&\iint _{C_{I_{n}^{j(n)}}} \frac{1}{\rho _{\varepsilon _{n}}} \vert \iota _{X_{\varepsilon _{n}}} F^{(n)}\vert ^{2} + \frac{1}{\rho _{\varepsilon _{n}}} \left| \left( \mathbf{D}^{(n)}_{X_{\varepsilon _{n}}} + \frac{1}{\rho _{\varepsilon _{n}}}\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x\nonumber \\&\quad \lesssim \frac{\log T_{n}}{\vert \log \varepsilon _{n}\vert } E \sim \frac{1}{\vert \log \varepsilon _{n}\vert ^{1/2}} E, \end{aligned}$$
(8.44)

which decays to 0 as \(n \rightarrow \infty \).
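The pigeonhole step can be spelled out as follows. This is a sketch, in which \(J_{n}\) and \(e_{n}\) are notation introduced only here: \(J_{n}\) denotes the number of intervals \(I_{n}^{j}\) (\(j \ge 0\)) contained in \([\varepsilon _{n}^{1/2}, \varepsilon _{n}^{1/4}]\), and \(e_{n} \ge 0\) denotes the integrand in (8.39). Since \(I_{n}^{j} \subseteq [\varepsilon _{n}^{1/2}, \varepsilon _{n}^{1/4}]\) exactly when \((j+1) \log T_{n} \le \frac{1}{4} \vert \log \varepsilon _{n}\vert \), we have \(J_{n} \sim \vert \log \varepsilon _{n}\vert / \log T_{n}\). As the intervals have disjoint interiors and lie in \([\varepsilon _{n}, 1]\),

$$\begin{aligned} J_{n} \min _{0 \le j < J_{n}} \iint _{C_{I_{n}^{j}}} e_{n} \, \mathrm {d}t \mathrm {d}x \le \sum _{j=0}^{J_{n}-1} \iint _{C_{I_{n}^{j}}} e_{n} \, \mathrm {d}t \mathrm {d}x \le \iint _{C_{[\varepsilon _{n}, 1]}} e_{n} \, \mathrm {d}t \mathrm {d}x \lesssim E. \end{aligned}$$

Dividing by \(J_{n}\) gives (8.44) for a minimizing index \(j(n)\). The choice \(\log T_{n} \sim \vert \log \varepsilon _{n}\vert ^{1/2}\) balances the two requirements \(T_{n} \rightarrow \infty \) and \(J_{n} \rightarrow \infty \).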

We now rescale \(C_{I_{n}^{j(n)}}\) to \(C_{[1, T_{n}]}\); abusing the notation a bit (but conforming to the statement of the lemma), we denote the rescaled solutions again by \((A^{(n)}, \phi ^{(n)})\). From (8.9) and (8.10) with \(\varepsilon ^{8} \le \frac{1}{100}\), (8.40) and (8.41) follow. Also, (8.42) is a consequence of (8.37). Furthermore, (8.44) implies

$$\begin{aligned} \iint _{C_{[1, T_{n}]} } \frac{1}{\rho _{\varepsilon '_{n}}} \vert \iota _{X_{\varepsilon '_{n}}} F^{(n)}\vert ^{2} + \frac{1}{\rho _{\varepsilon '_{n}}} \left| \left( \mathbf{D}^{(n)}_{X_{\varepsilon '_{n}}} + \frac{1}{\rho _{\varepsilon '_{n}}}\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty \nonumber \\ \end{aligned}$$
(8.45)

where \(\varepsilon '_{n} := (T_{n}^{j(n)} \varepsilon _{n}^{1/2})^{-1} \varepsilon _{n}\) obeys \(\varepsilon '_{n} \le \varepsilon _{n}^{1/2} \rightarrow 0\). For any compact subset K of the interior of \(C_{[1, \infty )}\), which is in particular situated away from the boundary \(\partial C_{[1, \infty )}\), we claim that

$$\begin{aligned}&\iint _{K} \left( \frac{1}{\rho _{\varepsilon '_{n}}} \vert \iota _{X_{\varepsilon '_{n}}} F^{(n)}\vert ^{2} - \frac{1}{\rho } \vert \iota _{X_{0}} F^{(n)}\vert ^{2} \right) \\&\quad + \left( \frac{1}{\rho _{\varepsilon '_{n}}} \left| \left( \mathbf{D}^{(n)}_{X_{\varepsilon '_{n}}} + \frac{1}{\rho _{\varepsilon '_{n}}}\right) \phi ^{(n)}\right| ^{2} - \frac{1}{\rho } \left| \left( \mathbf{D}^{(n)}_{X_{0}} + \frac{1}{\rho }\right) \phi ^{(n)}\right| ^{2} \right) \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty . \end{aligned}$$

Indeed, in the coordinates \((x^{0} = t, x^{1}, \ldots , x^{4})\), the left-hand side can be written in the form

$$\begin{aligned} \iint _{K} d_{1}^{(n) \mu \nu } \vert F^{(n)}_{\mu \nu }\vert ^{2} + d_{2}^{(n) \mu } \vert \mathbf{D}^{(n)}_{\mu } \phi ^{(n)}\vert ^{2} + d_{3}^{(n)} \vert \phi ^{(n)}\vert ^{2} \, \mathrm {d}t \mathrm {d}x \end{aligned}$$

where \(d_{1}^{(n) \mu \nu }(t,x), d_{2}^{(n) \mu }(t,x)\) and \(d_{3}^{(n)}\) are continuous functions which tend to 0 pointwise (hence uniformly) on K, whereas \(\vert F^{(n)}_{\mu \nu }\vert , \vert \mathbf{D}^{(n)}_{\mu } \phi ^{(n)}\vert \) and \(\vert \phi ^{(n)}\vert \) are uniformly bounded in \(L^{2}(K)\) by (8.40), (8.41) and Hardy’s inequality. By Hölder’s inequality, the claim follows. Then combining the claim with (8.45), we arrive at the desired asymptotic self-similarity (8.43). \(\square \)

8.5 Concentration scales

Let \((A^{(n)}, \phi ^{(n)})\) be a sequence of solutions given by Lemma 8.11. We now present a combinatorial result that establishes the following dichotomy: Either there is a uniform non-concentration of energy, or we can identify a sequence of points and decreasing scales at which energy concentrates.

To state the result, we need a few definitions. For each \(j = 1, 2, \ldots \) we define

$$\begin{aligned} C_{j} :=&\left\{ (t,x) \in C^{1}_{[1, \infty )} : 2^{j} \le t < 2^{j+1}\right\} , \\ \widetilde{C}_{j} :=&\left\{ (t,x) \in C^{1/2}_{[1/2, \infty )} : 2^{j} \le t < 2^{j+1}\right\} . \end{aligned}$$

In words, \(C_{j}\) [resp. \(\widetilde{C}_{j}\)] is the set of points in the truncated cone \(C_{[2^{j}, 2^{j+1})}\) at distance \(\ge 1\) [resp. \(\ge 1/2\)] from the lateral boundary. For each \(j \ge 1\), we have the following lemma.

Lemma 8.12

Let \((A^{(n)}, \phi ^{(n)})\) be a sequence of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions on \([1, T_{n}] \times {{\mathbb {R}}}^{4}\) with \(T_{n} \rightarrow \infty \) satisfying (8.40)–(8.43) for some \(E > 0\). Let \(\epsilon _{0}\) be as in Proposition 6.1. Then for each \(j = 1, 2, \cdots \), after passing to a subsequence, one of the following alternatives holds:

  1. (1)

    Concentration of energy. There exist points \((t_{n}, x_{n}) \in \widetilde{C}_{j}\), scales \(r_{n} \rightarrow 0\) and \(0 < r = r(j) < 1/4\) such that the following bounds hold:

    $$\begin{aligned}&{{\mathcal {E}}}_{\left\{ t_{n}\right\} \times B_{r_{n}}(x_{n})}\left[ A^{(n)}, \phi ^{(n)}\right] = \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}, \end{aligned}$$
    (8.46)
    $$\begin{aligned}&\sup _{x \in B_{r}(x_{n})} {{\mathcal {E}}}_{\left\{ t_{n}\right\} \times B_{r_{n}}(x)}\left[ A^{(n)}, \phi ^{(n)}\right] \le \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}, \end{aligned}$$
    (8.47)
    $$\begin{aligned}&\quad \frac{1}{4 r_{n}} \int _{t_{n}-2 r_{n}}^{t_{n}+2 r_{n}} \int _{B_{r}(x_{n})} \vert \iota _{X_{0}} F^{(n)}\vert ^{2} \nonumber \\&\quad + \left| \left( \mathbf{D}^{(n)}_{X_{0}} + \frac{1}{\rho }\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty . \end{aligned}$$
    (8.48)
  2. (2)

    Uniform non-concentration of energy. There exists \(0 < r = r(j) < 1/4\) such that the following bounds hold:

    $$\begin{aligned}&\int _{S^{(1-\gamma _{2})t}_{t}} {}^{(X_{0})} P_{T}\left[ A^{(n)}, \phi ^{(n)}\right] \, \mathrm {d}x \ge E_{2} \quad \hbox { for } t \in [2^{j}, 2^{j+1}), \end{aligned}$$
    (8.49)
    $$\begin{aligned}&\quad \sup _{(t, x) \in C_{j}} {{\mathcal {E}}}_{\left\{ t\right\} \times B_{r}(x)} \left[ A^{(n)}, \phi ^{(n)}\right] \le \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}, \end{aligned}$$
    (8.50)
    $$\begin{aligned}&\iint _{\widetilde{C}_{j}} \vert \iota _{X_{0}} F^{(n)}\vert ^{2} + \left| \left( \mathbf{D}^{(n)}_{X_{0}} + \frac{1}{\rho }\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow \ 0 \quad \hbox { as } n \rightarrow \infty . \end{aligned}$$
    (8.51)

Here \(C_{0} > 0\) is a universal constant much larger than the implicit constants in Lemma 4.5.

Proof

This lemma is essentially [33, Lemma 6.3]; for completeness we give a self-contained alternative proof, which relies on the use of the Hardy-Littlewood maximal function theorem to establish (8.48).

Step 1  Fix \(j \in \left\{ 1, 2, \ldots \right\} \). We begin by identifying a ‘low energy barrier’ around \(C_{j}\) inside \(\widetilde{C}_{j}\). Let \(N > 0\) be a large integer to be determined later. We first partition the time interval \([2^{j}, 2^{j+1})\) into smaller intervals \(I_{k}\), where

$$\begin{aligned} I_{k} := \bigg [2^{j} + \frac{k-1}{10 N}, 2^{j} + \frac{k}{10 N }\bigg ) \quad k = 1, \ldots , 10 N 2^{j}. \end{aligned}$$

Accordingly, define \(C_{j}^{k} := C_{j} \cap (I_{k} \times {{\mathbb {R}}}^{4})\) and \(\widetilde{C}_{j}^{k} := \widetilde{C}_{j} \cap (I_{k} \times {{\mathbb {R}}}^{4})\). Next, we partition \(\widetilde{C}_{j}^{k} {\setminus } C_{j}^{k}\) into \(\cup _{\ell =1}^{N} \widetilde{C}^{k, \ell }_{j}\), where

$$\begin{aligned} \widetilde{C}^{k, \ell }_{j} = \left\{ (t,x) \in \widetilde{C}_{j}^{k} : \frac{1}{2} + \frac{\ell - 1}{2N} \le t - \vert x\vert < \frac{1}{2} + \frac{\ell }{2N}\right\} , \quad \ell = 1, \ldots , N. \end{aligned}$$

For each n and k, we claim that there exists \(1 \le \ell (n, k) \le N\) such that

$$\begin{aligned} \sup _{t \in I_{k}} \, {{\mathcal {E}}}_{S_{t} \cap \widetilde{C}_{j}^{k, \ell (n, k)}}\left[ A^{(n)}, \phi ^{(n)}\right] \le \frac{3}{N} E. \end{aligned}$$
(8.52)

Indeed, for each k consider the left endpoint \(\underline{t}_{k} := 2^{j} + (k-1)/(10N)\). The set \(S_{\underline{t}_{k}} \cap (\widetilde{C}^{k}_{j} {\setminus } C^{k}_{j})\) is partitioned into N annuli of the form \(S_{\underline{t}_{k}} \cap \widetilde{C}^{k, \ell }_{j}\). By the pigeonhole principle and the energy bound (8.40), there exists \(1 \le \ell (n, k) \le N-2\) such that

$$\begin{aligned} \sum _{\ell = \ell (n, k)}^{\ell (n, k) + 2} {{\mathcal {E}}}_{S_{\underline{t}_{k}} \cap \widetilde{C}_{j}^{k, \ell }} \left[ A^{(n)}, \phi ^{(n)}\right] \le \frac{3}{N} E. \end{aligned}$$

As \(\widetilde{C}_{j}^{k, \ell (n, k)}\) lies in the domain of dependence of \(\cup _{\ell = \ell (n, k)}^{\ell (n,k)+2} S_{\underline{t}_{k}} \cap \widetilde{C}_{j}^{k, \ell }\), (8.52) now follows by the local conservation of energy.
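The pigeonhole step above can be sketched as follows, assuming for simplicity that N is divisible by 3 (an assumption made only for this sketch) and using that the energy density is non-negative and additive over disjoint sets. Grouping the N annuli into the N/3 disjoint triples \(\left\{ 3i+1, 3i+2, 3i+3\right\} \), \(i = 0, \ldots , N/3 - 1\), we have

$$\begin{aligned} \frac{N}{3} \min _{0 \le i < N/3} \sum _{\ell = 3i+1}^{3i+3} {{\mathcal {E}}}_{S_{\underline{t}_{k}} \cap \widetilde{C}_{j}^{k, \ell }} \left[ A^{(n)}, \phi ^{(n)}\right] \le \sum _{\ell = 1}^{N} {{\mathcal {E}}}_{S_{\underline{t}_{k}} \cap \widetilde{C}_{j}^{k, \ell }} \left[ A^{(n)}, \phi ^{(n)}\right] \le {{\mathcal {E}}}_{S_{\underline{t}_{k}}} \left[ A^{(n)}, \phi ^{(n)}\right] \le E, \end{aligned}$$

so some triple carries energy at most \(\frac{3}{N} E\). One may then take \(\ell (n, k) = 3i+1\) for a minimizing i.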

We choose N large enough so that

$$\begin{aligned} \frac{3}{N} E \le \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}. \end{aligned}$$

Hence, by (8.52), \(\widetilde{C}_{j}^{k, \ell (n, k)}\) serves as a ‘low energy barrier’ that separates the behavior of the solution in the interior \(\widetilde{C}^{k, < \ell (n, k)}_{j} := (\cup _{\ell = 1}^{\ell (n,k)-1} \widetilde{C}_{j}^{k, \ell }) \cup C_{j}^{k}\) from the outside. Fix \(r_{0} = \frac{1}{4N}\), so that \(0 < r_{0} < 1/4\) and

$$\begin{aligned} (t, x) \in \widetilde{C}_{j}^{k, < \ell (n, k)} \Rightarrow \left\{ t\right\} \times B_{4 r_{0}}(x) \subseteq \widetilde{C}_{j}^{k, < \ell (n, k)} \cup \widetilde{C}_{j}^{k, \ell (n, k)} \subseteq C_{[1/2, \infty )}^{1/2}.\nonumber \\ \end{aligned}$$
(8.53)

Step 2  For each n and k, define \(f_{n, k} : [0, r_{0}] \times I_{k} \rightarrow [0, \infty )\) by

$$\begin{aligned} f_{n, k}(r, t) := \sup \left\{ {{\mathcal {E}}}_{\left\{ t\right\} \times B_{r}(x)}[A^{(n)}, \phi ^{(n)}] : (t, x) \in \widetilde{C}_{j}^{k, < \ell (n, k)}\right\} . \end{aligned}$$

We then define the lowest energy concentration scale \(r_{n, k}(t)\) as

$$\begin{aligned} r_{n, k}(t) := \left\{ \begin{array}{ll} \inf \left\{ r \in [0, r_{0}] : f_{n, k}(r, t) \ge \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}\right\} &{} \hbox {if } f_{n, k}(r_{0}, t) \ge \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}, \\ r_{0} &{} \hbox {otherwise}. \end{array} \right. \nonumber \\ \end{aligned}$$
(8.54)

We claim that each \(r_{n, k}\) is Lipschitz continuous with a universal constant \(c_{L} > 0\), i.e.,

$$\begin{aligned} \vert r_{n, k}(t_{1}) - r_{n,k}(t_{0})\vert \le c_{L} \vert t_{1} - t_{0}\vert \quad \hbox { for } t_{0}, t_{1} \in I_{k}. \end{aligned}$$

The key idea is to use the finite speed of propagation, or equivalently, local conservation of energy. Let \(t_{0}, t_{1} \in I_{k}\) with \(t_{0} < t_{1}\). Consider first the case when \(r_{n, k}(t_{0}) \ge r_{n,k}(t_{1})\). For convenience, we introduce the shorthand \(\overline{r} := r_{n, k}(t_{1})\). If \(\overline{r} = r_{0}\), then necessarily \(r_{n, k}(t_{0}) = r_{0}\) and the claimed Lipschitz bound holds trivially. If \(\overline{r} < r_{0}\), then there exists \(x_{1} \in {{\mathbb {R}}}^{4}\) such that \((t_{1}, x_{1}) \in \widetilde{C}_{j}^{k, < \ell (n, k)}\) and \({{\mathcal {E}}}_{\left\{ t_{1}\right\} \times B_{\overline{r}}(x_{1})} [A^{(n)}, \phi ^{(n)}] = \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}\). By local conservation of energy, it follows that

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t_{0}\right\} \times B_{\overline{r} + (t_{1} - t_{0})}(x_{1})}\left[ A^{(n)}, \phi ^{(n)}\right] \ge {{\mathcal {E}}}_{\left\{ t_{1}\right\} \times B_{\overline{r}}(x_{1})} \left[ A^{(n)}, \phi ^{(n)}\right] = \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}. \end{aligned}$$

If \((t_{0}, x_{1}) \in \widetilde{C}_{j}^{k, < \ell (n, k)}\), then \(r_{n, k}(t_{0}) \le \overline{r} + (t_{1} - t_{0})\). If \((t_{0}, x_{1}) \not \in \widetilde{C}_{j}^{k, < \ell (n, k)}\), then by elementary geometry there exists \((t_{0}, x_{0}) \in \widetilde{C}_{j}^{k, < \ell (n, k)}\) such that \(\vert x_{1} - x_{0}\vert < t_{1} - t_{0}\). Hence the energy of \((A^{(n)}, \phi ^{(n)})\) on \(\left\{ t_{0}\right\} \times B_{\overline{r} + 2(t_{1} - t_{0})}(x_{0})\) is bounded from below by \(\frac{1}{C_{0}^{2}} \epsilon _{0}^{2}\), which implies \(r_{n, k}(t_{0}) \le \overline{r} + 2(t_{1} - t_{0})\) in general. Treating the other case \(r_{n, k}(t_{0}) < r_{n,k}(t_{1})\) in a similar way, it follows that the claimed Lipschitz bound holds with \(c_{L} = 2\).
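
The radius bookkeeping in the second case can be spelled out as follows; this is a sketch of the triangle-inequality step only, with the shorthand \(\delta := t_{1} - t_{0}\) introduced for this remark. If \(\vert x_{1} - x_{0}\vert < \delta \), then every \(y \in B_{\overline{r} + \delta }(x_{1})\) satisfies \(\vert y - x_{0}\vert \le \vert y - x_{1}\vert + \vert x_{1} - x_{0}\vert < \overline{r} + 2 \delta \), so that \(B_{\overline{r} + \delta }(x_{1}) \subseteq B_{\overline{r} + 2 \delta }(x_{0})\). Since the energy density is nonnegative, we obtain

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t_{0}\right\} \times B_{\overline{r} + 2 \delta }(x_{0})}\left[ A^{(n)}, \phi ^{(n)}\right] \ge {{\mathcal {E}}}_{\left\{ t_{0}\right\} \times B_{\overline{r} + \delta }(x_{1})}\left[ A^{(n)}, \phi ^{(n)}\right] \ge \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}. \end{aligned}$$

If \(\overline{r} + 2 \delta \le r_{0}\), this gives \(r_{n, k}(t_{0}) \le \overline{r} + 2 \delta \) by the definition (8.54); if \(\overline{r} + 2 \delta > r_{0}\), the same bound is trivial because \(r_{n, k} \le r_{0}\).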

We now proceed to the proof of the lemma. We first treat the case when there exists a common lower bound \(0 < r(j) \le r_{0}\) of \(r_{n,k}\), i.e., \(r_{n, k}(t) \ge r(j)\) for all n, k and \(t \in I_{k}\). Unraveling the definition of \(r_{n, k}\), we see that (8.50) holds. Moreover, (8.49) and (8.51) follow directly from (8.42) and (8.43), respectively. Thus we conclude that the second scenario (uniform non-concentration of energy) holds.

To complete the proof, it only remains to consider the alternative case and show that the first scenario (concentration of energy) holds. After passing to a subsequence, we may assume that there exists \(k \in \left\{ 1, \ldots , 10 N 2^{j}\right\} \) such that

$$\begin{aligned} \lim _{n \rightarrow \infty } \inf _{I_{k}} r_{n, k} = 0. \end{aligned}$$
(8.55)

Then we claim that there exist \((t_{n}, x_{n})\) and \(r_{n}\) such that (8.46)–(8.48) hold with \(r(j) = r_{0}\), up to passing to a subsequence.

Define

$$\begin{aligned} \alpha _{n}^{2}:= & {} \int _{2^{j-1}}^{2^{j+2}} \beta _{n}^{2}(t) \, \mathrm {d}t, \quad \\ \beta _{n}^{2}(t):= & {} \int _{S_{t} \cap C^{1/2}_{[1/2,\infty )}} \vert \iota _{X_{0}} F^{(n)}\vert ^{2} + \left| \left( \mathbf{D}^{(n)}_{X_{0}} + \frac{1}{\rho }\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}x. \end{aligned}$$

Note that \(\alpha _{n}^{2} \rightarrow 0\) by (8.43). By the Hardy-Littlewood maximal function theorem, for every \(\alpha > 0\) we have

$$\begin{aligned} \left| \left\{ t \in [2^{j-1}, 2^{j+2}) : M \left[ \beta _{n}^{2}\right] (t) > \alpha \right\} \right| \lesssim \frac{1}{\alpha } \alpha _{n}^{2}, \end{aligned}$$
(8.56)

where \(M[\beta _{n}^{2}](t)\) is the Hardy-Littlewood maximal function of \(\beta _{n}^{2}\) on \([2^{j-1}, 2^{j+2})\), given by

$$\begin{aligned} M[\beta _{n}^{2}] (t):= \sup _{a > 0} \frac{1}{2a}\int _{(t-a, t+a) \cap [2^{j-1}, 2^{j+2})} \beta ^{2}_{n}(t') \, \mathrm {d}t'. \end{aligned}$$

Roughly speaking, (8.56) says that the desired conclusion (8.48) holds for ‘most’ \(t \in I_{k}\). This fact, combined with the flexibility in the choice of \(t_{n}\) such that \(\lim _{n \rightarrow \infty } r_{n, k}(t_{n}) = 0\), will lead to the desired conclusions (8.46)–(8.48).

More precisely, define the sets \(J_{n}, K_{n} \subseteq I_{k}\) by

$$\begin{aligned} J_{n} := \left\{ t \in I_{k} : M[\beta _{n}^{2}](t) \le \alpha _{n}\right\} , \quad K_{n} := (\overline{t}_{n} - \alpha _{n}^{1/2}, \overline{t}_{n} + \alpha _{n}^{1/2}) \cap I_{k}, \end{aligned}$$

where \(\overline{t}_{n} \in I_{k}\) is a minimizer of \(r_{n, k}\), i.e., \(r_{n, k}(\overline{t}_{n}) = \inf _{I_{k}} r_{n, k}\). By (8.55), the uniform Lipschitz continuity of \(r_{n,k}\) and the fact that \(\alpha _{n}^{2} \rightarrow 0\) as \(n \rightarrow \infty \), we have

$$\begin{aligned} \sup _{t \in K_{n}} r_{n, k}(t) \rightarrow 0 \quad \hbox {as } n \rightarrow \infty . \end{aligned}$$
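
The convergence above is a one-line consequence of the Lipschitz bound with \(c_{L} = 2\); the following sketch only unpacks it. For \(t \in K_{n}\) we have \(\vert t - \overline{t}_{n}\vert < \alpha _{n}^{1/2}\), and therefore

$$\begin{aligned} r_{n, k}(t) \le r_{n, k}(\overline{t}_{n}) + 2 \vert t - \overline{t}_{n}\vert < \inf _{I_{k}} r_{n, k} + 2 \alpha _{n}^{1/2}. \end{aligned}$$

The first term on the right tends to zero by (8.55), and the second because \(\alpha _{n} \rightarrow 0\).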

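Note that \(\vert I_{k} {\setminus } J_{n}\vert \lesssim \alpha _{n}\) by (8.56) with \(\alpha = \alpha _{n}\), whereas \(\vert K_{n}\vert \ge \alpha _{n}^{1/2}\) as soon as \(\alpha _{n}^{1/2} \le \vert I_{k}\vert \) (with \(\vert K_{n}\vert = 2 \alpha _{n}^{1/2}\) when \(K_{n}\) is not cut off by the endpoints of \(I_{k}\)). Since \(\alpha _{n}\) is much smaller than \(\alpha _{n}^{1/2}\) as \(\alpha _{n} \rightarrow 0\), the set \(K_{n}\) cannot be contained in \(I_{k} {\setminus } J_{n}\) for large n. Passing to a subsequence, it follows that \(J_{n} \cap K_{n} \ne \emptyset \) for all n. Choosing \(t_{n}\) so that \(t_{n} \in J_{n} \cap K_{n}\) and \(r_{n} := r_{n, k}(t_{n})\), we have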

$$\begin{aligned} \sup _{a > 0} \frac{1}{2a}\int _{t_{n}-a}^{t_{n}+a} \beta _{n}^{2}(t) \, \mathrm {d}t \rightarrow 0, \quad r_{n} = r_{n, k}(t_{n}) \rightarrow 0 \quad \hbox { as } n \rightarrow \infty . \end{aligned}$$

With the choice \(r(j) = r_{0}\), (8.48) follows from (8.53) and the previous statement. Passing to a subsequence if necessary, we may assume that \(r_{n, k}(t_{n}) < r_{0}\); then there exists \((t_{n}, x_{n}) \in \widetilde{C}_{j}^{k, <\ell (n, k)}\) such that (8.46) holds for all n as well. Finally, thanks to the low energy barrier (8.53) and the definition of \(r_{n,k}\), (8.47) follows with \(r(j) = r_{0}\). \(\square \)

8.6 Compactness/rigidity argument

We are now ready to complete the proof of Theorem 1.3, by using the tools developed in Sects. 6 and 7.

Completion of proof of Theorem 1.3

Let \((A^{(n)}, \phi ^{(n)})\) be a sequence of admissible \(C_{t} {{\mathcal {H}}}^{1}\) solutions on \([1, T_{n}] \times {{\mathbb {R}}}^{4}\) given by Lemma 8.11. We consider two cases according to Lemma 8.12, and show that both lead to contradictions.

Case 1 Suppose that there exists \(j \in \left\{ 1, 2, \ldots \right\} \) such that the first scenario (concentration of energy) in Lemma 8.12 holds. We need to set things up so that we can use Proposition 6.1, and for that we also need local control of the \(L^2\) norm of \(\phi \). This is achieved via the improved form of Hardy’s inequality in Lemma 4.5. From (8.47), we obtain

$$\begin{aligned} (\sigma ^{-1} r_n)^{-2} \Vert \phi ^{(n)}(t_n) \Vert _{L^2_{x}\left( B_{8 \sigma ^{-1} r_n}\right) }^2 \le c \epsilon _0^2 + C \sigma ^{-2} E, \end{aligned}$$

with a universal constant \(c \ll 1\) and a parameter \(\sigma \ge 2\) to be specified. To eliminate the second term, we choose \(\sigma \) so that

$$\begin{aligned} C \sigma ^{-2} E = c \epsilon _0^2. \end{aligned}$$

Thus we have ensured that the hypotheses of Proposition 6.1 are satisfied with respect to the rescaled ball \(B_{\sigma ^{-1} r_{n}}(x)\) with x as in (8.47), i.e.,

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t_{n}\right\} \times B_{8 \sigma ^{-1} r_{n}}(x)}\left[ A^{(n)}, \phi ^{(n)}\right] +\left( \sigma ^{-1} r_n\right) ^{-2} \Vert \phi ^{(n)}(t_n) \Vert _{L^2_{x}(B_{8\sigma ^{-1} r_n}(x))}^2 \le \epsilon _0^2\nonumber \\ \end{aligned}$$
(8.57)

for every \(x \in B_{r(j)}(x_{n})\).
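
For concreteness, the choice of \(\sigma \) can be written out explicitly. This is a sketch with c and C as in the Hardy-type bound above; the smallness condition on \(\epsilon _{0}\) stated below is an assumption of this remark. Solving \(C \sigma ^{-2} E = c \epsilon _{0}^{2}\) gives

$$\begin{aligned} \sigma = \left( \frac{C E}{c \epsilon _{0}^{2}}\right) ^{1/2}, \quad \hbox {and} \quad \sigma \ge 2 \iff \epsilon _{0}^{2} \le \frac{C E}{4 c}. \end{aligned}$$

With this choice the right-hand side of the \(L^{2}\) bound becomes \(c \epsilon _{0}^{2} + C \sigma ^{-2} E = 2 c \epsilon _{0}^{2}\), which is small compared to \(\epsilon _{0}^{2}\) since \(c \ll 1\).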

As \(\widetilde{C}_{j}\) is pre-compact, we may assume that \((t_{n}, x_{n})\) has a limit \((t_{0}, x_{0})\) in the closure of \(\widetilde{C}_{j}\) after passing to a subsequence. Consider the sequence

$$\begin{aligned} (\widetilde{A}^{(n)}, \widetilde{\phi }^{(n)})(t,x) := \sigma ^{-1} r_{n} (A^{(n)}, \phi ^{(n)}) \left( \sigma ^{-1} r_{n} t + t_{n}, \sigma ^{-1} r_{n} x + x_{n}\right) . \end{aligned}$$

By (8.46), there is always a nontrivial amount of energy at the origin, i.e.,

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ 0\right\} \times B_{\sigma }(0)}\left[ \widetilde{A}^{(n)}, \widetilde{\phi }^{(n)}\right] = \frac{1}{C_{0}^{2}} \epsilon _{0}^{2}. \end{aligned}$$
(8.58)

Fix any \(x \in {{\mathbb {R}}}^{4}\). As \(r_{n} \rightarrow 0\), observe that the point \(\sigma ^{-1} r_{n} x + x_{n}\) belongs to \(B_{r(j)}(x_{n})\) for sufficiently large n. Hence, by (8.57), we have

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ 0\right\} \times B_{8}(x)}\left[ \widetilde{A}^{(n)}, \widetilde{\phi }^{(n)}\right] + \Vert \widetilde{\phi }^{(n)}(0)\Vert _{L^2_{x}( B_{8}(x))}^{2} \le \epsilon _{0}^{2} \quad \hbox { for sufficiently large } n.\nonumber \\ \end{aligned}$$
(8.59)
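
The passage from (8.57) to (8.59) is a change of variables. The following sketch records it, with the shorthand \(\lambda _{n} := \sigma ^{-1} r_{n}\) and a generic radius \(R > 0\) introduced only for this remark. Since \(\widetilde{F}^{(n)}\) and \(\widetilde{\mathbf{D}}^{(n)} \widetilde{\phi }^{(n)}\) each carry a factor \(\lambda _{n}^{2}\), the energy density scales by \(\lambda _{n}^{4}\), which is compensated by the Jacobian \(\lambda _{n}^{-4}\) of \(y \mapsto \lambda _{n} y + x_{n}\) on \({{\mathbb {R}}}^{4}\). Hence

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ 0\right\} \times B_{R}(x)}\left[ \widetilde{A}^{(n)}, \widetilde{\phi }^{(n)}\right]&= {{\mathcal {E}}}_{\left\{ t_{n}\right\} \times B_{\lambda _{n} R}(\lambda _{n} x + x_{n})}\left[ A^{(n)}, \phi ^{(n)}\right] , \\ \Vert \widetilde{\phi }^{(n)}(0)\Vert _{L^{2}_{x}(B_{R}(x))}^{2}&= \lambda _{n}^{2} \int _{B_{R}(x)} \vert \phi ^{(n)}(t_{n}, \lambda _{n} y + x_{n})\vert ^{2} \, \mathrm {d}y = \lambda _{n}^{-2} \Vert \phi ^{(n)}(t_{n})\Vert _{L^{2}_{x}(B_{\lambda _{n} R}(\lambda _{n} x + x_{n}))}^{2}. \end{aligned}$$

Taking \(R = 8\) turns (8.57), applied at the point \(\lambda _{n} x + x_{n}\), into (8.59). Taking \(R = \sigma \) and \(x = 0\) identifies the left-hand side of (8.58) with the energy on \(\left\{ t_{n}\right\} \times B_{r_{n}}(x_{n})\).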

Finally, by (8.48), the convergence \((t_{n}, x_{n}) \rightarrow (t_{0}, x_{0})\) and smoothness of \(X_{0}\), it follows that

$$\begin{aligned} \iint _{(-2, 2) \times B_{2}(x)} \vert \iota _{Y} \widetilde{F}^{(n)}\vert ^{2} + \vert \widetilde{\mathbf{D}}^{(n)}_{Y} \widetilde{\phi }^{(n)}\vert ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty .\quad \quad \end{aligned}$$
(8.60)

where \(Y = X_{0}(t_{0}, x_{0})\) is a constant time-like vector field. Note that the contribution of the term \(\frac{1}{\rho } \phi ^{(n)}\) drops out by scaling.
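
To see why the \(\frac{1}{\rho }\) term is negligible, one can track the powers of the scaling factor. The following is a heuristic sketch: the notation \(\lambda _{n} := \sigma ^{-1} r_{n}\), \(\Phi _{n}(t, x) := (\lambda _{n} t + t_{n}, \lambda _{n} x + x_{n})\), \(\widetilde{X}_{0}^{(n)} := X_{0} \circ \Phi _{n}\) and \(\widetilde{\rho }_{n} := \rho \circ \Phi _{n}\) is introduced only here. Since \(\widetilde{\mathbf{D}}^{(n)}_{\mu } \widetilde{\phi }^{(n)} = \lambda _{n}^{2} (\mathbf{D}^{(n)}_{\mu } \phi ^{(n)}) \circ \Phi _{n}\) and \(\widetilde{\phi }^{(n)} = \lambda _{n} \phi ^{(n)} \circ \Phi _{n}\), we have

$$\begin{aligned} \widetilde{\mathbf{D}}^{(n)}_{\widetilde{X}_{0}^{(n)}} \widetilde{\phi }^{(n)} + \frac{\lambda _{n}}{\widetilde{\rho }_{n}} \widetilde{\phi }^{(n)} = \lambda _{n}^{2} \left[ \left( \mathbf{D}^{(n)}_{X_{0}} + \frac{1}{\rho }\right) \phi ^{(n)}\right] \circ \Phi _{n}. \end{aligned}$$

The zeroth order term thus appears with the extra factor \(\lambda _{n} \rightarrow 0\). Its \(L^{2}\) norm on \((-2, 2) \times B_{2}(x)\) tends to zero provided \(\rho \) is bounded from below on the relevant region and the local \(L^{2}\) norm of \(\widetilde{\phi }^{(n)}\) stays bounded there; the latter is known at \(t = 0\) from (8.59), and is taken here as an assumption for the other times.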

As a consequence, for each \(x \in {{\mathbb {R}}}^{4}\) we can apply Proposition 6.1 to obtain a weak solution \((A_{[x]}, \phi _{[x]}) \in {{\mathcal {X}}}^{w}((-1, 1) \times B_{1}(x))\) to (MKG) such that

$$\begin{aligned} \iota _{Y} F_{[x]} = 0, \quad \mathbf{D}_{[x] Y} \phi _{[x]} = 0, \end{aligned}$$

and \((\widetilde{A}^{(n)}, \widetilde{\phi }^{(n)})\) converges to \((A_{[x]}, \phi _{[x]})\) up to gauge transformations on \((-1, 1) \times B_{1}(x)\) as in (6.3), (6.4). By Lemma 6.15, the weak solutions \((A_{[x]}, \phi _{[x]})\) form weak compatible pairs (as in Definition 6.13) on the open cover \(\left\{ (-1, 1) \times B_{1}(x)\right\} _{x \in (1/2){{\mathbb {Z}}}^{4}}\) of \((-1, 1) \times {{\mathbb {R}}}^{4}\). Furthermore, by Proposition 7.3, there exists an equivalent set of smooth compatible pairs \((A_{[\alpha ]}, \phi _{[\alpha ]})\) on some refined open cover \({\mathcal {Q}}= \left\{ Q_{\alpha }\right\} \) of \((-1, 1) \times {{\mathbb {R}}}^{4}\).

Let \((A, \phi )\) be a global smooth pair on \((-1, 1) \times {{\mathbb {R}}}^{4}\) equivalent to \((A_{[\alpha ]}, \phi _{[\alpha ]})\). We then extend \((A, \phi )\) to \({{\mathbb {R}}}^{1+4}\) as a smooth solution to (MKG) satisfying \(\iota _{Y} F = 0\) and \(\mathbf{D}_{Y} \phi = 0\) by pulling back along the flow of Y.

Note that \((A, \phi )\) has finite energy (in fact, bounded by E), as we have

$$\begin{aligned} {}^{(T)} J_{T}\left[ \widetilde{A}^{(n)}, \widetilde{\phi }^{(n)}\right] \rightarrow {}^{(T)} J_{T}[A, \phi ] \quad \hbox { locally in } L^{1}_{t,x} \hbox { on } (-1, 1) \times {{\mathbb {R}}}^{4}\nonumber \\ \end{aligned}$$
(8.61)

by (6.4) and the gauge invariance of the energy density \({}^{(T)} J_{T}\). After applying a suitable Lorentz transformation, we may furthermore assume that \(Y = T\). By Proposition 7.1, it follows that \({{\mathcal {E}}}[A, \phi ] = 0\), but this contradicts (8.58) and (8.61).

Case 2  Suppose that for every \(j \in \left\{ 1, 2, \ldots \right\} \) the second scenario (uniform non-concentration of energy) in Lemma 8.12 holds. The goal in this case is to apply results in Sect. 6 to extract a smooth nontrivial self-similar solution with finite energy, which would contradict Proposition 7.2.

Fix \(j \in \left\{ 1, 2, \ldots \right\} \) and consider any point \((t,x) \in C_{[2, \infty )}^{2} \cap C_{j}\). By (8.50) and Lemma 4.5, where \(\sigma \ge 2\) is chosen as in Case 1, we obtain the following analogue of (8.57):

$$\begin{aligned} {{\mathcal {E}}}_{\left\{ t\right\} \times B_{8 \sigma ^{-1} r(j)}(x)}\left[ A^{(n)}, \phi ^{(n)}\right] + (\sigma ^{-1} r(j))^{-2} \Vert \phi ^{(n)}\Vert _{L^{2}_{x}(B_{8 \sigma ^{-1} r(j)}(x))}^{2} \le \epsilon _{0}^{2}.\nonumber \\ \end{aligned}$$
(8.62)

Since \((t, x)\) belongs to the smaller cone \(C_{[2, \infty )}^{2}\), we have

$$\begin{aligned} \widetilde{K}^{j}_{[t,x]}:= & {} \left( t-2 \sigma ^{-1} r(j), t+2 \sigma ^{-1} r(j)\right) \times B_{8 \sigma ^{-1} r(j)}(x) \\\subseteq & {} \widetilde{C}_{j-1} \cup \widetilde{C}_{j} \cup \widetilde{C}_{j+1}. \end{aligned}$$

Therefore, by (8.51) we have

$$\begin{aligned} \iint _{\widetilde{K}^{j}_{[t,x]}} \vert \iota _{X_{0}} F^{(n)}\vert ^{2} + \left| \left( \mathbf{D}^{(n)}_{X_{0}} + \frac{1}{\rho }\right) \phi ^{(n)}\right| ^{2} \, \mathrm {d}t \mathrm {d}x \rightarrow 0 \quad \hbox { as } n \rightarrow \infty . \end{aligned}$$

Applying Proposition 6.1 to \((A^{(n)}, \phi ^{(n)})\) on the space-time cylinder \(\widetilde{K}^{j}_{[t,x]}\), we obtain a limit \((A_{[t,x]}, \phi _{[t,x]}) \in {{\mathcal {X}}}^{w}(K^{j}_{[t,x]})\) (up to gauge transformations and passing to a subsequence) on a smaller space-time cylinder \(K^{j}_{[t,x]} := (t - \sigma ^{-1} r(j), t + \sigma ^{-1} r(j)) \times B_{\sigma ^{-1} r(j)}(x)\), which is a weak solution to (MKG) obeying

$$\begin{aligned} \iota _{X_{0}} F_{[t,x]} = 0, \quad \left( \mathbf{D}_{[t,x] X_{0}} + \frac{1}{\rho } \right) \phi _{[t,x]} = 0. \end{aligned}$$

The cylinders \(\left\{ K^{j}_{[t,x]}\right\} \) for \(j \in \left\{ 1,2,\ldots \right\} \) and \((t,x) \in C^{2}_{[2,\infty )} \cap C_{j}\) form an open cover of the cone \(C^{2}_{[2, \infty )}\). By Lemma 6.15, the weak solutions \((A_{[t,x]}, \phi _{[t,x]})\) form weak compatible pairs on \(\left\{ K^{j}_{[t,x]}\right\} \). Then by Proposition 7.3, these pairs are equivalent to a set of smooth compatible pairs in some refined open cover of \(C^{2}_{[2, \infty )}\), which in turn is equivalent to a single global smooth pair \((A, \phi )\) on \(C^{2}_{[2, \infty )}\), thanks to the fact that \(C^{2}_{[2, \infty )}\) is contractible. By construction, the pair \((A, \phi )\) satisfies the following properties:

  • The pair \((A, \phi )\) is a smooth solution to (MKG) obeying the self-similarity condition

    $$\begin{aligned} \iota _{X_{0}} F = 0, \quad \left( \mathbf{D}_{X_{0}} + \frac{1}{\rho }\right) \phi = \frac{1}{\rho } \mathbf{D}_{X_{0}}(\rho \phi )= 0. \end{aligned}$$
  • The following local convergences hold:

    $$\begin{aligned} {}^{(T)} J_{T}\left[ A^{(n)}, \phi ^{(n)}\right] \rightarrow&{}^{(T)} J_{T}[A, \phi ] \quad \hbox { locally in } L^{1}_{t,x} \hbox { on } C^{2}_{[2, \infty )},\quad \quad \end{aligned}$$
    (8.63)
    $$\begin{aligned} {}^{(X_{0})} P_{T}\left[ A^{(n)}, \phi ^{(n)}\right] \rightarrow&{}^{(X_{0})} P_{T}[A, \phi ] \quad \hbox { locally in } L^{1}_{t,x} \hbox { on } C^{2}_{[2,\infty )}.\quad \quad \end{aligned}$$
    (8.64)
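
The identity in the self-similarity condition above can be checked directly. The following sketch assumes the usual definitions \(\rho = \sqrt{t^{2} - \vert x\vert ^{2}}\) and \(X_{0} = \frac{1}{\rho } (t \partial _{t} + x \cdot \partial _{x})\) inside the cone; these are not restated in this section and are taken here as assumptions. Since \(\rho \) is homogeneous of degree one, \((t \partial _{t} + x \cdot \partial _{x}) \rho = \rho \), hence \(X_{0} \rho = 1\). By the Leibniz rule for the covariant derivative,

$$\begin{aligned} \frac{1}{\rho } \mathbf{D}_{X_{0}}(\rho \phi ) = \frac{1}{\rho } \left( (X_{0} \rho ) \phi + \rho \, \mathbf{D}_{X_{0}} \phi \right) = \left( \mathbf{D}_{X_{0}} + \frac{1}{\rho }\right) \phi . \end{aligned}$$

In particular, the condition says that \(\rho \phi \) is covariantly constant along the flow of \(X_{0}\).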

We extend \((A, \phi )\) to a smooth self-similar solution to (MKG) on the whole cone \(C_{(0, \infty )} = \left\{ 0 \le r < t\right\} \) by pulling back \((A, \rho \phi )\) along the flow of \(X_{0}\). Note that \((A, \phi )\) has finite energy (again bounded by E), thanks to the local convergence (8.63). Hence by Proposition 7.2, it follows that \({{\mathcal {E}}}_{S_{t}}[A, \phi ] = 0\) for every \(t \in (0, \infty )\). However, this contradicts (8.49) (in particular, for large enough t so that \(S^{(1-\gamma _{2})t}_{t} \subseteq C^{2}_{[2, \infty )}\)) and (8.64). \(\square \)