1 Introduction

The classical Cwikel–Lieb–Rozenblum (CLR) estimate [Cwi77, Lie76, Ros72], related to the famous asymptotic formula of Weyl [Wey11] on the growth of eigenvalues, bounds the Morse index of a Schrödinger operator \(L = - \Delta + V\) on a bounded domain in \(\mathbb R^n\) in terms of the \(L^{\frac{n}{2}}\) norm of the negative part of V. This central result has applications to mathematical physics, where it is referred to as an estimate of the number of bound states for the linear Schrödinger operator. From the point of view of both geometry and mathematical physics, it is important to find similar index/bound state estimates for nonlinear problems, specifically for Yang–Mills connections and Einstein metrics.

Let \(\left( X^n,g \right) \) be a smooth, compact Riemannian manifold, and suppose \(\nabla \) is a connection on a vector bundle E over X. The Yang–Mills energy associated to \(\nabla \) is given by

$$\begin{aligned} {\mathcal {YM}}\left[ \nabla \right] := \int _{X^n} \left| F_{\nabla } \right| ^2 {{\,\mathrm{dV}\,}}_g. \end{aligned}$$

Critical points for \({\mathcal {YM}}\) are called Yang–Mills connections, including the special class of instantons, which always minimize \({\mathcal {YM}}\) when they exist. While there are many existence results for instantons (e.g. [Tau82]), it is also known that generically one expects non-instanton, non-minimizing Yang–Mills connections to exist even in the critical dimension \(n=4\) [SJU89, HM90, SS92, Bor92]. Furthermore, in dimension 4 every stable Yang–Mills connection with small gauge group is an instanton [BL81], so non-minimizing Yang–Mills connections in this setting will have positive index. Thus, to understand the Yang–Mills functional it becomes important to understand the structure of these non-minimizing Yang–Mills connections, in particular to understand their Morse index. This index is that of the relevant Jacobi operator, a Schrödinger operator acting on Lie algebra-valued 1-forms, with inhomogeneous term determined by the curvature of the underlying Riemannian metric as well as the bundle connection’s curvature. Taking a cue from the CLR estimate one may hope roughly that for a connection to have high Morse index it must also have high Yang–Mills energy. The first main result yields an estimate of this type.

Theorem 1.1

Let \((X^4,g)\) be a closed, oriented four-manifold, with Yamabe invariant \({{\,\mathrm{Y}\,}}(X^4,[g]) > 0\). Suppose \(\nabla \) is a non-instanton Yang–Mills connection on a vector bundle E over \(X^4\) with structure group \(G \subset {{\,\mathrm{SO}\,}}(E)\), and curvature \(F_{\nabla }\). Let \(\imath (\nabla )\) denote the index and \(\nu (\nabla )\) the nullity of \(\nabla \). Then

$$\begin{aligned} \imath (\nabla ) + \nu (\nabla )&\le \dfrac{ 144 e^2 \dim (\mathfrak {g}_E) }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } \Big \{- 12 \pi ^2 \chi \left( X^4 \right) + 12 \int _{X} |F_{\nabla }|^2 \, {{\,\mathrm{dV}\,}}_g\\&\quad + 3 \sqrt{2} \int _X |W_{g}||F_{\nabla }| \, {{\,\mathrm{dV}\,}}_g+ 3 \int _{X} | W_{g} |^2 \,{{\,\mathrm{dV}\,}}_g \Big \}, \end{aligned}$$

where \(\chi (X^4)\) is the Euler characteristic of \(X^4\) and \(W_{g}\) is the Weyl tensor.

If \(\nabla \) is an instanton, then \(\nu (\nabla ) = 0\) and the Atiyah–Singer index formula gives an explicit formula for \(\imath (\nabla )\) depending on topological data (see Chapter 4 of [DK90]). Our statement explicitly does not include this case, and we use the assumption of nonvanishing of \(F^+_{\nabla }\) when constructing a metric conformal to the base, with respect to which we carry out the index estimate (see Proposition 3.7). When the base manifold is the round sphere we can simplify the statement to the following:

Corollary 1.2

Let \(E \rightarrow (\mathbb {S}^4,g_{\mathbb S^4})\) be a vector bundle over the round sphere with structure group \(G \subset {{\,\mathrm{SO}\,}}(E)\), with \(\nabla \) a non-instanton Yang–Mills connection. Then

$$\begin{aligned} \imath (\nabla ) + \nu (\nabla ) \le 9 e^2 \dim (\mathfrak {g}_E) \Big \{- 1 + \tfrac{1}{4 \pi ^2} \int _{\mathbb {S}^4} |F_{\nabla }|^2 \,{{\,\mathrm{dV}\,}}_g\Big \}. \end{aligned}$$

An index plus nullity estimate for Yang–Mills connections appeared in [Ura86], under the much stronger assumption that the base manifold has positive Ricci curvature and with a bound depending on the \(L^{\infty }\)-norm of the bundle curvature. Our result only assumes positive Yamabe invariant, and the bound depends on conformal invariants of the base manifold and the Yang–Mills energy. This is more natural, in view of the fact that the index and nullity are conformal invariants. Furthermore, although the constants in Theorem 1.1 are almost certainly not sharp (in fact, the sharp value is not known in the classical CLR inequality; cf. [HKRV18]), we can show by means of examples that the growth rate of the index as a function of the Yang–Mills energy is sharp. Specifically, combining an index estimate of Taubes [Tau83] as well as an explicit construction of non-instanton Yang–Mills connections due to Sadun–Segert [SS92], we exhibit a family of connections whose index grows linearly in the Yang–Mills energy (Proposition 3.10 below). Lastly we point out that the estimate we give in Sect. 2 can be adapted to give an index estimate for Yang–Mills connections in any dimension in terms of the \(L^{\frac{n}{2}}\) norms of F and the Ricci curvature, and the Sobolev constant, and in this case the proof is a very direct adaptation of the method of Li–Yau [LY83] (see Remark 2.6).

Our second main result is an index estimate for Einstein metrics in dimension four. Einstein metrics arise as critical points of the normalized total scalar curvature functional

$$\begin{aligned} \mathscr {S}[g] = {{\,\mathrm{Vol}\,}}(g)^{-1/2} \int _{X^4} R_g \,{{\,\mathrm{dV}\,}}_g. \end{aligned}$$
(1.1)

It is well-known that Einstein metrics are never stable critical points, since \(\mathscr {S}\) is minimized over conformal variations but is locally maximized over transverse-traceless variations, possibly up to a finite dimensional subspace. The index \(\imath (g)\) of an Einstein metric, which we define to be the Morse index of \(-\mathscr {S}\), is the dimension of the maximal subspace on which the second variation is negative when restricted to transverse traceless variations, while the nullity \(\nu (g)\) is the dimension of the space of infinitesimal Einstein deformations. While there are some works characterizing the stability and space of deformations of Einstein metrics ([Koi79, Koi82, DWW05, DWW07]), it seems very little is known about the index in the case it is positive. Intuitively, one might expect an Einstein metric with large index to have small energy. We derive an estimate of this kind which relies on explicit universal constants and the Euler characteristic.

Theorem 1.3

Let \((X^4,g)\) be an Einstein four-manifold with positive scalar curvature. Then

$$\begin{aligned} \mathscr {S}[g] \le 24 \pi \sqrt{ \dfrac{ \chi (X^4)}{ 3 + \delta \left[ \imath (g) + \nu (g) \right] }}, \end{aligned}$$

where \(\delta = \frac{1}{24} e^{-2}\).

Our final application is a bound on the Betti numbers of an oriented four-manifold \(X^4\) of positive scalar curvature. Bounds for the Betti numbers in terms of the curvature, Sobolev constant, and diameter of the manifold were proved by P. Li in [Li80]. These estimates can be viewed as refined or quantitative versions of the classical vanishing theorems; see [B88] for a beautiful survey. To state our results we need to introduce two conformal invariants of four-manifolds with positive Yamabe invariant.

To define the first conformal invariant, we need some additional notation. Let \(A = A_g\) denote the Schouten tensor of g:

$$\begin{aligned} A = \tfrac{1}{2} \left( {{\,\mathrm{Ric}\,}}- \tfrac{1}{6}R g \right) , \end{aligned}$$

where \({{\,\mathrm{Ric}\,}}\) is the Ricci tensor and R the scalar curvature of g. Let \(\sigma _2(A)\) denote the second symmetric function of the eigenvalues of A (viewed as a symmetric bilinear form on the tangent space at each point). Then

$$\begin{aligned} \sigma _2(A) = -\tfrac{1}{8}|{{\,\mathrm{Ric}\,}}|^2 + \tfrac{1}{24}R^2. \end{aligned}$$
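
As a quick check of this identity (a worked computation, assuming only the definition of A above and the standard formula \(\sigma _2(A) = \tfrac{1}{2}\left[ ({{\,\mathrm{tr}\,}}A)^2 - |A|^2 \right] \)), note that in dimension four

$$\begin{aligned} {{\,\mathrm{tr}\,}}A&= \tfrac{1}{2}\left( R - \tfrac{4}{6}R \right) = \tfrac{1}{6}R, \\ |A|^2&= \tfrac{1}{4}\left( |{{\,\mathrm{Ric}\,}}|^2 - \tfrac{1}{3}R^2 + \tfrac{1}{9}R^2 \right) = \tfrac{1}{4}|{{\,\mathrm{Ric}\,}}|^2 - \tfrac{1}{18}R^2, \\ \tfrac{1}{2}\left[ ({{\,\mathrm{tr}\,}}A)^2 - |A|^2 \right]&= \tfrac{1}{72}R^2 - \tfrac{1}{8}|{{\,\mathrm{Ric}\,}}|^2 + \tfrac{1}{36}R^2 = -\tfrac{1}{8}|{{\,\mathrm{Ric}\,}}|^2 + \tfrac{1}{24}R^2, \end{aligned}$$

which recovers the stated formula for \(\sigma _2(A)\).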

The integral of \(\sigma _2(A)\) is a scalar conformal invariant of a four-manifold. Using this we define the following two conformal invariants:

$$\begin{aligned} \begin{aligned} \rho _1(X^4,[g])&:= \dfrac{4 \int _{X} \sigma _2(A)\, {{\,\mathrm{dV}\,}}}{{{\,\mathrm{Y}\,}}(X^4,[g])^2}, \\ \rho _{+}(X^4, [g])&:= \dfrac{ 24 \int _{X} |W^{+}|^2 \, {{\,\mathrm{dV}\,}}}{{{\,\mathrm{Y}\,}}(X^4,[g])^2}. \end{aligned} \end{aligned}$$
(1.2)

Let \(b_1(X^4)\) denote the first Betti number of \(X^4\), and let \(b^{+}(X^4)\) denote the maximal dimension of a subspace of \(\Lambda ^2(X^4)\) on which the intersection form is positive. It follows from ([Gur98] Theorem 2) that if \(b_1(X^4) > 0\) then \(\rho _1 \le 0\), with equality only when conformal to a quotient of \(S^3 \times \mathbb R\) with the product metric. Furthermore, it follows from ([Gur00] Theorem 3.3) that if \(b^+ > 0\) then \(\rho _+ \ge 1\), with equality only when conformal to a Kähler metric with positive scalar curvature. Using the general index estimate of Section 2, we can prove quantitative versions of these estimates:

Theorem 1.4

Let \((X^4,g)\) be an oriented four-manifold with \({{\,\mathrm{Y}\,}}(X^4,[g]) > 0\). Then

$$\begin{aligned} b_1(X^4) \le 9e^2 \left( 1 - 24 \rho _1 \right) , \end{aligned}$$
(1.3)

and

$$\begin{aligned} b^{+}(X^4) \le 3 e^2 \left( 2 \sqrt{\rho _{+}} - 1 \right) ^2. \end{aligned}$$
(1.4)

Here, as in the Yang–Mills estimate, our constants are not sharp but the growth rate likely is. In particular, by taking connect sums with sufficiently long necks, we can produce locally conformally flat metrics on the manifold \(k \# \mathbb {S}^3 \times \mathbb {S}^1\) whose Yamabe invariant is uniformly bounded below. Evidently this manifold has \(b_1 = k\), while for these conformal classes we see that the right hand side of (1.3) grows linearly in k. To verify that the growth rate of \(b^{+}\) is sharp the natural candidates to consider are the self-dual metrics on \(k \# \mathbb {CP}^2\) constructed by LeBrun [LeB91]. However, we do not know if the Yamabe invariant of these metrics has a uniform lower bound.

The proofs of these theorems all rely on an extension of the CLR estimate to elliptic operators on vector bundles with certain geometric backgrounds (see Sect. 2). The case of dimension \(n=4\) especially requires careful analysis of the curvature terms in the relevant index operator in order to capture the conformal invariance. While many proofs of the classical CLR inequality by now exist, the proof of Li–Yau [LY83] gives explicit bounds in terms of the Sobolev constant. By adapting their ideas to operators modeled on the conformal Laplacian but acting on sections of a vector bundle, we are able to obtain estimates in terms of conformal invariants. An important technical step is to compare the \(L^2\)-trace of the heat kernel of a Schrödinger-type operator acting on sections of a vector bundle to the heat trace of an associated scalar operator. Again, many results of this kind exist (see [HSU77, HSU80, Sim79]), but we adapt a proof of Donnelly–Li [DL82] as it is closest in spirit to the other estimates. Combining these ideas together with a conformal gauge-fixing argument yields our main index estimates.

2 General Index Estimate

In this section we adapt the proof of the Cwikel–Lieb–Rozenblum inequality due to Li–Yau [LY83] to prove an index estimate for a certain class of elliptic operators acting on sections of vector bundles. Given a vector bundle \(\mathcal {E} \rightarrow (X^4, g)\) with a metric-compatible connection \(\nabla \), let \(\Delta = \Delta _g : \Gamma (\mathcal {E}) \rightarrow \Gamma (\mathcal {E})\) denote the rough Laplacian, where in local coordinates \(\Delta = g^{ij} \nabla _i \nabla _j\) (note this convention differs from some references). Given a non-negative function \(V \in C^0(X^4)\), consider the operator

$$\begin{aligned} \mathcal {S} = -\Delta + \tfrac{1}{6}R - V, \end{aligned}$$
(2.1)

where \(R = R_g\) is the scalar curvature of g. We will assume throughout this section that \(R \ge 0\), and the Yamabe invariant \({{\,\mathrm{Y}\,}}(X^4,\left[ g \right] ) > 0\). Our main result is

Theorem 2.1

If \(N_0(\mathcal {S})\) denotes the number of non-positive eigenvalues of \(\mathcal {S}\), then

$$\begin{aligned} N_0(\mathcal {S}) \le \dfrac{ 36 e^2 {{\,\mathrm{rank}\,}}(\mathcal {E}) \Vert V \Vert _{L^2}^2 }{{{\,\mathrm{Y}\,}}\left( X^4, [g] \right) ^2 }. \end{aligned}$$
(2.2)

The proof is a consequence of a series of technical lemmas, and will appear at the end of the section. We begin with some notation. We need to distinguish between the Laplacian on functions and the rough Laplacian acting on sections of \(\mathcal {E}\), so from now on we set

$$\begin{aligned} \Delta _0&: C^{\infty }\left( X^4 \right) \rightarrow C^{\infty }\left( X^4 \right) , \\ \Delta&: C^{\infty }\left( \mathcal {E} \right) \rightarrow C^{\infty } \left( \mathcal {E} \right) . \end{aligned}$$

Fix some small \(\epsilon > 0\) and define

$$\begin{aligned} V_{\epsilon } := V + \epsilon . \end{aligned}$$
(2.3)

Consider the two operators

$$\begin{aligned} \mathcal {P}_0&:= \tfrac{1}{V_{\epsilon }} \left( \Delta _0 - \tfrac{1}{6}R \right) ,\\ \mathcal {P}&:= \tfrac{1}{V_{\epsilon }} \left( \Delta - \tfrac{1}{6}R \right) . \end{aligned}$$

As a first step we give the following analogue of an estimate in Li–Yau:

Lemma 2.2

Let \(\mu _1^0 \le \mu _2^0 \le \cdots \) denote the eigenvalues of \(-\mathcal {P}_0\), counted with multiplicity. Then for all \(t > 0\),

$$\begin{aligned} \sum _{i=1}^{\infty } e^{-2\mu _i^0 t} \le \frac{ 36 \left| \left| V_{\epsilon } \right| \right| _{L^2}^2 }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } t^{-2}. \end{aligned}$$
(2.4)

Proof

As in [LY83], we take \(\{ \psi _i \}\) to be an orthonormal basis of \(L^2 \left( V_{\epsilon } {{\,\mathrm{dV}\,}} \right) \) consisting of eigenfunctions of \(-\mathcal {P}_0\):

$$\begin{aligned} - \mathcal {P}_0 \psi _i = \mu _i^0 \psi _i, \end{aligned}$$

with

$$\begin{aligned} \int _X \psi _i(x) \psi _j(x) V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}_x = \delta _{ij}. \end{aligned}$$

Let

$$\begin{aligned} H_0(x,y,t) := \sum _{i=1}^{\infty } e^{-t \mu _i^0} \psi _i(x) \psi _i(y). \end{aligned}$$

Note that \(H_0\) is the heat kernel associated to the operator \(\mathcal {P}_0\) with respect to the weighted inner product \(L^2(V_{\epsilon } {{\,\mathrm{dV}\,}})\). In particular,

$$\begin{aligned} \begin{aligned} \tfrac{\partial }{\partial t}\left[ H_0(x,y,t) \right]&= \mathcal {P}_0 H_0(x,y,t) \\&= \tfrac{1}{V_{\epsilon }} \left( \Delta _0 - \tfrac{1}{6}R \right) H_0(x,y,t). \end{aligned} \end{aligned}$$
(2.5)

Moreover, since \(R \ge 0\) we have

$$\begin{aligned} H_0(x,y,t) > 0, \end{aligned}$$

and for any \(f \in C^0\left( X^4 \right) \),

$$\begin{aligned} \lim _{t \rightarrow 0} \int _X H_0(x,y,t) f(y) V_{\epsilon }(y) \,{{\,\mathrm{dV}\,}}_y = f(x). \end{aligned}$$
(2.6)

We also let

$$\begin{aligned} \begin{aligned} h(t)&:= \int _{X} \int _{X} H_0(x,y, t)^2 V_{\epsilon }(x) V_{\epsilon }(y) \, {{\,\mathrm{dV}\,}}_x \, {{\,\mathrm{dV}\,}}_y \\&= \sum _{i=1}^{\infty } e^{-2\mu _i^0 t}. \end{aligned} \end{aligned}$$
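
The second equality is a direct consequence of the orthonormality of the \(\psi _i\) in \(L^2(V_{\epsilon } {{\,\mathrm{dV}\,}})\). Spelled out (a routine computation, taking for granted that the sum and the integrals may be interchanged for \(t > 0\)):

$$\begin{aligned} h(t)&= \sum _{i,j=1}^{\infty } e^{-(\mu _i^0 + \mu _j^0) t} \left( \int _X \psi _i(x) \psi _j(x) V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}_x \right) \left( \int _X \psi _i(y) \psi _j(y) V_{\epsilon }(y) \,{{\,\mathrm{dV}\,}}_y \right) \\&= \sum _{i,j=1}^{\infty } e^{-(\mu _i^0 + \mu _j^0) t} \delta _{ij}^2 = \sum _{i=1}^{\infty } e^{-2\mu _i^0 t}. \end{aligned}$$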

We now argue as in the proof of Theorem 2 of [LY83]: differentiating h, using (2.5) and integrating by parts, we have

$$\begin{aligned} \begin{aligned} \tfrac{d h}{d t}&= 2 \int _{X} V_{\epsilon }(x) \int _{X} H_0(x,y,t) (\mathcal {P}_0)_y H_0(x,y,t) V_{\epsilon }(y) \, {{\,\mathrm{dV}\,}}_y \, {{\,\mathrm{dV}\,}}_x \\&= 2 \int _{X} V_{\epsilon }(x) \int _{X} H_0(x,y,t) \left( \Delta _0 - \tfrac{1}{6}R \right) _y H_0(x,y,t) \,{{\,\mathrm{dV}\,}}_y \, {{\,\mathrm{dV}\,}}_x \\&= - 2 \int _{X} V_{\epsilon }(x) \int _{X} \left[ \left| \nabla _y H_0(x,y,t) \right| ^2 + \tfrac{1}{6}R(y) H_0(x,y,t)^2 \right] \, {{\,\mathrm{dV}\,}}_y \, {{\,\mathrm{dV}\,}}_x. \end{aligned} \end{aligned}$$
(2.7)

By the definition of the Yamabe invariant,

$$\begin{aligned}&{{\,\mathrm{Y}\,}}(X^4,[g]) \Big ( \int _{X} H_0(x,y,t)^4 \,{{\,\mathrm{dV}\,}}_y \Big )^{1/2}\\&\quad \le 6 \int _{X} \big [ |\nabla _y H_0(x,y,t)|^2 + \tfrac{1}{6}R_y H_0(x,y,t)^2 \big ] \,{{\,\mathrm{dV}\,}}_y. \end{aligned}$$
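
For reference, the normalization of the Yamabe invariant that is consistent with this inequality is the standard four-dimensional one. It is not restated in this section, so we record it here as an assumption about the convention in use:

$$\begin{aligned} {{\,\mathrm{Y}\,}}(X^4,[g]) = \inf _{u \in C^{\infty }(X^4),\ u \not \equiv 0} \dfrac{ \int _X \left[ 6 |\nabla u|^2 + R u^2 \right] \,{{\,\mathrm{dV}\,}}}{ \left( \int _X u^4 \,{{\,\mathrm{dV}\,}} \right) ^{1/2} }. \end{aligned}$$

The displayed inequality is this definition applied to \(u = H_0(x,\cdot ,t)\) for fixed x and t, with the factor 6 pulled out of the numerator.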

Using this, we can rewrite (2.7) as

$$\begin{aligned} \tfrac{d h}{d t} \le - \tfrac{1}{3} {{\,\mathrm{Y}\,}}(X^4, [g]) \int _{X} V_{\epsilon }(x) \left( \int _{X} H_0(x,y,t)^4 \,{{\,\mathrm{dV}\,}}_y \right) ^{1/2} \, {{\,\mathrm{dV}\,}}_x. \end{aligned}$$
(2.8)

To obtain a differential inequality for h we need a further a priori upper bound. Iterating Hölder’s inequality twice and using the fact that \(H_0(x,y,t) > 0\) we note

$$\begin{aligned} h(t)= & {} \int _X V_{\epsilon }(x) \int _X H_0(x,y,t)^2 V_{\epsilon }(y) {{\,\mathrm{dV}\,}}_y {{\,\mathrm{dV}\,}}_x \nonumber \\\le & {} \int _X V_{\epsilon }(x) \left[ \left( \int _X H_0(x,y,t)^4 {{\,\mathrm{dV}\,}}_y \right) ^{1/3} \left( \int _X H_0(x,y,t) V_{\epsilon }^{3/2}(y) {{\,\mathrm{dV}\,}}_y \right) ^{2/3} \right] {{\,\mathrm{dV}\,}}_x \nonumber \\\le & {} \left[ \int _X V_{\epsilon }(x) \left( \int _X H_0(x,y,t)^4 {{\,\mathrm{dV}\,}}_y \right) ^{1/2} {{\,\mathrm{dV}\,}}_x \right] ^{2/3} \nonumber \\&\left[ \int _X V_{\epsilon }(x) \left( \int _X H_0(x,y,t) V_{\epsilon }^{3/2}(y) {{\,\mathrm{dV}\,}}_y \right) ^2 {{\,\mathrm{dV}\,}}_x \right] ^{1/3}. \end{aligned}$$
(2.9)
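
The exponents in the two applications of Hölder's inequality can be read off as follows (a sketch; the shorthand a(x), b(x) is introduced only for this remark). For fixed x, write \(H_0^2 V_{\epsilon } = H_0^{4/3} \cdot \left( H_0^{2/3} V_{\epsilon } \right) \) and apply Hölder in y with conjugate exponents 3 and \(\tfrac{3}{2}\):

$$\begin{aligned} \int _X H_0^2 V_{\epsilon } \,{{\,\mathrm{dV}\,}}_y \le \left( \int _X H_0^4 \,{{\,\mathrm{dV}\,}}_y \right) ^{1/3} \left( \int _X H_0 V_{\epsilon }^{3/2} \,{{\,\mathrm{dV}\,}}_y \right) ^{2/3} =: a(x)^{1/3}\, b(x)^{2/3}. \end{aligned}$$

Then apply Hölder in x with respect to the measure \(V_{\epsilon } {{\,\mathrm{dV}\,}}_x\), with conjugate exponents \(\tfrac{3}{2}\) and 3:

$$\begin{aligned} \int _X V_{\epsilon } \left( a^{1/2} \right) ^{2/3} \left( b^2 \right) ^{1/3} \,{{\,\mathrm{dV}\,}}_x \le \left( \int _X V_{\epsilon }\, a^{1/2} \,{{\,\mathrm{dV}\,}}_x \right) ^{2/3} \left( \int _X V_{\epsilon }\, b^2 \,{{\,\mathrm{dV}\,}}_x \right) ^{1/3}. \end{aligned}$$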

It remains to estimate the second term on the right hand side above, which is done by treating it as an auxiliary solution to the heat equation. In particular set

$$\begin{aligned} Q(x,t) := \int _X H_0(x,y,t) V_{\epsilon }(y)^{3/2} \,{{\,\mathrm{dV}\,}}_y. \end{aligned}$$
(2.10)

Note that Q is a solution of the heat equation associated to \(\mathcal {P}_0\):

$$\begin{aligned} \begin{aligned} \tfrac{\partial }{\partial t} \left[ Q(x,t) \right]&= (\mathcal {P}_0 Q)(x,t) = \tfrac{1}{V_{\epsilon }(x)} \left( \Delta _0 - \tfrac{1}{6}R \right) Q(x,t),\\ Q(x,0)&= V^{1/2}_{\epsilon }(x). \end{aligned} \end{aligned}$$
(2.11)

Note in particular the power of \(V_{\epsilon }\), which is a consequence of the weighted inner product. We first compute

$$\begin{aligned} \tfrac{d}{dt} \left[ \int _X Q(x,t)^2 V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}_x \right]= & {} 2 \int _X Q(x,t) \tfrac{\partial }{\partial t}\left[ Q(x,t) \right] V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}\nonumber \\= & {} 2 \int _X Q(x,t) \left( \Delta _0 - \tfrac{1}{6}R \right) Q(x,t) \,{{\,\mathrm{dV}\,}}\nonumber \\= & {} -2 \int _X \left[ \left| \nabla Q(x,t) \right| ^2 + \tfrac{1}{6} R(x) Q(x,t)^2 \right] \,{{\,\mathrm{dV}\,}}\nonumber \\\le & {} 0. \end{aligned}$$
(2.12)

Integrating this and applying (2.11),

$$\begin{aligned}\begin{aligned} \int _X Q(x,t)^2 V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}&\le \int _X Q(x,0)^2 V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}\\&= \int _X V_{\epsilon }(x)^2 \,{{\,\mathrm{dV}\,}}. \end{aligned} \end{aligned}$$

Now, using (2.10),

$$\begin{aligned} \int _X Q(x,t)^2 V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}= \int _X V_{\epsilon }(x) \left( \int _X H_0(x,y,t) V_{\epsilon }(y)^{3/2}{{\,\mathrm{dV}\,}}_y \right) ^2 \,{{\,\mathrm{dV}\,}}_x, \end{aligned}$$

and so substituting into the integrated form of (2.12) we obtain

$$\begin{aligned} \Vert V_{\epsilon } \Vert _{L^2} \ge \left[ \int _X V_{\epsilon }(x)\left( \int _X H_0(x,y,t) V_{\epsilon }(y)^{3/2} {{\,\mathrm{dV}\,}}_y \right) ^2 \, {{\,\mathrm{dV}\,}}_x \right] ^{1/2}. \end{aligned}$$

Substituting this into (2.9), we have

$$\begin{aligned} h(t) \le \left[ \int _X V_{\epsilon }(x) \left( \int _X H_0(x,y,t)^4 {{\,\mathrm{dV}\,}}_y \right) ^{1/2} {{\,\mathrm{dV}\,}}_x \right] ^{2/3} \Vert V_{\epsilon } \Vert _{L^2}^{2/3}. \end{aligned}$$

By (2.8), we conclude

$$\begin{aligned} \frac{dh}{dt} \le - \tfrac{1}{3} \frac{ {{\,\mathrm{Y}\,}}(X^4, [g]) }{ \left| \left| V_{\epsilon } \right| \right| _{L^2}} h(t)^{3/2}. \end{aligned}$$
(2.13)

Integrating and using the fact that \(h(t) \rightarrow \infty \) as \(t \rightarrow 0^{+}\) we conclude

$$\begin{aligned} h(t) \le \frac{ 36 \left| \left| V_{\epsilon } \right| \right| _{L^2}^2}{ {{\,\mathrm{Y}\,}}(X^4, [g])^2 } t^{-2}, \end{aligned}$$

which is equivalent to (2.4). \(\quad \square \)
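
For completeness, the integration of (2.13) can be sketched as follows (the constant c is shorthand introduced here). Set \(c = \tfrac{1}{3} {{\,\mathrm{Y}\,}}(X^4,[g]) / \Vert V_{\epsilon } \Vert _{L^2}\), so that (2.13) reads \(h' \le -c\, h^{3/2}\). Then

$$\begin{aligned} \tfrac{d}{dt} \left[ h(t)^{-1/2} \right] = -\tfrac{1}{2} h(t)^{-3/2} \tfrac{dh}{dt} \ge \tfrac{c}{2}. \end{aligned}$$

Since \(h(s)^{-1/2} \rightarrow 0\) as \(s \rightarrow 0^{+}\), integrating from 0 to t gives

$$\begin{aligned} h(t)^{-1/2} \ge \tfrac{c}{2}\, t = \dfrac{ {{\,\mathrm{Y}\,}}(X^4,[g]) }{ 6 \Vert V_{\epsilon } \Vert _{L^2} }\, t, \end{aligned}$$

and squaring and inverting yields the stated bound with the constant 36.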

The key lemma that allows us to pass from Lemma 2.2 to Theorem 2.1 is the following:

Lemma 2.3

We have

$$\begin{aligned} {{\,\mathrm{tr_{L^2}}\,}}e^{t \mathcal {P}} \le {{\,\mathrm{rank}\,}}(\mathcal {E}) {{\,\mathrm{tr_{L^2}}\,}}e^{t \mathcal {P}_0}. \end{aligned}$$
(2.14)

Proof

This is based on an argument in [DL82], Theorem 4.3 and Corollary 4.4. Let \(H(x,y,t)\) denote the heat kernel associated to \(\mathcal {P}\) with respect to the weighted inner product of Lemma 2.2. More precisely, let \(\mu _1 \le \mu _2 \le \cdots \) denote the eigenvalues of \(-\mathcal {P}\), counted with multiplicity, and let \(\{ \phi _i \}\) be an orthonormal basis of \(L^2(\mathcal {E},V_{\epsilon } {{\,\mathrm{dV}\,}})\) consisting of eigensections of \(-\mathcal {P}\):

$$\begin{aligned} - \mathcal {P} \phi _i = \mu _i \phi _i, \end{aligned}$$

with

$$\begin{aligned} \int _X \left\langle \phi _i(x) , \phi _j(x) \right\rangle V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}_x = \delta _{ij}. \end{aligned}$$

Then the associated heat kernel is given by

$$\begin{aligned} H \left( x,y,t \right) = \sum _{i=1}^{\infty } e^{-t \mu _i} \phi _i \left( x \right) \otimes \phi _i \left( y \right) . \end{aligned}$$

If |H| denotes the norm of H as an endomorphism \(H(\cdot ,x,y) : \mathcal {E}_x \rightarrow \mathcal {E}_y\), then \(\left| H \right| \) is a subsolution of (2.5) (in the sense of distributions):

$$\begin{aligned} \begin{aligned} \tfrac{\partial }{\partial t}\left[ \left| H \right| \left( x,y,t \right) \right]&\le \mathcal {P}_0 \left| H \right| \left( x,y,t \right) \\&= \tfrac{1}{V_{\epsilon }} \left( \Delta _0 - \tfrac{1}{6}R \right) \left| H \right| \left( x,y,t \right) , \end{aligned} \end{aligned}$$
(2.15)

see Lemma 4.1 of [DL82]. Also, in analogy with (2.6), for any \(f \in C^0\left( X^4 \right) \) we have

$$\begin{aligned} \lim _{t \rightarrow 0} \int _X \left| H\left( x,y,t \right) \right| f\left( y \right) V_{\epsilon }\left( y \right) \, {{\,\mathrm{dV}\,}}_y = f\left( x \right) . \end{aligned}$$
(2.16)

By (2.6) and (2.16),

$$\begin{aligned} \begin{aligned} \left| H \right| \left( x,y,t \right)&- H_0\left( x,y,t \right) \\&= \lim _{\tau \rightarrow 0} \left\{ \int _{X} |H|(x,z,t) H_0(z,y,\tau ) V_{\epsilon }(z) \,{{\,\mathrm{dV}\,}}_z \right. \\&\qquad \left. - \int _{X} \left| H \right| \left( x,z,\tau \right) H_0\left( z,y,t \right) V_{\epsilon }\left( z \right) \, {{\,\mathrm{dV}\,}}_z \right\} \\&= \int _0^t \tfrac{d}{ds} \left[ \int _{X} \left| H \right| \left( x,z,s \right) H_0\left( z,y,t-s \right) V_{\epsilon }(z) \,{{\,\mathrm{dV}\,}}_z \right] {{\,\mathrm{ds}\,}}\\&= \left[ \int _0^t \left[ \int _{X} \tfrac{d}{ds}\left[ \left| H \right| \left( x,z,s \right) \right] H_0\left( z,y,t-s \right) V_{\epsilon }(z) \,{{\,\mathrm{dV}\,}}_z \right] {{\,\mathrm{ds}\,}} \right] _{T_1}\\&\qquad + \left[ \int _0^t \left[ \int _{X} \left| H \right| \left( x,z,s \right) \tfrac{d}{ds}\left[ H_0\left( z,y,t-s \right) \right] V_{\epsilon }(z) \,{{\,\mathrm{dV}\,}}_z \right] {{\,\mathrm{ds}\,}} \right] _{T_2}. \end{aligned} \end{aligned}$$
(2.17)

We manipulate the second term \(T_2\) using (2.5),

$$\begin{aligned} T_2= & {} \int _0^t \int _{X} \left| H \right| \left( x,z,s \right) \tfrac{\partial }{\partial s} H_0\left( z,y,t-s \right) V_{\epsilon }(z) \, {{\,\mathrm{dV}\,}}_z {{\,\mathrm{ds}\,}}\nonumber \\= & {} - \int _0^t \int _{X} \left| H \right| (x,z,s) \left( \tfrac{1}{V_{\epsilon }(z)} \left( \Delta _0 - \tfrac{1}{6}R \right) H_0\left( z,y,t-s \right) \right) V_{\epsilon }(z) \,{{\,\mathrm{dV}\,}}_z {{\,\mathrm{ds}\,}}\nonumber \\= & {} - \int _0^t \int _{X} \left| H \right| \left( x,z,s \right) \Delta _0 H_0\left( z,y,t-s \right) \,{{\,\mathrm{dV}\,}}_z {{\,\mathrm{ds}\,}}\nonumber \\&+ \int _0^t \int _{X} \left| H \right| \left( x,z,s \right) \tfrac{1}{6}R\left( z \right) H_0\left( z,y,t-s \right) \,{{\,\mathrm{dV}\,}}_z {{\,\mathrm{ds}\,}}. \end{aligned}$$
(2.18)

Integrating by parts in the term involving \(\Delta _0\) and using (2.15), reincorporating \(T_2\) into (2.17),

$$\begin{aligned} \begin{aligned}&|H|(x,y,t) - H_0(x,y,t)\\&\quad = \int _0^t \int _{X} \left[ \left( \tfrac{\partial }{\partial s} - \mathcal {P}_0 \right) |H|(x,z,s) \right] H_0(z,y,t-s) V_{\epsilon }(z) \,{{\,\mathrm{dV}\,}}_z \,{{\,\mathrm{ds}\,}}\le 0. \end{aligned} \end{aligned}$$
(2.19)

Therefore, if \(\text{ tr }_g H\) denotes the pointwise trace of \(H(\cdot ,x,x) : \mathcal {E}_x \rightarrow \mathcal {E}_x\),

$$\begin{aligned} {{\,\mathrm{tr_{L^2}}\,}}e^{t \mathcal {P}}&= \int _{X} \text{ tr }_g H(x,x,t) V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}_x \\&\le \text{ rank }(\mathcal {E}) \int _{X} |H|(x,x,t) V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}_x \\&\le \text{ rank }(\mathcal {E}) \int _{X} H_0(x,x,t) V_{\epsilon }(x) \,{{\,\mathrm{dV}\,}}_x \\&= \text{ rank }(\mathcal {E})\ {{\,\mathrm{tr_{L^2}}\,}}e^{t \mathcal {P}_0}. \end{aligned}$$

The result follows. \(\quad \square \)
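
The first inequality in the last display uses only a pointwise linear-algebra fact, recorded here as a sketch (T, r and \(\{ e_a \}\) are notation introduced for this remark). If T is an endomorphism of an r-dimensional inner product space with orthonormal basis \(\{ e_a \}\), then

$$\begin{aligned} \left| {{\,\mathrm{tr}\,}}T \right| = \Big | \sum _{a=1}^{r} \left\langle T e_a, e_a \right\rangle \Big | \le \sum _{a=1}^{r} \left| T e_a \right| \le r \left| T \right| , \end{aligned}$$

where \(|T|\) may be taken to be either the operator norm or the Hilbert–Schmidt norm. Applying this with \(T = H(x,x,t)\) and \(r = {{\,\mathrm{rank}\,}}(\mathcal {E})\) gives the pointwise bound that is then integrated against \(V_{\epsilon } {{\,\mathrm{dV}\,}}_x\).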

Combining Lemma 2.2 with Lemma 2.3 we have

Proposition 2.4

Let \(\mu _1 \le \mu _2 \le \cdots \) denote the eigenvalues of \(-\mathcal {P}\), counted with multiplicity. Then for all \(t > 0\),

$$\begin{aligned} \sum _{i=1}^{\infty } e^{-2\mu _i t} \le \frac{ 36 {{\,\mathrm{rank}\,}}(\mathcal {E}) \left| \left| V_{\epsilon } \right| \right| _{L^2}^2 }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } t^{-2}. \end{aligned}$$
(2.20)

Proof

Observe that

$$\begin{aligned} \sum _{i=1}^{\infty } e^{-2\mu _i t} = \left( {{\,\mathrm{tr_{L^2}}\,}}e^{\left( 2t \right) \mathcal {P}} \right) . \end{aligned}$$
(2.21)

But by Lemma 2.3,

$$\begin{aligned} \big ( {{\,\mathrm{tr_{L^2}}\,}}e^{(2t) \mathcal {P}}\big )&\le {{\,\mathrm{rank}\,}}(\mathcal {E})\ \big ( {{\,\mathrm{tr_{L^2}}\,}}e^{(2t) \mathcal {P}_0}\big ) = {{\,\mathrm{rank}\,}}(\mathcal {E}) \sum _{i=1}^{\infty } e^{-2\mu _i^0 t}. \end{aligned}$$
(2.22)

Thus the result follows from Lemma 2.2. \(\quad \square \)

Corollary 2.5

Let \(\mu _k\) denote the \(k^{th}\)-eigenvalue of \(-\mathcal {P}\). Then

$$\begin{aligned} \dfrac{ 36 e^2 {{\,\mathrm{rank}\,}}(\mathcal {E}) \Vert V_{\epsilon } \Vert _{L^2}^2 }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } \mu _k^2 \ge k. \end{aligned}$$
(2.23)

Proof

As in [LY83], take \(t = \frac{1}{\mu _k}\) in (2.20), then

$$\begin{aligned} \dfrac{ 36 {{\,\mathrm{rank}\,}}(\mathcal {E}) \Vert V_{\epsilon } \Vert _{L^2}^2 }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } \mu _k^2&\ge \sum _{i=1}^{\infty } \exp (- 2 \tfrac{\mu _i}{\mu _k}) \\&\ge \sum _{i=1}^{k} \exp (- 2 \tfrac{\mu _i}{\mu _k}) \\&\ge k e^{-2}. \end{aligned}$$

The result follows. \(\quad \square \)

Proof of Theorem 2.1

By the argument of Birman–Schwinger, the number of non-positive eigenvalues of the operator \(-\Delta + \frac{1}{6}R - V_{\epsilon }\) is less than or equal to the number of eigenvalues of the operator \(-\mathcal {P} = \frac{1}{V_{\epsilon }} (-\Delta + \frac{1}{6} R)\) that are less than or equal to 1 (for an overview of the argument, see (iv) in the proof of [LY83] Corollary 2). But, by (2.23), if \(\mu _k\) is the greatest eigenvalue of \(-\mathcal {P}\) that is less than or equal to 1, then

$$\begin{aligned} \begin{aligned} k&\le \dfrac{ 36 e^2 {{\,\mathrm{rank}\,}}(\mathcal {E}) \Vert V_{\epsilon } \Vert _{L^2}^2 }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 }. \end{aligned} \end{aligned}$$

Therefore, taking \(\epsilon \rightarrow 0\) we conclude

$$\begin{aligned} N_0(\mathcal {S}) \le \dfrac{ 36 e^2 {{\,\mathrm{rank}\,}}(\mathcal {E}) \Vert V \Vert _{L^2}^2 }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 }, \end{aligned}$$

which completes the proof. \(\quad \square \)
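
The comparison used in the first step of this proof can be sketched via the min-max principle. This is only an outline under the standing assumptions of the section; \(\mathcal {S}_{\epsilon }\) and W are notation introduced here. Let \(\mathcal {S}_{\epsilon } := -\Delta + \tfrac{1}{6}R - V_{\epsilon }\), and let W be the span of the eigensections of \(\mathcal {S}_{\epsilon }\) with non-positive eigenvalues. For every nonzero \(\phi \in W\),

$$\begin{aligned} \int _X \left[ |\nabla \phi |^2 + \tfrac{1}{6}R |\phi |^2 \right] \,{{\,\mathrm{dV}\,}}\le \int _X V_{\epsilon } |\phi |^2 \,{{\,\mathrm{dV}\,}}, \qquad \text {so} \qquad \dfrac{ \int _X \left[ |\nabla \phi |^2 + \tfrac{1}{6}R |\phi |^2 \right] \,{{\,\mathrm{dV}\,}}}{ \int _X V_{\epsilon } |\phi |^2 \,{{\,\mathrm{dV}\,}}} \le 1. \end{aligned}$$

The quotient on the right is the Rayleigh quotient of \(-\mathcal {P}\) with respect to \(L^2(\mathcal {E}, V_{\epsilon } {{\,\mathrm{dV}\,}})\), so by min-max \(\mu _{\dim W} \le 1\); that is, \(-\mathcal {P}\) has at least \(\dim W\) eigenvalues less than or equal to 1. Finally, since \(V \le V_{\epsilon }\) we have \(\mathcal {S} \ge \mathcal {S}_{\epsilon }\) as quadratic forms, hence \(N_0(\mathcal {S}) \le N_0(\mathcal {S}_{\epsilon }) = \dim W\).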

Remark 2.6

If \(\mathcal {E} \rightarrow (X^n, g)\) is a vector bundle, \(n \ge 3\), and \(\mathcal {S} = -\Delta - V\) is a linear operator acting on sections of \(\mathcal {E}\) with \(V \ge 0\), then the preceding arguments can easily be adapted to give an estimate for the number of non-positive eigenvalues of \(\mathcal {S}\). If \(C_S(g)\) denotes the Sobolev constant,

$$\begin{aligned} C_S(g) \left( \int _{X} |f|^{\frac{2n}{n-2}} \, {{\,\mathrm{dV}\,}} \right) ^{\frac{n-2}{n}} \le \int _X \left[ |\nabla f|^2 + f^2 \right] \, {{\,\mathrm{dV}\,}}, \end{aligned}$$

then

$$\begin{aligned} N_0(\mathcal {S}) \le c_n \dfrac{ {{\,\mathrm{rank}\,}}(\mathcal {E}) }{ C_S(g)^{\frac{n}{2}}} \Vert (1 + V) \Vert _{L^{n/2}}^{n/2}. \end{aligned}$$

3 Index Estimate for Yang–Mills Connections

3.1 Background

Let \((E,h) \rightarrow (X^n,g)\) be a vector bundle with metric over a closed Riemannian manifold with structure group \(G \subset {{\,\mathrm{SO}\,}}(E)\). Let \(\Gamma (E)\) denote the smooth sections of E, and \(\mathfrak {g}_E\) denote the associated Lie algebra of E. For each point \(x \in X^n\) choose a local orthonormal basis of \(TX^n\) given by \(\{ e_i \}\) with dual basis \(\{ e^i \}\) and a local basis for E given by \(\{ \mu _{\alpha } \}\) with dual basis \(\{ (\mu ^*)^{\alpha } \}\) of the dual \(E^*\). Let \(\Lambda ^p\) denote the space of smooth p-forms over X and set \(\Lambda ^p(E) := \Lambda ^p \otimes \Gamma (E)\). Given an element in \(\Lambda ^p(E)\), its components are understood to be with respect to the foregoing bases. We will also use the fact that when \(p=1\), we can take tensor products of the basis elements \(\{e^i\}, \{ \mu _{\alpha } \}, \{ (\mu ^*)^{\alpha } \}\) to obtain a (local) basis of \(\Lambda ^1(\mathfrak {g}_E)\).

We will use the following conventions for the various inner products that appear:

$$\begin{aligned} \left\langle \eta ,\omega \right\rangle _{\Lambda ^2}&= \tfrac{1}{2} \sum _{i,j} \eta _{ij} \omega _{ij}, \qquad \left\langle \nu ,\mu \right\rangle _{S^2_0(X)} = \sum _{i,j}\nu _{ij} \mu _{ij}, \\ \left\langle A,B \right\rangle _{\mathfrak {g}_E}&= - \tfrac{1}{2} {{\,\mathrm{tr}\,}}_E \left( AB \right) = - \tfrac{1}{2} \sum _{\alpha ,\beta } A^{\beta }_{\alpha } B^{\alpha }_{\beta }\\ \left\langle P, Q \right\rangle _{\Lambda ^1 \left( \mathfrak {g}_E \right) }&= - \tfrac{1}{2} P_{i \beta }^{\alpha } Q_{i \alpha }^{\beta }, \qquad \left\langle R, S \right\rangle _{\Lambda ^2 \left( \mathfrak {g}_E \right) } = - \tfrac{1}{4} R_{ij \beta }^{\alpha } S_{ij \alpha }^{\beta }. \end{aligned}$$

Here, repeated Latin indices indicate contractions by the metric g on \(X^n\), and the components are with respect to the orthonormal basis above. Unless specified otherwise, we will use Einstein summation notation for both bundle and base components.

We need certain algebraic actions as well. First there is the bracket operation \([,] : \Lambda ^1 (\mathfrak {g}_E) \times \Lambda ^1 (\mathfrak {g}_E) \rightarrow \Lambda ^2(\mathfrak {g}_E)\) defined by

$$\begin{aligned} \left[ A, B \right] _{jk \alpha }^{\beta }&:= A_{j \delta }^{\beta } B_{k \alpha }^{\delta } - B_{k\delta }^{\beta } A_{j \alpha }^{\delta }, \qquad A, B \in \Lambda ^1 (\mathfrak {g}_E). \end{aligned}$$

Also, given \(\eta \in S^2 \left( TX \right) \) and \(\Phi \in \Lambda ^2 \left( \mathfrak {g}_E \right) \), we may view both as elements of \({{\,\mathrm{End}\,}}(\Lambda ^1(\mathfrak {g}_E))\) via the formulas

$$\begin{aligned} \left( \eta \left( A \right) \right) _{i \alpha }^{\beta }&= \eta _{ij} A_{j\alpha }^{\beta }, \\ \left( \left[ \Phi , A \right] \right) _{i \alpha }^{\beta }&= \left[ \Phi _{ji}, A_j \right] _{\alpha }^{\beta } = \Phi _{ji \mu }^{\beta } A_{j \alpha }^{\mu } - A_{j \mu }^{\beta } \Phi _{ji \alpha }^{\mu }. \end{aligned}$$

We next recall the definition of the Jacobi operator of \({\mathcal {YM}}\) (see Theorem (6.8) [BL81]).

Theorem 3.1

Suppose \(\nabla \) is a Yang–Mills connection on a vector bundle E over \(X^n\) with structure group \(G \subset {{\,\mathrm{SO}\,}}(E)\), and \(\left\{ \nabla _s \right\} \) is a one parameter family of connections with \(\nabla \equiv \left. \nabla _s \right| _{s=0}\). Furthermore, suppose \(B := \left. \tfrac{\partial }{\partial s} \left[ \nabla _s \right] \right| _{s=0} \in \Lambda ^1(\mathfrak {g}_E)\). Then

$$\begin{aligned} \left. \tfrac{d^2}{d s^2 }\left[ {\mathcal {YM}}\left( \nabla _s \right) \right] \right| _{s=0}&= 2 \int _X \left\langle \mathcal {J}^{\nabla } \left( B \right) , B \right\rangle _{\Lambda ^1(\mathfrak {g}_E)} {{\,\mathrm{dV}\,}}, \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} \mathcal {J}^{\nabla } \left( B \right) _i&= -\Delta B_i - \nabla _i \nabla ^j B_j + 2 \left[ F_{ji}, B^j \right] + {{\,\mathrm{Ric}\,}}_i^j B_j, \end{aligned} \end{aligned}$$

where \(\Delta = \nabla ^a \nabla _a\) denotes the rough Laplacian.

The operator \(\mathcal {J}^{\nabla }\) is degenerate elliptic, due to the action of the infinite dimensional gauge group. Questions of index and nullity always refer to the operator restricted to divergence-free sections B, on which the operator takes the simpler form:

$$\begin{aligned} \begin{aligned} \mathcal {J}^{\nabla } \left( B \right) _i&= -\Delta B_i + 2 \left[ F_{ji}, B^j \right] + {{\,\mathrm{Ric}\,}}_i^j B_j. \end{aligned} \end{aligned}$$
(3.1)

The index and nullity of a Yang–Mills connection are understood to be those quantities associated to this operator. In dimension four, it follows from the conformal invariance of the Yang–Mills energy that both the index and nullity are conformally invariant.

3.2 Linear algebraic estimates

In this subsection we obtain linear algebraic estimates which enter into estimating the Jacobi operator. The key point is Proposition 3.5, which provides a sharp inequality between the operator and Hilbert-Schmidt norms of the bilinear form appearing in the Jacobi operator. Let \({{\,\mathrm{Z}\,}}\in S_0^2 \left( TX \right) \) and \(\Phi \in \Lambda ^2 \left( \mathfrak {g}_E \right) \); in the following we can view both as elements of \({{\,\mathrm{End}\,}}(\Lambda ^1(\mathfrak {g}_E))\).

Lemma 3.2

Suppose \(E \rightarrow \left( X^n,g \right) \) is a vector bundle. Then \({{\,\mathrm{Z}\,}}\) and \(\Phi \), viewed as endomorphisms of \(\Lambda ^1\left( \mathfrak {g}_E \right) \), are symmetric. Moreover, \({{\,\mathrm{Z}\,}}\) is trace-free as an endomorphism of \(\Lambda ^1(\mathfrak {g}_E)\).

Proof

Take \(A,B \in \Lambda ^1 (\mathfrak {g}_E)\). Using the symmetry of both \({{\,\mathrm{Z}\,}}\) and the inner product on E,

$$\begin{aligned} \left\langle {{\,\mathrm{Z}\,}}\left( A \right) , B \right\rangle _{\Lambda ^1 (\mathfrak {g}_E)}&=- \tfrac{1}{2} {{\,\mathrm{Z}\,}}_{ij} A_{j\alpha }^{\beta } B_{i \beta }^{\alpha } \\&= - \tfrac{1}{2} {{\,\mathrm{Z}\,}}_{ji} B_{i \alpha }^{\beta } A_{j\beta }^{\alpha } \\&= \left\langle {{\,\mathrm{Z}\,}}\left( B \right) , A \right\rangle _{\Lambda ^1 (\mathfrak {g}_E)}. \end{aligned}$$

The symmetry of \({{\,\mathrm{Z}\,}}\) follows. Next, using the cyclicity of inner products over \(\mathfrak {g}_E\), reindexing and skew symmetry of the bracket operation and \(\Phi \),

$$\begin{aligned} \left\langle \left[ \Phi ,A \right] , B \right\rangle _{\Lambda ^1 (\mathfrak {g}_E)}&= - \tfrac{1}{2} \left[ \Phi _{ij}, A_i \right] ^{\beta }_{\alpha } B_{j \beta }^{\alpha } \\&= - \tfrac{1}{2} \Phi _{ij \delta }^{\beta } A_{i \alpha }^{\delta } B_{j \beta }^{\alpha } + \tfrac{1}{2} A_{i \delta }^{\beta } \Phi _{ij \alpha }^{\delta } B_{j \beta }^{\alpha } \\&= - \tfrac{1}{2} A_{i \alpha }^{\delta } B_{j \beta }^{\alpha }\Phi _{ij \delta }^{\beta } + \tfrac{1}{2} A_{i \delta }^{\beta } \Phi _{ij \alpha }^{\delta } B_{j \beta }^{\alpha }\\&= - \tfrac{1}{2} A_{i \delta }^{\beta } B_{j \alpha }^{\delta }\Phi _{ij \beta }^{\alpha } + \tfrac{1}{2} A_{i \delta }^{\beta } \Phi _{ij \alpha }^{\delta } B_{j \beta }^{\alpha }\\&= - \tfrac{1}{2} A_{i \delta }^{\beta } \left[ B_{j}, \Phi _{ij} \right] _{\beta }^{\delta }\\&= - \tfrac{1}{2} A_{i \delta }^{\beta } \left[ \Phi _{ji},B_{j} \right] _{\beta }^{\delta }\\&= \left\langle \left[ \Phi ,B \right] , A \right\rangle _{\Lambda ^1 (\mathfrak {g}_E)}, \end{aligned}$$

hence \(\Phi \) is symmetric as an endomorphism.

To show that \({{\,\mathrm{Z}\,}}\) is trace-free as an operator on \(\Lambda ^1(\mathfrak {g}_E)\), we construct an orthonormal basis for \(\Lambda ^1(\mathfrak {g}_E)\) as described at the beginning of Sect. 3.1: for fixed \((k,\alpha ,\beta )\), let

$$\begin{aligned} A_{(k,\alpha , \beta )} := e^k \otimes \left( \mu ^* \right) ^{\alpha } \otimes \mu _{\beta }, \end{aligned}$$
(3.2)

where \(\left\{ e_i \right\} \) is an orthonormal basis of TX that diagonalizes \({{\,\mathrm{Z}\,}}\). Note that the components of these basis elements are given by

$$\begin{aligned} \left( A_{(k,\alpha , \beta )} \right) _{\ell \mu }^{\nu } = \delta _{k \ell } \delta _{\alpha }^{\nu } \delta _{\mu }^{\beta }, \qquad \alpha \ne \beta , \end{aligned}$$

so the only nonzero entry is the \((k,\alpha , \beta )\)-component. Computing the trace of \({{\,\mathrm{Z}\,}}\) with respect to this basis yields

$$\begin{aligned} \left\langle {{\,\mathrm{Z}\,}}(A_{(k,\alpha , \beta )}) , A_{(k,\alpha , \beta )} \right\rangle _{\Lambda ^1 (\mathfrak {g}_E)}&= - \tfrac{1}{2} {{\,\mathrm{Z}\,}}_{ij} \left( A_{(k,\alpha , \beta )} \right) _{i \mu }^{\nu } \left( A_{(k,\alpha , \beta )} \right) _{j \nu }^{\mu } \\&= - \tfrac{1}{2} {{\,\mathrm{Z}\,}}_{ij} \delta _{k i} \delta _{\alpha }^{\nu } \delta _{\mu }^{\beta } \delta _{k j} \delta _{\alpha }^{\mu } \delta _{\nu }^{\beta } \\&= - \tfrac{1}{2} {{\,\mathrm{Z}\,}}_{ii} \delta _{\alpha }^{\beta } \delta _{\alpha }^{\beta }\\&= 0, \end{aligned}$$

since \({{\,\mathrm{Z}\,}}\) is traceless on TX. The result follows. \(\quad \square \)
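As a cross-check on the trace computation, note that \({{\,\mathrm{Z}\,}}\) acts on \(\Lambda ^1(\mathfrak {g}_E) = T^*X \otimes \mathfrak {g}_E\) only through the form index, that is, as \({{\,\mathrm{Z}\,}}\otimes \mathrm {Id}_{\mathfrak {g}_E}\). A short sketch of the resulting trace identity, writing \(d = \dim (\mathfrak {g}_E)\):

$$\begin{aligned} {{\,\mathrm{tr}\,}}_{\Lambda ^1(\mathfrak {g}_E)} \left( {{\,\mathrm{Z}\,}} \right) = {{\,\mathrm{tr}\,}}_{TX} \left( {{\,\mathrm{Z}\,}} \right) \cdot {{\,\mathrm{tr}\,}}\left( \mathrm {Id}_{\mathfrak {g}_E} \right) = d \sum _{i} {{\,\mathrm{Z}\,}}_{ii} = 0. \end{aligned}$$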

Lemma 3.3

As operators on \(\Lambda ^1(\mathfrak {g}_E)\), the ranges of \({{\,\mathrm{Z}\,}}\) and \(\Phi \) are orthogonal subspaces.

Proof

The orthogonality of \({{\,\mathrm{Z}\,}}\) and \(\left[ \Phi ,\cdot \right] \) will follow since \({{\,\mathrm{Z}\,}}\) preserves the bundle components while \(\Phi \) is skew symmetric with respect to the bundle components. Using the basis (3.2) as above, for fixed \((k,\alpha , \beta )\) we have

$$\begin{aligned} \left( {{\,\mathrm{Z}\,}}\left( A_{(k,\alpha , \beta )} \right) \right) _{i \mu }^{\nu }&={{\,\mathrm{Z}\,}}_{\ell i} \left( A_{(k,\alpha , \beta )} \right) _{\ell \mu }^{\nu }\\&={{\,\mathrm{Z}\,}}_{\ell i} \delta _{k \ell } \delta _{\alpha }^{\nu } \delta _{\mu }^{\beta }\\&= {{\,\mathrm{Z}\,}}_{k i} \delta _{\alpha }^{\nu } \delta _{\mu }^{\beta }\\&= {\left\{ \begin{array}{ll} {{\,\mathrm{Z}\,}}_{k i} &{} \text { if } \nu = \alpha , \mu = \beta , \\ 0 &{} \text { otherwise.} \end{array}\right. } \end{aligned}$$

Similarly,

$$\begin{aligned} \left[ \Phi ,A_{(k,\alpha , \beta )} \right] _{i \mu }^{\nu }&= \Phi _{\ell i \delta }^{\nu } \left( A_{(k,\alpha , \beta )} \right) ^{\delta }_{\ell \mu } - \left( A_{(k,\alpha , \beta )} \right) ^{\nu }_{\ell \delta } \Phi ^{\delta }_{\ell i \mu }\\&= \Phi _{\ell i \delta }^{\nu } \delta _{k \ell } \delta _{\alpha }^{\delta } \delta _{\mu }^{\beta } - \delta _{k \ell } \delta _{\alpha }^{\nu } \delta _{\delta }^{\beta } \Phi ^{\delta }_{\ell i \mu } \\&= \Phi _{k i \alpha }^{\nu } \delta _{\mu }^{\beta } - \delta _{\alpha }^{\nu } \Phi ^{\beta }_{k i \mu } \\&= {\left\{ \begin{array}{ll} 0 &{} \alpha = \nu \text { and } \beta = \mu \\ -\Phi _{k i \mu }^{\beta } &{} \alpha = \nu \text { and } \beta \ne \mu \\ \Phi ^{\nu }_{k i \alpha } &{} \alpha \ne \nu \text { and } \beta = \mu \\ 0 &{} \alpha \ne \nu \text { and } \beta \ne \mu \end{array}\right. }. \end{aligned}$$

Here we have used that, since \(\Phi \in \Lambda ^2 (\mathfrak {g}_E)\), its endomorphism indices cannot coincide. \(\quad \square \)

To state our next result, we need to introduce an algebraic invariant defined by Bourguignon–Lawson. Let

$$\begin{aligned} \gamma _{0} :=&\ \sup _{A,B \in \Gamma (\mathfrak {g}_E) \backslash \{0\}} \tfrac{\left| [A,B] \right| }{\left| A \right| \left| B \right| }. \end{aligned}$$

Lemma 2.30 of [BL81] gives the universal upper bound

$$\begin{aligned} \gamma _0 \le \sqrt{2}, \end{aligned}$$
(3.3)

and characterizes the case of equality.
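For orientation, here is a worked example in the simplest nonabelian case. The choice \(\mathfrak {g}_E = \mathfrak {so}(3)\) (fibers of rank 3) is an assumption made only for illustration. Identify \(a \in \mathbb R^3\) with the skew-symmetric matrix \(\hat{a}\) defined by \(\hat{a} v = a \times v\). Then \(\hat{a}^2 = a a^T - |a|^2 \mathrm {Id}\), and with the inner product \(\left\langle A,B \right\rangle _{\mathfrak {g}_E} = - \tfrac{1}{2} {{\,\mathrm{tr}\,}}(AB)\) one finds

$$\begin{aligned} |\hat{a}|^2_{\mathfrak {g}_E}&= - \tfrac{1}{2} {{\,\mathrm{tr}\,}}\left( \hat{a}^2 \right) = - \tfrac{1}{2} \left( |a|^2 - 3 |a|^2 \right) = |a|^2, \\ \left[ \hat{a}, \hat{b} \right]&= \widehat{a \times b}, \qquad \left| \left[ \hat{a}, \hat{b} \right] \right| _{\mathfrak {g}_E} = |a \times b| \le |a| |b|, \end{aligned}$$

with equality when a and b are orthogonal. In this illustrative case \(\gamma _0 = 1\), which is consistent with (3.3).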

Lemma 3.4

If \(A \in \Lambda ^1 (\mathfrak {g}_E)\), then

$$\begin{aligned} \begin{aligned} \left| \left[ A,A \right] \right| _{\Lambda ^2(\mathfrak {g}_E)}&\le \gamma _0 \sqrt{ \tfrac{n-1}{2n}} |A|^2_{\Lambda ^1(\mathfrak {g}_E)}. \end{aligned} \end{aligned}$$
(3.4)

Since \(\gamma _0 \le \sqrt{2}\), in general we have

$$\begin{aligned} \left| \left[ A,A \right] \right| _{\Lambda ^2(\mathfrak {g}_E)}&\le \sqrt{ \tfrac{n-1}{n}} |A|^2_{\Lambda ^1(\mathfrak {g}_E)}. \end{aligned}$$
(3.5)

Proof

Fix a point \(p \in X^n\) and let \(\left\{ e^i \right\} \) be an orthonormal basis of \(\Lambda ^1\). If \(A \in \Lambda ^1(\mathfrak {g}_E)\), then we can express \(A = A_i e^i\) for \(A_i \in \Gamma \left( \mathfrak {g}_E \right) \). Then

$$\begin{aligned} \left| \left[ A,A \right] \right| ^2_{\Lambda ^2(\mathfrak {g}_E)}&= - \tfrac{1}{4} \left[ A,A \right] _{ij \alpha }^{\beta } \left[ A,A \right] _{ij \beta }^{\alpha } \\&=\tfrac{1}{2} \sum _{i,j} \left| \left[ A,A \right] _{ij} \right| ^2_{\mathfrak {g}_E}\\&= \tfrac{1}{2} \sum _{i,j} \left| \left[ A_i,A_j \right] \right| ^2_{\mathfrak {g}_E}\\&= \sum _{i < j} \left| \left[ A_i,A_j \right] \right| ^2_{\mathfrak {g}_E}. \end{aligned}$$

By the definition of \(\gamma _0\), this gives

$$\begin{aligned} \left| \left[ A,A \right] \right| ^2_{\Lambda ^2(\mathfrak {g}_E)}&= \sum _{ i< j } \left| \left[ A_i, A_j \right] \right| ^2_{\mathfrak {g}_E} \\&\le \gamma _0^2 \sum _{i < j} \left| A_i \right| ^2_{\mathfrak {g}_E} \left| A_j \right| ^2_{\mathfrak {g}_E} . \end{aligned}$$
(3.6)

Now

$$\begin{aligned} |A|^4_{\Lambda ^1(\mathfrak {g}_E)} = \sum _{i, j} \left| A_i \right| ^2_{\mathfrak {g}_E} \left| A_j \right| ^2_{\mathfrak {g}_E}&= 2 \sum _{i < j} \left| A_i \right| ^2_{\mathfrak {g}_E} \left| A_j \right| _{\mathfrak {g}_E} ^2 + \sum _{i } \left| A_i \right| ^4_{\mathfrak {g}_E} , \end{aligned}$$

while the arithmetic-geometric mean inequality implies

$$\begin{aligned} \sum _{i } \left| A_i \right| ^4_{\mathfrak {g}_E} \ge \tfrac{1}{n} \left( \sum _{i} \left| A_i \right| ^2_{\mathfrak {g}_E} \right) ^2 = \tfrac{1}{n} |A|^4_{\Lambda ^1(\mathfrak {g}_E)} . \end{aligned}$$

Therefore,

$$\begin{aligned} \sum _{1 \le i < j \le n} \left| A_i \right| ^2_{\mathfrak {g}_E} \left| A_j \right| ^2_{\mathfrak {g}_E} \le \tfrac{(n-1)}{2n} |A|^4_{\Lambda ^1(\mathfrak {g}_E)} . \end{aligned}$$
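In more detail, this follows by solving the identity for \(|A|^4_{\Lambda ^1(\mathfrak {g}_E)}\) above for the off-diagonal sum and inserting the lower bound on \(\sum _i |A_i|^4_{\mathfrak {g}_E}\):

$$\begin{aligned} \sum _{i < j} \left| A_i \right| ^2_{\mathfrak {g}_E} \left| A_j \right| ^2_{\mathfrak {g}_E}&= \tfrac{1}{2} \left( |A|^4_{\Lambda ^1(\mathfrak {g}_E)} - \sum _{i} \left| A_i \right| ^4_{\mathfrak {g}_E} \right) \\&\le \tfrac{1}{2} \left( 1 - \tfrac{1}{n} \right) |A|^4_{\Lambda ^1(\mathfrak {g}_E)} = \tfrac{n-1}{2n} |A|^4_{\Lambda ^1(\mathfrak {g}_E)}. \end{aligned}$$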

Substituting this into (3.6) gives

$$\begin{aligned} \left| \left[ A,A \right] \right| ^2_{\Lambda ^2(\mathfrak {g}_E)}&\le \gamma _0^2 \left( \tfrac{n-1}{2n} \right) |A|^4_{\Lambda ^1(\mathfrak {g}_E)}, \end{aligned}$$

and taking the square root yields (3.4). \(\quad \square \)
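For use in dimension four, it may help to record the numerical form of these bounds. Setting \(n = 4\) in (3.4) and (3.5), and using \(\sqrt{\tfrac{3}{8}} = \tfrac{\sqrt{6}}{4}\) and \(\sqrt{2} \cdot \tfrac{\sqrt{6}}{4} = \tfrac{\sqrt{3}}{2}\), gives

$$\begin{aligned} \left| \left[ A,A \right] \right| _{\Lambda ^2(\mathfrak {g}_E)} \le \tfrac{\sqrt{6}}{4} \gamma _0 |A|^2_{\Lambda ^1(\mathfrak {g}_E)}, \qquad \left| \left[ A,A \right] \right| _{\Lambda ^2(\mathfrak {g}_E)} \le \tfrac{\sqrt{3}}{2} |A|^2_{\Lambda ^1(\mathfrak {g}_E)}. \end{aligned}$$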

Proposition 3.5

Suppose \(E \rightarrow \left( X^n,g \right) \) is a vector bundle and let

$$\begin{aligned} \mathcal {B} = {{\,\mathrm{Z}\,}}+ \left[ \Phi , \cdot \right] : \Lambda ^1(\mathfrak {g}_E) \rightarrow \Lambda ^1(\mathfrak {g}_E). \end{aligned}$$

Then

$$\begin{aligned} \left| \mathcal {B} \left( A,A \right) \right| \le \sqrt{ \tfrac{n-1}{n}} \cdot \left( \sqrt{ |{{\,\mathrm{Z}\,}}|^2_{S_0^2(T^*M)} + 2 \gamma _0^2 |\Phi |^2_{\Lambda ^2(\mathfrak {g}_E)} } \right) |A|^2_{\Lambda ^1(\mathfrak {g}_E)}. \end{aligned}$$

Proof

Since \(\mathcal {B}\) is symmetric by Lemma 3.2, there exists an orthonormal basis of \(\Lambda ^1(\mathfrak {g}_E)\) with respect to which the matrix of \(\mathcal {B}\) is diagonalized. Since the ranges of \({{\,\mathrm{Z}\,}}\) and \(\Phi \) are orthogonal by Lemma 3.3, we can express the matrix of \(\mathcal {B}\) as

$$\begin{aligned} \left[ \mathcal {B} \right] = \begin{pmatrix} \vec {z} &{} 0 \\ 0 &{} \vec {\phi } \end{pmatrix}, \end{aligned}$$

where

$$\begin{aligned} \left[ {{\,\mathrm{Z}\,}} \right] = \begin{pmatrix} \vec {z} &{} 0 \\ 0 &{} 0 \end{pmatrix}, \ \ \ \ \left[ \Phi \right] = \begin{pmatrix} 0 &{} 0 \\ 0 &{} \vec {\phi } \end{pmatrix}, \end{aligned}$$

are the matrices of \({{\,\mathrm{Z}\,}}\) and \(\Phi \) with respect to this basis, and \(\vec {z} = \left( z_1, \cdots , z_n \right) \), \(\vec {\phi } = \left( \phi _{1}, \cdots , \phi _{N} \right) \) are the eigenvalues of \({{\,\mathrm{Z}\,}}\) and \(\Phi \), respectively. If \(A \in \Lambda ^1(\mathfrak {g}_E)\), then we can write \(A = A_1 + A_2\), where

$$\begin{aligned} A_1 = \begin{pmatrix} \vec {a}\\ 0 \end{pmatrix}, \ \ \ A_2 = \begin{pmatrix} 0 \\ \vec {b} \end{pmatrix}, \end{aligned}$$

with \(\vec {a} = \left( a_1, \cdots , a_n \right) \), \(\vec {b} = \left( b_{1}, \cdots , b_{N} \right) \). Therefore, as a bilinear form

$$\begin{aligned} \mathcal {B} \left( A,A \right)&= {{\,\mathrm{Z}\,}}\left( A_1,A_1 \right) + \Phi (A_2,A_2) \\&= \begin{pmatrix} \vec {z} &{} 0 \\ 0 &{} \vec {\phi } \end{pmatrix}\begin{pmatrix} \vec {a}\\ \vec {b} \end{pmatrix} \cdot \begin{pmatrix} \vec {a}&\vec {b} \end{pmatrix} \\&= \sum _i z_i a_i^2 + \sum _{j} \phi _j b_j^2. \end{aligned}$$

Since \({{\,\mathrm{Z}\,}}\) is trace-free via Lemma 3.2,

$$\begin{aligned} \left| {{\,\mathrm{Z}\,}}(A_1,A_1) \right|&= \left| \sum _i z_i a_i^2 \right| \\&\le \sqrt{\tfrac{n-1}{n}} |\vec {z}||\vec {a}|^2 \\&= \sqrt{\tfrac{n-1}{n}}\left| {{\,\mathrm{Z}\,}} \right| _{S_0^2 \left( T^*M \right) } |A_1|^2_{\Lambda ^1(\mathfrak {g}_E)}. \end{aligned}$$

Also, by Lemma 3.4,

$$\begin{aligned} \left| \Phi (A_2,A_2) \right|&= \left| \sum _{j} \phi _j b_j^2 \right| \\&= \left| \langle \left[ \Phi , A_2 \right] , A_2 \rangle \right| _{\Lambda ^1 \left( \mathfrak {g}_E \right) }\\&= 2 \left| \langle \Phi , \left[ A_2,A_2 \right] \rangle \right| _{\Lambda ^2(\mathfrak {g}_E)} \\&\le 2 |\Phi |_{\Lambda ^2(\mathfrak {g}_E)} \left| \left[ A_2,A_2 \right] \right| _{\Lambda ^2(\mathfrak {g}_E)} \\&\le 2 \gamma _0 |\Phi |_{\Lambda ^2(\mathfrak {g}_E)} \sqrt{ \tfrac{n-1}{2n}} \left| A_2 \right| _{\Lambda ^1(\mathfrak {g}_E)}^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \left| \mathcal {B}\left( A,A \right) \right| \le \sqrt{ \tfrac{n-1}{n}} \left( |{{\,\mathrm{Z}\,}}|_{S_0^2 \left( T^*M \right) }\left| A_1 \right| ^2_{\Lambda ^1 \left( \mathfrak {g}_E \right) } + \sqrt{2}\gamma _0 |\Phi |_{\Lambda ^2 \left( \mathfrak {g}_E \right) } \left| A_2 \right| ^2_{\Lambda ^1 \left( \mathfrak {g}_E \right) } \right) , \end{aligned}$$

where we have dropped the subscripts designating the norms in order to simplify notation. By the Cauchy–Schwarz inequality,

$$\begin{aligned} \left| \mathcal {B}\left( A,A \right) \right|&\le \sqrt{ \tfrac{n-1}{n}} \cdot \sqrt{ |{{\,\mathrm{Z}\,}}|^2_{S_0^2(T^*M)} + 2 \gamma _0^2 |\Phi |^2_{\Lambda ^2(\mathfrak {g}_E)} } \cdot \sqrt{ |A_1|_{\Lambda ^1(\mathfrak {g}_E)}^4 + |A_2|^4_{\Lambda ^1(\mathfrak {g}_E)} } \\&\le \sqrt{ \tfrac{n-1}{n}} \cdot \left( \sqrt{ |{{\,\mathrm{Z}\,}}|^2_{S_0^2(T^*M)} + 2 \gamma _0^2 |\Phi |^2_{\Lambda ^2(\mathfrak {g}_E)} } \right) |A|^2_{\Lambda ^1(\mathfrak {g}_E)}. \end{aligned}$$

The result follows. \(\quad \square \)
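The trace-free estimate used for \({{\,\mathrm{Z}\,}}\) in the proof rests on an elementary inequality, sketched here for a vector \(\vec {z} \in \mathbb R^n\) with \(\sum _i z_i = 0\). For each fixed k, the Cauchy–Schwarz inequality gives

$$\begin{aligned} z_k^2 = \Big ( \sum _{i \ne k} z_i \Big )^2 \le (n-1) \sum _{i \ne k} z_i^2 = (n-1) \left( |\vec {z}|^2 - z_k^2 \right) , \qquad \text {hence} \qquad |z_k| \le \sqrt{\tfrac{n-1}{n}} |\vec {z}|. \end{aligned}$$

Consequently,

$$\begin{aligned} \Big | \sum _i z_i a_i^2 \Big | \le \max _k |z_k| \sum _i a_i^2 \le \sqrt{\tfrac{n-1}{n}} |\vec {z}| |\vec {a}|^2. \end{aligned}$$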

3.3 A canonical conformal representative

Since the index and nullity of a Yang–Mills connection in four dimensions are conformally invariant, we may estimate them with respect to any metric conformal to the base metric g. In this subsection, we specify a choice of conformal metric based on our work in [GKS18]. To this end, suppose \(\nabla \) is a Yang–Mills connection on a vector bundle E over \((X^4,g)\) with structure group \(G \subset {{\,\mathrm{SO}\,}}(E)\), and denote the curvature by \(F = F_{\nabla }\). For \(t \ge 0\), define

$$\begin{aligned} \Phi _{g}^t = R_{g} - t \big [ 2 \sqrt{6} |W|_{g} + 3 \gamma _1 |F|_{g}\big ], \end{aligned}$$

where \(R_g\) is the scalar curvature of g, \(W_g\) is the Weyl tensor, and \(\gamma _1(E)\) is the constant given by

$$\begin{aligned} \gamma _1(E) :=&\ \sup _{\omega \in \Lambda ^2_{+}(\mathfrak {g}_E) \setminus \{0\}} \dfrac{ \langle \omega , [\omega , \omega ]\rangle }{|\omega |^3}. \end{aligned}$$
(3.7)

Remark 3.6

The definition of the inner product on \(\Lambda ^2_{+}(\mathfrak {g}_E)\) given in [GKS18] differs from the definition of this paper. In particular, the estimate for \(\gamma _1(E)\) in Section 2 of [GKS18] needs to be adjusted. With respect to our current conventions, we have the estimate

$$\begin{aligned} \begin{aligned} \gamma _1(E)&\le \tfrac{2\sqrt{6}}{3} \gamma _0(E) \\&\le \tfrac{4 \sqrt{3}}{3}. \end{aligned} \end{aligned}$$
(3.8)
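The second inequality in (3.8) is just (3.3) combined with a short computation:

$$\begin{aligned} \tfrac{2\sqrt{6}}{3} \gamma _0(E) \le \tfrac{2\sqrt{6}}{3} \cdot \sqrt{2} = \tfrac{2 \sqrt{12}}{3} = \tfrac{4\sqrt{3}}{3} \approx 2.31. \end{aligned}$$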

We also define the associated operator

$$\begin{aligned} L^t_g = - 6 \Delta _g + \Phi _g^t. \end{aligned}$$

In [GKS18], based on the ideas of [Gur00], we defined the related curvature and operator

$$\begin{aligned} \begin{aligned} \Phi _{g}&= R_{g} - 2 \sqrt{6} |W^{+}|_{g} - 3 \gamma _1 |F^{+}|_{g},\\ L_g&= - 6 \Delta _g + \Phi _g. \end{aligned} \end{aligned}$$
(3.9)

It is easy to see that the expression \(\gamma _1(E) |F|\) is independent of the choice of norms. Therefore, despite the difference of conventions pointed out in Remark 3.6, the definition of \(\Phi _g\) in (3.9) agrees with the corresponding formula (3.5) in [GKS18].

Observe that

$$\begin{aligned} \Phi _g^0&= R_g, \\ \Phi _g^1&\le \Phi _g. \end{aligned}$$

Note that the latter inequality implies

$$\begin{aligned} \lambda _1(L^1_g) \le \lambda _1(L_g). \end{aligned}$$
(3.10)
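A brief sketch of why these hold. Pointwise, \(|W|^2 = |W^{+}|^2 + |W^{-}|^2\) and \(|F|^2 = |F^{+}|^2 + |F^{-}|^2\), and \(\gamma _1 \ge 0\) because the cubic expression in (3.7) changes sign when \(\omega \) is replaced by \(-\omega \). Therefore

$$\begin{aligned} \Phi _g - \Phi _g^1 = 2 \sqrt{6} \left( |W|_g - |W^{+}|_g \right) + 3 \gamma _1 \left( |F|_g - |F^{+}|_g \right) \ge 0. \end{aligned}$$

For every \(\phi \in C^{\infty }(X)\), integration by parts then gives

$$\begin{aligned} \int _X \phi L_g^1 \phi \ {{\,\mathrm{dV}\,}}_g = \int _X \left( 6 |\nabla \phi |^2 + \Phi _g^1 \phi ^2 \right) {{\,\mathrm{dV}\,}}_g \le \int _X \left( 6 |\nabla \phi |^2 + \Phi _g \phi ^2 \right) {{\,\mathrm{dV}\,}}_g = \int _X \phi L_g \phi \ {{\,\mathrm{dV}\,}}_g, \end{aligned}$$

and taking the infimum of the corresponding Rayleigh quotients yields the eigenvalue comparison.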

In addition, \(\Phi ^t\) satisfies the same kind of conformal transformation formula as \(\Phi \): given \(\hat{g}= u^2 g\),

$$\begin{aligned} \Phi _{\hat{g}}^t&= u^{-3} L_g^t u. \end{aligned}$$
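This formula can be checked directly from the standard conformal transformation laws in dimension four. The following is a sketch, using that the Weyl tensor is conformally invariant as a (1,3)-tensor and that the curvature form F does not depend on the metric. With \(\hat{g}= u^2 g\),

$$\begin{aligned} R_{\hat{g}} = u^{-3} \left( - 6 \Delta _g u + R_g u \right) , \qquad |W|_{\hat{g}} = u^{-2} |W|_g, \qquad |F|_{\hat{g}} = u^{-2} |F|_g, \end{aligned}$$

so that

$$\begin{aligned} \Phi _{\hat{g}}^t&= u^{-3} \left( - 6 \Delta _g u + R_g u \right) - t u^{-2} \big [ 2 \sqrt{6} |W|_g + 3 \gamma _1 |F|_g \big ] \\&= u^{-3} \left( - 6 \Delta _g u + \Phi _g^t u \right) . \end{aligned}$$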

If \(\lambda _1(L^t)\) denotes the first eigenvalue of \(L^t\),

$$\begin{aligned} \lambda _1(L_g^t) = \inf _{\phi \in C^{\infty }(X) \backslash \left\{ 0 \right\} } \dfrac{ \int _X \phi L_g^t \phi \ {{\,\mathrm{dV}\,}}_g }{\int _X \phi ^2\ {{\,\mathrm{dV}\,}}_g}, \end{aligned}$$
(3.11)

then the sign of \(\lambda _1(L^t)\) is a conformal invariant (see [Gur00], Proposition 3.2). In particular, by using an eigenfunction associated with \(\lambda _1(L^t)\) as a conformal factor, it follows that [g] admits a metric \(\hat{g}\) with \(\Phi _{\hat{g}}^t > 0\) (resp., \(= 0, < 0\)) if and only if \(\lambda _1(L_g^t) > 0\) (resp., \(= 0, < 0\)).

Proposition 3.7

Assume \((X^4,[g])\) has \({{\,\mathrm{Y}\,}}(X^4,[g]) > 0\). Given \(\nabla \) a Yang-Mills connection which is not an instanton, there exists \(t_0 \in (0,1]\) such that \(\lambda _1(L_g^{t_0}) = 0\). In particular, we can choose a conformal metric \(\hat{g}\in [g]\) with respect to which \(\Phi ^{t_0}_{\hat{g}} \equiv 0\), hence

$$\begin{aligned} R_{\hat{g}} = 2\sqrt{6} t_0 |W_{\hat{g}}| + 3 \gamma _1 t_0 |F|_{\hat{g}}. \end{aligned}$$
(3.12)

Moreover,

$$\begin{aligned} \dfrac{ {{\,\mathrm{Y}\,}}(X^4,[g]) }{ 2 \sqrt{6} \Vert W \Vert _{L^2} + 3 \gamma _1 \Vert F \Vert _{L^2}}\le t_0 \le 1. \end{aligned}$$
(3.13)

Proof

Using the Bochner formula for Yang-Mills connections, in [GKS18] we showed that either \(F^{+} \equiv 0\), or else \(\lambda _1(L_g) \le 0\) (see [GKS18] following (3.8) of the proof of Theorem 1.1). Since we are ruling out the former by assumption, the latter condition must hold. Consequently, by (3.10), \(\lambda _1(L_g^1) \le 0\). In fact, we can assume \(\lambda _1(L_g^1) < 0\), since otherwise we could take \(t_0 = 1\).

Clearly, \(\lambda _1(L_g^t)\) depends continuously on the parameter t. Since \(\Phi _g^0 = R_g\) and the Yamabe invariant of \((X^4,[g])\) is positive, we know that \(\lambda _1(L_g^0) > 0\). By the intermediate value theorem, it follows there is \(t_0 \in (0,1]\) with \(\lambda _1(L_g^{t_0}) = 0\). Also, integrating (3.12) and using the Cauchy-Schwarz inequality it is easy to see that \(t_0\) satisfies (3.13). \(\quad \square \)
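For the reader's convenience, here is a sketch of the integration argument for the lower bound in (3.13). It assumes the standard normalization of the Yamabe invariant in dimension four, namely \({{\,\mathrm{Y}\,}}(X^4,[g]) \le \mathrm {Vol}(\hat{g})^{-1/2} \int _X R_{\hat{g}} \, {{\,\mathrm{dV}\,}}_{\hat{g}}\) for every metric \(\hat{g}\in [g]\), and it uses that \(\Vert W \Vert _{L^2}\) and \(\Vert F \Vert _{L^2}\) are conformally invariant. Integrating (3.12) and applying the Cauchy–Schwarz inequality,

$$\begin{aligned} {{\,\mathrm{Y}\,}}(X^4,[g]) \, \mathrm {Vol}(\hat{g})^{1/2}&\le \int _X R_{\hat{g}} \, {{\,\mathrm{dV}\,}}_{\hat{g}} = t_0 \int _X \left( 2 \sqrt{6} |W_{\hat{g}}| + 3 \gamma _1 |F|_{\hat{g}} \right) {{\,\mathrm{dV}\,}}_{\hat{g}} \\&\le t_0 \left( 2 \sqrt{6} \Vert W \Vert _{L^2} + 3 \gamma _1 \Vert F \Vert _{L^2} \right) \mathrm {Vol}(\hat{g})^{1/2}, \end{aligned}$$

and dividing through gives the stated lower bound on \(t_0\).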

3.4 The proof of Theorem 1.1

In this subsection we use Theorem 2.1 to give the proof of Theorem 1.1. As remarked above, since the index and nullity are conformal invariants we are free to make a conformal modification of the base metric and we choose the conformal gauge guaranteed by Proposition 3.7. To begin we obtain an algebraic estimate for the Jacobi operator. Specifically, let Z now denote the trace-free Ricci tensor, i.e.

$$\begin{aligned} {{\,\mathrm{Z}\,}}:= {{\,\mathrm{Ric}\,}}- \tfrac{1}{4} R g. \end{aligned}$$

We express \(\mathcal {J}^{\nabla }\) as

$$\begin{aligned} \mathcal {J}^{\nabla }= & {} \ -\Delta + \tfrac{1}{4} R + {{\,\mathrm{Z}\,}}+ 2 [ F_{\nabla }, \cdot ]\nonumber \\= & {} \ - \Delta + \tfrac{1}{6}R + \left\{ \tfrac{1}{12}R + \tfrac{\sqrt{3} }{12} \gamma _1 t_0 [ F, \cdot ] \right\} _{\mathcal {A}} + \left\{ {{\,\mathrm{Z}\,}}+ \left( 2 - \tfrac{\sqrt{3}}{12} \gamma _1 t_0 \right) [F,\cdot ] \right\} _{\mathcal {B}}, \end{aligned}$$
(3.14)

and proceed to estimate the zeroth-order operators \(\mathcal {A}\) and \(\mathcal {B}\) labeled above.

Lemma 3.8

As a bilinear form, \(\mathcal A \ge 0\).

Proof

If we take \({{\,\mathrm{Z}\,}}=0\) and \(\Phi = F_{\nabla }\) in Proposition 3.5, then

$$\begin{aligned} |F(A,A)|&= | \langle [ F, A], A \rangle | \\&\le \tfrac{\sqrt{3}}{2} \sqrt{ 2 \gamma _0^2 |F|^2 }|A|^{2} \\&= \tfrac{\sqrt{6}}{2} \gamma _0 |F| |A|^2. \end{aligned}$$

Since \(\gamma _0 \le \sqrt{2}\), it follows that

$$\begin{aligned} | \langle [ F, A], A \rangle | \le \sqrt{3} |F| |A|^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \mathcal {A}(A,A)&= \tfrac{1}{12} R \left| A \right| ^2 + \tfrac{\sqrt{3}}{12} \gamma _1 t_0\left\langle \left[ F, A \right] , A \right\rangle \\&\ge \tfrac{1}{12} R |A|^2 - \tfrac{\sqrt{3} }{12} \gamma _1 t_0 \left( \sqrt{3} |F| \right) |A|^2 \\&= \tfrac{1}{12} \left( R - 3 \gamma _1 t_0 \left| F \right| \right) \left| A \right| ^2. \end{aligned}$$

Using the formula for the scalar curvature in (3.12), we conclude

$$\begin{aligned} \mathcal {A}(A,A)&\ge \tfrac{1}{12} \left( R - 3 \gamma _1 t_0 \left| F \right| \right) \left| A \right| ^2 \\&= \tfrac{\sqrt{6}}{6} t_0 \left| W \right| \left| A \right| ^2 \\&\ge 0. \end{aligned}$$

\(\square \)

Lemma 3.9

Let

$$\begin{aligned} \alpha = 2 - \tfrac{\sqrt{3}}{12} \gamma _1 t_0 > 0. \end{aligned}$$
(3.15)

Then

$$\begin{aligned} \mathcal {B} \left( A,A \right) \ge - \left[ \tfrac{3}{4} \left| {{\,\mathrm{Z}\,}} \right| ^2 + 3 \alpha ^2 \left| F \right| ^2 \right] ^{1/2} \left| A \right| ^2. \end{aligned}$$
(3.16)

Proof

Note that \(\mathcal {B} = {{\,\mathrm{Z}\,}}+ \alpha [F, \cdot ]\). If we take \(\Phi = \alpha F\) in Proposition 3.5 and use the fact that \(\gamma _0 \le \sqrt{2}\), then

$$\begin{aligned} \mathcal {B}(A,A)&\ge -\tfrac{\sqrt{3}}{2} \left[ \left| {{\,\mathrm{Z}\,}} \right| ^2 + 2 \gamma _0^2 \alpha ^2 \left| F \right| ^2 \right] ^{1/2} \left| A \right| ^2 \\&\ge - \left[ \tfrac{3}{4} \left| {{\,\mathrm{Z}\,}} \right| ^2 + 3 \alpha ^2 \left| F \right| ^2 \right] ^{1/2}\left| A \right| ^2, \end{aligned}$$

as claimed. \(\quad \square \)
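Two small computations behind this lemma, recorded as a sketch. First, the positivity claimed in (3.15): since \(0 < t_0 \le 1\) and \(\gamma _1 \le \tfrac{4\sqrt{3}}{3}\) by (3.8),

$$\begin{aligned} 0 \le \tfrac{\sqrt{3}}{12} \gamma _1 t_0 \le \tfrac{\sqrt{3}}{12} \cdot \tfrac{4 \sqrt{3}}{3} = \tfrac{1}{3}, \qquad \text {so} \qquad \tfrac{5}{3} \le \alpha \le 2. \end{aligned}$$

Second, the passage between the two lines of the estimate in the proof uses \(\gamma _0^2 \le 2\):

$$\begin{aligned} \tfrac{\sqrt{3}}{2} \left[ \left| {{\,\mathrm{Z}\,}} \right| ^2 + 2 \gamma _0^2 \alpha ^2 \left| F \right| ^2 \right] ^{1/2} \le \tfrac{\sqrt{3}}{2} \left[ \left| {{\,\mathrm{Z}\,}} \right| ^2 + 4 \alpha ^2 \left| F \right| ^2 \right] ^{1/2} = \left[ \tfrac{3}{4} \left| {{\,\mathrm{Z}\,}} \right| ^2 + 3 \alpha ^2 \left| F \right| ^2 \right] ^{1/2}. \end{aligned}$$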

In view of (3.14) and Lemmas 3.8 and 3.9, we have

$$\begin{aligned} \begin{aligned} \langle \mathcal {J}^{\nabla } A, A \rangle _{L^2}&\ge \langle \big ( - \Delta + \tfrac{1}{6}R - \left[ \tfrac{3}{4} \left| {{\,\mathrm{Z}\,}} \right| ^2 + 3 \alpha ^2 \left| F \right| ^2 \right] ^{1/2} \big ) A, A \rangle _{L^2} \\&= \langle \left( - \Delta + \tfrac{1}{6}R - V \right) A, A \rangle _{L^2}, \end{aligned} \end{aligned}$$
(3.17)

where

$$\begin{aligned} V = \left[ \tfrac{3}{4} \left| {{\,\mathrm{Z}\,}} \right| ^2 + 3 \alpha ^2 \left| F \right| ^2 \right] ^{1/2}. \end{aligned}$$
(3.18)

We therefore define

$$\begin{aligned} {\mathcal S} = - \Delta + \tfrac{1}{6}R - V. \end{aligned}$$
(3.19)

To estimate the index and nullity of \(\mathcal J^{\nabla }\) it suffices to obtain the estimate for \(\mathcal S\), since by (3.17) whenever \(\mathcal {J}^{\nabla }\) is nonpositive on a subspace, then so is \(\mathcal {S}\). Applying Theorem 2.1 to the operator \(\mathcal S\) on the bundle \(\Lambda ^{1} (\mathfrak g_E)\), which has rank 4d, where \(d = \dim (\mathfrak {g}_E)\), we obtain

$$\begin{aligned} \begin{aligned} N_0(\mathcal {S})&\le \dfrac{ 144 e^2 d }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } \int _{X} V^2 \, {{\,\mathrm{dV}\,}}\\&\le \dfrac{ 144 e^2 d }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } \Bigg \{ \tfrac{3}{4} \int _{X} |{{\,\mathrm{Z}\,}}|^2 \,{{\,\mathrm{dV}\,}}+ 3 \alpha ^2 \int _{X} |F|^2 \, {{\,\mathrm{dV}\,}}\Bigg \}. \end{aligned} \end{aligned}$$
(3.20)

By the Chern–Gauss–Bonnet formula

$$\begin{aligned} \tfrac{3}{4} \int _X \left| {{\,\mathrm{Z}\,}} \right| ^2 \,{{\,\mathrm{dV}\,}}= - 12 \pi ^2 \chi \left( X^4 \right) + \tfrac{3}{2} \int _X | W |^2 \,{{\,\mathrm{dV}\,}}+ \tfrac{1}{16} \int _X R^2 \,{{\,\mathrm{dV}\,}}. \end{aligned}$$
(3.21)

Using the conformal gauge fixing of Proposition 3.7, we can estimate the scalar curvature term above as

$$\begin{aligned} \begin{aligned} \tfrac{1}{16} \int _X R^2 \, {{\,\mathrm{dV}\,}}&= \tfrac{t_0^2}{16} \int _X \left( 2 \sqrt{6}|W| + 3 \gamma _1 |F| \right) ^2 \, {{\,\mathrm{dV}\,}}\\&= \tfrac{3}{2} t_0^2 \int _X |W|^2 \, {{\,\mathrm{dV}\,}}+ \tfrac{3\sqrt{6}}{4} \gamma _1 t_0^2 \int _X |W||F| \, {{\,\mathrm{dV}\,}}+ \tfrac{9}{16} \gamma _1^2 t_0^2 \int _X |F|^2 \, {{\,\mathrm{dV}\,}}. \end{aligned} \end{aligned}$$

Substituting this into (3.21) gives

$$\begin{aligned} \begin{aligned} \tfrac{3}{4} \int _X \left| {{\,\mathrm{Z}\,}} \right| ^2 \,{{\,\mathrm{dV}\,}}&= - 12 \pi ^2 \chi \left( X^4 \right) + \tfrac{3}{2}\left( 1 + t_0^2 \right) \int _X | W |^2 \,{{\,\mathrm{dV}\,}}\\&\quad + \tfrac{3\sqrt{6}}{4} \gamma _1 t_0^2 \int _X |W||F| \, {{\,\mathrm{dV}\,}}+ \tfrac{9}{16} \gamma _1^2 t_0^2 \int _X |F|^2 \, {{\,\mathrm{dV}\,}}. \end{aligned} \end{aligned}$$

We now substitute this into (3.20) to get

$$\begin{aligned} \begin{aligned} N_0(\mathcal {S})&\le \dfrac{ 144 e^2 d }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } \Big \{ - 12 \pi ^2 \chi \left( X^4 \right) + \tfrac{3}{2}\left( 1 + t_0^2 \right) \int _X | W |^2 \,{{\,\mathrm{dV}\,}}\\&\quad + \tfrac{3\sqrt{6}}{4} \gamma _1 t_0^2 \int _X |W||F| \, {{\,\mathrm{dV}\,}}+ \left( 3 \alpha ^2 + \tfrac{9}{16} \gamma _1^2 t_0^2 \right) \int _{X} |F|^2 \, {{\,\mathrm{dV}\,}}\Big \}. \end{aligned} \end{aligned}$$
(3.22)

We estimate the coefficients of each of the terms above as follows. For the first coefficient, since \(t_0 \le 1\) we have

$$\begin{aligned} \tfrac{3}{2}\left( 1 + t_0^2 \right) \le 3. \end{aligned}$$

Since \(0 \le t_0 \le 1\) and by (3.8) \(\gamma _1 \le \tfrac{4\sqrt{3}}{3}\), we can bound the second coefficient by

$$\begin{aligned} \begin{aligned} \tfrac{3\sqrt{6}}{4} \gamma _1 t_0^2&\le \tfrac{3\sqrt{6}}{4} \gamma _1 \\&\le 3 \sqrt{2}. \end{aligned} \end{aligned}$$
(3.23)

For the third coefficient we use the formula for \(\alpha \) in (3.15) to write

$$\begin{aligned} \begin{aligned} \left( 3 \alpha ^2 + \tfrac{9}{16} \gamma _1^2 t_0^2 \right) = \tfrac{5}{8} (\gamma _1 t_0)^2 - \sqrt{3} (\gamma _1 t_0) + 12. \end{aligned} \end{aligned}$$
(3.24)

Now \(\gamma _1 t_0 \le \frac{4\sqrt{3}}{3}\), and the quadratic polynomial \(q(x) = \tfrac{5}{8} x^2 - \sqrt{3} x + 12\) attains its maximum at \(x = 0\) on the interval \(\left[ 0, \frac{4 \sqrt{3}}{3} \right] \). Consequently,

$$\begin{aligned} \left( 3\alpha ^2 + \tfrac{9}{16} \gamma _1^2 t_0^2 \right) \le 12. \end{aligned}$$
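The algebra behind (3.24) and the bound above can be written out as follows, with \(x = \gamma _1 t_0\):

$$\begin{aligned} 3 \alpha ^2&= 3 \left( 2 - \tfrac{\sqrt{3}}{12} x \right) ^2 = 12 - \sqrt{3} x + \tfrac{1}{16} x^2, \\ 3 \alpha ^2 + \tfrac{9}{16} x^2&= \tfrac{5}{8} x^2 - \sqrt{3} x + 12 = q(x). \end{aligned}$$

Since q is convex, its maximum over the interval is attained at an endpoint, and

$$\begin{aligned} q(0) = 12, \qquad q \left( \tfrac{4\sqrt{3}}{3} \right) = \tfrac{10}{3} - 4 + 12 = \tfrac{34}{3} < 12. \end{aligned}$$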

With these estimates on the coefficients, we can rewrite (3.22) as

$$\begin{aligned} N_0(\mathcal {\mathcal S}) \le&\, \dfrac{ 144 e^2 d }{{{\,\mathrm{Y}\,}}(X^4, [g])^2 } \Big \{ - 12 \pi ^2 \chi \left( X^4 \right) + 3 \int _{X} | W |^2 \,{{\,\mathrm{dV}\,}}\\&\quad + 3 \sqrt{2} \int _X |W||F| \, {{\,\mathrm{dV}\,}}+ 12 \int _{X} |F|^2 \, {{\,\mathrm{dV}\,}}\Big \}, \end{aligned}$$

finishing the proof. \(\quad \square \)

3.5 Linear growth rate in four dimensions

Theorem 1.1 shows that the index can grow at worst linearly in the Yang–Mills energy of the connection. In this section we show that this growth rate is sharp through an explicit family of examples. Various authors [SJU89, HM90, SS92, Bor92] have shown the existence of families of non-instanton Yang–Mills connections for a given \({{\,\mathrm{SU}\,}}(2)\) bundle over \(\mathbb {S}^4\) provided that the charge \(\kappa \) satisfies \(\kappa (E) \ne \pm 1\). We will use the work of Sadun–Segert [SS92], who constructed non-instanton Yang–Mills connections on the so-called ‘quadrupole bundles.’ The proposition below analyzes this construction in conjunction with an index estimate of Taubes ([Tau83] Theorem 1.1) to exhibit the required index growth.

Proposition 3.10

Given \(l = 4k - 1 > 1\), let \(\nabla ^l\) denote the Sadun–Segert connection on the quadrupole bundle \(P_{(l,3)} \rightarrow \mathbb {S}^4\). There exists a constant \(\delta > 0\) so that

$$\begin{aligned} \imath \left( \nabla ^l \right) \ge \delta \left| \left| F_{\nabla ^l} \right| \right| _{L^2}^2. \end{aligned}$$

Proof

We assume familiarity with the results and notation of [SS92]. The quadrupole bundles are defined by different lifts of the unique irreducible representation of \({{\,\mathrm{SU}\,}}(2)\) on \(\mathbb R^5\), and are classified by a pair of odd positive integers \((n_+, n_-)\), with the bundle denoted \(P_{(n_+, n_-)}\). The construction of [SS92] further restricts to the case \(n_{\pm } \ne 1\). We will choose \(n_+ = l = 4k - 1 > 1\), \(n_- = 3\), and let \(\nabla ^l\) denote the Sadun–Segert connection on \(P_{(l,3)}\). As computed in [SS89, ASSS89] one has

$$\begin{aligned} \kappa (P_{(n_+, n_-)}) = \tfrac{1}{8} (n_+^2 - n_-^2) = \tfrac{1}{8} \left( l^2 - 9 \right) . \end{aligned}$$
(3.25)

Furthermore, as the connection \(\nabla ^l\) is not self-dual, [Tau83] Theorem 1.1 yields

$$\begin{aligned} \imath (\nabla ^l) \ge 2 \left( \left| \kappa (P_{(l, 3)}) \right| + 1 \right) . \end{aligned}$$
(3.26)

We claim that there exists a constant \(C > 0\) so that \(\nabla ^l\) satisfies

$$\begin{aligned} \left| \left| F_{\nabla ^l} \right| \right| _{L^2}^2 \le C l^2. \end{aligned}$$
(3.27)

Assuming this for the moment, putting together (3.25) - (3.27) yields

$$\begin{aligned} \imath (\nabla ^l) \ge&\ 2 \left( \left| \kappa (P_{(l, 3)}) \right| + 1 \right) \\ =&\ 2 \left( \tfrac{1}{8} \left( l^2 - 9 \right) + 1 \right) \\ =&\ \tfrac{1}{4} \left( l^2 - 1 \right) \\ \ge&\ \tfrac{1}{8} l^2\\ \ge&\ \tfrac{1}{8C} \left| \left| F_{\nabla ^l} \right| \right| _{L^2}^2, \end{aligned}$$

as required.
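To illustrate the quadratic growth in l, here are the first few values produced by (3.25) and (3.26). This is plain arithmetic from those two formulas:

$$\begin{aligned} l = 7:&\quad \kappa = \tfrac{1}{8}(49 - 9) = 5, \qquad \imath (\nabla ^7) \ge 12, \\ l = 11:&\quad \kappa = \tfrac{1}{8}(121 - 9) = 14, \qquad \imath (\nabla ^{11}) \ge 30, \\ l = 15:&\quad \kappa = \tfrac{1}{8}(225 - 9) = 27, \qquad \imath (\nabla ^{15}) \ge 56. \end{aligned}$$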

We now prove line (3.27). Connections with quadrupole symmetry on these bundles are described in terms of a triple of functions \(a_i : (0,\frac{\pi }{3}) \rightarrow \mathbb R\), \(i = 1,2,3\). The bundle on which the connection is defined is determined by the boundary data. In particular, as per ([SS92] Definition 2.5, Lemma 2.6), we require that \(a = (a_1,a_2,a_3)\) satisfies

$$\begin{aligned} \lim _{\theta \rightarrow 0} a\left( \theta \right) = \left( 0,0,l \right) , \qquad \lim _{\theta \rightarrow \frac{\pi }{3}} a(\theta ) = \left( 0,3,0 \right) , \end{aligned}$$
(3.28)

and moreover each \(a_i\) extends to \((-\epsilon , \frac{\pi }{3} + \epsilon )\) such that for all \(\theta \in (-\epsilon ,\epsilon )\),

$$\begin{aligned} \begin{aligned} a_1\left( \theta \right) = a_2\left( -\theta \right) ,&\qquad a_3 \left( \theta \right) = a_3\left( -\theta \right) \\ a_1\left( \tfrac{\pi }{3} + \theta \right) = a_3(\tfrac{\pi }{3} - \theta ),&\qquad a_2\left( \tfrac{\pi }{3} + \theta \right) = a_2\left( \tfrac{\pi }{3} - \theta \right) . \end{aligned} \end{aligned}$$
(3.29)

We can construct a test connection which satisfies these conditions as follows. First set \(a_1 \equiv 0\). Fix some small \(\delta > 0\) and define \(a_2\) via

$$\begin{aligned} a_2(\theta ) \equiv&\ 0 \qquad \text{ for } \theta \in (-\delta ,\delta )\\ a_2(\theta ) \equiv&\ 3 \qquad \text{ for } \theta \in \left( \tfrac{\pi }{3} - \delta , \tfrac{\pi }{3} + \delta \right) \\ 0 \le a_2(\theta ) \le&\ 3 \qquad \text{ for } \theta \in [0,\tfrac{\pi }{3}]\\ 0 \le a_2'(\theta ) \le&\ 5 \qquad \text{ for } \theta \in [0,\tfrac{\pi }{3}] \end{aligned}$$

and we define \(a_3\) via

$$\begin{aligned} a_3(\theta ) \equiv&\ 3 \qquad \text{ for } \theta \in (-\delta ,\delta )\\ a_3(\theta ) \equiv&\ 0 \qquad \text{ for } \theta \in \left( \tfrac{\pi }{3} - \delta , \tfrac{\pi }{3} + \delta \right) \\ 0 \le a_3(\theta ) \le&\ 3 \qquad \text{ for } \theta \in [0,\tfrac{\pi }{3}]\\ 0 \ge a_3'(\theta ) \ge&\ -5 \qquad \text{ for } \theta \in [0,\tfrac{\pi }{3}]. \end{aligned}$$

One easily checks that this satisfies conditions (3.28) and (3.29) for \(l = 3\). Furthermore, if we set, for \(l > 0\),

$$\begin{aligned} a_{l} := \left( a_1, a_2, \tfrac{l}{3} a_3 \right) \end{aligned}$$

then \(a_{l}\) satisfies the conditions of (3.28) and (3.29) for the (l, 3) bundle, and furthermore its third component, which we again denote by \(a_3\), satisfies

$$\begin{aligned} 0 \le a_3(\theta ) \le l, \qquad 0 \ge a_3'(\theta ) \ge -5 l. \end{aligned}$$

In ([SS92] Proposition 2.7) the Yang–Mills energy of these connections is computed, and takes the form

$$\begin{aligned} \left| \left| F_{\nabla (a)} \right| \right| _{L^2}^2= & {} \ \pi ^2 \int _0^{\tfrac{\pi }{3}} \left[ (a_1')^2 G_1 + (a_1 + a_2 a_3)^2/G_1 + (a'_2)^2 G_2 + (a_2 + a_1a_3)^2/G_2 \right. \nonumber \\&\left. \qquad \qquad + (a_3')^2 G_3 + (a_3 + a_1 a_2)^2/G_3 \right] {{\,\mathrm{d}\,}}\theta , \end{aligned}$$
(3.30)

where

$$\begin{aligned} G_1 =&\ \tfrac{f_2 f_3}{f_1}, \qquad G_2 = \tfrac{f_3 f_1}{f_2}, \qquad G_3 = \tfrac{f_1 f_2}{f_3}\\ f_1\left( \theta \right) =&\ 2 \sin \left( \tfrac{\pi }{3} + \theta \right) , \qquad f_2 \left( \theta \right) = 2 \sin \left( \tfrac{\pi }{3} - \theta \right) , \qquad f_3\left( \theta \right) = 2 \sin \left( \theta \right) . \end{aligned}$$

Note that some terms in the energy formula involve factors of the \(G_i\) which can blow up at one endpoint or the other, but the boundary conditions for a ensure that these are finite integrals. In particular, for our initial choice \(a = a_{l}\) with \(l = 3\), we obtain some value for the Yang–Mills energy, call it C. We furthermore observe that every term in (3.30) is at worst quadratic in \(a_3\) and \(a_3'\), which both grow linearly with l, while the remaining factors do not depend on l. Hence there is a (possibly different) constant C such that, for \(l \ge 1\),

$$\begin{aligned} \left| \left| F_{\nabla (a_{l})} \right| \right| _{L^2}^2 \le C l^2. \end{aligned}$$
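To make the dependence on l explicit, the energy of the test connection can be expanded term by term. The following is a sketch under stated assumptions: every zeroth-order term in (3.30) is read as a square, \(a_1 \equiv 0\), \(a_2\) and \(a_3\) denote the fixed profiles chosen for \(l = 3\), and the constants A and B are auxiliary names introduced only for this computation.

$$\begin{aligned} \left| \left| F_{\nabla (a_{l})} \right| \right| _{L^2}^2 =&\ \pi ^2 \int _0^{\tfrac{\pi }{3}} \left[ (a_2')^2 G_2 + a_2^2/G_2 \right] {{\,\mathrm{d}\,}}\theta + \tfrac{l^2}{9}\, \pi ^2 \int _0^{\tfrac{\pi }{3}} \left[ a_2^2 a_3^2/G_1 + (a_3')^2 G_3 + a_3^2/G_3 \right] {{\,\mathrm{d}\,}}\theta \\ =:&\ A + \tfrac{l^2}{9} B \ \le \ \left( A + \tfrac{1}{9} B \right) l^2 \qquad \text{ for } l \ge 1. \end{aligned}$$

Both A and B are finite because each profile is constant near the endpoint where its weight degenerates: \(G_2\) blows up at \(\tfrac{\pi }{3}\), where \(a_2' \equiv 0\); \(1/G_2\) blows up at 0, where \(a_2 \equiv 0\); \(G_3\) blows up at 0, where \(a_3' \equiv 0\); \(1/G_3\) blows up at \(\tfrac{\pi }{3}\), where \(a_3 \equiv 0\); and \(1/G_1\) blows up at both endpoints, where \(a_2 a_3 \equiv 0\).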

As the Sadun–Segert connection is constructed by energy minimization within this symmetry class ([SS92] Proposition 3.4, Theorem 3.10), its energy must lie below that of this test connection, finishing the proof of (3.27). \(\quad \square \)

4 The Index of a Positive Einstein Metric

Let \(X^4\) be a smooth, closed, four-dimensional manifold. Furthermore suppose g is a critical point for the normalized total scalar curvature functional given in (1.1):

$$\begin{aligned} \mathscr {S}[g] = {{\,\mathrm{Vol}\,}}(g)^{-1/2} \int _{X^4} R_g \,{{\,\mathrm{dV}\,}}_g, \end{aligned}$$

where \(R_g\) is the scalar curvature of g. It follows that g is an Einstein metric, whose Ricci tensor is given by

$$\begin{aligned} {{\,\mathrm{Ric}\,}}(g) = \tfrac{1}{4} R g \end{aligned}$$

(see [Bes87], Chapter 4C).

To study the second variation of \(\mathscr {S}\) at g, one uses the splitting of the space of sections of the bundle of symmetric two-tensors (see [Sch06] for details). The stability operator, corresponding to transverse-traceless variations of g, is given by

$$\begin{aligned} \begin{aligned} \left( \mathcal {L}(h) \right) _{ij}&= \Delta h_{ij} + 2 R_{ik j \ell } h_{k \ell } \\&= \Delta h_{ij} + 2 W_{ik j \ell } h_{k \ell } - \tfrac{1}{6}R h_{ij}. \end{aligned} \end{aligned}$$
(4.1)
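The second equality in (4.1) can be checked directly. The sketch below assumes the sign convention in which a metric of constant sectional curvature K has \(R_{ik j \ell } = K ( g_{ij} g_{k \ell } - g_{i \ell } g_{kj} )\), and it uses that h is trace-free. Since g is Einstein and the dimension is four, the trace-free Ricci tensor vanishes and the curvature decomposes as

$$\begin{aligned} R_{ik j \ell }&= W_{ik j \ell } + \tfrac{1}{12} R \left( g_{ij} g_{k \ell } - g_{i \ell } g_{kj} \right) , \\ 2 R_{ik j \ell } h_{k \ell }&= 2 W_{ik j \ell } h_{k \ell } + \tfrac{1}{6} R \left( g_{ij} \, \mathrm {tr}\, h - h_{ij} \right) = 2 W_{ik j \ell } h_{k \ell } - \tfrac{1}{6} R h_{ij}. \end{aligned}$$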

This defines an index form

$$\begin{aligned} \begin{aligned} I(h,h)&= \int _X \langle h, \mathcal {L} ( h) \rangle \,{{\,\mathrm{dV}\,}}\\&= \int _X \big [ - |\nabla h|^2 + 2 W(h,h) - \tfrac{1}{6} R |h|^2 \big ] \,{{\,\mathrm{dV}\,}}, \end{aligned} \end{aligned}$$
(4.2)

where

$$\begin{aligned} W(h,h) = W_{ik j \ell } h_{k \ell } h_{ij}. \end{aligned}$$

The index \(\imath (g)\) of an Einstein metric is the number of positive eigenvalues of \(\mathcal {L}\) (equivalently, the number of negative eigenvalues of \(-\mathcal {L}\)). The nullity \(\nu (g)\) of an Einstein metric is the dimension of the kernel of \(\mathcal {L}\), i.e., the dimension of the space of infinitesimal Einstein deformations (see Chapter 12 of [Bes87]). With this background we can give the proof of Theorem 1.3.

Proof of Theorem 1.3

Note that \(\mathcal {L} : S_0^2(T^{*}X^4) \rightarrow S_0^2(T^{*}X^4)\), where \(S_0^2(T^{*}X^4)\) is the bundle of trace-free symmetric two-tensors. It follows from ([Hui85], Lemma 3.4) that

$$\begin{aligned} - W(h,h) \ge -\tfrac{2}{\sqrt{3}} |W||h|^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \begin{aligned} \int _X \langle h, - \mathcal {L} (h) \rangle \,{{\,\mathrm{dV}\,}}&\ge \int _X \big [ |\nabla h|^2 - \tfrac{4}{\sqrt{3}} |W||h|^2 + \tfrac{1}{6} R |h|^2 \big ] \,{{\,\mathrm{dV}\,}}\\&= \int _X \langle h, \big ( -\Delta + \tfrac{1}{6}R - V \big )h \rangle \,{{\,\mathrm{dV}\,}}, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} V = \tfrac{4}{\sqrt{3}} |W|. \end{aligned}$$

Since \(\dim (S_0^2(T^{*}X^4)) = 9\), applying Theorem 2.1 to the operator \(\mathcal {N} = -\Delta + \frac{1}{6}R - V\) gives

$$\begin{aligned} \imath (g) + \nu (g) \le 1728 e^2 \dfrac{ \int _X |W|^2 \,{{\,\mathrm{dV}\,}}}{{{\,\mathrm{Y}\,}}(X^4, [g])^2 }. \end{aligned}$$
(4.3)
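As a consistency check on the constant in (4.3), note that the constants in (4.3), (5.1) and (5.3) all agree with a bound of the shape \(36 e^2 \, \mathrm {rk}(\mathcal {E}) \, \Vert V \Vert _{L^2}^2 / {{\,\mathrm{Y}\,}}(X^4,[g])^2\). This shape is an assumption inferred here from those three constants; it is not a quotation of Theorem 2.1. Under this assumption, with rank 9 and \(V = \tfrac{4}{\sqrt{3}}|W|\),

$$\begin{aligned} 36 e^2 \cdot 9 \cdot \left( \tfrac{4}{\sqrt{3}} \right) ^2 = 36 e^2 \cdot 9 \cdot \tfrac{16}{3} = 1728 e^2. \end{aligned}$$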

Since g is Einstein,

$$\begin{aligned} {{\,\mathrm{Y}\,}}(X^4, [g]) = \mathscr {S}[g]. \end{aligned}$$
(4.4)

Also, by the Chern–Gauss–Bonnet formula,

$$\begin{aligned} 8 \pi ^2 \chi (X^4)&= \int _X \big ( |W|^2 + \tfrac{1}{24} R^2 \big ) \,{{\,\mathrm{dV}\,}}\\&= \int _X |W|^2 \,{{\,\mathrm{dV}\,}}+ \tfrac{1}{24} \mathscr {S}[g]^2. \end{aligned}$$

Substituting this into (4.3), using (4.4), and rearranging the inequality gives

$$\begin{aligned} \mathscr {S}[g] \le 24 \pi \sqrt{ \dfrac{ \chi (X^4)}{ 3 + \delta \left[ \imath (g) + \nu (g) \right] }}, \end{aligned}$$

where \(\delta = (24 e^2)^{-1}\), as required. \(\quad \square \)
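The rearrangement in the last step can be spelled out as follows, with the shorthand \(N = \imath (g) + \nu (g)\) and \(\mathscr {S} = \mathscr {S}[g]\) (abbreviations introduced only for this computation). Using (4.4) and the Chern–Gauss–Bonnet formula, (4.3) reads

$$\begin{aligned} N \mathscr {S}^2&\le 1728 e^2 \left( 8 \pi ^2 \chi (X^4) - \tfrac{1}{24} \mathscr {S}^2 \right) = 13824 e^2 \pi ^2 \chi (X^4) - 72 e^2 \mathscr {S}^2, \\ \mathscr {S}^2&\le \dfrac{13824 e^2 \pi ^2 \chi (X^4)}{72 e^2 + N} = \dfrac{576 \pi ^2 \chi (X^4)}{3 + (24 e^2)^{-1} N}, \end{aligned}$$

and taking square roots gives the stated bound, since \(\sqrt{576} = 24\).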

5 The Proof of Theorem 1.4

Proof of Theorem 1.4

Let \((X^4,g)\) be an oriented four-manifold with positive scalar curvature. To obtain the estimate for the first Betti number we only need to make minor changes to the index estimate for Yang–Mills connections, since the Jacobi operator in the case of the trivial bundle is the Hodge Laplacian acting on \(\Lambda ^1\). The only difference is the choice of conformal representative: in the trivial case, we use a Yamabe metric in the conformal class of g instead of the metric specified in Proposition 3.7.

Let \(\mathcal {H}_1 : \Lambda ^1 \rightarrow \Lambda ^1\) denote the Hodge Laplacian. Then by the Hodge–de Rham theorem, \(H^1(X^4,\mathbb {R}) = \ker \mathcal {H}_1\), and \(\dim \ker \mathcal {H}_1 = b_1(X^4)\). Let \(\omega \in H^1(X^4,\mathbb {R})\) be a harmonic one-form; by the classical Bochner formula,

$$\begin{aligned} \langle -\mathcal {H}_1 \omega , \omega \rangle _{L^2}&= \int _{X} \left( |\nabla \omega |^2 + {{\,\mathrm{Ric}\,}}(\omega ,\omega ) \right) \, {{\,\mathrm{dV}\,}}\\&= \int _{X} \left( |\nabla \omega |^2 + \tfrac{1}{4}R |\omega |^2 + {{\,\mathrm{Z}\,}}(\omega ,\omega ) \right) \, {{\,\mathrm{dV}\,}}\\&\ge \int _{X} \left( |\nabla \omega |^2 + \tfrac{1}{6}R |\omega |^2 + {{\,\mathrm{Z}\,}}(\omega ,\omega ) \right) \, {{\,\mathrm{dV}\,}}. \end{aligned}$$

Since \({{\,\mathrm{Z}\,}}\) is trace-free,

$$\begin{aligned} {{\,\mathrm{Z}\,}}(\omega ,\omega ) \ge -\tfrac{\sqrt{3}}{2}|{{\,\mathrm{Z}\,}}||\omega |^2. \end{aligned}$$
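The constant \(\tfrac{\sqrt{3}}{2}\) comes from an elementary eigenvalue bound for trace-free symmetric matrices. As a short derivation, let \(\lambda _1, \dots , \lambda _n\) denote the eigenvalues of a trace-free symmetric endomorphism \({{\,\mathrm{Z}\,}}\) of an n-dimensional inner product space (notation introduced here for illustration). Then \(\lambda _1 = -\sum _{i \ge 2} \lambda _i\), and the Cauchy–Schwarz inequality gives

$$\begin{aligned} \lambda _1^2 = \Big ( \sum _{i \ge 2} \lambda _i \Big )^2 \le (n-1) \sum _{i \ge 2} \lambda _i^2 = (n-1) \left( |{{\,\mathrm{Z}\,}}|^2 - \lambda _1^2 \right) , \qquad \text{ hence } \qquad |\lambda _1| \le \sqrt{\tfrac{n-1}{n}} \, |{{\,\mathrm{Z}\,}}|. \end{aligned}$$

For \(n = 4\) the constant is \(\tfrac{\sqrt{3}}{2}\), so that \({{\,\mathrm{Z}\,}}(\omega ,\omega ) \ge -\tfrac{\sqrt{3}}{2} |{{\,\mathrm{Z}\,}}| |\omega |^2\). For \(n = 3\) the same argument gives \(\sqrt{\tfrac{2}{3}} = \tfrac{2}{\sqrt{6}}\).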

Therefore,

$$\begin{aligned} \langle -\mathcal {H}_1 \omega , \omega \rangle _{L^2}&\ge \int _{X} \left( |\nabla \omega |^2 + \tfrac{1}{6} R |\omega |^2 - \tfrac{\sqrt{3}}{2}|{{\,\mathrm{Z}\,}}| |\omega |^2 \right) \, {{\,\mathrm{dV}\,}}\\&= \big \langle \left( -\Delta + \tfrac{1}{6}R - V \right) \omega , \omega \big \rangle _{L^2}, \end{aligned}$$

where

$$\begin{aligned} V = \tfrac{\sqrt{3}}{2}|{{\,\mathrm{Z}\,}}|. \end{aligned}$$

Applying Theorem 2.1 to the operator \(-\Delta + \frac{1}{6}R - V\) with \(\mathcal {E} = \Lambda ^1\), we get

$$\begin{aligned} b_1(X^4)&\le \frac{108 e^2}{{{\,\mathrm{Y}\,}}(X^4,[g])^2} \int _{X} |{{\,\mathrm{Z}\,}}|^2 \, {{\,\mathrm{dV}\,}}. \end{aligned}$$
(5.1)

Recall

$$\begin{aligned} \rho _1(X^4,[g]) = \dfrac{ 4 \int \sigma _2(A_g) \, {{\,\mathrm{dV}\,}}}{{{\,\mathrm{Y}\,}}(X^4,[g])^2} = \dfrac{ \int _{X} \left( -\tfrac{1}{2}|{{\,\mathrm{Z}\,}}|^2 + \tfrac{1}{24}R^2 \right) \, {{\,\mathrm{dV}\,}}}{{{\,\mathrm{Y}\,}}(X^4,[g])^2}. \end{aligned}$$

Since g is a Yamabe metric,

$$\begin{aligned} \int _{X} R^2 \,{{\,\mathrm{dV}\,}}= {{\,\mathrm{Y}\,}}(X^4,[g])^2. \end{aligned}$$

Consequently,

$$\begin{aligned} \int _{X} |{{\,\mathrm{Z}\,}}|^2 \, {{\,\mathrm{dV}\,}}&= - 2\rho _1(X^4,[g]) {{\,\mathrm{Y}\,}}(X^4,[g])^2 + \tfrac{1}{12} \int _{X} R^2 \, {{\,\mathrm{dV}\,}}\\&= \tfrac{1}{12} \left( 1 - 24 \rho _1(X^4,[g]) \right) {{\,\mathrm{Y}\,}}(X^4,[g])^2. \end{aligned}$$

Substituting this into (5.1) gives (1.3).
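The arithmetic behind this substitution is short enough to record. Inserting the expression for \(\int _X |{{\,\mathrm{Z}\,}}|^2 \, {{\,\mathrm{dV}\,}}\) into the right-hand side of (5.1), the factors of \({{\,\mathrm{Y}\,}}(X^4,[g])^2\) cancel:

$$\begin{aligned} b_1(X^4) \le \frac{108 e^2}{{{\,\mathrm{Y}\,}}(X^4,[g])^2} \cdot \tfrac{1}{12} \left( 1 - 24 \rho _1(X^4,[g]) \right) {{\,\mathrm{Y}\,}}(X^4,[g])^2 = 9 e^2 \left( 1 - 24 \rho _1(X^4,[g]) \right) . \end{aligned}$$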

To estimate \(b^{+}(X^4)\), let \(\mathcal {H}_2 : \Lambda ^2 \rightarrow \Lambda ^2\) denote the Hodge Laplacian. Then \(b^{+}(X^4) = \dim \ker \mathcal {H}_2^{+}\), where \(\mathcal {H}^{+}_2\) is the restriction of \(\mathcal {H}_2\) to \(\Lambda _{+}^2\), the bundle of self-dual two-forms. The space of self-dual harmonic two-forms is conformally invariant since the Hodge \(\star \) operator is. Therefore, in estimating \(b^+(X^4)\) we are free to choose a conformal metric. If we take the bundle E to be the trivial bundle in Proposition 3.7, then there is a conformal metric \(\hat{g}\in [g]\) and a \(t_0 \in (0,1]\) such that

$$\begin{aligned} R_{\hat{g}} = 2\sqrt{6} t_0 |W^{+}_{\hat{g}}|. \end{aligned}$$
(5.2)

From now on we assume \(g = \hat{g}\).

The operator \(\mathcal {H}^+_2\) satisfies the Weitzenböck formula

$$\begin{aligned} \mathcal {H}_2^{+} = \Delta + 2 W^{+} - \tfrac{1}{3}R, \end{aligned}$$

where \(\Delta \) is the rough Laplacian. Since \(W^{+}: \Lambda ^2_{+} \rightarrow \Lambda ^2_{+}\) is trace-free and \(\dim \Lambda ^2_{+} = 3\), we have the sharp inequality

$$\begin{aligned} | W^{+}(\omega ,\omega )| \le \tfrac{2}{\sqrt{6}}|W^{+}| |\omega |^2. \end{aligned}$$

Therefore,

$$\begin{aligned} \langle -\mathcal {H}_2^{+} \omega , \omega \rangle _{L^2}&= \int _{X} \left( |\nabla \omega |^2 - 2 W^{+}(\omega ,\omega ) + \tfrac{1}{3}R|\omega |^2 \right) \, {{\,\mathrm{dV}\,}}\\&\ge \int _{X} \left( |\nabla \omega |^2 - \tfrac{4}{\sqrt{6}} |W^{+}||\omega |^2 + \tfrac{1}{3}R|\omega |^2 \right) \, {{\,\mathrm{dV}\,}}\\&= \int _{X} \left( |\nabla \omega |^2 + \tfrac{1}{6} R |\omega |^2 + \left( \tfrac{1}{6}R - \tfrac{4}{\sqrt{6}} |W^{+}| \right) |\omega |^2 \right) \, {{\,\mathrm{dV}\,}}. \end{aligned}$$

Using (5.2),

$$\begin{aligned} \langle -\mathcal {H}_2^{+} \omega , \omega \rangle _{L^2}&\ge \int _{X} \left( |\nabla \omega |^2 + \tfrac{1}{6} R |\omega |^2 - \tfrac{\sqrt{6}}{3}(2 - t_0) |W^{+}| |\omega |^2 \right) \, {{\,\mathrm{dV}\,}}\\&\ge \big \langle \left( -\Delta + \tfrac{1}{6}R - V \right) \omega , \omega \big \rangle _{L^2}, \end{aligned}$$

where

$$\begin{aligned} V = \tfrac{\sqrt{6}}{3}(2 - t_0) |W^{+}|. \end{aligned}$$
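The coefficient in V can be verified directly from (5.2). Since \(R = 2\sqrt{6}\, t_0 |W^{+}|\) for the chosen metric and \(\tfrac{4}{\sqrt{6}} = \tfrac{2\sqrt{6}}{3}\),

$$\begin{aligned} \tfrac{1}{6} R - \tfrac{4}{\sqrt{6}} |W^{+}| = \tfrac{\sqrt{6}}{3} t_0 |W^{+}| - \tfrac{2\sqrt{6}}{3} |W^{+}| = - \tfrac{\sqrt{6}}{3} (2 - t_0) |W^{+}|. \end{aligned}$$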

Applying Theorem 2.1 to the operator \(-\Delta + \frac{1}{6}R - V\) with \(\mathcal {E} = \Lambda ^2_{+}\), we get

$$\begin{aligned} \begin{aligned} b^+(X^4)&\le \frac{72 e^2}{{{\,\mathrm{Y}\,}}(X^4,[g])^2} (2-t_0)^2 \Vert W^{+}\Vert _{L^2}^2 \\&= 3 e^2 (2 - t_0)^2 \rho _{+}(X^4,[g]), \end{aligned} \end{aligned}$$
(5.3)

where \(\rho _{+}\) is given by (1.2). By (3.13) of Proposition 3.7,

$$\begin{aligned} \rho _{+}^{-1/2} \le t_0 \le 1, \end{aligned}$$

hence

$$\begin{aligned} \left( 2 - t_0 \right) ^2 \rho _{+} \le ( 2 - \rho _{+}^{-1/2} )^2 \rho _{+} = ( 2 \rho _{+}^{1/2} - 1 )^2. \end{aligned}$$

Substituting this into (5.3) gives (1.4). \(\quad \square \)