1 Introduction and Main Result

The standard Courant minimax values \(\lambda _k(A)\) of a lower semibounded operator A on a Hilbert space \({{\mathcal {H}}}\) are given by

$$\begin{aligned} \lambda _k(A) = \inf _{\begin{array}{c} {{\mathfrak {M}}}\subset {{\,\mathrm{Dom}\,}}(A)\\ \dim {{\mathfrak {M}}}=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}\\ \Vert x\Vert =1 \end{array}} \langle x, Ax \rangle = \inf _{\begin{array}{c} {{\mathfrak {M}}}\subset {{\,\mathrm{Dom}\,}}(|A|^{1/2})\\ \dim {{\mathfrak {M}}}=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}\\ \Vert x\Vert =1 \end{array}} {{\mathfrak {a}}}[x,x] \end{aligned}$$

for \(k\in \mathbb {N}\) with \(k\le \dim {{\mathcal {H}}}\), see, e.g., [23,  Theorem 12.1] and also [31,  Section 12.1 and Exercise 12.4.2]. Here, \(\langle \cdot ,\cdot \rangle \) denotes the inner product of \({{\mathcal {H}}}\), and \({{\mathfrak {a}}}\) with \({{\mathfrak {a}}}[x,x] = \langle |A|^{1/2}x, {{\,\mathrm{sign}\,}}(A)|A|^{1/2}x \rangle \) for \(x\in {{\,\mathrm{Dom}\,}}(|A|^{1/2})\) is the form associated with A.

The above minimax values have proved to be a powerful description of the eigenvalues below the essential spectrum of A; in fact, they agree with these eigenvalues in nondecreasing order counting multiplicities as long as the latter exist and else equal the bottom of the essential spectrum. A standard application in this context is that the eigenvalues below the essential spectrum exhibit a monotonicity with respect to the operator: for two lower semibounded self-adjoint operators A and B with \(A\le B\) in the sense of quadratic forms one has \(\lambda _k(A) \le \lambda _k(B)\) for all k, see, e.g., [31,  Corollary 12.3].
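As a minimal finite-dimensional illustration of the minimax formula and of the monotonicity statement, consider the following sketch; the concrete matrices are chosen here for illustration only and are not taken from the cited references. On \(\mathbb {C}^2\), let \(A = \mathrm {diag}(a_1,a_2)\) with real numbers \(a_1 \le a_2\). For every unit vector \(x = (x_1,x_2)\) one has

$$\begin{aligned} \langle x, Ax \rangle = a_1|x_1|^2 + a_2|x_2|^2 \in [a_1,a_2]. \end{aligned}$$

For \(k=1\), the supremum over the unit vectors of a one-dimensional subspace spanned by x equals \(\langle x, Ax \rangle \ge a_1\), with equality for \(x = (1,0)\), so \(\lambda _1(A) = a_1\). For \(k=2\), the only admissible subspace is \(\mathbb {C}^2\) itself, and the supremum equals \(a_2\), so \(\lambda _2(A) = a_2\). If \(B = \mathrm {diag}(b_1,b_2)\) with \(b_1 \le b_2\) and \(a_j \le b_j\) for \(j \in \{1,2\}\), then \(A \le B\) and indeed \(\lambda _k(A) = a_k \le b_k = \lambda _k(B)\).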

Matters get, however, much more complicated when eigenvalues in a gap of the essential spectrum are considered. If \(A_+\) is the (lower semibounded) part of A associated with its spectrum in an interval of the form \((\gamma ,\infty )\), \(\gamma \in \mathbb {R}\), then the minimax values for \(A_+\) still describe the eigenvalues of \(A_+\) below its essential spectrum, and thus the eigenvalues of A in \((\gamma , \infty )\) below the essential spectrum of A above \(\gamma \). However, the subspaces over which the corresponding infimum is taken are chosen within the spectral subspace for A associated with the interval \((\gamma ,\infty )\) and therefore usually depend on the operator itself rather than just its domain. This makes it difficult to compare minimax values in spectral gaps of two different operators A and B, even if their domains agree.

An adapted minimax principle taking this problem into account was first proposed by Talman [34] and Datta and Devaiah [2] in the context of Dirac operators. A corresponding mathematically rigorous result was announced by Esteban and Séré in [9] and proved together with Dolbeault in [5]. To the best of the author's knowledge, the first abstract theorem in this direction is due to Griesemer and Siedentop [12], the hypotheses of which have an overlap with [5, 9] but do not seem suitable to handle Dirac operators efficiently. In an attempt to overcome this and as a step towards finding the optimal assumptions, Griesemer, Lewis, and Siedentop [11] provided an alternative set of hypotheses. In a parallel development, Dolbeault, Esteban, and Séré [6] obtained an abstract theorem in an operator setting with yet another set of hypotheses that has an overlap with those of [11] but allows one to deal with more potentials for Dirac operators. However, the abstract result of [11] does not seem to be contained in [6]. The result in [6] was later extended to a form setting by Morozov and Müller [25], and recently by Schimmer, Solovej, and Tokus [28] to a class of symmetric operators with a distinguished self-adjoint extension. There has also been some activity regarding variational principles for block operator matrices, see, e.g., [19, 21], and triple variational principles, see, e.g., [7, 22]. They, however, follow a different approach and are not pursued here.

The aim of the present work is to complement the above works in an abstract framework. To this end, the result by Griesemer, Lewis, and Siedentop in [11] is adapted to a perturbative setting. In the particular case of bounded additive perturbations, this has already been done by the present author in the appendix to [27] with hypotheses that can, under reasonable assumptions, be verified explicitly by means of the Davis–Kahan \(\sin 2\Theta \) theorem from [3] or variants thereof. The latter has been successfully applied in [27] to study lower bounds on the movement of eigenvalues in gaps of the essential spectrum and of edges of the essential spectrum. In the current work, the considerations from [27, Appendix A] are extended and supplemented to cover also certain unbounded perturbations, in particular ones that are off-diagonal with respect to the spectral gap under consideration. The results obtained here do not seem suitable to handle Dirac operators with Coulomb potentials since either the perturbation is assumed to be sufficiently small (Theorem 1.2) or of an off-diagonal structure (Theorem 1.4) or since they assume a semibounded setting (Theorems 1.3 and 1.5). However, weaker perturbations and other important situations such as perturbed periodic Schrödinger operators seem to be a natural context in which they can be applied. It should also be mentioned that some of the results and applications discussed here might at least in parts also be obtained with the approaches from earlier works such as [6, 22, 25, 28]. This is commented on at various spots below, see, e.g., Remarks 1.6, 2.2 (2), 2.11, and 2.13. The present work focuses on [11] as a starting point for mainly two reasons: Firstly, the proof of that result is remarkably elementary and short, while the proofs of [6, 22, 25, 28] are each a lot longer and much more technical, and, secondly, the techniques employed in Sects. 4 and 5 below to apply the approach from [11] promise to be of independent interest. In any case, to the best of the author’s knowledge, neither the main results presented here nor their applications have been stated explicitly anywhere before. Rare exceptions to the latter are commented on accordingly.

1.1 Main Results

In order to formulate our main results, it is convenient to fix the following notational setup tailored towards spectral gaps to the right of 0; other gaps can of course always be reduced to this situation by spectral shift, cf. Remark 3.2 below.

Hypothesis 1.1

Let A be a self-adjoint operator on a Hilbert space. Denote the spectral projections for A associated with the intervals \((0,\infty )\) and \((-\infty ,0]\) by \(P_+\) and \(P_-\), respectively, that is,

$$\begin{aligned} P_+ := \mathsf {E}_A\bigl ((0,\infty )\bigr ),\quad P_- := I-P_+, \end{aligned}$$

and let

$$\begin{aligned} {{\mathcal {D}}}_\pm := {{\,\mathrm{Ran}\,}}P_\pm \cap {{\,\mathrm{Dom}\,}}(A),\quad {{\mathfrak {D}}}_\pm := {{\,\mathrm{Ran}\,}}P_\pm \cap {{\,\mathrm{Dom}\,}}(|A|^{1/2}). \end{aligned}$$

Moreover, let B be another self-adjoint operator on the same Hilbert space with analogously defined spectral projections

$$\begin{aligned} Q_+ := \mathsf {E}_B\bigl ((0,\infty )\bigr ),\quad Q_- := I - Q_+, \end{aligned}$$

and denote by \({{\mathfrak {b}}}\) the form associated with B, that is,

$$\begin{aligned} {{\mathfrak {b}}}[x,y] = \langle |B|^{1/2}x, {{\,\mathrm{sign}\,}}(B)|B|^{1/2}y \rangle \end{aligned}$$

for \(x,y\in {{\,\mathrm{Dom}\,}}[{{\mathfrak {b}}}] = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\).

Here, \(\mathsf {E}_A\) and \(\mathsf {E}_B\) stand for the projection-valued spectral measures for the operators A and B, respectively, and \({{\,\mathrm{Ran}\,}}P_\pm \) denotes the range of \(P_\pm \). We have also used the notation I for the identity operator.

Denoting the form associated with A by \({{\mathfrak {a}}}\), the minimax values of the positive part \(A|_{{{\,\mathrm{Ran}\,}}P_+}\) of A can clearly be written as

$$\begin{aligned} \lambda _k(A|_{{{\,\mathrm{Ran}\,}}P_+}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathcal {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, Ax \rangle = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathfrak {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathfrak {D}}}_-\\ \Vert x\Vert =1 \end{array}} {{\mathfrak {a}}}[x,x] \end{aligned}$$

for \(k\in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}P_+\). The point of interest is now to find conditions on B under which the minimax values for the positive part \(B|_{{{\,\mathrm{Ran}\,}}Q_+}\) of B admit the same representations with \(\langle x, Ax \rangle \) and \({{\mathfrak {a}}}[x, x]\) replaced by \(\langle x, Bx\rangle \) and \({{\mathfrak {b}}}[x, x]\), respectively, but with the infima taken over the same respective families of subspaces as for A above. It is natural to consider this in a perturbative framework where B is obtained by an operator or form perturbation of A and, thus, one has \({{\,\mathrm{Dom}\,}}(A) = {{\,\mathrm{Dom}\,}}(B)\) and/or \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\).
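The step labelled "clearly" above can be spelled out as follows. This is a sketch of the standard argument under Hypothesis 1.1, and the auxiliary symbol s is introduced here only for this purpose. Let \({{\mathfrak {M}}}_+ \subset {{\mathcal {D}}}_+\) with \(\dim {{\mathfrak {M}}}_+ = k \ge 1\), and let s denote the supremum of \(\langle y, Ay \rangle \) over all unit vectors \(y \in {{\mathfrak {M}}}_+\); it is positive since \(\langle y, Ay \rangle > 0\) for all nonzero \(y \in {{\mathcal {D}}}_+\). For \(x = x_+ + x_-\) with \(x_+ \in {{\mathfrak {M}}}_+\), \(x_- \in {{\mathcal {D}}}_-\), and \(\Vert x\Vert = 1\), the mixed terms vanish because A commutes with \(P_\pm \), and \(\langle x_-, Ax_- \rangle \le 0\), so that

$$\begin{aligned} \langle x, Ax \rangle = \langle x_+, Ax_+ \rangle + \langle x_-, Ax_- \rangle \le \langle x_+, Ax_+ \rangle \le s\Vert x_+\Vert ^2 \le s. \end{aligned}$$

Since \({{\mathfrak {M}}}_+ \subset {{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\), the supremum over \({{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\) therefore equals s, and the first representation reduces to the standard minimax formula for \(A|_{{{\,\mathrm{Ran}\,}}P_+}\) with domain \({{\mathcal {D}}}_+\). The form version is obtained in the same way with \({{\mathfrak {D}}}_\pm \) and \({{\mathfrak {a}}}\) in place of \({{\mathcal {D}}}_\pm \) and \(\langle \cdot , A\,\cdot \rangle \).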

In the situation of Hypothesis 1.1, a representation for the minimax values of \(B|_{{{\,\mathrm{Ran}\,}}Q_+}\) of the above-mentioned form is guaranteed by [25, Theorem 1] in the form setting with \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\) if

$$\begin{aligned} \sup _{x_- \in {{\mathfrak {D}}}_-} {{\mathfrak {b}}}[ x_-, x_- ] \le 0 < \inf _{x_+ \in {{\mathfrak {D}}}_+{\setminus }\{0\}} \sup _{x_- \in {{\mathfrak {D}}}_-} \frac{{{\mathfrak {b}}}[ x_+ + x_-, x_+ + x_- ]}{\Vert x_+ + x_-\Vert ^2}, \end{aligned}$$
(1.1)

or by [6, Theorem 1.1] in the operator setting with \({{\,\mathrm{Dom}\,}}(A) = {{\,\mathrm{Dom}\,}}(B)\) if the analogous condition with \({{\mathfrak {D}}}_\pm \) replaced by \({{\mathcal {D}}}_\pm \) is satisfied; as pointed out in [28], for the latter additionally the restriction \(P_-B|_{{{\,\mathrm{Ran}\,}}P_-}\) should be essentially self-adjoint on \({{\mathcal {D}}}_-\), and it is very likely that a similar additional assumption is also necessary in the form setting of [25, Theorem 1], namely that the restriction of the form \({{\mathfrak {b}}}\) to \({{\mathfrak {D}}}_- \times {{\mathfrak {D}}}_-\) is closable. Obviously, (1.1) and its operator analogue do not need any knowledge of \(Q_\pm \). Moreover, in case of (1.1), the right-hand side of (1.1) then agrees with \(\lambda _1(B|_{{{\,\mathrm{Ran}\,}}Q_+})\), so that the strict inequality in (1.1) is also a necessary condition for such a representation to hold if B has a spectral gap to the right of zero. On the other hand, this strict inequality is not always very convenient to verify or it is sometimes not even entirely clear how to verify it, cf. Remarks 1.6 (2) and 2.2  (2) below. However, for an example where it can be verified analytically in terms of a Hardy-type inequality in the case of the Coulomb-Dirac operator, see [4]; cf. also [8] and [28, Section 3].

Instead of (1.1), [11] used the conditions

$$\begin{aligned} \sup _{x_- \in {{\mathfrak {D}}}_-} {{\mathfrak {b}}}[ x_-, x_- ] \le 0 \quad {\text { and }}\quad \Vert (|A|+I)^{1/2} P_+Q_- (|A|+I)^{-1/2} \Vert < 1, \end{aligned}$$
(1.2)

cf. Remark 3.3 below, which the authors were able to handle in case of Dirac operators but where especially the second condition seems to be hard to deal with in a general abstract setting. However, although (1.1) can treat more Coulomb-like potentials than (1.2) in case of the Dirac operator, (1.2) does not seem to imply (1.1) directly.

In the main results below the aim is to discuss situations where the second condition in (1.2) can be replaced by \(\Vert P_+Q_- \Vert < 1\), \(\Vert P_+ - Q_+ \Vert < 1\), or by a certain explicit structural assumption on how B is related to A. Here, especially the first two conditions seem to be natural since they relate the subspaces \({{\,\mathrm{Ran}\,}}P_+\) and \({{\,\mathrm{Ran}\,}}Q_+\). Four results in this direction are presented here, each addressing a different situation, which are not contained in the previously known results in the sense that their hypotheses do not seem to imply (1.1), its operator analogue, or (1.2) directly. We first treat the case of operator perturbations and start with the direct extension of [27, Theorem A.2] to infinitesimal perturbations. Recall that an operator V with \({{\,\mathrm{Dom}\,}}(V) \supset {{\,\mathrm{Dom}\,}}(A)\) is called A-bounded with A-bound \(b_* \ge 0\) if for all \(b > b_*\) there is some \(a \ge 0\) with

$$\begin{aligned} \Vert Vx \Vert \le a\Vert x\Vert + b\Vert Ax\Vert \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}(A) \end{aligned}$$

and if there is no such a for \(0< b < b_*\). If \(b_* = 0\), then V is called infinitesimal with respect to A.
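Two simple cases may help to fix ideas; they are standard observations, stated here as an illustration rather than quoted from the references. If V is a bounded operator, then

$$\begin{aligned} \Vert Vx \Vert \le \Vert V\Vert \Vert x\Vert + b\Vert Ax\Vert \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}(A) \ {\text { and every }}\ b > 0, \end{aligned}$$

so \(a = \Vert V\Vert \) works for every \(b > 0\), and V is infinitesimal with respect to A. If instead \(V = cA\) with \(c \in \mathbb {R}\) and A is unbounded, then the estimate holds with \(a = 0\) for every \(b \ge |c|\), whereas for \(0< b < |c|\) it would give

$$\begin{aligned} (|c| - b)\Vert Ax\Vert \le a\Vert x\Vert \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}(A) \end{aligned}$$

and thus force A to be bounded. Hence, the A-bound of \(V = cA\) is \(b_* = |c|\) in this case.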

Theorem 1.2

Assume Hypothesis 1.1. Suppose, in addition, that B is of the form \(B = A + V\), \({{\,\mathrm{Dom}\,}}(B) = {{\,\mathrm{Dom}\,}}(A)\), with some symmetric operator V that is infinitesimal with respect to A. Furthermore, suppose that we have \(\Vert P_+Q_-\Vert < 1\) and that

$$\begin{aligned} \langle x, Bx \rangle \le 0 \quad {\text { for all }}\ x \in {{\mathcal {D}}}_-. \end{aligned}$$

Then,

$$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathcal {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, Bx \rangle = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathfrak {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathfrak {D}}}_-\\ \Vert x\Vert =1 \end{array}} {{\mathfrak {b}}}[x,x] \end{aligned}$$

for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}P_+\).

It is worth noting that every operator of the form \(B = A + V\) as in Theorem 1.2 is automatically self-adjoint on \({{\,\mathrm{Dom}\,}}(B) = {{\,\mathrm{Dom}\,}}(A)\) by the well-known Kato-Rellich theorem. Two more remarks regarding Theorem 1.2 are in order: (1) also certain perturbations V that are not infinitesimal with respect to A can be considered here, but at the cost of a stronger assumption on \(\Vert P_+Q_-\Vert \), see Remark 4.2 below; (2) the condition \(\Vert P_+Q_-\Vert < 1\) is satisfied if the stronger inequality \(\Vert P_+-Q_+\Vert < 1\) holds. In the latter case, the subspaces \({{\,\mathrm{Ran}\,}}P_+\) and \({{\,\mathrm{Ran}\,}}Q_+\) automatically have the same dimension, that is, \(\dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\), see Remark 3.5 (a) below.
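The implication \(\Vert P_+-Q_+\Vert< 1 \Rightarrow \Vert P_+Q_-\Vert < 1\) can be checked in one line, using only that \(P_+\) is an orthogonal projection and that \(Q_- = I - Q_+\):

$$\begin{aligned} P_+Q_- = P_+(I - Q_+) = P_+(P_+ - Q_+), \quad {\text {so that}}\quad \Vert P_+Q_-\Vert \le \Vert P_+\Vert \Vert P_+ - Q_+\Vert \le \Vert P_+ - Q_+\Vert . \end{aligned}$$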

The stronger condition \(\Vert P_+-Q_+\Vert < 1\) just mentioned in fact also opens the way to employ a different approach than the one used to prove Theorem 1.2. This alternative approach has previously been used in the context of block diagonalization of operators and forms, see Sect. 5 below, and is particularly attractive if the unperturbed operator A is semibounded.

Theorem 1.3

Assume Hypothesis 1.1. Suppose, in addition, that A is semibounded and that \(\Vert P_+ - Q_+\Vert < 1\).

If \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\) and \({{\mathfrak {b}}}[ x, x ] \le 0\) for all \(x \in {{\mathfrak {D}}}_-\), then

$$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathfrak {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathfrak {D}}}_-\\ \Vert x\Vert =1 \end{array}} {{\mathfrak {b}}}[x,x] \end{aligned}$$
(1.3)

for all \(k \le \dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\). If even \({{\,\mathrm{Dom}\,}}(A) = {{\,\mathrm{Dom}\,}}(B)\) and \(\langle x, Bx \rangle \le 0\) for all \(x \in {{\mathcal {D}}}_-\), then also

$$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathcal {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, Bx \rangle \end{aligned}$$
(1.4)

for all \(k \le \dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\).

It should be emphasized that the conditions \({{\,\mathrm{Dom}\,}}(A) = {{\,\mathrm{Dom}\,}}(B)\) and \(\langle x, Bx \rangle \le 0\) for all \(x \in {{\mathcal {D}}}_-\) in Theorem 1.3 indeed imply that one has also \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\) and \({{\mathfrak {b}}}[ x, x ] \le 0\) for all \(x \in {{\mathfrak {D}}}_-\), see Lemma 3.4 below. Note also that in contrast to Theorem 1.2, Theorem 1.3 makes no assumptions on how the operator B is related to A. The latter will, however, be relevant when the hypotheses of Theorem 1.3 are to be verified in concrete situations.

The condition \(\langle x, Bx \rangle \le 0\) for all \(x \in {{\mathcal {D}}}_-\) plays an important role in both Theorems 1.2 and 1.3. In the case where \(B = A + V\) with some A-bounded symmetric operator V, this condition is automatically satisfied if \(\langle x, Vx \rangle \le 0\) for all \(x \in {{\mathcal {D}}}_-\) since \(\langle x, Ax \rangle \le 0\) holds for all \(x \in {{\mathcal {D}}}_-\) by definition. The latter is certainly the case for nonpositive V. Another instance of perturbations satisfying \(\langle x, Vx \rangle \le 0\) for all \(x \in {{\mathcal {D}}}_-\) are so-called off-diagonal perturbations with respect to the decomposition \({{\,\mathrm{Ran}\,}}P_+ \oplus {{\,\mathrm{Ran}\,}}P_-\), in which case also the condition \(\Vert P_+-Q_+\Vert < 1\) can be verified efficiently. In comparison with Theorem 1.2, we may even relax the assumption on the A-bound of V here.

Theorem 1.4

Assume Hypothesis 1.1. Suppose, in addition, that B has the form \(B = A + V\), \({{\,\mathrm{Dom}\,}}(B) = {{\,\mathrm{Dom}\,}}(A)\), with some symmetric A-bounded operator V with A-bound smaller than 1 and which is off-diagonal on \({{\,\mathrm{Dom}\,}}(A)\) with respect to the decomposition \({{\,\mathrm{Ran}\,}}P_+ \oplus {{\,\mathrm{Ran}\,}}P_-\), that is,

$$\begin{aligned} P_+VP_+ x = 0 = P_-VP_- x \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}(A). \end{aligned}$$

Then, one has \(\dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\) and

$$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathcal {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, Bx \rangle = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathfrak {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathfrak {D}}}_-\\ \Vert x\Vert =1 \end{array}} {{\mathfrak {b}}}[x,x] \end{aligned}$$

for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}Q_+\).

It is again worth noting that every operator of the form \(B = A + V\) as in Theorem 1.4 is automatically self-adjoint on \({{\,\mathrm{Dom}\,}}(B) = {{\,\mathrm{Dom}\,}}(A)\) by the Kato-Rellich theorem. Moreover, although off-diagonal perturbations may seem a bit restrictive, they appear quite naturally when a general, not necessarily off-diagonal, perturbation is decomposed into its diagonal and off-diagonal parts. How Theorem 1.4 may then be applied is demonstrated in Proposition 2.12 below and the considerations thereafter.
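A minimal finite-dimensional sketch shows the statement of Theorem 1.4 at work; the matrices below are an illustrative choice made here and do not appear in the cited works. On \(\mathbb {C}^2\) with standard basis \(e_1, e_2\), take

$$\begin{aligned} A = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix},\quad V = \begin{pmatrix} 0 & v\\ v & 0 \end{pmatrix},\quad B = A + V = \begin{pmatrix} 1 & v\\ v & -1 \end{pmatrix},\quad v \in \mathbb {R}. \end{aligned}$$

Here \(P_+\) is the orthogonal projection onto the span of \(e_1\), \({{\mathcal {D}}}_-\) is the span of \(e_2\), and V is bounded (hence has A-bound 0) and off-diagonal. The eigenvalues of B are \(\pm \sqrt{1+v^2}\), so \(\dim {{\,\mathrm{Ran}\,}}Q_+ = 1 = \dim {{\,\mathrm{Ran}\,}}P_+\) and \(\lambda _1(B|_{{{\,\mathrm{Ran}\,}}Q_+}) = \sqrt{1+v^2}\). On the other hand, the only admissible subspace for \(k = 1\) is \({{\mathfrak {M}}}_+ = {{\,\mathrm{Ran}\,}}P_+\), for which \({{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_- = \mathbb {C}^2\), and

$$\begin{aligned} \sup _{\Vert x\Vert = 1} \langle x, Bx \rangle = \max \sigma (B) = \sqrt{1+v^2}, \end{aligned}$$

where \(\sigma (B)\) denotes the spectrum of B, in agreement with the theorem. Note that \({{\,\mathrm{Ran}\,}}Q_+ \ne {{\,\mathrm{Ran}\,}}P_+\) for \(v \ne 0\), so the subspaces entering the infimum are indeed determined by A and not by B.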

The method of proof for Theorem 1.4 can to some extent be carried over to off-diagonal form perturbations, at least in the semibounded setting. The latter restriction is commented on in Sect. 5 below.

Theorem 1.5

Assume Hypothesis 1.1. Suppose, in addition, that B is semibounded and that its form \({{\mathfrak {b}}}\) is given by \({{\mathfrak {b}}}= {{\mathfrak {a}}}+ {{\mathfrak {v}}}\), \({{\,\mathrm{Dom}\,}}[{{\mathfrak {b}}}] = {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\), where \({{\mathfrak {a}}}\) is the form associated with A and \({{\mathfrak {v}}}\) is a symmetric sesquilinear form satisfying

$$\begin{aligned} {{\mathfrak {v}}}[ P_+x, P_+y ] = 0 = {{\mathfrak {v}}}[ P_-x, P_-y ] \quad {\text { for all }}\ x,y \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}] \subset {{\,\mathrm{Dom}\,}}[{{\mathfrak {v}}}] \end{aligned}$$

and

$$\begin{aligned} | {{\mathfrak {v}}}[ x, x ] | \le a\Vert x\Vert ^2 + b|{{\mathfrak {a}}}[ x, x ]| \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}] \end{aligned}$$
(1.5)

with some constants \(a,b \ge 0\).

Then, one has \(\dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\) and

$$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathfrak {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathfrak {D}}}_-\\ \Vert x\Vert =1 \end{array}} {{\mathfrak {b}}}[x,x] \end{aligned}$$

for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}Q_+\).

The semiboundedness of B in Theorem 1.5 forces A to be semibounded as well, see the proof of Theorem 1.5 below. In this regard, Theorem 1.5 can be interpreted as a particular case of the first part of Theorem 1.3 with \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\), in which the remaining hypotheses are automatically satisfied due to the structure of the perturbation.

Remark 1.6

  1. (1)

    If B in Theorem 1.5 is lower semibounded, then the operator \((|B| + I)^{1/2}Q_-\) is everywhere defined and bounded, and so is the operator \((|B| + I)^{1/2}Q_-P_+\). Taking into account that \({{\mathfrak {b}}}[ x, x ] = {{\mathfrak {a}}}[ x, x ] \le 0\) for \(x \in {{\mathfrak {D}}}_-\) and \({{\mathfrak {b}}}[ x, x ] = {{\mathfrak {a}}}[ x, x ] > 0\) for \(x \in {{\mathfrak {D}}}_+\), Theorem 1.5 therefore reproduces in this situation a particular case of the earlier result [12, Theorem 3].

  2. (2)

    If in Theorems 1.4 or 1.5 the unperturbed operator A has a spectral gap to the right of 0, then considerations as in part (1) show that (1.1) or the corresponding analogue in the operator framework is satisfied; cf. also Corollaries 2.5 and 2.7  (a) below. In this regard, Theorems 1.4 and 1.5 then can also be deduced from [6] and [25], respectively. However, if A does not have such a gap, it is a priori not clear how to derive the two theorems from [6, 25].

The rest of this note is organized as follows. In Sect. 2 we discuss applications of the main theorems and revisit the Stokes operator as an example in the framework of Theorem 1.5. It is also explained there how the framework for off-diagonal perturbations in Theorem 1.4 can be applied to general, not necessarily off-diagonal, perturbations by decomposing the perturbation into its diagonal and off-diagonal parts. Section 3 is devoted to an abstract minimax principle based on [11]. Two approaches are then used to verify the hypotheses of this abstract minimax principle, the graph norm approach and the block diagonalization approach, respectively, which are discussed separately in Sects. 4 and 5 below. Theorem 1.2 is proved in Sect. 4, which is based on the author’s appendix to [27] and extends the corresponding considerations to certain unbounded perturbations. Theorems 1.3–1.5 are proved in Sect. 5, which builds upon recent developments on block diagonalization of operators and forms from [24] and [14], respectively. Finally, Appendix A reproduces the proof from [11] for the abstract minimax principle discussed in Sect. 3, and Appendix B provides some consequences of the well-known Heinz inequality that are used at various spots in this work and are probably folklore.

2 Applications and Examples

In this section, we use the main results from Sect. 1 to prove monotonicity and continuity properties of minimax values in gaps of the essential spectrum in various situations and also revisit the well-known Stokes operator in the framework of Theorem 1.5 as an example. We finally discuss how to apply the off-diagonal framework from Theorem 1.4 to general, not necessarily off-diagonal, perturbations.

We first consider the situation of indefinite or semidefinite bounded perturbations, which has partially been discussed in a slightly different form in [27]. For a bounded self-adjoint operator V we define bounded nonnegative operators \(V^{(p)}\) and \(V^{(n)}\) with \(V = V^{(p)} - V^{(n)}\) via functional calculus by

$$\begin{aligned} V^{(p)} := (1 + {{\,\mathrm{sign}\,}}(V))V / 2,\quad V^{(n)} := ({{\,\mathrm{sign}\,}}(V) - 1)V / 2 . \end{aligned}$$
(2.1)

We clearly have \(\Vert V^{(p)} \Vert \le \Vert V\Vert \) and \(\Vert V^{(n)} \Vert \le \Vert V\Vert \).
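For a quick check of the definition (2.1), consider the following example, which is an illustrative choice made here. For \(V = \mathrm {diag}(2,-3)\) on \(\mathbb {C}^2\) one has \({{\,\mathrm{sign}\,}}(V) = \mathrm {diag}(1,-1)\) and therefore

$$\begin{aligned} V^{(p)} = \mathrm {diag}(2,0)\,\mathrm {diag}(2,-3)/2 = \mathrm {diag}(2,0),\quad V^{(n)} = \mathrm {diag}(0,-2)\,\mathrm {diag}(2,-3)/2 = \mathrm {diag}(0,3). \end{aligned}$$

Both operators are nonnegative, \(V^{(p)} - V^{(n)} = V\), and \(\Vert V^{(p)}\Vert = 2\), \(\Vert V^{(n)}\Vert = 3 = \Vert V\Vert \). In this example the sum \(\Vert V^{(p)}\Vert + \Vert V^{(n)}\Vert = 5\) is larger than \(\Vert V\Vert \), which is typical for indefinite V.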

The following result can be proved in several ways. The proof below is based on Theorem 1.2 and is in its core close to the proofs of Theorems 3.14 and 3.15 in [27]. An alternative proof for part (b) based on Theorem 1.4 is discussed after Remark 2.13 below. The result itself extends Theorem 5 in [12], which was formulated there for bounded nonpositive V that are relatively compact with respect to the unperturbed operator A.

Proposition 2.1

Let the finite interval \((c,d)\) belong to the resolvent set of the self-adjoint operator A, and let V be a bounded self-adjoint operator on the same Hilbert space satisfying \(\Vert V^{(p)}\Vert + \Vert V^{(n)}\Vert < d - c\) with \(V^{(p)}\) and \(V^{(n)}\) as in (2.1). Set \(P := \mathsf {E}_A([d,\infty ))\) and \(Q := \mathsf {E}_{A+V}([d-\Vert V^{(n)}\Vert ,\infty ))\). Then:

  1. (a)

    The interval \((c+\Vert V^{(p)}\Vert , d-\Vert V^{(n)}\Vert )\) belongs to the resolvent set of the operator \(A + V\), and we have \(\Vert P - Q \Vert < 1\).

  2. (b)

    With \({{\mathcal {D}}}_+ := {{\,\mathrm{Ran}\,}}P \cap {{\,\mathrm{Dom}\,}}(A)\) and \({{\mathcal {D}}}_- := {{\,\mathrm{Ran}\,}}(I-P) \cap {{\,\mathrm{Dom}\,}}(A)\) we have

    $$\begin{aligned} \lambda _k\bigl ((A+V)|_{{{\,\mathrm{Ran}\,}}Q} \bigr ) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+ \subset {{\mathcal {D}}}_+\\ \dim {{\mathfrak {M}}}_+ = k \end{array}} \sup _{\begin{array}{c} x \in {{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, (A+V)x \rangle \end{aligned}$$

    for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}Q\).

  3. (c)

    With \({{\mathcal {E}}}_+ := {{\,\mathrm{Ran}\,}}Q \cap {{\,\mathrm{Dom}\,}}(A)\) and \({{\mathcal {E}}}_- := {{\,\mathrm{Ran}\,}}(I-Q) \cap {{\,\mathrm{Dom}\,}}(A)\) we have

    $$\begin{aligned} \lambda _k\bigl (A|_{{{\,\mathrm{Ran}\,}}P} \bigr ) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+ \subset {{\mathcal {E}}}_+\\ \dim {{\mathfrak {M}}}_+ = k \end{array}} \sup _{\begin{array}{c} x \in {{\mathfrak {M}}}_+ \oplus {{\mathcal {E}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, Ax \rangle \end{aligned}$$

    for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}P\).

Proof

  1. (a).

    This is Proposition 2.1 and Theorem 1.1 in [33], respectively; cf. also [36, Theorem 3.2]. More precisely, the variant of the Davis–Kahan \(\sin 2\Theta \) theorem in [33, Theorem 1.1] gives

    $$\begin{aligned} \Vert P - Q \Vert \le \sin \Bigl ( \frac{1}{2} \arcsin \frac{\Vert V^{(p)}\Vert + \Vert V^{(n)}\Vert }{d-c} \Bigr )< \frac{\sqrt{2}}{2} < 1. \end{aligned}$$

    In particular, the subspaces \({{\,\mathrm{Ran}\,}}P\) and \({{\,\mathrm{Ran}\,}}Q\) have the same dimension; cf. also Remark 3.5 (1) below.

  2. (b).

    Pick \(\gamma \in (c + \Vert V^{(p)}\Vert , d - \Vert V^{(n)}\Vert )\). By part (a) we then have \(\mathsf {E}_{A-\gamma }((0,\infty )) = P\) and \(\mathsf {E}_{A+V-\gamma }((0,\infty )) = Q\). Moreover, for \(x \in {{\mathcal {D}}}_-\) we have

    $$\begin{aligned} \begin{aligned} \langle x, (A+V-\gamma )x \rangle&= \langle x, (A-\gamma )x \rangle + \langle x, V^{(p)}x \rangle - \langle x, V^{(n)}x \rangle \\&\le (c - \gamma + \Vert V^{(p)}\Vert ) \Vert x\Vert ^2 < 0. \end{aligned} \end{aligned}$$
    (2.2)

    In light of part (a), the claim now follows from Theorem 1.2 with \(P_+ = P\) and \(Q_+ = Q\) upon a spectral shift by \(\gamma \).

  3. (c).

    Similarly as in (b), pick \(\gamma \in (c + \Vert V^{(p)}\Vert + \Vert V^{(n)}\Vert , d)\). We then have \(\mathsf {E}_{A-\gamma }((0,\infty )) = P\) and for some chosen \(\rho \in (c + \Vert V^{(p)}\Vert , d - \Vert V^{(n)}\Vert )\) also \(\mathsf {E}_{A+V-\rho }((0,\infty )) = Q\). Moreover, for \(x \in {{\mathcal {E}}}_-\) we have

    $$\begin{aligned} \begin{aligned} \langle x, (A-\gamma )x \rangle&= \langle x, (A+V-\gamma )x \rangle - \langle x, V^{(p)}x \rangle + \langle x, V^{(n)}x \rangle \\&\le (c + \Vert V^{(p)}\Vert - \gamma + \Vert V^{(n)}\Vert ) \Vert x\Vert ^2 < 0. \end{aligned} \end{aligned}$$
    (2.3)

    In light of part (a), the claim now follows analogously from Theorem 1.2 with switched roles of A and \(A+V\) and with \(P_+ = Q\) and \(Q_+ = P\). \(\square \)

Remark 2.2

  1. (1)

    A corresponding representation of the minimax values in terms of the forms associated with \(A+V\) and A, respectively, as in Theorems 1.2–1.4 holds here as well. However, for the sake of simplicity and since this is not needed in Corollaries 2.3 and 2.4 below, this has not been formulated in Proposition 2.1.

  2. (2)

    Part (b) of Proposition 2.1 can also be deduced from [6]. Indeed, for \(y \in {{\mathcal {D}}}_+\) we analogously have

    $$\begin{aligned} \langle y, (A+V-\gamma )y \rangle \ge (d - \gamma - \Vert V^{(n)}\Vert ) \Vert y\Vert ^2 > 0, \end{aligned}$$

    which together with (2.2) implies that the operator analogue to condition (1.1) is satisfied. However, the situation is less clear for part (c) of Proposition 2.1. Here, for \(y \in {{\mathcal {E}}}_+\) we get

    $$\begin{aligned} \langle y, (A-\gamma )y \rangle \ge (d - \Vert V^{(n)}\Vert - \gamma - \Vert V^{(p)}\Vert ) \Vert y\Vert ^2, \end{aligned}$$

    which is positive only if \(\gamma < d - \Vert V^{(p)}\Vert - \Vert V^{(n)}\Vert \). Together with the condition \(\gamma > c + \Vert V^{(p)}\Vert + \Vert V^{(n)}\Vert \) in (2.3), this requires the stronger assumption \(\Vert V^{(p)}\Vert + \Vert V^{(n)}\Vert < (d-c)/2\). This can be remedied to some extent with a continuity argument, but in the end the approach presented in the proof of Proposition 2.1 above is more convenient.

The above proposition includes the particular cases where V satisfies the condition \(\Vert V\Vert < (d - c)/2\) and where V is semidefinite with \(\Vert V\Vert < d - c\), which in the context of part (c) have essentially been discussed in the proofs of Theorems 3.14 and 3.15 in [27]. However, Proposition 2.1 allows also certain indefinite perturbations V with \((d - c)/2 \le \Vert V\Vert < d - c\) that were not covered before and may thus be used to refine the results in [27].
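To get a feel for the size condition, consider the following hypothetical numbers, which are chosen here purely for illustration: \(d - c = 4\), \(\Vert V^{(p)}\Vert = 3\), and \(\Vert V^{(n)}\Vert = 1/2\). Since \(V^{(p)}\) and \(V^{(n)}\) live on mutually orthogonal spectral subspaces of V, one has \(\Vert V\Vert = \max \{\Vert V^{(p)}\Vert , \Vert V^{(n)}\Vert \} = 3 \ge (d-c)/2\), and V is indefinite. Nevertheless, \(\Vert V^{(p)}\Vert + \Vert V^{(n)}\Vert = 7/2 < d - c\), so Proposition 2.1 applies, and the interval \((c + 3, c + 7/2)\) belongs to the resolvent set of \(A + V\). Using the identity \(\sin (\frac{1}{2}\arcsin s) = \sqrt{(1 - \sqrt{1-s^2})/2}\) for \(s \in [0,1]\), the bound from the proof of part (a) becomes

$$\begin{aligned} \Vert P - Q \Vert \le \sin \Bigl ( \frac{1}{2} \arcsin \frac{7}{8} \Bigr ) = \sqrt{\frac{1}{2}\Bigl ( 1 - \frac{\sqrt{15}}{8} \Bigr )} \approx 0.508 < \frac{\sqrt{2}}{2}. \end{aligned}$$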

As immediate corollaries to Proposition 2.1, we obtain the following monotonicity and continuity statements for the minimax values in gaps of the essential spectrum, which, in essence, reproduce particular cases of results in [36].

Corollary 2.3

Let A be as in Proposition 2.1, and let \(V_0\) and \(V_1\) be bounded self-adjoint operators on the same Hilbert space satisfying the condition \(\max \{ \Vert V_0^{(p)}\Vert + \Vert V_0^{(n)}\Vert , \Vert V_1^{(p)}\Vert + \Vert V_1^{(n)}\Vert \} < d - c\).

If, in addition, \(V_0 \le V_1\), then

$$\begin{aligned} \lambda _k\bigl ( (A+V_0)|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{A+V_0}([d-\Vert V_0^{(n)}\Vert ,\infty ))} \bigr ) \le \lambda _k\bigl ( (A+V_1)|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{A+V_1}([d-\Vert V_1^{(n)}\Vert ,\infty ))} \bigr ) \end{aligned}$$

for \(k \le \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_A([d,\infty )) = \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_{A+V_j}([d-\Vert V_j^{(n)}\Vert ,\infty ))\), \(j\in \{0,1\}\).

Corollary 2.4

Let A and V be as in Proposition 2.1. Then, the open interval \((c+\Vert V^{(p)}\Vert , d-\Vert V^{(n)}\Vert )\) belongs to the resolvent set of every \(A+tV\), \(t \in [0,1]\), and for each \(k \le \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_A([d,\infty )) = \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_{A+tV}([d-t\Vert V^{(n)}\Vert ,\infty ))\), \(t \in [0,1]\), the mapping

$$\begin{aligned}{}[0,1] \ni t \mapsto \lambda _k((A+tV)|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{A+tV}([d-t\Vert V^{(n)}\Vert ,\infty ))}) \end{aligned}$$

is Lipschitz continuous with Lipschitz constant \(\Vert V\Vert \).

Proof

Taking into account that

$$\begin{aligned} \langle x, (A+sV)x \rangle - |t-s|\Vert V\Vert \Vert x\Vert ^2 \le \langle x, (A+tV)x \rangle \le \langle x, (A+sV)x \rangle + |t-s|\Vert V\Vert \Vert x\Vert ^2 \end{aligned}$$

for all \(x \in {{\,\mathrm{Dom}\,}}(A)\), the claim follows immediately from Proposition 2.1. \(\square \)
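The Lipschitz estimate of Corollary 2.4 can be sanity-checked in the simplest finite-dimensional situation. The following is a minimal sketch, assuming a hypothetical toy model in which A = diag(d, c) acts on a two-dimensional space and V is a real symmetric 2×2 matrix; all numerical values and the helper names `eig2` and `upper_eigenvalue` are illustrative assumptions and do not come from the text.

```python
import math

def eig2(p, q, r):
    """Eigenvalues (ascending) of the real symmetric matrix [[p, q], [q, r]]."""
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    return mean - radius, mean + radius

# Toy data (illustrative assumptions): A = diag(d, c) has the spectral gap (c, d).
c, d = -1.0, 1.0
v11, v12, v22 = 0.3, 0.5, -0.4            # entries of the symmetric perturbation V

v_lo, v_hi = eig2(v11, v12, v22)
norm_p = max(v_hi, 0.0)                    # norm of the positive part V^(p)
norm_n = max(-v_lo, 0.0)                   # norm of the negative part V^(n)
norm_v = max(norm_p, norm_n)               # operator norm of V

def upper_eigenvalue(t):
    """Larger eigenvalue of A + tV, i.e. the one above the perturbed gap."""
    return eig2(d + t * v11, t * v12, c + t * v22)[1]

ts = [i / 100.0 for i in range(101)]       # grid on [0, 1]
values = [upper_eigenvalue(t) for t in ts]

# Largest difference quotient on the grid; it should not exceed ||V||.
max_slope = max(
    abs(values[i + 1] - values[i]) / (ts[i + 1] - ts[i]) for i in range(100)
)

# The interval (c + t||V^(p)||, d - t||V^(n)||) should contain no eigenvalue.
tol = 1e-12
gap_is_free = all(
    eig2(d + t * v11, t * v12, c + t * v22)[0] <= c + t * norm_p + tol
    and upper_eigenvalue(t) >= d - t * norm_n - tol
    for t in ts
)
```

In this toy setting the smallness condition of Proposition 2.1 holds, and both `gap_is_free` and the slope bound are expected to be satisfied; this illustrates the statement but is of course no substitute for the proof.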

It should again be mentioned that the above statements include the particular cases where the norm of the perturbations is less than \((d - c)/2\) or where the perturbations are semidefinite with a norm less than \(d - c\). These cases have essentially been discussed in [27]. There, especially lower bounds on the movement of eigenvalues in gaps of the essential spectrum under certain conditions and the behaviour of edges of the essential spectrum have been studied. However, since this is not the main focus of the present work, this is not pursued further here.

As a consequence of Theorem 1.4, we obtain the following lower bound for the minimax values in the setting of off-diagonal operator perturbations.

Corollary 2.5

In the situation of Theorem 1.4, we have

$$\begin{aligned} \lambda _k(A|_{{{\,\mathrm{Ran}\,}}P_+}) \le \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) \end{aligned}$$

for all \(k \le \dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\).

Proof

Let \({{\mathfrak {M}}}_+ \subset {{\mathcal {D}}}_+\) with \(\dim {{\mathfrak {M}}}_+ = k\). Since \(\langle x, Vx \rangle = 0\) for all \(x \in {{\mathcal {D}}}_+\) by hypothesis, we have

$$\begin{aligned} \sup _{\begin{array}{c} x \in {{\mathfrak {M}}}_+\\ \Vert x\Vert =1 \end{array}} \langle x, Ax \rangle = \sup _{\begin{array}{c} x \in {{\mathfrak {M}}}_+\\ \Vert x\Vert =1 \end{array}} \langle x, (A+V)x \rangle \le \sup _{\begin{array}{c} x \in {{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, (A+V)x \rangle . \end{aligned}$$

Taking the infimum over all such subspaces \({{\mathfrak {M}}}_+\) proves the claim by Theorem 1.4 and the standard minimax values for \(A|_{{{\,\mathrm{Ran}\,}}P_+}\). \(\square \)

As in Corollary 2.4, we also obtain a continuity statement in the situation of Theorem 1.4 with bounded off-diagonal perturbations. Here, however, we do not have to impose any condition on the norm of the perturbation.

Corollary 2.6

Let A and V be as in Theorem 1.4, and suppose that V is bounded. Then, for each \(k \le \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_A((0,\infty )) = \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_{A+tV}((0,\infty ))\), \(t \in \mathbb {R}\), the mapping

$$\begin{aligned} \mathbb {R}\ni t \mapsto \lambda _k\bigl ( (A+tV)|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{A+tV}((0,\infty ))} \bigr ) \end{aligned}$$

is Lipschitz continuous with Lipschitz constant \(\Vert V\Vert \).
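Corollaries 2.5 and 2.6 can be illustrated by a two-dimensional caricature of the off-diagonal framework. The sketch below assumes a hypothetical diagonal operator A = diag(a_plus, a_minus) with a_plus > 0 ≥ a_minus and an off-diagonal perturbation V with entry w, so that ‖V‖ = |w|; the numbers and the helper `eigs_offdiag` are illustrative assumptions. It checks on a grid that the positive eigenvalue of A + tV stays above a_plus and that its difference quotients do not exceed |w|, even for large |t|.

```python
import math

def eigs_offdiag(a_plus, a_minus, w, t):
    """Eigenvalues (ascending) of [[a_plus, t*w], [t*w, a_minus]]."""
    mean = (a_plus + a_minus) / 2.0
    radius = math.hypot((a_plus - a_minus) / 2.0, t * w)
    return mean - radius, mean + radius

# Hypothetical toy values; note that ||tV|| is not small on this grid.
a_plus, a_minus, w = 2.0, -1.0, 5.0

ts = [-10.0 + 0.05 * i for i in range(401)]          # grid on [-10, 10]
negative = [eigs_offdiag(a_plus, a_minus, w, t)[0] for t in ts]
positive = [eigs_offdiag(a_plus, a_minus, w, t)[1] for t in ts]

tol = 1e-12
# Lower bound of Corollary 2.5 in the toy model.
lower_bound_holds = all(lam >= a_plus - tol for lam in positive)
# Exactly one eigenvalue is positive, matching dim Ran P_+ = dim Ran Q_+ = 1.
one_positive_eigenvalue = all(lam <= 0.0 for lam in negative)
# Largest difference quotient of t -> positive eigenvalue on the grid.
max_slope = max(
    abs(positive[i + 1] - positive[i]) / (ts[i + 1] - ts[i]) for i in range(400)
)
```

The point of the example is that no norm condition on the perturbation enters: the slope bound by |w| persists on the whole grid.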

In the particular case where B is semibounded, Theorem 1.5 allows us to extend Corollaries 2.5 and 2.6 to some degree to off-diagonal form perturbations. Recall here that semiboundedness of B implies that A is semibounded as well, see the proof of Theorem 1.5 below.

Corollary 2.7

Assume the hypotheses of Theorem 1.5.

  1. (a)

    For each \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\) one has \(\lambda _k(A|_{{{\,\mathrm{Ran}\,}}P_+}) \le \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+})\).

  2. (b)

    Denote for \(t \in (-1/b, 1/b)\) by \(B_t\) the self-adjoint operator associated with the form \({{\mathfrak {b}}}_t := {{\mathfrak {a}}}+ t{{\mathfrak {v}}}\) with form domain \({{\,\mathrm{Dom}\,}}[{{\mathfrak {b}}}_t] := {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\). Then, for each \(k \le \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_A((0,\infty )) = \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_{B_t}((0,\infty ))\), the mapping

    $$\begin{aligned} ( -1/b, 1/b ) \ni t \mapsto \lambda _k(B_t|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_t}((0,\infty ))}) \end{aligned}$$

    is locally Lipschitz continuous.

Proof

  1. (a).

    Taking into account that \({{\mathfrak {v}}}[ x, x ] = 0\) for all \(x \in {{\mathfrak {D}}}_+\) by hypothesis, the inequality \(\lambda _k(A|_{{{\,\mathrm{Ran}\,}}P_+}) \le \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+})\) is proved by means of Theorem 1.5 in a way analogous to Corollary 2.5.

  2. (b).

    Recall that each \(B_t\) is indeed a semibounded self-adjoint operator with \({{\,\mathrm{Dom}\,}}[{{\mathfrak {b}}}_t] = {{\,\mathrm{Dom}\,}}(|B_t|^{1/2})\) by the well-known KLMN theorem, and note that each \(t{{\mathfrak {v}}}\) satisfies the hypotheses of Theorem 1.5. Pick \(t,s \in (-1/b, 1/b)\) with \(b|t-s| \le 1-b|s|\). Consider first the case where A (and hence \({{\mathfrak {a}}}\)) is lower semibounded with lower bound \(m \in \mathbb {R}\). We then have \(|{{\mathfrak {a}}}[x,x]| \le {{\mathfrak {a}}}[x,x] + (|m|-m)\Vert x\Vert ^2\) for all \(x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\). With \({\tilde{a}} := a + b|m| - bm\), this gives

    $$\begin{aligned} |{{\mathfrak {v}}}[x,x]| \le {\tilde{a}}\Vert x\Vert ^2 + b{{\mathfrak {a}}}[x,x] \le {\tilde{a}}\Vert x\Vert ^2 + b{{\mathfrak {b}}}_s[x,x] + b|s||{{\mathfrak {v}}}[x,x]| \end{aligned}$$

    and, hence,

    $$\begin{aligned} |{{\mathfrak {v}}}[x,x]| \le \frac{{\tilde{a}}}{1-b|s|}\Vert x\Vert ^2 + \frac{b}{1-b|s|}{{\mathfrak {b}}}_s[x,x] \end{aligned}$$

    for all \(x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}] = {{\,\mathrm{Dom}\,}}[{{\mathfrak {b}}}_s]\). Since \({{\mathfrak {b}}}_t = {{\mathfrak {b}}}_s + (t-s){{\mathfrak {v}}}\), we thus obtain

    $$\begin{aligned} -\frac{{\tilde{a}}|t-s|}{1-b|s|} + \Bigl ( 1 - \frac{b|t-s|}{1-b|s|} \Bigr ) {{\mathfrak {b}}}_s \le {{\mathfrak {b}}}_t \le \frac{{\tilde{a}}|t-s|}{1-b|s|} + \Bigl ( 1 + \frac{b|t-s|}{1-b|s|} \Bigr ) {{\mathfrak {b}}}_s. \end{aligned}$$

    Abbreviating \(\lambda _k(t):=\lambda _k(B_t|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_t}((0,\infty ))})\), Theorem 1.5 then implies that

    $$\begin{aligned} -\frac{{\tilde{a}}|t-s|}{1-b|s|} + \Bigl ( 1 - \frac{b|t-s|}{1-b|s|} \Bigr ) \lambda _k(s) \le \lambda _k(t) \le \frac{{\tilde{a}}|t-s|}{1-b|s|} + \Bigl ( 1 + \frac{b|t-s|}{1-b|s|} \Bigr ) \lambda _k(s) \end{aligned}$$

    and, therefore,

    $$\begin{aligned} |\lambda _k(t) - \lambda _k(s)| \le \frac{{\tilde{a}}|t-s|}{1-b|s|} + \frac{b|t-s|}{1-b|s|} |\lambda _k(s)|. \end{aligned}$$
    (2.4)

    This proves that \(t \mapsto \lambda _k(t)\) is continuous on \((-1/b,1/b)\) and, in particular, bounded on every compact subinterval of \((-1/b, 1/b)\). In turn, it then easily follows from (2.4) that this mapping is even locally Lipschitz continuous, which concludes the case where A is lower semibounded. If A is upper semibounded with upper bound \(m \in \mathbb {R}\), we proceed similarly. We then have \(|{{\mathfrak {a}}}[x,x]| \le -{{\mathfrak {a}}}[x,x] + (m+|m|)\Vert x\Vert ^2\) for all \(x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\). With \({\tilde{a}} := a + bm + b|m|\), this leads to

    $$\begin{aligned} |{{\mathfrak {v}}}[x,x]| \le \frac{{\tilde{a}}}{1-b|s|}\Vert x\Vert ^2 - \frac{b}{1-b|s|}{{\mathfrak {b}}}_s[x,x] \end{aligned}$$

    for all \(x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}] = {{\,\mathrm{Dom}\,}}[{{\mathfrak {b}}}_s]\). Analogously to the above, we then eventually obtain (2.4) again, which proves the claim in the case where A is upper semibounded. This completes the proof. \(\square \)

Remark 2.8

In part (a) of Corollary 2.7, one can also give an upper bound for \(\lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+})\) in terms of the form bounds of \({{\mathfrak {v}}}\): If A is lower semibounded with lower bound \(m \in \mathbb {R}\), then we have as in the proof of part (b) of Corollary 2.7 that

$$\begin{aligned} | {{\mathfrak {v}}}[ x, x ] | \le (a + b|m| - bm)\Vert x\Vert ^2 + b{{\mathfrak {a}}}[ x, x ] \end{aligned}$$

for all \(x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\), leading to

$$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) \le (1+b)\lambda _k(A|_{{{\,\mathrm{Ran}\,}}P_+}) + (a + b|m| - bm) \end{aligned}$$

for all \(k \le \dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\). Similarly, if A is upper semibounded with upper bound \(m \in \mathbb {R}\), we have

$$\begin{aligned} | {{\mathfrak {v}}}[ x, x ] | \le (a + b|m| + bm)\Vert x\Vert ^2 - b{{\mathfrak {a}}}[ x, x ] \end{aligned}$$

for all \(x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\). If, in addition, \(b \le 1\), this then leads to

$$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) \le (1-b)\lambda _k(A|_{{{\,\mathrm{Ran}\,}}P_+}) + (a + b|m| + bm) \end{aligned}$$

for all \(k \le \dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q_+\).

2.1 An Example: The Stokes Operator

We now briefly revisit the Stokes operator in the framework of Theorem 1.5. Here, we mainly rely on [15], but the reader is referred also to [14, Section 7], [29, Chapter 5], [10], and the references cited therein.

Let \(\Omega \subset \mathbb {R}^n\), \(n \ge 2\), be a bounded domain with \(C^2\)-boundary, and let \(\nu > 0\) and \(v_* \ge 0\). On the Hilbert space \({{\mathcal {H}}}= {{\mathcal {H}}}_+ \oplus {{\mathcal {H}}}_-\) with \({{\mathcal {H}}}_+ = L^2(\Omega )^n\) and \({{\mathcal {H}}}_- = L^2(\Omega )\), we consider the closed, densely defined, and nonnegative form \({{\mathfrak {a}}}\) with \({{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}] := H_0^1(\Omega )^n \oplus L^2(\Omega )\) and

$$\begin{aligned} {{\mathfrak {a}}}[ v \oplus q, u \oplus p ] := \nu \sum _{j=1}^n \int _\Omega \langle \partial _j v(x), \partial _j u(x) \rangle _{\mathbb {C}^n} \,\mathrm {d}x \end{aligned}$$

for \(u \oplus p, v \oplus q \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\). Clearly, \({{\mathfrak {a}}}\) is the form associated to the nonnegative self-adjoint operator \(A := -\nu {\varvec{{\Delta }}} \oplus 0\) on the Hilbert space \({{\mathcal {H}}}= {{\mathcal {H}}}_+ \oplus {{\mathcal {H}}}_-\) with \({{\,\mathrm{Dom}\,}}(A) := (H^2(\Omega ) \cap H_0^1(\Omega ))^n \oplus L^2(\Omega )\) and \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\), where \({\varvec{\Delta }} = \Delta \cdot I_{\mathbb {C}^n}\) is the vector-valued Dirichlet Laplacian on \(\Omega \). Moreover, \(P_+ := \mathsf {E}_A((0,\infty ))\) and \(P_- := \mathsf {E}_A((-\infty ,0]) = \mathsf {E}_A(\{0\})\) are the orthogonal projections onto \({{\mathcal {H}}}_+\) and \({{\mathcal {H}}}_-\), respectively. In particular, we have

$$\begin{aligned} {{\mathfrak {D}}}_+ := {{\,\mathrm{Ran}\,}}P_+ \cap {{\,\mathrm{Dom}\,}}(|A|^{1/2}) = H_0^1(\Omega )^n \oplus 0 \end{aligned}$$

and

$$\begin{aligned} {{\mathfrak {D}}}_- := {{\,\mathrm{Ran}\,}}P_- \cap {{\,\mathrm{Dom}\,}}(|A|^{1/2}) = 0 \oplus L^2(\Omega ). \end{aligned}$$

Define the symmetric sesquilinear form \({{\mathfrak {v}}}\) on \({{\mathcal {H}}}= {{\mathcal {H}}}_+ \oplus {{\mathcal {H}}}_-\) with domain \({{\,\mathrm{Dom}\,}}[{{\mathfrak {v}}}] := {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\) by

$$\begin{aligned} {{\mathfrak {v}}}[ v \oplus q, u \oplus p ] := -v_* \langle {{\,\mathrm{div}\,}}v, p \rangle _{L^2(\Omega )} - v_* \langle q, {{\,\mathrm{div}\,}}u \rangle _{L^2(\Omega )} \end{aligned}$$

for \(u \oplus p, v \oplus q \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\). One can show that \(\nu \Vert {{\,\mathrm{div}\,}}u\Vert _{L^2(\Omega )}^2 \le {{\mathfrak {a}}}[u \oplus 0, u \oplus 0]\) for all \(u \in {{\mathfrak {D}}}_+ = H_0^1(\Omega )^n\), see, e.g., [29, Proof of Theorem 5.12]. Using Young’s inequality, this then implies that \({{\mathfrak {v}}}\) is infinitesimally form bounded with respect to \({{\mathfrak {a}}}\), see [29, Remark 5.1.3]; cf. also [15, Section 2]. Indeed, for \(\varepsilon > 0\) and \(f = u \oplus p \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\) we obtain

$$\begin{aligned} \begin{aligned} |{{\mathfrak {v}}}[f, f]|&\le 2 v_* | \langle p, {{\,\mathrm{div}\,}}u \rangle _{L^2(\Omega )} | \le 2 v_* \Vert p\Vert _{L^2(\Omega )} \Vert {{\,\mathrm{div}\,}}u\Vert _{L^2(\Omega )}\\&\le \varepsilon \nu \Vert {{\,\mathrm{div}\,}}u\Vert _{L^2(\Omega )}^2 + \varepsilon ^{-1} \nu ^{-1} v_*^2 \Vert p\Vert _{L^2(\Omega )}^2\\&\le \varepsilon {{\mathfrak {a}}}[ u \oplus 0, u \oplus 0 ] + \varepsilon ^{-1} \nu ^{-1} v_*^2 \Vert f\Vert _{{\mathcal {H}}}^2\\&= \varepsilon {{\mathfrak {a}}}[ f, f ] + \varepsilon ^{-1} \nu ^{-1} v_*^2 \Vert f\Vert _{{\mathcal {H}}}^2. \end{aligned} \end{aligned}$$
(2.5)

Thus, by the well-known KLMN theorem, the form \({{\mathfrak {b}}}_S := {{\mathfrak {a}}}+ {{\mathfrak {v}}}\) with \({{\,\mathrm{Dom}\,}}[{{\mathfrak {b}}}_S] = {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}] = {{\,\mathrm{Dom}\,}}(|A|^{1/2})\) is associated to a unique lower semibounded self-adjoint operator \(B_S\) on \({{\mathcal {H}}}\) with \({{\,\mathrm{Dom}\,}}(|B_S|^{1/2}) = {{\,\mathrm{Dom}\,}}(|A|^{1/2})\), the so-called Stokes operator. It is a self-adjoint extension of the (non-closed) upper dominant block operator matrix

$$\begin{aligned} \begin{pmatrix} -\nu {\varvec{\Delta }} &{} v_*{{\,\mathrm{grad}\,}}\\ -v_*{{\,\mathrm{div}\,}}&{} 0 \end{pmatrix} \end{aligned}$$

defined on \((H^2(\Omega ) \cap H_0^1(\Omega ))^n \oplus H^1(\Omega )\). In fact, the closure of the latter is a self-adjoint operator, see [10, Theorems 3.7 and 3.9], which yields another characterization of the Stokes operator \(B_S\).

By rescaling, one obtains from [10, Theorem 3.15] that the essential spectrum of \(B_S\) is given by

$$\begin{aligned} {{\,\mathrm{spec}\,}}_\mathrm {ess}(B_S) = \Bigl \{ -\frac{v_*^2}{\nu }, -\frac{v_*^2}{2\nu } \Bigr \}, \end{aligned}$$

cf. [15, Remark 2.2]. In particular, the essential spectrum of \(B_S\) is purely negative. In turn, the positive spectrum of \(B_S\), that is, \({{\,\mathrm{spec}\,}}(B_S) \cap (0,\infty )\), is discrete [15, Theorem 2.1 (i)].

The above shows that the hypotheses of Theorem 1.5 are satisfied in this situation, so that we obtain from Theorem 1.5 and Corollary 2.7 the following result.

Proposition 2.9

Let \(B_S\) be the Stokes operator as above. Then, the positive spectrum of \(B_S\), \({{\,\mathrm{spec}\,}}(B_S) \cap (0,\infty )\), is discrete, and the positive eigenvalues \(\lambda _k(B_S|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_S}((0,\infty ))})\), \(k \in \mathbb {N}\), of \(B_S\), enumerated in nondecreasing order and counting multiplicities, admit the representation

$$\begin{aligned} \lambda _k(B_S|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_S}((0,\infty ))}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+ \subset H_0^1(\Omega )^n\\ \dim {{\mathfrak {M}}}_+ = k \end{array}} \sup _{\begin{array}{c} u \oplus p \in {{\mathfrak {M}}}_+ \oplus L^2(\Omega )\\ \Vert u\Vert _{L^2(\Omega )^n}^2 + \Vert p\Vert _{L^2(\Omega )}^2 = 1 \end{array}} {{\mathfrak {b}}}_S[ u \oplus p, u \oplus p ]. \end{aligned}$$

The latter depend locally Lipschitz continuously on \(\nu \) and \(v_*\) and satisfy the two-sided estimate

$$\begin{aligned} \nu \lambda _k(-{\varvec{\Delta }}) \le \lambda _k(B_S|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_S}((0,\infty ))}) \le \nu \lambda _k(-{\varvec{\Delta }}) + \frac{v_*^2}{\nu }. \end{aligned}$$

Proof

In view of the above considerations, the representation of the eigenvalues follows from Theorem 1.5, and the lower bound on the eigenvalues follows from Corollary 2.7 (a). Moreover, by rescaling, the continuity statement is a consequence of Corollary 2.7 (b). It remains to show the upper bound on the eigenvalues. To this end, let \({{\mathfrak {M}}}_+ \subset H_0^1(\Omega )^n\) with \(\dim {{\mathfrak {M}}}_+ = k \in \mathbb {N}\), and let \(f = u \oplus p \in {{\mathfrak {M}}}_+ \oplus L^2(\Omega )\) be a normalized vector with \(u \ne 0\). Then, \(\mu := {{\mathfrak {a}}}[ u \oplus 0, u \oplus 0 ] / \Vert u\Vert _{L^2(\Omega )^n}^2 = {{\mathfrak {a}}}[ f, f ] / \Vert u\Vert _{L^2(\Omega )^n}^2\) is positive and satisfies

$$\begin{aligned} \mu \le \sup _{\begin{array}{c} v \in {{\mathfrak {M}}}_+\\ \Vert v\Vert _{L^2(\Omega )^n}^2 = 1 \end{array}} {{\mathfrak {a}}}[ v \oplus 0, v \oplus 0 ] \end{aligned}$$
(2.6)

and

$$\begin{aligned} \frac{\nu \Vert {{\,\mathrm{div}\,}}u\Vert _{L^2(\Omega )}^2}{\mu } = \frac{\Vert u\Vert _{L^2(\Omega )^n}^2 \nu \Vert {{\,\mathrm{div}\,}}u\Vert _{L^2(\Omega )}^2}{{{\mathfrak {a}}}[ u \oplus 0, u \oplus 0 ]} \le \Vert u\Vert _{L^2(\Omega )^n}^2 \le 1. \end{aligned}$$

Similarly as in (2.5), we now obtain by means of Young’s inequality that

$$\begin{aligned} | {{\mathfrak {v}}}[ f, f ] |&\le 2v_*\Vert p\Vert _{L^2(\Omega )} \Vert {{\,\mathrm{div}\,}}u\Vert _{L^2(\Omega )} \le \mu \Vert p\Vert _{L^2(\Omega )}^2 + \frac{v_*^2 \Vert {{\,\mathrm{div}\,}}u\Vert _{L^2(\Omega )}^2}{\mu }\\&\le \mu \Vert p\Vert _{L^2(\Omega )}^2 + \frac{v_*^2}{\nu }. \end{aligned}$$

Since \({{\mathfrak {a}}}[ f, f ] = \mu \Vert u\Vert _{L^2(\Omega )^n}^2\), this gives

$$\begin{aligned} {{\mathfrak {b}}}_S[ f, f ] \le {{\mathfrak {a}}}[ f, f ] + \mu \Vert p\Vert _{L^2(\Omega )}^2 + \frac{v_*^2}{\nu } = \mu + \frac{v_*^2}{\nu }. \end{aligned}$$
(2.7)

In light of \({{\mathfrak {b}}}_S[ 0 \oplus p, 0 \oplus p ] = {{\mathfrak {a}}}[ 0 \oplus p, 0 \oplus p ] = 0\), we conclude from (2.6) and (2.7) that

$$\begin{aligned} \sup _{\begin{array}{c} u \oplus p \in {{\mathfrak {M}}}_+ \oplus L^2(\Omega )\\ \Vert u\Vert _{L^2(\Omega )^n}^2 + \Vert p\Vert _{L^2(\Omega )}^2 = 1 \end{array}} {{\mathfrak {b}}}_S[ u \oplus p, u \oplus p ] \le \sup _{\begin{array}{c} v \in {{\mathfrak {M}}}_+\\ \Vert v\Vert _{L^2(\Omega )^n}^2 = 1 \end{array}} {{\mathfrak {a}}}[ v \oplus 0, v \oplus 0 ] + \frac{v_*^2}{\nu }, \end{aligned}$$

and taking the infimum over subspaces \({{\mathfrak {M}}}_+ \subset H_0^1(\Omega )^n\) with \(\dim {{\mathfrak {M}}}_+ = k\) proves the upper bound. This completes the proof. \(\square \)
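The mechanism behind the two-sided estimate can be seen in a two-dimensional compression of the form. The sketch below is a toy model under stated assumptions and not the Stokes operator itself: for a normalized \(u\) with \({{\mathfrak {a}}}[u \oplus 0, u \oplus 0] = \nu \lambda \) and a normalized \(p\) parallel to \({{\,\mathrm{div}\,}}u\), the form \({{\mathfrak {b}}}_S\) restricted to the span of \(u \oplus 0\) and \(0 \oplus p\) has the matrix [[νλ, −v_* g], [−v_* g, 0]] with g = ‖div u‖ and g² ≤ λ. The function names and parameter grids are hypothetical.

```python
import math

def toy_positive_eigenvalue(nu, v_star, lam, g):
    """Positive eigenvalue of the 2x2 matrix [[nu*lam, -v_star*g], [-v_star*g, 0]]."""
    half = nu * lam / 2.0
    return half + math.hypot(half, v_star * g)

def two_sided_estimate(nu, v_star, lam):
    """Lower and upper bound of the form stated in Proposition 2.9."""
    return nu * lam, nu * lam + v_star ** 2 / nu

checks = []
for nu in (0.1, 1.0, 10.0):
    for v_star in (0.0, 0.5, 3.0):
        for lam in (1.0, 5.8, 30.5):          # stand-ins for Dirichlet eigenvalues
            g = math.sqrt(lam)                # extreme case ||div u||^2 = lam
            lower, upper = two_sided_estimate(nu, v_star, lam)
            value = toy_positive_eigenvalue(nu, v_star, lam, g)
            checks.append(lower - 1e-9 <= value <= upper + 1e-9)

estimate_holds = all(checks)
```

The eigenvalue of such a compression is not an eigenvalue of \(B_S\); the example only shows that the same Young-type argument, \(\sqrt{h^2 + s^2} \le h + s^2/(2h)\) for \(h > 0\), produces the additive term \(v_*^2/\nu \).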

Remark 2.10

  1. (1)

    Choosing \(\varepsilon = 1\) in (2.5), the upper bound from Remark 2.8 (1) reads

    $$\begin{aligned} \lambda _k(B_S|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_S}((0,\infty ))}) \le 2\nu \lambda _k(-{\varvec{\Delta }}) + \frac{v_*^2}{\nu } \end{aligned}$$

    for all \(k \in \mathbb {N}\), while the choice \(\varepsilon = v_*\) in (2.5) leads to

    $$\begin{aligned} \lambda _k(B_S|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_S}((0,\infty ))}) \le (1+v_*)\nu \lambda _k(-{\varvec{\Delta }}) + \frac{v_*}{\nu } \end{aligned}$$

    for all \(k \in \mathbb {N}\).

  2. (2)

    For the particular case of \(k = 1\), a similar upper bound has been established in the proof of [15, Theorem 2.1 (i)]:

    $$\begin{aligned} \nu \lambda _1(-\Delta ) \le \lambda _1(B_S|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_S}((0,\infty ))}) \le \nu \lambda _1(-\Delta ) + v_*\Vert {{\,\mathrm{div}\,}}u_0\Vert _{L^2(\Omega )}, \end{aligned}$$

    where \(u_0 \in (H^2(\Omega ) \cap H_0^1(\Omega ))^n\) is a normalized eigenfunction for \(-{\varvec{\Delta }}\) corresponding to the first positive eigenvalue \(\lambda _1(-{\varvec{\Delta }}) = \lambda _1(-\Delta )\).
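For concrete parameter values, the upper bound of Proposition 2.9 and the two bounds in part (1) of this remark are easy to compare numerically. The following sketch simply evaluates the three expressions; the function name `upper_bounds` and the sample values of ν, v_* and λ_k(−Δ) are illustrative assumptions.

```python
def upper_bounds(nu, v_star, lam_k):
    """Evaluate the three upper bounds for the k-th positive eigenvalue.

    nu > 0, v_star > 0, and lam_k stands for the k-th eigenvalue of the
    vector-valued Dirichlet Laplacian (supplied by the caller).
    """
    base = nu * lam_k
    return {
        "proposition_2_9": base + v_star ** 2 / nu,
        "remark_eps_1": 2.0 * base + v_star ** 2 / nu,
        "remark_eps_v_star": (1.0 + v_star) * base + v_star / nu,
    }

# Hypothetical sample values.
bounds = upper_bounds(nu=1.0, v_star=0.5, lam_k=10.0)
```

For these sample values, the bound from Proposition 2.9 is the smallest of the three; it is never larger than the bound obtained with ε = 1, since the two differ exactly by νλ_k(−Δ) ≥ 0.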

Remark 2.11

Since \(B_S\) is lower semibounded, Proposition 2.9 can alternatively be proved via [12], see Remark 1.6 (1). Moreover, since \(A = -\nu {\varvec{\Delta }} \oplus 0\) has a spectral gap to the right of 0, the same is true with [25], see Remark 1.6 (2). In fact, [28] gives for \(\lambda _k = \lambda _k(B_S|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{B_S}((0,\infty ))})\), \(k \in \mathbb {N}\), with the same reasoning also the representation

$$\begin{aligned} \lambda _k = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+ \subset (H^2(\Omega )\cap H_0^1(\Omega ))^n\\ \dim {{\mathfrak {M}}}_+ = k \end{array}} \sup _{\begin{array}{c} u \oplus p \in {{\mathfrak {M}}}_+ \oplus H^1(\Omega )\\ \Vert u\Vert _{L^2(\Omega )^n}^2 + \Vert p\Vert _{L^2(\Omega )}^2 = 1 \end{array}} \langle u \oplus p, B_S(u \oplus p) \rangle . \end{aligned}$$

2.2 Reducing to the Off-diagonal Framework

Apart from the Stokes operator from the previous subsection, the consideration of off-diagonal perturbations in Theorems 1.4 and 1.5 may seem a bit restrictive. However, such perturbations naturally appear when the perturbation is decomposed into its diagonal and off-diagonal part. If the diagonal perturbation can then be handled efficiently in a suitable way, the discussed off-diagonal framework can be applied to the remaining part of the perturbation. The following result makes this precise in the setting of operator perturbations.

Proposition 2.12

Assume Hypothesis 1.1. Suppose, in addition, that B is of the form \(B = A + V\), \({{\,\mathrm{Dom}\,}}(B) = {{\,\mathrm{Dom}\,}}(A)\), with some symmetric A-bounded operator V. Denote \(A_\pm := A|_{{{\,\mathrm{Ran}\,}}P_\pm }\), and decompose \(V|_{{{\,\mathrm{Dom}\,}}(A)}\) as \(V|_{{{\,\mathrm{Dom}\,}}(A)} = V_{\mathrm {diag}} + V_{\mathrm {off}}\), where \(V_{\mathrm {diag}} = V_+ \oplus V_-\) is the diagonal part of \(V|_{{{\,\mathrm{Dom}\,}}(A)}\) and \(V_{\mathrm {off}}\) is the off-diagonal part of \(V|_{{{\,\mathrm{Dom}\,}}(A)}\) with respect to the decomposition \({{\,\mathrm{Ran}\,}}P_+ \oplus {{\,\mathrm{Ran}\,}}P_-\). Suppose that the following hold:

  1. (i)

    \(A+V_{\mathrm {diag}}\) is self-adjoint on \({{\,\mathrm{Dom}\,}}(A+V_{\mathrm {diag}}) = {{\,\mathrm{Dom}\,}}(A)\),

  2. (ii)

    \(V_{\mathrm {off}}\) is \((A+V_{\mathrm {diag}})\)-bounded with \((A+V_{\mathrm {diag}})\)-bound smaller than 1,

  3. (iii)

    \(\sup {{\,\mathrm{spec}\,}}(A_- + V_-) \le \inf {{\,\mathrm{spec}\,}}(A_+ + V_+)\), and

  4. (iv)

    \({{\,\mathrm{Ker}\,}}(A_+ + V_+ - \mu ) = \{ 0 \}\) with \(\mu := \sup {{\,\mathrm{spec}\,}}(A_- + V_-)\).

Then, setting \(Q := \mathsf {E}_B((\mu ,\infty ))\), one has \(\dim {{\,\mathrm{Ran}\,}}P_+ = \dim {{\,\mathrm{Ran}\,}}Q\) and

$$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathcal {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, Bx \rangle = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathfrak {D}}}_+\\ \dim {{\mathfrak {M}}}_+=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+ \oplus {{\mathfrak {D}}}_-\\ \Vert x\Vert =1 \end{array}} {{\mathfrak {b}}}[x,x] \end{aligned}$$

for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}Q\).

Proof

By (iii) and (iv) we obviously have

$$\begin{aligned} P_+ = \mathsf {E}_{A+V_{{\mathrm{diag}}}}\bigl ( (\mu ,\infty ) \bigr ) \quad {\text { and }}\quad P_- = \mathsf {E}_{A+V_{{\mathrm{diag}}}}\bigl ( (-\infty ,\mu ] \bigr ) . \end{aligned}$$

Upon a spectral shift by \(\mu \), the claim now follows from Theorem 1.4 with A, V, and \(Q_+\) replaced by \(A + V_{\mathrm{diag}}\), \(V_{\mathrm{off}}\), and Q, respectively. \(\square \)

Remark 2.13

Conditions (iii) and (iv) in Proposition 2.12 are satisfied if \(\mu < \nu := \inf {{\,\mathrm{spec}\,}}(A_+ + V_+)\) holds. In this case, the interval \((\mu , \nu )\) belongs to the resolvent set of the operator \(A + V_{\mathrm{diag}}\), and by [26, Theorem 1] (cf. also [1, Theorem 2.1]) to the one of \(A + V = (A + V_{\mathrm{diag}}) + V_{\mathrm{off}}\) as well. In particular, we have \(\mathsf {E}_{A+V}((\mu ,\infty )) = \mathsf {E}_{A+V}([\nu ,\infty ))\). This is the situation encountered in the alternative proof of Proposition 2.1 (b) and the proof of Corollary 2.15 below. However, the conclusions can then alternatively be obtained also via [6, 25], cf. Remark 1.6 (2).

Since conditions (i) and (ii) in Proposition 2.12 are clearly satisfied if V is bounded, the above provides an alternative way to prove part (b) of Proposition 2.1:

Alternative proof of Proposition 2.1(b)

By spectral shift we may assume without loss of generality that \(c< 0 < d\). We then have \(P = P_+\) with \(P_+\) as in Hypothesis 1.1. Let \(A_\pm \) and \(V_{\mathrm{diag}}\) be defined as in Proposition 2.12, and for \(\bullet \in \{ p, n \}\) denote by \(V_{\mathrm{diag}}^{(\bullet )} = V_+^{(\bullet )} \oplus V_-^{(\bullet )}\) the diagonal part of \(V^{(\bullet )}\). Clearly, we have \(V_\pm ^{(\bullet )} \ge 0\) and \(\Vert V_\pm ^{(\bullet )}\Vert \le \Vert V^{(\bullet )}\Vert \), and \(V_{\mathrm{diag}}\) decomposes as \(V_{\mathrm{diag}} = V_{\mathrm{diag}}^{(p)} - V_{\mathrm{diag}}^{(n)} = (V_+^{(p)} - V_+^{(n)}) \oplus (V_-^{(p)} - V_-^{(n)})\). Now,

$$\begin{aligned} A_- + V_-^{(p)} - V_-^{(n)} \le c + \Vert V^{(p)}\Vert \quad {\text { and }}\quad d - \Vert V^{(n)}\Vert \le A_+ + V_+^{(p)} - V_+^{(n)} \end{aligned}$$

in the sense of quadratic forms. In light of \(c + \Vert V^{(p)}\Vert < d - \Vert V^{(n)}\Vert \) and Remark 2.13, applying Proposition 2.12 proves the claim. \(\square \)

It is worth noting that an analogous reasoning for part (c) of Proposition 2.1 suffers from similar obstacles as the alternative proof based on [6, 25] mentioned in Remark 2.2 (2).

If V is not bounded, conditions (i) and (ii) in Proposition 2.12 can still be guaranteed via the well-known Kato-Rellich theorem by means of a sufficiently small A-bound of V, as the following lemma shows.

Lemma 2.14

In the situation of Proposition 2.12, let V have A-bound smaller than 1/2. Then \(V_{\mathrm {diag}}\) and \(V_{\mathrm {off}}\) both have A-bound smaller than 1/2, and \(V_{\mathrm {off}}\) has \((A+V_{\mathrm {diag}})\)-bound smaller than 1.

Proof

By hypothesis, there are constants \(a,b \ge 0\), \(b < 1/2\) such that we have \(\Vert Vx \Vert \le a\Vert x \Vert + b\Vert Ax \Vert \) for all \(x \in {{\,\mathrm{Dom}\,}}(A)\). Using Young’s inequality, this gives for every \(\varepsilon > 0\) that

$$\begin{aligned} \Vert Vx \Vert ^2 \le \bigl ( a\Vert x\Vert + b\Vert Ax\Vert \bigr )^2 \le a^2\Bigl ( 1 + \frac{1}{\varepsilon } \Bigr )\Vert x\Vert ^2 + b^2(1+\varepsilon )\Vert Ax\Vert ^2 \end{aligned}$$

for all \(x \in {{\,\mathrm{Dom}\,}}(A)\); cf. [17,  Section V.4.1]. Since \(AP_\pm x = P_\pm Ax\) for all \(x \in {{\,\mathrm{Dom}\,}}(A)\) and the ranges of \(P_\pm \) are orthogonal, this implies that

$$\begin{aligned} \Vert V_{\mathrm{diag}}x\Vert ^2&= \Vert P_+VP_+x\Vert ^2 + \Vert P_-VP_-x\Vert ^2\\&\le a^2\Bigl ( 1 + \frac{1}{\varepsilon } \Bigr ) \Vert x\Vert ^2 + b^2(1+\varepsilon ) \Vert Ax\Vert ^2 \end{aligned}$$

for all \(x \in {{\,\mathrm{Dom}\,}}(A)\). We choose \(\varepsilon > 0\) such that \(\beta := b(1+\varepsilon )^{1/2} < 1/2\) and set \(\alpha := a(1+1/\varepsilon )^{1/2}\). It then follows from the above that

$$\begin{aligned} \Vert V_{\mathrm{diag}}x\Vert \le \alpha \Vert x\Vert + \beta \Vert Ax\Vert \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}(A) \end{aligned}$$

and analogously the same for \(V_{\mathrm{off}}\). This shows that \(V_{\mathrm{diag}}\) and \(V_{\mathrm{off}}\) indeed have A-bound smaller than 1/2.

Using standard arguments as, for instance, in [35,  Lemma 2.1.6], we obtain

$$\begin{aligned} \Vert V_{\mathrm{diag}}x\Vert \le \frac{\alpha }{1-\beta }\Vert x\Vert + \frac{\beta }{1-\beta }\Vert (A + V_{\mathrm{diag}})x\Vert \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}(A) \end{aligned}$$

and, in turn,

$$\begin{aligned} \Vert V_{\mathrm{off}}x\Vert&\le \alpha \Vert x\Vert + \beta \Vert Ax\Vert \le \alpha \Vert x\Vert + \beta \Vert (A+V_{\mathrm{diag}})x\Vert + \beta \Vert V_{\mathrm{diag}}x\Vert \\&\le \Bigl ( \alpha + \frac{\alpha \beta }{1-\beta } \Bigr )\Vert x\Vert + \Bigl ( \beta + \frac{\beta ^2}{1-\beta } \Bigr ) \Vert (A + V_{\mathrm{diag}})x\Vert \\&= \frac{\alpha }{1-\beta }\Vert x\Vert + \frac{\beta }{1-\beta }\Vert (A + V_{\mathrm{diag}})x\Vert \end{aligned}$$

for all \(x \in {{\,\mathrm{Dom}\,}}(A + V_{\mathrm{diag}}) = {{\,\mathrm{Dom}\,}}(A)\), where \(\beta /(1-\beta ) < 1\). This shows that \(V_{\mathrm{off}}\) has \((A + V_{\mathrm{diag}})\)-bound smaller than 1 and, hence, completes the proof. \(\square \)
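The bookkeeping of constants in this proof can be traced numerically. The sketch below assumes given values of a, b, and ε (all hypothetical), computes α, β and the resulting constants of \(V_{\mathrm{off}}\) relative to \(A + V_{\mathrm{diag}}\), and spot-checks the Young-type inequality on a few sample values standing in for ‖x‖ and ‖Ax‖.

```python
import math

def relative_bound_constants(a, b, eps):
    """Constants from the proof of Lemma 2.14 for ||Vx|| <= a||x|| + b||Ax||."""
    if not (a >= 0 and 0 <= b < 0.5 and eps > 0):
        raise ValueError("need a >= 0, 0 <= b < 1/2 and eps > 0")
    alpha = a * math.sqrt(1.0 + 1.0 / eps)
    beta = b * math.sqrt(1.0 + eps)
    if beta >= 0.5:
        raise ValueError("eps too large: beta must stay below 1/2")
    # Constants of V_off relative to A + V_diag.
    return alpha, beta, alpha / (1.0 - beta), beta / (1.0 - beta)

def young_holds(a, b, eps, nx, nax):
    """Check (a*nx + b*nax)^2 <= a^2 (1 + 1/eps) nx^2 + b^2 (1 + eps) nax^2."""
    lhs = (a * nx + b * nax) ** 2
    rhs = a * a * (1.0 + 1.0 / eps) * nx * nx + b * b * (1.0 + eps) * nax * nax
    return lhs <= rhs * (1.0 + 1e-12)

# Hypothetical sample values with b < 1/2.
alpha, beta, alpha_rel, beta_rel = relative_bound_constants(a=1.0, b=0.4, eps=0.5)
young_ok = all(
    young_holds(1.0, 0.4, 0.5, nx, nax)
    for nx in (0.0, 0.3, 1.0, 7.0)
    for nax in (0.0, 0.5, 2.0, 40.0)
)
```

With these sample values, β is about 0.49 and the relative bound β/(1−β) is about 0.96, so it stays below 1 as the lemma asserts; choosing ε too large is rejected because β would reach 1/2.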

A suitable smallness assumption on the perturbation may also be used to guarantee conditions (iii) and (iv) in Proposition 2.12 in the sense of Remark 2.13 if the unperturbed operator has a gap in the spectrum. This is demonstrated in the following corollary to the above.

Corollary 2.15

Let A, \((c,d)\), and \({{\mathcal {D}}}_\pm \) be as in Proposition 2.1, and let V be a symmetric operator that is A-bounded with A-bound smaller than 1/2. Suppose, in addition, that \(c< 0 < d\), and define \(A_\pm \) and \(V_\pm \) as in Proposition 2.12. Suppose that there are constants \(a_\pm , b_\pm \ge 0\), \(b_\pm < 1\), with

$$\begin{aligned} | \langle x, V_\pm x \rangle | \le a_\pm \Vert x\Vert ^2 \pm b_\pm \langle x, A_\pm x \rangle \quad {\text { for all }}\ x \in {{\mathcal {D}}}_\pm \end{aligned}$$
(2.8)

and

$$\begin{aligned} a_+ + a_- + b_+d - b_-c < d - c. \end{aligned}$$
(2.9)

Then, the interval \((a_- + (1-b_-)c, (1-b_+)d-a_+)\) belongs to the resolvent set of \(A+V\), and one has \(\dim {{\,\mathrm{Ran}\,}}\mathsf {E}_A( [d,\infty ) ) = \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_{A+V}( [(1-b_+)d-a_+,\infty ) )\) and

$$\begin{aligned} \lambda _k((A+V)|_{{{\,\mathrm{Ran}\,}}\mathsf {E}_{A+V}([(1-b_+)d-a_+,\infty ))}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+ \subset {{\mathcal {D}}}_+\\ \dim {{\mathfrak {M}}}_+ = k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+\oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, (A+V)x \rangle \end{aligned}$$

for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}\mathsf {E}_{A+V}([(1-b_+)d-a_+,\infty ))\).

Proof

By Lemma 2.14 and the Kato-Rellich theorem, conditions (i) and (ii) in Proposition 2.12 are satisfied. By (2.8) we have

$$\begin{aligned} A_- + V_- \le a_- + (1-b_-)c \quad {\text { and }}\quad (1-b_+)d - a_+ \le A_+ + V_+ \end{aligned}$$

in the sense of quadratic forms. Since \(a_- + (1-b_-)c < (1-b_+)d - a_+\) by (2.9), the claim now again follows from Proposition 2.12 and Remark 2.13. \(\square \)
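The role of condition (2.9) is purely arithmetic: it is equivalent to the perturbed gap being nonempty. A minimal sketch, assuming hypothetical constants a_±, b_± for which (2.8) is taken for granted, that evaluates (2.9) and returns the interval from Corollary 2.15:

```python
def perturbed_gap(c, d, a_plus, a_minus, b_plus, b_minus):
    """Return the interval (a_- + (1-b_-)c, (1-b_+)d - a_+) if (2.9) holds, else None.

    The constants are assumed to satisfy (2.8); this is not checked here.
    """
    if not (c < 0 < d):
        raise ValueError("need c < 0 < d")
    if min(a_plus, a_minus, b_plus, b_minus) < 0 or max(b_plus, b_minus) >= 1:
        raise ValueError("need a_pm >= 0 and 0 <= b_pm < 1")
    lhs = a_plus + a_minus + b_plus * d - b_minus * c   # left-hand side of (2.9)
    if lhs >= d - c:
        return None
    return (a_minus + (1.0 - b_minus) * c, (1.0 - b_plus) * d - a_plus)

# Hypothetical sample constants.
gap = perturbed_gap(c=-1.0, d=1.0, a_plus=0.2, a_minus=0.2, b_plus=0.25, b_minus=0.25)
no_gap = perturbed_gap(c=-1.0, d=1.0, a_plus=0.9, a_minus=0.9, b_plus=0.25, b_minus=0.25)
```

For the first set of sample constants the left-hand side of (2.9) equals 0.9 < 2 and the resulting interval is approximately (−0.55, 0.55); for the second set (2.9) fails and no gap is guaranteed.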

Remark 2.16

  1. (1)

    It is easy to see that the left-hand side of (2.9) is invariant under a spectral shift in A. In this respect, a statement analogous to Corollary 2.15 holds for arbitrary spectral gaps \((c,d)\), not just ones satisfying \(c< 0 < d\).

  2. (2)

    It follows from Lemma 2.14 that \(V_\pm \) is \(A_\pm \)-bounded with \(A_\pm \)-bound smaller than 1/2. In turn, since both \(A_\pm \) are semibounded, constants \(a_\pm ,b_\pm \ge 0\), \(b_\pm < 1/2\), satisfying (2.8) always exist by [17, Theorem VI.1.38]. Condition (2.9) can then be guaranteed for tV instead of V for \(t \in \mathbb {R}\) with sufficiently small modulus.

  3. (3)

    As indicated in Remark 2.13, Corollary 2.15 can also be proved via [6, 25] by verifying the operator analogue to (1.1). In fact, that approach even allows one to weaken the assumption on the A-bound of V from being smaller than 1/2 to merely being smaller than 1, since it is then not necessary for \(V_{\mathrm{off}}\) to have \((A+V_{\mathrm{diag}})\)-bound smaller than 1.

3 An Abstract Minimax Principle in Spectral Gaps

We rely on the following abstract minimax principle in spectral gaps, part (a) of which is extracted from [11] and part (b) of which is its natural adaptation to the operator framework; cf. also [27, Proposition A.3]. For the convenience of the reader, its proof is reproduced in Appendix A below.

Proposition 3.1

Assume Hypothesis 1.1.

  1. (a)

    If we have \({{\,\mathrm{Dom}\,}}(|B|^{1/2}) = {{\,\mathrm{Dom}\,}}(|A|^{1/2})\), \({{\mathfrak {b}}}[x,x]\le 0\) for all \(x\in {{\mathfrak {D}}}_-\), and \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathfrak {D}}}_+}) \supset {{\mathfrak {D}}}_+\), then

    $$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathfrak {D}}}_+\\ \dim ({{\mathfrak {M}}}_+)=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+\oplus {{\mathfrak {D}}}_-\\ \Vert x\Vert =1 \end{array}} {{\mathfrak {b}}}[x,x] \end{aligned}$$

    for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}P_+\).

  2. (b)

    If we have \({{\,\mathrm{Dom}\,}}(B) = {{\,\mathrm{Dom}\,}}(A)\), \(\langle x, Bx \rangle \le 0\) for all \(x \in {{\mathcal {D}}}_-\), and \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+}) \supset {{\mathcal {D}}}_+\), then

    $$\begin{aligned} \lambda _k(B|_{{{\,\mathrm{Ran}\,}}Q_+}) = \inf _{\begin{array}{c} {{\mathfrak {M}}}_+\subset {{\mathcal {D}}}_+\\ \dim ({{\mathfrak {M}}}_+)=k \end{array}} \sup _{\begin{array}{c} x\in {{\mathfrak {M}}}_+\oplus {{\mathcal {D}}}_-\\ \Vert x\Vert =1 \end{array}} \langle x, Bx \rangle \end{aligned}$$

    for all \(k \in \mathbb {N}\) with \(k \le \dim {{\,\mathrm{Ran}\,}}P_+\).

Remark 3.2

The above proposition is tailored towards spectral gaps to the right of 0, but by a spectral shift we can of course also handle spectral gaps to the right of any point \(\gamma \in \mathbb {R}\). Indeed, we have \(\mathsf {E}_{A-\gamma }((0,\infty )) = \mathsf {E}_A((\gamma ,\infty ))\) for \(\gamma \in \mathbb {R}\) and analogously for B. Moreover, the form associated with the operator \(B - \gamma \) is known to agree with the form \({{\mathfrak {b}}}- \gamma \). The latter can be seen, for instance, by reasoning analogous to that in [31, Proposition 10.5 (a)]; cf. also Lemma B.6 in Appendix B below.

Remark 3.3

Since \(P_+\) and \(Q_+\) are spectral projections for the respective operators, we have \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathfrak {D}}}_+}) \subset {{\mathfrak {D}}}_+\) and \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+}) \subset {{\mathcal {D}}}_+\). In this respect, the condition \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathfrak {D}}}_+}) \supset {{\mathfrak {D}}}_+\) in part (a) of Proposition 3.1 actually means that the restriction \(P_+Q_+|_{{{\mathfrak {D}}}_+} :{{\mathfrak {D}}}_+ \rightarrow {{\mathfrak {D}}}_+\) is surjective. This has not been formulated explicitly in the statement of [11, Theorem 1] but has instead been guaranteed by the stronger condition

$$\begin{aligned} \Vert (|A|+I)^{1/2} P_+Q_- (|A|+I)^{-1/2} \Vert < 1. \end{aligned}$$

In fact, taking into account that \({{\mathfrak {D}}}_+ = {{\,\mathrm{Ran}\,}}((|A|+I)^{-1/2}|_{{{\,\mathrm{Ran}\,}}P_+})\), a standard Neumann series argument in the Hilbert space \({{\,\mathrm{Ran}\,}}P_+\) then even gives bijectivity of the restriction \(P_+Q_+|_{{{\mathfrak {D}}}_+}\), see Step 2 of the proof of [11, Theorem 1]. In this reasoning, the operators \((|A|+I)^{\pm 1/2}\) can be replaced by \((|A|+\alpha I)^{\pm 1/2}\) for any \(\alpha >0\); if \(|A|\) has a bounded inverse, also \(\alpha =0\) can be considered here.

Of course, the above reasoning also applies in the situation of part (b) of Proposition 3.1, but with \((|A|+\alpha I)^{\pm 1/2}\) replaced by \((|A|+\alpha I)^{\pm 1}\).

In the context of our main theorems, the restriction \(P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_+}\), understood as an endomorphism of \({{\,\mathrm{Ran}\,}}P_+\), will always be bijective, cf. Remark 3.5 (1) below. It turns out that then the hypotheses of part (b) in Proposition 3.1 imply those of part (a), in which case both representations for the minimax values in Proposition 3.1 are valid. More precisely, we have the following lemma, essentially based on the well-known Heinz inequality, cf. Appendix B below.

Lemma 3.4

Assume Hypothesis 1.1 with \({{\,\mathrm{Dom}\,}}(A) = {{\,\mathrm{Dom}\,}}(B)\).

  1. (a)

    One has \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\).

  2. (b)

    If \(\langle x, Bx\rangle \le 0\) for all \(x \in {{\mathcal {D}}}_-\), then \({{\mathfrak {b}}}[x, x] \le 0\) for all \(x \in {{\mathfrak {D}}}_-\).

  3. (c)

    If the restriction \(P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_+} :{{\,\mathrm{Ran}\,}}P_+ \rightarrow {{\,\mathrm{Ran}\,}}P_+\) is bijective and \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+}) \supset {{\mathcal {D}}}_+\), then also \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathfrak {D}}}_+}) \supset {{\mathfrak {D}}}_+\).

Proof

  1. (a).

    This is a consequence of the well-known Heinz inequality, see, e.g., Corollary B.3 below. Alternatively, this follows by classical considerations regarding operator and form boundedness, see Remark B.4 below.

  2. (b).

    It follows from part (a) that the operator \(|B|^{1/2} ( |A|^{1/2} + I )^{-1}\) is closed and everywhere defined, hence bounded by the closed graph theorem. Thus,

    $$\begin{aligned} \Vert |B|^{1/2}x\Vert \le \Vert |B|^{1/2}(|A|^{1/2}+I)^{-1}\Vert \cdot \Vert (|A|^{1/2}+I)x\Vert \end{aligned}$$

    for all \(x\in {{\,\mathrm{Dom}\,}}(|A|^{1/2})={{\,\mathrm{Dom}\,}}(|B|^{1/2})\). Since \({{\mathcal {D}}}_-\) is a core for the operator \(|A|_{{{\,\mathrm{Ran}\,}}P_-}|^{1/2}=|A|^{1/2}|_{{{\,\mathrm{Ran}\,}}P_-}\) with \({{\,\mathrm{Dom}\,}}(|A|^{1/2}|_{{{\,\mathrm{Ran}\,}}P_-})={{\mathfrak {D}}}_-\), the inequality \({{\mathfrak {b}}}[x,x]\le 0\) for \(x\in {{\mathfrak {D}}}_-\) now follows from the hypothesis \(\langle x,Bx \rangle \le 0\) for all \(x\in {{\mathcal {D}}}_-\) by approximation.

  3. (c).

    We clearly have \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+})={{\mathcal {D}}}_+\), \({{\mathcal {D}}}_+={{\,\mathrm{Dom}\,}}(A|_{{{\,\mathrm{Ran}\,}}P_+})\), and \({{\mathfrak {D}}}_+={{\,\mathrm{Dom}\,}}(|A|_{{{\,\mathrm{Ran}\,}}P_+}|^{1/2})\). Applying Corollary B.5 below with the choices \(\Lambda _1=\Lambda _2=A|_{{{\,\mathrm{Ran}\,}}P_+}\) and \(S = P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_+}\) therefore implies that \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathfrak {D}}}_+})={{\mathfrak {D}}}_+\), which proves the claim. \(\square \)

Remark 3.5

  1. (1)

    In light of the identity \(P_+Q_+ = P_+ - P_+Q_-\), the bijectivity of \(P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_+} :{{\,\mathrm{Ran}\,}}P_+ \rightarrow {{\,\mathrm{Ran}\,}}P_+\) can be guaranteed, for instance, by the condition \(\Vert P_+Q_-\Vert < 1\) via a standard Neumann series argument. Since \(P_+-Q_+=P_+Q_- - P_-Q_+\) and, in particular, \(\Vert P_+Q_-\Vert \le \Vert P_+-Q_+\Vert \), this condition holds if the stronger inequality \(\Vert P_+-Q_+\Vert <1\) is satisfied. In the latter case, there also is a unitary operator U with \(Q_+U=UP_+\), see, e.g., [17, Theorem I.6.32], so that automatically \(\dim {{\,\mathrm{Ran}\,}}P_+=\dim {{\,\mathrm{Ran}\,}}Q_+\). It is this situation we encounter in Theorems 1.3–1.5.

  2. (2)

    In the case where B is an infinitesimal operator perturbation of A, the inequality \(\Vert P_+Q_-\Vert <1\) already implies that \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+}) \supset {{\mathcal {D}}}_+\), see the following section; the particular case where B is a bounded perturbation of A has previously been considered in [27, Lemma A.6]. For more general, not necessarily infinitesimal, perturbations, this remains so far an open problem.
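The two facts about the projections used in part (1) of the preceding remark can be checked directly. The following computation is only a sketch; it assumes nothing beyond the relations \(P_+ + P_- = I = Q_+ + Q_-\) and \(Q_+Q_- = 0\) for the complementary orthogonal projections from Hypothesis 1.1:

$$\begin{aligned} P_+ - Q_+&= P_+(Q_+ + Q_-) - (P_+ + P_-)Q_+ = P_+Q_- - P_-Q_+,\\ P_+(P_+ - Q_+)Q_-&= P_+Q_- - P_+Q_+Q_- = P_+Q_-. \end{aligned}$$

Since \(\Vert P_+\Vert \le 1\) and \(\Vert Q_-\Vert \le 1\), the second line gives \(\Vert P_+Q_-\Vert \le \Vert P_+ - Q_+\Vert \).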

4 Proof of Theorem 1.2: The Graph Norm Approach

In this section we show that the inequality \(\Vert P_+Q_-\Vert < 1\) in the context of Theorem 1.2 implies that \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+}) \supset {{\mathcal {D}}}_+\), which is essentially what is needed to deduce Theorem 1.2 from Proposition 3.1 and Lemma 3.4. The main technique used to accomplish this can in fact be formulated in a much more general framework:

Recall that for a closed operator \(\Lambda \) on a Banach space with norm \(\Vert \,\cdot \,\Vert \), its domain \({{\,\mathrm{Dom}\,}}(\Lambda )\) can be equipped with the graph norm

$$\begin{aligned} \Vert x\Vert _\Lambda := \Vert x\Vert + \Vert \Lambda x\Vert ,\quad x\in {{\,\mathrm{Dom}\,}}(\Lambda ), \end{aligned}$$

which makes \(({{\,\mathrm{Dom}\,}}(\Lambda ),\Vert \,\cdot \,\Vert _\Lambda )\) a Banach space. Also recall that a linear operator K with \({{\,\mathrm{Dom}\,}}(K) \supset {{\,\mathrm{Dom}\,}}(\Lambda )\) is called \(\Lambda \)-bounded with \(\Lambda \)-bound \(\beta _* \ge 0\) if for all \(\beta > \beta _*\) there is an \(\alpha \ge 0\) with

$$\begin{aligned} \Vert Kx\Vert \le \alpha \Vert x\Vert + \beta \Vert \Lambda x\Vert \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}(\Lambda ) \end{aligned}$$
(4.1)

and if there is no such \(\alpha \) for \(0< \beta < \beta _*\).

The following lemma extends part (a) of [27, Proposition A.5], taken from Lemma 3.9 in the author’s Ph.D. thesis [32], to relatively bounded commutators.

Lemma 4.1

Let \(\Lambda \) be a closed operator on a Banach space, K be \(\Lambda \)-bounded with \(\Lambda \)-bound \(\beta _* \ge 0\), and let S be bounded with \({{\,\mathrm{Ran}\,}}(S|_{{{\,\mathrm{Dom}\,}}(\Lambda )})\subset {{\,\mathrm{Dom}\,}}(\Lambda )\) and

$$\begin{aligned} \Lambda Sx - S\Lambda x = Kx \quad {\text { for all }}\ x\in {{\,\mathrm{Dom}\,}}(\Lambda ). \end{aligned}$$

Then, the restriction \(S|_{{{\,\mathrm{Dom}\,}}(\Lambda )}\) is bounded with respect to the graph norm for \(\Lambda \), and the corresponding spectral radius \(r_\Lambda (S) = \lim _{k\rightarrow \infty } \Vert (S|_{{{\,\mathrm{Dom}\,}}(\Lambda )})^k\Vert _\Lambda ^{1/k}\) satisfies

$$\begin{aligned} r_\Lambda (S) \le \Vert S\Vert + \beta _*. \end{aligned}$$

Proof

Only small modifications to the reasoning from [32, Lemma 3.9], [27, Proposition A.5] are necessary. For the sake of completeness, we reproduce the full argument here:

Let \(\beta > \beta _*\) and \(\alpha \ge 0\) such that (4.1) holds. Then, for \(x\in {{\,\mathrm{Dom}\,}}(\Lambda )\) one has

$$\begin{aligned} \Vert \Lambda Sx\Vert \le \Vert S\Vert \Vert \Lambda x\Vert + \Vert Kx\Vert \le (\Vert S\Vert +\beta )\Vert \Lambda x\Vert + \alpha \Vert x\Vert , \end{aligned}$$

so that

$$\begin{aligned} \Vert Sx\Vert _\Lambda = \Vert Sx\Vert + \Vert \Lambda Sx\Vert \le \bigl (\Vert S\Vert +\beta \bigr )\Vert x\Vert _\Lambda + \alpha \Vert x\Vert . \end{aligned}$$

In particular, \(S|_{{{\,\mathrm{Dom}\,}}(\Lambda )}\) is bounded with respect to the graph norm \(\Vert \,\cdot \,\Vert _\Lambda \) with \(\Vert S\Vert _\Lambda \le \Vert S\Vert + \beta + \alpha \).

Now, a straightforward induction yields

$$\begin{aligned} \Vert S^kx\Vert _\Lambda \le \bigl (\Vert S\Vert + \beta \bigr )^k\Vert x\Vert _\Lambda + k\alpha \bigl (\Vert S\Vert +\beta \bigr )^{k-1}\Vert x\Vert , \quad x\in {{\,\mathrm{Dom}\,}}(\Lambda ), \end{aligned}$$

for \(k\in \mathbb {N}\). Hence, \(\Vert (S|_{{{\,\mathrm{Dom}\,}}(\Lambda )})^k\Vert _\Lambda \le (\Vert S\Vert +\beta )^k + k\alpha (\Vert S\Vert +\beta )^{k-1}\), so that

$$\begin{aligned} r_\Lambda (S)&= \lim _{k\rightarrow \infty } \Vert (S|_{{{\,\mathrm{Dom}\,}}(\Lambda )})^k\Vert _\Lambda ^{1/k} \le \lim _{k\rightarrow \infty } \bigl ( (\Vert S\Vert +\beta )^k+k\alpha (\Vert S\Vert +\beta )^{k-1} \bigr )^{1/k}\\&= \Vert S\Vert +\beta . \end{aligned}$$

Since \(\beta > \beta _*\) was chosen arbitrarily, this proves the claim. \(\square \)
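For readers who wish to see the induction and the final limit in the above proof in more detail, here is a sketch. It uses only the estimate \(\Vert Sx\Vert _\Lambda \le (\Vert S\Vert +\beta )\Vert x\Vert _\Lambda + \alpha \Vert x\Vert \) from the proof and the elementary bound \(\Vert S^kx\Vert \le \Vert S\Vert ^k\Vert x\Vert \); the abbreviation \(q := \Vert S\Vert + \beta \) is introduced here for brevity only and is not part of the original notation. Assuming the claimed bound for some \(k\in \mathbb {N}\), one gets for \(x\in {{\,\mathrm{Dom}\,}}(\Lambda )\)

$$\begin{aligned} \Vert S^{k+1}x\Vert _\Lambda&\le q\Vert S^kx\Vert _\Lambda + \alpha \Vert S^kx\Vert \le q\bigl ( q^k\Vert x\Vert _\Lambda + k\alpha q^{k-1}\Vert x\Vert \bigr ) + \alpha q^k\Vert x\Vert \\&= q^{k+1}\Vert x\Vert _\Lambda + (k+1)\alpha q^k\Vert x\Vert . \end{aligned}$$

For the limit, note that \(q > 0\) because \(\beta> \beta _* \ge 0\), so that

$$\begin{aligned} \bigl ( q^k + k\alpha q^{k-1} \bigr )^{1/k} = q\Bigl ( 1 + \frac{k\alpha }{q} \Bigr )^{1/k} = q\exp \Bigl ( \frac{1}{k}\log \Bigl ( 1 + \frac{k\alpha }{q} \Bigr ) \Bigr ) \rightarrow q \quad {\text { as }}\ k\rightarrow \infty . \end{aligned}$$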

We are now in a position to prove Theorem 1.2.

Proof of Theorem 1.2

We mainly follow the line of reasoning in the proof of [27, Lemma A.6]. Only a few additional considerations are necessary in order to accommodate unbounded perturbations V by means of Lemma 4.1. For the convenience of the reader, we nevertheless reproduce the whole argument here.

Define \(S,T :{{\,\mathrm{Ran}\,}}P_+ \rightarrow {{\,\mathrm{Ran}\,}}P_+\) by

$$\begin{aligned} S := P_+Q_-|_{{{\,\mathrm{Ran}\,}}P_+},\quad T := P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_+} = I_{{{\,\mathrm{Ran}\,}}P_+} - S. \end{aligned}$$

By hypothesis, we have \(\Vert S\Vert \le \Vert P_+Q_-\Vert < 1\), so that T is bijective. In light of Proposition 3.1 and Lemma 3.4, it now remains to show the inclusion \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+}) \supset {{\mathcal {D}}}_+\), that is, \({{\,\mathrm{Ran}\,}}(T^{-1}|_{{{\mathcal {D}}}_+})\subset {{\mathcal {D}}}_+\). To this end, we rewrite \(T^{-1}\) as a Neumann series,

$$\begin{aligned} T^{-1} = (I_{{{\,\mathrm{Ran}\,}}P_+} - S)^{-1} = \sum _{k=0}^\infty S^k. \end{aligned}$$

Clearly, S maps the domain \({{\mathcal {D}}}_+={{\,\mathrm{Dom}\,}}(A|_{{{\,\mathrm{Ran}\,}}P_+})\) into itself, so that the inclusion \({{\,\mathrm{Ran}\,}}(T^{-1}|_{{{\mathcal {D}}}_+})\subset {{\mathcal {D}}}_+\) holds if the above series converges also with respect to the graph norm for the closed operator \(\Lambda :=A|_{{{\,\mathrm{Ran}\,}}P_+}\). This, in turn, is the case if the corresponding spectral radius \(r_\Lambda (S)\) of S is smaller than 1.
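The step from \(r_\Lambda (S) < 1\) to the desired inclusion can be made explicit as follows. This is a sketch of the standard root test argument; the number \(\rho \) and the index \(k_0\) are auxiliary quantities introduced only for this illustration. If \(r_\Lambda (S)< \rho < 1\), then there is \(k_0\in \mathbb {N}\) with \(\Vert (S|_{{{\mathcal {D}}}_+})^k\Vert _\Lambda \le \rho ^k\) for all \(k\ge k_0\), and hence

$$\begin{aligned} \sum _{k=k_0}^\infty \Vert S^kx\Vert _\Lambda \le \sum _{k=k_0}^\infty \rho ^k\Vert x\Vert _\Lambda = \frac{\rho ^{k_0}}{1-\rho }\Vert x\Vert _\Lambda < \infty \quad {\text { for all }}\ x\in {{\mathcal {D}}}_+. \end{aligned}$$

Thus, the partial sums \(\sum _{k=0}^n S^kx\) form a Cauchy sequence in the Banach space \(({{\mathcal {D}}}_+,\Vert \,\cdot \,\Vert _\Lambda )\) and converge there to some \(y\in {{\mathcal {D}}}_+\). Since \(\Vert \,\cdot \,\Vert \le \Vert \,\cdot \,\Vert _\Lambda \), they converge to y also in \({{\,\mathrm{Ran}\,}}P_+\), so that \(y = T^{-1}x \in {{\mathcal {D}}}_+\).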

For \(x\in {{\mathcal {D}}}_+\subset {{\,\mathrm{Ran}\,}}P_+\) we compute

$$\begin{aligned} \Lambda Sx&= AP_+Q_-x = P_+(A+V)Q_-x - P_+VQ_-x\\&= P_+Q_-(A+V)x - P_+VQ_-x\\&= S\Lambda x + Kx \end{aligned}$$

with

$$\begin{aligned} K := (P_+Q_-V - P_+VQ_-)|_{{{\,\mathrm{Ran}\,}}P_+}. \end{aligned}$$

We show that the operator K is \(\Lambda \)-bounded with \(\Lambda \)-bound 0. Indeed, let \(b \in (0,1)\), and choose \(a\ge 0\) with \(\Vert Vx\Vert \le a\Vert x\Vert +b\Vert Ax\Vert \) for all \(x\in {{\,\mathrm{Dom}\,}}(A)\); recall that V is infinitesimal with respect to A by hypothesis. Then,

$$\begin{aligned} \Vert VQ_-x\Vert \le a\Vert Q_-x\Vert + b\Vert AQ_-x\Vert \le a\Vert x\Vert + b\Vert (A+V)x\Vert + b\Vert VQ_-x\Vert , \end{aligned}$$

so that

$$\begin{aligned} \Vert VQ_-x\Vert&\le \frac{a}{1-b}\Vert x\Vert + \frac{b}{1-b}\bigl (\Vert Ax\Vert +\Vert Vx\Vert \bigr )\\&\le \frac{a(1+b)}{1-b}\Vert x\Vert + \frac{b(1+b)}{1-b}\Vert Ax\Vert . \end{aligned}$$

Thus,

$$\begin{aligned} \begin{aligned} \Vert Kx\Vert&\le \Vert P_+Q_-\Vert \Vert Vx\Vert + \Vert VQ_-x\Vert \\&\le a\Bigl (\Vert P_+Q_-\Vert + \frac{1+b}{1-b}\Bigr )\Vert x\Vert + b\Bigl (\Vert P_+Q_-\Vert +\frac{1+b}{1-b}\Bigr )\Vert \Lambda x\Vert \end{aligned} \end{aligned}$$
(4.2)

for \(x\in {{\,\mathrm{Dom}\,}}(\Lambda )={{\mathcal {D}}}_+\). Since \(b>0\) can be chosen arbitrarily small, this implies that K is \(\Lambda \)-bounded with \(\Lambda \)-bound 0. It therefore follows from Lemma 4.1 that \(r_\Lambda (S)\le \Vert S\Vert <1\), which completes the proof. \(\square \)
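The two estimates for \(\Vert VQ_-x\Vert \) in the above proof rest on a standard absorption argument, which may be sketched as follows. The sketch assumes \(b < 1\), which is no restriction since b may be taken arbitrarily small, and uses that the spectral projection \(Q_-\) commutes with \(A+V\) and satisfies \(\Vert Q_-\Vert \le 1\). For \(x\in {{\mathcal {D}}}_+\) one has

$$\begin{aligned} AQ_-x&= (A+V)Q_-x - VQ_-x = Q_-(A+V)x - VQ_-x,\\ \Vert AQ_-x\Vert&\le \Vert (A+V)x\Vert + \Vert VQ_-x\Vert . \end{aligned}$$

Moving the term \(b\Vert VQ_-x\Vert \) to the left-hand side and using \(\Vert Vx\Vert \le a\Vert x\Vert + b\Vert Ax\Vert \) once more then gives

$$\begin{aligned} (1-b)\Vert VQ_-x\Vert&\le a\Vert x\Vert + b\bigl ( \Vert Ax\Vert + \Vert Vx\Vert \bigr ) \le a(1+b)\Vert x\Vert + b(1+b)\Vert Ax\Vert , \end{aligned}$$

and division by \(1-b > 0\) yields the stated bound.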

Remark 4.2

  1. (1)

    Estimate (4.2) suggests that also relatively bounded perturbations V that are not necessarily infinitesimal with respect to A can be considered here. Indeed, if \(b_*\in [0,1)\) is the A-bound of V, then by (4.2) and Lemma 4.1 we have

    $$\begin{aligned} r_\Lambda (S) \le \Vert P_+Q_-\Vert + b_*\Bigl (\Vert P_+Q_-\Vert +\frac{1+b_*}{1-b_*}\Bigr ), \end{aligned}$$

    and the right-hand side of the latter is smaller than 1 if and only if

    $$\begin{aligned} \Vert P_+Q_-\Vert < \frac{1-2b_*-b_*^2}{1-b_*^2}. \end{aligned}$$

    This is a reasonable condition on the norm \(\Vert P_+Q_-\Vert \) only for \(b_*<\sqrt{2}-1\).

  2. (2)

    A result similar to the one in (1) can be obtained in terms of the \((A+V)\)-bound of V: If for some \({\tilde{b}}\in [0,1)\) and \({\tilde{a}}\ge 0\) one has \(\Vert Vx\Vert \le {\tilde{a}}\Vert x\Vert + {\tilde{b}}\Vert (A+V)x\Vert \) for all \(x\in {{\,\mathrm{Dom}\,}}(A)={{\,\mathrm{Dom}\,}}(A+V)\), then standard arguments as in the above proof of Theorem 1.2 (see also [35, Lemma 2.1.6]) show that

    $$\begin{aligned} \Vert Vx\Vert \le \frac{{\tilde{a}}}{1-{\tilde{b}}}\Vert x\Vert + \frac{{\tilde{b}}}{1-{\tilde{b}}}\Vert Ax\Vert \end{aligned}$$

    and, in turn,

    $$\begin{aligned} \Vert VQ_-x\Vert&\le {\tilde{a}}\Vert x\Vert + {\tilde{b}}\Vert (A+V)x\Vert \le {\tilde{a}}\Vert x\Vert + {\tilde{b}}\Vert Ax\Vert + {\tilde{b}}\Vert Vx\Vert \\&\le {\tilde{a}}\Bigl ( 1 + \frac{{\tilde{b}}}{1-{\tilde{b}}} \Bigr )\Vert x\Vert + {\tilde{b}}\Bigl ( 1 + \frac{{\tilde{b}}}{1-{\tilde{b}}} \Bigr )\Vert Ax\Vert \\&= \frac{{\tilde{a}}}{1-{\tilde{b}}}\Vert x\Vert + \frac{{\tilde{b}}}{1-{\tilde{b}}}\Vert Ax\Vert \end{aligned}$$

    for all \(x\in {{\,\mathrm{Dom}\,}}(A)\). Plugging these into (4.2) gives

    $$\begin{aligned} \Vert Kx\Vert&\le \Vert P_+Q_-\Vert \Vert Vx\Vert + \Vert VQ_-x\Vert \\&\le (1+\Vert P_+Q_-\Vert )\Bigl ( \frac{{\tilde{a}}}{1-{\tilde{b}}}\Vert x\Vert + \frac{{\tilde{b}}}{1-{\tilde{b}}}\Vert \Lambda x\Vert \Bigr ) \end{aligned}$$

    for all \(x \in {{\,\mathrm{Dom}\,}}(\Lambda ) = {{\mathcal {D}}}_+\), which eventually leads to

    $$\begin{aligned} r_\Lambda (S) \le \Vert P_+Q_-\Vert + \frac{{\tilde{b}}(1+\Vert P_+Q_-\Vert )}{1-{\tilde{b}}} = \frac{\Vert P_+Q_-\Vert +{\tilde{b}}}{1-{\tilde{b}}}. \end{aligned}$$

    The right-hand side of the latter is smaller than 1 if and only if

    $$\begin{aligned} \Vert P_+Q_-\Vert < 1-2{\tilde{b}}, \end{aligned}$$

    which is a reasonable condition on \(\Vert P_+Q_-\Vert \) only for \({\tilde{b}} < 1/2\).
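The two threshold conditions in the preceding remark follow from elementary algebra, sketched here with the shorthand \(p := \Vert P_+Q_-\Vert \), which is used only in this illustration. For part (1), with \(b_*\in [0,1)\),

$$\begin{aligned} p + b_*\Bigl ( p + \frac{1+b_*}{1-b_*} \Bigr )< 1&\iff p(1+b_*)< 1 - \frac{b_*(1+b_*)}{1-b_*} = \frac{1-2b_*-b_*^2}{1-b_*}\\&\iff p < \frac{1-2b_*-b_*^2}{1-b_*^2}, \end{aligned}$$

and the numerator \(1-2b_*-b_*^2 = 2-(1+b_*)^2\) is positive if and only if \(b_* < \sqrt{2}-1\). For part (2), with \({\tilde{b}}\in [0,1)\),

$$\begin{aligned} \frac{p+{\tilde{b}}}{1-{\tilde{b}}}< 1 \iff p + {\tilde{b}}< 1 - {\tilde{b}} \iff p < 1-2{\tilde{b}}. \end{aligned}$$

For instance, for \(b_* = 1/4\) the first condition reads \(p < 7/15\), while for \({\tilde{b}} = 1/4\) the second one reads \(p < 1/2\).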

5 The Block Diagonalization Approach

In this section, we discuss an approach to verify the hypotheses of Proposition 3.1 and Lemma 3.4 which relies on techniques previously discussed in the context of block diagonalizations of operators and forms, for instance in [24] and [14], respectively; cf. also Remark 5.4 below.

Recall that for the two orthogonal projections \(P_+\) and \(Q_+\) from Hypothesis 1.1 the inequality \(\Vert P_+-Q_+\Vert <1\) holds if and only if \({{\,\mathrm{Ran}\,}}Q_+\) can be represented as

$$\begin{aligned} {{\,\mathrm{Ran}\,}}Q_+ = \{ f \oplus Xf \mid f\in {{\,\mathrm{Ran}\,}}P_+ \} \end{aligned}$$
(5.1)

with some bounded linear operator \(X:{{\,\mathrm{Ran}\,}}P_+\rightarrow {{\,\mathrm{Ran}\,}}P_-\); in this case, one has

$$\begin{aligned} \Vert P_+-Q_+\Vert = \frac{\Vert X\Vert }{\sqrt{1+\Vert X\Vert {}^2}}, \end{aligned}$$
(5.2)

see, e.g., [18, Corollary 3.4 (i)]. The orthogonal projection \(Q_+\) can then be represented as the \(2\times 2\) block operator matrices

$$\begin{aligned} \begin{aligned} Q_+&= \begin{pmatrix} (I_{{{\,\mathrm{Ran}\,}}P_+}+X^*X)^{-1} &{} (I_{{{\,\mathrm{Ran}\,}}P_+}+X^*X)^{-1}X^*\\ X(I_{{{\,\mathrm{Ran}\,}}P_+}+X^*X)^{-1} &{} X(I_{{{\,\mathrm{Ran}\,}}P_+}+X^*X)^{-1}X^* \end{pmatrix}\\&= \begin{pmatrix} (I_{{{\,\mathrm{Ran}\,}}P_+}+X^*X)^{-1} &{} X^*(I_{{{\,\mathrm{Ran}\,}}P_-}+XX^*)^{-1}\\ (I_{{{\,\mathrm{Ran}\,}}P_-}+XX^*)^{-1}X &{} XX^*(I_{{{\,\mathrm{Ran}\,}}P_-}+XX^*)^{-1} \end{pmatrix} \end{aligned} \end{aligned}$$
(5.3)

with respect to \({{\,\mathrm{Ran}\,}}P_+\oplus {{\,\mathrm{Ran}\,}}P_-\), see, e.g., [18, Remark 3.6]. In particular, we have

$$\begin{aligned} P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_+} = (I_{{{\,\mathrm{Ran}\,}}P_+} + X^*X)^{-1}, \end{aligned}$$
(5.4)

which is in fact the starting point for the current approach: With regard to the desired relations \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+}) \supset {{\mathcal {D}}}_+\) and \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathfrak {D}}}_+}) \supset {{\mathfrak {D}}}_+\), we need to establish that the operator \(I_{{{\,\mathrm{Ran}\,}}P_+} + X^*X\) maps \({{\mathcal {D}}}_+\) and \({{\mathfrak {D}}}_+\) into \({{\mathcal {D}}}_+\) and \({{\mathfrak {D}}}_+\), respectively.
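The relations (5.1)–(5.4) can be illustrated by a two-dimensional toy example. It is not taken from the setting above but merely assumes \({{\mathcal {H}}}= \mathbb {C}^2\), with \({{\,\mathrm{Ran}\,}}P_+\) spanned by (1, 0), \({{\,\mathrm{Ran}\,}}P_-\) spanned by (0, 1), and \({{\,\mathrm{Ran}\,}}Q_+\) spanned by \((\cos \theta ,\sin \theta )\) for some angle \(\theta \in (-\pi /2,\pi /2)\); the angle \(\theta \) is a parameter introduced only for this illustration. In this case, X acts as multiplication by \(\tan \theta \), and

$$\begin{aligned} P_+ = \begin{pmatrix} 1 & 0\\ 0 & 0 \end{pmatrix},\quad Q_+ = \begin{pmatrix} \cos ^2\theta & \cos \theta \sin \theta \\ \cos \theta \sin \theta & \sin ^2\theta \end{pmatrix},\quad (I_{{{\,\mathrm{Ran}\,}}P_+} + X^*X)^{-1} = \frac{1}{1+\tan ^2\theta } = \cos ^2\theta , \end{aligned}$$

so that \(P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_+}\) is multiplication by \(\cos ^2\theta \ne 0\), in accordance with (5.4). Moreover, \(P_+-Q_+\) is symmetric with trace 0 and determinant \(-\sin ^2\theta \), so its eigenvalues are \(\pm \sin \theta \) and

$$\begin{aligned} \Vert P_+-Q_+\Vert = |\sin \theta | = \frac{|\tan \theta |}{\sqrt{1+\tan ^2\theta }} = \frac{\Vert X\Vert }{\sqrt{1+\Vert X\Vert {}^2}}, \end{aligned}$$

in accordance with (5.2).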

Define the skew-symmetric operator Y via the \(2\times 2\) block operator matrix

$$\begin{aligned} Y = \begin{pmatrix} 0 &{} -X^*\\ X &{} 0 \end{pmatrix} \end{aligned}$$
(5.5)

with respect to \({{\,\mathrm{Ran}\,}}P_+\oplus {{\,\mathrm{Ran}\,}}P_-\). Then, the operators \(I\pm Y\) are bijective with

$$\begin{aligned} (I-Y)(I+Y) = \begin{pmatrix} I_{{{\,\mathrm{Ran}\,}}P_+}+X^*X &{} 0\\ 0 &{} I_{{{\,\mathrm{Ran}\,}}P_-}+XX^* \end{pmatrix}. \end{aligned}$$
(5.6)
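Both claims can be verified by a direct computation, sketched here using only that X is bounded and that \(Y^* = -Y\). First,

$$\begin{aligned} Y^2 = \begin{pmatrix} 0 & -X^*\\ X & 0 \end{pmatrix} \begin{pmatrix} 0 & -X^*\\ X & 0 \end{pmatrix} = \begin{pmatrix} -X^*X & 0\\ 0 & -XX^* \end{pmatrix}, \end{aligned}$$

so that \((I-Y)(I+Y) = I - Y^2\) is the block diagonal operator in (5.6). Second, \({{\,\mathrm{Re}\,}}\langle x, Yx \rangle = 0\) for all x since Y is skew-symmetric, and therefore

$$\begin{aligned} \Vert (I\pm Y)x\Vert ^2 = \Vert x\Vert ^2 \pm 2{{\,\mathrm{Re}\,}}\langle x, Yx \rangle + \Vert Yx\Vert ^2 = \Vert x\Vert ^2 + \Vert Yx\Vert ^2 \ge \Vert x\Vert ^2. \end{aligned}$$

Hence, \(I\pm Y\) are injective with closed range. Since the adjoints \((I\pm Y)^* = I\mp Y\) are injective as well, the ranges are also dense, which gives bijectivity.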

The following lemma is extracted from various sources. We comment on this afterwards in Remark 5.2 below.

Lemma 5.1

Suppose that the projections \(P_+\) and \(Q_+\) from Hypothesis 1.1 satisfy \(\Vert P_+ - Q_+\Vert < 1\), and let the operators X and Y be as in (5.1) and (5.5), respectively. Moreover, let \({{\mathcal {C}}}\) be an invariant subspace for both \(P_+\) and \(Q_+\) such that \({{\mathcal {C}}}= ({{\mathcal {C}}}\cap {{\,\mathrm{Ran}\,}}P_+) \oplus ({{\mathcal {C}}}\cap {{\,\mathrm{Ran}\,}}P_-) =: {{\mathcal {C}}}_+ \oplus {{\mathcal {C}}}_-\).

Then, the following are equivalent:

  1. (i)

    \(I_{{{\,\mathrm{Ran}\,}}P_+} + X^*X\) maps \({{\mathcal {C}}}_+\) into itself;

  2. (ii)

    \(I_{{{\,\mathrm{Ran}\,}}P_-} + XX^*\) maps \({{\mathcal {C}}}_-\) into itself;

  3. (iii)

    Y maps \({{\mathcal {C}}}\) into itself;

  4. (iv)

    \((I+Y)\) maps \({{\mathcal {C}}}\) into itself;

  5. (v)

    \((I-Y)\) maps \({{\mathcal {C}}}\) into itself.

Proof

Clearly, the hypotheses imply that \(P_+Q_+\) maps \({{\mathcal {C}}}\) into \({{\mathcal {C}}}_+\) and \(P_-Q_+\) maps \({{\mathcal {C}}}\) into \({{\mathcal {C}}}_-\).

(i)\(\Rightarrow \)(ii). Let \(g \in {{\mathcal {C}}}_-\). Using the first representation in (5.3), we then have the identity \((I_{{{\,\mathrm{Ran}\,}}P_+} + X^*X)^{-1}X^*g = (P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_-})g \in {{\mathcal {C}}}_+\). Hence, \(X^*g \in {{\mathcal {C}}}_+\) by (i) and, in turn, \(h := (I_{{{\,\mathrm{Ran}\,}}P_+} + X^*X)X^*g \in {{\mathcal {C}}}_+\). Using again (5.3), this yields

$$\begin{aligned} (I_{{{\,\mathrm{Ran}\,}}P_-} + XX^*)g&= g + XX^*g = g + X(I_{{{\,\mathrm{Ran}\,}}P_+}+X^*X)^{-1}h\\&= g + (P_-Q_+|_{{{\,\mathrm{Ran}\,}}P_+})h \in {{\mathcal {C}}}_-. \end{aligned}$$

As a byproduct, we have also shown that \(X^*\) maps \({{\mathcal {C}}}_-\) into \({{\mathcal {C}}}_+\).

(ii)\(\Rightarrow \)(i). Using the identities \((I_{{{\,\mathrm{Ran}\,}}P_-} + XX^*)^{-1}X = P_-Q_+|_{{{\,\mathrm{Ran}\,}}P_+}\) and \(X^*(I_{{{\,\mathrm{Ran}\,}}P_-}+XX^*)^{-1} = P_+Q_+|_{{{\,\mathrm{Ran}\,}}P_-}\) taken from the second representation in (5.3), the proof is completely analogous to the implication (i)\(\Rightarrow \)(ii). In particular, we likewise obtain as a byproduct that X maps \({{\mathcal {C}}}_+\) into \({{\mathcal {C}}}_-\).

(i),(ii)\(\Rightarrow \)(iii). We have already seen that X maps \({{\mathcal {C}}}_+\) into \({{\mathcal {C}}}_-\) and that \(X^*\) maps \({{\mathcal {C}}}_-\) into \({{\mathcal {C}}}_+\). Taking into account that \({{\mathcal {C}}}= {{\mathcal {C}}}_+ \oplus {{\mathcal {C}}}_-\), this means that Y maps \({{\mathcal {C}}}\) into itself.

(iii)\(\Leftrightarrow \)(iv),(v). This is clear.

(iv),(v)\(\Rightarrow \)(i),(ii). This follows immediately from identity (5.6). \(\square \)

Remark 5.2

The proof of the equivalence (i)\(\Leftrightarrow \)(ii) and the one of the implication (i),(ii)\(\Rightarrow \)(iii) in Lemma 5.1 are extracted from the proof of [14, Theorem 5.1]; see also [29, Theorem 6.3.1 and Lemma 6.3.3].

The equivalence (iv)\(\Leftrightarrow \)(v) can alternatively be directly obtained from the identity

$$\begin{aligned} \begin{pmatrix} I_{{{\,\mathrm{Ran}\,}}P_+} &{} 0\\ 0 &{} -I_{{{\,\mathrm{Ran}\,}}P_-} \end{pmatrix} (I + Y) \begin{pmatrix} I_{{{\,\mathrm{Ran}\,}}P_+} &{} 0\\ 0 &{} -I_{{{\,\mathrm{Ran}\,}}P_-} \end{pmatrix} = I - Y. \end{aligned}$$

Such an argument has been used in the proof of [24, Proposition 3.3].

The implication (iv),(v)\(\Rightarrow \)(i) can essentially be found in the proof of [14, Theorem 5.1] and [29, Remark 6.3.2].

Below, we apply Lemma 5.1 with \({{\mathcal {C}}}= {{\,\mathrm{Dom}\,}}(A) = {{\,\mathrm{Dom}\,}}(B) = {{\mathcal {D}}}_+ \oplus {{\mathcal {D}}}_-\) or \({{\mathcal {C}}}= {{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2}) = {{\mathfrak {D}}}_+ \oplus {{\mathfrak {D}}}_-\), depending on the situation. The easiest case is encountered in Theorem 1.3, where we are in the semibounded setting:

Proof of Theorem 1.3

Let \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\) and \({{\mathfrak {b}}}[x, x] \le 0\) for all \(x \in {{\mathfrak {D}}}_-\). We then have \({{\mathfrak {D}}}_- = {{\,\mathrm{Ran}\,}}P_-\) if A is bounded from below and \({{\mathfrak {D}}}_+ = {{\,\mathrm{Ran}\,}}P_+\) if A is bounded from above. Hence, item (ii) or (i), respectively, in Lemma 5.1 with \({{\mathcal {C}}}= {{\mathfrak {D}}}_+ \oplus {{\mathfrak {D}}}_-\) is automatically satisfied. In any case, we have by Lemma 5.1 that \(I_{{{\,\mathrm{Ran}\,}}P_+} + X^*X\) maps \({{\mathfrak {D}}}_+\) into \({{\mathfrak {D}}}_+\), which by identity (5.4) means that \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathfrak {D}}}_+}) \supset {{\mathfrak {D}}}_+\). The representation (1.3) now follows from Proposition 3.1 (a) and Remark 3.5 (1). If even \({{\,\mathrm{Dom}\,}}(A) = {{\,\mathrm{Dom}\,}}(B)\) and \(\langle x, Bx \rangle \le 0\) for all \(x \in {{\mathcal {D}}}_-\), we use the same reasoning as above with \({{\mathfrak {D}}}_+\) and \({{\mathfrak {D}}}_-\) replaced by \({{\mathcal {D}}}_+\) and \({{\mathcal {D}}}_-\), respectively, and obtain representation (1.4) from Proposition 3.1 (b) and Remark 3.5 (1). The representation (1.3) is then still valid by Lemma 3.4 and the first part of the proof. \(\square \)

While certain conditions for Proposition 3.1 and Lemma 3.4 are part of the hypotheses of Theorems 1.2 and 1.3, in the situations of Theorems 1.4 and 1.5 these need to be verified explicitly from the specific hypotheses at hand. Here, we rely on previous considerations on block diagonalizations for block operator matrices and forms. In case of Theorem 1.4, the crucial ingredient is presented in the following result, extracted from [24]. An earlier result in this direction is commented on in Remark 5.4 (2) below.

Proposition 5.3

(see [24, Theorem 6.1]) In the situation of Theorem 1.4 one has the inequality \(\Vert P_+-Q_+\Vert \le \sqrt{2}/2 <1\), and the operator identity

$$\begin{aligned} (I-Y)(A+V)(I-Y)^{-1} = A-YV \end{aligned}$$
(5.7)

holds with Y as in (5.5).

Proof

Set \(V_0 := V|_{{{\,\mathrm{Dom}\,}}(A)}\), so that we have \(B = A+V = A+V_0\) as well as \(A-YV = A-YV_0\). Clearly, the hypotheses on V ensure that \(V_0\) is A-bounded with A-bound \(b_*<1\) and off-diagonal with respect to the decomposition \({{\,\mathrm{Ran}\,}}P_+\oplus {{\,\mathrm{Ran}\,}}P_-\). By [24, Lemma 6.3] we now have

$$\begin{aligned} {{\,\mathrm{Ker}\,}}(A+V_0) \subset {{\,\mathrm{Ker}\,}}A \subset {{\,\mathrm{Ran}\,}}P_-. \end{aligned}$$

In light of (5.2), the claim therefore is just an instance of [24, Theorem 6.1]. \(\square \)

Remark 5.4

(1) Let \(A_\pm :=A|_{{{\,\mathrm{Ran}\,}}P_\pm }\) be the parts of A associated with the subspaces \({{\,\mathrm{Ran}\,}}P_\pm \), and write

$$\begin{aligned} V|_{{{\,\mathrm{Dom}\,}}(A)} = \begin{pmatrix} 0 &{} W\\ W^* &{} 0 \end{pmatrix} , \end{aligned}$$

where \(W:{{\,\mathrm{Ran}\,}}P_-\supset {{\mathcal {D}}}_-\rightarrow {{\,\mathrm{Ran}\,}}P_+\) is given by \(Wx:=P_+Vx\), \(x\in {{\mathcal {D}}}_-\). Then,

$$\begin{aligned} A - YV = \begin{pmatrix} A_+ + X^*W^* &{} 0\\ 0 &{} A_- - XW \end{pmatrix} . \end{aligned}$$

In this sense, identity (5.7) can be viewed as a block diagonalization of the operator \(A+V\). For a more detailed discussion of block diagonalizations and operator Riccati equations in the operator setting, the reader is referred to [24] and the references cited therein.

(2) In the particular case where 0 belongs to the resolvent set of A, the conclusion of Proposition 5.3 can be inferred also from [35, Theorems 2.7.21 and 2.8.5].

Proof of Theorem 1.4

For \(x \in {{\mathcal {D}}}_-\), we have

$$\begin{aligned} \langle x, Vx \rangle = \langle P_-x, VP_-x \rangle = \langle x, P_-VP_-x \rangle = 0 \end{aligned}$$

and, thus,

$$\begin{aligned} \langle x, (A+V)x \rangle = \langle x, Ax \rangle \le 0. \end{aligned}$$

Moreover, by Proposition 5.3 the inequality \(\Vert P_+-Q_+\Vert <1\) is satisfied. Let Y be as in (5.5). Since \({{\,\mathrm{Dom}\,}}(A+V)={{\,\mathrm{Dom}\,}}(A)={{\,\mathrm{Dom}\,}}(A-YV)\), it then follows from identity (5.7) that \(I-Y\) maps \({{\mathcal {C}}}:= {{\,\mathrm{Dom}\,}}(A) = {{\mathcal {D}}}_+ \oplus {{\mathcal {D}}}_-\) into itself. In turn, Lemma 5.1 implies that \(I_{{{\,\mathrm{Ran}\,}}P_+} + X^*X\) maps \({{\mathcal {D}}}_+\) into itself, which by identity (5.4) means that \({{\,\mathrm{Ran}\,}}(P_+Q_+|_{{{\mathcal {D}}}_+}) \supset {{\mathcal {D}}}_+\). The claim now follows from Proposition 3.1, Lemma 3.4, and Remark 3.5 (1). \(\square \)

To the best of the author’s knowledge, no direct analogue of Proposition 5.3 is known so far in the setting of form rather than operator perturbations. Although the inequality \(\Vert P_+-Q_+\Vert \le \sqrt{2}/2\) can be established here as well under fairly reasonable assumptions, see [14, Theorem 3.3], the mapping properties of the operators \(I \pm Y\) connected with a corresponding diagonalization related to (5.7) are much harder to verify. The situation is even more subtle there since also the domain equality \({{\,\mathrm{Dom}\,}}(|A|^{1/2}) = {{\,\mathrm{Dom}\,}}(|B|^{1/2})\) needs careful treatment. The latter is conjectured to hold in a general off-diagonal form perturbation framework [13, Remark 2.7]. Some characterizations have been discussed in [30, Theorem 3.8], but they all are hard to verify in a general abstract setting. A compromise in this direction is to require that the form \({{\mathfrak {b}}}\) is semibounded, see [30, Lemma 3.9] and [14, Lemma 2.7], which forces the diagonal form \({{\mathfrak {a}}}\) to be semibounded as well, see below. As in the situation of Theorem 1.3 above, this simplifies matters immensely:

Proof of Theorem 1.5

For \(x \in {{\mathfrak {D}}}_- = {{\,\mathrm{Ran}\,}}P_- \cap {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\) we have

$$\begin{aligned} {{\mathfrak {v}}}[ P_-x, P_-x ] = 0 \end{aligned}$$

and, thus,

$$\begin{aligned} {{\mathfrak {b}}}[ x, x ] = {{\mathfrak {a}}}[ x, x ] \le 0. \end{aligned}$$

In the same way, we see that \({{\mathfrak {b}}}[ x, x ] = {{\mathfrak {a}}}[ x, x ]\) for \(x \in {{\mathfrak {D}}}_+\), which by the identity \({{\mathfrak {a}}}[ x, x ] = {{\mathfrak {a}}}[ P_+x, P_+x ] + {{\mathfrak {a}}}[ P_-x, P_-x ]\) for all \(x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\) implies that along with \({{\mathfrak {b}}}\) also the form \({{\mathfrak {a}}}\) is semibounded; cf. the proof of [14, Lemma 2.7]. Since also \({{\,\mathrm{Dom}\,}}(|B|^{1/2}) = {{\,\mathrm{Dom}\,}}[{{\mathfrak {b}}}] = {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}] = {{\,\mathrm{Dom}\,}}(|A|^{1/2})\) by hypothesis and in view of Theorem 1.3, it only remains to show that \(\Vert P_+ - Q_+ \Vert < 1\).

To this end, we first show that \({{\mathfrak {b}}}\) is a semibounded saddle-point form in the sense of [14, Section 2]: Let \(m \in \mathbb {R}\) be the lower (resp. upper) bound of \({{\mathfrak {a}}}\). We then have

$$\begin{aligned} |({{\mathfrak {a}}}-m)[ x, x ]| = \Vert |A-m|^{1/2}x \Vert ^2 \le \Vert |A-m|^{1/2}(|A|^{1/2}+I)^{-1} \Vert ^2 \Vert (|A|^{1/2}+I)x \Vert ^2 \end{aligned}$$

for all \(x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]\), where \(|A-m|^{1/2}(|A|^{1/2}+I)^{-1}\) is closed and everywhere defined, hence bounded by the closed graph theorem. From this and the hypothesis on \({{\mathfrak {v}}}\) we see that

$$\begin{aligned} | {{\mathfrak {v}}}[ x, x ] | \le \beta \bigl ( \Vert |A|^{1/2}x\Vert ^2 + \Vert x\Vert ^2 \bigr ) \quad {\text { for all }}\ x \in {{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}] \end{aligned}$$

with some \(\beta \ge 0\), which means that \({{\mathfrak {b}}}= {{\mathfrak {a}}}+ {{\mathfrak {v}}}= {{\mathfrak {a}}}+ {{\mathfrak {v}}}_0\) with \({{\mathfrak {v}}}_0 := {{\mathfrak {v}}}|_{{{\,\mathrm{Dom}\,}}[{{\mathfrak {a}}}]}\) is indeed a semibounded saddle-point form.

It now follows from [30, Theorem 2.13] that

$$\begin{aligned} {{\,\mathrm{Ker}\,}}B \subset {{\,\mathrm{Ker}\,}}A \subset {{\,\mathrm{Ran}\,}}P_-. \end{aligned}$$

In turn, [14, Theorem 3.3] and (5.2) give \(\Vert P_+ - Q_+ \Vert \le \sqrt{2}/2 < 1\), which completes the proof. \(\square \)