1 Introduction

Dirac operators have attracted considerable attention in recent years, in particular in the context of non-self-adjoint spectral theory [4, 5, 6, 9, 12, 24, 25], nonlinear Schrödinger equations, e.g. [3, 21], or as an effective model for graphene [1, 7, 15, 20]. In this paper we analyze eigenvalues emerging from the thresholds of the essential spectrum of the one-dimensional Dirac operator in \(L^2({\mathbb {R}})\) perturbed by a general matrix-valued and non-symmetric potential V preserving the essential spectrum.

Our main results include the existence and asymptotics of weakly coupled eigenvalues, i.e. in the asymptotic regime with \(\varepsilon V\) and \(\varepsilon \rightarrow 0\), for the one-dimensional Dirac operator (Theorem 2.2) and Lieb–Thirring type inequalities (Theorem 2.4) in the massive as well as the massless case. These results complement the eigenvalue estimates in [6] and also show that the latter are optimal in the weak coupling regime, see Remark 2.3.

As physical applications, we investigate the damped wave equation in \(L^2({\mathbb {R}})\) and a two-dimensional model of charge carriers in graphene nanoribbons (or waveguides) with so-called armchair boundary conditions. We emphasize here the inherent non-self-adjoint nature of the former caused by the presence of damping. Moreover, our eigenvalue estimates may be converted to resonance estimates via the well-known method of complex scaling, as in [6].

The application to the damped wave equation (Theorem 3.1) demonstrates a natural effect from the physical point of view: The integrable damping \(\varepsilon a_1(x)\) cannot affect the essential spectrum; however, for any \(\varepsilon >0\), it gives rise to a pair of complex conjugate eigenvalues having the tendency to meet at the real axis. The interpretation of the results for the graphene armchair waveguides is more complicated due to the \(4\times 4\) matrix structure and the PDE nature of the problem. Nonetheless, in the simplest setting of a diagonal potential that is constant in the transverse direction, the quantities entering the eigenvalue asymptotics are expressed in terms of the integral of the trace of V(x) only, see Example 3.7 and Theorem 3.6.

The main ingredient in the proofs is the analysis of a Birman–Schwinger operator. Since the problem is not self-adjoint, the existence of eigenvalues in the gap of the essential spectrum does not follow from min-max considerations. Nonetheless, the weak coupling technique, relying on the isolation of a singular part L of the Birman–Schwinger operator, admits a generalization to the non-self-adjoint setting. This is possible since L is of finite rank and so the question of existence and asymptotics of eigenvalues is converted to a matrix problem, which is analyzed with the help of Rouché’s theorem eventually. The proofs of the Lieb–Thirring type bounds are also based on complex analysis techniques, this time on a generalization of Jensen’s identity due to [2]. Inequalities of this type were established in [10] for one- and multidimensional Dirac operators, and improved results in the multidimensional case recently appeared in [5]. The difference of our new estimates compared to the one-dimensional results in [10] is that the weights in the eigenvalue sums are better, which leads to tighter upper bounds for the number of eigenvalues in certain subsets of the complex plane. The price to pay for this improvement is that the eigenvalue sums cannot be controlled by a single \(L^p\) norm, but only by a combination of two such norms. This phenomenon was already encountered in [5], and the reason for it is a lack of decay of the free resolvent as the spectral parameter tends to infinity.

To avoid technicalities related to domain questions, we intentionally require that V is both integrable and square-integrable throughout the entire paper. The technique allowing the omission of the \(L^2\) assumption is described in [6, Sec. 6]. In the waveguide case, the \(L^2\) assumption is also convenient, though not essential, when estimating the infinite sums arising from the decomposition of the resolvent, see Remark 3.3. To simplify the presentation of the weak coupling eigenvalue asymptotics (2.14), (2.15) and (3.15), (3.16), we also do not strive for higher-order terms in the expansion, although these could in principle be obtained in a similar way as in the Schrödinger case, see e.g. [22, 26]. We have now also all needed ingredients in hand to prove an analogue of the Lieb–Thirring inequalities in Sect. 2.2 for the graphene waveguides. However, the conformal map would be much more involved and we do not pursue this direction.

The paper is organized as follows. In Sect. 2 we briefly recall the relevant results of [6] for the one-dimensional non-self-adjoint Dirac operator and establish the weak coupling eigenvalue asymptotics (Sect. 2.1) and the Lieb–Thirring inequalities (Sect. 2.2). In Sect. 3, we apply these results to the one-dimensional damped wave equation (Sect. 3.1) and graphene waveguides (Sect. 3.2).

2 One-dimensional Dirac operator

The spectrum of the free operator H with \(m \ge 0\) in \(L^2({\mathbb {R}};{\mathbb {C}}^2)\)

$$\begin{aligned} H = - {i}\partial _x \sigma _1 + m \sigma _3= \begin{pmatrix} m &{}\quad - {i}\partial _x \\ - {i}\partial _x &{}\quad -m \end{pmatrix} \end{aligned}$$
(2.1)

reads

$$\begin{aligned} \sigma (H) = \sigma _\mathrm{e3}(H) = (-\infty , -m] \cup [m, \infty ); \end{aligned}$$

here and in the sequel, we use the essential spectrum \(\sigma _\mathrm{e3}\) defined as

$$\begin{aligned} \sigma _\mathrm {e3}(T):=\{ z \in {\mathbb {C}}\,: \, T-z \text { is not Fredholm}\}, \end{aligned}$$

see e.g. [11, Sec. IX] for details. The point spectrum is denoted by

$$\begin{aligned} \sigma _{\mathrm{p}}(T) := \{ z \in {\mathbb {C}}\, : \, {{\text {Ker}}}(T-z) \ne \{0\} \}, \end{aligned}$$

the Pauli matrices read

$$\begin{aligned} \sigma _1=\begin{pmatrix} 0&{}\quad 1 \\ 1&{}\quad 0 \end{pmatrix}, \quad \sigma _2=\begin{pmatrix} 0&{} \quad -{i}\\ {i}&{} \quad 0 \end{pmatrix}, \quad \sigma _3=\begin{pmatrix} 1&{}\quad 0\\ 0&{}\quad -1 \end{pmatrix} \end{aligned}$$

and the \(n\times n\) identity matrix is denoted by \(I_n\).

As a perturbation, we consider a (possibly complex and non-symmetric) matrix potential

$$\begin{aligned} V: {\mathbb {R}}\rightarrow {\mathbb {C}}^{2 \times 2}, \quad \Vert V\Vert \in L^1({\mathbb {R}}) \cap L^2({\mathbb {R}}), \end{aligned}$$
(2.2)

where \(\Vert V(x)\Vert \) is the operator norm in \({\mathbb {C}}^2\) of the matrix V(x).

Our goal is to analyze the spectrum of \(H+ V\). Note that the \(L^2\) condition in (2.2) is imposed for technical reasons only and is not strictly necessary for the results of this section. It offers the advantage that \(H+V\) may be defined as an operator sum because V is relatively H-compact and hence infinitesimally H-bounded. In addition, the relative compactness implies that the essential spectrum is stable, i.e. \(\sigma _\mathrm{e3}(H+V)=\sigma _\mathrm{e3}(H)\). If only the \(L^1\) condition is assumed in (2.2), the perturbed operator can be defined by means of a resolvent formula; we refer to [6] for the details.

For any \(p\in [1,\infty )\) we set

$$\begin{aligned} \Vert V\Vert _p^p:= \int _{{\mathbb {R}}} \Vert V(x)\Vert ^p \; {\mathrm{d}}x. \end{aligned}$$

It is proved in [6, Thm. 2.1] that if \(\Vert V\Vert _1 <1\), then all non-embedded eigenvalues of \(H+V\) satisfy

$$\begin{aligned} \sigma _\mathrm{p}(H+V) {\setminus } \sigma _\mathrm{e3}(H) \subset \overline{B}_{mr_0}(m x_0) \, \dot{\cup } \, \overline{B}_{mr_0}(-m x_0) \end{aligned}$$
(2.3)

where

$$\begin{aligned} x_0 := \left( \frac{\Vert V\Vert _1^4 - 2 \Vert V\Vert _1^2 +2}{4(1-\Vert V\Vert _1^2)} + \frac{1}{2}\right) ^\frac{1}{2}, \quad r_0 := \left( \frac{\Vert V\Vert _1^4 - 2 \Vert V\Vert _1^2 +2}{4(1-\Vert V\Vert _1^2)} - \frac{1}{2}\right) ^\frac{1}{2}. \end{aligned}$$
(2.4)
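As a quick numerical illustration of (2.3) and (2.4), the following Python sketch evaluates \(x_0\) and \(r_0\) for a given value of \(\Vert V\Vert _1\). It also compares \(r_0\) with the algebraically equivalent form \(\Vert V\Vert _1^2/(2\sqrt{1-\Vert V\Vert _1^2})\), which follows by simplifying (2.4). The function name `disk_parameters` and the sample values are our own illustrative choices and do not come from [6].

```python
import math

def disk_parameters(v1: float) -> tuple[float, float]:
    """Return (x0, r0) from (2.4) for v1 = ||V||_1 in [0, 1)."""
    if not 0.0 <= v1 < 1.0:
        raise ValueError("(2.4) requires 0 <= ||V||_1 < 1")
    q = v1 * v1
    a = (q * q - 2.0 * q + 2.0) / (4.0 * (1.0 - q))
    # a - 1/2 equals q^2 / (4 (1 - q)) >= 0; the max() only guards against rounding.
    return math.sqrt(a + 0.5), math.sqrt(max(a - 0.5, 0.0))

# Sample value (an assumption for the demo).
x0, r0 = disk_parameters(0.5)
r0_closed_form = 0.25 / (2.0 * math.sqrt(1.0 - 0.25))

# Small-norm behaviour: the right edge of the disk around m*x0 sits at
# m*(x0 + r0), and x0 + r0 - 1 is close to ||V||_1^2 / 2.
x0_small, r0_small = disk_parameters(1e-2)
edge_ratio = (x0_small + r0_small - 1.0) / (0.5 * 1e-4)
```

In this form one also sees that \(x_0^2-r_0^2=1\), and that for small \(\Vert V\Vert _1\) the disks reach a distance of roughly \(\frac{m}{2}\Vert V\Vert _1^2\) from the thresholds \(\pm m\), consistent with the weak coupling form of the estimate quoted in Remark 2.3.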

2.1 Weakly coupled eigenvalues

Here we analyze the point spectrum of \(H+\varepsilon V\) as \(\varepsilon \rightarrow 0+\), i.e. the weak coupling regime. In the self-adjoint setting, a straightforward construction of test functions together with a min-max argument applied to \((H+V)^2\) shows that if the matrix

$$\begin{aligned} \int _{{\mathbb {R}}} \left( V(x)^2 + m \{\sigma _3, V(x)\} \right) \; {\mathrm{d}}x, \end{aligned}$$
(2.5)

where the brackets \(\{\cdot , \cdot \}\) denote the anticommutator, has a negative eigenvalue, then \(H + V\) has an eigenvalue in \((-m,m)\). In the weak coupling regime, (2.5) can be translated (by ignoring the term of \(\mathcal {O}(\varepsilon ^2)\)) to \(\int _{{\mathbb {R}}} V_{11} <0\) or \(\int _{{\mathbb {R}}} V_{22} > 0\). In Theorem 2.2 below, we prove that the intuition obtained from this simple self-adjoint argument is indeed correct.

The free resolvent \((H-z)^{-1}\), \(z \in \rho (H)\), is an integral operator with the kernel (see [6] for details)

$$\begin{aligned} \begin{aligned}&\mathcal {R}(x,y;z) = \mathcal {N}(x,y;z) e^{{i}k(z)|x-y|}, \quad \\&\mathcal {N}(x,y;z) := \frac{{i}}{2} \begin{pmatrix} \zeta (z) &{}\quad {\text {sgn}}(x-y) \\ {\text {sgn}}(x-y) &{}\quad \zeta (z)^{-1} \end{pmatrix} \end{aligned} \end{aligned}$$
(2.6)

where

$$\begin{aligned} \zeta (z) := \frac{z+m}{k(z)}, \quad k(z) := (z^2 - m^2)^\frac{1}{2} \end{aligned}$$
(2.7)

and the square root on \({\mathbb {C}}{\setminus } [0,\infty )\) is chosen such that \({\text {Im}}k(z)>0\).
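For numerical experiments, the branch convention in (2.7) has to be imposed by hand. A minimal sketch, assuming the standard principal branch of `cmath.sqrt` and flipping the sign whenever the imaginary part comes out negative, could look as follows. The helper names `k_of_z` and `zeta_of_z` are ours.

```python
import cmath

def k_of_z(z: complex, m: float) -> complex:
    """Square root of z^2 - m^2 on the branch with Im k >= 0."""
    k = cmath.sqrt(z * z - m * m)
    return -k if k.imag < 0 else k

def zeta_of_z(z: complex, m: float) -> complex:
    """zeta(z) = (z + m) / k(z) as in (2.7); z must differ from +m and -m."""
    return (z + m) / k_of_z(z, m)

# Sample points (assumptions for the demo): one in the gap, one in each half-plane.
m = 1.0
samples = [0.3, 2.0 + 1.0j, 2.0 - 1.0j]
zetas = [zeta_of_z(z, m) for z in samples]
```

At these sample points with \(m>0\) the computed values of \(\zeta \) have negative imaginary part, and \(\zeta ^2=(z+m)/(z-m)\) holds up to rounding. The latter identity is convenient when translating statements about \(\zeta \) back to z.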

A natural technique is the Birman–Schwinger principle, derived for the Dirac operator e.g. in [6, Thm. 6.1]. The Birman–Schwinger operator Q(z) is an integral operator with the kernel

$$\begin{aligned} \mathcal {Q}(x,y;z) = \mathcal {A}(x) \mathcal {N}(x,y;z) e^{{i}k(z)|x-y|} \mathcal {B}(y) \end{aligned}$$

where the factorization of V is based on its polar decomposition \(V=U_V |V|\), namely

$$\begin{aligned} V = \mathcal {B}\mathcal {A}, \quad \mathcal {B}:=U_V |V|^\frac{1}{2}, \quad \mathcal {A}:=|V|^\frac{1}{2}. \end{aligned}$$

Notice that

$$\begin{aligned} \Vert \mathcal {A}(x)\Vert ^2 = \Vert \mathcal {B}(x)\Vert ^2 = \Vert V(x)\Vert . \end{aligned}$$
(2.8)

As usual, we split \(Q=Q(z)\) into a singular and a regular part L and M, respectively, i.e.

$$\begin{aligned} Q=L+M; \end{aligned}$$
(2.9)

the corresponding (z-dependent) kernels read

$$\begin{aligned} \mathcal {L}(x,y)&= \mathcal {A}(x) \varUpsilon \mathcal {B}(y), \nonumber \\ \mathcal {M}(x,y)&= \mathcal {A}(x) \Big (\frac{{i}}{2} {\text {sgn}}(x-y) \sigma _1 e^{{i}k(z)|x-y|} +\varUpsilon (e^{{i}k(z)|x-y|}-1) \Big ) \mathcal {B}(y), \end{aligned}$$
(2.10)

where

$$\begin{aligned} \varUpsilon = \varUpsilon (\zeta (z))= \frac{{i}}{2} \begin{pmatrix} \zeta (z) &{}\quad 0 \\ 0 &{} \quad \zeta (z)^{-1} \end{pmatrix}. \end{aligned}$$
(2.11)

Similarly as in [6, Sec. 2], estimating the quadratic form of L, we obtain the bound

$$\begin{aligned} \Vert L\Vert \le \Vert \varUpsilon \Vert \Vert V\Vert _1 = \frac{1}{2} \max \{|\zeta (z)|,|\zeta (z)|^{-1}\} \Vert V\Vert _1. \end{aligned}$$

For later use, we also notice that

$$\begin{aligned} \Vert Q\Vert _{\mathfrak {S} ^2}^2 \le \left( \frac{1}{2}+\Vert \varUpsilon \Vert _{\mathfrak {S} ^2}^2\right) \Vert V\Vert _1^2 = \frac{1}{4} \left( 2+|\zeta (z)|^2+|\zeta (z)|^{-2}\right) \Vert V\Vert _1^2. \end{aligned}$$
(2.12)
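One way to see where the constants in (2.12) come from is the pointwise identity \(\Vert \mathcal {N}\Vert _{\mathfrak {S}^2}^2 = \frac{1}{2} + \Vert \varUpsilon \Vert _{\mathfrak {S}^2}^2\) for the matrix \(\mathcal {N}\) in (2.6), used together with \(|e^{{i}k(z)|x-y|}|\le 1\) and (2.8). The following sketch checks this identity and the closed form on the right-hand side of (2.12) for a sample value of \(\zeta \). The helper names and the sample value are assumptions made for the illustration.

```python
def upsilon(zeta: complex):
    """The diagonal matrix from (2.11), stored as nested tuples."""
    return ((0.5j * zeta, 0j), (0j, 0.5j / zeta))

def resolvent_matrix(zeta: complex, sign: int):
    """Matrix N from (2.6) for sgn(x - y) = sign in {-1, +1}."""
    off = 0.5j * sign
    return ((0.5j * zeta, off), (off, 0.5j / zeta))

def hs_norm_sq(mat) -> float:
    """Squared Hilbert-Schmidt (Frobenius) norm of a small matrix."""
    return sum(abs(entry) ** 2 for row in mat for entry in row)

zeta = 0.7 - 1.9j  # arbitrary nonzero sample value (assumption for the demo)
lhs_plus = hs_norm_sq(resolvent_matrix(zeta, +1))
lhs_minus = hs_norm_sq(resolvent_matrix(zeta, -1))
rhs = 0.5 + hs_norm_sq(upsilon(zeta))
closed_form = 0.25 * (2.0 + abs(zeta) ** 2 + abs(zeta) ** -2)

# Operator norm of the diagonal matrix Upsilon, as used in the bound on ||L||.
op_norm_upsilon = 0.5 * max(abs(zeta), 1.0 / abs(zeta))
```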

The following lemma shows that the possible singularities for \(z = \pm m\) of the regular part M are weaker than those of L.

Lemma 2.1

Let V be as in (2.2), M be the integral operator with the kernel (2.10) and \(\varUpsilon \) as in (2.11). Then

$$\begin{aligned} \Vert M\Vert = o \big ( \Vert \varUpsilon \Vert \big ), \quad z \rightarrow \pm m, \ z \notin \sigma (H). \end{aligned}$$

Proof

The proof is inspired by [22, Lem. 1]. Since \({\text {Im}}k(z)>0\) for \(z \notin \sigma (H)\), straightforward estimates and (2.8) show that there exists \(C>0\) such that

$$\begin{aligned} \Vert \varUpsilon \Vert ^{-2}\Vert \mathcal {M}(x,y)\Vert ^2 \le C \Vert V(x)\Vert \Vert V(y)\Vert \left( \Vert \varUpsilon \Vert ^{-2} + 1 \right) , \quad z \notin \sigma (H). \end{aligned}$$

Thus, for \(\Vert \varUpsilon \Vert >1\), the function \((x,y) \mapsto \Vert \varUpsilon \Vert ^{-2}\Vert \mathcal {M}(x,y)\Vert ^2\) has an integrable upper bound. Since \(\Vert \varUpsilon \Vert \rightarrow \infty \) when \(k(z) \rightarrow 0\) and

$$\begin{aligned} \lim _{k(z) \rightarrow 0}(e^{{i}k(z)|x-y|}-1) = 0, \end{aligned}$$

the dominated convergence theorem yields

$$\begin{aligned}&\lim _{k(z) \rightarrow 0} \Vert \varUpsilon \Vert ^{-2} \int _{{\mathbb {R}}^2} \Vert \mathcal {M}(x,y)\Vert ^2 \; {\mathrm{d}}x \, {\mathrm{d}}y= 0. \end{aligned}$$

\(\square \)

Theorem 2.2

Let H be as in (2.1), V as in (2.2) and let \(m>0\). Define the matrix

$$\begin{aligned} U:=\int _{{\mathbb {R}}} V(x) \, {\mathrm{d}}x. \end{aligned}$$
(2.13)

If \({\text {Re}}U_{11} <0\), then, for all sufficiently small \(\varepsilon >0\), there exists an eigenvalue \(z_+(\varepsilon )\) of \(H + \varepsilon V\) satisfying

$$\begin{aligned} z_+(\varepsilon ) = m - \frac{m}{2} U_{11}^2 \varepsilon ^2 + o(\varepsilon ^2), \quad \varepsilon \rightarrow 0+. \end{aligned}$$
(2.14)

Similarly, if   \({\text {Re}}U_{22} >0\) then, for all sufficiently small \(\varepsilon >0\), there exists an eigenvalue \(z_-(\varepsilon )\) of \(H + \varepsilon V\) satisfying

$$\begin{aligned} z_-(\varepsilon ) = -m + \frac{m}{2} U_{22}^2 \varepsilon ^2 + o(\varepsilon ^2), \quad \varepsilon \rightarrow 0+. \end{aligned}$$
(2.15)

Proof

According to the Birman–Schwinger principle, see [6, Thm. 6.1], \(z \in \sigma _\mathrm{p}(H + \varepsilon V)\) if and only if \(-1 \in \sigma _\mathrm{p}(\varepsilon Q(z))\), where Q is as in (2.9). Thus we investigate the invertibility of \(I + \varepsilon Q(z)\). Notice that, if \(\varepsilon \Vert M\Vert <1\), we have

$$\begin{aligned} I + \varepsilon Q(z) = (I + \varepsilon M)(I + (I + \varepsilon M)^{-1}\varepsilon L), \end{aligned}$$
(2.16)

so the invertibility of \(I + \varepsilon Q(z)\) depends on the invertibility of the second factor in (2.16). We proceed with the analysis of the latter, find for which z it is not invertible and show that, for these z, the condition \(\varepsilon \Vert M\Vert <1\) holds.

The crucial observation is that the kernel of L is separated (in x and y); hence \((I + \varepsilon M)^{-1}\varepsilon L\) is of finite rank and so \(-1 \in \sigma _\mathrm{p}((I + \varepsilon M)^{-1}\varepsilon L)\) if and only if

$$\begin{aligned} \det (I_2 +(I + \varepsilon M)^{-1}\varepsilon L) =0. \end{aligned}$$
(2.17)

To analyze (2.17), it is convenient to write

$$\begin{aligned} (I + \varepsilon M)^{-1} L = \widehat{A}_\varepsilon \varUpsilon \widehat{B}, \end{aligned}$$

where

$$\begin{aligned} \begin{aligned}&\widehat{B} : \quad L^2({\mathbb {R}};{\mathbb {C}}^2) \rightarrow {\mathbb {C}}^2: \quad \varPsi \mapsto \int _{{\mathbb {R}}} \mathcal {B}(x)\varPsi (x) \, {\mathrm{d}}x, \\&\widehat{A}_\varepsilon : \quad {\mathbb {C}}^2 \rightarrow L^2({\mathbb {R}};{\mathbb {C}}^2): \quad \varPhi \mapsto (I+\varepsilon M)^{-1} \mathcal {A}(x) \varPhi =: \mathcal {A}_\varepsilon (x) \varPhi , \end{aligned} \end{aligned}$$

where \(\mathcal {A}_\varepsilon (x)\) is a matrix satisfying

$$\begin{aligned} \Vert \mathcal {A}_\varepsilon (x)-\mathcal {A}(x)\Vert \le r \Vert \mathcal {A}(x)\Vert , \qquad r=r(\varepsilon ,z)=\frac{\varepsilon \Vert M\Vert }{1-\varepsilon \Vert M\Vert }. \end{aligned}$$
(2.18)

Notice also that (with U as in (2.13))

$$\begin{aligned} \widehat{B} \widehat{A}_\varepsilon = U + U_1, \quad \Vert U_1\Vert \le r \Vert V\Vert _1. \end{aligned}$$
(2.19)

Employing these observations and Sylvester’s determinant identity, we can rewrite (2.17) as

$$\begin{aligned} 0 = \det (I_2 + \varepsilon \varUpsilon (U + U_1)) = 1 + \varepsilon {\text {Tr}}(\varUpsilon (U + U_1)) -\frac{1}{4} \varepsilon ^2 \det (U + U_1). \end{aligned}$$

Since

$$\begin{aligned} {\text {Tr}}(U\varUpsilon ) = \frac{{i}}{2} (\zeta U_{11} + \zeta ^{-1} U_{22}), \end{aligned}$$

our initial guess (ignoring the smaller terms) for the dependence of \(\zeta \) on \(\varepsilon \) reads

$$\begin{aligned} \zeta ^0_+= \frac{2 {i}}{\varepsilon U_{11}}, \quad \zeta ^0_-= - \frac{{i}}{2} \varepsilon U_{22}. \end{aligned}$$

Notice that the assumption that \({\text {Re}}U_{11}<0\) or \({\text {Re}}U_{22} > 0\) is needed since the allowed region in terms of \(\zeta \) is \({\text {Im}}\zeta <0\), see the discussion of [4] after (2.6) there.

Finally, we prove that there is indeed a solution \(\zeta _+\) of (2.17) in a neighborhood of \(\zeta ^0_+\); the reasoning for \(\zeta _-\) is analogous. To this end, for \(|\alpha |<\delta _0\), we define \(\zeta _\alpha :=\zeta ^0_+(1+\alpha )\) and select \(\delta _0\) so small that, with some \(\beta >0\), we have \({\text {Im}}\zeta _\alpha< - \beta <0\) for all \(|\alpha |<\delta _0\) and \(\varepsilon \rightarrow 0+\). Observe that with this \(\zeta _\alpha \), we have (uniformly in \(\alpha \)) that \(\varepsilon \Vert M\Vert = o(\varepsilon |\zeta _+^0|) = o(1)\) as \(\varepsilon \rightarrow 0+\) and so \(r = o(1)\) as \(\varepsilon \rightarrow 0+\); the former justifies (2.16) in particular.

Take \(\zeta =\zeta _\alpha \) and define the function

$$\begin{aligned} F(\alpha ):= \det ( I_2 + \varepsilon \varUpsilon (U+U_1) ), \quad |\alpha |<\delta _0. \end{aligned}$$

Since, for all \(|\alpha |<\delta _0\), we have \({\text {Im}}\zeta _\alpha< - \beta < 0\) and \(\zeta _\alpha \rightarrow \infty \) as \(\varepsilon \rightarrow 0+\), i.e. the corresponding \(z_\alpha \in \rho (H)\) and \(z_\alpha \rightarrow m\) as \(\varepsilon \rightarrow 0+\), the function F is holomorphic for \(|\alpha |<\delta _0\). Moreover, using (2.18), (2.19), we get

$$\begin{aligned} \begin{aligned} F(\alpha )&= 1 + \varepsilon {\text {Tr}}(U \varUpsilon ) + \varepsilon {\text {Tr}}(U_1 \varUpsilon ) -\frac{1}{4} \varepsilon ^2 \det (U+U_1) \\&= 1 + \big (-(1+\alpha ) + \mathcal {O}(\varepsilon ^2) \big ) + o(1) + \varepsilon ^2 \mathcal {O}(1) \\&= -\alpha + o(1), \qquad \varepsilon \rightarrow 0+. \end{aligned} \end{aligned}$$

Hence, Rouché’s theorem implies that, for all \(0<\varepsilon <\varepsilon _{\delta _0}\), functions \(F(\alpha )\) and \(G(\alpha ):=-\alpha \) have the same number of zeros in the ball \(B_{\delta _0}(0)\). Notice that the same reasoning is valid for any \(0<\delta <\delta _0\), thus we obtain that the sought solution of (2.17) reads

$$\begin{aligned} \zeta _+ = \zeta ^0_+(1+o(1)), \quad \varepsilon \rightarrow 0+. \end{aligned}$$
(2.20)

The last step, yielding (2.14), is to use the relation (2.7) between \(\zeta \) and z and rewrite (2.20) in terms of \(z_+\). \(\square \)
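The leading-order part of this argument can be explored numerically. The sketch below drops the remainder \(U_1\) (so it is not the full Birman–Schwinger problem), solves \(\det (I_2+\varepsilon \varUpsilon (\zeta )U)=0\) as a quadratic equation in \(\zeta \), and maps the two roots back to z through \(\zeta ^2=(z+m)/(z-m)\), which follows from (2.7). The matrix `U`, the values of `m` and `eps`, and all function names are hypothetical sample choices.

```python
import cmath

def threshold_roots(U, eps):
    """Roots in zeta of det(I_2 + eps * Upsilon(zeta) * U) = 0 with U_1 dropped.

    Multiplying the determinant by zeta gives a*zeta^2 + b*zeta + c = 0.
    Requires U[0][0] != 0. Returns (large_root, small_root).
    """
    (u11, u12), (u21, u22) = U
    det_u = u11 * u22 - u12 * u21
    a = 0.5j * eps * u11
    b = 1.0 - 0.25 * eps**2 * det_u
    c = 0.5j * eps * u22
    disc = cmath.sqrt(b * b - 4.0 * a * c)
    r1, r2 = (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)
    large = r1 if abs(r1) >= abs(r2) else r2
    small = c / (a * large)  # Vieta: product of the roots is c/a
    return large, small

def z_from_zeta(zeta, m):
    """Invert zeta^2 = (z + m) / (z - m)."""
    return m * (zeta * zeta + 1.0) / (zeta * zeta - 1.0)

# Hypothetical sample data with Re U11 < 0 and Re U22 > 0.
m, eps = 1.0, 1e-3
U = ((-1.0 + 0.5j, 0.3), (0.2j, 0.7 - 0.2j))

zeta_plus, zeta_minus = threshold_roots(U, eps)
z_plus, z_minus = z_from_zeta(zeta_plus, m), z_from_zeta(zeta_minus, m)

# Leading terms of (2.14) and (2.15) for comparison.
z_plus_asym = m - 0.5 * m * U[0][0] ** 2 * eps**2
z_minus_asym = -m + 0.5 * m * U[1][1] ** 2 * eps**2
```

For this sample data the large root stays close to \(\zeta ^0_+\), both roots lie in the region \({\text {Im}}\zeta <0\), and the resulting values of z agree with the leading terms of (2.14) and (2.15) up to an error much smaller than \(\varepsilon ^2\). Controlling the neglected term \(U_1\) is precisely what Lemma 2.1 and the Rouché argument are for.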

Remark 2.3

  1. (i)

    Theorem 2.2 shows that the spectral estimate (2.3) is sharp in the weak coupling regime. Indeed, the latter can be stated as

    $$\begin{aligned} |z\mp m|\le \frac{m}{2}\varepsilon ^2\Vert V\Vert _1^2+o(\varepsilon ^2), \quad \varepsilon \rightarrow 0+. \end{aligned}$$

    This implies that for an arbitrary potential satisfying the assumptions of Theorem 2.2 and for sufficiently small coupling there is always at least one eigenvalue lying close to the boundary of one of the disks in (2.3). As the coupling \(\varepsilon \) tends to zero, this eigenvalue approaches a point on the boundary. To see that this can happen for any point on the boundary, one can consider a family of potentials \(\exp ({i}\theta )V\) depending on a parameter \(\theta \in {\mathbb {R}}\), where V is a fixed potential such that \({\text {Re}}U_{11}\ne 0\) and \({\text {Re}}U_{22}\ne 0\). By (2.14)–(2.15), as \(\theta \) varies, the eigenvalues trace out a set that converges to the whole boundary of the union of the two disks.

  2. (ii)

    Notice that in the proof of Theorem 2.2, we use that \(\Vert M\Vert = o(\Vert \varUpsilon (\zeta (z))\Vert )\) as \(z \rightarrow \pm m\), \(z \notin \sigma (H)\) only and not the particular structure of M. The latter would be needed to derive more terms in the expansions (2.14), (2.15).

  3. (iii)

    It is known that the weak coupling limit for the Dirac operator is equivalent to the non-relativistic limit. We do not pursue this connection here and refer to [4] for a discussion of this matter.

2.2 Lieb–Thirring inequalities

In the previous subsection we saw that the massive Dirac operator is critical, i.e. an arbitrarily small (non-self-adjoint) perturbation will create an eigenvalue. Here we prove an upper bound for the number of eigenvalues in certain subsets of the complex plane. The upper bound will be a consequence of a Lieb–Thirring type inequality. We prove similar results for the (non-critical) massless Dirac operator.

Theorem 2.4

Let H be as in (2.1) and V as in (2.2). If \(m=0\) and \(\Vert V\Vert _1\ge 1\), then

$$\begin{aligned} \sum _{z\in \sigma _\mathrm{disc}(H+V)}\frac{{\text {dist}}(z,\sigma (H))}{(|z|+1)^{2}}\le C\left( 1+\Vert V\Vert _2^4\right) \Vert V\Vert _1^2, \end{aligned}$$
(2.21)

where each eigenvalue is counted according to its algebraic multiplicity. If \(m>0\), then for any \(\tau >0\) we have that

$$\begin{aligned} \sum _{z\in \sigma _\mathrm{disc}(H+V)}\frac{{\text {dist}}(z,\sigma (H))|m^2-z^2|^{\frac{\tau }{2}}}{(m+|z|)^{2+\tau }}\le C_{\tau } \frac{A_m(V)}{m} \max \{\Vert V\Vert _1,\Vert V\Vert _1^2 \} \end{aligned}$$
(2.22)

where

$$\begin{aligned} A_m(V)={\left\{ \begin{array}{ll} \displaystyle \min \left\{ \frac{1}{1-\Vert V\Vert _1 e^{\frac{1}{2}\left( \Vert V\Vert _1+1\right) ^2}}, \frac{\left( 1+m^{-2}\Vert V\Vert _2^4\right) ^{2+\tau }}{\rho _0^2} \right\} &{}\quad \mathrm{if}\; \Vert V\Vert _1< \rho _0, \\ \displaystyle \frac{\left( 1+m^{-2}\Vert V\Vert _2^4\right) ^{2+\tau }}{\rho _0^2} &{}\quad \mathrm{if}\;\Vert V\Vert _1\ge \rho _0, \end{array}\right. } \end{aligned}$$

and where \(\rho _0\) is the unique solution to \(\rho e^{\frac{1}{2}(\rho +1)^2}=1\) (\(\rho _0\approx 0.38\)).
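The constant \(\rho _0\) is easy to reproduce numerically. Since \(\rho \mapsto \rho e^{\frac{1}{2}(\rho +1)^2}\) is increasing on \([0,\infty )\), vanishes at \(\rho =0\) and exceeds 1 at \(\rho =1\), a plain bisection on \([0,1]\) suffices. The sketch below is only an illustration, with function names of our choosing.

```python
import math

def rho_equation(rho: float) -> float:
    """Left-hand side minus right-hand side of rho * exp((rho + 1)^2 / 2) = 1."""
    return rho * math.exp(0.5 * (rho + 1.0) ** 2) - 1.0

def solve_rho0(tol: float = 1e-12) -> float:
    """Bisection on [0, 1], where rho_equation changes sign exactly once."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rho_equation(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rho0 = solve_rho0()
```

The computed root agrees with the quoted value \(\rho _0\approx 0.38\) to the displayed precision.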

Remark 2.5

  1. (i)

    We recall that in the massless case (\(m=0\)) the spectral inclusion (2.3) states that there are no eigenvalues whenever \(\Vert V\Vert _1<1\). For this reason we are assuming that \(\Vert V\Vert _1\ge 1\) above.

  2. (ii)

    We emphasize the dependence of the bound on the \(L^1\) norm of V. One reason is that this norm is invariant with respect to rescaling of the mass, i.e. the substitution \(V\rightarrow m^{-1}V(m^{-1}\cdot )\) does not change the \(L^1\) norm. Secondly, a straightforward adaptation of the proof shows that if instead of \(\Vert V\Vert \in L^2({\mathbb {R}})\) we assume that \(\Vert V\Vert \in L^p({\mathbb {R}})\) for some \(p>1\), then (2.21) and (2.22) hold with \(\Vert V\Vert _2^4\) replaced by \(\Vert V\Vert _p^{2p/(p-1)}\).

  3. (iii)

    The bounds of Theorem 2.4 should be compared with those of [10]. In the one-dimensional case it is claimed that, for \(p>1\) and e.g. for \(m=0\), one has

    $$\begin{aligned} \sum _{z\in \sigma _\mathrm{disc}(H+V)}\frac{{\text {dist}}(z,\sigma (H))^{p+\tau }}{(|z|+1)^{2(p+\tau )}}\le C_{p,\tau }\Vert V\Vert _p^p \end{aligned}$$

    for any \(\tau \in (0,\min (p-1,1))\). In fact, an inspection of the proof shows that the constant \(C_{p,\tau }\) still depends on the potential through the parameter b introduced in [10, Sect. 4.1]. Our estimate (2.21) yields better weights both for sequences of eigenvalues accumulating at a point in the essential spectrum and for those tending to infinity. Moreover, the constant is universal; the dependence on (the \(L^2\) norm of) the potential is exhibited explicitly. The comparison in the massive case is less obvious, and we will not pursue the issue here.

  4. (iv)

    In the self-adjoint case, Lieb–Thirring type inequalities for one-dimensional Dirac operators may be found in [14]. Note that even there the eigenvalue sums cannot be controlled by a single \(L^p\) norm. Similar inequalities for resonances were established in [18, 23] in the massless case and in [19] for the massive case.

Theorem 2.4 has the following consequences for the number of eigenvalues \(N_K (H+V)\) of \(H+V\) in a compact subset \(K\subset \rho (H)\). For any \(\delta ,R> 0\) and \(\epsilon \ge 0\) we set

$$\begin{aligned} K_{\delta ,\epsilon ,R}=\left\{ z\in {\mathbb {C}}:{\text {dist}}(z,\sigma (H))\ge \delta ,\,{\text {dist}}(z,\{-m,+m\})\ge \epsilon ,\,|z|\le R\right\} . \end{aligned}$$

Corollary 2.6

If \(m=0\) and \(\Vert V\Vert _1\ge 1\), then we have

$$\begin{aligned} N_{K_{\delta ,0,R}}\le \frac{C}{\delta }(1+R^2)\left( 1+\Vert V\Vert _2^4\right) \Vert V\Vert _1^2. \end{aligned}$$

If \(m>0\), then we have

$$\begin{aligned} N_{K_{\delta ,\epsilon ,R}}\le \frac{C_{\tau }}{\delta } \max \left\{ m^{2+\frac{\tau }{2}}\epsilon ^{-\frac{\tau }{2}},R^{2}\right\} \frac{A_{m}(V)}{m} \max \left\{ \Vert V\Vert _1,\Vert V\Vert _1^2\right\} . \end{aligned}$$
(2.23)

Proof

For \(m>0\) the claim follows from the lower bound

$$\begin{aligned} \frac{{\text {dist}}(z,\sigma (H))|m^2-z^2|^{\frac{\tau }{2}}}{(m+|z|)^{2+\tau }} \ge c_{\tau }\delta \min \left\{ m^{-2-\frac{\tau }{2}}\epsilon ^{\frac{\tau }{2}},R^{-2} \right\} . \end{aligned}$$

This can be seen by treating the cases \(|z|\le 2m\) and \(|z|>2m\) separately and using that \(|z^2-m^2|\ge m\epsilon \) in the first case. The case \(m=0\) is even easier. \(\square \)

Proof of Theorem 2.4

The proof is based on complex analysis. The general approach is explained very well in [8], and we refer the reader to this article for more details. We treat the massive and the massless case separately, starting with the latter. For simplicity we prove (2.21) only for eigenvalues in the upper half plane \({\mathbb {C}}^+\); the proof for the lower half plane is analogous. The basic idea in the complex analysis approach is to relate the eigenvalues to the zeros of a holomorphic function. It is convenient to define the following maps (recall that \(Q=|V|^\frac{1}{2} R(z)V^\frac{1}{2}\)):

$$\begin{aligned} \begin{aligned}&h:{\mathbb {C}}^+\rightarrow {\mathbb {C}},\quad h(z):={\det }_2(I+Q),\\&\varphi :{\mathbb {C}}^+\rightarrow \mathbb {D},\quad \varphi (z):=\frac{z-{i}}{z+{i}},\\&\nu :\mathbb {D}\rightarrow \mathbb {D},\quad \nu (w):=\frac{w+\varphi ({i}\eta )}{1+\varphi ({i}\eta )w},\\&\psi :{\mathbb {C}}^+\rightarrow \mathbb {D},\quad \psi (z):=\nu ^{-1}(\varphi (z)),\\&g:\mathbb {D}\rightarrow {\mathbb {C}},\quad g(w):=\frac{h(\psi ^{-1}(w))}{h({i}\eta )}, \end{aligned} \end{aligned}$$
(2.24)

where \({i}\eta \in \rho (H+V)\) will be chosen momentarily. Note that \(\varphi ,\nu ,\psi \) are conformal maps and the regularized determinant \(\det _2\) is defined for any \(T\in \mathfrak {S}^2\) by

$$\begin{aligned} {\det }_2(I+T):=\det \left( (I+T) e^{-T}\right) . \end{aligned}$$

We then have that \(\det _2(I+T)=0\) if and only if \((I+T)\) is not invertible; see e.g. [27, Thm. 9.2]. In particular, \(h(z)=0\) if and only if \(z\in \sigma _\mathrm{p}(H+V)\). Moreover, h is analytic since Q is analytic in the Hilbert–Schmidt norm, see e.g. [13]. From [27, Thm. 9.2] and (2.12) we have the uniform bound (notice that \(\Vert \varUpsilon \Vert _{\mathfrak {S}^2}^2=1/2\) for \(m=0\))

$$\begin{aligned} \log |h(z)|\le \varGamma _2\Vert Q(z )\Vert _{\mathfrak {S}^2}^2 \le \frac{1}{2} \Vert V\Vert _1^2. \end{aligned}$$
(2.25)

The value of the optimal constant is \(\varGamma _2=1/2\), see e.g. formula (2.2) in Chapter IV of [17]. This implies the following estimate,

$$\begin{aligned} \log |g(w)|\le \frac{1}{2}\Vert V\Vert _1^2-\log |h({i}\eta )|. \end{aligned}$$

We will use the \(L^2\) norm to estimate the second term. First observe that

$$\begin{aligned} \Vert Q({i}\eta )\Vert _{\mathfrak {S}^2}\le 2\Vert |V|^\frac{1}{2} |\sqrt{-\varDelta }-{i}\eta |^{-\frac{1}{2}}\Vert _{\mathfrak {S}^4}^2 \le (2\pi )^{-\frac{1}{4}}2\sqrt{\pi }\Vert V\Vert _2\eta ^{-\frac{1}{2}}, \end{aligned}$$
(2.26)

where we used the Schwarz inequality in Schatten spaces in the first and the Kato-Seiler-Simon bound [27, Thm. 4.1] in the second estimate; a more precise bound is stated in [6, Theorem 4.3]. We also used that the norm of the kernel of \(|R_0({i}\eta )|^{1/2}\) is dominated by twice the absolute value of the kernel of \(|\sqrt{-\varDelta }-{i}\eta |^{-\frac{1}{2}}\) and that the \(L^4\) norm of the function \(||\cdot |-{i}\eta |^{-\frac{1}{2}}\) is bounded from above by \(\pi ^{\frac{1}{4}}\eta ^{-\frac{1}{4}}\). We set

$$\begin{aligned} \eta ^{\frac{1}{2}}=\gamma (2\pi )^{-\frac{1}{4}}2\sqrt{\pi }\Vert V\Vert _2,\quad \text {where} \quad \gamma =e^\frac{5}{2}. \end{aligned}$$

This guarantees that \({i}\eta \in \rho (H+V)\) since \(\gamma ^{-1}<1\). By [27, Thm. 9.2] we have the bound

$$\begin{aligned} |h({i}\eta )-1|\le \gamma ^{-1}\exp \left( \frac{1}{2}(\gamma ^{-1}+1)^2\right) \le \gamma ^{-1} e^2\le e^{-\frac{1}{2}}, \end{aligned}$$

and so, since \(\Vert V\Vert _1 \ge 1\),

$$\begin{aligned} \log |g(w)|\le \frac{1}{2} \Vert V\Vert _1^2 + \frac{1}{2} \le \Vert V\Vert _1^2. \end{aligned}$$
(2.27)
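The numerical constants in this step can be checked directly. The sketch below evaluates the bound on \(|h({i}\eta )-1|\) for \(\gamma =e^{5/2}\) and the resulting upper bound \(-\log (1-|h({i}\eta )-1|)\) on \(-\log |h({i}\eta )|\), which is the quantity that has to stay below \(\frac{1}{2}\) in (2.27). The variable names are ours, and the snippet is meant as a sanity check rather than as part of the proof.

```python
import math

gamma = math.exp(2.5)
x = 1.0 / gamma  # upper bound on the Hilbert-Schmidt norm of Q(i*eta)

# First expression in the chain of inequalities for |h(i*eta) - 1|.
sharp_bound = x * math.exp(0.5 * (x + 1.0) ** 2)
# Cruder expressions further down the same chain.
crude_bound = x * math.exp(2.0)

# Upper bounds on -log|h(i*eta)| obtained from |h| >= 1 - |h - 1|.
sharp_penalty = -math.log(1.0 - sharp_bound)
crude_penalty = -math.log(1.0 - crude_bound)
```

With these values the first expression is about 0.15, so that \(-\log |h({i}\eta )|\) is at most about 0.16, comfortably below \(\frac{1}{2}\). Note that the cruder value \(e^{-\frac{1}{2}}\) alone would only give about 0.93, so it is the first expression in the chain that justifies the additive constant in (2.27).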

Since g is holomorphic on the unit disk \(\mathbb {D}\) and continuous up to the boundary, the following Blaschke condition holds,

$$\begin{aligned} \sum _{g(w)=0}(1-|w|)\le \sum _{g(w)=0}\log \left| \frac{1}{w}\right| \le \sup _{0<r<1}\frac{1}{2\pi }\int _0^{2\pi }\log |g(r\mathrm {e}^{{i}\theta })|{\mathrm{d}}\theta \le \log \Vert g\Vert _{\infty }. \end{aligned}$$
(2.28)

Here and in the following, every zero is counted according to its multiplicity. We have also used the normalization condition \(g(0)=1\) in (2.28), which follows from \(\psi ({i}\eta )=0\). To translate the result back to the z-plane, we use the distortion bound

$$\begin{aligned} (1-|w|)\sim |\psi '(z)|{\text {dist}}(z,\sigma (H))\ge \frac{2}{1+\eta ^2}\frac{{\text {dist}}(z,\sigma (H))}{|z+{i}|^2}. \end{aligned}$$
(2.29)

The first inequality follows from the Koebe distortion theorem, the second from an explicit computation. Combination of (2.27), (2.28) and (2.29) yields

$$\begin{aligned} \sum _{z\in \sigma _\mathrm{p}(H+V)}\frac{{\text {dist}}(z,\sigma (H))}{|z+{i}|^{2}}\le C(1+\eta ^2) \Vert V\Vert _1^2. \end{aligned}$$

By the choice of \(\eta \), this is equivalent to (2.21) for \(z\in {\mathbb {C}}^+\).
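The change of variables used in the massless case is explicit enough to be tested numerically. The following sketch implements \(\varphi \), \(\nu \) and \(\psi =\nu ^{-1}\circ \varphi \) from (2.24), together with the inverse of \(\psi \), and checks the normalization \(\psi ({i}\eta )=0\) behind \(g(0)=1\). The value of `eta`, the sample points and the function names are assumptions made for the illustration.

```python
def phi(z: complex) -> complex:
    """Cayley transform from the upper half-plane to the unit disk."""
    return (z - 1j) / (z + 1j)

def make_psi(eta: float):
    """Return (psi, psi_inv) for the maps in (2.24) with base point i*eta."""
    a = phi(1j * eta)  # equals (eta - 1)/(eta + 1), real and in (-1, 1)

    def psi(z: complex) -> complex:
        w = phi(z)
        return (w - a) / (1.0 - a * w)  # nu^{-1} applied to phi(z)

    def psi_inv(w: complex) -> complex:
        v = (w + a) / (1.0 + a * w)  # nu(w)
        return 1j * (1.0 + v) / (1.0 - v)  # inverse Cayley transform

    return psi, psi_inv

eta = 3.0  # sample value (assumption for the demo)
psi, psi_inv = make_psi(eta)
points = [1.0 + 2.0j, -5.0 + 0.1j, 0.01j]  # sample points in the upper half-plane
images = [psi(z) for z in points]
```

In this example the base point \({i}\eta \) is sent to the origin, the sample points land inside the unit disk, and `psi_inv` recovers them up to rounding.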

Next, we consider the massive case. Without loss of generality we may restrict our attention to the case \(m=1\); the general case follows by scaling. We use the same maps as in (2.24) except that we replace \(\varphi \) by

$$\begin{aligned} \varphi :\rho (H)\rightarrow \mathbb {D},\quad \varphi (z):=\frac{\sqrt{(z+1)/(z-1)}-{i}}{\sqrt{(z+1)/(z-1)}+{i}}. \end{aligned}$$

Instead of (2.25) we now have [again from (2.12)] the estimate

$$\begin{aligned} \log |h(z)| \le \frac{1}{2} \Vert Q(z)\Vert _{\mathfrak {S}^2}^2 \le \frac{1}{8} \left( 2 + |\zeta (z)|^2 + |\zeta (z)|^{-2}\right) \Vert V\Vert _1^2. \end{aligned}$$

Assume first that \(\Vert V\Vert _1<\rho _0\). By (2.3) we may choose \(\eta >0\) (independent of V) such that \({i}\eta \in \rho (H+V)\), i.e. \(h({i}\eta )\ne 0\). Since \(|\zeta ({i}\eta )|=1\), we have

$$\begin{aligned} \Vert Q({i}\eta )\Vert _{\mathfrak {S}^2} \le \Vert V\Vert _1. \end{aligned}$$

It follows that

$$\begin{aligned} |h({i}\eta )-1| \le \Vert V\Vert _1 \exp \left( \frac{1}{2}(\Vert V\Vert _1+1)^2\right) , \end{aligned}$$

and thus

$$\begin{aligned} \begin{aligned} \log |g(w)|&\le \frac{1}{8} \left( 2 + |\zeta (\psi ^{-1}(w))|^2 + |\zeta (\psi ^{-1}(w))|^{-2}\right) \Vert V\Vert _1^2 \\&\quad +\frac{e^{\frac{1}{2}(\Vert V\Vert _1+1)^2}\Vert V\Vert _1}{1-\Vert V\Vert _1 e^{\frac{1}{2}(\Vert V\Vert _1+1)^2}} \\&\le C\Vert V\Vert _1\left( 1-\Vert V\Vert _1 e^{\frac{1}{2}(\Vert V\Vert _1+1)^2}\right) ^{-1}|1-w|^{-2}|1+w|^{-2}. \end{aligned} \end{aligned}$$

In the first estimate we used the triangle inequality \(|h({i}\eta )|\ge 1-|h({i}\eta )-1|\), the monotonicity of the logarithm and the elementary inequality \(\log (1-x)\ge -x/(1-x)\) for \(x=|h({i}\eta )-1|\in (0,1)\). The second estimate follows from the definition of \(\psi \), some elementary inequalities for positive numbers and the fact that \(\Vert V\Vert _1^2<\Vert V\Vert _1\). Since the right-hand side is no longer bounded, we cannot use (2.28). Instead, we have to use a more refined result due to [2]. In our case it implies that

$$\begin{aligned} \sum _{w:\,g(w)=0}(1-|w|)|1-w|^{1+\tau }|1+w|^{1+\tau }\le C_{\tau } \left( 1-\Vert V\Vert _1 e^{\frac{1}{2}(\Vert V\Vert _1+1)^2}\right) ^{-1}\Vert V\Vert _1 \end{aligned}$$
(2.30)

for any \(\tau >0\). Using the estimates

$$\begin{aligned} \begin{aligned} (1-|w|)&\gtrsim {\text {dist}}(z,\sigma (H))(1+|z|)^{-1}|z^2-1|^{-\frac{1}{2}}(1+\eta ^2)^{-1},\\ |1-w||1+w|&\gtrsim (1+|z|)^{-1}|z^2-1|^{\frac{1}{2}}(1+\eta ^2)^{-1}, \end{aligned} \end{aligned}$$

which follow from straightforward albeit tedious computations based on the definition of \(\psi \) and on Koebe’s distortion theorem, we infer from (2.30) that

$$\begin{aligned} \sum _{z\in \sigma _\mathrm{disc}(H+V)}\frac{{\text {dist}}(z,\sigma (H))|1-z^2|^{\frac{\tau }{2}}}{(1+|z|)^{2+\tau }}\le C_{\tau } \left( 1-\Vert V\Vert _1 e^{\frac{1}{2}(\Vert V\Vert _1+1)^2}\right) ^{-1}\Vert V\Vert _1. \end{aligned}$$
(2.31)

This is half of the desired bound (2.22) for \(m=1\) and \(\Vert V\Vert _1<\rho _0\). Observe that if \(\Vert V\Vert _1<\rho _0/2\), then the right-hand side of (2.31) is bounded from above by a constant multiple of \(\max \{\Vert V\Vert _1,\Vert V\Vert _1^2\}\), while \(A_m(V)\) is bounded from below by a positive constant. Hence, it remains to prove (2.22) for \(\Vert V\Vert _1\ge \rho _0/2\) and with \(A_m(V)=(1+\Vert V\Vert _2^4)^{2+\tau }\). Since we have \(|\sqrt{1-\varDelta }-{i}\eta |\ge |\sqrt{-\varDelta }-{i}\eta |\), (2.26) holds, and we make the same choice of \(\eta \) and \(\gamma \) as in the massless case. A repetition of the above arguments yields that

$$\begin{aligned} \log |g(w)|\le 4\frac{\Vert V\Vert _1^2+\gamma ^{-1}}{|1-w|^2|1+w|^2} \end{aligned}$$

and finally

$$\begin{aligned} \sum _{z\in \sigma _\mathrm{disc}(H+V)}\frac{{\text {dist}}(z,\sigma (H))|1-z^2|^{\frac{\tau }{2}}}{(1+|z|)^{2+\tau }}\le C_{\tau }(1+\eta ^2)^{2+\tau }\left( \Vert V\Vert _1^2+\gamma ^{-1}\right) . \end{aligned}$$

For \(\Vert V\Vert _1\ge \rho _0/2\), we have that \(\gamma ^{-1}\le 4(\gamma \rho _0^2)^{-1}\Vert V\Vert _1^2\). Hence, by the choice of \(\eta \) and \(\gamma \), this is the desired bound. \(\square \)

3 Applications

3.1 Damped wave equation in \(L^2({\mathbb {R}})\)

Our first non-self-adjoint application motivated by physics is the damped wave equation

$$\begin{aligned} u_{tt}(t,x) + 2 a(x) u_t(t,x) = u_{xx}(t,x) -q_0 u(t,x), \quad t >0, \ x \in {\mathbb {R}}; \end{aligned}$$
(3.1)

where the damping a and the potential \(q_0 \in {\mathbb {R}}_+\) satisfy

$$\begin{aligned} a(x) = a_0 + a_1(x), \quad a_0>0, \quad a_1 \in L^1({\mathbb {R}}) \cap L^2({\mathbb {R}}), \quad q_0>a_0^2. \end{aligned}$$
(3.2)

The second-order scalar equation (3.1) can be reformulated as a first-order system, suitable for spectral analysis, in the form

$$\begin{aligned} \partial _t \begin{pmatrix} u \\ v \end{pmatrix} = \underbrace{\begin{pmatrix} -2 a &{}\quad - \partial _x - q_0^\frac{1}{2} \\ - \partial _x + q_0^\frac{1}{2} &{}\quad 0 \end{pmatrix} }_{G} \begin{pmatrix} u \\ v \end{pmatrix}. \end{aligned}$$

The equivalence to another (perhaps more intuitive) system with \(v=u_t\) is extensively discussed e.g. in [16]. We view G as an operator in \(L^2({\mathbb {R}};{\mathbb {C}}^2)\) and we employ a similarity transformation that brings G to the form studied in Sect. 2. Let

$$\begin{aligned} T= - \frac{1}{2\left( \mu ^2+{i}a_0 \mu \right) ^\frac{1}{2}} \begin{pmatrix} q_0^\frac{1}{2} &{}\quad a_0 - {i}\mu \\ a_0 - {i}\mu &{}\quad q_0^\frac{1}{2} \end{pmatrix}, \qquad \mu :=\left( q_0-a_0^2\right) ^\frac{1}{2}>0. \end{aligned}$$

Then a straightforward calculation yields

$$\begin{aligned} T {i}G T^{-1} = \underbrace{ \begin{pmatrix} \mu &{}\quad - {i}\partial _x \\ - {i}\partial _x &{}\quad -\mu \end{pmatrix} }_{H} + \underbrace{\frac{a_1(x)}{\mu } \begin{pmatrix} -a_0 &{}\quad q_0^\frac{1}{2} \\ - q_0^\frac{1}{2} &{}\quad a_0 \end{pmatrix} }_{V} - {i}a_0 I_2. \end{aligned}$$
(3.3)

In the simplest case with \(a_1=0\), (3.3) immediately gives that

$$\begin{aligned} \sigma (G) = \sigma _\mathrm{e3}(G) = -a_0 + (-{i}\infty , -{i}\mu ] \cup [{i}\mu , + {i}\infty ), \end{aligned}$$

and thus illustrates the well-known spectral picture exhibiting the effect of damping (the shift of the spectrum to the left). As a corollary of the claims established or recalled in Sect. 2, we obtain new results in the situation when the damping is no longer constant but satisfies (3.2).
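The constant-damping case can be checked numerically at the level of Fourier symbols. The sketch below replaces \(\partial _x\) by \({i}k\), takes \(a_1=0\), and verifies for sample values of \(a_0\) and \(q_0\) (chosen here for illustration only) that \(T\,{i}G\,T^{-1}\) equals \(H-{i}a_0 I_2\) and that the eigenvalues of the symbol of G are \(-a_0\pm {i}(k^2+\mu ^2)^{1/2}\). The function names are hypothetical.

```python
import numpy as np

a0, q0 = 0.7, 2.0               # sample parameters with q0 > a0**2 (assumption)
q = np.sqrt(q0)
mu = np.sqrt(q0 - a0**2)

def G_symbol(k):
    """Symbol of G for a_1 = 0, with d/dx replaced by 1j*k."""
    return np.array([[-2 * a0, -1j * k - q],
                     [-1j * k + q, 0.0]], dtype=complex)

def H_symbol(k):
    """Symbol of the free Dirac operator H with mass mu (-i d/dx -> k)."""
    return np.array([[mu, k], [k, -mu]], dtype=complex)

T = -1 / (2 * np.sqrt(mu**2 + 1j * a0 * mu)) * np.array(
    [[q, a0 - 1j * mu], [a0 - 1j * mu, q]])

def sim_defect(k):
    lhs = T @ (1j * G_symbol(k)) @ np.linalg.inv(T)
    return np.abs(lhs - (H_symbol(k) - 1j * a0 * np.eye(2))).max()

def eig_defect(k):
    r = np.sqrt(k**2 + mu**2)
    eigs = sorted(np.linalg.eigvals(G_symbol(k)), key=lambda lam: lam.imag)
    return max(abs(eigs[0] - (-a0 - 1j * r)), abs(eigs[1] - (-a0 + 1j * r)))

ks = np.linspace(-5.0, 5.0, 11)
sim_err = max(sim_defect(k) for k in ks)
eig_err = max(eig_defect(k) for k in ks)
print(sim_err, eig_err)
```

As k ranges over \({\mathbb {R}}\), the two eigenvalue branches sweep out exactly the two vertical half-lines in the formula for \(\sigma (G)\) above.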

Theorem 3.1

Let G, H, V, a, \(q_0\) and \(\mu \) be as above. Then

$$\begin{aligned} \sigma _\mathrm{e3}(G)=-a_0 + (-{i}\infty , -{i}\mu ] \cup [{i}\mu , + {i}\infty ) \end{aligned}$$

and the following holds.

  1. (i)

    For any \(\tau >0\), the Lieb–Thirring type inequality (2.22) and the bound on the number of eigenvalues (2.23) hold with

    $$\begin{aligned} \Vert V\Vert _1= \sqrt{\frac{q_0^\frac{1}{2} + a_0}{q_0^\frac{1}{2} - a_0}} \int _{{\mathbb {R}}} |a_1(x)| \, {\mathrm{d}}x, \quad \Vert V\Vert _2= \sqrt{\frac{q_0^\frac{1}{2} + a_0}{q_0^\frac{1}{2} - a_0}} \left( \int _{{\mathbb {R}}} |a_1(x)|^2 \, {\mathrm{d}}x\right) ^{\frac{1}{2}} \end{aligned}$$
    (3.4)

    and with the replacement \(H \mapsto {i}H\) and \(m \mapsto {i}\mu \).

  2. (ii)

    If \(\Vert V\Vert _1<1\), then

    $$\begin{aligned} \sigma _\mathrm{p}(G) {\setminus } \sigma _\mathrm{e3}(G) \subset \overline{B}_{\mu r_0}({i}\mu x_0-a_0) \, \dot{\cup } \, \overline{B}_{\mu r_0}(-{i}\mu x_0 - a_0), \end{aligned}$$

    where \(x_0\) and \(r_0\) are as in (2.4) with \(\Vert V\Vert _1\) as in (3.4).

  3. (iii)

    In the weak coupling regime, i.e. \(a_1\) is replaced everywhere by \(\varepsilon a_1\) with \(\varepsilon >0\): if \(\int _{{\mathbb {R}}} a_1(x) \, {\mathrm{d}}x >0\), then, for all sufficiently small \(\varepsilon >0\), there are two eigenvalues \(z_\pm (\varepsilon )\), \(z_-=\overline{z_+}\), of G satisfying

    $$\begin{aligned} z_+(\varepsilon ) = -a_0 + {i}\mu - \frac{{i}a_0^2}{2 \mu } \left( \int _{\mathbb {R}}a_1(x) \; {\mathrm{d}}x \right) ^2 \varepsilon ^2 + o(\varepsilon ^2), \quad \varepsilon \rightarrow 0+. \end{aligned}$$
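The constant in (3.4) can be cross-checked numerically. The sketch below assumes that the pointwise matrix norm entering \(\Vert V\Vert _1\) and \(\Vert V\Vert _2\) is the operator (spectral) norm on \({\mathbb {C}}^2\), and compares the norm of the constant matrix in V with the closed form \(\sqrt{(q_0^{1/2}+a_0)/(q_0^{1/2}-a_0)}\). It also evaluates the leading terms of the expansion in (iii) for a hypothetical value of \(\int _{\mathbb {R}}a_1\). All parameter values and function names are illustrative.

```python
import numpy as np

a0, q0 = 0.7, 2.0               # sample parameters with q0 > a0**2 (assumption)
q = np.sqrt(q0)
mu = np.sqrt(q0 - a0**2)

# Constant matrix multiplying a_1(x) in the potential V of (3.3).
M = np.array([[-a0, q], [-q, a0]]) / mu
spectral_norm = np.linalg.norm(M, 2)          # largest singular value
closed_form = np.sqrt((q + a0) / (q - a0))    # prefactor in (3.4)

def z_plus_leading(eps, int_a1):
    """Leading terms of z_+(eps) from item (iii); the o(eps^2) remainder is dropped."""
    return -a0 + 1j * mu - 1j * a0**2 / (2 * mu) * int_a1**2 * eps**2

z = z_plus_leading(0.1, 1.0)    # hypothetical value of the integral of a_1
print(spectral_norm, closed_form, z)
```

To this order the eigenvalue keeps the real part \(-a_0\) while its imaginary part drops below \(\mu \), i.e. it leaves the threshold of the essential spectrum along the vertical line.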

3.2 Armchair graphene waveguides

As a second application motivated by physics, we consider the two-dimensional Dirac operator of an infinite straight graphene waveguide \(\varOmega =(-a,a)\times {\mathbb {R}}\),

$$\begin{aligned} D= \begin{pmatrix} 0&{}\quad \tau ^*&{}\quad 0&{}\quad 0\\ \tau &{}\quad 0&{}\quad 0&{}\quad 0\\ 0&{}\quad 0&{}\quad 0&{}\quad -\tau \\ 0&{}\quad 0&{}\quad -\tau ^*&{}\quad 0\\ \end{pmatrix}\quad \text{ in } L^2(\varOmega ;{\mathbb {C}}^4), \end{aligned}$$
(3.5)

where \(\tau :=-{i}\partial _1+\partial _2\) and \(\tau ^*:=-{i}\partial _1 - \partial _2\) is the formal adjoint. The domain of D consists of spinors \(\psi \in H^1(\varOmega ;{\mathbb {C}}^4)\) satisfying so-called armchair boundary conditions

$$\begin{aligned} \psi _i(-a,x_2)=\psi _{i+2}(-a,x_2), \quad \psi _i(a,x_2)=e^{{i}\varTheta }\psi _{i+2}(a,x_2), \quad i=1,2, \end{aligned}$$

where \(0\le \varTheta <2\pi \) depends on the geometry of the waveguide. It was proved in [15, Prop. 1, 19] that D is self-adjoint and that the spectrum is given by

$$\begin{aligned} \sigma (D)=\sigma _\mathrm{e3}(D)=(-\infty ,-E_0]\cup [E_0,\infty ), \end{aligned}$$

where

$$\begin{aligned} E_0:=\min _{n\in {\mathbb {Z}}}|\xi _n|, \qquad \xi _n:=\frac{\pi n}{2a}-\frac{\varTheta }{4a}. \end{aligned}$$

To simplify the presentation in the sequel, we restrict ourselves to the case when \(\varTheta \in (0,\pi )\) and thus \(E_0 = -\xi _0>0\); the results can be extended in a straightforward way to the other cases.
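The role of the restriction on \(\varTheta \) can be illustrated with a short computation. The sketch below evaluates \(E_0=\min _n|\xi _n|\) over a finite window of indices (the truncation `n_max` and the sample values of a and \(\varTheta \) are assumptions made for illustration) and shows that the minimum is attained at \(n=0\) for \(\varTheta \in (0,\pi )\), while for \(\varTheta \in (\pi ,2\pi )\) it moves to \(n=1\).

```python
import math

def xi(n: int, a: float, theta: float) -> float:
    """Transverse threshold xi_n = pi*n/(2a) - theta/(4a)."""
    return math.pi * n / (2 * a) - theta / (4 * a)

def E0(a: float, theta: float, n_max: int = 50) -> float:
    """min_n |xi_n| over |n| <= n_max; the minimum sits at small |n|."""
    return min(abs(xi(n, a, theta)) for n in range(-n_max, n_max + 1))

a = 1.5                              # sample half-width (assumption)
gap_small_theta = E0(a, 2.0)         # theta in (0, pi)
gap_large_theta = E0(a, 4.0)         # theta in (pi, 2*pi)
print(gap_small_theta, -xi(0, a, 2.0))
print(gap_large_theta, xi(1, a, 4.0))
```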

Although the algebraic structure of Dirac waveguide operators is more complicated than in the Laplacian (or Schrödinger) case, it might be helpful to remark that the numbers \(\{\xi _n\}_{n \in {\mathbb {Z}}}\) play the role of spectral thresholds in the essential spectrum of D. The corresponding (normalized in \(L^2((-a,a);{\mathbb {C}}^4)\)) transverse eigenfunctions read

$$\begin{aligned} \varPsi _n^+(x_1) = \frac{1}{2 a^\frac{1}{2}} \begin{pmatrix} e^{-{i}\xi _n x_1}\\ 0 \\ (-1)^ne^{-{i}\frac{\varTheta }{2}}e^{{i}\xi _n x_1} \\ 0 \end{pmatrix}, \ \ \varPsi _n^-(x_1) = \frac{1}{2 a^\frac{1}{2}} \begin{pmatrix} 0\\ e^{-{i}\xi _n x_1}\\ 0\\ (-1)^n e^{-{i}\frac{\varTheta }{2}} e^{{i}\xi _n x_1} \end{pmatrix} \end{aligned}$$
(3.6)

and the set \(\{\varPsi _n^\sigma \}_{n \in {\mathbb {Z}}, \sigma \in \{+,-\}}\) forms an orthonormal basis in \(L^2((-a,a);{\mathbb {C}}^4)\). To proceed with spectral analysis of perturbations of D, we derive a convenient representation of the resolvent of D based on its decomposition into transverse modes. Moreover, we employ a unitary transformation bringing D and its resolvent close to the form of the Dirac operator H investigated in Sect. 2. Notice that \(-\xi _n\) plays the role of m in previous formulas.
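The normalization and the boundary conditions of the transverse modes (3.6) can be checked numerically. The following sketch samples \(\varPsi _n^\pm \) on a midpoint grid of \((-a,a)\), builds the Gram matrix of the modes with \(|n|\le 2\), and measures the defect in the armchair boundary conditions. The grid size, the mode window and the values of a and \(\varTheta \) are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

a, theta = 1.5, 2.0             # sample half-width and phase (assumption)

def xi(n):
    return np.pi * n / (2 * a) - theta / (4 * a)

def Psi(n, sigma, x1):
    """Transverse mode (3.6) sampled at the points x1; shape (4, len(x1))."""
    x1 = np.atleast_1d(np.asarray(x1, dtype=float))
    up = np.exp(-1j * xi(n) * x1)
    down = (-1.0) ** n * np.exp(-1j * theta / 2) * np.exp(1j * xi(n) * x1)
    zero = np.zeros_like(up)
    rows = [up, zero, down, zero] if sigma == "+" else [zero, up, zero, down]
    return np.array(rows) / (2 * np.sqrt(a))

# Midpoint quadrature on (-a, a).
N = 4000
h = 2 * a / N
x = -a + (np.arange(N) + 0.5) * h

modes = [(n, s) for n in range(-2, 3) for s in "+-"]
gram = np.array([[np.sum(np.conj(Psi(n, s, x)) * Psi(m, t, x)) * h
                  for (m, t) in modes] for (n, s) in modes])
gram_err = np.abs(gram - np.eye(len(modes))).max()

def bc_defect(n, sigma):
    """Defect in psi_i(-a) = psi_{i+2}(-a) and psi_i(a) = e^{i theta} psi_{i+2}(a)."""
    left = Psi(n, sigma, -a)[:, 0]
    right = Psi(n, sigma, a)[:, 0]
    return max(np.abs(left[:2] - left[2:]).max(),
               np.abs(right[:2] - np.exp(1j * theta) * right[2:]).max())

bc_err = max(bc_defect(n, s) for (n, s) in modes)
print(gram_err, bc_err)
```

Both defects are at the level of rounding errors, consistent with the modes being orthonormal and lying in the transverse domain.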

Lemma 3.2

Let D be as in (3.5). Then, for all \(z \notin (-\infty ,-E_0] \cup [E_0,\infty )\),

$$\begin{aligned} \varSigma (D-z)^{-1} \varSigma ^{-1} = \sum _{n \in {\mathbb {Z}}} (H_n-z)^{-1} P_n = \sum _{\begin{array}{c} n \in {\mathbb {Z}}\\ \sigma \in \{+,-\} \end{array}} (H_n-z)^{-1} P_n^\sigma \end{aligned}$$
(3.7)

where

$$\begin{aligned} \varSigma = \frac{1}{2} I_2 \otimes \begin{pmatrix} 1+{i}&{}\quad 1+{i}\\ -1+{i}&{}\quad 1-{i}\end{pmatrix}, \quad H_n = I_2 \otimes \begin{pmatrix} -\xi _n &{}\quad - {i}\partial _2 \\ - {i}\partial _2 &{}\quad \xi _n \end{pmatrix}, \quad P_n = P_n^+ + P_n^- \end{aligned}$$
(3.8)

and, for every \(f \in L^2(\varOmega ;{\mathbb {C}}^4)\),

$$\begin{aligned} \left( P_n^\sigma f\right) (x_1,x_2) = \langle \varSigma \varPsi _n^\sigma , f(\cdot ,x_2) \rangle _{L^2((-a,a);{\mathbb {C}}^4)} \varSigma \varPsi _n^\sigma (x_1), \quad \sigma \in \{+,-\}. \end{aligned}$$

Proof

Notice that, for any \(g \in H^1({\mathbb {R}};{\mathbb {C}})\), we have

$$\begin{aligned} \begin{aligned} D g(x_2) \varPsi _n^\sigma (x_1)&= I_2 \otimes \begin{pmatrix} 0 &{}\quad - \partial _2 - \xi _n \\ \partial _2 - \xi _n &{}\quad 0 \end{pmatrix} g(x_2) \varPsi _n^\sigma (x_1) \\&= \varSigma ^{-1} H_n \varSigma g(x_2) \varPsi _n^\sigma (x_1). \end{aligned} \end{aligned}$$

Thus (3.7) follows by standard arguments. \(\square \)
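The algebra behind the second equality in the proof can be verified at the level of Fourier symbols in \(x_2\). The sketch below replaces \(\partial _2\) by \({i}k\) and checks that \(\varSigma \) from (3.8) is unitary and that \(\varSigma ^{-1}H_n\varSigma \) reproduces the matrix \(I_2\otimes \begin{pmatrix} 0 &{} -\partial _2-\xi _n \\ \partial _2-\xi _n &{} 0\end{pmatrix}\). The sample values of \(\xi _n\) and k and the function names are assumptions made for illustration.

```python
import numpy as np

s = 0.5 * np.array([[1 + 1j, 1 + 1j], [-1 + 1j, 1 - 1j]])
Sigma = np.kron(np.eye(2), s)                 # Sigma from (3.8)

def H_n_symbol(xi_n, k):
    """Symbol of H_n with -i d/dx_2 replaced by k."""
    block = np.array([[-xi_n, k], [k, xi_n]], dtype=complex)
    return np.kron(np.eye(2), block)

def D_mode_symbol(xi_n, k):
    """Symbol of D restricted to the n-th transverse mode (d/dx_2 -> 1j*k)."""
    block = np.array([[0, -1j * k - xi_n], [1j * k - xi_n, 0]], dtype=complex)
    return np.kron(np.eye(2), block)

unitarity_err = np.abs(Sigma.conj().T @ Sigma - np.eye(4)).max()

samples = [(-0.4, 0.0), (-0.4, 1.3), (0.9, -2.1), (2.5, 0.7)]   # (xi_n, k)
intertwine_err = max(
    np.abs(np.linalg.inv(Sigma) @ H_n_symbol(x, k) @ Sigma
           - D_mode_symbol(x, k)).max()
    for (x, k) in samples)
print(unitarity_err, intertwine_err)
```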

In the following, we investigate the spectrum of \(D+V\) where

$$\begin{aligned} V: \varOmega \rightarrow {\mathbb {C}}^{4 \times 4}: \quad \Vert V\Vert \in L^1(\varOmega )\cap L^2(\varOmega ). \end{aligned}$$
(3.9)

To keep the connection to Sect. 2, in the proofs we always employ the unitary transformation \(\varSigma \) and thus instead of V we in fact work with

$$\begin{aligned} W= \varSigma V \varSigma ^{-1}. \end{aligned}$$

Remark 3.3

(On the assumption \(\Vert V\Vert \in L^2(\varOmega )\)) Similarly to the case of the one-dimensional Dirac operator, the condition \(\Vert V\Vert \in L^2(\varOmega )\) is imposed for convenience only. However, the situation is slightly different for the waveguide: We cannot drop the \(L^2\) norm completely, but merely replace it (at the expense of using a more complicated definition of the sum \(D+V\)) by an \(L^{1+\varepsilon }\) norm, where \(\varepsilon >0\) is arbitrary. The \(\varepsilon \) loss takes place in an orthogonality argument for estimating an infinite sum in the proof of Lemma 3.5. We do not know if this is just a technical issue.

Lemma 3.4

Let D be as in (3.5) and V as in (3.9). Then

$$\begin{aligned} \sigma _\mathrm{e3}(D+V) = \sigma _\mathrm{e3}(D) = (-\infty ,-E_0]\cup [E_0,\infty ). \end{aligned}$$

Proof

We show that V is relatively compact with respect to D, so the essential spectrum \(\sigma _\mathrm{e3}\) is preserved, see [11, Thm. IX.2.1]. In the following, we employ the unitary transformation \(\varSigma \).

First, using the explicit kernel (2.6) with \(z=0\) and \(m=-\xi _n\) and \(\Vert W\Vert \in L^2(\varOmega )\), we verify that \(W H_n^{-1}P_n\) is a Hilbert-Schmidt operator with

$$\begin{aligned} \Vert W H_n^{-1} P_n\Vert ^2_{\mathfrak {S}^2}\le C(1+|n|)^{-1}\Vert W\Vert _2^2. \end{aligned}$$

We will show that the series

$$\begin{aligned} \sum _{n \in {\mathbb {Z}}} W H_n^{-1} P_n, \end{aligned}$$
(3.10)

having Hilbert-Schmidt operators as summands, is convergent in the operator norm; this will imply that \(V D^{-1}\) is compact.

To show the convergence of (3.10), we approximate W by bounded potentials \(W_n\) defined by

$$\begin{aligned} W_n(x):={\left\{ \begin{array}{ll} W(x) &{}\quad \text{ if } \Vert W(x)\Vert \le C_n,\\ 0 &{}\quad \text{ if } \Vert W(x)\Vert > C_n, \end{array}\right. } \end{aligned}$$

where \(C_n\) is chosen such that \(\Vert W\mathbf {1}_{\{x:\Vert W(x)\Vert > C_n\}}\Vert _2\le 2^{-|n|}\); this is possible e.g. by Chebyshev’s inequality. In summary, we have chosen \(W_n\) such that

$$\begin{aligned} \Vert W-W_n\Vert _2\le 2^{-|n|},\quad \Vert W_n\Vert _2\le \Vert W\Vert _2. \end{aligned}$$

By the mutual orthogonality of \({{\text {Ran}}}P_n \), we have for any \(N\in \mathbb {N}\) with \(N\ge 1\) that

$$\begin{aligned} \Big \Vert \sum _{|n|>N} W_n H_n^{-1}P_n \Big \Vert = \Big \Vert \sum _{|n|>N} P_n H_n^{-1} W_n^* \Big \Vert \le C N^{-\frac{1}{2}}\Vert W\Vert _2. \end{aligned}$$

On the other hand, we have

$$\begin{aligned} \Big \Vert \sum _{|n|>N} (W-W_n) H_n^{-1}P_n \Big \Vert \le C\sum _{|n|>N}2^{-|n|}. \end{aligned}$$

The claim is proved. \(\square \)

3.2.1 Weakly coupled eigenvalues in armchair waveguides

We analyze the eigenvalues of \(D+ \varepsilon V\) emerging from the edges of the essential spectrum and their asymptotics as \(\varepsilon \rightarrow 0+\). Here V is assumed to satisfy (3.9) and thus \(\sigma _\mathrm{e3}(D+\varepsilon V) =\sigma _\mathrm{e3}(D)\) for every \(\varepsilon \ge 0\) by Lemma 3.4. As in the one-dimensional case, the main tool is the Birman–Schwinger principle. To be able to use the formulas from the one-dimensional case, we employ the unitary transformation \(\varSigma \), see (3.8), and the representation of the resolvent of D in (3.7).

By standard arguments, it can be verified that \(z \in \sigma _\mathrm{p}(D+ \varepsilon V)\) if and only if \(-1 \in \sigma _\mathrm{p}( \varepsilon Q)\), where the Birman–Schwinger operator Q has the form

$$\begin{aligned} Q(z) = \overline{A \varSigma (D-z)^{-1} \varSigma ^{-1} B}, \end{aligned}$$

(the bar denotes the closure) and where A, B are multiplication operators by the matrices \(\mathcal {A}\) and \(\mathcal {B}\) stemming from the polar decomposition of W, i.e. 

$$\begin{aligned} W = \mathcal {B}\mathcal {A}, \quad \mathcal {B}:=U_W |W|^\frac{1}{2}, \quad \mathcal {A}:=|W|^\frac{1}{2}. \end{aligned}$$

We further decompose Q into a singular and regular part, namely \(Q = L + M\). All these integral operators have explicit kernels; nonetheless, we display here in detail only the formula for L. After straightforward manipulations employing the formulas (2.6) and (3.6), we get

$$\begin{aligned} (L \varPsi )(x) = \mathcal {A}(x) \varUpsilon _2 \sum _{\sigma \in \{+,-\}} \int _\varOmega \langle \varPsi _0^\sigma (y_1), \mathcal {B}(y) \varPsi (y) \rangle _{{\mathbb {C}}^4} \, {\mathrm{d}}y \, \psi _0^\sigma , \end{aligned}$$

where (with \(\varUpsilon \) as in (2.11))

$$\begin{aligned} \varUpsilon _2= \varUpsilon _2(\zeta _0(z)) = I_2 \otimes \varUpsilon (\zeta _0(z)), \quad \zeta _0(z) = \frac{z - \xi _0}{k_0(z)}, \quad k_0(z) = \left( z^2-\xi _0^2\right) ^\frac{1}{2}\qquad \end{aligned}$$
(3.11)

and

$$\begin{aligned} \psi _0^+=\frac{4 a^\frac{1}{2} \sin \frac{\varTheta }{4}}{\varTheta } \begin{pmatrix} 1 \\ 0 \\ e^{-{i}\frac{\varTheta }{2} } \\ 0 \end{pmatrix}, \qquad \psi _0^-=\frac{4 a^\frac{1}{2} \sin \frac{\varTheta }{4}}{\varTheta } \begin{pmatrix} 0 \\ 1 \\ 0 \\ e^{-{i}\frac{\varTheta }{2} } \end{pmatrix}. \end{aligned}$$

For the regular part \(M=M_1+M_2\), we have

$$\begin{aligned} M_1 = A (H_0-z)^{-1} P_0 B - L, \quad M_2 = A \left( \sum _{n \in {\mathbb {Z}}{\setminus } \{0\}} (H_n-z)^{-1} P_n \right) B . \end{aligned}$$
(3.12)

Lemma 3.5

Let V be as in (3.9) and \(M_1\), \(M_2\) be as in (3.12). Then

$$\begin{aligned} \Vert M_1\Vert = o(\Vert \varUpsilon \Vert ), \quad \Vert M_2\Vert = \mathcal {O}(1), \quad z \rightarrow \pm \xi _0. \end{aligned}$$

Proof

The estimate of \(\Vert M_1\Vert \) is almost the same as in the proof of Lemma 2.1, so we omit the details.

To prove the estimate for \(\Vert M_2\Vert \), let \(f,g\in C_c^{\infty }(\varOmega ;{\mathbb {C}}^4)\) be normalized in \(L^2(\varOmega ;{\mathbb {C}}^4)\) and assume that \(|z-|\xi _0||<1/2\). From formula (3.6) it is straightforward to obtain the estimate

$$\begin{aligned} \left( \int _{-a}^{a} \Vert AP_n^{\sigma }f(x_1,x_2)\Vert _{{\mathbb {C}}^4}^2 {\mathrm{d}}x_1\right) ^{\frac{1}{2}} \le w(x_2)^{\frac{1}{2}}\left( \int _{-a}^a\Vert f(x_1,x_2)\Vert _{{\mathbb {C}}^4}^2{\mathrm{d}}x_1\right) ^{\frac{1}{2}} \end{aligned}$$
(3.13)

for all \(n\in {\mathbb {Z}}\), all \(\sigma \in \{+,-\}\) and for almost all \(x_2\in {\mathbb {R}}\), where

$$\begin{aligned} w(x_2):=\left( \int _{-a}^a\Vert W(x_1,x_2)\Vert ^2{\mathrm{d}}x_1\right) ^{\frac{1}{2}}. \end{aligned}$$

An analogous inequality holds with A and f replaced by B and g. By Schwarz’s and Bessel’s inequality,

$$\begin{aligned}&\sum _{|n|\ge 1} \left| \langle Af,(H_n-z)^{-1} P_n Bg\rangle \right| \\&\quad \le \left( \sum _{|n|\ge 1} \Vert |H_n-z|^{-\frac{1}{2}}P_n Af\Vert ^2\right) ^{\frac{1}{2}} \left( \sum _{|n|\ge 1} \Vert |H_n-z|^{-\frac{1}{2}} P_n Bg\Vert ^2\right) ^{\frac{1}{2}}\\&\quad \le \sup _{|n|\ge 1} \Vert AP_n|(-\partial _2^2+\xi _n^2)^{\frac{1}{2}}\otimes I_4 -z|^{-\frac{1}{2}}\Vert \,\Vert BP_n|(-\partial _2^2+\xi _n^2)^{\frac{1}{2}}\otimes I_4-z|^{-\frac{1}{2}}\Vert \\&\quad \le 4 \sup _{|n|\ge 1} \Vert w^{\frac{1}{2}}|(-\partial _2^2+\xi _n^2)^{\frac{1}{2}}-z|^{-\frac{1}{2}}\Vert _{\mathfrak {S}^4(L^2({\mathbb {R}}))}^2 \\&\quad \le 4(2\pi )^{-\frac{1}{4}}\sup _{|n|\ge 1}\sup _{|z-|\xi _0||<1/2}\left( \int _{{\mathbb {R}}}|(\eta ^2+\xi _n^2)^{\frac{1}{2}}-z|^{-2}{\mathrm{d}}\eta \right) ^{\frac{1}{4}}\Vert W\Vert _2, \end{aligned}$$

where we used (3.13) in the next-to-last inequality and the Kato-Seiler-Simon inequality, see [27, Thm. 4.1], in the last inequality. The factor 4 comes from estimating the matrix operator \(H_n\) in terms of the scalar operator \((-\partial _2^2+\xi _n^2)^{\frac{1}{2}}\). The supremum over n and z in the final expression is finite. \(\square \)

Having established suitable estimates of the regular part M, we prove the main result of this section.

Theorem 3.6

Let V be as in (3.9) and let \(W= \varSigma V \varSigma ^{-1}\) with \(\varSigma \) as in (3.8). Define the matrices

$$\begin{aligned} U_{ij}:= \sum _{k=1}^4 \sum _{\sigma \in \{+,-\}} \int _{\varOmega } (\overline{\varPsi _0^\sigma (x_1)})_k W_{kj}(x) \, {\mathrm{d}}x \, (\psi _0^\sigma )_i. \end{aligned}$$
(3.14)

and

$$\begin{aligned} U^+:= \begin{pmatrix} U_{11} &{}\quad U_{13} \\ U_{31} &{}\quad U_{33} \end{pmatrix}, \quad U^-:= \begin{pmatrix} U_{22} &{}\quad U_{24} \\ U_{42} &{}\quad U_{44} \end{pmatrix}. \end{aligned}$$

If \(u^+ \in {\mathbb {C}}\) is a solution of \(\det (u^+ U^+ + I_2)=0\) that satisfies \({\text {Re}}u^+ >0\), then for any sufficiently small \(\varepsilon >0\), there exists an eigenvalue \(z_+(\varepsilon )\) of \(D + \varepsilon V\) satisfying

$$\begin{aligned} z_+(\varepsilon ) = -\xi _0 + \frac{\xi _0}{2(u^+)^2} \varepsilon ^2 + o(\varepsilon ^2), \quad \varepsilon \rightarrow 0+. \end{aligned}$$
(3.15)

Similarly, if \(u^- \in {\mathbb {C}}\) is a solution of \(\det (u^- U^- + I_2)=0\) that satisfies \({\text {Re}}u^- <0\), then for any sufficiently small \(\varepsilon >0\), there exists an eigenvalue \(z_-(\varepsilon )\) of \(D + \varepsilon V\) satisfying

$$\begin{aligned} z_-(\varepsilon ) = \xi _0 - \frac{\xi _0}{2} (u^- )^2 \varepsilon ^2 + o(\varepsilon ^2), \quad \varepsilon \rightarrow 0+. \end{aligned}$$
(3.16)

Proof

The strategy and individual steps are the same as in the one-dimensional case (Theorem 2.2), so we indicate only the differences. Employing a decomposition as in (2.16) and the fact that the kernel of L is separated, we convert the problem \(-1 \in \sigma _\mathrm{p}(\varepsilon Q(z))\) into the algebraic equation

$$\begin{aligned} \det \left( I_4 + \varepsilon \varUpsilon _2 (U+U_1) \right) =0, \end{aligned}$$

with U as in (3.14) and \(U_1=\mathcal {O}\left( \frac{\varepsilon \Vert M\Vert }{1-\varepsilon \Vert M\Vert }\right) \). As an initial guess we consider

$$\begin{aligned} \zeta ^0_+= - \frac{2{i}}{\varepsilon } u^+, \quad \zeta ^0_-= \frac{{i}}{2} \varepsilon u^-, \end{aligned}$$

for which we obtain \(\det \left( I_4 + \varepsilon \varUpsilon _2 U \right) = \mathcal {O}(\varepsilon ^2)\) as \(\varepsilon \rightarrow 0+\); the latter can be verified by the Laplace expansion of the determinant. The rest of the proof follows the lines of the proof of Theorem 2.2, employing the estimates on M from Lemma 3.5 and formulas (3.11). \(\square \)
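The link between the initial guesses \(\zeta ^0_\pm \) and the expansions (3.15), (3.16) can be made explicit with a small computation. By (3.11), \(\zeta _0(z)^2=(z-\xi _0)/(z+\xi _0)\), which inverts to \(z=\xi _0(1+\zeta ^2)/(1-\zeta ^2)\). The sketch below inserts \(\zeta ^0_\pm \) into this inverse and compares the result with the leading terms of (3.15) and (3.16); the values of \(\xi _0\), \(u^\pm \) and the function name are hypothetical and serve only as an illustration.

```python
def z_from_zeta(zeta: complex, xi0: float) -> complex:
    """Invert zeta^2 = (z - xi0)/(z + xi0), which follows from (3.11)."""
    return xi0 * (1 + zeta**2) / (1 - zeta**2)

xi0 = -0.4                      # sample threshold with E_0 = -xi0 > 0 (assumption)
u_plus = 0.8 + 0.3j             # hypothetical root with Re u^+ > 0
u_minus = -0.6 + 0.2j           # hypothetical root with Re u^- < 0

errors = []
for eps in (1e-1, 3e-2, 1e-2):
    z_p = z_from_zeta(-2j * u_plus / eps, xi0)        # guess zeta^0_+
    z_m = z_from_zeta(0.5j * eps * u_minus, xi0)      # guess zeta^0_-
    lead_p = -xi0 + xi0 / (2 * u_plus**2) * eps**2    # leading terms of (3.15)
    lead_m = xi0 - xi0 / 2 * u_minus**2 * eps**2      # leading terms of (3.16)
    errors.append((eps, abs(z_p - lead_p), abs(z_m - lead_m)))

for eps, err_p, err_m in errors:
    print(eps, err_p, err_m)
```

The discrepancies decay like \(\varepsilon ^4\), so the initial guesses already reproduce the \(\varepsilon ^2\) terms of the two expansions; the remainder of the proof controls the correction coming from \(U_1\).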

Example 3.7

In the particularly simple case where \(V = {\text {diag}}(v_1,v_2,v_3,v_4)\) with \(v_j=v_j(x_2)\), \(j=1,\dots ,4\), and with \(a = \varTheta ^2/(8 \sin ^2(\varTheta /4))\), straightforward calculations reveal that

$$\begin{aligned} u^+=u^-=-\frac{1}{\int _{{\mathbb {R}}} {\text {Tr}}(V(x_2)) \; {\mathrm{d}}x_2}. \end{aligned}$$

Thus, depending on the sign of \({\text {Re}}u^\pm \), we obtain eigenvalues \(z_\pm \) obeying (3.15) or (3.16).
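The calculation in this example can be reproduced numerically. The sketch below assembles the matrix U of (3.14) for a diagonal potential depending on \(x_2\) only, using that in this case the \(x_1\)-integral of \(\overline{\varPsi _0^\sigma }\) equals \(\overline{\psi _0^\sigma }\), so that only the numbers \(\int _{\mathbb {R}}v_j\) enter. The values of these integrals and of \(\varTheta \) are hypothetical. Since \(\det U^\pm \) vanishes here, the equation \(\det (uU^\pm +I_2)=0\) reduces to \(1+u\,{\text {Tr}}\,U^\pm =0\).

```python
import numpy as np

theta = 2.0                                   # sample phase in (0, pi)
a = theta**2 / (8 * np.sin(theta / 4) ** 2)   # half-width fixed as in the example
ints = np.array([0.3, -1.1, 0.5, 0.9])        # hypothetical values of int v_j dx_2

s = 0.5 * np.array([[1 + 1j, 1 + 1j], [-1 + 1j, 1 - 1j]])
Sigma = np.kron(np.eye(2), s)
W_int = Sigma @ np.diag(ints) @ Sigma.conj().T    # integral of W over x_2

p = 4 * np.sqrt(a) * np.sin(theta / 4) / theta
c = np.exp(-1j * theta / 2)
psi = {"+": p * np.array([1, 0, c, 0]), "-": p * np.array([0, 1, 0, c])}

# U_ij = sum_k sum_sigma conj(psi_0^sigma)_k (W_int)_kj (psi_0^sigma)_i, cf. (3.14)
U = sum(np.outer(psi[sg], psi[sg].conj() @ W_int) for sg in "+-")
U_plus = U[np.ix_([0, 2], [0, 2])]
U_minus = U[np.ix_([1, 3], [1, 3])]

det_plus = np.linalg.det(U_plus)              # vanishes, so the equation is linear in u
u_plus = -1 / np.trace(U_plus)
u_minus = -1 / np.trace(U_minus)
expected = -1 / ints.sum()                    # -1 / int Tr V(x_2) dx_2
print(u_plus, u_minus, expected)
```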