1 Introduction

In this paper, we consider computing all eigenvalues of a generalized eigenvalue problem that are located in a certain region, together with their corresponding eigenvectors:

$$\begin{aligned} A {{\varvec{x}}}_i = \lambda _i B {{\varvec{x}}}_i, \quad {{\varvec{x}}}_i \in {\mathbb {C}}^n \setminus \{ {{\varvec{0}}} \}, \quad \lambda _i \in \varOmega \subset {\mathbb {C}}, \end{aligned}$$
(1)

where \(A, B \in {\mathbb {C}}^{n \times n}\), and \(zB-A\) is assumed to be nonsingular for any z on the boundary \(\varGamma \) of the region \(\varOmega \). Let m be the number of target eigenvalues \(\lambda _i \in \varOmega \) (counting multiplicity) and \(X_\varOmega = [{{\varvec{x}}}_i | \lambda _i \in \varOmega ]\) be a matrix whose columns are the target eigenvectors.

In 2003, Sakurai and Sugiura proposed a powerful algorithm for solving the interior eigenvalue problem (1) [19]. Their projection-type method uses certain complex moment matrices constructed by a contour integral. The basic concept is to introduce the rational function

$$\begin{aligned} r(z) := \widetilde{{\varvec{v}}}^{\mathrm{H}} (z B - A)^{-1} B {{\varvec{v}}}, \quad {{\varvec{v}}}, \widetilde{{\varvec{v}}} \in {\mathbb {C}}^n \setminus \{ {{\varvec{0}}} \}, \end{aligned}$$
(2)

whose poles are the eigenvalues of the generalized eigenvalue problem \(A{{\varvec{x}}}_i = \lambda _i B {{\varvec{x}}}_i\), and then to compute all poles located in \(\varOmega \) by Kravanja’s algorithm [16], which is based on Cauchy’s integral formula.

Kravanja’s algorithm can be expressed as follows. Let \(\varGamma \) be a positively oriented Jordan curve, i.e., the boundary of \(\varOmega \). We define complex moments \(\mu _k\) as

$$\begin{aligned} \mu _k := \frac{1}{2 \pi \mathrm{i} } \oint _\varGamma z^k r(z) \mathrm{d}z, \quad k = 0, 1, \dots , 2M-1. \end{aligned}$$

Then, all poles of the meromorphic function r(z) located in \(\varOmega \) are the eigenvalues of the generalized eigenvalue problem

$$\begin{aligned} H_M^< {{\varvec{y}}}_i = \theta _i H_M {{\varvec{y}}}_i, \end{aligned}$$
(3)

where \(H_M, H_M^<\) are Hankel matrices:

$$\begin{aligned} H_M := \left( \begin{array}{llll} \mu _0 &{}\quad \mu _1 &{}\quad \cdots &{}\quad \mu _{M-1} \\ \mu _1 &{}\quad \mu _2 &{}\quad \cdots &{}\quad \mu _{M} \\ \vdots &{} \quad \vdots &{}\quad \ddots &{} \quad \vdots \\ \mu _{M-1} &{}\quad \mu _{M} &{}\quad \cdots &{}\quad \mu _{2M-2} \end{array} \right) , \quad H_M^< := \left( \begin{array}{llll} \mu _1 &{}\quad \mu _2 &{}\quad \cdots &{}\quad \mu _{M} \\ \mu _2 &{}\quad \mu _3 &{}\quad \cdots &{}\quad \mu _{M+1} \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \mu _{M} &{}\quad \mu _{M+1} &{}\quad \cdots &{}\quad \mu _{2M-1} \end{array} \right) . \end{aligned}$$

By applying Kravanja’s algorithm to the rational function (2), the generalized eigenvalue problem (1) is reduced to the generalized eigenvalue problem (3) with the Hankel matrices. This algorithm is called the SS–Hankel method.
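As an illustration, the following is a minimal numerical sketch of the SS–Hankel method for a circular region \(\varOmega \), assuming the N-point trapezoidal rule on the boundary circle; the function name, the random probe vectors and all parameter defaults are illustrative choices, not part of the original method specification.

```python
# A minimal sketch of the SS-Hankel method for a circular region Omega,
# assuming the N-point trapezoidal rule; names and defaults are illustrative.
import numpy as np
import scipy.linalg as la

def ss_hankel(A, B, center, radius, M=4, N=32, seed=0):
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    v, vt = rng.standard_normal(n), rng.standard_normal(n)
    theta = 2 * np.pi * (np.arange(N) + 0.5) / N
    z = center + radius * np.exp(1j * theta)      # quadrature points z_j
    w = radius * np.exp(1j * theta) / N           # weights for (1/(2 pi i)) * oint
    # r(z_j) = vt^H (z_j B - A)^{-1} B v, then mu_k = sum_j w_j z_j^k r(z_j)
    r = np.array([vt.conj() @ la.solve(zj * B - A, B @ v) for zj in z])
    mu = np.array([np.sum(w * z**k * r) for k in range(2 * M)])
    H = la.hankel(mu[:M], mu[M - 1:2 * M - 1])    # H_M      (assumed nonsingular)
    Hs = la.hankel(mu[1:M + 1], mu[M:2 * M])      # H_M^<
    return la.eig(Hs, H, right=False)             # eigenvalues theta_i
```

Of the M computed eigenvalues, only those lying inside \(\varOmega \) are meaningful; the remainder are spurious and are filtered out in practice.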

The SS–Hankel method has since been developed by several researchers. The SS–RR method, based on the Rayleigh–Ritz procedure, increases the accuracy of the eigenpairs [20]. Block variants of the SS–Hankel and SS–RR methods (known as the block SS–Hankel and the block SS–RR methods, respectively) improve the stability of the algorithms [10,11,12]. The block SS–Arnoldi method, based on the block Arnoldi method, has also been proposed [13]. Independently of these methods, Polizzi proposed the FEAST eigensolver for Hermitian generalized eigenvalue problems, which is based on an accelerated subspace iteration with the Rayleigh–Ritz procedure [17]. The original 2009 version has been further developed [9, 23, 24].

Meanwhile, the contour integral-based methods have been extended to nonlinear eigenvalue problems. Nonlinear eigensolvers based on the block SS–Hankel [1, 2] and the block SS–RR [25] methods have been developed, and a different type of contour integral-based nonlinear eigensolver was proposed by Beyn [6], which we call the Beyn method. More recently, an improvement of the Beyn method based on the canonical polyadic decomposition was proposed [5].

For the Hermitian case, i.e., when A is Hermitian and B is Hermitian positive definite, there are several related works based on Chebyshev polynomial filtering [7, 26] and on rational interpolation [4]. Specifically, Austin et al. showed that the contour integral-based eigensolvers have a strong relationship with the rational interpolation framework established in [3], and proposed a projection-type method using only real poles [4].

In this paper, we reconsider the algorithms of typical contour integral-based eigensolvers for (1), namely, the block SS–Hankel method [10, 11], the block SS–RR method [12], the FEAST eigensolver [17], the block SS–Arnoldi method [13] and the Beyn method [6], as projection methods. We then analyze and map the relationships among these methods. Based on this map, we also provide error analyses of each method. We note that our analyses cover the cases of Jordan blocks of size larger than one, infinite eigenvalues, or both.

The remainder of this paper is organized as follows. Sections 2 and 3 briefly describe the algorithms of the contour integral-based eigensolvers and analyze the properties of their typical matrices, respectively. The relationships among these methods are analyzed and mapped in Sect. 4. Error analyses of the methods are presented in Sect. 5, and numerical experiments are conducted in Sect. 6. The paper concludes with Sect. 7.

Throughout, the following notation is used. Let \(V = [{{\varvec{v}}}_1, {{\varvec{v}}}_2, \ldots , {{\varvec{v}}}_L] \in {\mathbb {C}}^{n \times L}\) and define the range space of the matrix V by \({\mathcal {R}}(V) := \mathrm{span}\{ {{\varvec{v}}}_1, {{\varvec{v}}}_2, \ldots , {{\varvec{v}}}_L \}\). In addition, for \(A \in {\mathbb {C}}^{n \times n}\), \({\mathcal {K}}_k^\square (A,V)\) and \({\mathcal {B}}_k^\square (A,V)\) denote the block Krylov subspaces:

$$\begin{aligned} {\mathcal {K}}_k^\square (A,V) :=&{\mathcal {R}}([V, AV, A^2V, \dots , A^{k-1}V]), \\ {\mathcal {B}}_k^\square (A,V) :=&\left\{ \sum _{i=0}^{k-1} A^i V \alpha _i \Big | \alpha _i \in {\mathbb {C}}^{L \times L} \right\} . \end{aligned}$$

We also define a block diagonal matrix with block elements \(D_i \in {\mathbb {C}}^{n_i \times n_i}\) constructed as follows:

$$\begin{aligned} \bigoplus _{i=1}^d D_i = D_1 \oplus D_2 \oplus \dots \oplus D_d = \left( \begin{array}{cccc} D_1 &{} &{} &{} \\ &{} D_2 &{} &{} \\ &{} &{} \ddots &{} \\ &{} &{} &{} D_d \end{array} \right) \in {\mathbb {C}}^{n \times n}, \end{aligned}$$

where \(n = \sum _{i=1}^d n_i\).

2 Contour integral-based eigensolvers

The contour integral-based eigensolvers reduce the target eigenvalue problem (1) to a different type of small eigenvalue problem. In this section, we first describe the reduced eigenvalue problems and then introduce the algorithms of the contour integral-based eigensolvers.

2.1 Theoretical preparation

As a generalization of the Jordan canonical form to matrix pencils, we have the following theorem.

Theorem 1

(Weierstrass canonical form) Let the matrix pencil \(zB-A\) be regular. Then, there exist nonsingular matrices \({\widetilde{P}}^{\mathrm{H}}\) and Q such that

$$\begin{aligned} {\widetilde{P}}^{\mathrm{H}} (zB-A) Q = \bigoplus _{i=1}^r \left( zI_{n_i}-J_{n_i}(\lambda _i) \right) \oplus \bigoplus _{i=r+1}^d \left( zJ_{n_i}(0)-I_{n_i} \right) , \end{aligned}$$

where \(J_{n_i}(\lambda _i)\) is the Jordan block with \(\lambda _i\),

$$\begin{aligned} J_{n_i}(\lambda _i) = \left( \begin{array}{llll} \lambda _i &{}\quad 1 &{} &{} \\ &{}\quad \lambda _i &{}\quad \ddots &{} \\ &{} &{}\quad \ddots &{} 1 \\ &{} &{} &{}\quad \lambda _i \end{array} \right) \in {\mathbb {C}}^{n_i \times n_i}, \end{aligned}$$

and \(zJ_{n_i}(0)-I_{n_i}\) is the Jordan block with \(\lambda =\infty \),

$$\begin{aligned} zJ_{n_i}(0)-I_{n_i} = \left( \begin{array}{llll} -1 &{}\quad z &{} &{} \\ &{} -1 &{}\quad \ddots &{} \\ &{} &{}\quad \ddots &{}\quad z \\ &{} &{} &{}\quad -1 \end{array} \right) \in {\mathbb {C}}^{n_i \times n_i}. \end{aligned}$$

The generalized eigenvalue problem \(A {{\varvec{x}}}_i = \lambda _i B {{\varvec{x}}}_i\) has r finite eigenvalues \(\lambda _i, i = 1, 2, \dots , r\) with multiplicity \(n_i\) and \(d-r\) infinite eigenvalues \(\lambda _i, i = r+1, r+2, \dots , d\) with multiplicity \(n_i\). Let \({\widetilde{P}}_i\) and \(Q_i\) be submatrices of \({\widetilde{P}}\) and Q, respectively, corresponding to the i-th Jordan block, i.e., \({\widetilde{P}} = [{\widetilde{P}}_1, {\widetilde{P}}_2, \dots , {\widetilde{P}}_d], Q = [Q_1, Q_2, \dots , Q_d]\). Then, the columns of \({\widetilde{P}}_i\) and \(Q_i\) are the left/right generalized eigenvectors, whose first columns are the corresponding left/right eigenvectors.

Let \(L, M \in {\mathbb {N}}\) be input parameters and \(V \in {\mathbb {C}}^{n \times L}\) be an input matrix. We also define \(S \in {\mathbb {C}}^{n \times LM}\) and \(S_k \in {\mathbb {C}}^{n \times L}\) as follows:

$$\begin{aligned} S := [ S_0, S_1, \dots , S_{M-1}], \quad S_k := \frac{1}{2 \pi \mathrm{i} } \oint _\varGamma z^k (zB-A)^{-1} BV \mathrm{d}z. \end{aligned}$$
(4)

From Theorem 1, we have the following theorem [10, 11, Theorem 4].

Theorem 2

Let \({\widetilde{Q}}^{\mathrm{H}} = Q^{-1}\) and \({\widetilde{Q}}_i\) be a submatrix of \({\widetilde{Q}}\) corresponding to the i-th Jordan block, i.e., \({\widetilde{Q}} = [{\widetilde{Q}}_1, {\widetilde{Q}}_2, \dots , {\widetilde{Q}}_d]\). Then, we have

$$\begin{aligned} S_k = Q_\varOmega J_\varOmega ^k {\widetilde{Q}}_\varOmega ^{\mathrm{H}} V = (Q_\varOmega J_\varOmega {\widetilde{Q}}_\varOmega ^{\mathrm{H}})^k (Q_\varOmega {\widetilde{Q}}_\varOmega ^{\mathrm{H}}V) = C_\varOmega ^k S_0, \quad C_\varOmega = Q_\varOmega J_\varOmega {\widetilde{Q}}_\varOmega ^{\mathrm{H}}, \end{aligned}$$

where

$$\begin{aligned} Q_\varOmega = [Q_i | \lambda _i \in \varOmega ], \quad {\widetilde{Q}}_\varOmega = [{\widetilde{Q}}_i | \lambda _i \in \varOmega ], \quad J_\varOmega = \bigoplus _{\lambda _i \in \varOmega } J_{n_i}(\lambda _i). \end{aligned}$$

Using Theorem 2, we also have the following theorem.

Theorem 3

Let m be the number of target eigenvalues (counting multiplicity) and \(X_\varOmega := [{{\varvec{x}}}_i | \lambda _i \in \varOmega ]\) be a matrix whose columns are the target eigenvectors. Then, we have

$$\begin{aligned} {\mathcal {R}}(X_\varOmega ) \subset {\mathcal {R}}(Q_\varOmega ) = {\mathcal {R}}(S), \end{aligned}$$

if and only if \(\mathrm{rank}(S) = m\).

Proof

From Theorem 2 and the definition of S, we have

$$\begin{aligned} S = [S_0, S_1, \dots , S_{M-1}] = Q_\varOmega Z \end{aligned}$$

where

$$\begin{aligned} Z := \left[ ({\widetilde{Q}}_\varOmega ^{\mathrm{H}} V), J_\varOmega ({\widetilde{Q}}_\varOmega ^{\mathrm{H}} V), \dots , J_\varOmega ^{M-1}({\widetilde{Q}}_\varOmega ^{\mathrm{H}} V)\right] . \end{aligned}$$

Since \(Q_\varOmega \) has full column rank, \(\mathrm{rank}(S) = \mathrm{rank}(Z)\), and \({\mathcal {R}}(Q_\varOmega ) = {\mathcal {R}}(S)\) is satisfied if and only if \(\mathrm{rank}(S) = \mathrm{rank}(Z) = m\). From the definitions of \(X_\varOmega \) and \(Q_\varOmega \), we have \({\mathcal {R}}(X_\varOmega ) \subset {\mathcal {R}}(Q_\varOmega )\). Therefore, Theorem 3 is proven.

Here, we note that \(\mathrm{rank}(Z)=m\) is not always satisfied for \(m \le LM\) even if \({\widetilde{Q}}_\varOmega ^{\mathrm{H}}V\) has full rank [8]. \(\square \)

2.2 Introduction to contour integral-based eigensolvers

The contour integral-based eigensolvers are mathematically designed based on Theorems 2 and 3; the practical algorithms are then derived by approximating the contour integral (4) with a numerical integration rule:

$$\begin{aligned} {\widehat{S}} := [ {\widehat{S}}_0, {\widehat{S}}_1, \dots , {\widehat{S}}_{M-1}], \quad {\widehat{S}}_k := \sum _{j=1}^N \omega _j z_j^k (z_jB-A)^{-1} BV, \end{aligned}$$
(5)

where \(z_j\) is a quadrature point and \(\omega _j\) is its corresponding weight.
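For concreteness, the following is a minimal sketch of forming \({\widehat{S}}\) in (5) for a circular contour with the N-point trapezoidal rule; the helper name build_S_hat is an illustrative choice, reused in the sketches below, and the midpoint-shifted quadrature points are one common option.

```python
# A minimal sketch of forming S_hat in (5) on a circular contour with the
# N-point trapezoidal rule; build_S_hat is an illustrative helper reused in
# the sketches that follow.
import numpy as np
import scipy.linalg as la

def build_S_hat(A, B, V, center, radius, M, N):
    theta = 2 * np.pi * (np.arange(N) + 0.5) / N
    z = center + radius * np.exp(1j * theta)        # quadrature points z_j
    w = radius * np.exp(1j * theta) / N             # weights omega_j
    Y = [la.solve(zj * B - A, B @ V) for zj in z]   # (z_j B - A)^{-1} B V
    return np.hstack([sum(w[j] * z[j]**k * Y[j] for j in range(N))
                      for k in range(M)])           # [S_hat_0, ..., S_hat_{M-1}]
```

The N linear systems are mutually independent, so the solves parallelize naturally, and one factorization per quadrature point can be reused for all L right-hand sides.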

2.2.1 The block SS–Hankel method

The block SS–Hankel method [10, 11] is a block variant of the SS–Hankel method. Define the block complex moments \(\mu _k^\square \in {\mathbb {C}}^{L \times L}\) by

$$\begin{aligned} \mu _k^\square := \frac{1}{2 \pi \mathrm{i} } \oint _\varGamma z^k {\widetilde{V}}^{\mathrm{H}} (zB-A)^{-1} BV \mathrm{d}z = {\widetilde{V}}^\mathrm{H} S_k, \end{aligned}$$

where \({\widetilde{V}} \in {\mathbb {C}}^{n \times L}\), and the block Hankel matrices \(H_M^\square , H_M^{\square <} \in {\mathbb {C}}^{LM \times LM}\) are given by

$$\begin{aligned} H_M^\square := \left( \begin{array}{llll} \mu _0^\square &{}\quad \mu _1^\square &{}\quad \cdots &{}\quad \mu _{M-1}^\square \\ \mu _1^\square &{}\quad \mu _2^\square &{}\quad \cdots &{}\quad \mu _{M}^\square \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \mu _{M-1}^\square &{}\quad \mu _{M}^\square &{}\quad \cdots &{}\quad \mu _{2M-2}^\square \end{array} \right) , \quad H_M^{\square <} := \left( \begin{array}{llll} \mu _1^\square &{}\quad \mu _2^\square &{}\quad \cdots &{}\quad \mu _{M}^\square \\ \mu _2^\square &{}\quad \mu _3^\square &{}\quad \cdots &{}\quad \mu _{M+1}^\square \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \mu _{M}^\square &{}\quad \mu _{M+1}^\square &{}\quad \cdots &{}\quad \mu _{2M-1}^\square \end{array} \right) . \end{aligned}$$

We then obtain the following theorem [10, 11, Theorem 7].

Theorem 4

If \(\mathrm{rank}(S)=m\), then the nonsingular part of the matrix pencil \(zH_M^\square -H_M^{\square <}\) is equivalent to \(zI-J_\varOmega \).

According to Theorem 4, the target eigenpairs \((\lambda _i,{{\varvec{x}}}_i), \lambda _i \in \varOmega \) can be obtained through the generalized eigenvalue problem

$$\begin{aligned} H_M^{\square <} {{\varvec{y}}}_i = \theta _i H_M^\square {{\varvec{y}}}_i. \end{aligned}$$
(6)

In practice, we approximate the block complex moments \({\mu }_k^\square \in {\mathbb {C}}^{L \times L}\) by the numerical integration (5) as

$$\begin{aligned} {\widehat{\mu }}_k^\square := \sum _{j=1}^N \omega _j z_j^k {\widetilde{V}}^{\mathrm{H}} (z_jB-A)^{-1} BV = {\widetilde{V}}^{\mathrm{H}} {\widehat{S}}_k, \end{aligned}$$

and set the block Hankel matrices \({\widehat{H}}_M^\square , {\widehat{H}}_M^{\square <} \in {\mathbb {C}}^{LM \times LM}\) as follows:

$$\begin{aligned} {\widehat{H}}_M^\square := \left( \begin{array}{llll} {{\widehat{\mu }}}_0^\square &{}\quad {{\widehat{\mu }}}_1^\square &{}\quad \cdots &{}\quad {{\widehat{\mu }}}_{M-1}^\square \\ {{\widehat{\mu }}}_1^\square &{}\quad {{\widehat{\mu }}}_2^\square &{}\quad \cdots &{}\quad {{\widehat{\mu }}}_{M}^\square \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ {{\widehat{\mu }}}_{M-1}^\square &{}\quad {{\widehat{\mu }}}_{M}^\square &{}\quad \cdots &{}\quad {{\widehat{\mu }}}_{2M-2}^\square \end{array} \right) , \quad {\widehat{H}}_M^{\square <} := \left( \begin{array}{llll} {{\widehat{\mu }}}_1^\square &{}\quad {{\widehat{\mu }}}_2^\square &{}\quad \cdots &{}\quad {{\widehat{\mu }}}_{M}^\square \\ {{\widehat{\mu }}}_2^\square &{}\quad {{\widehat{\mu }}}_3^\square &{}\quad \cdots &{}\quad {{\widehat{\mu }}}_{M+1}^\square \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ {{\widehat{\mu }}}_{M}^\square &{}\quad {{\widehat{\mu }}}_{M+1}^\square &{}\quad \cdots &{}\quad {{\widehat{\mu }}}_{2M-1}^\square \end{array} \right) . \nonumber \\ \end{aligned}$$
(7)

To reduce the computational costs and improve the numerical stability, we also introduce a low-rank approximation of \({\widehat{H}}_M^\square \) with numerical rank \({\widehat{m}}\), based on the singular value decomposition:

$$\begin{aligned} {\widehat{H}}_M^\square = [U_{\mathrm{H}1}, U_{\mathrm{H}2}] \left[ \begin{array}{ll} \varSigma _{\mathrm{H}1} &{}\quad O \\ O &{}\quad \varSigma _{ \mathrm{H}2} \end{array} \right] \left[ \begin{array}{ll} W_{ \mathrm{H}1}^{\mathrm{H}} \\ W_{ \mathrm{H}2}^{\mathrm{H}} \end{array} \right] \approx U_{ \mathrm{H}1} \varSigma _{ \mathrm{H}1} W_{ \mathrm{H}1}^\mathrm{H}. \end{aligned}$$

In this way, the target eigenvalue problem (1) reduces to an \({\widehat{m}}\)-dimensional standard eigenvalue problem, i.e.,

$$\begin{aligned} U_{ \mathrm{H}1}^{\mathrm{H}} {\widehat{H}}_M^{\square <} W_{ \mathrm{H}1} \varSigma _{ \mathrm{H}1}^{-1} {{\varvec{t}}}_i = \theta _i {{\varvec{t}}}_i. \end{aligned}$$

The approximate eigenpairs are obtained as \(({\widetilde{\lambda }}_i,\widetilde{{\varvec{x}}}_i) = (\theta _i, {\widehat{S}} W_{ \mathrm{H}1}\varSigma _{\mathrm{H1}}^{-1} {{\varvec{t}}}_i)\). The algorithm of the block SS–Hankel method is shown in Algorithm 1.

[Algorithm 1: The block SS–Hankel method]
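A minimal sketch of Algorithm 1, reusing the illustrative helper build_S_hat from Sect. 2.2; the rank tolerance tol is an assumed choice, not prescribed by the algorithm.

```python
# A minimal sketch of the block SS-Hankel method (Algorithm 1), reusing the
# illustrative helper build_S_hat; the rank tolerance is an assumed choice.
def block_ss_hankel(A, B, V, Vt, center, radius, M, N, tol=1e-12):
    L = V.shape[1]
    S2 = build_S_hat(A, B, V, center, radius, 2 * M, N)  # S_hat_0..S_hat_{2M-1}
    mu = [Vt.conj().T @ S2[:, k * L:(k + 1) * L] for k in range(2 * M)]
    H = np.block([[mu[i + j] for j in range(M)] for i in range(M)])       # (7)
    Hs = np.block([[mu[i + j + 1] for j in range(M)] for i in range(M)])  # (7)
    U, s, Wh = np.linalg.svd(H)                     # low-rank approximation
    m = int(np.sum(s > tol * s[0]))                 # numerical rank m_hat
    U1, s1, W1 = U[:, :m], s[:m], Wh[:m, :].conj().T
    theta, t = np.linalg.eig((U1.conj().T @ Hs @ W1) / s1)
    X = S2[:, :L * M] @ ((W1 / s1) @ t)             # x_i = S_hat W1 Sigma1^{-1} t_i
    return theta, X
```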

2.2.2 The block SS–RR method

Theorem 3 indicates that the target eigenpairs can be computed by the Rayleigh–Ritz procedure over the subspace \({\mathcal {R}}(S)\), i.e.,

$$\begin{aligned} S^{\mathrm{H}} A S {{\varvec{t}}}_i = \theta _i S^{\mathrm{H}} B S {{\varvec{t}}}_i. \end{aligned}$$

The above forms the basis of the block SS–RR method [12]. In practice, the Rayleigh–Ritz procedure uses the approximated subspace \({\mathcal {R}}({\widehat{S}}) \approx {\mathcal {R}}(S)\) and a low-rank approximation of \({\widehat{S}}\):

$$\begin{aligned} {\widehat{S}} = [U_1, U_2] \left[ \begin{array}{ll} \varSigma _1 &{}\quad O \\ O &{}\quad \varSigma _2 \end{array} \right] \left[ \begin{array}{ll} W_1^{\mathrm{H}} \\ W_2^{\mathrm{H}} \end{array} \right] \approx U_1 \varSigma _1 W_1^{\mathrm{H}}. \end{aligned}$$

In this case, the reduced problem is given by

$$\begin{aligned} U_1^{\mathrm{H}}AU_1 {{\varvec{t}}}_i = \theta _i U_1^{\mathrm{H}}BU_1 {{\varvec{t}}}_i. \end{aligned}$$

The approximate eigenpairs are obtained as \(({\widetilde{\lambda }}_i,\widetilde{{\varvec{x}}}_i) = (\theta _i, U_1 {{\varvec{t}}}_i)\). The algorithm of the block SS–RR method is shown in Algorithm 2.

[Algorithm 2: The block SS–RR method]
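A minimal sketch of Algorithm 2, again reusing the illustrative helper build_S_hat; the rank tolerance is an assumed choice.

```python
# A minimal sketch of the block SS-RR method (Algorithm 2), reusing the
# illustrative helper build_S_hat; the rank tolerance is an assumed choice.
def block_ss_rr(A, B, V, center, radius, M, N, tol=1e-12):
    S = build_S_hat(A, B, V, center, radius, M, N)
    U, s, _ = np.linalg.svd(S, full_matrices=False)     # low-rank approximation
    U1 = U[:, s > tol * s[0]]                           # orthonormal basis U_1
    theta, t = la.eig(U1.conj().T @ A @ U1, U1.conj().T @ B @ U1)
    return theta, U1 @ t                                # approximate eigenpairs
```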

2.2.3 The FEAST eigensolver

The algorithm of the accelerated subspace iteration with the Rayleigh–Ritz procedure for solving Hermitian generalized eigenvalue problems is given in Algorithm 3. Here, \(\rho (A,B)\) is called an accelerator. When \(\rho (A,B) = B^{-1}A\), Algorithm 3 becomes the standard subspace iteration with the Rayleigh–Ritz procedure, which computes the L largest-magnitude eigenvalues and their corresponding eigenvectors.

[Algorithm 3: The accelerated subspace iteration with the Rayleigh–Ritz procedure]

The FEAST eigensolver [17], proposed for Hermitian generalized eigenvalue problems, is based on an accelerated subspace iteration with the Rayleigh–Ritz procedure. In the FEAST eigensolver, the accelerator \(\rho (A,B)\) is set as

$$\begin{aligned} \rho (A,B) = \sum _{j=1}^N \omega _j (z_j B - A)^{-1} B \approx \frac{1}{2 \pi \mathrm{i} } \oint _\varGamma (zB-A)^{-1} B \mathrm{d}z, \end{aligned}$$

based on Theorem 3. Therefore, the FEAST eigensolver computes the eigenvalues located in \(\varOmega \) and their corresponding eigenvectors. For numerical integration, the FEAST eigensolver uses the Gauß–Legendre quadrature or the Zolotarev quadrature; see [9, 17].

In each iteration of the FEAST eigensolver, the target eigenvalue problem (1) is reduced to a small eigenvalue problem, i.e.,

$$\begin{aligned} {\widehat{S}}_0^{\mathrm{H}}A{\widehat{S}}_0 {{\varvec{t}}}_i = \theta _i {\widehat{S}}_0^{\mathrm{H}}B{\widehat{S}}_0 {{\varvec{t}}}_i, \end{aligned}$$

based on the Rayleigh–Ritz procedure. The approximate eigenpairs are obtained as \(({\widetilde{\lambda }}_i,\widetilde{{\varvec{x}}}_i) = (\theta _i, {\widehat{S}}_0 {{\varvec{t}}}_i)\). The algorithm of the FEAST eigensolver is shown in Algorithm 4.

[Algorithm 4: The FEAST eigensolver]
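A minimal sketch of the FEAST iteration (Algorithm 4), reusing the illustrative helper build_S_hat; the fixed iteration count stands in for the convergence test of the actual algorithm.

```python
# A minimal sketch of the FEAST iteration (Algorithm 4), reusing the
# illustrative helper build_S_hat; the fixed iteration count replaces the
# convergence test of the actual algorithm.
def feast(A, B, X, center, radius, N, iters=5):
    for _ in range(iters):
        S0 = build_S_hat(A, B, X, center, radius, 1, N)   # accelerator applied to X
        theta, t = la.eig(S0.conj().T @ A @ S0, S0.conj().T @ B @ S0)
        X = S0 @ t                                        # Ritz vectors
    return theta, X
```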

2.2.4 The block SS–Arnoldi method

From Theorems 2 and 3 and the definition of \(C_\varOmega :=Q_\varOmega J_\varOmega {\widetilde{Q}}_\varOmega ^\mathrm{H}\), we have the following three theorems [13].

Theorem 5

The subspace \({\mathcal {R}}(S)\) can be expressed as the block Krylov subspace associated with the matrix \(C_\varOmega \):

$$\begin{aligned} {\mathcal {R}}(S) = {\mathcal {K}}_M^\square (C_\varOmega ,S_0). \end{aligned}$$

Theorem 6

Let m be the number of target eigenvalues (counting multiplicity). Then, if \(\mathrm{rank}(S) = m\), the target eigenvalue problem (1) is equivalent to a standard eigenvalue problem of the form

$$\begin{aligned} C_\varOmega {{\varvec{x}}}_i = \lambda _i {{\varvec{x}}}_i, \quad {{\varvec{x}}}_i \in {\mathcal {R}}(S) = {\mathcal {K}}_M^\square (C_\varOmega ,S_0). \end{aligned}$$
(8)

Theorem 7

Any \(E_k \in {\mathcal {B}}_k^\square (C_\varOmega ,S_0)\) can be expressed in the following form:

$$\begin{aligned} E_k = \frac{1}{2 \pi \mathrm{i}} \oint _\varGamma \sum _{i=0}^{k-1} z^i (zB-A)^{-1} B V \alpha _i \mathrm{d}z, \quad \alpha _i \in {\mathbb {C}}^{L \times L}. \end{aligned}$$

Then, the matrix multiplication of \(C_\varOmega \) by \(E_k\) becomes

$$\begin{aligned} C_\varOmega E_k = \frac{1}{2 \pi \mathrm{i}} \oint _\varGamma z \sum _{i=0}^{k-1} z^i (zB-A)^{-1} B V \alpha _i \mathrm{d}z. \end{aligned}$$

From Theorems 5 and 6, we observe that the target eigenpairs \((\lambda _i,{{\varvec{x}}}_i),\lambda _i \in \varOmega \) can be computed by the block Arnoldi method with the block Krylov subspace \({\mathcal {K}}_M^\square (C_\varOmega ,S_0)\) for solving the standard eigenvalue problem (8). Here, we note that the matrix \(C_\varOmega \) is not explicitly constructed. Instead, the matrix multiplication of \(C_\varOmega \) can be computed via the contour integral using Theorem 7. By approximating the contour integral by a numerical integration rule, the algorithm of the block SS–Arnoldi method is derived (Algorithm 5).
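As an illustration of this implicit multiplication, the following sketch assumes that the per-point solves Y[j] = \((z_j B - A)^{-1} B V\) have been precomputed and that a block \(E_k\) is represented by its coefficients \(\alpha _i\); the function name apply_C is an illustrative choice.

```python
# A minimal sketch of the implicit multiplication by C_Omega: a block
# E in B_M(C, S_hat_0) is represented by its coefficients alpha_i, and
# Y[j] = (z_j B - A)^{-1} B V is precomputed once per quadrature point.
def apply_C(Y, z, w, alpha):
    N, M = len(Y), len(alpha)
    # E     = sum_j w_j       sum_i z_j^i Y_j alpha_i
    # C E   = sum_j w_j z_j   sum_i z_j^i Y_j alpha_i
    return sum(w[j] * z[j] * sum(z[j]**i * Y[j] @ alpha[i] for i in range(M))
               for j in range(N))
```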

A low-rank approximation technique to reduce the computational costs and improve stability is not applied in the current version of the block SS–Arnoldi method [13]. Improvements of the block SS–Arnoldi method have been developed in [14].

[Algorithm 5: The block SS–Arnoldi method]

2.2.5 The Beyn method

The Beyn method is a nonlinear eigensolver based on the contour integral [6]. In this subsection, we consider the algorithm of the Beyn method for solving the generalized eigenvalue problem (1).

Let the singular value decomposition of \(S_0\) be \(S_0 = U_{0} \varSigma _{0} W_{0}^{\mathrm{H}}\), where \(U_0 \in {\mathbb {C}}^{n \times m}, \varSigma _0 = \mathrm{diag}(\sigma _1, \sigma _2, \dots , \sigma _m), W_0 \in {\mathbb {C}}^{L \times m}\) and \(\mathrm{rank}(S_0) = m\). Then, from Theorem 2, we have

$$\begin{aligned} S_0 = Q_\varOmega {\widetilde{Q}}_\varOmega ^{\mathrm{H}}V = U_0 \varSigma _0 W_0^\mathrm{H}, \quad S_1 = Q_\varOmega J_\varOmega {\widetilde{Q}}_\varOmega ^{\mathrm{H}}V. \end{aligned}$$
(9)

Since \({\mathcal {R}}(Q_\varOmega ) = {\mathcal {R}}(U_0)\), we obtain

$$\begin{aligned} Q_\varOmega = U_0 Z, \quad Z = U_0^{\mathrm{H}} Q_\varOmega \in {\mathbb {C}}^{m \times m}, \end{aligned}$$
(10)

where Z is nonsingular. With (9) and (10), we find \(U_0 Z {\widetilde{Q}}_\varOmega ^{\mathrm{H}} V = U_0 \varSigma _0 W_0^{\mathrm{H}}\) and thus \({\widetilde{Q}}_\varOmega ^{\mathrm{H}} V = Z^{-1} \varSigma _0 W_0^{\mathrm{H}}\). This leads to

$$\begin{aligned} U_0^{\mathrm{H}} S_1 = Z J_\varOmega {\widetilde{Q}}_\varOmega ^{\mathrm{H}} V = Z J_\varOmega Z^{-1} \varSigma _0 W_0^{\mathrm{H}}. \end{aligned}$$

Therefore, we have

$$\begin{aligned} Z J_\varOmega Z^{-1} = U_0^{\mathrm{H}} S_1 W_0 \varSigma _0^{-1}. \end{aligned}$$

This means that the target eigenpairs \((\lambda _i,{{\varvec{x}}}_i), \lambda _i \in \varOmega \) are computed by solving

$$\begin{aligned} U_0^{\mathrm{H}} S_1 W_0 \varSigma _0^{-1} {{\varvec{t}}}_i = \theta _i {{\varvec{t}}}_i, \end{aligned}$$

where \((\lambda _i,{{\varvec{x}}}_i) = (\theta _i, U_0 {{\varvec{t}}}_i)\) [6].

In practice, we compute a low-rank approximation of \({\widehat{S}}_0\) by the singular value decomposition, i.e.,

$$\begin{aligned} {\widehat{S}}_0 = [U_{0,1}, U_{0,2}] \left[ \begin{array}{ll} \varSigma _{0,1} &{}\quad O \\ O &{}\quad \varSigma _{0,2} \end{array} \right] \left[ \begin{array}{l} W_{0,1}^{\mathrm{H}} \\ W_{0,2}^{\mathrm{H}} \end{array} \right] \approx U_{0,1} \varSigma _{0,1} W_{0,1}^{\mathrm{H}}, \end{aligned}$$
(11)

which reduces the target eigenvalue problem (1) to the standard eigenvalue problem

$$\begin{aligned} U_{0,1}^{\mathrm{H}} {\widehat{S}}_1 W_{0,1} \varSigma _{0,1}^{-1} {{\varvec{t}}}_i = \theta _i {{\varvec{t}}}_i. \end{aligned}$$
(12)

The approximate eigenpairs are obtained as \(({\widetilde{\lambda }}_i,\widetilde{{\varvec{x}}}_i) = (\theta _i, U_{0,1} {{\varvec{t}}}_i)\). The algorithm of the Beyn method for solving the generalized eigenvalue problem (1) is shown in Algorithm 6.

[Algorithm 6: The Beyn method]
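A minimal sketch of Algorithm 6 for the linear problem (1), reusing the illustrative helper build_S_hat; the rank tolerance is an assumed choice.

```python
# A minimal sketch of the Beyn method (Algorithm 6) for the generalized
# eigenvalue problem (1), reusing the illustrative helper build_S_hat.
def beyn(A, B, V, center, radius, N, tol=1e-12):
    L = V.shape[1]
    S = build_S_hat(A, B, V, center, radius, 2, N)       # [S_hat_0, S_hat_1]
    S0, S1 = S[:, :L], S[:, L:]
    U, s, Wh = np.linalg.svd(S0, full_matrices=False)    # low-rank part, cf. (11)
    m = int(np.sum(s > tol * s[0]))                      # numerical rank
    U1, s1, W1 = U[:, :m], s[:m], Wh[:m, :].conj().T
    theta, t = np.linalg.eig((U1.conj().T @ S1 @ W1) / s1)   # (12)
    return theta, U1 @ t
```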

3 Theoretical preliminaries for map building

As shown in Sect. 2, the contour integral-based eigensolvers are based on the properties of the matrices S and \(S_k\) (Theorems 2 and 3). The practical algorithms are then derived by approximating the integrals numerically. As theoretical preliminaries for the map building in Sect. 4, this section explores the properties of the approximated matrices \({\widehat{S}}\) and \({\widehat{S}}_k\). Here, we assume that \((z_j,\omega _j)\) satisfy the following condition:

$$\begin{aligned} \sum _{j=1}^N \omega _j z_j^k \left\{ \begin{array}{ll} = 0, &{} k = 0, 1, \dots , N-2 \\ \ne 0, &{} k = -1 \end{array} \right. . \end{aligned}$$
(13)
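As a quick sanity check, condition (13) can be verified numerically for the N-point trapezoidal rule on a circle centered at the origin; the snippet below is purely illustrative.

```python
# Illustrative check of condition (13) for the N-point trapezoidal rule on
# the circle |z| = radius (centered at the origin).
import numpy as np
N, radius = 16, 1.0
theta = 2 * np.pi * (np.arange(N) + 0.5) / N
z = radius * np.exp(1j * theta)                           # points z_j
w = radius * np.exp(1j * theta) / N                       # weights omega_j
print(max(abs(np.sum(w * z**k)) for k in range(N - 1)))   # ~ 0 for k = 0..N-2
print(abs(np.sum(w / z)))                                 # = 1 != 0 for k = -1
```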

If the matrix pencil \(zB-A\) is diagonalizable and \((z_j,\omega _j)\) satisfies condition (13), we have

$$\begin{aligned} {\widehat{S}}_k = C^k {\widehat{S}}_0, \quad C = X_r \varLambda _r {\widetilde{X}}_r^{\mathrm{H}}, \end{aligned}$$

where \(X_r = [{{\varvec{x}}}_1, {{\varvec{x}}}_2, \ldots , {{\varvec{x}}}_r]\) is a matrix whose columns are the eigenvectors corresponding to the finite eigenvalues, \({\widetilde{X}}_r = [\widetilde{{\varvec{x}}}_1, \widetilde{{\varvec{x}}}_2, \ldots , \widetilde{{\varvec{x}}}_r]\) is a submatrix of \({\widetilde{X}} = X^{\mathrm{-H}}\) satisfying \({\widetilde{X}}_r^{\mathrm{H}} X_r = I\), and \(\varLambda _r = \mathrm{diag}(\lambda _1, \lambda _2, \dots , \lambda _r)\); see [15]. In the following analysis, we derive a similar relationship for the case in which the matrix pencil \(zB-A\) is non-diagonalizable. First, we define an upper triangular Toeplitz matrix as follows.

Definition 1

For \({{\varvec{a}}} = [a_1, a_2, \dots , a_n] \in {\mathbb {C}}^{1 \times n}\), define \(T_n({{\varvec{a}}})\) as the \(n \times n\) upper triangular Toeplitz matrix, i.e.,

$$\begin{aligned} T_n({{\varvec{a}}}) := \left( \begin{array}{llll} a_1 &{}\quad a_2 &{}\quad \cdots &{}\quad a_n \\ 0 &{}\quad a_1 &{}\quad \cdots &{}\quad a_{n-1} \\ \vdots &{}\quad \ddots &{}\quad \ddots &{}\quad \vdots \\ 0 &{}\quad \cdots &{}\quad 0 &{}\quad a_1 \end{array} \right) \in {\mathbb {C}}^{n \times n}. \end{aligned}$$

Let \({{\varvec{a}}} = [a_1, a_2, \dots , a_n], {{\varvec{b}}} = [b_1, b_2, \dots , b_n], {{\varvec{c}}} = [c_1, c_2, \dots , c_n] \in {\mathbb {C}}^{1 \times n}\) and \(\alpha , \beta \in {\mathbb {C}}\). Then, we have

$$\begin{aligned}&\alpha T_n({{\varvec{a}}}) + \beta T_n({{\varvec{b}}}) = T_n(\alpha {{\varvec{a}}} + \beta {{\varvec{b}}}), \end{aligned}$$
(14)
$$\begin{aligned}&T_n({{\varvec{a}}}) T_n({{\varvec{b}}}) = T_n({{\varvec{c}}}), \quad c_i = \sum _{j=1}^i a_j b_{i-j+1}, \quad i = 1, 2, \dots , n . \end{aligned}$$
(15)

Letting \({{\varvec{d}}} = [\alpha , \beta , 0, \dots , 0] \in {\mathbb {C}}^{1 \times n}\), we also have

$$\begin{aligned}&(T_n({{\varvec{d}}}))^{k} = T_n({{\varvec{d}}}^{(k)}), \quad {{\varvec{d}}}^{(k)} = \left[ \left( {\begin{array}{c}k\\ 0\end{array}}\right) \alpha ^k \beta ^0, \left( {\begin{array}{c}k\\ 1\end{array}}\right) \alpha ^{k-1} \beta ^1, \dots , \left( {\begin{array}{c}k\\ n-1\end{array}}\right) \alpha ^{k-n+1} \beta ^{n-1} \right] , \end{aligned}$$
(16)
$$\begin{aligned}&(T_n({{\varvec{d}}}))^{-1} = T_n({{\varvec{d}}}^{(-1)}), \quad {{\varvec{d}}}^{(-1)} = \left[ \frac{1}{\alpha }, -\frac{\beta }{\alpha ^2}, \dots , \frac{(-\beta )^{n-1}}{\alpha ^n} \right] . \end{aligned}$$
(17)
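These relations are straightforward to check numerically; the following snippet, purely illustrative, verifies (16) and (17) on a small example.

```python
# Illustrative numerical check of (16) and (17) for T_n([alpha, beta, 0, ...]).
import numpy as np
from math import comb
n, k, alpha, beta = 5, 3, 2.0, 0.5
T = sum(np.diag(np.full(n - q, v), k=q) for q, v in ((0, alpha), (1, beta)))
dk = [comb(k, q) * alpha**(k - q) * beta**q for q in range(n)]        # d^(k)
Tk = sum(np.diag(np.full(n - q, dk[q]), k=q) for q in range(n))
print(np.allclose(np.linalg.matrix_power(T, k), Tk))                  # (16): True
dinv = [(-beta)**q / alpha**(q + 1) for q in range(n)]                # d^(-1)
Tinv = sum(np.diag(np.full(n - q, dinv[q]), k=q) for q in range(n))
print(np.allclose(np.linalg.inv(T), Tinv))                            # (17): True
```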

Using these relations (14)–(17), we analyze properties of \({\widehat{S}}\) and \({\widehat{S}}_k\). From Theorem 1, we have

$$\begin{aligned} (zB-A)^{-1} =&Q \left[ \bigoplus _{i=1}^r \left( zI_{n_i}-J_{n_i}(\lambda _i) \right) ^{-1} \oplus \bigoplus _{i=r+1}^d \left( zJ_{n_i}(0)-I_{n_i} \right) ^{-1} \right] {\widetilde{P}}^{\mathrm{H}}, \\ B =&P \left[ \bigoplus _{i=1}^r I_{n_i} \oplus \bigoplus _{i=r+1}^d J_{n_i}(0) \right] {\widetilde{Q}}^{\mathrm{H}}, \end{aligned}$$

where \(P := {\widetilde{P}}^{\mathrm{-H}}, {\widetilde{Q}}^{\mathrm{H}} := Q^{-1}\). Therefore, the matrix \({\widehat{S}}_k\) can be written as

$$\begin{aligned} {\widehat{S}}_k&= \sum _{j=1}^N \omega _j z_j^k (z_j B - A)^{-1} B V \nonumber \\&= \sum _{j=1}^N \omega _j z_j^k Q \left\{ \bigoplus _{i=1}^r \left( z_jI_{n_i}-J_{n_i}(\lambda _i) \right) ^{-1} \right. \nonumber \\&\quad \left. \oplus \bigoplus _{i=r+1}^d \left[ \left( z_jJ_{n_i}(0)-I_{n_i} \right) ^{-1} J_{n_i}(0) \right] \right\} {\widetilde{Q}}^{\mathrm{H}} V \nonumber \\&= \left\{ \sum _{i=1}^r Q_i \left[ \sum _{j=1}^N \omega _j z_j^k \left( z_j I_{n_i} - J_{n_i}(\lambda _i) \right) ^{-1} \right] {\widetilde{Q}}_i^{\mathrm{H}} V \right\} \nonumber \\&\quad + \left\{ \sum _{i=r+1}^d Q_i \left[ \sum _{j=1}^N \omega _j z_j^k \left( z_j J_{n_i}(0) - I_{n_i} \right) ^{-1} J_{n_i}(0) \right] {\widetilde{Q}}_i^{\mathrm{H}} V \right\} , \end{aligned}$$
(18)

where \(Q_i\) and \({\widetilde{Q}}_i\) are \(n \times n_i\) submatrices of Q and \({\widetilde{Q}}\) respectively, corresponding to the i-th Jordan block, i.e., \(Q = [Q_1, Q_2, \dots , Q_d], {\widetilde{Q}} = [{\widetilde{Q}}_1, {\widetilde{Q}}_2, \dots , {\widetilde{Q}}_d]\).

First, we consider the first term of \({\widehat{S}}_k\) in (18):

$$\begin{aligned} {\widehat{S}}_k^{(1)} := \sum _{i=1}^r Q_i \left[ \sum _{j=1}^N \omega _j z_j^k \left( z_j I_{n_i} - J_{n_i}(\lambda _i) \right) ^{-1} \right] {\widetilde{Q}}_i^{\mathrm{H}} V. \end{aligned}$$

From the relation

$$\begin{aligned} z_j I_{n_i} - J_{n_i}(\lambda _i) = T_{n_i}([z_j-\lambda _i, -1, 0, \dots , 0]) \end{aligned}$$

and (17), we have

$$\begin{aligned} (z_j I_{n_i} - J_{n_i}(\lambda _i))^{-1} = T_{n_i}\left( \left[ \frac{1}{z_j-\lambda _i}, \frac{1}{(z_j-\lambda _i)^2}, \dots , \frac{1}{(z_j-\lambda _i)^{n_i}} \right] \right) . \end{aligned}$$

Thus, defining \({{\varvec{f}}}_{k}(\lambda _i) \in {\mathbb {C}}^{1 \times n_i}\) as

$$\begin{aligned} {{\varvec{f}}}_{k}(\lambda _i) := \sum _{j=1}^N \omega _j z_j^k \left[ \frac{1}{z_j-\lambda _i}, \frac{1}{(z_j-\lambda _i)^2}, \dots , \frac{1}{(z_j-\lambda _i)^{n_i}} \right] , \end{aligned}$$
(19)

from (14), we have

$$\begin{aligned} \sum _{j=1}^N \omega _j z_j^k (z_j I_{n_i} - J_{n_i}(\lambda _i))^{-1} = T_{n_i}\left( {{\varvec{f}}}_{k}(\lambda _i) \right) . \end{aligned}$$
(20)

Therefore, \({\widehat{S}}^{(1)}_k\) can be rewritten as

$$\begin{aligned} {\widehat{S}}^{(1)}_k = \sum _{i=1}^r Q_i T_{n_i}\left( {{\varvec{f}}}_{k}(\lambda _i) \right) {\widetilde{Q}}_i^{\mathrm{H}} V. \end{aligned}$$
(21)

Here, the following propositions hold.

Proposition 1

Suppose that \((\omega _j,z_j)\) satisfies condition (13). Then, for any \(\lambda _i \ne 0\) and \(0 \le k \le N+p-2, p \ge 1\), the relation

$$\begin{aligned} \sum _{j=1}^N \frac{\omega _j z_j^k}{(z_j - \lambda _i)^p} = \lambda _i^k \sum _{j=1}^N \frac{\omega _j}{(z_j - \lambda _i)^p} \sum _{q=0}^{p-1} \left( {\begin{array}{c}k\\ q\end{array}}\right) \left( \frac{z_j - \lambda _i}{\lambda _i} \right) ^q \end{aligned}$$
(22)

is satisfied.

Proof

Since \(\lambda _i \ne 0\), we have

$$\begin{aligned} \frac{\omega _j z_j^k}{(z_j - \lambda _i)^p} = \frac{\omega _j}{(z_j-\lambda _i)^p} \lambda _i^k \left( \frac{z_j}{\lambda _i} \right) ^k = \frac{\omega _j}{(z_j-\lambda _i)^p} \lambda _i^k \left( 1 + \frac{z_j - \lambda _i}{\lambda _i} \right) ^k. \end{aligned}$$
(23)

Here, from the binomial theorem \((a+b)^k = \sum _{q=0}^k \left( {\begin{array}{c}k\\ q\end{array}}\right) a^{k-q}b^q\), (23) is rewritten as

$$\begin{aligned} \frac{\omega _j z_j^k}{(z_j - \lambda _i)^p} = \frac{\omega _j}{(z_j-\lambda _i)^p} \lambda _i^k \sum _{q=0}^k \left( {\begin{array}{c}k\\ q\end{array}}\right) \left( \frac{z_j - \lambda _i}{\lambda _i} \right) ^q. \end{aligned}$$

Therefore, the left-hand side of (22) is

$$\begin{aligned}&\sum _{j=1}^N \frac{\omega _j z_j^k}{(z_j - \lambda _i)^p} \nonumber \\&\quad = \sum _{j=1}^N \frac{\omega _j}{(z_j-\lambda _i)^p} \lambda _i^k \sum _{q=0}^k \left( {\begin{array}{c}k\\ q\end{array}}\right) \left( \frac{z_j - \lambda _i}{\lambda _i} \right) ^q \nonumber \\&\quad = \sum _{j=1}^N \frac{\omega _j}{(z_j-\lambda _i)^p} \lambda _i^k \left[ \sum _{q=0}^{p-1} \left( {\begin{array}{c}k\\ q\end{array}}\right) \left( \frac{z_j - \lambda _i}{\lambda _i} \right) ^q + \sum _{q=p}^{k} \left( {\begin{array}{c}k\\ q\end{array}}\right) \left( \frac{z_j - \lambda _i}{\lambda _i} \right) ^q \right] \nonumber \\&\quad = \left[ \lambda _i^k \sum _{q=0}^{p-1} \left( {\begin{array}{c}k\\ q\end{array}}\right) \lambda _i^{-q} \sum _{j=1}^N \omega _j (z_j - \lambda _i)^{q-p} \right] \nonumber \\&\qquad + \left[ \lambda _i^k \sum _{q=p}^k \left( {\begin{array}{c}k\\ q\end{array}}\right) \lambda _i^{-q} \sum _{j=1}^N \omega _j (z_j - \lambda _i)^{q-p} \right] . \end{aligned}$$
(24)

Because condition (13) is satisfied, we have

$$\begin{aligned} \sum _{j=1}^N \omega _j (z_j - \lambda _i)^{q-p} = 0, \quad q = p, p+1, \dots , N+p-2, \end{aligned}$$

thus, for \(k \le N+p-2\), the second term of (24) vanishes. Therefore, for \(k = 0, 1, \ldots , N+p-2\), we obtain

$$\begin{aligned} \sum _{j=1}^N \frac{\omega _j z_j^k}{(z_j - \lambda _i)^p} = \sum _{j=1}^N \frac{\omega _j}{(z_j-\lambda _i)^p} \lambda _i^k \sum _{q=0}^{p-1} \left( {\begin{array}{c}k\\ q\end{array}}\right) \left( \frac{z_j - \lambda _i}{\lambda _i} \right) ^q, \end{aligned}$$

which proves Proposition 1. \(\square \)

Proposition 2

Suppose that \((\omega _j,z_j)\) satisfies condition (13). Then, for any \(0 \le k \le N-1\), the relation

$$\begin{aligned} T_{n_i}\left( {{\varvec{f}}}_k(\lambda _i) \right) = (J_{n_i}(\lambda _i))^k T_{n_i}\left( {{\varvec{f}}}_0(\lambda _i) \right) \end{aligned}$$
(25)

is satisfied, where \({{\varvec{f}}}_k(\lambda _i)\) is defined by (19) and \(0^0 = 1\).

Proof

We first consider the case of \(\lambda _i=0\). From \(J_{n_i}(0) = T_{n_i}([0, 1, 0, \ldots , 0])\), there exists a vector \({{\varvec{t}}}_{0,k} \in {\mathbb {C}}^{1 \times n_i}\) satisfying

$$\begin{aligned} T_{n_i}({{\varvec{t}}}_{0,k}) := (J_{n_i}(0))^k T_{n_i}\left( {{\varvec{f}}}_0(0) \right) . \end{aligned}$$

Then, from (15) and (16), the p-th element \(( {{\varvec{t}}}_{0,k} )_p\) of \({{\varvec{t}}}_{0,k}\) can be written as

$$\begin{aligned} ( {{\varvec{t}}}_{0,k} )_p = \left\{ \begin{array}{ll} 0 &{}\quad (1 \le p \le k) \\ \sum _{j=1}^N \omega _j z_j^{k-p} &{}\quad (k < p \le n_i) \end{array} \right. . \end{aligned}$$

On the other hand, from the definition (19), the vector \({{\varvec{f}}}_k(0)\) can be written as

$$\begin{aligned} {{\varvec{f}}}_k(0) = \sum _{j=1}^N \omega _j [z_j^{k-1}, z_j^{k-2}, \ldots , z_j^{k-n_i}]. \end{aligned}$$

Since the first k elements of \({{\varvec{f}}}_k(0)\) are 0 from condition (13), we have \({{\varvec{f}}}_k(0) = {{\varvec{t}}}_{0,k}\). Therefore, (25) is satisfied for \(\lambda _i = 0\).

Next, we consider the case of \(\lambda _i \ne 0\). From \(J_{n_i}(\lambda _i) = T_{n_i}([\lambda _i,1,0,\dots ,0])\) and (16), we have

$$\begin{aligned} (J_{n_i}(\lambda _i))^k = T_{n_i}\left( \left[ \lambda _i^k, \left( {\begin{array}{c}k\\ 1\end{array}}\right) \lambda _i^{k-1}, \dots , \left( {\begin{array}{c}k\\ n_i-1\end{array}}\right) \lambda _i^{k-n_i+1} \right] \right) . \end{aligned}$$
(26)

Let \({{\varvec{t}}}_k \in {\mathbb {C}}^{1 \times n_i}\) be a vector satisfying

$$\begin{aligned} T_{n_i}({{\varvec{t}}}_k) := (J_{n_i}(\lambda _i))^k T_{n_i}\left( {{\varvec{f}}}_0(\lambda _i) \right) . \end{aligned}$$

Then, from (15), (26) and the definition of \({{\varvec{f}}}_k(\lambda _i)\) (19), the p-th element \(({{\varvec{t}}}_k)_p\) of \({{\varvec{t}}}_k\) can be written as

$$\begin{aligned} ({{\varvec{t}}}_k)_p&= \sum _{q = 1}^p \left( {\begin{array}{c}k\\ q-1\end{array}}\right) \lambda _i^{k-q+1} \sum _{j=1}^N \omega _j \frac{1}{(z_j-\lambda _i)^{p-q+1}} \\&= \lambda _i^k \sum _{j=1}^N \frac{\omega _j}{(z_j-\lambda _i)^p} \sum _{q = 1}^p \left( {\begin{array}{c}k\\ q-1\end{array}}\right) \frac{(z_j-\lambda _i)^{q-1}}{ \lambda _i^{q-1}} \\&= \lambda _i^k \sum _{j=1}^N \frac{\omega _j}{(z_j-\lambda _i)^p} \sum _{q = 0}^{p-1} \left( {\begin{array}{c}k\\ q\end{array}}\right) \left( \frac{z_j-\lambda _i}{ \lambda _i} \right) ^q. \end{aligned}$$

By Proposition 1, for \(0 \le k \le N+p-2\), we obtain

$$\begin{aligned} ({{\varvec{t}}}_k)_p = \sum _{j=1}^N \frac{\omega _j z_j^k}{(z_j - \lambda _i)^p}. \end{aligned}$$

Therefore, for \(0 \le k \le N-1\), we have

$$\begin{aligned} {{\varvec{t}}}_k = \left[ \sum _{j=1}^N \frac{\omega _j z_j^k}{z_j-\lambda _i}, \sum _{j=1}^N \frac{\omega _j z_j^k}{(z_j-\lambda _i)^2}, \dots , \sum _{j=1}^N \frac{\omega _j z_j^k}{(z_j-\lambda _i)^{n_i}} \right] = {{\varvec{f}}}_k(\lambda _i), \end{aligned}$$

and

$$\begin{aligned} T_{n_i}\left( {{\varvec{f}}}_k(\lambda _i) \right) = T_{n_i}\left( {{\varvec{t}}}_k \right) = (J_{n_i}(\lambda _i))^k T_{n_i}\left( {{\varvec{f}}}_0(\lambda _i) \right) \end{aligned}$$

is satisfied, proving Proposition 2. \(\square \)

From Proposition 2, by substituting (25) into (21) and using (20), we obtain

$$\begin{aligned} {\widehat{S}}_k^{(1)}&= \sum _{i=1}^r Q_i \left( J_{n_i}(\lambda _i) \right) ^k T_{n_i}({{\varvec{f}}}_0(\lambda _i)) {\widetilde{Q}}_i^{\mathrm{H}} V \nonumber \\&= \sum _{i=1}^r Q_i \left( J_{n_i}(\lambda _i) \right) ^k \left[ \sum _{j=1}^N \omega _j \left( z_j I_{n_i} - J_{n_i}(\lambda _i) \right) ^{-1} \right] {\widetilde{Q}}_i^{\mathrm{H}} V \end{aligned}$$
(27)

for any \(0 \le k \le N - 1\).

Now consider the second term of \({\widehat{S}}_k\) in (18), i.e.,

$$\begin{aligned} {\widehat{S}}_k^{(2)} := \sum _{i=r+1}^d Q_i \left[ \sum _{j=1}^N \omega _j z_j^k \left( z_j J_{n_i}(0) - I_{n_i} \right) ^{-1} J_{n_i}(0) \right] {\widetilde{Q}}_i^{\mathrm{H}} V. \end{aligned}$$

From the relations

$$\begin{aligned}&z_j J_{n_i}(0) - I_{n_i} = T_{n_i}([-1, z_j, 0, \dots , 0]), \\&J_{n_i}(0) = T_{n_i}([0, 1, 0, \dots , 0]) \end{aligned}$$

and (15) and (17), we have

$$\begin{aligned} \left( z_j J_{n_i}(0) - I_{n_i}\right) ^{-1} J_{n_i}(0) = - T_{n_i}([0, 1, z_j, z_j^2, \dots , z_j^{n_i-2}]). \end{aligned}$$

In addition, from (14), we have

$$\begin{aligned}&\sum _{j=1}^N \omega _j z_j^k \left( z_j J_{n_i}(0) - I_{n_i} \right) ^{-1} J_{n_i}(0) \\&\quad = - T_{n_i} \left( \left[ 0, \sum _{j=1}^N \omega _j z_j^k, \sum _{j=1}^N \omega _j z_j^{k+1}, \dots , \sum _{j=1}^N \omega _j z_j^{k+n_i-2}\right] \right) . \end{aligned}$$

Here, because \((z_j,\omega _j)\) satisfies condition (13),

$$\begin{aligned} \sum _{j=1}^N \omega _j z_j^k \left( z_j J_{n_i}(0) - I_{n_i} \right) ^{-1} J_{n_i}(0) = O \end{aligned}$$

is satisfied for any \(0 \le k \le N-n_i\). Therefore, letting

$$\begin{aligned} \eta := \max _{r+1 \le i \le d} n_i, \end{aligned}$$

the second term of \({\widehat{S}}_k\) is O for any \(0 \le k \le N - \eta \), i.e.,

$$\begin{aligned} {\widehat{S}}_k^{(2)} = O. \end{aligned}$$
(28)

From (27) and (28), we have the following theorems.

Theorem 8

Suppose that \((\omega _j,z_j)\) satisfies condition (13). Then, we have

$$\begin{aligned} {\widehat{S}}_k = C^k {\widehat{S}}_0, \quad C = Q_{1:r} J_{1:r} {\widetilde{Q}}_{1:r}^{\mathrm{H}}, \end{aligned}$$

for any \(0 \le k \le N-\eta \), where

$$\begin{aligned} Q_{1:r} := [Q_1, Q_2, \dots , Q_r], \quad {\widetilde{Q}}_{1:r} := [{\widetilde{Q}}_1, {\widetilde{Q}}_2, \dots , {\widetilde{Q}}_r], \quad J_{1:r} := \bigoplus _{i=1}^r J_{n_i}(\lambda _i). \end{aligned}$$

Proof

From (27) and (28), we have

$$\begin{aligned} {\widehat{S}}_k = \sum _{i=1}^r Q_i (J_{n_i}(\lambda _i))^k \left[ \sum _{j=1}^N \omega _j \left( z_j I_{n_i} - J_{n_i}(\lambda _i) \right) ^{-1} \right] {\widetilde{Q}}_i^{\mathrm{H}} V, \end{aligned}$$

for any \(0 \le k \le N-\eta \). Here, we let

$$\begin{aligned}&F_{n_i} := T_{n_i}\left( {{\varvec{f}}}_0(\lambda _i)\right) = \sum _{j=1}^N \omega _j \left( z_j I_{n_i} - J_{n_i}(\lambda _i) \right) ^{-1}, \\&F_{1:r} := \bigoplus _{i=1}^r F_{n_i}, \end{aligned}$$

then we obtain

$$\begin{aligned} {\widehat{S}}_k&= \sum _{i=1}^r Q_i \left( J_{n_i}(\lambda _i) \right) ^k F_{n_i} {\widetilde{Q}}_i^{\mathrm{H}} V \\&= Q_{1:r} J_{1:r}^k F_{1:r} {\widetilde{Q}}_{1:r}^{\mathrm{H}} V \\&= (Q_{1:r} J_{1:r} {\widetilde{Q}}_{1:r}^{\mathrm{H}})^k (Q_{1:r} F_{1:r} {\widetilde{Q}}_{1:r}^{\mathrm{H}} V) \\&= C^k {\widehat{S}}_0. \end{aligned}$$

Therefore, Theorem 8 is proven. \(\square \)
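Theorem 8 can also be checked numerically in the simplest setting \(B = I\), where the pencil has no infinite eigenvalues and C coincides with A itself; the snippet below is purely illustrative, and the contour is chosen to avoid the spectrum (the identity holds regardless of which eigenvalues lie inside).

```python
# Illustrative check of Theorem 8 with B = I (no infinite eigenvalues, so
# C = A and the identity S_hat_k = C^k S_hat_0 holds for 0 <= k <= N-1).
import numpy as np
import scipy.linalg as la
rng = np.random.default_rng(0)
n, L, N, k = 8, 2, 16, 3
A, B = rng.standard_normal((n, n)), np.eye(n)
V = rng.standard_normal((n, L))
theta = 2 * np.pi * (np.arange(N) + 0.5) / N
z = 10.0 * np.exp(1j * theta)                      # circle avoiding the spectrum
w = z / N                                          # weights omega_j
Y = [la.solve(zj * B - A, B @ V) for zj in z]
S0 = sum(w[j] * Y[j] for j in range(N))
Sk = sum(w[j] * z[j]**k * Y[j] for j in range(N))
print(np.linalg.norm(Sk - np.linalg.matrix_power(A, k) @ S0))   # ~ 0
```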

Theorem 9

If \((z_j,\omega _j)\) satisfies condition (13), then the standard eigenvalue problem

$$\begin{aligned} C {{\varvec{x}}}_i = \lambda _i {{\varvec{x}}}_i, \quad {{\varvec{x}}}_i \in {\mathcal {R}}(Q_{1:r}), \quad \lambda _i \in \varOmega \subset {\mathbb {C}}, \end{aligned}$$
(29)

is equivalent to the generalized eigenvalue problem (1).

Proof

From the definition of \(C := Q_{1:r} J_{1:r} {\widetilde{Q}}_{1:r}^{\mathrm{H}}\), the matrix C has the same right eigenpairs \((\lambda _i,{{\varvec{x}}}_i), i = 1, 2, \dots , r\) as the matrix pencil \(zB-A\), i.e., \({{\varvec{x}}}_i \in {\mathcal {R}}(Q_{1:r})\). The other eigenvalues of C are 0, and their corresponding eigenvectors are equivalent to the right eigenvectors associated with the infinite eigenvalues \(\lambda _i = \infty \) of \(zB-A\), i.e., \({{\varvec{x}}}_i \not \in {\mathcal {R}}(Q_{1:r})\). Therefore, Theorem 9 is proven. \(\square \)

4 Map of the relationships among contour integral-based eigensolvers

Section 3 analyzed the properties of the approximated matrices \({\widehat{S}}\) and \({\widehat{S}}_k\) (Theorem 8) and introduced the standard eigenvalue problem (29) equivalent to the target eigenvalue problem (1) (Theorem 9).

In this section, based on Theorems 8 and 9, we reconsider the algorithms of the contour integral-based eigensolvers in terms of projection methods and map their relationships, focusing on the subspace used, the type of projection and the problem to which the projection is implicitly applied.

4.1 Reconsideration of the contour integral-based eigensolvers

As described in Sect. 2, the subspaces \({\mathcal {R}}(S)\) and \({\mathcal {R}}(S_k)\) contain only the target eigenvectors \({{\varvec{x}}}_i, \lambda _i \in \varOmega \), based on Cauchy’s integral formula. In contrast, the subspaces \({\mathcal {R}}({\widehat{S}})\) and \({\mathcal {R}}({\widehat{S}}_k)\) are rich in the components of the target eigenvectors, as will be shown in Sect. 5.

4.1.1 The block SS–RR method and the FEAST eigensolver

The block SS–RR method and the FEAST eigensolver are easily reinterpreted as projection methods.

The block SS–RR method solves \(A{{\varvec{x}}}_i = \lambda _i B {{\varvec{x}}}_i\) by the Rayleigh–Ritz procedure on \({\mathcal {R}}({\widehat{S}})\). The block SS–RR method (Algorithm 2) is derived using a low-rank approximation of the matrix \({\widehat{S}}\), as shown in Sect. 2.2. Since \({\mathcal {R}}({\widehat{S}})\) is rich in the components of the target eigenvectors, the target eigenpairs are well approximated by the Rayleigh–Ritz procedure.

The FEAST eigensolver conducts an accelerated subspace iteration with the Rayleigh–Ritz procedure. In each iteration of the FEAST eigensolver, the Rayleigh–Ritz procedure solves \(A{{\varvec{x}}}_i = \lambda _i B {{\varvec{x}}}_i\) on \({\mathcal {R}}({\widehat{S}}_0)\). Like \({\mathcal {R}}({\widehat{S}})\) in the block SS–RR method, \({\mathcal {R}}({\widehat{S}}_0)\) is rich in the components of the target eigenvectors; therefore, the FEAST eigensolver also approximates the target eigenpairs well by the Rayleigh–Ritz procedure.

4.1.2 The block SS–Hankel method, the block SS–Arnoldi method and the Beyn method

From Theorem 8, we rewrite the block complex moments \({\widehat{\mu }}_k^\square \) of the block SS–Hankel method as

$$\begin{aligned} {\widehat{\mu }}_k^\square = {\widetilde{V}}^{\mathrm{H}} {\widehat{S}}_k = {\widetilde{V}}^{\mathrm{H}} C {\widehat{S}}_{k-1} = \cdots = {\widetilde{V}}^{\mathrm{H}} C^k {\widehat{S}}_0. \end{aligned}$$

Thus, the block Hankel matrices \({\widehat{H}}_M^\square , {\widehat{H}}_M^{\square <}\) (7) become

$$\begin{aligned}&{\widehat{H}}_M^\square = \left( \begin{array}{cccc} {\widetilde{V}}^{\mathrm{H}} {\widehat{S}}_0 &{}\quad {\widetilde{V}}^{\mathrm{H}} {\widehat{S}}_1 &{}\quad \cdots &{}\quad {\widetilde{V}}^{\mathrm{H}} {\widehat{S}}_{M-1} \\ {\widetilde{V}}^{\mathrm{H}} C {\widehat{S}}_0 &{}\quad {\widetilde{V}}^{\mathrm{H}} C {\widehat{S}}_1 &{}\quad \cdots &{}\quad {\widetilde{V}}^{\mathrm{H}} C {\widehat{S}}_{M-1} \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ {\widetilde{V}}^{\mathrm{H}} C^{M-1} {\widehat{S}}_0 &{}\quad {\widetilde{V}}^{\mathrm{H}} C^{M-1} {\widehat{S}}_1 &{}\quad \cdots &{}\quad {\widetilde{V}}^{\mathrm{H}} C^{M-1} {\widehat{S}}_{M-1} \end{array} \right) , \\&{\widehat{H}}_M^{\square <} = \left( \begin{array}{cccc} {\widetilde{V}}^{\mathrm{H}} C {\widehat{S}}_0 &{}\quad {\widetilde{V}}^{\mathrm{H}} C {\widehat{S}}_1 &{}\quad \cdots &{}\quad {\widetilde{V}}^{\mathrm{H}} C {\widehat{S}}_{M-1} \\ {\widetilde{V}}^{\mathrm{H}} C^2 {\widehat{S}}_0 &{}\quad {\widetilde{V}}^{\mathrm{H}} C^2 {\widehat{S}}_1 &{}\quad \cdots &{}\quad {\widetilde{V}}^{\mathrm{H}} C^2 {\widehat{S}}_{M-1} \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ {\widetilde{V}}^{\mathrm{H}} C^{M} {\widehat{S}}_0 &{}\quad {\widetilde{V}}^{\mathrm{H}} C^{M} {\widehat{S}}_1 &{}\quad \cdots &{}\quad {\widetilde{V}}^{\mathrm{H}} C^{M} {\widehat{S}}_{M-1} \end{array} \right) , \end{aligned}$$

respectively. Here, let

$$\begin{aligned} {\widetilde{S}} := [{\widetilde{V}}, C^{\mathrm{H}}{\widetilde{V}}, (C^\mathrm{H})^2{\widetilde{V}}, \dots , (C^{\mathrm{H}})^{M-1}{\widetilde{V}}]. \end{aligned}$$

Then, we have

$$\begin{aligned} {\widehat{H}}_M^\square = {\widetilde{S}}^{\mathrm{H}} {\widehat{S}}, \quad {\widehat{H}}_M^{\square <} = {\widetilde{S}}^{\mathrm{H}} C {\widehat{S}}. \end{aligned}$$

Therefore, the generalized eigenvalue problem (6) is rewritten as

$$\begin{aligned} {\widetilde{S}}^{\mathrm{H}} C {\widehat{S}} {{\varvec{y}}}_i = \theta _i {\widetilde{S}}^{\mathrm{H}} {\widehat{S}}{{\varvec{y}}}_i. \end{aligned}$$
(30)

In this form, the block SS–Hankel method can be regarded as a Petrov–Galerkin-type projection method for solving the standard eigenvalue problem (29), i.e., the approximate solution \(\widetilde{{\varvec{x}}}_i\) and the corresponding residual \({{\varvec{r}}}_i:= C \widetilde{{\varvec{x}}}_i-\theta _i \widetilde{{\varvec{x}}}_i\) satisfy \(\widetilde{{\varvec{x}}}_i \in {\mathcal {R}}({\widehat{S}})\) and \({{\varvec{r}}}_i \bot {\mathcal {R}}({\widetilde{S}})\), respectively. Recognizing that \({\mathcal {R}}({\widehat{S}}) \subset {\mathcal {R}}(Q_{1:r})\) and applying Theorem 9, we find that the block SS–Hankel method obtains the target eigenpairs.

Since the Petrov–Galerkin-type projection method for (29) does not perform (bi-)orthogonalization, that is, \({\widetilde{S}}^{\mathrm{H}} {\widehat{S}} \ne I\), (30) takes the form of a generalized eigenvalue problem. The practical algorithm of the block SS–Hankel method (Algorithm 1) is derived from a low-rank approximation of (30).

From Theorem 8, we have

$$\begin{aligned} {\mathcal {R}}({\widehat{S}}) = {\mathcal {K}}_M^\square (C,{\widehat{S}}_0) \end{aligned}$$

similarly to Theorem 5. Therefore, the block SS–Arnoldi method can be regarded as a block Arnoldi method with \({\mathcal {K}}_M^\square (C,{\widehat{S}}_0)\) for solving the standard eigenvalue problem (29). Moreover, for \(M \le N-\eta \), any \({\widehat{E}}_M \in {\mathcal {B}}_M^\square (C,{\widehat{S}}_0)\) can be written as

$$\begin{aligned} {\widehat{E}}_M = \sum _{j=1}^N \omega _j \sum _{i=0}^{M-1} z_j^i (z_jB-A)^{-1} B V \alpha _i, \quad \alpha _i \in {\mathbb {C}}^{L \times L}, \end{aligned}$$

and the matrix multiplication of C by \({\widehat{E}}_M\) is given by

$$\begin{aligned} C {\widehat{E}}_M = \sum _{j=1}^N \omega _j z_j \sum _{i=0}^{M-1} z_j^i (z_jB-A)^{-1} B V \alpha _i, \end{aligned}$$

similarly to Theorem 7. Therefore, in each iteration, the multiplication by C can be performed via numerical integration.

The Beyn method can also be regarded as a projection method for solving the standard eigenvalue problem (29). From the relation \({\widehat{S}}_1 = C {\widehat{S}}_0\) and the singular value decomposition (11) of \({\widehat{S}}_0\), the coefficient matrix of the eigenvalue problem (12) obtained by the Beyn method becomes

$$\begin{aligned} U_{0,1}^{\mathrm{H}} {\widehat{S}}_1 W_{0,1} \varSigma _1^{-1} = U_{0,1}^\mathrm{H} C {\widehat{S}}_0 W_{0,1} \varSigma _{0,1}^{-1} = U_{0,1}^{\mathrm{H}} C U_{0,1}. \end{aligned}$$

Therefore, the Beyn method can be regarded as a Rayleigh–Ritz-type projection method on \({\mathcal {R}}(U_{0,1})\) for solving (29), where \({\mathcal {R}}(U_{0,1})\) is obtained from a low-rank approximation of \({\widehat{S}}_0\).

4.2 Map of the contour integral-based eigensolvers

As shown in Sect. 4.1.1, the block SS–RR method and the FEAST eigensolver are based on the Rayleigh–Ritz procedure and solve the generalized eigenvalue problem \(A{{\varvec{x}}}_i = \lambda _i B {{\varvec{x}}}_i\) directly. These methods use the subspaces \({\mathcal {R}}({\widehat{S}})\) and \({\mathcal {R}}({\widehat{S}}_0)\), respectively. The FEAST eigensolver can be regarded as a simplified algorithm of the block SS–RR method with \(M=1\) and no orthogonalization of the basis; instead, the FEAST eigensolver relies on iteration, being based on an accelerated subspace iteration. Here, we note that the block SS–RR method can also use an iteration technique to improve its accuracy, as demonstrated in [15, 21].

In contrast, as shown in Sect. 4.1.2, the block SS–Hankel, block SS–Arnoldi and Beyn methods can be regarded as projection methods for solving the standard eigenvalue problem (29) instead of \(A{{\varvec{x}}}_i=\lambda _i B {{\varvec{x}}}_i\). The block SS–Hankel method is a Petrov–Galerkin-type method with \({\mathcal {R}}({\widehat{S}})\), the block SS–Arnoldi method is a block Arnoldi method with \({\mathcal {R}}({\widehat{S}})={\mathcal {K}}_M^\square (C,{\widehat{S}}_0)\) and the Beyn method is a Rayleigh–Ritz-type method with \({\mathcal {R}}({\widehat{S}}_0)\). Note that because these methods are based on Theorems 8 and 9, \((z_j,\omega _j)\) should satisfy condition (13).

Since the block SS–Hankel, block SS–RR and block SS–Arnoldi methods use \({\mathcal {R}}({\widehat{S}})\) as the subspace, the maximum dimension of the subspace is LM. In contrast, the FEAST eigensolver and the Beyn method use the subspace \({\mathcal {R}}({\widehat{S}}_0)\), whose maximum dimension is L; that is, \(\mathrm{rank}({\widehat{S}}_0)\) cannot exceed the number L of right-hand sides of the linear systems at each quadrature point. Therefore, to build a subspace of the same dimension, the FEAST eigensolver and the Beyn method incur larger computational costs than the block SS–Hankel, block SS–RR and block SS–Arnoldi methods for solving the linear systems with multiple right-hand sides.

A map of the relationship among the contour integral-based eigensolvers is presented in Fig. 1.

[Fig. 1: A map of the relationships among the contour integral-based eigensolvers]

4.3 Proposal for a block SS–Beyn method

As mentioned above, one iteration of the FEAST eigensolver is a simplified version of the block SS–RR method with \(M=1\) and no orthogonalization. In contrast, a derivative of the Beyn method with \(M \ge 2\) has not been proposed. Although this paper mainly aims to analyze the relationships among these methods and provide a map, we also propose an extension of the Beyn method to \(M \ge 2\), analogous to the block SS–Hankel, block SS–RR and block SS–Arnoldi methods.

As shown in Sect. 2.2.5, from the relation \({\widehat{S}}_1 = C {\widehat{S}}_0\) and a singular value decomposition of \({\widehat{S}}_0\), we can derive the small eigenvalue problem (12) of the Beyn method. As shown in Sect. 4.1.2, the Beyn method can also be regarded as a Rayleigh–Ritz-type projection method with \({\mathcal {R}}({\widehat{S}}_0)\) for solving the standard eigenvalue problem (29). To extend the Beyn method, we here consider the Rayleigh–Ritz projection method with \({\mathcal {R}}({\widehat{S}})\) for solving (29), i.e.,

$$\begin{aligned} U^{\mathrm{H}} C U {{\varvec{t}}}_i = \theta _i {{\varvec{t}}}_i \end{aligned}$$

where \({\widehat{S}} = U \varSigma W^{\mathrm{H}}\) is a singular value decomposition of \({\widehat{S}}\). Using Theorem 8, the coefficient matrix \(U^{\mathrm{H}} C U\) can be rewritten as

$$\begin{aligned} U^{\mathrm{H}} C U = U^{\mathrm{H}} C {\widehat{S}} W \varSigma ^{-1} = U^\mathrm{H} {\widehat{S}}_+ W \varSigma ^{-1}, \end{aligned}$$

where

$$\begin{aligned} {\widehat{S}}_+ := [{\widehat{S}}_1, {\widehat{S}}_2, \ldots , {\widehat{S}}_{M}] = C {\widehat{S}}. \end{aligned}$$

In practice, we can also use a low-rank approximation of \({\widehat{S}}\),

$$\begin{aligned} {\widehat{S}} = [U_1, U_2] \left[ \begin{array}{ll} \varSigma _1 &{}\quad O \\ O &{}\quad \varSigma _2 \end{array} \right] \left[ \begin{array}{ll} W_1^{\mathrm{H}} \\ W_2^{\mathrm{H}} \end{array} \right] \approx U_1 \varSigma _1 W_1^{\mathrm{H}}. \end{aligned}$$

Then, the reduced eigenvalue problem becomes

$$\begin{aligned} U_1^{\mathrm{H}} {\widehat{S}}_+ W_1 \varSigma _1^{-1} {{\varvec{t}}}_i = \theta _i {{\varvec{t}}}_i. \end{aligned}$$

The approximate eigenpairs are obtained as \(({\widetilde{\lambda }}_i,\widetilde{{\varvec{x}}}_i) = (\theta _i, U_{1} {{\varvec{t}}}_i)\). In this paper, we call this method the block SS–Beyn method; it is shown in Algorithm 7.

Both the block SS–RR method and the block SS–Beyn method are Rayleigh–Ritz-type projection methods with \({\mathcal {R}}({\widehat{S}})\). However, since the methods are targeted at different eigenvalue problems, they have different definitions of the residual vector. Therefore, these methods mathematically differ when \(B \ne I\). In contrast, the block SS–Arnoldi method and the block SS–Beyn method without a low-rank approximation, i.e., \({\widehat{m}} = LM\), are mathematically equivalent.

[Algorithm 7: The block SS–Beyn method]
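A minimal sketch of Algorithm 7, reusing the illustrative helper build_S_hat; the rank tolerance is an assumed choice.

```python
# A minimal sketch of the block SS-Beyn method (Algorithm 7), reusing the
# illustrative helper build_S_hat; the rank tolerance is an assumed choice.
def block_ss_beyn(A, B, V, center, radius, M, N, tol=1e-12):
    L = V.shape[1]
    Sall = build_S_hat(A, B, V, center, radius, M + 1, N)    # [S_hat_0..S_hat_M]
    S, Sp = Sall[:, :L * M], Sall[:, L:]                     # S_hat and S_hat_+
    U, s, Wh = np.linalg.svd(S, full_matrices=False)         # low-rank part
    m = int(np.sum(s > tol * s[0]))                          # numerical rank
    U1, s1, W1 = U[:, :m], s[:m], Wh[:m, :].conj().T
    theta, t = np.linalg.eig((U1.conj().T @ Sp @ W1) / s1)
    return theta, U1 @ t
```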

5 Error analyses of the contour integral-based eigensolvers with an iteration technique

As shown in Sect. 2.2.3, the FEAST eigensolver is based on iteration. Other iterative contour integral-based eigensolvers have also been designed to improve the accuracy [15, 21]. The basic concept is the iterative computation of the matrix \({\widehat{S}}_0^{(\ell -1)}\), starting from the initial matrix \({\widehat{S}}_0^{(0)} = V\), as follows:

$$\begin{aligned} {\widehat{S}}^{(\nu )}_0 := \sum _{j=1}^{N} \omega _j (z_jB-A)^{-1} B {\widehat{S}}_0^{(\nu -1)}, \quad \nu = 1, 2, \ldots , \ell -1. \end{aligned}$$
(31)

The matrices \({\widehat{S}}_k^{(\ell )}\) and \({\widehat{S}}^{(\ell )}\) are then constructed from \({\widehat{S}}_0^{(\ell -1)}\) as

$$\begin{aligned} {\widehat{S}}^{(\ell )} := [{\widehat{S}}_0^{(\ell )}, {\widehat{S}}_1^{(\ell )}, \ldots , {\widehat{S}}_{M-1}^{(\ell )} ], \quad {\widehat{S}}^{(\ell )}_k := \sum _{j=1}^{N} \omega _j z_j^k (z_jB-A)^{-1} B {\widehat{S}}_0^{(\ell -1)}, \end{aligned}$$
(32)

and \({\mathcal {R}}( {\widehat{S}}_0^{(\ell )} )\) and \({\mathcal {R}}( {\widehat{S}}^{(\ell )} )\) are used as subspaces rather than \({\mathcal {R}}( {\widehat{S}}_0 )\) and \({\mathcal {R}}( {\widehat{S}} )\). The \(\ell \) iterations of the FEAST eigensolver can be regarded as a Rayleigh–Ritz-type projection method on \({\mathcal {R}}({\widehat{S}}_0^{(\ell )})\).
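A minimal sketch of the iteration technique (31)-(32), reusing the illustrative helper build_S_hat; the function name is an illustrative choice.

```python
# A minimal sketch of the iteration technique (31)-(32), reusing the
# illustrative helper build_S_hat.
def iterated_S_hat(A, B, V, center, radius, M, N, ell):
    S0 = V
    for _ in range(ell - 1):                               # (31)
        S0 = build_S_hat(A, B, S0, center, radius, 1, N)
    return build_S_hat(A, B, S0, center, radius, M, N)     # S_hat^(ell) in (32)
```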

From the discussion in Sect. 3, the matrix \({\widehat{S}}_0^{(\ell )}\) can be expressed as

$$\begin{aligned} {\widehat{S}}_0^{(\ell )} = \left( Q_{1:r} F_{1:r} {\widetilde{Q}}_{1:r}^{\mathrm{H}} \right) ^\ell V. \end{aligned}$$

Here, the eigenvalues of the linear operator \({\widehat{P}} := Q_{1:r} F_{1:r} {\widetilde{Q}}_{1:r}^{\mathrm{H}}\) are given by

$$\begin{aligned} f(\lambda _i) := \sum _{j=1}^N \frac{\omega _j}{z_j - \lambda _i}. \end{aligned}$$

The function \(f(\lambda )\), called the filter function, is used in the analyses of several eigensolvers for diagonalizable matrix pencils [9, 15, 22, 23]. The function \(f(\lambda )\) is characterized by \(|f(\lambda )|\approx 1\) for \(\lambda \) inside \(\varOmega \) and \(|f(\lambda )|\approx 0\) for \(\lambda \) far outside \(\varOmega \). Figure 2 plots the filter function when \(\varOmega \) is the unit circle and the integration is performed by the N-point trapezoidal rule.

[Fig. 2: Magnitude of the filter function \(|f(\lambda )|\) of the N-point trapezoidal rule for the unit circle region \(\varOmega \)]
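The filter function of the trapezoidal rule is easy to evaluate directly; the following short snippet, purely illustrative, reproduces the quantity plotted in Fig. 2.

```python
# Illustrative evaluation of the filter function f(lambda) of the N-point
# trapezoidal rule on the unit circle (the quantity plotted in Fig. 2).
import numpy as np

def filter_f(lam, N):
    theta = 2 * np.pi * (np.arange(N) + 0.5) / N
    z = np.exp(1j * theta)          # quadrature points on the unit circle
    w = z / N                       # corresponding weights
    return np.sum(w / (z - lam))

for lam in [0.0, 0.5, 0.99, 1.01, 2.0]:
    print(lam, abs(filter_f(lam, 32)))   # ~1 well inside, decaying outside
```

For the midpoint-shifted points used here, a short partial-fraction computation gives \(f(\lambda ) = 1/(1+\lambda ^N)\), which makes the decay away from the unit circle explicit.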

Error analyses of the block SS–RR method with the iteration technique (31) and (32) and the FEAST eigensolver in the diagonalizable case were given in [9, 15, 23]. In these error analyses, the block SS–RR method and the FEAST eigensolver were treated as projection methods with the subspaces \({\mathcal {R}}({\widehat{S}})\) and \({\mathcal {R}}({\widehat{S}}_0)\), respectively. In Sect. 4, we explained that the other contour integral-based eigensolvers are also projection methods with the subspaces \({\mathcal {R}}({\widehat{S}})\) and \({\mathcal {R}}({\widehat{S}}_0)\), but were designed to solve the standard eigenvalue problem (29). In this section, we establish the error bounds of the contour integral-based eigensolvers with the iteration technique (31) and (32), omitting the low-rank approximation, in non-diagonalizable cases.

5.1 Error bounds of the block SS–RR method and the FEAST eigensolver in the diagonalizable case

Let \((\lambda _i,{{\varvec{x}}}_i)\) be the exact finite eigenpairs of the generalized eigenvalue problem \(A {{\varvec{x}}}_i = \lambda _i B {{\varvec{x}}}_i\), ordered by decreasing filter magnitude \(|f(\lambda _i)| \ge |f(\lambda _{i+1})|\). Define \({\mathcal {P}}^{(\ell )}\) as the orthogonal projector onto the subspace \({\mathcal {R}}( {\widehat{S}}^{(\ell )} )\) and \({\mathcal {P}}_{LM}\) as the spectral projector onto the invariant subspace \(\mathrm{span}\{ {{\varvec{x}}}_1, {{\varvec{x}}}_2, \ldots , {{\varvec{x}}}_{LM} \}\). Assume that the matrix \({\mathcal {P}}_{LM} [V, CV, \dots , C^{M-1}V]\) has full rank. Then, for each eigenvector \({{\varvec{x}}}_i, i = 1, 2, \ldots , LM\), there exists a unique vector \({{\varvec{s}}}_i \in {\mathcal {K}}_M^\square (C,V)\) such that \({\mathcal {P}}_{LM} {{\varvec{s}}}_i = {{\varvec{x}}}_i\).

In the diagonalizable case, the following inequality for the error analysis of the block SS–RR method and the FEAST eigensolver was given in [15] and, for \(M=1\), in [9, 23]:

$$\begin{aligned} \Vert (I - \mathcal {P^{(\ell )}} ) {{\varvec{x}}}_i \Vert _2 \le \alpha \beta _i \left| \frac{f(\lambda _{LM+1})}{f(\lambda _i)} \right| ^\ell , \quad i = 1, 2, \ldots , LM, \end{aligned}$$
(33)

where \(\alpha = \Vert X_r \Vert _2 \Vert {\widetilde{X}}_r \Vert _2\) and \(\beta _i = \Vert {{\varvec{x}}}_i - {{\varvec{s}}}_i \Vert _2\). Note that, in the diagonalizable case, the linear operator \({\widehat{P}}\) can be expressed as \({\widehat{P}} = X_r f(\varLambda _r) {\widetilde{X}}_r^{\mathrm{H}}\), where \(f(\varLambda _r) := \mathrm{diag}(f(\lambda _1), f(\lambda _2), \dots , f(\lambda _r))\). An additional error bound is given in [15]:

$$\begin{aligned} \Vert (A_{{\mathcal {P}}^{(\ell )}} - \lambda _i B_{{\mathcal {P}}^{(\ell )}} ) {{\varvec{x}}}_i \Vert _2 \le \gamma _i \Vert (I - \mathcal {P^{(\ell )}} ) {{\varvec{x}}}_i \Vert _2 \le \alpha \beta _i \gamma _i \left| \frac{f(\lambda _{LM+1})}{f(\lambda _i)} \right| ^\ell , \end{aligned}$$
(34)

for \(i = 1, 2, \ldots , LM\), where \(A_{{\mathcal {P}}^{(\ell )}} := {\mathcal {P}}^{(\ell )} A {\mathcal {P}}^{(\ell )}, B_{{\mathcal {P}}^{(\ell )}} := {\mathcal {P}}^{(\ell )} B {\mathcal {P}}^{(\ell )}\) and \(\gamma _i = \Vert {\mathcal {P}}^{(\ell )} (A - \lambda _i B) (I - {\mathcal {P}}^{(\ell )}) \Vert _2\).

Inequality (33) determines the accuracy of the subspace \({\mathcal {R}}({\widehat{S}}^{(\ell )})\), whereas inequality (34) gives the error bound of the block SS–RR method and the FEAST eigensolver.
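The geometric decay in \(\ell \) is easy to observe numerically. A small sketch using the closed-form unit circle filter from above, with a hypothetical interior eigenvalue \(\lambda _i = 0.5\) and a hypothetical first exterior eigenvalue \(\lambda _{LM+1} = 1.05\):

# Decay of |f(lam_{LM+1}) / f(lam_i)|^ell in (33) for N = 32;
# lam_i = 0.5 and lam_{LM+1} = 1.05 are hypothetical eigenvalues.
f = lambda lam: abs(1.0 / (1.0 - lam**32))
ratio = f(1.05) / f(0.5)
for ell in [1, 2, 4, 8]:
    print(ell, ratio**ell)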

5.2 Error bounds of the contour integral-based eigensolvers in the non-diagonalizable case

The constant \(\alpha \) in (33) derives from the following inequality for a diagonalizable matrix \(G_{\mathrm{diag}} = X D X^{-1}\):

$$\begin{aligned} \Vert G_{\mathrm{diag}}^\ell \Vert _2 \le \Vert X \Vert _2 \Vert D^\ell \Vert _{2} \Vert X^{-1} \Vert _2 \le \Vert X \Vert _2 \Vert X^{-1} \Vert _2 (\rho (G_{\mathrm{diag}}))^\ell , \end{aligned}$$

where \(\rho (G_{\mathrm{diag}})\) is the spectral radius of \(G_{\mathrm{diag}}\). This inequality extends to a non-diagonalizable matrix \(G_{\mathrm{non}} = X J X^{-1}\), where \(J\) is the Jordan normal form, as follows:

$$\begin{aligned} \Vert G_{\mathrm{non}}^\ell \Vert _2 \le \Vert X \Vert _2 \Vert J^\ell \Vert _{2} \Vert X^{-1} \Vert _2 \le 2 \Vert X \Vert _2 \Vert X^{-1} \Vert _2 \ell ^{\eta -1} (\rho (G_\mathrm{non}))^\ell , \end{aligned}$$

where \(\rho (G_{\mathrm{non}})\) is the spectral radius of \(G_{\mathrm{non}}\) and \(\eta \) is the maximum size of the Jordan blocks. Using this inequality, the error bound of the contour integral-based eigensolvers in the non-diagonalizable case is given as

$$\begin{aligned} \Vert (I - \mathcal {P^{(\ell )}} ) {{\varvec{x}}}_i \Vert _2 \le \alpha ^\prime \beta _i \ell ^{\eta -1} \left| \frac{f(\lambda _{LM+1})}{f(\lambda _i)} \right| ^\ell , \quad i = 1, 2, \ldots , LM, \end{aligned}$$
(35)

where \(\alpha ^\prime = 2 \Vert Q_{1:r} \Vert _2 \Vert {\widetilde{Q}}_{1:r} \Vert _2\). From (35), the error bound of the block SS–RR method and the FEAST eigensolver in the non-diagonalizable case is given by

$$\begin{aligned} \Vert (A_{{\mathcal {P}}^{(\ell )}} - \lambda _i B_{{\mathcal {P}}^{(\ell )}} ) {{\varvec{x}}}_i \Vert _2 \le \gamma _i \Vert (I - \mathcal {P^{(\ell )}} ) {{\varvec{x}}}_i \Vert _2 \le \alpha ^\prime \beta _i \gamma _i \ell ^{\eta -1} \left| \frac{f(\lambda _{LM+1})}{f(\lambda _i)} \right| ^\ell , \end{aligned}$$
(36)

for \(i = 1, 2, \dots , LM\).
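To illustrate the extra polynomial factor \(\ell ^{\eta -1}\), the following sketch checks the norm growth of a single Jordan block, for which \(X = I\) and hence \(\Vert X \Vert _2 \Vert X^{-1} \Vert _2 = 1\); the values \(\rho = 0.9\) and \(\eta = 3\) are arbitrary illustrative choices.

import numpy as np

rho, eta = 0.9, 3
J = rho * np.eye(eta) + np.diag(np.ones(eta - 1), k=1)  # one Jordan block
for ell in [5, 10, 50, 100]:
    norm = np.linalg.norm(np.linalg.matrix_power(J, ell), 2)
    bound = 2 * ell**(eta - 1) * rho**ell                # 2 ell^(eta-1) rho^ell
    print(ell, norm, bound)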

Inequality (36) derives from the error bound of the Rayleigh–Ritz procedure for the generalized eigenvalue problem \(A{{\varvec{x}}}_i = \lambda _i B {{\varvec{x}}}_i\). From the error bound of the Rayleigh–Ritz procedure for standard eigenvalue problems [18, Theorem 4.3], we derive the error bound of the block SS–Arnoldi and block SS–Beyn methods as

$$\begin{aligned} \Vert (C_{{\mathcal {P}}^{(\ell )}} - \lambda _i I ) {\mathcal {P}}^{(\ell )}{{\varvec{x}}}_i \Vert _2 \le \gamma ^\prime \Vert (I - \mathcal {P^{(\ell )}} ) {{\varvec{x}}}_i \Vert _2 \le \alpha ^\prime \beta _i \gamma ^\prime \ell ^{\eta -1} \left| \frac{f(\lambda _{LM+1})}{f(\lambda _i)} \right| ^\ell , \qquad \end{aligned}$$
(37)

for \(i = 1, 2, \ldots , LM\), where \(C_{{\mathcal {P}}^{(\ell )}} := {\mathcal {P}}^{(\ell )} C {\mathcal {P}}^{(\ell )}\) and \(\gamma ^\prime = \Vert {\mathcal {P}}^{(\ell )} C (I - {\mathcal {P}}^{(\ell )}) \Vert _2\).

In addition, let \({\mathcal {Q}}\) be the oblique projector onto \({\mathcal {R}}({\widehat{S}}^{(\ell )})\) and orthogonal to \({\mathcal {R}}( {\widetilde{S}})\). Then, from the error bound of the Petrov–Galerkin-type projection method for standard eigenvalue problems [18, Theorem 4.7], the error bound of the block SS–Hankel method is derived as follows:

$$\begin{aligned} \Vert (C_{{\mathcal {P}}^{(\ell )}}^{\mathcal {Q}} - \lambda _i I ) {\mathcal {P}}^{(\ell )}{{\varvec{x}}}_i \Vert _2 \le \gamma _i^{\prime \prime } \Vert (I - \mathcal {P^{(\ell )}} ) {{\varvec{x}}}_i \Vert _2 \le \alpha ^\prime \beta _i \gamma _i^{\prime \prime } \ell ^{\eta -1} \left| \frac{f(\lambda _{LM+1})}{f(\lambda _i)} \right| ^\ell , \qquad \end{aligned}$$
(38)

for \(i = 1, 2, \dots , LM\), where \(C_{{\mathcal {P}}^{(\ell )}}^{\mathcal {Q}} := {\mathcal {Q}} C {\mathcal {P}}^{(\ell )}\) and \(\gamma _i^{\prime \prime } = \Vert {\mathcal {Q}} (C - \lambda _i I) (I - {\mathcal {P}}^{(\ell )}) \Vert _2\).

Error bounds (36), (37) and (38) indicate that, given a sufficiently large subspace, i.e., \(|f(\lambda _{LM+1})/f(\lambda _i)|^\ell \approx 0\), the contour integral-based eigensolvers can compute accurate target eigenpairs even if some eigenvalues lie outside but near the region \(\varOmega \) and the target matrix pencil is non-diagonalizable.

6 Numerical experiments

This paper mainly aims to analyze and map the relationships among the contour integral-based eigensolvers; nevertheless, in this section, the efficiencies of the block SS–Hankel, block SS–RR, block SS–Arnoldi and block SS–Beyn methods are compared in numerical experiments with \(M = 1, 2, 4, 8\) and 16.

These methods compute 1000 eigenvalues in the interval \([-1,1]\) and the corresponding eigenvectors of a real symmetric generalized eigenvalue problem with 20,000-dimensional dense random matrices. \(\varGamma \) is an ellipse with center 0 and major and minor axes 1 and 0.1, respectively. The parameters are \((L,M) = (4096,1), (2048,2), (1024,4), (512,8), (256,16)\) (note that \(LM=4096\)) and \(N=32\). Because \(A\) and \(B\) are real and the quadrature points are symmetric with respect to the real axis, only \(N/2=16\) linear systems need to be solved. For the low-rank approximation, we used the singular values \(\sigma _i\) satisfying \(\sigma _i/\sigma _1 \ge 10^{-14}\) and their corresponding singular vectors, where \(\sigma _1\) is the largest singular value.
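For concreteness, the quadrature points and weights for such a contour can be generated as follows; the midpoint phase offset is our choice to keep all points off the real axis (where the eigenvalues lie) and is not specified in the text above.

import numpy as np

# N-point trapezoidal rule on the ellipse: center 0, semi-axes a = 1, b = 0.1
N, a, b = 32, 1.0, 0.1
theta = 2 * np.pi * (np.arange(N) + 0.5) / N
z = a * np.cos(theta) + 1j * b * np.sin(theta)                # points z_j on Gamma
w = (-a * np.sin(theta) + 1j * b * np.cos(theta)) / (1j * N)  # weights omega_j

# Real symmetric A, B: solutions at conj(z_j) are conjugates of those at z_j,
# so only the points in the upper half-plane require a linear solve.
print(int(np.sum(z.imag > 0)))   # 16, matching N/2 in the text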

The numerical experiments were carried out in double precision arithmetic on 8 nodes of COMA at CCS, University of Tsukuba. Each COMA node has two Intel Xeon E5-2670v2 CPUs (2.5 GHz) and two Intel Xeon Phi 7110P coprocessors (61 cores); only the CPUs were used in these experiments. The algorithms were implemented in Fortran 90 and MPI, and executed with 8 nodes \(\times \) 2 processes per node \(\times \) 8 threads per process.

The numerical results are presented in Table 1. First, we consider the numerical rank \({\widehat{m}}\). Comparing the M-dependence of the numerical rank \({\widehat{m}}\) among the block SS–Hankel, block SS–RR and block SS–Beyn methods, we observe that \({\widehat{m}}\) increases with M. This is explained by a property of the subspace \({\mathcal {K}}_M^\square (C,V)\), because \({\widehat{S}}\) can be written as

$$\begin{aligned} {\widehat{S}} = \left( Q_{1:r} F_{1:r} {\widetilde{Q}}_{1:r}^{\mathrm{H}} \right) [V, CV, \dots , C^{M-1}V]. \end{aligned}$$

For \(M=1\), the subspace \({\mathcal {K}}_1^\square (C,V) = {\mathcal {R}}(V)\) is unbiased toward any eigenvector, since V is a random matrix. On the other hand, for \(M \ge 2\), the subspace \({\mathcal {K}}_M^\square (C,V)\) also captures eigenvectors corresponding to exterior eigenvalues well. Therefore, for computing interior eigenvalues, the numerical rank \({\widehat{m}}\) for \(M=16\) is expected to be larger than for \(M=1\).

Next, we consider the computation time. The computation times of the LU factorization, the forward and back substitutions, and the remaining computations (including the singular value decomposition and orthogonalization) are denoted by \(t_{\mathrm{LU}}, t_{\mathrm{Solve}}\) and \(t_{\mathrm{Other}}\), respectively; the total computation time is denoted by \(t_{\mathrm{Total}}\). We observe from Table 1 that the most time-consuming part is solving the linear systems with multiple right-hand sides (\(t_{\mathrm{LU}} + t_{\mathrm{Solve}}\)). In particular, \(t_{\mathrm{Solve}}\) is much larger for \(M=1\) than for \(M=16\), because the number of right-hand sides for \(M=1\) is 16 times that for \(M=16\). Consequently, \(t_{\mathrm{Total}}\) increases with decreasing M.
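The split between \(t_{\mathrm{LU}}\) and \(t_{\mathrm{Solve}}\) reflects the fact that one factorization per quadrature point serves every right-hand side and every moment. A SciPy sketch of this reuse, dense and serial (the experiments above use a parallel Fortran implementation):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def moment_blocks(A, B, V, z, w, M):
    """Moment blocks S_k with one LU factorization per z_j (a sketch)."""
    S = [np.zeros(V.shape, dtype=complex) for _ in range(M)]
    for zj, wj in zip(z, w):
        lu = lu_factor(zj * B - A)       # t_LU: factorize once per z_j
        Y = lu_solve(lu, B @ V)          # t_Solve: all L right-hand sides at once
        for k in range(M):
            S[k] += wj * zj**k * Y       # the same solve serves every moment
    return np.hstack(S)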

We now focus on \(t_{\mathrm{Other}}\). The block SS–Arnoldi method requires a much larger \(t_{\mathrm{Other}}\) than the other methods because its current version applies no low-rank approximation technique to reduce the computational costs and improve the stability [13]. For the block SS–Hankel, block SS–RR and block SS–Beyn methods, \(t_{\mathrm{Other}}\) decreases as M and the numerical rank \({\widehat{m}}\) decrease. In addition, the block SS–Hankel method has the smallest \(t_{\mathrm{Other}}\) among the tested methods, because it performs no matrix orthogonalization.

Table 1 Computational results of the block SS–Hankel, block SS–RR, block SS–Arnoldi and block SS–Beyn methods with \(M=1, 2, 4, 8\) and 16

Finally, we consider the accuracy of the computed eigenpairs. The block SS–Hankel and block SS–Arnoldi methods are less accurate than the other methods, especially for \(M=16\). This result is attributed to the lack of matrix orthogonalization in the block SS–Hankel method and the lack of a low-rank approximation in the block SS–Arnoldi method. On the other hand, the block SS–RR and block SS–Beyn methods remain highly accurate even for \(M=16\).

7 Conclusions

In this paper, we analyzed and mapped the mathematical relationships among the typical contour integral-based methods for solving the generalized eigenvalue problem (1): the block SS–Hankel method, the block SS–RR method, the FEAST eigensolver, the block SS–Arnoldi method and the Beyn method. We found that the block SS–RR method and the FEAST eigensolver are projection methods for \(A{{\varvec{x}}}_i=\lambda _i B{{\varvec{x}}}_i\), whereas the block SS–Hankel, block SS–Arnoldi and Beyn methods are projection methods for the standard eigenvalue problem \(C{{\varvec{x}}}_i=\lambda _i {{\varvec{x}}}_i\). Based on this map of the algorithms, we also extended the existing Beyn method to \(M \ge 2\). Our numerical experiments indicated that increasing M reduces the computational costs relative to \(M=1\).

In future work, we will compare the efficiencies of these methods on large, real-life problems. We also plan to analyze the relationships among contour integral-based nonlinear eigensolvers.