1 Introduction

We consider an iterative solution for the large sparse system of linear equations

$$\begin{aligned} Ax\equiv \left( \begin{array}{cc} W &{} -T \\ T &{} W \end{array}\right) \left( \begin{array}{c} y \\ z \end{array}\right) = \left( \begin{array}{c} f \\ g \end{array}\right) \equiv b, \end{aligned}$$
(1.1)

where the matrix \(A\in \mathbb {R}^{2m\times 2m}\) has a block two-by-two structure with \(W \in \mathbb {R}^{m\times m}\) and \(T \in \mathbb {R}^{m\times m}\). Note that the matrix \(A\) is nonsingular if and only if \(\mathrm{null}(W)\cap \mathrm{null}(T)=\{0\}\) and \(\pm \mathrm{i}\) is not a generalized eigenvalue of the matrix pair \((W,T)\) (i.e., \(Tx\ne \mp \mathrm{i}Wx\) for all \(x\ne 0\)), where \(\mathrm{null}(\cdot )\) denotes the nullspace of the corresponding matrix and \(\mathrm{i}=\sqrt{-1}\) is the imaginary unit. Many practical problems arising in scientific computing and engineering applications require the solution of a linear system of the form (1.1), for example, computational electrodynamics [1, 2], optimization [3–7], optimal control [8–10], and real equivalent formulations of complex linear systems [11–13]; see also [14–17].
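For illustration only (not part of the developments below), the block two-by-two matrix \(A\) can be assembled from given sparse blocks \(W\) and \(T\) as in the following sketch; the random blocks and all names are our own placeholders, not data from the paper.

```python
# Minimal sketch: assembling A = [[W, -T], [T, W]] of (1.1) with SciPy.
import scipy.sparse as sp

def block_two_by_two(W, T):
    """Return the real block two-by-two matrix A of (1.1)."""
    return sp.bmat([[W, -T], [T, W]], format="csr")

# toy data: symmetric positive definite placeholder blocks
m = 100
R1 = sp.random(m, m, density=0.05, random_state=0)
R2 = sp.random(m, m, density=0.05, random_state=1)
W = R1 @ R1.T + sp.identity(m)
T = R2 @ R2.T + sp.identity(m)
A = block_two_by_two(W, T)
```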

A number of iterative methods have been proposed for the block two-by-two linear system (1.1), such as C-to-R iteration methods [11], alternating splitting iteration methods [18–20], preconditioned Krylov subspace iteration methods [21], and so on; see also [22–24]. To solve the block two-by-two linear system (1.1) effectively, Bai et al. [9] established a class of preconditioned and modified Hermitian and skew-Hermitian splitting (PMHSS) iteration methods by modifying and preconditioning the Hermitian and skew-Hermitian splitting (HSS) iteration method [18, 25, 26]. The PMHSS methods are essentially a kind of splitting iteration method, and the splitting matrices

$$\begin{aligned} F(\alpha )=\left[ 1+\frac{1}{\alpha }\right] Q \left( \begin{array}{cc} \alpha W+T &{}\quad 0 \\ 0 &{}\quad \alpha W+T \end{array}\right) ,\quad \text {with} \quad Q=\frac{1}{2} \left( \begin{array}{cc} I &{}\quad -I \\ I &{}\quad I \end{array}\right) , \end{aligned}$$
(1.2)

can be used as preconditioners, called PMHSS preconditioners, for the matrix \(A\) in (1.1), where \(\alpha \) is a given positive constant and \(I\) is the identity matrix. In [9] the authors established the convergence theory of these PMHSS methods and analyzed the spectral properties of the PMHSS preconditioned matrix \(F(\alpha )^{-1}A\) for symmetric positive semidefinite matrices \(W\) and \(T\). Numerical experiments showed that the PMHSS preconditioners are quite competitive when used to precondition Krylov subspace iteration methods such as the generalized minimal residual (GMRES) method.

Based on the structures of the PMHSS preconditioners, Bai [12] recently constructed a class of rotated block triangular preconditioners, called rotated block lower triangular (RBLT), rotated block upper triangular (RBUT), and rotated block triangular product (RBTP) preconditioners, for a block two-by-two linear system of the form (1.1). These rotated block triangular preconditioners are applicable not only to symmetric positive semidefinite matrices \(W\) and \(T\) but also to cases where either \(W\) or \(T\) is nonsymmetric. The author analyzed the eigenproperties and derived bounds for the degrees of the minimal polynomials of the preconditioned matrices. It was shown that the GMRES iteration method incorporated with these rotated block triangular preconditioners can be competitive with, and even more efficient than, that incorporated with the PMHSS preconditioner.

Note that the rotated block triangular preconditioning processes in [12] involve solving linear subsystems with the coefficient matrix \(\alpha W+T\). Thus, it is necessary to invert \(\alpha W+T\) exactly at each step of the preconditioning processes. In many practical applications the matrices \(W\) and \(T\) are very large and sparse, and inverting the matrix \(\alpha W+T\) exactly requires considerable computing time. By making use of the idea of inexact preconditioning [25, 27, 28], in this paper we present a class of inexact rotated block triangular preconditioners for a block two-by-two linear system of the form (1.1), which avoids the exact inversion of the matrix \(\alpha W+T\) when solving the linear subsystems at each step. We can take advantage of the structures of the matrices \(W\) and \(T\) to solve the linear subsystems inexactly; for example, we use an incomplete Cholesky factorization when the matrix \(\alpha W+T\) is symmetric positive definite and an incomplete LU factorization when it is nonsymmetric. We then analyze the spectral properties of the resulting preconditioned matrices and show the effectiveness of the GMRES iteration method incorporated with these inexact rotated block triangular preconditioners in comparison with the exact ones.

The outline of this paper is as follows. In Sect. 2 we present the inexact rotated block triangular preconditioners and describe procedures for computing the generalized residual equations with these preconditioners. The spectral properties of the preconditioned matrices are analyzed in Sect. 3. In Sect. 4, we use numerical results to show the effectiveness of these inexact preconditioners. Finally, in Sect. 5, we end the paper with some concluding remarks.

2 Inexact rotated block triangular preconditioning

Let \(\alpha \) be a real constant. In [12] the author constructed rotated block triangular preconditioners for the block two-by-two matrix \(A\) in (1.1). When the matrices \(W\) and \(T\) are large and sparse, it is costly to compute the inverse of the matrix \(\alpha W+T\) exactly. Assume that the matrix \(\widetilde{\alpha W+T}\), an approximation to the matrix \(\alpha W+T\), is nonsingular. To save computing time, we follow the approach in [12] and present three inexact rotated block triangular preconditioners: an inexact rotated block lower triangular (IRBLT) preconditioner

$$\begin{aligned} \widetilde{L}(\alpha )=\frac{1}{\sqrt{\alpha ^2+1}}G \left( \begin{array}{cc} \widetilde{\alpha W+T} &{}\quad 0 \\ \alpha T-W &{}\quad \widetilde{\alpha W+T} \end{array}\right) , \end{aligned}$$
(2.1)

an inexact rotated block upper triangular (IRBUT) preconditioner

$$\begin{aligned} \widetilde{U}(\alpha )=\frac{1}{\sqrt{\alpha ^2+1}}G \left( \begin{array}{cc} \widetilde{\alpha W+T} &{}\quad W-\alpha T \\ 0 &{}\quad \widetilde{\alpha W+T} \end{array}\right) , \end{aligned}$$
(2.2)

and an inexact rotated block triangular product (IRBTP) preconditioner

$$\begin{aligned} \widetilde{P}(\alpha )=&\frac{1}{\sqrt{\alpha ^2+1}}G \left( \begin{array}{cc} \widetilde{\alpha W+T} &{}\quad 0 \\ \alpha T-W &{}\quad \widetilde{\alpha W+T} \end{array}\right) \left( \begin{array}{cc} \widetilde{\alpha W+T} &{}\quad 0 \\ 0 &{}\quad \widetilde{\alpha W+T} \end{array}\right) ^{-1} \left( \begin{array}{cc} \widetilde{\alpha W+T} &{}\quad W-\alpha T \\ 0 &{}\quad \widetilde{\alpha W+T} \end{array}\right) \end{aligned}$$
(2.3)

for the matrix \(A\), where the matrix

$$\begin{aligned} G=\frac{1}{\sqrt{2}} \left( \begin{array}{cc} I &{}\quad -I \\ I &{}\quad I \end{array}\right) \end{aligned}$$

is a block Givens rotation. Note that \(G=\sqrt{2}Q\), with \(Q\) being the matrix defined in (1.2). It is expected that these inexact preconditioners will be more effective than the exact ones in [12].

When these inexact rotated block triangular preconditioners are used to accelerate the convergence rate of Krylov subspace iteration methods, it is necessary to solve sequences of generalized residual equations of the forms

$$\begin{aligned} \widetilde{L}(\alpha )v=r,\quad \widetilde{U}(\alpha )v=r, \quad \text {and} \quad \widetilde{P}(\alpha )v=r, \end{aligned}$$
(2.4)

where \(r=(r_a^\mathrm{T},r_b^\mathrm{T})^\mathrm{T}\) and \(v=(v_a^\mathrm{T},v_b^\mathrm{T})^\mathrm{T}\) are the current and generalized residual vectors, respectively. Here and in the sequel, \((\cdot )^\mathrm{T}\) denotes the transpose of a vector or a matrix. Because the inexact preconditioners \(\widetilde{L}(\alpha )\), \(\widetilde{U}(\alpha )\), and \(\widetilde{P}(\alpha )\) have multiplicative structures, the generalized residual equations in (2.4) can be solved by only solving subsystems with the same coefficient matrix \(\widetilde{\alpha W+T}\). Thus, we can compute the generalized residual vectors \(v\) in (2.4) by the following procedures.

Procedures for Computing the Generalized Residuals

Set \(\widetilde{r}_a=\sqrt{\frac{\alpha ^2+1}{2}} (r_a+r_b)\)    and   \(\widetilde{r}_b=\sqrt{\frac{\alpha ^2+1}{2}} (-r_a+r_b)\).

  • Computing \(v\) from \(\widetilde{L}(\alpha )v=r\):

    • Solve \(v_a\) from \((\widetilde{\alpha W+T})v_a=\widetilde{r}_a\).

    • Solve \(v_b\) from \((\widetilde{\alpha W+T})v_b=(W-\alpha T)v_a+\widetilde{r}_b\).

  • Computing \(v\) from \(\widetilde{U}(\alpha )v=r\):

    • Solve \(v_b\) from \((\widetilde{\alpha W+T})v_b=\widetilde{r}_b\).

    • Solve \(v_a\) from \((\widetilde{\alpha W+T})v_a=(\alpha T-W)v_b+\widetilde{r}_a\).

  • Computing \(v\) from \(\widetilde{P}(\alpha )v=r\):

    • Solve \(\widetilde{v}_a\) from \((\widetilde{\alpha W+T})\widetilde{v}_a=\widetilde{r}_a\).

    • Solve \(v_b\) from \((\widetilde{\alpha W+T})v_b=(W-\alpha T)\widetilde{v}_a+\widetilde{r}_b\).

    • Solve \(v_a\) from \((\widetilde{\alpha W+T})v_a=(\alpha T-W)v_b+\widetilde{r}_a\).

From the foregoing procedures we see that the only difference between our proposed inexact preconditioners and the exact ones in [12] is that we solve subsystems with the coefficient matrix \(\widetilde{\alpha W+T}\) instead of \(\alpha W+T\). Because the matrix \(\widetilde{\alpha W+T}\) is an approximation to the matrix \(\alpha W+T\), we may choose it by making use of the special structure of \(\alpha W+T\). Here, we adopt incomplete triangular factorization methods [21] to implement the inexact preconditioning processes. More specifically, we use an incomplete LU factorization when \(\alpha W+T\) is nonsymmetric and an incomplete Cholesky factorization when it is symmetric positive definite.
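As a concrete illustration of one such preconditioning process, the following sketch (our own example, not code from [12] or from the experiments of this paper) applies the IRBUT steps with SciPy, where an incomplete LU factorization computed by `scipy.sparse.linalg.spilu` plays the role of the approximation \(\widetilde{\alpha W+T}\); the helper name and all parameter values are assumptions.

```python
# Hedged sketch: IRBUT preconditioner as a SciPy LinearOperator.
# spilu stands in for the incomplete LU (or Cholesky) factorization that
# defines the approximation to alpha*W + T; all names here are illustrative.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def make_irbut_preconditioner(W, T, alpha=1.0, drop_tol=1e-3, fill_factor=10):
    """Return a LinearOperator computing v = U~(alpha)^{-1} r via the IRBUT steps."""
    m = W.shape[0]
    ilu = spla.spilu((alpha * W + T).tocsc(),
                     drop_tol=drop_tol, fill_factor=fill_factor)  # ~ alpha*W + T
    WmaT = (W - alpha * T).tocsr()                                # W - alpha*T
    c = np.sqrt((alpha**2 + 1.0) / 2.0)

    def apply(r):
        ra, rb = r[:m], r[m:]
        rta = c * (ra + rb)                      # tilde r_a
        rtb = c * (-ra + rb)                     # tilde r_b
        vb = ilu.solve(rtb)                      # solve (alpha*W+T)~ v_b = tilde r_b
        va = ilu.solve(-(WmaT @ vb) + rta)       # solve (alpha*W+T)~ v_a = (alpha*T-W) v_b + tilde r_a
        return np.concatenate([va, vb])

    return spla.LinearOperator((2 * m, 2 * m), matvec=apply)
```

The IRBLT and IRBTP variants differ only in the order and number of the solves listed in the procedures above.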

3 Eigenvalue properties of preconditioned matrices

In this section, we analyze the eigenvalue properties of the preconditioned matrices \(\widetilde{L}(\alpha )^{-1}A\), \(\widetilde{U}(\alpha )^{-1}A\), and \(\widetilde{P}(\alpha )^{-1}A\).

We first introduce some necessary notations that will be used in the subsequent discussions. Denote the matrices

$$\begin{aligned} M(\alpha )=(\widetilde{\alpha W+T})^{-1}(\alpha W+T), \quad \widetilde{V}(\alpha )=(\widetilde{\alpha W+T})^{-1}(W-\alpha T), \end{aligned}$$

and

$$\begin{aligned} \widetilde{R}(\alpha )= \left( \begin{array}{cc} M(\alpha ) &{}\quad \widetilde{V}(\alpha ) \\ \widetilde{V}(\alpha )(M(\alpha )-I) &{}\quad M(\alpha )+\widetilde{V}(\alpha )^2 \end{array}\right) , \end{aligned}$$
$$\begin{aligned} \widetilde{S}(\alpha )= \left( \begin{array}{cc} M(\alpha )+\widetilde{V}(\alpha )^2 &{}\quad \widetilde{V}(\alpha )(I-M(\alpha )) \\ -\widetilde{V}(\alpha ) &{}\quad M(\alpha ) \end{array}\right) , \end{aligned}$$
$$\begin{aligned} \widetilde{R}_s(\alpha )= \left( \begin{array}{cc} M(\alpha )+\widetilde{V}(\alpha )^2(I-M(\alpha )) &{}\quad \widetilde{V}(\alpha )-\widetilde{V}(\alpha )^3-\widetilde{V}(\alpha )M(\alpha ) \\ \widetilde{V}(\alpha )(M(\alpha )-I) &{}\quad M(\alpha )+\widetilde{V}(\alpha )^2 \end{array}\right) . \end{aligned}$$

It then follows from

$$\begin{aligned} \left( \begin{array}{cc} 0 &{}\quad -I \\ I &{}\quad 0 \end{array}\right) \widetilde{R}(\alpha ) \left( \begin{array}{cc} 0 &{}\quad I \\ -I &{}\quad 0 \end{array}\right) =\widetilde{S}(\alpha ) \end{aligned}$$

that \(\widetilde{R}(\alpha )\) and \(\widetilde{S}(\alpha )\) are orthogonally similar matrices. In addition, we introduce

$$\begin{aligned} I(\alpha )=\frac{1}{\sqrt{2(\alpha ^2+1)}} \left( \begin{array}{cc} (\alpha +1)I &{}\quad (\alpha -1)I \\ (1-\alpha )I &{}\quad (\alpha +1)I \end{array}\right) , \end{aligned}$$

which is a block two-by-two orthogonal matrix, and the corresponding spectral decomposition is

$$\begin{aligned} I(\alpha )=\Phi (\alpha )\Lambda (\alpha )\Phi (\alpha )^*, \end{aligned}$$

where

$$\begin{aligned} \Phi (\alpha )=\frac{1}{\sqrt{2}} \left( \begin{array}{cc} I &{}\quad \mathrm{i}I \\ \mathrm{i}I &{}\quad I \end{array}\right) \quad \mathrm{{and}} \quad \Lambda (\alpha )=\frac{1}{\sqrt{2(\alpha ^2+1)}} \left( \begin{array}{cc} ((\alpha +1)+\mathrm{i}(\alpha -1))I &{}\quad 0 \\ 0 &{}\quad ((\alpha +1)-\mathrm{i}(\alpha -1))I \end{array}\right) . \end{aligned}$$

Here and in the sequel, \((\cdot )^*\) and \(\Vert \cdot \Vert \) are used to represent the conjugate transpose and the Euclidean norm of a vector or a matrix, respectively. Let

$$\begin{aligned} D(\alpha )= \left( \begin{array}{cc} M(\alpha ) &{}\quad 0 \\ 0 &{}\quad M(\alpha ) \end{array}\right) \end{aligned}$$
(3.1)

and

$$\begin{aligned} \widetilde{D}(\alpha ) =D(\alpha )\Lambda (\alpha ) =\frac{1}{\sqrt{2(\alpha ^2+1)}} \left( \begin{array}{cc} ((\alpha +1)+\mathrm{i}(\alpha -1)) M(\alpha ) &{}\quad 0 \\ 0 &{}\quad ((\alpha +1)-\mathrm{i}(\alpha -1)) M(\alpha ) \end{array}\right) \end{aligned}$$
(3.2)

be two block-diagonal matrices.

In what follows, we derive equivalent expressions for the preconditioned matrices \(\widetilde{L}(\alpha )^{-1}A\), \(\widetilde{U}(\alpha )^{-1}A\), and \(\widetilde{P}(\alpha )^{-1}A\) and demonstrate the corresponding spectral properties.

Theorem 3.1

Let \(W\) and \(T\) be two real square matrices such that \(\widetilde{\alpha W+T}\) is nonsingular, with \(\alpha \) being a real constant. Then, for the preconditioning matrices \(\widetilde{L}(\alpha )\), \(\widetilde{U}(\alpha )\), and \(\widetilde{P}(\alpha )\) defined in (2.1), (2.2), and (2.3), it holds that

  1. (i)

    \(\widetilde{L}(\alpha )^{-1}A=\widetilde{R}(\alpha )I(\alpha )\), \(\widetilde{U}(\alpha )^{-1}A=\widetilde{S}(\alpha )I(\alpha )\), and \(\widetilde{P}(\alpha )^{-1}A=\widetilde{R}_s(\alpha )I(\alpha )\);

  2. (ii)

    If \(M(\alpha )\) is diagonalizable, i.e., there exist a diagonal matrix \(\Omega (\alpha )=\mathrm{diag}(\mu _1, \mu _2, \ldots , \mu _m)\) and a nonsingular matrix \(Q(\alpha )\) such that \(M(\alpha )=Q(\alpha )^{-1}\Omega (\alpha )Q(\alpha )\), then

    • \(\mathrm{(ii}_\mathrm{1}\mathrm{)}\) The eigenvalues of \(\widetilde{L}(\alpha )^{-1}A\) and \(\widetilde{U}(\alpha )^{-1}A\) are located within the same union of circles on the complex plane, centered at \(\widetilde{c}_{\pm j}(\alpha )\) and of radius \(\widetilde{\delta }(\alpha )\), where

      $$\begin{aligned} \widetilde{c}_{\pm j}(\alpha ) =\frac{1}{\sqrt{2(\alpha ^2+1)}}((\alpha +1)\pm \mathrm{i}(\alpha -1))\mu _j \end{aligned}$$

      and

      $$\begin{aligned} \widetilde{\delta }(\alpha ) =\Vert Q^{-1}(\alpha )\Vert \Vert Q(\alpha )\Vert \Vert \widetilde{V}(\alpha )\Vert \left\{ \sqrt{1+\Vert \widetilde{V}(\alpha )\Vert ^2}+\Vert M(\alpha )-I\Vert \right\} ; \end{aligned}$$
    • \(\mathrm{(ii}_\mathrm{2}\mathrm{)}\) The eigenvalues of \(\widetilde{P}(\alpha )^{-1}A\) are located within the union of circles on the complex plane, centered at \(\widetilde{c}_{\pm j}(\alpha )\) and of radius \(\widetilde{\delta }_s(\alpha )\), where

      $$\begin{aligned} \widetilde{c}_{\pm j}(\alpha ) =\frac{1}{\sqrt{2(\alpha ^2+1)}}((\alpha +1)\pm \mathrm{i}(\alpha -1))\mu _j \end{aligned}$$

      and

      $$\begin{aligned} \widetilde{\delta }_s(\alpha ) = \Vert Q^{-1}(\alpha )\Vert \Vert Q(\alpha )\Vert \left\{ \Vert \widetilde{V}(\alpha )\Vert ^2\sqrt{1+\Vert \widetilde{V}(\alpha )\Vert ^2} +\Vert \widetilde{V}(\alpha )\Vert \Vert M(\alpha )-I\Vert \left( \Vert \widetilde{V}(\alpha )\Vert +1\right) \right\} . \end{aligned}$$

Proof

We first prove (i). We derive the product expression only for the matrix \(\widetilde{L}(\alpha )^{-1}A\); those for the IRBUT and IRBTP preconditioning matrices \(\widetilde{U}(\alpha )\) and \(\widetilde{P}(\alpha )\) can be derived analogously. To this end, we define the matrix

$$\begin{aligned} J(\alpha )=\frac{1}{\sqrt{\alpha ^2+1}} \left( \begin{array}{cc} \alpha I &{}\quad I \\ -I &{}\quad \alpha I \end{array}\right) . \end{aligned}$$

By straightforward computation, we obtain \(J(\alpha )^{-1}=J(\alpha )^\mathrm{T}\), \(J(\alpha )^{-1}G^\mathrm{T}=I(\alpha )\), and

$$\begin{aligned} AJ(\alpha )=\frac{1}{\sqrt{\alpha ^2+1}} \left( \begin{array}{cc} \alpha W+T &{}\quad W-\alpha T \\ \alpha T-W &{}\quad \alpha W+T \end{array}\right) . \end{aligned}$$
(3.3)

Because the matrices \(A\) and \(G^\mathrm{T}\) satisfy \(AG^\mathrm{T}=G^\mathrm{T}A\) and

$$\begin{aligned} \widetilde{L}(\alpha )^{-1}=\sqrt{\alpha ^2+1} \left( \begin{array}{cc} (\widetilde{\alpha W+T})^{-1} &{}\quad 0 \\ \widetilde{V}(\alpha )(\widetilde{\alpha W+T})^{-1} &{}\quad (\widetilde{\alpha W+T})^{-1} \end{array}\right) G^\mathrm{T}, \end{aligned}$$
(3.4)

it follows from (3.3) and (3.4) that

$$\begin{aligned} \widetilde{L}(\alpha )^{-1}A&=\sqrt{\alpha ^2+1} \left( \begin{array}{cc} (\widetilde{\alpha W+T})^{-1} &{}\quad 0 \\ \widetilde{V}(\alpha )(\widetilde{\alpha W+T})^{-1} &{}\quad (\widetilde{\alpha W+T})^{-1} \end{array}\right) A G^\mathrm{T} \\&=\sqrt{\alpha ^2+1} \left( \begin{array}{cc} (\widetilde{\alpha W+T})^{-1} &{}\quad 0 \\ \widetilde{V}(\alpha )(\widetilde{\alpha W+T})^{-1} &{}\quad (\widetilde{\alpha W+T})^{-1} \end{array}\right) (AJ(\alpha ))(J(\alpha )^{-1}G^\mathrm{T})\\&=\left( \begin{array}{cc} (\widetilde{\alpha W+T})^{-1} &{}\quad 0 \\ \widetilde{V}(\alpha )(\widetilde{\alpha W+T})^{-1} &{}\quad (\widetilde{\alpha W+T})^{-1} \end{array}\right) \left( \begin{array}{cc} \alpha W+T &{}\quad W-\alpha T \\ \alpha T-W &{}\quad \alpha W+T \end{array}\right) I(\alpha ) \\&=\left( \begin{array}{cc} M(\alpha ) &{}\quad \widetilde{V}(\alpha ) \\ \widetilde{V}(\alpha )(M(\alpha )-I) &{}\quad M(\alpha )+\widetilde{V}(\alpha )^2 \end{array}\right) I(\alpha ) \\&=\widetilde{R}(\alpha ) I(\alpha ). \end{aligned}$$

This shows the validity of (i).

Now we demonstrate (ii). We only prove the eigenvalue properties of the IRBLT preconditioned matrix \(\widetilde{L}(\alpha )^{-1}A\) in \((\mathrm{ii}_1)\) because those of the IRBUT preconditioned matrix \(\widetilde{U}(\alpha )^{-1}A\) can be obtained similarly. From (3.1) we have

$$\begin{aligned} \widetilde{R}(\alpha )=D(\alpha )+\widetilde{Y}(\alpha ), \end{aligned}$$

with

$$\begin{aligned} \widetilde{Y}(\alpha )= \left( \begin{array}{cc} 0 &{}\quad \widetilde{V}(\alpha ) \\ \widetilde{V}(\alpha )(M(\alpha )-I) &{}\quad \widetilde{V}(\alpha )^2 \end{array}\right) = \left( \begin{array}{cc} \widetilde{V}(\alpha ) &{}\quad 0 \\ 0 &{}\quad \widetilde{V}(\alpha ) \end{array}\right) \left( \begin{array}{cc} 0 &{}\quad I \\ M(\alpha )-I &{}\quad \widetilde{V}(\alpha ) \end{array}\right) . \end{aligned}$$

Thus, it can be obtained from (i) that

$$\begin{aligned} \widetilde{L}(\alpha )^{-1}A=\widetilde{R}(\alpha ) I(\alpha ) =D(\alpha )I(\alpha )+\widetilde{Y}(\alpha )I(\alpha ) \end{aligned}$$

and

$$\begin{aligned} \Phi (\alpha )^*\widetilde{L}(\alpha )^{-1}A\Phi (\alpha )&=\Phi (\alpha )^*D(\alpha )I(\alpha )\Phi (\alpha )+ \Phi (\alpha )^*\widetilde{Y}(\alpha )I(\alpha )\Phi (\alpha ) \\&=\Phi (\alpha )^*D(\alpha )\Phi (\alpha ) \Phi (\alpha )^*I(\alpha )\Phi (\alpha )+ \Phi (\alpha )^*\widetilde{Y}(\alpha )I(\alpha )\Phi (\alpha ) \\&=D(\alpha )\Lambda (\alpha )+ \Phi (\alpha )^*\widetilde{Y}(\alpha )I(\alpha )\Phi (\alpha ) \\&=\widetilde{D}(\alpha )+ \Phi (\alpha )^*\widetilde{Y}(\alpha )I(\alpha )\Phi (\alpha ). \end{aligned}$$

Here we have used the fact that \(\Phi (\alpha )\) is unitary and \(\Phi (\alpha )^*D(\alpha )\Phi (\alpha )=D(\alpha )\). Because \(I(\alpha )\) is orthogonal, we can further obtain

$$\begin{aligned} \Vert \Phi (\alpha )^*\widetilde{Y}(\alpha )I(\alpha )\Phi (\alpha )\Vert&=\Vert \widetilde{Y}(\alpha )\Vert \\&\le \Vert \widetilde{V}(\alpha )\Vert \left\| \left( \begin{array}{cc} 0 &{}\quad I \\ M(\alpha )-I &{}\quad \widetilde{V}(\alpha ) \end{array}\right) \right\| \\&\le \Vert \widetilde{V}(\alpha )\Vert \left\{ \left\| \left( \begin{array}{cc} 0 &{}\quad I \\ 0 &{}\quad \widetilde{V}(\alpha ) \end{array}\right) \right\| +\left\| \left( \begin{array}{cc} 0 &{}\quad 0 \\ M(\alpha )-I &{}\quad 0 \end{array}\right) \right\| \right\} \\&\le \Vert \widetilde{V}(\alpha )\Vert \left\{ \sqrt{1+\Vert \widetilde{V}(\alpha )\Vert ^2}+\Vert M(\alpha )-I\Vert \right\} . \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert \Phi (\alpha )^*\widetilde{L}(\alpha )^{-1}A \Phi (\alpha )-\widetilde{D}(\alpha )\Vert \le \Vert \widetilde{V}(\alpha )\Vert \left\{ \sqrt{1+\Vert \widetilde{V}(\alpha )\Vert ^2}+\Vert M(\alpha )-I\Vert \right\} . \end{aligned}$$

Let \(\lambda \) be an eigenvalue of the matrix \(\widetilde{L}(\alpha )^{-1}A\). Because \(\Phi (\alpha )\) is a unitary matrix, \(\lambda \) is also an eigenvalue of the matrix \(\Phi (\alpha )^*\widetilde{L}(\alpha )^{-1}A \Phi (\alpha )\). If \(M(\alpha )\) is diagonalizable, i.e., \(M(\alpha )=Q(\alpha )^{-1}\Omega (\alpha )Q(\alpha )\), then from (3.2) we obtain

$$\begin{aligned} \widetilde{D}(\alpha )&= \frac{1}{\sqrt{2(\alpha ^2+1)}} \left( \begin{array}{cc} Q(\alpha )^{-1} &{}\quad 0 \\ 0 &{}\quad Q(\alpha )^{-1} \end{array}\right) \\&\quad \times \left( \begin{array}{cc} ((\alpha +1)+\mathrm{i}(\alpha -1)) \Omega (\alpha ) &{}\quad 0 \\ 0 &{}\quad ((\alpha +1)-\mathrm{i}(\alpha -1)) \Omega (\alpha ) \end{array}\right) \left( \begin{array}{cc} Q(\alpha ) &{}\quad 0 \\ 0 &{}\quad Q(\alpha ) \end{array}\right) , \end{aligned}$$

where \(\Omega (\alpha )=\mathrm{diag}(\mu _1, \mu _2, \ldots , \mu _m)\). Denoting \(\widetilde{c}_{\pm j}(\alpha )=((\alpha +1)\pm \mathrm{i}(\alpha -1))\mu _j/\sqrt{2(\alpha ^2+1)}\), we know that \(\widetilde{c}_{\pm j}(\alpha )\) are the eigenvalues of the matrix \(\widetilde{D}(\alpha )\). From the Bauer–Fike theorem [29, 30], which bounds the distance from any eigenvalue of a perturbed matrix to the spectrum of a diagonalizable matrix by the condition number of the eigenvector matrix times the norm of the perturbation, we have

$$\begin{aligned} |\lambda -\widetilde{c}_{\pm j}(\alpha )|&\le \Vert Q(\alpha )^{-1}\Vert \Vert Q(\alpha )\Vert \Vert \Phi (\alpha )^*\widetilde{L}(\alpha )^{-1}A \Phi (\alpha )-\widetilde{D}(\alpha )\Vert \\&\le \Vert Q(\alpha )^{-1}\Vert \Vert Q(\alpha )\Vert \Vert \widetilde{V}(\alpha )\Vert \left\{ \sqrt{1+\Vert \widetilde{V}(\alpha )\Vert ^2}+\Vert M(\alpha )-I\Vert \right\} = \widetilde{\delta }(\alpha ). \end{aligned}$$

This verifies the validity of \((\mathrm{ii}_1)\).

Finally, we prove \((\mathrm{ii}_2)\). By rewriting \(\widetilde{R}_s(\alpha )=D(\alpha )+\widetilde{Y}_s(\alpha )\) with

$$\begin{aligned} \widetilde{Y}_s(\alpha )= \left( \begin{array}{cc} \widetilde{V}(\alpha )^2(I-M(\alpha )) &{} \widetilde{V}(\alpha )-\widetilde{V}(\alpha )^3-\widetilde{V}(\alpha )M(\alpha ) \\ \widetilde{V}(\alpha )(M(\alpha )-I) &{} \widetilde{V}(\alpha )^2 \end{array}\right) , \end{aligned}$$

from (i) we obtain

$$\begin{aligned} \widetilde{P}(\alpha )^{-1}A=\widetilde{R}_s(\alpha ) I(\alpha ) =D(\alpha )I(\alpha )+\widetilde{Y}_s(\alpha )I(\alpha ) \end{aligned}$$

and, thereby,

$$\begin{aligned} \Phi (\alpha )^*\widetilde{P}(\alpha )^{-1}A\Phi (\alpha )&=\Phi (\alpha )^*D(\alpha )I(\alpha )\Phi (\alpha ) +\Phi (\alpha )^*\widetilde{Y}_s(\alpha )I(\alpha )\Phi (\alpha ) \nonumber \\&=D(\alpha )\Lambda (\alpha )+ \Phi (\alpha )^*\widetilde{Y}_s(\alpha )I(\alpha )\Phi (\alpha ) \nonumber \\&=\widetilde{D}(\alpha )+ \Phi (\alpha )^*\widetilde{Y}_s(\alpha )I(\alpha )\Phi (\alpha ). \end{aligned}$$
(3.5)

Moreover, it holds that

$$\begin{aligned} \Vert \Phi (\alpha )^*\widetilde{Y}_s(\alpha )I(\alpha )\Phi (\alpha )\Vert&\!=\!\Vert \widetilde{Y}_s(\alpha )\Vert \nonumber \\&\!\le \! \left\| \left( \begin{array}{cc} 0 &{}\quad \!-\!\widetilde{V}(\alpha )^3 \\ 0 &{}\quad \widetilde{V}(\alpha )^2 \end{array}\right) \right\| \!+\! \left\| \left( \begin{array}{cc} \widetilde{V}(\alpha )^2(I\!-\!M(\alpha )) &{}\quad \widetilde{V}(\alpha )(I\!-\!M(\alpha )) \\ \widetilde{V}(\alpha )(M(\alpha )\!-\!I) &{}\quad \widetilde{V}(\alpha ) \end{array}\right) \right\| \nonumber \\&\!\le \! \Vert \widetilde{V}(\alpha )\Vert ^2 \sqrt{1\!+\!\Vert \widetilde{V}(\alpha )\Vert ^2} \!+\!\left\| \left( \begin{array}{cc} \widetilde{V}(\alpha )(I\!-\!M(\alpha )) &{}\quad 0 \\ 0 &{}\quad \widetilde{V}(\alpha )(I-M(\alpha )) \end{array}\right) \left( \begin{array}{cc} \widetilde{V}(\alpha ) &{}\quad I \\ \!-\!I &{}\quad 0 \end{array}\right) \right\| \nonumber \\&\!\le \! \Vert \widetilde{V}(\alpha )\Vert ^2\sqrt{1\!+\!\Vert \widetilde{V}(\alpha )\Vert ^2} \!+\!\Vert \widetilde{V}(\alpha )\Vert \Vert M(\alpha )\!-\!I\Vert \left\| \left( \begin{array}{cc} \widetilde{V}(\alpha ) &{}\quad I \\ \!-\!I &{}\quad 0 \end{array}\right) \right\| \nonumber \\&\!\le \! \Vert \widetilde{V}(\alpha )\Vert ^2\sqrt{1\!+\!\Vert \widetilde{V}(\alpha )\Vert ^2} \!+\!\Vert \widetilde{V}(\alpha )\Vert \Vert M(\alpha )\!-\!I\Vert \left\{ \left\| \left( \begin{array}{cc} \widetilde{V}(\alpha ) &{}\quad 0 \\ 0 &{}\quad 0 \end{array}\right) \right\| \!+\!\left\| \left( \begin{array}{cc} 0 &{}\quad I \\ -I &{}\quad 0 \end{array}\right) \right\| \right\} \nonumber \\&\!\le \! \Vert \widetilde{V}(\alpha )\Vert ^2\sqrt{1\!+\!\Vert \widetilde{V}(\alpha )\Vert ^2} \!+\!\Vert \widetilde{V}(\alpha )\Vert \Vert M(\alpha )\!-\!I\Vert \left( \Vert \widetilde{V}(\alpha )\Vert \!+\!1\right) . \end{aligned}$$
(3.6)

Therefore, from (3.5) and (3.6) we have

$$\begin{aligned} \Vert \Phi (\alpha )^*\widetilde{P}(\alpha )^{-1}A \Phi (\alpha )-\widetilde{D}(\alpha )\Vert \le&\Vert \widetilde{V}(\alpha )\Vert ^2\sqrt{1+\Vert \widetilde{V}(\alpha )\Vert ^2} +\Vert \widetilde{V}(\alpha )\Vert \Vert M(\alpha )-I\Vert \left( \Vert \widetilde{V}(\alpha )\Vert +1\right) = \widetilde{\delta }_s(\alpha ). \end{aligned}$$
(3.7)

Analogously to the derivation of \((\mathrm{ii}_1)\), from (3.7) we immediately obtain the validity of \((\mathrm{ii}_2)\). \(\square \)

For the special case \(\widetilde{\alpha W+T}=\alpha W+T\), i.e., \(M(\alpha )=I\), the inexact rotated block triangular preconditioners reduce to the exact ones, and the results given in Theorem 3.1 coincide with those in [12]. By selecting a suitable parameter \(\alpha \) and approximation matrix \(\widetilde{\alpha W+T}\), we may guarantee that all the eigenvalues of these inexact preconditioned matrices are located in the right or left half of the complex plane, though this is not an easy task. In applications, we choose \(\widetilde{\alpha W+T}\) in accordance with the special structures of the matrices \(W\) and \(T\).
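As a sanity check of Theorem 3.1 (our own illustration, not part of the paper's experiments), the inclusion regions of \((\mathrm{ii}_1)\) can be verified numerically on a small dense problem; the crude triangular approximation used below for \(\widetilde{\alpha W+T}\) is an arbitrary assumption chosen only so that \(M(\alpha )\ne I\).

```python
# Sketch: numerical check of the eigenvalue enclosure in Theorem 3.1 (ii_1).
import numpy as np

rng = np.random.default_rng(0)
m, alpha = 30, 1.0
W = rng.standard_normal((m, m)); W = W @ W.T + m * np.eye(m)   # SPD block
T = rng.standard_normal((m, m)); T = T @ T.T + m * np.eye(m)   # SPD block
A = np.block([[W, -T], [T, W]])

AWT = alpha * W + T
AWT_tilde = np.tril(AWT)                       # crude nonsingular approximation (assumption)
M = np.linalg.solve(AWT_tilde, AWT)            # M(alpha)
V = np.linalg.solve(AWT_tilde, W - alpha * T)  # V~(alpha)

G = np.block([[np.eye(m), -np.eye(m)], [np.eye(m), np.eye(m)]]) / np.sqrt(2)
L_tilde = G @ np.block([[AWT_tilde, np.zeros((m, m))],
                        [alpha * T - W, AWT_tilde]]) / np.sqrt(alpha**2 + 1)
lam = np.linalg.eigvals(np.linalg.solve(L_tilde, A))   # spectrum of L~(alpha)^{-1} A

mu, Evec = np.linalg.eig(M)                    # M = Evec diag(mu) Evec^{-1}
kappa = np.linalg.cond(Evec)
nV = np.linalg.norm(V, 2)
delta = kappa * nV * (np.sqrt(1 + nV**2) + np.linalg.norm(M - np.eye(m), 2))

s = (alpha + 1) / np.sqrt(2 * (alpha**2 + 1))
t = (alpha - 1) / np.sqrt(2 * (alpha**2 + 1))
centers = np.concatenate([(s + 1j * t) * mu, (s - 1j * t) * mu])
dist = np.min(np.abs(lam[:, None] - centers[None, :]), axis=1)
print(dist.max() <= delta + 1e-10)             # expected: True
```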

4 Numerical results

In this section, we test the effectiveness of our proposed inexact rotated block triangular preconditioners. To this end, we apply the GMRES iteration method, incorporated with the IRBLT, the IRBUT, and the IRBTP preconditioners, to the system of linear equations (1.1).

The real block two-by-two linear system (1.1) is equivalently reformulated from the complex linear system

$$\begin{aligned} (W+\mathrm{i}T)\mathbf{x}=\mathbf{h}, \quad \text {with} \ \mathbf{x}=y+\mathrm{i}z \quad \text {and}\quad \mathbf{h}=f+\mathrm{i}g. \end{aligned}$$
(4.1)

In the following two examples [12, 13], we need to solve the linear system (4.1).
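For clarity (our own illustration, with assumed function names), the passage between the complex form (4.1) and the real block form (1.1) can be coded as follows.

```python
# Sketch: moving between the complex system (4.1) and the real form (1.1).
import numpy as np
import scipy.sparse as sp

def complex_to_real(W, T, h):
    """Return (A, b) with A = [[W, -T], [T, W]] and b = (f^T, g^T)^T = (Re h, Im h)."""
    A = sp.bmat([[W, -T], [T, W]], format="csr")
    b = np.concatenate([np.asarray(h).real, np.asarray(h).imag])
    return A, b

def real_to_complex(v):
    """Recover x = y + i z from the stacked real solution v = (y^T, z^T)^T."""
    m = v.size // 2
    return v[:m] + 1j * v[m:]
```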

Example 4.1

(Complex-Shifted Linear System) Consider the parabolic partial differential equation in [12, 13]. By discretizing it with a finite-element method of bilinear elements on a uniform rectangular mesh, we obtain a linear system of the form

$$\begin{aligned} (K+\mathrm{i}\omega \tau M)\mathbf{x}=\mathbf{h}, \end{aligned}$$

where \(\omega \) is a positive parameter, \(\tau \) is the temporal discretization step size, and \(K=M+\tau L\in {\mathbb {R}}^{m\times m}\), with \(M\in {\mathbb {R}}^{m\times m}\) being the mass matrix and \(L\in {\mathbb {R}}^{m\times m}\) the discrete negative Laplace operator. Here, the exact solution of the parabolic partial differential equation and the right-hand-side vector \(\mathbf{h}\) of the discretized linear system are given as in [12]; see also [11, 13].

In Example 4.1, let \(W=K\) and \(T=\omega \tau M\). Then we obtain the linear system (4.1). It is easy to verify that the matrices \(W\) and \(T\) are symmetric positive definite, and hence so is \(\alpha W+T\) for any \(\alpha >0\).
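The discretization data for Example 4.1 are specified in [12]; purely as an illustration of how such blocks can be formed, the following sketch assembles \(W=K\) and \(T=\omega \tau M\) from one-dimensional bilinear-element mass and stiffness matrices on a uniform mesh of the unit square. The mesh, the homogeneous Dirichlet boundary conditions, and all parameter values here are our assumptions, not the data used in the experiments.

```python
# Hedged sketch of Example 4.1-type blocks (assumed unit square, homogeneous
# Dirichlet boundary conditions, l x l interior grid; exact data are in [12]).
import scipy.sparse as sp

def example_4_1_blocks(l, omega, tau):
    h = 1.0 / (l + 1)
    # 1D bilinear (piecewise-linear) element mass and stiffness matrices
    M1 = (h / 6.0) * sp.diags([1.0, 4.0, 1.0], [-1, 0, 1], shape=(l, l), format="csr")
    K1 = (1.0 / h) * sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(l, l), format="csr")
    M2 = sp.kron(M1, M1, format="csr")                                  # mass matrix M
    L2 = sp.kron(K1, M1, format="csr") + sp.kron(M1, K1, format="csr")  # negative Laplacian L
    K = M2 + tau * L2
    return K, omega * tau * M2                                          # W = K, T = omega*tau*M
```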

Example 4.2

(Nonsymmetric BBC Problem) Consider a complex linear system of the form \((W+\mathrm{i}T)\mathbf{x}=\mathbf{h}\), with

$$\begin{aligned} W=I\otimes L+L\otimes I \quad \text {and} \quad T=10(I\otimes L_c+L_c\otimes I)+ 9(e_1e_{l}^\mathrm{T}+e_le_1^\mathrm{T})\otimes I, \end{aligned}$$

where \(L=\mathrm{tridiag}(-1-\theta ,2,-1+\theta ) \in {\mathbb {R}}^{l\times l}\) is a tridiagonal matrix, \(\theta =\nu /[2(l+1)]\) with \(\nu \) a positive constant, \(L_c=L-e_1e_{l}^\mathrm{T}-e_{l}e_1^\mathrm{T}\), and \(e_1\) and \(e_l\) are the first and last column vectors of the identity matrix \(I\in {\mathbb {R}}^{l\times l}\), respectively. The right-hand-side vector \(\mathbf{h}\) is defined as in [12]. See also [13, 18, 19].
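Since the blocks of Example 4.2 are given explicitly above, they can be assembled directly, for instance as in the following sketch (our own illustration; the right-hand-side vector \(\mathbf{h}\) is specified in [12] and is not reproduced here).

```python
# Sketch: assembling W and T of Example 4.2 from the definitions above.
import numpy as np
import scipy.sparse as sp

def example_4_2_blocks(l, nu):
    theta = nu / (2.0 * (l + 1))
    I = sp.identity(l, format="csr")
    L = sp.diags([-1.0 - theta, 2.0, -1.0 + theta], [-1, 0, 1],
                 shape=(l, l), format="csr")                    # tridiag(-1-theta, 2, -1+theta)
    e1 = np.zeros((l, 1)); e1[0, 0] = 1.0                       # first column of I
    el = np.zeros((l, 1)); el[-1, 0] = 1.0                      # last column of I
    E = sp.csr_matrix(e1 @ el.T + el @ e1.T)                    # e1*el^T + el*e1^T
    Lc = L - sp.csr_matrix(e1 @ el.T) - sp.csr_matrix(el @ e1.T)
    W = sp.kron(I, L) + sp.kron(L, I)
    T = 10.0 * (sp.kron(I, Lc) + sp.kron(Lc, I)) + 9.0 * sp.kron(E, I)
    return W.tocsr(), T.tocsr()
```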

In our implementations, the initial guess is taken to be zero and the iteration process is terminated once the current residual \(r^{(j)}\) satisfies

$$\begin{aligned} \frac{\Vert r^{(j)}\Vert _2}{\Vert r^{(0)}\Vert _2}\le 10^{-6}. \end{aligned}$$

The parameter \(\alpha \) is taken to be \(1.0\) because we observe that in our implementations the results are not sensitive to \(\alpha \); from [12] we see that the same holds true for the exact preconditioned GMRES methods. In addition, we implement the inexact preconditioning processes by an incomplete Cholesky factorization in Example 4.1 and by an incomplete LU factorization in Example 4.2. All codes were written in MATLAB R2008a, and all experiments were performed on a personal computer with 1.86 GB of memory.
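Putting these pieces together, a preconditioned GMRES call matching the stopping criterion above might look as follows. This is only a sketch reusing the helpers sketched earlier (`example_4_2_blocks`, `complex_to_real`, `real_to_complex`, `make_irbut_preconditioner`, all our own names); the right-hand side, restart length, and iteration cap are placeholders rather than the data of [12], and the relative-tolerance keyword is `rtol` in recent SciPy releases (`tol` in older ones).

```python
# Sketch: IRBUT-preconditioned GMRES with a relative residual tolerance of 1e-6.
import numpy as np
import scipy.sparse.linalg as spla

W, T = example_4_2_blocks(l=32, nu=1.0)              # placeholder problem data
h = (W + 1j * T) @ np.ones(W.shape[0])               # placeholder RHS: exact solution = ones
A, b = complex_to_real(W, T, h)
M = make_irbut_preconditioner(W, T, alpha=1.0)       # alpha = 1.0 as in the experiments
v, info = spla.gmres(A, b, M=M, rtol=1e-6, restart=30, maxiter=500)
x = real_to_complex(v)                               # recover the complex solution of (4.1)
```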

In Tables 1 and 2, we list the numbers of iteration steps (IT) and the CPU times in seconds (CPU) of the IRBLT, IRBUT, and IRBTP preconditioned GMRES methods, abbreviated as IRBLT-GMRES, IRBUT-GMRES, and IRBTP-GMRES, for Examples 4.1 and 4.2, respectively, with respect to different values of the problem parameter \(\omega \) or \(\nu \) and the problem size \(m\). The CPU times are shown in parentheses. In Tables 3 and 4, the speed-up ratios of the CPU times between the exact preconditioned GMRES methods in [12] and the inexact ones in this paper, i.e., \(\mathrm{CPU_{exact}}/\mathrm{CPU_{inexact}}\), are listed for Examples 4.1 and 4.2, respectively, with respect to various values of the problem parameter \(\omega \) or \(\nu \) and the problem size \(m\).

Table 1 Iteration step and CPU time [IT (CPU)] for Example 4.1
Table 2 Iteration step and CPU time [IT (CPU)] for Example 4.2

From Tables 1 and 2 we see that all three inexact preconditioned GMRES methods successfully compute satisfactory approximations to the exact solutions of Examples 4.1 and 4.2 in a few iteration steps. Moreover, for each method and for any fixed problem parameter \(\omega \) or \(\nu \), the number of iteration steps remains almost invariant as the problem size \(m\) increases. It can be noticed from Table 1 that, of the three methods, the IRBLT-GMRES method needs the most iteration steps and CPU time. The IRBUT-GMRES and IRBTP-GMRES methods require almost the same number of iteration steps, but the IRBTP-GMRES method costs more CPU time because each application of the IRBTP preconditioner requires solving three subsystems with the coefficient matrix \(\widetilde{\alpha W+T}\), whereas the IRBUT preconditioner requires only two. We also see from Table 2 that the IRBUT-GMRES method requires the least CPU time, and the IRBTP-GMRES method sometimes requires the fewest iteration steps in Example 4.2.

Comparing the results in Tables 1 and 2 with those in [12] for Examples 4.1 and 4.2, we observe that, for the same parameters \(\alpha \), \(\omega \) or \(\nu \), and \(m\), the iteration steps of the IRBLT-GMRES, IRBUT-GMRES, and IRBTP-GMRES methods remain almost the same as those of the RBLT-GMRES, RBUT-GMRES, and RBTP-GMRES methods. The CPU times of these inexact preconditioned GMRES methods are much less than those of the corresponding exact ones because the inexact preconditioning processes cost less CPU time than the exact ones. More specifically, in Example 4.1 we perform an incomplete Cholesky factorization with threshold and pivoting (ICTP) [21] by making use of the sparse structure of the matrix \(\alpha W+T\), denoted by \(\alpha W+T\thickapprox \widetilde{L}\widetilde{L}^\mathrm{T}\), and let \(\widetilde{\alpha W+T}=\widetilde{L}\widetilde{L}^\mathrm{T}\). Because \(\widetilde{L}\widetilde{L}^\mathrm{T}\) is a good approximation of \(\alpha W+T\) and \(\widetilde{L}\) is a very sparse matrix, solving subsystems with \(\widetilde{\alpha W+T}\) is very cheap in the inexact preconditioning processes. In Example 4.2 we use an incomplete LU factorization with threshold and pivoting (ILUTP) for the matrix \(\alpha W+T\) for the same reasons. Therefore, the IRBLT, IRBUT, and IRBTP preconditioners are superior to the exact ones in [12] for accelerating the convergence rate of the GMRES method in Examples 4.1 and 4.2.

Table 3 Speed-up \((\mathrm{CPU_{exact}}/\mathrm{CPU_{inexact}})\) for Example 4.1
Table 4 Speed-up \((\mathrm{CPU_{exact}}/\mathrm{CPU_{inexact}})\) for Example 4.2

From Tables 3 and 4 we see that the speed-up ratios \(\mathrm{CPU_{exact}}/\mathrm{CPU_{inexact}}\) are greater than 1 for every fixed problem parameter \(\omega \) or \(\nu \) and problem size \(m\), which indicates that the IRBLT-GMRES, IRBUT-GMRES, and IRBTP-GMRES methods cost less CPU time than their exact counterparts. This shows the advantages of these inexact preconditioners over the exact preconditioners in [12]; the larger the speed-up ratio, the more apparent the superiority of the inexact preconditioners over the exact ones. Further, the speed-up ratios increase as the parameter \(m\) grows (i.e., as the matrices become larger) in both Examples 4.1 and 4.2. In particular, when \(m=2,025\), the speed-up ratios are the largest for all cases listed in Tables 3 and 4. Hence, we may conclude that as the matrices grow in size and become relatively sparser, the inexact preconditioners become increasingly well suited to solving the systems of linear equations in Examples 4.1 and 4.2. Moreover, the speed-up ratios of the IRBTP-GMRES method are generally greater than those of the IRBLT-GMRES and IRBUT-GMRES methods. Thus, on the basis of the speed-up ratios of the CPU times, the inexact preconditioners proposed in this paper are more effective than the exact ones in [12].

5 Concluding remarks

Based on the RBLT, RBUT, and RBTP preconditioners in [12], we proposed the IRBLT, IRBUT, and IRBTP preconditioning matrices for linear systems whose coefficient matrix is a block two-by-two matrix with real square blocks. We analyzed the eigenvalue properties of these three inexact preconditioned matrices and showed the effectiveness of the IRBLT, IRBUT, and IRBTP preconditioners at accelerating the convergence rates of Krylov subspace iteration methods such as GMRES for solving the linear system (1.1), on the basis of both theoretical analysis and numerical results.

In Sect. 3 we proved that the eigenvalues of the IRBLT, IRBUT, and IRBTP preconditioned matrices are respectively located within the union of certain circles. In the general case it is difficult to describe the eigenvalues and eigenvectors of these preconditioned matrices in as much detail as was done for the exact ones in [12]. In practical applications, we may choose the approximation matrix \(\widetilde{\alpha W+T}\) by splitting the matrix \(\alpha W+T\). More specifically, let \(\alpha W+T=B(\alpha )-C(\alpha )\) be a splitting of \(\alpha W+T\). Then we can take \(\widetilde{\alpha W+T}=B(\alpha )\) or

$$\begin{aligned} \widetilde{\alpha W+T}=(\alpha W+T)\left[ I-\left( B(\alpha )^{-1}C(\alpha )\right) ^m\right] ^{-1} =B(\alpha )\left[ \sum _{j=0}^{m-1}\left( B(\alpha )^{-1}C(\alpha )\right) ^j\right] ^{-1}, \end{aligned}$$
(5.1)

with \(m\) being a positive integer; see [31]. We expect that, when the approximation matrix \(\widetilde{\alpha W+T}\) is defined as in (5.1), more detailed eigenproperties of the preconditioned matrices can be derived in future work.
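Regarding (5.1), we note for completeness that the second equality is an immediate consequence of the splitting (a short verification of our own): since

$$\begin{aligned} \alpha W+T=B(\alpha )\left[ I-B(\alpha )^{-1}C(\alpha )\right] \quad \text {and}\quad I-\left( B(\alpha )^{-1}C(\alpha )\right) ^m =\left[ I-B(\alpha )^{-1}C(\alpha )\right] \sum _{j=0}^{m-1}\left( B(\alpha )^{-1}C(\alpha )\right) ^j, \end{aligned}$$

and the two bracketed factors commute (both are polynomials in \(B(\alpha )^{-1}C(\alpha )\)), we obtain \((\alpha W+T)\left[ I-\left( B(\alpha )^{-1}C(\alpha )\right) ^m\right] ^{-1} =B(\alpha )\left[ \sum _{j=0}^{m-1}\left( B(\alpha )^{-1}C(\alpha )\right) ^j\right] ^{-1}\), provided the relevant inverses exist.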