Abstract
Extended Krylov subspace methods are attractive for computing approximations of matrix functions and for other problems producing large-scale matrices. In this work, we propose the extended nonsymmetric global Lanczos method for solving some matrix approximation problems. The derived algorithm uses short recurrence relations to generate bi-orthonormal bases, with respect to the Frobenius inner product, of the corresponding extended Krylov subspaces \({K^{e}_{m}}(A,V)\) and \({K^{e}_{m}}(A^{T},W)\). Here, A is a large nonsymmetric matrix and V and \(W\in \mathbb {R}^{n\times s}\) are two blocks. New algebraic properties of the proposed method are developed, and applications to the approximation of both \(W^{T}f(A)V\) and \(\operatorname{trace}(W^{T}f(A)V)\) are given. Numerical examples are presented to show the performance of the extended nonsymmetric global Lanczos method for these problems.
1 Introduction
Let \(A\in \mathbb {R}^{n\times n}\) be a large nonsymmetric and nonsingular matrix, and let V and \(W\in \mathbb {R}^{n\times s}\) be two block vectors with 1 < s ≪ n. We are interested in approximating expressions of the form
\( \mathcal {I}=W^{T}f(A)V \quad \text {and} \quad \mathcal {I}_{tr}=\operatorname{trace}(W^{T}f(A)V), \) (1)
where f is a function such that f(A) is well defined. The evaluation of expressions (1) arises in various applications such as network analysis (\(f(t)=\exp (t)\) or \(f(t)=t^{3}\)) [10], machine learning (\(f(t)=\log (t)\)) [13, 20], the theory of quantum chromodynamics (\(f(t)=t^{1/2}\)), electronic structure computation [3, 7, 22], and the solution of ill-posed problems [11, 14]. The matrix function f(A) can be defined by the spectral factorization of A; see, e.g., [14, 16] for discussions of several possible definitions of matrix functions. In the present paper, we assume that the matrix A is so large that it is impractical to evaluate its spectral factorization.
The approximation of \(\mathcal {I}_{tr}\) in the case when A is symmetric and W = V using global methods is well studied in [4]. Global Krylov subspace techniques were first proposed in [17] for solving linear equations with multiple right-hand sides and Lyapunov equations. Consider the global Krylov subspace
where \(\pi_{m-1}\) denotes the set of polynomials of degree at most m − 1. Applying m steps of the global Lanczos method [18] to A with the initial block vector V leads to the decomposition
where ⊗ stands for the Kronecker product. The block columns \(V_{1},\ldots ,V_{m}\) of the matrix \(\mathbb {V}_{m}=[V_{1},\ldots ,V_{m}]\in \mathbb {R}^{n\times ms}\) form an F-orthonormal (with respect to the Frobenius inner product) basis of the global Krylov subspace (2), which means that
Moreover, the matrix \(T_{m}\in \mathbb {R}^{m\times m}\) in (3) is symmetric and tridiagonal, \(I_{s}\in \mathbb {R}^{s\times s}\) denotes the identity matrix, \(\beta_{m}\geq 0\), and \(E_{m}\in \mathbb {R}^{ms\times s}\) is the mth block axis vector, i.e., \(E_{m}\) corresponds to columns \((m-1)s+1,\ldots ,ms\) of the identity matrix \(I_{ms}\).
As with the classical Krylov subspace methods, in the symmetric case it is attractive to approximate the expression \(\mathcal {I}_{tr}\) in (1) by
where \(\widetilde {e}_{1}\) is the first vector of the canonical basis of \(\mathbb {R}^{m}\). One can show that the approximation (4) is exact for polynomials of degree at most 2m − 1, i.e.,
see Bellalij et al. [4] for more details. When A is nonsymmetric, the F-orthonormal basis \(\mathbb {V}_{m}\) of \({\mathbb {K}^{gl}_{m}(A,V)}\) cannot be generated with short recurrence relations. The nonsymmetric global Lanczos method [18] generates, with short recurrence relations, a pair of F-biorthonormal bases \(\mathbb {V}_{m}=[V_{1},\ldots ,V_{m}]\) and \(\mathbb {W}_{m}=[W_{1},\ldots ,W_{m}]\) for the Krylov subspaces \(\mathbb {K}_{m}(A,V)\) and \(\mathbb {K}_{m}(A^{T},W)\), respectively. Applying m steps of the nonsymmetric global Lanczos algorithm yields the following relations:
The block vectors \(V_{i}\) and \(W_{i}\) of \(\mathbb {V}_{m}=[V_{1},\ldots ,V_{m}]\) and \(\mathbb {W}_{m}=[W_{1},\ldots ,W_{m}]\), with \(V_{i},W_{i}\in \mathbb {R}^{n\times s}\), are said to be F-biorthonormal if
The matrix \({T^{b}_{m}}\in \mathbb {R}^{m\times m}\) is nonsymmetric and tridiagonal, the coefficients \(\beta _{m},\gamma _{m}\in \mathbb {R}\), and \(E_{m}\) is the matrix defined in (3). The expression \(\mathcal {I}_{tr}\) in (1) can be approximated by
In the same way as in [12], one can show that the approximation \(\mathcal {G}_{m}^{b}\) is exact for polynomials of degree at most 2m − 1, i.e., \(\operatorname{trace}(W^{T}p(A)V)=\langle V,W\rangle _{F}\,e_{1}^{T}p({T^{b}_{m}})e_{1}\) for all \(p\in \pi _{2m-1}\).
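To fix ideas, the following is a minimal numpy sketch of the (unextended) nonsymmetric global Lanczos process and the resulting trace approximation. The function name is ours, the scaling \(\beta _{j+1}=\sqrt {|\langle \widetilde {V},\widetilde {W}\rangle _{F}|}\), \(\gamma _{j+1}=\langle \widetilde {V},\widetilde {W}\rangle _{F}/\beta _{j+1}\) is one common choice, and breakdown handling is omitted:

```python
import numpy as np

def nonsym_global_lanczos(A, V, W, m):
    """m steps of the nonsymmetric global Lanczos process applied to
    (A, V) and (A^T, W), using the Frobenius inner product.

    Returns the m x m nonsymmetric tridiagonal matrix T such that
    trace(W^T p(A) V) = <V, W>_F * e1^T p(T) e1 for deg(p) <= 2m - 1."""
    fro = lambda X, Y: np.trace(Y.T @ X)      # <X, Y>_F = trace(Y^T X)
    c = fro(V, W)
    a = np.sqrt(abs(c))
    V1, W1 = V / a, W / (c / a)               # so that <V1, W1>_F = 1
    Vp, Wp = np.zeros_like(V), np.zeros_like(W)   # V_0 = W_0 = 0
    alphas, betas, gammas = [], [0.0], [0.0]
    for j in range(m):
        alpha = fro(A @ V1, W1)               # alpha_j = <A V_j, W_j>_F
        Vt = A @ V1 - alpha * V1 - gammas[-1] * Vp
        Wt = A.T @ W1 - alpha * W1 - betas[-1] * Wp
        alphas.append(alpha)
        if j < m - 1:
            c = fro(Vt, Wt)                   # beta_{j+1} * gamma_{j+1}
            beta = np.sqrt(abs(c))
            gamma = c / beta
            Vp, V1 = V1, Vt / beta
            Wp, W1 = W1, Wt / gamma
            betas.append(beta)
            gammas.append(gamma)
    # T has alpha on the diagonal, beta below, gamma above
    return (np.diag(alphas)
            + np.diag(betas[1:], -1)
            + np.diag(gammas[1:], 1))
```

The exactness property above can then be checked by comparing \(\operatorname{trace}(W^{T}A^{3}V)\) with \(\langle V,W\rangle _{F}(T^{3})_{1,1}\) for m ≥ 2.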
The approximation of \(W^{T}f(A)V\) is well studied in the literature; see, e.g., [12, 21] and the references therein. For nonsymmetric matrices, Reichel et al. [21] approximate \(\mathcal {I}\) by performing m steps of the nonsymmetric block Lanczos method applied to A with initial block vectors W and V, which leads to the following algebraic relations:
where \(W_{j},V_{j}\in \mathbb {R}^{n\times s}\) are bi-orthonormal, i.e.,
\(O_{s}\) denotes the zero matrix of order s. The matrix \(J_{m}\in \mathbb {R}^{ms\times ms}\) is a block nonsymmetric tridiagonal matrix whose block entries are of size s × s, \(\varGamma _{m},{\varDelta }_{m}\in \mathbb {R}^{s\times s}\), and \(E_{m}\) is the matrix defined in (3). In [21], it was shown that \({\mathcal {G}^{\text {block}}_{m}(f)}\), defined by
where \(\widetilde {E}_{1}\in \mathbb {R}^{m\times s}\) corresponds to the first s columns of the identity matrix \(I_{m}\), can be used to approximate \(\mathcal {I}\). Moreover, \({\mathcal {G}^{\text {block}}_{m}}\) is exact for polynomials of degree at most 2m − 1, i.e.,
In the context of approximating expressions of the form f(A)b for some vector \(b\in \mathbb {R}^{n}\), Druskin et al. [9, 19] computed approximations using the standard extended Krylov subspace generated by the vectors \(A^{-m}b,\ldots ,A^{-1}b,\,b,\,Ab,\ldots ,A^{m-1}b\).
We are interested in exploring approximations of (1) by using a pair of F-biorthonormal bases of the two extended global Krylov subspaces
\( \mathbb {K}^{e}_{m}(A,V)=\text {span}\{A^{-m}V,\ldots ,A^{-1}V,\,V,\,AV,\ldots ,A^{m-1}V\} \quad \text {and} \quad \mathbb {K}^{e}_{m}(A^{T},W). \) (5)
This paper is organized as follows. Section 2 introduces the extended nonsymmetric global Lanczos process for generating a pair of F-biorthonormal bases of the two extended global Krylov subspaces defined by (5). Section 3 describes the application of this process to the approximation of the two matrix functions given in (1) and gives some properties. Section 4 presents some numerical experiments that illustrate the quality of the computed approximations.
2 Some algebraic properties of the extended nonsymmetric global Lanczos process
2.1 Preliminaries and notations
We begin by recalling some notations and definitions that will be used throughout this paper. The Kronecker product satisfies the following properties:
1. \((A\otimes B)(C\otimes D)=AC\otimes BD\), whenever the products AC and BD are defined.
2. \((A\otimes B)^{T}=A^{T}\otimes B^{T}\).
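Both identities can be checked numerically on small random matrices (sizes chosen only so that the products conform; this is an illustrative sketch, not part of the method):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.random((2, 3)), rng.random((4, 5))
C, D = rng.random((3, 2)), rng.random((5, 4))

# (A ⊗ B)(C ⊗ D) = AC ⊗ BD
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# (A ⊗ B)^T = A^T ⊗ B^T
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))
```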
Definition 1
Partition the matrices \(M=[M_{1},\ldots ,M_{p}]\in \mathbb {R}^{n\times ps}\) and \(N=[N_{1},\ldots ,N_{l}]\in \mathbb {R}^{n\times ls}\) into block columns \(M_{i},N_{j}\in \mathbb {R}^{n\times s}\), and define the ◇-product of the matrices M and N as
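In code, this product can be sketched as follows (assuming, as in the global Krylov literature [5, 6], that \(M^{T}\diamond N\) is the p × l matrix whose (i, j) entry is \(\langle M_{i},N_{j}\rangle _{F}=\operatorname{trace}(M_{i}^{T}N_{j})\); the function name is ours):

```python
import numpy as np

def diamond(M, N, s):
    """M^T ◇ N for M = [M_1,...,M_p] (n x ps) and N = [N_1,...,N_l] (n x ls):
    the p x l matrix with (i, j) entry <M_i, N_j>_F = trace(M_i^T N_j)."""
    p, l = M.shape[1] // s, N.shape[1] // s
    P = np.empty((p, l))
    for i in range(p):
        for j in range(l):
            P[i, j] = np.trace(M[:, i*s:(i+1)*s].T @ N[:, j*s:(j+1)*s])
    return P
```

For instance, property 6 of the proposition below, \(A^{T}\diamond (B(L\otimes I_{s}))=(A^{T}\diamond B)L\), can be verified numerically with this function.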
The following proposition gives some properties satisfied by the above product.
Proposition 1
[5, 6] Let \(A,B,C\in \mathbb {R}^{n\times ps}\), \(D\in \mathbb {R}^{n\times n}\), \(L\in \mathbb {R}^{p\times p}\), and \(\alpha \in \mathbb {R}\). Then, we have
1. \((A+B)^{T}\diamond C=A^{T}\diamond C+B^{T}\diamond C\).
2. \(A^{T}\diamond (B+C)=A^{T}\diamond B+A^{T}\diamond C\).
3. \((\alpha A)^{T}\diamond C=\alpha (A^{T}\diamond C)\).
4. \((A^{T}\diamond B)^{T}=B^{T}\diamond A\).
5. \((DA)^{T}\diamond B=A^{T}\diamond (D^{T}B)\).
6. \(A^{T}\diamond (B(L\otimes I_{s}))=(A^{T}\diamond B)L\).
Definition 2
Let X be an n × s matrix. The vectorization of the matrix X, denoted vec(X), is the ns × 1 column vector obtained by stacking the columns of the matrix X, i.e.,
vec satisfies the following properties:
1. \(\operatorname{vec}(MXN)=(N^{T}\otimes M)\operatorname{vec}(X)\), for all \(M\in \mathbb {R}^{p\times q}\), \(N\in \mathbb {R}^{r\times s}\), \(X\in \mathbb {R}^{q\times r}\).
2. \(\operatorname{vec}(A)^{T}\operatorname{vec}(B)=\operatorname{trace}(A^{T}B)\), for all \(A,B\in \mathbb {R}^{n\times n}\).
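Both properties are easy to check numerically with a column-stacking vectorization (a quick illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
vec = lambda X: X.flatten(order='F')   # stack the columns of X

# vec(MXN) = (N^T ⊗ M) vec(X)
M, X, N = rng.random((3, 4)), rng.random((4, 5)), rng.random((5, 2))
assert np.allclose(vec(M @ X @ N), np.kron(N.T, M) @ vec(X))

# vec(A)^T vec(B) = trace(A^T B), i.e., the Frobenius inner product
A, B = rng.random((4, 4)), rng.random((4, 4))
assert np.allclose(vec(A) @ vec(B), np.trace(A.T @ B))
```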
Let \(M=[M_{1},\ldots ,M_{m}]\) and \(N=[N_{1},\ldots ,N_{m}]\in \mathbb {R}^{n\times ms}\), with \(M_{i},N_{i}\in \mathbb {R}^{n\times s}\). The following algorithm (Algorithm 1), applied to the pair (M, N), allows us to obtain two F-biorthonormal matrices \(\mathbb {V},\mathbb {W}\in \mathbb {R}^{n\times ms}\).
Proposition 2
Let M and N be the matrices defined above and let \(\mathbb {V}\) and \(\mathbb {W}\in \mathbb {R}^{n\times ms}\) be the corresponding matrices obtained from Algorithm 1. Then we have the following decompositions:
where H = [hi, j] and G = [gi, j] are the two m × m upper triangular matrices determined by Algorithm 1 and satisfying
Proof
From Algorithm 1, the block vectors Vj and Wj are written as follows:
and, because \(h_{i,j}\) and \(g_{i,j}\) vanish for i > j, we obtain
Then
Using the properties of ◇-product, it follows that
□
2.2 Description of the process
In this section, we recall the standard extended nonsymmetric Lanczos method given in [25]. The proofs in this section are formulated from results given in [25]. Let v and w be two vectors in \(\mathbb {R}^{n}\). The standard extended nonsymmetric Lanczos algorithm applied to the pairs (A, v) and \((A^{T},w)\) determines two bi-orthonormal bases \(\mathbb {V}_{2m+2}=[v_{1},\ldots ,v_{2m+2}]\) and \(\mathbb {W}_{2m+2}=[w_{1},\ldots ,w_{2m+2}]\) of dimension n × (2m + 2), with \(v_{i},w_{i}\in \mathbb {R}^{n}\) (see [25, Algorithm 3]). The odd vectors \(v_{2j+1},w_{2j+1}\) then satisfy the following equations:
where
The coefficients \(h_{2j+1,2j-1}\) and \(g_{2j+1,2j-1}\) are chosen such that \(w_{2j+1}^{T}v_{2j+1}=1\). Hence,
Similarly, the even vectors \(v_{2j+2},w_{2j+2}\) are computed by the following relations:
where
The coefficients \(h_{2j+2,2j}\) and \(g_{2j+2,2j}\) are chosen such that \(w_{2j+2}^{T}v_{2j+2}=1\). Hence,
In the case where V and W are two blocks of size n × s, the implementation of the extended nonsymmetric global Lanczos algorithm applied to (A, V ) and \((A^{T},W)\) is similar to the standard algorithm [25, Algorithm 3], except that the standard inner product is replaced by the Frobenius inner product 〈⋅,⋅〉F. This algorithm provides two F-biorthonormal bases \(\mathbb {V}_{2m+2}=[v_{1},\ldots ,v_{2m+2}]\) and \(\mathbb {W}_{2m+2}=[w_{1},\ldots ,w_{2m+2}]\) of dimension n × 2(m + 1)s, where \(v_{i},w_{i}\in \mathbb {R}^{n\times s}\) are the ith block vectors of \(\mathbb {V}_{2m+2}\) and \(\mathbb {W}_{2m+2}\), respectively. In addition, the block vectors \(v_{i}\) and \(w_{i}\) satisfy the relations (7) and (8). Following the idea in [24], the column vectors of the bases \(\mathbb {V}_{2m+2}\) and \(\mathbb {W}_{2m+2}\) can be computed two blocks at a time as follows. Introduce
and
where \(V_{j}\) and \(W_{j}\) are the jth n × 2s block vectors of the matrices \(\mathbb {V}_{2m+2}=[V_{1},\ldots ,V_{m+1}]\) and \(\mathbb {W}_{2m+2}=[W_{1},\ldots ,W_{m+1}]\), respectively, and \(H_{i,j}\) and \(G_{i,j}\) are 2 × 2 matrices. The block vectors \(V_{j}\) and \(W_{j}\) are such that
where H0 and \(G_{0}\in \mathbb {R}^{2\times 2}\) are computed by applying Algorithm 1 to [V, A− 1V ] and \([W,A^{-T}W]\in \mathbb {R}^{n\times 2s}\). And
Set
Since the n × 2s blocks \(V_{1},\ldots ,V_{m}\) and \(W_{1},\ldots ,W_{m}\) are F-biorthonormal (with respect to the Frobenius inner product), by using the properties of the ◇-product it follows that
The extended nonsymmetric global Lanczos method is summarized in Algorithm 2.
Proposition 3
Let \(v_{1},v_{2},\ldots ,v_{2m+2}\) and \(w_{1},w_{2},\ldots ,w_{2m+2}\) be the F-biorthonormal block vectors determined by Algorithm 2. Then
where \(\deg (p_{j-1})=\deg (r_{j-1})=j-1\), \(\deg (q_{j-1})\leq j-1\), and \(\deg (s_{j-1})\leq j-1\). These relations also hold when V and A are replaced by W and \(A^{T}\), respectively.
Proof
The result follows easily from Algorithm 2. □
We notice here that, in the previous proposition, the block vectors \([V_{1},\ldots ,V_{m+1}]\) and \([W_{1},\ldots ,W_{m+1}]\) form F-biorthonormal bases of \(\mathbb {K}^{e}_{m+1}(A,V)\) and \(\mathbb {K}^e_{m+1}(A^T,W)\), respectively. When inverting A or solving linear systems via an LU decomposition is impractical, matrix products of the form \(A^{-1}V\) and \(A^{-T}W\) can be computed by an efficient iterative linear solver such as the well-known GMRES method.
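As an illustration, products of the form \(A^{-1}V\) can be formed column by column with SciPy's GMRES. This is only a sketch: the helper name is ours, no preconditioner is used, and the stopping-tolerance keywords vary slightly between SciPy versions:

```python
import numpy as np
from scipy.sparse.linalg import gmres

def apply_inverse(A, V):
    """Approximate A^{-1} V by solving A x = V[:, j] for each column
    with (unpreconditioned) GMRES.  In practice a preconditioner would
    be used, as the paper does with its preconditioned block BiCG solver."""
    X = np.empty_like(V)
    for j in range(V.shape[1]):
        # absolute tolerance set explicitly; relative tolerance left at
        # SciPy's default
        x, info = gmres(A, V[:, j], atol=1e-12)
        if info != 0:
            raise RuntimeError(f"GMRES did not converge for column {j}")
        X[:, j] = x
    return X
```

Products \(A^{-T}W\) are obtained the same way by passing `A.T` instead of `A`.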
We now introduce the 2m × 2m matrices T2m and S2m, given by
where \(\mathbb {V}_{2m}=[V_1,\ldots ,V_m]\) and \(\mathbb {W}_{2m}=[W_1,\ldots ,W_m]\). An important result concerning the connection between the proposed method and the standard extended nonsymmetric Lanczos method is given by the following theorem.
Theorem 1
Let V, W be the initial given block vectors of \(\mathbb {R}^{n\times s}\). Set \(\mathcal {A}=I_s\otimes A\), \(v=\operatorname{vec}(V)\), and \(w=\operatorname{vec}(W)\in \mathbb {R}^{ns}\), and perform m steps of the standard extended nonsymmetric Lanczos method applied to the pairs \((\mathcal {A},v)\) and \((\mathcal {A}^T,w)\). Then we obtain the same F-biorthonormal bases \(\mathbb {V}_{2m+2}\), \(\mathbb {W}_{2m+2}\) as the ones determined by the extended nonsymmetric global Lanczos algorithm. Moreover, the bases \(\mathbb {V}_{2m+2}\), \(\mathbb {W}_{2m+2}\) satisfy the following relations:
where \([\widetilde {t}_{2m+1,2m-1},\widetilde {t}_{2m+1,2m}]=[\langle A^Tw_{2m-1},v_{2m+1}\rangle _F,\langle A^Tw_{2m},v_{2m+1}\rangle _F]\), and \([\widetilde {s}_{2m+1,2m},\widetilde {s}_{2m+2,2m}]=[\langle A^{-T}w_{2m},v_{2m+1}\rangle _F,\langle A^{-T}w_{2m},v_{2m+2}\rangle _F]\).
\(E_m=[e_{2m-1},e_{2m}]\in \mathbb {R}^{2m\times 2}\) corresponds to the last two columns of the identity matrix I2m.
Proof
Performing m steps of the extended Lanczos algorithm applied to the pairs \((\mathcal {A},v)\) and \((\mathcal {A}^T,w)\) yields two ns × (2m + 2) bi-orthonormal bases \(\mathcal {V}_{2m+2}=[x_1,\ldots ,x_{2m+2}]\) and \(\mathcal {W}_{2m+2}=[y_1,\ldots ,y_{2m+2}]\). These bases satisfy the relations (7) and (8) for the matrix \(\mathcal {A}\). Setting \(X_j,Y_j\in \mathbb {R}^{n\times s}\) such that \(x_j=\operatorname{vec}(X_j)\) and \(y_j=\operatorname{vec}(Y_j)\), we obtain
which means that Xj,Yj are F-biorthonormal. In addition, it follows that Xj = Vj and Yj = Wj since
Using the same techniques as above, we obtain that \(T_{2m}=\mathcal {W}^T_{2m}\mathcal {A}\mathcal {V}_{2m}\) and \(S_{2m}=\mathcal {W}^T_{2m}\mathcal {A}^{-1}\mathcal {V}_{2m}\). Let us prove the relation (12). Using [25, Lemma 3.1], it holds that
where \(\mathcal {T}_{2m}=\mathcal {W}^T_{2m}\mathcal {A}\mathcal {V}_{2m}=T_{2m}\).
On the other hand,
while
Define \(\tau =[t_{2m+1,2m-1},t_{2m+1,2m}]E^T_m\), then
For i = 1,…,2m − 2, it is shown that
and
This completes the proof of (12). The proof of the other relations is similar. □
The next two propositions express the entries of T2m and S2m in terms of the recursion coefficients. This will allow us to compute the entries quite efficiently.
Proposition 4
Let the coefficients \(h_{i,j}\) and \(\alpha_{i,j}\) be as defined in (9) and (10), respectively. The matrix \(T_{2m}=[t_{i,j}]\) in (11) is pentadiagonal with the nontrivial entries,
For j = 1,2,…,m − 1,
Proof
The proof is similar to the one given in [25, Lemma 3.3]. □
Proposition 5
Let the coefficients \(h_{i,j}\) be as defined in (9). The matrix \(S_{2m}=[s_{i,j}]\) in (11) is also pentadiagonal with the nontrivial entries, for j = 1,…, m − 1,
Proof
The proof is analogous to the proof of Proposition 4. □
The following result relates positive powers of S2m to negative powers of T2m.
Lemma 1
Let \(T_{2m}\) and \(S_{2m}\) be as given by (11) and let \(e_1=[1,0,\ldots ,0]^T\in \mathbb {R}^{2m}\). Then
Lemma 2
Let the matrices \(T_{2m}\) and \(S_{2m}\) be as defined by (11) and let \(\mathbb {V}_{2m}=[v_1,v_2,\ldots ,v_{2m}]\) be the matrix given in (12). Then, for \(j=1,\ldots ,m-1\), we have
Proof
It was shown in [25] that when performing m steps of standard extended nonsymmetric Lanczos method to the pairs \((\mathcal {A},v)\) and \((\mathcal {A}^T,w)\), it holds that
by using the properties of the ⊗-product and the “vec” operation, we obtain
which proves (14). Using the same techniques as above, we can prove (15). Finally, (16) follows from (15) and Lemma 1. □
Lemma 3
Let the matrices \(T_{2m}\) and \(S_{2m}\) be as defined by (11) and let \(\mathbb {W}_{2m}=[w_1,w_2,\ldots ,w_{2m}]\) be the matrix given in (13). Then, for \(j=1,\ldots ,m-1\), we have
Proof
The proof is similar to the one given in Lemma 2. □
3 Application to the approximation of matrix functions
3.1 Computing the matrix function \(\operatorname{trace}(W^{T}f(A)V)\)
The expression \(\mathcal {I}_{tr}\) defined in (1) can be approximated by
Lemma 4
Let the matrices \(T_{2m}\) and \(S_{2m}\) be as defined by (11), and let \(v_1\) and \(w_1\) be the initial block vectors computed by Algorithm 2. Then the following equalities hold:
Proof
We have \(\operatorname{trace}(w_1^TA^jv_1)=\operatorname{vec}(w_1)^T\operatorname{vec}(A^jv_1)\). From Theorem 1, we get \(y_1=\operatorname{vec}(w_1)\) and \(x_1=\operatorname{vec}(v_1)\). By using the properties of the “vec” operation, it follows that \(\operatorname{trace}(w_1^TA^jv_1)=y^T_1\mathcal {A}^jx_1\). Then, using the result given in [25, Lemma 4.1], we obtain \(y^T_1\mathcal {A}^jx_1=e^T_1\mathcal {T}^{j}_{2m}e_1\). This shows (18) since \(\mathcal {T}_{2m}=T_{2m}\). The second equation can be established by using the same techniques. □
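The identity underlying the first step of this proof, \(\operatorname{trace}(W^{T}A^{j}V)=\operatorname{vec}(W)^{T}(I_{s}\otimes A)^{j}\operatorname{vec}(V)\), is easy to check numerically (a small sketch with arbitrary data):

```python
import numpy as np

rng = np.random.default_rng(2)
n, s, j = 8, 3, 4
A = rng.random((n, n))
V, W = rng.random((n, s)), rng.random((n, s))
vec = lambda X: X.flatten(order='F')   # column-stacking vectorization

Acal = np.kron(np.eye(s), A)           # the matrix I_s ⊗ A of Theorem 1
lhs = np.trace(W.T @ np.linalg.matrix_power(A, j) @ V)
rhs = vec(W) @ np.linalg.matrix_power(Acal, j) @ vec(V)
assert np.allclose(lhs, rhs)
```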
Theorem 2
After m steps of Algorithm 2 with initial block vectors \(V,W\in \mathbb {R}^{n\times s}\), we get
where
Proof
From Algorithm 2, V and W are collinear with \(V_1\) and \(W_1\), respectively, i.e., \(V=\alpha V_1\) and \(W=\beta W_1\) with \(\alpha \beta =\langle V,W\rangle _F\). This implies that \(\operatorname{trace}(W^Tp(A)V)=\langle V,W\rangle _F \operatorname{trace}(W_1^Tp(A)V_1)\). Using the results of Lemma 4, we get \(\operatorname{trace}(W_1^Tp(A)V_1)=e^T_1p(T_{2m})e_1\). This completes the proof. □
3.2 Computing the matrix function \(W^{T}f(A)V\)
The aim of this subsection is to show how to use the extended nonsymmetric global Lanczos algorithm (Algorithm 2) to approximate \(\mathcal {I}\) in (1). Before describing the application of the proposed method, we note that the extended nonsymmetric block Lanczos method (ENBL) given in [2] can also be used to approximate \(\mathcal {I}\). After m steps of this algorithm applied to the pairs (A, V ) and \((A^{T},W)\), we obtain two n × 2ms bi-orthonormal bases \({\mathcal{V}}_{2m}\) and \({\mathcal{W}}_{2m}\), i.e., \({\mathcal{W}}^T_{2m}{\mathcal{V}}_{2m}=I_{2ms}\). The expression \(\mathcal {I}\) can then be approximated by \(W^T({\mathcal{V}}_{2m}f({\mathcal{T}}_{2m}){\mathcal{W}}^T_{2m})V\), where \({\mathcal{T}}_{2m}\) is a 2ms × 2ms block tridiagonal matrix with 2s × 2s blocks. The matrix \({\mathcal{T}}_{2m}={\mathcal{W}}^T_{2m}A{\mathcal{V}}_{2m}\) is computed recursively without requiring additional matrix-vector products with A (see [2, Proposition 3]). However, the ENBL algorithm becomes expensive as the number m of iterations increases and for large values of s.
Now, let us come back to our proposed method. Applying Algorithm 2 to the pairs (A, V ) and \((A^{T},W)\) allows us to obtain two F-biorthonormal bases \(\mathbb {V}_{2m}\) and \(\mathbb {W}_{2m}\) such that \(V=\alpha v_1\) and \(W=\beta w_1\). It is clear that \(\mathbb {W}_{2m}^+\mathbb {V}_{2m}=I_{2ms}\), and we can therefore consider the projector \(\mathcal {P}_{2m}\) defined as
Applying the projector \(\mathcal {P}_{2m}\) to \(\mathcal {I}\) gives the reduced matrix function
where \(A_{2m}=\mathbb {W}^+_{2m}A\mathbb {V}_{2m}\in \mathbb {R}^{2ms\times 2ms}\), \(B_{2m}=\mathbb {W}^+_{2m}V\in \mathbb {R}^{2ms\times s}\) and \(C_{2m}=\mathbb {V}^T_{2m}W\in \mathbb {R}^{2ms\times s}\).
The next proposition allows us to compute \(A_{2m}\) and \(B_{2m}\) from the recursion matrix \(T_{2m}\) without requiring the computation of \(\mathbb {W}^+_{2m}\) and \(A\mathbb {V}_{2m}\).
Proposition 6
Let V, W be the initial block vectors where V = αV1. Then the matrices A2m and B2m defined above are computed as follows:
where \(\left [\begin {array}{ll} t_{2m+1,2m-1},& t_{2m+1,2m} \end {array}\right ]\) is defined by (12), and \({\mathcal{E}}_1\) corresponds to the first s columns of the identity matrix \(I_{2ms}\).
Proof
Let \(E_j=[e_{2j-1},e_{2j}]\in \mathbb {R}^{2m\times 2}\), for j = 1,…,m. Multiplying the matrix \(A_{2m}\) from the right by \(E_j\otimes I_s\) gives
Using (12), we get
It follows that, for j = 1,…,m − 1,
while for j = m, we obtain
Therefore,
To show the expression of B2m, we use the fact that V = αv1, which implies that
□
Lemma 5
Let \(A_{2m}\) be the matrix defined in (19) and let \(\mathbb {V}_{2m}=[v_1,\ldots ,v_{2m}]\) be the matrix defined by (12). Then
Proof
The first equation is shown by induction. \(Av_1\) and \(v_1\) are two elements of \(\mathbb {K}^e_m(A,v_1)\), and then
The equality is true for j = 1. Let j = 2,…,m − 1, and assume that
Since \(A^jv_1\in \mathbb {K}^e_m(A,v_1)\), it follows that \(A^jv_1=\mathbb {V}_{2m}\mathbb {W}^+_{2m}A^jv_1\). By induction, we have
which completes the proof of (20).
The second equation follows from the fact that \(\mathbb {V}_{2m}\mathbb {W}^+_{2m}A^{-j}v_1=A^{-j}v_1\) for all j = 1,…,m and by using the same induction technique as above. □
Lemma 6
Let A2m be the matrix defined by (19) and let \(\mathbb {W}_{2m}=[ w_1,\ldots ,w_{2m}]\) be the matrix in (13). Then
Proof
We have \(A^{j,T}w_1\in \mathbb {K}^{e}_{m}(A^T,W)\), which means that \(\mathbb {W}^{+,T}_{2m}\mathbb {V}^T_{2m}A^{j,T}w_1=A^{j,T}w_1\) for any integer j such that − m ≤ j ≤ m − 1. Using this equality and the same techniques as in the proof of Lemma 5, both equations follow. □
Proposition 7
After m steps of the process, we have
Proof
The proof is based on an application of the results of Lemmas 5 and 6. Let j = j1 + j2 with \(j_1,j_2\in \{0,\ldots ,m\}\). Since \(V=\alpha v_1\) and \(W=\beta w_1\), we have
Using equations of Lemmas 5 and 6, we obtain
which completes the proof of the first relation. The proof of the second relation is similar. □
Now, we compare the operation counts of the extended block Lanczos method (ENBL) and the extended global Lanczos method (ENGL). Both methods require the same cost for computing matrix-matrix products AV for some \(V\in \mathbb {R}^{n\times s}\); they also require the same cost for solving linear systems. The ENBL method requires \(4ns^2\) operations to compute the n × 2s matrix VH and \(4ns^2\) operations to compute the 2s × 2s matrix \(W^TV\), while the ENGL method only needs 4ns operations to compute \(W^T\diamond V\) and 4ns operations to compute \(V(H\otimes I_s)\). To update the new bases \(V_{j+1}\) and \(W_{j+1}\), the ENBL method has to perform the bi-orthonormalization decomposition of \(\widehat {V}_{j+1}\) and \(\widehat {W}_{j+1}\), which costs \(16ns^2\) operations in every step, while the global bi-orthonormalization costs only 16ns. Computing \({\mathcal{T}}_{2m}\) using [2, Proposition 3] needs 2s(4ns + n + 1)(m + 1) operations, while for the ENGL method, computing \(T_{2m}\) costs only (10ns + 2)(m + 1). Moreover, the computation of \({\mathcal{T}}_{2m}\) requires solving linear systems of size s × s at every step, while the ENGL method only needs divisions by scalars (see Proposition 4). In Table 1, we summarize the number of operations after m iterations of the ENBL and ENGL algorithms. As can be observed, the ENGL algorithm is less expensive than the ENBL algorithm when the number of iterations m increases and the block size s is not small.
4 Numerical experiments
In this section, we give some numerical examples to show the performance of the extended nonsymmetric global Lanczos (ENGL) method. In the selected examples, the proposed method is applied to the approximation of expressions of the form \(\mathcal {I}_{tr}\) and \(\mathcal {I}\) given in (1). All experiments were carried out in MATLAB R2015a on a computer with an Intel Core i3 processor and 3.89 GB of RAM. The computations were done with about 15 significant decimal digits.
As mentioned in the second section, we do not have to compute \(A^{-1}\) explicitly. In all examples, the matrix products with \(A^{-1}\) and \(A^{-T}\) in lines 2, 8, and 9 of Algorithm 4 are computed via an LU factorization or by using an iterative solver. We used a preconditioned block biconjugate gradient (PBBiCG) method as described in Algorithm 5.
4.1 Examples for approximations of \(\operatorname{trace}(W^{T}f(A)V)\)
Example 1
In this experiment, we compared the performance of the ENGL algorithm with that of the extended global Arnoldi (EGA) algorithm described in [1, 15]. We approximate \(\operatorname{trace}(E^T_1\exp (A)E_1)\), where A is the nonsymmetric adjacency matrix pesa of order n = 11738, obtained from the SuiteSparse Matrix Collection [8], and \(E_1\in \mathbb {R}^{n\times s}\) corresponds to the first s columns of the identity matrix \(I_n\). Results for several choices of the block size s and number of iterations m are reported in Table 2. In both the ENGL and EGA algorithms, we used the PBBiCG method defined by Algorithm 5 with an ILU preconditioner. As observed from this table, the approximations determined by ENGL are more accurate than those obtained by the EGA method.
Example 2
In this example, we used a diagonalizable matrix of order n = 1000 whose eigenvalues are log-uniformly distributed in the interval \([10^{-1},10^{4}]\), with random eigenvectors. We computed approximations of \(\operatorname{trace}(E^T_1f(A)E_1)\) given by ENGL and by the standard nonsymmetric global Lanczos method (SNGL), with s = 6. In Table 3, we report the number of iterations, the relative errors, and the required CPU times obtained with different functions. Here we used the LU factorization to compute products of the form \(A^{-1}V\) and \(A^{-T}W\). The results show that the ENGL algorithm is faster and gives smaller relative errors, while the SNGL algorithm is unable to determine an accurate approximation for any of the functions f used in this example.
4.2 Examples for the approximation of \(W^{T}f(A)V\)
In this subsection, we present some results for the approximation of \(W^{T}f(A)V\) using the ENGL algorithm. In the following experiments, we used the functions \(f(x)=\sqrt {x}\) and \(f(x)=x^{-1/3}\). The blocks W and V were generated randomly with entries uniformly distributed on [0,1]. The matrix A was obtained from the centered finite difference discretization (CFDD) of the elliptic operators given by (21) on the unit square [0,1] × [0,1] with homogeneous Dirichlet boundary conditions. The number of inner grid points in each direction was \(n_0\), and the dimension of the matrices is \(n=n_0^2\).
Example 3
We consider the approximation of \(W^TA^{1/2}V\) and \(W^TB^{1/2}V\), where \(A,B\in \mathbb {R}^{4900\times 4900}\) are nonsymmetric matrices coming from the CFDD of the operators \({\mathcal{L}}_1(u)\) and \({\mathcal{L}}_2(u)\), respectively, given by (21). The block size was s = 4. Figure 1 shows the relative errors of the ENBL, ENGL, and SNBL algorithms versus the dimension of the projection subspace, for the matrix A on the left and for the matrix B on the right. Both plots show that the ENGL and ENBL algorithms yield significantly smaller errors than the SNBL algorithm.
Example 4
In this example, we consider nonsymmetric matrices coming from the CFDD of the same operators as in Example 3. In Tables 4 and 5, we report results for the ENGL and ENBL algorithms when approximating \(\mathcal {I}\). We used different values of the dimension n ∈ {2500, 4900, 7225, 10000}, with block sizes s = 10, 20 in Table 4 and s = 30, 40 in Table 5. For the last two values of n, we used PBBiCG with the block ILU preconditioner (see [23]). As shown in Tables 4 and 5, when the block size s increases, the approximations of \(\mathcal {I}\) computed with ENGL are more accurate than those produced by the ENBL algorithm.
5 Conclusion
This paper describes an extended nonsymmetric global Lanczos method for the approximation of \(\operatorname{trace}(W^{T}f(A)V)\) and \(W^{T}f(A)V\). Two F-biorthonormal bases of the extended Krylov subspaces given by (5) are computed by short recurrence formulas, and some useful algebraic relations are given. The numerical results show that the extended nonsymmetric global Lanczos method requires fewer iterations and less CPU time than the standard nonsymmetric global Lanczos method and the extended global Arnoldi method when approximating \(\operatorname{trace}(W^{T}f(A)V)\) and \(W^{T}f(A)V\).
References
Abidi, O., Heyouni, M., Jbilou, K.: On some properties of the extended block and global Arnoldi methods with applications to model reduction. Numer. Algorithms 75, 285–304 (2017)
Barkouki, H., Bentbib, A.H., Heyouni, M., Jbilou, K.: An extended nonsymmetric block Lanczos method for model reduction in large scale dynamical systems. Calcolo 55, 13–36 (2018)
Baroni, S., Gebauer, R., Malcioglu, O.B., Saad, Y., Umari, P., Xian, J.: Harnessing molecular excited states with Lanczos chains. J. Phys.: Condens. Matter 22, 074204 (2010)
Bellalij, M., Reichel, L., Rodriguez, G., Sadok, H.: Bounding matrix functionals via partial global block Lanczos decomposition. Appl. Numer. Math. 94, 127–139 (2015)
Bellalij, M., Jbilou, K., Sadok, H.: New convergence results on the global GMRES method for diagonalizable matrices. J. Comput. Appl. Math. 219, 350–358 (2008)
Bouyouli, R., Jbilou, K., Sadaka, R., Sadok, H.: Convergence properties of some block Krylov subspace methods for multiple linear systems. J. Comput. Appl. Math. 196, 498–511 (2006)
Bowler, D.R., Miyazaki, T.: O(n) methods in electronic structure calculations. Rep. Prog. Phys. 75, 036503 (2012)
Davis, T., Hu, Y.: The SuiteSparse Matrix Collection, https://sparse.tamu.edu
Druskin, V., Knizhnerman, L.: Extended Krylov subspaces: approximation of the matrix square root and related functions. SIAM J. Matrix Anal. Appl. 19, 755–771 (1998)
Estrada, E.: The Structure of Complex Networks: Theory and Applications. Oxford University Press, Oxford (2011)
Fenu, C., Reichel, L., Rodriguez, G., Sadok, H.: GCV for Tikhonov regularization by partial SVD. BIT 57, 1019–1039 (2017)
Golub, G.H., Meurant, G.: Matrices, Moments and Quadrature with Applications. Princeton University Press, Princeton (2010)
Han, I., Malioutov, D., Shin, J.: Large-scale log-determinant computation through stochastic Chebyshev expansions. In: Proceedings of the 32nd International Conference on Machine Learning, vol. 37, pp. 908–917 (2015)
Hansen, P.C.: Rank-deficient and Discrete Ill-Posed Problems. SIAM, Philadelphia (1998)
Heyouni, M.: Extended Arnoldi methods for large low-rank Sylvester matrix equations. Appl. Numer. Math. 60, 1171–1182 (2010)
Higham, N.J.: Functions of Matrices: Theory and Computation. SIAM, Philadelphia (2008)
Jbilou, K., Messaoudi, A., Sadok, H.: Global FOM and GMRES algorithms for matrix equations. Appl. Numer. Math. 31, 49–63 (1999)
Jbilou, K., Sadok, H., Tinzeft, A.: Oblique projection methods for multiple linear systems. Electron. Trans. Numer. Anal. 20, 119–138 (2005)
Knizhnerman, L., Simoncini, V.: A new investigation of the extended Krylov subspace method for matrix function evaluations. Numer. Linear Algebra Appl. 17, 615–638 (2010)
Ngo, T.T., Bellalij, M., Saad, Y.: The trace ratio optimization problem. SIAM Rev. 54, 545–569 (2012)
Reichel, L., Rodriguez, G., Tang, T.: New block quadrature rules for the approximation of matrix functions. Linear Algebra Appl. 502, 299–326 (2016)
Saad, Y., Chelikowsky, J., Shontz, S.: Numerical methods for electronic structure calculations of materials. SIAM Rev. 52, 3–54 (2010)
Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM, Philadelphia (2003)
Simoncini, V.: A new iterative method for solving large-scale Lyapunov matrix equations. SIAM J. Sci. Comput. 29, 1268–1288 (2007)
Schweitzer, M.: A two-sided short-recurrence extended Krylov subspace method for nonsymmetric matrices and its relation to rational moment matching. Numer. Algorithms 76, 1–31 (2016)
Acknowledgments
We would like to thank the two anonymous referees for their valuable remarks and suggestions allowing us to improve the quality of the paper.
Bentbib, A.H., El Ghomari, M. & Jbilou, K. Extended nonsymmetric global Lanczos method for matrix function approximation. Numer Algor 84, 1459–1479 (2020). https://doi.org/10.1007/s11075-020-00896-8
Keywords
- Extended Krylov subspace
- Extended moment matching
- Laurent polynomial
- Nonsymmetric global Lanczos method
- Matrix function