1 Introduction

Let \(A\in \mathbb {R}^{n\times n}\) be a large nonsymmetric and nonsingular matrix, and let V and \(W\in \mathbb {R}^{n\times s}\) be two block vectors with \(1\le s\ll n\). We are interested in approximating expressions of the form

$$ \mathcal{I}_{tr}:=\text{trace}(W^{T}f(A)V)\quad \text{ and }\quad \mathcal{I}:=W^{T}f(A)V, $$
(1)

where f is a function such that f(A) is well defined. Evaluation of the expressions (1) arises in various applications such as network analysis (\(f(t)=\exp (t)\) or \(f(t)=t^{3}\)) [10], machine learning \((f(t)=\log (t))\) [13, 20], the theory of Quantum Chromodynamics (\(f(t)=t^{1/2}\)), electronic structure computation [3, 7, 22], and the solution of ill-posed problems [11, 14]. The matrix function f(A) can be defined by the spectral factorization of A; see, e.g., [14, 16] for discussions of several possible definitions of matrix functions. In the present paper, we assume that the matrix A is so large that it is impractical to evaluate its spectral factorization.
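For moderate n, the spectral definition can be made concrete. The sketch below (sizes, the test matrix, and the choice f(t) = exp(t) are illustrative, not from the paper) builds f(A) from an eigendecomposition and evaluates both expressions in (1):

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 60, 3  # illustrative sizes; the paper targets much larger n

# Diagonalizable nonsymmetric test matrix with eigenvalues near [1, 2]
A = np.diag(np.linspace(1.0, 2.0, n)) + 0.1 * rng.standard_normal((n, n)) / n
V = rng.standard_normal((n, s))
W = rng.standard_normal((n, s))

# Spectral definition: f(A) = X f(Lambda) X^{-1}, here with f(t) = exp(t)
lam, X = np.linalg.eig(A)
fA = ((X * np.exp(lam)) @ np.linalg.inv(X)).real

I_mat = W.T @ fA @ V          # I    = W^T f(A) V   (an s x s matrix)
I_tr = np.trace(I_mat)        # I_tr = trace(W^T f(A) V)
```

This direct route costs O(n^3) and is exactly what the Krylov-type methods below are designed to avoid.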

Approximation of \(\mathcal {I}_{tr}\) by global methods in the case when A is a symmetric matrix and W = V is well studied in [4]. Global Krylov subspace techniques were first proposed in [17] for solving linear equations with multiple right-hand sides and Lyapunov equations. Consider the global Krylov subspace

$$ {\mathbb{K}^{gl}_{m}(A,V)}=\text{span}\{V,AV,\ldots,A^{m-1}V\}=\{p(A)V : p\in\varPi_{m-1}\}, $$
(2)

where \(\varPi_{m-1}\) denotes the set of polynomials of degree at most m − 1. Applying m steps of the global Lanczos method [18] to A with the initial block vector V leads to the decomposition

$$ A\mathbb{V}_{m}=\mathbb{V}_{m}(T_{m}\otimes I_{s})+\beta_{m}V_{m+1}{E^{T}_{m}}, $$
(3)

where ⊗ stands for the Kronecker product. The block columns V1,…,Vm of the matrix \(\mathbb {V}_{m}=[V_{1},\ldots ,V_{m}]\in \mathbb {R}^{n\times ms}\) form an F-orthonormal (with respect to the Frobenius inner product) basis of the global Krylov subspace (2) which means that

$$ \langle V_{j},V_{k} \rangle_{F} := \text{trace}({V_{k}^{T}}V_{j})= \left\{\begin{array}{cc} 1 & j=k,\\ 0 & j\ne k. \end{array}\right. $$

Moreover, the matrix \(T_{m}\in \mathbb {R}^{m\times m}\) in (3) is symmetric and tridiagonal, \(I_{s}\in \mathbb {R}^{s\times s}\) denotes the identity matrix, βm ≥ 0, and \(E_{m}\in \mathbb {R}^{ms\times s}\) is the m th block axis vector, i.e., Em corresponds to columns (m − 1)s + 1,…,ms of the identity matrix Ims.
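The decomposition (3) and the F-orthonormality of the block columns are easy to check numerically. The following minimal sketch of the symmetric global Lanczos process (illustrative sizes and test matrix, not the authors' code) runs m steps and forms the residual of (3):

```python
import numpy as np

rng = np.random.default_rng(1)
n, s, m = 40, 2, 5  # illustrative sizes

B = rng.standard_normal((n, n))
A = (B + B.T) / 2                       # symmetric test matrix
V = rng.standard_normal((n, s))

def global_lanczos_sym(A, V, m):
    """m steps of the symmetric global Lanczos process (F-inner product)."""
    Vs = [V / np.linalg.norm(V, 'fro')]
    alpha, beta = [], []
    for j in range(m):
        U = A @ Vs[j]
        if j > 0:
            U = U - beta[-1] * Vs[j - 1]
        a = np.trace(Vs[j].T @ U)       # alpha_j = <A V_j, V_j>_F
        U = U - a * Vs[j]
        b = np.linalg.norm(U, 'fro')    # beta_{j+1}
        alpha.append(a); beta.append(b)
        Vs.append(U / b)
    Tm = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return Vs, Tm, beta[-1]

Vs, Tm, beta_m = global_lanczos_sym(A, V, m)
Vm = np.hstack(Vs[:m])                  # [V_1, ..., V_m], an n x ms matrix

# Residual of (3): A V_m - V_m (T_m ⊗ I_s) should equal beta_m V_{m+1} E_m^T
R = A @ Vm - Vm @ np.kron(Tm, np.eye(s))
R[:, (m - 1) * s:] -= beta_m * Vs[m]
```

After subtracting the rank-one tail term, the residual R vanishes to rounding error.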

As with the classical Krylov subspace methods, in the symmetric case it is attractive to approximate the expression \(\mathcal {I}_{tr}\) in (1) by

$$ \mathcal{G}_{m}(f):=\|V\|^{2}_{F} \widetilde{e}^{T}_{1}f(T_{m})\widetilde{e}_{1}, $$
(4)

where \(\widetilde {e}_{1}\) is the first vector of the canonical basis of \(\mathbb {R}^{m}\). One can show that the approximation (4) is exact for polynomials of degree at most 2m − 1, i.e.,

$$ \mathcal{I}_{tr}(p)=\mathcal{G}_{m}(p),\quad \forall p\in\varPi_{2m-1}, $$

see Bellalij et al. [4] for more details. When A is nonsymmetric, the F-orthonormal basis \(\mathbb {V}_{m}\) of \({\mathbb {K}^{gl}_{m}(A,V)}\) cannot be generated with short recurrence relations. The nonsymmetric global Lanczos method [18] generates, with short recurrence relations, a pair of F-biorthonormal bases \(\mathbb {V}_{m}=[V_{1},\ldots ,V_{m}]\) and \(\mathbb {W}_{m}=[W_{1},\ldots ,W_{m}]\) for the Krylov subspaces \(\mathbb {K}_{m}(A,V)\) and \(\mathbb {K}_{m}(A^{T},W)\), respectively. Application of m steps of the nonsymmetric global Lanczos algorithm yields the following relations:

$$ \begin{array}{@{}rcl@{}} A\mathbb{V}_{m}&=&\mathbb{V}_{m}({T^{b}_{m}}\otimes I_{s})+\beta_{m}V_{m+1}{E^{T}_{m}},\\ A^{T}\mathbb{W}_{m}&=&\mathbb{W}_{m}(T^{b,T}_{m}\otimes I_{s})+\gamma_{m}W_{m+1}{E^{T}_{m}}. \end{array} $$

The block vectors \(V_{i}\) and \(W_{i}\) of \(\mathbb {V}_{m}=[V_{1},\ldots ,V_{m}]\) and \(\mathbb {W}_{m}=[W_{1},\ldots ,W_{m}]\), with \(V_{i},W_{i}\in \mathbb {R}^{n\times s}\), are said to be F-biorthonormal if

$$ \begin{array}{@{}rcl@{}} \langle V_{j},W_{k}\rangle_{F} := \text{trace}({W_{k}^{T}}V_{j})= \left\{\begin{array}{cc} 1 & j=k,\\ 0 & j\ne k. \end{array}\right. \end{array} $$

The matrix \({T^{b}_{m}}\in \mathbb {R}^{m\times m}\) is nonsymmetric and tridiagonal. The coefficients \(\beta _{m},\gamma _{m}\in \mathbb {R}\), and Em is the matrix defined in (3). The expression \(\mathcal {I}_{tr}\) in (1) can be approximated by

$$ \mathcal{G}_{m}^{b}(f):=\langle V,W \rangle_{F}\widetilde{e}^{T}_{1}f({T^{b}_{m}})\widetilde{e}_{1}. $$

In the same way as in [12], one can show that the approximation \(\mathcal {G}_{m}^{b}\) is exact for polynomials of degree at most 2m − 1, i.e.,

$$ \mathcal{I}_{tr}(p)=\mathcal{G}_{m}^{b}(p)\quad \forall p\in\varPi_{2m-1}. $$
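This exactness property can be checked numerically. The sketch below is a hedged transcription of the nonsymmetric global Lanczos recurrences with the Frobenius inner product (illustrative sizes and test matrix; serious breakdown is not handled):

```python
import numpy as np

rng = np.random.default_rng(2)
n, s, m = 30, 2, 4  # illustrative sizes

A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
V = rng.standard_normal((n, s))
W = rng.standard_normal((n, s))

def fdot(X, Y):
    return np.trace(Y.T @ X)            # Frobenius inner product <X, Y>_F

def global_bilanczos(A, V, W, m):
    """Sketch: m steps of the nonsymmetric global Lanczos recurrences."""
    d = fdot(V, W)
    Vs, Ws = [V / np.sqrt(abs(d))], [W / (d / np.sqrt(abs(d)))]
    T = np.zeros((m, m))
    for j in range(m):
        r, q = A @ Vs[j], A.T @ Ws[j]
        T[j, j] = fdot(r, Ws[j])        # alpha_j = <A v_j, w_j>_F
        r = r - T[j, j] * Vs[j]
        q = q - T[j, j] * Ws[j]
        if j > 0:
            r = r - T[j - 1, j] * Vs[j - 1]
            q = q - T[j, j - 1] * Ws[j - 1]
        if j < m - 1:
            d = fdot(r, q)
            b = np.sqrt(abs(d))         # subdiagonal entry
            c = d / b                   # superdiagonal entry
            T[j + 1, j], T[j, j + 1] = b, c
            Vs.append(r / b); Ws.append(q / c)
    return T

Tb = global_bilanczos(A, V, W, m)

# G_m^b(p) matches trace(W^T p(A) V) for p(t) = t^3 (degree 3 <= 2m - 1)
e1 = np.zeros(m); e1[0] = 1.0
exact = np.trace(W.T @ np.linalg.matrix_power(A, 3) @ V)
approx = fdot(V, W) * (e1 @ np.linalg.matrix_power(Tb, 3) @ e1)
```

In exact arithmetic the two quantities agree for every monomial of degree at most 2m − 1.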

Approximation of \(W^{T}f(A)V\) is well studied in the literature; see, e.g., [12, 21] and the references therein. For nonsymmetric matrices, Reichel et al. [21] approximate \(\mathcal {I}\) by performing m steps of the nonsymmetric block Lanczos method applied to A with initial block vectors W and V; this leads to the following algebraic relations:

$$ \begin{array}{rcl} A[V_{1},\ldots,V_{m}]&=&[V_{1},\ldots,V_{m}]J_{m}+V_{m+1}\varGamma_{m}{E^{T}_{m}},\\ A^{T}[W_{1},\ldots,W_{m}]&=&[W_{1},\ldots,W_{m}]{J^{T}_{m}}+W_{m+1}{\varDelta}_{m}{E^{T}_{m}}, \end{array} $$

where \(W_{j},V_{j}\in \mathbb {R}^{n\times s}\) are bi-orthonormal, i.e.,

$$ {W^{T}_{k}}V_{j} := \left\{\begin{array}{cc} I_{s} & j=k,\\ O_{s} & j\ne k. \end{array}\right. $$

Os denotes the zero matrix of order s. The matrix \(J_{m}\in \mathbb {R}^{ms\times ms}\) is a block nonsymmetric tridiagonal matrix with block entries of size s × s, \(\varGamma _{m},{\varDelta }_{m}\in \mathbb {R}^{s\times s}\), and Em is the matrix defined in (3). In [21], it was shown that \({\mathcal {G}^{\text {block}}_{m}(f)}\) defined by

$$ {\mathcal{G}^{\text{block}}_{m}(f)}:=\widetilde{E}^{T}_{1}f(J_{m})\widetilde{E}_{1}, $$

where \(\widetilde {E}_{1}\in \mathbb {R}^{m\times s}\) corresponds to the first s columns of the identity matrix Im, can be used to approximate \(\mathcal {I}\). Also \({\mathcal {G}^{\text {block}}_{m}}\) is exact for polynomials of degree at most 2m − 1, i.e.,

$$ \mathcal{I}(p)={\mathcal{G}^{\text{block}}_{m}(p)},\quad \forall p\in\varPi_{2m-1}. $$

In the context of approximating expressions of the form f(A)b for some vector \(b\in \mathbb {R}^{n}\), Druskin et al. [9, 19] computed approximations using the standard extended Krylov subspace generated by the vectors \(A^{-m}b,\ldots ,A^{-1}b,b,Ab,\ldots ,A^{m-1}b\).
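The benefit of mixing positive and negative powers can be sketched as follows. This is a simplified orthonormal-basis variant for a symmetric matrix (one Galerkin projection, not the bi-orthonormal process studied below); the matrix, sizes, and choice f = exp are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 80, 6  # subspace dimension is 2m

Q0 = np.linalg.qr(rng.standard_normal((n, n)))[0]
A = Q0 @ np.diag(np.linspace(0.5, 3.0, n)) @ Q0.T   # SPD test matrix
b = rng.standard_normal(n)

# Raw extended Krylov vectors b, Ab, ..., A^{m-1}b, A^{-1}b, ..., A^{-m}b
cols, x = [], b.copy()
for _ in range(m):
    cols.append(x); x = A @ x
x = np.linalg.solve(A, b)
for _ in range(m):
    cols.append(x); x = np.linalg.solve(A, x)
Q = np.linalg.qr(np.column_stack(cols))[0]          # orthonormal, n x 2m

# Galerkin approximation of f(A)b for f = exp
H = Q.T @ A @ Q
lam, U = np.linalg.eigh(H)
approx = Q @ ((U * np.exp(lam)) @ (U.T @ (Q.T @ b)))

lamA, UA = np.linalg.eigh(A)
exact = (UA * np.exp(lamA)) @ (UA.T @ b)            # reference value
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

Each step costs one multiplication by A and one solve with A, which is the same trade-off underlying the extended global process of Section 2.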

We are interested in exploring approximations of (1) by using a pair of F-biorthonormal bases from the two extended global Krylov subspaces

$$ \begin{array}{rl} \mathbb{K}_{m}^{e}(A,V)&=\text{span}\{V,A^{-1}V,AV,\ldots,A^{m-1}V,A^{-m}V\},\\ \mathbb{K}_{m}^{e}(A^{T},W)&=\text{span}\{W,A^{-T}W,A^{T}W,\ldots,A^{m-1,T}W,A^{-m,T}W\}. \end{array} $$
(5)

This paper is organized as follows. Section 2 introduces the extended nonsymmetric global Lanczos process for generating a pair of F-biorthonormal bases for the two extended global Krylov subspaces defined by (5). Section 3 describes the application of this process to the approximation of the two matrix functions given in (1) and gives some of their properties. Section 4 presents some numerical experiments that illustrate the quality of the computed approximations.

2 Some algebraic properties of the extended nonsymmetric global Lanczos process

2.1 Preliminaries and notations

We begin by recalling some notations and definitions that will be used throughout this paper. The Kronecker product satisfies the following properties:

  1.

    \((A\otimes B)(C\otimes D)=AC\otimes BD\).

  2.

    \((A\otimes B)^{T}=A^{T}\otimes B^{T}\).

Definition 1

Partition the matrices \(M=[M_{1},\ldots ,M_{p}]\in \mathbb {R}^{n\times ps}\) and \(N=[N_{1},\ldots ,N_{l}]\in \mathbb {R}^{n\times ls}\) into block columns \(M_{i},N_{j}\in \mathbb {R}^{n\times s}\), and define the ◇-product of the matrices M and N as

$$ M^{T}\diamond N=[\text{trace}({M^{T}_{i}}N_{j})]_{i=1,\ldots,p}^{j=1,\ldots,l}\in{\mathbb R}^{p\times l}. $$
(6)
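A direct numpy transcription of (6) (block sizes are illustrative, and `diamond` is a hypothetical helper name):

```python
import numpy as np

rng = np.random.default_rng(4)
n, s, p, l = 20, 3, 4, 5  # illustrative sizes

def diamond(M, N, s):
    """M^T ◇ N: entry (i, j) is trace(M_i^T N_j) over n x s block columns."""
    p, l = M.shape[1] // s, N.shape[1] // s
    return np.array([[np.trace(M[:, i*s:(i+1)*s].T @ N[:, j*s:(j+1)*s])
                      for j in range(l)] for i in range(p)])

M = rng.standard_normal((n, p * s))
N = rng.standard_normal((n, l * s))
D = diamond(M, N, s)                    # a p x l matrix
```

Note that for s = 1 the ◇-product reduces to the ordinary matrix product \(M^{T}N\).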

The following proposition gives some properties satisfied by the above product.

Proposition 1

[5, 6] Let \(A,B,C\in \mathbb {R}^{n\times ps}\), \(D\in \mathbb {R}^{n\times n}\), \(L\in \mathbb {R}^{p\times p}\), and \(\alpha \in \mathbb {R}^{n\times n}\). Then, we have

  1.

    \((A+B)^{T}\diamond C=A^{T}\diamond C+B^{T}\diamond C\)

  2.

    \(A^{T}\diamond (B+C)=A^{T}\diamond B+A^{T}\diamond C\)

  3.

    \((\alpha A)^{T}\diamond C=\alpha (A^{T}\diamond C)\)

  4.

    \((A^{T}\diamond B)^{T}=B^{T}\diamond A\)

  5.

    \((DA)^{T}\diamond B=A^{T}\diamond (D^{T}B)\)

  6.

    \(A^{T}\diamond (B(L\otimes I_{s}))=(A^{T}\diamond B)L\)
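Properties 5 and 6 can be verified directly with a small helper implementing the ◇-product (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n, s, p = 15, 2, 4  # illustrative sizes

def diamond(X, Y, s):
    """X^T ◇ Y for matrices partitioned into n x s block columns."""
    q, r = X.shape[1] // s, Y.shape[1] // s
    return np.einsum('nis,njs->ij',
                     X.reshape(X.shape[0], q, s), Y.reshape(Y.shape[0], r, s))

A = rng.standard_normal((n, p * s))
B = rng.standard_normal((n, p * s))
D = rng.standard_normal((n, n))
L = rng.standard_normal((p, p))

lhs5 = diamond(D @ A, B, s)                       # (DA)^T ◇ B
rhs5 = diamond(A, D.T @ B, s)                     # A^T ◇ (D^T B)
lhs6 = diamond(A, B @ np.kron(L, np.eye(s)), s)   # A^T ◇ (B (L ⊗ I_s))
rhs6 = diamond(A, B, s) @ L                       # (A^T ◇ B) L
```

Property 6 is the workhorse used in the proof of Proposition 2 below.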

Definition 2

Let X be an n × s matrix. The vectorization of the matrix X, denoted vec(X), is the ns × 1 column vector obtained by stacking the columns of the matrix X, i.e.,

$$\text{vec}(X)=[X_{1,1},\ldots,X_{n,1},X_{1,2},\ldots,X_{n,2},\ldots,X_{1,s},\ldots,X_{n,s}]^{T}.$$

vec satisfies the following properties:

  1.

    \(\text{vec}(MXN)=(N^{T}\otimes M)\text{vec}(X)\), \(\forall M\in \mathbb {R}^{p\times q},N\in \mathbb {R}^{r\times s},X\in \mathbb {R}^{q\times r}\).

  2.

    \(\text{vec}(A)^{T}\text{vec}(B)=\text{trace}(A^{T}B)\), for all matrices A and B of the same size.
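Both properties are one-liners in numpy, where stacking columns corresponds to Fortran-order reshaping (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
p, q, r, s = 4, 5, 6, 3  # illustrative sizes

def vec(X):
    return X.reshape(-1, order='F')     # stack the columns of X

M = rng.standard_normal((p, q))
X = rng.standard_normal((q, r))
N = rng.standard_normal((r, s))

lhs1 = vec(M @ X @ N)                   # vec(M X N)
rhs1 = np.kron(N.T, M) @ vec(X)         # (N^T ⊗ M) vec(X)

A = rng.standard_normal((p, q))
B = rng.standard_normal((p, q))
lhs2 = vec(A) @ vec(B)                  # vec(A)^T vec(B)
rhs2 = np.trace(A.T @ B)                # trace(A^T B)
```

Property 2 is exactly the identity linking the Frobenius inner product of blocks to the standard inner product of their vectorizations, used repeatedly in Section 2.2.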

Let M = [M1,…,Mm] and \(N=[N_{1},\ldots ,N_{m}]\in \mathbb {R}^{n\times ms}\), with \(M_{i},N_{i}\in \mathbb {R}^{n\times s}\). The following algorithm applied to the pair (M, N) allows us to obtain two F-biorthonormal matrices \(\mathbb {V},\mathbb {W}\in \mathbb {R}^{n\times ms}\).

[Algorithm 1: F-biorthonormalization of the pair (M, N); listing not reproduced]
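A minimal sketch of such an F-biorthonormalization (Gram–Schmidt with the Frobenius inner product; an illustrative reconstruction with illustrative sizes and no breakdown handling, not the authors' listing):

```python
import numpy as np

rng = np.random.default_rng(7)
n, s, m = 30, 2, 4  # illustrative sizes

M = rng.standard_normal((n, m * s))
N = rng.standard_normal((n, m * s))

def fdot(X, Y):
    return np.trace(Y.T @ X)            # <X, Y>_F

def global_biortho(M, N, s):
    """Gram-Schmidt-type F-biorthonormalization of the pair (M, N)."""
    m = M.shape[1] // s
    H, G = np.zeros((m, m)), np.zeros((m, m))
    Vs, Ws = [], []
    for j in range(m):
        vh, wh = M[:, j*s:(j+1)*s].copy(), N[:, j*s:(j+1)*s].copy()
        for i in range(j):
            H[i, j] = fdot(vh, Ws[i])   # h_{i,j} = <M_j, W_i>_F
            G[i, j] = fdot(wh, Vs[i])   # g_{i,j} = <N_j, V_i>_F
            vh = vh - H[i, j] * Vs[i]
            wh = wh - G[i, j] * Ws[i]
        d = fdot(vh, wh)
        H[j, j] = np.sqrt(abs(d))
        G[j, j] = d / H[j, j]           # so that <V_j, W_j>_F = 1
        Vs.append(vh / H[j, j]); Ws.append(wh / G[j, j])
    return np.hstack(Vs), np.hstack(Ws), H, G

Vb, Wb, H, G = global_biortho(M, N, s)
```

By construction H and G are upper triangular, and the factorizations of Proposition 2 hold.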

Proposition 2

Let M and N be the matrices defined above and let \(\mathbb {V}\) and \(\mathbb {W}\in \mathbb {R}^{n\times ms}\) be the corresponding matrices obtained from Algorithm 1. Then we have the following decompositions:

$$M=\mathbb{V}(H\otimes I_{s}), \quad N=\mathbb{W}(G\otimes I_{s}),$$

where \(H=[h_{i,j}]\) and \(G=[g_{i,j}]\) are the two m × m upper triangular matrices determined by Algorithm 1, satisfying

$$H=\mathbb{W}^{T}\diamond M \quad \text{and} \quad G=\mathbb{V}^{T}\diamond N.$$

Proof

From Algorithm 1, the block vectors \(M_{j}\) and \(N_{j}\) can be written as follows:

$$ M_{j}=\sum\limits_{i=1}^{j} h_{i,j}V_{i},\text{ and } N_{j}=\sum\limits_{i=1}^{j} g_{i,j}W_{i}, $$

and because hi, j and gi, j vanish for i > j, we obtain

$$ M_{j}=\sum\limits_{i=1}^{m} V_{i}(h_{i,j}\otimes I_{s})\text{ and } N_{j}=\sum\limits_{i=1}^{m} W_{i}(g_{i,j}\otimes I_{s}). $$

Then

$$M=\mathbb{V}(H\otimes I_{s}) \text{ and } N=\mathbb{W}(G\otimes I_{s}).$$

Using the properties of ◇-product, it follows that

$$ \begin{array}{@{}rcl@{}} H&=&(\mathbb{W}^{T}\diamond\mathbb{V})H=\mathbb{W}^{T}\diamond(\mathbb{V}(H\otimes I_{s}))=\mathbb{W}^{T}\diamond M.\\ G&=&(\mathbb{V}^{T}\diamond\mathbb{W})G=\mathbb{V}^{T}\diamond(\mathbb{W}(G\otimes I_{s}))=\mathbb{V}^{T}\diamond N. \end{array} $$

2.2 Description of the process

In this section, we recall the standard extended nonsymmetric Lanczos method given in [25]. Proofs in this section are formulated from results given in [25]. Let v and w be two vectors in \(\mathbb {R}^{n}\). The standard extended nonsymmetric Lanczos algorithm applied to the pairs (A, v) and (AT,w) determines two bi-orthonormal bases \(\mathbb {V}_{2m+2}=[v_{1},\ldots ,v_{2m+2}]\) and \(\mathbb {W}_{2m+2}=[w_{1},\ldots ,w_{2m+2}]\) of dimension n × (2m + 2), with \(v_{i},w_{i}\in \mathbb {R}^{n}\) (see [25, Algorithm 3]). The odd vectors \(v_{2j+1}\) and \(w_{2j+1}\) then satisfy the following equations:

$$ \begin{array}{ll} \alpha=\sqrt{|w^{T}v|}, v_{1}&=v/\alpha, \beta= w^{T}v/\alpha, w_{1}=w/\beta, \\ h_{2j+1,2j-1}v_{2j+1}&=Av_{2j-1}-\sum \limits_{{i=2j-3}}^{2j} h_{i,2j-1}v_{i}=\widehat{v}_{2j+1},\\ g_{2j+1,2j-1}w_{2j+1}&=A^{T}w_{2j-1}-\sum \limits_{{i=2j-3}}^{2j} g_{i,2j-1}w_{i}=\widehat{w}_{2j+1}, \end{array} j=1,\ldots,m. $$
(7)

where

$$ h_{i,2j-1}={w_{i}^{T}}Av_{2j-1} \text{ and } g_{i,2j-1}={v_{i}^{T}}A^{T}w_{2j-1}, \text{ for } i=1,\ldots,2j. $$

The coefficients \(h_{2j+1,2j-1}\) and \(g_{2j+1,2j-1}\) are chosen such that \(w_{2j+1}^{T}v_{2j+1}=1\). Hence,

$$ h_{2j+1,2j-1}=\sqrt{|\widehat{w}_{2j+1}^{T}\widehat{v}_{2j+1}|}, g_{2j+1,2j-1}=(\widehat{w}_{2j+1}^{T}\widehat{v}_{2j+1})/h_{2j+1,2j-1}. $$

Similarly, the even vectors \(v_{2j+2}\) and \(w_{2j+2}\) are computed by the following relations:

$$ \begin{array}{lll} \widehat{v}_{2}&=A^{-1}v-({w_{1}^{T}}A^{-1}v)v_{1}, &\widehat{w}_{2}=A^{-T}w-({v_{1}^{T}}A^{-T}w)w_{1}\\ \gamma&=\sqrt{|\widehat{w}_{2}^{T}\widehat{v}_{2}|}, v_{2}=\widehat{v}_{2}/\gamma, &\varGamma= \widehat{w}_{2}^{T}\widehat{v}_{2}/\gamma, w_{2}=\widehat{w}_{2}/\varGamma, \end{array} $$
$$ \begin{array}{ll} h_{2j+2,2j}v_{2j+2}&=A^{-1}v_{2j}-\sum\limits_{{i=2j-2}}^{2j+1} h_{i,2j}v_{i}=\widehat{v}_{2j+2},\\ g_{2j+2,2j}w_{2j+2}&=A^{-T}w_{2j}-\sum\limits_{{i=2j-2}}^{2j+1} g_{i,2j}w_{i}=\widehat{w}_{2j+2}, \end{array} j=1,\ldots,m. $$
(8)

where

$$ h_{i,2j}={w_{i}^{T}}A^{-1}v_{2j} \text{ and } g_{i,2j}={v_{i}^{T}}A^{-T}w_{2j}, \text{ for } i=1,\ldots,2j+1. $$

The coefficients \(h_{2j+2,2j}\) and \(g_{2j+2,2j}\) are chosen such that \(w_{2j+2}^{T}v_{2j+2}=1\). Hence,

$$ h_{2j+2,2j}=\sqrt{|\widehat{w}_{2j+2}^{T}\widehat{v}_{2j+2}|}, g_{2j+2,2j}=\widehat{w}_{2j+2}^{T}\widehat{v}_{2j+2}/h_{2j+2,2j}. $$
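The alternating pattern of (7) and (8) — one multiplication by A, then one solve with A — can be sketched as follows for the vector case. For numerical robustness this sketch biorthogonalizes each new pair against all previous vectors, which coincides with the short recurrences (7) and (8) in exact arithmetic; the matrix and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(8)
n, m = 40, 3  # illustrative sizes

A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
v, w = rng.standard_normal(n), rng.standard_normal(n)

def biorth_pair(rv, rw, Vs, Ws):
    """Biorthogonalize rv against Ws and rw against Vs, then normalize."""
    for x, y in zip(Vs, Ws):
        rv = rv - (y @ rv) * x
        rw = rw - (x @ rw) * y
    d = rw @ rv
    h = np.sqrt(abs(d))                 # h-coefficient; the g-coefficient is d / h
    return rv / h, rw / (d / h)

Vs, Ws = [], []
v1, w1 = biorth_pair(v, w, Vs, Ws)      # v1 = v/alpha, w1 = w/beta
Vs.append(v1); Ws.append(w1)
v2, w2 = biorth_pair(np.linalg.solve(A, v), np.linalg.solve(A.T, w), Vs, Ws)
Vs.append(v2); Ws.append(w2)
for j in range(m):                      # odd step with A, even step with A^{-1}
    nv, nw = biorth_pair(A @ Vs[2*j], A.T @ Ws[2*j], Vs, Ws)
    Vs.append(nv); Ws.append(nw)
    nv, nw = biorth_pair(np.linalg.solve(A, Vs[2*j + 1]),
                         np.linalg.solve(A.T, Ws[2*j + 1]), Vs, Ws)
    Vs.append(nv); Ws.append(nw)
```

The loop produces the 2m + 2 bi-orthonormal vectors described above.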

In the case where V and W are two blocks of size n × s, the implementation of the extended nonsymmetric global Lanczos algorithm applied to (A, V ) and (AT,W) is similar to the standard algorithm [25, Algorithm 3], except that the standard inner product is replaced by the Frobenius inner product 〈⋅,⋅〉F. This algorithm provides two F-biorthonormal bases \(\mathbb {V}_{2m+2}=[v_{1},\ldots ,v_{2m+2}]\) and \(\mathbb {W}_{2m+2}=[w_{1},\ldots ,w_{2m+2}]\) of dimension n × 2(m + 1)s, where \(v_{i},w_{i}\in \mathbb {R}^{n\times s}\) are the i th block vectors of \(\mathbb {V}_{2m+2}\) and \(\mathbb {W}_{2m+2}\), respectively. In addition, the block vectors vi and wi satisfy the relations (7) and (8). Following the idea in [24], the column vectors of the bases \(\mathbb {V}_{2m+2}\) and \(\mathbb {W}_{2m+2}\) can be computed two blocks at a time as follows. Introduce

$$V_{j}=[v_{2j-1} , v_{2j}],\quad W_{j}=[w_{2j-1} , w_{2j}],$$

and

$$ H_{i,j}=\left[\begin{array}{ll} h_{2i-1,2j-1} & h_{2i-1,2j} \\ h_{2i,2j-1} & h_{2i,2j} \end{array}\right],\quad G_{i,j}=\left[\begin{array}{ll} g_{2i-1,2j-1} & g_{2i-1,2j} \\ g_{2i,2j-1} & g_{2i,2j} \end{array}\right], $$
(9)

where Vj and Wj are the j th n × 2s block vectors of the matrices \(\mathbb {V}_{2m+2}=[V_{1},\ldots ,V_{m+1}]\) and \(\mathbb {W}_{2m+2}=[W_{1},\ldots ,W_{m+1}]\), respectively; Hi, j and Gi, j are 2 × 2 matrices. The block vectors Vj and Wj are such that

$$ [V , A^{-1}V]=V_{1}(H_{0} \ \otimes I_{s}), \quad [W , A^{-T}W]=W_{1}(G_{0}\ \otimes I_{s}), $$

where H0 and \(G_{0}\in \mathbb {R}^{2\times 2}\) are computed by applying Algorithm 1 to [V, A− 1V ] and \([W,A^{-T}W]\in \mathbb {R}^{n\times 2s}\). And

$$ \begin{array}{@{}rcl@{}} V_{j+1}(H_{j+1,j}\otimes I_{s})&=&[Av_{2j-1},A^{-1}v_{2j}]-V_{j-1}(H_{j-1,j}\otimes I_{s})-V_{j}(H_{j,j}\otimes I_{s}),\\ W_{j+1}(G_{j+1,j}\otimes I_{s})&=&[A^{T}w_{2j-1},A^{-T}w_{2j}] - W_{j-1}(G_{j-1,j}\!\otimes\! I_{s}) - W_{j}(G_{j,j}\!\otimes\! I_{s}). \end{array} $$

Set

$$ H_{0}=\left( \begin{array}{ll} \alpha_{1,1} & \alpha_{1,2}\\ 0 & \alpha_{2,2} \end{array}\right)\quad G_{0}=\left( \begin{array}{ll} \beta_{1,1} & \beta_{1,2}\\ 0 & \beta_{2,2} \end{array}\right). $$
(10)

As the n × 2s blocks V1,…,Vm and W1,…,Wm are F-biorthonormal (with respect to the Frobenius inner product), by using the properties of the ◇-product it follows that

$$ H_{i,j}={W^{T}_{i}}\diamond [Av_{2j-1} , A^{-1}v_{2j}] \text{ and } G_{i,j}={V^{T}_{i}}\diamond [A^{T}w_{2j-1} , A^{-T}w_{2j}]. $$

The extended nonsymmetric global Lanczos process is summarized in the following algorithm (Algorithm 2).

Proposition 3

Let v1,v2,…,v2m+ 2 and w1,w2,…,w2m+ 2 be the F-biorthonormal block vectors determined by Algorithm 2. Then

$$ \begin{array}{ccc} v_{2j-1}&=&p^{V}_{j-1}(A)V+q^{V}_{j-1}(A^{-1})V,\\ v_{2j}&=&r^{V}_{j-1}(A^{-1})A^{-1}V+s^{V}_{j-1}(A)V, \end{array} $$

where \(\deg (p^{V}_{j-1})=\deg (r^{V}_{j-1})=j-1\), \(\deg (q^{V}_{j-1})\le j-1\), and \(\deg (s^{V}_{j-1})\le j-1\). These relations also hold when V and A are replaced by W and AT, respectively.

Proof

The result follows easily from Algorithm 2. □

We note that in the previous proposition, the block vectors [V1,…,Vm+ 1] and [W1,…,Wm+ 1] form an F-biorthonormal basis of \(\mathbb {K}^{e}_{m+1}(A,V)\) and \(\mathbb {K}^e_{m+1}(A^T,W)\), respectively. When inverting A or solving linear systems via the LU decomposition is impractical, matrix products of the form \(A^{-1}V\) and \(A^{-T}W\) can be computed by an efficient iterative linear solver such as the well-known GMRES method.

[Algorithm 2: the extended nonsymmetric global Lanczos process; listing not reproduced]

We now introduce the 2m × 2m matrices T2m and S2m, given by

$$ T_{2m}=\mathbb{W}_{2m}^{T}\diamond A\mathbb{V}_{2m},\quad S_{2m}=\mathbb{W}_{2m}^{T}\diamond A^{-1}\mathbb{V}_{2m}, $$
(11)

where \(\mathbb {V}_{2m}=[V_1,\ldots ,V_m]\) and \(\mathbb {W}_{2m}=[W_1,\ldots ,W_m]\). An important result concerning the connection between the proposed method and the standard extended nonsymmetric Lanczos method is given by the following theorem.

Theorem 1

Let V, W be the initial given block vectors of \(\mathbb {R}^{n\times s}\). Set \(\mathcal {A}=I_s\otimes A\), v = vec(V ), and \(w=\text{vec}(W)\in \mathbb {R}^{ns}\), and perform m steps of the standard extended nonsymmetric Lanczos algorithm applied to the pairs \((\mathcal {A},v)\) and \((\mathcal {A}^T,w)\). Then we obtain the same F-biorthonormal bases \(\mathbb {V}_{2m+2}\), \(\mathbb {W}_{2m+2}\) as the ones determined by the extended nonsymmetric global Lanczos algorithm. Moreover, these bases satisfy the following relations:

$$ \begin{array}{@{}rcl@{}} A\mathbb{V}_{2m}&=&\mathbb{V}_{2m}(T_{2m}\otimes I_{s})+v_{2m+1} \left( \left[\begin{array}{ll} t_{2m+1,2m-1}, & t_{2m+1,2m} \end{array}\right] {E^{T}_{m}}\otimes I_{s}\right), \end{array} $$
(12)
$$ \begin{array}{@{}rcl@{}} A^{-1}\mathbb{V}_{2m}&=&\mathbb{V}_{2m}(S_{2m}\otimes I_{s})+ V_{m+1}\left( \left[\begin{array}{ll} 0 & s_{2m+1,2m}\\0 & s_{2m+2,2m} \end{array}\right] {E^{T}_{m}}\otimes I_{s}\right),\\ A^{T}\mathbb{W}_{2m}&=&\mathbb{W}_{2m}(T^{T}_{2m}\otimes I_{s})+w_{2m+1} \left( \left[\begin{array}{ll} \widetilde{t}_{2m+1,2m-1}, & \widetilde{t}_{2m+1,2m} \end{array}\right] {E^{T}_{m}}\!\otimes\! I_{s}\right), \end{array} $$
(13)
$$ \begin{array}{@{}rcl@{}} A^{-T}\mathbb{W}_{2m}&=&\mathbb{W}_{2m}(S^{T}_{2m}\otimes I_{s})+ W_{m+1}\left( \left[\begin{array}{ll} 0 & \widetilde{s}_{2m+1,2m}\\ 0 & \widetilde{s}_{2m+2,2m} \end{array}\right] {E^{T}_{m}}\otimes I_{s}\right), \end{array} $$

where \([\widetilde {t}_{2m+1,2m-1},\widetilde {t}_{2m+1,2m}]=[\langle A^Tw_{2m-1},v_{2m+1}\rangle _F,\langle A^Tw_{2m},v_{2m+1}\rangle _F]\), and \([\widetilde {s}_{2m+1,2m},\widetilde {s}_{2m+2,2m}]=[\langle A^{-T}w_{2m},v_{2m+1}\rangle _F,\langle A^{-T}w_{2m},v_{2m+2}\rangle _F]\).

\(E_m=[e_{2m-1},e_{2m}]\in \mathbb {R}^{2m\times 2}\) corresponds to the last two columns of the identity matrix I2m.

Proof

Performing m steps of the extended Lanczos algorithm applied to the pairs \((\mathcal {A},v)\) and \((\mathcal {A}^T,w)\) yields two ns × (2m + 2) bi-orthonormal bases \(\mathcal {V}_{2m+2}=[x_1,\ldots ,x_{2m+2}]\) and \(\mathcal {W}_{2m+2}=[y_1,\ldots ,y_{2m+2}]\). These bases satisfy the relations (7) and (8) for the matrix \(\mathcal {A}\). Setting \(X_j,Y_j\in \mathbb {R}^{n\times s}\) such that xj = vec(Xj) and yj = vec(Yj), we obtain

$$ \text{trace}({Y_{i}^{T}}X_{j})=vec(Y_{i})^{T}\text{vec}(X_{j})={y^{T}_{i}}x_{j}=\delta_{i,j}, $$

which means that Xj,Yj are F-biorthonormal. In addition, it follows that Xj = Vj and Yj = Wj since

$$ \begin{array}{@{}rcl@{}} h_{i,2j-1}&=&{y^{T}_{i}}\mathcal{A}x_{2j-1}=\text{vec}(Y_{i})^{T}(I_{s}\otimes A)\text{vec}(X_{2j-1})\\ &=&\text{vec}(Y_{i})^{T}\text{vec}(AX_{2j-1})=\langle AX_{2j-1},Y_{i}\rangle_{F},\\ g_{i,2j-1}&=&{x^{T}_{i}}\mathcal{A}^{T}y_{2j-1}=\langle A^{T}Y_{2j-1},X_{i}\rangle_{F},\\ h_{i,2j}&=&{y^{T}_{i}}\mathcal{A}^{-1}x_{2j}=\langle A^{-1}X_{2j},Y_{i}\rangle_{F},\\ g_{i,2j}&=&{x^{T}_{i}}\mathcal{A}^{-T}y_{2j}=\langle A^{-T}Y_{2j},X_{i}\rangle_{F}. \end{array} $$

Using the same techniques as above, we obtain that \(T_{2m}=\mathcal {W}^T_{2m}\mathcal {A}\mathcal {V}_{2m}\) and \(S_{2m}=\mathcal {W}^T_{2m}\mathcal {A}^{-1}\mathcal {V}_{2m}\). Let us prove the relation (12). Using [25, Lemma 3.1], it holds that

$$ \mathcal{A}\mathcal{V}_{2m}= \mathcal{V}_{2m}\mathcal{T}_{2m}+x_{2m+1}[t_{2m+1,2m-1},t_{2m+1,2m}]{E^{T}_{m}}, $$

where \(\mathcal {T}_{2m}=\mathcal {W}^T_{2m}\mathcal {A}\mathcal {V}_{2m}=T_{2m}\).

On the other hand,

$$ \begin{array}{@{}rcl@{}} \mathcal{A}\mathcal{V}_{2m}&=&(I_{s}\otimes A)[vec(v_{1}),\ldots,vec(v_{2m})], \\ &=&[vec(Av_{1}),\ldots, vec(Av_{2m})], \end{array} $$

while

$$ \begin{array}{@{}rcl@{}} \mathcal{V}_{2m}T_{2m}&=&\left[\begin{array}{l} \sum\limits_{i=1}^{2m}[T_{2m}e_{1}]_{i} \text{vec}(v_{i}),\ldots,\sum\limits_{i=1}^{2m}[T_{2m}e_{2m}]_{i} \text{vec}(v_{i}) \end{array}\right]\\ &=&[vec(\mathbb{V}_{2m}(T_{2m}e_{1}\otimes I_{s})),\ldots,vec(\mathbb{V}_{2m}(T_{2m}e_{2m}\otimes I_{s}))]. \end{array} $$

Define \(\tau =[t_{2m+1,2m-1},t_{2m+1,2m}]E^T_m\), then

$$ x_{2m+1}\tau=[vec(t_{2m+1,2m-1}v_{2m+1}),vec(t_{2m+1,2m}v_{2m+1})]{E^{T}_{m}}. $$

For i = 1,…,2m − 2, one shows that

$$ \text{vec}(Av_{i})=\text{vec} (\mathbb{V}_{2m}(T_{2m}e_{i}\otimes I_{s})) $$

and

$$ \begin{array}{@{}rcl@{}} \text{vec}(Av_{2m-1})&=& \text{vec} (\mathbb{V}_{2m}(T_{2m}e_{2m-1}\otimes I_{s}))+\text{vec}(v_{2m+1}(t_{2m+1,2m-1}\otimes I_{s})),\\ \text{vec}(Av_{2m})&=& \text{vec} (\mathbb{V}_{2m}(T_{2m}e_{2m}\otimes I_{s}))+\text{vec}(v_{2m+1}(t_{2m+1,2m}\otimes I_{s})). \end{array} $$

This completes the proof of (12). The proof of the other relations is similar. □

The next two propositions express the entries of T2m and S2m in terms of the recursion coefficients. This will allow us to compute the entries quite efficiently.

Proposition 4

Let the coefficients \(h_{i,j}\) and \(\alpha_{i,j}\) be as defined in (9) and (10), respectively. The matrix \(T_{2m}=[t_{i,j}]\) in (11) is pentadiagonal with the nontrivial entries

$$ \begin{array}{@{}rcl@{}} t_{i,2j-1}&=&h_{i,2j-1}\quad \text{for } i\in\{2j-3,\ldots,2j+1\},\\ t_{1,2}&=&\frac{1}{\alpha_{2,2}}(\alpha_{1,1}-\alpha_{1,2}h_{1,1}),\\ t_{2,2}&=&\frac{-1}{\alpha_{2,2}}\alpha_{1,2}h_{2,1},\\ t_{3,2}&=&\frac{-1}{\alpha_{2,2}}\alpha_{1,2}h_{3,1}. \end{array} $$

For j = 1,2,…,m − 1,

$$ \begin{array}{@{}rcl@{}} t_{2j+1,2j+2}&=&\frac{-1}{h_{2j+2,2j}}t_{2j+1,2j-1:2j+1}h_{2j-1:2j+1,2j},\\ t_{2j+2,2j+2}&=&\frac{-1}{h_{2j+2,2j}}t_{2j+2,2j+1}h_{2j+1,2j},\\ t_{2j+3,2j+2}&=&\frac{-1}{h_{2j+2,2j}}t_{2j+3,2j+1}h_{2j+1,2j}. \end{array} $$

Proof

The proof is similar to the one given in [25, Lemma 3.3]. □

Proposition 5

Let the coefficients \(h_{i,j}\) be as defined in (9). The matrix \(S_{2m}=[s_{i,j}]\) in (11) is also pentadiagonal, with the nontrivial entries, for j = 1,…, m − 1,

$$ \begin{array}{@{}rcl@{}} s_{i,2j}&=&h_{i,2j}\quad \text{for } i\in\{2j-2,\ldots,2j+2\},\\ s_{2j,2j+1}&=&\frac{-1}{h_{2j+1,2j-1}}s_{2j,2j-2:2j+1}h_{2j-2:2j+1,2j-1},\\ s_{2j+1,2j+1}&=&\frac{-1}{h_{2j+1,2j-1}}[s_{2j+1,2j}h_{2j,2j-1}+s_{2j+1,2j-1}h_{2j+1,2j-1}],\\ s_{2j+2,2j+1}&=&\frac{-1}{h_{2j+1,2j-1}}[s_{2j+2,2j}h_{2j,2j-1}+s_{2j+2,2j+1}h_{2j+1,2j+1}]. \end{array} $$

Proof

The proof is analogous to the proof of Proposition 4. □

The following result relates positive powers of S2m to negative powers of T2m.

Lemma 1

Let T2m and S2m be as given by (11), and let \(e_1=[1,0,\ldots ,0]^T\in \mathbb {R}^{2m}\). Then

$$ S_{2m}^{j}e_{1}=T_{2m}^{-j}e_{1}\quad \text{for}~~ j=1,2,\ldots,m. $$

Lemma 2

Let the matrices T2m and S2m be as defined by (11), and let \(\mathbb {V}_{2m}=[v_1,v_2,\ldots ,v_{2m}]\) be the matrix given in (12). Then we have

$$ \begin{array}{@{}rcl@{}} A^{j}v_{1}&=&\mathbb{V}_{2m}(T^{j}_{2m}e_{1}\otimes I_{s}),\quad j=0,1,\ldots,m-1, \end{array} $$
(14)
$$ \begin{array}{@{}rcl@{}} A^{-j}v_{1}&=&\mathbb{V}_{2m}(S^{j}_{2m}e_{1}\otimes I_{s}),\quad j=0,1,\ldots,m, \end{array} $$
(15)
$$ \begin{array}{@{}rcl@{}} A^{-j}v_{1}&=&\mathbb{V}_{2m}(T^{-j}_{2m}e_{1}\otimes I_{s}),\quad j=0,1,\ldots,m. \end{array} $$
(16)

Proof

It was shown in [25] that when performing m steps of standard extended nonsymmetric Lanczos method to the pairs \((\mathcal {A},v)\) and \((\mathcal {A}^T,w)\), it holds that

$$ \mathcal{A}^{j}x_{1}=\mathcal{V}_{2m}\mathcal{T}^{j}_{2m}e_{1},\quad j=0,\ldots,m-1, $$

by using the properties of the ⊗-product and the “vec” operation, we obtain

$$ \mathcal{A}^{j}x_{1}=\text{vec}(A^{j}v_{1}) \text{ and } \mathcal{V}_{2m}\mathcal{T}^{j}_{2m}e_{1}=\sum\limits_{i=1}^{2m}[T^{j}_{2m}e_{1}]_{i}\text{vec}(v_{i})=\text{vec}(\mathbb{V}_{2m}(T^{j}_{2m}e_{1}\otimes I_{s})), $$

which proves (14). Using the same techniques as above, we can prove (15). Finally, (16) follows from (15) and Lemma 1. □

Lemma 3

Let the matrices T2m and S2m be as defined by (11), and let \(\mathbb {W}_{2m}=[w_1,w_2,\ldots ,w_{2m}]\) be the matrix given in (13). Then we have

$$ \begin{array}{@{}rcl@{}} A^{j,T}w_{1}&=&\mathbb{W}_{2m}(T^{j,T}_{2m}e_{1}\otimes I_{s}),\quad j=0,1,\ldots,m-1,\\ A^{-j,T}w_{1}&=&\mathbb{W}_{2m}(S^{j,T}_{2m}e_{1}\otimes I_{s}),\quad j=0,1,\ldots,m,\\ A^{-j,T}w_{1}&=&\mathbb{W}_{2m}(T^{-j,T}_{2m}e_{1}\otimes I_{s}),\quad j=0,1,\ldots,m. \end{array} $$

Proof

The proof is similar to the one of Lemma 2. □

3 Application to the approximation of matrix functions

3.1 Computing the matrix function trace(WTf(A)V )

The expression \(\mathcal {I}_{tr}\) defined in (1) can be approximated by

$$ \begin{array}{@{}rcl@{}} \mathcal{G}^{e}_{2m}(f):=\langle W,V \rangle_{F}{e^{T}_{1}}f(T_{2m})e_{1} \end{array} $$
(17)

Lemma 4

Let the matrices T2m and S2m be as defined by (11), and let v1 and w1 be the initial block vectors computed by Algorithm 2. Then the following equalities hold:

$$ \begin{array}{@{}rcl@{}} \text{trace}({w^{T}_{1}}A^{j}v_{1})&=&{e^{T}_{1}}T^{j}_{2m}e_{1}\quad\text{for } j=0,1,\ldots,2m-1,\\ \text{trace}({w^{T}_{1}}A^{-j}v_{1})&=&{e^{T}_{1}}T^{-j}_{2m}e_{1}\quad\text{for } j=0,1,\ldots,2m. \end{array} $$
(18)

Proof

We have \(\text{trace}(w_1^TA^jv_1)=\text{vec}(w_1)^T\text{vec}(A^jv_1)\). From Theorem 1, we get y1 = vec(w1) and x1 = vec(v1). By using the properties of the “vec” operation, it follows that \(\text{trace}(w_1^TA^jv_1)=y^T_1\mathcal {A}^jx_1\). Then, using the result given in [25, Lemma 4.1], we obtain \(y^T_1\mathcal {A}^jx_1=e^T_1\mathcal {T}^{j}_{2m}e_1\). This shows the first equality of (18), since \(\mathcal {T}_{2m}=T_{2m}\). The second equality can be established by using the same techniques as above. □

Theorem 2

After m steps of Algorithm 2 with initial block vectors \(V,W\in \mathbb {R}^{n\times s}\), we get

$$ \text{trace}(W^{T}p(A)V)= \langle V,W \rangle_{F} {e^{T}_{1}}p(T_{2m})e_{1}\quad \forall p\in{\varDelta}_{-2m,2m-1}, $$

where

$$ {\varDelta}_{-2m,2m-1}=\text{span}\{x^{-2m},\ldots,x^{-1},1,x,\ldots,x^{2m-1}\}. $$

Proof

From Algorithm 2, V and W are collinear with V1 and W1, respectively, i.e., V = αV1 and W = βW1 with \(\alpha\beta=\langle V,W\rangle_F\). This implies that \(\text{trace}(W^Tp(A)V)=\langle V,W\rangle _F \text{trace}(W_1^Tp(A)V_1)\). Using the results of Lemma 4, we get \(\text{trace}(W_1^Tp(A)V_1)=e^T_1p(T_{2m})e_1\). This completes the proof. □
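Theorem 2 can be checked numerically. The sketch below takes s = 1 (so the F-inner product reduces to the ordinary one), builds F-biorthonormal bases of the two extended Krylov subspaces by explicit two-sided Gram–Schmidt over the raw columns v, A^{-1}v, Av, A^{-2}v, …, and forms T_2m as in (11); the matrix and sizes are illustrative, and the explicit orthogonalization replaces the short recurrences of Algorithm 2 for clarity:

```python
import numpy as np

rng = np.random.default_rng(9)
n, m = 50, 3  # illustrative sizes

A = np.eye(n) + 0.4 * rng.standard_normal((n, n)) / np.sqrt(n)
v, w = rng.standard_normal(n), rng.standard_normal(n)

def raw_cols(A, x, m):
    """Columns x, A^{-1}x, Ax, A^{-2}x, A^2 x, ... (2m columns in total)."""
    cols, xp, xm = [x], x, x
    for j in range(1, m + 1):
        xm = np.linalg.solve(A, xm); cols.append(xm)
        if j < m:
            xp = A @ xp; cols.append(xp)
    return np.column_stack(cols)

Kv, Kw = raw_cols(A, v, m), raw_cols(A.T, w, m)
V2m, W2m = np.zeros((n, 2 * m)), np.zeros((n, 2 * m))
for j in range(2 * m):
    rv, rw = Kv[:, j].copy(), Kw[:, j].copy()
    for i in range(j):
        rv -= (W2m[:, i] @ rv) * V2m[:, i]
        rw -= (V2m[:, i] @ rw) * W2m[:, i]
    d = rw @ rv
    V2m[:, j], W2m[:, j] = rv / np.sqrt(abs(d)), rw / (d / np.sqrt(abs(d)))

T2m = W2m.T @ A @ V2m           # the projected matrix of (11) for s = 1
```

Both sides of the identity of Theorem 2 then agree for every Laurent monomial \(x^{j}\) with −2m ≤ j ≤ 2m − 1.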

[Algorithm 3: listing not reproduced]

3.2 Computing the matrix function WTf(A)V

The aim of this subsection is to show how to use the extended nonsymmetric global Lanczos algorithm to approximate \(\mathcal {I}\) in (1) (see Algorithm 2). Before describing the application of the proposed method, we notice that the extended nonsymmetric block Lanczos method (ENBL) given in [2] can also be used to approximate \(\mathcal {I}\). After m steps of this algorithm applied to the pairs (A, V ) and (AT,W), we obtain two n × 2ms bi-orthonormal bases \({\mathcal{V}}_{2m}\) and \({\mathcal{W}}_{2m}\), i.e., \({\mathcal{W}}^T_{2m}{\mathcal{V}}_{2m}=I_{2ms}\). Then the expression \(\mathcal {I}\) can be approximated by \(W^T({\mathcal{V}}_{2m}f({\mathcal{T}}_{2m}){\mathcal{W}}^T_{2m})V\), where \({\mathcal{T}}_{2m}\) is a 2ms × 2ms block tridiagonal matrix with 2s × 2s blocks. The matrix \({\mathcal{T}}_{2m}={\mathcal{W}}^T_{2m}A{\mathcal{V}}_{2m}\) is computed recursively without requiring additional matrix-vector products with A (see [2, Proposition 3]). The ENBL algorithm becomes expensive as the number m of iterations increases, and also for large values of s.

Now, let us come back to our proposed method. Applying Algorithm 2 to the pairs (A, V ) and (AT,W) allows us to obtain two F-biorthonormal bases \(\mathbb {V}_{2m}\) and \(\mathbb {W}_{2m}\) such that V = αv1 and W = βw1. It is clear that \(\mathbb {W}_{2m}^+\mathbb {V}_{2m}=I_{2ms}\), and we can then consider the projector \(\mathcal {P}_{2m}\) defined as

$$ \begin{array}{cccc} \mathcal{P}_{2m}:& \mathbb{R}^{n\times s}&\longrightarrow &\mathbb{K}^{e}_{m}(A,V)\\ &X &\longmapsto &\mathbb{V}_{2m}\mathbb{W}_{2m}^{+}X. \end{array} $$
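The defining property \(\mathbb{W}_{2m}^{+}\mathbb{V}_{2m}=I_{2ms}\) makes \(\mathcal{P}_{2m}\) an oblique projector onto the range of \(\mathbb{V}_{2m}\). A generic sketch with synthetic factors (sizes are illustrative, and V is constructed from W so that the pseudoinverse identity holds):

```python
import numpy as np

rng = np.random.default_rng(10)
n, k = 30, 8                     # k plays the role of 2ms

W = rng.standard_normal((n, k))
Z = rng.standard_normal((n, k))
Z = Z - W @ np.linalg.solve(W.T @ W, W.T @ Z)   # make W^T Z = 0
V = W + Z                        # then pinv(W) @ V = I_k

P = V @ np.linalg.pinv(W)        # X -> V W^+ X, an oblique projector
```

P is idempotent and leaves every element of range(V) fixed, which is exactly what is used when reducing \(\mathcal{I}\) below.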

Applying the projector \(\mathcal {P}_{2m}\) to \(\mathcal {I}\) gives the reduced matrix function

$$ \mathcal{B}^{e}_{2m}(f)=C^{T}_{2m}f(A_{2m})B_{2m}, $$
(19)

where \(A_{2m}=\mathbb {W}^+_{2m}A\mathbb {V}_{2m}\in \mathbb {R}^{2ms\times 2ms}\), \(B_{2m}=\mathbb {W}^+_{2m}V\in \mathbb {R}^{2ms\times s}\) and \(C_{2m}=\mathbb {V}^T_{2m}W\in \mathbb {R}^{2ms\times s}\).

The next proposition allows us to compute A2m and B2m from the recursion matrix T2m without requiring the computation of \(\mathbb {W}^+_{2m}\) and \(A\mathbb {V}_{2m}\).

Proposition 6

Let V, W be the initial block vectors, where V = αv1. Then the matrices A2m and B2m defined above can be computed as follows:

$$ \begin{array}{lll} A_{2m}&=&(T_{2m}\otimes I_{s})+\mathbb{W}^{+}_{2m}v_{2m+1}(\left[\begin{array}{ll} t_{2m+1,2m-1},& t_{2m+1,2m} \end{array}\right]{E^{T}_{m}}\otimes I_{s}),\\ B_{2m}&=&\alpha \mathcal{E}_{1}, \end{array} $$

where \(\left [\begin{array}{ll} t_{2m+1,2m-1},& t_{2m+1,2m} \end{array}\right ]\) is defined by (12), and \({\mathcal{E}}_1\) corresponds to the first s columns of the identity matrix \(I_{2ms}\).

Proof

Let \(E_j=[e_{2j-1},e_{2j}]\in \mathbb {R}^{2m\times 2}\), for j = 1,…,m. Multiplying the matrix A2m from the right by \(E_{j}\otimes I_{s}\) gives

$$A_{2m}(E_{j}\otimes I_{s})=\mathbb{W}^{+}_{2m}A\mathbb{V}_{2m}(E_{j}\otimes I_{s}).$$

Using (12), we get

$$ A_{2m}(E_{j}\otimes I_{s})=(T_{2m}E_{j}\otimes I_{s})+\mathbb{W}_{2m}^{+}v_{2m+1}\left( \left[\begin{array}{ll} t_{2m+1,2m-1},& t_{2m+1,2m} \end{array}\right]{E^{T}_{m}}E_{j}\!\otimes\! I_{s}\right). $$

It follows that, for j = 1,…,m − 1,

$$A_{2m}(E_{j}\otimes I_{s})=T_{2m}E_{j}\otimes I_{s},$$

while for j = m, it results that

$$ A_{2m}(E_{m}\otimes I_{s})=(T_{2m}E_{m}\otimes I_{s})+\mathbb{W}_{2m}^{+}v_{2m+1}\left( \left[\begin{array}{ll} t_{2m+1,2m-1},& t_{2m+1,2m} \end{array}\right]\otimes I_{s}\right). $$

Therefore,

$$ A_{2m}=(T_{2m}\otimes I_{s})+\mathbb{W}^{+}_{2m}v_{2m+1}\left( \left[\begin{array}{ll} t_{2m+1,2m-1},& t_{2m+1,2m} \end{array}\right]{E^{T}_{m}}\otimes I_{s}\right). $$

To show the expression of B2m, we use the fact that V = αv1, which implies that

$$ \begin{array}{@{}rcl@{}} B_{2m}&=&\mathbb{W}_{2m}^{+}V=\alpha \mathbb{W}_{2m}^{+}v_{1}\\ &=&\alpha \mathbb{W}_{2m}^{+}\mathbb{V}_{2m}\mathcal{E}_{1}=\alpha \mathcal{E}_{1}. \end{array} $$

Lemma 5

Let A2m be the matrix defined in (19), and let \(\mathbb {V}_{2m}=[v_1,\ldots ,v_{2m}]\) be defined by (12). Then

$$ \begin{array}{@{}rcl@{}} A^{j}v_{1}&=&\mathbb{V}_{2m}A^{j}_{2m}\mathcal{E}_{1}\quad \text{for } j=0,\ldots,m-1,\\ A^{-j}v_{1}&=&\mathbb{V}_{2m}A^{-j}_{2m}\mathcal{E}_{1}\quad \text{for } j=0,\ldots,m. \end{array} $$
(20)

Proof

The first equation is shown by induction. Av1 and v1 are two elements of \(\mathbb {K}^e_m(A,v_1)\) and then

$$ Av_{1}=\mathbb{V}_{2m}\mathbb{W}^{+}_{2m}Av_{1}=\mathbb{V}_{2m}\mathbb{W}^{+}_{2m}A\mathbb{V}_{2m}\mathbb{W}^{+}_{2m}v_{1}=\mathbb{V}_{2m}A_{2m}\mathcal{E}_{1}. $$

The equality is true for j = 1. Let j = 2,…,m − 1, and assume that

$$ A^{k}v_{1}=\mathbb{V}_{2m}A^{k}_{2m}\mathcal{E}_{1} \quad k=1,\ldots,j-1. $$

Since \(A^jv_1\in \mathbb {K}^e_m(A,v_1)\), it follows that \(A^jv_1=\mathbb {V}_{2m}\mathbb {W}^+_{2m}A^jv_1\). By induction, we have

$$ A^{j}v_{1}=\mathbb{V}_{2m}\mathbb{W}^{+}_{2m}A\mathbb{V}_{2m}A^{j-1}_{2m}\mathcal{E}_{1},$$

and since \(\mathbb {W}^{+}_{2m}A\mathbb {V}_{2m}=A_{2m}\), this yields \(A^{j}v_{1}=\mathbb {V}_{2m}A_{2m}A^{j-1}_{2m}\mathcal {E}_{1}=\mathbb {V}_{2m}A^{j}_{2m}\mathcal {E}_{1}\), which completes the proof of the first equation in (20).

The second equation follows from the fact that \(\mathbb {V}_{2m}\mathbb {W}^+_{2m}A^{-j}v_1=A^{-j}v_1\) for j = 1,…,m, together with the same induction argument as above. □

Lemma 6

Let A2m be the matrix defined by (19) and let \(\mathbb {W}_{2m}=[ w_1,\ldots ,w_{2m}]\) be the matrix in (13). Then

$$ \begin{array}{@{}rcl@{}} A^{j,T}w_{1}&=&\mathbb{W}^{+,T}_{2m}A^{j,T}_{2m}\mathbb{V}^{T}_{2m}w_{1}\quad \text{for } j=0,\ldots,m-1,\\ A^{-j,T}w_{1}&=&\mathbb{W}^{+,T}_{2m}A^{-j,T}_{2m}\mathbb{V}^{T}_{2m}w_{1}\quad \text{for } j=0,\ldots,m. \end{array} $$

Proof

We have \(A^{j,T}w_1\in \mathbb {K}^{e}_{m}(A^T,W)\), which means that \(\mathbb {W}^{+,T}_{2m}\mathbb {V}^T_{2m}A^{j,T}w_1=A^{j,T}w_1\) for any integer j such that −m ≤ j ≤ m − 1. Combining this equality with the same induction argument as in the proof of Lemma 5 establishes both equations. □

Proposition 7

After m steps of the process, we have

$$ \begin{array}{@{}rcl@{}} W^{T}A^{-j}V&=&C^{T}_{2m}A^{-j}_{2m}B_{2m},\quad \text{for } j=0,\ldots,2m,\\ W^{T}A^{j}V&=&C^{T}_{2m}A^{j}_{2m}B_{2m},\quad \text{for } j=0,\ldots,2m-1. \end{array} $$

Proof

The proof applies the results of Lemmas 5 and 6. Write j = j1 + j2 with j1,j2 ∈{0,…,m}. Since V = αv1 and W = βw1, we have

$$ \begin{array}{@{}rcl@{}} W^{T}A^{-j}V&=&\alpha\beta {w^{T}_{1}}A^{-j_{1}}A^{-j_{2}}v_{1}\\ &=&\alpha\beta(A^{-j_{1},T}w_{1})^{T}(A^{-j_{2}}v_{1})\\ \end{array} $$

Using the equations of Lemmas 5 and 6, we obtain

$$ \begin{array}{@{}rcl@{}} &=&\alpha\beta(\mathbb{W}^{+,T}_{2m}A^{-j_{1},T}_{2m}\mathbb{V}^{T}_{2m}w_{1})^{T}(\mathbb{V}_{2m}A^{-j_{2}}_{2m}\mathcal{E}_{1})\\ &=&W^{T}\mathbb{V}_{2m}A^{-j}_{2m}\mathbb{W}^{+}_{2m}V\\ &=&C_{2m}^{T}A^{-j}_{2m}B_{2m}, \end{array} $$

which completes the proof of the first relation. The proof of the second relation is similar. □

We now compare the operation requirements of the extended nonsymmetric block Lanczos method (ENBL) and the extended nonsymmetric global Lanczos method (ENGL). Both methods require the same cost for computing matrix-matrix products AV with \(V\in \mathbb {R}^{n\times s}\), and the same cost for solving the associated linear systems. The ENBL method requires 4ns2 operations to compute the n × 2s matrix VH and another 4ns2 to compute the 2s × 2s matrix WTV, while the ENGL method only needs 4ns operations to compute WTV and 4ns to compute V (H ⊗ Is). To update the new bases Vj+ 1 and Wj+ 1, the ENBL method has to perform the bi-orthonormalization of \(\widehat {V}_{j+1}\) and \(\widehat {W}_{j+1}\), which costs 16ns2 per step, while the global bi-orthonormalization costs only 16ns. Computing \({\mathcal{T}}_{2m}\) using [2, Proposition 3] requires 2s(4ns + n + 1)(m + 1) operations, while computing T2m in the ENGL method costs only (10ns + 2)(m + 1). Moreover, the computation of \({\mathcal{T}}_{2m}\) requires solving linear systems of size s × s at every step, while the ENGL method only needs divisions by scalars (see Proposition 4). In Table 1, we summarize the number of operations after m iterations of the ENBL and ENGL algorithms. The ENGL algorithm becomes increasingly cheaper than the ENBL algorithm as the number of iterations m grows and when the block size s is not small.
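To make this comparison concrete, the per-step and total counts quoted above can be tallied for given (n, s, m). The following Python sketch uses only the terms listed in the text; the function names are ours, and any per-step costs not quoted above are deliberately omitted:

```python
def enbl_step_cost(n, s):
    # Per-iteration counts quoted for the block method (ENBL):
    # 4ns^2 for V*H, 4ns^2 for W^T V, 16ns^2 for bi-orthonormalization.
    return 4*n*s**2 + 4*n*s**2 + 16*n*s**2

def engl_step_cost(n, s):
    # Global counterparts (ENGL): each factor s^2 drops to s.
    return 4*n*s + 4*n*s + 16*n*s

def enbl_total(n, s, m):
    # m iterations plus the quoted cost 2s(4ns + n + 1)(m + 1) of forming T_2m.
    return m*enbl_step_cost(n, s) + 2*s*(4*n*s + n + 1)*(m + 1)

def engl_total(n, s, m):
    # m iterations plus the quoted cost (10ns + 2)(m + 1) of forming T_2m.
    return m*engl_step_cost(n, s) + (10*n*s + 2)*(m + 1)
```

The step costs differ by exactly a factor s, consistent with the observation that ENGL is the cheaper method once the block size s is not small.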

Table 1 Comparison of the extended nonsymmetric block Lanczos (ENBL) and the extended nonsymmetric global Lanczos (ENGL) algorithms

4 Numerical experiments

In this section, we present some numerical examples that illustrate the performance of the extended nonsymmetric global Lanczos (ENGL) method. In the selected examples, the proposed method is applied to the approximation of expressions of the form \(\mathcal {I}_{tr}\) and \(\mathcal {I}\) given in (1). All experiments were carried out in MATLAB R2015a on a computer with an Intel Core i-3 processor and 3.89 GB of RAM. The computations were done with about 15 significant decimal digits.

As mentioned in the second section, A− 1 does not have to be computed explicitly. In all examples, the matrix products with A− 1 and AT in lines 2, 8, and 9 of Algorithm 4 are computed via an LU factorization or by using an iterative solver. We used a preconditioned block biconjugate gradient (PBBiCG) method as described in Algorithm 5.
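Applying A− 1 to a block through a single reusable LU factorization, rather than forming the inverse, can be sketched as follows. This is an illustrative Python/SciPy stand-in for the MATLAB computation, and the helper name is ours:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def apply_inverse_block(A, V):
    # Solve A X = V for the block V with one sparse LU factorization;
    # A^{-1} is never formed explicitly, and in an iteration the factors
    # would be computed once and reused for every block solve.
    lu = splu(sp.csc_matrix(A))
    return lu.solve(V)
```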


4.1 Examples for approximations of trace(WTf(A)V )

Example 1

In this experiment, we compared the performance of the ENGL algorithm with that of the extended global Arnoldi (EGA) algorithm described in [1, 15]. We approximate \(\text {trace}(E^T_1\exp (A)E_1)\), where A is the nonsymmetric adjacency matrix pesa of order n = 11738, obtained from the SuiteSparse Matrix Collection [8]. The block \(E_1\in \mathbb {R}^{n\times s}\) consists of the first s columns of the identity matrix In. Results for several choices of the block size s and the number of iterations m are reported in Table 2. In both the ENGL and EGA algorithms, we used the PBBiCG method of Algorithm 5 with an ILU preconditioner. As observed from this table, the approximations determined by ENGL are more accurate than those obtained with the EGA method.
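For validation on a small dense matrix, the reference value of \(\text {trace}(E^T_1\exp (A)E_1)\) reduces to the trace of the leading s × s block of exp(A), since E1 holds the first s columns of the identity. A minimal Python sketch of this check (the function name is ours; the paper's experiments use MATLAB):

```python
import numpy as np
from scipy.linalg import expm

def trace_block_exp(A, s):
    # trace(E1^T exp(A) E1) with E1 the first s columns of the identity:
    # this equals the trace of the leading s-by-s block of exp(A).
    return np.trace(expm(A)[:s, :s])
```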

Table 2 Example 1: \(A\in \mathbb {R}^{n\times n}\) is the nonsymmetric adjacency pesa matrix with n = 11738

Example 2

In this example, we used a diagonalizable matrix of order n = 1000 whose eigenvalues are log-uniformly distributed in the interval [10− 1,104] and whose eigenvectors are random. We computed approximations of \(\text {trace}(E^T_1f(A)E_1)\) with the ENGL method and with the standard nonsymmetric global Lanczos method (SNGL), using s = 6. In Table 3, we report the number of iterations, the relative errors, and the required CPU times for different functions f. Here we used an LU factorization to compute products of the form A− 1V and ATW. The results show that the ENGL algorithm is faster and gives smaller relative errors, while the SNGL algorithm is unable to determine an accurate approximation for any of the functions f used in this example.
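A test matrix of this type can be generated as follows. This is a sketch under our own assumptions: the paper does not specify the random eigenvector matrix or a seed, and the helper name is ours.

```python
import numpy as np

def log_uniform_test_matrix(n, lam_min=1e-1, lam_max=1e4, seed=0):
    # Diagonalizable nonsymmetric test matrix: eigenvalues log-uniformly
    # spaced in [lam_min, lam_max], eigenvector matrix drawn at random.
    rng = np.random.default_rng(seed)
    lam = np.logspace(np.log10(lam_min), np.log10(lam_max), n)
    X = rng.standard_normal((n, n))
    return X @ np.diag(lam) @ np.linalg.inv(X)
```

Since the trace is invariant under similarity, trace(A) equals the sum of the prescribed eigenvalues, which gives a cheap sanity check on the construction.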

Table 3 Example 2: \(A\in \mathbb {R}^{n\times n}\) has eigenvalues distributed in the interval [10− 1,104] and a random eigenvector matrix. The block size s = 6

4.2 Examples for the approximation of WTf(A)V

In this subsection, we present some results for the approximation of WTf(A)V using the ENGL algorithm. In the following experiments, we used the functions \(f(x)=\sqrt {x}\) and f(x) = x− 1/3. The blocks W and V were generated randomly with entries uniformly distributed on [0,1]. The matrix A was obtained from the centered finite difference discretization (CFDD) of the elliptic operators given by (21) on the unit square [0,1] × [0,1] with homogeneous Dirichlet boundary conditions. The number of inner grid points in each direction is n0, so that the matrices have dimension \(n=n_0^2\).

$$ \begin{array}{ll} \mathcal{L}_{1}(u)&=-100u_{xx}-u_{yy}+10xu_{x},\\ \mathcal{L}_{2}(u)&=-e^{-xy}u_{xx}-e^{xy}u_{yy}+1/(x+y)u_{x}. \end{array} $$
(21)
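For illustration, the discretization of \({\mathcal{L}}_1(u)\) can be assembled from Kronecker products of one-dimensional difference matrices. The sketch below makes assumptions the paper does not state (grid ordering with the x-index varying fastest), and the helper name cfdd_L1 is ours:

```python
import numpy as np
import scipy.sparse as sp

def cfdd_L1(n0):
    # Centered finite differences for L1(u) = -100 u_xx - u_yy + 10 x u_x
    # on (0,1)^2 with homogeneous Dirichlet conditions: n0 inner points
    # per direction, mesh width h = 1/(n0 + 1), matrix size n = n0^2.
    h = 1.0 / (n0 + 1)
    x = np.arange(1, n0 + 1) * h                              # inner grid points
    e = np.ones(n0)
    D2 = sp.diags([-e[:-1], 2*e, -e[:-1]], [-1, 0, 1]) / h**2  # 1-D stencil for -u''
    D1 = sp.diags([-e[:-1], e[:-1]], [-1, 1]) / (2*h)          # 1-D centered u'
    I = sp.identity(n0)
    # Assumed ordering: x varies within a block, y across blocks.
    A = 100*sp.kron(I, D2) + sp.kron(D2, I) + 10*sp.kron(I, sp.diags(x) @ D1)
    return sp.csr_matrix(A)
```

The convection term contributes nothing to the diagonal (the centered first-difference stencil has a zero center), so each diagonal entry is (100 + 1)·2/h2.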

Example 3

We consider the approximation of WTA1/2V and WTB1/2V, where \(A,B\in \mathbb {R}^{4900\times 4900}\) are nonsymmetric matrices obtained from the CFDD of the operators \({\mathcal{L}}_1(u)\) and \({\mathcal{L}}_2(u)\) given by (21), respectively. The block size was s = 4. In Fig. 1, we report the relative errors of the ENBL, ENGL, and SNBL algorithms versus the dimension of the projection subspace; the left plot corresponds to the matrix A and the right plot to the matrix B. Both plots show that the ENGL and ENBL algorithms yield significantly smaller errors than the SNBL algorithm.

Fig. 1 Approximation of \(E^T_1A^{1/2}E_1\) (left plot) and \(E^T_1B^{1/2}E_1\) (right plot)

Example 4

In this example, we consider nonsymmetric matrices obtained from the CFDD of the same operators as in Example 3. In Tables 4 and 5, we report results for the ENGL and ENBL algorithms when approximating \(\mathcal {I}\). We used the dimensions n ∈{2500,4900,7225,10000} with the block sizes s = 10,20 in Table 4 and s = 30,40 in Table 5. For the last two values of n, we used PBBiCG with a block ILU preconditioner (see [23]). As shown in Tables 4 and 5, when the block size s increases, the approximations of \(\mathcal {I}\) computed with ENGL are more accurate than those produced by the ENBL algorithm.

Table 4 Example 4: Approximation of WTf(A)V for two functions and different matrix dimensions for the operators given by (21)
Table 5 Example 4: Approximation of WTf(A)V for two functions and different matrix dimensions for the operators given by (21)

5 Conclusion

This paper describes an extended nonsymmetric global Lanczos method for the approximation of trace(WTf(A)V ) and WTf(A)V. Two F-biorthonormal bases of the extended Krylov subspaces given by (5) are computed by short recurrence formulas, and several useful algebraic relations were derived. The numerical results show that, when approximating trace(WTf(A)V ) and WTf(A)V, the extended nonsymmetric global Lanczos method requires fewer iterations and less CPU time than the standard nonsymmetric global Lanczos method and the extended global Arnoldi method.