Abstract
Motivated by the effectiveness of Krylov projection methods and the CP decomposition of tensors, which is a low rank decomposition, we propose Arnoldi-based methods (block and global) to solve the Sylvester tensor equation with low rank right-hand sides. We apply a standard Krylov subspace method to each coefficient matrix in order to reduce the main problem to a projected Sylvester tensor equation, which can be solved by a global iterative scheme. We show how to extract approximate solutions via matrix Krylov subspace bases. Several theoretical results, such as expressions of the residual and its norm, are presented. To show the performance of the proposed approaches, some numerical experiments are given.
1 Introduction
A tensor is a multi-dimensional array, in which the order is the number of dimensions, also known as ways or modes. In the past few years, tensors have attracted significant attention in several applications, such as image processing, machine learning, and scientific computing [15, 20]. One of the popular problems in tensor-based modeling is the following equation, known as the Sylvester tensor equation
where the matrices \(A^{(n)}\in \mathbb {R}^{I_{n}\times I_{n}}, n=1,2,\cdots , N\), the right-hand side tensor \({\mathscr{B}} \in \mathbb {R}^{I_{1} \times I_{2} \times {\cdots } \times I_{N}}\) are known, and \(\mathcal {X}\in \mathbb {R}^{I_{1} \times I_{2} \times {\cdots } \times I_{N}}\) is the unknown tensor. The product ×i, i = 1, … , N, and some notations related to the concept of tensors will be specified in the next section. For simplicity, in the sequel, we define the following linear operator
It is easy to verify that (1) is equivalent to the following linear system of equations
with \(\mathbb {A}={\sum }_{n=1}^{N} I_{I_{N}}\otimes \cdots \otimes I_{I_{n+1}}\otimes A^{(n)}\otimes I_{I_{n-1}}\otimes \cdots \otimes I_{I_{1}}, x=vec\left (\mathcal {X}\right ),\) and \(b=vec\left ({\mathscr{B}}\right ),\) where ⊗ denotes the Kronecker product (defined in the next section), \(I_{I_{n}}\) stands for the identity matrix of order In and the operator vec rearranges a tensor’s elements in a vector. It is well known that the Sylvester tensor equation (1) has a unique solution if and only if λ1 + λ2 + … + λN ≠ 0, for all λi ∈ σ(A(i)), where σ(A(i)) denotes the spectrum of A(i) [6, Lemma 4.2].
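To make the equivalence between (1) and (3) concrete, the following sketch builds both sides for a small third-order example and checks that they agree under column-major vectorization. The code is illustrative only (the paper's experiments use Matlab); the helper name `mode_n_product` is ours.

```python
import numpy as np

rng = np.random.default_rng(0)
I1, I2, I3 = 3, 4, 5
A = [rng.standard_normal((I, I)) for I in (I1, I2, I3)]
X = rng.standard_normal((I1, I2, I3))

def mode_n_product(T, M, n):
    # Contract mode n of T with the columns of M, putting the new axis back at n.
    return np.moveaxis(np.tensordot(M, T, axes=(1, n)), 0, n)

# Left-hand side of (1): sum of the n-mode products.
LHS = sum(mode_n_product(X, A[n], n) for n in range(3))

# Kronecker-sum operator of (3); vec is column-major (first index fastest).
I_ = [np.eye(I) for I in (I1, I2, I3)]
K = (np.kron(I_[2], np.kron(I_[1], A[0]))
     + np.kron(I_[2], np.kron(A[1], I_[0]))
     + np.kron(A[2], np.kron(I_[1], I_[0])))

vec = lambda T: T.reshape(-1, order="F")
err = np.linalg.norm(K @ vec(X) - vec(LHS))
```

The ordering of the Kronecker factors matches the column-major vec convention: the mode-1 factor sits rightmost.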
Note that the coefficient matrix of the linear system (3) is of order \(\displaystyle \prod \limits _{i=1}^{N}I_{i}\), which may become very large even for moderate values of I1, … , IN, so solving this linear system can be a real challenge. When \(\mathcal {X}\) is a tensor of order two, i.e., a matrix, (1) reduces to
which is exactly the well-known Sylvester matrix equation; it has been widely used in control and communication theory, image restoration, and numerical methods for ordinary differential equations (see [4] and the references therein). The Sylvester tensor equation (1) can arise when discretizing high dimensional linear partial differential equations using finite difference or spectral methods [19, 21, 24]; a survey of tensor-structured numerical methods in applications to multidimensional problems in scientific computing is given in [15]. As an example of the applications of (1), one can think of some Laplace-like operator L in an N-dimensional domain
In this paper, we are interested in the case where the discretized right-hand side in (1) is of low rank—this is possible when the function f is sufficiently smooth to be well approximated by a short sum of separable functions (see, e.g., [3, 11]). In general, the right-hand side tensor \({\mathscr{B}}\) can be approximated by a low rank tensor using CP decomposition [16].
In recent years, various methods have been proposed in order to solve (1). For instance, the tensor format of the GMRES method (GMRES-BTF) has been established by Chen and Lu [6]. In [7], gradient-based iterative algorithms have been proposed for solving (1); they are based on the hierarchical identification principle [9] and tensor arithmetic concepts. Kressner and Tobler proposed a tensor Krylov subspace method to solve (3) when the right-hand side is given in a tensor product structure, i.e., is of rank one. The idea is based on applying a standard Krylov subspace method to the coefficient matrices, in order to approximate the solution by a vector of low tensor rank [17]. Ballani and Grasedyck presented an iterative scheme similar to a Krylov subspace method to solve (3), relying on a truncation operator implemented in the hierarchical Tucker format [16], to allow applications in high dimensions [2]. Some well-known Krylov subspace methods have been studied in their tensor format by Beik et al. in [1]; the authors have described the tensor format of the full orthogonalization method (FOM-BTF) and conjugate gradient (CG-BTF)–type iterative algorithms. These methods are attractive if the coefficient matrices are not large, since they are based on the use of the tensor Krylov subspace associated with the operator \({\mathscr{M}}\) defined by (2).
Following the framework of Krylov subspace methods, we apply an Arnoldi-based algorithm to the coefficient matrices to get a reduced Sylvester tensor equation, which can be solved by a global iterative scheme. The approximate solution is then constructed from the solution of the reduced equation and the Krylov subspace bases associated with the coefficient matrices.
The rest of this paper is organized as follows. In Section 2, we give notations adopted in this paper and some basic definitions and properties related to tensors. Section 3 is dedicated to theoretical results when the right-hand side tensor in (1) is of rank one. In Section 4, we present our approaches to solve (1) with a right-hand side tensor of a specific rank; we give the block Krylov approach in Section 4.1 and the global Krylov approach in Section 4.2. Some numerical examples are presented in Section 5 to evaluate the performance of our approaches. Finally, in Section 6, we give a brief conclusion and perspectives.
2 Notations and preliminary concepts
In this section, we introduce some basic definitions, tensor notations, and common operations related to tensors adopted in this paper (for more details, see [16]). Throughout this paper, vectors (tensors of order one), matrices (tensors of order two), and higher order tensors (order three or higher) are signified by lower-case letters, capital letters, and Euler script letters, respectively. A tensor \(\mathcal {X}\) is an element in \(\mathbb {R}^{I_{1}\times I_{2}\times \cdots \times I_{N}}\), where \(I_{1},I_{2},\cdots ,I_{N}\in \mathbb {N}\); its entries are denoted by \(\mathcal {X}_{i_{1}i_{2}{\cdots } i_{N}},\) with in ∈{1,⋯ , In}, for every 1 ≤ n ≤ N. Fibers are the higher order analogue of matrix rows and columns. A fiber is defined by fixing every index but one. The n-mode fiber is denoted by \(\mathcal {X}_{i_{1},{\cdots } i_{n-1},:,i_{n+1},{\cdots } ,i_{N}}\in \mathbb {R}^{I_{n}}\), where the indices ij are fixed for j = 1, ⋯ , n − 1, n + 1, ⋯ , N. The notation \(\overline {i_{1}{\ldots } i_{N}}\) corresponds to a multi-index, which is obtained as follows
The inner product of two same size tensors \(\mathcal {A},{\mathscr{B}}\in \mathbb {R}^{I_{1}\times I_{2}\times \cdots \times I_{N}}\) is defined by
and the norm induced by this inner product is
Definition 1
Let \( \mathcal {A} \in \mathbb {R}^{I_{1}\times I_{2} \times {\cdots } \times I_{N}} \) be an N th-order tensor and \( U\in \mathbb {R}^{J\times I_{n}} \) be a matrix. The n-mode product of \( \mathcal {A} \) and U, denoted by \(\mathcal {A}\times _{n}U \), is a tensor of size
whose entries are given by
For \(\mathcal {X}\in \mathbb {R}^{I_{1}\times {\cdots } I_{N}}\) and {A} a set of matrices \(A_{n}\in \mathbb {R}^{I_{n}\times I_{n}}, n=1,2,{\cdots } ,N\), their multiplication in all possible modes (n = 1, 2, ... , N) is denoted as
and
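The n-mode product defined above can be realized by contracting the n-th index of the tensor with the columns of the matrix. The following sketch (illustrative code, not from the paper; the helper name `mode_n_product` is ours) verifies one entry of a 2-mode product against the elementwise definition:

```python
import numpy as np

rng = np.random.default_rng(1)
A_t = rng.standard_normal((3, 4, 5))     # a third-order tensor
U = rng.standard_normal((6, 4))          # maps mode 2 (size 4) to size 6

def mode_n_product(T, M, n):
    # n is 0-based here: contract mode n of T with the columns of M,
    # then move the new axis back to position n.
    return np.moveaxis(np.tensordot(M, T, axes=(1, n)), 0, n)

P = mode_n_product(A_t, U, 1)            # result is 3 x 6 x 5
# Elementwise definition: P[i1, j, i3] = sum_{i2} A_t[i1, i2, i3] * U[j, i2]
entry = sum(A_t[2, i2, 4] * U[3, i2] for i2 in range(4))
err = abs(P[2, 3, 4] - entry)
```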
Proposition 1
Let\(\mathcal {A}\in \mathbb {R}^{I_{1}\times \cdots \times I_{N}}\)be an N th-order tensor, \(U\in \mathbb {R}^{J\times I_{m}}\), \(V\in \mathbb {R}^{K\times I_{n}}\)and\(W\in \mathbb {R}^{I_{n}\times I_{n}}\)be three matrices, then for distinct modes in a series of multiplication, the order of the multiplication is irrelevant, i.e.,
If the modes are the same, then
Definition 2
[16]
The outer product of two tensors \(\mathcal {A}\in \mathbb {R}^{I_{1}\times I_{2}\times \ldots \times I_{N}}\) and \({\mathscr{B}}\in \mathbb {R}^{J_{1}\times J_{2}\times \cdots \times J_{M}}\) is a tensor denoted by \(\mathcal {A}\circ {\mathscr{B}}=\mathcal {C}\in \mathbb {R}^{I_{1}\times I_{2}\times \cdots \times I_{N}\times J_{1}\times J_{2}\times \cdots \times J_{M}}\).
Elementwise,
If v1, v2, … , vN are N vectors of sizes Ii, i = 1, … , N, their outer product is an N th-order tensor of size I1 ×… × IN and we have
Definition 3
[16]
An N th-order tensor \( \mathcal {A} \in \mathbb {R}^{I_{1}\times I_{2} \times \cdots \times I_{N}} \) is of rank one if it can be written as the outer product of N vectors \(v_{k}\in \mathbb {R}^{I_{k}}, k=1,2,\cdots ,N\), i.e.,
A tensor is of rank \(R\in \mathbb {N}\) if it can be written as the sum of R rank one tensors.
Definition 4
The Kronecker product of two matrices \(A\in \mathbb {R}^{I_{1}\times I_{2}}\) and \(B\in \mathbb {R}^{J_{1}\times J_{2}}\) is a matrix of size I1J1 × I2J2 denoted by A ⊗ B, where
The Kronecker product of two tensors \(\mathcal {A}\in \mathbb {R}^{I_{1}\times \cdots \times I_{N}}\) and \({\mathscr{B}}\in \mathbb {R}^{J_{1}\times \cdots \times J_{N}}\) is defined by
where
for in = 1, ⋯ , In, jn = 1, ⋯ , Jn, n = 1, ⋯ , N.
In the following remark, we state the link between the Kronecker product of three vectors and their outer product.
Remark 1
[8, Page 33]
Let v1, v2, and v3 be three vectors of sizes I1, I2, and I3, respectively; then we have
It is easy to verify that the above remark still holds for N vectors. This property shows that the idea in Section 3 of the current paper can be considered as a reformulation of the one in [17].
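The link between the outer product and the Kronecker product can be checked numerically. The sketch below (illustrative only) uses the column-major (Fortran) vectorization convention, under which vectorizing v1 ∘ v2 ∘ v3 gives the Kronecker product of the factors in reverse order:

```python
import numpy as np

rng = np.random.default_rng(2)
v1, v2, v3 = (rng.standard_normal(I) for I in (3, 4, 5))

# Outer product v1 ∘ v2 ∘ v3: a rank-one 3 x 4 x 5 tensor.
T = np.einsum("i,j,k->ijk", v1, v2, v3)

# Column-major vec of the outer product equals the reversed Kronecker product.
lhs = T.reshape(-1, order="F")
rhs = np.kron(v3, np.kron(v2, v1))
err = np.linalg.norm(lhs - rhs)
```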
Proposition 2
[18]
Let \(\mathcal {A}=a_{1}\circ a_{2}\circ {\cdots } \circ a_{N}\) and \({\mathscr{B}}=b_{1}\circ b_{2}\circ {\cdots } \circ b_{N}\) denote two rank one tensors, then, the Kronecker product \(\mathcal {A}\otimes {\mathscr{B}}\) can be expressed by
It is well known that if A, B, C, and D are four matrices, we have
We give an elegant generalization of this property to tensors as follows
Proposition 3
Let\(\mathcal {A}\in \mathbb {R}^{I_{1}\times \cdots \times I_{N}}\), \({\mathscr{B}}\in \mathbb {R}^{J_{1}\times \cdots \times J_{N}}\)be two tensors, and\(A\in \mathbb {R}^{K_{n}\times I_{n}}\)and\(B\in \mathbb {R}^{L_{n}\times J_{n}}\)be two matrices, then
Proof
□
It is easy to verify the following result
Proposition 4
Let\(\mathcal {A}=a_{1}\circ a_{2}\circ {\cdots } \circ a_{N}\)be a rank one tensor and {V} a set of N matrices {V1, V2, … , VN}; then we have
Definition 5 (CP decomposition [16])
Let \( \mathcal {A} \in \mathbb {R}^{I_{1}\times I_{2} \times {\cdots } \times I_{N}} \) be an N th-order tensor. The CP decomposition of \(\mathcal {A}\)is
where \(a_{r}^{(k)}\) are vectors of size Ik with 1 ≤ k ≤ N. If we define \(A_{n}=\left [a_{1}^{(n)} a_{2}^{(n)}{\cdots } a_{R}^{(n)} \right ] \) for \(n\in \left \lbrace 1,{\cdots } ,N \right \rbrace \), the CP decomposition can be symbolically written as
the matrices \(A_{n}\in \mathbb {R}^{I_{n}\times R}\) are called factor matrices. Often, the vectors \(a_{r}^{(n)}\) are chosen such that \(\Vert a_{r}^{(n)} \Vert =1\). In this case, the CP decomposition is written as
where λr is a scalar that compensates for the magnitudes of vectors \( a_{r}^{(n)}\).
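The CP decomposition can be reconstructed directly from its factor matrices. The following sketch (illustrative code, not from the paper) builds a rank-R third-order tensor both as a sum of R rank one terms and in one contraction from the factor matrices, and checks that the two constructions agree:

```python
import numpy as np

rng = np.random.default_rng(3)
I1, I2, I3, R = 4, 5, 6, 3
# Factor matrices A_n, whose r-th columns are the vectors a_r^(n).
A1, A2, A3 = (rng.standard_normal((I, R)) for I in (I1, I2, I3))

# CP tensor as an explicit sum of R rank one terms.
T_sum = sum(np.einsum("i,j,k->ijk", A1[:, r], A2[:, r], A3[:, r])
            for r in range(R))

# Same tensor in a single contraction over the shared rank index r.
T_cp = np.einsum("ir,jr,kr->ijk", A1, A2, A3)
err = np.linalg.norm(T_sum - T_cp)
```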
In the following, we denote by m = (m1, m2, ⋯ , mN) a multi-index that represents all the Krylov subspace dimensions, \(e_{m_{i}}=\left [0,{\ldots } ,0,1\right ]^{T}\in \mathbb {R}^{m_{i}} \), \(e_{1}^{(m_{i})}=\left [1,0,{\ldots } ,0\right ]^{T}\in \mathbb {R}^{m_{i}} \), \(E_{m_{i}}=\left [0_{R},{\ldots } ,0_{R},I_{R}\right ]^{T}\in \mathbb {R}^{Rm_{i}\times R}\), \(E_{1}^{(m_{i})}=\left [I_{R},0_{R},{\ldots } ,0_{R}\right ]^{T}\in \mathbb {R}^{Rm_{i}\times R}\) and \(e_{r}^{(R)}\in \mathbb {R}^{R}\) the r th vector from the canonical basis of \(\mathbb {R}^{R}\), where 0R and IR correspond to the zero matrix and the identity matrix of size R, respectively.
3 Rank one right-hand side tensor
In this section, we assume that the right-hand side tensor in (1) is of rank one; i.e., it can be written as follows
where \(b_{i}\in \mathbb {R}^{I_{i}}, i=1,{\ldots } ,N\). Applying the Arnoldi algorithm (section 6.3 in [23]) to the pairs (A(i), bi), i = 1, … , N, leads to the following relations, for i = 1, … , N,
and
where the first vector of each basis \(V_{m_{i}}\) is exactly the normalized vector
We consider the following approximate solution of (1)
where \(\mathcal {Y}_{m}\in \mathbb {R}^{m_{1}\times \cdots \times m_{N}}\) and {Vm} is the set of matrices \(\lbrace V_{m_{1}},{\cdots } ,V_{m_{N}}\rbrace \).
The associated residual tensor is given by
We consider the Petrov-Galerkin condition in its tensor format as follows
hence
where \(\upbeta =\displaystyle \prod \limits _{i=1}^{N}\|b_{i}\|\). Thus, the reduced Sylvester tensor equation is given as follows
where \(\mathcal {E}_{m}=e_{1}^{(m_{1})}\circ e_{1}^{(m_{2})}\circ \cdots \circ e_{1}^{(m_{N})}\). The following proposition gives the associated residual tensor
Proposition 5
Let\(\mathcal {Y}_{m}\)be the solution of (7) and, for i = 1, … , N, let \(V_{m_{i}}\)be the bases obtained by applying the Arnoldi algorithm to the pairs (A(i), bi); then
Proof
We have
Using the relations (5) and the expression of the right-hand side \({\mathscr{B}}\), we obtain
Taking into account the fact that \(b_{i}=\|b_{i}\|v_{1}^{(i)}=\|b_{i}\|V_{m_{i}}e_{1}^{(m_{i})}\) and using Proposition 4, it follows that
Invoking (7), the result is achieved. □
Theorem 1
Let\(\mathcal {R}_{m}\)be the corresponding residual, then
where\(\mathcal {Y}_{m}\)is the solution of (7) and\(e_{m_{i}}=\left [0,{\ldots } ,0,1\right ]^{T}\in \mathbb {R}^{m_{i}}\).
Proof
□
4 Rank R right-hand side tensor
Inspired by the work addressing the Sylvester matrix equation in [10, 12], and the fact that an N th-order tensor can be decomposed using the CP decomposition of Definition 5, we propose two approaches to extract approximate solutions to (1) with low rank right-hand sides; accordingly, we assume in the following that the right-hand side is of rank R, i.e.,
where \(b_{i}^{(r)}\in \mathbb {R}^{I_{i}}\), for \(i\in \left \lbrace 1,{\ldots } ,N \right \rbrace \), and \(r\in \left \lbrace 1,{\ldots } ,R \right \rbrace \). We set for i = 1,2, ⋯ , N
Straightforward computations show that the right-hand side tensor can also be written as follows
where \(\mathcal {I}_{R}\), called the identity tensor, is the N th-order tensor of size R ×… × R with ones along the super-diagonal. In the following two subsections, we will show how to extract approximate solutions to (1) via block and global Krylov methods (for more details about the block and global Arnoldi algorithms, we refer the reader to Section 6.1 in [23] and to [13], respectively).
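The rewriting of the rank-R right-hand side in terms of the identity tensor can be verified directly. The sketch below (illustrative only) builds \({\mathscr{B}}\) both as a sum of R rank one terms and as \(\mathcal {I}_{R}\times _{1}B^{(1)}\times _{2}B^{(2)}\times _{3}B^{(3)}\) for N = 3:

```python
import numpy as np

rng = np.random.default_rng(6)
I1, I2, I3, R = 4, 5, 6, 3
# B^(i): columns are the vectors b_i^(r), r = 1..R.
B1, B2, B3 = (rng.standard_normal((I, R)) for I in (I1, I2, I3))

# Rank-R right-hand side as a sum of R rank one terms.
B_sum = np.einsum("ir,jr,kr->ijk", B1, B2, B3)

# Identity tensor I_R: ones along the super-diagonal.
I_R = np.zeros((R, R, R))
for r in range(R):
    I_R[r, r, r] = 1.0

# Same tensor written as I_R x1 B1 x2 B2 x3 B3.
B_mode = np.einsum("pqr,ip,jq,kr->ijk", I_R, B1, B2, B3)
err = np.linalg.norm(B_sum - B_mode)
```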
4.1 Block Krylov approach
Let \(\mathbb {U}_{m_{i}}=\left [U_{1}^{(i)},{\ldots } ,U_{m_{i}}^{(i)}\right ] \) and \(\mathbb {H}_{m_{i}}\) be the matrices obtained by applying the block Arnoldi algorithm to the pairs (A(i), B(i)), i = 1,2, … , N, starting with \(U_{1}^{(i)}=Q^{(i)}\), where Q(i) is obtained from the QR factorization of B(i), i.e., B(i) = Q(i)Ri; then the following relation holds, for i = 1, … , N,
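The block Arnoldi process used here can be sketched and its relation verified numerically. The code below is illustrative (the function name `block_arnoldi` is ours): it starts from the QR factor of B, orthogonalizes block by block, and checks the block Arnoldi relation together with the orthonormality of the generated basis:

```python
import numpy as np

def block_arnoldi(A, B, m):
    """Block Arnoldi: returns U = [U_1,...,U_{m+1}] and the
    (m+1)R x mR block Hessenberg matrix, starting from the QR factor of B."""
    n, R = B.shape
    U = np.zeros((n, (m + 1) * R))
    H = np.zeros(((m + 1) * R, m * R))
    Q, _ = np.linalg.qr(B)               # U_1 = Q from B = Q R
    U[:, :R] = Q
    for j in range(m):
        W = A @ U[:, j * R:(j + 1) * R]
        for i in range(j + 1):
            Ui = U[:, i * R:(i + 1) * R]
            Hij = Ui.T @ W
            H[i * R:(i + 1) * R, j * R:(j + 1) * R] = Hij
            W -= Ui @ Hij
        Q, Rfac = np.linalg.qr(W)        # next block and subdiagonal block
        U[:, (j + 1) * R:(j + 2) * R] = Q
        H[(j + 1) * R:(j + 2) * R, j * R:(j + 1) * R] = Rfac
    return U, H

rng = np.random.default_rng(7)
n, R, m = 60, 2, 5
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, R))
U, H = block_arnoldi(A, B, m)

# Block Arnoldi relation: A [U_1..U_m] = [U_1..U_{m+1}] Hbar_m.
rel_err = np.linalg.norm(A @ U[:, :m * R] - U @ H)
orth_err = np.linalg.norm(U.T @ U - np.eye((m + 1) * R))
```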
An approximate solution is given by
where \(\mathcal {Y}_{m}\in \mathbb {R}^{{Rm_{1}\times \cdots \times Rm_{N}}}\) and \(\lbrace \mathbb {U}_{m}\rbrace =\lbrace \mathbb {U}_{m_{1}},{\ldots } ,\mathbb {U}_{m_{N}}\rbrace \).
We consider the following Petrov-Galerkin condition of orthogonality
with \(\mathcal {R}_{m}\) the residual tensor, which is given by
Using the relation (8) and the condition (9), we obtain
Using the expression of the right-hand side tensor \({\mathscr{B}}\) and the fact that \(U_{1}^{(i)}\), i = 1, … , N, are obtained from QR factorizations of B(i), i = 1, … , N, we have
where \(\tilde {R}^{(i)}=E_{1}^{(m_{i})}R^{(i)}\).
Then, the low dimensional Sylvester tensor equation is given as follows
where \({\mathscr{B}}_{m}=\mathcal {I}_{R}\times _{1} \tilde {R}^{(1)}\cdots \times _{N} \tilde {R}^{(N)}\).
Proposition 6
Let\(\mathcal {Y}_{m}\)be the solution of (10), the associated residual tensor to the approximate solution\(\mathcal {X}_{m}=\mathcal {Y}_{m}\times \lbrace \mathbb {U}_{m}\rbrace \)is
Proof
We have
Since \(U_{1}^{(i)}\), i = 1, … , N, are obtained from QR factorizations of B(i), i = 1, … , N, the right-hand side tensor \({\mathscr{B}}\) can be written as follows
Using the expression above and the relation (11), we obtain
Invoking (13), the result in the proposition is achieved. □
The following theorem can be established in the same way as Theorem 1 in Section 3.
Theorem 2
Let\(\mathcal {R}_{m}\)be the corresponding residual to the approximate solution obtained by the block Arnoldi approach, then
where\(\mathcal {Y}_{m}\)is the solution of (10) and\(E_{m_{i}}=\left [0_{R},{\ldots } ,0_{R},I_{R}\right ]^{T}\in \mathbb {R}^{Rm_{i}\times R}\).
The block Arnoldi algorithm for the Sylvester tensor equation is summarized in Algorithm 1. Before ending this section, we point out that the solution of (10) is of size Rm1 × Rm2 × ⋯ × RmN, which may become large even for moderate values of R and mi. Solving (10), as required in step 3 of Algorithm 1, can then be challenging; this motivates the method presented in the following section.
4.2 Global Krylov approach
The previous block Krylov approach reduces (1) to a low dimensional Sylvester tensor equation whose solution is of size Rm1 × Rm2 × ⋯ × RmN, while the global Krylov approach constructs an approximate solution from an m1 × ⋯ × mN tensor.
Let \(\mathbb {V}_{m_{i}}=\left [ V_{1}^{(i)},\cdots ,V_{m_{i}}^{(i)}\right ] \) be the matrices obtained by applying the global Arnoldi algorithm to the pairs (A(i), B(i)), i = 1,2, … , N, starting with \(V_{1}^{(i)}=B^{(i)}/\|B^{(i)}\|\), then the following relations hold, for i = 1, … , N,
An approximate solution is given by
where \(\mathcal {Y}_{m}\) is the m1 × ⋯ × mN tensor, satisfying the low dimensional Sylvester tensor equation
with \(\mathcal {E}_{m}=e_{1}^{(m_{1})}\circ \cdots \circ e_{1}^{(m_{N})}\) and \(\upbeta =\displaystyle \prod \limits _{i=1}^{N}\|B^{(i)}\|\).
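The global Arnoldi process behind this approach works with the Frobenius inner product on the blocks \(V_{j}^{(i)}\). The sketch below is illustrative (the function name `global_arnoldi` is ours): it verifies the global Arnoldi relation \(A\,[V_{1},\ldots ,V_{m}]=[V_{1},\ldots ,V_{m+1}](\bar{H}_{m}\otimes I_{R})\), a standard identity for the global method:

```python
import numpy as np

def global_arnoldi(A, B, m):
    """Global Arnoldi: orthonormalizes the n x R matrices V_j in the
    Frobenius inner product <X, Y> = trace(X^T Y); H is (m+1) x m Hessenberg."""
    V = [B / np.linalg.norm(B)]          # V_1 = B / ||B||_F
    H = np.zeros((m + 1, m))
    for j in range(m):
        W = A @ V[j]
        for i in range(j + 1):
            H[i, j] = np.sum(V[i] * W)   # Frobenius inner product
            W -= H[i, j] * V[i]
        H[j + 1, j] = np.linalg.norm(W)
        V.append(W / H[j + 1, j])
    return V, H

rng = np.random.default_rng(8)
n, R, m = 60, 3, 6
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, R))
V, H = global_arnoldi(A, B, m)

# Global Arnoldi relation: A [V_1..V_m] = [V_1..V_{m+1}] (Hbar_m kron I_R).
Vm = np.hstack(V[:m])
Vmp1 = np.hstack(V)
rel_err = np.linalg.norm(A @ Vm - Vmp1 @ np.kron(H, np.eye(R)))
```

Note that, unlike the block variant, the projected matrix H is only m × m here, which is what keeps the reduced equation (13) of size m1 × ⋯ × mN.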
Proposition 7
Let\(\mathcal {X}_{m}\)be the approximate solution obtained by the global Arnoldi approach and\(\mathcal {R}_{m}\)be the corresponding residual, then
where\(\mathcal {Y}_{m}\)is the solution of (13) and\(e_{m_{i}}=\left [0,{\ldots } ,0,1\right ]^{T}\in \mathbb {R}^{m_{i}}\).
Proof
We have
Using Proposition 3 and the fact that \(E_{m_{i}}=e_{m_{i}}\otimes I_{R}\), we obtain
Since the first blocks in each basis are taken from the right-hand side tensor \({\mathscr{B}}\), we have
where \(\upbeta =\displaystyle \prod \limits _{i=1}^{N}\|B^{(i)}\|\).
Using the fact that \(E_{1}^{(m_{i})}e_{r}^{(R)}=e_{r}^{(Rm_{i})}=e_{1}^{(m_{i})}\otimes e_{r}^{(R)}\) and Proposition 2, we obtain
then
Now, we obtain
Invoking (13), the result is achieved. □
Lemma 1
Let\(\mathbb {V}_{m_{i}}\)be the basis generated by the global Arnoldi algorithm applied to the pairs (A(i), B(i)) and\(\mathcal {X}\in \mathbb {R}^{J_{1}\times \cdots \times J_{N}}\)with Ji = Rmiand\(\mathcal {Z}\in \mathbb {R}^{K_{1}\times \cdots \times K_{N}}\)with Ki = mi. Then
Proof
We set \(\mathcal {P}=\mathcal {X}\times _{i}\mathbb {V}_{m_{i}}\), the i th mode fiber of the tensor \(\mathcal {P}\) can be written as follows
As the squared norm of a tensor can be expressed as the sum of the squared norms of all its i-mode fibers, it results that
Since \(\mathbb {V}_{m_{i}}=\left [ V_{1},{\ldots } ,V_{m_{i}}\right ]\) is orthonormal, we obtain
therefore (16) is achieved.
We set \(\mathcal {Q}=\left (\mathcal {Z}\otimes \mathcal {I}_{R}\right )\times _{i}\mathbb {V}_{m_{i}}\). If \(\mathcal {Q}_{(\overline {r_{1}j_{1}},{\ldots } ,:,{\ldots } ,\overline {r_{N}j_{N}})}, \mathcal {Z}_{(j_{1},{\ldots } ,:,{\ldots } ,j_{N})}\) and \({\mathcal {I}_{R}}_{(r_{1},{\ldots } ,:,\ldots ,r_{N})}\) denote the i th-mode fibers of \(\mathcal {Q}, \mathcal {Z}\), and \(\mathcal {I}_{R}\), respectively, we have
As \(\mathbb {V}_{m_{i}}=\left [ V_{1},{\ldots } ,V_{m_{i}}\right ]\) is orthonormal, we obtain
then (17) is achieved. □
In the following theorem, we give an upper bound for the residual norm
Theorem 3
Let\(\mathcal {X}_{m}\)be the approximate solution obtained by the global Arnoldi approach and\(\mathcal {R}_{m}\)be the corresponding residual, then
where\(\mathcal {Y}_{m}\)is the solution of (13) and\(e_{m_{i}}=\left [0,{\ldots } ,0,1\right ]^{T}\in \mathbb {R}^{m_{i}}\).
Proof
We have
Using the relations (12) and (15), we obtain
Let the tensor \(\mathcal {Z}_{m}^{(i)}\in \mathbb {R}^{(m_{1}+1)\times \cdots \times (m_{N}+1)}\), for i = 1, … , N, defined by
and the tensor \(\mathcal {F}_{m}\in \mathbb {R}^{(m_{1}+1)\times \cdots \times (m_{N}+1)}\), defined by
Then
By setting \(\tilde {\mathcal {Z}}_{m}={\sum }_{i=1}^{N}\mathcal {Z}_{m}^{(i)}\), we obtain
We set \(\mathcal {Z}_{m}^{(0)}=\upbeta \mathcal {F}_{m}-\tilde {\mathcal {Z}}_{m}\), we have
and
By applying the relation (16) of Lemma 1 (N − 1) times to \(\|\mathcal {R}_{m}\|\), we obtain
Then, the relation (17) of Lemma 1 leads to
□
The Global Arnoldi algorithm for Sylvester tensor equation is summarized as follows
4.3 Complexity consideration
In this section, we present the number of operations required to apply the global or block Arnoldi algorithm to the coefficient matrices. For the sake of simplicity, we consider the 3-mode case, i.e., \({\mathscr{B}}\in \mathbb {R}^{n\times n\times n}\) and \(A^{(i)}\in \mathbb {R}^{n\times n}, i=1,2,3\). The number of operations in the Arnoldi process while computing \(A^{(i)}V_{j}^{(i)}\) is determined by
where nz(A(i)) refers to the number of non-zero entries of the matrix A(i). So, for 3 matrices and after m iterations, the total cost is
which is 3mR(2n2 − 1) in the worst case when A(i), i = 1,2,3, are full matrices.
5 Numerical examples
In this section, we present three numerical examples to show the effectiveness of our approaches for solving (1), with large-scale coefficient matrices (Example 1). The low dimensional Sylvester tensor equations (10) and (13) will be solved by the GLS-BTF algorithm given in [14] when the size of the reduced equation is small, or by the recursive algorithm presented in [5]. The numerical experiments were performed on a 2.7-GHz Intel Core i5 with 8 GB of 1600-MHz DDR3 memory, using Matlab R2016a. In all the examples, the right-hand side tensor is either constructed randomly or constructed so that the exact solution \(\mathcal {X}^{\ast }\) is given. Note that each cycle corresponds to \(k^{\prime }=5\) iterations in all examples. The stopping criterion used is
where 𝜖 is a given tolerance and \(\mathcal {R}_{m}\) is the m th residual associated with the approximate solution \(\mathcal {X}_{m}\).
5.1 Example 1
We point out that this section is restricted to the special case of the Sylvester tensor equation with N = 3, i.e.,
In this first example, the coefficient matrices are taken from [25, Example 35.1] and have the same size n. They are generated by the Matlab commands eye and rand as follows
and the right-hand side tensor \({\mathscr{B}}\) is chosen so that the exact solution \(\mathcal {X}^{\ast }\) is a random tensor of rank r, i.e.,
where \(x_{i}^{(k)}=\texttt {rand}(n,1)\), for i = 1, 2, 3, and k = 1, … , r. In this case, straightforward computations show that the right-hand side tensor is of rank R = 3r. We point out that the Krylov subspace dimensions are chosen to be the same, i.e., m1 = m2 = m3 = m, since the coefficient matrices are the same. We take n in the set {500, 10,000}, and the tolerance 𝜖 in the set {10− 12, 10− 9}, respectively. The numerical results are reported in Table 1. We point out that the CPU time does not cover the construction of the approximate solution; it covers only the construction of the Krylov subspace bases and the solution of the reduced Sylvester tensor equation. For n = 500, we give the exact error and the residual norm, whereas for n = 10,000, we give only the residual norm, due to the time needed to construct the whole approximate solution tensor. The larger CPU time needed for the block method is caused by the computational expense of the block Arnoldi algorithm and by the computation of the solution of the reduced Sylvester tensor equation of order mR × mR × mR for increasing m; for this reason, we run the following examples using only the global method.
5.2 Example 2
In this example, the coefficient matrices A(i), i = 1, … , N, are obtained from the 5-point discretization of the Poisson equation on an n0-by-n0 mesh, with homogeneous Dirichlet boundary conditions. We use the following Matlab command to generate A(i), i = 1, … , N,
A(i) are then of size \({n_{0}^{2}}\times {n_{0}^{2}}\). For this experiment, we take n in the set {400, 121, 100} and N in the set {3, 4}; the tolerance 𝜖 is set to 10− 6. We construct the right-hand side tensor so that all the components of the exact solution \(\mathcal {X}^{\ast }\) are one (the numerical results are reported in Table 2). We run the same example with a random rank R right-hand side tensor (see Table 3).
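A 5-point Poisson matrix of this kind can be assembled as a Kronecker sum of the 1D second-difference stencil. The sketch below is illustrative (the helper name `poisson2d` is ours) and produces a symmetric positive definite matrix of size \(n_{0}^{2}\times n_{0}^{2}\), matching the structure described above:

```python
import numpy as np

def poisson2d(n0):
    """5-point 2D Poisson matrix on an n0-by-n0 grid with homogeneous
    Dirichlet boundary conditions, built as a Kronecker sum of the 1D
    tridiagonal stencil; the result is n0^2 x n0^2."""
    T = 2 * np.eye(n0) - np.eye(n0, k=1) - np.eye(n0, k=-1)
    I = np.eye(n0)
    return np.kron(I, T) + np.kron(T, I)

A = poisson2d(4)
min_eig = np.min(np.linalg.eigvalsh(A))
```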
5.3 Example 3
Here, we keep the same data as the previous example, except for the coefficient matrices A(i), i = 1, … , N, which are obtained from the discretization of the operator
on the unit square \(\left [ 0,1\right ]\times \left [ 0,1\right ] \) with homogeneous Dirichlet boundary conditions. The number of inner grid points in each direction is n0 for the operator L. The dimensions of the matrices A(i), i = 1, … , N, are \(I_{i} = {n_{0}^{2}}\). The discretization of the operator L yields matrices extracted from the Lyapack package [22], using the command fdm and denoted as
with f1(x, y) = exy, f2(x, y) = sin(xy), g(x, y) = y2 − x2. The numerical results for this example are reported in Tables 2 and 3.
6 Conclusion
In this paper, we have proposed new approaches to extract approximate solutions to (1) with low rank right-hand sides. The first approach is based on the use of the block Arnoldi algorithm for the coefficient matrices in (1), which leads to the reduced Sylvester tensor equation (10). The second approach is based on the use of the global Arnoldi algorithm in order to obtain the low dimensional Sylvester tensor equation (13). We gave the expressions of the residuals and the residual norms for each approach. Numerical examples show that our approaches lead to satisfactory results when applied to (1). Combining our approaches with some tensor decompositions in order to work with full rank right-hand sides is a future project.
References
Ali Beik, F.P., Movahed, F.S., Ahmadi-Asl, S.: On the Krylov subspace methods based on tensor format for positive definite Sylvester tensor equations. Numer. Linear Algebra Appl. 23(3), 444–466 (2016)
Ballani, J., Grasedyck, L.: A projection method to solve linear systems in tensor format. Numer. Linear Algebra Appl. 20(1), 27–43 (2013)
Beylkin, G., Mohlenkamp, M.J.: Algorithms for numerical analysis in high dimensions. SIAM J. Sci. Comput. 26(6), 2133–2159 (2005)
Calvetti, D., Reichel, L.: Application of ADI iterative methods to the restoration of noisy images. SIAM J. Matrix Anal. Appl. 17(1), 165–186 (1996)
Chen, M., Kressner, D.: Recursive blocked algorithms for linear systems with Kronecker product structure. arXiv:1905.09539 (2019)
Chen, Z., Lu, L.: A projection method and Kronecker product pre-conditioner for solving Sylvester tensor equations. Sci. Chin. Math. 55(6), 1281–1292 (2012)
Chen, Z., Lu, L.: A gradient based iterative solutions for sylvester tensor equations. Math. Probl. Eng., 2013 (2013)
Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.i.: Non-Negative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation. Wiley (2009)
Ding, F., Chen, T.: Gradient based iterative algorithms for solving a class of matrix equations. IEEE Trans. Autom. Control 50(8), 1216–1221 (2005)
El Guennouni, A., Jbilou, K., Riquet, A.: Block Krylov subspace methods for solving large Sylvester equations. Numer. Algor. 29(1–3), 75–96 (2002)
Grasedyck, L.: Existence and computation of low Kronecker-rank approximations for large linear systems of tensor product structure. Computing 72(3-4), 247–265 (2004)
Jbilou, K.: Low rank approximate solutions to large Sylvester matrix equations. Appl. Math. Comput. 177(1), 365–376 (2006)
Jbilou, K., Messaoudi, A., Sadok, H.: Global FOM and GMRES algorithms for matrix equations. Appl. Numer. Math. 31(1), 49–63 (1999)
Karimi, S., Dehghan, M.: Global least squares method based on tensor form to solve linear systems in Kronecker format. Trans. Instit. Measur. Control 40(7), 2378–2386 (2018)
Khoromskij, B.N.: Tensors-structured numerical methods in scientific computing: Survey on recent advances. Chemom. Intell. Lab. Syst. 110(1), 1–19 (2012)
Kolda, T.G., Bader, B.W.: Tensor decompositions and applications. SIAM Rev. 51(3), 455–500 (2009)
Kressner, D., Tobler, C.: Krylov subspace methods for linear systems with tensor product structure. SIAM J. Matrix Anal. Appl. 31(4), 1688–1714 (2010)
Lee, N., Cichocki, A.: Fundamental tensor operations for large-scale data analysis using tensor network formats. Multidim. Syst. Sign. Process. 29(3), 921–960 (2018)
Li, B.W., Tian, S., Sun, Y.S., Hu, Z.M.: Schur decomposition for 3d matrix equations and its application in solving radiative discrete ordinates equations discretized by chebyshev collocation spectral method. J. Comput. Phys. 229(4), 1198–1212 (2010)
Lu, H., Plataniotis, K.N., Venetsanopoulos, A.N.: A survey of multi-linear subspace learning for tensor data. Pattern Recogn. 44(7), 1540–1551 (2011)
Malek, A., Momeni-Masuleh, S.H.: A mixed collocation finite difference method for 3d microscopic heat transport problems. J. Comput. Appl. Math. 217(1), 137–147 (2008)
Penzl, T., et al.: A Matlab toolbox for large Lyapunov and Riccati equations, model reduction problems, and linear quadratic optimal control problems. Software available at https://www.tu-chemnitz.de/sfb393/lyapack/ (2000)
Saad, Y.: Iterative Methods for Sparse Linear Systems, vol. 82. SIAM (2003)
Sun, Y.S., Ma, J., Li, B.W.: Chebyshev collocation spectral method for three-dimensional transient coupled radiative conductive heat transfer. J. Heat Transfer 134(9), 092701 (2012)
Trefethen, L.N., Bau, D. III: Numerical Linear Algebra, vol. 50. SIAM (1997)
Bentbib, A.H., El-Halouy, S. & Sadek, E.M. Krylov subspace projection method for Sylvester tensor equation with low rank right-hand side. Numer Algor 84, 1411–1430 (2020). https://doi.org/10.1007/s11075-020-00874-0