1 Introduction

Let \(\mathbb {R}\) be the set of all real numbers. An order-m tensor \(\mathcal {A}\) of dimension \(n_1\times n_2\times \cdots \times n_m\) has \(n_1n_2\cdots n_m\) entries \(\mathcal {A}_{i_1\ldots i_m}\), indexed by \(i_j\) with \(1\le i_j\le n_j\), \(j=1,2,\ldots ,m\). The set of all order-m and dimension \(n_1\times n_2\times \dots \times n_m\) tensors over the real field is denoted by \(\mathbb {R}^{n_1\times n_2\times \dots \times n_m}\). In particular, we denote this set by \(\mathbb {R}^{[m,n]}\) if \(n_1=\cdots =n_m=n\). We say that \(\mathcal {A}\in \mathbb {R}^{[m,n]}\) is a symmetric tensor if its entries \(\mathcal {A}_{i_1\ldots i_m}\) are invariant under any permutation of the indices \(\{i_1,i_2,\ldots , i_m\}\). It is not difficult to observe that the tensor \(\mathcal {A}\) reduces to a vector of size \(n_1\) when \(m=1\), and to a matrix of size \(n_1\times n_2\) when \(m=2\).

Although tensors generalize matrices, they differ from matrices in significant ways. For example, unlike the matrix case, there are several definitions of tensor ranks, eigenvalues, and tensor-tensor multiplications, motivated by different applications including chemometrics, signal processing, and high-dimensional statistics; one can refer to [1,2,3] for details.

It is worth mentioning that tensor equations are important models for describing high-dimensional problems. In this paper, we consider the following tensor equation

$$\begin{aligned} \mathcal {A}{\varvec{x}}^{m-1}={\varvec{b}}, \end{aligned}$$
(1.1)

where \(\mathcal {A}=(\mathcal {A}_{i_1i_2\ldots i_m})\in \mathbb {R}^{[m,n]}\), \({\varvec{b}}\in \mathbb {R}^{n}\), and \(\mathcal {A}{\varvec{x}}^{m-1}\in \mathbb {R}^{n}\) is defined by

$$\begin{aligned} (\mathcal {A}{\varvec{x}}^{m-1})_{i} =\sum _{i_2,i_3,\ldots ,i_m}\mathcal {A}_{i i_2\ldots i_m}{\varvec{x}}(i_2){\varvec{x}}(i_3)\ldots {\varvec{x}}(i_m), \end{aligned}$$

where \({\varvec{x}}(i)\) stands for the ith component of the vector \({\varvec{x}}\). The notation \(\mathcal {A}{\varvec{x}}^{m-1}\) was first introduced by Qi in [4] to define the eigenvalues of a tensor.
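For readers who prefer code, a minimal NumPy sketch of this contraction is given below; it is purely illustrative (the experiments reported in Sect. 4 were carried out in MATLAB with the Tensor Toolbox), and the helper name a_xm1 is ours.

```python
import numpy as np

def a_xm1(A, x):
    """Compute A x^{m-1}: contract x into every mode of A except the first."""
    T = np.asarray(A, dtype=float)
    while T.ndim > 1:
        T = T @ x          # matmul with a vector contracts the last mode
    return T               # vector of length n_1

# Small order-3 example: (A x^2)_i = sum_{j,k} A[i, j, k] x[j] x[k]
A = np.arange(8, dtype=float).reshape(2, 2, 2)
x = np.array([1.0, 2.0])
print(a_xm1(A, x))
print(np.einsum('ijk,j,k->i', A, x, x))   # same result for m = 3
```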

Let \(\mathcal {A}\in \mathbb {R}^{[m,n]}\). We say that \(\lambda \in \mathbb {R}\) is an eigenvalue of \(\mathcal {A}\) if there exists a nonzero \({\varvec{x}}\in \mathbb {R}^{n}\) such that

$$\begin{aligned} \mathcal {A}{\varvec{x}}^{m-1}=\lambda {\varvec{x}}^{[m-1]}, \end{aligned}$$

where \({\varvec{x}}^{[m-1]}=({\varvec{x}}(1)^{m-1},{\varvec{x}}(2)^{m-1},\ldots ,{\varvec{x}}(n)^{m-1})^\text {T}\). The spectral radius of \(\mathcal {A}\), denoted by \(\rho (\mathcal {A})\), is the maximum modulus of its eigenvalues. A tensor \(\mathcal {A}\in \mathbb {R}^{[m,n]}\) is called a nonsingular (singular) \(\mathcal {M}\)-tensor [5, 6] if it can be represented as \(\mathcal {A}=s \mathcal {I}-\mathcal {B}\) with \(s>\rho (\mathcal {B})\) (\(s\ge \rho (\mathcal {B})\)), where \(\mathcal {B}\in \mathbb {R}^{[m,n]}\) is nonnegative and \(\mathcal {I}\) is the identity tensor, that is, \(\mathcal {I}_{i i \ldots i}=1\) and all other entries of \(\mathcal {I}\) are zero.
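As a quick sanity check of these definitions, the identity tensor satisfies \(\mathcal {I}{\varvec{x}}^{m-1}={\varvec{x}}^{[m-1]}\) for every \({\varvec{x}}\), which is exactly why it plays the role of the identity in the eigenvalue equation above. A small NumPy sketch (illustrative only) verifies this numerically:

```python
import numpy as np

def identity_tensor(m, n):
    """Order-m, dimension-n identity tensor: I_{ii...i} = 1, zero elsewhere."""
    I = np.zeros((n,) * m)
    for i in range(n):
        I[(i,) * m] = 1.0
    return I

m, n = 4, 3
I = identity_tensor(m, n)
x = np.array([1.0, -2.0, 0.5])
T = I
while T.ndim > 1:
    T = T @ x                              # T becomes I x^{m-1}
print(np.allclose(T, x ** (m - 1)))        # True: I x^{m-1} = x^{[m-1]}
```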

The tensor equation (1.1) has important applications in many fields, such as information retrieval [7], the numerical solution of partial differential equations [8], the tensor complementarity problem [9], higher-order statistics [10], and so on. In recent years, it has been studied extensively under the assumption that the coefficient tensor \(\mathcal {A}\) has some special structure. For instance, Ding and Wei extended the classical Jacobi and Gauss-Seidel methods for linear equations and the Newton method for nonlinear equations to (1.1) when \(\mathcal {A}\) is a nonsingular (symmetric) \(\mathcal {M}\)-tensor and \({\varvec{b}}\) is a positive vector [8] (we call it the \(\mathcal {M}\)-tensor equation for ease of expression). After that, the homotopy method [11], the tensor method [12], splitting iterative methods [13,14,15], a Newton-type method [16], and a neural network method [17] were proposed for solving the \(\mathcal {M}\)-tensor equation mentioned above. Subsequently, the classical Levenberg-Marquardt (LM) method was applied to (1.1) when the tensor \(\mathcal {A}\) is a nonsingular semi-symmetric \(\mathcal {M}\)-tensor [18]. Very recently, an ADMM-type method and a two-step accelerated LM method were established in [19] and [20], respectively, for solving the tensor equation (1.1) with a general tensor \(\mathcal {A}\).

In essence, the tensor equation under consideration is a nonlinear equation. The iterative algorithms listed above fully exploit the particular structure of the corresponding tensor equations and possess favorable convergence properties. Nevertheless, no single method is suitable for all equations, and it remains worthwhile to establish more efficient iterative algorithms for the tensor equation mentioned above. In the present paper, we are interested in the solution of the tensor equation (1.1) whose coefficient tensor \(\mathcal {A}\) is a singular or nonsingular \(\mathcal {M}\)-tensor. This kind of tensor equation arises from higher-dimensional PDEs; one can see [8] for more details. In addition, it has been proved that, for any \(\mathcal {A}\in \mathbb {R}^{[m,n]}\), there exists a symmetric tensor \(\widehat{\mathcal {A}}\in \mathbb {R}^{[m,n]}\) such that \(\mathcal {A}{\varvec{x}}^{m-1}=\widehat{\mathcal {A}}{\varvec{x}}^{m-1}\) [3]. Therefore, in this paper, we make the following assumptions:

\(\star \) The tensor \(\mathcal {A}\) in (1.1) is a symmetric \(\mathcal {M}\)-tensor.

\(\star \) The tensor equation (1.1) is solvable.

The approach established here relies on the exponentially accelerated technique for nonlinear equations, which is an extension of the classical Newton's method [21]. As is well known, Newton's method is quite efficient and has quadratic convergence under suitable circumstances, but it relies heavily on the initial value and may fail to converge when the initial guess is far from the required root or the derivative of the function in the vicinity of that root is small. Recently, Chen and Li [22] proposed a class of exponentially accelerated iteration approaches (denoted by EAI for short) that has quadratic convergence and can be applied in cases where Newton's method is not successful. The EAI method contains Newton's method as the special case obtained by taking the first-order Taylor series expansion of the exponential; see the short review in Sect. 2.

The tensor equation (1.1) under consideration is a nonlinear equation with special structure, so we attempt to devise more efficient iterative methods in two cases: in the first case the coefficient tensor \(\mathcal {A}\) is a symmetric and nonsingular \(\mathcal {M}\)-tensor, and in the second case \(\mathcal {A}\) is a symmetric and singular \(\mathcal {M}\)-tensor (see Sect. 3 for details). We shall apply the exponential acceleration technique introduced in [22] to the tensor equation (1.1). For ease of expression, we denote by EAI-NS the exponentially accelerated iterative method corresponding to a nonsingular \(\mathcal {M}\)-tensor \(\mathcal {A}\), and by EAI-S the exponentially accelerated iterative method corresponding to a singular \(\mathcal {M}\)-tensor \(\mathcal {A}\). Notably, in view of the possible singularity of the Jacobian matrix of the vector-valued function, the EAI-S method inherits the characteristics of the LM method. It will be shown that both of them are also high-dimensional generalizations of the Newton's method proposed in [8]. Moreover, we prove that the EAI-NS method is superlinearly convergent and the EAI-S method is linearly convergent under the aforementioned hypotheses. Several numerical examples derived from practical applications demonstrate that our methods are promising.

The remainder of this paper is organized as follows. In Sect. 2, we review some basic definitions and results related to tensors and nonlinear equations. In Sect. 3, we present the exponentially accelerated iterative methods for solving the tensor equation (1.1) and analyze their convergence. In Sect. 4, some numerical examples are given to illustrate the effectiveness of the proposed methods. In Sect. 5, we conclude this paper with some remarks.

2 Preliminaries

2.1 Notations and definitions

First of all, we introduce some necessary notation: scalars are denoted by lower-case letters, e.g., \(a,b,c\); vectors are denoted by boldface lower-case letters, e.g., \({\varvec{a}},{\varvec{b}},{\varvec{c}}\); matrices are denoted by boldface capital letters, e.g., \({\varvec{A}},{\varvec{B}},{\varvec{C}}\); tensors are denoted by calligraphic script letters, e.g., \(\mathcal {A},\mathcal {B},\mathcal {C}\).

Now, we introduce the following definition of the tensor-vector product (see, e.g., [1] for more details).

Definition 2.1

Let \(\mathcal {A}=(\mathcal {A}_{i_1i_2\dots i_m})\in \mathbb {R}^{n_1\times n_2\times \cdots \times n_m}\) and \({\varvec{x}}=({\varvec{x}}(i)) \in \mathbb {R}^{n_k}\). Then, the k-mode (vector) product, denoted by \(\mathcal {A}\bullet _k {\varvec{x}}\), is an \(n_1\times \cdots \times n_{k-1}\times n_{k+1}\times \cdots \times n_m\) tensor whose entries are given by

$$\begin{aligned} (\mathcal {A}\bullet _k {\varvec{x}})_{i_1\dots i_{k-1}i_{k+1}\dots i_m} =\sum _{i_k=1}^{n_k}\mathcal {A}_{i_1\dots i_k\dots i_m}{\varvec{x}}(i_k). \end{aligned}$$

Using Definition 2.1, for a given tensor \(\mathcal {A}\in \mathbb {R}^{[m,n]}\) and vector \({\varvec{x}}\in \mathbb {R}^{n}\), we write

$$\mathcal {A}{\varvec{x}}^{m-k}=\mathcal {A}\bullet _{k+1} {\varvec{x}}\bullet _{k+2}{\varvec{x}} \cdots \bullet _m {\varvec{x}}\in \mathbb {R}^{[k,n]}, \quad k=1,2,\ldots , m-1.$$
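A direct NumPy transcription of Definition 2.1 and of the shorthand \(\mathcal {A}{\varvec{x}}^{m-k}\) might look as follows; mode indices are 1-based in the text and 0-based in the code, and the function names are ours.

```python
import numpy as np

def mode_k_product(A, x, k):
    """k-mode (vector) product of Definition 2.1; k is 1-based as in the text."""
    return np.tensordot(A, x, axes=([k - 1], [0]))

def a_xmk(A, x, k):
    """A x^{m-k}: contract x into modes k+1, ..., m, leaving an order-k tensor."""
    T = np.asarray(A, dtype=float)
    for _ in range(T.ndim - k):
        T = np.tensordot(T, x, axes=([T.ndim - 1], [0]))   # contract the last mode
    return T

A = np.random.rand(3, 3, 3, 3)           # m = 4, n = 3
x = np.random.rand(3)
print(mode_k_product(A, x, 2).shape)     # (3, 3, 3): one mode contracted
print(a_xmk(A, x, 1).shape)              # (3,)     i.e. A x^{m-1}
print(a_xmk(A, x, 2).shape)              # (3, 3)   i.e. A x^{m-2}
```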

2.2 Two classical iterative methods for nonlinear equations

In this subsection, we first give a brief review of the classical Newton's method. Let f be a real-valued function over a closed and convex set \(\Omega \subseteq \mathbb {R}\), and assume that it is continuously differentiable in a neighborhood of a root \(x^*\) of \(f(x)=0\). Then, Newton's method for this nonlinear equation can be expressed as follows:

$$\begin{aligned} x_{k+1}=x_{k}-\dfrac{f(x_k)}{f'(x_k)}. \end{aligned}$$

This method is quite efficient and has quadratic convergence under suitable circumstances. Nevertheless, it may fail to converge when the initial guess is far from \(x^*\) or the derivative of the function f in the vicinity of \(x^*\) is small.

To overcome the aforementioned shortcomings, Chen and Li [22] proposed the EAI method for the nonlinear equation \(f(x)=0\), whose iterative scheme consists of the following step:

$$\begin{aligned} x_{k+1}=x_{k}\exp \left( -\dfrac{f(x_k)}{x_kf'(x_k)}\right) . \end{aligned}$$

In particular, this iterative method reduces to the well-known Newton's method if \(\exp \left( -\frac{f(x_k)}{x_kf'(x_k)}\right) \) is replaced by its first-order Taylor series expansion. Recently, several variants of the EAI method were proposed for solving nonlinear equations; see, e.g., [23,24,25].
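A tiny numerical illustration of the scalar EAI step is given below; the test function \(f(x)=x^3-2\), the starting point, and the tolerance are chosen purely for illustration.

```python
import math

def eai_scalar(f, fprime, x0, tol=1e-12, max_iter=100):
    """Scalar EAI iteration: x_{k+1} = x_k * exp(-f(x_k) / (x_k * f'(x_k)))."""
    x = x0
    for _ in range(max_iter):
        if abs(f(x)) < tol:
            break
        x = x * math.exp(-f(x) / (x * fprime(x)))
    return x

# Example: f(x) = x^3 - 2 has the root 2^{1/3} ~ 1.2599
f = lambda x: x**3 - 2.0
fp = lambda x: 3.0 * x**2
print(eai_scalar(f, fp, x0=5.0))
```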

Next, we recall the classical Levenberg-Marquardt (LM) method [26, 27]. To this end, let \({\varvec{F}}:\mathbb {R}^n\rightarrow \mathbb {R}^n\) be a continuously differentiable function; then the LM method computes, at each iteration, the trial step

$$\begin{aligned} {\varvec{d}}_k^{LM}=-({\varvec{J}}_k^\text {T} {\varvec{J}}_k +\tau _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}{\varvec{F}}_k, \end{aligned}$$

in which \({\varvec{F}}_k={\varvec{F}}({\varvec{x}}_k)\), \({\varvec{J}}_k\) represents the value of the Jacobian \({\varvec{J}}({\varvec{x}}):={\varvec{F}}\ '({\varvec{x}})\) at \({\varvec{x}}_k\), and the LM parameter \(\tau _k>0\) is updated from iteration to iteration. Under a local error bound condition, which is weaker than nonsingularity [28], this method has been proved to be quadratically convergent, and several variants have been developed in the literature; see, e.g., [18, 20, 29] and the references therein.
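In code, one LM trial step amounts to solving a regularized normal-equation system; a minimal sketch is shown below, where the update rule for \(\tau _k\) is left to the caller, since different variants of the method use different rules.

```python
import numpy as np

def lm_step(F_k, J_k, tau_k):
    """One LM trial step: solve (J^T J + tau I) d = -J^T F for the step d."""
    n = J_k.shape[1]
    lhs = J_k.T @ J_k + tau_k * np.eye(n)
    return np.linalg.solve(lhs, -J_k.T @ F_k)

# e.g. tau_k = np.linalg.norm(F_k) is one commonly used parameter choice
```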

3 The exponentially accelerated iterative methods

In this section, we shall establish the exponentially accelerated iterative methods to solve the tensor equation (1.1) under the assumption that it is always solvable.

In the sequel, two kinds of exponentially accelerated approaches will be proposed for (1.1) under two scenarios: in the first, the coefficient tensor \(\mathcal {A}\) in (1.1) is a symmetric and nonsingular \(\mathcal {M}\)-tensor, and in the second, \(\mathcal {A}\) is a symmetric and singular \(\mathcal {M}\)-tensor. Furthermore, the convergence of the proposed iterative methods will be discussed. We emphasize that both the iteration schemes and the convergence analysis of the methods given in the present paper are similar to, but different from, those of the iterative method presented in [25].

3.1 The EAI method for (1.1) with nonsingular \(\mathcal {M}\)-tensor \(\mathcal {A}\)

As shown by Ding and Wei [8], the tensor equation (1.1) always has a solution when the coefficient tensor \(\mathcal {A}\) is a nonsingular \(\mathcal {M}\)-tensor. In order to derive the new iterative scheme, denote

$$\begin{aligned} {\varvec{F}}({\varvec{x}}):=\mathcal {A}{\varvec{x}}^{m-1}-{\varvec{b}}=0. \end{aligned}$$
(3.1)

Using (3.1) and Definition 2.1, it follows from the symmetry of the tensor \(\mathcal {A}\) that the Jacobian of \({\varvec{F}}({\varvec{x}})\) is

$$\begin{aligned} {\varvec{J}}({\varvec{x}}):={\varvec{F}}\ '({\varvec{x}})=(m-1)\mathcal {A}{\varvec{x}}^{m-2}\in \mathbb {R}^{n\times n}. \end{aligned}$$
(3.2)

For ease of expression, we shall use \({\varvec{F}}_k\), and \({\varvec{J}}_k\) to represent the values of \({\varvec{F}}({\varvec{x}})\) and \({\varvec{J}}({\varvec{x}})\) at \({\varvec{x}}_k\), respectively.

Following the idea of the exponentially accelerated iterative method given in [22], we obtain the following iterative scheme for (1.1).

$$\begin{aligned} \begin{aligned} {\left\{ \begin{array}{ll} {\varvec{J}}_k \Delta {\varvec{x}}_k &{}=-{\varvec{F}}_k,\\ \quad {\varvec{x}}_{k+1}&{}= \text {diag}\left( \exp \left( \dfrac{\Delta {\varvec{x}}_k(i)}{{\varvec{x}}_k(i)}\right) \right) {\varvec{x}}_k,\\ \end{array}\right. } \end{aligned} \end{aligned}$$
(3.3)

in which

$$ \text {diag}\left( \exp \left( \dfrac{\Delta {\varvec{x}}_k(i)}{{\varvec{x}}_k(i)}\right) \right) :=\left( \begin{array}{ccc} \exp \left( \dfrac{\Delta {\varvec{x}}_k(1)}{{\varvec{x}}_k(1)}\right) &{} &{} \\ &{}\ddots &{} \\ &{} &{}\exp \left( \dfrac{\Delta {\varvec{x}}_k(n)}{{\varvec{x}}_k(n)}\right) \\ \end{array} \right) . $$

Then, the exponentially accelerated iterative method for solving the tensor equation (1.1) in the case that \(\mathcal {A}\) is a symmetric and nonsingular \(\mathcal {M}\)-tensor is summarized as Algorithm 1.

Algorithm 1 The EAI-NS method for (1.1).

We make some comments on this algorithm:

(1) Since \(\mathcal {A}\) is a nonsingular \(\mathcal {M}\)-tensor, the matrix \({\varvec{J}}_k\) is a nonsingular M-matrix [5]. In this case, the solution to the linear subproblem \({\varvec{J}}_k \Delta {\varvec{x}}_k =-{\varvec{F}}_k\) in (3.3) can be expressed explicitly, i.e., \(\Delta {\varvec{x}}_k=-{\varvec{J}}_k^{-1}{\varvec{F}}_k\).

(2) By the definitions of \(\text {diag}(\cdot )\) and \(\text {exp}(\cdot )\), the iterative scheme in (3.3) is equivalent to

$$\begin{aligned} {\varvec{x}}_{k+1}(i)={\varvec{x}}_k(i)\exp \left( \dfrac{\Delta {\varvec{x}}_k(i)}{{\varvec{x}}_k(i)}\right) , \quad i=1,2,\ldots , n. \end{aligned}$$
(3.4)

In the implementation of Algorithm 1, the matrix \({\varvec{J}}_k\) may be singular or nearly singular owing to rounding errors; in this case, one can update \({\varvec{x}}_{k}\) by the following scheme:

$$\begin{aligned} \begin{aligned} {\left\{ \begin{array}{ll} \Delta {\varvec{x}}_k &{}=-{{\varvec{J}}_k^\dagger {\varvec{F}}_k},\\ {\varvec{x}}_{k+1}&{}= \text {diag}\left( \exp \left( \dfrac{\Delta {\varvec{x}}_k(i)}{{\varvec{x}}_k(i)}\right) \right) {\varvec{x}}_k.\\ \end{array}\right. } \end{aligned} \end{aligned}$$
(3.5)

Here the superscript \(\dagger \) denotes the Moore-Penrose inverse of a matrix [30]. In particular, if \({\varvec{x}}_k(i)=0\) for some index i, we set the corresponding \({\varvec{x}}_{k+1}(i)=0\).

(3) From Algorithm 1 and the definition of the tensor-vector product, the main tensor operations are the computations of \({\varvec{F}}_k\) and \({\varvec{J}}_k\) at each iteration, and the amount of operations contained in this algorithm is conservatively estimated as \(\mathcal {O}(n^{m-1})\).
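To make the iteration concrete, a minimal NumPy sketch of the EAI-NS iteration (3.3), including the pseudoinverse fallback (3.5) and the zero-component convention above, is given below. The dense-array contraction, the stopping tolerance, and the iteration cap are illustrative choices and not the MATLAB implementation used in Sect. 4.

```python
import numpy as np

def tensor_times_vec(A, x, times):
    """Contract x into the last `times` modes of the cubical tensor A."""
    T = np.asarray(A, dtype=float)
    for _ in range(times):
        T = np.tensordot(T, x, axes=([T.ndim - 1], [0]))
    return T

def eai_ns(A, b, x0, tol=1e-8, max_iter=10000):
    """EAI-NS sketch for A x^{m-1} = b with a symmetric nonsingular M-tensor A."""
    m = A.ndim
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        F = tensor_times_vec(A, x, m - 1) - b           # F_k as in (3.1)
        if np.linalg.norm(F) <= tol:
            break
        J = (m - 1) * tensor_times_vec(A, x, m - 2)     # J_k as in (3.2)
        try:
            dx = np.linalg.solve(J, -F)                 # J_k dx_k = -F_k
        except np.linalg.LinAlgError:
            dx = -np.linalg.pinv(J) @ F                 # fallback (3.5)
        nz = x != 0
        x[nz] = x[nz] * np.exp(dx[nz] / x[nz])          # exponential update (3.4)
        x[~nz] = 0.0                                    # zero components stay zero
    return x
```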

3.2 Convergence analysis of the EAI-NS method

Using the properties of the functions \({\varvec{F}}({\varvec{x}})\) and \({\varvec{J}}({\varvec{x}})\), we can show under some assumptions that Algorithm 1 is superlinearly convergent.

We begin with the following lemmas.

Lemma 3.1

Let \({\varvec{F}}({\varvec{x}})\) and \({\varvec{J}}({\varvec{x}})\) be the functions defined in (3.1) and (3.2), respectively, and let \(\Omega \subset \mathbb {R}^n\) be a closed and convex set. Then, for any \({\varvec{x}}, {\varvec{y}}\in \Omega \), there exist constants \(L_1>0\) and \(L_2>0\) such that

$$\begin{aligned} \begin{aligned} \Vert {{\textbf {J}}}({{\textbf {y}}})-{{\textbf {J}}}({{\textbf {x}}})\Vert \le&L_1 \Vert {{\textbf {y}}}-{{\textbf {x}}}\Vert ,\\ \Vert {{\textbf {F}}}({{\textbf {y}}})-{{\textbf {F}}}({{\textbf {x}}})\Vert \le&L_2 \Vert {{\textbf {y}}}-{{\textbf {x}}}\Vert ,\\ \Vert {{\textbf {F}}}({{\textbf {y}}})-{{\textbf {F}}}({{\textbf {x}}})-{{\textbf {J}}}({{\textbf {x}}})({{\textbf {y}}}-{{\textbf {x}}})\Vert \le&L_1 \Vert {{\textbf {y}}}-{{\textbf {x}}}\Vert ^2.\\ \end{aligned} \end{aligned}$$

Especially, \(\Vert {\varvec{J}}({\varvec{x}})\Vert \le L_2\).

Proof

The proofs of the first two inequalities follow those of Corollary 3.1 in [18]. For the third one, since the function \({\varvec{F}}\) is continuously differentiable, there exist a constant \(L_1>0\) and a vector \(\hat{\varvec{x}}_k\) between \({\varvec{x}}_k\) and \({\varvec{x}}^*\) such that

$$\begin{aligned} \begin{aligned} \Vert {\varvec{F}}({\varvec{x}}_k)-{\varvec{F}}({\varvec{x}}^*)-{\varvec{J}}({\varvec{x}}^*)({\varvec{x}}_k-{\varvec{x}}^*)\Vert&=\Vert {\varvec{J}}(\hat{{\varvec{x}}}_k)({\varvec{x}}_k-{\varvec{x}}^*)-{\varvec{J}}({\varvec{x}}^*)({\varvec{x}}_k-{\varvec{x}}^*)\Vert \\&=\Vert [{\varvec{J}}(\hat{{\varvec{x}}}_k)-{\varvec{J}}({\varvec{x}}^*)]({\varvec{x}}_k-{\varvec{x}}^*)\Vert \\&\le \Vert {\varvec{J}}(\hat{{\varvec{x}}}_k)-{\varvec{J}}({\varvec{x}}^*)\Vert \Vert {\varvec{x}}_k-{\varvec{x}}^*\Vert \\&\le L_1 \Vert {\varvec{x}}_k-{\varvec{x}}^*\Vert ^2.\\ \end{aligned} \end{aligned}$$

The last inequality can be found in [29].\(\square \)

Lemma 3.2

([12]) Suppose that \({\varvec{x}}^*\) is a solution of the tensor equation (1.1) with nonsingular M-tensor \(\mathcal {A}\). Then, \({\varvec{J}}({\varvec{x}})\) defined in (3.2) is a nonsingular M-matrix for any \({\varvec{x}}\ne 0\), and there exist positive numbers \(\delta \) and C such that \(\Vert {\varvec{J}({\varvec{x}})}^{-1}\Vert \le C\) for all \({\varvec{x}}\) satisfying \(\Vert {\varvec{x}}-{\varvec{x}}^*\Vert \le \delta \).

By using Lemmas 3.1 and 3.2, we obtain the following theorem.

Theorem 3.3

Let \(\mathcal {A}\in \mathbb {R}^{[m,n]}\) be a symmetric and nonsingular \(\mathcal {M}\)-tensor, and \({\varvec{b}}\in \mathbb {R}^{n}\), and assume that \({\varvec{x}}^*\) is a solution of the tensor equation (1.1). Then, Algorithm 1 is superlinearly convergent.

Proof

By Algorithm 1 and Lemma 3.1, we have

$$\begin{aligned} \Delta {\varvec{x}}_k=-{\varvec{J}}_k^{-1}{\varvec{F}}_k=-{\varvec{J}}_k^{-1}[{\varvec{F}}_k-{\varvec{F}}({\varvec{x}}^*)], \end{aligned}$$

and then

$$\begin{aligned} \Vert \Delta {\varvec{x}}_k\Vert \le L_2 \Vert {\varvec{J}}_k^{-1}\Vert \Vert {\varvec{x}}_k-{\varvec{x}}^*\Vert . \end{aligned}$$
(3.6)

By the Taylor expansion of the function \(\exp (z)\) with the variable \(z\in \mathbb {R}\), i.e., \(\exp (z)=1+z+o(z)\), the update (3.4) can be written componentwise as

$$\begin{aligned} {\varvec{x}}_{k+1}(i)={\varvec{x}}_k(i)+\Delta {\varvec{x}}_k(i)+o(\Delta {\varvec{x}}_k(i)), \end{aligned}$$
(3.7)

and hence the iterative scheme (3.3) can be rewritten as

$$\begin{aligned} {\varvec{x}}_{k+1}&=\left( \begin{array}{c} {\varvec{x}}_k(1)+\Delta {\varvec{x}}_k(1)+o(\Delta {\varvec{x}}_k(1)) \\ \vdots \\ {\varvec{x}}_k(n)+\Delta {\varvec{x}}_k(n)+o(\Delta {\varvec{x}}_k(n))\\ \end{array} \right) =\left( \begin{array}{c} {\varvec{x}}_k(1)-{\varvec{J}}_k^{-1}(1,:){\varvec{F}}_k+o(\Delta {\varvec{x}}_k(1)) \\ \vdots \\ {\varvec{x}}_k(n)-{\varvec{J}}_k^{-1}(n,:){\varvec{F}}_k+o(\Delta {\varvec{x}}_k(n))\\ \end{array} \right) \\&={\varvec{x}}_{k}-{\varvec{J}}_k^{-1}{\varvec{F}}_k+o(\Delta {\varvec{x}}_k). \end{aligned}$$

It follows that

$$\begin{aligned} \begin{aligned} {\varvec{x}}_{k+1}-{\varvec{x}}^*&={\varvec{x}}_{k}-{\varvec{x}}^*-{\varvec{J}}_k^{-1}{\varvec{F}}_k+o(\Delta {\varvec{x}}_k)\\&={\varvec{J}}_k^{-1} {\varvec{J}}_k({\varvec{x}}_{k}-{\varvec{x}}^*)- {\varvec{J}}_k^{-1}{\varvec{F}}_k+o(\Delta {\varvec{x}}_k)\\&={\varvec{J}}_k^{-1}[{\varvec{J}}_k({\varvec{x}}_{k}-{\varvec{x}}^*)-{\varvec{F}}_k]+o(\Delta {\varvec{x}}_k)\\&={\varvec{J}}_k^{-1}[{\varvec{J}}_k({\varvec{x}}_{k}-{\varvec{x}}^*)-{\varvec{F}}_k+{\varvec{F}}({\varvec{x}}^*)]+o(\Delta {\varvec{x}}_k),\\ \end{aligned} \end{aligned}$$

where the last equality holds because \({\varvec{F}}({\varvec{x}}^*)={\varvec{0}}\).

Using Lemma 3.1 again, together with (3.6) and Lemma 3.2, and noting that the tensor \(\mathcal {A}\) is a nonsingular \(\mathcal {M}\)-tensor, one can derive

$$\begin{aligned} \begin{aligned} \Vert {\varvec{x}}_{k+1}-{\varvec{x}}^*\Vert&\le \Vert {\varvec{J}}_k^{-1}\Vert \Vert {\varvec{J}}_k({\varvec{x}}_{k}-{\varvec{x}}^*)-{\varvec{F}}_k+{\varvec{F}}({\varvec{x}}^*)\Vert +o(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert )\\&\le L_1 \Vert {\varvec{J}}_k^{-1}\Vert \Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert ^2+o(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert )\\&{=O(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert ^2)+o(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert )}\\&{=o(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert )}, \end{aligned} \end{aligned}$$

which indicates that Algorithm 1 converges superlinearly. \(\square \)

3.3 The EAI method for (1.1) with singular \(\mathcal {M}\)-tensor \(\mathcal {A}\)

In this subsection, we establish the exponentially accelerated method for solving the tensor equation (1.1) in the case that the coefficient tensor \(\mathcal {A}\) is a symmetric and singular \(\mathcal {M}\)-tensor. Following the classical LM method [26, 27] and the EAI method proposed in Sect. 3.1, the main iterative step is constructed as

$$\begin{aligned} \begin{aligned} {\left\{ \begin{array}{ll} ({\varvec{J}}_k^\text {T}{\varvec{J}}_k +\mu _k {\varvec{I}}_n) \Delta {\varvec{x}}_k &{}=-{\varvec{J}}_k^\text {T} {\varvec{F}}_k,\\ {\varvec{x}}_{k+1}&{}= \text {diag}\left( \exp \left( \dfrac{\Delta {\varvec{x}}_k(i)}{{\varvec{x}}_k(i)}\right) \right) {\varvec{x}}_k,\\ \end{array}\right. } \end{aligned} \end{aligned}$$
(3.8)

in which we set the corresponding \({\varvec{x}}_{k+1}(i)=0\) whenever \({\varvec{x}}_k(i)=0\) for some index i. Then, the new iterative method for solving the tensor equation (1.1) with a symmetric and singular \(\mathcal {M}\)-tensor \(\mathcal {A}\) (denoted by EAI-S for short) is described as follows:

Algorithm 2 The EAI-S method for (1.1).

In Algorithm 2, if \(\mu _k=0\) and \({\varvec{J}}_k\) is nonsingular, then it reduces to Algorithm 1; that is, Algorithm 2 is an extension of the latter. In the case that \(\mu _k=0\) and \({\varvec{J}}_k\) is singular, one can replace (3.8) with (3.5). Moreover, the third step of Algorithm 2 requires computing \(\Delta {\varvec{x}}_k\) by solving the linear system \(({\varvec{J}}_k^\text {T}{\varvec{J}}_k +\mu _k {\varvec{I}}_n) \Delta {\varvec{x}}_k =-{\varvec{J}}_k^\text {T} {\varvec{F}}_k\). The coefficient matrix here is symmetric and positive definite, so the system can be solved by classical iterative methods (e.g., the CG method [30]) when the size n is large.

In the implementation of this algorithm, as in LM-type methods (e.g., [18]), we can let the parameter \(\mu _k\) vary with the iteration steps; of course, it can also be a fixed positive constant for simplicity. In addition, the computational complexity of Algorithm 2 is the same as that of Algorithm 1.
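Analogously to the sketch in Sect. 3.1, a minimal NumPy sketch of one possible implementation of the EAI-S iteration (3.8) is given below. The rule \(\mu _k=\Vert {\varvec{F}}_k\Vert \) used as the default is only one of the choices examined in Sect. 4; a fixed positive constant also fits the convergence analysis of Sect. 3.4.

```python
import numpy as np

def tensor_times_vec(A, x, times):
    """Contract x into the last `times` modes of A (same helper as in Sect. 3.1)."""
    T = np.asarray(A, dtype=float)
    for _ in range(times):
        T = np.tensordot(T, x, axes=([T.ndim - 1], [0]))
    return T

def eai_s(A, b, x0, mu_rule=np.linalg.norm, tol=1e-8, max_iter=10000):
    """EAI-S sketch for A x^{m-1} = b with a symmetric (possibly singular) M-tensor A."""
    m, n = A.ndim, A.shape[0]
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        F = tensor_times_vec(A, x, m - 1) - b             # F_k
        if np.linalg.norm(F) <= tol:
            break
        J = (m - 1) * tensor_times_vec(A, x, m - 2)       # J_k
        mu = mu_rule(F)                                    # LM-type parameter mu_k > 0
        dx = np.linalg.solve(J.T @ J + mu * np.eye(n), -J.T @ F)   # step in (3.8)
        nz = x != 0
        x[nz] = x[nz] * np.exp(dx[nz] / x[nz])            # exponential update
        x[~nz] = 0.0                                      # zero components stay zero
    return x
```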

3.4 Convergence analysis of the EAI-S method

In this subsection, we discuss the convergence of the EAI-S method. The line of reasoning is similar to that used for the EAI-NS method.

Theorem 3.4

Let \(\mathcal {A}\in \mathbb {R}^{[m,n]}\) be a symmetric and singular \(\mathcal {M}\)-tensor, and \({\varvec{b}}\in \mathbb {R}^{n}\), and assume that \({\varvec{x}}^*\) is a solution of the tensor equation (1.1). Then, Algorithm 2 is linearly convergent for \(\mu _k>0\).

Proof

By Algorithm 2 and the stated assumptions, we have

$$\begin{aligned} \Delta {\varvec{x}}_k=-({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}{\varvec{F}}_k =-({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}({\varvec{F}}_k-{\varvec{F}}({\varvec{x}}^*)), \end{aligned}$$

so it follows that

$$\begin{aligned} \Vert \Delta {\varvec{x}}_k\Vert \le L_2^2 \Vert ({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}\Vert \Vert {\varvec{x}}_k-{\varvec{x}}^*\Vert . \end{aligned}$$
(3.9)

The iterative scheme (3.8) can be written componentwise in the same form as (3.4). Analogously, by the Taylor expansion of the function \(\exp (z)\) with the variable \(z\in \mathbb {R}\), we can rewrite the iterative scheme (3.8) as (3.7), that is,

$$\begin{aligned} {\varvec{x}}_{k+1}(i)={\varvec{x}}_k(i)+\Delta {\varvec{x}}_k(i)+o(\Delta {\varvec{x}}_k(i)), \end{aligned}$$

and so the iterative scheme (3.8) is equivalent to

$$\begin{aligned} \begin{aligned} {\varvec{x}}_{k+1}={\varvec{x}}_{k}-({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}{\varvec{F}}_k +o(\Delta {\varvec{x}}_k).\\ \end{aligned} \end{aligned}$$
(3.10)

Furthermore, we obtain

$$\begin{aligned} \begin{aligned} {\varvec{x}}_{k+1}-{\varvec{x}}^*&={\varvec{x}}_{k}-{\varvec{x}}^*-({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}{\varvec{F}}_k+o(\Delta {\varvec{x}}_k)\\&=({\varvec{x}}_{k}-{\varvec{x}}^*)-({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}[{\varvec{F}}_k-{\varvec{F}}({\varvec{x}}^*)]+o(\Delta {\varvec{x}}_k)\\&=({\varvec{x}}_{k}-{\varvec{x}}^*)-({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}{\varvec{J}}(\tilde{{\varvec{x}}}_k)({\varvec{x}}_k-{\varvec{x}}^*)+o(\Delta {\varvec{x}}_k)\\&=[{\varvec{I}}_n-({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}{\varvec{J}}(\tilde{{\varvec{x}}}_k)]({\varvec{x}}_{k}-{\varvec{x}}^*)+o(\Delta {\varvec{x}}_k),\\ \end{aligned} \end{aligned}$$

in which \(\tilde{{\varvec{x}}}_k\) is a point between \({\varvec{x}}_k\) and \({\varvec{x}}^*\) given by the mean value theorem.

Using (3.9) and taking norms on both sides of the above equality, we have

$$\begin{aligned} \begin{aligned} \Vert {\varvec{x}}_{k+1}-{\varvec{x}}^*\Vert&\le \Vert {\varvec{I}}_n-({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}{\varvec{J}}_k^\text {T}{\varvec{J}}(\tilde{{\varvec{x}}}_k)\Vert \Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert +o(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert )\\&{\le (1+\Vert ({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}\Vert \Vert {\varvec{J}}_k^\text {T}{\varvec{J}}(\tilde{{\varvec{x}}}_k)\Vert ) \Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert +o(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert )}\\&\le c \Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert +o(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert )\\&={O}(\Vert {\varvec{x}}_{k}-{\varvec{x}}^*\Vert ),\\ \end{aligned} \end{aligned}$$
(3.11)

where \(c:=1+L_2^2\Vert ({\varvec{J}}_k^\text {T}{\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}\Vert \), which follows from Lemma 3.1 and the continuity of \({\varvec{J}}({\varvec{x}})\). The inequality (3.11) shows that Algorithm 2 is linearly convergent. \(\square \)

Remark 3.5

Let the singular values of the matrix \({\varvec{J}}_k\) be \(\sigma ^{(k)}_i\) (\(i=1,2,\ldots ,n\)), ordered so that \(\sigma ^{(k)}_1\ge \sigma ^{(k)}_2\ge \cdots \ge \sigma ^{(k)}_n\). Then

$$c:=1+L_2^2\Vert ({\varvec{J}}_k^T {\varvec{J}}_k+\mu _k {\varvec{I}}_n)^{-1}\Vert \le 1+\dfrac{L_2^2}{\mu _k+(\sigma ^{(k)}_n)^2}.$$

Suppose that \({\varvec{x}}^*\) is a solution of the tensor equation (1.1) with a symmetric and singular \(\mathcal {M}\)-tensor \(\mathcal {A}\), and assume that the singular values \(\sigma _i\) of the Jacobian \({\varvec{J}}({\varvec{x}}^*)\) satisfy

$$\sigma _1\ge \sigma _2\ge \ldots \ge \sigma _r>0=\sigma _{r+1}=\cdots =\sigma _n,$$

then \( c\rightarrow 1+\dfrac{L_2^2}{\mu }\) as \(k\rightarrow \infty \) and \(\mu _k \rightarrow \mu \).

4 Numerical experiments

In this section, several numerical experiments are given to illustrate the efficiency of the proposed iterative methods, i.e., Algorithms 1 and 2. All the codes were written in MATLAB (version R2016a) and run on a personal computer with an Intel(R) Core(TM) i7-10510U CPU at 1.80 GHz and 8.00 GB of memory. The tensor operations appearing in our tests were carried out via the Tensor Toolbox (version 3.2) [31].

In the numerical results, the symbols “IT” and “CPU” represent the number of iteration steps and the elapsed CPU time in seconds, respectively. The residual of the tensor equation (1.1) at \(\varvec{x}_k\) is denoted by “RES”, i.e., \(\text {RES}=\Vert \mathcal {A}\varvec{x}_k^{m-1}-\varvec{b}\Vert .\) The iteration is stopped when \(\text {RES}\le \epsilon =1.0\text {e}-08\) or when the number of iteration steps exceeds the prescribed maximum \(k_{\max }=10000\). In the following tests, we let the parameter \(\mu _k\) be a positive constant chosen by \(\texttt {rand}\) for simplicity. Additionally, the numbers of iteration steps, the CPU times, and the residuals listed in the tables below are the averages of 5 runs from different starting points unless otherwise stated.

Example 4.1

Let the tensor \(\mathcal {A}\in \mathbb {R}^{[m,n]}\) in (1.1) be \(\mathcal {A}=s\mathcal {I}-\mathcal {B}\), where \(\mathcal {B}\in \mathbb {R}^{[m,n]}\) is a nonnegative tensor with entries \(\mathcal {B}_{i_1i_2\ldots i_m}=|\sin (i_1+i_2+\cdots +i_m)|\), and choose the vector \(\varvec{b}\) such that \(\varvec{x}^*=8*\texttt {ones}(n,1)\in \mathbb {R}^n\) is a solution of the tensor equation above.
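A NumPy sketch of how this test problem can be assembled is given below; it is purely illustrative, since the experiments themselves are built with the MATLAB Tensor Toolbox.

```python
import numpy as np

def example41_problem(m, n, s):
    """Build A = s*I - B with B_{i_1...i_m} = |sin(i_1 + ... + i_m)| (1-based indices)
    and b such that x* = 8*ones(n) solves A x^{m-1} = b."""
    grids = np.meshgrid(*([np.arange(1, n + 1)] * m), indexing='ij')
    B = np.abs(np.sin(sum(grids)))
    A = -B
    A[(np.arange(n),) * m] += s            # add s on the diagonal: A = s*I - B
    x_star = 8.0 * np.ones(n)
    b = A.copy()
    for _ in range(m - 1):                 # b = A (x*)^{m-1}
        b = np.tensordot(b, x_star, axes=([b.ndim - 1], [0]))
    return A, b, x_star

# Case I below uses s = n^{m-1}, e.g. for [m, n] = [3, 10]:
A, b, x_star = example41_problem(m=3, n=10, s=float(10 ** 2))
```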

In view of the definition of the tensor \(\mathcal {B}\), the coefficient tensor \(\mathcal {A}\) given here is a symmetric \(\mathcal {M}\)-tensor. In this numerical example, the following two cases were considered:

\({\textbf {Case I}}\). Let \(s=n^{m-1}\); then the tensor \(\mathcal {A}\) is a symmetric and nonsingular \(\mathcal {M}\)-tensor, and the corresponding tensor equation (1.1) always has a solution [8].

For randomly chosen initial iterative vectors \(\varvec{x}_0\) and the parameter \(\mu =\texttt {rand}\) (the random number generator in MATLAB) in Algorithm 1, we compared the EAI-NS method with several promising iterative algorithms, namely the steepest descent method (denoted by “SD” for short) [32], the conjugate gradient method (denoted by “CG” for short) [32], the SOR method (denoted by “SOR” for short) [15], and the Newton method (denoted by “NT” for short) [8]; all these algorithms are applicable to tensor equations with symmetric coefficient tensors. The numerical results are reported in Tables 1 and 2, in which the symbol “—” means that \(\varvec{x}_k\) does not satisfy the stopping criterion within the maximum number \(k_{\max }\) of iteration steps.

Table 1 Nonsingular case (I): numerical results for the tensor equations in Example 4.1
Table 2 Nonsingular case (II): numerical results for the tensor equations in Example 4.1

From Tables 1 and 2 one can observe that all the methods converge except the SOR method in the cases \(m=4,5\) and \([m,n]=[3,100]\); the reason may be that the relaxation parameter \(\omega \) there is selected randomly by \(\omega =\texttt {rand}\) and is therefore not optimal. Notably, the SD method and the CG method do exhibit very good convergence. The Newton method is superior to the other methods in terms of the number of iteration steps and the CPU time consumed, but the EAI-NS method and the Newton method have similar convergence behavior. In addition, the EAI-NS method proposed in this article performs better than the SD method, the CG method, and the SOR method.

Furthermore, in order to better show the convergence behavior of the algorithms mentioned in Table 1, we plot their convergence curves versus the number of iteration steps k in Fig. 1. These curves show that the SD method and the SOR method have linear convergence, while the other methods have superlinear convergence, which is consistent with the theoretical results presented in Sect. 3.2.

\({\textbf {Case II}}\). Let \(s=\rho (\mathcal {B})\) in \(\mathcal {A}=s\mathcal {I}-\mathcal {B}\), where \(\rho (\mathcal {B})\) is computed by the NQZ method [33] (see Table 3 for details); a simplified sketch of this computation is given below. In this case, the coefficient tensor \(\mathcal {A}\) is a symmetric and singular \(\mathcal {M}\)-tensor.
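For reference, a power-type iteration in the spirit of the NQZ method [33] for estimating \(\rho (\mathcal {B})\) is sketched below; it is a simplified illustration, not the exact implementation used to produce Table 3, and it assumes that the iterates stay positive (as they do for the positive tensor \(\mathcal {B}\) of this example).

```python
import numpy as np

def nqz_spectral_radius(B, tol=1e-10, max_iter=1000):
    """Power-type (NQZ-style) estimate of rho(B) for a nonnegative tensor B."""
    m, n = B.ndim, B.shape[0]
    x = np.ones(n)
    for _ in range(max_iter):
        y = B.copy()
        for _ in range(m - 1):                     # y = B x^{m-1}
            y = np.tensordot(y, x, axes=([y.ndim - 1], [0]))
        ratios = y / x ** (m - 1)                  # bounds: min <= rho(B) <= max
        if ratios.max() - ratios.min() <= tol:
            break
        x = y ** (1.0 / (m - 1))
        x = x / np.linalg.norm(x)                  # normalize the next iterate
    return 0.5 * (ratios.min() + ratios.max())
```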

Fig. 1 Nonsingular case: the convergence behavior of the proposed methods in Example 4.1

Table 3 The spectral radius of the tensor \(\mathcal {B}\) in Example 4.1

As is well known, the Newton method is applicable to tensor equations with nonsingular coefficient tensors [8]. In view of the singularity of the coefficient tensor \(\mathcal {A}\) here, we compared the EAI-S method (i.e., Algorithm 2) with the LM-type method (denoted by “LM” for short) [18] and the TALM method (denoted by “TALM” for short) [20], starting from randomly chosen initial iterative vectors \({\varvec{x}}_0\) and parameters \(\mu _k\), and list the numerical results in Table 4.

Table 4 shows that the EAI-S method has the best performance compared with the LM-type method and the TALM method, both in the number of iteration steps and in the elapsed CPU time. It is worth mentioning that the EAI-S method spends less CPU time even when the iteration counts of the methods are similar. The TALM method, in turn, outperforms the LM method in the setting of this example.

In addition, we plot the logarithm of the residual RES of the three methods versus the iteration number k in Fig. 2, which shows that the EAI-S method proposed in the present paper performs better. It should be pointed out that we only prove linear convergence of the EAI-S method in Sect. 3.4, whereas the figures suggest that this method converges superlinearly; this is an issue we will consider further in future work. Additionally, the TALM method converges cubically [20], and the EAI-S method exhibits analogous convergence behavior, so establishing the actual convergence rate of this iterative method is an interesting open question.

Table 4 Singular case: numerical results for the tensor equations in Example 4.1
Fig. 2 Singular case: the convergence behavior of the proposed methods in Example 4.1

Fig. 3 Comparison of the variable and constant parameter \(\mu _k\) when \([m,n]=[3,10]\)

Fig. 4 Comparison of the variable and constant parameter \(\mu _k\) when \([m,n]=[5,10]\)

Moreover, as stated in Sect. 3, the parameter \(\mu _k\) appearing in Algorithm 2 can be chosen as an arbitrary positive constant for simplicity. To numerically examine the influence of the parameter \(\mu _k\) on the convergence of this algorithm, we take the initial vector \(\varvec{x}_0=\texttt {ones}(n,1)\).

We display the convergence behavior of Algorithm 2 for the variable choices \(\mu _k=\Vert F_k\Vert \), \(\mu _k=\Vert F_k\Vert ^{1.5}\), and \(\mu _k=\Vert F_k\Vert ^2\) versus the constant choice \(\mu _k=0.05\) (see Fig. 3 for \([m,n]=[3,10]\) and Fig. 4 for \([m,n]=[5,10]\)). From these figures we can observe that a variable parameter \(\mu _k\) is beneficial for improving the convergence of the algorithm. Establishing a theory for choosing the optimal parameter is a nontrivial problem, which will be studied in future work.

The following two numerical examples are derived from practical application problems.

Example 4.2

Consider the numerical solution of the following differential equation

$$\begin{aligned} \begin{aligned} {\left\{ \begin{array}{ll} -\max \limits _{(\gamma ,\lambda )\in (\Gamma ,\Lambda )}\{L^\lambda U-\eta U-\dfrac{1}{2} \alpha \gamma ^2 U+\beta \gamma \}=0, &{}\text {on}\ \Omega ,\\ U=g,\ {} &{}\text {on}\ \partial \Omega ,\\ \end{array}\right. } \end{aligned} \end{aligned}$$

in which \(L^\lambda U(x)=\dfrac{1}{2} \sigma (x,\lambda )^2 U''(x)+\mu (x,\lambda ) U'(x)\), \(\Omega =(0,1)\), \(\Gamma =[0,+\infty )\), and \(\Lambda \) is a compact metric space.

Applying the “optimize then discretize” approach to the above differential equation, one obtains the third-order Bellman tensor equation [10]

$$\begin{aligned} \begin{aligned} \max \limits _{\lambda \in {\Lambda _{\Delta x}}} \mathcal {A}(\lambda ){\textbf {u}}^2=\varvec{b},\\ \end{aligned} \end{aligned}$$

where \({\textbf {u}}=({\textbf {u}}_i)\in \mathbb {R}^{n+1}\), \(u_i\approx U(i \Delta x)\) with \(\Delta x=\dfrac{1}{n}\) for \(i=0,1,2,\ldots ,n\), \(\mathcal {A}(\lambda )\) is a 3rd-order and \((n+1)\)-dimensional parameterized tensor whose entries are defined as follows:

$$\begin{aligned} \begin{aligned}&\mathcal {A}_{i,i-1,i}(\lambda )=\mathcal {A}_{i,i,i-1}(\lambda ),\\&2\mathcal {A}_{i,i,i-1}(\lambda )=-\dfrac{1}{2} \sigma _i^2(\lambda _i) \dfrac{1}{(\Delta x)^2} +\mu _i(\lambda _i) \dfrac{1}{\Delta x} {\textbf {1}}_{(-\infty ,0)}(\mu _i(\lambda _i)),\\&\mathcal {A}_{i,i,i}(\lambda )=\dfrac{1}{2} \sigma _i^2(\lambda _i) \dfrac{2}{(\Delta x)^2} +|\mu _i(\lambda _i)|\dfrac{1}{\Delta x}+\eta _i,\\&2\mathcal {A}_{i,i,i+1}(\lambda )=-\dfrac{1}{2} \sigma _i^2(\lambda _i) \dfrac{1}{(\Delta x)^2} -\mu _i(\lambda _i) \dfrac{1}{\Delta x} {\textbf {1}}_{(0,+\infty )}(\mu _i(\lambda _i)),\\&\mathcal {A}_{i,i+1,i}(\lambda )=\mathcal {A}_{i,i,i+1}(\lambda ),\\&i=1,2,\ldots ,n-1,\\ \end{aligned} \end{aligned}$$

here \({\textbf {1}}_{\mathbb {S}}\) stands for the indicator function of the set \(\mathbb {S}\), \(\sigma _i(\lambda )=\sigma (i \Delta x, \lambda )\), \(\mu _i(\lambda )=\mu (i \Delta x, \lambda )\), \(\eta _i=\eta (i \Delta x)\), and \(\mathcal {A}_{i,i,i}(\lambda )=1\) for \(i=0,n\). The vector \({\varvec{b}}=(b_i)\) is given by

$$\begin{aligned} \begin{aligned} b_i= {\left\{ \begin{array}{ll} \dfrac{1}{2} \dfrac{\beta _i^2}{\alpha _i}, &{}\text {if}\ {\textit{i}}=1,2,\ldots , \textit{n}-1,\\ g_i^2,\ {} &{}\text {if}\ {\textit{i}}=0,\textit{n},\\ \end{array}\right. } \end{aligned} \end{aligned}$$

in which \(\alpha _i=\alpha (i \Delta x)\), \(\beta _i=\beta (i \Delta x)\), \(g_i=g(i \Delta x)\).

In our tests, we let \(\sigma (x,\lambda )=0.2\), \(\alpha (x)=2-x\), \(\eta (x)=0.04\), \(\mu (x,\lambda )=0.04\lambda \), \(\beta (x)=1+x\), \(g(x)=1\), and \(\lambda =-1\). The coefficient tensor \(\mathcal {A}\) is then a nonsymmetric and nonsingular \(\mathcal {M}\)-tensor, and thus the derived tensor equation has a nonnegative solution [8].
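A direct NumPy transcription of the entries above, for the fixed \(\lambda =-1\) and the parameter functions just listed, might look as follows (illustrative sketch; the index i is 0-based in the code, matching \(i=0,1,\ldots ,n\) in the text).

```python
import numpy as np

def bellman_example(n, lam=-1.0):
    """Assemble the 3rd-order, (n+1)-dimensional tensor A(lambda) and the vector b
    of Example 4.2 for sigma = 0.2, eta = 0.04, mu = 0.04*lambda, alpha = 2 - x,
    beta = 1 + x, g = 1."""
    dx = 1.0 / n
    xs = np.arange(n + 1) * dx
    sigma, eta = 0.2, 0.04
    mu = 0.04 * lam                              # constant in x for these choices
    alpha, beta, g = 2.0 - xs, 1.0 + xs, np.ones(n + 1)

    A = np.zeros((n + 1, n + 1, n + 1))
    lower = 0.5 * (-0.5 * sigma**2 / dx**2 + (mu / dx if mu < 0 else 0.0))
    upper = 0.5 * (-0.5 * sigma**2 / dx**2 - (mu / dx if mu > 0 else 0.0))
    for i in range(1, n):                        # interior nodes i = 1, ..., n-1
        A[i, i, i - 1] = A[i, i - 1, i] = lower
        A[i, i, i + 1] = A[i, i + 1, i] = upper
        A[i, i, i] = sigma**2 / dx**2 + abs(mu) / dx + eta
    A[0, 0, 0] = A[n, n, n] = 1.0                # boundary nodes

    b = 0.5 * beta**2 / alpha
    b[0], b[n] = g[0]**2, g[n]**2
    return A, b
```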

Although the Newton method mentioned above is very effective, as shown in Example 4.1, it is theoretically inapplicable to nonsymmetric cases. Therefore, we compared the EAI-NS method with the SD method [32], the CG method [32], and the SOR method [15], starting from initial vectors chosen in the same way as in Example 4.1, and display the corresponding results in Table 5.

Table 5 Comparison of the proposed methods for the tensor equations in Example 4.2

From Table 5, one can see that the number of iteration steps and the corresponding CPU time increase as n increases from 10 to 60, and that all the tested iterative algorithms converge for the third-order Bellman tensor equation; among them, the CG method takes fewer iterations and less CPU time than the SD method and the SOR method in the majority of cases. Nevertheless, the EAI-NS method performs better than the CG method.

It should be pointed out that when \(n=70\), all methods except the EAI-NS method fail to terminate before reaching the maximum number \(k_{\max }\) of iteration steps. A large number of unlisted numerical results reflect similar phenomena, so our algorithm can effectively solve such problems. Moreover, in Fig. 5, we plot the convergence curves of the logarithm of the residual RES of the aforementioned algorithms versus the iteration number k. These curves also demonstrate the superlinear convergence of Algorithm 1.

Fig. 5 The convergence behavior of the proposed methods in Example 4.2

Example 4.3

Consider the numerical solution of the Klein-Gordon equation [34, 35]

$$\begin{aligned} \begin{aligned} {\left\{ \begin{array}{ll} u({\varvec{x}})^{m-2}\cdot \triangle u({\varvec{x}})=-f({\varvec{x}}),\ {} &{}\text {in}\ \Omega ,\\ u({\varvec{x}})=g({\varvec{x}}),\ {} &{}\text {on}\ \partial \Omega ,\\ \end{array}\right. } \end{aligned} \end{aligned}$$

in which \(f({\varvec{x}})\) is a constant function, \(\triangle =\sum \limits _{k=1}^{d}\frac{\partial ^2}{\partial x_k^2}\), \(\Omega =[0,1]^d\), and \(m=3,4,\ldots \).

When \(d=1\), the above Klein-Gordon equation is discretized as the tensor equation \(\mathcal {L}_h {\textbf {u}}^{m-1}={\textbf {f}}\), where \(h=1/(n-1)\), and \(\mathcal {L}_h\in \mathbb {R}^{[m,n]}\) is a nonsymmetric and nonsingular \(\mathcal {M}\)-tensor, i.e.,

$$\begin{aligned} \begin{aligned} (\mathcal {L}_h)_{11\ldots 1}=&(\mathcal {L}_h)_{nn\ldots n}=1/h^2, \ \ \ (\mathcal {L}_h)_{ii\ldots i}=2/h^2,\ i=2,3,\ldots , n-1,\\ (\mathcal {L}_h)_{i i-1 i\ldots i}=&(\mathcal {L}_h)_{ii i-1 i\ldots i}= \cdots =(\mathcal {L}_h)_{i \ldots i i-1}=-\dfrac{1}{(m-1)h^2},\ i=2,3,\ldots , n-1,\\ (\mathcal {L}_h)_{i i+1 i\ldots i}=&(\mathcal {L}_h)_{ii i+1 i\ldots i}= \cdots =(\mathcal {L}_h)_{i\ldots i i+1}=-\dfrac{1}{(m-1)h^2},\ i=2,3,\ldots , n-1.\\ \end{aligned} \end{aligned}$$

In the following tests, for simplicity we choose the vector \({\textbf {f}}\) such that \(\mathcal {L}_h({\textbf {u}}^*)^{m-1}={\textbf {f}}\) with \({\textbf {u}}^*=8*\texttt {ones}(n,1)\).
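A NumPy sketch assembling \(\mathcal {L}_h\) and the right-hand side \({\textbf {f}}\) is given below for reference (0-based indices in the code, so the interior rows \(i=2,\ldots ,n-1\) of the text correspond to i = 1, ..., n-2 here).

```python
import numpy as np

def klein_gordon_tensor(m, n):
    """Assemble the order-m, n-dimensional M-tensor L_h of Example 4.3 (1D case)."""
    h = 1.0 / (n - 1)
    L = np.zeros((n,) * m)
    L[(0,) * m] = L[(n - 1,) * m] = 1.0 / h**2       # boundary rows
    off = -1.0 / ((m - 1) * h**2)
    for i in range(1, n - 1):                        # interior rows
        L[(i,) * m] = 2.0 / h**2
        for pos in range(1, m):                      # one of modes 2, ..., m holds i-1 or i+1
            idx_lo = [i] * m
            idx_lo[pos] = i - 1
            idx_hi = [i] * m
            idx_hi[pos] = i + 1
            L[tuple(idx_lo)] = off
            L[tuple(idx_hi)] = off
    return L

m, n = 3, 10
L = klein_gordon_tensor(m, n)
u_star = 8.0 * np.ones(n)
f = L.copy()
for _ in range(m - 1):                               # f = L_h (u*)^{m-1}
    f = np.tensordot(f, u_star, axes=([f.ndim - 1], [0]))
```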

Starting from random initial iterative vectors chosen in the same way as in Example 4.1, we ran the EAI-NS method, the SD method [32], the CG method [32], and the SOR method [15], and report the numerical results in Table 6.

From this table, we can observe that the CG method performs better than the SD method and the SOR method in the convergent cases. Moreover, both the CG method and the EAI-NS method converge before the number of iteration steps reaches the maximum value \(k_{\max }\), and, in particular, our method takes fewer iteration steps and less CPU time.

Table 6 Comparison of the proposed methods for the tensor equations in Example 4.3

Additionally, we also plot the convergence curves of the logarithm of the residual RES corresponding to the four iterative approaches versus the iteration number k in Fig. 6, from which we can see that the EAI-NS method exhibits the best convergence behavior.

Fig. 6 The convergence behavior of the proposed methods in Example 4.3

5 Conclusions and remarks

In the present paper, based on the exponential acceleration technique, we develop exponentially accelerated iterative methods for the tensor equation (1.1) in two different cases: one in which the coefficient tensor \(\mathcal {A}\) is a symmetric and nonsingular \(\mathcal {M}\)-tensor, and the other in which \(\mathcal {A}\) is a symmetric and singular \(\mathcal {M}\)-tensor, leading to Algorithms 1 and 2, respectively. Both iterative methods are extensions of Newton's method; Algorithm 1 possesses superlinear convergence, while Algorithm 2 is linearly convergent. The numerical results provided here, together with many other trials not listed in the present paper, demonstrate that the proposed methods are effective for solving tensor equations of the form (1.1) and perform better than some existing ones.

We should mention that the key operations in these two algorithms are tensor-vector multiplications, which means that the computational cost of the proposed methods grows exponentially with increasing dimension; that is, they suffer from the so-called “curse of dimensionality” [36]. Overcoming this curse is therefore an interesting and important topic for future research.