Abstract
In this paper, we consider the tensor eigenvalue complementarity problem which is closely related to the optimality conditions for polynomial optimization, as well as a class of differential inclusions with nonconvex processes. By introducing an NCP-function, we reformulate the tensor eigenvalue complementarity problem as a system of nonlinear equations. We show that this function is strongly semismooth but not differentiable, in which case the classical smooth methods cannot apply. Furthermore, we propose a damped semismooth Newton method for tensor eigenvalue complementarity problem. A new procedure to evaluate an element of the generalized Jacobian is given, which turns out to be an element of the B-subdifferential under mild assumptions. As a result, the convergence of the damped semismooth Newton method is guaranteed by existing results. The numerical experiments also show that our method is efficient and promising.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Complementarity problems have developed into a very fruitful discipline in the field of mathematical programming, which were originally studied in the standard optimality conditions for linear and smooth nonlinear optimization. The distinguishing feature of a complementarity problem is a set of complementarity conditions. Each of these conditions requires that the product of two or more quantities should be zero. They appear prominently in the study of equilibria problems and arise naturally in numerous applications from economics, engineering and the sciences [14].
As a special type of complementarity problems, the matrix eigenvalue complementarity problem first appeared in the stability analysis of finite dimensional mechanical systems with frictional contact [10], which has been well studied during the last decade. Mathematically speaking, given a matrix \(A \in \mathbb {R}^{n\times n}\) and a positive definite matrix \(B \in \mathbb {R}^{n\times n}\), the matrix eigenvalue complementarity problem [19, 34] consists of finding a scalar \(\lambda > 0\) and a vector \(\mathbf{x}\in \mathbb {R}^n {\setminus } \{\mathbf{0}\}\) such that
where \(\langle \cdot , \cdot \rangle \) denotes the inner product of two vectors. For more details about matrix eigenvalue complementarity problems and their applications, the readers are referred to [1, 11, 12, 18, 20].
Meanwhile, as a natural extension of the concept of matrices, a real mth order n-dimensional tensor \(\mathcal {A}=(a_{i_1 \ldots i_m})\) is a multidimensional array with each entry \(a_{i_1 \ldots i_m} \in \mathbb {R}\) for any \( i_1, \ldots , i_m \in [n]\), where \([n]=\{ 1,2, \ldots , n\}\). Denote the set of all real mth order n-dimensional tensors by \(T_{m, n}\). For any vector \(\mathbf{x}\in \mathbb {R}^n\), let \(\mathcal {A}\mathbf{x}^{m-1}\) be a vector in \(\mathbb {R}^n\) whose ith component is defined by
Let \(\mathcal {A}\mathbf{x}^m\) be the scalar denoted by \(\mathcal {A}\mathbf{x}^m= \mathbf{x}^\top \mathcal {A}\mathbf{x}^{m-1}\), which is exactly a homogeneous polynomial of \(\mathbf{x}\) with degree m. We say a tensor \(\mathcal {A}\) is symmetric if its entries are invariant under permutation. Denote the set of all real symmetric mth order n-dimensional tensors by \(S_{m, n}\). A tensor \(\mathcal {A}\in S_{m,n}\) is called positive definite if \(\mathcal {A}\mathbf{x}^m > 0\) for all \(\mathbf{x}\ne 0\). Clearly, when m is odd, there are no positive definite tensors.
It is possible that the ideas of eigenvalues of tensors had been raised earlier. However, it was the independent work of Lim [24] and Qi [30] that initiated the rapid developments of the spectral theory of tensors. Moreover, these definitions can all be unified under generalized tensor eigenpair framework as follows, introduced by Chang, Pearson and Zhang [4]. Let \(\mathcal {A}, \mathcal {B}\in T_{m,n}\). Assume further that m is even and \(\mathcal {B}\) is positive definite. We say \((\lambda , \mathbf{x}) \in \mathbb {C} \times \{ \mathbb {C}^n {\setminus } \{ \mathbf{0} \} \}\) is a generalized eigenpair if
Different choices of \(\mathcal {B}\) yield different versions of the tensor eigenvalue problem [4, 21]. After that, the study of tensors and the spectra of tensors with their various applications has attracted extensive attention and interest.
In this paper, we consider the tensor eigenvalue complementarity problem (TEiCP), which consists of finding a scalar \(\lambda >0\) and a vector \(\mathbf{x}\in \mathbb {R}^n {\setminus } \{ \mathbf{0} \}\) such that
where \(\mathcal {A}\in T_{m,n}\) and \(\mathcal {B}\in S_{m,n}\) is positive definite. This problem is not only a natural generalization of the classical eigenvalue complementarity problem for matrices, but also closely related to the optimality conditions for polynomial optimization [36], a kind of nonlinear differential dynamical system [8], as well as a class of differential inclusions with nonconvex processes [22]. In particular, without the constraint \(\lambda >0\), any solution \(\lambda \) of (1) is called a Pareto H-eigenvalue or a Pareto Z-eigenvalue by Song and Qi [36] when tensor \(\mathcal {B}\) has different special forms, respectively. The properties of Pareto eigenvalues and their relationship with polynomial optimization are also studied in [36]. The general properties of TEiCP, including the solution existence and uniqueness, are systematically investigated by Chen et al. [8]. Note that the eigenvalue complementarity problem under a given index set is also considered in [8]. When the nonnegative cones in (1) are replaced by a closed convex cone and its dual cone, TEiCP is called the cone eigenvalue complementarity problem for high-order tensors by Ling et al. [22]. Moreover, as a natural extension of quadratic eigenvalue complementarity problem for matrices, Ling et al. [23] also consider the high-degree eigenvalue complementarity problem for tensors. Some properties of Pareto eigenvalues are further studied in [39]. Another kind of complementarity problems related to tensors is considered in [5, 16, 25, 35].
On the other hand, some algorithms for computing the solutions of TEiCP have been proposed, such as shifted projected power method [8], scaling-and-projection algorithm [22] and alternating direction method of multipliers [23]. Notice that all these methods are first order algorithms that are based on gradient information. In this paper, we present a semismooth Newton method for computing the solutions of TEiCP. It turns out that TEiCP is a parameterized nonlinear complementarity problem. By introducing an NCP-function to get rid of the nonnegative constraints in (1), we reformulate it as a system of nonsmooth operator equations. The main difficulty is that this function is not differentiable. As a result, the classical Newton method cannot apply. Fortunately, this function is strongly semismooth and the semismooth Newton method has been well studied since the work of Qi and Sun [32]. In order to implement the semismooth Newton method, we also propose a new procedure to evaluate an element of the generalized Jacobian, which turns out to be an element of the B-subdifferential under some mild assumptions. The numerical results indicate that our method is efficient to compute the solutions of TEiCP.
The rest of this paper is organized as follows. In Sect. 2, we recall some basic definitions and results in nonsmooth analysis and nonlinear complementarity problem. In Sect. 3, we reformulate TEiCP as a system of nonlinear equations. In Sect. 4, we present a damped semismooth Newton method for TEiCP and its convergence analysis. Some numerical results are reported in Sect. 5. In Sect. 6, we give two final remarks.
Throughout this paper, we assume that m is even and \(\mathcal {B}\in S_{m,n}\) is positive definite. We use small letters \(x, y, \ldots , \) for scalars, small bold letters \(\mathbf{x}, \mathbf{y}, \ldots , \) for vectors, capital letters \(A, B, \ldots , \) for matrices, calligraphic letters \(\mathcal {A}, \mathcal {B}, \ldots , \) for tensors. All the tensors discussed in this paper are real.
2 Preliminaries
In this section, we present some basic definitions and properties in nonsmooth analysis and nonlinear complementarity problem, which will be used in the sequel.
Suppose that \(F: U \subseteq \mathbb {R}^{n_1} \rightarrow \mathbb {R}^{n_2}\) is a locally Lipschitz function, where U is nonempty and open. By Rademacher’s Theorem, F is differentiable almost everywhere. Let \(D_F \subseteq \mathbb {R}^{n_1}\) denote the set of points at which F is differentiable. For any \(\mathbf{x}\in D_F\), we write \(JF(\mathbf{x})\) for the usual \(n_2 \times n_1\) Jacobian matrix of partial derivatives. The B-subdifferential of F at \(\mathbf{x}\in U\) is the set defined by
The Clark’s generalized Jacobian of F at \(\mathbf{x}\) is the set defined by
where “co” denotes the convex hull. In the case of \(n_2 =1, \partial F(\mathbf{x})\) is called the generalized gradient. Some fundamental properties about generalized Jacobian are given below. For more details, one can refer to [9].
Proposition 1
Suppose that the function \(F: U \subseteq \mathbb {R}^{n_1} \rightarrow \mathbb {R}^{n_2}\) is locally Lipschitz, where U is nonempty and open. Then for any \(\mathbf{x}\in U\), we have
-
(a)
\(\partial F(\mathbf{x})\) is a nonempty convex compact subset of \(\mathbb {R}^{n_2 \times n_1}\);
-
(b)
\(\partial F(\mathbf{x}) = \partial _B F(\mathbf{x}) = \{ JF(\mathbf{x})\}\) if F is continuously differentiable at \(\mathbf{x}\);
-
(c)
\(\partial F(\mathbf{x}) \subseteq \partial f^1(\mathbf{x}) \times \partial f^2 (\mathbf{x}) \times \cdots \times \partial f^m (\mathbf{x})\), where \(F(\mathbf{x})= [ f^1(\mathbf{x}), f^2(\mathbf{x}), \ldots , f^m(\mathbf{x})]\) and the latter denotes the set of all matrices whose ith row belongs to \(\partial f^i(\mathbf{x})\) for each i.
Let \(U \subseteq \mathbb {R}^{n_1}\) be nonempty and open. The function \(F : U \rightarrow \mathbb {R}^{n_2} \) is semisoomth [17, 32] at \(\mathbf{x}\in \mathbb {R}^{n_1}\), if it is locally Lipschitz at \(\mathbf{x}\) and if
exists for all \(\mathbf{d}\in \mathbb {R}^n\). If F is semismooth at all \(\mathbf{x}\in U\), we call F semismooth on U. The function F is called strongly semismooth [33] if it is semismooth and for any \(\mathbf{x}\in U\) and \(V \in \partial F(\mathbf{x}+\mathbf{d})\),
where \(F'(\mathbf{x}; \mathbf{d})\) denotes the directional derivative [3] of F at \(\mathbf{x}\) in direction \(\mathbf{d}\), i.e.,
It is worth mentioning [32] that if the function F is semismooth, the directional derivative \(F'(\mathbf{x}; \mathbf{d})\) exists for all \(\mathbf{d}\in \mathbb {R}^n\) and
Suppose that \(F : \mathbb {R}^n \rightarrow \mathbb {R}^n\) is directional differentiable at \(\mathbf{x}\); we say that F is BD-regular [29] at \(\mathbf{x}\) if for any \(\mathbf{d}\in \mathbb {R}^n {\setminus } \{ \mathbf{0}\}\),
We say that F is strongly BD-regular at \(\mathbf{x}\) if all \(V \in \partial _B F(\mathbf{x})\) are nonsingular. These concepts can be very important for the convergence analysis of the semismooth Newton method.
Below, we introduce the classical nonlinear complementarity problem. It will be shown in the next section that the tensor eigenvalue complementarity problem is a special kind of parameterized nonlinear complementarity problem.
Definition 1
Given a mapping \(F: \mathbb {R}^n_+ \rightarrow \mathbb {R}^n \), the nonlinear complementarity problem, denoted by NCP(F), is to find a vector \(\mathbf{x}\in \mathbb {R}^n\) satisfying
Many solution methods developed for NCP or related problems are based on reformulating them as a system of equations using so-called NCP-functions. Here, a function \(\phi : \mathbb {R}^2 \rightarrow \mathbb {R}\) is called an NCP-function if
Given an NCP-function \(\phi \), let us define
By definition, \(\mathbf{x}\in \mathbb {R}^n\) is a solution of NCP(F) if and only if it solves the system of equations \(\Phi (\mathbf{x}) = \mathbf{0}\).
Here, we present some NCP-functions which are widely used in nonlinear complementarity problems. For more details about NCP-functions and their smoothing approximations, one can refer to [31, 37, 40, 41] and references therein.
-
The min function [38]
$$\begin{aligned} \phi _{min}(a,b) := a - (a-b)_+ . \end{aligned}$$ -
The Fischer-Burmeister function [15]
$$\begin{aligned} \phi _{FB}(a,b):= (a+b) - \sqrt{a^2+b^2} . \end{aligned}$$ -
The penalized Fischer-Burmeister function [6]
$$\begin{aligned} \phi _{\tau }(a,b):= \tau \phi _{FB}(a,b) + (1-\tau ) a_+ b_+ , \end{aligned}$$where \(\tau \in (0,1)\).
Here, \(x_+ =\max \{x,0\}\) for \(x \in \mathbb {R}\). It has been shown that all these NCP-functions are globally Lipschitz continuous, directionally differentiable, and strongly semismooth. Their generalized gradients are given as follows.
Proposition 2
(Proposition 1 of [6]) Let \(\phi _{min}(a,b), \phi _{FB}(a,b)\) and \(\phi _{\tau }(a,b)\) be defined as above. Then
-
(a)
The generalized gradient \(\partial \phi _{min}(a,b)\) is equal to the set of all \((v_a, v_b)\) such that
$$\begin{aligned} (v_a,v_b) = \left\{ \begin{array}{l@{\quad }l} (1,0) &{} \text {if } a < b, \\ (1-v,v) &{} \text {if } a = b, \\ (0,1) &{} \text {if } a > b, \end{array} \right. \end{aligned}$$where v is any scalar in the interval [0, 1].
-
(b)
The generalized gradient \(\partial \phi _{FB}(a,b)\) is equal to the set of all \((v_a, v_b)\) such that
$$\begin{aligned} (v_a,v_b) = \left\{ \begin{array}{l@{\quad }l} \big (1-\frac{a}{\Vert (a,b)\Vert },1-\frac{b}{\Vert (a,b)\Vert } \big ) &{} \text {if } (a,b) \ne (0,0), \\ (1-\sigma ,1-\eta ) &{} \text {if } (a,b) = (0,0), \end{array} \right. \end{aligned}$$where \((\sigma ,\eta )\) is any vector satisfying \(\Vert (\sigma , \eta ) \Vert \le 1\).
-
(c)
For any \(\tau \in (0,1)\), the generalized gradient \(\partial \phi _{\tau }(a,b)\) is equal to the set of all \((v_a, v_b)\) such that
$$\begin{aligned}&(v_a,v_b) \\&\quad = \left\{ \begin{array}{l@{\quad }l} \tau \big (1-\frac{a}{\Vert (a,b)\Vert },1-\frac{b}{\Vert (a,b)\Vert } \big ) + (1-\tau ) (b_+ \partial a_+, a_+ \partial b_+ ) &{} \text {if } (a,b) \ne (0,0), \\ \tau (1-\sigma ,1-\eta ) &{} \text {if } (a,b) = (0,0), \end{array} \right. \end{aligned}$$where \((\sigma ,\eta )\) is any vector satisfying \(\Vert (\sigma , \eta ) \Vert \le 1\) and
$$\begin{aligned} \partial x_+ = \left\{ \begin{array}{l@{\quad }l} 0 &{} \text {if } x < 0, \\ {[0,1]} &{} \text {if } x = 0, \\ 1 &{} \text {if } x > 0. \end{array} \right. \end{aligned}$$
3 Reformulation
Suppose that m is even. As mentioned before, the tensor eigenvalue complementarity problem has the form of
where \(\mathcal {A}\in T_{m,n}\) and \(\mathcal {B}\in S_{m,n}\) is positive definite. Note that any solution with \(\mathbf{w}=0\) is a generalized tensor eigenpair of \((\mathcal {A}, \mathcal {B})\). Denote the solution set of (1) by \(\sigma (\mathcal {A}, \mathcal {B})\), i.e.,
Notice that if \((\lambda , \mathbf{x}) \in \sigma (\mathcal {A},\mathcal {B})\), then \((\lambda , s \mathbf{x}) \in \sigma (\mathcal {A}, \mathcal {B})\) for any \(s >0\). Without loss of generality, we only consider solutions satisfying \(\Vert \mathbf{x}\Vert _2 =1\). On the other hand, it is clear that
where \(\tilde{\mathcal {A}} \in T_{m,n}\) is the unique semi-symmetric tensor [27] such that \(\mathcal {A}\mathbf{x}^{m-1} = \tilde{\mathcal {A}} \mathbf{x}^{m-1}\) for all \(\mathbf{x}\in \mathbb {R}^n\). Hence, we always assume that \(\mathcal {A}\in T_{m,n}\) is semi-symmetric.
By introducing a new variable \(t \in \mathbb {R}\), we denote
As a result, TEiCP can be regarded as a parameterized nonlinear complementarity problem, i.e.,
with the constraint \(\mathbf{x}^\top \mathbf{x}= 1 \). It follows that the TEiCP can be represented compactly by the system of nonlinear equations
where \(\mathbf{z}=(\mathbf{x},t)\in \mathbb {R}^{n+1}, t \ne 0\), and \(H : \mathbb {R}^{n+1} \rightarrow \mathbb {R}^{n+1}\) is defined by
where the mapping \(\Phi : \mathbb {R}^{n+1} \rightarrow \mathbb {R}^n\) is given by
and \(\phi (a,b)\) is an NCP-function. Moreover, a natural metric function of \(H(\mathbf{z})\) is given by
By using the techniques for nonlinear complementarity problems, we can see that finding a solution of TEiCP is equivalent to solve the corresponding system of nonlinear equations.
Proposition 3
Let \(H(\mathbf{z})\) be defined by (7). If \((\lambda , \mathbf{x})\) is a solution of (3) with \(\Vert \mathbf{x}\Vert =1 \), then \(H(\mathbf{z}) =0\) with \(\mathbf{z}=(\mathbf{x}, \pm \sqrt{\lambda }) \in \mathbb {R}^{n+1}\). On the other hand, if \(H(\mathbf{z}) = 0\) with \(z=(\mathbf{x}, t) \in \mathbb {R}^{n+1}\) and \(t \ne 0\), then \((t^2, \mathbf{x})\) is a solution of (3) with \(\Vert \mathbf{x}\Vert =1 \).
Let \(H_{min}(\mathbf{z}), H_{FB}(\mathbf{z})\) and \(H_{\tau }(\mathbf{z})\) be the functions defined by (7), corresponding to the NCP-functions \(\phi _{min}, \phi _{FB}\) and \(\phi _{\tau }\), respectively. In what follows, we will show that these functions are all strongly semismooth, which can be very important for nonsmooth Newton methods.
Lemma 1
The functions \(\Phi _{min}(\mathbf{z}), \Phi _{FB}(\mathbf{z})\) and \(\Phi _{\tau }(\mathbf{z})\) are strongly semismooth, where \(\Phi _{min}(\mathbf{z}), \Phi _{FB}(\mathbf{z})\) and \(\Phi _{\tau }(\mathbf{z})\) are defined by (8), corresponding to the NCP-functions \(\phi _{min}, \phi _{FB}\) and \(\phi _{\tau }\), respectively.
Proof
Notice that for any \(\mathbf{z}=(\mathbf{x},t)\in \mathbb {R}^{n+1}\), the function \(F(\mathbf{z})= (t^2 \mathcal {B}- \mathcal {A}) \mathbf{x}^{m-1}\) is continuously differentiable and its Jacobian \(JF(\mathbf{z})\) is locally Lipschitz continuous. It follows from Theorem 1 of [37] that \(\Phi _{FB}(\mathbf{z})\) and \(\Phi _{\lambda }(\mathbf{z})\) are strongly semismooth. Similarly, \(\Phi _{min}(\mathbf{z})\) is strongly semismooth since the composition of strongly semismooth functions is again strongly semismooth [26].
Theorem 1
The functions \(H_{min}(\mathbf{z}), H_{FB}(\mathbf{z})\) and \(H_{\tau }(\mathbf{z})\) are strongly semismooth. Moreover, for any \(\mathbf{z}\in \mathbb {R}^{n+1}\), we have
Proof
It is clear that \(H_{n+1} (\mathbf{z})= \mathbf{x}^\top \mathbf{x}-1 \) is continuously differentiable. By Lemma 1, the functions \(\Phi _{min}(\mathbf{z}), \Phi _{FB}(\mathbf{z})\) and \(\Phi _{\tau }(\mathbf{z})\) are strongly semismooth. It follows that \(H_{min}(\mathbf{z}), H_{FB}(\mathbf{z})\) and \(H_{\tau }(\mathbf{z})\) are strongly semismooth since all their components are strongly semismooth. Moreover, by Proposition 2.6.2 of [9], (10) holds immediately. \(\square \)
4 Semismooth Newton method
In order to establish a semismooth Newton method for TEiCP, we need to obtain an element of \(\partial H(\mathbf{z})\). According to the Jacobian chain rule (see Theorem 2.6.6 of [9]), we have the following result.
Proposition 4
Suppose that the mapping \(F: \mathbb {R}^{n_1 + n_2} \rightarrow \mathbb {R}^{n_1}\) is continuously differentiable and the function \(\phi : \mathbb {R}^2 \rightarrow \mathbb {R}\) is locally Lipschitz. Let \( \Phi : \mathbb {R}^{n_1 + n_2} \rightarrow \mathbb {R}^{n_1}\) be the mapping such that
Then for any \(\mathbf{z}\in \mathbb {R}^{n_1 +n_2}\), we have
where \(D_a(\mathbf{z}) = diag\{ a_i (\mathbf{z}) \} \) and \(D_b(\mathbf{z}) = diag \{ b_i(\mathbf{z}) \}\) are diagonal matrices in \(\mathbb {R}^{n_1 \times n_1}\) with entries \((a_i (\mathbf{z}), b_i(\mathbf{z})) \in \partial \phi (x_i, F_i(\mathbf{z}))\), where \(\partial \phi (x_i, F_i(\mathbf{z}))\) denotes the set \(\partial \phi (a,b)\) with (a, b) being replaced by \((x_i, F_i(\mathbf{z}))\).
Let \(F(\mathbf{z})\) be defined in (4) and \(\phi (a,b)\) be one of the NCP-functions \(\phi _{FB}\) and \(\phi _{\tau }\). Since \(\mathcal {A}\in T_{m,n}\) is semi-symmetric, by simple computation, the Jacobian of F at \(\mathbf{z}\) is given by
Here, for a tensor \(\mathcal {T}= (t_{i_1 \ldots i_m})\in T_{m,n}\) and a vector \(\mathbf{x}\in \mathbb {R}^n\), let \(\mathcal {T}\mathbf{x}^{m-2}\) be a matrix in \(\mathbb {R}^{n \times n}\) whose (i, j)-th component is defined by
By Propositions 2 and 4, we can obtain the overestimation of \(\partial \Phi _{FB} (\mathbf{z})\) and \(\partial \Phi _{\tau } (\mathbf{z})\), respectively. In the following, we present a procedure to obtain an element of \(\partial \Phi _{\tau } (\mathbf{z})\) for any \(\mathbf{z}\in \mathbb {R}^{n+1}\), where \(\tau \in (0, 1]\). Note that \(\partial \Phi _{1} (\mathbf{z}) = \partial \Phi _{FB} (\mathbf{z})\).
Algorithm 1
A procedure to generate an element \(V \in \partial \Phi _{\tau }(\mathbf{z})\)
Step 0. Given \(\tau \in (0,1], \mathbf{z}=(\mathbf{x},t) \in \mathbb {R}^{n+1} \) and let \(V_i\) be the ith row of a matrix \(V \in \mathbb {R}^{n\times (n+1)}\).
Step 1. Set \(S_1 = \{ i \in [n] : x_i =0, F_i(\mathbf{x},t) =0\}, S_2 = \{ i \in [n] : x_i =0, F_i(\mathbf{x},t) >0\}, S_3 = \{ i \in [n] : x_i >0, F_i(\mathbf{x},t) =0\}\) and \(S_4 = \{ i \in [n] : x_i >0, F_i(\mathbf{x},t) >0\}\).
Step 2. Let \(\mathbf{c}\in \mathbb {R}^n \) such that \(c_i =1\) for \(i \in S_1 \cup S_2 \cup S_3\) and 0 otherwise.
Step 3. For \(i \in S_1\), set
Step 4. For \(i \in S_3\), set
Step 5. For \(i \in S_4\), set
Step 6. For \(i \not \in S_1 \cup S_3 \cup S_4\), set
For \(\tau \in (0,1)\), based on the overestimate of \(\partial \Phi _{\tau }(\mathbf{z})\), we can see that under some assumptions, the matrix
is an element in the B-subdifferential of \(H_\tau \) at \(\mathbf{z}\), where \(V \in \mathbb {R}^{n \times (n+1)}\) is the matrix generated by Algorithm 1 with \(\tau \in (0,1)\).
Theorem 2
Let \(\mathbf{z}=(\mathbf{x},t) \in \mathbb {R}^{n+1}\) be given and let \(V \in \mathbb {R}^{n \times (n+1)}\) be the matrix generated by Algorithm 1 with \(\tau \in (0,1)\). Suppose that \(\nabla _{\mathbf{x}} F_i (\mathbf{z})^\top \mathbf{c}\ne 0\) for all \(i \in S_3\). Then the matrix \(G = {\left( \begin{array}{c} {V} \\ 2\mathbf{x}^\top \quad 0 \end{array} \right) }\) is an element of \(\partial _B H_{\tau } (\mathbf{z})\).
Proof
Notice that \(H_{\tau }(\mathbf{z})\) is differentiable everywhere except on the set
We shall generate a sequence \(\{ \mathbf{z}^k \}_{k=1}^{\infty } \subseteq \mathbb {R}^{n+1}{\setminus } \Omega \) such that \(JH(\mathbf{z}^k)\) tends to the matrix G. Then the conclusion follows immediately by the definition of B-subdifferential.
The conclusion is trivial if \(\mathbf{z}\not \in \Omega \), i.e., \(S_1 \cup S_2 \cup S_3 = \emptyset \). In the following, we suppose that \(\mathbf{z}\in \Omega \), i.e., \(S_1 \cup S_2 \cup S_3 \ne \emptyset \). Let \(\mathbf{z}^k = \mathbf{z}- \frac{1}{k} (\mathbf{c}^\top , 0)\), where \(\mathbf{c}\in \mathbb {R}^n\) is the vector given in Step 2. It is clear that \(z^k_i < 0\) for \(i \in S_1 \cup S_2\). For \(i \in S_1 \cup S_3\), by Taylor expansion, we have
where \(\zeta ^k \rightarrow \mathbf{z}\) as \(k \rightarrow \infty \). Since \(\nabla _{\mathbf{x}} F_i (\mathbf{z})^\top \mathbf{c}\ne 0\) for all \(i \in S_3\), by continuity, we have that for all \(i \in S_3, F_i(\mathbf{z}^k) \ne 0\) when k is large enough. Hence, there exists \(N >0\) such that \(H_{\tau }(\mathbf{z}^k)\) is differentiable for all \(k > N\).
For \(i \in [n+1]\), let \( JH(\mathbf{z}^k)_i\) be the ith row of \(JH(\mathbf{z}^k)\). If \(i \not \in S_1 \cup S_2 \cup S_3\) or \(i = n+1\), by continuity, it is obvious that \(JH(\mathbf{z}^k)_i\) tends to the ith row of G. For \(i \in S_1 \cup S_2\), we have
For \(i \in S_3\), it is not difficult to show that
Note that for \(i \in S_1\), by substituting (11), we have
Similarly,
It follows that for \(i \in S_1 \cup S_2 \cup S_3, JH(\mathbf{z}^k)_i\) tends to the ith row of the matrix G. \(\square \)
We also mention that \(H_{FB}(\mathbf{z})\) is differentiable everywhere except on the set
Hence, \(H_{FB}(\mathbf{z})_i\) is differentiable for all \(i \in S_3\). By a proof similar to that of Theorem 2, we can see that the matrix \(G = {\left( \begin{array}{c} {V} \\ 2\mathbf{x}^\top \quad 0 \end{array} \right) }\) is exactly an element of \(\partial _B H_{FB} (\mathbf{z})\) without any assumption, where \(V \in \mathbb {R}^{n \times (n+1)}\) is the matrix generated by Algorithm 1 with \(\tau =1\).
Theorem 3
Let \(\mathbf{z}=(\mathbf{x},t) \in \mathbb {R}^{n+1}\) be given and let \(V \in \mathbb {R}^{n \times (n+1)}\) be the matrix generated by Algorithm 1 with \(\tau =1\). Then the matrix \(G = {\left( \begin{array}{c} {V} \\ 2\mathbf{x}^\top \quad 0 \end{array} \right) }\) is an element of \(\partial _B H_{FB} (\mathbf{z})\).
Now we present some properties of the metric functions \(\Psi _{FB}(\mathbf{z})\) and \(\Psi _{\tau }(\mathbf{z})\).
Theorem 4
The metric functions \(\Psi _{FB}(\mathbf{z})\) and \(\Psi _{\tau }(\mathbf{z})\) defined in (9) are continuously differentiable with \( \nabla \Psi (\mathbf{z}) = G^\top H(\mathbf{z})\) for any \(G \in \partial H(\mathbf{z})\).
Proof
By known rules of the calculus of generalized gradients (see [9], Theorem 2.6.6), it holds that \(\partial \Psi (\mathbf{z}) = H(\mathbf{z})^\top \partial H(\mathbf{z})\). Since the zero components of \(H(\mathbf{z})\) cancel the “multivalued rows” of \(\partial H(\mathbf{z})\), it is easy to check that \(H(\mathbf{z})^\top \partial H(\mathbf{z})\) is single valued everywhere. By the corollary to Theorem 2.2.4 in [9], we have that \(\Psi (\mathbf{z})\) is continuously differentiable. \(\square \)
Note that the metric function \(\Psi _{min}(\mathbf{z})\) is not continuously differentiable, which makes the generalized Newton direction not necessarily a descent direction. Now we present a damped Newton method for tensor eigenvalue complementarity problem. Here, we only take the NCP-functions \(\phi _{FB}\) and \(\phi _{\tau }\).
Algorithm 2
Damped semismooth Newton method for TEiCP
Step 0. Given \(\epsilon >0, \rho >0, p>2, \beta \in (0, \frac{1}{2})\) and choose \(\mathbf{z}^0 =(\mathbf{x}^0, t^0)\in \mathbb {R}^{n+1}\). Set \(k=0\).
Step 1. If \(\Vert H(\mathbf{z}^k) \Vert \le \epsilon \), stop. Otherwise, go to Step 2.
Step 2. Compute
where \(V_k \in \mathbb {R}^{n \times (n+1)} \) is the element of \(\partial \Phi (\mathbf{z}^k)\) generated by Algorithm 1. Find the solution \(\mathbf{d}^k\) of the system
If \(G_k\) in (12) is ill-conditioned or if the condition
is not satisfied, set \(\mathbf{d}^k = - \nabla \Psi (\mathbf{z}^k)\).
Step 3. Find the smallest \(i_k= 0,1, \ldots \) such that \(\alpha _k = 2^{-i_k}\) and
Set \(\mathbf{z}^{k+1} = \mathbf{z}^k + \alpha _k \mathbf{d}^k\).
Step 4. Set \(k=k+1\) and go back to Step 1.
The global convergence of Algorithm 2 is guaranteed by the following theorem, whose proof can be found in the papers [13, 28, 29].
Theorem 5
Suppose that the solution set \(\sigma (\mathcal {A},\mathcal {B})\) is nonempty. Let \(\{\mathbf{z}^k\} \subseteq \mathbb {R}^{n+1}\) be generated by Algorithm 2. Assume that \(H(\mathbf{z}^k) \ne \mathbf{0}\) for all k. Then the conclusions (a) and (b) hold:
-
(a)
\(\Vert H(\mathbf{z}^{k+1}) \Vert \le \Vert H(\mathbf{z}^k) \Vert \) ;
-
(b)
Each accumulation point \(\mathbf{z}^*\) of the sequence \(\{ \mathbf{z}^k \}\) is a stationary point of \(\Psi \). Furthermore, if \(H(\mathbf{z})\) is strongly BD-regular at \(\mathbf{z}^*\), then \(\mathbf{z}^*\) is a zero of \(H(\mathbf{z})\) if and only if \(\{ \mathbf{z}^k\}\) converges to \(\mathbf{z}^*\) quadratically and \(\alpha _k\) eventually becomes 1. On the other hand, \(\mathbf{z}^*\) is not a zero of \(H(\mathbf{z})\) if and only if \(\{ \mathbf{z}^k\}\) diverges or \(\lim _{k \rightarrow \infty }\alpha _k =0\).
Here, we make several remarks. First, if \(H(\mathbf{z}^k) = \mathbf{0}\) for some k, Algorithm 2 terminates right then with a zero solution of \(H(\mathbf{z})\). Second, by Theorem 1, \(H(\mathbf{z})\) is strongly semismooth. It follows from Lemma 2.3 of [29] that the directional differential \(H'(\cdot , \cdot )\) is semicontinuous of degree 2 at \(\mathbf{z}^*\). Then the convergence is quadratic if \(H(\mathbf{z})\) is strongly BD-regular at \(\mathbf{z}^*\). Third, when \(H(\mathbf{z})\) is not strongly BD-regular at \(\mathbf{z}^*\), i.e., there exists a singular matrix \(V \in \partial _B H(\mathbf{z}^*)\), the local convergence is also established by using adaptive constructs of outer inverses [7].
5 Numerical results
In this section, we present the numerical performance of the damped semismooth Newton method for the tensor eigenvalue complementarity problem. All codes were written by using Matlab Version R2012b and the Tensor Toolbox Version 2.5 [2]. The numerical experiments were done on a laptop with an Intel Core i5-2430M CPU (2.4 GHz) and RAM of 5.58 GB.
In the implementation of Algorithm 2, we set the parameters \(\epsilon = 10^{-6}, \rho = 10^{-10} , p=2.1\) and \(\beta = 10^{-4}\). We choose the penalized Fischer-Burmeister function with \(\tau =0.95\). Numerically, we say that the matrix \(G_k\) in (12) is ill-conditioned if \(\kappa (G_k) \ge 10^{10}\), where \(\kappa (M)= {\sigma _{max}(M) \over \sigma _{min} (M)}\) denotes the condition number of the matrix M. We let Algorithm 2 run until any of the following situations occur:
-
(a)
\(k=1000\) Failure (converge to a stationary point but not a solution),
-
(b)
\(\Vert H(\mathbf{z})\Vert \le \epsilon \) Success (a solution has been detected).
For simplicity, the positive definite tensor \(\mathcal {B}\in S_{m,n}\) is chosen as the identity tensor [21], i.e., \(\mathcal {B}\mathbf{x}^{m-1} = \mathbf{x}\) for all \(\mathbf{x}^\top \mathbf{x}=1\) where m is even.
Our first numerical experiment concerns the symmetric tensor \(\mathcal {A}\in S_{6,4}\) described in Table 1 of [8]. We compare the numerical performance of the damped semismooth Newton method and the shifted projected power method [8]. It has been shown that this problem is solvable, i.e., the solution set is nonempty, since A is symmetric with \(a_{111111} > 0\) [8]. In order to get all possible solutions, we take 1000 random initial points for both methods. We adopt the following strategy: one generates a random vector \(\mathbf{x}^0 \in \mathbb {R}^4\) with each element uniformly distributed on the open interval (0, 1), a scalar \(t^0 \in \mathbb {R}\) drawn from the standard normal distribution N(0, 1), and then normalizes \(\mathbf{x}^0\) such that \(\Vert \mathbf{x}^0 \Vert =1\). Note that for the shifted projected power method, the initial point \(\mathbf{x}^0\) needs to be chosen such that \(\mathcal {A}(\mathbf{x}^0)^m > 0\) while the damped semismooth Newton method does not have this constraint. We also record the time needed for finding the proper initial point for the shifted projected power method.
The numerical results of these two methods are reported in Tables 1 and 2, respectively. In Tables 1 and 2, No. denotes the number of each solution detected by the method within 1000 random initial points. Ite. denotes the average number of iteration for each solution. In Table 1, Time1 and Time2 denote the average time of finding the initial point and the average time of iteration in second, respectively. In Table 2, Time denotes the average time of iteration and the values in bold are the solutions detected only by the damped semismooth Newton method.
Our second numerical experiment consists of applying the damped semismooth Newton method to a sample of 10 randomly generated tensors. Given order m and dimension n, the tensor \(\mathcal {A}\in S_{m,n}\) is generated as [8], i.e., we select random entries from \([-1, 1]\) and symmetrize the result. To make TEiCP solvable, we reset its first entry by 0.5. The idea is measuring the rate of success of the damped semismooth Newton algorithm with a given number of initial points. The damped semismooth Newton algorithm is declared successful if a solution is found while working with a prescribed number of initial points (for instance, 1, 5, or 10). The outcome of this experiment is reported in Table 3.
The third numerical experiment focuses on nonnegative tensors. As pointed out in [8], there exists a unique solution for TEiCP when \(\mathcal {A}\in T_{m,n}\) is irreducible nonnegative and \(\mathcal {B}= \mathcal {I}\in S_{m,n}\), where \(\mathcal {I}\) is the diagonal tensor with diagonal entries 1 and 0 otherwise. To be specific, this unique solution is exactly the largest H-eigenvalue of \(\mathcal {A}\), associated with the unique positive eigenvector. We generate a symmetric nonnegative tensor \(\mathcal {A}\in S_{6,4}\) with all entries randomly selected from (0, 1). The tensor \(\mathcal {A}\) is given in Table 4. We test the damped semismooth Newton method with the initial point \(\mathbf{x}^0 = \mathbf{e}/ \Vert \mathbf{e}\Vert \) and \(t^0 = \sqrt{\mathcal {A}(\mathbf{x}^0)^6 / \mathcal {I}(\mathbf{x}^0)^6}\), where \(\mathbf{e}= (1, 1, \ldots , 1)^\top \in \mathbb {R}^n\). The iteration of Algorithm 2 is given in Table 5. Here we also report the value of \(\alpha _k\) derived in Step 3 of Algorithm 2.
It is worth mentioning that the damped semismooth Newton method can also work for nonsymmetric tensors while the shifted projected power method does not work. We test the damped semismooth Newton method for randomly generated nonnegative tensors with all entries selected from the interval (0, 1). Note that these nonnegative tensors are not symmetric in general, and the order m is not necessary to be even. We use the same initial point in the third numerical experiment. Interestingly, we find that the damped semismooth Newton method always converges to the unique solution. We summarize our numerical results in Table 6. For each case, we use a sample of 100 random tensors to record the number of success (Suc.), the average number of iteration (Ite.), the average time of iteration (Time) and the average value of \(\lambda \)-solution (\(\lambda ^*\)).
6 Final remarks
In this paper, we propose a damped semismooth Newton method for the tensor eigenvalue complementarity problem. Here we make two final remarks.
-
1.
Given an index set \(J \subseteq [n]\), the generalized tensor eigenvalue complementarity problem \(\text {(TEiCP)}_J\) is also considered in [8]. In fact, we can also apply our damped semismooth Newton method to \(\text {(TEiCP)}_J\). Since the results are similar, we omit them in this paper.
-
2.
From the numerical results, we may give a new way to compute the largest H-eigenvalue of irreducible nonnegative tensors since this problem can be equivalently reformulated as a tensor eigenvalue complementarity problem.
References
Adly, S., Rammal, H.: A new method for solving Pareto eigenvalue complementarity problems. Comput. Optim. Appl. 55, 703–731 (2013)
Bader, B.W., Kolda, T.G., et al.: MATLAB Tensor Toolbox Version 2.5. http://www.sandia.gov/~tgkolda/TensorToolbox/ (2012 )
Bonnans, J.F., Cominetti, R., Shapiro, A.: Second order optimality conditions based on parabolic second order tangent sets. SIAM J. Optim. 9(2), 466–493 (1999)
Chang, K.C., Pearson, K., Zhang, T.: On eigenvalue problems of real symmetric tensors. J. Math. Anal. Appl. 350, 416–422 (2009)
Che, M., Qi, L., Wei, Y.: Positive definite tensors to nonlinear complementarity problems. J. Optim. Theor. Appl. 168(2), 475–487 (2015)
Chen, B., Chen, X., Kanzow, C.: A penalized Fischer–Burmeister NCP-function. Math. Program. 88, 211–216 (2000)
Chen, X., Nashed, Z., Qi, L.: Convergence of Newton’s method for singular smooth and nonsmooth equations using adaptive outer inverses. SIAM J. Optim. 7, 445–462 (1997)
Chen, Z., Yang, Q., Ye, L.: Generalized eigenvalue complementarity problem for tensors. arXiv preprint arXiv:1505.02494 (2015)
Clarke, F.H.: Optimization and Nonsmooth Analysis. Wiley, New York (1983)
da Costa, A., Figueiredo, I., Júdice, J., Martins, J.: A complementarity eigenproblem in the stability analysis of finite dimensional elastic systems with frictional contact. In: Ferris, M., Pang, J.S., Mangasarian, O. (eds.) Complementarity: Applications, Algorithms and Extensions, pp. 67–83. Kluwer, New York (2001)
da Costa, A., Martins, J.A.C., Figueiredo, I.N., Júdice, J.: The directional instability problem in systems with frictional contacts. Comput. Methods Appl. Mech. Eng. 193, 357–384 (2004)
da Costa, A., Seeger, A.: Cone-constrained eigenvalue problems: theory and algorithms. Comput. Optim. Appl. 45(1), 25–57 (2010)
de Luca, T., Facchinei, F., Kanzow, C.: A semismooth equation approach to the solution of nonlinear complementarity problems. Math. Program. 75, 407–439 (1996)
Ferris, M., Pang, J.: Engineering and economic applications of complementarity problems. SIAM Rev. 39(4), 669–713 (1997)
Fischer, A.: A special Newton-type optimization method. Optimization 24, 269–284 (1992)
Gowda, M.S., Luo, Z., Qi, L., Xiu, N.: Z-tensors and complementarity problems. arXiv preprint arXiv:1510.07933 (2015)
Hintermüller, M.: Semismooth Newton methods and applications. Department of Mathematics, Humboldt-University of Berlin (2010)
Júdice, J.J., Raydan, M., Rosa, S.S., Santos, S.A.: On the solution of the symmetric eigenvalue complementarity problem by the spectral projected gradient algorithm. Numer. Algorithm 47(4), 391–407 (2008)
Júdice, J.J., Sherali, H.D., Ribeiro, I.M.: The eigenvalue complementarity problem. Comput. Optim. Appl. 37, 139–156 (2007)
Júdice, J.J., Sherali, H.D., Ribeiro, I.M., Rosa, S.S.: On the asymmetric eigenvalue complementarity problem. Optim. Method Softw. 24(4–5), 549–568 (2009)
Kolda, T.G., Mayo, J.R.: Shifted power method for computing tensor eigenpairs. SIAM J. Matrix Anal. Appl. 32, 1095–1124 (2011)
Ling, C., He, H., Qi, L.: On the cone eigenvalue complementarity problem for higher-order tensors. Comput. Optim. Appl. 63(1), 143–168 (2016)
Ling, C., He H., Qi, L.: Higher-degree eigenvalue complementarity problems for tensors. Comput. Optim. Appl. (2015). doi: 10.1007/s10589-015-9805-x
Lim, L.H.: Singular values and eigenvalues of tensors: a variational approach. In: Proceedings of the IEEE International Workshop on Computational Advances in Multi-Sensor Addaptive Processing. CAMSAP05, pp. 129–132. IEEE Computer Society Press, Piscataway (2005)
Luo, Z., Qi, L., Xiu, N.: The sparsest solutions to Z-Tensor complementarity problems. Optim. Lett. (2016). doi: 10.1007/s11590-016-1013-9
Mifflin, R.: Semismooth and semiconvex functions in constrained optimization. SIAM J. Control Optim. 15(6), 959–972 (1977)
Ni, Q., Qi, L.: A quadratically convergent algorithm for finding the largest eigenvalue of a nonnegative homogeneous polynomial map. J. Global Optim. 61(4), 627–641 (2015)
Pang, J.S.: Newton’s method for B-differentiable equations. Math. Oper. Res. 15(2), 311–341 (1990)
Qi, L.: Convergence analysis of some algorithms for solving nonsmooth equations. Math. Oper. Res. 18(1), 227–244 (1993)
Qi, L.: Eigenvalues of a real supersymmetric tensor. J. Symbolic Comput. 40, 1302–1324 (2005)
Qi, L., Sun, D., Zhou, G.: A new look at smoothing Newton methods for nonlinear complementarity problems and box constrained variational inequalities. Math. Program. 87(1), 1–35 (2000)
Qi, L., Sun, J.: A nonsmooth version of Newton’s method. Math. Program. 58(1–3), 353–367 (1993)
Qi, L., Yin, H.: A strongly semismooth integral function and its application. Comput. Optim. Appl. 25(1–3), 223–246 (2003)
Queiroz, M., Júdice, J.J., Humes, J.C.: The symmetric eigenvalue complementarity problem. Math. Comp. 73, 1849–1863 (2004)
Song, Y., Qi, L.: Tensor complementarity problem and semi-positive tensors. J. Optim. Theory Appl. (2015). doi: 10.1007/s10957-015-0800-2
Song, Y., Qi, L.: Eigenvalue analysis of constrained minimization problem for homogeneous polynomials. J. Global Optim. 64(3), 563–575 (2016)
Sun, D., Qi, L.: On NCP-functions. Comput. Optim. Appl. 13(1–3), 201–220 (1999)
Sun, D., Sun, J.: Strong semismoothness of the Fischer-Burmeister SDC and SOC complementarity functions. Math. Program. 103(3), 575–581 (2005)
Xu, F., Ling, C.: Some properties on Pareto-eigenvalues of higher-order tensors. Oper. Res. Trans. 19(3), 34–41 (2015)
Yin, H., Ling, C., Qi, L.: Convergence rate of Newton’s method for \(L_2\) spectral estimation. Math. Program. 107(3), 539–546 (2006)
Zhou, G., Caccetta, L., Teo, K.L.: A superlinearly convergent method for a class of complementarity problems with non-Lipschitzian functions. SIAM J. Optim. 20(4), 1811–1827 (2010)
Acknowledgments
The authors are very grateful to the two anonymous referees for their valuable suggestions and constructive comments, which have considerably improved the presentation of the paper. We are also thankful to Dr. Ziyan Luo for her helpful comments. Liqun Qi work was supported by the Hong Kong Research Grant Council (Grant No. PolyU 502111, 501212, 501913 and 15302114).
Author information
Authors and Affiliations
Corresponding author
Additional information
Zhongming Chen work was done when he was visiting the Hong Kong Polytechnic University.
Rights and permissions
About this article
Cite this article
Chen, Z., Qi, L. A semismooth Newton method for tensor eigenvalue complementarity problem. Comput Optim Appl 65, 109–126 (2016). https://doi.org/10.1007/s10589-016-9838-9
Received:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10589-016-9838-9