Abstract
Let A be a real matrix of size \(m\times n\). In classical linear algebra, a real number \(\sigma \) is called a singular value of A if there exist unit vectors \(u\in \mathbb {R}^m\) and \(v\in \mathbb {R}^n\) such that \(Av = \sigma u\) and \( A^\top u = \sigma v\). In variational analysis, a singular value of A is viewed as a critical value of the bilinear form \(\langle u,Av\rangle \) when u and v range on the unit spheres of \(\mathbb {R}^m\) and \(\mathbb {R}^n\), respectively. If u and v are further required to be nonnegative, then the idea of criticality is expressed by means of a pair of complementarity problems, namely, \(0\le u \perp (Av-\sigma u)\ge 0\) and \(0\le v \perp (A^\top u -\sigma v)\ge 0\). The parameter \(\sigma \) is now called a Pareto singular value of A. In this work we study the concept of Pareto singular value and, by way of application, we analyze a problem of nonnegative matrix factorization. The set \(\Xi (A)\) of Pareto singular values of A is nonempty and finite. We derive an explicit formula for the maximum number of Pareto singular values in a matrix of prescribed size. The elements of \(\Xi (A)\) can be found by solving a collection of classical singular value problems involving the principal submatrices of A. Unfortunately, such a method is cost prohibitive if m and n are large. For matrices of large size we propose an algorithm of alternating minimization type. This work is a continuation of our paper entitled Cone-constrained singular value problems published in the Journal of Convex Analysis (30, 2023, pp. 1285–1306).
1 Introduction
This work deals with singular value problems under nonnegativity constraints on singular vectors. Let \(S_d\) be the unit sphere of \(\mathbb {R}^{d}\) and \(\mathbb {M}_{m,n}\) be the space of real matrices of size \(m\times n\). Classically, the set \(\Sigma (A)\) of singular values of a matrix \(A\in \mathbb {M}_{m,n}\) is defined as follows: a real number \(\sigma \) is a singular value of A if there exists a pair \((u,v) \in S_m\times S_n\) such that
$$\begin{aligned} Av = \sigma u, \qquad A^\top u = \sigma v, \end{aligned}$$(1)
where the superscript \(\top \) denotes matrix transposition. In such a case, \((u,v,\sigma )\) is a singular triplet of A and u (respectively, v) is a left-singular (respectively, right-singular) vector of A. The equations in (1) can be viewed as the criticality conditions for the problem which consists in minimizing or maximizing the bilinear form \(\langle u,Av\rangle \) over the bi-sphere \( S_m\times S_n\). In a number of applications, the vectors u and v are subject to additional constraints: nonnegativity, monotonicity, sparsity, etc. For instance, Montanari and Richard [2] consider a Principal Component Analysis problem
where the constraint \(v\ge 0\) means that each component of v is nonnegative. Writing down the criticality condition for (2) leads to a hybrid complementarity problem
where \(\perp \) denotes orthogonality. The system (3) is not just a matter of adding the inequality \(v\ge 0\) to the equations in (1), but it is something more involved pertaining to the realm of the theory of complementarity problems. We refer to (3) as a hybrid complementarity problem because it is a mixture of an equation and a complementarity problem. In this work, we give priority to minimization over maximization and assume that both singular vectors are subject to a nonnegativity constraint:
$$\begin{aligned} \mathop {\textrm{minimize}}\limits _{(u,v)\,\in \, S_m\times S_n,\; u\ge 0,\, v\ge 0}\; \langle u,Av\rangle . \end{aligned}$$(4)
We say that \((u,v) \in S_m\times S_n\) is a critical pair of (4) if there exists a real number \(\sigma \) such that
$$\begin{aligned} 0\le u \perp (Av-\sigma u)\ge 0 \quad \text{ and }\quad 0\le v \perp (A^\top u -\sigma v)\ge 0. \end{aligned}$$(5)
Since (4) is a nonconvex optimization problem, criticality is necessary but not sufficient for optimality. The system (5) of complementarity problems can be written in the equivalent form
where \(\mathbb {P}_d\) is the Pareto cone (or nonnegative orthant) of the space \(\mathbb {R}^d\). The partial order relation induced by \(\mathbb {P}_d\) is the componentwise ordering of \(\mathbb {R}^d\). Seeger and Sossa [6] introduced a concept of singular value for matrices that is relative to a prescribed pair of nonzero closed convex cones. Such “ordering” cones can in fact be absolutely general, but in this work we focus on Pareto cones. In other words, a vector is declared nonnegative when each of its components is nonnegative.
Definition 1
A real number \(\sigma \) is a Pareto singular value of A if the system (5) holds for some \((u,v)\in S_m\times S_n\). In such a case, \((u,v,\sigma )\) is called a Pareto singular triplet of A.
The set \(\Xi (A)\) of Pareto singular values of A is nonempty. Indeed, if \((u,v,\sigma )\) is a Pareto singular triplet of A, then \(\sigma =\langle u, Av \rangle \) and, therefore,
is the set of critical values of (4). As pointed out in [6], the set \(\Xi (A)\) is finite and \(\varrho _{\textrm{min}}(A)\) is its smallest element:
$$\begin{aligned} \varrho _{\textrm{min}}(A)\,=\, \min \,\Xi (A). \end{aligned}$$(6)
A word of caution is however in order: the term
$$\begin{aligned} \varrho _{\textrm{max}}(A)\,=\, \max _{(u,v)\,\in \, S_m\times S_n,\; u\ge 0,\, v\ge 0}\; \langle u,Av\rangle \end{aligned}$$(7)
is not necessarily an element of \(\Xi (A)\). Indeed, the criticality conditions for the maximization problem (7) are not exactly the same as those for the minimization problem (4). The organization of this paper is as follows. Section 2 settles the background of our work and presents some preliminary results concerning the set \(\Xi (A)\). Unless the matrix A is of moderate size, the numerical computation of \(\Xi (A)\) is an expensive task. Section 3 characterizes \(\Xi (A)\) when A is a nonnegative matrix. In this special case, computing \(\Xi (A)\) boils down to evaluating the spectral norm of each submatrix of A. Section 4 concerns the maximum number of Pareto singular values in a matrix of prescribed size. We characterize the matrices A of size \(m\times n\) for which the cardinality of \(\,\Xi (A)\) is highest. Sections 5 and 6 are devoted to applications and algorithmic issues.
2 Background and preliminary results
Proposition 1 shows that \(\Xi (A)\) remains unchanged if A is transposed or multiplied by a permutation matrix. The notation \(\textrm{Perm}(d)\) stands for the set of permutation matrices of order d.
Proposition 1
Let \(A\in \mathbb {M}_{m,n}\). Then
-
(a)
\(\Xi (A^\top )=\Xi (A)\) and \(\,\Xi (tA)=t\,\Xi (A)\) for all \(t>0\).
-
(b)
\(\Xi (P A\, Q)=\Xi (A)\) for all \(P\in \textrm{Perm}(m)\) and \(Q\in \textrm{Perm}(n)\).
Proof
Part (a) is obvious, so we focus on (b). Let \(\sigma \in \Xi (A)\). This means that (5) holds for some \((u,v)\in S_m\times S_n\). The vectors \(z= Pu\) and \(w= Q^\top v\) have unit length and satisfy
But a permutation matrix of order d leaves invariant the cone \( \mathbb {P}_d\) and preserves the angle between any two d-dimensional vectors. Hence,
Thus, \((z,w,\sigma )\) is a Pareto singular triplet of \(B=P A\, Q\). In particular, \(\sigma \in \Xi (B)\). The reverse inclusion \(\Xi (B)\subseteq \Xi (A)\) is obtained by exchanging the roles of A and B. \(\square \)
The next proposition considers \(\Xi : \mathbb {M}_{m,n}\rightrightarrows \mathbb {R}\) as a multivalued map. Consistently, upper and lower semicontinuity are understood in the traditional multivalued sense, cf. Rockafellar and Wets [4, p. 193].
Proposition 2
Let \(\max \{m,n\}\ge 2\).
-
(a)
\(\textrm{gr}(\Xi )=\{(A,\sigma )\in \mathbb {M}_{m,n}\times \mathbb {R}: \sigma \in \Xi (A)\}\) is a closed set.
-
(b)
\(\Xi \) is upper semicontinuous, but not lower semicontinuous.
Proof
Part (a). Let \(\{A_k\}_{k}\rightarrow A\) and \(\{\sigma _k\}_{k}\rightarrow \sigma \) be converging sequences such that \(\sigma _k\in \Xi (A_k)\). Hence, for each k, there exists \((u_k, v_k)\in S_m\times S_n\) such that
$$\begin{aligned} 0\le u_k \perp (A_k v_k-\sigma _k u_k)\ge 0 \quad \text{ and }\quad 0\le v_k \perp (A_k^\top u_k -\sigma _k v_k)\ge 0. \end{aligned}$$(8)
By taking subsequences if necessary, we may suppose that \(\{u_k\}_{k}\) and \(\{v_k\}_{k}\) converge to unit vectors u and v, respectively. Passing to the limit in (8) yields \(\sigma \in \Xi (A)\). Part (b). Since
$$\begin{aligned} \Xi (A)\subseteq \left[ \varrho _{\textrm{min}}(A),\, \varrho _{\textrm{max}}(A)\right] \end{aligned}$$
for all \(A\in \mathbb {M}_{m,n}\) and the functions \(\varrho _{\textrm{min}}\) and \( \varrho _{\textrm{max}}\) are continuous on \(\mathbb {M}_{m,n}\), it is clear that \(\Xi \) maps bounded sets into bounded sets. But mapping bounded sets into bounded sets and having a closed graph is sufficient for a multivalued map to be upper semicontinuous, cf. [4, Theorem 5.19]. For proving that \(\Xi \) is not lower-semicontinuous, we construct a counter-example. We suppose for instance that \(n\ge 2\) and, for each \(\varepsilon \ge 0\), we define \(A_\varepsilon \in \mathbb {M}_{m,n}\) by
Note that \(1\in \Xi (A_0)\). If \(\varepsilon >0\) goes to 0, then \(A_\varepsilon \) goes to \(A_0\), but 1 remains away from \(\Xi (A_\varepsilon )\). Indeed, by writing down (5) for the matrix \(A_\varepsilon \), we get the system
together with the nonnegativity constraints on the unit vectors u and v. A quick examination of this system shows that \(\sigma \) cannot be positive. Hence, the distance from 1 to the set \(\Xi (A_\varepsilon )\subseteq \mathbb {R}_-\) is at least one. \(\square \)
2.1 Characterization and computation of \(\,\Xi (A)\)
The theory of Pareto singular values for rectangular matrices is related to the theory of Pareto eigenvalues for square matrices. Let E be a square matrix of order d. Seeger [5] refers to a real number \(\lambda \) as a Pareto eigenvalue of E if the complementarity problem
$$\begin{aligned} 0\le x \perp (Ex-\lambda x)\ge 0 \end{aligned}$$(9)
has a nonzero solution \(x\in \mathbb {R}^d\). The set of Pareto eigenvalues of E is denoted by \(\Pi (E)\) and it is called the Pareto spectrum of E. A nonzero solution x to (9) can always be normalized so that \( \Vert x\Vert =1\). Up to a detail, it is possible to compute \(\Xi (A)\) by solving an eigenvalue problem of the type (9). Indeed, the couple of complementarity problems mentioned in (5) can be written as a single complementarity problem
$$\begin{aligned} 0\le \left[ \begin{array}{c} u \\ v \end{array}\right] \perp \left( E_A \left[ \begin{array}{c} u \\ v \end{array}\right] -\sigma \left[ \begin{array}{c} u \\ v \end{array}\right] \right) \ge 0 \end{aligned}$$(10)
involving the Jordan–Wielandt matrix
$$\begin{aligned} E_A= \left[ \begin{array}{cc} 0 &{} A \\ A^\top &{} 0 \end{array}\right] . \end{aligned}$$(11)
The sets \(\Xi (A)\) and \(\Pi (E_A)\) are almost equal. Indeed, it is shown in [6, Proposition 3] that
From an algorithmic point of view, transforming a Pareto singular value problem into a Pareto eigenvalue problem is not always a good strategy. In fact, since \(E_A\) is of order \(m+n\), the numerical handling of (10) could be rather expensive. We suggest computing the Pareto singular values of A without resorting to the theory of Pareto eigenvalues. One can use for instance the characterization of \(\Xi (A)\) given in Theorem 1. In what follows, \(\langle d\rangle =\{1,2,\ldots ,d\},\)
$$\begin{aligned} {\mathcal {J}}(m,n)\,=\,\left\{ (I,J): \emptyset \ne I\subseteq \langle m\rangle ,\; \emptyset \ne J\subseteq \langle n\rangle \right\} , \end{aligned}$$
and \(A^{I,J}\) is the submatrix of A with entries \(a_{i,j}\) indexed by \(i\in I\) and \(j\in J\). In particular, \(A^{\langle m\rangle , \langle n\rangle }=A\). The symbol \(w>0\) means that w is a vector with positive components.
Theorem 1
A real number \(\sigma \) belongs to \(\Xi (A)\) if and only if there exist \((I,J)\in {\mathcal {J}}(m,n)\) and vectors \(\xi \in {\mathbb {R}}^{|I|}\) and \(\eta \in {\mathbb {R}}^{|J|}\) such that
$$\begin{aligned} A^{I,J}\,\eta = \sigma \xi \quad \text{ and }\quad (A^{I,J})^\top \xi = \sigma \eta , \end{aligned}$$(12)
$$\begin{aligned} \xi>0 \quad \text{ and }\quad \eta >0, \end{aligned}$$(13)
$$\begin{aligned} \sum _{j\in J} a_{i,j}\,\eta _j\,\ge \, 0 \quad \text{ for all } i\in \langle m\rangle \backslash I, \end{aligned}$$(14)
$$\begin{aligned} \sum _{i\in I} a_{i,j}\,\xi _i\,\ge \, 0 \quad \text{ for all } j\in \langle n\rangle \backslash J. \end{aligned}$$(15)
Theorem 1 is stated in [6], but we recall it here for the reader’s convenience. Without loss of generality, the system (12)–(15) can be written with the extra conditions \(\Vert \xi \Vert =1\) and \(\Vert \eta \Vert =1\). In such a case, a Pareto singular triplet \((u,v,\sigma )\) of A is obtained by setting \(u_i=\xi _i\) if \(i\in I\), \( u_i=0\) if \(i\in \langle m\rangle \backslash I\), \(v_j=\eta _j\) if \(j\in J\), and \(v_j=0\) if \(j\in \langle n\rangle \backslash J\). Note that \(\Vert u\Vert =\Vert \xi \Vert \) and \(\Vert v\Vert =\Vert \eta \Vert \). Theorem 1 shows that
$$\begin{aligned} \Xi (A)\,=\bigcup _{(I,J)\,\in \, {\mathcal {J}}(m,n)} \Xi _{I,J}(A), \end{aligned}$$(16)
where
$$\begin{aligned} \Xi _{I,J}(A)=\left\{ \sigma \in \mathbb {R}: (12)\text{--}(15) \text{ holds for some } \xi \in {\mathbb {R}}^{|I|} \text{ and } \eta \in {\mathbb {R}}^{|J|}\right\} \end{aligned}$$(17)
is either empty or a singleton. To see that \(\Xi _{I,J}(A)\) cannot have more than one element, it suffices to write (12) as an eigenvalue problem
and observe that the Jordan–Wielandt matrix associated to \(A^{I,J}\) is symmetric. If \(\Xi _{I,J}(A)\) is nonempty, then the pair (I, J) is said to be successful for detecting a Pareto singular value of A. Beware that two different successful pairs, say \((I_0,J_0)\) and \((I_1,J_1)\), may yield the same \(\sigma \).
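For matrices of moderate size, the characterization (12)–(15) can be turned into a brute-force procedure: loop over the \((2^m-1)(2^n-1)\) pairs (I, J), extract the classical singular triplets of \(A^{I,J}\), keep those with strictly positive singular vectors, and test the slackness inequalities. The following Python sketch is only an illustration of this enumeration (the function name and the tolerance `tol` are our own choices; this is not the algorithm of Sect. 6):

```python
import numpy as np
from itertools import combinations

def pareto_singular_values(A, tol=1e-10):
    """Brute-force sketch of Theorem 1: for every pair (I, J) of
    nonempty index sets, look for a singular triplet of A[I, J]
    with strictly positive singular vectors, then check the
    slackness inequalities on the discarded rows and columns."""
    m, n = A.shape
    found = set()

    def subsets(d):
        return [I for r in range(1, d + 1)
                for I in combinations(range(d), r)]

    for I in subsets(m):
        for J in subsets(n):
            Ic = [i for i in range(m) if i not in I]
            Jc = [j for j in range(n) if j not in J]
            U, S, Vt = np.linalg.svd(A[np.ix_(I, J)])
            for k in range(len(S)):
                u0, v0, s0 = U[:, k], Vt[k, :], S[k]
                # a classical triplet generates four sign variants;
                # Pareto singular values may be negative
                for u, v, s in [(u0, v0, s0), (-u0, -v0, s0),
                                (u0, -v0, -s0), (-u0, v0, -s0)]:
                    if np.all(u > tol) and np.all(v > tol):
                        slack_rows = A[np.ix_(Ic, list(J))] @ v
                        slack_cols = A[np.ix_(list(I), Jc)].T @ u
                        if (np.all(slack_rows >= -tol)
                                and np.all(slack_cols >= -tol)):
                            found.add(round(s, 8))
    return sorted(found)
```

For instance, for the positive matrix \([[1,2],[3,4]]\) the routine returns nine values, in agreement with Theorem 2 below, and for a negative matrix it returns the single value \(-\sigma _{\textrm{max}}(A)\), in agreement with Proposition 6.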
Example 1
Consider the test example
Table 1 is obtained by working out the system (12)–(15) for each of the \(21=(2^2-1)(2^3-1)\) choices of (I, J). The Pareto singular value \(\sqrt{5}=2.236068\) appears twice in Table 1, because it is obtained with two distinct choices of (I, J). In this example, there are 7 successful pairs (I,J), but only 6 Pareto singular values. A Pareto singular value that is obtained with different choices of (I, J) is counted only once.
3 Pareto singular values of nonnegative matrices
If A is nonnegative, then the slackness inequalities (14)–(15) hold automatically and the set (17) depends only on the entries of the submatrix \(A^{I,J}\). In what follows, we use the notation \(B\unlhd A\) to indicate that B is a submatrix of A.
Proposition 3
Let A be nonnegative. Then \(\Xi (B)\subseteq \Xi (A)\) for all \( B\unlhd A\) and, in particular, each entry of A belongs to \(\Xi (A)\).
Proof
Let \(A\in \mathbb {M}_{m,n}\) be nonnegative. Consider a submatrix \(B=A^{I_0,J_0}\) with \((I_0,J_0)\in {\mathcal {J}}(m,n)\). Note that if \((I,J)\in {\mathcal {J}}(m,n)\) is such that \(I\subseteq I_0\) and \(J\subseteq J_0\), then
Let \(\sigma \in \Xi (B)\). By Theorem 1, there exists a pair (I, J) as just mentioned such that (12)–(13) holds for some pair \((\xi ,\eta )\) of vectors of appropriate dimensions. If we view \(B^{I,J}\) as a submatrix of A and recall that the slackness inequalities (14)–(15) can be dropped, then it is clear that \(\sigma \in \Xi (A)\). \(\square \)
Proposition 3 can be generalized to the case in which only some entries of A are nonnegative. Consider for instance a matrix
where the signs on the central block B and the four corner blocks are arbitrary. If the signs on the northern block N, southern block S, western block W, and eastern block E, are nonnegative, then \(\Xi (B)\subseteq \Xi (A)\). The same conclusion holds for a matrix A having any of the following forms:
Example 2
The matrix
contains (18) in the northwestern corner. Since
are nonnegative blocks, each Pareto singular value of (18) is a Pareto singular value of (19).
The next theorem is the main result of this section. We first establish two useful lemmas.
Lemma 1
\(A\in \mathbb {M}_{m,n}\) is nonnegative if and only if each Pareto singular value of A is nonnegative.
Proof
A matrix A is nonnegative if and only if \(\langle u,Av\rangle \ge 0\) for all \(u\ge 0\) and \(v\ge 0\). The latter condition is equivalent to saying that \(\varrho _{\textrm{min}}(A)\) is nonnegative. Formula (6) completes the job. \(\square \)
Let \(\lambda _{\textrm{max}}(E)\) be the largest eigenvalue of a symmetric matrix E. The spectral norm of a rectangular matrix A is given by
$$\begin{aligned} \sigma _{\textrm{max}}(A)\,=\, \left[ \lambda _{\textrm{max}}(A^\top A)\right] ^{1/2}. \end{aligned}$$
This notation is consistent with the fact that \(\sigma _{\textrm{max}}(A)\) is the largest singular value of A.
Lemma 2
Let A be a nonnegative matrix. Then \(\sigma _{\textrm{max}}(A)\) belongs to \(\,\Xi (A)\). Furthermore,
$$\begin{aligned} \sigma _{\textrm{max}}(A)=\varrho _{\textrm{max}}(A) \quad \text{ and }\quad \varrho _{\textrm{max}}(A)= \max \,\Xi (A). \end{aligned}$$(20)
Proof
Note that \(\sigma \le \varrho _{\textrm{max}}(A)\le \sigma _{\textrm{max}}(A)\) for all \(A\in \mathbb {M}_{m,n}\) and all \(\sigma \in \Xi (A)\). Less easy to prove is that \(\sigma _{\textrm{max}}(A)\in \Xi (A)\) when A is nonnegative. For each \(\varepsilon >0\), we set \(A_\varepsilon = A+ \varepsilon \, 1_m 1_n^\top \), where \(1_d\) is the d-dimensional vector of all ones. Since \(A_\varepsilon \) is positive, also \(A_\varepsilon ^\top A_\varepsilon \) is positive. Perron’s theorem ensures the existence of a positive unit vector \(v_\varepsilon \) such that
$$\begin{aligned} A_\varepsilon ^\top A_\varepsilon \, v_\varepsilon \,=\, \lambda _{\textrm{max}}(A_\varepsilon ^\top A_\varepsilon )\, v_\varepsilon . \end{aligned}$$(21)
Since \(\sigma _\varepsilon = [\lambda _{\textrm{max}}(A_\varepsilon ^\top A_\varepsilon )]^{1/2}= \sigma _{\textrm{max}}(A_\varepsilon )\) is positive, the vector
$$\begin{aligned} u_\varepsilon \,=\, \sigma _\varepsilon ^{-1}\, A_\varepsilon \, v_\varepsilon \end{aligned}$$(22)
is well defined and positive. By combining (21)–(22) and the definition of \(\sigma _\varepsilon \), we get
Hence, \((u_\varepsilon ,v_\varepsilon )\) is a pair of positive unit vectors satisfying \(A_\varepsilon v_\varepsilon =\sigma _\varepsilon u_\varepsilon \) and \(A_\varepsilon ^\top u_\varepsilon =\sigma _\varepsilon v_\varepsilon \). This proves that \(\sigma _{\textrm{max}}(A_\varepsilon )\in \Xi (A_\varepsilon )\) for all \(\varepsilon >0\). But the function \(\sigma _{\textrm{max}}: \mathbb {M}_{m,n}\rightarrow \mathbb {R}\) is continuous and the multivalued map \(\Xi \) has a closed graph, cf. Proposition 2. So, by letting \(\varepsilon \rightarrow 0\), we obtain \(\sigma _{\textrm{max}}(A)\in \Xi (A)\). \(\square \)
The second formula in (20) looks similar to (6). However, (6) holds for all A, whereas the second formula in (20) may be false if A contains a negative entry. The next theorem shows that computing the set \(\Xi (A)\) for a nonnegative matrix A is essentially a matter of evaluating the spectral norm \(\sigma _{\textrm{max}}(B)\) of each submatrix B of A. If the size of B is reasonable, then \(\sigma _{\textrm{max}}(B)\) can be computed quickly and accurately.
Theorem 2
\(A\in \mathbb {M}_{m,n}\) is nonnegative if and only if \( \,\Xi (A) = \{\sigma _{\textrm{max}}(B): B\unlhd A\}\).
Proof
For avoiding trivialities, we suppose that \(\max \{m,n\}\ge 2\). Each element of the set
$$\begin{aligned} \Gamma (A)\,=\, \left\{ \sigma _{\textrm{max}}(B): B\unlhd A\right\} \end{aligned}$$(23)
is nonnegative. If \(\Xi (A)=\Gamma (A)\), then A is nonnegative by Lemma 1. Conversely, suppose that A is nonnegative. For proving the inclusion \(\Xi (A)\subseteq \Gamma (A)\), we pick any \(\sigma \in \Xi (A)\). Since A is nonnegative, we know that \(\sigma \ge 0\), cf. Lemma 1. By Theorem 1, there exists \((I,J)\in {\mathcal {J}}(m,n)\) such that (12)–(13) holds for some pair \((\xi ,\eta )\) of unit vectors. Hence, \(M\eta = \sigma ^2 \eta \) with \(M= (A^{I,J})^\top A^{I,J}\). Let \(\lambda \) be the largest eigenvalue of the symmetric matrix M. Since M is nonnegative and positive semidefinite, there exists a nonnegative unit vector \(\gamma \in \mathbb {R}^{\vert J\vert }\) such that \(M\gamma =\lambda \gamma .\) Since \( \gamma \) belongs to the Pareto cone of the space \(\mathbb {R}^{\vert J\vert }\) and \(\eta \) belongs to the interior of the same cone, these eigenvectors of M cannot be orthogonal. Hence, \( \sigma ^2=\lambda \) and \(\sigma = \sqrt{\lambda }= \sigma _{\textrm{max}}(A^{I,J})\) belongs to \(\Gamma (A)\). For proving the reverse inclusion \(\Gamma (A)\subseteq \Xi (A)\), we pick any \(B\unlhd A\) and show that \(\sigma = \sigma _{\textrm{max}}(B)\) belongs to \(\Xi (A)\). We suppose that B is not of size \(1\times 1\), otherwise we are done because each entry of A belongs to \(\Xi (A)\). Since B is nonnegative, Lemma 2 implies that \(\sigma \in \Xi (B)\). By using Proposition 3, we deduce that \(\sigma \in \Xi (A)\). \(\square \)
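For a nonnegative matrix, Theorem 2 reduces the computation of \(\Xi (A)\) to spectral norms of submatrices. A compact sketch of this reduction (an illustration of ours, with a hypothetical function name):

```python
import numpy as np
from itertools import combinations

def xi_of_nonnegative(A):
    """Xi(A) for an entrywise nonnegative A, via Theorem 2:
    the set of spectral norms of all submatrices of A."""
    m, n = A.shape

    def subsets(d):
        # all nonempty index sets, 2^d - 1 of them
        return [I for r in range(1, d + 1)
                for I in combinations(range(d), r)]

    return sorted({round(np.linalg.norm(A[np.ix_(I, J)], 2), 8)
                   for I in subsets(m) for J in subsets(n)})
```

For the positive \(2\times 2\) matrix \([[1,2],[3,4]]\), the \((2^2-1)(2^2-1)=9\) spectral norms are pairwise distinct, so this matrix is PSV-saturated in the sense of Sect. 4.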
Example 3
Consider a nonnegative matrix of rank-one, that is to say, a matrix of the form \(pq^\top \), where \(p\in \mathbb {P}_m\) and \(q\in \mathbb {P}_n\) are nonzero vectors. Note that
$$\begin{aligned} \sigma _{\textrm{max}}\!\left( (pq^\top )^{I,J}\right) \,=\, \Vert p_I\Vert \, \Vert q_J\Vert , \end{aligned}$$
where \(p_I\) is the vector formed with the components of p indexed by I. The definition of \(q_J\) is analogous. Hence, Theorem 2 yields \(\,\Xi (pq^\top )= \left\{ \Vert p_I\Vert \, \Vert q_J\Vert : (I,J)\in {\mathcal {J}}(m,n)\right\} \).
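The formula of Example 3 is easy to evaluate numerically. In the following sketch, the vectors p and q are hypothetical data chosen only for illustration:

```python
import numpy as np
from itertools import combinations

# Hypothetical nonnegative vectors, for illustration only.
p = np.array([1.0, 2.0])
q = np.array([1.0, 3.0, 5.0])

def subset_norms(x):
    """Norms ||x_I|| over all nonempty index sets I."""
    return [np.linalg.norm(x[list(I)])
            for r in range(1, len(x) + 1)
            for I in combinations(range(len(x)), r)]

# Xi(p q^T) = { ||p_I|| * ||q_J|| : (I,J) in J(m,n) } by Example 3
xi = sorted({round(a * b, 10)
             for a in subset_norms(p) for b in subset_norms(q)})
```

For these particular p and q, the \(21=(2^2-1)(2^3-1)\) products turn out to be pairwise distinct, so \(pq^\top \) has the maximum possible number of Pareto singular values; the largest one is \(\Vert p\Vert \,\Vert q\Vert = \sigma _{\textrm{max}}(pq^\top )\).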
Let \(\mathbb {M}_{m,n}\) be equipped with the Frobenius inner product \(\langle Y, X\rangle = \textrm{tr}( Y^\top X)\) and let \({\mathcal {N}}_{m,n}\) be the closed convex cone of nonnegative matrices of size \(m\times n\). The next corollary is a consequence of Theorem 2, but it can be proven also in a direct way.
Corollary 1
Let \(A\in \mathbb {M}_{m,n}\) be nonnegative. Then
$$\begin{aligned} \min \,\Xi (A) \end{aligned}$$
corresponds to the distance from A to the boundary of \({\mathcal {N}}_{m,n}\).
Consider again the case of a general matrix A. The set \(\Gamma (A)\) introduced in (23) collects the spectral norms of all submatrices of A. If A contains at least one negative entry, then \(\Xi (A)\) and \(\Gamma (A)\) can in fact be quite different. By way of example, consider the matrix (18) whose Pareto singular values are displayed in Table 1. Table 2 shows the spectral norm of each of the 21 submatrices of (18). For ease of visualization, we display these values in decreasing order. After removing repetitions, we get 18 different values. In this example, \(\Xi (A)\) and \(\Gamma (A)\) are not only different, but they have also different cardinality.
For subsequent use, we introduce the following natural concept.
Definition 2
A matrix A is called spectral-norm scattered if no two submatrices of A have the same spectral norm.
Definition 2 makes sense even if A admits negative entries. By way of example, the matrix (18) is not spectral-norm scattered because the submatrices \( \left[ \begin{array}{ccc} 1&\quad 2 \end{array} \right] \) and \( \left[ \begin{array}{ccc} 2&\quad 1 \end{array} \right] \) have the same spectral norm. Random matrices following an absolutely continuous probability distribution are usually spectral-norm scattered. Topologically speaking,
$$\begin{aligned} \textrm{SNS}(m,n)\,=\, \left\{ A\in \mathbb {M}_{m,n}: A \text{ is spectral-norm scattered}\right\} \end{aligned}$$
is an open set in \(\mathbb {M}_{m,n}\). The proof of this fact will be incorporated in Proposition 4.
4 Counting Pareto singular values
In general, \(\Sigma (A)\) and \(\Xi (A)\) are non-comparable with respect to set inclusion, i.e., neither set is included in the other. The particular example
shows that classical and Pareto singular values may form disjoint sets. A matrix A of size \(m\times n\) has at most \(2\min \{m,n\}\) singular values. The factor 2 is because negative singular values are also counted. On the other hand, it is proven in [6, Theorem 4] that
$$\begin{aligned} \max _{A\in \mathbb {M}_{m,n}}\, \textrm{card}[\,\Xi (A)] \end{aligned}$$(24)
grows exponentially in n and m separately. More precisely,
$$\begin{aligned} \max _{A\in \mathbb {M}_{m,n}}\, \textrm{card}[\,\Xi (A)]\,=\,(2^m-1)\,(2^n-1). \end{aligned}$$(25)
The right-hand side of (25) corresponds to the cardinality of \({\mathcal {J}}(m,n)\) or, equivalently, to the number of submatrices of a matrix of size \(m\times n\). A matrix attaining the maximum in (24) is called PSV-saturated. The next proposition characterizes the set
of PSV-saturated matrices of size \(m\times n\). A set in a vector space is said to be positively homogeneous if it is stable under multiplication by positive scalars.
Proposition 4
Let \(\max \{m,n\}\ge 2\). Then
Furthermore, this set is open and positively homogeneous.
Proof
\(A\in \mathbb {M}_{m,n}\) is PSV-saturated if and only if the following two conditions hold true:
This observation is due to (16) and the fact that each \(\Xi _{I,J}(A)\) is either empty or a singleton. Let \( A\in \mathfrak {S}(m,n)\). We claim that A is a nonnegative matrix. Suppose, to the contrary, that A has a negative entry. By Proposition 1 (b), we may consider the case \(a_{1, 1}<0\). By Proposition 1 (a) and the fact that \(\max \{m,n\}\ge 2\), we may suppose for instance that \(n\ge 2\). In such a case, the condition (27) is false for the choice \((I, J)= (\{1\},\{2\})\), because the slackness inequality (15) is violated for \(j=1\). We deduce that A is nonnegative. In such a case, \(\Xi _{\{i\},\{j\}}(A)= \{a_{i,j}\}\) for all \(i\in \langle m\rangle \) and \(j\in \langle n\rangle \). This implies that the entries of A are all distinct, otherwise (28) is violated. We now claim that A is positive. Suppose, to the contrary, that A has a zero entry. Such a zero entry is necessarily unique and, without loss of generality, we may take \(a_{1, 1}=0\). As before, we consider the case \(n\ge 2\). For the choice \((I,J)= (\{1\},\{1,2\})\), the system (12) becomes
These equations being unsolvable for positive variables \(\xi _1\), \(\eta _1\), and \(\eta _2\), it follows that \(\Xi _{I,J}(A)= \emptyset \), contradicting (27). In conclusion, A is positive as claimed. In such a case, \(\Xi _{I,J}(A)=\left\{ \sigma _{\textrm{max}}(A^{I,J})\right\} \) for all \((I,J)\in {\mathcal {J}}(m,n)\) and condition (28) can be written as
This implication means that A is spectral-norm scattered. We have shown in this way that the first set in (26) is included in the second. The reverse inclusion is a consequence of Theorem 2. That \(\mathfrak {S}(m,n)\) is positively homogeneous follows from Proposition 1 (a). Finally, we prove that \(\mathfrak {S}(m,n)\) is open. The set of positive matrices of size \(m\times n\) is clearly open. For showing that \(\textrm{SNS}(m,n)\) is open, we proceed as follows. For each \((I,J)\in {\mathcal {J}}(m,n)\), let \(\pi _{I,J}:\mathbb {M}_{m,n}\rightarrow \mathbb {M}_{m,n}\) be the linear map given by
$$\begin{aligned} \left[ \pi _{I,J}(A)\right] _{i,j}= \left\{ \begin{array}{ll} a_{i,j} &{} \text{ if } (i,j)\in I\times J,\\ 0 &{} \text{ otherwise. } \end{array}\right. \end{aligned}$$
One can check that \(\sigma _{\textrm{max}}( A^{I,J})= \sigma _{\textrm{max}} (\pi _{I,J}(A))\). Hence,
is expressible as an intersection of finitely many open sets. \(\square \)
Constructing PSV-saturated matrices is not difficult. By way of example,
belongs to \(\mathfrak {S}(3,4)\) because it is a matrix of size \(3\times 4\) with as many as \(105=(2^3-1)(2^4-1)\) Pareto singular values. The next proposition considers the case of arbitrary dimensions m and n.
Proposition 5
An example of PSV-saturated matrix of size \(m\times n\) is the rank-one matrix
$$\begin{aligned} A\,=\, \left( \pi ^{\,i}\, e^{\,j\pi }\right) _{i\in \langle m\rangle ,\, j\in \langle n\rangle }. \end{aligned}$$(29)
Proof
We suppose that \(\max \{m,n\}\ge 2\), otherwise the result is trivial. Let \(p=(s,s^2,\ldots ,s^m)^\top \) and \(q=(t,t^2,\ldots ,t^n)^\top \), where s and t are real numbers to be chosen in a convenient way. Let \(A=pq^\top \). Pick distinct pairs \((I_0, J_0)\) and \( (I_1,J_1)\) in \({\mathcal {J}}(m,n)\) and consider the bivariate polynomial
$$\begin{aligned} \Phi (s,t)\,=\, \Big ( \sum _{i\in I_0} s^{2i}\Big ) \Big ( \sum _{j\in J_0} t^{2j}\Big ) \,-\, \Big ( \sum _{i\in I_1} s^{2i}\Big ) \Big ( \sum _{j\in J_1} t^{2j}\Big ), \end{aligned}$$
which is not identically zero. Since all the coefficients of \(\Phi \) are integers, this polynomial does not vanish at a pair (s, t) of algebraically independent numbers. Hence, if we take for instance \((s,t)=(\pi ,e^{\pi })\), then we get a matrix A that is spectral-norm scattered. This proves that (29) belongs to \(\mathfrak {S}(m,n)\). \(\square \)
Remark 1
The above proof uses the fact that \(\pi \) and \(e^\pi \) are algebraically independent. We could have chosen any other pair of algebraically independent positive numbers. It is worthwhile mentioning that all matrices in a small neighbourhood of (29) are still in \(\mathfrak {S}(m,n)\). Also, any positive multiple of (29) remains in \(\mathfrak {S}(m,n)\).
In a sense, the opposite of being PSV-saturated is to have a unique Pareto singular value.
Proposition 6
A negative matrix A has \(-\sigma _{\textrm{max}}(A)\) as unique Pareto singular value.
Proof
We suppose that \(\max \{m,n\}\ge 2\), otherwise the proposition is trivial. Let A be negative. The slackness conditions (14)–(15) are violated, unless \((I,J)= (\langle m\rangle ,\langle n\rangle )\). For such a choice of pair (I, J), the system (12)–(15) reduces to
$$\begin{aligned} Av = \sigma u, \qquad A^\top u = \sigma v, \qquad u>0, \quad v>0. \end{aligned}$$
Since A is negative, the scalar \(\sigma \) is negative. Note that \(A^\top A\) is a positive square matrix. Perron’s theorem applied to the eigenvalue problem \(A^\top A v=\sigma ^2 v\) implies that \(\sigma ^2=\lambda _{\textrm{max}}(A^\top A)\) and, a posteriori, \(\sigma =-\sigma _{\textrm{max}}(A)\). \(\square \)
In applications, A usually has both positive and negative entries. The presence of one or several negative entries drastically reduces the number of Pareto singular values. This is due to two reasons: firstly, for a number of pairs (I, J), the system (12)–(13) could be unsolvable. In other words, \(A^{I,J}\) may not have a singular value with positive singular vectors. And, secondly, even if (12)–(13) were solvable, it may happen that the slackness inequalities (14)–(15) are violated. Table 3 reports on a numerical experiment with randomly generated matrices of different dimensions. For each pair (m, n), we generated a sample \(\{A^{(1)},\ldots , A^{(N)}\}\) of size \(N=10^6\) from a Gaussian random matrix \(\textbf{A} \rightsquigarrow [{\mathcal {N}}(0,1)]^{m\times n}.\) This notation means that each entry of \(\textbf{A}\) follows a standard normal distribution. Table 3 displays the empirical mean \( (1/N) \sum _{k=1}^N\textrm{card}[\,\Xi (A^{(k)})\,] \) of the random variable \(\textrm{card}[\,\Xi (\textbf{A})]\). Each figure is rounded to 2 decimal places, but the second decimal place is not trustworthy (repeating the same experiment with another sample of size \(N=10^6\) may yield a slightly different value). Table 3 is symmetric with respect to n and m, so the portion below the diagonal is left blank. As one can see, the expected number of Pareto singular values in a random Gaussian matrix of size \(m\times n\) grows rather slowly with respect to m and n. This observation is consistent with the fact that a typical realization of \(\textbf{A}\) has a certain number of negative entries. By way of example, a Gaussian random matrix of size \(5\times 8\) has 9.11 Pareto singular values on average. This is far less than the number \(7905= (2^5-1)(2^8-1)\) of Pareto singular values in a PSV-saturated matrix of the same size.
5 Applications
5.1 Nash equilibria of bilinear forms on Pareto cones
As a first application of the concept of Pareto singular triplet of a matrix A, we consider the problem of finding a pair \(({\bar{u}}, {\bar{v}})\) such that
$$\begin{aligned} {\bar{u}}\in S_m,\; {\bar{u}}\ge 0, \qquad {\bar{v}}\in S_n,\; {\bar{v}}\ge 0, \end{aligned}$$(30)
$$\begin{aligned} \langle {\bar{u}},A {\bar{v}}\rangle \,\le \, \langle u,A {\bar{v}}\rangle \quad \text{ for all } u\in S_m,\, u\ge 0, \end{aligned}$$(31)
$$\begin{aligned} \langle {\bar{u}},A {\bar{v}}\rangle \,\le \, \langle {\bar{u}},A v\rangle \quad \text{ for all } v\in S_n,\, v\ge 0. \end{aligned}$$(32)
A pair \(( {\bar{u}}, {\bar{v}})\) as above is called a Nash equilibrium of (4) and the term \(\langle {\bar{u}},A {\bar{v}}\rangle \) is called a Nash value of (4). Satisfying (30)–(32) is stronger than being a critical pair of (4), but it is weaker than being a solution to (4). In other words, being a Nash equilibrium is a property that lies between optimality and criticality. The next result is less evident.
Proposition 7
Suppose that \(A\in \mathbb {M}_{m,n}\) has at least one negative entry. Let \(({\bar{u}}, {\bar{v}}, {\bar{\sigma }} )\) be a Pareto singular triplet of A with \({\bar{\sigma }} <0\). Then \(( {\bar{u}}, {\bar{v}})\) is a Nash equilibrium of (4).
Proof
We start by proving that the minimization problem
$$\begin{aligned} \mathop {\textrm{minimize}}\limits _{u\in S_m,\; u\ge 0}\; \langle u,A{\bar{v}}\rangle \end{aligned}$$(33)
admits \({\bar{u}}\) as unique solution. Let \({\tilde{u}}\) be any solution to (33). We must show that \({\tilde{u}} ={\bar{u}}\). Since \(({\bar{u}}, {\bar{v}}, {\bar{\sigma }} )\) solves the system (5), we have in particular
$$\begin{aligned} (A{\bar{v}})_i \,=\, {\bar{\sigma }}\, {\bar{u}}_i \quad \text{ for all } i\in I, \end{aligned}$$(34)
$$\begin{aligned} (A{\bar{v}})_i \,\ge \, 0 \quad \text{ for all } i\in \langle m\rangle \backslash I, \end{aligned}$$(35)
where \(I=\{i\in \langle m\rangle : {\bar{u}}_i>0\}\). Since \({\tilde{u}}\) is a solution to (33), we have \(\Vert {\tilde{u}}\Vert =1\) and
$$\begin{aligned} 0\le {\tilde{u}} \perp (A{\bar{v}}-{\tilde{\sigma }}\, {\tilde{u}})\ge 0 \end{aligned}$$(36)
for some real number \({\tilde{\sigma }}\) (viewed as a Lagrange multiplier associated to the equality constraint \(\Vert u\Vert ^2-1=0\)). By using (36), we get \({\tilde{\sigma }}=\langle {\tilde{u}},A{\bar{v}}\rangle \le \langle {\bar{u}},A{\bar{v}}\rangle ={\bar{\sigma }} <0\) and
$$\begin{aligned} (A{\bar{v}})_i \,=\, {\tilde{\sigma }}\, {\tilde{u}}_i \quad \text{ for all } i\in {\tilde{I}}, \end{aligned}$$(37)
$$\begin{aligned} (A{\bar{v}})_i \,\ge \, 0 \quad \text{ for all } i\in \langle m\rangle \backslash {\tilde{I}}, \end{aligned}$$(38)
where \({\tilde{I}}=\{i\in \langle m\rangle : {\tilde{u}}_i>0\}\). If \(i_0\in I\,\backslash {\tilde{I}}\), then the combination of (34) and (38) yields
which is a contradiction. Similarly, if \(i_0\in {\tilde{I}}\,\backslash I\), then the combination of (35) and (37) yields
which is again a contradiction. Hence, \(I={\tilde{I}}\). We get \({\bar{\sigma }} {\bar{u}}_i={\tilde{\sigma }} {\tilde{u}}_i\) for all \(i\in I\) and \({\bar{u}}_i= {\tilde{u}}_i=0 \) for all \(i \in \langle m\rangle \backslash I\). Thus, \({\bar{\sigma }} {\bar{u}}={\tilde{\sigma }} {\tilde{u}}.\) Since \({\bar{u}} \) and \({\tilde{u}}\) are unit vectors and \({\tilde{\sigma }}\le {\bar{\sigma }} <0\), we deduce that \({\tilde{\sigma }}={\bar{\sigma }}\) and \({\tilde{u}}= {\bar{u}}\). In a similar way, one can prove that the minimization problem
$$\begin{aligned} \mathop {\textrm{minimize}}\limits _{v\in S_n,\; v\ge 0}\; \langle {\bar{u}},Av\rangle \end{aligned}$$
admits \({\bar{v}}\) as unique solution. In particular, \(({\bar{u}}, {\bar{v}})\) is a Nash equilibrium of (4). \(\square \)
Corollary 2
Suppose that \(A\in \mathbb {M}_{m,n}\) has at least one negative entry. Then each negative Pareto singular value of A is a Nash value of (4).
5.2 Nonnegative matrix factorization
Let X be a data matrix of size \(m\times n\). For instance, the columns of X may correspond to n realizations of an m-dimensional random vector. Nonnegative matrix factorization (NMF) is a reduction strategy which consists in approximating X by the product of two low-rank nonnegative matrices. Given a small integer r, the issue at hand is that of finding a pair \((P,Q)\in \mathbb {M}_{m,r}\times \mathbb {M}_{n,r}\) that solves the minimization problem
$$\begin{aligned} \mathop {\textrm{minimize}}\limits _{P\ge 0,\; Q\ge 0}\; \Vert X- P\,Q^\top \Vert \end{aligned}$$(39)
where \(\Vert \cdot \Vert \) is the Frobenius norm on \(\mathbb {M}_{m,n}\). NMF is of special relevance when X is a nonnegative matrix, but it makes sense also for any data matrix. General information on (39) can be found in Gillis and Glineur [1], Tolić et al. [7, Section 2.1], and references therein. We briefly discuss the particular case \(r=1\), which corresponds to the rank-one NMF problem
$$\begin{aligned} \mathop {\textrm{minimize}}\limits _{p\in \mathbb {P}_m,\; q\in \mathbb {P}_n}\; \Vert X- p\,q^\top \Vert . \end{aligned}$$(40)
We assume that X has at least one positive entry, otherwise \((p,q)=(0,0)\) is a solution to (40) and we are done. Note that p and q are not required to be unit vectors. The next theorem shows that solving (40) boils down to solving (4), and vice versa.
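The rank-one problem (40) can also be attacked directly by alternating minimization over p and q, since each half-step has a closed-form nonnegative least-squares solution. A minimal sketch of this scheme (our own illustration, in the spirit of the alternating methods of Sect. 6, not necessarily the algorithm studied there):

```python
import numpy as np

def rank_one_nmf(X, iters=500, seed=0):
    """Alternating minimization sketch for
    min ||X - p q^T||  over  p >= 0, q >= 0.
    Each half-step is the exact nonnegative least-squares
    update for one factor, the other being frozen."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    q = rng.random(n) + 0.1          # positive initial guess
    p = np.zeros(m)
    for _ in range(iters):
        # argmin over p >= 0 of ||X - p q^T||, coordinatewise
        p = np.maximum(X @ q, 0.0) / (q @ q)
        if p @ p == 0:
            break                    # X has no positive entry
        # argmin over q >= 0 of ||X - p q^T||, coordinatewise
        q = np.maximum(X.T @ p, 0.0) / (p @ p)
        if q @ q == 0:
            break
    return p, q
```

When X itself is a nonnegative rank-one matrix, the scheme recovers it exactly after a couple of iterations; in general it only produces a critical pair, in line with the nonconvexity of (40).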
Theorem 3
Let X be a data matrix with at least one positive entry and let \(A=-X\). Then the optimal values of (4) and (40) are related by the formula
Furthermore, the optimization problems (4) and (40) are equivalent in the following sense:
(a) If \((u_0, v_0)\) is a solution to (4), then
$$\begin{aligned} (p_0,q_0)=\left( \sqrt{-\varrho _{\textrm{min}}(A)} \,u_0,\sqrt{-\varrho _{\textrm{min}}(A)} \,v_0\right) \end{aligned}$$
(41)
is a solution to (40).
(b) If \((p_1, q_1)\) is a solution to (40), then \((u_1,v_1)= \left( \Vert p_1\Vert ^{-1}\,p_1,\,\Vert q_1\Vert ^{-1}\,q_1\right) \) is a solution to (4).
Proof
Consider the problem (40). If we incorporate a nonnegative factor \(\sigma \) in front of \(pq^\top \), then we may assume that p and q are unit vectors. Hence, we write (40) in the equivalent form
The identity \(\Vert X- \sigma uv^\top \Vert ^2= \Vert X\Vert ^2 -2 \sigma \langle u,Xv\rangle + \sigma ^2\Vert u\Vert ^2\,\Vert v\Vert ^2\) yields
where the minimization problem between brackets is precisely (4). Since A has at least one negative entry, \(\varrho _{\textrm{min}}(A)\) is negative. Hence, \(\sigma _0= -\varrho _{\textrm{min}}(A)\) is the unique minimizer of h on \(\mathbb {R}_+\). We have shown that if \((u_0,v_0)\) is a solution to (4), then \((u_0,v_0,\sigma _0)\) is a solution to (42). In particular,
and the pair (41) is a solution to (40). Note also that
Conversely, let \((p_1, q_1)\) be a solution to (40). We claim that \(p_1\not =0\) and \(q_1\not =0\). If the claim is false, then \(p_1q_1^\top =0\) and \( \Vert X-pq^\top \Vert \ge \Vert X\Vert \,\) for all \(p\ge 0\) and \( q\ge 0\). In particular,
for all \(t>0\). This leads to
and contradicts that X has a positive entry. Hence, the claim is true and \(u_1=\Vert p_1\Vert ^{-1} \,p_1\) and \(v_1=\Vert q_1\Vert ^{-1}\,q_1\) are well defined. If \(s_1=\Vert p_1\Vert \,\Vert q_1\Vert \), then \( {\bar{p}} =\sqrt{s_1}\, u_1\) and \({\bar{q}}=\sqrt{s_1}\,v_1\) are nonnegative vectors such that
Hence,
and the triplet \((u_1, v_1, s_1)\) is a solution to (42). A posteriori, \((u_1, v_1)\) is a solution to (4). \(\square \)
6 Algorithmic issues
An ambitious endeavor is to determine all the critical values of the minimization problem (4) and, for each critical value \(\sigma \), to display an associated critical pair (u, v). Such a goal is reasonable only if A is of moderate size. The only exhaustive method at our disposal is the utilization of Theorem 1. We must solve (12)–(15) for each one of the \((2^m-1)(2^n-1)\) choices of (I, J). In order to have a rough idea of the CPU time needed for determining \(\Xi (A)\) as a function of m and n, we have worked out the case of a test matrix \(A\in \mathbb {M}_{m,n}\) that contains positive and negative entries, namely,
Table 4 suggests that the cost of evaluating \(\Xi (A)\) grows exponentially in m and n, separately. The figures shown in Table 4 correspond to CPU times measured in seconds. Our numerical experiments were carried out on macOS Catalina with a 3.4 GHz Intel Core i5 processor and 8 GB of RAM; the codes were implemented in Matlab R2017a. If we deal with a nonnegative matrix A, then we can dispense with the slackness inequalities and proceed directly to the computation of the spectral norm of each submatrix of A. As a test example of a nonnegative matrix \(A\in \mathbb {M}_{m,n}\), we worked out the case
As shown in Table 5, the CPU time needed for determining \(\Xi (A)\) still exhibits exponential growth with respect to n and m. Such CPU times are however significantly smaller than those displayed in Table 4. The fact that (44) is a nonnegative matrix is therefore useful information.
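The exhaustive procedure for a nonnegative matrix can be sketched in a few lines. The following Python fragment (a sketch; the helper name `pareto_singular_candidates` is ours) simply collects the spectral norms \(\Vert A_{IJ}\Vert \) over all \((2^m-1)(2^n-1)\) choices of nonempty index sets, which is the computation described above:

```python
import itertools
import numpy as np

def pareto_singular_candidates(A, decimals=8):
    """Collect the spectral norms ||A_{IJ}|| over all nonempty I, J.

    For a nonnegative matrix A these are the values produced by the
    exhaustive method: the slackness inequalities can be skipped and
    only the (2^m - 1)(2^n - 1) spectral norms need to be computed."""
    m, n = A.shape
    values = set()
    for r in range(1, m + 1):
        for I in itertools.combinations(range(m), r):
            for s in range(1, n + 1):
                for J in itertools.combinations(range(n), s):
                    sub = A[np.ix_(I, J)]
                    # ord=2 gives the largest singular value of sub
                    values.add(float(round(np.linalg.norm(sub, 2), decimals)))
    return sorted(values)

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
print(pareto_singular_candidates(A))   # [0.0, 1.0, 2.0]
```

The quadruple loop makes the exponential growth observed in Table 5 plain: each increment of m or n roughly doubles the number of submatrices to process.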
6.1 Alternating minimization
We come back to the case of a general matrix A. If the dimensions m and n are large, then determining the entire set \(\Xi (A)\) is not a realistic goal. For instance, computing all the Pareto singular values of (43) when \((m,n)= (12,20)\) took us almost 59 hours. More reasonable goals are:
(i) To determine the smallest Pareto singular value \(\varrho _{\textrm{min}}(A)\) and, in tandem, to find a solution (u, v) to the minimization problem (4).
(ii) To find one or a handful of Pareto singular values of A.
(iii) To find a negative Pareto singular value of A, knowing that A has at least one negative entry. By Proposition 7, such a Pareto singular value is a Nash value of (4).
We address the third goal. We saw that solving (4) is essentially the same task as solving (40). For the sake of convenience, we introduce the function \(f:\mathbb {R}^m\times \mathbb {R}^n\rightarrow \mathbb {R}\) given by
Note that if p and q are nonzero vectors, then \(f(p,\cdot \,)\) and \(f(\,\cdot , q)\) are strictly convex quadratic functions. The rank-one NMF problem (40) is about minimizing f(p, q) with respect to \(p\ge 0\) and \(q\ge 0\) simultaneously. As in Sect. 5, we assume that X contains at least one positive entry, in which case both components of a solution \((p_*, q_*)\) are nonzero vectors. Any solution \((p_*, q_*)\) to (40) is a Nash equilibrium of (40) in the sense that \(p_*\) and \(q_*\) are nonnegative vectors such that
Since (45)–(46) is about minimizing f(p, q) with respect to \(p\ge 0\) and \(q\ge 0\) separately, a natural way of handling (40) is to use the alternating minimization algorithm
Alternating minimization algorithms for solving unconstrained NMF problems have been already considered in the literature, cf. Gillis and Glineur [1]. Given the special form of f and the nonnegativity constraints on p and q, we have
where \(\textrm{Proj}(w,K)\) denotes the metric projection of w onto K and the maximum of two vectors is taken componentwise. Hence, (47) can be written in the more explicit form
Such recursion formulas are mentioned in [1]. We initialize the algorithm (47) at a convenient point \(p^0\in \mathbb {R}^m\). A fundamental criterion for choosing \(p^0\) is to ensure that the sequences \(\{p^k\}_{k\ge 1}\) and \(\{q^k\}_{k\ge 1}\) are well defined. This issue is addressed in the first part of Lemma 3. The second part of Lemma 3 is a useful boundedness result.
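A minimal implementation of the alternating scheme might look as follows. This Python sketch (the name `alternating_nmf_rank1` is ours) assumes the explicit step formulas obtained from \(f(p,q)=\Vert X-pq^\top \Vert ^2\): minimizing over one nonnegative block with the other fixed gives \(q \leftarrow \max (X^\top p,0)/\Vert p\Vert ^2\) and \(p \leftarrow \max (Xq,0)/\Vert q\Vert ^2\), the projections onto \(\mathbb {P}_n\) and \(\mathbb {P}_m\) being componentwise maxima with zero:

```python
import numpy as np

def alternating_nmf_rank1(X, p0, tol=1e-7, max_iter=10_000):
    """Rank-one NMF by alternating minimization of f(p,q) = ||X - p q^T||_F^2.

    Each half-step minimizes f over one nonnegative block with the other
    fixed; since the Hessian of each subproblem is a positive multiple of
    the identity, the projected minimizer is max(., 0) of the unconstrained
    one.  Well-definedness of the iterates (no division by zero) relies on
    the initialization assumption of Lemma 3."""
    p = p0.astype(float)
    for _ in range(max_iter):
        q = np.maximum(X.T @ p, 0.0) / (p @ p)    # argmin_{q >= 0} f(p, q)
        p_new = np.maximum(X @ q, 0.0) / (q @ q)  # argmin_{p >= 0} f(p, q)
        if np.linalg.norm(p_new - p) <= tol * max(1.0, np.linalg.norm(p)):
            return p_new, q
        p = p_new
    return p, q

# Initialization: p0 >= 0 with <p0, x_j> > 0 for some column x_j of X,
# so that X^T p0 has a positive component and the iterates stay nonzero.
X = np.array([[ 2.0, -1.0],
              [-1.0,  3.0]])
p0 = np.array([0.0, 1.0])          # <p0, x_2> = 3 > 0
p_star, q_star = alternating_nmf_rank1(X, p0)
sigma_star = -np.linalg.norm(p_star) * np.linalg.norm(q_star)  # candidate Nash value of (4)
```

On this \(2\times 2\) example the iteration reaches the fixed point \(p_*=(0,1)\), \(q_*=(0,3)\) immediately, giving \(\sigma _*=-3\).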
Lemma 3
Let X be a matrix of size \(m\times n\) with at least one positive entry. Let (47) be initialized at a point \(p^0\in {\mathbb {P}}_m\) such that \(-X^\top p^0\notin {\mathbb {P}}_n\). Then:
(a) \(p^k\not =0\) and \(q^k \not =0\) for all \(\,k\ge 1\).
(b) \(\Vert p^k\Vert \,\Vert q^k\Vert \) remains bounded as \(k\rightarrow \infty \).
Proof
Part (a). Since X has a positive entry, also \(X^\top \) has a positive entry and the sets
are nonempty. These sets are positively homogeneous and do not contain the origin. Note that the initial point \(p^0\) belongs to \({\mathcal {P}}\). We claim that \( p^k\in {\mathcal {P}}\) and \(q^k\in {\mathcal {Q}}\) for all \(k\ge 1\). For proving this claim, it suffices to show the following implications:
Let \(p\in {\mathcal {P}}\). Since \(z= \textrm{Proj}\,(X^\top p,\mathbb {P}_n )\) belongs to \(\mathbb {P}_n\), we just need to check that \(-Xz\notin {\mathbb {P}}_m\). We know that \(-X^\top p\notin {\mathbb {P}}_n\). Hence, \( r=\textrm{card}\{j\in \langle n \rangle :(X^\top p)_j>0\}\ge 1\). Permuting the columns of X if necessary, we may assume that \(X=[Y\,\vert \, W]\), where \(Y\in \mathbb {M}_{m, r}\) and \(W\in \mathbb {M}_{m, n-r}\) are such that
Note that
Suppose, to the contrary, that \(-Xz\in {\mathbb {P}}_m\). In such a case,
Since \(p\in {\mathbb {P}}_m\), condition (53) yields \(\left\| Y^\top p\right\| ^2= \langle p, Y Y^\top p\rangle \le 0.\) Hence, \(Y^\top p=0\), contradicting the first condition in (52). This completes the proof of (50). The proof of (51) is analogous. Part (b). Since the projection map onto \(\mathbb {P}_m \) is positively homogeneous, the combination of (49) and the second equation in (47) yields
By passing to norms and using the fact that \(\Vert \textrm{Proj}(w,\mathbb {P}_m )\Vert \le \Vert w\Vert \) for all \(w\in \mathbb {R}^m,\) we get
as desired. \(\square \)
If the matrix \(X=[x_1,\ldots , x_n]\) is given in terms of its column vectors, then the assumption made on the initial point \(p^0\in \mathbb {R}^m\) says that \(p^0\ge 0\) and \(\langle p^0, x_j\rangle >0\) for some \(j\in \langle n\rangle \). Finding such a point \(p^0\) offers no difficulty. The next theorem can be used to find a Nash equilibrium to (40) and, at the same time, to detect a negative Pareto singular value of \(A=-X\).
Theorem 4
Consider the same assumptions as in Lemma 3. Suppose that
Then \((p_*, q_*)\) is a Nash equilibrium of (40). Furthermore,
is a Pareto singular triplet of \(A=-X\) and \((u_*, v_*)\) is a Nash equilibrium of (4).
Proof
Passing to limits in (47) yields
The first equality in (55) asserts that \(q_*\) is a minimizer of \(f(p_*,\cdot \,)\) on \(\mathbb {P}_n\). Analogously, the second equality in (55) asserts that \(p_*\) is a minimizer of \(f(\,\cdot , q_*)\) on \(\mathbb {P}_m\). This proves that \((p_*, q_*)\) is a Nash equilibrium of (40). Next, we write (55) in the equivalent form
where \(\varrho _*= \Vert p_*\Vert \,\Vert q_*\Vert \) is a positive real number. Except for the intervention of the projection maps, the system (56) looks similar to the unconstrained model (1). As pointed out in Quittner [3], when \(\varrho \) is a positive real number and K is a closed convex cone, a “projected” equation of the type \(\,\textrm{Proj}\left( w,K\right) = \varrho \, y\,\) is equivalent to saying that \(y\in K\) and
By using this observation, it is clear that (56) can be written as
The change \((A, \sigma _*)=(-X,-\varrho _*)\) shows that \((u_*, v_*, \sigma _*) \) is a Pareto singular triplet of A. That \((u_*, v_*)\) is a Nash equilibrium of (4) follows from Proposition 7 and the negativity of \(\sigma _*\). \(\square \)
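The conclusion of Theorem 4 is easy to check numerically: after running the alternating scheme, one verifies the two complementarity systems \(0\le u \perp (Av-\sigma u)\ge 0\) and \(0\le v \perp (A^\top u -\sigma v)\ge 0\) directly. A self-contained Python sketch (the helper names are ours; the step formulas are the ones derived from \(f(p,q)=\Vert X-pq^\top \Vert ^2\)):

```python
import numpy as np

def pareto_triplet_from_nmf(X, p0, iters=5_000):
    """Run the alternating scheme and return (u*, v*, sigma*) as in Theorem 4."""
    p = p0.astype(float)
    for _ in range(iters):
        q = np.maximum(X.T @ p, 0.0) / (p @ p)
        p = np.maximum(X @ q, 0.0) / (q @ q)
    u = p / np.linalg.norm(p)
    v = q / np.linalg.norm(q)
    sigma = -np.linalg.norm(p) * np.linalg.norm(q)
    return u, v, sigma

def is_pareto_triplet(A, u, v, sigma, tol=1e-6):
    """Check 0 <= u perp (A v - sigma u) >= 0 and 0 <= v perp (A^T u - sigma v) >= 0."""
    r = A @ v - sigma * u
    s = A.T @ u - sigma * v
    return (u.min() >= -tol and v.min() >= -tol
            and r.min() >= -tol and s.min() >= -tol
            and abs(u @ r) <= tol and abs(v @ s) <= tol)

X = np.array([[ 2.0, -1.0],
              [-1.0,  3.0]])
u, v, sigma = pareto_triplet_from_nmf(X, np.array([0.0, 1.0]))
print(is_pareto_triplet(-X, u, v, sigma))   # prints True: sigma < 0 is a Pareto singular value of A = -X
```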
Remark 2
Hypothesis (54) may seem strong at first sight, but numerical testing with random data suggests that it is fulfilled in the vast majority of cases. We carried out the following experiment: for \((m,n)=(20,30)\), we generated \(10^6\) realizations of a Gaussian random matrix \(\textbf{X} \rightsquigarrow [{\mathcal {N}}(0,1)]^{m\times n}\) and the same number of realizations of a random vector \(\textbf{p}^{0}\in {\mathcal {P}}\). The hypothesis (54) turned out to be true for every one of the \(10^6\) realizations of \((\textbf{X}, \textbf{p}^0\,)\).
The pair \((u_*, v_*)\) obtained in Theorem 4 is a Nash equilibrium of (4), but not necessarily a solution to (4). Consistently, the negative real number \(\sigma _*\) is a Nash value of (4), but not necessarily the optimal value of (4). Table 6 illustrates this fact with the help of a test example. We consider the matrix (43) and take \((m,n)=(3,8)\). The first column of Table 6 displays all the negative Pareto singular values of A. This information is unknown or hard to obtain when A is of large size, but in this example the dimensions m and n are small. The matrix \(X=-A\) contains at least one positive entry. We apply the algorithm (47) initialized at \(10^6\) random initial points \(p^0\in {\mathcal {P}}\) and count the number of times that each negative \(\sigma \) is detected. This operation yields the percentages displayed in the second column of Table 6. The last column displays the average number of iterations needed to detect a particular \(\sigma \), in case the algorithm converges to such a value. As stopping criterion for the algorithm (47), we consider the rule
but other rules are also possible. If we decrease the tolerance level from \(10^{-7}\) to \(10^{-12}\), then the number of iterations in the last column of Table 6 almost doubles. The smallest Pareto singular value \(\varrho _{\textrm{min}}(A)= -1.440480\) was detected from \(31.24\%\) of the initial points, but more frequently we obtained convergence towards the Pareto singular value \(-1.170013 =- \Vert p_*\Vert \,\Vert q_*\Vert \), where
is just a Nash equilibrium of (40).
The main interest of algorithm (47) is that m and n may be large. Table 7 deals again with the matrix (43), but now with \((m,n)= (100, 120)\). Obtaining the complete list of negative Pareto singular values of a matrix of size \(100\times 120\) with the help of Theorem 1 is cost prohibitive. So, we ran the algorithm (47) from \(10^6\) random initial points and recorded the outcomes. The first column of Table 7 displays the negative Pareto singular values of (43) detected by the algorithm. The value \(-29.801378\) is probably the smallest Pareto singular value of (43), but we do not have a formal proof of this fact.
References
Gillis, N., Glineur, F.: Nonnegative factorization and the maximum edge biclique problem. CORE Discussion Paper 2010/59 (2010). http://hdl.handle.net/2078.1/33448
Montanari, A., Richard, E.: Non-negative principal component analysis: message passing algorithms and sharp asymptotics. IEEE Trans. Inf. Theory 62, 1458–1484 (2016)
Quittner, P.: Spectral analysis of variational inequalities. Comment. Math. Univ. Carolinae 27, 605–629 (1986)
Rockafellar, R. T., Wets, R. J. B.: Variational Analysis. Springer, Berlin (1998)
Seeger, A.: Eigenvalue analysis of equilibrium processes defined by linear complementarity conditions. Linear Algebra Appl. 292, 1–14 (1999)
Seeger, A., Sossa, D.: Cone-constrained singular value problems. J. Convex Anal. 30, 1285–1306 (2023)
Tolić, D., Antulov-Fantulin, N., Kopriva, I.: A nonlinear orthogonal nonnegative matrix factorization approach to subspace clustering. Pattern Recogn. 82, 40–55 (2018)
David Sossa: Partially supported by FONDECYT (Chile) through Grant 11220268.
Seeger, A., Sossa, D. Singular value problems under nonnegativity constraints. Positivity 27, 46 (2023). https://doi.org/10.1007/s11117-023-01000-9
Keywords
- Pareto singular value
- Pareto eigenvalue
- Complementarity problem
- Nash equilibria
- Nonnegative matrix factorization