Abstract
In this paper, the spherical quasi-convexity of quadratic functions on spherically subdual convex sets is studied. Sufficient conditions for spherical quasi-convexity on spherically subdual convex sets are presented. A partial characterization of spherical quasi-convexity on spherical Lorentz sets is given, and some examples are provided.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The aim of this paper is to study theoretical properties of spherical quasi-convexity of quadratic functions on spherically subdual convex sets. It is well known that quadratic functions play an important role in nonlinear programming theory. For instance, the minimization problem of quadratic functions on the sphere occurs as a subproblem in methods of nonlinear programming (see the background section of [1] for an extensive review of the literature on the subject). We are interested in the problem of minimizing a quadratic function, defined by the symmetric matrix Q, constrained to a subset C of the sphere. This problem is a quadratic constrained optimization problem on the sphere, and it is also a minimum eigenvalue problem in C. It is essential to emphasize that there exists a special case, when C is the intersection of the Lorentz cone with the sphere. This special case is of particular interest because the minimum eigenvalue of Q in C is non-negative, if and only if the matrix Q is Lorentz copositive; see [2, 3]. In general, changing the Lorentz cone by an arbitrary closed and convex cone K would lead to a more general concept of K-copositivity, thus our study is anticipated to initialize new perspectives for investigating the general copositivity of a symmetric matrix. In general, exploiting the specific intrinsic geometric and algebraic structure of problems posed on the sphere can significantly lower down the cost of finding solutions; see [4,5,6,7,8,9]. We know that a strict local minimizer of a spherically quasi-convex quadratic function is also a strict global minimizer, which makes interesting and natural to refer the problem about characterizing the spherically quasi-convex quadratic functions on spherically convex sets.
The aim of this paper is to introduce both sufficient conditions and necessary conditions for quadratic functions to be spherically quasi-convex on spherically subdual convex sets. In particular, several examples are presented. This paper continues the study of [10], which can be regarded as premier study about the topic of quasi-convexity of quadratic functions on the Euclidean space. The main literature about the quasi-convexity of quadratic functions on Euclidean convex sets includes, but it is not limited to [11,12,13,14,15].
The remaining part of this paper is structured as follows. Section 2 presents previous results and notations that will be used throughout this paper. In Sect. 2.1, we recall the fundamental properties of spherically quasi-convex functions on spherically convex sets and, in Sect. 2.1.1, particular versions of these conditions for quadratic spherically quasi-convex functions. Section 3 provides derivations of many useful properties of spherically quasi-convex functions on spherically subdual convex sets. In Sect. 4, we prove a condition partially characterizing the spherical quasi-convexity of quadratic functions on spherically convex sets associated with the Lorentz cone. Perspectives and open problems are presented in Sects. 5, and 6 concludes the paper.
2 Terminology and Basics Results
In this section, we introduce some notations and present the previous results used throughout the paper. Let \({\mathbb {R}}^n\) be the n-dimensional Euclidean space with the canonical inner product \(\langle \cdot ,\cdot \rangle \) and norm \(\Vert {\cdot }\Vert \). For a set \(D\subset {\mathbb {R}}^n\), denote by [D] the span of a set D, i.e., the smallest linear subspace of \({\mathbb {R}}^n\) that contains the set D. For a vector subspace \({\mathbb {V}} \subset {\mathbb {R}}^n\), denote by \( {\mathbb {V}}^\perp =\{x\in {\mathbb {R}}^n:~\langle v,x\rangle =0, ~\forall ~ v \in {\mathbb {V}}\}\) its orthogonal complement in \({\mathbb {R}}^n\). In particular, for simplifying the notation, for a vector \(u\in {\mathbb {R}}^n\) we set \([u]:=[\{u\}]\) and \([u]^\perp :=[\{u\}]^\perp \). Denote by \({\mathbb {R}}^{m \times n}\) the set of all \(m\times n\) matrices with real entries, \({\mathbb {R}}^n\equiv {\mathbb {R}}^{n \times 1}\), by \(e^i\) the i-th canonical unit vector in \({\mathbb {R}}^n\), and by \(\mathrm{I_n}\) the \(n\times n\) identity matrix. A set \({{\mathscr {K}}}\subseteq {\mathbb {R}}^{n}\) is called a cone if for any \(\alpha > 0\) and \(x\in \mathscr {K}\) we have \(\alpha x\in \mathscr {K}\). A cone \({{\mathscr {K}}}\subseteq {\mathbb {R}}^n\) is called a convex cone if for any \(x,y \in \mathscr {K}\), we have \(x + y \in \mathscr {K}\). The dual cone of a cone \({\mathscr {K}} \subseteq {\mathbb {R}}^n\) is the closed convex cone \({\mathscr {K}}^*:=\{ x\in {\mathbb {R}}^n :~ \langle x, y \rangle \ge 0, ~ \forall \, y\in {\mathscr {K}}\}.\) A cone \( {{\mathscr {K}}} \subseteq {\mathbb {R}}^{n}\) is called pointed if \( {{\mathscr {K}}} \cap {(-\,{\mathscr {K}})} \subseteq \{0\}\). A pointed closed convex cone is called proper cone if it has a nonempty interior. The cone \({{\mathscr {K}}}\subseteq {\mathbb {R}}^n\) is called subdual if \({\mathscr {K}}\subseteq {{\mathscr {K}}}^*\) and self-dual if \({{\mathscr {K}}}={\mathscr {K}}^*\). The Lorentz cone is defined by
which is a special case of the elliptic cone defined by
where \( \theta _2, \ldots ,\theta _n\) are positive real numbers and \(v^1, v^2, \ldots , v^n\) are L.I. vectors in \({\mathbb {R}}^n\). We recall that the Lorentz cone \({{\mathscr {L}}}\) and the non-negative orthant \({\mathbb {R}}^n_{+}\) are self-dual cones. Let \({{\mathscr {K}}}\) be a closed convex cone. Let \(x\in {\mathbb {R}}^n\), then the projection \(P_{{\mathscr {K}}}(x)\) of the point x onto the cone \( {{\mathscr {K}}}\) is defined by
For any \(x\in {{\mathscr {K}}}\), we define the non-negative part of x, nonpositive part of x and the absolute value of x with respect to \({{\mathscr {K}}}\) by
respectively. We recall from Moreau’s decomposition theorem [16] (see also [17, Theorem 3.2.5]) that for a closed convex cone \({{\mathscr {K}}}\) there hold:
For any \(z\in {\mathbb {R}}\times {{\mathbb {R}}}^{n-1}\), let \(z:=(z_1, {z^2}) \in {\mathbb {R}}\times {{\mathbb {R}}}^{n-1} \), where \({z^2}:= (z_2, z_3,\ldots , z_n)^{^\top }\). An explicit formula for the projection mapping \(\mathrm{P}_{\mathscr {L}}\) onto the Lorentz cone \({{\mathscr {L}}}\) is given in [18, Proposition 3.3], which is recalled for the case when \(x\notin {{\mathscr {L}}}\cup -{{\mathscr {L}}}\) in the following lemma.
Lemma 2.1
Let \(x=(x_1, {x^2}) \in \{(y_1, {y^2}) \in {\mathbb {R}}\times {\mathbb {R}}^{n-1}: ~ |y_1|<\Vert {y^2}\Vert \} \) and \({\mathscr {L}}\) be the Lorentz cone. Then,
and, as a consequence, the absolute value of x with respect to \({{\mathscr {L}}}\) is given by
For a general nonzero vector \(x=\left(x_1, {x^2}\right) \in {\mathbb {R}}\times {\mathbb {R}}^{n-1}\), the absolute value of x with respect to \({{\mathscr {L}}}\) is given in the next lemma, which follows immediately from Lemma 2.1 and Eq. (2).
Lemma 2.2
Consider a nonzero vector \(x=\left(x_1, {x^2}\right) \in {\mathbb {R}}\times {\mathbb {R}}^{n-1}\) and let \({\mathscr {L}}\) be the Lorentz cone. Then, the absolute value of x is given by
where \({{\,\mathrm{sgn}\,}}(x_1)\) is equal to \({-}\) 1, 0 or 1 whenever \(x_1\) is negative, zero or positive, respectively.
Let \( {{\mathscr {K}}} \subseteq {\mathbb {R}}^{n}\) be a (not necessarily convex) cone. Let us recall that \(A\in {\mathbb {R}}^{n \times n}\) is \({\mathscr {K}}\)-copositive if \( \langle Ax, x \rangle \ge 0\) for all \(x\in {{\mathscr {K}}}\) and a Z-matrix is a matrix with nonpositive off-diagonal elements. It is easy to see that the Lorentz cone \({\mathscr {L}}\) can be written as
where \(J={{\,\mathrm{diag}\,}}(1,-1,\ldots ,-1)\in {\mathbb {R}}^{n\times n}\). It is easy to see that
This straightforwardly implies that \(A\in {\mathbb {R}}^{n \times n}\) is \({{\mathscr {L}}}\)-copositive if and only if it is \({{\mathscr {L}}}\cup -{\mathscr {L}}\)-copositive. Hence, the S-Lemma (see [19,20,21]) implies:
Lemma 2.3
\(A\in {\mathbb {R}}^{n \times n}\) is \({{\mathscr {L}}}\)-copositive if and only if there exists a \(\rho \ge 0\) such that \(A-\rho J\) is positive semidefinite.
This result is well-known and it is remarked in Lemma 2.2 of [2]. Let \({{\mathscr {K}}} \subseteq {\mathbb {R}}^n \) be a pointed closed convex cone with nonempty interior, the \({\mathscr {K}}\)-Z-property of a matrix \(A\in {\mathbb {R}}^{n\times n}\) means that \( \langle Ax,y\rangle \le 0\), for any \((x,y)\in C({{\mathscr {K}}})\), where \(C({{\mathscr {K}}}):=\{(x,y)\in {\mathbb {R}}^n\times {\mathbb {R}}^n:~x\in {{\mathscr {K}}},\text { }y\in {{\mathscr {K}}}^*, \langle x, y \rangle =0\}\). Throughout the paper, the \(n\)-dimensional Euclidean sphere \(S^{n-1}\) is denoted by
The intrinsic distance on the sphere between \(x, y \in S^{n-1}\) is defined by \({{\,\mathrm{d}\,}}(x, y):=\arccos \langle x, y\rangle .\) It is also easy to verify that \({{\,\mathrm{d}\,}}(x, y)\le \pi \) for any \(x, y\in S^{n-1}\), and \({{\,\mathrm{d}\,}}(x, y)=\pi \) if and only if \(x=-\,y\). The intersection curve of a plane through the origin of \({\mathbb {R}}^{n}\) with the sphere \( S^{n-1}\) is called a geodesic. A geodesic segment joining x to y is said to be minimal if its length is equal to \({{\,\mathrm{d}\,}}(x, y)\). A set \({{\mathscr {C}}} \subseteq S^{n-1}\) is said to be spherically convex if for any \(x\), \(y\in {{\mathscr {C}}} \) all the minimal geodesic segments joining \(x\) to \(y\) are contained in \( {\mathscr {C}} \). For notational convenience, in the following text we assume that all spherically convex sets are nonempty proper subsets of the sphere. For each closed set \( {{\mathscr {A}}} \subseteq S^{n-1}\), let \({{\mathscr {K}}}_{{\mathscr {A}}}\subseteq {\mathbb {R}}^{n}\) be the cone spanned by \({{\mathscr {A}}}\), namely
Obviously, \({{\mathscr {K}}}_{{\mathscr {A}}} \) is the smallest cone containing \({{\mathscr {A}}}\) (see Fig. 1). In the following proposition, a relationship between spherically convex sets and the cones spanned by them will be exhibited. The proof is given in [22, Proposition 2].
Proposition 2.1
The set \({{\mathscr {C}}}\) is spherically convex if and only if the cone \(\mathscr {K}_{{\mathscr {C}}}\) is pointed convex.
2.1 Spherically Quasi-convex Functions on Spherically Convex Sets
In this section, we recall the concept of spherically quasi-convex functions on spherically convex sets and we present a basic characterization of them; for more details, see [10]. For this concept and its properties in Euclidean space, see [23].
Definition 2.1
Let \( {{\mathscr {C}}}\subseteq S^{n-1}\) be a spherically convex set and \(I\subseteq {\mathbb {R}}\) be an interval. A function \(f:{{\mathscr {C}}} \rightarrow {\mathbb {R}}\) is said to be spherically quasi-convex (respectively, strictly spherically quasi-convex) if for any minimal geodesic segment \(\gamma :I\rightarrow {{\mathscr {C}}}\), the composition \( f\circ \gamma :I\rightarrow {\mathbb {R}}\) is quasi-convex (respectively, strictly quasi-convex) in the usual sense, i.e., \(f(\gamma (t))\le {\text {max}}\{ f(\gamma (t_{1})), f(\gamma (t_{2}))\}\) for all \(t\in [t_{1}, t_{2}]\subseteq I\), (respectively, \(f(\gamma (t))< {\text {max}}\{ f(\gamma (t_{1})), f(\gamma (t_{2}))\}\) for all \(t\in [t_{1}, t_{2}]\subseteq I\), \(t_1\ne t_2\)).
Remark 2.1
The above definition implies that, if \(f: S^{n-1} \rightarrow {\mathbb {R}}\) is spherically quasi-convex, then f is constant. However, as we will show, there exist non-constant spherically quasi-convex functions defined on proper spherically convex sets of \(S^{n-1}\).
To simplify the notations, the sub-level sets of a function \(f:{\mathbb {R}}^n\supseteq {{\mathscr {M}}} \rightarrow {\mathbb {R}}\) are denoted by
We end this section with a proposition whose proof can be found in [10, Proposition 6].
Proposition 2.2
Let \( {{\mathscr {C}}}\subseteq S^{n-1}\) be a spherically convex set. A function \(f:{{\mathscr {C}}} \rightarrow {\mathbb {R}}\) is spherically quasi-convex if and only if for all \(c\in {\mathbb {R}}\) the sub-level sets \([f\le c]\) are spherically convex.
2.1.1 Spherically Quasi-convex Quadratic Functions on Spherically Convex Sets
In this section, we recall earlier results of quadratic quasi-convex functions on general spherically convex sets. Henceforth, we assume that \(A=A^\top \in {\mathbb {R}}^{n\times n}\), the cone \({{\mathscr {K}}}\subseteq {\mathbb {R}}^{n}\) is proper and subdual. Define
and assume that \({{\mathscr {C}}}\) is open and spherically convex. Let \(q_A: {{\mathscr {C}}}\rightarrow {\mathbb {R}}\) be the quadratic function defined by
We remark that \(q_A\) can be extended to \( {\overline{{\mathscr {C}}}}\). For the simplicity of notations, we will denote the extended values by \(q_A(x)\) too, but the spherical quasiconvexity of \(q_A\) will always be understood as a function defined on \({{\mathscr {C}}}\). To proceed, we need as well the restriction of Rayleigh quotient \(\varphi _A : {{\,\mathrm{int}\,}}({{\mathscr {K}}}) \rightarrow {\mathbb {R}}\) defined by
We remark that \(\varphi _A\) can be extended to \({{\mathscr {K}}}\). For the simplicity of notations, we will denote the extended values by \(\varphi _A(x)\) too, but the quasiconvexity of \(\varphi _A\) will always be understood as a function defined on \({{\,\mathrm{int}\,}}({{\mathscr {K}}})\). In the followings, we state some properties of the functions \(q_A\) and \(\varphi _A(x)\), for details see [10].
Proposition 2.3
The following statements are equivalent:
-
(a)
\(q_A\) is spherically quasi-convex;
-
(b)
\(\langle Ax,y\rangle \le \langle x,y\rangle {\text {max}}\left\{ q_A(x), ~q_A(y)\right\} \), for all \( x,y\in {\overline{\mathscr {C}}}\);
-
(c)
\( \displaystyle \frac{\langle Ax,y\rangle }{\langle x,y\rangle }\le {\text {max}}\left\{ \varphi _A(x), ~\varphi _A(y)\right\} \), for all \(x,y\in {{\mathscr {K}}}\) with \(\langle x,y\rangle \ne 0\).
Corollary 2.1
Let \({{\mathscr {K}}}\) be self-dual. If \(q_A\) is spherically quasi-convex, then A has the \({\mathscr {K}}\)-Z-property.
Theorem 2.1
The function \(q_A\) is spherically quasi-convex if and only if \(\varphi _A\) is quasi-convex.
The next result uses the following notations: Let \(c\in {\mathbb {R}}\) and define the cone
Corollary 2.2
The function \(q_A\) is spherically quasi-convex if and only if \([\varphi _A\le c]\) is convex, for any \(c\in {\mathbb {R}}\).
3 Spherically Quasi-convex Quadratic Functions on Spherically Subdual Convex Sets
In this section, we present partial conditions characterizing the spherical quasi-convexity of quadratic functions on spherically subdual convex sets associated to subdual cones. The results obtained generalize the corresponding ones obtained in [10, Sect. 4.1]. In due course, we will present a more precise correspondence. Throughout this section, we assume that \({{\mathscr {K}}}\) is subdual, i.e., \({{\mathscr {K}}}\subseteq {\mathscr {K}}^*\) and proper. A closed set \( {{\mathscr {A}}} \subseteq S^{n-1}\) is called spherically subdual convex set if the associated cone \({{\mathscr {K}}}_{{\mathscr {A}}}\) is subdual. It is clear that if \(A=A^\top \in {\mathbb {R}}^{n\times n}\) has only one eigenvalue, then \(q_A\) is constant and, consequently, it is spherically quasi-convex. Henceforth, throughout this section we assume that A has at least two distinct eigenvalues. Let us recall that \(q_A\) and \(\varphi _A\) are defined in (4) and (5), respectively. Two technical lemmas, which are useful in the following text, will be presented. They are generalizations of Lemmas 14 and 15 of [10], respectively. More specifically, in Lemma 14 of [10] a condition implying the convexity of \([\varphi _A\le c]\), for all \(c\notin (\lambda _2,\lambda _n)\), is presented if \(K={\mathbb R}^n_{+}\), while here in Lemma 3.1 that condition is extended to an arbitrary subdual cone. Similarly, Lemma 3.3 generalizes Lemma 15 of [10] from \(K={{\mathbb {R}}}^n_{+}\) to an arbitrary subdual cone. For stating the next lemma, for \(\{v^1, v^2,\ldots ,v^n\}\) a orthonormal system of eigenvectors of A corresponding to the eigenvalues \(\lambda _1 < \lambda _2 \le \cdots \le \lambda _n\), respectively, and \(c\in (\lambda _1,\lambda _2]\), define the convex cone
for \(i=2, \ldots , n\). Note that if \(\lambda _1<c<\lambda _2\), then \(\theta _i(c)> 0\), for \(i=2, \ldots , n\), and \({\mathscr {L}}_{c}\), \(-{\mathscr {L}}_{c}\) are also a pointed proper elliptic cones. We also need to consider the following cone
Considering that \({{\mathscr {K}}}\) is a proper cone, then the cone \({{\mathscr {W}}}\) is also proper, i.e., \({{\,\mathrm{int}\,}}({{\mathscr {W}}})\ne \varnothing \).
Lemma 3.1
Let \(n\ge 2\), \(A=A^\top \in {\mathbb {R}}^{n\times n}\) and \(\{v^1, v^2,\ldots ,v^n\}\) be an orthonormal system of eigenvectors of A corresponding to the eigenvalues \(\lambda _1 < \lambda _2 \le \cdots \le \lambda _n\), respectively. Then, the sublevel set \([\varphi _A\le c]\) is convex for all \(c\notin (\lambda _2,\lambda _n)\) if and only if \(v^1 \in {{\mathscr {W}}}^*\cup -{{\mathscr {W}}}^*\). In particular if \(v^1 \in {{\mathscr {K}}}^*\), then \([\varphi _A\le c]\) is convex for all \(c\notin (\lambda _2,\lambda _n)\).
Proof
By using the spectral decomposition of A, we have \(A=\sum _{i=1}^n\lambda _iv^i(v^i)^\top \). From (5), we have
If \(\lambda _1<c\le \lambda _2\), then by using (6) the equality (8) can be completed as follows
Sufficiency of the first statement: Let \(v^1\in {{\mathscr {W}}}^*\) (a similar argument holds for \(v^1\in -{\mathscr {W}}^*\)). If \(c<\lambda _1\), then considering that \(v^1, v^2, \ldots , v^n\) are linearly independent and \(0\notin {{\,\mathrm{int}\,}}({{\mathscr {K}}})\), we obtain from (8) that \([\varphi _A\le c]=\varnothing \) and hence it is convex. If \(c=\lambda _1\), then (8) implies that \([\varphi _A\le c]={{\mathscr {S}}}\cap {{\,\mathrm{int}\,}}({{\mathscr {K}}}),\) where \({{\mathscr {S}}}: =\{x\in {\mathbb {R}}^n ~: \langle v^i, x\rangle =0, ~\text{ for }~ i=2, \ldots , n\}\). Thus, due to \({{\,\mathrm{int}\,}}({{\mathscr {K}}})\) and \({{\mathscr {S}}}\) being convex, we conclude that \([\varphi _A\le c]\) is also convex. Now, we suppose that \(\lambda _1<c\le \lambda _2\). Since \(v^1\in {{\mathscr {W}}}^*\), for any \(x\in {\mathscr {W}}\) we obtain that \(\langle v^1,x\rangle \ge 0\) and from (9) we have \([\varphi _A\le c]= {{\mathscr {L}}}_c\cap {{\,\mathrm{int}\,}}({{\mathscr {K}}})\). Due to the convexity of the cones \({{\mathscr {L}}}_c\) and \({{\,\mathrm{int}\,}}({{\mathscr {K}}})\), we obtain that \([\varphi _A\le c]\) is convex. Finally, if \(c\ge \lambda _n\), then (8) implies that \([\varphi _A\le c]={{\,\mathrm{int}\,}}({{\mathscr {K}}})\) is convex.
Necessity of the first statement: We will show that \(v^1\notin {{\mathscr {W}}}^*\cup -{{\mathscr {W}}}^*\) implies that \([\varphi _A\le c]\) is not convex, for some \(c\in (\lambda _1,\lambda _2)\). Suppose that \(v^1\notin {{\mathscr {W}}}^*\cup -{{\mathscr {W}}}^*\). Thus, considering that \({{\,\mathrm{int}\,}}({\mathscr {W}})\ne \varnothing \), there exist \(y,z\in {{\,\mathrm{int}\,}}({\mathscr {W}})\) such that \(\langle v^1,y\rangle >0\) and \(\langle v^1,z\rangle <0\). Thus, (6) and (7) imply that
We claim that there exists \({{\bar{c}}}\in (\lambda _1,\lambda _2)\) such that \(y\in {{\,\mathrm{int}\,}}({\mathscr {K}})\cap {{\,\mathrm{int}\,}}({{\mathscr {L}}}_{{{\bar{c}}}})\) and \(z\in {{\,\mathrm{int}\,}}({\mathscr {K}})\cap {{\,\mathrm{int}\,}}(-\,{\mathscr {L}}_{{{\bar{c}}}})\). In order to simplify the notations, for \( x\in {\mathbb {R}}^n\) and \(c\in (\lambda _1,\lambda _2]\), we define the following function
Note that \(\psi \) is a continuous function and, from the definition of \(\theta _i\) in (6), it is also decreasing with respect to the second variable c. By using definitions (6) and (11), we have
Thus, taking into account the first inclusion in (10) we conclude, by setting \(c=\lambda _2\) in (12), that
Hence, there exists a \({{\hat{c}}}\in (\lambda _1,\lambda _2)\) sufficiently close to \(\lambda _2\) such that \(\psi (y,{{\hat{c}}})< \langle v^1,y\rangle \). Similarly, we can also prove that there exists a \({{\tilde{c}}}\in (\lambda _1,\lambda _2)\) sufficiently close to \(\lambda _2\) such that \(\psi (z,{{\tilde{c}}})< -\langle v^1,z\rangle \). Thus, letting \({{\bar{c}}}={\text {max}}\{{{\hat{c}}}, {{\tilde{c}}} \}\) we conclude that \(\psi (y,{{\bar{c}}})< \langle v^1,y\rangle \) and \(\psi (z,{{\bar{c}}})< -\langle v^1,z\rangle \), which by (11) and (12) yields
We know by (10) that \(y\in {{\,\mathrm{int}\,}}({\mathscr {K}})\) and \(z\in {{\,\mathrm{int}\,}}({\mathscr {K}})\), which together with (13) yields \(y\in {{\,\mathrm{int}\,}}({\mathscr {K}})\cap {{\,\mathrm{int}\,}}({{\mathscr {L}}}_{{{\bar{c}}}})\) and \(z\in {{\,\mathrm{int}\,}}({\mathscr {K}})\cap {{\,\mathrm{int}\,}}(-\,{\mathscr {L}}_{{{\bar{c}}}})\) and the claim is concluded. Therefore, there exist \(r_y>0\) and \(r_z>0\) such \(B(y,r_y)\subset {{\,\mathrm{int}\,}}({\mathscr {K}})\cap {{\,\mathrm{int}\,}}({\mathscr {L}}_{{{\bar{c}}}})\) and \(B(z,r_z)\subset {{\,\mathrm{int}\,}}({\mathscr {K}})\cap {{\,\mathrm{int}\,}}(-\,{\mathscr {L}}_{{{\bar{c}}}})\), where \(B(y,r_y)\) and \(B(z,r_z)\) denote the open balls with centres y, z and radii \(r_y>0\), \(r_z>0\). Hence, by dimensionality reasons, we can take \(u_y\in {{\,\mathrm{int}\,}}({\mathscr {K}})\cap {{\,\mathrm{int}\,}}({{\mathscr {L}}}_{{{\bar{c}}}})\) and \(u_z \in {{\,\mathrm{int}\,}}({\mathscr {K}})\cap {{\,\mathrm{int}\,}}(-\,{\mathscr {L}}_{{{\bar{c}}}})\) such that \(v^1\), \(u_y\) and \(u_z\) are linearly independent (L.I.). Thus, in particular, we have \(0\notin [u_y,u_z]\), where \([u_y,u_z]\) denotes the straight line segment joining \(u_y\) to \(u_z\). Since \( {{\,\mathrm{int}\,}}({{\mathscr {L}}}_{{{\bar{c}}}})\cap {{\,\mathrm{int}\,}}(-\,{\mathscr {L}}_{{\bar{c}}})=~\varnothing \) and \(0\notin [u_y,u_z]\), the segment \([u_y,u_z]\) is intersecting, at the distinct points \(w_y\ne 0\) and \(w_z\ne 0\), the boundaries of the sets \( {{\,\mathrm{int}\,}}({{\mathscr {L}}}_{{{\bar{c}}}})\) and \( {{\,\mathrm{int}\,}}(-\,{\mathscr {L}}_{{{\bar{c}}}})\), respectively. Moreover, due to \(u_y\) and \(u_z\) being L.I., \(0\notin [u_y,u_z]\), and \(w_y, w_z\in [u_y,u_z]\), we conclude that the vectors \(v^1\), \(w_y\) and \(w_z\) are also L. I. Our next task is to prove that \((w_y+w_z)/2\) does not belong to \( {{\mathscr {L}}_{\bar{c}}}\cup -{{\mathscr {L}}_{{\bar{c}}}}\), i.e.,
First, due to \(w_y\) and \( w_z\) belonging to the boundaries of \( {{\mathscr {L}}}_{{{\bar{c}}}}\) and \( -{\mathscr {L}}_{{{\bar{c}}}}\), respectively, we obtain from (6) that
On the other hand, by using the two equalities in (15), we obtain after some algebraic manipulations that
Thus, considering that \(\left\langle v^1,\frac{1}{2}(w_y+w_z) \right\rangle ^2= \left\langle v^1,\frac{1}{2}w_y\right\rangle ^2+ \left\langle v^1,\frac{1}{2}w_z \right\rangle ^2+ 2 \left\langle v^1,\frac{1}{2}w_y\right\rangle \left\langle v^1,\frac{1}{2}w_z \right\rangle ,\) we have
Applying Cauchy–Schwarz inequality and then, using again both equalities in (15), we conclude that
We are going to prove that Cauchy inequality (17) is strict. For that, assume the contrary, i.e., that the last Cauchy inequality holds as equality. In this case, there exists \(\alpha \ne 0\) such that
which implies that \(w_y+\alpha w_z\) is orthogonal to the set of vectors \(\{v^2, \ldots , v^n \}\). Thus, since \(\{v^1, v^2,\ldots ,v^n\}\) is an orthonormal system, \(w_y+\alpha w_z\) is parallel to the vector \(v^1\), which is a contradiction due to vectors \(v^1\), \(w_y\) and \(w_z\) being L.I. Hence, (17) holds strictly and combining it with (16) we conclude that
and (14) holds. Therefore, considering that \(\frac{1}{2}(w_y+w_z) \in ]u_y,u_z[\), we conclude that \(]u_y,u_z[ \not \subset {{\mathscr {L}}_{{\bar{c}}}}\cup -{{\mathscr {L}}_{{\bar{c}}}}\). Thus, by using the notation (9), we also have \(]u_y,u_z[ \not \subset ({{\mathscr {L}}}_{{\bar{c}}}\cup -{{\mathscr {L}}}_{\bar{c}})\cap {{\,\mathrm{int}\,}}({{\mathscr {K}}})=[\varphi _A\le {{\bar{c}}}]\), and due to \(u_y, u_z\in ({{\mathscr {L}}}_{{\bar{c}}}\cup -{{\mathscr {L}}}_{\bar{c}})\cap {{\,\mathrm{int}\,}}({{\mathscr {K}}})=[\varphi _A\le {{\bar{c}}}]\), it follows that \([\varphi _A\le {{\bar{c}}}]\) is not convex.
The proof of the second statement follows from \({\mathscr {K}}^*\subseteq {\mathscr {W}}^*\). \(\square \)
Remark 3.1
The dual of \({\mathscr {W}}\) in (7) can be expressed as
Corollary 3.1
Suppose that \(n\ge 3\) and \(\lambda _2\le (\lambda _1+\lambda _3)/2\). If \({\mathscr {K}}\cap -{\mathscr {L}}_{\lambda _2}=\{0\}\) or \({\mathscr {K}}\cap {\mathscr {L}}_{\lambda _2}=\{0\}\), then \([\varphi _A\le c]\) is convex for all \(c\notin (\lambda _2,\lambda _n)\).
Proof
First note that if \(n\ge 3\) and \(\lambda _2\le (\lambda _1+\lambda _3)/2\), then \(\theta _i(\lambda _2)\ge 1\) for any \(i\ge 3\). Define the cone
Note that \( {\mathscr {L}}_{[v^2]^\perp }\) is a self-dual Lorentz cone as a subset of the subspace \([v^2]^\perp \). Moreover, considering that \(\theta _i(\lambda _2)\ge 1\) for any \(i\ge 3\), we conclude \({\mathscr {L}}_{\lambda _2}\cap [v^2]^\perp \subset \mathscr {L}_{[v^2]^\perp }\). Consequently, taking into account that \( {\mathscr {L}}_{[v^2]^\perp }\) is a self-dual cone, the cone \(\mathscr {L}_{\lambda _2}\cap [v^2]^\perp \) is subdual as a subset of the subspace \([v^2]^\perp \). To simplify the notation, denote by upper star (i.e., \(^*\)) the dual of a cone in \({\mathbb {R}}^n\) and by lower star (i.e., \(_*\)) the dual of a cone in \([v^2]^\perp \). Thus, using this notation we state
Indeed, since \(v^2,-v^2\in {\mathscr {L}}_{\lambda _2}\), for any \(z\in {\mathscr {L}}_{\lambda _2}^*\), we have \(\langle z,v^2\rangle =0\) and hence \({\mathscr {L}}_{\lambda _2}^*\subseteq [v^2]^\perp \), which implies \({\mathscr {L}}_{\lambda _2}^*\subseteq ({\mathscr {L}}_{\lambda _2}\cap [v^2]^\perp )_*\). Conversely, let \(u\in ({\mathscr {L}}_{\lambda _2}\cap [v^2]^\perp )_*\). Given \(v\in {\mathscr {L}}_{\lambda _2}\), take \(w\in {\mathscr {L}}_{\lambda _2}\cap [v^2]^\perp \) and \(t\in {\mathbb {R}}\) such that \(v=w+tv^2\). Hence, \(\langle u,v\rangle =\langle u,w\rangle \ge 0\), which implies that \(u\in {\mathscr {L}}_{\lambda _2}^*\). Hence, we conclude that \(({\mathscr {L}}_{\lambda _2}\cap [v^2]^\perp )_*\subseteq \mathscr {L}_{\lambda _2}^*\) and (19) is proved. Next suppose \({\mathscr {K}}\cap -{\mathscr {L}}_{\lambda _2}=\{0\}\). Hence, by using the first equality in (18) we obtain \(\mathscr {W}^*=({\mathscr {K}}\cap {\mathscr {L}}_{\lambda _2})^*\). Therefore, considering that \({\mathscr {L}}_{\lambda _2}\cap [v^2]^\perp \) is subdual and (19), we obtain
Hence, the result follows from Lemma 3.1. The case \({\mathscr {K}}\cap {{\mathscr {L}}}_{\lambda _2}=\{0\}\) can be proved similarly. \(\square \)
The next lemma is used in the proof of Proposition 3.1, which is essential for proving Theorem 3.1.
Lemma 3.2
Let \(n\ge 3\) and \(B=B^\top \in {\mathbb {R}}^{n\times n}\). Let \(\mu _1 \le \mu _2 \le \cdots \le \mu _n\) be eigenvalues of the matrix B. Assume that \(\mu _1\le \mu _ 2<0<\mu _n\). Then, for any vector \({\bar{x}}\in {\mathbb {R}}^n{\setminus }\{0\}\) such that \(B{{\bar{x}}}\ne 0\) and \(\left\langle B{{\bar{x}}},{{\bar{x}}}\right\rangle =0\), and any number \(\delta >0\), the set is not convex.
Proof
Since \(\mu _1={\text {min}}_{x\in S^{n-1}} q_B(x) <{\text {max}}_{x\in S^{n-1}} q_B(x)= \mu _ n\), we can take \({{\bar{x}}}\in {\mathbb {R}}^n{\setminus }\{0\}\) such that \(B{{\bar{x}}}\ne 0\) and \(\left\langle B{{\bar{x}}},{{\bar{x}}}\right\rangle =0\). Define the following vector subspace \({\mathscr {N}}:=[\{u\in {\mathbb {R}}^n:~Bu=\mu u,\text { for some }\mu <0\}]\) of \({\mathbb {R}}^{n}\). It follows from assumption (a) or (b) that \(\dim ({\mathscr {N}})\ge 2\). To simplify the notation set, \({{\bar{y}}}:=B{{\bar{x}}}\ne 0\). To proceed with the proof, we first need to prove that \({\mathscr {N}}\ne [{\bar{y}}]^\perp \). Assume to the contrary that \({\mathscr {N}}= [{\bar{y}}]^\perp \). In this case, due to \({{\bar{y}}}=B{{\bar{x}}}\) and \(B=B^\top \), the definition of \([\bar{y}]^\perp \) implies that \( \left\langle Bv, {{\bar{x}}}\right\rangle =0\), for all \(v\in {\mathscr {N}}\). Thus, the definition of \({\mathscr {N}}\) implies \( \left\langle v, {{\bar{x}}}\right\rangle =0\), for all \(v\in {\mathscr {N}}\), which yields \({\mathscr {N}}\subset [{\bar{x}}]^\perp :=\{v\in {\mathbb {R}}^{n}:~ \left\langle v,{\bar{x}}\right\rangle =0\}\). Moreover, considering that \(\left\langle {{\bar{y}}},{\bar{x}}\right\rangle =0\), we also have \({{\bar{y}}} \in [{\bar{x}}]^\perp \). Hence, we conclude that \([\bar{y}]+\mathscr {N}\subset [{\bar{x}}]^\perp \). If \({\bar{y}}\notin {\mathscr {N}}\), then due to \({{\bar{y}}}\ne 0\) and \({\mathscr {N}}= [\bar{y}]^\perp \) we have \(\dim ([\bar{y}]+\mathscr {N})=n\). Having that \([\bar{y}]+\mathscr {N}\subset [{\bar{x}}]^\perp \), we obtain \({{\bar{x}}} = 0\), which contradicts the assumption \({{\bar{x}}} \ne 0\). Hence, \({{\bar{y}}} \in {\mathscr {N}}= [{\bar{y}}]^\perp \), which also contradicts \({{\bar{y}}} \ne 0\). Therefore, \({\mathscr {N}}\ne [{\bar{y}}]^\perp \). Thus, we have
Hence, there exists a unit vector \(a\in \mathscr {N}\) such that \(\langle a,{{\bar{y}}}\rangle =0\). Since \(\mathscr {N}\ne [\bar{y}]^\perp \), we can choose a sequence of vectors \(\{a^k\}\subset \mathscr {N}\) such that \(\lim _{k\rightarrow \infty }a^k=a\) and \(\langle a^k,{{\bar{y}}}\rangle \ne 0\). Let \(\{u^1, u^2,\ldots ,u^n\}\) be an orthonormal system of eigenvectors of the matrix B corresponding to the eigenvalues \(\mu _1, \mu _2 , \ldots , \mu _n\), respectively. Note that the spectral decomposition of B implies \( B=\sum _{i=1}^n\mu _i u^i(u^i)^\top . \) Since \(\{a^k\}\subset \mathscr {N}\), we have \(a^k=\sum _{i=1}^\ell \alpha _{k, i} u^i\), where \(2\le \ell =\dim (\mathscr {N})<n\) and \(\mu _1, \ldots , \mu _\ell \) are the negative eigenvalues of B. Thus, \( \langle Ba^k,a^k\rangle =\sum _{i=1}^\ell \alpha ^2_{k, i} \mu _i<0. \) For proceeding with the proof, we define
Then, \(\langle Bp^k,p^k\rangle =0\) and, due to \(\langle a,{{\bar{y}}}\rangle =0\) and \(\lim _{k\rightarrow \infty }a^k=a\), we have \(\lim _{k\rightarrow \infty }p^k={{\bar{x}}}\). Hence, if k is sufficiently large, then for any \(\delta >0\) arbitrary but fixed, we have \(p^k\in \varXi \left(B,{{\bar{x}}},\delta \right)\). For such an k, after some simple algebraic manipulations we conclude
Hence, \({{\bar{x}}},p^k\in \varXi \left(B,{{\bar{x}}},\delta \right)\), but \(({\bar{x}}+p^k)/2\notin \varXi \left(B,{{\bar{x}}},\delta \right)\). Therefore, \(\varXi \left(B,{{\bar{x}}},\delta \right)\) is not convex. \(\square \)
Proposition 3.1
Let \(n\ge 3\) and \(A=A^\top \in {\mathbb {R}}^{n\times n}\) be a nonsingular matrix. Suppose that \(q_A\) is not constant and \(\lambda _1 \le \lambda _2 \le \cdots \le \lambda _n\) are eigenvalues of A. If \(q_A\) is quasiconvex, then the following conditions hold:
-
(i)
\(\lambda _1 < \lambda _2\);
-
(ii)
either \(\lambda _2\le {\text {min}}_{x\in {\overline{{\mathscr {C}}}}} q_A(x)\) or \({\text {max}}_{x\in {\overline{{\mathscr {C}}}}} q_A(x)\le \lambda _2\).
Proof
Suppose by contradiction that one of the following two conditions holds:
-
(a)
\(\lambda _1= \lambda _2\);
-
(b)
\({\text {min}}_{x\in {\overline{{\mathscr {C}}}}} q_A(x)<\lambda _2<{\text {max}}_{x\in {\overline{{\mathscr {C}}}}} q_A(x)\).
First of all, note that due to \(q_A\) not being constant, we have \(\lambda _1\le {\text {min}}_{x\in {\overline{{\mathscr {C}}}}} q_A(x) <{\text {max}}_{x\in {\overline{{\mathscr {C}}}}} q_A(x)\le \lambda _ n\), where \({\overline{\mathscr {C}}}\) is defined in (3). Hence, we can take a \(\mu \in {\mathbb {R}}\) such that \(\mu \ne \lambda _i\) for all \(i=1, \ldots n,\) and satisfying
if the condition (a) holds. Otherwise, if the condition (b) holds, we take a \(\mu \in {\mathbb {R}}\) satisfying
Then, any of the conditions (20) or (21) implies that \(\pm (A-\mu I_n)\) is not \(\mathscr {K}\)-copositive. Since the matrix \(A-\mu I_n\) is not \(\mathscr {K}\)-copositive, there exists a \(p\in \mathscr {K}\) such that \(\langle Ap,p\rangle <\mu \Vert p\Vert ^2\). Hence, there exist also an \(u\in {{\,\mathrm{int}\,}}(\mathscr {K})\) sufficiently close to p such that \(\langle Au,u\rangle <\mu \Vert u\Vert ^2\). Similarly, since \(-(A-\mu I_n)=\mu I_n-A\) is not \(\mathscr {K}\)-copositive, there exists a \(v\in {{\,\mathrm{int}\,}}(\mathscr {K})\) such that \(\langle Av,v\rangle >\mu \Vert v\Vert ^2\). Therefore, by continuity, there exists a \(t\in ]0,1[\) such that \(\langle A{{\bar{x}}},{{\bar{x}}}\rangle =\mu \Vert {{\bar{x}}}\Vert ^2\), where \({\bar{x}}=(1-t)u+tv\in {{\,\mathrm{int}\,}}(\mathscr {K})\). Letting \(B=A-\mu I\), its eigenvalues are given by \(\mu _i:=\lambda _i-\mu \), for \(i=1, 2, \ldots , n\). Thus, we conclude from (20) and (21) that
if the condition (a) or (b) holds, respectively. Since \(B{\bar{x}}\ne 0\) and \(\langle B{{\bar{x}}},{{\bar{x}}}\rangle =0\), we conclude from Lemma 3.2 that, for all \(\delta >0\), the set is not convex. Hence, there exists an \(s\in ]0,1[\) and \(a^0,a^1\in \varXi \left(B,{\bar{x}},\delta \right)\) such that \(a^s:=(1-s)a^0+sa^1\notin \varXi \left(B,{\bar{x}},\delta \right)\). Thus, owing to the ball of centre \({{\bar{x}}}\) and radius \(\delta \) is convex, \(a^s\notin \varXi \left(B,{\bar{x}},\delta \right)\) implies \(\langle Aa^s,a^s\rangle -\mu \Vert a^s\Vert ^2=\langle Ba^s,a^s\rangle >0\). On the other hand, since \(a^0,a^1\in \varXi \left(B,{\bar{x}},\delta \right)\), we have \(\langle Aa^i,a^i\rangle -\mu \Vert a^i\Vert ^2=\langle Ba^i,a^i\rangle \le 0\), for \(i\in \{0,1\}\). Furthermore, if \(\delta \) is sufficiently small, then considering that \({{\bar{x}}}\in {{\,\mathrm{int}\,}}(\mathscr {K})\), we have \(a^0,a^1\in {{\,\mathrm{int}\,}}\mathscr {K}\). Hence, \(a^0,a^1\in [\varphi _A\le \mu ]\) and \(a^s\notin [\varphi _A\le \mu ]\). By using Corollary 2.2, this contradicts the spherical quasiconvexity of A. \(\square \)
Lemma 3.3
Let \(A \in {\mathbb {R}}^{n\times n}\) and \(\lambda ,c\in {\mathbb {R}}\) such that \(\lambda \le c\). If the matrix \(\lambda I_n-A\) is \({\mathscr {K}}\)-copositive, then \([\varphi _A\le c]={{\,\mathrm{int}\,}}({{\mathscr {K}}})\). As a consequence, the set \([\varphi _A\le c]\) is convex.
Proof
Since \(\lambda \le c\), we have \(\langle A x,x \rangle -c\Vert x\Vert ^2\le \langle A x,x \rangle -\lambda \Vert x\Vert ^2= \langle (A-\lambda I_n) x,x \rangle \). Thus, considering that \(\lambda I_n-A\) is \({\mathscr {K}}\)-copositive, we conclude that \(\langle A x,x \rangle -c\Vert x\Vert ^2\le 0\), for all \(x\in {{\,\mathrm{int}\,}}({{\mathscr {K}}})\), which implies that \([\varphi _A\le c]=\{x\in {{\,\mathrm{int}\,}}({{\mathscr {K}}}):~ \langle A x,x \rangle -c \Vert x\Vert ^2\le 0\}={{\,\mathrm{int}\,}}({{\mathscr {K}}})\). \(\square \)
Theorem 3.1
Let \(n\ge 3\), \(k\ge 1\), \(A=A^\top \in {\mathbb {R}}^{n\times n}\) and \(\{v^1, v^2,\ldots ,v^n\}\) be an orthonormal system of eigenvectors of A corresponding to the eigenvalues \(\lambda _1=\cdots =\lambda _k<\lambda _{k+1}\le \cdots \le \lambda _n\), respectively. Then, we have the following statements:
-
(i)
If \(q_A\) is quasiconvex and not constant, then \(k=1\).
-
(ii)
If \(q_A\) is quasiconvex and not constant, then either \(\lambda _2\le {\text {min}}_{x\in {\overline{{\mathscr {C}}}}} q_A(x)\) or \({\text {max}}_{x\in {\overline{{\mathscr {C}}}}} q_A(x)\le \lambda _2\).
-
(iii)
Suppose that \(k=1\) and \(\lambda _2I_n-A\) is \(\mathscr {K}\)-copositive. Then, \(q_A\) is spherically quasiconvex if and only if \(v^1 \in {{\mathscr {W}}}^*\cup -{{\mathscr {W}}}^*\). In particular if \(v^1 \in {\mathscr {K}}^*\), then \(q_A\) is spherically quasiconvex.
-
(iv)
Suppose that \({{\mathscr {K}}}^*= {{\mathscr {K}}}\) and \(v^1\in {\mathscr {K}}\). Then, \(q_A\) is spherically quasiconvex if and only if \(k=1\) and \(\lambda _2 I_n-A\) is \({\mathscr {K}}\)-copositive.
Proof
Items (i) and (ii) follow from Proposition 3.1. Item (iii) follows from Lemmas 3.1, 3.3 and Corollary 2.2. Next, we prove item (iv). Suppose that \(k=1\), \({{\mathscr {K}}}^*= {{\mathscr {K}}}\) and \(v^1\in {{\mathscr {K}}}\). If \(\lambda _2 I_n-A\) is \({{\mathscr {K}}}\)-copositive, then the result follows from item (iii). If \(q_A\) is spherically quasiconvex, then item (i) implies that \(k=1\) and item (ii) implies that either \(A-\lambda _2 I_n\) or \(\lambda _2 I_n-A\) is \({{\mathscr {K}}}\)-copositive. If \(A-\lambda _2 I_n\) is \({{\mathscr {K}}}\)-copositive, then \(v^1\in {{\mathscr {K}}}\) implies \(\langle (A-\lambda _2 I_n)v^1,v^1\rangle \ge 0\). Hence, \(\lambda _1\ge \lambda _2\), which contradicts \(k=1\). Therefore, this case cannot hold and we must have \(\lambda _2 I_n-A\) is \({\mathscr {K}}\)-copositive. \(\square \)
The next corollary follows by combining Lemma 3.3 and Corollary 3.1.
Corollary 3.2
Let \(n\ge 3\), \(A=A^\top \in {\mathbb {R}}^{n\times n}\) and \(\lambda _1<\lambda _{2}\le \cdots \le \lambda _n\) the eigenvalues of A. Suppose that \(\lambda _2\le (\lambda _1+\lambda _3)/2\) and \(\lambda _2I_n-A\) is \(\mathscr {K}\)-copositive. If \({\mathscr {K}}\cap -{\mathscr {L}}_{\lambda _2}=\{0\}\) or \({\mathscr {K}}\cap \mathscr {L}_{\lambda _2}=\{0\}\), then \(q_A\) is spherically quasiconvex.
In the following two theorems, we present classes of quadratic quasi-convex functions defined in spherically subdual convex sets, which include as particular instances [10, Examples 18 and 19].
Theorem 3.2
Let \(n\ge 3\), \(A=A^\top \in {\mathbb {R}}^{n\times n}\) and \(\{v^1, v^2,\ldots ,v^n\}\) be an orthonormal system of eigenvectors of the matrix A corresponding to the eigenvalues \(\lambda _1 , \lambda _2, \ldots , \lambda _n\) , respectively. Moreover, assume that \(\lambda :=\lambda _1\), \(\mu :=\lambda _2=\ldots = \lambda _{n-1}\), \(\eta :=\lambda _n\) and
where \( |{\cdot }|^{{\mathscr {K}}}\) is defined in (1). Then, the quadratic function \(q_A\) is spherically quasi-convex.
Proof
By using the spectral decomposition of A, we have
Hence, for all \(x\in {{\mathscr {K}}}\), by using \(\Vert x\Vert ^2=\sum _{i=1}^n\langle v^i,x\rangle ^2\) and (24), after some calculations we obtain
To proceed with the proof, we note that (1) implies that \(|v^n|^{{\mathscr {K}}}\in {{\mathscr {K}}}+{{\mathscr {K}}}^*\) and, owing to \({\mathscr {K}} \subseteq {{\mathscr {K}}}^*\), we conclude that \(|v^n|^{{\mathscr {K}}}\in {\mathscr {K}}^*\). Thus, (23) implies that
Hence, for all \(x\in {{\mathscr {K}}}\), the last inequality yields
On the other hand, by using \(|{v^{n}}|^{{\mathscr {K}}}=\mathrm{P}_{\mathscr {K}}(v^{n})+\mathrm{P}_{{{\mathscr {K}}}^*}(-\,v^{n})\), \({v^{n}}=\mathrm{P}_{\mathscr {K}}({v^{n}})-\mathrm{P}_{{{\mathscr {K}}}^*}(-\,{v^{n}})\), \(P_{{\mathscr {K}}}(v^n)\in K\subseteq K^*\), we obtain \(\langle v^n+|v^n|^{{\mathscr {K}}},x\rangle \langle v^n-|v^n|^{{\mathscr {K}}},x\rangle =-\,4\langle P_{{\mathscr {K}}}(v^n),x\rangle \langle P_{{\mathscr {K}}^*}(-\,v^n),x\rangle \le 0\), for all \(x\in {{\mathscr {K}}}\). Thus, due to \(\lambda<\mu <\eta \), the previous inequality together with (27) implies
Thus, considering that \(\lambda <\mu \), the combination of (25) with (28) implies that \(\mu \mathrm{I_n}-A\) is \({{\mathscr {K}}}\)-copositive. Taking into account that \(|v^n|^{{\mathscr {K}}}\in {{\mathscr {K}}}^*\), (23) implies \(v^1\in {\mathscr {K}}^*\). Therefore, we can apply the item (iii) of Theorem 3.1 to conclude that \(q_A\) is spherically quasi-convex. \(\square \)
Remark 3.2
The subduality of \({\mathscr {K}}\) is essential for the proof of Theorem 3.2. Indeed, suppose that the cone \({\mathscr {K}}\) is not-selfdual and \({\mathscr {K}}^*\) is subdual (for example take a pointed closed convex cone which contains the nonnegative orthant in its interior). Then, \({\mathscr {K}}^*\subsetneqq {\mathscr {K}}\), hence \({\mathscr {K}}\) is not subdual. Choose \({\mathscr {K}}\) such that \(v^n\in {\mathscr {K}}{\setminus }{\mathscr {K}}^*\). It follows that \(|v^n|^{{\mathscr {K}}}=v^n\notin {\mathscr {K}}^*\) and therefore the first inequality in (26) fails. Hence, the proof of Theorem 3.2 fails.
The following example satisfies the assumptions of Theorem 3.2.
Example 3.1
Letting \( {{\mathscr {K}}}={\mathbb {R}}^{n}_{+}\) and \(\lambda<(\lambda +\eta )/2<\mu <\eta \), the unit vectors \(v^1=(e^1+e^n)/\sqrt{2}\), \(v^2=e^2, \ldots , v^{n-1}=e^{n-1}, v^n=(e^1-e^n)/\sqrt{2}\) are pairwise orthogonal and satisfy the condition (23). Now, taking \(\mathscr {K} = \mathscr {L}\) and denoting \(v^n = \left((v^n)_1, (v^n)^2\right)\), by using Lemma 2.2, (23) can be written as
and \(\lambda<\mu <\eta \). The vectors \(v^1=(e^1+e^n)/\sqrt{2}, v^2=e^2, ~ \ldots , ~ v^{n-1}=e^{n-1}, v^n=(-\,e^1+e^n)/\sqrt{2}\) are pairwise orthogonal and satisfy the last inclusion.
Theorem 3.3
Let \(n\ge 3\), \(A=A^\top \in {\mathbb {R}}^{n\times n}\) and \(\{v^1, v^2,\ldots ,v^n\}\) be an orthonormal system of eigenvectors of A corresponding to the eigenvalues \(\lambda _1 , \lambda _2 , \ldots , \lambda _n\), respectively, such that \(v^1\in {{\,\mathrm{int}\,}}({{\mathscr {K}}}^*)\). Let
Assume that
Then, \(\lambda _{2} \mathrm{I_n}-A\) is \({{\mathscr {K}}}\)-copositive. Consequently, the quadratic function \(q_A\) is spherically quasi-convex.
Proof
Note that the spectral decomposition of A implies \(A=\sum _{i=1}^n\lambda _i v^i(v^i)^\top \). Thus, considering that \(\Vert x\Vert ^2=\sum _{i=1}^n\langle v^i,x\rangle ^2\), for all \(x\in {{\mathscr {K}}}\), we conclude that
Since (30) implies \(\lambda _{2}-\lambda _1>0\) and \(0\le \lambda _{j}-\lambda _{2}\le \lambda _n-\lambda _{2}\), for all \(j=3, \ldots , n\), it follows from (31) that
Since the second equality in (29) implies \( \langle v^{3}, x\rangle ^2+ \cdots + \langle v^n, x\rangle ^2\le \eta \langle v^1, x\rangle ^2\), the latter inequality becomes
First assume that \(\delta =1/\eta \). Thus, the last inequality in (30) implies \(\eta (\lambda _n-\lambda _{2})/(\lambda _{2}-\lambda _{1})\le 1\), which combined with (32) yields
Next, assume that \(\delta =\alpha \). First of all, note that \(\sum _{i=3}^n\langle v^i,y\rangle ^2\le \sum _{i=1}^n\langle v^i,y\rangle ^2=\Vert y\Vert ^2=1\), for all \(y\in S^n\). Thus, using (29), we conclude that
Hence, it follows from (32) that
Due to \(\delta =\alpha \), the last inequality in (30) implies \( (\lambda _n-\lambda _{2})/[\alpha (\lambda _{2}-\lambda _{1})]\le 1\), which together with (34) also implies (33). Hence, we conclude that \(\lambda _{2} \mathrm{I_n}-A\) is \({{\mathscr {K}}}\)-copositive. Therefore, since \(v^1\in {{\mathscr {K}}}^*\) and it is an eigenvector of A corresponding to the eigenvalue \(\lambda _1\), by applying item (iii) of Theorem 3.1, we can conclude that the function \(q_A\) is spherically quasi-convex. \(\square \)
Remark 3.3
The numbers \(\alpha \) and \(\eta \) have an analytical motivation for the required copositivity. The number \(\alpha \) is the cosine square of the maximal angle between the vector \(v^1\) and the cone \(\mathscr {K}\) (i.e., the maximal Euclidean angle between the first eigenvector and the nonzero vectors of the cone). For the number \(\eta \), consider the affine hyperspace \({\mathscr {H}}=\{x\in {\mathbb {R}}^n:~\langle v^1,x\rangle =1\}\) and the \((n-2)\)-dimensional linear subspace \({\mathscr {M}}=\{x\in {\mathbb {R}}^n:~\langle v^1,x\rangle =0\rangle ,\text { }\langle v^2,x\rangle =0\}\). Then, \(\eta \) is the maximal length of a vector in the orthogonal projection of the slice \(\mathscr {K}\cap {\mathscr {H}}\) onto \({\mathscr {M}}\).
In the following, we present an example satisfying the assumptions of Theorem 3.3.
Example 3.2
Letting \({{\mathscr {L}}}\) the Lorentz cone, the vectors \(v^i=e^i\), for all \(i\in \{1, \ldots , n\}\) and the eigenvectors \(\lambda _1<\lambda _2\le \cdots \le \lambda _n<\lambda _2 +(1/2)(\lambda _2-\lambda _1)\), the condition (30) is satisfied. In this case, \(\alpha =1/2\).
Corollary 3.3
Let \(n\ge 3\) and \(\{v^1, v^2,\ldots ,v^n\}\) be an orthonormal system of eigenvectors of \(A=A^\top \in {\mathbb {R}}^{n\times n}\) corresponding to the eigenvalues \(\lambda _1 , \lambda _2 , \ldots ,\lambda _n\) , respectively. Assume that \(\lambda _1=:\lambda <\mu :=\lambda _2=\cdots =\lambda _n\). If \(v^1 \in {{\mathscr {K}}}^*\), then \(q_{A}\) is spherically quasi-convex. \(\square \)
Proof
Using the spectral decomposition of A, we have
Since \(\Vert x\Vert ^2=\sum _{i=1}^n\langle v^i,x\rangle ^2\), for all \(x\in {\mathbb {R}}^{n}\), by using (35) and \(\lambda <\mu \), we obtain
In particular, (36) implies that \(\mu I_n-A\) is \({\mathscr {K}}\)-copositive. Thus, since \( v^1 \in {{\mathscr {K}}}^*\), by applying item (iii) of Theorem 3.1 with \(\lambda _2=\mu \) we can conclude that the function \(q_A\) is spherically quasi-convex. \(\square \)
In the next example, we show how to generate matrices satisfying the assumptions of Corollary 3.3 and consequently generate spherically quasi-convex functions on spherically subdual convex sets.
Example 3.3
The Householder matrix associated to \(v\in {{\,\mathrm{int}\,}}({{\mathscr {K}}}^*)\) is defined by \(H:=\mathrm{I_n}-2vv^{\mathrm{T}}/\Vert v\Vert ^2\). We know that H is a symmetric and nonsingular matrix. Furthermore, \(Hv=-\,v\) and \(Hu=u\) for all \(u\in {{\mathscr {S}}}\), where \( {{\mathscr {S}}}:=\{ u\in {\mathbb {R}}^n~:~ \langle v, u \rangle =0\}\). Since the dimension of \({{\mathscr {S}}}\) is \(n-1\), then we have 1 and \(-1\) are eigenvalues of H with multiplicities \(n-1\) and 1, respectively. Moreover, considering that \(v\in {{\,\mathrm{int}\,}}({{\mathscr {K}}}^*)\), Corollary 3.3 implies that \(q_{H}(x)=\langle Hx,x\rangle \) is spherically quasi-convex.
4 Spherically Quasi-convex Quadratic Functions on the Spherical Lorentz Convex Set
In this section, we present a condition partially characterizing the spherical quasi-convexity of quadratic functions on spherically convex sets associated to the Lorentz cone. First, we remark that for the Lorentz cone \({{\mathscr {L}}}\), since by Lemma 2.3 we have a characterization of \({{\mathscr {L}}}\)-copositive matrices, item (iii) of Theorem 3.1 provides a more general result than Theorem 3.3.
Proposition 4.1
Let \({{\mathscr {L}}}\) be the Lorentz cone, \(n\ge 2\), \(A=A^\top \in {\mathbb {R}}^{n\times n}\), \(\lambda _1\le \lambda _2\le ...\le \lambda _n\) be the eigenvalues of A, \(v^1\) be an eigenvector of A corresponding to \(\lambda _1\) and \(J={{\,\mathrm{diag}\,}}(1,-1,\ldots ,-1)\in {\mathbb {R}}^{n\times n}\). If \(v^1\in {{\mathscr {L}}}\) and there exists an \(\rho \ge 0\) such that \(\lambda _2I_n-A-\rho J\) is positive semidefinite, then \(q_A\) is spherically quasiconvex.
Proof
If there exists an \(\rho \ge 0\) such that \(\lambda _2I_n-A-\rho J\) is positive semidefinite, then it follows from Lemma 2.3 that \(\lambda _2I_n-A\) is a \({{\mathscr {L}}}\)-copositive matrix. Therefore, considering that \(v^1\in {{\mathscr {L}}}={{\mathscr {L}}}^*\) and it is an eigenvector of A associated to the eigenvalue \(\lambda _1\), by applying item (iii) of Theorem 3.1, we conclude that \(q_A\) is spherically quasi-convex. \(\square \)
The next result is a version of [10, Theorem 20] for this cone.
Theorem 4.1
Let \(n\ge 3\) and \(\{v^1, v^2,\ldots ,v^n\}\) be an orthonormal system of eigenvectors of \(A=A^\top \in {\mathbb {R}}^{n\times n}\) corresponding to the eigenvalues \(\lambda _1 , \lambda _2 , \ldots ,\lambda _n\) , respectively. Assume that \(\lambda _1=:\lambda <\mu :=\lambda _2=\cdots =\lambda _n\). Then, \(q_{A}\) is a spherically quasi-convex function if and only if \(v^1\in {{\mathscr {L}}}\).
Proof
If there exists an eigenvector of A corresponding to the smaller eigenvalue belonging to \({{\mathscr {L}}}\), then Corollary 3.3 implies that \(q_{A}\) is spherically quasi-convex. Conversely, assume that \(q_{A}\) is spherically quasi-convex. Thus, by using the spectral decomposition of A, we have
We can also assume, without loss of generality, that the number \(v^1_ 1\ge 0\). Let \(x\in \partial {{\mathscr {L}}}{\setminus }\{0\}\) and note that \(y=2x_1e^1-x\in \partial {{\mathscr {L}}}{\setminus }\{0\}\). Since \(\sum _{i=1}^nv^i(v^i)^\top =I_n\) (i.e., the spectral decomposition of \(I_n\)) and \(\langle x, y\rangle =0\) (37) implies that
Since \(x, y\in {{\mathscr {L}}}\), \(\langle x, y\rangle =0\) and \({{\mathscr {L}}}\) is a self-dual cone, it follows from Corollary 2.1 that \( \langle Ax,y\rangle \le 0\). Thus, considering that \(\lambda <\mu \) and \(y=2x_1e^1-x\) (38) yields
On the other hand, due to \(x\in {{\mathscr {L}}}\), we have \(x^1\ge 0\). Thus, since \(v_1^1\ge 0\), if \(\langle v^1,x\rangle <0\), then we have \(\langle v^1,x\rangle [(2v^1_1x_1-\langle v^1,x\rangle ]<0\), which contradicts (39). Hence \(\langle v^1,x\rangle \ge 0\), where x can be chosen arbitrarily in \(\partial {{\mathscr {L}}}{\setminus }\{0\}\). Therefore, \(v^1\in {\mathscr {L}}\) and the proof is complete. \(\square \)
5 Perspectives and Open Problems
First of all, we note that for all our classes of spherically quasi-convex quadratic functions \(q_A\) on the spherically subdual convex set \({{\mathscr {C}}}=S^{n-1}\cap {{\,\mathrm{int}\,}}({{\mathscr {K}}})\), the matrix A has the smallest eigenvalue with multiplicity one and the associated eigenvector belongs to the dual \({{\mathscr {K}}}^*\) of the subdual cone \({{\mathscr {K}}}\). We conjecture that this condition is necessary and sufficient to characterize spherically quasi-convex quadratic functions. It is worth noting that for selfdual cones such a characterization can be obtained by proving that, under the assumption \(v^1\in {{\mathscr {K}}}\), the matrix \(\lambda _2 I_n-A\) is \({{\mathscr {K}}}\)-copositive, where \(\{v^1, v^2,\ldots ,v^n\}\) is an orthonormal system of eigenvectors of A corresponding to the eigenvalues \(\lambda _1<\lambda _{2}\le \cdots \le \lambda _n\), respectively. We also remark that, in Theorem 4.1 we present a partial characterizations of spherically quasi-convex quadratic functions on the spherical Lorentz convex set. However, the general question remains open even for this specific set.
Although efficient algorithms for solving intrinsically unconstrained problems posed on the whole sphere are well known (see [4,5,6,7,8,9]), an even more challenging problem is to develop efficient algorithms for constrained quadratic optimization problems on spherically convex sets. As far as we know, the first algorithm for a special problem on this subject has recently appeared in [24, Example 5.5.2]; see also [25, Sect. 10].
Minimizing a quadratic function on the intersection of the Lorentz cone with the sphere is particularly relevant, since the nonnegativity of the minimum value is equivalent to the Lorentz-copositivity of the corresponding matrix, see [2, 3]; see also [24, Example 5.5.2] and [25, Sect. 10]. In general, replacing the Lorentz cone with an arbitrary closed convex cone K leads to the more general concept of \({{\mathscr {K}}}\)-copositivity. By considering the intrinsic geometrical properties of the sphere, interesting perspectives for detecting the general copositivity of matrices emerge.
Note that isometries map geodesics onto geodesics and also preserve convex sets. Moreover, the composition of convex and quasi-convex functions with isometries is also convex and quasi-convex functions, respectively. Hence, all concepts studied in this paper are preserved by isometries of the sphere. On the other hand, considering that the Euclidean space and the sphere is not isometric, there is no clear relationship between concepts of convexity and quasi-convexity of these geometric spaces. On the other hand, in principle, it is possible to extend concepts similar to those studied in this paper to any submanifolds of the Euclidean space, which naturally depend on curvatura/metric. As a consequence, the following question arises: It is possible to unify these concepts or part of them in a common framework? This is an interesting question that deserves to be investigated.
6 Conclusions
In [10, 22, 26, 27], we studied some intrinsic properties of the spherically convex sets and functions. In the present paper, we showed further developments of this topic. In particular, many of the results obtained in the previous papers are related to the conditions implying spherical quasi-convexity of quadratic functions on the spherical positive orthant. In the present paper, these results are generalized to subdual cones. As far as we know, this is the pioneer study of spherically quasi-convex quadratic functions on spherically subdual convex sets. As stated in Sect. 5, there are still interesting questions to be answered in this topic, we foresee further progress in these direction in the near future.
References
Rendl, F., Wolkowicz, H.: A semidefinite framework for trust region subproblems with applications to large scale minimization. Math. Program. 77(2, Ser. B), 273–299 (1997). Semidefinite programming
Loewy, R., Schneider, H.: Positive operators on the \(n\)-dimensional ice cream cone. J. Math. Anal. Appl. 49, 375–392 (1975)
Gajardo, P., Seeger, A.: Solving inverse cone-constrained eigenvalue problems. Numer. Math. 123(2), 309–331 (2013)
Hager, W.W.: Minimizing a quadratic over a sphere. SIAM J. Optim. 12(1), 188–208 (2001)
Hager, W.W., Park, S.: Global convergence of SSM for minimizing a quadratic over a sphere. Math. Comput. 74(251), 1413–1423 (2005)
Smith, S.T.: Optimization techniques on Riemannian manifolds. In: Hamiltonian and Gradient Flows, Algorithms and Control, Fields Inst. Commun., vol. 3, pp. 113–136. Amer. Math. Soc., Providence (1994)
So, A.M.C.: Deterministic approximation algorithms for sphere constrained homogeneous polynomial optimization problems. Math. Program. 129(2, Ser. B), 357–382 (2011)
Zhang, L.: On the convergence of a modified algorithm for the spherical facility location problem. Oper. Res. Lett. 31(2), 161–166 (2003)
Zhang, X., Ling, C., Qi, L.: The best rank-1 approximation of a symmetric tensor and related spherical optimization problems. SIAM J. Matrix Anal. Appl. 33(3), 806–821 (2012)
Ferreira, O., Németh, S., Xiao, L.: On the spherical quasi-convexity of quadratic functions. Linear Algebra Appl. 562, 205–222 (2019)
Martos, B.: Subdefinite matrices and quadratic forms. SIAM J. Appl. Math. 17, 1215–1223 (1969)
Ferland, J.A.: Maximal domains of quasi-convexity and pseudo-convexity for quadratic functions. Math. Program. 3, 178–192 (1972)
Schaible, S.: Quasiconvex, pseudoconvex, and strictly pseudoconvex quadratic functions. J. Optim. Theory Appl. 35(3), 303–338 (1981)
Komlósi, S.: Generalized convexity of a certain class of quadratic functions. Izv. Vyssh. Uchebn. Zaved. Mat. 9, 38–43 (1984)
Karamardian, S., Schaible, S., Crouzeix, J.P.: Characterizations of generalized monotone maps. J. Optim. Theory Appl. 76(3), 399–413 (1993)
Moreau, J.J.: Décomposition orthogonale d’un espace hilbertien selon deux cônes mutuellement polaires. C. R. Acad. Sci. Paris 255, 238–240 (1962)
Hiriart-Urruty, J.B., Lemaréchal, C.: Convex analysis and minimization algorithms: Fundamentals. I, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 305. Springer, Berlin (1993)
Fukushima, M., Luo, Z.Q., Tseng, P.: Smoothing functions for second-order-cone complementarity problems. SIAM J. Optim. 12(2), 436–460 (2002)
Jakubovič, V.A.: The \(S\)-procedure in nonlinear control theory. Vestnik Leningrad. Univ. 1, 62–77 (1971) (in Russian)
Jakubovič, V.A.: The \(S\)-procedure in nonlinear control theory. Vestnik Leningrad. Univ. 4, 73–93 (1977) (English translation)
Pólik, I., Terlaky, T.: A survey of the S-lemma. SIAM Rev. 49(3), 371–418 (2007)
Ferreira, O.P., Iusem, A.N., Németh, S.Z.: Projections onto convex sets on the sphere. J. Glob. Optim. 57(3), 663–676 (2013)
Mangasarian, O.L.: Nonlinear Programming. In: Classics in Applied Mathematics, vol. 10. Society for Industrial and Applied Mathematics (SIAM), Philadelphia (1994)
Lange, K.: MM Optimization Algorithms. Appl. Math. vol. 147. Society for Industrial and Applied Mathematics (SIAM), Philadelphia ( 2016)
Bauschke, H.H., Bui, M.N., Wang, X.: Projecting onto the intersection of a cone and a sphere. SIAM J. Optim. 28(3), 2158–2188 (2018)
Ferreira, O.P., Iusem, A.N., Németh, S.Z.: Concepts and techniques of optimization on the sphere. TOP 22(3), 1148–1170 (2014)
Ferreira, O.P., Németh, S.Z.: On the spherical convexity of quadratic functions. J. Glob. Optim. 73(3), 537–545 (2019)
Acknowledgements
This work was supported by CNPq (Grants 302473/2017-3, 408151/2016-1) and FAPEG/PRONEM-201710267000532.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ferreira, O.P., Németh, S.Z. & Xiao, L. On the Spherical Quasi-convexity of Quadratic Functions on Spherically Subdual Convex Sets. J Optim Theory Appl 187, 1–21 (2020). https://doi.org/10.1007/s10957-020-01741-7
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10957-020-01741-7