Abstract
In this paper, a characterization of the solution sets of smooth convex optimization problems on Riemannian manifolds is obtained in terms of the Riemannian gradients of the cost functions.
1 Introduction
This paper is concerned with a characterization of the solution sets of convex optimization problems on Riemannian manifolds. Characterizations of the solution sets of nonlinear programming problems with multiple solutions, provided that one minimizer is known, play an important role in many fields, such as optimization problems, variational inequalities, and equilibrium problems. Mangasarian [16] presented several characterizations of the solution sets of differentiable convex programs on linear spaces and applied them to study monotone linear complementarity problems. Further investigation was carried out by Burke and Ferris [3]. In the last decade, Mangasarian-type characterizations have been derived for several smooth and nonsmooth convex or generalized convex problems on linear spaces; see [12, 13, 23] and the references therein.
Extensions of concepts and techniques from Euclidean spaces to Riemannian manifolds are natural and lead to successful tools in optimization; such topics, of both practical and theoretical interest, have therefore been the subject of several research papers. Udriste [22] and Rapcsák [17] introduced the theory of convex functions on Riemannian manifolds, motivated by the fact that some constrained optimization problems can be seen as unconstrained ones from the Riemannian geometry point of view. Another advantage is that optimization problems with nonconvex objective functions can be rewritten as convex optimization problems by endowing the space with an appropriate Riemannian metric (see Example 2.1). Recently, a number of important results have been obtained on various aspects of optimization theory and its applications on Riemannian manifolds, introducing several important techniques and methods for the existence of solutions of optimization problems on Riemannian manifolds; see [1, 2, 8, 20].
The purpose of this paper is to present a simple characterization of the solution sets of convex optimization problems on Riemannian manifolds in terms of the Riemannian gradient of the cost function. To the best of our knowledge, such a characterization has not been given before, and formulating and proving it on Riemannian manifolds requires several tools and techniques from Riemannian geometry. One application of this characterization is to the problem of minimizing convex quadratic functions defined on a convex subset of a sphere, which arises in proving fixed point theorems, surjectivity theorems, and existence theorems for complementarity problems and variational inequalities by calculating scalar derivatives; for more details about these theorems and their applications, see [6] and the references therein. In particular, some existence theorems can be reduced to optimizing a quadratic function on a convex subset of the sphere. Moreover, minimization problems for quadratic functions defined on the sphere occur as subproblems in methods of nonlinear programming; see [18].
2 Preliminaries
In this section, we introduce some fundamental properties and notations of Riemannian manifolds. These basic facts can be found in any introductory book on Riemannian geometry; see for example [19]. Throughout this paper, M is an n-dimensional Riemannian manifold with a Riemannian metric \(\langle \cdot , \cdot \rangle _{x}\) on the tangent space \(T_{x}M\cong {{\mathbb {R}}}^{n}\) for every \(x\in M\). The corresponding norm is denoted by \(\Vert \cdot \Vert \). Let us recall that the length of a piecewise \(C^{1}\) curve \(\gamma :[a,b]\rightarrow M\) is defined by
\(L(\gamma ):=\int _{a}^{b}\Vert \gamma ^{\prime }(t)\Vert \,\mathrm{d}t.\)
By minimizing the length functional over the set of all piecewise \(C^{1}\) curves with \(\gamma (a)=x\) and \(\gamma (b)=y\) for \(x, y \in M\), we obtain the Riemannian distance on M, denoted by d(x, y). The space of vector fields on M is denoted by \({\mathcal X}(M)\), and \(\nabla \) is the Levi-Civita connection associated to M. A geodesic is a smooth curve \(\gamma \) satisfying the equation \(\nabla _{\gamma ^{\prime }(t)} \gamma ^{\prime }(t)=0\). The exponential mapping \( \exp : {\tilde{T}} M \rightarrow M\) is defined as \(\exp (v) = \gamma (1)\), where \(\gamma \) is the geodesic with starting point x and velocity \(\gamma ^{\prime }(0)=v\in T_{x}M\), and \({\tilde{T}} M\) is an open neighborhood in TM. The restriction of \(\exp \) to \(T_{x}M\cap {\tilde{T}} M\) is denoted by \(\exp _{x}\) for every \(x\in M\). For a minimizing geodesic \(\gamma :[0,1]\rightarrow M\) connecting x to y in M, and for a vector \(v \in T_xM\), there is a unique parallel vector field P along \(\gamma \) such that \(P(0)=v\); this is called the parallel translation of v along \(\gamma .\) The mapping \( T_xM\ni v\mapsto P(1)\in T_yM\) is a linear isometry from \(T_xM\) onto \(T_yM\), denoted by \(P_{x}^{y}.\)
The Riemannian metric induces a map \(f \mapsto \mathrm{grad}\, f \in {\mathcal X}(M)\) which associates to each differentiable function f its gradient at \(x\in M\) via the rule
\(\langle \mathrm{grad}\, f(x), v\rangle _{x}=\mathrm{d}f(x)(v)\quad \text{for all}\ v\in T_{x}M.\)
The Riemannian Hessian of f at a point \(x\in M\) is the linear mapping
\(\mathrm{Hess}\, f(x): T_{x}M\rightarrow T_{x}M\)
defined by
\(\mathrm{Hess}\, f(x)[v]=\nabla _{v}\, \mathrm{grad}\, f\)
for every \(v\in T_{x}M.\) Note that \(\mathrm{Hess}\,f(x)\) satisfies
\(\langle \mathrm{Hess}\, f(x)[v],v\rangle _{x}=\frac{\mathrm{d}^{2}}{\mathrm{d}t^{2}}f(\exp _{x}(tv))\Big |_{t=0},\)
and this formula, together with the self-adjointness of the Hessian, fully defines \(\mathrm{Hess}\, f(x)\).
A subset S of a Riemannian manifold is called convex if any two points x, \(y\in S\) can be joined by a unique minimizing geodesic (denoted by \(\gamma _{xy}\)) which lies entirely in S. Note that there is little consistency in the meanings attached to the terms “convex set” and “strongly convex set” (see page 105 in [9] and page 2488 in [15] and references therein).
It is known that \( ~\exp _{x}^{-1}\) is well-defined on every convex set S, \(d(x, y)=|| \exp _{x}^{-1}(y)||\) for every \(x, y\in S\), and
see [11]. Let S be a nonempty convex subset of M. A function \(f: S\rightarrow {\mathbb {R}}\) is said to be convex if for every x, \(y\in S\) and every \(t\in [0,1]\),
\(f(\gamma _{xy}(t))\le (1-t)f(x)+tf(y),\)
where \(\gamma _{xy}:[0,1]\rightarrow S\) is the minimizing geodesic with \(\gamma _{xy}(0)=x\) and \(\gamma _{xy}(1)=y\).
The following example illustrates a nonconvex function which can be written as a convex function on a Riemannian manifold with an appropriate metric (see [4]).
Example 2.1
The function \(f: {{\mathbb {R}}}^{2}\rightarrow {\mathbb {R}}\) defined by
is not convex. Endowing \({{\mathbb {R}}}^{2}\) with the metric \(g(x):=\mathrm{diag}(1,e^{2x_{1}})\), we obtain a Riemannian manifold \(M_{g}\). The Hessian matrix
is positive semidefinite, therefore f is convex on \(M_{g}\).
3 Characterization of the solution sets
Our aim is to characterize the solution set of the following optimization problem:
\(\min \ f(x)\quad \text{subject to}\quad x\in S, \qquad (1)\)
where \(S\subseteq M\) is a convex subset of M and f is a twice continuously differentiable convex function on some open convex set containing S. We denote the solution set of the optimization problem (1) by
\(\bar{S}:=\{x\in S:\ f(x)\le f(y)\ \text{for all}\ y\in S\}\)
and assume that \(\bar{S}\ne \emptyset \). If \(\bar{x}\in \bar{S}\), then
\(\bar{S}=\{x\in S:\ f(x)=f(\bar{x})\},\)
and \(\bar{S}\) is a convex subset of S. The following theorem is a generalization of [7, Proposition 15] from the sphere \(S^{n}\) to the general setting; it describes the relation between the solution set of (1) and a variational inequality.
Theorem 3.1
Let \(S\subseteq M\) be a convex subset of M and f be a twice continuously differentiable convex function on some open convex set containing S. Then \(\bar{x}\in \bar{S}\) if and only if
\(\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}y\rangle _{\bar{x}}\ge 0\quad \text{for all}\ y\in S. \qquad (2)\)
Proof
Let \(\bar{x}\in \bar{S}\), \(y\in S\), and \(\gamma _{\bar{x} y}\) be the minimal geodesic connecting \(\bar{x}\) and y. By convexity of S, \(f(\gamma _{\bar{x} y}(t))\ge f(\bar{x})\) for all \(t\in [0,1]\). Therefore
\(\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}y\rangle _{\bar{x}}=\frac{\mathrm{d}}{\mathrm{d}t}f(\gamma _{\bar{x} y}(t))\Big |_{t=0^{+}}\ge 0.\)
Now suppose that (2) holds. By convexity of f, we have
\(f(y)\ge f(\bar{x})+\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}y\rangle _{\bar{x}}\ge f(\bar{x})\quad \text{for all}\ y\in S,\)
hence \(\bar{x}\in \bar{S}\).
\(\square \)
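In the Euclidean case \(M={{\mathbb {R}}}^{n}\), one has \(\exp _{\bar{x}}^{-1}y=y-\bar{x}\), and Theorem 3.1 reduces to the classical variational inequality. The following Python snippet is a numeric sanity check of this special case; the quadratic f, the ball S, and the solver parameters are illustrative choices, not taken from the paper.

```python
import numpy as np

# Numeric sanity check of Theorem 3.1 in the Euclidean case M = R^n, where
# exp_xbar^{-1}(y) = y - xbar and (2) is the classical variational inequality.
# The data below (f, S, step size) is an illustrative choice.
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])              # positive definite, so f is convex
b = np.array([1.0, -2.0])
grad_f = lambda x: Q @ x + b            # gradient of f(x) = x'Qx/2 + b'x

center, radius = np.zeros(2), 0.5       # S: closed Euclidean ball (convex)

def project(x):                         # metric projection onto S
    d = np.linalg.norm(x - center)
    return x if d <= radius else center + radius * (x - center) / d

x = center.copy()
for _ in range(2000):                   # projected gradient descent finds xbar
    x = project(x - 0.1 * grad_f(x))
xbar = x

rng = np.random.default_rng(0)
for _ in range(1000):                   # <grad f(xbar), y - xbar> >= 0 on S
    v = rng.normal(size=2)
    y = center + radius * rng.uniform() * v / np.linalg.norm(v)
    assert grad_f(xbar) @ (y - xbar) >= -1e-8
```

Since the unconstrained minimizer \(-Q^{-1}b\) lies outside S, the computed \(\bar{x}\) is a boundary point with nonzero gradient, so the check is nontrivial.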
Now we present a characterization for the solution set of a convex optimization problem on a convex subset of a Riemannian manifold which is our main result.
Theorem 3.2
Let \(S\subseteq M\) be a convex subset of M, f be a twice continuously differentiable convex function on some open convex set containing S, and \(\bar{x}\in \bar{S}\). Then
\(\bar{S}=\{x\in S:\ \langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=0,\ P^{\bar{x}}_{x}[\mathrm{grad}\, f(x)]=\mathrm{grad}\, f(\bar{x})\}. \qquad (3)\)
Proof
We denote the right hand side of (3) by \(S^{*}\). On the contrary, we assume that \(x\in S^{*}\setminus \bar{S}.\) By convexity of f,
\(f(\bar{x})\ge f(x)+\langle \mathrm{grad}\, f(x),\exp _{x}^{-1}\bar{x}\rangle _{x}.\)
Now, by properties of the parallel translation, we get
\(\langle \mathrm{grad}\, f(x),\exp _{x}^{-1}\bar{x}\rangle _{x}=\langle P_{x}^{\bar{x}}[\mathrm{grad}\, f(x)],P_{x}^{\bar{x}}[\exp _{x}^{-1}\bar{x}]\rangle _{\bar{x}}=-\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=0,\)
hence \(f(\bar{x})\ge f(x)>f(\bar{x})\), which is a contradiction.
For the converse, let \(x\in \bar{S}\) and \(\gamma _{\bar{x} x}(t)\) be the minimal geodesic connecting \(\bar{x}\) and x. Since \(\bar{S}\) is convex, this geodesic lies entirely in \(\bar{S}\), so \(f(\gamma _{\bar{x} x }(t))=f(\bar{x})\) for all \(t\in [0,1]\). Therefore
\(\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=\frac{\mathrm{d}}{\mathrm{d}t}f(\gamma _{\bar{x} x}(t))\Big |_{t=0^{+}}=0.\)
Similarly, we have \(\langle \mathrm{grad}\, f(x),\exp _{x}^{-1}\bar{x}\rangle _{x}=0.\) These two equations and properties of the parallel transport imply
Now, we define \(F: [0,1]\rightarrow T_{\bar{x}}M\) as follows
Note that \(\gamma _{x \bar{x}}^{\prime }(t)=P_{x}^{\gamma _{x \bar{x}}(t)}[\exp _{x}^{-1}\bar{x}]\) and
hence
Since F is \(C^{1}\) and by the previous equality, we get
where
is a linear map. We claim that this linear map is positive semidefinite and therefore corresponds to a positive semidefinite matrix A. To prove the claim, note that \(\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t))\) is a positive semidefinite linear map, hence
for every \(w\in T_{\bar{x}}M\), therefore by integration with respect to t, it implies that
which proves our claim. Combining (4) and (5), we deduce that
Since A is symmetric positive semidefinite, it follows from (6) and [10, p. 431] that
which completes the proof. \(\square \)
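For \(M={{\mathbb {R}}}^{n}\), the parallel translation is the identity and \(\exp _{x}^{-1}\bar{x}=\bar{x}-x\), so Theorem 3.2 recovers Mangasarian's characterization [16] of \(\bar{S}\) via equality of gradients and an orthogonality condition. The snippet below is a small numeric check with an illustrative linear objective on the unit square (our choice of data, not from the paper):

```python
import numpy as np

# Euclidean specialization of Theorem 3.2 (Mangasarian's characterization):
# x in Sbar  iff  grad f(x) = grad f(xbar) and <grad f(xbar), x - xbar> = 0.
# Illustrative data: f(x) = x_1 on the unit square S = [0,1]^2, whose
# solution set is the left edge Sbar = {0} x [0,1].
grad_f = lambda x: np.array([1.0, 0.0])     # constant gradient of f(x) = x_1
xbar = np.array([0.0, 0.3])                 # one known minimizer

def in_characterized_set(x):
    return (np.allclose(grad_f(x), grad_f(xbar))
            and abs(grad_f(xbar) @ (x - xbar)) < 1e-12)

# Every point of the left edge satisfies the characterization ...
for t in np.linspace(0.0, 1.0, 11):
    assert in_characterized_set(np.array([0.0, t]))

# ... and no point with x_1 > 0 does.
rng = np.random.default_rng(1)
for _ in range(500):
    x = np.array([rng.uniform(0.1, 1.0), rng.uniform(0.0, 1.0)])
    assert not in_characterized_set(x)
```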
The following corollary is an immediate consequence of Theorem 3.2.
Corollary 3.1
Let \(S\subseteq M\) be a convex subset of M, f be a twice continuously differentiable convex function on some open convex set containing S, and \(\bar{x}\in \bar{S}\). Then
- (i)
the function \(x\mapsto \Vert \mathrm{grad}\, f(x)\Vert \) is constant on \(\bar{S}.\)
- (ii)
\(\bar{S}=\tilde{S}=\{x\in S:\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}\le 0, P^{\bar{x}}_{x}[\mathrm{grad}\, f( x)]=\mathrm{grad}\, f(\bar{x})\}.\)
Proof
The first part is obtained from Theorem 3.2 and the isometric property of the parallel translation. For the second part, the inclusion \(\bar{S}\subseteq \tilde{S}\) holds by Theorem 3.2. For the converse, assume that \(x\in \tilde{S}\). Since the parallel translation is an isometry and f is convex, we deduce that
\(f(\bar{x})\ge f(x)+\langle \mathrm{grad}\, f(x),\exp _{x}^{-1}\bar{x}\rangle _{x}=f(x)-\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}\ge f(x).\)
This shows that \(f(\bar{x})\ge f( x)\). Since \(\bar{x}\) is a minimizer of f on S, we have \(f(\bar{x})= f( x)\), which proves that \( \tilde{S}\subseteq \bar{S}\). \(\square \)
Now, we present some examples to illustrate how our characterization of solution sets of optimization problems works in particular nontrivial settings of Riemannian manifolds.
Example 3.1
Let M be a Hadamard manifold and U be a nonempty open convex subset of M. Pick \(x_{0},y_{0} \in U\) with \(x_{0}\ne y_{0}.\) Choose \(\varepsilon >0\) such that \( S:= {\bar{B}}(x_{0},\varepsilon )\subset U\) and \(A:={\bar{B}}({y_{0}},\varepsilon )\subset U\) are convex and \(S\cap A=\emptyset .\) Define the function \(f: S\rightarrow {\mathbb {R}}\) by
\(f(x):=d_{A}^{2}(x),\)
where \(d_{A}(x):=\inf _{a\in A}d(x,a)\) is the distance function to A.
Note that the function \(d_{A}^{2}(\cdot )\) is convex and twice continuously differentiable on U. Let \(\bar{x}\in \bar{S}\) and \(p:=\pi _{A}(\bar{x}),\) where \(\pi _{A}(\bar{x})\) is the metric projection of \(\bar{x}\) on A. We have that
\(\mathrm{grad}\, f(\bar{x})=-2\exp _{\bar{x}}^{-1}p.\)
Thus, by using Theorem 3.2, we deduce
\(\bar{S}=\{x\in S:\ \langle \exp _{\bar{x}}^{-1}p,\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=0,\ P^{\bar{x}}_{x}[\exp _{x}^{-1}\pi _{A}(x)]=\exp _{\bar{x}}^{-1}p\}.\)
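To make Example 3.1 concrete, one can take \(M={{\mathbb {R}}}^{2}\) (a Hadamard manifold), where geodesics are line segments and \(\mathrm{grad}\, d_{A}^{2}(x)=2(x-\pi _{A}(x))\). The centers, radius, and sample sizes below are illustrative choices; the snippet checks the variational inequality of Theorem 3.1 at the point of S closest to A.

```python
import numpy as np

# Example 3.1 in M = R^2 (a Hadamard manifold): f = d_A^2 with A a closed
# ball, and grad f(x) = 2(x - pi_A(x)).  Illustrative data, not from the paper.
x0, y0, eps = np.array([0.0, 0.0]), np.array([4.0, 0.0]), 1.0   # S, A disjoint

def proj_A(x):                                  # metric projection onto A
    d = np.linalg.norm(x - y0)
    return x if d <= eps else y0 + eps * (x - y0) / d

grad_f = lambda x: 2.0 * (x - proj_A(x))        # gradient of f = d_A^2

u = (y0 - x0) / np.linalg.norm(y0 - x0)
xbar = x0 + eps * u                             # closest point of S to A

rng = np.random.default_rng(2)
for _ in range(1000):                           # Theorem 3.1: VI holds on S
    v = rng.normal(size=2)
    y = x0 + eps * rng.uniform() * v / np.linalg.norm(v)
    assert grad_f(xbar) @ (y - xbar) >= -1e-9
```

Here \(\mathrm{grad}\, f(\bar{x})=2(\bar{x}-p)\) points from the projection p back toward \(\bar{x}\), and the inequality \(\langle \mathrm{grad}\, f(\bar{x}), y-\bar{x}\rangle \ge 0\) holds for every y in the ball S.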
Example 3.2
The general matrix rank minimization problem (RMP), expressed as
\(\min \ \mathrm{rank}\, X\quad \text{subject to}\quad X\in S\cap P_{n},\)
where \(X\in {{\mathbb {R}}}^{n\times n}\), \(P_n\) is the set of positive semidefinite \(n\times n\) matrices, and S is a convex set, is computationally hard to solve. This problem arises in many areas such as control, system identification, statistics, signal processing, and computational geometry; see [5] and the references therein. Rather than solving the RMP, one can use the function
\(f(X)=\log \det (X+\delta I)\)
as a smooth surrogate for \(\mathrm{rank}\, X \) and instead solve the following problem
\(\min \ \log \det (X+\delta I)\quad \text{subject to}\quad X\in S\cap P_{n},\)
where \(\delta >0\) can be interpreted as a small regularization constant; see [5]. Note that this surrogate is not convex on the linear space \({{\mathbb {R}}}^{n\times n}\). This application motivated us to consider the problem
\(\min \{\log \det X:\ X\in S\},\quad S:=\{X\in P_{n}^{+}:\ X\succeq A\}, \qquad (7)\)
which is not convex with respect to the Euclidean metric on \({{\mathbb {R}}}^{n\times n}\). Here \(P_n^+\) is the set of positive definite \(n\times n\) matrices and \(A\in P_n^+\).
The set of symmetric positive definite matrices, as a Riemannian manifold, is the most studied example of a manifold of nonpositive curvature. The tangent space to \(P_n^+\) at any of its points P is the space \(T_PP_n^+=\{P\}\times S_n\), where \(S_n\) is the space of symmetric \(n\times n\) matrices. On each tangent space \(T_PP_n^+\), the inner product is defined by
\(\langle U,V\rangle _{P}=\mathrm{tr}(P^{-1}UP^{-1}V).\)
The Riemannian distance between \(P,Q\in P_n^+\) is given by
\(d(P,Q)=\Big (\sum _{i=1}^{n}\log ^{2}\lambda _{i}\Big )^{1/2},\)
where \(\lambda _i, ~i=1,\ldots ,n\), are the eigenvalues of \(P^{-1}Q.\) The exponential map
\(\exp _{P}: T_{P}P_{n}^{+}\rightarrow P_{n}^{+}\)
is defined by
\(\exp _{P}(V)=P^{1/2}\exp (P^{-1/2}VP^{-1/2})P^{1/2}.\)
Moreover, if \(P\in P_n^+\), then
\(\exp _{P}^{-1}: P_{n}^{+}\rightarrow T_{P}P_{n}^{+}\)
is defined by
\(\exp _{P}^{-1}(Q)=P^{1/2}\log (P^{-1/2}QP^{-1/2})P^{1/2},\)
where \(\log \) and \(\exp \) denote the logarithm and exponential functions on the matrix space; for more details see [21]. The parallel transport along the unique geodesic connecting X and Y is defined by
\(P_{X}^{Y}[V]=EVE^{T},\quad \text{where}\ E=(YX^{-1})^{1/2}.\)
Moreover, the Riemannian gradient of a function f defined on \(P_n^+\) is obtained from the Euclidean gradient, denoted by \(\nabla f\), via the formula
\(\mathrm{grad}\, f(X)=X\,\mathrm{symm}(\nabla f(X))\,X,\)
where \(\mathrm{symm}(\nabla f(X))= \frac{1}{2}(\nabla f(X)+\nabla f(X)^T).\) For \(f(X)=\log \det (X)\), \(\nabla f(X)=X^{-1}\), and therefore \(\mathrm{grad}\, f(X)=X\). First we claim that S is a convex subset of \(P_n^+\). Assume that \(X, Y\in S\); then \(X\succeq A\) and \(Y\succeq A\). The unique geodesic connecting these two points is given by
\(\gamma (t)=X^{1/2}\big (X^{-1/2}YX^{-1/2}\big )^{t}X^{1/2},\quad t\in [0,1];\)
by using the Löwner–Heinz inequality (see [14, Lemma 2.1]), we have \(\gamma (t)\succeq A\), and therefore S is convex in \(P_n^+\). We claim that A is a solution of problem (7). By using Theorem 3.1, we need to prove that \(\langle A,\exp _A^{-1}(Y)\rangle _A\ge 0\) for all \(Y\in S\). Note that
\(\langle A,\exp _{A}^{-1}(Y)\rangle _{A}=\mathrm{tr}\big (A^{-1}\exp _{A}^{-1}(Y)\big )=\mathrm{tr}\log (A^{-1/2}YA^{-1/2}).\)
Therefore,
\(\langle A,\exp _{A}^{-1}(Y)\rangle _{A}=\log \det (A^{-1/2}YA^{-1/2})=\log \det Y-\log \det A\ge 0,\)
since \(Y\succeq A\) implies \(\det Y\ge \det A\).
To illustrate Theorem 3.2, we will see that
\(\bar{S}=\{A\}.\)
Note that \(\mathrm{grad}\, f(X)=X,\mathrm{grad}\, f(A)=A\), and
Moreover,
which shows the required equation.
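The key identity of this example can be checked numerically. The snippet below (with \(n=3\) and randomly generated data, an illustrative choice) verifies that \(\langle A,\exp _{A}^{-1}(Y)\rangle _{A}\) equals \(\log \det Y-\log \det A\) and is nonnegative for sampled \(Y\succeq A\):

```python
import numpy as np

# Numeric check of Example 3.2: for f(X) = log det X on P_n^+ with the metric
# <U,V>_P = tr(P^{-1} U P^{-1} V), grad f(A) = A, and
# exp_A^{-1}(Y) = A^{1/2} log(A^{-1/2} Y A^{-1/2}) A^{1/2}, one gets
# <A, exp_A^{-1}(Y)>_A = log det Y - log det A >= 0 whenever Y >= A.
# n = 3 and the random data are illustrative choices.
def spd_fun(M, fun):                    # scalar function of an SPD matrix
    w, V = np.linalg.eigh(M)
    return (V * fun(w)) @ V.T

rng = np.random.default_rng(3)
B = rng.normal(size=(3, 3))
A = B @ B.T + np.eye(3)                 # A in P_n^+
A_half = spd_fun(A, np.sqrt)
A_ihalf = spd_fun(A, lambda w: 1.0 / np.sqrt(w))

for _ in range(200):
    C = rng.normal(size=(3, 3))
    Y = A + C @ C.T                     # Y >= A, so Y lies in S
    L = spd_fun(A_ihalf @ Y @ A_ihalf, np.log)
    v = A_half @ L @ A_half             # exp_A^{-1}(Y)
    ip = np.trace(np.linalg.inv(A) @ v) # <A, exp_A^{-1}(Y)>_A = tr(A^{-1} v)
    expected = np.linalg.slogdet(Y)[1] - np.linalg.slogdet(A)[1]
    assert abs(ip - expected) < 1e-6 and ip > 0.0
```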
Recall that the unit sphere \(S^{2}:=\{x\in {{\mathbb {R}}}^3:~\Vert x\Vert =1\}\) is a 2-dimensional manifold with the usual Riemannian distance function defined as
\(d(x,y)=\arccos \langle x,y\rangle .\)
For every \(\bar{x} \in S^{2},\) it follows from the definition of the Riemannian metric on \(S^{2}\) that
\(\langle u,v\rangle _{\bar{x}}=\langle u,v\rangle \quad \text{for all}\ u,v\in T_{\bar{x}}S^{2},\)
where \(\langle \cdot , \cdot \rangle \) denotes the standard inner product in \({{\mathbb {R}}}^{3}.\) For \(x\in S^{2}\), the exponential map \(\exp _{x}:T_{x} S^{2}\rightarrow S^{2}\) is defined by
\(\exp _{x}(v)=\cos (\Vert v\Vert )x+\sin (\Vert v\Vert )\frac{v}{\Vert v\Vert },\quad v\ne 0. \qquad (8)\)
Moreover, \(\exp ^{-1}_{x}:S^{2}\rightarrow T_{x} S^{2}\) is given by
\(\exp _{x}^{-1}(y)=\frac{\theta }{\sin \theta }\big (y-\cos (\theta ) x\big ), \qquad (9)\)
where \(\theta =\arccos \langle x,y\rangle .\) Let \(t\mapsto \gamma (t)\) be the unique minimal geodesic in \(S^{2}\) joining \(\gamma (0)=x\) to \(\gamma (1)=y,\) and let \(u:=\frac{\gamma ^{\prime }(0)}{\Vert \gamma ^{\prime }(0)\Vert }.\) The parallel translation of a vector \(v\in T_{x} S^{2}\) along the geodesic \(\gamma \) is given by
\(P_{x}^{y}[v]=v-\langle v,u\rangle u+\langle v,u\rangle \big (\cos (\theta ) u-\sin (\theta ) x\big ); \qquad (10)\)
see [1]. In the following example, which is an improvement of [16, Corollary 1], we consider an optimization problem on convex subsets of the unit sphere involving quadratic cost functions.
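The sphere formulas (8)–(10) can be verified numerically. The snippet below (our own sanity check, with randomly drawn points) confirms that \(\exp _{x}\) inverts \(\exp _{x}^{-1}\), that \(\Vert \exp _{x}^{-1}y\Vert =d(x,y)\), and that the parallel translation is a linear isometry into \(T_{y}S^{2}\):

```python
import numpy as np

# Sanity checks for the sphere formulas (8)-(10); the random points are an
# illustrative choice.
def exp_map(x, v):                       # formula (8)
    n = np.linalg.norm(v)
    return x if n < 1e-15 else np.cos(n) * x + np.sin(n) * v / n

def log_map(x, y):                       # formula (9): exp_x^{-1}(y)
    theta = np.arccos(np.clip(x @ y, -1.0, 1.0))
    return theta / np.sin(theta) * (y - np.cos(theta) * x)

def transport(x, y, v):                  # formula (10): parallel translation
    theta = np.arccos(np.clip(x @ y, -1.0, 1.0))
    u = log_map(x, y) / theta            # unit initial velocity of gamma
    a = v @ u
    return (v - a * u) + a * (np.cos(theta) * u - np.sin(theta) * x)

rng = np.random.default_rng(4)
x = rng.normal(size=3); x /= np.linalg.norm(x)
y = rng.normal(size=3); y /= np.linalg.norm(y)
v = rng.normal(size=3); v -= (v @ x) * x           # v in T_x S^2

assert np.allclose(exp_map(x, log_map(x, y)), y)            # exp inverts log
d = np.arccos(np.clip(x @ y, -1.0, 1.0))
assert abs(np.linalg.norm(log_map(x, y)) - d) < 1e-12       # ||log|| = d(x,y)
w = transport(x, y, v)
assert abs(np.linalg.norm(w) - np.linalg.norm(v)) < 1e-10   # isometry
assert abs(w @ y) < 1e-10                                   # w in T_y S^2
```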
Example 3.3
Let S be a convex subset of \(S^{2}\) and f be the quadratic convex function on an open convex subset of \(S^{2}\) containing S defined by
where \(A\in {{\mathbb {R}}}^{3\times 3}\). Suppose that \(\bar{x}\in \bar{S}\). By using formulas (8)–(10) and [1, p. 74], for every \(x\in \bar{S}\), we get
and
where \(\theta =\arccos \langle {\bar{x}},x\rangle .\) Therefore, by Theorem 3.2, we obtain
where \(c=A\bar{x}-(\bar{x}{\bar{x}}^{T})A\bar{x}\) and \(d=P^{x}_{\bar{x}}(c).\)
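As a numeric illustration of Example 3.3 (the matrix A, the geodesic ball S, the starting point, and the solver parameters are our illustrative choices, not data from the paper), one can minimize the quadratic \(f(x)=\langle x,Ax\rangle \) over a convex geodesic ball in \(S^{2}\) by projected Riemannian gradient descent and then verify the variational inequality of Theorem 3.1 at the computed minimizer:

```python
import numpy as np

# Projected Riemannian gradient descent for f(x) = <x, Ax> on a convex
# geodesic ball S of the sphere S^2, followed by a check of the variational
# inequality of Theorem 3.1 at the computed minimizer.  All data here is an
# illustrative choice.
def exp_map(x, v):
    n = np.linalg.norm(v)
    return x if n < 1e-15 else np.cos(n) * x + np.sin(n) * v / n

def log_map(x, y):
    theta = np.arccos(np.clip(x @ y, -1.0, 1.0))
    return y - x if theta < 1e-15 else theta / np.sin(theta) * (y - np.cos(theta) * x)

A = np.diag([1.0, 2.0, 3.0])
grad_f = lambda x: 2.0 * (A @ x - (x @ A @ x) * x)   # Riemannian gradient

p, r = np.array([0.0, 0.0, 1.0]), 0.6    # S: geodesic ball around p (convex)

def project(x):                           # pull x back into S if it escapes
    v = log_map(p, x)
    d = np.linalg.norm(v)
    return x if d <= r else exp_map(p, r * v / d)

x = np.array([1.0, 0.5, 1.0]); x /= np.linalg.norm(x); x = project(x)
for _ in range(5000):
    x = project(exp_map(x, -0.01 * grad_f(x)))
xbar = x                                  # minimizer sits on the boundary of S

rng = np.random.default_rng(5)
for _ in range(1000):                     # Theorem 3.1 at the computed xbar
    v = rng.normal(size=3); v -= (v @ p) * p
    y = exp_map(p, r * rng.uniform() * v / np.linalg.norm(v))
    assert grad_f(xbar) @ log_map(xbar, y) >= -1e-8
```

For this data the computed minimizer is \(\bar{x}\approx (\sin 0.6, 0, \cos 0.6)\), a boundary point of S with nonzero gradient, so the check is nontrivial.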
References
Absil, P.A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2009)
Barani, A.: Subdifferentials of perturbed distance function in Riemannian manifolds. Optimization 67(11), 1849–1868 (2018)
Burke, J.V., Ferris, M.C.: Characterization of solution sets of convex programs. Oper. Res. Lett. 10, 57–60 (1991)
Da Cruz Neto, J.X., Oliveira, O.P., Lucambio Pérez, L.R., Nemeth, S.Z.: Convex- and monotone-transformable mathematical programming problems and a proximal-like point method. J. Glob. Optim. 35(1), 53–69 (2006)
Fazel, M., Hindi, H., Boyd, S.P.: Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. In: Proceedings of the 2003 American Control Conference, vol. 3, pp. 2156–2162 (2003)
Ferreira, O.P., Nemeth, S.Z.: On the spherical convexity of quadratic functions. J. Glob. Optim. 73(3), 537–545 (2019)
Ferreira, O.P., Iusem, A.N., Nemeth, S.Z.: Projections onto convex sets on the sphere. J. Glob. Optim. 57, 663–676 (2013)
Grohs, P., Hosseini, S.: \(\varepsilon \)-Subgradient algorithms for locally Lipschitz functions on Riemannian manifolds. Adv. Comput. Math. 42(2), 333–360 (2016)
Groisser, D.: Newton's method, zeroes of vector fields, and the Riemannian center of mass. Adv. Appl. Math. 33, 95–135 (2004)
Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (2013)
Hosseini, S., Pouryayevali, M.R.: On the metric projection onto prox-regular subsets of Riemannian manifolds. Proc. Am. Math. Soc. 141, 233–244 (2013)
Jeyakumar, V., Yang, X.Q.: Convex composite multiobjective nonsmooth programming. Math. Program. 59, 325–343 (1993)
Jeyakumar, V., Lee, G.M., Dinh, N.: Lagrange multiplier conditions characterizing the optimal solution set of cone-constrained convex programs. J. Optim. Theory Appl. 123, 83–103 (2004)
Lim, Y.: Factorizations and geometric means of positive definite matrices. Linear Algebra Appl. 437(9), 2159–2172 (2012)
Li, C., Yao, J.C.: Variational inequalities for set-valued vector fields: convexity of the solution set and proximal point algorithm. SIAM J. Control Optim. 50(4), 2486–2514 (2012)
Mangasarian, O.L.: A simple characterization of solution sets of convex programs. Oper. Res. Lett. 7, 21–26 (1988)
Rapcsák, T.: Smooth Nonlinear Optimization in \({\mathbb{R}}^{n}\). Kluwer Academic, Dordrecht (1997)
Rendl, F., Wolkowicz, H.: A semidefinite framework for trust region subproblems with applications to large scale minimization. Math. Program. 77(2 Ser. B), 237–299 (1997)
Sakai, T.: Riemannian Geometry. American Mathematical Society, Providence (1992)
Schneider, R., Uschmajew, A.: Convergence results for projected line-search methods on varieties of low-rank matrices via Łojasiewicz inequality. SIAM J. Optim. 25(1), 622–646 (2015)
Sra, S., Hosseini, R.: Conic geometric optimization on the manifold of positive definite matrices. SIAM J. Optim. 25(1), 713–739 (2015)
Udriste, C.: Convex Functions and Optimization Methods on Riemannian Manifolds. Mathematics and its Applications, vol. 297. Kluwer Academic, Dordrecht (1994)
Wu, Z.L., Wu, S.Y.: Characterizations of the solution sets of convex programs and variational inequality problems. J. Optim. Theory Appl. 130, 339–358 (2006)
Barani, A., Hosseini, S. Characterization of solution sets of convex optimization problems in Riemannian manifolds. Arch. Math. 114, 215–225 (2020). https://doi.org/10.1007/s00013-019-01382-x