1 Introduction

This paper is concerned with a characterization of the solution sets of convex optimization problems on Riemannian manifolds. Characterizations of the solution sets of nonlinear programming problems with multiple solutions, provided that one minimizer is known, play an important role in many fields, such as optimization problems, variational inequalities, and equilibrium problems. Mangasarian [16] presented several characterizations of the solution sets of differentiable convex programs on linear spaces and applied them to the study of monotone linear complementarity problems. Further investigation was carried out by Burke and Ferris [3]. In the last decade, Mangasarian-type characterizations have been derived for several smooth and nonsmooth convex or generalized convex problems on linear spaces; see [12, 13, 23] and the references therein.

Extensions of concepts and techniques from Euclidean spaces to Riemannian manifolds are natural and lead to successful tools in optimization; such topics, with both practical and theoretical purposes, have therefore been the subject of several research papers. Udriste [22] and Rapcsák [17] introduced the theory of convex functions on Riemannian manifolds, motivated by the fact that some constrained optimization problems can be seen as unconstrained ones from the Riemannian geometry point of view. Another advantage is that optimization problems with nonconvex objective functions can be rewritten as convex optimization problems by endowing the space with an appropriate Riemannian metric (see Example 2.1). Recently, a number of important results have been obtained on various aspects of optimization theory and applications on Riemannian manifolds, introducing several important techniques and methods for the existence of solutions of optimization problems on Riemannian manifolds; see [1, 2, 8, 20].

The purpose of this paper is to present a simple characterization of the solution sets of convex optimization problems on Riemannian manifolds in terms of the Riemannian gradient of the cost function. To the best of our knowledge, such a characterization has not been given before, and formulating and proving it on Riemannian manifolds requires several tools and techniques from Riemannian geometry. One application of this characterization is to the problem of minimizing convex quadratic functions defined on a convex subset of a sphere, which arises in proving fixed point theorems, surjectivity theorems, and existence theorems for complementarity problems and variational inequalities by calculating scalar derivatives; for more details about these theorems and their applications see [6] and the references therein. In particular, some existence theorems can be reduced to optimizing a quadratic function on a convex subset of the sphere. Moreover, minimization problems for quadratic functions defined on the sphere occur as subproblems in methods of nonlinear programming; see [18].

2 Preliminaries

In this section, we introduce some fundamental properties and notations of Riemannian manifolds. These basic facts can be found in any introductory book on Riemannian geometry; see for example [19]. Throughout this paper, M is an n-dimensional Riemannian manifold with a Riemannian metric \(\langle \cdot , \cdot \rangle _{x}\) on the tangent space \(T_{x}M\cong {{\mathbb {R}}}^{n}\) for every \(x\in M\). The corresponding norm is denoted by \(\Vert \cdot \Vert \). Let us recall that the length of a piecewise \(C^{1}\) curve \(\gamma :[a,b]\rightarrow M\) is defined by

$$\begin{aligned} L(\gamma )=\int \limits ^{b}_{a}\Vert \gamma ^{\prime }(t)\Vert \,dt. \end{aligned}$$

By minimizing the length functional over the set of all piecewise \(C^{1}\) curves with \(\gamma (a)=x\) and \(\gamma (b)=y\) for \(x, y \in M\), we obtain a Riemannian distance on M denoted by d(x, y). The space of vector fields on M is denoted by \({\mathcal X}(M)\) and \(\nabla \) is the Levi-Civita connection associated to M. A geodesic is a smooth curve \(\gamma \) satisfying the equation \(\nabla _{\gamma ^{\prime }(t)} \gamma ^{\prime }(t)=0\). The exponential mapping \( \exp : {\tilde{T}} M \rightarrow M\) is defined as \(\exp (v) = \gamma (1)\), where \(\gamma \) is the geodesic determined by its starting point x and the velocity \(\gamma ^{\prime }(0)=v\) at x, and \({\tilde{T}} M\) is an open neighborhood in TM. The restriction of \(\exp \) to \(T_{x}M\) in \({\tilde{T}} M\) is denoted by \(\exp _{x}\) for every \(x\in M\). For a minimizing geodesic \(\gamma :[0,1]\rightarrow M\) connecting x to y in M, and for a vector \(v \in T_xM\), there is a unique parallel vector field P along \(\gamma \) such that \(P(0)=v\); it is called the parallel translation of v along \(\gamma \). The mapping \( T_xM\ni v\mapsto P(1)\in T_yM\) is a linear isometry from \(T_xM\) onto \(T_yM\), denoted by \(P_{x}^{y}\).

The Riemannian metric induces a map \(f \mapsto \mathrm{grad}\, f \in {\mathcal X}(M)\) which assigns to each differentiable function f its gradient at \(x\in M\) via the rule

$$\begin{aligned} \langle \mathrm{grad}\, f( x),v \rangle _{x}= df(v)=\frac{d}{dt}f(\exp _{x}(tv))|_{t=0}, \quad v\in T_{x}M. \end{aligned}$$

The Riemannian Hessian of f at a point \(x\in M\) is the linear mapping

$$\begin{aligned} \mathrm{Hess}\, f(x): T_{x}M\rightarrow T_{x}M \end{aligned}$$

defined by

$$\begin{aligned} \mathrm{Hess}\, f(x)[v]=\nabla _{v}~\mathrm{grad}\, f \end{aligned}$$

for every \(v\in T_{x}M.\) Note that \(\mathrm{Hess}\,f(x)\) satisfies

$$\begin{aligned} \langle \mathrm{Hess}\, f(x)[v],v\rangle _{x}=\frac{d^{2}}{dt^{2}}f(\exp _{x}(tv))|_{t=0}, \quad v\in T_{x}M, \end{aligned}$$

and, since \(\mathrm{Hess}\, f(x)\) is self-adjoint, this formula fully determines \(\mathrm{Hess}\, f(x)\) by polarization.

A subset S of a Riemannian manifold is called convex if any two points x, \(y\in S\) can be joined by a unique minimizing geodesic (denoted by \(\gamma _{xy}\)) which lies entirely in S. Note that there is little consistency in the meanings attached to the terms “convex set” and “strongly convex set” (see page 105 in [9] and page 2488 in [15] and references therein).

It is known that \(\exp _{x}^{-1}\) is well-defined on a convex set S for every \(x\in S\), that \(d(x, y)=\Vert \exp _{x}^{-1}(y)\Vert \) for every \(x, y\in S\), and that

$$\begin{aligned} \gamma _{xy}(t)=\exp _{x}(t\exp _{x}^{-1}y)\quad \text {for all} ~t\in [0,1]; \end{aligned}$$

see [11]. Let S be a nonempty convex subset of M. A function \(f: S\rightarrow {\mathbb {R}}\) is said to be convex if for every x, \(y\in S\) and every \(t\in [0,1]\),

$$\begin{aligned} f(\gamma _{xy}(t))\le (1-t)f(x)+tf(y). \end{aligned}$$

The following example illustrates a nonconvex function that becomes convex when \({{\mathbb {R}}}^{2}\) is endowed with an appropriate Riemannian metric (see [4]).

Example 2.1

The function \(f: {{\mathbb {R}}}^{2}\rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} f(x)= e^{x_{1}}(\cosh (x_{2})-1),\quad x=(x_{1},x_{2}), \end{aligned}$$

is not convex. Endowing \({{\mathbb {R}}}^{2}\) with the metric \(g(x):=\mathrm{diag}(1,e^{2x_{1}})\), we obtain a Riemannian manifold \(M_{g}\). The Riemannian Hessian matrix of f on \(M_{g}\),

$$\begin{aligned} \mathrm{Hess}\, f(x)=\mathrm{diag}(e^{x_{1}}(\cosh (x_{2})-1),e^{x_{1}}\cosh (x_{2})+e^{3x_{1}}(\cosh (x_{2})-1)), \end{aligned}$$

is positive semidefinite; therefore f is convex on \(M_{g}\).
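These Hessian computations are easy to check numerically. The following is a minimal sketch (ours, not from [4]; the function names are ad hoc) that evaluates both Hessians, confirming that the Riemannian Hessian is positive semidefinite everywhere while the Euclidean Hessian has a negative eigenvalue off the axis \(x_{2}=0\):

```python
import numpy as np

def euclidean_hessian(x1, x2):
    """Euclidean Hessian of f(x) = e^{x1} (cosh(x2) - 1)."""
    return np.array([
        [np.exp(x1) * (np.cosh(x2) - 1), np.exp(x1) * np.sinh(x2)],
        [np.exp(x1) * np.sinh(x2),       np.exp(x1) * np.cosh(x2)],
    ])

def riemannian_hessian(x1, x2):
    """Riemannian Hessian of f on M_g with g = diag(1, e^{2 x1}); diagonal, as above."""
    return np.diag([
        np.exp(x1) * (np.cosh(x2) - 1),
        np.exp(x1) * np.cosh(x2) + np.exp(3 * x1) * (np.cosh(x2) - 1),
    ])

rng = np.random.default_rng(0)
for _ in range(1000):
    x1, x2 = rng.uniform(-2, 2, size=2)
    # f is convex on M_g: the Riemannian Hessian is positive semidefinite ...
    assert np.linalg.eigvalsh(riemannian_hessian(x1, x2)).min() >= -1e-12
# ... but f is not convex on R^2: the Euclidean Hessian has a negative eigenvalue.
print(np.linalg.eigvalsh(euclidean_hessian(0.0, 1.0)))  # approx [-0.23, 2.32]
```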

3 Characterization of the solution sets

Our aim is to characterize the solution set of the following optimization problem

$$\begin{aligned} \min _{x\in S} f(x), \end{aligned}$$
(1)

where \(S\subseteq M\) is a convex subset of M and f is a twice continuously differentiable convex function on some open convex set containing S. We denote the solution set of the optimization problem (1) by

$$\begin{aligned} \bar{S}= \mathop {\hbox {argmin}}\limits _{x\in S}f(x), \end{aligned}$$

and assume that \(\bar{S}\ne \emptyset \). If \(\bar{x}\in \bar{S}\), then

$$\begin{aligned} \bar{S}=\{x\in S : f(x)=f(\bar{x})\}, \end{aligned}$$

and \(\bar{S}\) is a convex subset of S. The following theorem, which describes the relation between the solution set of (1) and a variational inequality, is a generalization of [7, Proposition 15] from the sphere \(S^{n}\) to the general setting.

Theorem 3.1

Let \(S\subseteq M\) be a convex subset of M and f be a twice continuously differentiable convex function on some open convex set containing S. Then \(\bar{x}\in \bar{S}\) if and only if

$$\begin{aligned} \langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}y\rangle _{\bar{x}}\ge 0\quad \text {for all }y\in S. \end{aligned}$$
(2)

Proof

Let \(\bar{x}\in \bar{S}\), \(y\in S\), and let \(\gamma _{\bar{x} y}\) be the minimal geodesic connecting \(\bar{x}\) and y. By convexity of S, the geodesic \(\gamma _{\bar{x} y}\) lies in S, and hence \(f(\gamma _{\bar{x} y}(t))\ge f(\bar{x})\) for all \(t\in [0,1]\). Therefore

$$\begin{aligned} \langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}y\rangle _{\bar{x}}=\lim _{t\rightarrow 0} \frac{f(\gamma _{\bar{x}y}(t))-f(\bar{x})}{t}\ge 0. \end{aligned}$$

Now suppose that (2) holds. By convexity of f, we have

$$\begin{aligned} f(y)-f(\bar{x})\ge \langle \mathrm{grad}\, f(\bar{x}),\gamma ^{\prime }_{\bar{x} y}(0)\rangle _{\bar{x}} =\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}y\rangle _{\bar{x}}\ge 0\quad \text {for all }y\in S, \end{aligned}$$

that is, \(\bar{x}\in \bar{S}\).

\(\square \)

Now we present a characterization for the solution set of a convex optimization problem on a convex subset of a Riemannian manifold which is our main result.

Theorem 3.2

Let \(S\subseteq M\) be a convex subset of M, f be a twice continuously differentiable convex function on some open convex set containing S, and \(\bar{x}\in \bar{S}\). Then

$$\begin{aligned} \bar{S}=\{x\in S:\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=0,\ P^{\bar{x}}_{x}[\mathrm{grad} f( x)]=\mathrm{grad}\, f(\bar{x})\}. \end{aligned}$$
(3)

Proof

We denote the right-hand side of (3) by \(S^{*}\). To show that \(S^{*}\subseteq \bar{S}\), suppose on the contrary that there exists \(x\in S^{*}\setminus \bar{S}\); then \(f(\bar{x})<f(x)\), and by convexity of f,

$$\begin{aligned} \langle \mathrm{grad}\, f(x),\exp _x^{-1}(\bar{x})\rangle _{x}\le f(\bar{x})-f(x)<0. \end{aligned}$$

Now, using the isometry property of the parallel translation together with the identity \(P_{x}^{\bar{x}}[\exp _{x}^{-1}\bar{x}]=-\exp _{\bar{x}}^{-1}x\) and the fact that \(x\in S^{*}\), we get

$$\begin{aligned} \begin{aligned} \langle \mathrm{grad}\, f(x),\exp _x^{-1}(\bar{x})\rangle _{x}&=\langle P_{x}^{\bar{x}}[\mathrm{grad}\, f( x)], P_{x}^{\bar{x}}[\exp _{x}^{-1}\bar{x}]\rangle _{\bar{x}}\\&=\langle \mathrm{grad}\, f( \bar{x}), -\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=0, \end{aligned} \end{aligned}$$

which is a contradiction.

For the converse, let \(x\in \bar{S}\) and let \(\gamma _{\bar{x} x}(t)\) be the minimal geodesic connecting \(\bar{x}\) and x. Since \(\bar{S}\) is convex, this geodesic lies entirely in \(\bar{S}\), so \(f(\gamma _{\bar{x} x }(t))=f(\bar{x})\) for all \(t\in [0,1]\). Therefore

$$\begin{aligned} \langle \mathrm{grad}\, f( \bar{x}),\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=\lim _{t\rightarrow 0}\frac{f(\gamma _{\bar{x}x}(t))-f(\bar{x})}{t}=0. \end{aligned}$$

Similarly, we have \(\langle \mathrm{grad}\, f(x),\exp _{x}^{-1}\bar{x}\rangle _{x}=0.\) These two equations and properties of the parallel transport imply

$$\begin{aligned} \langle \mathrm{grad}\, f( \bar{x})-P_{x}^{\bar{x}}[\mathrm{grad}\, f(x)],\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=0. \end{aligned}$$
(4)

Now, we define \(F: [0,1]\rightarrow T_{\bar{x}}M\) as follows

$$\begin{aligned} F(t)= P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}[\mathrm{grad}\, f( \gamma _{x \bar{x}}(t))]. \end{aligned}$$

Note that \(\gamma _{x \bar{x}}^{\prime }(t)=P_{x}^{\gamma _{x \bar{x}}(t)}[\exp _{x}^{-1}\bar{x}]=-P_{\bar{x}}^{\gamma _{x \bar{x}}(t)}[\exp _{\bar{x}}^{-1}x]\) and

$$\begin{aligned} F(t-s)=P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}P_{\gamma _{x \bar{x}}(t-s)}^{\gamma _{x \bar{x}}(t)}[\mathrm{grad}\, f( \gamma _{x \bar{x}}(t-s))]\quad \text {for all }t\in [0,1], \end{aligned}$$

hence

$$\begin{aligned} \begin{aligned} F^{\prime }(t)&=-\frac{d}{ds}F(t-s)|_{s=0}\\&=-P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}\frac{d}{ds}P_{\gamma _{x \bar{x}}(t-s)}^{\gamma _{x \bar{x}}(t)}[\mathrm{grad}\, f( \gamma _{x \bar{x}}(t-s))]|_{s=0}\\&=P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}[\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t))(\gamma ^{\prime }_{x \bar{x}}(t))]. \end{aligned} \end{aligned}$$

Since F is \(C^{1}\), the previous equality gives

$$\begin{aligned} \mathrm{grad}\, f( \bar{x})- P^{\bar{x}}_{x}[\mathrm{grad}\, f( x)]&= F(1)-F(0)=\int \limits _{0}^{1}F^{\prime }(t)\,dt\nonumber \\&= \int \limits _{0}^{1}P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t))(\gamma ^{\prime }_{x \bar{x}}(t))\,dt\nonumber \\&= -\int \limits _{0}^{1}P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t))\big (P_{\bar{x}}^{\gamma _{x \bar{x}}(t)}(\exp _{\bar{x}}^{-1}x)\big )\,dt\nonumber \\&= -\left( \int \limits _{0}^{1}P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}\,\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t))\,P_{\bar{x}}^{\gamma _{x \bar{x}}(t)}( \cdot )\, dt\right) \exp _{\bar{x}}^{-1}x\nonumber \\&= -A\exp _{\bar{x}}^{-1}x, \end{aligned}$$
(5)

where

$$\begin{aligned} \int \limits _{0}^{1}P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t))P_{\bar{x}}^{\gamma _{x \bar{x}}(t)}(\cdot )dt: T_{\bar{x}}M\rightarrow T_{\bar{x}}M \end{aligned}$$

is a linear map, which we denote by A. Since \(\mathrm{Hess}\, f\) is self-adjoint and the parallel translation is an isometry, A is symmetric; we claim that A is also positive semidefinite. To prove the claim, note that \(\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t))\) is a positive semidefinite linear map, hence

$$\begin{aligned} \begin{aligned}&\langle w,P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}[\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t)) (P_{\bar{x}}^{\gamma _{x \bar{x}}(t)}(w))]\rangle _{\bar{x}}\\&\quad =\langle P_{\bar{x}}^{\gamma _{x \bar{x}}(t)}(w),\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t)) [P_{\bar{x}}^{\gamma _{x \bar{x}}(t)}(w)]\rangle _{\gamma _{x \bar{x}}(t)} \ge 0 \end{aligned} \end{aligned}$$

for every \(w\in T_{\bar{x}}M\); integrating with respect to t therefore yields

$$\begin{aligned} \langle w,Aw\rangle _{\bar{x}}=\left\langle w,\int \limits _{0}^{1}P_{\gamma _{x \bar{x}}(t)}^{\bar{x}}[\mathrm{Hess}\, f(\gamma _{x \bar{x}}(t)) (P_{\bar{x}}^{\gamma _{x \bar{x}}(t)}(w))]\,dt\right\rangle _{\bar{x}}\ge 0, \end{aligned}$$

which proves our claim. Combining (4) and (5), we deduce that

$$\begin{aligned} \langle \mathrm{grad}\, f( \bar{x})- P^{\bar{x}}_{x}[\mathrm{grad}\, f( x)],\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}} =-\langle A\exp _{\bar{x}}^{-1}x,\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}=0. \end{aligned}$$
(6)

Since A is symmetric positive semidefinite, it follows from (6) and [10, p. 431] that

$$\begin{aligned} \mathrm{grad}\, f( \bar{x})- P^{\bar{x}}_{x}[\mathrm{grad}\, f( x)]=-A\exp _{\bar{x}}^{-1}x=0, \end{aligned}$$

which completes the proof. \(\square \)
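To relate Theorem 3.2 to the linear-space setting of [16], note that for \(M={{\mathbb {R}}}^{n}\) we have \(\exp _{\bar{x}}^{-1}x=x-\bar{x}\) and the parallel translation is the identity, so (3) reduces to a Mangasarian-type characterization \(\bar{S}=\{x\in S: \langle \nabla f(\bar{x}), x-\bar{x}\rangle =0,\ \nabla f(x)=\nabla f(\bar{x})\}\). The following minimal sketch (ours; the function and the set are ad hoc illustrations) checks this reduction numerically:

```python
import numpy as np

# Convex f(x) = x1^2 on S = [-1, 1]^2; the solution set is the segment {0} x [-1, 1].
grad_f = lambda x: np.array([2.0 * x[0], 0.0])
x_bar = np.array([0.0, 0.0])          # one known minimizer

def in_characterized_set(x):
    # Euclidean specialization of (3): exp_{x_bar}^{-1} x = x - x_bar and
    # the parallel translation is the identity map.
    return (abs(grad_f(x_bar) @ (x - x_bar)) < 1e-12
            and np.allclose(grad_f(x), grad_f(x_bar)))

solutions = [np.array([0.0, t]) for t in np.linspace(-1.0, 1.0, 5)]
non_solutions = [np.array([0.5, 0.0]), np.array([-1.0, 1.0])]

assert all(in_characterized_set(x) for x in solutions)
assert not any(in_characterized_set(x) for x in non_solutions)
```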

The following corollary is an immediate consequence of Theorem 3.2.

Corollary 3.1

Let \(S\subseteq M\) be a convex subset of M, f be a twice continuously differentiable convex function on some open convex set containing S, and \(\bar{x}\in \bar{S}\). Then

(i) the function \(x\mapsto \Vert \mathrm{grad}\, f(x)\Vert \) is constant on \(\bar{S}\);

(ii) \(\bar{S}=\tilde{S}=\{x\in S:\langle \mathrm{grad}\, f(\bar{x}),\exp _{\bar{x}}^{-1}x\rangle _{\bar{x}}\le 0,\ P^{\bar{x}}_{x}[\mathrm{grad}\, f( x)]=\mathrm{grad}\, f(\bar{x})\}.\)

Proof

The first part follows from Theorem 3.2 and the isometry property of the parallel translation. For the second part, the inclusion \(\bar{S}\subseteq \tilde{S}\) holds by Theorem 3.2. For the converse, assume that \(x\in \tilde{S}\). Since the parallel translation is an isometry and f is convex, we deduce that

$$\begin{aligned} \begin{aligned} f(\bar{x})- f(x)\ge \langle \mathrm{grad}\, f( x),\exp _{ x}^{-1}\bar{x}\rangle _{{x}}&=\langle P^{\bar{x}}_{x}[\mathrm{grad}\, f( x)],P^{\bar{x}}_{x}[\exp _{ x}^{-1}\bar{x}]\rangle _{\bar{x}}\\&=-\langle \mathrm{grad}\, f( \bar{x}),\exp _{\bar{x}}^{-1} x\rangle _{{\bar{x}}} \ge 0. \end{aligned} \end{aligned}$$

This shows that \(f(\bar{x})\ge f( x)\). Since \(\bar{x}\) is a minimizer of f on S, we have \(f(\bar{x})= f( x)\), which proves that \( \tilde{S}\subseteq \bar{S}\). \(\square \)

Now, we present some examples to illustrate how our characterization of the solution sets of optimization problems works in particular nontrivial Riemannian settings.

Example 3.1

Let M be a Hadamard manifold and U be a nonempty open convex subset of M. Pick \(x_{0},y_{0} \in U\) with \(x_{0}\ne y_{0}.\) Choose \(\varepsilon >0\) such that \( S:= {\bar{B}}(x_{0},\varepsilon )\subset U\), \(A:={\bar{B}}({y_{0}},\varepsilon )\subset U\) are convex and \(S\cap A=\emptyset .\) Define the function \(f: S\rightarrow {\mathbb {R}}\) by

$$\begin{aligned} f(x):=\frac{1}{2}d_{A}^{2}(x). \end{aligned}$$

Note that the function \(d_{A}^{2}(\cdot )\) is convex and twice continuously differentiable on U. Let \(\bar{x}\in \bar{S}\) and \(p:=\pi _{A}(\bar{x}),\) where \(\pi _{A}(\bar{x})\) is the metric projection of \(\bar{x}\) on A. We have that

$$\begin{aligned} \mathrm{grad}\, f(\bar{x})=-\exp ^{-1}_{\bar{x}}p,\quad \Vert \mathrm{grad}\, f(\bar{x})\Vert =d(\bar{x}, p)=d(\bar{x}, A). \end{aligned}$$

Thus, by using Theorem 3.2, we deduce

$$\begin{aligned} \bar{S}=\big \{x\in S:\langle \exp ^{-1}_{\bar{x}}p, \exp ^{-1}_{\bar{x}}x\rangle _{\bar{x}}=0,\ \mathrm{grad}\, f( x)=-P_{\bar{x}}^{x}[\exp ^{-1}_{\bar{x}}p]\big \}. \end{aligned}$$
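In the flat special case \(M={{\mathbb {R}}}^{n}\), which is a Hadamard manifold, \(\exp _{x}^{-1}y=y-x\) and the parallel translation is the identity, so the description of \(\bar{S}\) above can be tested directly. A minimal sketch (ours; the centers and radius are ad hoc):

```python
import numpy as np

# R^2 as a flat Hadamard manifold: exp_x^{-1} y = y - x, parallel translation = identity.
x0, y0, eps = np.array([0.0, 0.0]), np.array([4.0, 0.0]), 1.0
# S = closed ball around x0, A = closed ball around y0; the two balls are disjoint.

proj_A = lambda x: y0 + eps * (x - y0) / np.linalg.norm(x - y0)  # metric projection onto A
grad_f = lambda x: x - proj_A(x)                                 # gradient of 0.5 * d_A^2

x_bar = x0 + eps * (y0 - x0) / np.linalg.norm(y0 - x0)           # the minimizer over S
p = proj_A(x_bar)

def in_characterized_set(x):
    # <exp^{-1}_{x_bar} p, exp^{-1}_{x_bar} x> = 0 and grad f(x) = -(p - x_bar).
    return (abs((p - x_bar) @ (x - x_bar)) < 1e-10
            and np.allclose(grad_f(x), x_bar - p))

assert in_characterized_set(x_bar)
# A point of S satisfying the orthogonality condition but not the gradient condition:
assert not in_characterized_set(x_bar + np.array([0.0, 0.5]))
```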

Example 3.2

The general matrix rank minimization problem (RMP), expressed as

$$\begin{aligned} \min \mathrm{rank}\, X,\quad \quad X\in S\subset P_n, \end{aligned}$$

where \(X\in {{\mathbb {R}}}^{n\times n}\), \(P_n\) is the set of positive semidefinite \(n\times n\) matrices, and S is a convex set, is computationally hard to solve. This problem arises in many areas such as control, system identification, statistics, signal processing, and computational geometry; see [5] and references therein. Rather than solving the RMP, one can use the function

$$\begin{aligned} \log \det (X +\delta I), \end{aligned}$$

as a smooth surrogate for \(\mathrm{rank}\, X\) and instead solve the following problem

$$\begin{aligned} \min \log \det (X +\delta I),\quad X\in S, \end{aligned}$$

where \(\delta >0\) can be interpreted as a small regularization constant; see [5]. Note that this surrogate is not convex on the linear space \({{\mathbb {R}}}^{n\times n}\). This application motivated us to consider the problem

$$\begin{aligned} \min \log \det (X),\qquad X\in S=\{X\in P_n^+: 0<A\le X\}, \end{aligned}$$
(7)

which is not convex with respect to the Euclidean metric on \({{\mathbb {R}}}^{n\times n}\). Here \(P_n^+\) is the set of positive definite \(n\times n\) matrices and \(A\in P_n^+\).

The set of symmetric positive definite matrices, as a Riemannian manifold, is the most studied example of manifolds of nonpositive curvature. The tangent space to \(P_n^+\) at any of its points P is the space \(T_PP_n^+=\{P\}\times S_n\), where \(S_n\) is the space of symmetric \(n\times n\) matrices. On each tangent space \(T_PP_n^+\), the inner product is defined by

$$\begin{aligned} \langle A,B\rangle _{P}=\mathrm{tr} (P^{-1}AP^{-1}B). \end{aligned}$$

The Riemannian distance between \(P,Q\in P_n^+\) is given by

$$\begin{aligned} d(P,Q)=\left( \mathop \sum \limits _{i=1}^n \ln ^2(\lambda _i)\right) ^{1/2}, \end{aligned}$$

where \(\lambda _i\), \(i=1,\ldots ,n\), are the eigenvalues of \(P^{-1}Q\). The exponential map

$$\begin{aligned} \exp _P:S_n\rightarrow P_n^+ \end{aligned}$$

is defined by

$$\begin{aligned} \exp _P(v)=P^{1/2}\exp (P^{-1/2}vP^{-1/2})P^{1/2}. \end{aligned}$$

Moreover, if \(P\in P_n^+\), then

$$\begin{aligned} \exp _P^{-1}:P_n^+\rightarrow S_n \end{aligned}$$

is defined by

$$\begin{aligned} \exp _P^{-1}(Q)=P\log (P^{-1}Q), \end{aligned}$$

where \(\log \) and \(\exp \) denote the matrix logarithm and the matrix exponential; for more details see [21]. The parallel translation along the unique geodesic connecting X and Y is given by

$$\begin{aligned} P_X^Y(Z)=(YX^{-1})^{1/2}Z(X^{-1}Y)^{1/2}. \end{aligned}$$

Moreover, the Riemannian gradient of a function f defined on \(P_n^+\) is obtained from the Euclidean gradient, denoted by \(\nabla f\), via the formula

$$\begin{aligned} \mathrm{grad}\, f (X) = X\mathrm{symm}(\nabla f(X))X, \end{aligned}$$

where \(\mathrm{symm}(\nabla f(X))= \frac{1}{2}(\nabla f(X)+\nabla f(X)^T)\). For \(f(X)=\log \det (X)\) we have \(\nabla f(X)=X^{-1}\), and therefore \(\mathrm{grad}\, f(X)=X\). First we claim that S is a convex subset of \(P_n^+\). Assume that \(X, Y\in S\); then \(X\ge A\) and \(Y\ge A\). The unique geodesic connecting these two points is defined by

$$\begin{aligned} \gamma (t):= X^{1/2}(X^{-1/2}YX^{-1/2})^tX^{1/2}, \end{aligned}$$

by the Löwner–Heinz inequality (see [14, Lemma 2.1]), we have \(\gamma (t)\ge A\), and therefore S is convex in \(P_n^+\). We claim that A is a solution of problem (7). Since \(\mathrm{grad}\, f(A)=A\), by Theorem 3.1 it suffices to prove that \(\langle A,\exp _A^{-1}(Y)\rangle _A\ge 0\) for all \(Y\in S\). Note that

$$\begin{aligned} \begin{aligned} \langle A,\exp _A^{-1}(Y)\rangle _A&=\mathrm{tr} (A^{-1}AA^{-1}A\log (A^{-1}Y))\\&=\mathrm{tr} \log (A^{-1}Y)=\log \det (A^{-1}Y)\ge 0, \end{aligned} \end{aligned}$$

where the last inequality holds because \(Y\ge A\) implies that all eigenvalues of \(A^{-1}Y\) are at least 1. Therefore,

$$\begin{aligned} \bar{S}= \{X\in S: \log \det A=\log \det X\}. \end{aligned}$$

To illustrate Theorem 3.2, we will see that

$$\begin{aligned} \bar{S} = \{X\in S:\langle \mathrm{grad}\, f(A),\exp _{A}^{-1}X\rangle _{A}=0,~ P^{A}_{X}[\mathrm{grad}\, f( X)]=\mathrm{grad} f(A)\}. \end{aligned}$$

Note that \(\mathrm{grad}\, f(X)=X\), \(\mathrm{grad}\, f(A)=A\), and

$$\begin{aligned} P_X^A(X)=(AX^{-1})^{1/2}X(X^{-1}A)^{1/2}=A \quad \text {for all } X\in S. \end{aligned}$$

Moreover,

$$\begin{aligned} \langle \mathrm{grad}\, f(A),\exp _{A}^{-1}X\rangle _{A}&=\langle A, A\log (A^{-1}X)\rangle _A\\&=\mathrm{tr} (A^{-1}AA^{-1}A\log (A^{-1}X))\\&=\mathrm{tr} \log (A^{-1}X)=\log \det (A^{-1}X), \end{aligned}$$

which vanishes exactly when \(\log \det X=\log \det A\); together with the identity \(P_X^A(X)=A\) above, this shows the required equation.
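The computations in this example can be verified numerically. The sketch below (ours; the data are random, and sqrtm and logm are SciPy's matrix square root and logarithm) checks the variational inequality (2) at A and the parallel translation identity \(P_X^A(X)=A\) for a random \(X\ge A\):

```python
import numpy as np
from scipy.linalg import sqrtm, logm

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n)); A = B @ B.T + n * np.eye(n)      # A > 0
C = rng.standard_normal((n, n)); X = A + C @ C.T + n * np.eye(n)  # X > A, so X in S

inner = lambda P, U, V: np.trace(np.linalg.solve(P, U) @ np.linalg.solve(P, V))
exp_inv = lambda P, Q: P @ logm(np.linalg.solve(P, Q))            # exp_P^{-1}(Q)
transport = lambda X, Y, Z: (sqrtm(Y @ np.linalg.inv(X))          # P_X^Y(Z)
                             @ Z @ sqrtm(np.linalg.inv(X) @ Y))

# Variational inequality (2) at A: <grad f(A), exp_A^{-1} X>_A = log det(A^{-1} X) >= 0.
lhs = inner(A, A, exp_inv(A, X))
assert np.isclose(lhs, np.log(np.linalg.det(np.linalg.solve(A, X)))) and lhs > 0

# Parallel translation identity: P_X^A(grad f(X)) = P_X^A(X) = A = grad f(A).
assert np.allclose(transport(X, A, X), A)
```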

Recall that the unit sphere \(S^{2}:=\{x\in {{\mathbb {R}}}^3:~||x||=1\}\) is a 2-dimensional manifold with the usual Riemannian distance function defined as

$$\begin{aligned} d(x,y)=\arccos \langle x,y\rangle \quad \text {for all } x,y\in S^{2}. \end{aligned}$$

For every \(\bar{x} \in S^{2},\) it follows from the definition of the Riemannian metric on \(S^{2}\) that

$$\begin{aligned} \langle u,v \rangle _{\bar{x}}=\langle u,v \rangle \quad \text {for all } u,v\in T_{\bar{x}}S^{2}, \end{aligned}$$

where \(\langle \cdot , \cdot \rangle \) denotes the standard inner product in \({{\mathbb {R}}}^{3}.\) For \(x\in S^{2}\), the exponential map \(\exp _{x}:T_{x} S^{2}\rightarrow S^{2}\) is defined by

$$\begin{aligned} \exp _{x}(v)=\cos (||v||)x+\sin (||v||)\frac{v}{||v||}, \quad v\in T_{x} S^{2}. \end{aligned}$$
(8)

Moreover, \(\exp ^{-1}_{x}:S^{2}\rightarrow T_{x} S^{2}\) is

$$\begin{aligned} \exp ^{-1}_{x}(y)=\frac{\theta }{\sin \theta }(y-x\cos \theta ), \quad y \in S^{2}, \end{aligned}$$
(9)

where \(\theta =\arccos \langle x,y\rangle .\) Let \(t\mapsto \gamma (t)\) be the unique minimal geodesic in \(S^{2}\) joining \(\gamma (0)=x\) to \(\gamma (1)=y\), and let \(u:=\frac{\gamma ^{\prime }(0)}{||\gamma ^{\prime }(0)||}.\) The parallel translation of a vector \(v\in T_{x} S^{2}\) along the geodesic \(\gamma \) is given by

$$\begin{aligned} \begin{aligned} P_{x}^{\gamma (t)}(v)&=-\big (\sin (||\gamma ^{\prime }(0)||t)u^{\prime }v\big )x\\&\quad +\big (\cos (||\gamma ^{\prime }(0)||t)u^{\prime }v\big )u+(I-uu^{\prime })v; \end{aligned} \end{aligned}$$
(10)

see [1]. In the following example, which is an improvement of [16, Corollary 1], we consider an optimization problem with a quadratic cost function on a convex subset of the unit sphere; the sketch below first gives a small numerical check of formulas (8)–(10).
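The following minimal sketch (ours; the helper names are ad hoc) implements (8)–(10) with NumPy and verifies that \(\exp _{x}\) inverts \(\exp _{x}^{-1}\), that \(d(x,y)=\Vert \exp _{x}^{-1}y\Vert \), and that the parallel translation is an isometry onto \(T_{y}S^{2}\):

```python
import numpy as np

def exp_map(x, v):
    """exp_x(v) on S^2, formula (8)."""
    nv = np.linalg.norm(v)
    return x if nv < 1e-15 else np.cos(nv) * x + np.sin(nv) * v / nv

def exp_inv(x, y):
    """exp_x^{-1}(y) on S^2, formula (9)."""
    theta = np.arccos(np.clip(x @ y, -1.0, 1.0))
    return np.zeros(3) if theta < 1e-15 else (theta / np.sin(theta)) * (y - np.cos(theta) * x)

def transport(x, y, v):
    """Parallel translation P_x^y(v) along the minimal geodesic, formula (10) at t = 1."""
    g0 = exp_inv(x, y)                       # gamma'(0), whose norm is theta
    theta = np.linalg.norm(g0)
    u = g0 / theta
    return (-np.sin(theta) * (u @ v) * x
            + np.cos(theta) * (u @ v) * u
            + (v - (u @ v) * u))

rng = np.random.default_rng(3)
x, y = (w / np.linalg.norm(w) for w in rng.standard_normal((2, 3)))
v = np.cross(x, rng.standard_normal(3))      # a tangent vector at x

assert np.allclose(exp_map(x, exp_inv(x, y)), y)
assert np.isclose(np.linalg.norm(exp_inv(x, y)), np.arccos(np.clip(x @ y, -1, 1)))
assert np.isclose(np.linalg.norm(transport(x, y, v)), np.linalg.norm(v))  # isometry
assert abs(transport(x, y, v) @ y) < 1e-10   # the result lies in T_y S^2
```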

Example 3.3

Let S be a convex subset of \(S^{2}\) and let f be the convex quadratic function, defined on an open convex subset of \(S^{2}\) containing S, given by

$$\begin{aligned} \begin{aligned} f(x):=\langle Ax,x\rangle =x^{T}Ax, \end{aligned} \end{aligned}$$
(11)

where \(A\in {{\mathbb {R}}}^{3\times 3}\) is a symmetric matrix. Suppose that \(\bar{x}\in \bar{S}\). By using formulas (8)–(10) and [1, p. 74], for every \(x\in S\), we get

$$\begin{aligned} \mathrm{grad}\, f( x)=2(Ax-(xx^{T})Ax), \end{aligned}$$

and

$$\begin{aligned} \exp ^{-1}_{\bar{x}}(x)=\frac{\theta }{\sin \theta }(x-{\bar{x}}\cos \theta ), \end{aligned}$$

where \(\theta =\arccos \langle {\bar{x}},x\rangle .\) Therefore, by Theorem 3.2, we obtain

$$\begin{aligned} \bar{S}=\{x\in S:\langle c, x- \bar{x}\cos \theta \rangle _{\bar{x}}=0, (I-xx^{T})Ax=d\}, \end{aligned}$$

where \(c=A\bar{x}-(\bar{x}{\bar{x}}^{T})A\bar{x}\) and \(d=P^{x}_{\bar{x}}(c).\)
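As a concrete instance (our choice of data, not from [16]), take \(A=\mathrm{diag}(1,1,3)\) and \(\bar{x}=e_{1}\), and let S be a convex geodesic ball of radius less than \(\pi /2\) around \(\bar{x}\); the minimum value \(\lambda _{\min }=1\) is then attained exactly on the arc of the equator \(\{x_{3}=0\}\) inside S. Since \(\mathrm{grad}\, f(\bar{x})=0\) and the parallel translation is a linear isometry, the transport condition in (3) reduces to \(\mathrm{grad}\, f(x)=0\). A minimal numerical check:

```python
import numpy as np

A = np.diag([1.0, 1.0, 3.0])
grad_f = lambda x: 2 * (A @ x - np.outer(x, x) @ A @ x)    # 2(Ax - (x x^T) A x)

def exp_inv(x, y):
    theta = np.arccos(np.clip(x @ y, -1.0, 1.0))
    return np.zeros(3) if theta < 1e-15 else (theta / np.sin(theta)) * (y - np.cos(theta) * x)

x_bar = np.array([1.0, 0.0, 0.0])                          # a known minimizer
c = A @ x_bar - np.outer(x_bar, x_bar) @ A @ x_bar         # here c = 0

def in_characterized_set(x):
    # Conditions of (3); since grad f(x_bar) = 2c = 0, the transport
    # condition P_x^{x_bar}[grad f(x)] = grad f(x_bar) reduces to grad f(x) = 0.
    return (abs(2 * c @ exp_inv(x_bar, x)) < 1e-12
            and np.allclose(grad_f(x), np.zeros(3)))

on_equator = np.array([np.cos(0.7), np.sin(0.7), 0.0])     # lies in the solution set
off_equator = np.array([np.cos(0.4), 0.0, np.sin(0.4)])    # lies in S, not a minimizer

assert in_characterized_set(on_equator)
assert not in_characterized_set(off_equator)
```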