1 Introduction

The convex feasibility problem (CFP, for short), which consists in finding a point in the intersection of finitely many closed and convex sets in Hilbert spaces, is a very broad and flexible framework, and its study has a long and rich history in applied mathematics. The main approach to solving this problem is offered by projection-type algorithms, which go back to the work of von Neumann in the early 1930s for the special case of two sets. For details, we refer the reader to [1–10] and the references therein. For the special situation, where each convex set is given as a sublevel set of a convex function, Censor and Lent proposed in [11] a cyclic subgradient projection algorithm, which uses, instead of the orthogonal projections onto the convex sets, steps along subgradients of the corresponding functions. As pointed out in [12], this special setting does not lose generality, because one can choose the functions to be the squared Euclidean distance functions to the sets involved, thereby covering the general case. Since then, some authors have incorporated acceleration procedures in order to improve the speed of convergence; see, e.g., [2, 8, 12–15].

Recent interest has focused on extending some important notions and techniques from Hilbert spaces to Riemannian manifolds. The reason is that, as explained in [16], some optimization problems arising in various applications cannot be posed in linear spaces and require Riemannian manifold structures for their formalization and study, while some nonconvex and/or nonsmooth problems of constrained optimization in \({\mathbb {R}}^n\) can be reduced to convex and smooth unconstrained optimization problems on appropriate Riemannian manifolds; see, for example, [17–25]. For developments of optimization techniques on Riemannian manifolds, we refer the reader to [16, 26–37] and the bibliographies therein for various results, examples, discussions, and applications. Here, let us particularly mention the CFP on Riemannian manifolds with each convex set given as a sublevel set of a convex function, and the cyclic subgradient projection algorithm (i.e., Algorithm 3.1) for solving it; this algorithm was first proposed in [38], but the idea comes from the book of Udriste, that is, the general descent and/or gradient algorithms studied in [25, Chap. 7, §3 and §4] for finding a minimizer of a \(C^2\)-function \(f\) on Riemannian manifolds. Convergence properties of the gradient algorithm have also been studied in [24, 30, 34, 37, 39]. The CFP can be reformulated as a convex minimization problem on Riemannian manifolds, but the induced convex function \(f\) is nonsmooth in general; hence, the known results in [25] and the literature mentioned above can be applied neither to solve the CFP on Riemannian manifolds considered here, nor to establish the convergence properties of the cyclic subgradient projection algorithm.

In the special case, when the underlying Riemannian manifold has non-negative sectional curvature, Bento and Melo proved in [38] that the cyclic subgradient projection algorithm converges to a solution of the CFP and, if additionally a Slater type condition is satisfied, that this algorithm has the finite termination property. Moreover, the following open problem was proposed there: How can the CFP be solved on Riemannian manifolds with negative sectional curvature?

The main purpose of the present paper is to study the convergence of the cyclic subgradient projection algorithm on Riemannian manifolds whose sectional curvature may be negative. More precisely, under the assumption that the underlying Riemannian manifold has sectional curvature bounded from below, we show that, employing the same step sizes as in [38], any sequence generated by the cyclic subgradient projection algorithm converges to a solution of the CFP, and that, if a Slater type condition is additionally satisfied, this algorithm terminates in a finite number of iterations. Our results extend the corresponding ones in [38] and, in particular, partially solve the open problem mentioned above.

The paper is organized as follows. In Sect. 2, we recall some basic notions, notations, and properties of Riemannian geometry, together with some definitions and preliminary results of convex analysis on Riemannian manifolds. In Sect. 3, the cyclic subgradient projection algorithm considered in [38] is introduced and some useful lemmas are established. The main results of the present paper, namely the convergence theorems and the finite termination theorems for this algorithm, are presented in Sects. 4 and 5, respectively.

2 Notations and Preliminaries

The notation used in the present paper is standard; the reader is referred to some textbooks for more details, for example, [25, 40, 41].

Let \(M\) be a complete connected \(n\)-dimensional Riemannian manifold. We use \(\nabla \) to denote the Levi-Civita connection on \(M\). For \(x\in M\), let \(T_{x}M\) denote the tangent space at \(x\) to \(M\). We denote by \(\langle \cdot ,\cdot \rangle _{x}\) the scalar product on \(T_{x}M\) with the associated norm \(\Vert \cdot \Vert _{x}\), where the subscript \(x\) is sometimes omitted. For \(y\in {M}\), let \(\gamma :[0,1]\rightarrow M\) be a piecewise smooth curve joining \(x\) to \(y\). Then, the arc-length of \(\gamma \) is defined by \(l(\gamma ):=\int _{0}^{1}\Vert \gamma '(t)\Vert \mathrm{d}t\), while the Riemannian distance from \(x\) to \(y\) is defined by \(d(x,y):=\inf _{\gamma }l(\gamma )\), where the infimum is taken over all piecewise smooth curves \(\gamma :[0,1]\rightarrow M\) joining \(x\) to \(y\). A vector field \(V\) is said to be parallel along \(\gamma \) if \(\nabla _{\gamma '}V=0\). In particular, if \(\gamma '\) is parallel along \(\gamma \) itself, then \(\gamma \) is called a geodesic; that is, a smooth curve \(\gamma \) is a geodesic if and only if \(\nabla _{\gamma '}{\gamma '}=0\). A geodesic \(\gamma :[0,1]\rightarrow M\) joining \(x\) to \(y\) is minimal if its arc-length equals the Riemannian distance between \(x\) and \(y\). By the Hopf–Rinow theorem [41], \((M,d)\) is a complete metric space, and there is at least one minimal geodesic joining \(x\) to \(y\). The closed metric ball in \(M\) centered at the point \(x\in M\) with radius \(r>0\) is denoted by \(\mathbb {B}(x,r)\), i.e.,

$$\begin{aligned} \mathbb {B}(x,r):=\{y\in M:d(x,y)\le r\}. \end{aligned}$$

A complete simply connected Riemannian manifold of non-positive sectional curvature is called a Hadamard manifold. One important example of a Hadamard manifold is the \(m\)-dimensional hyperbolic space \({\mathbb {H}}^m\), which is defined as follows:

$$\begin{aligned} {\mathbb {H}}^m:=\{x=(x_1,\ldots ,x_{m+1})\in {\mathbb {R}}^{m+1}: \langle x,x\rangle _{{\mathbb {H}}}=-1, x_{m+1}>0\}, \end{aligned}$$

where \(\langle \cdot ,\cdot \rangle _{\mathbb {H}}\) denotes the symmetric bilinear form (which is called the Lorentz metric on \({\mathbb {R}}^{m+1}\)), defined by

$$\begin{aligned} \langle x,y\rangle _{\mathbb {H}}:=\sum _{i=1}^{m}x_iy_i-x_{m+1}y_{m+1}\quad \mathrm{for\,any\,} x=(x_i), y=(y_i)\in {\mathbb {R}}^{m+1}. \end{aligned}$$

Then, \({\mathbb {H}}^m\) is a Hadamard manifold with sectional curvature \(-1\) [20, 27]. Let \(\kappa <0\) and let \(M_{\kappa }^{m}\) be the Hadamard manifold obtained from the \(m\)-dimensional hyperbolic space \({\mathbb {H}}^m\) by multiplying the distance function by the constant \(\frac{1}{\sqrt{|\kappa |}}\).
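The hyperboloid model is convenient for explicit computation. The following minimal Python sketch (with numpy; the helper names lorentz, dist, exp_map, and log_map are our own choices, not taken from the references) implements the Lorentz form, the induced distance \(d(x,y)=\mathrm{arccosh}(-\langle x,y\rangle _{\mathbb {H}})\), and the exponential map and its inverse on \({\mathbb {H}}^m\); we reuse it for the illustrative examples in Sects. 4 and 5.

```python
import numpy as np

def lorentz(u, v):
    """Lorentz form <u,v>_H = u_1 v_1 + ... + u_m v_m - u_{m+1} v_{m+1}."""
    return u[:-1] @ v[:-1] - u[-1] * v[-1]

def dist(x, y):
    """Riemannian distance on H^m: d(x,y) = arccosh(-<x,y>_H)."""
    return float(np.arccosh(max(-lorentz(x, y), 1.0)))   # clamp guards rounding

def exp_map(x, v):
    """Exponential map at x applied to a tangent vector v (i.e., <x,v>_H = 0)."""
    t = np.sqrt(max(lorentz(v, v), 0.0))                 # Riemannian norm of v
    if t < 1e-15:
        return x
    return np.cosh(t) * x + np.sinh(t) * (v / t)

def log_map(x, y):
    """Inverse of exp_map: the vector in T_x H^m of norm d(x,y) pointing to y."""
    d = dist(x, y)
    if d < 1e-15:
        return np.zeros_like(x)
    u = y + lorentz(x, y) * x                # tangential component of y at x
    return d * u / np.sqrt(lorentz(u, u))    # note <u,u>_H = sinh(d)^2 > 0
```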

Consider two geodesics \(\gamma _1,\gamma _2:[0,+\infty [\rightarrow M\) emanating from \(p\) (i.e., \(\gamma _1(0)=\gamma _2(0)=p\)). Let \(\angle _p(\gamma _1,\gamma _2)\) denote the angle between \(\gamma _1\) and \(\gamma _2\) at \(p\), which is defined to be the angle between the tangent vectors \(\gamma _1'(0)\) and \(\gamma _2'(0)\). By [27, p. 173, Corollary 1A.7], \(\angle _p(\gamma _1,\gamma _2)\) coincides with the Alexandrov angle. Recall that a geodesic triangle \(\triangle (p_{1}p_{2}p_{3})\) in \(M\) is a figure consisting of three points \(p_{1},p_{2},p_{3}\) (the vertices of \(\triangle (p_{1}p_{2}p_{3})\)) and three geodesic segments \(\gamma _{i}\) (the edges of \(\triangle (p_{1}p_{2}p_{3})\)) joining \(p_{i-1}\) to \(p_{i+1}\) for \(i=1,2,3\) (mod 3). For each \(i=1,2,3\) (mod 3), the inner angle of \(\triangle (p_{1}p_{2}p_{3})\) at \(p_{i}\) is denoted by \(\angle (p_{i-1}p_{i}p_{i+1})\), which equals \(\angle _{p_i}(-\gamma _{i-1},\gamma _{i+1})\). The following proposition is known in [40, p. 138].

Proposition 2.1

Let \(\kappa <0\) and let \(\triangle (p_{1}p_{2}p_{3})\) be a geodesic triangle in \(M_{\kappa }^{2}\). Then, the following “law of cosines” holds:

$$\begin{aligned} \cosh (\sqrt{|\kappa |}l_{2})&=\cosh (\sqrt{|\kappa |}l_{1})\cosh (\sqrt{|\kappa |}l_{3})\\&\quad -\sinh (\sqrt{|\kappa |}l_{1})\sinh (\sqrt{|\kappa |}l_{3})\cos \angle (p_{1}p_{2}p_{3}), \end{aligned}$$

where \(l_i=d(p_{i-1},p_{i+1})\) for each \(i=1,2,3\) (mod 3).
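As a consistency check (not part of the statement in [40]), letting \(\kappa \rightarrow 0^{-}\) recovers the Euclidean law of cosines: expanding \(\cosh u=1+\frac{u^{2}}{2}+O(u^{4})\) and \(\sinh u=u+O(u^{3})\) in the formula above gives

$$\begin{aligned} 1+\frac{|\kappa |l_{2}^{2}}{2}+O(\kappa ^{2}) =1+\frac{|\kappa |\left( l_{1}^{2}+l_{3}^{2}\right) }{2} -|\kappa |\,l_{1}l_{3}\cos \angle (p_{1}p_{2}p_{3})+O(\kappa ^{2}), \end{aligned}$$

so that, after cancelling the ones and dividing by \(|\kappa |\), we obtain \(l_{2}^{2}=l_{1}^{2}+l_{3}^{2}-2l_{1}l_{3}\cos \angle (p_{1}p_{2}p_{3})\) in the limit.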

Another important tool that will be used is the comparison theorem described in Proposition 2.2 below, which is known in [40, p. 161, Theorem 4.2].

Following [40, p. 161], a generalized geodesic hinge \(\varLambda (p;\gamma ,\tau )\) in \(M\) is a figure consisting of a point \(p\in M\) (the vertex of the hinge) and two geodesic segments \(\gamma ,\tau \) (the edges of the hinge) emanating from \(p\), at least one of which is minimal. Moreover, a hinge \(\varLambda (\bar{p};\bar{\gamma },\bar{\tau })\) in \(M_{\kappa }^{2}\) is called a comparison hinge of \(\varLambda (p;\gamma ,\tau )\) if it satisfies

$$\begin{aligned} l(\bar{\gamma })=l(\gamma ),\quad l(\bar{\tau })=l(\tau ),\quad \mathrm{and}\quad \angle _{\bar{p}}(\bar{\gamma },\bar{\tau })=\angle _{p}(\gamma , \tau ). \end{aligned}$$

Proposition 2.2

Let \(\kappa <0\) and suppose that \(M\) is a complete Riemannian manifold of sectional curvature bounded from below by \(\kappa \). Let \(\varLambda (p;\gamma ,\tau )\) be a generalized geodesic hinge in \(M\). Then, there exists a comparison hinge \(\varLambda (\bar{p};\bar{\gamma },\bar{\tau })\) in \(M_{\kappa }^{2}\) such that \(d(q_\gamma ,q_\tau )\le d(\bar{q}_{\bar{\gamma }},\bar{q}_{\bar{\tau }})\), where \(q_\gamma \), \(q_\tau \), \(\bar{q}_{\bar{\gamma }}\), and \(\bar{q}_{\bar{\tau }}\) denote the end points of \(\gamma \), \(\tau \), \(\bar{\gamma }\), and \(\bar{\tau }\), respectively.

Let \(\varGamma _{xy}\) denote the set of all geodesics \(\gamma :[0,1]\rightarrow M\) with \(\gamma (0)=x\) and \(\gamma (1)=y\). Then, \(\varGamma _{xy}\) is nonempty for all \(x,y\in M\). Consider next a real-valued function \(f:M\rightarrow \mathbb {R}\). We say that \(f\) is convex on \(M\) if, for any \(x,y\in M\) and any \(\gamma \in \varGamma _{xy}\), the composition \((f\circ \gamma ):[0,1]\rightarrow \mathbb {R}\) is a convex function on \([0,1]\), i.e.,

$$\begin{aligned} f(\gamma (t))\le (1-t)f(x)+tf(y)\quad \mathrm{for\,all}\; t\in [0,1]. \end{aligned}$$

For the remainder of this section, we always assume that \(f:M\rightarrow \mathbb {R}\) is convex on \(M\). Then, as pointed out in [21, p. 642] (see also [25, p. 70]), \(f\) is Lipschitz continuous on any compact subset of \(M\); in particular, \(f\) is continuous on \(M\). Let \(x\in M\). We define the subdifferential \(\partial f(x)\) of \(f\) at \(x\) by

$$\begin{aligned} \partial f(x):=\{g\in T_{x}M:f(y)\ge f(x)+\langle g,\gamma '(0)\rangle , \quad \forall y\in M \;\mathrm{and}\;\gamma \in \varGamma _{xy}\}. \end{aligned}$$

Then, \(\partial f(x)\) is a nonempty, closed convex set in the space \(T_{x}M\). Furthermore, by the Lipschitz continuity, one sees that

$$\begin{aligned} \partial f(\cdot )\,\mathrm{is\,uniformly\,bounded\,on\,any\,bounded\,subset\,of\,}M \end{aligned}$$
(1)

(see also [38]).
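As an illustration (on a Hadamard manifold, where the distance function to a fixed point is convex; this is an assumption beyond the standing hypotheses of this section), take \(f:=d(\cdot ,p)\) for some fixed \(p\in M\). Then, in analogy with the norm function in Hilbert spaces,

$$\begin{aligned} \partial f(p)=\{g\in T_{p}M:\Vert g\Vert \le 1\}\quad \mathrm{and}\quad \partial f(x)=\left\{ -\frac{\exp _{x}^{-1}p}{d(x,p)}\right\} \quad \mathrm{for\,each}\; x\ne p, \end{aligned}$$

so that, for this \(f\), the bound in (1) holds with constant \(1\).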

We end this section with the notions of the distance function and the projection associated with a subset \(Q\) of \(M\). The distance function \(d_{Q}(\cdot )\) of \(Q\) is defined by

$$\begin{aligned} d_{Q}(x):=\inf _{y\in Q}d(x,y)\quad \mathrm{for\,each}\; x\in M; \end{aligned}$$

while the projection \(P(\cdot |Q)\) on \(Q\) is defined by

$$\begin{aligned} P(x|Q):=\left\{ y\in Q:d(x,y)=d_{Q}(x)\right\} \quad \mathrm{for\,each}\; x\in M. \end{aligned}$$

Note that, in general, \(P(\cdot |Q)\) is a set-valued map even if \(Q\) is convex in the sense that, for any \(y,z\in Q\), any minimal geodesic joining \(y\) to \(z\) is contained in \(Q\). For example, consider \(M:={\mathbb {S}}^2\), the two-dimensional sphere, and

$$\begin{aligned} Q:=\{(t_1,t_2,t_3)\in M: \mathrm{each}\; t_i\ge 0\}. \end{aligned}$$

Let \(x_0:=(0,0,-1)\). Then, \(d_{Q}(x_0)=\frac{\pi }{2}\) and \(P(x_0|Q)=\{ (t_1,t_2,0)\in Q\}\); indeed, \(\cos d(x_0,(t_1,t_2,t_3))=-t_3\le 0\) for each \((t_1,t_2,t_3)\in Q\), with equality if and only if \(t_3=0\).

3 Algorithm and Auxiliary Results

Let \(I:=\{1,2,\ldots ,m\}\), where \(m\in {\mathbb {N}}\), and let \(f_{i}:\ M\rightarrow \mathbb {R}\) be a convex function for each \(i\in I\). Consider a family of closed and convex subsets \(\{Q_i:i\in I\}\) in \(M\) with each \(Q_i\) given by

$$\begin{aligned} Q_{i}:=\left\{ x\in M\ :\ f_{i}(x)\le 0\right\} . \end{aligned}$$

The CFP on the Riemannian manifold \(M\) considered here consists in finding a point \(x^*\) satisfying

$$\begin{aligned} x^{*}\in Q:=\bigcap _{i=1}^{m}Q_{i}. \end{aligned}$$

Let \(f:\ M\rightarrow \mathbb {R}\) be the sup-function of \(\{f_i:i\in I\}\) defined by

$$\begin{aligned} f(x):=\max _{i\in I}f_{i}(x)\quad \mathrm{for\,each}\; x\in M. \end{aligned}$$

Then, \(f\) is convex on \(M\), and the solution set \(Q\) is re-expressed as

$$\begin{aligned} Q=\{x\in M:f(x)\le 0\}. \end{aligned}$$

We assume throughout the whole paper that \(Q\ne \emptyset \).

The following algorithm was proposed by Bento and Melo in [38] for solving the CFP on Riemannian manifolds; it is actually suggested by the book of Udriste (see [25, Chap. 7, §3 and §4]). In particular, this algorithm coincides with “the gradient method” described in [25, p. 263] in the case when \(f\) is smooth.

Algorithm 3.1

  • Step 1. Select \(x^{0}\in M\) and put \(k:=0\).

  • Step 2. If \(f(x^{k})\le 0\), then stop; otherwise, go to Step 3.

  • Step 3. Put \(i_{k}:=(k\,\mathrm{mod}\,m)+1\).

  • Step 4. If \(f_{i_{k}}(x^{k})\le 0\), then set \(x^{k+1}:=x^{k}\); otherwise, go to Step 5.

  • Step 5. Select \(g^{k}\in \partial f_{i_{k}}({x^{k}})\) and \(t_k \in ]0,+\infty [\), and construct the geodesic \(\gamma _k\) such that

    $$\begin{aligned} \gamma _{k} (0)=x^{k}\quad \mathrm{and}\quad \gamma '_k(0)=-\frac{g^{k}}{\Vert g^{k}\Vert } . \end{aligned}$$
    (2)
  • Step 6. Set \(x^{k+1}:=\gamma _{k}(t_{k})\).

  • Step 7. Replace \(k\) by \(k+1\) and go to Step 2.
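For illustration, the following Python sketch mirrors Algorithm 3.1 with the step sizes (10) specified below and a constant relaxation parameter. It is a minimal sketch under our own conventions (tangent vectors stored as arrays, and oracles fs, subgrads, exp_map, and norm supplied by the caller), not the authors' implementation.

```python
def cyclic_subgradient_projection(x0, fs, subgrads, exp_map, norm,
                                  alpha=1.0, f_stars=None,
                                  max_iter=10000, tol=0.0):
    """Minimal sketch of Algorithm 3.1.

    fs       : list of convex functions f_i : M -> R
    subgrads : subgrads[i](x) returns some g in the subdifferential of f_i at x
    exp_map  : exp_map(x, v) follows the geodesic from x with initial velocity v
    norm     : norm(x, g), the Riemannian norm of g in T_x M
    alpha    : constant relaxation parameter alpha_k (should satisfy (11))
    f_stars  : lower estimates f_i^*; zero as in Sect. 4 unless supplied
    """
    m = len(fs)
    f_stars = [0.0] * m if f_stars is None else f_stars
    x = x0
    for k in range(max_iter):
        if max(f(x) for f in fs) <= tol:          # Step 2: f(x^k) <= 0, stop
            return x, k
        i = k % m                                  # Step 3 (0-based indices)
        if fs[i](x) <= 0:                          # Step 4: x^{k+1} := x^k
            continue
        g = subgrads[i](x)                         # Step 5: subgradient g^k
        gn = norm(x, g)
        t = alpha * (fs[i](x) - f_stars[i]) / gn   # step size t_k as in (10)
        x = exp_map(x, -(t / gn) * g)              # Step 6: x^{k+1} = gamma_k(t_k)
    return x, max_iter
```

In \({\mathbb {R}}^n\), with exp_map(x, v) = x + v and the Euclidean norm, this sketch reduces to the classical cyclic subgradient projection method of Censor and Lent [11].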

Remark 3.1

  (i)

    Note that the selected \(g^{k}\in \partial f_{i_{k}}({x^{k}})\) in Step 5 is nonzero (otherwise, \(x^{k}\) would be a minimizer of \(f_{i_{k}}\), and then \(f_{i_{k}}(x^{k})\le 0\)). This means that Algorithm 3.1 is well defined.

  (ii)

    The sequence \(\{t_{k}\}\) is called the step size sequence. By Algorithm 3.1, we have that

    $$\begin{aligned} d(x^{k},x^{k+1})\le t_{k}\quad \mathrm{for\,each}\; k\in {\mathbb {N}}. \end{aligned}$$
    (3)

To analyze the convergence of Algorithm 3.1, we first establish a basic inequality on Riemannian manifolds. For simplicity, we define the function \(\hbar :]0,\infty [\rightarrow {\mathbb {R}}\) by

$$\begin{aligned} \hbar (t):=\frac{\tanh t}{t}\quad \mathrm{for\,any}\; t\in ]0,\infty [. \end{aligned}$$

It is easy to verify that \(\hbar \) is strictly decreasing on \(]0,\infty [\); indeed, \(\hbar '(t)=\frac{t-\sinh t\cosh t}{t^{2}\cosh ^{2}t}<0\) for each \(t>0\), since \(\sinh t\cosh t=\frac{1}{2}\sinh (2t)>t\).

For the remainder of the paper, we assume that the Riemannian manifold \(M\) is of sectional curvature bounded from below by some negative number \(\kappa \).

Lemma 3.1

Let \(\{x^{k}\}\) be a sequence generated by Algorithm 3.1. Let \(z\in M\) and let \(k\ge 0\) be such that \(f_{i_{k}}(x^{k})>0\). Then, the following inequality holds:

$$\begin{aligned} \cosh \left( \sqrt{|\kappa |}d(x^{k+1},z)\right)&\le \cosh \left( \sqrt{|\kappa |}d(x^{k},z)\right) \\ \nonumber&\quad +\sqrt{|\kappa |}\cosh \left( \sqrt{|\kappa |} d(x^{k},z)\right) \sinh (\sqrt{|\kappa |}t_{k}) \\ \nonumber&\quad \cdot \left( \frac{t_{k}}{2}-\hbar \left( \sqrt{|\kappa |} d(x^{k},z)\right) \frac{f_{i_{k}}(x^{k})-f_{i_{k}}(z)}{\Vert g^{k}\Vert }\right) . \end{aligned}$$
(4)

Proof

Without loss of generality, we assume that \(\kappa =-1\); the general case follows by rescaling the metric of \(M\) by the factor \(|\kappa |\), which replaces every distance \(d\) by \(\sqrt{|\kappa |}d\) and the curvature lower bound \(\kappa \) by \(-1\). Take \(z\in M\) and let \(\tau _{k}:[0,1]\rightarrow M\) be a minimal geodesic segment joining \(x^{k}\) to \(z\) with \(\tau _k(0)=x^{k}\) and \(\tau _k(1)=z\). Let \(\gamma _{k}:[0, t_k]\rightarrow M\) be the geodesic segment defined by Algorithm 3.1. Then,

$$\begin{aligned} l(\gamma _{k})=t_{k},\quad l(\tau _{k})=d(x^{k},z). \end{aligned}$$
(5)

Consider the generalized geodesic hinge \(\varLambda (x^{k};\gamma _{k},\tau _{k})\) in \(M\), and let \(\alpha := \angle _{x^k}(\gamma _k,\tau _k)\). Then \(\alpha \) coincides with the angle between the tangent vectors \(\gamma '_{k}(0)\) and \(\tau '_{k}(0)\) in the tangent space \(T_{x^k}M\). This, together with (2), implies that

$$\begin{aligned} \langle g^{k},\tau '_{k}(0)\rangle =-\Vert g^{k}\Vert \cdot d(x^{k},z)\cos \alpha . \end{aligned}$$
(6)

By Proposition 2.2, there exists a comparison geodesic hinge \(\varLambda (\bar{x}^{k};\bar{\gamma }_{k},\bar{\tau }_{k})\) of the hinge \(\varLambda (x^{k};\gamma _{k},\tau _{k})\) in \(M_{-1}^{2}\) such that

$$\begin{aligned} d(x^{k+1},z)\le d(\bar{x}^{k+1},\bar{z}), \end{aligned}$$
(7)

where \(\bar{x}^{k+1}\) and \(\bar{z}\) are the end points of \(\bar{\gamma }_{k}\) and \(\bar{\tau }_{k}\), respectively. Consider the geodesic triangle \(\triangle (\bar{x}^{k}\bar{x}^{k+1}\bar{z})\) in \(M_{-1}^{2}\). By the definition of the comparison geodesic hinge and using (5), we have that

$$\begin{aligned} d(\bar{x}^{k},\bar{x}^{k+1})=t_{k},\quad d(\bar{x}^{k},\bar{z})=d(x^{k},z), \quad \mathrm{and}\quad \angle _{\bar{x}^{k}}(\bar{\gamma }_k,\bar{\tau }_k) =\alpha . \end{aligned}$$

Then, applying the “law of cosines” to the geodesic triangle \(\triangle (\bar{x}^{k}\bar{x}^{k+1}\bar{z})\) in \(M_{-1}^{2}\), we get

$$\begin{aligned} \cosh d(\bar{x}^{k+1},\bar{z})=\cosh d(x^{k},z)\cosh t_{k}-\sinh d(x^{k},z)\sinh t_{k}\cos \alpha . \end{aligned}$$

Combining this and (7) yields that

$$\begin{aligned} \cosh d(x^{k+1},z) \le \cosh d(x^{k},z)\cosh t_{k} -\sinh d(x^{k},z)\sinh t_{k}\cos \alpha . \end{aligned}$$
(8)

By elementary calculus (the function \(t\mapsto 1+\frac{t}{2}\sinh t-\cosh t\) vanishes at \(0\) and has derivative \(\frac{\cosh t}{2}\left( t-\tanh t\right) \ge 0\)), one can verify that

$$\begin{aligned} \cosh t \le 1+\frac{t}{2}\sinh t\quad \mathrm{for\,any}\; t\ge 0. \end{aligned}$$

It follows from (8) that

$$\begin{aligned}&\cosh d(x^{k+1},z)\le \cosh d(x^{k},z)\left( 1+\frac{t_{k}\sinh t_{k}}{2}\right) -\sinh d(x^{k},z)\sinh t_{k}\cos \alpha \nonumber \\&\quad =\cosh d(x^{k},z)+\cosh d(x^{k},z)\sinh t_{k}\left[ \frac{t_{k}}{2}-\tanh d(x^{k},z)\cos \alpha \right] . \end{aligned}$$
(9)

Recall that \(g^{k}\in \partial f_{i_{k}}(x^{k})\). By equality (6), one has that

$$\begin{aligned} f_{i_{k}}(z)\ge f_{i_{k}}(x^{k})+\langle g^{k},\tau '_{k}(0)\rangle =f_{i_{k}}(x^{k})-\Vert g^{k}\Vert \cdot d(x^{k},z)\cos \alpha ; \end{aligned}$$

hence,

$$\begin{aligned} -\cos \alpha \le -\frac{f_{i_{k}}(x^{k})-f_{i_{k}}(z)}{d(x^{k},z)\Vert g^{k}\Vert }. \end{aligned}$$

Substituting this into (9) and recalling the definition of the function \(\hbar \), we see that (4) holds, and the proof is complete. \(\square \)

Set

$$\begin{aligned} f^{*}:=\inf _{x\in M}f(x). \end{aligned}$$

Then, \(f^{*}\le 0\) by the assumption that \(Q\ne \emptyset \). Throughout the whole paper, let \(\{f^*_i:\,i\in I\}\subseteq {\mathbb {R}}\) satisfy

$$\begin{aligned} f^*\le f_i^*\le 0 \quad \mathrm{for\,each}\; i\in I \end{aligned}$$

and choose the step size sequence \(\{t_k\}\) as follows:

$$\begin{aligned} t_{k}:=\left\{ \begin{array}{ll} \alpha _{k}\frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{\Vert g^{k}\Vert },\quad &{} \mathrm{if}\;f_{i_{k}}(x^{k})>0,\\ 0,&{} \mathrm{otherwise}, \end{array} \right. \end{aligned}$$
(10)

where \(i_{k}=(k\,\mathrm{mod}\,m)+1\). Set

$$\begin{aligned} C:=\{x\in M:f_i(x)\le f_i^*,\quad \forall i\in I\}. \end{aligned}$$

Note that \(C\subseteq Q\), and equality holds if \(f_i^*=0\) for each \(i\in I\). Furthermore, let \(x^0\in M\) and write \(r_0:= d_C(x^{0})\).

Lemma 3.2

Let \(x^0\in M\) and suppose that

$$\begin{aligned} 0<\inf _k\alpha _{k}\le \sup _k\alpha _k<2\hbar (2\sqrt{|\kappa |}r_{0}). \end{aligned}$$
(11)

Let \(\{x^{k}\}\) be a sequence generated by Algorithm 3.1 with initial point \(x^0\). Set

$$\begin{aligned} \tilde{C}:=C\bigcap \left( \bigcup _{\omega \in P(x^{0}|C)}{\mathbb {B}}(\omega ,r_{0})\right) \end{aligned}$$

and let \(z\in \tilde{C}\). Then, the following estimate holds for each \(k\in {\mathbb {N}}\):

$$\begin{aligned} d^{2}(x^{k+1},z)\le d^{2}(x^{k},z)-\frac{\sigma t_{k}^{2}}{4\beta }, \end{aligned}$$
(12)

where

$$\begin{aligned} \beta :=\frac{\sinh (2\sqrt{|\kappa |}r_{0})}{4\sqrt{|\kappa |}r_{0}}\quad \mathrm{and}\quad \sigma :=\min \left\{ \inf _k\alpha _{k}, 2-\frac{\sup _k\alpha _{k}}{\hbar (2\sqrt{|\kappa |}r_{0})}\right\} . \end{aligned}$$

Proof

As in the proof of Lemma 3.1, we also assume that \(\kappa =-1\). By the definition of \(\sigma \), assumption (11) implies that \(\sigma \in ]0,1[\) (noting that \(\hbar (2r_{0})<1\)) and

$$\begin{aligned} \sigma \le \alpha _{k}\le (2-\sigma )\hbar (2r_{0})\quad \mathrm{for\,each}\; k\in {\mathbb {N}}. \end{aligned}$$
(13)

Without loss of generality, we assume that \(f_{i_{k}}(x^{k})> 0\) (otherwise, \(x^{k+1}=x^k\) and (12) holds trivially by (10)). Then,

$$\begin{aligned} f_{i_{k}}(x^{k})-f_{i_{k}}(z)\ge f_{i_{k}}(x^{k})-f_{i_k}^*>0 \end{aligned}$$
(14)

(noting that \(z\in \tilde{C}\)). By the choice of \(t_k\) and noting (13), one sees that

$$\begin{aligned} \frac{t_{k}}{2}=\frac{\alpha _{k}}{2} \frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{\Vert g^{k}\Vert } \le \left[ \hbar (2r_{0})-\frac{\sigma \hbar (2r_{0})}{2}\right] \frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{\Vert g^{k}\Vert }. \end{aligned}$$

This, together with (14), implies that

$$\begin{aligned}&\frac{t_{k}}{2}-\hbar \left( d(x^{k},z)\right) \frac{f_{i_{k}}(x^{k})-f_{i_{k}}(z)}{\Vert g^{k}\Vert }\\&\quad \le \left[ \hbar (2r_{0})-\hbar \left( d(x^{k},z)\right) -\frac{\sigma \hbar (2r_0)}{2}\right] \frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{\Vert g^{k}\Vert }. \end{aligned}$$

Applying (4), we get that

$$\begin{aligned}&\cosh d(x^{k+1},z)\le \cosh d(x^{k},z)\\ \nonumber&\quad +\cosh d(x^{k},z)\sinh t_{k}\left[ \hbar (2r_0)-\hbar \left( d(x^{k},z)\right) -\frac{\sigma \hbar (2r_0)}{2}\right] \frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{\Vert g^{k}\Vert }. \end{aligned}$$
(15)

Since \(\sigma >0\), we have that

$$\begin{aligned}&\cosh d(x^{k+1},z)\le \cosh d(x^{k},z)\\ \nonumber&\quad +\cosh d(x^{k},z)\sinh t_{k}\left[ \hbar (2r_0)-\hbar \left( d(x^{k},z)\right) \right] \frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{\Vert g^{k}\Vert }. \end{aligned}$$
(16)

Below, we use mathematical induction to show that

$$\begin{aligned} d(x^{k},z)\le 2r_0 \end{aligned}$$
(17)

holds for each \(k\in {\mathbb {N}}\). By the choice of \(z\) and the definition of \( \tilde{C}\), we can choose \(\omega \in P(x^0|C)\) such that \(d(z,\omega )\le r_0\). Thus, (17) is clear for \(k=0\), because

$$\begin{aligned} d(x^0,z)\le d(x^0,\omega )+ d(\omega ,z)\le 2r_0. \end{aligned}$$

Now, suppose that (17) is valid for \(k=n\), that is,

$$\begin{aligned} d(x^{n},z)\le 2r_0. \end{aligned}$$
(18)

Then,

$$\begin{aligned} \hbar (2r_0)\le \hbar \left( d(x^{n},z)\right) \end{aligned}$$
(19)

(as \(\hbar \) is decreasing on \(]0,+\infty [\)). It follows from (16) that

$$\begin{aligned} \cosh d(x^{n+1},z)\le \cosh d(x^{n},z); \end{aligned}$$

hence, \( d(x^{n+1},z)\le d(x^{n},z)\). This, together with (18), yields that (17) holds for \(k=n+1\). Therefore, (17) holds for all \(k\in {\mathbb {N}}\), and so does (19) (with \(k\) in place of \(n\)). Thus, combining (15) and (19), one concludes that

$$\begin{aligned} \cosh d(x^{k+1},z)\le \cosh d(x^{k},z)-\cosh d(x^{k},z)\sinh t_{k}\frac{\sigma \hbar (2r_0)\left( f_{i_{k}}(x^{k})-f_{i_k}^*\right) }{2\Vert g^{k}\Vert }. \end{aligned}$$
(20)

Recalling (17) and that \(d(x^{k+1},z)\le d(x^{k},z)\), one can apply the mean value theorem to the function \(s\mapsto \cosh \sqrt{s}\) on \([0,4r_{0}^{2}]\), whose derivative \(\frac{\sinh \sqrt{s}}{2\sqrt{s}}\) is increasing and bounded above by \(\beta \) there, to verify the following estimate:

$$\begin{aligned} \cosh d(x^{k+1},z)-\cosh d(x^{k},z)\ge \beta \left( d^{2}(x^{k+1},z)- d^{2}(x^{k},z)\right) , \end{aligned}$$

where \(\beta :=\frac{\sinh (2r_0)}{4r_0}\). This, together with (20), implies that

$$\begin{aligned} d^2(x^{k+1},z)\le d^2(x^{k},z)-\frac{\sigma }{2\beta } \hbar (2r_0)\cosh d(x^{k},z)\sinh t_{k} \frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{\Vert g^{k}\Vert }. \end{aligned}$$

Since \(\sinh t_{k}\ge t_{k}\) and \(\cosh d(x^{k},z)\ge 1\), it follows that

$$\begin{aligned} d^2(x^{k+1},z)\le d^2(x^{k},z)-\frac{\sigma \hbar (2r_0)t_{k}\left( f_{i_{k}}(x^{k})-f_{i_k}^*\right) }{2\beta \Vert g^{k}\Vert }. \end{aligned}$$
(21)

On the other hand, using (10) and (13), we see that

$$\begin{aligned} \frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{ \Vert g^{k}\Vert }=\frac{t_k}{\alpha _k}\ge \frac{t_{k}}{ (2-\sigma )\hbar (2r_0)}\ge \frac{t_{k}}{ 2\hbar (2r_0)}. \end{aligned}$$

This, together with (21), yields (12). The proof is complete. \(\square \)

Remark 3.2

Suppose that assumption (11) holds. Then, the following assertions hold by Lemma 3.2:

  (i)

    The sequence \(\{x^{k}\}\) is contained in \({\mathbb {B}}(\omega ,r_{0})\) for any point \(\omega \in P(x^{0}|C)\), and it is Fejér convergent to \(\tilde{C}\) in the sense that

    $$\begin{aligned} d(x^{k+1},z)\le d(x^{k},z)\quad \mathrm{for\,each}\; k\in {\mathbb {N}}\quad \mathrm{and}\quad z\in \tilde{C}. \end{aligned}$$
    (22)
  (ii)

    By (1), we have that

    $$\begin{aligned} L:=\sup \{\Vert g\Vert :\; g\in \partial f_{i}(x^k),\,(i,k)\in I\times {\mathbb {N}}\}<+\infty . \end{aligned}$$
    (23)
  (iii)

    By (3) and (12), we have that

    $$\begin{aligned} \lim _{k\rightarrow \infty } d(x^{k+1},x^{k})=\lim _{k\rightarrow \infty }t_{k}=0; \end{aligned}$$
    (24)

    indeed, summing (12) over \(k\) shows that \(\sum _{k}t_{k}^{2}\le \frac{4\beta }{\sigma }d^{2}(x^{0},z)<+\infty \), so \(t_{k}\rightarrow 0\), and then (3) applies.

4 Convergence Results

We assume in this section that \(f_i^*=0\) for each \(i\in I\) in the step size sequence \(\{t_k\}\); that is, Algorithm 3.1 employs the following step sizes:

$$\begin{aligned} t_{k}:=\left\{ \begin{array}{ll} \alpha _{k}\frac{f_{i_{k}}(x^{k})}{\Vert g^{k}\Vert },\quad &{} \mathrm{if}\; f_{i_{k}}(x^{k})>0,\\ 0,&{}\mathrm{otherwise}. \end{array} \right. \end{aligned}$$

Then, one has that \(C=Q\). The main result of this section is as follows.

Theorem 4.1

Let \(x^0\in M\) and suppose that assumption (11) holds. Then, any sequence \(\{x^{k}\}\) generated by Algorithm 3.1 with initial point \(x^0\) converges to a point \(x^{\infty }\in Q\).

Proof

By assumption (11), Remark 3.2 is applicable. Then, \(\{x^k\}\) satisfies (22), (24), and that

$$\begin{aligned} d(x^{k },\omega )\le r_0\quad \mathrm{for\,any}\; \omega \in P(x^{0}|C). \end{aligned}$$
(25)

Thus, since closed bounded subsets of \(M\) are compact (by the Hopf–Rinow theorem), there exists a subsequence \(\{x^{k_j}\}\) of \(\{x^{k}\}\) such that \(x^{k_j}\rightarrow x^\infty \in M\). Below, we verify that \(x^\infty \in Q\), that is, \(f_l(x^\infty )\le 0\) for each \(l\in I\). To do this, recall that \(i_{k_j}=(k_j\,\mathrm{mod}\,m)+1\) and so \(i_{k_j}\in I\). Thus, without loss of generality, we assume that \(i_{k_j}=m\) for each \(j\). Let \(l\in I\) and consider the subsequence \(\{x^{k_j+l}\}_{j=0}^\infty \). Then,

$$\begin{aligned} x^{k_j+l}\rightarrow x^\infty \quad \mathrm{and}\quad i_{k_j+l}=l, \end{aligned}$$
(26)

where the first assertion is true by (24). Recalling the choice of \(t_k\), we have that, if \(f_l(x^{k_{j}+l})> 0\), then

$$\begin{aligned} t_{k_j+l}=\alpha _{k_j+l}\frac{f_l(x^{k_j+l})}{\Vert g^{k_j+l}\Vert } \ge \frac{\sigma }{L}f_l(x^{k_j+l}), \end{aligned}$$

where \(L>0\) is given by (23); hence,

$$\begin{aligned} f_l(x^{k_j+l})\le \frac{L}{\sigma }t_{k_j+l}. \end{aligned}$$
(27)

This inequality holds trivially if \(f_l(x^{k_{j}+l})\le 0\). Thus, (27) holds for each \(j\in {\mathbb {N}}\). Since \(f_l\) is continuous on \(M\), it follows from (26) that

$$\begin{aligned} f_l(x^\infty )=\lim _{j\rightarrow \infty }f_l(x^{k_j+l})\le 0. \end{aligned}$$

Therefore, \(x^\infty \in Q\), as \(l\in I\) is arbitrary. Moreover, by (25), we get that \( d(x^{\infty },\omega )\le r_0\), and so \(x^\infty \in \tilde{C}\) (noting that \(C=Q\)). Applying (22), one concludes that \(\{x^{k}\}\) converges to \(x^{\infty }\), and the proof is complete. \(\square \)
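To illustrate Theorem 4.1 numerically, one can combine the sketch of Algorithm 3.1 with the hyperboloid-model helpers from Sect. 2, taking each \(Q_i\) to be a closed ball \(\mathbb {B}(c_i,r_i)\) in \({\mathbb {H}}^2\), i.e., \(f_i:=d(\cdot ,c_i)-r_i\), whose subgradient at an infeasible \(x\) is \(-\exp _{x}^{-1}c_i/d(x,c_i)\). The data below are our own toy choices, and a small tolerance replaces the exact test \(f(x^{k})\le 0\) in floating point:

```python
# Three hyperbolic balls of radius 0.35 on H^2 whose intersection contains
# e = (0, 0, 1); hence Q is nonempty, as assumed throughout.
e = np.array([0.0, 0.0, 1.0])
vs = [np.array([0.3, 0.0, 0.0]), np.array([0.0, 0.3, 0.0]),
      np.array([-0.2, -0.2, 0.0])]               # tangent vectors at e
cs = [exp_map(e, v) for v in vs]                 # ball centers
fs = [lambda x, c=c: dist(x, c) - 0.35 for c in cs]
subgrads = [lambda x, c=c: -log_map(x, c) / dist(x, c) for c in cs]

x0 = exp_map(e, np.array([0.8, -0.5, 0.0]))      # infeasible initial point
x, k = cyclic_subgradient_projection(
    x0, fs, subgrads, exp_map,
    norm=lambda x, g: np.sqrt(lorentz(g, g)), tol=1e-12)
print(k, [f(x) for f in fs])                     # all f_i(x) <= tol at the end
```

With \(\alpha _k\equiv 1\) and \(f_i^*=0\), each update moves \(x^k\) along the geodesic toward \(c_i\) by exactly \(d(x^k,c_i)-r_i\), so the iteration is precisely the cyclic metric projection onto the balls; since \(\kappa =-1\) here, condition (11) asks that \(1<2\hbar (2r_0)\), which holds whenever \(r_0\) is not too large (roughly \(r_0<0.95\); here \(r_0\le d(x^0,e)\approx 0.94\)).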

As a consequence of Theorem 4.1, we get the following corollary, which was proved in [38].

Corollary 4.1

Suppose that the sectional curvature of \(M\) is non-negative and that

$$\begin{aligned} 0<\inf _k\alpha _{k}\le \sup _k\alpha _{k}<2. \end{aligned}$$
(28)

Then, any sequence \(\{x^{k}\}\) generated by Algorithm 3.1 with initial point \(x^0\in M\) converges to a point \(x^{\infty }\in Q\).

Proof

Let \(x^0\in M\). Recall that the function \(\hbar \) is continuous on \(]0,\infty [\) and that \(\lim _{t\rightarrow 0^{+}}\hbar (t)=1\). Then, since \(2\hbar (2\sqrt{|\kappa |}r_{0})\rightarrow 2\) as \(\kappa \rightarrow 0^{-}\), assumption (28) allows us to choose \( \kappa <0\) such that assumption (11) holds. Furthermore, it is clear that the sectional curvature of \(M\), being non-negative, is bounded from below by \( \kappa \). Thus, Theorem 4.1 is applicable, which completes the proof. \(\square \)

5 Finite Convergence

Recall that the step size sequence is given as in (10):

$$\begin{aligned} t_{k}:=\left\{ \begin{array}{ll} \alpha _{k}\frac{f_{i_{k}}(x^{k})-f_{i_k}^*}{\Vert g^{k}\Vert },\quad &{} \mathrm{if}\; f_{i_{k}}(x^{k})>0,\\ 0,&{} \mathrm{otherwise}, \end{array} \right. \end{aligned}$$

where \(i_{k}=(k\,\mathrm{mod}\,m)+1\) and \( f^*\le f_i^*\le 0\) for each \(i\in I\). We suppose in this section that the CFP on \(M\) satisfies the Slater condition:

$$\begin{aligned} f^{*}<0, \end{aligned}$$

and that \(\{f^*_i:i\in I\}\) satisfies

$$\begin{aligned} f^*\le f_i^*<0 \quad \mathrm{for\,each}\; i\in I. \end{aligned}$$

The following theorem shows that Algorithm 3.1 terminates in a finite number of iterations.

Theorem 5.1

Let \(x^0\in M\) and suppose that assumption (11) holds. Then, any sequence \(\{x^{k}\}\) generated by Algorithm 3.1 with initial point \(x^0\) terminates in a finite number of iterations.

Proof

Suppose, on the contrary, that Algorithm 3.1 generates an infinite sequence \(\{x^k\}\), which satisfies

$$\begin{aligned} f(x^{k})=\max _{i\in I}f_i(x^{k})>0\quad \mathrm{for\,each}\; k\in {\mathbb {N}}. \end{aligned}$$

This means that there exists a subsequence \(\{x^{k_j}\}\) such that

$$\begin{aligned} f_{i_{k_j}}(x^{k_j})>0\quad \mathrm{for\,each}\; j\in {\mathbb {N}}, \end{aligned}$$
(29)

where \(i_{k_j}=(k_j\,\mathrm{mod}\,m)+1\) as before; such a subsequence exists because, otherwise, one would have \(x^{k+1}=x^{k}\) for all large \(k\), and then cycling through all the indices in \(I\) would give \(f(x^{k})\le 0\) for all large \(k\), a contradiction. Moreover, by assumption (11), Remark 3.2 is applicable. Therefore, \(\{x^k\}\) is bounded and (24) holds. Thus, without loss of generality, we may further assume that \(\{x^{k_j}\}\) converges to some point \(x^\infty \in M\) and that \(i_{k_j}=1\) for each \(j\) (as each \(i_{k_j}\) lies in the finite set \(I\)). Since \(f_1\) is continuous on \(M\), it follows from (29) that

$$\begin{aligned} f_1(x^{\infty })=\lim _{j\rightarrow +\infty } f_1(x^{k_j})\ge 0. \end{aligned}$$
(30)

Furthermore, we have that

$$\begin{aligned} t_{k_j}=\alpha _{k_j}\frac{f_1(x^{k_j})-f_1^*}{\Vert g^{k_j}\Vert } \ge \frac{\sigma }{L}\left( f_1(x^{k_j})-f_1^*\right) \quad \mathrm{for\,each}\; j\in {\mathbb {N}}, \end{aligned}$$

where \(L>0\) is given by (23). Passing to the limit and making use of (24), we get that \(f_1(x^{\infty })-f_1^*\le 0\); hence, \(f_1(x^\infty )\le f_1^*<0\), which contradicts (30). Hence, the proof is complete. \(\square \)
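Continuing the toy \({\mathbb {H}}^2\) example given after Theorem 4.1 (again our own construction, not from [38]), finite termination can be observed by supplying strictly negative targets \(f_i^*\): there \(f(e)=-0.05\), so \(f^{*}\le -0.05\), and the choice \(f_i^*:=-0.05\) is admissible.

```python
# Theorem 5.1 in action: strictly negative targets f_i^* make each step
# overshoot into the interior of Q_i, and the exact stopping test
# f(x^k) <= 0 now triggers after finitely many iterations.
x, k = cyclic_subgradient_projection(
    x0, fs, subgrads, exp_map,
    norm=lambda x, g: np.sqrt(lorentz(g, g)),
    f_stars=[-0.05, -0.05, -0.05], tol=0.0)
print(k, max(f(x) for f in fs))                  # f(x) <= 0 after finitely many steps
```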

By a similar argument to that for Corollary 4.1, but using Theorem 5.1 in place of Theorem 4.1, we can prove the following corollary, which was established in [38, Theorem 5.1].

Corollary 5.1

Suppose that the sectional curvature of \(M\) is non-negative and that

$$\begin{aligned} 0<\inf _k\alpha _{k}\le \sup _k\alpha _{k}<2. \end{aligned}$$

Then, any sequence \(\{x^{k}\}\) generated by Algorithm 3.1 with initial point \(x^0\in M\) terminates in a finite number of iterations.

6 Conclusions

By using a different approach, we generalize the results in [38] from Riemannian manifolds with non-negative sectional curvature to those with sectional curvature bounded from below. In particular, the open problem proposed in [38] has been partially solved. It is still an open problem how to establish similar convergence results on Riemannian manifolds without the assumption that the sectional curvature is bounded from below. Note that, in the Hilbert space setting, if the CFP satisfies a Slater type condition, then the cyclic subgradient projection algorithm is linearly convergent [1]. Our future work is to establish a linear convergence result for the algorithm on Riemannian manifolds under a Slater type condition. Moreover, recalling that every complete, connected Riemannian manifold is a geodesic metric space (as pointed out in [27]), we expect that the approach of the present paper can be used to study similar problems in the setting of geodesic spaces (with curvature bounded from below).