Abstract
In this paper, we propose and analyze a variant of the proximal point method for obtaining weakly efficient solutions of convex vector optimization problems in real Hilbert spaces, with respect to a partial order induced by a closed, convex and pointed cone with nonempty interior. The proposed method is a hybrid scheme that combines proximal point type iterations with projections onto some special halfspaces in order to achieve strong convergence to a weakly efficient solution. To the best of our knowledge, this is the first time that a proximal point type method with strong convergence has been considered in the literature for solving vector/multiobjective optimization problems in infinite dimensional Hilbert spaces.
1 Introduction
In this paper, we propose and analyze a proximal point type method for solving vector optimization problems. We consider a closed, convex, and pointed cone \( K \subset {\mathbb {R}}^m \) with nonempty interior, in order to define a partial order \(\preceq (\prec )\) in \( {\mathbb {R}}^m \), given by \(z\preceq z'\; (\text {or} \ z \prec z')\) if and only if \(z'-z\in K\;(\text {or} \ z'-z \in \mathrm{int}(K))\). The partial order \( \succeq (\succ )\) is defined in a similar way. We are interested in the following vector optimization problem:
$$\begin{aligned} \min \nolimits _{K}\;\{F(x) \ | \ x\in \mathscr {H}\}, \end{aligned}$$(1)
where \(F:\mathscr {H}\rightarrow {\mathbb {R}}^m \) is a K-convex function and \( \mathscr {H}\) is a real Hilbert space. In the case in which the objective function contains more than one component, i.e., \(F=(F_1,F_2,\ldots , F_m)\) with \(m>1\), in general, there is no point that minimizes all of them at once. Hence, we are interested in obtaining weakly efficient solutions for problem (1), i.e., points \(x^*\in \mathscr {H}\) such that there is no \(x\in \mathscr {H}\) satisfying \(F(x)\prec F(x^*)\). If \( K={\mathbb {R}}^m_{+} \), (1) is called multiobjective optimization problem and the concept of weakly efficient solution corresponds to the usually called weak Pareto solution.
Many methods have been proposed for solving problem (1). Most of them are “natural” extensions of classical ones for solving scalar optimization problems, which correspond to problem (1) with \(m=1\) and \(K={\mathbb {R}}_+\). The classical steepest descent method was the first one to be adapted for solving continuously differentiable multiobjective problems in finite-dimensional spaces, see [23]. The latter method was extended to the vectorial setting in [27] and further explored in [25] to deal with constrained vector optimization problems. The full convergence of the multiobjective projected gradient method was studied in [6] in the quasi-convex case. In [13], the authors generalized the gradient method of [23] for solving multiobjective optimization problems over a Riemannian manifold. The subgradient method was considered in the multiobjective setting in [20] and more generally for solving vector optimization problems in [3]. Newton’s method was first introduced in the multiobjective setting in [22] and further generalized for solving vector optimization problems in [26, 32]. In [16], the authors proposed and analyzed the proximal point method in the vectorial setting. A scalarized version of the latter method was studied in [11] for solving quasi-convex multiobjective problems. The latter reference also analyzed finite termination of the proposed scheme under some stronger assumptions. Variants of the vector proximal point method for solving some classes of nonconvex multiobjective optimization problems were considered in [12, 15]. A hybrid approximate extragradient proximal point method was also studied in [19] for obtaining a common solution of a variational inequality problem and a vector optimization problem. Other variants of the proximal gradient method were analyzed in the multiobjective and vector optimization setting in [8, 17, 44]. See, for instance, [28] for a review of the vector proximal point method and some of its variants.
Most recently, [34] proposed and analyzed a conjugate gradient method for solving vector optimization problems using different line-search strategies. Finally, most of the aforementioned multiobjective/vector methods have also been extended to solve multiobjective/vector optimization problems on Riemannian manifolds; see, for instance, [10, 14, 21] and references therein.
There exists a vast literature concerning proximal point type methods for solving different classes of problems; see, for instance, [18, 29, 31, 36, 37, 39,40,41], to name just a few. The proximal point method for minimizing a convex function \( f:\mathscr {H}\rightarrow {\mathbb {R}}\) consists of iteratively generating a sequence \(\{x^k\}\) by means of the following procedure
$$\begin{aligned} x^{k+1}{:}{=}\mathop {\mathrm {argmin}}_{x\in \mathscr {H}}\left\{ f(x)+\frac{\alpha _k}{2}\Vert x-x^k\Vert ^2\right\} , \end{aligned}$$(2)
where \(\{\alpha _k\} \subset {\mathbb {R}}_{++}\). The above scheme can be seen as a sequence of regularized subproblems, since the objective function f may have several minimizers (or none), whereas (2) has just one. This scheme has found many practical applications; see, e.g., [38] and references therein. Inexact variants of the above scheme have been extensively considered in the literature. In order to give more details, we first recall that the solutions of the convex optimization problem \(\min _x f(x)\) correspond to the zeroes of the maximal monotone operator \( T=\partial f \), where \( \partial f\) denotes the subdifferential of f. The proximal point method applied to obtain zeroes of a general monotone inclusion \(0\in T(x)\) consists of generating \(\{x^{k}\}\) where \(y{:}{=}x^{k+1}\) is a solution of
$$\begin{aligned} 0\in \alpha _k(y-x^{k})+T(y). \end{aligned}$$
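For illustration, the scheme (2) can be run on a simple instance; the quadratic \(f(x)=(x-3)^2\), the starting point, and the constant parameter \(\alpha _k\equiv 1\) below are illustrative choices, not taken from the paper. Since f is quadratic, each regularized subproblem is solved in closed form.

```python
# Proximal point iterations (2) for the illustrative convex function
# f(x) = (x - 3)^2.  Each subproblem min_x f(x) + (alpha_k/2)(x - x_k)^2
# is a strongly convex quadratic, so its unique minimizer solves the
# optimality condition 2(x - 3) + alpha_k (x - x_k) = 0.
def prox_step(x_k, alpha_k):
    return (6.0 + alpha_k * x_k) / (2.0 + alpha_k)

x = 10.0                      # starting point x^0
for k in range(100):
    x = prox_step(x, alpha_k=1.0)
# x is now (numerically) the minimizer of f
```

In this one-dimensional example the iterates contract toward the unique minimizer \(x^*=3\); in infinite dimensions only weak convergence is guaranteed in general, which is precisely the issue discussed next.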
This general scheme was first considered by Martinet in [36, 37] and further explored by Rockafellar in [39, 40]. Since then, many authors have investigated this scheme in different contexts. In [40] it was proved that the sequence \( \{ x^k \} \) generated by the proximal point method converges to a zero of T in the weak topology of \( \mathscr {H}\), whenever such a zero exists. The latter reference posed, as an open question, the issue of whether or not this scheme is strongly convergent. Güler in [29] exhibited an optimization problem for which the proximal point method converges weakly but not strongly. A few years later, a new proximal-type method for solving monotone inclusions, enjoying the strong convergence property, was introduced in [42]. This scheme consists of combining proximal iterations with projection steps onto the intersection of two halfspaces. More precisely, given \( x^0 \in \mathscr {H}\) and \( \{ \alpha _k \} \subset {\mathbb {R}}_{++} \), it computes an “inexact solution" \((y^k,v^k)\) of the following problem:
$$\begin{aligned} 0=v+\alpha _k(y-x^{k}), \qquad v\in T(y), \end{aligned}$$(3)
and obtains \(x^{k+1}\) as
$$\begin{aligned} x^{k+1}{:}{=}P_{H_k\cap W_k}(x^0), \end{aligned}$$(4)
where
$$\begin{aligned} H_k{:}{=}\{z\in \mathscr {H}\ | \ {\langle }z-y^k,v^k{\rangle }\le 0\}, \qquad W_k{:}{=}\{z\in \mathscr {H}\ | \ {\langle }z-x^k,x^0-x^k{\rangle }\le 0\}. \end{aligned}$$(5)
Here, \( P_D(x) \) denotes the projection of x onto a nonempty, closed and convex set \(D \subset \mathscr {H}\). Since then, many works concerning methods with strong convergence properties have been considered in the literature for solving different kinds of problems; see, for instance, [1, 5, 7, 9, 24, 35, 43], to name just a few. Most of the aforementioned papers deal with variants of the above “hybrid” proximal point method.
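To make the projection step onto the intersection of two halfspaces concrete, the following sketch computes it in \({\mathbb {R}}^2\) via Dykstra's alternating projection algorithm; this is a generic numerical illustration (with illustrative halfspaces), not an implementation from the cited works.

```python
def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

def proj_halfspace(x, a, b):
    # Exact projection onto the halfspace {z in R^2 : <a, z> <= b}.
    viol = (inner(a, x) - b) / inner(a, a)
    t = max(0.0, viol)
    return (x[0] - t * a[0], x[1] - t * a[1])

def proj_intersection(x0, halfspaces, iters=500):
    # Dykstra's algorithm: the iterates converge to the projection of x0
    # onto the intersection of the given convex sets (here two halfspaces).
    x = x0
    p = [(0.0, 0.0) for _ in halfspaces]      # correction terms
    for _ in range(iters):
        for i, (a, b) in enumerate(halfspaces):
            xp = (x[0] + p[i][0], x[1] + p[i][1])
            y = proj_halfspace(xp, a, b)
            p[i] = (xp[0] - y[0], xp[1] - y[1])
            x = y
    return x

H = ((1.0, 1.0), 1.0)    # H = {z : z1 + z2 <= 1}
W = ((1.0, -1.0), 1.0)   # W = {z : z1 - z2 <= 1}
x_proj = proj_intersection((2.0, 0.0), [H, W])
```

For this instance the projection of \((2,0)\) onto \(H\cap W\) is \((1,0)\); simple projections onto each halfspace, combined with Dykstra's corrections, recover it.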
In this paper, we propose and analyze a hybrid proximal point method for computing a weakly efficient solution for the vector optimization problem (1). The proposed scheme combines some ideas of the proximal point method of [16] and the technique introduced in [42] (described in (3)-(5)) for guaranteeing strong convergence. Under some mild conditions, we prove that the whole sequence generated by the proposed scheme is strongly convergent to a weakly efficient solution for problem (1). To the best of our knowledge, this is the first time that a proximal point type method with strong convergence has been considered in the literature for solving multiobjective/vector optimization problems in infinite dimensional spaces. It is worth mentioning that a key ingredient for proving strong convergence of algorithms based on the technique of [42] is the convexity of the solution set of the problem under consideration. Since, in general, the set of weakly efficient solutions for problem (1) is not convex, the extension of the aforementioned technique to the vector setting is not immediate. Moreover, an interesting feature of the proposed hybrid vector proximal point method is that its subproblems do not need to be solved restricted to the sublevel sets of the objective function F, namely, \(\Omega _k=\{x \in \mathscr {H}\ | \ F(x)\preceq F(x^k)\}\), \(k\ge 1\). The latter restriction does not automatically hold for multiobjective/vector proximal point methods, and it has been imposed in the related literature in order to prove convergence of the whole sequence generated by these schemes.
The paper is organized as follows: Section 2 contains notation, definitions and some basic results. Section 3 formally describes the proposed vector proximal point method. Section 4 is devoted to the convergence analysis of the method and is divided into two subsections. The first one presents some technical results and basic properties, whereas the second subsection establishes the main convergence results. The last section contains a conclusion.
2 Notation and Preliminary Results
In this section, we formally state some notations, basic definitions, and preliminary results used in the paper.
We denote the inner product in \(\mathscr {H}\) by \({\langle }\cdot ,\cdot {\rangle }\) and its induced norm by \(\Vert \cdot \Vert \). The closed ball centered at \(x\in \mathscr {H}\) with radius \(\rho \) is denoted by \(B[x,\rho ]\), i.e., \(B[x,\rho ]{:}{=}\{ y\in \mathscr {H}\ | \ \Vert y-x \Vert \le \rho \}\). Now, we recall some definitions and properties of scalar functions (i.e., problem (1) with \(m=1\)). A function \(f: \mathscr {H}\rightarrow {\mathbb {R}}\) is said to be \(\mu \)-strongly convex with \(\mu \ge 0\) if and only if
$$\begin{aligned} f(tx+(1-t)y)\le tf(x)+(1-t)f(y)-\frac{\mu }{2}t(1-t)\Vert x-y\Vert ^2, \quad \forall x,y\in \mathscr {H}, \ t\in [0,1]. \end{aligned}$$
If \(\mu =0\) in the above inequality, then f is said to be convex. The subdifferential of f at \(x \in \mathscr {H}\) is defined by
$$\begin{aligned} \partial f(x){:}{=}\{u\in \mathscr {H}\ | \ f(y)\ge f(x)+{\langle }u,y-x{\rangle }, \ \forall y\in \mathscr {H}\}. \end{aligned}$$(6)
Now, let us recall a well-known inequality which will be needed later on. Assume that f is \(\mu \)-strongly convex (not necessarily differentiable) for some \(\mu \ge 0\) and let \(x^*\) be a minimizer of f. Then the following inequality holds
$$\begin{aligned} f(y)\ge f(x^*)+\frac{\mu }{2}\Vert y-x^*\Vert ^2,\quad \forall y\in \mathscr {H}. \end{aligned}$$(7)
We also need the following well-known result on projections, see [2, Theorem 3.14].
Proposition 2.1
Let D be a nonempty, closed and convex set in \(\mathscr {H}\). Then, for all \(x\in \mathscr {H}\) and \(y\in D\), \({\langle }y-P_D(x), x-P_D(x) {\rangle }\le 0\).
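For intuition, the inequality in Proposition 2.1 can be checked numerically; the closed unit ball in \({\mathbb {R}}^3\) and the sampled points below are illustrative choices.

```python
import math, random

def proj_ball(x):
    # Projection onto the closed unit ball D = B[0, 1] in R^3.
    n = math.sqrt(sum(c * c for c in x))
    return x if n <= 1.0 else [c / n for c in x]

def rand_in_ball():
    # rejection sampling of a point y in D
    while True:
        y = [random.uniform(-1.0, 1.0) for _ in range(3)]
        if sum(c * c for c in y) <= 1.0:
            return y

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

random.seed(0)
x = [3.0, 0.0, 4.0]               # a point outside D (||x|| = 5)
p = proj_ball(x)                  # P_D(x) = x / ||x||
# evaluate <y - P_D(x), x - P_D(x)> over sampled y in D
max_inner = max(inner([yi - pi for yi, pi in zip(y, p)],
                      [xi - pi for xi, pi in zip(x, p)])
                for y in (rand_in_ball() for _ in range(200)))
```

Every sampled inner product is nonpositive, as the proposition asserts.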
The image of a function \(F:\mathscr {H}\rightarrow {\mathbb {R}}^m\) is the set \(\mathrm{Im}\,F{:}{=}\{z\in {\mathbb {R}}^m\, |\; z=F(x), \ x\in \mathscr {H}\}\). The positive dual cone of K is \(K^*{:}{=}\{w\in {\mathbb {R}}^m \ | \ {\langle }u,w{\rangle }\ge 0, \ \forall u\in K\}\).
In this paper, we assume that there exists a compact set \(C\subset K^{*}\setminus \{0\}\) such that \(cone(conv(C))=K^*\) and \(0\not \in C\). Since \(K =K^{**}\) (see [33, Theorem 14.1]), we have
$$\begin{aligned} K=\{z\in {\mathbb {R}}^m \ | \ {\langle }z,w{\rangle }\ge 0, \ \forall w\in C\}. \end{aligned}$$(8)
Hence, given \(z,z' \in {\mathbb {R}}^m\), we have the following equivalences:
$$\begin{aligned} z\preceq z' \iff {\langle }z'-z,w{\rangle }\ge 0, \ \forall w\in C; \qquad z\prec z' \iff {\langle }z'-z,w{\rangle }> 0, \ \forall w\in C. \end{aligned}$$(9)
For a general cone K, the generator set C can be chosen as \( C=\{w\in K^*\ | \ \Vert w\Vert =1\}\). Note that if K is a polyhedral cone then \(K^*\) is also a polyhedral cone, and hence C can be chosen as a normalized finite set of its extreme rays. In scalar optimization, i.e., \(K={\mathbb {R}}_+\), one may consider \(C=\{1\}\). For multiobjective optimization, K and \(K^*\) are the positive orthant of \({\mathbb {R}}^m\) denoted by \({\mathbb {R}}^m_+\), and a natural choice for C is the canonical basis of \({\mathbb {R}}^m\).
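For the multiobjective case \(K={\mathbb {R}}^2_+\) with C the canonical basis, the order checks via the dual description of K reduce to componentwise comparisons; a minimal sketch:

```python
# Generator of K^* for K = R^2_+: C = canonical basis of R^2.
C = [(1.0, 0.0), (0.0, 1.0)]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def leq_K(z, zp):
    # z precedes zp iff <zp - z, w> >= 0 for all w in C (zp - z in K)
    d = tuple(b - a for a, b in zip(z, zp))
    return all(inner(d, w) >= 0.0 for w in C)

def lt_K(z, zp):
    # strict order: <zp - z, w> > 0 for all w in C (zp - z in int K)
    d = tuple(b - a for a, b in zip(z, zp))
    return all(inner(d, w) > 0.0 for w in C)
```

With C the canonical basis, `leq_K` and `lt_K` are exactly the componentwise comparisons \(z_i\le z_i'\) and \(z_i< z_i'\).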
A vector function \(G:\mathscr {H}\rightarrow {\mathbb {R}}^m\) is called K-convex if and only if for all \(x,y\in \mathscr {H}\) and \( t \in [0,1]\),
$$\begin{aligned} G(tx+(1-t)y)\preceq tG(x)+(1-t)G(y). \end{aligned}$$(10)
Note that if the partial order is given by the cone \( {\mathbb {R}}^m_+ \), i.e., \( K={\mathbb {R}}^m_+ \), then G is K-convex if and only if each coordinate \( G_i \) is a convex function in the usual sense. More generally, it follows from (8) and (10) that, for every \( w \in C \), the function \( \phi _w: \mathscr {H}\rightarrow {\mathbb {R}}\) given by \( \phi _w(\cdot ) {:}{=} \langle G(\cdot ),w \rangle \) is convex. Moreover, G is said to be positively lower semicontinuous if the scalar function \( \phi _w (\cdot )\) is lower semicontinuous for all \(w\in C\).
Throughout the paper, the objective function \(F:\mathscr {H}\rightarrow {\mathbb {R}}^m\) in (1) is assumed to be K-convex and positively lower semicontinuous.
Next, we present an elementary result needed in our analysis. We include its proof just for the sake of completeness.
Lemma 2.1
The following statements hold:
-
a)
if F is a K-convex function, then the set \(L_{F,K}(v){:}{=}\{x \in \mathscr {H}\ | \ F(x)\preceq v\}\) is convex, for all \(v\in {\mathbb {R}}^m\);
-
b)
a point \( x \in \mathscr {H}\) is weakly efficient for problem (1) if and only if
$$\begin{aligned} \max _{w \in C}\langle F(y)-F(x),w\rangle \ge 0,\quad \forall y\in \mathscr {H}. \end{aligned}$$
Proof
-
a)
Let \( v \in {\mathbb {R}}^m\) be given and consider \( x,y \in L_{F,K}(v) \). It follows from the K-convexity of F and (8) that for all \( t \in [0,1] \) and \( w \in C \), we have
$$\begin{aligned} 0 \le&\, \langle tF(x) + (1-t)F(y) - F(t x + (1-t)y),w \rangle \\ \le&\,\langle tv + (1-t)v - F(t x + (1-t)y),w \rangle \\ =&\,\langle v - F(t x + (1-t)y),w \rangle , \end{aligned}$$which, in view of (8), implies that \( F(t x + (1-t)y) \preceq v\), i.e., \( t x + (1-t)y \in L_{F,K}(v) \).
-
b)
A point x is a weakly efficient solution for problem (1) if and only if there is no \( y \in \mathscr {H}\) satisfying \( F(y) \prec F(x) \) which, in view of (9), is equivalent to saying that for every \( y \in \mathscr {H}\) there exists \( w_y \in C \) such that \( {\langle }F(y) - F(x),w_y {\rangle }\ge 0 \). Since C is a compact set, the statement of (b) follows.
\(\square \)
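The characterization in Lemma 2.1 (b) can be probed numerically on a simple multiobjective instance; the function \(F(x)=(x^2,(x-2)^2)\) below is an illustrative choice (its weak Pareto set is \([0,2]\)), not an example taken from the paper.

```python
def F(x):
    # Illustrative objective for K = R^2_+; its weak Pareto set is [0, 2].
    return (x * x, (x - 2.0) ** 2)

def passes_criterion(x, samples):
    # Lemma 2.1(b)-style test: max_{w in C} <F(y) - F(x), w> >= 0 for
    # every sampled y; with C the canonical basis this max is simply the
    # largest component of F(y) - F(x).
    fx = F(x)
    return all(max(F(y)[0] - fx[0], F(y)[1] - fx[1]) >= -1e-12
               for y in samples)

ys = [-5.0 + i * 0.01 for i in range(1001)]   # sample grid for y
```

Points inside \([0,2]\) pass the criterion over the sampled grid, whereas a point such as \(x=3\) fails it (e.g., \(y=2\) dominates it strictly).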
Next, we introduce some useful concepts for stating and analyzing our method, to be studied in Section 3. Let \(x\in \mathscr {H}\) and \(\xi \in {\mathbb {R}}^m\) be given, and define the functions \(\psi _{\xi }, \; \theta _{x,\xi }:\mathscr {H}\rightarrow {\mathbb {R}}\) as
$$\begin{aligned} \psi _{\xi }(y){:}{=}\max _{w\in C}{\langle }F(y)-\xi ,w{\rangle }\end{aligned}$$(11)
and
$$\begin{aligned} \theta _{x,\xi }(y){:}{=}\psi _{\xi }(y)+\frac{\alpha }{2}\Vert y-x\Vert ^2, \end{aligned}$$(12)
where \(\alpha >0\) is a given parameter. Since F is K-convex, it follows easily from (8)–(11) that \(\psi _\xi \) is convex. Hence, \(\theta _{x,\xi }\) is \(\alpha \)-strongly convex and, therefore, has a unique minimizer
$$\begin{aligned} \tilde{x}=\tilde{x}(x,\xi ){:}{=}\mathop {\mathrm {argmin}}_{y\in \mathscr {H}}\,\theta _{x,\xi }(y). \end{aligned}$$(13)
In view of (11)–(13), we have that \( 0 \in \partial \psi _{\xi } (\tilde{x}) + \alpha (\tilde{x} - x), \) or equivalently,
$$\begin{aligned} \tilde{v}{:}{=}\alpha (x-\tilde{x})\in \partial \psi _{\xi }(\tilde{x}). \end{aligned}$$(14)
Remark 2.1
The method stated in the next section chooses \(\xi \) iteratively according to a decrease condition on the objective value at the solution obtained from the proximal subproblem (13). It is worth mentioning that a key ingredient in our method and its analysis is the appropriate choice of \(\xi \) during the execution of the method. It is interesting to note that if \(\xi _1\ne \xi _2\) then the subdifferential \(\partial \psi _{\xi _1} (x)\) may differ from \(\partial \psi _{\xi _2} (x)\). Indeed, consider
Choosing \(\xi _1=(0,0)\) and \(\xi _2=(-1,-4) \), we have \( \psi _{\xi _1}(0) = F_1(0)-\xi ^1_1=F_2(0)-\xi ^2_1=0 \), which implies that \( \partial \psi _{\xi _1}(0)=[0,2] \); see, for example, [30, Proposition 4.3.2]. Moreover, it can also be easily seen that \( \psi _{\xi _2}(0) = F_2(0)-\xi _2^2=4>F_1(0)-\xi _2^1 =1\), which implies that \( \partial \psi _{\xi _2}(0)= \{2\}\).
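The concrete F of Remark 2.1 is not reproduced above; the choice \(F(x)=(x^2,\,2x)\) below is a hypothetical instance consistent with the reported values (\(F_1(0)=F_2(0)=0\), with slopes 0 and 2 at the origin), used only to reproduce the computation numerically.

```python
def F(x):
    # Hypothetical F consistent with Remark 2.1 (an assumption, not the
    # paper's example): F_1(x) = x^2, F_2(x) = 2x.
    return (x * x, 2.0 * x)

def psi(x, xi):
    # psi_xi(x) = max_{w in C} <F(x) - xi, w>, C = canonical basis of R^2
    fx = F(x)
    return max(fx[0] - xi[0], fx[1] - xi[1])

xi1 = (0.0, 0.0)
xi2 = (-1.0, -4.0)

# One-sided difference quotients of psi_{xi1} at 0 recover the endpoints
# of the subdifferential [0, 2] reported in Remark 2.1.
h = 1e-6
right_slope = (psi(h, xi1) - psi(0.0, xi1)) / h
left_slope = (psi(-h, xi1) - psi(0.0, xi1)) / (-h)
```

For this instance \(\psi _{\xi _1}(0)=0\) and \(\psi _{\xi _2}(0)=4\), matching the values in the remark, and the one-sided slopes at 0 are approximately 2 and 0.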
Next, we show that if x coincides with the solution \({\tilde{x}}\) of the proximal subproblem (13), then x is a weakly efficient solution for problem (1). This result will be essential for justifying the stopping criterion of our method.
Lemma 2.2
Let \(x\in \mathscr {H}\), \(\xi \in {\mathbb {R}}^m\), and a scalar \(\alpha >0\) be given and consider \({\tilde{x}}\) as in (13). If \({\tilde{x}}=x\), then x is a weakly efficient solution for problem (1).
Proof
Since \( x=\tilde{x}\), it follows from (14) that \(0 \in \partial \psi _{\xi }(x)\), which in turn implies that x minimizes \( \psi _{\xi } \), in view of the convexity of \( \psi _{\xi }\). From (11) and the compactness of C, we see that, for every \(y\in \mathscr {H}\), there exists \(w_y \in C\) such that \( \psi _{\xi }(y)= \langle F(y) - \xi ,w_y \rangle \). Hence, for every \(y\in \mathscr {H}\), we have
which in turn implies that
Therefore, it follows from Lemma 2.1 (b) that x is a weakly efficient solution for problem (1). \(\square \)
3 A Hybrid Vector Proximal Point Method
In this section, we present the hybrid vector proximal point method for computing weakly efficient solutions for problem (1). We also establish two basic results showing, in particular, that the intersection of two special halfspaces is nonempty. The latter fact immediately implies that the projection step required by the method is well-defined.
Next, we state the hybrid vector proximal point method.
Hybrid Vector Proximal Point Method (HVPPM)
Step 0. Choose \(x^0\in \mathscr {H}\) and a bounded sequence \(\{\alpha _k\}\subset {\mathbb {R}}_{++}\) such that \(\inf _k \alpha _k>0\); set \(F_{-1}^{lev}{:}{=}F(x^0)\) and \(k{:}{=}0\);
Step 1. Compute \({\tilde{x}}^k{:}{=}{\tilde{x}}(x^k,F_{k-1}^{lev})\) as in (11)–(13);
Step 2. If \(\tilde{x}^k=x^k\) then stop and output \(x^k\);
Step 3. Define
$$\begin{aligned} F_{k}^{lev}{:}{=}{\left\{ \begin{array}{ll} F(\tilde{x}^k), &{} \text {if } F(\tilde{x}^k)\preceq F_{k-1}^{lev},\\ F_{k-1}^{lev}, &{} \text {otherwise;} \end{array}\right. } \end{aligned}$$
Step 4. Let \({\tilde{v}}^k{:}{=}\alpha _k(x^k-{\tilde{x}}^k)\) and define
Step 5. Compute
$$\begin{aligned} x^{k+1}{:}{=}P_{H_k\cap W_k}(x^0), \end{aligned}$$
set \(k{:}{=}k+1\) and return to Step 1.
End
Some remarks about the HVPPM follow. First, the computation of \({\tilde{x}}^k\) as in Step 1 requires solving the proximal subproblem (13) with \((x,\xi )=(x^k, F_{k-1}^{lev})\). This vector is then used to check the stopping criterion in Step 2, which, if satisfied, implies that \(x^k\) is a weakly efficient solution for problem (1), in view of Lemma 2.2. Second, the vector \({\tilde{x}}^k\) is also needed for defining the level value \(F_k^{lev}\) and the vector \({\tilde{v}}^k\), which in turn, are used to construct the halfspace \(H_k\) as in (15). Third, it is worth mentioning that \({\tilde{v}}^k\) corresponds to \({\tilde{v}}\) computed as in (14) with \((x,\xi ,\alpha )=(x^k, F_{k-1}^{lev},\alpha _k)\), and hence it holds that
$$\begin{aligned} \tilde{v}^k\in \partial \psi _{F_{k-1}^{lev}}(\tilde{x}^k). \end{aligned}$$(17)
Fourth, from the definition of \(F_{k}^{lev}\) in Step 3, we immediately see that \(F_{k}^{lev}\preceq F_{k-1}^{lev}\). This nonincreasing property in the order given by the cone K will be very important in our analysis. Fifth, it will be shown in Lemma 3.2 that the set \(H_k\cap W_k\) is nonempty, and hence \(x^{k+1}\) as in Step 5 is well-defined. Finally, the extra projection step required to compute \(x^{k+1}\) does not entail a significant computational cost and is fundamental for guaranteeing the strong convergence property of HVPPM.
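To fix ideas, the following one-dimensional sketch runs the HVPPM loop on the illustrative instance \(F(x)=(x^2,(x-2)^2)\), \(K={\mathbb {R}}^2_+\), with C the canonical basis. It assumes halfspaces of the form used in [42], \(H_k=\{z \ | \ {\langle }z-{\tilde{x}}^k,{\tilde{v}}^k{\rangle }\le 0\}\) and \(W_k=\{z \ | \ {\langle }z-x^k,x^0-x^k{\rangle }\le 0\}\), which may differ in details from (15)-(16); in \({\mathbb {R}}\) these are rays, so the projection of \(x^0\) onto their intersection is a clamp onto an interval. The subproblem (13) is solved by ternary search, which is valid because \(\theta \) is strongly convex.

```python
# 1D sketch of HVPPM under the simplifying assumptions stated above.
import math

def F(x):
    return (x * x, (x - 2.0) ** 2)    # weak Pareto set: [0, 2]

def leq(u, v):                        # u precedes v in the order of R^2_+
    return u[0] <= v[0] + 1e-12 and u[1] <= v[1] + 1e-12

def psi(y, lev):                      # max_{w in C} <F(y) - lev, w>
    fy = F(y)
    return max(fy[0] - lev[0], fy[1] - lev[1])

def prox_sub(x, lev, alpha, lo=-50.0, hi=50.0):
    # Step 1: minimize theta(y) = psi(y, lev) + (alpha/2)(y - x)^2 by
    # ternary search (theta is strongly convex, hence unimodal).
    theta = lambda y: psi(y, lev) + 0.5 * alpha * (y - x) ** 2
    for _ in range(200):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if theta(m1) <= theta(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

x0 = 5.0
x, lev = x0, F(x0)                    # F_{-1}^{lev} := F(x^0)
for k in range(60):
    alpha = 1.0
    xt = prox_sub(x, lev, alpha)      # Step 1
    if abs(xt - x) < 1e-10:           # Step 2 (stopping criterion)
        break
    if leq(F(xt), lev):               # Step 3 (level update)
        lev = F(xt)
    vt = alpha * (x - xt)             # Step 4
    lo_b, hi_b = -math.inf, math.inf  # H_k and W_k as rays in R
    if vt > 0:
        hi_b = min(hi_b, xt)
    elif vt < 0:
        lo_b = max(lo_b, xt)
    if x0 > x:
        hi_b = min(hi_b, x)
    elif x0 < x:
        lo_b = max(lo_b, x)
    x = min(max(x0, lo_b), hi_b)      # Step 5: project x^0 (clamp)
```

In this run the iterates decrease monotonically from \(x^0=5\) and converge to the weak Pareto point 2, consistent with the strong convergence to \(P_S(x^0)\) established in Theorem 4.1.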
Next we present a basic result which will be useful in our analysis.
Lemma 3.1
Let \( ({\tilde{x}}^k,F_k^{lev})\) be generated by HVPPM and consider \(\psi _{F_{k}^{lev}}\) as in (11). Then, \(\psi _{F_{k}^{lev}}(\tilde{x}^k)\ge 0\) and the following equivalence holds
$$\begin{aligned} \psi _{F_{k}^{lev}}(\tilde{x}^k)=0 \iff F_{k}^{lev}=F(\tilde{x}^k). \end{aligned}$$
Proof
It is immediate that if \(F_{k}^{lev} =F(\tilde{x}^k)\) then \(\psi _{F_{k}^{lev} }(\tilde{x}^k)=0\), in view of (11). Now if \(F_{k}^{lev} \ne F(\tilde{x}^k)\), then it follows from Step 3 that \(F_{k}^{lev}=F_{k-1}^{lev}\) and \( F(\tilde{x}^k) \not \preceq F_{k}^{lev}\). Hence, using (8) and (11), we have
$$\begin{aligned} \psi _{F_{k}^{lev}}(\tilde{x}^k)=\max _{w\in C}{\langle }F(\tilde{x}^k)-F_{k}^{lev},w{\rangle }>0, \end{aligned}$$
concluding the proof of the lemma. \(\square \)
As previously mentioned, if HVPPM stops at some iteration k, then \(x^{k}\) is a weakly efficient solution for problem (1). Hence, in order to analyze the convergence of the HVPPM, we assume from now on that the stopping criterion in Step 2 of HVPPM is never satisfied.
Given \(k\ge 0,\) consider the sublevel set \(S_k\) given by
$$\begin{aligned} S_k{:}{=}\{x\in \mathscr {H}\ | \ F(x)\preceq F_{k}^{lev}\}. \end{aligned}$$(18)
The next result shows a fundamental property of \(S_k\), which immediately implies that \(x^{k+1}\) as in Step 5 is well-defined.
Lemma 3.2
For every \(k\ge 0\), we have
where \(H_k\), \(W_k\), and \(S_k\) are as in (15), (16), and (18), respectively. As a consequence, \(x^{k+1}\) as in Step 5 is well-defined, for every \(k\ge 0\).
Proof
We first prove that
Let \(k\ge 0\) be given. In view of Step 3, there exists \(y^{k}\in \{x^0,{\tilde{x}}^0, {\tilde{x}}^1,\ldots ,{\tilde{x}}^{k}\}\) such that \(F_{k}^{lev}=F(y^{k})\), which implies that \(S_{k}\) given in (18) is nonempty. Take an arbitrary \({\hat{x}} \in S_k \). In view of (11) and the compactness of C, there exists \( {\hat{w}}^k \in C\) such that \( \psi _{F_{k-1}^{lev}}({\hat{x}}) = \langle F({\hat{x}}) - F_{k-1}^{lev},{\hat{w}}^k \rangle \). Now, note that (17) yields \( \tilde{v}^{k} \in \partial \psi _{F_{k-1}^{lev}}(\tilde{x}^{k})\). It follows from the latter two facts, the subgradient inequality in (6), and the definition of \( \psi _{F_{k-1}^{lev}}\) (see (11)), that
and hence
Now, suppose that \(F_{k}^{lev} = F(\tilde{x}^{k})\). Hence, Lemma 3.1 implies that \( \psi _{F_{k}^{lev}}(\tilde{x}^{k})=0 \), which in view of the above inequality and the fact that \({\hat{x}}\in S_k\), yields
On the other hand, if \(F_{k}^{lev} \ne F(\tilde{x}^{k})\), then \( F_{k}^{lev}=F_{k-1}^{lev} \) (see Step 3). Hence, in view of the second inequality in (21) and the fact that \({\hat{x}}\in S_k\), we have
Therefore, in view of the latter inequality, (22), and the definition of \( H_{k} \) in (15), we conclude that \( {\hat{x}} \in H_{k}\). Since \({\hat{x}} \in S_k\) is arbitrary, the proof of (20) follows. Now we proceed to prove that \(S_{k} \subset W_k\), for every \( k \ge 0\). This inclusion will be proved by induction. From (16), we immediately see that \(W_0= \mathscr {H}\) and then the inclusion \(S_0\subseteq W_0\) trivially holds. Assume that \(S_{k} \subset W_k\), for some \( k \ge 0\). Hence, in view of (20), we have \(\emptyset \ne S_k \subset H_k \cap W_k \). Then, \(x^{k+1}\) as in Step 5 is well-defined, and hence by Proposition 2.1, we have
Now note that \( \emptyset \ne S_{k+1} \subset S_k\) due to (18), (20), and the fact that \( F_{k+1}^{lev} \preceq F_{k}^{lev} \). Hence, since \(S_k \subseteq H_k\cap W_k\), we obtain that the inequality in (23) holds for any \(y\in S_{k+1}\). Then, it follows from the definition of \(W_{k+1}\) in (16) that \(S_{k+1}\subseteq W_{k+1}\), which by induction argument, proves that \(S_{k}\subseteq W_{k}\) for every \(k\ge 0\). Therefore, in view of (20), we conclude that (19) holds for every \(k\ge 0\). The last statement of the lemma follows immediately from the first one and the definition of \(x^{k+1}\) in Step 5. \(\square \)
4 Convergence Analysis of HVPPM
This section is devoted to the convergence analysis of HVPPM. It is divided into two subsections. The first one presents some basic properties of the proposed method as well as some technical results. The second one establishes the main convergence results of HVPPM.
In order to ensure the convergence of HVPPM, we need the following assumption:
(A1) The set \( S{:}{=}\{x\in \mathscr {H}\ | \ F(x)\preceq F^{lev}_k, \forall k\ge 0\} \) is nonempty.
Since F is positively lower semicontinuous and K-convex, we see that S is a closed and convex set, see Lemma 2.1 (a). A standard assumption in the related literature is the so-called completeness of Im F, meaning that for every sequence \(\{ y^k\}\subset \mathscr {H}\) satisfying \(F(y^{k+1})\preceq F(y^k)\) for all \(k\ge 0\), there exists \(y\in \mathscr {H}\) such that \(F(y)\preceq F(y^k)\) for all \(k\ge 0\). The completeness of Im F ensures the existence of weakly efficient points for vector optimization problems (see [33, Section 3]). As previously mentioned, \(\{F_k^{lev}\}\) satisfies \(F^{lev}_k\preceq F^{lev}_{k-1}\) for all \(k\ge 0\). Moreover, for every \(k\ge 0\), there exists \(y^k\in \{x^0,\tilde{x}^0,\tilde{x}^1,\ldots ,\tilde{x}^k\}\) such that \(F_k^{lev}=F(y^k)\in \mathrm{Im}\,F\). Hence, the completeness of \(\mathrm{Im}\, F\) implies that (A1) holds. Note that in the scalar case, if the solution set for problem (1) is nonempty then (A1) is immediately satisfied.
Our main goal now is to show the strong convergence of the sequence \(\{x^k\}\) generated by HVPPM to a weakly efficient solution for problem (1). First, we present several technical results establishing some properties of HVPPM.
4.1 Some Basic and Technical Results
This subsection contains some basic and technical results which are useful for establishing the strong convergence of HVPPM.
We start by showing that \(\{x^k\}\) is contained in the intersection of two certain balls, obtaining, in particular, that \(\{x^k\}\) is bounded.
Lemma 4.1
Assume that \(\mathbf{(A1)}\) holds. Then, the sequence \(\{x^k\}\) generated by HVPPM satisfies
$$\begin{aligned} \{x^k\}\subset B[x^0,\rho ]\cap B[{{\hat{x}}},\rho ], \end{aligned}$$
where \({{\hat{x}}}{:}{=}P_{S}(x^0)\) and \(\rho {:}{=}\Vert x^0-{\hat{x}}\Vert \).
Proof
Since S is a nonempty, closed and convex set, \({{\hat{x}}}{:}{=}P_{S}(x^0)\) is well-defined. Note that \(S\subset S_k\) for every \(k\ge 0\), in view of the definitions of \(S_k\) and S given in (18) and (A1), respectively. Thus, by Lemma 3.2, we have \(S\subset H_k\cap W_k\) for all \(k\ge 0\). Hence, it follows from Step 5 and Proposition 2.1 that
for all \(k\ge 0\). The above inequality immediately yields
which clearly proves the lemma. \(\square \)
We remark that, similarly to the proof in [4, Lemma 2.10], it can be shown that the inclusion in Lemma 4.1 can be strengthened to \(\{x^k\} \subset B\left[ \frac{x^0 + {\hat{x}}}{2},\frac{\rho }{2}\right] \).
Next, we present some basic results about the sequences \(\{x^k\}\) and \(\{{\tilde{x}}^k\}\) generated by HVPPM.
Lemma 4.2
Let \(\{(x^k,{\tilde{x}}^k)\}\) be generated by HVPPM and assume that \(\mathbf{(A1)}\) holds. Then, it holds that
where \(\rho \) is as in Lemma 4.1. As a consequence, the sequence \(\{\Vert x^k -\tilde{x}^k\Vert \}\) converges to zero.
Proof
For every \(k\ge 0\), we have \( x^{k+1} \in H_k \) (see Step 5). Hence,
Since it is assumed that the stopping criterion in Step 2 is never satisfied, we have \({\tilde{v}}^k\ne 0\) for every \(k\ge 0\), due to Step 2, the definition of \({\tilde{v}}^k\) in Step 4, and the fact that \( \alpha _k \not =0 \). Hence, in view of the definition of \(H_k\) in (15), we easily see that
which combined with (26), the fact that \( \psi _{F_{k}^{lev}}(\tilde{x}^k) \ge 0\) (see Lemma 3.1), and the definition of \({\tilde{v}}^k\) in Step 4, yields
proving (24). Now, in view of (16) and the fact that \(x^{k+1}\in W_k\) (see Step 5), we obtain
Thus, \(\Vert x^{k+1}-x^k\Vert ^2\le \Vert x^{k+1} - x^0\Vert ^2-\Vert x^k-x^0 \Vert ^2\), which, combined with Lemma 4.1, implies that
The proof of (25) then follows from the latter conclusion and (24). The last statement of the lemma immediately follows from (25).\(\square \)
Next, we present a technical result containing some inequalities which will be used to prove that all weak cluster points of the sequence \(\{x^k\}\) generated by HVPPM are weakly efficient solutions for problem (1) and belong to the set S defined in (A1).
Lemma 4.3
Let \(\{(x^k,{\tilde{x}}^k,\alpha _k)\}\) be generated by HVPPM and assume that \(\mathbf{(A1)}\) holds. For any \( y \in \mathscr {H}\) and \(k\ge 0\), define \(\gamma ^{k}_{y}\) as
$$\begin{aligned} \gamma ^{k}_{y}{:}{=}\frac{\alpha _k}{2}\left( \Vert x^k-{\tilde{x}}^k\Vert ^2+2{\langle }y-x^k,x^k-{\tilde{x}}^k{\rangle }\right) . \end{aligned}$$(27)
Then, for every \(y\in \mathscr {H}\), the following statements hold:
-
a)
the sequence \(\{\gamma ^{k}_{y}\}\) converges to zero;
-
b)
for every \(k\ge 0\), it holds that
$$\begin{aligned} \psi _{F_{k-1}^{lev}}(y) \ge \psi _{F_{k-1}^{lev}}({\tilde{x}}^k)+ \gamma ^k_y. \end{aligned}$$(28)As a consequence, for every \({{\hat{x}}} \in S\) and \(w\in C\), we have
$$\begin{aligned}&{\langle }F_{k-1}^{lev},w \rangle \ge {\langle }F({\tilde{x}}^k),w \rangle + \gamma ^k_{{{\hat{x}}}}, \nonumber \\&\quad {\langle }F(y),w^k_y \rangle \ge {\langle }F({\tilde{x}}^k),w^k_y \rangle + \gamma ^k_{y}, \qquad \forall k\ge 0, \end{aligned}$$(29)where \(w^k_y\in C\) is such that \(\psi _{F_{k-1}^{lev}}(y)= {\langle }F(y) - F_{k-1}^{lev},w^k_y \rangle \).
Proof
-
a)
From (27) and the Cauchy-Schwarz inequality, we have, for every \( y \in \mathscr {H}\) and \(k\ge 0\),
$$\begin{aligned} |\gamma ^{k}_{y}|= & {} \frac{\alpha _k}{2}\left| \Vert x^k-{\tilde{x}}^k\Vert ^2+2\langle y-x^k, x^k-\tilde{x}^k\rangle \right| \\\le & {} \frac{\alpha _k}{2}\left( \Vert x^k-{\tilde{x}}^k\Vert ^2+2\Vert y-x^k\Vert \Vert x^k-\tilde{x}^k\Vert \right) . \end{aligned}$$We then conclude that \(\{\gamma ^{k}_{y}\}\) converges to zero, because \(\{\alpha _k\}\) and \(\{x^k\}\) are bounded, and \(\{ \Vert x^k - \tilde{x}^k \Vert \}\) converges to zero, in view of Step 0, Lemma 4.1, and the last statement in Lemma 4.2, respectively.
-
b)
In view of (12), one can see that \(\theta _{x^k,F_{k-1}^{lev}}\) is \(\alpha _k\)-strongly convex. It follows from (7) with \( f=\theta _{x^k,F_{k-1}^{lev}}\) and (13) that, for every \( y \in \mathscr {H}\) and \(k\ge 0\),
$$\begin{aligned} \theta _{x^{k},F_{{k}-1}^{lev}}(y) \ge \theta _{x^{k},F_{{k}-1}^{lev}}({\tilde{x}}^{k})+ \frac{\alpha _{k}}{2}\Vert y-\tilde{x}^{k}\Vert ^2, \end{aligned}$$and hence, in view of (12), we have
$$\begin{aligned} \psi _{F_{k-1}^{lev}}(y)+\frac{\alpha _k}{2} \Vert y - x^k \Vert ^2 \ge \psi _{F_{k-1}^{lev}}(\tilde{x}^k) + \frac{\alpha _{k}}{2}\Vert y-\tilde{x}^{k}\Vert ^2. \end{aligned}$$
Therefore, (28) follows from the above inequality and (27). We proceed to prove that the inequalities in (29) hold. For any \( {\hat{x}} \in S \), we have \( \psi _{F_{k-1}^{lev}}({\hat{x}})\le 0 \), in view of the definition of S in (A1), (8), and (11). Hence, by combining (28) with \( y={\hat{x}}\) and the definition of \(\psi _{F_{k-1}^{lev}}(\tilde{x}^k)\) in (11), we obtain, for every \( w \in C \) and \(k\ge 0\),
which immediately implies that the first inequality in (29) holds. Now, first note that from the compactness of C and (11), there exists, indeed, \( w^{k}_{y} \in C \) as in the last statement of the lemma. Hence, from (11) and (28), we have
which immediately implies that the last inequality in (29) holds. \(\square \)
4.2 Main Convergence Results
This subsection establishes the main convergence results about the sequence \(\{x^k\}\) generated by HVPPM.
We start by analyzing the weak cluster points of \(\{x^k\}\), showing, in particular, that they are weakly efficient solutions for problem (1).
Proposition 4.1
Assume that \(\mathbf{(A1)}\) holds. Then, the sequence \(\{x^k\}\) generated by HVPPM is bounded and all its weak cluster points belong to S and are weakly efficient solutions for problem (1).
Proof
In view of Lemma 4.1, the sequence \(\{x^k\}\) is bounded and hence it has a weak cluster point. Let \({\bar{x}}\) be an arbitrary weak cluster point of \(\{x^k\}\) and take a subsequence \(\{x^{i_k}\}\) weakly convergent to \({\bar{x}}\). In view of the last statement in Lemma 4.2, \(\{ \Vert x^k - \tilde{x}^k \Vert \}\) converges to zero, and so we conclude that \( \{ \tilde{x}^{i_k} \}\) is also weakly convergent to \( \bar{x} \). It follows from Lemma 4.3 (a) that for every \(y\in \mathscr {H}\), the sequence \(\{ \gamma ^{k}_{y} \}\) given in (27) converges to zero. Then, from the above facts, the first inequality in (29) and the positive lower semicontinuity of F, we see that, for all \(w\in C\),
Since, in view of Step 3, \(F^{lev}_k\preceq F^{lev}_{k-1}\) for every \(k\ge 0\), we obtain by (8) that, for all \(w\in C\), the sequence \(\{{\langle }F_k^{lev},w{\rangle }\}\) is nonincreasing. Hence, it follows from (30) that
From the above inequality and (8) we conclude that \(F({\bar{x}})\preceq F^{lev}_k\) for every \(k\ge 0\), which is equivalent to say that \( {\bar{x}}\in S\).
Now, we proceed to prove that \({\bar{x}}\) is a weakly efficient solution for problem (1). Let \(y\in \mathscr {H}\) be arbitrary and consider the sequence \( \{w_y^{k}\} \subset C \) as in the last statement of Lemma 4.3. Since C is compact, we may assume without loss of generality that the subsequence \(\{w_y^{i_k}\}\) converges to some \({\bar{w}}\in C\). In view of Step 3, we may consider two cases: i) there exists \(k_0\) such that \( F_{i_k}^{lev} = F({\tilde{x}}^{i_k})\) for all \(k\ge k_0\); ii) there exists an infinite set \(N\subset {\mathbb {N}}\) such that \(F_{i_k}^{lev} \ne F({\tilde{x}}^{i_k})\) for every \(k\in N\).
If the statement in (i) holds, then it follows from the last inequality in (29) that
Now, if the statement in (ii) holds, then \( F^{lev}_{i_k}=F_{i_k-1}^{lev}\) for every \(k\in N\) (see Step 3). Hence, it follows from (28), the first statement of Lemma 3.1, and the definition of \(w^{i_k}_y\) (see the last statement in Lemma 4.3), that
which implies that, in case (ii), the inequality in (31) holds for all \(k\in N\). Therefore, in view of (8) and the fact that \({\bar{x}}\in S\), we conclude that, in both cases, we have
Since \( \{\gamma ^{i_k}_{y} \}\) converges to zero and \( \{ w^{i_k}_{y} \} \) converges to \( {\bar{w}} \in C\), it follows from the above inequality that \( \langle F(y), {\bar{w}} \rangle \ge \langle F({\bar{x}}), {\bar{w}} \rangle , \) which clearly implies that \( \max _{w\in C}{\langle }F(y)-F({\bar{x}}),w{\rangle }\ge 0. \) Since y is an arbitrary vector in \(\mathscr {H}\), Lemma 2.1 (b) implies that \(\bar{x}\) is a weakly efficient solution for problem (1), concluding the proof. \(\square \)
The next result shows, in particular, that the sequence \(\{x^k\}\) generated by HVPPM is strongly convergent to a weakly efficient solution for problem (1). Although its proof is similar to the one of [42, Theorem 1], we include it for the sake of completeness.
Theorem 4.1
Assume that \(\mathbf{(A1)}\) holds. Then, the sequence \(\{x^k\} \) generated by HVPPM is strongly convergent to \({{\hat{x}}}{:}{=}P_{S}(x^0)\), which is a weakly efficient solution for problem (1).
Proof
It follows from Proposition 4.1 that \(\{x^{k}\}\) is bounded and that any weak cluster point of it belongs to S. We now prove that every weakly convergent subsequence of \( \{x^k\} \) converges strongly to \( {\hat{x}} \). Let \(\{x^{i_k}\}\) be an arbitrary subsequence weakly convergent to some \({\bar{x}}\in S\). In view of Lemma 4.1, we have \(\Vert x^k-x^0\Vert \le \Vert x^0-{{\hat{x}}}\Vert \) for all \(k\ge 0\), where \({{\hat{x}}}=P_{S}(x^0)\). Thus, we obtain
$$\begin{aligned} \Vert x^{i_k}-{{\hat{x}}}\Vert ^2 = \Vert x^{i_k}-x^0\Vert ^2 + 2{\langle }x^{i_k}-x^0, x^0-{{\hat{x}}}{\rangle } + \Vert x^0-{{\hat{x}}}\Vert ^2 \le 2\Vert x^0-{{\hat{x}}}\Vert ^2 + 2{\langle }x^{i_k}-x^0, x^0-{{\hat{x}}}{\rangle }. \end{aligned}$$
Hence, since \(\{x^{i_k}\}\) is weakly convergent to \({\bar{x}}\), we have
$$\begin{aligned} \limsup _{k\rightarrow \infty }\Vert x^{i_k}-{{\hat{x}}}\Vert ^2 \le 2\Vert x^0-{{\hat{x}}}\Vert ^2 + 2{\langle }{\bar{x}}-x^0, x^0-{{\hat{x}}}{\rangle } = 2{\langle }{\bar{x}}-{{\hat{x}}}, x^0-{{\hat{x}}}{\rangle } \le 0, \end{aligned}$$
where the last inequality is due to Proposition 2.1, \({{\hat{x}}}=P_{S}(x^0)\), and the fact that \({\bar{x}}\in S\). Hence, \(\{x^{i_k}\}\) is strongly convergent to \({{\hat{x}}}\). It then follows that any weakly convergent subsequence of \(\{x^k\}\) is strongly convergent to \({{{\hat{x}}}}\). Therefore, since \(\{x^k\}\) is bounded and hence has a weak cluster point, one can easily conclude that the whole sequence \(\{x^k\} \) is strongly convergent to \({{\hat{x}}}\), which is a weakly efficient solution of problem (1) in view of Proposition 4.1. \(\square \)
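In finite dimensions, the projection characterization invoked in the proof above (Proposition 2.1: \( {\langle }x^0-P_S(x^0), z-P_S(x^0){\rangle }\le 0\) for all \(z\in S\)) can be checked numerically. The sketch below is purely illustrative and assumes \(S\) is a halfspace in \({\mathbb {R}}^2\); the function name and test data are hypothetical, not part of the paper.

```python
import numpy as np

def project_halfspace(x, a, b):
    """Closed-form Euclidean projection onto S = {z : <a, z> <= b}."""
    viol = np.dot(a, x) - b
    return x if viol <= 0 else x - (viol / np.dot(a, a)) * a

# Check the obtuse-angle characterization <x0 - p, z - p> <= 0 on sampled
# points of S (hypothetical data; here S = {z : z1 + z2 <= 0}).
rng = np.random.default_rng(0)
a, b = np.array([1.0, 1.0]), 0.0
x0 = np.array([2.0, 1.0])
p = project_halfspace(x0, a, b)
zs = [project_halfspace(rng.normal(size=2), a, b) for _ in range(100)]
max_inner = max(np.dot(x0 - p, z - p) for z in zs)
```

The inequality holds with equality only on the boundary hyperplane; any point with \(\max_{z\in S}{\langle }x^0-p, z-p{\rangle }\le 0\) is necessarily the projection, which is exactly how \({\hat{x}}=P_S(x^0)\) is identified in the proof.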
5 Conclusion
This paper proposes and analyzes a hybrid vector proximal point method for finding weakly efficient solutions of vector optimization problems in real Hilbert spaces. The proposed method combines proximal point iterations with projections onto the intersection of two special halfspaces. Two interesting features of this method are: the proximal subproblems are unconstrained, and the sequence generated by the method is proven to be strongly convergent to a weakly efficient solution of the vector optimization problem under standard assumptions. To the best of our knowledge, no proximal point type method proposed in the literature for solving vector optimization problems in infinite dimensional Hilbert spaces has this strong convergence property. Moreover, this property seems to be new even in the multiobjective setting. The proposed method can be seen as an extension of the hybrid proximal point method of Solodov and Svaiter to the computation of weakly efficient solutions of multiobjective/vector optimization problems. It is worth mentioning that this extension is not immediate. The main difficulty stems from the fact that the set of weakly efficient solutions of a multiobjective/vector optimization problem may fail to be convex even if the problem itself is convex, and convexity of this set is fundamental for this kind of technique (projections onto special halfspaces) to force strong convergence of proximal type methods. In order to overcome this drawback, a sequence of functional values, nonincreasing in the order induced by the cone, is specially defined during the process and used to solve the proximal subproblems.
References
Bauschke, H.H., Combettes, P.L.: A weak-to-strong convergence principle for Fejér-monotone methods in Hilbert spaces. Math. Oper. Res. 26(2), 248–264 (2001)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York (2011)
Bello Cruz, J.Y.: A subgradient method for vector optimization problems. SIAM J. Optim. 23(4), 2169–2182 (2013)
Bello Cruz, J.Y., Díaz Millán, R., Phan, H.M.: Conditional extragradient algorithms for solving variational inequalities. Pac. J. Optim. 15(3), 331–357 (2019)
Bello Cruz, J.Y., Iusem, A.N.: A strongly convergent method for nonsmooth convex minimization in Hilbert spaces. Numer. Funct. Anal. Optim. 32(10), 1009–1018 (2011)
Bello Cruz, J.Y., Lucambio Pérez, L.R., Melo, J.G.: Convergence of the projected gradient method for quasiconvex multiobjective optimization. Nonlinear Anal. 74(16), 5268–5273 (2011)
Bello Cruz, J.Y., Lucambio Pérez, L.R.: A subgradient-like algorithm for solving vector convex inequalities. J. Optim. Theory Appl. 162(2), 392–404 (2014)
Bello Cruz, J.Y., Melo, J.G., Serra, R.G.: A proximal gradient splitting method for solving convex vector optimization problems. Optimization (2020). https://doi.org/10.1080/02331934.2020.1800699
Bello Cruz, J.Y., Oliveira, W.D.: On weak and strong convergence of the projected gradient method for convex optimization in real Hilbert spaces. Numer. Funct. Anal. Optim. 37(2), 129–144 (2016)
Bento, G.C., Cruz Neto, J.X., Meireles, L.: Proximal point method for locally Lipschitz functions in multiobjective optimization of Hadamard manifolds. J. Optim. Theory Appl. 179(1), 37–52 (2018)
Bento, G.C., Cruz Neto, J.X., Soubeyran, A.: A proximal point-type method for multicriteria optimization. Set-Valued Var. Anal. 22(3), 557–573 (2014)
Bento, G.C., Cruz Neto, J.X., López, G., Soubeyran, A., Souza, J.C.O.: The proximal point method for locally Lipschitz functions in multiobjective optimization with application to the compromise problem. SIAM J. Optim. 28(2), 1104–1120 (2018)
Bento, G.C., Ferreira, O.P., Oliveira, P.R.: Unconstrained steepest descent method for multicriteria optimization on Riemannian manifolds. J. Optim. Theory Appl. 154(1), 88–107 (2012)
Bento, G.C., Ferreira, O.P., Pereira, Y.: Proximal point method for vector optimization on Hadamard manifolds. Oper. Res. Lett. 46(1), 13–18 (2018)
Bento, G.C., Ferreira, O.P., Sousa Junior, V.L.: Proximal point method for a special class of nonconvex multiobjective optimization functions. Optim. Lett. 12(2), 311–320 (2018)
Bonnel, H., Iusem, A.N., Svaiter, B.F.: Proximal methods in vector optimization. SIAM J. Optim. 15(4), 953–970 (2005)
Boţ, R., Grad, S.M.: Inertial forward-backward methods for solving vector optimization problems. Optimization 67(7), 959–974 (2018)
Brezis, H., Lions, P.L.: Infinite products of resolvents. Israel J. Math. 29(4), 329–345 (1978)
Ceng, L.C., Mordukhovich, B.S., Yao, J.C.: Hybrid approximate proximal method with auxiliary variational inequality for vector optimization. J. Optim. Theory Appl. 146(2), 267–303 (2010)
Cruz Neto, J.X., Da Silva, G.J.P., Ferreira, O.P., Lopes, J.O.: A subgradient method for multiobjective optimization. Comput. Optim. Appl. 54(3), 461–472 (2013)
Ferreira, O.P., Louzeiro, M.S., Prudente, L.F.: Gradient method for optimization on Riemannian manifolds with lower bounded curvature. SIAM J. Optim. 29(4), 2517–2541 (2019)
Fliege, J., Graña Drummond, L.M., Svaiter, B.F.: Newton’s method for multiobjective optimization. SIAM J. Optim. 20(2), 602–626 (2009)
Fliege, J., Svaiter, B.F.: Steepest descent methods for multicriteria optimization. Math. Methods Oper. Res. 51(3), 479–494 (2000)
Gárciga, O., Iusem, A.N., Svaiter, B.F.: On the need for hybrid steps in hybrid proximal point methods. Oper. Res. Lett. 29(5), 217–220 (2001)
Graña Drummond, L.M., Iusem, A.N.: A projected gradient method for vector optimization problems. Comput. Optim. Appl. 28(1), 5–29 (2004)
Graña Drummond, L.M., Raupp, F.M.P., Svaiter, B.F.: A quadratically convergent Newton method for vector optimization. Optimization 63(5), 661–677 (2014)
Graña Drummond, L.M., Svaiter, B.F.: A steepest descent method for vector optimization. J. Comput. Appl. Math. 175(2), 395–414 (2005)
Grad, S.M.: A survey on proximal point type algorithm for solving vector optimization problems. In: Bauschke, H., Burachik, R., Luke, D. (eds.) Splitting Algorithms, Modern Operator Theory, and Applications. Springer, Cham (2019)
Güler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control Optim. 29(2), 403–419 (1991)
Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I, vol. 305. Springer-Verlag, Berlin (1993)
Lions, P.L.: An iterative method for solving a variational inequality. Israel J. Math. 31(2), 204–208 (1978)
Lu, F., Chen, C.R.: Newton-like methods for solving vector optimization problems. Appl. Anal. 93(8), 1567–1586 (2014)
Luc, D.T.: Theory of Vector Optimization. Lecture Notes in Economics and Mathematical Systems, vol. 319. Springer-Verlag, Berlin (1989)
Lucambio Pérez, L.R., Prudente, L.F.: Nonlinear conjugate gradient methods for vector optimization. SIAM J. Optim. 28(3), 2690–2720 (2018)
Marques Alves, M., Melo, J.G.: Strong convergence in Hilbert spaces via \(\Gamma \)-duality. J. Optim. Theory Appl. 158(2), 343–362 (2013)
Martinet, B.: Regularization of variational inequalities by successive approximations. Rev. Française Inform. Rech. Opér. 4, 154–159 (1970)
Martinet, B.: Algorithms for Solving Optimization and Minimax Problems. Habilitation à diriger des recherches, Université Joseph-Fourier - Grenoble I (1972). https://tel.archives-ouvertes.fr/tel-00284255. Université : Université scientifique et médicale de Grenoble
Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1(3), 127–239 (2014)
Rockafellar, R.T.: A dual approach to solving nonlinear programming problems by unconstrained optimization. Math. Program. 5(1), 354–373 (1973)
Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976)
Solodov, M., Svaiter, B.F.: A hybrid projection-proximal point algorithm. J. Convex Anal. 6(1), 59–70 (1999)
Solodov, M., Svaiter, B.F.: Forcing strong convergence of proximal point iterations in a Hilbert space. Math. Program. 87(1), 189–202 (2000)
Svaiter, B.: A new duality theory for mathematical programming. Optimization 60(8–9), 1209–1231 (2011)
Tanabe, H., Fukuda, E.H., Yamashita, N.: Proximal gradient methods for multiobjective optimization and their applications. Comput. Optim. Appl. 72(2), 339–361 (2019)
Acknowledgements
We would like to thank the anonymous referees for their valuable suggestions and remarks that allowed us to improve the first draft of the paper. The work of the second author is supported in part by FAPEG/CNPq/PRONEM-201710267000532 and CNPq-312559/2019-4. Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.
Communicated by Guang-ya Chen.
Iusem, A.N., Melo, J.G. & Serra, R.G. A Strongly Convergent Proximal Point Method for Vector Optimization. J Optim Theory Appl 190, 183–200 (2021). https://doi.org/10.1007/s10957-021-01877-0