1 Introduction

Multiobjective optimization is the process of simultaneously optimizing two or more real-valued objective functions. Usually, no single point minimizes all the given objective functions at once (i.e., there is no ideal minimizer), and so the concept of optimality has to be replaced by that of Pareto optimality or, as we will see, Pareto–Clarke criticality; see [11]. These problems have applications in economics, industry, agriculture, and other fields; see [13]. Bonnel et al. [6] considered extensions of the proximal point method to the multiobjective setting; see also [1,2,3,4,7,8,9,20] and the references therein.

Our goal is to study the proximal point method introduced in [6] for multiobjective problems in which each component function is lower-\(C^1\), a special class of nonconvex functions. Over the last four decades, several authors have proposed generalized proximal point methods for certain nonconvex minimization problems. As far as we know, the first generalization was performed in [12]; see also [15] for a review. Our approach extends the results of [15] to the multiobjective context. More precisely, we show that the method is well defined and that the accumulation points of any generated sequence, if any exist, are Pareto–Clarke critical for the multiobjective function. Moreover, under some additional assumptions, we show full convergence of the generated sequence.

The organization of the paper is as follows. In Sect. 2, some notation and basic results used throughout the paper are presented. In Sect. 3, the main results are stated and proved. Some final remarks are made in Sect. 4.

2 Preliminaries

In this section, we present some basic results and definitions.

We denote \(I:=\{1,\ldots ,m\}\), \({\mathbb {R}}^{m}_{+}:=\left\{ x\in {\mathbb {R}}^m{:}\,x_j\ge 0, j\in I\right\} \), and \({\mathbb {R}}^{m}_{++}:=\left\{ x\in {\mathbb {R}}^m{:}\,x_j>0, j\in I\right\} \). For \(y, z\in {\mathbb {R}}^m\), \(z\succeq y\) (or \(y\preceq z\)) means that \(z-y\in {\mathbb {R}}^{m}_{+}\), and \(z\succ y\) (or \(y\prec z\)) means that \(z-y\in {\mathbb {R}}^{m}_{++}\). We consider the unconstrained multiobjective problem \(\min _{x\in {\mathbb {R}}^n}F(x)\), where \(F{:}\,{\mathbb {R}}^n\rightarrow {\mathbb {R}}^m\) with \(F(x)=(f_1(x),\ldots ,f_m(x))\). Given a nonempty set \(C\subset {\mathbb {R}}^{n}\), a point \(x^*\in C\) is said to be a weak Pareto solution of the problem \(\min _w \{F(x){:}\,x\in C\}\) if, and only if, there is no \(x\in C\) with \(F(x)\prec F(x^*)\). We denote by \(\text{ argmin }_w \{F(x){:}\,x\in C\}\) the set of weak Pareto solutions; in particular, when \(C={\mathbb {R}}^{n}\), we denote this set by \(U^{*}\). Assume that C is convex. Given \(\nu \in {\mathbb {R}}^m_{++}\), F is called \(\nu \)-strongly convex (or simply strongly convex) on C if, and only if, for every \(x, y\in C\),

$$\begin{aligned} F\left( (1-t)x+ty\right) \preceq (1-t)F(x)+tF(y)- \nu t(1-t)\Vert x-y\Vert ^2 ,\quad t\in [0,1]. \end{aligned}$$

F is said to be convex when the above inequality holds with \(\nu =0\). Note that F is convex (resp. strongly convex) if, and only if, each component of F is convex (resp. strongly convex); in the scalar case, this definition reduces to the usual definition of a convex function. The proof of the next proposition can be found in [16, p. 95].

Proposition 1

If C is a convex set and F is a convex function, then

$$\begin{aligned} \bigcup _{z\in {\mathbb {R}}^{m}_{+}{\backslash }\{0\}} {\mathop {{\mathrm{argmin}}}\limits _{x\in C}}\; \langle F(x),z\rangle = {\mathop {{\mathrm{argmin}}}_w}\,\left\{ F(x){:}\,x\in C\right\} . \end{aligned}$$
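As a simple illustration of Proposition 1, take \(n=m=1\), \(C={\mathbb {R}}\), and \(F(x)=(x^2,(x-1)^2)\). For \(z=(z_1,z_2)\in {\mathbb {R}}^{2}_{+}{\backslash }\{0\}\), the scalarized problem is solved by

$$\begin{aligned} {\mathop {{\mathrm{argmin}}}\limits _{x\in {\mathbb {R}}}}\;\left( z_1x^2+z_2(x-1)^2\right) =\left\{ \frac{z_2}{z_1+z_2}\right\} , \end{aligned}$$

and, as z ranges over \({\mathbb {R}}^{2}_{+}{\backslash }\{0\}\), these minimizers sweep out exactly \([0,1]\), the set of weak Pareto solutions of F.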

If \(m=1\), \(f\) is L-strongly convex on \(\varOmega \subset {\mathbb {R}}^n\) if, and only if,

$$\begin{aligned} \langle u-v,x-y\rangle \ge L \Vert x-y\Vert ^2,\quad u\in \partial f(x),\quad v\in \partial f(y), \end{aligned}$$
(1)

whenever \(x, y \in \varOmega \), where \(\partial f\) denotes the subdifferential.
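For instance, \(f(x)=(L/2)\Vert x\Vert ^2\) satisfies (1) with equality, since \(\partial f(x)=\{Lx\}\) and \(\langle Lx-Ly,x-y\rangle =L\Vert x-y\Vert ^2\).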

Remark 1

Let \(f_1, f_2{:}\,{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be convex on \(\varOmega \). Thus, \(\partial f_{1}(x)\) and \(\partial f_{2}(x)\) are nonempty, convex, and compact for \(x\in \varOmega \). Moreover, if \(\lambda _1, \lambda _2\ge 0\) then \(\partial (\lambda _1f_1+\lambda _2f_2)(x)=\lambda _1\partial f_1(x)+\lambda _2 \partial f_2(x), \) for \(x\in \varOmega \); see [19, Theorem 23.8].

Let \(C\subset {\mathbb {R}}^n\) be nonempty, closed, and convex. The normal cone to C at \(x\in C\) is defined by

$$\begin{aligned} N_C(x):=\left\{ v\in {\mathbb {R}}^n{:}\,\langle v,y-x\rangle \le 0,~\forall \,y\in C\right\} . \end{aligned}$$
(2)

Remark 2

If \(g{:}\,{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}\) is convex, then \(x\in C\) solves \(\min _{x\in C}g(x)\) if, and only if, the first-order optimality condition \(0\in \partial g(x)+N_{C}(x)\) holds. In particular, this characterization applies when g is the maximum of a finite collection of continuously differentiable convex functions.
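For instance, with \(n=1\), \(C=[0,+\infty )\), and \(g(x)=x\), the minimizer \(x^*=0\) satisfies \(0\in \partial g(0)+N_C(0)=\{1\}+(-\infty ,0]\).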

Let \(f{:}\,{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be locally Lipschitz at \(x\in {\mathbb {R}}^n\) and \(d\in {\mathbb {R}}^n\). The Clarke directional derivative [10, p. 25] of f at x in the direction d is defined as

$$\begin{aligned} f^{\circ }(x,d):=\displaystyle \limsup _{y \rightarrow x,\ t\downarrow 0}\frac{f(y+td)-f(y)}{t}, \end{aligned}$$

and the Clarke subdifferential of f at x, denoted by \(\partial ^{\circ }f(x)\), is defined as

$$\begin{aligned} \partial ^{\circ }f(x):=\left\{ w\in {\mathbb {R}}^n{:}\,\langle w,d\rangle \le f^{\circ }(x,d),~\forall ~d\in {\mathbb {R}}^n\right\} . \end{aligned}$$

The previous definition can be found in [10, p. 27]. If f is convex, then \(f^{\circ }(x,d)=f'(x,d)\), where \(f'(x,d)\) is the usual directional derivative, and \(\partial ^{\circ }f(x)=\partial f(x)\) for all \(x\in {\mathbb {R}}^n\); see [10, Proposition 2.2.7]. The next lemma can be found in [10, p. 39].
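As an illustration, for \(f(x)=|x|\) on \({\mathbb {R}}\) we have \(f^{\circ }(0,d)=|d|\) and \(\partial ^{\circ }f(0)=[-1,1]\); here f is convex, so both objects coincide with the usual directional derivative and the convex subdifferential at the origin.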

Lemma 1

Let \(\varOmega \subset {\mathbb {R}}^n\) be open and convex. If \(f{:}\,{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is locally Lipschitz on \(\varOmega \) and \(g{:}\,{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is convex on \(\varOmega \), then \((f+g)^{\circ }(x,d)=f^{\circ }(x,d)+g'(x,d)\) for each \(x\in \varOmega \) and \(d\in {\mathbb {R}}^n\). Consequently, if \(g{:}\,{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is continuously differentiable on \(\varOmega \), then \(\partial ^{\circ }(f+g)(x)=\partial ^{\circ }f(x)+{\mathrm{grad}}\,g(x)\) for each \( x\in \varOmega .\)

Lemma 2

Let \(\varOmega \subset {\mathbb {R}}^n\) be open and convex. Let \(f_i:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be a continuously differentiable function on \(\varOmega \), \(i\in I\). Define \(f(x):=\max _{i\in I}f_i(x)\) and \(I(x):=\{i\in I{:}\,f_i(x)=f(x)\}\). Then: (a) f is locally Lipschitz on \(\varOmega \) and \(\text{ conv }\{\mathrm{grad}\,f_i(x){:}\,i\in I(x)\}\subset \partial ^{\circ }f(x)\), \(x\in \varOmega \); (b) if \(f_i:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is differentiable and convex on \(\varOmega \), \(i\in I\), then \(\partial f(x)=\text{ conv }\{\mathrm{grad}\,f_i(x){:}\,i\in I(x)\}\); in particular, x minimizes f on \(\varOmega \) if, and only if, there exist \(\alpha _i\ge 0\), \(i\in I(x)\), such that \(0=\sum _{i\in I(x)}\alpha _i\,\mathrm{grad}\,f_i(x)\) and \(\sum _{i\in I(x)}\alpha _i=1\); (c) if \(f_i:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is \(L_i\)-strongly convex for each \(i\in I\), then f is \(\min _{i\in I}L_i\)-strongly convex.

Proof

The proofs of items (a) and (b) can be found in [5, Proposition 4.5.1] and [17, p. 49], respectively. The proof of item (c) follows from the definition of a strongly convex function. \(\square \)

Definition 1

Let \(F=(f_1,\ldots ,f_m)^{T}{:}\,{\mathbb {R}}^n\rightarrow {\mathbb {R}}^m\) be locally Lipschitz on \({\mathbb {R}}^n.\) We say that \(x^*\in {\mathbb {R}}^n\) is a Pareto–Clarke critical point of F if, for all directions \(d\in {\mathbb {R}}^n\), there exists \(i_0=i_0(d)\in \{1,\ldots ,m\}\), such that \(f^{\circ }_{i_0}(x^*,d)\ge 0.\)

Remark 3

The previous definition can be found in [11]. When \(m=1\), it reduces to the classical definition of a critical point of a nonsmooth function. Moreover, it generalizes, to nonsmooth multiobjective optimization, the condition \(\text{ Im }\left( JF (x^*)\right) \cap \left( -{\mathbb {R}}^{m}_{++}\right) =\emptyset , \) which characterizes Pareto critical points when F is continuously differentiable.
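To illustrate Definition 1, take \(m=2\) and \(F(x)=(x,-x)\) on \({\mathbb {R}}\). For every \(x^*\) and every direction d, \(f_1^{\circ }(x^*,d)=d\) and \(f_2^{\circ }(x^*,d)=-d\), so \(\max \{f_1^{\circ }(x^*,d),f_2^{\circ }(x^*,d)\}\ge 0\); hence every point is Pareto–Clarke critical, in agreement with the fact that every point is weakly Pareto optimal for this F.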

3 Proximal algorithm for multiobjective optimization

In this section, we present a proximal point method to minimize a nonconvex function F whose component functions are given by maxima of finitely many continuously differentiable functions. Our goal is to prove the following theorem:

Theorem 1

Let \(\varOmega \subset {\mathbb {R}}^n\) be open and convex, and let \(I_j:=\{1,\ldots ,\ell _j\}\) with \( \ell _j \in {\mathbb {Z}}_+\). Let \(F(x):=(f_1(x),\ldots ,f_m(x))\), where \( f_j(x):=\max _{i\in I_j}f_{ij}(x)\) for \(j\in {I}\), and each \(f_{ij}:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is continuously differentiable on \(\varOmega \) and continuous on \({\bar{\varOmega }}\), for all \(i\in I_j\). Assume that, for all \(j\in {I}\): \(-\infty <\inf _{x\in {\mathbb {R}}^n} f_j(x)\); \(\mathrm{grad}\,f_{ij}\) is Lipschitz on \(\varOmega \) with constant \(L_{ij}\) for each \(i\in I_j\); and \(S_F(F({{\bar{y}}})):=\left\{ x\in {\mathbb {R}}^n{:}\, F(x)\preceq F({\bar{y}})\right\} \subset \varOmega \) for some \({{\bar{y}}}\in {\mathbb {R}}^n\). Let \({\bar{\lambda }}>0\) and \({\bar{\mu }}\in (0,1)\). Take \(\{e^k\}\subset {\mathbb {R}}^m_{++}\) and \(\{\lambda _k\}\subset {\mathbb {R}}_{++}\) satisfying

$$\begin{aligned} \Vert e^k\Vert =1, \quad {\bar{\mu }}<e^k_j, \quad \frac{1}{{\bar{\mu }}}\max _{i\in I_j}L_{ij}<\lambda _k \le {\bar{\lambda }}, \quad j\in {I}, \quad k=0, 1,\ldots . \end{aligned}$$
(3)

Let \({\hat{x}}\in S_F(F({\bar{y}}))\) and define \(\varOmega _k:=\{x\in {\mathbb {R}}^n{:}\, F(x)\preceq F(x^k)\}\). Then the iteration

$$\begin{aligned} x^{k+1}\in {\mathop {\mathrm{argmin}}_{w}}\,\left\{ F(x)+\frac{\lambda _k}{2}\Vert x-x^k\Vert ^2e^k {:}\, x\in \varOmega _k\right\} ,\quad k=0,1,\ldots , \end{aligned}$$
(4)

starting with \(x^0={\hat{x}}\) is well defined, the generated sequence \(\{x^k\}\) remains in \(S_F(F({\bar{y}}))\), and any accumulation point of \(\{x^k\}\) is a Pareto–Clarke critical point of F, provided that \(\varOmega _k\) is convex for each k.

In order to prove the above theorem, we need some preliminaries. Hereafter, we assume that all the assumptions of Theorem 1 hold. We start by proving the well-definedness of the sequence in (4).

Proposition 2

The proximal point method (4) applied to F with starting point \(x^0={\hat{x}}\) is well defined.

Proof

The proof is by induction on k. Let \(\{x^k\}\) be as in (4). By assumption, \(x^0={\hat{x}}\in S_F(F({\bar{y}}))\). Now assume that \(x^k\in S_F(F({\bar{y}}))\) for some k. Take \(z\in {\mathbb {R}}_{+}^{m}{\backslash }\{0\}\) and define \(\varphi _k(x):=\langle F(x),z\rangle +(\lambda _k/2)\langle e^k, z\rangle \Vert x-x^k\Vert ^2\). As \(-\infty <\inf _{x\in {\mathbb {R}}^n} f_j(x)\) for all \(j\in I\), the function \(\langle F(\cdot ),z\rangle \) is bounded below and, taking into account that \(\langle e^k, z\rangle >0\), it follows that \(\varphi _k\) is coercive. Then, as \(\varOmega _k\) is closed, the minimum of \(\varphi _k\) on \(\varOmega _k\) is attained at some \({\tilde{x}}\in \varOmega _k\). Therefore, from Proposition 1 (whose convexity requirement is met, in view of (3) and Lemma 3 below), we can take \(x^{k+1}:={\tilde{x}}\), which completes the induction and proves the proposition. \(\square \)

Lemma 3

For all \({\tilde{x}}\in {\mathbb {R}}^n,\,v:=(v_1, \ldots , v_m) \in {\mathbb {R}}^m_{++},\, j\in {I}\) and \(\lambda \) satisfying \(\sup _{i\in I_j}L_{ij}<\lambda v_j\), the functions \(f_{ij}+\lambda v_j\Vert \cdot -{\tilde{x}}\Vert ^2/2,\,f_{j}+\lambda v_j \Vert \cdot -{\tilde{x}}\Vert ^2/2\) and \(F+(\lambda /2)\Vert \cdot -{\tilde{x}}\Vert ^2v\) are strongly convex on \(\varOmega \). Moreover, \( \left\langle F(\cdot ),z \right\rangle +\lambda \left\langle v, z\right\rangle \left\| \cdot -{\tilde{x}}\right\| ^2/2\) is strongly convex on \(\varOmega \) for each \( z \in {\mathbb {R}}^m_+{\backslash }\{0\}\).

Proof

Take \( j\in {I}\), \(i\in I_j\), \({\tilde{x}}\in {\mathbb {R}}^n\), and \(v_j \in {\mathbb {R}}_{++}\), and define \(h_{ij}=f_{ij}+\lambda v_j \Vert \cdot -{\tilde{x}}\Vert ^2/2\). Since \(\mathrm{grad}\,h_{ij}(x)=\mathrm{grad}\,f_{ij}(x)+\lambda v_j (x-{\tilde{x}})\), we have \(\langle \mathrm{grad}\,h_{ij}(x)-\mathrm{grad}\,h_{ij}(y),x-y\rangle =\langle \mathrm{grad}\,f_{ij}(x)-\mathrm{grad}\,f_{ij}(y),x-y\rangle +\lambda v_j \Vert x-y\Vert ^2\). Using the Cauchy–Schwarz inequality, the last equality becomes

$$\begin{aligned}&\left\langle \mathrm{grad}\,h_{ij}(x)-\mathrm{grad}\,h_{ij}(y),x-y\right\rangle \ge -\Vert \mathrm{grad}\,f_{ij}(x)-\mathrm{grad}\,f_{ij}(y)\Vert \Vert x-y\Vert \\&\quad +\,\lambda v_j \Vert x-y\Vert ^2. \end{aligned}$$

As \(\mathrm{grad}f_{ij}\) is Lipschitz on \(\varOmega \) with constant \(L_{ij},\,\langle \mathrm{grad}\,h_{ij}(x)-\mathrm{grad}\,h_{ij}(y),x-y\rangle \ge (\lambda v_j -L_{ij})\Vert x-y\Vert ^2\) holds. Hence, the last inequality along with the assumption \(\lambda v_j >\sup _{i\in I_j}L_{ij}\) implies that \(\mathrm{grad}\,h_{ij}\) is strongly monotone. Therefore, (1) implies that \(h_{ij}\) is strongly convex, proving the first part of the lemma. The second and third parts of the lemma follow from the first one. \(\square \)

Hereafter, \(\{x^k\}\) is generated by (4). Note that Proposition 1 implies that there exists a sequence \(\{ z^k\}\subset {\mathbb {R}}^m_+{\backslash }\{0\}\), such that

$$\begin{aligned} x^{k+1}={\mathop {\text{ argmin }}\limits _{x\in \varOmega _k}}\,\psi _k(x), \end{aligned}$$
(5)

where the function \(\psi _k{:}\,{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) is defined by

$$\begin{aligned} \psi _k(x):=\left\langle F(x),z^k \right\rangle +\frac{\lambda _k}{2}\left\langle e^k, z^k\right\rangle \left\| x-x^k\right\| ^2. \end{aligned}$$
(6)

The solution of the problem in (5) is not altered through the multiplication of \(z^k\) by positive scalars. Thus, we can suppose \(\Vert z^k\Vert =1\) for \(k=0,1, \ldots .\)
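Before turning to the proof of Theorem 1, it may help to see iterations (5)–(6) in action on a concrete instance. The following minimal sketch applies the scalarized subproblem to the nonconvex function of Example 1 below; the fixed weight z, the parameters lam and e, and the simple bounds standing in for the constraint \(x\in \varOmega _k\) are illustrative simplifying assumptions, not part of the method's statement (in the method, \(z^k\) arises from the weak Pareto subproblem (4)).

```python
# Minimal numerical sketch of the scalarized subproblem (5)-(6) for the
# function of Example 1, F(x) = (ln x + 1/x, 2*sqrt(x) + 1/x) on (0, +inf).
# The fixed weight z, the parameters lam and e, and the bounds standing in
# for the constraint x in Omega_k are illustrative assumptions only.
import numpy as np
from scipy.optimize import minimize_scalar

def F(x):
    return np.array([np.log(x) + 1.0 / x, 2.0 * np.sqrt(x) + 1.0 / x])

z = np.array([1.0, 1.0]) / np.sqrt(2.0)   # scalarization weight, ||z|| = 1
e = np.array([1.0, 1.0]) / np.sqrt(2.0)   # e^k as in (3), ||e^k|| = 1
lam = 10.0                                 # lambda_k; (3) asks lam > max L_ij / mu_bar

x = 2.5                                    # starting point x^0 in S_F(F(ybar))
for k in range(50):
    xk = x
    # psi_k from (6); minimizing it yields the next iterate x^{k+1} as in (5)
    psi = lambda y, xk=xk: z @ F(y) + 0.5 * lam * (z @ e) * (y - xk) ** 2
    x = minimize_scalar(psi, bounds=(0.5, xk), method='bounded').x

print(x, F(x))   # the iterates decrease monotonically toward x* = 1
```

For this instance both components of F attain their minimum at \(x^*=1\), so the proximal iterates approach the (here unique) Pareto optimum regardless of the chosen weight.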

Proof of Theorem 1

The well-definedness of (4) follows from Proposition 2. As \(x^0={\hat{x}}\in S_F(F({\bar{y}}))\subset \varOmega \), (4) implies \(\{x^k\}\subset S_F(F({\bar{y}}))\). Let \({\bar{x}}\) be an accumulation point of \(\{x^k\}\). Assume that each \(\varOmega _k\) is convex and, by contradiction, that \({\bar{x}}\) is not a Pareto–Clarke critical point of F. Then, there exists \(d\in {\mathbb {R}}^n\) such that

$$\begin{aligned} f_{i}^{\circ }({\bar{x}},d)<0,\quad i\in I. \end{aligned}$$
(7)

Thus, d is a descent direction for F at \({\bar{x}}\), and there exists \(\delta >0\) such that \(F({\bar{x}}+td)\prec F({\bar{x}})\) for all \(t\in (0,\delta ]\). Since \(\{F(x^k)\}\) is nonincreasing and \({\bar{x}}\) is an accumulation point of \(\{x^k\}\), we have \(F({\bar{x}})\preceq F(x^k)\) for all k; hence, \({\bar{x}}+td\in \varOmega _k\), for \(k=0, 1, \ldots \).

Let \(\{z^k\}\) be a sequence satisfying (5). Hence, we can combine Lemma 3 and Remark 2 to obtain

$$\begin{aligned} 0\in \partial \left( \left\langle F(\cdot ),z^k\right\rangle +\frac{\lambda _k}{2}\left\langle e^k, z^k\right\rangle \left\| \cdot -x^k\right\| ^2\right) (x^{k+1})+N_{\varOmega _k}(x^{k+1}), \quad k=0,1,\ldots . \end{aligned}$$

Letting \(z^k=(z^k_1,\ldots ,z^k_m)\) and \(e^k=(e^k_1,\ldots ,e^k_m)\), Remark 1 gives us,

$$\begin{aligned} 0\in \sum _{j=1}^{m}z^k_j\partial \left( f_j+\frac{\lambda _k}{2} e^k_j\left\| \cdot -x^k\right\| ^2\right) (x^{k+1})+N_{\varOmega _k}(x^{k+1}), \quad k=0,1,\ldots . \end{aligned}$$

The last inclusion implies that there exists \(v^{k+1}\in N_{\varOmega _k}(x^{k+1})\), such that

$$\begin{aligned} 0\in \sum _{j=1}^{m}z^k_j\partial \left( f_j+\frac{\lambda _k}{2}e^k_j\left\| \cdot -x^k\right\| ^2\right) (x^{k+1})+v^{k+1}, \quad k=0,1,\ldots . \end{aligned}$$

Since \(\max _{i\in I_j}L_{ij}<\lambda _ke^k_j\), Lemma 3 implies that \(f_{ij}+\lambda _k e^k_j\Vert \cdot -x^k\Vert ^2/2\) and \(f_j+\lambda _k e^k_j\Vert \cdot -x^k\Vert ^2/2\) are strongly convex for all \(j\in {I},\,k=0, 1,\ldots \). Applying Lemma 2(b), for \(I=I_j\) and for the functions \(f_{ij}+\lambda _k e^k_j \Vert \cdot -x^k\Vert ^2/2\) and \(f_j+\lambda _k e^k_j\Vert \cdot -x^k\Vert ^2/2\), for each \(j\in {I}\), we obtain

$$\begin{aligned} 0=\sum _{j=1}^{m}z^k_j\left( \sum _{i\in I_j(x^{k+1})}\alpha _{ij}^{k+1}{\mathrm{grad}}\left( f_{ij}+\frac{\lambda _k e^k_j}{2}\left\| \cdot -x^k\right\| ^2\right) (x^{k+1})\right) +v^{k+1},\qquad \sum _{i\in I_j(x^{k+1})}\alpha _{ij}^{k+1}=1, \end{aligned}$$

which holds for all \(k=0,1,\ldots \), with \(\alpha _{ij}^{k+1}\ge 0,\,i\in I_j(x^{k+1})\). This tells us that

$$\begin{aligned} 0=\sum _{j=1}^{m}z^k_j\left( \sum _{i\in I_j(x^{k+1})}\alpha _{ij}^{k+1}\left( {\mathrm{grad}}\,f_{ij}(x^{k+1})+\lambda _k e^k_j (x^{k+1}-x^k)\right) \right) +v^{k+1},\qquad \sum _{i\in I_j(x^{k+1})}\alpha _{ij}^{k+1}=1, \end{aligned}$$
(8)

for all \(k=0,1,\ldots \). For each \(j\in {I}\), let \(\{\alpha _{j}^{k+1}\}\subset {\mathbb {R}}^{\ell _j}\) be the sequence defined by \(\alpha _{j}^{k+1}=(\alpha _{1j}^{k+1},\alpha _{2j}^{k+1},\ldots ,\alpha _{\ell _jj}^{k+1})\), with \(\alpha _{ij}^{k+1}=0\) for \(i\in I_j{\backslash } I_j(x^{k+1})\), for all \(k=0,1,\ldots \). Since \(\sum _{i\in I_j(x^{k+1})}\alpha _{ij}^{k+1}=1\), we have \(\Vert \alpha _j^{k+1}\Vert _1=1\) for all k, where \(\Vert \cdot \Vert _1\) is the sum norm in \({\mathbb {R}}^{\ell _j}\). Thus, \(\{\alpha _{j}^{k+1}\}\) is bounded. As \(\{x^k\}\subset S_F(F({\bar{y}}))\) and F is continuous on \(\varOmega \), we have \({\bar{x}}\in S_F(F({\bar{y}}))\). Let \(\{x^{k_s}\}\) be a subsequence of \(\{x^k\}\) converging to \({\bar{x}}\). Since \(I_j\) is finite, we can assume without loss of generality (passing to a further subsequence if necessary) that \(I_j(x^{k_s+1})=:{\tilde{I}}_j\) is the same for all s, and (8) becomes

$$\begin{aligned} 0=\sum _{j=1}^{m}z^{k_s}_j\left( \sum _{i\in {\tilde{I}}_j}\alpha _{ij}^{k_s+1}{\mathrm{grad}}\,f_{ij}(x^{k_s+1})+\lambda _{k_s} e^{k_s}_j(x^{k_s+1}-x^{k_s})\right) +v^{k_s+1},\qquad \sum _{i\in {\tilde{I}}_j}\alpha _{ij}^{k_s+1}=1,\quad s=0,1,\ldots . \end{aligned}$$
(9)

From the continuity of F we obtain that \(\varOmega _k\) is closed. Considering that \(x^{k_s}\in \varOmega _{k_s},\,\varOmega _{k_s}\) is convex and \(\varOmega _{k_s+1}\subset \varOmega _{k_s}\), for \(s=0, 1, \ldots \), we obtain that

$$\begin{aligned} {\tilde{\varOmega }}:=\cap _{s=0}^{+\infty }\varOmega _{k_s}, \end{aligned}$$
(10)

is nonempty, closed, and convex. As \(v^{k_s+1}\in N_{\varOmega _{k_s}}(x^{k_s+1})\) and \({\tilde{\varOmega }}\subset \varOmega _{k_s}\), (2) implies

$$\begin{aligned} \left\langle v^{k_s+1},~x-x^{k_s+1}\right\rangle \le 0,\quad x\in {\tilde{\varOmega }}, \quad s=0,1,\ldots . \end{aligned}$$
(11)

On the other hand, since the sequences \(\{z^{k_s}\}\), \(\{e^{k_s}\}\), \(\{\lambda _{k_s}\}\), and \(\{\alpha _j^{k_s+1}\}\) are bounded, we can assume (passing again to a subsequence if necessary) that \(\lim _{s\rightarrow +\infty }(z^{k_s},x^{k_s+1},e^{k_s},\lambda _{k_s},\alpha _j^{k_s+1})=({\overline{z}},{\overline{x}},{\overline{e}},{\hat{\lambda }},{\bar{\alpha }}_j)\). This fact, along with (9), implies that \(\lim _{s\rightarrow +\infty }v^{k_s+1}={\bar{v}}\) exists. From (11), \({\bar{v}}\in N_{{\tilde{\varOmega }}}({\bar{x}})\). Hence, letting \(s\rightarrow +\infty \) in (9), we have \(0=\sum _{j=1}^{m}{{{\bar{z}}}}_j\sum _{i\in {\tilde{I}}_j}{\bar{\alpha }}_{ij}{\mathrm{grad}}\,f_{ij}({\bar{x}})+{\bar{v}}\) and \( \sum _{i\in {\tilde{I}}_j}{\bar{\alpha }}_{ij}=1\). Let \(x\in {\tilde{\varOmega }}\). Taking \(u_j=\sum _{i\in {\tilde{I}}_j}{\bar{\alpha }}_{ij}{\mathrm{grad}}\,f_{ij}({\bar{x}})\), we have

$$\begin{aligned} 0=\sum _{j=1}^{m}{{{\bar{z}}}}_j\langle u_j,x-{\bar{x}}\rangle +\langle {\bar{v}},x-{\bar{x}}\rangle . \end{aligned}$$
(12)

As \({\bar{x}}+td\in \varOmega _k\) for all \(k=0, 1, \ldots \), the definition of \({\tilde{\varOmega }}\) in (10) implies that \({\bar{x}}+td\in {\tilde{\varOmega }}\) for \(t\in (0,\delta ]\). Since \(u_j=\sum _{i\in {\tilde{I}}_j}{\bar{\alpha }}_{ij}{\mathrm{grad}}\,f_{ij}({\bar{x}})\) and \(\sum _{i\in {\tilde{I}}_j}{\bar{\alpha }}_{ij}=1\), Lemma 2 (a) and (b) imply that \(u_j\in \partial ^{\circ }f_j({\bar{x}})\). Hence, using that \({\bar{v}}\in N_{{\tilde{\varOmega }}}({\bar{x}})\) and the definition of \(f_j^{\circ }({\bar{x}},d)\), equality (12) with \(x={\bar{x}}+td\) yields \( 0\le \sum _{j=1}^{m}{{{\bar{z}}}}_j\langle u_j,d\rangle \le \sum _{j=1}^{m}{{{\bar{z}}}}_j f_j^{\circ }({\bar{x}},d). \) Thus, there exists \(j\in I\) such that \(f_j^{\circ }({\bar{x}},d)\ge 0\), which contradicts (7). Therefore, \({\bar{x}}\) is a Pareto–Clarke critical point of F. \(\square \)

Now let us introduce some conditions that will guarantee that \(\{x^k\}\) converges to a point \(x^*\in U^*\). Suppose that

  • (H1) \(U=\{y\in {\mathbb {R}}^n{:}\, F(y) \preceq F(x^k), ~ k=0,1, \ldots \}\ne \varnothing \);

  • (H2) there exists \(c\in {\mathbb {R}}\) such that the following conditions hold:

    • (a) \(S_F(ce):=\left\{ x\in {\mathbb {R}}^n{:}\, F(x)\preceq ce\right\} \ne \varnothing \) and \(S_F(ce) \subsetneq S_{F}(F({\bar{y}}))\);

    • (b) \(S_F(ce)\) is convex and F is convex on \(S_{F}(ce)\), where \(e:=(1, \ldots , 1 )\in {\mathbb {R}}^m\);

  • (H3) there exists \(\delta >0\) such that, for all \(z\in {\mathbb {R}}^m_+{\backslash } \{0\}\) with \(\Vert z\Vert =1\), all \(x\in S_{F}(F({\bar{y}}))\,{\backslash }\, S_{F}(ce)\), all k, and all \(w_z(x)\in \partial ^{\circ }\left( \langle F(\cdot ),z\rangle \right) (x)+N_{\varOmega _k}(x)\), it holds that \(\Vert w_z(x)\Vert >\delta \).

In general, the set U defined in assumption (H1) may be empty. To guarantee that U is nonempty, an additional assumption on the sequence \(\{x^{k}\}\) is needed; the next remark provides such a condition.

Remark 4

If the sequence \(\{x^{k}\}\) has an accumulation point, then \(U\ne \varnothing \), i.e., assumption (H1) holds. Indeed, let \({\bar{x}}\) be an accumulation point of \(\{x^{k}\}\). Then, there exists a subsequence \(\{x^{k_j}\}\) of \(\{x^{k}\}\) converging to \({\bar{x}}\). Since F is continuous, \(\{F(x^{k})\}\) has \(F({\bar{x}})\) as an accumulation point. Using the definition of \(\{x^{k}\}\) in (4), we conclude that \(\{F(x^{k})\}\) is nonincreasing in the partial order \(\preceq \). Hence, standard arguments show that the whole sequence \(\{F(x^{k})\}\) converges to \(F({\bar{x}})\) and that \({\bar{x}}\in U\), i.e., \(U\ne \varnothing \).

Next, we present a function satisfying the assumptions of Theorem 1, as well as (H1), (H2), and (H3), with \(m=2\) and \(I=I_{j}=\{1,2\}\) for \(j\in I\) (i.e., \(\ell _j=2\)).

Example 1

Take \(0<\epsilon <0.4\), \(\varOmega =(\epsilon , +\infty )\), \({I}:=\{1,2\}\), and \({{\bar{y}}}=\mathrm{e}=2.718\ldots \), so that \(\ln {{\bar{y}}}=1\). Let \(F:{\mathbb {R}}\rightarrow {\mathbb {R}}^2\) be defined by \(F(x)= (0,0)\) for \(x\in {\mathbb {R}} {\backslash } {\mathbb {R}}_{++}\) and \(F(x):=(f_1(x),f_2(x))\), where \(f_j(x):=\max _{i\in {I} }f_{ij}(x)\) for \(j\in {I}\) and \(f_{11}(x)=\ln x+1/x,\,f_{21}(x)=\ln x-1/x,\,f_{12}(x)=2\sqrt{x}+1/x,\,f_{22}(x)=2\sqrt{x}-1/x\), for all \(x\in {\mathbb {R}}_{++}\). Note that \(f_{1j},\,f_{2j}\) are continuously differentiable on \(\varOmega \) and continuous on \({\bar{\varOmega }}\), for all \(j\in {I}\). Since \(f''_{1j},\,f''_{2j}\) are bounded on \(\varOmega \), the derivatives \(f'_{1j},\,f'_{2j}\) are Lipschitz on \(\varOmega \), for \(j\in {I}\). Since \(\max \{a,b\}=(a+b)/2+|a-b|/2\) for all \(a, b\in {\mathbb {R}}\), we conclude that \(f_1(x)=\ln x+1/x\) and \(f_2(x)=2\sqrt{x}+1/x\). Therefore, \( F(x)=\left( \ln x+1/x, ~ 2\sqrt{x}+1/x\right) \) for \(x\in {\mathbb {R}}_{++}\). It is easy to see that F is nonconvex and that \(\{x\in \varOmega {:}\, F(x)\preceq (\zeta , \zeta )\} \) is convex and nonempty, for all \(\zeta \ge 1\). Consider the multiobjective optimization problem \(\min _{w} \{ F(x) {:}\, x\in \varOmega \}\), which has \(x^*=1\) as its unique solution; in fact, \(F(1) \prec F(x)\) for all \(x\in {\mathbb {R}}_{++}\) with \(x\ne 1\). Hence, \(-\infty <\inf _{x\in {\mathbb {R}}}f_j(x)\) for \(j\in {I}\). Since \(0<\epsilon <0.4\) and, as verified below, \(S_F(F({{\bar{y}}}))\subset [0.47, ~2.72]\), we conclude that \(S_F(F({{\bar{y}}}))\subset \varOmega \) and \(S_F(F({{\bar{y}}}))\ne \varnothing \). Therefore, taking into account that \(\varOmega _k\) is convex, F satisfies all the assumptions of Theorem 1. We now prove that F also satisfies (H1), (H2), and (H3). Since \(F(1) \prec F(x)\) for all \(x\ne 1\), F satisfies (H1). Let \(c=f_2(2)\) and note that \( (0.6, ~2] \subset S_F(ce) \subsetneq [0.5, ~2.7] \subset S_F(F({{\bar{y}}})). \) This tells us, in particular, that F satisfies (H2). Finally, we prove that F satisfies (H3). First, note that \( S_F(F({{\bar{y}}})){\backslash } S_F(ce)\subset [0.47, ~0.57)\cup (2, ~2.72]. \) For each point \(z=(z_1, z_2)\in {\mathbb {R}}^2_+{\backslash } \{0\}\) with \(\Vert z\Vert _{1}:=z_1+z_2=1\), take \(x\in S_F(F({{\bar{y}}})){\backslash } S_F(ce)\) and \(w_z(x)\in \partial ^{\circ }\left( \langle F(\cdot ),z\rangle \right) (x)+N_{\varOmega _k}(x)\). Hence, there exists \(v\in N_{\varOmega _k}(x)\), such that

$$\begin{aligned} w_z(x)=z_1\varphi _1(x)+z_2\varphi _2(x)+v, \end{aligned}$$
(13)

where \(\varphi _1(x):=1/x-1/x^2\) and \(\varphi _2(x):=1/\sqrt{x}-1/x^2\). First, we assume that \(x\in [0.47, ~0.57)\). In this case, \( N_{\varOmega _k}(x)\subset {\mathbb {R}}_{-}\) and, using the above equality, we obtain \(w_z(x)\le z_1\varphi _1(x)+z_2\varphi _2(x)\). Since \(\varphi _1(x)=(x-1)/x^2<-0.4/(0.57)^2\) and \(\varphi _2(x)=(x^{3/2}-1)/x^2<-0.2/(0.57)^{3/2}\) on this interval, we have \( w_z(x)\le z_1\varphi _1(x)+z_2\varphi _2(x)<-\left( 0.4/(0.57)^2\right) z_1-\left( 0.2/(0.57)^{3/2}\right) z_2. \) Then, for all \(w_z(x)\in \partial ^{\circ }\left( \langle F(\cdot ),z\rangle \right) (x)+N_{\varOmega _k}(x)\),

$$\begin{aligned} |w_z(x)|>\frac{0.4}{(0.57)^2}z_1+\frac{0.2}{(0.57)^{3/2}}z_2\ge \Vert z\Vert _{1}\frac{0.2}{(0.57)^{3/2}}=\frac{0.2}{(0.57)^{3/2}}, \end{aligned}$$
(14)

for \(x\in [0.47, ~0.57)\). Assuming now that \(x\in (2, ~2.72]\), it follows that \( N_{\varOmega _k}(x)\subset {\mathbb {R}}_{+}\). Hence, it follows from (13) that \(w_z(x)\ge z_1\varphi _1(x)+z_2\varphi _2(x)\). From \(\varphi _1(x)=(x-1)/x^2>1/(2.72)^2\) and \(\varphi _2(x)=(x^{3/2}-1)/x^2>1.8/(2.72)^{2}\) we obtain \( w_z(x)\ge z_1\varphi _1(x)+z_2\varphi _2(x)>\left( 1/(2.72)^2\right) z_1+\left( 1.8/(2.72)^{2}\right) z_2. \) Thus, for all \(w_z(x)\in \partial ^{\circ }\left( \langle F(\cdot ),z\rangle \right) (x)+N_{\varOmega _k}(x)\),

$$\begin{aligned} |w_z(x)|>\frac{1}{(2.72)^2}z_1+\frac{1.8}{(2.72)^{2}}z_2\ge \Vert z\Vert _{1}\frac{1}{(2.72)^2}=\frac{1}{(2.72)^2}, \quad x\in (2, ~2.72]. \end{aligned}$$

Since \(S_F(F({{\bar{y}}})){\backslash } S_F(ce)\subset [0.47, ~0.57)\cup (2, ~2.72] \) and \(1/(2.72)^2<0.2/(0.57)^{3/2}\), combining (14) with the last inequality, we conclude that F satisfies (H3) with \(\delta = 1/(2.72)^2\).
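The interval bounds on \(\varphi _1\) and \(\varphi _2\) used above can be checked numerically; the following small script is a sampling-based sanity check (an illustration, not a proof).

```python
# Sampling-based sanity check of the bounds on phi_1 and phi_2 used to verify
# (H3) in Example 1 (a numerical illustration, not a proof).
import numpy as np

phi1 = lambda x: 1.0 / x - 1.0 / x**2            # f_1'(x) = (x-1)/x^2
phi2 = lambda x: 1.0 / np.sqrt(x) - 1.0 / x**2   # f_2'(x) = (x^{3/2}-1)/x^2

left = np.linspace(0.47, 0.57, 1001, endpoint=False)  # grid on [0.47, 0.57)
right = np.linspace(2.0, 2.72, 1001)[1:]              # grid on (2, 2.72]

assert phi1(left).max() < -0.4 / 0.57**2
assert phi2(left).max() < -0.2 / 0.57**1.5
assert phi1(right).min() > 1.0 / 2.72**2
assert phi2(right).min() > 1.8 / 2.72**2
print("all four interval bounds hold on the sampled grids")
```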

Lemma 4

Assume that (H1), condition (a) in (H2), and (H3) hold, and that \(\lambda _k\) satisfies (3). Then, after a finite number of steps, the proximal iterates lie in \(S_F(ce)\); i.e., there exists \(k_0\) such that \(x^k\in S_F(ce)\) for all \(k\ge k_0\).

Proof

Condition (a) in (H2) implies that \(S_F(ce)\ne \varnothing \). Suppose, by contradiction, that \(x^k\in S_F(F({{\bar{y}}})){\backslash } S_F(ce)\) for all k. Let \(\{z^k\}\) be a sequence satisfying (5). Combining Lemma 3, Remark 2, and Lemma 1, we obtain \(0\in \partial ^{\circ }\left( \left\langle F(\cdot ),z^k\right\rangle \right) (x^{k+1})+\lambda _k\left\langle e^k, z^k\right\rangle \left( x^{k+1}-x^k\right) +N_{\varOmega _k}(x^{k+1})\), for \(k\ge 0\). Then, \(-\lambda _k\left\langle e^k, z^k\right\rangle \left( x^{k+1}-x^k\right) \in \partial ^{\circ }\left( \left\langle F(\cdot ),z^k\right\rangle \right) (x^{k+1})+N_{\varOmega _k}(x^{k+1})\), for \(k\ge 0\). As \(x^{k+1}\in S_F(F({{\bar{y}}})){\backslash } S_F(ce)\) and \(\Vert z^k\Vert =1\), (H3) along with the last inclusion gives us

$$\begin{aligned} \lambda _k\left\langle e^k, z^k\right\rangle \left\| x^{k+1}-x^k\right\| >\delta ,\quad k\ge 0. \end{aligned}$$
(15)

On the other hand, in view of (5) and (6) and \(\Vert z^k\Vert =1\), we have \((\lambda _k/2)\left\langle e^k, z^k\right\rangle \left\| x^{k+1}-x^k\right\| ^2\le \left\langle F(x^{k})-F(x^{k+1}),z^k\right\rangle \le \Vert F(x^{k})- F(x^{k+1})\Vert \) for all \(k\ge 0\). From (H1), each component sequence \(\{f_j(x^k)\}\) is nonincreasing and bounded below, so \( \Vert F(x^{k})- F(x^{k+1})\Vert \rightarrow 0\). Combining this with (15) yields \(\Vert x^{k+1}-x^k\Vert \le 2\Vert F(x^{k})- F(x^{k+1})\Vert /\delta \rightarrow 0\) and hence, since \(\lambda _k\le {\bar{\lambda }}\) and \(\langle e^k, z^k\rangle \le 1\), also \(\lambda _k\left\langle e^k, z^k\right\rangle \left\| x^{k+1}-x^k\right\| \rightarrow 0\), contradicting (15). \(\square \)

Lemma 5

Assume that (H1) and (H2) hold and \(\lambda _k\) satisfies (3). If \(x^k\in S_F(ce)\) for some k, then \(\{x^k\}\) converges to a point \(x^*\in U^*.\)

Proof

By hypothesis, \(x^{k}\in S_F(ce)\) for some k, i.e., there exists \(k_0\) such that \(F(x^{k_0})\preceq ce\). Hence, the definition of \(\{x^k\}\) in (4) implies that \(\{x^k\}\subset S_F(ce)\), for all \(k\ge k_0\). Therefore, using (3), (H1), and (H2), the result follows by applying [6, Theorem 3.1] with \(F_0=F,\,S=S_F(ce),\,X={\mathbb {R}}^n,\,C={\mathbb {R}}^m_+\), and using (H1) instead of (A). \(\square \)

Theorem 2

Under the conditions \((H1),\,(H2)\), and (H3), the sequence \(\{x^k\}\) generated by (4) converges to a point \(x^*\in U^*\).

Proof

It follows by combining Lemma 4 with Lemma 5. \(\square \)

Remark 5

As the function in Example 1 is not convex, the analysis in [6] does not allow us to conclude that \(\{x^k\}\) converges to a minimizer. However, as the function satisfies (H1)–(H3), Theorem 2 guarantees that \(\{x^k\}\) converges.

4 Conclusions

The main contribution of this paper is the extension of the convergence analysis of the proximal method (4), studied in [6], in order to increase the range of its applications; see Example 1. The proximal point method is a conceptual scheme that transforms a given problem into a sequence of better-behaved subproblems, and it has proven to be an efficient tool in several instances through the methods that can be derived from it (e.g., augmented Lagrangians, both classical and generalized). In this sense, the proximal point method is basic, and we expect the results of the present paper to become an additional step toward solving general multiobjective optimization problems. Finally, it is worth noting that in our analysis the preference relation was induced by a cone with a nonempty interior. Various vector optimization problems are formalized using convex ordering cones with empty interiors; relaxing this restriction could open new perspectives from the point of view of numerical methods; see [2, 14, 18]. We foresee new developments in this direction in the near future.