Abstract
In this paper, we propose a linear scalarization proximal point algorithm for solving lower semicontinuous quasiconvex multiobjective minimization problems. Under some natural assumptions and, using the condition that the proximal parameters are bounded, we prove the convergence of the sequence generated by the algorithm and, when the objective functions are continuous, we prove the convergence to a generalized critical point of the problem. Furthermore, for the continuously differentiable case we introduce an inexact algorithm, which converges to a Pareto critical point.
1 Introduction
In this paper, we consider the class of problems known as multiobjective minimization problems, which involves minimizing a finite family of quasiconvex objective functions. One motivation to study this problem is consumer demand theory in microeconomics, where the quasiconvexity of each objective function is a natural condition associated with diversification of consumption; see Sect. 3.1 for more details.
Another motivation is the multiobjective quasiconvex minimization model in location theory, where we need to find a location for an installation so that this location minimizes some functions involving some distances between the new location and each cluster set of demand points; see Sect. 4 of Apolinário et al. [1].
For other motivations, we recommend the excellent book [2], which contains a heterogeneous collection of contributions on generalized convexity and generalized monotonicity. In particular, we recommend reading Chapters 2, 3, 5 and 6.
Recently, Apolinário et al. [1] have introduced an exact linear scalarization proximal point algorithm to solve the above class of problems, when each objective function is locally Lipschitz and quasiconvex. The authors proved, under some natural assumptions, that the sequence generated by the proposed algorithm is well defined and converges globally to a Pareto–Clarke critical point.
Unfortunately, the proposed algorithm cannot be applied to solve a general class of proper lower semicontinuous quasiconvex functions, in particular to solve constrained multiobjective problems or minimization problems with continuous quasiconvex functions, which are not locally Lipschitz. Moreover, for a future implementation and application, it is necessary to construct inexact versions of the proposed algorithm.
Thus, we had two motivations to develop the present paper: The first motivation was to extend the convergence properties of the linear scalarization proximal point method introduced in [1] to solve more general, probably constrained, quasiconvex multiobjective problems and the second one was to introduce an inexact algorithm, when each objective function is continuously differentiable.
Some works related to this paper are as follows:
-
Bento et al. [3] introduced a proximal point algorithm for multiobjective optimization using a nonlinear scalarization function. Assuming that the objective functions are quasiconvex and continuously differentiable, the authors proved that the sequence generated by the algorithm converges to a Pareto critical point. The difference between our work and the paper of Bento et al. [3] is that in the present paper we consider a linear scalarization function instead of a nonlinear one; moreover, our assumptions are slightly weaker than those of [3], because we obtain convergence results for nondifferentiable quasiconvex functions.
-
Mäkelä et al. [4] developed a multiobjective proximal bundle method for nonsmooth optimization, where the objective functions are locally Lipschitz (not necessarily smooth or convex). The authors proved that any accumulation point of the generated sequence is a weak Pareto efficient solution and, under additional assumptions, that any accumulation point is a substationary point.
-
Chuong et al. [5] developed three algorithms of the so-called hybrid approximate proximal type to find Pareto optimal points for a general class of convex constrained problems of vector optimization in finite- and infinite-dimensional spaces, and proved the convergence of the sequence generated by their algorithms.
This paper proposes a linear scalarization proximal point algorithm for solving multiobjective minimization problems. Under the assumption that each objective function is a proper lower semicontinuous quasiconvex function, we prove the global convergence of the sequence generated by the algorithm to some generalized critical point of the problem, and convergence to a weak Pareto efficient solution when the objective functions are continuous and the regularized proximal parameters converge to zero. Additionally, when the objective functions are differentiable, we introduce an inexact proximal algorithm and prove the convergence of the sequence generated by the algorithm to a Pareto critical point of the multiobjective minimization problem.
The paper is organized as follows: In Sect. 2, we recall some concepts and basic results on multiobjective optimization, descent directions, scalar representations, quasiconvex and convex functions, the Fréchet and limiting subdifferentials, the \(\epsilon \)-subdifferential and Fejér convergence. In Sect. 3, we present the problem and give an example of a quasiconvex model in demand theory. In Sect. 4, we introduce an exact algorithm and analyze its convergence. In Sect. 5, we present an inexact algorithm for the differentiable case and analyze its convergence. In Sect. 6, we give a numerical example of the algorithm; in Sect. 7, we present some perspectives and open problems; and in Sect. 8, we state our conclusions.
2 Preliminaries
In this section, we present some basic concepts and results that are important for the development of our work. These facts can be found, for example, in Hadjisavvas [2], Mordukhovich [6] and Rockafellar and Wets [7].
2.1 Definitions, Notations and Some Basic Results
Throughout this paper, \( {\mathbb {R}}^n\) denotes an Euclidean space, that is, a real vector space with the canonical inner product \(\langle x,y\rangle =\sum \nolimits _{i=1}^{n} x_iy_i\) and the norm given by \(||x||={\sqrt{\langle x, x\rangle }}\).
Given a function \(f :{\mathbb {R}}^n\longrightarrow {\mathbb {R}}\ \cup \ \left\{ +\infty \right\} \), we denote the effective domain of f by \(\text {dom}(f)= \left\{ x \in {\mathbb {R}}^n: f(x) < + \infty \right\} \). If \(\text {dom}(f) \ne \emptyset \), f is called proper. The function f is called coercive if \(\lim \nolimits _{\left\| x\right\| \rightarrow +\infty }f(x)= +\infty \). We denote by \(\arg \min \left\{ f(x): x \in {\mathbb {R}}^n\right\} \) the set of minimizers of f and by \({\bar{f}} \) the optimal value of the problem \(\min \left\{ f(x): x \in {\mathbb {R}}^n\right\} ,\) if it exists. The function f is lower semicontinuous at \({\bar{x}}\) if for every sequence \(\left\{ x^l\right\} _{l \in {\mathbb {N}}} \) such that \(\lim \limits _{l \rightarrow +\infty }x^l = {\bar{x}}\) we obtain that \(f({\bar{x}}) \le \liminf \limits _{l \rightarrow +\infty }f(x^l)\).
We say that f is differentiable at \({{\bar{x}}}\), if there exists \(v\in {\mathbb {R}}^n\) such that
$$\begin{aligned} f(x) = f({\bar{x}}) + \left\langle v, x-{\bar{x}}\right\rangle + o(\left\| x-{\bar{x}}\right\| ), \quad \text {where } \lim \limits _{x \rightarrow {\bar{x}}}\frac{o(\left\| x-{\bar{x}}\right\| )}{\left\| x-{\bar{x}}\right\| } =0. \end{aligned}$$
In this case, \(v=\nabla f({\bar{x}})\), the gradient of f at \({\bar{x}}\).
The next result ensures that the set of minimizers of a function, under some assumptions, is nonempty.
Proposition 2.1
(Rockafellar and Wets [7], Theorem 1.9) Suppose that \(f:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} \) is proper, lower semicontinuous and coercive. Then, the optimal value \( {\bar{f}}\) is finite and the set \(\arg \min \left\{ f(x): x \in {\mathbb {R}}^n\right\} \) is nonempty and compact.
Definition 2.1
Let \(D \subset {\mathbb {R}}^n\) be a convex set and \({\bar{x}} \in D\). The normal cone to D at \({\bar{x}} \in D\) is given by
$$\begin{aligned} N_D({\bar{x}}) := \left\{ v \in {\mathbb {R}}^n : \left\langle v, x - {\bar{x}}\right\rangle \le 0, \ \forall \ x \in D\right\} . \end{aligned}$$
It follows an important result that involves sequences of nonnegative numbers, which will be useful in Sect. 5.
Lemma 2.1
(Polyak [8], Lemma 2.2.2) Let \(\{w_k\}\), \(\{p_k\}\) and \(\{q_k\}\) be sequences of nonnegative real numbers. If
$$\begin{aligned} w_{k+1} \le (1+p_k)w_k + q_k, \quad \sum \limits _{k=0}^{\infty }p_k< +\infty , \quad \sum \limits _{k=0}^{\infty }q_k < +\infty , \end{aligned}$$
then the sequence \(\{w_k\}\) is convergent.
2.2 Multiobjective Optimization
In this subsection, we present some properties and notations on multiobjective optimization; see, for example, the books of Miettinen [9] and Luc [10] for more details.
Considering the cone \({\mathbb {R}}^m_+ = \{ y\in {\mathbb {R}}^m : y_i\ge 0, \forall \ i = 1,\ldots , m \},\) we define in \({\mathbb {R}}^m\) the following partial order \(\preceq \) induced by \({\mathbb {R}}^m_+\): given \(y,y'\in {\mathbb {R}}^m\), then \(y\ \preceq \ y'\) if, and only if, \(y'-y\in {\mathbb {R}}^m_+\) , which is equivalent to \( y_i \le \ y'_i,\) for all \( i= 1,2,\ldots ,m \) .
Given \( {\mathbb {R}}^m_{++}= \{ y\in {\mathbb {R}}^m : y_i>0, \forall \ i = 1,\ldots , m \},\) we may define another relation \(\prec \) induced by \({\mathbb {R}}^m_{++}\): \(y\ \prec \ y'\) if, and only if, \(y'-y\in {\mathbb {R}}^m_{++}\), which is equivalent to \( y_i < \ y'_i\) for all \( i= 1,2,\ldots ,m\).
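As a concrete illustration, the two componentwise relations can be sketched in a few lines of Python (the function names `leq` and `lt` are ours, not from the paper):

```python
import numpy as np

def leq(y, yp):
    """Partial order induced by R^m_+: y ⪯ y' iff y' - y ∈ R^m_+,
    i.e., y_i <= y'_i for every component i."""
    return bool(np.all(np.asarray(yp) - np.asarray(y) >= 0))

def lt(y, yp):
    """Strict relation induced by R^m_++: y ≺ y' iff y' - y ∈ R^m_++,
    i.e., y_i < y'_i for every component i."""
    return bool(np.all(np.asarray(yp) - np.asarray(y) > 0))
```

Note that \(\prec \) is strictly stronger than \(\preceq \): equality in a single component breaks \(\prec \) but not \(\preceq \).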
Let us consider the multiobjective optimization problem (MOP):
$$\begin{aligned} \min \left\{ G(x): x\in {\mathbb {R}}^n \right\} , \end{aligned}$$(1)
where \(G = \left( G_1, G_2,\ldots , G_m\right) \) with \(G(x)=+\infty _{{\mathbb {R}}^m_+}\) (in the sense of the paper of Bolintinéanu [11]) if \(x\notin \text {dom}(G):=\bigcap _{i=1}^m\text {dom}(G_i)\ne \emptyset \) and each \(G_i:{\mathbb {R}}^n \longrightarrow {\mathbb {R}}\cup \left\{ +\infty \right\} , i=1,\ldots ,m\).
Along the paper, we use the notation \(G: {\mathbb {R}}^n\longrightarrow {\mathbb {R}}^m\) when \(\text {dom}(G_i)={\mathbb {R}}^n,\) for each \(i=1,\ldots ,m,\) and we say that G is continuous (differentiable, continuously differentiable) if each \(G_i\) is continuous (differentiable, continuously differentiable) for each \(i=1,\ldots ,m\).
Definition 2.2
(Miettinen [9], Definition 2.2.1) A point \(x^* \in {\mathbb {R}}^n\) is a Pareto optimal point or Pareto efficient solution of the problem (1), if there does not exist \(x \in {\mathbb {R}}^n \) such that \( G_{i}(x) \le G_{i}(x^*)\), for all \(i \in \left\{ 1,\ldots ,m\right\} \), and \( G_{j}(x) < G_{j}(x^*)\), for at least one index \( j \in \left\{ 1,\ldots ,m\right\} .\)
Definition 2.3
(Miettinen [9], Definition 2.5.1) A point \(x^* \in {\mathbb {R}}^n\) is a weak Pareto efficient solution of the problem (1), if there does not exist \(x \in {\mathbb {R}}^n \) such that \( G_{i}(x) < G_{i}(x^*)\), for all \(i \in \left\{ 1,\ldots ,m\right\} \).
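Definitions 2.2 and 2.3 can be illustrated numerically on a finite set of candidate points: a point is discarded from the Pareto set if some other point dominates it, and from the weak Pareto set only if some other point is strictly better in every component. The following sketch (our illustration, with hypothetical objective values) computes both sets for the minimization case:

```python
import numpy as np

def pareto_front_indices(Y):
    """Indices i of Pareto efficient rows of Y (minimization): no other row j
    satisfies Y[j] <= Y[i] componentwise with at least one strict inequality."""
    n = len(Y)
    return [i for i in range(n)
            if not any(np.all(Y[j] <= Y[i]) and np.any(Y[j] < Y[i])
                       for j in range(n) if j != i)]

def weak_pareto_indices(Y):
    """Indices i of weak Pareto rows: no other row is strictly smaller
    in every component."""
    n = len(Y)
    return [i for i in range(n)
            if not any(np.all(Y[j] < Y[i]) for j in range(n) if j != i)]
```

On any finite sample, every Pareto efficient point is also weak Pareto efficient, which mirrors the inclusion between the two solution sets stated below.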
We denote by \(\arg \min \left\{ G(x):x\in {\mathbb {R}}^n \right\} \) and by \(\arg \min _w\left\{ G(x):x\in {\mathbb {R}}^n \right\} \) the set of Pareto efficient solutions and weak Pareto efficient solutions of the problem (1), respectively. It is easy to check that
$$\begin{aligned} \arg \min \left\{ G(x):x\in {\mathbb {R}}^n \right\} \subset \arg \min \nolimits _w\left\{ G(x):x\in {\mathbb {R}}^n \right\} . \end{aligned}$$
2.3 Pareto Critical Point and Descent Direction
Let \(G:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}^m\) be a differentiable function. Given \(x \in {\mathbb {R}}^n\), the Jacobian of G at x, denoted by JG(x), is a matrix of order \(m \times n\) whose entries are defined by \(\left( JG(x)\right) _{i,j} = \frac{\partial G_i}{\partial x_j}(x),\) where \(i=1,\ldots ,m\) and \(j=1,\ldots ,n\). We may represent it by
$$\begin{aligned} JG(x) = \left( \nabla G_1(x), \nabla G_2(x),\ldots , \nabla G_m(x)\right) ^{T}. \end{aligned}$$
The image of the Jacobian of G at x is denoted by
$$\begin{aligned} \text {Im}\left( JG(x)\right) = \left\{ JG(x)v = \left( \left\langle \nabla G_1(x),v\right\rangle ,\ldots ,\left\langle \nabla G_m(x),v\right\rangle \right) : v \in {\mathbb {R}}^n \right\} . \end{aligned}$$
A necessary, but not sufficient, first-order optimality condition for the problem (1) at \(x \in {\mathbb {R}}^n \), is
$$\begin{aligned} \text {Im}\left( JG(x)\right) \cap \left( -{\mathbb {R}}^m_{++}\right) = \emptyset . \end{aligned}$$(2)
Equivalently, \(\forall \ v \in {\mathbb {R}}^n\), there exists \(i_0 = i_0(v) \in \lbrace 1,\ldots ,m\rbrace \) such that
$$\begin{aligned} \left\langle \nabla G_{i_0}(x),v\right\rangle \ge 0. \end{aligned}$$
Definition 2.4
Let \(G:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}^m\) be a differentiable function. A point \(x^* \in {\mathbb {R}}^n\) satisfying (2) is called a Pareto critical point.
Following from the previous definition, if a point x is not a Pareto critical point, then there exists a direction \(v \in {\mathbb {R}}^ n\) satisfying
$$\begin{aligned} JG(x)v \in -{\mathbb {R}}^m_{++}, \end{aligned}$$
i.e., \(\langle \nabla G_i( x ) , v \rangle < 0, \ \forall \ i \in \lbrace 1,\ldots , m \rbrace \). As G is continuously differentiable, then
$$\begin{aligned} \lim \limits _{t \rightarrow 0}\frac{G_i(x + tv)- G_i(x)}{t} = \left\langle \nabla G_i(x),v\right\rangle < 0, \ \forall \ i \in \lbrace 1,\ldots , m \rbrace . \end{aligned}$$
This implies that v is a descent direction for each function \(G_i\), i.e., there exists \(\varepsilon > 0 \), such that
$$\begin{aligned} G_i(x + tv) < G_i(x), \ \forall \ t \in \left]0, \varepsilon \right], \ \forall \ i \in \lbrace 1,\ldots , m \rbrace . \end{aligned}$$
Therefore, v is a descent direction for G at x, i.e., there exists \( \varepsilon > 0 \) such that
$$\begin{aligned} G(x + tv) \prec G(x), \ \forall \ t \in \left]0, \varepsilon \right]. \end{aligned}$$
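The descent-direction test above is straightforward to check numerically. The following sketch uses a hypothetical smooth bi-objective of our own choosing (not from the paper): a direction v with \(JG(x)v \in -{\mathbb {R}}^m_{++}\) decreases every component of G for small step sizes:

```python
import numpy as np

# Hypothetical smooth bi-objective: G(x) = (x1^2 + x2^2, (x1 - 1)^2 + x2^2).
def G(x):
    return np.array([x[0]**2 + x[1]**2, (x[0] - 1.0)**2 + x[1]**2])

def JG(x):
    """Jacobian of G: rows are the gradients of G_1 and G_2."""
    return np.array([[2.0*x[0], 2.0*x[1]],
                     [2.0*(x[0] - 1.0), 2.0*x[1]]])

def is_descent_direction(x, v):
    """v is a descent direction for G at x when every entry of JG(x)v is negative,
    i.e., JG(x)v lies in -R^m_++."""
    return bool(np.all(JG(x) @ v < 0))
```

At \(x=(2,1)\), for instance, \(v=(-1,-1)\) gives \(JG(x)v=(-6,-4)\), so a small step along v strictly decreases both objectives.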
2.4 Scalar Representation
In this subsection, we present a useful technique in multiobjective optimization, where the original vector optimization problem is replaced by a family of scalar problems.
Let \(F=(F_1,F_2,\ldots ,F_m)\) be a map with \(F(x)=+\infty _{{\mathbb {R}}^m_{+}}\) (in the sense of the paper of Bolintinéanu [11]) if \(x\notin \text {dom}(F):=\bigcap _{i=1}^m\text {dom}(F_i)\ne \emptyset \) where each \(F_i:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}\cup \{+ \infty \}.\)
Definition 2.5
(Luc [10], Definition 2.1, page 86) A function \(f:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}\cup \{+ \infty \}\) is said to be a strict scalar representation of a map F when, given \(x,{\bar{x}}\in {\mathbb {R}}^n \):
$$\begin{aligned} F(x) \preceq F({\bar{x}}) \Longrightarrow f(x) \le f({\bar{x}}) \quad \text {and} \quad F(x) \prec F({\bar{x}}) \Longrightarrow f(x) < f({\bar{x}}). \end{aligned}$$
Furthermore, we say that f is a weak scalar representation of F if
$$\begin{aligned} F(x) \prec F({\bar{x}}) \Longrightarrow f(x) < f({\bar{x}}). \end{aligned}$$
Proposition 2.2
Let \(f:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}\cup \{+ \infty \}\) be a proper function. Then, f is a strict scalar representation of F if, and only if, there exists a strictly increasing function \(g:F\left( {\mathbb {R}}^n\right) \longrightarrow {\mathbb {R}}\) such that \(f = g \circ F.\)
Proof
See Luc [10] Proposition 2.3. \(\square \)
Proposition 2.3
Let \(f:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}\cup \{+ \infty \}\) be a weak scalar representation of a vector function \(F=(F_1,F_2,\ldots ,F_m),\) and \(\arg \min \left\{ f(x):x\in {\mathbb {R}}^n\right\} \) the set of minimizer points of f. Then, we have
$$\begin{aligned} \arg \min \left\{ f(x):x\in {\mathbb {R}}^n\right\} \subset \arg \min \nolimits _w\left\{ F(x):x\in {\mathbb {R}}^n\right\} . \end{aligned}$$
Proof
Let \({{\bar{x}}}\in \text {argmin}\left\{ f(x): x\in {\mathbb {R}}^n\right\} \) and suppose that \({{\bar{x}}}\) is not a weak Pareto efficient solution of the problem \(\min \{F(x):x\in {\mathbb {R}}^n\},\) then there exists \(x\in {\mathbb {R}}^n\) such that \(F_i(x)< F_i({{\bar{x}}}),\) for all \(i=1,2,\ldots ,m.\) From Definition 2.5 we have \(f(x)<f({{\bar{x}}}),\) which is a contradiction. Therefore, \({{\bar{x}}}\) is a weak Pareto efficient solution. \(\square \)
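A simple instance of Proposition 2.3: the function \(f(x)=\max _i F_i(x)\) is a weak scalar representation, since \(F_i(x) < F_i({\bar{x}})\) for all i forces \(\max _i F_i(x) < \max _i F_i({\bar{x}})\); hence any minimizer of f is weak Pareto efficient. The following Python sketch checks this on a grid for two illustrative objectives of our own choosing (not from the paper):

```python
import numpy as np

# Two convex (hence quasiconvex) objectives on the real line.
F1 = lambda x: (x - 1.0)**2
F2 = lambda x: (x + 1.0)**2
f = lambda x: max(F1(x), F2(x))  # f = max_i F_i: a weak scalar representation

# Minimize the scalar problem over a fine grid.
xs = np.linspace(-2.0, 2.0, 4001)
xbar = xs[np.argmin([f(x) for x in xs])]
```

No grid point improves both objectives strictly relative to `xbar`, i.e., the scalar minimizer is weak Pareto efficient on the grid, as Proposition 2.3 predicts.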
2.5 Quasiconvex and Convex Functions
In this subsection, we present some definitions of quasiconvex functions and quasiconvex multiobjective functions. These definitions and some properties can be also found in Bazaraa et al. [12], Luc [10], Mangasarian [13] and references therein.
Definition 2.6
Let \(f:{\mathbb {R}}^n\longrightarrow {\mathbb {R}} \cup \{+ \infty \}\) be a proper function. Then, f is called quasiconvex, if for all \(x,y\in {\mathbb {R}}^n\), and for all \( t \in \left[ 0,1\right] \), it holds that
$$\begin{aligned} f(tx + (1-t)y) \le \max \left\{ f(x),f(y)\right\} . \end{aligned}$$
Definition 2.7
Let \(f:{\mathbb {R}}^n\longrightarrow {\mathbb {R}} \cup \{+ \infty \}\) be a proper function. Then, f is called convex, if for all \(x,y\in {\mathbb {R}}^n\), and for all \( t \in \left[ 0,1\right] \), it holds that
$$\begin{aligned} f(tx + (1-t)y) \le tf(x) + (1-t)f(y). \end{aligned}$$
Observe that if f is a quasiconvex function, then \(\text {dom}(f)\) is a convex set. On the other hand, while a convex function can be characterized by the convexity of its epigraph, a quasiconvex function can be characterized by the convexity of its lower level sets: f is quasiconvex if, and only if, the set \(L_f(\alpha ) := \left\{ x \in {\mathbb {R}}^n : f(x) \le \alpha \right\} \) is convex for all \(\alpha \in {\mathbb {R}}\).
Definition 2.8
Let \(F_i:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}\cup \{+ \infty \},\)\(i=1,2,\ldots ,m,\) and \(F= (F_1,\ldots ,F_m),\) then F is \({\mathbb {R}}^m_+\)-quasiconvex (convex), if each component \(F_i\) is quasiconvex (convex).
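The gap between Definitions 2.6 and 2.7 can be seen on a one-line example of our own (not from the paper): \(f(x)=\sqrt{|x|}\) has convex lower level sets \([-\alpha ^2,\alpha ^2]\), so it is quasiconvex, yet the convexity inequality fails. A quick numerical sketch:

```python
import numpy as np

# f(x) = sqrt(|x|): quasiconvex on R (lower level sets are intervals),
# but not convex.
f = lambda x: np.sqrt(abs(x))

def quasiconvex_on_segment(f, x, y, ts):
    """Check the quasiconvexity inequality f(t x + (1-t) y) <= max{f(x), f(y)}
    for a sample of t in [0, 1]."""
    bound = max(f(x), f(y))
    return all(f(t*x + (1.0 - t)*y) <= bound + 1e-12 for t in ts)
```

Along any segment, the quasiconvexity bound holds, while the midpoint of \([0,1]\) already violates convexity: \(f(1/2)=\sqrt{1/2}>1/2=\tfrac{1}{2}f(0)+\tfrac{1}{2}f(1)\).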
2.6 Fréchet and Limiting Subdifferentials
Definition 2.9
Let \(f: {\mathbb {R}}^n \longrightarrow {\mathbb {R}} \cup \{ +\infty \}\) be a proper function.
-
(a)
For each \(x \in \text {dom}(f)\), the set of regular subgradients (also called Fréchet subdifferential) of f at x, denoted by \({\hat{\partial }}f(x)\), is the set of vectors \(v \in {\mathbb {R}}^n\) such that
$$\begin{aligned} f(y) \ge f(x) + \left\langle v,y-x\right\rangle + o(\left\| y - x\right\| ),\hbox { where }\lim \limits _{y \rightarrow x}\frac{o(\left\| y - x\right\| )}{\left\| y - x\right\| } =0. \end{aligned}$$
Or equivalently, \({\hat{\partial }}f (x) := \left\{ v \in {\mathbb {R}}^n : \liminf \limits _{y\ne x,\ y \rightarrow x} \dfrac{f(y)- f(x)- \langle v , y - x\rangle }{||y - x ||} \ge 0 \right\} \).
If \(x \notin \text {dom}(f)\), then \({\hat{\partial }}f(x) = \emptyset \).
-
(b)
The set of general subgradients (also called limiting subdifferential) f at \(x \in {\mathbb {R}}^n\), denoted by \(\partial f(x)\), is defined as follows:
$$\begin{aligned} \partial f(x) := \left\{ v \in {\mathbb {R}}^n : \exists \ x^l \rightarrow x, \ \ f(x^l) \rightarrow f(x), \ \ v^l \in {\hat{\partial }} f(x^l)\ \text {and}\ v^l \rightarrow v \right\} . \end{aligned}$$
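As a concrete check of Definition 2.9(a), for the convex function \(f(x)=|x|\) the regular subgradients at 0 are exactly the \(v\) with \(|v|\le 1\); for convex functions the o-term can be dropped and the defining inequality becomes global. A small numerical sketch (our example, not from the paper):

```python
import numpy as np

# For f(x) = |x|, a number v is a regular (Fréchet) subgradient at 0
# iff f(y) >= f(0) + v*y for all y, which holds exactly when |v| <= 1.
f = lambda y: abs(y)
ys = np.linspace(-2.0, 2.0, 401)

def is_subgradient_at_zero(v):
    """Test the subgradient inequality f(y) >= f(0) + v*(y - 0) on a grid."""
    return all(f(y) >= f(0.0) + v * y - 1e-12 for y in ys)
```

The test accepts every v in \([-1,1]\) and rejects values outside, recovering \({\hat{\partial }}f(0)=[-1,1]\) numerically.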
Proposition 2.4
(Fermat’s rule generalized) If a proper function \(f: {\mathbb {R}}^n \longrightarrow {\mathbb {R}} \cup \{+ \infty \}\) has a local minimum at \({\bar{x}} \in \text {dom}(f)\), then \(0\in {\hat{\partial }} f\left( {\bar{x}}\right) \).
Proof
See Rockafellar and Wets [7] Theorem 10.1. \(\square \)
Proposition 2.5
Let \(f: {\mathbb {R}}^n \longrightarrow {\mathbb {R}} \cup \{+ \infty \}\) be a proper function. Then, the following properties are true
-
(i)
\({\hat{\partial }}f(x) \subset \partial f(x)\), for all \(x \in {\mathbb {R}}^n\).
-
(ii)
If f is differentiable at \({\bar{x}}\), then \({\hat{\partial }}f({\bar{x}}) = \{\nabla f ({\bar{x}})\}\), so \(\nabla f ({\bar{x}})\in \partial f({\bar{x}})\).
-
(iii)
If f is continuously differentiable in a neighborhood of x, then
\({\hat{\partial }}f(x) = \partial f(x) = \{\nabla f (x)\}\).
-
(iv)
If \( g = f + h \) with f finite at \({\bar{x}}\) and h is continuously differentiable in a neighborhood of \({\bar{x}}\), then \({\hat{\partial }}g({\bar{x}}) = {\hat{\partial }}f({\bar{x}}) + \nabla h({\bar{x}})\) and \(\partial g({\bar{x}}) = \partial f({\bar{x}}) + \nabla h({\bar{x}})\).
Proof
See Rockafellar and Wets [7] Exercise 8.8, page 304. \(\square \)
2.7 \(\varepsilon \)-Subdifferential
We present some important concepts and results on \(\varepsilon \)-subdifferential. The theory of these facts can be found, for example, in Jofre et al. [14] and Rockafellar and Wets [7].
Definition 2.10
Let \(f: {\mathbb {R}}^n \longrightarrow {\mathbb {R}} \cup \{+ \infty \}\) be a proper lower semicontinuous function, and let \(\varepsilon \) be an arbitrary nonnegative real number. The Fréchet \(\varepsilon \)-subdifferential of f at \(x \in \text {dom}(f)\) is defined by
$$\begin{aligned} {\hat{\partial }}_{\varepsilon }f (x) := \left\{ v \in {\mathbb {R}}^n : \liminf \limits _{y\ne x,\ y \rightarrow x} \dfrac{f(y)- f(x)- \langle v , y - x\rangle }{\left\| y - x \right\| } \ge -\varepsilon \right\} . \end{aligned}$$(3)
Remark 2.1
When \(\varepsilon = 0\), (3) reduces to the well-known Fréchet subdifferential of Definition 2.9, which is denoted by \({\hat{\partial }}f(x)\). More precisely,
$$\begin{aligned} {\hat{\partial }}f (x) + \varepsilon B \subseteq {\hat{\partial }}_{\varepsilon }f (x), \end{aligned}$$
where B is the closed unit ball in \({\mathbb {R}}^n\) centered at zero. Therefore,
$$\begin{aligned} {\hat{\partial }}f (x) \subseteq {\hat{\partial }}_{\varepsilon }f (x). \end{aligned}$$
From Definition 5.1 of Treiman [15], \(x^* \in {\hat{\partial }}_{\varepsilon }f (x)\) if, and only if, for each \(\eta > 0\), there exists \(\delta > 0\) such that
$$\begin{aligned} \left\langle x^*, y - x\right\rangle \le f(y) - f(x) + (\varepsilon + \eta )\left\| y - x\right\| , \ \forall \ y \in B(x,\delta ). \end{aligned}$$
We now define a new kind of approximate subdifferential.
Definition 2.11
The limiting Fréchet \(\varepsilon \)-subdifferential of f at \(x \in \text {dom} (f)\) is defined by
$$\begin{aligned} \partial _{\varepsilon }f (x) := \left\{ x^* \in {\mathbb {R}}^n : \exists \ x_l \rightarrow x, \ f(x_l) \rightarrow f(x), \ x^*_l \rightarrow x^*, \ \text {with}\ x^*_l \in {\hat{\partial }}_{\varepsilon }f (x_l) \right\} . \end{aligned}$$
In the case where f is continuously differentiable, the limiting Fréchet \(\varepsilon \)-subdifferential takes a very simple form, according to the following proposition.
Proposition 2.6
Let \(f: {\mathbb {R}}^n \longrightarrow {\mathbb {R}}\) be a continuously differentiable function at x with derivative \(\nabla f (x)\). Then,
$$\begin{aligned} \partial _{\varepsilon }f (x) = \left\{ \nabla f(x)\right\} + \varepsilon B, \end{aligned}$$
where B is the closed unit ball in \({\mathbb {R}}^n\) centered at zero.
Proof
See Jofré et al. [14] Proposition 2.8. \(\square \)
2.8 Fejér Convergence
Definition 2.12
A sequence \(\left\{ y_k\right\} \subset {\mathbb {R}}^n\) is said to be Fejér convergent to a set \(U\subseteq {\mathbb {R}}^n\) if, \(\left\| y_{k+1} - u \right\| \le \left\| y_k - u\right\| , \forall \ k \in {\mathbb {N}},\ \forall \ u \in U\).
The following result on Fejér convergence is well known.
Lemma 2.2
If \(\left\{ y_k\right\} \subset {\mathbb {R}}^n\) is Fejér convergent to some set \(U\ne \emptyset \), then:
-
(i)
The sequence \(\left\{ y_k\right\} \) is bounded.
-
(ii)
If an accumulation point y of \(\left\{ y_k\right\} \) belongs to U, then \(\lim \limits _{k\rightarrow +\infty }y_k = y\).
Proof
See Schott [16] Theorem 2.7. \(\square \)
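Definition 2.12 is easy to test numerically for a finite candidate set U. The following sketch (our example sequence, not from the paper) checks the Fejér inequality for a geometrically decaying sequence, which is Fejér convergent to \(U=\{0\}\):

```python
import numpy as np

def is_fejer(seq, U, tol=1e-12):
    """Check ||y_{k+1} - u|| <= ||y_k - u|| for every consecutive pair
    of the sequence and every u in the finite set U."""
    return all(np.linalg.norm(seq[k + 1] - u) <= np.linalg.norm(seq[k] - u) + tol
               for k in range(len(seq) - 1) for u in U)

# y_k = y_0 / 2^k is Fejér convergent to U = {0}: distances to 0 halve each step.
seq = [np.array([3.0, 4.0]) / 2**k for k in range(30)]
U = [np.zeros(2)]
```

Consistent with Lemma 2.2, the sequence is bounded and, since its accumulation point 0 belongs to U, it converges to that point.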
3 The Problem
We are interested in solving the multiobjective optimization problem (MOP):
$$\begin{aligned} \min \left\{ F(x): x\in {\mathbb {R}}^n \right\} , \end{aligned}$$(5)
where \(F=\left( F_1, F_2,\ldots , F_m\right) \) and each \(F_i: {\mathbb {R}}^n\longrightarrow {\mathbb {R}}\cup \{+ \infty \}\) is an extended real-valued function satisfying the following assumptions:
- \({{{({\mathbf{C}}_{\mathbf{1.1}})}}}\): Each \(F_i:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}\cup \{+ \infty \}\), \(i=1,\ldots ,m\), is a proper lower semicontinuous function and \(\text {dom}(F):=\bigcap _{i=1}^m\text {dom}(F_i)\ne \emptyset \).
- \({{{({\mathbf{C}}_{\mathbf{1.2}})}}}\): \(0 \preceq F(x), \ \forall x\in {\mathbb {R}}^n,\) that is, \(F_i(x)\ge 0\) for each \(i=1,2,\ldots ,m.\)
- \({{{({\mathbf{C}}_{\mathbf{2}})}}}\): F is \({\mathbb {R}}^m_+\)-quasiconvex, that is, each \(F_i\) is quasiconvex.
3.1 A Quasiconvex Model in Demand Theory
Let n be a finite number of consumer goods. A consumer is an agent who must choose how much of each good to consume. An ordered set of numbers representing the amounts consumed of each good is called a consumption vector, denoted by \( x = (x_1, x_2, \ldots , x_n) \), where \( x_i \), \( i = 1,2,\ldots , n \), is the quantity consumed of good i. We denote by X the feasible set of these vectors, which will be called the consumption set; in economic applications, we usually have \( X \subset {\mathbb {R}} ^ n_ + \).
In the classical approach of demand theory, the analysis of consumer behavior starts by specifying a preference relation over the set X, denoted by \(\succeq \). The notation “\(x \succeq y\)” means that “x is at least as good as y” or “y is not preferred to x.” This preference relation \( \succeq \) is assumed to be rational, i.e., complete, because the consumer is able to order all possible combinations of goods, and transitive, because consumer preferences are consistent: if the consumer prefers \({\bar{x}}\) to \({\bar{y}} \) and \({\bar{y}}\) to \({\bar{z}}\), then he prefers \({\bar{x}}\) to \({\bar{z}} \) (see Definition 3.B.1 of Mas-Colell et al. [17]). We say that \(\succeq \) is a convex preference relation on a convex set X if for all \(x,y,z\in X\) and \(\lambda \in [0,1]\) satisfying \(z\succeq x\) and \(z\succeq y\) we obtain \(z \succeq \lambda x+(1-\lambda ) y.\)
The quasiconcave model for a convex preference relation \(\succeq \) is
$$\begin{aligned} \max \left\{ \mu (x): x\in X \right\} , \end{aligned}$$
where \(\mu \) is the utility function representing the preference; see Papa Quiroz et al. [18] for more details. Now consider multiple criteria, that is, m convex preference relations denoted by \(\succeq _i\), \(i=1,2,\ldots ,m.\) Suppose that for each preference \(\succeq _i\) there exists a utility function \( \mu _i\), respectively; then, the problem of maximizing the consumer preferences on X is equivalent to solving the quasiconcave multiobjective optimization problem
$$\begin{aligned} \max \left\{ \left( \mu _1(x), \mu _2(x),\ldots ,\mu _m(x)\right) : x\in X \right\} . \end{aligned}$$
Since, in general, there is no single point that maximizes all the functions simultaneously, the concept of optimality is established in terms of Pareto optimality or efficiency. Taking \(F = (- \mu _1, - \mu _2, \ldots , - \mu _m) \), we obtain a minimization problem with a quasiconvex multiobjective function, since each component function is quasiconvex.
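To make the sign change concrete, take a hypothetical Cobb-Douglas utility \(\mu (x)=x_1^{1/2}x_2^{1/2}\) (our illustration, not an example from the paper): \(\mu \) is quasiconcave on \({\mathbb {R}}^2_{+}\), so \(-\mu \) satisfies the quasiconvexity inequality of Definition 2.6 along segments of the consumption set. A quick numerical check:

```python
import numpy as np

# Hypothetical Cobb-Douglas utility: quasiconcave on the positive orthant,
# so neg_mu = -mu is quasiconvex there.
mu = lambda x: x[0]**0.5 * x[1]**0.5
neg_mu = lambda x: -mu(x)

def quasiconvex_sample(f, rng, trials=500):
    """Sample random segments in the positive orthant and test
    f(t x + (1-t) y) <= max{f(x), f(y)}."""
    for _ in range(trials):
        x, y = rng.uniform(0.01, 10.0, size=(2, 2))
        t = float(rng.uniform())
        if f(t*x + (1.0 - t)*y) > max(f(x), f(y)) + 1e-10:
            return False
    return True
```

Minimizing \((-\mu _1,\ldots ,-\mu _m)\) is therefore an instance of problem (5) under assumption \({({\mathbf{C}}_{\mathbf{2}})}\) (after a suitable translation if nonnegativity \({({\mathbf{C}}_{\mathbf{1.2}})}\) is required).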
4 Exact Algorithm
In this section, to solve the problem (5), we propose a linear scalarization proximal point algorithm with quadratic regularization using the Fréchet subdifferential, which we call the SPP algorithm.
We consider two sequences: the proximal parameters \(\left\{ \alpha _k\right\} ,\) with \(\alpha _k>0\) for each k, and a sequence \( \left\{ z_k \right\} =\{(z_k^1,z_k^2,\ldots ,z_k^m)\}\subset {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \) with \(\left\| z_k\right\| = 1\).
SPP Algorithm
-
Initialization: Choose an arbitrary starting point
$$\begin{aligned} x^0\in {\mathbb {R}}^n \end{aligned}$$(6) -
Main Steps: Given \(x^k,\) find \(x^{k+1}\in {\varOmega }_k=\left\{ x\in {\mathbb {R}}^n: F(x) \preceq F(x^k)\right\} \) such that
$$\begin{aligned} 0 \in {\hat{\partial }}\left( \left\langle F(.), z_k\right\rangle + \dfrac{\alpha _k}{2} \Vert \ .\ - x^k \Vert ^2 + \delta _{{\varOmega }_k}(.) \right) (x^{k+1}), \end{aligned}$$(7)where \(\delta _{{\varOmega }_k}\) is the indicator function of \({\varOmega }_k\).
-
Stop criterion: If \(x^{k+1}=x^{k}\), then stop. Otherwise, do \(k \leftarrow k + 1 \) and return to Main Steps.
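The main step can be sketched numerically for smooth objectives, where the inclusion (7) is satisfied by a minimizer of the regularized scalar subproblem over \({\varOmega }_k\). The following Python sketch is our illustration, not the paper's implementation: the bi-objective, the constant weights \(z_k=(1/2,1/2)\) and the parameters \(\alpha _k=1/(k+1)\) are assumptions, and the constrained subproblem is solved with SciPy's SLSQP:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative smooth convex (hence quasiconvex) bi-objective.
def F(x):
    return np.array([(x[0] - 1.0)**2 + x[1]**2,
                     (x[0] + 1.0)**2 + x[1]**2])

def spp(x0, iters=40, z=np.array([0.5, 0.5])):
    """Sketch of the SPP iteration: minimize <F(.), z_k> + (alpha_k/2)||. - x^k||^2
    over Omega_k = {x : F(x) <= F(x^k)} (componentwise), with alpha_k -> 0."""
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        alpha = 1.0 / (k + 1)
        Fx = F(x)
        obj = lambda y: float(z @ F(y)) + 0.5 * alpha * float(np.sum((y - x)**2))
        cons = [{'type': 'ineq', 'fun': (lambda y, i=i: Fx[i] - F(y)[i])}
                for i in range(2)]  # membership in Omega_k
        res = minimize(obj, x, constraints=cons)
        if np.linalg.norm(res.x - x) < 1e-9:  # stop criterion x^{k+1} = x^k
            break
        x = res.x
    return x
```

Since each iterate lies in \({\varOmega }_k\), both objective values are nonincreasing along the iterations; with \(\alpha _k \rightarrow 0\), the limit is expected to approach a weak Pareto efficient solution, which for this bi-objective lies on the segment \([-1,1]\times \{0\}\).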
4.1 Existence of the Iterates
Theorem 4.1
Assume that assumptions \({{{({\mathbf{C}}_{\mathbf{1.1}})}}}\) and \({{{({\mathbf{C}}_{\mathbf{1.2}})}}}\) are satisfied, then the sequence \(\left\{ x^k\right\} \), generated by the \({\mathbf{SPP}}\) algorithm, is well defined.
Proof
Let \(x^0 \in {\mathbb {R}}^n \) be an arbitrary point given in the initialization step. Given \(x^k\), define \(\varphi _k(x)=\left\langle F(x), z_k\right\rangle + \frac{\alpha _k}{2}\left\| x - x^k\right\| ^2 +\delta _{{\varOmega }_k}(x)\), where \(\delta _{{\varOmega }_k}(.)\) is the indicator function of \({{\varOmega }_k}\). Then, we have that \(\min \{\varphi _k(x): x \in {\mathbb {R}}^n\}\) is equivalent to \(\min \{\left\langle F(x), z_k\right\rangle + \frac{\alpha _k}{2}\left\| x - x^k\right\| ^2: x \in {\varOmega }_k\}\). As \(\varphi _k\) is lower semicontinuous and coercive, using Proposition 2.1, we obtain that there exists \(x^{k+1} \in {\mathbb {R}}^n\), which is a global minimizer of \(\varphi _k.\) From Proposition 2.4, \(x^{k+1}\) satisfies:
$$\begin{aligned} 0 \in {\hat{\partial }}\varphi _k (x^{k+1}) = {\hat{\partial }}\left( \left\langle F(.), z_k\right\rangle + \dfrac{\alpha _k}{2} \Vert \ .\ - x^k \Vert ^2 + \delta _{{\varOmega }_k}(.) \right) (x^{k+1}), \end{aligned}$$
which is (7).
\(\square \)
Remark 4.1
We are interested in the asymptotic convergence of the SPP algorithm, so we assume that \(x^k\ne x^{k+1}\) for all k. If \(x^k=x^{k+1}\) for some k, then from the SPP algorithm we have that
$$\begin{aligned} 0 \in {\hat{\partial }}\left( \left\langle F(.), z_k\right\rangle + \delta _{{\varOmega }_k}(.) \right) (x^{k+1}), \end{aligned}$$
that is, \(x^{k+1}\) is a critical point of the optimization problem \(\min \{\left\langle F(x), z_k\right\rangle : x\in {\varOmega }_k\}.\) Observe that the function \(\left\langle F(.), z_k\right\rangle \) is a strict scalar representation of the map F, so if, furthermore, \(x^{k+1}\) is a minimizer of that optimization problem, then from Proposition 2.3, \(x^{k+1}\) is a weak Pareto optimal point of the original problem (5).
4.2 Fejér Convergence Property
To obtain some desirable properties it is necessary to assume the following assumption on the function F and the initial point \(x^0\):
- \({{({\mathbf{C}}_{\mathbf{3}})}}\) :
-
The set \(\left( F(x^0) - {\mathbb {R}}^m_+\right) \cap F({\mathbb {R}}^n)\) is \({\mathbb {R}}^m_+\)-complete, meaning that for each sequence \(\left\{ a_k\right\} \subset {\mathbb {R}}^n\), with \(a_0 = x^0\), such that \(F(a_{k+1}) \preceq F(a_k)\), there exists \( a \in {\mathbb {R}}^n\) such that \(F(a)\preceq F(a_k), \ \forall \ k \in {\mathbb {N}}\).
Remark 4.2
The assumption \( {{({\mathbf{C}}_{\mathbf{3}})}}\) is used in several works involving the proximal point method for convex functions; see Bonnel et al. [19], Ceng and Yao [20], and Villacorta and Oliveira [21].
Proposition 4.1
Assume that assumptions \({{{({\mathbf{C}}_{\mathbf{1.1}})}}}\) and \({{{({\mathbf{C}}_{{\mathbf{2}}})}}}\) are satisfied. Let \({\varOmega } \subset {\mathbb {R}}^n\) be a closed and convex set. If \(x\in \text {dom}(F)\cap {\varOmega }\), \(g \in {\hat{\partial }}\left( \left\langle F(.), z\right\rangle + \delta _{{\varOmega }}(.) \right) (x)\) with \(z \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \), and \(y \in {\varOmega }\) satisfies \(F(y) \preceq F(x)\), then \(\left\langle g , y - x\right\rangle \le 0\).
Proof
Let \(t \in ] 0, 1]\), then from the \({\mathbb {R}}^m_+\)-quasiconvexity of F and the assumption that \(F(y) \preceq F(x)\), we have: \(F_i(ty + (1-t)x) \le \text {max}\left\{ F_i(x), F_i(y)\right\} = F_i(x),\ \forall \ i \in \lbrace 1,\ldots m\rbrace \). It follows that for each \(z \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \), we have
$$\begin{aligned} \left\langle F(x + t(y-x)), z\right\rangle \le \left\langle F(x), z\right\rangle . \end{aligned}$$
As \(g \in {\hat{\partial }}\left( \left\langle F(.), z\right\rangle + \delta _{{\varOmega }}(.) \right) (x)\) and \(x + t(y-x) \in {\varOmega }\), we obtain
$$\begin{aligned} \left\langle F(x + t(y-x)), z\right\rangle \ge \left\langle F(x), z\right\rangle + t\left\langle g , y - x\right\rangle + o(t\left\| y - x\right\| ). \end{aligned}$$
On the other hand, we have \(\lim \limits _{t \rightarrow 0}\frac{o(t\left\| y - x\right\| )}{t\left\| y - x\right\| }= 0\). Combining the two previous inequalities,
$$\begin{aligned} t\left\langle g , y - x\right\rangle + o(t\left\| y - x\right\| ) \le 0. \end{aligned}$$(10)
Thus, dividing (10) by t and taking \(t \rightarrow 0\), we obtain the desired result. \(\square \)
Under assumptions \({{{({\mathbf{C}}_{\mathbf{1.1}})}}},\) \({{{({\mathbf{C}}_{\mathbf{1.2}})}}}\) and \({{{({\mathbf{C}}_{\mathbf{3}})}}}\), we obtain that the set
$$\begin{aligned} E := \left\{ x \in {\mathbb {R}}^n : F(x) \preceq F(x^k), \ \forall \ k \in {\mathbb {N}}\right\} \end{aligned}$$
is nonempty. Furthermore, if the assumption \({{({\mathbf{C}}_{\mathbf{2}})}}\) is satisfied, then E is a nonempty and convex set.
Proposition 4.2
Under assumptions \({{{({\mathbf{C}}_{\mathbf{1.1}})}}}\), \({{{({\mathbf{C}}_{\mathbf{1.2}})}}},\)\({{{({\mathbf{C}}_{\mathbf{2}})}}}\) and \({{{({\mathbf{C}}_{\mathbf{3}})}}}\), the sequence \(\left\{ x^k\right\} \), generated by the SPP algorithm, (6) and (7), is Fejér convergent to E.
Proof
Observe that \(\forall \ x \in {\mathbb {R}}^n\):
$$\begin{aligned} \left\| x^k - x\right\| ^2 = \left\| x^k - x^{k+1}\right\| ^2 + \left\| x^{k+1} - x\right\| ^2 + 2\left\langle x^k - x^{k+1} , x^{k+1} - x\right\rangle . \end{aligned}$$(11)
From Theorem 4.1, (7) and from Proposition 2.5, (iv), we have that there exists \(g_k \in {\hat{\partial }}\left( \left\langle F(.), z_k\right\rangle + \delta _{{\varOmega }_k}(.)\right) (x^{k+1})\) such that:
$$\begin{aligned} g_k = \alpha _k\left( x^k - x^{k+1}\right) . \end{aligned}$$(12)
Now take \(x^* \in E\), then \(x^* \in {\varOmega }_k\) for all \(k \in {\mathbb {N}}\). Combining (11) with \(x = x^*\) and (12), we obtain:
$$\begin{aligned} \left\| x^k - x^*\right\| ^2 = \left\| x^k - x^{k+1}\right\| ^2 + \left\| x^{k+1} - x^*\right\| ^2 + \frac{2}{\alpha _k}\left\langle g_k , x^{k+1} - x^*\right\rangle \ge \left\| x^k - x^{k+1}\right\| ^2 + \left\| x^{k+1} - x^*\right\| ^2, \end{aligned}$$(13)
where the last inequality follows from Proposition 4.1. Relation (13) implies that
$$\begin{aligned} \left\| x^{k+1} - x^k\right\| ^2 \le \left\| x^k - x^*\right\| ^2 - \left\| x^{k+1} - x^*\right\| ^2. \end{aligned}$$(14)
Thus,
$$\begin{aligned} \left\| x^{k+1} - x^*\right\| \le \left\| x^k - x^*\right\| , \ \forall \ k \in {\mathbb {N}}, \ \forall \ x^* \in E. \end{aligned}$$(15)
\(\square \)
Proposition 4.3
Under assumptions \({{{({\mathbf{C}}_{\mathbf{1.1}})}}}, {{{({\mathbf{C}}_{\mathbf{1.2}})}}},\) \({{{({\mathbf{C}}_{\mathbf{2}})}}}\) and \({{{({\mathbf{C}}_{\mathbf{3}})}}},\) the sequence \(\left\{ x^k\right\} \) generated by the SPP algorithm, (6) and (7), satisfies
$$\begin{aligned} \lim \limits _{k \rightarrow +\infty }\left\| x^{k+1} - x^k\right\| = 0. \end{aligned}$$
Proof
It follows from (15) that \( \forall \ x^* \in E\), \(\left\{ \left\| x^k - x^*\right\| \right\} \) is a nonnegative and nonincreasing sequence and hence is convergent. Thus, the right-hand side of (14) converges to 0, when \(k \rightarrow +\infty \), and the result is obtained. \(\square \)
4.3 Convergence of the Iterates
In this subsection, we prove the convergence of the proposed algorithm, when F is a nondifferentiable vector function.
Proposition 4.4
Under assumptions \({{{({\mathbf{C}}_{\mathbf{1.1}})}}}, {{({\mathbf{C}}_{\mathbf{1.2}})}},\)\({{{({\mathbf{C}}_{\mathbf{2}})}}}\) and \({{{({\mathbf{C}}_{\mathbf{3}})}}},\) the sequence \(\left\{ x^k\right\} \) generated by the SPP algorithm converges to some point of E.
Proof
From Proposition 4.2 and Lemma 2.2, (i), \(\left\{ x^k\right\} \) is bounded, and then, there exists a subsequence \(\left\{ x^{k_j}\right\} \) such that \(\lim \nolimits _{j\rightarrow +\infty }x^{k_j} = {\widehat{x}}\). Since \(\left\langle F(.), z\right\rangle \) is a lower semicontinuous function for all \(z \in {\mathbb {R}}^m_+ \backslash \left\{ 0\right\} \), then \(\left\langle F({{\widehat{x}}}),z\right\rangle \le \liminf \nolimits _{j\rightarrow +\infty }\left\langle F(x^{k_j}) , z\right\rangle \). On the other hand, \(x^{k+1} \in {\varOmega }_k\), so \(\left\langle F(x^{k+1}) , z\right\rangle \le \left\langle F(x^{k}) , z\right\rangle \). Furthermore, from assumption \({{{({\mathbf{C}}_{\mathbf{1.2}})}}}\) the function \(\left\langle F(.), z\right\rangle \) is bounded below for each \(z \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} ,\) and then, the sequence \(\left\{ \left\langle F(x^k),z\right\rangle \right\} \) is nonincreasing and bounded below, hence convergent. Therefore,
$$\begin{aligned} \left\langle F({\widehat{x}}),z\right\rangle \le \lim \limits _{k\rightarrow +\infty }\left\langle F(x^{k}),z\right\rangle \le \left\langle F(x^{k}),z\right\rangle , \ \forall \ k \in {\mathbb {N}}, \ \forall \ z \in {\mathbb {R}}^m_+ \backslash \left\{ 0\right\} . \end{aligned}$$
It follows that \(\left\langle F(x^k)-F({\widehat{x}}),z\right\rangle \ge 0, \forall \ k \in {\mathbb {N}}, \forall \ z \in {\mathbb {R}}^m_+ \backslash \left\{ 0\right\} \). We conclude that \(F(x^k) - F({\widehat{x}}) \in {\mathbb {R}}^m_+\), i.e., \(F({\widehat{x}})\preceq F(x^k), \forall \ k \in {\mathbb {N}}\). Therefore, \({\widehat{x}}\in E\) and by Lemma 2.2, (ii), we get the result. \(\square \)
4.4 Convergence to a Weak Pareto Efficient Solution
In this subsection, assuming that F is also continuous and that the sequence \(\{\alpha _k\}\) converges to zero, we obtain the convergence of the sequence \(\{x^k\}\) to a weak Pareto efficient solution.
Theorem 4.2
Let \(F:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}^m \) be a continuous vector function satisfying assumptions \(({\mathbf{C}}_{\mathbf{1.2}})\), \(({\mathbf{C}}_{\mathbf{2}})\) and \(({\mathbf{C}}_{\mathbf{3}})\). If \( \lim \limits _{k\rightarrow +\infty }\alpha _k= 0\) and the iterations are given in the form
$$\begin{aligned} x^{k+1}\in \text {arg min} \left\{ \left\langle F(x), z_k\right\rangle +\frac{\alpha _k}{2}\left\| x - x^k\right\| ^2 : x\in {\varOmega }_k\right\} , \end{aligned}$$
then the sequence \(\lbrace x^k\rbrace \) converges to a weak Pareto efficient solution of the problem (5).
Proof
Let \(x^{k+1}\in \text {arg min} \left\{ \left\langle F(x), z_k\right\rangle +\frac{\alpha _k}{2}\left\| x - x^k\right\| ^2 : x\in {\varOmega }_k\right\} \); this implies that
$$\begin{aligned} \left\langle F(x^{k+1}), z_k\right\rangle +\frac{\alpha _k}{2}\left\| x^{k+1} - x^k\right\| ^2 \le \left\langle F(x), z_k\right\rangle +\frac{\alpha _k}{2}\left\| x - x^k\right\| ^2, \end{aligned}$$(17)
\(\forall \ x \in {\varOmega }_k\). Since the sequence \(\left\{ x^k\right\} \) converges to some point of E, there exists \(x^* \in E\) such that \(\lim \nolimits _{k\rightarrow +\infty }x^{k}= x^*\). Since \(\left\{ z_k\right\} \) is bounded (because \(\left\| z_k\right\| = 1\)), there exists a subsequence \(\left\{ z_{k_l}\right\} _{l\in {\mathbb {N}}}\) such that \(\lim \limits _{l\rightarrow +\infty }z_{k_l}={\bar{z}}\), with \({\bar{z}} \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \) (as \(||z_{k_l}||=1\) and from the continuity of the norm ||.|| we have that \(||{\bar{z}}||=1\), so \({\bar{z}}\ne 0\)). Taking \(k=k_l\) in (17), we have
\(\forall \ x \in E.\) As
and from the continuity of F, taking \(l \rightarrow +\infty \) in (18), we obtain
Thus, \(x^* \in \text {arg min} \left\{ \left\langle F(x), {\overline{z}}\right\rangle : x \in E\right\} \). Now, \(\left\langle F(.), {\overline{z}}\right\rangle \), with \({\bar{z}} \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \), is a strict scalar representation of F and hence a weak scalar representation; therefore, by Proposition 2.3, we have that \(x^* \in \text {arg min}_w \left\{ F(x):x \in E \right\} \).
We shall prove that \(x^* \in \text {arg min}_w \left\{ F(x):x \in {\mathbb {R}}^n \right\} \). Suppose, by contradiction, that \(x^* \notin \text {arg min}_w \left\{ F(x):x \in {\mathbb {R}}^n \right\} \); then there exists \({\widetilde{x}} \in {\mathbb {R}}^n\) such that
So for \({\bar{z}} \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \), it follows that
Since \(x^* \in E\), from (20) we conclude that \({\widetilde{x}} \in E\). Therefore, from (19) and (21) we obtain a contradiction. \(\square \)
4.5 Convergence to a Generalized Critical Point
In this subsection, we prove the convergence of the sequence \(\{g^k\}\) to 0, where \(g^k \in {\hat{\partial }}\left( \left\langle F(.), z_k\right\rangle + \delta _{{\varOmega }_k}(.)\right) (x^{k+1})\). We refer to this result as convergence to a generalized critical point.
Theorem 4.3
Let \(F:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}^m \) be a continuous vector function satisfying assumptions \(({\mathbf{C}}_{\mathbf{1.2}})\), \(({\mathbf{C}}_{\mathbf{2}})\) and \(({\mathbf{C}}_{\mathbf{3}})\). If \(0< \alpha _k < {\tilde{\alpha }}\), then the sequence \(\lbrace x^k\rbrace \) generated by the SPP algorithm, (6) and (7), satisfies
$$\begin{aligned} \lim \limits _{k\rightarrow +\infty } g^k = 0, \end{aligned}$$
where \(g^k \in {\hat{\partial }}\left( \left\langle F(.), z_k\right\rangle + \delta _{{\varOmega }_k}(.)\right) (x^{k+1})\).
Proof
From Theorem 4.1, (7) and Proposition 2.5, (iv), there exists a vector \(g^k \in {\hat{\partial }}\left( \left\langle F(.), z_k\right\rangle + \delta _{{\varOmega }_k}(.)\right) (x^{k+1})\) such that \(g^k = \alpha _k(x^k - x^{k+1})\). Since \(0< \alpha _k < {\tilde{\alpha }}\), we obtain
$$\begin{aligned} \left\| g^k\right\| = \alpha _k\left\| x^{k+1} - x^k\right\| \le {\tilde{\alpha }}\left\| x^{k+1} - x^k\right\| . \end{aligned}$$(22)
From Proposition 4.3, \(\lim \limits _{k\rightarrow +\infty }\left\| x^{k+1} - x^k\right\| = 0\), and from (22) we have \(\lim \limits _{k\rightarrow +\infty }g^k = 0\). \(\square \)
Remark 4.3
If F is continuously differentiable, the SPP algorithm coincides with the algorithm proposed in [1], and so we obtain the global convergence of the sequence.
5 An Inexact Proximal Algorithm
In this section, we present an inexact version of the SPP algorithm, which we denote by ISPP algorithm, when F satisfies the following assumption:
- \(({\mathbf{C}}_{\mathbf{4}})\): \(F:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}^m\) is a continuously differentiable vector function on \({\mathbb {R}}^n\).
5.1 ISPP Algorithm
Let \(F: {\mathbb {R}}^n \rightarrow {\mathbb {R}}^m\) be a vector function satisfying the assumptions \(({\mathbf{C}}_{\mathbf{2}})\) and \(({\mathbf{C}}_{\mathbf{4}})\), and consider two sequences: the proximal parameters \(\left\{ \alpha _k\right\} \), with \(\alpha _k>0\) for each k, and a sequence \( \left\{ z_k \right\} =\{(z_k^1,z_k^2,\ldots ,z_k^m)\}\subset {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \) with \(\left\| z_k\right\| = 1\).
- Initialization: Choose an arbitrary starting point
$$\begin{aligned} x^0\in {\mathbb {R}}^n. \end{aligned}$$(23)
- Main Steps: Given \(x^k,\) define the function \({\varPsi }_k : {\mathbb {R}}^n \rightarrow {\mathbb {R}} \) by \( {\varPsi }_k (x) = \left\langle F(x), z_k\right\rangle \) and consider \({\varOmega }_k = \left\{ x\in {\mathbb {R}}^n: F(x) \preceq F(x^k)\right\} \). Find \(x^{k+1}\in {\varOmega }_k\) satisfying
$$\begin{aligned}&0 \in {\hat{\partial }}_{\varepsilon _k}{\varPsi }_k (x^{k+1}) + \alpha _k\left( x^{k+1} - x^k\right) + \nu _k, \end{aligned}$$(24)
$$\begin{aligned}&\displaystyle \sum _{k=1}^{\infty } \delta _k < +\infty , \end{aligned}$$(25)
where \(\nu _k\in {\mathcal {N}}_{{\varOmega }_k}(x^{k+1})\), \(\delta _k = \max \left\{ \dfrac{\varepsilon _k}{\alpha _k}, \dfrac{\Vert \nu _k\Vert }{\alpha _k}\right\} \), \(\varepsilon _k\ge 0\), and \({\hat{\partial }}_{\varepsilon _k}\) is the Fréchet \(\varepsilon _k\)-subdifferential.
- Stop criterion: If \(x^{k+1}=x^{k},\) then stop. Otherwise, set \(k \leftarrow k + 1 \) and return to Main Steps.
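For the smooth case \(({\mathbf{C}}_{\mathbf{4}})\), the inclusion (24) amounts to solving each proximal subproblem only approximately, with a summable inexactness sequence. The following Python sketch illustrates this idea under two simplifying assumptions of ours: the iterates stay in the interior of \({\varOmega }_k\) (so \(\nu _k = 0\) and the descent constraint is dropped), and the inexactness is modeled by a loose gradient tolerance \(\varepsilon _k = 0.1/k^2\). All names are ours, and the bi-objective test function is the one from Example 6.1.

```python
import numpy as np
from scipy.optimize import minimize

# Bi-objective test function (the one from Example 6.1):
def F(x):
    return np.array([-np.exp(-x[0]**2 - x[1]**2) + 1.0,
                     (x[0] - 1.0)**2 + (x[1] - 2.0)**2])

z = np.array([1.0, 1.0]) / np.sqrt(2.0)   # scalarization weights, ||z|| = 1
alpha = 1.0                               # proximal parameter alpha_k (constant here)

def ispp(x0, n_iter=30):
    """Inexact proximal iteration: each scalarized subproblem is solved only
    up to tolerance eps_k, with sum(eps_k / alpha) < infinity (cf. (25))."""
    xk = np.asarray(x0, dtype=float)
    for k in range(1, n_iter + 1):
        eps_k = 0.1 / k**2                # summable inexactness sequence
        psi = lambda x: F(x) @ z + 0.5 * alpha * np.sum((x - xk)**2)
        # Inexact solve of the proximal subproblem (nu_k = 0 assumed):
        xk = minimize(psi, xk, tol=eps_k).x
    return xk

x_hat = ispp([-1.0, 3.0])
```

On this example the iterates approach the point near \((0.993, 1.986)\) reported in Sect. 6, even though each subproblem is solved only inexactly.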
5.2 Existence of the Iterates
Proposition 5.1
Let \(F:{\mathbb {R}}^n\longrightarrow {\mathbb {R}}^m \) be a vector function satisfying the assumptions \(({\mathbf{C}}_{\mathbf{1.2}})\), \(({\mathbf{C}}_{\mathbf{2}})\) and \(({\mathbf{C}}_{\mathbf{4}})\). Then, the sequence \(\left\{ x^k\right\} \) generated by the ISPP algorithm is well defined.
Proof
Let \(x^0 \in {\mathbb {R}}^n \) be given by (23) and consider \(x^k\) fixed. Analogously to the proof of Theorem 4.1, there exists \(x^{k+1}\) satisfying
From Proposition 2.5, \(\ (iii)\) and \(\ (iv)\), we obtain
From Remark 2.1, \(x^{k+1}\) satisfies (24) with \(\varepsilon _k = 0. \)\(\square \)
Remark 5.1
As in the exact algorithm, we are interested in the asymptotic convergence of the ISPP algorithm, so we assume that \(x^k\ne x^{k+1}\) for all k. If \(x^k=x^{k+1}\) for some k, then from the algorithm we have that
From Proposition 2.5, (iv), it is equivalent to
that is, \(x^{k+1}\) is an approximate critical point of the optimization problem \(\min \{\left\langle F(.), z_k\right\rangle : x\in {\varOmega }_k\}\). Observe that the function \(\left\langle F(.), z_k\right\rangle \) is a strict scalar representation of the map F; so if, furthermore, \(x^{k+1}\) is a minimizer of that optimization problem, then from Proposition 2.3, \(x^{k+1}\) is a weak Pareto optimal point of the original problem (5).
The next proposition gives a necessary condition for quasiconvex differentiable vector functions.
Proposition 5.2
Let \(F :{\mathbb {R}}^n\longrightarrow {\mathbb {R}}^m \) be a differentiable quasiconvex vector function and \(x,z\in {\mathbb {R}}^n.\) If \(F\left( x\right) \preceq F\left( z\right) \), then \(\left\langle \nabla F_i(z) , x - z\right\rangle \le 0\), \(\forall \ i \in \left\{ 1,\ldots ,m\right\} \).
Proof
Since F is \({\mathbb {R}}^m_+\)-quasiconvex, each \(F_i\), \( i = 1,\ldots , m\), is quasiconvex. The result then follows from the classical characterization of differentiable quasiconvex scalar functions; see Mangasarian [13], p. 134. \(\square \)
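Proposition 5.2 can be checked numerically. The sketch below, with helper names of our own, samples random pairs \(x, z\), keeps those with \(F(x)\preceq F(z)\) componentwise, and verifies \(\left\langle \nabla F_i(z), x - z\right\rangle \le 0\) for the quasiconvex pair of Example 6.1, using the analytic Jacobian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Quasiconvex components from Example 6.1: F1 is a nondecreasing transform
# of the convex function ||x||^2, hence quasiconvex; F2 is convex.
def F(x):
    return np.array([-np.exp(-x[0]**2 - x[1]**2) + 1.0,
                     (x[0] - 1.0)**2 + (x[1] - 2.0)**2])

def JF(x):
    """Jacobian of F (rows are the gradients of F1 and F2)."""
    e = np.exp(-x[0]**2 - x[1]**2)
    return np.array([[2.0*x[0]*e, 2.0*x[1]*e],
                     [2.0*(x[0] - 1.0), 2.0*(x[1] - 2.0)]])

violations = 0
for _ in range(2000):
    x, zpt = rng.uniform(-3, 3, 2), rng.uniform(-3, 3, 2)
    if np.all(F(x) <= F(zpt)):                  # pairs with F(x) <= F(z)
        if np.any(JF(zpt) @ (x - zpt) > 1e-9):  # check <grad F_i(z), x - z> <= 0
            violations += 1
print(violations)                               # expected: 0
```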
Proposition 5.3
Let \(\left\{ x^k\right\} \) be a sequence generated by the ISPP algorithm. If the assumptions \(({\mathbf{C}}_{\mathbf{1.2}})\), \(({\mathbf{C}}_{\mathbf{2}})\), \(({\mathbf{C}}_{\mathbf{3}})\), \(({\mathbf{C}}_{\mathbf{4}})\) and (25) are satisfied, then for each \({\hat{x}} \in E\), \(\{ \left\| {\hat{x}} - x^k\right\| ^ 2 \}\) converges and \(\{x^k\}\) is bounded.
Proof
From (24), there exist \(g_k \in {\hat{\partial }}_{\varepsilon _k}{\varPsi }_k (x^{k+1})\) and \(\nu _k \in {\mathcal {N}}_{{\varOmega }_k}(x^{k+1})\) such that
It follows that for any \(x \in {\mathbb {R}}^n\), we obtain
Therefore,
Note that, for all \(x \in {\mathbb {R}}^n\) (see equation (17) of [1]):
On the other hand, let \({\varPsi }_k (x) = \left\langle F(x), z_k\right\rangle \), where \(F: {\mathbb {R}}^n \rightarrow {\mathbb {R}}^m\) is a continuously differentiable vector function; then \( {\varPsi }_k : {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) is continuously differentiable with gradient \(\nabla {\varPsi }_k \). From Proposition 2.6, we have
where B is the closed unit ball in \({\mathbb {R}}^n\) centered at zero. Furthermore, \({\hat{\partial }}_{\varepsilon _k} {\varPsi }_k(x) \subset \partial _{\varepsilon _k} {\varPsi }_k(x)\) (see (2.12) in Jofré et al. [14]). As \(g_k \in {\hat{\partial }}_{\varepsilon _k}{\varPsi }_k(x^{k+1})\), we have that \(g_k \in \partial _{\varepsilon _k}{\varPsi }_k(x^{k+1})\); then
with \(\Vert h_k \Vert \le 1 \). Now take \({\hat{x}}\in \text {E},\) then
From Proposition 5.2, we conclude that (30) becomes
Using the inequality \(\Vert {\hat{x}} - x^{k+1}\Vert ^2 + \frac{1}{4} \ge \Vert {\hat{x}} - x^{k+1}\Vert \), it follows that
Consider \(x = {\hat{x}}\) in (28); using (31), (32) and condition (25), we obtain
Thus,
The condition (25) guarantees that
where \(k_0\) is a sufficiently large natural number, and so,
combining with (33), results in
Since \(\displaystyle \sum \nolimits _{k=1}^{\infty } \delta _k < \infty \), applying Lemma 2.1 to inequality (34), we obtain the convergence of \(\{ \Vert {\hat{x}} - x^k\Vert ^2 \}\) for each \({\hat{x}} \in E\), which implies that there exists \(M \in {\mathbb {R}}_+\) such that \(\left\| {\hat{x}} - x^k \right\| \le M, \ \forall \ k \in {\mathbb {N}}\). Now, since \(\Vert x ^k\Vert \le \Vert x ^k - {\hat{x}} \Vert + \Vert {\hat{x}} \Vert \), we conclude that \(\{x^k\}\) is bounded, and so the set of accumulation points of this sequence is nonempty. \(\square \)
5.3 Convergence of the ISPP Algorithm
Proposition 5.4
(Convergence to some Point of E) If the assumptions \(({\mathbf{C}}_{\mathbf{1.2}})\), \(({\mathbf{C}}_{\mathbf{2}})\), \(({\mathbf{C}}_{\mathbf{3}})\) and \(({\mathbf{C}}_{\mathbf{4}})\) are satisfied, then the sequence \(\left\{ x^k\right\} \) generated by the ISPP algorithm converges to some point of the set E.
Proof
As \(\left\{ x^k\right\} \) is bounded, there exists a subsequence \(\left\{ x^{k_j}\right\} \) such that \(\lim \nolimits _{j\rightarrow +\infty }x^{k_j} = {\widehat{x}}\). Since F is continuous in \({\mathbb {R}}^n\), the function \(\left\langle F(.), z\right\rangle \) is also continuous in \({\mathbb {R}}^n\) for all \(z \in {\mathbb {R}}^m\), in particular for all \(z \in {\mathbb {R}}^m_+ \backslash \left\{ 0\right\} \), and \(\left\langle F({{\widehat{x}}}),z\right\rangle = \lim \nolimits _{j\rightarrow +\infty }\left\langle F(x^{k_j}) , z\right\rangle \). On the other hand, we have that \(F(x^{k+1}) \preceq F(x^{k})\), and so \(\left\langle F(x^{k+1}) , z\right\rangle \le \left\langle F(x^{k}) , z\right\rangle \) for all \(z \in {\mathbb {R}}^m_+ \backslash \left\{ 0\right\} \). Furthermore, the function \(\left\langle F(.), z\right\rangle \) is bounded below for each \(z \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \); hence, the sequence \(\left\{ \left\langle F(x^k),z\right\rangle \right\} \) is nonincreasing and bounded below, thus convergent. So,
$$\begin{aligned} \left\langle F({\widehat{x}}),z\right\rangle = \lim \limits _{k\rightarrow +\infty }\left\langle F(x^{k}),z\right\rangle \le \left\langle F(x^{k}),z\right\rangle , \quad \forall \ k \in {\mathbb {N}}, \ \forall \ z \in {\mathbb {R}}^m_+ \backslash \left\{ 0\right\} . \end{aligned}$$
It follows that \(F(x^k) - F({\widehat{x}}) \in {\mathbb {R}}^m_+\), i.e., \(F({\widehat{x}})\preceq F(x^k), \forall \ k \in {\mathbb {N}}\). Therefore, \({\widehat{x}}\in E\). Now, from Proposition 5.3, the sequence \(\{ \left\| {\widehat{x}} - x^k\right\| \}\) is convergent, and since \(\lim \limits _{j\rightarrow +\infty }\left\| x^{k_j} - {\widehat{x}}\right\| = 0\) (because \(\{x^{k_j}\}\) converges to \({\widehat{x}}\)), we conclude that \(\lim \limits _{k\rightarrow +\infty }\left\| x^{k} - {\widehat{x}}\right\| = 0\), i.e., \(\lim \nolimits _{k\rightarrow +\infty }x^{k} = {\widehat{x}}.\) \(\square \)
Theorem 5.1
Suppose that the assumptions \(({\mathbf{C}}_{\mathbf{1.2}})\), \(({\mathbf{C}}_{\mathbf{2}})\), \(({\mathbf{C}}_{\mathbf{3}})\) and \(({\mathbf{C}}_{\mathbf{4}})\) are satisfied. If \(0< \alpha _k < {\widetilde{\alpha }}\), then the sequence \(\lbrace x^k\rbrace \) generated by the ISPP algorithm, (23), (24) and (25), converges to a Pareto critical point of the problem (5).
Proof
From Proposition 5.4 there exists \({\widehat{x}}\in E\) such that \(\lim \limits _{k\rightarrow +\infty }x^{k}= {\widehat{x}}\). Furthermore, as the sequence \(\left\{ z_k\right\} \) is bounded, there exists a subsequence \( \left\{ z_{k_j}\right\} _{j \in {\mathbb {N}}}\) such that \(\lim \nolimits _{j\rightarrow +\infty }z_{k_j}= {\bar{z}}\), with \({\bar{z}} \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \). From (24) there exist \(g_{k_j} \in {\hat{\partial }}_{\varepsilon _{k_j}}{\varPsi }_{k_j} (x^{{k_j}+1})\), with \(g_{k_j} = \nabla {\varPsi }_{k_j}(x^{{k_j}+1}) + \varepsilon _{k_j} h_{k_j}\), \(\Vert h_{k_j} \Vert \le 1 \), and \(\nu _{k_j} \in {\mathcal {N}}_{{\varOmega }_{k_j}}(x^{{k_j}+1})\) such that
Since \(\nu _{k_j} \in {\mathcal {N}}_{{\varOmega }_{k_j}}(x^{{k_j}+1})\), then
Take \({\bar{x}} \in E\). By definition of E, \({\bar{x}} \in {\varOmega }_k\), for all \( k \in {\mathbb {N}}\), so \({\bar{x}} \in {\varOmega }_{k_j}\). Combining (36) with \(x = {\bar{x}}\) and (35), we have
where M is a constant such that \(\left\langle h_{k_j}\ ,\ {\bar{x}} - x^{{k_j}+1}\right\rangle \le M.\)
Observe that, \(\forall \ x \in {\mathbb {R}}^n\):
Now, from (26) with \(x = {\bar{x}} \in E\), and (31), we obtain
Thus, from (38), with \(x = {\bar{x}}\), we have
Since the sequence \(\left\{ \left\| {\bar{x}} - x^k\right\| \right\} \) is convergent and \(\displaystyle \sum \nolimits _{k=1}^{\infty }\delta _k < \infty \), from (39) we conclude that \(\displaystyle \lim \nolimits _{k \rightarrow +\infty } \left\| x^{k +1} - x^k \right\| = 0\). Furthermore, as
we obtain that the sequence \(\{ \left\| {\bar{x}} - x^{{k_j}+1}\right\| \}\) is bounded.
Thus, returning to (37), since \(\lim \nolimits _{k\rightarrow +\infty }\varepsilon _k = 0 \), \(\lim \nolimits _{k\rightarrow +\infty }x^{k}= {\widehat{x}}\) and \(\lim \nolimits _{j\rightarrow +\infty }z_{k_j}= {\bar{z}}\), taking \(j \rightarrow + \infty \), we obtain
From the quasiconvexity of each component function \(F_i\), for each \(i \in \left\{ 1,\ldots ,m\right\} \), we have that \(\left\langle \nabla F_i({\widehat{x}})\ ,\ {\bar{x}} - {\widehat{x}} \right\rangle \le 0\), and because \({\bar{z}} \in {\mathbb {R}}^m_+\backslash \left\{ 0\right\} \), from (41), we obtain
Consider the set \(J = \left\{ i \in I: {\bar{z}}_i > 0 \right\} \), where \(I = \left\{ 1,\ldots ,m\right\} \). Thus, from (42), for all \({\bar{x}} \in E\) we have
Now we will show that \({\widehat{x}}\) is a Pareto critical point.
Suppose by contradiction that \({\widehat{x}}\) is not a Pareto critical point, then there exists a direction \(v \in {\mathbb {R}}^n\) such that \(JF({\widehat{x}})v \in -{\mathbb {R}}^m_{++}\), i.e.,
Therefore, v is a descent direction for the multiobjective function F in \({\widehat{x}}\), so, \(\exists \ \varepsilon > 0\) such that
Since \({\widehat{x}} \in E\), from (45) we conclude that \({\widehat{x}} + \lambda v \in E\). Thus, from (43) with \({\bar{x}} = {\widehat{x}} + \lambda v \), we obtain:
It follows that \(\left\langle \nabla F_{i}({\widehat{x}})\ ,\ v \right\rangle = 0\) for all \( i \in J\), contradicting (44). Therefore, \({\widehat{x}}\) is a Pareto critical point of the problem (5). \(\square \)
6 A Numerical Result
In this section, we give a simple numerical example showing the functionality of the SPP algorithm. The experiments were run on an Intel Core i5 computer at 2.30 GHz with 3 GB of RAM, under Windows 7 SP1 (64-bit), and the code was implemented in MATLAB 7.10 (R2010a).
Example 6.1
Consider the following multiobjective minimization problem
where \(F_1(x_1,x_2)=-e^{-x_1^2-x_2^2}+1\) and \(F_2(x_1,x_2)=(x_1-1)^2+(x_2-2)^2\). This problem satisfies the assumptions \(({\mathbf{C}}_{\mathbf{1.2}})\), \(({\mathbf{C}}_{\mathbf{2}})\) and \(({\mathbf{C}}_{\mathbf{4}})\). We can easily verify that the points \({{\bar{x}}}=(0,0)\) and \({\hat{x}}=(1,2)\) are Pareto efficient solutions of the problem.
We take \(x^0=(-1,3)\) as an initial point; given \(x^k\in {\mathbb {R}}^2\), the main step of the SPP algorithm is to find a critical point (local minimum, local maximum or saddle point) of the following problem
In this example, we consider \(z_k=\left( z_1^k,z_2^k\right) =\left( \frac{1}{{\sqrt{2}}},\frac{1}{{\sqrt{2}}}\right) \) and \(\alpha _k=1\) for each k. We take \(z^0=(2,3)\) as the initial point to solve all the subproblems using the MATLAB function fmincon (with the interior-point algorithm), and we use the stopping criterion \(||x^{k+1}-x^k||<0.0001\) to finish the algorithm. The numerical results are given in the following table:
k | \( N[x^{k}] \) | \(x^k=(x^{k}_{1}, x^{k}_{2}) \) | \( ||x^{k}-x^{k-1} || \) | \(\sum F_i(x^k)z_i^k\) | \( F_1(x_1^k,x_2^k)\) | \(F_2(x_1^k,x_2^k)\)
---|---|---|---|---|---|---
1 | 10 | (0.17128, 2.41010) | 1.31144 | 1.30959 | 0.99709 | 0.85496
2 | 10 | (0.65440, 2.16217) | 0.54302 | 0.80586 | 0.99392 | 0.14574
3 | 9 | (0.85337, 2.05877) | 0.22423 | 0.71983 | 0.99303 | 0.02496
4 | 7 | (0.93534, 2.01588) | 0.09251 | 0.70518 | 0.99284 | 0.00443
5 | 7 | (0.96912, 1.99814) | 0.03816 | 0.70268 | 0.99279 | 0.00096
6 | 7 | (0.98305, 1.99080) | 0.01574 | 0.70226 | 0.99277 | 0.00037
7 | 7 | (0.98879, 1.98776) | 0.00649 | 0.70219 | 0.99277 | 0.00028
8 | 7 | (0.99115, 1.98651) | 0.00268 | 0.70217 | 0.99276 | 0.00026
9 | 7 | (0.99213, 1.98599) | 0.00110 | 0.70217 | 0.99276 | 0.00026
10 | 7 | (0.99253, 1.98578) | 0.00046 | 0.70217 | 0.99276 | 0.00026
11 | 7 | (0.99270, 1.98569) | 0.00019 | 0.70217 | 0.99276 | 0.00026
12 | 7 | (0.99277, 1.98565) | 0.00008 | 0.70217 | 0.99276 | 0.00026
The above table shows that \(k=12\) iterations are needed to solve the problem; \(N[x^{k}]\) denotes the number of inner iterations of each subproblem needed to obtain the point \(x^{k}\); for example, to obtain the point \(x^3=(0.85337, 2.05877)\) we need \(N[x^3]=9\) inner iterations. Observe also that in each iteration we obtain \(F(x^k)\succeq F(x^{k+1})\), and the sequence \(\langle F(x^k),z_k\rangle \) is nonincreasing.
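The run above can be approximately reproduced in open-source software. The sketch below replaces fmincon by SciPy's SLSQP solver and enforces the descent set \({\varOmega }_k\) as a nonlinear constraint; all function and variable names are ours, and since the inner solver and its starting point differ from the MATLAB setup, the intermediate iterates need not coincide digit-for-digit with the table, although the limit point should.

```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

# Objective components of Example 6.1.
def F1(x): return -np.exp(-x[0]**2 - x[1]**2) + 1.0
def F2(x): return (x[0] - 1.0)**2 + (x[1] - 2.0)**2

z = np.array([1.0, 1.0]) / np.sqrt(2.0)   # z_k = (1/sqrt(2), 1/sqrt(2)) for every k
alpha = 1.0                               # alpha_k = 1 for every k

def spp(x0, tol=1e-4, max_iter=50):
    """Exact SPP step: minimize <F(x), z> + (alpha/2)||x - x^k||^2 over Omega_k."""
    xk = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fk = np.array([F1(xk), F2(xk)])
        # Omega_k = {x : F(x) <= F(x^k)}, written componentwise as Fk - F(x) >= 0.
        cons = NonlinearConstraint(
            lambda x: Fk - np.array([F1(x), F2(x)]), 0.0, np.inf)
        obj = lambda x: z[0]*F1(x) + z[1]*F2(x) + 0.5*alpha*np.dot(x - xk, x - xk)
        x_next = minimize(obj, xk, constraints=[cons], method='SLSQP').x
        if np.linalg.norm(x_next - xk) < tol:  # same stopping rule as in the paper
            return x_next
        xk = x_next
    return xk

x_star = spp([-1.0, 3.0])
```

With this setup, the iterates converge toward the same limit as the table, near \((0.99277, 1.98565)\).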
7 Perspectives and Open Problems
To reduce the computational cost of each iteration of the SPP algorithm considerably, one should consider the unconstrained iteration
$$\begin{aligned} x^{k+1}\in \text {arg min}\left\{ \left\langle F(x), z_k\right\rangle +\frac{\alpha _k}{2}\left\| x - x^k\right\| ^2 : x\in {\mathbb {R}}^n\right\} , \end{aligned}$$(46)
which is more practical than (7). One natural condition to obtain (46) is that \(x ^{k +1} \in \text {int} {\varOmega }_k\) (interior of \({\varOmega }_k\)), because in this case (7) becomes (46). So, we believe that a variant of the SPP algorithm may be an interior proximal point method. Thus, a future work may be the introduction of an interior variable metric proximal point method to solve the problem (5).
On the other hand, the SPP algorithm may be applied to solve a class of linear multiobjective problems. In fact, if the objective functions \(F_i\), \(i=1,2,\ldots ,m\), are defined as \(F_i(x)={c_i}^Tx\) if x satisfies \( Ax=b, x\in {\mathbb {R}}^{n}_+\), and \(F_i(x)=+\infty \) otherwise, where \(c_i\in {\mathbb {R}}^{n}_+\) and \(b\in {\mathbb {R}}^m\) are given vectors and \(A\in {\mathbb {R}}^{m\times n}\) is an \(m\times n\) matrix, then Theorem 4.2 ensures that the sequence \(\lbrace x^k\rbrace \) converges to a weak Pareto efficient solution of the problem
The extension of the above theorem when \(\{\alpha _k\}\) does not converge to zero is an open question.
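As a concrete illustration of this linear case, the sketch below applies the scalarized proximal iteration to a toy bi-objective linear program on the unit simplex, with \(\alpha _k = 1/k \rightarrow 0\) as Theorem 4.2 requires. The data \(c_i\), A, b are our own illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize, LinearConstraint

# Toy bi-objective linear data (our own illustrative choice):
c = np.array([[1.0, 2.0, 0.0],
              [2.0, 1.0, 0.0]])                 # rows are c_1, c_2
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])                             # feasible set: x1+x2+x3 = 1, x >= 0
z = np.array([1.0, 1.0]) / np.sqrt(2.0)         # scalarization weights, ||z|| = 1

def spp_linear(x0, n_iter=30):
    """Scalarized proximal iteration with alpha_k -> 0, as in Theorem 4.2."""
    xk = np.asarray(x0, dtype=float)
    ck = c.T @ z                                # gradient of <F(x), z> = (z^T c) x
    for k in range(1, n_iter + 1):
        alpha_k = 1.0 / k                       # proximal parameters tending to zero
        obj = lambda x: ck @ x + 0.5 * alpha_k * np.sum((x - xk)**2)
        cons = [LinearConstraint(A, b, b),              # Ax = b
                LinearConstraint(c, -np.inf, c @ xk)]   # descent set Omega_k
        xk = minimize(obj, xk, bounds=[(0.0, None)] * 3,
                      constraints=cons, method='SLSQP').x
    return xk

x_lin = spp_linear(np.full(3, 1.0 / 3.0))
```

For these data the scalarized objective is minimized on the simplex at \((0,0,1)\), and the iterates converge to that weak Pareto efficient solution.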
Other future work includes extending the proposed algorithm to more general constrained vector minimization problems using the class of proximal distances, and obtaining conditions for the finite convergence of the SPP algorithm in the quasiconvex case.
8 Conclusions
This paper introduces an exact linear scalarization proximal point algorithm, denoted the SPP algorithm, to solve arbitrary extended multiobjective quasiconvex minimization problems; for the differentiable case, an inexact version of the proposed algorithm is also presented.
Our paper may be considered a first attempt to develop a proximal point method to solve constrained multiobjective problems with quasiconvex objective functions. Future work should improve the result obtained in Theorem 4.3, targeting real applications, for example, in demand theory.
References
Apolinário, H.C.F., Papa Quiroz, E.A., Oliveira, P.R.: A scalarization proximal point method for quasiconvex multiobjective minimization. J. Global Optim. 64, 79–96 (2016)
Hadjisavvas, N., Komlósi, S., Schaible, S.: Handbook of Generalized Convexity and Generalized Monotonicity. Springer, New York (2005)
Bento, G.C., Cruz Neto, J.X., Soubeyran, A.: A proximal point-type method for multicriteria optimization. Set Valued Var. Anal. 22, 557–573 (2014)
Mäkelä, M.M., Karmitsa, N., Wilppu, O.: Multiobjective proximal bundle method for nonsmooth optimization. TUCS Technical Report 1120, Turku (2014)
Chuong, T.D., Mordukhovich, B.S., Yao, J.C.: Hybrid approximate proximal algorithms for efficient solutions in vector optimization. J. Nonlinear Convex Anal. 12, 257–286 (2011)
Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation I, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 330. Springer, Berlin (2006)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
Polyak, B.T.: Introduction to Optimization. Translations Series in Mathematics and Engineering. Optimization Software Inc., Publications Division, New York (1987)
Miettinen, K.M.: Nonlinear Multiobjective Optimization. Kluwer Academic Publishers, Boston (1999)
Luc, D.T.: Theory of Vector Optimization, Lecture Notes in Economics and Mathematical Systems. Springer, Berlin (1989)
Bolintinéanu, S.: Approximate efficiency and scalar stationarity in unbounded nonsmooth convex vector optimization problems. J. Optim. Theory Appl. 106, 265–296 (2000)
Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms, 3rd edn. Wiley, New York (2006)
Mangasarian, O.L.: Nonlinear Programming. McGraw-Hill, New York (1969)
Jofré, A., Luc, D.T., Théra, M.: \(\varepsilon \)-subdifferential and \(\varepsilon \)-monotonicity. Nonlinear Anal. Theory Methods Appl. 33, 71–90 (1998)
Treiman, J.S.: Clarke’s gradients and \(\varepsilon \)-subgradients in Banach spaces. Trans. Am. Math. Soc. 294, 65–78 (1984)
Schott, D.: Basic properties of Fejér monotone sequences. Rostocker Math. Kolloqu. 49, 57–74 (1995)
Mas-Colell, A., Whinston, M.D., Green, J.R.: Microeconomic Theory. Oxford University Press, New York (1995)
Papa Quiroz, E.A., Mallma Ramirez, L., Oliveira, P.R.: An inexact proximal method for quasiconvex minimization. Eur. J. Oper. Res. 246, 721–729 (2015)
Bonnel, H., Iusem, A.N., Svaiter, B.F.: Proximal methods in vector optimization. SIAM J. Optim. 15, 953–970 (2005)
Ceng, L., Yao, J.: Approximate proximal methods in vector optimization. Eur. J. Oper. Res. 183, 1–19 (2007)
Villacorta, K.D.V., Oliveira, P.R.: An interior proximal method in vector optimization. Eur. J. Oper. Res. 214, 485–492 (2011)
Acknowledgements
The authors thank the referees for their helpful comments and suggestions. The research of the first author was supported by the Postdoctoral Scholarship CAPES-FAPERJ Edital PAPD-2011. The research of P.R.Oliveira was partially supported by CNPQ/Brazil.
Communicated by Christiane Tammer.
Papa Quiroz, E.A., Apolinário, H.C.F., Villacorta, K.D. et al. A Linear Scalarization Proximal Point Method for Quasiconvex Multiobjective Minimization. J Optim Theory Appl 183, 1028–1052 (2019). https://doi.org/10.1007/s10957-019-01582-z
Keywords
- Multiobjective minimization
- Lower semicontinuous quasiconvex functions
- Proximal point methods
- Fejér convergence
- Pareto–Clarke critical point