Abstract
It is well known that the monotonicity of the underlying mapping plays a central role in the convergence analysis of variational inequalities. In this paper, we propose an infeasible projection algorithm (IPA for short) for nonmonotone variational inequalities. The next iteration point of IPA is generated by projecting a vector onto a single half-space. Hence, the cost of computing the next iteration point of IPA is much lower than that of the algorithm of Ye and He (Comput. Optim. Appl. 60, 141–150, 2015) (YH for short). Moreover, if the underlying mapping is Lipschitz continuous with a known modulus, then, by taking suitable parameters, IPA requires only one projection onto the feasible set per iteration. The global convergence of IPA is obtained when the solution set of the dual variational inequality is nonempty. If, in addition, an error bound condition holds, the convergence rate of IPA is Q-linear. IPA can be applied to a class of quasimonotone variational inequality problems and a class of quasiconvex minimization problems. Comparing IPA with YH and with Algorithm 2 in Deng, Hu and Fang (Numer. Algor. 86, 191–221, 2021) (DHF for short) on high-dimensional nonmonotone variational inequalities, numerical experiments show that IPA is much more efficient than YH and DHF in terms of CPU time. Moreover, IPA is less dependent on the initial value than YH and DHF.
1 Introduction
Let \(\mathbb {R}^{n}\) be the n-dimensional Euclidean space, \(C\subset \mathbb {R}^{n}\) be a nonempty closed convex set and \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) be a continuous mapping. We consider the classical variational inequality problem of finding a vector x ∈ C such that
$$ \langle F(x),y-x\rangle\geq 0 \text{~ for all ~} y\in C, $$(1)
where 〈⋅,⋅〉 denotes the inner product of \(\mathbb {R}^{n}\). We let
$$ S:=\{x\in C: \langle F(x),y-x\rangle\geq 0 \text{~ for all ~} y\in C\} $$
denote the solution set of variational inequalities and let
$$ S_{D}:=\{x\in C: \langle F(y),y-x\rangle\geq 0 \text{~ for all ~} y\in C\} $$
denote the solution set of the dual variational inequalities of Problem (1).
If C is nonempty, closed and convex and F is continuous or hemicontinuous on C, the relationship SD ⊂ S is known as the Minty lemma. The relationship S ⊂ SD holds when F is pseudomonotone on C in the sense of Karamardian [1]; i.e., for all x, y ∈ C,
$$ \langle F(x),y-x\rangle\geq 0 \implies \langle F(y),y-x\rangle\geq 0. $$
Hence, we obtain that
$$ S=S_{D} $$(2)
whenever C is nonempty closed and convex, F is continuous and pseudomonotone on C.
From [2, Example 4.2], we see that S = SD may fail when F is merely quasimonotone on C in the sense of Karamardian [1], i.e., for all x, y ∈ C,
$$ \langle F(x),y-x\rangle> 0 \implies \langle F(y),y-x\rangle\geq 0. $$
Recently, an interesting work [3] showed that if \(C\subset \mathbb {R}\) is closed and bounded, then SD ≠ ∅ if and only if F is quasimonotone; see Lemma 3.1 and Proposition 3.1 therein. Moreover, based on Theorem 3.1 and Example 3.1 in [3], we construct high-dimensional variational inequalities which are not quasimonotone yet satisfy SD ≠ ∅ whenever the dimension n ≥ 2; see Example 6.1.
The global convergence of many well-known algorithms is based on the condition S = SD; see, for example, the Goldstein-Levitin-Polyak projection algorithms [4, 5], the proximal point algorithm [6], extragradient projection algorithms [7, 8] and their variants [9, 10], infeasible projection-based algorithms [11,12,13], the self-adaptive projection-based algorithm [14] and the projected reflected gradient algorithm [15] (these algorithms require F to be at least pseudomonotone on C). The condition S = SD is also used to establish global convergence for generalized monotone or nonmonotone variational inequalities and for nonmonotone equilibrium problems; see, for example, [16,17,18] and [19]. Moreover, monotonicity or pseudomonotonicity of the underlying mapping is assumed in variational inequality and fixed point problems; see, for example, [20,21,22,23,24]. However, if we relax the pseudomonotonicity of F to quasimonotonicity, it may happen that \(S_{D} \subsetneqq S\); see, for example, [2, Example 4.2]. So, the assumption S = SD may not be suitable for quasimonotone or nonmonotone variational inequalities.
There exist some algorithms for quasimonotone variational inequality problems; see, for example, [25] and [26]. However, the global convergence of these algorithms requires assumptions beyond SD ≠ ∅. Recently, [27] presented an iterative algorithm for quasimonotone variational inequalities whose global convergence is established under the assumption SD ≠ ∅ and
$$ \{z\in C: F(z)=0\}\setminus S_{D} \text{~ is a finite set.} $$(3)
However, from Example 4.4 below, we see that assumption (3) may not be suitable for quasimonotone variational inequalities.
In [2], we proposed a double projection algorithm for nonmonotone variational inequalities. Its global convergence is obtained under the assumptions that SD ≠ ∅Footnote 1, C is closed and convex and F is continuous on C (without any monotonicity of F). This algorithm has been generalized to solve nonmonotone set-valued variational inequalities and nonmonotone equilibrium problems; see, for example, [28,29,30] and [31]. However, the next iteration point xk+ 1 in [2] is generated by projecting a vector onto the intersection of the feasible set C and k + 1 half-spaces. Hence, the cost of computing xk+ 1 increases as k increases. Moreover, the next iteration points of the algorithms in [28,29,30] and [31] are all generated similarly to [2]. Very recently, [32] proposed an extragradient method for solving variational inequalities without monotonicity.
Inspired by [11,12,13, 33, 34], we present an infeasible projection algorithm (IPA) for Problem (1) with SD ≠ ∅. In IPA, xk+ 1 is generated by projecting xk onto only one half-space, selected from the former k + 1 half-spaces as the one with the largest distance from xk. Moreover, the cost of computing xk+ 1 is negligible; see Remark 3.1 below. If F is in addition L-Lipschitz continuous, then, by taking suitable parameters, IPA requires only one projection onto the feasible set per iteration (see Proposition 4.1 and Remark 4.3 below). Compared with the algorithm in [27], the global convergence of IPA does not need the extra assumption (3). Moreover, if, in addition, the error bound condition holds and F is Lipschitz continuous, the convergence rate of IPA is Q-linear (see Theorem 5.1 below). To compare with YH and DHF, we construct high-dimensional variational inequalities with SD ≠ ∅ and run these algorithms in Matlab. Numerical experiments show that IPA is much more efficient than YH and DHF in terms of CPU time. Moreover, IPA is less dependent on the initial value than YH and DHF.
The rest of this paper is organized as follows. Some notation and preliminary materials are introduced in Section 2. IPA and its convergence analysis are presented in Section 3. The simplified IPA is introduced in Section 4, where F is in addition Lipschitz continuous on \(\mathbb {R}^{n}\). The convergence rate of IPA is established in Section 5 under an additional error bound assumption. Finally, numerical experiments are reported in Section 6.
2 Preliminaries
In this section, we introduce some properties of the projection mapping and some materials that will be used in the convergence analysis.
Let ∥⋅∥ denote the Euclidean norm in \(\mathbb {R}^{n}\) and let dist(x, C) denote the Euclidean distance from a vector x to C; i.e.,
$$ \text{dist}(x,C):=\inf\{\|x-y\|: y\in C\}. $$
Let PC(x) denote the orthogonal projection of the vector x onto C; i.e.,
$$ P_{C}(x):=\arg\min\{\|x-y\|: y\in C\}. $$
Since C is closed and convex, we have \(\|x-P_{C}(x)\|=\text {dist}(x,C)\). For a fixed \(x\in \mathbb {R}^{n}\) and a positive number μ, we let r(x, μ) denote the natural residual mapping of problem (1); i.e.,
$$ r(x,\mu):= x-P_{C}(x-\mu F(x)). $$(4)
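For readers who wish to experiment, the residual (4) is cheap to evaluate whenever PC has a closed form. The Python sketch below assumes, purely for illustration, that C is a box, so that the projection is componentwise clipping; the mapping F is supplied by the caller.

```python
import numpy as np

def project_box(x, lo=-1.0, hi=1.0):
    # Projection onto the box C = [lo, hi]^n is componentwise clipping.
    return np.clip(x, lo, hi)

def natural_residual(F, x, mu, lo=-1.0, hi=1.0):
    # Natural residual r(x, mu) = x - P_C(x - mu*F(x)) from (4).
    return x - project_box(x - mu * F(x), lo, hi)
```

For instance, with the illustrative mapping F(x) = x² componentwise on C = [−1, 1], the residual vanishes at x = −1 and is nonzero at x = 0.5, in line with Lemma 2.4.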
In the following, we recall some well-known properties about the projection mapping and the natural residual mapping.
Lemma 2.1
[36] Let \(C\subset \mathbb {R}^{n}\) be a nonempty closed convex set. Then, the following statements hold.
-
(a)
for any fixed \(x\in \mathbb {R}^{n}\), we have
$$ \begin{array}{@{}rcl@{}} z = P_{C}(x)\iff z\in C\text{ and }\langle z-x,y-z\rangle\geq 0 \text{~ for all ~} y\in C. \end{array} $$(5) -
(b)
PC(⋅) is nonexpansive; i.e.,
$$ \begin{array}{@{}rcl@{}} \|P_{C}(x)-P_{C}(z)\|\leq\|x-z\| \text{~ for all ~} x,z\in \mathbb{R}^{n}. \end{array} $$
Lemma 2.2
[10] Let \(C\subset \mathbb {R}^{n}\) be a nonempty closed convex set and x be any fixed point in \(\mathbb {R}^{n}\). If y = PC(x), then we have
$$ \|y-z\|^{2}\leq\|x-z\|^{2}-\|x-y\|^{2} \text{~ for all ~} z\in C. $$
Lemma 2.3
[37] Let v be a fixed vector and \(H:=\{x\in \mathbb {R}^{n}:\langle u,x\rangle \leq a\}\) be a half-space. Then
$$ P_{H}(v)=v-\frac{\max\{\langle u,v\rangle-a,0\}}{\|u\|^{2}}u. $$
Moreover, if in addition v ∉ H, it follows that
$$ \text{dist}(v,H)=\frac{\langle u,v\rangle-a}{\|u\|}. $$
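Lemma 2.3 is what makes half-space projections essentially free: both the projection and the distance have standard closed forms. A small Python sketch (the data v, u, a are arbitrary illustrative inputs):

```python
import numpy as np

def project_halfspace(v, u, a):
    # Closed-form projection onto H = {x : <u, x> <= a}; if v already
    # lies in H the excess is 0 and v is returned unchanged (Lemma 2.3).
    excess = max(np.dot(u, v) - a, 0.0)
    return v - (excess / np.dot(u, u)) * u

def dist_halfspace(v, u, a):
    # dist(v, H) = (<u, v> - a)/||u|| when v is outside H, and 0 otherwise.
    return max(np.dot(u, v) - a, 0.0) / np.linalg.norm(u)
```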
Lemma 2.4
[34, 38] Let C be a nonempty closed and convex set of \(\mathbb {R}^{n}\) and r(x, μ) be defined in (4). Then the following statements hold.
-
(a)
x is a solution of problem (1) if and only if ∥r(x, μ)∥ = 0 for each fixed μ > 0.
-
(b)
If there exists μ > 0 such that ∥r(x, μ)∥ = 0, then x is a solution of problem (1).
Lemma 2.5
[39] Let r(x, μ) be defined in (4). Then, for any fixed point \(x\in \mathbb {R}^{n}\), the following properties hold.
-
(a)
the function μ↦∥r(x, μ)∥ is nondecreasing for μ > 0.
-
(b)
the function \(\mu \mapsto \frac {\|r(x,\mu )\|}{\mu }\) is nonincreasing for μ > 0.
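The two monotonicity properties of Lemma 2.5 are easy to observe numerically. The snippet below uses an illustrative one-dimensional example with a box C (our choice, not the paper's): the residual norms grow with μ while their ratios to μ do not increase (here the projection stays inactive, so the ratios happen to be constant).

```python
import numpy as np

def residual_norm(F, x, mu, lo=-1.0, hi=1.0):
    # ||r(x, mu)|| for a box C, where P_C is componentwise clipping.
    return float(np.linalg.norm(x - np.clip(x - mu * F(x), lo, hi)))

F = lambda v: v**2                 # illustrative mapping, quasimonotone on [-1, 1]
x = np.array([0.5])
mus = [0.25, 0.5, 1.0, 2.0, 4.0]
norms = [residual_norm(F, x, m) for m in mus]
ratios = [n / m for n, m in zip(norms, mus)]
# Lemma 2.5: norms is nondecreasing and ratios is nonincreasing in mu.
```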
The following inequality is a direct consequence of Lemma 2.5; for simplicity, we omit its proof:
$$ \min\{1,\mu\}\|r(x,1)\|\leq\|r(x,\mu)\|\leq\max\{1,\mu\}\|r(x,1)\| \text{~ for all ~} x\in\mathbb{R}^{n} \text{~ and ~} \mu>0. $$(6)
Lemma 2.6
[10] Let \({\Omega }\subset \mathbb R^{n}\) be a closed convex set and h be a real-valued function with \(\text {dom} h = \mathbb R^{n}\). Denote \(\widetilde {\Omega }:= \{x\in {\Omega }: h(x)\le 0\}\). If \(\widetilde {\Omega }\) is nonempty and h is Lipschitz continuous on Ω with modulus θ > 0, then
$$ \text{dist}(x,\widetilde{\Omega})\geq \theta^{-1}h(x) \text{~ for all ~} x\in{\Omega}. $$
Lemma 2.7
[40, 41] Let {γk} and {βk} be nonnegative real sequences satisfying \({\sum }_{k=0}^{\infty } \beta _{k} < \infty \) and γk+ 1 ≤ γk + βk for all k. Then, the sequence {γk} is convergent.
Before ending this section, we use the following lemma to collect some sufficient conditions for SD ≠ ∅; see, for example, [2, Proposition 2.1] and [42, Proposition 1].
Lemma 2.8
If one of the following statements holds:
-
(a)
F is pseudomonotone on C and S ≠ ∅;
-
(b)
F is the gradient of f, where f is a differentiable quasiconvex function on an open set \(K\supset C\) that attains its global minimum on C;
-
(c)
F is quasimonotone on C, F≠ 0 and C is bounded;
-
(d)
F is quasimonotone on C, F≠ 0 on C and there exists a positive number r such that, for every x ∈ C with ∥x∥≥ r, there exists y ∈ C such that ∥y∥≤ r and 〈F(x), y − x〉≤ 0;
-
(e)
F is quasimonotone on C and S ∖ ST ≠ ∅ with ST := {x ∈ C|〈F(x), y − x〉 = 0,for all y ∈ C};
-
(f)
F is quasimonotone on C, intC is nonempty and there exists x∗∈ S such that F(x∗)≠ 0,
then SD is nonempty.
3 IPA and its convergence analysis
To solve nonmonotone Problem (1) with SD ≠ ∅, recall that the next iteration point in [2] is generated by projecting a vector onto the intersection of the feasible set C and k + 1 half-spaces. In this section, we present a new infeasible projection-based algorithm (IPA) for Problem (1) without monotonicity. The next iteration point of IPA is generated by projecting the current iteration point onto only one half-space, selected from the former k + 1 half-spaces as the one with the largest distance from the current iteration point. We first present IPA as Algorithm 1 below; then we show its well-definedness and global convergence.
Remark 3.1
The line search procedure in (??) is inspired by [11]. The initial step-size \({\alpha _{k}^{0}}\) therein is suggested to be αk− 1. This, together with the fact that η ∈ (0, 1), shows that {αk} is nonincreasing. In this paper, we relax the range of \({\alpha _{k}^{0}}\) to a bounded closed interval bounded away from zero. Thus, we can use the Barzilai-Borwein step-size to accelerate the algorithm (the Barzilai-Borwein step-size is widely used in convex and nonconvex optimization). Moreover, to avoid overly small step-sizes, we enlarge the initial step-size \({\alpha _{k}^{0}}\) by a self-adaptive strategy. So, \({\alpha _{k}^{0}}\) is computed by (30); see also our former work [43].
Choosing the half-space with the largest distance from xk among {H0, ⋯ , Hk} requires k + 1 projections of xk, one onto each half-space in {H0, ⋯ , Hk}. Fortunately, this procedure can be performed in parallel. Hence, together with Lemma 2.3, the computational cost of this procedure is negligible.
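Concretely, if the stored half-spaces are Hi = {v : 〈ui, v〉 ≤ bi} with the normals stacked in a matrix, all k + 1 distances follow from one matrix-vector product via Lemma 2.3, so no actual projections are needed. A sketch (the storage layout U, b is our own choice):

```python
import numpy as np

def farthest_halfspace(x, U, b):
    # Distances from x to H_i = {v : <U[i], v> <= b[i]} via Lemma 2.3:
    # dist(x, H_i) = max(<U[i], x> - b[i], 0)/||U[i]||. One matrix-vector
    # product replaces k + 1 explicit projections.
    gaps = np.maximum(U @ x - b, 0.0)
    dists = gaps / np.linalg.norm(U, axis=1)
    i = int(np.argmax(dists))
    return i, float(dists[i])
```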
Now, we show the well-definedness of the line search of IPA whenever F is continuous on \(\mathbb {R}^{n}\).
Lemma 3.2
Suppose that F is continuous on \(\mathbb {R}^{n}\). Let \(\alpha \in [\alpha _{\min \limits }, \alpha _{\max \limits }]\) be a fixed number and η ∈ (0,1). Then there exists a nonnegative integer m such that
$$ \alpha\eta^{m}\left\|F(x)-F\left( P_{C}\left( x-\alpha\eta^{m}F(x)\right)\right)\right\|\leq\sigma\left\|x-P_{C}\left( x-\alpha\eta^{m}F(x)\right)\right\|. $$(11)
Proof
If x ∈ S, then, from Lemma 2.4 (a), we have x = PC(x − αF(x)). Hence, (11) holds at m = 0.
If x ∉ S, then, from Lemma 2.4 (b), we see that ∥r(x, ν)∥ > 0 for any ν > 0. In this case, we claim that inequality (11) holds after finitely many reductions. Suppose to the contrary that
$$ \alpha\eta^{m}\left\|F(x)-F\left( P_{C}\left( x-\alpha\eta^{m}F(x)\right)\right)\right\|>\sigma\left\|x-P_{C}\left( x-\alpha\eta^{m}F(x)\right)\right\| \text{~ for all ~} m\geq 0. $$(12)
Then, we consider the following two cases.
-
Case A:
If x ∈ C, then we have PC(x) = x. This together with the continuity of both PC(⋅) and F(⋅) on \(\mathbb {R}^{n}\) and the fact η ∈ (0,1) gives
$$ \begin{array}{@{}rcl@{}} \|F(x) - F(P_{C}(x - \alpha \eta^{m} F(x)))\|\rightarrow 0\text{~ as ~}m\rightarrow \infty. \end{array} $$(13)On the other hand, in view of the fact that η ∈ (0,1) and the second relation of Lemma 2.5, for sufficient large m, we see that
$$ \sigma\frac{\|x - P_{C}(x - \alpha \eta^{m} F(x))\|}{\alpha \eta^{m}} = \sigma \frac{\|r(x,\alpha \eta^{m})\|}{\alpha \eta^{m}} \geq \sigma\frac{\|r(x,1)\|}{1}>0, $$(14)where the last inequality holds due to the fact x ∉ S and Lemma 2.4(b). Consequently, (13) and (14) contradict (12).
-
Case B:
If x ∉C, then we have σ∥x − PC(x)∥ > 0. Using this together with the continuity of F(⋅) and PC(⋅), it follows that \(\alpha \eta ^{m} \|F(x) - F(P_{C}(x - \alpha \eta ^{m} F(x)))\|\rightarrow 0\) and \(\sigma \|{x - P_{C}(x - \alpha \eta ^{m} F(x))}\|\rightarrow \sigma \|{x-P_{C}(x)}\|>0\) as \(m \rightarrow \infty \). However, this contradicts (12).
Hence, we see that (11) holds for some finite nonnegative integer m. □
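In code, the Armijo-type backtracking whose finite termination was just proved can be sketched as follows; the box C is an illustrative choice, and max_m is a practical safeguard not present in the analysis.

```python
import numpy as np

def line_search(F, x, alpha0, eta=0.5, sigma=0.99, lo=-1.0, hi=1.0, max_m=60):
    # Find the smallest m such that, with z_m = P_C(x - alpha0*eta^m*F(x)),
    #   alpha0*eta^m*||F(x) - F(z_m)|| <= sigma*||x - z_m||,
    # and return the accepted step-size and trial point (alpha_k, z_k).
    Fx = F(x)
    for m in range(max_m):
        alpha = alpha0 * eta**m
        z = np.clip(x - alpha * Fx, lo, hi)
        if alpha * np.linalg.norm(Fx - F(z)) <= sigma * np.linalg.norm(x - z):
            return alpha, z
    raise RuntimeError("backtracking did not terminate within max_m trials")
```

Note that when x solves the problem, both sides of the test vanish and the initial trial m = 0 is accepted, exactly as in the first paragraph of the proof.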
Next, we use the following lemma to show that the half-space Hk defined by hk(v) in (??) strictly separates the current iteration point xk from SD.
Lemma 3.3
Suppose that F is continuous on \(\mathbb {R}^{n}\) and SD ≠ ∅. Let hk be defined in (??) and {xk} be the infinite sequence generated by IPA. Then, for any fixed x∗∈ SD, we have
$$ h_{k}(x^{*})\leq 0<(1-\sigma)\|r(x^{k},\alpha_{k})\|^{2}\leq h_{k}(x^{k}) \text{~ for all ~} k. $$(15)
Moreover, we have \(S_{D}\subseteq H_{k}\) and xk ∉ Hk for all k.
Proof
Let x∗∈ SD. Invoking the definition of hk, we see that
where the first inequality holds from the definition of SD and the fact that zk ∈ C, and the last inequality holds from (5), the fact zk = PC(xk − αkF(xk)) and the fact x∗∈ C. This proves the first relation of (15).
Next, from the definition of hk, we see that
where the first inequality holds from the Cauchy-Schwarz inequality and (??), and the last inequality holds from the fact ∥r(xk, αk)∥ > tol > 0. This proves the second relation of (15).
Hence, from (15) and the definition of Hk, we further obtain that x∗∈ Hk and xk ∉ Hk for all k. This completes the proof. □
Before analyzing the convergence of the sequence {xk} generated by IPA, we study the properties of the sequences {xk}, {zk} and {αk} generated by IPA.
Theorem 3.4
Suppose that F is continuous on \(\mathbb {R}^{n}\) and SD ≠ ∅. Let σ and η be given in IPA and {xk, zk, αk} be the infinite sequence generated by IPA. Then the following statements hold.
-
(a)
For any fixed \(x \in \bigcap _{i=0}^{\infty }H_{i}\), the sequence {∥xk − x∥} is convergent. Moreover, the sequences {xk}, {F(xk)}, {zk} and {xk − zk − αk(F(xk) − F(zk))} are all bounded. If F is in addition Lipschitz continuous with modulus L, one has
$$ \begin{array}{@{}rcl@{}} \inf\limits_{k>0}{\alpha_{k}}\geq \min\{\alpha_{min}, \eta \frac{\sigma}{L}\}>0. \end{array} $$(16) -
(b)
For any fixed \(x\in \bigcap _{i=0}^{\infty } H_{i}\), it holds that
$$ \begin{array}{@{}rcl@{}} 0\!<\!\text{dist}^{2}(x^{k}, H_{k}) \!\leq\! \text{dist}^{2}(x^{k}, \hat H_{k}) \!\leq\! \|x^{k} - x\|^{2} - \|x^{k+1} - x\|^{2} \text{ for all }k. \end{array} $$(17)Moreover, we have \(\lim \limits _{k\rightarrow \infty }\text {dist}(x^{k}, \hat H_{k})=0\).
-
(c)
For any cluster point \(\bar {x}\) of the sequence {xk}, it holds that \(\bar {x} \in \bigcap _{i=0}^{\infty }H_{i}\).
-
(d)
(Global subsequential convergence) Any cluster point of the infinite sequence {xk} is a solution of Problem (1).
Proof
-
(a)
From the fact SD ≠ ∅ and Lemma 3.3 that \(S_{D}\subset \bigcap _{i=0}^{\infty }H_{i}\), we have \(\bigcap _{i=0}^{\infty }H_{i}\neq \emptyset \). Moreover, together with the fact \(\bigcap _{i=0}^{\infty }H_{i} \subset \hat H_{k}\) for any k and (??), we see that, for any fixed \(x\in \bigcap _{i=0}^{\infty }H_{i}\),
$$ \|x^{k+1} - x \|=\|P_{\hat H_{k}}(x^{k}) - x \| = \|P_{\hat H_{k}}(x^{k}) - P_{\hat H_{k}}(x) \| \leq \|x^{k} - x \|\text{ for any }k. $$(18)Using this together with Lemma 2.7, we obtain that, for any fixed \(x\in \bigcap _{i=0}^{\infty }H_{i}\), the sequence {∥xk+ 1 − x∥} is convergent. Moreover, {xk} is bounded. Hence, by using the continuity of F, we see that {F(xk)} is bounded. Combining this with the fact zk = PC(xk − αkF(xk)), \(0< \alpha _{k} < \alpha _{\max \limits }\) and the continuity of PC(⋅), we further obtain that {zk} is bounded. Similarly, {F(zk)} and {xk − zk − αk(F(xk) − F(zk))} are bounded.
Now, we show that \( \inf \limits _{k>0}{\alpha _{k}}>0\) when F is Lipschitz continuous. Let L > 0 be the Lipschitz modulus of F. Then, we see that (??) holds whenever \(\alpha \leq \frac {\sigma }{L}\). Hence, from the definition of αk, we deduce that either \(\alpha _{k} = {\alpha _{k}^{0}}\) (if \({\alpha _{k}^{0}}\leq \frac {\sigma }{L}\)) or \(\alpha _{k} > \eta \frac {\sigma }{L}\). Using this together with the fact \(0<\alpha _{min}\leq {\alpha _{k}^{0}}\), we obtain (16).
-
(b)
The first inequality holds from the fact xk ∉ Hk in Lemma 3.3, the second inequality holds from (??) and the third inequality holds from the fact \(\bigcap _{i=0}^{\infty } H_{i} \subseteq \hat H_{k}\) for any k, Lemma 2.2 and (??). This completes the proof of (17). Note that the right-hand side of (17) is summable. Hence, \(\lim \limits _{k\rightarrow \infty }\text {dist}(x^{k}, \hat H_{k})=0\).
-
(c)
From the definition of \(\hat H_{k}\), we have
$$ 0\leq \text{dist}(x^{k}, H_{i}) \leq \text{dist}(x^{k}, \hat H_{k})\text{ for all }i \leq k. $$This together with Theorem 3.4(b) implies that
$$ \lim_{k\rightarrow\infty}\text{dist}(x^{k}, H_{i})= 0\text{ for any fixed }i. $$Using this together with the fact dist(⋅, Hi) is continuous on \(\mathbb {R}^{n}\) and the fact \(\bar x\) is a cluster point of {xk}, we further obtain that \(\text {dist}(\bar x, H_{i}) = 0\) for any fixed i. This completes the proof.
-
(d)
Let \(\bar x\) be a cluster point of the sequence {xk} and \(I\subset \mathbb {N}\) be an index set such that \(\lim _{k\rightarrow \infty ,k\in I}x^{k}= \bar x\).
We first show that \(\lim _{k\rightarrow \infty }\|r(x^{k},\alpha _{k})\|= 0\). To this end, we recall from (a) that there exists M1 > 0 such that ∥xk − zk − αk(F(xk) − F(zk))∥≤ M1 for all k. Using this together with the definition of hk in (??), we see that M1 is a Lipschitz modulus of hk for all k. Moreover, from Lemma 2.6 and (15), we have
$$ \text{dist}(x^{k}, H_{k})\geq \frac{h_{k}(x^{k})}{M_{1}}\geq\frac{(1-\sigma)\|r(x^{k},\alpha_{k})\|^{2}}{M_{1}}. $$
This together with (b) gives
$$ \lim_{k\rightarrow\infty}\|r(x^{k},\alpha_{k})\|=0. $$(19)
Next, we show that there exists an index set \(J\subseteq I\) such that
$$ \lim_{i\rightarrow\infty, i\in J}\|r(x^{i},1)\|=0. $$(20)
To this end, we denote \(\widetilde \alpha :=\inf _{i\in I}\{\alpha _{i}\}\) and consider the following two cases.
-
Case 1:
If \(\widetilde \alpha > 0\), then \(\widetilde \alpha \leq \alpha _{i}\) for all i ∈ I. Using this together with (6), we see that
$$ \begin{array}{@{}rcl@{}} 0\leq\|r(x^{i},1)\|\leq\frac{\|r(x^{i},\alpha_{i})\|}{\min\{\alpha_{i},1\}}\leq\frac{\|r(x^{i},\alpha_{i})\|}{\min\{\widetilde\alpha,1\}}, \forall i\in I. \end{array} $$Hence, together with (19), we have \(\lim _{i\rightarrow \infty ,i\in I}\|r(x^{i},1)\|=0\). By taking J = I, we see that (20) holds.
-
Case 2:
If \(\widetilde \alpha =0\), then there exists an index set \(J\subseteq I\) such that \(\lim _{i\rightarrow \infty ,i\in J}\alpha _{i}=0\). This together with (??) implies that, for sufficiently large i ∈ J,
$$ \alpha_{i} \eta^{-1}\left\|F(x^{i}) - F\left( P_{C}\left( x^{i} - \alpha_{i} \eta^{-1}F(x^{i})\right)\right)\right\| > \sigma \|r(x^{i},\alpha_{i} \eta^{-1})\|. $$Hence, for sufficiently large i ∈ J, it follows that
$$ \begin{array}{@{}rcl@{}} \left\|F(x^{i}) - F\left( P_{C}\left( x^{i} - \alpha_{i} \eta^{-1}F(x^{i})\right)\right)\right\| &>& \frac{\sigma \|r(x^{i},\alpha_{i} \eta^{-1})\|}{\alpha_{i} \eta^{-1}}\\&\geq& \frac{\sigma \|r(x^{i},1 )\|}{1}>0, \end{array} $$(21)where the second inequality holds from the second relation of Lemma 2.5 and the fact that \(\lim _{i\rightarrow \infty ,i\in J}\alpha _{i}=0\), the third inequality holds because xi ∉ S.
On the other hand, from the triangle inequality and the fact that PC(⋅) is nonexpansive, it holds that
$$ \begin{array}{@{}rcl@{}} &&\|x^{i} - P_{C}\left( x^{i} - \alpha_{i} \eta^{-1}F(x^{i})\right)\|\\ &&\leq\|r(x^{i},\alpha_{i})\| + \| P_{C}\left( x^{i} - \alpha_{i} F(x^{i})\right) - P_{C}\left( x^{i} - \alpha_{i} \eta^{-1}F(x^{i})\right)\|\\ &&\leq\|r(x^{i},\alpha_{i})\| + \| x^{i} - \alpha_{i} F(x^{i}) - (x^{i} - \alpha_{i} \eta^{-1}F(x^{i}))\|\\ &&=\|r(x^{i},\alpha_{i})\| + \alpha_{i}(\eta^{-1}-1) \|F(x^{i})\|\rightarrow 0 \text{ (as }i\rightarrow \infty,i\in J), \end{array} $$(22)where the limit holds from the fact \(J\subseteq I\), (19), the fact \(\lim _{i\rightarrow \infty ,i\in J}\alpha _{i}=0\) and the fact {F(xi)} is bounded. Using this together with the fact \(\lim _{k\rightarrow \infty ,k\in I}x^{k}= \bar x\), the continuity of F and (22), it follows that
$$ \lim_{i\rightarrow \infty, i\in J}\left\|F(x^{i}) - F\left( P_{C}\left( x^{i} - \alpha_{i} \eta^{-1}F(x^{i})\right)\right)\right\|= 0. $$
Finally, by using the continuity of ∥r(⋅,1)∥ and passing to the limit in (20) along the index set J, we see that \(\|r(\bar x,1)\|=0\). This together with Lemma 2.4 completes the proof. □
Now, we are ready to show the global convergence of IPA.
Theorem 3.5
[Global sequential convergence] Suppose that F is continuous on \(\mathbb {R}^{n}\) and SD ≠ ∅. Let {xk} be the infinite sequence generated by IPA. Then {xk} is globally convergent to a solution of Problem (1).
Proof
Let x∗ be any fixed cluster point of {xk}. Then, from Theorem 3.4(c), we see that \(x^{*} \in \bigcap _{i=0}^{\infty } H_{i}\). Hence, by replacing x by x∗ in (18), we see that
$$ \|x^{k+1}-x^{*}\|\leq\|x^{k}-x^{*}\| \text{~ for all ~} k. $$
Moreover, from Lemma 2.7, it follows that {∥xk − x∗∥} is convergent. This together with the fact that x∗ is a cluster point of {xk} gives
$$ \lim\limits_{k\rightarrow\infty}\|x^{k}-x^{*}\|=0. $$
This together with Theorem 3.4(d) gets the desired conclusion. □
4 Simplified IPA when the underlying mapping is Lipschitz continuous
In Section 3, we presented IPA for nonmonotone Problem (1), whose global convergence is established under the assumption that SD ≠ ∅. In this section, we aim to simplify IPA by omitting its line search procedureFootnote 2 whenever F is L-Lipschitz continuous. We call this simplified algorithm IPAL.
Now, we show the simplified IPA as Algorithm 2 below.
We use the following proposition to show that IPA reduces to IPAL whenever F is L-Lipschitz continuous.
Proposition 4.1
Suppose that F is L-Lipschitz continuous on \(\mathbb {R}^{n}\). Let \({\alpha _{k}^{0}} = \lambda \) with λ ∈ (0,1/L) and σ = λL. Then, IPA reduces to IPAL.
Proof
It suffices to show that, for each k, αk = λ whenever \({\alpha _{k}^{0}} = \lambda \) with λ ∈ (0,1/L). Invoking the definition of Lipschitz continuity, we see that
$$ \|F(x)-F(y)\|\leq L\|x-y\| \text{~ for all ~} x,y\in\mathbb{R}^{n}. $$
Using this together with the fact that \({\alpha _{k}^{0}} \! =\! \lambda \) and σ = λL, we see that (??) holds at mk = 0 for any k. Hence, we obtain that αk = λ for any k. This completes the proof. □
From Proposition 4.1 and Theorem 3.5, we have the following corollary.
Corollary 4.2
Suppose that F is L-Lipschitz continuous on \(\mathbb {R}^{n}\) and SD ≠ ∅. Let {xk} be an infinite sequence generated by IPAL. Then {xk} is globally convergent to a solution of Problem (1).
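To make Corollary 4.2 concrete, here is a minimal Python sketch of IPAL on a box-constrained problem with fixed step λ ∈ (0, 1/L). The half-space normal uk is the one read off from (27); for brevity the sketch projects onto the newest cut only, whereas IPA proper projects onto the farthest of all stored cuts.

```python
import numpy as np

def ipal(F, x0, lam, lo=-1.0, hi=1.0, tol=1e-8, max_iter=5000):
    # One projection onto C per iteration: z_k = P_C(x_k - lam*F(x_k)),
    # then a closed-form projection of x_k onto the cut
    # H_k = {v : <u_k, v - z_k> <= 0}, u_k = x_k - z_k - lam*(F(x_k) - F(z_k)).
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        z = np.clip(x - lam * F(x), lo, hi)
        r = x - z                      # natural residual r(x, lam)
        if np.linalg.norm(r) <= tol:
            break
        u = r - lam * (F(x) - F(z))
        x = x - (np.dot(u, r) / np.dot(u, u)) * u   # P_{H_k}(x), cf. (27)
    return x
```

With F(x) = (x1², …, xn²)ᵀ on [−1, 1]ⁿ (the flavor of Example 6.1; the Lipschitz modulus is 2 there, so any λ < 1/2 works) and x0 = (−0.8, −0.9), the iterates settle at (−1, −1), a point of S.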
Remark 4.3
If the Lipschitz modulus of F is known, then, from Proposition 4.1, we see that, by taking suitable parameters, IPA requires only one projection onto the feasible set per iteration. Moreover, from Lemma 2.3, the cost of computing the next iteration point of IPA is much lower than that of the algorithm in [2]. Compared with the algorithm in [27], the global convergence of the simplified IPA does not need the extra assumption that {z ∈ C : F(z) = 0}∖SD is a finite set. Moreover, from (23) in the following example, we see that this assumption may not be suitable for quasimonotone variational inequality problems.
Before ending this section, we use the following example to show that the extra assumption in [27] (see also assumption (3)) may not be suitable for quasimonotone variational inequality problems.
Example 4.4
Let C = [− 1, 1] × [− 1, 1] and \(F(x) = ({x_{1}^{2}},0)^{T}\) with \(x = (x_{1},x_{2})^{T}\in \mathbb {R}^{2}\). Then, it is routine to check that F is quasimonotone on C. Moreover, we see that
$$ S_{D}=\{(-1,t)^{T}: t\in[-1,1]\}\neq\emptyset. $$
Moreover,
$$ \{z\in C: F(z)=0\}\setminus S_{D}=\{(0,t)^{T}: t\in[-1,1]\} \text{~ is an infinite set.} $$(23)
5 Convergence rate of IPA
In this section, we analyze the rate of the sequence generated by IPA. Our analysis based on the following local error bound; there exist positive numbers c1 and c2 such that
where SD is the solution set of dual variational inequality of Problem (1). This condition is different with the error bound condition in [11], see formula (3) therein. The difference is that we replace S with SD in [11] therein. However, from (2), we see that (24) is coincide with the error bound in [11] if F is in addition pseudomonotone on C. The error bound plays an central role in convergence rate analysis. Interested reader can refer [44,45,46] and the survey paper [47] for the sufficient conditions of the error bound.
Theorem 5.1
Suppose that F is Lipschitz continuous on \(\mathbb {R}^{n}\) with modulus L and SD ≠ ∅. Let {xk} be the infinite sequence generated by IPA. Then, the following statements hold.
-
(a)
There exist t1 > 0 and t2 > 0 such that
$$ \begin{array}{@{}rcl@{}} t_{1}\|r(x^{k},1)\|\leq\text{dist} (x^{k}, H_{k})\leq t_{2} \|r(x^{k},1)\|\text{ for all }k. \end{array} $$(25) -
(b)
If in addition (24) holds, the convergence of the sequence generated by IPA is Q-linear; i.e., there exists a positive number \(\tilde c \in [0,1)\) such that
$$ \begin{array}{@{}rcl@{}} \text{dist} (x^{k+1}, S_{D})\leq \tilde c\cdot \text{dist} (x^{k},S_{D})\text{ for sufficient large }k. \end{array} $$(26)
Proof
-
(a)
From Lemma 3.3, we have xk ∉ Hk for all k. Using this together with Lemma 2.3, we see that
$$ \begin{array}{@{}rcl@{}} P_{H_{k}}(x^{k})\!&=&\! x^{k}-\frac{\langle{x^{k} - z^{k} - \alpha_{k}(F(x^{k})-F(z^{k}))},{x^{k}-z^{k}}\rangle}{\|{x^{k}-z^{k} - \alpha_{k}(F(x^{k}) - F(z^{k}))}\|^{2}}(x^{k}-z^{k}-\alpha_{k}(F(x^{k})\\&&\!-F(z^{k}))). \end{array} $$(27)
Moreover, it follows that
where the first inequality holds from the triangle inequality and the fact αk > 0 for all k, the second inequality holds from the Cauchy-Schwarz inequality, the third and fifth inequalities hold from (??) and the fact σ < 1, the sixth inequality holds from (6) and the last inequality holds from (16). Let \(t_{1} = \frac {1-\sigma }{1+\sigma }\min \limits \{1,\alpha _{min}, \eta \frac {\sigma }{L}\}\). Then, the first inequality of (25) holds.
Next, from (27), we see that
where the first inequality holds from the fact αk∥F(xk) − F(zk)∥≤ σ∥xk − zk∥, the second inequality holds from (6) and the last inequality holds from the facts \(0<{\alpha _{k}^{0}}\leq \alpha _{\max \limits }\) and 0 < η < 1. Let \(t_{2} = \frac {1+\sigma }{1-\sigma }\max \limits \{1,\alpha _{\max \limits }\}\). Then, the second inequality of (25) holds.
-
(b)
Note that, from the fact SD = ∩y∈C{x : 〈F(y), y − x〉≥ 0}, SD is closed and convex. Using this together with the assumption SD ≠ ∅, we see that there exists a unique vector \(\bar x^{k}\in S_{D}\) such that
$$ \begin{array}{@{}rcl@{}} \bar x^{k}: = P_{S_{D}}(x^{k}) \text{ for each }k. \end{array} $$(28)
We first claim that the following inequality holds:
$$ \text{dist}^{2}(x^{k+1}, S_{D})\leq \text{dist}^{2}(x^{k}, S_{D})-{t_{1}^{2}}\|r(x^{k},1)\|^{2} \text{~ for all ~} k. $$(29)
To this end, recalling from Lemma 3.3 that \(S_{D}\subset \bigcap _{k=0}^{\infty } H_{k}\) and using (17), we see that
Using this, we obtain that
where the first equality holds from (28) and the last inequality holds from (25).
Now, we are ready to show our main conclusion.
From Theorem 3.4(b), we see that \(\lim \limits _{k\rightarrow \infty }\text {dist}(x^{k}, H_{k})=0\). Using this together with the first relation of (25), it follows that \(\lim \limits _{k\rightarrow \infty }\|r(x^{k},1)\|=0\). Hence, for the c2 > 0 given in (24), there exists an integer N > 0 such that ∥r(xk,1)∥≤ c2 for all k ≥ N. This together with (29) and (24) gives
$$ \text{dist}^{2}\left( x^{k+1}, S_{D}\right)\leq \left( 1-\frac{{t_{1}^{2}}}{{c_{1}^{2}}}\right)\text{dist}^{2}\left( x^{k}, S_{D}\right) \text{~ for all ~} k\geq N. $$
Since \(\text {dist}^{2}\left (x^{k+1}, S_{D}\right ) \geq 0\), it follows that \(1-\frac {{t_{1}^{2}}}{{c_{1}^{2}}}\geq 0\). Let \(\tilde c = \sqrt {1-\frac {{t_{1}^{2}}}{{c_{1}^{2}}}}\). Then (26) holds. □
6 Numerical experiments
In this section, we test IPA, Algorithm 2.1 in [2] (YH for short) and Algorithm 2 in [31] (DHF for short) on high-dimensional nonmonotone variational inequality problems. All codes are written in Matlab and run in Matlab2015a on a notebook with an Intel(R) Core(TM) i7-5500U CPU (2.40 GHz) and 8 GB of RAM.
In IPA, we take \(\alpha _{\min \limits }=10^{-10}\), \(\alpha _{\max \limits }=10^{10}\), η = 0.5 and σ = 0.99. Inspired by the renowned Barzilai-Borwein step-size and self-adaptive step-size, we take the parameter \({\alpha _{k}^{0}}\) as follows: set \({\alpha _{0}^{0}} = 1\) and take, for k ≥ 1,
which is similar to that in our former work [43]. In YH, we take γ = 0.4 and σ = 0.99. In DHF, we take the parameters proposed in [31, Table 5], i.e., \(\gamma _{k} = \frac {k+0.1}{30(k+1)}\), θ = 0.98 and ρ = 0.5.
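The update (30) itself is not reproduced here; for orientation, the sketch below shows the generic Barzilai-Borwein step safeguarded into [αmin, αmax], which conveys the flavor of such rules but is not the paper's exact formula.

```python
import numpy as np

def bb_step(x_prev, x_cur, F_prev, F_cur, a_min=1e-10, a_max=1e10):
    # Generic Barzilai-Borwein step alpha = <s, s>/<s, y> with
    # s = x_cur - x_prev and y = F_cur - F_prev, clipped into [a_min, a_max].
    # Illustrative only; the paper's rule (30) is a self-adaptive variant.
    s = x_cur - x_prev
    y = F_cur - F_prev
    sy = float(np.dot(s, y))
    if sy <= 0.0:                  # curvature test fails: fall back to a_max
        return a_max
    return float(np.clip(np.dot(s, s) / sy, a_min, a_max))
```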
Example 6.1
Let
$$ C=[-1,1]^{n}\text{~ and ~}F(x)=\left( {x_{1}^{2}},\dots,{x_{n}^{2}}\right)^{T}\text{~ with ~}x=(x_{1},\dots,x_{n})^{T}\in\mathbb{R}^{n}. $$
Then, it is routine to check that \(S_{D} = \{(-1,\dots ,-1)\}\) and \(S = \{(x_{1},\dots ,x_{n}):x_{i} \in \{-1,0\},\forall i \}\). Moreover, from the fact that \(x\mapsto x^{2}\) is quasimonotone on [− 1,1], [3, Theorem 3.1] and [3, Example 3.1], we see that F is not quasimonotone on C whenever n ≥ 2.
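Assuming Example 6.1's data are C = [−1, 1]ⁿ and F(x) = (x1², …, xn²)ᵀ (an assumption consistent with the stated solution sets), membership of (−1, …, −1) in SD amounts to Σi yi²(yi + 1) ≥ 0 for all y ∈ C, which holds since yi ≥ −1. A quick randomized spot-check in Python:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x_star = -np.ones(n)          # the claimed unique element of S_D

# <F(y), y - x*> = sum_i y_i^2 * (y_i + 1) >= 0 on C = [-1, 1]^n,
# with F(y) = (y_1^2, ..., y_n^2) assumed as above.
for _ in range(1000):
    y = rng.uniform(-1.0, 1.0, size=n)
    assert np.dot(y**2, y - x_star) >= 0.0
```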
In Example 6.1, the termination tolerances for YH and DHF are both tol = 10− 5 and the termination tolerance for IPA is tol = 10− 6; i.e., YH and DHF terminate whenever ∥r(x, 1)∥ ≤ 10− 5 and IPA terminates whenever ∥r(x, α)∥≤ 10− 6. We first generate 9 initial points x0 by setting
where \(ones(n,1)\in \mathbb {R}^{n\times 1}\) is the vector whose components are all 1. Next, for a fixed dimension n, we run YH, DHF and IPA on Example 6.1 from each of the 9 initial points. Finally, we report the number of iterations (iter), the number of projections (np), the CPU time (in seconds) and disxSFootnote 3 in Table 1, averaged over the 9 initial points.
Example 6.2
In this example, we set
Then, we can check that \(S_{D} = \{(1,\dots ,1)\}\) and \(S = \{(x_{1},\dots ,x_{n}):x_{i} \in \{0,1\},\forall i \}\). Let I denote the identity mapping. Then, comparing with Example 6.1, we see that G is the difference of F and I.
In Example 6.2, the termination tolerances for YH and DHF are both tol = 10− 4 and the termination tolerance for IPA is tol = 10− 6. We first use YH, DHF and IPA to solve Example 6.2 with n = 1 and initial points \(x^{0}\in \{0.1, 0.2, 0.3,\dots ,0.9\}\). We let ∖ denote that an algorithm fails to find an approximate solution within 2 minutes of CPU time. We report iter, np, the CPU time (in seconds) and disxS in Table 2.
Based on Table 2, we test DHF and IPA on the high-dimensional Example 6.2 with initial points belonging to the following set
We report iter, np, the CPU time (in seconds) and disxS in Table 3, averaged over the 6 different initial points.
Remark 6.3
From Tables 1, 2 and 3, we see that YH, DHF and IPA are all able to find a solution of nonmonotone VIs. Obviously, IPA is much more efficient than YH and DHF in terms of CPU time. Moreover, from Table 2, we see that IPA is less dependent on the initial value than YH and DHF.
Notes
From Lemma 2.8 (b) below or [35, Theorem 3.5.4], we see that the global minimizer on C of a smooth quasiconvex function f belongs to SD with F = ∇f. Hence, our algorithm can be applied to a class of quasiconvex optimization problems.
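The reasoning behind this note can be written out in one line. For a differentiable quasiconvex \(f\), \(f(x)\le f(y)\) implies \(\langle \nabla f(y), x-y\rangle \le 0\) (cf. [35]); applying this with a global minimizer \(x^{*}\) of \(f\) on \(C\) gives, for every \(y\in C\),

```latex
\[
  f(x^{*}) \le f(y)
  \;\Longrightarrow\;
  \langle \nabla f(y),\, x^{*}-y \rangle \le 0
  \;\Longleftrightarrow\;
  \langle \nabla f(y),\, y-x^{*} \rangle \ge 0,
\]
```

which is precisely the condition \(x^{*}\in S_{D}\) with \(F=\nabla f\).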
For each k, we take the step-size αk to be a fixed positive number, thereby avoiding the computation of αk by (??).
Here, disxS denotes the distance from the output point to S.
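For Example 6.1, where \(S = \{(x_{1},\dots ,x_{n}):x_{i} \in \{-1,0\}\}\), this distance can be computed coordinate-wise, since S is a product of the two-point set {−1, 0} in each coordinate. A minimal sketch (the function name is ours):

```python
import numpy as np

def dis_x_S(x):
    # S is a product of {-1, 0} in each coordinate, so the nearest point of S
    # is obtained by rounding each x_i to whichever of -1 or 0 is closer.
    nearest = np.where(np.abs(x + 1.0) < np.abs(x), -1.0, 0.0)
    return np.linalg.norm(x - nearest)
```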
References
Karamardian, S.: Complementarity problems over cones with monotone and pseudomonotone maps. J. Optim. Theory Appl. 18, 445–454 (1976)
Ye, M., He, Y.: A double projection method for solving variational inequalities without monotonicity. Comput. Optim. Appl. 60, 141–150 (2015)
He, Y.: Solvability of the Minty variational inequality. J. Optim. Theory Appl. 174, 686–692 (2017)
Goldstein, A.A.: Convex programming in Hilbert space. Bull. Amer. Math. Soc. 70, 709–710 (1964)
Levitin, E.S., Polyak, B.T.: Constrained minimization problems. USSR Comput. Math. Math. Phys. 6, 1–50 (1966)
Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976)
Korpelevich, G.M.: The extragradient method for finding saddle points and other problems. Matecon 17(10), 747–756 (1976)
Khobotov, E.N.: Modification of the extragradient method for solving variational inequalities and certain optimization problems. USSR Comput. Math. Math. Phys. 27, 120–127 (1987)
Solodov, M.V., Svaiter, B.F.: A new projection method for variational inequality problems. SIAM J. Control Optim. 37(3), 765–776 (1999)
He, Y.R.: A new double projection algorithm for variational inequalities. J. Comput. Appl. Math. 185(1), 166–173 (2006)
Solodov, M.V., Tseng, P.: Modified projection-type methods for monotone variational inequalities. SIAM J. Control Optim. 34(5), 1814–1830 (1996)
He, B.: A class of projection and contraction methods for monotone variational inequalities. Appl. Math. Optim. 35, 69–76 (1997)
Censor, Y., Gibali, A., Reich, S.: Extensions of Korpelevich’s extragradient method for the variational inequality problem in Euclidean space. Optimization 61, 1119–1132 (2012)
Han, D.R., Lo, H.K.: Two new self-adaptive projection methods for variational inequality problems. Comp. Math. Appl. 43, 1529–1537 (2002)
Malitsky, Y.V.: Projected reflected gradient methods for monotone variational inequalities. SIAM J. Optim. 25(1), 502–520 (2015)
Konnov, I.V.: On the rate of convergence of combined relaxation methods. Izv. Vyssh. Uchebn. Zaved. Mat. 37(12), 89–92 (1993)
Konnov, I.V.: Combined Relaxation Methods for Variational Inequalities. Springer, Berlin (2001)
Konnov, I.V.: Combined Relaxation Methods for Generalized Monotone Variational Inequalities. In: Generalized Convexity and Related Topics. Lecture Notes in Economics and Mathematical Systems, vol. 583, pp. 3–31 (2006)
Tran, D.Q., Dung, M.L., Nguyen, V.H.: Extragradient algorithms extended to equilibrium problems. Optimization 57(6), 749–776 (2008)
Alakoya, T.O., Jolaoso, L.O., Mewomo, O.T.: Modified inertial subgradient extragradient method with self-adaptive stepsize for solving monotone variational inequality and fixed point problems. Optimization 70, 545–574 (2021)
Alakoya, T.O., Taiwo, A., Mewomo, O.T., et al.: An iterative algorithm for solving variational inequality, generalized mixed equilibrium, convex minimization and zeros problems for a class of nonexpansive-type mappings. Ann. Univ Ferrara (2021)
Gibali, A., Jolaoso, L.O., Mewomo, O.T., et al.: Fast and simple Bregman projection methods for solving variational inequalities and related problems in Banach spaces. Results Math. 75, 179 (2020)
Jolaoso, L.O., Taiwo, A., Alakoya, T.O., et al.: A strong convergence theorem for solving pseudo-monotone variational inequalities using projection methods. J. Optim. Theory Appl. 185, 744–766 (2020)
Ogwo, G.N., Izuchukwu, C., Mewomo, O.T.: Inertial methods for finding minimum-norm solutions of the split variational inequality problem beyond monotonicity. Numer. Algor. https://doi.org/10.1007/s11075-021-01081-11 (2021)
Brito, A.S., da Cruz Neto, J.X., Lopes, J.O., Oliveira, P.R.: Interior proximal algorithm for quasiconvex programming problems and variational inequalities with linear constraints. J. Optim. Theory Appl. 154, 217–234 (2012)
Langenberg, N.: An interior proximal method for a class of quasimonotone variational inequalities. J. Optim. Theory Appl. 155, 902–922 (2012)
Liu, H., Yang, J.: Weak convergence of iterative methods for solving quasimonotone variational inequalities. Comput. Optim. Appl. https://doi.org/10.1007/s10589-020-00217-8 (2020)
Burachik, R.S., Millan, R.D.: A projection algorithm for non-monotone variational inequalities. Set-Valued Var. Anal. 28(1), 149–166 (2020)
Strodiot, J.J., Vuong, P.T., Nguyen, T.T.: A class of shrinking projection extragradient methods for solving non-monotone equilibrium problems in Hilbert spaces. J. Glob. Optim. 64(1), 159–178 (2016)
Van Dinh, B., Kim, D.S.: Projection algorithms for solving nonmonotone equilibrium problems in Hilbert space. J. Comput. Appl. Math. 302, 106–117 (2016)
Deng, L., Hu, R., Fang, Y.: Projection extragradient algorithms for solving nonmonotone and non-Lipschitzian equilibrium problems in Hilbert spaces. Numer. Algor. 86, 191–221 (2021)
Lei, M., He, Y.R.: An extragradient method for solving variational inequalities without monotonicity. J. Optim. Theory Appl. 188, 432–446 (2021)
Ye, M.L.: A half-space projection method for solving generalized Nash equilibrium problems. Optimization 66(7), 1119–1134 (2017)
Ye, M.L.: An improved projection method for solving generalized variational inequality problems. Optimization 67(9), 1523–1533 (2018)
Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms, 3rd edn. Wiley, New Jersey (2006)
Zarantonello, E.H.: Projections on convex sets in Hilbert space and spectral theory. In: Contributions to Nonlinear Functional Analysis. Academic Press, New York (1971)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Xiu, N.H., Zhang, J.Z.: Some recent advances in projection-type methods for variational inequalities. J. Comput. Appl. Math. 152, 559–585 (2003)
Gafni, E.M., Bertsekas, D.P.: Two-metric projection problems and descent methods for asymmetric variational inequality problems. Math. Program. 53, 99–110 (1984)
Polyak, B.T.: Introduction to Optimization. Optimization Software Inc., New York (1987)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer International Publishing, AG (2017)
Marcotte, P., Zhu, D.L.: A cutting plane method for solving quasimonotone variational inequalities. Comput. Optim. Appl. 20, 317–324 (2001)
Ye, M.L., Pong, T.K.: A subgradient-based approach for finding the maximum feasible subsystem with respect to a set. SIAM J. Optim. 30(2), 1274–1299 (2020)
Luo, Z.Q., Tseng, P.: On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. 30 (2), 408–425 (1992)
Tseng, P.: On linear convergence of iterative methods for the variational inequality problem. J. Comput. Appl. Math. 60(1-2), 237–252 (1995)
Solodov, M.V.: Convergence rate analysis of iterative algorithms for solving variational inequality problems. Math. Program. 96(3), 513–528 (2003)
Pang, J.S.: Error bounds in mathematical programming. Math. Program. 79(1-3), 299–332 (1997)
Funding
This work is partially supported by the National Science Foundation of China (11871359, 11871059 and 11801455) and is supported by scientific research projects of China West Normal University (20A024).
Ye, M. An infeasible projection type algorithm for nonmonotone variational inequalities. Numer Algor 89, 1723–1742 (2022). https://doi.org/10.1007/s11075-021-01170-1