1 Introduction

Let Ω be a closed convex subset of \(R^n\) and let F be a continuous mapping from \(R^n\) into itself. The variational inequality problem, denoted by VI(Ω,F), is to find a vector \(u^*\in \varOmega\) such that

$$ \mbox{VI}(\varOmega,F) \quad \bigl(u-u^*\bigr)^TF\bigl(u^* \bigr) \ge 0, \quad \forall u\in \varOmega. $$
(1.1)

Notice that VI(Ω,F) is invariant when F is multiplied by any positive scalar β. It is well known ([1], p. 267) that, for any β>0,

$$ u^*\ \mbox{is a solution of VI}(\varOmega, F) \quad \Longleftrightarrow \quad u^* = P_{\varOmega}\bigl[ u^* - \beta F\bigl(u^* \bigr) \bigr], $$
(1.2)

where \(P_{\varOmega}(\cdot)\) denotes the projection onto Ω with respect to the Euclidean norm, i.e.,

$$P_{\varOmega}(v) = \mbox{argmin} \bigl\{ \| u-v \| \mid u\in \varOmega \bigr\}. $$

Throughout this paper we assume that the mapping F is monotone and Lipschitz continuous, i.e.,

$$(u-v)^T \bigl(F(u) - F(v)\bigr) \ge 0, \quad \forall u,v\in R^n, $$

and there is a constant L>0 (which need not be known) such that

$$\bigl\| F(u) - F(v)\bigr\| \le L \|u-v\|, \quad \forall u,v\in R^n. $$

Moreover, we assume that the solution set of VI(Ω,F), denoted by \(\varOmega^*\), is nonempty. The nonemptiness of the solution set, together with the monotonicity of F, implies that \(\varOmega^*\) is closed and convex (see p. 158 in [3]).

Among the algorithms for monotone variational inequalities, the extragradient (EG) method proposed by Korpelevich [12] is one of the simplest and most attractive. Each iteration of the extragradient method can be divided into two steps. The k-th iteration of the EG method begins with a given \(u^k\in\varOmega\); the first step produces a vector \(\tilde{u}^{k}\) via the projection

$$ \tilde{u}^k = P_{\varOmega} \bigl[u^k - \beta_k F\bigl(u^k\bigr) \bigr], $$
(1.3a)

where \(\beta_k>0\) is selected to satisfy (see [13])

$$ \beta_k \bigl\| F\bigl(u^k\bigr) - F \bigl(\tilde{u}^k\bigr) \bigr\| \le \nu \bigl\|u^k -\tilde{u}^k \bigr\|, \quad \nu\in (0,1). $$
(1.3b)

Since \(\tilde{u}^{k}\) is not accepted as the new iterate, for convenience of designation we call it a predictor and name \(\beta_k\) the prediction step size. The second step (correction step) of the k-th iteration updates the new iterate \(u^{k+1}\) by

$$ u^{k+1} = P_{\varOmega}\bigl[u^k - \beta_k F\bigl(\tilde{u}^k\bigr)\bigr], $$
(1.3c)

where \(\beta_k\) and \(\tilde{u}^{k}\) are given in (1.3a). The sequence \(\{u^k\}\) generated by the extragradient method is Fejér monotone with respect to the solution set, namely,

$$ \bigl\|u^{k+1}- u^*\bigr\|^2 \le \bigl\|u^k - u^*\bigr\|^2 -\bigl(1- \nu^2\bigr)\bigl\|u^k - \tilde{u}^k\bigr\|^2 . $$
(1.4)

For a proof of the above contraction property, the reader may consult [3] (see pp. 1115–1118 therein). Notice that, in the extragradient method, the step size of the prediction (1.3a) and that of the correction (1.3c) are equal; thus the two steps appear 'symmetric'.
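For illustration, one EG iteration may be sketched as follows (a minimal Python sketch, not the Matlab codes of Sect. 6; the names `F`, `proj` and the shrinking factor 0.5 are illustrative assumptions):

```python
import numpy as np

def extragradient_step(u, F, proj, beta, nu=0.9):
    """One EG iteration: prediction (1.3a) under condition (1.3b),
    then correction (1.3c); proj is the Euclidean projection onto Omega."""
    while True:
        u_tilde = proj(u - beta * F(u))                  # prediction (1.3a)
        if beta * np.linalg.norm(F(u) - F(u_tilde)) \
                <= nu * np.linalg.norm(u - u_tilde):     # condition (1.3b)
            return proj(u - beta * F(u_tilde)), beta     # correction (1.3c)
        beta *= 0.5                                      # shrink beta and retry
```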

Because of its simple iterative form, the extragradient method has recently been applied to large optimization problems in information science, such as machine learning [11, 14, 22, 23], optical networks [16, 17] and speech recognition [18]. In addition, Nemirovski [15] and Tseng [24] proved the O(1/t) convergence rate of the extragradient method. In both theory and practice, interest in the extragradient method has become increasingly active.

In the last decades, a class of projection and contraction (PC) methods for monotone variational inequalities [5, 6, 8, 19, 20] has been developed. As in the extragradient method, each iteration of the PC methods consists of two steps. The prediction step of the PC methods produces the predictor \(\tilde{u}^{k}\) via (1.3a), just as in the extragradient method (but the condition (1.3b) is not necessary). The PC methods exploit a pair of geminate directions [7, 8] offered by the predictor, namely,

$$ d\bigl(u^k,\tilde{u}^k\bigr) = \bigl(u^k - \tilde{u}^k\bigr) - \beta_k\bigl(F \bigl(u^k\bigr) - F\bigl(\tilde{u}^k\bigr)\bigr) \quad \mbox{and} \quad \beta_k F\bigl(\tilde{u}^k\bigr). $$
(1.5)

Here, both directions are ascent directions of the unknown distance function \(\frac{1}{2}\|u-u^{*}\|^{2}\) at the point \(u^k\). Based on such directions, the goal of the correction step is to generate a new iterate which is closer to the solution set. This leads to choosing the 'optimal' step length

$$ \varrho_k = \frac{ (u^k -\tilde{u}^k)^T d(u^k,\tilde{u}^k)}{\|d(u^k,\tilde{u}^k)\|^2}, $$
(1.6)

and a relaxation factor γ∈(0,2); the second step (correction step) of the PC methods then updates the new iterate \(u^{k+1}\) by

$$\begin{aligned} u^{k+1} = u^k - \gamma \varrho_k d\bigl(u^k,\tilde{u}^k\bigr), \end{aligned}$$
(1.7)

or

$$ u^{k+1} = P_{\varOmega} \bigl[u^k - \gamma \varrho_k\beta_k F\bigl( \tilde{u}^k\bigr)\bigr]. $$
(1.8)

The PC methods (without line search) make one (or two) projection(s) onto Ω at each iteration, and the distance from the iterates to the solution set monotonically converges to zero. According to the terminology in [2], these methods belong to the class of Fejér contraction methods. In fact, the only difference between the extragradient method and one of the PC methods is that they use different step sizes in the correction step (see (1.3c) and (1.8)). According to our numerical experiments [6, 8], the PC methods always outperform the extragradient method.
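For concreteness, one PC iteration with the two corrector forms may be sketched as follows (a Python sketch under the assumption that the chosen β satisfies the general conditions of Definition 2.4 below; the names are illustrative):

```python
import numpy as np

def pc_step(u, F, proj, beta, gamma=1.9, variant="II"):
    """One PC iteration: predictor (1.3a), direction (1.5), step length (1.6),
    then corrector (1.7) (variant I) or (1.8) (variant II)."""
    u_tilde = proj(u - beta * F(u))                      # predictor (1.3a)
    d = (u - u_tilde) - beta * (F(u) - F(u_tilde))       # direction (1.5)
    if d @ d == 0.0:                                     # u_tilde solves the VI
        return u_tilde
    rho = (u - u_tilde) @ d / (d @ d)                    # step length (1.6)
    if variant == "I":
        return u - gamma * rho * d                       # corrector (1.7)
    return proj(u - gamma * rho * beta * F(u_tilde))     # corrector (1.8)
```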

Stimulated by the complexity statement of the extragradient method, this paper establishes the O(1/t) convergence rate of the projection and contraction methods for monotone VIs. Recall that \(\varOmega^*\) can be characterized as (see (2.3.2) on p. 159 of [3])

$$\varOmega^* = \bigcap_{u\in \varOmega} \bigl\{ \tilde{u}\in \varOmega : (u-\tilde{u})^TF(u) \ge 0 \bigr\}. $$

This implies that \(\tilde{u} \in \varOmega\) is an approximate solution of VI(Ω,F) with the accuracy ϵ if it satisfies

$$\tilde{u}\in \varOmega \quad \mbox{and} \quad \inf_{u\in \varOmega } \bigl\{ (u-\tilde{u})^T F(u) \bigr\}\ge -\epsilon. $$

In this paper, we show that, for given ϵ>0, in O(L/ϵ) iterations the projection and contraction methods can find a \(\tilde{u}\) such that

$$ \tilde{u}\in \varOmega \quad \mbox{and} \quad \sup _{u\in \mathcal{ D}(\tilde{u})} \bigl\{ (\tilde{u}-u)^T F(u) \bigr\}\le \epsilon, $$
(1.9)

where

$$\mathcal{ D}(\tilde{u}) = \bigl\{ u\in \varOmega \mid \|u-\tilde{u}\| \le 1 \bigr\}. $$

As a byproduct of the complexity analysis, we explain why taking a suitable relaxation factor γ∈(1,2) in the correction steps (1.7) and (1.8) of the PC methods achieves faster convergence.

The outline of this paper is as follows. Section 2 recalls some basic concepts of the projection and contraction methods. In Sect. 3, we investigate the geminate ascent directions of the distance function. Section 4 shows the contraction property of the PC methods. In Sect. 5, we carry out the complexity analysis, which yields an O(1/t) convergence rate and suggests using a large relaxation factor in the correction step of the PC methods. In Sect. 6, we present some numerical results indicating the efficiency of the PC methods in comparison with the extragradient method. Finally, some concluding remarks are given in the last section.

Throughout the paper, the following notational conventions are used. We use \(u^*\) to denote a fixed but arbitrary point in the solution set \(\varOmega^*\). A superscript such as in \(u^k\) refers to a specific vector and usually denotes an iteration index. For any real matrix M and vector v, we denote the transposes by \(M^T\) and \(v^T\), respectively. The Euclidean norm is denoted by ∥⋅∥.

2 Preliminaries

In this section, we summarize the basic concepts of the projection mapping and three fundamental inequalities used in constructing the PC methods. Throughout this paper, we assume that the projection onto Ω in the Euclidean norm has a closed form and is easy to carry out. Since

$$P_{\varOmega}(v) = \mbox{argmin} \biggl\{\frac{1}{2} \| u-v\|^2 \,\Big|\,u\in \varOmega\biggr\}, $$

by the optimality condition of this convex minimization problem, we have

$$ \bigl(v -P_{\varOmega}(v)\bigr)^T \bigl(u -P_{\varOmega}(v) \bigr)\le 0, \quad \forall v\in R^n, \forall\ u \in \varOmega. $$
(2.1)

Consequently, for any \(u\in\varOmega\), it follows from (2.1) that

$$\begin{aligned} \|u-v\|^2 = & \bigl\| \bigl(u - P_{\varOmega}(v)\bigr) -\bigl( v-P_{\varOmega}(v)\bigr)\bigr\|^2 \\ = & \bigl\|u - P_{\varOmega}(v)\bigr\|^2 - 2\bigl(v -P_{\varOmega}(v) \bigr)^T \bigl(u -P_{\varOmega}(v) \bigr) + \bigl\|v -P_{\varOmega}(v)\bigr\|^2 \\ \ge & \bigl\|u - P_{\varOmega}(v)\bigr\|^2 + \bigl\|v - P_{\varOmega}(v) \bigr\|^2. \end{aligned}$$

Therefore, we have

$$ \bigl\| u- P_{\varOmega} (v) \bigr\|^2 \le \| u - v\|^2 - \bigl\| v - P_{\varOmega}(v)\bigr\|^2, \quad \forall\ v\in R^n,\ \forall\ u\in \varOmega. $$
(2.2)
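Both inequalities are easy to verify numerically; a quick check, assuming for illustration that Ω is the nonnegative orthant so that \(P_{\varOmega}(v)=\max(v,0)\) componentwise:

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.standard_normal(8)            # an arbitrary point of R^n
u = np.abs(rng.standard_normal(8))    # an arbitrary point of Omega = R^n_+
Pv = np.maximum(v, 0.0)               # P_Omega(v) for the nonnegative orthant

assert (v - Pv) @ (u - Pv) <= 1e-12   # inequality (2.1)
assert (np.linalg.norm(u - Pv) ** 2
        <= np.linalg.norm(u - v) ** 2
         - np.linalg.norm(v - Pv) ** 2 + 1e-12)   # inequality (2.2)
```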

For given u and β>0, let \(\tilde{u} = P_{\varOmega}[u- \beta F(u)]\) be given via a projection. We say that \(\tilde{u}\) is a test-vector of VI(Ω,F) because

$$u=\tilde{u} \quad \Leftrightarrow \quad u\in \varOmega^*. $$

Since \(\tilde{u} \in \varOmega\), it follows from (1.1) that

$$ (\mbox{FI-1}) \quad \bigl(\tilde{u} - u^*\bigr)^T \beta F\bigl(u^*\bigr) \ge 0, \quad \forall u^*\in \varOmega^*. $$
(2.3)

Setting \(v=u-\beta F(u)\) and \(u=u^*\) in the inequality (2.1), we obtain

$$ (\mbox{FI-2}) \quad \bigl(\tilde{u} - u^*\bigr)^T \bigl((u-\tilde{u})-\beta F(u) \bigr) \ge 0, \quad \forall u^*\in \varOmega^*. $$
(2.4)

Under the assumption that F is monotone we have

$$ (\mbox{FI-3}) \quad \bigl(\tilde{u} - u^*\bigr)^T \beta \bigl( F(\tilde{u})-F\bigl(u^*\bigr) \bigr) \ge 0, \quad \forall u^*\in \varOmega^*. $$
(2.5)

The inequalities (2.3), (2.4) and (2.5) play an important role in the projection and contraction methods. They were emphasized in [5] as three fundamental inequalities in the projection and contraction methods.

Definition 2.1

(Ascent direction)

For any fixed but arbitrary \(u^*\in \varOmega^*\), a direction d is called an ascent direction of \(\frac{1}{2} \|u- u^{*}\|^{2}\) at u if and only if the inner product \((u-u^*)^T d>0\).

Definition 2.2

(Predictor in projection-type methods)

For given \(u^k\), the vector \(\tilde{u}^{k}\) given by (1.3a) is called the predictor.

Definition 2.3

(Prediction step-size condition in EG method)

The condition (1.3b) is called the prediction step-size condition in the extragradient method.

Indeed, the predictor \(\tilde{u}^{k}\) in the projection and contraction methods [5, 6, 8, 19] is produced by (1.3a). Because the mapping F is Lipschitz continuous (even if the constant L>0 is unknown), without loss of generality we can assume that \(\inf_{k\ge 0}\{\beta_k\}\ge \beta_L>0\) with \(\beta_L=O(1/L)\). In practical computation, we can choose a constant ν∈(0,1), make an initial guess of β, and decrease β by a constant factor, repeating the procedure whenever (1.3b) is violated.

Corresponding to the direction \(d(u^{k},\tilde{u}^{k})\) defined in (1.5), there is a correlative ascent direction \(\beta_{k}F(\tilde{u}^{k})\). Notice that the projection equation (1.3a) can be written as

$$ \tilde{u}^k = P_{\varOmega}\bigl\{ \tilde{u}^k- \bigl[\beta_k F\bigl(\tilde{u}^k \bigr) -d\bigl(u^k,\tilde{u}^k\bigr)\bigr]\bigr\}, $$
(2.6)

where \(d(u^{k},\tilde{u}^{k})\) is defined in (1.5). It follows from (1.2) that \(\tilde{u}^{k}\) is a solution of VI(Ω,F) if and only if \(d(u^{k},\tilde{u}^{k})=0\). In [9, 10], \(d(u^{k},\tilde{u}^{k})\) and \(\beta_{k}F(\tilde{u}^{k})\) in (2.6) are called a pair of geminate directions and are denoted by \(d_{1}(u^{k},\tilde{u}^{k})\) and \(d_{2}(u^{k},\tilde{u}^{k})\), respectively. In this paper, we restrict \(d_{2}(u^{k},\tilde{u}^{k})\) to be \(F(\tilde{u}^{k})\) multiplied by a positive scalar \(\beta_k\). The projection and contraction methods considered in this paper belong to the prox-like contraction methods [9, 10]. Instead of the step-size condition of the extragradient method (see (1.3b)), we use the following general step-size conditions for the projection and contraction methods.

Definition 2.4

(Prediction step-size conditions in PC methods)

Let \(c_1, c_2>0\) be given constants. For given \(u^k\), let \(\tilde{u}^{k}\) be given by (1.3a) and \(d(u^{k},\tilde{u}^{k})\) be defined by (1.5). We say that \(\beta_k\) satisfies the prediction step-size conditions of the projection and contraction methods if the related direction \(d(u^{k},\tilde{u}^{k})\) satisfies

$$\begin{aligned} \bigl(u^k-\tilde{u}^k \bigr)^T d\bigl(u^k,\tilde{u}^k\bigr) \ge c_1 \bigl\|u^k-\tilde{u}^k\bigr\|^2 \end{aligned}$$
(2.7)

and

$$ \varrho_k: = \frac{(u^k-\tilde{u}^k)^T d(u^k,\tilde{u}^k)}{ \| d(u^k,\tilde{u}^k)\|^2} \ge c_2. $$
(2.8)
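In code, checking Definition 2.4 for a trial \(\beta_k\) amounts to a few inner products (a sketch; the tolerance handling is our own assumption):

```python
import numpy as np

def satisfies_pc_conditions(u, u_tilde, Fu, Fu_tilde, beta, c1, c2, tol=1e-12):
    """Return True if the trial beta satisfies the general conditions (2.7)-(2.8)."""
    d = (u - u_tilde) - beta * (Fu - Fu_tilde)           # direction of (1.5)
    s = (u - u_tilde) @ d
    if s < c1 * np.linalg.norm(u - u_tilde) ** 2 - tol:  # condition (2.7)
        return False
    return d @ d == 0.0 or s / (d @ d) >= c2 - tol       # condition (2.8)
```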

3 The ascent directions

For any fixed but arbitrary \(u^*\in \varOmega^*\), \((u-u^*)\) is the gradient of the unknown distance function \(\frac{1}{2} \|u- u^{*}\|^{2}\) in the Euclidean norm at the point u.

3.1 Geminate ascent directions

The forthcoming analysis is based on the general conditions. Note that an equivalent expression of (2.6) is

$$ \tilde{u}^k \in \varOmega, \bigl(u- \tilde{u}^k\bigr)^T \bigl\{ \beta_k F\bigl( \tilde{u}^k\bigr) -d\bigl(u^k,\tilde{u}^k \bigr) \bigr\} \ge 0, \quad \forall u\in \varOmega, $$
(3.1)

and from (2.8) we have

$$ \bigl(u^k-\tilde{u}^k \bigr)^Td\bigl(u^k,\tilde{u}^k\bigr) = \varrho_k \bigl\| d\bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2. $$
(3.2)

The following lemmas tell us that both the direction \(d(u^{k},\tilde{u}^{k})\) (for \(u^k\in R^n\)) and \(F(\tilde{u}^{k})\) (for \(u^k\in\varOmega\)) are ascent directions of the function \(\frac{1}{2}\|u-u^{*}\|^{2}\) whenever \(u^k\) is not a solution point. The proofs are similar to those in [7]; for the sake of completeness, we restate them briefly.

Lemma 3.1

Let \(u^k\), \(\tilde{u}^{k}\) and \(d(u^{k},\tilde{u}^{k})\) be given by (2.6) and let the general conditions (2.7) and (2.8) be satisfied. Then we have

$$ \bigl(u^k - u^*\bigr)^T d \bigl(u^k,\tilde{u}^k\bigr) \ge \varrho_k \bigl\| d\bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2, \quad \forall u^k\in R^n,\ u^*\in \varOmega^*. $$
(3.3)

Proof

Note that \(u^*\in\varOmega\). By setting \(u=u^*\) in (3.1) (the equivalent expression of (2.6)), we get

$$\bigl(\tilde{u}^k - u^*\bigr)^Td\bigl(u^k, \tilde{u}^k\bigr) \ge \bigl(\tilde{u}^k - u^* \bigr)^T\beta_k F\bigl(\tilde{u}^k\bigr) \ge 0, \quad \forall u^*\in \varOmega^*. $$

The last inequality follows from the monotonicity of F and \((\tilde{u}^{k} - u^{*})^{T}F(u^{*}) \ge 0\). Therefore,

$$\bigl(u^k-u^*\bigr)^Td\bigl(u^k, \tilde{u}^k\bigr) \ge \bigl(u^k-\tilde{u}^k \bigr)^Td\bigl(u^k,\tilde{u}^k\bigr), \quad \forall u^*\in \varOmega^*. $$

The assertion (3.3) follows directly from the above inequality and (3.2). □

Lemma 3.2

Let \(u^{k}, \tilde{u}^{k}\) and \(d(u^{k},\tilde{u}^{k})\) be given by (2.6) and let the general conditions (2.7) and (2.8) be satisfied. If \(u^k\in\varOmega\), then we have

$$ \bigl(u^k - u^*\bigr)^T \beta_k F\bigl(\tilde{u}^k\bigr) \ge \varrho_k \bigl\| d\bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2, \quad \forall u^*\in \varOmega^*. $$
(3.4)

Proof

Since \((\tilde{u}^{k} - u^{*})^{T}\beta_{k} F(\tilde{u}^{k}) \ge 0\), we have

$$\bigl(u^k - u^*\bigr)^T\beta_k F\bigl( \tilde{u}^k\bigr) \ge \bigl(u^k - \tilde{u}^k \bigr)^T\beta_k F\bigl(\tilde{u}^k\bigr), \quad \forall u^*\in \varOmega^*. $$

Note that, because \(u^k\in\varOmega\), by setting \(u=u^k\) in (3.1) we get

$$\bigl(u^k- \tilde{u}^k\bigr)^T \beta_k F\bigl(\tilde{u}^k\bigr)\ge \bigl(u^k- \tilde{u}^k\bigr)^Td\bigl(u^k, \tilde{u}^k\bigr). $$

From the above two inequalities it follows that

$$\bigl(u^k-u^*\bigr)^T \beta_k F\bigl( \tilde{u}^k\bigr) \ge \bigl(u^k-\tilde{u}^k \bigr)^T d\bigl(u^k,\tilde{u}^k\bigr), \quad \forall u^*\in \varOmega^*. $$

The assertion (3.4) follows directly from the above inequality and (3.2). □

We would like to emphasize that (3.3) holds for \(u^k\in R^n\) while (3.4) holds only for \(u^k\in\varOmega\).

3.2 Ascent directions in the extragradient method

The condition (1.3b) is necessary in the prediction step of the extragradient method. Setting \(u=u^k\), \(\tilde{u}=\tilde{u}^{k}\) and \(\beta=\beta_k\) in the fundamental inequalities (2.3), (2.4) and (2.5), and adding them, we get

$$ \bigl(\tilde{u}^k - u^*\bigr)^T d \bigl(u^k,\tilde{u}^k\bigr) \ge 0, \quad \forall u^*\in \varOmega^*, $$
(3.5)

where \(d(u^{k},\tilde{u}^{k})\) is defined in (1.5). It follows from (3.5) that

$$ \bigl(u^k-u^*\bigr)^T d\bigl(u^k, \tilde{u}^k\bigr) \ge \bigl(u^k-\tilde{u}^k \bigr)^Td\bigl(u^k,\tilde{u}^k\bigr). $$
(3.6)

Note that, under the condition (1.3b), it follows that

$$\begin{aligned} \bigl(u^k-\tilde{u}^k \bigr)^T d\bigl(u^k,\tilde{u}^k\bigr) =& \bigl\|u^k-\tilde{u}^k\bigr\|^2 - \bigl(u^k- \tilde{u}^k\bigr)^T\beta_k \bigl(F \bigl(u^k\bigr)- F\bigl(\tilde{u}^k\bigr)\bigr) \\ \ge & (1- \nu) \bigl\|u^k-\tilde{u}^k\bigr\|^2. \end{aligned}$$
(3.7)

This means that \(d(u^{k},\tilde{u}^{k})\) is an ascent direction of the unknown distance function \(\frac{1}{2}\|u-u^{*}\|^{2}\) at the point \(u^k\), and the condition (2.7) is satisfied with \(c_1=1-\nu\). By using the definition of \(d(u^{k},\tilde{u}^{k})\), we have

$$\begin{aligned} &{2\bigl(u^k-\tilde{u}^k\bigr)^Td \bigl(u^k,\tilde{u}^k\bigr) - \bigl\|d\bigl(u^k, \tilde{u}^k\bigr)\bigr\|^2 } \\ &{ \quad {}= d\bigl(u^k,\tilde{u}^k \bigr)^T\bigl\{ 2\bigl(u^k-\tilde{u}^k\bigr) - d\bigl(u^k,\tilde{u}^k\bigr)\bigr\} } \\ &{ \quad {}= \bigl\{ \bigl(u^k-\tilde{u}^k\bigr) - \beta_k\bigl(F\bigl(u^k\bigr) - F\bigl( \tilde{u}^k\bigr)\bigr)\bigr\}^T \bigl\{ \bigl(u^k-\tilde{u}^k\bigr) + \beta_k\bigl(F \bigl(u^k\bigr) - F\bigl(\tilde{u}^k\bigr)\bigr)\bigr\} } \\ &{ \quad {} = \bigl\|u^k-\tilde{u}^k\bigr\|^2 - \beta_k^2 \bigl\|F\bigl(u^k\bigr) - F\bigl( \tilde{u}^k\bigr)\bigr\|^2 } \\ &{ \quad {}\ge \bigl(1-\nu^2\bigr) \bigl\|u^k-\tilde{u}^k\bigr\|^2.} \end{aligned}$$

From the above inequality it follows that the condition (2.8) is satisfied with \(c_2=\frac{1}{2}\). In other words, the step-size conditions in Definition 2.4 are satisfied whenever the condition (1.3b) holds, and the directions \(d(u^{k},\tilde{u}^{k})\) and \(\beta_{k} F(\tilde{u}^{k})\) defined in (1.5) are a pair of geminate ascent directions. Thus, the step-size condition (1.3b) of the extragradient method is sufficient for those of the projection and contraction methods.
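The two constants can also be observed numerically; in the following Monte Carlo check (our own illustration), r plays the role of \(u^k-\tilde{u}^k\) and g the role of \(\beta_k(F(u^k)-F(\tilde{u}^k))\), rescaled so that (1.3b) holds:

```python
import numpy as np

rng = np.random.default_rng(2)
nu = 0.9
for _ in range(1000):
    r = rng.standard_normal(6)                    # u^k - u_tilde^k
    g = rng.standard_normal(6)                    # beta_k (F(u^k) - F(u_tilde^k))
    if np.linalg.norm(g) > nu * np.linalg.norm(r):
        g *= nu * np.linalg.norm(r) / np.linalg.norm(g)   # enforce (1.3b)
    d = r - g                                     # d(u^k, u_tilde^k) of (1.5)
    assert r @ d >= (1 - nu) * (r @ r) - 1e-10    # (2.7) with c_1 = 1 - nu
    assert (r @ d) / (d @ d) >= 0.5 - 1e-10       # (2.8) with c_2 = 1/2
```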

4 Corrector and the convergence in the contraction sense

The extragradient method uses (1.3c) to update the new iterate. Based on the pair of geminate ascent directions in (2.6), namely \(d(u^{k},\tilde{u}^{k})\) and \(\beta_{k}F(\tilde{u}^{k})\), the projection and contraction methods use one of the following corrector forms to update the new iterate \(u^{k+1}\):

$$ \mbox{(Correction of PC Method-I)}\quad u_I^{k+1} = u^k - \gamma \varrho_k d\bigl(u^k, \tilde{u}^k\bigr), $$
(4.1)

or

$$ \mbox{(Correction of PC Method-II)}\quad u_\mathit{II}^{k+1} = P_{\varOmega}\bigl[ u^k - \gamma \varrho_k \beta_k F\bigl(\tilde{u}^k\bigr)\bigr], $$
(4.2)

where γ∈(0,2) and \(\varrho_k\) is defined in (2.8). Note that the same step length is used in (4.1) and (4.2) even though the search directions are different. Recall that \(\tilde{u}^{k}\) is obtained via a projection; by using the correction form (4.2), we make an additional projection onto Ω in the PC methods. Replacing \(\gamma\varrho_k\) in (4.2) by 1 (when the step-size condition (1.3b) is satisfied), it reduces to the update form of the extragradient method (see (1.3c)). For any solution point \(u^*\in\varOmega^*\), we define

$$ \vartheta_I(\gamma)= \bigl\| u^k - u^*\bigr\|^2 - \bigl\|u_I^{k+1} - u^*\bigr\|^2 $$
(4.3)

and

$$ \vartheta_\mathit{II}(\gamma)= \bigl\| u^k - u^*\bigr\|^2 - \bigl\| u_\mathit{II}^{k+1} - u^*\bigr\|^2, $$
(4.4)

which measure the profit of the k-th iteration. The following theorem gives a lower bound of the profit function; similar results were established in [6–8].

Theorem 4.1

For given \(u^k\), let the general conditions (2.7) and (2.8) be satisfied. If the corrector is updated by (4.1) or (4.2), then for any \(u^*\in\varOmega^*\) and γ>0, we have

$$ \vartheta_ I(\gamma) \ge q(\gamma), $$
(4.5)

and

$$ \vartheta_\mathit{II}(\gamma)\ge q(\gamma) + \bigl\|u_I^{k+1} - u_\mathit{II}^{k+1}\bigr\|^2, $$
(4.6)

respectively, where

$$ q(\gamma) = \gamma(2-\gamma) \varrho_k^2 \bigl\| d\bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2. $$
(4.7)

Proof

Using the definitions of \(\vartheta_I(\gamma)\) and \(u_{I}^{k+1}\) (see (4.1)), we have

$$\begin{aligned} \vartheta_I(\gamma) = & \bigl\| u^k - u^*\bigr\|^2 - \bigl\|u^k - u^* - \gamma \varrho_k d \bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2 \\ = & 2 \gamma \varrho_k\bigl(u^k- u^* \bigr)^Td\bigl(u^k,\tilde{u}^k\bigr) - \gamma^2 \varrho_k^2 \bigl\| d\bigl(u^k, \tilde{u}^k\bigr)\bigr\|^2. \end{aligned}$$
(4.8)

Recalling (3.3), we obtain

$$2 \gamma \varrho_k\bigl(u^k- u^*\bigr)^Td \bigl(u^k,\tilde{u}^k\bigr) \ge 2 \gamma \varrho_k^2 \bigl\| d\bigl(u^k, \tilde{u}^k\bigr)\bigr\|^2 . $$

Substituting it in (4.8) and using the definition of q(γ), we get \(\vartheta_I(\gamma)\ge q(\gamma)\), and the first assertion is proved. Now we turn to the second assertion. Because

$$u_\mathit{II}^{k+1} = P_{\varOmega}\bigl[ u^k - \gamma \varrho_k \beta_k F\bigl(\tilde{u}^k \bigr)\bigr], $$

and \(u^*\in\varOmega\), by setting \(u=u^*\) and \(v= u^{k} - \gamma \varrho_{k} \beta_{k} F(\tilde{u}^{k})\) in (2.2), we have

$$ \bigl\|u^* - u_\mathit{II}^{k+1} \bigr\|^2 \le \bigl\|u^*- \bigl(u^k - \gamma \varrho_k \beta_k F \bigl(\tilde{u}^k\bigr) \bigr)\bigr\|^2 - \bigl\| u^k - \gamma \varrho_k \beta_k F\bigl(\tilde{u}^k \bigr) - u_\mathit{II}^{k+1} \bigr\|^2. $$
(4.9)

Thus,

$$\begin{aligned} \vartheta_\mathit{II}(\gamma) = & \bigl\| u^k - u^*\bigr\|^2 - \bigl\| u_\mathit{II}^{k+1} - u^*\bigr\|^2 \\ \ge & \bigl\| u^k - u^*\bigr\|^2 - \bigl\|\bigl(u^k - u^* \bigr) -\gamma \varrho_k \beta_k F\bigl( \tilde{u}^k\bigr) \bigr\|^2 \\ &{} + \bigl\|\bigl(u^k - u_\mathit{II}^{k+1}\bigr) - \gamma \varrho_k \beta_k F\bigl(\tilde{u}^k\bigr)\bigr\|^2 \\ = & \bigl\| u^k - u_\mathit{II}^{k+1}\bigr\|^2 + 2 \gamma \varrho_k \beta_k\bigl( u_\mathit{II}^{k+1} -u^*\bigr)^T F\bigl(\tilde{u}^k\bigr) \\ \ge & \bigl\| u^k - u_\mathit{II}^{k+1}\bigr\|^2 + 2 \gamma \varrho_k \beta_k\bigl( u_\mathit{II}^{k+1} -\tilde{u}^k\bigr)^T F\bigl(\tilde{u}^k\bigr) . \end{aligned}$$
(4.10)

The last inequality in (4.10) follows from \((\tilde{u}^{k}-u^{*})^{T} F(\tilde{u}^{k}) \ge 0\). Since \(u_{\mathit{II}}^{k+1}\in \varOmega\), by setting \(u=u_{\mathit{II}}^{k+1}\) in (3.1), we get

$$\bigl(u_\mathit{II}^{k+1}-\tilde{u}^k \bigr)^T \bigl\{ \beta_k F\bigl(\tilde{u}^k \bigr) -d\bigl(u^k,\tilde{u}^k\bigr) \bigr\} \ge 0, $$

and consequently, substituting it in the right hand side of (4.10), we obtain

$$\begin{aligned} \vartheta_\mathit{II}(\gamma) \ge & \bigl\| u^k - u_\mathit{II}^{k+1}\bigr\|^2 + 2 \gamma \varrho_k \bigl( u_\mathit{II}^{k+1} -\tilde{u}^k \bigr)^Td\bigl(u^k,\tilde{u}^k\bigr) \\ = & \bigl\| u^k - u_\mathit{II}^{k+1}\bigr\|^2+ 2 \gamma \varrho_k \bigl(u^k -\tilde{u}^k \bigr)^T d\bigl(u^k,\tilde{u}^k\bigr) \\ &{} - 2 \gamma \varrho_k \bigl( u^k- u_\mathit{II}^{k+1} \bigr)^T d\bigl(u^k,\tilde{u}^k\bigr) . \end{aligned}$$
(4.11)

For the two cross terms on the right-hand side of (4.11), we have (by using (3.2))

$$2\gamma\varrho_k\bigl(u^k - \tilde{u}^k\bigr)^Td\bigl(u^k, \tilde{u}^k\bigr)= 2\gamma\varrho_k^2 \bigl\| d \bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2, $$

and

$$\begin{aligned} &-2 \gamma \varrho_k \bigl( u^k- u_\mathit{II}^{k+1} \bigr)^T d\bigl(u^k,\tilde{u}^k\bigr) \\ &\quad {}= \bigl\|\bigl( u^k - u_\mathit{II}^{k+1}\bigr)-\gamma \varrho_kd\bigl(u^k,\tilde{u}^k\bigr) \bigr\|^2 - \bigl\| u^k - u_\mathit{II}^{k+1} \bigr\|^2 - \gamma^2 \varrho_k^2 \bigl\|d \bigl(u^k,\tilde{u}^k\bigr) \bigr\|^2, \end{aligned}$$

respectively. Substituting them in the right-hand side of (4.11) and using \(u^{k}-\gamma \varrho_{k}d(u^{k},\tilde{u}^{k})= u_{I}^{k+1}\), we obtain

$$\begin{aligned} \vartheta_\mathit{II}(\gamma) \ge& \gamma(2-\gamma) \varrho_k^2 \bigl\|d\bigl(u^k,\tilde{u}^k \bigr) \bigr\|^2 + \bigl\| u_I^{k+1} - u_\mathit{II}^{k+1}\bigr\|^2 \\ =& q(\gamma) + \bigl\| u_I^{k+1} - u_\mathit{II}^{k+1}\bigr\|^2, \end{aligned}$$
(4.12)

and the proof is complete. □

Note that q(γ) is a quadratic function of γ; it reaches its maximum at \(\gamma^*=1\). In practice, \(\varrho_k\) is the 'optimal' step size in (4.1) and (4.2), and γ is a relaxation factor. Because q(γ) is a lower bound of \(\vartheta_I(\gamma)\) (resp. \(\vartheta_{\mathit{II}}(\gamma)\)), the desirable new iterate is updated by (4.1) (resp. (4.2)) with γ∈[1,2); the reason is interpreted in Fig. 1.

Fig. 1 The interpretation of γ∈[1,2) for more profit in each iteration

From Theorem 4.1 and the definition of \(\varrho_k\) (see (2.8)), we obtain

$$ \bigl\| u^{k+1} - u^*\bigr\|^2 \le \bigl\| u^{k} - u^*\bigr\|^2 - \gamma(2-\gamma) \varrho_k \bigl(u^k-\tilde{u}^k\bigr)^Td \bigl(u^k,\tilde{u}^k\bigr). $$
(4.13)

By using the general conditions (2.7) and (2.8), it follows from the above inequality that

$$ \bigl\| u^{k+1} - u^*\bigr\|^2 \le \bigl\| u^{k} - u^*\bigr\|^2 - \gamma(2-\gamma) c_1c_2 \bigl\|u^k-\tilde{u}^k\bigr\|^2. $$
(4.14)

Due to the property (4.14), we call the methods using the update forms (4.1) and (4.2) PC Method I and PC Method II, respectively. Note that the assertion (4.14) is derived from the general conditions (2.7)–(2.8). From (4.14), the convergence of the PC methods follows directly (just as the convergence of the extragradient method follows from (1.4); for details, see Theorem 2.1 in [6]).

5 Convergence rate of the PC methods

This section proves the convergence rate of the projection and contraction methods. Recall that the basis of the complexity proof is (see (2.3.2) on p. 159 of [3])

$$ \varOmega^* = \bigcap_{u\in \varOmega} \bigl\{ \tilde{u}\in \varOmega : (u-\tilde{u})^TF(u) \ge 0 \bigr\}. $$
(5.1)

In the sequel, for given ϵ>0 and \(\mathcal{ D} \subset \varOmega\), we focus our attention on finding a \(\tilde{u}\) such that

$$ \tilde{u}\in \varOmega \quad \mbox{and} \quad \sup _{u\in \mathcal{ D}(\tilde{u})} (\tilde{u}-u)^T F(u) \le \epsilon. $$
(5.2)

Although PC Method I uses the update form (4.1), which does not guarantee that \(\{u^k\}\) belongs to Ω, the sequence \(\{\tilde{u}^{k}\}\) lies in Ω for both corrector forms. Now we prove the key inequality of PC Method I for the complexity analysis.

Lemma 5.1

For given \(u^k\in R^n\), let \(\tilde{u}^{k}\) and \(d(u^{k},\tilde{u}^{k})\) be given by (2.6) and let the general conditions (2.7) and (2.8) be satisfied. If the new iterate \(u^{k+1}\) is updated by (4.1) with any γ>0, then we have

$$ \bigl(u- \tilde{u}^k\bigr)^T \gamma \varrho_k\beta_kF\bigl(\tilde{u}^k\bigr) + \frac{1}{2} \bigl(\bigl\|u-u^k\bigr\|^2 - \bigl\|u-u^{k+1}\bigr\|^2 \bigr) \ge \frac{1}{2}q(\gamma), \quad \forall u\in \varOmega, $$
(5.3)

where q(γ) is defined in (4.7).

Proof

Because (due to (3.1))

$$\bigl(u- \tilde{u}^k \bigr)^T \beta_k F \bigl(\tilde{u}^k\bigr) \ge \bigl(u - \tilde{u}^k \bigr)^Td\bigl(u^k,\tilde{u}^k\bigr), \quad \forall u\in \varOmega, $$

and (see (4.1))

$$\gamma\varrho_k d\bigl(u^k,\tilde{u}^k\bigr) = u^k - u^{k+1}, $$

we need only show that

$$ \bigl(u- \tilde{u}^k\bigr)^T \bigl(u^k - u^{k+1}\bigr) + \frac{1}{2} \bigl( \bigl\|u-u^k\bigr\|^2 - \bigl\|u-u^{k+1}\bigr\|^2 \bigr) \ge \frac{1}{2}q(\gamma), \quad \forall u\in \varOmega. $$
(5.4)

For the cross term on the left-hand side of (5.4), namely \((u- \tilde{u}^{k})^{T}(u^{k} - u^{k+1})\), using the identity

$$(a-b)^T(c-d) = \frac{1}{2} \bigl(\|a-d \|^2 - \|a-c\|^2 \bigr) + \frac{1}{2} \bigl(\|c-b \|^2 - \|d-b\|^2 \bigr), $$

we obtain

$$\begin{aligned} \bigl(u- \tilde{u}^k\bigr)^T \bigl(u^k -u^{k+1}\bigr) =& \frac{1}{2} \bigl( \bigl\|u-u^{k+1}\bigr\|^2 - \bigl\|u-u^k\bigr\|^2 \bigr) \\ &{} + \frac{1}{2} \bigl( \bigl\|u^k-\tilde{u}^k \bigr\|^2 - \bigl\|u^{k+1} -\tilde{u}^k \bigr\|^2 \bigr). \end{aligned}$$
(5.5)

By using \(u^{k+1} = u^{k} -\gamma\varrho_{k} d(u^{k},\tilde{u}^{k})\) and (3.2), we get

$$\begin{aligned} \bigl\|u^k-\tilde{u}^k\bigr\|^2 - \bigl\|u^{k+1} - \tilde{u}^k \bigr\|^2 = & \bigl\|u^k- \tilde{u}^k\bigr\|^2 - \bigl\|\bigl( u^k- \tilde{u}^k\bigr) - \gamma \varrho_k d \bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2 \\ = & 2\gamma \varrho_k \bigl( u^k-\tilde{u}^k \bigr)^Td\bigl(u^k,\tilde{u}^k\bigr) - \gamma^2 \varrho_k^2 \bigl\| d\bigl(u^k, \tilde{u}^k\bigr)\bigr\|^2 \\ = & \gamma(2-\gamma) \varrho_k^2 \bigl\| d \bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2. \end{aligned}$$

Substituting it in the right hand side of (5.5) and using the definition of q(γ), we obtain (5.4) and the lemma is proved. □

Both sequences \(\{\tilde{u}^{k}\}\) and \(\{u^k\}\) in PC Method II belong to Ω. In the following lemma, we prove for PC Method II the same assertion as in Lemma 5.1.

Lemma 5.2

For given \(u^k\in\varOmega\), let \(\tilde{u}^{k}\) and \(d(u^{k},\tilde{u}^{k})\) be given by (2.6) and let the general conditions (2.7) and (2.8) be satisfied. If the new iterate \(u^{k+1}\) is updated by (4.2) with any γ>0, then we have

$$\begin{aligned} &\bigl(u- \tilde{u}^k\bigr)^T\gamma \varrho_k\beta_kF\bigl(\tilde{u}^k\bigr) + \frac{1}{2} \bigl(\bigl\|u-u^k\bigr\|^2 - \bigl\|u-u^{k+1} \bigr\|^2 \bigr) \ge \frac{1}{2}q(\gamma), \quad \forall u\in \varOmega, \end{aligned}$$
(5.6)

where q(γ) is defined in (4.7).

Proof

To investigate \((u- \tilde{u}^{k})^{T} \beta_{k} F(\tilde{u}^{k})\), we divide it into the terms

$$ \bigl(u^{k+1}- \tilde{u}^k \bigr)^T \gamma\varrho_k\beta_k F\bigl( \tilde{u}^k\bigr) \quad \mbox{and} \quad \bigl(u- u^{k+1} \bigr)^T \gamma\varrho_k\beta_k F\bigl( \tilde{u}^k\bigr). $$
(5.7)

First, we deal with the term \((u^{k+1}- \tilde{u}^{k})^{T}\gamma\varrho_{k}\beta_{k} F(\tilde{u}^{k})\). Since \(u^{k+1}\in\varOmega\), substituting \(u=u^{k+1}\) in (3.1) we get

$$\begin{aligned} \bigl(u^{k+1}- \tilde{u}^k \bigr)^T \gamma\varrho_k\beta_k F\bigl( \tilde{u}^k\bigr) \ge & \gamma\varrho_k \bigl(u^{k+1}- \tilde{u}^k \bigr)^Td \bigl(u^k,\tilde{u}^k\bigr) \\ = & \gamma\varrho_k\bigl(u^k - \tilde{u}^k \bigr)^T d\bigl(u^k,\tilde{u}^k\bigr) \\ &{} -\gamma \varrho_k \bigl(u^k -u^{k+1}\bigr)^Td \bigl(u^k,\tilde{u}^k\bigr). \end{aligned}$$
(5.8)

For the first cross term on the right-hand side of (5.8), using (3.2), we have

$$\gamma\varrho_k\bigl(u^k - \tilde{u}^k \bigr)^T d\bigl(u^k,\tilde{u}^k\bigr)= \gamma \varrho_k^2 \bigl\| d\bigl(u^k, \tilde{u}^k\bigr)\bigr\|^2. $$

For the second cross term on the right-hand side of (5.8), using the Cauchy-Schwarz inequality, we get

$$- \gamma\varrho_k\bigl(u^k -u^{k+1} \bigr)^Td\bigl(u^k,\tilde{u}^k\bigr) \ge - \frac{1}{2}\bigl\|u^k-u^{k+1}\bigr\|^2 - \frac{1}{2}\gamma^2\varrho_k^2 \bigl\| d \bigl(u^k,\tilde{u}^k\bigr)\bigr\|^2 . $$

Substituting them in the right hand side of (5.8), we obtain

$$\begin{aligned} & \bigl(u^{k+1}- \tilde{u}^k \bigr)^T\gamma\varrho_k \beta_k F\bigl( \tilde{u}^k\bigr) \ge \frac{1}{2}\gamma (2-\gamma) \varrho_k^2 \bigl\| d\bigl(u^k, \tilde{u}^k\bigr)\bigr\|^2 - \frac{1}{2} \bigl\|u^k-u^{k+1}\bigr\|^2. \end{aligned}$$
(5.9)

Now we turn to the term \((u- u^{k+1})^{T} \gamma\varrho_{k}\beta_{k} F(\tilde{u}^{k})\) in (5.7). Since \(u^{k+1}\) is updated by (4.2), it is the projection of \(u^{k} - \gamma\varrho_{k} \beta_{k} F(\tilde{u}^{k})\) onto Ω, and it follows from (2.1) that

$$\bigl\{ \bigl(u^k - \gamma\varrho_k \beta_k F\bigl(\tilde{u}^k\bigr) \bigr) - u^{k+1} \bigr \}^T \bigl(u - u^{k+1} \bigr)\le 0, \quad \forall u\in \varOmega, $$

and consequently

$$\bigl( u - u^{k+1} \bigr)^T\gamma\varrho_k \beta_k F\bigl(\tilde{u}^k\bigr) \ge \bigl( u - u^{k+1} \bigr)^T \bigl(u^k - u^{k+1} \bigr), \quad \forall u\in \varOmega. $$

Applying the identity \(a^{T}b = \frac{1}{2} \{\|a\|^{2} - \|a-b\|^{2} +\|b\|^{2}\}\) to the right-hand side of the last inequality, we obtain

$$\begin{aligned} & \bigl( u - u^{k+1} \bigr)^T\gamma \varrho_k\beta_k F \bigl(\tilde{u}^k \bigr) \\ &\quad {}\ge \frac{1}{2} \bigl(\bigl\|u-u^{k+1}\bigr\|^2 - \bigl\|u-u^k\bigr\|^2 \bigr) + \frac{1}{2} \bigl\|u^k-u^{k+1}\bigr\|^2 . \end{aligned}$$
(5.10)

Adding (5.9) and (5.10) and using the definition of q(γ), we get (5.6) and the proof is complete. □

For the different projection and contraction methods, we have the same key inequality, shown in Lemmas 5.1 and 5.2, respectively. By setting \(u=u^*\) in (5.3) and (5.6), we get

$$\bigl\|u^k-u^*\bigr\|^2 - \bigl\|u^{k+1}-u^*\bigr\|^2 \ge 2\gamma\varrho_k\beta_k\bigl(\tilde{u}^k-u^* \bigr)^TF\bigl(\tilde{u}^k\bigr) + q(\gamma). $$

Because \((\tilde{u}^{k}-u^{*})^{T}F(\tilde{u}^{k})\ge (\tilde{u}^{k}-u^{*})^{T}F(u^{*})\ge 0\) and \(q(\gamma)= \gamma(2 - \gamma) \varrho_{k}^{2} \| d(u^{k},\tilde{u}^{k})\|^{2}\), it follows from the last inequality that

$$\bigl\|u^{k+1}-u^*\bigr\|^2 \le \bigl\|u^k-u^*\bigr\|^2 - \gamma(2-\gamma) \varrho_k^2 \bigl\| d\bigl(u^k, \tilde{u}^k\bigr)\bigr\|^2. $$

This is just the form (4.13) in Sect. 4. In other words, the contraction property (4.13) of the PC methods is a direct consequence of Lemmas 5.1 and 5.2, respectively.

For the convergence rate proof, we allow γ∈(0,2]. In this case, we still have q(γ)≥0. By using the monotonicity of F, from (5.3) and (5.6) we get

$$ \bigl(u- \tilde{u}^k\bigr)^T \varrho_k\beta_kF(u) + \frac{1}{2\gamma} \bigl\|u-u^k\bigr\|^2 \ge \frac{1}{2\gamma} \bigl\|u-u^{k+1} \bigr\|^2, \quad \forall u\in \varOmega. $$
(5.11)

This inequality is essential for the convergence rate proofs.

Theorem 5.1

For any integer t>0, we have a \(\tilde{u}_{t}\in \varOmega\) which satisfies

$$ (\tilde{u}_{t}-u)^T F(u) \le \frac{1}{2\gamma\varUpsilon_t} \bigl\|u-u^0\bigr\|^2,\quad \forall u\in \varOmega, $$
(5.12)

where

$$ \tilde{u}_t = \frac{1}{\varUpsilon_t} \sum _{k=0}^t \varrho_k \beta_k \tilde{u}^k \quad \mathit{and} \quad \varUpsilon_t = \sum _{k=0}^t \varrho_k \beta_k . $$
(5.13)

Proof

Summing the inequality (5.11) over k=0,…,t, we obtain

$$\Biggl( \Biggl( \sum_{k=0}^t \varrho_k \beta_k \Biggr) u - \sum _{k=0}^t \varrho_k \beta_k \tilde{u}^k \Biggr)^T F(u) + \frac{1}{2\gamma} \bigl\|u-u^0\bigr\|^2 \ge 0, \quad \forall u\in \varOmega. $$

Using the notations \(\varUpsilon_t\) and \(\tilde{u}_{t}\) in the above inequality, we derive

$$(\tilde{u}_t - u)^T F(u) \le \frac{\|u-u^0\|^2}{2\gamma\varUpsilon_t}, \quad \forall u\in \varOmega. $$

Indeed, \(\tilde{u}_{t}\in \varOmega\) because it is a convex combination of \(\tilde{u}^{0}, \tilde{u}^{1},\ldots, \tilde{u}^{t}\). The proof is complete. □

If the step-size sequence \(\{\beta_k\}\) in (1.3a) is bounded away from zero, i.e.,

$$ \inf_{k\ge0} \{\beta_k \} \ge \beta_{L}>0, $$
(5.14)

then it follows from (2.8) and (5.13) that

$$\varUpsilon_t \ge (t+1)c_2\beta_{L}, $$

and thus the PC methods have an O(1/t) convergence rate. For any bounded set \(\mathcal{ D}\subset \varOmega\), the PC methods reach

$$(\tilde{u}_{t}-u)^T F(u) \le \epsilon, \quad \forall u\in \mathcal{ D}, \ \mbox{in at most}\ t= \biggl\lceil \frac{ D^2}{2\gamma c_2\beta_{L}\epsilon} \biggr\rceil $$

iterations, where \(\tilde{u}_{t}\) is defined in (5.13) and \(D= \sup \{ \|u-u^{0}\| \mid u\in \mathcal{ D}\}\). This convergence rate is in the ergodic sense; the statement (5.12) suggests taking a larger parameter γ∈(0,2] in the correction steps of the PC methods. To obtain the O(1/t) convergence rate, we need only check that the sequence \(\{\beta_k\}\) is bounded away from zero and that the general conditions in Definition 2.4 are satisfied.
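In an implementation, the ergodic output (5.13) requires only two running sums alongside the iteration; a sketch using the PC Method II correction (names illustrative, a fixed admissible β assumed):

```python
import numpy as np

def pc2_with_ergodic_output(u0, F, proj, beta, gamma=1.9, iters=1000):
    """Run PC Method II and return the ergodic mean u_tilde_t of (5.13)."""
    u = u0.copy()
    upsilon, weighted = 0.0, np.zeros_like(u0)
    for _ in range(iters):
        u_tilde = proj(u - beta * F(u))                   # predictor (1.3a)
        d = (u - u_tilde) - beta * (F(u) - F(u_tilde))    # direction (1.5)
        if d @ d == 0.0:                                  # u_tilde solves the VI
            return u_tilde
        rho = (u - u_tilde) @ d / (d @ d)
        u = proj(u - gamma * rho * beta * F(u_tilde))     # corrector (4.2)
        upsilon += rho * beta                             # running Upsilon_t
        weighted += rho * beta * u_tilde
    return weighted / upsilon                             # u_tilde_t of (5.13)
```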

6 Numerical experiments

This section is devoted to testing the efficiency of the PC methods in comparison with the extragradient method [12]. In particular, we use (1.3a), (1.3b), (1.3c) as the recursion of the extragradient method. All codes are written in Matlab and run on a Lenovo X200 computer with a 2.53 GHz CPU.

6.1 Test examples of minimizing a sum of distances

The first part of the test examples is a min-max problem which can be formulated as a monotone linear variational inequality with a skew-symmetric matrix. The problem is to find the shortest network in a given full Steiner topology (see Example 1 in [25]). The points-edges connections of the network are depicted in Fig. 2, where \(P=\{b_{[1]},\ldots,b_{[10]}\}\) are given points in \(R^2\) (called regular points) and \(x_{[1]},\ldots,x_{[8]}\) are the locations of the additional points (called Steiner points). The coordinates of the 10 regular points are given in Table 1. The mathematical form of the shortest sum of distances under the p-norm is

$$\begin{aligned} &\min_{ x_{[j]}\in R^2} \Biggl\{ \| x_{[1]} - b_{[1]} \|_p + \sum_{j=1}^8 \| x_{[j]} - b_{[j+1]} \|_p + \| x_{[8]} - b_{[10]} \|_p \\ &\qquad \qquad {}+ \sum_{j=1}^7 \| x_{[j]} - x_{[j+1]} \|_p \Biggr\}. \end{aligned}$$
(6.1)

Mainly, we are interested in p=1,2 and ∞. For any \(d\in R^2\) and p=1,2 and ∞, we have

$$ \|d \|_p = \max_{\xi \in B_q} \xi ^T d, $$
(6.2)

where

$$B_q =\bigl\{ \xi \in {R}^2 \mid \| \xi \|_q \le 1 \bigr\} \quad \mbox{and} \quad q= \left \{ \begin{array}{l@{\quad }l} \infty & \mbox{if}\ p=1, \\ 2 &\mbox{if}\ p=2, \\ 1 & \mbox{if}\ p=\infty. \end{array} \right . $$
Fig. 2 Original graph of the problem

Table 1 The coordinates of the 10 regular points

Formulating the problem as a linear variational inequality

The problem (6.1) is equivalent to the following min-max problem

$$\begin{aligned} \min_{x_{[j]}\in R^2} \max_{z_{[i]} \in \mathcal{B}} &\Biggl\{ z_{[1]}^T( x_{[1]} - b_{[1]} ) + \displaystyle \sum_{j=1}^8 z_{[j+1]}^T( x_{[j]} - b_{[j+1]}) \\ &\,\ \quad {}+ z_{[10]}^T( x_{[8]} - b_{[10]}) + \displaystyle\sum_{j=1}^7 z_{[j+10]}^T( x_{[j]} - x_{[j+1]}) \Biggr\}, \end{aligned}$$
(6.3)

where \(z_{[i]}\), i=1,…,17, are vectors in \(R^2\). In fact, \(z_{[i]}\) is the dual variable associated with edge \(e_i\). The compact form of (6.3) is

$$ \min_{x\in \mathcal{R}} \max_{z\in \mathcal{B}} z^T(Ax-b) $$
(6.4)

where

$$ \begin{array}{rcl} x &=& \bigl( x_{[1]}^T, \ldots, x_{[8]}^T \bigr)^T, \qquad z = \bigl( z_{[1]}^T, \ldots, z_{[17]}^T\bigr)^T,\\ \mathcal{R} &=& R^2 \times \cdots \times R^2, \qquad \mathcal{B}= B_q \times \cdots \times B_q, \end{array} $$
(6.5)

A is a block matrix of the form

$$ A= \left ( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} I_2 & & & \\ I_2 & & & \\ & \ddots & & \\ & & \ddots & \\ & & & I_2 \\ & & & I_2 \\ I_2 & -I_2 & & \\ & \ddots & \ddots & \\ & & I_2 & -I_2 \\ \end{array} \right ) \quad \mathrm{and} \quad b = \left ( \begin{array}{c} b_{[1]} \\ b_{[2]} \\ \vdots \\ \vdots \\ b_{[9]} \\ b_{[10]} \\ 0 \\ \vdots \\ 0 \end{array} \right ) . $$
(6.6)

Let \((x^{*}, z^{*})\in \mathcal{R}\times \mathcal{B}\) be any solution of (6.4), it follows that

$$z^T \bigl(Ax^* - b\bigr) \le {z^*}^T\bigl(Ax^* - b\bigr) \le {z^*}^T (Ax-b), \quad \forall {x\in \mathcal{R}}, {z\in \mathcal{B}}. $$

Thus, \((x^*,z^*)\) is a solution of the following variational inequality:

$$ x^*\in \mathcal{R}, z^* \in \mathcal{B}, \quad \left \{\begin{array}{l@{\quad }l} (x-x^*)^T (A^T z^* ) \ge 0, & \forall x\in \mathcal{R}, \\ (z-z^*)^T (-Ax^* + b) \ge 0 , & \forall z \in \mathcal{B}. \end{array} \right . $$
(6.7)

The compact form of (6.7) is the following linear variational inequality:

$$ \mbox{LVI}(\varOmega,M,q) \quad u^*\in \varOmega, \quad \bigl(u-u^* \bigr)^T \bigl(Mu^* + q \bigr) \ge 0, \quad \forall u\in \varOmega, $$
(6.8)

where

$$ u= \left (\begin{array}{c} x \\ z \end{array} \right ), \qquad M=\left (\begin{array}{c@{\quad }c} 0 & A^T \\ -A & 0 \end{array} \right ), \qquad q=\left (\begin{array}{c} 0 \\ b \end{array} \right ) \quad\mathrm{and} \quad \varOmega= \mathcal{R}\times \mathcal{B}. $$
(6.9)

Note that M is skew-symmetric and thus the linear variational inequality is monotone.
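The data (6.6) and (6.9) are straightforward to assemble; a sketch follows (the coordinates of Table 1 are not reproduced here, so random planar points stand in for \(b_{[1]},\ldots,b_{[10]}\); note that M depends only on the fixed topology, not on these points):

```python
import numpy as np

rng = np.random.default_rng(3)
I2 = np.eye(2)
b_pts = [rng.uniform(0.0, 10.0, 2) for _ in range(10)]  # placeholders for Table 1

A = np.zeros((34, 16))                # 17 edges (2 rows each) x 8 Steiner points
A[0:2, 0:2] = I2                      # edge x_[1] - b_[1]
for j in range(8):                    # edges x_[j] - b_[j+1], j = 1,...,8
    A[2 + 2 * j: 4 + 2 * j, 2 * j: 2 * j + 2] = I2
A[18:20, 14:16] = I2                  # edge x_[8] - b_[10]
for j in range(7):                    # edges x_[j] - x_[j+1], j = 1,...,7
    A[20 + 2 * j: 22 + 2 * j, 2 * j: 2 * j + 2] = I2
    A[20 + 2 * j: 22 + 2 * j, 2 * j + 2: 2 * j + 4] = -I2

b = np.concatenate(b_pts + [np.zeros(14)])            # right-hand side of (6.6)
M = np.block([[np.zeros((16, 16)), A.T],
              [-A, np.zeros((34, 34))]])              # skew-symmetric M of (6.9)
q = np.concatenate([np.zeros(16), b])
```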

Solving the LVI by the extragradient method

By using the extragradient method (see (1.3a)–(1.3c)), for given \(u^k\), to produce \(\tilde{u}^{k}\) by (1.3a), the condition (1.3b) should be satisfied. Note that the mapping F(u)=Mu+q in LVI(Ω,M,q) is Lipschitz continuous. For this example, one can calculate ∥M∥≈2.2089. In order to ensure that the condition (1.3b) is satisfied, we should take β∈(0,1/∥M∥). Since 1/∥M∥≈0.4527, we use the extragradient method to solve the problem (6.8)–(6.9) with β=0.30,0.35,0.40 and 0.45, respectively. The starting point is \(u^0=0\) and the iteration is stopped as soon as

$$\bigl\|u^k -\tilde{u}^k\bigr\| \le 10^{-10}. $$

The optimal networks are depicted in Figs. 3, 4 and 5. The iteration numbers for the shortest distance problems under different norms are reported in Table 2. It seems that, for fast convergence, the constant parameter β∈(0,1/∥M∥) should be close to 1/∥M∥.
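For the l2 case, the recursion may be sketched as follows (building on the arrays M and q constructed above; each dual block \(z_{[i]}\) is projected onto the unit 2-ball, while the x-part is unconstrained):

```python
import numpy as np

def proj_Omega(u):
    """Projection onto Omega = R^16 x B with q = 2 (unit 2-balls blockwise)."""
    w = u.copy()
    z = w[16:].reshape(17, 2)
    scale = np.maximum(np.linalg.norm(z, axis=1, keepdims=True), 1.0)
    w[16:] = (z / scale).ravel()
    return w                                     # x-part is left unchanged

F = lambda u: M @ u + q                          # M, q from the sketch above
beta = 0.45                                      # < 1/||M||, so (1.3b) holds
u = np.zeros(50)
for k in range(200000):
    u_tilde = proj_Omega(u - beta * F(u))        # prediction (1.3a)
    if np.linalg.norm(u - u_tilde) <= 1e-10:     # stopping rule of the text
        break
    u = proj_Omega(u - beta * F(u_tilde))        # correction (1.3c)
```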

Fig. 3 Optimal solution, \(l_2\)-norm

Fig. 4 Optimal solution, \(l_1\)-norm

Fig. 5 Optimal solution, \(l_\infty\)-norm

Table 2 Iteration numbers of EG method by using different β’s

Solving the LVI by the PC methods

By using the PC methods, the condition (1.3b) is not necessary; we need only ensure that the general conditions in Definition 2.4 are satisfied. Thus, we can take \(\beta_k\equiv 1\) in (1.3a), and the predictor \(\tilde{u}^{k}\) is given by

$$\tilde{u}^k = P_{\varOmega}\bigl[ u^k - \bigl(Mu^k +q\bigr)\bigr]. $$

Because \(M^T=-M\), the above projection equation can be rewritten as

$$\begin{aligned} \tilde{u}^k =& P_{\varOmega}\bigl\{\tilde{u}^k - \bigl[ \bigl(Mu^k +q\bigr)- \bigl(u^k - \tilde{u}^k\bigr)\bigr]\bigr\} \\ = & P_{\varOmega}\bigl\{\tilde{u}^k -\bigl[ \bigl(M \tilde{u}^k +q\bigr)- \bigl(I+M^T\bigr) \bigl(u^k -\tilde{u}^k\bigr) \bigr]\bigr\}. \end{aligned}$$

This is the form (2.6) with

$$\beta_k\equiv 1, \qquad F\bigl(\tilde{u}^k\bigr) = \bigl(M\tilde{u}^k +q\bigr) \quad \mbox{and} \quad d \bigl(u^k,\tilde{u}^k\bigr) = \bigl(I+M^T \bigr) \bigl(u^k -\tilde{u}^k\bigr). $$

For this \(d(u^{k},\tilde{u}^{k})\), we have

$$\bigl(u^k-\tilde{u}^k\bigr)^Td \bigl(u^k,\tilde{u}^k\bigr) = \bigl\|u^k-\tilde{u}^k\bigr\|^2, \quad (\mbox{due to}\ M^T=-M) $$

and

$$\varrho_k=\frac{(u^k-\tilde{u}^k)^T d(u^k,\tilde{u}^k)}{\|d(u^k,\tilde{u}^k)\|^2} = \frac{\|u^k-\tilde{u}^k\|^2}{\|(I+M^T)(u^k-\tilde{u}^k)\|^2} \ge \frac{1}{\|I+M^T\|^2}. $$

Thus the general conditions in Definition 2.4 are satisfied with \(c_1=1\) and \(c_2=1/\|I+M^T\|^2\). We use PC Method I (see (1.7)) or PC Method II (see (1.8)) to update the new iterate. The iteration numbers and the CPU times are listed in Table 3. For comparison, we also report the CPU times of the extragradient method with β=0.45 (the fastest case in Table 2) and the shortest distance of the network under the different norms.

Table 3 Numerical results of PC methods and comparison with the extragradient method

It is observed that

$$\frac{\mathrm{Computational\ load\ of\ PC\ Method\ II}}{ \mathrm{Computational\ load\ of\ the\ extragradient\ method}} < 40~\%. $$

Both PC methods converge much faster than the extragradient method.
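Specialized to this skew-symmetric LVI, PC Method I takes a particularly simple form (a sketch reusing M, q and proj_Omega from the snippets above):

```python
import numpy as np

gamma = 1.9
u = np.zeros(50)
for k in range(200000):
    u_tilde = proj_Omega(u - (M @ u + q))     # predictor (1.3a) with beta_k = 1
    r = u - u_tilde
    if np.linalg.norm(r) <= 1e-10:
        break
    d = r + M.T @ r                           # d = (I + M^T)(u^k - u_tilde^k)
    rho = (r @ r) / (d @ d)                   # rho_k >= 1 / ||I + M^T||^2
    u = u - gamma * rho * d                   # correction (1.7)
```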

6.2 Test examples of nonlinear complementarity problems

We take the nonlinear complementarity problems

$$\mbox{(NCP)} \quad u\ge 0, \qquad F(u) \ge 0, \qquad u^T F(u) = 0, $$

as the second part of the test examples. The complementarity problem is a variational inequality VI(Ω,F) with \(\varOmega=R^{n}_{+}\), the non-negative orthant of \(R^n\). In this case, \(P_{\varOmega}(v)=\max(v,0)\).

Constructing the test problems

The mapping F(u) in the tested NCP is given by

$$ F(u)=D(u) + Mu + q, $$
(6.10)

where \(D(u): R^n \to R^n\) is the nonlinear part, M is an n×n matrix, and \(q\in R^n\) is a vector.

  • In D(u), the nonlinear part of F(u), the components are

    $$D_j (u)= d_j\cdot\arctan(a_j\cdot u_j), $$

    where a and d are random vectors whose elements are in (0,1).

  • The matrix M in the linear part is given by \(M=A^TA+B\), where A is an n×n matrix whose entries are randomly generated in the interval (−5,+5), and B is an n×n skew-symmetric random matrix (\(B^T=-B\)) whose entries are in the interval (−5,+5).

It is clear that the mapping composed in this way is monotone. We construct the following three sets of test examples by choosing different vectors q in (6.10).

  1. In the first set of test examples, the elements of the vector q are generated from a uniform distribution in the interval (−500,500).

  2. The second set of test examples is similar to the first set. Instead of q∈(−500,500), the vector q is generated from a uniform distribution in the interval (−500,0).

  3. The third set of test examples has a known solution \(u^{*} \in R^{n}_{+}\). Let the vector p be generated from a uniform distribution in the interval (−10,10) and

    $$ u^*= \max(p,0). $$
    (6.11)

    By setting

    $$v= \max(-p,0) \quad \mbox{and} \quad q= v - \bigl(D\bigl(u^*\bigr)+Mu^*\bigr), $$

    we have \(F(u^*)=D(u^*)+Mu^*+q=v=\max(-p,0)\). Thus,

    $$\bigl(u^*\bigr)^TF\bigl(u^*\bigr)= \bigl(\max(p,0) \bigr)^T \bigl(\max(-p,0) \bigr)=0. $$

    In this way, we construct a test NCP with the known solution \(u^*\) described in (6.11).
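A sketch of this third construction (the random ranges follow the text; the function and variable names are our own):

```python
import numpy as np

def make_ncp(n, seed=0):
    """Build F(u) = D(u) + Mu + q of (6.10) with the known solution (6.11)."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.0, 1.0, n)
    dvec = rng.uniform(0.0, 1.0, n)
    A = rng.uniform(-5.0, 5.0, (n, n))
    U = np.triu(rng.uniform(-5.0, 5.0, (n, n)), 1)
    B = U - U.T                                  # skew-symmetric, B^T = -B
    M = A.T @ A + B                              # monotone linear part
    D = lambda u: dvec * np.arctan(a * u)        # monotone nonlinear part
    p = rng.uniform(-10.0, 10.0, n)
    u_star = np.maximum(p, 0.0)                  # known solution (6.11)
    q = np.maximum(-p, 0.0) - (D(u_star) + M @ u_star)
    return (lambda u: D(u) + M @ u + q), u_star
```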

Implementation details

For given \(u^k\), we use (1.3a) to produce \(\tilde{u}^{k}\) satisfying the condition (1.3b) with ν=0.95. Note that in this case the general conditions (2.7) and (2.8) are satisfied with \(c_1=1-\nu\) and \(c_2=1/2\).

We use a modified Armijo rule for choosing the parameter \(\beta_k\). It should be mentioned that, in practical computation, if \(r_{k} :=\beta_{k} \|F(u^{k})-F(\tilde{u}^{k})\|/ \|u^{k}-\tilde{u}^{k}\|\) is too small, it will lead to slow convergence. Therefore, if \(r_k\le\mu=0.4\), the trial parameter \(\beta_k\) is enlarged for the next iteration. In the case that F is Lipschitz continuous with constant L>0, by using the above strategies for choosing \(\beta_k\), we still have \(\inf\{\beta_k\}\ge \beta_L=O(1/L)\). These 'refined' strategies are necessary for fast convergence. Algorithm 1 gives the implementation details.

Algorithm 1

Step 0. Set \(\beta_0=1\), \(u^0\in\varOmega\) and k=0.

Step 1. \(\tilde{u}^{k}=P_{\varOmega}[u^{k}-\beta_{k} F(u^{k})]\),

\(r_{k}:=\displaystyle {{{\beta_k \|F(u^k)-F(\tilde{u}^k)\|} \over {\|u^k-\tilde{u}^k\|}},}\)

while \(r_k>\nu\)

 \(\beta_{k} := 0.7 *\beta_{k} * \min\{ 1, {1\over r_{k}} \}\), \(\tilde{u}^{k}=P_{\varOmega}[u^{k}-\beta_{k} F(u^{k})]\), recompute \(r_k\),

end(while)

Use one of the forms ((1.3c), (4.1) or (4.2)) to update \(u^{k+1}\).

If \(r_k\le \mu\), then \(\beta_k := \beta_k*\nu*0.9/r_k\), end(if)

Step 2. Set \(\beta_{k+1}=\beta_k\), k:=k+1, and go to Step 1.

The iterations begin with \(u^0=0\), \(\beta_0=1\) and stop as soon as

$$ \frac{\| u^k - P_{\varOmega}[u^k - F(u^k)]\|_\infty}{ \| u^0 - P_{\varOmega}[u^0 - F(u^0)]\|_\infty} \le 10^{-6}. $$
(6.12)
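A runnable sketch of Algorithm 1 with the PC Method II correction (4.2) and the stopping criterion (6.12) may look as follows (Ω = \(R^n_+\), so the projection is a componentwise maximum; the safeguard for \(u^k=\tilde{u}^k\) is our own addition):

```python
import numpy as np

def algorithm1_pc2(F, n, nu=0.95, mu=0.4, gamma=1.9, tol=1e-6, max_it=10**5):
    proj = lambda v: np.maximum(v, 0.0)               # P_Omega for Omega = R^n_+
    u, beta = np.zeros(n), 1.0
    err0 = np.linalg.norm(u - proj(u - F(u)), np.inf) # denominator of (6.12)
    for it in range(1, max_it + 1):
        Fu = F(u)
        while True:                                   # Step 1 of Algorithm 1
            u_t = proj(u - beta * Fu)
            Fut = F(u_t)
            nrm = np.linalg.norm(u - u_t)
            if nrm == 0.0:
                return u, it                          # u already solves the NCP
            r = beta * np.linalg.norm(Fu - Fut) / nrm
            if r <= nu:
                break
            beta *= 0.7 * min(1.0, 1.0 / r)           # shrink beta and retry
        d = (u - u_t) - beta * (Fu - Fut)             # direction (1.5)
        rho = (u - u_t) @ d / (d @ d)                 # step length (1.6)
        u = proj(u - gamma * rho * beta * Fut)        # correction (4.2)
        if r <= mu:
            beta *= nu * 0.9 / r                      # enlarge an over-safe beta
        if np.linalg.norm(u - proj(u - F(u)), np.inf) <= tol * err0:
            return u, it
    return u, max_it

# e.g., with the construction sketched above:
# F, u_star = make_ncp(500); u, iters = algorithm1_pc2(F, 500)
```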

Since both \(F(u^k)\) and \(F(\tilde{u}^{k})\) are involved in the recursions of these methods, each iteration of the test methods needs at least two evaluations of the mapping F. We use No. It and No. F to denote the numbers of iterations and of evaluations of the mapping F, respectively. The sizes of the tested problems range from 500 to 2000.

Comparison between the extragradient method and the PC method II

In the case that the condition (1.3b) is satisfied, replacing \(\gamma\varrho_k\) in (4.2) by 1, PC Method II becomes the extragradient method. According to the assertions in Theorems 4.1 and 5.1, we take the relaxation factor γ=1.9 in the PC methods. The test results for the three sets of NCPs are given in Tables 4, 5 and 6, respectively.

Table 4 Numerical results of the first set examples
Table 5 Numerical results of the second set examples
Table 6 Numerical results of the third set examples

In the third set of test examples, when the stop criterion is satisfied, we have \(\|u^k-u^*\|\approx 2\times 10^{-4}\) for all three test methods. PC Method II and the extragradient method use the same direction but different step sizes in the correction step. The numerical results show that PC Method II is much more efficient than the extragradient method. Although the PC methods need to calculate the step size \(\varrho_k\) in each iteration, the computational load of this additional effort is significantly less than that of the dominating task (the evaluations of \(F(u^k)\) and \(F(\tilde{u}^{k})\)). It is observed that

$$\frac{\mathrm{Computational\ load\ of\ PC\ Method\ II}}{ \mathrm{Computational\ load\ of\ the\ extragradient\ method}} \le 55~\%. $$

The different PC methods use one of the geminate directions but the same step size in their correction forms. Between the PC methods, PC Method II needs fewer iterations than PC Method I; this evidence coincides with the assertions of Theorem 4.1 (see (4.5) and (4.6)). Thus, we suggest using PC Method II when the projection onto Ω is easy to carry out. Otherwise (when the projection is the dominating task of the iteration), we use PC Method I, because its update form (4.1) does not contain a projection.

7 Conclusions

In a unified framework, we proved the O(1/t) convergence rate of the projection and contraction methods for monotone variational inequalities. The convergence rate is the same as that of the extragradient method; in fact, our convergence rate result includes the extragradient method as a special case. The complexity analysis in this paper is based on the general conditions in Definition 2.4 and thus can be extended to a broader class of similar contraction methods. Preliminary numerical results indicate that the PC methods do outperform the extragradient method.