1 Introduction

We study the classical variational inequality problem: find \(x^*\in \mathscr {C}\) such that

$$\begin{aligned} \langle \mathscr {A}x^*,x-x^*\rangle \ge 0,\ \forall x\in \mathscr {C}, \end{aligned}$$
(1)

where \(\mathscr {C}\) is a nonempty closed convex subset of a real Hilbert space \(\mathcal {H}\) and \(\mathscr {A}:\mathcal {H}\rightarrow \mathcal {H}\) is a nonlinear operator. For simplicity, we denote the solution set of the variational inequality problem (1) by \(Sol(\mathscr {C}, \mathscr {A})\).

The variational inequality problem plays an important role in nonlinear analysis: it has stimulated developments in optimization, mathematical statistics, fixed point theory and other mathematical disciplines, and it has a wide range of applications in engineering, mechanics, physics and economics. Many methods for solving variational inequalities have been proposed; in this paper, we focus on a class of projection methods.

It is well known that one of the most classical projection methods is the gradient projection method, which requires the operator to be strongly monotone and Lipschitz continuous. Because of these restrictive conditions, it was later improved by Korpelevich [1] into the following scheme, often referred to as the extragradient algorithm:

$$\begin{aligned} \left\{ \begin{aligned}&y_n=P_\mathscr {C}(x_n-\tau \mathscr {A}x_n),\\&x_{n+1}=P_\mathscr {C}(x_n-\tau \mathscr {A}y_n), \end{aligned} \right. \end{aligned}$$

where \(\tau \in (0,\frac{1}{L})\) and the operator \(\mathscr {A}\) is monotone and L-Lipschitz continuous. The extragradient algorithm was originally designed for saddle point problems and was later applied to variational inequality problems, where it has generated many meaningful results. For example, Yao et al. [2, 3] constructed extragradient algorithms involving Lipschitz continuous and monotone operators for solving variational inequalities and fixed point problems. Vuong [4] proved the convergence of the extragradient algorithm for operators that are pseudo-monotone, Lipschitz continuous and sequentially weakly continuous.

However, when the operator \(\mathscr {A}\) is not Lipschitz continuous, or the Lipschitz constant L is hard to estimate, the above extragradient algorithm fails because the step size \(\tau \) cannot be determined. To overcome this shortcoming, several techniques have been proposed. Thong [5] used a line-search process so that the involved operators only need to be uniformly continuous. Tan et al. [6, 7] and Duvocelle et al. [8] introduced different self-adaptive rules so that algorithms involving Lipschitz continuous and pseudo-monotone operators do not need the Lipschitz constant. In addition, Iusem [9] combined a line-search rule with the extragradient algorithm and obtained the following scheme:

$$\begin{aligned} \left\{ \begin{aligned}&y_n=P_\mathscr {C}(x_n-\tau _n \mathscr {A}x_n),\\&x_{n+1}=P_\mathscr {C}(x_n-\lambda _n \mathscr {A}y_n), \end{aligned} \right. \end{aligned}$$

where \(\tau _{n}:=\gamma l^{m_n},\, m_n:=\min \{m\in \mathbb {N}:\gamma l^m\Vert \mathscr {A}x_n-\mathscr {A}y_n\Vert \le \mu \Vert x_n-y_n\Vert \},\,\mu , l\in (0,1)\) and \(\lambda _n:=\frac{\langle \mathscr {A}y_n,x_n-y_n\rangle }{\Vert \mathscr {A}y_n\Vert ^2}\). Under suitable assumptions, Iusem established a weak convergence theorem without requiring Lipschitz continuity.

Building on the algorithm of Iusem [9], Iusem et al. [10], Thong et al. [11, 12] and Xie et al. [13] refined the line-search rule in various ways and obtained convergence results for the corresponding algorithms. Specifically, the improved algorithm in [12] reads as follows:

$$\begin{aligned} \left\{ \begin{aligned}&y_n=P_\mathscr {C}(x_n-\tau _n \mathscr {A}x_n),\\&z_n=P_\mathscr {C}(x_n-\lambda _n \mathscr {A}y_n),\\&x_{n+1}=\beta _{n}x_0+(1-\beta _{n})z_n, \end{aligned} \right. \end{aligned}$$

where \(\tau _{n}:=\gamma l^{m_n},\, m_n:=\min \{m\in \mathbb {N}:\gamma l^m\langle \mathscr {A}x_n-\mathscr {A}y_n,x_n-y_n\rangle \le \mu \Vert x_n-y_n\Vert ^2\},\,\mu , l\in (0,1)\) and \(\lambda _n:=\frac{1-\mu }{\gamma }\frac{\Vert x_n-y_n\Vert ^2}{\Vert \mathscr {A}y_n\Vert ^2}\). They effectively modified the line-search rule, thereby enlarging the range of admissible values for the sequence \(\{\tau _n\}\). Under suitable assumptions, a strong convergence theorem was obtained by introducing a Halpern-type iteration.

In addition, note that each iteration of the extragradient algorithm requires two projections onto the feasible set \(\mathscr {C}\), and for a general \(\mathscr {C}\) these projections dominate the computational cost. Various methods have been proposed to improve this situation. Vuong et al. [14] and Reich et al. [15] proposed projection algorithms based on different line-search rules and obtained strong convergence theorems by combining them with Halpern iteration and viscosity iteration, respectively. Censor et al. [16, 17] replaced the feasible set \(\mathscr {C}\) in the second projection of the extragradient algorithm by the subgradient half-space \(\{x\in \mathcal {H}:\langle x_n-\tau \mathscr {A}x_n-y_n,x-y_n\rangle \le 0\}\), which accelerates each iteration. Tseng [18] omitted the second projection altogether and replaced the computation of \(P_\mathscr {C}(x_n-\tau \mathscr {A}y_n)\) by \(y_n-\tau (\mathscr {A}y_n-\mathscr {A}x_n)\). Based on Tseng's algorithm, Reich et al. [19] and Yao et al. [20] added a single inertial term and a double inertial term, respectively, and obtained convergence results by combining them with self-adaptive rules; a sketch of both projection-saving steps is given below.

Algorithm 1
figure a

Halpern-type projection method

Vuong and Shehu [14] introduced the half-space \(C_n\) in Algorithm 1 to replace the second projection in the iterative process, an interesting improvement that reduces the cost of the projection computation. At the same time, by combining a line-search rule with the Halpern-type iteration, they obtained a strong convergence theorem for operators that are uniformly continuous, pseudo-monotone and sequentially weakly continuous. Building on Algorithm 1, Reich et al. [15] made further improvements and accelerated convergence with the following algorithm:

Algorithm 2
figure b

Viscosity projection method

Reich et al. [15] proposed a line-search rule different from that of Algorithm 1, thereby adjusting the half-space \(C_n\). The operator \(\mathscr {A}\) in Algorithm 2 is pseudo-monotone, uniformly continuous and satisfies \(\Vert \mathscr {A}q\Vert \le \liminf _{n\rightarrow \infty }\Vert \mathscr {A}x_n\Vert \) whenever \(\{x_n\}\subset \mathscr {C}\) and \(x_n\rightharpoonup q\). Moreover, the viscosity iteration further accelerates the convergence of Algorithm 2. Under appropriate assumptions on the parameters, they obtained a strong convergence theorem.

Motivated and inspired by the above results, we propose a new algorithm for solving variational inequalities with uniformly continuous pseudo-monotone operators. Precisely, we design a novel line-search rule to determine the key step-size sequence and make appropriate adjustments to \(C_n\). In addition, we introduce an inertial term to accelerate the iteration. Finally, numerical comparisons with known results confirm that our proposed algorithm indeed behaves better.

2 Preliminaries

In this section, we recall some basic concepts and facts.

Let \(\mathcal {H}\) be a real Hilbert space. Then

$$\begin{aligned} \Vert x+y\Vert ^2\le \Vert x\Vert ^2+2\langle y,x+y\rangle \end{aligned}$$
(2)

and

$$\begin{aligned} \Vert \alpha x+(1-\alpha )y\Vert ^2= \alpha \Vert x\Vert ^2+(1-\alpha )\Vert y\Vert ^2-\alpha (1-\alpha )\Vert x-y\Vert ^2, \end{aligned}$$
(3)

for every \(x,y\in \mathcal {H}\) and \(\alpha \in \mathbb {R}\).

Let \(\mathscr {C}\) be a nonempty subset of \(\mathcal {H}\). Then an operator \(\mathscr {A}:\mathscr {C} \rightarrow \mathcal {H}\) is called

(a) L-Lipschitz continuous with \(L>0\) if

$$\begin{aligned} \Vert \mathscr {A}x-\mathscr {A}y\Vert \le L\Vert x-y\Vert , \ \forall x,y\in \mathscr {C}. \end{aligned}$$
(4)

If \(L\in (0,1)\), \(\mathscr {A}\) is called a contraction.

(b) monotone if

$$\begin{aligned} \langle \mathscr {A}x-\mathscr {A}y,x-y\rangle \ge 0, \ \forall x,y\in \mathscr {C}. \end{aligned}$$
(5)

(c) pseudo-monotone if

$$\begin{aligned} \langle \mathscr {A}x,y-x\rangle \ge 0 \Rightarrow \langle \mathscr {A}y,y-x\rangle \ge 0, \ \forall x,y\in \mathscr {C}. \end{aligned}$$
(6)

(d) sequentially weakly continuous if for each sequence \(\{x_n\}\subset \mathscr {C}\) we have: \(x_n\rightharpoonup x\) implies \(\mathscr {A}x_n\rightharpoonup \mathscr {A}x\) as \(n\rightarrow \infty \).

For any point \(x\in \mathcal {H}\), there exists a unique nearest point in \(\mathscr {C}\), denoted by \(P_\mathscr {C} x\), satisfying \(\Vert x-P_\mathscr {C}x\Vert \le \Vert x-y\Vert ,\ \forall y\in \mathscr {C}\); \(P_\mathscr {C}\) is called the metric projection of \(\mathcal {H}\) onto \(\mathscr {C}\). Projection formulas will be used in the numerical experiments (Sect. 4). The projection of x onto a half-space \(\mathscr {C}_{u,v}=\{z\in \mathcal {H}:\langle u,z\rangle \le v\} \) is computed by

$$\begin{aligned} P_{\mathscr {C}_{u,v}}(x)=x-\max \{[\langle u,x\rangle -v]/ \Vert u\Vert ^2,0\}u. \end{aligned}$$

Throughout this paper, let \(\mathscr {C}\) be a nonempty closed convex subset of a real Hilbert space \(\mathcal {H}\). Next, list some lemmas that will be needed later.

Lemma 1

([21]) Given \(x\in \mathcal {H}\) and \(z\in \mathscr {C}\). Then

$$\begin{aligned} z=P_\mathscr {C} x\Leftrightarrow \langle x-z,z-y\rangle \ge 0,\ \forall y\in \mathscr {C}. \end{aligned}$$

Lemma 2

([21]) Given \(x\in \mathcal {H}\). Then

(1) \(\Vert P_\mathscr {C}x-P_\mathscr {C}y\Vert ^2\le \langle P_\mathscr {C}x-P_\mathscr {C}y,x-y\rangle , \ \forall y\in \mathcal {H}\);

(2) \(\Vert P_\mathscr {C}x-y\Vert ^2\le \Vert x-y\Vert ^2-\Vert x-P_\mathscr {C}x\Vert ^2, \ \forall y\in \mathscr {C}\).

Lemma 3

([22]) Given \(x\in \mathcal {H}\) and \(\alpha \ge \beta >0\). Then

(1) \(\Vert x-P_\mathscr {C}(x-\beta \mathscr {A}x)\Vert \le \Vert x-P_\mathscr {C}(x-\alpha \mathscr {A}x)\Vert \);

(2) \(\dfrac{\Vert x-P_\mathscr {C}(x-\alpha \mathscr {A}x)\Vert }{\alpha }\le \dfrac{\Vert x-P_\mathscr {C}(x-\beta \mathscr {A}x)\Vert }{\beta }\).

Lemma 4

([23]) Let \(\mathcal {H}_1\) and \(\mathcal {H}_2\) be two real Hilbert spaces. Suppose \(\mathscr {A}:\mathcal {H}_1\rightarrow \mathcal {H}_2\) is uniformly continuous on bounded subsets of \(\mathcal {H}_1\) and M is a bounded subset of \(\mathcal {H}_1\). Then \(\mathscr {A}(M)\) (the image of M under \(\mathscr {A}\)) is bounded.

Lemma 5

([24]) Let \(\mathscr {A}:\mathscr {C}\rightarrow \mathcal {H}\) be pseudo-monotone and continuous. Then, \(x^*\) belongs to \(Sol(\mathscr {C},\mathscr {A})\) if and only if

$$\begin{aligned} \langle \mathscr {A}x,x-x^*\rangle \ge 0, \ \forall x\in \mathscr {C}. \end{aligned}$$

Lemma 6

([25]) Let h be a real-valued function on \(\mathcal {H}\) and define \(K:=\{x\in \mathscr {C}:h(x)\le 0\}\). If K is nonempty and h is Lipschitz continuous on \(\mathscr {C}\) with modulus \(\theta >0\), then

$$\begin{aligned} dist(x,K)\ge \theta ^{-1}\max \{h(x),0\}, \ \forall x\in \mathscr {C}, \end{aligned}$$

where dist(x,K) denotes the distance of x to K.

Lemma 7

([26]) Let \(\{d_n\}\) be a sequence of non-negative real numbers such that there exists a subsequence \(\{d_{n_j}\}\subset \{d_n\}\) with \(d_{n_j}<d_{n_j+1}\) for all \(j\in \mathbb {N}\). Then there exists a non-decreasing sequence \(\{m_k\}\subset \mathbb {N}\) such that \(\lim _{k\rightarrow \infty }m_k=\infty \) and the following properties hold for all (sufficiently large) \(k\in \mathbb {N}\):

$$\begin{aligned} d_{m_k}\le d_{m_k+1},\quad d_k\le d_{m_k+1}. \end{aligned}$$

In fact, \(m_k=\max \{n\in \mathbb {N}:d_n<d_{n+1}, n\le k\}\).

Lemma 8

([27]) Let \(\{d_n\}\) be a sequence of non-negative real numbers such that

$$\begin{aligned} d_{n+1}\le (1-a_n)d_n+a_n b_n, \ \forall n\ge 0, \end{aligned}$$

where \(\{a_n\}\subset (0,1)\) and \(\{b_n\}\) is a real sequence such that

(a) \(\sum _{n=1}^\infty a_n=\infty \);

(b) \(\limsup _{n\rightarrow \infty }b_{n}\le 0\).

Then \(\lim _{n\rightarrow \infty }d_n=0\).

Lemma 9

([12]) Let \(\mathscr {C}\) be a nonempty closed convex subset of a real Hilbert space \(\mathcal {H}\) and \(\mathscr {A}:\mathscr {C}\rightarrow \mathcal {H}\) be pseudo-monotone, sequentially weakly continuous and uniformly continuous on the bounded subsets of \(\mathscr {C}\). Suppose there exists a subsequence \(\{w_{n_k}\}\subset \{w_{n}\}\) such that \(w_{n_k}\rightharpoonup q\in \mathscr {C}\) and \(\lim _{k\rightarrow \infty }\Vert w_{n_k}-P_\mathscr {C}(w_{n_k}-\tau _{n_k}\mathscr {A}w_{n_k})\Vert =0\), where \(\{\tau _{n}\}\) is a positive sequence. If

$$\begin{aligned} \liminf _{k\rightarrow \infty }\langle \mathscr {A}w_{n_k},x-w_{n_k}\rangle \ge 0, \ \forall x\in \mathscr {C}, \end{aligned}$$

then \(q\in Sol(\mathscr {C},\mathscr {A})\).

3 Main results

3.1 Strong convergence

In this section, we propose an improved projection-type method for solving variational inequality problems in Hilbert spaces. First, we give the following conditions:

(C1) The feasible set \(\mathscr {C}\) is a nonempty closed convex subset of a real Hilbert space \(\mathcal {H}\). The solution set \(Sol(\mathscr {C},\mathscr {A})\) is nonempty.

(C2) The operator \(\mathscr {A}:\mathscr {C}\rightarrow \mathcal {H}\) is uniformly continuous, pseudo-monotone and sequentially weakly continuous on \(\mathscr {C}\).

(C3) Let \(f:\mathscr {C}\rightarrow \mathscr {C}\) be a contraction mapping with coefficient \(\rho \in [0,1)\) and let \(\{\beta _n\}\) be a sequence in (0, 1) such that

$$\begin{aligned} \lim _{n\rightarrow \infty }\beta _n=0,\ \sum _{n=1}^{\infty }\beta _n=\infty . \end{aligned}$$

The sequence \(\{\alpha _n\}\) satisfies \(\lim _{n\rightarrow \infty } \frac{\alpha _n}{\beta _n}=0\).

Next, we introduce our new algorithm.

Algorithm 3
figure c

Modified viscosity projection method

Remark 1

  1.

    Since \(\mathscr {C}\) is a convex set, it can be deduced that all iterates \(\{x_n\}\), \(\{w_n\}\), \(\{y_n\}\) and \(\{z_n\}\) generated by Algorithm 3 belong to \(\mathscr {C}\). Therefore, the operator \(\mathscr {A}\) needs to be defined only on \(\mathscr {C}\), rather than necessarily on the entire space \(\mathcal {H}\).

  2.

    The operator \(\mathscr {A}\) is pseudo-monotone and uniformly continuous, which allows us to prove the strong convergence theorem without Lipschitz prior knowledge.

  3.

    The sequence \(\{\alpha _n\}\) could be chosen such that

    $$\begin{aligned} \alpha _n= {\left\{ \begin{array}{ll} \min \{\frac{\theta _n}{\left\| x_1-x_{n}\right\| },\frac{\alpha }{2}\}, &{}\text { if }x_n\ne x_{1},\\ \frac{\alpha }{2}, &{}\text { otherwise },\\ \end{array}\right. } \end{aligned}$$

    where \(\alpha \) is a constant such that \(0\le \alpha <1\) and \(\left\{ \theta _n\right\} \) is a positive sequence such that \(\lim _{n\rightarrow \infty }\frac{\theta _n}{\beta _n}=0\). It is then easy to see that \(\lim _{n\rightarrow \infty } \frac{\alpha _n}{\beta _n}=0\). In addition, if \(\alpha =0\), then \(w_n=x_n\), which is the trivial case; in what follows we mainly study the non-trivial situation.

  4.

    Compared to Algorithms 1 and 2, Algorithm 3 introduces a step-size rule with parameter \(\gamma >0\) in (7), which allows the trial step \(\tau _n\) to start from a value other than 1 in each outer iteration. This modification makes sense because in some cases a larger step size \(\tau _n>1\) may be acceptable (resulting in faster convergence), while in others the accepted step may be much smaller than 1 (in which case it is beneficial to choose \(\gamma <1\) to avoid unnecessary evaluations of \(\mathscr {A}\) in the inner loop). During the inner loop, we initially set \(m=0\), which gives \(\tau _{n}=\gamma \). We substitute this value into (7); if the condition holds, we accept the step, otherwise we set \(m=1\) and continue until the condition in (7) is satisfied. Once it is satisfied, we output the corresponding \(y_n\) and proceed with the outer iteration; see the sketch after this remark.

  5.

    In Algorithm 2, \(h_n(x)\) is defined as \(h_n(x)=\langle \mathscr {A}y_n,x-x_n\rangle +\dfrac{\tau _n}{2\lambda }\Vert r(x_n)\Vert ^2\), where \(\tau _n\Vert r(x_n)\Vert ^2\rightarrow 0\) is a crucial assumption for proving \(r(x_n)\rightarrow 0\). However, the original proof had to distinguish two cases, \(\liminf _{n\rightarrow \infty }\tau _n>0\) and \(\liminf _{n\rightarrow \infty }\tau _n=0\), as detailed in [15, Lemma 3.5]. To streamline the proof, we improve \(h_n(x)\) in (8). This adjustment eliminates the case distinction and leads directly to the conclusion, thus simplifying the overall proof.

Lemma 10

Assume that conditions (C1-C3) hold. Then the line-search rule (7) is well defined.

Proof

If \(w_n\in Sol(\mathscr {C},\mathscr {A})\), then \(w_n = P_\mathscr {C}(w_n-\gamma \mathscr {A}w_n)\), so (7) holds with \(m_n=0\). Next, suppose that \(w_n\notin Sol(\mathscr {C},\mathscr {A})\) and, seeking a contradiction, assume that (7) fails for every m, that is,

$$\begin{aligned} \gamma l^m\langle \mathscr {A}w_n-\mathscr {A}(w_n-\gamma l^m r(w_n)),r(w_n)\rangle >\dfrac{\mu }{2}\Vert r(w_n)\Vert ^2. \end{aligned}$$
(9)

By the Cauchy-Schwarz inequality,

$$\begin{aligned}&\gamma l^m\langle \mathscr {A}w_n-\mathscr {A}(w_n-\gamma l^m r(w_n)),r(w_n)\rangle \nonumber \\&\quad \le \gamma l^m\Vert \mathscr {A}w_n-\mathscr {A}(w_n-\gamma l^m r(w_n))\Vert \Vert r(w_n)\Vert . \end{aligned}$$
(10)

Combining (9) and (10), we have

$$\begin{aligned} \gamma l^m\Vert \mathscr {A}w_n-\mathscr {A}(w_n-\gamma l^m r(w_n))\Vert >\dfrac{\mu }{2}\Vert w_n-P_\mathscr {C}(w_n-\gamma l^m\mathscr {A}w_n)\Vert . \end{aligned}$$

It follows that

$$\begin{aligned} \Vert \mathscr {A}w_n-\mathscr {A}(w_n-\gamma l^m r(w_n))\Vert >\dfrac{\mu }{2}\frac{\Vert w_n-P_\mathscr {C}(w_n-\gamma l^m\mathscr {A}w_n)\Vert }{\gamma l^m}. \end{aligned}$$
(11)

Since \(w_n\in \mathscr {C}\), \(P_\mathscr {C}\) is continuous and \(\mathscr {A}\) is uniformly continuous, we obtain

$$\begin{aligned} \lim _{m\rightarrow \infty }\Vert w_n-P_\mathscr {C}(w_n-\gamma l^m\mathscr {A}w_n)\Vert =0, \end{aligned}$$
(12)

and

$$\begin{aligned} \lim _{m\rightarrow \infty }\Vert \mathscr {A}w_n-\mathscr {A}(w_n-\gamma l^mr(w_n))\Vert =0. \end{aligned}$$

Noticing (11), we get

$$\begin{aligned} \lim _{m\rightarrow \infty }\frac{\Vert w_n-P_\mathscr {C}(w_n-\gamma l^m\mathscr {A}w_n)\Vert }{\gamma l^m}=0. \end{aligned}$$
(13)

Let \(z_m=P_\mathscr {C}(w_n-\gamma l^m \mathscr {A} w_n)\). By Lemma 1, we have

$$\begin{aligned} \langle z_m-w_n+\gamma l^m\mathscr {A}w_n,x-z_m\rangle \ge 0, \ \forall x\in \mathscr {C}. \end{aligned}$$

That is

$$\begin{aligned} \langle \frac{z_m-w_n}{\gamma l^m},x-z_m\rangle +\langle \mathscr {A}w_n,x-z_m\rangle \ge 0,\ \forall x\in \mathscr {C}. \end{aligned}$$

Consequently

$$\begin{aligned} \langle \frac{z_m-w_n}{\gamma l^m},x-z_m\rangle +\langle \mathscr {A}w_n,x-w_n\rangle +\langle \mathscr {A}w_n,w_n-z_m\rangle \ge 0,\ \forall x\in \mathscr {C}. \end{aligned}$$
(14)

Taking \(m\rightarrow \infty \) in (14) and using (12) and (13), we obtain

$$\begin{aligned} \langle \mathscr {A}w_n,x-w_n\rangle \ge 0, \ \forall x\in \mathscr {C}, \end{aligned}$$

it follows that \(w_n\in Sol(\mathscr {C},\mathscr {A})\). This contradiction implies that (7) is well defined. \(\square \)

Lemma 11

Assume that conditions (C1-C3) hold. Let the function \(h_n\) be defined by (8) and \(p\in Sol(\mathscr {C},\mathscr {A})\). Then \(h_n(w_n)=\dfrac{\mu }{2}\Vert r(w_n)\Vert ^2\) and \(h_n(p)\le 0\). Particularly, \(h_n(w_n)>0\) whenever \(r(w_n)\ne 0\).

Proof

According to (8), \(h_n(w_n)=\dfrac{\mu }{2}\Vert r(w_n)\Vert ^2\) is obvious. On the other hand, from the pseudomonotonicity of \(\mathscr {A}\) and \(p\in Sol(\mathscr {C},\mathscr {A})\), we have \(\langle \mathscr {A}y_n,y_n-p\rangle \ge 0\) and

$$\begin{aligned} h_n(p)&=\langle \mathscr {A}y_n,p-w_n\rangle +\dfrac{\mu }{2}\Vert r(w_n)\Vert ^2 \nonumber \\&=\langle \mathscr {A}y_n,p-y_n\rangle +\langle \mathscr {A}y_n,y_n-w_n\rangle +\dfrac{\mu }{2}\Vert r(w_n)\Vert ^2 \nonumber \\&\le -\tau _{n}\langle \mathscr {A}y_n,r(w_n)\rangle +\dfrac{\mu }{2}\Vert r(w_n)\Vert ^2. \end{aligned}$$
(15)

From the rule (7) and the definition of \(\{y_n\}\), we obtain

$$\begin{aligned} \tau _{n}\langle \mathscr {A}w_n-\mathscr {A}y_n,r(w_n)\rangle \le \dfrac{\mu }{2}\Vert r(w_n)\Vert ^2, \end{aligned}$$

that is

$$\begin{aligned} \tau _{n}\langle \mathscr {A}y_n,r(w_n)\rangle \ge \tau _{n}\langle \mathscr {A}w_n,r(w_n)\rangle -\dfrac{\mu }{2}\Vert r(w_n)\Vert ^2. \end{aligned}$$
(16)

According to Lemma 2 (1), we can conclude that

$$\begin{aligned} \Vert w_n-P_\mathscr {C}(w_n-\tau _{n}\mathscr {A}w_n)\Vert ^2\le \tau _{n}\langle \mathscr {A}w_n,w_n-P_\mathscr {C}(w_n-\tau _{n}\mathscr {A}w_n)\rangle , \end{aligned}$$

or equivalently

$$\begin{aligned} \tau _{n}\langle \mathscr {A}w_n,r(w_n)\rangle \ge \Vert r(w_n)\Vert ^2. \end{aligned}$$
(17)

Substituting (16) and (17) into (15), we get that

$$\begin{aligned} h_n(p)\le -(1-\mu )\Vert r(w_n)\Vert ^2. \end{aligned}$$

Since \(\mu \in (0,1]\), we have \(h_n(p)\le 0\). \(\square \)

Lemma 12

Assume that conditions (C1-C3) hold. Let \(\{w_n\}\) be a sequence generated by Algorithm 3. If there exists a subsequence \(\{w_{n_k}\}\subset \{ w_n\}\) that converges weakly to \(q\in \mathscr {C}\) and \(\lim _{k\rightarrow \infty }\Vert w_{n_k}-z_{n_k}\Vert =0\), then \(q\in Sol(\mathscr {C},\mathscr {A})\).

Proof

Since \(z_{n_k}=P_\mathscr {C}(w_{n_k}-\tau _{n_k}\mathscr {A}w_{n_k})\), we have

$$\begin{aligned} \langle w_{n_k}-\tau _{n_k}\mathscr {A}w_{n_k}-z_{n_k},x-z_{n_k}\rangle \le 0, \ \forall x\in \mathscr {C}, \end{aligned}$$

or equivalently

$$\begin{aligned} \frac{1}{\tau _{n_k}} \langle w_{n_k}-z_{n_k},x-z_{n_k}\rangle \le \langle \mathscr {A}w_{n_k},x-z_{n_k}\rangle , \ \forall x\in \mathscr {C}. \end{aligned}$$

Consequently

$$\begin{aligned} \frac{1}{\tau _{n_k}} \langle w_{n_k}-z_{n_k},x-z_{n_k}\rangle +\langle \mathscr {A}w_{n_k},z_{n_k}-w_{n_k}\rangle \le \langle \mathscr {A}w_{n_k},x-w_{n_k}\rangle , \ \forall x\in \mathscr {C}. \end{aligned}$$
(18)

Next, we prove that

$$\begin{aligned} \liminf _{k\rightarrow \infty }\langle \mathscr {A}w_{n_k},x-w_{n_k}\rangle \ge 0, \ \forall x\in \mathscr {C}. \end{aligned}$$
(19)

We consider the following two possible cases.

Case 1 Suppose that \(\liminf _{k\rightarrow \infty }\tau _{n_k}>0\). Since \(w_{n_k} \rightharpoonup q \in \mathscr {C}\), then \(\{w_{n_k}\}\) is bounded. As \(\mathscr {A}\) is uniformly continuous on bounded subsets of \(\mathscr {C}\), by Lemma 4, we get that \(\{\mathscr {A}w_{n_k}\}\) is bounded. Taking \(k\rightarrow \infty \) in (18), since \(\Vert w_{n_k}-z_{n_k}\Vert \rightarrow 0\), we get

$$\begin{aligned} \liminf _{k\rightarrow \infty } \langle \mathscr {A}w_{n_k},x-w_{n_k}\rangle \ge 0. \end{aligned}$$

Case 2 Assume that \(\liminf _{k\rightarrow \infty }\tau _{n_k}=0\). Let \(s_{n_k}=P_\mathscr {C}(w_{n_k}-\tau _{n_k}l^{-1}\mathscr {A}w_{n_k})\); since \(l\in (0,1)\), we have \(\tau _{n_k}l^{-1}>\tau _{n_k}\). Applying Lemma 3, we obtain

$$\begin{aligned} \frac{\Vert w_{n_k}-P_\mathscr {C}(w_{n_k}-\tau _{n_k}l^{-1}\mathscr {A}w_{n_k})\Vert }{\tau _{n_k}l^{-1}}\le \frac{\Vert w_{n_k}-P_\mathscr {C}(w_{n_k}-\tau _{n_k}\mathscr {A}w_{n_k})\Vert }{\tau _{n_k}}, \end{aligned}$$

that is

$$\begin{aligned} \Vert w_{n_k}-s_{n_k}\Vert \le \frac{1}{l}\Vert w_{n_k}-z_{n_k}\Vert \rightarrow 0 \ (k\rightarrow \infty ). \end{aligned}$$
(20)

Therefore, \(s_{n_k}\rightharpoonup q\in \mathscr {C}\), and it follows that \(\{s_{n_k}\}\) is bounded. In addition,

$$\begin{aligned} \Vert \mathscr {A}w_{n_k}-\mathscr {A}(w_{n_k}-\tau _{n_k}l^{-1}r(w_{n_k}))\Vert \rightarrow 0\ (k\rightarrow \infty ). \end{aligned}$$
(21)

Since \(\tau _{n_k}l^{-1}>\tau _{n_k}\), the step \(\tau _{n_k}l^{-1}\) does not satisfy (7); hence, as in (11),

$$\begin{aligned} \Vert \mathscr {A}w_{n_k}-\mathscr {A}(w_{n_k}-\tau _{n_k}l^{-1}r(w_{n_k}))\Vert >\dfrac{\mu }{2}\dfrac{\Vert w_{n_k}-P_\mathscr {C}(w_{n_k}-\tau _{n_k}l^{-1}\mathscr {A}w_{n_k})\Vert }{\tau _{n_k}l^{-1}}, \end{aligned}$$

which implies that

$$\begin{aligned} \lim _{k\rightarrow \infty }\frac{\Vert w_{n_k}-s_{n_k}\Vert }{\tau _{n_k}l^{-1}}=0. \end{aligned}$$
(22)

Furthermore, it follows from the definition of \(\{s_{n_k}\}\) and Lemma 1 that

$$\begin{aligned} \langle w_{n_k}-\tau _{n_k}l^{-1}\mathscr {A}w_{n_k}-s_{n_k},x-s_{n_k}\rangle \le 0, \ \forall x\in \mathscr {C}. \end{aligned}$$

It follows that

$$\begin{aligned} \frac{1}{\tau _{n_k}l^{-1}}\langle w_{n_k}-s_{n_k},x-s_{n_k}\rangle +\langle \mathscr {A}w_{n_k},s_{n_k}-w_{n_k}\rangle \le \langle \mathscr {A}w_{n_k},x-w_{n_k}\rangle , \ \forall x\in \mathscr {C}. \end{aligned}$$
(23)

Taking \(k\rightarrow \infty \) in (23), we get

$$\begin{aligned} \liminf _{k\rightarrow \infty }\langle \mathscr {A}w_{n_k},x-w_{n_k}\rangle \ge 0, \ \forall x\in \mathscr {C}. \end{aligned}$$

By Lemma 9, \(q\in Sol(\mathscr {C},\mathscr {A})\) and the proof is completed. \(\square \)

Theorem 1

Assume that conditions (C1-C3) hold and the sequence \(\{\alpha _n\}\) is chosen such that

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\alpha _n}{\beta _n}\Vert x_1-x_{n}\Vert =0. \end{aligned}$$

Then the sequence \(\{x_n\}\) generated by Algorithm 3 converges strongly to an element \(p\in Sol(\mathscr {C},\mathscr {A})\), where \(p=P_{Sol(\mathscr {C},\mathscr {A})}\circ f(p)\).

Proof

We divide the proof into five claims.

Claim 1. We prove that the sequence \(\{x_n\}\) is bounded. Indeed, let \(s_n:=P_{C_n}(w_n)\); then

$$\begin{aligned} \Vert s_n-p\Vert ^2=\Vert P_{C_n}(w_n)-p\Vert ^2&\le \Vert w_n-p\Vert ^2-\Vert P_{C_n}(w_n)-w_n\Vert ^2\\&=\Vert w_n-p\Vert ^2-dist^2(w_n,C_n)\\&\le \Vert w_n-p\Vert ^2. \end{aligned}$$

That is

$$\begin{aligned} \Vert s_n-p\Vert \le \Vert w_n-p\Vert . \end{aligned}$$
(24)

Thus

$$\begin{aligned} \Vert x_{n+1}-p\Vert&=\Vert \beta _nf(w_n)+(1-\beta _n)s_n-p\Vert \nonumber \\&=\Vert \beta _n(f(w_n)-p)+(1-\beta _n)(s_n-p)\Vert \nonumber \\&\le \beta _n\Vert f(w_n)-p\Vert +(1-\beta _n)\Vert s_n-p\Vert \nonumber \\&\le \beta _n\Vert f(w_n)-f(p)\Vert +\beta _n\Vert f(p)-p\Vert +(1-\beta _n)\Vert s_n-p\Vert \nonumber \\&\le \beta _n\rho \Vert w_n-p\Vert +\beta _n\Vert f(p)-p\Vert +(1-\beta _n)\Vert w_n-p\Vert \nonumber \\&=(1-\beta _n(1-\rho ))\Vert w_n-p\Vert +\beta _n\Vert f(p)-p\Vert . \end{aligned}$$
(25)

It follows from the definition of \(\left\{ w_n\right\} \) that

$$\begin{aligned} \Vert w_n-p\Vert&=\Vert (1-\alpha _{n})x_n+\alpha _{n}x_1-p\Vert \nonumber \\&\le \Vert x_n-p\Vert +\alpha _n\Vert x_1-x_{n}\Vert \nonumber \\&\le \Vert x_n-p\Vert +\beta _n\cdot \frac{\alpha _n}{\beta _n}\Vert x_1-x_{n}\Vert . \end{aligned}$$
(26)

Since

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\alpha _n}{\beta _n}\Vert x_1-x_{n}\Vert =0, \end{aligned}$$

there exists \(N\in \mathbb {N}\) and a constant \(M_1>0\) such that

$$\begin{aligned} \frac{\alpha _n}{\beta _n}\Vert x_1-x_{n}\Vert \le M_1, \ \forall n\ge N. \end{aligned}$$
(27)

Combining (26) and (27), we have

$$\begin{aligned} \Vert w_n-p\Vert \le \Vert x_n-p\Vert +\beta _nM_1. \end{aligned}$$
(28)

Substituting (28) into (25), we get

$$\begin{aligned} \Vert x_{n+1}-p\Vert&\le (1-\beta _n(1-\rho ))\Vert x_n-p\Vert +\beta _nM_1+\beta _n\Vert f(p)-p\Vert \\&\le (1-\beta _n(1-\rho ))\Vert x_n-p\Vert +\beta _n(1-\rho )\frac{\Vert f(p)-p\Vert +M_1}{1-\rho }\\&\le \max \{\Vert x_n-p\Vert ,\frac{\Vert f(p)-p\Vert +M_1}{1-\rho }\}\\&\le \cdots \\&\le \max \{\Vert x_{N}-p\Vert ,\frac{\Vert f(p)-p\Vert +M_1}{1-\rho }\}. \end{aligned}$$

This implies that the sequence \(\{x_n\}\) is bounded. Consequently, the sequences \(\{y_n\},\{f(w_n)\}\) and \(\{\mathscr {A}y_n\}\) are bounded too.

Claim 2. We prove that

$$\begin{aligned} \Vert s_n-w_n\Vert ^2\le \Vert x_n-p\Vert ^2-\Vert x_{n+1}-p\Vert ^2+\beta _{n}M_2+2\beta _{n}\langle f(w_n)-p,x_{n+1}-p\rangle , \end{aligned}$$

for some \(M_2>0\). Indeed, from (2) and (25), we have

$$\begin{aligned} \Vert x_{n+1}-p\Vert ^2&=\Vert \beta _n(f(w_n)-p)+(1-\beta _{n})(s_n-p)\Vert ^2 \nonumber \\&\le (1-\beta _{n})\Vert s_n-p\Vert ^2+2\beta _{n}\langle f(w_n)-p,x_{n+1}-p\rangle \nonumber \\&\le \Vert s_n-p\Vert ^2+2\beta _{n}\langle f(w_n)-p,x_{n+1}-p\rangle \nonumber \\&=\Vert P_{C_n}(w_n)-p\Vert ^2+2\beta _{n}\langle f(w_n)-p,x_{n+1}-p\rangle \nonumber \\&\le \Vert w_n-p\Vert ^2-\Vert s_n-w_n\Vert ^2+2\beta _{n}\langle f(w_n)-p,x_{n+1}-p\rangle . \end{aligned}$$
(29)

Using (28), we get

$$\begin{aligned} \Vert w_n-p\Vert ^2&\le (\Vert x_n-p\Vert +\beta _nM_1)^2 \nonumber \\&=\Vert x_n-p\Vert ^2+\beta _n(2M_1\Vert x_n-p\Vert +\beta _nM_1^2). \end{aligned}$$
(30)

Substituting (30) into (29), we obtain

$$\begin{aligned} \Vert x_{n+1}-p\Vert ^2\le \Vert x_n-p\Vert ^2-\Vert s_n-w_n\Vert ^2+\beta _{n}M_2+2\beta _{n}\langle f(w_n)-p,x_{n+1}-p\rangle , \end{aligned}$$

where \(M_2:=\sup _{n \in \mathbb {N}}\{2M_1\Vert x_n-p\Vert +\beta _nM_1^2\}\). Thus,

$$\begin{aligned} \Vert s_n-w_n\Vert ^2\le \Vert x_n-p\Vert ^2-\Vert x_{n+1}-p\Vert ^2+\beta _{n}M_2+2\beta _{n}\langle f(w_n)-p,x_{n+1}-p\rangle . \end{aligned}$$

Claim 3. We prove that

$$\begin{aligned} (1-\beta _{n})\left[ \dfrac{\mu }{2L}\Vert r(w_n)\Vert ^2\right] ^2\le \Vert x_n-p\Vert ^2-\Vert x_{n+1}-p\Vert ^2+\beta _{n}M_3, \end{aligned}$$

for some \(M_3>0\). Indeed, since \(\{\mathscr {A}y_n\}\) is bounded, there exists \(L>0\) such that \(\Vert \mathscr {A}y_n\Vert \le L\). By the definition of \(h_n(x)\), for all \(u,v\in C_n\),

$$\begin{aligned} |h_n(u)-h_n(v)|=|\langle \mathscr {A}y_n,u-v\rangle |\le \Vert \mathscr {A}y_n\Vert \Vert u-v\Vert \le L\Vert u-v\Vert , \end{aligned}$$

which implies that \(h_n(\cdot )\) is L-Lipschitz continuous on \(C_n\). From Lemma 6, we have

$$\begin{aligned} dist(w_n,C_n)\ge \dfrac{1}{L}h_n(w_n). \end{aligned}$$

Applying Lemma 11, we get

$$\begin{aligned} dist(w_n,C_n)\ge \dfrac{\mu }{2L}\Vert r(w_n)\Vert ^2. \end{aligned}$$
(31)

Thus,

$$\begin{aligned} \Vert s_n-p\Vert ^2\le \Vert w_n-p\Vert ^2-\left[ \dfrac{\mu }{2L}\Vert r(w_n)\Vert ^2\right] ^2. \end{aligned}$$
(32)

On the other hand,

$$\begin{aligned}&\Vert x_{n+1}-p\Vert ^2\\&\quad =\Vert \beta _nf(w_n)+(1-\beta _n)s_n-p\Vert ^2\\&\quad =\Vert \beta _n(f(w_n)-p)+(1-\beta _n)(s_n-p)\Vert ^2\\&\quad =\beta _{n}\Vert f(w_n)-p\Vert ^2+(1-\beta _{n})\Vert s_n-p\Vert ^2-\beta _{n}(1-\beta _{n})\Vert f(w_n)-s_n\Vert ^2\\&\quad \le \beta _{n}\Vert f(w_n)-p\Vert ^2+(1-\beta _{n})\Vert s_n-p\Vert ^2\\&\quad \le \beta _{n}\Vert f(w_n)-p\Vert ^2+(1-\beta _{n})\Vert w_n-p\Vert ^2-(1-\beta _{n})\left[ \dfrac{\mu }{2L}\Vert r(w_n)\Vert ^2\right] ^2\\&\quad \le \beta _{n}\Vert f(w_n)-p\Vert ^2+\Vert w_n-p\Vert ^2-(1-\beta _{n})\left[ \dfrac{\mu }{2L}\Vert r(w_n)\Vert ^2\right] ^2\\&\quad \le \beta _{n}\Vert f(w_n)-p\Vert ^2+\Vert x_n-p\Vert ^2+\beta _n(2M_1\Vert x_n-p\Vert +\beta _nM_1^2)\\&\qquad -(1-\beta _{n})\left[ \dfrac{\mu }{2L}\Vert r(w_n)\Vert ^2\right] ^2\\&\quad \le \Vert x_n-p\Vert ^2+\beta _{n}M_3-(1-\beta _{n})\left[ \dfrac{\mu }{2L}\Vert r(w_n)\Vert ^2\right] ^2, \end{aligned}$$

where \(M_3:=\sup _{n \in \mathbb {N}}\{\Vert f(w_n)-p\Vert ^2+2M_1\Vert x_n-p\Vert +\beta _nM_1^2\}\). Therefore,

$$\begin{aligned} (1-\beta _{n})\left[ \dfrac{\mu }{2L}\Vert r(w_n)\Vert ^2\right] ^2\le \Vert x_n-p\Vert ^2-\Vert x_{n+1}-p\Vert ^2+\beta _{n}M_3. \end{aligned}$$

Claim 4. We prove that

$$\begin{aligned}&\Vert x_{n+1}-p\Vert ^2\\&\quad \le (1-\beta _n(1-\rho ))\Vert x_n-p\Vert ^2+\beta _n(1-\rho )\Big [\frac{2}{1-\rho }\langle f(p)-p,x_{n+1}-p\rangle \\&\qquad +\frac{2M}{1-\rho }\cdot \frac{\alpha _n}{\beta _n}\Vert x_1-x_{n}\Vert \Big ], \end{aligned}$$

for some \(M>0\). In fact, using (2), we have

$$\begin{aligned} \Vert w_n-p\Vert ^2&=\Vert (1-\alpha _{n})x_n+\alpha _{n}x_1-p\Vert ^2 \nonumber \\&\le \Vert x_n-p\Vert ^2+2\alpha _n\langle x_1-x_{n},w_n-p\rangle \nonumber \\&\le \Vert x_n-p\Vert ^2+2\alpha _n\Vert x_1-x_{n}\Vert \Vert w_n-p\Vert . \end{aligned}$$
(33)

By (2), (3) and (24), we obtain

$$\begin{aligned}&\Vert x_{n+1}-p\Vert ^2 \nonumber \\&\quad =\Vert \beta _nf(w_n)+(1-\beta _n)s_n-p\Vert ^2 \nonumber \\&\quad =\Vert \beta _n(f(w_n)-f(p))+(1-\beta _n)(s_n-p)+\beta _n(f(p)-p)\Vert ^2 \nonumber \\&\quad \le \Vert \beta _n(f(w_n)-f(p))+(1-\beta _n)(s_n-p)\Vert ^2+2\beta _n\langle f(p)-p,x_{n+1}-p\rangle \nonumber \\&\quad \le \beta _n\Vert f(w_n)-f(p)\Vert ^2+(1-\beta _n)\Vert s_n-p\Vert ^2+2\beta _n\langle f(p)-p,x_{n+1}-p\rangle \nonumber \\&\quad \le \beta _n\rho \Vert w_n-p\Vert ^2+(1-\beta _n)\Vert w_n-p\Vert ^2+2\beta _n\langle f(p)-p,x_{n+1}-p\rangle \nonumber \\&\quad =(1-\beta _n(1-\rho ))\Vert w_n-p\Vert ^2+2\beta _n\langle f(p)-p,x_{n+1}-p\rangle . \end{aligned}$$
(34)

Substituting (33) into (34), we get that

$$\begin{aligned}&\Vert x_{n+1}-p\Vert ^2\\&\quad \le (1-\beta _n(1-\rho ))\Vert x_n-p\Vert ^2+2\alpha _n\Vert x_1-x_{n}\Vert \Vert w_n-p\Vert \\&\qquad +2\beta _n\langle f(p)-p,x_{n+1}-p\rangle \\&\quad =(1-\beta _n(1-\rho ))\Vert x_n-p\Vert ^2+\beta _n(1-\rho )\cdot \frac{2}{1-\rho }\langle f(p)-p,x_{n+1}-p\rangle \\&\qquad +2\alpha _n\Vert x_1-x_{n}\Vert \Vert w_n-p\Vert \\&\quad \le (1-\beta _n(1-\rho ))\Vert x_n-p\Vert ^2+\beta _n(1-\rho )\cdot \frac{2}{1-\rho }\langle f(p)-p,x_{n+1}-p\rangle \\&\qquad +2M\alpha _n\Vert x_1-x_{n}\Vert \\&\quad =(1-\beta _n(1-\rho ))\Vert x_n-p\Vert ^2+\beta _n(1-\rho )\Big [\frac{2}{1-\rho }\langle f(p)-p,x_{n+1}-p\rangle \\&\qquad +\frac{2M}{1-\rho }\cdot \frac{\alpha _n}{\beta _n}\Vert x_1-x_{n}\Vert \Big ], \end{aligned}$$

where \(M:=\sup _{n \in \mathbb {N}}\{\Vert w_n-p\Vert \}\).

Claim 5. We prove that \(\{\Vert x_n-p\Vert \}\) converges to zero by considering two possible cases.

Case 1 There exists \(N\in \mathbb {N}\) such that \(\Vert x_{n+1}-p\Vert ^2\le \Vert x_n-p\Vert ^2\) for all \(n\ge N\); hence \(\lim _{n\rightarrow \infty }\Vert x_n-p\Vert ^2\) exists. Next we prove that

$$\begin{aligned} \lim _{n\rightarrow \infty }\Vert x_{n+1}-x_n\Vert =0. \end{aligned}$$
(35)

Indeed, by Claim 2 we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\Vert s_n-w_n\Vert =0. \end{aligned}$$
(36)

In addition,

$$\begin{aligned} \Vert w_n-x_n\Vert =\alpha _{n}\Vert x_1-x_{n}\Vert =\beta _{n}\cdot \dfrac{\alpha _{n}}{\beta _{n}}\Vert x_1-x_{n}\Vert \rightarrow 0 \ (n\rightarrow \infty ), \end{aligned}$$
(37)

and

$$\begin{aligned} \Vert x_{n+1}-s_n\Vert =\beta _{n}\Vert f(w_n)-s_n\Vert \rightarrow 0 \ (n\rightarrow \infty ). \end{aligned}$$
(38)

Combining (36), (37) and (38), we obtain

$$\begin{aligned} \Vert x_{n+1}-x_n\Vert \le \Vert x_{n+1}-s_n\Vert +\Vert s_n-w_n\Vert +\Vert w_n-x_n\Vert \rightarrow 0\ (n\rightarrow \infty ). \end{aligned}$$

This implies that (35) holds. On the other hand, since the sequence \(\{x_{n}\}\) is bounded, there exists a subsequence \(\{x_{n_{k}}\}\) of \(\{x_{n}\}\) which converges weakly to some \(q \in \mathscr {C}\) and satisfies

$$\begin{aligned} \limsup _{n\rightarrow \infty }\langle f(p)-p,x_{n}-p\rangle =\lim _{k\rightarrow \infty }\langle f(p)-p,x_{n_{k}}-p\rangle =\langle f(p)-p,q-p\rangle . \end{aligned}$$
(39)

According to Claim 3, we get

$$\begin{aligned} \lim _{k\rightarrow \infty }\left[ \dfrac{\mu }{2L}\Vert r(w_{n_k})\Vert ^2\right] ^2=0. \end{aligned}$$

Since \(r(w_{n_k})=w_{n_k}-z_{n_k}\), this means that

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert w_{n_k}-z_{n_k}\Vert =0. \end{aligned}$$
(40)

Using the fact that \( x_{n_k}\rightharpoonup q\) as \(k\rightarrow \infty \) (and hence, by (37), \(w_{n_k}\rightharpoonup q\)) together with (40) and Lemma 12, we have \(q \in Sol(\mathscr {C},\mathscr {A})\). Since \(p=P_{Sol(\mathscr {C},\mathscr {A})}\circ f(p)\), Lemma 1 gives \(\langle f(p)-p,q-p\rangle \le 0\); combining this with (35) and (39), we obtain

$$\begin{aligned}&\limsup _{n\rightarrow \infty }\langle f(p)-p,x_{n+1}-p\rangle \nonumber \\&\quad \le \limsup _{n\rightarrow \infty }\langle f(p)-p,x_{n+1}-x_{n}\rangle +\limsup _{n\rightarrow \infty }\langle f(p)-p,x_{n}-p\rangle \nonumber \\&\quad \le 0. \end{aligned}$$
(41)

Hence, by Claim 4, (41) and Lemma 8, we conclude that \(\lim _{n\rightarrow \infty }\Vert x_n-p\Vert =0\).

Case 2 There exists a subsequence \(\{\Vert x_{n_j}-p\Vert ^2\}\subset \{\Vert x_n-p\Vert ^2\}\) such that

$$\begin{aligned} \Vert x_{n_j}-p\Vert ^2<\Vert x_{n_j+1}-p\Vert ^2, \ \forall j\in \mathbb {N}. \end{aligned}$$

From Lemma 7, there exists a non-decreasing sequence \(\{m_k\}\subset \mathbb {N}\) such that \(\lim _{k\rightarrow \infty }m_k=\infty \) and

$$\begin{aligned} \Vert x_{m_k}-p\Vert ^2\le \Vert x_{m_k+1}-p\Vert ^2,\ \Vert x_{k}-p\Vert ^2\le \Vert x_{m_k+1}-p\Vert ^2,\ \forall k\in \mathbb {N}. \end{aligned}$$
(42)

From Claim 2,

$$\begin{aligned}&\Vert s_{m_k}-w_{m_k}\Vert ^2\\&\quad \le \Vert x_{m_k}-p\Vert ^2-\Vert x_{{m_k}+1}-p\Vert ^2+\beta _{m_k}M_2+2\beta _{m_k}\langle f(w_{m_k})-p,x_{{m_k}+1}-p\rangle \\&\quad \le \Vert x_{m_k}-p\Vert ^2-\Vert x_{{m_k}+1}-p\Vert ^2+\beta _{m_k}M_2+2\beta _{m_k}\Vert f(w_{m_k})-p\Vert \Vert x_{{m_k}+1}-p\Vert \\&\quad \rightarrow 0\ (k\rightarrow \infty ). \end{aligned}$$

As proved in Case 1, we can conclude that

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert x_{m_k+1}-x_{m_k}\Vert =0 \end{aligned}$$

and

$$\begin{aligned} \limsup _{k\rightarrow \infty }\langle f(p)-p,x_{m_k+1}-p\rangle \le 0. \end{aligned}$$
(43)

Applying Claim 4, we have

$$\begin{aligned}&\Vert x_{m_k+1}-p\Vert ^2\\&\quad \le (1-\beta _{m_k}(1-\rho ))\Vert x_{m_k}-p\Vert ^2+\beta _{m_k}(1-\rho )\Big [\frac{2}{1-\rho }\langle f(p)-p,x_{m_k+1}-p\rangle \\&\qquad +\frac{2M}{1-\rho }\cdot \frac{\alpha _{m_k}}{\beta _{m_k}}\Vert x_{1}-x_{{m_k}}\Vert \Big ]\\&\quad \le (1-\beta _{m_k}(1-\rho ))\Vert x_{m_k+1}-p\Vert ^2+\beta _{m_k}(1-\rho )\Big [\frac{2}{1-\rho }\langle f(p)-p,x_{m_k+1}-p\rangle \\&\qquad +\frac{2M}{1-\rho }\cdot \frac{\alpha _{m_k}}{\beta _{m_k}}\Vert x_{1}-x_{{m_k}}\Vert \Big ]. \end{aligned}$$

It follows that

$$\begin{aligned} \Vert x_{m_k+1}-p\Vert ^2\le \frac{2}{1-\rho }\langle f(p)-p,x_{m_k+1}-p\rangle +\frac{2M}{1-\rho }\cdot \frac{\alpha _{m_k}}{\beta _{m_k}}\Vert x_{1}-x_{{m_k}}\Vert . \end{aligned}$$

From (42) and (43), we obtain

$$\begin{aligned} \Vert x_k-p\Vert ^2\le \Vert x_{m_k+1}-p\Vert ^2\rightarrow 0\ (k\rightarrow \infty ). \end{aligned}$$

Hence, \(x_k\rightarrow p\) as \(k\rightarrow \infty \). This completes the proof. \(\square \)

3.2 Convergence rate

Next, we assume that the operator \(\mathscr {A}\) in Algorithm 3 is \(\eta \)-strongly pseudo-monotone, that is, there exists \(\eta >0\) such that \(\langle \mathscr {A}y,y-x\rangle \ge \eta \Vert y-x\Vert ^2\) whenever \(\langle \mathscr {A}x,y-x\rangle \ge 0\), \(x,y\in \mathscr {C}\). In this case, a modification of Algorithm 3 produces a sequence \(\{x_n\}\) that converges strongly at a linear rate. The modified algorithm is as follows:

Algorithm 4
figure d


Remark 2

Similar to the proof of Lemma 10, it is easy to get that (44) is well defined.

Theorem 2

Assume that \(\mathscr {A}:\mathscr {C}\rightarrow \mathcal {H}\) is \(\eta \)-strongly pseudo-monotone and L-Lipschitz continuous. Then the sequence \(\{x_n\}\) generated by Algorithm 4 converges strongly with a Q-linear rate to the unique element p of \(Sol(\mathscr {C},\mathscr {A})\).

Proof

According to the conditions of Theorem 2, \(\langle \mathscr {A}p,x_{n+1}-p\rangle \ge 0\) and thus

$$\begin{aligned} \langle \mathscr {A}x_{n+1},x_{n+1}-p\rangle \ge \eta \Vert x_{n+1}-p\Vert ^2. \end{aligned}$$

From the definition of \(\{x_{n+1}\}\), we have

$$\begin{aligned} \langle x_n-\tau _{n}\mathscr {A}x_n-x_{n+1},p-x_{n+1}\rangle \le 0. \end{aligned}$$

It follows from (44) that

$$\begin{aligned} 2\langle x_n-x_{n+1},p-x_{n+1}\rangle&\le 2\tau _{n}\langle \mathscr {A}x_n,p-x_{n+1}\rangle \\&=-2\tau _{n}\langle \mathscr {A}x_{n+1},x_{n+1}-p\rangle +2\tau _{n}\langle \mathscr {A}x_{n}-\mathscr {A}x_{n+1},p-x_{n+1}\rangle \\&\le -2\tau _{n}\eta \Vert x_{n+1}-p\Vert ^2+2\tau _{n}\Vert \mathscr {A}x_{n}-\mathscr {A}x_{n+1}\Vert \Vert p-x_{n+1}\Vert \\&\le -2\tau _{n}\eta \Vert x_{n+1}-p\Vert ^2+\Vert x_{n+1}-x_n\Vert ^2+\mu ^2\Vert p-x_{n+1}\Vert ^2. \end{aligned}$$

Moreover,

$$\begin{aligned} 2\langle x_n-x_{n+1},p-x_{n+1}\rangle =-\Vert x_n-p\Vert ^2+\Vert x_n-x_{n+1}\Vert ^2+\Vert x_{n+1}-p\Vert ^2, \end{aligned}$$

which implies that

$$\begin{aligned} (1+2\tau _{n}\eta -\mu ^2)\Vert x_{n+1}-p\Vert ^2\le \Vert x_n-p\Vert ^2. \end{aligned}$$

Since \(\mu \in (0,\min \{\frac{1}{\sqrt{2}},\frac{2l\eta }{L}\}) \), we have \(1+\mu ^2<2-\mu ^2\). Next we show that \(\tau _{n}>\frac{\mu l}{L}\). Indeed, we know that \(\frac{\tau _{n}}{l}\) must violate (44), thus,

$$\begin{aligned} \frac{\mu l}{\tau _{n}}\Vert x_n-x_{n+1}\Vert <\Vert \mathscr {A}x_n-\mathscr {A}x_{n+1}\Vert \le L\Vert x_n-x_{n+1}\Vert , \end{aligned}$$

that is, \(\tau _{n}>\frac{\mu l }{L}\). It follows that \(2\tau _{n}\eta> 2\frac{\mu l \eta }{L}> \mu ^2\) and hence \(\frac{\mu ^2}{\min \{1-\mu ^2,2\tau _{n}\eta \}}<1\). Consequently, we can choose \(\varepsilon \in (0,1)\) such that

$$\begin{aligned} \mu ^2<\varepsilon \min \{1-\mu ^2,2\tau _{n}\eta \}. \end{aligned}$$

Let \(\nu :=\frac{1}{2}\varepsilon \min \{1-\mu ^2,2\tau _{n}\eta \}\), we see that

$$\begin{aligned} 2\varepsilon \tau _{n} \eta \ge 2\nu >\mu ^2. \end{aligned}$$

Therefore,

$$\begin{aligned} 1+2\tau _{n}\eta -\mu ^2>1+2\varepsilon \tau _{n}\eta -\mu ^2=1+\gamma , \end{aligned}$$

where \(\gamma :=2\varepsilon \tau _{n}\eta -\mu ^2>0\). It follows that

$$\begin{aligned} \Vert x_{n+1}-p\Vert ^2\le \frac{1}{1+\gamma }\Vert x_{n}-p\Vert ^2, \end{aligned}$$

that is,

$$\begin{aligned} \Vert x_{n+1}-p\Vert \le \sqrt{\frac{1}{1+\gamma }}\Vert x_{n}-p\Vert . \end{aligned}$$

This implies that the sequence \(\{x_n\}\) generated by Algorithm 4 converges strongly to p with a Q-linear rate. \(\square \)

Fig. 1
figure 1

Example 1 with \(m=5\)

Fig. 2
figure 2

Example 1 with \(m=10\)

4 Numerical experiments

We present some numerical examples to illustrate the performance of our proposed Algorithm 3 and compare it with [15, Algorithm SD], [14, Algorithm PY], [28, Algorithm DY] and [12, Algorithm DA].

Example 1

Assume that \(\mathscr {A}: \mathbb {R}^m \rightarrow \mathbb {R}^m\) is defined by \(\mathscr {A}(x) : = Mx+q\) with \(M = NN^T + S + D\), where N is an \(m\times m\) matrix, S is an \(m\times m\) skew-symmetric matrix and D is an \(m\times m\) diagonal matrix whose diagonal entries are positive (so M is positive definite). All the entries of N, S and D are randomly generated in the interval (0, 20). Consider \(Sol(\mathscr {C},\mathscr {A})\) with the feasible set \(\mathscr {C}=\{x\in \mathbb {R}^m: -10\le x_i \le 10, i=1,2,\cdots , m\}\). The parameters are taken as follows.

  • Algorithm 3: \(\alpha _n = \left\{ \begin{array}{ll} \frac{1}{(n+1)^2}, &{}\text {if}\,\, \Vert x_{n} - x_1\Vert = 0 \\ \min \left\{ \frac{1}{(n+1)^2}, \frac{1}{(n+1)^2\times \Vert x_{n} - x_1\Vert }\right\} , &{}\text {otherwise} \\ \end{array} \right. \), \(\mu = 0.9\), \(l = 0.9\), \(\gamma = 0.05\), \(\beta _n = \frac{1}{n+1}\) and \(\rho = 0.5\);

  • Algorithm SD: \(\alpha _n = \frac{1}{n+1}\), \(\mu = 0.9\), \(l = 0.9\), \(\rho = 0.5\) and \(\lambda \) is randomly generated in the interval \((0, 1/\mu )\);

  • Algorithm PY: \(\alpha _n = \frac{1}{n+1}\), \(\mu = 0.9\) and \(l= 0.9\);

  • Algorithms DY, DA: \(\alpha _n = \frac{1}{n+1}\), \(\mu = 0.9\), \(l = 0.9\), \(\gamma = 0.05\).

In our experiment, the starting point \(x_1\) is generated randomly in \((-10, 10)^m\), where \(m = 5, 10\). We use the stopping rule \(E_n = \Vert x_n - P_\mathscr {C}(I-\mathscr {A})x_n\Vert \le 10^{-1}\), and we also stop if the number of iterations reaches \(N = 60000\). Figures 1, 2 and Tables 1, 2 show the computational results for Example 1 obtained with Algorithms 3, SD, PY, DY and DA.

Here “Iter” denotes the number of iterations and “CPU” the CPU time in seconds. Example 1 shows that our proposed Algorithm 3 is fast, efficient and easy to implement. We notice from Figs. 1, 2 and Tables 1, 2 that Algorithm 3 outperforms SD, PY, DY and DA in terms of CPU time and the required number of iterations for each of the dimensions \(m=5, 10\).

Example 2

Consider \(\mathcal {H}=L^2([0,1])\) with inner product \(\langle x,y\rangle :=\int _{0}^1 x(t)y(t)dt\) and norm \(\Vert x\Vert _2 : = (\int _0^1 |x(t)|^2 dt) ^{\frac{1}{2}}\). Suppose \(\mathscr {C}: = \{x\in \mathcal {H}: \Vert x\Vert _2\le 2\}\). Let \(g:\mathscr {C}\rightarrow \mathbb {R}\) be defined by \(g(u) := \frac{1}{1+\Vert u\Vert ^2_2}.\) Observe that g is \(L_g\)-Lipschitz continuous with \(L_g = \frac{16}{25}\) and \(\frac{1}{5}\le g(u) \le 1, \forall u\in \mathscr {C}\). Define the Volterra integral mapping \(F: \mathcal {H}\rightarrow \mathcal {H}\) by

$$\begin{aligned} F(u)(t):= \int _0^t u(s)ds,\ \forall u\in \mathcal {H}, t\in [0,1]. \end{aligned}$$

Then F is bounded linear monotone, see [29]. Now define \(\mathscr {A}: \mathscr {C}\rightarrow \mathcal {H}\) by

$$\begin{aligned} \mathscr {A}(u)(t): = g(u)F(u)(t),\ \forall u\in \mathscr {C}, t\in [0,1]. \end{aligned}$$

As given in [30], the mapping \(\mathscr {A}\) is pseudo-monotone but not monotone. For a feasible set given by a half-space,

$$\begin{aligned} \mathscr {C}:=\{x\in \mathcal {H}: \langle a, x\rangle \ge b\}, \end{aligned}$$

where \(a\in \mathcal {H}\) and \(b\in \mathbb {R}\), the metric projection \(P_\mathscr {C}\) is given explicitly by

$$\begin{aligned} P_\mathscr {C}(x) = \left\{ \begin{array}{ll} \frac{b-\langle a, x\rangle }{\Vert a\Vert ^2_{2}}a +x, &{} \text {if}\,\, \langle a, x\rangle < b \\ x, &{} \text {otherwise}. \end{array} \right. \end{aligned}$$
Table 1 The number of termination iterations and execution time of all algorithms
Table 2 The number of termination iterations and execution time of all algorithms

Let \(Q = \{x\in \mathcal {H}: \Vert x\Vert _2\le r\}\) be the closed ball centered at 0 with radius \(r = 2\); then Q is a nonempty closed convex subset of \(L^2([0, 1])\), and the projection onto Q is easily computed as

$$\begin{aligned} P_Q(x) = \left\{ \begin{array}{ll} r\frac{x}{\Vert x\Vert _2}, &{} x \notin Q \\ x, &{} \text {otherwise}. \end{array} \right. \end{aligned}$$
Fig. 3
figure 3

Example 2 with \(x_1 = t\)

Table 3 The number of termination iterations and execution time of all algorithms

During this experiment, the parameters are taken as follows.

  • Algorithm 3: \(\alpha _n = \left\{ \begin{array}{ll} \frac{1}{(n+1)^2}, &{}\text {if}\,\, \Vert x_{n} - x_1\Vert _2 = 0\\ \min \left\{ \frac{1}{(n+1)^2}, \frac{1}{(n+1)^2\times \Vert x_{n} - x_1\Vert _2 }\right\} , &{}\text {otherwise} \\ \end{array} \right. \), \(\mu = 0.4\), \(l = 0.1\), \(\gamma = 0.5\), \(\beta _n = \frac{1}{n+1}\) and \(\rho = 0.5\);

  • Algorithm SD: \(\alpha _n = \frac{1}{n+1}\), \(\mu = 0.4\), \(l = 0.1\), \(\rho = 0.5\) and \(\lambda \) is randomly generated in the interval \((0, 1/\mu )\);

  • Algorithm PY: \(\alpha _n = \frac{1}{n+1}\), \(\mu = 0.4\) and \(l = 0.1\);

  • Algorithms DY, DA: \(\alpha _n = \frac{1}{n+1}\), \(\mu = 0.4\), \(l = 0.1\), \(\gamma = 0.5\).

We test Algorithms 3, PY and DY for different choices of the initial point \(x_1 \in L^2([0,1])\). First, let the initial point be \(x_1 = t\). We take \(E_n = \Vert x_{n+1} - x_n\Vert _2\le 4 \times 10^{-3}\) as the termination criterion. The results of this test are displayed in Fig. 3 and Table 3.

Next, let the initial point be \(x_1 = t^3\). We terminate the iterations when \(E_n = \Vert x_{n+1} - x_n\Vert _2\le 10^{-3}.\) The numerical results are presented in Fig. 4 and Table 3.

Figures 3, 4 and Table 3 show the numerical results of Example 2 with the initial points \(x_1 = t\) and \(x_1= t^3\), respectively. They show that the performance of Algorithm 3 is better than that of Algorithms PY and DY in terms of the number of iterations and the CPU time required to reach the stopping criterion.

Fig. 4
figure 4

Example 2 with \(x_1 = t^3\)

Table 4 Comparison of Algorithm 3, Algorithm SD and Algorithm PY for Example 3

Example 3

Let us consider the variational inequality problem. Let

$$\begin{aligned} \mathscr {A}(x)= \left( \begin{array}{c} (x_1^2+(x_2-1)^2)(1+x_2)\\ -x_1^3-x_1(x_2-1)^2 \\ \end{array} \right) \end{aligned}$$

and \(\mathscr {C}:=\{x \in \mathbb {R}^2:\ -10\le x_i\le 10,\ i=1,2\}.\) This problem has the unique solution \(x^*=(0,-1)^T\). It is easy to see that \(\mathscr {A}\) is not a monotone map on \(\mathscr {C}\). However, using the Monte Carlo approach (see [31]), it can be shown that \(\mathscr {A}\) is pseudo-monotone on \(\mathscr {C}\); a sketch of this check is given at the end of this example. Let \(f(x)=x/20\). Parameters in the different algorithms are selected as follows:

Algorithm 3: \(\beta _n=\frac{1}{50n+1},\,\gamma =1.35,\,l=0.22,\,\mu =0.99\) and \(\alpha =0.8\);

Algorithm SD: \(\alpha _n=\frac{1}{50n+1},\,l=0.22,\,\mu =0.99\) and \(\lambda =0.9/\mu \);

Algorithm PY: \(\alpha _n=\frac{1}{50n+1},\, l=0.22\) and \(\mu =0.99\).

We test several initial points and different stopping criteria in Table 4. The reported data are obtained by averaging the number of iterations and the CPU cost over 10 independent trials. In Fig. 5, the initial point is \([10,10]^T\) and the algorithms are stopped after 1000 iterations.

Fig. 5
figure 5

The value of error versus the iteration numbers for Example 3

5 Conclusions

In this paper, we have presented an improved projection-type method for solving classical variational inequalities in Hilbert spaces. We have proposed a novel line-search rule that removes the reliance on Lipschitz continuity. Furthermore, a strong convergence theorem is obtained by combining viscosity iteration with the projection method. In numerical experiments, we have compared our Algorithm 3 with some recent related results; the figures and tables show that the new scheme has better convergence performance.