1 Introduction

In this paper, we are concerned with solving the classical Variational Inequality (VI) of Fichera [21, 22] and Stampacchia [38] (see also Kinderlehrer and Stampacchia [27]) in real Hilbert spaces. This problem stands at the core of many theoretical and applied areas, such as transportation, economics, financial equilibrium problems and engineering mechanics; see, for example, [4, 5] and [6, 14, 15, 24].

Let \(C\subseteq {\mathcal {H}}\) be a nonempty, closed and convex subset of a real Hilbert space \({\mathcal {H}}\) and let \({\mathcal {F}}:{\mathcal {H}}\rightarrow {\mathcal {H}}\) be a given mapping. The variational inequality associated with \({\mathcal {F}}\) and C, VI(\({\mathcal {F}},C\)) for short, is defined as

$$\begin{aligned} \text{ Find }~p\in C~\text{ such } \text{ that }~\left\langle {\mathcal {F}}p,x-p\right\rangle \ge 0,~\forall x\in C. \end{aligned}$$
(1.1)

During the last decades, many iterative methods have been developed for solving (1.1); see, for example, the excellent book of Facchinei and Pang [19] and the many references therein. One of the earliest gradient-type methods for solving (1.1) was introduced by Korpelevich [28] and is known as the Extragradient Method (EM for short); see also [3]. Given the current iterate \(x_n\in C\), calculate the next iterate by the following:

$$\begin{aligned} \left\{ \begin{array}{ll} y_n=P_C(x_n-\lambda {\mathcal {F}}x_n),\\ x_{n+1}=P_C(x_n-\lambda {\mathcal {F}}y_n) \end{array} \right. \end{aligned}$$
(1.2)

where \(\lambda \) is some positive constant.
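
To make the scheme concrete, here is a minimal NumPy sketch of one EM step (1.2). The feasible set (a box handled by np.clip) is an illustrative assumption of ours, not part of the method; any projection oracle for a closed convex C can be substituted.

```python
import numpy as np

def proj_C(x):
    # Illustrative feasible set: the box C = [-1, 1]^m, whose exact
    # metric projection is a componentwise clip.
    return np.clip(x, -1.0, 1.0)

def em_step(x, F, lam):
    # One extragradient iteration (1.2): two evaluations of F
    # and two projections onto C.
    y = proj_C(x - lam * F(x))
    return proj_C(x - lam * F(y))
```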

Despite the mild assumptions needed for the convergence of the extragradient method, namely monotonicity and Lipschitz continuity of \({\mathcal {F}}\), each iteration requires two evaluations of \({\mathcal {F}}\) (at \(x_n\) and at \(y_n\)) and two orthogonal projections \(P_C(\cdot )\) onto the VI's feasible set C. This can seriously affect the computational effort and the applicability of the method when \({\mathcal {F}}\) and C have complex and/or general structures. One extension of the extragradient method is the Subgradient Extragradient Method (SEM for short) [11,12,13]. In this method, a constructible set \(T^n\) is introduced at each iteration, and the second orthogonal projection onto C in (1.2) is replaced by the easily computable projection onto \(T^n\). Given the current iterate \(x_n\in {\mathcal {H}}\), calculate the next iterate by the following:

$$\begin{aligned} \left\{ \begin{array}{ll} y_n=P_C(x_n-\lambda {\mathcal {F}}x_n),\\ T^n=\left\{ x\in {\mathcal {H}}:\left\langle x_n-\lambda {\mathcal {F}}x_n-y_n,x-y_n\right\rangle \le 0\right\} ,\\ x_{n+1}=P_{T^n}(x_n-\lambda {\mathcal {F}}y_n),~n\ge 0 \end{array} \right. \end{aligned}$$
(1.3)

where again \(\lambda \) is some positive constant.
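
A corresponding sketch of one SEM step (1.3); the only change from the EM sketch above is that the second projection uses the closed-form half-space projection, formula (2.2) below. The box-shaped C is again an illustrative assumption.

```python
import numpy as np

def proj_C(x):
    return np.clip(x, -1.0, 1.0)  # illustrative box constraint, as above

def proj_halfspace(u, v, y):
    # Projection onto T = {x : <v, x - y> <= 0} via formula (2.2);
    # for v = 0 we have T = H and the projection is the identity.
    vv = np.dot(v, v)
    return u if vv == 0.0 else u - (max(0.0, np.dot(v, u - y)) / vv) * v

def sem_step(x, F, lam):
    # One iteration of (1.3): two evaluations of F, but only the first
    # projection is onto C; the second is onto the half-space T^n.
    Fx = F(x)
    y = proj_C(x - lam * Fx)
    v = x - lam * Fx - y              # normal vector defining T^n
    return proj_halfspace(x - lam * F(y), v, y)
```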

Since \(T^n\) is a half-space whenever \(x_n-\lambda {\mathcal {F}}x_n\ne y_n\), the second projection in (1.3) can be computed explicitly; see formula (2.2) below. Observe that, like the extragradient method, the subgradient extragradient method still requires two evaluations of \({\mathcal {F}}\) per iteration. To overcome this obstacle, although still with two metric projections per iteration, Popov [37] introduced the modified extragradient method. Recently, Malitsky and Semenov [32] extended the modified extragradient method of [37] and proposed the Modified Subgradient Extragradient Method (MSEM for short), which we present next; it also uses the projection onto \(T^n\) but, as in Popov's method, only one evaluation of \({\mathcal {F}}\) per iteration.

Choose \(x_0,~y_0\in C\), and set \( x_1=P_C(x_0-\lambda {\mathcal {F}}y_0),~ y_{1}=P_{C}(x_1-\lambda {\mathcal {F}}y_0)\). Given the iterates \(x_n,~y_n,~y_{n-1}\in {\mathcal {H}}\), calculate the next iterates by the following:

$$\begin{aligned} \left\{ \begin{array}{ll} T^n=\left\{ x\in {\mathcal {H}}:\left\langle x_n-\lambda {\mathcal {F}}y_{n-1}-y_n,x-y_n\right\rangle \le 0\right\} ,\\ x_{n+1}=P_{T^n}(x_n-\lambda {\mathcal {F}}y_n),\\ y_{n+1}=P_{C}(x_{n+1}-\lambda {\mathcal {F}}y_n), \end{array} \right. \end{aligned}$$
(1.4)

where \(\lambda \) is some positive constant.
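
In the same illustrative setting as above, one MSEM update (1.4) reads as follows; note that \({\mathcal {F}}y_{n-1}\) is carried over from the previous iteration, so only one new evaluation of the operator is needed per step.

```python
import numpy as np

def proj_C(x):
    return np.clip(x, -1.0, 1.0)  # illustrative box constraint, as above

def proj_halfspace(u, v, y):
    # Formula (2.2) for T = {x : <v, x - y> <= 0}; T = H when v = 0.
    vv = np.dot(v, v)
    return u if vv == 0.0 else u - (max(0.0, np.dot(v, u - y)) / vv) * v

def msem_step(x, y, Fy_prev, F, lam):
    # One iteration of (1.4): one new evaluation F(y_n), one explicit
    # half-space projection and one projection onto C.
    v = x - lam * Fy_prev - y         # normal vector defining T^n
    Fy = F(y)
    x_next = proj_halfspace(x - lam * Fy, v, y)
    y_next = proj_C(x_next - lam * Fy)
    return x_next, y_next, Fy         # Fy is reused as Fy_prev next time
```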

Now, we wish to recall the inertial-type technique, which originates from the heavy-ball method for second-order dynamical systems in time [1, 2, 36] and is designed to speed up the original algorithm without inertial effects. In recent years, this technique has been studied intensively and applied successfully to many problems; see, for example, [7, 8, 16,17,18, 29, 30, 33, 36]. In particular, Alvarez and Attouch [2] applied it to obtain an inertial proximal method for finding zeros of maximal monotone operators. The algorithm's iterative rule can be formulated as follows. Choose two sequences \(\left\{ \theta _n\right\} \subset [0,1), \left\{ \lambda _n\right\} \subset (0,+\infty )\), and starting points \(x_{-1},~x_0 \in {\mathcal {H}}\). Given the iterates \(x_n,~x_{n-1}\in {\mathcal {H}}\), calculate the next iterate by the following:

$$\begin{aligned} 0\in \lambda _n {\mathcal {F}}(x_{n+1})+x_{n+1}-x_n-\theta _{n}(x_n-x_{n-1}). \end{aligned}$$
(1.5)

Using the resolvent \(J_{\lambda _n}^{\mathcal {F}}\) of \({\mathcal {F}}\) with parameter \(\lambda _n>0\), (1.5) can be rewritten in the following compact form:

$$\begin{aligned} x_{n+1}=J_{\lambda _n}^{\mathcal {F}}(x_n+\theta _n(x_n-x_{n-1})). \end{aligned}$$
(1.6)
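
As a concrete instance of (1.6), one may take \({\mathcal {F}}=\partial g\) with \(g=\Vert \cdot \Vert _1\), whose resolvent is the well-known soft-thresholding operator; this particular choice of operator is ours, for illustration only.

```python
import numpy as np

def resolvent_l1(x, lam):
    # Resolvent J_lam of the maximal monotone operator F = subdifferential
    # of ||.||_1, i.e. the soft-thresholding (proximal) operator.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def inertial_prox_step(x, x_prev, lam, theta):
    # One iteration of (1.6): inertial extrapolation, then the resolvent.
    w = x + theta * (x - x_prev)
    return resolvent_l1(w, lam)
```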

So, motivated and inspired by the above methods and results, we introduce a new inertial-type modified subgradient extragradient method for solving VIs in real Hilbert spaces. Under mild and standard assumptions, we prove a weak convergence theorem. We also provide several numerical examples which illustrate the behavior of our algorithm and demonstrate its potential applicability in comparison with related results in the literature.

The outline of the paper is as follows. In Sect. 2, we present definitions and notions that are needed for the rest of the paper. In Sect. 3, the new algorithm is presented and analyzed. Finally, in Sect. 4, several numerical examples are presented.

2 Preliminaries

Let \({\mathcal {H}}\) denote a real Hilbert space with the inner product \(\langle \cdot ,\cdot \rangle \) and the induced norm \(\Vert \cdot \Vert \). Let C be a nonempty, closed and convex subset of \({\mathcal {H}}\). Recall that the metric projection operator \(P_C:{\mathcal {H}}\rightarrow C\) is defined, for each \(x\in {\mathcal {H}}\), by

$$\begin{aligned} P_C(x)=\arg \min \left\{ ||y-x||:y\in C\right\} . \end{aligned}$$
(2.1)

Since C is a nonempty, closed and convex set, \(P_C(x)\) exists and is unique. A useful and important case in which the orthogonal projection has a closed formula is the following. Given \(x\in {\mathcal {H}}\) and \(v\in {\mathcal {H}}\) with \(v\ne 0\), let \(T=\left\{ z\in {\mathcal {H}}: \left\langle v, z-x\right\rangle \le 0\right\} \) be a half-space. Then, for all \(u\in {\mathcal {H}}\), the projection of u onto the half-space T, denoted by \(P_T(u)\), is given by

$$\begin{aligned} P_T(u)=u-\max \left\{ 0,\frac{\left\langle v, u-x\right\rangle }{||v||^2}\right\} v. \end{aligned}$$
(2.2)
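
A direct implementation of (2.2), together with a quick numerical check of the characterization in Lemma 2.1(iii) below; the random test data are arbitrary.

```python
import numpy as np

def proj_halfspace(u, v, x):
    # P_T(u) for T = {z : <v, z - x> <= 0} with v != 0, following (2.2).
    return u - (max(0.0, np.dot(v, u - x)) / np.dot(v, v)) * v

rng = np.random.default_rng(0)
u, v, x = rng.standard_normal((3, 5))
p = proj_halfspace(u, v, x)
assert np.dot(v, p - x) <= 1e-12                  # p indeed lies in T
z = proj_halfspace(rng.standard_normal(5), v, x)  # an arbitrary point of T
assert np.dot(u - p, z - p) <= 1e-10              # Lemma 2.1(iii)
```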

From the definition of \(P_C\) (2.1), it is easy to show that \(P_C\) has the following characteristic properties, see for example [23] for more details.

Lemma 2.1

(i)

    \(\left\langle P_C(x)-P_C(y),x-y \right\rangle \ge \left\| P_C (x)-P_C (y)\right\| ^2,~\forall x,y\in {\mathcal {H}}.\)

(ii)

    \(\left\| x-P_C (y)\right\| ^2+\left\| P_C (y)-y\right\| ^2\le \left\| x-y\right\| ^2, \forall x\in C, y\in {\mathcal {H}}.\)

(iii)

    \(z=P_C (x) \Leftrightarrow \left\langle x-z,y-z \right\rangle \le 0,\quad \forall y\in C.\)

Lemma 2.2

For all \(x,y\in {\mathcal {H}}\) and constant \(\alpha \in {\mathbb {R}}\), the following equality holds:

$$\begin{aligned} ||\alpha x+(1-\alpha )y||^2=\alpha ||x||^2+(1-\alpha )||y||^2-\alpha (1-\alpha )||x-y||^2. \end{aligned}$$

Next, we recall the notions of monotonicity and Lipschitz continuity for a mapping \({\mathcal {F}}:{\mathcal {H}} \rightarrow {\mathcal {H}}\).

(i):

The mapping \({\mathcal {F}}\) is called monotone on C if

$$\begin{aligned} \left\langle {\mathcal {F}}x-{\mathcal {F}}y, x-y\right\rangle \ge 0, \quad \forall x,y\in C; \end{aligned}$$
(ii):

The mapping \({\mathcal {F}}\) is called L-Lipschitz continuous on C if there exists \(L>0\) such that

$$\begin{aligned} ||{\mathcal {F}}x-{\mathcal {F}}y||\le L||x-y||,~\forall x,y\in C. \end{aligned}$$

3 The algorithm

In this section, we introduce our new algorithm which is constructed around the projection method and the inertial computational technique.

Algorithm 3.1 (Inertial Modified Subgradient Extragradient Method) Choose \(\lambda >0\), \(\theta \in [0,1)\) and starting points \(x_{-1},~x_0\in {\mathcal {H}}\), \(y_{-1},~y_0\in C\). Given the iterates \(x_n,~x_{n-1},~y_n\) and \(y_{n-1}\), calculate the next iterates by the following:

$$\begin{aligned} \left\{ \begin{array}{ll} w_n=x_n+\theta (x_n-x_{n-1}),\\ T^n=\left\{ x\in {\mathcal {H}}:\left\langle w_n-\lambda {\mathcal {F}}y_{n-1}-y_n,x-y_n\right\rangle \le 0\right\} ,\\ x_{n+1}=P_{T^n}(w_n-\lambda {\mathcal {F}}y_n),\\ y_{n+1}=P_{C}(w_{n+1}-\lambda {\mathcal {F}}y_n), \end{array} \right. \end{aligned}$$

where \(w_{n+1}=x_{n+1}+\theta (x_{n+1}-x_n)\).

Observe that the main computational effort of Algorithm 3.1 per iteration is one evaluation of the mapping, namely \({\mathcal {F}}y_n\), and one orthogonal projection onto C, used to calculate \(y_{n+1}\); the projection onto the half-space \(T^n\) is explicit via (2.2). This suggests that the per-iteration complexity of Algorithm 3.1 is close to that of the classical gradient method. The term \(\theta (x_{n+1}-x_n)\) is called the inertial effect. Throughout this section, we suppose that the stopping rule of the algorithm is never met and hence the algorithm generates infinite sequences.
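
For readers who wish to experiment, a minimal NumPy sketch of the iteration follows. The box-shaped feasible set and the initialization of \(y_0\) and \({\mathcal {F}}y_{-1}\) are illustrative assumptions of ours, and the stopping rule is omitted in favor of a fixed iteration count.

```python
import numpy as np

def proj_C(x):
    return np.clip(x, -1.0, 1.0)       # illustrative box constraint

def proj_halfspace(u, v, y):
    # Formula (2.2) for T^n = {x : <v, x - y> <= 0}; T^n = H when v = 0.
    vv = np.dot(v, v)
    return u if vv == 0.0 else u - (max(0.0, np.dot(v, u - y)) / vv) * v

def imsem(F, x0, lam, theta, n_iters):
    # Inertial MSEM: per iteration, one evaluation F(y_n), one projection
    # onto C and one explicit half-space projection.
    x_prev, x = x0.copy(), x0.copy()   # x_{-1} = x_0
    y = proj_C(x)                      # illustrative choice of y_0
    Fy_prev = F(y)                     # stands in for F(y_{-1})
    for _ in range(n_iters):
        w = x + theta * (x - x_prev)   # inertial extrapolation w_n
        v = w - lam * Fy_prev - y      # normal vector defining T^n
        Fy = F(y)                      # the only F evaluation
        x_prev, x = x, proj_halfspace(w - lam * Fy, v, y)
        y = proj_C(x + theta * (x - x_prev) - lam * Fy)  # y_{n+1} via w_{n+1}
        Fy_prev = Fy
    return x, y
```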

To study the asymptotic behavior of Algorithm 3.1, we assume the following conditions:

Condition 3.2

The mapping \({\mathcal {F}}\) is monotone on C.

Condition 3.3

The mapping \({\mathcal {F}}\) is L-Lipschitz continuous on C.

Condition 3.4

The solution set of the VI (1.1) is nonempty.

We also consider the following assumptions:

Condition 3.5

\(\delta :=1-\lambda L(3+2\theta )>0\).

Condition 3.6

\(\delta (1+\theta ^2)-2\theta (2+\theta )>0\).

Observe that \(\lambda \) and \(\theta \) can always be chosen such that Conditions 3.5 and 3.6 are satisfied, for example

$$\begin{aligned} 0<\lambda<\frac{1-4\theta -\theta ^2}{L(3+2\theta )(1+\theta ^2)}\quad \text{ and }\quad 0\le \theta <\sqrt{5}-2. \end{aligned}$$
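
A quick numerical check of this claim (the sample values of L and \(\theta \) are arbitrary):

```python
L, theta = 2.0, 0.1                    # arbitrary sample values
lam = 0.9 * (1 - 4*theta - theta**2) / (L * (3 + 2*theta) * (1 + theta**2))
delta = 1 - lam * L * (3 + 2*theta)
assert delta > 0                                             # Condition 3.5
assert delta * (1 + theta**2) - 2 * theta * (2 + theta) > 0  # Condition 3.6
```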

3.1 Convergence

To start our analysis, for each \(p\in VI({\mathcal {F}},C)\), sequences \(\{x_n\}\), \(\{y_n\}\) and \(\{w_n\}\) generated by Algorithm 3.1 and constants \(\lambda ,L\) as above, we define the sequence

$$\begin{aligned} \Omega _n(p)=||x_n-p||^2-\theta ||x_{n-1}-p||^2+2\lambda L||y_{n-1}-w_n||^2 \end{aligned}$$
(3.1)

and prove the following useful result:

Lemma 3.7

For each \(p\in VI({\mathcal {F}},C)\) and \(n\ge 0\), the following inequality holds:

$$\begin{aligned} \Omega _{n+1}(p)\le \Omega _n(p) +\Xi ||x_n-x_{n-1}||^2-\Gamma ||x_{n+1}-x_{n}||^2, \end{aligned}$$

where \(\Xi =\theta (1+\theta )+\frac{\delta }{2}\theta (1-\theta )\) and \(\Gamma =\frac{\delta }{2}(1-\theta )-2\lambda L \theta (1+\theta )\).

Proof

Since \(x_{n+1}=P_{T^n}(w_n-\lambda {\mathcal {F}}y_n)\), by the projection characteristic, Lemma 2.1(iii), we obtain

$$\begin{aligned} \left\langle x-x_{n+1}, w_n-\lambda {\mathcal {F}}y_n-x_{n+1}\right\rangle \le 0,~\forall x\in T^n. \end{aligned}$$

Since \(C\subset T^n\) (by the definition of \(T^n\), the relation \(y_n=P_C(w_n-\lambda {\mathcal {F}}y_{n-1})\) and Lemma 2.1(iii)), we may choose \(x=p\) and obtain \(2\left\langle p-x_{n+1}, w_n-x_{n+1}\right\rangle \le 2\lambda \left\langle p-x_{n+1},{\mathcal {F}}y_n\right\rangle .\) Using the identity \(2\left\langle a,b\right\rangle =||a||^2+||b||^2-||a-b||^2\), we get

$$\begin{aligned} ||x_{n+1}-p||^2+||w_n-x_{n+1}||^2-||w_n-p||^2\le 2\lambda \left\langle p-x_{n+1},{\mathcal {F}}y_n\right\rangle . \end{aligned}$$
(3.2)

From the definition of \(T^n\) and the fact \(x_{n+1}\in T^n\), we see that

$$\begin{aligned} \left\langle w_n-\lambda {\mathcal {F}}y_{n-1}-y_{n},x_{n+1}-y_n\right\rangle \le 0, \end{aligned}$$

which yields the inequality \(2\left\langle w_n-y_{n},x_{n+1}-y_n\right\rangle \le 2\lambda \left\langle {\mathcal {F}}y_{n-1},x_{n+1}-y_n\right\rangle \). Thus, by applying the above equality for \(2\left\langle a,b\right\rangle \), one obtains

$$\begin{aligned} ||x_{n+1}-y_n||^2+||w_n-y_{n}||^2-||x_{n+1}-w_n||^2\le 2\lambda \left\langle x_{n+1}-y_n,{\mathcal {F}}y_{n-1}\right\rangle .\nonumber \\ \end{aligned}$$
(3.3)

Adding (3.2) and (3.3) and rearranging, we get

$$\begin{aligned} ||x_{n+1}-p||^2\le & {} ||w_n-p||^2-||w_n-y_n||^2-||x_{n+1}-y_n||^2\nonumber \\&+2\lambda \left[ \left\langle p-x_{n+1},{\mathcal {F}}y_n\right\rangle +\left\langle x_{n+1}-y_n,{\mathcal {F}}y_{n-1}\right\rangle \right] \nonumber \\= & {} ||w_n-p||^2-||w_n-y_n||^2-||x_{n+1}-y_n||^2\nonumber \\&+2\lambda \left[ \left\langle p-y_n,{\mathcal {F}}y_n\right\rangle +\left\langle y_n-x_{n+1},{\mathcal {F}}y_n-{\mathcal {F}}y_{n-1}\right\rangle \right] \nonumber \\\le & {} ||w_n-p||^2-||w_n-y_n||^2-||x_{n+1}-y_n||^2\nonumber \\&+2\lambda \left\langle y_n-x_{n+1},{\mathcal {F}}y_n-{\mathcal {F}}y_{n-1}\right\rangle \end{aligned}$$
(3.4)

where the last inequality follows from the fact \(p\in VI({\mathcal {F}},C)\) and the monotonicity of \({\mathcal {F}}\).

On the other hand, using the L-Lipschitz continuity of \({\mathcal {F}}\) and the Cauchy–Schwarz inequality, we get

$$\begin{aligned} \left\langle y_n-x_{n+1},{\mathcal {F}}y_n-{\mathcal {F}}y_{n-1}\right\rangle\le & {} L ||y_n-x_{n+1}|| ||y_n-y_{n-1}|| \end{aligned}$$
(3.5)
$$\begin{aligned}\le & {} \frac{L}{2} ||y_n-x_{n+1}||^2+\frac{L}{2} ||y_n-y_{n-1}||^2, \end{aligned}$$
(3.6)

which together with the inequality \(||y_n-y_{n-1}||^2\le 2||y_n-w_n||^2+2||w_n-y_{n-1}||^2\) implies that

$$\begin{aligned} \left\langle y_n-x_{n+1},{\mathcal {F}}y_n-{\mathcal {F}}y_{n-1}\right\rangle\le & {} \frac{L}{2} ||y_n-x_{n+1}||^2+L||y_n-w_n||^2\nonumber \\&+L||w_n-y_{n-1}||^2. \end{aligned}$$
(3.7)

Combining the relations (3.4) and (3.7), we obtain

$$\begin{aligned} ||x_{n+1}-p||^2\le & {} ||w_n-p||^2-(1-2\lambda L)||w_n-y_n||^2\nonumber \\&-(1-\lambda L)||x_{n+1}-y_n||^2+2\lambda L||w_{n}-y_{n-1}||^2. \end{aligned}$$
(3.8)

Adding the term \(2\lambda L||y_n-w_{n+1}||^2\) to both sides of (3.8), we get that

$$\begin{aligned} ||x_{n+1}-p||^2+ & {} 2\lambda L||y_n-w_{n+1}||^2\le ||w_n-p||^2-(1-2\lambda L)||w_n-y_n||^2\nonumber \\&-(1-\lambda L)||x_{n+1}-y_n||^2+2\lambda L||w_{n}-y_{n-1}||^2\nonumber \\&+2\lambda L||y_n-w_{n+1}||^2. \end{aligned}$$
(3.9)

Since \(w_n=x_n+\theta (x_n-x_{n-1})=(1+\theta )x_n-\theta x_{n-1}\), using Lemma 2.2 we obtain

$$\begin{aligned} ||w_n-p||^2=(1+\theta )||x_n-p||^2-\theta ||x_{n-1}-p||^2+\theta (1+\theta )||x_n-x_{n-1}||^2.\nonumber \\ \end{aligned}$$
(3.10)

Similarly, from \(w_{n+1}=x_{n+1}+\theta (x_{n+1}-x_n)\), we also get

$$\begin{aligned} ||w_{n+1}-y_n||^2= & {} (1+\theta )||x_{n+1}-y_n||^2-\theta ||x_{n}-y_n||^2+\theta (1+\theta )||x_{n+1}-x_{n}||^2\nonumber \\\le & {} (1+\theta )||x_{n+1}-y_n||^2+\theta (1+\theta )||x_{n+1}-x_{n}||^2. \end{aligned}$$
(3.11)

Combining (3.9)–(3.11), we obtain

$$\begin{aligned}&||x_{n+1}-p||^2+2\lambda L||y_n-w_{n+1}||^2\nonumber \\&\quad \le (1+\theta )||x_n-p||^2-\theta ||x_{n-1}-p||^2\\&\quad \quad +\theta (1+\theta )||x_n-x_{n-1}||^2-(1-2\lambda L)||w_n-y_n||^2\\&\quad \quad -(1-\lambda L)||x_{n+1}-y_n||^2 +2\lambda L||w_{n}-y_{n-1}||^2\\&\quad \quad +2\lambda L\left[ (1+\theta )||x_{n+1}-y_n||^2+\theta (1+\theta )||x_{n+1}-x_{n}||^2\right] . \end{aligned}$$

Thus

$$\begin{aligned}&||x_{n+1}-p||^2-\theta ||x_n-p||^2+2\lambda L||y_n-w_{n+1}||^2\nonumber \\&\quad \le ||x_n-p||^2-\theta ||x_{n-1}-p||^2\nonumber \\&\quad \quad +2\lambda L||w_{n}-y_{n-1}||^2+\theta (1+\theta )||x_n-x_{n-1}||^2-(1-2\lambda L)||w_n-y_n||^2\nonumber \\&\quad \quad -(1-\lambda L(3+2\theta ))||x_{n+1}-y_n||^2+2\lambda L\theta (1+\theta )||x_{n+1}-x_{n}||^2. \end{aligned}$$
(3.12)

Note that \( 1-2\lambda L> 1-\lambda L(3+2\theta ) =: \delta >0. \) Thus, from (3.12) and the definition of \(\Omega _n(p)\) in (3.1), one obtains

$$\begin{aligned} \Omega _{n+1}(p)\le & {} \Omega _n(p) -\delta (||w_n-y_n||^2+||x_{n+1}-y_n||^2)+\theta (1+\theta )||x_n-x_{n-1}||^2\nonumber \\&+2\lambda L\theta (1+\theta )||x_{n+1}-x_{n}||^2. \end{aligned}$$
(3.13)

Finally, we have the following estimation:

$$\begin{aligned}&||w_n-y_n||^2+||x_{n+1}-y_n||^2\ge \frac{1}{2}||w_n-x_{n+1}||^2\\&\quad =\frac{1}{2}||(x_n-x_{n+1})+\theta (x_n-x_{n-1})||^2\\&\quad =\frac{1}{2}||x_n-x_{n+1}||^2+\frac{1}{2}\theta ^2||x_n-x_{n-1}||^2+\theta \left\langle x_n-x_{n+1},x_n-x_{n-1}\right\rangle \\&\quad \ge \frac{1}{2}||x_n-x_{n+1}||^2+\frac{1}{2}\theta ^2||x_n-x_{n-1}||^2-\theta ||x_n-x_{n+1}|| ||x_n-x_{n-1}||\\&\quad \ge \frac{1}{2}||x_n-x_{n+1}||^2+\frac{1}{2}\theta ^2||x_n-x_{n-1}||^2-\frac{\theta }{2} ||x_n-x_{n+1}||^2-\frac{\theta }{2} ||x_n-x_{n-1}||^2\\&\quad =\frac{1-\theta }{2}||x_n-x_{n+1}||^2-\frac{\theta (1-\theta )}{2}||x_n-x_{n-1}||^2. \end{aligned}$$

Thus, from (3.13) and the definitions of \(\Gamma \) and \(\Xi \) in Lemma 3.7, the desired result is obtained and the proof is complete. \(\square \)

Now, we are ready to prove the convergence theorem of Algorithm 3.1.

Theorem 3.8

Assume that Conditions 3.2–3.4 and 3.5–3.6 hold. Then the sequences \(\left\{ x_n\right\} \), \(\left\{ y_n\right\} \) and \(\left\{ w_n\right\} \) generated by Algorithm 3.1 converge weakly to a point p which solves the VI (1.1).

Proof

For each \(n\ge 0\) and solution p of \(VI({\mathcal {F}},C)\), define \(\Phi _n(p)\) by

$$\begin{aligned} \Phi _n(p):=||x_n-p||^2-\theta ||x_{n-1}-p||^2+2\lambda L||y_{n-1}-w_n||^2+\Xi ||x_n-x_{n-1}||^2. \end{aligned}$$
(3.14)

Now, we divide the proof of Theorem 3.8 into three steps (claims). \(\square \)

Claim 1

The limit of \(\left\{ \Phi _n(p)\right\} \) exists and \(||x_{n+1}-x_n||\rightarrow 0\) as \(n\rightarrow \infty \).

Indeed, from the definitions of \(\Omega _n(p)\) and \(\Phi _n(p)\), we see that \(\Phi _n(p)=\Omega _n(p)+\Xi ||x_n-x_{n-1}||^2\). Thus, from Lemma 3.7, we obtain

$$\begin{aligned} \Phi _{n+1}(p)-\Phi _n(p)= & {} \Omega _{n+1}(p)+\Xi ||x_{n+1}-x_{n}||^2-\Omega _n(p)-\Xi ||x_n-x_{n-1}||^2\nonumber \\\le & {} -\Gamma ||x_{n+1}-x_{n}||^2 + \Xi ||x_{n+1}-x_{n}||^2\nonumber \\= & {} -(\Gamma -\Xi )||x_{n+1}-x_{n}||^2. \end{aligned}$$
(3.15)

Note that \( 1-2\lambda L (1+\theta )=1-\lambda L (2+2\theta )> 1-\lambda L(3+2\theta ) =: \delta >0, \) which implies that \(2\lambda L (1+\theta )<1-\delta \). Thus, from the definition of \(\Gamma \) and \(\Xi \) in Lemma 3.7, we obtain

$$\begin{aligned} \Gamma -\Xi= & {} \frac{\delta }{2}(1-\theta )-2\lambda L \theta (1+\theta )-\theta (1+\theta )-\frac{\delta }{2}\theta (1-\theta )\nonumber \\\ge & {} \frac{\delta }{2}(1-\theta )-\theta (1-\delta )-\theta (1+\theta )-\frac{\delta }{2}\theta (1-\theta )\nonumber \\= & {} \frac{\delta (1+\theta ^2)-2\theta (2+\theta )}{2}>0~(\text{ see } \text{ Condition }~3.6). \end{aligned}$$
(3.16)

It follows from (3.15) and (3.16) that the sequence \(\left\{ \Phi _n(p)\right\} \) is nonincreasing. Moreover

$$\begin{aligned} \Phi _n(p)\ge & {} ||x_n-p||^2-\theta ||x_{n-1}-p||^2+\Xi ||x_n-x_{n-1}||^2. \end{aligned}$$
(3.17)

Observe that \(\Xi =\theta (1+\theta )+\frac{\delta }{2}\theta (1-\theta )=\theta \left( 1+\frac{2\theta +\delta (1-\theta )}{2}\right) \). Thus, setting \(k:=\frac{2}{2\theta +\delta (1-\theta )}>0\), we get that

$$\begin{aligned} \Xi -\theta (1+\frac{1}{k})=0. \end{aligned}$$
(3.18)

On the other hand, from the Cauchy–Schwarz inequality and the elementary inequality \(2ab\le \frac{a^2}{k}+kb^2\), we obtain

$$\begin{aligned} 2\left\langle x_{n-1}-x_n,x_{n}-p\right\rangle\le & {} 2||x_{n-1}-x_n|| ||x_{n}-p||\\\le & {} \frac{1}{k}||x_{n-1}-x_n||^2+k||x_{n}-p||^2. \end{aligned}$$

Now

$$\begin{aligned} ||x_{n-1}-p||^2= & {} ||x_{n-1}-x_n||^2+||x_{n}-p||^2+2\left\langle x_{n-1}-x_n,x_{n}-p\right\rangle \nonumber \\\le & {} ||x_{n-1}-x_n||^2+||x_{n}-p||^2+\frac{1}{k}||x_{n-1}-x_n||^2+k||x_{n}-p||^2\nonumber \\= & {} (1+\frac{1}{k})||x_{n-1}-x_n||^2+(1+k)||x_{n}-p||^2. \end{aligned}$$
(3.19)

Combining (3.17) and (3.19), and then using (3.18), we get that

$$\begin{aligned} \Phi _n(p)\ge & {} (1-\theta (1+k))||x_n-p||^2+\left( \Xi -\theta \left( 1+\frac{1}{k}\right) \right) ||x_n-x_{n-1}||^2\nonumber \\= & {} (1-\theta (1+k))||x_n-p||^2. \end{aligned}$$
(3.20)

Observing that

$$\begin{aligned} 1-\theta (1+k)=1-\theta \left( 1+\frac{2}{2\theta +\delta (1-\theta )}\right) =\frac{\delta (1-\theta )^2-2\theta ^2}{(2-\delta )\theta +\delta }, \end{aligned}$$
(3.21)

and since \( \left[ \delta (1-\theta )^2-2\theta ^2\right] -\left[ \delta (1+\theta ^2)-2\theta (2+\theta )\right] = 2\theta (2-\delta )\ge 0\), we also have

$$\begin{aligned} \delta (1-\theta )^2-2\theta ^2\ge \delta (1+\theta ^2)-2\theta (2+\theta )>0, \end{aligned}$$
(3.22)

where the last inequality follows from Condition 3.6. From (3.21), (3.22) and the fact \(0<\delta <1\), we get that \(1-\theta (1+k)>0\), which together with (3.20) shows that \(\Phi _n(p)\ge 0\) for all \(n\ge 0\). Since \(\left\{ \Phi _n(p)\right\} \) is nonincreasing and bounded from below, the limit of \(\left\{ \Phi _n(p)\right\} \) exists. Now, passing to the limit in (3.15) as \(n\rightarrow \infty \) and noting that \(\Gamma -\Xi >0\), we obtain \(||x_{n+1}-x_n||\rightarrow 0\).

Claim 2

\(\lim _{n\rightarrow \infty }||w_n-y_{n}||=\lim _{n\rightarrow \infty }||x_n-y_n||=\lim _{n\rightarrow \infty }||y_n-y_{n+1}||=\lim _{n\rightarrow \infty }||w_n-y_{n-1}||=0\), and the sequences \(\left\{ x_n\right\} \), \(\left\{ y_n\right\} \), \(\left\{ w_n\right\} \) are bounded.

Indeed, from the definitions of \(\Omega _n(p)\) and \(\Phi _n(p)\), we see that \(\Omega _n(p)=\Phi _n(p)-\Xi ||x_n-x_{n-1}||^2\). This together with Claim 1 implies that the limit of \(\left\{ \Omega _n(p)\right\} \) exists. Thus, from (3.13) and Claim 1, we get that

$$\begin{aligned} 0\le & {} \delta (||w_n-y_{n}||^2+||y_n-x_{n+1}||^2)\\\le & {} \Omega _n(p)-\Omega _{n+1}(p) +\theta (1+\theta )||x_n-x_{n-1}||^2\\&+2\lambda L\theta (1+\theta )||x_{n+1}-x_{n}||^2\rightarrow 0~\text{ as }~n\rightarrow \infty \end{aligned}$$

meaning that \(\lim _{n\rightarrow \infty }||w_n-y_{n}||=\lim _{n\rightarrow \infty }||y_n-x_{n+1}||=0\) since \(\delta >0\). Hence, from \(||x_n-x_{n+1}||\rightarrow 0\) and the inequality \( ||x_{n}-y_{n}||\le ||x_n-x_{n+1}||+||x_{n+1}-y_n||, \) we also obtain \(\lim _{n\rightarrow \infty }||x_n-y_{n}||=0\). Moreover, since

$$\begin{aligned} ||y_n-y_{n+1}||\le ||y_n-x_{n+1}||+||x_{n+1}-y_{n+1}||, \end{aligned}$$

we have \(\lim _{n\rightarrow \infty }||y_n-y_{n+1}||=0\). Since \(||w_n-y_{n-1}||\le ||w_n-y_n||+||y_n-y_{n-1}||\rightarrow 0\), we also obtain that \(\lim _{n\rightarrow \infty }||w_n-y_{n-1}||=0\). By (3.20), Claim 1 and the fact \(1-\theta (1+k)>0\), we see that the sequence \(\left\{ x_n\right\} \) is bounded. Thus, the boundedness of \(\left\{ y_n\right\} \) and \(\left\{ w_n\right\} \) follows immediately from the limits in Claim 2.

Claim 3

The sequences \(\left\{ x_n\right\} \), \(\left\{ y_n\right\} \), \(\left\{ w_n\right\} \) converge weakly to the same point, which solves \(VI({\mathcal {F}},C)\).

Indeed, we first prove that every weak cluster point of \(\left\{ x_n\right\} \) solves \(VI({\mathcal {F}},C)\). From the definition of \(y_{n+1}\), we have

$$\begin{aligned} \left\langle y_{n+1}-w_{n+1}+\lambda {\mathcal {F}}y_n, y-y_{n+1}\right\rangle \ge 0 \end{aligned}$$

for all \(y\in C\). Thus, it follows from the monotonicity of \({\mathcal {F}}\) that

$$\begin{aligned} 0\le & {} \left\langle y_{n+1}-w_{n+1}, y-y_{n+1}\right\rangle +\lambda \left\langle {\mathcal {F}}y_n, y-y_{n+1}\right\rangle \nonumber \\= & {} \left\langle y_{n+1}-w_{n+1}, y-y_{n+1}\right\rangle +\lambda \left\langle {\mathcal {F}}y_n, y_n-y_{n+1}\right\rangle +\lambda \left\langle {\mathcal {F}}y_n, y-y_{n}\right\rangle \nonumber \\= & {} \left\langle y_{n+1}-w_{n+1}, y-y_{n+1}\right\rangle +\lambda \left\langle {\mathcal {F}}y_n, y_n-y_{n+1}\right\rangle \nonumber \\&-\lambda \left\langle {\mathcal {F}}y-{\mathcal {F}}y_n, y-y_{n}\right\rangle +\lambda \left\langle {\mathcal {F}}y, y-y_{n}\right\rangle \nonumber \\\le & {} \left\langle y_{n+1}-w_{n+1}, y-y_{n+1}\right\rangle +\lambda \left\langle {\mathcal {F}}y_n, y_n-y_{n+1}\right\rangle +\lambda \left\langle {\mathcal {F}}y, y-y_{n}\right\rangle .\nonumber \\ \end{aligned}$$
(3.23)

Now, assume that \(x^\dagger \) is a weak cluster point of \(\left\{ x_n\right\} \), i.e., there exists a subsequence \(\left\{ x_m\right\} \) of \(\left\{ x_n\right\} \) converging weakly to \(x^\dagger \). Since \(||x_m-y_m||\rightarrow 0\), we also have \(y_m\rightharpoonup x^\dagger \). Since C is a convex and closed subset of \({\mathcal {H}}\), it is weakly closed and hence \(x^\dagger \in C\) because \(\left\{ y_m\right\} \subset C\). Now, passing to the limit along this subsequence in (3.23) and using Claim 2, we get that \(\left\langle {\mathcal {F}}y, y-x^\dagger \right\rangle \ge 0\) for all \(y\in C\). Since \({\mathcal {F}}\) is monotone and continuous, this Minty-type inequality implies that \(x^\dagger \) solves \(VI({\mathcal {F}},C)\).

Finally, to complete the proof of Theorem 3.8, we show that the whole sequence \(\left\{ x_n\right\} \) converges weakly to \(x^\dagger \). To this end, assume that there exists another subsequence \(\left\{ x_k\right\} \) of \(\left\{ x_n\right\} \) converging weakly to \({\bar{x}}\), i.e., \(x_k\rightharpoonup {\bar{x}}\), with \({\bar{x}}\ne x^\dagger \). Using similar arguments as above, we get that \({\bar{x}}\) also solves \(VI({\mathcal {F}},C)\).

Then

$$\begin{aligned}&2\left\langle x_n,{\bar{x}}-x^\dagger \right\rangle =||x_n-x^\dagger ||^2-||x_n-{\bar{x}}||^2+||{\bar{x}}||^2-||x^\dagger ||^2. \end{aligned}$$
(3.24)
$$\begin{aligned}&2\left\langle x_{n-1},{\bar{x}}-x^\dagger \right\rangle =||x_{n-1}-x^\dagger ||^2-||x_{n-1}-{\bar{x}}||^2+||{\bar{x}}||^2-||x^\dagger ||^2.\nonumber \\ \end{aligned}$$
(3.25)

Multiplying (3.25) by \(\theta \) and subtracting the resulting identity from (3.24), we obtain

$$\begin{aligned}&2\left\langle x_n-\theta x_{n-1},{\bar{x}}-x^\dagger \right\rangle =(||x_n-x^\dagger ||^2-\theta ||x_{n-1}-x^\dagger ||^2)\nonumber \\&\quad \quad -(||x_n-{\bar{x}}||^2-\theta ||x_{n-1}-{\bar{x}}||^2)+(1-\theta )(||{\bar{x}}||^2-||x^\dagger ||^2)\nonumber \\&\quad =\Delta _n(x^\dagger )-\Delta _n({\bar{x}})+(1-\theta )(||{\bar{x}}||^2-||x^\dagger ||^2) \end{aligned}$$
(3.26)

where \(\Delta _n(p)\) is defined as \(\Delta _n(p)=||x_n-p||^2-\theta ||x_{n-1}-p||^2\) for each \(n\ge 0\) and each solution p of \(VI({\mathcal {F}},C)\). By Claim 2, \(||w_n-y_{n-1}||\rightarrow 0\); combining this with the fact \(\Delta _n(p)=\Omega _n(p)-2\lambda L ||y_{n-1}-w_n||^2\) and the existence of the limit of \(\left\{ \Omega _n(p)\right\} \), we see that the limit of \(\left\{ \Delta _n(p)\right\} \) exists for each \(p\in VI({\mathcal {F}},C)\). Thus, from relation (3.26), and noting that both points \(x^\dagger \) and \({\bar{x}}\) are in \(VI({\mathcal {F}},C)\), we see that the limit of the sequence \(\left\{ \left\langle x_n-\theta x_{n-1},{\bar{x}}-x^\dagger \right\rangle \right\} \) exists and, hence, we denote it by l, i.e.,

$$\begin{aligned} \lim _{n\rightarrow \infty }\left\langle x_n-\theta x_{n-1},{\bar{x}}-x^\dagger \right\rangle =l. \end{aligned}$$
(3.27)

Now, evaluating the limit in (3.27) along each of the two subsequences \(\left\{ x_m\right\} \) and \(\left\{ x_k\right\} \), we obtain

$$\begin{aligned} \left\langle x^\dagger -\theta x^\dagger ,{\bar{x}}-x^\dagger \right\rangle= & {} \lim _{m\rightarrow \infty }\left\langle x_m-\theta x_{m-1},{\bar{x}}-x^\dagger \right\rangle =l\\= & {} \lim _{k\rightarrow \infty }\left\langle x_k-\theta x_{k-1},{\bar{x}}-x^\dagger \right\rangle \\= & {} \left\langle {\bar{x}}-\theta {\bar{x}},{\bar{x}}-x^\dagger \right\rangle . \end{aligned}$$

Thus \((1-\theta )||{\bar{x}}-x^\dagger ||^2=0\) and hence, since \(\theta \in [0,1)\), \({\bar{x}}=x^\dagger \), a contradiction. This implies that the whole sequence \(\left\{ x_n\right\} \) converges weakly to \(x^\dagger \). From Claim 2, it is easy to see that the sequences \(\left\{ y_n\right\} \) and \(\left\{ w_n\right\} \) also converge weakly to \(x^\dagger \), and the proof of Theorem 3.8 is complete.

4 Computational experiments

In this section, we present several numerical experiments which illustrate the behavior of Algorithm 3.1 with inertial effects (IMSEM for short). Its performance is compared with that of four related algorithms, namely the Modified Subgradient Extragradient Method (MSEM) (Algorithm 3.1 with \(\theta =0\); see also [32]), the Subgradient Extragradient Method (SEM) in [11,12,13], the Extragradient Method (EM) in [28], and the Projected Reflected Gradient Method (PRGM) in [31].

Due to the fixed-point characterization of the solutions of VIs, namely \(x=P_C(x-{\mathcal {F}}x)\) for each \(x \in VI({\mathcal {F}},C)\), we use the function \(D(x)=||x-P_C(x-{\mathcal {F}}x)||^2\) to illustrate the computational performance of the aforementioned algorithms. The convergence of \(\left\{ D(x_n)\right\} \) to 0 as \(n\rightarrow \infty \) implies that the sequence \(\left\{ x_n\right\} \) converges to a solution of the problem. We report the behavior of \(D(x_n)\) for each algorithm against the elapsed execution time (in seconds) and against the number of projections onto C. The starting points are \(x_{-1}=x_0=y_0=(1,1,\ldots ,1)\in {\mathbb {R}}^m\). We take the stepsize \(\lambda =\frac{1}{5L}\) for IMSEM and choose the (possibly) best stepsize for each algorithm used in the comparison, namely \(\lambda =\frac{1}{3.01L}\) for MSEM, \(\lambda =\frac{1}{1.01L}\) for EM and SEM, and \(\lambda =\frac{0.4}{L}\) for PRGM. The feasible set is a polyhedral convex set, given by \( C=\left\{ x\in {\mathbb {R}}^m_+: Ex\le f\right\} \), where E is a random matrix of size \(l\times m\) (\(l=10,~m=150\) or \(m=200\)) with entries in \((-2,2)\) and \(f\in {\mathbb {R}}^l_{+}\). All programs are written in Matlab 7.0 and run on a PC Desktop with an Intel(R) Core(TM) i5-3210M CPU @ 2.50 GHz and 2.00 GB of RAM.
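
Given a projection oracle for C, the merit function D is straightforward to compute; a sketch follows (here proj_C is an abstract argument, since projecting onto the polyhedron C above requires solving a small quadratic program):

```python
import numpy as np

def D(x, F, proj_C):
    # Merit function D(x) = ||x - P_C(x - F(x))||^2; D(x) = 0 exactly
    # when x is a fixed point of P_C(I - F), i.e. a solution of the VI.
    r = x - proj_C(x - F(x))
    return float(np.dot(r, r))
```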

Example 1

This example considers a linear VI [25]. Let \({\mathcal {F}}:{\mathbb {R}}^m\rightarrow {\mathbb {R}}^m\) be defined by \({\mathcal {F}}(x)=Mx+q\), where

$$\begin{aligned} M=N N^T+S+D, \end{aligned}$$

q is a vector in \({\mathbb {R}}^{m}\), N is an \(m\times m\) matrix, S is an \(m\times m\) skew-symmetric matrix, the entries of N and S are generated in \((-2,2)\), and D is an \(m\times m\) diagonal matrix whose diagonal entries are positive and lie in (0, 2); hence M is positive semidefinite and \({\mathcal {F}}\) is monotone. The Lipschitz constant of \({\mathcal {F}}\) is \(L=||M||\). The numerical results for this example are shown in Figs. 1 and 2.
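
One possible NumPy construction of such a test operator (the range of the entries of q is not specified in the source, so the choice below is our assumption; the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
m = 150
N = rng.uniform(-2, 2, (m, m))
S = np.triu(rng.uniform(-2, 2, (m, m)), 1)
S = S - S.T                          # skew-symmetric, entries in (-2, 2)
Dg = np.diag(rng.uniform(0, 2, m))   # diagonal with positive entries in (0, 2)
M = N @ N.T + S + Dg
q = rng.uniform(-2, 2, m)            # assumed range for q
L = np.linalg.norm(M, 2)             # Lipschitz constant L = ||M||
F = lambda x: M @ x + q
```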

Fig. 1 Example 1 in \({\mathbb {R}}^{150}\). The number of projections on C is 141, 132, 144, 135, 131, 136, 132, 124, respectively

Fig. 2 Example 1 in \({\mathbb {R}}^{200}\). The number of projections on C is 108, 119, 115, 103, 108, 105, 114, 109, respectively

Example 2

Now, we consider a nonlinear VI with \({\mathcal {F}}:{\mathbb {R}}^m\rightarrow {\mathbb {R}}^m\) defined by \({\mathcal {F}}(x)=Mx+F(x)+q\), where M is an \(m\times m\) symmetric positive semidefinite matrix, q is a vector in \({\mathbb {R}}^{m}\), and F(x) is the proximal mapping of the function \(g(x)=\frac{1}{4}||x||^4\), i.e.,

$$\begin{aligned} F(x)=\arg \min \left\{ \frac{||y||^4}{4}+\frac{1}{2}||y-x||^2:y\in {\mathbb {R}}^m\right\} . \end{aligned}$$
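
As noted just below, the optimality condition for this minimization is \((1+\Vert y\Vert ^2)y=x\); hence \(F(x)=x/(1+r^2)\), where \(r=\Vert F(x)\Vert \) is the unique real root of the monotone cubic \(r^3+r=\Vert x\Vert \). A sketch of this computation:

```python
import numpy as np

def prox_g(x):
    # F(x) = argmin_y ||y||^4/4 + ||y - x||^2/2. Optimality gives
    # (1 + ||y||^2) y = x, so y = x / (1 + r^2) with r = ||y|| solving
    # r^3 + r - ||x|| = 0 (strictly increasing cubic: one real root).
    nx = np.linalg.norm(x)
    if nx == 0.0:
        return np.zeros_like(x)
    roots = np.roots([1.0, 0.0, 1.0, -nx])
    r = min(roots, key=lambda z: abs(z.imag)).real   # the real root
    return x / (1.0 + r * r)
```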

Now, we prove that the proximal mapping F(x) is monotone. Indeed, take two points \(x_1,~x_2\in {\mathbb {R}}^m\) and set \(y_1=F(x_1),~y_2=F(x_2)\). From the definition of F, we have \(||y_1||^2y_1+y_1-x_1=0,~||y_2||^2y_2+y_2-x_2=0\) or

$$\begin{aligned} x_1=||y_1||^2y_1+y_1,\quad x_2=||y_2||^2y_2+y_2. \end{aligned}$$

Thus

$$\begin{aligned}&\left\langle F(x_1)-F(x_2),x_1-x_2\right\rangle =\left\langle y_1-y_2,x_1-x_2\right\rangle \\&\quad =\left\langle y_1-y_2,||y_1||^2y_1+y_1-||y_2||^2y_2-y_2\right\rangle \\&\quad =||y_1-y_2||^2+||y_1||^4+||y_2||^4-\left\langle y_1,y_2\right\rangle (||y_1||^2+||y_2||^2)\\&\quad \ge ||y_1-y_2||^2+||y_1||^4+||y_2||^4-||y_1||||y_2||(||y_1||^2+||y_2||^2)\\&\quad =||y_1-y_2||^2+(||y_1||-||y_2||)^2(||y_1||^2+||y_1||||y_2||+||y_2||^2)\ge 0. \end{aligned}$$

This together with the definition of \({\mathcal {F}}\) implies that

$$\begin{aligned} \left\langle {\mathcal {F}}(x_1)-{\mathcal {F}}(x_2),x_1-x_2\right\rangle\ge & {} \left\langle Mx_1-Mx_2,x_1-x_2\right\rangle \\= & {} (x_1-x_2)^TM(x_1-x_2)\ge 0. \end{aligned}$$

Hence, the operator \({\mathcal {F}}\) is monotone. Moreover, since proximal mappings are nonexpansive, \({\mathcal {F}}\) is Lipschitz continuous with constant \(L=||M||+1\). For the experiments, the entries of M and q are generated randomly as in the previous example. The numerical results are shown in Figs. 3 and 4.

Fig. 3 Example 2 in \({\mathbb {R}}^{150}\). The number of projections on C is 50, 53, 50, 52, 49, 45, 48, 48, respectively

Fig. 4 Example 2 in \({\mathbb {R}}^{200}\). The number of projections on C is 45, 45, 45, 44, 43, 40, 40, 39, respectively

Example 3

Finally, we consider the VI with \({\mathcal {F}}:{\mathbb {R}}^{m}\rightarrow {\mathbb {R}}^{m}\) (where \(m=2k\)) defined by

$$\begin{aligned} {\mathcal {F}}x = \left( \begin{array}{c} x_1+x_2+\sin x_1\\ -x_1+x_2+\sin x_2\\ x_3+x_4+\sin x_3\\ -x_3+x_4+\sin x_4\\ \ldots \\ x_{2k-1}+x_{2k}+\sin x_{2k-1}\\ -x_{2k-1}+x_{2k}+\sin x_{2k} \end{array}\right) ,~ x = \left( \begin{array}{c} x_1\\ x_2\\ \ldots \\ x_{2k-1}\\ x_{2k} \end{array}\right) . \end{aligned}$$
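
A vectorized NumPy implementation of this operator (the paper's 1-based pairs \((x_{2i-1},x_{2i})\) correspond to the 0-based slices x[0::2], x[1::2] below; a float input vector of even length is assumed):

```python
import numpy as np

def F(x):
    # x has even length m = 2k; the odd/even components of the paper
    # are the even/odd 0-based slices here.
    xo, xe = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = xo + xe + np.sin(xo)
    out[1::2] = -xo + xe + np.sin(xe)
    return out
```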

Now, we prove that \({\mathcal {F}}\) is monotone. Indeed, for all \(x,~y\in {\mathbb {R}}^{m}\) (with \(m=2k\)), we have

$$\begin{aligned}&\left\langle {\mathcal {F}}x-{\mathcal {F}}y,x-y\right\rangle \\&\quad =\sum _{i=1}^k\left( x_{2i-1}+x_{2i}+\sin x_{2i-1}-y_{2i-1}-y_{2i}-\sin y_{2i-1}\right) (x_{2i-1}-y_{2i-1})\\&\qquad +\sum _{i=1}^k\left( -x_{2i-1}+x_{2i}+\sin x_{2i}+y_{2i-1}-y_{2i}-\sin y_{2i}\right) (x_{2i}-y_{2i})\\&\quad =\sum _{i=1}^k\left[ (x_{2i-1}-y_{2i-1})^2+(x_{2i-1}-y_{2i-1})(\sin x_{2i-1}-\sin y_{2i-1})\right] \\&\qquad +\sum _{i=1}^k\left[ (x_{2i}-y_{2i})^2+(x_{2i}-y_{2i})(\sin x_{2i}-\sin y_{2i})\right] \ge 0, \end{aligned}$$

where the last inequality follows from the fact that \(\left| \sin a -\sin b\right| \le |a-b|\) for all \(a,~b\in {\mathbb {R}}\). Thus, \({\mathcal {F}}\) is monotone. On the other hand, for all \(x,~y\in {\mathbb {R}}^m\), we have

$$\begin{aligned}&||{\mathcal {F}}x-{\mathcal {F}}y||^2\\&\quad = \sum _{i=1}^k\left( x_{2i-1}+x_{2i}+\sin x_{2i-1}-y_{2i-1}-y_{2i}-\sin y_{2i-1}\right) ^2\\&\qquad +\sum _{i=1}^k\left( -x_{2i-1}+x_{2i}+\sin x_{2i}+y_{2i-1}-y_{2i}-\sin y_{2i}\right) ^2\\&\quad \le \sum _{i=1}^k\left( 2|x_{2i-1}-y_{2i-1}|+|x_{2i}-y_{2i}|\right) ^2\\&\qquad +\sum _{i=1}^k\left( |x_{2i-1}-y_{2i-1}|+2|x_{2i}-y_{2i}|\right) ^2\\&\quad \le \sum _{i=1}^k10\left[ (x_{2i-1}-y_{2i-1})^2+(x_{2i}-y_{2i})^2\right] \\&\quad =10||x-y||^2. \end{aligned}$$

This implies that \({\mathcal {F}}\) is Lipschitz continuous with \(L=\sqrt{10}\). The behaviors of the sequences \(D(x_n)\) generated by all the algorithms are described in Figs. 5 and 6.

Fig. 5 Example 3 in \({\mathbb {R}}^{150}\). The number of projections on C is 81, 79, 78, 76, 80, 126, 158, 130, respectively

Fig. 6 Example 3 in \({\mathbb {R}}^{200}\). The number of projections on C is 49, 47, 44, 46, 43, 82, 92, 82, respectively

Remark 4.1

The techniques used in the convergence proof in this paper depend strictly on Conditions 3.5 and 3.6. Thanks to the referee's comments, we observe that, due to Condition 3.5, our Algorithm 3.1 has a disadvantage compared with EM and SEM, since its interval of admissible stepsizes \(\lambda \) is smaller. While this might be an issue for small-sized problems, for huge problems the evaluation of \({\mathcal {F}}\) is much more expensive, since it requires more computational resources. This suggests that EM and SEM, even when using the optimal stepsize, require more computational effort due to the extra evaluation of the associated operator per iteration (see also [31, Sect. 5]).

Condition 3.6 is also restrictive for the inertial parameter \(\theta \) (a small interval of \(\theta \)). We assumed this condition since it is needed for the algorithm's analysis, which is based on a completely new technique. In our numerical experiments, we have tried inertial parameters outside their theoretical bounds in order to probe their limitations. The experimental results illustrate the computational advantages and complexity of Algorithm 3.1 with these inertial effects over related algorithms. This suggests studying the problem further and providing a more general proof for these choices of parameters, thereby weakening Condition 3.6. Due to the importance and interest of this matter, and in order to obtain as general a result as possible, we plan to study it in a forthcoming work.

Fig. 7 The area (black) of \(\theta \) and \(\lambda L\) where Conditions 3.5 and 3.6 hold

In the aforementioned experiments, the parameters were chosen outside their theoretical bounds. Now, we perform some experiments to show the behavior of the proposed algorithm (IMSEM) when the stepsize \(\lambda \) and the inertial parameter \(\theta \) satisfy Conditions 3.5 and 3.6. In fact, Conditions 3.5 and 3.6 can be rewritten as follows:

$$\begin{aligned} 0<\lambda L<f(\theta ):=\frac{1-4\theta -\theta ^2}{(3+2\theta )(1+\theta ^2)}\quad \text{ and }\quad 0\le \theta <\sqrt{5}-2. \end{aligned}$$
(4.1)

Figure 7 describes the region (black) of \(\lambda \) and \(\theta \) where Conditions 3.5 and 3.6 hold. The stepsize \(\lambda \) here depends strictly on the inertial parameter \(\theta \). For the experiment, we choose \(\lambda =0.9f(\theta )/L\) and \(\theta \in \left\{ 0.05,~0.10,~0.15,~0.20,~0.23\right\} \). The numerical results are shown in Figs. 8, 9, and 10. The previous experiments showed that, for a fixed stepsize \(\lambda \), the larger the inertial parameter \(\theta \), the better IMSEM performs. Here, however, when \(\theta \) increases, \(\lambda \) must decrease, and this affects the numerical performance of IMSEM, as shown in Figs. 8, 9, and 10. This observation suggests a forthcoming study in which the theoretical bounds on \(\lambda \) and \(\theta \) are extended and decoupled from each other. Moreover, further numerical tests on other problems should be performed in order to assess the convergence of the proposed algorithm relative to existing methods.

Fig. 8 Example 1 in \({\mathbb {R}}^{150}\) for IMSEM. The number of projections on C is 249, 253, 249, 244, 250, respectively

Fig. 9 Example 2 in \({\mathbb {R}}^{150}\) for IMSEM. The number of projections on C is 109, 107, 105, 104, 110, respectively

Fig. 10 Example 3 in \({\mathbb {R}}^{150}\) for IMSEM. The number of projections on C is 184, 186, 183, 190, 185, respectively

5 Conclusions

This article presents a new inertial-type subgradient extragradient method for solving monotone and Lipschitz continuous variational inequalities (VIs) in real Hilbert spaces. The algorithm requires only one orthogonal projection onto the feasible set of the VI and one mapping evaluation per iteration. These computational properties make it attractive and comparable with the classical gradient method. Moreover, several numerical experiments suggest the advantages of the newly proposed method over recent related results in the literature.

Our results can be extended in many promising directions, such as multi-valued variational inequalities [20], VIs combined with fixed point problems [10], systems of VIs and mixed equilibrium problems [26], general VIs [35], strong convergence in Hilbert spaces, as well as extensions to Banach spaces [9, 34]; these are among our future goals.