1 Introduction

Throughout this paper, we assume that X is a reflexive real Banach space and C is a nonempty closed and convex subset of X unless otherwise stated. We shall denote the dual space of X by \(X^{*}.\) The norm and the duality pairing between X and \(X^{*}\) are, respectively, denoted by \(\Vert .\Vert \) and \(\langle .,.\rangle ,\) and \({\mathbb {R}}\) stands for the set of real numbers. The equilibrium problem for a bifunction \(f:C\times C\rightarrow {\mathbb {R}}\) is stated as follows:

$$\begin{aligned} \text {Find} \quad x^{*}\in C \qquad \text {such that} \quad f(x^{*},y)\ge 0, \quad \forall y\in C. \end{aligned}$$
(1)

The set of solutions of (1) is denoted by EP(f).

Many problems in nonlinear analysis, such as complementarity, fixed point, Nash equilibrium, optimization, saddle point and variational inequality problems, can be reformulated as equilibrium problems (see, for instance, [7]); this has made problem (EP) an active research area in recent years. Both approximation methods for solutions of problem (EP) and theoretical results about its solutions are therefore of interest. The proximal point method [27, 33] is a well-known method for solving problem (EP); it replaces the original problem with a family of regularized equilibrium subproblems which can be solved more easily. Using the Bregman distance, Reich and Sabach [40] presented two algorithms for approximating a solution of (EP) in reflexive Banach spaces.

The variational inequality problem is a particular case of the equilibrium problem. Korpelevich [28] presented an extragradient method for solving variational inequalities in Euclidean space, in which two metric projections onto the feasible set must be computed at each iterative step. This method was extended to Hilbert spaces by Nadezhkina and Takahashi [34]. In this direction, see also [18, 19, 21, 27, 42, 43, 45, 46].

In this paper, we are interested in the proximal-like method [16], also called the extragradient method [36] owing to the early contributions on saddle point problems in [28]. The convergence of the extragradient method is established in [36] under the assumptions that the bifunction is pseudomonotone and satisfies a Lipschitz-type condition introduced in [31]. In the extragradient method, two strongly convex minimization problems over the closed convex constraint set must be solved at each iterative step. In 2018, Eskandani et al. [14] proposed a hybrid extragradient method for finding a common element of the set of solutions of a pseudomonotone equilibrium problem in reflexive Banach spaces. In [14], the Bregman–Lipschitz coefficients must be determined in advance, which can be difficult to estimate. Hieu et al. [22] introduced several iterative algorithms for solving pseudomonotone and Lipschitz-type equilibrium problems in a Hilbert space; these algorithms do not require prior knowledge of the Lipschitz-type constants. Further methods for solving problem (EP) can be found in [1, 17, 19, 21, 23,24,25,26, 31, 32, 37,38,39].

In this paper, motivated and inspired by the above results, we introduce a new extragradient algorithm for solving the pseudomonotone equilibrium problem under a Bregman–Lipschitz-type condition. The advantage of this algorithm is that it can be performed without any prior information about the Bregman–Lipschitz coefficients. Weak convergence of the algorithm is established under mild assumptions, and strong convergence is obtained when the equilibrium bifunction satisfies various additional assumptions. As an application, the algorithm can be used to find a solution of the variational inequality problem. Finally, several numerical examples are presented which demonstrate the efficiency of our method compared with related methods in the literature.

2 Preliminaries and Lemmas

Let \(\{x_{n}\}\) be a sequence in X and \(x\in X.\) Weak convergence of \(\{x_{n}\}\) to x is denoted by \(x_{n}\rightharpoonup x,\) and strong convergence is denoted by \(x_{n}\rightarrow x.\) In this paper, we assume that \(g:X\rightarrow (-\infty ,\infty ]\) is a proper convex and lower semicontinuous function. We denote by dom g, the domain of g, that is, the set \(\{x\in X:g(x)<\infty \}.\) Let \(x\in \) int dom g, the subdifferential of g at x is the convex set defined by:

$$\begin{aligned} \partial g(x){:}{=}\{x^{*}\in X^{*}: g(x) +\langle y-x, x^{*}\rangle \le g(y),\quad \forall y\in X\}, \end{aligned}$$

and the Fenchel conjugate of g is the convex function

$$\begin{aligned} g^{*}:X^{*}\rightarrow (-\infty ,\infty ],\quad g^{*}(x^{*})=\sup \{\langle x,x^{*}\rangle - g(x): x\in X\}. \end{aligned}$$

It is well known that \( x^{*}\in \partial g(x) \) is equivalent to

$$\begin{aligned} g(x) + g^{*}(x^{*})= \langle x,x^{*}\rangle . \end{aligned}$$
(2)

It is easy to see that \(g^{*}\) is a proper convex and lower semicontinuous function. The function g is said to be cofinite if dom \(g^{*}=X^{*}.\)

Let \(g^{\circ }(x,y)\) be the right-hand derivative of g at \(x\in \) int dom g in the direction y, that is,

$$\begin{aligned} g^{\circ }(x,y){:}{=} \displaystyle \lim _{t\downarrow 0}\frac{g(x+ty)- g(x)}{t}. \end{aligned}$$
(3)

If the limit as \(t\downarrow 0\) in (3) exists for each y, then the function g is said to be G\({\hat{a}}\)teaux differentiable at x. In this case, the gradient of g at x is the linear functional \(\nabla g(x),\) defined by \( \langle y,\nabla g(x)\rangle {:}{=} g^{\circ }(x,y)\) for all \(y\in X\). The function g is said to be G\({\hat{a}}\)teaux differentiable if it is G\({\hat{a}}\)teaux differentiable at each \(x\in \) int dom g. When the limit as \(t\downarrow 0\) in (3) is attained uniformly for all \(y \in X\) with \(\Vert y\Vert =1\), we say that g is Fréchet differentiable at x. Finally, g is said to be uniformly Fréchet differentiable on a subset C of X if the limit is attained uniformly for \(x \in C\) and \( \Vert y\Vert = 1\).

The function g is said to be Legendre if it satisfies the following two conditions:

  1. (L1)

    int dom \(g\ne \emptyset \) and \(\partial g\) is single-valued on its domain.

  2. (L2)

    int dom \(g^{*}\ne \emptyset \) and \(\partial g^{*}\) is single-valued on its domain.

Since X is reflexive, we have \((\partial g)^{-1}=\partial g^{*}\) (see [8]). This fact, combined with conditions (L1) and (L2), implies the following equalities:

$$\begin{aligned}&\nabla g=(\nabla g^{*})^{-1},\quad \text {ran} \quad (\nabla g) = \text {dom} \quad (\nabla g^{*})= \text {int}\quad \text {dom} \quad g^{*},\\&\text {ran} \quad (\nabla g^{*}) = \text {dom} \quad (\nabla g)= \text {int}\quad \text {dom} \quad g. \end{aligned}$$

It is well known that if g is a Legendre function, then g and \(g^{*}\) are G\({\hat{a}}\)teaux differentiable and strictly convex in the interior of their respective domains [3]. When the Banach space X is smooth and strictly convex, in particular a Hilbert space, the function \(g(.)=\frac{1}{p}\Vert .\Vert ^{p}\) with \(p\in (1,+\infty )\) is Legendre [3]. Suppose that \(g: X\rightarrow (-\infty , +\infty ]\) is G\({\hat{a}}\)teaux differentiable. The function \( D_{g}\): dom \(g\times \) int dom \(g \rightarrow [0, +\infty ) \) defined by:

$$\begin{aligned} D_{g} (y, x) {:}{=} g (y) - g (x) -\langle y-x,\nabla g(x)\rangle , \end{aligned}$$

is called the Bregman distance with respect to g. It should be mentioned that \(D_{g}\) is not a distance in the usual sense of the term. Obviously, \(D_{g}(x, x) = 0\), but \( D_{g}(y, x) = 0 \) may not imply \(x = y.\) When g is Legendre, however, this indeed holds (see [3], Theorem 7.3(vi), p. 642). In general, \(D_{g}\) is not symmetric and does not satisfy the triangle inequality. However, \(D_{g}\) satisfies the three-point identity

$$\begin{aligned} D_{g} (x, y)+D_{g} (y, z)-D_{g} (x, z)=\langle x-y,\nabla g(z)-\nabla g(y)\rangle , \end{aligned}$$

and four-point identity

$$\begin{aligned} D_{g} (x, y)+D_{g} (w, z)-D_{g} (x, z)-D_{g} (w, y)=\langle x-w,\nabla g(z)-\nabla g(y)\rangle , \end{aligned}$$

for any \(x,w\in \) dom g and \(y,z\in \) int dom g.
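To make these definitions concrete, the following minimal Python sketch (the helper names are ours, chosen for illustration only) evaluates \(D_{g}\) for the choice \(g(x)=\frac{1}{2}\Vert x\Vert ^{2}\) on \({\mathbb {R}}^{3}\), for which \(D_{g}(y,x)=\frac{1}{2}\Vert y-x\Vert ^{2}\), and checks the three-point identity at randomly chosen points.

```python
import numpy as np

g = lambda v: 0.5 * np.dot(v, v)          # g(x) = 0.5*||x||^2
grad_g = lambda v: v                      # its gradient

def D_g(y, x):
    # Bregman distance D_g(y, x) = g(y) - g(x) - <y - x, grad g(x)>.
    return g(y) - g(x) - np.dot(y - x, grad_g(x))

x, y, z = np.random.randn(3), np.random.randn(3), np.random.randn(3)
# For this g the Bregman distance reduces to half the squared Euclidean distance.
assert np.isclose(D_g(y, x), 0.5 * np.linalg.norm(y - x) ** 2)
# Three-point identity: D_g(x, y) + D_g(y, z) - D_g(x, z) = <x - y, grad g(z) - grad g(y)>.
assert np.isclose(D_g(x, y) + D_g(y, z) - D_g(x, z), np.dot(x - y, grad_g(z) - grad_g(y)))
print("identities verified")
```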

The modulus of total convexity at \(x\in \) int dom g is the function \(\upsilon _{g}(x,.):[0,+\infty )\rightarrow [0,\infty ]\) defined by:

$$\begin{aligned} \upsilon _{g}(x,t)\,{:}{=}\, \inf \{D_{g}(y,x):y \in \text {dom}\quad g,\quad \Vert y-x\Vert =t\}. \end{aligned}$$

If for any \(t > 0\), \(\upsilon _{g}(x,t)\) is positive, then the function g is called totally convex at x. This concept was first presented by Butnariu and Iusem in [10]. Let C be a nonempty subset of X. The modulus of total convexity of g on C is defined by:

$$\begin{aligned} \upsilon _{g}(C,t)=\inf \{\upsilon _{g}(x,t):\quad x\in C\cap \text {int}\quad \text {dom} \quad g\}. \end{aligned}$$

The function g is termed totally convex on bounded subsets if \(\upsilon _{g}(C,t)\) is positive for any \(t > 0\) and for any nonempty and bounded subset C.

Lemma 1

[39] If \(g : X \rightarrow {\mathbb {R}}\) is uniformly Fr\(\acute{e}\)chet differentiable and bounded on bounded subsets of X, then \(\nabla g\) is uniformly continuous on bounded subsets of X from the strong topology of X to the strong topology of \(X^{*}.\)

Lemma 2

[10] The function \(g: X\rightarrow (-\infty ,+\infty ]\) is totally convex on bounded subsets of X if and only if for any two sequences \(\{x_{n}\}\) and \(\{y_{n}\}\) in int dom g and dom g,  respectively, such that the first one is bounded,

$$\begin{aligned} \lim _{n\rightarrow \infty } D_{g} (y_{n} , x_{n})=0\Longrightarrow \lim _{n\rightarrow \infty }\Vert y_{n} - x_{n}\Vert =0. \end{aligned}$$

Lemma 3

[41] Let the function \(g : X \rightarrow {\mathbb {R}}\) be G\({\hat{a}}\)teaux differentiable and totally convex at a point \(x\in \) int dom g. Let \(\{x_{n}\}\subset \) dom g. If \(\{D_{g} (x_{n} , x)\}\) is bounded, then the sequence \(\{x_{n}\}\) is also bounded.

Lemma 4

[41] Let \(g : X \rightarrow {\mathbb {R}}\) be a G\({\hat{a}}\)teaux differentiable function such that \(\nabla g^{*}\) is bounded on bounded subsets of dom \(g^{*}.\) Let \(x_{0} \in X\) and \(\{x_{n}\}\subset X \). If \(\{D_{g}(x_{0},x_{n}) \}\) is bounded, then the sequence \(\{x_{n}\}\) is also bounded.

A Bregman projection [9, 15] of \(x\in \) int dom g onto the nonempty closed convex set \( C\subset \) int dom g is the unique vector \(\overleftarrow{proj}^{g}_{C}(x)\in C\) satisfying

$$\begin{aligned} D_{g}(\overleftarrow{proj}^{g}_{C}(x),x)= \inf \{ D_{g} (y, x): y\in C\}. \end{aligned}$$

Lemma 5

[13] Let the function \(g: X\rightarrow (-\infty ,+\infty ]\) be G\({\hat{a}}\)teaux differentiable and totally convex on \(int\quad dom\quad g\). Let \(x \in \) \(int\quad dom \quad g\) and \( C\subset int\quad dom \quad g\) be a nonempty closed convex set. If \({\hat{x}} \in C,\) then the following statements are equivalent:

  1. (i)

    The vector \({\hat{x}} \in C\) is the Bregman projection of x onto C.

  2. (ii)

    The vector \({\hat{x}} \in C\) is the unique solution of the variational inequality

    $$\begin{aligned} \langle {\hat{x}}-y,\nabla g(x) - \nabla g({\hat{x}}) \rangle \ge 0,\quad \forall y\in C. \end{aligned}$$
  3. (iii)

    The vector \({\hat{x}} \) is the unique solution of the inequality

    $$\begin{aligned} D_{g} (y, {\hat{x}}) + D_{g} ({\hat{x}}, x) \le D_{g} (y, x),\quad \forall y\in C. \end{aligned}$$

Let B and S be the closed unit ball and the unit sphere of a Banach space X. Let \(rB =\{z\in X:\Vert z\Vert \le r \}\) for all \(r> 0\). Then, the function \(g:X\rightarrow {\mathbb {R}}\) is said to be uniformly convex on bounded subsets (see [48]) if \(\rho _{r}(t)>0\) for all \(r,t>0\), where \(\rho _{r}:[0,\infty )\rightarrow [0,\infty ] \) is defined by:

$$\begin{aligned} \rho _{r}(t)= \inf _{x,y\in rB,\Vert x-y\Vert =t, \alpha \in (0,1)}\frac{\alpha g(x) + (1-\alpha )g(y)- g(\alpha x+ (1-\alpha )y)}{\alpha (1-\alpha )}, \end{aligned}$$

for all \(t\ge 0. \) The function \(\rho _{r}\) is called the gauge of uniform convexity of g. It is known that \(\rho _{r}\) is a nondecreasing function.

Lemma 6

[35] Let X be a Banach space, \(r>0\) be a fixed number and \(g:X\rightarrow {\mathbb {R}}\) be a uniformly convex function on bounded subsets of X. Then,

$$\begin{aligned} g \left( \sum _{k=0}^{n}\alpha _{k}x_{k}\right) \le \sum _{k=0}^{n}\alpha _{k}g(x_{k})- \alpha _{i}\alpha _{j}\rho _{r}(\Vert x_{i}-x_{j}\Vert ), \end{aligned}$$

for all \(i,j\in \{0,1,2,...,n\}\), \(\alpha _{k}\in (0,1)\), \(x_{k}\in rB\) and \(k=0,1,2,...,n\) with \(\sum _{k=0}^{n}\alpha _{k}=1, \) where \(\rho _{r}\) is the gauge of uniform convexity of g.

The function g is also said to be uniformly smooth on bounded subsets (see [48]) if \(\lim _{t\downarrow 0}\frac{\sigma _{r}(t)}{t}=0\) for all \(r>0,\) where \( \sigma _{r}:[0,\infty ) \rightarrow [0,\infty ]\) is defined by:

$$\begin{aligned} \sigma _{r}(t) = \sup _{x\in rB,y\in S, \alpha \in (0,1)} \frac{\alpha g(x+(1-\alpha )ty)+ (1-\alpha )g(x-\alpha ty)- g(x)}{\alpha (1-\alpha )}, \end{aligned}$$

for all \(t\ge 0.\) A function g is said to be supercoercive if \(\lim _{\Vert x\Vert \rightarrow \infty }\frac{g(x)}{\Vert x\Vert }= +\infty .\)

Theorem 1

[48] Let \(g:X\rightarrow {\mathbb {R}} \) be a supercoercive and convex function. Then, the following are equivalent:

  1. (i)

    g is bounded on bounded subsets and uniformly smooth on bounded subsets of X;

  2. (ii)

    g is Fr\(\acute{e}\)chet differentiable and \(\nabla g\) is uniformly norm-to-norm continuous on bounded subsets of X;

  3. (iii)

    dom \(g^{*} = X^{*},\) \(g^{*} \) is supercoercive and uniformly convex on bounded subsets of \(X^{*}\).

Theorem 2

[48] Let \(g: X\rightarrow {\mathbb {R}} \) be a convex function which is bounded on bounded subsets of X. Then, the following are equivalent:

  1. (i)

    g is supercoercive and uniformly convex on bounded subsets of X;

  2. (ii)

    dom \(g^{*} = X^{*},\) \( g^{*}\) is bounded on bounded subsets and uniformly smooth on bounded subsets of \(X^{*}\);

  3. (iii)

    dom \(g^{*} = X^{*},\) \(g^{*} \) is Fr\(\acute{e}\)chet differentiable and \(\nabla g^{*}\) is uniformly norm-to-norm continuous on bounded subsets of \(X^{*}\).

Theorem 3

[3] (Supercoercivity) Let \(g: X\rightarrow (-\infty ,+\infty ] \) be a proper convex and lower semicontinuous function. Then, the following are equivalent:

  1. (i)

    g is supercoercive;

  2. (ii)

    \(g^*\) is bounded above on bounded sets;

  3. (iii)

    dom \(g^{*} = X^{*}\) and \(\partial g^{*} \) maps bounded sets to bounded sets.

Theorem 4

[11] Suppose that \(g:X\rightarrow (-\infty ,+\infty ]\) is a Legendre function. The function g is uniformly convex on bounded subsets of X if and only if g is totally convex on bounded subsets of X.

A bifunction f satisfies a Bregman–Lipschitz-type condition [14] if there exist two positive constants \(c_{1},c_{2}\) such that

$$\begin{aligned} f(x,y) + f(y,z)\ge f(x,z)- c_{1}D_{g}(y,x)-c_{2}D_{g}(z,y),\quad \forall x,y,z\in C, \end{aligned}$$

where \(g:X\rightarrow (-\infty ,+\infty ]\) is a Legendre function. The constants \(c_{1}\) and \(c_{2}\) are called Bregman–Lipschitz coefficients.

Let \(g:X\rightarrow (-\infty ,+\infty ]\) be a G\({\hat{a}}\)teaux differentiable function. Recall that the proximal mapping of a proper convex and lower semicontinuous function \( f: C\rightarrow (-\infty ,+\infty ] \) with respect to g is defined by:

$$\begin{aligned} prox_{f}^{g}(x){:}{=}arg min\big \{f(y)+ D_{g}(y,x): y\in C\big \},\quad x\in X. \end{aligned}$$
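For intuition, the proximal mapping can be evaluated numerically; the following one-dimensional Python sketch (the function name prox is ours) uses \(g=\frac{1}{2}|\cdot |^{2}\), so that \(D_{g}(y,x)=\frac{1}{2}(y-x)^{2}\), and recovers the familiar soft-thresholding operator for \(f(y)=\lambda |y|\).

```python
import numpy as np
from scipy.optimize import minimize_scalar

def prox(f, x, a, b):
    # prox_f^g(x) = argmin_{y in [a, b]} f(y) + D_g(y, x) with g = 0.5*|.|^2,
    # so that D_g(y, x) = 0.5*(y - x)^2.  One-dimensional sketch only.
    obj = lambda y: f(y) + 0.5 * (y - x) ** 2
    return minimize_scalar(obj, bounds=(a, b), method="bounded").x

lam = 1.0
p = prox(lambda y: lam * abs(y), 3.0, -10.0, 10.0)   # f(y) = lam*|y|
print(p)  # approximately 2.0 = sign(3)*max(|3| - lam, 0), the soft-thresholding value
```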

Applying the tools used in [12], we can state the following lemma.

Lemma 7

Let \(g: X\rightarrow (-\infty ,+\infty ]\) be a Legendre and supercoercive function. Let \(x\in \) int dom g, \( C\subset \) int dom g and \(f: C\rightarrow (-\infty ,\infty ]\) be a proper convex and lower semicontinuous function. Then,

$$\begin{aligned} \big \langle prox_{ f}^{g}(x)-y,\nabla g(x)-\nabla g(prox_{ f}^{g}(x))\big \rangle + f(y)- f(prox_{f}^{g}(x))\ge 0,\quad \forall y\in C. \end{aligned}$$
(4)

3 Weak Convergence

In this section, we first establish some crucial lemmas and then we introduce a new extragradient algorithm (Algorithm 1) for solving the pseudomonotone equilibrium problem under the Bregman–Lipschitz-type condition. The algorithm is explicit in the sense that it is performed without prior knowledge of the Bregman–Lipschitz coefficients of the bifunction. This is particularly useful when these coefficients are unknown or difficult to approximate. We prove a weak convergence theorem (Theorem 5) for approximating a point of EP(f). For simplicity of presentation, we use the notation \([t]_{+}=\max \{0, t\}\) and adopt the conventions \(\frac{0}{0}=+\infty \) and \(\frac{a}{0}=+\infty \) \((a> 0).\)

Motivated by Proposition 2.5 of [29], we prove the following lemma.

Lemma 8

Let the function \(g:X\rightarrow (-\infty ,+\infty ]\) be G\({\hat{a}}\)teaux differentiable and totally convex on bounded subsets of X and let \(\{x_{n}\}\) and \(\{y_{n}\}\) be two sequences in int dom g and dom g, respectively. If \(\{x_{n}\}\) and \(\{D_{g}( y_{n},x_{n})\} \) are bounded, then the sequence \(\{y_{n}\}\) is also bounded.

Proof

Assume that the sequence \(\{y_{n}\}\) is not bounded. Then, there exists a subsequence \(\{y_{n_{k}}\}\) of \(\{y_{n}\}\) such that \(\displaystyle \lim _{k\rightarrow \infty }\Vert y_{n_{k}}\Vert =+ \infty .\) Since \(\{x_{n}\}\) is bounded, \(\displaystyle \lim _{k\rightarrow \infty }\Vert x_{n_{k}}-y_{n_{k}}\Vert =+\infty \), and hence there exists some \(k_{0} > 0\) such that \(\Vert x_{n_{k}}-y_{n_{k}}\Vert > 1\) for all \(k>k_{0}.\) So, using [10, Proposition 1.2.2], for all \(k>k_{0}\) we get

$$\begin{aligned} \upsilon _{g}(x_{n_{k}},\Vert x_{n_{k}}-y_{n_{k}}\Vert )\ge \Vert x_{n_{k}}-y_{n_{k}}\Vert \upsilon _{g}(x_{n_{k}},1). \end{aligned}$$
(5)

Since g is totally convex on bounded subsets of X and \(\{x_{n_{k}}\}\) is bounded, letting \(k\rightarrow \infty \) in (5) shows that \(\{\upsilon _{g}(x_{n},\Vert x_{n}-y_{n}\Vert )\}_{n \in {\mathbb {N}}}\) is not bounded. Since, by definition,

$$\begin{aligned} \upsilon _{g}(x_{n},\Vert x_{n}-y_{n}\Vert )\le D_{g}( y_{n},x_{n}), \end{aligned}$$

for all \(n \in {\mathbb {N}}\), this implies that the sequence \(\{D_{g}( y_{n},x_{n})\}_{n \in {\mathbb {N}}}\) cannot be bounded which is a contradiction. \(\square \)

Lemma 9

[30] Let \(g:X\rightarrow {\mathbb {R}}\) be a Legendre function such that \(\nabla g\) is weakly sequentially continuous and \(\nabla g^{*}\) is bounded on bounded subsets of dom \(g^{*}.\) Let \(\{x_{n}\}\) be a sequence in X and C be a nonempty subset of X. Suppose that for every \(x\in C,\) \(\{D_{g}(x,x_{n})\}\) converges and every weak cluster point of \(\{x_{n}\}\) belongs to C. Then, \(\{x_{n}\}\) converges weakly to a point in C.

Proof

It suffices to show that \(\{x_{n}\}\) has exactly one weak subsequential limit. Using Lemma 4, we get that \(\{x_{n}\}\) is bounded, so there is at least one weak subsequential limit of \(\{x_{n}\}.\) Suppose that x and y are two weak subsequential limits of \(\{x_{n}\}\) in C, say \(x_{k_{n}}\rightharpoonup x\) and \(x_{l_{n}}\rightharpoonup y.\) Since \(\nabla g\) is weakly sequentially continuous, we have \(\nabla g (x_{k_{n}})\rightharpoonup \nabla g(x)\) and \(\nabla g (x_{l_{n}})\rightharpoonup \nabla g(y).\) Since x and y belong to C, the sequences \(\{D_{g}(x,x_{n})\}\) and \(\{D_{g}(y,x_{n})\}\) converge. In turn, since

$$\begin{aligned} D_{g}(x,y) + D_{g}(y, x_{n})- D_{g}(x, x_{n})= \langle x-y, \nabla g(x_{n})- \nabla g(y) \rangle , \end{aligned}$$

and since the left-hand side has the same limit along every subsequence, passing to the limit along \(x_{k_{n}}\) and along \(x_{l_{n}}\), respectively, yields

$$\begin{aligned} \langle x-y, \nabla g(x)- \nabla g(y) \rangle = \langle x-y, \nabla g(y)- \nabla g(y) \rangle =0. \end{aligned}$$

Thus, \(D_{g}(x,y)+D_{g}(y,x)=0\) and hence \(x=y.\) \(\square \)

Definition 1

[2] Let \(g:X\rightarrow (-\infty ,+\infty ]\) be a G\({\hat{a}}\)teaux differentiable function and C be a nonempty subset of X. A sequence \(\{x_{n}\}\) in X is called Bregman monotone with respect to C if the following conditions hold:

  1. (i)

    \(C\cap dom\quad g\ne \emptyset .\)

  2. (ii)

    \(\{x_{n}\} \subset int \quad dom \quad g.\)

  3. (iii)

    \( D_{g}(x,x_{n+1})\le D_{g}(x,x_{n}),\)    \(\forall x\in C\cap dom\quad g,\) \(\forall n\in {\mathbb {N}}.\)

Lemma 10

Let \(g:X\rightarrow {\mathbb {R}}\) be a Legendre function such that \(\nabla g\) is weakly sequentially continuous and \(\nabla g^{*}\) is bounded on bounded subsets of dom \( g^{*}.\) Let \(\{x_{n}\}\) be a sequence in X and C be a nonempty subset of X. Suppose that \(\{x_{n}\}\) is Bregman monotone with respect to C and every weak cluster point of \(\{x_{n}\}\) belongs to C. Then, \(\{x_{n}\}\) converges weakly to a point in C.

Proof

This follows immediately from Lemma 9. \(\square \)

Lemma 11

Let \(g: X\rightarrow {\mathbb {R}}\) be a Legendre function which is totally convex on bounded subsets of X and such that \(\nabla g^{*}\) is bounded on bounded subsets of dom \( g^{*}.\) Let \(\{x_{n}\}\subset X\) be a Bregman monotone sequence with respect to C. Then, \(\{\overleftarrow{proj}_{C}^{g}(x_{n})\} \) converges strongly to some \(z\in C.\) Furthermore, if \(x_{n}\rightharpoonup {\overline{x}}\) and \(\nabla g\) is weakly sequentially continuous, then \({\overline{x}}=z.\)

Proof

Let \(u_{n}= \overleftarrow{proj}_{C}^{g}(x_{n}) \) and \(m> n.\) Since, by the definition of the Bregman projection and the Bregman monotonicity of \(\{x_{n}\}\) with respect to C,

$$\begin{aligned} D_{g}(u_{m},x_{m})\le D_{g}(u_{n},x_{m}) \le D_{g}(u_{n},x_{n}), \end{aligned}$$

the sequence \(\{D_{g}(u_{n},x_{n})\}\) is nonincreasing and bounded below, hence convergent and bounded. Using Lemmas 4 and 8, the sequence \(\{u_{n}\}\) is bounded too. Set \(r=\sup \{\Vert u_{n}\Vert :n\in {\mathbb {N}}\} .\) So, using Lemma 6 and Theorem 4, we have

$$\begin{aligned} D_{g}(\frac{u_{n}+u_{m}}{2},x_{m})=&g(\frac{u_{n}+u_{m}}{2})-g(x_{m}) -\langle \frac{u_{n}+u_{m}}{2}- x_{m}, \nabla g ( x_{m}) \rangle \\ \le&\frac{1}{2}g(u_{n}) + \frac{1}{2}g(u_{m}) - \frac{1}{4}\rho _{r}(\Vert u_{n}-u_{m} \Vert ) - g(x_{m})\\&\quad -\langle \frac{u_{n}+u_{m}}{2}- x_{m}, \nabla g ( x_{m}) \rangle \\ =&\frac{1}{2} D_{g}(u_{n},x_{m})+ \frac{1}{2} D_{g}(u_{m},x_{m}) - \frac{1}{4}\rho _{r}(\Vert u_{n}-u_{m} \Vert ), \end{aligned}$$

and then,

$$\begin{aligned} \frac{1}{4}\rho _{r}(\Vert u_{n}-u_{m} \Vert )\le&\frac{1}{2} D_{g}(u_{n},x_{m}) + \frac{1}{2} D_{g}(u_{m},x_{m})-D_{g}(\frac{u_{n}+u_{m}}{2},x_{m})\\ \le&\frac{1}{2} D_{g}(u_{n},x_{m})+ \frac{1}{2} D_{g}(u_{m},x_{m})-D_{g} (u_{m},x_{m})\\ \le&\frac{1}{2} D_{g}(u_{n},x_{n})- \frac{1}{2} D_{g}(u_{m},x_{m}). \end{aligned}$$

Therefore, \(\rho _{r}(\Vert u_{n}-u_{m} \Vert )\rightarrow 0\) as \(m, n\rightarrow \infty .\) Now, we show that

$$\begin{aligned} \lim _{n,m\rightarrow \infty } \Vert u_{n}-u_{m} \Vert =0. \end{aligned}$$

If this were not the case, there would exist \(\epsilon _{0}>0\) and subsequences \(\{n_{k}\}\) and \(\{m_{k}\}\) of \(\{n\}\) and \(\{m\}\), respectively, such that \(\Vert u_{n_{k}}-u_{m_{k}} \Vert \ge \epsilon _{0}\) for all k. Since \(\rho _{r}\) is nondecreasing, we get

$$\begin{aligned} \rho _{r}(\epsilon _{0})\le \rho _{r} (\Vert u_{n_{k}}-u_{m_{k}} \Vert ). \end{aligned}$$

Letting \(k\rightarrow \infty \), we have \( \rho _{r}(\epsilon _{0})\le 0.\) But this yields a contradiction to the uniform convexity of g on bounded subsets of X. Therefore, \(\{u_{n}\}\) is a Cauchy sequence and hence converges strongly to some \(z\in C.\)

Now, let \(x_{n}\rightharpoonup {\overline{x}}\) and \(\nabla g\) be weakly sequentially continuous. Using Lemma 5, we have

$$\begin{aligned} \big \langle \nabla g (x_{n}) - \nabla g \big (\overleftarrow{proj}_{C}^{g}(x_{n})\big ), {\overline{x}}- \overleftarrow{proj}_{C}^{g}(x_{n}) \big \rangle \le 0. \end{aligned}$$

Letting \(n\rightarrow \infty \), we get \(\langle \nabla g ({\overline{x}}) - \nabla g (z), {\overline{x}}- z \rangle \le 0.\) So, \(D_{g}({\overline{x}},z)+ D_{g}(z,{\overline{x}})\le 0\) and hence \({\overline{x}}=z.\) This completes the proof. \(\square \)

In order to obtain the convergence of our method, we consider the following blanket assumptions imposed on the bifunction f.

  1. (A1)

    f is pseudomonotone on C, i.e., for all \(x,y\in C\),

    $$\begin{aligned} f(x,y)\ge 0 \Rightarrow f(y,x)\le 0. \end{aligned}$$
  2. (A2)

    f satisfies the Bregman–Lipschitz-type condition.

  3. (A3)

    if \(\{x_{n}\}\subset C\) and \(x\in C\) are such that \(x_{n}\rightharpoonup x\) and \(\limsup _{n\rightarrow \infty }f(x_{n},y)\ge 0\) for all \(y\in C\), then \(f(x,y)\ge 0\) for all \(y\in C\) [26].

  4. (A4)

    f(x, .) is convex, lower semicontinuous and subdifferentiable on C for every fixed \(x\in C.\)

A bifunction f is called monotone on C if \(f(x,y)+ f(y,x)\le 0 \) for all \(x,y\in C.\) It is evident that every monotone bifunction is pseudomonotone, but not vice versa. A mapping \(A: C\rightarrow X^{*} \) is pseudomonotone if and only if the bifunction \(f(x,y)=\langle y-x, A(x)\rangle \) is pseudomonotone on C (see [47]).

Lemma 12

[6] If the bifunction f satisfies conditions \((A1)-(A4),\) then EP(f) is closed and convex.

Now, we present the following algorithm and we prove a weak convergence theorem (Theorem 5) and a strong convergence theorem (Theorem 6) for approximating a point of EP(f).

Algorithm 1 (Extragradient algorithm for EP)

Initialization. Choose \(x_{0}\in C,\) \(\lambda _{0}>0\) and \(\mu \in (0,1).\)

Iterative steps: Assume that \(x_{n}\in C\), \(\lambda _{n}\) \((n\ge 0)\) are known. Compute

\(x_{n+1}\) and \(\lambda _{n+1}\) as follows:

\(\qquad y_{n}=prox_{\lambda _{n}f(x_{n},.)}^{g}(x_{n})\),

\(\qquad x_{n+1}=prox_{\lambda _{n}f(y_{n},.)}^{g}(x_{n})\),

and set

\(\qquad \lambda _{n+1}=\min \big \{\lambda _{n}, \dfrac{\mu (D_{g}(y_{n},x_{n})+D_{g}(x_{n+1}, y_{n}))}{[f(x_{n},x_{n+1})-f(x_{n},y_{n})-f(y_{n},x_{n+1})]_{+}}\big \}.\)

Stopping criterion: If \(y_{n}=x_{n},\) then stop and \(x_{n}\) is a solution of EP(f).
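To illustrate how the steps of Algorithm 1 fit together, here is a minimal Python sketch for the Euclidean case \(g=\frac{1}{2}\Vert \cdot \Vert ^{2}\), in which every Bregman distance reduces to \(\frac{1}{2}\Vert \cdot -\cdot \Vert ^{2}\) and the prox subproblems are solved numerically with scipy over a box C. The bifunction, the helper names and the test problem (borrowed from Example 1 of Section 6) are for illustration only; the conventions \([t]_{+}\) and \(a/0=+\infty \) are implemented explicitly.

```python
import numpy as np
from scipy.optimize import minimize

def prox_step(phi, x, bounds):
    # argmin_{y in C} phi(y) + 0.5*||y - x||^2, with C a box described by `bounds`.
    obj = lambda y: phi(y) + 0.5 * np.sum((y - x) ** 2)
    return minimize(obj, x, bounds=bounds).x

def extragradient_ep(f, x0, bounds, lam0=0.1, mu=0.75, max_iter=200, tol=1e-8):
    # Sketch of Algorithm 1 with g = 0.5*||.||^2, so D_g(y, x) = 0.5*||y - x||^2.
    x, lam = np.asarray(x0, dtype=float), lam0
    for _ in range(max_iter):
        y = prox_step(lambda v: lam * f(x, v), x, bounds)
        if np.linalg.norm(y - x) < tol:              # stopping criterion y_n = x_n
            return x
        x_new = prox_step(lambda v: lam * f(y, v), x, bounds)
        num = mu * (0.5 * np.sum((y - x) ** 2) + 0.5 * np.sum((x_new - y) ** 2))
        den = max(f(x, x_new) - f(x, y) - f(y, x_new), 0.0)   # [t]_+
        lam = min(lam, num / den) if den > 0 else lam         # convention a/0 = +infinity
        x = x_new
    return x

# Test problem from Example 1: f(x, y) = (6x + 4y - 50)(y - x) on C = [4, 10].
f = lambda x, y: float((6 * x[0] + 4 * y[0] - 50) * (y[0] - x[0]))
print(extragradient_ep(f, [4.0], bounds=[(4.0, 10.0)]))  # expected to approach x* = 5
```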

Remark 1

Under hypothesis (A2), there exist some constants \(c_{1}>0\) and \(c_{2}>0\) such that

$$\begin{aligned} f(x_{n},x_{n+1})-f(x_{n},y_{n})-f(y_{n},x_{n+1}) \le&c_{1} D_{g}(y_{n},x_{n})+ c_{2}D_{g}(x_{n+1},y_{n})\\ \le&\max \{c_{1}, c_{2}\} \big (D_{g}(y_{n},x_{n})+D_{g}(x_{n+1},y_{n})\big ). \end{aligned}$$

Thus, from the definition of \(\lambda _{n+1}\), we see that the sequence \(\{\lambda _{n}\}\) is bounded from below: if \(\lambda _{0}\le \frac{\mu }{\max \{c_{1}, c_{2}\}}\), then \(\{\lambda _{n}\}\) is bounded from below by \(\lambda _{0}\); otherwise, it is bounded from below by \(\frac{\mu }{\max \{c_{1}, c_{2}\}}.\) Moreover, the sequence \(\{\lambda _{n}\}\) is nonincreasing. Therefore, there is a real number \( \lambda > 0\) such that \(\lambda _{n}\rightarrow \lambda \) as \(n\rightarrow \infty .\)

Theorem 5

Let \(g: X\rightarrow {\mathbb {R}}\) be a Legendre and supercoercive function which is bounded, uniformly Fr\(\acute{e}\)chet differentiable and totally convex on bounded subsets of X and let \(\nabla g\) be weakly sequentially continuous. Under conditions \((A1)-(A4)\), the sequence \(\{x_{n}\}\) generated by Algorithm 1 converges weakly to \({\bar{x}}\in EP(f),\) where \({\bar{x}}=\lim _{n\rightarrow \infty } \overleftarrow{proj}_{EP(f)}^{g}(x_{n}). \)

Proof

It follows from Lemma 7 and the definition of \(x_{n+1}\) that

$$\begin{aligned} \langle x_{n+1}- y, \nabla g( x_{n})-\nabla g( x_{n+1})\rangle \ge \lambda _{n}f( y_{n}, x_{n+1})-\lambda _{n}f( y_{n}, y), \quad \forall y\in C. \end{aligned}$$
(6)

From the definition of \(\lambda _{n+1},\) we get

$$\begin{aligned} f(x_{n},x_{n+1})-f(x_{n},y_{n})-f(y_{n},x_{n+1})\le \dfrac{\mu \big (D_{g}(y_{n},x_{n})+D_{g}(x_{n+1},y_{n})\big )}{\lambda _{n+1}}, \end{aligned}$$

which, after multiplying both sides by \(\lambda _{n}\) and rearranging, implies that

$$\begin{aligned} \lambda _{n}f(y_{n},x_{n+1})\ge \lambda _{n}\big (f(x_{n},x_{n+1})-f(x_{n},y_{n})\big )-\dfrac{\mu \lambda _{n}\big (D_{g}(y_{n},x_{n})+D_{g}(x_{n+1},y_{n})\big )}{\lambda _{n+1}}. \end{aligned}$$
(7)

Combining relations (6) and (7), we obtain

$$\begin{aligned} \big \langle x_{n+1}- y, \nabla g( x_{n})-\nabla g( x_{n+1})\big \rangle \ge&\lambda _{n}\big (f(x_{n},x_{n+1})-f(x_{n},y_{n})\big ) -\lambda _{n}f( y_{n},y)\nonumber \\&\quad -\dfrac{\mu \lambda _{n}\big (D_{g}(y_{n},x_{n})+D_{g}(x_{n+1}, y_{n})\big )}{\lambda _{n+1}}. \end{aligned}$$
(8)

Similarly, from Lemma 7 and the definition of \(y_{n}\) we obtain

$$\begin{aligned} \big \langle y_{n}-x_{n+1},\nabla g(y_{n})-\nabla g(x_{n})\big \rangle \le \lambda _{n}\big (f(x_{n},x_{n+1})-f(x_{n},y_{n})\big ). \end{aligned}$$
(9)

Using (8) and (9), we have

$$\begin{aligned} \big \langle x_{n+1}- y, \nabla g( x_{n})-\nabla g( x_{n+1})\big \rangle \ge&\big \langle y_{n}-x_{n+1},\nabla g(y_{n})-\nabla g(x_{n})\big \rangle -\lambda _{n}f( y_{n}, y)\nonumber \\&\quad -\dfrac{\mu \lambda _{n}\big (D_{g}(y_{n},x_{n})+D_{g} (x_{n+1},y_{n})\big )}{\lambda _{n+1}}. \end{aligned}$$
(10)

Applying the three-point identity of Bregman distance in (10) and a simple calculation, we get

$$\begin{aligned} D_{g}(y,x_{n+1})\le&D_{g}(y,x_{n})- (1-\frac{\mu \lambda _{n}}{\lambda _{n+1}})D_{g}(y_{n},x_{n})- (1-\frac{\mu \lambda _{n}}{\lambda _{n+1}})D_{g}(x_{n+1},y_{n})\nonumber \\&\quad + \lambda _{n}f( y_{n}, y). \end{aligned}$$
(11)

Let \(x^{*} \in EP(f).\) Since f is pseudomonotone, \(f(y_{n},x^{*})\le 0.\) Hence, substituting \(y= x^{*}\) into (11), we obtain

$$\begin{aligned} D_{g}(x^{*},x_{n+1})\le D_{g}(x^{*},x_{n})- (1-\frac{\mu \lambda _{n}}{\lambda _{n+1}})D_{g}(y_{n},x_{n})- (1-\frac{\mu \lambda _{n}}{\lambda _{n+1}})D_{g}(x_{n+1},y_{n}). \end{aligned}$$
(12)

Let \(\epsilon \in (0,1-\mu )\) be some fixed number. Applying Remark 1, \(\lambda _{n}\rightarrow \lambda >0\) and hence

$$\begin{aligned} \lim _{n\rightarrow \infty }(1-\frac{\mu \lambda _{n}}{\lambda _{n+1}})=1-\mu>\epsilon >0. \end{aligned}$$

Thus, there exists \(n_{0}\in {\mathbb {N}}\) such that

$$\begin{aligned} 1-\frac{\mu \lambda _{n}}{\lambda _{n+1}}>\epsilon >0, \quad \forall n\ge n_{0}. \end{aligned}$$
(13)

From (12) and (13), we obtain

$$\begin{aligned} D_{g}(x^{*},x_{n+1})\le D_{g}(x^{*},x_{n})-\epsilon (D_{g}(y_{n},x_{n})+D_{g}(x_{n+1},y_{n})), \end{aligned}$$
(14)

or

$$\begin{aligned} a_{n+1}\le a_{n}-b_{n}, \end{aligned}$$

where \(a_{n}=D_{g}(x^{*},x_{n})\) and \(b_{n}=\epsilon (D_{g}(y_{n},x_{n})+D_{g}(x_{n+1},y_{n})).\) Thus, the limit of \(\{a_{n}\}\) exists and \(\lim _{n\rightarrow \infty }b_{n}=0.\) From Lemma 2, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\Vert y_{n}-x_{n}\Vert =\lim _{n\rightarrow \infty }\Vert x_{n+1}-y_{n}\Vert =0, \end{aligned}$$
(15)

and hence \(\lim _{n\rightarrow \infty }\Vert x_{n+1}-x_{n}\Vert =0.\) Therefore, from Lemma 1 we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\Vert \nabla g (x_{n+1})-\nabla g(x_{n})\Vert =0. \end{aligned}$$
(16)

From Theorems 2 and 4, \(g^{*}\) is bounded on bounded subsets of \(X^{*}\) and hence \(\nabla g^{*}\) is also bounded on bounded subsets of \(X^{*}\). From this, (14) and Lemma 4, the sequence \(\{x_{n}\}\) is bounded. Now, we prove that each weak cluster point of \(\{x_{n}\}\) belongs to EP(f). Suppose that \({\overline{x}}\) is a weak cluster point of \(\{x_{n}\}\), that is, there exists a subsequence \(\{x_{n_{k}}\}\) of \(\{x_{n}\}\) such that \( x_{n_{k}}\rightharpoonup {\overline{x}}\) as \(k\rightarrow \infty .\) Since \(\Vert x_{n}-y_{n}\Vert \rightarrow 0\) as \(n\rightarrow \infty \), we also have \(y_{n_{k}}\rightharpoonup {\overline{x}}\) as \(k\rightarrow \infty .\) Passing to the limit in (11) along \(n=n_{k},\) we obtain

$$\begin{aligned} \limsup _{k\rightarrow \infty }f(y_{n_{k}},y) \ge&\frac{1}{\lambda }\limsup _{k\rightarrow \infty }(D_{g}(y,x_{n_{k}+1}) - D_{g}(y,x_{n_{k}}))\\ \ge&\frac{1}{\lambda }\limsup _{k\rightarrow \infty }(D_{g}(y,x_{n_{k}+1}) - D_{g}(y,x_{n_{k}})- D_{g}(x_{n_{k}},x_{n_{k}+1}))\\ =&\frac{1}{\lambda }\lim _{k\rightarrow \infty }\langle x_{n_{k}}-y,\nabla g(x_{n_{k}+1})-\nabla g(x_{n_{k}})\rangle . \end{aligned}$$

From the above inequality, (16), the boundedness of \(\{x_{n}\}\) and (A3), we obtain \(f({\overline{x}},y)\ge 0\) for all \(y\in C\). Hence, \({\overline{x}}\in EP(f).\) Using (12) and (13), we see that \(\{x_{n}\}_{n\ge n_{0}}\) is Bregman monotone with respect to EP(f). Thus, applying Lemmas 10, 11 and 12, we get the desired result. \(\square \)

4 Strong Convergence

In this section, we analyze the strong convergence of the sequence generated by Algorithm 1 to an element of EP(f) under various additional assumptions on the problem. In the following theorem, we also assume that the bifunction f satisfies the following condition:

(A5) for all bounded sequences \(\{x_{n}\}\) and \(\{y_{n}\}\) in C

$$\begin{aligned} \Vert x_{n}-y_{n}\Vert \rightarrow 0\Longrightarrow f(x_{n},y_{n})\rightarrow 0. \end{aligned}$$

Definition 2

A bifunction f is called strongly pseudomonotone on C, if there exists \(\beta >0\) such that whenever \(f(x, y) \ge 0\), then \(f(y, x) \le -\beta \Vert x-y\Vert ^{2}\) for all \(x, y \in C\).

Definition 3

A function \(h : X\rightarrow (-\infty ,+\infty ]\) is called uniformly convex with modulus \(\psi :[0,+\infty )\rightarrow [0,+\infty ]\) if \(\psi \) is increasing, vanishes only at 0, and for each pair \(x, y\in \) dom h and each \(\alpha \in (0,1)\),

$$\begin{aligned} h(\alpha x + (1 -\alpha )y) \le \alpha h(x) + (1 -\alpha )h(y) -\alpha (1 -\alpha )\psi (\Vert x-y\Vert ). \end{aligned}$$
(17)

If (17) holds with \(\psi =\frac{\sigma }{2}\Vert .\Vert ^2\) for some \(\sigma >0,\) then h is strongly convex with constant \(\sigma \). We say that h is strongly concave whenever \(-h\) is strongly convex.

Motivated by Theorem 4.3 of [26], we prove the following theorem.

Theorem 6

Let \(g: X\rightarrow {\mathbb {R}}\) be a Legendre and supercoercive function which is bounded, uniformly Fr\(\acute{e}\)chet differentiable and totally convex on bounded subsets of X and let hypotheses \((A1)-(A5)\) be satisfied. If one of the following conditions holds:

  1. (i)

    f is strongly pseudomonotone,

  2. (ii)

    for all \(x \in C,\) f(x, .) is uniformly convex with modulus \(\psi \),

  3. (iii)

    for all \(y \in C\), f(., y) is strongly concave with constant \(\sigma ,\)

then the sequence \(\{x_{n}\}\) generated by Algorithm 1 converges strongly to an element of EP(f).

Proof

Observe that in Theorem 5, we proved that all weak cluster points of \(\{x_{n}\}\) belong to EP(f). Now, if \(\{x_{n_{m}}\}\) and \(\{x_{k_{m}}\}\) are arbitrary subsequences of \(\{x_{n}\}\) that converge strongly to p and q, respectively, then by the four-point identity,

$$\begin{aligned}&\langle p-q,\nabla g(x_{n_{m}})-\nabla g(x_{k_{m}})\rangle \\&\quad = D_{g}(p,x_{k_{m}})+D_{g} (q,x_{n_{m}})-D_{g}(p,x_{n_{m}})-D_{g}(q,x_{k_{m}}). \end{aligned}$$

From (14), \(\lim _{n\rightarrow \infty } D_{g}(p,x_{n})\) and \(\lim _{n\rightarrow \infty } D_{g}(q,x_{n})\) exist. Using this, Lemma 1 and letting \(m\rightarrow \infty ,\) we get \(\langle p-q,\nabla g(p)-\nabla g(q)\rangle =0,\) that is, \(D_{g}(p,q)+D_{g}(q,p)=0,\) and hence \(p=q\). That is, \(\{x_{n}\}\) converges strongly to a point of EP(f).

Therefore, in each case it remains to prove that if \(x_{n_{k}}\rightharpoonup x^{*},\) then \(x_{n_{k}}\rightarrow x^{*}.\) Suppose that \(x_{n_{k}}\rightharpoonup x^{*}.\) Consequently, by (15), \(y_{n_{k}}\rightharpoonup x^{*}.\) Substituting \(y=x^{*}\) into (6), we have

$$\begin{aligned} 0\le&\lambda _{n_{k}}f( y_{n_{k}}, x^{*})-\lambda _{n_{k}}f( y_{n_{k}}, x_{n_{k}+1})+\langle x_{n_{k}+1}-x^{*}, \nabla g(x_{n_{k}}) -\nabla g( x_{n_{k}+1})\rangle \\ \le&\lambda _{n_{k}}f( y_{n_{k}}, x^{*})-\lambda _{n_{k}}f( y_{n_{k}}, x_{n_{k}+1})+\Vert x_{n_{k}+1}-x^{*}\Vert \Vert \nabla g(x_{n_{k}})-\nabla g( x_{n_{k}+1})\Vert . \end{aligned}$$

From (15), (16), Remark 1, condition A5 and boundedness of \(\{x_{n}\}\), we get

$$\begin{aligned} \liminf _{k\rightarrow \infty }f( y_{n_{k}}, x^{*})\ge 0. \end{aligned}$$
(18)

(i) Since \(x^{*}\in EP(f),\) we have \(f(x^{*}, y_{n_{k}})\ge 0\), so the strong pseudomonotonicity of f yields a \(\beta >0\) such that \(f( y_{n_{k}}, x^{*})\le -\beta \Vert y_{n_{k}} - x^{*}\Vert ^{2}. \) This together with (18) implies that

$$\begin{aligned} 0\le \liminf _{k\rightarrow \infty }f( y_{n_{k}}, x^{*})\le \liminf _{k\rightarrow \infty }(-\beta \Vert y_{n_{k}} - x^{*}\Vert ^{2})\le -\beta \limsup _{k\rightarrow \infty }\Vert y_{n_{k}} - x^{*}\Vert ^{2}. \end{aligned}$$

Therefore, \(y_{n_{k}}\rightarrow x^{*}\), and hence, \(x_{n_{k}}\rightarrow x^{*}.\)

(ii) Let \(\alpha \in (0,1)\) and set \(w_{n_{k}}=\alpha y_{n_{k}} + (1-\alpha )x^{*}\) for all \(k\in {\mathbb {N}}.\) Replacing y with \(w_{n_{k}}\) in (6) and using the uniform convexity of \(f( y_{n_{k}},\cdot )\), we get

$$\begin{aligned} 0\le&\lambda _{n_{k}}f( y_{n_{k}},w_{n_{k}})-\lambda _{n_{k}}f( y_{n_{k}}, x_{n_{k}+1})+\langle x_{n_{k}+1}-w_{n_{k}}, \nabla g(x_{n_{k}}) -\nabla g( x_{n_{k}+1})\rangle \\ \le&\lambda _{n_{k}}\alpha f( y_{n_{k}}, y_{n_{k}}) + \lambda _{n_{k}}(1-\alpha )f(y_{n_{k}},x^{*}) -\lambda _{n_{k}}\alpha (1-\alpha )\psi (\Vert y_{n_{k}} - x^{*}\Vert ) \\&\quad -\lambda _{n_{k}}f( y_{n_{k}}, x_{n_{k}+1})+\Vert x_{n_{k}+1}-w_{n_{k}}\Vert \Vert \nabla g(x_{n_{k}})-\nabla g( x_{n_{k}+1})\Vert . \end{aligned}$$

Hence, we have

$$\begin{aligned}&\lambda _{n_{k}}\alpha (1-\alpha )\psi (\Vert y_{n_{k}} - x^{*}\Vert )\le -\lambda _{n_{k}}f( y_{n_{k}}, x_{n_{k}+1})\\&\quad +\Vert x_{n_{k}+1}-w_{n_{k}}\Vert \Vert \nabla g(x_{n_{k}})-\nabla g( x_{n_{k}+1})\Vert . \end{aligned}$$

Since \(\lim _{n\rightarrow \infty }\lambda _{n}>0\), the boundedness of the sequences \(\{x_{n}\}\) and \(\{w_{n}\},\) condition (A5) and (16) imply that \(\lim _{k\rightarrow \infty }\psi (\Vert y_{n_{k}} - x^{*}\Vert )=0.\) Applying an argument similar to the one used in the proof of Lemma 11, we deduce that \(y_{n_{k}}\rightarrow x^{*}\) and hence \(x_{n_{k}}\rightarrow x^{*}.\)

(iii) Let \(\alpha \in (0,1)\) and set \(w_{n_{k}}=\alpha y_{n_{k}} + (1-\alpha )x^{*}\) for all \(k\in {\mathbb {N}}.\) Then, since \(f(\cdot ,x^{*})\) is strongly concave and \(f(w_{n_{k}},x^{*})\le 0\) by the pseudomonotonicity of f, we have

$$\begin{aligned} \alpha f(y_{n_{k}},x^{*})+ (1-\alpha )f(x^{*},x^{*})+\frac{1}{2}\alpha (1-\alpha )\sigma \Vert y_{n_{k}}-x^{*}\Vert ^{2}\le f(w_{n_{k}},x^{*})\le 0. \end{aligned}$$

Therefore, we get \(f(y_{n_{k}},x^{*})\le -\frac{1}{2}\sigma (1-\alpha )\Vert y_{n_{k}}-x^{*}\Vert ^{2}\). Next, similar to item (i), we get the desired result. \(\square \)

Remark 2

It is worth mentioning that in Theorem 6, unlike Theorem 5, we do not need \(\nabla g\) to be weakly sequentially continuous.

5 Application

In this section, we study the particular equilibrium problem associated with the bifunction f defined for every \(x,y\in C\) by \(f(x,y)=\langle y-x, Ax\rangle ,\) where \(A:C\rightarrow X^{*}.\) In this way, we obtain the classical variational inequality:

$$\begin{aligned} \text {Find} \quad x^{*}\in C \quad \text {such that} \quad \langle y-x^{*}, Ax^{*}\rangle \ge 0,\quad \forall y\in C. \end{aligned}$$
(19)

The set of solutions of (19) is denoted by VI(A, C).

Lemma 13

[14] Let C be a nonempty closed convex subset of a reflexive Banach space X, \( A:C\rightarrow X^{*}\) be a mapping and \(g: X\rightarrow {\mathbb {R}}\) be a Legendre function. Then,

$$\begin{aligned} \overleftarrow{proj}_{C}^{g}(\nabla g^{*}[\nabla g(x)- \lambda A(y)])= argmin_{w\in C} \big \{ \lambda \langle w-y,A(y)\rangle + D_{g}(w,x)\big \}, \end{aligned}$$

for all \(x \in X,\) \(y \in C\) and \(\lambda \in (0,+\infty ).\)

Let X be a real Banach space and \(1<q\le 2\le p\) with \(\frac{1}{p} +\frac{1}{q}=1.\) The modulus of convexity \(\delta _{X}:[0,2]\rightarrow [0,1]\) is defined by:

$$\begin{aligned} \delta _{X}(\epsilon )=\inf \{1-\frac{\Vert x+y\Vert }{2} :\Vert x\Vert =\Vert y\Vert =1,\Vert x-y\Vert \ge \epsilon \}. \end{aligned}$$

X is called uniformly convex if \( \delta _{X}(\epsilon )>0\) for any \(\epsilon \in (0,2],\) p-uniformly convex if there is a \(c_{p}>0\) so that \(\delta _{X}(\epsilon )\ge c_{p}\epsilon ^{p} \) for any \(\epsilon \in (0,2].\) The modulus of smoothness \(\rho _{X}(\tau ):[0,\infty )\rightarrow [0,\infty )\) is defined by:

$$\begin{aligned} \rho _{X}(\tau )=\sup \{\frac{\Vert x+\tau y\Vert +\Vert x-\tau y\Vert }{2}-1:\Vert x\Vert =\Vert y\Vert =1\}. \end{aligned}$$

X is called uniformly smooth if \(\lim _{\tau \rightarrow 0}\dfrac{\rho _{X}(\tau )}{\tau }=0.\)

For the p-uniformly convex space, the metric and Bregman distance have the following relation [44]:

$$\begin{aligned} \tau \Vert x-y\Vert ^{p}\le D_{\frac{1}{p}\Vert .\Vert ^{p}} (x, y)\le \langle x-y,J_{X}^{p}(x)-J_{X}^{p}(y) \rangle , \end{aligned}$$
(20)

where \(\tau >0\) is a fixed number and the duality mapping \(J_{X}^{p}: X\rightarrow 2^{X^{*}}\) is defined by:

$$\begin{aligned} J_{X}^{p}(x)=\{f\in X^{*}:\langle x,f\rangle =\Vert x\Vert ^{p}, \Vert f\Vert =\Vert x\Vert ^{p-1}\}, \end{aligned}$$

for every \(x\in X.\) It is well known that X is smooth if and only if \(J_{X}^{p}\) is a single-valued mapping of X into \(X^{*}.\) We also know that X is reflexive if and only if \(J_{X}^{p}\) is surjective, and X is strictly convex if and only if \(J_{X}^{p}\) is one-to-one. Therefore, if X is a smooth, strictly convex and reflexive Banach space, then \(J_{X}^{p}\) is a single-valued bijection and in this case, \(J_{X}^{p}=(J_{X^{*}}^{q})^{-1},\) where \(J_{X^{*}}^{q}\) is the duality mapping of \(X^{*}.\)

For \(p=2,\) the duality mapping \(J_{X}^{p}\) is called the normalized duality mapping and is denoted by J. The function \(\phi : X ^{2}\rightarrow {\mathbb {R}}\) is defined by:

$$\begin{aligned} \phi (y, x) = \parallel y\parallel ^{2}-2\langle y,Jx\rangle +\parallel x\parallel ^{2}, \end{aligned}$$

for all \(x ,y \in X.\) The generalized projection \(\Pi _{C}\) from X onto C is defined by:

$$\begin{aligned} \Pi _{C}(x) = argmin_{y\in C}\phi (y,x),\quad \forall x\in X, \end{aligned}$$

where C is a nonempty closed and convex subset of X. Let X be a uniformly convex and uniformly smooth Banach space and \(g(.)=\frac{1}{2}\Vert .\Vert ^{2}.\) Then \(\nabla g=J\), \(D_{\frac{1}{2}\Vert .\Vert ^{2}}(x,y)= \dfrac{1}{2}\phi (x,y)\) and \(\overleftarrow{proj}_{C}^{\frac{1}{2}\Vert .\Vert ^{2}}=\Pi _{C} .\) For solving the variational inequality (19) in a 2-uniformly convex and uniformly smooth Banach space X, we consider the following algorithm.

Algorithm 2 (Extragradient algorithm for VI)

Initialization. Choose \(x_{0}\in C,\) \(\lambda _{0}>0\) and \(\mu \in (0,1).\)

Iterative steps: Assume that \(x_{n}\in C\), \(\lambda _{n}\) \((n\ge 0)\) are known. Compute

\(x_{n+1}\) and \(\lambda _{n+1}\) as follows:

\(\qquad y_{n}=\Pi _{C}(J^{-1}(J(x_{n})-\lambda _{n}Ax_{n})),\)

\( \qquad x_{n+1}=\Pi _{C}(J^{-1}(J(x_{n})-\lambda _{n}Ay_{n})),\)

and set

\(\lambda _{n+1}=\min \big \{\lambda _{n}, \dfrac{\mu (\phi (y_{n},x_{n})+\phi (x_{n+1},y_{n}))}{[\langle x_{n+1}-y_{n},Ax_{n}-Ay_{n}\rangle ]_{+}}\big \}.\)

Stopping criterion: If \(y_{n}=x_{n}\), then stop and \(x_{n}\) is a solution of VI(A, C).
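In a Hilbert space, J is the identity and \(\Pi _{C}\) is the metric projection, so Algorithm 2 can be sketched in a few lines. The Python code below is only an illustration (the operator, the set C and the parameter choices are ours, not taken from the paper); it applies the scheme to an affine, strongly monotone (hence strongly pseudomonotone) operator on a Euclidean ball.

```python
import numpy as np

def project_ball(x, center, radius):
    # Euclidean projection onto the ball {z : ||z - center|| <= radius}.
    d = x - center
    n = np.linalg.norm(d)
    return x if n <= radius else center + radius * d / n

def extragradient_vi(A, x0, project, lam0=0.1, mu=0.4, max_iter=500, tol=1e-10):
    # Sketch of Algorithm 2 in a Hilbert space: J is the identity and phi(y, x) = ||y - x||^2.
    x, lam = np.asarray(x0, dtype=float), lam0
    for _ in range(max_iter):
        y = project(x - lam * A(x))
        if np.linalg.norm(y - x) < tol:                        # stopping criterion
            return x
        x_new = project(x - lam * A(y))
        num = mu * (np.sum((y - x) ** 2) + np.sum((x_new - y) ** 2))
        den = max(float(np.dot(x_new - y, A(x) - A(y))), 0.0)  # [t]_+
        lam = min(lam, num / den) if den > 0 else lam
        x = x_new
    return x

# Illustrative affine, strongly monotone operator A(x) = Mx + q on a ball.
M = np.array([[4.0, -1.0], [-1.0, 3.0]])
q = np.array([-4.0, -3.0])
A = lambda x: M @ x + q
sol = extragradient_vi(A, np.array([2.0, 2.0]),
                       lambda z: project_ball(z, np.zeros(2), 5.0))
print(sol)  # expected to approximate the unique solution of VI(A, C)
```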

Corollary 1

Let X be a 2-uniformly convex and uniformly smooth Banach space. Suppose that \(A:C\rightarrow X^{*}\) is a strongly pseudomonotone, L-Lipschitz continuous and weak-to-norm continuous mapping. Then, the sequence \(\{x_{n}\}\) generated by Algorithm 2 converges strongly to an element of VI(A, C).

Proof

Let \(f(x,y){:}{=} \langle y-x, A(x)\rangle \) for all \(x,y\in C.\) Since A is L-Lipschitz continuous, for all \(x,y,z\in C\), we have

$$\begin{aligned} f(x,y)+ f(y,z)-f(x,z)=&\langle y-x, A(x)\rangle + \langle z-y, A(y)\rangle - \langle z-x, A(x)\rangle \\ =&-\langle y-z, A(y)- A(x)\rangle \\ \ge&-\Vert A(y)- A(x)\Vert \Vert y-z\Vert \\ \ge&-L\Vert y-x\Vert \Vert y-z\Vert \\ \ge&-\frac{L}{2}\Vert y-x\Vert ^{2} -\frac{L}{2}\Vert y-z\Vert ^{2}\\ \ge&-\frac{L}{4\tau }\phi (y,x)-\frac{L}{4\tau }\phi (z,y), \qquad \text {by } (20). \end{aligned}$$

Therefore, f satisfies the Bregman–Lipschitz-type condition with respect to \(g(.)=\frac{1}{2}\Vert .\Vert ^{2}\) with \(c_{1}=c_{2}= \frac{L}{2\tau }\). Moreover, the strong pseudomonotonicity of A implies the strong pseudomonotonicity of f. Conditions (A3) and (A4) are satisfied automatically, and (A5) holds because \(|f(x_{n},y_{n})|\le \Vert Ax_{n}\Vert \Vert y_{n}-x_{n}\Vert \rightarrow 0\) whenever \(\{x_{n}\}\) and \(\{y_{n}\}\) are bounded and \(\Vert x_{n}-y_{n}\Vert \rightarrow 0\) (A is Lipschitz continuous, hence bounded on bounded sets). Using Theorem 6 and Lemma 13, we get the desired result. \(\square \)

6 Numerical Experiments

In this section, we give two numerical examples to show that our algorithm is efficient and converges faster than Algorithm 1 of [22]. The optimization subproblems in these examples were solved with the FMINCON routine of the MATLAB Optimization Toolbox.

Theorem 7

[5] Let \(g:X\rightarrow {\mathbb {R}}\) be a G\({\hat{a}}\)teaux differentiable function. The function g is strongly convex with constant \(\sigma >0\) if and only if

$$\begin{aligned} g(x)-g(y)\ge \langle x-y,\nabla g(y)\rangle +\frac{\sigma }{2}\Vert x-y\Vert ^2, \quad \forall x,y\in X. \end{aligned}$$

Proposition 1

[4] Let H be a Hilbert space and \(g :H\rightarrow (-\infty ,+\infty ]\) be proper and let \(\sigma >0.\) Then, g is strongly convex with constant \(\sigma >0\) if and only if \(g-\frac{\sigma }{2}\Vert .\Vert ^{2}\) is convex.

Example 1

Let \(X={\mathbb {R}}\) and \(C=[4,10]\) with the usual norm |.|. Define the bifunction \(f:C\times C\rightarrow {\mathbb {R}} \) as follows:

$$\begin{aligned} f(x,y)=(6x+4y-50)(y-x),\quad \forall x,y\in C. \end{aligned}$$

Let \(p\in (1,+\infty )\) and \(g_{p}(.)=\frac{1}{p}\Vert .\Vert ^{p}.\) Using Proposition 1, we can easily see that the function \(g_{p}\) is strongly convex on C with constant

$$\begin{aligned} \sigma _{p}= \left\{ \begin{array}{lr} (p-1) 10^{p-2},\quad \qquad \text {if} \quad p<2,\\ (p-1) 4^{p-2},\qquad \qquad \text {otherwise.} \end{array} \right. \end{aligned}$$
(21)

Obviously, the bifunction f satisfies conditions A1, A3 and A4. Furthermore,

$$\begin{aligned} f(x,y) +f(y,z)-f(x,z)&=2(z-y)(y-x)\\&\ge -2|z-y| |y-x|\\&\ge -|z-y|^{2}- |y-x|^{2}\\&\ge \frac{-2}{\sigma _{p}} D_{g_{p}}(z,y)-\frac{2}{\sigma _{p}} D_{g_{p}}(y,x), \end{aligned}$$

where the last inequality follows from the strong convexity of \(g_{p}\) on C. Hence, f satisfies condition (A2). Put \(\lambda _{0}=\frac{1}{10}\) and \(\mu =\frac{3}{4}\). It can be seen that all the hypotheses of Theorem 5 are satisfied and \(EP(f)=\{5\}.\) Using Algorithm 1 with the initial point \( x_{0}=4,\) we obtain the numerical results shown in Figs. 1 and 2. For \(p=1.5,\) 2 and 2.5, the approximate solutions are, respectively,

$$\begin{aligned} x_{9}=4.9999990, \quad x_{21}= 4.9999994, \quad x_{53}=4.9999990, \end{aligned}$$

with the tolerance \(\epsilon =10^{-6}.\) Therefore, the rate of convergence decreases as p increases. Note that for \(g(.)=\frac{1}{2}\Vert .\Vert ^2\) our Algorithm 1 reduces to Algorithm 1 of [22], and we see that our Algorithm 1 with \(g(.)=\frac{2}{3}\Vert .\Vert ^\frac{3}{2}\) converges faster than Algorithm 1 of [22].
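For readers who wish to reproduce the setting of Example 1 with a non-Euclidean kernel, the following Python sketch (our own helper names; FMINCON is replaced by scipy's bounded scalar minimizer) runs Algorithm 1 with \(g_{p}(t)=\frac{1}{p}|t|^{p}\) on \(C=[4,10]\). The iterate counts reported above were obtained with MATLAB, so this sketch only illustrates the computations rather than reproducing the exact figures.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def D_gp(y, x, p):
    # Bregman distance of g_p(t) = |t|^p / p on the real line.
    gp = lambda t: abs(t) ** p / p
    return gp(y) - gp(x) - (y - x) * np.sign(x) * abs(x) ** (p - 1)

def prox_gp(phi, x, a, b, p):
    # argmin_{y in [a, b]} phi(y) + D_{g_p}(y, x)
    obj = lambda y: phi(y) + D_gp(y, x, p)
    return minimize_scalar(obj, bounds=(a, b), method="bounded").x

def algorithm1_example1(p, x0=4.0, lam0=0.1, mu=0.75, iters=60):
    f = lambda x, y: (6 * x + 4 * y - 50) * (y - x)     # bifunction of Example 1
    x, lam = x0, lam0
    for _ in range(iters):
        y = prox_gp(lambda v: lam * f(x, v), x, 4.0, 10.0, p)
        x_new = prox_gp(lambda v: lam * f(y, v), x, 4.0, 10.0, p)
        num = mu * (D_gp(y, x, p) + D_gp(x_new, y, p))
        den = max(f(x, x_new) - f(x, y) - f(y, x_new), 0.0)
        lam = min(lam, num / den) if den > 0 else lam
        x = x_new
    return x

print(algorithm1_example1(1.5))  # expected to approach the solution x* = 5
```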

Theorem 8

[5] The quadratic function \(g(x) = x^{T} Ax+ 2b^{T} x + c\) with \(A = A^{T}\in {\mathbb {R}}^{n\times n},\) \( b\in {\mathbb {R}}^{n}\) and \(c\in {\mathbb {R}}\) is strongly convex if and only if A is positive definite, and in that case the strong convexity constant is \(2\lambda _{\min }(A),\) where \(\lambda _{\min }(A)\) is the minimum eigenvalue of A.

Remark 3

Let \(g(x) = x^{T} Ax+ 2b^{T} x + c\), where A is positive definite, \( b\in {\mathbb {R}}^{n}\) and \(c\in {\mathbb {R}};\) then

$$\begin{aligned} \nabla g(x)=2(Ax+b), \quad g^*(y)=\frac{1}{4}(y-2b)^{T}A^{-1}(y-2b)-c. \end{aligned}$$
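The formulas of Remark 3 can be checked numerically. In the Python snippet below (the test vector y and the constant c are arbitrary choices of ours), the Fenchel conjugate is evaluated by direct maximization and compared with the closed form, using the matrix A and vector b of Example 2; the gradient formula is checked by central differences.

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[11.0, 5.0, 1.0], [5.0, 5.0, 6.0], [1.0, 6.0, 13.0]])
b = np.array([1.0, 0.0, 3.0])
c = 2.0                                            # arbitrary constant term

g = lambda x: x @ A @ x + 2 * b @ x + c
# Fenchel conjugate by direct maximization of <x, y> - g(x) ...
y = np.array([1.0, -2.0, 0.5])
res = minimize(lambda x: -(x @ y - g(x)), np.zeros(3))
# ... compared with the closed form stated in Remark 3.
closed_form = 0.25 * (y - 2 * b) @ np.linalg.solve(A, y - 2 * b) - c
print(-res.fun, closed_form)                       # the two values should agree
# Gradient formula: nabla g(x) = 2(Ax + b); check at a random point.
x0 = np.random.randn(3)
eps = 1e-6
num_grad = np.array([(g(x0 + eps * e) - g(x0 - eps * e)) / (2 * eps) for e in np.eye(3)])
print(np.allclose(num_grad, 2 * (A @ x0 + b), atol=1e-4))   # True
```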
Fig. 1: Plot of \(\log \Vert x_{n}-x_{n-1}\Vert \) in Example 1.

Fig. 2: Plot of \(\log \Vert x_{n}-5\Vert \) in Example 1.

Fig. 3: Plot of \(\log \Vert x_{n}-x_{n-1}\Vert \) in Example 2.

Fig. 4: Plot of \(\log \Vert x_{n}-(3,3,3)\Vert \) in Example 2.

Fig. 5: Plot of Example 2 with different choices of \(\mu \) and \(\lambda _{0}\).

Fig. 6: Plot of Example 2 with different choices of \(\mu \) and \(\lambda _{0}\).

Example 2

Let \(X={\mathbb {R}}^{3}\) with the Euclidean norm and

$$\begin{aligned} C=\{x\in {\mathbb {R}}^{3}:\ \Vert x-(4,4,4)\Vert \le 2\}. \end{aligned}$$

Define the bifunction \(f:C\times C\rightarrow {\mathbb {R}} \) and the quadratic function \(g:X\rightarrow {\mathbb {R}}\) as follows:

$$\begin{aligned} f(x,y)=\langle Px+Qy+q, y-x\rangle ,\quad g(x)=x^TAx+2b^Tx+c, \end{aligned}$$

where

$$\begin{aligned}&P=\left[ \begin{array}{c@{\qquad }c@{\qquad }c} 8 &{} -3 &{} -1 \\ -3 &{} 10 &{} -7\\ -1 &{}-7 &{}13 \\ \end{array} \right] ,\quad Q=\left[ \begin{array}{c@{\qquad }c@{\qquad }c} 6 &{} -2 &{} -1 \\ -2 &{} 8 &{} -6\\ -1 &{} -6 &{}11 \\ \end{array} \right] ,\quad q=\left[ \begin{array}{c@{\qquad }c@{\qquad }c} -21 \\ 0 \\ -27 \\ \end{array} \right] , \end{aligned}$$

and

$$\begin{aligned}&A=\left[ \begin{array}{c@{\qquad }c@{\qquad }c} 11 &{} 5 &{} 1 \\ 5 &{} 5 &{} 6\\ 1 &{} 6 &{}13 \\ \end{array} \right] ,\quad b=\left[ \begin{array}{c@{\qquad }c@{\qquad }c} 1 \\ 0 \\ 3 \\ \end{array} \right] , \end{aligned}$$

and \(c\in {\mathbb {R}}.\) Obviously, the bifunction f satisfies conditions A1, A3 and A4. Furthermore, using [36, Lemma 6.2], Theorems 7 and 8, we have

$$\begin{aligned} f(x,y) +f(y,z)-f(x,z)&\ge -\dfrac{\Vert Q-P\Vert }{2}\Vert x-y\Vert ^{2}-\dfrac{\Vert Q-P\Vert }{2}\Vert y-z\Vert ^{2}\\&\ge -\dfrac{\Vert Q-P\Vert }{2\lambda _{\min }(A)}D_{g}(y,x)-\dfrac{\Vert Q-P\Vert }{2\lambda _{\min }(A)}D_{g}(z,y), \end{aligned}$$

which shows that the bifunction f satisfies (A2). Since \(q=(-Q-P)(3,3,3)^T\) and Q is a symmetric positive semidefinite matrix, \(EP(f)=\{(3,3,3)^T\}.\) Using Theorem 3 and Remark 3, we get that g is a supercoercive function, and the total convexity of g on bounded subsets of X follows from the strong convexity of g. Put \(\lambda _{0}=\frac{1}{100}\) and \( \mu =\frac{1}{10}\). It can be seen that all the hypotheses of Theorem 5 are satisfied. Applying Algorithm 1 with the initial point \( x_{0}=(4,4,5)\), we obtain the numerical results shown in Figs. 3 and 4 (see also Figs. 5, 6). In this example, as in the first one, we see that our Algorithm 1 converges faster than Algorithm 1 of [22].