1 Introduction

Let \(E_1\) and \(E_2\) be real Banach spaces. Let C and Q be nonempty, closed and convex subsets of \(E_1\) and \(E_2\), respectively. Let \(A : E_1\rightarrow E_2\) be a bounded linear operator whose adjoint operator \(A^*:E_2^*\rightarrow E_1^*\) is defined by

$$\begin{aligned} \left\langle A^*{\bar{y}},x\right\rangle :=\left\langle {\bar{y}},Ax\right\rangle , \quad \forall x \in E_1, {\bar{y}} \in E_2^* \end{aligned}$$

and the equalities \(||A^*||=||A||\) and \({\mathcal {N}}(A^*) = {\mathcal {R}}(A)^\bot \) are valid, where \({\mathcal {R}}(A)^\bot :=\{x^*\in E_2^*:\left\langle x^*,u\right\rangle =0,~~\forall u \in {\mathcal {R}}(A)\}\). For more details on bounded linear operators and their duals, please see [22, 56, 57].

We shall adopt the following notations in this paper:

  • \(x_n\rightarrow x\) means that \(x_n\rightarrow x\) strongly;

  • \(x_n\rightharpoonup x\) means that \(x_n\rightarrow x\) weakly;

  • \(\omega _w(x_n):= \{x: \exists\, x_{n_j}\rightharpoonup x\}\) is the set of weak cluster points (the weak limit set) of the sequence \(\{x_n\}\).

The split feasibility problem (SFP) is the problem of finding a point \(x\in C\) such that

$$\begin{aligned} Ax\in Q . \end{aligned}$$
(1)

We denote the solution set by \(\Omega :=C \cap A^{-1}(Q) = \{y\in C : Ay \in Q\}\). This problem was first introduced by Censor and Elfving [15], in a finite-dimensional Hilbert space, for solving inverse problems arising in phase retrieval, in medical image reconstruction and in the modeling of intensity-modulated radiation therapy. Many authors have introduced iterative methods for finding a solution to (1), both in Hilbert spaces and in certain Banach spaces (see, for example, [25, 38, 41, 50, 52, 53, 58, 59]). Censor et al. in Section 2 of [11] (see also [25]) introduced the prototypical Split Inverse Problem (SIP). Here, one is given two vector spaces X and Y and a linear operator \(A: X\rightarrow Y\), together with two inverse problems: the first one, denoted by IP1, is formulated in the space X, and the second one, denoted by IP2, is formulated in the space Y. Given these data, the Split Inverse Problem is formulated as follows: find a point \(x^*\in X\) that solves IP1 and such that the point \(y^*= Ax^*\in Y\) solves IP2.

Let \(H_{1}\) and \(H_{2}\) be real Hilbert spaces and \(A:H_{1}\rightarrow H_{2}\) a bounded linear operator. Let C and Q be nonempty, closed and convex subsets of \(H_{1}\) and \(H_{2}\), respectively. To solve the SFP in Hilbert spaces, Byrne [10] introduced the following CQ algorithm: \(x_{1}\in C\) and

$$\begin{aligned} x_{n+1} = P_{C}(x_{n} - \lambda A^*(I-P_Q)Ax_n), \ n\ge 1, \end{aligned}$$
(2)

where \(\lambda >0\), and \(P_{C}\) and \(P_{Q}\) are the orthogonal projections onto C and Q, respectively. It was proved that the sequence \(\{x_{n}\}\) defined by (2) converges weakly to a solution of the SFP provided the step size satisfies \(\lambda \in (0,\frac{2}{\Vert A\Vert ^{2}})\). Observe that, in order to achieve convergence, one has to estimate the norm of the bounded linear operator \(\Vert A\Vert \) (or the spectral radius of the matrix \(A^TA\) in the finite-dimensional framework) in order to select the step size \(\lambda \), which is not always possible in practice (see also Theorem 2.3 of [28]). To avoid this computation, there have been worthwhile works in which convergence is guaranteed without any prior information on the operator norm (see, for example, [60, 62,63,64]). Among these works, López et al. [34] introduced a new way to select the step size by replacing the parameter \(\lambda \) appearing in (2) by the following:

$$\begin{aligned} \mu _n=\frac{\rho _nf(x_n)}{\Vert \nabla f(x_n)\Vert ^{2}}, \quad n\ge 1, \end{aligned}$$
(3)

where \(\rho _n\in (0,4)\), \(f(x_n)=\frac{1}{2}\Vert (I-P_Q)Ax_n\Vert ^{2}\) and \(\nabla f(x_n)=A^{*}(I-P_Q)Ax_n\) for all \(n\ge 1\). This modification of the CQ method is often called the self-adaptive method. Several further modifications of the CQ algorithm and of the self-adaptive method have since been proposed for solving the SFP (see, for example, [4, 34, 40, 41, 58, 60]).
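To make the discussion concrete, the CQ iteration (2) equipped with the self-adaptive step size (3) can be sketched in a few lines. The following is a minimal illustration in \(\mathbb {R}^n\), assuming for simplicity that C and Q are boxes (so the metric projections are coordinatewise clippings); the matrix, the box bounds and the iteration count below are illustrative choices, not prescribed by [34].

```python
import numpy as np

def proj_box(x, lo, hi):
    """Metric projection onto the box [lo, hi]^n, a closed convex set."""
    return np.clip(x, lo, hi)

def self_adaptive_cq(A, x0, C=(-1.0, 1.0), Q=(-2.0, 2.0), rho=2.0, iters=2000):
    """CQ iteration (2) with the self-adaptive step size (3):
    no estimate of ||A|| is needed to choose the step."""
    x = proj_box(x0, *C)
    for _ in range(iters):
        Ax = A @ x
        r = Ax - proj_box(Ax, *Q)             # (I - P_Q)Ax
        grad = A.T @ r                        # nabla f(x) = A^T (I - P_Q)Ax
        f = 0.5 * np.dot(r, r)                # f(x) = (1/2)||(I - P_Q)Ax||^2
        g2 = np.dot(grad, grad)
        mu = rho * f / g2 if g2 > 0 else 0.0  # step size (3), rho in (0, 4)
        x = proj_box(x - mu * grad, *C)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
x = self_adaptive_cq(A, rng.standard_normal(6))
residual = np.linalg.norm(A @ x - np.clip(A @ x, -2.0, 2.0))  # distance of Ax to Q
```

Since \(0\in C\) and \(A0=0\in Q\), the solution set of this toy problem is nonempty, and the residual \(\Vert (I-P_Q)Ax\Vert \) is driven to zero without ever forming \(\Vert A\Vert \).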

In optimization theory, to speed up the convergence rate, Polyak [47] first proposed the heavy ball method, a two-step iterative method for minimizing a smooth convex function f that derives from a second-order time dynamical system. To improve the convergence rate, Nesterov [44] introduced a modified heavy ball method as follows:

$$\begin{aligned} y_{n}= & {} x_{n}+\alpha _n(x_{n}-x_{n-1}),\nonumber \\ x_{n+1}= & {} y_{n}-\lambda _{n}\nabla f(y_{n}), \quad n\ge 1, \end{aligned}$$
(4)

where \(\alpha _{n}\in [0,1)\) is an extrapolation factor and \(\lambda _{n}\) is a positive sequence. Here, the inertia is represented by the term \(\alpha _n(x_{n}-x_{n-1})\). It is remarkable that the inertial methodology greatly improves the performance of the algorithm and enjoys nice convergence properties (see [16, 19, 35]). In [3], Alvarez and Attouch employed the idea of the heavy ball method in the setting of a general maximal monotone operator, using the framework of the proximal point algorithm [9, 49]. This method is called the inertial proximal point algorithm, and it is of the following form:

$$\begin{aligned} y_{n}= & {} x_{n}+\alpha _n(x_{n}-x_{n-1}),\nonumber \\ x_{n+1}= & {} (I+\lambda _{n}B)^{-1}(y_{n}), \quad n\ge 1, \end{aligned}$$
(5)

where B is a maximal monotone operator. It was proved that if \(\lambda _{n}\) is non-decreasing and \(\alpha _{n}\in [0, 1)\) is chosen such that

$$\begin{aligned} \sum _{n=1}^{\infty }\alpha _{n}\Vert x_{n}-x_{n-1}\Vert ^{2}<\infty , \end{aligned}$$

then \(\{x_{n}\}\) generated by (5) converges weakly to a zero of B (see also [42]).
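As a concrete illustration of (5), consider the maximal monotone operator \(B=\partial \Vert \cdot \Vert _1\) on \(\mathbb {R}^n\), whose resolvent \((I+\lambda B)^{-1}\) is the componentwise soft-thresholding map and whose only zero is the origin. The sketch below uses a fixed extrapolation factor \(\alpha \in [0,1)\) for simplicity; the summability condition \(\sum \alpha _{n}\Vert x_{n}-x_{n-1}\Vert ^{2}<\infty \) holds here because the iterates reach the zero of B after finitely many steps.

```python
import numpy as np

def soft_threshold(v, lam):
    """Resolvent (I + lam*B)^{-1} for B = subdifferential of the l1-norm."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def inertial_proximal_point(x0, lam=0.5, alpha=0.3, iters=100):
    """Inertial proximal point iteration (5): extrapolate, then apply the resolvent."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        y = x + alpha * (x - x_prev)            # inertial (heavy ball) term
        x_prev, x = x, soft_threshold(y, lam)   # x_{n+1} = (I + lam*B)^{-1} y_n
    return x

x = inertial_proximal_point(np.array([5.0, -3.0, 0.5]))
# x is (numerically) the unique zero of B, i.e., the origin
```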

In subsequent work, Maingé [36] (see also [37]) introduced the inertial Mann algorithm for solving the fixed point problem for a nonexpansive mapping T (i.e., \(\Vert Tx-Ty\Vert \le \Vert x-y\Vert ,~~\forall x,y \in H_1\)) in Hilbert spaces as follows: take \(x_{0},x_{1}\in H_{1}\) and generate the sequence \(\{x_{n}\}\) by

$$\begin{aligned} y_{n}= & {} x_{n}+\alpha _n(x_{n}-x_{n-1}),\nonumber \\ x_{n+1}= & {} y_{n}+\beta _{n}(Ty_{n}-y_{n}), \quad n\ge 1, \end{aligned}$$
(6)

where \(\alpha _{n}\in [0,1)\) and \(\beta _{n}\in (0,1)\). It was shown that the sequence \(\{x_{n}\}\) converges weakly to a fixed point of T under the following conditions:

  1. (A)

    \(\alpha _{n}\in [0,\alpha )\) where \(\alpha \in [0,1)\);

  2. (B)

    \(\sum _{n=1}^{\infty }\alpha _{n}\Vert x_{n}-x_{n-1}\Vert ^{2}<\infty \);

  3. (C)

    \(0<\inf _{n\ge 1}\beta _{n}\le \sup _{n\ge 1}\beta _{n}<1\).

To satisfy the summability condition (B), one needs to compute \(\alpha _n\) at each step from the current iterates. For further results on approximation methods with inertial terms for nonexpansive mappings, please see [6, 18].
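The online choice of \(\alpha _n\) can be sketched as follows. Below, T is the projection onto a box (a nonexpansive map whose fixed point set is the box itself), \(\beta _n\equiv 0.5\) satisfies (C), and \(\alpha _n\) is computed at each step so that \(\alpha _n\Vert x_n-x_{n-1}\Vert ^2\le 1/n^2\), which makes the series in (B) summable; all numerical choices are illustrative, not prescribed by [36].

```python
import numpy as np

def inertial_mann(T, x0, x1, beta=0.5, iters=200):
    """Inertial Mann iteration (6) with alpha_n chosen online to enforce (B)."""
    x_prev, x = x0.copy(), x1.copy()
    for n in range(1, iters + 1):
        d2 = np.dot(x - x_prev, x - x_prev)
        # largest alpha in [0, 0.9] with alpha * ||x_n - x_{n-1}||^2 <= 1/n^2
        alpha = min(0.9, 1.0 / (n * n * d2)) if d2 > 0 else 0.9
        y = x + alpha * (x - x_prev)           # inertial step
        x_prev, x = x, y + beta * (T(y) - y)   # Mann step
    return x

T = lambda z: np.clip(z, -1.0, 1.0)   # projection onto [-1,1]^2: nonexpansive
x = inertial_mann(T, np.array([4.0, -6.0]), np.array([3.5, -5.0]))
# x is (numerically) a fixed point of T, i.e., a point of the box
```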

Subsequently, Dong et al. [19] introduced an inertial CQ algorithm by combining the CQ algorithm of Nakajo and Takahashi [43] with the inertial extrapolation, and analyzed its convergence. They proved the following theorem.

Theorem 1.1

Let \(T:H\rightarrow H\) be a nonexpansive mapping such that \(F(T)\ne \emptyset \). Let

$$\begin{aligned} \{\alpha _n\}\subset [{\underline{\alpha }}, {\overline{\alpha }}], \quad {\underline{\alpha }}\in (-\infty , 0], \quad {\overline{\alpha }}\in [0, \infty ), \quad \{\beta _n\}\subset [\beta , 1],\quad \beta \in (0, 1]. \end{aligned}$$

Set \(x_0, x_1 \in H \) arbitrarily. Define a sequence \(\{x_n\}\) by the following algorithm:

$$\begin{aligned} {\left\{ \begin{array}{ll} w_n=x_n+\alpha _n(x_n-x_{n-1}),\\ y_n=(1-\beta _n)w_n+\beta _n Tw_n,\\ C_n=\{z\in H : \Vert y_n-z\Vert \le \Vert w_n-z\Vert \},\\ Q_n=\{z\in H : \left\langle x_n-z, x_n-x_0\right\rangle \le 0\},\\ x_{n+1}=P_{C_n\cap Q_n}x_0 \end{array}\right. } \end{aligned}$$

for each \(n\ge 0\). Then, the iterative sequence \(\{x_n\}\) converges in norm to \(P_{F(T )}(x_0)\).

Very recently, Dang et al. [16] proposed inertial relaxed CQ algorithms for solving the SFP in Hilbert spaces and proved weak convergence theorems for Picard-type and Mann-type iteration processes (see also [20]). Recently, there has been increasing interest in studying inertial extrapolation type algorithms; see, for example, the inertial forward–backward splitting method [5, 35, 36], the inertial Douglas–Rachford splitting method [6], inertial ADMM [7, 14], and the inertial forward–backward–forward method [8]. These results and other related ones analyzed the convergence properties of inertial extrapolation type algorithms and demonstrated their performance numerically on some imaging and data analysis problems. Our contribution to solving the SFP in this paper lies within this recent trend.

For \(p>1\), the duality mapping \(J^E_p: E\rightarrow 2^{E^*}\) is defined by

$$\begin{aligned} J^E_p(x)=\left\{ {\bar{x}} \in E^*:\left\langle x,{\bar{x}}\right\rangle =||x||^p,||{\bar{x}}||=||x||^{p-1}\right\} . \end{aligned}$$

When E is uniformly convex and uniformly smooth, it is known that the duality mapping \(J^E_p\) is one-to-one, single-valued and satisfies \(J^E_p = (J^E_q)^{-1}\), where \(J^E_q\) is the duality mapping of \(E^*\) (see [1, 13, 48]).

The duality mapping \(J^E_p\) is said to be weak-to-weak continuous if

$$\begin{aligned} x_n\rightharpoonup x\Rightarrow \left\langle J^E_px_n,y\right\rangle \rightarrow \left\langle J^E_px,y\right\rangle \end{aligned}$$

holds true for any \(y\in E\). It is worth noting that the \(\ell _p\) \((p > 1)\) spaces have this property, but the \(L_p\) \((p > 2)\) spaces do not (see [13, 33]).

Let C be a nonempty, closed and convex subset of a Banach space E. The metric projection

$$\begin{aligned} P_Cx=\mathrm{argmin}_{y \in C}||x-y||, \quad x \in E, \end{aligned}$$

is the unique minimizer of the norm distance, which can be characterized by a variational inequality:

$$\begin{aligned} \left\langle J^E_p(x-P_Cx),z-P_Cx\right\rangle \le 0, \quad \forall z \in C. \end{aligned}$$
(7)

More information on metric projections can be found, for example, in Section 3 of the book [27].
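In a Hilbert space (p = 2) the duality mapping reduces to the identity, so (7) reads \(\left\langle x-P_Cx,z-P_Cx\right\rangle \le 0\) for all \(z \in C\). The following sketch spot-checks this characterization for the Euclidean unit ball; the test points and the tolerance are illustrative choices.

```python
import numpy as np

def proj_ball(x, radius=1.0):
    """Metric projection onto the Euclidean ball of the given radius."""
    nx = np.linalg.norm(x)
    return x if nx <= radius else (radius / nx) * x

x = np.array([3.0, 4.0])
px = proj_ball(x)                             # the nearest point of the unit ball
rng = np.random.default_rng(1)
for _ in range(100):
    z = proj_ball(rng.standard_normal(2))     # an arbitrary point z of C
    # variational inequality (7) with p = 2
    assert np.dot(x - px, z - px) <= 1e-9
```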

Given a Gâteaux differentiable convex function \(f: E\rightarrow {\mathbb {R}}\), the Bregman distance with respect to f is defined as:

$$\begin{aligned} \Delta _f(x,y)=f(y)-f(x)-\left\langle f'(x),y-x\right\rangle , \quad x,y \in E. \end{aligned}$$

It is worth noting that the duality mapping \(J^E_p\) is in fact the derivative of the function \(f_p(x) = (\frac{1}{p})||x||^p\). Then, the Bregman distance with respect to \(f_p\) is given by (see, for example, [1, 12])

$$\begin{aligned} \Delta _p(x,y)= & {} \frac{1}{q}||x||^p-\left\langle J^E_px,y\right\rangle +\frac{1}{p}||y||^p\\= & {} \frac{1}{p}(||y||^p-||x||^p)+\left\langle J^E_px,x-y\right\rangle \\= & {} \frac{1}{q}(||x||^p-||y||^p)-\left\langle J^E_px-J^E_py,y\right\rangle . \end{aligned}$$

Given \(x,y,z\in E\), one can easily get

$$\begin{aligned} \Delta _p(x,y)= & {} \Delta _p(x,z)+\Delta _p(z,y)+\left\langle z-y,J^E_px-J^E_pz\right\rangle , \end{aligned}$$
(8)
$$\begin{aligned} \Delta _p(x,y)+\Delta _p(y,x)= & {} \left\langle x-y,J^E_px-J^E_py\right\rangle . \end{aligned}$$
(9)

If E is a Hilbert space H and \(f_2(x) = \frac{1}{2}||x||^2\), then \(\Delta _2(x,y)=\frac{1}{2}||x-y||^2\) for all \(x,y \in H\). Generally speaking, the Bregman distance is not a metric due to the absence of symmetry, but it has some distance-like properties: (i) \(\Delta _p(x,y) \ge 0\); (ii) \(\Delta _p(x,x)=0\); and the identities given in (8) and (9).
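The identities (8) and (9) are straightforward to verify numerically. The sketch below works in \(\mathbb {R}^n\) equipped with the \(\ell _p\)-norm, where the duality mapping \(J^E_p\), being the gradient of \(f_p(x)=\frac{1}{p}\Vert x\Vert _p^p\), acts coordinatewise as \(x_i\mapsto |x_i|^{p-1}\,\mathrm {sign}(x_i)\); the exponent and the test vectors are illustrative choices.

```python
import numpy as np

p = 3.0  # an illustrative exponent with 1 < q <= 2 <= p

def Jp(x):
    """Duality mapping of (R^n, ||.||_p): the gradient of (1/p)||x||_p^p."""
    return np.sign(x) * np.abs(x) ** (p - 1)

def delta_p(x, y):
    """Bregman distance with respect to f_p:
    Delta_p(x, y) = (||y||^p - ||x||^p)/p + <J_p x, x - y>."""
    return (np.sum(np.abs(y) ** p) - np.sum(np.abs(x) ** p)) / p + np.dot(Jp(x), x - y)

rng = np.random.default_rng(2)
x, y, z = rng.standard_normal((3, 4))
# three-point identity (8)
assert abs(delta_p(x, y)
           - (delta_p(x, z) + delta_p(z, y) + np.dot(z - y, Jp(x) - Jp(z)))) < 1e-9
# symmetrized identity (9)
assert abs(delta_p(x, y) + delta_p(y, x) - np.dot(x - y, Jp(x) - Jp(y))) < 1e-9
```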

Likewise, one can define the Bregman projection:

$$\begin{aligned} \Pi _Cx=\mathrm{argmin}_{y \in C} \Delta _p(x,y),\quad x \in E, \end{aligned}$$

as the unique minimizer of the Bregman distance (see [54]). It has been shown (see, e.g., [12]) that the Bregman projection (the terminology is due to Censor and Lent [12]) is a good replacement for the metric projection in optimization methods and in algorithms for solving convex feasibility problems. Bregman projections are continuous and have good stability properties. This is relevant because it opens a way to study more deeply the effect of computational errors on the behavior of many iterative algorithms involving Bregman projections. The Bregman projection can also be characterized by a variational inequality:

$$\begin{aligned} \left\langle J^E_p(x)-J^E_p(\Pi _Cx),z-\Pi _Cx\right\rangle \le 0, \quad \forall z \in C, \end{aligned}$$
(10)

from which one has

$$\begin{aligned} \Delta _p(\Pi _Cx,z)\le \Delta _p(x,z)-\Delta _p(x,\Pi _Cx),\quad \forall z \in C. \end{aligned}$$
(11)

In Hilbert spaces, the metric projection and the Bregman projection with respect to \(f_2\) coincide, but in general they are different.

The SFP has been studied in more general frameworks, for example, in Banach spaces. More specifically, Schöpfer et al. [53] proposed in Banach spaces the following algorithm: \(x_{1}\in E_1\) and

$$\begin{aligned} x_{n+1} = \Pi _{C}J^*_{E_{1}}[J_{E_{1}}(x_{n}) - \lambda _{n}A^*J_{E_{2}}(Ax_n-P_Q(Ax_n))], \quad n\ge 1, \end{aligned}$$
(12)

where \(\{\lambda _{n}\}\) is a positive sequence, \(\Pi _C\) denotes the generalized projection onto C, \(P_Q\) is the metric projection onto Q, \(J_{E_{1}}\) is the duality mapping on \(E_{1}\) and \(J^*_{E_{1}}\) is the duality mapping on \(E_{1}^*\). It was proved that the sequence \(\{x_{n}\}\) converges weakly to a solution of the SFP, under some mild conditions, in p-uniformly convex and uniformly smooth Banach spaces. To be more precise, it is assumed in [53] that the duality mapping of \(E_1\) is sequentially weak-to-weak continuous (which excludes some important Banach spaces, such as the classical \(L_p\) \((2< p < \infty )\) spaces). Please see some modifications in [50, 51].

Recently, since in some applied disciplines norm convergence is more desirable than weak convergence, Wang [59] modified algorithm (12) and proved its strong convergence for the following multiple-sets split feasibility problem (MSSFP): find \(x \in E_1\) satisfying

$$\begin{aligned} x \in \bigcap \limits _{i=1}^{r}C_i, Ax \in \bigcap \limits _{j=1+r}^{r+s}Q_j, \end{aligned}$$
(13)

where r, s are two given integers, each \(C_i\), \(i = 1, \ldots , r\), is a closed convex subset of \(E_1\), and each \(Q_j\), \(j = r + 1, \ldots , r + s\), is a closed convex subset of \(E_2\). He defined, for each \(n \in {\mathbb {N}}\),

$$\begin{aligned} T_n(x)= \left\{ \begin{array}{llll} &{} \Pi _{C_{i(n)}}(x),~~1\le i(n)\le r,\\ &{} J^{E_1}_q[J^{E_1}_p(x)-\lambda _nA^*J^{E_2}_p(Ax- P_{Q_{i(n)}}(Ax))],~~r{+}1\le i(n)\le r{+}s, \end{array} \right. \end{aligned}$$

where \(i:{\mathbb {N}}\rightarrow I\) is the cyclic control mapping

$$\begin{aligned} i(n)=n~~\mathrm{mod}~~(r+s)+1, \end{aligned}$$

and \(\lambda _n\) satisfies

$$\begin{aligned} 0<\lambda \le \lambda _n\le \left( \frac{q}{c_q||A||^q} \right) ^{\frac{1}{q-1}}, \end{aligned}$$
(14)

where \(c_q\) is a uniform smoothness constant. He then proposed the following algorithm: for any initial guess \(x_1\), define \(\{x_n\}\) recursively by

$$\begin{aligned} \left\{ \begin{array}{lllll} &{} y_n =T_nx_n \\ &{} D_n=\{w \in E_1:\Delta _p(y_n,w)\le \Delta _p(x_n,w)\}\\ &{} E_n =\{w \in E_1:\left\langle x_n-w,J_p(x_1)-J_p(x_n)\right\rangle \ge 0 \}\\ &{} x_{n+1} = \Pi _{D_n\cap E_n}(x_1). \end{array} \right. \end{aligned}$$
(15)

Using the idea in the work of Nakajo and Takahashi [43], he proved the following strong convergence theorem in p-uniformly convex Banach spaces which are also uniformly smooth.

Theorem 1.2

Let \(E_1\) be a p-uniformly convex real Banach space which is also uniformly smooth. Let \(C_i\) and \(Q_j\) be nonempty, closed and convex subsets of \(E_1\) and \(E_2\), respectively, \(A:E_1 \rightarrow E_2\) be a bounded linear operator and \(A^*:E_2^* \rightarrow E_1^*\) be the adjoint of A. Suppose that the MSSFP (13) has a nonempty solution set \(\Omega \). Let the sequence \(\{x_n\}\) be generated by (15). Then, \(\{x_n\}\) converges strongly to the Bregman projection of \(x_1\) onto the solution set \(\Omega \).

It is observed that the main advantage of the result of Wang [59] is that the weak-to-weak continuity of the duality mapping assumed in [53] is dispensed with, and a strong convergence result is achieved. However, the step size \(\lambda _{n}\) still depends on the computation of the norm of A, which is difficult in general.

Motivated by the previous works, it is our purpose to introduce a new self-adaptive projection method involving an inertial extrapolation term for solving the SFP in p-uniformly convex and uniformly smooth Banach spaces. We then prove a strong convergence theorem under conditions that do not require a priori knowledge of the norm of the bounded linear operator. Finally, we provide some numerical experiments, which involve an image restoration problem, to illustrate the convergence behavior and the effectiveness of our proposed algorithm. The result obtained in this paper seems to be the first one to investigate the SFP outside Hilbert spaces involving the inertial technique.

The rest of this paper is organized as follows: some basic concepts and lemmas are provided in Sect. 2. The strong convergence result of this paper is proved in Sect. 3. Finally, in Sect. 4, numerical experiments are reported to demonstrate the effectiveness and the implementation of our algorithm.

2 Preliminaries and lemmas

Let \(1< q\le 2\le p<\infty \). Let E be a real Banach space. The modulus of convexity \(\delta _{E}:[0, 2]\rightarrow [0, 1]\) is defined as:

$$\begin{aligned} \delta _{E}(\epsilon )=\inf \left\{ 1-\frac{||x+y||}{2}:||x||=1=||y||, ||x-y||\ge \epsilon \right\} . \end{aligned}$$

E is called uniformly convex if \(\delta _{E}(\epsilon ) > 0\) for any \(\epsilon \in (0,2]\), and p-uniformly convex if there is a \(c_p > 0\) so that \(\delta _{E}(\epsilon )\ge c_p\epsilon ^p\) for any \(\epsilon \in (0,2]\). The modulus of smoothness \(\rho _E:[0,\infty )\rightarrow [0,\infty )\) is defined by

$$\begin{aligned} \rho _E(\tau )=\sup \left\{ \frac{||x+\tau y||+||x-\tau y||}{2}-1:||x||=||y||=1 \right\} . \end{aligned}$$

E is called uniformly smooth if \(\lim \nolimits _{\tau \rightarrow 0^+} \frac{\rho _E(\tau )}{\tau } = 0\), and q-uniformly smooth if there is a \(c_q > 0\) so that \(\rho _E(\tau )\le c_q\tau ^q\) for any \(\tau > 0\). The \(L_p\) space is 2-uniformly convex for \(1 < p\le 2\) and p-uniformly convex for \(p\ge 2\). It is known that E is p-uniformly convex if and only if its dual \(E^*\) is q-uniformly smooth (see [33]).

Here and hereafter, we assume that \(E_1\) is a p-uniformly convex and uniformly smooth Banach space, which implies that its dual space \(E_1^*\) is q-uniformly smooth and uniformly convex, where \(1< q\le 2\le p<\infty \) with \(\frac{1}{p} + \frac{1}{q} = 1\). We further assume that \(J^{E_1}_{p}\) and \(J^{E_2}_{p}\) represent the duality mappings of \(E_1\) and \(E_2\), respectively, and \(J^{E_1}_{p} = (J^*_{E_1})^{-1}=(J^{E_1}_{q})^{-1}\), where \(J^{*}_{E_1}=J^{E_1}_{q}\) is the duality mapping of \(E_1^*\). For a p-uniformly convex space, the metric and the Bregman distance satisfy the following relation (see [53]):

$$\begin{aligned} \tau ||x-y||^p\le \Delta _p(x,y)\le \left\langle x-y,J^{E_1}_px-J^{E_1}_py\right\rangle , \end{aligned}$$
(16)

where \(\tau > 0\) is some fixed number.

The q-uniformly smooth spaces have the following estimate [61].

Lemma 2.1

(Xu [61]). Let E be a q-uniformly smooth Banach space and let \(x,y\in E\). Then there is a \(c_q > 0\) so that

$$\begin{aligned} ||x-y||^q \le ||x||^q -q\left\langle y,J^{E}_q(x)\right\rangle +c_q||y||^q. \end{aligned}$$

It is easy to see that if \(\{x_n\}\) and \(\{y_n\}\) are bounded sequences in a p-uniformly convex and uniformly smooth space \(E_1\), then \(x_n-y_n\rightarrow 0\) as \(n\rightarrow \infty \) implies that \(\Delta _p(x_n,y_n)\rightarrow 0\) as \(n\rightarrow \infty .\)

3 Main results

In this section, we introduce the inertial algorithm and prove strong convergence of the sequence generated by our scheme for solving the split feasibility problem in Banach spaces.

Algorithm 3.1

Inertial Algorithm for SFP: Let \(\{\alpha _n\}\subset {\mathbb {R}}\) be a bounded sequence. Set \(x_0,x_1\in C\). Define a sequence \(\{x_n\}\) in the following manner:

$$\begin{aligned} \left\{ \begin{array}{l@{\qquad }l@{\qquad }l@{\qquad }l@{\qquad }l} &{}w_n=J_q^{E_1}[J_p^{E_1}(x_n)+\alpha _n(J_p^{E_1}(x_n)-J_p^{E_1}(x_{n-1}))] \\ &{}y_n=\Pi _CJ_q^{E_1}\bigg [J_p^{E_1}(w_n)-\rho _n\frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\nabla f(w_n)\bigg ]\\ &{}C_n=\{u\in E_1 : \Delta _p(y_n,u)\le \Delta _p(w_n,u)\}\\ &{}Q_n=\{u\in E_1 : \left\langle x_n-u,J_p^{E_1}(x_0)-J_p^{E_1}(x_n)\right\rangle \ge 0\}\\ &{}x_{n+1} = \Pi _{C_n\cap Q_n}(x_0). \end{array} \right. \end{aligned}$$
(17)

for all \(n\ge 0\), where \(f(w_n):=\frac{1}{p}\Vert (I-P_Q)Aw_n\Vert ^p\), \(f^{p-1}(w_n):=\Big (\frac{1}{p}\Vert (I-P_Q)Aw_n\Vert ^p\Big )^{p-1}\) and \(\{\rho _n\}\subset (0,\infty )\) satisfies \(\liminf _{n\rightarrow \infty }\rho _n\Big (p-c_q\frac{\rho _n^{q-1}}{q}\Big )>0\).

Remark 3.2

In Algorithm 3.1, we use the convention \(\frac{0}{0}=0\). Therefore, if \(\nabla f(w_n)=0\), then \(y_n=\Pi _C(w_n)\).
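Before turning to the analysis, it may help to see Algorithm 3.1 in the familiar Hilbert-space case \(p=q=2\), where \(J_p^{E_1}\) is the identity, \(\Delta _p(x,y)=\frac{1}{2}\Vert x-y\Vert ^2\), and \(C_n\), \(Q_n\) become half-spaces. In the sketch below, C and Q are boxes, the projection onto \(C_n\cap Q_n\) is computed with Dykstra's alternating scheme (an implementation choice, not prescribed by the algorithm), and \(\alpha _n\), \(\rho _n\) are fixed illustrative constants with \(\rho \in (0,4)\).

```python
import numpy as np

def proj_halfspace(x, a, b):
    """Projection onto the half-space {u : <a, u> <= b} (whole space if a = 0)."""
    v = np.dot(a, x) - b
    return x if v <= 0 else x - (v / np.dot(a, a)) * a

def proj_intersection(x, h1, h2, sweeps=50):
    """Dykstra's algorithm: projection onto the intersection of two half-spaces."""
    p1, p2 = np.zeros_like(x), np.zeros_like(x)
    for _ in range(sweeps):
        y = proj_halfspace(x + p1, *h1); p1 = x + p1 - y
        x = proj_halfspace(y + p2, *h2); p2 = y + p2 - x
    return x

def algorithm_31(A, x0, x1, proj_C, proj_Q, rho=1.0, alpha=0.2, iters=60):
    """Hilbert-space (p = q = 2) sketch of Algorithm 3.1."""
    x_prev, x = x0.copy(), x1.copy()
    for _ in range(iters):
        w = x + alpha * (x - x_prev)               # inertial step (J_p = identity)
        r = A @ w - proj_Q(A @ w)                  # (I - P_Q)Aw
        g = A.T @ r                                # nabla f(w)
        f = 0.5 * np.dot(r, r)
        g2 = np.dot(g, g)
        step = rho * f / g2 if g2 > 0 else 0.0     # self-adaptive step, 0/0 := 0
        y = proj_C(w - step * g)
        # C_n = {u : ||y-u|| <= ||w-u||} = {u : <w-y, u> <= (||w||^2-||y||^2)/2}
        h_c = (w - y, 0.5 * (np.dot(w, w) - np.dot(y, y)))
        # Q_n = {u : <x_n-u, x0-x_n> >= 0} = {u : <x0-x_n, u> <= <x0-x_n, x_n>}
        h_q = (x0 - x, np.dot(x0 - x, x))
        x_prev, x = x, proj_intersection(x0.copy(), h_c, h_q)
    return x

box = lambda v: np.clip(v, -1.0, 1.0)
x = algorithm_31(np.eye(2), np.array([3.0, 0.0]), np.array([2.0, 0.0]), box, box)
# with A = I and C = Q = [-1,1]^2, the limit is the projection of x0 onto
# the solution set, here the point (1, 0)
```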

Lemma 3.3

The sequence \(\{x_n\}\) is well defined.

Proof

To show that \(\{x_n\}\) is well defined, we have to show that \(C_n\bigcap Q_n\) is nonempty, closed and convex for all \(n\ge 1\). Clearly, \(C_n\) is closed and \(Q_n\) is closed and convex. To show the convexity of \(C_n\), it suffices to observe that

$$\begin{aligned} \Delta _p(y_n,u)\le & {} \Delta _p(w_n,u)\Longleftrightarrow \left\langle J_p^{E_1}(w_n)-J_p^{E_1}(y_n),u\right\rangle \\\le & {} \frac{1}{q}(\Vert w_n\Vert ^p-\Vert y_n\Vert ^p), \quad \forall u \in E_1, \end{aligned}$$

so that \(C_n\) is a half-space and, therefore, convex. Hence, \(C_n \cap Q_n\) is closed and convex. We next show that \(C_n\bigcap Q_n\ne \emptyset .\) To do this, it suffices to show that, \(\forall n\ge 1\),

$$\begin{aligned} \Omega \subset C_n\bigcap Q_n. \end{aligned}$$
(18)

Observe that if (18) holds, then, since \(\Omega \ne \emptyset \) by assumption, \(C_n\bigcap Q_n\ne \emptyset \) as well. To verify (18), we first show \(\Omega \subset C_n.\) Let \(z\in \Omega \) and let \(u_n=J_p^{E_1}(w_n)-\rho _n\displaystyle \frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\nabla f(w_n)\) for all \(n\ge 1\). We see from Lemma 2.1 that

$$\begin{aligned} \Vert u_n\Vert _{E_1^*}^q= & {} \Vert J_p^{E_1}(w_n)-\rho _n\frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\nabla f(w_n)\Vert _{E_1^*}^q\nonumber \\\le & {} \Vert w_n\Vert ^p-q\rho _n\displaystyle \frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\left\langle w_n,\nabla f(w_n)\right\rangle +c_q\rho _n^q\frac{f^{(p-1)q}(w_n)}{\Vert \nabla f(w_n)\Vert ^{pq}}\Vert \nabla f(w_n)\Vert ^q\nonumber \\= & {} \Vert w_n\Vert ^p-q\rho _n \displaystyle \frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\left\langle w_n,\nabla f(w_n)\right\rangle +c_q\rho _n^q\displaystyle \frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}. \end{aligned}$$
(19)

Then, by (11) and (19) we get

$$\begin{aligned} \Delta _p(y_n,z)\le & {} \Delta _p(J_q^{E_1}(u_n),z)\nonumber \\= & {} \frac{\Vert z\Vert ^p}{p}+\frac{1}{q}\Vert J_q^{E_1}(u_n)\Vert ^p-\left\langle z,u_n\right\rangle \nonumber \\= & {} \frac{\Vert z\Vert ^p}{p}+\frac{1}{q}\Vert u_n\Vert ^{(q-1)p}-\left\langle z,u_n\right\rangle \nonumber \\= & {} \frac{\Vert z\Vert ^p}{p}+\frac{1}{q}\Vert u_n\Vert ^{(q-1)\frac{q}{(q-1)}}-\left\langle z,u_n\right\rangle \nonumber \\= & {} \frac{\Vert z\Vert ^p}{p}+\frac{1}{q}\Vert u_n\Vert ^q-\left\langle z,u_n\right\rangle \nonumber \\= & {} \frac{\Vert z\Vert ^p}{p}+\frac{1}{q}\Vert u_n\Vert ^q-\left\langle z,J_p^{E_1}(w_n)\right\rangle +\rho _n\frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\left\langle z,\nabla f(w_n)\right\rangle \nonumber \\\le & {} \frac{\Vert z\Vert ^p}{p}+\frac{1}{q}\bigg (\Vert w_n\Vert ^p-q\rho _n\frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\left\langle w_n,\nabla f(w_n)\right\rangle {+}c_q\rho _n^q\frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\bigg )\nonumber \\&-\left\langle z,J_p^{E_1}(w_n)\right\rangle +\rho _n\frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\left\langle z,\nabla f(w_n)\right\rangle \nonumber \\= & {} \frac{\Vert z\Vert ^p}{p}+\frac{\Vert w_n\Vert ^p}{q}-\left\langle z,J_p^{E_1}(w_n)\right\rangle +\frac{c_q\rho _n^q}{q}\frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p} \nonumber \\&+\rho _n\frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\left\langle z-w_n,\nabla f(w_n)\right\rangle \nonumber \\= & {} \Delta _p(w_n,z)+\frac{c_q\rho _n^q}{q}\frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}+\rho _n\frac{f^{p-1}(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\left\langle z-w_n,\nabla f(w_n)\right\rangle .\nonumber \\ \end{aligned}$$
(20)

On the other hand, observe that (using (7) and the fact that \(Az \in Q\))

$$\begin{aligned} \left\langle \nabla f(w_n),z-w_n\right\rangle= & {} \left\langle A^*J_p^{E_2}(I-P_Q)Aw_n,z-w_n\right\rangle \nonumber \\= & {} \left\langle J_p^{E_2}(I-P_Q)Aw_n,Az-Aw_n\right\rangle \nonumber \\= & {} \left\langle J_p^{E_2}(I-P_Q)Aw_n,P_QAw_n-Aw_n\right\rangle \nonumber \\&+\left\langle J_p^{E_2}(I-P_Q)Aw_n,Az-P_QAw_n\right\rangle \nonumber \\\le & {} -\Vert (I-P_Q)Aw_n\Vert ^p=-pf(w_n). \end{aligned}$$
(21)

By using (20) and (21), we get

$$\begin{aligned} \Delta _p(y_n,z)\le \Delta _p(w_n,z)+\bigg (\frac{c_q\rho _n^q}{q}-\rho _np\bigg )\frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}, \end{aligned}$$
(22)

which implies by our assumption that

$$\begin{aligned} \Delta _p(y_n,z)\le \Delta _p(w_n,z). \end{aligned}$$

This implies that \(\Omega \subset C_n.\)

Finally, we show that \(\Omega \subset Q_n.\) For \(n=0\), we have \(Q_0=E_1\) and hence \(\Omega \subseteq Q_0.\) Suppose that \(x_k\) is given and \(\Omega \subseteq C_k\bigcap Q_k\) for some \(k\in {\mathbb {N}}.\) Then, there exists a unique element \(x_{k+1}\) such that

$$\begin{aligned} x_{k+1}=\Pi _{C_k\bigcap Q_k}x_0. \end{aligned}$$

Using (10) and the fact that \(\Omega \subseteq C_k\bigcap Q_k\), we have, for all \(z\in \Omega \),

$$\begin{aligned} \left\langle x_{k+1}-z,J_p^{E_1}(x_0)-J_p^{E_1}(x_{k+1})\right\rangle \ge 0. \end{aligned}$$

This shows that \(z\in Q_{k+1}\) for every \(z\in \Omega \), that is, \(\Omega \subset Q_{k+1}.\) By induction, \(\Omega \subset Q_n\) for all \(n\in {\mathbb {N}}\) and, thus, the proof is complete. \(\square \)

Lemma 3.4

Let \(\{x_n\}\) be generated by Algorithm 3.1. Then

  1. (i)

    \(\displaystyle \lim _{n\rightarrow \infty }\Vert x_{n+1}-x_n\Vert =0\);

  2. (ii)

    \(\displaystyle \lim _{n\rightarrow \infty }\Vert w_n-x_n\Vert =0\);

  3. (iii)

    \(\displaystyle \lim _{n\rightarrow \infty }\Vert y_n-x_{n+1}\Vert =0\);

  4. (iv)

    \(\displaystyle \lim _{n\rightarrow \infty }\Vert y_n-w_n\Vert =0\);

  5. (v)

    \(\displaystyle \lim _{n\rightarrow \infty }\Vert A^*J_p^{E_2}(I-P_Q)Aw_n\Vert =0\).

Proof

Note that the definition of \(Q_n\) actually implies that \(x_n=\Pi _{Q_n}(x_0).\) This together with the fact that \(\Omega \subset Q_n\) further implies that

$$\begin{aligned} \Delta _p(x_0,x_n)\le \Delta _p(x_0,z), \quad \forall z\in \Omega . \end{aligned}$$

Due to \(s:=\Pi _{\Omega }(x_0)\in \Omega \), we obtain

$$\begin{aligned} \Delta _p(x_0,x_n)\le \Delta _p(x_0,s), \end{aligned}$$
(23)

which implies that \(\{\Delta _p(x_0,x_n)\}\) is bounded. Hence \(\{x_n\}\) is bounded by (16). On the other hand, from \(x_{n+1}\in Q_n\) and (10), we have \(\left\langle x_n-x_{n+1}, J_p^{E_1}(x_n)-J_p^{E_1}(x_0)\right\rangle \le 0\) and hence by (11)

$$\begin{aligned} \Delta _p(x_n,x_{n+1})\le \Delta _p(x_0,x_{n+1})-\Delta _p(x_0,x_n),\quad \forall n\ge 0. \end{aligned}$$
(24)

This implies that

$$\begin{aligned} \Delta _p(x_0,x_n)\le & {} \Delta _p(x_0,x_{n+1})-\Delta _p(x_n,x_{n+1})\\\le & {} \Delta _p(x_0,x_{n+1}). \end{aligned}$$

Thus, \(\{\Delta _p(x_0,x_n)\}\) is monotone nondecreasing and by (23) \(\displaystyle \lim _{n\rightarrow \infty }\Delta _p(x_0,x_n)\) exists. It then follows from (24) that

$$\begin{aligned} \displaystyle \lim _{n\rightarrow \infty }\Delta _p(x_n,x_{n+1})=0. \end{aligned}$$

Hence, we obtain from (16) that

$$\begin{aligned} \displaystyle \lim _{n\rightarrow \infty }\Vert x_{n+1}-x_n\Vert =0. \end{aligned}$$

This establishes (i).

Since \(J_p^{E_1}\) is norm-to-norm uniformly continuous on bounded subsets of \(E_1\), we get

$$\begin{aligned} \displaystyle \lim _{n\rightarrow \infty }\Vert J_p^{E_1}(x_{n+1})-J_p^{E_1}(x_n)\Vert =0. \end{aligned}$$

Observe from Algorithm 3.1 that

$$\begin{aligned} J_p^{E_1}(w_n)-J_p^{E_1}(x_n)=\alpha _n(J_p^{E_1}(x_n)-J_p^{E_1}(x_{n-1})). \end{aligned}$$

So,

$$\begin{aligned} \left\| J_p^{E_1}(w_n)-J_p^{E_1}(x_n)\right\|= & {} |\alpha _n|\left\| J_p^{E_1}(x_n)-J_p^{E_1}(x_{n-1})\right\| \\\le & {} {\bar{\alpha }}\left\| J_p^{E_1}(x_n)-J_p^{E_1}(x_{n-1})\right\| \rightarrow 0,\quad n\rightarrow \infty , \end{aligned}$$

where \({\bar{\alpha }}:=\sup _{n\ge 1}|\alpha _n|<\infty \).

Since \(J_q^{E_1}\) is also norm-to-norm uniformly continuous on bounded subsets of \(E_1^*\), we have

$$\begin{aligned} \Vert w_n-x_n\Vert \rightarrow 0. \end{aligned}$$

This establishes (ii).

Furthermore,

$$\begin{aligned} \left\| x_{n+1}-w_{n}\right\| \le \left\| x_{n+1}-x_n\right\| +\left\| x_n-w_n\right\| \rightarrow 0, \quad n\rightarrow \infty . \end{aligned}$$

It follows that

$$\begin{aligned} \left\| J^{E_1}_p(w_n)-J^{E_1}_p(x_{n+1})\right\| \rightarrow 0. \end{aligned}$$

By (16) and the boundedness of \(\{w_n\}\) and \(\{x_{n+1}\}\), there exists \(M>0\) such that

$$\begin{aligned} \Delta _p(w_n,x_{n+1})\le & {} \left\langle w_{n}-x_{n+1},J^{E_1}_p(w_n)-J^{E_1}_p(x_{n+1})\right\rangle \nonumber \\\le & {} M\left\| J^{E_1}_p(w_n)-J^{E_1}_p(x_{n+1})\right\| \rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$

Since \(x_{n+1}\in C_n\), we have that

$$\begin{aligned} \Delta _p(y_n,x_{n+1})\le \Delta _p(w_n,x_{n+1})\rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$

This implies that

$$\begin{aligned} \left\| y_n-x_{n+1}\right\| \rightarrow 0, \quad n\rightarrow \infty , \end{aligned}$$

and this establishes (iii).

Observe that

$$\begin{aligned} \left\| y_n-x_n\right\| \le \left\| y_n-x_{n+1}\right\| +\left\| x_{n+1}-x_n\right\| \rightarrow 0, \quad n\rightarrow \infty \end{aligned}$$

and

$$\begin{aligned} \left\| y_n-w_n\right\| \le \left\| y_n-x_n\right\| +\left\| x_n-w_n\right\| \rightarrow 0, \quad n\rightarrow \infty . \end{aligned}$$
(25)

This establishes (iv).

From (22), with \(z\in \Omega \), we get

$$\begin{aligned} \Delta _p(y_n,z)\le & {} \Delta _p(w_n,z)+\rho _n\bigg (\frac{c_q\rho _n^{q-1}}{q}-p\bigg )\frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\nonumber \\= & {} \Delta _p(w_n,z)-\rho _n\bigg (p-\frac{c_q\rho _n^{q-1}}{q}\bigg )\frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}. \end{aligned}$$

This implies that

$$\begin{aligned} \rho _n\bigg (p-\frac{c_q\rho _n^{q-1}}{q}\bigg )\frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\le \Delta _p(w_n,z)-\Delta _p(y_n,z). \end{aligned}$$
(26)

By (8), we get

$$\begin{aligned} \Delta _p(w_n,z)= \Delta _p(w_n,y_n)+\Delta _p(y_n,z)+\left\langle y_n-z, J^{E_1}_p(w_n)-J^{E_1}_p(y_n)\right\rangle , \end{aligned}$$

which implies that, for some constant \(M^*>0\) (the sequences involved being bounded),

$$\begin{aligned} \Delta _p(w_n,z)-\Delta _p(y_n,z)= & {} \Delta _p(w_n,y_n)+\left\langle y_n-z, J^{E_1}_p(w_n)-J^{E_1}_p(y_n)\right\rangle \nonumber \\\le & {} \left\langle w_n-y_n,J^{E_1}_p(w_n)-J^{E_1}_p(y_n)\right\rangle \nonumber \\&+\left\langle y_n-z, J^{E_1}_p(w_n)-J^{E_1}_p(y_n)\right\rangle \nonumber \\\le & {} M^*\Vert J^{E_1}_p(w_n)-J^{E_1}_p(y_n)\Vert \rightarrow 0, \quad n\rightarrow \infty .\nonumber \\ \end{aligned}$$
(27)

Combining (26) and (27), we get

$$\begin{aligned} \rho _n\bigg (p-\frac{c_q\rho _n^{q-1}}{q}\bigg )\frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\rightarrow 0, \quad n\rightarrow \infty . \end{aligned}$$
(28)

Since \(\liminf _{n\rightarrow \infty }\rho _n\Big (p-c_q\frac{\rho _n^{q-1}}{q}\Big )>0\), we get

$$\begin{aligned} \frac{f^p(w_n)}{\Vert \nabla f(w_n)\Vert ^p}\rightarrow 0, \quad n\rightarrow \infty \end{aligned}$$

and hence

$$\begin{aligned} \frac{f(w_n)}{\Vert \nabla f(w_n)\Vert }\rightarrow 0, \quad n\rightarrow \infty . \end{aligned}$$
(29)

Since \(J_p^{E_1}\) is norm-to-norm uniformly continuous on bounded subsets of \(E_1\), we have from (25) that

$$\begin{aligned} \Vert J_p^{E_1}(y_n)-J_p^{E_1}(w_n)\Vert \rightarrow 0, \quad n\rightarrow \infty . \end{aligned}$$
(30)

Furthermore, since \(\{\nabla f(w_n)\}\) is bounded, we obtain from (29) that

$$\begin{aligned} 0\le f(w_n)= & {} \Vert \nabla f(w_n)\Vert \frac{f(w_n)}{\Vert \nabla f(w_n)\Vert }\nonumber \\\le & {} M_1 \frac{f(w_n)}{\Vert \nabla f(w_n)\Vert }\rightarrow 0, \quad n\rightarrow \infty , \end{aligned}$$

for some \(M_1 >0\). Therefore

$$\begin{aligned} \displaystyle \lim _{n\rightarrow \infty } f(w_n)=0. \end{aligned}$$

Hence

$$\begin{aligned} \displaystyle \lim _{n\rightarrow \infty } \Vert (I-P_Q)Aw_n\Vert =0. \end{aligned}$$

Consequently, since \(\Vert J^{E_2}_p(z)\Vert =\Vert z\Vert ^{p-1}\),

$$\begin{aligned} \Vert A^*J^{E_2}_p(I-P_Q)Aw_n\Vert \le \Vert A\Vert \Vert (I-P_Q)Aw_n\Vert ^{p-1}\rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$

This establishes (v). \(\square \)

Lemma 3.5

Let \(\{x_n\}\) be generated by Algorithm 3.1. Then, the sequence \(\{x_n\}\) has a weak cluster point and \(\omega _w(x_n)\subseteq \Omega .\)

Proof

By the previous Lemma 3.4, \(\{x_n\}\) is bounded. Since \(E_1\) is \(p\)-uniformly convex, it is reflexive, and thus the weak cluster point set of \(\{x_n\}\) is nonempty. So it suffices to show that \(\omega _w(x_n)\subseteq \Omega .\) To this end, choose a subsequence \(\{x_{n_j}\}\) of \(\{x_n\}\) such that \(x_{n_j}\rightharpoonup z\in \omega _w(x_n).\) Since \(\Vert y_n-x_n\Vert \rightarrow 0\) as \(n\rightarrow \infty \), we get \(y_{n_j}\rightharpoonup z\); as \(y_{n_j}\in C\) and C is weakly closed, we have \(z \in C\). Now since \(x_{n_j} \rightharpoonup z\) and \(\displaystyle \lim _{n\rightarrow \infty }\Vert x_n-w_n\Vert =0\), we obtain that \(w_{n_j}\rightharpoonup z\). Fix some \(x\in \Omega .\) Then, \(Ax\in Q\) and by (7), we get

$$\begin{aligned} \Vert (I-P_Q)Aw_{n_j}\Vert ^p= & {} \left\langle J_p^{E_2}(I-P_Q)Aw_{n_j},(I-P_Q)Aw_{n_j}\right\rangle \nonumber \\= & {} \left\langle J_p^{E_2}(I-P_Q)Aw_{n_j},Aw_{n_j}-Ax\right\rangle \nonumber \\&+ \left\langle J_p^{E_2}(I-P_Q)Aw_{n_j},Ax-P_QAw_{n_j}\right\rangle \nonumber \\\le & {} \left\langle J_p^{E_2}(I-P_Q)Aw_{n_j},Aw_{n_j}-Ax\right\rangle \nonumber \\\le & {} M_1\Vert A^{*}J_p^{E_2}(I-P_Q)Aw_{n_j}\Vert \rightarrow 0,\quad j\rightarrow \infty ,\nonumber \\ \end{aligned}$$
(31)

where \(M_1>0\) is a sufficiently large number. It then follows that

$$\begin{aligned} \Vert (I-P_Q)Az\Vert ^p= & {} \left\langle J_p^{E_2}(Az-P_QAz),Az-P_QAz\right\rangle \nonumber \\= & {} \left\langle J_p^{E_2}(Az-P_QAz),Az-Aw_{n_j}\right\rangle \nonumber \\&+\left\langle J_p^{E_2}(Az-P_QAz),Aw_{n_j}-P_QAw_{n_j}\right\rangle \nonumber \\&+\left\langle J_p^{E_2}(Az-P_QAz),P_QAw_{n_j}-P_QAz\right\rangle \nonumber \\\le & {} \left\langle J_p^{E_2}(Az-P_QAz),Az-Aw_{n_j}\right\rangle \nonumber \\&+\left\langle J_p^{E_2}(Az-P_QAz),Aw_{n_j}-P_QAw_{n_j}\right\rangle . \end{aligned}$$
(32)

Note that since \(w_{n_j}\rightharpoonup z\), then \(Aw_{n_j}\rightharpoonup Az.\) Letting \(j\rightarrow \infty \) in (32), we obtain

$$\begin{aligned} \Vert Az-P_QAz\Vert =0. \end{aligned}$$

Thus, \(Az \in Q\) and hence \(z\in \Omega .\) \(\square \)

Theorem 3.6

The sequence \(\{x_n\}\) generated by Algorithm 3.1 converges strongly to the Bregman projection of \(x_0\) onto the solution set \(\Omega .\)

Proof

There is a subsequence \(\{x_{n_j}\}\) of \(\{x_n\}\) such that \(x_{n_j}\rightharpoonup z.\) Hence, \(z\in \Omega \) by Lemma 3.5. Since \(x_{n+1}\in Q_n\) and \(\Pi _{Q_n}(x_0)=\mathrm{argmin}_{y \in Q_n}\Delta _p(x_0,y)\), it follows that

$$\begin{aligned} \Delta _p(x_0,x_n)= & {} \Delta _p(x_0,\Pi _{Q_n}(x_0))\\\le & {} \Delta _p(x_0,x_{n+1}). \end{aligned}$$

By Lemma 3.2, \(\Pi _{\Omega }(x_0)\in \Omega \subseteq Q_{n+1}\) and hence

$$\begin{aligned} \Delta _p(x_0,x_{n+1})= & {} \Delta _p(x_0,\Pi _{Q_{n+1}}(x_0))\\\le & {} \Delta _p(x_0,\Pi _{\Omega }(x_0)). \end{aligned}$$

Thus,

$$\begin{aligned} \Delta _p(x_0,x_n)\le \Delta _p(x_0,x_{n+1})\le \Delta _p(x_0,\Pi _{\Omega }(x_0)). \end{aligned}$$

From (8) and (9), we see that

$$\begin{aligned} \Delta _p(\Pi _\Omega x_0,x_{n_j})= & {} \Delta _p(\Pi _\Omega x_0,x_0)+\Delta _p(x_0,x_{n_j})\nonumber \\&+ \left\langle x_0-x_{n_j},J_p^{E_1}(\Pi _\Omega x_0)-J_p^{E_1}(x_0)\right\rangle \nonumber \\\le & {} \Delta _p(\Pi _{\Omega }x_0,x_0)+\Delta _p(x_0,\Pi _\Omega x_0)\nonumber \\&+\left\langle x_0-\Pi _{\Omega }x_0,J_p^{E_1}(\Pi _{\Omega }x_0)-J_p^{E_1}(x_0)\right\rangle \nonumber \\&+\left\langle \Pi _{\Omega }x_0-x_{n_j},J_p^{E_1}(\Pi _{\Omega }x_0)-J_p^{E_1}(x_0)\right\rangle \nonumber \\= & {} \left\langle x_{n_j}-\Pi _{\Omega }(x_0),J_p^{E_1}(x_0)-J_p^{E_1}(\Pi _{\Omega }x_0)\right\rangle . \end{aligned}$$
(33)

Taking limsup yields

$$\begin{aligned} \displaystyle \limsup _{j\rightarrow \infty }\Delta _p(\Pi _\Omega x_0,x_{n_j})\le & {} \displaystyle \limsup _{j\rightarrow \infty }\left\langle x_{n_j}-\Pi _\Omega x_0,J_p^{E_1}(x_0)-J_p^{E_1}(\Pi _\Omega x_0)\right\rangle \\= & {} \left\langle z-\Pi _\Omega x_0,J_p^{E_1}(x_0)-J_p^{E_1}(\Pi _\Omega x_0)\right\rangle \\\le & {} 0. \end{aligned}$$

Hence, \(\lim _{j\rightarrow \infty }\Delta _p(\Pi _\Omega x_0,x_{n_j})=0,\) and so \(x_{n_j}\rightarrow \Pi _\Omega x_0\). By the arbitrariness of \(\{x_{n_j}\}\) and the uniqueness of \(\Pi _\Omega x_0,\) we have that \(x_n\rightharpoonup \Pi _\Omega x_0.\) Using (16), it follows from (33) that

$$\begin{aligned} \tau \Vert x_n-\Pi _\Omega x_0\Vert ^p\le & {} \Delta _p(x_n,\Pi _\Omega x_0)\\\le & {} \left\langle x_n-\Pi _\Omega x_0,J_p^{E_1}(x_0)-J_p^{E_1}(\Pi _\Omega x_0)\right\rangle . \end{aligned}$$

Taking limit as \(n\rightarrow \infty \) above, we get that \(x_n\rightarrow \Pi _\Omega x_0\).

\(\square \)

Remark 3.7

(i) Compared with the conditions on \(\{\alpha _{n}\}\) in the inertial Mann algorithm of [6], our assumptions on \(\{\alpha _{n}\}\) in the inertial CQ algorithm are considerably relaxed. More precisely, we do not require the inertial sequence \(\{\alpha _{n}\}\) to be nonnegative, nondecreasing, or bounded above by 1/3. Moreover, our main result appears to be new for solving the SFP with an inertial technique in a Banach space. To the best of our knowledge, the convergence conditions of the inertial CQ algorithm introduced in this paper are the weakest among inertial algorithms proposed in certain Banach spaces.

(ii) Our proposed Algorithm 3.1 has connections with some recent methods in the literature. For example, the inertial extrapolation factor \(\alpha _n\) in our method has properties similar to those in the recent paper of Dong et al. [19], where

$$\begin{aligned} \{\alpha _n\} \subset [{\underline{\alpha }},{\overline{\alpha }}], {\underline{\alpha }} \in (-\infty ,0], {\overline{\alpha }} \in [0,\infty ) \end{aligned}$$

and in the method proposed in [17], where \(\{\alpha _n\}\) is assumed to be bounded. In these papers, several choices of \(\{\alpha _n\}\) are considered in numerical implementations, and the authors showed that their proposed methods are efficient and implementable. It is in the light of these results that our proposed Algorithm 3.1 is introduced and studied for the SFP (1).

4 Application

Next, we give an example and an application which serve as motivation for studying the split feasibility problem (1) outside Hilbert spaces. The example concerns computing \(L^p\)-solutions of Fredholm integral equations of the first kind with \(p \ge 2\). Here, let \(E_1=E_2=L^p([\alpha ,\beta ])\), \(p\ge 2\). Observe that the duality mapping of \(L^p([\alpha ,\beta ])\) is the function \(J_p^{E_1}:L^p([\alpha ,\beta ])\rightarrow L^q([\alpha ,\beta ])\) defined by (see [2])

$$\begin{aligned} J_p^{E_1}(x):=|x|^{p-2}\cdot x \end{aligned}$$

and the Bregman distance function \(\Delta _p(\cdot ,\cdot )\) is given by

$$\begin{aligned} \Delta _p(y,x):=\Vert y\Vert ^p+(p-1)\Vert x\Vert ^p-p\left\langle |x|^{p-2}\cdot x,y\right\rangle , \end{aligned}$$

where \(|x|\) denotes the pointwise absolute value of x and \(\Vert x\Vert \) is the \(L^p\) norm of x.

Example 4.1

Let us consider computing \(L^p\)-solutions of Fredholm integral equations of the first kind, as considered in [2],

$$\begin{aligned} \int _\alpha ^\beta K(s,t)x(t)\mathrm{d}t=g(s),~~s \in [\alpha ,\beta ], \end{aligned}$$
(34)

involving a continuous kernel \(K:[\alpha ,\beta ]^2\rightarrow {\mathbb {R}}\) and a continuous free term \(g:[\alpha ,\beta ]\rightarrow {\mathbb {R}}\). Fredholm integral equations of the first kind appear in many physical and engineering applications, and a vast, still expanding body of analytical and numerical techniques has been developed for solving (34); see, for example, [65].

It can be shown that (34) is an instance of the SFP (1) with \(A:=I\), the identity, namely the problem: find \(x \in C:=\bigcap \nolimits _{i=1}^{N}C_i\), where

$$\begin{aligned} C_i=\{x \in L^p:\left\langle a_i,x\right\rangle =b_i\}, \quad i=1,2,\ldots ,N \end{aligned}$$
(35)

with \(a_i(t)=K(s_i,t)\in L^q([\alpha ,\beta ])\) \((q=\frac{p}{p-1})\) and \(b_i=g(s_i)\in {\mathbb {R}}\), where \(\alpha =s_1<s_2<\cdots <s_N=\beta \) (see [2, 29, 65]). Under suitable hypotheses, (34) has solutions (cf. [39]), and approximating an \(L^p\)-solution of (34) amounts to solving a consistent SFP (1).
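The collocation construction (35) can be sketched numerically. In the sketch below, the kernel \(K(s,t)=e^{st}\), the grid sizes and the "true" solution are illustrative assumptions, not data from the paper; the free term g is manufactured as \(g(s_i)=\int K(s_i,t)x^*(t)\,dt\) so that the resulting SFP is consistent.

```python
import numpy as np

# Illustrative collocation discretization (35) of a first-kind equation.
alpha, beta, N, m = 0.0, 1.0, 10, 400
s = np.linspace(alpha, beta, N)        # collocation points s_1 < ... < s_N
t = np.linspace(alpha, beta, m)
h = (beta - alpha) / (m - 1)           # quadrature weight

K = lambda si, tj: np.exp(si * tj)     # assumed kernel, for illustration only
x_true = np.sin(np.pi * t)             # hypothetical solution of (34)

# C_i = {x : <a_i, x> = b_i} with a_i(t) = K(s_i, t) and b_i = g(s_i),
# where g(s_i) := int K(s_i, t) x_true(t) dt.
a = np.array([K(si, t) for si in s])   # shape (N, m): row i is a_i
b = h * a @ x_true                     # b_i = g(s_i)

# By construction, x_true lies in every hyperplane C_i:
residual = np.max(np.abs(h * a @ x_true - b))
print(residual)
```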

In this setting, our Algorithm 3.1 reduces to

Algorithm 4.2

Let \(\{\alpha _n\}\subset {\mathbb {R}}\) be a bounded sequence and let \(x_0,x_1\in C\). Define the sequence \(\{x_n\}\) in the following manner:

$$\begin{aligned} \left\{ \begin{array}{lllll} &{} w_n=J_q^{E_1}[J_p^{E_1}(x_n)+\alpha _n(J_p^{E_1}(x_n)-J_p^{E_1}(x_{n-1}))] \\ &{} y_n=\Pi _{C_{[n]}}(w_n)\\ &{} C_n=\{u\in E_1 : \Delta _p(y_n,u)\le \Delta _p(w_n,u)\}\\ &{} Q_n=\{u\in E_1: \left\langle x_n-u,J_p^{E_1}(x_0)-J_p^{E_1}(x_n)\right\rangle \ge 0\}\\ &{} x_{n+1} = \Pi _{C_n\cap Q_n}(x_0). \end{array} \right. \end{aligned}$$
(36)

for all \(n\ge 0\) where \([n] := n (\mathrm{mod}~~ N)\) with the mod function taking values in \(\{1,2,\ldots , N\}\) and \(\{\rho _n\}\subset (0,\infty )\) satisfies \(\displaystyle \liminf _{n\rightarrow \infty }\rho _n\Big (p-c_q\frac{\rho _n^{q-1}}{q}\Big )>0\).

By Theorem 3.6, if (34) has solutions, then the sequence \(\{x_n\}\) generated by (36) converges strongly to a solution of (34).
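To illustrate the shape of iteration (36), here is a deliberately simplified Hilbert-space sketch (\(p=2\), so \(J_p\) is the identity and the Bregman projection is the metric projection) of the inertial step and the cyclic projection \(y_n=\Pi _{C_{[n]}}(w_n)\). For brevity it replaces the outer step \(x_{n+1}=\Pi _{C_n\cap Q_n}(x_0)\) by \(x_{n+1}=y_n\), which yields an inertial Kaczmarz-type scheme rather than Algorithm 4.2 itself; all data are random and illustrative.

```python
import numpy as np

# Simplified inertial cyclic-projection sketch (NOT Algorithm 4.2 itself):
# p = 2, the outer CQ projection is dropped, data are random.
rng = np.random.default_rng(0)
N, m = 5, 20
A = rng.standard_normal((N, m))        # row i plays the role of a_i
x_star = rng.standard_normal(m)
b = A @ x_star                          # consistent right-hand side b_i

def proj_hyperplane(a, bi, x):
    # P_{C_i}(x) = x + (b_i - <a, x>) a / ||a||^2
    return x + (bi - a @ x) / (a @ a) * a

alpha = 0.1                             # a bounded inertial parameter
x_prev = np.zeros(m)
x = np.zeros(m)
for n in range(2000):
    w = x + alpha * (x - x_prev)        # inertial extrapolation w_n
    y = proj_hyperplane(A[n % N], b[n % N], w)   # y_n, with [n] = n mod N
    x_prev, x = x, y

print(np.max(np.abs(A @ x - b)))        # residual over all hyperplanes
```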

5 Examples and numerical results

In this section, we provide computational experiments comparing our newly proposed method from Sect. 3 with the existing iterative method of Wang [59] for solving some problems in p-uniformly convex Banach spaces which are also uniformly smooth. All codes were written in Matlab 2017a, and all computations were performed on a Linux desktop with an Intel(R) Core(TM) i5-4670S CPU at 3.10 GHz.

Example 1: image restoration problem

The image restoration problem consists in reconstructing an original image that has been digitized and degraded by blur and additive noise. Here the interest is in finding the minimum p-norm solution of \(Ax = y\), with exact data \(y\in R(A)\), where \(A: E_1\rightarrow E_2\) is a continuous linear operator between two Banach spaces \(E_1\) and \(E_2\) and \(x\in E_1\). Since the solution is almost sparse, the minimum p-norm solution is typically sought with \(p \in (1,2]\) to promote sparsity in the restored solution. For more information about the role of sparsity in inverse problems, we refer to [21, 23, 31, 55]. Using Algorithm 3.1, we develop an iterative algorithm to recover the solution of the functional linear equation \(Ax = y\), and we compare its performance with that of Algorithm (15).

Let us choose a simple image restoration problem studied in [53]. For simplicity, we consider the case \(p=2=q\) and \(E_1=E_2=L_2([0,1])\). Consider the matrix \(A\in {\mathbb {R}}^{1000\times 1000}\)

$$\begin{aligned} A=\frac{1}{1000} \left( \begin{array}{c@{\qquad }c@{\qquad }c@{\qquad }c} 1 &{} 0 &{} \cdots &{} 0 \\ \vdots &{} \ddots &{} \ddots &{} \vdots \\ \vdots &{} &{} \ddots &{} 0 \\ 1 &{} \cdots &{} \cdots &{} 1 \\ \end{array} \right) \end{aligned}$$

resulting from a discretization of the operator equation of the first kind

$$\begin{aligned} y(t)=(Ax)(t):=\int _0^t x(s)\mathrm{d}s,\quad s,t \in [0,1]; \quad x,y:[0,1]\rightarrow [0,1]. \end{aligned}$$
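The matrix A above can be checked to be the rectangle-rule discretization of this cumulative integration operator. The sketch below is our illustration, with an arbitrary test function whose integral is known in closed form.

```python
import numpy as np

# Check that A = tril(ones)/n discretizes (Ax)(t) = int_0^t x(s) ds.
n = 1000
A = np.tril(np.ones((n, n))) / n       # lower-triangular matrix from the text

s = np.arange(1, n + 1) / n            # grid points s_k = k/n
x = np.cos(np.pi * s)                  # arbitrary test function
y = A @ x                              # discrete cumulative integral
y_exact = np.sin(np.pi * s) / np.pi    # int_0^t cos(pi s) ds = sin(pi t)/pi

print(np.max(np.abs(y - y_exact)))     # O(1/n) quadrature error
```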

We are interested in solutions \(x\in \{x\in C: Ax\in Q\}\), where C is the cone of functions x(s) that are nonpositive for \(s\in [0, 0.25]\) and nonnegative for \(s\in [0.25, 1]\), and \(Q = [a(t), b(t)]\) is a box delimited by the functions a(t) and b(t). We take \(E_1 =({\mathbb {R}}^{1000}, \Vert \cdot \Vert _{2})\) and \(E_2=({\mathbb {R}}^{1000}, \Vert \cdot \Vert _{2})\). We compute the metric projection \(P_Q\) with the formula:

$$\begin{aligned} P_{[a,b]}(x):=\max \{a,\min \{x,b\}\}, \end{aligned}$$
(37)

where \([a, b] := \{x\in H:a \le x\le b\}\) is a closed box with extended real-valued functions a, b, and "\(\le \)", "min" and "max" are understood pointwise a.e. Since \(E_1,E_2\) are Hilbert spaces, the Bregman projection \(\Pi _C\) coincides with the metric projection \(P_C\), which can also be computed with (37) because \(C = [c(s), d(s)]\) is an unbounded box with \(c(s) = -\infty , d(s) = 0\) for \(s\in [0, 0.25]\) and \(c(s) = 0, d(s) = +\infty \) for \(s\in [0.25, 1]\).
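Formula (37) is a pointwise clamp, and the cone C is encoded as an unbounded box via \(\pm \infty \) bounds. A minimal sketch (our illustration; the test signal is an arbitrary choice):

```python
import numpy as np

# Pointwise box projection (37): P_[a,b](x) = max{a, min{x, b}}.
def proj_box(x, a, b):
    return np.maximum(a, np.minimum(x, b))

n = 1000
s = np.linspace(0, 1, n)
c = np.where(s < 0.25, -np.inf, 0.0)   # lower bound c(s) of the cone C
d = np.where(s < 0.25, 0.0, np.inf)    # upper bound d(s) of the cone C

x = np.sin(2 * np.pi * s)              # arbitrary test signal
xC = proj_box(x, c, d)                 # metric (= Bregman, p = 2) projection

# xC is nonpositive on [0, 0.25) and nonnegative on [0.25, 1]:
print(bool(np.all(xC[s < 0.25] <= 0)), bool(np.all(xC[s >= 0.25] >= 0)))
```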

To ensure the existence of a solution of the considered problem, a sparse vector \(x^*(t)\) is generated randomly in C (see Fig. 1a). Taking \(y(t)=Ax^*(t)\), \(a(t)=y(t)-0.01\) and \(b(t)=y(t)+0.01\) (Fig. 1b), we have \(Q=[a(t),b(t)]\). The problem of interest is to find \(x \in C\) such that \(Ax \in Q.\)

Fig. 1
figure 1

\(x^* \in C \) (left) and \(Ax^* \in Q\) (right)

In Fig. 2, we report the results comparing the behavior of Algorithm 3.1 with Algorithm (15) for the initial guess \(x_0=x_1=0\). For Algorithm 3.1, we take \(\alpha _{n}=\alpha =3\) and \(\rho _{n}=\rho =2\); for Algorithm (15), we take \(\lambda _n=\lambda =\frac{1}{\Vert A\Vert ^2}\). We compute the error as:

$$\begin{aligned} \text {Error}=\Vert x_n-\Pi _C(x_n)\Vert +\Vert Ax_n-P_Q(Ax_n)\Vert . \end{aligned}$$

It can be seen that Algorithm 3.1 is significantly faster than Algorithm (15).

Fig. 2
figure 2

Number of iterations and error estimate for Algorithm 3.1 and Algorithm (15) for image restoration problem

Example 2

The following example was considered in [51], where \(E_1=E_2=L_2[0,1]\) with the inner product given by:

$$\begin{aligned} \left\langle f,g \right\rangle =\int _{0}^{1} f(t)g(t)\mathrm{d}t. \end{aligned}$$

Let

$$\begin{aligned} C:=\left\{ x \in E_1: \left\langle a,x \right\rangle =b \right\} , \end{aligned}$$

where \(a=2t^2\) and \(b=0\). In this case, we have

$$\begin{aligned} P_C(x)=x+\frac{b-\left\langle a,x \right\rangle }{\Vert a\Vert ^2}a. \end{aligned}$$

Also, let

$$\begin{aligned} Q:=\left\{ x \in E_2: \left\langle c,x \right\rangle \le d \right\} , \end{aligned}$$

where \(c=t/3\) and \(d=1\). In this case, we obtain

$$\begin{aligned} P_Q(x)=x-\frac{\max \left\{ 0, \left\langle c,x \right\rangle -d\right\} }{\Vert c\Vert ^2}c. \end{aligned}$$
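Both projections can be checked numerically on a discretized grid. The sketch below is our illustration (rectangle-rule inner products; the half-space projection is written in the equivalent form \(P_Q(x)=x-\max \{0,\left\langle c,x\right\rangle -d\}\,c/\Vert c\Vert ^2\)); it verifies that \(P_C(x)\) lies on the hyperplane and \(P_Q(x)\) satisfies the inequality constraint.

```python
import numpy as np

# Numerical check of the two projections of Example 2 on discretized L_2[0,1].
m = 500
t = np.linspace(0, 1, m)
h = 1.0 / (m - 1)

def inner(f, g):                        # <f, g> ~ int_0^1 f(t) g(t) dt
    return h * np.sum(f * g)

a, b = 2 * t**2, 0.0                    # C = {x : <a, x> = b}
c, d = t / 3, 1.0                       # Q = {x : <c, x> <= d}

def P_C(x):                             # projection onto the hyperplane C
    return x + (b - inner(a, x)) / inner(a, a) * a

def P_Q(x):                             # projection onto the half-space Q
    return x - max(0.0, inner(c, x) - d) / inner(c, c) * c

x = np.exp(2 * t)                       # the first initial guess used below
print(abs(inner(a, P_C(x)) - b))        # ~0: P_C(x) lies on C
print(inner(c, P_Q(x)) <= d + 1e-9)     # P_Q(x) satisfies the constraint
```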

We assume that

$$\begin{aligned} A: E_1 \rightarrow E_2, \quad Ax(t)=\frac{x(t)}{2}. \end{aligned}$$

Then, A is a bounded linear operator and \(A^*=A\). Observe that the solution set

$$\begin{aligned} \Omega =\left\{ x \in C: Ax \in Q \right\} \end{aligned}$$

is nonempty since \(0 \in \Omega \). In the following figures, we compare Algorithm 3.1 with Algorithm (15) for the two initial guesses \(x_0(t)=x_1(t)=\exp (2t)\) and \(x_0(t)=x_1(t)=3 \sin (t)\). For Algorithm 3.1, we take \(\alpha _{n}=\alpha =1\) and \(\rho _{n}=\rho =2\); for Algorithm (15), we take \(\lambda _n=\lambda =\frac{1}{\Vert A\Vert ^2}\). The iteration was stopped with accuracy

$$\begin{aligned} \text {Error}=\Vert x_n-\Pi _C(x_n)\Vert +\Vert Ax_n-P_Q(Ax_n)\Vert \le \epsilon , \end{aligned}$$

where \(\epsilon =10^{-4}\) (Fig. 3).

Fig. 3
figure 3

Number of iterations and error estimate for Algorithm 3.1 and Algorithm (15) with \(x_0(t)=x_1(t)=\exp (2t)\) (left) and \(x_0(t)=x_1(t)=3 \sin (t)\) (right) in Example 2

Finally, let us summarize the comparison of Algorithm 3.1 with Algorithm (15) in Table 1.

Table 1 Comparison of Algorithm 3.1 and Algorithm (15)

6 Conclusion and final remarks

In this paper, we studied a modified self-adaptive projection method with an inertial technique for solving the split feasibility problem (1) in p-uniformly convex Banach spaces which are also uniformly smooth. We showed in a simple and novel way that the sequence generated by our inertial projection method converges strongly to a solution of the SFP. The effectiveness and numerical implementation of the proposed method are illustrated by solving several examples in Banach spaces, and all numerical implementations are compared with Wang's algorithm [59]. On these grounds, we conclude that our proposed method is more efficient and faster than Wang's method.

In recent years, the nonlinear split feasibility problem (NLSFP) has gained considerable interest; see, e.g., [26, 32]. The non-convex case is also very attractive from the application point of view; see [45]. In future work, we will extend the results of this paper to the nonlinear split feasibility problem and to the non-convex case of the split feasibility problem.