1 Introduction

It is well known that the split feasibility problem (SFP) plays a key role in signal processing Byrne (2004) and medical image reconstruction Byrne (2002). Therefore, many numerical algorithms have been developed to solve the SFP; see Byrne (2004, 2002); Censor et al. (2005); Dong et al. (2020); López et al. (2012); Reich et al. (2020); Sahu et al. (2020) and the references therein.

The original model of the SFP was introduced by Censor and Elfving Censor and Elfving (1994) for modeling inverse problems, and a classical method for solving it is Byrne’s CQ algorithm Byrne (2004, 2002). The SFP can easily be obtained from the proximal split feasibility problem (proximal SFP), which generalizes the proximal split minimization problems in Moudafi and Thakur (2014). For the proximal SFP, numerous iterative algorithms have been studied with regard to their convergence properties; see Abbas et al. (2018); Moudafi and Thakur (2014); Shehu and Iyiola (2017, 2018); Wang and Xu (2014) and the references therein. To be specific, Moudafi and Thakur Moudafi and Thakur (2014) established weak convergence of a split proximal algorithm whose self-adaptive stepsize is not determined by an Armijo-like rule Dong et al. (2018); Gibali et al. (2018); Qu and Xiu (2005); Shehu and Gibali (2020); such rules often incur additional computational cost. Building on the inertial idea (a procedure for accelerating convergence; see Kesornprom and Cholamjiak (2019); Sahu et al. (2020); Suantai et al. (2018); Shehu et al. (2020); Iyiola et al. (2018)), Shehu and Iyiola Shehu and Iyiola (2017) modified Moudafi and Thakur’s algorithm with an inertial technique and obtained weak convergence in real Hilbert spaces. To obtain strong convergence, various related algorithms have been proposed in recent years. For instance, Abbas et al. Abbas et al. (2018) presented two different one-step methods; Shehu and Iyiola Shehu and Iyiola (2017, 2018) combined Mann-type, accelerated hybrid viscosity, and steepest-descent methods to ensure it; Wang and Xu Wang and Xu (2014) proposed the proximal gradient method. However, the strong convergence of the algorithm in Shehu and Iyiola (2017) with new inertial effects has yet to be established.
We also observe that Shehu and Iyiola’s algorithm Shehu and Iyiola (2017) does not require an estimate of the operator norm, but its convergence analysis requires the firm nonexpansiveness of the operators involved. These observations lead us to the following question:

Question: Can we prove a strong convergence result for the proximal SFP by employing a new modification of the inertial split proximal algorithm Shehu and Iyiola (2017) under conditions weaker than firm nonexpansiveness of the operators involved?

Inspired and motivated by the works in Moudafi (2000); Shehu and Iyiola (2017), in this paper, we propose an iterative algorithm for solving the proximal SFP. The algorithm combines the inertial method, the viscosity-type algorithm, and the split proximal algorithm with a self-adaptive stepsize. Strong convergence of the proposed algorithm is established without firm nonexpansiveness of the mappings involved. We also provide an application of our main results to solving split feasibility problems in Hilbert spaces. Finally, three numerical examples illustrate the effectiveness of the proposed algorithm.

The paper is organized as follows. In Sect. 2, we recall some basic concepts and lemmas used in the subsequent sections. The main results are presented in Sect. 3. Numerical experiments are provided in Sect. 4. We give some conclusions in the final section.

2 Preliminaries

The symbol \(\rightharpoonup \) stands for weak convergence, and the symbol \(\rightarrow \) represents strong convergence. Let \(H_{1}\) and \(H_{2}\) be real Hilbert spaces. For a proper lower semi-continuous (lsc) convex function \(F:H_{1}\rightarrow ] {-\infty ,+\infty }]\), its domain is denoted by \(\mathrm{dom}\,F\), i.e., \({\mathrm{dom}\,F}:=\{x\in H_{1}: F(x)<\infty \}\). The proximal operator \({\mathrm{prox}}_{\tau G}: H_{2} \rightarrow H_{2}\) is defined by

$$\begin{aligned}{\mathrm{prox}}_{\tau G}\left( x \right) := \mathop {\arg \min }\limits _{y \in H_{2}} \left\{ {G\left( y \right) + \frac{1}{{2\tau }}{{\left\| {x -y } \right\| }^2}} \right\} ,\end{aligned}$$

where \(\tau >0\) and \(G:H_{2}\rightarrow \mathbb {R}\cup \left\{ { + \infty } \right\} \) is a proper, convex, and lower semi-continuous (lsc) function.
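As an illustrative special case (our example, not from the original text): when G is a multiple of the \(\ell _1\)-norm, \({\mathrm{prox}}_{\tau G}\) has the well-known closed form of componentwise soft-thresholding. A minimal NumPy sketch, with function names of our choosing:

```python
import numpy as np

def prox_l1(x, tau):
    """Proximal operator of G(y) = ||y||_1 with parameter tau:
    prox_{tau G}(x) = argmin_y { ||y||_1 + (1/(2 tau)) ||x - y||^2 },
    which is the componentwise soft-thresholding map."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([3.0, -0.5, 1.2])
print(prox_l1(x, 1.0))  # each entry shrunk toward 0 by tau; small entries vanish
```

Soft-thresholding of this kind is, for instance, the backbone of ISTA-type methods for \(\ell _1\)-regularized least squares.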

In view of Combettes and Hirstoaga (2005), the proximal mapping \(\mathrm{prox}_{\tau G}\) is firmly nonexpansive, that is

$$\begin{aligned} \Vert {\mathrm{prox}}_{\tau G}(x)-{\mathrm{prox}}_{\tau G}\left( y\right) \Vert ^2\le \Vert x-y\Vert ^2 -\Vert \left( x-{\mathrm{prox}}_{\tau G}(x)\right) -\left( y -{\mathrm{prox}}_{\tau G}(y)\right) \Vert ^2,~\forall ~x,~y\in H_{2}, \end{aligned}$$

and its fixed point set is the set of minimizers of G. Let C be a nonempty, closed, and convex subset of \(H_{1}\); then, the orthogonal projection of x onto C is defined by

$$\begin{aligned} P_{C}x := \mathop {\arg \min }\limits _{y \in C}\left\| {x -y } \right\| ,~\forall ~x\in H_{1}. \end{aligned}$$
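When C is a closed Euclidean ball, \(P_{C}\) has a simple closed form; the following NumPy sketch (ours, with illustrative names) implements it and can be checked against the variational characterization \(\langle x-P_{C}x, y-P_{C}x\rangle \le 0\) for \(y\in C\):

```python
import numpy as np

def project_ball(x, center, radius):
    """Orthogonal projection P_C of x onto the closed ball
    C = {y : ||y - center|| <= radius}."""
    d = x - center
    nd = np.linalg.norm(d)
    if nd <= radius:
        return x.copy()                     # x already lies in C
    return center + (radius / nd) * d       # otherwise, scale back to the boundary

x = np.array([3.0, 4.0])
p = project_ball(x, np.zeros(2), 1.0)
print(p)  # the projection lies on the unit sphere
```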

Definition 2.1

Let \(h:H_{1}\rightarrow H_{1}\) be a mapping, and then

  1. (i)

    h is called nonexpansive if

    $$\begin{aligned} \Vert hx-hy\Vert \le \Vert x-y\Vert ,~\forall ~x,~y \in H_{1}. \end{aligned}$$
  2. (ii)

    h is said to be firmly nonexpansive if

    $$\begin{aligned} \langle hx-hy, x-y\rangle \ge \Vert hx-hy\Vert ^2,~\forall ~x,~y \in H_{1}. \end{aligned}$$
  3. (iii)

    Let \(D\subset H_{1}\) be a set. A function \(h:D\rightarrow \mathbb {R}\cup \left\{ { + \infty }\right\} \) is said to be weakly lower semi-continuous if, whenever \(x_{n}\rightharpoonup x \), the following holds:

    $$\begin{aligned} \mathop {\liminf }\limits _{n \rightarrow \infty }h(x_{n})\ge h(x). \end{aligned}$$

Lemma 2.1

Let \(\nu \in \left] {0,1} \right[\). Then, for all \(x,y,z\in H_{1}\):

  1. (i)

    \(\Vert \nu x+(1-\nu )y\Vert ^2=\nu \Vert x\Vert ^2+(1-\nu )\Vert y\Vert ^2-\nu (1-\nu )\Vert x-y\Vert ^2;\)

  2. (ii)

    \(\Vert x+y\Vert ^2\le \Vert x\Vert ^2+2\langle y, x+y\rangle ;\)

  3. (iii)

    \(\langle x-y, x-z\rangle =\frac{1}{2}\Vert x-y\Vert ^2+\frac{1}{2}\Vert x-z\Vert ^2-\frac{1}{2}\Vert y-z\Vert ^2.\)

Lemma 2.2

Goebel and Reich (1984) Let \(C \subset H_{1}\) be a nonempty closed convex set and let \(P_{C}\) be the metric projection from \(H_{1}\) to C. Then, the following statements hold:

  1. (i)

    \(\langle x-P_{C}x, y-P_{C}x\rangle \le 0~for~all~ x \in H_{1}~and~y \in C;\)

  2. (ii)

    \(\Vert P_{C}x-P_{C}y\Vert \le \Vert x-y\Vert ~for~all~ x,~y \in H_{1}.\)

Lemma 2.3

Saejung and Yotkaew (2012) Suppose that \(\{s_{n} \}_{n=1}^{\infty }\) is a sequence of nonnegative real numbers, such that

$$\begin{aligned} s_{n+1}\le (1-\alpha _{n})s_{n}+\alpha _{n}\delta _{n},~\forall ~n\ge 1, \end{aligned}$$

where

  1. (i)

    \(\{ \alpha _{n} \}_{n=1}^{\infty }\subset \left] {0,1} \right[\) and \( \sum \limits _{n = 1}^\infty { \alpha _{n}=\infty }\),

  2. (ii)

    \(\mathop {\limsup }\limits _{k\rightarrow \infty }\delta _{n_{k}}\le 0\) for every subsequence \(\{s_{n_{k}}\}\) of \(\{s_{n}\}\) fulfilling \(\mathop {\liminf }\limits _{k\rightarrow \infty }( s_{n_{k}+1}-s_{n_{k}})\ge 0\).

    Then, \( \mathop {\lim }\limits _{n\rightarrow \infty } s_{n}=0.\)
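As a quick numerical illustration of the lemma (ours, not part of the original; the concrete sequences are arbitrary choices satisfying (i) and (ii)), take \(\alpha _{n}=1/(n+1)\), so \(\sum \alpha _{n}=\infty \), and \(\delta _{n}=1/n\rightarrow 0\), so condition (ii) holds along every subsequence:

```python
# Illustration of Lemma 2.3: s_{n+1} <= (1 - alpha_n) s_n + alpha_n delta_n.
# With alpha_n = 1/(n+1) (sum diverges) and delta_n = 1/n -> 0,
# the recursion drives s_n to 0 regardless of the starting value.
s = 10.0
for n in range(1, 200001):
    alpha = 1.0 / (n + 1)
    delta = 1.0 / n
    s = (1 - alpha) * s + alpha * delta
print(s)  # close to 0 for large n
```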

3 Strong convergence

In this section, we first let \(H_{1}\) and \(H_{2}\) be real Hilbert spaces and \(A:H_{1}\rightarrow H_{2}\) be a bounded linear operator with its adjoint \(A^*,\) \(F:H_{1}\rightarrow \mathbb {R}\cup \left\{ { + \infty } \right\} \) and \(G:H_{2}\rightarrow \mathbb {R}\cup \left\{ { + \infty } \right\} \) be proper, convex, and lower semi-continuous (lsc) functions. Now, we consider the following proximal SFP:

$$\begin{aligned} {\mathrm{Find}}~z^* \in H_{1}~{\mathrm{such~that}}~z^*\in \mathop {\arg \min }\limits _{x\in H_{1}}\left\{ F(x)+G_{\tau }(Ax)\right\} , \end{aligned}$$

where \(\tau >0\) and \(G_{\tau }(x):=\mathop {\min }\limits _{y \in H_{2}}\left\{ {G\left( y\right) + \frac{1}{{2\tau }}{{\left\| {y -x } \right\| }^2}} \right\} \) is the Moreau–Yosida approximate of the function G with parameter \(\tau \). If such a point exists, the solution set is denoted by \(\varGamma \). As in Abbas et al. (2018); Shehu and Iyiola (2017), we adopt the following definitions, used throughout the rest of the paper.

Given any \(\tau >0\) and \(x \in H_{1}\), we define

$$\begin{aligned} \begin{array}{*{20}{l}} \begin{array}{l} E(x)=\frac{1}{2}\Vert (I-\text{ prox}_{\tau G})Ax\Vert ^2;\\ L(x)=\frac{1}{2}\Vert (I-\text{ prox}_{\tau F})x\Vert ^2~{\mathrm{and}}~\\ \theta (x)=\sqrt{\Vert \nabla E(x)+\nabla L(x)\Vert ^2}. \end{array} \end{array}\end{aligned}$$

Then, the gradients \(\nabla E\) and \(\nabla L\) of E and L, respectively, are Lipschitz continuous and given by

$$\begin{aligned} \begin{array}{*{20}{l}} \begin{array}{l} \nabla E(x)=A^*(I-{\mathrm{prox}}_{\tau G})Ax~{\mathrm{and}}~\\ \nabla L(x)=(I-{\mathrm{prox}}_{\tau F})x, \end{array} \end{array}\end{aligned}$$

whose Lipschitz constants are \(\Vert A\Vert ^2\) and 1, respectively. Before describing our algorithm, we list the conditions required for the convergence analysis.

  1. (A1)

    The solution set of the proximal SFP is nonempty, that is, \(\varGamma \ne \emptyset .\)

  2. (A2)

    Let \( \{\tilde{\tau }_{n}\} \subset [{0, \tilde{\theta }}[\) with \(\tilde{\theta }>0\) be a positive sequence, such that \( \tilde{\tau }_{n}=o(\gamma _{n}) \), i.e., \(\mathop {\lim }\limits _{n\rightarrow \infty } \frac{\tilde{\tau }_{n}}{\gamma _{n}}=0\) where the sequence \(\{ \gamma _{n}\}\subset \left]{0,1}\right[\) fulfills \( \sum \limits _{n = 1}^\infty { \gamma _{n}=\infty }~\mathrm{and}~ \mathop {\lim }\limits _{n\rightarrow \infty }\gamma _{n}=0.\)

  3. (A3)

    The mapping \(f: H_{1}\rightarrow H_{1}\) is \(\tilde{\rho }\)-contractive with \(\tilde{\rho }\in \left[ {0,1} \right[ \).

  4. (A4)

    \(\mathop {\inf }\limits _{n\ge 1} \kappa _{n}(2-\kappa _{n})>0\).
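To make E, L, \(\theta \), and the gradient formulas concrete, the following NumPy sketch of ours specializes them to the SFP setting of Corollary 3.1 below, where the proximal mappings reduce to projections \(P_{C}\) and \(P_{Q}\) onto balls; the matrix A and the sets are illustrative choices of ours:

```python
import numpy as np

def project_ball(x, center, radius):
    """Projection onto the closed ball {y : ||y - center|| <= radius}."""
    d = x - center
    nd = np.linalg.norm(d)
    return x if nd <= radius else center + (radius / nd) * d

A = np.array([[2.0, 1.0], [1.0, 3.0]])
cC, rC = np.zeros(2), 1.0                # C = closed unit ball, prox_{tau F} = P_C
cQ, rQ = A @ np.array([0.2, 0.1]), 1.0   # Q = ball around A z*, so z* is feasible

def E(x):  return 0.5 * np.linalg.norm(A @ x - project_ball(A @ x, cQ, rQ)) ** 2
def L(x):  return 0.5 * np.linalg.norm(x - project_ball(x, cC, rC)) ** 2
def gradE(x): return A.T @ (A @ x - project_ball(A @ x, cQ, rQ))   # A*(I - P_Q)Ax
def gradL(x): return x - project_ball(x, cC, rC)                    # (I - P_C)x
def theta(x): return np.linalg.norm(gradE(x) + gradL(x))

z = np.array([0.2, 0.1])          # a feasible point: z in C and Az in Q
print(E(z) + L(z), theta(z))      # both vanish at a solution
```

At a feasible point, E, L, and \(\theta \) all vanish, which is exactly the situation described in Remark 3.1 below.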

Below, our iterative scheme is stated in Algorithm 1.

Algorithm 1 (displayed as a figure)

Remark 3.1

In Algorithm 1, if \(\varGamma \ne \emptyset \) and \(\nabla E(w_{n})=\nabla L(w_{n})=0\) and \(w_{n}=x_{n}\), then \(x_{n}\in \varGamma \).

Proof

  If \( A^*(I-\text{ prox}_{\tau G})Aw_{n}=(I-\text{ prox}_{\tau F})w_{n}=0\), then \(w_{n}\in \varGamma .\) Additionally, \(A^*(I-\text{ prox}_{\tau G})Aw_{n}=(I-\text{ prox}_{\tau F})w_{n}=0\) yields from (3.1) of Algorithm 1 that \(y_{n}=w_{n}.\) This, together with \(w_{n}=x_{n}\), implies that \(y_{n}=x_{n} \in \varGamma \). Thus, \(x_{n}\in \varGamma .\) \(\square \)

Remark 3.2

In Algorithm 1, the inertial parameter \(\sigma _{n}\) is chosen as

$$\begin{aligned} \sigma _{n}= {\left\{ \begin{array}{ll} \min \left\{ {\frac{\tilde{\tau }_{n}}{\Vert x_{n}-x_{n-1}\Vert },\sigma }\right\} &{} {\mathrm{if}}\ x_{n}\ne x_{n-1}, \\ \sigma , &{}{\mathrm{otherwise}}. \end{array}\right. } \end{aligned}$$
(3.3)
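Since Algorithm 1 itself is displayed as a figure, the following NumPy sketch is our reconstruction of the scheme, pieced together from its steps (3.1)–(3.2), the inertial rule (3.3), and Corollary 3.1, specialized to the SFP case with projections onto balls. All concrete data and parameter choices (A, C, Q, \(\gamma _{n}=1/(n+1)\), \(\tilde{\tau }_{n}=\gamma _{n}^2\), \(\kappa _{n}\equiv 1\), \(\sigma =0.5\), \(f(x)=x/2\)) are ours and merely satisfy Conditions (A1)–(A4):

```python
import numpy as np

def project_ball(x, center, radius):
    """Projection onto the closed ball {y : ||y - center|| <= radius}."""
    d = x - center
    nd = np.linalg.norm(d)
    return x if nd <= radius else center + (radius / nd) * d

# Illustrative SFP data (our choices): C and Q are balls, so Gamma is nonempty.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
cC, rC = np.zeros(2), 1.0                    # C = closed unit ball
cQ, rQ = A @ np.array([0.2, 0.1]), 1.0       # Q = ball around A z* for feasible z*

E = lambda x: 0.5 * np.linalg.norm(A @ x - project_ball(A @ x, cQ, rQ)) ** 2
L = lambda x: 0.5 * np.linalg.norm(x - project_ball(x, cC, rC)) ** 2
gradE = lambda x: A.T @ (A @ x - project_ball(A @ x, cQ, rQ))
gradL = lambda x: x - project_ball(x, cC, rC)
f = lambda x: 0.5 * x                        # contraction with rho = 0.5, (A3)

sigma, kappa = 0.5, 1.0                      # kappa in ]0,2[ gives (A4)
x_prev = x = np.array([2.0, 2.0])
for n in range(1, 3000):
    gamma = 1.0 / (n + 1)                    # (A2): gamma_n -> 0, sum = infinity
    tau_t = gamma ** 2                       # tilde tau_n = o(gamma_n)
    dx = np.linalg.norm(x - x_prev)
    sig = min(tau_t / dx, sigma) if dx > 0 else sigma   # inertial rule (3.3)
    w = x + sig * (x - x_prev)               # inertial extrapolation
    g = gradE(w) + gradL(w)
    th2 = g @ g                              # theta(w)^2
    # self-adaptive stepsize lambda_n = kappa_n (E + L) / theta^2 (skipped if 0)
    y = w if th2 == 0 else w - kappa * (E(w) + L(w)) / th2 * g
    x_prev, x = x, gamma * f(x) + (1 - gamma) * y       # viscosity step
print(E(x) + L(x))  # feasibility residual, driven toward zero
```

The stepsize needs no estimate of \(\Vert A\Vert \): it is computed from the current residuals, in line with the self-adaptive rule of the paper.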

In what follows, the proof of the main theorem does not involve the firm nonexpansiveness of the operator \(I-\text{ prox}_{\tau (\cdot )}\).

Theorem 3.1

Suppose that Conditions (A1)–(A4) hold. Then, the sequence \(\{x_{n}\}\) generated by Algorithm 1 converges strongly to a point \(z \in \varGamma \), where \(z=P_{\varGamma }\circ f(z)\).

Proof

Let \(z\in \varGamma \). Since the minimizers of any function are exactly the fixed points of its proximal mapping, z solves the proximal SFP. Furthermore, by Lemma 2.1 (iii), we derive

$$\begin{aligned}&\;{\langle w_{n}-z, -\nabla E(w_{n})\rangle }\\&\quad = \langle w_{n}-z, A^*(\text{ prox}_{\tau G}-I)Aw_{n}\rangle \\&\quad = \langle Aw_{n}-Az, (\text{ prox}_{\tau G}-I)Aw_{n}\rangle \\&\quad = \langle Aw_{n}-\text{ prox}_{\tau G}Aw_{n}+\text{ prox}_{\tau G}Aw_{n}-Az, (\text{ prox}_{\tau G}-I)Aw_{n}\rangle \\&\quad = \langle \text{ prox}_{\tau G}Aw_{n}-Az, \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\rangle -\Vert \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\Vert ^2\\&\quad = \frac{1}{2}\left( \Vert \text{ prox}_{\tau G}Aw_{n}-Az\Vert ^2+\Vert \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\Vert ^2 -\Vert Aw_{n}-Az\Vert ^2 \right) \\&\qquad - \Vert \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\Vert ^2\\&\quad \le -\frac{1}{2}\Vert \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\Vert ^2\\&\quad = -E(w_{n}), \end{aligned}$$

and

$$\begin{aligned} \;{\langle w_{n}-z, -\nabla L(w_{n})\rangle }= & {} \langle w_{n}-z, (\text{ prox}_{\tau F}-I)w_{n}\rangle \\= & {} \langle w_{n}-\text{ prox}_{\tau F}w_{n}+\text{ prox}_{\tau F}w_{n}-z, (\text{ prox}_{\tau F}-I)w_{n}\rangle \\= & {} \langle \text{ prox}_{\tau F}w_{n}-z, \text{ prox}_{\tau F}w_{n}-w_{n}\rangle -\Vert \text{ prox}_{\tau F}w_{n}-w_{n}\Vert ^2\\= & {} \frac{1}{2}\left( \Vert \text{ prox}_{\tau F}w_{n}-z\Vert ^2+\Vert \text{ prox}_{\tau F}w_{n}-w_{n}\Vert ^2 -\Vert w_{n}-z\Vert ^2\right) \\&- \Vert \text{ prox}_{\tau F}w_{n}-w_{n}\Vert ^2\\\le & {} -\frac{1}{2}\Vert \text{ prox}_{\tau F}w_{n}-w_{n}\Vert ^2\\= & {} -L(w_{n}); \end{aligned}$$

combining with (3.1) and (3.2) yields that

$$\begin{aligned} {\Vert y_{n}-z\Vert ^2}= & {} \Vert w_{n}-\lambda _{n}\left( \nabla E(w_{n})+\nabla L(w_{n})\right) -z\Vert ^2 \nonumber \\= & {} \Vert w_{n}-z\Vert ^2+\lambda _{n}^{2}\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2 \nonumber \\&+ 2\lambda _{n}\langle w_{n}-z, -(\nabla E(w_{n})+\nabla L(w_{n})) \rangle \nonumber \\= & {} \Vert w_{n}-z\Vert ^2+\lambda _{n}^{2}\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2 \nonumber \\&+ 2\lambda _{n}\langle w_{n}-z, -\nabla E(w_{n})\rangle +2\lambda _{n}\langle w_{n}-z, -\nabla L(w_{n}) \rangle \nonumber \\\le & {} \Vert w_{n}-z\Vert ^2+\lambda _{n}^{2}\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2 -2\lambda _{n}(E(w_{n})+L(w_{n}))\nonumber \\= & {} \Vert w_{n}-z\Vert ^2+\kappa _{n}^2\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^4} (\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2)\nonumber \\&-2\kappa _{n}\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}\nonumber \\= & {} \Vert w_{n}-z\Vert ^2+\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}. \end{aligned}$$
(3.4)

After arrangement, we have

$$\begin{aligned} \Vert y_{n}-z\Vert ^2\le \Vert w_{n}-z\Vert ^2+\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}. \end{aligned}$$
(3.5)

From (A4) and (3.5), we have

$$\begin{aligned} \Vert y_{n}-z\Vert \le \Vert w_{n}-z\Vert . \end{aligned}$$
(3.6)

By the definition of \(w_{n}\), we get

$$\begin{aligned} {\Vert w_{n}-z\Vert }= & {} \Vert x_{n}+\sigma _{n}(x_{n}-x_{n-1})-z\Vert \nonumber \\\le & {} \Vert x_{n}-z\Vert +\sigma _{n}\Vert x_{n}-x_{n-1}\Vert \nonumber \\= & {} \Vert x_{n}-z\Vert +\gamma _{n} \cdot \frac{\sigma _{n}}{\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert . \end{aligned}$$
(3.7)

According to (3.3), we have \(\sigma _{n}\Vert x_{n}-x_{n-1}\Vert \le \tilde{\tau }_{n}\) \(\forall n\ge 1\), which, together with \( \mathop {\lim }\limits _{n\rightarrow \infty }\frac{\tilde{\tau }_{n}}{\gamma _{n}}=0\), yields that

$$\begin{aligned} \mathop {\lim }\limits _{n\rightarrow \infty }\frac{\sigma _{n}}{\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert \le \mathop {\lim }\limits _{n\rightarrow \infty }\frac{\tilde{\tau }_{n}}{\gamma _{n}}=0. \end{aligned}$$

Therefore, there is a constant \(M_{1}> 0\), such that

$$\begin{aligned} \frac{\sigma _{n}}{\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert \le M_{1},~\forall n\ge 1, \end{aligned}$$

which, along with (3.6) and (3.7), yields that

$$\begin{aligned} \Vert y_{n}-z\Vert \le \Vert w_{n}-z\Vert \le \Vert x_{n}-z\Vert +\gamma _{n}M_{1}. \end{aligned}$$
(3.8)

From (3.1) and (3.8), it follows that:

$$\begin{aligned} {\Vert x_{n+1}-z\Vert }= & {} \Vert \gamma _{n}f(x_{n})+(1-\gamma _{n})y_{n}-z\Vert \\= & {} \Vert \gamma _{n}(f(x_{n})-z)+(1-\gamma _{n})(y_{n}-z)\Vert \\\le & {} \gamma _{n}\Vert f(x_{n})-z\Vert +(1-\gamma _{n})\Vert y_{n}-z\Vert \\= & {} \gamma _{n}\Vert f(x_{n})-f(z)+f(z)-z\Vert +(1-\gamma _{n})\Vert y_{n}-z\Vert \\\le & {} \gamma _{n}\tilde{\rho }\Vert x_{n}-z\Vert +\gamma _{n}\Vert f(z)-z\Vert +(1-\gamma _{n})\Vert y_{n}-z\Vert \\\le & {} \gamma _{n}\tilde{\rho }\Vert x_{n}-z\Vert +(1-\gamma _{n})(\Vert x_{n}-z\Vert +\gamma _{n}M_{1})+\gamma _{n}\Vert f(z)-z\Vert \\\le & {} \left( 1-\gamma _{n}\left( 1-\tilde{\rho }\right) \right) \Vert x_{n}-z\Vert +\gamma _{n}(1-\tilde{\rho })\frac{\Vert f(z)-z\Vert +M_{1}}{1-\tilde{\rho }}\\\le & {} \max \left\{ \Vert x_{n}-z\Vert , \frac{ M_{1}+ \Vert f(z)-z\Vert }{1-\tilde{\rho }}\right\} \\\le & {} \cdots \le \max \left\{ \Vert x_{1}-z\Vert , \frac{ M_{1}+ \Vert f(z)-z\Vert }{1-\tilde{\rho }}\right\} . \end{aligned}$$

This means that the sequence \(\left\{ x_{n}\right\} \) is bounded. Hence, the sequences \(\{y_{n}\}\), \(\left\{ f(x_{n})\right\} \) and \(\{w_{n}\}\) are also bounded.

By (3.1) and the convexity of \(\Vert \cdot \Vert ^2\), we get that

$$\begin{aligned} {\Vert x_{n+1}-z\Vert ^2}= & {} \Vert \gamma _{n}f(x_{n})+(1-\gamma _{n})y_{n}-z\Vert ^2\\= & {} \Vert \gamma _{n}(f(x_{n})-z)+(1-\gamma _{n})(y_{n}-z)\Vert ^2\\\le & {} \gamma _{n}\Vert f(x_{n})-z\Vert ^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\\le & {} \gamma _{n}(\Vert f(x_{n})-f(z)\Vert +\Vert f(z)-z\Vert )^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\\le & {} \gamma _{n}(\tilde{\rho }\Vert x_{n}-z\Vert +\Vert f(z)-z\Vert )^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\\le & {} \gamma _{n}(\Vert x_{n}-z\Vert +\Vert f(z)-z\Vert )^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\= & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+\gamma _{n}\left( \Vert f(z)-z\Vert ^2+2\Vert x_{n}-z\Vert \Vert f(z)-z\Vert \right) \\&+ (1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\\le & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2 +\gamma _{n}M_{2}; \end{aligned}$$

for some \(M_{2}>0\). Combining with (3.5), we derive that

$$\begin{aligned} {\Vert x_{n+1}-z\Vert ^2}\le & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert w_{n}-z\Vert ^2\nonumber \\&+ (1-\gamma _{n})\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2} +\gamma _{n}M_{2}. \end{aligned}$$
(3.9)

Substituting (3.8) into (3.9), then there exists \(M_{3}>0\), such that

$$\begin{aligned} {\Vert x_{n+1}-z\Vert ^2}\le & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})(\Vert x_{n}-z\Vert +\gamma _{n}M_{1})^2\\&+ (1-\gamma _{n})\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2} +\gamma _{n}M_{2}\\= & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})(\gamma _{n}M_{1})^2\\&+ 2(1-\gamma _{n})\gamma _{n}M_{1}\Vert x_{n}-z\Vert + (1-\gamma _{n})\kappa _{n}(\kappa _{n}-2)\\&\times \frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}+\gamma _{n}M_{2}\\\le & {} \Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}+\gamma _{n}M_{3}. \end{aligned}$$

That is

$$\begin{aligned} (1-\gamma _{n})\kappa _{n}(2-\kappa _{n})\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2} \le \Vert x_{n}-z\Vert ^2-\Vert x_{n+1}-z\Vert ^2+\gamma _{n}M_{3}.\nonumber \\ \end{aligned}$$
(3.10)

By Lemma 2.1 (i), (ii) and (3.6), we derive that

$$\begin{aligned} {\Vert x_{n+1}-z\Vert ^2}= & {} \Vert \gamma _{n}f(x_{n})+(1-\gamma _{n})y_{n}-z\Vert ^2\nonumber \\= & {} \Vert \gamma _{n}(f(x_{n})-f(z))+(1-\gamma _{n})(y_{n}-z)+\gamma _{n}(f(z)-z)\Vert ^2 \nonumber \\\le & {} \Vert \gamma _{n}(f(x_{n})-f(z))+(1-\gamma _{n})(y_{n}-z)\Vert ^2+ 2 \gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \nonumber \\\le & {} \gamma _{n}\Vert f(x_{n})-f(z)\Vert ^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2 +2\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \nonumber \\\le & {} \gamma _{n}\tilde{\rho }^2\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2 +2\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \nonumber \\\le & {} \gamma _{n}\tilde{\rho }\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert w_{n}-z\Vert ^2 +2\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle . \end{aligned}$$
(3.11)

From the definition of \(w_{n}\), it follows that:

$$\begin{aligned} {\Vert w_{n}-z\Vert ^2}= & {} \Vert x_{n}+\sigma _{n}(x_{n}-x_{n-1})-z\Vert ^2 \nonumber \\= & {} \Vert x_{n}-z\Vert ^2+\sigma _{n}^{2}\Vert x_{n}-x_{n-1}\Vert ^2+2\sigma _{n}\langle x_{n}-z, x_{n}-x_{n-1}\rangle \nonumber \\\le & {} \Vert x_{n}-z\Vert ^2+\sigma _{n}^{2}\Vert x_{n}-x_{n-1}\Vert ^2+2\sigma _{n}\Vert x_{n}-z\Vert \Vert x_{n}-x_{n-1}\Vert . \end{aligned}$$
(3.12)

Let \( M=\underset{n\ge 1}{\sup }\{ \sigma \Vert x_{n}-x_{n-1}\Vert , 2\Vert x_{n}-z\Vert \}\). Combining (3.11) and (3.12), we obtain that

$$\begin{aligned}\begin{array}{*{20}{l}} \begin{array}{l} \quad \;{\Vert x_{n+1}-z\Vert ^2} \le \left( 1-\gamma _{n}\left( 1-\tilde{\rho }\right) \right) \Vert x_{n}-z\Vert ^2+\sigma _{n}^{2}\Vert x_{n}-x_{n-1}\Vert ^2\\ \quad \quad \quad \quad \;\;+ 2\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle +2\sigma _{n}\Vert x_{n}-z\Vert \Vert x_{n}-x_{n-1}\Vert \\ \quad \quad \quad \;\; = (1-\gamma _{n}\left( 1-\tilde{\rho })\right) \Vert x_{n}-z\Vert ^2+(1-\tilde{\rho })\frac{2}{1-\tilde{\rho }}\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \\ \quad \quad \quad \quad \;\; + \sigma _{n} \Vert x_{n}-x_{n-1}\Vert ( \sigma _{n} \Vert x_{n}-x_{n-1}\Vert +2\Vert x_{n}-z\Vert )\\ \quad \quad \quad \;\; \le (1- \gamma _{n}(1-\tilde{\rho }))\Vert x_{n}-z\Vert ^2+(1-\tilde{\rho })\frac{2}{1-\tilde{\rho }}\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \\ \quad \quad \quad \quad \;\; + 2M\sigma _{n} \Vert x_{n}-x_{n-1}\Vert \\ \quad \quad \quad \;\; = (1- \gamma _{n}(1-\tilde{\rho }))\Vert x_{n}-z\Vert ^2\\ \quad \quad \qquad \;\;+ (1-\tilde{\rho })\gamma _{n}\left( \frac{2}{1-\tilde{\rho }}\langle f(z)-z, x_{n+1}-z\rangle + \frac{2M\sigma _{n}}{(1-\tilde{\rho })\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert \right) . \end{array} \end{array}\end{aligned}$$

After rearrangement, with M as above, we have

$$\begin{aligned} \begin{array}{l} {\Vert x_{n+1}-z\Vert ^2} \le (1-(1-\tilde{\rho } )\gamma _{n})\Vert x_{n}-z\Vert ^2\\ ~ ~~~~~\quad \quad \quad \qquad \;\; + (1-\tilde{\rho })\gamma _{n}\left( \frac{2}{1-\tilde{\rho }}\langle f(z)-z, x_{n+1}-z\rangle + \frac{2M\sigma _{n}}{(1-\tilde{\rho })\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert \right) . \end{array}\nonumber \\ \end{aligned}$$
(3.13)

Next, we let

$$\begin{aligned}\begin{array}{*{20}{l}} \begin{array}{l} \quad ~~~\;s_{n}= \Vert x_{n}-z\Vert ^2,\\ \quad ~~~\; \alpha _{n}= (1-\tilde{\rho })\gamma _{n},\\ \quad ~~~\; \delta _{n}=\frac{2}{1-\tilde{\rho }}\langle f(z)-z, x_{n+1}-z\rangle +\frac{2M\sigma _{n}}{(1-\tilde{\rho })\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert . \end{array} \end{array}\end{aligned}$$

Then, (3.13) reduces to the following inequality:

$$\begin{aligned}\begin{array}{*{20}{l}} \begin{array}{l} \quad ~~~\;s_{n+1}\le (1-\alpha _{n})s_{n}+\alpha _{n}\delta _{n},~\forall ~n\ge 1. \end{array} \end{array}\end{aligned}$$

Clearly, Lemma 2.3 (i) is satisfied. It remains to verify that Lemma 2.3 (ii) is also satisfied. Suppose that \( \{ \Vert x_{n_{k}}-z\Vert \}\) is a subsequence of \( \{ \Vert x_{n}-z\Vert \}\) satisfying \(\liminf _{k\rightarrow \infty }(\Vert x_{n_{k}+1}-z\Vert -\Vert x_{n_{k}}-z\Vert )\ge 0\). Then

$$\begin{aligned} \begin{array}{l} \quad \quad \;\;\;\underset{k\rightarrow \infty }{\liminf }~(\Vert x_{n_{k}+1}-z\Vert ^2-\Vert x_{n_{k}}-z\Vert ^2)\\ ~ ~~ \quad \quad \quad \;\; = \underset{k\rightarrow \infty }{\liminf }~\left( \left( \Vert x_{n_{k}+1}-z\Vert -\Vert x_{n_{k}}-z\Vert \right) (\Vert x_{n_{k}+1}-z\Vert +\Vert x_{n_{k}}-z\Vert )\right) \ge 0. \end{array} \nonumber \\ \end{aligned}$$
(3.14)

By \(\underset{k\rightarrow \infty }{\lim }\gamma _{n_{k}}=0\), (3.10) and (3.14), one has

$$\begin{aligned}\begin{array}{*{20}{l}} \begin{array}{l} \quad \;\;\;\underset{k\rightarrow \infty }{\limsup }~(1-\gamma _{n_{k}})\kappa _{n_{k}}(2-\kappa _{n_{k}})\frac{(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}\\ ~ ~~ \quad \quad \quad \quad \quad \quad \quad \quad \;\; \le \underset{k\rightarrow \infty }{\limsup }~(\Vert x_{n_{k}+1}-z\Vert ^2-\Vert x_{n_{k}}-z\Vert ^2+ \gamma _{n_{k}}M_{3} )\\ ~ ~~ \quad \quad \quad \quad \quad \quad \quad \quad \;\; \le \underset{k\rightarrow \infty }{\limsup }~(\Vert x_{n_{k}+1}-z\Vert ^2-\Vert x_{n_{k}}-z\Vert ^2)+\underset{k\rightarrow \infty }{\limsup }~\gamma _{n_{k}}M_{3}\\ ~ ~~ \quad \quad \quad \quad \quad \quad \quad \quad \;\; = -\underset{k\rightarrow \infty }{\liminf }~(\Vert x_{n_{k}+1}-z\Vert ^2-\Vert x_{n_{k}}-z\Vert ^2)\le 0. \end{array} \end{array}\end{aligned}$$

Now, we have

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }\left( \kappa _{n_{k}}(2-\kappa _{n_{k}})\frac{(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}\right) =0. \end{aligned}$$

Thus, by Condition (A4), we get

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }\frac{(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}=0. \end{aligned}$$

As a result, we have

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }(E(w_{n_{k}})+L(w_{n_{k}}))=0 \Leftrightarrow \underset{k\rightarrow \infty }{\lim }E(w_{n_{k}})=0~\text{ and }~\underset{k\rightarrow \infty }{\lim }L(w_{n_{k}})=0. \end{aligned}$$
(3.15)

Here, \(\theta _{n_{k}}^2=\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2\) is bounded. This follows from the facts that \(\nabla E\) is Lipschitz continuous with constant \(\Vert A\Vert ^2\), that \(\nabla L\) is nonexpansive, and that \(\{w_{n_{k}}\}\) is bounded. More precisely, for any \(z^*\) which solves the proximal SFP, we have

$$\begin{aligned} \begin{array}{l} \Vert \nabla E(w_{n_{k}})\Vert =\Vert \nabla E(w_{n_{k}})-\nabla E(z^*)\Vert \le \Vert A\Vert ^2\Vert w_{n_{k}}-z^*\Vert \end{array} \end{aligned}$$

and

$$\begin{aligned} \begin{array}{l} \Vert \nabla L(w_{n_{k}})\Vert =\Vert \nabla L(w_{n_{k}})-\nabla L(z^*)\Vert \le \Vert w_{n_{k}}-z^*\Vert . \end{array} \end{aligned}$$

By (3.1) and (3.2), we get

$$\begin{aligned} \begin{array}{l} \displaystyle \quad ~~\;{\Vert y_{n_{k}}-w_{n_{k}}\Vert ^2=\Vert \lambda _{n_{k}}(\nabla E(w_{n_{k}})+\nabla L(w_{n_{k}}))\Vert ^2} \\ ~ ~~~\quad \quad \quad \quad \quad \quad \displaystyle \;\;= \dfrac{\kappa _{n_{k}}^2(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}\\ ~ ~~~\quad \quad \quad \quad \quad \quad \displaystyle \;\;\le \dfrac{4(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}\\ ~ ~~~\quad \quad \quad \quad \quad \quad \displaystyle \;\;\rightarrow 0,~{\mathrm{as}}~k\rightarrow \infty . \end{array} \end{aligned}$$

This shows that

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }\Vert y_{n_{k}}-w_{n_{k}}\Vert =0. \end{aligned}$$
(3.16)

From \(\underset{k\rightarrow \infty }{\lim }\gamma _{n_{k}}=0\) and (3.1), it follows that:

$$\begin{aligned} \Vert x_{n_{k}}-w_{n_{k}}\Vert = \sigma _{n_{k}}\Vert x_{n_{k}}-x_{n_{k}-1}\Vert =\gamma _{n_{k}} \frac{ \sigma _{n_{k}}}{\gamma _{n_{k}}}\Vert x_{n_{k}}-x_{n_{k}-1}\Vert \rightarrow 0, ~\text{ as }~k\rightarrow \infty . \end{aligned}$$
(3.17)

Using (3.16) and (3.17), we have

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }\Vert y_{n_{k}}-x_{n_{k}}\Vert =0. \end{aligned}$$
(3.18)

By (3.1) and \(\underset{k\rightarrow \infty }{\lim }\gamma _{n_{k}}=0\), we have

$$\begin{aligned} \Vert x_{n_{k}+1}-y_{n_{k}}\Vert =\gamma _{n_{k}}\Vert f(x_{n_{k}})-y_{n_{k}}\Vert \rightarrow 0,~\text{ as }~k\rightarrow \infty . \end{aligned}$$

This implies that

$$\begin{aligned} \left\| x_{n_{k}+1}-x_{n_{k}}\right\| \le \Vert x_{n_{k}+1}-y_{n_{k}}\Vert +\Vert y_{n_{k}}-x_{n_{k}}\Vert \rightarrow 0,~\text{ as }~k\rightarrow \infty . \end{aligned}$$
(3.19)

Since the sequence \(\{ x_{n_{k}}\}\) is bounded, there exists a subsequence \(\left\{ x_{n_{k_{i}}}\right\} \) of \(\{ x_{n_{k}}\}\) converging weakly to a point \(z^* \in H_{1}\), such that

$$\begin{aligned} \underset{k\rightarrow \infty }{\limsup }\left\langle f(z)-z,x_{n_{k}}-z\right\rangle =\underset{i\rightarrow \infty }{\lim }\left\langle f(z)-z,x_{n_{k_{i}}}-z\right\rangle = \langle f(z)-z,z^*-z\rangle . \end{aligned}$$

Thanks to (3.17), we have

$$\begin{aligned} w_{n_{k_{i}}}\rightharpoonup z^*,~\text{ as }~i\rightarrow \infty . \end{aligned}$$

By the weak lower semi-continuity of E, we arrive at

$$\begin{aligned} 0\le E(z^*)\le \underset{i\rightarrow \infty }{\liminf }E\left( w_{n_{k_{i}}}\right) =\underset{k\rightarrow \infty }{\lim }E\left( w_{n_{k}}\right) =0. \end{aligned}$$

This means that \(E(z^*)=\frac{1}{2}\Vert (I-\text{ prox}_{\tau G})Az^*\Vert ^2=0,\) that is, \(Az^*\) is a fixed point of the proximal mapping of G or, equivalently, \(0 \in \partial G(Az^*).\) In other words, \(Az^*\) is a minimizer of G. Similarly, by the weak lower semi-continuity of L, we have \( 0\le L(z^*)\le \underset{i\rightarrow \infty }{\liminf }L\left( w_{n_{k_{i}}}\right) =\underset{k\rightarrow \infty }{\lim }L\left( w_{n_{k}}\right) =0. \) This means that \(L(z^*)=\frac{1}{2}\Vert (I-\text{ prox}_{\tau F})z^*\Vert ^2=0,\) that is, \(z^*\) is a fixed point of the proximal mapping of F or, equivalently, \(0 \in \partial F(z^*).\) In other words, \(z^*\) is a minimizer of F. Therefore, \(z^*\in \varGamma .\) From the definition of \(z=P_{\varGamma }\circ f(z)\), it follows that

$$\begin{aligned} \underset{k\rightarrow \infty }{\limsup }\langle f(z)-z,x_{n_{k}}-z\rangle =\underset{i\rightarrow \infty }{\lim }\left\langle f(z)-z,x_{n_{k_{i}}}-z\right\rangle =\langle f(z)-z,z^*-z\rangle \le 0, \end{aligned}$$

which together with (3.19) implies that

$$\begin{aligned} \begin{array}{*{20}{l}} \begin{array}{l} \quad \quad \;\;\;\underset{k\rightarrow \infty }{\limsup }~\langle f(z)-z,x_{n_{k}+1}-z\rangle \\ ~ ~~ \quad \quad \quad \quad \quad \;\; \le \underset{k\rightarrow \infty }{\limsup }~\langle f(z)-z,x_{n_{k}+1}-x_{n_{k}}\rangle +\underset{k\rightarrow \infty }{\limsup }~\langle f(z)-z,x_{n_{k}}-z\rangle \\ ~ ~~ \quad \quad \quad \quad \quad \;\; \le 0. \end{array} \end{array}\end{aligned}$$

Hence

$$\begin{aligned} \underset{k\rightarrow \infty }{\limsup }\delta _{n_{k}}=\underset{k\rightarrow \infty }{\limsup }\left\{ \frac{2}{1-\tilde{\rho }}\langle f(z)-z, x_{n_{k}+1}-z\rangle +\frac{2M\sigma _{n_{k}}}{(1-\tilde{\rho })\gamma _{n_{k}}}\Vert x_{n_{k}}-x_{n_{k}-1}\Vert \right\} \le 0. \end{aligned}$$

Employing Lemma 2.3, we conclude that \( \underset{n\rightarrow \infty }{\lim }\Vert x_{n}-z\Vert =0\). \(\square \)

If \(F\equiv \delta _{C}\) [defined as \(\delta _{C}(x)=0\) if \(x \in C\) and \(+\infty \) otherwise] and \(G\equiv \delta _{Q}\), the indicator functions of the nonempty, closed, and convex sets \(C\subset H_{1}\) and \(Q\subset H_{2}\), respectively, then the proximal SFP reduces to the following SFP:

$$\begin{aligned} \begin{array}{l} \text{ Find }~z^* \in C~\text{ such } \text{ that }~Az^* \in Q. \end{array} \end{aligned}$$

Furthermore, we derive the following strong convergence corollary from Theorem 3.1.

Corollary 3.1

Let \(H_{1}\), \(H_{2}\), C, Q, A, \(A^*\), and \(\varGamma \) be as described above. Suppose that \(\varGamma \ne \emptyset \), \(\{\sigma _{n}\}\subset [0,\sigma [\subset [0,1[\), and Conditions (A1)–(A4) hold. Let \(x_{0},~x_{1}\in H_{1}\) and let \(\{x_{n}\}\) be the sequence generated by

$$\begin{aligned} {\left\{ \begin{array}{ll} w_{n}=x_{n}+\sigma _{n}(x_{n}-x_{n-1}),\\ y_{n}=w_{n}-\lambda _{n}(\nabla E(w_{n})+\nabla L(w_{n})),\\ x_{n+1}=\gamma _{n}f(x_{n})+(1-\gamma _{n})y_{n}, \end{array}\right. } \end{aligned}$$

where \(\sigma _{n}\) is defined in (3.3) and the stepsize \(\lambda _{n}\) can be computed via

$$\begin{aligned} \lambda _{n}= \kappa _{n}\frac{E(w_{n})+L(w_{n})}{\theta (w_{n})^2}, \end{aligned}$$

where \(0<\kappa _{n}<2\), \(L(w_{n})=\frac{1}{2}\Vert (I-P_{C})w_{n}\Vert ^2\), \(E(w_{n})=\frac{1}{2}\Vert (I-P_{Q})Aw_{n}\Vert ^2\) and \(\theta (w_{n})=\sqrt{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}\).

Then, the iterative sequence \(\{x_{n}\}\) produced above strongly converges to \(z \in \varGamma \), where \(z=P_{\varGamma }\circ f(z)\).
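For concreteness, the iteration of Corollary 3.1 can be sketched in the Euclidean setting with NumPy. This is a minimal illustration, not the authors' MATLAB implementation; the function name, the diminishing choice \(\gamma_n=1/n^2\), and the residual tolerance are our illustrative assumptions. The gradients follow from the definitions \(L(w)=\frac{1}{2}\Vert (I-P_{C})w\Vert^2\) and \(E(w)=\frac{1}{2}\Vert (I-P_{Q})Aw\Vert^2\), namely \(\nabla L(w)=(I-P_{C})w\) and \(\nabla E(w)=A^{*}(I-P_{Q})Aw\).

```python
import numpy as np

def cq_viscosity_inertial(A, proj_C, proj_Q, f, x0, x1,
                          kappa=1.9, sigma=0.3, n_iter=200):
    """Sketch of the iteration in Corollary 3.1 (Euclidean setting).

    proj_C, proj_Q : metric projections onto C and Q (assumed available)
    f              : contraction used in the viscosity step
    The gamma_n sequence and the stopping tolerance are illustrative only.
    """
    x_prev, x = x0.astype(float), x1.astype(float)
    for n in range(1, n_iter + 1):
        gamma_n = 1.0 / n**2            # illustrative diminishing sequence
        w = x + sigma * (x - x_prev)    # inertial extrapolation
        Aw = A @ w
        rC = w - proj_C(w)              # (I - P_C) w,  so L(w) = ||rC||^2 / 2
        rQ = Aw - proj_Q(Aw)            # (I - P_Q) Aw, so E(w) = ||rQ||^2 / 2
        grad = A.T @ rQ + rC            # grad E(w) + grad L(w)
        denom = np.dot(grad, grad)      # theta(w)^2
        if denom < 1e-14:               # combined gradient vanishes; stop
            return w
        lam = kappa * 0.5 * (np.dot(rC, rC) + np.dot(rQ, rQ)) / denom
        y = w - lam * grad              # self-adaptive gradient step
        x_prev, x = x, gamma_n * f(x) + (1 - gamma_n) * y
    return x
```

With \(f\equiv 0\), the viscosity term pulls the iterates toward the minimum-norm solution, matching the choice \(f(x)=0\) used in Example 4.1 below.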

Remark 3.3

Theorem 3.1 improves the result of [Shehu and Iyiola (2017), Theorem 3.2], because strong convergence of our method is obtained without assuming firm nonexpansiveness of the operator \(I-\text{ prox}_{\tau (\cdot )}\).

4 Numerical experiments

In this section, we provide numerical experiments for the proximal SFP. For the first example, we compare Alg. 1 with Abbas et al.'s Algorithms 3.1-3.2 (shortly, AMMOAlg. 3.1-3.2) Abbas et al. (2018), Shehu et al.'s Algorithm 3.1 (SIAlg. 3.1) Shehu and Iyiola (2017), and Shehu et al.'s algorithm (AHVSDM) Shehu and Iyiola (2018). All programs are implemented in MATLAB R2017a on a desktop PC with an Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz and 8.00 GB of RAM.

In the first example, we study the proximal SFP in the case \(\text{ arg }\min F\cap A^{-1}(\text{ arg }\min G)\ne \emptyset \); in other words, we seek a minimizer \(z^*\) of F such that \(Az^*\) minimizes G, that is

$$\begin{aligned} \text{ Find }~ z^*\in H_{1}~\text{ such } \text{ that }~z^*\in \text{ arg }\underset{x \in H_{1}}{\min }~F(x) ~\text{ and }~Az^*\in \text{ arg }\underset{y \in H_{2}}{\min }~G(y), \end{aligned}$$
(4.1)

where \(F:H_{1}\rightarrow \mathbb {R}\) and \(G:H_{2}\rightarrow \mathbb {R}\) are proper, convex, and lower semi-continuous (lsc) functions, \(\text{ arg }\min ~F=\{ z^*\in H_{1} : F(z^*)\le F(x)~\forall x\in H_{1}\}\) and \(\text{ arg }\min ~G=\{ y^*\in H_{2}: G(y^*)\le G(x)~\forall x\in H_{2}\}\). The solution set is denoted by \(\varGamma \).

Example 4.1

Kesornprom and Cholamjiak (2019) Let \(H_{1}=H_{2}=\mathbb {R}^{N}\) and \(F(x)=\frac{1}{2}d_{C}^{2}(x)\), where \(C\subset \mathbb {R}^{N}\) is the unit ball, and \(G(x)=\frac{1}{2}\Vert x\Vert ^2\). Set \(Ax=x\), \(x\in \mathbb {R}^{N}\). Observe that \(0 \in \varGamma \) and hence \(\varGamma \ne \emptyset \). For AMMOAlg. 3.1-3.2 and SIAlg. 3.1, we take \(\kappa _{n}=1.9\), \(\gamma _{n}=\frac{1}{n+1}\), and \(\alpha _{n}=\frac{1}{10^4(n+1)}\). For AHVSDM, we set \(\lambda _{n}=10^{-4}\), \(\gamma _{n}=\frac{1.99}{n+1}\), \(\mu =1\), \(\tilde{F}=I\) (a contraction mapping in Shehu and Iyiola (2018), where I is the identity mapping on \(H_{1}\)), and \(\beta _{n}=\frac{0.001}{(n+1)^2}\). For Alg. 1, we adopt \(\kappa _{n}=1.9\), \(\gamma _{n}=\frac{1}{n^2}\), \(\sigma =0.3\), and \(\tilde{\tau }_{n}=\frac{1}{n^3}\). For all tests, we use the condition \( \Vert x_{n}-\text{ prox}_{\tau , F}(x_{n})\Vert +\Vert Ax_{n}- \text{ prox}_{\tau , G}(Ax_{n})\Vert <\epsilon \) to terminate the algorithms and choose \(x_{0}=[0,0,\cdots ,0]\), \(x_{1}=[1,\cdots ,1] \in \mathbb {R}^{N}\), and \(\tau =5\). To ensure that all algorithms share a common convergence point in this experiment, we set \(f(x)=0\). The results are summarized in Table 1.
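For reference, the proximity operators needed in Example 4.1 admit closed forms: \(\text{prox}_{\tau G}(x)=x/(1+\tau)\) for \(G=\frac{1}{2}\Vert x\Vert^2\), and, by the standard Moreau-envelope identity, \(\text{prox}_{\tau F}(x)=x+\frac{\tau}{1+\tau}(P_C(x)-x)\) for \(F=\frac{1}{2}d_C^2\). A minimal sketch (function names are ours):

```python
import numpy as np

def proj_unit_ball(x):
    """Projection onto the closed unit ball C in R^N."""
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def prox_F(x, tau):
    """prox of F(x) = (1/2) d_C(x)^2 via the standard identity
    prox_{tau F}(x) = x + tau/(1+tau) * (P_C(x) - x)."""
    return x + tau / (1.0 + tau) * (proj_unit_ball(x) - x)

def prox_G(x, tau):
    """prox of G(x) = (1/2)||x||^2:  prox_{tau G}(x) = x / (1 + tau)."""
    return x / (1.0 + tau)
```

These are exactly the maps evaluated in the stopping criterion \(\Vert x_{n}-\text{prox}_{\tau,F}(x_{n})\Vert +\Vert Ax_{n}-\text{prox}_{\tau,G}(Ax_{n})\Vert <\epsilon\) with \(\tau=5\).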

Table 1 Results for Example 4.1

Remark 4.1

The numerical results for Example 4.1 are reported in Table 1; we make the following observations:

  1. (1)

The iterative rule proposed in this note is efficient and easy to implement. More significantly, it converges quickly.

  2. (2)

Our proposed algorithm converges faster than some existing algorithms in terms of the number of iterations and execution time for different problem dimensions.

For the SFP, we present the following numerical examples and compare Alg. 1 with Gibali et al.'s Algorithm 3.1 (shortly, GMV Alg. 3.1) Gibali et al. (2019) and Suantai et al.'s Algorithm 3.1 (SKC Alg. 3.1) Suantai et al. (2018).

Example 4.2

Kesornprom and Cholamjiak (2019) Let \(H_{1}=H_{2}=L^2([0,1])\) with norm \(\Vert x\Vert _{L^2}=\left( \int _{0}^{1} x(t)^2dt\right) ^\frac{1}{2}\) and inner product \(\langle x,y\rangle =\int _{0}^{1} x(t)y(t)dt\), \(x,y \in L^2([0,1])\). Let \(C=\{ x\in L^2([0,1]): \Vert x\Vert _{L^2}\le 1 \}\) and \(Q=\{ x\in L^2([0,1]):\langle x, t\rangle =0\}\). Set \(Ax(t)=\frac{x(t)}{2}\). Observe that \(0 \in \varGamma \), and so \(\varGamma \ne \emptyset \). For SKC Alg. 3.1 and GMV Alg. 3.1, we fix \(\sigma =0.3\), \(\tilde{\tau }_{n}=\frac{1}{n^2}\), \(\gamma _{n}=\frac{1}{10^4n}\), \(\kappa _{n}=1.6\), \(f(x)=0.01x\), \(\beta _{n}=0.7\), and \(\sigma _{n}=\max (0,\sigma _{n}-0.1)\). For Alg. 1, we choose \(\kappa _{n}=1.6\), \(\sigma =0.3\), \(f(x)=0.01x\), \(\gamma _{n}=\frac{1}{10^4n}\), and \(\tilde{\tau }_{n}=\frac{1}{n^2}\). For López Alg. 5.1, we adopt \(\kappa _{n}=9\times 10^{-5}\) and \(\gamma _{n}=\frac{10^{-5}}{n}\). For all algorithms, we use the condition \( \Vert x_{n+1}-x_{n}\Vert _{L^2}<\epsilon \) as the stopping criterion. We choose two types of starting points:

Case 1: \(x_{0}=t^4,~x_{1}=t+1\);

Case 2: \(x_{0}=e^t,~x_{1}=3e^t\).

Before conducting our numerical experiments, we first recall that the projections onto the sets C and Q have the following closed-form expressions:

$$\begin{aligned} \ P_{C}(x)= {\left\{ \begin{array}{ll} \frac{x}{\Vert x\Vert _{L^2}}, &{} \text{ if }\ \Vert x\Vert _{L^2}>1, \\ x,&{}\text{ if }\ \Vert x\Vert _{L^2}\le 1, \end{array}\right. } ~~~~\text{ and }~~~ \ P_{Q}(x)= x-\frac{\langle t,x\rangle }{\Vert t\Vert _{L^2}^2}t. \end{aligned}$$
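These projections can be checked numerically by discretizing \(L^2([0,1])\) on a uniform grid. The grid size and the simple Riemann-sum inner product below are our illustrative choices; the hyperplane projection divides by \(\langle t,t\rangle\), i.e., the squared norm of t:

```python
import numpy as np

# Uniform grid discretization of L^2([0,1]); integrals become Riemann sums.
m = 1000
t = np.linspace(0.0, 1.0, m)
h = 1.0 / (m - 1)

def ip(x, y):
    """Discrete approximation of the inner product <x, y> = int x(s) y(s) ds."""
    return h * float(np.dot(x, y))

def proj_C(x):
    """Projection onto the unit ball C = {x : ||x||_{L^2} <= 1}."""
    nrm = np.sqrt(ip(x, x))
    return x if nrm <= 1.0 else x / nrm

def proj_Q(x):
    """Projection onto the hyperplane Q = {x : <x, t> = 0}:
    P_Q(x) = x - (<t, x> / <t, t>) t."""
    return x - (ip(t, x) / ip(t, t)) * t
```

Applying `proj_Q` once yields a function whose discrete inner product with t vanishes, as the definition of Q requires.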

The numerical results are shown in Table 2.

Table 2 Results for Example 4.2

Remark 4.2

Table 2 shows that Alg. 1 behaves better than the compared algorithms with respect to the number of iterations and execution time in the various cases of the problem.

Example 4.3

LASSO problem Sahu et al. (2020)

In this subsection, we employ the SFP to model a real-world problem, namely the recovery of a sparse signal. We use the well-known LASSO problem, which takes the following form:

$$\begin{aligned} \min \left\{ \frac{1}{2}\Vert Ax-b\Vert ^2: x \in \mathbb {R}^{N},\Vert x\Vert _{1}\le \kappa \right\} , \end{aligned}$$
(4.2)

where \(A \in \mathbb {R}^{M\times N}\), \(M<N\), \(b\in \mathbb {R}^{M}\), and \(\kappa >0\). This problem amounts to finding a sparse solution of the SFP. The matrix A is generated from a standard normal distribution with zero mean and unit variance. We generate the true sparse signal \(z^*\) from a uniform distribution on the interval \([-2,2]\) with k randomly placed nonzero entries, while the remaining entries are kept zero. The sample data are \(b=Az^*\).

Under certain conditions on the matrix A, the solution of the minimization problem (4.2) is equivalent to the \(\ell _{0}\)-norm solution of the underdetermined linear system. For the SFP, we define \(C=\{ z: \Vert z\Vert _{1} \le \kappa \}\), \(\kappa =k\), and \(Q=\{b\}\). Since the projection onto the closed convex set C does not have a closed-form solution, we employ the subgradient projection. Thus, we define the convex function \(c(z)=\Vert z\Vert _{1}-\kappa \) and denote by \(C_{n}\) the set

$$\begin{aligned} C_{n}=\{ z:c(w_{n})+\langle \varepsilon _{n}, z-w_{n}\rangle \le 0\}, \end{aligned}$$

where \(\varepsilon _{n} \in \partial c(w_{n})\). Also, the orthogonal projection of a point \(z \in \mathbb {R}^{N}\) onto \(C_{n}\) can be computed via

$$\begin{aligned} \ P_{C_{n}}(z)= {\left\{ \begin{array}{ll} z , &{}\text{ if }\ c(w_{n})+\langle \varepsilon _{n}, z-w_{n}\rangle \le 0, \\ z-\frac{c(w_{n})+\langle \varepsilon _{n}, z-w_{n}\rangle }{\Vert \varepsilon _{n}\Vert ^2}\varepsilon _{n}, &{}\text{ otherwise }. \end{array}\right. } \end{aligned}$$
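This relaxed projection onto the half-space \(C_n\) can be sketched in Python, using the componentwise sign vector of \(w_n\) as the subgradient \(\varepsilon_n\); the function name is ours:

```python
import numpy as np

def subgrad_proj_halfspace(z, w, kappa):
    """Projection of z onto C_n = {z : c(w) + <eps, z - w> <= 0},
    where c(z) = ||z||_1 - kappa and eps is a subgradient of c at w."""
    eps = np.sign(w)                     # subgradient of ||.||_1 at w
    val = (np.sum(np.abs(w)) - kappa) + eps @ (z - w)
    if val <= 0.0:                       # z already lies in C_n
        return z
    return z - val / (eps @ eps) * eps   # move onto the bounding hyperplane
```

Note that when \(w=0\) the affine value is \(-\kappa\le 0\), so the division is never reached.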

The subdifferential \(\partial c\) at \(w_{n}\) is given componentwise by

$$\begin{aligned} \partial c(w_{n})_{i}= {\left\{ \begin{array}{ll} 1, &{}\text{ if }\ w_{n,i}>0, \\ {[-1,1]}, &{}\text{ if }\ w_{n,i}=0, \\ -1, &{}\text{ if }\ w_{n,i}<0. \end{array}\right. } \end{aligned}$$

To implement our method in this example, we initialize the algorithms at the origin and define

$$\begin{aligned} E_{n}=\frac{\Vert x_{n}-z^*\Vert }{\max \{1,\Vert x_{n}\Vert \}}. \end{aligned}$$

We test the numerical behavior of all algorithms with the same iteration error \(E_{n}\) for different \(M\), \(N\), and k, limit the number of iterations to 8000, and report \(E_{n}\) in Table 3. The second problem is the recovery of the signal \(z^*\) when \(M=1440\), \(N=6144\), and \(k=180\), where the \(M \times N\) matrix A is randomly generated with independent samples from a standard Gaussian distribution. In more detail, the original signal \(z^*\) contains 180 randomly placed \(\pm 1\) spikes. The iterative process is started with \(x_{0}=0\), and the following mean squared error is used to measure the recovery accuracy:

$$\begin{aligned} \text{ MSE }=\frac{1}{N}\Vert x_{n}-z^*\Vert ^2. \end{aligned}$$
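The error measures and the random test instances described above can be sketched as follows; the helper names and the fixed random seed are our assumptions, not part of the original experiments:

```python
import numpy as np

def relative_error(x, z_star):
    """E_n = ||x_n - z*|| / max(1, ||x_n||)."""
    return np.linalg.norm(x - z_star) / max(1.0, np.linalg.norm(x))

def mse(x, z_star):
    """MSE = (1/N) ||x_n - z*||^2."""
    return np.linalg.norm(x - z_star) ** 2 / x.size

def sparse_instance(M, N, k, rng=np.random.default_rng(0)):
    """Random LASSO/SFP instance as in Example 4.3: Gaussian A,
    k-sparse z* with nonzero entries uniform in [-2, 2], b = A z*."""
    A = rng.standard_normal((M, N))
    z = np.zeros(N)
    support = rng.choice(N, size=k, replace=False)
    z[support] = rng.uniform(-2.0, 2.0, size=k)
    return A, z, A @ z
```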

For all algorithms, we fix \(f(x)=0.0005x\), \(\sigma =0.9\), \(\tilde{\tau }_{n}=\frac{1}{n^5}\), and \(\gamma _{n}=\frac{1}{10^5n}\). For SKC Alg. 3.1, we take \(\kappa _{n}=0.02\). For GMV Alg. 3.1, we adopt \(\kappa _{n}=1.9\) and \(\beta _{n}=0.7\). For Alg. 1, we set \(\kappa _{n}=1.9\). For López Alg. 5.1, we choose \(\kappa _{n}=1.9\) and \(\gamma _{n}=\frac{10^{-7}}{n}\).

Table 3 Results for Example 4.3
Table 4 Results of Algorithm 1 with different values of \(\kappa _{n}\) for Example 4.3

Remark 4.3

It can be observed from Tables 3 and 4 that the proposed algorithm is efficient. Moreover, our method requires less CPU time than some strongly convergent algorithms in the literature to obtain a smaller error \(E_{n}\) in the different cases. We also find that our proposed algorithm performs best when the parameter \(\kappa _{n}\) is 1.99.

Fig. 1
figure 1

Comparison of signal processing

The recovery results of all algorithms are shown in Fig. 1, which displays the original signal, the restored signals with their mean-squared error (MSE), and the computing time required by the iterative process.

Remark 4.4

As can be observed from Fig. 1, the signal \(z^*\) is estimated with a fair degree of accuracy by Algorithm 1. For the same number of iterations, Algorithm 1 requires less execution time but yields a slightly larger mean-squared error.

Example 4.4

Image deblurring problem Saejung and Yotkaew (2012) The problem we consider here is an image deblurring problem. Given a convolution matrix \(A \in \mathbb {R}^{m\times n}\) and an unknown image \(z \in \mathbb {R}^{n}\), we observe \(b \in \mathbb {R}^{m}\), which can be viewed as the known degraded observation. Including the unknown additive random noise \(\nu \in \mathbb {R}^{m}\), we obtain the following image recovery problem:

$$\begin{aligned} Az=b+\nu . \end{aligned}$$
(4.3)

This problem clearly fits the setting of the SFP with \(C=\mathbb {R}^{n}\); if no noise is added to the observed image b, then \(Q=\{b\}\) is a singleton; otherwise, \(Q=\{x\in \mathbb {R}^{m} : \Vert x-(b+\nu )\Vert \le \epsilon \}\) for small enough \(\epsilon >0\). In this example, we compare Algorithm 1 with López Alg. 5.1. The test image was corrupted as in He et al. (2016). More precisely, each image was degraded by a \(9\times 9\) Gaussian blur with standard deviation 4 and corrupted by additive zero-mean white Gaussian noise with standard deviation \(10^{-3}\). To measure the quality of the recovered image, we use the following signal-to-noise ratio:

$$\begin{aligned} \mathrm{SNR}=20\log _{10}\frac{\Vert z\Vert }{\Vert \bar{z}-z\Vert }, \end{aligned}$$
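The SNR measure is straightforward to compute; a one-line sketch (the function name is ours):

```python
import numpy as np

def snr_db(z, z_bar):
    """SNR = 20 log10( ||z|| / ||z_bar - z|| ) in dB; higher means a
    better recovery of the original image z by the estimate z_bar."""
    return 20.0 * np.log10(np.linalg.norm(z) / np.linalg.norm(z_bar - z))
```

For instance, an estimate whose error norm is 1% of the signal norm gives 40 dB.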

where z is the original image and \(\bar{z}\) is the recovered image. Clearly, the higher the SNR value, the better the image is recovered. For Alg. 1, we take \(\kappa _{n}=1.6\), \(\sigma =0.3\), \(f(x)=0.01x\), \(\gamma _{n}=\frac{21}{100n}\), and \(\tilde{\tau }_{n}=\frac{1}{n^2}\). For López Alg. 5.1, we adopt \(\kappa _{n}=5\times 10^{-4}\) and \(\gamma _{n}=\frac{21}{100n}\). For all algorithms, we limit the number of iterations to 100 and report the numerical results in Table 5 and Figs. 2-3.

Remark 4.5

As shown in Table 5 and Figs. 2-3, the two methods require the same number of iterations to recover the images. Concretely, Alg. 1 obtains a higher SNR than López Alg. 5.1 at the cost of a slightly longer execution time.

Table 5 Results of all algorithms for Example 4.4
Fig. 2
figure 2

Original image (a) and observed image (b) for Example 4.4

Fig. 3
figure 3

Recovered images by Alg. 1 and López Alg. 5.1 in Example 4.4

5 Conclusions

In this paper, we obtain a strong convergence result for proximal split feasibility problems with nonexpansive mappings. We modify the algorithm of Shehu and Iyiola (2017) by combining it with the viscosity-type algorithm Moudafi (2000), the inertial method, and the split proximal algorithm with a self-adaptive stepsize. As practical applications, we consider signal recovery and image deblurring problems. Preliminary numerical experiments confirm the effectiveness of the proposed algorithm in practice.