1 Introduction

It is well known that the split feasibility problem (SFP) plays a key role in signal processing Byrne (2004) and medical image reconstruction Byrne (2002). Therefore, many numerical algorithms have been developed to solve the SFP; see Byrne (2004, 2002); Censor et al. (2005); Dong et al. (2020); López et al. (2012); Reich et al. (2020); Sahu et al. (2020) and the references therein.

The original model of the SFP was introduced by Censor and Elfving Censor and Elfving (1994) for modeling inverse problems, and a classical method for solving it is Byrne’s CQ algorithm Byrne (2004, 2002). The SFP can easily be obtained from the proximal split feasibility problem (proximal SFP), which generalizes the proximal split minimization problems in Moudafi and Thakur (2014). For the proximal SFP, numerous iterative algorithms have been studied with regard to their convergence properties; see Abbas et al. (2018); Moudafi and Thakur (2014); Shehu and Iyiola (2017, 2018); Wang and Xu (2014) and the references therein. To be specific, Moudafi and Thakur Moudafi and Thakur (2014) established weak convergence of a split proximal algorithm whose self-adaptive stepsize is not determined by an Armijo-like rule Dong et al. (2018); Gibali et al. (2018); Qu and Xiu (2005); Shehu and Gibali (2020); such rules often incur additional computational cost. Building on the inertial idea (a procedure for accelerating convergence; see Kesornprom and Cholamjiak (2019); Sahu et al. (2020); Suantai et al. (2018); Shehu et al. (2020); Iyiola et al. (2018)), Shehu and Iyiola Shehu and Iyiola (2017) modified Moudafi and Thakur’s algorithm with an inertial technique and obtained weak convergence in real Hilbert spaces. To obtain strong convergence, various related algorithms have been proposed in recent years. For instance, Abbas et al. Abbas et al. (2018) presented two different one-step methods; Shehu and Iyiola Shehu and Iyiola (2017, 2018) combined Mann-type, accelerated hybrid viscosity, and steepest-descent methods to ensure it; Wang and Xu Wang and Xu (2014) proposed the proximal gradient method. However, the strong convergence of the algorithm in Shehu and Iyiola (2017) with new inertial effects has yet to be established.
We also observe that Shehu and Iyiola’s algorithm Shehu and Iyiola (2017) does not require an estimate of the operator norm, but its convergence analysis requires the firm nonexpansiveness of the operators involved. These observations lead us to the following question:

Question: Can we prove a strong convergence result for the proximal SFP by employing a new modification of the inertial split proximal algorithm Shehu and Iyiola (2017) under conditions weaker than firm nonexpansiveness of the operators involved?

Inspired and motivated by the works in Moudafi (2000); Shehu and Iyiola (2017), in this paper, we propose an iterative algorithm for solving the proximal SFP. The algorithm combines the inertial method, the viscosity-type algorithm, and the split proximal algorithm with a self-adaptive stepsize. Strong convergence of the proposed algorithm is established without firm nonexpansiveness of the mappings involved. We also provide an application of our main results to solving split feasibility problems in Hilbert spaces. Finally, three numerical examples illustrate the effectiveness of the proposed algorithm.

The paper is organized as follows. In Sect. 2, we recall some basic concepts and lemmas used in the subsequent sections. The main results are presented in Sect. 3. Numerical experiments are provided in Sect. 4. We give some conclusions in the final section.

2 Preliminaries

The symbol \(\rightharpoonup \) stands for weak convergence, and the symbol \(\rightarrow \) represents strong convergence. Let \(H_{1}\) and \(H_{2}\) be real Hilbert spaces. For a proper lower semi-continuous (lsc) convex function \(F:H_{1}\rightarrow ] {-\infty ,+\infty }]\), its domain is denoted by \(\mathrm{dom}\,F\), i.e., \({\mathrm{dom}\,F}:=\{x\in H_{1}: F(x)<\infty \}\). The proximal operator \({\mathrm{prox}}_{\tau G}: H_{2} \rightarrow H_{2}\) is defined by

$$\begin{aligned}{\mathrm{prox}}_{\tau G}\left( x \right) := \mathop {\arg \min }\limits _{y \in H_{2}} \left\{ {G\left( y \right) + \frac{1}{{2\tau }}{{\left\| {x -y } \right\| }^2}} \right\} ,\end{aligned}$$

where \(\tau >0\) and \(G:H_{2}\rightarrow \mathbb {R}\cup \left\{ { + \infty } \right\} \) is a proper, convex, and lower semi-continuous (lsc) function.
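As an illustrative special case (our example, not from the original text): when G is a multiple of the \(\ell _1\)-norm, \({\mathrm{prox}}_{\tau G}\) has the well-known closed form of componentwise soft-thresholding. A minimal NumPy sketch, with function names of our choosing:

```python
import numpy as np

def prox_l1(x, tau):
    """Proximal operator of G(y) = ||y||_1 with parameter tau:
    prox_{tau G}(x) = argmin_y { ||y||_1 + (1/(2 tau)) ||x - y||^2 },
    which is the componentwise soft-thresholding map."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([3.0, -0.5, 1.2])
print(prox_l1(x, 1.0))  # each entry shrunk toward 0 by tau; small entries vanish
```

Soft-thresholding of this kind is, for instance, the backbone of ISTA-type methods for \(\ell _1\)-regularized least squares.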

In view of Combettes and Hirstoaga (2005), the proximal mapping \(\mathrm{prox}_{\tau G}\) is firmly nonexpansive, that is

$$\begin{aligned} \Vert {\mathrm{prox}}_{\tau G}(x)-{\mathrm{prox}}_{\tau G}\left( y\right) \Vert ^2\le \Vert x-y\Vert ^2 -\Vert \left( x-{\mathrm{prox}}_{\tau G}(x)\right) -\left( y -{\mathrm{prox}}_{\tau G}(y)\right) \Vert ^2,~\forall ~x,~y\in H_{2}, \end{aligned}$$

and its fixed point set is the set of minimizers of G. Let C be a nonempty, closed, and convex subset of \(H_{1}\); then, the orthogonal projection of x onto C is defined by

$$\begin{aligned} P_{C}x := \mathop {\arg \min }\limits _{y \in C}\left\| {x -y } \right\| ,~\forall ~x\in H_{1}. \end{aligned}$$
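When C is a closed Euclidean ball, \(P_{C}\) has a simple closed form; the following NumPy sketch (ours, with illustrative names) implements it and can be checked against the variational characterization \(\langle x-P_{C}x, y-P_{C}x\rangle \le 0\) for \(y\in C\):

```python
import numpy as np

def project_ball(x, center, radius):
    """Orthogonal projection P_C of x onto the closed ball
    C = {y : ||y - center|| <= radius}."""
    d = x - center
    nd = np.linalg.norm(d)
    if nd <= radius:
        return x.copy()                     # x already lies in C
    return center + (radius / nd) * d       # otherwise, scale back to the boundary

x = np.array([3.0, 4.0])
p = project_ball(x, np.zeros(2), 1.0)
print(p)  # the projection lies on the unit sphere
```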

Definition 2.1

Let \(h:H_{1}\rightarrow H_{1}\) be a mapping, and then

  1. (i)

    h is called nonexpansive if

    $$\begin{aligned} \Vert hx-hy\Vert \le \Vert x-y\Vert ,~\forall ~x,~y \in H_{1}. \end{aligned}$$
  2. (ii)

    h is said to be firmly nonexpansive if

    $$\begin{aligned} \langle hx-hy, x-y\rangle \ge \Vert hx-hy\Vert ^2,~\forall ~x,~y \in H_{1}. \end{aligned}$$
  3. (iii)

    Let \(D\subset H_{1}\) be a set. A function \(h:D\rightarrow \mathbb {R}\cup \left\{ { + \infty }\right\} \) is said to be weakly lower semi-continuous if, whenever \(x_{n}\rightharpoonup x \), the following holds:

    $$\begin{aligned} \mathop {\liminf }\limits _{n \rightarrow \infty }h(x_{n})\ge h(x). \end{aligned}$$

Lemma 2.1

Let \(\nu \in \left] {0,1} \right[\). Then, for all \(x,y,z\in H_{1}\):

  1. (i)

    \(\Vert \nu x+(1-\nu )y\Vert ^2=\nu \Vert x\Vert ^2+(1-\nu )\Vert y\Vert ^2-\nu (1-\nu )\Vert x-y\Vert ^2;\)

  2. (ii)

    \(\Vert x+y\Vert ^2\le \Vert x\Vert ^2+2\langle y, x+y\rangle ;\)

  3. (iii)

    \(\langle x-y, x-z\rangle =\frac{1}{2}\Vert x-y\Vert ^2+\frac{1}{2}\Vert x-z\Vert ^2-\frac{1}{2}\Vert y-z\Vert ^2.\)

Lemma 2.2

Goebel and Reich (1984) Let \(C \subset H_{1}\) be a nonempty closed convex set and let \(P_{C}\) be the metric projection from \(H_{1}\) to C. Then, the following statements hold:

  1. (i)

    \(\langle x-P_{C}x, y-P_{C}x\rangle \le 0~for~all~ x \in H_{1}~and~y \in C;\)

  2. (ii)

    \(\Vert P_{C}x-P_{C}y\Vert \le \Vert x-y\Vert ~for~all~ x,~y \in H_{1}.\)

Lemma 2.3

Saejung and Yotkaew (2012) Suppose that \(\{s_{n} \}_{n=1}^{\infty }\) is a sequence of nonnegative real numbers, such that

$$\begin{aligned} s_{n+1}\le (1-\alpha _{n})s_{n}+\alpha _{n}\delta _{n},~\forall ~n\ge 1, \end{aligned}$$

where

  1. (i)

    \(\{ \alpha _{n} \}_{n=1}^{\infty }\subset \left] {0,1} \right[\) and \( \sum \limits _{n = 1}^\infty { \alpha _{n}=\infty }\),

  2. (ii)

    \(\mathop {\limsup }\limits _{k\rightarrow \infty }\delta _{n_{k}}\le 0\) for every subsequence \(\{s_{n_{k}}\}\) of \(\{s_{n}\}\) fulfilling \(\mathop {\liminf }\limits _{k\rightarrow \infty }( s_{n_{k}+1}-s_{n_{k}})\ge 0\).

    Then, \( \mathop {\lim }\limits _{n\rightarrow \infty } s_{n}=0.\)
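As a quick numerical illustration of the lemma (ours, not part of the original; the concrete sequences are arbitrary choices satisfying (i) and (ii)), take \(\alpha _{n}=1/(n+1)\), so \(\sum \alpha _{n}=\infty \), and \(\delta _{n}=1/n\rightarrow 0\), so condition (ii) holds along every subsequence:

```python
# Illustration of Lemma 2.3: s_{n+1} <= (1 - alpha_n) s_n + alpha_n delta_n.
# With alpha_n = 1/(n+1) (sum diverges) and delta_n = 1/n -> 0,
# the recursion drives s_n to 0 regardless of the starting value.
s = 10.0
for n in range(1, 200001):
    alpha = 1.0 / (n + 1)
    delta = 1.0 / n
    s = (1 - alpha) * s + alpha * delta
print(s)  # close to 0 for large n
```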

3 Strong convergence

In this section, we first let \(H_{1}\) and \(H_{2}\) be real Hilbert spaces and \(A:H_{1}\rightarrow H_{2}\) be a bounded linear operator with its adjoint \(A^*,\) \(F:H_{1}\rightarrow \mathbb {R}\cup \left\{ { + \infty } \right\} \) and \(G:H_{2}\rightarrow \mathbb {R}\cup \left\{ { + \infty } \right\} \) be proper, convex, and lower semi-continuous (lsc) functions. Now, we consider the following proximal SFP:

$$\begin{aligned} {\mathrm{Find}}~z^* \in H_{1}~{\mathrm{such~that}}~z^*\in \mathop {\arg \min }\limits _{x\in H_{1}}\left\{ F(x)+G_{\tau }(Ax)\right\} , \end{aligned}$$

where \(\tau >0\) and \(G_{\tau }(x):=\mathop {\min }\limits _{y \in H_{2}}\left\{ {G\left( y\right) + \frac{1}{{2\tau }}{{\left\| {y -x } \right\| }^2}} \right\} \) is the Moreau–Yosida approximate of the function G with parameter \(\tau \). If such a point exists, the solution set is denoted by \(\varGamma \). As in Abbas et al. (2018); Shehu and Iyiola (2017), we adopt the following definitions, used throughout the rest of the paper.

Given any \(\tau >0\) and \(x \in H_{1}\), we define

$$\begin{aligned} \begin{array}{*{20}{l}} \begin{array}{l} E(x)=\frac{1}{2}\Vert (I-\text{ prox}_{\tau G})Ax\Vert ^2;\\ L(x)=\frac{1}{2}\Vert (I-\text{ prox}_{\tau F})x\Vert ^2~{\mathrm{and}}~\\ \theta (x)=\sqrt{\Vert \nabla E(x)+\nabla L(x)\Vert ^2}. \end{array} \end{array}\end{aligned}$$

Then, the gradients \(\nabla E\) and \(\nabla L\) of E and L, respectively, are Lipschitz continuous and given by

$$\begin{aligned} \begin{array}{*{20}{l}} \begin{array}{l} \nabla E(x)=A^*(I-{\mathrm{prox}}_{\tau G})Ax~{\mathrm{and}}~\\ \nabla L(x)=(I-{\mathrm{prox}}_{\tau F})x, \end{array} \end{array}\end{aligned}$$

whose Lipschitz constants are \(\Vert A\Vert ^2\) and 1, respectively. Before describing our algorithm, we list the conditions required for the convergence analysis.

  1. (A1)

    The solution set of the proximal SFP is nonempty, that is, \(\varGamma \ne \emptyset .\)

  2. (A2)

    Let \( \{\tilde{\tau }_{n}\} \subset [{0, \tilde{\theta }}[\) with \(\tilde{\theta }>0\) be a positive sequence, such that \( \tilde{\tau }_{n}=o(\gamma _{n}) \), i.e., \(\mathop {\lim }\limits _{n\rightarrow \infty } \frac{\tilde{\tau }_{n}}{\gamma _{n}}=0\) where the sequence \(\{ \gamma _{n}\}\subset \left]{0,1}\right[\) fulfills \( \sum \limits _{n = 1}^\infty { \gamma _{n}=\infty }~\mathrm{and}~ \mathop {\lim }\limits _{n\rightarrow \infty }\gamma _{n}=0.\)

  3. (A3)

    The mapping \(f: H_{1}\rightarrow H_{1}\) is \(\tilde{\rho }\)-contractive with \(\tilde{\rho }\in \left[ {0,1} \right[ \).

  4. (A4)

    \(\mathop {\inf }\limits _{n\ge 1} \kappa _{n}(2-\kappa _{n})>0\).
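To make E, L, \(\theta \), and the gradient formulas concrete, the following NumPy sketch of ours specializes them to the SFP setting of Corollary 3.1 below, where the proximal mappings reduce to projections \(P_{C}\) and \(P_{Q}\) onto balls; the matrix A and the sets are illustrative choices of ours:

```python
import numpy as np

def project_ball(x, center, radius):
    """Projection onto the closed ball {y : ||y - center|| <= radius}."""
    d = x - center
    nd = np.linalg.norm(d)
    return x if nd <= radius else center + (radius / nd) * d

A = np.array([[2.0, 1.0], [1.0, 3.0]])
cC, rC = np.zeros(2), 1.0                # C = closed unit ball, prox_{tau F} = P_C
cQ, rQ = A @ np.array([0.2, 0.1]), 1.0   # Q = ball around A z*, so z* is feasible

def E(x):  return 0.5 * np.linalg.norm(A @ x - project_ball(A @ x, cQ, rQ)) ** 2
def L(x):  return 0.5 * np.linalg.norm(x - project_ball(x, cC, rC)) ** 2
def gradE(x): return A.T @ (A @ x - project_ball(A @ x, cQ, rQ))   # A*(I - P_Q)Ax
def gradL(x): return x - project_ball(x, cC, rC)                    # (I - P_C)x
def theta(x): return np.linalg.norm(gradE(x) + gradL(x))

z = np.array([0.2, 0.1])          # a feasible point: z in C and Az in Q
print(E(z) + L(z), theta(z))      # both vanish at a solution
```

At a feasible point, E, L, and \(\theta \) all vanish, which is exactly the situation described in Remark 3.1 below.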

Below, our iterative scheme is stated in Algorithm 1.

Algorithm 1 (displayed as a figure)

Remark 3.1

In Algorithm 1, if \(\varGamma \ne \emptyset \) and \(\nabla E(w_{n})=\nabla L(w_{n})=0\) and \(w_{n}=x_{n}\), then \(x_{n}\in \varGamma \).

Proof

  If \( A^*(I-\text{ prox}_{\tau G})Aw_{n}=(I-\text{ prox}_{\tau F})w_{n}=0\), then \(w_{n}\in \varGamma .\) Additionally, \(A^*(I-\text{ prox}_{\tau G})Aw_{n}=(I-\text{ prox}_{\tau F})w_{n}=0\) yields from (3.1) of Algorithm 1 that \(y_{n}=w_{n}.\) This, together with \(w_{n}=x_{n}\), implies that \(y_{n}=x_{n} \in \varGamma \). Thus, \(x_{n}\in \varGamma .\) \(\square \)

Remark 3.2

In Algorithm 1, the inertial parameter \(\sigma _{n}\) is chosen as

$$\begin{aligned} \sigma _{n}= {\left\{ \begin{array}{ll} \min \left\{ {\frac{\tilde{\tau }_{n}}{\Vert x_{n}-x_{n-1}\Vert },\sigma }\right\} &{} {\mathrm{if}}\ x_{n}\ne x_{n-1}, \\ \sigma , &{}{\mathrm{otherwise}}. \end{array}\right. } \end{aligned}$$
(3.3)
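Since Algorithm 1 itself is displayed as a figure, the following NumPy sketch is our reconstruction of the scheme, pieced together from its steps (3.1)–(3.2), the inertial rule (3.3), and Corollary 3.1, specialized to the SFP case with projections onto balls. All concrete data and parameter choices (A, C, Q, \(\gamma _{n}=1/(n+1)\), \(\tilde{\tau }_{n}=\gamma _{n}^2\), \(\kappa _{n}\equiv 1\), \(\sigma =0.5\), \(f(x)=x/2\)) are ours and merely satisfy Conditions (A1)–(A4):

```python
import numpy as np

def project_ball(x, center, radius):
    """Projection onto the closed ball {y : ||y - center|| <= radius}."""
    d = x - center
    nd = np.linalg.norm(d)
    return x if nd <= radius else center + (radius / nd) * d

# Illustrative SFP data (our choices): C and Q are balls, so Gamma is nonempty.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
cC, rC = np.zeros(2), 1.0                    # C = closed unit ball
cQ, rQ = A @ np.array([0.2, 0.1]), 1.0       # Q = ball around A z* for feasible z*

E = lambda x: 0.5 * np.linalg.norm(A @ x - project_ball(A @ x, cQ, rQ)) ** 2
L = lambda x: 0.5 * np.linalg.norm(x - project_ball(x, cC, rC)) ** 2
gradE = lambda x: A.T @ (A @ x - project_ball(A @ x, cQ, rQ))
gradL = lambda x: x - project_ball(x, cC, rC)
f = lambda x: 0.5 * x                        # contraction with rho = 0.5, (A3)

sigma, kappa = 0.5, 1.0                      # kappa in ]0,2[ gives (A4)
x_prev = x = np.array([2.0, 2.0])
for n in range(1, 3000):
    gamma = 1.0 / (n + 1)                    # (A2): gamma_n -> 0, sum = infinity
    tau_t = gamma ** 2                       # tilde tau_n = o(gamma_n)
    dx = np.linalg.norm(x - x_prev)
    sig = min(tau_t / dx, sigma) if dx > 0 else sigma   # inertial rule (3.3)
    w = x + sig * (x - x_prev)               # inertial extrapolation
    g = gradE(w) + gradL(w)
    th2 = g @ g                              # theta(w)^2
    # self-adaptive stepsize lambda_n = kappa_n (E + L) / theta^2 (skipped if 0)
    y = w if th2 == 0 else w - kappa * (E(w) + L(w)) / th2 * g
    x_prev, x = x, gamma * f(x) + (1 - gamma) * y       # viscosity step
print(E(x) + L(x))  # feasibility residual, driven toward zero
```

The stepsize needs no estimate of \(\Vert A\Vert \): it is computed from the current residuals, in line with the self-adaptive rule of the paper.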

In what follows, the proof of the main theorem does not involve the firm nonexpansiveness of the operator \(I-\text{ prox}_{\tau (\cdot )}\).

Theorem 3.1

Suppose that Conditions (A1)–(A4) hold. Then, the sequence \(\{x_{n}\}\) generated by Algorithm 1 converges strongly to a point \(z \in \varGamma \), where \(z=P_{\varGamma }\circ f(z)\).

Proof

Let \(z\in \varGamma \). Since the minimizers of any function are exactly the fixed points of its proximal mapping, z solves the proximal SFP. Furthermore, by Lemma 2.1 (iii), we derive

$$\begin{aligned}&\;{\langle w_{n}-z, -\nabla E(w_{n})\rangle }\\&\quad = \langle w_{n}-z, A^*(\text{ prox}_{\tau G}-I)Aw_{n}\rangle \\&\quad = \langle Aw_{n}-Az, (\text{ prox}_{\tau G}-I)Aw_{n}\rangle \\&\quad = \langle Aw_{n}-\text{ prox}_{\tau G}Aw_{n}+\text{ prox}_{\tau G}Aw_{n}-Az, (\text{ prox}_{\tau G}-I)Aw_{n}\rangle \\&\quad = \langle \text{ prox}_{\tau G}Aw_{n}-Az, \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\rangle -\Vert \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\Vert ^2\\&\quad = \frac{1}{2}\left( \Vert \text{ prox}_{\tau G}Aw_{n}-Az\Vert ^2+\Vert \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\Vert ^2 -\Vert Aw_{n}-Az\Vert ^2 \right) \\&\qquad - \Vert \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\Vert ^2\\&\quad \le -\frac{1}{2}\Vert \text{ prox}_{\tau G}Aw_{n}-Aw_{n}\Vert ^2\\&\quad = -E(w_{n}), \end{aligned}$$

and

$$\begin{aligned} \;{\langle w_{n}-z, -\nabla L(w_{n})\rangle }= & {} \langle w_{n}-z, (\text{ prox}_{\tau F}-I)w_{n}\rangle \\= & {} \langle w_{n}-\text{ prox}_{\tau F}w_{n}+\text{ prox}_{\tau F}w_{n}-z, (\text{ prox}_{\tau F}-I)w_{n}\rangle \\= & {} \langle \text{ prox}_{\tau F}w_{n}-z, \text{ prox}_{\tau F}w_{n}-w_{n}\rangle -\Vert \text{ prox}_{\tau F}w_{n}-w_{n}\Vert ^2\\= & {} \frac{1}{2}\left( \Vert \text{ prox}_{\tau F}w_{n}-z\Vert ^2+\Vert \text{ prox}_{\tau F}w_{n}-w_{n}\Vert ^2 -\Vert w_{n}-z\Vert ^2\right) \\&- \Vert \text{ prox}_{\tau F}w_{n}-w_{n}\Vert ^2\\\le & {} -\frac{1}{2}\Vert \text{ prox}_{\tau F}w_{n}-w_{n}\Vert ^2\\= & {} -L(w_{n}); \end{aligned}$$

combining with (3.1) and (3.2) yields that

$$\begin{aligned} {\Vert y_{n}-z\Vert ^2}= & {} \Vert w_{n}-\lambda _{n}\left( \nabla E(w_{n})+\nabla L(w_{n})\right) -z\Vert ^2 \nonumber \\= & {} \Vert w_{n}-z\Vert ^2+\lambda _{n}^{2}\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2 \nonumber \\&+ 2\lambda _{n}\langle w_{n}-z, -(\nabla E(w_{n})+\nabla L(w_{n})) \rangle \nonumber \\= & {} \Vert w_{n}-z\Vert ^2+\lambda _{n}^{2}\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2 \nonumber \\&+ 2\lambda _{n}\langle w_{n}-z, -\nabla E(w_{n})\rangle +2\lambda _{n}\langle w_{n}-z, -\nabla L(w_{n}) \rangle \nonumber \\\le & {} \Vert w_{n}-z\Vert ^2+\lambda _{n}^{2}\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2 -2\lambda _{n}(E(w_{n})+L(w_{n}))\nonumber \\= & {} \Vert w_{n}-z\Vert ^2+\kappa _{n}^2\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^4} (\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2)\nonumber \\&-2\kappa _{n}\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}\nonumber \\= & {} \Vert w_{n}-z\Vert ^2+\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}. \end{aligned}$$
(3.4)

After arrangement, we have

$$\begin{aligned} \Vert y_{n}-z\Vert ^2\le \Vert w_{n}-z\Vert ^2+\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}. \end{aligned}$$
(3.5)

From (A4) and (3.5), we have

$$\begin{aligned} \Vert y_{n}-z\Vert \le \Vert w_{n}-z\Vert . \end{aligned}$$
(3.6)

By the definition of \(w_{n}\), we get

$$\begin{aligned} {\Vert w_{n}-z\Vert }= & {} \Vert x_{n}+\sigma _{n}(x_{n}-x_{n-1})-z\Vert \nonumber \\\le & {} \Vert x_{n}-z\Vert +\sigma _{n}\Vert x_{n}-x_{n-1}\Vert \nonumber \\= & {} \Vert x_{n}-z\Vert +\gamma _{n} \cdot \frac{\sigma _{n}}{\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert . \end{aligned}$$
(3.7)

According to (3.3), we have \(\sigma _{n}\Vert x_{n}-x_{n-1}\Vert \le \tilde{\tau }_{n}\) \(\forall n\ge 1\), which, together with \( \mathop {\lim }\limits _{n\rightarrow \infty }\frac{\tilde{\tau }_{n}}{\gamma _{n}}=0\), yields that

$$\begin{aligned} \mathop {\lim }\limits _{n\rightarrow \infty }\frac{\sigma _{n}}{\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert \le \mathop {\lim }\limits _{n\rightarrow \infty }\frac{\tilde{\tau }_{n}}{\gamma _{n}}=0. \end{aligned}$$

Therefore, there is a constant \(M_{1}> 0\), such that

$$\begin{aligned} \frac{\sigma _{n}}{\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert \le M_{1},~\forall n\ge 1, \end{aligned}$$

which, along with (3.6) and (3.7), yields that

$$\begin{aligned} \Vert y_{n}-z\Vert \le \Vert w_{n}-z\Vert \le \Vert x_{n}-z\Vert +\gamma _{n}M_{1}. \end{aligned}$$
(3.8)

From (3.1) and (3.8), it follows that:

$$\begin{aligned} {\Vert x_{n+1}-z\Vert }= & {} \Vert \gamma _{n}f(x_{n})+(1-\gamma _{n})y_{n}-z\Vert \\= & {} \Vert \gamma _{n}(f(x_{n})-z)+(1-\gamma _{n})(y_{n}-z)\Vert \\\le & {} \gamma _{n}\Vert f(x_{n})-z\Vert +(1-\gamma _{n})\Vert y_{n}-z\Vert \\= & {} \gamma _{n}\Vert f(x_{n})-f(z)+f(z)-z\Vert +(1-\gamma _{n})\Vert y_{n}-z\Vert \\\le & {} \gamma _{n}\tilde{\rho }\Vert x_{n}-z\Vert +\gamma _{n}\Vert f(z)-z\Vert +(1-\gamma _{n})\Vert y_{n}-z\Vert \\\le & {} \gamma _{n}\tilde{\rho }\Vert x_{n}-z\Vert +(1-\gamma _{n})(\Vert x_{n}-z\Vert +\gamma _{n}M_{1})+\gamma _{n}\Vert f(z)-z\Vert \\\le & {} \left( 1-\gamma _{n}\left( 1-\tilde{\rho }\right) \right) \Vert x_{n}-z\Vert +\gamma _{n}(1-\tilde{\rho })\frac{\Vert f(z)-z\Vert +M_{1}}{1-\tilde{\rho }}\\\le & {} \max \left\{ \Vert x_{n}-z\Vert , \frac{ M_{1}+ \Vert f(z)-z\Vert }{1-\tilde{\rho }}\right\} \\\le & {} \cdots \le \max \left\{ \Vert x_{1}-z\Vert , \frac{ M_{1}+ \Vert f(z)-z\Vert }{1-\tilde{\rho }}\right\} . \end{aligned}$$

This means that the sequence \(\left\{ x_{n}\right\} \) is bounded. Hence, the sequences \(\{y_{n}\}\), \(\left\{ f(x_{n})\right\} \) and \(\{w_{n}\}\) are also bounded.

By (3.1) and the convexity of \(\Vert \cdot \Vert ^2\), we get that

$$\begin{aligned} {\Vert x_{n+1}-z\Vert ^2}= & {} \Vert \gamma _{n}f(x_{n})+(1-\gamma _{n})y_{n}-z\Vert ^2\\= & {} \Vert \gamma _{n}(f(x_{n})-z)+(1-\gamma _{n})(y_{n}-z)\Vert ^2\\\le & {} \gamma _{n}\Vert f(x_{n})-z\Vert ^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\\le & {} \gamma _{n}(\Vert f(x_{n})-f(z)\Vert +\Vert f(z)-z\Vert )^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\\le & {} \gamma _{n}(\tilde{\rho }\Vert x_{n}-z\Vert +\Vert f(z)-z\Vert )^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\\le & {} \gamma _{n}(\Vert x_{n}-z\Vert +\Vert f(z)-z\Vert )^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\= & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+\gamma _{n}\left( \Vert f(z)-z\Vert ^2+2\Vert x_{n}-z\Vert \Vert f(z)-z\Vert \right) \\&+ (1-\gamma _{n})\Vert y_{n}-z\Vert ^2\\\le & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2 +\gamma _{n}M_{2}; \end{aligned}$$

for some \(M_{2}>0\). Combining with (3.5), we derive that

$$\begin{aligned} {\Vert x_{n+1}-z\Vert ^2}\le & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert w_{n}-z\Vert ^2\nonumber \\&+ (1-\gamma _{n})\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2} +\gamma _{n}M_{2}. \end{aligned}$$
(3.9)

Substituting (3.8) into (3.9), then there exists \(M_{3}>0\), such that

$$\begin{aligned} {\Vert x_{n+1}-z\Vert ^2}\le & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})(\Vert x_{n}-z\Vert +\gamma _{n}M_{1})^2\\&+ (1-\gamma _{n})\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2} +\gamma _{n}M_{2}\\= & {} \gamma _{n}\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})(\gamma _{n}M_{1})^2\\&+ 2(1-\gamma _{n})\gamma _{n}M_{1}\Vert x_{n}-z\Vert + (1-\gamma _{n})\kappa _{n}(\kappa _{n}-2)\\&\times \frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}+\gamma _{n}M_{2}\\\le & {} \Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\kappa _{n}(\kappa _{n}-2)\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}+\gamma _{n}M_{3}. \end{aligned}$$

That is

$$\begin{aligned} (1-\gamma _{n})\kappa _{n}(2-\kappa _{n})\frac{(E(w_{n})+L(w_{n}))^2}{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2} \le \Vert x_{n}-z\Vert ^2-\Vert x_{n+1}-z\Vert ^2+\gamma _{n}M_{3}.\nonumber \\ \end{aligned}$$
(3.10)

By Lemma 2.1 (i), (ii) and (3.6), we derive that

$$\begin{aligned} {\Vert x_{n+1}-z\Vert ^2}= & {} \Vert \gamma _{n}f(x_{n})+(1-\gamma _{n})y_{n}-z\Vert ^2\nonumber \\= & {} \Vert \gamma _{n}(f(x_{n})-f(z))+(1-\gamma _{n})(y_{n}-z)+\gamma _{n}(f(z)-z)\Vert ^2 \nonumber \\\le & {} \Vert \gamma _{n}(f(x_{n})-f(z))+(1-\gamma _{n})(y_{n}-z)\Vert ^2+ 2 \gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \nonumber \\\le & {} \gamma _{n}\Vert f(x_{n})-f(z)\Vert ^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2 +2\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \nonumber \\\le & {} \gamma _{n}\tilde{\rho }^2\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert y_{n}-z\Vert ^2 +2\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \nonumber \\\le & {} \gamma _{n}\tilde{\rho }\Vert x_{n}-z\Vert ^2+(1-\gamma _{n})\Vert w_{n}-z\Vert ^2 +2\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle . \end{aligned}$$
(3.11)

From the definition of \(w_{n}\), it follows that:

$$\begin{aligned} {\Vert w_{n}-z\Vert ^2}= & {} \Vert x_{n}+\sigma _{n}(x_{n}-x_{n-1})-z\Vert ^2 \nonumber \\= & {} \Vert x_{n}-z\Vert ^2+\sigma _{n}^{2}\Vert x_{n}-x_{n-1}\Vert ^2+2\sigma _{n}\langle x_{n}-z, x_{n}-x_{n-1}\rangle \nonumber \\\le & {} \Vert x_{n}-z\Vert ^2+\sigma _{n}^{2}\Vert x_{n}-x_{n-1}\Vert ^2+2\sigma _{n}\Vert x_{n}-z\Vert \Vert x_{n}-x_{n-1}\Vert . \end{aligned}$$
(3.12)

Let \( M=\underset{n\ge 1}{\sup }\{ \sigma \Vert x_{n}-x_{n-1}\Vert , 2\Vert x_{n}-z\Vert \}\). Combining (3.11) and (3.12), we obtain that

$$\begin{aligned}\begin{array}{*{20}{l}} \begin{array}{l} \quad \;{\Vert x_{n+1}-z\Vert ^2} \le \left( 1-\gamma _{n}\left( 1-\tilde{\rho }\right) \right) \Vert x_{n}-z\Vert ^2+\sigma _{n}^{2}\Vert x_{n}-x_{n-1}\Vert ^2\\ \quad \quad \quad \quad \;\;+ 2\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle +2\sigma _{n}\Vert x_{n}-z\Vert \Vert x_{n}-x_{n-1}\Vert \\ \quad \quad \quad \;\; = (1-\gamma _{n}\left( 1-\tilde{\rho })\right) \Vert x_{n}-z\Vert ^2+(1-\tilde{\rho })\frac{2}{1-\tilde{\rho }}\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \\ \quad \quad \quad \quad \;\; + \sigma _{n} \Vert x_{n}-x_{n-1}\Vert ( \sigma _{n} \Vert x_{n}-x_{n-1}\Vert +2\Vert x_{n}-z\Vert )\\ \quad \quad \quad \;\; \le (1- \gamma _{n}(1-\tilde{\rho }))\Vert x_{n}-z\Vert ^2+(1-\tilde{\rho })\frac{2}{1-\tilde{\rho }}\gamma _{n}\langle f(z)-z, x_{n+1}-z\rangle \\ \quad \quad \quad \quad \;\; + 2M\sigma _{n} \Vert x_{n}-x_{n-1}\Vert \\ \quad \quad \quad \;\; = (1- \gamma _{n}(1-\tilde{\rho }))\Vert x_{n}-z\Vert ^2\\ \quad \quad \qquad \;\;+ (1-\tilde{\rho })\gamma _{n}\left( \frac{2}{1-\tilde{\rho }}\langle f(z)-z, x_{n+1}-z\rangle + \frac{2M\sigma _{n}}{(1-\tilde{\rho })\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert \right) . \end{array} \end{array}\end{aligned}$$

After rearrangement, with M as above, we have

$$\begin{aligned} \begin{array}{l} {\Vert x_{n+1}-z\Vert ^2} \le (1-(1-\tilde{\rho } )\gamma _{n})\Vert x_{n}-z\Vert ^2\\ ~ ~~~~~\quad \quad \quad \qquad \;\; + (1-\tilde{\rho })\gamma _{n}\left( \frac{2}{1-\tilde{\rho }}\langle f(z)-z, x_{n+1}-z\rangle + \frac{2M\sigma _{n}}{(1-\tilde{\rho })\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert \right) . \end{array}\nonumber \\ \end{aligned}$$
(3.13)

Next, we let

$$\begin{aligned}\begin{array}{*{20}{l}} \begin{array}{l} \quad ~~~\;s_{n}= \Vert x_{n}-z\Vert ^2,\\ \quad ~~~\; \alpha _{n}= (1-\tilde{\rho })\gamma _{n},\\ \quad ~~~\; \delta _{n}=\frac{2}{1-\tilde{\rho }}\langle f(z)-z, x_{n+1}-z\rangle +\frac{2M\sigma _{n}}{(1-\tilde{\rho })\gamma _{n}}\Vert x_{n}-x_{n-1}\Vert . \end{array} \end{array}\end{aligned}$$

Then, (3.13) reduces to the following inequality:

$$\begin{aligned}\begin{array}{*{20}{l}} \begin{array}{l} \quad ~~~\;s_{n+1}\le (1-\alpha _{n})s_{n}+\alpha _{n}\delta _{n},~\forall ~n\ge 1. \end{array} \end{array}\end{aligned}$$

Clearly, Lemma 2.3 (i) is satisfied. It remains to verify that Lemma 2.3 (ii) is also satisfied. Suppose that \( \{ \Vert x_{n_{k}}-z\Vert \}\) is a subsequence of \( \{ \Vert x_{n}-z\Vert \}\) satisfying \(\liminf _{k\rightarrow \infty }(\Vert x_{n_{k}+1}-z\Vert -\Vert x_{n_{k}}-z\Vert )\ge 0\). Then

$$\begin{aligned} \begin{array}{l} \quad \quad \;\;\;\underset{k\rightarrow \infty }{\liminf }~(\Vert x_{n_{k}+1}-z\Vert ^2-\Vert x_{n_{k}}-z\Vert ^2)\\ ~ ~~ \quad \quad \quad \;\; = \underset{k\rightarrow \infty }{\liminf }~\left( \left( \Vert x_{n_{k}+1}-z\Vert -\Vert x_{n_{k}}-z\Vert \right) (\Vert x_{n_{k}+1}-z\Vert +\Vert x_{n_{k}}-z\Vert )\right) \ge 0. \end{array} \nonumber \\ \end{aligned}$$
(3.14)

By \(\underset{k\rightarrow \infty }{\lim }\gamma _{n_{k}}=0\), (3.10) and (3.14), one has

$$\begin{aligned}\begin{array}{*{20}{l}} \begin{array}{l} \quad \;\;\;\underset{k\rightarrow \infty }{\limsup }~(1-\gamma _{n_{k}})\kappa _{n_{k}}(2-\kappa _{n_{k}})\frac{(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}\\ ~ ~~ \quad \quad \quad \quad \quad \quad \quad \quad \;\; \le \underset{k\rightarrow \infty }{\limsup }~(\Vert x_{n_{k}+1}-z\Vert ^2-\Vert x_{n_{k}}-z\Vert ^2+ \gamma _{n_{k}}M_{3} )\\ ~ ~~ \quad \quad \quad \quad \quad \quad \quad \quad \;\; \le \underset{k\rightarrow \infty }{\limsup }~(\Vert x_{n_{k}+1}-z\Vert ^2-\Vert x_{n_{k}}-z\Vert ^2)+\underset{k\rightarrow \infty }{\limsup }~\gamma _{n_{k}}M_{3}\\ ~ ~~ \quad \quad \quad \quad \quad \quad \quad \quad \;\; = -\underset{k\rightarrow \infty }{\liminf }~(\Vert x_{n_{k}+1}-z\Vert ^2-\Vert x_{n_{k}}-z\Vert ^2)\le 0. \end{array} \end{array}\end{aligned}$$

Now, we have

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }\left( \kappa _{n_{k}}(2-\kappa _{n_{k}})\frac{(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}\right) =0. \end{aligned}$$

Thus, by Condition (A4), we get

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }\frac{(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}=0. \end{aligned}$$

As a result, we have

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }(E(w_{n_{k}})+L(w_{n_{k}}))=0 \Leftrightarrow \underset{k\rightarrow \infty }{\lim }E(w_{n_{k}})=0~\text{ and }~\underset{k\rightarrow \infty }{\lim }L(w_{n_{k}})=0. \end{aligned}$$
(3.15)

Here, \(\theta _{n_{k}}^2=\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2\) is bounded. This follows from the facts that \(\nabla E\) is Lipschitz continuous with constant \(\Vert A\Vert ^2\), that \(\nabla L\) is nonexpansive, and that \(\{w_{n_{k}}\}\) is bounded. More precisely, for any \(z^*\) which solves the proximal SFP, we have

$$\begin{aligned} \begin{array}{l} \Vert \nabla E(w_{n_{k}})\Vert =\Vert \nabla E(w_{n_{k}})-\nabla E(z^*)\Vert \le \Vert A\Vert ^2\Vert w_{n_{k}}-z^*\Vert \end{array} \end{aligned}$$

and

$$\begin{aligned} \begin{array}{l} \Vert \nabla L(w_{n_{k}})\Vert =\Vert \nabla L(w_{n_{k}})-\nabla L(z^*)\Vert \le \Vert w_{n_{k}}-z^*\Vert . \end{array} \end{aligned}$$

By (3.1) and (3.2), we get

$$\begin{aligned} \begin{array}{l} \displaystyle \quad ~~\;{\Vert y_{n_{k}}-w_{n_{k}}\Vert ^2=\Vert \lambda _{n_{k}}(\nabla E(w_{n_{k}})+\nabla L(w_{n_{k}}))\Vert ^2} \\ ~ ~~~\quad \quad \quad \quad \quad \quad \displaystyle \;\;= \dfrac{\kappa _{n_{k}}^2(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}\\ ~ ~~~\quad \quad \quad \quad \quad \quad \displaystyle \;\;\le \dfrac{4(E(w_{n_{k}})+L(w_{n_{k}}))^2}{\Vert \nabla E(w_{n_{k}})+\nabla L(w_{n_{k}})\Vert ^2}\\ ~ ~~~\quad \quad \quad \quad \quad \quad \displaystyle \;\;\rightarrow 0,~{\mathrm{as}}~k\rightarrow \infty . \end{array} \end{aligned}$$

This shows that

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }\Vert y_{n_{k}}-w_{n_{k}}\Vert =0. \end{aligned}$$
(3.16)

From \(\underset{k\rightarrow \infty }{\lim }\gamma _{n_{k}}=0\) and (3.1), it follows that:

$$\begin{aligned} \Vert x_{n_{k}}-w_{n_{k}}\Vert = \sigma _{n_{k}}\Vert x_{n_{k}}-x_{n_{k}-1}\Vert =\gamma _{n_{k}} \frac{ \sigma _{n_{k}}}{\gamma _{n_{k}}}\Vert x_{n_{k}}-x_{n_{k}-1}\Vert \rightarrow 0, ~\text{ as }~k\rightarrow \infty . \end{aligned}$$
(3.17)

Using (3.16) and (3.17), we have

$$\begin{aligned} \underset{k\rightarrow \infty }{\lim }\Vert y_{n_{k}}-x_{n_{k}}\Vert =0. \end{aligned}$$
(3.18)

By (3.1) and \(\underset{k\rightarrow \infty }{\lim }\gamma _{n_{k}}=0\), we have

$$\begin{aligned} \Vert x_{n_{k}+1}-y_{n_{k}}\Vert =\gamma _{n_{k}}\Vert f(x_{n_{k}})-y_{n_{k}}\Vert \rightarrow 0,~\text{ as }~k\rightarrow \infty . \end{aligned}$$

This implies that

$$\begin{aligned} \left\| x_{n_{k}+1}-x_{n_{k}}\right\| \le \Vert x_{n_{k}+1}-y_{n_{k}}\Vert +\Vert y_{n_{k}}-x_{n_{k}}\Vert \rightarrow 0,~\text{ as }~k\rightarrow \infty . \end{aligned}$$
(3.19)

Since the sequence \(\{ x_{n_{k}}\}\) is bounded, there exists a subsequence \(\left\{ x_{n_{k_{i}}}\right\} \) of \(\{ x_{n_{k}}\}\) converging weakly to a point \(z^* \in H_{1}\), such that

$$\begin{aligned} \underset{k\rightarrow \infty }{\limsup }\left\langle f(z)-z,x_{n_{k}}-z\right\rangle =\underset{i\rightarrow \infty }{\lim }\left\langle f(z)-z,x_{n_{k_{i}}}-z\right\rangle = \langle f(z)-z,z^*-z\rangle . \end{aligned}$$

Thanks to (3.17), we have

$$\begin{aligned} w_{n_{k_{i}}}\rightharpoonup z^*,~\text{ as }~i\rightarrow \infty . \end{aligned}$$

By the weak lower semi-continuity of E, we arrive at

$$\begin{aligned} 0\le E(z^*)\le \underset{i\rightarrow \infty }{\liminf }E\left( w_{n_{k_{i}}}\right) =\underset{k\rightarrow \infty }{\lim }E\left( w_{n_{k}}\right) =0. \end{aligned}$$

This means that \(E(z^*)=\frac{1}{2}\Vert (I-\text{ prox}_{\tau G})Az^*\Vert ^2=0,\) that is, \(Az^*\) is a fixed point of the proximal mapping of G or, equivalently, \(0 \in \partial G(Az^*).\) In other words, \(Az^*\) is a minimizer of G. Similarly, by the weak lower semi-continuity of L, we have \( 0\le L(z^*)\le \underset{i\rightarrow \infty }{\liminf }L\left( w_{n_{k_{i}}}\right) =\underset{k\rightarrow \infty }{\lim }L\left( w_{n_{k}}\right) =0. \) This means that \(L(z^*)=\frac{1}{2}\Vert (I-\text{ prox}_{\tau F})z^*\Vert ^2=0,\) that is, \(z^*\) is a fixed point of the proximal mapping of F or, equivalently, \(0 \in \partial F(z^*).\) In other words, \(z^*\) is a minimizer of F. Therefore, \(z^*\in \varGamma .\) From the definition of \(z=P_{\varGamma }\circ f(z)\), it follows that

$$\begin{aligned} \underset{k\rightarrow \infty }{\limsup }\langle f(z)-z,x_{n_{k}}-z\rangle =\underset{i\rightarrow \infty }{\lim }\left\langle f(z)-z,x_{n_{k_{i}}}-z\right\rangle =\langle f(z)-z,z^*-z\rangle \le 0, \end{aligned}$$

which together with (3.19) implies that

$$\begin{aligned} \begin{array}{*{20}{l}} \begin{array}{l} \quad \quad \;\;\;\underset{k\rightarrow \infty }{\limsup }~\langle f(z)-z,x_{n_{k}+1}-z\rangle \\ ~ ~~ \quad \quad \quad \quad \quad \;\; \le \underset{k\rightarrow \infty }{\limsup }~\langle f(z)-z,x_{n_{k}+1}-x_{n_{k}}\rangle +\underset{k\rightarrow \infty }{\limsup }~\langle f(z)-z,x_{n_{k}}-z\rangle \\ ~ ~~ \quad \quad \quad \quad \quad \;\; \le 0. \end{array} \end{array}\end{aligned}$$

Hence

$$\begin{aligned} \underset{k\rightarrow \infty }{\limsup }\delta _{n_{k}}=\underset{k\rightarrow \infty }{\limsup }\left\{ \frac{2}{1-\tilde{\rho }}\langle f(z)-z, x_{n_{k}+1}-z\rangle +\frac{2M\sigma _{n_{k}}}{(1-\tilde{\rho })\gamma _{n_{k}}}\Vert x_{n_{k}}-x_{n_{k}-1}\Vert \right\} \le 0. \end{aligned}$$

Employing Lemma 2.3, we conclude that \( \underset{n\rightarrow \infty }{\lim }\Vert x_{n}-z\Vert =0\). \(\square \)

If \(F\equiv \delta _{C}\) [defined as \(\delta _{C}(x)=0\) if \(x \in C\) and \(+\infty \) otherwise] and \(G\equiv \delta _{Q}\), the indicator functions of the nonempty, closed, and convex sets \(C\subset H_{1}\) and \(Q\subset H_{2}\), respectively, then the proximal SFP reduces to the following SFP:

$$\begin{aligned} \begin{array}{l} \text{ Find }~z^* \in C~\text{ such } \text{ that }~Az^* \in Q. \end{array} \end{aligned}$$

Furthermore, we derive the following strong convergence corollary from Theorem 3.1.

Corollary 3.1

Let \(H_{1}\), \(H_{2}\), C, Q, A, \(A^*\), and \(\varGamma \) be as described above. Suppose that \(\varGamma \ne \emptyset \), \(\{\sigma _{n}\}\subset [0,\sigma [\subset [0,1[\), and Conditions (A1)–(A4) hold. Let \(x_{0},~x_{1}\in H_{1}\) and let \(\{x_{n}\}\) be the sequence generated by

$$\begin{aligned} {\left\{ \begin{array}{ll} w_{n}=x_{n}+\sigma _{n}(x_{n}-x_{n-1}),\\ y_{n}=w_{n}-\lambda _{n}(\nabla E(w_{n})+\nabla L(w_{n})),\\ x_{n+1}=\gamma _{n}f(x_{n})+(1-\gamma _{n})y_{n}, \end{array}\right. } \end{aligned}$$

where \(\sigma _{n}\) is defined in (3.3) and the stepsize \(\lambda _{n}\) can be computed via

$$\begin{aligned} \lambda _{n}= \kappa _{n}\frac{E(w_{n})+L(w_{n})}{\theta (w_{n})^2}, \end{aligned}$$

where \(0<\kappa _{n}<2\), \(L(w_{n})=\frac{1}{2}\Vert (I-P_{C})w_{n}\Vert ^2\), \(E(w_{n})=\frac{1}{2}\Vert (I-P_{Q})Aw_{n}\Vert ^2\) and \(\theta (w_{n})=\sqrt{\Vert \nabla E(w_{n})+\nabla L(w_{n})\Vert ^2}\).

Then, the iterative sequence \(\{x_{n}\}\) produced above strongly converges to \(z \in \varGamma \), where \(z=P_{\varGamma }\circ f(z)\).
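For concreteness, the iteration of Corollary 3.1 can be sketched in the Euclidean setting with NumPy. This is a minimal illustration, not the authors' MATLAB implementation; the function name, the diminishing choice \(\gamma_n=1/n^2\), and the residual tolerance are our illustrative assumptions. The gradients follow from the definitions \(L(w)=\frac{1}{2}\Vert (I-P_{C})w\Vert^2\) and \(E(w)=\frac{1}{2}\Vert (I-P_{Q})Aw\Vert^2\), namely \(\nabla L(w)=(I-P_{C})w\) and \(\nabla E(w)=A^{*}(I-P_{Q})Aw\).

```python
import numpy as np

def cq_viscosity_inertial(A, proj_C, proj_Q, f, x0, x1,
                          kappa=1.9, sigma=0.3, n_iter=200):
    """Sketch of the iteration in Corollary 3.1 (Euclidean setting).

    proj_C, proj_Q : metric projections onto C and Q (assumed available)
    f              : contraction used in the viscosity step
    The gamma_n sequence and the stopping tolerance are illustrative only.
    """
    x_prev, x = x0.astype(float), x1.astype(float)
    for n in range(1, n_iter + 1):
        gamma_n = 1.0 / n**2            # illustrative diminishing sequence
        w = x + sigma * (x - x_prev)    # inertial extrapolation
        Aw = A @ w
        rC = w - proj_C(w)              # (I - P_C) w,  so L(w) = ||rC||^2 / 2
        rQ = Aw - proj_Q(Aw)            # (I - P_Q) Aw, so E(w) = ||rQ||^2 / 2
        grad = A.T @ rQ + rC            # grad E(w) + grad L(w)
        denom = np.dot(grad, grad)      # theta(w)^2
        if denom < 1e-14:               # combined gradient vanishes; stop
            return w
        lam = kappa * 0.5 * (np.dot(rC, rC) + np.dot(rQ, rQ)) / denom
        y = w - lam * grad              # self-adaptive gradient step
        x_prev, x = x, gamma_n * f(x) + (1 - gamma_n) * y
    return x
```

With \(f\equiv 0\), the viscosity term pulls the iterates toward the minimum-norm solution, matching the choice \(f(x)=0\) used in Example 4.1 below.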

Remark 3.3

Theorem 3.1 improves the result of [Shehu and Iyiola (2017), Theorem 3.2], because strong convergence of our method is obtained without assuming firm nonexpansiveness of the operator \(I-\text{ prox}_{\tau (\cdot )}\).

4 Numerical experiments

In this section, we provide numerical experiments for the proximal SFP. For the first example, we compare Alg. 1 with Abbas et al.'s Algorithms 3.1-3.2 (shortly, AMMOAlg. 3.1-3.2) Abbas et al. (2018), Shehu et al.'s Algorithm 3.1 (SIAlg. 3.1) Shehu and Iyiola (2017), and Shehu et al.'s algorithm (AHVSDM) Shehu and Iyiola (2018). All programs are implemented in MATLAB R2017a on a desktop PC with an Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz and 8.00 GB of RAM.

In the first example, we study the proximal SFP in the case \(\text{ arg }\min F\cap A^{-1}(\text{ arg }\min G)\ne \emptyset \); in other words, we seek a minimizer \(z^*\) of F such that \(Az^*\) minimizes G, that is

$$\begin{aligned} \text{ Find }~ z^*\in H_{1}~\text{ such } \text{ that }~z^*\in \text{ arg }\underset{x \in H_{1}}{\min }~F(x) ~\text{ and }~Az^*\in \text{ arg }\underset{y \in H_{2}}{\min }~G(y), \end{aligned}$$
(4.1)

where \(F:H_{1}\rightarrow \mathbb {R}\) and \(G:H_{2}\rightarrow \mathbb {R}\) are proper, convex, and lower semi-continuous (lsc) functions, \(\text{ arg }\min ~F=\{ z^*\in H_{1} : F(z^*)\le F(x)~\forall x\in H_{1}\}\) and \(\text{ arg }\min ~G=\{ y^*\in H_{2}: G(y^*)\le G(x)~\forall x\in H_{2}\}\). The solution set is denoted by \(\varGamma \).

Example 4.1

Kesornprom and Cholamjiak (2019) Let \(H_{1}=H_{2}=\mathbb {R}^{N}\) and \(F(x)=\frac{1}{2}d_{C}^{2}(x)\), where \(C\subset \mathbb {R}^{N}\) is the unit ball, and \(G(x)=\frac{1}{2}\Vert x\Vert ^2\). Set \(Ax=x\), \(x\in \mathbb {R}^{N}\). Observe that \(0 \in \varGamma \) and hence \(\varGamma \ne \emptyset \). For AMMOAlg. 3.1-3.2 and SIAlg. 3.1, we take \(\kappa _{n}=1.9\), \(\gamma _{n}=\frac{1}{n+1}\), and \(\alpha _{n}=\frac{1}{10^4(n+1)}\). For AHVSDM, we set \(\lambda _{n}=10^{-4}\), \(\gamma _{n}=\frac{1.99}{n+1}\), \(\mu =1\), \(\tilde{F}=I\) (a contraction mapping in Shehu and Iyiola (2018), where I is the identity mapping on \(H_{1}\)), and \(\beta _{n}=\frac{0.001}{(n+1)^2}\). For Alg. 1, we adopt \(\kappa _{n}=1.9\), \(\gamma _{n}=\frac{1}{n^2}\), \(\sigma =0.3\), and \(\tilde{\tau }_{n}=\frac{1}{n^3}\). For all tests, we use the condition \( \Vert x_{n}-\text{ prox}_{\tau , F}(x_{n})\Vert +\Vert Ax_{n}- \text{ prox}_{\tau , G}(Ax_{n})\Vert <\epsilon \) to terminate the algorithms and choose \(x_{0}=[0,0,\cdots ,0]\), \(x_{1}=[1,\cdots ,1] \in \mathbb {R}^{N}\), and \(\tau =5\). To ensure that all algorithms share a common convergence point in this experiment, we set \(f(x)=0\). The results are summarized in Table 1.
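For reference, the proximity operators needed in Example 4.1 admit closed forms: \(\text{prox}_{\tau G}(x)=x/(1+\tau)\) for \(G=\frac{1}{2}\Vert x\Vert^2\), and, by the standard Moreau-envelope identity, \(\text{prox}_{\tau F}(x)=x+\frac{\tau}{1+\tau}(P_C(x)-x)\) for \(F=\frac{1}{2}d_C^2\). A minimal sketch (function names are ours):

```python
import numpy as np

def proj_unit_ball(x):
    """Projection onto the closed unit ball C in R^N."""
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def prox_F(x, tau):
    """prox of F(x) = (1/2) d_C(x)^2 via the standard identity
    prox_{tau F}(x) = x + tau/(1+tau) * (P_C(x) - x)."""
    return x + tau / (1.0 + tau) * (proj_unit_ball(x) - x)

def prox_G(x, tau):
    """prox of G(x) = (1/2)||x||^2:  prox_{tau G}(x) = x / (1 + tau)."""
    return x / (1.0 + tau)
```

These are exactly the maps evaluated in the stopping criterion \(\Vert x_{n}-\text{prox}_{\tau,F}(x_{n})\Vert +\Vert Ax_{n}-\text{prox}_{\tau,G}(Ax_{n})\Vert <\epsilon\) with \(\tau=5\).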

Table 1 Results for Example 4.1

Remark 4.1

The numerical results for Example 4.1 are reported in Table 1; we make the following observations:

  1. (1)

The iterative rule proposed in this note is efficient and easy to implement. More significantly, it converges quickly.

  2. (2)

Our proposed algorithm converges faster than some existing algorithms in terms of the number of iterations and execution time for different problem dimensions.

For the SFP, we present the following numerical examples and compare Alg. 1 with Gibali et al.'s Algorithm 3.1 (shortly, GMV Alg. 3.1) Gibali et al. (2019) and Suantai et al.'s Algorithm 3.1 (SKC Alg. 3.1) Suantai et al. (2018).

Example 4.2

Kesornprom and Cholamjiak (2019) Let \(H_{1}=H_{2}=L^2([0,1])\) with norm \(\Vert x\Vert _{L^2}=\left( \int _{0}^{1} x(t)^2dt\right) ^\frac{1}{2}\) and inner product \(\langle x,y\rangle =\int _{0}^{1} x(t)y(t)dt\), \(x,y \in L^2([0,1])\). Let \(C=\{ x\in L^2([0,1]): \Vert x\Vert _{L^2}\le 1 \}\) and \(Q=\{ x\in L^2([0,1]):\langle x, t\rangle =0\}\). Set \(Ax(t)=\frac{x(t)}{2}\). Observe that \(0 \in \varGamma \), and so \(\varGamma \ne \emptyset \). For SKC Alg. 3.1 and GMV Alg. 3.1, we fix \(\sigma =0.3\), \(\tilde{\tau }_{n}=\frac{1}{n^2}\), \(\gamma _{n}=\frac{1}{10^4n}\), \(\kappa _{n}=1.6\), \(f(x)=0.01x\), \(\beta _{n}=0.7\), and \(\sigma _{n}=\max (0,\sigma _{n}-0.1)\). For Alg. 1, we choose \(\kappa _{n}=1.6\), \(\sigma =0.3\), \(f(x)=0.01x\), \(\gamma _{n}=\frac{1}{10^4n}\), and \(\tilde{\tau }_{n}=\frac{1}{n^2}\). For López Alg. 5.1, we adopt \(\kappa _{n}=9\times 10^{-5}\) and \(\gamma _{n}=\frac{10^{-5}}{n}\). For all algorithms, we use the condition \( \Vert x_{n+1}-x_{n}\Vert _{L^2}<\epsilon \) as the stopping criterion. We choose two types of starting points:

Case 1: \(x_{0}=t^4,~x_{1}=t+1\);

Case 2: \(x_{0}=e^t,~x_{1}=3e^t\).

Before conducting our numerical experiments, we first recall that the projections onto the sets C and Q have the following closed-form expressions:

$$\begin{aligned} \ P_{C}(x)= {\left\{ \begin{array}{ll} \frac{x}{\Vert x\Vert _{L^2}}, &{} \text{ if }\ \Vert x\Vert _{L^2}>1, \\ x,&{}\text{ if }\ \Vert x\Vert _{L^2}\le 1, \end{array}\right. } ~~~~\text{ and }~~~ \ P_{Q}(x)= x-\frac{\langle t,x\rangle }{\Vert t\Vert _{L^2}^2}t. \end{aligned}$$
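These projections can be checked numerically by discretizing \(L^2([0,1])\) on a uniform grid. The grid size and the simple Riemann-sum inner product below are our illustrative choices; the hyperplane projection divides by \(\langle t,t\rangle\), i.e., the squared norm of t:

```python
import numpy as np

# Uniform grid discretization of L^2([0,1]); integrals become Riemann sums.
m = 1000
t = np.linspace(0.0, 1.0, m)
h = 1.0 / (m - 1)

def ip(x, y):
    """Discrete approximation of the inner product <x, y> = int x(s) y(s) ds."""
    return h * float(np.dot(x, y))

def proj_C(x):
    """Projection onto the unit ball C = {x : ||x||_{L^2} <= 1}."""
    nrm = np.sqrt(ip(x, x))
    return x if nrm <= 1.0 else x / nrm

def proj_Q(x):
    """Projection onto the hyperplane Q = {x : <x, t> = 0}:
    P_Q(x) = x - (<t, x> / <t, t>) t."""
    return x - (ip(t, x) / ip(t, t)) * t
```

Applying `proj_Q` once yields a function whose discrete inner product with t vanishes, as the definition of Q requires.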

The numerical results are shown in Table 2.

Table 2 Results for Example 4.2

Remark 4.2

Table 2 shows that Alg. 1 behaves better than the compared algorithms with respect to the number of iterations and execution time in the various cases of the problem.

Example 4.3

LASSO problem Sahu et al. (2020)

In this subsection, we employ the SFP to model a real-world problem, namely the recovery of a sparse signal. We use the well-known LASSO problem, which takes the following form:

$$\begin{aligned} \min \left\{ \frac{1}{2}\Vert Ax-b\Vert ^2: x \in \mathbb {R}^{N},\Vert x\Vert _{1}\le \kappa \right\} , \end{aligned}$$
(4.2)

where \(A \in \mathbb {R}^{M\times N}\), \(M<N\), \(b\in \mathbb {R}^{M}\), and \(\kappa >0\). This problem amounts to finding a sparse solution of the SFP. The matrix A is generated from a standard normal distribution with zero mean and unit variance. We generate the true sparse signal \(z^*\) from a uniform distribution on the interval \([-2,2]\) with k randomly placed nonzero entries, while the remaining entries are kept zero. The sample data are \(b=Az^*\).

Under certain conditions on the matrix A, the solution of the minimization problem (4.2) is equivalent to the \(\ell _{0}\)-norm solution of the underdetermined linear system. For the SFP, we define \(C=\{ z: \Vert z\Vert _{1} \le \kappa \}\), \(\kappa =k\), and \(Q=\{b\}\). Since the projection onto the closed convex set C does not have a closed-form solution, we employ the subgradient projection. Thus, we define the convex function \(c(z)=\Vert z\Vert _{1}-\kappa \) and denote by \(C_{n}\) the set

$$\begin{aligned} C_{n}=\{ z:c(w_{n})+\langle \varepsilon _{n}, z-w_{n}\rangle \le 0\}, \end{aligned}$$

where \(\varepsilon _{n} \in \partial c(w_{n})\). Also, the orthogonal projection of a point \(z \in \mathbb {R}^{N}\) onto \(C_{n}\) can be computed via

$$\begin{aligned} \ P_{C_{n}}(z)= {\left\{ \begin{array}{ll} z , &{}\text{ if }\ c(w_{n})+\langle \varepsilon _{n}, z-w_{n}\rangle \le 0, \\ z-\frac{c(w_{n})+\langle \varepsilon _{n}, z-w_{n}\rangle }{\Vert \varepsilon _{n}\Vert ^2}\varepsilon _{n}, &{}\text{ otherwise }. \end{array}\right. } \end{aligned}$$
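This relaxed projection onto the half-space \(C_n\) can be sketched in Python, using the componentwise sign vector of \(w_n\) as the subgradient \(\varepsilon_n\); the function name is ours:

```python
import numpy as np

def subgrad_proj_halfspace(z, w, kappa):
    """Projection of z onto C_n = {z : c(w) + <eps, z - w> <= 0},
    where c(z) = ||z||_1 - kappa and eps is a subgradient of c at w."""
    eps = np.sign(w)                     # subgradient of ||.||_1 at w
    val = (np.sum(np.abs(w)) - kappa) + eps @ (z - w)
    if val <= 0.0:                       # z already lies in C_n
        return z
    return z - val / (eps @ eps) * eps   # move onto the bounding hyperplane
```

Note that when \(w=0\) the affine value is \(-\kappa\le 0\), so the division is never reached.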

The subdifferential \(\partial c\) at \(w_{n}\) is given componentwise by

$$\begin{aligned} \partial c(w_{n})_{i}= {\left\{ \begin{array}{ll} 1, &{}\text{ if }\ w_{n,i}>0, \\ {[-1,1]}, &{}\text{ if }\ w_{n,i}=0, \\ -1, &{}\text{ if }\ w_{n,i}<0. \end{array}\right. } \end{aligned}$$

To implement our method in this example, we initialize the algorithms at the origin and define

$$\begin{aligned} E_{n}=\frac{\Vert x_{n}-z^*\Vert }{\max \{1,\Vert x_{n}\Vert \}}. \end{aligned}$$

We test the numerical behavior of all algorithms with the same iteration error \(E_{n}\) for different \(M\), \(N\), and k, limit the number of iterations to 8000, and report \(E_{n}\) in Table 3. The second problem is the recovery of the signal \(z^*\) when \(M=1440\), \(N=6144\), and \(k=180\), where the \(M \times N\) matrix A is randomly generated with independent samples from a standard Gaussian distribution. In more detail, the original signal \(z^*\) contains 180 randomly placed \(\pm 1\) spikes. The iterative process is started with \(x_{0}=0\), and the following mean squared error is used to measure the recovery accuracy:

$$\begin{aligned} \text{ MSE }=\frac{1}{N}\Vert x_{n}-z^*\Vert ^2. \end{aligned}$$
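The error measures and the random test instances described above can be sketched as follows; the helper names and the fixed random seed are our assumptions, not part of the original experiments:

```python
import numpy as np

def relative_error(x, z_star):
    """E_n = ||x_n - z*|| / max(1, ||x_n||)."""
    return np.linalg.norm(x - z_star) / max(1.0, np.linalg.norm(x))

def mse(x, z_star):
    """MSE = (1/N) ||x_n - z*||^2."""
    return np.linalg.norm(x - z_star) ** 2 / x.size

def sparse_instance(M, N, k, rng=np.random.default_rng(0)):
    """Random LASSO/SFP instance as in Example 4.3: Gaussian A,
    k-sparse z* with nonzero entries uniform in [-2, 2], b = A z*."""
    A = rng.standard_normal((M, N))
    z = np.zeros(N)
    support = rng.choice(N, size=k, replace=False)
    z[support] = rng.uniform(-2.0, 2.0, size=k)
    return A, z, A @ z
```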

For all algorithms, we fix \(f(x)=0.0005x\), \(\sigma =0.9\), \(\tilde{\tau }_{n}=\frac{1}{n^5}\), and \(\gamma _{n}=\frac{1}{10^5n}\). For SKC Alg. 3.1, we take \(\kappa _{n}=0.02\). For GMV Alg. 3.1, we adopt \(\kappa _{n}=1.9\) and \(\beta _{n}=0.7\). For Alg. 1, we set \(\kappa _{n}=1.9\). For López Alg. 5.1, we choose \(\kappa _{n}=1.9\) and \(\gamma _{n}=\frac{10^{-7}}{n}\).

Table 3 Results for Example 4.3
Table 4 Results of Algorithm 1 with different values of \(\kappa _{n}\) for Example 4.3

Remark 4.3

It can be observed from Tables 3 and 4 that the proposed algorithm is efficient. Moreover, our method requires less CPU time than some strongly convergent algorithms in the literature to obtain a smaller error \(E_{n}\) in the different cases. We also find that our proposed algorithm performs best when the parameter \(\kappa _{n}\) is 1.99.

Fig. 1
figure 1

Comparison of signal processing

The recovery results of all algorithms are shown in Fig. 1, which displays the original signal, the restored signals with their mean-squared error (MSE), and the computing time required by the iterative process.

Remark 4.4

As can be observed from Fig. 1, the signal \(z^*\) is estimated with a fair degree of accuracy by Algorithm 1. For the same number of iterations, Algorithm 1 requires less execution time but yields a slightly larger mean-squared error.

Example 4.4

Image deblurring problem Saejung and Yotkaew (2012) The problem we consider here is an image deblurring problem. Given a convolution matrix \(A \in \mathbb {R}^{m\times n}\) and an unknown image \(z \in \mathbb {R}^{n}\), we observe \(b \in \mathbb {R}^{m}\), which can be viewed as the known degraded observation. Including the unknown additive random noise \(\nu \in \mathbb {R}^{m}\), we obtain the following image recovery problem:

$$\begin{aligned} Az=b+\nu . \end{aligned}$$
(4.3)

This problem clearly fits the setting of the SFP with \(C=\mathbb {R}^{n}\); if no noise is added to the observed image b, then \(Q=\{b\}\) is a singleton; otherwise, \(Q=\{x\in \mathbb {R}^{m} : \Vert x-(b+\nu )\Vert \le \epsilon \}\) for small enough \(\epsilon >0\). In this example, we compare Algorithm 1 with López Alg. 5.1. The test image was corrupted as in He et al. (2016). More precisely, each image was degraded by a \(9\times 9\) Gaussian blur with standard deviation 4 and corrupted by additive zero-mean white Gaussian noise with standard deviation \(10^{-3}\). To measure the quality of the recovered image, we use the following signal-to-noise ratio:

$$\begin{aligned} \mathrm{SNR}=20\log _{10}\frac{\Vert z\Vert }{\Vert \bar{z}-z\Vert }, \end{aligned}$$
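The SNR measure is straightforward to compute; a one-line sketch (the function name is ours):

```python
import numpy as np

def snr_db(z, z_bar):
    """SNR = 20 log10( ||z|| / ||z_bar - z|| ) in dB; higher means a
    better recovery of the original image z by the estimate z_bar."""
    return 20.0 * np.log10(np.linalg.norm(z) / np.linalg.norm(z_bar - z))
```

For instance, an estimate whose error norm is 1% of the signal norm gives 40 dB.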

where z is the original image and \(\bar{z}\) is the recovered image. Clearly, the higher the SNR value, the better the image is recovered. For Alg. 1, we take \(\kappa _{n}=1.6\), \(\sigma =0.3\), \(f(x)=0.01x\), \(\gamma _{n}=\frac{21}{100n}\), and \(\tilde{\tau }_{n}=\frac{1}{n^2}\). For López Alg. 5.1, we adopt \(\kappa _{n}=5\times 10^{-4}\) and \(\gamma _{n}=\frac{21}{100n}\). For all algorithms, we limit the number of iterations to 100 and report the numerical results in Table 5 and Figs. 2-3.

Remark 4.5

As shown in Table 5 and Figs. 2-3, the two methods require the same number of iterations to recover the images. Concretely, Alg. 1 obtains a higher SNR than López Alg. 5.1 at the cost of a slightly longer execution time.

Table 5 Results of all algorithms for Example 4.4
Fig. 2
figure 2

Original image (a) and observed image (b) for Example 4.4

Fig. 3
figure 3

Recovered images by Alg. 1 and López Alg. 5.1 in Example 4.4

5 Conclusions

In this paper, we obtain a strong convergence result for proximal split feasibility problems with nonexpansive mappings. We modify the algorithm of Shehu and Iyiola (2017) by combining it with the viscosity-type algorithm Moudafi (2000), the inertial method, and the split proximal algorithm with a self-adaptive stepsize. As practical applications, we consider signal recovery and image deblurring problems. Preliminary numerical experiments confirm the effectiveness of the proposed algorithm in practice.