1 Introduction

Let X be a reflexive, strictly convex and smooth Banach space with dual space \(X^{*}.\) Consider the optimization problem:

$$\begin{aligned} \min _{x\in X}\varPhi (x):=f(x)+g(x), \end{aligned}$$
(1)

where \(f:X\rightarrow (-\infty ,+\infty ]\) is proper, lower semicontinuous and convex, and \(g:X\rightarrow (-\infty ,+\infty )\) is convex and Gâteaux differentiable. We denote by \(\varPhi ^{*}\) the optimal value of problem (1) and by \(\mathcal {S}\) the set of its solutions. We shall assume in what follows that \(\mathcal {S}\) is nonempty. Since \(\varPhi \) is proper, lower semicontinuous and convex, \(\mathcal {S}\) is a closed and convex set. Despite its simple form, problem (1) has been shown to cover a wide range of apparently unrelated signal recovery formulations (see [12,13,14, 18]).

The forward–backward splitting method is an effective method for solving (1): it decouples the contributions of the functions f and g, combining a forward gradient step determined by the differentiable part g with a backward implicit (proximal) step induced by f. Forward–backward methods belong to the class of proximal splitting methods. These methods require the computation of the proximity operator and the approximation of proximal points (see [14, 20]).

The forward–backward splitting iteration procedure proposed in [14] is given in Hilbert space H and is governed by the updating rule:

$$\begin{aligned} x_{n+1}\in \mathrm{argmin}_{y\in H}\{\frac{1}{2}\Vert y-x_{n}\Vert ^{2}+t_{n}(\langle \nabla g(x_{n}),y\rangle +f(y))\}. \end{aligned}$$
(2)
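
In the Hilbert space setting, update (2) is exactly the classical proximal gradient step \(x_{n+1}=\mathrm{prox}_{t_{n}f}(x_{n}-t_{n}\nabla g(x_{n})).\) The following minimal Python sketch illustrates this for \(X=\mathbf {R}^{m}\) with \(f=\lambda \Vert \cdot \Vert _{1}\) and \(g(x)=\frac{1}{2}\Vert Ax-b\Vert ^{2}\); the data A, b, the parameter \(\lambda \) and the constant stepsize are illustrative choices made only for this sketch.

```python
import numpy as np

def prox_l1(v, s):
    """Proximity operator of s*||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - s, 0.0)

def fbs_hilbert(A, b, lam, t, n_iter=500):
    """Update (2) with p = 2: a forward gradient step on g(x) = 0.5*||Ax - b||^2
    followed by a backward (proximal) step on f = lam*||.||_1."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)              # forward step: nabla g(x_n)
        x = prox_l1(x - t * grad, t * lam)    # backward step: prox of t_n * f
    return x

# illustrative data; fixed stepsize below 1/L with L = ||A||_2^2, the Lipschitz constant of nabla g
rng = np.random.default_rng(0)
A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
x_sol = fbs_hilbert(A, b, lam=0.1, t=0.9 / np.linalg.norm(A, 2) ** 2)
```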

Generalizing this method from Hilbert spaces to Banach spaces is not immediate; the main difficulty is that the inner product structure of a Hilbert space is missing in a Banach space. The forward–backward splitting method proposed in this work for solving problem (1) aims at building a bridge between the well-known forward–backward splitting in Hilbert spaces [14, 20] and its Banach space counterparts. The use of a general regularizing function \(\Vert \cdot \Vert ^{p}\) instead of the square of the norm is rather natural in a Banach space, where the square of the norm loses the privileged role it enjoys in Hilbert spaces [4]. For instance, it is not hard to verify that in the spaces \(X=l_{p}\) or \(X=L_{p}~(1<p<+\infty ),\) the update \(x_{n+1}\in \mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert y-x_{n}\Vert ^{p}+t_{n}(\langle \nabla g(x_{n}),y\rangle +f(y))\}\) is simpler to compute than \(x_{n+1}\in \mathrm{argmin}_{y\in X}\{\frac{1}{2}\Vert y-x_{n}\Vert ^{2}+t_{n}(\langle \nabla g(x_{n}),y\rangle +f(y))\}.\) In [10], the following generalization of the forward–backward iteration was proposed in reflexive Banach spaces X:

$$\begin{aligned} x_{0}\in X,~~ x_{n+1}\in \mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert y-x_{n}\Vert ^{p}+t_{n}(\langle \nabla g(x_{n}),y\rangle +f(y))\}, \end{aligned}$$
(3)

where the gradient operator \(\nabla g\) is \((p-1)\)-Hölder continuous on X, i.e., there exists a constant L such that

$$\begin{aligned} \Vert \nabla g(x)-\nabla g(y)\Vert \le L\Vert x-y\Vert ^{p-1},~\forall ~ x,y\in X. \end{aligned}$$
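
To illustrate why the regularizer \(\frac{1}{p}\Vert \cdot \Vert ^{p}\) is convenient in \(X=l_{p}\) (as noted above), consider the subproblem in (3) with \(f=0\) in \(\mathbf {R}^{m}\) equipped with the \(l_{p}\)-norm; this special case, chosen here only for illustration, admits a componentwise closed form coming from the optimality condition \(J_{p}(y-x_{n})=-t_{n}\nabla g(x_{n}).\) A hedged Python sketch:

```python
import numpy as np

def p_norm_gradient_step(x, grad, t, p):
    """Closed form of the subproblem in (3) when f = 0 and X = R^m with the l_p-norm
    (illustrative special case): minimize (1/p)*||y - x||_p^p + t*<grad, y>.
    The optimality condition J_p(y - x) = -t*grad gives, componentwise,
    y_i = x_i - sign(grad_i) * (t*|grad_i|)^(q - 1), where 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    return x - np.sign(grad) * (t * np.abs(grad)) ** (q - 1.0)

# quick consistency check of the optimality condition J_p(y - x) + t*grad = 0
x, grad, t, p = np.array([1.0, -2.0, 3.0]), np.array([0.5, -1.0, 2.0]), 0.3, 3.0
y = p_norm_gradient_step(x, grad, t, p)
assert np.allclose(np.sign(y - x) * np.abs(y - x) ** (p - 1) + t * grad, 0.0)
```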

In [16], Guan and Song proposed another type of generalization of the forward–backward method in reflexive Banach spaces:

$$\begin{aligned} x_{n+1}= \mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert y\Vert ^{p}-\langle J_{p}(x_{n}),y\rangle +t_{n}(\langle \nabla g(x_{n})+J_{p}(z_{n}),y\rangle +f(y))\},\nonumber \\ \end{aligned}$$
(4)

where \(J_{p}:X \rightarrow X^{*}\) is the p-duality mapping and \(\{z_{n}\}\) is an absolutely summable sequence.

In particular, if X is a Hilbert space and \(p=2\), then formulas (3) and (4) reduce (with \(z_{n}=0\)) to formula (2). In [16], it is shown that the sequence of functional values converges with rate \(n^{1-p}\) to the optimal value of problem (1) under appropriate assumptions. In [17], Guan and Song further extended the forward–backward splitting method (3) to a more general setting by taking a convex combination of the current step and the previous iterate:

$$\begin{aligned} \left\{ \begin{array}{l} y_{n}=\mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert x_{n}-y\Vert ^{p}+t_{n}(\langle \nabla g(x_{n})+J_{p}(z_{n}),y\rangle +f(y))\},\\ x_{n+1}=(1-\lambda _{n})x_{n}+\lambda _{n}y_{n}, \end{array} \right. \end{aligned}$$
(5)

and proved that the sequence of functional values converges with an asymptotic rate \(n^{1-p}\) to the optimal value. They also proved that the sequence of functional values converges linearly under an error bound assumption.

The convergence of the forward–backward splitting method to an optimal solution of (1) is usually established under the assumption that the gradient of g is Lipschitz continuous and that the stepsize \(t_{n}\) is taken smaller than a constant related to the Lipschitz modulus. When \(\nabla g\) is Lipschitz continuous but the Lipschitz constant is unknown, or when \(\nabla g\) is not Lipschitz continuous at all, finding a stepsize \(t_{n}\) that guarantees the convergence of (2) is a challenge.

Beck and Teboulle [6] proposed a proximal gradient method with backtracking stepsize rules to overcome this inconvenience. As far as we can observe, the theory of convergence and complexity for the proximal forward–backward method is fairly complete under such a Lipschitz assumption. However, the Lipschitz condition fails in many natural circumstances; see, e.g., [15]. Bello Cruz and Nghia [7] introduced two new linesearches into the framework of the forward–backward splitting method and proved convergence and complexity results for the cost values. These two linesearch rules were also recently studied in [19, 21, 22] in conjunction with the forward–backward splitting algorithm for convex minimization problems without the assumption of a Lipschitz continuous gradient.

Bauschke, Bolte and Teboulle [5] introduced Bregman-based proximal gradient methods which share most of the convergence properties and complexity of the classical proximal gradient method; instead of the restrictive Lipschitz continuity of the gradient of the differentiable part, they assume a more general and flexible convexity condition. The authors of [8] further extended this approach to nonconvex composite models with smooth adaptable functions and proved global convergence to a critical point under natural assumptions on the problem data.

In this paper, following the lines of [7, 21], we propose forward–backward splitting methods with linesearches in the framework of Banach spaces. The main advantage of our algorithms is that the Lipschitz constant of the gradient is not required in the computations. The paper is organized as follows. The next section presents some preliminary results that will be used throughout the paper. Section 3 is devoted to the study of the forward–backward splitting method (3) with Linesearch 1; we prove that the sequence of functional values converges with an asymptotic rate \(n^{1-p}\) to the optimal value of the minimization problem (1). In Sect. 4, we further study the forward–backward splitting method (5) with Linesearch 2; we prove that the sequence of functional values converges with an asymptotic rate \(n^{1-p}\) to the optimal value, and that it converges Q-linearly under an error bound assumption if \(t_{n}\ge \bar{t}>0.\)

2 Preliminaries

In this section we present some definitions and results needed for our paper. Let f be a lower semi-continuous proper convex function from X to \((-\infty ,+\infty ]\). We denote the domain of f by \(\mathrm{dom}f := \{x\in X| f(x) < +\infty \}.\) The subdifferential of f at \(x\in X\) is the convex set

$$\begin{aligned} \partial f(x)=\{x^{*}\in X^{*}:\langle x^{*},y-x\rangle \le f(y)-f(x),~\forall y\in X\}. \end{aligned}$$
(6)

It follows from definition (6) that a point \(\hat{x}\) is a minimizer of f if and only if \(0\in \partial f(\hat{x}).\) The subdifferential mapping \(x\rightarrow \partial f(x)\) has the following monotonicity property:

$$\begin{aligned} \langle x_{1}^{*}-x_{2}^{*},x_{1}-x_{2}\rangle \ge 0,\;\forall ~x_{1},x_{2}\in X, \forall ~ x_{1}^{*}\in \partial f(x_{1}), \forall ~ x_{2}^{*}\in \partial f(x_{2}). \end{aligned}$$

The subdifferential operator \(\partial f\) is maximal monotone [2]. Moreover, the graph of \(\partial f\) is demiclosed [2], i.e., if \(\{(x_{n},x^{*}_{n})\}\subset \mathrm{Gph}(\partial f)\) satisfies that \(\{x_{n}\}\) converges weakly to x and \(\{x^{*}_{n}\}\) converges strongly to \(x^{*},\) then \((x,x^{*})\in \mathrm{Gph}(\partial f).\) If f and g are two lower semicontinuous proper convex functions and the regularity condition \(0\in \mathrm{int( dom}f-\mathrm{dom}g)\) holds, then for any \(\bar{x}\in \mathrm{dom}f\cap \mathrm{dom}g\) we have (see [9]) \(\partial (f+g)(\bar{x})=\partial f(\bar{x})+\partial g(\bar{x}).\)

The p-duality mapping \(J_{p} : X \rightarrow X^{*}\) is defined by

$$\begin{aligned} J_{p}(x)=\{x^{*}\in X^{*}|\langle x^{*},x\rangle =\Vert x^{*}\Vert \Vert x\Vert ,\Vert x^{*}\Vert =\Vert x\Vert ^{p-1}\},~~~\forall x\in X. \end{aligned}$$

The Hahn-Banach theorem guarantees that \(J_{p}(x)\ne \emptyset \) for every \(x\in X.\) It is clear that \(J_{p}(x)=\partial (\frac{1}{p}\Vert \cdot \Vert ^{p})(x)\) for all \(x\in X.\) It is well known that if X is smooth, then \(J_{p}\) is single valued and is norm-to-weak star continuous. Properties of the duality mapping have been given in [1, 11, 23].
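
For example, in \(X=l_{p}\) the p-duality mapping acts componentwise: \((J_{p}(x))_{i}=|x_{i}|^{p-1}\mathrm{sign}(x_{i}).\) The short Python check below is a finite-dimensional sketch, given only for illustration, verifying the two defining properties numerically.

```python
import numpy as np

def duality_map(x, p):
    """p-duality mapping of l_p, acting componentwise: (J_p(x))_i = |x_i|^(p-1) * sign(x_i)."""
    return np.sign(x) * np.abs(x) ** (p - 1)

x, p = np.array([1.0, -2.0, 0.5]), 3.0
q = p / (p - 1)                     # conjugate exponent, 1/p + 1/q = 1
Jx = duality_map(x, p)
assert np.isclose(Jx @ x, np.linalg.norm(x, p) ** p)                        # <J_p(x), x> = ||x||^p
assert np.isclose(np.linalg.norm(Jx, q), np.linalg.norm(x, p) ** (p - 1))   # ||J_p(x)|| = ||x||^(p-1)
```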

Let \(\{x_{n}\}\) be a sequence in X that converges to \(\bar{x}.\) We say that the convergence is Q-linear if there exists a constant \(r\in (0,1)\) such that

$$\begin{aligned} \Vert x_{n+1}-\bar{x}\Vert \le r\Vert x_{n}-\bar{x}\Vert , \end{aligned}$$

for all n sufficiently large. We say that the convergence is R-linear if there exists a sequence of nonnegative scalars \(\{\alpha _{n}\}\) such that

$$\begin{aligned} \Vert x_{n}-\bar{x}\Vert \le \alpha _{n},\forall \;n\ge 1, \end{aligned}$$

and \(\{\alpha _{n}\}\) converges Q-linearly to zero.

Definition 1

[2] A functional f is called lower semicontinuous at the point \(x_{0}\in \mathrm{dom} f\) if for any sequence \(x_{n}\in \mathrm{dom} f\) such that \(x_{n}\rightarrow x_{0}\) there holds the inequality

$$\begin{aligned} f(x_{0})\le \liminf _{n\rightarrow \infty } f(x_{n}). \end{aligned}$$
(7)

If inequality (7) holds whenever the convergence of \(\{x_{n}\}\) to \(x_{0}\) is weak, then the functional f is called weakly lower semicontinuous at \(x_{0}.\)

Lemma 1

[2] Let f be a convex and lower semicontinuous functional. Then it is weakly lower semicontinuous.

Lemma 2

[3] Let \(\{a_{n}\}, \{b_{n}\}\) and \(\{\epsilon _{n}\}\) be real sequences. Assume that \(\{a_{n}\}\) is bounded from below, \(\{b_{n}\}\) is nonnegative, \(\sum _{n=1}^{\infty }|\epsilon _{n}|< +\infty \) and \(a_{n+1}-a_{n}+b_{n}\le \epsilon _{n}.\) Then \(\{a_{n}\}\) converges and \(\sum _{n=1}^{\infty }b_{n}< +\infty .\)

Definition 2

[10] Let \(f:X\rightarrow (-\infty ,+\infty ]\) be proper, convex and lower semicontinuous. f is called totally convex at \(\bar{x}\in X,\) if for each \(x^{*}\in \partial f(\bar{x})\) and each sequence \(\{x_{n}\},\) the following implication holds

$$\begin{aligned} f(x_{n})-f(\bar{x})-\langle x^{*},x_{n}-\bar{x}\rangle \rightarrow 0~\Rightarrow ~\Vert x_{n}-\bar{x}\Vert \rightarrow 0. \end{aligned}$$

The following standing assumptions on the data of problem (1) will be used throughout the paper:

Assumption 1

The gradient \(\nabla g\) is uniformly continuous on any bounded subset of X and maps any bounded subset of X to a bounded set in \(X^{*}.\)

3 The FBS Method with Linesearch 1

In this section, we shall consider the convergence and convergence rate of the following forward–backward splitting method:

[Algorithm box: Linesearch 1 \((x,~\alpha ,~\theta ,~\beta )\); not reproduced here. Its stopping rule is recalled in (11), and a sketch is given after Iterative Method 3.1 below.]

Iterative Method 3.1. Given \(x_{0}\in X\), for every \(n\in \mathbf {N},\) set

$$\begin{aligned} x_{n+1}=\mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert x_{n}-y\Vert ^{p}+t_{n}(\langle \nabla g(x_{n}) ,y\rangle +f(y))\}, \end{aligned}$$

where \(t_n=\mathbf{Linesearch~1}(x_{n}, ~\alpha ,~\theta ,~\beta ).\)
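
Since the algorithm box for Linesearch 1 is not reproduced above, the following Python sketch reconstructs the scheme from inequality (11) and the proof of Lemma 4: trial stepsizes \(t=\alpha \theta ^{k}\) are tested until \(t\Vert \nabla g(\hat{x}_{t})-\nabla g(x)\Vert \le \beta \Vert \hat{x}_{t}-x\Vert ^{p-1}.\) The subproblem solver, the norms and the toy instance are assumptions made only for this illustration.

```python
import numpy as np

def linesearch1(x, alpha, theta, beta, p, grad_g, subprob, norm, dual_norm):
    """Sketch of Linesearch 1(x, alpha, theta, beta), reconstructed from (9) and (11):
    try t = alpha * theta^k, k = 0, 1, ..., and accept the first t for which
        t * ||grad g(x_hat_t) - grad g(x)|| <= beta * ||x_hat_t - x||^(p-1),
    where x_hat_t = argmin_y {(1/p)||y - x||^p + t*(<grad g(x), y> + f(y))} is computed
    by the assumed subproblem solver `subprob(x, t)`."""
    t = alpha
    x_hat = subprob(x, t)
    while t * dual_norm(grad_g(x_hat) - grad_g(x)) > beta * norm(x_hat - x) ** (p - 1):
        t *= theta                               # theta in (0, 1)
        x_hat = subprob(x, t)
    return t, x_hat

def iterative_method_31(x0, alpha, theta, beta, p, grad_g, subprob, norm, dual_norm, n_iter=50):
    """Iterative Method 3.1: x_{n+1} = x_hat_{t_n}, with t_n given by Linesearch 1."""
    x = x0
    for _ in range(n_iter):
        _, x = linesearch1(x, alpha, theta, beta, p, grad_g, subprob, norm, dual_norm)
    return x

# toy Hilbert-space instance (p = 2, f = 0, g(x) = 0.5*||x||^2), for illustration only:
# the subproblem then has the closed form x_hat_t = x - t * grad g(x).
grad_g = lambda x: x
subprob = lambda x, t: x - t * grad_g(x)
x_out = iterative_method_31(np.array([5.0, -3.0]), alpha=1.0, theta=0.5, beta=0.9, p=2,
                            grad_g=grad_g, subprob=subprob,
                            norm=np.linalg.norm, dual_norm=np.linalg.norm)
```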

Lemma 3

[17] For any \(x\in X\) and \(t>0,\) let

$$\begin{aligned} \hat{x}_{t}=\mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert y-x\Vert ^{p}+t(\langle \nabla g(x),y\rangle +f(y))\}. \end{aligned}$$

Then, for any \(0<t_{1}\le t_{2},\) we have

$$\begin{aligned} \Vert x-\hat{x}_{t_{1}}\Vert \le \Vert x-\hat{x}_{t_{2}}\Vert \le \left( \frac{t_{2}}{t_{1}}\right) ^{\frac{1}{p-1}}\Vert x-\hat{x}_{t_{1}}\Vert . \end{aligned}$$
(8)

Lemma 4

If \(x\in \mathrm{dom}f,\) then Linesearch 1\((x,\alpha ,\theta ,\beta )\) stops after finitely many steps.

Proof

If \(x\in \mathcal {S},\) then \(x=\hat{x}_{\alpha }.\) Thus the linesearch stops at the zeroth step and gives the output \(\alpha .\) If \(x\notin \mathcal {S},\) suppose by contradiction that for all \(t_{k}=\alpha \theta ^{k},~k\in \mathbf {N},\)

$$\begin{aligned} t_{k}\Vert \nabla g(\hat{x}_{t_{k}})-\nabla g(x)\Vert >\beta \Vert \hat{x}_{t_{k}}-x\Vert ^{p-1}. \end{aligned}$$
(9)

It follows from Lemma 3 that \(\Vert x-\hat{x}_{t_{k}}\Vert \le \Vert x-\hat{x}_{\alpha }\Vert ,\) that is, \(\{\hat{x}_{t_{k}}\}\) is bounded. Since \(t_{k}\rightarrow 0\) and \(\{\nabla g(\hat{x}_{t_{k}})\}\) is bounded by Assumption 1, we get from (9) that \(\Vert \hat{x}_{t_{k}}-x\Vert \rightarrow 0\) as \(k\rightarrow +\infty .\) The latter implies \(\Vert \nabla g(\hat{x}_{t_{k}})-\nabla g(x)\Vert \rightarrow 0\) as \(k\rightarrow +\infty \) by Assumption 1 again. From (9) we also obtain

$$\begin{aligned} \lim _{k\rightarrow +\infty }\frac{\Vert \hat{x}_{t_{k}}-x\Vert ^{p-1}}{t_{k}}=0. \end{aligned}$$
(10)

The optimality of \(\hat{x}_{t_{k}}\) implies

$$\begin{aligned} 0\in \partial \left( \frac{1}{p}\Vert x-\cdot \Vert ^{p}+t_{k}(\langle \nabla g(x),\cdot \rangle +f(\cdot ))\right) (\hat{x}_{t_{k}}). \end{aligned}$$

Then, we have

$$\begin{aligned} \frac{J_{p}(x-\hat{x}_{t_{k}})}{t_{k}}-\nabla g(x)\in \partial f(\hat{x}_{t_{k}}). \end{aligned}$$

By letting \(k\rightarrow +\infty \) in the above inclusion and using (10), we get from the demiclosedness of \(\mathrm{Gph}(\partial f) \) that \(0\in \partial f(x)+\nabla g(x).\) This contradicts the assumption that x is not an optimal solution to problem (1) and completes the proof of the lemma. \(\square \)

Remark 1

We observe from Lemma 4 that the procedure for finding the stepsize \(t_{n}\) in the above scheme terminates after finitely many steps. Hence the sequence \(\{x_{n}\}\) in Iterative Method 3.1 is well defined. Another important feature of the definition of Linesearch 1, useful for our analysis, is the following inequality

$$\begin{aligned} t_{n} \Vert \nabla g(x_{n+1})-\nabla g(x_{n})\Vert \le \beta \Vert x_{n+1}-x_{n}\Vert ^{p-1}. \end{aligned}$$
(11)

Proposition 1

Let \(\{x_{n}\}\) be a sequence generated by Iterative Method 3.1 and define

$$\begin{aligned} h(x_{n}):=f(x_{n})-f(x_{n+1})+\langle \nabla g(x_{n}),x_{n}-x_{n+1}\rangle . \end{aligned}$$

Then, we have

  1. (i)

    \(\Vert x_{n}-x_{n+1}\Vert ^{p}\le t_{n}h(x_{n}).\)

  2. (ii)

    \(\varPhi (x_{n+1})\le \varPhi (x_{n})-(1-\beta )h(x_{n}).\)

  3. (iii)

    \(\varPhi (x_{n})\) converges and \(\sum _{n=1}^{\infty }h(x_{n})<+\infty .\)

Proof

(i) The optimality of \(x_{n+1}\) implies

$$\begin{aligned} 0\in \partial \left( \frac{1}{p}\Vert x_{n}-\cdot \Vert ^{p}+t_{n}(\langle \nabla g(x_{n}) ,\cdot \rangle +f(\cdot ))\right) (x_{n+1}). \end{aligned}$$

Then, we have

$$\begin{aligned} \frac{J_{p}(x_{n}-x_{n+1})}{t_{n}}-\nabla g(x_{n})\in \partial f(x_{n+1}). \end{aligned}$$
(12)

Hence,

$$\begin{aligned} \left\langle \frac{J_{p}(x_{n}-x_{n+1})}{t_{n}}-\nabla g(x_{n}),x_{n}-x_{n+1}\right\rangle \le f(x_{n})-f(x_{n+1}), \end{aligned}$$

and therefore

$$\begin{aligned} \Vert x_{n}-x_{n+1}\Vert ^{p} \le t_{n}f(x_{n})-t_{n}f(x_{n+1})+t_{n}\langle \nabla g(x_{n}) ,x_{n}-x_{n+1}\rangle =t_{n}h(x_{n}). \end{aligned}$$

(ii) Using the definition of \(h(x_{n})\) and (11) we have

$$\begin{aligned} \varPhi (x_{n})-\varPhi (x_{n+1})= & {} h(x_{n})+g(x_{n})-g(x_{n+1})-\langle \nabla g(x_{n}),x_{n}-x_{n+1}\rangle \\\ge & {} h(x_{n})+\langle \nabla g(x_{n+1}),x_{n}-x_{n+1}\rangle -\langle \nabla g(x_{n}),x_{n}-x_{n+1}\rangle \\\ge & {} h(x_{n})-\langle \nabla g(x_{n})-\nabla g(x_{n+1}),x_{n}-x_{n+1}\rangle \\\ge & {} h(x_{n})-\Vert \nabla g(x_{n})-\nabla g(x_{n+1})\Vert \Vert x_{n}-x_{n+1}\Vert \\\ge & {} h(x_{n})-\frac{\beta }{t_{n}}\Vert x_{n}-x_{n+1}\Vert ^{p}\\\ge & {} h(x_{n})-\beta h(x_{n})\\= & {} (1-\beta )h(x_{n}). \end{aligned}$$

(iii) The conclusion follows using Lemma 2 and Proposition 1 (ii). \(\square \)

Proposition 2

Let \(\{x_{n}\}\) be a sequence generated by Iterative Method 3.1. Assume the sequence \(\{x_{n}\}\) is bounded. Then,

  1. (i)

    \(\lim \limits _{n\rightarrow \infty }\varPhi (x_{n})=\varPhi ^{*}.\)

  2. (ii)

    all weak accumulation points of \(\{x_{n}\}\) belong to \(\mathcal {S}.\)

Proof

(i) Let \(\hat{x}\in \mathcal {S}.\) Since \(g(x_{n})-g(\hat{x})\le \langle \nabla g(x_{n}),x_{n}-\hat{x}\rangle \) and \(\varPhi (x_{n})-\varPhi ^{*} =f(x_{n})-f(\hat{x})+g(x_{n})-g(\hat{x}),\) we have that

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le & {} f(x_{n})-f(\hat{x})+\langle \nabla g(x_{n}),x_{n}-\hat{x}\rangle \nonumber \\= & {} h(x_{n})+ f(x_{n+1})-f(\hat{x})+\langle \nabla g(x_{n}),x_{n+1}-\hat{x}\rangle . \end{aligned}$$
(13)

Then, by (12) and (13), we have

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le & {} h(x_{n})-\left\langle \frac{J_{p}(x_{n}-x_{n+1})}{t_{n}},\hat{x}-x_{n+1}\right\rangle \nonumber \\\le & {} h(x_{n}) +\frac{1}{t_{n}}\Vert x_{n}-x_{n+1}\Vert ^{p-1}\Vert \hat{x}-x_{n+1}\Vert . \end{aligned}$$
(14)

Now let us split our further analysis into two distinct cases.

Case 1 Suppose that there exists \(\bar{t}>0\) such that \(t_{n}\ge \bar{t}\) for all \(n\in \mathbf {N}.\) Then, by Proposition 1 and (14), we have

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le & {} \frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\beta }+\frac{1}{t_{n}}\Vert x_{n}-x_{n+1}\Vert ^{p-1}\Vert \hat{x}-x_{n+1}\Vert \\\le & {} \frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\beta }+\left( \frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\beta }\right) ^{\frac{p-1}{p}}t_{n}^{-\frac{1}{p}}\Vert \hat{x}-x_{n+1}\Vert . \end{aligned}$$

Since \(\{x_{n}\}\) is bounded and \(\alpha \ge t_{n}\ge \bar{t}>0\), there exists \(c_{1}\ge 0\) such that \(t_{n}^{-\frac{1}{p}}\Vert \hat{x}-x_{n+1}\Vert \le c_{1}.\) Hence,

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le \frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\beta }+\left( \frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\beta }\right) ^{\frac{p-1}{p}}c_{1}. \end{aligned}$$
(15)

Since \(\{\varPhi (x_{n})-\varPhi ^{*}\}\) is bounded by Proposition 1(iii), there exists \(c_{2}>0\) such that

$$\begin{aligned} (\varPhi (x_{n})-\varPhi (x_{n+1}))^{\frac{1}{p}}+(1-\beta )^{\frac{1}{p}}c_{1}\le c_{2}. \end{aligned}$$
(16)

Then, by (15) and (16), we have

$$\begin{aligned}&(1-\beta )(\varPhi (x_{n})-\varPhi ^{*})\nonumber \\&\quad \le (\varPhi (x_{n})-\varPhi (x_{n+1}))+(\varPhi (x_{n})-\varPhi (x_{n+1})) ^{\frac{p-1}{p}}(1-\beta )^{\frac{1}{p}}c_{1}\nonumber \\&\quad =(\varPhi (x_{n})-\varPhi (x_{n+1}))^{\frac{p-1}{p}}\left( (\varPhi (x_{n})-\varPhi (x_{n+1}))^{\frac{1}{p}}+(1-\beta )^{\frac{1}{p}}c_{1}\right) \nonumber \\&\quad \le (\varPhi (x_{n})-\varPhi (x_{n+1}) )^{\frac{p-1}{p}}c_{2}. \end{aligned}$$
(17)

Using (17), we get that

$$\begin{aligned} \frac{1-\beta }{c_{2}}(\varPhi (x_{n})-\varPhi ^{*})\le (\varPhi (x_{n})-\varPhi (x_{n+1}))^{\frac{p-1}{p}}, \end{aligned}$$

and therefore,

$$\begin{aligned} (\varPhi (x_{n+1})-\varPhi ^{*})\le (\varPhi (x_{n})-\varPhi ^{*})-\left( \frac{1-\beta }{c_{2}}\right) ^{\frac{p}{p-1}}(\varPhi (x_{n})-\varPhi ^{*})^{\frac{p}{p-1}}. \end{aligned}$$
(18)

Hence, by (18) and Lemma 2, we easily obtain, \(\lim \nolimits _{n\rightarrow \infty }\varPhi (x_{n})=\varPhi ^{*}.\)

Case 2 Suppose now that there is a subsequence \(\{t_{k}\}\subset \{t_{n}\}\) with \(\lim _{k\rightarrow +\infty }t_{k}=0.\) Since \(\{x_{n}\}\) is bounded, the set of its weak accumulation points is nonempty. Without loss of generality, we may assume that the corresponding subsequence \(\{x_{k}\}\) converges weakly to some \(\bar{x}\). Define \(\hat{t}_{k}=\frac{t_{k}}{\theta }>t_{k}>0\) and

$$\begin{aligned} \hat{x}_{\hat{t}_{k}}=\mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert x_{k}-y\Vert ^{p}+\hat{t}_{k}(\langle \nabla g(x_{k}),y\rangle +f(y))\}. \end{aligned}$$

Due to Lemma 3 we have

$$\begin{aligned} \Vert x_{k}-\hat{x}_{\hat{t}_{k}}\Vert \le \left( \frac{\hat{t}_{k}}{t_{k}}\right) ^{\frac{1}{p-1}}\Vert x_{k}-x_{k+1}\Vert =(\frac{1}{\theta })^{\frac{1}{p-1}}\Vert x_{k}-x_{k+1}\Vert , \end{aligned}$$

which combines with the boundedness of \(\{x_{k}\}\) to show that the sequence \(\{\hat{x}_{\hat{t}_{k}}\}\) is also bounded. It follows from the definition of Linesearch 1 that

$$\begin{aligned} \hat{t}_{k}\Vert \nabla g(x_{k})-\nabla g(\hat{x}_{\hat{t}_{k}})\Vert >\beta \Vert x_{k}-\hat{x}_{\hat{t}_{k}}\Vert ^{p-1}. \end{aligned}$$
(19)

Since \(\lim _{k\rightarrow +\infty }\hat{t}_{k}=0\) and both \(\{x_{k}\}\) and \(\{\hat{x}_{\hat{t}_{k}}\}\) are bounded, (19) together with Assumption 1 tells us that \(\lim _{k\rightarrow +\infty }\Vert x_{k}-\hat{x}_{\hat{t}_{k}}\Vert =0\) and thus \(\{\hat{x}_{\hat{t}_{k}}\}\) also converges weakly to \(\bar{x}.\) Thanks to Assumption 1 again, we have

$$\begin{aligned} \lim _{k\rightarrow +\infty }\Vert \nabla g(x_{k})-\nabla g(\hat{x}_{\hat{t}_{k}})\Vert =0. \end{aligned}$$
(20)

This and (19) imply that

$$\begin{aligned} \lim _{k\rightarrow +\infty }\frac{1}{\hat{t}_{k}}\Vert x_{k}-\hat{x}_{\hat{t}_{k}}\Vert ^{p-1}=0. \end{aligned}$$
(21)

Moreover, the optimality of \(\hat{x}_{\hat{t}_{k}}\) yields

$$\begin{aligned} \frac{J_{p}(x_{k}-\hat{x}_{\hat{t}_{k}})-\hat{t}_{k}\nabla g(x_{k})}{\hat{t}_{k}}+\nabla g(\hat{x}_{\hat{t}_{k}})\in \partial f(\hat{x}_{\hat{t}_{k}})+\nabla g(\hat{x}_{\hat{t}_{k}})=\partial (f+g)(\hat{x}_{\hat{t}_{k}}). \end{aligned}$$

By letting \(k\rightarrow +\infty \) and using (20), (21) and the demiclosedness of \(\mathrm{Gph}(\partial (f+g)),\) we get that \(0\in \partial (f+g)(\bar{x}),\) which means \(\bar{x}\in \mathcal {S}.\) It remains to verify (i) in this case. Indeed, we get from Lemma 3 that

$$\begin{aligned} \Vert x_{k}-\hat{x}_{\hat{t}_{k}}\Vert \ge \Vert x_{k}-x_{k+1}\Vert . \end{aligned}$$

This together with (21) yields

$$\begin{aligned} \lim _{k\rightarrow +\infty }\frac{1}{t_{k}}\Vert x_{k}-x_{k+1}\Vert ^{p-1}=0. \end{aligned}$$

Hence, since \(h(x_{k})\rightarrow 0\) by Proposition 1(iii), it follows from (14) that \(\varPhi (x_{k})\rightarrow \varPhi ^{*}\) along this subsequence; as \(\{\varPhi (x_{n})\}\) converges by Proposition 1(iii), we conclude that \(\varPhi (x_{n})\rightarrow \varPhi ^{*}.\)

(ii) Since \(\lim _{n\rightarrow \infty }\varPhi (x_{n})=\varPhi ^{*},\) the sequence \(\{x_{n}\}\) is a minimizing sequence; thus, due to the weak lower semicontinuity of \(\varPhi \) (see Lemma 1), all weak accumulation points of \(\{x_{n}\}\) belong to \(\mathcal {S}.\) \(\square \)

Proposition 3

Let \(\{x_{n}\}\) be a sequence generated by Iterative Method 3.1. Suppose that the sequence \(\{x_{n}\}\) is bounded and there exists \(\bar{t}>0\) such that \(t_{n}\ge \bar{t}.\) Then, \(\varPhi (x_{n})-\varPhi ^{*}\le \lambda n^{1-p}\) for some \(\lambda >0.\)

Proof

By the mean value theorem, we have

$$\begin{aligned}&\frac{1}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}}-\frac{1}{(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}\\&\quad =\frac{(\varPhi (x_{n})-\varPhi ^{*})^{q-1}-(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}\\&\quad =\frac{(q-1)\xi ^{q-2}[(\varPhi (x_{n})-\varPhi ^{*})-(\varPhi (x_{n+1})-\varPhi ^{*})]}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}, \end{aligned}$$

where q is the conjugate exponent of p, i.e., \(\frac{1}{p}+\frac{1}{q}=1,\) and \(\varPhi (x_{n+1})-\varPhi ^{*}\le \xi \le \varPhi (x_{n})-\varPhi ^{*}.\) Thus,

$$\begin{aligned} \xi ^{q-2}\ge (\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{-1} \end{aligned}$$

and, by (18),

$$\begin{aligned}&\frac{1}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}}-\frac{1}{(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}\\&\quad \ge \frac{(q-1)\bar{c}[(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{q-1}]}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}\\&\quad =(q-1)\bar{c}, \end{aligned}$$

where \(\bar{c}=\left( \frac{1-\beta }{c_{2}}\right) ^{\frac{p}{p-1}}.\) Summing up then yields

$$\begin{aligned}&\frac{1}{(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}-\frac{1}{(\varPhi (x_{0})-\varPhi ^{*})^{q-1}}\\&\quad =\sum _{i=0}^{n-1}\frac{1}{(\varPhi (x_{i+1})-\varPhi ^{*})^{q-1}}-\frac{1}{(\varPhi (x_{i})-\varPhi ^{*})^{q-1}}\\&\quad \ge n(q-1)\bar{c}. \end{aligned}$$

Consequently,

$$\begin{aligned} (\varPhi (x_{n})-\varPhi ^{*})^{q-1}\le ((\varPhi (x_{0})-\varPhi ^{*})^{1-q}+n(q-1)\bar{c})^{-1}. \end{aligned}$$

Hence, there exists \(\lambda >0,\) such that \(\varPhi (x_{n})-\varPhi ^{*}\le \lambda n^{1-p}.\) \(\square \)

4 The FBS Method with Linesearch 2

[Algorithm box: Linesearch 2 \((x,~\theta )\); not reproduced here. A sketch is given after Iterative Method 4.1 below.]

Lemma 5

If \(x\in \mathrm{dom}f,\) then Linesearch 2\((x,\theta )\) stops after finitely many steps.

Proof

If \(x\in \mathcal {S},\) then \(x=\hat{x},\) where \(\hat{x}=\mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert y-x\Vert ^{p}+\langle \nabla g(x),y\rangle +f(y)\}\) denotes the point computed in Linesearch 2. Thus the linesearch immediately gives the output 1 without performing any step. If \(x\notin \mathcal {S},\) suppose by contradiction that Linesearch 2 does not stop after finitely many steps. Then for all \(t\in \{1,\theta ,\theta ^{2},\ldots \},\) we obtain

$$\begin{aligned}&(f+g)(x-t(x-\hat{x}))>(f+g)(x)-t(f(x)-f(\hat{x}))-t\langle \nabla g(x),x-\hat{x}\rangle \nonumber \\&\quad +\frac{t}{p}\Vert \hat{x}-x\Vert ^{p}. \end{aligned}$$

Since \(f(x-t(x-\hat{x}))\le (1-t)f(x)+tf(\hat{x}),\) we have

$$\begin{aligned} \frac{g(x-t(x-\hat{x}))-g(x)}{t}+\langle \nabla g(x),x-\hat{x}\rangle >\frac{1}{p}\Vert \hat{x}-x\Vert ^{p}. \end{aligned}$$

Then we get that

$$\begin{aligned} 0=\lim _{t\rightarrow 0}\frac{g(x-t(x-\hat{x}))-g(x)+t\langle \nabla g(x),x-\hat{x}\rangle }{t}\ge \frac{1}{p}\Vert \hat{x}-x\Vert ^{p}. \end{aligned}$$

Hence we have \(\hat{x}=x,\) which readily implies that \(0\in \partial f(x)+\nabla g(x).\) This contradicts the assumption that x is not an optimal solution to problem (1) and completes the proof of the lemma. \(\square \)

In this section, we shall consider the convergence and convergence rate of the following forward–backward splitting method:

Iterative Method 4.1.

Given \(x_{0}\in X,\) for every \(n\in \mathbf {N},\) set

$$\begin{aligned} \left\{ \begin{array}{l} y_{n}=\mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert x_{n}-y\Vert ^{p}+\langle \nabla g(x_{n}) ,y\rangle +f(y)\},\\ x_{n+1}=(1-t_{n})x_{n}+t_{n}y_{n}, \end{array} \right. \end{aligned}$$

where \(t_n=\mathbf{Linesearch ~2}(x_{n}, ~\theta ).\)
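
As with Linesearch 1, the algorithm box for Linesearch 2 is not reproduced above; the Python sketch below reconstructs it from the inequality used in the proof of Lemma 5 and in Remark 2: the stepsize is the largest \(t\in \{1,\theta ,\theta ^{2},\ldots \}\) satisfying the decrease condition stated there. The subproblem solver, the cap on trial steps and the toy instance are assumptions made only for this illustration.

```python
import numpy as np

def linesearch2(x, theta, p, f, g, grad_g, subprob, norm, max_trials=50):
    """Sketch of Linesearch 2(x, theta), reconstructed from Lemma 5 and Remark 2.
    With x_hat = argmin_y {(1/p)||y - x||^p + <grad g(x), y> + f(y)} (computed by the
    assumed solver `subprob`), accept the largest t in {1, theta, theta^2, ...} with
      (f+g)(x - t*(x - x_hat)) <= (f+g)(x) - t*(f(x) - f(x_hat))
                                  - t*<grad g(x), x - x_hat> + (t/p)*||x_hat - x||^p."""
    x_hat = subprob(x)
    d = x - x_hat
    slope = f(x) - f(x_hat) + grad_g(x) @ d - norm(d) ** p / p
    t = 1.0
    for _ in range(max_trials):                  # Lemma 5: finitely many trials suffice
        if f(x - t * d) + g(x - t * d) <= f(x) + g(x) - t * slope:
            break
        t *= theta
    return t, x_hat

def iterative_method_41(x0, theta, p, f, g, grad_g, subprob, norm, n_iter=50):
    """Iterative Method 4.1: y_n = subprob(x_n), x_{n+1} = (1 - t_n)*x_n + t_n*y_n."""
    x = x0
    for _ in range(n_iter):
        t, y = linesearch2(x, theta, p, f, g, grad_g, subprob, norm)
        x = (1 - t) * x + t * y
    return x

# toy instance (p = 2, f = 0, g(x) = 0.5*||x||^2), for illustration only;
# the subproblem then has the closed form y = x - grad g(x).
f0 = lambda x: 0.0
g0 = lambda x: 0.5 * float(x @ x)
grad_g0 = lambda x: x
subprob0 = lambda x: x - grad_g0(x)
x_out = iterative_method_41(np.array([4.0, -1.0]), theta=0.5, p=2, f=f0, g=g0,
                            grad_g=grad_g0, subprob=subprob0, norm=np.linalg.norm)
```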

Remark 2

Let \(\{x_{n}\}\) and \(\{y_{n}\}\) be sequences generated by Iterative Method 4.1. It follows from Linesearch 2\((x_{n}, ~\theta )\) that

$$\begin{aligned} (f+g)(x_{n+1})\le & {} (f+g)(x_{n})-t_{n}(f(x_{n})-f(y_{n}))\\&-t_{n}\langle \nabla g(x_{n}),x_{n}-y_{n}\rangle +\frac{t_{n}}{p}\Vert x_{n}-y_{n}\Vert ^{p}. \end{aligned}$$

If we remove the f term from Linesearch 2\((x_{n}, ~\theta ),\) we can still get this result from the convexity of f. However, if f is strictly convex or strongly convex, the number of iteration steps in Linesearch 2 may be reduced.

Next we obtain results for Iterative Method 4.1 analogous to those obtained in Sect. 3 for Iterative Method 3.1.

Proposition 4

Let \(\{x_{n}\}\) be a sequence generated by Iterative Method 4.1 and define

$$\begin{aligned} \rho (x_{n}):=f(x_{n})-f(y_{n})+\langle \nabla g(x_{n}),x_{n}-y_{n}\rangle . \end{aligned}$$

Then, we have

  1. (i)

    \(\frac{1}{t_{n}^{p}}\Vert x_{n}-x_{n+1}\Vert ^{p}\le \rho (x_{n}).\)

  2. (ii)

    \(\varPhi (x_{n+1})\le \varPhi (x_{n})-t_{n}(1-\frac{1}{p})\rho (x_{n}).\)

  3. (iii)

    \(\varPhi (x_{n})\) converges and \(\sum _{n=1}^{\infty }t_{n}\rho (x_{n})<+\infty .\)

Proof

(i) The optimality of \(y_{n}\) implies

$$\begin{aligned} 0\in \partial \left( \frac{1}{p}\Vert x_{n}-\cdot \Vert ^{p}+\langle \nabla g(x_{n}) ,\cdot \rangle +f(\cdot )\right) (y_{n}). \end{aligned}$$

Then, we have

$$\begin{aligned} J_{p}(x_{n}-y_{n})-\nabla g(x_{n})\in \partial f(y_{n}). \end{aligned}$$
(22)

Hence, by \(y_{n}=\frac{1}{t_{n}}x_{n+1}+(1-\frac{1}{t_{n}})x_{n},\) we have

$$\begin{aligned} \left\langle J_{p}(\frac{x_{n}-x_{n+1}}{t_{n}})-\nabla g(x_{n}),x_{n}-(\frac{1}{t_{n}}x_{n+1}+(1-\frac{1}{t_{n}})x_{n})\right\rangle \le f(x_{n})-f(y_{n}), \end{aligned}$$

and therefore

$$\begin{aligned} \Vert \frac{x_{n}-x_{n+1}}{t_{n}}\Vert ^{p} \le f(x_{n})-f(y_{n})+\langle \nabla g(x_{n}) ,x_{n}-y_{n}\rangle =\rho (x_{n}). \end{aligned}$$
(23)

(ii) Using the definition of \(\rho (x_{n})\) and Remark 2, we get that

$$\begin{aligned} \varPhi (x_{n})-\varPhi (x_{n+1})\ge & {} t_{n}(f(x_{n})-f(y_{n}))+t_{n}\langle \nabla g(x_{n}),x_{n}-y_{n}\rangle -\frac{t_{n}}{p}\Vert x_{n}-y_{n}\Vert ^{p}\nonumber \\= & {} t_{n}\rho (x_{n})-\frac{1}{pt_{n}^{p-1}}\Vert x_{n}-x_{n+1}\Vert ^{p}. \end{aligned}$$
(24)

It follows from (23) and (24) that

$$\begin{aligned} \varPhi (x_{n+1})\le \varPhi (x_{n})-t_{n}(1-\frac{1}{p})\rho (x_{n}). \end{aligned}$$

(iii) The conclusion follows using Lemma 2 and Proposition 4 (ii). \(\square \)

Proposition 5

Let \(\{x_{n}\}\) and \(\{y_{n}\}\) be sequences generated by Iterative Method 4.1. Assume the sequence \(\{x_{n}\}\) is bounded. Then,

  1. (i)

    all weak accumulation points of \(\{x_{n}\}\) belong to \(\mathcal {S}.\)

  2. (ii)

    if there exists \(\{t_{k}\}\subset \{t_{n}\}\) such that \(t_{k}\rightarrow \bar{t}>0,\) then \(\lim \nolimits _{n\rightarrow \infty }\varPhi (x_{n})=\varPhi ^{*}.\)

  3. (iii)

    if \(t_{n}\rightarrow 0\) and f is uniformly continuous on any bounded subset of \(\mathrm{dom}f,\) then \(\lim \nolimits _{n\rightarrow \infty }\varPhi (x_{n})=\varPhi ^{*}.\)

Proof

(i) Let \(\hat{x}\in \mathcal {S}\) and let \(\{x_{k}\}\) be a subsequence of \(\{x_{n}\}\) converging weakly to \(\bar{x}\). Since

$$\begin{aligned} g(x_{n})-g(\hat{x})\le \langle \nabla g(x_{n}),x_{n}-\hat{x}\rangle \end{aligned}$$

and

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*} =f(x_{n})-f(\hat{x})+g(x_{n})-g(\hat{x}), \end{aligned}$$

we have that

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le & {} f(x_{n})-f(\hat{x})+\langle \nabla g(x_{n}),x_{n}-\hat{x}\rangle \nonumber \\= & {} \rho (x_{n})+ f(y_{n})-f(\hat{x})+\langle \nabla g(x_{n}),y_{n}-\hat{x}\rangle . \end{aligned}$$
(25)

Then, by (22) and (25), we have

$$\begin{aligned}&\varPhi (x_{n})-\varPhi ^{*}\nonumber \\&\quad \le \rho (x_{n})-\langle J_{p}(\frac{x_{n}-x_{n+1}}{t_{n}})-\nabla g(x_{n}) ,\hat{x}-y_{n}\rangle +\langle \nabla g(x_{n}),y_{n}-\hat{x}\rangle \nonumber \\&\quad = \rho (x_{n})-\langle J_{p}(\frac{x_{n}-x_{n+1}}{t_{n}}) ,\hat{x}-y_{n}\rangle \nonumber \\&\quad \le \rho (x_{n}) +(\frac{1}{t_{n}})^{p-1}\Vert x_{n}-x_{n+1}\Vert ^{p-1}\Vert \hat{x}-(\frac{1}{t_{n}}x_{n+1}+(1-\frac{1}{t_{n}})x_{n})\Vert . \end{aligned}$$
(26)

Now let us split our further analysis into two distinct cases.

Case 1 Without loss of generality, we assume that \(t_{k}\rightarrow \bar{t}>0.\) Then, by Proposition 4(iii), \(t_{k}\rho (x_{k})\rightarrow 0,\) so \(\lim _{k\rightarrow \infty }\rho (x_{k})=0\) and, by Proposition 4(i), \(\lim _{k\rightarrow \infty }\Vert x_{k}-x_{k+1}\Vert =0.\) Then by (26), we have \(\lim _{k\rightarrow \infty }\varPhi (x_{k})=\varPhi ^{*},\) and since \(\{\varPhi (x_{n})\}\) converges by Proposition 4(iii), \(\lim _{n\rightarrow \infty }\varPhi (x_{n})=\varPhi ^{*}.\) Hence, the sequence \(\{x_{k}\}\) is a minimizing sequence; thus, due to the weak lower semicontinuity of \(\varPhi \), \(\bar{x}\in \mathcal {S}.\)

Case 2 Suppose now that \(\lim _{k\rightarrow +\infty }t_{k}=0.\) Define \(\hat{t}_{k}=\frac{t_{k}}{\theta }>t_{k}>0\) and

$$\begin{aligned} \left\{ \begin{array}{l} y_{k}=\mathrm{argmin}_{y\in X}\{\frac{1}{p}\Vert x_{k}-y\Vert ^{p}+\langle \nabla g(x_{k}) ,y\rangle +f(y)\},\\ \hat{x}_{k+1}=(1-\hat{t}_{k})x_{k}+\hat{t}_{k}y_{k}, \end{array} \right. \end{aligned}$$
(27)

It follows from the definition of Linesearch 2\((x_{k},\theta )\) that

$$\begin{aligned} \varPhi (x_{k})-\varPhi (\hat{x}_{k+1})\le \hat{t}_{k}(f(x_{k})-f(y_{k}))+\hat{t}_{k}\langle \nabla g(x_{k}),x_{k}-y_{k}\rangle -\frac{\hat{t}_{k}}{p}\Vert x_{k}-y_{k}\Vert ^{p}. \end{aligned}$$

This, together with the convexity of f and g and (27), gives us that

$$\begin{aligned} 0\ge & {} \varPhi (x_{k})-\varPhi (\hat{x}_{k+1})-\hat{t}_{k}(f(x_{k})-f(y_{k}))-\hat{t}_{k}\langle \nabla g(x_{k}),x_{k}-y_{k}\rangle \\&+\frac{\hat{t}_{k}}{p}\Vert x_{k}-y_{k}\Vert ^{p}\\= & {} f(x_{k})-f(\hat{x}_{k+1})+g(x_{k})-g(\hat{x}_{k+1})\\&-\hat{t}_{k}(f(x_{k})-f(y_{k}))-\hat{t}_{k}\langle \nabla g(x_{k}),x_{k}-y_{k}\rangle +\frac{\hat{t}_{k}}{p}\Vert x_{k}-y_{k}\Vert ^{p}\\\ge & {} -\hat{t}_{k}\langle \nabla g(x_{k}),x_{k}-y_{k}\rangle +\langle \nabla g(\hat{x}_{k+1}),x_{k}-\hat{x}_{k+1}\rangle +\frac{\hat{t}_{k}}{p}\Vert x_{k}-y_{k}\Vert ^{p}\\&+f(x_{k})-\hat{t}_{k}f(y_{k})-(1-\hat{t}_{k})f(x_{k})-\hat{t}_{k}(f(x_{k})-f(y_{k}))\\= & {} \hat{t}_{k}\langle \nabla g(\hat{x}_{k+1})-\nabla g(x_{k}),x_{k}-y_{k}\rangle +\frac{\hat{t}_{k}}{p}\Vert x_{k}-y_{k}\Vert ^{p}. \end{aligned}$$

We obtain from the latter that

$$\begin{aligned} \frac{\hat{t}_{k}}{p}\Vert x_{k}-y_{k}\Vert ^{p}\le \hat{t}_{k}\Vert \nabla g(\hat{x}_{k+1})-\nabla g(x_{k})\Vert \Vert x_{k}-y_{k}\Vert , \end{aligned}$$

which yields

$$\begin{aligned} \frac{1}{p}\Vert x_{k}-y_{k}\Vert ^{p-1}\le \Vert \nabla g(\hat{x}_{k+1})-\nabla g(x_{k})\Vert . \end{aligned}$$
(28)

On the other hand,

$$\begin{aligned} \langle J_{p}(x_{k}-y_{k})-\nabla g(x_{k}),y_{0}-y_{k }\rangle \le f(y_{0})-f(y_{k}) \end{aligned}$$

and

$$\begin{aligned} \langle J_{p}(x_{0}-y_{0})-\nabla g(x_{0}),y_{k}-y_{0}\rangle \le f(y_{k})-f(y_{0}). \end{aligned}$$

Adding the two inequalities, we get

$$\begin{aligned} \langle J_{p}(x_{k}-y_{k})-J_{p}(x_{0}-y_{0}),y_{0}-y_{k}\rangle \le \langle \nabla g(x_{0})-\nabla g(x_{k}),y_{k}-y_{0}\rangle . \end{aligned}$$

Then we have

$$\begin{aligned} \Vert x_{k}-y_{k}\Vert ^{p}\le & {} \langle J_{p}(x_{k}-y_{k}),x_{k}-y_{0}\rangle +\langle J_{p}(x_{0}-y_{0}),y_{0}-y_{k}\rangle \\&+\langle \nabla g(x_{0})-\nabla g(x_{k}),y_{k}-y_{0}\rangle . \end{aligned}$$

Hence

$$\begin{aligned} \Vert x_{k}-y_{k}\Vert ^{p}\le & {} \Vert x_{k}-y_{k}\Vert ^{p-1}\Vert y_{0}-x_{k}\Vert +\Vert x_{0}-y_{0}\Vert ^{p-1}\Vert y_{0}-y_{k}\Vert \\&+\Vert \nabla g(x_{0})-\nabla g(x_{k})\Vert \Vert y_{k}-y_{0}\Vert . \end{aligned}$$

Due to Assumption 1, \(p>1\) and the boundedness of \(\{x_{n}\},\) the latter tells us that \(\{y_{k}\}\) is also bounded. This together with (27) and the fact \(\lim _{k\rightarrow +\infty }t_{k}=0\) implies \(\lim _{k\rightarrow +\infty }\Vert \hat{x}_{k+1}-x_{k}\Vert =0.\) Since \(\nabla g\) is uniformly continuous on bounded sets, we get \(\lim _{k\rightarrow +\infty }\Vert \nabla g(\hat{x}_{k+1})-\nabla g(x_{k})\Vert =0\) and derive from (28) that

$$\begin{aligned} \lim _{k\rightarrow +\infty }\Vert x_{k}-y_{k}\Vert =0. \end{aligned}$$
(29)

Since \(\nabla g\) is uniformly continuous on bounded sets, (29) implies

$$\begin{aligned} \lim _{k\rightarrow +\infty }\Vert \nabla g(x_{k})-\nabla g(y_{k})\Vert =0. \end{aligned}$$
(30)

Note that \(J_{p}(x_{k}-y_{k})-\nabla g(x_{k})+\nabla g(y_{k})\in \partial f(y_{k})+\nabla g(y_{k})=\partial (f+g)(y_{k}).\) By passing to the limit along the subsequence \(\{x_{k}\}\) in this inclusion and using (29), (30) and the demiclosedness of \(\mathrm{Gph}(\partial (f+g)),\) we get that \(0\in \partial (f+g)(\bar{x}),\) which means \(\bar{x}\in \mathcal {S}.\)

(ii) Suppose now that \(\{t_{k}\}\subset \{t_{n}\}\) and \(t_{k}\rightarrow \bar{t}>0.\) Then, by Proposition 4 and (26), we have \(\lim _{k\rightarrow \infty }\varPhi (x_{k})=\varPhi ^{*},\) and since \(\{\varPhi (x_{n})\}\) converges by Proposition 4(iii), \(\lim _{n\rightarrow \infty }\varPhi (x_{n})=\varPhi ^{*}.\)

(iii) By (29), we have

$$\begin{aligned} \lim _{n\rightarrow +\infty }\Vert x_{n}-y_{n}\Vert =0. \end{aligned}$$
(31)

Since f is uniformly continuous on any bounded set and \(\nabla g\) is uniformly continuous on bounded sets, (31) implies

$$\begin{aligned} \lim _{n\rightarrow +\infty }\rho (x_{n})=\lim _{n\rightarrow +\infty }f(x_{n})-f(y_{n})+\langle \nabla g(x_{n}),x_{n}-y_{n}\rangle =0. \end{aligned}$$

Hence, by Proposition 4(i) and (26), we have \(\lim _{n\rightarrow \infty }\varPhi (x_{n})=\varPhi ^{*}.\) \(\square \)

Proposition 6

Let \(\{x_{n}\}\)and \(\{y_{n}\}\) be sequences generated by Iterative Method 4.1. Suppose that the sequence \(\{x_{n}\}\) is bounded and there exists \(\bar{t}>0\) such that \(t_{n}\ge \bar{t}\) for all \(n\in \mathbf {N}.\) Then, \(\varPhi (x_{n})-\varPhi ^{*}\le \lambda n^{1-p}\) for some \(\lambda >0.\)

Proof

By (26) and Proposition 4, we have

$$\begin{aligned}&\varPhi (x_{n})-\varPhi ^{*}\\&\le \rho (x_{n}) +(\frac{1}{t_{n}})^{p-1}\Vert x_{n}-x_{n+1}\Vert ^{p-1}\Vert \hat{x}-(\frac{1}{t_{n}}x_{n+1}+(1-\frac{1}{t_{n}})x_{n})\Vert \\&\le \frac{1}{t_{n}}\frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\frac{1}{p}}+\Vert \frac{x_{n}-x_{n+1}}{t_{n}}\Vert ^{p-1}\Vert \hat{x}-(\frac{1}{t_{n}}x_{n+1}+(1-\frac{1}{t_{n}})x_{n})\Vert \\&\le \frac{1}{t_{n}}\frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\frac{1}{p}}\\&+\left( \frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\frac{1}{p} }\right) ^{\frac{p-1}{p}}t_{n}^{-\frac{p-1}{p}}\Vert \hat{x}-(\frac{1}{t_{n}}x_{n+1}+(1-\frac{1}{t_{n}})x_{n})\Vert . \end{aligned}$$

Since \(\{x_{n}\}\) is bounded and \(t_{n}\ge \bar{t}>0,\) there exists \(c_{1}\ge 0\) such that

$$\begin{aligned} t_{n}^{-\frac{p-1}{p}}\Vert \hat{x}-(\frac{1}{t_{n}}x_{n+1}+(1-\frac{1}{t_{n}})x_{n})\Vert \le c_{1}. \end{aligned}$$

Hence,

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le \frac{1}{t_{n}}\frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\frac{1}{p} }+\left( \frac{\varPhi (x_{n})-\varPhi (x_{n+1})}{1-\frac{1}{p}}\right) ^{\frac{p-1}{p}}c_{1}. \end{aligned}$$
(32)

Since \(\{\varPhi (x_{n})-\varPhi ^{*}\}\) is bounded by Proposition 4(iii) and \(t_{n}\ge \bar{t},\) there exists \(c_{2}>0\) such that

$$\begin{aligned} \frac{1}{t_{n}}(\varPhi (x_{n})-\varPhi (x_{n+1}))^{\frac{1}{p}}+(1-\frac{1}{p})^{\frac{1}{p}}c_{1}\le c_{2}. \end{aligned}$$
(33)

Then, by (32) and (33), we have

$$\begin{aligned}&(1-\frac{1}{p})(\varPhi (x_{n})-\varPhi ^{*})\nonumber \\&\quad \le \frac{1}{t_{n}}(\varPhi (x_{n})-\varPhi (x_{n+1}))+(\varPhi (x_{n})-\varPhi (x_{n+1})) ^{\frac{p-1}{p}}(1-\frac{1}{p} )^{\frac{1}{p}}c_{1}\nonumber \\&\quad =(\varPhi (x_{n})-\varPhi (x_{n+1}))^{\frac{p-1}{p}}\left( \frac{1}{t_{n}}(\varPhi (x_{n})-\varPhi (x_{n+1}))^{\frac{1}{p}}+(1-\frac{1}{p} )^{\frac{1}{p}}c_{1}\right) \nonumber \\&\quad \le c_{2}(\varPhi (x_{n})-\varPhi (x_{n+1}) )^{\frac{p-1}{p}}. \end{aligned}$$
(34)

Using (34), we get that

$$\begin{aligned} \frac{p-1}{pc_{2}}(\varPhi (x_{n})-\varPhi ^{*})\le (\varPhi (x_{n})-\varPhi (x_{n+1}))^{\frac{p-1}{p}} \end{aligned}$$

and therefore,

$$\begin{aligned} (\varPhi (x_{n+1})-\varPhi ^{*})\le (\varPhi (x_{n})-\varPhi ^{*})-(\frac{p-1}{pc_{2}})^{\frac{p}{p-1}}(\varPhi (x_{n})-\varPhi ^{*})^{\frac{p}{p-1}}. \end{aligned}$$
(35)

By the mean value theorem, we have

$$\begin{aligned}&\frac{1}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}}-\frac{1}{(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}\\&\quad =\frac{(\varPhi (x_{n})-\varPhi ^{*})^{q-1}-(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}\\&\quad =\frac{(q-1)\xi ^{q-2}[(\varPhi (x_{n})-\varPhi ^{*})-(\varPhi (x_{n+1})-\varPhi ^{*})]}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{q-1}} \end{aligned}$$

with \(\varPhi (x_{n+1})-\varPhi ^{*}\le \xi \le \varPhi (x_{n})-\varPhi ^{*} \) and \(\frac{1}{q}+\frac{1}{p}=1.\) Thus,

$$\begin{aligned} \xi ^{q-2}\ge (\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{-1} \end{aligned}$$

and, by (35),

$$\begin{aligned}&\frac{1}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}}-\frac{1}{(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}\\\ge & {} \frac{(q-1)\bar{c}[(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{q-1}]}{(\varPhi (x_{n+1})-\varPhi ^{*})^{q-1}(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}\\= & {} (q-1)\bar{c}, \end{aligned}$$

where \(\bar{c}=\left( \frac{p-1}{pc_{2}}\right) ^{\frac{p}{p-1}}.\) Summing up then yields

$$\begin{aligned}&\frac{1}{(\varPhi (x_{n})-\varPhi ^{*})^{q-1}}-\frac{1}{(\varPhi (x_{0})-\varPhi ^{*})^{q-1}}\\&\qquad =\sum _{i=0}^{n-1}\frac{1}{(\varPhi (x_{i+1})-\varPhi ^{*})^{q-1}}-\frac{1}{(\varPhi (x_{i})-\varPhi ^{*})^{q-1}}\\&\qquad \ge n(q-1)\bar{c}. \end{aligned}$$

Consequently,

$$\begin{aligned} (\varPhi (x_{n})-\varPhi ^{*})^{q-1}\le ((\varPhi (x_{0})-\varPhi ^{*})^{1-q}+n(q-1)\bar{c})^{-1}. \end{aligned}$$

Hence, there exists \(\lambda >0,\) such that \(\varPhi (x_{n})-\varPhi ^{*}\le \lambda n^{1-p}.\) \(\square \)

Definition 3

[17] We say that the sequence of iterates generated by Iterative Method 4.1 satisfies the error bound property if there exists \(\kappa >0\) such that

$$\begin{aligned} \min _{y\in \mathcal {S}}\Vert x_{n}-y\Vert \le \kappa \Vert x_{n}-y_{n}\Vert . \end{aligned}$$
(36)

Proposition 7

Let \(\{x_{n}\}\) and \(\{y_{n}\}\) be sequences generated by Iterative Method 4.1. Assume that the error bound property (36) holds for \(\{x_{n}\}\) and the sequence \(\{x_{n}\}\) is bounded. If there exists \(\bar{t}>0\) such that \(t_{n}\ge \bar{t}\) for all \(n\in \mathbf {N},\) then the sequence \(\{\varPhi (x_{n})\}\) converges \(Q-\)linearly, that is, there exists \(0<\epsilon <1\) such that

$$\begin{aligned} \varPhi (x_{n+1})-\varPhi ^{*}\le \epsilon (\varPhi (x_{n})-\varPhi ^{*}). \end{aligned}$$

Proof

Since \(\mathcal {S}\) is nonempty, closed and convex and X is reflexive, there exists \(\hat{x}_{n}\in \mathcal {S},\) such that \(\Vert x_{n}-\hat{x}_{n}\Vert =\min _{y\in \mathcal {S}}\Vert x_{n}-y\Vert \) and \(\varPhi (\hat{x}_{n})=\varPhi ^{*}.\) Since \(\{x_{n}\}\) is bounded, \(\{\hat{x}_{n}\}\) is also bounded. Since

$$\begin{aligned} g(x_{n})-g(\hat{x}_{n})\le \langle \nabla g(x_{n}),x_{n}-\hat{x}_{n}\rangle \end{aligned}$$

and

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*} =f(x_{n})-f(\hat{x}_{n})+g(x_{n})-g(\hat{x}_{n}), \end{aligned}$$

we have that

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le & {} f(x_{n})-f(\hat{x}_{n})+\langle \nabla g(x_{n}),x_{n}-\hat{x}_{n}\rangle \nonumber \\= & {} \rho (x_{n})+ f(y_{n})-f(\hat{x}_{n})+\langle \nabla g(x_{n}),y_{n}-\hat{x}_{n}\rangle . \end{aligned}$$
(37)

Then, by (22) and (37), we have

$$\begin{aligned}&\varPhi (x_{n})-\varPhi ^{*}\nonumber \\&\quad \le \rho (x_{n})-\langle J_{p}(\frac{x_{n}-x_{n+1}}{t_{n}})-\nabla g(x_{n}) ,\hat{x}_{n}-y_{n}\rangle +\langle \nabla g(x_{n}),y_{n}-\hat{x}_{n}\rangle \nonumber \\&\quad = \rho (x_{n})-\langle J_{p}(\frac{x_{n}-x_{n+1}}{t_{n}}) ,\hat{x}_{n}-y_{n}\rangle \nonumber \\&\quad \le \rho (x_{n}) +(\frac{1}{t_{n}})^{p-1}\Vert x_{n}-x_{n+1}\Vert ^{p-1}\Vert \hat{x}_{n}-(\frac{1}{t_{n}}x_{n+1}+(1-\frac{1}{t_{n}})x_{n})\Vert . \end{aligned}$$
(38)

By (38), we have

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le & {} \rho (x_{n}) +(\frac{1}{t_{n}})^{p-1}\Vert x_{n}-x_{n+1}\Vert ^{p-1}\Vert (\hat{x}_{n}-x_{n})-\frac{1}{t_{n}}(x_{n+1}-x_{n})\Vert \\\le & {} \rho (x_{n}) +(\frac{1}{t_{n}})^{p-1}\Vert x_{n}-x_{n+1}\Vert ^{p-1}(\Vert \hat{x}_{n}-x_{n}\Vert +\frac{1}{t_{n}}\Vert x_{n+1}-x_{n}\Vert ). \end{aligned}$$

Now we use the error bound condition \(\min _{y\in \mathcal {S}}\Vert x_{n}-y\Vert \le \kappa \Vert x_{n}-y_{n}\Vert \) together with Proposition 4 to obtain

$$\begin{aligned} \varPhi (x_{n})-\varPhi ^{*}\le & {} \frac{p}{t_{n}(p-1)}\times (\varPhi (x_{n})-\varPhi (x_{n+1}))\nonumber \\&+(\frac{1}{t_{n}})^{p-1}\Vert x_{n}-x_{n+1}\Vert ^{p-1}(\kappa \Vert x_{n}-y_{n}\Vert +\frac{1}{t_{n}}\Vert x_{n+1}-x_{n}\Vert ) \nonumber \\= & {} \frac{p}{t_{n}(p-1)}\times (\varPhi (x_{n})-\varPhi (x_{n+1}))+\frac{1+\kappa }{t_{n}^{p}}\Vert x_{n}-x_{n+1}\Vert ^{p}\nonumber \\\le & {} \frac{(2+\kappa )p}{t_{n}(p-1)}\times (\varPhi (x_{n})-\varPhi (x_{n+1}))\nonumber \\\le & {} \frac{(2+\kappa )p}{\bar{t}(p-1)}\times (\varPhi (x_{n})-\varPhi (x_{n+1})). \end{aligned}$$
(39)

It follows from (39) that

$$\begin{aligned} \varPhi (x_{n+1})-\varPhi ^{*}\le & {} \left( \frac{(2+\kappa )p}{\bar{t}(p-1)}-1\right) \left( \frac{(2+\kappa )p}{\bar{t}(p-1)}\right) ^{-1}(\varPhi (x_{n})-\varPhi ^{*})\\= & {} \left( 1-\frac{\bar{t}(p-1)}{(2+\kappa )p}\right) (\varPhi (x_{n})-\varPhi ^{*})\\= & {} \epsilon (\varPhi (x_{n})-\varPhi ^{*}), \end{aligned}$$

where \(0<\epsilon =\left( 1-\frac{\bar{t}(p-1)}{(2+\kappa )p}\right) <1.\) Hence, the sequence \(\{\varPhi (x_{n})\}\) converges \(Q-\)linearly. \(\square \)

Corollary 1

Let \(\{x_{n}\}\) and \(\{y_{n}\}\) be sequences generated by Iterative Method 4.1. Assume that the error bound property (36) holds for \(\{x_{n}\}\) and the sequence \(\{x_{n}\}\) is bounded. If f is totally convex and there exists \(\bar{t}>0\) such that \(t_{n}\ge \bar{t}\) for all \(n\in \mathbf {N},\) then \(\{x_{n}\}\) converges \(R-\)linearly to the unique minimizer \(\bar{x}.\)

Proof

Consider the following Bregman-like distance to the solution \(\bar{x}\) of the problem (1):

$$\begin{aligned} R(x):=f(x)-f(\bar{x})+\langle \nabla g(\bar{x}),x-\bar{x}\rangle \end{aligned}$$

which is non-negative since \(-\nabla g(\bar{x})\in \partial f(\bar{x}).\) Note that if \( \partial f(\bar{x})\) consists of one point, R is indeed the Bregman distance. By optimality of \(\bar{x}\) and by Proposition 7, the iterates \(x_{n}\) satisfy

$$\begin{aligned} 0\le R(x_{n})=f(x_{n})-f(\bar{x})+\langle \nabla g(\bar{x}),x_{n}-\bar{x}\rangle \le \varPhi (x_{n})-\varPhi ^{*}\rightarrow 0. \end{aligned}$$

Hence, by Definition 2, \(x_{n}\rightarrow \bar{x}.\)

On the other hand, if f is totally convex, then \(\mathcal {S}=\{\bar{x}\}\) and

$$\begin{aligned} \min _{y\in \mathcal {S}}\Vert x_{n}-y\Vert =\Vert x_{n}-\bar{x}\Vert . \end{aligned}$$

Using the error bound condition (36), we have

$$\begin{aligned} \Vert x_{n}-\bar{x}\Vert \le \kappa \Vert x_{n}-y_{n}\Vert =\frac{\kappa }{t_{n}}\Vert x_{n}-x_{n+1}\Vert . \end{aligned}$$

Then, by Proposition 4 and \(\varPhi (x_{n})-\varPhi (\bar{x})\ge \varPhi (x_{n})-\varPhi (x_{n+1})\), we obtain

$$\begin{aligned} \varPhi (x_{n})-\varPhi (\bar{x})\ge \frac{p-1}{pt_{n}^{p-1}} \Vert x_{n}-x_{n+1}\Vert ^{p}\ge \frac{(p-1)t_{n}}{p\kappa ^{p}}\Vert x_{n}-\bar{x}\Vert ^{p} \ge \frac{(p-1)\bar{t}}{p\kappa ^{p}}\Vert x_{n}-\bar{x}\Vert ^{p} \end{aligned}$$

and consequently

$$\begin{aligned} \left( \frac{\varPhi (x_{n})-\varPhi (\bar{x})}{\mu }\right) ^\frac{1}{p}\ge \Vert x_{n}-\bar{x}\Vert , \end{aligned}$$
(40)

where \(\mu =\frac{(p-1)\bar{t}}{p\kappa ^{p}}.\) Since the sequence \(\{\varPhi (x_{n})\}\) converges \(Q-\)linearly by Proposition 7, that is, \(\varPhi (x_{n+1})-\varPhi ^{*}\le \epsilon (\varPhi (x_{n})-\varPhi ^{*}),\) we have

$$\begin{aligned} \left( \frac{\varPhi (x_{n+1})-\varPhi (\bar{x})}{\mu }\right) ^\frac{1}{p}\le \epsilon ^\frac{1}{p}\left( \frac{\varPhi (x_{n})-\varPhi (\bar{x})}{\mu }\right) ^\frac{1}{p}. \end{aligned}$$

Hence, the sequence \(\left\{ \left( \frac{\varPhi (x_{n})-\varPhi (\bar{x})}{\mu }\right) ^\frac{1}{p}\right\} \) also converges \(Q-\)linearly to zero. Thus, by (40), \(\{x_{n}\}\) converges \(R-\)linearly to \(\bar{x}.\) \(\square \)

5 Conclusion

In this work, we discussed modified forward–backward splitting methods involving new linesearches for minimizing the sum of two convex functions in Banach spaces. Our algorithms do not require knowledge of the Lipschitz constant of the gradient. We proved that every weak accumulation point of the iterative sequence generated by these methods is a solution, and, under the assumption that the iterative sequence is bounded, we proved convergence of the functional values to the optimal value with asymptotic rate \(n^{1-p}\); Q-linear convergence of the functional values was obtained under an additional error bound assumption.