1 Introduction

In this paper, we propose a forward-backward splitting algorithm to solve the following composite convex minimization problem considered in Banach spaces.

Problem 1

Let \(\mathcal {X}\) be a reflexive real Banach space, let \(\varphi \colon \mathcal {X}\to ]-\infty ,+\infty ]\) and \(\psi \colon \mathcal {X}\to ]-\infty ,+\infty ]\) be proper lower semicontinuous convex functions, and suppose that ψ is Gâteaux differentiable on the interior of its domain. The problem is to

$$ \underset{x\in \mathcal{X}}{\text{minimize}}\;\varphi(x)+\psi(x). $$
(1)

The set of solutions to (1) is denoted by S.

The particular instance of (1) in which ψ is the Bregman distance associated with a differentiable convex function f, i.e.,

$$ \begin{aligned} D^{f}\colon\mathcal{X}\times\mathcal{X}&\to [0,+\infty]\\ (x,y)&\mapsto \left\{\begin{array}{ll} f(x)-f(y)-\langle x-y,\nabla f(y)\rangle &\text{ if}\;y\in\text{int\,dom}\, f,\\ +\infty&\text{ otherwise}, \end{array}\right. \end{aligned} $$
(2)

where \(\text {dom}\,f =\{x\in \mathcal {X}\mid f(x)<+\infty \}\) and int dom f is its interior, provides a framework for many problems arising in applied mathematics. For instance, when \(\mathcal {X}\) is a Euclidean space and f is the Boltzmann–Shannon entropy, it captures many problems in information theory and signal recovery [9].
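To make (2) concrete, the following minimal sketch evaluates \(D^{f}\) in \(\mathbb {R}^{m}\) when f is the Boltzmann–Shannon entropy and compares it with the generalized Kullback–Leibler divergence, with which it should coincide. The function names and the sample vectors are illustrative assumptions and do not come from the paper.

```python
import math

def entropy(x):
    """Boltzmann-Shannon entropy f(x) = sum(x_i ln x_i - x_i), with 0 ln 0 = 0."""
    total = 0.0
    for xi in x:
        if xi < 0:
            return math.inf  # outside dom f
        total += (xi * math.log(xi) - xi) if xi > 0 else 0.0
    return total

def grad_entropy(y):
    """Gradient of f on int dom f: (ln y_i)_i."""
    return [math.log(yi) for yi in y]

def bregman(x, y):
    """D^f(x, y) as in (2): +inf unless y lies in int dom f."""
    if any(yi <= 0 for yi in y):
        return math.inf
    g = grad_entropy(y)
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))
    return entropy(x) - entropy(y) - inner

def kullback_leibler(x, y):
    """Generalized Kullback-Leibler divergence sum(x ln(x/y) - x + y)."""
    return sum((xi * math.log(xi / yi) if xi > 0 else 0.0) - xi + yi
               for xi, yi in zip(x, y))

# Illustrative data (assumed): two points in the positive orthant.
x = [0.2, 0.5, 0.3]
y = [0.4, 0.4, 0.2]
d_xy = bregman(x, y)
```

In this sketch `bregman(x, y)` and `kullback_leibler(x, y)` agree up to rounding, and `bregman(x, x)` vanishes.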

It was shown in [14] that if \(\mathcal {X}\) is Hilbertian and ψ possesses a \(\beta ^{-1}\)-Lipschitz continuous gradient for some β∈]0,+∞[, then Problem 1 can be solved by the standard forward-backward algorithm

$$ (\forall n\in\mathbb{N})\quad x_{n+1}=\text{prox}_{\gamma\varphi} \big(x_{n}-\gamma\nabla\psi(x_{n})\big), \quad\text{ where}\;0<\gamma<2\beta. $$
(3)

Here, prox denotes Moreau's proximity operator [19]. However, many problems in applications do not conform to these hypotheses, for example when \(\mathcal {X}\) is a Euclidean space and ψ is the Boltzmann–Shannon entropy, which appears in many problems in image and signal processing, in statistics, and in machine learning [2, 11, 12, 16–18]. Another difficulty in the implementation of (3) is that the operator prox is not always easy to evaluate.
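As an illustration of the Hilbertian iteration (3), the sketch below runs it in \(\mathbb {R}^{3}\) for the assumed choices \(\varphi =\lambda \|\cdot \|_{1}\) and \(\psi =\|\cdot -b\|^{2}/2\), so that ∇ψ is 1-Lipschitz (β=1) and any fixed step size in ]0,2[ is admissible. The data `b`, the weight `lam`, and the function names are hypothetical; for this φ, the proximity operator is the familiar soft-thresholding map.

```python
import math

def soft_threshold(v, tau):
    """Proximity operator of tau * ||.||_1 (componentwise soft thresholding)."""
    return [math.copysign(max(abs(vi) - tau, 0.0), vi) for vi in v]

def forward_backward(b, lam, gamma, n_iter=200):
    """Iterate x_{n+1} = prox_{gamma*phi}(x_n - gamma * grad psi(x_n))."""
    x = [0.0] * len(b)
    for _ in range(n_iter):
        # Forward (gradient) step on psi = 0.5 * ||x - b||^2.
        forward = [xi - gamma * (xi - bi) for xi, bi in zip(x, b)]
        # Backward (proximal) step on phi = lam * ||.||_1.
        x = soft_threshold(forward, gamma * lam)
    return x

# Illustrative data (assumed).
b = [3.0, 0.3, -2.0]
x_fb = forward_backward(b, lam=1.0, gamma=0.5)
```

For this toy problem the minimizer is the soft thresholding of `b` at level `lam`, which gives a simple way to check the iterates.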

The objective of the present paper is to propose a forward-backward splitting algorithm that solves Problem 1 in the general framework of reflexive real Banach spaces; so far, such algorithms have been limited to Hilbert spaces. This algorithm, which employs Bregman distance-based proximity operators, also provides new algorithms in the framework of Euclidean spaces, which are, in some instances, more favorable than the standard forward-backward splitting algorithm. Moreover, this framework can be applied when ψ is not everywhere differentiable. The paper is organized as follows. In Section 2, we provide some preliminary results. We present the algorithm and prove its convergence in Section 3. Section 4 is devoted to an application of our result to a multivariate minimization problem, together with examples.

Notation and Background

Throughout this paper, \(\mathcal {X}\) is reflexive, \(\mathcal {X}^{\ast }\) is the dual space of \(\mathcal {X}\), 〈⋅,⋅〉 is the duality pairing between \(\mathcal {X}\) and \(\mathcal {X}^{\ast }\), and ∥⋅∥ is the norm of \(\mathcal {X}\). The symbols \(\rightharpoonup \) and → represent respectively weak and strong convergence. The set of weak sequential cluster points of a sequence \((x_{n})_{n\in \mathbb {N}}\) is denoted by \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\). Let \(M\colon \mathcal {X}\to 2^{\mathcal {X}^{\ast }}\). The domain of M is \(\text {dom}\,M=\{x\in \mathcal {X}\mid Mx\neq \emptyset \}\) and the range of M is \(\text {ran}\,M=\{x^{*}\in \mathcal {X}^{\ast }\mid (\exists x\in \mathcal {X}) x^{\ast }\in Mx\}\). Let \(f\colon \mathcal {X}\to ]-\infty ,+\infty ]\). Then, f is cofinite if \(\text {dom}\, f^{\ast }=\mathcal {X}^{\ast }\), is coercive if \(\lim _{\|x\|\to +\infty }f(x)=+\infty \), is supercoercive if \(\lim _{\|x\|\to +\infty }f(x)/\|x\|=+\infty \), and is uniformly convex at x∈dom f if there exists an increasing function ϕ:[0,+∞[→[0,+∞] that vanishes only at 0 such that

$$\begin{array}{@{}rcl@{}} (\forall y\in\text{dom}\,f) (\forall \alpha\in ]0,1[)\quad &&f(\alpha x+(1-\alpha)y)+\alpha(1-\alpha)\phi(\|x-y\|)\\ &&\leq \alpha f(x)+(1-\alpha)f(y). \end{array} $$

Denote by \({\Gamma }_{0}(\mathcal {X})\) the class of all lower semicontinuous convex functions \(f\colon \mathcal {X}\to ]-\infty ,+\infty ]\) such that \(\text {dom}\,f=\{x\in \mathcal {X}\mid f(x)<+\infty \}\neq \emptyset \). Let \(f\in {\Gamma }_{0}(\mathcal {X})\). Denote by Argmin f the set of global minimizers of f, by \(f^{\ast }\colon \mathcal {X}^{\ast }\to ]-\infty ,+\infty ] \colon x^{\ast }\mapsto \sup _{x\in \mathcal {X}}(\langle x,x^{\ast }\rangle -f(x))\) the conjugate of f and by

$$ \partial f\colon\mathcal{X}\to 2^{\mathcal{X}^{\ast}}\colon x\mapsto \{x^{\ast} \in \mathcal{X}^{\ast}\mid (\forall y \in \mathcal{X}) \langle y-x,x^{\ast}\rangle + f(x)\leq f(y)\}, $$
(4)

the Moreau subdifferential of f. In addition, if f is Gâteaux differentiable on int dom f then

$$ \hat{f}\colon \mathcal{X} \to ]-\infty,+\infty] \colon x \mapsto \left\{\begin{array}{lllllll} f(x)&\text{ if} \;x \in \mathrm{int\,dom}\,f,\\ +\infty&\text{ otherwise}. \end{array}\right. $$
(5)

We denote

$$\mathcal{F}(f)=\{g\in{\Gamma}_{0}(\mathcal{X})\mid g\text{ is G\^{a}teaux differentiable on} \; \text{dom}\,g= \mathrm{int\,dom}\,f\}. $$

Moreover, if \(g_{1}\) and \(g_{2}\) are in \(\mathcal {F}(f)\), then

$$g_{1} \succcurlyeq g_{2}\quad \Leftrightarrow \quad (\forall x\in\text{dom}\,f) (\forall y\in\mathrm{int\,dom}\,f) \quad D^{g_{1}}(x,y)\geq D^{g_{2}}(x,y). $$

For every α∈[0,+∞[, set

$$\mathcal{P}_{\alpha}(f)=\{g\in\mathcal{F}(f)\mid g\succcurlyeq\alpha f\}. $$

Finally, \(\ell _{+}^{1}(\mathbb {N})\) is the set of all summable sequences in [0,+∞[.

2 Preliminary Results

In this section, we give some preliminary results on Legendre functions, Bregman monotonicity, and Bregman distance-based proximity operators that will be used in the next section.

Definition 1

[5, 6] Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f. We say that f is a Legendre function if it is essentially smooth in the sense that ∂f is both locally bounded and single-valued on its domain, and essentially strictly convex in the sense that \((\partial f)^{-1}\) is locally bounded on its domain and f is strictly convex on every convex subset of dom ∂f. Let C be a closed convex subset of \(\mathcal {X}\) such that C∩int dom f≠∅. The Bregman projector onto C induced by f is

$$\begin{array}{@{}rcl@{}} {P^{f}_{C}} \colon \mathrm{int\,dom}\,f &\to& C\cap \mathrm{int\,dom}\,f\\ y&\mapsto&\text{argmin}_{x\in C}D^{f}(x,y), \end{array} $$

and the \(D^{f}\)-distance to C is the function

$$\begin{array}{@{}rcl@{}} {D^{f}_{C}}\colon \mathcal{X}&\to& [0,+\infty]\\ y&\mapsto&\inf D^{f}(C,y). \end{array} $$

Definition 2

[20] Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f, let \((f_{n})_{n\in \mathbb {N}}\) be in \(\mathcal {F}(f)\), let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\,f)^{\mathbb {N}}\), and let \(C \subset \mathcal {X}\) be such that C∩dom f≠∅. Then \((x_{n})_{n\in \mathbb {N}}\) is:

  1.

    quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) if

    $$\begin{array}{@{}rcl@{}} &&(\exists(\eta_{n})_{n\in\mathbb{N}}\in\ell_{+}^{1}(\mathbb{N})) (\forall x\in C\cap\text{dom}\, f) (\exists(\varepsilon_{n})_{n\in\mathbb{N}}\in \ell_{+}^{1}(\mathbb{N}))(\forall n\in\mathbb{N})\\ &&\hspace{2cm} D^{f_{n+1}}(x,x_{n+1})\leq (1+\eta_{n})D^{f_{n}}(x,x_{n})+\varepsilon_{n}; \end{array} $$
  2.

    stationarily quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) if

    $$\begin{array}{@{}rcl@{}} &&(\exists(\varepsilon_{n})_{n\in\mathbb{N}}\in\ell_{+}^{1}(\mathbb{N})) (\exists(\eta_{n})_{n\in\mathbb{N}}\in\ell_{+}^{1}(\mathbb{N}))(\forall x\in C\cap\text{dom}\, f) (\forall n\in\mathbb{N})\\ &&\hspace{2cm} D^{f_{n+1}}(x,x_{n+1})\leq (1+\eta_{n})D^{f_{n}}(x,x_{n})+\varepsilon_{n}. \end{array} $$

Condition 1

[6, Condition 4.4] Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f. For all bounded sequences \((x_{n})_{n\in \mathbb {N}}\) and \((y_{n})_{n\in \mathbb {N}}\) in int dom f,

$$ D^{f}(x_{n},y_{n})\to 0\quad\Rightarrow\quad x_{n}-y_{n}\to 0. $$

Proposition 1 ([20])

Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅, let α∈]0,+∞[, let \((f_{n})_{n\in \mathbb {N}}\) be in \(\mathcal {P}_{\alpha }(f)\) , let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\, f)^{\mathbb {N}}\) , let \(C\subset \mathcal {X}\) be such that C∩int dom f≠∅, and let x∈C∩int dom f. Suppose that \((x_{n})_{n\in \mathbb {N}}\) is quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) . Then the following hold.

  1.

    \((D^{f_{n}}(x,x_{n}))_{n\in \mathbb {N}}\) converges.

  2.

    Suppose that \(D^{f}(x,\cdot )\) is coercive. Then \((x_{n})_{n\in \mathbb {N}}\) is bounded.

Proposition 2 ([20])

Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅, let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\, f)^{\mathbb {N}}\) , let \(C\subset \mathcal {X}\) be such that C∩int dom f≠∅, let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\) , let α∈]0,+∞[, and let \((f_{n})_{n\in \mathbb {N}}\) in \(\mathcal {P}_{\alpha }(f)\) be such that \((\forall n\in \mathbb {N})(1+\eta _{n})f_{n}\succcurlyeq f_{n+1}\) . Suppose that \((x_{n})_{n\in \mathbb {N}}\) is quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) , that there exists \(g\in \mathcal {F}(f)\) such that for every \(n\in \mathbb {N}\) , \(g\succcurlyeq f_{n}\) , and that, for every \(y_{1}\in \mathcal {X}\) and every \(y_{2}\in \mathcal {X}\) ,

$$ \left\{\begin{array}{lllllll} y_{1}\in\mathfrak{W}(x_{n})_{n\in\mathbb{N}}\cap C,\\ y_{2}\in\mathfrak{W}(x_{n})_{n\in\mathbb{N}}\cap C,\\ \big(\langle y_{1}-y_{2},\nabla f_{n}(x_{n})\rangle\big)_{n\in\mathbb{N}}\quad\text{converges} \end{array}\right. \Rightarrow\quad y_{1}=y_{2}. $$

Moreover, suppose that \((\forall x\in \mathrm {int\,dom}\, f)\; D^{f}(x,\cdot )\) is coercive. Then \((x_{n})_{n\in \mathbb {N}}\) converges weakly to a point in C∩int dom f if and only if \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset C\cap \mathrm {int\,dom}\, f\).

Proposition 3 ([20])

Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function, let α∈]0,+∞[, let \((f_{n})_{n\in \mathbb {N}}\) be in \(\mathcal {P}_{\alpha }(f)\) , let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\, f)^{\mathbb {N}}\) , and let C be a closed convex subset of \(\mathcal {X}\) such that C∩int dom f≠∅. Suppose that \((x_{n})_{n\in \mathbb {N}}\) is stationarily quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) , that f satisfies Condition 1, and that \((\forall x\in \mathrm {int\,dom}\, f)\; D^{f}(x,\cdot )\) is coercive. In addition, suppose that there exists β∈]0,+∞[ such that \((\forall n\in \mathbb {N})\beta \hat {f}\succcurlyeq f_{n}\) . Then \((x_{n})_{n\in \mathbb {N}}\) converges strongly to a point in \(C\cap \overline {\text {dom}\,}f\) if and only if \(\underline {\lim } {D^{f}_{C}}(x_{n})=0\).

Our framework uses the Bregman distance-based proximity operators whose definition and properties are discussed in the following proposition.

Proposition 4

Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅, let \(\varphi \in {\Gamma }_{0}(\mathcal {X})\) , and let

$$\begin{array}{@{}rcl@{}} \text{Prox}_{\varphi}^{f} \colon \mathcal{X}^{*}&\to& 2^{\mathcal{X}}\\ x^{\ast}&\mapsto&\{x\in\mathcal{X}\mid \varphi(x)+f(x)-\langle x,x^{\ast}\rangle =\min(\varphi+f-x^{\ast})(\mathcal{X})<+\infty\} \end{array} $$
(6)

be the f-proximity operator of φ. Then the following hold.

  1. (1)

    \(\text {ran}\text {Prox}_{\varphi }^{f}\subset \text {dom}\, f \cap \text {dom}\,\varphi \) and \(\text {Prox}_{\varphi }^{f}=(\partial (f+\varphi ))^{-1}\).

  2. (2)

    Suppose that dom φ∩int dom f≠∅ and that dom ∂f∩dom ∂φ⊂int dom f. Then the following hold.

    1. (a)

      \(\text {ran}\text {Prox}_{\varphi }^{f}\subset \mathrm {int\,dom}\, f\) and \(\text {Prox}_{\varphi }^{f}=(\nabla f+\partial \varphi )^{-1}\).

    2. (b)

      \(\text {int}(\text {dom}\, f^{\ast } + \text {dom}\,\varphi ^{\ast })\subset \text {dom}\,\text {Prox}_{\varphi }^{f}\).

    3. (c)

      Suppose that \(f|_{\mathrm {int\,dom}\, f}\) is strictly convex. Then \(\text {Prox}_{\varphi }^{f}\) is single-valued on its domain.

Proof

Let us fix \(x^{\ast }\in \mathcal {X}^{\ast }\) and define \(f_{x^{\ast }}\colon \mathcal {X} \to ]-\infty ,+\infty ] \colon x \mapsto f(x)-\langle x,x^{\ast }\rangle + f^{\ast }(x^{\ast })\). Then \(\text {dom}\, f_{x^{\ast }} = \text {dom}\, f\) and \(\varphi + f_{x^{\ast }} \in {\Gamma }_{0}(\mathcal {X})\). Moreover, \(\partial (\varphi + f_{x^{\ast }}) = \partial (\varphi + f) - x^{\ast }\).

  1. (1):

    By definition, \(\text {ran}\text {Prox}_{\varphi }^{f} \subset \text {dom}\, f \cap \text {dom}\,\varphi \). For the second assertion, it is sufficient to consider the case dom f∩dom φ≠∅, since otherwise both sides of the desired identity reduce to the trivial operator \(x^{\ast }\mapsto \emptyset \). Now let x∈dom f∩dom φ. Then

    $$\begin{array}{@{}rcl@{}} x\in\text{Prox}_{\varphi}^{f} x^{\ast} &\Leftrightarrow& 0\in\partial (\varphi+f_{x^{\ast}})(x)\\ &\Leftrightarrow& 0\in\partial(\varphi+f)(x)-x^{\ast}\\ &\Leftrightarrow& x^{\ast}\in\partial(\varphi+f)(x)\\ &\Leftrightarrow& x\in \big(\partial(\varphi+f)\big)^{-1}(x^{\ast}). \end{array} $$
    (7)
  2. (2):

    Suppose that \(x^{\ast }\in \text {int}(\text {dom}\, f^{\ast }+\text {dom}\,\varphi ^{\ast })\). Since dom φ∩int dom f≠∅, it follows from [1, Theorem 1.1] and [23, Theorem 2.1.3(ix)] that

    $$ x^{\ast}\in\text{int}(\text{dom}\, f^{\ast}+\text{dom}\,\varphi^{\ast}) = \text{int}\text{dom} (f+\varphi)^{\ast}. $$
    (8)
  3. (2a):

    Since dom φ∩int dom f≠∅, ∂(φ+f)=∂φ+∂f by [1, Corollary 2.1], and hence (1) yields

    $$\text{ran}\text{Prox}_{\varphi}^{f}=\text{dom}\,\partial(f+\varphi)=\text{dom}(\partial f+\partial\varphi)=\text{dom}\,\partial f\cap\text{dom}\,\partial\varphi\subset\mathrm{int\,dom}\, f. $$

    In turn, \(\text {ran}\text {Prox}_{\varphi }^{f}\subset \text {dom}\,\varphi \cap \mathrm {int\,dom}\, f\). We now prove that \(\text {Prox}_{\varphi }^{f}=(\nabla f+\partial \varphi )^{-1}\). Note that dom(∇f+∂φ)⊂dom φ∩int dom f. Let x∈dom φ∩int dom f. Then ∂(f+φ)(x)=∂f(x)+∂φ(x)=∇f(x)+∂φ(x) and therefore,

    $$x\in\text{Prox}_{\varphi}^{f}x^{\ast}\Leftrightarrow x^{\ast}\in \partial(f+\varphi)(x)=\nabla f(x)+\partial \varphi(x)\Leftrightarrow x\in(\nabla f+\partial\varphi)^{-1}(x^{\ast}). $$
  4. (2b):

    We derive from (8) and [5, Fact 3.1] that \(\varphi +f_{x^{\ast }}\) is coercive. Hence, by [23, Theorem 2.5.1], \(\varphi +f_{x^{\ast }}\) admits at least one minimizer, i.e., \(x^{*}\in \text {dom}\,\text {Prox}_{\varphi }^{f}\).

  5. (2c):

    Since \(f|_{\mathrm {int\,dom}\, f}\) is strictly convex, so is \((\varphi +f_{x^{\ast }})|_{\mathrm {int\,dom}\, f}\) and thus, in view of (2b), \(\varphi +f_{x^{\ast }}\) admits a unique minimizer on int dom f. However, since

    $$\text{Argmin}(\varphi+f_{x^{\ast}})=\text{ran}\text{Prox}_{\varphi}^{f}\subset\mathrm{int\,dom}\, f, $$

    it follows that \(\varphi +f_{x^{\ast }}\) admits a unique minimizer and that \(\text {Prox}_{\varphi }^{f}\) is therefore single-valued.

Proposition 5

Let m be a strictly positive integer, let \((\mathcal {X}_{i})_{1\leq i\leq m}\) be reflexive real Banach spaces, and let \(\mathcal {X}\) be the vector product space \(\times _{i=1}^{m}\mathcal {X}_{i}\) equipped with the norm \(x=(x_{i})_{1\leq i\leq m}\mapsto \sqrt {{\sum }_{i=1}^{m}\|x_{i}\|^{2}}\) . For every i∈{1,…,m}, let \(f_{i}\in {\Gamma }_{0}(\mathcal {X}_{i})\) be a Legendre function and let \(\varphi _{i}\in {\Gamma }_{0}(\mathcal {X}_{i})\) be such that \(\text {dom}\,\varphi _{i}\cap \mathrm {int\,dom}\, f_{i}\neq \emptyset \). Set \(f\colon \mathcal {X} \to ]-\infty ,+\infty ] \colon x\mapsto {\sum }_{i=1}^{m}f_{i}(x_{i})\) and \(\varphi \colon \mathcal {X}\to ]-\infty ,+\infty ] \colon x\mapsto {\sum }_{i=1}^{m}\varphi _{i}(x_{i})\) . Then

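$$ \text{Prox}_{\varphi}^{f}\colon x^{\ast}=(x_{i}^{\ast})_{1\leq i\leq m}\mapsto \big(\text{Prox}_{\varphi_{i}}^{f_{i}}x_{i}^{\ast}\big)_{1\leq i\leq m}. $$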

Proof

First, we observe that \(\mathcal {X}^{\ast }\) is the vector product space \(\times _{i=1}^{m}\mathcal {X}_{i}^{\ast }\) equipped with the norm \(x^{\ast }=(x_{i}^{\ast })_{1\leq i\leq m} \mapsto \sqrt {{\sum }_{i=1}^{m}\|x_{i}^{\ast }\|^{2}}\). Next, we derive from the definition of f that \(\text {dom}\, f=\times _{i=1}^{m}\text {dom}\, f_{i}\) and that

figure e

Thus, ∂f is single-valued on

figure f

Likewise, since

$$f^{\ast}\colon\mathcal{X}^{\ast} \to ]-\infty,+\infty] \colon (x_{i}^{\ast})_{1\leq i\leq m}\mapsto{\sum}_{i=1}^{m} f_{i}^{\ast} (x_{i}^{\ast}), $$

we deduce that \(\partial f^{\ast }\) is single-valued on \(\text {dom}\,\partial f^{\ast }=\mathrm {int\,dom}\, f^{\ast }\). Consequently, [5, Theorems 5.4 and 5.6] assert that

$$ f \text{ is a Legendre function}. $$
(9)

In addition,

figure g

Hence, Proposition 4(2b) and (2c) assert that \(\text {int}(\text {dom}\, f^{\ast }+\text {dom}\,\varphi ^{\ast }) \subset \text {dom}\,\text {Prox}_{\varphi }^{f}\) and \(\text {Prox}_{\varphi }^{f}\) is single-valued on its domain. Now set \(x=\text {Prox}^{f}_{\varphi }x^{\ast }\) and \(q=(\text {Prox}^{f_{i}}_{\varphi _{i}}x_{i}^{\ast })_{1\leq i\leq m}\). We derive from Proposition 4(2a) that

$$x=\text{Prox}_{\varphi}^{f} x^{\ast} \quad \Leftrightarrow\quad x=(\nabla f+\partial\varphi)^{-1}(x^{\ast}) \quad \Leftrightarrow\quad x^{\ast}-\nabla f(x)\in\partial\varphi(x). $$

Consequently, by invoking (4), we get

$$ (\forall z\in\text{dom}\,\varphi)\quad \langle z-x,x^{\ast}-\nabla f(x)\rangle + \varphi(x)\leq\varphi(z). $$
(11)

Upon setting z = q in (11), we obtain

$$ \langle q-x,x^{\ast}-\nabla f(x)\rangle + \varphi(x)\leq\varphi(q). $$
(12)

For every i∈{1,…,m}, let us set \(q_{i}=\text {Prox}_{\varphi _{i}}^{f_{i}}x_{i}^{\ast }\). The same characterization as in (11) yields

$$ (\forall i\in\{1,\ldots,m\}) (\forall z_{i}\in\text{dom}\,\varphi_{i}) \quad \langle z_{i}-q_{i},x_{i}^{\ast}-\nabla f_{i}(q_{i})\rangle + \varphi_{i}(q_{i})\leq\varphi_{i}(z_{i}). $$

By summing these inequalities over i∈{1,…,m}, we obtain

$$ (\forall z\in\text{dom}\,\varphi)\quad \langle z-q,x^{\ast}-\nabla f(q)\rangle + \varphi(q)\leq\varphi(z). $$
(13)

Upon setting z = x in (13), we get

$$ \langle x-q,\nabla f(x)-\nabla f(q)\rangle + \varphi(q) \leq \varphi(x). $$
(14)

Adding (12) and (14) yields

$$ \langle x-q,\nabla f(x)-\nabla f(q)\rangle \leq 0. $$

Now suppose that x≠q. Since \(f|_{\mathrm {int\,dom}\, f}\) is strictly convex, it follows from [23, Theorem 2.4.4(ii)] that ∇f is strictly monotone, i.e.,

$$\langle x-q,\nabla f(x)-\nabla f(q)\rangle > 0, $$

and we reach a contradiction. □

In Hilbert spaces, the operator defined in (6) reduces to Moreau's usual proximity operator \(\text {prox}_{\varphi }\) [19] if \(f=\|\cdot \|^{2}/2\). We provide illustrations of the operator (6) in the standard Euclidean space \(\mathbb {R}^{m}\).

Example 1

Let γ∈]0,+∞[, let \(\phi \in {\Gamma }_{0}(\mathbb {R})\) be such that dom ϕ∩]0,+∞[≠∅, and let 𝜗 be the Boltzmann–Shannon entropy, i.e.,

$$\vartheta\colon \xi \mapsto \left\{\begin{array}{lllllll} \xi\ln\xi-\xi&\text{ if}\; \xi\in ]0,+\infty[,\\ 0&\text{ if}\; \xi=0,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

Set \(\varphi \colon (\xi _{i})_{1\leq i\leq m}\mapsto {\sum }_{i=1}^{m}\phi (\xi _{i})\) and \(f\colon (\xi _{i})_{1\leq i\leq m} \mapsto {\sum }_{i=1}^{m}\vartheta (\xi _{i})\). Note that f is a supercoercive Legendre function [4, Sections 5 and 6], and hence, Proposition 4(2b) asserts that \(\text {dom}\,\text {Prox}_{\gamma \varphi }^{f}=\mathbb {R}^{m}\). Let \((\xi _{i})_{1\leq i\leq m}\in \mathbb {R}^{m}\), set \((\eta _{i})_{1\leq i\leq m} = \text {Prox}_{\gamma \varphi }^{f}(\xi _{i})_{1\leq i\leq m}\), let W be the Lambert function [15], i.e., the inverse of \(\xi \mapsto \xi e^{\xi }\) on [0,+∞[, and let i∈{1,…,m}. Then \(\eta _{i}\) can be computed as follows.

  1.

    Let \(\omega \in \mathbb {R}\) and suppose that

    $$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi\ln\xi-\omega\xi&\text{ if}\; \xi\in]0,+\infty[,\\ 0&\text{ if} \;\xi=0,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

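    Then \(\eta _{i}=e^{(\xi _{i}+\gamma (\omega -1))/(\gamma +1)}\), since the optimality condition reads \(\gamma (\ln \eta _{i}+1-\omega )+\ln \eta _{i}=\xi _{i}\).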

  2.

    Let p∈[1,+∞[ and suppose that either \(\phi =|\cdot |^{p}/p\) or

    $$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi^{p}/p&\text{ if}\; \xi\in[0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

    Then

    $$\eta_{i}= \left\{\begin{array}{lllllll} \left( \frac{W(\gamma(p-1)e^{(p-1)\xi_{i}})}{\gamma(p-1)}\right)^{\frac{1}{p-1}}&\text{ if}\; p\in ]1,+\infty[,\\ e^{\xi_{i}-\gamma}&\text{ if}\; p=1. \end{array}\right. $$
  3.

    Let p∈[1,+∞[ and suppose that

    $$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi^{-p}/p&\text{ if}\; \xi\in]0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

    Then

    $$\eta_{i}=\left( \frac{W(\gamma(p+1)e^{-(p+1)\xi_{i}})}{\gamma(p+1)}\right)^{\frac{-1}{p+1}}. $$
  4.

    Let p∈]0,1[ and suppose that

    $$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} -\xi^{p}/p&\text{ if}\; \xi\in[0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

    Then

    $$\eta_{i}=\left( \frac{W(\gamma(1-p)e^{(p-1)\xi_{i}})}{\gamma(1-p)}\right)^{\frac{1}{p-1}}. $$
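The closed form in item 2 can be checked numerically. The sketch below, under stated assumptions, evaluates it for \(\phi =|\cdot |^{p}/p\) and verifies the scalar optimality condition \(\gamma \eta ^{p-1}+\ln \eta =\xi \) that a minimizer η>0 of \(\gamma \phi +\vartheta -\xi \,\cdot \) should satisfy. The helper `lambert_w` is a hypothetical Newton iteration for the principal branch on [0,+∞[, written here only to keep the example self-contained; the test values are arbitrary.

```python
import math

def lambert_w(z, tol=1e-14, max_iter=100):
    """Principal Lambert W for z >= 0, via Newton's method on w*exp(w) = z."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting point to the right of the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def prox_entropy_power(xi, gamma, p):
    """Closed form of item 2: entropy kernel with phi = |.|^p / p, p >= 1."""
    if p == 1:
        return math.exp(xi - gamma)
    c = gamma * (p - 1.0)
    return (lambert_w(c * math.exp((p - 1.0) * xi)) / c) ** (1.0 / (p - 1.0))

def optimality_residual(eta, xi, gamma, p):
    """gamma * phi'(eta) + theta'(eta) - xi, which should vanish at the prox."""
    return gamma * eta ** (p - 1.0) + math.log(eta) - xi

# Arbitrary test values (assumed).
eta_example = prox_entropy_power(3.0, 0.5, 2.0)
residual_example = optimality_residual(eta_example, 3.0, 0.5, 2.0)
```

Items 3 and 4 can be checked in the same way by replacing the derivative of ϕ in `optimality_residual`.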

Example 2

Let \(\phi \in {\Gamma }_{0}(\mathbb {R})\) be such that dom ϕ∩]0,1[≠∅ and let 𝜗 be the Fermi–Dirac entropy, i.e.,

$$\vartheta\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi\ln\xi+(1-\xi)\ln(1-\xi)&\text{ if}\; \xi\in ]0,1[,\\ 0&\text{ if}\; \xi\in\{0,1\},\\ +\infty&\text{ otherwise}. \end{array}\right. $$

Set \(\varphi \colon (\xi _{i})_{1\leq i\leq m} \mapsto {\sum }_{i=1}^{m}\phi (\xi _{i})\) and \(f\colon (\xi _{i})_{1\leq i\leq m} \mapsto {\sum }_{i=1}^{m}\vartheta (\xi _{i})\). Note that f is a cofinite Legendre function [4, Sections 5 and 6], and hence Proposition 4(2b) asserts that \(\text {dom}\,\text {Prox}_{\varphi }^{f}=\mathbb {R}^{m}\). Let \((\xi _{i})_{1\leq i\leq m}\in \mathbb {R}^{m}\), set \((\eta _{i})_{1\leq i\leq m} = \text {Prox}_{\varphi }^{f}(\xi _{i})_{1\leq i\leq m}\), and let i∈{1,…,m}. Then \(\eta _{i}\) can be computed as follows.

  1.

    Let \(\omega \in \mathbb {R}\) and suppose that

    $$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi\ln\xi-\omega\xi&\text{ if}\; \xi\in]0,+\infty[,\\ 0&\text{ if}\; \xi=0,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

    Then \(\eta _{i}=-e^{\xi _{i}+\omega -1}/2+\sqrt {e^{2(\xi _{i}+\omega -1)}/4 + e^{\xi _{i}+\omega -1}}\).

  2.

    Suppose that

    $$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} (1-\xi)\ln(1-\xi)+\xi&\text{ if}\; \xi\in]-\infty,1[,\\ 1&\text{ if}\; \xi=1,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

    Then \(\eta _{i}=1+e^{-\xi _{i}}/2-\sqrt {e^{-\xi _{i}}+e^{-2\xi _{i}}/4}\).
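Both formulas of this example can be verified against the scalar optimality condition \(\phi '(\eta )+\vartheta '(\eta )=\xi \). The sketch below assumes the usual form of the Fermi–Dirac entropy, \(\vartheta (\xi )=\xi \ln \xi +(1-\xi )\ln (1-\xi )\) on ]0,1[, whose derivative is \(\ln (\eta /(1-\eta ))\); the function names and test points are illustrative.

```python
import math

def prox_fd_item1(xi, omega):
    """Item 1: phi(eta) = eta*ln(eta) - omega*eta on ]0,+inf[."""
    c = math.exp(xi + omega - 1.0)
    return -c / 2.0 + math.sqrt(c * c / 4.0 + c)

def prox_fd_item2(xi):
    """Item 2: phi(eta) = (1-eta)*ln(1-eta) + eta on ]-inf,1[."""
    c = math.exp(-xi)
    return 1.0 + c / 2.0 - math.sqrt(c + c * c / 4.0)

def fermi_dirac_derivative(eta):
    """Derivative of the (assumed) Fermi-Dirac entropy on ]0,1[."""
    return math.log(eta) - math.log(1.0 - eta)

def residual_item1(xi, omega):
    eta = prox_fd_item1(xi, omega)
    # phi'(eta) = ln(eta) + 1 - omega
    return math.log(eta) + 1.0 - omega + fermi_dirac_derivative(eta) - xi

def residual_item2(xi):
    eta = prox_fd_item2(xi)
    # phi'(eta) = -ln(1 - eta)
    return -math.log(1.0 - eta) + fermi_dirac_derivative(eta) - xi
```

Both residuals vanish up to rounding for moderate values of ξ, and the computed points lie in ]0,1[, as expected from \(\text {ran}\,\text {Prox}_{\varphi }^{f}\subset \mathrm {int\,dom}\, f\).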

Example 3

Let \(f\colon (\xi _{i})_{1\leq i\leq m}\mapsto {\sum }_{i=1}^{m} \vartheta (\xi _{i})\), where 𝜗 is the Hellinger-like function, i.e.,

$$\vartheta\colon\xi\mapsto \left\{\begin{array}{lllllll} -\sqrt{1-\xi^{2}}&\text{ if}\; \xi\in[-1,1],\\ +\infty&\text{ otherwise}, \end{array}\right. $$

let γ∈]0,+∞[, and let φ = f. Since f is a cofinite Legendre function [4, Sections 5 and 6], Proposition 4(2b) asserts that \(\text {dom}\,\text {Prox}_{\gamma \varphi }^{f}=\mathbb {R}^{m}\). Let \((\xi _{i})_{1\leq i\leq m}\in \mathbb {R}^{m}\), and set \((\eta _{i})_{1\leq i\leq m}=\text {Prox}_{\gamma \varphi }^{f}(\xi _{i})_{1\leq i\leq m}\). Then \((\forall i\in \{1,\ldots ,m\})\eta _{i}=\xi _{i}/\sqrt {(\gamma +1)^{2}+{\xi _{i}^{2}}}\).
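As a quick sanity check of this formula, the sketch below verifies the scalar optimality condition \((\gamma +1)\vartheta '(\eta )=\xi \) with \(\vartheta '(\eta )=\eta /\sqrt {1-\eta ^{2}}\) on ]−1,1[. The function names and sample values are assumptions made for illustration.

```python
import math

def prox_hellinger(xi, gamma):
    """Closed form of Example 3 for one coordinate."""
    return xi / math.sqrt((gamma + 1.0) ** 2 + xi ** 2)

def hellinger_derivative(eta):
    """Derivative of theta(eta) = -sqrt(1 - eta^2) on ]-1, 1[."""
    return eta / math.sqrt(1.0 - eta ** 2)

def optimality_residual(xi, gamma):
    """(gamma + 1) * theta'(eta) - xi, which should vanish at the prox."""
    eta = prox_hellinger(xi, gamma)
    return (gamma + 1.0) * hellinger_derivative(eta) - xi
```

The computed point always lies in ]−1,1[, in line with \(\text {ran}\,\text {Prox}_{\gamma \varphi }^{f}\subset \mathrm {int\,dom}\, f\).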

Example 4

Let γ∈]0,+∞[, let \(\phi \in {\Gamma }_{0}(\mathbb {R})\) be such that dom ϕ∩]0,+∞[≠∅, and let 𝜗 be the Burg entropy, i.e.,

$$\vartheta\colon\xi\mapsto \left\{\begin{array}{lllllll} -\ln\xi&\text{ if} \;\xi\in]0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

Set \(\varphi \colon (\xi _{i})_{1\leq i\leq m}\mapsto {\sum }_{i=1}^{m}\phi (\xi _{i})\) and \(f\colon (\xi _{i})_{1\leq i\leq m} \mapsto {\sum }_{i=1}^{m}\vartheta (\xi _{i})\), let \((\xi _{i})_{1\leq i\leq m}\in \mathbb {R}^{m}\), and set \((\eta _{i})_{1\leq i\leq m} = \text {Prox}_{\gamma \varphi }^{f}(\xi _{i})_{1\leq i\leq m}\). Let i∈{1,…,m}. Then \(\eta _{i}\) can be computed as follows.

  1.

    Suppose that ϕ = 𝜗 and \(\xi _{i}\in ]-\infty ,0[\). Then \(\eta _{i}=-(1+\gamma )\xi _{i}^{-1}\).

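  2.

    Suppose that \(\phi \colon \xi \mapsto \alpha |\xi |\) for some α∈]0,+∞[ and that \(\xi _{i}\in ]-\infty ,\gamma \alpha [\). Then \(\eta _{i}=(\gamma \alpha -\xi _{i})^{-1}\).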
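For the Burg entropy, the scalar optimality conditions are simple enough to solve by hand: with ϕ = 𝜗 one gets \(-(1+\gamma )/\eta =\xi \), and with \(\phi =\alpha |\cdot |\) one gets \(\gamma \alpha -1/\eta =\xi \) on ]0,+∞[. The sketch below, whose function names and test values are assumptions, evaluates the resulting solutions and checks these two equations; it requires ξ<0 in the first case and ξ<γα in the second, since no positive solution exists otherwise.

```python
def prox_burg_burg(xi, gamma):
    """phi = Burg entropy: solve -(1 + gamma) / eta = xi for eta > 0."""
    if xi >= 0:
        raise ValueError("no positive solution unless xi < 0")
    return -(1.0 + gamma) / xi

def prox_burg_abs(xi, gamma, alpha):
    """phi = alpha * |.|: solve gamma * alpha - 1 / eta = xi for eta > 0."""
    if xi >= gamma * alpha:
        raise ValueError("no positive solution unless xi < gamma * alpha")
    return 1.0 / (gamma * alpha - xi)

def residual_burg_burg(xi, gamma):
    eta = prox_burg_burg(xi, gamma)
    return -(1.0 + gamma) / eta - xi

def residual_burg_abs(xi, gamma, alpha):
    eta = prox_burg_abs(xi, gamma, alpha)
    return gamma * alpha - 1.0 / eta - xi
```

In particular, ξ=−2 and γ=0.5 give η=0.75 in the first case.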

The following result will be used subsequently.

Lemma 1

Let \(\mathcal {X}\) be a reflexive real Banach space, let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function, let x∈int dom f, and let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\, f)^{\mathbb {N}}\) . Suppose that \((D^{f}(x,x_{n}))_{n\in \mathbb {N}}\) is bounded, that dom f is open, and that ∇f is weakly sequentially continuous. Then \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathrm {int\,dom}\, f\).

Proof

[20, Proof of Theorem 4.1]. □

3 Forward-Backward Splitting in Banach Spaces

The main result in this section is a version of the forward-backward splitting algorithm in reflexive real Banach spaces which employs different Bregman distance-based proximity operators over the iterations.

Theorem 1

Consider the setting of Problem 1 and let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function such that S∩int dom f≠∅, int dom f⊂int dom ψ, and \(f\succcurlyeq \beta \psi \) for some β∈]0,+∞[. Let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\) , let α∈]0,+∞[, and let \((f_{n})_{n\in \mathbb {N}}\) be Legendre functions in \(\mathcal {P}_{\alpha }(f)\) such that

$$ (\forall n\in\mathbb{N})\quad(1+\eta_{n})f_{n}\succcurlyeq f_{n+1}. $$
(15)

Suppose that either \(-\text {ran}\,\nabla \psi \subset \text {dom}\,\varphi ^{\ast }\) or \((\forall n\in \mathbb {N})f_{n}\) is cofinite. Let ε∈]0,αβ/(αβ+1)[ and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that

$$ (\forall n\in\mathbb{N})\quad \varepsilon\leq\gamma_{n}\leq\alpha\beta(1-\varepsilon) \quad\text{ and} \quad(1+\eta_{n})\gamma_{n}-\gamma_{n+1} \leq\alpha\beta\eta_{n}. $$
(16)

Furthermore, let \(x_{0}\in \mathrm {int\,dom}\, f\) and iterate

$$ (\forall n\in\mathbb{N})\quad x_{n+1}=\text{Prox}_{\gamma_{n}\varphi}^{f_{n}} \left( \nabla f_{n}(x_{n})-\gamma_{n}\nabla\psi(x_{n})\right). $$
(17)

Suppose in addition that \((\forall x\in \mathrm {int\,dom}\, f)\; D^{f}(x,\cdot )\) is coercive. Then \((x_{n})_{n\in \mathbb {N}}\) is a bounded sequence in int dom f and \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathcal {S}\) . Moreover, there exists \(\overline {x}\in \mathcal {S}\) such that the following hold.

  1. (1)

    Suppose that \(\mathcal {S}\cap \overline {\text {dom}}\,f\) is a singleton. Then \(x_{n}\rightharpoonup \overline {x}\).

  2. (2)

    Suppose that there exists \(g\in \mathcal {F}(f)\) such that for every \(n\in \mathbb {N}\) , \(g\succcurlyeq f_{n}\) , and that, for every \(y_{1}\in \mathcal {X}\) and every \(y_{2}\in \mathcal {X}\) ,

    $$ \left\{\begin{array}{lllllll} y_{1}\in\mathfrak{W}(x_{n})_{n\in\mathbb{N}},\\ y_{2}\in\mathfrak{W}(x_{n})_{n\in\mathbb{N}},\\ \big(\langle y_{1}-y_{2},\nabla f_{n}(x_{n}) - \gamma_{n}\nabla\psi(x_{n})\rangle\big)_{n\in\mathbb{N}} \quad\text{converges} \end{array}\right. \Rightarrow\quad y_{1}=y_{2}. $$
    (18)

    In addition, suppose that one of the following holds.

    1. (a)

      S⊂int dom f.

    2. (b)

      dom f is open and ∇f is weakly sequentially continuous.

    Then \(x_{n}\rightharpoonup \overline {x}\).

  3. (3)

    Suppose that f satisfies Condition 1 and that one of the following holds.

    1. (a)

      Either φ or ψ is uniformly convex at \(\overline {x}\).

    2. (b)

      \(\underline {\lim } D^{f}_{\mathcal {S}}(x_{n})=0\) and there exists μ∈]0,+∞[ such that \((\forall n\in \mathbb {N})\mu \hat {f}\succcurlyeq f_{n}\).

    Then \(x_{n}\to \overline {x}\).
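To illustrate iteration (17), the following sketch runs it in \(\mathbb {R}^{3}\) on a toy instance chosen so that every step has a closed form. It assumes a constant kernel \(f_{n}=f\) equal to the Boltzmann–Shannon entropy (so α=1 and \(\eta _{n}=0\)), \(\psi =D^{f}(\cdot ,b)\) for a fixed b with positive entries (so \(D^{\psi }=D^{f}\) and β=1), \(\varphi =\lambda \|\cdot \|_{1}\), and a constant step \(\gamma _{n}=1/2\). For these choices \(\nabla f(x)=(\ln x_{i})_{i}\), \(\nabla \psi (x)=(\ln (x_{i}/b_{i}))_{i}\), and Example 1(2) with p=1 gives \(\text {Prox}_{\gamma \lambda |\cdot |}^{\vartheta }\xi =e^{\xi -\gamma \lambda }\). The data and names are hypothetical, and the hypotheses of Theorem 1 are not verified in full here.

```python
import math

def bregman_forward_backward(b, lam, gamma, x0, n_iter=200):
    """Iteration (17) with an entropy kernel, psi = D^f(., b), phi = lam*||.||_1."""
    x = list(x0)
    for _ in range(n_iter):
        # Forward step in the dual space: grad f(x_n) - gamma * grad psi(x_n).
        dual = [math.log(xi) - gamma * math.log(xi / bi) for xi, bi in zip(x, b)]
        # Backward step: entropy-based prox of gamma*lam*|.|, i.e. exp(. - gamma*lam).
        x = [math.exp(di - gamma * lam) for di in dual]
    return x

# Illustrative data (assumed).
b = [1.0, 2.0, 0.5]
lam = 0.3
x_bfb = bregman_forward_backward(b, lam, gamma=0.5, x0=[1.0, 1.0, 1.0])

# On the positive orthant, phi + psi is minimized where lam + ln(x_i / b_i) = 0.
expected = [bi * math.exp(-lam) for bi in b]
```

Each update here reduces to \(x_{n+1}=x_{n}^{1-\gamma }b^{\gamma }e^{-\gamma \lambda }\) componentwise, so the iterates stay in the positive orthant without any projection, which is the practical appeal of a Bregman kernel adapted to the domain.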

Proof

We first derive from Proposition 4(2c) that the operators \((\text {Prox}_{\gamma _{n}\varphi }^{f_{n}})_{n\in \mathbb {N}}\) are single-valued on their domains. We also note that \(x_{0}\in \mathrm {int\,dom}\, f\). Suppose that \(x_{n}\in \mathrm {int\,dom}\, f\) for some \(n\in \mathbb {N}\). If \(f_{n}\) is cofinite then Proposition 4(2b) yields

$$ \nabla f_{n}(x_{n}) - \gamma_{n}\nabla\psi(x_{n}) \in \mathcal{X}^{\ast}=\text{dom}\,\text{Prox}_{\gamma_{n}\varphi}^{f_{n}}. $$
(19)

Otherwise,

$$\begin{array}{@{}rcl@{}} \nabla f_{n}(x_{n})-\gamma_{n}\nabla\psi(x_{n}) &\in&\mathrm{int\,dom}\, f_{n}^{\ast}+\gamma_{n}\text{dom}\,\varphi^{\ast} =\text{int}(\mathrm{int\,dom}\, f_{n}^{\ast} + \gamma_{n}\text{dom}\,\varphi^{\ast})\\ &\subset&\text{int}(\text{dom}\, f_{n}^{\ast} + \gamma_{n}\text{dom}\,\varphi^{\ast}) = \text{int}(\text{dom}\, f_{n}^{\ast} + \text{dom}\,(\gamma_{n}\varphi)^{\ast}).\quad \end{array} $$
(20)

Since \(\text {int}(\text {dom}\, f_{n}^{\ast } + \text {dom}\,(\gamma _{n}\varphi )^{\ast }) \subset \text {dom}\,\text {Prox}_{\gamma _{n}\varphi }^{f_{n}}\) by Proposition 4(2b), we deduce from (17), (19), (20), and Proposition 4(2a) that \(x_{n+1}\) is a well-defined element in \(\text {ran}\text {Prox}_{\gamma _{n}\varphi }^{f_{n}} = \text {dom}\,\partial \varphi \cap \mathrm {int\,dom}\, f_{n}=\text {dom}\,\partial \varphi \cap \mathrm {int\,dom}\, f\subset \mathrm {int\,dom}\, f\). Arguing by induction, we conclude that

$$ (x_{n})_{n\in\mathbb{N}}\in(\mathrm{int\,dom}\, f)^{\mathbb{N}}\quad\text{is well-defined}. $$

Next, let us set Φ = φ + ψ and

$$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})\quad g_{n}\colon\mathcal{X}&\to&]-\infty,+\infty]\\ x&\mapsto&\left\{\begin{array}{lllllll} f_{n}(x)-\gamma_{n}\psi(x)&\text{ if}\; x\in\mathrm{int\,dom}\, f,\\ +\infty&\text{ otherwise}. \end{array}\right. \end{array} $$
(21)

Since int dom f⊂int dom ψ, it follows from (21) that \((\forall n\in \mathbb {N})g_{n}\) is Gâteaux differentiable on \(\text {dom}\, g_{n}=\mathrm {int\,dom}\, g_{n}=\mathrm {int\,dom}\, f\). Since ψ is continuous on int dom ψ⊃int dom f and the functions \((f_{n})_{n\in \mathbb {N}}\) are continuous on int dom f [21, Proposition 3.3], we deduce that \((\forall n\in \mathbb {N})g_{n}\) is continuous on \(\text {dom}\, g_{n}\). In addition,

$$ (\forall n\in\mathbb{N})\quad g_{n}-\varepsilon\alpha f=(1-\varepsilon)(f_{n}-\alpha\beta\psi)+\varepsilon(f_{n}-\alpha f) + \big(\alpha\beta(1-\varepsilon)-\gamma_{n}\big)\psi. $$
(22)

Note that \(f\succcurlyeq \beta \psi \) and \((\forall n\in \mathbb {N})f_{n}\succcurlyeq \alpha f\). Hence, (22) yields

$$ (\forall n\in\mathbb{N})\quad f_{n}\succcurlyeq\alpha\beta\psi, $$
(23)

and hence, we deduce from (16) and (22) that \((\forall n\in \mathbb {N})g_{n}\succcurlyeq \varepsilon \alpha f\). In turn,

$$\begin{array}{@{}rcl@{}} &&(\forall n\in\mathbb{N})(\forall x\in\text{dom}\, g_{n})(\forall y\in\text{dom}\, g_{n})\\ &&\langle x-y,\nabla g_{n}(x)-\nabla g_{n}(y)\rangle =D^{g_{n}}(x,y)+D^{g_{n}}(y,x)\geq\varepsilon\alpha\big(D^{f}(x,y)+D^{f}(y,x)\big) \geq 0, \end{array} $$

and it therefore follows from [23, Theorem 2.1.11] that \((\forall n\in \mathbb {N})g_{n}\) is convex. Consequently,

$$ (\forall n\in\mathbb{N})\quad g_{n}\in\mathcal{P}_{\varepsilon\alpha}(f). $$
(24)

Set ω=1+1/ε. Then

$$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})\quad (1+\omega\eta_{n})g_{n}-g_{n+1} &=&(1+\omega\eta_{n})(f_{n}-\gamma_{n}\psi)-(f_{n+1} - \gamma_{n+1}\psi)\\ &=&(1+\eta_{n})f_{n}-f_{n+1}+\eta_{n}\varepsilon^{-1} \left( f_{n}-(\gamma_{n}+\varepsilon\alpha\beta)\psi\right)\\ &&+\big(\alpha\beta\eta_{n}+\gamma_{n+1}-(1+\eta_{n})\gamma_{n}\big)\psi. \end{array} $$

We thus derive from (15), (16) and (23) that

$$ (\forall n\in\mathbb{N})\quad (1+\omega\eta_{n})g_{n}\succcurlyeq g_{n+1}. $$
(25)

By invoking (17) and Proposition 4(2a), we get

$$(\forall n\in\mathbb{N})\quad\nabla f_{n}(x_{n})-\gamma_{n}\nabla\psi(x_{n}) \in \nabla f_{n}(x_{n+1})+\gamma_{n}\partial\varphi(x_{n+1}), $$

and therefore,

$$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})\quad \nabla f_{n}(x_{n})-\gamma_{n}\nabla\psi(x_{n}) &\in& \nabla f_{n}(x_{n+1})-\gamma_{n}\nabla\psi(x_{n+1})\\ &&+\gamma_{n}\big(\partial\varphi(x_{n+1})+\nabla\psi(x_{n+1})\big). \end{array} $$
(26)

Since [23, Theorem 2.4.2(vii)–(viii)] yield

$$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})\quad \partial\varphi(x_{n+1})+\nabla\psi(x_{n+1})&\subset&\partial\varphi(x_{n+1})+\partial\psi(x_{n+1})\\ &\subset&\partial(\varphi+\psi)(x_{n+1})=\partial{\Phi}(x_{n+1}), \end{array} $$

we deduce from (26) that

$$ (\forall n\in\mathbb{N})\quad \nabla g_{n}(x_{n})-\nabla g_{n}(x_{n+1}) \in\gamma_{n}\partial{\Phi}(x_{n+1}). $$
(27)

By appealing to (4) and (27), we get

$$\begin{array}{@{}rcl@{}} &&(\forall x\in\text{dom}\,{\Phi}\cap\text{dom}\, f)(\forall n\in\mathbb{N})\\ &&\gamma_{n}^{-1}\langle x-x_{n+1},\nabla g_{n}(x_{n})-\nabla g_{n}(x_{n+1})\rangle + {\Phi}(x_{n+1}) \leq {\Phi}(x), \end{array} $$
(28)

and hence, by [6, Proposition 2.3(ii)],

$$\begin{array}{@{}rcl@{}} &&(\forall x\in\text{dom}\,{\Phi}\cap\text{dom}\, f)(\forall n\in\mathbb{N})\\ &&\gamma_{n}^{-1}\big(D^{g_{n}}(x,x_{n+1})+D^{g_{n}}(x_{n+1},x_{n}) -D^{g_{n}}(x,x_{n})\big)+{\Phi}(x_{n+1})\leq{\Phi}(x). \end{array} $$
(29)

In particular,

$$ (\forall x\in\mathcal{S}\cap\text{dom}\, f)(\forall n\in\mathbb{N})\quad D^{g_{n}}(x,x_{n+1})+D^{g_{n}}(x_{n+1},x_{n}) -D^{g_{n}}(x,x_{n})\leq 0. $$
(30)

By using (25), we deduce from (30) that

$$\begin{array}{@{}rcl@{}} &&(\forall x\in\mathcal{S}\cap\text{dom}\, f)(\forall n\in\mathbb{N})\\ &&D^{g_{n+1}}(x,x_{n+1})+(1+\omega\eta_{n})D^{g_{n}}(x_{n+1},x_{n})\leq (1+\omega\eta_{n})D^{g_{n}}(x,x_{n}), \end{array} $$
(31)

and therefore,

$$ (\forall x\in\mathcal{S}\cap\text{dom}\, f)(\forall n\in\mathbb{N})\quad D^{g_{n+1}}(x,x_{n+1})\leq (1+\omega\eta_{n})D^{g_{n}}(x,x_{n}). $$
(32)

This shows that \((x_{n})_{n\in \mathbb {N}}\) is stationarily quasi-Bregman monotone with respect to S relative to \((g_{n})_{n\in \mathbb {N}}\). Hence, we deduce from Proposition 1(2) that

$$ (x_{n})_{n\in\mathbb{N}}\in(\mathrm{int\,dom}\, f)^{\mathbb{N}}\quad\text{is bounded} $$
(33)

and, since \(\mathcal {X}\) is reflexive,

$$ \mathfrak{W}(x_{n})_{n\in\mathbb{N}}\neq\emptyset. $$
(34)

In addition, we derive from (32) and Proposition 1(1) that

$$ (\forall x\in\mathcal{S}\cap\mathrm{int\,dom}\, f)\quad\big(D^{g_{n}}(x,x_{n})\big)_{n\in\mathbb{N}} \quad\text{converges}, $$
(35)

and thus, since (31) yields

$$\begin{array}{@{}rcl@{}} (\forall x\in\mathcal{S}\cap\mathrm{int\,dom}\, f)(\forall n\in\mathbb{N})\quad 0&\leq& D^{g_{n}}(x_{n+1},x_{n})\\ &\leq& (1+\omega\eta_{n})D^{g_{n}}(x_{n+1},x_{n})\\ &\leq& (1+\omega\eta_{n})D^{g_{n}}(x,x_{n})-D^{g_{n+1}}(x,x_{n+1}), \end{array} $$

and since \(\eta_{n}\to 0\), we obtain

$$ D^{g_{n}}(x_{n+1},x_{n})\to 0. $$
(36)

On the other hand, it follows from (24) that

$$(\forall n\in\mathbb{N})\quad\varepsilon\alpha D^{f}(x_{n+1},x_{n}) \leq D^{g_{n}}(x_{n+1},x_{n}), $$

and hence, (36) yields

$$ D^{f}(x_{n+1},x_{n})\to 0. $$
(37)

Now, it follows from (29) that

$$(\forall n\in\mathbb{N})\quad {\Phi}(x_{n+1})\leq\gamma_{n}^{-1}\big(D^{g_{n}}(x_{n},x_{n+1}) + D^{g_{n}}(x_{n+1},x_{n})\big)+{\Phi}(x_{n+1})\leq{\Phi}(x_{n}), $$

which shows that \(({\Phi }(x_{n}))_{n\in \mathbb {N}}\) is decreasing and hence, since it is bounded from below by \(\inf {\Phi }(\mathcal {X})\), it is convergent. However, (29) and (32) yield

$$\begin{array}{@{}rcl@{}} &&(\forall x\in\mathcal{S}\cap\mathrm{int\,dom}\, f)(\forall n\in\mathbb{N})\\ &&\varepsilon^{-1} \left( \frac{1}{1+\omega\eta_{n}}D^{g_{n+1}}(x,x_{n+1}) + D^{g_{n}}(x_{n+1},x_{n})-D^{g_{n}}(x,x_{n})\right) + {\Phi}(x_{n+1})\\ &&\leq\gamma_{n}^{-1} \left( \frac{1}{1+\omega\eta_{n}}D^{g_{n+1}}(x,x_{n+1}) + D^{g_{n}}(x_{n+1},x_{n})-D^{g_{n}}(x,x_{n})\right) + {\Phi}(x_{n+1})\\ &&\leq{\Phi}(x). \end{array} $$
(38)

Since \(\eta_{n}\to 0\), by taking the limit in (38) and then using (35) and (36), we get

$$\inf{\Phi}(\mathcal{X})\leq\lim{\Phi}(x_{n})\leq\inf{\Phi}(\mathcal{X}), $$

and thus,

$$ {\Phi}(x_{n})\to{\inf}{\Phi}(\mathcal{X}). $$
(39)

We now show that

$$ \mathfrak{W}(x_{n})_{n\in\mathbb{N}}\subset\mathcal{S}. $$
(40)

To this end, suppose that \(x\in \mathfrak {W}(x_{n})_{n\in \mathbb {N}}\), i.e., \(x_{k_{n}}\rightharpoonup x\). Since Φ is weakly lower semicontinuous [23, Theorem 2.2.1], by (39),

$$\inf{\Phi}(\mathcal{X})\leq{\Phi}(x)\leq\underline{\lim}\, {\Phi}(x_{k_{n}})=\lim{\Phi}(x_{n})={\inf}{\Phi}(\mathcal{X}). $$

This yields \({\Phi }(x)={\inf }\,{\Phi }(\mathcal {X})\), i.e., x∈Argmin Φ = S.

  1. (1)

    Let \(\overline {x}\in \mathfrak {W}(x_{n})_{n\in \mathbb {N}}\). Since (33) and (40) imply that \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}} \subset \mathcal {S}\cap \overline {\text {dom}}\,f\), we obtain \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}=\{\overline {x}\}\), and in turn, (34) yields \(x_{n}\rightharpoonup \overline {x}\).

  2. (2)

    In view of (40) and Proposition 2, it suffices to show that \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathrm {int\,dom}\, f\).

  3. (2a)

    We have \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathcal {S}\subset \mathrm {int\,dom}\, f\).

  4. (2b)

    This follows from Lemma 1.

  5. (3)

    Let \(\overline {x}\in \mathcal {S}\cap \mathrm {int\,dom}\, f\). Since f satisfies Condition 1, (37) yields

    $$ x_{n+1}-x_{n}\to 0. $$
    (41)

    Now set

    $$(\forall n\in\mathbb{N})\quad y_{n}=x_{n+1} \quad\text{ and} \quad y_{n}^{\ast} = \gamma_{n}^{-1}\big(\nabla g_{n}(x_{n})-\nabla g_{n}(y_{n})\big). $$

    Then (27) and (41) imply that

    $$ (\forall n\in\mathbb{N})\quad y_{n}^{\ast}\in\partial{\Phi}(y_{n}) \quad \text{ and} \quad y_{n}-x_{n}\to 0. $$
    (42)

    Since (31) yields

    $$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})\quad D^{g_{n+1}}(\overline{x},x_{n+1}) &=&D^{g_{n+1}}(\overline{x},y_{n})\\ &\leq& (1+\omega\eta_{n})D^{g_{n}}(\overline{x},y_{n})\\ &=&(1+\omega\eta_{n})D^{g_{n}}(\overline{x},x_{n+1})\\ &\leq&(1+\omega\eta_{n})D^{g_{n}}(\overline{x},x_{n}), \end{array} $$

    we deduce that

    $$ (\forall n\in\mathbb{N})\quad (1+\omega\eta_{n})^{-1} D^{g_{n+1}}(\overline{x},x_{n+1})\leq D^{g_{n}}(\overline{x},y_{n}) \leq D^{g_{n}}(\overline{x},x_{n}). $$
    (43)

    Altogether, (35) and (43) yield

    $$ D^{g_{n}}(\overline{x},y_{n})-D^{g_{n}}(\overline{x},x_{n})\to 0. $$
    (44)

    In (28), by setting \(x=\overline {x}\), we get

    $$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})\quad 0&\leq&\gamma_{n}\langle y_{n}-\overline{x},y_{n}^{\ast}\rangle \\ &=&\langle y_{n}-\overline{x}, \nabla g_{n}(x_{n})-\nabla g_{n}(y_{n})\rangle \\ &=&D^{g_{n}}(\overline{x},x_{n}) - D^{g_{n}}(\overline{x},y_{n})-D^{g_{n}}(y_{n},x_{n})\\ &\leq& D^{g_{n}}(\overline{x},x_{n})-D^{g_{n}}(\overline{x},y_{n}). \end{array} $$
    (45)

    By taking the limit in (45) and using (44), we get

    $$ \langle y_{n}-\overline{x},y_{n}^{\ast}\rangle\to 0. $$
    (46)
  6. (3a)

    In this case \(\mathcal {S}=\{\overline {x}\}\). Since φ is uniformly convex at \(\overline {x}\), so is Φ, and hence there exists an increasing function \(\phi\colon[0,+\infty[\to[0,+\infty]\) that vanishes only at 0 such that

    $$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})(\forall\tau\in]0,1[)\quad &&{\Phi}(\tau\overline{x}+(1-\tau)y_{n})+\tau(1-\tau) \phi(\|y_{n}-\overline{x}\|)\\ && \leq\tau{\Phi}(\overline{x})+(1-\tau){\Phi}(y_{n}). \end{array} $$

    It therefore follows from [23, Page 201] that Φ is uniformly monotone at \(\overline {x}\) and its modulus of convexity is ϕ, i.e.,

    $$ (\forall n\in\mathbb{N})\quad \langle y_{n}-\overline{x},y_{n}^{\ast}\rangle \geq \phi(\|y_{n}-\overline{x}\|)\geq 0. $$
    (47)

    Altogether, (46) and (47) yield \(\phi (\|y_{n}-\overline {x}\|)\to 0\), and thus, \(y_{n}\to \overline {x}\). In turn, (42) yields \(x_{n}\to \overline {x}\). The case when ψ is uniformly convex at \(\overline {x}\) is similar.

  7. (3b)

    First, we observe that S is closed and convex since \({\Phi }\in {\Gamma }_{0}(\mathcal {X})\). Next, for every \(n\in \mathbb {N}\), since \(\mu \hat {f}\succcurlyeq f_{n}\), we derive from (21) that \(\mu \hat {f}\succcurlyeq g_{n}\). Finally, the strong convergence follows from Proposition 3.

In Theorem 1, when \((\forall n\in \mathbb {N})f_{n}=f\), condition (18) is satisfied when both ∇f and ∇ψ are weakly sequentially continuous. More precisely, we have the following result.

Theorem 2

Consider the setting of Problem 1 and let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function such that S∩int dom f≠∅, int dom f⊂int dom ψ, and \(f\succcurlyeq \beta \psi \) for some β∈]0,+∞[. Suppose that either f is cofinite or \(-\text{ran}\,\nabla\psi\subset\text{dom}\,\varphi^{\ast}\), and that \((\forall x\in\mathrm{int\,dom}\, f)\; D^{f}(x,\cdot)\) is coercive. Let ε∈]0,β/(β+1)[, let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that

$$ (\forall n\in\mathbb{N})\quad \varepsilon\leq\gamma_{n}\leq\beta(1-\varepsilon) \quad\text{and}\quad (1+\eta_{n})\gamma_{n}-\gamma_{n+1}\leq\beta\eta_{n}. $$
(48)

Furthermore, let \(x_{0}\in\mathrm{int\,dom}\, f\) and iterate

$$ (\forall n\in\mathbb{N})\quad x_{n+1}=\text{Prox}_{\gamma_{n}\varphi}^{f} \big(\nabla f(x_{n})-\gamma_{n}\nabla\psi(x_{n})\big). $$
(49)

Then there exists \(\overline {x}\in \mathcal {S}\) such that the following hold.

  1. (1)

    Suppose that one of the following holds.

    1. (a)

      \(\mathcal {S}\cap \overline {\text {dom}}f\) is a singleton.

    2. (b)

      ∇f and ∇ψ are weakly sequentially continuous and S⊂int dom f.

    3. (c)

      \(\text{dom}\,f^{\ast}\) is open and ∇f, \(\nabla f^{\ast}\), and ∇ψ are weakly sequentially continuous.

    Then \(x_{n}\rightharpoonup \overline {x}\).

  2. (2)

    Suppose that f satisfies Condition 1 and that one of the following holds.

    1. (a)

      Either φ or ψ is uniformly convex at \(\overline {x}\).

    2. (b)

      \(\underline {\lim }\, D^{f}_{\mathcal {S}}(x_{n})=0\).

    Then \(x_{n}\to \overline {x}\).

Proof

Set \((\forall n\in \mathbb {N})f_{n}=f\). Then

$$ (\forall n\in\mathbb{N})\quad \left\{\begin{array}{lllllll} f_{n}\in\mathcal{P}_{1}(f),\\ f\succcurlyeq f_{n},\\ (1+\eta_{n})f_{n}\succcurlyeq f_{n+1}. \end{array}\right. $$
(50)

(1a): This is a corollary of Theorem 1(1).

(1b)–(1c): First, the proof of Theorem 1(2a) and (2b) shows that \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathrm {int\,dom}\, f\). Next, in view of Theorem 1(2), it suffices to show that (18) holds. To this end, suppose that \(y_{1}\) and \(y_{2}\) are two weak sequential cluster points of \((x_{n})_{n\in \mathbb {N}}\) such that

$$ \big(\langle y_{1}-y_{2},\nabla f(x_{n}) - \gamma_{n}\nabla\psi(x_{n})\rangle \big)_{n\in\mathbb{N}} \quad\text{converges}. $$
(51)

Then there exist two strictly increasing sequences \((k_{n})_{n\in \mathbb {N}}\) and \((l_{n})_{n\in \mathbb {N}}\) in \(\mathbb {N}\) such that \(x_{k_{n}}\rightharpoonup y_{1}\) and \(x_{l_{n}}\rightharpoonup y_{2}\). We derive from (48) and [22, Lemma 2.2.2] that there exists θ∈[ε,β(1−ε)] such that \(\gamma_{n}\to\theta\). Since ∇f and ∇ψ are weakly sequentially continuous, after taking the limit in (51) along the subsequences \((x_{k_{n}})_{n\in \mathbb {N}}\) and \((x_{l_{n}})_{n\in \mathbb {N}}\), respectively, we get

$$ \langle y_{1}-y_{2},\nabla f(y_{1})-\theta\nabla\psi(y_{1})\rangle = \langle y_{1}-y_{2},\nabla f(y_{2})-\theta\nabla\psi(y_{2})\rangle. $$
(52)

Let us define

$$\begin{array}{@{}rcl@{}} h\colon\mathcal{X}&\to&]-\infty,+\infty]\\ x&\mapsto& \left\{\begin{array}{lllllll} f(x)-\theta\psi(x)&\text{ if} \; x\in\mathrm{int\,dom}\, f,\\ +\infty&\text{ otherwise}. \end{array}\right. \end{array} $$

Then h is Gâteaux differentiable on int dom h=int dom f and (52) yields

$$ \langle y_{1}-y_{2},\nabla h(y_{1})-\nabla h(y_{2})\rangle =0. $$
(53)

On the other hand,

$$h-\varepsilon f=f-\theta\psi-\varepsilon f = (1-\varepsilon)(f-\beta\psi) + \big(\beta(1-\varepsilon)-\theta\big)\psi. $$

In turn, since \(f\succcurlyeq \beta \psi \) and \(\theta\leq\beta(1-\varepsilon)\), we obtain \(h\succcurlyeq \varepsilon f\), and hence,

$$D^{h}(y_{1},y_{2})\geq\varepsilon D^{f}(y_{1},y_{2})\quad\text{and}\quad D^{h}(y_{2},y_{1})\geq\varepsilon D^{f}(y_{2},y_{1}). $$

Therefore, (53) yields

$$\begin{array}{@{}rcl@{}} 0=\langle y_{1}-y_{2},\nabla h(y_{1})-\nabla h(y_{2})\rangle&=&D^{h}(y_{1},y_{2})+D^{h}(y_{2},y_{1})\\ &\geq&\varepsilon\left( D^{f}(y_{1},y_{2})+D^{f}(y_{2},y_{1})\right)\\ &=&\varepsilon\langle y_{1}-y_{2},\nabla f(y_{1})-\nabla f(y_{2})\rangle. \end{array} $$

Suppose that \(y_{1}\neq y_{2}\). Since \(f|_{\mathrm{int\,dom}\, f}\) is strictly convex, ∇f is strictly monotone [23, Theorem 2.4.4(ii)], i.e.,

$$ \langle y_{1}-y_{2},\nabla f(y_{1})-\nabla f(y_{2})\rangle > 0 $$

and we reach a contradiction.

  1. (2):

    The conclusions follow from (50) and Theorem 1(3).

Remark 1

In condition (48), taking \((\forall n\in \mathbb {N})\;\eta _{n} = 0\) reduces the second inequality to \(\gamma_{n}\leq\gamma_{n+1}\). We then obtain the forward-backward splitting algorithm with nondecreasing step sizes, which includes the forward-backward splitting algorithm with constant step size as a particular case.
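The step-size condition (48) can be checked mechanically on a finite prefix of the sequences. The following is a minimal sketch, assuming the sequences are given as plain Python lists; the function name `satisfies_condition_48` and the small floating-point tolerance are illustrative choices, not part of the paper.

```python
def satisfies_condition_48(gammas, etas, beta, eps, tol=1e-15):
    """Check condition (48) on a finite prefix of (gamma_n) and (eta_n).

    Hypothetical helper: gammas and etas are lists of floats of equal length.
    """
    # eps must lie in ]0, beta/(beta+1)[ as required in Theorem 2.
    if not (0.0 < eps < beta / (beta + 1.0)):
        return False
    for n, gamma in enumerate(gammas):
        # First part of (48): eps <= gamma_n <= beta * (1 - eps).
        if not (eps <= gamma <= beta * (1.0 - eps)):
            return False
        # Second part of (48): (1 + eta_n) * gamma_n - gamma_{n+1} <= beta * eta_n.
        if n + 1 < len(gammas):
            if (1.0 + etas[n]) * gamma - gammas[n + 1] > beta * etas[n] + tol:
                return False
    return True
```

With all η_n equal to 0 the check accepts constant and nondecreasing step sizes and rejects decreasing ones, in line with Remark 1; a strictly positive η_n leaves room for a controlled decrease.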

Remark 2

Let us rewrite algorithm (49) as follows

$$ (\forall n\in\mathbb{N})\quad x_{n+1} = \underset{x\in\mathcal{X}}{\text{argmin}}\left( \varphi(x) + \langle x-x_{n},\nabla\psi(x_{n})\rangle + \psi(x_{n}) + \gamma_{n}^{-1}D^{f}(x,x_{n})\right). $$
(54)

Another method to solve Problem 1 was proposed in [10]. In that method, instead of solving (54), the authors solve

$$ (\forall n\in\mathbb{N})\quad x_{n+1} = \underset{x\in\mathcal{X}}{\text{argmin}} \left( \varphi(x) + \langle x-x_{n},\nabla\psi(x_{n})\rangle + \psi(x_{n}) + \gamma_{n}^{-1}\|x-x_{n}\|^{p}\right) $$
(55)

for some 1<p≤2. The weak convergence is established under the assumptions that Problem 1 admits a unique solution, ∇ψ is (p−1)-Hölder continuous with constant β, and \(0<\inf _{n\in \mathbb {N}}\gamma _{n}\leq \sup _{n\in \mathbb {N}}\gamma _{n} \leq (1-\delta )/\beta \), where 0<δ<1. The high nonlinearity of the regularization in (55) compared to (54) makes the numerical implementation of this method difficult in general. Furthermore, since (55) yields

$$(\forall n\in\mathbb{N})\quad 0\in\partial\varphi(x_{n+1})+\nabla\psi(x_{n}) + \gamma_{n}^{-1}\partial\big(\|x_{n+1}-x_{n}\|^{p}\big), $$

and since \((\forall n\in \mathbb {N})\partial \big (\|x_{n+1}-x_{n}\|^{p}\big )\) is not separable, this method is not a splitting method.
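The equivalence between the update (49) and the variational form (54) can be illustrated numerically in one dimension. The sketch below rests on several assumptions that are not in the text: \(\mathcal{X}=\mathbb{R}\), φ = 0, f is the Boltzmann–Shannon entropy on ]0,+∞[, and ψ = D^f(·,ρ) for a fixed ρ > 0. The function names and the golden-section search are illustrative. In this setting (49) has the closed form x_{n+1} = exp(ln x_n − γ∇ψ(x_n)), which is compared with a direct minimization of the objective in (54).

```python
import math


def bregman_entropy(x, y):
    """D^f(x, y) for f(t) = t ln t - t, with x, y > 0."""
    return x * math.log(x / y) - x + y


def grad_psi(x, rho):
    """Derivative of psi = D^f(., rho) at x > 0, namely ln x - ln rho."""
    return math.log(x) - math.log(rho)


def step_closed_form(x_n, gamma, rho):
    """Update (49) with phi = 0: grad f(x_{n+1}) = grad f(x_n) - gamma * grad psi(x_n)."""
    return math.exp(math.log(x_n) - gamma * grad_psi(x_n, rho))


def step_argmin(x_n, gamma, rho, lo=1e-9, hi=50.0, iters=200):
    """Update (54) with phi = 0, minimised over [lo, hi] by golden-section search.

    The constant term psi(x_n) is dropped since it does not affect the minimiser.
    """
    def objective(x):
        return (x - x_n) * grad_psi(x_n, rho) + bregman_entropy(x, x_n) / gamma

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if objective(c) < objective(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)
```

For instance, `step_closed_form(3.0, 0.5, 1.0)` and `step_argmin(3.0, 0.5, 1.0)` agree up to the accuracy of the one-dimensional search.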

Remark 3

We can reformulate Problem 1 as the following joint minimization problem

$$\underset{(x,y)\in V}{\text{minimize}}\; {\varphi(x)+\psi(y)}, $$

where \(V=\{(x,y)\in \mathcal {X}\times \mathcal {X}\mid y=x\}\). This constrained problem is equivalent to the following unconstrained problem

$$\underset{(x,y)\in\mathcal{X}\times\mathcal{X}}{\text{minimize}} {\varphi(x)+\psi(y)+\iota_{V}(x,y)}. $$

In [8], a different coupling term between the variables x and y was considered and the problem considered there was

$$\underset{(x,y)\in\mathcal{X}\times\mathcal{X}}{\text{minimize}} {\varphi(x)+\psi(y)+D^{f}(x,y)}, $$

in Euclidean spaces. Their method activates φ and ψ alternately via their so-called left and right Bregman proximity operators (see also [7] for the projection setting). This method does not require the smoothness of ψ, but it requires the computation of the Bregman distance-based proximity operator of ψ.

Next, we provide a particular instance of Theorem 2 in finite-dimensional spaces.

Corollary 1

In the setting of Problem 1, suppose that \(\mathcal {X}\) and \(\mathcal {Y}\) are finite-dimensional. Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function such that S∩int dom f≠∅, int dom f⊂int dom ψ, \(f\succcurlyeq \beta \psi \) for some β∈]0,+∞[, and \(\text{dom}\,f^{\ast}\) is open. Suppose that either f is cofinite or \(-\text{ran}\,\nabla\psi\subset\text{dom}\,\varphi^{\ast}\). Let ε∈]0,β/(β+1)[, let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that

$$ (\forall n\in\mathbb{N})\quad \varepsilon\leq\gamma_{n}\leq\beta(1-\varepsilon)\quad\text{and}\quad (1+\eta_{n})\gamma_{n}-\gamma_{n+1}\leq\beta\eta_{n}. $$

Furthermore, let \(x_{0}\in\mathrm{int\,dom}\, f\) and iterate

$$ (\forall n\in\mathbb{N})\quad x_{n+1}=\text{Prox}_{\gamma_{n}\varphi}^{f} \big(\nabla f(x_{n})-\gamma_{n}\nabla\psi(x_{n})\big). $$

Then there exists \(\overline {x}\in \mathcal {S}\) such that \(x_{n}\to \overline {x}\).

Proof

Since \(\text{dom}\,f^{\ast}\) is open, [5, Lemma 7.3(ix)] asserts that \((\forall x\in\mathrm{int\,dom}\, f)\; D^{f}(x,\cdot)\) is coercive. Hence, the claim follows from Theorem 2(1c). □

4 Application to Multivariate Minimization

In this section, we apply Theorem 2 to solve the following multivariate minimization problem.

Problem 2

Let m and p be strictly positive integers, and let \((\mathcal {X}_{i})_{1\leq i\leq m}\) and \((\mathcal {Y}_{k})_{1\leq k\leq p}\) be reflexive real Banach spaces. For every i∈{1,…,m} and every k∈{1,…,p}, let \(\varphi _{i}\in {\Gamma }_{0}(\mathcal {X}_{i})\), let \(\psi _{k}\in {\Gamma }_{0}(\mathcal {Y}_{k})\) be Gâteaux differentiable on \(\mathrm{int\,dom}\,\psi_{k}\), and let \(L_{ik}\colon \mathcal {X}_{i}\to \mathcal {Y}_{k}\) be linear and bounded. The problem is to

$$ \underset{x_{1}\in\mathcal{X}_{1},\ldots,x_{m}\in\mathcal{X}_{m}}{\text{minimize}} {{\sum}_{i=1}^{m}\varphi_{i}(x_{i}) + {\sum}_{k=1}^{p}\psi_{k} \left( {\sum}_{i=1}^{m}L_{ik}x_{i}\right)}. $$
(56)

Denote by S the set of solutions to (56).

We derive from Theorem 2 the following result.

Proposition 6

Consider the setting of Problem 2. For every k∈{1,…,p}, suppose that there exists \(\sigma_{k}\in\,]0,+\infty[\) such that, for every \((y_{ik})_{1\leq i\leq m}\in(\mathrm{int\,dom}\,\psi_{k})^{m}\) and every \((v_{ik})_{1\leq i\leq m}\in(\mathrm{int\,dom}\,\psi_{k})^{m}\) satisfying \({\sum }_{i=1}^{m}y_{ik}\in \mathrm {int\,dom}\,\psi _{k}\) and \({\sum }_{i=1}^{m}v_{ik}\in \mathrm {int\,dom}\,\psi _{k}\), one has

$$ D^{\psi_{k}}\left( {\sum}_{i=1}^{m}y_{ik},{\sum}_{i=1}^{m}v_{ik}\right) \leq \sigma_{k}{\sum}_{i=1}^{m}D^{\psi_{k}}(y_{ik},v_{ik}). $$
(57)

For every i∈{1,…,m}, let \(f_{i}\in {\Gamma }_{0}(\mathcal {X}_{i})\) be a Legendre function such that \((\forall x_{i}\in \mathrm {int\,dom}\, f_{i})\; D^{f_{i}}(x_{i},\cdot )\) is coercive. For every k∈{1,…,p}, suppose that \({\sum }_{i=1}^{m}L_{ik}(\mathrm {int\,dom}\, f_{i}) \subset \mathrm {int\,dom}\,\psi _{k}\), that, for every i∈{1,…,m}, there exists \(\beta_{ik}\in\,]0,+\infty[\) such that \(f_{i}\succcurlyeq \beta _{ik}\psi _{k}\circ L_{ik}\), and set \(\beta_{k}=\min_{1\leq i\leq m}\beta_{ik}\). In addition, suppose that \(\text {int\,dom}\, f_{i}\neq \emptyset \) and that either \((\forall i\in\{1,\ldots,m\})\; f_{i}\) is cofinite or \((\forall i\in\{1,\ldots,m\})\; \varphi_{i}\) is cofinite. Let \(\varepsilon \in \big ]0,1/\big (1+{\sum }_{k=1}^{p}\sigma _{k}\beta _{k}^{-1}\big )\big [\), let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that

$$ (\forall n\in\mathbb{N})\quad \varepsilon\leq\gamma_{n}\leq \frac{1-\varepsilon}{{\sum}_{k=1}^{p}\sigma_{k}\beta_{k}^{-1}} \quad\text{and}\quad(1+\eta_{n})\gamma_{n}-\gamma_{n+1}\leq \frac{\eta_{n}}{{\sum}_{k=1}^{p}\sigma_{k}\beta_{k}^{-1}}. $$

Furthermore, for every i∈{1,…,m}, let \(x_{i,0}\in \text {int\,dom}\, f_{i}\) and iterate

$$ \begin{array}{l} \text{for}\; n=0,1,\ldots\\ \left\lfloor \begin{array}{l} \text{for} \;i=1,\ldots, m\\ \left\lfloor \begin{array}{l} x_{i,n+1}=\text{Prox}_{\gamma_{n}\varphi_{i}}^{f_{i}}\left( \nabla f_{i}(x_{i,n})-\gamma_{n}{\sum}_{k=1}^{p}L_{ik}^{\ast}\nabla \psi_{k} \left( {\sum}_{j=1}^{m}L_{jk}x_{j,n}\right)\right). \end{array} \right. \end{array} \right. \end{array} $$
(58)

Then there exists \((\overline {x}_{i})_{1\leq i\leq m}\in \mathcal {S}\) such that the following hold.

  1. (1)

    Suppose that \(\mathcal{S}\cap\prod_{i=1}^{m}\overline{\text{dom}}\,f_{i}\) is a singleton. Then \((\forall i\in \{1,\ldots ,m\})\; x_{i,n}\rightharpoonup \overline {x}_{i}\).

  2. (2)

    For every i∈{1,…,m} and every k∈{1,…,p}, suppose that ∇f i and ∇ψ k are weakly sequentially continuous, and that one of the following holds.

    1. (a)

      \(\text{dom}\,\varphi_{i}\subset\mathrm{int\,dom}\, f_{i}\).

    2. (b)

      \(\text {dom}\, f_{i}^{\ast }\) is open and \(\nabla f_{i}^{\ast }\) is weakly sequentially continuous.

    Then \((\forall i\in \{1,\ldots ,m\})x_{i,n}\rightharpoonup \overline {x}_{i}\).

Proof

Denote by \(\mathcal {X}\) and \(\mathcal {Y}\) the standard vector product spaces \(\mathcal{X}_{1}\times\cdots\times\mathcal{X}_{m}\) and \(\mathcal{Y}_{1}\times\cdots\times\mathcal{Y}_{p}\), equipped with the norms \(x=(x_{i})_{1\leq i\leq m}\mapsto \sqrt {{\sum }_{i=1}^{m}\|x_{i}\|^{2}}\) and \(y=(y_{k})_{1\leq k\leq p}\mapsto\sqrt {{\sum }_{k=1}^{p}\|y_{k}\|^{2}}\), respectively. Then \(\mathcal {X}^{\ast }\) is the vector product space \(\mathcal{X}_{1}^{\ast}\times\cdots\times\mathcal{X}_{m}^{\ast}\) equipped with the norm \(x^{\ast }\mapsto \sqrt {{\sum }_{i=1}^{m}\|x_{i}^{\ast }\|^{2}}\) and \(\mathcal {Y}^{\ast }\) is the vector product space \(\mathcal{Y}_{1}^{\ast}\times\cdots\times\mathcal{Y}_{p}^{\ast}\) equipped with the norm \(y^{\ast }\mapsto \sqrt {{\sum }_{k=1}^{p}\|y_{k}^{\ast }\|^{2}}\). Let us introduce the functions and operator

$$ \left\{ \begin{array}{ll} \varphi\colon\mathcal{X}\to]-\infty,+\infty]\colon& x\mapsto{\sum}_{i=1}^{m}\varphi_{i}(x_{i}),\\ f\colon\mathcal{X}\to]-\infty,+\infty]\colon & x\mapsto{\sum}_{i=1}^{m}f_{i}(x_{i}),\\ \psi\colon\mathcal{Y}\to ]-\infty,+\infty]\colon& y\mapsto{\sum}_{k=1}^{p}\psi_{k}(y_{k}),\\ L\colon\mathcal{X}\to\mathcal{Y}\colon& x\mapsto \left( {\sum}_{i=1}^{m}L_{ik}x_{i}\right)_{1\leq k\leq p}. \end{array}\right. $$
(59)

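Then ψ is Gâteaux differentiable on \(\mathrm{int\,dom}\,\psi\) and Problem 2 is a special case of Problem 1. Since (59) yields \(f^{\ast}\colon x^{\ast}\mapsto{\sum}_{i=1}^{m}f_{i}^{\ast}(x_{i}^{\ast})\) and \(\varphi^{\ast}\colon x^{\ast}\mapsto{\sum}_{i=1}^{m}\varphi_{i}^{\ast}(x_{i}^{\ast})\), we deduce from our assumptions that either f is cofinite or φ is cofinite. As in (9) and (10), f is a Legendre function and dom φ∩int dom f≠∅. In addition,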

$$ L(\mathrm{int\,dom}\, f)\subset\mathrm{int\,dom}\,\psi. $$

Now set \(\psi_{L}=\psi\circ L\) and let x∈int dom f. Then ψ is Gâteaux differentiable at Lx and hence \(\psi_{L}\) is Gâteaux differentiable at x. This implies that \(x\in\mathrm{int\,dom}\,\psi_{L}\) and thus \(\mathrm{int\,dom}\, f\subset\mathrm{int\,dom}\,\psi_{L}\). To show that \(D^{f}(x,\cdot)\) is coercive, we fix \(\rho \in \mathbb {R}\). On one hand,

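$$ \{z\in\mathcal{X}\mid D^{f}(x,z)\leq\rho\}\subset\prod_{i=1}^{m}\{z_{i}\in\mathcal{X}_{i}\mid D^{f_{i}}(x_{i},z_{i})\leq\rho\}. $$
(60)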

On the other hand, for every i∈{1,…,m}, since \(D^{f_{i}}(x_{i},\cdot )\) is coercive, we deduce that

$$\{z_{i}\in\mathcal{X}_{i}\mid D^{f_{i}}(x_{i},z_{i})\leq\rho\}\quad\text{is bounded}. $$

Hence, (60) implies that \(\{z\in \mathcal {X}\mid D^{f}(x,z)\leq \rho \}\) is bounded and \(D^{f}(x,\cdot)\) is therefore coercive. Next, set \(\beta =1/{\sum }_{k=1}^{p}\sigma _{k}\beta _{k}^{-1}\). We shall show that \(f\succcurlyeq \beta \psi _{L}\). To this end, fix \(z=(z_{i})_{1\leq i\leq m}\in\mathrm{int\,dom}\, f\). We have

$$\begin{array}{@{}rcl@{}} D^{\psi_{L}}(x,z)=D^{\psi}(Lx,Lz)&=&{\sum}_{k=1}^{p}D^{\psi_{k}}\left( {\sum}_{i=1}^{m}L_{ik}x_{i},{\sum}_{i=1}^{m}L_{ik}z_{i}\right)\\ &\leq&{\sum}_{k=1}^{p}{\sum}_{i=1}^{m}\sigma_{k} D^{\psi_{k}}(L_{ik}x_{i},L_{ik}z_{i})\\ &\leq&{\sum}_{k=1}^{p}{\sum}_{i=1}^{m}\sigma_{k}\beta_{ik}^{-1}D^{f_{i}}(x_{i},z_{i})\\ &\leq&{\sum}_{k=1}^{p}\sigma_{k}\beta_{k}^{-1}D^{f}(x,z). \end{array} $$

Now let us set \((\forall n\in \mathbb {N})x_{n}=(x_{i,n})_{1\leq i\leq m}\). By virtue of Proposition 5, (58) is a particular case of (49).

  1. (1)

    Since \(\mathcal {S}\cap \overline {\text {dom}}f\) is a singleton, the claim follows from Theorem 2(1a).

  2. (2)

    Our assumptions on \((f_{i})_{1\leq i\leq m}\) and \((\psi_{k})_{1\leq k\leq p}\) imply that ∇f and ∇ψ are weakly sequentially continuous.

  3. (2a)

    Since \(\mathcal{S}\subset\text{dom}\,\varphi\subset\mathrm{int\,dom}\, f\), the claim follows from Theorem 2(1b).

  4. (2b)

    Since, for every i∈{1,…,m}, \(\text {dom}\, f_{i}^{\ast }\) is open and \(\nabla f_{i}^{\ast }\) is weakly sequentially continuous, we deduce that \(\text{dom}\,f^{\ast}\) is open and \(\nabla f^{\ast}\) is weakly sequentially continuous. The assertion therefore follows from Theorem 2(1c).
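The constants entering the step-size rule of Proposition 6 can be assembled as follows. This is a minimal sketch, assuming that the constants β_ik and σ_k are stored in plain Python lists (a hypothetical layout) and that the function name `step_size_upper_bound` is chosen only for illustration. It computes β_k = min_i β_ik, the sum Σ_k σ_k β_k^{-1}, and the upper bound (1−ε)/Σ_k σ_k β_k^{-1} on γ_n.

```python
def step_size_upper_bound(beta_ik, sigma, eps):
    """Upper bound on gamma_n in Proposition 6.

    beta_ik[i][k] and sigma[k] hold the constants beta_ik and sigma_k
    (hypothetical list layout); eps is the parameter epsilon.
    """
    m, p = len(beta_ik), len(sigma)
    # beta_k = min over i of beta_ik.
    beta_k = [min(beta_ik[i][k] for i in range(m)) for k in range(p)]
    # total = sum over k of sigma_k / beta_k.
    total = sum(sigma[k] / beta_k[k] for k in range(p))
    if not (0.0 < eps < 1.0 / (1.0 + total)):
        raise ValueError("eps must lie in ]0, 1/(1 + sum_k sigma_k/beta_k)[")
    return (1.0 - eps) / total
```

For example, with σ_k = 1 and β_ik = 1 for all indices and p = 3, the bound is (1−ε)/3, which matches the factor p^{-1}(1−ε) appearing in Example 6.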

Example 5

In Problem 2, suppose that m=1, that \(\mathcal {X}_{1}\) and \((\mathcal {Y}_{k})_{1\leq k\leq p}\) are Hilbert spaces, and that, for every k∈{1,…,p}, \(\psi_{k}=\omega_{k}\|\cdot-r_{k}\|^{2}/2\), where \((\omega_{k})_{1\leq k\leq p}\in\,]0,+\infty[^{p}\) and \(r_{k}\in\mathcal{Y}_{k}\). Then the weak convergence result in [13, Proposition 6.3] without errors is a particular instance of Proposition 6 with \(f_{1}=\|\cdot\|^{2}/2\).

Example 6

Let m and p be strictly positive integers. For every i∈{1,…,m} and every k∈{1,…,p}, let \(\omega_{ik}\in\,]0,+\infty[\), let \(\rho_{k}\in\,]0,+\infty[\), and let \(\varphi _{i}\in {\Gamma }_{0}(\mathbb {R})\) be cofinite. The problem is to

$$ \underset{(\xi_{1},\ldots,\xi_{m})\in]0,+\infty[^{m}}{\text{minimize}} {{\sum}_{i=1}^{m} \varphi_{i}(\xi_{i})+{\sum}_{k=1}^{p} \left( -\ln\frac{{\sum}_{i=1}^{m}\omega_{ik}\xi_{i}}{\rho_{k}} + \frac{{\sum}_{i=1}^{m}\omega_{ik}\xi_{i}}{\rho_{k}}-1\right)}. $$
(61)

Denote by S the set of solutions to (61) and suppose that \(\mathcal{S}\cap\,]0,+\infty[^{m}\neq\emptyset\). Let

$$\vartheta\colon\mathbb{R}\to]-\infty,+\infty]\colon\xi\mapsto \left\{\begin{array}{lllllll} -\ln\xi&\text{ if}\; \xi>0,\\ +\infty&\text{ otherwise} \end{array}\right. $$

be Burg entropy, let ε∈]0,1/(1 + p)[, let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that

$$(\forall n\in\mathbb{N})\quad \varepsilon\leq\gamma_{n}\leq p^{-1}(1-\varepsilon)\quad\text{and}\quad (1+\eta_{n})\gamma_{n}-\gamma_{n+1}\leq p^{-1}\eta_{n}. $$

Let \((\xi_{i,0})_{1\leq i\leq m}\in\,]0,+\infty[^{m}\) and iterate

$$ \begin{array}{l} \text{for} \;n=0,1,\ldots\\ \left\lfloor \begin{array}{l} \text{for\;} i=1,\ldots, m\\ \left\lfloor \begin{array}{l} \xi_{i,n+1}=\text{Prox}_{\gamma_{n}\varphi_{i}}^{\vartheta}\left( \frac{-1}{\xi_{i,n}} -\gamma_{n}{\sum}_{k=1}^{p}\omega_{ik}\left( \frac{-1}{{\sum}_{j=1}^{m}\omega_{jk}\xi_{j,n}}+ \frac{1}{\rho_{k}}\right)\right). \end{array} \right. \end{array} \right. \end{array} $$

Then there exists \((\overline {\xi }_{i})_{1\leq i\leq m}\in \mathcal {S}\) such that \((\forall i\in \{1,\ldots ,m\})\xi _{i,n}\to \overline {\xi }_{i}\).

Proof

For every i∈{1,…,m} and every k∈{1,…,p}, let us set \(\mathcal {X}_{i}=\mathbb {R}\), \(\mathcal {Y}_{k}=\mathbb {R}\), \(\psi_{k}=D^{\vartheta}(\cdot,\rho_{k})\), and \(L_{ik}\colon\xi_{i}\mapsto\omega_{ik}\xi_{i}\). Then (61) is a particular case of (56). Since ψ is not differentiable on \(\mathbb {R}^{p}\), the standard forward-backward algorithm is inapplicable. We show that the problem can be solved by using Proposition 6. First, let \((\xi_{i})_{1\leq i\leq m}\) and \((\eta_{i})_{1\leq i\leq m}\) be in \(]0,+\infty[^{m}\), and consider

$$ \phi\colon\mathbb{R}\to]-\infty,+\infty]\colon\xi\mapsto \left\{\begin{array}{lllllll} -\ln\xi+\xi-1 &\text{ if}\; \xi\in]0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$

We see that ϕ is convex and positive. Thus,

$$\phi\left( \frac{{\sum}_{i=1}^{m}\xi_{i}}{{\sum}_{i=1}^{m}\eta_{i}}\right) = \phi\left( {\sum}_{i=1}^{m}\frac{\eta_{i}}{{\sum}_{j=1}^{m}\eta_{j}}\frac{\xi_{i}}{\eta_{i}}\right) \leq {\sum}_{i=1}^{m}\frac{\eta_{i}}{{\sum}_{j=1}^{m}\eta_{j}}\phi\left( \frac{\xi_{i}}{\eta_{i}}\right) \leq {\sum}_{i=1}^{m}\phi\left( \frac{\xi_{i}}{\eta_{i}}\right), $$

and hence,

$$-\ln\frac{{\sum}_{i=1}^{m}\xi_{i}}{{\sum}_{i=1}^{m}\eta_{i}} + \frac{{\sum}_{i=1}^{m}\xi_{i}}{{\sum}_{i=1}^{m}\eta_{i}}-1 \leq {\sum}_{i=1}^{m}\left( -\ln\frac{\xi_{i}}{\eta_{i}} + \frac{\xi_{i}}{\eta_{i}}-1\right). $$

In turn,

$$ D^{\vartheta}\left( {\sum}_{i=1}^{m}\xi_{i},{\sum}_{i=1}^{m}\eta_{i}\right) \leq {\sum}_{i=1}^{m}D^{\vartheta}(\xi_{i},\eta_{i}). $$

This shows that (57) is satisfied with \((\forall k\in\{1,\ldots,p\})\;\sigma_{k}=1\). Next, let us set \((\forall i\in\{1,\ldots,m\})\; f_{i}=\vartheta\). Fix i∈{1,…,m} and k∈{1,…,p}, and let \(\xi_{i}\) and \(\eta_{i}\) be in \(]0,+\infty[\). Then

$$D^{\psi_{k}}(L_{ik}\xi_{i},L_{ik}\eta_{i}) = D^{\vartheta}(\omega_{ik}\xi_{i},\omega_{ik}\eta_{i}) = D^{\vartheta}(\xi_{i},\eta_{i})=D^{f_{i}}(\xi_{i},\eta_{i}), $$

which implies that \(f_{i}\succcurlyeq \psi _{k}\circ L_{ik}\). In addition, since \(\text {dom}\, f_{i}^{\ast }=]-\infty ,0[\) is open, [5, Lemma 7.3(ix)] asserts that \(D^{f_{i}}(\xi _{i},\cdot )\) is coercive. We therefore deduce the convergence result from Proposition 6(2b). □
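The iteration of Example 6 can be sketched in a few lines once a concrete φ_i is fixed. The sketch below assumes φ_i(ξ) = ξ²/2 for every i (a cofinite choice made here only for illustration), a constant step size γ, and list-based storage of ω and ρ; the function names are hypothetical. Under this assumption the operator \(\text{Prox}_{\gamma\varphi_{i}}^{\vartheta}\) amounts to solving −1/ξ + γξ = ξ* for ξ > 0, which has the closed-form positive root used in `prox_burg_quadratic`. The function `burg_bregman` evaluates the Itakura–Saito divergence so that the subadditivity and scale invariance used in the proof can be checked on sample points.

```python
import math


def burg_bregman(x, y):
    """D^theta(x, y) = -ln(x/y) + x/y - 1 for the Burg entropy, with x, y > 0."""
    return -math.log(x / y) + x / y - 1.0


def prox_burg_quadratic(x_star, gamma):
    """Positive solution xi of -1/xi + gamma * xi = x_star (assumes phi_i = xi^2 / 2)."""
    return (x_star + math.sqrt(x_star * x_star + 4.0 * gamma)) / (2.0 * gamma)


def iterate_example6(omega, rho, xi0, gamma, n_iter):
    """Run n_iter steps of the iteration of Example 6 with constant step gamma.

    omega[i][k] > 0 and rho[k] > 0 are lists; xi0 is the starting point in ]0, +inf[^m.
    """
    m, p = len(omega), len(rho)
    xi = list(xi0)
    for _ in range(n_iter):
        # s[k] = sum_j omega[j][k] * xi[j]
        s = [sum(omega[j][k] * xi[j] for j in range(m)) for k in range(p)]
        new_xi = []
        for i in range(m):
            # i-th partial derivative of the smooth term at the current iterate.
            grad_i = sum(omega[i][k] * (-1.0 / s[k] + 1.0 / rho[k]) for k in range(p))
            new_xi.append(prox_burg_quadratic(-1.0 / xi[i] - gamma * grad_i, gamma))
        xi = new_xi
    return xi
```

With m = p = 1, ω = ρ = 1, and γ = 0.5, the first-order condition ξ − 1/ξ + 1 = 0 gives the minimizer (√5 − 1)/2, and `iterate_example6([[1.0]], [1.0], [1.0], 0.5, 200)` approaches this value.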

Example 7

Let m and p be strictly positive integers. For every i∈{1,…,m} and every k∈{1,…,p}, let \(\omega_{ik}\in\,]0,+\infty[\), let \(\rho_{k}\in\,]0,+\infty[\), and let \(\varphi _{i}\in {\Gamma }_{0}(\mathbb {R})\). The problem is to

$$ \underset{(\xi_{1},\ldots,\xi_{m})\in\left[0,+\infty\right[^{m}}{\text{minimize}} {{\sum}_{i=1}^{m}\varphi_{i}(\xi_{i})+ {\sum}_{k=1}^{p} \left( \left( {\sum}_{i=1}^{m}\omega_{ik}\xi_{i}\right) \ln\frac{{\sum}_{i=1}^{m}\omega_{ik}\xi_{i}}{\rho_{k}} - {\sum}_{i=1}^{m}\omega_{ik}\xi_{i}+\rho_{k}\right)}. $$
(62)

Denote by S the set of solutions to (62) and suppose that \(\mathcal{S}\cap\,]0,+\infty[^{m}\neq\emptyset\). Let

$$\vartheta\colon\mathbb{R}\to]-\infty,+\infty]\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi\ln\xi-\xi&\text{ if}\; \xi\in]0,+\infty[,\\ 0&\text{ if}\; \xi=0,\\ +\infty&\text{ otherwise} \end{array}\right. $$

be Boltzmann–Shannon entropy, let \(\beta=\max_{1\leq k\leq p}\max_{1\leq i\leq m}\omega_{ik}\), let ε∈]0,1/(1 + β)[, let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that

$$(\forall n\in\mathbb{N})\quad \varepsilon\leq\gamma_{n} \leq(p\beta)^{-1}(1-\varepsilon)\quad\text{and}\quad (1+\eta_{n})\gamma_{n}-\gamma_{n+1}\leq(p\beta)^{-1}\eta_{n}. $$

Let \((\xi_{i,0})_{1\leq i\leq m}\in\,]0,+\infty[^{m}\) and iterate

$$ \begin{array}{l} \text{for}\; n=0,1,\ldots\\ \left\lfloor \begin{array}{l} \text{for}\; i=1,\ldots, m\\ \left\lfloor \begin{array}{l} \xi_{i,n+1}=\text{Prox}^{\vartheta}_{\gamma_{n}\varphi_{i}}\left( \ln\xi_{i,n} - \gamma_{n}{\sum}_{k=1}^{p}\omega_{ik} \left( \ln\left( {\sum}_{j=1}^{m}\omega_{jk}\xi_{j,n}\right) - \ln\rho_{k}\right)\right). \end{array} \right. \end{array} \right. \end{array} $$

Then, there exists \((\overline {\xi }_{i})_{1\leq i\leq m}\in \mathcal {S}\) such that \((\forall i\in \{1,\ldots ,m\})\xi _{i,n}\to \overline {\xi }_{i}\).

Proof

For every i∈{1,…,m} and every k∈{1,…,p}, let us set \(\mathcal {X}_{i}=\mathbb {R}\), \(\mathcal {Y}_{k}=\mathbb {R}\), \(\psi_{k}=D^{\vartheta}(\cdot,\rho_{k})\), and \(L_{ik}\colon\xi_{i}\mapsto\omega_{ik}\xi_{i}\). Then (62) is a particular case of (56). We cannot apply the standard forward-backward algorithm here since ψ is not differentiable on \(\mathbb {R}^{p}\). We shall verify the assumptions of Proposition 6. First, let \((\xi_{i})_{1\leq i\leq m}\) and \((\eta_{i})_{1\leq i\leq m}\) be in \(]0,+\infty[^{m}\). Since

$$ \phi\colon\mathbb{R}\to]-\infty,+\infty]\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi\ln\xi &\text{ if}\; \xi\in]0,+\infty[,\\ 0&\text{ if}\; \xi=0,\\ +\infty&\text{ otherwise} \end{array}\right. $$

is convex, we have

$$\phi\left( \frac{{\sum}_{i=1}^{m}\xi_{i}}{{\sum}_{i=1}^{m}\eta_{i}}\right) = \phi\left( {\sum}_{i=1}^{m}\frac{\eta_{i}}{{\sum}_{j=1}^{m}\eta_{j}} \frac{\xi_{i}}{\eta_{i}}\right) \leq {\sum}_{i=1}^{m}\frac{\eta_{i}}{{\sum}_{j=1}^{m}\eta_{j}} \phi \left( \frac{\xi_{i}}{\eta_{i}}\right), $$

and hence,

$$\frac{{\sum}_{i=1}^{m}\xi_{i}}{{\sum}_{i=1}^{m}\eta_{i}} \ln\frac{{\sum}_{i=1}^{m}\xi_{i}}{{\sum}_{i=1}^{m}\eta_{i}} \leq {\sum}_{i=1}^{m}\frac{\eta_{i}}{{\sum}_{j=1}^{m}\eta_{j}} \frac{\xi_{i}}{\eta_{i}}\ln\frac{\xi_{i}}{\eta_{i}} = \frac{{\sum}_{i=1}^{m}\xi_{i}\ln\frac{\xi_{i}}{\eta_{i}}}{{\sum}_{i=1}^{m}\eta_{i}}. $$

In turn,

$$\left( {\sum}_{i=1}^{m}\xi_{i}\right) \ln\frac{{\sum}_{i=1}^{m}\xi_{i}}{{\sum}_{i=1}^{m}\eta_{i}} \leq{\sum}_{i=1}^{m}\xi_{i}\ln\frac{\xi_{i}}{\eta_{i}}, $$

which implies that

$$\begin{array}{@{}rcl@{}} D^{\vartheta} \left( {\sum}_{i=1}^{m}\xi_{i},{\sum}_{i=1}^{m}\eta_{i}\right) &=& \left( {\sum}_{i=1}^{m}\xi_{i}\right) \ln\frac{{\sum}_{i=1}^{m}\xi_{i}}{{\sum}_{i=1}^{m}\eta_{i}} - {\sum}_{i=1}^{m}\xi_{i}+{\sum}_{i=1}^{m}\eta_{i}\\ &\leq&{\sum}_{i=1}^{m}\left( \xi_{i}\ln\frac{\xi_{i}}{\eta_{i}}-\xi_{i}+\eta_{i}\right)\\ &=&{\sum}_{i=1}^{m}D^{\vartheta}(\xi_{i},\eta_{i}). \end{array} $$

This shows that (57) is satisfied with \((\forall k\in\{1,\ldots,p\})\;\sigma_{k}=1\). Next, let us set \((\forall i\in\{1,\ldots,m\})\; f_{i}=\vartheta\). Fix i∈{1,…,m} and k∈{1,…,p}, and let \(\xi_{i}\) and \(\eta_{i}\) be in \(]0,+\infty[\). Then

$$D^{\psi_{k}}(L_{ik}\xi_{i},L_{ik}\eta_{i}) = D^{\vartheta}(\omega_{ik}\xi_{i},\omega_{ik}\eta_{i}) = \omega_{ik}D^{\vartheta}(\xi_{i},\eta_{i}) \leq \beta D^{\vartheta}(\xi_{i},\eta_{i}), $$

which implies that \(f_{i}\succcurlyeq \beta ^{-1}\psi _{k}\circ L_{ik}\). In addition, since f i is supercoercive, f i is cofinite and [5, Lemma 7.3(viii)] asserts that \(D^{f_{i}}(\xi _{i},\cdot )\) is coercive. Therefore, the claim follows from Proposition 6(2b). □
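The iteration of Example 7 takes a multiplicative form when the nonsmooth terms vanish. The sketch below assumes φ_i = 0 for every i (admissible here since f_i is cofinite, but chosen only for illustration), a constant step size γ, and list-based storage of ω and ρ; the function names are hypothetical. Under this assumption \(\text{Prox}_{\gamma\varphi_{i}}^{\vartheta}\) reduces to the exponential, the inverse of ∇ϑ = ln. The function `kl_bregman` evaluates the Kullback–Leibler divergence so that the subadditivity and the identity D^ϑ(ωξ,ωη) = ωD^ϑ(ξ,η) from the proof can be checked on sample points.

```python
import math


def kl_bregman(x, y):
    """D^theta(x, y) = x ln(x/y) - x + y for the Boltzmann-Shannon entropy, with x, y > 0."""
    return x * math.log(x / y) - x + y


def iterate_example7(omega, rho, xi0, gamma, n_iter):
    """Run n_iter steps of the iteration of Example 7 with phi_i = 0 (assumption).

    omega[i][k] > 0 and rho[k] > 0 are lists; xi0 is the starting point in ]0, +inf[^m.
    """
    m, p = len(omega), len(rho)
    xi = list(xi0)
    for _ in range(n_iter):
        # s[k] = sum_j omega[j][k] * xi[j]
        s = [sum(omega[j][k] * xi[j] for j in range(m)) for k in range(p)]
        # With phi_i = 0 the Bregman proximity step is exp applied to the dual point.
        xi = [
            math.exp(
                math.log(xi[i])
                - gamma * sum(omega[i][k] * (math.log(s[k]) - math.log(rho[k])) for k in range(p))
            )
            for i in range(m)
        ]
    return xi
```

As a usage example, take ω = [[1, 2], [3, 1]] and ρ = [4, 3], so that ξ = (1, 1) reproduces ρ exactly and is the unique minimizer. Here β = 3 and p = 2, so a constant step γ = 0.15 is compatible with the bound (pβ)^{-1}(1−ε) for ε = 0.1.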

Remark 4

The Bregman distance associated with Burg entropy, i.e., the Itakura–Saito divergence, is used in linear regression [3, Section 3]. The Bregman distance associated with Boltzmann–Shannon entropy, i.e., the Kullback–Leibler divergence, is used in information theory [3, Section 3] and image processing [11].