Abstract
We propose a forward-backward splitting algorithm based on Bregman distances for composite minimization problems in general reflexive Banach spaces. The convergence is established using the notion of variable quasi-Bregman monotone sequences. Various examples are discussed, including some in Euclidean spaces, where new algorithms are obtained.
1 Introduction
In this paper, we propose a forward-backward splitting algorithm to solve the following composite convex minimization problem considered in Banach spaces.
Problem 1
Let \(\mathcal {X}\) be a reflexive real Banach space, let \(\varphi \colon \mathcal {X}\to ]-\infty ,+\infty ]\) and \(\psi \colon \mathcal {X}\to ]-\infty ,+\infty ]\) be proper lower semi-continuous convex functions, and suppose that ψ is Gâteaux differentiable on the interior of its domain. The problem is to
$$ \underset{x\in\mathcal{X}}{\text{minimize}}\;\;\varphi(x)+\psi(x). $$(1)
The set of solutions to (1) is denoted by \(\mathcal {S}\).
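A particular instance of (1) when ψ is the Bregman distance associated with a differentiable convex function f, i.e.,
$$ D^{f}\colon\mathcal{X}\times\mathcal{X}\to[0,+\infty]\colon(x,y)\mapsto \left\{\begin{array}{ll} f(x)-f(y)-\langle x-y,\nabla f(y)\rangle&\text{ if}\; y\in\mathrm{int\,dom}\,f,\\ +\infty&\text{ otherwise}, \end{array}\right. $$(2)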
where \(\text {dom}\,f =\{x\in \mathcal {X}\mid f(x)<+\infty \}\) and int dom f is its interior, provides a framework for many problems arising in applied mathematics. For instance, when \(\mathcal {X}\) is a Euclidean space and f is the Boltzmann–Shannon entropy, it captures many problems in information theory and signal recovery [9].
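As a small illustration of the Euclidean case just mentioned, the following sketch evaluates the Bregman distance numerically, assuming the standard form \(D^{f}(x,y)=f(x)-f(y)-\langle x-y,\nabla f(y)\rangle\) for \(y\in\mathrm{int\,dom}\,f\) and the Boltzmann–Shannon entropy \(f(x)={\sum }_{i}(\xi_{i}\ln \xi_{i}-\xi_{i})\). Under these assumptions the distance should agree with the generalized Kullback–Leibler divergence. All function names are hypothetical and chosen for this example only.

```python
import math

def boltzmann_shannon(x):
    """f(x) = sum_i (x_i ln x_i - x_i), with 0 ln 0 = 0 and +inf outside [0, +inf[^m."""
    if any(xi < 0 for xi in x):
        return math.inf
    return sum(xi * math.log(xi) - xi if xi > 0 else 0.0 for xi in x)

def grad_boltzmann_shannon(y):
    """Gradient of f on the interior of its domain (all y_i > 0)."""
    return [math.log(yi) for yi in y]

def bregman_distance(f, grad_f, x, y):
    """D^f(x, y) = f(x) - f(y) - <x - y, grad f(y)>, assuming y lies in int dom f."""
    g = grad_f(y)
    return f(x) - f(y) - sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))

def kullback_leibler(x, y):
    """Generalized Kullback-Leibler divergence sum_i (x_i ln(x_i / y_i) - x_i + y_i)."""
    return sum((xi * math.log(xi / yi) if xi > 0 else 0.0) - xi + yi
               for xi, yi in zip(x, y))
```

For strictly positive vectors, `bregman_distance(boltzmann_shannon, grad_boltzmann_shannon, x, y)` and `kullback_leibler(x, y)` should coincide up to rounding.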
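It was shown in [14] that if \(\mathcal {X}\) is Hilbertian and ψ possesses a \(\beta^{-1}\)-Lipschitz continuous gradient for some \(\beta\in\,]0,+\infty[\), then Problem 1 can be solved by the standard forward-backward algorithm
$$ (\forall n\in\mathbb{N})\quad x_{n+1}=\text{prox}_{\gamma_{n}\varphi}\big(x_{n}-\gamma_{n}\nabla\psi(x_{n})\big),\quad\text{where}\quad\gamma_{n}\in\,]0,2\beta[. $$(3)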
Here, prox is Moreau's proximity operator [19]. However, many problems in applications do not conform to these hypotheses, for example when \(\mathcal {X}\) is a Euclidean space and ψ is the Boltzmann–Shannon entropy, which appears in many problems in image and signal processing, in statistics, and in machine learning [2, 11, 12, 16–18]. Another difficulty in the implementation of (3) is that the operator prox is not always easy to evaluate.
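To make the classical Hilbertian scheme concrete, here is a minimal sketch of a forward (gradient) step followed by a backward (proximal) step in \(\mathbb {R}^{m}\). The test problem is an assumption made for illustration and is not taken from the paper: \(\varphi =\lambda \|\cdot \|_{1}\), whose proximity operator is soft-thresholding, and \(\psi =\|\cdot -b\|^{2}/2\), whose gradient is 1-Lipschitz. The names `soft_threshold` and `forward_backward` are hypothetical.

```python
def soft_threshold(v, t):
    """Moreau proximity operator of t*|.| evaluated at the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def forward_backward(b, lam, gamma, n_iter):
    """Iterate x <- prox_{gamma*lam*|.|_1}(x - gamma*(x - b)), starting from 0."""
    x = [0.0] * len(b)
    for _ in range(n_iter):
        # Forward step: explicit gradient step on psi(x) = ||x - b||^2 / 2.
        forward = [xi - gamma * (xi - bi) for xi, bi in zip(x, b)]
        # Backward step: proximity operator of gamma * lam * ||.||_1.
        x = [soft_threshold(v, gamma * lam) for v in forward]
    return x
```

For this toy problem the minimizer is known in closed form (soft-thresholding of b at level λ), so the iterates can be checked against it.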
The objective of the present paper is to propose a forward-backward splitting algorithm that solves Problem 1 in the general framework of reflexive real Banach spaces, whereas the forward-backward method has so far been limited to Hilbert spaces. This algorithm, which employs Bregman distance-based proximity operators, also provides new algorithms in Euclidean spaces which are, in some instances, more favorable than the standard forward-backward splitting algorithm. The framework can be applied in the case when ψ is not everywhere differentiable. The paper is organized as follows. In Section 2, we provide some preliminary results. We present the algorithm and prove its convergence in Section 3. Section 4 is devoted to an application of our result to a multivariate minimization problem, together with examples.
Notation and Background
Throughout this paper, \(\mathcal {X}\) is reflexive, \(\mathcal {X}^{\ast }\) is the dual space of \(\mathcal {X}\), 〈⋅,⋅〉 is the duality pairing between \(\mathcal {X}\) and \(\mathcal {X}^{\ast }\), and ∥⋅∥ is the norm of \(\mathcal {X}\). The symbols \(\rightharpoonup \) and → represent respectively weak and strong convergence. The set of weak sequential cluster points of a sequence \((x_{n})_{n\in \mathbb {N}}\) is denoted by \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\). Let \(M\colon \mathcal {X}\to 2^{\mathcal {X}^{\ast }}\). The domain of M is \(\text {dom}\,M=\{x\in \mathcal {X}\mid Mx\neq \emptyset \}\) and the range of M is \(\text {ran}\,M=\{x^{*}\in \mathcal {X}^{\ast }\mid (\exists x\in \mathcal {X}) x^{\ast }\in Mx\}\). Let \(f\colon \mathcal {X}\to ]-\infty ,+\infty ]\). Then, f is cofinite if \(\text {dom}\, f^{\ast }=\mathcal {X}^{\ast }\), is coercive if \(\lim _{\|x\|\to +\infty }f(x)=+\infty \), is supercoercive if \(\lim _{\|x\|\to +\infty }f(x)/\|x\|=+\infty \), and is uniformly convex at \(x\in \text {dom}\,f\) if there exists an increasing function \(\phi \colon [0,+\infty [\to [0,+\infty ]\) that vanishes only at 0 such that
$$ (\forall y\in\text{dom}\,f)(\forall\alpha\in]0,1[)\quad f(\alpha x+(1-\alpha)y)+\alpha(1-\alpha)\phi(\|x-y\|)\leq\alpha f(x)+(1-\alpha)f(y). $$
Denote by \({\Gamma }_{0}(\mathcal {X})\) the class of all lower semicontinuous convex functions \(f\colon \mathcal {X}\to ]-\infty ,+\infty ]\) such that \(\text {dom}\,f=\{x\in \mathcal {X}\mid f(x)<+\infty \}\neq \emptyset \). Let \(f\in {\Gamma }_{0}(\mathcal {X})\). Denote by Argmin f the set of global minimizers of f, by \(f^{\ast }\colon \mathcal {X}^{\ast }\to ]-\infty ,+\infty ] \colon x^{\ast }\mapsto \sup _{x\in \mathcal {X}}(\langle x,x^{\ast }\rangle -f(x))\) the conjugate of f, and by
$$ \partial f\colon\mathcal{X}\to 2^{\mathcal{X}^{\ast}}\colon x\mapsto\big\{x^{\ast}\in\mathcal{X}^{\ast}\mid(\forall y\in\mathcal{X})\;\langle y-x,x^{\ast}\rangle+f(x)\leq f(y)\big\} $$
the Moreau subdifferential of f. In addition, if f is Gâteaux differentiable on int dom f≠∅ then
We denote
Moreover, if g 1 and g 2 are in \(\mathcal {F}(f)\), then
For every α∈[0, + ∞[, set
Finally, \(\ell _{+}^{1}(\mathbb {N})\) is the set of all summable sequences in [0, + ∞[.
2 Preliminary Results
In this section, we give some preliminary results on Legendre functions, Bregman monotonicity, and Bregman distance-based proximity operators that will be used in the next section.
Definition 1
[5, 6] Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅. We say that f is a Legendre function if it is essentially smooth in the sense that \(\partial f\) is both locally bounded and single-valued on its domain, and essentially strictly convex in the sense that \(\partial f^{\ast }\) is locally bounded on its domain and f is strictly convex on every convex subset of \(\text {dom}\,\partial f\). Let C be a closed convex subset of \(\mathcal {X}\) such that C∩int dom f≠∅. The Bregman projector onto C induced by f is
$$ {P_{C}^{f}}\colon\mathrm{int\,dom}\,f\to C\cap\mathrm{int\,dom}\,f\colon y\mapsto\underset{x\in C}{\text{argmin}}\;D^{f}(x,y), $$
and the \(D^{f}\)-distance to C is the function
$$ {D_{C}^{f}}\colon\mathcal{X}\to[0,+\infty]\colon y\mapsto\inf D^{f}(C,y). $$
Definition 2
[20] Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅, let \((f_{n})_{n\in \mathbb {N}}\) be in \(\mathcal {F}(f)\), let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\,f)^{\mathbb {N}}\), and let \(C \subset \mathcal {X}\) be such that C∩dom f≠∅. Then \((x_{n})_{n\in \mathbb {N}}\) is:
1. quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) if
$$\begin{array}{@{}rcl@{}} &&(\exists(\eta_{n})_{n\in\mathbb{N}}\in\ell_{+}^{1}(\mathbb{N})) (\forall x\in C\cap\text{dom}\, f) (\exists(\varepsilon_{n})_{n\in\mathbb{N}}\in \ell_{+}^{1}(\mathbb{N}))(\forall n\in\mathbb{N})\\ &&\hspace{2cm} D^{f_{n+1}}(x,x_{n+1})\leq (1+\eta_{n})D^{f_{n}}(x,x_{n})+\varepsilon_{n}; \end{array} $$
2. stationarily quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) if
$$\begin{array}{@{}rcl@{}} &&(\exists(\varepsilon_{n})_{n\in\mathbb{N}}\in\ell_{+}^{1}(\mathbb{N})) (\exists(\eta_{n})_{n\in\mathbb{N}}\in\ell_{+}^{1}(\mathbb{N}))(\forall x\in C\cap\text{dom}\, f) (\forall n\in\mathbb{N})\\ &&\hspace{2cm} D^{f_{n+1}}(x,x_{n+1})\leq (1+\eta_{n})D^{f_{n}}(x,x_{n})+\varepsilon_{n}. \end{array} $$
Condition 1
[6, Condition 4.4] Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅. For all bounded sequences \((x_{n})_{n\in \mathbb {N}}\) and \((y_{n})_{n\in \mathbb {N}}\) in int dom f,
$$ D^{f}(x_{n},y_{n})\to 0\quad\Rightarrow\quad x_{n}-y_{n}\to 0. $$
Proposition 1 [20]
Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅, let α∈]0,+∞[, let \((f_{n})_{n\in \mathbb {N}}\) be in \(\mathcal {P}_{\alpha }(f)\) , let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\, f)^{\mathbb {N}}\) , let \(C\subset \mathcal {X}\) be such that C∩int dom f≠∅, and let x∈C∩int dom f. Suppose that \((x_{n})_{n\in \mathbb {N}}\) is quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) . Then the following hold.
1. \((D^{f_{n}}(x,x_{n}))_{n\in \mathbb {N}}\) converges.
2. Suppose that \(D^{f}(x,\cdot )\) is coercive. Then \((x_{n})_{n\in \mathbb {N}}\) is bounded.
Proposition 2 [20]
Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅, let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\, f)^{\mathbb {N}}\) , let \(C\subset \mathcal {X}\) be such that C∩int dom f≠∅, let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\) , let α∈]0,+∞[, and let \((f_{n})_{n\in \mathbb {N}}\) in \(\mathcal {P}_{\alpha }(f)\) be such that \((\forall n\in \mathbb {N})(1+\eta _{n})f_{n}\succcurlyeq f_{n+1}\) . Suppose that \((x_{n})_{n\in \mathbb {N}}\) is quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\) , that there exists \(g\in \mathcal {F}(f)\) such that for every \(n\in \mathbb {N}\) , \(g\succcurlyeq f_{n}\) , and that, for every \(y_{1}\in \mathcal {X}\) and every \(y_{2}\in \mathcal {X}\) ,
Moreover, suppose that \((\forall x\in \mathrm {int\,dom}\, f)\;D^{f}(x,\cdot )\) is coercive. Then \((x_{n})_{n\in \mathbb {N}}\) converges weakly to a point in C∩int dom f if and only if \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset C\cap \mathrm {int\,dom}\, f\).
Proposition 3 [20]
Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function, let α∈]0,+∞[, let \((f_{n})_{n\in \mathbb {N}}\) be in \(\mathcal {P}_{\alpha }(f)\), let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\, f)^{\mathbb {N}}\), and let C be a closed convex subset of \(\mathcal {X}\) such that C∩int dom f≠∅. Suppose that \((x_{n})_{n\in \mathbb {N}}\) is stationarily quasi-Bregman monotone with respect to C relative to \((f_{n})_{n\in \mathbb {N}}\), that f satisfies Condition 1, and that \((\forall x\in \mathrm {int\,dom}\, f)\;D^{f}(x,\cdot )\) is coercive. In addition, suppose that there exists β∈]0,+∞[ such that \((\forall n\in \mathbb {N})\;\beta \hat {f}\succcurlyeq f_{n}\). Then \((x_{n})_{n\in \mathbb {N}}\) converges strongly to a point in \(C\cap \overline {\text {dom}}\,f\) if and only if \(\underline {\lim }\, {D^{f}_{C}}(x_{n})=0\).
Our framework uses the Bregman distance-based proximity operators whose definition and properties are discussed in the following proposition.
Proposition 4
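Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be Gâteaux differentiable on int dom f≠∅, let \(\varphi \in {\Gamma }_{0}(\mathcal {X})\), and let
$$ \text{Prox}_{\varphi}^{f}\colon\mathcal{X}^{\ast}\to 2^{\mathcal{X}}\colon x^{\ast}\mapsto\big\{x\in\mathcal{X}\mid\varphi(x)+f(x)-\langle x,x^{\ast}\rangle=\min(\varphi+f-x^{\ast})(\mathcal{X})<+\infty\big\} $$(6)
be the f-proximity operator of φ. Then the following hold.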
(1) \(\text {ran}\,\text {Prox}_{\varphi }^{f}\subset \text {dom}\, f \cap \text {dom}\,\varphi \) and \(\text {Prox}_{\varphi }^{f}=(\partial (f+\varphi ))^{-1}\).
(2) Suppose that dom φ∩int dom f≠∅ and that dom ∂f∩dom ∂φ⊂int dom f. Then the following hold.
  (a) \(\text {ran}\,\text {Prox}_{\varphi }^{f}\subset \mathrm {int\,dom}\, f\) and \(\text {Prox}_{\varphi }^{f}=(\nabla f+\partial \varphi )^{-1}\).
  (b) \(\text {int}(\text {dom}\, f^{\ast } + \text {dom}\,\varphi ^{\ast })\subset \text {dom}\,\text {Prox}_{\varphi }^{f}\).
  (c) Suppose that \(f|_{\mathrm {int\,dom}\, f}\) is strictly convex. Then \(\text {Prox}_{\varphi }^{f}\) is single-valued on its domain.
Proof
Let us fix \(x^{\ast }\in \mathcal {X}^{\ast }\) and define \(f_{x^{\ast }}\colon \mathcal {X} \to ]-\infty ,+\infty ] \colon x \mapsto f(x)-\langle x,x^{\ast }\rangle + f^{\ast }(x^{\ast })\). Then \(\text {dom}\, f_{x^{\ast }} = \text {dom}\, f\) and \(\varphi + f_{x^{\ast }} \in {\Gamma }_{0}(\mathcal {X})\). Moreover, \(\partial (\varphi + f_{x^{\ast }}) = \partial (\varphi + f) - x^{\ast }\).
(1): By definition, \(\text {ran}\,\text {Prox}_{\varphi }^{f} \subset \text {dom}\, f \cap \text {dom}\,\varphi \). For the second assertion, it suffices to consider the case dom f∩dom φ≠∅, since otherwise both sides of the desired identity reduce to the trivial operator \(x^{\ast }\mapsto \emptyset \). Now let x∈dom f∩dom φ. Then
$$\begin{array}{@{}rcl@{}} x\in\text{Prox}_{\varphi}^{f} x^{\ast} &\Leftrightarrow& 0\in\partial (\varphi+f_{x^{\ast}})(x)\\ &\Leftrightarrow& 0\in\partial(\varphi+f)(x)-x^{\ast}\\ &\Leftrightarrow& x^{\ast}\in\partial(\varphi+f)(x)\\ &\Leftrightarrow& x\in \big(\partial(\varphi+f)\big)^{-1}(x^{\ast}). \end{array} $$(7)
(2): Suppose that \(x^{\ast }\in \text {int}(\text {dom}\, f^{\ast }+\text {dom}\,\varphi ^{\ast })\). Since dom φ∩int dom f≠∅, it follows from [1, Theorem 1.1] and [23, Theorem 2.1.3(ix)] that
$$ x^{\ast}\in\text{int}(\text{dom}\, f^{\ast}+\text{dom}\,\varphi^{\ast}) = \text{int}\text{dom} (f+\varphi)^{\ast}. $$(8)
(2a): Since dom φ∩int dom f≠∅, ∂(φ + f) = ∂φ + ∂f by [1, Corollary 2.1], and hence (1) yields
$$\text{ran}\text{Prox}_{\varphi}^{f}=\text{dom}\,\partial(f+\varphi)=\text{dom}(\partial f+\partial\varphi)=\text{dom}\,\partial f\cap\text{dom}\,\partial\varphi\subset\mathrm{int\,dom}\, f. $$
In turn, \(\text {ran}\,\text {Prox}_{\varphi }^{f}\subset \text {dom}\,\varphi \cap \mathrm {int\,dom}\, f\). We now prove that \(\text {Prox}_{\varphi }^{f}=(\nabla f+\partial \varphi )^{-1}\). Note that dom(∇f + ∂φ)⊂dom φ∩int dom f. Let x∈dom φ∩int dom f. Then ∂(f + φ)(x) = ∂f(x) + ∂φ(x) = ∇f(x) + ∂φ(x) and therefore,
$$x\in\text{Prox}_{\varphi}^{f}x^{\ast}\Leftrightarrow x^{\ast}\in \partial(f+\varphi)(x)=\nabla f(x)+\partial \varphi(x)\Leftrightarrow x\in(\nabla f+\partial\varphi)^{-1}(x^{\ast}). $$
(2b): We derive from (8) and [5, Fact 3.1] that \(\varphi +f_{x^{\ast }}\) is coercive. Hence, by [23, Theorem 2.5.1], \(\varphi +f_{x^{\ast }}\) admits at least one minimizer, i.e., \(x^{\ast }\in \text {dom}\,\text {Prox}_{\varphi }^{f}\).
(2c): Since \(f|_{\mathrm {int\,dom}\, f}\) is strictly convex, so is \((\varphi +f_{x^{\ast }})|_{\mathrm {int\,dom}\, f}\) and thus, in view of (2b), \(\varphi +f_{x^{\ast }}\) admits a unique minimizer on int dom f. However, since
$$\text{Argmin}(\varphi+f_{x^{\ast}})=\text{ran}\text{Prox}_{\varphi}^{f}\subset\mathrm{int\,dom}\, f, $$
it follows that \(\varphi +f_{x^{\ast }}\) admits a unique minimizer and that \(\text {Prox}_{\varphi }^{f}\) is therefore single-valued.
□
Proposition 5
Let m be a strictly positive integer, let \((\mathcal {X}_{i})_{1\leq i\leq m}\) be reflexive real Banach spaces, and let \(\mathcal {X}\) be the vector product space equipped with the norm \(x=(x_{i})_{1\leq i\leq m}\mapsto \sqrt {{\sum }_{i=1}^{m}\|x_{i}\|^{2}}\). For every i∈{1,…,m}, let \(f_{i}\in {\Gamma }_{0}(\mathcal {X}_{i})\) be a Legendre function and let \(\varphi _{i}\in {\Gamma }_{0}(\mathcal {X}_{i})\) be such that \(\text {dom}\,\varphi _{i}\cap \mathrm {int\,dom}\, f_{i}\neq \emptyset \). Set \(f\colon \mathcal {X} \to ]-\infty ,+\infty ] \colon x\mapsto {\sum }_{i=1}^{m}f_{i}(x_{i})\) and \(\varphi \colon \mathcal {X}\to ]-\infty ,+\infty ] \colon x\mapsto {\sum }_{i=1}^{m}\varphi _{i}(x_{i})\). Then
Proof
First, we observe that \(\mathcal {X}^{\ast }\) is the vector product space equipped with the norm \(x^{\ast }=(x_{i}^{\ast })_{1\leq i\leq m} \mapsto \sqrt {{\sum }_{i=1}^{m}\|x_{i}^{\ast }\|^{2}}\). Next, we derive from the definition of f that dom and that
Thus, ∂ f is single-valued on
Likewise, since
we deduce that ∂ f ∗ is single-valued on dom ∂ f ∗=int dom f ∗. Consequently, [5, Theorems 5.4 and 5.6] assert that
In addition,
Hence, Proposition 4(2b) and (2c) assert that \(\text {int}(\text {dom}\, f^{\ast }+\text {dom}\,\varphi ^{\ast }) \subset \text {dom}\,\text {Prox}_{\varphi }^{f}\) and \(\text {Prox}_{\varphi }^{f}\) is single-valued on its domain. Now set \(x=\text {Prox}^{f}_{\varphi }x^{\ast }\) and \(q=(\text {Prox}^{f_{i}}_{\varphi _{i}}x_{i}^{\ast })_{1\leq i\leq m}\). We derive from Proposition 4(2a) that
Consequently, by invoking (4), we get
Upon setting z = q in (11), we obtain
For every i∈{1,…,m}, let us set \(q_{i}=\text {Prox}_{\varphi _{i}}^{f_{i}}x_{i}^{\ast }\). The same characterization as in (11) yields
By summing these inequalities over i∈{1,…,m}, we obtain
Upon setting z = x in (13), we get
Now suppose that x≠q. Since f|int dom f is strictly convex, it follows from [23, Theorem 2.4.4(ii)] that ∇f is strictly monotone, i.e.,
and we reach a contradiction. □
In Hilbert spaces, the operator defined in (6) reduces to Moreau's usual proximity operator \(\text {prox}_{\varphi }\) [19] if \(f=\|\cdot \|^{2}/2\). We provide illustrations of such instances in the standard Euclidean space \(\mathbb {R}^{m}\).
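The reduction to Moreau's operator can be checked numerically on the real line. The sketch below assumes that the f-proximity operator at \(x^{\ast }\) is the minimizer of \(\varphi +f-\langle \cdot ,x^{\ast }\rangle \), as used in the proof of Proposition 4, and takes \(f=|\cdot |^{2}/2\) and \(\varphi =|\cdot |\), for which Moreau's operator is soft-thresholding at level 1. The helper names and the search interval are assumptions made for this example.

```python
def soft_threshold(v, t):
    """Moreau proximity operator of t*|.| evaluated at the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def argmin_convex(h, lo, hi, iters=200):
    """Ternary search for the minimizer of a strictly convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) <= h(m2):
            hi = m2  # the minimizer cannot lie to the right of m2
        else:
            lo = m1  # the minimizer cannot lie to the left of m1
    return 0.5 * (lo + hi)

def bregman_prox(phi, f, x_star, lo=-50.0, hi=50.0):
    """Scalar f-proximity operator: minimizer of phi(x) + f(x) - x * x_star on [lo, hi]."""
    return argmin_convex(lambda x: phi(x) + f(x) - x * x_star, lo, hi)
```

With `f = lambda x: 0.5 * x * x` and `phi = abs`, the value `bregman_prox(phi, f, x_star)` should match `soft_threshold(x_star, 1.0)` up to the accuracy of the search.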
Example 1
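Let γ∈]0,+∞[, let \(\phi \in {\Gamma }_{0}(\mathbb {R})\) be such that dom ϕ∩]0,+∞[≠∅, and let 𝜗 be the Boltzmann–Shannon entropy, i.e.,
$$\vartheta\colon\xi\mapsto \left\{\begin{array}{ll} \xi\ln\xi-\xi&\text{ if}\; \xi\in]0,+\infty[,\\ 0&\text{ if}\; \xi=0,\\ +\infty&\text{ otherwise}. \end{array}\right. $$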
Set \(\varphi \colon (\xi _{i})_{1\leq i\leq m}\mapsto {\sum }_{i=1}^{m}\phi (\xi _{i})\) and \(f\colon (\xi _{i})_{1\leq i\leq m} \mapsto {\sum }_{i=1}^{m}\vartheta (\xi _{i})\). Note that f is a supercoercive Legendre function [4, Sections 5 and 6], and hence, Proposition 4(2b) asserts that \(\text {dom}\,\text {Prox}_{\varphi }^{f}=\mathbb {R}^{m}\). Let \((\xi _{i})_{1\leq i\leq m}\in \mathbb {R}^{m}\), set \((\eta _{i})_{1\leq i\leq m} = \text {Prox}_{\gamma \varphi }^{f}(\xi _{i})_{1\leq i\leq m}\), let W be the Lambert function [15], i.e., the inverse of \(\xi \mapsto \xi e^{\xi }\) on \([0,+\infty [\), and let i∈{1,…,m}. Then \(\eta _{i}\) can be computed as follows.
1. Let \(\omega \in \mathbb {R}\) and suppose that
$$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi\ln\xi-\omega\xi&\text{ if}\; \xi\in]0,+\infty[,\\ 0&\text{ if} \;\xi=0,\\ +\infty&\text{ otherwise}. \end{array}\right. $$
Then \(\eta _{i}=e^{(\xi _{i}+\gamma (\omega -1))/(\gamma +1)}\).
2. Let p∈[1,+∞[ and suppose that either \(\phi =|\cdot |^{p}/p\) or
$$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi^{p}/p&\text{ if}\; \xi\in[0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$
Then
$$\eta_{i}= \left\{\begin{array}{lllllll} \left( \frac{W(\gamma(p-1)e^{(p-1)\xi_{i}})}{\gamma(p-1)}\right)^{\frac{1}{p-1}}&\text{ if}\; p\in ]1,+\infty[,\\ e^{\xi_{i}-\gamma}&\text{ if}\; p=1. \end{array}\right. $$
3. Let p∈[1,+∞[ and suppose that
$$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi^{-p}/p&\text{ if}\; \xi\in]0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$
Then
$$\eta_{i}=\left( \frac{W(\gamma(p+1)e^{-(p+1)\xi_{i}})}{\gamma(p+1)}\right)^{\frac{-1}{p+1}}. $$
4. Let p∈]0,1[ and suppose that
$$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} -\xi^{p}/p&\text{ if}\; \xi\in[0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$
Then
$$\eta_{i}=\left( \frac{W(\gamma(1-p)e^{(p-1)\xi_{i}})}{\gamma(1-p)}\right)^{\frac{1}{p-1}}. $$
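The closed form of item 2 can be evaluated with the standard library alone. The following sketch is an illustration under stated assumptions: it takes \(\vartheta ^{\prime }=\ln \) on ]0,+∞[ (as for the Boltzmann–Shannon entropy), so that by Proposition 4(2a) the point η should solve \(\ln \eta +\gamma \eta ^{p-1}=\xi \), and it computes the Lambert function by a Newton iteration. The function names and tolerances are hypothetical.

```python
import math

def lambert_w(x, tol=1e-14, max_iter=100):
    """Lambert function on [0, +inf[: the solution w >= 0 of w * exp(w) = x."""
    if x < 0:
        raise ValueError("this sketch only covers x >= 0")
    w = math.log1p(x)  # starting point; exact when x = 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))  # Newton step for w*exp(w) - x
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def entropy_prox_power(xi, gamma, p):
    """Closed form of item 2: phi = |.|^p / p with p >= 1, entropy kernel."""
    if p == 1:
        return math.exp(xi - gamma)
    c = gamma * (p - 1.0)
    return (lambert_w(c * math.exp((p - 1.0) * xi)) / c) ** (1.0 / (p - 1.0))

def optimality_residual(eta, xi, gamma, p):
    """Residual of the assumed optimality condition ln(eta) + gamma*eta^(p-1) = xi."""
    return math.log(eta) + gamma * eta ** (p - 1.0) - xi
```

A residual close to zero for a range of ξ, γ, and p gives a quick sanity check of the formula; for large \((p-1)\xi \) the exponential may overflow, which this sketch does not guard against.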
Example 2
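Let \(\phi \in {\Gamma }_{0}(\mathbb {R})\) be such that dom ϕ∩]0,1[≠∅ and let 𝜗 be the Fermi–Dirac entropy, i.e.,
$$\vartheta\colon\xi\mapsto \left\{\begin{array}{ll} \xi\ln\xi+(1-\xi)\ln(1-\xi)&\text{ if}\; \xi\in]0,1[,\\ 0&\text{ if}\; \xi\in\{0,1\},\\ +\infty&\text{ otherwise}. \end{array}\right. $$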
Set \(\varphi \colon (\xi _{i})_{1\leq i\leq m} \mapsto {\sum }_{i=1}^{m}\phi (\xi _{i})\) and \(f\colon (\xi _{i})_{1\leq i\leq m} \mapsto {\sum }_{i=1}^{m}\vartheta (\xi _{i})\). Note that f is a cofinite Legendre function [4, Sections 5 and 6], and hence Proposition 4(2b) asserts that \(\text {dom}\,\text {Prox}_{\varphi }^{f}=\mathbb {R}^{m}\). Let \((\xi _{i})_{1\leq i\leq m}\in \mathbb {R}^{m}\), set \((\eta _{i})_{1\leq i\leq m} = \text {Prox}_{\varphi }^{f}(\xi _{i})_{1\leq i\leq m}\), and let i∈{1,…,m}. Then η i can be computed as follows.
1. Let \(\omega \in \mathbb {R}\) and suppose that
$$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} \xi\ln\xi-\omega\xi&\text{ if}\; \xi\in]0,+\infty[,\\ 0&\text{ if}\; \xi=0,\\ +\infty&\text{ otherwise}. \end{array}\right. $$
Then \(\eta _{i}=-e^{\xi _{i}+\omega -1}/2+\sqrt {e^{2(\xi _{i}+\omega -1)}/4 + e^{\xi _{i}+\omega -1}}\).
2. Suppose that
$$\phi\colon\xi\mapsto \left\{\begin{array}{lllllll} (1-\xi)\ln(1-\xi)+\xi&\text{ if}\; \xi\in]-\infty,1[,\\ 1&\text{ if}\; \xi=1,\\ +\infty&\text{ otherwise}. \end{array}\right. $$
Then \(\eta _{i}=1+e^{-\xi _{i}}/2-\sqrt {e^{-\xi _{i}}+e^{-2\xi _{i}}/4}\).
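Both closed forms of Example 2 can be checked against the optimality condition of Proposition 4(2a). The sketch below assumes the Fermi–Dirac kernel \(\vartheta (\xi )=\xi \ln \xi +(1-\xi )\ln (1-\xi )\) on ]0,1[, so that \(\vartheta ^{\prime }(\eta )=\ln (\eta /(1-\eta ))\); the function names are hypothetical.

```python
import math

def fermi_dirac_prox_item1(xi, omega):
    """Item 1: phi(x) = x ln x - omega*x; eta solves eta^2 / (1 - eta) = exp(xi + omega - 1)."""
    c = math.exp(xi + omega - 1.0)
    return -c / 2.0 + math.sqrt(c * c / 4.0 + c)

def fermi_dirac_prox_item2(xi):
    """Item 2: phi(x) = (1 - x) ln(1 - x) + x; eta solves eta / (1 - eta)^2 = exp(xi)."""
    c = math.exp(-xi)
    return 1.0 + c / 2.0 - math.sqrt(c + c * c / 4.0)

def residual_item1(eta, xi, omega):
    """theta'(eta) + phi'(eta) - xi, with phi'(eta) = ln(eta) + 1 - omega."""
    return math.log(eta / (1.0 - eta)) + math.log(eta) + 1.0 - omega - xi

def residual_item2(eta, xi):
    """theta'(eta) + phi'(eta) - xi, with phi'(eta) = -ln(1 - eta)."""
    return math.log(eta / (1.0 - eta)) - math.log(1.0 - eta) - xi
```

For moderate values of ξ both residuals should vanish up to rounding and η should stay in ]0,1[; for very large arguments the subtraction in item 1 loses accuracy, which a production implementation would need to handle.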
Example 3
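Let \(f\colon (\xi _{i})_{1\leq i\leq m}\mapsto {\sum }_{i=1}^{m} \vartheta (\xi _{i})\), where 𝜗 is the Hellinger-like function, i.e.,
$$\vartheta\colon\xi\mapsto \left\{\begin{array}{ll} -\sqrt{1-\xi^{2}}&\text{ if}\; \xi\in[-1,1],\\ +\infty&\text{ otherwise}, \end{array}\right. $$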
let γ∈]0, + ∞[, and let φ = f. Since f is a cofinite Legendre function [4, Sections 5 and 6], Proposition 4(2b) asserts that \(\text {dom}\,\text {Prox}_{\gamma \varphi }^{f}=\mathbb {R}^{m}\). Let \((\xi _{i})_{1\leq i\leq m}\in \mathbb {R}^{m}\), and set \((\eta _{i})_{1\leq i\leq m}=\text {Prox}_{\gamma \varphi }^{f}(\xi _{i})_{1\leq i\leq m}\). Then \((\forall i\in \{1,\ldots ,m\})\eta _{i}=\xi _{i}/\sqrt {(\gamma +1)^{2}+{\xi _{i}^{2}}}\).
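The formula of Example 3 admits a one-line check. Assuming the Hellinger-like kernel \(\vartheta (\xi )=-\sqrt {1-\xi ^{2}}\) on [−1,1], so that \(\vartheta ^{\prime }(\eta )=\eta /\sqrt {1-\eta ^{2}}\), the condition from Proposition 4(2a) with φ = f reads \((1+\gamma )\vartheta ^{\prime }(\eta )=\xi \). The sketch below, with hypothetical function names, evaluates the closed form and this residual.

```python
import math

def hellinger_prox(xi, gamma):
    """Closed form of Example 3: eta = xi / sqrt((gamma + 1)^2 + xi^2)."""
    return xi / math.sqrt((gamma + 1.0) ** 2 + xi * xi)

def hellinger_residual(eta, xi, gamma):
    """(1 + gamma) * theta'(eta) - xi, with theta'(eta) = eta / sqrt(1 - eta^2)."""
    return (1.0 + gamma) * eta / math.sqrt(1.0 - eta * eta) - xi
```

Since |η| is always strictly less than 1, the output lies in the interior of the domain of f, in line with Proposition 4(2a).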
Example 4
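Let γ∈]0,+∞[, let \(\phi \in {\Gamma }_{0}(\mathbb {R})\) be such that dom ϕ∩]0,+∞[≠∅, and let 𝜗 be the Burg entropy, i.e.,
$$\vartheta\colon\xi\mapsto \left\{\begin{array}{ll} -\ln\xi&\text{ if}\; \xi\in]0,+\infty[,\\ +\infty&\text{ otherwise}. \end{array}\right. $$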
Set \(\varphi \colon (\xi _{i})_{1\leq i\leq m}\mapsto {\sum }_{i=1}^{m}\phi (\xi _{i})\) and \(f\colon (\xi _{i})_{1\leq i\leq m} \mapsto {\sum }_{i=1}^{m}\vartheta (\xi _{i})\), let \((\xi _{i})_{1\leq i\leq m}\in \mathbb {R}^{m}\), and set \((\eta _{i})_{1\leq i\leq m} = \text {Prox}_{\gamma \varphi }^{f}(\xi _{i})_{1\leq i\leq m}\). Let i∈{1,…,m}. Then η i can be computed as follows.
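1. Suppose that ϕ = 𝜗 and \(\xi _{i}\in ]-\infty ,0[\). Then \(\eta _{i}=-(1+\gamma )\xi _{i}^{-1}\).
2. Suppose that \(\phi \colon \xi \mapsto \alpha |\xi |\) with α∈]0,+∞[ and \(\xi _{i}\in ]-\infty ,\gamma \alpha [\). Then \(\eta _{i}=(\gamma \alpha -\xi _{i})^{-1}\).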
The following result will be used subsequently.
Lemma 1
Let \(\mathcal {X}\) be a reflexive real Banach space, let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function, let x∈int dom f, and let \((x_{n})_{n\in \mathbb {N}}\in (\mathrm {int\,dom}\, f)^{\mathbb {N}}\). Suppose that \((D^{f}(x,x_{n}))_{n\in \mathbb {N}}\) is bounded, that \(\text {dom}\, f^{\ast }\) is open, and that \(\nabla f^{\ast }\) is weakly sequentially continuous. Then \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathrm {int\,dom}\, f\).
Proof
See [20, Proof of Theorem 4.1]. □
3 Forward-Backward Splitting in Banach Spaces
The main result in this section is a version of the forward-backward splitting algorithm in reflexive real Banach spaces which employs different Bregman distance-based proximity operators over the iterations.
Theorem 1
Consider the setting of Problem 1 and let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function such that S∩int dom f≠∅, int dom f⊂int dom ψ, and \(f\succcurlyeq \beta \psi \) for some β∈]0,+∞[. Let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\) , let α∈]0,+∞[, and let \((f_{n})_{n\in \mathbb {N}}\) be Legendre functions in \(\mathcal {P}_{\alpha }(f)\) such that
Suppose that either −ran ∇ψ⊂dom φ ∗ or \((\forall n\in \mathbb {N})f_{n}\) is cofinite. Let ε∈]0,αβ/(αβ+1)[ and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that
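Furthermore, let \(x_{0}\in \mathrm {int\,dom}\, f\) and iterate
$$ (\forall n\in\mathbb{N})\quad x_{n+1}=\text{Prox}_{\gamma_{n}\varphi}^{f_{n}}\big(\nabla f_{n}(x_{n})-\gamma_{n}\nabla\psi(x_{n})\big). $$(17)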
Suppose in addition that \((\forall x\in \mathrm {int\,dom}\, f)\;D^{f}(x,\cdot )\) is coercive. Then \((x_{n})_{n\in \mathbb {N}}\) is a bounded sequence in int dom f and \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathcal {S}\). Moreover, there exists \(\overline {x}\in \mathcal {S}\) such that the following hold.
(1) Suppose that \(\mathcal {S}\cap \overline {\text {dom}}\,f\) is a singleton. Then \(x_{n}\rightharpoonup \overline {x}\).
(2) Suppose that there exists \(g\in \mathcal {F}(f)\) such that for every \(n\in \mathbb {N}\), \(g\succcurlyeq f_{n}\), and that, for every \(y_{1}\in \mathcal {X}\) and every \(y_{2}\in \mathcal {X}\),
$$ \left\{\begin{array}{lllllll} y_{1}\in\mathfrak{W}(x_{n})_{n\in\mathbb{N}},\\ y_{2}\in\mathfrak{W}(x_{n})_{n\in\mathbb{N}},\\ \big(\langle y_{1}-y_{2},\nabla f_{n}(x_{n}) - \gamma_{n}\nabla\psi(x_{n})\rangle\big)_{n\in\mathbb{N}} \quad\text{converges} \end{array}\right. \Rightarrow\quad y_{1}=y_{2}. $$(18)
In addition, suppose that one of the following holds.
  (a) \(\mathcal {S}\subset \mathrm {int\,dom}\, f\).
  (b) \(\text {dom}\, f^{\ast }\) is open and \(\nabla f^{\ast }\) is weakly sequentially continuous.
Then \(x_{n}\rightharpoonup \overline {x}\).
(3) Suppose that f satisfies Condition 1 and that one of the following holds.
  (a) Either φ or ψ is uniformly convex at \(\overline {x}\).
  (b) \(\underline {\lim }\, D^{f}_{\mathcal {S}}(x_{n})=0\) and there exists μ∈]0,+∞[ such that \((\forall n\in \mathbb {N})\;\mu \hat {f}\succcurlyeq f_{n}\).
Then \(x_{n}\to \overline {x}\).
Proof
We first derive from Proposition 4(2c) that the operators \((\text {Prox}_{\gamma _{n}\varphi }^{f_{n}})_{n\in \mathbb {N}}\) are single-valued on their domains. We also note that \(x_{0}\in \mathrm {int\,dom}\, f\). Suppose that \(x_{n}\in \mathrm {int\,dom}\, f\) for some \(n\in \mathbb {N}\). If \(f_{n}\) is cofinite then Proposition 4(2b) yields
Otherwise,
Since \(\text {int}(\text {dom}\, f_{n}^{\ast } + \text {dom}\,(\gamma _{n}\varphi ^{\ast })) \subset \text {dom}\,\text {Prox}_{\gamma _{n}\varphi }^{f_{n}}\) by Proposition 4(2b), we deduce from (17), (19), (20), and Proposition 4(2a) that \(x_{n+1}\) is a well-defined element in \(\text {ran}\,\text {Prox}_{\gamma _{n}\varphi }^{f_{n}} = \text {dom}\,\partial \varphi \cap \mathrm {int\,dom}\, f_{n}=\text {dom}\,\partial \varphi \cap \mathrm {int\,dom}\, f\subset \mathrm {int\,dom}\, f\). Reasoning by induction, we conclude that
Next, let us set Φ = φ + ψ and
Since int dom f⊂int dom ψ, it follows from (21) that \((\forall n\in \mathbb {N})\;g_{n}\) is Gâteaux differentiable on \(\text {dom}\, g_{n}=\mathrm {int\,dom}\, g_{n}=\mathrm {int\,dom}\, f\). Since ψ is continuous on int dom ψ⊃int dom f and the functions \((f_{n})_{n\in \mathbb {N}}\) are continuous on int dom f [21, Proposition 3.3], we deduce that \((\forall n\in \mathbb {N})\;g_{n}\) is continuous on \(\text {dom}\, g_{n}\). In addition,
Note that \(f\succcurlyeq \beta \psi \) and \((\forall n\in \mathbb {N})f_{n}\succcurlyeq \alpha f\). Hence, (22) yields
and hence, we deduce from (16) and (22) that \((\forall n\in \mathbb {N})g_{n}\succcurlyeq \varepsilon \alpha f\). In turn,
and it therefore follows from [23, Theorem 2.1.11] that \((\forall n\in \mathbb {N})g_{n}\) is convex. Consequently,
Set ω=1+1/ε. Then
We thus derive from (15), (16) and (23) that
By invoking (17) and Proposition 4(2a), we get
and therefore,
Since [23, Theorem 2.4.2(vii)–(viii)] yield
we deduce from (26) that
By appealing to (4) and (27), we get
and hence, by [6, Proposition 2.3(ii)],
In particular,
By using (25), we deduce from (30) that
and therefore,
This shows that \((x_{n})_{n\in \mathbb {N}}\) is stationarily quasi-Bregman monotone with respect to S relative to \((g_{n})_{n\in \mathbb {N}}\). Hence, we deduce from Proposition 1(2) that
and, since \(\mathcal {X}\) is reflexive,
In addition, we derive from (32) and Proposition 1(1) that
and thus, since (31) yields
and since \(\eta _{n}\to 0\), we obtain
On the other hand, it follows from (24) that
and hence, (36) yields
Now, it follows from (29) that
which shows that \(({\Phi }(x_{n}))_{n\in \mathbb {N}}\) is decreasing and hence, since it is bounded from below by \(\inf {\Phi }(\mathcal {X})\), it is convergent. However, (29) and (32) yield
Since \(\eta _{n}\to 0\), by taking the limit in (38) and then using (35) and (36), we get
and thus,
We now show that
To this end, suppose that \(x\in \mathfrak {W}(x_{n})_{n\in \mathbb {N}}\), i.e., \(x_{k_{n}}\rightharpoonup x\). Since Φ is weakly lower semicontinuous [23, Theorem 2.2.1], by (39),
This yields \({\Phi }(x)={\inf }\,{\Phi }(\mathcal {X})\), i.e., x∈Argmin Φ = S.
(1) Let \(\overline {x}\in \mathfrak {W}(x_{n})_{n\in \mathbb {N}}\). Since (33) and (40) imply that \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}} \subset \mathcal {S}\cap \overline {\text {dom}}\,f\), we obtain \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}=\{\overline {x}\}\), and in turn, (34) yields \(x_{n}\rightharpoonup \overline {x}\).
(2) In view of (40) and Proposition 2, it suffices to show that \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathrm {int\,dom}\, f\).
(2a) We have \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathcal {S}\subset \mathrm {int\,dom}\, f\).
(2b) This follows from Lemma 1.
(3) Let \(\overline {x}\in \mathcal {S}\cap \mathrm {int\,dom}\, f\). Since f satisfies Condition 1, (37) yields
$$ x_{n+1}-x_{n}\to 0. $$(41)
Now set
$$(\forall n\in\mathbb{N})\quad y_{n}=x_{n+1} \quad\text{ and} \quad y_{n}^{\ast} = \gamma_{n}^{-1}\big(\nabla g_{n}(x_{n})-\nabla g_{n}(y_{n})\big). $$
Then
$$ (\forall n\in\mathbb{N})\quad y_{n}^{\ast}\in\partial{\Phi}(y_{n}) \quad \text{ and} \quad y_{n}-x_{n}\to 0. $$(42)
Since (31) yields
$$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})\quad D^{g_{n+1}}(\overline{x},x_{n+1}) &=&D^{g_{n+1}}(\overline{x},y_{n})\\ &\leq& (1+\omega\eta_{n})D^{g_{n}}(\overline{x},y_{n})\\ &=&(1+\omega\eta_{n})D^{g_{n}}(\overline{x},x_{n+1})\\ &\leq&(1+\omega\eta_{n})D^{g_{n}}(\overline{x},x_{n}), \end{array} $$
we deduce that
$$ (\forall n\in\mathbb{N})\quad (1+\omega\eta_{n})^{-1} D^{g_{n+1}}(\overline{x},x_{n+1})\leq D^{g_{n}}(\overline{x},y_{n}) \leq D^{g_{n}}(\overline{x},x_{n}). $$(43)
Altogether, (35) and (43) yield
$$ D^{g_{n}}(\overline{x},y_{n})-D^{g_{n}}(\overline{x},x_{n})\to 0. $$(44)
Setting \(x=\overline {x}\) in (28), we get
$$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})\quad 0&\leq&\gamma_{n}\langle y_{n}-\overline{x},y_{n}^{\ast}\rangle \\ &=&\langle y_{n}-\overline{x}, \nabla g_{n}(x_{n})-\nabla g_{n}(y_{n})\rangle \\ &=&D^{g_{n}}(\overline{x},x_{n}) - D^{g_{n}}(\overline{x},y_{n})-D^{g_{n}}(y_{n},x_{n})\\ &\leq& D^{g_{n}}(\overline{x},x_{n})-D^{g_{n}}(\overline{x},y_{n}). \end{array} $$(45)
By passing to the limit in (45) and using (44), we get
$$ \langle y_{n}-\overline{x},y_{n}^{\ast}\rangle\to 0. $$(46)
(3a) In this case \(\mathcal {S}=\{\overline {x}\}\). Suppose first that φ is uniformly convex at \(\overline {x}\). Then so is Φ and hence, there exists an increasing function \(\phi \colon [0,+\infty [\to [0,+\infty ]\) that vanishes only at 0 such that
$$\begin{array}{@{}rcl@{}} (\forall n\in\mathbb{N})(\forall\tau\in]0,1[)\quad &&{\Phi}(\tau\overline{x}+(1-\tau)y_{n})+\tau(1-\tau) \phi(\|y_{n}-\overline{x}\|)\\ && \leq\tau{\Phi}(\overline{x})+(1-\tau){\Phi}(y_{n}). \end{array} $$
It therefore follows from [23, Page 201] that ∂Φ is uniformly monotone at \(\overline {x}\) with modulus ϕ, i.e.,
$$ (\forall n\in\mathbb{N})\quad \langle y_{n}-\overline{x},y_{n}^{\ast}\rangle \geq \phi(\|y_{n}-\overline{x}\|)\geq 0. $$(47)
Altogether, (46) and (47) yield \(\phi (\|y_{n}-\overline {x}\|)\to 0\), and thus, \(y_{n}\to \overline {x}\). In turn, (42) yields \(x_{n}\to \overline {x}\). The case when ψ is uniformly convex at \(\overline {x}\) is similar.
(3b) First, we observe that \(\mathcal {S}\) is closed and convex since \({\Phi }\in {\Gamma }_{0}(\mathcal {X})\). Next, for every \(n\in \mathbb {N}\), since \(\mu \hat {f}\succcurlyeq f_{n}\), we derive from (21) that \(\mu \hat {f}\succcurlyeq g_{n}\). Finally, the strong convergence follows from Proposition 3.
□
In Theorem 1, when \((\forall n\in \mathbb {N})f_{n}=f\), condition (18) is satisfied when both ∇f and ∇ψ are weakly sequentially continuous. More precisely, we have the following result.
Theorem 2
Consider the setting of Problem 1 and let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function such that \(\mathcal {S}\cap \mathrm {int\,dom}\, f\neq \emptyset \), int dom f⊂int dom ψ, and \(f\succcurlyeq \beta \psi \) for some β∈]0,+∞[. Suppose that either f is cofinite or \(-\text {ran}\,\nabla \psi \subset \text {dom}\,\varphi ^{\ast }\), and that \((\forall x\in \mathrm {int\,dom}\, f)\;D^{f}(x,\cdot )\) is coercive. Let ε∈]0,β/(β+1)[, let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that
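Furthermore, let \(x_{0}\in \mathrm {int\,dom}\, f\) and iterate
$$ (\forall n\in\mathbb{N})\quad x_{n+1}=\text{Prox}_{\gamma_{n}\varphi}^{f}\big(\nabla f(x_{n})-\gamma_{n}\nabla\psi(x_{n})\big). $$(49)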
Then there exists \(\overline {x}\in \mathcal {S}\) such that the following hold.
(1) Suppose that one of the following holds.
  (a) \(\mathcal {S}\cap \overline {\text {dom}}\,f\) is a singleton.
  (b) ∇f and ∇ψ are weakly sequentially continuous and \(\mathcal {S}\subset \mathrm {int\,dom}\, f\).
  (c) \(\text {dom}\, f^{\ast }\) is open and ∇f, \(\nabla f^{\ast }\), and ∇ψ are weakly sequentially continuous.
Then \(x_{n}\rightharpoonup \overline {x}\).
(2) Suppose that f satisfies Condition 1 and that one of the following holds.
  (a) Either φ or ψ is uniformly convex at \(\overline {x}\).
  (b) \(\underline {\lim }\, D^{f}_{\mathcal {S}}(x_{n})=0\).
Then \(x_{n}\to \overline {x}\).
Proof
Set \((\forall n\in \mathbb {N})f_{n}=f\). Then
(1a): This is a corollary of Theorem 1(1).
(1b)–(1c): First, the proof of Theorem 1(2a) and (2b) shows that \(\mathfrak {W}(x_{n})_{n\in \mathbb {N}}\subset \mathrm {int\,dom}\, f\). Next, in view of Theorem 1(2), it suffices to show that (18) holds. To this end, suppose that \(y_{1}\) and \(y_{2}\) are two weak sequential cluster points of \((x_{n})_{n\in \mathbb {N}}\) such that
Then, there exist two strictly increasing sequences \((k_{n})_{n\in \mathbb {N}}\) and \((l_{n})_{n\in \mathbb {N}}\) in \(\mathbb {N}\) such that \(x_{k_{n}}\rightharpoonup y_{1}\) and \(x_{l_{n}}\rightharpoonup y_{2}\). We derive from (48) and [22, Lemma 2.2.2] that there exists \(\theta \in [\varepsilon ,\beta (1-\varepsilon )]\) such that \(\gamma _{n}\to \theta \). Since ∇f and ∇ψ are weakly sequentially continuous, after taking the limit in (51) along the subsequences \((x_{k_{n}})_{n\in \mathbb {N}}\) and \((x_{l_{n}})_{n\in \mathbb {N}}\), respectively, we get
Let us define
Then h is Gâteaux differentiable on int dom h=int dom f and (52) yields
On the other hand,
In turn, since \(f\succcurlyeq \beta \psi \) and 𝜃≤β(1−ε), we obtain \(h\succcurlyeq \varepsilon f\), and hence,
Therefore, (53) yields
Suppose that \(y_{1}\neq y_{2}\). Since \(f|_{\text {int\,dom}\, f}\) is strictly convex, ∇f is strictly monotone [23, Theorem 2.4.4(ii)], i.e.,
\[\langle y_{1}-y_{2},\nabla f(y_{1})-\nabla f(y_{2})\rangle >0,\]
and we reach a contradiction.
(2): The conclusions follow from (50) and Theorem 1(3).
□
Remark 1
Taking \((\forall n\in \mathbb {N})\) \(\eta _{n}=0\) in condition (48) yields a forward-backward splitting algorithm with monotonic step sizes, a particular case of which is the forward-backward splitting algorithm with constant step size.
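To illustrate the constant step-size case, the following Python sketch specializes the iteration to a Euclidean space with \(f=\|\cdot \|^{2}/2\). It assumes that (49) has the usual Bregman forward-backward form \(x_{n+1}=(\nabla f+\gamma _{n}\partial \varphi )^{-1}(\nabla f(x_{n})-\gamma _{n}\nabla \psi (x_{n}))\), in which case ∇f = Id and the backward step is the Moreau proximity operator. The test problem (φ = λ∥⋅∥₁, ψ = ∥⋅−b∥²/2), the step size, and all function names are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximity operator of tau * ||.||_1 (componentwise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def forward_backward_constant_step(grad_psi, prox_phi, x0, gamma, n_iter=200):
    """Iterate x_{n+1} = prox_{gamma*phi}(x_n - gamma * grad_psi(x_n)).

    Euclidean special case f = ||.||^2 / 2 with a constant step size gamma
    (assumed form of the iteration; see the accompanying text).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = prox_phi(x - gamma * grad_psi(x), gamma)
    return x

# Hypothetical test problem: minimize lam*||x||_1 + 0.5*||x - b||^2,
# whose unique solution is soft_threshold(b, lam).
b = np.array([3.0, -0.2, 0.7, -2.5])
lam = 0.5
x_fb = forward_backward_constant_step(
    grad_psi=lambda x: x - b,                          # gradient of 0.5*||x - b||^2
    prox_phi=lambda v, g: soft_threshold(v, g * lam),  # prox of g*lam*||.||_1
    x0=np.zeros_like(b),
    gamma=0.5,                                         # below 1/Lipschitz constant (= 1)
)
```

On this toy problem the iterates can be compared with the closed-form minimizer `soft_threshold(b, lam)`, which gives a quick sanity check of the step-size choice.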
Remark 2
Let us rewrite algorithm (49) as follows
Another method to solve Problem 1 was proposed in [10]. In that method, instead of solving (54), the authors solve
for some 1<p≤2. The weak convergence is established under the assumptions that Problem 1 admits a unique solution, ∇ψ is (p−1)-Hölder continuous with constant β, and \(0<\inf _{n\in \mathbb {N}}\gamma _{n}\leq \sup _{n\in \mathbb {N}}\gamma _{n} \leq (1-\delta )/\beta \), where 0<δ<1. The high nonlinearity of the regularization in (55) compared to (54) makes the numerical implementation of this method difficult in general. Furthermore, since (55) yields
and since \((\forall n\in \mathbb {N})\partial \big (\|x_{n+1}-x_{n}\|^{p}\big )\) is not separable, this method is not a splitting method.
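The following worked computation sketches why a Bregman-regularized subproblem leads to a splitting step. It assumes that the variational form of the iteration is the usual one for Bregman forward-backward methods and that the subdifferential sum rule applies (for instance because dom φ∩int dom f≠∅); the displayed formulas are a reconstruction under these assumptions and are not a quotation of (54).

```latex
\begin{align*}
% Assumed variational form of one step (Bregman regularization):
x_{n+1} &= \operatorname*{argmin}_{x\in\mathcal{X}}\;
  \varphi(x)+\langle x-x_{n},\nabla\psi(x_{n})\rangle
  +\gamma_{n}^{-1}D^{f}(x,x_{n}),\\
% Fermat's rule, using \nabla D^{f}(\cdot,x_{n})=\nabla f-\nabla f(x_{n}):
0 &\in \gamma_{n}\partial\varphi(x_{n+1})+\gamma_{n}\nabla\psi(x_{n})
  +\nabla f(x_{n+1})-\nabla f(x_{n}),\\
% Equivalent explicit form: forward step on \psi, backward step on \varphi.
x_{n+1} &= (\nabla f+\gamma_{n}\partial\varphi)^{-1}
  \big(\nabla f(x_{n})-\gamma_{n}\nabla\psi(x_{n})\big).
\end{align*}
```

In this form ψ enters only through ∇ψ(x_n) and φ only through the operator (∇f+γ_n∂φ)^{-1}, so the two functions are activated separately.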
Remark 3
We can reformulate Problem 1 as the following joint minimization problem
where \(V=\{(x,y)\in \mathcal {X}\times \mathcal {X}\mid y=x\}\). This constrained problem is equivalent to the following unconstrained problem
In [8], a different coupling term between the variables x and y was considered and the problem considered there was
in Euclidean spaces. Their method activates φ and ψ alternately via their so-called left and right Bregman proximity operators (see also [7] for the projection setting). This method does not require ψ to be smooth, but it requires the computation of the Bregman distance-based proximity operator of ψ.
Next, we provide a particular instance of Theorem 2 in finite-dimensional spaces.
Corollary 1
In the setting of Problem 1, suppose that \(\mathcal {X}\) is finite-dimensional. Let \(f\in {\Gamma }_{0}(\mathcal {X})\) be a Legendre function such that \(\mathcal {S}\cap \text {int\,dom}\, f\neq \emptyset \), \(\text {int\,dom}\, f\subset \text {int\,dom}\,\psi \), \(f\succcurlyeq \beta \psi \) for some β∈]0,+∞[, and \(\text {dom}\, f^{\ast }\) is open. Suppose that either f is cofinite or \(-\text {ran}\,\nabla \psi \subset \text {dom}\,\varphi ^{\ast }\). Let ε∈]0,β/(β+1)[, let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that
Furthermore, let \(x_{0}\in \text {int\,dom}\, f\) and iterate
Then there exists \(\overline {x}\in \mathcal {S}\) such that \(x_{n}\to \overline {x}\).
Proof
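Since \(\text {dom}\, f^{\ast }\) is open, [5, Lemma 7.3(ix)] asserts that \((\forall x\in \text {int\,dom}\, f)\) \(D^{f}(x,\cdot )\) is coercive. Moreover, in finite-dimensional spaces weak and strong convergence coincide, so that the continuous gradients ∇f, ∇f∗, and ∇ψ are weakly sequentially continuous. Hence, the claim follows from Theorem 2(1c). □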
4 Application to Multivariate Minimization
In this section, we apply Theorem 2 to solve the following multivariate minimization problem.
Problem 2
Let m and p be strictly positive integers, let \((\mathcal {X}_{i})_{1\leq i\leq m}\) and \((\mathcal {Y}_{k})_{1\leq k\leq p}\) be reflexive real Banach spaces. For every i∈{1,…,m} and every k∈{1,…,p}, let \(\varphi _{i}\in {\Gamma }_{0}(\mathcal {X}_{i})\), let \(\psi _{k}\in {\Gamma }_{0}(\mathcal {Y}_{k})\) be Gâteaux differentiable on \(\text {int\,dom}\,\psi _{k}\neq \emptyset \), and let \(L_{ik}\colon \mathcal {X}_{i}\to \mathcal {Y}_{k}\) be linear and bounded. The problem is to
Denote by S the set of solutions to (56).
We derive from Theorem 2 the following result.
Proposition 6
Consider the setting of Problem 2. For every k∈{1,…,p}, suppose that there exists \(\sigma _{k}\in \,]0,+\infty [\) such that, for all families \((y_{ik})_{1\leq i\leq m}\) and \((v_{ik})_{1\leq i\leq m}\) of points in \(\text {int\,dom}\,\psi _{k}\) satisfying \({\sum }_{i=1}^{m}y_{ik}\in \mathrm {int\,dom}\,\psi _{k}\) and \({\sum }_{i=1}^{m}v_{ik}\in \mathrm {int\,dom}\,\psi _{k}\), one has
For every i∈{1,…,m}, let \(f_{i}\in {\Gamma }_{0}(\mathcal {X}_{i})\) be a Legendre function such that \((\forall x_{i}\in \mathrm {int\,dom}\, f_{i})\) \(D^{f_{i}}(x_{i},\cdot )\) is coercive. For every k∈{1,…,p}, suppose that \({\sum }_{i=1}^{m}L_{ik}(\mathrm {int\,dom}\, f_{i}) \subset \mathrm {int\,dom}\,\psi _{k}\), that, for every i∈{1,…,m}, there exists \(\beta _{ik}\in \,]0,+\infty [\) such that \(f_{i}\succcurlyeq \beta _{ik}\psi _{k}\circ L_{ik}\), and set \(\beta _{k}=\min _{1\leq i\leq m}\beta _{ik}\). In addition, suppose that \(\text {int\,dom}\, f_{i}\neq \emptyset \) and that either \((\forall i\in \{1,\ldots ,m\})\) \(f_{i}\) is cofinite or \((\forall i\in \{1,\ldots ,m\})\) \(\varphi _{i}\) is cofinite. Let \(\varepsilon \in \big ]0,1/\big (1+{\sum }_{k=1}^{p}\sigma _{k}\beta _{k}^{-1}\big )\big [\), let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that
Furthermore, for every i∈{1,…,m}, let \(x_{i,0}\in \text {int\,dom}\, f_{i}\) and iterate
Then there exists \((\overline {x}_{i})_{1\leq i\leq m}\in \mathcal {S}\) such that the following hold.
(1) Suppose that \(\mathcal {S}\cap \big (\overline {\text {dom}}f_{1}\times \cdots \times \overline {\text {dom}}f_{m}\big )\) is a singleton. Then \((\forall i\in \{1,\ldots ,m\})\) \(x_{i,n}\rightharpoonup \overline {x}_{i}\).
(2) For every i∈{1,…,m} and every k∈{1,…,p}, suppose that \(\nabla f_{i}\) and \(\nabla \psi _{k}\) are weakly sequentially continuous, and that one of the following holds.
(a) \(\text {dom}\,\varphi _{i}\subset \text {int\,dom}\, f_{i}\).
(b) \(\text {dom}\, f_{i}^{\ast }\) is open and \(\nabla f_{i}^{\ast }\) is weakly sequentially continuous.
Then \((\forall i\in \{1,\ldots ,m\})\) \(x_{i,n}\rightharpoonup \overline {x}_{i}\).
Proof
Denote by \(\mathcal {X}\) and \(\mathcal {Y}\) the standard vector product spaces \(\mathcal {X}_{1}\times \cdots \times \mathcal {X}_{m}\) and \(\mathcal {Y}_{1}\times \cdots \times \mathcal {Y}_{p}\), equipped with the norms \(x=(x_{i})_{1\leq i\leq m}\mapsto \sqrt {{\sum }_{i=1}^{m}\|x_{i}\|^{2}}\) and \(y=(y_{k})_{1\leq k\leq p}\mapsto \sqrt {{\sum }_{k=1}^{p}\|y_{k}\|^{2}}\), respectively. Then \(\mathcal {X}^{\ast }\) is the vector product space \(\mathcal {X}_{1}^{\ast }\times \cdots \times \mathcal {X}_{m}^{\ast }\) equipped with the norm \(x^{\ast }\mapsto \sqrt {{\sum }_{i=1}^{m}\|x_{i}^{\ast }\|^{2}}\) and \(\mathcal {Y}^{\ast }\) is the vector product space \(\mathcal {Y}_{1}^{\ast }\times \cdots \times \mathcal {Y}_{p}^{\ast }\) equipped with the norm \(y^{\ast }\mapsto \sqrt {{\sum }_{k=1}^{p}\|y_{k}^{\ast }\|^{2}}\). Let us introduce the functions and operator
Then ψ is Gâteaux differentiable on and Problem 2 is a special case of Problem 1. Since (59) yields and , we deduce from our assumptions that either f is cofinite or φ is cofinite. As in (9) and (10), f is a Legendre function and dom φ∩int dom f≠∅. In addition,
Now set \(\psi _{L}=\psi \circ L\) and let \(x\in \text {int\,dom}\, f\). Then ψ is Gâteaux differentiable at Lx and hence \(\psi _{L}\) is Gâteaux differentiable at x. This implies that \(x\in \text {int\,dom}\,\psi _{L}\) and thus \(\text {int\,dom}\, f\subset \text {int\,dom}\,\psi _{L}\). To show that \(D^{f}(x,\cdot )\) is coercive, we fix \(\rho \in \mathbb {R}\). On the one hand,
On the other hand, for every i∈{1,…,m}, since \(D^{f_{i}}(x_{i},\cdot )\) is coercive, we deduce that
Hence, (60) implies that \(\{z\in \mathcal {X}\mid D^{f}(x,z)\leq \rho \}\) is bounded and \(D^{f}(x,\cdot )\) is therefore coercive. Next, set \(\beta =1/{\sum }_{k=1}^{p}\sigma _{k}\beta _{k}^{-1}\). We shall show that \(f\succcurlyeq \beta \psi _{L}\). To this end, fix \(z=(z_{i})_{1\leq i\leq m}\in \text {int\,dom}\, f\). We have
Now let us set \((\forall n\in \mathbb {N})x_{n}=(x_{i,n})_{1\leq i\leq m}\). By virtue of Proposition 5, (58) is a particular case of (49).
(1) Since \(\mathcal {S}\cap \overline {\text {dom}}f\) is a singleton, the claim follows from Theorem 2(1a).
(2) Our assumptions on \((f_{i})_{1\leq i\leq m}\) and \((\psi _{k})_{1\leq k\leq p}\) imply that ∇f and ∇ψ are weakly sequentially continuous.
(2a) Since \(\mathcal {S}\subset \text {dom}\,\varphi \subset \text {int\,dom}\, f\), the claim follows from Theorem 2(1b).
(2b) Since, for every i∈{1,…,m}, \(\text {dom}\, f_{i}^{\ast }\) is open and \(\nabla f_{i}^{\ast }\) is weakly sequentially continuous, we deduce that \(\text {dom}\, f^{\ast }\) is open and \(\nabla f^{\ast }\) is weakly sequentially continuous. The assertion therefore follows from Theorem 2(1c).
□
Example 5
In Problem 2, suppose that m=1, that \(\mathcal {X}_{1}\) and \((\mathcal {Y}_{k})_{1\leq k\leq p}\) are Hilbert spaces, and that, for every k∈{1,…,p}, \(\psi _{k}=\omega _{k}\|\cdot -r_{k}\|^{2}/2\), where \((\omega _{k})_{1\leq k\leq p}\in \,]0,+\infty [^{p}\) and \(r_{k}\in \mathcal {Y}_{k}\). Then the error-free version of the weak convergence result in [13, Proposition 6.3] is a particular instance of Proposition 6 with \(f_{1}=\|\cdot \|^{2}/2\).
Example 6
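Let m and p be strictly positive integers. For every i∈{1,…,m} and every k∈{1,…,p}, let \(\omega _{ik}\in \,]0,+\infty [\), let \(\rho _{k}\in \,]0,+\infty [\), and let \(\varphi _{i}\in {\Gamma }_{0}(\mathbb {R})\) be cofinite. The problem is to
Denote by \(\mathcal {S}\) the set of solutions to (61) and suppose that \(\mathcal {S}\cap \,]0,+\infty [^{m}\neq \emptyset \). Let
\[\vartheta \colon \mathbb {R}\to \,]-\infty ,+\infty ]\colon \xi \mapsto \begin{cases}-\ln \xi ,&\text {if }\xi >0;\\ +\infty ,&\text {otherwise}\end{cases}\]
be Burg entropy, let \(\varepsilon \in \,]0,1/(1+p)[\), let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that
Let \((\xi _{i,0})_{1\leq i\leq m}\in \,]0,+\infty [^{m}\) and iterate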
Then there exists \((\overline {\xi }_{i})_{1\leq i\leq m}\in \mathcal {S}\) such that \((\forall i\in \{1,\ldots ,m\})\xi _{i,n}\to \overline {\xi }_{i}\).
Proof
For every i∈{1,…,m} and every k∈{1,…,p}, let us set \(\mathcal {X}_{i}=\mathbb {R}\), \(\mathcal {Y}_{k}=\mathbb {R}\), \(\psi _{k}=D^{\vartheta }(\cdot ,\rho _{k})\), and \(L_{ik}\colon \xi _{i}\mapsto \omega _{ik}\xi _{i}\). Then (61) is a particular case of (56). Since ψ is not differentiable on \(\mathbb {R}^{p}\), the standard forward-backward algorithm is inapplicable. We show that the problem can be solved by using Proposition 6. First, let \((\xi _{i})_{1\leq i\leq m}\) and \((\eta _{i})_{1\leq i\leq m}\) be in \(]0,+\infty [^{m}\), and consider
We see that ϕ is convex and positive. Thus,
and hence,
In turn,
This shows that (57) is satisfied with \((\forall k\in \{1,\ldots ,p\})\) \(\sigma _{k}=1\). Next, let us set \((\forall i\in \{1,\ldots ,m\})\) \(f_{i}=\vartheta \). Fix i∈{1,…,m} and k∈{1,…,p}, and let \(\xi _{i}\) and \(\eta _{i}\) be in ]0,+∞[. Then
which implies that \(f_{i}\succcurlyeq \psi _{k}\circ L_{ik}\). In addition, since \(\text {dom}\, f_{i}^{\ast }=]-\infty ,0[\) is open, [5, Lemma 7.3(ix)] asserts that \(D^{f_{i}}(\xi _{i},\cdot )\) is coercive. We therefore deduce the convergence result from Proposition 6(2b). □
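As a numerical illustration of the Burg entropy setting, the Python sketch below runs a Bregman forward-backward iteration with \(f_{i}=-\ln \). It rests on several assumptions that are not taken from the paper: the update is assumed to have the form \(x_{n+1}=(\nabla f+\gamma _{n}\partial \varphi )^{-1}(\nabla f(x_{n})-\gamma _{n}\nabla \psi (x_{n}))\), the smooth term is assumed to be \({\sum }_{k}D^{\vartheta }({\sum }_{i}\omega _{ik}\xi _{i},\rho _{k})\), the cofinite functions are chosen as \(\varphi _{i}=(\cdot -a_{i})^{2}/2\), and the data, the constant step size, and all names are hypothetical.

```python
import numpy as np

def burg_fb_step(xi, gamma, omega, rho, a):
    """One Bregman forward-backward step with f_i = Burg entropy (-ln).

    Illustrative assumptions: phi_i(t) = (t - a_i)**2 / 2 and smooth term
    sum_k D(sum_i omega[i, k] * xi[i], rho[k]), where
    D(t, r) = t/r - ln(t/r) - 1 is the Itakura-Saito divergence.
    """
    y = omega.T @ xi                          # y_k = sum_i omega_ik * xi_i
    grad = omega @ (1.0 / rho - 1.0 / y)      # gradient of the smooth term
    c = -1.0 / xi - gamma * grad              # grad f(xi) - gamma * grad
    # Backward step: solve -1/u + gamma*(u - a) = c for u > 0, i.e. take the
    # positive root of gamma*u**2 - (c + gamma*a)*u - 1 = 0.
    b = c + gamma * a
    return (b + np.sqrt(b**2 + 4.0 * gamma)) / (2.0 * gamma)

def burg_objective(xi, omega, rho, a):
    """phi(xi) plus the Itakura-Saito data term."""
    y = omega.T @ xi
    return 0.5 * np.sum((xi - a) ** 2) + np.sum(y / rho - np.log(y / rho) - 1.0)

rng = np.random.default_rng(0)
m, p = 3, 4
omega = rng.uniform(0.5, 2.0, size=(m, p))    # omega[i, k] > 0
a = np.array([1.0, 2.0, 0.5])                 # hypothetical positive solution
rho = omega.T @ a                             # consistent data, so a solves the problem
xi_burg = np.ones(m)                          # strictly positive starting point
gamma = 1.0 / (2.0 * p)                       # conservative constant step (assumption)
values_burg = [burg_objective(xi_burg, omega, rho, a)]
for _ in range(300):
    xi_burg = burg_fb_step(xi_burg, gamma, omega, rho, a)
    values_burg.append(burg_objective(xi_burg, omega, rho, a))
```

The backward step is available in closed form here because the optimality condition is a scalar quadratic equation in each coordinate; the positive root keeps every iterate in ]0,+∞[ without any projection.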
Example 7
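Let m and p be strictly positive integers. For every i∈{1,…,m} and every k∈{1,…,p}, let \(\omega _{ik}\in \,]0,+\infty [\), let \(\rho _{k}\in \,]0,+\infty [\), and let \(\varphi _{i}\in {\Gamma }_{0}(\mathbb {R})\). The problem is to
Denote by \(\mathcal {S}\) the set of solutions to (62) and suppose that \(\mathcal {S}\cap \,]0,+\infty [^{m}\neq \emptyset \). Let
\[\vartheta \colon \mathbb {R}\to \,]-\infty ,+\infty ]\colon \xi \mapsto \begin{cases}\xi \ln \xi -\xi ,&\text {if }\xi >0;\\ 0,&\text {if }\xi =0;\\ +\infty ,&\text {otherwise}\end{cases}\]
be Boltzmann–Shannon entropy, let \(\beta =\max _{1\leq k\leq p}\max _{1\leq i\leq m}\omega _{ik}\), let \(\varepsilon \in \,]0,1/(1+\beta )[\), let \((\eta _{n})_{n\in \mathbb {N}}\in \ell _{+}^{1}(\mathbb {N})\), and let \((\gamma _{n})_{n\in \mathbb {N}}\) be a sequence in \(\mathbb {R}\) such that
Let \((\xi _{i,0})_{1\leq i\leq m}\in \,]0,+\infty [^{m}\) and iterate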
Then, there exists \((\overline {\xi }_{i})_{1\leq i\leq m}\in \mathcal {S}\) such that \((\forall i\in \{1,\ldots ,m\})\xi _{i,n}\to \overline {\xi }_{i}\).
Proof
For every i∈{1,…,m} and every k∈{1,…,p}, let us set \(\mathcal {X}_{i}=\mathbb {R}\), \(\mathcal {Y}_{k}=\mathbb {R}\), \(\psi _{k}=D^{\vartheta }(\cdot ,\rho _{k})\), and \(L_{ik}\colon \xi _{i}\mapsto \omega _{ik}\xi _{i}\). Then (62) is a particular case of (56). We cannot apply the standard forward-backward algorithm here since ψ is not differentiable on \(\mathbb {R}^{p}\). We shall verify the assumptions of Proposition 6. First, let \((\xi _{i})_{1\leq i\leq m}\) and \((\eta _{i})_{1\leq i\leq m}\) be in \(]0,+\infty [^{m}\). Since
is convex, we have
and hence,
In turn,
which implies that
This shows that (57) is satisfied with \((\forall k\in \{1,\ldots ,p\})\) \(\sigma _{k}=1\). Next, let us set \((\forall i\in \{1,\ldots ,m\})\) \(f_{i}=\vartheta \). Fix i∈{1,…,m} and k∈{1,…,p}, and let \(\xi _{i}\) and \(\eta _{i}\) be in ]0,+∞[. Then
which implies that \(f_{i}\succcurlyeq \beta ^{-1}\psi _{k}\circ L_{ik}\). In addition, since f i is supercoercive, f i is cofinite and [5, Lemma 7.3(viii)] asserts that \(D^{f_{i}}(\xi _{i},\cdot )\) is coercive. Therefore, the claim follows from Proposition 6(2b). □
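For the Boltzmann–Shannon entropy, the Bregman forward-backward step becomes a multiplicative update, which the Python sketch below illustrates. The sketch makes assumptions that are not taken from the paper: the update is assumed to have the form \(x_{n+1}=(\nabla f+\gamma _{n}\partial \varphi )^{-1}(\nabla f(x_{n})-\gamma _{n}\nabla \psi (x_{n}))\), the smooth term is assumed to be \({\sum }_{k}D^{\vartheta }({\sum }_{i}\omega _{ik}\xi _{i},\rho _{k})\), the choice \(\varphi _{i}=0\) is made for simplicity, and the data, the constant step size, and all names are hypothetical.

```python
import numpy as np

def kl_fb_step(xi, gamma, omega, rho):
    """One Bregman forward-backward step with f_i = Boltzmann-Shannon entropy.

    Illustrative assumption: phi_i = 0, so the backward step reduces to
    (grad f)^{-1} = exp. Since grad f = ln, the update
    exp(ln(xi) - gamma * grad) is multiplicative and keeps xi > 0.
    """
    y = omega.T @ xi                      # y_k = sum_i omega_ik * xi_i
    grad = omega @ np.log(y / rho)        # gradient of sum_k KL(y_k, rho_k)
    return xi * np.exp(-gamma * grad)

def kl_objective(xi, omega, rho):
    """Sum over k of the Kullback-Leibler divergence between y_k and rho_k."""
    y = omega.T @ xi
    return np.sum(y * np.log(y / rho) - y + rho)

rng = np.random.default_rng(1)
m, p = 3, 4
omega = rng.uniform(0.5, 2.0, size=(m, p))    # omega[i, k] > 0
xi_star = np.array([1.0, 2.0, 0.5])           # hypothetical positive solution
rho = omega.T @ xi_star                       # consistent data: objective is 0 at xi_star
xi_kl = np.ones(m)                            # strictly positive starting point
gamma = 1.0 / omega.sum(axis=1).max()         # conservative constant step (assumption)
values_kl = [kl_objective(xi_kl, omega, rho)]
for _ in range(500):
    xi_kl = kl_fb_step(xi_kl, gamma, omega, rho)
    values_kl.append(kl_objective(xi_kl, omega, rho))
```

The step size used here is the reciprocal of the largest row sum of the weights, a choice under which the log-sum inequality gives a decrease of the objective at every step; it is a design choice for this sketch and is not the step-size rule of the example.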
Remark 4
The Bregman distance associated with Burg entropy, i.e., the Itakura–Saito divergence, is used in linear regression [3, Section 3]. The Bregman distance associated with Boltzmann–Shannon entropy, i.e., the Kullback–Leibler divergence, is used in information theory [3, Section 3] and image processing [11].
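The two divergences named in this remark can be checked numerically against the definition \(D^{f}(x,y)=f(x)-f(y)-\langle x-y,\nabla f(y)\rangle \). The following Python sketch does so for separable (coordinatewise summed) versions of the two entropies on strictly positive vectors; the function names and test vectors are illustrative assumptions.

```python
import numpy as np

def bregman_distance(f, grad_f, x, y):
    """D^f(x, y) = f(x) - f(y) - <x - y, grad f(y)> for x, y in int dom f."""
    return f(x) - f(y) - np.dot(x - y, grad_f(y))

# Separable Burg and Boltzmann-Shannon entropies on the positive orthant.
def burg(x):
    return -np.sum(np.log(x))

def grad_burg(x):
    return -1.0 / x

def shannon(x):
    return np.sum(x * np.log(x) - x)

def grad_shannon(x):
    return np.log(x)

def itakura_saito(x, y):
    """Closed form of the Bregman distance of Burg entropy."""
    return np.sum(x / y - np.log(x / y) - 1.0)

def kullback_leibler(x, y):
    """Closed form of the Bregman distance of Boltzmann-Shannon entropy."""
    return np.sum(x * np.log(x / y) - x + y)

x = np.array([0.5, 2.0, 1.5])
y = np.array([1.0, 1.0, 3.0])
d_burg = bregman_distance(burg, grad_burg, x, y)
d_shannon = bregman_distance(shannon, grad_shannon, x, y)
```

Both quantities are nonnegative, vanish when x = y, and are not symmetric in (x, y), which is why the order of the arguments matters in the algorithms above.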
References
[1] Attouch, H., Brezis, H.: Duality for the sum of convex functions in general Banach spaces. In: Barroso, J.A. (ed.) Aspects of Mathematics and its Applications. North-Holland Mathematics Library, vol. 34, pp. 125–133. North-Holland, Amsterdam (1986)
[2] Banerjee, A., Basu, S., Merugu, S.: Multi-way clustering on relation graphs. In: Apte, C., et al. (eds.) Proceedings of the 2007 SIAM International Conference on Data Mining, pp. 145–156. SIAM, Philadelphia (2007)
[3] Basseville, M.: Divergence measures for statistical data processing—An annotated bibliography. Signal Process. 93, 621–633 (2013)
[4] Bauschke, H.H., Borwein, J.M.: Legendre functions and the method of random Bregman projections. J. Convex Anal. 4, 27–67 (1997)
[5] Bauschke, H.H., Borwein, J.M., Combettes, P.L.: Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces. Commun. Contemp. Math. 3, 615–647 (2001)
[6] Bauschke, H.H., Borwein, J.M., Combettes, P.L.: Bregman monotone optimization algorithms. SIAM J. Control Optim. 42, 596–636 (2003)
[7] Bauschke, H.H., Combettes, P.L.: Iterating Bregman retractions. SIAM J. Optim. 13, 1159–1173 (2003)
[8] Bauschke, H.H., Combettes, P.L., Noll, D.: Joint minimization with alternating Bregman proximity operators. Pac. J. Optim. 2, 401–424 (2006)
[9] Bertero, M., Boccacci, P., Desiderà, G., Vicidomini, G.: Image deblurring with Poisson data: from cells to galaxies. Inverse Probl. 25, 123006 (2009). 26 pages
[10] Bredies, K.: A forward-backward splitting algorithm for the minimization of non-smooth convex functionals in Banach space. Inverse Probl. 25, 015005 (2009). 20 pages
[11] Byrne, C.L.: Iterative image reconstruction algorithms based on cross-entropy minimization. IEEE Trans. Image Process. 2, 96–103 (1993)
[12] Chandrasekaran, V., Parrilo, P.A., Willsky, A.S.: Latent variable graphical model selection via convex optimization. Ann. Stat. 40, 1935–1967 (2012)
[13] Combettes, P.L., Vũ, B.C.: Variable metric quasi-Fejér monotonicity. Nonlinear Anal. 78, 17–31 (2013)
[14] Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)
[15] Corless, R.M., Gonnet, G.H., Hare, D.E.G., Jeffrey, D.J., Knuth, D.E.: On the Lambert W function. Adv. Comput. Math. 5, 329–359 (1996)
[16] Kivinen, J., Warmuth, M.K.: Relative loss bounds for multidimensional regression problems. Mach. Learn. 45, 301–329 (2001)
[17] Lantéri, H., Roche, M., Aime, C.: Penalized maximum likelihood image restoration with positivity constraints: multiplicative algorithms. Inverse Probl. 18, 1397–1419 (2002)
[18] Markham, J., Conchello, J.-A.: Fast maximum-likelihood image-restoration algorithms for three-dimensional fluorescence microscopy. J. Opt. Soc. Am. A 18, 1062–1071 (2001)
[19] Moreau, J.-J.: Fonctions convexes duales et points proximaux dans un espace hilbertien. C. R. Acad. Sci. Paris 255, 2897–2899 (1962)
[20] Nguyen, V.Q.: Variable quasi-Bregman monotone sequences. Numer. Algor. doi:10.1007/s11075-016-0132-9 (2016)
[21] Phelps, R.R.: Convex Functions, Monotone Operators and Differentiability. Lecture Notes in Mathematics, 2nd edn., vol. 1364. Springer-Verlag, Berlin (1993)
[22] Polyak, B.T.: Introduction to Optimization. Translations Series in Mathematics and Engineering. Optimization Software, Inc. Publications Division, New York (1987)
[23] Zălinescu, C.: Convex Analysis in General Vector Spaces. World Scientific Publishing Co., Inc., River Edge (2002)
Acknowledgments
I would like to thank my doctoral advisor Professor Patrick L. Combettes for bringing this problem to my attention and for helpful discussions. I also sincerely thank the referees for their important contributions to the article.
Van Nguyen, Q. Forward-Backward Splitting with Bregman Distances. Vietnam J. Math. 45, 519–539 (2017). https://doi.org/10.1007/s10013-016-0238-3
DOI: https://doi.org/10.1007/s10013-016-0238-3
Keywords
- Banach space
- Bregman distance
- Forward-backward algorithm
- Legendre function
- Multivariate minimization
- Variable quasi-Bregman monotonicity