1 Introduction

Throughout this paper, \(({\fancyscript{X}},\Vert \cdot \Vert )\) is a reflexive real Banach space with topological dual \(({\fancyscript{X}}^*,\Vert \cdot \Vert _*)\), and the canonical bilinear form on \({\fancyscript{X}}\times {\fancyscript{X}}^*\) is denoted by \(\langle {{\cdot },{\cdot }}\rangle \). The distance function to a set \(C\subset {\fancyscript{X}}\) is \(d_C:x\mapsto \inf _{y\in C}\Vert x-y\Vert \), the metric projector onto \(C\) is \(P_C:x\mapsto \big \{{y\in C}~\big |~{\Vert x-y\Vert =d_C(x)}\big \}\), and the polar cone of \(C\) is \(C^\ominus =\big \{{x^*\in {\fancyscript{X}}^*}~\big |~{(\forall x\in C)\;\langle {{x},{x^*}}\rangle \le 0}\big \}\). \(\varGamma _0({\fancyscript{X}})\) is the class of lower semicontinuous convex functions \(\varphi :{\fancyscript{X}}\rightarrow {\,]\!-\!\infty ,+\infty ]}\) such that \({\mathrm{dom}}\,\varphi =\big \{{x\in {\fancyscript{X}}}~\big |~{\varphi (x)<{{+\infty }}}\big \}\ne {{\varnothing }}\).

A classical tool in linear hilbertian analysis is the following orthogonal decomposition principle.

Proposition 1

Suppose that \({\fancyscript{X}}\) is a Hilbert space, let \(V\) be a closed vector subspace of \({\fancyscript{X}}\) with orthogonal complement \(V^\bot \), and let \(x\in {\fancyscript{X}}\). Then the following hold.

  1. (i)

    \(\Vert x\Vert ^2=d_V^2(x)+d_{V^\bot }^2(x)\).

  2. (ii)

    \(x=P_Vx+P_{V^\bot }x\).

  3. (iii)

    \(\langle {{P_Vx},{P_{V^\perp }x}}\rangle =0\).

In 1962, Moreau proposed a nonlinear extension of this decomposition.

Proposition 2

[22] Suppose that \({\fancyscript{X}}\) is a Hilbert space, let \(K\) be a nonempty closed convex cone in \({\fancyscript{X}}\), and let \(x\in {\fancyscript{X}}\). Then the following hold.

  1. (i)

    \(\Vert x\Vert ^2=d_K^2(x)+d_{K^\ominus }^2(x)\).

  2. (ii)

    \(x=P_Kx+P_{K^\ominus }x\).

  3. (iii)

    \(\langle {{P_Kx},{P_{K^\ominus }x}}\rangle =0\).
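As a numerical illustration of Proposition 2, the following minimal sketch takes \({\fancyscript{X}}=\mathbb{R}^5\) with the Euclidean norm and \(K\) the nonnegative orthant, so that \(K^\ominus \) is the nonpositive orthant and both projectors act componentwise. The choice of cone, the test vector, and all function names are assumptions made for illustration only.

```python
def project_orthant(x):
    """P_K x for K = nonnegative orthant (componentwise positive part)."""
    return [max(t, 0.0) for t in x]


def project_polar(x):
    """Projection onto the polar cone of K, here the nonpositive orthant."""
    return [min(t, 0.0) for t in x]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dist_sq(x, p):
    """Squared Euclidean distance between x and p."""
    return sum((a - b) ** 2 for a, b in zip(x, p))


def moreau_cone_residuals(x):
    """Residuals of Proposition 2(i)-(iii); all three should vanish."""
    pk, pp = project_orthant(x), project_polar(x)
    r1 = dot(x, x) - (dist_sq(x, pk) + dist_sq(x, pp))
    r2 = max(abs(a - (b + c)) for a, b, c in zip(x, pk, pp))
    r3 = dot(pk, pp)
    return r1, r2, r3


print(moreau_cone_residuals([3.0, -1.5, 0.0, 2.0, -4.0]))
```

The three printed residuals correspond, in order, to items (i), (ii), and (iii).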

Motivated by problems in unilateral mechanics, Moreau further extended this result in [23] (see also [25]). To state Moreau’s decomposition principle, we require some basic notions from convex analysis [7, 33]. Let \(\varphi \) and \(f\) be two functions in \(\varGamma _0({\fancyscript{X}})\). The conjugate of \(\varphi \) is the function \(\varphi ^*\) in \(\varGamma _0({\fancyscript{X}}^*)\) defined by

$$\begin{aligned} \varphi ^*:{\fancyscript{X}}^*\rightarrow {\,]\!-\!\infty ,+\infty ]}:x^*\mapsto \sup _{x\in {\fancyscript{X}}}\big (\langle {{x},{x^*}}\rangle -\varphi (x)\big ). \end{aligned}$$
(1.1)

Moreover, the infimal convolution of \(\varphi \) and \(f\) is the function

$$\begin{aligned} \varphi {\square }f:{\fancyscript{X}}\rightarrow {\,\left[ -\infty ,+\infty \right] }:x\mapsto \inf _{y\in {\fancyscript{X}}}\big (\varphi (y)+f(x-y)\big ). \end{aligned}$$
(1.2)

Now suppose that \({\fancyscript{X}}\) is a Hilbert space and set \(q=(1/2)\Vert \cdot \Vert ^2\). Then, for every \(x\in {\fancyscript{X}}\), there exists a unique point \(p\in {\fancyscript{X}}\) such that \((\varphi {\square }q)(x)=\varphi (p)+q(x-p)\); this point is denoted by \(p={{\mathrm{prox}}}_\varphi x\). The operator \({{\mathrm{prox}}}_\varphi :{\fancyscript{X}}\rightarrow {\fancyscript{X}}\) thus defined is called the proximity operator of \(\varphi \).

Proposition 3

[23, 25] Suppose that \({\fancyscript{X}}\) is a Hilbert space, let \(\varphi \in \varGamma _0({\fancyscript{X}})\), set \(q=\Vert \cdot \Vert ^2/2\), and let \(x\in {\fancyscript{X}}\). Then the following hold.

  1. (i)

    \(q(x)=(\varphi {\square }q)(x)+(\varphi ^*{\square }q)(x)\).

  2. (ii)

    \(x={{\mathrm{prox}}}_\varphi x+{{\mathrm{prox}}}_{\varphi ^*}x\).

  3. (iii)

    \(\langle {{{{\mathrm{prox}}}_\varphi x},{{{\mathrm{prox}}}_{\varphi ^*}x}}\rangle =\varphi \big ({{\mathrm{prox}}}_\varphi x\big )+ \varphi ^*\big ({{\mathrm{prox}}}_{\varphi ^*}x\big )\).
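Proposition 3 can be checked by hand in the simplest setting. The sketch below assumes \({\fancyscript{X}}=\mathbb{R}\) and \(\varphi =\lambda |\cdot |\) with \(\lambda >0\), for which \(\varphi ^*\) is the indicator function of \([-\lambda ,\lambda ]\), \({{\mathrm{prox}}}_\varphi \) is soft thresholding, and \({{\mathrm{prox}}}_{\varphi ^*}\) is the projector onto \([-\lambda ,\lambda ]\). This choice of \(\varphi \) and the function names are illustrative assumptions.

```python
def prox_abs(x, lam):
    """prox of phi = lam*|.|: soft thresholding."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def prox_conj(x, lam):
    """prox of phi* = indicator of [-lam, lam]: projection onto the interval."""
    return min(max(x, -lam), lam)


def q(t):
    return 0.5 * t * t


def moreau_residuals(x, lam):
    """Residuals of Proposition 3(i)-(iii); all three should vanish."""
    p, p_star = prox_abs(x, lam), prox_conj(x, lam)
    phi_p = lam * abs(p)
    phi_conj_p_star = 0.0  # p_star lies in [-lam, lam], so the indicator vanishes
    env_phi = phi_p + q(x - p)                 # (phi box q)(x)
    env_conj = phi_conj_p_star + q(x - p_star)  # (phi* box q)(x)
    r1 = q(x) - (env_phi + env_conj)
    r2 = x - (p + p_star)
    r3 = p * p_star - (phi_p + phi_conj_p_star)
    return r1, r2, r3


for x in (2.5, -3.0, 0.3):
    print(x, moreau_residuals(x, 1.0))
```

Here the two infimal convolutions are evaluated at their minimizers, which is legitimate because the infimum defining \(\varphi {\square }q\) is attained at \({{\mathrm{prox}}}_\varphi x\).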

Note that, if in Proposition 3 \(\varphi \) is the indicator function of a nonempty closed convex cone \(K\subset {\fancyscript{X}}\), i.e., \(\varphi =\iota _K\) where

$$\begin{aligned} (\forall x\in {\fancyscript{X}})\quad \iota _K(x)= {\left\{ \begin{array}{ll} 0,&{}\text{ if }\;\;x\in K;\\ {{+\infty }},&{}\text{ if }\;\;x\notin K, \end{array}\right. } \end{aligned}$$
(1.3)

we recover Proposition 2.

The above hilbertian nonlinear decomposition principles have found many applications in optimization and in various other areas of applied mathematics (see for instance [9, 12–18, 21, 28] and the references therein) and attempts have been made to extend them to more general Banach spaces. The main result in this direction is the following generalization of Proposition 2(ii) and (iii) in uniformly convex and uniformly smooth Banach spaces (see also [3, 19, 29, 30] for alternate proofs and applications), where \(\varPi _C\) denotes the generalized projector onto a nonempty closed convex subset \(C\) of \({\fancyscript{X}}\) [1], i.e., if \(J\) denotes the duality mapping of \({\fancyscript{X}}\),

$$\begin{aligned} (\forall x\in {\fancyscript{X}})\quad \varPi _C x=\underset{y\in C}{\mathrm{argmin}} \big (\Vert x\Vert ^2-2\langle {{y},{Jx}}\rangle +\Vert y\Vert ^2\big ). \end{aligned}$$
(1.4)

Proposition 4

[2] Suppose that \({\fancyscript{X}}\) is uniformly convex and uniformly smooth, let \(J:{\fancyscript{X}}\rightarrow {\fancyscript{X}}^*\) denote its duality mapping, which is characterized by

$$\begin{aligned} (\forall x\in {\fancyscript{X}})\quad \Vert x\Vert ^2=\langle {{x},{Jx}}\rangle =\Vert Jx\Vert ^2_*, \end{aligned}$$
(1.5)

let \(K\) be a nonempty closed convex cone in \({\fancyscript{X}}\), and let \(x\in {\fancyscript{X}}\). Then the following hold.

  1. (i)

    \(x=P_Kx+ J^{-1}\big (\varPi _{K^\ominus }(Jx)\big )\).

  2. (ii)

    \(\langle {{P_Kx},{\varPi _{K^\ominus }(Jx)}}\rangle =0\).

The objective of the present paper is to unify and extend the above results. To this end, we first discuss in Sect. 2 suitable notions of proximity in Banach spaces. Based on these, we propose our extension of Moreau’s decomposition in Sect. 3. A feature of our analysis is that it relies heavily on convex analytical tools, which allows us to derive our main result with simpler proofs than those used in the above special case.

2 Proximity in Banach spaces

Let \(\varphi \in \varGamma _0({\fancyscript{X}})\). As seen in the Introduction, if \({\fancyscript{X}}\) is a Hilbert space, Moreau’s proximity operator is defined by

$$\begin{aligned} (\forall x\in {\fancyscript{X}})\quad {{\mathrm{prox}}}_\varphi x=\underset{y\in {\fancyscript{X}}}{\mathrm{argmin}}\Big (\varphi (y)+\frac{1}{2}\Vert x-y\Vert ^2\Big ). \end{aligned}$$
(2.1)

In this section we discuss two extensions of this operator in Banach spaces. We recall that \(\varphi \) is coercive if \(\lim _{\Vert y\Vert \rightarrow {{+\infty }}}\varphi (y)={{+\infty }}\) and supercoercive if \(\lim _{\Vert y\Vert \rightarrow {{+\infty }}}\varphi (y)/\Vert y\Vert ={{+\infty }}\). As usual, the subdifferential operator of \(\varphi \) is denoted by \(\partial \varphi \). Finally, the strong relative interior of a convex set \(C\subset {\fancyscript{X}}\) is

$$\begin{aligned} {{\mathrm{sri}}\,}C=\bigg \{{x\in C}~\bigg |~{\bigcup _{\lambda >0}\lambda (C-x)={\overline{\mathrm{span}}\,}(C-x)}\bigg \}. \end{aligned}$$
(2.2)

We shall also require the following facts.

Lemma 1

[24, 26] Let \(f\in \varGamma _0({\fancyscript{X}})\) and let \(x^*\in {\fancyscript{X}}^*\). Then \(f-x^*\) is coercive if and only if \(x^*\in {\mathrm{int}\mathrm{dom}}\,f^*\).

Lemma 2

[5, Theorem 3.4] Let \(f\in \varGamma _0({\fancyscript{X}})\) be supercoercive. Then \({\mathrm{dom}}\,f^*={\fancyscript{X}}^*\).

Lemma 3

[4] Let \(f\) and \(\varphi \) be functions in \(\varGamma _0({\fancyscript{X}})\) such that \(0\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f-{\mathrm{dom}}\,\varphi )\). Then the following hold.

  1. (i)

    \((\varphi +f)^*=\varphi ^*{\square }f^*\) and the infimal convolution is exact everywhere:

    $$\begin{aligned} (\forall x^*\in {\fancyscript{X}}^*)({\exists \,}y^*\in {\fancyscript{X}}^*)\quad (\varphi +f)^*(x^*)=\varphi ^*(y^*)+f^*(x^*-y^*). \end{aligned}$$
  2. (ii)

    \(\partial (\varphi +f)=\partial \varphi +\partial f\).

2.1 Legendre functions

We review the notion of a Legendre function, which was introduced in Euclidean spaces in [27] and extended to Banach spaces in [5] (see also [8] for further developments in the nonreflexive case).

Definition 1

[5, Definition 5.2] Let \(f\in \varGamma _0({\fancyscript{X}})\). Then \(f\) is:

  1. (i)

    essentially smooth, if \(\partial f\) is both locally bounded and single-valued on its domain;

  2. (ii)

    essentially strictly convex, if \((\partial f)^{-1}\) is locally bounded on its domain and \(f\) is strictly convex on every convex subset of \({\mathrm{dom}}\,\partial f\);

  3. (iii)

    a Legendre function, if it is both essentially smooth and essentially strictly convex.

Some key properties of Legendre functions are listed below.

Lemma 4

Let \(f\in \varGamma _0({\fancyscript{X}})\) be a Legendre function. Then the following hold.

  1. (i)

    \(f^*\) is a Legendre function [5, Corollary 5.5].

  2. (ii)

    \({\mathrm{dom}}\,\partial f={\mathrm{int}\mathrm{dom}f}\ne {{\varnothing }}\) and \(f\) is Gâteaux differentiable on \({\mathrm{int}\mathrm{dom}f}\) [5, Theorem 5.6].

  3. (iii)

    \(\nabla f:{\mathrm{int}\mathrm{dom}f}\rightarrow {\mathrm{int}\mathrm{dom}f^*}\) is bijective with inverse \(\nabla f^*:{\mathrm{int}\mathrm{dom}f^*}\rightarrow {\mathrm{int}\mathrm{dom}f}\) [5, Theorem 5.10].

2.2 \(D\)-proximity operators

In this subsection we discuss a notion of proximity based on Bregman distances, which was investigated in [6] and goes back to [10, 31].

This construction provides a first extension of (2.1). Let \(f\in \varGamma _0({\fancyscript{X}})\) be a Legendre function. The Bregman distance associated with \(f\) is

$$\begin{aligned} D_f:{\fancyscript{X}}\times {\fancyscript{X}}&\rightarrow [0,{{+\infty }}]\nonumber \\ (y,x)&\mapsto {\left\{ \begin{array}{ll} f(y)-f(x)-\langle {{y-x},{{\nabla f}(x)}}\rangle ,&{}\text{ if }\;\;x\in {\mathrm{int}\mathrm{dom}f};\\ {{+\infty }},&{}\text{ otherwise }. \end{array}\right. }\qquad \end{aligned}$$
(2.3)

For every \(\varphi \in \varGamma _0({\fancyscript{X}})\), we define the function \(\varphi {\diamond }f:{\fancyscript{X}}\rightarrow {\,\left[ -\infty ,+\infty \right] }\) by

$$\begin{aligned} (\forall x\in {\fancyscript{X}})\quad (\varphi {\diamond }f)(x)=\inf _{y\in {\fancyscript{X}}}\big (\varphi (y)+ D_f(y,x)\big ). \end{aligned}$$
(2.4)

The following proposition refines and complements some results of [6, Section 3.4].

Proposition 5

Let \(f\in \varGamma _0({\fancyscript{X}})\) be a Legendre function, let \(\varphi \in \varGamma _0({\fancyscript{X}})\) be such that

$$\begin{aligned} 0\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f-{\mathrm{dom}}\,\varphi ), \end{aligned}$$
(2.5)

and let \(x\in {\mathrm{int}\mathrm{dom}}\,f\). Suppose that one of the following holds.

  1. (i)

    \({\nabla f}(x)\in {\mathrm{int}}({\mathrm{dom}}\,f^*+{\mathrm{dom}}\,\varphi ^*)\).

  2. (ii)

    \({\mathrm{int}\mathrm{dom}}\,f^*\subset {\mathrm{int}}({\mathrm{dom}}\,f^*+{\mathrm{dom}}\,\varphi ^*)\).

  3. (iii)

    \(f\) is supercoercive.

  4. (iv)

    \(\inf \varphi ({\fancyscript{X}})>{{-\infty }}\).

Then there exists a unique point \(p\in {\fancyscript{X}}\) such that \((\varphi {\diamond }f)(x)=\varphi (p)+D_f(p,x)\); moreover, \(p\) lies in \({\mathrm{dom}}\,\partial \varphi \cap {\mathrm{int}\mathrm{dom}f}\) and it is characterized by the inclusion

$$\begin{aligned} {\nabla f}(x)-{\nabla f}(p)\in \partial \varphi (p). \end{aligned}$$
(2.6)

Proof

Set \(f_x:{\fancyscript{X}}\rightarrow {\,]-\infty ,+\infty ]}:y\mapsto f(y)-\langle {{y},{{\nabla f}(x)}}\rangle \). Then the minimizers of \(\varphi +D_f(\cdot ,x)\) coincide with those of \(\varphi +f_x\) and our assumptions imply that

$$\begin{aligned} \varphi +f_x\in \varGamma _0({\fancyscript{X}}). \end{aligned}$$
(2.7)

Now let \(p\in {\fancyscript{X}}\). It follows from (2.5), Lemma 3(ii), and Lemma 4(ii) that

$$\begin{aligned} (\varphi \,{\diamond }\, f)(x)=\varphi (p)+D_f(p,x)&\Leftrightarrow p\;\,\text{ minimizes }\;\,\varphi +f_x \nonumber \\&\Leftrightarrow 0\in \partial \big (\varphi +f_x\big )(p) \nonumber \\&\Leftrightarrow 0\in \partial \varphi (p)+\partial f(p)-{\nabla f}(x) \nonumber \\&\Leftrightarrow 0\in \partial \varphi (p)+{\nabla f}(p)-{\nabla f}(x) \nonumber \\&\Leftrightarrow {\nabla f}(x)-{\nabla f}(p)\in \partial \varphi (p)\end{aligned}$$
(2.8)
$$\begin{aligned}&\Rightarrow p\in {\mathrm{dom}}\,\partial \varphi \cap {\mathrm{int}\mathrm{dom}f}. \end{aligned}$$
(2.9)

Hence, the minimizers of \(\varphi +f_x\) are in \({\mathrm{int}\mathrm{dom}f}\). However, since \(f\) is essentially strictly convex, it is strictly convex on \({\mathrm{int}\mathrm{dom}f}\), and therefore so is \(\varphi +f_x\). This shows that \(\varphi +f_x\) admits at most one minimizer. It remains to establish existence.

(i): It follows from (2.7) that, to show existence, it is enough to show that \(\varphi +f_x\) is coercive [33, Theorem 2.5.1(ii)]. In view of Lemma 1, this is equivalent to showing that \({\nabla f}(x)\in {\mathrm{int}\mathrm{dom}}\,(f+\varphi )^*\). However, it follows from (2.5) and Lemma 3(i) that

$$\begin{aligned} {\mathrm{int}\mathrm{dom}}\,(f+\varphi )^* ={\mathrm{int}\mathrm{dom}}\,(f^*{\square }\varphi ^*) ={\mathrm{int}}({\mathrm{dom}}\,f^*+{\mathrm{dom}}\,\varphi ^*). \end{aligned}$$
(2.10)

(ii) \(\Rightarrow \) (i): Lemma 4(iii).

(iii) \(\Rightarrow \) (ii): By Lemma 2, \({\mathrm{dom}}\,f^*={\fancyscript{X}}^*\) and, since \({\mathrm{dom}}\,\varphi ^*\ne {{\varnothing }},\,{\mathrm{int}\mathrm{dom}}\,f^*\subset {\mathrm{int}}({\mathrm{dom}}\,f^*+{\mathrm{dom}}\,\varphi ^*)\).

(iv) \(\Rightarrow \) (ii): We have \(\inf \varphi ({\fancyscript{X}})>{{-\infty }}\,\Rightarrow \,\varphi ^*(0) =-\inf \varphi ({\fancyscript{X}})<{{+\infty }}\,\Rightarrow \,0\in {\mathrm{dom}}\,\varphi ^*\). Hence, \({\mathrm{int}\mathrm{dom}}\,f^*\subset {\mathrm{int}}({\mathrm{dom}}\,f^*+{\mathrm{dom}}\,\varphi ^*)\). \(\square \)

In view of Proposition 5 and Lemma 4(iii), the following is well defined.

Definition 2

Let \(f\in \varGamma _0({\fancyscript{X}})\) be a Legendre function and let \(\varphi \in \varGamma _0({\fancyscript{X}})\) be such that \(0\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f-{\mathrm{dom}}\,\varphi )\). Set

$$\begin{aligned} E=({\mathrm{int}\mathrm{dom}f})\cap \big ({\nabla f}^*\big ({\mathrm{int}}({\mathrm{dom}}\,f^*+{\mathrm{dom}}\,\varphi ^*)\big )\big ). \end{aligned}$$
(2.11)

The \(D\)-proximity (or Bregman proximity) operator of \(\varphi \) relative to \(f\) is

$$\begin{aligned} {{\mathrm{bprox}}}_\varphi ^f:E\rightarrow {\mathrm{int}\mathrm{dom}f}:x\mapsto \underset{y\in {\fancyscript{X}}}{\mathrm{argmin}}\big (\varphi (y)+D_f(y,x)\big ). \end{aligned}$$
(2.12)

Remark 1

In connection with Definition 2, let us make a couple of observations.

  1. (i)

    It follows from Proposition 5 that, if \({\mathrm{int}\mathrm{dom}}\,f^*\subset {\mathrm{int}}({\mathrm{dom}}\,\varphi ^*+{\mathrm{dom}}\,f^*)\) (in particular if \(f\) is supercoercive or if \(\inf \varphi ({\fancyscript{X}})>{{-\infty }}\)), then \({{\mathrm{bprox}}}_\varphi ^f:{\mathrm{int}\mathrm{dom}f}\rightarrow {\mathrm{int}\mathrm{dom}f}\).

  2. (ii)

    Suppose that \({\fancyscript{X}}\) is hilbertian and that \(f=\Vert \cdot \Vert ^2/2\), and let \(\varphi \in \varGamma _0({\fancyscript{X}})\). Then \(\varphi {\diamond }f=\varphi \,{\square }\,f\) and \({{\mathrm{bprox}}}_\varphi ^f={{\mathrm{prox}}}_\varphi \).
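To make Definition 2 concrete outside the quadratic case, the following sketch assumes \({\fancyscript{X}}=\mathbb{R}\), the Boltzmann-Shannon entropy \(f:t\mapsto t\ln t-t\) on \([0,{{+\infty }}[\) (with \(f(0)=0\) and \(f={{+\infty }}\) elsewhere), and the linear function \(\varphi :y\mapsto cy\). Then \(f\) is a supercoercive Legendre function with \({\nabla f}=\ln \), and the inclusion (2.6) reduces to \(\ln x-\ln p=c\), i.e., \(p=xe^{-c}\). These choices, the sample values, and the grid comparison are illustrative assumptions.

```python
import math


def bregman_entropy(y, x):
    """D_f(y, x) for f(t) = t*log(t) - t with f(0) = 0, where x > 0 and y >= 0."""
    if y == 0.0:
        return x
    return y * math.log(y / x) - y + x


def bprox_linear(x, c):
    """Solve (2.6) for phi(y) = c*y: log(x) - log(p) = c, i.e. p = x*exp(-c)."""
    return x * math.exp(-c)


def objective(y, x, c):
    """phi(y) + D_f(y, x) with phi(y) = c*y."""
    return c * y + bregman_entropy(y, x)


x, c = 2.0, 0.7
p = bprox_linear(x, c)
grid = [0.01 * k for k in range(1001)]  # sample points y in [0, 10]
best_on_grid = min(objective(y, x, c) for y in grid)
print(p, objective(p, x, c), best_on_grid)
```

The value of the objective at \(p\) should not exceed its value at any grid point, which is consistent with \(p\) being the unique minimizer asserted in Proposition 5.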

2.3 Anisotropic proximity operators

An alternative extension of the notion of proximity can be obtained by replacing the function \(\Vert \cdot \Vert ^2/2\) in (2.1) by a Legendre function \(f\). This type of construction goes back to [20].

Proposition 6

Let \(f\in \varGamma _0({\fancyscript{X}})\) be a Legendre function, let \(\varphi \in \varGamma _0({\fancyscript{X}})\) be such that

$$\begin{aligned} 0\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f^*-{\mathrm{dom}}\,\varphi ^*), \end{aligned}$$
(2.13)

and let \(x\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f+{\mathrm{dom}}\,\varphi )\). Then there exists a unique point \(p\in {\fancyscript{X}}\) such that \((\varphi \,{\square }\, f)(x)=\varphi (p)+f(x-p)\); moreover, \(p\) is characterized by the inclusion

$$\begin{aligned} {\nabla f}(x-p)\in \partial \varphi (p). \end{aligned}$$
(2.14)

Proof

Using (2.13) and Lemma 3(i), we obtain

$$\begin{aligned} (\varphi ^*+f^*)^*=\varphi ^{**}\,{\square }\, f^{**}=\varphi \,{\square }\,f \end{aligned}$$
(2.15)

and the fact that the infimum in the infimal convolution is attained everywhere. On the other hand, since \(x\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f+{\mathrm{dom}}\,\varphi )\), we have

$$\begin{aligned} 0\in {{\mathrm{sri}}\,}\big ({\mathrm{dom}}\,\varphi -(x-{\mathrm{dom}}\,f)\big ) ={{\mathrm{sri}}\,}\big ({\mathrm{dom}}\,\varphi -{\mathrm{dom}}\,f(x-\cdot )\big ). \end{aligned}$$
(2.16)

Consequently, by Lemma 3(ii),

$$\begin{aligned} \partial \big (\varphi +f(x-\cdot )\big )=\partial \varphi +\partial f(x-\cdot ). \end{aligned}$$
(2.17)

Now let \(p\in {\fancyscript{X}}\). It follows from (2.17) and Lemma 4(ii) that

$$\begin{aligned} p\;\,\text{ minimizes }\;\,\varphi +f(x-\cdot )&\Leftrightarrow 0\in \partial \big (\varphi +f(x-\cdot )\big )(p)\nonumber \\&\Leftrightarrow 0\in \partial \varphi (p)-\partial f(x-p)\nonumber \\&\Leftrightarrow 0\in \partial \varphi (p)-{\nabla f}(x-p)\nonumber \\&\Leftrightarrow {\nabla f}(x-p)\in \partial \varphi (p)\end{aligned}$$
(2.18)
$$\begin{aligned}&\Rightarrow x-p\in {\mathrm{int}\mathrm{dom}f}. \end{aligned}$$
(2.19)

To show uniqueness, suppose that \(p\) and \(q\) are two distinct minimizers of \(\varphi +f(x-\cdot )\). Then \((\varphi \,{\square }\, f)(x)=\varphi (p)+f(x-p)=\varphi (q)+f(x-q)\) and, by (2.19), \(x-p\) and \(x-q\) lie in \({\mathrm{int}\mathrm{dom}f}\). Now set \(r=(1/2)(p+q)\). Since \(f\) is essentially strictly convex, Lemma 4(ii) implies that \(f\) is strictly convex on the convex set \({\mathrm{int}\mathrm{dom}f}={\mathrm{dom}}\,\partial f\). Therefore, since \(x-p\ne x-q\) and \(\varphi \) is convex,

$$\begin{aligned} (\varphi \,{\square }\,f)(x)&\le \varphi (r)+f(x-r)\nonumber \\&< \frac{1}{2}\big (\varphi (p)+\varphi (q)\big )+ \frac{1}{2}\big (f(x-p)+f(x-q)\big )\nonumber \\&= (\varphi \,{\square }\,f)(x), \end{aligned}$$
(2.20)

which is impossible. \(\square \)

Using Proposition 6, we can now introduce the anisotropic proximity operator of \(\varphi \).

Definition 3

Let \(f\in \varGamma _0({\fancyscript{X}})\) be a Legendre function and let \(\varphi \in \varGamma _0({\fancyscript{X}})\) be such that \(0\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f^*-{\mathrm{dom}}\,\varphi ^*)\). Set

$$\begin{aligned} E={{\mathrm{sri}}\,}({\mathrm{dom}}\,f+{\mathrm{dom}}\,\varphi ). \end{aligned}$$
(2.21)

The anisotropic proximity operator of \(\varphi \) relative to \(f\) is

$$\begin{aligned} {{\mathrm{aprox}}}_\varphi ^f:E\rightarrow {\fancyscript{X}}:x\mapsto \underset{y\in {\fancyscript{X}}}{\mathrm{argmin}} \big (\varphi (y)+f(x-y)\big ). \end{aligned}$$
(2.22)

Remark 2

Suppose that \({\fancyscript{X}}\) is hilbertian and that \(f=\Vert \cdot \Vert ^2/2\), and let \(\varphi \in \varGamma _0({\fancyscript{X}})\). Then \({{\mathrm{aprox}}}_\varphi ^f={{\mathrm{prox}}}_\varphi \).

3 Main result

In the previous section we have described two extensions of the classical proximity operator. Our main result is a generalization of Moreau’s decomposition (Proposition 3) to Banach spaces that involves a mix of these two extensions.

Theorem 1

Let \(f\in \varGamma _0({\fancyscript{X}})\) be a Legendre function, let \(\varphi \in \varGamma _0({\fancyscript{X}})\) be such that

$$\begin{aligned} 0\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f^*-{\mathrm{dom}}\,\varphi ^*), \end{aligned}$$
(3.1)

and let \(x\in ({\mathrm{int}\mathrm{dom}f})\cap {\mathrm{int}}({\mathrm{dom}}\,f+{\mathrm{dom}}\,\varphi )\). Then the following hold.

  1. (i)

    \(f(x)=(\varphi \,{\square }\,f)(x)+(\varphi ^*\,{\diamond }\,f^*) \big ({\nabla f}(x)\big )\).

  2. (ii)

    \(x={{\mathrm{aprox}}}^f_\varphi x+\nabla f^* \big ({{\mathrm{bprox}}}^{f^*}_{\varphi ^*}\big (\nabla f(x)\big )\big )\).

  3. (iii)

    \(\big \langle {{{{\mathrm{aprox}}}^f_\varphi x},{{{\mathrm{bprox}}}_{\varphi ^*}^{f^*} \big ({\nabla f}(x)\big )}}\big \rangle =\varphi \big ({{\mathrm{aprox}}}^f_\varphi x\big )+ \varphi ^*\big ({{\mathrm{bprox}}}_{\varphi ^*}^{f^*}\big ({\nabla f}(x)\big )\big )\).

  4. (iv)

    \(\big \langle {{{{\mathrm{aprox}}}^f_\varphi x},{{\nabla f}\big (x-{{\mathrm{aprox}}}^f_\varphi x\big )}}\big \rangle = \varphi \big ({{\mathrm{aprox}}}^f_\varphi x\big )+ \varphi ^*\big ({\nabla f}\big (x-{{\mathrm{aprox}}}^f_\varphi x\big )\big )\).

Proof

Since \(x\in {\mathrm{int}}({\mathrm{dom}}\,f+{\mathrm{dom}}\,\varphi )\), Lemma 4(iii) yields

$$\begin{aligned} x\in {{\mathrm{sri}}\,}({\mathrm{dom}}\,f+{\mathrm{dom}}\,\varphi )\quad \text{ and }\quad {\nabla f}^*\big ({\nabla f}(x)\big )\in {\mathrm{int}}\big ({\mathrm{dom}}\,f^{**}+{\mathrm{dom}}\,\varphi ^{**}\big ).\qquad \end{aligned}$$
(3.2)

Hence, it follows from Proposition 6 that \({{\mathrm{aprox}}}^f_\varphi x\) is well defined and, from Lemma 4(i) and Proposition 5(i) (applied to \(f^*\) and \(\varphi ^*\)), that \({\nabla f}^*({{\mathrm{bprox}}}_{\varphi ^*}^{f^*}({\nabla f}(x)))\) is well defined. In addition,

$$\begin{aligned} (\varphi \,{\square }\,f)(x)\in \mathbb{R }\quad \text{ and }\quad (\varphi ^*\,{\diamond }\,f^*)\big ({\nabla f}(x)\big )\in \mathbb{R }. \end{aligned}$$
(3.3)

(i): It follows from (2.3), Lemma 4(iii), and the Fenchel-Young identity [33, Theorem 2.4.2(iii)] that

$$\begin{aligned} (\forall x^*\in {\fancyscript{X}}^*)\quad D_{f^*}\big (x^*,{\nabla f}(x)\big )&= f^*( x^*)-f^*\big ({\nabla f}(x)\big )-\langle {{x^*-{\nabla f}(x)},{x}}\rangle _*\nonumber \\&= f^*(x^*)+f(x)-\langle {{x^*},{x}}\rangle _*. \end{aligned}$$
(3.4)

This, (2.4), (3.1), and Lemma 3(i) imply that

$$\begin{aligned} (\varphi ^*\,{\diamond }\, f^*)\big ({\nabla f}(x)\big )&= \inf _{x^*\in {\fancyscript{X}}^*}\big (\varphi ^*(x^*)+f^*(x^*)+f(x)- \langle {{x^*},{x}}\rangle _*\big )\nonumber \\&= f(x)-\sup _{x^*\in {\fancyscript{X}}^*}\big (\langle {{x^*},{x}}\rangle _*-\varphi ^*(x^*)- f^*(x^*)\big )\nonumber \\&= f(x)-(\varphi ^*+f^*)^*(x)\nonumber \\&= f(x)-(\varphi \,{\square }\, f)(x). \end{aligned}$$
(3.5)

In view of (3.3), we obtain the announced identity.

(ii): Let \(p\in {\fancyscript{X}}\). Using Proposition 6, Lemma 4(iii), and Proposition 5(i), we obtain

$$\begin{aligned} p={{\mathrm{aprox}}}^f_\varphi x&\Leftrightarrow {\nabla f}(x-p)\in \partial \varphi (p)\end{aligned}$$
(3.6)
$$\begin{aligned}&\Leftrightarrow p\in \partial \varphi ^*\big ({\nabla f}(x-p)\big )\nonumber \\&\Leftrightarrow {\nabla f}^*\big ({\nabla f}(x)\big )-{\nabla f}^*\big ({\nabla f}(x-p)\big )\in \partial \varphi ^*\big ({\nabla f}(x-p)\big )\nonumber \\&\Leftrightarrow {\nabla f}(x-p)={{\mathrm{bprox}}}_{\varphi ^*}^{f^*}\big ({\nabla f}(x)\big )\end{aligned}$$
(3.7)
$$\begin{aligned}&\Leftrightarrow x-p={\nabla f}^*\big ({{\mathrm{bprox}}}_{\varphi ^*}^{f^*} \big ({\nabla f}(x)\big )\big ). \end{aligned}$$
(3.8)

(iii): Set \(p={{\mathrm{aprox}}}^f_\varphi x\). As seen in (3.7) and (3.6),

$$\begin{aligned} {{\mathrm{bprox}}}_{\varphi ^*}^{f^*}\big ({\nabla f}(x)\big )={\nabla f}(x-p)\in \partial \varphi (p). \end{aligned}$$
(3.9)

Hence, the Fenchel-Young identity yields

$$\begin{aligned} \langle {{p},{{{\mathrm{bprox}}}_{\varphi ^*}^{f^*}\big ({\nabla f}(x)\big )}}\rangle&= \langle {{p},{{\nabla f}(x-p)}}\rangle \nonumber \\&= \varphi (p)+\varphi ^*\big ({\nabla f}(x-p)\big )\nonumber \\&= \varphi (p)+\varphi ^*\big ({{\mathrm{bprox}}}_{\varphi ^*}^{f^*} \big ({\nabla f}(x)\big )\big ). \end{aligned}$$
(3.10)

(iv): This follows at once from (iii) and (3.9). \(\square \)
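Theorem 1 can be tested numerically with a non-quadratic Legendre function. The sketch below assumes \({\fancyscript{X}}=\mathbb{R}\), \(f=|\cdot |^4/4\) (so that \({\nabla f}(t)=t^3\), \(f^*=(3/4)|\cdot |^{4/3}\), and \(\nabla f^*\) is the real cube root), and \(\varphi =|\cdot |^2/2=\varphi ^*\). The two proximity operators are computed independently by bisection from the characterizations (2.14) and (2.6); the bisection routine and all names are illustrative assumptions.

```python
import math


def cbrt(t):
    """Real cube root, i.e. the derivative of f* for f = |.|^4/4."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)


def bisect_increasing(h, lo, hi, iters=200):
    """Root of an increasing function h on [lo, hi] with h(lo) <= 0 <= h(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def aprox(x):
    """Anisotropic prox via (2.14): f'(x - p) = phi'(p), i.e. (x - p)^3 = p."""
    return bisect_increasing(lambda p: p - (x - p) ** 3, min(0.0, x), max(0.0, x))


def bprox_dual(x_star):
    """Bregman prox of phi* relative to f* via (2.6): cbrt(x*) - cbrt(s) = s."""
    return bisect_increasing(
        lambda s: s + cbrt(s) - cbrt(x_star), min(0.0, x_star), max(0.0, x_star)
    )


def theorem1_residuals(x):
    """Residuals of Theorem 1(i)-(iii) for f = |.|^4/4 and phi = |.|^2/2."""
    x_star = x ** 3                       # grad f(x)
    p, s = aprox(x), bprox_dual(x_star)
    f = lambda t: 0.25 * t ** 4
    f_conj = lambda t: 0.75 * abs(t) ** (4.0 / 3.0)
    inf_conv = 0.5 * p * p + f(x - p)     # (phi box f)(x), attained at p
    breg = f_conj(s) - f_conj(x_star) - (s - x_star) * cbrt(x_star)
    diamond = 0.5 * s * s + breg          # (phi* diamond f*)(grad f(x)), attained at s
    r1 = f(x) - (inf_conv + diamond)
    r2 = x - (p + cbrt(s))
    r3 = p * s - (0.5 * p * p + 0.5 * s * s)
    return r1, r2, r3


for x in (2.0, -1.3, 0.5):
    print(x, theorem1_residuals(x))
```

For \(x=2\), both characterizations can be solved by hand: \({{\mathrm{aprox}}}^f_\varphi x=1\) and \({{\mathrm{bprox}}}^{f^*}_{\varphi ^*}({\nabla f}(x))=1\), so that item (ii) reads \(2=1+1\).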

Remark 3

An instance of Theorem 1(iv) in which \(f\) and \(f^*\) are real-valued appears in [32, Proposition 1].

Theorem 1 provides a range of new decomposition schemes, even in the case when \({\fancyscript{X}}\) is a Hilbert space. Thus, in the following result, we obtain a new hilbertian frame decomposition principle (for background on frames and their applications, see [11]).

Corollary 1

Suppose that \({\fancyscript{X}}\) is a separable Hilbert space, let \(I\) be a countable set, and let \((e_i)_{i\in I}\) be a frame in \({\fancyscript{X}}\), i.e.,

$$\begin{aligned} ({\exists \,}\alpha \in {\left]0,+\infty \right[})({\exists \,}\beta \in {\left]0,+\infty \right[})(\forall x\in {\fancyscript{X}})\quad \alpha \Vert x\Vert ^2\le \sum _{i\in I}|\langle {{x},{e_i}}\rangle |^2\le \beta \Vert x\Vert ^2.\nonumber \\ \end{aligned}$$
(3.11)

Let \(S:{\fancyscript{X}}\rightarrow {\fancyscript{X}}:x\mapsto \sum _{i\in I} \langle {{x},{e_i}}\rangle e_i\) be the associated frame operator and let \((e^*_i)_{i\in I}=(S^{-1}e_i)_{i\in I}\) be the associated canonical dual frame. Furthermore, let \(\varphi \in \varGamma _0({\fancyscript{X}})\), let \(x\in {\fancyscript{X}}\), and set

$$\begin{aligned} a(x)=\underset{y\in {\fancyscript{X}}}{\mathrm{argmin}} \left( \varphi (y)+\frac{1}{2}\sum _{i\in I}|\langle {{x-y},{e_i}}\rangle |^2 \right) \end{aligned}$$
(3.12)

and

$$\begin{aligned} b(x)=\underset{x^*\in {\fancyscript{X}}}{\mathrm{argmin}} \left( \varphi ^*(x^*)-\langle {{x^*},{x}}\rangle +\frac{1}{2}\sum _{i\in I} |\langle {{x^*},{e^*_i}}\rangle |^2\right) . \end{aligned}$$
(3.13)

Then \(x=a(x)+\sum _{i\in I}\langle {{b(x)},{e^*_i}}\rangle e^*_i\).

Proof

Set \(f:{\fancyscript{X}}\rightarrow \mathbb{R }:x\mapsto (1/2) \sum _{i\in I}|\langle {{x},{e_i}}\rangle |^2\). It is easily seen that \(f\) is Fréchet differentiable on \({\fancyscript{X}}\) with \({\nabla f}=S\). It therefore follows from [5, Theorem 5.6] that \(f\) is essentially smooth. Now fix \(x^*\in {\fancyscript{X}}\). Since the frame operator of \((e_i^*)_{i\in I}\) is \(S^{-1}\) [11, Lemma 5.1.6], we have

$$\begin{aligned} \langle {{S^{-1}x^*},{x^*}}\rangle = \Bigg \langle \sum _{i\in I}\langle x^{*}, e_i^{*}\rangle {e_{i}^{*}},{x^*}\Bigg \rangle =\sum _{i\in I}|\langle {{x^*},{e_i^*}}\rangle |^2=2f(S^{-1}x^*). \end{aligned}$$
(3.14)

Now set \(g:{\fancyscript{X}}\rightarrow \mathbb{R }:x\mapsto f(x)-\langle {{x},{x^*}}\rangle \). Then \(g\) is a differentiable convex function and \(\nabla g:x\mapsto Sx-x^*\) vanishes at \(x=S^{-1}x^*\). Hence, using (3.14), we obtain

$$\begin{aligned} f^*(x^*)\!=\!-\min _{x\in {\fancyscript{X}}}g(x)\!=\!\langle {{S^{-1}x^*},{x^*}}\rangle -f(S^{-1}x^*) \!=\!f(S^{-1}x^*)\!=\!\frac{1}{2}\sum _{i\in I}|\langle {{x^*},{e^*_i}}\rangle |^2.\nonumber \\ \end{aligned}$$
(3.15)

Hence, as above, \(f^*\) is Fréchet differentiable on \({\fancyscript{X}}\) with \({\nabla f}^*=S^{-1}\) and, in turn, essentially smooth, which makes \(f\) essentially strictly convex [5, Theorem 5.4]. Altogether, \(f\) is a Legendre function with

$$\begin{aligned} {\mathrm{dom}}\,f={\fancyscript{X}},\quad {\mathrm{dom}}\,f^*={\fancyscript{X}},\quad {\nabla f}=S, \quad \text{ and }\quad {\nabla f}^*=S^{-1}. \end{aligned}$$
(3.16)

Moreover, it follows from (2.12), (2.22), (3.16), Lemma 4(iii), (3.12), (3.13), and (3.15) that

$$\begin{aligned} {{\mathrm{bprox}}}_{\varphi ^*}^{f^*}({\nabla f}(x))=b(x) \quad \text{ and }\quad {{\mathrm{aprox}}}_{\varphi }^{f}(x)=a(x). \end{aligned}$$
(3.17)

The result is therefore an application of Theorem 1(ii). \(\square \)
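The frame decomposition of Corollary 1 can be illustrated in \(\mathbb{R}^2\). The sketch below assumes the non-tight frame \(e_1=(1,0)\), \(e_2=(0,1)\), \(e_3=(1,1)\) and \(\varphi =(\mu /2)\Vert \cdot \Vert ^2\) with \(\mu >0\), for which \(\varphi ^*=\Vert \cdot \Vert ^2/(2\mu )\) and the minimizers in (3.12) and (3.13) solve the linear equations \((S+\mu \,{\mathrm{Id}}\,)a(x)=Sx\) and \((S^{-1}+\mu ^{-1}{\mathrm{Id}}\,)b(x)=x\). The frame, the function \(\varphi \), and the helper names are illustrative assumptions.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def matvec(M, v):
    return [dot(M[0], v), dot(M[1], v)]


def inv2(M):
    """Inverse of an invertible 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def shift(M, t):
    """Return M + t*Id."""
    return [[M[0][0] + t, M[0][1]], [M[1][0], M[1][1] + t]]


def frame_operator(frame):
    """Matrix of S: x -> sum_i <x, e_i> e_i."""
    return [[sum(e[i] * e[j] for e in frame) for j in range(2)] for i in range(2)]


def frame_decomposition(x, mu, frame):
    """Return a(x), b(x), and sum_i <b(x), e_i*> e_i* for phi = (mu/2)||.||^2."""
    S = frame_operator(frame)
    S_inv = inv2(S)
    dual = [matvec(S_inv, e) for e in frame]           # canonical dual frame
    a = matvec(inv2(shift(S, mu)), matvec(S, x))       # optimality condition of (3.12)
    b = matvec(inv2(shift(S_inv, 1.0 / mu)), x)        # optimality condition of (3.13)
    synth = [sum(dot(b, d) * d[k] for d in dual) for k in range(2)]
    return a, b, synth


FRAME = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
x0 = [1.0, -2.0]
a, b, synth = frame_decomposition(x0, 0.5, FRAME)
print([a[k] + synth[k] for k in range(2)])
```

The printed vector should reproduce `x0` up to rounding. The dual-frame sum is evaluated term by term rather than as \(S^{-1}b(x)\), so the check also exercises the fact that the frame operator of \((e_i^*)_{i\in I}\) is \(S^{-1}\).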

Remark 4

Corollary 1 can be regarded as an extension of Moreau’s decomposition principle in separable Hilbert spaces. Indeed, in the special case when \((e_i)_{i\in I}\) is an orthonormal basis in Corollary 1, we recover Proposition 3(ii).

The next application is set in uniformly convex and uniformly smooth Banach spaces.

Corollary 2

Suppose that \({\fancyscript{X}}\) is uniformly convex and uniformly smooth, let \(J\) be its duality mapping, set \(q=\Vert \cdot \Vert ^2/2\), and let \(\varphi \in \varGamma _0({\fancyscript{X}})\). Then \(q^*=\Vert \cdot \Vert ^2_*/2\) and the following hold for every \(x\in {\fancyscript{X}}\).

  1. (i)

    \(q(x)=(\varphi \,{\square }\,q)(x)+(\varphi ^*\,{\diamond }\,q^*)(Jx)\).

  2. (ii)

    \(x={{\mathrm{aprox}}}^q_\varphi x+ J^{-1}\big ({{\mathrm{bprox}}}^{q^*}_{\varphi ^*}(Jx)\big )\).

  3. (iii)

    \(\big \langle {{{{\mathrm{aprox}}}^q_\varphi x},{{{\mathrm{bprox}}}_{\varphi ^*}^{q^*}(Jx)}}\big \rangle = \varphi \big ({{\mathrm{aprox}}}^q_\varphi x\big )+ \varphi ^*\big ({{\mathrm{bprox}}}_{\varphi ^*}^{q^*}(Jx)\big )\).

  4. (iv)

    \(\big \langle {{{{\mathrm{aprox}}}^q_\varphi x},{J\big (x-{{\mathrm{aprox}}}^q_\varphi x\big )}}\big \rangle = \varphi \big ({{\mathrm{aprox}}}^q_\varphi x\big )+ \varphi ^*\big (J\big (x-{{\mathrm{aprox}}}^q_\varphi x\big )\big )\).

Proof

This is an application of Theorem 1 with \(f=q\). Indeed, \({\mathrm{dom}}\,f={\fancyscript{X}}\), \({\mathrm{dom}}\,f^*={\fancyscript{X}}^*\), and \({\nabla f}=J\). \(\square \)

In particular, if \({\fancyscript{X}}\) is a Hilbert space in Corollary 2, it follows from Remark 1(ii) and Remark 2 that we recover Moreau’s decomposition principle (Proposition 3) and a fortiori Propositions 1 and 2. Another noteworthy instance of Corollary 2 is when \(\varphi =\iota _K\), where \(K\) is a nonempty closed convex cone in \({\fancyscript{X}}\). In this case, \(\varphi ^*=\iota _{K^\ominus }\), \({{\mathrm{aprox}}}^q_\varphi =P_K\), and we derive from (1.4) and (1.5) that \({{\mathrm{bprox}}}^q_\varphi =\varPi _K\) and, by the same computation in \({\fancyscript{X}}^*\), that \({{\mathrm{bprox}}}^{q^*}_{\varphi ^*}=\varPi _{K^\ominus }\). Hence, Corollary 2(ii) and (iii) yield Proposition 4.
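The cone case can be explored numerically in a simple non-hilbertian setting. The sketch below assumes \({\fancyscript{X}}=\mathbb{R}^2\) equipped with the \(\ell ^4\) norm (so the dual norm is the \(\ell ^{4/3}\) norm) and \(K\) the nonnegative orthant, for which \(P_K\) is the componentwise positive part and \(K^\ominus \) is the nonpositive orthant. It forms the candidate \(J(x-P_Kx)\) for \(\varPi _{K^\ominus }(Jx)\) suggested by Proposition 4(i) and compares the objective of (1.4), written in the dual space, against a grid of points of \(K^\ominus \). The space, the cone, the grid, and all names are illustrative assumptions.

```python
import math

P = 4.0
Q = P / (P - 1.0)  # conjugate exponent: the dual norm is the l^Q norm


def norm(v, r):
    return sum(abs(t) ** r for t in v) ** (1.0 / r)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def duality_map(v, r):
    """Duality mapping of (R^n, ||.||_r): ||v||^(2-r) * |v_i|^(r-1) * sign(v_i)."""
    n = norm(v, r)
    if n == 0.0:
        return [0.0 for _ in v]
    return [n ** (2.0 - r) * math.copysign(abs(t) ** (r - 1.0), t) for t in v]


def metric_projection_orthant(x):
    """P_K x for K = nonnegative orthant; the l^P distance is separable."""
    return [max(t, 0.0) for t in x]


def gen_proj_objective(y_star, x):
    """Objective of (1.4) in the dual space at Jx, minus the constant ||Jx||_*^2."""
    return -2.0 * dot(x, y_star) + norm(y_star, Q) ** 2


x = [3.0, -2.0]
pk = metric_projection_orthant(x)
candidate = duality_map([a - b for a, b in zip(x, pk)], P)  # J(x - P_K x)
back = duality_map(candidate, Q)  # the duality mapping of the dual space is J^{-1}
grid = [(-0.05 * i, -0.05 * j) for i in range(101) for j in range(101)]
grid_min = min(gen_proj_objective(y, x) for y in grid)
print(candidate, dot(pk, candidate), gen_proj_objective(candidate, x), grid_min)
```

No grid point of \(K^\ominus \) should achieve a smaller objective value than the candidate, and the printed pairing \(\langle {{P_Kx},{\varPi _{K^\ominus }(Jx)}}\rangle \) should vanish, in agreement with Proposition 4(ii).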

Remark 5

Consider the setting of Theorem 1 and set \(A=\partial \varphi \). Then, by Rockafellar’s theorem, \(A\) is a maximally monotone operator [33, Theorem 3.1.11]. Moreover, it follows from (2.14), Lemma 4(iii), and (2.6) that we can rewrite Theorem 1(ii) as

$$\begin{aligned} x=({\mathrm{Id}}\,+\nabla f^*\circ A)^{-1}x+ \nabla f^*\circ \big ({\nabla f}^*+A^{-1}\big )^{-1}x, \end{aligned}$$
(3.18)

where \({\mathrm{Id}}\,\) is the identity operator on \({\fancyscript{X}}\). The results of [6, Section 3.3] suggest that this decomposition holds for more general maximally monotone operators \(A:{\fancyscript{X}}\rightarrow 2^{{\fancyscript{X}}^*}\). If \({\fancyscript{X}}\) is a Hilbert space and \(f=\Vert \cdot \Vert ^2/2\), (3.18) yields the well-known resolvent identity \({\mathrm{Id}}\,=({\mathrm{Id}}\,+A)^{-1}+({\mathrm{Id}}\,+A^{-1})^{-1}\), which is true for any maximally monotone operator \(A\) [7, Proposition 23.18].
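The resolvent identity is easy to check for an operator that is monotone but not a subdifferential. The sketch below assumes \({\fancyscript{X}}=\mathbb{R}^2\) and the linear operator with matrix \(A=\begin{pmatrix}1&2\\ -2&1\end{pmatrix}\), which satisfies \(\langle {{x},{Ax}}\rangle =\Vert x\Vert ^2\ge 0\) and is not symmetric. The matrix and the helper names are illustrative assumptions, and \(A\) is taken invertible so that \(A^{-1}\) is again a matrix.

```python
def inv2(M):
    """Inverse of an invertible 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def add(M, N):
    return [[M[i][j] + N[i][j] for j in range(2)] for i in range(2)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0], [-2.0, 1.0]]


def resolvent_sum(A):
    """Return (Id + A)^{-1} + (Id + A^{-1})^{-1}, which should equal Id."""
    res_a = inv2(add(IDENTITY, A))            # resolvent of A
    res_a_inv = inv2(add(IDENTITY, inv2(A)))  # resolvent of A^{-1}
    return add(res_a, res_a_inv)


print(resolvent_sum(A))
```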