Abstract
We propose a flexible approach for computing the resolvent of the sum of weakly monotone operators in real Hilbert spaces. This relies on splitting methods for which strong convergence is guaranteed. We also prove linear convergence under a Lipschitz continuity assumption. The approach is then applied to computing the proximity operator of the sum of weakly convex functions, and in particular to finding the best approximation to the intersection of convex sets.
1 Introduction
In this paper, we explore a straightforward path to the problem of computing the resolvent of the sum of two (not necessarily monotone) operators using resolvents of individual operators. When applied to normal cones of convex sets, this computation solves the best approximation problem of finding the projection onto the intersection of these sets.
In general, computations involving two or more operators simultaneously are usually difficult. One popular approach is to treat each operator individually, then use these calculations to construct the desired answer. Prominent examples of such a splitting strategy include the Douglas–Rachford algorithm [9, 11] and the Peaceman–Rachford algorithm [12], which apply to the problem of finding a zero of the sum of maximally monotone operators. In [3], the authors proposed an extension of Dykstra’s algorithm [10] for constructing the resolvent of the sum of two maximally monotone operators. By a product space reformulation, this problem was then handled in [5] for finitely many operators. Recently, the so-called averaged alternating modified reflections algorithm was used in [2] to study this problem, and was soon after re-derived in [1] from the viewpoint of the proximal and resolvent average. Because computing the resolvent of a finite sum of operators can be transformed into that of the sum of two operators by a standard product space setting, as done in [2, 5], we will focus on the case of two operators for simplicity.
The goal of this paper is to provide a flexible approach for computing the resolvent of the sum of two weakly monotone operators from individual resolvents. Our work extends and complements recent results in this direction. We also present applications to computing the proximity operator of the sum of two weakly convex functions and to finding the best approximation to the intersection of two convex sets.
The paper is organized as follows. In Sect. 2, we provide the necessary preliminaries. Section 3 contains our main results. Finally, applications are presented in Sect. 4.
2 Preparation
We assume throughout that X is a real Hilbert space with inner product \(\left\langle {\cdot },{\cdot } \right\rangle \) and induced norm \(\Vert \cdot \Vert \). The set of nonnegative integers is denoted by \(\mathbb {N}\), the set of real numbers by \(\mathbb {R}\), the set of nonnegative real numbers by \({\mathbb {R}}_+:= \{{x \in \mathbb {R}}~\big |~{x \ge 0}\}\), and the set of the positive real numbers by \({\mathbb {R}}_{++}:= \{{x \in \mathbb {R}}~\big |~{x >0}\}\). The notation \(A:X\rightrightarrows X\) indicates that A is a set-valued operator on X.
Given an operator A on X, its domain is denoted by \({\text {dom}}A :=\{{x\in X}~\big |~{Ax\ne \varnothing }\}\), its range by \({\text {ran}}A :=A(X)\), its graph by \({\text {gra}}A :=\{{(x,u)\in X\times X}~\big |~{u\in Ax}\}\), its set of zeros by \({\text {zer}}A :=\{{x\in X}~\big |~{0\in Ax}\}\), and its fixed point set by \({\text {Fix}}A :=\{{x\in X}~\big |~{x\in Ax}\}\). The inverse of A, denoted by \(A^{-1}\), is the operator with graph \({\text {gra}}A^{-1} :=\{{(u,x)\in X\times X}~\big |~{u\in Ax}\}\). Recall from [8, Definition 3.1] that an operator \(A:X\rightrightarrows X\) is said to be \(\alpha \)-monotone if \(\alpha \in \mathbb {R}\) and
$$\begin{aligned} \forall (x,u)\in {\text {gra}}A,\ \forall (y,v)\in {\text {gra}}A,\quad \left\langle {x-y},{u-v} \right\rangle \ge \alpha \Vert x-y\Vert ^2. \end{aligned}$$(1)
In this case, we say that A is monotone if \(\alpha =0\), strongly monotone if \(\alpha >0\), and weakly monotone if \(\alpha <0\). The operator A is said to be maximally \(\alpha \)-monotone if it is \(\alpha \)-monotone and there is no \(\alpha \)-monotone operator \(B:X\rightrightarrows X\) such that \({\text {gra}}B\) properly contains \({\text {gra}}A\). It is worth mentioning that if A is maximally \(\alpha \)-monotone with \(\alpha \in {\mathbb {R}}_+\), then it is maximally monotone (see [8, Section 3]). It is also clear that A is (resp. maximally) \(\alpha \)-monotone if and only if \(A-\alpha {\text {Id}}\) is (resp. maximally) monotone, where \({\text {Id}}\) is the identity operator.
The resolvent of \(A:X\rightrightarrows X\) is defined by \(J_A:= ({\text {Id}}+ A)^{-1}\). We conclude this section with an elementary formula for computing the resolvent of a special composition via the resolvents of its components.
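To make the definition concrete, here is a small numerical sketch (ours, not from the paper): for the linear operator \(A =M\) with M symmetric positive semidefinite, hence maximally monotone, computing \(J_{\gamma A}\) amounts to solving a linear system.

```python
import numpy as np

# Illustrative sketch (not from the paper): if A is the linear operator
# x -> M x with M symmetric positive semidefinite (hence maximally monotone),
# then J_{gamma A}(x) = (Id + gamma M)^{-1} x is single-valued with full
# domain and reduces to solving a linear system.

def resolvent_linear(M, gamma, x):
    """Compute J_{gamma M}(x) = (I + gamma M)^{-1} x."""
    return np.linalg.solve(np.eye(len(x)) + gamma * M, x)

M = np.array([[2.0, 0.0], [0.0, 1.0]])
x = np.array([3.0, 4.0])
z = resolvent_linear(M, 0.5, x)
# z satisfies the defining inclusion x in (Id + gamma A)(z), here an equality:
print(np.allclose(z + 0.5 * (M @ z), x))  # True
```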
Proposition 2.1
(Resolvent of composition) Let \(A:X\rightrightarrows X\), \(\alpha \in \mathbb {R}\), \(q, r\in X\), \(\theta \in {\mathbb {R}}_{++}\), and \(\sigma \in \mathbb {R}\). Define \(\bar{A} :=A\circ (\theta {\text {Id}}-q)+\sigma {\text {Id}}-r\) and let \(\gamma \in {\mathbb {R}}_{++}\). Then the following hold:
- (i)
A is (resp. maximally) \(\alpha \)-monotone if and only if \(\bar{A}\) is (resp. maximally) \((\theta \alpha +\sigma )\)-monotone.
- (ii)
If \(1+\gamma \sigma \ne 0\), then
$$\begin{aligned} J_{\gamma {\bar{A}}} =\frac{1}{\theta }\left( J_{\frac{\gamma \theta }{1+\gamma \sigma }A}\circ \left( \frac{\theta }{1+\gamma \sigma }{\text {Id}}+\frac{\gamma \theta }{1+\gamma \sigma }r-q\right) +q\right) ; \end{aligned}$$(2)
and if, in addition, A is maximally \(\alpha \)-monotone and \(1+\gamma (\theta \alpha +\sigma ) >0\), then \(J_{\gamma \bar{A}}\) and \(J_{\frac{\gamma \theta }{1+\gamma \sigma }A}\) are single-valued and have full domain.
Proof
(i): This is straightforward from the definition.
(ii): We note that \((\theta {\text {Id}}-q)^{-1} =\frac{1}{\theta }({\text {Id}}+q)\), that \((T-z)^{-1} =T^{-1}\circ ({\text {Id}}+z)\), and that \((\alpha T)^{-1} =T^{-1}\circ (\frac{1}{\alpha }{\text {Id}})\) for any operator T, any \(z\in X\), and any \(\alpha \in \mathbb {R}\smallsetminus \{0\}\). Using these facts yields
Since A is maximally \(\alpha \)-monotone, \(\bar{A}\) is maximally \((\theta \alpha +\sigma )\)-monotone. Now, since \(1+\gamma (\theta \alpha +\sigma ) >0\), [8, Proposition 3.4] implies the conclusion. \(\square \)
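As a quick sanity check (our illustration, with hypothetical scalar data), formula (2) can be verified numerically in one dimension with \(A =a\,{\text {Id}}\), where both sides have closed forms.

```python
import numpy as np

# Numerical check of formula (2) in 1-D with A = a*Id (our illustration).
a, theta, q, r, sigma, gamma = 1.3, 2.0, 0.7, -0.4, 0.5, 0.8

def J_Abar_direct(y):
    # Abar(z) = a*(theta*z - q) + sigma*z - r; solve y = z + gamma*Abar(z).
    return (y + gamma * (a * q + r)) / (1 + gamma * (a * theta + sigma))

def J_Abar_via_2(y):
    # Right-hand side of (2), using J_{mu A}(w) = w/(1 + mu*a) for A = a*Id.
    mu = gamma * theta / (1 + gamma * sigma)
    w = theta / (1 + gamma * sigma) * y + mu * r - q
    return (w / (1 + mu * a) + q) / theta

y = 1.9
print(np.isclose(J_Abar_direct(y), J_Abar_via_2(y)))  # True
```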
3 Main results
In this section, let \(A, B:X\rightrightarrows X\), \(\omega \in {\mathbb {R}}_{++}\), and \(r\in X\). We present a flexible approach to the computation of the resolvent at r of the scaled sum \(\omega (A+B)\), that is, to \(J_{\omega (A+B)}(r) =\left( {\text {Id}}+\omega (A+B)\right) ^{-1}r\).
Our analysis relies on the observation that this problem can be reformulated into the problem of finding a zero of the sum of two suitable operators. Indeed, when \(r\in {\text {dom}}J_{\omega (A+B)} ={\text {ran}}\left( {\text {Id}}+\omega (A+B)\right) \), we have by definition that
$$\begin{aligned} x\in J_{\omega (A+B)}(r) \iff r\in x+\omega (Ax+Bx) \iff 0\in Ax+Bx+\frac{1}{\omega }x-\frac{1}{\omega }r. \end{aligned}$$
By writing \(\frac{1}{\omega } =\sigma +\tau \) and \(\frac{1}{\omega }r =r_A+r_B\), the last inclusion is equivalent to
$$\begin{aligned} 0\in (Ax+\sigma x-r_A)+(Bx+\tau x-r_B), \end{aligned}$$
which leads to finding a zero of the sum of two new operators \(A+\sigma {\text {Id}}-r_A\) and \(B+\tau {\text {Id}}-r_B\).
Based on the above observation, we proceed with a more general formulation. Assume throughout that \(\theta \in {\mathbb {R}}_{++}\) and \(q\in X\), that \((\sigma ,\tau )\in \mathbb {R}^2\) and \((r_A,r_B)\in X^2\) satisfy
$$\begin{aligned} \sigma +\tau =\frac{\theta }{\omega } \quad \text {and}\quad r_A+r_B =\frac{1}{\omega }(q+r), \end{aligned}$$(8)
and that
$$\begin{aligned} A_\sigma :=A\circ (\theta {\text {Id}}-q)+\sigma {\text {Id}}-r_A \quad \text {and}\quad B_\tau :=B\circ (\theta {\text {Id}}-q)+\tau {\text {Id}}-r_B. \end{aligned}$$(9)
Now, we will derive the formula for the resolvent of the scaled sum via zeros of the sum of these newly defined operators.
Proposition 3.1
(Resolvent via zeros of sum of operators) Suppose that \(r\in {\text {ran}}\left( {\text {Id}}+\omega (A+B)\right) \). Then
$$\begin{aligned} J_{\omega (A+B)}(r) =\theta {\text {zer}}(A_\sigma +B_\tau )-q. \end{aligned}$$
Consequently, if \(A_\sigma +B_\tau \) is strongly monotone, then \(J_{\omega (A+B)}(r)\) and \({\text {zer}}(A_\sigma +B_\tau )\) are singletons.
Proof
By assumption, \(J_{\omega (A+B)}(r)\ne \varnothing \). For every \(z\in X\), we derive from (8) and (9) that
The remaining conclusion follows from [4, Proposition 23.35]. \(\square \)
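The correspondence of Proposition 3.1 can be tested numerically; the following scalar sketch (ours, with \(\theta =1\) and \(q =0\)) compares \(J_{\omega (A+B)}(r)\) with the zero of \(A_\sigma +B_\tau \) for \(A =a\,{\text {Id}}\) and \(B =b\,{\text {Id}}\).

```python
import numpy as np

# Scalar sanity check of Proposition 3.1 (our illustration, theta = 1, q = 0):
# A = a*Id, B = b*Id, with the splits 1/omega = sigma + tau, r/omega = rA + rB.
a, b, omega, r = 2.0, 1.0, 0.8, 5.0
sigma, tau = 0.4, 1 / omega - 0.4        # any split of 1/omega
rA, rB = 1.0, r / omega - 1.0            # any split of r/omega

# J_{omega(A+B)}(r): solve r = x + omega*(a + b)*x.
resolvent = r / (1 + omega * (a + b))

# zer(A_sigma + B_tau): solve (a + sigma)x - rA + (b + tau)x - rB = 0.
zero = (rA + rB) / (a + sigma + b + tau)

print(np.isclose(resolvent, zero))  # True
```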
The new operators \(A_\sigma \) and \(B_\tau \), along with Proposition 3.1, allow for flexibility in choosing \((\sigma ,\tau )\) and \((r_A,r_B)\): one can decide the values of these parameters as long as (8) is satisfied. We are now ready for our main result.
Theorem 3.2
(Resolvent of sum of \(\alpha \)- and \(\beta \)-monotone operators) Suppose that A and B are respectively maximally \(\alpha \)- and \(\beta \)-monotone with \(\alpha +\beta >-1/\omega \), that \(r\in {\text {ran}}\left( {\text {Id}}+\omega (A+B)\right) \), and that \((\sigma ,\tau )\) satisfies
$$\begin{aligned} \theta \alpha +\sigma >0 \quad \text {and}\quad \theta \beta +\tau \ge 0. \end{aligned}$$(12)
Let \(\gamma \in {\mathbb {R}}_{++}\) be such that \(1+\gamma \sigma \ne 0\) and \(1+\gamma \tau \ne 0\). Given any \(\kappa \in \left. \right] 0,1 \left. \right] \) and \(x_0\in X\), define the sequence \((x_n)_{n\in {\mathbb {N}}}\) by
$$\begin{aligned} x_{n+1} :=(1-\kappa )x_n+\kappa (2J_{\gamma B_\tau }-{\text {Id}})(2J_{\gamma A_\sigma }-{\text {Id}})x_n \end{aligned}$$(13)
with explicit formulas
$$\begin{aligned} J_{\gamma A_\sigma } =\frac{1}{\theta }\left( J_{\frac{\gamma \theta }{1+\gamma \sigma }A}\circ \left( \frac{\theta }{1+\gamma \sigma }{\text {Id}}+\frac{\gamma \theta }{1+\gamma \sigma }r_A-q\right) +q\right) \quad \text {and}\quad J_{\gamma B_\tau } =\frac{1}{\theta }\left( J_{\frac{\gamma \theta }{1+\gamma \tau }B}\circ \left( \frac{\theta }{1+\gamma \tau }{\text {Id}}+\frac{\gamma \theta }{1+\gamma \tau }r_B-q\right) +q\right) . \end{aligned}$$(14)
Then \(J_{\omega (A+B)}(r) =J_{\frac{\gamma \theta }{1+\gamma \sigma }A}\left( \frac{\theta }{1+\gamma \sigma }\overline{x}+\frac{\gamma \theta }{1+\gamma \sigma }r_A-q\right) \) with \(\overline{x}\in {\text {Fix}}(2J_{\gamma B_\tau }-{\text {Id}})\circ (2J_{\gamma A_\sigma }-{\text {Id}})\) and the following hold:
- (i)
\(\left( J_{\frac{\gamma \theta }{1+\gamma \sigma }A}\left( \frac{\theta }{1+\gamma \sigma }x_n+\frac{\gamma \theta }{1+\gamma \sigma }r_A-q\right) \right) _{n\in {\mathbb {N}}}\) converges strongly to \(J_{\omega (A+B)}(r)\).
- (ii)
If \(\kappa <1\), then \((x_n)_{n\in {\mathbb {N}}}\) converges weakly to \(\overline{x}\).
- (iii)
If A is Lipschitz continuous, then the convergences in (i) and (ii) are linear.
Proof
We first note that the existence of \((\sigma ,\tau )\in \mathbb {R}^2\) satisfying (8) and (12) is ensured since \(\alpha +\beta >-1/\omega \). By Proposition 2.1(i) and (12), \(A_\sigma \) and \(B_\tau \) are respectively maximally \((\theta \alpha +\sigma )\)- and \((\theta \beta +\tau )\)-monotone with \(\theta \alpha +\sigma >0\) and \(\theta \beta +\tau \ge 0\), hence, by Proposition 2.1(ii), \(J_{\gamma A_\sigma }\) and \(J_{\gamma B_\tau }\) are single-valued and have full domain. We also see that \(A_\sigma \) and \(B_\tau \) are maximally monotone and that \(A_\sigma \) and \(A_\sigma +B_\tau \) are strongly monotone. Using Proposition 3.1 and [4, Proposition 26.1(iii)(b)], we have \({\text {zer}}(A_\sigma +B_\tau ) =\{J_{\gamma A_\sigma }(\overline{x})\}\) with \(\overline{x}\in {\text {Fix}}(2J_{\gamma B_\tau }-{\text {Id}})\circ (2J_{\gamma A_\sigma }-{\text {Id}})\) and \(J_{\omega (A+B)}(r) =\theta J_{\gamma A_\sigma }(\overline{x}) -q\).
Now, Proposition 2.1(ii) implies (14), which yields
Therefore,
(i): By applying [4, Theorem 26.11(vi)(b)] with all \(\lambda _n =\kappa \) if \(\kappa <1\) and applying [4, Proposition 26.13] if \(\kappa =1\), we obtain that \(J_{\gamma A_\sigma }(x_n)\rightarrow J_{\gamma A_\sigma }(\overline{x})\). Now combine with (15) and (16).
(ii): Again apply [4, Theorem 26.11] with all \(\lambda _n =\kappa \).
(iii): Assume that A is Lipschitz continuous with constant \(\ell \). It is straightforward to see that \(A_\sigma \) is Lipschitz continuous with constant \((\theta \ell +|\sigma |)\). The conclusion follows from [8, Theorem 4.8] with \(\lambda =\mu =2\) and \(\delta =\gamma \). \(\square \)
Remark 3.3
Some remarks regarding Theorem 3.2 are in order.
- (i)
Under the assumptions made, \(A+B\) is \((\alpha +\beta )\)-monotone but not necessarily maximal. If, in addition, \(A+B\) is indeed maximally \((\alpha +\beta )\)-monotone, then \(J_{\omega (A+B)}\) has full domain by [8, Proposition 3.4(ii)]; thus, the condition \(r\in {\text {ran}}\left( {\text {Id}}+\omega (A+B)\right) \) can be removed.
- (ii)
The iterative scheme (13) is the Douglas–Rachford algorithm if \(\kappa =1/2\) and the Peaceman–Rachford algorithm if \(\kappa =1\). For a more general version of (13), we refer the reader to [8]; see also [6, 7].
- (iii)
If the condition (12) is replaced by
$$\begin{aligned} \theta \alpha +\sigma \ge 0 \quad \text {and}\quad \theta \beta +\tau >0, \end{aligned}$$(17)
then the conclusions in Theorem 3.2(ii)–(iii) still hold, while Theorem 3.2(i) only holds for \(\kappa <1\); see also [5, Theorem 2.1(ii) and Remark 2.2(iv)].
- (iv)
One can simply choose \(\theta =1\) and \(q =0\), in which case, (14) reduces to
$$\begin{aligned} J_{\gamma A_\sigma } =J_{\frac{\gamma }{1+\gamma \sigma }A}\circ \frac{1}{1+\gamma \sigma }({\text {Id}}+\gamma r_A) \quad \text {and}\quad J_{\gamma B_\tau } =J_{\frac{\gamma }{1+\gamma \tau }B}\circ \frac{1}{1+\gamma \tau }({\text {Id}}+\gamma r_B). \end{aligned}$$(18)
- (v)
When A and B are maximally monotone, i.e., \(\alpha =\beta =0\), (12) is satisfied whenever \(\sigma >0\) and \(\tau \ge 0\). One thus can choose for instance \(\sigma =\tau =\frac{\theta }{2\omega }\).
- (vi)
It is always possible to find \(\gamma \in {\mathbb {R}}_{++}\) satisfying even the stronger conditions \(1+\gamma \sigma >0\) and \(1+\gamma \tau >0\). In fact, these inequalities hold for every \(\gamma \in {\mathbb {R}}_{++}\) whenever \(\sigma \) and \(\tau \) are both nonnegative.
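To illustrate Theorem 3.2, the following sketch (ours, not from the paper) runs iteration (13) for scalar maximally monotone operators \(A =a\,{\text {Id}}\) and \(B =b\,{\text {Id}}\) with the simple choices of Remark 3.3(iv)–(v): \(\theta =1\), \(q =0\), \(\sigma =\tau =\frac{1}{2\omega }\), \(r_A =r_B =\frac{r}{2\omega }\); by (18), the shadow sequence of Theorem 3.2(i) is then \(J_{\gamma A_\sigma }(x_n)\).

```python
import numpy as np

# Our numerical sketch of iteration (13) for A = a*Id, B = b*Id (both maximally
# monotone), with theta = 1, q = 0, sigma = tau = 1/(2*omega), rA = rB = r/(2*omega).
a, b, omega, r = 2.0, 1.0, 0.8, 5.0
sigma = tau = 1 / (2 * omega)
rA = rB = r / (2 * omega)
gamma, kappa = 1.0, 0.5                  # kappa = 1/2: Douglas-Rachford

def J_A_sigma(x):                        # resolvent of gamma*(A + sigma*Id - rA)
    return (x + gamma * rA) / (1 + gamma * (a + sigma))

def J_B_tau(x):                          # resolvent of gamma*(B + tau*Id - rB)
    return (x + gamma * rB) / (1 + gamma * (b + tau))

x = 0.0
for _ in range(200):
    y = 2 * J_A_sigma(x) - x             # reflected resolvent of A_sigma
    x = (1 - kappa) * x + kappa * (2 * J_B_tau(y) - y)

shadow = J_A_sigma(x)                    # shadow sequence of Theorem 3.2(i)
target = r / (1 + omega * (a + b))       # J_{omega(A+B)}(r) since A+B = (a+b)*Id
print(np.isclose(shadow, target))        # True
```

Since \(A_\sigma \) here is strongly monotone and Lipschitz continuous, the convergence is linear, so 200 iterations reach machine precision.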
When A and B are maximally monotone, the following result gives an iterative method for computing the resolvent of \(A+B\) where each iteration relies only on the computations of \(J_A\) and \(J_B\).
Theorem 3.4
(Resolvent of sum of two maximally monotone operators) Suppose that A and B are maximally monotone, that \(\omega >1/2\), and that \(r\in {\text {ran}}\left( {\text {Id}}+\omega (A+B)\right) \). Define
Let \(\kappa \in \left. \right] 0,1\left. \right] \), let \(x_0\in X\), and define the sequence \((x_n)_{n\in {\mathbb {N}}}\) by
with explicit formulas
Then \(J_{\omega (A+B)}(r) =J_A\left( (1-\frac{1}{2\omega })(\theta \overline{x}-q)+\frac{1}{2\omega }r\right) \) with \(\overline{x}\in {\text {Fix}}(2J_{\bar{B}}-{\text {Id}})\circ (2J_{\bar{A}}-{\text {Id}})\) and the following hold:
- (i)
\(\left( J_A\left( (1-\frac{1}{2\omega })(\theta x_n-q)+\frac{1}{2\omega }r\right) \right) _{n\in {\mathbb {N}}}\) converges strongly to \(J_{\omega (A+B)}(r)\).
- (ii)
If \(\kappa <1\), then \((x_n)_{n\in {\mathbb {N}}}\) converges weakly to \(\overline{x}\).
- (iii)
If A is Lipschitz continuous, then the convergences in (i) and (ii) are linear.
Proof
Choosing
we have that (8) is satisfied and that
which yields \(\gamma A_\sigma =\bar{A}\) and \(\gamma B_\tau =\bar{B}\). Since \(1+\gamma \sigma =1+\gamma \theta /(2\omega ) =2\omega /(2\omega -1) =\gamma \theta \), we get (21) from (14). Now apply Theorem 3.2 with \(\alpha =\beta =0\). \(\square \)
Having this freedom of choice, one can certainly use appropriate parameters to obtain simpler formulations. The following corollary illustrates one such instance.
Corollary 3.5
Suppose that A is maximally \(\alpha \)-monotone with \(\alpha >-1/(2\omega )\), that B is maximally \(\beta \)-monotone with \(\beta \ge -1/(2\omega )\), and that \(r\in {\text {ran}}\left( {\text {Id}}+\omega (A+B)\right) \). Let \(\eta \in {\mathbb {R}}_{++}\), \(\kappa \in \left. \right] 0,1\left. \right] \), \(x_0\in X\), and define the sequence \((x_n)_{n\in {\mathbb {N}}}\) by
Then \(J_{\omega (A+B)}(r) =J_{\omega A}(\frac{1}{2\eta }\overline{x})\) with \(\overline{x}\in {\text {Fix}}\left( 2\eta J_{\omega B}\circ \frac{1}{2\eta }{\text {Id}}+2\eta r -{\text {Id}}\right) \circ \left( 2\eta J_{\omega A}\circ \frac{1}{2\eta }{\text {Id}}+2\eta r -{\text {Id}}\right) \) and the following hold:
- (i)
\(\left( J_{\omega A}(\frac{1}{2\eta }x_n)\right) _{n\in {\mathbb {N}}}\) converges strongly to \(J_{\omega (A+B)}(r)\).
- (ii)
If \(\kappa <1\), then \((x_n)_{n\in {\mathbb {N}}}\) converges weakly to \(\overline{x}\).
- (iii)
If A is Lipschitz continuous, then the above convergences are linear.
Proof
We first see that \(\alpha +\beta >-1/\omega \). Now choose
Then (8) and (12) are satisfied, while \(\gamma \theta =2\omega \) and \(1+\gamma \sigma =1+\gamma \tau =2\). We have that
Noting from (14) that
we get the conclusion due to Theorem 3.2. \(\square \)
Again thanks to the flexibility of the parameters, our results recapture the formulation and convergence analysis of recent methods. In particular, Corollaries 3.6 and 3.7 are in the spirit of [2, Theorem 3.1] and [1, Theorem 3.2], respectively.
Corollary 3.6
Let \(\eta \in \left. \right] 0,1\left[ \right. \) and \(\gamma \in {\mathbb {R}}_{++}\). Suppose that A and B are maximally monotone and that \(r\in {\text {ran}}\left( {\text {Id}}+\frac{\gamma }{2(1-\eta )}(A+B)\right) \). Let \(\kappa \in \left. \right] 0,1 \left. \right] \), let \(x_0\in X\), and define the sequence \((x_n)_{n\in {\mathbb {N}}}\) by
Then \(J_{\frac{\gamma }{2(1-\eta )}(A+B)}(r) =J_{\gamma A}(\overline{x}+r)\) with \(\overline{x}\in {\text {Fix}}(2\eta J_{\gamma B}({\text {Id}}+r) -2\eta r -{\text {Id}})\circ (2\eta J_{\gamma A}({\text {Id}}+r) -2\eta r -{\text {Id}})\) and the following hold:
- (i)
\(\left( J_{\gamma A}(x_n+r)\right) _{n\in {\mathbb {N}}}\) converges strongly to \(J_{\frac{\gamma }{2(1-\eta )}(A+B)}(r)\).
- (ii)
If \(\kappa <1\), then \((x_n)_{n\in {\mathbb {N}}}\) converges weakly to \(\overline{x}\).
- (iii)
If A is Lipschitz continuous, then the above convergences are linear.
Proof
Let \(\omega =\frac{\gamma }{2(1-\eta )}\), \(\theta =\frac{1}{\eta }\), \(q =-r\), \(\sigma =\tau =\frac{\theta }{2\omega } =\frac{1-\eta }{\gamma \eta }\), and \(r_A =r_B =0\). Then (8) is satisfied,
Noting that \(1+\gamma \sigma =1+\gamma \tau =1\!/\eta =\theta \), we have from (14) that
Applying Theorem 3.2 with \(\alpha =\beta =0\) completes the proof. \(\square \)
Corollary 3.7
Suppose that A and B are maximally monotone and that \(A+B\) is also maximally monotone. Let \(\eta \in \left. \right] 0,1\left[ \right. \), \(\kappa \in \left. \right] 0,1 \left. \right] \), \(x_0\in X\), and define the sequence \((x_n)_{n\in {\mathbb {N}}}\) by
Then \(J_{\frac{1}{2(1-\eta )}(A+B)}(r) =J_A(\overline{x})\) with \(\overline{x}\in {\text {Fix}}(2\eta J_B +2(1-\eta )r -{\text {Id}})\circ (2\eta J_A +2(1-\eta )r -{\text {Id}})\) and the following hold:
- (i)
\((J_A(x_n))_{n\in {\mathbb {N}}}\) converges strongly to \(J_{\frac{1}{2(1-\eta )}(A+B)}(r)\).
- (ii)
If \(\kappa <1\), then \((x_n)_{n\in {\mathbb {N}}}\) converges weakly to \(\overline{x}\).
- (iii)
If A is Lipschitz continuous, then the above convergences are linear.
Proof
Apply Theorem 3.4 with \(\omega =\frac{1}{2(1-\eta )}\), \(\theta =\frac{1}{\eta }\), and \(q =\frac{1-\eta }{\eta }r =\frac{1}{2\omega -1}r\) and note that \(J_{\frac{1}{2(1-\eta )}(A+B)}\) has full domain due to Remark 3.3(i). \(\square \)
4 Applications
In this section, we provide transparent applications of our results to computing the proximity operator of the sum of two weakly convex functions and to finding the closest point in the intersection of closed convex sets.
We recall that a function \(f:X\rightarrow ]-\infty ,+\infty ]\) is proper if \({\text {dom}}f :=\{{x\in X}~\big |~{f(x) <+\infty }\}\ne \varnothing \) and lower semicontinuous if \(\forall x\in {\text {dom}}f\), \(f(x)\le \liminf _{z\rightarrow x} f(z)\). The function f is said to be \(\alpha \)-convex for some \(\alpha \in \mathbb {R}\) if \(\forall x,y\in {\text {dom}}f\), \(\forall \kappa \in \left. \right] 0,1\left[ \right. \),
$$\begin{aligned} f(\kappa x+(1-\kappa )y)+\frac{\alpha }{2}\kappa (1-\kappa )\Vert x-y\Vert ^2 \le \kappa f(x)+(1-\kappa )f(y). \end{aligned}$$
We say that f is convex if \(\alpha =0\), strongly convex if \(\alpha >0\), and weakly convex if \(\alpha <0\).
Let \(f:X\rightarrow ]-\infty ,+\infty ]\) be proper. The Fréchet subdifferential of f at x is given by
$$\begin{aligned} \widehat{\partial }f(x) :=\Big \{{u\in X}~\big |~{\liminf _{z\rightarrow x}\frac{f(z)-f(x)-\left\langle {u},{z-x} \right\rangle }{\Vert z-x\Vert }\ge 0}\Big \}. \end{aligned}$$
It is known that if f is differentiable at x, then \(\widehat{\partial }f(x) =\{\nabla f(x)\}\), and that if f is convex, then the Fréchet subdifferential coincides with the convex subdifferential, i.e.,
$$\begin{aligned} \widehat{\partial }f(x) =\partial f(x) :=\{{u\in X}~\big |~{\forall y\in X,\ f(y)\ge f(x)+\left\langle {u},{y-x} \right\rangle }\}. \end{aligned}$$
The proximity operator of f with parameter \(\gamma \in {\mathbb {R}}_{++}\) is the mapping \({\text {Prox}}_{\gamma f}:X\rightrightarrows X\) defined by
$$\begin{aligned} {\text {Prox}}_{\gamma f}x :={\text {argmin}}_{z\in X}\left( f(z)+\frac{1}{2\gamma }\Vert z-x\Vert ^2\right) . \end{aligned}$$
Given a nonempty closed subset C of X, the indicator function \(\iota _C\) of C is defined by \(\iota _C(x) =0\) if \(x\in C\) and \(\iota _C(x) =+\infty \) if \(x\notin C\). It is clear that \({\text {Prox}}_{\gamma \iota _C} =P_C\), where \(P_C:X\rightrightarrows C\) is the projector onto C given by
$$\begin{aligned} P_Cx :=\{{z\in C}~\big |~{\Vert x-z\Vert =\inf _{c\in C}\Vert x-c\Vert }\}. \end{aligned}$$
If C is convex, then the normal cone to C is the operator \(N_C:X\rightrightarrows X\) defined by
$$\begin{aligned} N_Cx :={\left\{ \begin{array}{ll} \{{u\in X}~\big |~{\forall c\in C,\ \left\langle {u},{c-x} \right\rangle \le 0}\} &{}\text {if } x\in C,\\ \varnothing &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$
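As a small illustration (ours), the projector onto an interval \(C =[\mathrm {lo},\mathrm {hi}]\) is a clip, and one can check directly the defining property \(y-P_Cy\in N_C(P_Cy)\) of the projection onto a closed convex set.

```python
import numpy as np

# Sketch (ours): for C = [lo, hi], the projector is a clip, and y - P_C(y)
# lies in the normal cone N_C(P_C(y)).
lo, hi = 0.0, 1.0

def P_C(y):
    return np.clip(y, lo, hi)

def in_normal_cone(x, u):      # membership u in N_C(x) for C = [lo, hi]
    if lo < x < hi:
        return u == 0
    if x == lo:
        return u <= 0
    if x == hi:
        return u >= 0
    return False                # x outside C: N_C(x) is empty

for y in [-2.0, 0.3, 5.0]:
    p = P_C(y)
    print(in_normal_cone(p, y - p))   # True for each y
```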
Lemma 4.1
Let \(f:X\rightarrow ]-\infty ,+\infty ]\) and \(g:X\rightarrow ]-\infty ,+\infty ]\) be proper, lower semicontinuous, and respectively \(\alpha \)- and \(\beta \)-convex, let \(\omega \in {\mathbb {R}}_{++}\), and let \(r\in {\text {ran}}({\text {Id}}+\omega (\widehat{\partial }f+\widehat{\partial }g))\). Suppose that \(\alpha +\beta >-1/\omega \). Then
$$\begin{aligned} {\text {Prox}}_{\omega (f+g)}(r) =J_{\omega (\widehat{\partial }f+\widehat{\partial }g)}(r). \end{aligned}$$(38)
Consequently, if C and D are closed convex subsets of X with \(C\cap D\ne \varnothing \) and \(r\in {\text {ran}}({\text {Id}}+N_C+N_D)\), then
$$\begin{aligned} P_{C\cap D}(r) =J_{N_C+N_D}(r). \end{aligned}$$(39)
Proof
On the one hand, noting that \({\text {ran}}({\text {Id}}+\omega (\widehat{\partial }f+\widehat{\partial }g)) ={\text {dom}}J_{\omega (\widehat{\partial }f+\widehat{\partial }g)}\) and that \(\widehat{\partial }f+\widehat{\partial }g\subseteq \widehat{\partial }(f+g)\), we have for any \(p\in X\) that
On the other hand, it is straightforward from definition that \(f+g\) is \((\alpha +\beta )\)-convex. Since \(1+\omega (\alpha +\beta ) >0\), we learn from [8, Lemma 5.2] that \({\text {Prox}}_{\omega (f+g)} =J_{\omega \widehat{\partial }(f+g)}\) is single-valued and has full domain. Combining with (40) implies (38).
Now, since C and D are closed convex sets, \(\iota _C\) and \(\iota _D\) are convex functions, and therefore, \(\widehat{\partial }\iota _C =\partial \iota _C =N_C\) and \(\widehat{\partial }\iota _D =\partial \iota _D =N_D\) (see, e.g., [4, Example 16.13]). Applying (38) to \((f,g) =(\iota _C,\iota _D)\) and \(\omega =1\) yields
$$\begin{aligned} P_{C\cap D}(r) ={\text {Prox}}_{\iota _C+\iota _D}(r) =J_{N_C+N_D}(r), \end{aligned}$$
which completes the proof. \(\square \)
We now derive some applications of Theorem 3.2. In what follows, \(\theta \in {\mathbb {R}}_{++}\) and \(q\in X\).
Corollary 4.2
(Proximity operator of sum of \(\alpha \)- and \(\beta \)-convex functions) Let \(f:X\rightarrow ]-\infty ,+\infty ]\) and \(g:X\rightarrow ]-\infty ,+\infty ]\) be proper, lower semicontinuous, and respectively \(\alpha \)- and \(\beta \)-convex, let \(\omega \in {\mathbb {R}}_{++}\), and let \(r\in {\text {ran}}({\text {Id}}+\omega (\widehat{\partial }f+\widehat{\partial }g))\). Suppose that \(\alpha +\beta >-1/\omega \) and let \((\sigma ,\tau )\in \mathbb {R}^2\) and \((r_f, r_g)\in X^2\) be such that \(\sigma +\tau =\theta /\omega \), \(r_f+r_g =(q+r)/\omega \),
$$\begin{aligned} \theta \alpha +\sigma >0 \quad \text {and}\quad \theta \beta +\tau \ge 0. \end{aligned}$$
Let \(\gamma \in {\mathbb {R}}_{++}\) be such that \(1+\gamma \sigma >0\) and \(1+\gamma \tau >0\). Given any \(\kappa \in \left. \right] 0,1 \left. \right] \) and \(x_0\in X\), define the sequence \((x_n)_{n\in {\mathbb {N}}}\) by
where
Then \({\text {Prox}}_{\omega (f+g)}(r) ={\text {Prox}}_{\frac{\gamma \theta }{1+\gamma \sigma }f}\left( \frac{\theta }{1+\gamma \sigma }\overline{x}+\frac{\gamma \theta }{1+\gamma \sigma }r_f-q\right) \) with \(\overline{x}\in {\text {Fix}}R_gR_f\) and the following hold:
- (i)
\(\left( {\text {Prox}}_{\frac{\gamma \theta }{1+\gamma \sigma }f}\left( \frac{\theta }{1+\gamma \sigma }x_n+\frac{\gamma \theta }{1+\gamma \sigma }r_f-q\right) \right) _{n\in {\mathbb {N}}}\) converges strongly to \({\text {Prox}}_{\omega (f+g)}(r)\).
- (ii)
If \(\kappa <1\), then \((x_n)_{n\in {\mathbb {N}}}\) converges weakly to \(\overline{x}\).
- (iii)
If f is differentiable with Lipschitz continuous gradient, then the above convergences are linear.
Proof
By assumption, [8, Lemma 5.2] implies that \(\widehat{\partial }f\) and \(\widehat{\partial }g\) are respectively maximally \(\alpha \)- and \(\beta \)-monotone, and that
Next, Lemma 4.1 implies that \(J_{\omega (\widehat{\partial }f+\widehat{\partial }g)}(r) ={\text {Prox}}_{\omega (f+g)}(r)\). The conclusion then follows from Theorem 3.2 applied to \((A,B) =(\widehat{\partial }f,\widehat{\partial }g)\). \(\square \)
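As a concrete instance of Corollary 4.2 (our sketch, not from the paper), take \(f =|\cdot |\) and \(g =\frac{1}{2}(\cdot )^2\) on the real line with \(\omega =1\) and \(r =3\), choosing \(\theta =1\), \(q =0\), \(\sigma =\tau =\frac{1}{2}\), \(r_f =r_g =\frac{3}{2}\), and \(\gamma =1\). Both proximity operators have closed forms (soft-thresholding for f), and the limit can be compared with the directly computed minimizer of \(|z|+\frac{1}{2}z^2+\frac{1}{2}(z-3)^2\), namely \(z =1\).

```python
import numpy as np

# Our sketch of Corollary 4.2 for f = |.| and g = (.)^2/2 on the real line,
# with omega = 1, r = 3, theta = 1, q = 0, sigma = tau = 1/2, rf = rg = 3/2.
omega, r = 1.0, 3.0
sigma = tau = 1 / (2 * omega)
rf = rg = r / (2 * omega)
gamma, kappa = 1.0, 0.5
mu = gamma / (1 + gamma * sigma)         # rescaled prox parameter, here 2/3

def soft(w, t):                          # prox of t*|.| (soft-thresholding)
    return np.sign(w) * max(abs(w) - t, 0.0)

def J_f(x):                              # prox_{mu f}((x + gamma*rf)/(1 + gamma*sigma))
    return soft((x + gamma * rf) / (1 + gamma * sigma), mu)

def J_g(x):                              # prox_{mu g}(w) = w/(1 + mu) for g = (.)^2/2
    return (x + gamma * rg) / (1 + gamma * tau) / (1 + mu)

x = 0.0
for _ in range(200):
    y = 2 * J_f(x) - x
    x = (1 - kappa) * x + kappa * (2 * J_g(y) - y)

shadow = J_f(x)
# Direct computation: the minimizer of |z| + z^2/2 + (z - 3)^2/2 is z = 1.
print(np.isclose(shadow, 1.0))           # True
```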
Corollary 4.3
(Projection onto intersection of two closed convex sets) Let C and D be closed convex subsets of X with \(C\cap D\ne \varnothing \) and let \(r\in {\text {ran}}({\text {Id}}+N_C+N_D)\). Let \(\sigma \in {\mathbb {R}}_{++}\), \(\tau \in {\mathbb {R}}_+\), and \((r_C,r_D)\in X^2\) with \(r_C+r_D=(\sigma +\tau )(q+r)/\theta \). Let also \(\gamma \in {\mathbb {R}}_{++}\), \(\kappa \in \left. \right] 0,1 \left. \right] \), \(x_0\in X\), and define the sequence \((x_n)_{n\in {\mathbb {N}}}\) by
where
Then \(\left( P_C\left( \frac{\theta }{1+\gamma \sigma }x_n+\frac{\gamma \theta }{1+\gamma \sigma }r_C-q\right) \right) _{n\in {\mathbb {N}}}\) converges strongly to \(P_{C\cap D}(r) =P_C\left( \frac{\theta }{1+\gamma \sigma }\overline{x}+\frac{\gamma \theta }{1+\gamma \sigma }r_C-q\right) \) with \(\overline{x}\in {\text {Fix}}\bar{R}_D\bar{R}_C\). Furthermore, if \(\kappa <1\), then \((x_n)_{n\in {\mathbb {N}}}\) converges weakly to \(\overline{x}\).
Proof
We first derive from [4, Example 20.26] that \(N_C\) and \(N_D\) are maximally monotone and from [4, Example 23.4] that
Setting \(\omega :=\theta /(\sigma +\tau ) >0\), we note that \(r\in {\text {ran}}({\text {Id}}+N_C+N_D) ={\text {ran}}\big ({\text {Id}}+\omega (N_C+N_D)\big )\) and from Lemma 4.1 that \(J_{\omega (N_C+N_D)}(r) =J_{N_C+N_D}(r) =P_{C\cap D}(r)\). Now apply Theorem 3.2 to \((A,B) =(N_C,N_D)\). \(\square \)
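Finally, a numerical sketch of Corollary 4.3 (ours, not from the paper): in \(\mathbb {R}^2\), take \(C =\{{x}~|~{x\le (1,1)}\}\) and \(D =\{{x}~|~{x\ge 0}\}\) componentwise, so that \(C\cap D =[0,1]^2\) and the limit can be checked against a componentwise clip. We choose \(\theta =1\), \(q =0\), \(\sigma =\tau =\frac{1}{2}\) (hence \(\omega =\theta /(\sigma +\tau ) =1\)), \(r_C =r_D =\frac{r}{2}\), \(\gamma =1\), and \(\kappa =\frac{1}{2}\).

```python
import numpy as np

# Our sketch of Corollary 4.3 in R^2: C = {x <= (1,1)}, D = {x >= 0},
# theta = 1, q = 0, sigma = tau = 1/2 (so omega = 1), rC = rD = r/2.
r = np.array([2.0, -0.5])
gamma, sigma, tau, kappa = 1.0, 0.5, 0.5, 0.5
rC = rD = r / 2

def P_C(x): return np.minimum(x, 1.0)    # projector onto {x <= (1,1)}
def P_D(x): return np.maximum(x, 0.0)    # projector onto {x >= 0}

def J_C(x):                              # P_C((x + gamma*rC)/(1 + gamma*sigma))
    return P_C((x + gamma * rC) / (1 + gamma * sigma))

def J_D(x):
    return P_D((x + gamma * rD) / (1 + gamma * tau))

x = r.copy()
for _ in range(500):
    y = 2 * J_C(x) - x
    x = (1 - kappa) * x + kappa * (2 * J_D(y) - y)

approx = J_C(x)                          # converges to P_{C ∩ D}(r)
print(np.allclose(approx, np.clip(r, 0.0, 1.0)))  # True
```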
As in the proof of Corollary 3.6, if we choose \(\theta =\frac{1}{\eta }\) (with \(\eta \in \left. \right] 0,1\left[ \right. \)), \(q =-r\), \(\sigma =\tau =\frac{1-\eta }{\gamma \eta }\), and \(r_C =r_D =0\), then Corollary 4.3 reduces to [2, Corollary 3.1] where (46) reads as
Similarly, if \(\theta =\frac{1}{\eta }\) (with \(\eta \in {\mathbb {R}}_{++}\)), \(q =r\), \(\sigma =\tau =\frac{1}{\gamma }\), and \(r_C =r_D =\frac{2\eta }{\gamma }r\), then (46) is simplified to
References
Alwadani, S., Bauschke, H.H., Moursi, W.M., Wang, X.: On the asymptotic behaviour of the Aragón Artacho–Campoy algorithm. Oper. Res. Lett. 46(6), 585–587 (2018)
Aragón Artacho, F.J., Campoy, R.: Computing the resolvent of the sum of maximally monotone operators with the averaged alternating modified reflections algorithm. J. Optim. Theory Appl. (2019)
Bauschke, H.H., Combettes, P.L.: A Dykstra-like algorithm for two monotone operators. Pac. J. Optim. 4(3), 383–391 (2008)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, New York (2017)
Combettes, P.L.: Iterative construction of the resolvent of a sum of maximal monotone operators. J. Convex Anal. 16(3), 727–748 (2009)
Dao, M.N., Phan, H.M.: Linear convergence of projection algorithms. Math. Oper. Res. (2018). https://doi.org/10.1287/moor.2018.0942
Dao, M.N., Phan, H.M.: Linear convergence of the generalized Douglas–Rachford algorithm for feasibility problems. J. Glob. Optim. 72(3), 443–474 (2018)
Dao, M.N., Phan, H.M.: Adaptive Douglas–Rachford splitting algorithm for the sum of two operators (2018). arXiv:1809.00761
Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82, 421–439 (1956)
Dykstra, R.L.: An algorithm for restricted least squares regression. J. Am. Stat. Assoc. 78(384), 837–842 (1983)
Lions, P.-L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)
Peaceman, D.W., Rachford, H.H.: The numerical solution of parabolic and elliptic differential equations. J. Soc. Ind. Appl. Math. 3, 28–41 (1955)
Acknowledgements
The authors thank two anonymous referees for their careful reading and helpful comments. MND was partially supported by Discovery Project 160101537 from the Australian Research Council. HMP was partially supported by Autodesk, Inc. via a gift made to the Department of Mathematical Sciences, University of Massachusetts Lowell.
Dao, M.N., Phan, H.M. Computing the resolvent of the sum of operators with application to best approximation problems. Optim Lett 14, 1193–1205 (2020). https://doi.org/10.1007/s11590-019-01432-x