Keywords

AMS 2010 Subject Classification

13.1 Introduction

Splitting methods have become popular for solving convex optimization problems that involve finding a minimizer of the sum of two proper lower semicontinuous convex functions. Among these methods are the Douglas–Rachford and the Peaceman–Rachford methods introduced in the seminal work of Lions and Mercier [24], the forward-backward method (see, e.g., [12, 17] and [29]), Dykstra’s method (see, e.g., [3] and [10]), and the Method of Alternating Projections (MAP) (see, e.g., [19]).

When the optimization problem features the composition of one of the functions with a bounded linear operator, a popular technique is the Alternating-Direction Method of Multipliers (ADMM) (see [22, Section 4], [16, Section 10.6.4] and also [7, Chapter 15]). The method has a wide range of applications including large-scale optimization, machine learning, image processing and portfolio optimization, see, e.g., [9, 15] and [20]. A powerful framework to use ADMM in the more general setting of monotone operators is developed in the work of Briceño-Arias and Combettes [13] (see also [8] and [14]). Another relatively recent method is the Chambolle–Pock method introduced in [11].

Equivalences between splitting methods have been studied over the past four decades. For instance, it is known that ADMM is equivalent to the Douglas–Rachford method [24] (see also [21]) in the sense that, with a careful choice of the starting point, one can prove that the sequences generated by both algorithms coincide. (See, e.g., [22, Section 5.1] and [6, Remark 3.14].) A similar equivalence holds between ADMM (with intermediate update of the multiplier) and the Peaceman–Rachford method [24] (see [22, Section 5.2]). In [25], the authors proved the correspondence between the Douglas–Rachford and Chambolle–Pock methods.

The rest of this paper is organized as follows: Section 13.2 provides a brief literature review of the ADMM, Douglas–Rachford and Peaceman–Rachford methods. In Sections 13.3 and 13.4 we explicitly describe the equivalence of ADMM (respectively ADMM with intermediate update of multipliers) and the Douglas–Rachford (respectively Peaceman–Rachford) method established by Gabay in [22, Sections 5.1 & 5.2], and we provide simplified proofs of these equivalences. Section 13.5 focuses on the recent work of O’Connor and Vandenberghe concerning the equivalence of the Douglas–Rachford and Chambolle–Pock methods (see [25]). Our notation is standard and follows largely, e.g., [5].

13.2 Three Techniques

In this paper, we assume that

$$\displaystyle \begin{aligned} X \text{ and } Y \text{ are real Hilbert spaces}, \end{aligned} $$

and that

$$\displaystyle \begin{aligned} f\colon X\to \left]-\infty,+\infty\right] \text{ and } g\colon Y\to \left]-\infty,+\infty\right] \text{ are convex, lower semicontinuous, and proper}. \end{aligned} $$

Alternating-Direction Method of Multipliers (ADMM)

In the following we assume that

$$\displaystyle \begin{aligned} L\colon Y\to X \text{ is linear and bounded}, \end{aligned} $$

that

$$\displaystyle \begin{aligned} {\operatorname{argmin}}(f\circ L+g) \neq \varnothing, \end{aligned} $$
(13.1)

and that

$$\displaystyle \begin{aligned} 0\in {\operatorname{sri}}({\operatorname{dom}} f-L({\operatorname{dom}} g)), \end{aligned} $$
(13.2)

where \({\operatorname {sri}} S\) denotes the strong relative interior of a subset S of X with respect to the closed affine hull of S. When X is finite-dimensional we have \({\operatorname{sri}} S={\operatorname{ri}} S\), where \({\operatorname{ri}} S\) is the relative interior of S, defined as the interior of S with respect to the affine hull of S.
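For example, if \(X=\mathbb{R}^2\) and \(S=[0,1]\times\{0\}\), then the interior of S is empty, whereas \({\operatorname{ri}} S={\operatorname{sri}} S=\left]0,1\right[\times\{0\}\).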

Consider the problem

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{minimize}}_{y\in Y} f(Ly)+g(y). \end{aligned} $$
(13.3)

Note that (13.1) and (13.2) imply that (see, e.g., [5, Proposition 27.5(iii)(a)1])

$$\displaystyle \begin{aligned} {\operatorname{argmin}}(f\circ L+g) ={\operatorname{zer}}( \partial(f\circ L)+\partial g) ={\operatorname{zer}}(L^*\circ (\partial f) \circ L+\partial g)\neq \varnothing. \end{aligned} $$
(13.4)

In view of (13.4), solving (13.3) is equivalent to solving the inclusion:

$$\displaystyle \begin{aligned} \text{find } y\in Y \text{ such that }\ 0\in L^*\big(\partial f(Ly)\big)+\partial g(y). \end{aligned} $$
(13.5)

The augmented Lagrangian associated with (13.3) is the function

$$\displaystyle \begin{aligned} \mathcal{L}\colon Y\times X\times X\to\left]-\infty,+\infty\right]\colon (b,a,u)\mapsto f(a)+g(b)+\langle u, Lb-a\rangle+\tfrac{1}{2}\|Lb-a\|^2. \end{aligned} $$
(13.6)

The ADMM (see [22, Section 4] and also [16, Section 10.6.4]) applied to solve (13.3) consists in minimizing \(\mathcal {L}\) over b then over a and then applying a proximal minimization step with respect to the Lagrange multiplier u. The method applied with a starting point (a 0, u 0) ∈ X × X generates three sequences \((a_n)_{{n\in {\mathbb N}}}\); (b n)n≥1 and \((u_n)_{{n\in {\mathbb N}}}\) via \((\forall {n\in {\mathbb N}})\):

$$\displaystyle \begin{aligned} b_{n+1}&=(L^*L+\partial g)^{-1}(L^*a_n-L^*u_n), {} \end{aligned} $$
(13.7a)
$$\displaystyle \begin{aligned} a_{n+1}&={\operatorname{Prox}}_f(Lb_{n+1}+u_n), {} \end{aligned} $$
(13.7b)
$$\displaystyle \begin{aligned} u_{n+1}&=u_n+Lb_{n+1}-a_{n+1}, {} \end{aligned} $$
(13.7c)

where \({\operatorname{Prox}}_f:=({\operatorname{Id}}+\partial f)^{-1}\) denotes the proximal mapping of f.

Let \((x_n)_{n\in {\mathbb N}}\) be a sequence in X and let \(\overline {x}\in X\). In the following we shall use \(x_n\to \overline {x}\) (respectively \(x_n{\:{\rightharpoonup }\:} \overline {x}\)) to indicate that \((x_n)_{n\in {\mathbb N}}\) converges strongly (respectively weakly) to \(\overline {x}\).

Fact 13.1 (Convergence of ADMM (See [22, Theorem 4.1]))

Let (a 0, u 0) ∈ X × X, and let \((a_n)_{{n\in {\mathbb N}}}\) , (b n)n≥1 and \((u_n)_{{n\in {\mathbb N}}}\) be defined as in (13.7). Then, there exists \(\overline {b}\in Y\) such that \(b_n{\:{\rightharpoonup }\:}\overline {b}\in {\operatorname {argmin}}(f \circ L+g)\).
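To make the updates (13.7) concrete, the following minimal numerical sketch (an illustration with ad hoc data, not part of the original text) runs ADMM on the toy instance \(f=\tfrac12\|\cdot-p\|^2\) and \(g=\tfrac{\lambda}{2}\|\cdot\|^2\), for which \({\operatorname{Prox}}_f(z)=(z+p)/2\), \(\partial g=\lambda{\operatorname{Id}}\), and the b-update is a linear solve; the names L, p and lam are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, lam = 3, 2, 0.7
L = rng.standard_normal((m, n))      # L : Y = R^n -> X = R^m
p = rng.standard_normal(m)
prox_f = lambda z: (z + p) / 2.0     # Prox of f(x) = 0.5*||x - p||^2

a, u = np.zeros(m), np.zeros(m)      # (a_0, u_0)
for _ in range(200):
    b = np.linalg.solve(L.T @ L + lam * np.eye(n), L.T @ (a - u))  # (13.7a)
    a = prox_f(L @ b + u)                                          # (13.7b)
    u = u + L @ b - a                                              # (13.7c)

y_star = np.linalg.solve(L.T @ L + lam * np.eye(n), L.T @ p)  # minimizer of f(Ly) + g(y)
print(np.allclose(b, y_star))        # expected: True, i.e. b_n -> argmin(f o L + g)
```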

The Douglas–Rachford Method

Suppose that Y = X and that \(L={\operatorname {Id}}\). In this case Problem (13.3) becomes

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{minimize}}_{x\in X} f(x)+g(x). \end{aligned} $$
(13.8)

The Douglas–Rachford (DR) method, introduced in [24], applied to the ordered pair (f, g) with a starting point x 0 ∈ X to solve (13.8) generates two sequences \((x_n)_{{n\in {\mathbb N}}}\) and \((y_n)_{{n\in {\mathbb N}}}\) via:

$$\displaystyle \begin{aligned} y_n&={\operatorname{Prox}}_f x_n, {} \end{aligned} $$
(13.9a)
$$\displaystyle \begin{aligned} x_{n+1}&=T_{\text{DR}(f,g)}x_n, {} \end{aligned} $$
(13.9b)

where

$$\displaystyle \begin{aligned} T_{\text{DR}(f,g)}:={\operatorname{Id}}-{\operatorname{Prox}}_f+{\operatorname{Prox}}_g\circ(2{\operatorname{Prox}}_f-{\operatorname{Id}}), \end{aligned} $$
(13.10)

and where \({\operatorname{Prox}}_g:=({\operatorname{Id}}+\partial g)^{-1}\).

To lighten the notation, in the sequel we shall use T DR to denote T DR(f, g). Let T : X → X. Recall that the set of fixed points of T, denoted by \({\operatorname {Fix}} T\), is defined as \({\operatorname{Fix}} T:=\{x\in X \mid x=Tx\}\).

Fact 13.2 (Convergence of Douglas–Rachford Method (See, e.g., [24, Theorem 1] or [5, Corollary 28.3]))

Let x 0 ∈ X and let \((x_n)_{n\in {\mathbb N}}\) and \((y_n)_{n\in {\mathbb N}}\) be defined as in (13.9). Then, there exists \(\overline {x}\in {\operatorname {Fix}} T_{ \mathit{\text{DR}}}\) such that \( x_n{\:{\rightharpoonup }\:}\overline {x}\) and \(y_n{\:{\rightharpoonup }\:}{\operatorname {Prox}}_f\overline {x}\in {\operatorname {argmin}}(f+g)\).
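As an illustration of Fact 13.2 (a toy sketch with ad hoc data, not from the original chapter), the following code runs (13.9)–(13.10) on \(f=\tfrac12\|\cdot-p\|^2\) and \(g=\lambda\|\cdot\|_1\), whose proximal mappings are explicit; the minimizer of f + g is \({\operatorname{Prox}}_g(p)\).

```python
import numpy as np

p = np.array([1.5, -0.2, 0.8])
lam = 0.5
prox_f = lambda z: (z + p) / 2.0                                   # f = 0.5*||. - p||^2
prox_g = lambda z: np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)   # g = lam*||.||_1 (soft-thresholding)

x = np.zeros_like(p)                     # x_0
for _ in range(500):
    y = prox_f(x)                        # (13.9a): y_n = Prox_f x_n
    x = x - y + prox_g(2 * y - x)        # (13.9b): x_{n+1} = T_DR x_n, cf. (13.10)

x_star = prox_g(p)                       # argmin(f + g)
print(np.allclose(prox_f(x), x_star))    # expected: True, i.e. y_n -> argmin(f + g)
```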

The Peaceman–Rachford Method

Let \(h\colon X\to \left ]-\infty ,+\infty \right ]\) be proper and let β > 0. We say that h is strongly convex (with constant β) if \(h-\tfrac{\beta}{2}\|\cdot\|^2\) is convex, i.e., \((\forall (x,y)\in {\operatorname {dom}} h\times {\operatorname {dom}} h)\) \((\forall \alpha \in \left ]0,1\right [)\) we have \(h(\alpha x+(1-\alpha)y)+\tfrac{\beta}{2}\alpha(1-\alpha)\|x-y\|^2\le \alpha h(x)+(1-\alpha)h(y)\).
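For example, for any λ > 0 the function \(\tfrac{\lambda}{2}\|\cdot\|^2\) is strongly convex (take β = λ, so that \(\tfrac{\lambda}{2}\|\cdot\|^2-\tfrac{\beta}{2}\|\cdot\|^2\equiv 0\) is convex), whereas \(\lambda\|\cdot\|_1\) on \(\mathbb{R}^n\) is convex but not strongly convex.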

When g is strongly convex, the Peaceman–Rachford (PR) method, introduced in [24], can be used to solve (13.8). In this case, given x 0 ∈ X, PR method generates the sequences \((x_n)_{{n\in {\mathbb N}}}\) and \((y_n)_{{n\in {\mathbb N}}}\) via:

$$\displaystyle \begin{aligned} y_n&={\operatorname{Prox}}_f x_n, {} \end{aligned} $$
(13.11a)
$$\displaystyle \begin{aligned} x_{n+1}&=T_{\text{PR}(f,g)}x_n, {} \end{aligned} $$
(13.11b)

where

$$\displaystyle \begin{aligned} T_{\text{PR}(f,g)}:=(2{\operatorname{Prox}}_g-{\operatorname{Id}})\circ(2{\operatorname{Prox}}_f-{\operatorname{Id}}). \end{aligned} $$
(13.12)

To lighten the notation, in the sequel we shall use T PR to denote T PR(f, g).

Fact 13.3 (Convergence of Peaceman–Rachford Method (See, e.g., [24, Proposition 1] or [5, Proposition 28.8]))

Suppose that g is strongly convex. Let \(\overline {y}\) be the unique minimizer of f + g, let x 0 ∈ X and let \((x_n)_{n\in {\mathbb N}}\) and \((y_n)_{n\in {\mathbb N}}\) be defined as in (13.11). Then \(y_n\to \overline {y}\).
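As an illustration of Fact 13.3 (again a toy sketch with ad hoc data, not from the source), take \(f=\lambda\|\cdot\|_1\) and the strongly convex \(g=\tfrac12\|\cdot-p\|^2+\tfrac{\mu}{2}\|\cdot\|^2\); both proximal mappings are explicit and the unique minimizer of f + g is known in closed form.

```python
import numpy as np

p = np.array([1.5, -0.2, 0.8])
lam, mu = 0.5, 1.0
prox_f = lambda z: np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)   # f = lam*||.||_1
prox_g = lambda z: (z + p) / (2.0 + mu)    # g = 0.5*||. - p||^2 + 0.5*mu*||.||^2 (strongly convex)

x = np.zeros_like(p)                       # x_0
for _ in range(300):
    y = prox_f(x)                          # (13.11a): y_n = Prox_f x_n
    x = 2 * prox_g(2 * y - x) - (2 * y - x)    # (13.11b): x_{n+1} = T_PR x_n, cf. (13.12)

y_star = np.sign(p) * np.maximum(np.abs(p) - lam, 0.0) / (1.0 + mu)  # unique minimizer of f + g
print(np.allclose(prox_f(x), y_star))      # expected: True, i.e. y_n -> minimizer
```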

In the sequel we use the notation

$$\displaystyle \begin{aligned} g^{{\scriptscriptstyle\vee}}:=g\circ(-{\operatorname{Id}}), \quad\text{i.e.,}\quad (\forall y\in Y)\ \ g^{{\scriptscriptstyle\vee}}(y)=g(-y). \end{aligned} $$
(13.13)

Recall that the Fenchel–Rockafellar dual of (13.3) is

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{minimize}}_{x\in X} f^*(x)+g^*(-L^*x). \end{aligned} $$
(13.14)

Remark 13.1

  1. (i)

    One can readily verify that \(\partial g^{{{\scriptscriptstyle \vee }}} =(-{\operatorname {Id}})\circ \partial g \circ (-{\operatorname {Id}})\). Therefore, in view of [27, Theorem A] and [20, Lemma 3.5 on page 125 and Lemma 3.6 on page 133] (see also [4, Corollaries 4.2 and 4.3]) we have

    $$\displaystyle \begin{aligned} T_{\text{DR}(f,g)}=T_{\text{DR}(f^*,\,g^{{\scriptscriptstyle\vee}*})} \end{aligned} $$
    (13.15)

    and

    $$\displaystyle \begin{aligned} T_{\text{PR}(f,g)}=T_{\text{PR}(f^*,\,g^{{\scriptscriptstyle\vee}*})}. \end{aligned} $$
    (13.16)
  2. (ii)

    When \((L,Y)=({\operatorname {Id}}, X)\), inclusion (13.5) reduces to: Find y ∈ X such that 0 ∈ ∂f(y) + ∂g(y), and the dual inclusion (corresponding to the Fenchel–Rockafellar dual (13.14)) is: Find y ∈ X such that \(0 \in \partial f^*(y) - \partial g^*(-y) = (\partial f)^{-1}(y) - (\partial g)^{-1}(-y)\), which in this case coincides with the Attouch–Théra dual of (13.5) (see [2]).

One can use DR method to solve (13.14) where (f, g) in Fact 13.2 is replaced by \((f^*, g^*\circ(-L^*))\). Recalling (13.15) we learn that \(T_{\text{DR}} = T_{\text{DR}(f^*,\,g^*\circ(-L^*))} = T_{\text{DR}(f^{**},\,(g^*\circ(-L^*))^{{\scriptscriptstyle\vee}*})} = T_{\text{DR}(f,\,(g^*\circ L^*)^*)}\), where the last identity follows from [5, Proposition 13.44].

In view of (13.10), (13.15), and [5, Proposition 15.23(v)] we have

$$\displaystyle \begin{aligned} T_{\text{DR}}={\operatorname{Id}}-{\operatorname{Prox}}_f+{\operatorname{Prox}}_{(g^*\circ L^*)^*}\circ(2{\operatorname{Prox}}_f-{\operatorname{Id}}). \end{aligned} $$
(13.17)

Similarly, under additional assumptions (see Fact 13.3), one can use PR method to solve (13.14) where (f, g) in Fact 13.3 is replaced by \((f^*, g^*\circ(-L^*))\). In this case (13.12), (13.16) and [5, Proposition 15.23(v)] imply that

$$\displaystyle \begin{aligned} T_{\text{PR}}=(2{\operatorname{Prox}}_{(g^*\circ L^*)^*}-{\operatorname{Id}})\circ(2{\operatorname{Prox}}_f-{\operatorname{Id}}). \end{aligned} $$
(13.18)

For completeness, we provide a concrete proof of the formula for \({\operatorname {Prox}}_{(g^*\circ L^*)^*}\) in Appendix 1 (see Proposition 13.3 (viii) below). We point out that the formula for \({\operatorname {Prox}}_{(g^*\circ L^*)^*}\) in a more general setting is given in [22, Proposition 4.1] (see also [16, Section 10.6.4]).
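For the quadratic choice \(g=\tfrac{\lambda}{2}\|\cdot\|^2\), all operators involved are linear, so the closed form \({\operatorname{Prox}}_{(g^*\circ L^*)^*}=L\circ(L^*L+\partial g)^{-1}\circ L^*\) used in (13.20) and (13.26) below (our reading of Proposition 13.3 (viii)) can be checked numerically against the Moreau decomposition \({\operatorname{Prox}}_{h^*}={\operatorname{Id}}-{\operatorname{Prox}}_h\) with \(h=g^*\circ L^*\). The sketch below (ad hoc data, not part of the original text) does exactly that.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, lam = 4, 2, 0.7
L = rng.standard_normal((m, n))
I_m, I_n = np.eye(m), np.eye(n)

# Closed form: Prox_{(g* o L*)*} = L (L*L + lam*Id)^{-1} L*  (g = (lam/2)||.||^2, so "dg" = lam*Id)
P_formula = L @ np.linalg.solve(L.T @ L + lam * I_n, L.T)

# Independent route: Prox_{h*} = Id - Prox_h with h = g* o L*; here h(x) = ||L*x||^2/(2*lam)
# is quadratic, so Prox_h = (Id + L L^T / lam)^{-1}.
P_moreau = I_m - np.linalg.inv(I_m + L @ L.T / lam)

print(np.allclose(P_formula, P_moreau))   # expected: True
```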

13.3 ADMM and Douglas–Rachford Method

In this section we discuss the equivalence of ADMM and DR method. This equivalence was first established by Gabay in [22, Section 5.1] (see also [6, Remark 3.14]). Let (x 0, a 0, u 0) ∈ X 3. Throughout the rest of this section, we assume that the sequences \((x_n)_{n\in \mathbb {N}}\) and \((y_n)_{n\in \mathbb {N}}\) are as defined in (13.9), i.e.,

$$\displaystyle \begin{aligned} (\forall {n\in {\mathbb N}})\qquad y_n&={\operatorname{Prox}}_f x_n, {} \end{aligned} $$
(13.19)
$$\displaystyle \begin{aligned} (\forall {n\in {\mathbb N}})\qquad x_{n+1}&=T_{\text{DR}}x_n=x_n-{\operatorname{Prox}}_f x_n+L(L^*L+\partial g)^{-1}{L^*}(2{\operatorname{Prox}}_f x_n-x_n). {} \end{aligned} $$
(13.20)

Note that the second identity in (13.20) follows from (13.17) and Proposition 13.3 (viii). We also assume that the sequences \((a_n)_{{n\in {\mathbb N}}}\), \((b_n)_{n\ge 1}\) and \((u_n)_{{n\in {\mathbb N}}}\) are generated by ADMM as in (13.7).

The following lemma will be used later to clarify the equivalence of DR and ADMM.

Lemma 13.1

Let \((b_{-},a_{-},u_{-})\in Y\times X\times X\) and set

$$\displaystyle \begin{aligned} (b,a,u)&= ((L^*L+\partial g)^{-1}(L^* a_{-}-L^* u_{-}),\ {\operatorname{Prox}}_f(Lb+u_{-}),\ u_{-}+Lb-a), {} \end{aligned} $$
(13.21a)
$$\displaystyle \begin{aligned} (b_+,a_+,u_+)&= ((L^*L+\partial g)^{-1}(L^* a-L^* u),\ {\operatorname{Prox}}_f(Lb_{+}+u),\ u+Lb_{+}-a_{+}). {} \end{aligned} $$
(13.21b)

Then

$$\displaystyle \begin{aligned} T_{\text{DR}}(Lb+u_{-})&=Lb_++u, {} \end{aligned} $$
(13.22a)
$$\displaystyle \begin{aligned} {\operatorname{Prox}}_f\big(T_{\text{DR}}(Lb+u_{-})\big)&=a_+. {} \end{aligned} $$
(13.22b)

Proof

Indeed, it follows from (13.20), (13.21a) and (13.21b) that

$$\displaystyle \begin{aligned} T_{\text{DR}}(Lb+u_{-})&=(Lb+u_{-})-{\operatorname{Prox}}_f(Lb+u_{-})+L(L^*L+\partial g)^{-1}{L^*}\big(2{\operatorname{Prox}}_f(Lb+u_{-})-(Lb+u_{-})\big) {} \end{aligned} $$
(13.23a)
$$\displaystyle \begin{aligned} &=(Lb+u_{-})-a+L(L^*L+\partial g)^{-1}{L^*}(2a-(Lb+u_{-})) {} \end{aligned} $$
(13.23b)
$$\displaystyle \begin{aligned} &=(Lb+u_{-})-a+L(L^*L+\partial g)^{-1}{L^*}(a-(Lb+u_{-}-a)) {} \end{aligned} $$
(13.23c)
$$\displaystyle \begin{aligned} &=u+L(L^*L+\partial g)^{-1}({L^*}a-{L^*}u) {} \end{aligned} $$
(13.23d)
$$\displaystyle \begin{aligned} &=Lb_++u, {} \end{aligned} $$
(13.23e)

where (13.23a) follows from (13.20), (13.23b) and (13.23d) follow from (13.21a), and (13.23e) follows from (13.21b). This proves (13.22a). Now (13.22b) follows from combining (13.22a) and (13.21b). □

We now prove the main result in this section by induction.

Theorem 13.4

The following hold:

  1. (i)

    (DR as ADMM Iteration) Using DR method with a starting point x 0 ∈ X to solve (13.14) is equivalent to using ADMM with a starting point \((a_0,u_0)=({\operatorname{Prox}}_f x_0,\ x_0-{\operatorname{Prox}}_f x_0)\) to solve (13.3), in the sense that (x n)n≥1 = (Lb n + u n−1)n≥1 and \((y_n)_{{n\in {\mathbb N}}}=(a_n)_{{n\in {\mathbb N}}}\).

  2. (ii)

    (ADMM as DR Iteration) Using ADMM with a starting point (a 0, u 0) ∈ X × X to solve (13.3) is equivalent to using DR method with a starting point x 0 = Lb 1 + u 0 to solve (13.14), in the sense that \((x_n)_{{n\in {\mathbb N}}}=(Lb_{n+1}+u_{n})_{{n\in {\mathbb N}}}\) and \((y_n)_{{n\in {\mathbb N}}}=(a_{n+1})_{{n\in {\mathbb N}}}\).

Proof

For simplicity, set T = T DR. (i): Note that (13.19) implies that y 0 = a 0. Now, when n = 1 we have

$$\displaystyle \begin{aligned} x_1=Tx_0&=x_0-{\operatorname{Prox}}_f x_0+L(L^*L+\partial g)^{-1}{L^*}(2{\operatorname{Prox}}_f x_0-x_0) &&\text{(by (13.17))}\\ &=u_0+L(L^*L+\partial g)^{-1}(L^*a_0-L^*u_0) &&\text{(by (13.19))}\\ &=u_0+Lb_1 &&\text{(by (13.7a)).} \end{aligned} $$

Combining with (13.19) and (13.7b) we get \(y_1={\operatorname {Prox}}_f Tx_0={\operatorname {Prox}}_f x_1={\operatorname {Prox}}_f(u_0+Lb_1)=a_1\), which verifies the base case. Now suppose for some n ≥ 1 we have x n = Lb n + u n−1 and y n = a n and use Lemma 13.1 with (b , a , u ) replaced by (b n−1, a n−1, u n−1) to learn that x n+1 = Lb n+1 + u n and y n+1 = a n+1. Consequently, (x n)n≥1 = (Lb n + u n−1)n≥1 and \((y_n)_{{n\in {\mathbb N}}}=(a_n)_{{n\in {\mathbb N}}}\) as claimed.

(ii): At n = 0, x 0 = Lb 1 + u 0 = Lb 0+1 + u 0, and therefore (13.9a) implies that \(y_0={\operatorname {Prox}}_fx_0={\operatorname {Prox}}_f(Lb_1+u_0)=a_1\) by (13.7b). Now suppose that for some n ≥ 0 we have x n = Lb n+1 + u n and y n = a n+1. The conclusion follows by applying Lemma 13.1 with (b , a , u ) replaced by (b n, a n, u n). □
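The correspondence in Theorem 13.4 (i) can be checked numerically. The sketch below (an ad hoc quadratic instance, not part of the original text) runs the DR iteration (13.19)–(13.20) and ADMM (13.7) side by side, with the ADMM starting point chosen as \((a_0,u_0)=({\operatorname{Prox}}_f x_0,\ x_0-{\operatorname{Prox}}_f x_0)\), and verifies that \(x_k=Lb_k+u_{k-1}\) and \(y_k=a_k\) for k ≥ 1.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, lam, N = 3, 2, 0.7, 25
L = rng.standard_normal((m, n))
p = rng.standard_normal(m)
Q = L.T @ L + lam * np.eye(n)            # L*L + "dg" for g = (lam/2)||.||^2
prox_f = lambda z: (z + p) / 2.0         # f = 0.5*||. - p||^2
x0 = rng.standard_normal(m)

# DR method applied to the dual (13.14), i.e. the iteration (13.19)-(13.20)
xs, ys, x = [], [], x0.copy()
for _ in range(N):
    y = prox_f(x)
    xs.append(x.copy()); ys.append(y.copy())
    x = x - y + L @ np.linalg.solve(Q, L.T @ (2 * y - x))

# ADMM (13.7) started at (a_0, u_0) = (Prox_f x_0, x_0 - Prox_f x_0)
a, u = prox_f(x0), x0 - prox_f(x0)
ok = True
for k in range(1, N):
    u_prev = u.copy()
    b = np.linalg.solve(Q, L.T @ (a - u))    # (13.7a)
    a = prox_f(L @ b + u)                    # (13.7b)
    u = u + L @ b - a                        # (13.7c)
    ok &= np.allclose(xs[k], L @ b + u_prev) and np.allclose(ys[k], a)
print(bool(ok))                              # expected: True
```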

13.4 ADMM and Peaceman–Rachford Method

We now turn to the equivalence of ADMM with intermediate update of multiplier and PR method. This equivalence was established in [22, Section 5.2]. Given (a 0, u 0) ∈ X × X, the ADMM with an intermediate update of multiplier applied to solve (13.3) generates four sequences \((a_n)_{{n\in {\mathbb N}}}\), \((u_n)_{{n\in {\mathbb N}}}\), (b n)n≥1 and (w n)n≥1 via \((\forall {n\in {\mathbb N}})\):

$$\displaystyle \begin{aligned} b_{n+1}&=(L^*L+\partial g)^{-1}(L^*a_n-L^*u_n), {} \end{aligned} $$
(13.24a)
$$\displaystyle \begin{aligned} w_{n+1}&=u_n+Lb_{n+1}-a_n, {} \end{aligned} $$
(13.24b)
$$\displaystyle \begin{aligned} a_{n+1}&={\operatorname{Prox}}_f(Lb_{n+1}+w_{n+1}), {} \end{aligned} $$
(13.24c)
$$\displaystyle \begin{aligned} u_{n+1}&=w_{n+1}+Lb_{n+1}-a_{n+1}. {} \end{aligned} $$
(13.24d)

Fact 13.5 (Convergence of ADMM with Intermediate Update of Multipliers (See [22, Theorem 5.4]))

Suppose that g is strongly convex. Let (a 0, u 0) ∈ X × X, and let (b n)n≥1 , (w n)n≥1 , \((a_n)_{{n\in {\mathbb N}}}\) and \((u_n)_{{n\in {\mathbb N}}}\) be defined as in (13.24). Then, there exists \(\overline {b}\in Y\) such that \(b_n\to \overline {b}\in {\operatorname {argmin}}(f \circ L+g)\).

In this section we work under the additional assumption that

Let (x 0, a 0, u 0) ∈ X 3. Throughout the rest of this section we assume that the sequences \((x_n)_{n\in \mathbb {N}}\) and \((y_n)_{n\in \mathbb {N}}\) are as defined in (13.11), i.e.,

$$\displaystyle \begin{aligned} (\forall {n\in {\mathbb N}})\qquad y_n={\operatorname{Prox}}_f x_n \quad\text{and}\quad x_{n+1}=T_{\text{PR}}x_n, {} \end{aligned} $$
(13.25)

where

$$\displaystyle \begin{aligned} T_{\text{PR}}=(2{\operatorname{Prox}}_{(g^*\circ L^*)^*}-{\operatorname{Id}})\circ(2{\operatorname{Prox}}_f-{\operatorname{Id}}) =\big(2L(L^*L+\partial g)^{-1}L^*-{\operatorname{Id}}\big)\circ(2{\operatorname{Prox}}_f-{\operatorname{Id}}). \end{aligned} $$
(13.26)

Note that the second identity in (13.26) follows from (13.18) and Proposition 13.3 (viii). We also assume that the sequences \((a_n)_{{n\in {\mathbb N}}}\), \((b_n)_{n\ge 1}\), \((w_n)_{n\ge 1}\) and \((u_n)_{{n\in {\mathbb N}}}\) are generated by the ADMM with intermediate update of multipliers as in (13.24).

Before we proceed further, we prove the following useful lemma.

Lemma 13.2

Let \((b_{-},w_{-},a_{-},u_{-})\in Y\times X\times X\times X\) and set

$$\displaystyle \begin{aligned} (b,w,a,u)&= ((L^*L+\partial g)^{-1}(L^* a_{-}-L^* u_{-}), u_{-}+Lb-a_{-}, \\ &\qquad {\operatorname{Prox}}_f(Lb+w), w+Lb-a), {} \end{aligned} $$
(13.27a)
$$\displaystyle \begin{aligned} (b_+,w_+,a_+,u_+) &= ((L^*L+\partial g)^{-1}(L^* a-L^* u), u+Lb_{+}-a, \\ &\qquad {\operatorname{Prox}}_f(Lb_{+}+w_+), w_++Lb_{+}-a_{+}). {} \end{aligned} $$
(13.27b)

Then

$$\displaystyle \begin{aligned} T_{\text{PR}}(Lb+w)&=Lb_++w_+, {} \end{aligned} $$
(13.28a)
$$\displaystyle \begin{aligned} {\operatorname{Prox}}_f\big(T_{\text{PR}}(Lb+w)\big)&=a_+. {} \end{aligned} $$
(13.28b)

Proof

Indeed, by (13.26), (13.27a) and (13.27b) we have

$$\displaystyle \begin{aligned} T_{\text{PR}}(Lb+w)&=\big(2L(L^*L+\partial g)^{-1}L^*-{\operatorname{Id}}\big)\big(2{\operatorname{Prox}}_f(Lb+w)-(Lb+w)\big) &&\text{(by (13.26))}\\ &=2L(L^*L+\partial g)^{-1}L^*(a-u)-(a-u) &&\text{(by (13.27a))}\\ &=2Lb_+-a+u &&\text{(by (13.27b))}\\ &=Lb_++w_+ &&\text{(by (13.27b)),} \end{aligned} $$

which proves (13.28a). Now (13.28b) is a direct consequence of (13.28a) in view of (13.27b). □

We are now ready for the main result in this section.

Theorem 13.6

Suppose that g is strongly convex. Then the following hold:

  1. (i)

    (PR as ADMM Iteration) Using PR method with a starting point x 0 ∈ X to solve (13.14) is equivalent to using ADMM with intermediate update of multipliers with starting point \((a_0,u_0)=({\operatorname{Prox}}_f x_0,\ x_0-{\operatorname{Prox}}_f x_0)\) to solve (13.3), in the sense that (x n)n≥1 = (Lb n + w n)n≥1 and \((y_n)_{{n\in {\mathbb N}}}=(a_n)_{{n\in {\mathbb N}}}\).

  2. (ii)

    (ADMM as PR Iteration) Using ADMM with intermediate update of multipliers with a starting point (a 0, u 0) ∈ X × X to solve (13.3) is equivalent to using PR method with starting point x 0 = Lb 1 + w 1 to solve (13.14), in the sense that \((x_n)_{{n\in {\mathbb N}}}=(Lb_{n+1}+w_{n+1})_{{n\in {\mathbb N}}}\) and \((y_n)_{{n\in {\mathbb N}}}=(a_{n+1})_{{n\in {\mathbb N}}}\).

Proof

We proceed by induction. (i): We have

$$\displaystyle \begin{aligned} x_1=T_{\text{PR}}x_0&=\big(2L(L^*L+\partial g)^{-1}L^*-{\operatorname{Id}}\big)(2{\operatorname{Prox}}_f x_0-x_0)=2L(L^*L+\partial g)^{-1}(L^*a_0-L^*u_0)-(a_0-u_0) &&\text{(by (13.26))}\\ &=2Lb_1-a_0+u_0=Lb_1+w_1 &&\text{(by (13.24a) and (13.24b)),} \end{aligned} $$

which verifies the base case. Now suppose for some n ≥ 1 we have x n = Lb n + w n. The conclusion follows from applying Lemma 13.2 with (b , w , a , u ) replaced by (b n−1, w n−1, a n−1, u n−1) in view of (13.24).

(ii): At n = 0, the base case clearly holds in view of (13.25) and (13.24c). Now suppose that for some n ≥ 0 we have x n = Lb n+1 + w n+1 and y n = a n+1 and use Lemma 13.2 with (b , w , a , u ) replaced by (b n, w n, a n, u n) in view of (13.24). □
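Analogously, the correspondence in Theorem 13.6 (i) can be verified numerically on the same kind of toy instance (again an ad hoc sketch, not from the original text): run PR applied to the dual, i.e. (13.25)–(13.26), and ADMM with intermediate update of multipliers (13.24) started at \((a_0,u_0)=({\operatorname{Prox}}_f x_0,\ x_0-{\operatorname{Prox}}_f x_0)\), and check that \(x_k=Lb_k+w_k\) and \(y_k=a_k\) for k ≥ 1.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, lam, N = 3, 2, 0.7, 25
L = rng.standard_normal((m, n))
p = rng.standard_normal(m)
Q = L.T @ L + lam * np.eye(n)           # g = (lam/2)||.||^2 is strongly convex
prox_f = lambda z: (z + p) / 2.0
x0 = rng.standard_normal(m)

# PR method applied to the dual, i.e. the iteration (13.25)-(13.26)
xs, ys, x = [], [], x0.copy()
for _ in range(N):
    y = prox_f(x)
    xs.append(x.copy()); ys.append(y.copy())
    r = 2 * y - x
    x = 2 * (L @ np.linalg.solve(Q, L.T @ r)) - r

# ADMM with intermediate update of multipliers (13.24)
a, u = prox_f(x0), x0 - prox_f(x0)
ok = True
for k in range(1, N):
    b = np.linalg.solve(Q, L.T @ (a - u))   # (13.24a)
    w = u + L @ b - a                       # (13.24b)
    a = prox_f(L @ b + w)                   # (13.24c)
    u = w + L @ b - a                       # (13.24d)
    ok &= np.allclose(xs[k], L @ b + w) and np.allclose(ys[k], a)
print(bool(ok))                             # expected: True
```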

13.5 Chambolle–Pock and Douglas–Rachford Methods

In this section we survey the recent work by O’Connor and Vandenberghe [25] concerning the equivalence of Douglas–Rachford method and Chambolle–Pock method. (For a detailed study of this correspondence in the more general framework of the primal-dual hybrid gradient method and DR method with relaxation, as well as connection to linearized ADMM, we refer the reader to [25].) We work under the assumption that

$$\displaystyle \begin{aligned} A\colon X\to Y \text{ is linear and bounded},\quad \{\tau,\sigma\}\subset \left]0,+\infty\right[, \quad\text{and}\quad \tau\sigma\|A\|^2\le 1. \end{aligned} $$
(13.31)

Consider the problem

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{minimize}}_{x\in X} f(x)+g(Ax) \end{aligned} $$
(13.32)

and its Fenchel–Rockafellar dual given by

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{minimize}}_{v\in Y} f^*(-A^*v)+g^*(v). \end{aligned} $$
(13.33)

To proceed further, in the following we assume that

$$\displaystyle \begin{aligned} {\operatorname{argmin}}(f+g\circ A)\neq\varnothing \quad\text{and}\quad 0\in {\operatorname{sri}}({\operatorname{dom}} g-A({\operatorname{dom}} f)). \end{aligned} $$
(13.34)

Note that (13.34) implies that (see, e.g., [5, Proposition 27.5(iii)(a)1])

$$\displaystyle \begin{aligned} {\operatorname{argmin}}(f+g\circ A) ={\operatorname{zer}}( \partial f+\partial (g\circ A)) ={\operatorname{zer}}(\partial f+A^*\circ (\partial g) \circ A)\neq \varnothing. \end{aligned} $$
(13.35)

In view of (13.35), solving (13.32) is equivalent to solving the inclusion:

$$\displaystyle \begin{aligned} \text{find } x\in X \text{ such that }\ 0\in \partial f(x)+A^*\big(\partial g(Ax)\big). \end{aligned} $$
(13.36)

The Chambolle–Pock (CP) method applied with a starting point (u 0, v 0) ∈ X × Y  to solve (13.32) generates the sequences \(({u}_n)_{{n\in {\mathbb N}}}\) and \(({v}_n)_{{n\in {\mathbb N}}}\) via:

$$\displaystyle \begin{aligned} {u}_n &={\operatorname{Prox}}_{\tau f}(u_{n-1}-\tau A^*v_{n-1}), {} \end{aligned} $$
(13.37a)
$$\displaystyle \begin{aligned} {v}_n &={\operatorname{Prox}}_{\sigma g^*}(v_{n-1}+\sigma A(2{u}_n-u_{n-1})), {} \end{aligned} $$
(13.37b)

where τ and σ are as defined in (13.31).

Fact 13.7 (Convergence of Chambolle–Pock Method (See [11, Theorem 1], [30, Theorem 3.1] and Also [18, Theorem 3.1]))

Let (u 0, v 0) ∈ X × Y  and let \((u_n)_{n\in {\mathbb N}}\) and \((v_n)_{n\in {\mathbb N}}\) be defined as in (13.37). Then, there exists \((\overline {u},\overline {v})\in X \times Y\) such that \( (u_n,v_n)_{n\in {\mathbb N}}{\:{\rightharpoonup }\:}(\overline {u},\overline {v})\) , \(\overline {u}\in {\operatorname {argmin}}(f+g\circ A)\) and \(\overline {v}\in {\operatorname {argmin}}(f^*\circ (-A^*)+g^*)\).
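To illustrate (13.37) and Fact 13.7 (a toy sketch with ad hoc data, not part of the original text), take \(f=\tfrac12\|\cdot-p\|^2\) and \(g=\tfrac{\lambda}{2}\|\cdot\|^2\), for which \({\operatorname{Prox}}_{\tau f}\) and \({\operatorname{Prox}}_{\sigma g^*}\) are explicit, and choose τ and σ with \(\tau\sigma\|A\|^2<1\).

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, lam = 3, 4, 0.7                        # A : X = R^n -> Y = R^m
A = rng.standard_normal((m, n))
p = rng.standard_normal(n)

tau = sigma = 0.9 / np.linalg.norm(A, 2)     # tau*sigma*||A||^2 = 0.81 < 1
prox_tau_f = lambda z: (z + tau * p) / (1.0 + tau)       # f = 0.5*||. - p||^2
prox_sigma_gstar = lambda z: lam * z / (lam + sigma)     # g = (lam/2)||.||^2, g* = ||.||^2/(2*lam)

u, v = np.zeros(n), np.zeros(m)              # (u_0, v_0)
for _ in range(10000):
    u_old = u.copy()
    u = prox_tau_f(u - tau * A.T @ v)                      # (13.37a)
    v = prox_sigma_gstar(v + sigma * A @ (2 * u - u_old))  # (13.37b)

x_star = np.linalg.solve(np.eye(n) + lam * A.T @ A, p)     # argmin(f + g o A)
print(np.allclose(u, x_star))                # expected: True (cf. Fact 13.7)
```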

It is known that the method in (13.37) reduces to DR method (see, e.g., [11, Section 4.2]) when \(A={\operatorname {Id}}\). We state this equivalence in Proposition 13.1 below.

Proposition 13.1 (DR as a CP Iteration)

Suppose that X = Y , that \(A={\operatorname {Id}}\), and that τ = σ = 1. Then, using DR method, defined as in (13.9), with a starting point x 0 ∈ X to solve (13.32) is equivalent to using CP method with a starting point \((u_0,v_0)\in\{(u,v)\in X\times X \mid u-v=x_0\}\) to solve (13.32), in the sense that \((x_n)_{n\in {\mathbb N}}=(u_n-v_n)_{n\in {\mathbb N}}\) and \((y_n)_{n\in {\mathbb N}}=(u_{n+1})_{n\in {\mathbb N}}\).

Proof

We use induction. When n = 0, the base case is obviously true. Now suppose that for some n ≥ 0 we have \(x_n=u_n-v_n\) and \(y_n=u_{n+1}\). Then,

$$\displaystyle \begin{aligned} x_{n+1} &={\operatorname{Prox}}_f x_n-{\operatorname{Prox}}_{g^*}(2{\operatorname{Prox}}_f x_n-x_n) {} \end{aligned} $$
(13.38a)
$$\displaystyle \begin{aligned} &={\operatorname{Prox}}_f(u_n-v_n)-{\operatorname{Prox}}_{g^*}(2{\operatorname{Prox}}_f (u_n-v_n)-(u_n-v_n)) {} \end{aligned} $$
(13.38b)
$$\displaystyle \begin{aligned} &=u_{n+1}-{\operatorname{Prox}}_{g^*}(v_n+2u_{n+1}-u_n)=u_{n+1}-v_{n+1} . {} \end{aligned} $$
(13.38c)

Here (13.38a) follows from Lemma 13.5 below (applied with γ = 1), (13.38b) follows from the inductive hypothesis, and (13.38c) follows from (13.37) applied with (τ, σ, A) replaced by \((1,1,{\operatorname {Id}})\). The claim about y n+1 follows directly and the proof is complete. □
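Proposition 13.1 can also be verified numerically; the sketch below (ad hoc data, not from the original text) uses \(f=\tfrac12\|\cdot-p\|^2\), \(g=\lambda\|\cdot\|_1\), \(A={\operatorname{Id}}\) and τ = σ = 1, and checks the identities \(x_n=u_n-v_n\) and \(y_n=u_{n+1}\).

```python
import numpy as np

rng = np.random.default_rng(5)
p = rng.standard_normal(4)
lam, N = 0.5, 30
prox_f = lambda z: (z + p) / 2.0                                   # f = 0.5*||. - p||^2
prox_g = lambda z: np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)   # g = lam*||.||_1
prox_gstar = lambda z: np.clip(z, -lam, lam)                       # Prox_{g*} = Id - Prox_g (Moreau)

x0 = rng.standard_normal(4)
# DR method (13.9) for the pair (f, g)
xs, ys, x = [], [], x0.copy()
for _ in range(N + 1):
    y = prox_f(x)
    xs.append(x.copy()); ys.append(y.copy())
    x = x - y + prox_g(2 * y - x)

# CP method (13.37) with A = Id, tau = sigma = 1, and any (u_0, v_0) with u_0 - v_0 = x_0
u, v = x0 + 1.23, np.full(4, 1.23)
ok = True
for k in range(1, N + 1):
    u_old = u.copy()
    u = prox_f(u - v)                        # (13.37a)
    v = prox_gstar(v + 2 * u - u_old)        # (13.37b)
    ok &= np.allclose(xs[k], u - v) and np.allclose(ys[k - 1], u)
print(bool(ok))                              # expected: True
```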

Chambolle–Pock as a DR Iteration: The O’Connor–Vandenberghe Technique

Let Z be a real Hilbert space. In the following, we assume that C : Z → Y  is linear and that

$$\displaystyle \begin{aligned} AA^*+CC^*=\tfrac{1}{\tau\sigma}{\operatorname{Id}}. \end{aligned} $$
(13.39)

Note that one possible choice of C is to set \(C:=\big(\tfrac{1}{\tau\sigma}{\operatorname{Id}}-AA^*\big)^{1/2}\) (in which case Z = Y), where the existence of C follows from, e.g., [26, Theorem on page 265]. Now consider the problem

$$\displaystyle \begin{aligned} \operatorname*{\mathrm{minimize}}_{\mathbf{x}\in X\times Z} \widetilde{f}(\mathbf{x})+g(B\mathbf{x}), \end{aligned} $$
(13.40)

where

$$\displaystyle \begin{aligned} \widetilde{f}\colon X\times Z\to\left]-\infty,+\infty\right]\colon (x,z)\mapsto f(x)+\iota_{\{0\}}(z) \quad\text{and}\quad B\colon X\times Z\to Y\colon (x,z)\mapsto Ax+Cz. \end{aligned} $$
(13.41)
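Assuming, as in our reading of (13.39), that C is required to satisfy \(AA^*+CC^*=\tfrac{1}{\tau\sigma}{\operatorname{Id}}\), the following sketch (ad hoc data, not part of the original text) builds the square-root choice of C mentioned above and verifies this identity.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 3, 5
A = rng.standard_normal((m, n))
tau = sigma = 0.9 / np.linalg.norm(A, 2)     # tau*sigma*||A||^2 < 1, so the root below exists

# One admissible C (with Z = Y): the positive semidefinite square root of (1/(tau*sigma))Id - A A*.
M = np.eye(m) / (tau * sigma) - A @ A.T
w, V = np.linalg.eigh(M)                     # M is symmetric positive semidefinite
C = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

print(np.allclose(A @ A.T + C @ C.T, np.eye(m) / (tau * sigma)))   # (13.39): expected True
```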

The following result, proved in [25, Section 4] in the more general framework of primal-dual hybrid gradient method, provides an elegant way to construct the correspondence between the DR sequence when applied to solve (13.40) and the CP sequence when applied to solve (13.32). We restate the proof for the sake of completeness.

Proposition 13.2 (CP Corresponds to a DR Iteration)

Using CP method with starting point (u 0, v 0) ∈ X × Z to solve (13.32) corresponds to using DR with starting point \({\mathbf{x}}_0=(u_0,0)-\tau B^*v_0\) to solve (13.40), in the sense that \(({\mathbf {x}}_n)_{{n\in {\mathbb N}}} =((u_n,0)-\tau B^* v_n)_{{n\in {\mathbb N}}}\) and \(({\mathbf {y}}_n)_{{n\in {\mathbb N}}} =({u}_{n+1},0)_{n\in {\mathbb N}}\).

Proof

We apply DR to solve (13.40) with \((\widetilde {f},g)\) replaced by \((\tau \widetilde {f},\tau g)\). The proof proceeds by induction. When n = 0, by assumption we have x 0 = (u 0, 0) − τB v 0. It follows from Proposition 13.4 (i)&(vii) below applied with \(\widetilde {f} \) replaced by \(\tau \widetilde {f} \) that \({\mathbf {y}}_0={\operatorname {Prox}}_{\tau \tilde {f}}{\mathbf {x}}_0 ={\operatorname {Prox}}_{\tau \widetilde {f}}((u_0,0)-\tau (A^* v_0,C^* v_0)) ={\operatorname {Prox}}_{\tau \widetilde {f}}(u_0-\tau A^* v_0,-\tau C^* v_0) =({\operatorname {Prox}}_{\tau f}(u_0-\tau A^* v_0),0)\). Now suppose that for some n ≥ 0 we have

$$\displaystyle \begin{aligned} {\mathbf{x}}_n &=(u_n,0)-\tau B^* v_n, {} \end{aligned} $$
(13.42a)
$$\displaystyle \begin{aligned} {\mathbf{y}}_n &=({u}_{n+1},0). {} \end{aligned} $$
(13.42b)

Then

(13.43a)
(13.43b)
(13.43c)
(13.43d)
(13.43e)

where (13.43a) follows from (13.37b), (13.43b) follows from (13.42b) and Proposition 13.4 (iii) below, (13.43c) follows from (13.39), and (13.43e) follows from (13.42a), Proposition 13.4 (viii) and (13.49b) below applied with (γ, g) replaced by (τ, g ∘ B).

Now by (13.37a) and Proposition 13.4 (vii) below we have

(13.44a)
(13.44b)
(13.44c)
(13.44d)
(13.44e)