1 Introduction

Throughout this paper, we assume that

$$\begin{aligned} X \text { is a Euclidean space,} \end{aligned}$$
(1)

i.e., a finite-dimensional real Hilbert space with inner product \(\left\langle {\cdot },{\cdot } \right\rangle \) and induced norm \(\Vert \cdot \Vert \), and

$$\begin{aligned} A \text { and } B \text { are closed convex subsets of } X \text { such that } A\cap B\ne \varnothing . \end{aligned}$$
(2)

Consider the convex feasibility problem

$$\begin{aligned} \text {find a point in } A\cap B \end{aligned}$$
(3)

and assume that it is possible to evaluate the projectors (nearest point mappings) \(P_A\) and \(P_B\) corresponding to A and B, respectively. We denote the corresponding reflectors by \(R_A := 2P_A-{\text {Id}}\) and \(R_B := 2P_B-{\text {Id}}\), respectively. Projection methods combine the projectors and reflectors in a suitable way to generate a sequence converging to a solution of (3)—we refer the reader to [2, 8], and [9] and the references therein for further information.

One celebrated algorithm for solving (3) is the so-called Douglas–Rachford algorithm (DRA) [11]. The adaptation of this algorithm to optimization and feasibility problems, which is actually due to Lions and Mercier, was laid out beautifully in their landmark paper [16] (see also [12]). The DRA is based on the Douglas–Rachford splitting operator,

$$\begin{aligned} T :={\text {Id}}-P_A +P_BR_A, \end{aligned}$$
(4)

which is used to generate a sequence \((z_n)_{n\in {\mathbb {N}}}\) with starting point \(z_0\in X\) via

$$\begin{aligned} (\forall {n\in {\mathbb {N}}}) \quad z_{n+1} := Tz_n. \end{aligned}$$
(5)

Then the “governing sequence” \((z_n)_{n\in {\mathbb {N}}}\) converges to a point \(z\in {\text {Fix }}T\), and, more importantly, the “shadow sequence” \((P_Az_n)_{n\in {\mathbb {N}}}\) converges to \(P_Az\) which is a solution of (3).
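
To make (4) and (5) concrete, the following minimal Python sketch (ours; the paper itself contains no code, and the choice of a line and a closed ball in \(\mathbb {R}^2\) is purely illustrative) iterates the governing sequence and reads off the shadow point.

```python
import numpy as np

# Toy instance in R^2: A is the x-axis, B is the closed unit ball
# centered at (0, 0.5); the two sets intersect, as required by (2).
def P_A(z):
    return np.array([z[0], 0.0])

def P_B(z):
    c = np.array([0.0, 0.5])
    d = z - c
    n = np.linalg.norm(d)
    return z if n <= 1.0 else c + d / n

def T(z):  # the Douglas-Rachford splitting operator (4)
    r_a = 2.0 * P_A(z) - z          # reflector R_A = 2 P_A - Id
    return z - P_A(z) + P_B(r_a)

z = np.array([5.0, 3.0])            # starting point z_0
for _ in range(100):
    z = T(z)                        # governing sequence (5)
print(P_A(z))                       # shadow point P_A z, a solution of (3)
```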

An important question concerns the speed of convergence of the sequence \((P_Az_n)_{n\in {\mathbb {N}}}\). Linear convergence has recently become much better understood; see [14, 21], and [6].

The aim of this paper is to provide verifiable conditions sufficient for finite convergence.

Our two main results reveal that Slater’s condition, i.e.,

$$\begin{aligned} A\cap {\text {int}}\,B \ne \varnothing \end{aligned}$$
(6)

plays a key role and guarantees finite convergence when (MR1) A is an affine subspace and B is a polyhedron (Theorem 3.7); or when (MR2) A is a certain hyperplane and B is an epigraph (Theorem 5.4). Examples illustrate that these results are applicable in situations where previously known conditions sufficient for finite convergence fail. When specialized to a product space setting, we derive a finite-convergence result due to Spingarn [26] for his method of partial inverses [25]. Indeed, the proof of Theorem 3.7 follows his pioneering work, but, at the same time, we simplify his proofs and strengthen the conclusions. These sharpenings allow us to obtain finite-convergence results for solving linear equations with a positivity constraint. Numerical experiments support the competitiveness of the DRA for solving (3).

It would be interesting to derive additional conditions guaranteeing finite convergence, not only in the present finite-dimensional setting but also in a more general infinite-dimensional setting. Moreover, how these results generalize (if at all) from convex feasibility to finding the zero of the sum of two maximally monotone operators is currently quite unclear.

1.1 Organization of the paper

The paper is organized as follows. In Sect. 2, we present several auxiliary results which make the eventual proofs of the main results more structured and transparent. Section 3 contains the first main result (MR1). Applications using the product space setup, a comparison with Spingarn’s work, and numerical experiments are provided in Sect. 4. The final Sect. 5 concerns the second main result (MR2).

1.2 Notation

The notation employed is standard and follows largely [2]. The real numbers are \(\mathbb {R}\), and the nonnegative integers are \(\mathbb {N}\). Further, \(\mathbb {R}_+:= \left\{ {x \in \mathbb {R}}~\big |~{x \ge 0}\right\} \) and \(\mathbb {R}_{++}:= \left\{ {x \in \mathbb {R}}~\big |~{x >0}\right\} \). Let C be a subset of X. Then the closure of C is \(\overline{C}\), the interior of C is \({\text {int}}\,C\), the boundary of C is \({\text {bdry }}C\), and the smallest affine and linear subspaces containing C are, respectively, \({\text {aff }}C\) and \({{\text {span}}}\,C\). The relative interior of C, \({\text {ri}}\,C\), is the interior of C relative to \({\text {aff }}C\). The orthogonal complement of C is \(C^\perp :=\left\{ {y \in X}~\big |~{(\forall x\in C)\; \left\langle {x},{y} \right\rangle =0}\right\} \), and the dual cone of C is \(C^\oplus :=\left\{ {y \in X}~\big |~{(\forall x \in C)\; \left\langle {x},{y} \right\rangle \ge 0}\right\} \). The normal cone operator of C is denoted by \(N_C\), i.e., \(N_C(x) =\left\{ {y \in X}~\big |~{(\forall c \in C)\; \left\langle {y},{c-x} \right\rangle \le 0}\right\} \) if \(x \in C\), and \(N_C(x) =\varnothing \) otherwise. If \(x \in X\) and \(\rho \in \mathbb {R}_{++}\), then \({\text {ball}}({x};{\rho }) :=\left\{ {y \in X}~\big |~{\Vert x -y\Vert \le \rho }\right\} \) is the closed ball centered at x with radius \(\rho \).

2 Auxiliary results

In this section, we collect several auxiliary results that will be useful in the sequel.

2.1 Convex sets

Lemma 2.1

Let C be a nonempty closed convex subset of X, let \(x \in X\), and let \(y \in C\). Then \(x -P_Cx \in N_C(y)\) \(\Leftrightarrow \) \(\left\langle {x -P_Cx},{y -P_Cx} \right\rangle =0\).

Proof

Because \(y\in C\) and \(x-P_Cx\in N_C(P_Cx)\), we have \(\left\langle {x-P_Cx},{y-P_Cx} \right\rangle \le 0\). “\(\Rightarrow \)”: From \(x-P_Cx\in N_C(y)\) and \(P_Cx \in C\), we have \(\left\langle {x-P_Cx},{P_Cx-y} \right\rangle \le 0\). Thus \(\left\langle {x-P_Cx},{P_Cx-y} \right\rangle =0\). “\(\Leftarrow \)”: We have \((\forall c\in C)\) \(\left\langle {x-P_Cx},{c-y} \right\rangle = \left\langle {x-P_Cx},{c-P_Cx} \right\rangle +\left\langle {x-P_Cx},{P_Cx-y} \right\rangle = \left\langle {x-P_Cx},{c-P_Cx} \right\rangle \le 0\). Hence \(x-P_Cx\in N_C(y)\). \(\square \)

Lemma 2.2

Let C be a nonempty convex subset of X. Then \({\text {int}}\,C \ne \varnothing \) \(\Leftrightarrow \) \(0 \in {\text {int}}(C-C)\).

Proof

“\(\Rightarrow \)”: Clear. “\(\Leftarrow \)”: By [22, Theorem 6.2], \({\text {ri}}\,C \ne \varnothing \). After translating the set if necessary, we assume that \(0 \in {\text {ri}}\,C\). Then \(0 \in C\), and so \(Y :={\text {aff }}C ={{\text {span}}}\,C\). Since \(0 \in {\text {int}}(C -C)\subseteq {\text {int}}({{\text {span}}}\,C)={\text {int}}\,Y\), this gives \({\text {int}}\,Y \ne \varnothing \) and thus \(Y =X\). In turn, \({\text {int}}\,C ={\text {ri}}\,C \ne \varnothing \). \(\square \)

2.2 Cones

Lemma 2.3

Let K be a nonempty convex cone in X. Then there exists \(v \in {\text {ri}}\,K \cap {\text {ri}}\,K^\oplus \) such that

$$\begin{aligned} (\forall x \in \overline{K}{\backslash }(\overline{K}\cap (-\overline{K}))) \quad \left\langle {v},{x} \right\rangle >0. \end{aligned}$$
(7)

Proof

By [26, Lemma 2], there exists \(v \in {\text {ri}}\,K \cap {\text {ri}}\,K^\oplus \). Then \(v \in {\text {ri}}\,K^\oplus ={\text {ri}}\,\overline{K}^\oplus \). Noting that \(\overline{K}\) is a closed convex cone and using [26, Lemma 3], we complete the proof. \(\square \)

Lemma 2.4

Let \((z_n)_{n\in {\mathbb {N}}}\) be a sequence in X, and let \(f :X \rightarrow \mathbb {R}\) be linear. Assume that \(z_n \rightarrow z \in X\), and that

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad f(z_n) >f(z_{n+1}). \end{aligned}$$
(8)

Then there exist \(n_0 \in \mathbb {N}{\backslash } \{0\}\) and \((\mu _1, \ldots , \mu _{n_0}) \in \mathbb {R}_+^{n_0}\) such that \(\sum _{k=1}^{n_0}\mu _k=1\) and

$$\begin{aligned} \left\langle {\sum _{k=1}^{n_0} \mu _k(z_{k-1} -z_k)},{\sum _{k=1}^{n_0} \mu _k(z_k -z)} \right\rangle >0. \end{aligned}$$
(9)

Proof

Introducing

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad y_n :=z_{n-1} -z_n, \end{aligned}$$
(10)

we get

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad f(y_n) >0, \quad \text {and so}\quad y_n \ne 0. \end{aligned}$$
(11)

Let K be the convex cone generated by \(\{y_n\}_{n \in \mathbb {N}{\backslash } \{0\}}\). We see that

$$\begin{aligned} (\forall x \in K{\backslash }\{0\})\quad f(x) >0, \quad \text {and}\quad (\forall x \in -K{\backslash }\{0\})\quad f(x) <0. \end{aligned}$$
(12)

Therefore,

$$\begin{aligned} \overline{K}\cap (-\overline{K}) \subseteq \left\{ {x \in X}~\big |~{f(x) =0}\right\} . \end{aligned}$$
(13)

Setting

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad w_n: =z_n -z, \end{aligned}$$
(14)

we immediately have \(w_n \rightarrow 0\), and so

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad w_n =z_n -z =\sum _{k=n+1}^\infty (z_{k-1} -z_k) =\sum _{k=n+1}^\infty y_k \in \overline{K}. \end{aligned}$$
(15)

From (8) we get

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad f(w_n) >f(w_{n+1}). \end{aligned}$$
(16)

Moreover, \(f(w_n) \rightarrow f(0) =0\), hence

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad f(w_n) >0. \end{aligned}$$
(17)

Together with (13) and (15), this gives

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad w_n \in \overline{K}{\backslash }(\overline{K}\cap (-\overline{K})). \end{aligned}$$
(18)

By Lemma 2.3, there exists \(v \in {\text {ri}}\,K\cap {\text {ri}}\,K^\oplus \) such that

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad \left\langle {v},{w_n} \right\rangle >0. \end{aligned}$$
(19)

Then we must have \(v \ne 0\). Since \(v \in K\), after scaling if necessary, there exist \(n_0\in \mathbb {N}{\backslash }\{0\}\) and \((\mu _1,\ldots ,\mu _{n_0})\in \mathbb {R}_+^{n_0}\) such that

$$\begin{aligned} v =\sum _{k=1}^{n_0} \mu _k y_k, \quad \text {and}\quad \sum _{k=1}^{n_0} \mu _k =1. \end{aligned}$$
(20)

This combined with (19) implies

$$\begin{aligned} \left\langle {\sum _{k=1}^{n_0} \mu _k y_k},{\sum _{k=1}^{n_0} \mu _k w_k} \right\rangle >0, \end{aligned}$$
(21)

and so (9) holds. \(\square \)

Lemma 2.5

Let K be a nonempty pointed (i.e., \(K\cap (-K) \subseteq \{0\}\)) convex cone in X. Then the following hold:

  1. (i)

    Let \(m\in \mathbb {N}{\backslash }\{0\}\) and let \((x_1,\ldots ,x_m)\in K^m\). Then \(x_1 +\cdots +x_m =0\) \(\Leftrightarrow \) \(x_1 =\cdots =x_m =0\).

  2. (ii)

    If K is closed and \(L:X\rightarrow X\) is linear such that

    $$\begin{aligned} \ker L\cap K =\{0\}, \end{aligned}$$
    (22)

    then L(K) is a nonempty pointed closed convex cone.

Proof

  1. (i)

    Assume that \(x_1 +\cdots +x_m =0\). Then since K is a convex cone,

    $$\begin{aligned} -x_1 =x_2 +\cdots +x_m \in K, \end{aligned}$$
    (23)

    and so \(x_1 \in K\cap (-K)\). Since K is pointed, we get \(x_1 =0\). Continuing in this fashion, we eventually conclude that \(x_1 =\cdots =x_m =0\). The converse is trivial.

  2. (ii)

    Since K is a closed convex cone, so is \(M :=L(K)\) due to assumption (22) and [7, Proposition 3.4]. Now let \(z \in M\cap (-M)\). Then \(z =L(r) =-L(s)\) for some points \(r, s\) in K. Thus \(L(r +s) =L(r) +L(s) =0\), which gives \(r +s \in \ker L\), and so \(r +s \in \ker L\cap K\). By (22) again, \(r +s =0\), and now (i) implies \(r =s =0\). Therefore, \(z =0\), and M is pointed.

\(\square \)

Lemma 2.6

Let \((a_n)_{n\in {\mathbb {N}}}\) be a sequence in X such that \(a_n \rightarrow a \in X\), and let K be a pointed closed convex cone in X. Assume that

$$\begin{aligned} (\exists \,p\in \mathbb {N})\quad a_p=a\;\;\text {and}\;\; (\forall n \ge p)\quad a_n -a_{n+1} \in K. \end{aligned}$$
(24)

Then

$$\begin{aligned} (\forall n \ge p)\quad a_n =a. \end{aligned}$$
(25)

Proof

Since K is a closed convex cone and \(a_n \rightarrow a\), it follows from (24) that

$$\begin{aligned} (\forall n \ge p)\quad a_n -a =\sum _{k=n}^\infty (a_k -a_{k+1}) \in K, \end{aligned}$$
(26)

and so \(a_{p+1} -a \in K\). Since \(a_p =a\), (24) gives \(a -a_{p+1} \in K\). Noting that K is pointed, this implies \(a_{p+1} -a \in K\cap (-K) \subseteq \{0\}\), and hence \(a_{p+1} =a\). Repeating this argument, we get the conclusion. \(\square \)

2.3 Locally polyhedral sets

Definition 2.7

(Local polyhedrality) Let C be a subset of X. We say that C is polyhedral at \(c\in C\) if there exist a polyhedral set D (i.e., an intersection of finitely many closed halfspaces) and \(\varepsilon \in \mathbb {R}_{++}\) such that \(C\cap {\text {ball}}({c};{\varepsilon }) =D\cap {\text {ball}}({c};{\varepsilon })\).

It is clear from the definition that every polyhedron is polyhedral at each of its points and that every subset C of X is polyhedral at each point in \({\text {int}}\,C\).

Lemma 2.8

Let C be a subset of X, and assume that C is polyhedral at \(c \in C\). Then there exist \(\varepsilon \in \mathbb {R}_{++}\), a finite set I, \((d_i)_{i\in I}\in (X{\backslash }\{0\})^I\), and \((\delta _i)_{i\in I}\in \mathbb {R}^I\) such that

$$\begin{aligned} C\cap {\text {ball}}({c};{\varepsilon }) = \bigg \{{x\in X}~\bigg |~{\max _{i\in I} \big (\left\langle {d_i},{x} \right\rangle -\delta _i\big )\le 0}\bigg \} \cap {\text {ball}}({c};{\varepsilon }), \end{aligned}$$
(27)

\((\forall i\in I)\) \(\left\langle {d_i},{c} \right\rangle =\delta _i\), and

$$\begin{aligned} \big (\forall y \in C\cap {\text {ball}}({c};{\varepsilon })\big )\quad N_C(y) = \sum _{i\in I(y)}\mathbb {R}_+d_i, \quad \text {where }I(y) := \left\{ {i\in I}~\big |~{\left\langle {d_i},{y} \right\rangle =\delta _i}\right\} . \end{aligned}$$
(28)

Proof

Combine Lemma 2.12 with [24, Theorem 6.46]. \(\square \)

Lemma 2.9

Let C be a nonempty closed convex subset of X that is polyhedral at \(c \in C\). Then there exists \(\varepsilon \in \mathbb {R}_{++}\) such that

$$\begin{aligned} \big (\forall x\in P_C^{-1}(C\cap {\text {ball}}({c};{\varepsilon }))\big )\quad \left\langle {x -P_Cx},{c -P_Cx} \right\rangle =0 \;\;\text {and}\;\; x -P_Cx \in N_C(c). \end{aligned}$$
(29)

Proof

We adopt the notation of the conclusion of Lemma 2.8. Let \(x\in X\) be such that \(y := P_Cx\in {\text {ball}}({c};{\varepsilon })\). Then \(x-y\in N_C(y)\) and Lemma 2.8 guarantees the existence of \((\lambda _i)_{i\in I(y)}\in \mathbb {R}_+^{I(y)}\) such that

$$\begin{aligned} x -y =\sum _{i\in I(y)} \lambda _id_i. \end{aligned}$$
(30)

Then

$$\begin{aligned} \left\langle {x -y},{c -y} \right\rangle =\sum _{i\in I(y)} \lambda _i\left\langle {d_i},{c -y} \right\rangle =\sum _{i\in I(y)} \lambda _i(\left\langle {d_i},{c} \right\rangle -\left\langle {d_i},{y} \right\rangle ) =\sum _{i\in I(y)} \lambda _i(\delta _i-\delta _i) =0 \end{aligned}$$
(31)

and so \(\left\langle {x-P_Cx},{c-P_Cx} \right\rangle = 0\). Furthermore, by Lemma 2.1, \(x -P_Cx \in N_C(c)\). \(\square \)

2.4 Two convex sets

Proposition 2.10

Let A and B be closed convex subsets of X such that \(A\cap B\ne \varnothing \). Then the following hold:

  1. (i)

    \((\forall a\in A)(\forall b\in B)\quad N_{A-B}(a -b) =N_A(a)\cap \big (-N_B(b)\big )\).

  2. (ii)

    \(0 \in {\text {int}}(A -B) \quad \Leftrightarrow \quad N_{A-B}(0) =\{0\}\).

  3. (iii)

    \({\text {int}}\,B \ne \varnothing \quad \Leftrightarrow \quad (\forall x \in B)\; N_B(x) \cap (-N_B(x)) =\{0\}\).

Proof

(i) Noting that \(N_{-B}(-b) =-N_B(b)\) by definition, the conclusion follows from [20, Proposition 2.11(ii)] or from a direct computation. (ii) Clear from [2, Corollary 6.44]. (iii) Let \(x \in B\). By (i), \(N_B(x) \cap (-N_B(x)) =N_{B-B}(0)\). Now Lemma 2.2 and (ii) complete the proof. \(\square \)

Corollary 2.11

Let A be a linear subspace of X, and let B be a nonempty closed convex subset of X such that \(A\cap B\ne \varnothing \). Then the following hold:

  1. (i)

    \(0 \in {\text {int}}(A -B) \quad \Leftrightarrow \quad \left[ (\forall x \in A\cap B)\; A^\perp \cap N_B(x) =\{0\} \right] \).

  2. (ii)

    \(A\cap {\text {int}}\,B \ne \varnothing \quad \Leftrightarrow \quad \big [ 0 \in {\text {int}}(A -B) \quad \text {and}\quad {\text {int}}\,B \ne \varnothing \big ]\).

Proof

  1. (i)

    Let \(x \in A\cap B\). Since A is a linear subspace, Proposition 2.10(i) yields

    $$\begin{aligned} A^\perp \cap N_B(x) =-(N_A(x)\cap (-N_B(x))) =-N_{A-B}(0). \end{aligned}$$
    (32)

    Now apply Proposition 2.10(ii).

  2. (ii)

    If \(0 \in {\text {int}}(A -B)\) and \({\text {int}}\,B \ne \varnothing \), then \(0 \in {\text {ri}}(A -B)\) and \({\text {ri}}\,B={\text {int}}\,B\). Since A is a linear subspace, we have \({\text {ri}}\,A =A\), and using [22, Corollary 6.6.2], we get

    $$\begin{aligned} 0 \in {\text {ri}}(A -B) ={\text {ri}}\,A -{\text {ri}}\,B = A -{\text {int}}\,B, \end{aligned}$$
    (33)

    which implies \(A\cap {\text {int}}\,B \ne \varnothing \). The converse is obvious.

\(\square \)

Lemma 2.12

Let A and B be closed convex subsets of X, and let \(c\in A\cap B\) and \(\varepsilon \in \mathbb {R}_{++}\) be such that \(A\cap {\text {ball}}({c};{\varepsilon })=B\cap {\text {ball}}({c};{\varepsilon })\). Then \(N_A(c)=N_B(c)\).

Proof

Let \(u\in X\). Working with the directional derivative and using [2, Proposition 17.17(i)], we have \(u\in N_A(c) = \partial \iota _A(c)\) \(\Leftrightarrow \) \((\forall h\in X)\) \(\left\langle {u},{h} \right\rangle \le \iota _A'(c;h)\) \(\Leftrightarrow \) \((\forall h\in X)\) \(\left\langle {u},{h} \right\rangle \le \iota _B'(c;h)\) \(\Leftrightarrow \) \(u\in N_B(c) = \partial \iota _B(c)\). \(\square \)

2.5 Monotone operators

Lemma 2.13

Let \(L:X\times X\rightarrow X\times X:(x,y)\mapsto (L_{11}x+L_{12}y,L_{21}x+L_{22}y)\), where each \(L_{ij}:X\rightarrow X\) is linear. Assume that \(L_{11}^*L_{22} +L_{21}^*L_{12} ={\text {Id}}\) and that \(L_{11}^*L_{21}\) and \(L_{22}^*L_{12}\) are skew (i.e., each equals the negative of its adjoint). Then the following hold:

  1. (i)

    If \((x,y)\in X\times X\) and \((u,v)=L(x,y)\), then \(\left\langle {u},{v} \right\rangle =\left\langle {x},{y} \right\rangle \).

  2. (ii)

    Let \(M:X\rightrightarrows X\) be a monotone operator, and define \(M_L:X\rightrightarrows X\) via \({\text {gra }}M_L = L({\text {gra }}M)\). Then for all ordered pairs \((x,y)\) and \((x',y')\) in \({\text {gra }}M\) and \((u,v)=L(x,y)\), \((u',v')=L(x',y')\), we have \(\left\langle {u-u'},{v-v'} \right\rangle = \left\langle {x-x'},{y-y'} \right\rangle \); consequently, \(M_L\) is monotone.

Proof

  1. (i)

    The assumptions indeed imply

    $$\begin{aligned} \left\langle {u},{v} \right\rangle&=\left\langle {L_{11}x +L_{12}y},{L_{21}x +L_{22}y} \right\rangle \end{aligned}$$
    (34a)
    $$\begin{aligned}&=\left\langle {x},{(L_{11}^*L_{22} +L_{21}^*L_{12})y} \right\rangle +\left\langle {x},{L_{11}^*L_{21}x} \right\rangle +\left\langle {y},{L_{22}^*L_{12}y} \right\rangle =\left\langle {x},{y} \right\rangle . \end{aligned}$$
    (34b)
  2. (ii)

    Since \((x-x',y-y') = (x,y)-(x',y')\) and L is linear, the result follows from (i).

\(\square \)

Corollary 2.14

Let A be a linear subspace of X, and let \((x, y)\) and \((x',y')\) be in \(X\times X\). Then the following hold:

  1. (i)

    \(\left\langle {P_Ay -P_{A^\perp }x},{P_Ax -P_{A^\perp }y} \right\rangle =\left\langle {y},{x} \right\rangle \).

  2. (ii)

    \(\left\langle {(P_Ay -P_{A^\perp }x)-(P_Ay' -P_{A^\perp }x')},{(P_Ax -P_{A^\perp }y)-(P_Ax' -P_{A^\perp }y')} \right\rangle =\left\langle {y-y'},{x-x'} \right\rangle \).

Proof

Set \(L_{11}=L_{22}=P_A\) and \(L_{12}=L_{21}=-P_{A^\perp }\) in Lemma 2.13. \(\square \)
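
The identity in (i) is easy to confirm numerically. The following Python snippet (ours; the random five-dimensional instance is an illustrative assumption) is a sanity check, not part of the formal development.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal basis of A
PA = Q @ Q.T                                       # orthogonal projector onto A
PAp = np.eye(5) - PA                               # projector onto A-perp

x, y = rng.standard_normal(5), rng.standard_normal(5)
lhs = np.dot(PA @ y - PAp @ x, PA @ x - PAp @ y)   # left side of Corollary 2.14(i)
print(np.isclose(lhs, np.dot(y, x)))               # True
```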

Remark 2.15

Spingarn’s original partial inverse [25] arises from Lemma 2.13 by setting \(L_{11}=L_{22}=P_A\) and \(L_{12}=L_{21}=P_{A^\perp }\), while in Corollary 2.14 we used \(L_{11}=L_{22}=P_A\) and \(L_{12}=L_{21}=-P_{A^\perp }\). Other choices are possible: e.g., if \(X=\mathbb {R}^2\) and R denotes the clockwise rotator by \(\pi /4\), then a valid choice for Lemma 2.13 is \(L_{11}=L_{22}=\tfrac{1}{\sqrt{2}}R\) and \(L_{12}=L_{21}=\tfrac{1}{\sqrt{2}}R^*\); the scaling \(\tfrac{1}{\sqrt{2}}\) is forced by the requirement \(L_{11}^*L_{22} +L_{21}^*L_{12} ={\text {Id}}\).

2.6 Finite-convergence conditions for the proximal point algorithm

It is known (see, e.g., [12, Theorem 6]) that the DRA is a special case of the exact proximal point algorithm (with constant parameter 1). The latter generates, for a given maximally monotone operator \(M:X\rightrightarrows X\) with resolvent \(T := J_M = ({\text {Id}}+M)^{-1}\), a sequence by

$$\begin{aligned} z_0 \in X, \quad (\forall {n\in {\mathbb {N}}})\quad z_{n+1} :=Tz_n = ({\text {Id}}+M)^{-1}z_n, \end{aligned}$$
(35)

in order to solve the problem

$$\begin{aligned} \text {find }z\in X \text { such that }0\in Mz; \text { equivalently}, z\in {\text {Fix }}T. \end{aligned}$$
(36)

A classical sufficient condition dates back to Rockafellar (see [23, Theorem 3]) who proved finite convergence when

$$\begin{aligned} (\exists \,\bar{z}\in X)(\exists \,\delta \in \mathbb {R}_{++})\quad {\text {ball}}({0};{\delta }) \subseteq M\bar{z}. \end{aligned}$$
(37)

It is instructive to view this condition from the resolvent side:

$$\begin{aligned} (\exists \,\bar{z}\in X)(\exists \,\delta \in \mathbb {R}_{++})(\forall z\in X)\quad \Vert z-\bar{z}\Vert \le \delta \;\Rightarrow \; Tz=\bar{z}. \end{aligned}$$
(38)

Note that since \({\text {Fix }}T\) is convex, this implies that

$$\begin{aligned} {\text {Fix }}T = \{\bar{z}\} \end{aligned}$$
(39)

is a singleton, which severely limits the applicability of this condition.
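
As an illustration (ours), take \(M =\partial |\cdot |\) on \(X =\mathbb {R}\): then \(M(0) =[-1, 1] \supseteq {\text {ball}}({0};{\delta })\) for every \(\delta \le 1\), so (37) holds with \(\bar{z} =0\); consistent with (39), \({\text {Fix }}T =\{0\}\) is a singleton, and the resolvent, here the soft-thresholding map, reaches it in finitely many steps.

```python
def J(z):
    # Resolvent (Id + M)^{-1} for M = subdifferential of |.| on R:
    # this is the soft-thresholding map with threshold 1.
    return (abs(z) - 1.0) * (1.0 if z > 0 else -1.0) if abs(z) > 1.0 else 0.0

z, steps = 7.3, 0
while z != 0.0:          # proximal point iteration (35); stops after finitely many steps
    z, steps = J(z), steps + 1
print(steps)             # 8 steps from z_0 = 7.3
```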

Later, Luque (see [17, Theorem 3.2]) proved finite convergence under the more general condition

$$\begin{aligned} M^{-1}0\ne \varnothing \;\text {and}\; (\exists \,\delta \in \mathbb {R}_{++})\;\; M^{-1}\big ({\text {ball}}({0};{\delta })\big )\subseteq M^{-1}0. \end{aligned}$$
(40)

On the resolvent side, his condition turns into

$$\begin{aligned} {\text {Fix }}T\ne \varnothing \;\text {and}\; (\exists \delta >0)\quad \Vert z -Tz\Vert \le \delta \quad \Rightarrow \quad Tz \in {\text {Fix }}T. \end{aligned}$$
(41)

However, when \(M^{-1}0={\text {Fix }}T\ne \varnothing \), it is well known that \(z_n-Tz_n\rightarrow 0\); hence, under (41), some iterate satisfies \(\Vert z_n -Tz_n\Vert \le \delta \) and finite convergence follows at once, so this condition essentially builds the conclusion into the hypothesis.

When illustrating our main results, we shall provide examples where both (37) and (40) fail while our results are applicable (see Remark 3.8 and Example 5.5 below).

2.7 Douglas–Rachford operator

For future use, we record some results on the DRA that are easily checked. Recall that, for two nonempty closed convex subsets A and B, the DRA operator is

$$\begin{aligned} T :={\text {Id}}-P_A +P_B(2P_A -{\text {Id}}) =\tfrac{1}{2}\left( {\text {Id}}+R_BR_A\right) . \end{aligned}$$
(42)

The following result, the proof of which we omit since it is a direct verification, records properties in the presence of affinity/linearity.

Proposition 2.16

Let A be an affine subspace of X and let B be a nonempty closed convex subset of X. Then the following hold:

  1. (i)

    \(P_A\) is an affine operator and

    $$\begin{aligned} P_AR_A =P_A, \quad P_AT =P_AP_BR_A \quad \text {and}\quad T =(P_A +P_B -{\text {Id}})R_A. \end{aligned}$$
    (43)
  2. (ii)

    If A is a linear subspace, then \(P_A\) is a symmetric linear operator and

    $$\begin{aligned} T = \big (P_AP_B -P_{A^\perp }({\text {Id}}-P_B)\big )R_A. \end{aligned}$$
    (44)

The next result will be used in Sect. 4.1 below to clarify the connection between the DRA and Spingarn’s method.

Lemma 2.17

Let A be a linear subspace of X, let B be a nonempty closed convex subset of X, let \((a,a^\perp )\in A\times A^\perp \), and set \((a_+,a_+^\perp ) := (P_AP_B(a+a^\perp ), P_{A^\perp }({\text {Id}}-P_B)(a+a^\perp ))\in A\times A^\perp \). Then \(T(a-a^\perp ) = a_+-a_+^\perp \).

Proof

Clearly, \(a_+\in A\) and \(a_+^\perp \in A^\perp \). Since \(R_A(a-a^\perp ) = (P_A-P_{A^\perp })(a-a^\perp ) = a+ a^\perp \), the conclusion follows from (44). \(\square \)

3 The affine-polyhedral case with Slater’s condition

In this section, we state and prove finite convergence of the DRA in the case where A is an affine subspace and B is a polyhedral set such that Slater’s condition, \(A\cap {\text {int}}\,B\ne \varnothing \), is satisfied. We start by recalling our standing assumptions. We assume that

$$\begin{aligned} X \text { is a finite-dimensional real Hilbert space,} \end{aligned}$$
(45)

and that

$$\begin{aligned} A \text { and } B \text { are closed convex subsets of } X \text { such that } A\cap B \ne \varnothing . \end{aligned}$$
(46)

The DRA is based on the operator

$$\begin{aligned} T :={\text {Id}}-P_A +P_BR_A. \end{aligned}$$
(47)

Given a starting point \(z_0 \in X\), the DRA sequence \((z_n)_{n\in {\mathbb {N}}}\) is generated by

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad z_{n+1} :=Tz_n. \end{aligned}$$
(48a)

We also set

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad a_n :=P_Az_n, \quad r_n :=R_Az_n =2a_n -z_n. \end{aligned}$$
(48b)

We now state the basic convergence result for the DRA.

Fact 3.1

(Convergence of DRA) The DRA sequences (48) satisfy

$$\begin{aligned} z_n \rightarrow z \in {\text {Fix }}T =(A\cap B) +N_{A-B}(0) \quad \text {and}\quad a_n \rightarrow P_Az \in A\cap B. \end{aligned}$$
(49)

Proof

Combine [3, Corollary 3.9 and Theorem 3.13]. \(\square \)

Fact 3.1 can be strengthened when a constraint qualification is satisfied.

Lemma 3.2

Suppose that \(0 \in {\text {int}}(A -B)\). Then there exists a point \(c\in A\cap B\) such that the following hold for the DRA sequences (48):

  1. (i)

    \({\text {ri}}(A)\cap {\text {ri}}(B)\ne \varnothing \) and hence \((z_n)_{n\in {\mathbb {N}}}\), \((a_n)_{n\in {\mathbb {N}}}\) and \((r_n)_{n\in {\mathbb {N}}}\) converge linearly to c.

  2. (ii)

    If \(c \in {\text {int}}\,B\), then the convergence of \((z_n)_{n\in {\mathbb {N}}}\), \((a_n)_{n\in {\mathbb {N}}}\) and \((r_n)_{n\in {\mathbb {N}}}\) to c is finite.

Proof

  1. (i)

    From \(0 \in {\text {int}}(A -B)\), we have \(N_{A-B}(0) =\{0\}\) due to Proposition 2.10. This gives \({\text {Fix }}T =A\cap B\), and Fact 3.1 implies \(z_n \rightarrow c \in A\cap B\) and \(a_n \rightarrow P_Ac =c\). Then \(r_n =2a_n -z_n \rightarrow c\). Using again \(0 \in {\text {int}}(A-B)\), [22, Corollary 6.6.2] yields \(0 \in {\text {ri}}(A -B) ={\text {ri}}\,A -{\text {ri}}\,B\), and thus \({\text {ri}}\,A\cap {\text {ri}}\,B \ne \varnothing \). Now the linear convergence follows from [21, Theorem 4.14] or from [6, Theorem 8.5(i)].

  2. (ii)

    Since \(r_n \rightarrow c\) and \(a_n \rightarrow c\) by (i), there exists \(n \in \mathbb {N}\) such that \(r_n \in B\) and \(a_n \in B\). Then \(P_Br_n =r_n\) and

    $$\begin{aligned} z_{n+1} =z_n -a_n +P_Br_n =z_n -a_n +r_n =a_n \in A\cap B = {\text {Fix }}T. \end{aligned}$$
    (50)

    Hence \(a_n=z_{n+1}=z_{n+2}=\cdots \) and we are done.

\(\square \)

Lemma 3.3

Suppose that A is a linear subspace. Then the DRA sequences (48) satisfy

$$\begin{aligned} a_n =P_Az_n =P_Ar_n \quad \text {and}\quad a_{n+1} =P_ATz_n = P_AP_BR_Az_n = P_AP_Br_n, \end{aligned}$$
(51)

and

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad a_n -a_{n+1} =P_A(r_n -P_Br_n). \end{aligned}$$
(52)

Proof

(51): Clear from Proposition 2.16(i). (52): Use (51) and the linearity of \(P_A\). \(\square \)

Lemma 3.4

Suppose that A is a linear subspace and that for the DRA sequences (48) there exists \(p \in \mathbb {N}\) such that \(a_p =a_{p+1} =c \in A\cap B\), and that there is a subset N of X such that \(r_p -P_Br_p \in N\) and \(A^\perp \cap N =\{0\}\). Then \((\forall n\ge p+1)\) \(z_{n}=c\).

Proof

Since \(a_p -a_{p+1} =0\), (52) implies \(r_p -P_Br_p \in A^\perp \cap N=\{0\}\). Thus \(P_Br_p =r_p\) and therefore \(z_{p+1} =z_p - a_p + r_p =a_p =c \in A\cap B \subseteq {\text {Fix }}T\). \(\square \)

Lemma 3.5

Suppose that A is a linear subspace and let \(a \in A\). Then the DRA sequence (48a) satisfies

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad \left\langle {z_n -z_{n+1}},{z_{n+1} -a} \right\rangle =\left\langle {r_n -P_Br_n},{P_Br_n -a} \right\rangle , \end{aligned}$$
(53a)

and

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})(\forall m\in \mathbb {N}) \quad \left\langle {z_{n+1}-z_{m+1}},{(z_n-z_{n+1})-(z_m-z_{m+1})} \right\rangle \ge 0. \end{aligned}$$
(53b)

Proof

Let \(n\in \mathbb {N}\). Using (51), we find that

$$\begin{aligned} z_n -z_{n+1} =a_n -P_Br_n =P_Ar_n -P_Br_n =P_A(r_n -P_Br_n) -P_{A^\perp }(P_Br_n -a). \end{aligned}$$
(54)

Next, (44) yields

$$\begin{aligned} z_{n+1} -a =P_AP_Br_n -P_{A^\perp }(r_n -P_Br_n) -a =P_A(P_Br_n -a) -P_{A^\perp }(r_n -P_Br_n). \end{aligned}$$
(55)

Moreover, \(r_n -P_Br_n \in N_B(P_Br_n) =N_{B-a}(P_Br_n -a)\) and \(N_{B-a}\) is a monotone operator. The result thus follows from Corollary 2.14. \(\square \)

Lemma 3.6

Suppose that A is a linear subspace and that the DRA sequences (48) satisfy

$$\begin{aligned} z_n \rightarrow a \in A\quad \text {and}\quad (\forall {n\in {\mathbb {N}}})\quad \left\langle {r_n -P_Br_n},{a -P_Br_n} \right\rangle =0. \end{aligned}$$
(56)

Then there is no linear functional \(f :X \rightarrow \mathbb {R}\) such that

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad f(z_n) >f(z_{n+1}). \end{aligned}$$
(57)

Proof

Suppose to the contrary that there exists a linear functional \(f :X \rightarrow \mathbb {R}\) satisfying (57). Now set

$$\begin{aligned} (\forall n \in \mathbb {N}{\backslash } \{0\})\quad y_n :=z_{n-1} -z_n \quad \text {and}\quad w_n :=z_n -a. \end{aligned}$$
(58)

On the one hand, Lemma 2.4 yields \(n_0 \in \mathbb {N}{\backslash } \{0\}\) and \((\mu _1, \ldots , \mu _{n_0}) \in \mathbb {R}_+^{n_0}\) such that

$$\begin{aligned} \sum _{k=1}^{n_0}\mu _k=1\quad \text {and}\quad \left\langle {\sum _{k=1}^{n_0} \mu _ky_k},{\sum _{k=1}^{n_0} \mu _kw_k} \right\rangle >0. \end{aligned}$$
(59)

On the other hand, Lemma 3.5 and (56) yield

$$\begin{aligned} (\forall k \in \mathbb {N}{\backslash } \{0\})(\forall j \in \mathbb {N}{\backslash } \{0\})\quad \left\langle {y_k},{w_k} \right\rangle =0 \quad \text {and}\quad \left\langle {y_k -y_j},{w_k -w_j} \right\rangle \ge 0; \end{aligned}$$
(60)

consequently, with the help of [2, Lemma 2.13(i)],

$$\begin{aligned} \left\langle {\sum _{k=1}^{n_0} \mu _k y_k},{\sum _{k=1}^{n_0} \mu _k w_k} \right\rangle =\sum _{k=1}^{n_0} \mu _k\left\langle {y_k},{w_k} \right\rangle -\tfrac{1}{2}\sum _{k=1}^{n_0}\sum _{j=1}^{n_0} \mu _k\mu _j\left\langle {y_k -y_j},{w_k -w_j} \right\rangle \le 0. \end{aligned}$$
(61)

Comparing (59) with (61), we arrive at the desired contradiction. \(\square \)

We are now ready for our first main result concerning the finite convergence of the DRA.

Theorem 3.7

(Finite convergence of DRA in the affine-polyhedral case) Suppose that A is an affine subspace, that B is polyhedral at every point in \(A\cap {\text {bdry }}B\), and that Slater’s condition

$$\begin{aligned} A\cap {\text {int}}\,B \ne \varnothing \end{aligned}$$
(62)

holds. Then the DRA sequences (48) converge in finitely many steps to a point in \(A\cap B\).

Proof

After translating the sets if necessary, we can and do assume that A is a linear subspace of X. By Corollary 2.11(ii), (62) yields

$$\begin{aligned} 0 \in {\text {int}}(A -B) \quad \text {and}\quad {\text {int}}\,B \ne \varnothing . \end{aligned}$$
(63)

Lemma 3.2(i) thus implies that \((z_n)_{n\in {\mathbb {N}}}\), \((a_n)_{n\in {\mathbb {N}}}\) and \((r_n)_{n\in {\mathbb {N}}}\) converge linearly to a point \(c \in A\cap B\). Since \(P_B\) is (firmly) nonexpansive, it also follows that \((P_Br_n)_{n\in {\mathbb {N}}}\) converges linearly to c. Since B is clearly polyhedral at every point in \({\text {int}}\,B\), it follows from the hypothesis that B is polyhedral at c. Because \(P_Br_n \rightarrow c\), Lemma 2.9 guarantees the existence of \(n_0 \in \mathbb {N}\) such that

$$\begin{aligned} (\forall n \ge n_0)\quad \left\langle {r_n -P_Br_n},{c -P_Br_n} \right\rangle =0, \end{aligned}$$
(64)

and

$$\begin{aligned} (\forall n \ge n_0)\quad r_n -P_Br_n \in N_B(c). \end{aligned}$$
(65)

Because of \({\text {int}}\,B \ne \varnothing \) and \(c\in B\), Proposition 2.10(iii) yields \(N_B(c)\cap (-N_B(c)) =\{0\}\). Hence \(N_B(c)\) is a nonempty pointed closed convex cone. Using again \(0 \in {\text {int}}(A -B)\), Corollary 2.11(i) gives

$$\begin{aligned} \ker P_A \cap N_B(c) =A^\perp \cap N_B(c) =\{0\}. \end{aligned}$$
(66)

In view of Lemma 2.5(ii),

$$\begin{aligned} K :=P_A\big (N_B(c)\big ) \text { is a nonempty pointed closed convex cone.} \end{aligned}$$
(67)

Combining (52) and (65), we obtain

$$\begin{aligned} (\forall n \ge n_0)\quad a_n -a_{n+1} \in K. \end{aligned}$$
(68)

Since \(a_n \rightarrow c\) and K is a closed convex cone, we have

$$\begin{aligned} (\forall n \ge n_0)\quad a_n -c =\sum _{k=n}^\infty (a_k -a_{k+1}) \in K. \end{aligned}$$
(69)

We now consider two cases.

Case 1: \((\exists \,p\ge n_0)\) \(a_p=c\).

Using (67) and (68), we deduce from Lemma 2.6 that \(a_p =a_{p+1} =c\). Now (65), (66), and Lemma 3.4 yield \((\forall n\ge p+1)\) \(z_n=c\) as required.

Case 2: \((\forall n\ge n_0)\) \(a_n\ne c\).

By (69), \((\forall n\ge n_0)\) \(a_n -c \in K{\backslash }\{0\}\). Since K is pointed [see (67)], Lemma 2.3 yields \(v \in {\text {ri}}\,K\cap {\text {ri}}\,K^\oplus \subseteq K\) such that

$$\begin{aligned} (\forall n \ge n_0)\quad \left\langle {v},{a_n -c} \right\rangle >0. \end{aligned}$$
(70)

Recalling (67), we get \(u\in N_B(c)\) such that \(v=P_Au\). Clearly, \((\forall n \ge n_0)\) \(\left\langle {u},{a_n -c} \right\rangle = \left\langle {u},{P_A(a_n -c)} \right\rangle = \left\langle {P_Au},{a_n -c} \right\rangle = \left\langle {v},{a_n -c} \right\rangle \). It follows from (70) that

$$\begin{aligned} (\forall n \ge n_0)\quad \left\langle {u},{a_n -c} \right\rangle > 0. \end{aligned}$$
(71)

Since \(u \in N_B(c)\) we also have

$$\begin{aligned} (\forall n \ge n_0)\quad \left\langle {u},{P_Br_n -c} \right\rangle \le 0. \end{aligned}$$
(72)

Now define a linear functional on X by

$$\begin{aligned} f :X \rightarrow \mathbb {R}:x \mapsto \left\langle {u},{x} \right\rangle . \end{aligned}$$
(73)

In view of (71) with (72), we obtain \((\forall n\ge n_0)\) \(f(z_n -z_{n+1}) =f(a_n -P_Br_n) =f(a_n -c) -f(P_Br_n -c) =\left\langle {u},{a_n -c} \right\rangle -\left\langle {u},{P_Br_n -c} \right\rangle >0\). Therefore,

$$\begin{aligned} (\forall n \ge n_0)\quad f(z_n) >f(z_{n+1}). \end{aligned}$$
(74)

However, since \(z_n \rightarrow c \in A\), (64) shows that the hypotheses (56) of Lemma 3.6 are satisfied by \((z_n)_{n \ge n_0}\) (with \(a =c\)), and (74) then contradicts that lemma. We deduce that Case 2 never occurs, which completes the proof of the theorem. \(\square \)

Remark 3.8

Some comments on Theorem 3.7 are in order.

  1. (i)

    Suppose we replace Slater’s condition “\(A\cap {\text {int}}\,B\ne \varnothing \)” by “\(A\cap {\text {ri}}\,B \ne \varnothing \)”. Then one still obtains linear convergence (see [21, Theorem 4.14] or [6, Theorem 8.5(i)]); however, finite convergence fails in general: indeed, B can be chosen to be an affine subspace (see [1, Section 5]).

  2. (ii)

    Under the assumptions of Theorem 3.7, Rockafellar’s condition (37) is only applicable when both \(A=\{\bar{a}\}\) and \(\bar{a}\in {\text {int}}\,B\) hold. There are many examples where this condition is violated yet our condition is applicable (see, e.g., the scenario in the following item).

  3. (iii)

    Suppose that \(X =\mathbb {R}^2\), that \(A =\mathbb {R}\times \{0\}\) and that \(B ={\text {epi }}f\), where \(f:\mathbb {R}\rightarrow \mathbb {R}:x\mapsto |x|-1\). It is clear that B is polyhedral and that \(A\cap {\text {int}}\,B\ne \varnothing \). Define \((\forall \varepsilon \in \mathbb {R}_{++})\) \(z_\varepsilon := (1 +\varepsilon , \varepsilon )\). Then

    $$\begin{aligned} (\forall \varepsilon \in \mathbb {R}_{++})\quad Tz_{\varepsilon } = (1,\varepsilon )\notin {\text {Fix }}T = [-1,1]\times \{0\} \;\;\text {yet}\;\; \Vert z_\varepsilon -Tz_{\varepsilon }\Vert =\varepsilon . \end{aligned}$$
    (75)

    We conclude that (the resolvent reformulation of) Luque’s condition (41) fails.
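
The computation in (iii) is easy to reproduce. In the following Python sketch (ours), the projector onto \(B ={\text {epi}}(|\cdot |-1)\) is implemented by an elementary case analysis; this is our own illustrative implementation, not part of the paper.

```python
import numpy as np

def P_B(p):
    # Projector onto B = epi(|x| - 1): feasible points are fixed; otherwise
    # project onto the bounding lines y = x - 1 and y = -x - 1, keeping the
    # corner (0, -1) as a fallback candidate, and return the nearest one.
    x, y = p
    if y >= abs(x) - 1.0:
        return np.array([x, y])
    cands = [np.array([0.0, -1.0])]
    for s in (1.0, -1.0):
        t = (x + s * (y + 1.0)) / 2.0
        q = np.array([t, s * t - 1.0])
        if q[1] >= abs(q[0]) - 1.0 - 1e-12:
            cands.append(q)
    return min(cands, key=lambda q: np.linalg.norm(q - p))

def T(z):  # DR operator with A = R x {0}
    pa = np.array([z[0], 0.0])
    return z - pa + P_B(2.0 * pa - z)

for eps in (1.0, 0.1, 0.001):
    z = np.array([1.0 + eps, eps])
    print(T(z), np.linalg.norm(z - T(z)))   # (1, eps) and eps, confirming (75)
```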

4 Applications

4.1 Product space setup and Spingarn’s method

Let us now consider a feasibility problem with possibly more than two sets, say

$$\begin{aligned} \text {find a point in } C :=\bigcap _{j=1}^M C_j, \end{aligned}$$
(76)

where

$$\begin{aligned} C_1, \ldots , C_M \text { are closed convex subsets of } X \text { such that }C \ne \varnothing . \end{aligned}$$
(77)

This problem is reduced to a two-set problem as follows. In the product Hilbert space \({\mathbf {X}}:=X^M\), with the inner product defined by \(((x_1, \ldots , x_M), (y_1, \ldots , y_M)) \mapsto \sum _{j=1}^M \left\langle {x_j},{y_j} \right\rangle \), we set

$$\begin{aligned} {\mathbf {A}}:=\left\{ {(x,\ldots , x) \in {\mathbf {X}}}~\big |~{x \in X}\right\} \quad \text {and}\quad {\mathbf {B}}:=C_1\times \cdots \times C_M. \end{aligned}$$
(78)

Because of

$$\begin{aligned} x \in \bigcap _{j=1}^M C_j \quad \Leftrightarrow \quad (x, \ldots , x) \in {\mathbf {A}}\cap {\mathbf {B}}, \end{aligned}$$
(79)

the M-set problem (76), which is formulated in X, is equivalent to the two-set problem

$$\begin{aligned} \text {find a point in }{\mathbf {A}}\cap {\mathbf {B}}, \end{aligned}$$
(80)

which is posed in \({\mathbf {X}}\). By, e.g., [2, Proposition 25.4(iii) and Proposition 28.3], the projections of \(\mathbf {x}=(x_1, \ldots , x_M) \in {\mathbf {X}}\) onto \({\mathbf {A}}\) and \({\mathbf {B}}\) are respectively given by

$$\begin{aligned} P_{\mathbf {A}}\mathbf {x}=\Big ( \frac{1}{M}\sum _{j=1}^M x_j, \ldots , \frac{1}{M}\sum _{j=1}^M x_j \Big ) \quad \text {and}\quad P_{\mathbf {B}}\mathbf {x}=\big ( P_{C_1}x_1, \ldots , P_{C_M}x_M \big ). \end{aligned}$$
(81)

This opens the door to applying the DRA in \({\mathbf {X}}\): indeed, set

$$\begin{aligned} {\mathbf {T}}:={\text {Id}}-P_{\mathbf {A}}+P_{\mathbf {B}}R_{\mathbf {A}}, \end{aligned}$$
(82)

fix a starting point \(\mathbf {z}_0 \in {\mathbf {X}}\), and generate the DRA sequence \((\mathbf {z}_n)_{n\in {\mathbb {N}}}\) via

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad \mathbf {z}_{n+1} := {\mathbf {T}}\mathbf {z}_n. \end{aligned}$$
(83)
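
As a concrete sketch (ours; the halfspace instance and the callable-projector interface are illustrative assumptions), (81)–(83) translate directly into Python:

```python
import numpy as np

def dra_product(projectors, Z, iters=100):
    """DRA (83) in X^M; Z holds the M components as rows, cf. (81)-(82)."""
    M = len(projectors)
    P_A = lambda Z: np.tile(Z.mean(axis=0), (M, 1))          # (81), diagonal part
    P_B = lambda Z: np.stack([P(z) for P, z in zip(projectors, Z)])
    for _ in range(iters):
        Z = Z - P_A(Z) + P_B(2.0 * P_A(Z) - Z)               # (82)
    return P_A(Z)[0]       # shadow point; approaches C, the intersection of the C_j

# Example: three halfspaces <u_j, x> <= 1 in R^2 with int C nonempty.
us = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
projectors = [lambda x, u=u: x - max(np.dot(u, x) - 1.0, 0.0) * u / np.dot(u, u)
              for u in us]
print(dra_product(projectors, np.tile([5.0, 4.0], (3, 1))))
```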

We now obtain the following result as a consequence of Theorem 3.7.

Corollary 4.1

Suppose that \(C_1,\ldots ,C_M\) are polyhedral sets such that \({\text {int}}\,C\ne \varnothing \). Then the DRA sequence defined by (83) converges finitely to \(\mathbf {z}=(z, \ldots , z) \in {\mathbf {A}}\cap {\mathbf {B}}\) with \(z \in C =\bigcap _{j=1}^M C_j\).

Proof

Since \({\text {int}}\,C \ne \varnothing \), there exist \(c \in C\) and \(\varepsilon \in \mathbb {R}_{++}\) such that \((\forall j\in \{1,\ldots ,M\})\) \({\text {ball}}({c};{\varepsilon }) \subseteq C_{j}\). Then \((c, \ldots , c) \in {\mathbf {A}}\cap {\text {int}}\,{\mathbf {B}}\), and so \({\mathbf {A}}\cap {\text {int}}\,{\mathbf {B}}\ne \varnothing \). Since \({\mathbf {B}}=C_1\times \cdots \times C_M\) is a polyhedral subset of \({\mathbf {X}}\), the conclusion now follows from Theorem 3.7. \(\square \)

Fig. 1 The DRA for the case when B is not a Cartesian product

Problem (76) was already considered by Spingarn [26] for the case where all sets \(C_1, \ldots , C_M\) are halfspaces. He cast the resulting problem into the form

$$\begin{aligned} \text {find }(\mathbf {a}, \mathbf {b}) \in {\mathbf {A}}\times {\mathbf {A}}^\perp \text { such that }\mathbf {b}\in N_{\mathbf {B}}(\mathbf {a}), \end{aligned}$$
(84)

and suggested solving it by a version of his method of partial inverses [25], which generates a sequence \((\mathbf {a}_n,\mathbf {b}_n)_{n\in {\mathbb {N}}}\) via

$$\begin{aligned} (\mathbf {a}_0, \mathbf {b}_0) \in {\mathbf {A}}\times {\mathbf {A}}^\perp \quad \text {and}\quad (\forall {n\in {\mathbb {N}}}) \quad {\left\{ \begin{array}{ll} \mathbf {a}'_n :=P_{\mathbf {B}}(\mathbf {a}_n +\mathbf {b}_n), &{} \mathbf {b}'_n :=\mathbf {a}_n +\mathbf {b}_n -\mathbf {a}'_n, \\ \mathbf {a}_{n+1} :=P_{\mathbf {A}}\mathbf {a}'_n, &{} \mathbf {b}_{n+1} :=\mathbf {b}'_n -P_{\mathbf {A}}\mathbf {b}'_n. \end{array}\right. } \end{aligned}$$
(85)

It was pointed out in [15, Section 1], [12, Section 5], [19, Appendix] and [10, Remark 2.4] that this scheme is closely related to the DRA. (In fact, Lemma 2.17 above makes it completely clear why (85) is equivalent to applying the DRA to \({\mathbf {A}}\) and \({\mathbf {B}}\), with starting point \((\mathbf {a}_0-\mathbf {b}_0)\).) However, Spingarn’s proof of finite convergence in [26] requires

$$\begin{aligned} \mathbf {a}_n -\mathbf {a}_{n+1} \in N_C(c) \times \cdots \times N_C(c), \end{aligned}$$
(86)

and he chooses a linear functional f based on the “diagonal” structure of \({\mathbf {A}}\); unfortunately, his proof does not work for problem (3). Our proof in the previous section simultaneously simplifies and strengthens his proof technique, allowing us to deal with polyhedral sets rather than just halfspaces. Since every polyhedron is an intersection of halfspaces, the two problems are theoretically equivalent; in practice, however, there can be huge savings because recasting a problem in Spingarn’s setup might lead to much larger instances of the product space \({\mathbf {X}}\)! Our extension also liberates us from being forced to work in the product space, and it is intrinsically more flexible, as the following example illustrates.
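
The equivalence provided by Lemma 2.17 can also be watched numerically. In the following sketch (ours, for an illustrative two-halfspace instance), the iterates of (85) and of the DRA satisfy \(\mathbf {a}_n -\mathbf {b}_n =\mathbf {z}_n\) at every step.

```python
import numpy as np

# Product space with M = 2 blocks in R^2; B = C_1 x C_2 for two halfspaces.
us = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
proj = [lambda x, u=u: x - max(np.dot(u, x) - 1.0, 0.0) * u / np.dot(u, u)
        for u in us]
P_A  = lambda Z: np.tile(Z.mean(axis=0), (2, 1))    # projector onto the diagonal A
P_Ap = lambda Z: Z - P_A(Z)                         # projector onto A-perp
P_B  = lambda Z: np.stack([P(z) for P, z in zip(proj, Z)])

rng = np.random.default_rng(1)
W = rng.standard_normal((2, 2))
a, b = P_A(W), P_Ap(W)            # (a_0, b_0) in A x A-perp
z = a - b                         # DRA starting point per Lemma 2.17

for _ in range(25):
    ap = P_B(a + b)               # Spingarn's recursion (85)
    bp = a + b - ap
    a, b = P_A(ap), P_Ap(bp)
    z = z - P_A(z) + P_B(2.0 * P_A(z) - z)   # one DRA step
    assert np.allclose(a - b, z)             # Lemma 2.17 at work
print(a[0])                       # approaches a point of C_1 cap C_2
```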

Example 4.2

Suppose that \(X =\mathbb {R}^2\), that \(A =\left\{ {(x, x)}~\big |~{x \in \mathbb {R}}\right\} \) is the diagonal, and that \(B =\left\{ {(x, y)\in \mathbb {R}^2}~\big |~{-y \le x \le 2}\right\} \). Clearly, B is polyhedral and \(A\cap {\text {int}}\,B\ne \varnothing \). Moreover, B is not the Cartesian product of two polyhedral subsets of \(\mathbb {R}\), i.e., of two intervals. Therefore, the proof of finite convergence of the DRA in [26] does not apply. However, A and B satisfy all assumptions of Theorem 3.7, and thus the DRA finds a point in \(A\cap B\) after finitely many steps, regardless of the starting point. See Fig. 1 for an illustration, created with GeoGebra [13].

4.2 Solving linear equalities with a strict positivity constraint

In this subsection, we assume that

$$\begin{aligned} X = \mathbb {R}^N, \end{aligned}$$
(87)

and that

$$\begin{aligned} A = \left\{ {x\in X}~\big |~{Lx=a}\right\} \;\;\text {and}\;\; B = \mathbb {R}^N_+, \end{aligned}$$
(88)

where \(L\in \mathbb {R}^{M\times N}\) and \(a\in \mathbb {R}^M\). Note that the set \(A\cap B\) is polyhedral yet has empty interior (unless \(A=X\), which is a case of little interest). Thus, Spingarn’s finite-convergence result is never applicable. However, Theorem 3.7 guarantees finite convergence of the DRA provided that Slater’s condition

$$\begin{aligned} \mathbb {R}^N_{++} \cap L^{-1}a \ne \varnothing \end{aligned}$$
(89)

holds. Using [5, Lemma 4.1], we obtain

$$\begin{aligned} P_A:X\rightarrow X:x\mapsto x -L^\dagger (Lx -a), \end{aligned}$$
(90)

where \(L^\dagger \) denotes the Moore-Penrose inverse of L. We also have (see, e.g., [2, Example 6.28])

$$\begin{aligned} P_B:X\rightarrow X:x =(\xi _1, \ldots , \xi _N) \mapsto (\xi _1^+, \ldots , \xi _N^+), \end{aligned}$$
(91)

where \(\xi ^+ = \max \{\xi ,0\}\) for every \(\xi \in \mathbb {R}\). This implies \(R_A:X\rightarrow X:x\mapsto x-2L^\dagger (Lx-a)\) and \(R_B:X\rightarrow X:x =(\xi _1, \ldots , \xi _N) \mapsto (|\xi _1|,\ldots ,|\xi _N|)\). We will compare three algorithms, all of which generate a governing sequence with starting point \(z_0\in X\) via

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad z_{n+1}=Tz_n. \end{aligned}$$
(92)

The DRA uses, of course,

$$\begin{aligned} T = {\text {Id}}-P_A + P_BR_A. \end{aligned}$$
(93)

The second method is the classical method of alternating projections (MAP) where

$$\begin{aligned} T = P_AP_B; \end{aligned}$$
(94)

while the third method of reflection–projection (MRP) employs

$$\begin{aligned} T = P_AR_B. \end{aligned}$$
(95)

We now illustrate the performance of these three algorithms numerically. For the remainder of this section, we assume that \(X =\mathbb {R}^2\), that \(A =\left\{ {(x, y) \in \mathbb {R}^2}~\big |~{x +5y =6}\right\} \), and that \(B =\mathbb {R}_+^2\). Then A and B satisfy (89), and the DRA sequence (92), with T given by (93), thus converges finitely to a point in \(A\cap B\) regardless of the starting point. See Fig. 2 for an illustration, created with GeoGebra [13]. Note that the shadow sequence \((P_Az_n)_{n\in {\mathbb {N}}}\) for the DRA finds a point in \(A\cap B\) even before a fixed point is reached. For each starting point \(z_0 \in [-100, 100]^2 \subseteq \mathbb {R}^2\), we perform the DRA until \(z_{n+1} =z_n\), and run the MAP and the MRP until \(d_B(z_n) =\max \{d_A(z_n), d_B(z_n)\} <\varepsilon \), where we set the tolerance \(\varepsilon =10^{-4}\). Figure 3 compares the number of iterations needed to stop each algorithm. Note that even though we put the DRA at an “unfair disadvantage” (it must find a true fixed point while the MAP and the MRP will stop with \(\varepsilon \)-feasible solutions), it does extremely well. In Fig. 4, we level the playing field and compare the distance from \(P_Az_n\) (for the DRA) or from \(z_n\) (for the MAP and the MRP) to B, where \(n \in \{5, 10\}\).
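
A minimal Python re-implementation of this comparison for a single starting point (ours; the published computations used MATLAB) builds the operators (90)–(95) for the instance above and counts iterations until the monitored point is within \(\varepsilon \) of B, in the spirit of the criterion (96) below.

```python
import numpy as np

L, a = np.array([1.0, 5.0]), 6.0                            # A = {z : <L, z> = 6}
P_A = lambda z: z - (np.dot(L, z) - a) * L / np.dot(L, L)   # (90) for a single row
P_B = lambda z: np.maximum(z, 0.0)                          # (91)
R_A = lambda z: 2.0 * P_A(z) - z
R_B = lambda z: 2.0 * P_B(z) - z

steps = {
    "DRA": lambda z: z - P_A(z) + P_B(R_A(z)),   # (93)
    "MAP": lambda z: P_A(P_B(z)),                # (94)
    "MRP": lambda z: P_A(R_B(z)),                # (95)
}

eps, z0 = 1e-4, np.array([-80.0, 60.0])
for name, T in steps.items():
    z, n = z0, 0
    # monitored point: P_A z for the DRA, z itself (already in A) otherwise
    while np.linalg.norm(P_B(P_A(z)) - P_A(z)) >= eps:      # i.e., d_B(P_A z) >= eps
        z, n = T(z), n + 1
    print(name, n)
```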

Fig. 2 The orbits of DRA (dashed), MAP (dotted) and MRP (solid)

Fig. 3 Number of iterations needed to reach a solution for DRA, MAP and MRP

Fig. 4 The distance from the monitored iterate to B after 5 steps (top) and 10 steps (bottom)

Now we look at the process of reaching a solution for each algorithm. For the DRA, we monitor the shadow sequence \((P_Az_n)_{n\in {\mathbb {N}}}\); for the MAP and the MRP, we monitor \((z_n)_{n\in {\mathbb {N}}}\), which is actually the same as \((P_Az_n)_{n\in {\mathbb {N}}}\). Because all three monitored sequences lie in A, we are concerned with the distance to B. Our stopping criterion is therefore

$$\begin{aligned} d_B(P_Az_n) < \varepsilon \end{aligned}$$
(96)

for all three algorithms. From top to bottom in Fig. 5, we record how many iterations are required to reach the tolerance \(\varepsilon =10^{-m}\) for \(m \in \{2, 4\}\), respectively. Computations were performed with MATLAB R2013b [18]. These experiments illustrate the superior convergence behaviour of the DRA compared to the MAP and the MRP.

Fig. 5 Number of iterations needed to reach the tolerance \(10^{-2}\) (top) and \(10^{-4}\) (bottom) for DRA, MAP and MRP

5 The hyperplanar-epigraphical case with Slater’s condition

In this section we assume that

$$\begin{aligned} f :X \rightarrow \mathbb {R}\text { is convex and continuous.} \end{aligned}$$
(97a)

We will work in \(X\times \mathbb {R}\), where we set

$$\begin{aligned} A :=X\times \{0\} \end{aligned}$$
(97b)

and

$$\begin{aligned} B :={\text {epi }}f :=\left\{ {(x, \rho ) \in X\times \mathbb {R}}~\big |~{f(x)\le \rho }\right\} . \end{aligned}$$
(97c)

Then

$$\begin{aligned} P_A :X\times \mathbb {R}\rightarrow X\times \mathbb {R}:(x, \rho ) \mapsto (x, 0), \end{aligned}$$
(98)

and the projection onto B is described in the following result.

Lemma 5.1

Let \((x, \rho ) \in (X\times \mathbb {R}){\backslash } B\). Then there exists \(p\in X\) such that \(P_B(x, \rho ) =(p, f(p))\),

$$\begin{aligned} x \in p +\big (f(p) -\rho \big )\partial f(p)\;\text {and}\; \rho <f(p) \le f(x) \end{aligned}$$
(99)

and

$$\begin{aligned} (\forall y \in X)\quad \left\langle {y -p},{x -p} \right\rangle \le \big (f(y) -f(p)\big )\big (f(p) -\rho \big ). \end{aligned}$$
(100)

Moreover, the following hold:

  1. (i)

    If u is a minimizer of f, then \(\left\langle {u -p},{x -p} \right\rangle \le 0\).

  2. (ii)

    If x is a minimizer of f, then \(p =x\).

  3. (iii)

    If x is not a minimizer of f, then \(f(p) <f(x)\).

Proof

According to [2, Proposition 28.28], there exists \((p,p^*)\in {\text {gra }}\partial f\) such that \(P_B(x, \rho ) =(p, f(p))\) and \(x = p +(f(p) -\rho )p^*\). Next, [2, Proposition 9.18] implies that \(\rho <f(p)\) and (100). The subgradient inequality gives

$$\begin{aligned} f(x) -f(p) \ge \left\langle {p^*},{x -p} \right\rangle =\left\langle {p^*},{(f(p) -\rho )p^*} \right\rangle =(f(p) -\rho )\Vert p^*\Vert ^2 \ge 0. \end{aligned}$$
(101)

Hence \(f(p) \le f(x)\), and this completes the proof of (99).

  1. (i)

    It follows from (100) that \(\left\langle {u -p},{x -p} \right\rangle \le (f(u) -f(p))(f(p) -\rho )\). Since \(f(p) -\rho >0\) and \(f(u) \le f(p)\), we have \(\left\langle {u -p},{x -p} \right\rangle \le 0\).

  2. (ii)

    Apply (i) with \(u =x\).

  3. (iii)

    By (99), \(\rho < f(p) \le f(x)\). We show the contrapositive and thus assume that \(f(p) =f(x)\). Since \(f(p) -\rho >0\), (101) yields \(p^* =0\), and so \(0 \in \partial f(p)\). It follows that p is a minimizer of f, and therefore x also minimizes f.

\(\square \)
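
For a differentiable f on \(X =\mathbb {R}\), (99) reduces the computation of \(P_B\) to a scalar root-finding problem: p solves \(p +(f(p) -\rho )f'(p) =x\). The sketch below (ours) uses the function \(f(x) =x^2 -1\) that reappears in Example 5.5, for which this equation is strictly increasing at the arguments used here and thus has a unique root; SciPy's brentq is merely an implementation convenience.

```python
import numpy as np
from scipy.optimize import brentq

f  = lambda x: x * x - 1.0        # the parabola used again in Example 5.5 below
df = lambda x: 2.0 * x

def P_epi(x, rho):
    """Projection of (x, rho) onto epi f, via the characterization (99)."""
    if f(x) <= rho:
        return x, rho                          # (x, rho) already lies in B
    # p solves g(p) = p + (f(p) - rho) f'(p) - x = 0; for the calls in these
    # sketches, g is strictly increasing, so the bracket holds a unique root.
    g = lambda p: p + (f(p) - rho) * df(p) - x
    r = 1.0 + abs(x) + abs(rho)
    p = brentq(g, -r, r)
    return p, f(p)

print(P_epi(1.5, -1.0))           # lands on the graph: rho < f(p) <= f(x), cf. (99)
```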

Remark 5.2

When \(X =\mathbb {R}\) and u is a minimizer of f, then \((u -p)(x -p) \le 0\) by Lemma 5.1(i). Therefore, p lies between x and u (see also [4, Corollary 4.2]).

Define the DRA operator by

$$\begin{aligned} T :={\text {Id}}-P_A +P_BR_A. \end{aligned}$$
(102)

It will be convenient to abbreviate

$$\begin{aligned} B' :=R_A(B) =\left\{ {(x, \rho ) \in X\times \mathbb {R}}~\big |~{\rho \le -f(x)}\right\} \end{aligned}$$
(103)

and to analyze the effect of performing one DRA step in the following result.

Corollary 5.3

(One DRA step) Let \(z =(x, \rho ) \in X\times \mathbb {R}\), and set \(z_+ :=(x_+, \rho _+) =T(x, \rho )\). Then the following hold:

  1. (i)

    Suppose that \(z \in B'\). Then \(z_+ =(x, 0) \in A\). Moreover, either (\(f(x)\le 0\) and \(z_+ \in A\cap B\)) or (\(f(x) > 0\) and \(z_+ \not \in B\cup B'\)).

  2. (ii)

    Suppose that \(z \not \in B'\). Then there exists \(x_+^* \in \partial f(x_+)\) such that

    $$\begin{aligned} x_+ =x -\rho _+x_+^*,\; f(x_+) \le f(x), \;\text {and}\; \rho _+ =\rho +f(x_+) >0. \end{aligned}$$
    (104)

    Moreover, either (\(\rho \ge 0\) and \(z_+ \in B\)) or (\(\rho <0\), \(z_+ \not \in B\cup B'\) and \(Tz_+ \in B\)).

  3. (iii)

    Suppose that \(z \in B\cap B'\). Then \(z_+ \in A\cap B\).

  4. (iv)

    Suppose that \(z \in B{\backslash } B'\). Then \(z_+ \in B\).

Proof

  1. (i)

    We have \(P_Az =(x, 0)\) and \(R_Az =(x, -\rho ) \in B\). Thus \(P_BR_Az =P_B(x, -\rho ) =(x, -\rho )\), which gives

    $$\begin{aligned} z_+ =({\text {Id}}-P_A +P_BR_A)z =(x, \rho ) -(x, 0) +(x, -\rho ) =(x, 0) \in A. \end{aligned}$$
    (105)

    If \(f(x) \le 0\), then \(z_+ =(x, 0) \in B\), and hence \(z_+ \in A\cap B\). Otherwise, \(f(x) >0\) which implies \(-f(x) <0\) and further \(z_+ =(x, 0) \not \in B\cup B'\).

  2. (ii)

    We have \(R_Az =(x, -\rho ) \not \in B\), and by (99), \(P_B(x, -\rho ) =(p, f(p))\), where

    $$\begin{aligned} p =x -(\rho +f(p))p^* \text { for some } p^* \in \partial f(p), \quad \text {and}\quad -\rho <f(p) \le f(x). \end{aligned}$$
    (106)

    We obtain

    $$\begin{aligned} z_+ =(x_+, \rho _+) =({\text {Id}}-P_A +P_BR_A)z =(x, \rho ) -(x, 0) +(p, f(p)) =(p, \rho +f(p)), \end{aligned}$$
    (107)

    which gives \(x_+ =p\) and \(\rho _+ =\rho +f(p) >0\), so (104) holds. If \(\rho \ge 0\), then \(\rho _+ =\rho +f(x_+) \ge f(x_+)\), and thus \(z_+ \in B\). Otherwise, \(\rho <0\), so \(\rho _+ <f(x_+)\) and also \(f(x_+) >-\rho >0\). Hence \(\rho _+ >0 >-f(x_+)\), which implies \(z_+ \not \in B\cup B'\), and then \(Tz_+ \in B\) because \(\rho _+ >0\) and the previous case applies.

  3. (iii)

    We have \(f(x) \le \rho \le -f(x)\), and so \(f(x) \le 0\). Now apply (i).

  4. (iv)

    We have \(\rho \ge f(x)\) and \(\rho >-f(x)\). Then \(\rho \ge |f(x)| \ge 0\), and (ii) gives \(z_+ \in B\).

\(\square \)
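
Corollary 5.3 makes one DRA step a two-line computation. The sketch below (ours) reuses f and P_epi from the snippet after Lemma 5.1 and already exhibits the finite convergence established in the next theorem (note that \(\inf _{\mathbb {R}} f =-1 <0\)).

```python
def T_step(x, rho):
    # One DRA step (102) with A = X x {0}: since P_A(x, rho) = (x, 0) and
    # R_A(x, rho) = (x, -rho), we get T(x, rho) = (p, rho + q), where
    # (p, q) = P_B(x, -rho); compare the proof of Corollary 5.3.
    p, q = P_epi(x, -rho)
    return p, rho + q

x, rho, n = 3.0, 2.0, 0
while f(x) > 0.0 or rho != 0.0:   # stop once (x, rho) lies in A cap B
    x, rho = T_step(x, rho)
    n += 1
print(x, rho, n)                  # a point of A cap B after finitely many steps
```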

Theorem 5.4

(Finite convergence of DRA in the hyperplanar-epigraphical case) Suppose that \(\inf _X f <0\), and, given a starting point \(z_0 =(x_0, \rho _0) \in X\times \mathbb {R}\), generate the DRA sequence \((z_n)_{n\in {\mathbb {N}}}\) by

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad z_{n+1} =(x_{n+1}, \rho _{n+1}) =Tz_n. \end{aligned}$$
(108)

Then \((z_n)_{n\in {\mathbb {N}}}\) converges finitely to a point \(z \in A\cap B\).

Proof

In view of Corollary 5.3(i)–(iii), we can and do assume that \(z_0 \in B{\backslash } B'\), where \(B'\) was defined in (103). It then follows from Corollary 5.3(iii)&(iv) that \((z_n)_{n\in {\mathbb {N}}}\) lies in B.

Case 1: \((\exists n \in \mathbb {N})\) \(z_{n}\in B\cap B'\).

By Corollary 5.3(iii), \(z_{n+1}\in A\cap B\) and we are done.

Case 2: \((\forall {n\in {\mathbb {N}}})\) \(z_n \in B{\backslash } B'\).

By Corollary 5.3(ii),

$$\begin{aligned} (\forall {n\in {\mathbb {N}}})\quad f(x_{n+1}) \le f(x_n) \;\;\text {and}\;\; \rho _{n+1} =\rho _n +f(x_{n+1}) >0. \end{aligned}$$
(109)

Next, it follows from [22, Lemma 7.3] that \({\text {int}}\,B =\left\{ {(x, \rho ) \in X\times \mathbb {R}}~\big |~{f(x)<\rho }\right\} \). Because \(\inf _X f <0\), we obtain \(A\cap {\text {int}}\,B \ne \varnothing \); hence \(0 \in {\text {int}}(A -B)\) by Corollary 2.11(ii), and Lemma 3.2(i) yields \(z_n \rightarrow z =(x, \rho ) \in A\cap B\). Since \(z \in A\), we must have \(\rho =0\). If \((\forall {n\in {\mathbb {N}}})\) \(f(x_{n+1}) \ge 0\), then, by (109), \(0<\rho _1\le \rho _2\le \cdots \le \rho _n\rightarrow \rho = 0\), which is absurd. Therefore,

$$\begin{aligned} (\exists \,n_0\in \mathbb {N})\quad f(x_{n_0+1}) <0. \end{aligned}$$
(110)

In view of (109), we see that \((\forall n \ge n_0+1)\) \(\rho _n \le \rho _{n_0} +(n -n_0)f(x_{n_0+1})\). Since \(f(x_{n_0+1}) <0\), there exists \(n_1 \in \mathbb {N}\), \(n_1 \ge n_0 +1\) such that

$$\begin{aligned} \rho _{n_0} +(n_1 -n_0)f(x_{n_0+1}) \le -f(x_{n_0+1}). \end{aligned}$$
(111)

Noting that \(f(x_{n_1}) \le f(x_{n_0+1})\), we then obtain

$$\begin{aligned} \rho _{n_1} \le \rho _{n_0} +(n_1 -n_0)f(x_{n_0+1}) \le -f(x_{n_0+1}) \le -f(x_{n_1}). \end{aligned}$$
(112)

Hence \(z_{n_1} \in B'\), which contradicts the assumption of Case 2. Therefore, Case 2 never occurs and the proof is complete. \(\square \)

We conclude by illustrating that finite convergence may be deduced from Theorem 5.4 but not necessarily from the finite-convergence conditions of Sect. 2.6.

Example 5.5

Suppose that \(X =\mathbb {R}\), that \(A =\mathbb {R}\times \{0\}\), and that \(B ={\text {epi }}f\), where \(f :\mathbb {R}\rightarrow \mathbb {R}:x\mapsto x^2 -1\). Let \((\forall \varepsilon \in \mathbb {R}_{++})\) \(z_\varepsilon = (1 +\varepsilon , -\varepsilon )\). Then \((\forall \varepsilon \in \mathbb {R}_{++})\) \(Tz_\varepsilon \notin {\text {Fix }}T =[-1,1]\times \{0\}\), and \(z_\varepsilon -Tz_\varepsilon \rightarrow 0\) as \(\varepsilon \rightarrow 0^+\). Consequently, Luque’s condition (41) fails.

Proof

Let \(\varepsilon \in \mathbb {R}_{++}\). Then \(-f(1 +\varepsilon ) =-2\varepsilon -\varepsilon ^2 <-\varepsilon \), and so \(z_\varepsilon =(1 +\varepsilon , -\varepsilon ) \notin B'\). By Corollary 5.3(ii), there exists \(x_\varepsilon \in \mathbb {R}\) such that

$$\begin{aligned} Tz_\varepsilon =(x_\varepsilon , -\varepsilon +f(x_\varepsilon )),\quad x_\varepsilon =1 +\varepsilon -(-\varepsilon +f(x_\varepsilon ))2x_\varepsilon , \quad \text {and}\quad -\varepsilon +f(x_\varepsilon ) >0. \end{aligned}$$
(113)

The last inequality shows that \(Tz_\varepsilon \notin {\text {Fix }}T =[-1,1]\times \{0\}\). It follows from the expression of \(x_\varepsilon \) that

$$\begin{aligned} 2x_\varepsilon ^3 -(1 +2\varepsilon )x_\varepsilon -1 -\varepsilon =0. \end{aligned}$$
(114)

Note that \((x_\varepsilon ,f(x_\varepsilon ))+ (0,-\varepsilon ) = Tz_\varepsilon = z_\varepsilon -P_Az_\varepsilon + P_BR_Az_\varepsilon = (1+\varepsilon ,-\varepsilon )-(1+\varepsilon ,0)+P_B(1+\varepsilon ,\varepsilon ) = P_B(1+\varepsilon ,\varepsilon )+(0,-\varepsilon )\) and hence \((x_\varepsilon , f(x_\varepsilon )) =P_B(1 +\varepsilon , \varepsilon )\). Remark 5.2 gives \(0 \le x_\varepsilon \le 1 +\varepsilon \). If \(0 \le x_\varepsilon \le 1\), then \(x_\varepsilon ^3 \le x_\varepsilon \) and (114) yields \(0 \le 2x_\varepsilon -(1 +2\varepsilon )x_\varepsilon -1 -\varepsilon =(x_\varepsilon -1) -2\varepsilon x_\varepsilon -\varepsilon <0\), which is absurd. Thus \(1 <x_\varepsilon \le 1 +\varepsilon \). Now as \(\varepsilon \rightarrow 0^+\), we have \(x_\varepsilon \rightarrow 1\), \(f(x_\varepsilon ) \rightarrow f(1) =0\), and at the same time, \(z_\varepsilon -Tz_\varepsilon =(1 +\varepsilon -x_\varepsilon , -f(x_\varepsilon )) \rightarrow 0\). \(\square \)
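
The claims are also easy to confirm numerically (our sketch): solve the cubic (114) for \(x_\varepsilon \) and assemble \(Tz_\varepsilon \) from (113).

```python
import numpy as np

f = lambda x: x * x - 1.0
for eps in (0.1, 0.01, 0.001):
    # x_eps is the unique root of (114) in the interval (1, 1 + eps]
    roots = np.roots([2.0, 0.0, -(1.0 + 2.0 * eps), -(1.0 + eps)])
    x = next(r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 1.0)
    z, Tz = np.array([1.0 + eps, -eps]), np.array([x, -eps + f(x)])  # cf. (113)
    print(Tz[1] > 0.0, np.linalg.norm(z - Tz))   # True, and the gap tends to 0
```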