Abstract
The ancient concept of circumcenter has recently given birth to the circumcentered-reflection method (CRM). CRM was first employed to solve best approximation problems involving affine subspaces. In this setting, it was shown to outperform the most prestigious projection-based schemes, namely, the Douglas-Rachford method (DRM) and the method of alternating projections (MAP). We now prove convergence of CRM for finding a point in the intersection of a finite number of closed convex sets. This striking result is derived under a suitable product space reformulation in which a point in the intersection of a closed convex set with an affine subspace is sought. It turns out that CRM skillfully tackles the reformulated problem. We also show that for a point in the affine set the CRM iteration is always closer to the solution set than both the MAP and DRM iterations. Our theoretical results and numerical experiments, showing outstanding performance, establish CRM as a powerful tool for solving general convex feasibility problems.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
We consider the important problem of finding a point in a nonempty set given by the intersection of finitely many closed convex sets \(\{X_{i}\}_{i=1}^{m}\), that is,
We assume that for each i = 1,2,…,m and any , the orthogonal projection of z onto Xi, denoted by \(P_{X_{i}}(z)\), is available.
It is well known that Problem (1) arises in many applications in science and engineering and is often solved enforcing the Douglas-Rachford method (DRM) or the method of alternating projections (MAP). The extensive literature on DRM and MAP in various contexts of continuous optimization expresses their relevance in the field (see, for instance, [4,5,6, 8,9,10, 26, 31, 33]). Notwithstanding, DRM and MAP sequences may converge slowly due to spiraling and zigzag behavior, respectively, even in the case where the Xi’s are affine subspaces. This was our main motivation in [16] for the introduction of the circumcentered-reflection method (CRM).
The iterates of CRM are based on a generalization of the Euclidean concept of circumcenter (the point in the plane equidistant to the vertices of a given triangle). Successfully applied for projecting a given point onto the intersection of finitely many affine subspaces (see [16,17,18]), CRM is in its original form likely to face issues when dealing with general convex feasibility. Indeed, in the very first paper introducing CRM [16], it was pointed out that under non-affine structures, the method could possibly diverge or simply be undefined. There is now an actual example featuring two intersecting balls for which CRM stalls or diverges depending on the initial point (see [3, Figure 10]). These apparent drawbacks are genuinely overcome in the present work. The key is to reformulate the problem of finding a point in X. The product space reformulation introduced by Pierra [32] will be considered.
Product space versions of Problem (1) have been considered for both DRM and MAP [3, 6]. Although necessary for DRM to converge if m ≥ 3, the product space approach does not lead to satisfactory performance in comparison with variants of DRM (see, for instance, the cyclic Douglas-Rachford method (CyDRM) [21, 22] and the cyclically anchored Douglas-Rachford algorithm (CADRA) [11]). On the other hand, MAP converges with or without the product space reformulation, but tends to get worse on complexity when applied to the reformulated problem. Nonetheless, usually avoided for DRM and MAP, we will see that when suitably utilized, the product space reformulation captures CRM’s essence and enables it to efficiently solve Problem (1).
The main result in our work guaranties that, for any starting point, CRM converges to a solution of Problem (1). This is one of the most impressive abilities of CRM proven so far, although noticeable results have already been derived by various authors. The history of circumcenters dates back to as early as 300 BC, when they were described in Euclid’s Elements [27, Book 4, Proposition 5]. More than two thousand years later, in 2018, circumcenters were discovered to be a simple yet effective way of accelerating the prominent Douglas-Rachford method [16]. In 2019, the paper celebrating 60 years of DRM [28] mentions circumcenters as a natural way of dealing with DRM’s spiraling characteristic. Also, CRM was employed for multi-affine set problems in [17] and in a block-wise version in [18]. Along with these articles, groups of researchers have been leading careful studies on properties, properness, and calculations of circumcenters including a viewpoint of isometries (see [12,13,14,15, 30]). Very recently, a work on CRM for particular nonconvex wavelet problems was carried out [24]. It seems that circumcenters are here to stay.
Before providing all the technical machinery of our approach, let us consider Fig. 1 as a synthesized preview of what is developed in the present work. Figure 1 illustrates the problem of finding a point in the intersection of two given balls and displays DRM, MAP, and CRM trajectories starting all from the common initial point x0 and the number of iterations taken by them to track a solution up to a given accuracy. A trajectory of CRM under the product space reformulation is exhibited as well and labeled as CRM-prod. After the picture, the definitions of MAP, DRM, CRM, and CRM-prod sequences are described. The particular intersection problem considered in Fig. 1 is of special interest as for a different choice of starting point x0, CRM might not be well-defined or worse, it may diverge. In this paper, these two issues are overcome by CRM-prod.
Let us explain how the MAP, DRM, and CRM sequences are generated for the two-set case, that is, the case in which a point common to two closed convex sets is sought. For a current iterate , we move to zMAP, zDRM, and zCRM using MAP, DRM, and CRM, respectively, where
Let Id denote the identity operator in and consider the notation for reflection operators RΩ2PΩ −Id for a given nonempty, closed, and convex set . Note that MAP employs a composition of projections and, not by chance, DRM is also known as averaged reflection method, as it iterates by taking the mean of the current iterate with a composition of reflections. As for CRM, it chooses the Euclidean circumcenter of the triangle immersed in with vertices z, \(R_{X_{1}}(z)\), and \(R_{X_{2}}R_{X_{1}}(z)\). The circumcenter zCRM is defined by satisfying two criteria: (i) zCRM is equidistant to the three vertices, i.e., \(\|z-z_{\text {CRM}}\|=\|R_{X_{1}}(z)-z_{\text {CRM}}\|=\|R_{X_{2}}R_{X_{1}}(z)-z_{\text {CRM}}\|\), with ∥⋅∥ representing the Euclidean norm; (ii) zCRM belongs to the affine subspace determined by z, \(R_{X_{1}}(z)\) and \(R_{X_{2}}R_{X_{1}}(z)\), denoted here by \(\text {aff}\{z,R_{X_{1}}(z),R_{X_{2}}R_{X_{1}}(z)\}\).
In order to understand how CRM-prod works, we have to consider the product space environment by Pierra [32]. Assume that our problem still consists of finding a point in X1 ∩ X2 and take into account now two new closed convex sets , where WX1 × X2 and . Indeed, W and D are closed, convex and actually it is straightforward to see that D is a subspace of . Then, it obviously holds that
CRM-prod is ruled by zk+ 1circumcenter{zk,RW(zk),RDRW(zk)}. We anticipate that in our approach it will be key to initialize CRM-prod in D. Indeed, it will be shown that if z0 ∈ D, then zk ∈ D for all k. This enables us to fully understand the CRM-prod trajectory plotted in Fig. 1. We took the initial point so that the CRM-prod sequence lies entirely in D, that is, each has the form zk = (xk,xk). So, we plotted the xk part of the CRM-prod sequence in Fig. 1.
With all these basic notions having been introduced, we get a clear geometric comprehension of the sequences illustrated in Fig. 1. Now, it is important to stress that the intersection problem (1) when m ≥ 3 can also be reformulated as the problem of finding a point in the intersection of two closed convex sets, one of which being a subspace (see Section 3). With that said, our investigation will focus on CRM for finding a point in the intersection of a closed convex set K and an affine subspace U, a problem that is actually interesting and relevant on its own.
The ability of CRM for finding a point in the intersection of a closed convex set K and an affine subspace U is simply impressive in comparison with the classical MAP and DRM. Moreover, a CRM iteration is always better by means of distance to the solution set than both DRM and MAP iterations taken at any point in U (see the last theorem in Section 2). Also, it has to be noted that the computational effort for calculating a CRM step is essentially the same as the one for computing a MAP or DRM step. This is due to the fact that a circumcenter outcoming from a two-set intersection problem consists of solving a 2 × 2 linear system of equations. The outstanding numerical performance of CRM over MAP and DRM recorded in our experiments not only reveals a Newtonian flavor of CRM but it also opens opportunities for new research. Moreover, an explicit connection with Newton-Raphson method was shown in [24] for nonconvex problems, and used to prove quadratic convergence of a hybrid circumcenter scheme.
Our paper is organized as follows. Section 2 is at the core of this work containing our two most technical results, namely Theorems 1 and 2. Theorem 1 states convergence of CRM when applied to finding a point common to an affine subspace and a general closed convex set. Theorem 2 claims that CRM is better than both MAP and DRM when computed at iterates belonging to the affine subspace. Also, Theorem 1, together with the product space reformulation, leads to the main contribution of the manuscript in Section 3, namely, the global convergence of CRM for solving Problem (1). Numerical experiments are conducted in Section 4 concerning convex inclusions such as polyhedral and second-order cone feasibility. The results show that in all instances, CRM outperforms MAP and DRM. We close the article in Section 5 with a summary of our results together with perspectives for future work.
2 CRM for the intersection of a convex set and an affine subspace
In this section, the problem under consideration is given by
with a closed convex set, an affine subspace and K ∩ U nonempty.
We are going to prove that CRM solves problem (2) as long as the initial point lies in U and the circumcenter iteration considers first a reflection onto K and then onto U. This CRM scheme reads as
It turns out that if z0 ∈ U, then the whole sequence generated upon (3) remains in U and converges to a solution, that is, a point in K ∩ U. This result lies at the core of our work. In order to establish it, we need to go through some preliminaries. We start with a formal and general definition of circumcenter.
Definition 1 (circumcenter operator)
Let \({\mathscr{B}} (Y_{1},Y_{2},\ldots ,Y_{q})\) be a collection of ordered nonempty closed convex sets in , where q ≥ 1 is a fixed integer. The circumcenter of \({\mathscr{B}}\) at the point is denoted by \(C_{{\mathscr{B}}}(z)\) and defined by the following properties:
-
(i) \(\|z-C_{{\mathscr{B}}}(z)\|=\|R_{Y_{1}}(z)-C_{{\mathscr{B}}}(z)\|=\cdots =\|R_{Y_{q}}{\cdots } R_{Y_{2}}R_{Y_{1}}(z)-C_{{\mathscr{B}}}(z)\|\); and
-
(ii) \(C_{{\mathscr{B}}}(z)\in \text {aff}\{z,R_{Y_{1}}(z),R_{Y_{2}}R_{Y_{1}}(z),{\ldots } ,R_{Y_{q}}{\cdots } R_{Y_{2}}R_{Y_{1}}(z)\}\).
For convenience, we sometimes also write \(C_{{\mathscr{B}}}(z)\) as
Circumcenters, when well-defined, arise from the intersection of suitable bisectors and their computation requires the resolution of a q × q linear system of equations [12], with q as in Definition 1. The good definition of circumcenters is not an issue when the convex sets in question are intersecting affine subspaces (see [17]). This might not be the case for general convex sets (see [16]). However, we are going to see that for any point z ∈ U the circumcenter used in (3), namely \({C}(z) ={C}_{{\mathscr{B}}}(z)\), where \({\mathscr{B}} = (K, U)\), is well-defined.
We proceed now by listing and establishing results that are going to enable us to indeed prove that C(z) is well-defined for all z ∈ U.
Lemma 1 (good definition and characterization of CRM for intersecting affine subspaces)
Consider a collection of affine subspaces \({\mathscr{B}}=(U_{1},U_{2},\ldots ,U_{q})\) with nonempty intersection \(U_{{\mathscr{B}}} \bigcap _{i=1}^{q} U_{i}\). For any , \(C_{{\mathscr{B}}}(z)\) exists, is unique and fulfills
-
(i) \(P_{U_{{\mathscr{B}}}}(C_{{\mathscr{B}}}(z))=P_{U_{{\mathscr{B}}}}(z)\); and
-
(ii) for any \(s\in U_{{\mathscr{B}}}\), we have \(C_{{\mathscr{B}}}(z)=P_{U_{z}}(s)\), where \(U_{z} \text {aff}\{z,R_{U_{1}}(z),R_{U_{2}}R_{U_{1}}(z),{\ldots } ,R_{U_{q}}{\cdots } R_{U_{2}}R_{U_{1}}(z)\}\).
Proof
See [17, Lemmas 3.1 and 3.2]. □
It is worth emphasizing that, whenever we deal with circumcenter operators regarding two sets, many results can be carried out as if we were in since, in this case, the circumcenter is a point sought in the affine subspace defined by three points. So, this affine subspace has dimension of at most 2. Along the manuscript, one is going to come across arguments based on this fact.
Lemma 2 (one step convergence of CRM for a hyperplane and an affine subspace)
Let be a hyperplane and an affine subspace, respectively. If H ∩ U is nonempty, then
for any z ∈ U.
Proof
Let z ∈ U. If z lies also in H, the result follows trivially. Therefore, assume z∉H. For our analysis, it will be convenient to look at the point PURH(z). If PURH(z) coincides with z, then aff{z,RH(z),RURH(z)} = aff{RH(z),RURH(z)}, because \(z=P_{U}R_{H}(z)=\frac {1}{2}(R_{H}(z)+R_{U}R_{H}(z))\). Also, the only point equidistant to RH(z) and RURH(z) in aff{RH(z),RURH(z)} is PURH(z) = z. This means that z = circumcenter{z,RH(z),RURH(z)} and, in particular, z = RH(z), which implies that z ∈ H, a contradiction. So, we have that z and PURH(z) are distinct and the line connecting these points, denoted here by Lz, is well-defined. Moreover, Lz lies entirely in U and perpendicularly crosses the segment at its midpoint, namely PURH(z). So, Lz is a bisector of the segment \(\overline {R_{H}(z) R_{U}R_{H}(z)}\). Thus, \(\tilde z \text {circumcenter}\{z,R_{H}(z),R_{U}R_{H}(z)\}\) has to lie in Lz and consequently in U.
Furthermore, \(\overline {z P_{H}(z)}\perp \overline {\tilde z P_{H}(z)}\) because the circumcenter is the point where the bisectors of the triangle of vertices z,RH(z) and RURH(z) intersect. Since H is a hyperplane, the orthogonality between \(\overline {z P_{H}(z)}\) and \(\overline {\tilde z P_{H}(z)}\) suffices to guaranty that \(\tilde z\in H\). Thus, \(\tilde z\in H\cap U\). Finally, from Lemma 1, we have that \(P_{H\cap U}(z)=P_{H\cap U}(\tilde z)\). Hence, \(P_{H\cap U}(z)=\tilde z\), which completes the proof. □
The next results concern the domain of the circumcenter operator for K and U, namely . We will see that U ⊂dom(C) and C(z) ∈ U, whenever z ∈ U.
Lemma 3 (well-definedness and characterization of CRM for K and U)
Let , where K is a closed convex set and U is an affine subspace and assume K ∩ U to be nonempty. Then, for all z ∈ U, C(z)circumcenter{z,RK(z),RURK(z)} is well-defined and we have C(z) ∈ U. Furthermore, \({C}(z) = P_{H_{z}\cap U}(z)\), where if z∉K and HzK, otherwise.
Proof
Let z ∈ U. If z ∈ K, the result is trivial. So assume that z belongs to U, but not to K. Then, we have of course that \(P_{H_{z}}(z)=P_{K}(z)\) and thus \(R_{H_{z}}(z)=R_{K}(z)\). Note that \(H_{z}\cap U\neq \varnothing \). In fact, let y ∈ K ∩ U. Thus, by y ∈ K and by the characterization of projection on nonempty closed convex set, \( \left (z-P_{K}(z)\right )^{T} \left (y-P_{K}(z)\right ) \leq 0\). Define . Then, \(f(0)= \left (z- P_{K}(z)\right )^{T}\) \(\left (y-P_{K}(z)\right ) \leq 0\) and \(f(1)=\left (z-P_{K}(z)\right )^{T}\left (z-P_{K}(z)\right )=\left \|z-P_{K}(z)\right \|^{2}>0\). Since f is continuous, there exists \(\bar {t} \in [0,1[\) such that \(f(\bar t) = \left (z-P_{K}(z)\right )^{T} \left (\bar t z+(1-\bar t) y-P_{K}(z)\right ) = 0\). Thus, \(\bar {t} z+(1-\bar {t}) y \in H_{z}\). On the other hand, as both z and y are in U, an affine subspace, we have \(\bar {t} z+(1-\bar {t}) y \in U\) as well. Altogether, \(\bar {t} z+(1-\bar {t}) y \in H_{z} \cap U\).Therefore,
where the last equality follows by employing Lemma 2. □
Below, in Fig. 2, we illustrate geometrically what has been established in Lemma 3.
Let us characterize the fixed point set of the circumcenter operator.
Lemma 4 (fixed points of C)
Assume as in Lemma 3 and consider FixC{z ∈dom(C)∣C(z) = z}. Then,
Proof
Clearly if z ∈ K ∩ U then z ∈FixC. Conversely, take z ∈FixC, that is, C(z) = z, or equivalently z = RK(z) = RURK(z). So, z ∈ K and then z = RU(z). Hence, z ∈ U, proving the lemma. □
Next, we derive a firmly nonexpansiveness property of the circumcenter operator C restricted to U, known as firmly quasinonexpansiveness [7, Definition 4.1(iv)].
Lemma 5 (firmly quasinonexpansiveness of CRM)
Assume as in Lemma 3. Then, for any z ∈ U and s ∈ K ∩ U
Moreover,
Proof
Let z ∈ U. Define and . It is clear that \(K\subseteq H^{+}_{z}\), and so \(K\cap U\subseteq H_{z}^{+}\cap U\). If z ∈ Hz, then z ∈ K ∩ U and, in this case, C(z) = z, in view of Lemma 4, and the claims follow easily.
Let z∉Hz. We have that C(z) coincides with \(\text {circumcenter}\{z,R_{H_{z}}(z),R_{U}\) \(R_{H_{z}}(z)\}\), since \(R_{K}(z) = R_{H_{z}}(z)\). From Lemma 2, it follows that \(C(z)=P_{H_{z}\cap U}(z)\in U\). As consequence of z not being in Hz, we immediately get that z cannot be in K. Also, it easily follows that \(z\notin H_{z}^{+}\). This, combined with the fact that Hz is the boundary of \(H_{z}^{+}\), gives us that \(P_{H_{z}\cap U}(z)=P_{H^{+}_{z}\cap U}(z)\). Now, taking into account that, for any s ∈ K ∩ U, \(P_{H^{+}_{z}\cap U}(s)=s\), we have
where the inequality is due to the firmly nonexpansiveness propriety of the projection [7, Proposition 4.16]. To prove the last part of the lemma, fix \(s=\bar z P_{K\cap U}(z)\) and use (4) in order to get
□
We arrive now at the key result of our study, which establishes CRM as a tool for finding a point in K ∩ U whenever the initial point is chosen in U.
Theorem 1 (convergence of CRM)
Assume as in Lemma 3 and let z ∈ U be given. Then, the CRM sequence is well-defined, contained in U and converges to a point in K ∩ U.
Proof
The well-definedness of as well as its pertinence to U are due to Lemma 3. From Lemma 5, we have, for any z ∈ U, s ∈ K ∩ U and
Hence,
Summing from ℓ = 0 to m, we have
Taking limits as \(m\to +\infty \), we get the summability of the associated series and so, zk − zk+ 1 converges to 0 as \(k\to +\infty \).
Moreover, the sequence is bounded because of (4) in Lemma 5. That is,
Let \(\hat z\) be any cluster point of the sequence and denote an associated convergent subsequence to \(\hat z\). Note further that the fact \(z^{i_{k}}-z^{i_{k}+1}\to 0\) implies \(z^{i_{k}+1}\to \hat z\). We claim that \(\hat z\in K\cap U\). Since U is closed and is contained in U, we must have \(\hat z\in U\). By the definition of C, we have that
So, \(z^{i_{k}+1}-R_{K}(z^{i_{k}})\) converges to 0. It follows from the continuity of the reflection onto K and taking limits as \(k\to +\infty \) in the last equality that \(\hat z=R_{K}(\hat z)\). Hence, \(\hat z\in K\) proving the claim.
Therefore, so far we have that is bounded, contained in U and all its cluster points are in K ∩ U. We could now complete the proof using standard Fejér monotonicity results (see [7, Theorem 5.5] or [6, Theorem 2.16(v)]). However, for the sake of self-containment, we show that all these cluster points are equal and hence converges to a point in K ∩ U. Let \(\tilde {z}\), \(\hat {z}\) be two accumulation points of and , be subsequences convergent to \(\tilde {z}\), \(\hat {z}\) respectively. The real sequence is convergent because it is bounded below by zero and from (5) it is monotone non-increasing. Thus,
establishing the desired result. □
We present now a newsworthy nonconvex instance covered by Theorem 1.
Remark 1 (line-sphere intersection)
A straightforward consequence of Theorem 1 is the convergence of CRM for finding a point in the intersection (when nonempty) of a given closed ball and a line. Interestingly, this remark almost promptly gives a convergence result for CRM applied to solving the nonconvex intersection problem involving a line crossing a sphere. More precisely, for the latter, having an iterate on the line, CRM moves us out of the sphere and thereon CRM acts as if the sphere were a ball and convergence is guaranteed. An exception for moving out of the sphere occurs when the iterate is on the line and also happens to be the center of the sphere. In this case, due to nonconvexity, the projection operator is set-valued but one only needs to pick one of the two points in the direction of the line crossing the sphere as a projection and then CRM converges in one single step. The relevance of this remark relies on the fact that the definitive proof of DRM for this problem took three papers to be completely derived [2, 19, 20].
A nice consequence of Theorem 1 regards convex intersection problems in which one of the sets is affine.
Corollary 1 (serial composition of circumcenters)
Let \(K \bigcap _{i=1}^{N}K_{i}\), where Ki is convex, for i = 1,…,N, and U be an affine subspace and suppose \(K\cap U\neq \varnothing \). Then, for all z ∈ U, \(\tilde C(z) C_{(K_{N}, U)} \circ \cdots \circ C_{(K_{1}, U)}(z)\) is well-defined, with \(C_{(K_{i}, U)} (z)\) being \(\text {circumcenter}\{z,R_{K_{i}}(z),R_{U}R_{K_{i}}(z)\}\), for i = 1,…,N. Moreover, the CRM sequence is well-defined, contained in U and converges to a point in K ∩ U.
Proof
We present the proof for when N = 2 and an induction argument suffices for the general case. Thus, consider \(\tilde C(z) C_{(K_{2}, U)} \circ C_{(K_{1}, U)}(z)\) and set \(C_{1} C_{(K_{1}, U)} \), \(C_{2} C_{(K_{2}, U)}\). Note that we can essentially proceed as in the proof of Lemma 4 to show that \(\text {Fix} \tilde C = K_{1}\cap K_{2} \cap U\). In the following, we derive a similar inequality to (4), given in Lemma 5. Let s ∈ K1 ∩ K2 ∩ U and z ∈ U. Then,
where we used Lemma 5 for the first two inequalities and Corollary 2.15 by [7] for the third equality. Now, by employing the same arguments of Theorem 1, we can prove that the sequence is bounded and has an accumulation point in \(\text {Fix} \tilde C\) and the converge is achieved upon Fejér monotonicity. □
Remark 2
We can directly employ a similar argument of Corollary 4.48 in [7] to get the same claim as in Corollary 1 replacing its serial composition by a convex combination of circumcenter operators of the form \(\hat C(z) {\sum }_{i=1}^{N}\omega _{i}C_{(K_{i}, U)}(z)\), with ωi ∈]0,1[, for i = 1,…,N and \({\sum }_{i=1}^{N}\omega _{i} = 1\).
We close this section with a theorem stating that for a given iterate in an affine subspace U, CRM gets us closer to the solution set than both MAP and DRM.
Theorem 2 (comparing CRM with MAP and DRM)
Assume as in Lemma 3 and let z ∈ U be given. Also recall the notation zMAPPUPK(z), \(z_{\text {DRM}} \frac {1}{2}(\text {Id}+R_{U}R_{K})(z)\) and C(z)circumcenter{z,RK(z),RURK(z)}. Then, for any s ∈ K ∩ U we have
-
(i) ∥C(z) − s∥≤∥zMAP − s∥≤∥zDRM − s∥,
-
(ii) dist(C(z),K ∩ U) ≤dist(zMAP,K ∩ U) ≤dist(zDRM,K ∩ U).
Proof
Assume z ∈ U and s ∈ K ∩ U arbitrary but fixed. If z ∈ K, the result follows trivially because then PUPK(z) = z, \(\frac {1}{2}(\text {Id}+R_{U}R_{K})(z)=z\) and C(z) = z due to Lemma 4. So, let z ∈ U∖K and also, in order to avoid translation formulas, let us assume without loss of generality that U is a subspace. Recall that Lemma 3 characterized C(z) as the projection of z onto the intersection of U and the hyperplane with normal z − PK(z) passing through PK(z). Therefore, the triangle of vertices z, PK(z) and C(z) has a right angle at PK(z) and the triangle of vertices z, PUPK(z) and PK(z) has a right angle at PUPK(z) since U is a subspace. Considering these two triangles and taking into account the fact that hypotenuses are larger than corresponding legs, we conclude that
Another property we have is that the MAP point PUPK(z) is a convex combination of z and C(z). In order to deduce this, first note that these three points are collinear. The collinearity follows because both PUPK(z) and C(z) lie in the semi-line starting at z and passing through \(P_{U}R_{K}(z)=\frac {1}{2}(R_{K}(z)+R_{U}R_{K}(z))\). In fact, this semi-line contains the circumcenter C(z) as it is a bisector of the isosceles triangle of vertices z, RK(z) and RURK(z). Bearing in mind that PU is a linear operator [14, Proposition 2.10] and that z ∈ U, we get
Hence, zMAP is a convex combination of z and PURK(z) with parameter \(\frac {1}{2}\). Therefore, we have more than just collinearity of z,C(z), and zMAP. Both C(z) and zMAP lie on the semi-line starting at z and passing through PURK(z). In particular, this means that z cannot lie between C(z) and zMAP. Thus, in view of (6), the only remaining possibility is that zMAP is a convex combination of C(z) and z, that is, there exists a parameter r ∈ [0,1] such that
and
Moreover, the parameter r is strictly larger than zero because otherwise z would be in K ∩ U. Hence,
This equation will be properly combined with the following inner product manipulations
The first under-brace equality follows as a consequence of the right angle at PK(z) of the triangle of vertices z, PK(z), and C(z). The under-brace inequality is due to characterization of projections onto closed convex sets as, in particular, s belongs to K. The last under-brace remark holds since PK(z) − zMAP is orthogonal to the subspace U and both C(z) and s are in U. Now, inequality (8) together with (7) gives us
which, due to the cosine rule, leads to ∥C(z) − s∥≤∥zMAP − s∥ for any s ∈ K ∩ U. Taking this into account and letting \(\hat z,\tilde z\in K\cap U\) realize the distance of C(z),zMAP to K ∩ U, respectively, we have
proving the first inequalities in items (i) and (ii).
The rest of the proof is a comparison between zMAP and zDRM. We will see below that under our hypothesis zMAP is precisely the midpoint of PK(z) and zDRM. The linearity of RU will be employed.
So, \(z_{\text {MAP}}=\frac {1}{2}(P_{K}(z)+z_{\text {DRM}})\). Furthermore, ∥PK(z) − zMAP∥ = ∥zDRM − zMAP∥, and PK(z) − zMAP and zDRM − zMAP are orthogonal to U. In particular, for any s ∈ K ∩ U, it holds that (s − zMAP)T(PK(z) − zMAP) = 0 and (s − zMAP)T(zDRM − zMAP) = 0. Then, we conclude that the right triangle of vertices s, zMAP, and PK(z) is congruent to the one of vertices s, zMAP, and zDRM. This implies that ∥zDRM − s∥ = ∥PK(z) − s∥. However, ∥PK(z) − s∥≥∥zMAP − s∥ as PK(z) − s is a hypotenuse vector with one corresponding leg vector being zMAP − s. Finally, having stated the inequality ∥PK(z) − s∥≥∥zMAP − s∥ for any s ∈ K ∩ U allows us to take \(\check z,\tilde z\in K\cap U\) realizing the distance of zDRM,zMAP, to K ∩ U, respectively, and thus we have
□
The previous theorem is particularly meaningful when seen as a comparison between CRM and MAP as the associated sequences to these methods stay in the affine subspace U whenever the initial point is taken there. This is not the case for DRM sequences. Anyhow, our numerical section confirms a strict favorability of CRM over both MAP and DRM.
3 CRM for general convex intersection
We are now going to use CRM for solving
where is a nonempty set given by the intersection of finitely many closed convex sets X1,X2,…,Xm.
In order to employ CRM, let us rewrite Problem (9) by considering Pierra’s reformulation [32]. Define WX1 × X2 ×⋯ × Xm as the product space of the sets and . Then, one can easily see that
Due to (10), solving Problem (9) corresponds to solving
Note that it is straightforward to prove that D is a subspace of . Moreover, considering m arbitrary vectors x(i) in , with i = 1,…,m, we can build an arbitrary point in of the form and its projection onto D is given by
As for the orthogonal projection of onto W, we have
Next, relying on the results of Section 2, we establish convergence of CRM to a solution of Problem (9).
Theorem 3 (convergence of CRM for general convex intersection)
Assume as above and let be given. Then, taking as initial point z0(x0,x0,…,x0) ∈ D, we have that the sequence generated by zk+ 1circumcenter{zk,RW(zk),RDRW(zk)} is well-defined, entirely contained in D, and converges to a point , where x∗∈ X.
Proof
The result follows by enforcing Theorem 1 with W, D, and nm playing the role of K, U, and n, respectively. □
4 Numerical experiments
This section illustrates several numerical comparisons between CRM, MAP, and DRM. Computational tests were performed on an Intel Xeon W-2133 3.60GHz with 32GB of RAM running Ubuntu 18.04 and using Julia programming language.
In order to present the results of our numerical experiments, we choose the number of iterations as performance measure. The rationale of choosing such measure is that in each of the methods, the majority of the computational cost involved in one iteration is equivalent: the same number of orthogonal projections. Moreover, when using the product space reformulation, one can parallelize the computation of projections to speedup the CPU time, and iterations are still equivalent.
4.1 Intersection between an affine and a convex set
First, we show the results of the aforementioned methods when applied to solve the problem of finding a point in the intersection of an affine subspace and a closed convex set.
In these experiments, we want to find that solves the conic system
where is a matrix with m < n, and \(\mathcal {C}_{n}\) is the standard second-order cone of dimension n defined as
\(\mathcal {C}_{n}\) is also called the ice-cream cone or the Lorentz cone and it is easy to verify that it is a closed convex set. This problem, called second-order conic system feasibility, arises in second-order cone programming (SOCP) [1, 29], where a linear function is minimized over the intersection of an affine set and the intersection of second-order cones and an initial feasible point needs to be found [23]. Of course, x∗ is a solution of (14) if and only if x∗ lies in \(\mathcal {S} \mathcal {C}_{n}\cap U_{A,b}\), where is the affine subspace defined by A and b, that is, if nonempty, the solution set of (14) can be written as (2).
To execute our tests, we randomly generate 100 instances of subspaces UA,b, where n is fixed as 200 and m is a random value between 1 and n − 1. We guarantee that \(\mathcal {S}\) is nonempty by sampling A from the standard normal distribution and choosing a suitable b. Each instance is run for 10 initial random points, summing up to 1000 individual tests. Each initial point z0 is also sampled from the standard normal distribution and is accepted as long as it is not in \(\mathcal {S}\). We also assure that the norm of z0 is between 5 and 15 and then we project it over UA,b to begin each method. Thus, in view of Theorem 1, we are sure that CRM solves problem (14).
Let {zk} be any of the three sequences that we monitor: \(\{z_{\text {CRM}}^{k}\}\), \(\{z_{\text {DRM}}^{k}\}\), and \(\{z_{\text {MAP}}^{k}\}\), that is, CRM, DRM, and MAP sequences, respectively. We considered as tolerance ε10− 6 and employed as stopping criteria the gap distance, given by
which we consider a reasonable measure of infeasibility. Note that for CRM and MAP, the sequence monitored lies in UA,b; however, that is not the case for DRM. So, for the sake of fairness, we utilized the aforementioned stopping criteria for all methods. The projections computed to measure the gap distance can be utilized on the next iteration; thus, this calculation does not add any extra cost.
The numerical experiments results shown in Fig. 3 and Table 1 corroborate with Theorem 2, since CRM has a much better performance than DRM and MAP. Figure 3 is a performance profile [25], an analytic tool that allows one to compare several different algorithms on a set of problems with respect to a performance measure or cost. The vertical axis indicates the percentage of problems solved, while the horizontal axis indicates, in log-scale, the corresponding factor of the number of iterations used by the best solver. It shows that CRM was always better or equal than the other two methods in comparison.
In Table 1, each column shows the mean, minimum, median, and maximum, respectively, of iterations taken by each method for all instances and starting points. We remark that CRM took at most 6 iterations, but in average, less than 5. Moreover, we report that there were 89 (out of 1000) ties between CRM and DRM, while CRM always took less iterations than MAP. There were 10 ties between MAP and DRM.
4.2 Experiments with product space reformulation
We show now experiments with Pierra’s product space reformulation of CRM stated in Section 3, which we call CRM-prod. Recall that we defined the Cartesian product of of the convex sets X1,…Xm as W and defined D as the diagonal subspace .
Both MAP and DRM can also be considered in product space reformulation versions. For MAP, the iteration is given by \(z^{k+1}_{\text {MAP}} P_{W}P_{D}(z_{\text {MAP}}^{k})\). For DRM, the iteration is given by
where we use the projections onto D and W given by (12) and (13) to define the reflectors RD and RW, respectively.
We compare the numerical experiments of CRM-prod with MAP-prod and DRM-prod when applied to the problem of polyhedral feasibility. Here, each closed convex set is a half-space given by
here and . \(X = \bigcap _{i=1}^{m}X_{i}\) is called convex polyhedron (or polytope).
To generate the instance, we fix n = 200 and sampled each ai from the standard normal distribution, for i = 1,…,m, where m is randomly selected from 1 to n − 1. To assure that \(X\neq \varnothing \), we sample from the standard normal distribution an \(\bar x\) and fix \(\bar b_{i} {a_{i}^{T}}\bar x\). We then randomly select p indices from 1 to m to determine the set \(\mathcal {I} = \{i_{1},\ldots , i_{p}\}\) and we define
where \(\bar b = (\bar b_{1},\ldots ,\bar b_{m})\) and r is a random value between 0 and 1. Thus, for all i = 1,…,m, \({a_{i}^{T}}\bar x \leq b_{i}\) and moreover, for \(\hat \imath \in \mathcal {I}\), \(a_{\hat \imath }^{T}\bar x < b_{i}\), that is, X has a Slater point.
Again, for each sequence {zk} that we generate, we use as stopping criteria the error given by the gap distance
with tolerance ε10− 6. In Fig. 4, we plot number of iterations (horizontal axis) versus the error (vertical axis), both with \(\log _{10}\) scale that each method realized to achieve the stopping criteria, for one instance. One can see that CRM-prod overpowers DRM-prod and MAP-prod.
We also restart the methods with 20 different starting points sampled from the standard normal distribution guaranteeing that the norm of each one is between 5 and 15 and then we project it over D to begin each method. In Table 2, we present again the mean, minimum, median, and maximum, respectively, for these experiments.
5 Concluding remarks
This work has taken a large step towards consolidating circumcenter schemes as powerful theoretical and practical tools for addressing feasibility problems. We were able to prove that the circumcentered-reflection method, referred to as CRM throughout the article, finds a point in the intersection of a finite number of closed convex sets. Also, favorable theoretical properties of circumcenters over alternating projections and Douglas-Rachford iterations were proven. Along with the theory, we have seen great numerical performance of CRM in comparison with the classical variants for polyhedral feasibility and inclusion problems involving second-order cones. Our aim now is to widen the scope of experiments as well as to broaden the investigation on the use of circumcenters for nonconvex problems, including sparse affine feasibility problems and optimization involving manifolds. Finally, due to favorable computational tests, we would like to carry out a local rate convergence analysis for CRM under conditions such as metric-subregularity and Hölder type error bounds.
References
Alizadeh, F., Goldfarb, D.: Second-order cone programming. Math. Program. Ser. B 95(1), 3–51 (2003). https://doi.org/10.1007/s10107-002-0339-5
Aragón Artacho, F. J., Borwein, J. M.: Global convergence of a non-convex Douglas–Rachford iteration. J. Glob. Optim. 57(3), 753–769 (2013). https://doi.org/10.1007/s10898-012-9958-4
Aragón Artacho, F. J., Campoy, R., Tam, M. K.: The Douglas–Rachford algorithm for convex and nonconvex feasibility problems. Math Meth Oper Res. https://doi.org/10.1007/s00186-019-00691-9 (2019)
Bauschke, H. H., Bello-Cruz, J.Y., Nghia, T.T.A., Phan, H.M., Wang, X.: The rate of linear convergence of the Douglas–Rachford algorithm for subspaces is the cosine of the Friedrichs angle. J. Approx. Theory 185, 63–79 (2014). https://doi.org/10.1016/j.jat.2014.06.002
Bauschke, H. H., Bello-Cruz, J.Y., Nghia, T.T.A., Phan, H.M., Wang, X.: Optimal rates, of linear convergence of relaxed alternating projections and generalized Douglas-Rachford methods for two subspaces. Numer. Algorithms 73 (1), 33–76 (2016). https://doi.org/10.1007/s11075-015-0085-4
Bauschke, H. H., Borwein, J. M.: On projection algorithms for solving convex feasibility problems. SIAM Rev. 38(3), 367–426 (2006). https://doi.org/10.1137/S0036144593251710
Bauschke, H. H., Combettes, P. L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2 edn. CMS Books in Mathematics. Springer International Publishing, Cham (2017). https://doi.org/10.1007/978-3-319-48311-5
Bauschke, H. H., Dao, M. N., Noll, D., Phan, H. M.: Proximal point algorithm, Douglas-Rachford algorithm and alternating projections: a case study. J. Convex Anal. 23(1), 237–261 (2016)
Bauschke, H. H., Moursi, W. M.: The Douglas–Rachford algorithm for two (not necessarily intersecting) affine subspaces. SIAM J. Optim. 26(2), 968–985 (2016). https://doi.org/10.1137/15M1016989
Bauschke, H. H., Moursi, W. M.: On the Douglas–Rachford algorithm. Math. Program. 164(1), 263–284 (2017). https://doi.org/10.1007/s10107-016-1086-3
Bauschke, H. H., Noll, D., Phan, H. M.: Linear and strong convergence of algorithms involving averaged nonexpansive operators. J. Math. Anal. Appl. 421(1), 1–20 (2015). https://doi.org/10.1016/j.jmaa.2014.06.075
Bauschke, H. H., Ouyang, H., Wang, X.: On circumcenters of finite sets in Hilbert spaces. Linear Nonlinear Anal. 4(2), 271–295 (2018)
Bauschke, H. H., Ouyang, H., Wang, X.: Circumcentered methods induced by isometries. arXiv:1908.11576 (2019)
Bauschke, H. H., Ouyang, H., Wang, X.: On circumcenter mappings induced by nonexpansive operators. Pure and Applied Functional Analysis (in press) (2020)
Bauschke, H. H., Ouyang, H., Wang, X.: On the linear convergence of circumcentered isometry methods. arXiv:1912.01063 (2019)
Behling, R., Bello-Cruz, J.Y., Santos, L.R.: Circumcentering the Douglas–Rachford method. Numer. Algorithms 78(3), 759–776 (2018). https://doi.org/10.1007/s11075-017-0399-5
Behling, R., Bello-Cruz, J.Y., Santos, L.R.: On the linear convergence of the circumcentered-reflection method. Oper. Res. Lett. 46(2), 159–162 (2018). https://doi.org/10.1016/j.orl.2017.11.018
Behling, R., Bello-Cruz, J-Y, Santos, L-R: The block-wise circumcentered–reflection method. Comput. Optim. Appl. 76, 675–699 (2020). https://doi.org/10.1007/s10589-019-00155-0
Benoist, J.: The Douglas–Rachford algorithm for the case of the sphere and the line. J. Glob. Optim. 63(2), 363–380 (2015). https://doi.org/10.1007/s10898-015-0296-1
Borwein, J.M., Sims, B.: The Douglas–Rachford algorithm in the absence of convexity. In: Bauschke, H.H., Burachik, R.S., Combettes, P.L., Elser, V., Luke, D.R., Wolkowicz, H. (eds.) Fixed-Point Algorithms for Inverse Problems in Science and Engineering. https://doi.org/10.1007/978-1-4419-9569-8_6. Series Title: Springer Optimization and Its Applications, vol. 49, pp 93–109. Springer, New York (2011)
Borwein, J. M., Tam, M. K.: A cyclic Douglas–Rachford iteration scheme. J. Optim. Theory Appl. 160(1), 1–29 (2014). https://doi.org/10.1007/s10957-013-0381-x
Borwein, J. M., Tam, M. K.: The cyclic Douglas-Rachford method for inconsistent feasibility problems. J. Nonlinear Convex Anal. Int. J. 16 (4), 573–584 (2015)
Cucker, F., Peña, J., Roshchina, V.: Solving second-order conic systems with variable precision. Math. Program. 150(2), 217–250 (2015). https://doi.org/10.1007/s10107-014-0767-z
Dizon, N., Hogan, J., Lindstrom, S.B.: Circumcentering reflection methods for nonconvex feasibility problems. arXiv:1910.04384 (2019)
Dolan, E. D., Moré, J. J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002). https://doi.org/10.1007/s101070100263
Douglas, J., Rachford Jr., H. H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82 (2), 421–421 (1956). https://doi.org/10.1090/S0002-9947-1956-0084194-4
Euclid, H.T.L.: The Thirteen Books of Euclid’s Elements, 2nd edn., vol. II. Dover Publications, Inc., New York (1956)
Lindstrom, S. B., Sims, B.: Survey: Sixty years of Douglas–Rachford. arXiv:1809.07181 (2018)
Lobo, M. S., Vandenberghe, L., Boyd, S., Lebret, H.: Applications of second-order cone programming. Linear Algebra Appl. 284(1-3), 193–228 (1998). https://doi.org/10.1016/S0024-3795(98)10032-0
Ouyang, H.: Circumcenter operators in Hilbert spaces. Master Thesis, University of British Columbia, Okanagan CA. https://doi.org/10.14288/1.0371095 (2018)
Phan, H. M.: Linear convergence of the Douglas–Rachford method for two closed sets. Optimization 65(2), 369–385 (2016). https://doi.org/10.1080/02331934.2015.1051532
Pierra, G.: Decomposition through formalization in a product space. Math. Program. 28 (1), 96–115 (1984). https://doi.org/10.1007/BF02612715
Svaiter, B. F.: On weak convergence of the Douglas–Rachford method. SIAM J. Control Optim. 49(1), 280–287 (2011). https://doi.org/10.1137/100788100
Acknowledgments
We thank the anonymous referees for their valuable suggestions which significantly improved this manuscript.
Funding
RB was partially supported by the Brazilian Agency Conselho Nacional de Desenvolvimento Científico e Tecnoógico (CNPq), Grants 304392/2018-9 and 429915/2018-7; YBC was partially supported by the National Science Foundation (NSF), Grant DMS – 1816449.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Behling, R., Bello-Cruz, Y. & Santos, LR. On the circumcentered-reflection method for the convex feasibility problem. Numer Algor 86, 1475–1494 (2021). https://doi.org/10.1007/s11075-020-00941-6
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11075-020-00941-6
Keywords
- Circumcenter
- Reflection method
- Convex feasibility problem
- Accelerating convergence
- Douglas-Rachford method
- Method of alternating projections