1 Introduction

A metric space X can be viewed as the first member of an infinite sequence of nested metric spaces \((X(n), \Delta )\), \(n=1, 2, \dots \) where the elements of X(n) are subsets of X with at most n elements, and \(\Delta \) is the Hausdorff metric. The topology and geometry of X(n) can be difficult to grasp even for simple spaces X, e.g., [4, 7, 8, 26]. This paper will show how the relations between the finite subset spaces X(n) reflect the cluster tendency of X.

The detection of clusters in a finite subset of a metric space is an important part of statistical data analysis, and its solution is usually sought in the form of an algorithm. Our approach is different in that we treat the problem as a purely mathematical one, focusing on the existence of a Lipschitz continuous map \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(k)\), \(k<n\), such that \({{\,\mathrm{{\mathcal {R}}}\,}}\) acts as the identity on X(k). If such \({{\,\mathrm{{\mathcal {R}}}\,}}\) exists, the elements of \({{\,\mathrm{{\mathcal {R}}}\,}}(A)\) can be interpreted as centers of (at most k) clusters of a set \(A\in X(n)\). Each point \(x\in A\) can be assigned to a cluster based on which point of \({{\,\mathrm{{\mathcal {R}}}\,}}(A)\) is nearest to x. Among all metric spaces, those that satisfy the ultrametric inequality (4) are exceptionally well suited for clustering [12, 25]. The first of our main results, Theorem 1, shows that the same holds when cluster tendency of X is quantified by the Lipschitz constants of retractions \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(k)\).

Section 4 concerns the existence of Lipschitz or Hölder continuous retractions for subsets of \({\mathbb {R}}\). It simplifies and extends some of the results in [20]. Every additive subgroup \(G\subset {\mathbb {R}}\) supports Lipschitz retractions of G(n) (Corollary 2). On the other hand, there exist subsets of \({\mathbb {R}}\) with Hölder continuous retractions only (Proposition 3). In Corollary 3 we will see that a Lipschitz retraction \(X(n)\rightarrow X(k)\) does not always factor through retractions between intermediate finite subset spaces.

A metric space X that supports Lipschitz retractions \(X(n)\rightarrow X(k)\) for all \(n>k\ge 1\) is said to have the Lipschitz Clustering Property (LCP). Section 5 relates the LCP to a better understood class of metric spaces: quasiconvex ones, where any two points xy can be joined by a curve of length comparable to the distance between x and y. Our second main result, Theorem 3, shows that an LCP space that contains a bi-Lipschitz image of a line segment must be locally quasiconvex. On the other hand, it is possible for a metric space to have the LCP without containing any rectifiable curves (Example 4).

Section 6 concerns the invariance of the LCP under certain transformations of metric spaces: quasisymmetric and quasihomogeneous maps, products, disjoint unions, etc. The paper concludes with a list of open questions in Sect. 7.

2 Definitions and Preliminary Results

When A is a finite set, |A| denotes its number of elements. Given a metric space X and a positive integer n, let \(X(n) = \{A\subset X:1\le |A|\le n\}\). The space X(n) is called the nth finite subset space of X (other names, such as “symmetric product”, appear in the literature). It is a metric space with respect to the Hausdorff metric

$$\begin{aligned} \Delta (A, B) = \max \left( \sup _{a\in A} {{\,\mathrm{dist}\,}}(a, B), \sup _{b\in B} {{\,\mathrm{dist}\,}}(b, A)\right) . \end{aligned}$$
(1)

Sometimes we write \(\Delta _{X}\) instead of \(\Delta \) to disambiguate the underlying metric space. The notation \({{\,\mathrm{dist}\,}}\) means the infimal distance \({{\,\mathrm{dist}\,}}(A, B)=\inf \{d(a, b):a\in A, b\in B\}\).

The natural embeddings \(X=X(1)\subset X(2)\subset \cdots \) are isometric with respect to \(\Delta \), which allows us to consider X(k) as a subset of X(n) when \(k < n\).

If \(Y\subset X\), a map \({{\,\mathrm{{\mathcal {R}}}\,}}:X\rightarrow Y\) is a retraction if its restriction to Y is the identity map. A set Y for which such \({{\,\mathrm{{\mathcal {R}}}\,}}\) exists is a retract of X. In the context of geometric embeddings of metric spaces, it is desirable for the range of an embedding to be a Lipschitz retract of the ambient space [23].

A map \(f:X\rightarrow Y\) is called Lipschitz if there is a number \(L\ge 0\) such that \(d_Y(f(x),f(x'))\le L d_X(x,x')\) for all \(x,x'\in X\). We sometimes emphasize the value of L by saying that f is L-Lipschitz. The least of such numbers L is denoted by \({{\,\mathrm{Lip}\,}}(f)\) and is called the Lipschitz constant of f. A map f is called L-bi-Lipschitz if \(L^{-1}d_X(x,x') \le d_Y(f(x),f(x'))\le L d_X(x,x') \) for all \(x,x'\in X\).

Definition 1

A metric space X has the Lipschitz Clustering Property (LCP) if for every n there exists a Lipschitz retraction \({{\,\mathrm{{\mathcal {R}}}\,}}_n\) from X(n) onto \(X(n-1)\).

Definition 1 implies, via composition of maps \({{\,\mathrm{{\mathcal {R}}}\,}}_n\), the existence of Lipschitz retractions \(X(n)\xrightarrow {\text {onto}}X(k)\) for any \(n> k\ge 1\). The class of LCP spaces includes Euclidean and Hilbert spaces [21] and, more generally, Hadamard spaces [6]. It does not include the circle \(S^1\) [3, Proposition 2.2] or any space that retracts onto a circle.

Definition 2

A metric space (Xd) is quasiconvex if there exists a constant C such that any two points \(x, y\in X\) can be joined by a curve of length at most \(C\,d(x, y)\).

Although Definitions 1 and 2 do not look similar, they have something in common: both properties are inherited by Lipschitz retracts of the space. A stronger connection between them will be established in §5.

Definition 3

The minimum separation function \(\delta _n:X(n)\rightarrow [0, \infty )\) is defined as follows: \(\delta _n(A)=0\) if \(|A|<n\), and \(\delta _n(A) = \min \{d_X(a, b) :a, b\in A, \ a\ne b\}\) if \(|A|=n\).

It is easy to see that \(\delta _n\) is a 2-Lipschitz function which vanishes precisely on \(X(n-1)\). Moreover, we have [3, Lemma 3.1]:

$$\begin{aligned} \frac{1}{2}\delta _n(A)\le \inf \{\Delta (A, B):B\in X(n-1)\} \le \delta _n(A). \end{aligned}$$
(2)

The relevance of \(\delta _n\) to Lipschitz clustering is indicated by the following facts.

Lemma 1

[3, Lemma 3.2] If \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(n-1)\) is an L-Lipschitz retraction, then for every \(A\in X(n)\) we have

$$\begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), A)\le (L+1)\delta _n(A). \end{aligned}$$
(3)

Lemma 2

[2, Lemma 2.5] If \(A, B\in X(n)\) and \(\max (\delta _n(A), \delta _n(B)) > 2\Delta (A, B)\), then there exists a bijection \(\phi :A\rightarrow B\) such that \(d_X(a, \phi (a))\le \Delta (A, B)\) for all \(a\in A\).

3 Ultrametrics and Uniformly Disconnected Spaces

A metric space (Xd) is ultrametric if

$$\begin{aligned} d(x, y)\le \max (d(x, z), d(y, z)) \quad \text {for all } x, y, z\in X. \end{aligned}$$
(4)

A metric space (Xd) is uniformly disconnected if there exists a constant \(c>0\) such that any finite sequence \(x_0, \dots , x_n\) in X satisfies

$$\begin{aligned} \max _{1\le k\le n}d(x_k, x_{k-1}) \ge c\, d(x_0, x_n). \end{aligned}$$
(5)

Uniformly disconnected spaces were introduced by David and Semmes in [10, Chapter 15] with a different but equivalent definition; see also Section 14.24 in [16]. A metric space (Xd) is uniformly disconnected if and only if there exists an ultrametric \(\rho \) on X such that the ratio \(d/\rho \) is bounded between two positive constants [10, Proposition 15.7].

Theorem 1

Let (Xd) be an ultrametric space. For any \(n > m\ge 1\) and any \(L>1\) there exists a retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(m)\) with \({{\,\mathrm{Lip}\,}}({{\,\mathrm{{\mathcal {R}}}\,}})\le L\). If X is compact, one can take \(L=1\).

The proof of Theorem 1 is based on a proposition that covers a wider class of metric spaces.

Proposition 1

Let (Xd) be a metric space. Suppose there exist constants \(L\ge 1\) and \(b\in (0, 1)\) and a family of L-Lipschitz maps \(\tau _k:X\rightarrow X\), \(k\in {\mathbb {Z}}\), such that for every \(x, y\in X\) and every \(k\in {\mathbb {Z}}\) we have:

$$\begin{aligned} d(\tau _k(x), x)\le Lb^k; \end{aligned}$$
(6)
$$\begin{aligned} \text {either }\tau _k(x)=\tau _k(y)\text { or } d(\tau _k(x), \tau _k(y))\ge L^{-1}b^k. \end{aligned}$$
(7)

Then for any \(n > m\ge 1\) there exists a retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(m)\) with \({{\,\mathrm{Lip}\,}}({{\,\mathrm{{\mathcal {R}}}\,}})\le 2L^3 b^{-1}+1\).

Proof

Using (7) and the L-Lipschitz property of \(\tau _k\) we obtain

$$\begin{aligned} d(x, y)< L^{-2}b^k \implies \tau _k(x)=\tau _k(y). \end{aligned}$$
(8)

Hence for any bounded set \(A\subset X\) there exists \(k\in {\mathbb {Z}}\) such that \(\tau _k(A)\) consists of a single point. On the other hand, (6) implies that \(|\tau _k(A)|=|A|\) for all sufficiently large k.

Define \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(m)\) as follows: if \(|A|\le m\), then \({{\,\mathrm{{\mathcal {R}}}\,}}(A)=A\). Otherwise let \(\mu (A) = \max \{k:|\tau _k(A)|\le m\}\) and define \({{\,\mathrm{{\mathcal {R}}}\,}}(A)=\tau _{\mu (A)}(A)\). By definition, \({{\,\mathrm{{\mathcal {R}}}\,}}\) is a retraction onto X(m), so it remains to check its Lipschitz property. It is convenient to let \(\mu (A) = \infty \) when \(A\in X(m)\).

Given \(A, B\in X(n)\), consider three cases.

Case 1: \(\mu (A)=\mu (B)=\infty \). This case is trivial: \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B)) = \Delta (A, B)\).

Case 2: \(\mu (A)=\mu (B)<\infty \). Let \(k=\mu (A)\) and use the Lipschitz property of \(\tau _k\) to obtain

$$\begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B)) = \Delta (\tau _k(A), \tau _k(B)) \le L \Delta (A, B) \end{aligned}$$

Case 3: \(\mu (A) \ne \mu (B)\). We may assume \(\mu (A) < \mu (B)\). Let \(k=\mu (A)+1\). Since \(|\tau _k(A)| > m \ge |\tau _k(B)|\), there exists a point \(a_0\in A\) such that \(\tau _k(a_0)\notin \tau _k(B)\). By virtue of (8) we have \(d(a_0, b)\ge L^{-2}b^k\) for all \(b\in B\). Hence

$$\begin{aligned} \Delta (A, B)\ge L^{-2}b^{\mu (A)+1}. \end{aligned}$$
(9)

The assumption (6) implies that \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), A)\) and \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(B), B)\) are bounded by \(Lb^{\mu (A)}\). Therefore,

$$\begin{aligned} \begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B))&\le \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), A) + \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(B), B) + \Delta (A, B) \\&\le 2L b^{\mu (A)} + \Delta (A, B) \\&\le (2L^3 b^{-1}+1) \Delta (A, B) \end{aligned} \end{aligned}$$

where the last step uses (9). \(\square \)

Proof of Theorem 1

For \(k\in {\mathbb {Z}}\) consider the set of all open balls \(B(x, 2^{-k})\) with radius \(2^{-k}\) and arbitrary center \(x\in X\). By the ultrametric inequality (4), for any \(x, y\in X\) we have either \(B(x, 2^{-k})=B(x, 2^{-k})\) or \({{\,\mathrm{dist}\,}}(B(x, 2^{-k}), B(y, 2^{-k}))\ge 2^{-k}\). Choose a set of centers \(C_k\) such that the balls \(B(p, 2^{-k})\), \(p\in C_k\), are disjoint and cover X. Define \(\tau _k:X\rightarrow X\) so that \(\tau _k(x)=p\) when \(x\in B(p, 2^{-k})\) with \(p\in C_k\).

By construction, (6) and (7) hold with \(L=1\) and \(b=1/2\). To check the Lipschitz property of \(\tau _k\), suppose that \(x\in B(p, 2^{-k})\) and \(y\in B(q, 2^{-k})\) where \(p, q\in C_k\) are distinct. By the ultrametric inequality

$$\begin{aligned} d(\tau _k(x), \tau _k(y)) = d(p, q)\le \max (2^{-k}, d(x, y)) = d(x, y), \end{aligned}$$

which shows that \(\tau _k\) is 1-Lipschitz.

Proposition 1 provides a retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(m)\) with \({{\,\mathrm{Lip}\,}}({{\,\mathrm{{\mathcal {R}}}\,}})\le 5\). This estimate can be improved with a metric transform as follows.

Given \(L>1\), choose \(\alpha >1\) large enough so that \(L^\alpha \ge 5\). Since the function \(d^\alpha \) satisfies the ultrametric inequality, we can apply the preceding argument to the space \((X, d^\alpha )\) and obtain a retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:(X(n), \Delta ^\alpha )\rightarrow (X(m), \Delta ^\alpha )\) with

$$\begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B))^\alpha \le 5 \Delta (A, B)^\alpha \quad \text {for all } A, B\in X(n). \end{aligned}$$

In terms of the original metric \(\Delta \), we have \({{\,\mathrm{Lip}\,}}({{\,\mathrm{{\mathcal {R}}}\,}})\le 5^{1/\alpha }\le L\) as claimed.

It remains to consider the case of compact X. For each \(j\in {\mathbb {N}}\) we have a retraction \({{\,\mathrm{{\mathcal {R}}}\,}}_j:X(n)\rightarrow X(m)\) with \({{\,\mathrm{Lip}\,}}({{\,\mathrm{{\mathcal {R}}}\,}}_j)\le 1 + 1/j\). By the Arzelà-Ascoli theorem, a subsequence of \(\{{{\,\mathrm{{\mathcal {R}}}\,}}_j\}\) converges to a map \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(m)\), which is easily seen to be a 1-Lipschitz retraction. \(\square \)

Corollary 1

Let (Xd) be an uniformly disconnected metric space. There is a constant \(L\ge 1\) such that for any \(n > m\ge 1\) there exists an L-Lipschitz retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(m)\).

Proof

By [10, Proposition 15.7] the space X supports an ultrametric \(\rho \) such that \(C^{-1}d\le \rho \le d\) for some \(C\ge 1\). By Theorem 1 the finite subset spaces of \((X, \rho )\) admit 2-Lipschitz retractions. In terms of the original metric d these retractions are L-Lipschitz with \(L=2C^2\). \(\square \)

4 Linear Sets

The availability of total order on \({\mathbb {R}}\) allows for a more precise version of Lemma 2.

Lemma 3

If \(A, B\in {\mathbb {R}}(n)\) and \(\max (\delta _n(A), \delta _n(B)) > 2\Delta (A, B)\), then there exists an order-preserving (i.e., increasing) bijection \(\phi :A\rightarrow B\) such that \(|a - \phi (a)|\le \Delta (A, B)\) for all \(a\in A\). Moreover,

$$\begin{aligned} \Delta (A\setminus \{\min A\}, B\setminus \{\min B\}) \le \Delta (A, B). \end{aligned}$$
(10)

Proof

Without loss of generality \(\delta _n(A) > 2\Delta (A, B)\). Let \(\phi \) be as in Lemma 2. Given \(a_1, a_2\in A\) with \(a_1<a_2\), let \(b_k=\phi (a_k)\) for \(k=1, 2\). By the triangle inequality we have

$$\begin{aligned} b_2 - b_1 \ge (a_2-a_1) - |a_1-b_1| - |a_2-b_2| \ge \delta _n(A) - 2\Delta (A, B) > 0. \end{aligned}$$

proving that \(\phi \) is order-preserving. Hence the restriction of \(\phi \) to \(A\setminus \{\min A\}\) is a bijection onto \(B\setminus \{\min B\}\), which implies (10). \(\square \)

The fact that the real line \({\mathbb {R}}\) has the LCP is known [20, Lemma 4.2]. But Proposition 2 also gives a simple explicit formula for Lipschitz retractions on the line, which we will use later.

Proposition 2

Given a set \(A\in {\mathbb {R}}(n)\), \(n\ge 2\), define \(s_A(x)=|A\cap (-\infty , x)|\) for \(x\in {\mathbb {R}}\). The map \({{\,\mathrm{{\mathcal {R}}}\,}}\), defined by

$$\begin{aligned} {{\,\mathrm{{\mathcal {R}}}\,}}(A) = \{x - \delta _n(A) s_A(x) :x\in A\}, \end{aligned}$$
(11)

is a \((4n-3)\)-Lipschitz retraction of \({\mathbb {R}}(n)\) onto \({\mathbb {R}}(n-1)\).

Proof

If \(A\in {\mathbb {R}}(n-1)\), then \(\delta _n(A)=0\) and therefore \({{\,\mathrm{{\mathcal {R}}}\,}}(A)=A\). If \(A\in {\mathbb {R}}(n)\) has n elements \(a_1<\dots <a_n\), then \(a_{k+1}-a_{k}=\delta _n(A)\) for some \(k\in \{1, \dots , n-1\}\). This implies \(a_{k+1} - \delta _n(A) s_A(a_{k+1}) = a_{k} - \delta _n(A) s_A(a_{k})\), hence \({{\,\mathrm{{\mathcal {R}}}\,}}(A)\in {\mathbb {R}}(n-1)\). Thus, \({{\,\mathrm{{\mathcal {R}}}\,}}\) is a retraction onto \({\mathbb {R}}(n-1)\).

By the definition of \(s_A\) we have \(s_A(x)\le n-1\) for all \(x\in A\), and therefore

$$\begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), A)\le (n-1)\delta _n(A). \end{aligned}$$
(12)

By the triangle inequality, for all \(A, B\in {\mathbb {R}}(n)\) we have

$$\begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B))\le \Delta (A, B) + (n-1)(\delta _n(A)+\delta _n(B)). \end{aligned}$$
(13)

If \(\delta _n(A)+\delta _n(B)\le 4\Delta (A, B)\), then (13) implies \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B))\le (4n-3)\Delta (A, B)\) as claimed.

Suppose \(\delta _n(A)+\delta _n(B)> 4\Delta (A, B)\). By Lemma 3 there exists an order-preserving bijection \(\phi :A\rightarrow B\). Note that \(s_B(\phi (x))=s_A(x)\) for all \(x\in A\).

Given a point \(z\in {{\,\mathrm{{\mathcal {R}}}\,}}(A)\), write it as \(z = x - \delta _n(A) s_A(x)\) for some \(x\in A\), and let \(w = \phi (x) - \delta _n(B) s_B(\phi (x)) \in {{\,\mathrm{{\mathcal {R}}}\,}}(B)\). Then

$$\begin{aligned} |z-w|\le |x-\phi (x)| + |\delta _n(A)-\delta _n(B)| s_A(x) \le \Delta (A, B) + 2(n-1)\Delta (A, B), \end{aligned}$$

where the last step uses \({{\,\mathrm{Lip}\,}}(\delta _n)\le 2\). As z runs through the points of \({{\,\mathrm{{\mathcal {R}}}\,}}(A)\), the corresponding point w runs through all points of \({{\,\mathrm{{\mathcal {R}}}\,}}(B)\) because \(\phi \) is a bijection. It follows that \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B))\le (2n-1)\Delta (A, B)\) in this case. \(\square \)

Corollary 2

Every additive subgroup G of \({\mathbb {R}}\) has the LCP. Furthermore, the intersection of G with any interval has the LCP as well.

Proof

Let \({{\,\mathrm{{\mathcal {R}}}\,}}\) be as in (11). If \(A\in G(n)\), then \(\delta _n(A)\in G\), hence \({{\,\mathrm{{\mathcal {R}}}\,}}(A)\subset G\). Moreover, \(\min {{\,\mathrm{{\mathcal {R}}}\,}}(A)=\min A\) and \(\max {{\,\mathrm{{\mathcal {R}}}\,}}(A) \le \max A\), which means that the subsets of any interval do not move out of the interval under \({{\,\mathrm{{\mathcal {R}}}\,}}\). \(\square \)

Remark 1

The retraction (11) moves the points of A toward \(\min A\). One could replace \(s_A\) with the sign-counting function \( \sigma _A(x) = \frac{1}{2} \sum _{y\in A} {{\,\mathrm{sign}\,}}(y-x) \) which results in a retraction that moves the points of A toward the median of A. However, this map does not preserve additive subgroups because \(\sigma _A\) is not integer-valued.

So far we saw that the LCP holds both for connected subsets of \({\mathbb {R}}\) and for uniformly disconnected ones. The following result shows that it also holds for compact subsets with finitely many components.

Theorem 2

Suppose that \(X\subset {\mathbb {R}}\) is a union of disjoint compact intervals \(I_k\), \(k=1, \dots , m\), some of which may degenerate into points. Then, X has the LCP.

Proof

Let \(M = \max _{1\le k\le m} {{\,\mathrm{diam}\,}}I_k\). Fix \(n\ge 2\) and let \({{\,\mathrm{{\mathcal {R}}}\,}}:{\mathbb {R}}(n)\rightarrow {\mathbb {R}}(n-1)\) be as in Proposition 2.

After applying a suitable bi-Lipschitz transformation \(F:{\mathbb {R}}\rightarrow {\mathbb {R}}\), we can achieve \({{\,\mathrm{dist}\,}}(I_k, I_j)\ge 3 n M\) whenever \(j\ne k\). To be specific, F could be a piecewise-linear function that does not change the diameter of any interval \(I_k\) but increases the distances between them.

Let \({\widetilde{X}}=\{x\in {\mathbb {R}}:{{\,\mathrm{dist}\,}}(x, X)\le nM\}\). Each gap between the components of \({\widetilde{X}}\) is at least a third of the corresponding gap between the components of X. Therefore, the nearest-point projection \(\rho :{\widetilde{X}}\rightarrow X \), which sends each point of \({\widetilde{X}}\) to the nearest point of X, is 3-Lipschitz.

We partition X(n) as \(X(n)=U\cup V\) where U consists of all \(A\in X(n)\setminus X(n-1)\) such that \(|A\cap I_k|\le 1\) for all \(k=1, \dots , m\), and \(V=X(n)\setminus U\). In other words, we have \(A\in U\) if and only if A intersects n of the intervals \(I_1, \dots , I_m\). The set U is empty when \(n > m\).

Define \({{\,\mathrm{{\mathcal {R}}}\,}}_X:X(n)\rightarrow X(n-1)\) as follows.

$$\begin{aligned} {{\,\mathrm{{\mathcal {R}}}\,}}_X(A) = {\left\{ \begin{array}{ll} A\setminus \{\min A\} &{} \text {if } A\in U \\ \rho ({{\,\mathrm{{\mathcal {R}}}\,}}(A)) &{} \text {if } A\in V, \end{array}\right. } \end{aligned}$$
(14)

Since \(\rho \) is only defined on \({\widetilde{X}}\), we must show that \({{\,\mathrm{{\mathcal {R}}}\,}}(A)\subset {\widetilde{X}}\) for \(A\in V\). Such sets have \(\delta _n(A)\le M\) since either \(|A|<n\) or two of the points of A lie in the same component of X. From (12) we have \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), A)\le (n-1)M\), hence \({{\,\mathrm{{\mathcal {R}}}\,}}(A)\subset {\widetilde{X}}\). The Lipschitz continuity of \(\rho \) and \({{\,\mathrm{{\mathcal {R}}}\,}}\) implies that \({{\,\mathrm{{\mathcal {R}}}\,}}_X\) is Lipschitz on V.

The set U consists of \(\left( {\begin{array}{c}m\\ n\end{array}}\right) \) connected components, based on which n of the intervals \(I_1, \dots , I_m\) contain a point of A. The distance (in the metric \(\Delta \)) between any two components of U is at least 3nM, because of the gaps between the intervals \(I_k\). For the same reason, \(\Delta (A, B)\ge 3nM\) whenever \(A\in U\) and \(B\in V\).

Let A and B be two sets in the same connected component of U, which means they intersect the same collection of intervals \(I_k\). Since the length of each interval is bounded by M, we have \(\Delta (A, B)\le M\). Furthermore, the gaps between the intervals force \(\delta _n(A)\ge 3nM\). Inequality (10) shows that \({{\,\mathrm{{\mathcal {R}}}\,}}_X\) is 1-Lipschitz on each connected component of U. This completes the proof. \(\square \)

The map \(A\mapsto A\setminus \{\min A\}\) merits further consideration. It is not continuous on \({\mathbb {R}}(n)\) when \(n\ge 3\), as the example of sets \(\{0, \epsilon , 1\}\) and \(\{0, 1, 1+\epsilon \}\) shows: the Hausdorff distance between these sets is \(\epsilon \), but it increases to \(1-\epsilon \) when the minimal elements are removed. However, it provides continuous retractions of X(n) for some linear sets X to which Theorem 2 does not apply. One such example is given below.

Proposition 3

Let \(X=\{0\}\cup \{k^{-1}:k\in {\mathbb {N}}\}\). Fix \(n\ge 2\). For \(A\in X(n)\) let \({{\,\mathrm{{\mathcal {R}}}\,}}(A)=A\setminus \{\min A\}\) if \(|A|=n\) and \({{\,\mathrm{{\mathcal {R}}}\,}}(A)=A\) if \(|A|<n\). Then:

  1. (a)

    \({{\,\mathrm{{\mathcal {R}}}\,}}\) is a Hölder continuous retraction of X(n) onto \(X(n-1)\), with Hölder exponent 1/2;

  2. (b)

    X does not have the LCP.

Proof

(a) Consider a set \(A\in X(n)\) of the form \(A=\{a_1, \dots , a_n\}\) where \(a_1<a_2<\dots < a_n\). We have \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), A)=a_2-a_1\). Also,

$$\begin{aligned} \delta _n(A) = \min \{a_{k+1}-a_k :k=1, \dots , n-1\} \ge \min (a_2-a_1, a_2^2), \end{aligned}$$

because \(|a-b|\ge ab\) for any two points \(a, b\in X\). Thus

$$\begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), A)\le \sqrt{\delta _n(A)}. \end{aligned}$$
(15)

Recalling (2) we conclude that

$$\begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B)) \le \Delta (A, B) + \sqrt{2\Delta (A, B)} \le 3\sqrt{\Delta (A, B)} \quad \text {if } |A|=n \text { and } |B|<n. \end{aligned}$$

It remains to consider the case of two sets \(A, B\in X(n)\setminus X(n-1)\). If \(\max (\delta _n(A), \delta _n(B))> 2\Delta (A, B)\), then \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B))\le \Delta (A, B)\) by Lemma 3. Assume \(\max (\delta _n(A), \delta _n(B))\le 2\Delta (A, B)\). By (15) we have

$$\begin{aligned}\begin{aligned} \Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B))&\le \Delta (A, B) + \sqrt{\delta _n(A)} + \sqrt{\delta _n(B)} \\&\le \Delta (A, B) + 2\sqrt{2\Delta (A, B)} \le 4 \sqrt{\Delta (A, B)}. \end{aligned} \end{aligned}$$

This completes the proof of (a).

(b) If \(t\in X\setminus \{0, 1\}\), then the neighbors of t in X are \(t/(1+t)\) and \(t/(1-t)\). The distances from t to its neighbors are \(t^2/(1+t)\) and \(t^2/(1-t)\), respectively. Hence

$$\begin{aligned} \Delta (A, B)\ge t^2/2 \quad \text {if } t\in A\vartriangle B, \end{aligned}$$
(16)

where \(A, B\in X(n)\) and \(A\vartriangle B = (A\setminus B)\cup (B\setminus A)\).

Suppose that \({{\,\mathrm{{\mathcal {R}}}\,}}:X(4)\rightarrow X(3)\) is an L-Lipschitz retraction. Choose \(x, y, z\in X\) such that

$$\begin{aligned} 0<2x<y<z, \quad 2Lx^2< y^2, \quad \text {and} \quad 2(L+1)(z-y) < x. \end{aligned}$$
(17)

For example, one can take \(x=1/k^3\), \(y=1/(k^2+1)\), and \(z=1/k^2\) where \(k\in {\mathbb {N}}\) is large enough that (17) is satisfied.

Let \(A=\{0, y, z\}\) and \(B=\{0, x, y, z\}\). Since the distance between consecutive elements of \(X\cap [0, x]\) is less than \(x^2\), there is a finite sequence of sets \(A_j\in X(4)\) that begins with \(A_1=A\) and ends with \(A_J=B\), such that \(\Delta (A_j, A_{j+1})\le x^2\) for all \(j=1, \dots , J-1\). It follows that \(\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A_j), {{\,\mathrm{{\mathcal {R}}}\,}}(A_{j+1}))\le Lx^2<y^2/2\) for all j. Since \({{\,\mathrm{{\mathcal {R}}}\,}}(A_1)=A\) contains y, inequality (16) implies \(y\in {{\,\mathrm{{\mathcal {R}}}\,}}(A_j)\) for all j. The same argument shows that \(z\in {{\,\mathrm{{\mathcal {R}}}\,}}(A_j)\) for all j. Thus, we have \({{\,\mathrm{{\mathcal {R}}}\,}}(B)=\{u, y, z\}\) for some \(u\in X\).

The set \({{\,\mathrm{{\mathcal {R}}}\,}}(B)\cap [0, 2x]\) has at most one point because \(z>y>2x\). Since B contains both 0 and x, it follows that \(\Delta ( {{\,\mathrm{{\mathcal {R}}}\,}}(B), B) \ge x/2\). On the other hand, Lemma 1 yields

$$\begin{aligned} \Delta ( {{\,\mathrm{{\mathcal {R}}}\,}}(B), B) \le (L+1)\delta _4(B) \le (L+1)(z-y) < x/2, \end{aligned}$$

where the last step uses (17). This contradiction completes the proof. \(\square \)

The proof of Proposition 3 shows the non-existence of a Lipschitz retraction from X(4) to X(3). In contrast, for every subset \(X\subset {\mathbb {R}}\) and any \(n\ge 2\) one has 1-Lipschitz retractions \(A\mapsto \{\max (A)\}\) from X(n) to X(1), and \(A\mapsto \{\min (A), \max (A)\}\) from X(n) to X(2). The exceptional nature of retractions onto X(1) and X(2) was also observed in [9] in the context of finite subsets of trees.

Corollary 3

A Lipschitz retraction \(X(n)\rightarrow X(k)\) does not necessarily factor into a chain of retractions

$$\begin{aligned}X(n)\rightarrow X(n-1)\rightarrow \cdots \rightarrow X(k+1)\rightarrow X(k).\end{aligned}$$

Proof

Let X be the set in Proposition 3. The map \(X(4)\rightarrow X(2)\) sending every set A to \(\{\min A, \max A\}\) does not factor through a retraction onto X(3), since none exist. \(\square \)

Proposition 3 demonstrates that Hölder continuous clustering may be possible in some settings where Lipschitz clustering is unavailable. As another possible instance of this phenomenon, Akofor [2] proved that for every normed space X there are locally Hölder retractions \(X(n)\rightarrow X(n-1)\), while the existence of Lipschitz retractions remains unknown [21, Question 3.4].

5 Quasiconvexity of LCP Spaces

The main result of this section gives a strong necessary condition for the Lipschitz Clustering Property. It exhibits a dichotomy for LCP spaces: they are either well connected by rectifiable curves, or do not admit any bi-Lipschitz maps from an interval. This result will be used to show that the LCP is not inherited by products or disjoint unions.

Theorem 3

Suppose that a metric space (Xd) supports an L-Lipschitz retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(4)\rightarrow X(3)\) and the interval [0, 1] admits an L-bi-Lipschitz embedding into X. Then there exist positive constants r and M, depending only on L, such that any two points \(p, q\in X\) with \(d(p, q)\le r\) can be connected by a curve of length at most Md(pq).

The proof of Theorem 3 is preceded by several lemmas.

Lemma 4

[20, Lemma 4.3] Let Z and X be metric spaces with \(D:={{\,\mathrm{diam}\,}}Z < \infty \). Suppose that \(f :Z \rightarrow X(n)\) is an L-Lipschitz function such that

$$\begin{aligned} {{\,\mathrm{diam}\,}}f(z_0) > 3(n-1)LD \quad \text {for some } z_0\in Z. \end{aligned}$$
(18)

Then, there are L-Lipschitz functions \(g, h:Z \rightarrow X(n-1)\) such that \(f(z) = g(z)\cup h(z)\) for all \(z \in Z\). Specifically, one can let

$$\begin{aligned} \begin{aligned} g(z)&= \{x \in f(z) :{{\,\mathrm{dist}\,}}(x, E) \le LD\}; \\ h(z)&= \{x \in f(z) :{{\,\mathrm{dist}\,}}(x, f(z_0)\setminus E) \le LD\} = f(z)\setminus g(z), \end{aligned} \end{aligned}$$
(19)

where E can be any subset of \(f(z_0)\) that satisfies \({{\,\mathrm{diam}\,}}E \le 3LD(|E| - 1)\) and is a maximal such subset with respect to containment.

Lemma 4 can be refined when the domain Z is an interval and the cardinality of f(z) does not depend on z.

Lemma 5

Let X be a metric space. Let \(I\subset {\mathbb {R}}\) be an interval, possibly unbounded. Any L-Lipschitz map \(f :I\rightarrow X(n)\setminus X(n-1)\) can be decomposed as

$$\begin{aligned} f(t)=\{f_1(t), \dots , f_n(t)\}, \end{aligned}$$
(20)

where each function \(f_k:I\rightarrow X\) is L-Lipschitz.

Proof

The minimal separation function \(\delta _n(f(t))\) is continuous and positive on I. Therefore, for every compact subinterval J there exists \(\delta >0\) such that \(\delta _n(f(t))\ge \delta \) for all \(t\in J\). Let \(\epsilon = \delta /(3L(n-1))\). On any subinterval \(S\subset J\) with \({{\,\mathrm{diam}\,}}S < \epsilon \) one can apply Lemma 4 repeatedly to obtain a decomposition of the form (20), because any subset of f(t) with more than one element has diameter at least \(\delta \).

Let \(J=\bigcup _{i=1}^N [t_{i-1}, t_i]\) be a partition of J into subintervals of length less than \(\epsilon \). We have an L-Lipschitz decomposition for each i,

$$\begin{aligned} f(t)=\{f_1^i(t), \dots , f_n^i(t)\}, \quad t\in [t_{i-1}, t_i]. \end{aligned}$$

Since \(\{f_1^i(t_i), \dots , f_n^i(t_i)\} = \{f_1^{i+1}(t_i), \dots , f_n^{i+1}(t_i)\}\), we can relabel the functions to achieve \(f_k^i(t_i)=f_k^{i+1}(t_i)\) for all \(k=1, \dots , n\) and all \(i=1, \dots N-1\). This produces an L-Lipschitz decomposition of f on J.

Since I can be partitioned into countably many compact subintervals, the above process of concatenation produces an L-Lipschitz decomposition of f on all of I. \(\square \)

Lemma 6

If a set \(\{p, q\}\subset X\) is connected to some singleton \(\{c\}\) by a curve of length \(\ell \) in X(2), then the points p and q are connected in X by a curve of length at most \(2\ell \).

Proof

We may assume \(p\ne q\). Let \(\Gamma :[0, \ell ]\rightarrow X(2)\) be a curve with \(\Gamma (0)=\{p, q\}\) and \(\Gamma (\ell )=\{c\}\), parametrized by arclength. Since \(\delta _2:X(2)\rightarrow [0,\infty )\) is a continuous function, the set \(S = \{t\in [0, \ell ] :\delta _2(\Gamma (t))=0\}\) is closed. Let \(t_0=\inf S\). Since the restriction of \(\Gamma \) to \([0, t_0)\) takes values at \(X(2)\setminus X(1)\), Lemma 5 provides a decomposition \(\Gamma (t)=\{\gamma _1(t), \gamma _2(t)\}\) where both \(\gamma _1\) and \(\gamma _2\) are 1-Lipschitz on \([0, t_0)\). The curves \(\gamma _1\) and \(\gamma _2\) have the same limit as \(t\rightarrow t_0-\) because \(|\Gamma (t_0)|=1\). Thus, their concatenation is a curve from p to q of length at most \(2t_0\le 2\ell \). \(\square \)

Proof of Theorem 3

By assumption there exists an L-bi-Lipschitz embedding \(\Gamma :[0, 1]\rightarrow X\). Let

$$\begin{aligned} r = \frac{1}{48(L+1)^5}. \end{aligned}$$
(21)

Fix distinct points pq with \(d(p, q)\le r\) and let \(E=\{p, q\}\). By the triangle inequality there exists \(t_0\in \{0, 1\}\) such that \(d(\Gamma (t_0), p)\ge 1/(2L)\). We may assume \(t_0=0\) (otherwise, reverse the parametrization of \(\Gamma \)). Let \(I = [0, D]\) where \(D=1/(24L^3)\). Observe that

$$\begin{aligned} \frac{1}{24L^4} \le {{\,\mathrm{diam}\,}}\Gamma (I)\le \frac{1}{24L^2}, \end{aligned}$$
(22)

and therefore,

$$\begin{aligned} {{\,\mathrm{dist}\,}}(\Gamma (I), E)\ge \frac{1}{2L} - \frac{1}{24L^2} \ge \frac{11}{24 L}. \end{aligned}$$
(23)

Define \(f:I\rightarrow X(3)\) by \(f(t) = {{\,\mathrm{{\mathcal {R}}}\,}}(\{\Gamma (t), \Gamma (0), p, q\})\). By construction, f is \(L^2\)-Lipschitz.

Let us check that the assumptions of Lemma 4 are satisfied for the map f with domain \(Z=I\) of diameter \(D=1/(24L^3)\), distinguished point \(z_0 = 0\), Lipschitz constant \(L^2\), and with the choice \(E=\{p, q\}\subset f(0)\). Indeed,

$$\begin{aligned} {{\,\mathrm{diam}\,}}f(0) = {{\,\mathrm{diam}\,}}\{\Gamma (0), p, q\} \ge \frac{1}{2L} = 12 L^2 D, \end{aligned}$$

while

$$\begin{aligned} {{\,\mathrm{diam}\,}}E = d(p, q) \le r < \frac{1}{8L} = 3L^2D. \end{aligned}$$

The maximality assumption of Lemma 4 holds because E is a maximal proper subset of f(0).

Lemma 4 provides \(L^2\)-Lipschitz maps \(g, h:I\rightarrow X(2)\) such that \(f(t)=g(t)\cup h(t)\) for all \(t\in I\), and, moreover,

$$\begin{aligned} g(t) = \{x\in f(t):{{\,\mathrm{dist}\,}}(x, E)\le L^2 D\}. \end{aligned}$$
(24)

In particular, \(g(0)=E=\{p, q\}\) because \(f(0) = \{\Gamma (0), p, q\}\).

Claim: There exists \(T\in [0, D]\) such that

$$\begin{aligned} |g(T)|=1 \quad \text {and} \quad T\le 2(L+1)^2 d(p, q). \end{aligned}$$
(25)

Assuming this claim for now, let us show how it implies the statement of Theorem 3. Since g is \(L^2\)-Lipschitz, the set \(g(0)=\{p, q\}\) is connected to the singleton g(T) by a curve of length at most \(L^2T\) in X(2). By Lemma 6 there is a curve of length at most \(2L^2T\) connecting p to q in X. By virtue of (25) the statement of Theorem 3 holds with \(M = 4L^2(L+1)^2\).

It remains to prove (25). Fix \(t\in [0, D]\) such that g(t) contains more than one point. Since g(t) and h(t) are disjoint and their union has at most three elements, it follows that h(t) is a singleton.

Consider the set \(A:=\{\Gamma (t), \Gamma (0), p, q\}\). By Lemma 1, the Hausdorff distance between A and \(f(t)={{\,\mathrm{{\mathcal {R}}}\,}}(A)\) can be estimated as

$$\begin{aligned} \Delta (f(t), A) \le (L+1)\delta _4(A)\le (L+1)d(p, q). \end{aligned}$$
(26)

But we can also estimate \(\Delta (f(t), A)\) from below. Indeed, (23) and (24) imply that

$$\begin{aligned} {{\,\mathrm{dist}\,}}(\Gamma (I), g(t))\ge {{\,\mathrm{dist}\,}}(\Gamma (I), E) - L^2D \ge \frac{11}{24L} - \frac{1}{24L} = \frac{5}{12L}. \end{aligned}$$
(27)

The fact that h(t) is a singleton implies that for some \(s\in \{0, t\}\)

$$\begin{aligned} {{\,\mathrm{dist}\,}}(\Gamma (s), h(t)) \ge \frac{1}{2}d(\Gamma (0), \Gamma (t)) \ge \frac{t}{2L}. \end{aligned}$$
(28)

Since A contains both \(\Gamma (0)\) and \(\Gamma (t)\), it follows from (27) and (28) that

$$\begin{aligned} \Delta (f(t), A) \ge {{\,\mathrm{dist}\,}}(f(t), \Gamma (s)) \ge \min \left( \frac{5}{12L}, \frac{t}{2L}\right) = \frac{t}{2L}, \end{aligned}$$
(29)

where the last step is based on \(t\le D\le 1/24\).

From (26) and (29) we obtain

$$\begin{aligned} \frac{t}{2L} \le (L+1)d(p, q). \end{aligned}$$
(30)

By (30), we have \(|g(T)|=1\) for any T such that \(2L(L+1)d(p, q)<T \le D\). To see that this interval is nonempty, recall (21) which implies

$$\begin{aligned} 2L(L+1)d(p, q)\le \frac{2L(L+1)}{48(L+1)^5} < \frac{1}{24L^3} = D. \end{aligned}$$

This completes the proof of Claim (25) and of the theorem. \(\square \)

Example 1

For every \(\epsilon >0\) the set \(X=[-1, 0]\cup [\epsilon , 1]\) has the LCP by Theorem 2. However, the Lipschitz constant of any retraction of X(4) onto X(3) cannot be bounded by a constant L independent of \(\epsilon \). Indeed, if this was possible, then by Theorem 3 we would have \(r>0\), independent of \(\epsilon \), such that any two points \(p, q\in X\) with \(|p-q|\le r\) can be connected by a curve in X. But this is false when \(\epsilon \le r\).

Under the stronger hypothesis of containing bi-Lipschitz images of long line segments, the conclusion of Theorem 3 can be strengthened to global quasiconvexity.

Corollary 4

Suppose that a metric space X supports an L-Lipschitz retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(4)\rightarrow X(3)\) and for every \(T>0\) the interval [0, T] admits an L-bi-Lipschitz embedding into X. Then X is quasiconvex.

Proof

Let \(X_\epsilon \) be the rescaling of metric space (Xd) by the factor \(\epsilon >0\); that is, the metric on \(X_\epsilon \) is \(\epsilon d\). The retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(4)\rightarrow X(3)\) induces a retraction \(X_\epsilon (4)\rightarrow X_\epsilon (3)\) with the same Lipschitz constant. Also, an L-bi-Lipschitz embedding \(\Gamma :[0, \epsilon ^{-1}]\rightarrow X\) induces an L-bi-Lipschitz embedding of [0, 1] into \(X_\epsilon \), namely \(\Gamma _\epsilon (t)=\Gamma (\epsilon ^{-1} t)\).

Applying Theorem 3 to \(X_\epsilon \) we find that there exist M and r, which depend only on L, such that any two points \(p, q\in X_\epsilon \) with \(\epsilon d(p, q) \le r\) can be joined by a curve \(\gamma \) of length at most \(M \epsilon d(p, q)\) in \(X_\epsilon \).

Given any two points \(p, q\in X\), we can choose \(\epsilon >0\) such that \(\epsilon d(p, q)\le r\). The previous paragraph provides a curve connecting p to q in X, the length of which is at most Md(pq) in the metric of X. \(\square \)

Example 2

Let \(X={\mathbb {R}}\times {\mathbb {Z}}\), considered as a subset of \({\mathbb {R}}^2\) with the restriction metric. Since X contains lines but is not a connected space, by Corollary 4 it does not have the LCP. Thus, additive subgroups of \({\mathbb {R}}^2\) do not have the LCP in general, in contrast to Corollary 2.

Example 3

Consider the parabola \(P=\{(x, y)\in {\mathbb {R}}^2:y = x^2\}\) with the restriction metric inherited from \({\mathbb {R}}^2\). The set P contains 2-bi-Lipschitz images of arbitrarily long line segments. Since P is not quasiconvex, by Corollary 4 it does not have the LCP.

6 Transformations of Metric Spaces

This section concerns the invariance of the Lipschitz Clustering Property under certain transformations of metric spaces. Since any bi-Lipschitz map \(f:X\xrightarrow {\text {onto}}Y\) induces a bi-Lipschitz map of X(n) onto Y(n), it follows that LCP is bi-Lipschitz invariant. The following lemma extends this observation.

Lemma 7

[3, Lemma 3.3] Suppose that X and Y are metric spaces and there exist Lipschitz maps \(f:X\rightarrow Y\) and \(g:Y\rightarrow X\) with \(f\circ g={{\,\mathrm{id}\,}}_Y\). If X has the LCP, then so does Y.

For example, Lemma 7 applies when \(Y\subset X\) and \(g:Y\rightarrow X\) is the inclusion map. In this case, the existence of a Lipschitz map f with \(f\circ g={{\,\mathrm{id}\,}}_Y\) means precisely that Y is a Lipschitz retract of X.

The Lipschitz retracts of \({\mathbb {R}}^d\) have a transparent characterization when \(d=2\) [17, Theorem 2.11] and a less transparent one for \(d>2\) ( [19, Theorem 3.4] and [17, Theorem 2.12]). All these retracts have the LCP by Lemma 7. Example 4 will show that the converse is not true even for connected sets: a connected LCP subset of \({\mathbb {R}}^d\) need not be a Lipschitz retract of \({\mathbb {R}}^d\).

The class of quasisymmetric maps in metric spaces [16, Chapters 10–12] contains bi-Lipschitz class as a proper subset.

Definition 4

A homeomorphism \(f:X\xrightarrow {\text {onto}}Y\) is quasisymmetric if there exists a homeomorphism \(\eta :[0, \infty ) \rightarrow [0, \infty )\) such that for any three distinct points xuv in X we have

$$\begin{aligned} \frac{d_Y(f(x), f(u))}{d_Y(f(x), f(v))} \le \eta \left( \frac{d_X(x, u)}{d_X(x, v)}\right) . \end{aligned}$$

Using Theorem 3 we can show that the Lipschitz Clustering Property is not invariant under quasisymmetric maps. Indeed, a quasisymmetric image of \({\mathbb {R}}\) may be a curve \(\Gamma \) that contains both a line segment and an unrectifiable arc such as the von Koch snowflake. This follows from Ahlfors’ characterization of such images in terms of the three-point “bounded turning” condition [1, §IV.D]. By Theorem 3, \(\Gamma \) does not have the LCP.

The following lemma shows the LCP is preserved by certain transformations of the metric d, such as the snowflake transform \(d\mapsto d^\alpha \), \(0<\alpha <1\). Note that the identity map from (Xd) onto \((X, d^\alpha )\) is quasisymmetric but not bi-Lipschitz.

Lemma 8

Suppose (Xd) is a metric space with the LCP, and \(\varphi :[0, \infty )\rightarrow [0, \infty )\) is a nondecreasing function such that \(\varphi \circ d\) is a metric on X. If, in addition, there exists a constant M such that

$$\begin{aligned} \varphi (2t)\le M\varphi (t)\quad \text {for all } t\in [0, \infty ), \end{aligned}$$
(31)

then \((X, \varphi \circ d)\) has the LCP.

Proof

Consider a retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n) \rightarrow X(n-1)\) that is L-Lipschitz with respect to the Hausdorff metric \(\Delta \) based on d. The doubling property (31) implies that there is a constant \(L'\) such that \(\varphi (Lt)\le L'\varphi (t)\) for all \(t\in [0, \infty )\). Therefore, for any \(A, B\in X(n)\) we have

$$\begin{aligned} \varphi (\Delta ({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(B))) \le \varphi (L \Delta (A, B)) \le L'\varphi (\Delta (A, B)), \end{aligned}$$

which proves the claim. \(\square \)

Example 4

Fix \(\alpha \in (1/2, 1)\). By [5, Proposition 4.4] there exists a map \(f:[0, 1]\rightarrow {\mathbb {R}}^2\) such that

$$\begin{aligned} C^{-1} |x-y|^\alpha \le |f(x)-f(y)|\le C|x-y|^\alpha ,\quad x, y\in \mathbb [0, 1], \end{aligned}$$

for some constant C. Lemma 8 shows that the snowflake-type curve \(\Gamma =f([0, 1])\) has the LCP. On the other hand, \(\Gamma \) contains no rectifiable curves and therefore is not a quasiconvex set. Consequently, it is not a Lipschitz retract of \({\mathbb {R}}^2\).

Since Lipschitz maps generally behave well with respect to Cartesian products, one may expect the Lipschitz Clustering Property to be inherited by such products. However, Rickman’s rug, one of standard examples of a fractal surface ([11, 14], or [24, p. 65]), provides a counterexample. By definition, a Rickman’s rug is the Cartesian product of a line segment I with the “snowflake” \(I_\alpha \) which is the set I equipped with the metric \(|x-y|^\alpha \), \(0<\alpha <1\). Both factors I and \(I_\alpha \) have the LCP by Theorem 2 and Lemma 8. However, Theorem 3 will show that the product \(I\times I_\alpha \) does not have the LCP, since it contains some line segments without being locally connected by rectifiable curves.

The argument from the previous paragraph applies to the disjoint union \(X=I\sqcup I_\alpha \) as well. It follows that the disjoint union of two compact LCP spaces need not have the LCP.

The Lipschitz Clustering Property is preserved by quasihomogeneous maps, which form an intermediate class between bi-Lipschitz and quasisymmetric maps.

Definition 5

[13, 15, 18] A homeomorphism \(f:X\xrightarrow {\text {onto}}Y\) is quasihomogeneous if there exists a homeomorphism \(\eta :[0, \infty ) \rightarrow [0, \infty )\) such that for any four distinct points \(x_1, \dots , x_4\) in X we have

$$\begin{aligned} \frac{d_Y(y_1, y_2)}{d_Y(y_3, y_4)} \le \eta \left( \frac{d_X(x_1, x_2)}{d_X(x_3, x_4)}\right) \end{aligned}$$
(32)

where \(y_k=f(x_k)\), \(k=1, \dots , 4\).

Proposition 4

If \(f:X\xrightarrow {\text {onto}}Y\) is a quasihomogeneous map and X has the LCP, then Y has the LCP as well.

Proof

Property (32) can be equivalently stated as follows: if \(t > 0\) and \(d_X(x_1, x_2)\le t d_X(x_3, x_4)\), then \(d_Y(y_1, y_2) \le \eta (t) d_Y(y_3, y_4)\).

Our first goal is to prove that if \(A_1, \dots , A_4\in X(n)\) satisfy \(\Delta _X(A_1, A_2) \le t \Delta _X(A_3, A_4)\), then the sets \(B_k=f(A_k)\) satisfy

$$\begin{aligned} \Delta _Y(B_1, B_2)\le \eta (t) \Delta _Y(B_3, B_4). \end{aligned}$$
(33)

That is, the map \(X(n)\rightarrow Y(n)\) defined by \(A\mapsto f(A)\) is \(\eta \)-quasihomogeneous. A similar observation was made in [22, Theorem 3.4] but it was not quantified as in (33).

Because \(B_1\) and \(B_2\) are interchangeable, to prove (33) it suffices to show that \(\forall y_1\in B_1 \ \exists y_2\in B_2\) such that

$$\begin{aligned} d_Y(y_1, y_2)\le \eta (t) \Delta _Y(B_3, B_4). \end{aligned}$$
(34)

Given \(y_1\in B_1\), let \(x_1=f^{-1}(y_1)\) and choose \(x_2\in A_2\) so that \(d_X(x_1, x_2)\le \Delta _X(A_1, A_2)\). We claim that the point \(y_2:=f(x_2)\) satisfies (34).

After exchanging the roles of \(A_3\) and \(A_4\) if necessary, we can find \(x_3\in A_3\) such that \(d_X(x_3, x_4)\ge \Delta _X(A_3, A_4)\) for all \(x_4\in A_4\). Thus, with the above choice of \(x_1, x_2, x_3\) we have

$$\begin{aligned} d_X(x_1, x_2)\le t d_X(x_3, x_4), \quad \forall x_4\in A_4. \end{aligned}$$

The quasihomogeneity of f implies

$$\begin{aligned} d_Y(y_1, y_2)\le \eta (t) d_Y(y_3, y_4), \quad \forall y_4\in B_4, \end{aligned}$$

where \(y_3=f(x_3)\). Taking the minimum over \(y_4\in B_4\) we obtain (34). This completes the proof of (33).

By assumption, there exists a Lipschitz retraction \({{\,\mathrm{{\mathcal {R}}}\,}}:X(n)\rightarrow X(n-1)\). For any sets \(A, A'\in X(n)\) we have \(\Delta _X({{\,\mathrm{{\mathcal {R}}}\,}}(A), {{\,\mathrm{{\mathcal {R}}}\,}}(A')) \le L \Delta _X(A, A')\) where \(L={{\,\mathrm{Lip}\,}}({{\,\mathrm{{\mathcal {R}}}\,}})\). Inequality (33) yields

$$\begin{aligned} \Delta _Y(f({{\,\mathrm{{\mathcal {R}}}\,}}(A)), f({{\,\mathrm{{\mathcal {R}}}\,}}(A')))\le \eta (L) \Delta _Y(f(A), f(A')). \end{aligned}$$

Therefore, the map \({\widetilde{{{\,\mathrm{{\mathcal {R}}}\,}}}}:Y(n)\rightarrow Y(n-1)\), defined by \({\widetilde{{{\,\mathrm{{\mathcal {R}}}\,}}}}(B) = f({{\,\mathrm{{\mathcal {R}}}\,}}(f^{-1}(B)))\), is \(\eta (L)\)-Lipschitz. It is easy to see that \({\widetilde{{{\,\mathrm{{\mathcal {R}}}\,}}}}\) is a retraction onto \(Y(n-1)\). \(\square \)

For example, Proposition 4 shows that every quasihomogeneous image of Euclidean space \({\mathbb {R}}^d\) has the LCP. Freeman [13] proved that an unbounded Jordan curve \(\Gamma \) in the plane is a quasihomogeneous image of \({\mathbb {R}}\) if and only if \(\Gamma \) is bi-Lipschitz homogeneous, meaning that there exists L such that for any two points \(x,y\in \Gamma \) there exists an L-bi-Lipschitz self homeomorphism of \(\Gamma \) sending x to y. Quasihomogeneous images of \({\mathbb {R}}^2\) have been described in [14].

7 Questions and Remarks

Question 1

Is the converse of Corollary 1 true? That is, does the existence of a uniformly Lipschitz family of retractions \(X(n)\rightarrow X(m)\) for all \(n>m\ge 1\) imply that X is uniformly disconnected?

Some evidence in favor of affirmative answer is provided by Corollary 5.2 in [3] which states that if X contains a bi-Lipschitz image of a line segment as its Lipschitz retract, then for any family of retractions \({{\,\mathrm{{\mathcal {R}}}\,}}_n:X(n)\rightarrow X(n-1)\) the ratio \(n^{-1}{{\,\mathrm{Lip}\,}}{{\,\mathrm{{\mathcal {R}}}\,}}_n\) is bounded below by a positive constant.

Question 2

Does the existence of a Lipschitz retraction \(X(n+1)\rightarrow X(n)\) imply the existence of Lipschitz retraction \(X(n)\rightarrow X(n-1)\)? Here X is a general metric space and \(n\ge 2\).

Question 3

If a homeomorphic image of \({\mathbb {R}}^d\) has the LCP, is it a space of bounded turning [16, p. 120]? Note that quasihomogeneous images of \({\mathbb {R}}^d\) have the LCP by Proposition 4 and they are spaces of bounded turning.

Question 4

Is there a geometric description of the LCP subsets of \({\mathbb {R}}\)? Some natural examples that are not covered by §4 are: Cantor-type sets that are not uniformly disconnected (e.g., those of positive measure), countable sets formed by convergent sequences (such as the set in Proposition 3), and the set of irrational numbers \({\mathbb {R}}\setminus {\mathbb {Q}}\). Example 1 suggests that Cantor-type sets of positive measure do not have the LCP.