1 Introduction

This paper is concerned with structures created by taking (many) random points and building the structure based on neighbourhood relations between the points. Perhaps the simplest way to describe this is to let \(\Phi = \{x_1,x_2,\dots \}\) be a finite or countable, locally finite, subset of points in \({\mathbb {R}}^d\), for some \(d>1\), and to consider the set

$$\begin{aligned} \mathcal{C}_B(\Phi ,r) \mathop {=}\limits ^{\Delta }\bigcup _{x \in \Phi }B_{x}(r), \end{aligned}$$
(1.1)

where \(0<r<\infty \), and \(B_x(r)\) denotes the d-dimensional ball of radius r centred at \(x \in {\mathbb {R}}^d\).

When the points of \(\Phi \) are those of a stationary Poisson process on \({\mathbb {R}}^d\), this union is a special case of a ‘Boolean model’, and its integral geometric properties—such as volume, surface area, Minkowski functionals—have been studied in the setting of stochastic geometry since the earliest days of that subject. Our interest, however, lies in the homological structure of \(\mathcal{C}_B(\Phi ,r)\), in particular, as expressed through its Betti numbers. Thus our approach will be via the tools of algebraic topology, and, to facilitate this, we shall generally work not with \(\mathcal{C}_B(\Phi ,r)\) but with a homotopically equivalent abstract simplicial complex with a natural combinatorial structure. This will be the Čech complex with radius r built over the point set \(\Phi \), denoted by \({{\mathcal {C}}}(\Phi ,r)\), and defined below in Sect. 2.1.

The first, and perhaps most natural, topological question to ask about these sets is how connected they are. This is more a graph theoretic question than a topological one, and has been well studied in this setting, with [31] being the standard text in the area. There are various ‘regimes’ in which it is natural to study these questions, depending on the radius r. If r is small, then the balls in (1.1) will only rarely overlap, and so the topology of both \(\mathcal{C}_B(\Phi ,r)\) and \({{\mathcal {C}}}(\Phi ,r)\) will be mainly that of many isolated points. This is known as the ‘dust regime’. However, as r grows, the balls will tend to overlap, and so a large, complex structure will form, leading to the notion of ‘continuum percolation’, for which the standard references are [16, 25]. The percolation transition occurs within what is known as the ‘thermodynamic’ regime (described in more detail in Sect. 2.2), and is typically the hardest to analyse. The third and final regime arises as r continues to grow, and (loosely speaking) \(\mathcal{C}_B(\Phi ,r)\) merges into a single large set with no holes, and so no interesting topology.

Motivated mainly by issues in topological data analysis (e.g. [27, 28]) there has been considerable recent interest in the topological properties of \(\mathcal{C}_B(\Phi ,r)\) and \({{\mathcal {C}}}(\Phi ,r)\) that go beyond mere connectivity or the volumetric measures provided by integral geometry. These studies were initiated by Kahle [19], who studied the growth of the expected Betti numbers of these sets when the underlying point process \(\Phi \) was either a Poisson process or a random sample from a distribution satisfying mild regularity properties. Shortly afterwards, more sophisticated distributional results were proven in [21]. An extension to more general stationary point processes \(\Phi \) on \({\mathbb {R}}^d\) can be found in [42], while, in the Poisson and binomial settings, [3] looks at these problems from the point of view of the Morse theory of the distance function. Recently, [5] established extensions of the results of [3, 19, 21], important from the point of view of applications, in which the underlying point process lies on a manifold of lower dimension than the ambient Euclidean space in which the balls of (1.1) are defined. See also the recent survey [4].

However, virtually all of the results described in the previous paragraph (with the notable exception of some growth results for expected Betti numbers in [19] and numbers of critical points in [3]) deal with the topology of the dust regime. What is new in the current paper is a focus on the thermodynamic regime, and new results that go beyond the earlier ones about expectations. Moreover, because of the long range dependencies in the thermodynamic regime, proofs here involve considerably more topological arguments than is the case for the dust regime.

Our main results are summarised in the following subsection, after which we shall give some more details about the current literature. Then, in Sect. 2, we shall recall some basic notions from topology and from the theory of point processes. The new results begin in Sect. 3, where we shall treat the setting of general stationary point processes, while the Poisson and binomial settings will be treated in Sect. 4. The paper concludes with some appendices containing variations of some known tools, adapted to our needs.

1.1 Summary of results

Throughout the paper we shall assume that all our point processes are defined over \({\mathbb {R}}^d\) for \(d \ge 2\). Denoting Betti numbers of a set \(A \subset {\mathbb {R}}^d\) by \(\beta _k(A)\), \(k=1,\dots ,d-1\), we are interested in \(\beta _k(\mathcal{C}_B(\Phi ,r))\) for point processes \(\Phi \subset {\mathbb {R}}^d\). Since the Betti numbers for \(k\ge d\) are identically zero, these values of k are uninteresting. On the other hand, \(\beta _0(A)\) gives the number of connected components of A. While this is clearly interesting and important in our setting, it has already been studied in detail from the point of view of random graph theory, as described above. Indeed, (sometimes stronger) versions of virtually all our results for the higher Betti numbers already exist for \(\beta _0\) (cf. [1, 31]), and so this case will appear only peripherally in what follows.

Here is a summary of our results, grouped according to the underlying point processes involved. Formal definitions of technical terms are postponed to later sections.

  1.

    General stationary point processes: For a stationary point process \(\Phi \) and \(r \in (0,\infty )\), we study the asymptotics of \(\beta _k(\mathcal{C}_B(\Phi \cap W_l,r))\) as \(l \rightarrow \infty \), where \(W_l = [-\frac{l}{2},\frac{l}{2})^d\). We show convergence of expectations (Lemma 3.3) and, assuming ergodicity, we prove strong laws (Theorem 3.5) for all the Betti numbers and a concentration inequality for \(\beta _0\) (Theorem 3.6) in the special case of determinantal point processes.

  2.

    Stationary Poisson point processes: Retain the same notation as above, but take \(\Phi = {\mathcal {P}}\), a stationary Poisson point process on \({\mathbb {R}}^d\). In this setting we prove a central limit theorem (Theorem 4.7) for the Betti numbers of \(\mathcal{C}_B({\mathcal {P}}\cap W_l, r)\) and \({{\mathcal {C}}}({\mathcal {P}}\cap W_l,r)\), for any \(r\in (0,\infty )\), as \(l\rightarrow \infty \). We also treat the case in which l points are chosen uniformly in \(W_l\) and obtain a similar result, although in this case we can only prove the central limit theorem for \(r \notin I_d\), where the interval \(I_d\) will be defined in Sect. 4.2. Informally, \(I_d\) is the interval of radii where both \(\mathcal{C}_B({\mathcal {P}},r)\) and its complement have unbounded components a.s. We only remark here that \(I_2 = \emptyset \) and \(I_d\) is a non-degenerate interval for \(d \ge 3\).

  3.

    Inhomogeneous Poisson and binomial point processes: Now consider either the Poisson point process \({\mathcal {P}}_n\) with non-constant intensity function nf, for a ‘nice’, compactly supported, density f, or the binomial process of n i.i.d. random variables with common probability density f. In this case the basic set-up requires a slight modification, and so we consider asymptotics for \(\beta _k(\mathcal{C}_B({\mathcal {P}}_n,r_n))\) as \(n \rightarrow \infty \) and \(nr_n^d \rightarrow r \in (0,\infty )\). We derive an upper bound for variances and a weak law (Lemma 4.2). In the Poisson case, we also derive a variance lower bound for the top homology. For the corresponding binomial case we prove a concentration inequality (Theorem 4.5), and use this to derive a strong law for both cases (Theorem 4.6).

A few words on our proofs: in the case of stationary point processes, we shall use the near-additivity of Betti numbers along with sub-additivity arguments [39, 43]. In the Poisson and binomial cases, the proofs centre on an analysis of the so-called add-one cost function,

$$\begin{aligned} \beta _k(\mathcal{C}_B({\mathcal {P}}\cup \{O\},r)) - \beta _k(\mathcal{C}_B({\mathcal {P}},r)), \end{aligned}$$

where O is the origin in \({\mathbb {R}}^d\). While simple combinatorial topology bounds with martingale techniques suffice for strong laws, weak laws, and concentration inequalities, a more careful analysis via the Mayer–Vietoris sequence is required for the central limit theorems.

Our central limit theorems rely on similar results for stabilizing Poisson functionals (cf. [33]), which in turn were based upon martingale central limit theory. As for variance bounds, while upper bounds can be derived via Poincaré or Efron–Stein inequalities, the more involved lower bounds exploit the recent bounds developed in [23] using chaos expansions of Poisson functionals.

One of the difficulties in analyzing Betti numbers, which will become obvious in the proof of the central limit theorem, is their global nature. Most known examples of stochastic geometric functionals satisfy both of the notions of stabilization (cf. [33]) known as ‘weak’ and ‘strong’ stabilization. However, we shall prove that the higher Betti numbers always satisfy weak stabilization, but we can establish strong stabilization only in certain regimes of radii. The obstacle to proving strong stabilization for all radii is precisely the global dependence of Betti numbers on the underlying point process.

1.2 Some history

To put our results into perspective, and to provide some motivation, here is a little history.

As already mentioned, recent interest in random geometric complexes was stimulated by their connections to topological data analysis and, more broadly, applied topology. There are a number of accessible surveys on this subject (e.g. [6, 9, 12, 15, 45]), all of which share a common theme of studying topological invariants of simplicial complexes built on point sets. At the time of writing, another excellent review [7] by Carlsson appeared, which is longer than the earlier ones, more up to date, and which also contains a gentle introduction to the topological concepts needed in the current paper. Throughout this literature, Betti numbers, apart from being a simple topological invariant, appear as the first step to understanding persistent homology, undoubtedly the single most important tool to emerge from current research in applied topology.

Although the study of random geometric complexes seems to have originated in [19], it is worth noting that Betti numbers of a random complex model were already investigated in [24], where a higher-dimensional version of the Erdős–Rényi random graph model was constructed. The recent paper [20] gives a useful survey of what is known about the topology of these models, which are rather different to those in the current paper.

As we already noted above, the Boolean model (1.1) has long been studied in stochastic geometry, mainly through its volumetric measures. However, one of these measures is the Euler characteristic, \(\chi \), which is also one of the basic homotopy invariants of topology, and is given by an alternating sum of Betti numbers. Results for Euler characteristics which are related to ours for the individual Betti numbers can be found, for example, in [35], which establishes ergodic theorems for \(\chi (\mathcal{C}_B(\Phi ,r))\) when the underlying point process \(\Phi \) is itself ergodic. More recently, a slew of results have been established for \(\chi (\mathcal{C}_B({\mathcal {P}},r))\) (i.e. the Poisson case) in the preprint [18]. The arguments there replace the more classical integral geometric ones, and are based on new results in the Malliavin–Stein approach to limit theory (cf. [29] and esp. [36]). To some extent we shall also exploit these methods in the current paper, although they are not as well suited to the study of Betti numbers as they are to the Euler characteristic, due to the non-additivity of the former.

An alternative approach to the Euler characteristic of a simplicial complex is via an alternating sum of the numbers of faces of different dimensions. This fact has been used to good effect in [10], which derives exact expressions for moments of face counts, and a central limit theorem and concentration inequality for the Euler characteristic and \(\beta _0\) when the underlying space is a torus. (Working on a torus avoids the otherwise problematic boundary issues which complicate moment calculations.) Some additional results on phase transitions in face counts for a wide variety of underlying stationary point processes can be found in [42, Section 3].

1.3 Beyond the Čech complex

Although this paper concentrates on the Čech complex as the basic topological object determined by a point process, this is but one of the many geometric complexes that could have been chosen. There are various other natural choices including the Vietoris-Rips, alpha, witness, cubical, and discrete Morse complexes (cf. [14, Section 7], [44, Section 3]) that are also of interest. In particular, the alpha complex is homotopy equivalent to the Čech complex [44, Section 3.2], as is an appropriate discrete Morse complex [14, Theorem 2.5]. This immediately implies that all the limit theorems for Betti numbers in this paper also hold for these complexes.

Moreover, since our main topological tools—Lemmas 2.2 and 2.3—can be shown to hold for all the complexes listed above, most of our arguments should easily extend to obtain similar theorems for these cases as well.

2 Preliminaries

This section introduces a handful of basic concepts and definitions from algebraic topology and the theory of point processes. The aim is not to make the paper self-contained, which would be impossible, but to allow readers from these two areas to have at least the vocabulary for reading our results. We refer readers to the standard texts such as [17, 26] for more details on the topology we need, while [35, 40] cover the point process material.

2.1 Topological preliminaries

An abstract simplicial complex, or simply complex, \({{\mathcal {K}}}\) is a finite collection of finite sets such that \(\sigma _1 \in {{\mathcal {K}}}\) and \(\sigma _2 \subset \sigma _1\) implies \(\sigma _2 \in {{\mathcal {K}}}\). The sets in \({{\mathcal {K}}}\) are called faces or simplices and the dimension \(\text {dim}(\sigma )\) of any simplex \(\sigma \in {{\mathcal {K}}}\) is the cardinality of \(\sigma \) minus 1. If \(\sigma \in {{\mathcal {K}}}\) has dimension k, we say that \(\sigma \) is a k-simplex of \({{\mathcal {K}}}\). The k-skeleton of \({{\mathcal {K}}}\), denoted by \({{\mathcal {K}}}^k\), is the complex formed by all faces of \({{\mathcal {K}}}\) with dimension at most k.

Note that a collection consisting of a single simplex of dimension greater than zero is not a simplicial complex, since it does not contain the lower dimensional faces of that simplex. (This is as opposed to the usual concrete representation of simplices as subsets of Euclidean space, in which a closed simplex physically contains all its lower dimensional faces.) When we want to study the complex generated by a simplex \(\sigma \), we shall refer to it as the full simplex \(\sigma \) or, equivalently, as \(\sigma ^k\), its k-skeleton, where \(k = \text {dim}(\sigma )\).

A map \(g:\, {{\mathcal {K}}}^0 \rightarrow {{\mathcal {L}}}^0\) between two complexes \({{\mathcal {K}}}\) and \({{\mathcal {L}}}\) is said to be a simplicial map if, for any \(m \ge 0\), \(\{g(v_1),\ldots ,g(v_m)\}\in \mathcal L\) whenever \(\{v_1,\ldots ,v_m\}\in \mathcal K\).

Given a point set in \({\mathbb {R}}^d\) (or generally, in a metric space) there are various ways to define a simplicial complex that captures some of the geometry and topology related to the set. We shall be concerned with a specific construction—the so-called Čech complex.

Definition 2.1

Let \({{\mathcal {X}}}=\{x_i\}_{i=1}^n\subset {\mathbb {R}}^d\) be a finite set of points. For any \(r>0\), the Čech complex of radius r is the abstract simplicial complex

$$\begin{aligned} \mathcal{C}({{\mathcal {X}}},r) \triangleq \left\{ \sigma \subset {{\mathcal {X}}}:\, \bigcap _{x\in \sigma } B_x(r)\ne \emptyset \right\} , \end{aligned}$$

where \(B_x(r)\) denotes the ball of radius r centered at x.

For future reference, note that by the nerve theorem (cf. [2, Theorem 10.7]) the Čech complex built over a finite set of points is homotopy equivalent to the Boolean model (1.1) constructed over the same set.
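The definition above is easy to compute for small planar examples. The following is a minimal sketch of our own (the helper names are not from the paper): the balls \(B_x(r)\), \(x \in \sigma \), have a common point if and only if the smallest ball enclosing \(\sigma \) has radius at most r, and for at most three points in the plane this ball is either the diametral ball of some pair or the circumscribed circle.

```python
from itertools import combinations
from math import dist

def enclosing_radius(pts):
    """Radius of the smallest ball containing up to three planar points."""
    if len(pts) == 1:
        return 0.0
    if len(pts) == 2:
        return dist(pts[0], pts[1]) / 2
    # three points: try each pair's diametral ball, else the circumcircle
    for (a, b), c in [((pts[0], pts[1]), pts[2]),
                      ((pts[0], pts[2]), pts[1]),
                      ((pts[1], pts[2]), pts[0])]:
        centre = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        rad = dist(a, b) / 2
        if dist(centre, c) <= rad + 1e-12:
            return rad
    (ax, ay), (bx, by), (cx, cy) = pts
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return dist((ux, uy), pts[0])

def cech_complex(points, r, max_dim=2):
    """All simplices of dimension <= max_dim in C(points, r), Definition 2.1:
    a subset enters iff the balls of radius r around its points intersect."""
    return [set(sigma)
            for k in range(1, max_dim + 2)
            for sigma in combinations(range(len(points)), k)
            if enclosing_radius([points[i] for i in sigma]) <= r]
```

For the vertices of a unit equilateral triangle, the three edges enter the complex at \(r = 1/2\), but the 2-simplex only at the circumradius \(r = 1/\sqrt{3} \approx 0.577\); this gap between pairwise and common intersection of balls is exactly what distinguishes the Čech complex from the Vietoris–Rips complex.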

The Čech complexes that we shall treat will be generated by random point sets, and we shall be interested in their homology groups \(H_k\), with coefficients from a field \({\mathbb {F}}\), which will be anonymous but fixed throughout the paper. A common choice is to take \({\mathbb {F}}={\mathbb {Z}}_2\), which is computationally convenient, but this will not be necessary here.

A few words are in order for the reader unfamiliar with homology theory. On the heuristic level, the homology groups of a space are meant to capture the topological structure of cycles or ‘holes’ in it. Of course, such concepts are best understood in a geometric setting, e.g. when the space is a subset of \({\mathbb {R}}^{d}\), or an abstract complex generated by a triangulation of such a subset. Nevertheless, high dimensional cycles can be defined combinatorially, much like 1-dimensional ones are defined for graphs. Besides simply defining the cycles, one wishes to be able to ignore trivial cycles or ones that are equivalent to others. As concrete examples, the boundary of a full disc should not be regarded as a ‘hole’ and the two cycles forming the boundary of a hollow cylinder are to be thought of as representatives of the same ‘hole’ (where, to obtain an abstract complex, one should consider triangulations of these objects, or alternatively work with singular homology). The way the theory deals with these two issues is by defining \(H_{k}\) as the quotient of a group \(Z_{k}\) representing cycles by another group \(B_{k}\) representing boundaries. Then, trivial cycles are exactly those in the class of 0 and equivalent ones belong to the same class. The groups \(Z_{k}\) and \(B_{k}\) are subgroups of the free Abelian group generated by (oriented) simplices; i.e. their elements are formal sums of simplices, with coefficients which, in general, are taken from an Abelian group, but which for us will come from the field \({\mathbb {F}}\). Having made the choice of working with field coefficients, all the groups in our case are vector spaces. The dimension of \(H_{k}\), denoted by \(\beta _k\), is called the k-th Betti number and has a special meaning: it is the maximal number of non-equivalent cycles of dimension k.
It is important to note that, for \(k=0\), \(\beta _0\) is the maximal number of vertices which (pairwise) cannot be connected by a sequence of 1-simplices; that is, \(\beta _0\) is the number of connected components of the space. Throughout the paper, we shall concentrate on the (random) Betti numbers \(\beta _k\), \(0\le k\le d-1\), of Čech complexes.
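With coefficients in \({\mathbb {Z}}_2\), all of this reduces to linear algebra over the two-element field: \(\beta _k = \dim Z_k - \dim B_k = (\#\{k\text {-simplices}\} - \mathrm{rank}\,\partial _k) - \mathrm{rank}\,\partial _{k+1}\), where \(\partial _k\) sends each k-simplex to the sum of its \((k-1)\)-faces. Here is a minimal sketch of our own, for illustration only:

```python
def rank_gf2(rows):
    """Rank over Z_2 of a 0/1 matrix whose rows are encoded as bit masks."""
    pivots = {}                      # leading-bit position -> reduced row
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead in pivots:
                row ^= pivots[lead]  # eliminate the current leading bit
            else:
                pivots[lead] = row
                break
    return len(pivots)

def betti(simplices, k):
    """k-th Betti number, with Z_2 coefficients, of a finite complex given
    as a collection of frozensets of vertices (closed under subsets)."""
    def faces(dim):
        return [s for s in simplices if len(s) == dim + 1]

    def boundary_rank(dim):          # rank of the boundary map on dim-simplices
        col = {f: i for i, f in enumerate(faces(dim - 1))}
        rows = []
        for s in faces(dim):
            mask = 0
            for v in s:
                mask |= 1 << col[s - {v}]   # the (dim-1)-face opposite v
            rows.append(mask)
        return rank_gf2(rows)

    cycles = len(faces(k)) - (boundary_rank(k) if k > 0 else 0)
    return cycles - boundary_rank(k + 1)
```

For the boundary of a triangle (three vertices and three edges) this gives \(\beta _0 = 1\) and \(\beta _1 = 1\); adding the single 2-simplex fills the hole, and \(\beta _1\) drops to 0.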

Our two main topological tools are collected in the following two lemmas. The first is needed for obtaining various moment bounds on Betti numbers of random simplicial complexes, and the second will replace the role that additivity of functionals usually plays in most probabilistic limit theorems. Because the arguments underlying these lemmas are important for what follows, and will be unfamiliar to most probabilistic readers, we shall prove them both. However both contain results that are well known to topologists.

Lemma 2.2

Let \({{\mathcal {K}}},{{\mathcal {K}}}_1\) be two finite simplicial complexes such that \({{\mathcal {K}}}\subset {{\mathcal {K}}}_1\) (i.e., every simplex in \({{\mathcal {K}}}\) is also a simplex in \({{\mathcal {K}}}_1\)). Then, for every \(k \ge 1\), we have that

$$\begin{aligned} \big |\beta _k({{\mathcal {K}}}_1) - \beta _k({{\mathcal {K}}})\big | \le \sum _{j=k}^{k+1} \# \{ j\text {-}\mathrm{{simplices}}\ \mathrm{{in}}\ {{\mathcal {K}}}_1{\setminus }{{\mathcal {K}}}\}. \end{aligned}$$

Proof

We start with the simple case in which \({{\mathcal {K}}}_1 = {{\mathcal {K}}}\cup \{\sigma \}\), where \(\sigma \) is a j-simplex for some \(j \ge 0\). Note that, since both \({{\mathcal {K}}}\) and \({{\mathcal {K}}}_1\) are simplicial complexes, all the proper subsets of \(\sigma \) must already be present in \({{\mathcal {K}}}\). Thus we immediately have that

$$\begin{aligned} \beta _k({{\mathcal {K}}}_1) - \beta _k({{\mathcal {K}}}) \, \in \, {\left\{ \begin{array}{ll} \{0\} &{} j \ne k,k+1, \\ \{0,1\} &{} j = k, \\ \{-1,0\} &{} j = k+1. \end{array}\right. } \end{aligned}$$

Thus the lemma is proven in the case \({{\mathcal {K}}}_1 = {{\mathcal {K}}}\cup \{\sigma \}\). For arbitrary complexes \({{\mathcal {K}}}\subset {{\mathcal {K}}}_1\), enumerate the simplices in \({{\mathcal {K}}}_1{\setminus }{{\mathcal {K}}}\) so that lower dimensional simplices are added before higher dimensional ones, and repeatedly apply the above argument along with the triangle inequality. \(\square \)

With a little more work, one can go further than the previous lemma and derive an explicit equality for differences of Betti numbers. This is again a classical result in algebraic topology, derived using the Mayer–Vietoris sequence (see [11, Corollary 2.2]). However, we state it here, since it is important for our proof of the central limit theorem.

A little notation is needed before we state the lemma. A sequence of Abelian groups \(G_1,\ldots ,G_l\) and homomorphisms \(\eta _i: G_i \rightarrow G_{i+1}\), \(i=1,\ldots ,l-1\), is said to be exact if \(\mathrm{im}\,\eta _i = \mathrm{ker}\, \eta _{i+1}\) for all \(i = 1,\ldots ,l-2\). If \(l = 5\) and \(G_1\) and \(G_5\) are trivial, then the sequence is called short exact.

Lemma 2.3

(Mayer–Vietoris Sequence) Let \({{\mathcal {K}}}_1\) and \({{\mathcal {K}}}_2\) be two finite simplicial complexes, and let \({{\mathcal {L}}}= {{\mathcal {K}}}_1 \cap {{\mathcal {K}}}_2\) (i.e., \({{\mathcal {L}}}\) is the complex formed from all the simplices lying in both \({{\mathcal {K}}}_1\) and \({{\mathcal {K}}}_2\)). Then the following are true:

  1.

    The following is an exact sequence, and, furthermore, the homomorphisms \(\lambda _k\) are induced by inclusions:

    $$\begin{aligned}&\cdots \rightarrow H_k({{\mathcal {L}}}) \mathop {\rightarrow }\limits ^{\lambda _k} H_k({{\mathcal {K}}}_1)\oplus H_k({{\mathcal {K}}}_2) \rightarrow H_k({{\mathcal {K}}}_1 \cup {{\mathcal {K}}}_2)\\&\qquad \rightarrow H_{k-1}({{\mathcal {L}}}) \mathop {\rightarrow }\limits ^{\lambda _{k-1}} H_{k-1}({{\mathcal {K}}}_1)\oplus H_{k-1}({{\mathcal {K}}}_2)\rightarrow \cdots \end{aligned}$$
  2.

    Furthermore, 

    $$\begin{aligned} \beta _k\left( {{\mathcal {K}}}_1 \bigcup {{\mathcal {K}}}_2\right) =\beta _k({{\mathcal {K}}}_1) + \beta _k({{\mathcal {K}}}_2)+ \beta (N_k) +\beta (N_{k-1})-\beta _k({{\mathcal {L}}}), \end{aligned}$$

    where \(\beta (G)\) denotes the rank of a vector space G and \(N_j = \mathrm{ker}\, \lambda _j\).

Proof

The first part of the lemma is just a simplicial version of the classical Mayer–Vietoris theorem (cf. [26, Theorem 25.1]). The second part follows from the first, as we now show. Suppose we have the exact sequence

$$\begin{aligned} \cdots \rightarrow G_1 \mathop {\rightarrow }\limits ^{\eta _1} G_2 \mathop {\rightarrow }\limits ^{\eta _2} G_3 \mathop {\rightarrow }\limits ^{\eta _3} G_4 \mathop {\rightarrow }\limits ^{\eta _4} G_5 \rightarrow \cdots \end{aligned}$$

Then we also have the short exact sequence

$$\begin{aligned} 0 \rightarrow \mathrm{coker}\, \eta _1 \rightarrow G_3 \rightarrow \mathrm{ker}\, \eta _4 \rightarrow 0, \end{aligned}$$

where the quotient space \(\mathrm{coker}\, \eta _1 = G_2 / \mathrm{im}\, \eta _1\) is the cokernel of \(\eta _1\). From the exactness of the sequence we have that

$$\begin{aligned} \beta (G_3) = \beta (\mathrm{coker}\, \eta _1) + \beta (\mathrm{ker}\, \eta _4). \end{aligned}$$

Now applying this to the Mayer–Vietoris sequence with \(G_1 = H_k({{\mathcal {L}}})\), etc, we have

$$\begin{aligned} \beta _k({{\mathcal {K}}}_1 \bigcup {{\mathcal {K}}}_2)= & {} \beta (\mathrm{coker}\, \lambda _k) + \beta (\mathrm{ker}\, \lambda _{k-1}) \\= & {} \beta _k({{\mathcal {K}}}_1) + \beta _k({{\mathcal {K}}}_2) - \beta (\mathrm{im}\, \lambda _k) + \beta (N_{k-1}) \\= & {} \beta _k({{\mathcal {K}}}_1) + \beta _k({{\mathcal {K}}}_2)+ \beta (N_k) +\beta (N_{k-1})-\beta _k({{\mathcal {L}}}), \end{aligned}$$

which completes the proof. \(\square \)

2.2 Point process preliminaries

A point process \(\Phi \) is a random, locally finite (Radon), counting measure on \({\mathbb {R}}^d\). More formally, let \({\mathcal {B}}_{b}\) be the \(\sigma \)-ring of bounded, Borel subsets of \({\mathbb {R}}^d\), and let \({\mathbb {M}}\) be the corresponding space of non-negative Radon counting measures. The \(\sigma \)-algebra \(\mathcal M\) on \({\mathbb {M}}\) is generated by the mappings \(\mu \mapsto \mu (B)\), \(B\in {\mathcal {B}}_{b}\). A point process \(\Phi \) is then a random element in \(({\mathbb {M}}, \mathcal M)\), i.e. a measurable map from a probability space \((\Omega ,\mathcal F,{\mathbb {P}})\) to \(({\mathbb {M}},\mathcal M)\). The distribution of \(\Phi \) is the measure \({\mathbb {P}}\Phi ^{-1}\) on \(({\mathbb {M}},\mathcal M)\).

We shall typically identify \(\Phi \) with the positions \(\{x_1,x_2,\dots \}\) of its atoms, and so for Borel \(B\subset {\mathbb {R}}^d\), we shall allow ourselves to write

$$\begin{aligned} \Phi (B) = \sum _i\delta _{x_i}(B) = \#\{i:\,x_i\in B\} = \#\{\Phi \cap B\}, \end{aligned}$$

where \(\#\) denotes cardinality and \(\delta _x\) the single atom measure with mass one at x. The intensity measure of \(\Phi \) is the non-random measure defined by \(\mu (B)={\mathbb {E}}\left\{ \Phi (B)\right\} \), and, when \(\mu \) is absolutely continuous with respect to Lebesgue measure, the corresponding density is called the intensity of \(\Phi \). A point process is called simple if its points (i.e., \(x_i\)’s) are a.s. distinct. In this article, we shall consider only simple point processes.

For a measure \(\phi \in {\mathbb {M}}\), let \(\phi _{(x)}\) be the translated measure given by \(\phi _{(x)}(B) = \phi (B-x)\) for \(x \in {\mathbb {R}}^d\) and \(B\in {\mathcal {B}}_{b}\). A point process is said to be stationary if its distribution is invariant under all such translations, i.e. \({\mathbb {P}}\Phi ^{-1}_{(x)} = {\mathbb {P}}\Phi ^{-1}\) for all \(x \in {\mathbb {R}}^d\). For a stationary point process in \({\mathbb {R}}^d\), \(\mu (B) = \lambda |B|\) for all \(B\in {\mathcal {B}}_b\), where |B| denotes the Lebesgue measure of B, and the constant of proportionality \(\lambda \) is called the intensity of the point process.

Of particular importance to us are the Poisson and Binomial point processes. These processes are characterized through their relation to one of the most fundamental notions of probability theory—statistical independence. A Poisson process \({\mathcal {P}}\) is the simple point process uniquely determined by its intensity measure \(\mu \) and the following property: for any collection of disjoint measurable sets \(\{A_i\}\), \(\{{\mathcal {P}}(A_i)\}\) are independent random variables. An equivalent, direct definition is given by the finite dimensional distributions,

$$\begin{aligned} {\mathbb {P}}\left\{ {\mathcal {P}}(A_i)=n_i,i=1,\ldots ,k \right\} =\prod ^k_{i=1}{\mathbb {P}}\left\{ P_i=n_i \right\} , \end{aligned}$$

where the \(P_i\) are Poisson random variables with parameters \(\mu (A_i)\) and, again, the \(A_i\) are assumed to be disjoint. A Binomial point process \({{\mathcal {X}}}_n\) is the process formed by n i.i.d. points \(X_1,\ldots ,X_n\). It is worth mentioning that conditioning a Poisson process to have exactly n points yields a Binomial process; and, conversely, mixing a Binomial process by taking n to be a Poisson random variable produces a Poisson process.
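The relationship in the last sentence is transparent in simulation. The sketch below (our illustration; the helper names are not from the paper) samples a homogeneous Poisson process on \(W_l = [-l/2, l/2)^d\) by drawing a Poisson number of points and placing them i.i.d. uniformly, so that conditioning on the number of points being exactly n leaves precisely a Binomial process:

```python
import random
from math import exp

def poisson_rv(lam, rng):
    """Poisson(lam) variate via Knuth's multiplication method
    (adequate for moderate lam)."""
    threshold, k, p = exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def binomial_process(n, l, d, rng):
    """Binomial process: n i.i.d. uniform points in W_l = [-l/2, l/2)^d."""
    return [tuple(rng.uniform(-l / 2, l / 2) for _ in range(d))
            for _ in range(n)]

def poisson_process(lam, l, d, rng):
    """Homogeneous Poisson process of intensity lam on W_l: a Poisson
    number of points, placed as a Binomial process."""
    return binomial_process(poisson_rv(lam * l ** d, rng), l, d, rng)
```

That this construction has independent counts on disjoint sets, the characterising property above, follows from the classical multinomial thinning property of the Poisson distribution.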

For all of the point processes we consider, we shall be interested in behavior in the so-called thermodynamic limit. That is, while letting the number of points n increase to infinity, we choose the radii \(r_n\) so that the average degree of a point (in the random 1-skeleton) converges to a constant. (Note, however, that this average depends on the location of the point for inhomogeneous processes.) As was described in Sect. 1.1, this is done either by scaling the space and taking r to be n-independent for stationary processes, or by fixing the space, increasing the intensity and decreasing \(r_n\) for inhomogeneous processes.
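In the inhomogeneous setting the scaling just described amounts to the elementary relation \(r_n = (r/n)^{1/d}\), which keeps \(n r_n^d\) constant; a trivial sketch (our notation):

```python
def thermodynamic_radius(n, r, d):
    """Radius r_n for n points on a fixed domain with n * r_n**d == r,
    so that the expected degree of a typical point stays of constant order."""
    return (r / n) ** (1.0 / d)
```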

We conclude the section with some more definitions. For Borel \(A \subset {\mathbb {R}}^d\), we write \(\Phi _A\) for both the restricted random measure given by \(\Phi _A(B):=\Phi (A\cap B)\) (when treating \(\Phi \) itself as a measure) and the point set \(\Phi \cap A\) (when treating \(\Phi \) as a point set). To save space, we shall write \(\Phi _l\) for \(\Phi _{W_l}\), where \(W_l\) is the ‘window’ \([-l/2,l/2)^d\), for all \(l \ge 0\).

For a measurable set of measures \( \Theta \in \mathcal {M}\), let the translated family be \(\Theta _x := \{\phi _{(x)} :\, \phi \in \Theta \}\). A point process \(\Phi \) is said to be ergodic if

$$\begin{aligned} {\mathbb {P}}\left\{ \Phi \in {\Theta } \right\} \, \in \, \{0,1\} \end{aligned}$$

for all \(\Theta \in {\mathcal {M}}\) for which

$$\begin{aligned} {\mathbb {P}}\left\{ \Phi \in ({\Theta }{\setminus }\Theta _x) \,\cup \, ({\Theta }_x{\setminus }{\Theta }) \right\} = 0 \end{aligned}$$

for all \(x \in {\mathbb {R}}^d\).

Finally, we say that \(\Phi \) has all moments if, for all bounded Borel \(B \subset {\mathbb {R}}^d\), we have

$$\begin{aligned} {\mathbb {E}}\left\{ \left[ \Phi (B)\right] ^k\right\} < \infty , \quad \text {for all} \ k \ge 1. \end{aligned}$$
(2.1)

3 Limit theorems for stationary point processes

This section is concerned with the Čech complex \(\mathcal{C}(\Phi _l ,r)\), where \(\Phi \) is a stationary point process on \({\mathbb {R}}^d\) with unit intensity and, as above, \(\Phi _l\) is the restriction of \(\Phi \) to the window \(W_l=[-l/2,l/2)^d\). The radius r is arbitrary but fixed.

It is natural to expect that, as a consequence of stationarity, letting \(l\rightarrow \infty \), \(l^{-d}{\mathbb {E}}\left\{ \beta _k(\mathcal{C}(\Phi _l ,r))\right\} \) will converge to a limit. Furthermore, if we also assume ergodicity for \(\Phi \), one expects \(l^{-d} \beta _k(\mathcal{C}(\Phi _l ,r))\) itself to converge almost surely to a constant limit. All this would be rather standard fare, and easy to prove from general limit theorems, if it were only true that Betti numbers were additive functionals on simplicial complexes or, alternatively, that the Betti numbers of Čech complexes were additive functionals of the underlying point processes. Although this is not the case, Betti numbers are ‘nearly additive’, and a correct quantification of this near additivity is what will be required for our proofs.

As hinted before Lemma 2.2, the additivity properties of Betti numbers are related to the simplicial count \(S_j ({{\mathcal {X}}},r)\), which, for \(j\ge 0\), denotes the number of j-simplices in \(\mathcal{C}({{\mathcal {X}}},r)\), and to \(S_j({{\mathcal {X}}},r;A)\), which denotes the number of j-simplices with at least one vertex in A.

Our first results are therefore limit theorems for these quantities.

Lemma 3.1

Let \(\Phi \) be a unit intensity stationary point process on \({\mathbb {R}}^d,\) possessing all moments. Then,  for each \(j\ge 0,\) there exists a constant \(c_j:= c({{\mathcal {L}}}_{\Phi },j,d,r)\) such that

$$\begin{aligned} {\mathbb {E}}\left\{ S_j(\Phi _{A},r)\right\} \le {\mathbb {E}}\left\{ S_j(\Phi ,r;A)\right\} \le c_j |A|. \end{aligned}$$

Proof

We have the following trivial upper bound for simplicial counts:

We have the following trivial upper bound for simplicial counts (a j-simplex containing x has its remaining j vertices in \(B_x(2r)\)):

$$\begin{aligned} S_j(\Phi ,r;A) \le \sum _{x \in \Phi \cap A}\left( \Phi (B_x(2r))\right) ^{j}. \end{aligned}$$

Due to the stationarity of \(\Phi \) along with the assumption that it has all moments, we have that the measure

$$\begin{aligned} \mu _0(A) := {\mathbb {E}}\left\{ \sum _{x \in \Phi \cap A}\left( \Phi (B_x(2r))\right) ^{j}\right\} \end{aligned}$$

is translation invariant and finite on compact sets. Thus \(\mu _0(A) = c_j |A|\) for some \(c_j\in (0,\infty )\), and we are done. \(\square \)
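The linear-in-volume scaling forced by translation invariance can be seen concretely by taking \(\Phi \) to be the unit-intensity integer lattice (stationary under integer shifts) and checking that the edge count in a window grows proportionally to the window's volume. A small sketch, our own illustration rather than part of the proof:

```python
import itertools
import math

def lattice_window(l):
    # points of Z^2 inside the window W_l = [-l/2, l/2)^2, for even l
    rng = range(-l // 2, l // 2)
    return [(float(i), float(j)) for i, j in itertools.product(rng, rng)]

def edge_count(points, r):
    # 1-simplices of the Cech complex: pairs at distance <= 2r
    return sum(1 for x, y in itertools.combinations(points, 2)
               if math.dist(x, y) <= 2 * r)

# with r = 0.5, edges join lattice neighbours at distance exactly 1,
# and the per-volume count approaches a constant as l grows
for l in (10, 20):
    pts = lattice_window(l)
    print(l, edge_count(pts, 0.5) / l**2)
```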

Lemma 3.2

Let \(\Phi \) be a unit intensity, ergodic point process on \({\mathbb {R}}^d\) possessing all moments. Then, for each \(j\ge 0,\) there exists a constant \(\widehat{S}_j:=\widehat{S} ({{\mathcal {L}}}_{\Phi },j,d,r)\) such that, with probability one,

$$\begin{aligned} \lim _{l \rightarrow \infty }\frac{S_j(\Phi ,r;W_l)}{l^d} = \lim _{l \rightarrow \infty }\frac{S_j(\Phi _l,r)}{l^d} = \widehat{S}_j({{\mathcal {L}}}_{\Phi },r). \end{aligned}$$

Proof

Define the function

$$\begin{aligned} h(\Phi ) := \frac{1}{j+1} {\sum _{x \in \Phi _{W_1}} \#[j\text {-simplices in } \mathcal{C}(\Phi ,r)\text { containing } x]}. \end{aligned}$$

Recalling that by \(\Phi -z\) we mean the points of \(\Phi \) moved by \(-z\), it is easy to check that

$$\begin{aligned} \sum _{z \, \in \, {\mathbb {Z}}^d \cap W_{l-2r-1}} h(\Phi - z) \le S_j(\Phi _l,r) \le \sum _{z\, \in \, {\mathbb {Z}}^d \cap W_{l+1}} h(\Phi - z). \end{aligned}$$
(3.1)

Since \(\Phi \) has all moments, we have that

$$\begin{aligned} {\mathbb {E}}\left\{ h(\Phi )\right\} \le {\mathbb {E}}\left\{ \Phi (W_{1+r})^{j+1}\right\} < \infty . \end{aligned}$$

and so we are in a position to apply the multivariate ergodic theorem (e.g. [25, Proposition 2.2]) to each of the sums in (3.1). This implies the existence of a constant \(\widehat{S}_j({{\mathcal {L}}}_{\Phi },r) \in [0,\infty )\) such that, with probability one,

$$\begin{aligned} \lim _{l \rightarrow \infty } \frac{1}{l^d}\sum _{z \, \in \, {\mathbb {Z}}^d \cap W_{l-2r-1}} h(\Phi - z) = \lim _{l \rightarrow \infty } \frac{1}{l^d}\sum _{z \, \in \, {\mathbb {Z}}^d \cap W_{l+1}} h(\Phi - z) = \widehat{S}_j({{\mathcal {L}}}_{\Phi },r). \end{aligned}$$

This gives the ergodic theorem for \(S_j(\Phi _l,r)\). The result for \(S_j(\Phi ,r;W_l)\) follows from this and the bounds

$$\begin{aligned} S_j(\Phi _l,r) \le S_j(\Phi ,r ; W_{l}) \le S_j(\Phi _{l+2r+1},r). \end{aligned}$$

\(\square \)

3.1 Strong law for Betti numbers

In this section we shall start with a convergence result for the expectation of \(\beta _k(\mathcal{C}(\Phi _l,r))\) when \(\Phi \) is a quite general stationary point process, and then proceed to a strong law. We treat these results separately, since convergence of expectations can be obtained under weaker conditions than the strong law. In addition, seeing the proof for expectations first should make the proof of the strong law easier to follow.

From [42, Theorem 4.2] we know that

$$\begin{aligned} {\mathbb {E}}\left\{ \beta _k(\mathcal{C}(\Phi _l,r))\right\} = O(l^d). \end{aligned}$$

The following lemma strengthens this result.

Lemma 3.3

Let \(\Phi \) be a unit intensity stationary point process possessing all moments. Then,  for each \(0\le k\le d-1,\) there exists a constant \(\widehat{\beta }_k:= \widehat{\beta }_k({{\mathcal {L}}}_{\Phi },r) \in [0,\infty )\) such that

$$\begin{aligned} \lim _{l\rightarrow \infty } \frac{{\mathbb {E}}\left\{ \beta _k(\mathcal{C}(\Phi _l,r))\right\} }{l^d} = \widehat{\beta }_k. \end{aligned}$$

Remark 3.4

The lemma is interesting only in the case when \(\widehat{\beta }_k > 0\), and this does not always hold. However, it can be guaranteed for negatively associated point processes (including Poisson processes, simple perturbed lattices and determinantal point processes) under some simple conditions on void probabilities, cf. [42, Theorem 3.3].

Proof of Lemma 3.3

Set

$$\begin{aligned} \psi (l) := {\mathbb {E}}\left\{ \beta _k(\mathcal{C}(\Phi _l,r))\right\} , \end{aligned}$$

and define

$$\begin{aligned} \widehat{\beta }_k :=\limsup _{l \rightarrow \infty }\frac{\psi (l)}{l^d}. \end{aligned}$$
(3.2)

Fix \(t > 0\). Let \(Q_{it},\ i = 1,\ldots ,m^d,\) be an enumeration of the cubes \(\{tz + W_t \subset W_{mt}:\, z \in {\mathbb {Z}}^d\}\). Note that the \(Q_{it},\ i = 1,\ldots ,m^d,\) form a partition of \(W_{mt}\).

Define the complex

$$\begin{aligned} {{\mathcal {K}}}(r,t) := \bigcup _{i=1}^{m^d}\mathcal{C}\left( \Phi _{Q_{it}},r\right) , \end{aligned}$$

and note that it is a subcomplex of \(\mathcal{C}(\Phi _{mt},r)\). Since the union here is of disjoint complexes,

$$\begin{aligned} \beta _k({{\mathcal {K}}}(r,t))= \sum _{i=1}^{m^d} \beta _k\left( \mathcal{C}(\Phi _{Q_{it}},r)\right) . \end{aligned}$$

Note that the vertices of any simplex in \(\mathcal{C}(\Phi _{mt},r) {\setminus } {{\mathcal {K}}}(r,t)\) must lie in the set \(\bigcup _{i=1}^{m^d} (\partial Q_{it})^{(2r)}\), where for any set \(A\subset {\mathbb {R}}^d\), \(A^{(r)}\) is the set of points in \({\mathbb {R}}^d\) with distance at most r from A. Hence, by Lemma 2.2,

$$\begin{aligned} \left| \beta _k(\mathcal{C}(\Phi _{mt},r))-\sum _{i=1}^{m^d} \beta _k(\mathcal{C}(\Phi _{Q_{it}},r))\right| \le \sum _{j=k}^{k+1} S_j\left( \Phi _{\bigcup _{i=1}^{m^d} (\partial Q_{it})^{(2r)}},r\right) . \end{aligned}$$
(3.3)

Thus, since, for \(c:=c(d,r)\) large enough and any \(t\ge 1\),

$$\begin{aligned} \left\| \bigcup _{i=1}^{m^d} (\partial Q_{it})^{(2r)}\right\| \le cm^dt^{d-1}, \end{aligned}$$

it follows from Lemma 3.1 that

$$\begin{aligned} \frac{1}{(mt)^d} \, {\mathbb {E}}\left\{ \sum _{j=k}^{k+1} S_j\left( \Phi _{\bigcup _{i=1}^{m^d} (\partial Q_{it})^{(2r)}},r\right) \right\} \le \frac{c}{t}. \end{aligned}$$
(3.4)

By the stationarity of \(\Phi \), taking expectations over (3.3) and applying (3.4) we obtain that, for any \(t\ge 1\),

$$\begin{aligned} \frac{\psi (mt)}{(mt)^d}\ge \frac{\psi (t)}{t^d} - \frac{c}{t}. \end{aligned}$$

Now fix \(\varepsilon >0\). By (3.2), we can find an arbitrarily large \(t_0 \ge 1\) such that \(\frac{\psi (t_0)}{t_0^d} \ge \widehat{\beta }_k - \frac{\varepsilon }{2}\) and \(\frac{c}{t_0} \le \frac{\varepsilon }{2}\). Hence, from the above we have that, for all \(m \ge 1\),

$$\begin{aligned} \frac{\psi (mt_0)}{(mt_0)^d} \ge \widehat{\beta }_k - \varepsilon . \end{aligned}$$

Now take \(l > 0\), and let m be the unique integer \(m=m(l)\) such that \(mt_0 \le l < (m+1)t_0\). Again, applying Lemma 2.2 yields

$$\begin{aligned} \big |\beta _k(\mathcal{C}(\Phi _l,r)) - \beta _k(\mathcal{C}(\Phi _{mt_0},r))\big |\le & {} \sum _{j = k}^{k+1}S_j(\Phi _l,r ; W_l{\setminus } W_{mt_0}). \end{aligned}$$
(3.5)

Since \(\Vert W_l{\setminus } W_{mt_0}\Vert \le d(l-mt_0)l^{d-1}\), as before, using Lemma 3.1, it is easy to verify that

$$\begin{aligned} \frac{\psi (l)}{l^d} \ge \frac{\psi (mt_0)}{(m+1)^d t_0^d} - O(m^{-1}) \ge (\widehat{\beta }_k - \varepsilon )\frac{m^d}{(m+1)^d} - O(m^{-1}). \end{aligned}$$

Since \(m \rightarrow \infty \) as \(l \rightarrow \infty \), it follows that \(\liminf _{l \rightarrow \infty } \frac{\psi (l)}{l^d} \ge \widehat{\beta }_k - \varepsilon .\) This and (3.2) complete the proof.\(\square \)

Theorem 3.5

Let \(\Phi \) be a unit intensity ergodic point process possessing all moments. Then,  for \(0\le k\le d-1,\) and \(\widehat{\beta }_k\) as in Lemma 3.3,

$$\begin{aligned} \frac{\beta _k(\mathcal{C}(\Phi _l,r))}{l^d} \ \mathop {\rightarrow }\limits ^{a.s. }\ \widehat{\beta }_k. \end{aligned}$$

Proof

As in the previous proof, fix \(t > 0\) and let \(Q_{it}, i = 1,\ldots ,m^d\) be the partition of \(W_{mt}\) into translations of \(W_t\). Further, for each real \(l>0\), let \(m=m(l,t)\) be the unique integer for which \(mt \le l < (m+1)t\).

The proof contains two steps. Firstly, we shall establish a strong law for \(\beta _k(\mathcal{C}(\Phi _{W_{mt}},r))\) in m, and then show that the error term in (3.5) vanishes asymptotically. Many of our arguments will rely on the multi-parameter ergodic theorem (e.g. [25, Proposition 2.2]).

Let \(e_i, i=1,\ldots ,d\), be the d unit vectors in \({\mathbb {R}}^d\), and \(T_i=T_i(t)\) the measure preserving transformation defined by a shift of \(te_i\). Then, setting \(Y = \beta _k(\mathcal{C}(\Phi _{W_t},r))\), and noting that \({\mathbb {E}}\left\{ Y\right\} \le {\mathbb {E}}\left\{ \Phi (W_t)^{k+1}\right\} < \infty \), it follows immediately from the multi-parameter ergodic theorem that

$$\begin{aligned} \frac{1}{m^d}\sum _{i_1=0}^{m-1}\cdots \sum _{i_d=0}^{m-1}Y(T_1^{i_1}\cdots T_d^{i_d}(\Phi ))= & {} \frac{1}{m^d} \sum _{i=1}^{m^d} \beta _k(\mathcal{C}(\Phi _{Q_{it}},r)) \nonumber \\&\mathop {\rightarrow }\limits ^{a.s. } {\mathbb {E}}\left\{ \beta _k\left( \mathcal{C}(\Phi _{W_t},r)\right) \right\} , \end{aligned}$$
(3.6)

as \(m\rightarrow \infty \).

Applying the multi-parameter ergodic theorem again, but now with

$$\begin{aligned} Y = S_k(\Phi _{(\partial W_t)^{(4r)}},r) + S_{k+1}(\Phi _{(\partial W_t)^{(4r)}},r), \end{aligned}$$

we obtain

$$\begin{aligned} \frac{1}{m^d}\sum _{i=1}^{m^d} \left( S_k(\Phi _{(\partial Q_{it})^{(4r)}},r) + S_{k+1}(\Phi _{(\partial Q_{it})^{(4r)}},r) \right) \ \mathop {\rightarrow }\limits ^{a.s. }\ \widehat{S}_t, \end{aligned}$$
(3.7)

where \(\widehat{S}_t \le ct^{d-1}\). This bound follows by applying Lemma 3.1 to obtain

$$\begin{aligned} {\mathbb {E}}\left\{ S_k(\Phi _{(\partial Q_{it})^{(4r)}},r) + S_{k+1}(\Phi _{(\partial Q_{it})^{(4r)}},r)\right\} \le ct^{d-1}, \end{aligned}$$
(3.8)

for some \(c:= c({{\mathcal {L}}}_{\Phi },k,d,r)\).

Note that, for \(j=k,k+1\),

$$\begin{aligned} S_j\left( \Phi _{\bigcup _{i=1}^{m^d} (\partial Q_{it})^{(2r)}},r\right) \le \sum _{i=1}^{m^d} S_j\left( \Phi _{(\partial Q_{it})^{(4r)}},r\right) . \end{aligned}$$
(3.9)

It follows immediately from (3.6)–(3.9) that, with probability one,

$$\begin{aligned}&\limsup _{m \rightarrow \infty } \left| \frac{\beta _k(\mathcal{C}(\Phi _{mt},r))}{(mt)^d} - \frac{{\mathbb {E}}\left\{ \beta _k(\mathcal{C}(\Phi _{W_t},r))\right\} }{t^d}\right| \\&\quad =\, \limsup _{m \rightarrow \infty } \left| \frac{\beta _k(\mathcal{C}(\Phi _{mt},r))}{(mt)^d} - \frac{\sum _{i=1}^{m^d} \beta _k(\mathcal{C}(\Phi _{Q_{it}},r))}{(mt)^d}\right| \\&\quad \le \, \limsup _{m \rightarrow \infty } \frac{1}{(mt)^d} \sum _{j=k}^{k+1} S_j\left( \Phi _{\bigcup _{i=1}^{m^d} (\partial Q_{it})^{(2r)}},r\right) \\&\quad \le \, \limsup _{m \rightarrow \infty } \frac{1}{(mt)^d} \sum _{j=k}^{k+1} \sum _{i=1}^{m^d} S_j\left( \Phi _{(\partial Q_{it})^{(4r)}},r\right) \\&\quad =\, \frac{\widehat{S}_t}{t^d} \ \le \ \frac{c}{t}. \end{aligned}$$

Now, given \(\varepsilon > 0\), by Lemma 3.3 and the bound above, we can choose \(t_0\) large enough so that, with probability one,

$$\begin{aligned} \limsup _{m \rightarrow \infty } \left| \frac{\beta _k(\mathcal{C}(\Phi _{mt_0},r))}{(mt_0)^d} - \widehat{\beta }_k \right| \le \varepsilon . \end{aligned}$$

Now consider the error terms in (3.5). For \(j = k,k+1\), we have that

$$\begin{aligned} \frac{S_j(\Phi _l,r ; W_l{\setminus } W_{m(l)t_0})}{l^d}\le & {} \frac{S_j(\Phi _l,r)}{l^d} - \frac{S_j(\Phi _{m(l)t_0},r)}{l^d} \\\le & {} \frac{S_j(\Phi _{(m(l)+1)t_0},r)}{(m(l)t_0)^d} - \frac{S_j(\Phi _{m(l)t_0},r)}{((m(l)+1)t_0)^d}. \end{aligned}$$

By Lemma 3.2, we know that there exist \(\widehat{S}_j(\Phi ,r) \in [0,\infty )\), \(j = k,k+1\), such that, with probability one,

$$\begin{aligned} \lim _{l \rightarrow \infty } \frac{S_j(\Phi _{(m(l)+1)t_0},r)}{(m(l)t_0)^d}= & {} \lim _{l \rightarrow \infty } \frac{S_j(\Phi _{m(l)t_0},r)}{((m(l)+1)t_0)^d} \\= & {} \widehat{S}_j(\Phi ,r). \end{aligned}$$

Hence, with probability one,

$$\begin{aligned} \lim _{l \rightarrow \infty } \frac{1}{l^d}\sum _{j = k}^{k+1}S_j(\Phi _l,r ; W_l{\setminus } W_{m(l)t_0}) = 0. \end{aligned}$$

Substituting this in (3.5) gives that, with probability one,

$$\begin{aligned} \lim _{l \rightarrow \infty } \left| \frac{\beta _k(\mathcal{C}(\Phi _l,r))}{l^d} - \frac{\beta _k(\mathcal{C}(\Phi _{m(l)t_0},r))}{l^d}\right| = 0, \end{aligned}$$

so that

$$\begin{aligned} \limsup _{l \rightarrow \infty } \left| \frac{\beta _k(\mathcal{C}(\Phi _l,r))}{l^d} - \widehat{\beta }_k \right| \le \varepsilon , \end{aligned}$$

and, since \(\varepsilon > 0\) was arbitrary, the proof is complete. \(\square \)

The following concentration inequality is an easy consequence of the general concentration inequality of [30].

Theorem 3.6

Let \(\Phi \) be a unit intensity stationary determinantal point process. Then for all \(l \ge 1,\) \(\varepsilon > 0,\) and \(a \in (\frac{1}{2},1],\) we have that

$$\begin{aligned} {\mathbb {P}}\left\{ \Big |\beta _0(\mathcal{C}(\Phi _{l^{\frac{1}{d}}},r)) - {\mathbb {E}}\left\{ \beta _0(\mathcal{C}(\Phi _{l^{\frac{1}{d}}},r))\right\} \Big | \ge \varepsilon l^a \right\} \le 5 \exp \left( -\frac{\varepsilon ^2 l^{2a-1}}{16K_d(\varepsilon l^{a-1} + 2K_d)} \right) , \end{aligned}$$

where \(K_d\) is the maximum number of disjoint unit balls that can be packed into \(B_O(2)\).

Proof

Firstly, note that \(\beta _0\), viewed as a function on finite point sets, is \(K_d\)-Lipschitz; viz. for any finite point set \({{\mathcal {X}}}\subset {\mathbb {R}}^d\) and \(x \in {\mathbb {R}}^d\),

$$\begin{aligned} \big |\beta _0(\mathcal{C}({{\mathcal {X}}}\cup \{x\},r)) - \beta _0(\mathcal{C}({{\mathcal {X}}},r))\big | \le K_d. \end{aligned}$$

This follows from the fact that, on the one hand, adding a point x to \({{\mathcal {X}}}\) can add no more than one connected component to \(\mathcal{C}({{\mathcal {X}}},r)\). On the other hand, the largest possible decrease in the number of connected components of \(\mathcal{C}({{\mathcal {X}}},r)\) is bounded by the number of disjoint r-balls that can be packed into \(B_x(2r)\). By scale invariance, the latter number depends only on the dimension d and not on r, and is denoted by \(K_d\).

The remainder of the proof is a simple application of [30, Theorem 3.5] (see also [30, Example 6.4]). \(\square \)
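The Lipschitz behaviour of \(\beta _0\) under the addition of a single point is easy to see in simulation: \(\beta _0\) of the Čech complex is the number of connected components of the geometric graph joining points at distance at most 2r, computable by union-find. A sketch with our own helper names, purely illustrative:

```python
import itertools
import math

def betti0(points, r):
    # beta_0 of the Cech complex C(X, r) = number of connected components
    # of the graph joining points at distance <= 2r (union-find)
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i, j in itertools.combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= 2 * r:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

# adding a point can create at most one new component,
# and can merge at most a dimension-dependent number of old ones
pts = [(0.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
print(betti0(pts, 1.0))                 # three isolated points -> 3
print(betti0(pts + [(1.5, 0.0)], 1.0))  # the new point bridges the first two -> 2
```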

4 Poisson and binomial point processes

Since there is already an extensive literature on \(\beta _0(\mathcal{C}({{\mathcal {X}}},r))\) for Poisson and binomial point processes, albeit in the language of connectedness of random graphs (e.g. [31]), in this section we shall restrict ourselves only to \(\beta _k\) for \( 1\le k\le d-1\).

The models we shall treat start with a Lebesgue-almost everywhere continuous probability density f on \({\mathbb {R}}^d\), with a compact, convex support that (for notational convenience) includes the origin, and such that

$$\begin{aligned} 0 < \inf _{x \in \text {supp}(f)}f(x) \mathop {=}\limits ^{\Delta }f_* \le f^* \mathop {=}\limits ^{\Delta }\sup _{x \in {\mathbb {R}}^d} f(x) < \infty . \end{aligned}$$
(4.1)

The models are \({\mathcal {P}}_n\), the Poisson point process on \({\mathbb {R}}^d\) with intensity nf, and the binomial point process \({{\mathcal {X}}}_n=\{X_1,\ldots ,X_n\}\), where the \(X_i\) are i.i.d. random vectors with density f. From [19], we know that for both \({\mathcal {P}}_n\) and \({{\mathcal {X}}}_n\) the thermodynamic regime corresponds to the case \(nr_n^d \rightarrow r \in (0,\infty )\), so that for such a radius regime we have that

$$\begin{aligned} {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({\mathcal {P}}_n,r_n))\right\} = \Theta (n), \qquad {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right\} = \Theta (n). \end{aligned}$$

In proving limit results for Betti numbers in these cases, much will depend on moment estimates for the add-one cost function. The add-one cost function for a real-valued functional F defined over finite point sets \({{\mathcal {X}}}\) is defined by

$$\begin{aligned} D_xF({{\mathcal {X}}})\ \mathop {=}\limits ^{\Delta }\ F({{\mathcal {X}}}\cup \{x\}) - F({{\mathcal {X}}}), \quad x \in {\mathbb {R}}^d. \end{aligned}$$
(4.2)

Our basic estimate follows. For notational convenience, we write

$$\begin{aligned} \beta _k^n({{\mathcal {X}}})\ \mathop {=}\limits ^{\Delta }\ \beta _k(\mathcal{C}({{\mathcal {X}}},r_n)), \end{aligned}$$

where \(\{r_n\}_{n \ge 1}\) is a sequence of radii to be determined.
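As a toy illustration of the add-one cost (4.2), take F to be the edge count \(S_1(\cdot ,r)\): then \(D_xF({{\mathcal {X}}})\) is exactly the number of points of \({{\mathcal {X}}}\) within distance 2r of x, which is the kind of neighbourhood count controlling the bounds of the next lemma. A sketch with our own names:

```python
import math

def edge_count(points, r):
    # S_1: number of pairs at distance <= 2r
    n = len(points)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if math.dist(points[i], points[j]) <= 2 * r)

def add_one_cost(F, points, x, r):
    # D_x F(X) = F(X u {x}) - F(X), as in (4.2)
    return F(points + [x], r) - F(points, r)

pts = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
x = (0.5, 0.0)
# x lies within 2r = 2 of the first two points only, so the cost is 2
print(add_one_cost(edge_count, pts, x, 1.0))
```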

Lemma 4.1

Let \(1\le k \le d-1\). For the Poisson point process \({\mathcal {P}}_n\) and binomial point process \({{\mathcal {X}}}_n,\) with \(nr_n^d \rightarrow r \in (0,\infty ),\) we have that

$$\begin{aligned} \Delta _k \ \mathop {=}\limits ^{\Delta }\ \max \left( \sup _{n \ge 1}\sup _{x \in {\mathbb {R}}^d} {\mathbb {E}}\left\{ |D_x\beta ^n_k({\mathcal {P}}_n)|^4\right\} , \ \sup _{n \ge 1}\sup _{x \in {\mathbb {R}}^d} {\mathbb {E}}\left\{ |D_x\beta ^n_k({{\mathcal {X}}}_n)|^4\right\} \right) \end{aligned}$$
(4.3)

is finite.

Proof

The lemma is a consequence of the following simple bounds from Lemma 2.2.

$$\begin{aligned} |D_x\beta ^n_k({\mathcal {P}}_n)|\le & {} \sum _{j=k}^{k+1}S_j({\mathcal {P}}_n \cup \{x\},r_n ; \{x\}) \\\le & {} \left[ {\mathcal {P}}_n(B_x(2r_n))\right] ^k + \left[ {\mathcal {P}}_n(B_x(2r_n))\right] ^{k+1} \\\le & {} 2\left[ {\mathcal {P}}_n(B_x(2r_n))\right] ^{k+1}, \end{aligned}$$

and, similarly,

$$\begin{aligned} |D_x\beta ^n_k({{\mathcal {X}}}_n)|\le & {} 2\left[ {{\mathcal {X}}}_n(B_x(2r_n))\right] ^{k+1}. \end{aligned}$$

Set \(r_*= \sup _{n \ge 1}\omega _dn(2r_n)^d < \infty \), where \(\omega _d\) is the volume of a d-dimensional unit ball. Let \(\mathrm{Poi}(\lambda )\) and \(\mathrm{Bin}(n,p)\) denote a Poisson random variable with mean \(\lambda \) and a binomial random variable with parameters n and p, respectively. Then, we obtain that

$$\begin{aligned} {\mathbb {E}}\left\{ |D_x\beta ^n_k({\mathcal {P}}_n)|^4\right\}\le & {} 16 {\mathbb {E}}\left\{ \left[ {\mathcal {P}}_n(B_x(2r_n))\right] ^{4(k+1)}\right\} \le 16 {\mathbb {E}}\left\{ \left[ \mathrm{Poi}(r_*f^*)\right] ^{4(k+1)}\right\} , \\ {\mathbb {E}}\left\{ |D_x\beta ^n_k({{\mathcal {X}}}_n)|^4\right\}\le & {} 16 {\mathbb {E}}\left\{ \left[ {{\mathcal {X}}}_n(B_x(2r_n))\right] ^{4(k+1)}\right\} \le 16 {\mathbb {E}}\left\{ \left[ \mathrm{Bin}\left( n,\frac{r_*f^*}{n}\right) \right] ^{4(k+1)}\right\} . \end{aligned}$$

The lemma now follows from the boundedness of moments of Poisson and binomial random variables with constant means. \(\square \)

4.1 Strong laws

We begin with a lemma giving variance inequalities, which, en passant, establish weak laws for Betti numbers.

Lemma 4.2

For the Poisson point process \({\mathcal {P}}_n\) and binomial point process \({{\mathcal {X}}}_n,\) with \(nr_n^d \rightarrow r \in (0,\infty ),\) and each \(1\le k \le d-1,\) there exists a positive constant \(c_1\) such that for all \(n \ge 1,\)

$$\begin{aligned} \mathsf {VAR}\left( \beta _k(\mathcal{C}({\mathcal {P}}_n,r_n))\right) < c_1 n, \qquad \mathsf {VAR}\left( \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right) < c_1 n. \end{aligned}$$
(4.4)

Thus,  as \(n \rightarrow \infty ,\)

$$\begin{aligned} n^{-1}\left[ \beta _k(\mathcal{C}({\mathcal {P}}_n,r_n))- {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({\mathcal {P}}_n,r_n))\right\} \right] \ \mathop {\rightarrow }\limits ^{{\mathbb {P}}}\ 0, \end{aligned}$$

and

$$\begin{aligned} n^{-1}\left[ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) \, -\,{\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right\} \right] \ \mathop {\rightarrow }\limits ^{{\mathbb {P}}}\ 0. \end{aligned}$$

Proof

Note firstly that the two weak laws follow immediately from the upper bounds in (4.4) and Chebyshev’s inequality.

Thus it remains to prove (4.4). The Poisson case is the easier one, since, by the Poincaré inequality (e.g. [22, equation (1.8)]), the Cauchy–Schwarz inequality, and Lemma 4.1,

$$\begin{aligned} \mathsf {VAR}\left( \beta _k(\mathcal{C}({\mathcal {P}}_n,r_n))\right)\le & {} \int _{{\mathbb {R}}^d} {\mathbb {E}}\left\{ \left[ D_x\beta ^n_k({\mathcal {P}}_n)\right] ^2\right\} n f(x) {\,d}x \\\le & {} n \sqrt{\Delta _k}, \end{aligned}$$

where \(\Delta _k<\infty \) is given by (4.3).

For the binomial case, we need the Efron–Stein inequality (cf. [13] and for the case of random vectors [38, (2.1)]), which states that for a symmetric function \(F:\, ({\mathbb {R}}^d)^n \rightarrow {\mathbb {R}}\),

$$\begin{aligned} \mathsf {VAR}\left( F({{\mathcal {X}}}_n)\right)\le & {} \frac{1}{2}\sum _{i=1}^n{\mathbb {E}}\left\{ \big [F({{\mathcal {X}}}_n)-F({{\mathcal {X}}}_{n+1} {\setminus } \{X_i\})\big ]^2\right\} , \end{aligned}$$

where \({{\mathcal {X}}}_n\) and \({{\mathcal {X}}}_{n+1}\) are coupled so that \({{\mathcal {X}}}_{n+1} = {{\mathcal {X}}}_{n}\cup \{X_{n+1}\}\). Applying this inequality, we have

$$\begin{aligned} \mathsf {VAR}\left( \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right)\le & {} \frac{1}{2}\sum _{i=1}^n{\mathbb {E}}\left\{ \left[ \beta _k({{\mathcal {X}}}_n,r_n)-\beta _k({{\mathcal {X}}}_{n+1} {\setminus } \{X_i\},r_n)\right] ^2\right\} \nonumber \\= & {} \frac{1}{2}\sum _{i=1}^n{\mathbb {E}}\Big \{\left[ \beta _k({{\mathcal {X}}}_n,r_n)-\beta _k({{\mathcal {X}}}_n {\setminus } \{X_i\},r_n) \right. \nonumber \\&+ \left. \beta _k({{\mathcal {X}}}_n {\setminus } \{X_i\},r_n) - \beta _k({{\mathcal {X}}}_{n+1} {\setminus } \{X_i\},r_n)\right] ^2\Big \} \nonumber \\\le & {} \frac{1}{2}\sum _{i=1}^n 4\sqrt{\Delta _k} \nonumber \\= & {} 2n\sqrt{\Delta _k}, \end{aligned}$$
(4.5)

where in the second inequality we have used the bound \((a+b)^2 \le 2a^2 + 2b^2\), the Cauchy–Schwarz inequality, and Lemma 4.1. This completes the proof.

\(\square \)
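The Efron–Stein bound can be checked by hand in a tiny exact example. Take F to be the maximum of n i.i.d. Bernoulli(1/2) variables and enumerate all outcomes of \((X_1,\ldots ,X_{n+1})\); this sketch (our own toy example, not from the paper) verifies that the variance does not exceed the Efron–Stein right-hand side:

```python
import itertools

n = 3
# all outcomes of (x1, ..., x_{n+1}), each with probability 2^{-(n+1)}
outcomes = list(itertools.product([0, 1], repeat=n + 1))
F = max  # a symmetric function of its arguments

# Var(F(X_n)), computed over the first n coordinates
vals = [F(w[:n]) for w in outcomes]
mean = sum(vals) / len(vals)
var = sum((v - mean) ** 2 for v in vals) / len(vals)

# Efron-Stein right-hand side: replace X_i by the fresh variable X_{n+1}
rhs = 0.0
for i in range(n):
    for w in outcomes:
        swapped = w[:i] + (w[n],) + w[i + 1:n]
        rhs += (F(w[:n]) - F(swapped)) ** 2 / len(outcomes)
rhs *= 0.5

print(var, rhs)  # 7/64 versus the Efron-Stein upper bound 3/16
```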

Thanks to the recent bound of [23, Theorem 5.2] (see Lemma 5.1), we can also give a lower bound for the Poisson point process in the case of \(k = d-1\).

Lemma 4.3

For the Poisson point process \({\mathcal {P}}_n\) with \(nr_n^d \rightarrow r \in (0,\infty ),\) let \(n_0\) be such that there is a set \(A \subset \mathrm{{supp}}(f)\) with \(A \oplus B_O(3r_n) \subset \mathrm{{supp}}(f)\) and \(|A| > 0\) for all \(n \ge n_0\). Then,  there exists a positive constant \(c_2\) such that,  for all \(n \ge n_0\) as above, 

$$\begin{aligned} \mathsf {VAR}\left( \beta _{d-1}(\mathcal{C}({\mathcal {P}}_n,r_n))\right) > c_2 n. \end{aligned}$$
(4.6)

Remark 4.4

Note that from the universal coefficient theorem [26, Theorem 45.8] and Alexander duality [37, Theorem 16], we have that

$$\begin{aligned} \tilde{H}_k(\mathcal{C}_B({\mathcal {P}}_n,r)) \ \cong \ \tilde{H}_{d-k-1}({\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}}_n,r)). \end{aligned}$$

Thus

$$\begin{aligned} \beta _{d-1}(\mathcal{C}_B({\mathcal {P}}_n,r)) = \beta _0({\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}}_n,r)) - 1. \end{aligned}$$

\(\beta _0({\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}}_n,r))\) is nothing but the number of connected components of the vacant region of the Boolean model, which is easier to analyse, and this will play a crucial role in our proof.
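In \(d = 1\) the duality can be checked directly: the vacant region of a union of intervals on the line has exactly one more connected component than the union itself, matching \(\beta _{d-1} = \beta _0(\text {complement}) - 1\) with \(d = 1\). A sketch with our own helper names, counting the complement inside a large bounded interval:

```python
def merge_intervals(intervals):
    # connected components of a finite union of closed intervals on R
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def complement_pieces(merged, lo, hi):
    # maximal open subintervals of (lo, hi) not covered by the union
    pieces, prev = [], lo
    for a, b in merged:
        if a > prev:
            pieces.append((prev, a))
        prev = max(prev, b)
    if prev < hi:
        pieces.append((prev, hi))
    return pieces

balls = [(0.0, 1.0), (0.5, 2.0), (4.0, 5.0)]  # one-dimensional "balls"
merged = merge_intervals(balls)
vacant = complement_pieces(merged, -10.0, 10.0)
print(len(merged), len(vacant))  # the vacant region has one more component
```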

Proof

The proof will be based on Lemma 5.1 and the duality argument of Remark 4.4. The finiteness of moments required by that lemma is guaranteed by Lemma 4.1. Choose \(n \ge n_0\), with \(n_0\) and the set A as in the statement of the lemma, and let \(x \in A\). We then have to show that, for each \(1 \le k\le d-1\), there exist an m (depending on k and d only) and a finite set of points \(\{z_1,\ldots ,z_m\} \subset B_O(2r_n)\) such that, for some constants \(c, c_* \in (0,1)\) and for all \((y_1,\ldots ,y_m) \in \prod _{i=1}^mB_{x+z_i}(c_*r_n)\),

$$\begin{aligned} {\mathbb {P}}\left\{ D_x\beta _k^n({\mathcal {P}}_n \cup \{y_1,\ldots ,y_m \}) \le -1 \right\} > c, \end{aligned}$$
(4.7)

and

$$\begin{aligned} D_x\beta _{d-1}^n({\mathcal {P}}_n \cup \{y_1,\ldots ,y_m\}) \le 0, \end{aligned}$$
(4.8)

with probability one. Though not made explicit in the notation, it is to be understood that the above choices of \(m,z_i,c,c_*\) do not depend on \(x \in A\). The above two inequalities imply that

$$\begin{aligned} \left| {\mathbb {E}}\left\{ D_x\beta _{d-1}^n({\mathcal {P}}_n \cup \{y_1,\ldots ,y_m\})\right\} \right| \ge c, \end{aligned}$$

so that condition (5.1) required in Lemma 5.1 is satisfied for the add-one cost function with the constant c above, and the lower bound on the variance given there holds. Though in this proof we require the construction only for \(k = d-1\), we have stated the first of these inequalities for all k, as this will be important for the variance lower bound argument in Theorem 4.7.

Moreover, it is easy to check that, given our choice of \(\{z_1,\ldots ,z_m\}\) and \(c_*r_n\) (in place of r in Lemma 5.1), the bound in (5.2) can be further bounded from below by \(c_2 n\) for some \(c_2>0\) (depending only on f, A, k, r and d). This will prove the lemma.

Thus, all that remains is to find an m and construct \(z_1,\ldots ,z_m\) satisfying the above conditions.

Fix \(k \in \{1,\ldots ,d-1\}\). Let \(S^k\) denote the unit k-dimensional sphere, and embed it via the usual inclusion in the unit sphere in \({\mathbb {R}}^d\). For \(\varepsilon < \frac{1}{4}\) let \(S_\varepsilon ^k=\{x\in {\mathbb {R}}^d:\, \min _{y\in S^k} \Vert x-y\Vert \le \varepsilon \}\) denote the \(\varepsilon \)-thickening of \(S^k\).

Now choose m large enough (but depending on k and d only) such that there exist points \(v_1,\ldots ,v_{m}\) in \({\mathbb {R}}^d\) so that

$$\begin{aligned} S_\varepsilon ^k \subset \bigcup _{i =1}^{m}B_{v_i}(1) \subset \left( B_O(\frac{1}{4})\right) ^c \end{aligned}$$
(4.9)

and, for all \(0\le j\le d-1\),

$$\begin{aligned} \beta _j\left( \mathcal{C}(\{v_1,\ldots ,v_{m}\},1)\right) = \beta _j(S^k)=\beta _j(S^k_\varepsilon ). \end{aligned}$$
(4.10)

(Recall that \( \beta _j(S^k)=0\) for \(j\ne 0,k\), while \(\beta _0(S^k) = \beta _k(S^k) =1\).)

Now, if needed, choose m larger, so that there is a \(c_* > 0\) for which all \((y_1,\ldots ,y_m) \in \prod _{i=1}^mB_{v_i}(c_*)\) satisfy (4.9) and (4.10) with the \(y_i\) in place of the \(v_i\). Note that by scaling we have, for all \((y_1,\ldots ,y_m) \in \prod _{i=1}^mB_{r_nv_i}(c_*r_n) \),

$$\begin{aligned} r_nS^k_{\varepsilon }\subset \bigcup _{i =1}^{m}B_{y_i}(r_n) \subset \left( B_O(r_n/{4})\right) ^c, \end{aligned}$$

while \(\beta _k(\mathcal{C}(\{y_1,\ldots ,y_m\},r_n)) = 1\).

Setting \(z_i = r_nv_i\) for \(i = 1,\ldots , m\), we have \(z_i \in B_O(2r_n)\), as required, while the choice of \(c_*\) above ensures the size requirements we need. So, what remains is to show that (4.7) and (4.8) hold for \((y_1,\ldots ,y_m)\in B_{n,m} := \prod _{i=1}^mB_{x+z_i}(c_*r_n)\).

Firstly, the structure of \(B_{n,m}\) implies that, for \((y_1,\ldots ,y_m)\in B_{n,m}\),

$$\begin{aligned} {\mathbb {P}}\left\{ D_x\beta _k^n({\mathcal {P}}_n \cup \{y_1,\ldots ,y_m \}) \le -1 \ \big |\ {\mathcal {P}}_n(B_x(2r_n)) = 0 \right\} = 1. \end{aligned}$$

Furthermore, it is immediate from Poisson void probabilities that

$$\begin{aligned} {\mathbb {P}}\left\{ {\mathcal {P}}_n(B_x(2r_n)) = 0 \right\} \ge e^{-f^*n\omega _d(2r_n)^d} \ge e^{-r_*f^*} > 0, \end{aligned}$$

where \(r_*:= \sup _{n \ge 1}n\omega _d(2r_n)^d\). These two facts together imply that (4.7) holds with \(c = e^{-r_*f^*} > 0\).

We now turn to the second of these inequalities (which is only for \(k = d-1\)), for which we need the nerve theorem [2, Theorem 10.7] along with the duality argument of Remark 4.4. The nerve theorem allows us to prove the inequality for \(\beta _{d-1}(\mathcal{C}_B({\mathcal {P}}_n,r_n))\) instead of \(\beta _{d-1}({{\mathcal {C}}}({\mathcal {P}}_n,r_n))\), and the duality argument further reduces our task to proving

$$\begin{aligned} D_x\beta _0^n({\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}}_n \cup \{y_1,\ldots ,y_m\},r_n)) \le 0, \end{aligned}$$
(4.11)

with probability one.

Set \(V_n := {\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}}_n \cup \{y_1,\ldots ,y_m\},r_n)\). Since \(x \oplus r_nS_{\varepsilon }^{d-1} \subset \bigcup _{i =1}^{m}B_{y_i}(r_n)\), the covered spherical shell separates \(V_n \cap B_x(r_n)\) from \(V_n \cap B_x(r_n)^c\), so that \(V_n\) is the disjoint union of these two sets and no component of \(V_n\) meets both. Thus,

$$\begin{aligned} \beta _0^n({\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}}_n \cup \{y_1,\ldots ,y_m\},r_n)) = \beta _0(V_n \cap B_x(r_n)) + \beta _0(V_n \cap B_x(r_n)^c). \end{aligned}$$

So,

$$\begin{aligned}&D_x\beta _0^n({\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}}_n \cup \{y_1,\ldots ,y_m\},r_n)) \\&\quad = \beta _0^n({\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}}_n \cup \{x,y_1,\ldots ,y_m\},r_n)) - \beta _0(V_n \cap B_x(r_n)) - \beta _0(V_n \cap B_x(r_n)^c) \\&\quad = \beta _0(V_n \cap B_x(r_n)^c) - \beta _0(V_n \cap B_x(r_n)) - \beta _0(V_n \cap B_x(r_n)^c) \\&\quad = - \beta _0(V_n \cap B_x(r_n)) \le 0, \end{aligned}$$

where in the second equality, we have used the fact

$$\begin{aligned} B_x(r_n) \subset \mathcal{C}_B({\mathcal {P}}_n \cup \{x,y_1,\ldots ,y_m\},r_n). \end{aligned}$$

This proves (4.11) and hence we have (4.8), which was all that was required to complete the proof. \(\square \)

Our next main result is a concentration inequality for \(\beta _k (\mathcal{C}({{\mathcal {X}}}_n,r_n))\).

Theorem 4.5

Let \(1\le k \le d-1,\) \({{\mathcal {X}}}_n\) be a binomial point process, and assume that \(nr_n^d \rightarrow r\in (0,\infty )\). Then,  for any \(a > \frac{1}{2}\) and \(\varepsilon > 0,\) for n large enough, 

$$\begin{aligned} {\mathbb {P}}\left\{ \big |\beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right\} \big | \ge \varepsilon n^a \right\} \le \frac{{C}}{\varepsilon }n^{2k+2-a}\exp (-n^{\gamma }), \end{aligned}$$

where \(\gamma = {(2a-1)}/{4k}\) and \({C} > 0\) is a constant depending only on a, r, k, d and the density f.

The proof, close to that of [31, Theorem 3.17], is based on a concentration inequality for martingale differences.

Proof

Fix \(n \in {\mathbb {N}}\). Let \(Q_{n,i}\) be a partition of \({\mathbb {R}}^d\) into cubes of side length \(r_n\). Define the set \({\mathbb {A}}_n\) as follows:

$$\begin{aligned} {\mathbb {A}}_n\ \mathop {=}\limits ^{\Delta }\ \left\{ {{\mathcal {X}}}:\, |{{\mathcal {X}}}| = n,\ \forall i,\ {{\mathcal {X}}}(Q_{n,i} \cap \text {supp}(f)) \le \max (r,1)n^{\gamma } \right\} . \end{aligned}$$

For large enough n, since \({{\mathcal {X}}}_n(Q_{n,i})\) is stochastically dominated by a \(\mathrm{Bin}(n,f^*r_n^d)\) random variable, elementary bounds (e.g. [31, Lemma 1.1]) yield that

$$\begin{aligned} {\mathbb {P}}\left\{ {{\mathcal {X}}}_n \notin {\mathbb {A}}_n \right\} \le {c}_1 n \exp (-n^{\gamma }), \end{aligned}$$

for some constant \({c_1}\). Since the above bound depends only on the mean of the dominating binomial random variable, the dependence of \(c_1\) on d and r is suppressed. Recall that the points of \({{\mathcal {X}}}_n\) are denoted by \(X_1,\dots ,X_n\), and let \({{\mathcal {F}}}_i = \sigma (X_1,\ldots ,X_i)\) be the sequence of \(\sigma \)-fields they generate, with \({{\mathcal {F}}}_0\) denoting the trivial \(\sigma \)-field. (The actual ordering of the \(X_j\) will not be important.) We can define the finite martingale

$$\begin{aligned} M^{(n)}_{i}\ \mathop {=}\limits ^{\Delta }\ {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\mid {{\mathcal {F}}}_i\right\} , \end{aligned}$$

for \(i=0,1,\dots ,n\), along with the corresponding martingale differences

$$\begin{aligned} D^{(n)}_{i}\ \mathop {=}\limits ^{\Delta }\ {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\mid {{\mathcal {F}}}_i\right\} - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) \mid {{\mathcal {F}}}_{i-1}\right\} , \end{aligned}$$

\(i=1,\dots ,n\), and \(D^{(n)}_0 =0\). Writing

$$\begin{aligned} \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right\} = \sum _{i=0}^nD_{i}^{(n)}, \end{aligned}$$

and setting \({{\mathcal {X}}}^{i}_n = {{\mathcal {X}}}_{n+1}{\setminus } \{X_i\}\), we have that \(D_{i}^{(n)}\) can be represented as

$$\begin{aligned} D_{i}^{(n)} = {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - \beta _k(\mathcal{C}({{\mathcal {X}}}^i_n,r_n)) \mid {{\mathcal {F}}}_i\right\} . \end{aligned}$$
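The Doob martingale decomposition just written can be sanity-checked numerically. The sketch below uses the toy functional \(f(X_1,\dots ,X_n)=\sum _i X_i\) (a hypothetical stand-in for \(\beta _k\), chosen only because its conditional expectations are explicit) and verifies the telescoping identity between the differences and the centred functional:

```python
import random

# Toy illustration of the Doob martingale decomposition used in the proof.
# With X_i ~ Uniform(0,1) and f(X_1,...,X_n) = sum_i X_i (a hypothetical
# stand-in for beta_k), the conditional expectations are explicit:
#   M_i = E{ f | F_i } = X_1 + ... + X_i + (n - i)/2.
def doob_martingale(xs):
    n = len(xs)
    return [sum(xs[:i]) + (n - i) / 2.0 for i in range(n + 1)]

random.seed(0)
n = 10
xs = [random.random() for _ in range(n)]
M = doob_martingale(xs)                         # M_0 = E{f}, M_n = f
D = [M[i] - M[i - 1] for i in range(1, n + 1)]  # martingale differences

# Telescoping identity: sum_i D_i = f - E{f}.
assert abs(sum(D) - (sum(xs) - n / 2.0)) < 1e-12
```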

Let \(A_{i,n}:= \{ {{\mathcal {X}}}_n \in {\mathbb {A}}_n , {{\mathcal {X}}}^i_n \in {\mathbb {A}}_n \}\). Then, recalling that \(S_j({{\mathcal {X}}},r;A)\) denotes the number of j-simplices with at least one vertex in A, and appealing again to Lemma 2.2, we have that, conditioned on the event \(A_{i,n}\), for n large enough,

$$\begin{aligned} |\beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - \beta _k(\mathcal{C}({{\mathcal {X}}}^i_n,r_n))| \le \sum _{j=k}^{k+1}S_j({{\mathcal {X}}}_{n+1},r_n ; \{X_i,X_{n+1}\}) \le {c}_2(rn^{\gamma })^k. \end{aligned}$$

In all cases, we have the universal bound \(|\beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - \beta _k(\mathcal{C}({{\mathcal {X}}}^i_n,r_n))| \le n^k\), and so,

$$\begin{aligned} \big |D_{i}^{(n)}\big |\le & {} {\mathbb {E}}\left\{ \big |\beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - \beta _k(\mathcal{C}({{\mathcal {X}}}^i_n,r_n))\big | \mathbf {1}_{A_{i,n}} \big | {{\mathcal {F}}}_i\right\} \\&+\, {\mathbb {E}}\left\{ \big |\beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - \beta _k(\mathcal{C}({{\mathcal {X}}}^i_n,r_n))\big | \mathbf {1}_{A_{i,n}^c} \big | {{\mathcal {F}}}_i\right\} \\\le & {} {c}_2(rn^{\gamma })^k + n^{k}{\mathbb {P}}\left\{ A^c_{i,n}\big |{{\mathcal {F}}}_i \right\} . \end{aligned}$$

Defining \(B_{i,n}:= \{ {\mathbb {P}}\{A^c_{i,n}\mid {{\mathcal {F}}}_i\} \le n^{-k} \}\), Markov’s inequality implies

$$\begin{aligned} {\mathbb {P}}\left\{ B_{i,n}^c \right\} \le n^{k}{\mathbb {E}}\left\{ {\mathbb {P}}\left\{ A^c_{i,n}\mid {{\mathcal {F}}}_i \right\} \right\} = n^{k}{\mathbb {P}}\left\{ A_{i,n}^c \right\} \le 2 {c}_1n^{k+1} \exp (-n^{\gamma }). \end{aligned}$$

Thus, since \(|D_{i}^{(n)}|\mathbf {1}_{B_{i,n}} \le {c}_3(rn^{\gamma })^k\), using [8, Lemma 1], we have that for any \(b_1,b_2 > 0\),

$$\begin{aligned} {\mathbb {P}}\left\{ \left| \sum _{i = 1}^n D_{i}^{(n)}\right| > b_1 \right\}\le & {} 2\exp \left( -\frac{b_1^2}{32nb_2^2}\right) \\&\quad + \left( 1 + \frac{2\sup _i \Vert D_{i}^{(n)}\Vert _{\infty }}{b_1}\right) \sum _{i=1}^n {\mathbb {P}}\left\{ \left| D_i^{(n)}\right| > b_2 \right\} . \end{aligned}$$

Choosing \(b_1 = \varepsilon n^a\) and \(b_2 = {c}_3(rn^{\gamma })^k\), we have

$$\begin{aligned}&{\mathbb {P}}\left\{ |\beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right\} | \ge \varepsilon n^a \right\} \\&\quad \le \, 2\exp \left( -\frac{\varepsilon ^2 n^{2a}}{32n{c}_3^2(rn^{\gamma })^{2k}}\right) + ( 1 +2\varepsilon ^{-1}n^{k-a})\sum _{i=1}^n{\mathbb {P}}\left\{ |D_{i}^{(n)}| > {c}_3(rn^{\gamma })^k \right\} \\&\quad \le \, 2\exp \left( -\frac{\varepsilon ^2 n^{2a}}{32n{c}_3^2(rn^{\gamma })^{2k}}\right) + ( 1 +2\varepsilon ^{-1}n^{k-a})\sum _{i=1}^n{\mathbb {P}}\left\{ B_{i,n}^c \right\} \\&\quad \le \, 2\exp \left( -\frac{\varepsilon ^2 n^{2a-2k\gamma -1}}{32{c}_3^2r^{2k}}\right) + (1 + 2\varepsilon ^{-1}n^{k-a})2{c}_1n^{k+2}\exp (-n^{\gamma }) \\&\quad \le \, \frac{{c}}{\varepsilon }n^{2k+2-a}\exp (-n^{\gamma }), \end{aligned}$$

for large enough n and a constant \({c} > 0\) which depends only on \(a\), \(r\), \(k\) and \(d\). The last inequality above follows from the fact that \(\gamma < a - \frac{1}{2}\) and hence, for large enough n, \(\exp (-n^{\gamma })\) is the dominating term in the penultimate expression. \(\square \)

We now finally have the ingredients needed to lift the weak laws of Lemma 4.2 to the promised strong convergence.

Theorem 4.6

For the Poisson point process \({\mathcal {P}}_n\) and binomial point process \({{\mathcal {X}}}_n,\) with \(nr_n^d \rightarrow r \in (0,\infty ),\) and each \(1\le k \le d-1,\) we have,  with probability one, 

$$\begin{aligned} \lim _{n \rightarrow \infty } n^{-1} \left[ \beta _k(\mathcal{C}({\mathcal {P}}_n,r_n)) - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({\mathcal {P}}_n,r_n))\right\} \right] = 0, \end{aligned}$$

and

$$\begin{aligned} \lim _{n \rightarrow \infty } n^{-1}\left[ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right\} \right] = 0. \end{aligned}$$

Proof

By choosing \(a = 1\) in Theorem 4.5 and summing over n, we have, for all \(\varepsilon > 0\),

$$\begin{aligned} \sum _{n \ge 1} {\mathbb {P}}\left\{ n^{-1}{\left| \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n)) - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\right\} \right| } \ge \varepsilon \right\} < \infty . \end{aligned}$$

The Borel–Cantelli lemma immediately implies the strong law for \(\beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))\).

Turning to the Poisson case, we shall use the standard coupling of \({\mathcal {P}}_n\) to \({{\mathcal {X}}}_n\) to complete the proof of the theorem. Let \(N_n\) be a Poisson random variable with mean n. Then, by choosing \({\mathcal {P}}_n = \{X_1,\ldots ,X_{N_n}\}\), we have coupled it with \({{\mathcal {X}}}_n = \{X_1,\ldots ,X_n\}\), \(n\ge 1\). By Lemma 2.2, we have that

$$\begin{aligned} |\beta _k(\mathcal{C}({\mathcal {P}}_n,r_n)) - \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))|\le & {} S_k({\mathcal {P}}_n \cup {{\mathcal {X}}}_n,r_n ; {\mathcal {P}}_n \bigtriangleup {{\mathcal {X}}}_n)\\&+\, S_{k+1}({\mathcal {P}}_n \cup {{\mathcal {X}}}_n,r_n ; {\mathcal {P}}_n \bigtriangleup {{\mathcal {X}}}_n). \end{aligned}$$
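The standard coupling invoked here is easy to realise numerically. In the sketch below (all sampling details are illustrative; the Poisson variable is generated as a sum of n independent Poisson(1) variables via Knuth's method), the two processes read from a common i.i.d. stream, so their symmetric difference has exactly \(|N_n - n|\) points:

```python
import math
import random

# Knuth's method for a Poisson(1) random variable.
def poisson1(rng):
    L, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(42)
n = 500
N = sum(poisson1(rng) for _ in range(n))   # N_n ~ Poisson(n)
stream = [rng.random() for _ in range(max(n, N))]
P_n = stream[:N]   # coupled Poisson sample: first N_n points of the stream
X_n = stream[:n]   # binomial sample: first n points of the same stream

# Under the coupling, the two samples differ in exactly |N_n - n| points,
# which is O(sqrt(n)) and hence o(n).
assert len(set(P_n) ^ set(X_n)) == abs(N - n)
assert abs(N - n) < n
```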

Now note that,

$$\begin{aligned} S_k\left( {\mathcal {P}}_n \bigcup {{\mathcal {X}}}_n,r_n ; {\mathcal {P}}_n \bigtriangleup {{\mathcal {X}}}_n\right) = |S_k({\mathcal {P}}_n ,r_n) - S_k({{\mathcal {X}}}_n ,r_n)|, \end{aligned}$$

with a similar identity holding for \(S_{k+1}\). From [32, Theorem 2.2] and the remarks following that result, we have that there exists a constant \(\widehat{S}_k(f) \in (0,\infty )\) such that, with probability one,

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{S_k({{\mathcal {X}}}_n ,r_n)}{n} = \lim _{n \rightarrow \infty } \frac{S_k({\mathcal {P}}_n ,r_n)}{n} = \widehat{S}_k(f). \end{aligned}$$

A similar limit law holds true for \(S_{k+1}\), and so we have that, with probability one,

$$\begin{aligned} \lim _{n \rightarrow \infty }\frac{S_k({\mathcal {P}}_n \cup {{\mathcal {X}}}_n,r_n ; {\mathcal {P}}_n \bigtriangleup {{\mathcal {X}}}_n)}{n} = \lim _{n \rightarrow \infty }\frac{S_{k+1}({\mathcal {P}}_n \cup {{\mathcal {X}}}_n,r_n ; {\mathcal {P}}_n \bigtriangleup {{\mathcal {X}}}_n)}{n} = 0. \end{aligned}$$

Thus,

$$\begin{aligned} \frac{|\beta _k(\mathcal{C}({\mathcal {P}}_n,r_n)) - \beta _k(\mathcal{C}({{\mathcal {X}}}_n,r_n))|}{n} \ \mathop {\rightarrow }\limits ^{n \rightarrow \infty }\ 0, \end{aligned}$$

and the strong law for \(\beta _k(\mathcal{C}({\mathcal {P}}_n,r_n))\) follows. \(\square \)

4.2 Central limit theorem

We have finally come to the main result of this section: central limit theorems for Betti numbers.

We start with some definitions from percolation theory for the Boolean model on Poisson processes [25] needed for the proof of the Poisson central limit theorem. Recall firstly that we say that a subset A of \({\mathbb {R}}^d\) percolates if it contains an unbounded connected component.

Now let \({\mathcal {P}}\) be a stationary Poisson point process on \({\mathbb {R}}^d\) with unit intensity. (Unit intensity is for notational convenience only. The arguments of this section will work for any constant intensity.) We define the critical (percolation) radii for \({\mathcal {P}}\) as follows:

$$\begin{aligned} r_c({\mathcal {P}})\ \mathop {=}\limits ^{\Delta }\ \inf \{ r:\, {\mathbb {P}}\left\{ {\mathcal{C}_B({\mathcal {P}},r) \text { percolates}} \right\} > 0 \}, \end{aligned}$$

and,

$$\begin{aligned} r^*_c({\mathcal {P}})\ \mathop {=}\limits ^{\Delta }\ \sup \{ r:\, {\mathbb {P}}\{{{\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}},r) \text { percolates}}\} > 0 \}. \end{aligned}$$

By Kolmogorov’s zero-one law, it is easy to see that both of the probabilities inside the infimum and supremum here are either 0 or 1. The first critical radius is called the critical radius for percolation of the occupied component and the second is the critical radius for percolation of the vacant component.

We define the interval of co-existence, \(I_d({\mathcal {P}})\), the set of radii for which unbounded components of both the Boolean model and its complement co-exist, as follows:

$$\begin{aligned} I_d({\mathcal {P}}) = {\left\{ \begin{array}{ll} {(}r_c,r_c^*] &{}\text {if }{\mathbb {P}}\left\{ \mathcal{C}_B({\mathcal {P}},r_c)\text { percolates} \right\} = 0,\\ {[}r_c,r_c^*] &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$

From [25, Theorem 4.4 and Theorem 4.5], we know that \(I_2({\mathcal {P}})= \emptyset \) and from [34, Theorem 1] we know that \(I_d({\mathcal {P}}) \ne \emptyset \) for \(d \ge 3\). In high dimensions, it is known that \(r_c \notin I_d({\mathcal {P}})\) (cf. [41]).
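For intuition, occupied percolation can be probed by a finite-volume proxy: check whether the union of balls around a sample connects two opposite faces of a box. The sketch below (box size, intensity and radii are all illustrative, and the degenerate radii in the assertions are chosen so that the outcome is certain) uses a union–find structure with two virtual face vertices:

```python
import math
import random

# Finite-volume proxy for occupied percolation: does the union of balls of
# radius r around a planar point sample connect the left and right faces of
# a [0,width]^2 box?  Union-find with virtual vertices L and R for the faces.
def spans(points, r, width):
    parent = list(range(len(points) + 2))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a
    def union(a, b):
        parent[find(a)] = find(b)
    L, R = len(points), len(points) + 1
    for i, p in enumerate(points):
        if p[0] <= r:
            union(i, L)                    # ball touches the left face
        if p[0] >= width - r:
            union(i, R)                    # ball touches the right face
        for j in range(i):
            if math.dist(p, points[j]) <= 2 * r:
                union(i, j)                # overlapping balls
    return find(L) == find(R)

rng = random.Random(0)
width = 10.0
pts = [(rng.uniform(0, width), rng.uniform(0, width)) for _ in range(100)]
# Degenerate radii chosen so the outcomes are certain:
assert spans(pts, width, width)    # huge radius: everything connects
assert not spans(pts, 0.0, width)  # zero radius: dust regime, no spanning
```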

We now need a little additional notation. Let \(\{B_n\}_{n \ge 1}\) be a sequence of bounded Borel subsets in \({\mathbb {R}}^d\) satisfying the following four conditions:

  (A) \(|B_n| = {n}\), for all \(n \ge 1\).

  (B) \(\bigcup _{n \ge 1}\bigcap _{m \ge n} B_m = {\mathbb {R}}^d\).

  (C) \( {|(\partial B_n)^{(r)}|}/{n} \rightarrow 0\), for all \(r > 0\), where \((\partial B_n)^{(r)}\) denotes the r-neighbourhood of the boundary \(\partial B_n\).

  (D) There exists a constant \(b_1\) such that \(\mathrm{diam}(B_n) \le b_1n^{b_1}\), where \(\mathrm{diam}(B)\) is the diameter of B.
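A canonical example satisfying (A)–(D) is the cube \(B_n = [0,n^{1/d}]^d\) (an illustrative choice, not the only one). The sketch below checks (A), (C) and (D) numerically, using \((s+2r)^d - (s-2r)^d\) as an upper bound for the volume of the r-neighbourhood of the boundary of a cube of side s:

```python
import math

# For B_n = [0, n^(1/d)]^d, a cube of volume n, over-estimate the volume of
# the r-neighbourhood of its boundary by fattening/shrinking the cube by r.
def boundary_nbhd_volume(n, d, r):
    s = n ** (1.0 / d)
    return (s + 2 * r) ** d - max(s - 2 * r, 0.0) ** d

d, r = 3, 1.0
# (A): |B_n| = n by construction (floating-point check).
assert all(abs((m ** (1 / d)) ** d - m) / m < 1e-9 for m in (10, 1000))
# (C): |(boundary B_n)^(r)| / n -> 0, here at rate n^(-1/d).
ratios = [boundary_nbhd_volume(m, d, r) / m for m in (10**3, 10**4, 10**5)]
assert ratios[0] > ratios[1] > ratios[2]
# (D): diam(B_n) = sqrt(d) n^(1/d) <= b_1 n^(b_1) with, e.g., b_1 = 2.
assert all(math.sqrt(d) * m ** (1 / d) <= 2 * m ** 2 for m in (1, 10, 100))
```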

In a moment we shall state and prove a central limit theorem for sequences of the form \(\beta _k(\mathcal{C}({\mathcal {P}}\cap B_n,r))\), when the \(B_n\) are as above. Setting up the central limit theorem for the binomial case requires a little more notation.

In particular, we write \({{\mathcal {U}}}_n\) to denote the point process obtained by choosing n points uniformly in \(B_n\), and call this the extended binomial point process. This is a natural binomial counterpart to the Poisson point process \({\mathcal {P}}\cap B_n\).

We finally have all that we need to formulate the main central limit theorem.

Theorem 4.7

Let \(\{B_n\}\) be a sequence of sets in \({\mathbb {R}}^d\) satisfying conditions (A)–(D) above,  and let \({\mathcal {P}}\) and \({{\mathcal {U}}}_n, \ n \ge 1,\) respectively,  be the unit intensity Poisson process and the extended binomial point process described above. Take \(k \in \{1,\ldots ,d-1\}\) and \(r \in (0,\infty )\). Then there exists a constant \(\sigma ^2 > 0\) such that,  as \(n \rightarrow \infty ,\)

$$\begin{aligned} n^{-1}\mathsf {VAR}\left( \beta _k(\mathcal{C}({\mathcal {P}}\cap B_n,r))\right) \rightarrow \sigma ^2, \end{aligned}$$

and

$$\begin{aligned} n^{-1/2}\left( \beta _k(\mathcal{C}({\mathcal {P}}\cap B_n,r)) - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({\mathcal {P}}\cap B_n,r))\right\} \right) \ \Rightarrow \ N(0,\sigma ^2). \end{aligned}$$

Furthermore,  for \(r \notin I_d({\mathcal {P}}),\) there exists a \(\tau ^2\) with \(0 < \tau ^2 \le \sigma ^2\) such that

$$\begin{aligned} n^{-1}\mathsf {VAR}\left( \beta _k(\mathcal{C}({{\mathcal {U}}}_n,r))\right) \rightarrow \tau ^2, \end{aligned}$$

and

$$\begin{aligned} n^{-1/2}\left( \beta _k(\mathcal{C}({{\mathcal {U}}}_n,r)) - {\mathbb {E}}\left\{ \beta _k(\mathcal{C}({{\mathcal {U}}}_n,r))\right\} \right) \ \Rightarrow \ N(0,\tau ^2). \end{aligned}$$

The constants \(\sigma ^2\) and \(\tau ^2\) are independent of the sequence \(\{B_n\}\).

Remark 4.8

The condition \(r \notin I_d({\mathcal {P}})\), needed for the binomial central limit theorem, is rather irritating, and we are not sure if it is necessary or an artefact of the proof. It is definitely not needed for the case \(k=d-1\). To see this, note that from the duality argument of Remark 4.4, we have that

$$\begin{aligned} \beta _{d-1}(\mathcal{C}({\mathcal {P}}\cap B_n,r)) = \beta _0({\mathbb {R}}^d {\setminus } \mathcal{C}({\mathcal {P}}\cap B_n,r)) - 1. \end{aligned}$$

However, \({\mathbb {R}}^d {\setminus } \mathcal{C}({\mathcal {P}}\cap B_n,r)\) is nothing but the vacant component of the Boolean model, and central limit theorems for \(\beta _0({\mathbb {R}}^d {\setminus } \mathcal{C}({{\mathcal {X}}}\cap B_n,r))\) for both Poisson and binomial point processes are given in [33, p 1040] for all \(r \in (0,\infty )\). By the above duality arguments, this proves both the central limit theorems of Theorem 4.7, when \(k=d-1\), and without the requirement that \(r \notin I_d({\mathcal {P}})\).

Proof

Since the theorem is somewhat of an omnibus collection of results, the proof is rather long. Thus we shall break it up into signposted segments.

I. Poisson central limit theorem: Since \(\beta _k(\mathcal{C}(\,\cdot \, ,r))\) is a translation invariant functional over finite subsets of \({\mathbb {R}}^d\), we need only check that Conditions 3 and 4 in Theorem 5.2, along with the weak stabilization of (5.4), hold in order to prove the convergence of variances and the asymptotic normality in the Poisson case. The strict positivity of \(\sigma ^2\) will follow from that of \(\tau ^2\), to be proven below.

We treat each of the three necessary conditions separately.

(i) Weak stabilization: Firstly, we shall show that there exist a.s. finite random variables \(D_{\beta _k}(\infty )\) and R such that, for all \(\rho > R\),

$$\begin{aligned} (D_O\beta _k^r)({\mathcal {P}}\cap B_O(\rho )) = D_{\beta _k}(\infty ), \end{aligned}$$
(4.12)

where \(\beta _k^r({{\mathcal {X}}})\mathop {=}\limits ^{\Delta }\beta _k(\mathcal{C}({{\mathcal {X}}},r))\) for any finite point-set \({{\mathcal {X}}}\). Then, we shall complete the proof of weak stabilization by showing the above for any \({\mathfrak {B}}\)-valued sequence of sets \(A_n\) tending to \({\mathbb {R}}^d\). (See the paragraphs preceding Theorem 5.2 for the definition of \({\mathfrak {B}}\).)

For any \(\rho >2r\), define the simplicial complexes

$$\begin{aligned} {{\mathcal {K}}}_{\rho }= & {} \mathcal{C}(({\mathcal {P}}\cap B_O(\rho )) \cup \{0\},r), \\ {{\mathcal {K}}}_{\rho }'= & {} \mathcal{C}({\mathcal {P}}\cap B_O(\rho ),r),\\ {{\mathcal {K}}}''= & {} \mathcal{C}(({\mathcal {P}}\cap B_O(2r)) \cup \{0\},r),\\ {{\mathcal {L}}}= & {} {{\mathcal {K}}}_{\rho }' \cap {{\mathcal {K}}}'', \end{aligned}$$

and note that \({{\mathcal {K}}}_{\rho }={{\mathcal {K}}}_{\rho }'\cup {{\mathcal {K}}}''\) and that, as implied by the notation, \({{\mathcal {L}}}\) and \({{\mathcal {K}}}''\) do not depend on \(\rho \).

From the second part of Lemma 2.3, we have that

$$\begin{aligned} (D_O\beta _k^r)({\mathcal {P}}\cap B_O(\rho ))= & {} \beta _k({{\mathcal {K}}}_{\rho })-\beta _k({{\mathcal {K}}}_{\rho }') \nonumber \\= & {} \beta _k({{\mathcal {K}}}'')+ \beta (N_k^{\rho }) + \beta (N_{k-1}^{\rho })-\beta _k({{\mathcal {L}}}), \end{aligned}$$
(4.13)

where \(N_k^{\rho }\) is the kernel of the induced homomorphism

$$\begin{aligned} \lambda _k^{\rho }:\, H_k({{\mathcal {L}}})\rightarrow H_k({{\mathcal {K}}}_{\rho }') \oplus H_k({{\mathcal {K}}}''). \end{aligned}$$

Hence, all that remains is to show that \(\beta (N_j^{\rho })\), \(j=k,k-1\), remain unchanged as \(\rho \) increases beyond some random variable R. Since these variables are integer valued, it suffices to show that they are increasing and bounded to prove (4.12). We shall do this for \(\beta (N_k^{\rho })\). The same proof also works for \(\beta (N_{k-1}^{\rho })\).

The boundedness is immediate, since

$$\begin{aligned} \beta (N_k^{\rho })\le \beta _k({{\mathcal {L}}})\le {\mathcal {P}}(B_O(2r))^{k+1} < \infty , \quad \text {a.s.} \end{aligned}$$

All that remains to show is that \(\beta (N_k^{\rho })\) is increasing. Let \(\rho _1,\rho _2\) be such that \(2r < \rho _1 \le \rho _2\). We need to show that \(\beta (N_k^{\rho _1}) \le \beta (N_k^{\rho _2})\).

Since \({{\mathcal {L}}}\subset {{\mathcal {K}}}_{\rho _1} \subset {{\mathcal {K}}}_{\rho _2}\), we have the corresponding simplicial maps defined by the respective inclusions (see Sect. 2.1) and hence the following homomorphisms:

$$\begin{aligned} H_k({{\mathcal {L}}}) \ \mathop {\rightarrow }\limits ^{\lambda _k^{\rho _1}}\ H_k({{\mathcal {K}}}_{\rho _1}') \oplus H_k({{\mathcal {K}}}'') \ \mathop {\rightarrow }\limits ^{\eta }\ H_k({{\mathcal {K}}}_{\rho _2}') \oplus H_k({{\mathcal {K}}}''). \end{aligned}$$

Also, by the functoriality of homology, \(\lambda _k^{\rho _2} = \eta \circ \lambda _k^{\rho _1}\). Since \(\mathrm{ker}\, \eta \subset \mathrm{ker}\, \eta ' \circ \eta \) for any two homomorphisms \(\eta ,\eta '\), we have that

$$\begin{aligned} \beta (N_k^{\rho _1}) =\beta (\mathrm{ker}\, \lambda _k^{\rho _1})\le \beta (\mathrm{ker}\, \eta \circ \lambda _k^{\rho _1}) = \beta (N_k^{\rho _2}). \end{aligned}$$
(4.14)

This proves that \(\beta (N_k^{\rho })\) is increasing in \(\rho \), as, similarly, is \(\beta (N_{k-1}^{\rho })\). Combining the convergence of \(\beta (N_j^{\rho })\), \(j=k,k-1\), with (4.13) gives (4.12).
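The kernel monotonicity used in (4.14) is pure linear algebra: for any homomorphisms \(\lambda \) and \(\eta \), \(\ker \lambda \subset \ker (\eta \circ \lambda )\), so \(\dim \ker \lambda \le \dim \ker (\eta \circ \lambda )\). A minimal numerical illustration (the matrices below are hypothetical toy examples over \({\mathbb {Q}}\), via rank–nullity):

```python
from fractions import Fraction

# Rank via Gaussian elimination in exact rational arithmetic.
def rank(M):
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def nullity(M, dim_domain):
    return dim_domain - rank(M)  # rank-nullity theorem

lam = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # lam: Q^3 -> Q^3, nullity 1
eta = [[1, 0, 0], [0, 0, 0]]             # eta: Q^3 -> Q^2
# dim ker(lam) <= dim ker(eta . lam), mirroring (4.14).
assert nullity(lam, 3) <= nullity(matmul(eta, lam), 3)
```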

Now, let \(A_n\) be a \({\mathfrak {B}}\)-valued sequence of sets tending to \({\mathbb {R}}^d\). To complete the proof of weak stabilization, we need to show that there exists an integer valued random variable N such that for all \(n > N\),

$$\begin{aligned} (D_O\beta _k^r)({\mathcal {P}}\cap A_n)= D_{\beta _k}(\infty ), \end{aligned}$$
(4.15)

where, as before, \(\beta _k^r({{\mathcal {X}}})\mathop {=}\limits ^{\Delta }\beta _k(\mathcal{C}({{\mathcal {X}}},r))\) for any finite point-set \({{\mathcal {X}}}\). Firstly, choose R as in (4.12) and, without loss of generality, assume \(R>2r\). In particular, this implies that \(\beta (N_j^{\rho })\) remains constant for \(\rho > R\), for \(j = k-1,k\). Since \({\mathcal {P}}\cap B_O(R + 1)\) is a.s. finite and \(\bigcup _{n \ge 1}\bigcap _{m \ge n}A_m = {\mathbb {R}}^d\), there exists an a.s. finite random variable \(N^*\) such that

$$\begin{aligned} \{0\} \cup {\mathcal {P}}\cap B_O(R + 1) \subset \bigcap _{m \ge N^*} A_m. \end{aligned}$$

Hence, \(\{0\} \cup {\mathcal {P}}\cap B_O(R + 1) \subset A_n\) for all \(n > N^*\). Let \(n > N^*\), and note that since \(A_n\in {\mathfrak {B}}\), diam\((A_n)<\infty \) and so we can choose \(R_n<\infty \) such that \(A_n \subset B_O(R_n)\). Define the simplicial complexes

$$\begin{aligned} {{\mathcal {K}}}= & {} \mathcal{C}(({\mathcal {P}}\cap B_O(R + 1)) \cup \{0\},r), \\ {{\mathcal {K}}}_n= & {} \mathcal{C}(({\mathcal {P}}\cap A_n) \cup \{0\},r), \\ {{\mathcal {K}}}^*_n= & {} \mathcal{C}(({\mathcal {P}}\cap B_O(R_n)) \cup \{0\},r), \\ {{\mathcal {K}}}_n'= & {} \mathcal{C}({\mathcal {P}}\cap A_n,r),\\ {{\mathcal {K}}}''= & {} \mathcal{C}(({\mathcal {P}}\cap B_O(2r)) \cup \{0\},r),\\ {{\mathcal {L}}}= & {} {{\mathcal {K}}}_n' \cap {{\mathcal {K}}}'', \end{aligned}$$

where, again, \({{\mathcal {L}}}\) and \({{\mathcal {K}}}''\) do not depend on n. Now applying the second part of Lemma 2.3, since \({{\mathcal {K}}}_n = {{\mathcal {K}}}_n' \cup {{\mathcal {K}}}''\), we have that

$$\begin{aligned} (D_O\beta _k^r)({\mathcal {P}}\cap A_n)= & {} \beta _k({{\mathcal {K}}}_n)-\beta _k({{\mathcal {K}}}_n') \nonumber \\= & {} \beta _k({{\mathcal {K}}}'')+ \beta (M_k^n) + \beta (M_{k-1}^n)-\beta _k({{\mathcal {L}}}), \end{aligned}$$
(4.16)

where \(M_j^{n},j=k,k-1\), is the kernel of the induced homomorphism

$$\begin{aligned} \gamma _j^n:\, H_j({{\mathcal {L}}}) \rightarrow H_j({{\mathcal {K}}}_n') \oplus H_j({{\mathcal {K}}}''). \end{aligned}$$

Again, to prove (4.15), all we need to show is that \(\beta (M_j^n)\), \(j=k,k-1\), remain constant for all \(n > N^*\).

To see this, start by noting that, by the choice of \(n,R,R_n\), we have the following inclusions:

$$\begin{aligned} {{\mathcal {L}}}\subset {{\mathcal {K}}}\subset {{\mathcal {K}}}'_n \subset {{\mathcal {K}}}^*_n. \end{aligned}$$

Hence the corresponding simplicial maps give rise to the following induced homomorphisms:

$$\begin{aligned} H_k({{\mathcal {L}}}) \ \mathop {\rightarrow }\limits ^{\eta _1}\ H_k({{\mathcal {K}}}) \oplus H_k({{\mathcal {K}}}'') \ \mathop {\rightarrow }\limits ^{\eta _2}\ H_k({{\mathcal {K}}}'_n) \oplus H_k({{\mathcal {K}}}'')\ \mathop {\rightarrow }\limits ^{\eta _3}\ H_k({{\mathcal {K}}}^*_n) \oplus H_k({{\mathcal {K}}}''). \end{aligned}$$

Note that \(\gamma _k^n = \eta _2 \circ \eta _1\). Also, from the choice of \(R, {{\mathcal {K}}}\) and \({{\mathcal {K}}}_n^*\), we have that

$$\begin{aligned} \beta (N_k^{R+1}) = \beta (\mathrm{ker}\, \eta _1)= \beta (\mathrm{ker}\, \eta _3 \circ \eta _2 \circ \eta _1), \end{aligned}$$

where \(N_k^\rho \) for any \(\rho \ge 2r\) was defined after (4.13). Now, by an argument similar to that used to obtain (4.14), we have the following inequality:

$$\begin{aligned} \beta (\mathrm{ker}\, \eta _1)\le \beta (M_k^n) = \beta (\mathrm{ker}\, \eta _2 \circ \eta _1)\le \beta (\mathrm{ker}\, \eta _3 \circ \eta _2 \circ \eta _1). \end{aligned}$$

Thus, we have that \(\beta (M_k^n) = \beta (N_k^{R+1})\) for \(n > N^*\) with a corresponding result holding for \(\beta (M_{k-1}^n)\). Using this in (4.16) proves (4.15), and so we have shown that \(\beta _k(\mathcal{C}({\mathcal {P}},r))\) is weakly stabilizing on \({\mathcal {P}}\) for all \(r \ge 0\).

(ii) Uniformly bounded moments: Via a calculation similar to that in Lemma 4.1, we obtain that, for \(m \in [{|A|}/{2},{3|A|}/{2}]\),

$$\begin{aligned} \left| (D_O\beta _k)({{\mathcal {U}}}_{m,A})\right| \le 2\left[ \mathrm{Bin}\left( m,\frac{\omega _dr^d}{|A|}\right) \right] ^{k+1}\le 2\left[ \mathrm{Bin}\left( \left\lceil \frac{3|A|}{2}\right\rceil ,\frac{\omega _dr^d}{|A|}\right) \right] ^{k+1}, \end{aligned}$$

where the inequalities here are to be read as ‘bounded by a random variable with distribution’. Thus, the uniformly bounded fourth moments of the rightmost binomial random variable imply uniformly bounded fourth moments for the add-one cost function.

(iii) Polynomial boundedness: This follows easily from the relation

$$\begin{aligned} \beta _k(\mathcal{C}({{\mathcal {X}}},r)) \le S_k({{\mathcal {X}}},r)\le {{\mathcal {X}}}({\mathbb {R}}^d)^{k+1}. \end{aligned}$$
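The crude count behind this bound is easy to reproduce. The sketch below counts k-simplices in a Vietoris–Rips-style proximity complex (an illustrative stand-in for the Čech complex, whose simplex count at the same scale is no larger) and checks the polynomial bound:

```python
import itertools
import math
import random

# Count k-simplices in a Rips-style complex: (k+1)-subsets of the point set
# all of whose pairwise distances are <= 2r.  This dominates the Cech count,
# so it suffices for the crude bound S_k(X, r) <= |X|^(k+1).
def count_k_simplices(points, r, k):
    return sum(
        1
        for simplex in itertools.combinations(points, k + 1)
        if all(math.dist(p, q) <= 2 * r
               for p, q in itertools.combinations(simplex, 2))
    )

rng = random.Random(1)
pts = [(rng.random(), rng.random()) for _ in range(20)]
k, r = 1, 0.15
S_k = count_k_simplices(pts, r, k)
assert S_k <= len(pts) ** (k + 1)  # the polynomial bound used in the proof
```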

From Theorem 5.2 and the remarks following it, the above three items suffice to prove the central limit theorem for the Poisson point process.

II. Binomial central limit theorem: Given the bounds proven in the previous part of the proof, all that remains to complete the central limit theorem for the binomial case is to prove the strong stabilization of \(D_O\beta _k\) for \(r \notin I_d\).

What we need to show is that there exist a.s. finite random variables \(\widehat{D}_{\beta _k}(\infty ),S\) such that for all finite \({{\mathcal {X}}}\subset B_O(S)^c\),

$$\begin{aligned} (D_O\beta _k)(({\mathcal {P}}\cap B_O(S)) \cup {{\mathcal {X}}})= \widehat{D}_{\beta _k}(\infty ). \end{aligned}$$

We shall handle the two cases \(r < r_c\) and \(r > r_c^*\) separately.

Assume that \(r < r_c\), or \(r \le r_c\) if \(r_c \notin I_d\). In this case, since \(\mathcal{C}_B({\mathcal {P}},r)\) does not percolate, there are only finitely many components of \(\mathcal{C}_B({\mathcal {P}},r)\) that intersect \(B_O(r)\), and all of them are a.s. bounded. Let \(C_1,\ldots , C_M\) be an enumeration of these components, for some a.s. finite \(M>0\). (We exclude the trivial but possible case of \(M = 0\).) Since \(C_1,\ldots , C_M\) are a.s. bounded subsets, \(C = \bigcup _{i=1}^MC_i\) is also a.s. bounded. Thus, in this case we can choose an a.s. finite S such that \(d(x,C) > 3r\) for all \(x \notin B_O(S)\). This implies that for any locally finite \({{\mathcal {X}}}\subset B_O(S)^c\) we have \(C \cap \mathcal{C}_B({{\mathcal {X}}},r) = \emptyset \). Thus, for any finite \({{\mathcal {X}}}\subset B_O(S)^c\),

$$\begin{aligned} (D_O\beta _k)(({\mathcal {P}}\cap B_O(S)) \cup {{\mathcal {X}}}) = \beta _k(C \cup B_O(r)) - \beta _k(C), \end{aligned}$$
(4.17)

i.e. \(\beta _k\) strongly stabilizes with stabilization radius S and

$$\begin{aligned} \widehat{D}_{\beta _k}(\infty )\ \mathop {=}\limits ^{\Delta }\ \beta _k(C \cup B_O(r)) - \beta _k(C). \end{aligned}$$

Now assume that \(r > r_c^*\). Since \({\mathbb {R}}^d {\setminus } \mathcal{C}_B({\mathcal {P}},r)\) has only finitely many components that intersect \(B_O(r)\), duality arguments, as in Remark 4.8, establish strong stabilization for \(\beta _k(\mathcal{C}({\mathcal {P}},r))\).

Consequently, (4.17) and duality establish strong stabilization of \(\beta _k({{\mathcal {C}}}({\mathcal {P}},r))\) for \(r \notin I_d\), and this completes the proof of the central limit theorem for \(\beta _k(\mathcal{C}({{\mathcal {U}}}_n,r))\).

III. Positivity of \(\tau ^2\): All that remains is to show the strict positivity of \(\tau ^2\). By Theorem 5.2, it suffices to show that \(D_{\beta _k}(\infty )\) is non-degenerate, and this we shall do using arguments similar to those used to obtain (4.7) and (4.8).

Write \({\mathcal {P}}_n\) for \({\mathcal {P}}\cap B_O(n)\). We showed in (4.12) that \(|(D_O\beta _k)({\mathcal {P}}_n)| \mathop {\rightarrow }\limits ^{a.s. }|D_{\beta _k}(\infty )|\), and we have from Lemma 2.2 that, for n large enough,

$$\begin{aligned} |(D_O\beta _k)({\mathcal {P}}_n)|\le \sum _{j=k}^{k+1} S_j({\mathcal {P}}_n \cup \{0\},r ; \{0\}) \le 2{\mathcal {P}}(B_O(2r))^{k+1}. \end{aligned}$$

Since \({\mathbb {E}}\left\{ {\mathcal {P}}(B_O(2r))^{k+1}\right\} < \infty \), we can use the dominated convergence theorem to obtain that \({\mathbb {E}}\left\{ |(D_O\beta _k)({\mathcal {P}}_n)|\right\} \rightarrow {\mathbb {E}}\left\{ |D_{\beta _k}(\infty )|\right\} \) as \(n \rightarrow \infty \).

Choose m (depending only on k and d) as in the proof of the variance lower bound in Lemma 4.2 (see (4.9) and (4.10)) and define the set \(B^*_m\) as there. Setting \(B^*_{m,r}= rB^*_m\), we have that \(|B^*_{m,r}| = |B^*_m|r^{md} > 0\). Thus, for all \(n \ge 5r\),

$$\begin{aligned}&{\mathbb {E}}\left\{ |(D_O\beta _k)({\mathcal {P}}_n)|\right\} \\&\quad \ge \, {\mathbb {E}}\left\{ |(D_O\beta _k)({\mathcal {P}}_n)| \mathbf {1}_{{\mathcal {P}}_n(B_O(2r))=m, {\mathcal {P}}_n(B_O(4r){\setminus } B_O(2r)) = 0 }\right\} \nonumber \\&\quad = \, {\mathbb {E}}\left\{ |(D_O\beta _k)({\mathcal {P}}_n)| \, \big | \, {\mathcal {P}}_n(B_O(2r))=m, {\mathcal {P}}_n(B_O(4r){\setminus } B_O(2r)) = 0\right\} \nonumber \\&\qquad \times \, {\mathbb {P}}\left\{ {\mathcal {P}}_n(B_O(2r))=m, {\mathcal {P}}_n(B_O(4r){\setminus } B_O(2r)) = 0 \right\} \nonumber \\&\quad \ge \, {\mathbb {P}}\left\{ {\mathcal {P}}_n(B_O(2r))=m, {\mathcal {P}}_n(B_O(4r){\setminus } B_O(2r)) = 0 \right\} \nonumber \\&\qquad \times \,\frac{1}{(\omega _d(2r)^d)^{m}}\int _{(y_1,\ldots ,y_{m})\in B_O(2r)^{m}} |(D_O\beta _k)(\{y_1,\ldots ,y_{m}\})|{\,d}y_1 \cdots {\,d}y_{m} \nonumber \\&\quad \ge \, \frac{|B^*_{m,r}|}{m!} e^{-\omega _d(4r)^d}\\&\quad >\, 0. \end{aligned}$$

Thus,

$$\begin{aligned} {\mathbb {E}}\left\{ |D_{\beta _k}(\infty )|\right\} = \lim _{n \rightarrow \infty }{\mathbb {E}}\left\{ |(D_O\beta _k)({\mathcal {P}}_n)|\right\} > \frac{|B^*_{m,r}|}{m!} e^{-\omega _d(4r)^d}. \end{aligned}$$

This shows that \({\mathbb {P}}\left\{ D_{\beta _k}(\infty ) \ne 0 \right\} > 0\). Thus, to complete the proof of non-degeneracy of \(D_{\beta _k}(\infty )\), it suffices to show that \({\mathbb {P}}\left\{ D_{\beta _k}(\infty ) = 0 \right\} > 0\). For this, note that

$$\begin{aligned} {\mathbb {P}}\left\{ D_{\beta _k}(\infty ) = 0 \right\}= & {} \lim _{n \rightarrow \infty }{\mathbb {P}}\left\{ (D_O\beta _k)({\mathcal {P}}_n) = 0 \right\} \\\ge & {} \lim _{n \rightarrow \infty } {\mathbb {P}}\left\{ {\mathcal {P}}_n(B_O(2r))=0 \right\} \\= & {} \exp (-\omega _d(2r)^d) \\> & {} 0. \end{aligned}$$

\(\square \)
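The explicit constants in the final display are easy to evaluate. The following sketch computes \(\omega _d = \pi ^{d/2}/\Gamma (d/2+1)\), the volume of the unit d-ball, and the vacancy probability \(\exp (-\omega _d(2r)^d)\) for illustrative values of d and r:

```python
import math

# omega_d = pi^(d/2) / Gamma(d/2 + 1): volume of the unit ball in R^d.
def unit_ball_volume(d):
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

# Probability that a unit-intensity Poisson process puts no points in a
# ball of radius 2r, i.e. exp(-omega_d (2r)^d), as in the last display.
def vacancy_probability(d, r):
    return math.exp(-unit_ball_volume(d) * (2 * r) ** d)

assert abs(unit_ball_volume(2) - math.pi) < 1e-12          # omega_2 = pi
assert abs(unit_ball_volume(3) - 4 * math.pi / 3) < 1e-12  # omega_3 = 4pi/3
assert 0 < vacancy_probability(3, 0.5) < 1
```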