1 Introduction and results

1.1 Interacting gases

The joint probability density of n independent and identically distributed points in \(\mathbb {R}^d\), drawn from a common strictly positive density, can always be represented as

$$\begin{aligned} \frac{1}{Z}\exp \left( -\beta \sum _{i=1}^n V(x_i)\right) \end{aligned}$$

where V is some real-valued function on \(\mathbb {R}^d\), \(\beta \) is some positive parameter, and Z is the normalizing constant.

Suppose that we want to introduce some interactions between the points. The simplest way to do that is to introduce a pairwise interaction term in the exponent; the new density is of the form

$$\begin{aligned} \frac{1}{Z}\exp \left( -\beta \sum _{1\le i<j\le n} w(x_i,x_j) - \beta n \sum _{i=1}^n V(x_i)\right) , \end{aligned}$$

where w is a symmetric real-valued function on \(\mathbb {R}^d\times \mathbb {R}^d\), known as the interaction potential. The factor n in front of the second term ensures that the two terms are of comparable size, which is necessary for the system to have nontrivial behavior in the large n limit. A particularly important class of interaction potentials consists of the Coulomb potentials, defined as

$$\begin{aligned} w(x,y) = {\left\{ \begin{array}{ll} -|x-y| &{}\quad \text {if }\; d=1,\\ -\log |x-y| &{}\quad \text {if }\; d=2,\\ |x-y|^{2-d} &{}\quad \text {if }\; d\ge 3, \end{array}\right. } \end{aligned}$$

where \(|x-y|\) is the Euclidean distance between x and y. With w as above and \(V(x)=|x|^2\), we get the so-called Coulomb gases.
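For readers who wish to experiment numerically, here is a minimal Python sketch (not part of the formal development; the function names are ad hoc) that evaluates the exponent of the density above for a given configuration, with the Coulomb potential in each dimension.

```python
import numpy as np

def coulomb_potential(x, y, d):
    # The Coulomb potential w(x, y) displayed above, in dimension d.
    r = np.linalg.norm(x - y)
    if d == 1:
        return -r
    if d == 2:
        return -np.log(r)
    return r ** (2 - d)

def log_density(points, beta, V=lambda x: np.dot(x, x)):
    # Unnormalized log-density: -beta * (sum_{i<j} w(x_i, x_j) + n * sum_i V(x_i)).
    n, d = points.shape
    interaction = sum(coulomb_potential(points[i], points[j], d)
                      for i in range(n) for j in range(i + 1, n))
    confinement = n * sum(V(x) for x in points)
    return -beta * (interaction + confinement)

rng = np.random.default_rng(0)
print(log_density(rng.standard_normal((10, 3)), beta=1.0))
```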

The 1D Coulomb gas is a very well-understood exactly solvable system, studied thoroughly by physicists [37, 60, 66, 67] and mathematicians [1, 29]. In higher dimensions, much less is known. For \(\beta =1\), the 2D Coulomb gas is an exactly solvable model due to its relationship with the Ginibre ensemble of random matrices [49]. The Ginibre ensemble has received widespread attention from mathematicians [2, 3, 18, 26, 27, 47, 50, 80, 87]. For general \(\beta \), however, the 2D Coulomb gas has no representation as an exactly solvable model. Fortunately, a number of results are now known for the case of general \(\beta \). Large deviation principles for the 2D Coulomb gas were proved in [11, 52, 78], and extended to general dimensions in [34, 83]. Concentration inequalities were proved in [33] and dynamical properties have been recently studied in [17]. The ground state in a related model was studied in [79]. Local properties have been studied in great depth in the recent papers [6, 7, 62, 64, 82].

In dimensions three and higher, very precise information about the normalizing constants has been obtained in [63, 81]. For a comprehensive survey, see [84]. Further results are provable by the techniques of these papers but have not been written up yet, as I learned from Sylvia Serfaty in a personal communication.

Another widely studied example is the 1D log gas, where \(d=1\), \(w(x,y)=-\log |x-y|\) and \(V(x)=x^2\). For \(\beta =1,2\) and 4, the log gases arise as eigenvalues of various random matrix ensembles and are exactly solvable. Precise fluctuation estimates for these special values of \(\beta \) were obtained in [57]. There is now considerable information available about other values of \(\beta \) and more general V [10, 23–25, 28, 85]. Asymptotic series expansions for the normalizing constants were computed in [20–22]. Central limit theorems have been investigated in [9, 19, 61]. For an introduction to log gases and their connections with random matrices, see [4, 36, 42]. A recent survey is given in [6].

1.2 Hyperuniformity

If we have a collection of n independent and identically distributed points in \(\mathbb {R}^d\), then the number of points that fall in a given set typically has fluctuations of order \(n^{1/2}\) as \(n\rightarrow \infty \). If, for a random point process, these fluctuations are instead of order \(o(n^{1/2})\), the process is called ‘rigid’ or ‘hyperuniform’. More generally, a point process is called hyperuniform if its empirical measure has smaller fluctuations than the empirical measure of a collection of i.i.d. random points. In this paper I will use the terms ‘rigidity’ and ‘hyperuniformity’ interchangeably, although hyperuniformity is probably the more suitable term for the phenomenon described above, since rigidity has also been used to mean other things in the literature.

Sometimes point processes are very rigid. For example, for the eigenvalues of various random matrix ensembles, the order of fluctuations may be as small as \(O(\sqrt{\log n})\), or even O(1) if one considers integrals of the empirical measure with respect to smooth functions. Rigidity/hyperuniformity has been established for many processes in dimensions one and two. For example, rigidity of the eigenvalues of random unitary matrices was proved in [38, 91]. Rigidity of eigenvalues in the standard Hermitian random matrix ensembles follows from the results of [23–25, 35, 43, 76, 88]. Rigidity of eigenvalues of non-Hermitian random matrices has been studied in [18, 26, 27, 47, 89]. Another class of 2D processes that exhibit hyperuniformity consists of zeros of random analytic functions; this has been investigated in [44, 45, 47, 48, 54, 73–75]. In recent work, rigidity of the 2D Coulomb gas has been established in [6, 7, 62].

However, no such results for interacting particle systems of the above kind are known in dimensions three and higher. The only random point process which has been shown to be hyperuniform in any dimension \(d\ge 3\), as far as I know, is the point process obtained by giving i.i.d. random perturbations to the vertices of \(\mathbb {Z}^d\). This is a recent result [77], improving on an earlier work in \(d\le 2\) [53]. The notion of rigidity in these papers is somewhat different from hyperuniformity. For interacting systems such as Coulomb gases, the detailed information about the normalizing constants obtained in [81, 84] provides some control on the order of fluctuations in \(d\ge 3\), but does not establish that the order of fluctuations is smaller than \(n^{1/2}\). There is a remark in [62] that the 2D techniques of that paper can be extended to higher dimensions for proving rigidity of integrals of smooth functions with respect to the empirical measure of a Coulomb gas, but the details have not yet been written up.

A class of processes that are related to Coulomb gases in dimension one, but not in higher dimensions, is the class of so-called orthogonal polynomial ensembles (see [59] for a survey). These are generalizations of the 1D determinantal point processes arising in random matrix theory, and have nice mathematical structures that allow various exact calculations. A general central limit theorem for orthogonal polynomial ensembles was proved in [86]. Rigidity/hyperuniformity for orthogonal polynomial ensembles (beyond random matrix eigenvalues) has been investigated in recent years, for example in [16, 30, 31, 56] in dimensions one and two, and [5] in dimension three.

There is a considerable amount of work by physicists on hyperuniformity. For example, [72] and [71] give physics proofs of hyperuniformity in 3D Coulomb systems. A non-rigorous computation of covariances in Coulomb systems in all dimensions greater than one was given in [65]. More recently, a physics proof of hyperuniformity of free fermions at zero temperature (a certain kind of determinantal point process) was given in [32] in \(d\le 3\), based on an asymptotic formula for the variance of the number of points falling in a given region. This formula was later extended to arbitrary dimensions in [90]. Similar formulas have been very recently obtained for the 1D log gas (with general \(\beta \) and special V) in [69, 70]. For an extensive list of references to the physics literature, see the recent survey [46].

1.3 The hierarchical Coulomb gas model

In this paper, we consider a model of an interacting gas of n particles in the 3D unit cube \([0,1]^3\), which have joint probability density

$$\begin{aligned} \frac{1}{Z(n,\beta )}\exp \left( -\beta \sum _{1\le i<j\le n}w(x_i,x_j)\right) , \end{aligned}$$
(1.1)

where \(w(x,y)\) is a symmetric potential that behaves like the Coulomb potential \(|x-y|^{-1}\) at short distances, and \(Z(n,\beta )\) is the normalizing constant. The potential w is defined as follows.

The unit cube in \(\mathbb {R}^3\) can be partitioned into 8 sub-cubes of side-length 1/2. Each of these sub-cubes can be further partitioned into 8 sub-cubes of side-length 1/4, and so on, generating a tree of dyadic sub-cubes. For any two distinct points x and y in the unit cube, let \(w(x,y)=2^k\), where k is the smallest number such that x and y belong to distinct dyadic sub-cubes of side-length \(2^{-k}\). There may be some ambiguity about points on the boundaries of the cubes, but since they form a set of measure zero, they do not matter. This w is our potential, which defines our point process through the density (1.1). Note that w is symmetric but not translation invariant.

For typical x and y that are close together, \(w(x,y)\) behaves like a multiple of the Coulomb potential \(|x-y|^{-1}\). Indeed, it is not hard to prove that there is a constant C such that for all x and y,

$$\begin{aligned} w(x,y)\le \frac{C}{|x-y|}. \end{aligned}$$

Conversely, there is a constant \(c>0\) such that for any \(0<\delta <1\), the average value of \(w(x,y)\) over all pairs (x, y) with \(|x-y|=\delta \) is bounded below by \(c/\delta \).
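The following Python sketch (illustrative only, and not part of the proofs) computes w and checks the first bound empirically: whenever \(w(x,y)=2^k\), the two points share a dyadic cube of side \(2^{-k+1}\), so \(w(x,y)\,|x-y|\le 2\sqrt{3}\).

```python
import numpy as np

def separation_level(x, y):
    # Smallest k such that x and y lie in distinct dyadic sub-cubes of
    # [0,1)^3 of side length 2^{-k}; assumes x != y.
    k = 1
    while np.array_equal(np.floor(x * 2 ** k), np.floor(y * 2 ** k)):
        k += 1
    return k

def w(x, y):
    return 2.0 ** separation_level(x, y)

rng = np.random.default_rng(0)
pairs = rng.random((10000, 2, 3))
ratios = [w(x, y) * np.linalg.norm(x - y) for x, y in pairs]
print(max(ratios))  # never exceeds 2 * sqrt(3) ~ 3.46
```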

Replacing the Euclidean distance by a hierarchical distance as above is a famous idea of Dyson [40, 41], who formulated and analyzed a hierarchical version of the 1D Ising model with long range interactions. This is now known as ‘Dyson’s hierarchical model’. Dyson’s work has inspired a large body of literature on hierarchical models over the years, and they remain an active area of research. The model proposed above is sometimes called the ‘hierarchical Coulomb gas’. The 2D hierarchical Coulomb gas has received considerable attention in the mathematical physics literature [14, 15, 39, 51, 58, 68]. However, not much is known about this model in dimensions three and higher.

Just as the Coulomb potential is the Green’s function for Brownian motion, the potential w can also be realized as the Green’s function of a certain continuous time random walk on the unit cube, following a method developed in [12, 13] for constructing Markov semigroups on ultrametric spaces. More generally, the prescription given in [12, 13] can be used for a large class of hierarchical potentials arising from Dyson-type constructions.

The chief reason why the hierarchical structure of the potential helps in the analysis is that it does an automatic ‘coarse-graining’ of the interactions. The total interaction between the particles in two disjoint dyadic cubes is determined solely by the numbers of particles in those cubes, rather than their exact locations.
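As a toy illustration of this coarse-graining (with arbitrarily chosen cubes, and w as in the sketch above), every pair of points lying in two fixed disjoint dyadic cubes has the same interaction, so the total cross-interaction is that common value times the product of the two particle counts.

```python
import numpy as np

def w(x, y):  # hierarchical potential, as in the previous sketch
    k = 1
    while np.array_equal(np.floor(x * 2 ** k), np.floor(y * 2 ** k)):
        k += 1
    return 2.0 ** k

rng = np.random.default_rng(1)
side = 2.0 ** -3  # two disjoint dyadic cubes of side 1/8
pts_D = np.array([0.0, 0.0, 0.0]) + side * rng.random((5, 3))
pts_E = np.array([0.25, 0.5, 0.125]) + side * rng.random((7, 3))

cross = {w(x, y) for x in pts_D for y in pts_E}
print(cross)  # a single value w0: the cross energy is w0 * 5 * 7
```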

One of our main results, stated in the next subsection, is that if U is a nonempty open subset of the unit cube with a nicely behaved boundary, then the number of points falling in U has fluctuations of order at most \(n^{1/3}\sqrt{\log n}\), thereby establishing the hyperuniformity of our point process. This is matched up to a logarithmic factor by a lower bound of order \(n^{1/3}\). We also establish microscopic hyperuniformity in a local neighborhood of any given point. Finally, the analogous results in dimensions one and two are established for the sake of completeness.

1.4 Results in 3D

Take any \(d\ge 1\), and let U be a nonempty open subset of \(\mathbb {R}^d\). Let \(\partial U\) be the boundary of U. For each \(\epsilon >0\), let \(\partial U_\epsilon \) be the set of all points that are at distance \(\le \epsilon \) from \(\partial U\). Let \(\hbox {diam}(U)\) denote the diameter of U. We will say that the boundary of U is regular if there is some constant C such that for all \(0< \epsilon \le \hbox {diam}(U)\),

$$\begin{aligned} \hbox {Leb}(\partial U_\epsilon ) \le C\epsilon , \end{aligned}$$
(1.2)

where \(\hbox {Leb}\) stands for Lebesgue measure.
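To make (1.2) concrete, here is a quick Monte Carlo check (purely illustrative, for the arbitrary choice of U as a ball) that the ratio \(\hbox {Leb}(\partial U_\epsilon )/\epsilon \) indeed stays bounded:

```python
import numpy as np

rng = np.random.default_rng(2)
center, r = np.array([0.5, 0.5, 0.5]), 0.3   # U = ball of radius 0.3
pts = rng.random((10 ** 6, 3))
dist_to_bdry = np.abs(np.linalg.norm(pts - center, axis=1) - r)
for eps in [0.1, 0.05, 0.01]:
    vol = np.mean(dist_to_bdry <= eps)       # estimates Leb(boundary strip)
    print(eps, vol / eps)                    # roughly constant, about 8*pi*r^2
```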

Now let \(d=3\), and let U be a nonempty open subset of \([0,1]^3\) whose boundary is regular in the sense defined above. Take any \(n\ge 2\) and \(\beta >0\), and consider an interacting gas of n particles behaving according to the model defined above. Let N(U) be the number of particles that fall in U. Our first theorem says that the gas is macroscopically hyperuniform in the sense that N(U) has fluctuations of order at most \(n^{1/3}\sqrt{\log n}\), instead of \(n^{1/2}\) as would be the case for a gas of i.i.d. particles.

Theorem 1.1

(Macroscopic hyperuniformity in 3D) Let U and N(U) be as above. Then

$$\begin{aligned} \mathbb {E}(N(U)) = \hbox {Leb}(U)n \end{aligned}$$

and

$$\begin{aligned} \mathrm {Var}(N(U))&\le C(U,\beta )n^{2/3}\log n, \end{aligned}$$

where \(C(U,\beta )\) is a constant that depends only on U and \(\beta \).

The next theorem shows that when \(\partial U\) is smooth, \(n^{1/3}\) is actually the correct order of fluctuations of N(U), up to possible logarithmic corrections.

Theorem 1.2

(Lower bound in 3D) Let U be a nonempty connected open subset of \([0,1]^3\) whose boundary is a smooth, closed, orientable surface. Let N(U) be as in Theorem 1.1. Then N(U) has fluctuations of order at least \(n^{1/3}\), in the sense that there are three constants \(n_0\ge 1\), \(c_1>0\) and \(c_2<1\), depending only on U and \(\beta \), such that for any \(n\ge n_0\) and any \(-\infty<a\le b<\infty \) with \(b-a\le c_1n^{1/3}\), we have \(\mathbb {P}(a\le N(U)\le b) \le c_2\).

Incidentally, the \(n^{1/3}\) order of fluctuations matches a well-known prediction from physics [55, 65, 72] for the 3D Coulomb gas model (see also [75]). The 1/3 exponent is also reminiscent of a famous classical result [8] about irregularities of distribution for arbitrary sequences of points in Euclidean space.

Let us now turn our attention to hyperuniformity in the microscopic scale. Take any point \(x\in (0,1)^3\). Blow up the neighborhood of x by a factor of \(n^{1/3}\) by applying the blow-up map \(y\mapsto n^{1/3}(y-x)\) to the points in our interacting gas. Since the original process had an expected density of n particles per unit volume, the new process has an expected density of one particle per unit volume. Studying the blown up process is the standard way of investigating the local behavior of interacting gases [84].

Let U be a nonempty open subset of \(\mathbb {R}^3\) whose boundary is regular in the sense defined above. For each \(\lambda >0\), let \(\lambda U\) denote the set \(\{\lambda y:y\in U\}\), and let \(N_x(\lambda U)\) be the number of points from the blown up process that land in \(\lambda U\). The following theorem shows that for \(\lambda \gg 1\), \(N_x(\lambda U)\) has fluctuations of order at most \(\lambda \sqrt{\log \lambda }\). This is smaller than \(\lambda ^{3/2}\), the corresponding order of fluctuations for a Poisson point process. This proves the hyperuniformity of our interacting gas at the microscopic scale.

Theorem 1.3

(Microscopic hyperuniformity in 3D) Let U and \(N_x(\lambda U)\) be as above. Then for any \(\lambda \) such that \(\hbox {diam}(\lambda U)\ge 1\),

$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbb {E}(N_x(\lambda U)) = \hbox {Leb}(\lambda U) = \lambda ^3\hbox {Leb}(U), \end{aligned}$$

and

$$\begin{aligned} \limsup _{n\rightarrow \infty } \mathrm {Var}(N_x(\lambda U))&\le C(U, \beta )\lambda ^2\log (4\lambda \hbox {diam}(U)), \end{aligned}$$

where \(C(U, \beta )\) is a constant that depends only on U and \(\beta \).

Theorems 1.1 and 1.3 are both special cases of a more general theorem (Theorem 2.13 in Sect. 2.4), which gives hyperuniformity at all scales.

Finally, let us consider linear statistics. Any function \(f:[0,1]^3\rightarrow \mathbb {R}\) defines a linear statistic

$$\begin{aligned} X(f) := \sum _{i=1}^n f(X_i), \end{aligned}$$
(1.3)

where \(X_1,\ldots , X_n\) is a realization of our point process. In particular, N(U) is a linear statistic, with f being the indicator function of U. We have the following two theorems about fluctuations of linear statistics for sufficiently regular f. The results are not as definitive as the other results of this section, since the upper and lower bounds do not match.

If f is Lipschitz, we get the following slight improvement of the bound given in Theorem 1.1.

Theorem 1.4

(Upper bound for linear statistics in 3D) Suppose that \(f:[0,1]^3\rightarrow \mathbb {R}\) is a Lipschitz function with Lipschitz constant L. Let \(X_1,\ldots ,X_n\) be a realization of points from our model in dimension three. Let X(f) be the linear statistic defined in (1.3). Then

$$\begin{aligned} \mathrm {Var}(X(f))\le C(\beta )L^2n^{2/3}, \end{aligned}$$

where \(C(\beta )\) is a constant that depends only on \(\beta \).

The next theorem gives a lower bound of order \(n^{1/6}\) on the order of fluctuations of X(f) when f is a non-constant linear function. This does not match the upper bound from Theorem 1.4, but is nonetheless growing polynomially in n, deviating from the O(1) rate for smooth linear statistics in dimensions one and two [6, 7, 23–27, 35, 38, 57, 62, 88, 91].

Theorem 1.5

(Lower bound for linear statistics in 3D) Let \(f:[0,1]^3\rightarrow \mathbb {R}\) be a non-constant linear function, and let X(f) be as in (1.3). Then X(f) has fluctuations of order at least \(n^{1/6}\), in the sense that there are three constants \(n_0\ge 1\), \(c_1>0\) and \(c_2<1\), depending only on f and \(\beta \), such that for any \(n\ge n_0\) and any \(-\infty<a\le b<\infty \) with \(b-a\le c_1n^{1/6}\), we have \(\mathbb {P}(a\le X(f)\le b) \le c_2\).

It is not clear whether \(n^{1/3}\) or \(n^{1/6}\) is the correct order of fluctuations for smooth linear statistics. Theorem 1.2 does not provide any strong evidence in favor of \(n^{1/3}\), because, as we will see later for the 2D hierarchical Coulomb gas, linear statistics of smooth functions may have much smaller fluctuations than linear statistics of indicator functions. However, there is a recent result [5] which shows that \(n^{1/3}\) is the correct order of fluctuations for smooth linear statistics of a 3D orthogonal polynomial ensemble. Although orthogonal polynomial ensembles are not related to Coulomb type systems in dimension three, this gives some support in favor of \(n^{1/3}\).

1.5 Results in 2D and 1D

In dimension two, we will modify w to mimic the logarithmic potential of the 2D Coulomb gas. This is done by declaring \(w(x,y)\) to be the minimum k such that x and y belong to distinct dyadic sub-squares of \([0,1]^2\) of side-length \(2^{-k}\). We will use the same formula in dimension one as well (with dyadic intervals instead of squares), so that w mimics the logarithmic potential of 1D log gases. With these modifications, we have the following analog of Theorem 1.1. With N(U) as in Theorem 1.1, it says that N(U) has fluctuations of order at most \(n^{1/4}\log n\) in dimension two, and \(\log n\) in dimension one.
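In these dimensions \(w(x,y)=k(x,y)\) is comparable to \(\log _2(1/|x-y|)\) for typical nearby points, which is the precise sense in which it mimics the logarithmic potential. A two-line variant of the earlier sketch (illustrative only) makes this visible:

```python
import numpy as np

def w_log(x, y):
    # d = 1 or 2: w(x, y) = smallest k with x, y in distinct dyadic cells
    # of side 2^{-k}; assumes x != y.
    k = 1
    while np.array_equal(np.floor(x * 2 ** k), np.floor(y * 2 ** k)):
        k += 1
    return k

rng = np.random.default_rng(3)
x, y = rng.random((2, 2))  # two points in [0,1)^2
print(w_log(x, y), np.log2(1.0 / np.linalg.norm(x - y)))  # comparable sizes
```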

Theorem 1.6

(Macroscopic hyperuniformity in 2D and 1D) Consider the model defined above in dimension \(d=1\) or 2. Let U and N(U) be as in Theorem 1.1. Then

$$\begin{aligned} \mathbb {E}(N(U))=\hbox {Leb}(U)n \end{aligned}$$

and

$$\begin{aligned} \mathrm {Var}(N(U))\le C(U,\beta ) n^{(d-1)/d} (\log n)^2, \end{aligned}$$

where \(C(U,\beta )\) is a constant that depends only on U and \(\beta \).

The following theorem shows that in dimension two, N(U) has fluctuations of order at least \(n^{1/4}\), matching the above upper bound up to a logarithmic factor.

Theorem 1.7

(Lower bound in 2D) Let U be a nonempty connected open subset of \([0,1]^2\) whose boundary is a simple, smooth, closed curve. Let N(U) be as in Theorem 1.6. Then N(U) has fluctuations of order at least \(n^{1/4}\), in the sense that there are three constants \(n_0\ge 1\), \(c_1>0\) and \(c_2<1\), depending only on U and \(\beta \), such that for any \(n\ge n_0\) and any \(-\infty<a\le b<\infty \) with \(b-a\le c_1n^{1/4}\), we have \(\mathbb {P}(a\le N(U)\le b) \le c_2\).

Like the \(n^{1/3}\) rate in the 3D case, the \(n^{1/4}\) rate was also predicted in the physics literature [55, 65, 72] for the 2D Coulomb gas. The \(n^{1/4}\) fluctuation in the special case of \(\beta =1\) in the 2D Coulomb gas (corresponding to the exactly solvable Ginibre ensemble) can be established by standard techniques, as I learned from Paul Bourgade in a personal communication.

We also have the following analog of Theorem 1.3. With \(N_x(\lambda U)\) as in Theorem 1.3, it shows that for \(\lambda \gg 1\), \(N_x(\lambda U)\) has fluctuations of order at most \(\lambda ^{1/2}\log \lambda \) in dimension two, and \(\log \lambda \) in dimension one.

Theorem 1.8

(Microscopic hyperuniformity in 2D and 1D) Consider the model defined above in dimension \(d=1\) or 2. Let U and \(N_x(\lambda U)\) be as in Theorem 1.3. Then for any \(\lambda \) such that \(\hbox {diam}(\lambda U)\ge 1\),

$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbb {E}(N_x(\lambda U)) = \hbox {Leb}(\lambda U) = \lambda ^d\hbox {Leb}(U), \end{aligned}$$

and

$$\begin{aligned} \limsup _{n\rightarrow \infty } \mathrm {Var}(N_x(\lambda U))&\le C(U, \beta )\lambda ^{d-1} (\log (7\lambda ^d\hbox {diam}(U)^d))^2, \end{aligned}$$

where \(C(U, \beta )\) is a constant that depends only on U and \(\beta \).

As before, Theorems 1.6 and 1.8 are special cases of a more general theorem (Theorem 3.10 in Sect. 3.4) that gives hyperuniformity at all scales.

Finally, let us consider linear statistics. It has been proved recently in [6, 7, 62] that for the 2D Coulomb gas, linear statistics of smooth functions have O(1) fluctuations. For Lipschitz f, the following theorem shows that for our model in dimension two, the fluctuations of X(f) are at most of order \((\log n)^{3/2}\) instead of \(n^{1/4}\). Unlike Theorem 1.4, this is a big improvement of the bound from Theorem 1.6, and is within a logarithmic factor of the O(1) bound from [6, 7, 62].

Theorem 1.9

(Upper bound for linear statistics in 2D and 1D) Let \(d=1\) or 2. Suppose that \(f:[0,1]^d\rightarrow \mathbb {R}\) is a Lipschitz function with Lipschitz constant L. Let \(X_1,\ldots ,X_n\) be a realization of points from our model in dimension d, and let X(f) be the linear statistic defined in (1.3). Then

$$\begin{aligned} \mathrm {Var}(X(f))\le C(\beta )L^2(\log n)^{d+1}, \end{aligned}$$

where \(C(\beta )\) is a constant that depends only on \(\beta \).

2 Proofs in 3D

The rest of this paper is devoted to proofs. In this section, we will prove the theorems of Sect. 1.4.

2.1 Notation

It is helpful to define some precise notation and terminology. For a slight technical convenience, we will replace the unit cube by the half-open unit cube \([0,1)^3\). Clearly, this will not alter the conclusions.

A dyadic sub-interval of the half-open unit interval [0, 1) is an interval of the form \([i2^{-k}, (i+1)2^{-k})\), where \(k\ge 0\) and \(0\le i\le 2^k-1\). A dyadic sub-cube of the half-open unit cube \([0,1)^3\) is a sub-cube of the form \(I_1\times I_2\times I_3\), where \(I_1\), \(I_2\) and \(I_3\) are dyadic sub-intervals of [0, 1) of equal length. Let \(\mathcal {D}_k\) be the set of all dyadic sub-cubes of \([0,1)^3\) of side-length \(2^{-k}\), and let

$$\begin{aligned} \mathcal {D}:= \bigcup _{k=0}^\infty \mathcal {D}_k \end{aligned}$$

be the set of all dyadic sub-cubes of \([0,1)^3\). Then \(\mathcal {D}\) has a natural tree structure, with each node having 8 children. We will freely use the terms ‘child’, ‘parent’, ‘ancestor’ and ‘descendant’ with respect to this tree.

For any two distinct points \(x,y\in [0,1)^3\), let \(k(x,y)\) be the smallest k such that x and y belong to distinct elements of \(\mathcal {D}_k\). Then our potential w is the function \(w(x,y) = 2^{k(x,y)}\). For \(x=y\), let \(w(x,y)=\infty \).

For each \(n\ge 2\), let \(\Sigma _n\) be the set of all n-tuples of points from \([0,1)^3\). Define the energy of a configuration \((x_1,\ldots , x_n)\in \Sigma _n\) as

$$\begin{aligned} H_n(x_1,\ldots , x_n) := \sum _{1\le i<j\le n} w(x_i, x_j). \end{aligned}$$

For \(\beta >0\), let \(\mu _{n,\beta }\) be the probability measure on \(\Sigma _n\) that has density

$$\begin{aligned} \frac{1}{Z(n,\beta )}e^{-\beta H_n(x_1,\ldots , x_n)} \end{aligned}$$

with respect to Lebesgue measure on \(\Sigma _n\), where \(Z(n,\beta )\) is the normalizing constant. The measure \(\mu _{n,\beta }\) defines our model of an interacting gas at inverse temperature \(\beta \).

For certain technical reasons, we will also define the model for \(n=0\) and \(n=1\). When \(n=0\), there are no points. When \(n=1\), there is one point which is uniformly distributed in the cube. We will let \(Z(0,\beta )=Z(1,\beta )=1\) for any \(\beta \).

2.2 Preliminary calculations

In the following, all integrals are over \([0,1)^3\) and all double integrals are over \([0,1)^3\times [0,1)^3\), unless otherwise specified.

Lemma 2.1

For each \(x\in [0,1)^3\),

$$\begin{aligned} \int w(x,y) \, dy = \frac{7}{3}. \end{aligned}$$

Consequently,

$$\begin{aligned} \iint w(x,y) \, dx\, dy = \frac{7}{3}. \end{aligned}$$

Proof

Take any x. For each k, let \(D_k\) be the element of \(\mathcal {D}_k\) that contains x. It is easy to see that the set of all y with \(w(x,y)=2^k\) is exactly the union of all members of \(\mathcal {D}_k\) that are contained in \(D_{k-1}\), except the one that contains x. The Lebesgue measure of this set is \(8^{-k}\cdot 7\). Thus,

$$\begin{aligned} \int w(x,y)\, dy = 7\sum _{k=1}^\infty 2^k8^{-k} = \frac{7}{3}. \end{aligned}$$

The second assertion is obvious from the first. \(\square \)
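A quick Monte Carlo corroboration of the lemma (illustrative only; w is recomputed as in the sketch of Sect. 1.3):

```python
import numpy as np

def w(x, y):
    k = 1
    while np.array_equal(np.floor(x * 2 ** k), np.floor(y * 2 ** k)):
        k += 1
    return 2.0 ** k

rng = np.random.default_rng(4)
x = rng.random(3)  # by the lemma, any fixed x gives the same answer
est = np.mean([w(x, rng.random(3)) for _ in range(100000)])
print(est, 7 / 3)  # Monte Carlo estimate of the integral vs 7/3
```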

Let us now investigate energy-minimizing configurations of finite size. Henceforth, \(L_n\) will denote the minimum possible energy of a configuration of n points. The following result gives upper and lower bounds for \(L_n\).

Theorem 2.2

There is a positive constant \(C_1\) such that for each \(n\ge 2\),

$$\begin{aligned} {n\atopwithdelims ()2}\frac{7}{3} - C_1n^{4/3}\le L_n \le {n\atopwithdelims ()2}\frac{7}{3}. \end{aligned}$$

Proof

Let \(Y_1,\ldots , Y_n\) be i.i.d. uniform random points from \([0,1)^3\). Then by symmetry,

$$\begin{aligned} L_n \le \mathbb {E}(H_n(Y_1,\ldots , Y_n)) = \sum _{1\le i<j\le n} \mathbb {E}(w(Y_i, Y_j)) = {n\atopwithdelims ()2} \mathbb {E}(w(Y_1,Y_2)). \end{aligned}$$

By Lemma 2.1, \(\mathbb {E}(w(Y_1,Y_2)) = 7/3\). This proves the upper bound. For the lower bound, let k be an integer such that

$$\begin{aligned} n^{-1/3}\le 2^{-k}\le 2n^{-1/3}. \end{aligned}$$

Take any configuration of n points. For each \(D\in \mathcal {D}\), let \(n_D\) be the number of points in D. Summing up the contributions to the energy from each cube, it is not difficult to see that

$$\begin{aligned} H_n(x_1,\ldots , x_n)&= \sum _{j=1}^\infty \sum _{D\in \mathcal {D}_j} 2^j{n_D\atopwithdelims ()2} + 2{n\atopwithdelims ()2}\ge \sum _{j=1}^k \sum _{D\in \mathcal {D}_j} 2^j{n_D\atopwithdelims ()2} + 2{n\atopwithdelims ()2}\\&= \sum _{j=1}^k \sum _{D\in \mathcal {D}_j} 2^{j-1}n_D^2 - \sum _{j=1}^k 2^{j-1}n + 2{n\atopwithdelims ()2}. \end{aligned}$$

By the Cauchy–Schwarz inequality, for each j,

$$\begin{aligned} \sum _{D\in \mathcal {D}_j} n_D^2 \ge \frac{1}{|\mathcal {D}_j|}\left( \sum _{D\in \mathcal {D}_j} n_D\right) ^2 = \frac{n^2}{8^j}. \end{aligned}$$

Thus,

$$\begin{aligned} H_n(x_1,\ldots , x_n)&\ge \frac{n^2}{2}\sum _{j=1}^k 4^{-j} - n^{4/3} + 2{n\atopwithdelims ()2}= \frac{n^2}{6}(1-4^{-k})- n^{4/3} + 2{n\atopwithdelims ()2}\\&\ge \frac{n^2}{6}(1-4n^{-2/3}) -n^{4/3} + 2{n\atopwithdelims ()2}. \end{aligned}$$

Since this lower bound holds for any configuration of n points, this completes the proof. \(\square \)
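The first display in the proof above is just the coarse-graining identity of Sect. 1.3 in quantitative form. The following sketch (illustrative) verifies it on a random configuration by comparing the pairwise sum with the sum over the dyadic tree:

```python
import numpy as np
from itertools import combinations
from collections import Counter

def level(x, y):
    # smallest k with x, y in distinct dyadic cubes of side 2^{-k}
    k = 1
    while np.array_equal(np.floor(x * 2 ** k), np.floor(y * 2 ** k)):
        k += 1
    return k

rng = np.random.default_rng(5)
pts = rng.random((40, 3))
n = len(pts)
H_direct = sum(2.0 ** level(x, y) for x, y in combinations(pts, 2))

H_tree = 2 * n * (n - 1) / 2  # the 2 * binom(n, 2) term
j = 1
while True:
    counts = Counter(tuple(np.floor(p * 2 ** j).astype(int)) for p in pts)
    if max(counts.values()) <= 1:  # all points separated; deeper levels add 0
        break
    H_tree += 2.0 ** j * sum(m * (m - 1) / 2 for m in counts.values())
    j += 1

print(H_direct, H_tree)  # equal up to floating point error
```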

2.3 Estimates for the partition function

The following lemma gives important information about the ratio \(Z(n+1,\beta )/Z(n,\beta )\). Theorem 2.2 is a crucial ingredient in the proof of this lemma. Recall that \(Z(0,\beta )=Z(1,\beta )=1\). For a measurable function \(f: \Sigma _n\rightarrow \mathbb {R}\), we will denote its expected value under \(\mu _{n,\beta }\) by \(\mu _{n,\beta }(f)\).

Lemma 2.3

There is a constant \(C_2\) such that for any \(n\ge 0\) and \(\beta >0\),

$$\begin{aligned} e^{-7\beta n/3}\le \frac{Z(n+1, \beta )}{Z(n, \beta )}\le e^{-7\beta n/3 + C_2\beta n^{1/3}}. \end{aligned}$$

Proof

First suppose that \(n\ge 2\). For \(x_1,\ldots , x_n, x_{n+1}\in [0,1)^3\), let

$$\begin{aligned} f_n(x_1,\ldots ,x_n, x_{n+1}) := \sum _{i=1}^n w(x_i, x_{n+1}), \end{aligned}$$

so that

$$\begin{aligned} H_{n+1}(x_1,\ldots ,x_{n+1}) = f_n(x_1,\ldots ,x_n, x_{n+1}) + H_n(x_1,\ldots ,x_n). \end{aligned}$$

By the above representation and Jensen’s inequality,

$$\begin{aligned} \frac{Z(n+1, \beta )}{Z(n,\beta )}&= \iint e^{-\beta f_n(x_1,\ldots ,x_n, x_{n+1})} \, dx_{n+1}\, d\mu _{n,\beta }(x_1,\ldots ,x_n) \\&\ge \exp \left( -\beta \iint f_n(x_1,\ldots ,x_n, x_{n+1}) \, dx_{n+1}\, d\mu _{n,\beta }(x_1,\ldots ,x_n) \right) . \end{aligned}$$

But by Lemma 2.1,

$$\begin{aligned}&\iint f_n(x_1,\ldots ,x_n,x_{n+1}) \, dx_{n+1}\, d\mu _{n,\beta }(x_1,\ldots ,x_n) \\&\quad =\sum _{i=1}^n \iint w(x_i, x_{n+1}) \, dx_{n+1}\, d\mu _{n,\beta }(x_1,\ldots ,x_n) \\&\quad = \sum _{i=1}^n \int \frac{7}{3}\, d\mu _{n,\beta }(x_1,\ldots ,x_n) = \frac{7n}{3}. \end{aligned}$$

This gives the desired lower bound. Next, note that

$$\begin{aligned} \frac{Z(n,\beta )}{Z(n+1,\beta )} = \mu _{n+1,\beta }(e^{\beta f_n(x_1,\ldots , x_n, x_{n+1})}). \end{aligned}$$

Therefore by Jensen’s inequality and the invariance of \(\mu _{n+1,\beta }\) under permutations of coordinates,

$$\begin{aligned} \frac{Z(n,\beta )}{Z(n+1,\beta )}&\ge \exp (\beta \mu _{n+1,\beta }(f_n(x_1,\ldots , x_{n+1})))= \exp (\beta n \mu _{n+1,\beta } (w(x_1,x_{n+1})))\\&= \exp \left( \frac{\beta n}{{n+1\atopwithdelims ()2}} \sum _{1\le i<j\le n+1} \mu _{n+1,\beta } (w(x_i,x_j))\right) \\&= \exp \left( \frac{\beta n}{{n+1\atopwithdelims ()2}} \mu _{n+1,\beta }(H_{n+1}(x_1,\ldots ,x_{n+1}))\right) . \end{aligned}$$

But by Theorem 2.2,

$$\begin{aligned} \mu _{n+1,\beta }(H_{n+1}(x_1,\ldots ,x_{n+1})) \ge L_{n+1} \ge \frac{7}{3}{n+1\atopwithdelims ()2} - C_1 (n+1)^{4/3}. \end{aligned}$$

This gives the required upper bound and completes the proof of the lemma for \(n\ge 2\). When \(n=0\), the bounds hold trivially. When \(n=1\), the lower bound follows from an application of Jensen’s inequality and Lemma 2.1. The upper bound can be forced to hold for \(n=1\) by choosing \(C_2\) sufficiently large. \(\square \)

Lemma 2.3 is iterated to obtain the following corollary.

Corollary 2.4

For any \(n\ge 0\), \(\beta >0\), and any \(k\ge -n\),

$$\begin{aligned} \frac{Z(n+k, \beta )}{Z(n,\beta )} \le \exp \left( -\frac{7\beta nk}{3} -\frac{7\beta k(k-1)}{6} + C_2 \beta |k| (n+|k|)^{1/3}\right) , \end{aligned}$$

where \(C_2\) is the constant from Lemma 2.3.

Proof

First suppose that \(k\ge 0\). By the upper bound from Lemma 2.3,

$$\begin{aligned} \frac{Z(n+k, \beta )}{Z(n,\beta )}&= \prod _{i=0}^{k-1}\frac{Z(n+i+1,\beta )}{Z(n+i,\beta )}\\&\le \prod _{i=0}^{k-1}\exp \left( -\frac{7\beta (n+i)}{3} + C_2\beta (n+i)^{1/3}\right) \\&\le \exp \left( -\frac{7\beta nk}{3} -\frac{7\beta k(k-1)}{6} + C_2 \beta k (n+k)^{1/3}\right) . \end{aligned}$$

Next, suppose that \(k<0\). Let \(l= |k|\). Then by the lower bound from Lemma 2.3,

$$\begin{aligned} \frac{Z(n+k, \beta )}{Z(n,\beta )}&= \prod _{i=0}^{l-1} \frac{Z(n-i-1,\beta )}{Z(n-i,\beta )}\le \prod _{i=0}^{l-1}\exp \left( \frac{7\beta (n-i-1)}{3}\right) \\&= \exp \left( \frac{7\beta n l}{3} -\frac{7\beta l(l+1)}{6}\right) . \end{aligned}$$

To complete the proof, note that \(l=-k\) and \(l(l+1)= k(k-1)\). \(\square \)

2.4 Proofs of the upper bounds

Let us now fix some \(n\ge 0\) and \(\beta >0\). In the following, \((X_1,\ldots , X_n)\) will denote a random configuration drawn from the measure \(\mu _{n,\beta }\). We will assume that \((X_1,\ldots ,X_n)\) is defined on some abstract probability space \((\Omega , \mathcal {F}, \mathbb {P})\). Expectation, variance and covariance with respect to \(\mathbb {P}\) will be denoted by \(\mathbb {E}\), \(\mathrm {Var}\) and \(\mathrm {Cov}\) respectively.

Lemma 2.5

Let \(D_1,\ldots ,D_8\) denote the 8 elements of \(\mathcal {D}_1\), and for each \(1\le i\le 8\), let \(N_i := |\{j: X_j\in D_i\}|\). Then for each i, \(\mathbb {E}(N_i)= n/8\) and

$$\begin{aligned} \mathrm {Var}(N_i)\le K(\beta ) n^{2/3}, \end{aligned}$$

where \(K(\beta )\) is a non-increasing function of \(\beta \).

Proof

We have already defined universal constants \(C_1\) and \(C_2\) in the previous subsections. In this proof, we will continue to use this convention and denote further universal constants by \(C_3, C_4,\ldots \) without explicitly mentioning that they denote universal constants on each occasion.

The identity \(\mathbb {E}(N_i)=n/8\) follows by symmetry. We will now prove the claimed bound on the variance. The cases \(n=0\) and \(n=1\) are trivial, so let us assume that \(n\ge 2\). First, note that the energy of a configuration is the sum of the energies within each \(D_i\), plus the interactions between the \(D_i\)’s. From this observation it is easy to deduce the recursive relation

$$\begin{aligned} Z(n,\beta )&= \sum _{\begin{array}{c} 0\le n_1,\ldots , n_8\le n\\ n_1+\cdots +n_8=n \end{array}} \frac{n!}{n_1!n_2!\ldots n_8!} e^{-2\beta \sum _{1\le i<j\le 8} n_i n_j}\prod _{i=1}^8 (8^{-n_i}Z(n_i,2\beta ))\\&= \sum _{\begin{array}{c} 0\le n_1,\ldots , n_8\le n\\ n_1+\cdots +n_8=n \end{array}} \frac{8^{-n}n!}{n_1!n_2!\ldots n_8!} e^{-2\beta \sum _{1\le i<j\le 8} n_i n_j}\prod _{i=1}^8 Z(n_i,2\beta ). \end{aligned}$$

Moreover, for any \((n_1,\ldots , n_8)\) occurring in the above sum,

$$\begin{aligned}&\mathbb {P}(N_1=n_1,\ldots , N_8=n_8) \\&\quad = \frac{8^{-n}n!}{n_1!n_2!\cdots n_8!} e^{-2\beta \sum _{1\le i<j\le 8} n_i n_j}\frac{\prod _{i=1}^8 Z(n_i,2\beta ) }{Z(n,\beta )}. \end{aligned}$$

Choose nonnegative integers \(m_1,\ldots , m_8\) such that \(m_1+\cdots +m_8=n\) and \(|m_i-n/8|\le 1\) for each i. It is not difficult to see that such integers can be found for any n. For convenience, let

$$\begin{aligned} f(n_1,\ldots , n_8)&:= \frac{n!}{n_1!n_2!\cdots n_8!},\\ g(n_1,\ldots , n_8)&:= e^{-2\beta \sum _{1\le i<j\le 8} n_i n_j} = e^{-\beta n^2 + \beta \sum _{i=1}^8 n_i^2}, \\ h(n_1,\ldots , n_8)&:= \prod _{i=1}^8 Z(n_i,2\beta ). \end{aligned}$$

Take any \(k_1,\ldots ,k_8\in \mathbb {Z}\) such that \(k_1+\cdots +k_8=0\) and \(0\le m_i+k_i\le n\) for each i. Then by Corollary 2.4,

$$\begin{aligned}&\frac{h(m_1+k_1,\ldots , m_8+k_8)}{h(m_1,\ldots , m_8)} \\&\quad \le \prod _{i=1}^8 \exp \left( -\frac{14\beta m_ik_i}{3} -\frac{14\beta k_i(k_i-1)}{6} + 2C_2 \beta |k_i| (n+|k_i|)^{1/3}\right) \\&\quad \le \prod _{i=1}^8 \exp \left( -\frac{14\beta (nk_i/8 - |k_i|)}{3} -\frac{14\beta k_i(k_i-1)}{6} + 4C_2\beta |k_i|n^{1/3}\right) \\&\quad \le \exp \left( -\frac{14\beta }{6}\sum _{i=1}^8 k_i^2 + C_3\beta n^{1/3}\sum _{i=1}^8 |k_i|\right) . \end{aligned}$$

Next, note that

$$\begin{aligned} \frac{g(m_1+k_1,\ldots , m_8+k_8)}{g(m_1,\ldots , m_8)}&= \exp \left( \beta \sum _{i=1}^8 (m_i+k_i)^2 -\beta \sum _{i=1}^8 m_i^2\right) \\&= \exp \left( \beta \sum _{i=1}^8(2m_ik_i + k_i^2)\right) \\&\le \exp \left( \beta \sum _{i=1}^8(2nk_i/8 +2|k_i|+ k_i^2)\right) \\&= \exp \left( \beta \sum _{i=1}^8(2|k_i|+ k_i^2)\right) . \end{aligned}$$

Therefore,

$$\begin{aligned}&\frac{\mathbb {P}(N_1=m_1+k_1,\ldots , N_8=m_8+k_8)}{\mathbb {P}(N_1=m_1,\ldots , N_8=m_8)}\\&\quad \le \frac{f(m_1+k_1,\ldots , m_8+k_8)}{f(m_1,\ldots , m_8)} \exp \left( -\frac{4\beta }{3}\sum _{i=1}^8 k_i^2 + C_4 \beta n^{1/3} \sum _{i=1}^8|k_i|\right) . \end{aligned}$$

This shows that there are positive constants \(C_5\) and \(C_6\) such that if

$$\begin{aligned} \max _{1\le i\le 8} |k_i|\ge C_5n^{1/3}, \end{aligned}$$

then

$$\begin{aligned}&\frac{\mathbb {P}(N_1=m_1+k_1,\ldots , N_8=m_8+k_8)}{\mathbb {P}(N_1=m_1,\ldots , N_8=m_8)}\nonumber \\&\quad \le \frac{f(m_1+k_1,\ldots , m_8+k_8)}{f(m_1,\ldots , m_8)} e^{-C_6\beta n^{2/3}}. \end{aligned}$$
(2.1)

Let A denote the set of all \((n_1,\ldots , n_8)\) such that each \(n_i\) is a nonnegative integer, \(n_1+\cdots +n_8=n\), and

$$\begin{aligned} \max _{1\le i\le 8} |n_i-m_i|\ge C_5 n^{1/3}. \end{aligned}$$

Then by (2.1), for any \((n_1,\ldots , n_8)\in A\),

$$\begin{aligned} \frac{\mathbb {P}(N_1=n_1,\ldots , N_8=n_8)}{\mathbb {P}(N_1=m_1,\ldots , N_8=m_8)}\le \frac{f(n_1,\ldots , n_8)}{f(m_1,\ldots , m_8)} e^{-C_6\beta n^{2/3}}. \end{aligned}$$

Now recall the multinomial formula

$$\begin{aligned} \sum _{\begin{array}{c} 0\le n_1,\ldots ,n_8\le n\\ n_1+\cdots +n_8=n \end{array}} f(n_1,\ldots , n_8) = 8^n. \end{aligned}$$

A simple calculation using Stirling’s formula shows that

$$\begin{aligned} f(m_1,\ldots , m_8)8^{-n} \ge C_{7} n^{-4}. \end{aligned}$$

Thus,

$$\begin{aligned} \mathbb {P}((N_1,\ldots , N_8)\in A)&\le \frac{\mathbb {P}((N_1,\ldots , N_8)\in A)}{\mathbb {P}(N_1=m_1,\ldots , N_8=m_8)}\\&= \sum _{(n_1,\ldots , n_8)\in A} \frac{\mathbb {P}(N_1=n_1,\ldots , N_8=n_8)}{\mathbb {P}(N_1=m_1,\ldots , N_8=m_8)}\\&\le e^{-C_6\beta n^{2/3}} \frac{8^n}{f(m_1,\ldots , m_8)}\le C_{8} n^4e^{-C_6\beta n^{2/3}}. \end{aligned}$$

Therefore for each i,

$$\begin{aligned} \mathrm {Var}(N_i)&\le \mathbb {E}(N_i-m_i)^2\le C_5^2 n^{2/3} + n^2 \mathbb {P}((N_1,\ldots ,N_8)\in A)\\&\le C_5^2 n^{2/3}+ C_{8}n^6 e^{-C_6\beta n^{2/3}}. \end{aligned}$$

The above inequality shows that

$$\begin{aligned} \mathrm {Var}(N_i)\le K(\beta ) n^{2/3}, \end{aligned}$$

where \(K(\beta )\) is a non-increasing function of \(\beta \). \(\square \)
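As a sanity check on the recursive relation for \(Z(n,\beta )\) displayed in the proof above, note that for \(n=2\) the partition function has the exact series \(Z(2,\beta )=\sum _{k\ge 1}7\cdot 8^{-k}e^{-2^k\beta }\) (by the proof of Lemma 2.1, the set \(\{y: w(x,y)=2^k\}\) has measure \(7\cdot 8^{-k}\)), and the recursion specializes to \(Z(2,\beta )=8^{-2}(8Z(2,2\beta )+56e^{-2\beta })\). The following sketch (illustrative) confirms the match numerically:

```python
import math

def Z2(beta, terms=60):
    # Z(2, beta) = sum_{k>=1} 7 * 8^{-k} * exp(-beta * 2^k)
    return sum(7 * 8.0 ** -k * math.exp(-beta * 2.0 ** k)
               for k in range(1, terms + 1))

beta = 0.7
lhs = Z2(beta)
# Either both points fall in one of the 8 level-1 cubes (inverse temperature
# doubles), or they fall in distinct cubes (2 * binom(8,2) = 56 ordered
# splits, with cross-interaction w = 2).
rhs = 8.0 ** -2 * (8 * Z2(2 * beta) + 56 * math.exp(-2 * beta))
print(lhs, rhs)  # agree
```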

For any Borel set \(A\subseteq [0,1)^3\), let

$$\begin{aligned} X(A) := \{X_j: X_j\in A\}. \end{aligned}$$

and let \(N(A):=|X(A)|\). For each \(k\ge 0\), let \(\mathcal {F}_k\) be the \(\sigma \)-algebra generated by the random variables \(\{N(D): D\in \mathcal {D}_k\}\). Note that \(\{\mathcal {F}_k\}_{k\ge 0}\) is a filtration of \(\sigma \)-algebras. This filtration will play an important role in the subsequent discussion.

Lemma 2.6

Conditional on \(\mathcal {F}_k\), the random sets \(\{X(D): D\in \mathcal {D}_k\}\) are mutually independent. Moreover, for any \(D\in \mathcal {D}_k\), conditional on \(\mathcal {F}_k\), X(D) has the same distribution as a scaled version of a point process from the measure \(\mu _{N(D), 2^k\beta }\).

Proof

Take any k. Note that the joint density of \((X_1,\ldots ,X_n)\) at a point \((x_1,\ldots ,x_n)\) may be written as

$$\begin{aligned} \frac{1}{Z(n,\beta )} \exp \left( -\beta \sum _{D\in \mathcal {D}_k} H_D(x_1,\ldots ,x_n) -\beta R_k(x_1,\ldots ,x_n)\right) , \end{aligned}$$

where \(H_D(x_1,\ldots ,x_n)\) is the contribution due to the interactions between points in D, and \(R_k(x_1,\ldots ,x_n)\) is the contribution due to the interactions between points in different members of \(\mathcal {D}_k\). The crucial property of the potential w is that \(R_k(x_1,\ldots ,x_n)\) is a function of \(\{n_D: D\in \mathcal {D}_k\}\), where \(n_D = |\{j: x_j\in D\}|\). The claims follow easily from this observation. \(\square \)

Lemma 2.6 allows us to compute conditional means and variances.

Lemma 2.7

If \(D\in \mathcal {D}_k\) and \(D'\) is a child of D, then

$$\begin{aligned} \mathbb {E}(N(D')|\mathcal {F}_k) = \frac{N(D)}{8} \end{aligned}$$

and

$$\begin{aligned} \mathrm {Var}(N(D')|\mathcal {F}_k) \le K(\beta ) N(D)^{2/3}, \end{aligned}$$

where K is the function from Lemma 2.5.

Proof

The formula for the conditional expectation follows from Lemma 2.6 and symmetry, and the bound on the conditional variance follows from Lemmas 2.5, 2.6, and the observation that \(K(2^k\beta )\le K(\beta )\) since K is a non-increasing function of \(\beta \). \(\square \)

The above lemma leads to the following conclusions about unconditional means and variances.

Lemma 2.8

For any \(D\in \mathcal {D}\), \(\mathbb {E}(N(D))=\hbox {Leb}(D)n\) and

$$\begin{aligned} \mathrm {Var}(N(D)) \le 8K(\beta ) \hbox {Leb}(D)^{2/3}n^{2/3}, \end{aligned}$$

where K is the function from Lemma 2.5.

Proof

Suppose that \(D\in \mathcal {D}_k\). The formula for the expectation follows easily by iterating the formula for the conditional expectation from Lemma 2.7, and observing that \(\hbox {Leb}(D)=8^{-k}\). Next, let \(D'\) be the parent of D. Then by Lemma 2.7 and the formula for expected value,

$$\begin{aligned} \mathbb {E}(N(D)^2)&= \mathbb {E}(N(D)^2 - (\mathbb {E}(N(D)|\mathcal {F}_{k-1}))^2) + \mathbb {E}((\mathbb {E}(N(D)|\mathcal {F}_{k-1}))^2)\\&= \mathbb {E}(\mathrm {Var}(N(D)|\mathcal {F}_{k-1})) + 8^{-2} \mathbb {E}(N(D')^2)\\&\le K(\beta ) \mathbb {E}(N(D')^{2/3}) + 8^{-2} \mathbb {E}(N(D')^2)\\&\le K(\beta ) (\mathbb {E}(N(D')))^{2/3} + 8^{-2} \mathbb {E}(N(D')^2)\\&= K(\beta ) 4^{-k+1} n^{2/3} + 8^{-2}\mathbb {E}(N(D')^2). \end{aligned}$$

Iterating this, we get

$$\begin{aligned} \mathbb {E}(N(D)^2)&\le K(\beta )n^{2/3}(4^{-k+1} + 8^{-2} 4^{-k+2} + 8^{-4} 4^{-k+3} + \cdots ) + 8^{-2k}n^2\\&\le 8K(\beta )4^{-k} n^{2/3} + 8^{-2k}n^2, \end{aligned}$$

which completes the proof since \(\mathbb {E}(N(D)) = \hbox {Leb}(D)n = 8^{-k}n\). \(\square \)

Now take any nonempty open set \(U\subseteq [0,1)^3\) with regular boundary. Let \(\mathcal {U}\) be the set of all \(D\in \mathcal {D}\) such that \(D\subseteq U\) but the parent cube of D is not contained in U.

Lemma 2.9

The set U is the disjoint union of all elements of \(\mathcal {U}\).

Proof

Since U is open, each point in U belongs to some dyadic cube that is contained in U. Some ancestor of this cube must belong to \(\mathcal {U}\). This shows that U is the union of the members of \(\mathcal {U}\). It is easy to see that the elements of \(\mathcal {U}\) are disjoint. \(\square \)

Corollary 2.10

\(\mathbb {E}(N(U))=\hbox {Leb}(U)n\).

Proof

Just observe that by Lemmas 2.8 and 2.9,

$$\begin{aligned} \mathbb {E}(N(U)) = \sum _{D\in \mathcal {U}} \mathbb {E}(N(D)) = \sum _{D\in \mathcal {U}} \hbox {Leb}(D)n = \hbox {Leb}(U)n, \end{aligned}$$

where we have implicitly used the fact that \(\mathcal {U}\) is a countable collection. \(\square \)

For each j, let \(\mathcal {U}_j := \mathcal {U}\cap \mathcal {D}_j\). Let \(\mathcal {V}_j\) denote the set of all \(D\in \mathcal {D}_j\) that intersect both U and \(U^c\). Note that \(\mathcal {U}_j\) and \(\mathcal {V}_j\) do not overlap. For any dyadic cube D, let p(D) denote the proportion of D that belongs to U. Let \(M_0 = \hbox {Leb}(U)n\) and for each \(j\ge 1\), let

$$\begin{aligned} M_j := \sum _{i=0}^j\sum _{D\in \mathcal {U}_i} N(D) + \sum _{D\in \mathcal {V}_j} p(D) N(D). \end{aligned}$$
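For intuition about the sets \(\mathcal {U}_j\) and \(\mathcal {V}_j\), here is a small sketch (with U an arbitrarily chosen ball, purely illustrative) that classifies dyadic cubes level by level; the count \(|\mathcal {V}_j|\) grows like \(4^j\), which is the geometric fact encoded by the constant A(U) introduced below.

```python
import numpy as np
from itertools import product

center, r = np.array([0.5, 0.5, 0.5]), 0.3  # U = open ball (illustrative)

def classify(corner, side):
    # Exact test of a cube against the ball: contained in U, disjoint
    # from U, or straddling the boundary (i.e., a member of some V_j).
    corners = corner + side * np.array(list(product([0.0, 1.0], repeat=3)))
    far = np.max(np.linalg.norm(corners - center, axis=1))
    near = np.linalg.norm(np.clip(center, corner, corner + side) - center)
    if far < r:
        return 'in'
    return 'out' if near >= r else 'bdry'

bdry = [np.zeros(3)]  # level 0: the whole cube straddles the boundary
for j in range(1, 7):
    side = 2.0 ** -j
    next_bdry, n_new_in = [], 0
    for c in bdry:
        for shift in product([0, 1], repeat=3):
            child = c + side * np.array(shift)
            tag = classify(child, side)
            if tag == 'bdry':
                next_bdry.append(child)
            elif tag == 'in':
                n_new_in += 1  # these children form U_j
    print(j, len(next_bdry), n_new_in)  # |V_j| ~ const * 4^j
    bdry = next_bdry
```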

Lemma 2.11

The sequence \(\{M_j\}_{j\ge 0}\) is a martingale with respect to the filtration \(\{\mathcal {F}_j\}_{j\ge 0}\).

Proof

Take any \(j\ge 1\). Then

$$\begin{aligned} \mathbb {E}(M_j|\mathcal {F}_{j-1})&= \sum _{i=0}^{j-1} \sum _{D\in \mathcal {U}_i} N(D) + \sum _{D\in \mathcal {U}_j} \mathbb {E}(N(D)|\mathcal {F}_{j-1}) \\&\quad + \sum _{D\in \mathcal {V}_j} p(D) \mathbb {E}(N(D)|\mathcal {F}_{j-1})\\&= \sum _{i=0}^{j-1} \sum _{D\in \mathcal {U}_i} N(D) + \sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j} p(D) \mathbb {E}(N(D)|\mathcal {F}_{j-1}) \end{aligned}$$

Take any \(D\in \mathcal {V}_{j-1}\). Then each child of D is either a member of \(\mathcal {U}_j\), or a member of \(\mathcal {V}_j\), or has no intersection with U. Conversely, every member of \(\mathcal {U}_j\cup \mathcal {V}_j\) is the child of some member of \(\mathcal {V}_{j-1}\). Lastly, note that if \(D_1,\ldots ,D_8\) are the children of a dyadic cube D, then

$$\begin{aligned} p(D) = \frac{1}{8}\sum _{i=1}^8 p(D_i). \end{aligned}$$

Combining these observations and applying Lemma 2.7, we get

$$\begin{aligned} \sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j} p(D) \mathbb {E}(N(D)|\mathcal {F}_{j-1}) = \sum _{D\in \mathcal {V}_{j-1}} p(D) N(D), \end{aligned}$$

which completes the proof. \(\square \)

For the remainder of this section, let A(U) be a constant such that for all \(0<\epsilon \le \hbox {diam}(U)\),

$$\begin{aligned} \hbox {Leb}(\partial U_\epsilon )\le A(U)\epsilon . \end{aligned}$$
(2.2)

By the regularity condition, we can choose A(U) to be finite. The martingale property of \(M_j\) and our previous calculations lead to the following conclusion.

Lemma 2.12

For any \(j\ge 1\) such that \(\sqrt{3}\cdot 2^{-j+1}\le \hbox {diam}(U)\),

$$\begin{aligned} \mathrm {Var}(M_j)\le C(\beta )A(U) n^{2/3} + \mathrm {Var}(M_{j-1}), \end{aligned}$$

where \(C(\beta )\) is a constant that depends only on \(\beta \).

Proof

By the martingale property,

$$\begin{aligned} \mathrm {Var}(M_j)&= \mathbb {E}(\mathrm {Var}(M_j|\mathcal {F}_{j-1})) + \mathrm {Var}(\mathbb {E}(M_j|\mathcal {F}_{j-1}))\nonumber \\&= \mathbb {E}(\mathrm {Var}(M_j|\mathcal {F}_{j-1})) + \mathrm {Var}(M_{j-1}). \end{aligned}$$
(2.3)

Now,

$$\begin{aligned} \mathrm {Var}(M_j|\mathcal {F}_{j-1})&= \mathrm {Var}\left( \sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j} p(D)N(D)\biggl |\mathcal {F}_{j-1}\right) \nonumber \\&= \sum _{D,D'\in \mathcal {U}_j\cup \mathcal {V}_j} p(D)p(D') \mathrm {Cov}(N(D), N(D')|\mathcal {F}_{j-1}). \end{aligned}$$
(2.4)

If D and \(D'\) have different parents, then N(D) and \(N(D')\) are conditionally independent by Lemma 2.6, and hence the conditional covariance is zero. Otherwise, Lemma 2.7 and the Cauchy–Schwarz inequality imply that

$$\begin{aligned} |\mathrm {Cov}(N(D), N(D')|\mathcal {F}_{j-1})|\le K(\beta )N(D'')^{2/3}, \end{aligned}$$

where \(D''\) is the parent of D and \(D'\). Thus, by Lemma 2.8,

$$\begin{aligned} |\mathbb {E}(\mathrm {Cov}(N(D), N(D')|\mathcal {F}_{j-1})) |&\le K(\beta ) (\hbox {Leb}(D'') n)^{2/3} \\&= K(\beta ) (8^{-j+1} n)^{2/3}. \end{aligned}$$

On the other hand, each \(D\in \mathcal {U}_j\cup \mathcal {V}_j\) has at most 7 sibling cubes that belong to \(\mathcal {U}_j\cup \mathcal {V}_j\). Since \(p(D)8^{-j} = p(D)\hbox {Leb}(D) = \hbox {Leb}(D\cap U)\), this shows that

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(M_j|\mathcal {F}_{j-1}))&\le K(\beta ) (8^{-j+1}n)^{2/3}\sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j} 7p(D)\\&= 28K(\beta ) n^{2/3}2^j\sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j}\hbox {Leb}(D\cap U). \end{aligned}$$

Note that each element of

$$\begin{aligned} \bigcup _{D\in \mathcal {U}_j\cup \mathcal {V}_j}( D\cap U) \end{aligned}$$

is within distance \(\sqrt{3}\cdot 2^{-j+1}\) from \(\partial U\). Since \(\sqrt{3}\cdot 2^{-j+1}\le \hbox {diam}(U)\), inequality (2.2) gives

$$\begin{aligned} \sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j}\hbox {Leb}(D\cap U)\le A(U) \sqrt{3}\cdot 2^{-j+1}. \end{aligned}$$

Consequently,

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(M_j|\mathcal {F}_{j-1})) \le C(\beta ) A( U) n^{2/3}, \end{aligned}$$

where \(C(\beta )\) depends only on \(\beta \). The proof is completed by plugging this bound into (2.3). \(\square \)

We now have all the ingredients for proving the following theorem, which implies Theorems 1.1 and 1.3 as special cases.

Theorem 2.13

(Hyperuniformity at all scales) Let U and N(U) be as in Theorem 1.1. Suppose that \(\hbox {diam}(U)\ge n^{-1/3}\). Let A(U) be the constant defined in (2.2). Then

$$\begin{aligned} \mathbb {E}(N(U)) = \hbox {Leb}(U)n \end{aligned}$$

and

$$\begin{aligned} \mathrm {Var}(N(U))&\le C(\beta )A(U)n^{2/3}\log (4n^{1/3}\hbox {diam}(U))+ C(\beta )\hbox {Leb}(U)^{2/3}n^{2/3}, \end{aligned}$$

where \(C(\beta )\) is a constant that depends only on \(\beta \).

Proof

Throughout this proof, \(C(\beta )\) will denote any constant that depends only on \(\beta \). The value of \(C(\beta )\) may change from line to line or even within a line.

The formula for the expectation follows from Corollary 2.10. It remains to prove the variance bound. Choose k such that

$$\begin{aligned} \frac{1}{2}n^{-1/3}\le \sqrt{3} \cdot 2^{-k}\le n^{-1/3}. \end{aligned}$$

Note that by Lemma 2.9, any point in U either belongs to some \(D\in \mathcal {U}_j\) for some \(j\le k\), or belongs to some \(D\in \mathcal {U}_j\) for some \(j> k\). In the latter case, there is an ancestor of D that belongs to \(\mathcal {V}_{k}\). Thus,

$$\begin{aligned} U = \left( \bigcup _{j=0}^{k}\mathcal {U}_j\right) \cup \left( \bigcup _{D\in \mathcal {V}_{k}} (D\cap U)\right) , \end{aligned}$$

and so

$$\begin{aligned} N(U) = \sum _{j=0}^{k} \sum _{D\in \mathcal {U}_j} N(D) + \sum _{D\in \mathcal {V}_{k}} N(D\cap U). \end{aligned}$$

Consequently, by Lemmas 2.6, 2.8 and Corollary 2.10,

$$\begin{aligned} \mathbb {E}(N(U)|\mathcal {F}_{k})&= \sum _{j=0}^{k} \sum _{D\in \mathcal {U}_j} N(D) + \sum _{D\in \mathcal {V}_{k}} \mathbb {E}(N(D\cap U)|\mathcal {F}_{k})\\&= \sum _{j=0}^{k} \sum _{D\in \mathcal {U}_j} N(D) + \sum _{D\in \mathcal {V}_k} p(D)N(D) = M_k. \end{aligned}$$

Therefore,

$$\begin{aligned} \mathrm {Var}(N(U))&= \mathbb {E}(\mathrm {Var}(N(U)|\mathcal {F}_k)) + \mathrm {Var}(\mathbb {E}(N(U)|\mathcal {F}_k))\nonumber \\&= \mathbb {E}(\mathrm {Var}(N(U)|\mathcal {F}_k)) + \mathrm {Var}(M_k). \end{aligned}$$
(2.5)

Given \(\mathcal {F}_k\), the random variables \(\{N(D\cap U):D\in \mathcal {D}_k\}\) are independent by Lemma 2.6. Therefore, by Lemma 2.8 and Corollary 2.10,

$$\begin{aligned} \mathrm {Var}(N(U)|\mathcal {F}_k)&= \mathrm {Var}\left( \sum _{D\in \mathcal {V}_k} N(D\cap U)\biggl |\mathcal {F}_k\right) = \sum _{D\in \mathcal {V}_k} \mathrm {Var}(N(D\cap U)|\mathcal {F}_k)\\&\le \sum _{D\in \mathcal {V}_k} \mathbb {E}(N(D\cap U)^2|\mathcal {F}_k)\\&\le \sum _{D\in \mathcal {V}_k} \mathbb {E}(N(D\cap U)|\mathcal {F}_k) N(D) = \sum _{D\in \mathcal {V}_k} p(D) N(D)^2. \end{aligned}$$

By Lemma 2.8 and our choice of k,

$$\begin{aligned} \mathbb {E}(N(D)^2)= \mathrm {Var}(N(D))+(\mathbb {E}(N(D)))^2 \le C(\beta ) \end{aligned}$$

for all \(D\in \mathcal {V}_k\). Also, each element of

$$\begin{aligned} \bigcup _{D\in \mathcal {V}_k} (D\cap U) \end{aligned}$$

is within distance \(\sqrt{3}\cdot 2^{-k}\) of \(\partial U\), and \(p(D) 8^{-k} = \hbox {Leb}(D\cap U)\). Since

$$\begin{aligned} \sqrt{3}\cdot 2^{-k}\le n^{-1/3}\le \hbox {diam}(U) \end{aligned}$$

by our choice of k and the assumption that \(\hbox {diam}(U)\ge n^{-1/3}\), this gives

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(N(U)|\mathcal {F}_k))&\le C(\beta )8^k\sum _{D\in \mathcal {V}_k} \hbox {Leb}(D\cap U) \\&\le C(\beta )8^kA(U) 2^{-k}\\&= C(\beta )A(U)4^k \le C(\beta )A(U)n^{2/3}. \end{aligned}$$

Let l be the smallest integer such that \(\sqrt{3}\cdot 2^{-l} \le \hbox {diam}(U)\). Note that \(l\le k\). Together with (2.5) and Lemma 2.12, the above inequality shows that

$$\begin{aligned} \mathrm {Var}(N(U))\le C(\beta )A(U)n^{2/3}(k-l+1) + \mathrm {Var}(M_l). \end{aligned}$$

By the definition of l, \(\mathcal {U}_i\) is empty for all \(i<l\). Therefore

$$\begin{aligned} M_l = \sum _{D\in \mathcal {U}_l\cup \mathcal {V}_l} p(D)N(D). \end{aligned}$$

Note that for any \(D\in \mathcal {U}_l\cup \mathcal {V}_l\), Lemma 2.8 gives

$$\begin{aligned} \mathrm {Var}(p(D)N(D))&= p(D)^2\mathrm {Var}(N(D))\le C(\beta )p(D)^2 \hbox {Leb}(D)^{2/3}n^{2/3}\\&\le C(\beta )(p(D)\hbox {Leb}(D))^{2/3}n^{2/3}\\&= C(\beta )\hbox {Leb}(D\cap U)^{2/3} n^{2/3}\le C(\beta )\hbox {Leb}(U)^{2/3}n^{2/3}. \end{aligned}$$

Moreover, it is easy to see that U intersects at most 64 members of \(\mathcal {D}_l\), and therefore \(|\mathcal {U}_l \cup \mathcal {V}_l|\le 64\). From these observations, we get

$$\begin{aligned} \mathrm {Var}(M_l)\le C(\beta ) \hbox {Leb}(U)^{2/3}n^{2/3}. \end{aligned}$$

Finally, note that by the lower bound on \(\sqrt{3}\cdot 2^{-k}\) and the upper bound on \(\sqrt{3}\cdot 2^{-l}\), we get

$$\begin{aligned} 2^{k-l} \le 2n^{1/3}\hbox {diam}(U), \end{aligned}$$

and hence \(k-l+1\le \log _2(4n^{1/3}\hbox {diam}(U))\). This completes the proof of the theorem. \(\square \)

Proof of Theorem 1.1

This is a direct application of Theorem 2.13. The condition \(\hbox {diam}(U)\ge n^{-1/3}\) is irrelevant because the variance bound can be enforced for small n by adjusting the constant \(C(U,\beta )\). \(\square \)

Proof of Theorem 1.3

Let \(V := n^{-1/3}\lambda U + x\). Note that \(N_x(\lambda U)=N(V)\). Also, note that

$$\begin{aligned} \hbox {Leb}(V)&= \lambda ^3n^{-1}\hbox {Leb}(U),\\ A(V)&=\lambda ^2n^{-2/3}A(U),\\ \hbox {diam}(V)&= \lambda n^{-1/3} \hbox {diam}(U). \end{aligned}$$

In particular, the condition \(\hbox {diam}(V)\ge n^{-1/3}\) is equivalent to \(\hbox {diam}(\lambda U)\ge 1\). The proof is now just an application of Theorem 2.13, and the observation that since \(x\in (0,1)^3\), V is eventually contained in \((0,1)^3\) as n gets large. \(\square \)

Finally, let us prove Theorem 1.4.

Proof of Theorem 1.4

Here \(C(\beta )\) denotes any constant that depends only on \(\beta \). Let f(D) be the average value of f in a dyadic cube \(D\in \mathcal {D}\). For each k, let \(f_k\) be the function that is identically equal to f(D) within each \(D\in \mathcal {D}_k\). Let

$$\begin{aligned} W_k := X(f_k). \end{aligned}$$

By Lemmas 2.6 and 2.7, it is easy to see that \(\{W_k\}_{k\ge 0}\) is a martingale with respect to the filtration \(\{\mathcal {F}_k\}_{k\ge 0}\). Moreover, for any k,

$$\begin{aligned} \mathbb {E}(X(f)|\mathcal {F}_k) = X(f_k). \end{aligned}$$
(2.6)

Now choose k such that

$$\begin{aligned} n^{-1/3}\le 2^{-k}\le 2n^{-1/3}. \end{aligned}$$

Then by (2.6) and the martingale property of \(\{W_j\}_{j\ge 0}\),

$$\begin{aligned} \mathrm {Var}(X(f)) = \mathbb {E}(\mathrm {Var}(X(f)|\mathcal {F}_k)) + \sum _{j=1}^{k}\mathbb {E}(\mathrm {Var}(X(f_j)|\mathcal {F}_{j-1})). \end{aligned}$$
(2.7)

Take any j. For each \(D\in \mathcal {D}_{j-1}\), let c(D) denote the set of 8 children of D. By Lemma 2.6 and Lemma 2.7,

$$\begin{aligned} \mathrm {Var}(X(f_j)|\mathcal {F}_{j-1})&= \mathrm {Var}\left( \sum _{D\in \mathcal {D}_j} f(D)N(D) \biggl |\mathcal {F}_{j-1}\right) \\&= \sum _{D\in \mathcal {D}_{j-1}} \mathrm {Var}\left( \sum _{D'\in c(D)} f(D')N(D')\biggl |\mathcal {F}_{j-1}\right) \\&= \sum _{D\in \mathcal {D}_{j-1}} \mathbb {E}\left( \left( \sum _{D'\in c(D)} f(D')N(D') - f(D)N(D)\right) ^2\biggl |\mathcal {F}_{j-1}\right) . \end{aligned}$$

Now notice that for any \(D\in \mathcal {D}_{j-1}\),

$$\begin{aligned}&\sum _{D'\in c(D)} f(D')N(D') - f(D)N(D) \\&\quad = \sum _{D'\in c(D)} (f(D')- f(D))\left( N(D')-\frac{N(D)}{8}\right) . \end{aligned}$$

Recall that L is the Lipschitz constant of f. For any \(D'\in c(D)\),

$$\begin{aligned} |f(D')-f(D)|\le \sqrt{3} L 2^{-j+1}. \end{aligned}$$

Thus,

$$\begin{aligned}&\left( \sum _{D'\in c(D)} (f(D')- f(D))\left( N(D')-\frac{N(D)}{8}\right) \right) ^2 \\&\quad \le 4^{-j+2} L^2 \left( \sum _{D'\in c(D)} \biggl |N(D')-\frac{N(D)}{8}\biggr |\right) ^2\\&\quad \le 4^{-j+4}L^2\sum _{D'\in c(D)} \left( N(D')-\frac{N(D)}{8}\right) ^2. \end{aligned}$$

Therefore, by Lemma 2.7,

$$\begin{aligned}&\mathbb {E}\left( \left( \sum _{D'\in c(D)} f(D')N(D') - f(D)N(D)\right) ^2\biggl |\mathcal {F}_{j-1}\right) \\&\quad \le 4^{-j+4} L^2 \sum _{D'\in c(D)}\mathrm {Var}(N(D')|\mathcal {F}_{j-1})\le 4^{-j+6} L^2 K(\beta )N(D)^{2/3}. \end{aligned}$$

Consequently, by Lemma 2.8,

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(X(f_j)|\mathcal {F}_{j-1}))&\le C(\beta )L^2 4^{-j}\sum _{D\in \mathcal {D}_{j-1}}\mathbb {E}(N(D)^{2/3})\\&\le C(\beta ) L^24^{-j}\sum _{D\in \mathcal {D}_{j-1}}(\mathbb {E}(N(D)))^{2/3}\le C(\beta ) L^22^{-j}n^{2/3}. \end{aligned}$$

Next, for \(D\in \mathcal {D}_k\), let

$$\begin{aligned} s(D) := \sum _{j\, : \, X_j\in D} f(X_j), \end{aligned}$$

so that

$$\begin{aligned} X(f) = \sum _{D\in \mathcal {D}_k} s(D). \end{aligned}$$

Then by Lemma 2.6,

$$\begin{aligned} \mathrm {Var}(X(f)|\mathcal {F}_k)&= \sum _{D\in \mathcal {D}_k} \mathrm {Var}(s(D)|\mathcal {F}_k)\\&\le \sum _{D\in \mathcal {D}_k}\mathbb {E}((s(D)-f(D)N(D))^2|\mathcal {F}_k). \end{aligned}$$

By the Lipschitz condition,

$$\begin{aligned} |s(D)-f(D)N(D)|\le \sqrt{3}L 2^{-k}N(D) \end{aligned}$$

for each \(D\in \mathcal {D}_k\). Thus, by Lemma 2.8 and our choice of k,

$$\begin{aligned} \mathbb {E}((s(D)-f(D)N(D))^2) \le 4^{-k+1}L^2 \mathbb {E}(N(D)^2)\le C(\beta )L^2 4^{-k}. \end{aligned}$$

Consequently,

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(X(f)|\mathcal {F}_k)) \le C(\beta )L^24^{-k}|\mathcal {D}_k|\le C(\beta ) L^22^k\le C(\beta )L^2 n^{1/3}. \end{aligned}$$

The proof is now easily completed by combining the steps. \(\square \)

2.5 Proofs of the lower bounds

Let us now prove Theorem 1.2. We will continue using the notation introduced in the previous sections. We first need to prove some simple geometric facts. Let

$$\begin{aligned} \mathcal {T}:= \{z+[0,1)^3: z\in \mathbb {Z}^3\}. \end{aligned}$$

Our first geometric lemma is very simple.

Lemma 2.14

Let \(\mathcal {T}\) be as above. Take any \(D\in \mathcal {T}\) and any \(x\in D\). Let \(\delta \) be the distance of x from the boundary of D. Then any plane through x bifurcates D into two parts, each of which has volume at least \(2\pi \delta ^3/3\).

Proof

The open ball of radius \(\delta \) around x is contained in D. Any plane P through x bifurcates this ball into two parts of volume \(2\pi \delta ^3/3\) each. The proof is completed by observing that these two hemispheres are contained in the two parts of D obtained by bifurcating using P. \(\square \)
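Although not needed for the argument, Lemma 2.14 is easy to test numerically. The following Monte Carlo sketch is ours, not part of the proof; the point, the normal direction and the sample size are illustrative choices.

```python
import math
import random

def check_bifurcation(trials=200_000):
    """Monte Carlo check of Lemma 2.14: a plane through x splits the unit
    cube into two parts, each of volume at least 2*pi*delta^3/3, where
    delta is the distance from x to the boundary of the cube."""
    x = [random.uniform(0.2, 0.8) for _ in range(3)]
    u = [random.gauss(0, 1) for _ in range(3)]           # random normal direction
    delta = min(min(c, 1 - c) for c in x)
    below = sum(sum(ui * (random.random() - xi) for ui, xi in zip(u, x)) < 0
                for _ in range(trials))
    smaller = min(below, trials - below) / trials        # volume of smaller part
    assert smaller >= 2 * math.pi * delta**3 / 3 - 0.01  # Monte Carlo tolerance
    return smaller, 2 * math.pi * delta**3 / 3

random.seed(5)
print(check_bifurcation())
```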

The second lemma is an easy fact about intervals.

Lemma 2.15

Let I be a closed interval of the real line of length at least \(\delta \in [0,1]\). Then I has a closed subinterval J of length \(\delta /4\) such that any integer is at a distance at least \(\delta /4\) from J.

Proof

If I contains no integers, then we can take J to be an interval of length \(\delta /4\) that is at distance at least \(\delta /4\) from each endpoint of I. If I contains an integer n, then at least one of the two intervals \([n, n+\delta /2]\) and \([n-\delta /2, n]\) must be contained in I. In the first case take \(J= [n+\delta /4, n+\delta /2]\) and in the second case take \(J=[n-\delta /2, n-\delta /4]\). Since \(\delta \le 1\), there is no integer within distance \(\delta /4\) from J. \(\square \)
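Since Lemma 2.15 is invoked three times in the next proof, an executable version of its case analysis may be helpful. The following is a minimal sketch of ours; the function name subinterval and the test ranges are our own choices.

```python
import math
import random

def subinterval(lo, hi, delta):
    """Return [c, d] of length delta/4 inside [lo, hi] (an interval of
    length >= delta, with 0 < delta <= 1) whose distance from every
    integer is at least delta/4, as in the proof of Lemma 2.15."""
    n = math.ceil(lo)
    if n > hi:                            # I contains no integer
        return lo + delta / 4, lo + delta / 2
    if n + delta / 2 <= hi:               # [n, n + delta/2] is contained in I
        return n + delta / 4, n + delta / 2
    return n - delta / 2, n - delta / 4   # otherwise [n - delta/2, n] is

random.seed(0)
for _ in range(100_000):                  # randomized check of the conclusion
    delta = random.uniform(1e-6, 1.0)
    lo = random.uniform(-10.0, 10.0)
    hi = lo + delta * random.uniform(1.0, 3.0)
    c, d = subinterval(lo, hi, delta)
    assert lo <= c < d <= hi
    for m in (math.floor(c), math.ceil(c), math.floor(d), math.ceil(d)):
        assert min(abs(c - m), abs(d - m)) >= delta / 4 - 1e-9
```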

The next lemma is intuitively obvious but a little tedious to prove. The constants are probably not optimal, but that does not matter for us.

Lemma 2.16

Take any \(x\in \mathbb {R}^3\) and a unit vector \(u = (u_1,u_2,u_3)\in S^2\). Let P be the plane that contains x and is perpendicular to u. Suppose that

$$\begin{aligned} \min \{|u_1|, |u_2|, |u_3|\} \ge 0.1. \end{aligned}$$
(2.8)

Then there is an element \(D\in \mathcal {T}\), within Euclidean distance \(\sqrt{402}\) from x, which is bifurcated by the plane P in such a way that each part has volume at least \(6\times 10^{-8}\).

Proof

Take any \(x = (x_1,x_2,x_3)\in \mathbb {R}^3\) and \(u=(u_1,u_2, u_3)\in S^2\) as in the statement of the lemma. Let \(P_0\) be the plane with normal vector u that contains the origin. Define

$$\begin{aligned} y_1= \hbox {sign}(u_1),\quad y_2 = \hbox {sign}(u_2), \quad y_3 = -\frac{|u_1|+|u_2|}{u_3}. \end{aligned}$$

Then \(y= (y_1,y_2,y_3)\in P_0\). Also, we have \(|y_1|=1\), \(|y_2|=1\), and by condition (2.8) and the fact that \(|u_3|\le 1\),

$$\begin{aligned} |y_3| = \frac{|u_1|+|u_2|}{|u_3|} \ge |u_1|+|u_2|\ge 0.2. \end{aligned}$$

Now consider the set

$$\begin{aligned} I_1 = \{x_1 + \alpha y_1: 0\le \alpha \le 1\}. \end{aligned}$$

Since \(|y_1|=1\), \(I_1\) is an interval of length 1. By Lemma 2.15, \(I_1\) has a subinterval \(I_2\) of length 0.25 such that any integer is at a distance at least 0.25 from \(I_2\). Moreover, since \(|y_1|= 1\), \(I_2\) is of the form

$$\begin{aligned} \{x_1+\alpha y_1: a\le \alpha \le b\}, \end{aligned}$$

where \(b-a= 0.25\). Let

$$\begin{aligned} I_3 := \{x_2+\alpha y_2: a\le \alpha \le b\}. \end{aligned}$$

Since \(|y_2|= 1\), \(I_3\) has length 0.25. Thus by Lemma 2.15, \(I_3\) contains a subinterval \(I_4\) of length 0.0625 such that any integer is at a distance at least 0.0625 from \(I_4\). Again, since \(|y_2|= 1\), this implies that \(I_4\) is of the form

$$\begin{aligned} \{x_2+\alpha y_2: c\le \alpha \le d\}, \end{aligned}$$

where \(a\le c\le d\le b\) and \(d-c = 0.0625\). Let

$$\begin{aligned} I_5 := \{x_3+\alpha y_3: c\le \alpha \le d\}. \end{aligned}$$

Since \(|y_3|\ge 0.2\), \(I_5\) has length at least 0.0125. Consequently by Lemma 2.15, \(I_5\) has a subinterval \(I_6\) of length 0.003125 such that any integer is at a distance at least 0.003125 from \(I_6\).

In particular, there is some \(\alpha \in [0,1]\) such that \(x_1+\alpha y_1\in I_2\), \(x_2+\alpha y_2\in I_4\) and \(x_3+\alpha y_3\in I_6\). The distance of \(x_i+\alpha y_i\) from the nearest integer is at least 0.003125 for each i. Thus, the distance of the point \(x+\alpha y\) from the boundary of the cube \(D\in \mathcal {T}\) that contains \(x+\alpha y\) is at least 0.003125. By Lemma 2.14 and the fact that \(x+\alpha y\in P\), this proves P bifurcates D into two parts, each of which has volume at least \(6\times 10^{-8}\). Lastly, note that

$$\begin{aligned} |(x+\alpha y)-x|&\le |y| = \sqrt{y_1^2+y_2^2+y_3^2}\\&\le \sqrt{1+1 + \frac{(1+1)^2}{0.1^2}}\le \sqrt{402}, \end{aligned}$$

since \(|u_1|\le 1\), \(|u_2|\le 1\) and \(|u_3|\ge 0.1\). This completes the proof of the lemma. \(\square \)
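The proof above is entirely constructive and can be traced in code. The sketch below is ours (the helper names shrink and point_on_plane are hypothetical); it repeats the Lemma 2.15 routine from the previous sketch for self-containment, and checks the two conclusions: the shifted point lies on the plane P within distance \(\sqrt{402}\) of x, and all three of its coordinates are at distance at least 0.003125 from the integers.

```python
import math
import random

def subinterval(lo, hi, delta):
    # Lemma 2.15 (see the previous sketch)
    n = math.ceil(lo)
    if n > hi:
        return lo + delta / 4, lo + delta / 2
    if n + delta / 2 <= hi:
        return n + delta / 4, n + delta / 2
    return n - delta / 2, n - delta / 4

def shrink(x0, y0, a, b, delta):
    """The coordinate t = x0 + alpha*y0 sweeps an interval as alpha runs
    over [a, b]; apply Lemma 2.15 to that interval (with the stated delta)
    and pull the chosen subinterval back to a range of alpha."""
    t0, t1 = sorted((x0 + a * y0, x0 + b * y0))
    c, d = subinterval(t0, t1, delta)
    return tuple(sorted(((c - x0) / y0, (d - x0) / y0)))

def point_on_plane(x, u):
    """Follow the proof of Lemma 2.16: produce x + alpha*y on the plane P
    with all coordinates at distance >= 0.003125 from the integers."""
    y = (math.copysign(1.0, u[0]), math.copysign(1.0, u[1]),
         -(abs(u[0]) + abs(u[1])) / u[2])
    a, b = shrink(x[0], y[0], 0.0, 1.0, 1.0)       # I_2: coordinate 1
    a, b = shrink(x[1], y[1], a, b, 0.25)          # I_4: coordinate 2
    a, b = shrink(x[2], y[2], a, b, 0.0125)        # I_6: coordinate 3
    alpha = (a + b) / 2
    return [x[i] + alpha * y[i] for i in range(3)]

random.seed(1)
for _ in range(10_000):
    u = [random.gauss(0, 1) for _ in range(3)]
    r = math.sqrt(sum(c * c for c in u))
    u = [c / r for c in u]
    if min(abs(c) for c in u) < 0.1:               # enforce condition (2.8)
        continue
    x = [random.uniform(-5.0, 5.0) for _ in range(3)]
    p = point_on_plane(x, u)
    assert abs(sum((p[i] - x[i]) * u[i] for i in range(3))) < 1e-9  # p lies on P
    assert all(abs(t - round(t)) >= 0.003125 - 1e-9 for t in p)
    assert math.dist(p, x) <= math.sqrt(402) + 1e-9
```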

Now recall that the boundary of the set U in the statement of Theorem 1.2 is a smooth, closed, orientable surface. In particular, we can choose a unit normal vector u(x) at each \(x\in \partial U\) such that the map \(x\mapsto u(x)\) is smooth.

Lemma 2.17

Take any \(x\in \partial U\) such that the normal vector u(x) satisfies (2.8). Then there is some \(j_0\) depending only on U (but not on x), such that for all \(j\ge j_0\), there is some \(D\in \mathcal {D}_j\) at distance at most \(\sqrt{402} \cdot 2^{-j}\) from x, which satisfies

$$\begin{aligned} 10^{-8}\le \frac{\hbox {Leb}(D\cap U)}{\hbox {Leb}(D)}\le 1- 10^{-8}. \end{aligned}$$
(2.9)

Proof

From the given properties of \(\partial U\), it is clear that \(\partial U\) has uniformly bounded curvature. Consequently, there is a constant C depending only on U, such that for any \(x\in \partial U\) and any \(\epsilon \in (0,1)\), \(B(x,\epsilon )\cap \partial U\) lies inside a slab of width \(C\epsilon ^2\) around \(T_x\), where \(B(x,\epsilon )\) is the Euclidean ball of radius \(\epsilon \) around x, and \(T_x\) is the tangent plane at x. The rest of the proof is an easy application of Lemma 2.16 and scaling. \(\square \)

The above lemma leads to the following result, which is a key component of the proof of Theorem 1.2.

Lemma 2.18

There is some \(K_1>0\) and some \(j_1\ge 1\) depending only on U such that for any \(j\ge j_1\), there is a set of at least \(K_1 4^j\) cubes \(D\in \mathcal {D}_j\) that satisfy (2.9) and the union of these cubes has diameter at most \(\hbox {diam}(U)/3\).

Proof

Let P be the plane through the origin that is perpendicular to the vector (1, 1, 1). Let \(\alpha _0\) be the largest \(\alpha \) such that the plane \(P_\alpha := (\alpha ,\alpha ,\alpha )+P\) intersects the closure of U. Let x be a point of intersection. Then \(x\in \partial U\), and \(P_{\alpha _0}=T_x\). Consequently, there is some \(0<\epsilon <\hbox {diam}(U)/7\) such that for every \(y\in B(x,\epsilon )\cap \partial U\), u(y) satisfies (2.8). Due to the boundedness of the curvature of \(\partial U\), a small enough choice of \(\epsilon \) guarantees that for any \(\delta \in (0,1)\), there are at least \(C\delta ^{-2}\) points in \(B(x,\epsilon )\cap \partial U\), any two of which are at distance at least \(50\delta \) from each other, where C is a positive constant that depends only on U.

Take \(\delta =2^{-j}\), and choose a collection of points as above. Then by Lemma 2.17, there is an element of \(\mathcal {D}_j\) within distance \(21\delta \) from each point, that satisfies (2.9). Since the points are separated by distance at least \(50\delta \) from each other, these elements of \(\mathcal {D}_j\) are distinct. Since \(\epsilon <\hbox {diam}(U)/7\), a large enough choice of j ensures that the union of these cubes has diameter less than \(\hbox {diam}(U)/3\). \(\square \)

Lastly, we need a lemma about our point process. Recall that for any \(D\in \mathcal {D}\), N(D) is the number of points landing in D.

Lemma 2.19

For any \(n\ge 1\), \(\beta >0\), \(j\ge 0\) and \(D\in \mathcal {D}_j\),

$$\begin{aligned} \mathbb {P}(N(D)\ge 2)\le \exp \left( -2^{j+1}\beta +\frac{7\beta }{3}{n\atopwithdelims ()2}\right) . \end{aligned}$$

Proof

The \(n=1\) case is trivial, so let us take \(n\ge 2\). By Jensen’s inequality and Lemma 2.1,

$$\begin{aligned} Z(n,\beta ) \ge \exp \left( -\frac{7\beta }{3}{n\atopwithdelims ()2}\right) . \end{aligned}$$

On the other hand, if a configuration \(x_1,\ldots , x_n\) has two or more points in D, then

$$\begin{aligned} H_n(x_1,\ldots ,x_n)\ge 2^{j+1}. \end{aligned}$$

Thus, if A is the set of all such configurations, then

$$\begin{aligned} \int _A e^{-\beta H_n(x_1,\ldots ,x_n)}\, dx_1\cdots dx_n \le e^{- 2^{j+1}\beta }\hbox {Leb}(A)\le e^{-2^{j+1}\beta }. \end{aligned}$$

Combining, we get

$$\begin{aligned} \mathbb {P}(N(D)\ge 2) = \mu _{n,\beta }(A) \le \exp \left( -2^{j+1}\beta +\frac{7\beta }{3}{n\atopwithdelims ()2}\right) , \end{aligned}$$

which completes the proof. \(\square \)
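To get a quantitative feel for Lemma 2.19, one can compute the smallest level j at which the bound becomes small. In the sketch below (ours; the target threshold is an illustrative choice), the requisite j grows like \(2\log _2 n\), since the exponent is \(-2^{j+1}\beta +\frac{7\beta }{3}{n\atopwithdelims ()2}\).

```python
import math

def level_for_tail(n, beta, target=1e-6):
    """Smallest j with exp(-2^(j+1)*beta + (7*beta/3)*C(n,2)) <= target,
    i.e. the dyadic level at which Lemma 2.19 makes two points in one
    cube very unlikely."""
    pairs = n * (n - 1) / 2
    j = 0
    while -2.0**(j + 1) * beta + (7 * beta / 3) * pairs > math.log(target):
        j += 1
    return j

print(level_for_tail(1000, beta=1.0))   # 20, roughly 2*log2(1000) = 19.9
```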

Finally, we are ready to prove Theorem 1.2. Recall the filtration \(\{\mathcal {F}_k\}_{k\ge 0}\) defined earlier.

Proof of Theorem 1.2

In this proof, the phrase ‘n sufficiently large’ will mean ‘\(n\ge n_0\), where \(n_0\) depends only on U and \(\beta \)’. Also, C will denote any positive universal constant, \(C(\beta )\) will denote any positive constant that depends only on \(\beta \), and \(C(U,\beta )\) will denote any positive constant that depends only on U and \(\beta \).

Choose k such that

$$\begin{aligned} n^{-1/3}\le 2^{-k} \le 2n^{-1/3}. \end{aligned}$$
(2.10)

Then for any \(D\in \mathcal {D}_k\), Lemma 2.8 gives

$$\begin{aligned} \mathbb {E}(N(D)^2) \le K_2(\beta ), \end{aligned}$$
(2.11)

where \(K_2(\beta )\) is a positive integer that depends only on \(\beta \). Let

$$\begin{aligned} m := 1000 K_2(\beta ). \end{aligned}$$

Let \(j>k\) be the smallest number such that

$$\begin{aligned} 2^{j-k+1} \ge \frac{7}{3}{m\atopwithdelims ()2}+1. \end{aligned}$$

Note that \(0\le j-k\le C(\beta )\).

Take any \(D\in \mathcal {D}_k\). Let \(\mathcal {D}_j(D)\) denote the set of elements of \(\mathcal {D}_j\) that are descendants of D. Take any \(D'\in \mathcal {D}_j(D)\). If \(N(D)\le m\), then by Lemmas 2.6 and 2.19,

$$\begin{aligned} \mathbb {P}(N(D')\ge 2|\mathcal {F}_k) \le e^{-2^k\beta } \le e^{-\beta n^{1/3}/2}. \end{aligned}$$

Consequently,

$$\begin{aligned} \mathbb {P}(N(D)\le m, \, N(D')\ge 2)&= \mathbb {E}(\mathbb {P}(N(D')\ge 2|\mathcal {F}_k); N(D)\le m)\\&\le e^{-\beta n^{1/3}/2} \mathbb {P}(N(D)\le m)\le e^{-\beta n^{1/3}/2}. \end{aligned}$$

In particular, if E is the event

$$\begin{aligned} \{N(D)\le m \text { and } N(D')\ge 2 \text { for some } D\in \mathcal {D}_k \text { and some } D'\in \mathcal {D}_j(D)\}, \end{aligned}$$

then a union bound gives

$$\begin{aligned} \mathbb {P}(E)&\le \sum _{D\in \mathcal {D}_k}\sum _{D'\in \mathcal {D}_j(D)}\mathbb {P}(N(D)\le m, \, N(D')\ge 2)\nonumber \\&\le |\mathcal {D}_j| e^{-\beta n^{1/3}/2} \le C(\beta ) ne^{-\beta n^{1/3}/2}. \end{aligned}$$
(2.12)

We will need this inequality later.

Now, if n is sufficiently large, then there is a set \(\mathcal {C}'\subseteq \mathcal {D}_j\) that satisfies the conclusions of Lemma 2.18. In particular, \(|\mathcal {C}'|\ge C(U,\beta ) 4^j\). Moreover, since each element of \(\mathcal {C}'\) satisfies (2.9), these cubes must lie entirely within distance \(\sqrt{3}\cdot 2^{-j}\) from \(\partial U\). If n is large enough, then \(\sqrt{3}\cdot 2^{-j}\le \hbox {diam}(U)\). Therefore by the regularity of \(\partial U\), we have \(|\mathcal {C}'|\le C(U,\beta ) 4^j\).

Let \(\mathcal {C}\) denote the set of all members of \(\mathcal {D}_k\) that are ancestors of elements of \(\mathcal {C}'\). By dropping some elements from \(\mathcal {C}'\) if necessary, we can ensure that each member of \(\mathcal {C}\) has exactly one descendant in \(\mathcal {C}'\). Since \(0\le j-k\le C(\beta )\), this gives the inequalities

$$\begin{aligned} C_1(U,\beta ) 4^k\le |\mathcal {C}|=|\mathcal {C}'|\le C_2(U,\beta )4^k, \end{aligned}$$
(2.13)

where \(C_1(U,\beta )\) and \(C_2(U,\beta )\) are positive constants that depend only on U and \(\beta \). Let Q be the union of the elements of \(\mathcal {C}\). Recall that by Lemma 2.18 and the relation between \(\mathcal {C}\) and \(\mathcal {C}'\),

$$\begin{aligned} \hbox {diam}(Q)\le \frac{\hbox {diam}(U)}{3}+2\sqrt{3}\cdot 2^{-k}, \end{aligned}$$

which is less than \(\hbox {diam}(U)/2\) if n is sufficiently large. Thus, if n is large enough and \(\sqrt{3}\cdot 2^{-k} \le \epsilon \le \hbox {diam}(Q)\), then

$$\begin{aligned} \epsilon +\sqrt{3}\cdot 2^{-k}\le 2\epsilon \le 2\,\hbox {diam}(Q)\le \hbox {diam}(U). \end{aligned}$$

Moreover, each point in Q is at distance at most \(\sqrt{3}\cdot 2^{-k}\) from U. Therefore,

$$\begin{aligned} \hbox {Leb}(\partial Q_\epsilon ) \le \hbox {Leb}(\partial U_{\epsilon +\sqrt{3}\cdot 2^{-k}})\le A(U)(\epsilon + \sqrt{3}\cdot 2^{-k} ) \le 2A(U)\epsilon . \end{aligned}$$

On the other hand, if \(0<\epsilon \le \sqrt{3}\cdot 2^{-k}\), then

$$\begin{aligned} \hbox {Leb}(\partial Q_\epsilon )&\le \sum _{D\in \mathcal {C}} \hbox {Leb}(\partial D_\epsilon )\le \sum _{D\in \mathcal {C}} A(D)\epsilon \le C\sum _{D\in \mathcal {C}} 4^{-k}\epsilon = C |\mathcal {C}|4^{-k}\epsilon . \end{aligned}$$

Therefore, by (2.13), for \(0<\epsilon \le \sqrt{3}\cdot 2^{-k}\),

$$\begin{aligned} \hbox {Leb}(\partial Q_\epsilon )\le C(U,\beta )\epsilon . \end{aligned}$$

Combining the two cases, we get \(A(Q)\le C(U,\beta )\). Consequently, by Theorem 2.13,

$$\begin{aligned} \mathrm {Var}(N(Q)) \le C(U,\beta ) n^{2/3}\log n, \end{aligned}$$
(2.14)

provided that n is sufficiently large. Also, by Lemma 2.8 and our choice of k,

$$\begin{aligned} \mathbb {E}(N(Q)) = \hbox {Leb}(Q) n = |\mathcal {C}| 8^{-k} n\ge |\mathcal {C}|. \end{aligned}$$

Thus, by (2.13), (2.14) and Chebychev’s inequality,

$$\begin{aligned} \mathbb {P}\left( \frac{N(Q)}{|\mathcal {C}|}\ge \frac{1}{2}\right) \ge 1-\frac{4\mathrm {Var}(N(Q))}{|\mathcal {C}|^2}\ge 1-C(U,\beta ) n^{-2/3}\log n. \end{aligned}$$
(2.15)

Now let

$$\begin{aligned} a_1&:=\frac{1}{|\mathcal {C}|}\sum _{D\in \mathcal {C}} N(D) = \frac{N(Q)}{|\mathcal {C}|}, \quad a_2 := \frac{1}{|\mathcal {C}|} \sum _{D\in \mathcal {C}}N(D)^2,\\ p_1&:= \frac{|\{D\in \mathcal {C}: N(D)>0\}|}{|\mathcal {C}|},\quad p_2 := \frac{|\{D\in \mathcal {C}: N(D)> m\}|}{|\mathcal {C}|},\\ q&:= \frac{|\{D\in \mathcal {C}: 0<N(D)\le m\}|}{|\mathcal {C}|}. \end{aligned}$$

By (2.11), \(\mathbb {E}(a_2)\le K_2(\beta )\). Thus,

$$\begin{aligned} \mathbb {P}(a_2\ge 2K_2(\beta ))\le \frac{1}{2}. \end{aligned}$$
(2.16)

By the Paley–Zygmund second moment inequality,

$$\begin{aligned} p_1\ge \frac{a_1^2}{a_2}, \end{aligned}$$

and so by (2.15) and (2.16),

$$\begin{aligned} \mathbb {P}\left( p_1 \ge \frac{1}{8K_2(\beta )}\right) \ge \mathbb {P}\left( a_1\ge \frac{1}{2}, \, a_2 \le 2K_2(\beta )\right) \ge \frac{1}{2}-C(U,\beta ) n^{-2/3}\log n. \end{aligned}$$

Choose n so large that the above lower bound is at least 1/3. Next, note that by Lemma 2.8 and Markov’s inequality,

$$\begin{aligned} \mathbb {E}(p_2) \le \frac{1}{m|\mathcal {C}|} \sum _{D\in \mathcal {C}}\mathbb {E}(N(D))\le \frac{8}{m}, \end{aligned}$$

and hence

$$\begin{aligned} \mathbb {P}\left( p_2\ge \frac{32}{m}\right) \le \frac{1}{4}. \end{aligned}$$

Since \(q=p_1-p_2\) and

$$\begin{aligned} \frac{1}{8K_2(\beta )} \ge \frac{64}{m}, \end{aligned}$$

this gives

$$\begin{aligned} \mathbb {P}\left( q\ge \frac{32}{m}\right) \ge \mathbb {P}\left( p_1\ge \frac{64}{m},\, p_2\le \frac{32}{m}\right) \ge \frac{1}{3}-\frac{1}{4}= \frac{1}{12}. \end{aligned}$$
(2.17)

Let \(\mathcal {C}_0\) be the set of all \(D\in \mathcal {C}\) such that \(0<N(D)\le m\). Let \(\mathcal {C}'_0\) be the set of all elements of \(\mathcal {C}'\) that are contained in elements of \(\mathcal {C}_0\). Let

$$\begin{aligned} r := \frac{1}{|\mathcal {C}'_0|} \sum _{D\in \mathcal {C}'_0}N(D) \end{aligned}$$
(2.18)

if \(\mathcal {C}'_0\ne \emptyset \) and let \(r=0\) otherwise. By Lemma 2.7, if \(\mathcal {C}'_0\) is nonempty,

$$\begin{aligned} \mathbb {E}(r|\mathcal {F}_k) = \frac{1}{8^{j-k}|\mathcal {C}_0|} \sum _{D\in \mathcal {C}_0}N(D) \ge C(\beta ), \end{aligned}$$
(2.19)

and by Lemmas 2.6 and 2.7,

$$\begin{aligned} \mathrm {Var}(r|\mathcal {F}_k) \le \frac{C(\beta )}{|\mathcal {C}'_0|} = \frac{C(\beta )}{|\mathcal {C}|q}\le \frac{C(U,\beta )}{n^{2/3}q}. \end{aligned}$$
(2.20)

By the last two inequalities and Chebychev’s inequality, we see that there is a positive constant \(K_3(\beta )\) depending only on \(\beta \) such that if \(q\ge 32/m\) and n is sufficiently large, then

$$\begin{aligned} \mathbb {P}(r\ge K_3(\beta )|\mathcal {F}_k) \ge 1-C(U,\beta )n^{-2/3}. \end{aligned}$$

Therefore by (2.17), if n is sufficiently large,

$$\begin{aligned} \mathbb {P}\left( r\ge K_3(\beta ), \, q\ge \frac{32}{m}\right) \ge \frac{1}{13}. \end{aligned}$$
(2.21)

Thus, since \(|\mathcal {C}_0'|=q|\mathcal {C}|\), the inequalities (2.13) and (2.21) show that for sufficiently large n,

$$\begin{aligned} \mathbb {P}\left( r|\mathcal {C}_0'| \ge K_4(U,\beta ) n^{2/3}\right) \ge \frac{1}{13}, \end{aligned}$$

where \(K_4(U,\beta )\) is a positive constant that depends only on U and \(\beta \).

Now recall the event E defined earlier, and let \(E^c\) denote its complement. If \(E^c\) happens, then every element of \(\mathcal {C}_0'\) contains at most one point, and hence \(|\mathcal {C}^*|=r|\mathcal {C}_0'|\), where

$$\begin{aligned} \mathcal {C}^* := \{D\in \mathcal {C}_0': N(D)=1\}. \end{aligned}$$
(2.22)

Combining this with the bound (2.12) on \(\mathbb {P}(E)\), we see that for sufficiently large n,

$$\begin{aligned} \mathbb {P}(|\mathcal {C}^*| \ge K_4(U,\beta )n^{2/3})&\ge \mathbb {P}(\{r|\mathcal {C}_0'| \ge K_4(U,\beta ) n^{2/3}\}\cap E^c)\nonumber \\&\ge \frac{1}{13}-\mathbb {P}(E)\ge \frac{1}{14}. \end{aligned}$$
(2.23)

By Lemma 2.6, the random variables \(\{N(D\cap U): D\in \mathcal {D}_j\}\) are independent given \(\mathcal {F}_j\). If \(N(D)=1\), then the conditional distribution of \(N(D\cap U)\) given \(\mathcal {F}_j\) is Bernoulli(p(D)), where \(p(D)=\hbox {Leb}(D\cap U)/\hbox {Leb}(D)\). Let

$$\begin{aligned} M := \sum _{D\in \mathcal {C}^*} N(D\cap U). \end{aligned}$$

Since \(10^{-8} \le p(D)\le 1-10^{-8}\) for each \(D\in \mathcal {C}^*\), the Berry–Esseen theorem for sums of independent random variables shows that for any interval I,

$$\begin{aligned} \mathbb {P}(M\in I|\mathcal {F}_j)\le \frac{C(|I|+1)}{\sqrt{|\mathcal {C}^*|}}, \end{aligned}$$
(2.24)

where |I| denotes the length of I. Since

$$\begin{aligned} N(U) = \sum _{D\in \mathcal {D}_j} N(D\cap U) = \sum _{D\in \mathcal {D}_j{\setminus } \mathcal {C}^*}N(D\cap U) + M, \end{aligned}$$

and the two terms in the last expression are independent given \(\mathcal {F}_j\), the inequality (2.24) implies that

$$\begin{aligned} \mathbb {P}(N(U)\in I|\mathcal {F}_j)\le \frac{C(|I|+1)}{\sqrt{|\mathcal {C}^*|}}. \end{aligned}$$

Therefore by (2.23),

$$\begin{aligned} \mathbb {P}(N(U)\in I)\le C(U,\beta )(|I|+1)n^{-1/3} + \frac{13}{14} \end{aligned}$$

if n is sufficiently large. This completes the proof. \(\square \)
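The endgame of this proof rests on the anti-concentration inequality (2.24): a sum of \(|\mathcal {C}^*|\) independent, uniformly nondegenerate Bernoulli variables gives mass at most \(C(|I|+1)/\sqrt{|\mathcal {C}^*|}\) to any interval I. The \(1/\sqrt{N}\) decay is easy to observe directly; the following sketch is ours, with illustrative parameters, and computes exact binomial probabilities rather than invoking the Berry–Esseen theorem.

```python
import math

def log_pmf(N, k, p):
    """log of the Binomial(N, p) probability mass at k."""
    return (math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
            + k * math.log(p) + (N - k) * math.log(1 - p))

def hit_probability(N, p, lo, hi):
    """P(Binomial(N, p) lands in the interval [lo, hi])."""
    return sum(math.exp(log_pmf(N, k, p)) for k in range(lo, hi + 1))

for N in (100, 400, 1600, 6400):
    m = round(0.3 * N)
    # probability of a unit-length interval at the mode: it roughly
    # halves every time N quadruples, matching the 1/sqrt(N) bound
    print(N, round(hit_probability(N, 0.3, m, m + 1), 4))
```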

Finally, let us prove Theorem 1.5. The ingredients are almost all drawn from the proof of Theorem 1.2.

Proof of Theorem 1.5

In this proof, \(C(\beta )\) denotes any positive constant that depends only on \(\beta \), C(f) denotes any positive constant that depends only on f, and \(C(f,\beta )\) denotes any positive constant that depends only on f and \(\beta \). Let k be chosen as in (2.10), and let j be defined from this k as in the proof of Theorem 1.2. Let \(f:[0,1]^3\rightarrow \mathbb {R}\) be a non-constant linear function.

Let \(\mathcal {C}:= \mathcal {D}_k\), and let \(a_1\), \(a_2\), \(p_1\), \(p_2\) and q be defined as in the proof of Theorem 1.2, with this \(\mathcal {C}\). Then \(|\mathcal {C}|=8^k\), and \(a_1 = 8^{-k}n\ge 1\). The inequality (2.16) is still valid, and hence we get

$$\begin{aligned} \mathbb {P}\left( p_1\ge \frac{1}{8K_2(\beta )} \right) \ge \frac{1}{2}. \end{aligned}$$

Proceeding then as in the proof of Theorem 1.2, this gives

$$\begin{aligned} \mathbb {P}\left( q \ge \frac{32}{m}\right) \ge \frac{1}{4}. \end{aligned}$$

Let \(\mathcal {C}_0\) be the set of all \(D\in \mathcal {C}\) for which \(0<N(D)\le m\). Construct a set \(\mathcal {C}_0'\subseteq \mathcal {D}_j\) by choosing exactly one descendant of each element of \(\mathcal {C}_0\) by some arbitrary deterministic rule. Let r be defined as in (2.18). Then (2.12), (2.19) and (2.20) continue to hold, and therefore so does (2.21) when n is sufficiently large. Since \(|\mathcal {C}|=8^k\ge n/8\) in this proof, and \(|\mathcal {C}^*|=r|\mathcal {C}_0'|=rq|\mathcal {C}|\) outside the event E, combining with (2.12) as before shows that for sufficiently large n,

$$\begin{aligned} \mathbb {P}(|\mathcal {C}^*|\ge K_5(\beta )n) \ge \frac{1}{14}, \end{aligned}$$
(2.25)

where \(\mathcal {C}^*\) is defined as in (2.22) and \(K_5(\beta )\) is a positive constant that depends only on \(\beta \).

For each \(D\in \mathcal {D}_j\), let

$$\begin{aligned} X(f,D) := \sum _{i: X_i\in D} f(X_i). \end{aligned}$$

By Lemma 2.6, the random variables \(\{X(f,D): D\in \mathcal {D}_j\}\) are conditionally independent given \(\mathcal {F}_j\). Let

$$\begin{aligned} M := n^{1/3}\sum _{D\in \mathcal {C}^*} X(f, D). \end{aligned}$$

Now take any \(D\in \mathcal {C}^*\). Recall that D contains exactly one point of our point process, and by Lemma 2.6, the conditional distribution of this point given \(\mathcal {F}_j\) is uniform over the cube D. Since f is a linear function, it is easy to see from this observation that for any \(D\in \mathcal {C}^*\), the conditional distribution of the random variable

$$\begin{aligned} n^{1/3}(X(f,D) - \mathbb {E}(X(f,D))) \end{aligned}$$

given \(\mathcal {F}_j\) is actually non-random, and depends only on f. In particular, since f is also non-constant, this shows that

$$\begin{aligned} \mathrm {Var}(n^{1/3}X(f,D)|\mathcal {F}_j) = K_6(f) \end{aligned}$$

and

$$\begin{aligned} \mathbb {E}\bigl (|n^{1/3}X(f,D) - \mathbb {E}(n^{1/3}X(f,D))|^3\bigl |\mathcal {F}_j\bigr )= K_7(f), \end{aligned}$$

where \(K_6(f)\) and \(K_7(f)\) are strictly positive constants that depend only on f. Therefore by the Berry–Esseen theorem, for any interval I,

$$\begin{aligned} \mathbb {P}(M\in I|\mathcal {F}_j)\le \frac{C(f)(|I|+1)}{\sqrt{|\mathcal {C}^*|}}, \end{aligned}$$
(2.26)

where |I| denotes the length of I. Since

$$\begin{aligned} n^{1/3}X(f) = n^{1/3}\sum _{D\in \mathcal {D}_j} X(f,D) = n^{1/3}\sum _{D\in \mathcal {D}_j{\setminus } \mathcal {C}^*}X(f,D) + M, \end{aligned}$$

and the two terms in the last expression are independent given \(\mathcal {F}_j\), the inequality (2.26) implies that

$$\begin{aligned} \mathbb {P}(n^{1/3}X(f)\in I|\mathcal {F}_j)\le \frac{C(f)(|I|+1)}{\sqrt{|\mathcal {C}^*|}}. \end{aligned}$$

Therefore by (2.25),

$$\begin{aligned} \mathbb {P}(n^{1/3}X(f)\in I)\le \frac{C(f,\beta )(|I|+1)}{\sqrt{n}} + \frac{13}{14} \end{aligned}$$

if n is sufficiently large. This completes the proof. \(\square \)
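The pivotal observation in this proof is that for linear f, the fluctuation of X(f,D) within a cube containing exactly one point has a law depending only on the side length of the cube, not on its location. A quick Monte Carlo sanity check (the function f and the cube positions below are our own illustrative choices):

```python
import random

def f(x, y, z):
    return 2 * x - y + 3 * z          # a fixed non-constant linear function

def var_in_cube(corner, side, trials=100_000):
    """Monte Carlo variance of f(X) for X uniform in a cube: for linear f
    this is (4 + 1 + 9) * side^2 / 12, independent of the cube's position."""
    vals = [f(*(c + side * random.random() for c in corner))
            for _ in range(trials)]
    mean = sum(vals) / trials
    return sum((v - mean) ** 2 for v in vals) / trials

random.seed(4)
print(var_in_cube((0.0, 0.0, 0.0), 0.25))     # both close to (14/12) * 0.0625
print(var_in_cube((0.5, 0.25, 0.75), 0.25))   # ~ 0.0729
```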

3 Proofs in 2D and 1D

In this section, we will prove the results of Sect. 1.5. The proofs are similar to the proofs in the 3D case, but there are substantial differences, which is why we need a separate section.

3.1 Notation

All notation will remain the same as in the 3D case. For example, \(\mathcal {D}_k\) will denote dyadic sub-squares of side-length \(2^{-k}\) in 2D, and dyadic sub-intervals of length \(2^{-k}\) in 1D. The main change is that w is now different, namely, \(w(x,y)=k(x,y)\), where k(x,y) is the smallest k such that x and y belong to distinct elements of \(\mathcal {D}_k\). The partition function \(Z(n,\beta )\) and the measure \(\mu _{n,\beta }\) are defined as before, with this new w instead of the old one. We will denote the dimension by d, which may be 1 or 2.

3.2 Preliminary calculations

First, let us carry out the calculations analogous to those done in Sect. 2.2.

Lemma 3.1

For each \(x\in [0,1)^d\),

$$\begin{aligned} \int w(x,y) \, dy = \frac{2^d}{2^d-1}. \end{aligned}$$

Consequently,

$$\begin{aligned} \iint w(x,y) \, dx\, dy = \frac{2^d}{2^d-1}. \end{aligned}$$

Proof

Take any x. For each k, let \(D_k\) be the element of \(\mathcal {D}_k\) that contains x. It is easy to see that the set of all y with \(w(x,y)=k\) is exactly the union of all members of \(\mathcal {D}_k\) that are contained in \(D_{k-1}\), except the one that contains x. The Lebesgue measure of this set is \(2^{-dk} (2^d-1)\). Thus,

$$\begin{aligned} \int w(x,y)\, dy = (2^d-1)\sum _{k=1}^\infty k2^{-dk} = \frac{2^d}{2^d-1}. \end{aligned}$$

The second assertion is obvious from the first. \(\square \)
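Lemma 3.1 admits a simple Monte Carlo check. The sketch below (ours; the sample size is an illustrative choice) computes the hierarchical distance k(x,y) directly from the dyadic cells and estimates the double integral, which should be close to \(2^d/(2^d-1)\), that is, 2 for \(d=1\) and 4/3 for \(d=2\).

```python
import random

def w(x, y):
    """Hierarchical potential: the smallest k at which the points x and y
    fall into distinct dyadic cells of [0,1)^d."""
    k = 0
    while True:
        k += 1
        if any(int(xi * 2**k) != int(yi * 2**k) for xi, yi in zip(x, y)):
            return k

random.seed(0)
for d in (1, 2):
    total = sum(w(tuple(random.random() for _ in range(d)),
                  tuple(random.random() for _ in range(d)))
                for _ in range(100_000))
    print(d, total / 100_000, 2**d / (2**d - 1))
```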

Let us now investigate energy-minimizing configurations of finite size. As before, \(L_n\) will denote the minimum possible energy of a configuration of n points. The following result gives upper and lower bounds for \(L_n\) in dimensions one and two.

Theorem 3.2

There is a positive constant \(C_1\) such that for each \(n\ge 2\),

$$\begin{aligned} {n\atopwithdelims ()2}\frac{2^d}{2^d-1} - C_1n\log n\le L_n \le {n\atopwithdelims ()2}\frac{2^d}{2^d-1}. \end{aligned}$$

Proof

The proof of the upper bound is exactly the same as in Theorem 2.2. For the lower bound, let k be an integer such that

$$\begin{aligned} n^{-1/d}\le 2^{-k}\le 2n^{-1/d}. \end{aligned}$$

Take any configuration of n points. For each \(D\in \mathcal {D}\), let \(n_D\) be the number of points in D. Summing up the contributions to the energy from each dyadic cell, we get

$$\begin{aligned} H_n(x_1,\ldots , x_n)&= \sum _{j=1}^\infty \sum _{D\in \mathcal {D}_j} {n_D\atopwithdelims ()2} + {n\atopwithdelims ()2}\\&\ge \sum _{j=1}^k \sum _{D\in \mathcal {D}_j} {n_D\atopwithdelims ()2} + {n\atopwithdelims ()2}= \frac{1}{2}\sum _{j=1}^k \sum _{D\in \mathcal {D}_j} n_D^2 - \frac{nk}{2} + {n\atopwithdelims ()2}. \end{aligned}$$

By the Cauchy–Schwarz inequality, for each j,

$$\begin{aligned} \sum _{D\in \mathcal {D}_j} n_D^2 \ge \frac{1}{|\mathcal {D}_j|}\left( \sum _{D\in \mathcal {D}_j} n_D\right) ^2 = \frac{n^2}{2^{dj}}. \end{aligned}$$

Thus,

$$\begin{aligned} H_n(x_1,\ldots , x_n) \ge \frac{n^2}{2}\sum _{j=1}^k 2^{-dj} - \frac{nk}{2} + {n\atopwithdelims ()2}= \frac{n^2}{2}\frac{1-2^{-dk}}{2^d-1}- \frac{nk}{2} + {n\atopwithdelims ()2}. \end{aligned}$$

By our choice of k, this completes the proof. \(\square \)
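The first display in the proof above, rewriting \(H_n\) as \(\sum _{j\ge 1}\sum _{D\in \mathcal {D}_j}{n_D\atopwithdelims ()2}+{n\atopwithdelims ()2}\), is an exact identity: a pair of points contributes 1 at every level at which it has not yet split. The following verification sketch is ours; the dimension and configuration size are illustrative choices.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb

def w(x, y, d):
    k = 0
    while True:
        k += 1
        if any(int(x[i] * 2**k) != int(y[i] * 2**k) for i in range(d)):
            return k

def energy_pairwise(pts, d):
    return sum(w(x, y, d) for x, y in combinations(pts, 2))

def energy_by_cells(pts, d):
    """H_n = sum_{j>=1} sum_{D in D_j} C(n_D, 2) + C(n, 2)."""
    total = comb(len(pts), 2)
    j = 0
    while True:
        j += 1
        cells = Counter(tuple(int(c * 2**j) for c in p) for p in pts)
        total += sum(comb(m, 2) for m in cells.values())
        if all(m == 1 for m in cells.values()):
            return total          # all points separated; deeper levels add 0

random.seed(2)
pts = [tuple(random.random() for _ in range(2)) for _ in range(12)]
assert energy_pairwise(pts, 2) == energy_by_cells(pts, 2)
print(energy_pairwise(pts, 2))
```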

3.3 Estimates for the partition function

Recall that for a measurable function \(f: \Sigma _n\rightarrow \mathbb {R}\), its expected value under \(\mu _{n,\beta }\) is denoted by \(\mu _{n,\beta }(f)\).

Lemma 3.3

There is a constant \(C_2\) such that for any \(n\ge 0\) and \(\beta >0\),

$$\begin{aligned} \exp \left( -\frac{2^d\beta n}{2^d-1} \right) \le \frac{Z(n+1, \beta )}{Z(n, \beta )}\le \exp \left( -\frac{2^d\beta n}{2^d-1} + C_2 \beta \log (n+1)\right) . \end{aligned}$$

Proof

The proof of Lemma 2.3 goes through verbatim, the only change being that we need to use Theorem 3.2 instead of Theorem 2.2.\(\square \)

Corollary 3.4

For any \(n\ge 0\), \(\beta >0\), and any \(k\ge -n\),

$$\begin{aligned} \frac{Z(n+k, \beta )}{Z(n,\beta )} \le \exp \left( -\frac{2^d\beta nk}{2^d-1} -\frac{2^d\beta k(k-1)}{2(2^d-1)} + C_2 \beta |k| \log (n+|k|+1)\right) , \end{aligned}$$

where \(C_2\) is the constant from Lemma 3.3.

Proof

Again, the proof of Corollary 2.4 goes through verbatim, except that we need to use Lemma 3.3 instead of Lemma 2.3. \(\square \)

3.4 Proofs of the upper bounds

Let us now fix some \(n\ge 0\) and \(\beta >0\). In the following, \((X_1,\ldots , X_n)\) will denote a random configuration drawn from the measure \(\mu _{n,\beta }\). We will assume that \((X_1,\ldots ,X_n)\) is defined on some abstract probability space \((\Omega , \mathcal {F}, \mathbb {P})\). Expectation, variance and covariance with respect to \(\mathbb {P}\) will be denoted by \(\mathbb {E}\), \(\mathrm {Var}\) and \(\mathrm {Cov}\) respectively.

Lemma 3.5

Let \(D_1,\ldots ,D_{2^d}\) denote the \(2^d\) elements of \(\mathcal {D}_1\), and for each \(1\le i\le 2^d\), let \(N_i := |\{j: X_j\in D_i\}|\). Then for each i, \(\mathbb {E}(N_i)= n/2^d\) and

$$\begin{aligned} \mathrm {Var}(N_i)\le K(\beta ) (\log (n+1))^2, \end{aligned}$$

where \(K(\beta )\) is a non-increasing function of \(\beta \).

Proof

We have already defined the universal constants \(C_1\) and \(C_2\) in the previous subsections. In this proof, further universal constants will be denoted by \(C_3, C_4,\ldots \) as they arise.

The identity \(\mathbb {E}(N_i)=n/2^d\) follows by symmetry. We will now prove the claimed bound on the variance. The cases \(n=0\) and \(n=1\) are trivial, so assume that \(n\ge 2\). As in the proof of Lemma 2.5, we have a recursion for the partition function, although the recursion is slightly different due to the different nature of the potential:

$$\begin{aligned}&Z(n,\beta ) \\&\quad = \sum _{\begin{array}{c} 0\le n_1,\ldots , n_{2^d}\le n\\ n_1+\cdots +n_{2^d}=n \end{array}} \frac{n!}{n_1!n_2!\cdots n_{2^d}!} e^{-\beta \sum _{1\le i<j\le 2^d} n_i n_j}\prod _{i=1}^{2^d} (2^{-dn_i}Z(n_i,\beta )e^{-\beta {n_i\atopwithdelims ()2}})\\&\quad = \sum _{\begin{array}{c} 0\le n_1,\ldots , n_{2^d}\le n\\ n_1+\cdots +n_{2^d}=n \end{array}} \frac{2^{-dn}e^{-\beta {n\atopwithdelims ()2}}n!}{n_1!n_2!\cdots n_{2^d}!} \prod _{i=1}^{2^d} Z(n_i,\beta ). \end{aligned}$$

Moreover, for any \((n_1,\ldots , n_{2^d})\) occurring in the above sum,

$$\begin{aligned} \mathbb {P}(N_1=n_1,\ldots , N_{2^d}=n_{2^d}) = \frac{2^{-dn}e^{-\beta {n\atopwithdelims ()2}}n!}{n_1!n_2!\cdots n_{2^d}!} \frac{\prod _{i=1}^{2^d} Z(n_i,\beta ) }{Z(n,\beta )}. \end{aligned}$$

Choose nonnegative integers \(m_1,\ldots , m_{2^d}\) such that \(m_1+\cdots +m_{2^d}=n\) and \(|m_i-n/2^d|\le 1\) for each i. For convenience, let

$$\begin{aligned} f(n_1,\ldots , n_{2^d}) := \frac{n!}{n_1!n_2!\cdots n_{2^d}!}, \ \ \ h(n_1,\ldots , n_{2^d}) := \prod _{i=1}^{2^d} Z(n_i,\beta ). \end{aligned}$$

Take any \(k_1,\ldots ,k_{2^d}\in \mathbb {Z}\) such that \(k_1+\cdots +k_{2^d}=0\) and \(0\le m_i+k_i\le n\) for each i. Then by Corollary 3.4,

$$\begin{aligned}&\frac{h(m_1+k_1,\ldots , m_{2^d}+k_{2^d})}{h(m_1,\ldots , m_{2^d})} \\&\quad \le \prod _{i=1}^{2^d} \exp \left( -\frac{2^d\beta m_ik_i}{2^d-1} -\frac{2^d\beta k_i(k_i-1)}{2(2^d-1)} + C_2 \beta |k_i| \log (n+|k_i|+1)\right) \\&\quad \le \prod _{i=1}^{2^d} \exp \left( -\frac{2^d\beta (nk_i/2^d - |k_i|)}{2^d-1} -\frac{2^d\beta k_i(k_i-1)}{2(2^d-1)} + 2C_2\beta |k_i|\log n\right) \\&\quad \le \exp \left( -\frac{2^d\beta }{2(2^d-1)}\sum _{i=1}^{2^d} k_i^2 + C_3\beta \log n\sum _{i=1}^{2^d} |k_i|\right) . \end{aligned}$$

Therefore, since \(\frac{2^d}{2(2^d-1)}\ge \frac{2}{3}\) for \(d=1,2\),

$$\begin{aligned}&\frac{\mathbb {P}(N_1=m_1+k_1,\ldots , N_{2^d}=m_{2^d}+k_{2^d})}{\mathbb {P}(N_1=m_1,\ldots , N_{2^d}=m_{2^d})}\\&\quad \le \frac{f(m_1+k_1,\ldots , m_{2^d}+k_{2^d})}{f(m_1,\ldots , m_{2^d})} \exp \left( -\frac{2\beta }{3}\sum _{i=1}^{2^d} k_i^2 + C_3 \beta \log n \sum _{i=1}^{2^d}|k_i|\right) . \end{aligned}$$

This shows that there are positive constants \(C_4\) and \(C_5\) such that if

$$\begin{aligned} \max _{1\le i\le 2^d} |k_i|\ge C_4\log n, \end{aligned}$$

then

$$\begin{aligned}&\frac{\mathbb {P}(N_1=m_1+k_1,\ldots , N_{2^d}=m_{2^d}+k_{2^d})}{\mathbb {P}(N_1=m_1,\ldots , N_{2^d}=m_{2^d})}\\&\quad \le \frac{f(m_1+k_1,\ldots , m_{2^d}+k_{2^d})}{f(m_1,\ldots , m_{2^d})} e^{-C_5\beta (\log n)^2}. \end{aligned}$$

It is now easy to complete the proof by imitating the last part of the proof of Lemma 2.5. \(\square \)
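The recursion at the heart of this proof also gives an exact numerical handle on the model. For \(d=1\), solving the recursion for the two terms in which all n points fall into the same child yields a dynamic program for \(Z(n,\beta )\), from which the exact law of \(N_1\) can be computed. The sketch below is ours (\(\beta \) and the values of n are illustrative; everything is done in logarithms to avoid underflow); in it, the variance of \(N_1\) stays bounded, comfortably below the bound \(K(\beta )(\log (n+1))^2\) of Lemma 3.5 and far below the binomial value n/4 that i.i.d. points would give.

```python
import math
from functools import lru_cache
from math import comb, exp, log

BETA = 1.0    # inverse temperature, an illustrative choice

@lru_cache(maxsize=None)
def logZ(n):
    """log Z(n, BETA) for d = 1, from the recursion in the proof of
    Lemma 3.5, solved for the two all-in-one-child terms."""
    if n <= 1:
        return 0.0
    logA = -n * log(2) - BETA * n * (n - 1) / 2
    terms = [log(comb(n, m)) + logZ(m) + logZ(n - m) for m in range(1, n)]
    M = max(terms)
    log_inner = M + log(sum(exp(t - M) for t in terms))
    return logA + log_inner - math.log1p(-2 * exp(logA))

def split_law(n):
    """Exact law of N_1, via P(N_1 = m) proportional to C(n,m)Z(m)Z(n-m)."""
    logs = [log(comb(n, m)) + logZ(m) + logZ(n - m) for m in range(n + 1)]
    M = max(logs)
    weights = [exp(t - M) for t in logs]
    s = sum(weights)
    return [x / s for x in weights]

for n in (8, 32, 128):
    law = split_law(n)
    mean = sum(m * q for m, q in enumerate(law))
    var = sum((m - mean) ** 2 * q for m, q in enumerate(law))
    print(n, mean, round(var, 4), n / 4)   # mean n/2; var stays O(1) here
```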

For a Borel set \(A\subseteq [0,1)^d\), let X(A) and N(A) be defined as before. Also, define \(\{\mathcal {F}_k\}_{k\ge 0}\) as before.

Lemma 3.6

Conditional on \(\mathcal {F}_k\), the random sets \(\{X(D): D\in \mathcal {D}_k\}\) are mutually independent. Moreover, for any \(D\in \mathcal {D}_k\), conditional on \(\mathcal {F}_k\), X(D) has the same distribution as a scaled version of a point process from the measure \(\mu _{N(D), \beta }\).

Proof

The proof is the same as the proof of Lemma 2.6, except that \(\beta \) need not be replaced by \(2^k \beta \) due to the different nature of the potential. \(\square \)
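Lemma 3.6 is also what makes the 1D and 2D models exactly simulable: conditionally on the counts at one level, the cells evolve independently, and the split of a cell’s points between its children has the law displayed in the proof of Lemma 3.5. The top-down sampler below is our own sketch for \(d=1\), under the same illustrative \(\beta \) as in the previous sketch, whose logZ routine is repeated for self-containment.

```python
import math
import random
from functools import lru_cache
from math import comb, exp, log

BETA = 1.0

@lru_cache(maxsize=None)
def logZ(n):
    if n <= 1:
        return 0.0
    logA = -n * log(2) - BETA * n * (n - 1) / 2
    terms = [log(comb(n, m)) + logZ(m) + logZ(n - m) for m in range(1, n)]
    M = max(terms)
    return logA + M + log(sum(exp(t - M) for t in terms)) - math.log1p(-2 * exp(logA))

def sample(n, lo=0.0, hi=1.0, out=None):
    """Exact top-down sampler for the 1D hierarchical gas on [lo, hi):
    split the n points between the two halves with probability
    proportional to C(n,m)Z(m)Z(n-m), then recurse (Lemma 3.6); a cell
    with a single point places it uniformly."""
    if out is None:
        out = []
    if n == 1:
        out.append(random.uniform(lo, hi))
    elif n >= 2:
        logs = [log(comb(n, m)) + logZ(m) + logZ(n - m) for m in range(n + 1)]
        M = max(logs)
        m = random.choices(range(n + 1), [exp(t - M) for t in logs])[0]
        mid = (lo + hi) / 2
        sample(m, lo, mid, out)
        sample(n - m, mid, hi, out)
    return out

random.seed(3)
counts = [sum(x < 0.5 for x in sample(64)) for _ in range(2000)]
mean = sum(counts) / len(counts)
print(mean, sum((c - mean) ** 2 for c in counts) / len(counts))
# the count in [0, 1/2) has mean 32 and a variance far below the
# binomial value 16 that 64 i.i.d. uniform points would give
```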

Lemma 3.7

If \(D\in \mathcal {D}_k\) and \(D'\) is a child of D, then

$$\begin{aligned} \mathbb {E}(N(D')|\mathcal {F}_k) = \frac{N(D)}{2^d} \end{aligned}$$

and

$$\begin{aligned} \mathrm {Var}(N(D')|\mathcal {F}_k) \le K(\beta ) (\log (N(D)+1))^{2}, \end{aligned}$$

where K is the function from Lemma 3.5.

Proof

The formula for the conditional expectation follows from Lemma 2.6 and symmetry, and the bound on the conditional variance follows from Lemmas 3.5 and 3.6. \(\square \)

Lemma 3.8

For any \(D\in \mathcal {D}\), \(\mathbb {E}(N(D))=\hbox {Leb}(D)n\) and

$$\begin{aligned} \mathrm {Var}(N(D)) \le C(\beta ) (\log (2^d\hbox {Leb}(D)n+3))^2, \end{aligned}$$

where \(C(\beta )\) depends only on \(\beta \).

Proof

Suppose that \(D\in \mathcal {D}_k\). The formula for the expectation follows easily by iterating the formula for the conditional expectation from Lemma 3.7, and observing that \(\hbox {Leb}(D)=2^{-dk}\). Next, let \(D'\) be the parent of D. Then by Lemma 3.7, the formula for expected value, and the concavity of the map \(x\mapsto (\log (x+3))^2\) on the nonnegative axis,

$$\begin{aligned} \mathbb {E}(N(D)^2)&= \mathbb {E}(N(D)^2 - (\mathbb {E}(N(D)|\mathcal {F}_{k-1}))^2) + \mathbb {E}((\mathbb {E}(N(D)|\mathcal {F}_{k-1}))^2)\\&= \mathbb {E}(\mathrm {Var}(N(D)|\mathcal {F}_{k-1})) + 2^{-2d} \mathbb {E}(N(D')^2)\\&\le K(\beta ) \mathbb {E}((\log (N(D')+1))^{2}) + 2^{-2d} \mathbb {E}(N(D')^2)\\&\le K(\beta ) \mathbb {E}((\log (N(D')+3))^{2}) + 2^{-2d} \mathbb {E}(N(D')^2)\\&\le K(\beta ) (\log \mathbb {E}(N(D')+3))^{2} + 2^{-2d} \mathbb {E}(N(D')^2)\\&= K(\beta ) (\log (2^{-d(k-1)}n+3))^{2} + 2^{-2d}\mathbb {E}(N(D')^2). \end{aligned}$$

Iterating this, we get

$$\begin{aligned} \mathbb {E}(N(D)^2) \le K(\beta )\sum _{r=0}^{k-1}(\log (2^{d+rd}\hbox {Leb}(D)n+3))^22^{-2rd} + 2^{-2dk}n^2. \end{aligned}$$

Now note that for any \(r\ge 0\),

$$\begin{aligned} \frac{\log (2^{d+rd}\hbox {Leb}(D)n+3)}{\log (2^{d}\hbox {Leb}(D)n+3)}&\le \frac{\log (2^{d}\hbox {Leb}(D)n+3) + \log 2^{rd}}{\log (2^{d}\hbox {Leb}(D)n+3)}\\&= 1+ \frac{rd\log 2}{\log (2^{d}\hbox {Leb}(D)n+3)}\le 1+ \frac{rd\log 2}{\log 3}. \end{aligned}$$

Thus,

$$\begin{aligned} \mathbb {E}(N(D)^2)&\le K(\beta )(\log (2^d\hbox {Leb}(D)n+3))^2\sum _{r=0}^\infty \left( 1+\frac{rd\log 2}{\log 3}\right) ^22^{-2rd} \\&\quad + 2^{-2dk}n^2\\&\le C(\beta )(\log (2^d\hbox {Leb}(D)n+3))^2 + 2^{-2dk}n^2, \end{aligned}$$

where \(C(\beta )\) depends only on \(\beta \). This completes the proof, since \(\mathbb {E}(N(D)) = \hbox {Leb}(D)n = 2^{-dk}n\). \(\square \)

Now take any nonempty open set \(U\subseteq [0,1)^d\) with regular boundary, and let A(U) be defined as in (2.2). Define \(\mathcal {U}\), \(\mathcal {U}_j\), \(\mathcal {V}_j\) and \(M_j\) as in the 3D case. It is easy to see that Lemmas 2.9, 2.11 and Corollary 2.10 remain valid in the 2D and 1D cases.

Lemma 3.9

For any \(j\ge 1\) such that \(\sqrt{d}\cdot 2^{-j+1}\le \hbox {diam}(U)\),

$$\begin{aligned} \mathrm {Var}(M_j)\le C(\beta ) A(U) (\log (2^{-d(j-1)} n+3))^{2} 2^{(d-1)j} + \mathrm {Var}(M_{j-1}), \end{aligned}$$

where \(C(\beta )\) is a constant that depends only on \(\beta \).

Proof

In this proof, \(C(\beta )\) will denote any constant that depends only on \(\beta \). Equations (2.3) and (2.4) are still valid. If \(D,D'\in \mathcal {U}_j\cup \mathcal {V}_j\) have different parents, then N(D) and \(N(D')\) are conditionally independent by Lemma 3.6, and hence the conditional covariance is zero. Otherwise, Lemma 3.7 and the Cauchy–Schwarz inequality imply that

$$\begin{aligned} |\mathrm {Cov}(N(D), N(D')|\mathcal {F}_{j-1})|\le C(\beta )(\log (N(D'')+1))^{2}, \end{aligned}$$

where \(D''\) is the parent of D and \(D'\). Thus, by Lemma 3.8 and the concavity of the map \(x\mapsto (\log (x+3))^2\) on the nonnegative real axis,

$$\begin{aligned} |\mathbb {E}(\mathrm {Cov}(N(D), N(D')|\mathcal {F}_{j-1}))|&\le C(\beta ) (\log (\hbox {Leb}(D'') n+3))^{2} \\&= C(\beta ) (\log (2^{-d(j-1)} n+3))^{2}. \end{aligned}$$

On the other hand, each \(D\in \mathcal {U}_j\cup \mathcal {V}_j\) has at most \(2^d-1\) siblings that belong to \(\mathcal {U}_j\cup \mathcal {V}_j\). Since \(p(D)2^{-dj} = p(D)\hbox {Leb}(D) = \hbox {Leb}(D\cap U)\), this shows that

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(M_j|\mathcal {F}_{j-1}))&\le C(\beta ) (\log (2^{-d(j-1)} n+3))^{2}\sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j} 2^dp(D)\\&= C(\beta ) (\log (2^{-d(j-1)} n+3))^{2} 2^{d(j+1)}\sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j}\hbox {Leb}(D\cap U). \end{aligned}$$

Note that each element of

$$\begin{aligned} \bigcup _{D\in \mathcal {U}_j\cup \mathcal {V}_j}( D\cap U) \end{aligned}$$

is within distance \(\sqrt{d}\cdot 2^{-j+1}\) of \(\partial U\). Since \(\sqrt{d}\cdot 2^{-j+1}\le \hbox {diam}(U)\), inequality (2.2) gives

$$\begin{aligned} \sum _{D\in \mathcal {U}_j\cup \mathcal {V}_j}\hbox {Leb}(D\cap U)\le A(U)\sqrt{d}\cdot 2^{-j+1}. \end{aligned}$$

Consequently,

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(M_j|\mathcal {F}_{j-1})) \le C(\beta ) A(U) (\log (2^{-d(j-1)} n+3))^{2} 2^{(d-1)j}, \end{aligned}$$

where \(C(\beta )\) depends only on \(\beta \). The proof is completed by plugging this bound into (2.3). \(\square \)

We now have all the ingredients for proving the following analog of Theorem 2.13.

Theorem 3.10

(Hyperuniformity at all scales in 2D and 1D) Let U and N(U) be as in Theorem 1.6. Suppose that \(\hbox {diam}(U)\ge n^{-1/d}\). Let A(U) be the constant defined in (2.2). Then

$$\begin{aligned} \mathbb {E}(N(U)) = \hbox {Leb}(U)n \end{aligned}$$

and

$$\begin{aligned} \mathrm {Var}(N(U))\le C(\beta )(A(U)n^{(d-1)/d}+1) (\log (7\hbox {diam}(U)^d n))^2, \end{aligned}$$

where \(C(\beta )\) is a constant that depends only on \(\beta \).

Proof

Throughout this proof, \(C(\beta )\) will denote any constant that depends only on \(\beta \). The value of \(C(\beta )\) may change from line to line or even within a line.

The formula for the expectation follows from the d-dimensional version of Corollary 2.10. It remains to prove the variance bound. Choose k such that

$$\begin{aligned} \frac{1}{2}n^{-1/d}\le \sqrt{d} \cdot 2^{-k}\le n^{-1/d}. \end{aligned}$$

Equation (2.5) remains valid, as does the inequality

$$\begin{aligned} \mathrm {Var}(N(U)|\mathcal {F}_k) \le \sum _{D\in \mathcal {V}_k} p(D) N(D)^2. \end{aligned}$$

By Lemma 3.8 and our choice of k,

$$\begin{aligned} \mathbb {E}(N(D)^2)\le C(\beta ) (\log (2^d\hbox {Leb}(D)n+3))^2 + \hbox {Leb}(D)^2n^2\le C(\beta ) \end{aligned}$$

for all \(D\in \mathcal {V}_k\). Note that each element of

$$\begin{aligned} \bigcup _{D\in \mathcal {V}_k} (D\cap U) \end{aligned}$$

is within distance \(\sqrt{d}\cdot 2^{-k}\) of \(\partial U\), and \(p(D) 2^{-dk} = \hbox {Leb}(D\cap U)\). Since \(\sqrt{d}\cdot 2^{-k}\le n^{-1/d}\le \hbox {diam}(U)\) by our choice of k, this gives

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(N(U)|\mathcal {F}_k))&\le C(\beta )2^{dk}\sum _{D\in \mathcal {V}_k} \hbox {Leb}(D\cap U) \\&\le C(\beta ) 2^{dk}A(U) 2^{-k}\le C(\beta )A(U)n^{(d-1)/d}. \end{aligned}$$

Let l be the smallest integer such that \(\sqrt{d}\cdot 2^{-l} \le \hbox {diam}(U)\). Note that \(l\le k\). Together with (2.5) and Lemma 3.9, the above inequality shows that

$$\begin{aligned}&\mathrm {Var}(N(U))\le C(\beta )A(U)\sum _{j=l+1}^k(\log (2^{-d(j-1)} n+3))^{2} 2^{(d-1)j} + \mathrm {Var}(M_l)\\&\quad \le C(\beta )A(U)(\log (2^d\hbox {diam}(U)^d n+3))^{2}\sum _{j=l+1}^k 2^{(d-1)j} + \mathrm {Var}(M_l)\\&\quad \le C(\beta )A(U)n^{(d-1)/d} (\log (2^d\hbox {diam}(U)^d n+3))^{2} + \mathrm {Var}(M_l). \end{aligned}$$

By the definition of l, \(\mathcal {U}_i\) is empty for all \(i<l\). Therefore

$$\begin{aligned} M_l = \sum _{D\in \mathcal {U}_l\cup \mathcal {V}_l} p(D)N(D). \end{aligned}$$

Note that for any \(D\in \mathcal {U}_l\cup \mathcal {V}_l\), Lemma 3.8 gives

$$\begin{aligned} \mathrm {Var}(p(D)N(D))&= p(D)^2\mathrm {Var}(N(D))\le C(\beta )(\log (2^d\hbox {Leb}(D)n+3))^2\\&= C(\beta )(\log (2^d 2^{-dl} n + 3))^2\\&\le C(\beta )(\log (2^d \hbox {diam}(U)^d n + 3))^2. \end{aligned}$$

Moreover, it is easy to see that U intersects at most \(2^d\) members of \(\mathcal {D}_l\), and therefore \(|\mathcal {U}_l \cup \mathcal {V}_l|\le 2^d\). From these observations, we get

$$\begin{aligned} \mathrm {Var}(M_l)\le C(\beta )(\log (2^d \hbox {diam}(U)^d n + 3))^2. \end{aligned}$$

This completes the proof of the theorem. \(\square \)

Proofs of Theorems 1.6 and 1.8

These are consequences of Theorem 3.10 in the same way as Theorems 1.1 and 1.3 followed from Theorem 2.13. \(\square \)

Finally, let us prove Theorem 1.9.

Proof of Theorem 1.9

The proof is very similar to the proof of Theorem 1.4, with minor modifications. As usual, \(C(\beta )\) denotes any constant that depends only on \(\beta \). Define f(D) and \(W_k\) as in the proof of Theorem 1.4. Then \(W_k\) is again a martingale, and Eq. (2.6) is still valid. Now choose k such that

$$\begin{aligned} n^{-1/d}\le 2^{-k}\le 2n^{-1/d}. \end{aligned}$$

Then (2.7) continues to hold. Take any j. For each \(D\in \mathcal {D}_{j-1}\), let c(D) denote the set of \(2^d\) children of D. Proceeding as in the proof of Theorem 1.4, we get

$$\begin{aligned}&\mathrm {Var}(X(f_j)|\mathcal {F}_{j-1}) \\&\quad = \sum _{D\in \mathcal {D}_{j-1}} \mathbb {E}\left( \left( \sum _{D'\in c(D)} f(D')N(D') - f(D)N(D)\right) ^2\biggl |\mathcal {F}_{j-1}\right) . \end{aligned}$$

Now notice that for any \(D\in \mathcal {D}_{j-1}\),

$$\begin{aligned}&\sum _{D'\in c(D)} f(D')N(D') - f(D)N(D) \\&\quad = \sum _{D'\in c(D)} (f(D')- f(D))\left( N(D')-\frac{N(D)}{2^d}\right) . \end{aligned}$$

Recall that L is the Lipschitz constant of f. For any \(D'\in c(D)\),

$$\begin{aligned} |f(D')-f(D)|\le \sqrt{d} L 2^{-j+1}. \end{aligned}$$

As in the proof of Theorem 1.4,

$$\begin{aligned}&\left( \sum _{D'\in c(D)} (f(D')- f(D))\left( N(D')-\frac{N(D)}{2^d}\right) \right) ^2 \\&\quad \le 4^{-j+3}L^2\sum _{D'\in c(D)} \left( N(D')-\frac{N(D)}{2^d}\right) ^2. \end{aligned}$$

Therefore, by Lemma 3.7,

$$\begin{aligned}&\mathbb {E}\left( \left( \sum _{D'\in c(D)} f(D')N(D') - f(D)N(D)\right) ^2\biggl |\mathcal {F}_{j-1}\right) \\&\quad \le 4^{-j+3} L^2 \sum _{D'\in c(D)}\mathrm {Var}(N(D')|\mathcal {F}_{j-1})\\&\quad \le 4^{-j+4} L^2 K(\beta )(\log (N(D)+1))^2\le 4^{-j+4}L^2K(\beta )(\log (n+1))^2. \end{aligned}$$

Consequently,

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(X(f_j)|\mathcal {F}_{j-1})) \le C(\beta ) L^2(\log n)^24^{-j}|\mathcal {D}_{j-1}|\le C(\beta )L^22^{(d-2)j} (\log n)^2. \end{aligned}$$

For \(D\in \mathcal {D}_k\), let s(D) be defined as in the proof of Theorem 1.4. Then as before, we have

$$\begin{aligned} \mathrm {Var}(X(f)|\mathcal {F}_k) \le \sum _{D\in \mathcal {D}_k}\mathbb {E}((s(D)-f(D)N(D))^2|\mathcal {F}_k). \end{aligned}$$

By the Lipschitz condition,

$$\begin{aligned} |s(D)-f(D)N(D)|\le \sqrt{d}L 2^{-k}N(D) \end{aligned}$$

for each \(D\in \mathcal {D}_k\). Thus, by Lemma 3.8 and our choice of k,

$$\begin{aligned} \mathbb {E}((s(D)-f(D)N(D))^2) \le 4^{-k+1}L^2 \mathbb {E}(N(D)^2)\le C(\beta )L^2 4^{-k}. \end{aligned}$$

Consequently,

$$\begin{aligned} \mathbb {E}(\mathrm {Var}(X(f)|\mathcal {F}_k)) \le C(\beta )L^24^{-k}|\mathcal {D}_k|\le C(\beta ) L^22^{(d-2)k}\le C(\beta )L^2 n^{(d-2)/d}. \end{aligned}$$

The proof is now easily completed by combining the steps. \(\square \)

3.5 Proofs of the lower bounds

Let us now prove Theorem 1.7. The proof is similar to the proof of Theorem 1.2, but with some significant changes due to the different nature of the potential. Let

$$\begin{aligned} \mathcal {T}:= \{z+[0,1)^2: z\in \mathbb {Z}^2\}. \end{aligned}$$

Lemma 3.11

Let \(\mathcal {T}\) be as above. Take any \(D\in \mathcal {T}\) and any \(x\in D\). Let \(\delta \) be the distance of x from the boundary of D. Then any line through x bifurcates D into two parts, each of which has area at least \(\pi \delta ^2/2\).

Proof

Same as the proof of Lemma 2.14. \(\square \)

Lemma 3.12

Take any \(x\in \mathbb {R}^2\) and a unit vector \(u = (u_1,u_2)\in S^1\). Let L be the line that contains x and is perpendicular to u. Suppose that

$$\begin{aligned} \min \{|u_1|, |u_2|\} \ge 0.1. \end{aligned}$$
(3.1)

Then there is an element \(D\in \mathcal {T}\), within Euclidean distance \(\sqrt{101}\) from x, which is bifurcated by the line L in such a way that each part has area at least \(6\times 10^{-5}\).

Proof

Take any \(x = (x_1,x_2)\in \mathbb {R}^2\) and \(u=(u_1,u_2)\in S^1\) as in the statement of the lemma. Let \(L_0\) be the line with normal vector u that contains the origin. Define

$$\begin{aligned} y_1= \hbox {sign}(u_1), \ \ y_2 = -\frac{|u_1|}{u_2}. \end{aligned}$$

Then \(y= (y_1,y_2)\in L_0\). Also, we have \(|y_1|=1\), and by condition (3.1) and the fact that \(|u_2|\le 1\),

$$\begin{aligned} |y_2| = \frac{|u_1|}{|u_2|} \ge |u_1|\ge 0.1. \end{aligned}$$

Now consider the set

$$\begin{aligned} I_1 = \{x_1 + \alpha y_1: 0\le \alpha \le 1\}. \end{aligned}$$

Since \(|y_1|=1\), \(I_1\) is an interval of length 1. By Lemma 2.15, \(I_1\) has a subinterval \(I_2\) of length 0.25 such that any integer is at a distance at least 0.25 from \(I_2\). Moreover, since \(|y_1|= 1\), \(I_2\) is of the form

$$\begin{aligned} \{x_1+\alpha y_1: a\le \alpha \le b\}, \end{aligned}$$

where \(b-a= 0.25\). Let

$$\begin{aligned} I_3 := \{x_2+\alpha y_2: a\le \alpha \le b\}. \end{aligned}$$

Since \(|y_2|\ge 0.1\), \(I_3\) has length at least 0.025. Thus by Lemma 2.15, \(I_3\) contains a subinterval \(I_4\) of length 0.00625 such that any integer is at a distance at least 0.00625 from \(I_4\).

In particular, there is some \(\alpha \in [0,1]\) such that \(x_1+\alpha y_1\in I_2\) and \(x_2+\alpha y_2\in I_4\). The distance of \(x_i+\alpha y_i\) from the nearest integer is at least 0.00625 for each i. Thus, the distance of the point \(x+\alpha y\) from the boundary of the square \(D\in \mathcal {T}\) that contains \(x+\alpha y\) is at least 0.00625. By Lemma 3.11 and the fact that \(x+\alpha y\in L\), this proves that L bifurcates D into two parts, each of which has area at least \(6\times 10^{-5}\). Lastly, note that

$$\begin{aligned} |(x+\alpha y)-x|\le |y| = \sqrt{y_1^2+y_2^2}\le \sqrt{1 + \frac{1}{0.1^2}}\le \sqrt{101}, \end{aligned}$$

since \(|u_1|\le 1\) and \(|u_2|\ge 0.1\). This completes the proof of the lemma. \(\square \)

Now recall that the boundary of the set U in the statement of Theorem 1.7 is a simple smooth closed curve. In particular, we can choose a unit normal vector u(x) at each \(x\in \partial U\) such that the map \(x\mapsto u(x)\) is smooth.

Lemma 3.13

Take any \(x\in \partial U\) such that the normal vector u(x) satisfies (3.1). Then there is some \(j_0\) depending only on U (but not on x), such that for all \(j\ge j_0\), there is some \(D\in \mathcal {D}_j\) at distance at most \(\sqrt{101} \cdot 2^{-j}\) from x, such that

$$\begin{aligned} 10^{-5}\le \frac{\hbox {Leb}(D\cap U)}{\hbox {Leb}(D)}\le 1- 10^{-5}. \end{aligned}$$
(3.2)

Proof

Same as the proof of Lemma 2.17, using Lemma 3.12 instead of Lemma 2.16. \(\square \)

Lemma 3.14

There is some \(K_1>0\) and some \(j_1\ge 1\) depending only on U such that for any \(j\ge j_1\), there is a set of at least \(K_1 2^j\) squares \(D\in \mathcal {D}_j\) that satisfy (3.2) and the union of these squares has diameter at most \(\hbox {diam}(U)/3\).

Proof

Same as the proof of Lemma 2.18, with a small adjustment for dimension that replaces \(K_14^j\) by \(K_12^j\). \(\square \)

Lemma 3.15

Take any \(n\ge 1\) and \(\beta >0\), and a Borel set \(A\subseteq [0,1)^2\) with \(0<\hbox {Leb}(A)<1\). Let \(\delta >0\) be a number such that \(\delta \le \hbox {Leb}(A)\le 1-\delta \). Then

$$\begin{aligned} c(\beta , n,\delta )\le \mathrm {Var}(N(A)) \le n^2, \end{aligned}$$

where \(c(\beta , n,\delta )\) is a positive real number that depends only on \(\beta \), n and \(\delta \).

Proof

The upper bound is trivial since \(N(A)\le n\). For the lower bound, the case \(n=1\) is easy, since in that case N(A) is a Bernoulli\((\hbox {Leb}(A))\) random variable. So let us take \(n\ge 2\). Trivially, \(Z(n,\beta )\le 1\). Therefore, by Jensen’s inequality and Lemma 3.1,

$$\begin{aligned} \mathbb {P}(N(A)=n)&=\mu _{n,\beta }(A^n) \ge \int _{A^n} e^{-\beta H_n(x_1,\ldots ,x_n)}\, dx_1\cdots dx_n \\&\ge \hbox {Leb}(A^n) \exp \left( -\frac{\beta }{\hbox {Leb}(A^n)} \int _{A^n}H_n(x_1,\ldots ,x_n)\, dx_1\cdots dx_n \right) \\&\ge \hbox {Leb}(A^n)\exp \left( -\frac{4\beta }{3\hbox {Leb}(A^n)}{n \atopwithdelims ()2}\right) \\&\ge \delta ^n\exp \left( -\frac{4\beta }{3\delta ^n}{n \atopwithdelims ()2}\right) . \end{aligned}$$

Similarly, if \(B= [0,1)^2{\setminus } A\), then

$$\begin{aligned} \mathbb {P}(N(A)=0)=\mu _{n,\beta }(B^n)\ge \delta ^n\exp \left( -\frac{4\beta }{3\delta ^n}{n \atopwithdelims ()2}\right) . \end{aligned}$$

With the two lower bounds derived above, it is now easy to complete the proof: since \(\max \{\mathbb {E}(N(A)),\, n-\mathbb {E}(N(A))\}\ge n/2\), on at least one of the events \(\{N(A)=0\}\) and \(\{N(A)=n\}\) the variable N(A) differs from its mean by at least n/2, and therefore \(\mathrm {Var}(N(A))\ge \frac{n^2}{4}\,\delta ^n\exp \left( -\frac{4\beta }{3\delta ^n}{n \atopwithdelims ()2}\right) \). \(\square \)

Proof of Theorem 1.7

In this proof, as in the proof of Theorem 1.2, the phrase ‘n sufficiently large’ will mean ‘\(n\ge n_0\), where \(n_0\) depends only on U and \(\beta \)’. Also, C will denote any positive universal constant, \(C(\beta )\) will denote any positive constant that depends only on \(\beta \), and \(C(U,\beta )\) will denote any positive constant that depends only on U and \(\beta \).

Choose k such that

$$\begin{aligned} n^{-1/2}\le 2^{-k} \le 2n^{-1/2}. \end{aligned}$$

Then for any \(D\in \mathcal {D}_k\), Lemma 3.8 gives

$$\begin{aligned} \mathbb {E}(N(D)^2) \le L_1(\beta ), \end{aligned}$$
(3.3)

where \(L_1(\beta )\) is a positive integer that depends only on \(\beta \). Let

$$\begin{aligned} m := 1000 L_1(\beta ). \end{aligned}$$

If n is sufficiently large, then there is a set \(\mathcal {C}\subseteq \mathcal {D}_k\) that satisfies the conclusions of Lemma 3.14. In particular, arguing as in the proof of Theorem 1.2, we get

$$\begin{aligned} C_1(U,\beta ) 2^k\le |\mathcal {C}|\le C_2(U,\beta )2^k, \end{aligned}$$
(3.4)

where \(C_1(U,\beta )\) and \(C_2(U,\beta )\) are positive constants that depend only on U and \(\beta \). Let Q be the union of the elements of \(\mathcal {C}\). Proceeding as in the proof of Theorem 1.2, and using Theorem 3.10 instead of Theorem 2.13, we get

$$\begin{aligned} \mathrm {Var}(N(Q)) \le C(U,\beta ) n^{1/2}(\log n)^2, \end{aligned}$$
(3.5)

provided that n is sufficiently large. Also, by Lemma 3.8 and our choice of k,

$$\begin{aligned} \mathbb {E}(N(Q)) = \hbox {Leb}(Q) n = |\mathcal {C}| 4^{-k} n\ge |\mathcal {C}|. \end{aligned}$$

Thus, by (3.4), (3.5) and Chebychev’s inequality,

$$\begin{aligned} \mathbb {P}\left( \frac{N(Q)}{|\mathcal {C}|}\ge \frac{1}{2}\right) \ge 1-\frac{4\mathrm {Var}(N(Q))}{|\mathcal {C}|^2}\ge 1-C(U,\beta ) n^{-1/2}(\log n)^2. \end{aligned}$$
(3.6)

Let \(a_1\), \(a_2\), \(p_1\), \(p_2\) and q be defined as in the proof of Theorem 1.2. By the inequality (3.3), \(\mathbb {E}(a_2)\le L_1(\beta )\). Thus,

$$\begin{aligned} \mathbb {P}(a_2\ge 2L_1(\beta ))\le \frac{1}{2}. \end{aligned}$$
(3.7)

By the Paley–Zygmund second moment inequality,

$$\begin{aligned} p_1\ge \frac{a_1^2}{a_2}, \end{aligned}$$

and so by (3.6) and (3.7),

$$\begin{aligned} \mathbb {P}\left( p_1 \ge \frac{1}{8L_1(\beta )}\right)&\ge \mathbb {P}\left( a_1\ge \frac{1}{2}, \, a_2 \le 2L_1(\beta )\right) \ge \frac{1}{2}-C(U,\beta ) n^{-1/2}(\log n)^2. \end{aligned}$$

Choose n so large that the above lower bound is at least 1/3. Next, note that by Lemma 3.8 and Markov’s inequality,

$$\begin{aligned} \mathbb {E}(p_2) \le \frac{1}{m|\mathcal {C}|} \sum _{D\in \mathcal {C}}\mathbb {E}(N(D))\le \frac{4}{m}, \end{aligned}$$

and hence

$$\begin{aligned} \mathbb {P}\left( p_2\ge \frac{16}{m}\right) \le \frac{1}{4}. \end{aligned}$$

Since \(q=p_1-p_2\) and

$$\begin{aligned} \frac{1}{8L_1(\beta )} \ge \frac{32}{m}, \end{aligned}$$

we get

$$\begin{aligned} \mathbb {P}\left( q\ge \frac{16}{m}\right) \ge \mathbb {P}\left( p_1\ge \frac{32}{m},\, p_2\le \frac{16}{m}\right) \ge \frac{1}{3}-\frac{1}{4}= \frac{1}{12}. \end{aligned}$$

Let \(\mathcal {C}_0\) be the set of all \(D\in \mathcal {C}\) such that \(0<N(D)\le m\). The above inequality and (3.4) show that if n is sufficiently large, then

$$\begin{aligned} \mathbb {P}(|\mathcal {C}_0| \ge L_2(U,\beta )n^{1/2})\ge \frac{1}{13}, \end{aligned}$$
(3.8)

where \(L_2(U,\beta )\) is a positive constant that depends only on U and \(\beta \). By Lemma 3.6, the random variables \(\{N(D\cap U): D\in \mathcal {D}_k\}\) are independent given \(\mathcal {F}_k\). Moreover, for each \(D\in \mathcal {C}_0\), \(N(D\cap U)\le m\le C(\beta )\), and by Lemma 3.6 and Lemma 3.15,

$$\begin{aligned} \mathrm {Var}(N(D\cap U)|\mathcal {F}_k) \ge L_3(U,\beta ), \end{aligned}$$

where \(L_3(U,\beta )\) is a positive constant that depends only on U and \(\beta \). (This is the crucial difference from the proof of Theorem 1.2: the scale invariance that the model enjoys in dimension two is not available in dimension three.) Thus, if we let

$$\begin{aligned} M := \sum _{D\in \mathcal {C}_0} N(D\cap U), \end{aligned}$$

then the Berry–Esseen theorem shows that for any interval I,

$$\begin{aligned} \mathbb {P}(M\in I|\mathcal {F}_k)\le \frac{C(U,\beta )(|I|+1)}{\sqrt{|\mathcal {C}_0|}}, \end{aligned}$$
(3.9)

where |I| denotes the length of I. Since

$$\begin{aligned} N(U) = \sum _{D\in \mathcal {D}_k} N(D\cap U) = \sum _{D\in \mathcal {D}_k{\setminus } \mathcal {C}_0}N(D\cap U) + M, \end{aligned}$$

and the two terms in the last expression are independent given \(\mathcal {F}_k\), the inequality (3.9) implies that

$$\begin{aligned} \mathbb {P}(N(U)\in I|\mathcal {F}_k)\le \frac{C(U,\beta )(|I|+1)}{\sqrt{|\mathcal {C}_0|}}. \end{aligned}$$

Therefore by (3.8),

$$\begin{aligned} \mathbb {P}(N(U)\in I)\le C(U,\beta )(|I|+1)n^{-1/4} + \frac{12}{13} \end{aligned}$$

if n is sufficiently large. This completes the proof. \(\square \)