1 Introduction

The starting point of this research is a simplified version of Lemma 5 in [15] which gives an upper bound of higher norms of the discrepancy of a random set of points in the unit square \([0,1]^2\), treated as a torus. Let \(N=M^2\) and consider a random set of N points \(\mathcal P\) as follows: Split the unit square into N small squares \(\{S_j\}_{j=1}^N\) of area \(N^{-1}\) in the usual way. In each small square \(S_j\) there is a random point \(x_j\), uniformly distributed in the small square, independently of the distribution of all the other random points in the other small squares.
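The construction of \(\mathcal P\) can be sketched in a few lines of code. This is only an illustration: the function name `jittered_points`, the row-by-row ordering of the small squares, and the fixed seed are assumptions made for the example and do not appear in [15].

```python
import random

def jittered_points(M, rng):
    """Return N = M*M points, one uniform point in each cell of the M x M grid on [0,1]^2."""
    points = []
    for row in range(M):
        for col in range(M):
            # Cell S_j with j = row*M + col is [col/M, (col+1)/M) x [row/M, (row+1)/M), of area 1/N.
            x = (col + rng.random()) / M
            y = (row + rng.random()) / M
            points.append((x, y))
    return points

rng = random.Random(0)   # fixed seed, only to make the example reproducible
M = 4
pts = jittered_points(M, rng)
```

Each point is drawn independently of the others, which is the only probabilistic property used in the argument that follows.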

Suppose that \(\mathcal B\) is a convex set in \([0,1]^2\). Let \(\mathcal J\) denote the set of all values of j for which the small square \(S_j\) intersects the boundary \(\partial \mathcal B\) of \(\mathcal B\). Then it is easy to see that the cardinality \(|\mathcal J|\) of \(\mathcal J\) is O(M). For each \(j\in \mathcal J\), write

$$\begin{aligned} \xi _j=\chi _{\mathcal B}(x_j)=\left\{ \begin{array}{ll} 1 &{} \quad \mathrm{if }x_j\in \mathcal B,\\ 0 &{} \quad \mathrm{otherwise}, \end{array} \right. \end{aligned}$$

and let \(\eta _j=\xi _j-\mathbb E\xi _j\). Then, \(|\eta _j|\leqslant 1\) and \(\mathbb E\eta _j=0\). Furthermore if we define the discrepancy as

$$\begin{aligned} D[\mathcal P, \mathcal B]:=\frac{1}{N}\sum _{j=1}^N\chi _\mathcal B(x_j)-|\mathcal B|, \end{aligned}$$
(1)

then

$$\begin{aligned} D[\mathcal P, \mathcal B]=\frac{1}{N}\sum _{j\in \mathcal J}\eta _j. \end{aligned}$$
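Since only the squares indexed by \(\mathcal J\) contribute and \(|\eta _j|\leqslant 1\), the identity above gives the deterministic bound \(|D[\mathcal P,\mathcal B]|\leqslant |\mathcal J|/N\). The following sketch checks this for a disc contained in the unit square (no wrapping around the torus). The helper names, the choice of disc, and the seed are assumptions made for the illustration.

```python
import math
import random

def jittered_points(M, rng):
    # One uniform point per cell, cells ordered row by row (j = row*M + col).
    return [((c + rng.random()) / M, (r + rng.random()) / M)
            for r in range(M) for c in range(M)]

def boundary_cells(M, cx, cy, radius):
    """Indices j of the closed grid cells that meet the circle of the given centre and radius."""
    cells = []
    for r in range(M):
        for c in range(M):
            x0, x1, y0, y1 = c / M, (c + 1) / M, r / M, (r + 1) / M
            # Smallest and largest distance from the centre to the cell.
            dx_near = max(x0 - cx, 0.0, cx - x1)
            dy_near = max(y0 - cy, 0.0, cy - y1)
            dx_far = max(abs(x0 - cx), abs(x1 - cx))
            dy_far = max(abs(y0 - cy), abs(y1 - cy))
            if math.hypot(dx_near, dy_near) <= radius <= math.hypot(dx_far, dy_far):
                cells.append(r * M + c)
    return cells

def disc_discrepancy(points, cx, cy, radius):
    """Discrepancy (1) for a disc that lies entirely inside the unit square."""
    inside = sum(1 for (x, y) in points if math.hypot(x - cx, y - cy) <= radius)
    return inside / len(points) - math.pi * radius ** 2

M = 32
N = M * M
cx, cy, radius = 0.5, 0.5, 0.3          # hypothetical test disc
pts = jittered_points(M, random.Random(1))
J = boundary_cells(M, cx, cy, radius)
D = disc_discrepancy(pts, cx, cy, radius)
```

For this disc the number of boundary cells grows linearly in M, in agreement with \(|\mathcal J|=O(M)\).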

We now want to estimate \(\mathbb E(|D[\mathcal P,\mathcal B]|^p)\) from above, where p is an even positive integer. Note first that

$$\begin{aligned} |D[\mathcal P,\mathcal B]|^p=\frac{1}{N^p}\sum _{j_1\in \mathcal J}\cdots \sum _{j_p\in \mathcal J}\eta _{j_1}\ldots \eta _{j_p} \end{aligned}$$

and so

$$\begin{aligned} \mathbb E(|D[\mathcal P,\mathcal B]|^p)=\frac{1}{N^p}\sum _{j_1\in \mathcal J}\cdots \sum _{j_p\in \mathcal J}\mathbb E(\eta _{j_1}\ldots \eta _{j_p}) \end{aligned}$$
(2)

The random variables \(\eta _j\), where \(j\in \mathcal J\), are independent because the distributions of the random points are independent of each other. If one of \(j_1,\ldots , j_p\), say \(j_i\), is different from all the others, then

$$\begin{aligned} \mathbb E(\eta _{j_1}\ldots \eta _{j_p})=\mathbb E(\eta _{j_i})\mathbb E(\eta _{j_1}\ldots \eta _{j_{i-1}}\eta _{j_{i+1}}\ldots \eta _{j_p})=0. \end{aligned}$$

It follows that the only non-zero contribution to the sum (2) comes from those terms where each of \(j_1,\ldots ,j_p\) appears more than once. It can be shown that the major contribution comes when they appear in pairs, and there are

$$\begin{aligned} O_p\left( {{|\mathcal J|}\atopwithdelims (){p/2}}\right) =O_p({|\mathcal J|}^{p/2})=O_p({M^{p/2}})=O_p({N^{p/4}}) \end{aligned}$$

such terms. Bounding each of these terms \(\mathbb E(\eta _{j_1}\ldots \eta _{j_p})\) trivially by 1, we obtain the estimate

$$\begin{aligned} \mathbb E(|D[\mathcal P,\mathcal B]|^p)=O_p(N^{-3p/4}). \end{aligned}$$
(3)
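The arithmetic behind (3) can be spelled out as follows. This is only a restatement of the counting above, using \(|\mathcal J|=O(M)=O(N^{1/2})\), with the terms of higher multiplicity absorbed into the constant. In the case \(p=2\) the cross terms vanish by independence and the computation is exact.

$$\begin{aligned} \mathbb E(|D[\mathcal P,\mathcal B]|^p)&\leqslant \frac{1}{N^p}\,O_p\left( |\mathcal J|^{p/2}\right) =N^{-p}\,O_p\left( N^{p/4}\right) =O_p\left( N^{-3p/4}\right) ,\\ \mathbb E(|D[\mathcal P,\mathcal B]|^2)&=\frac{1}{N^2}\sum _{j\in \mathcal J}\mathbb E\left( \eta _j^2\right) \leqslant \frac{|\mathcal J|}{N^2}=O\left( N^{-3/2}\right) . \end{aligned}$$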

This result is the first step towards the proof of the existence of point sets with small \(L^p\) discrepancy with respect to, say, all discs in the square. Indeed, let \(B(x,r)\) be the ball centered at the point x with radius r. Then, an application of the above estimate to the sets \(B(x,r)\) gives

$$\begin{aligned} \int _0^{1/2}\int _{[0,1]^2}\mathbb E(|D[\mathcal P,B(x,r)]|^p)\mathrm{d}x\,\mathrm{d}r=O_p(N^{-3p/4}) \end{aligned}$$
(4)

and Fubini’s theorem immediately implies the existence of an N-point set \(\mathcal P\) such that

$$\begin{aligned} \left( \int _0^{1/2}\int _{[0,1]^2}\left| D[\mathcal P,B(x,r)]\right| ^p\,\mathrm{d}x\,\mathrm{d}r\right) ^{1/p}=O_p(N^{-3/4}). \end{aligned}$$
(5)

By the monotonicity of the \(L^p\) norms, one obtains these estimates for all \(p<+\infty \).

This argument can be easily extended to a very general setting. In some sense, all that one needs for the argument to work is

  1. a partition of the ambient space into N parts with the same measure and similar diameter (the analog of the “small squares” in the previous argument);

  2. a collection of sets with uniformly regular boundary, in such a way that the cardinality of the collection of indices \(\mathcal J\) can be controlled uniformly in terms of the diameter of the “small squares.”

We will therefore be able to replace the unit square with a compact Riemannian manifold, or more generally, with metric measure spaces \(\mathcal M\) having finite measure and with the property that for any integer N the space can be partitioned as required in point 1 above. By a recent result [19] this property holds under very general hypotheses.

In fact, we can replace the characteristic function of the set \(\mathcal B\) with more general functions, so that our results are actually results on numerical integration. Consider the integral

$$\begin{aligned} \int _{\mathcal {M}}f\left( x\right) \mathrm{d}x \end{aligned}$$

of a function \(f\left( x\right) \) over the metric measure space \(\mathcal {M}\) with finite measure \(\mathrm{d}x\) and distance between two points x and y denoted with \(|x-y|\), and the Riemann sums

$$\begin{aligned} \sum _{j=1}^{N}\omega _{j}f\left( x_{j}\right) , \end{aligned}$$

where \(\left\{ x_{j}\right\} _{j=1}^{N}\) are nodes in \(\mathcal {M}\) and \(\left\{ \omega _{j}\right\} _{j=1}^{N}\) are given weights. We are interested in the rate of decay of the error

$$\begin{aligned} {\sum _{j=1}^{N}}\omega _{j}f\left( x_{j}\right) -{\int _{\mathcal {M}}}f\left( x\right) \mathrm{d}x \end{aligned}$$

as \(N\rightarrow +\infty \). This decay depends on the smoothness of the function \(f\left( x\right) \), the weights \(\left\{ \omega _{j}\right\} _{j=1}^{N}\) and the distribution of the nodes \(\left\{ x_{j}\right\} _{j=1}^{N}\) in \(\mathcal {M}\). For references on this problem when the metric space is a torus, a sphere, or more generally a compact Riemannian manifold, see, for example, [7,8,9,10,11,12,13, 23,24,25,26,27, 31]. For some results related to metric measure spaces, see [33, 47, 48].

Here we proceed as in the situation described before for the study of the discrepancy, and partition \(\mathcal {M}\) into N disjoint measurable sets \(\mathcal {X}_{1},\ldots ,\mathcal {X}_{N}\) with positive measure, set \(\omega _{j}=\left| \mathcal {X}_{j}\right| \), where \(\left| \cdot \right| \) denotes the measure, and consider random choices of points \(x_{j}\in \mathcal {X}_{j}\).

To fix the notation we write \(\varvec{\omega }=\left( \omega _{1} ,\ldots ,\omega _{N}\right) \), \(\mathbf {x}=\left( x_{1},\ldots ,x_{N}\right) \), \(\mathbf {X}=\mathcal {X}_{1}\times \cdots \times \mathcal {X}_{N}\), \(\mathrm{d}\mathbf {x}=\frac{\mathrm{d}x_{1}}{\omega _{1}}\times \cdots \times \frac{\mathrm{d}x_{N}}{\omega _{N}}, \) and consider the probability space \(\left( \mathbf {X} ,\mathrm{d}\mathbf {x}\right) \). We also write the error as

$$\begin{aligned} \mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) ={\sum _{j=1}^{N} }\omega _{j}f\left( x_{j}\right) -{\int _{\mathcal {M}}}f\left( x\right) \mathrm{d}x. \end{aligned}$$
(6)
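As a minimal illustration of (6), the sketch below takes \(\mathcal M=[0,1]\) with Lebesgue measure, \(\mathcal X_j=[(j-1)/N,j/N)\), and \(\omega _j=1/N\). These choices, the function name `stratified_error`, and the test integrands are assumptions made for the example.

```python
import math
import random

def stratified_error(f, exact_integral, N, rng):
    """Error (6) on M = [0, 1] with N equal subintervals and weights w_j = 1/N."""
    total = 0.0
    for j in range(N):
        x_j = (j + rng.random()) / N   # random node in the j-th subinterval
        total += f(x_j) / N            # w_j * f(x_j)
    return total - exact_integral

rng = random.Random(0)   # fixed seed, only for reproducibility
err_const = stratified_error(lambda x: 1.0, 1.0, 100, rng)
err_sin = stratified_error(math.sin, 1.0 - math.cos(1.0), 100, rng)
```

Constants are integrated exactly because the weights add up to the measure of \(\mathcal M\). For a function with Lipschitz constant 1 the error is at most \(\sum _j\omega _j\,{\text {diam}}(\mathcal X_j)=1/N\) for every choice of nodes; the point of the results below is that the average error is much smaller.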

Notice the analogy with the discrepancy (1). There is, however, an important difference: In order to measure the smoothness of our functions we will use suitable Besov spaces or potential spaces adapted to this more general context and obtain estimates of \(\mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) \) for functions in such spaces. The previous combinatorial argument works only when the integrability exponent p is an even integer, whereas here, in order to obtain sharp results, we need estimates that work for generic values of p. The main idea is to replace the combinatorial argument with a generalization of the classical Khintchine inequality for sums of random variables, due to Marcinkiewicz and Zygmund [34, 35]. It says that for every \(1\leqslant p <+\infty \) and for every sequence of independent random variables \(\{f_j\}_{j=1}^N\)

$$\begin{aligned} \mathbb {E}\left( \left| {\displaystyle \sum _{j=1}^{N}} \left( f_{j}-\mathbb {E}\left( f_{j}\right) \right) \right| ^{p}\right) \approx _p \mathbb {E}\left( \left( {\displaystyle \sum _{j=1}^{N}} \left| f_{j}-\mathbb {E}\left( f_{j}\right) \right| ^{2}\right) ^{p/2}\right) . \end{aligned}$$

For the case of discrepancy, this immediately gives

$$\begin{aligned} \mathbb E(|D[\mathcal P,\mathcal B]|^p)\approx _p\frac{1}{N^p}\mathbb E\left( \left( \sum _{j\in \mathcal J}|\eta _j|^{2}\right) ^{p/2}\right) \leqslant \frac{1}{N^p}\mathbb E\left( |\mathcal J|^{p/2}\right) =O(N^{-3p/4}). \end{aligned}$$
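The two sides of the Marcinkiewicz–Zygmund inequality can be computed exactly for a small number of centered Bernoulli variables \(\eta _j=\xi _j-\mathbb E\xi _j\), by enumerating all outcomes. The sketch below is only an illustration: the probabilities in `probs` are hypothetical values of \(\mathbb E\xi _j\), and no claim is made about the size of the constants in the inequality.

```python
from itertools import product

def moments(probs, p):
    """Exact E|sum eta_j|^p and E(sum eta_j^2)^(p/2) for eta_j = xi_j - q_j, xi_j ~ Bernoulli(q_j)."""
    lhs = rhs = 0.0
    for outcome in product((0, 1), repeat=len(probs)):
        weight = 1.0          # probability of this outcome
        s = sq = 0.0          # sum of eta_j and sum of eta_j^2
        for xi, q in zip(outcome, probs):
            weight *= q if xi else 1.0 - q
            eta = xi - q
            s += eta
            sq += eta * eta
        lhs += weight * abs(s) ** p
        rhs += weight * sq ** (p / 2)
    return lhs, rhs

probs = [0.1 + 0.08 * j for j in range(10)]   # hypothetical values of E(xi_j)
lhs2, rhs2 = moments(probs, 2)
lhs3, rhs3 = moments(probs, 3)
```

For \(p=2\) the two sides coincide, since the variance of a sum of independent variables is the sum of the variances. For other values of p, for instance the odd value \(p=3\), which the combinatorial argument does not cover, they are comparable up to constants depending only on p.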

We will see that such an argument can also be used to deduce estimates on the error in numerical integration. In particular, we will study two types of problems. In the first case we will focus on the worst case numerical integration error, which determines how bad the error of a given fixed quadrature rule can be when applied to all integrands whose norm has an upper bound. The function space that we will consider for this type of problem is a space of potentials: We will say that \(f\in \mathbb H^\varPhi _p(\mathcal M)\) for \(1\leqslant p\leqslant \infty \) and a suitable kernel \(\varPhi (x,y)\) defined on \(\mathcal M\times \mathcal M\) (see Sect. 2 for the precise definition) if there is a \(g\in \mathbb L^p(\mathcal M)\) such that

$$f(x)=\int _{\mathcal M}\varPhi (x,y)g(y)\mathrm{d}y,$$

and its norm is \(\Vert f\Vert _{\mathbb H^{\varPhi }_p(\mathcal M)}=\inf _g\Vert g\Vert _{\mathbb L^p(\mathcal M)}\), where the infimum is taken over all g(x) which give the potential f(x). Observe that when \(\mathcal M\) is the Euclidean space \(\mathbb R^d\) and \(\varPhi (x,y)=|x-y|^{\alpha -d}\), \(0<\alpha <d\) is the Riesz kernel, then for \(1<p<\infty \) the potential space \(\mathbb H^{\varPhi }_p(\mathcal M)\) coincides with the homogeneous fractional Sobolev space \(\dot{\mathbb H}^\alpha _p(\mathbb R^d)\). The (non-homogeneous) fractional Sobolev space \({\mathbb H}^\alpha _p(\mathbb R^d)\) can be obtained similarly, via the Bessel kernel. Also, when \(\mathcal M\) is a compact Riemannian manifold, the Sobolev space \(\mathbb H^\alpha _p(\mathcal M)\) can be defined as a potential space via the Bessel kernel, see Example 6.5 here or [7] for details on this. We will show here the following

Theorem I

Let \(\mathcal {M}\) be a metric measure space with the property that there exist d and c such that for every \(y\in \mathcal {M}\) and \(r>0\),

$$\begin{aligned} \left| \left\{ x\in \mathcal {M}:\left| x-y\right| \leqslant r\right\} \right| \leqslant cr^{d}. \end{aligned}$$

Assume also that \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| \approx N^{-1}\) and \(\delta _{j}={\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d}\). Assume that for some \(0<\alpha <d\)

$$\begin{aligned} \left| \varPhi \left( x,y\right) \right|\leqslant & {} c\left| x-y\right| ^{\alpha -d} \quad \mathrm{for\;every} \,x\, \mathrm{and}\, y,\\ \left| \varPhi \left( x,y\right) -\varPhi \left( z,y\right) \right|\leqslant & {} c\left| x-z\right| \left| x-y\right| ^{\alpha -d-1} \quad \mathrm{if}\, \left| x-y\right| \geqslant 2\left| x-z\right| . \end{aligned}$$

Finally, assume that \(1<p\leqslant +\infty \), \(1/p+1/q=1\), and \(d/p<\alpha <d\). Then

$$\begin{aligned} \left\{ {\displaystyle \int _{\mathbf {X}}} \sup _{\Vert f\Vert _{\mathbb H^\varPhi _p}\leqslant 1}\left| \mathcal {E} _{\mathbf {x},{\omega }}(f)\right| ^{q} \mathrm{d}\mathbf {x}\right\} ^{1/q} \leqslant \left\{ \begin{array} [c]{ll} cN^{-\alpha /d} &{}\quad \mathrm{if}\, \alpha <d/2+1,\\ cN^{-1/2-1/d}\left( \log N\right) ^{1/2} &{}\quad \mathrm{if}\, \alpha =d/2+1,\\ cN^{-1/2-1/d} &{}\quad \mathrm{if }\,\alpha >d/2+1. \end{array} \right. \end{aligned}$$

The first observation is that the Bessel kernel on a compact Riemannian manifold satisfies the hypotheses required by this theorem. In fact, the particular case where \(\mathcal M\) is a compact Riemannian manifold, \(\varPhi \) is the Bessel kernel, \(\alpha <d/2+1\), and \(p=2\) was proved in [7, Theorem 2.7] (see also [13, Theorem 24] for the case of the sphere).

We also show that under rather natural hypotheses on the space \(\mathcal M\) and on the kernel \(\varPhi \), the estimates from above hold from below as well.

Theorem II

Let \(\mathcal {M}\) be a metric measure space with the property that there exist H, K, and d such that for every \(y\in \mathcal {M} \) and \(0<r<r_{0}\),

$$\begin{aligned} Hr^{d}\leqslant \left| \left\{ x\in \mathcal {M}:\left| x-y\right| \leqslant r\right\} \right| \leqslant Kr^{d}. \end{aligned}$$

Assume also that \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| \approx N^{-1}\) and \(\delta _{j}={\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d} \). Suppose that there exists \(0<\alpha <d\), such that for any \(j=1,\ldots ,N\) and any \(z\in \mathcal {X}_{j}\), and for any y such that \({\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}\),

$$\begin{aligned} \int _{\mathcal {X}_{j}}\left| \varPhi \left( x,y\right) -\varPhi \left( z,y\right) \right| \mathrm{d}x\geqslant cN^{-1-1/d}\left( {\text {dist}}\left( y,\mathcal {X}_{j}\right) \right) ^{\alpha -d-1}. \end{aligned}$$

Suppose also that for any \(y\in \mathcal {M}\), the function \(x\mapsto \varPhi \left( x,y\right) \) is continuous in \(x\ne y\). Finally, assume that \(1<p\leqslant +\infty \), \(1/p+1/q=1\), and \(d/p<\alpha <d\). Then

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\sup _{\Vert f\Vert _{\mathbb H^\varPhi _p}\leqslant 1}\left| \mathcal {E}_{\mathbf {x},{\omega }}(f)\right| ^{q}\mathrm{d}\mathbf {x}\right\} ^{1/q}\geqslant \left\{ \begin{array} [c]{ll} cN^{-\alpha /d} &{}\quad \mathrm{if}\, \alpha <d/2+1,\\ cN^{-1/2-1/d}\left( \log N\right) ^{1/2} &{}\quad \mathrm{if}\, \alpha =d/2+1,\\ cN^{-1/2-1/d} &{}\quad \mathrm{if}\, \alpha >d/2+1. \end{array} \right. \end{aligned}$$

Once again, it should be observed that the Bessel kernel on a compact Riemannian manifold satisfies the hypotheses in the above theorem, and that the particular case where \(\mathcal M\) is the sphere, \(\varPhi \) is the Bessel kernel, and \(p=2\) is contained in [13, Theorems 24 and 25].

In order to understand the two above results it could be useful to recall the following result [7, Theorem 2.16]: Let \(\mathcal M\) be a compact Riemannian manifold. For every \(1\leqslant p\leqslant \infty \) and \(\alpha >d/p\) there exists \(c>0\) such that for every distribution of points \(\mathbf x=\{x_j\}_{j=1}^N\) and weights \(\mathbf \omega =\{\omega _j\}_{j=1}^N\), one has

$$\begin{aligned} \sup _{\Vert f\Vert _{\mathbb H^\alpha _p}\leqslant 1} \left| \mathcal E_{\mathbf x,\mathbf \omega }(f)\right| \geqslant cN^{-\alpha /d}. \end{aligned}$$

In other words, the worst case error for any quadrature rule cannot have a better decay than \(N^{-\alpha /d}\). Thus, Theorem I says that a random choice of points \(x_j\in \mathcal X_j\), \(j=1,\ldots ,N\), gives the best possible decay for the worst case error in \(\mathbb H^\alpha _p(\mathcal M)\) when \(d/2<\alpha <d/2+1\), while Theorem II says that when \(\alpha \geqslant d/2+1\) the stratification strategy does not lead, on average, to quadrature rules with the desired decay \(N^{-\alpha /d}\) of the worst case error in \(\mathbb H^\alpha _p(\mathcal M)\).
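The comparison between the stratified rate of Theorems I and II and the lower bound \(N^{-\alpha /d}\) can be tabulated with a small helper. This is a sketch under simplifying assumptions: all constants c are set to 1, the condition \(d/p<\alpha \) is not checked, and the function names are hypothetical.

```python
import math

def stratified_rate(alpha, d, N):
    """Right-hand side of Theorems I and II with the constant c set to 1."""
    if not 0 < alpha < d:
        raise ValueError("the theorems assume 0 < alpha < d")
    threshold = d / 2 + 1
    if alpha < threshold:
        return N ** (-alpha / d)
    if alpha == threshold:
        return N ** (-0.5 - 1 / d) * math.sqrt(math.log(N))
    return N ** (-0.5 - 1 / d)

def optimal_rate(alpha, d, N):
    """Lower bound N^(-alpha/d) from [7, Theorem 2.16], with the constant set to 1."""
    return N ** (-alpha / d)

N = 10 ** 4
below = (stratified_rate(2.0, 4, N), optimal_rate(2.0, 4, N))   # alpha < d/2 + 1
above = (stratified_rate(3.5, 4, N), optimal_rate(3.5, 4, N))   # alpha > d/2 + 1
```

Note that the second and third regimes are non-empty only when \(d/2+1<d\), that is, when \(d>2\).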

By the above-mentioned result in [19] on the partitioning of \(\mathcal M\) into N regions of equal measure and small diameter, under the hypotheses on \(\mathcal M\) contained in Theorem II (see Sect. 5), the above Theorems I and II apply with equal weights, that is \(\omega _j=|\mathcal M|/N\) for all \(j=1,\ldots ,N\). In [11] and [13] a sequence of point configurations on the sphere that gives the best possible decay \(N^{-\alpha /d}\) for the worst case error in the equal weight case has been called a Quasi Monte Carlo (QMC) design sequence for \(\mathbb H^\alpha _p(\mathcal M)\).

We would like to emphasize that all the cited results in [7] and [13] are based on Hilbert space techniques (\(p=2\)), while we were able to obtain \(\mathbb L^p\) integrability results thanks to the Marcinkiewicz–Zygmund inequality. Moreover, we work in the more general setting of metric measure spaces, and some of the results are new even in the particular case of compact Riemannian manifolds.

Concerning the non-Hilbert space setting, we mention here that when \(\mathcal M\) is the d-dimensional sphere, \(1\le p\le +\infty \), and \(\alpha >d/p\), the existence of QMC design sequences for \(\mathbb H^\alpha _p(\mathcal M)\) has been proved with different techniques in [11].

So far, we have considered the worst case error in numerical integration, that is the error for a whole class of functions. The second type of estimate that will be discussed here concerns the numerical integration error for a given fixed function. In particular, we will consider functions in the homogeneous Hajłasz–Besov space \(\dot{\mathbb B}^\alpha _{p,\infty }(\mathcal M)\), as defined in [30]. For details on these spaces, see Sect. 3. Here we should mention that when \(0<\alpha <1\), the spaces \(\mathbb H^{\varPhi }_{p}(\mathcal M)\) as in Theorem I are embedded into \(\dot{\mathbb B}^\alpha _{p,\infty }(\mathcal M)\), and that when \(\mathcal M\) is the Euclidean space \(\mathbb R^d\) and \(0<\alpha <1\) then the spaces \(\dot{\mathbb B}^\alpha _{p,\infty }(\mathcal M)\) coincide with the classical homogeneous Besov spaces defined via Littlewood–Paley decomposition. The main result in this context is the following

Theorem III

Assume that a metric measure space \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X} _{1}\cup \cdots \cup \mathcal {X}_{N}\), with measure \(0<\left| \mathcal {X} _{j}\right| =\omega _{j}\approx N^{-1}\) and \(0<{\text {diam}}\left( \mathcal {X}_{j}\right) \approx {N^{-1/d}}\). Then for every \(1\leqslant p\leqslant 2\) there is a constant c such that

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left| \mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) \right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p}\leqslant c N^{ 1/p-1-\alpha /d} \left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\alpha }\left( \mathcal {M}\right) } \end{aligned}$$

and for every \(2\leqslant p<+\infty \) there is a constant c such that

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left| \mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) \right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p}\leqslant cN^{-1/2-\alpha /d} \left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\alpha }\left( \mathcal {M}\right) }. \end{aligned}$$

Of course, here a random choice of points \(x_j\in \mathcal X_j\), \(j=1,\ldots ,N\) gives better estimates than those obtained in Theorem I (and better than \(N^{-\alpha /d}\)). This is natural, since in this case we are looking for point distributions which give a small error for a given integrand f, whereas in the situation described by Theorem I we were looking for point distributions which give a small error for all integrands in our space at the same time.
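As a consistency check, Theorem III can be compared with the discrepancy estimate (3). The following sketch assumes that \(\mathcal M=[0,1]^2\) (so that \(d=2\)), that the partition consists of the N small squares with \(\omega _j=N^{-1}\), and that \(\mathcal B\) has regular boundary, so that Proposition 3.7 in Sect. 3 applies with \(\beta =1\) and gives \(\chi _{\mathcal B}\in \dot{\mathbb B}^{\alpha }_{p,\infty }(\mathcal M)\) with \(\alpha =1/p\). Under these assumptions \(\mathcal E_{\mathbf x,\omega }(\chi _{\mathcal B})=D[\mathcal P,\mathcal B]\), and for \(2\leqslant p<+\infty \) Theorem III reads

$$\begin{aligned} \left\{ \mathbb E\left( \left| D[\mathcal P,\mathcal B]\right| ^{p}\right) \right\} ^{1/p}\leqslant cN^{-1/2-1/(2p)}\left\| \chi _{\mathcal B}\right\| _{\dot{\mathbb B}^{1/p}_{p,\infty }(\mathcal M)},\qquad \text {which for } p=2 \text { is } cN^{-3/4}\left\| \chi _{\mathcal B}\right\| _{\dot{\mathbb B}^{1/2}_{2,\infty }(\mathcal M)}. \end{aligned}$$

For \(p=2\) this agrees with the exponent in (3).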

Theorem III and its sharpness will be discussed in Sect. 7.

We believe that our effort in the search for the minimal properties that guarantee the validity of certain results in discrepancy theory and numerical integration can be of some help towards a deeper understanding of these results, even in the classical cases. In fact, to the best of our knowledge, the above Theorems I, II, and III are new even when applied to a general compact Riemannian manifold. There has recently been some interest in this type of problem in spaces as general as those considered here, or even more general. See for example [33, 39, 43,44,45] for discrepancy and numerical integration on metric spaces and [47, 48] for analysis on fractals.

The plan of the paper is the following. In Sects. 2 and 3 we introduce the appropriate Sobolev-type spaces, and we recall a few results on how these spaces relate to each other. These matters are not completely new, but can be of some help for the unfamiliar reader. In Sect. 4 we introduce in some detail the Marcinkiewicz–Zygmund inequality. In Sect. 5 we recall the above-mentioned result in [19] concerning the partitioning of a metric measure space into regions of equal measure and small diameter. In Sects. 6 and 7 we give all the details on our results on numerical integration, with examples. Finally, Sect. 8 contains our results on the \(L^p\) (and \(L^\infty \)) discrepancy that generalize (3) and (5).

2 Sobolev Spaces and Potentials on Measure Spaces

Our estimates on the worst case error described in Theorems I and II require a definition of Sobolev spaces via potentials. For a classical approach to potentials, see, for example, [46].

Definition 2.1

Let \(\mathcal {M}\) be a measure space, let \(1\leqslant p,q\leqslant +\infty \) with \(1/p+1/q=1,\) and let \(\varPhi \left( x,y\right) \) be a measurable kernel on \(\mathcal {M}\times \mathcal {M}\). Assume that for every x,

$$\begin{aligned} \begin{array} [c]{rl} {{ {\displaystyle \int _{\mathcal {M}}} }\left| \varPhi \left( x,y\right) \right| ^{q}\mathrm{d}y<+\infty } &{} \quad \mathrm{if}\, q<+\infty ,\\ {\underset{y\in \mathcal {M}}{{\text {ess}}\sup }\left\{ \left| \varPhi \left( x,y\right) \right| \right\} <+\infty } &{}\quad \mathrm{if}\, q=+\infty . \end{array} \end{aligned}$$

Then every function g(x) in \(\mathbb {L}^{p}\left( \mathcal {M}\right) \) has a pointwise well defined potential

$$\begin{aligned} f\left( x\right) ={\int _{\mathcal {M}}}\varPhi \left( x,y\right) g\left( y\right) \mathrm{d}y. \end{aligned}$$

The space \(\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) \) is the space of all potentials of functions in \(\mathbb {L}^{p}\left( \mathcal {M}\right) \), with norm

$$\begin{aligned} \left\| f\right\| _{\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) }=\inf _{g}\left\{ {\int _{\mathcal {M}}}\left| g\left( x\right) \right| ^{p}\mathrm{d}x\right\} ^{1/p}. \end{aligned}$$

The infimum is taken over all g(x) which give the potential f(x).

Observe that this definition does not even require a metric. Potentials can also be defined under weaker assumptions on the kernel, but the above assumptions guarantee that these potentials are defined pointwise everywhere, and this will be necessary in the sequel. In particular, when \(\mathcal {M}\) is the Euclidean space \(\mathbb {R}^{d}\) and \(\varPhi \left( x,y\right) =\left| x-y\right| ^{\alpha -d}\) with \(0<\alpha <d\) is the Riesz kernel, then \(\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) \) is the homogeneous fractional Sobolev space \(\dot{\mathbb {H}}_{p}^{\alpha }\left( \mathbb {R}^{d}\right) \). However, the cases \(p=1\) and \(p=+\infty \) require some extra care. For interesting examples of generalized potential spaces, see, for example, [29].

3 Besov and Triebel–Lizorkin Spaces on Metric Measure Spaces

Our estimates in Theorem III require a definition of Sobolev spaces, more appropriately Besov or Triebel–Lizorkin spaces, via upper gradients. Let \(\mathcal {M}\) be a metric measure space, that is, a metric space equipped with a positive Borel measure. With a small abuse of notation we denote by \(\left| \mathcal {X}\right| \) the measure of a measurable set \(\mathcal {X}\) and by \(\left| x-y\right| \) the distance between two points x and y. We will often denote by \(B(x,r)\) the open ball \(\{y\in \mathcal M:|x-y|<r\}\) with center x and radius r. Simple examples are Riemannian manifolds, or not necessarily smooth surfaces in a Euclidean space with the inherited measure and distance. In [22] Hajłasz has given a purely metric definition of a Sobolev space: A measurable function f(x) is in the Sobolev space \(\mathbb {W}_{p}^{1}\left( \mathcal {M}\right) \), \(1\leqslant p\leqslant +\infty \), if there exists a non-negative function \(g\left( x\right) \) in \(\mathbb {L}^{p}\left( \mathcal {M}\right) \) such that for almost every \(x,y\in \mathcal {M}\),

$$\begin{aligned} \left| f(x)-f(y)\right| \leqslant \left| x-y\right| \left( g(x)+g(y)\right) . \end{aligned}$$

For example, it is proved in [22] that in Euclidean spaces one can choose as an upper gradient g(x) a suitable multiple of the Hardy–Littlewood maximal function of the gradient \(\nabla f\left( x\right) \).

The following is a natural generalization of upper gradient and associated Sobolev space.

Definition 3.1

Let \(\mathcal {M}\) be a metric measure space and \(\varphi \left( t\right) \) a non-negative increasing function in \(t\geqslant 0\). A measurable non-negative function \(g\left( x\right) \) is a \(\varphi \)-gradient of a measurable function \(f\left( x\right) \) if there exists a set \(\mathcal {N} \) with measure zero such that for all x and y in \(\mathcal {M{\setminus }N}\),

$$\begin{aligned} \left| f\left( x\right) -f\left( y\right) \right| \leqslant \varphi \left( \left| x-y\right| \right) \left( g\left( x\right) +g\left( y\right) \right) . \end{aligned}$$
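The inequality in Definition 3.1 can be tested numerically on a finite sample of points. The sketch below takes \(\mathcal M=[0,1]\), \(f(x)=x^{1/2}\), and the constant candidate \(g=1/2\); all of these choices, as well as the function name `is_phi_gradient`, are assumptions made for the illustration. A finite sample can refute the inequality but cannot prove it.

```python
def is_phi_gradient(f, g, phi, points, tol=1e-12):
    """Check |f(x) - f(y)| <= phi(|x - y|) * (g(x) + g(y)) on all pairs of sample points."""
    return all(
        abs(f(x) - f(y)) <= phi(abs(x - y)) * (g(x) + g(y)) + tol
        for x in points for y in points
    )

points = [k / 200 for k in range(201)]   # finite sample of M = [0, 1]
f = lambda x: x ** 0.5                   # Hölder continuous of order 1/2 with constant 1
half = lambda x: 0.5                     # constant candidate gradient

holder_ok = is_phi_gradient(f, half, lambda t: t ** 0.5, points)   # phi(t) = t^(1/2)
lipschitz_ok = is_phi_gradient(f, half, lambda t: t, points)       # phi(t) = t
```

With \(\varphi (t)=t^{1/2}\) the constant \(1/2\) passes the test, since \(|x^{1/2}-y^{1/2}|\leqslant |x-y|^{1/2}\), whereas with \(\varphi (t)=t\) it fails near the origin, where f is not Lipschitz.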

Definition 3.2

A measurable function f(x) is in the Hajłasz–Sobolev space \(\dot{\mathbb {M}}_{p}^{\varphi }\left( \mathcal {M}\right) \), \(0<p\leqslant +\infty \), if it has a \(\varphi \)-gradient in \(L^{p}\left( \mathcal {M}\right) \). We set

$$\begin{aligned} \left\| f\right\| _{\dot{\mathbb {M}}_{p}^{\varphi }\left( \mathcal {M}\right) } =\inf \left\| g\right\| _{L^{p}\left( \mathcal {M}\right) }. \end{aligned}$$

The infimum is taken over all \(\varphi \)-gradients \(g\left( x\right) \) of \(f\left( x\right) \).

In [22] Hajłasz proved that when \(\varphi (t)=t\), \(1<p\le +\infty \) and \({\mathcal M}=\mathbb R^d\), then this space coincides with the classical homogeneous Sobolev space \(\dot{\mathbb H}^{1}_p\left( \mathbb R^d\right) \). The above definition has been extended by Koskela et al. [30] who have defined Besov and Triebel–Lizorkin spaces on a general metric measure space. In particular they proved that, when \(\varphi (t)=t^\alpha \) with \(0<\alpha <1\) and \({\mathcal M}=\mathbb R^d\), then the space \(\dot{\mathbb {M}}_{p}^{\varphi }\left( \mathcal {M}\right) \) is larger than the classical fractional Sobolev space, and it coincides with the homogeneous Triebel–Lizorkin space \(\dot{\mathbb F}^{\alpha }_{p,\infty }\left( \mathbb R^d\right) \).

What follows is in the spirit of the definitions of Besov and Triebel–Lizorkin spaces in [30]. In order to define these spaces one needs to introduce families of gradients localized at different scales.

Definition 3.3

Let \(\mathcal {M}\) be a metric measure space and \(\varphi \left( t\right) \) a non-negative increasing function in \(t\geqslant 0\). Let \(n_{0}=\log _{2}\left( {\text {diam}}\left( \mathcal {M}\right) \right) \), possibly infinity. A sequence of non-negative measurable functions \(\left\{ g_{n}\left( x\right) \right\} _{-n_{0} }^{+\infty }\) is a \(\varphi \)-gradient for a measurable function \(f\left( x\right) \) if there exists a set \(\mathcal {N}\) with measure zero such that for all x and y in \(\mathcal {M{\setminus }N}\) with \(\left| x-y\right| \leqslant 2^{-n}\),

$$\begin{aligned} \left| f\left( x\right) -f\left( y\right) \right| \leqslant \varphi \left( 2^{-n}\right) \left( g_{n}\left( x\right) +g_{n}\left( y\right) \right) . \end{aligned}$$

Definition 3.4

A measurable function f(x) is in the Hajłasz–Triebel–Lizorkin space \(\dot{\mathbb {F}} _{p,q}^{\varphi }\left( \mathcal {M}\right) \), \(0<p<+\infty \) and \(0<q\leqslant +\infty \), if f(x) has a \(\varphi \)-gradient \(\left\{ g_{n}\left( x\right) \right\} \) with

$$\begin{aligned} \begin{array} [c]{rl} {\left\| \left\{ { {\displaystyle \sum \limits _{n=-n_{0}}^{+\infty }} }\left| g_{n}\left( x\right) \right| ^{q}\right\} ^{1/q}\right\| _{\mathbb {L}^{p}\left( \mathcal {M}\right) }<+\infty } &{}\quad \mathrm{if}\, 0<q<+\infty ,\\ {\left\| \sup _{n\geqslant -n_{0}}\left| g_{n}\left( x\right) \right| \right\| _{\mathbb {L}^{p}\left( \mathcal {M}\right) }<+\infty } &{}\quad \mathrm{if} \, q=+\infty . \end{array} \end{aligned}$$

The infimum of the above expression taken over all \(\varphi \)-gradients defines the semi-norm \(\left\| f\right\| _{\dot{\mathbb {F}}_{p,q}^{\varphi }\left( \mathcal {M}\right) }\).

Definition 3.5

A measurable function f(x) is in the Hajłasz–Besov space \(\dot{\mathbb {B}}_{p,q}^{\varphi }\left( \mathcal {M}\right) \), \(0<p\leqslant +\infty \) and \(0<q\leqslant +\infty \), if f(x) has a \(\varphi \)-gradient \(\left\{ g_{n}\left( x\right) \right\} \) with

$$\begin{aligned} \begin{array} [c]{rl} {\left\{ { {\displaystyle \sum \limits _{n=-n_{0}}^{+\infty }} }\left\| g_{n}\left( x\right) \right\| _{\mathbb {L}^{p}\left( \mathcal {M}\right) }^{q}\right\} ^{1/q}<+\infty } &{}\quad \mathrm{if}\, 0<q<+\infty ,\\ {\sup _{n\geqslant -n_{0}}\left\| g_{n}\left( x\right) \right\| _{\mathbb {L}^{p}\left( \mathcal {M}\right) }<+\infty } &{}\quad \mathrm{if}\, \ q=+\infty . \end{array} \end{aligned}$$

The infimum of the above expression taken over all \(\varphi \)-gradients defines the semi-norm \(\left\| f\right\| _{\dot{\mathbb {B}}_{p,q}^{\varphi }\left( \mathcal {M}\right) }\).

Observe that the above spaces are homogeneous, and constant functions have semi-norms equal to zero. Also observe that when \(q=+\infty \) and \(\varphi (t)\) is doubling, that is \(\varphi (2t)\le c\varphi (t)\), then the space \(\dot{\mathbb {F}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) \) coincides with the previously defined Hajłasz–Sobolev space \(\dot{\mathbb {M}}_{p}^{\varphi }\left( \mathcal {M}\right) \). It suffices to define \(g\left( x\right) =\sup \left\{ g_{n}\left( x\right) \right\} \). In particular, the straightforward generalization of a Hajłasz–Sobolev space is a Hajłasz–Triebel–Lizorkin space.

When \(\varphi \left( t\right) =t^{\alpha }\) the above definition is nothing but the definition of Hajłasz–Besov and Hajłasz–Triebel–Lizorkin given in [30]. To be precise, the definition of \(\varphi \)-gradient in [30] requires

$$\begin{aligned} \left| f\left( x\right) -f\left( y\right) \right| \leqslant 2^{-\alpha n}\left( g_{n}\left( x\right) +g_{n}\left( y\right) \right) \end{aligned}$$

only for x and y with \(2^{-n-1}\leqslant \left| x-y\right| \leqslant 2^{-n}\). On the other hand, if one defines

$$\begin{aligned} G_{n}\left( x\right) ={\sum _{k=0}^{+\infty }}2^{-\alpha k}g_{n+k}\left( x\right) , \end{aligned}$$

then

$$\begin{aligned} \left| f\left( x\right) -f\left( y\right) \right| \leqslant 2^{-\alpha n}\left( G_{n}\left( x\right) +G_{n}\left( y\right) \right) \end{aligned}$$

for x and y with \(\left| x-y\right| \leqslant 2^{-n}\), and the semi-norms defined via \(\left\{ g_{n}\left( x\right) \right\} \) and \(\left\{ G_{n}\left( x\right) \right\} \) are equivalent. In the same paper, it is proved that when \(\mathcal {M}\) is the Euclidean space \(\mathbb {R}^{d}\) and \(\varphi \left( t\right) =t^{\alpha }\) with \(0<\alpha <1\), then the spaces \(\dot{\mathbb {B}}_{p,q}^{\varphi }\left( \mathcal {M}\right) \) and \(\dot{\mathbb {F}}_{p,q}^{\varphi }\left( \mathcal {M}\right) \) coincide with the classical Besov and Triebel–Lizorkin spaces defined via a Littlewood–Paley decomposition. See, for example, [3] for the relevant definitions.

The proposition below gives an example of a function in the spaces \(\dot{\mathbb {B}} _{p,q}^{\varphi }\left( \mathcal {M}\right) \) and \(\dot{\mathbb {F}}_{p,q}^{\varphi } \left( \mathcal {M}\right) \).

Definition 3.6

For every subset \(\mathcal {B}\) in \(\mathcal {M}\), define

$$\begin{aligned} \psi _{\mathcal {B}}\left( t\right) =\left| \left\{ x\in \mathcal {B} :{\text {dist}}\left\{ x,\mathcal {M}{\setminus }\mathcal {B}\right\} \leqslant t\right\} \right| +\left| \left\{ x\in \mathcal {M} {\setminus }\mathcal {B}:{\text {dist}}\left\{ x,\mathcal {B}\right\} \leqslant t\right\} \right| . \end{aligned}$$

For example, if \(\mathcal {M}\) is a d-dimensional Riemannian manifold and \(\mathcal {B}\) is a bounded open set with regular boundary, then \(\psi _{\mathcal {B}}\left( t\right) \approx t\), while if \(\psi _{\mathcal {B} }\left( t\right) \approx t^{\beta }\), then the boundary has Minkowski fractal dimension \(d-\beta \).
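As a simple illustration, not taken from the text, let \(\mathcal {M}=\mathbb {R}^{d}\) with the Lebesgue measure and let \(\mathcal {B}\) be an open ball of radius R. Writing \(v_{d}\) for the volume of the unit ball (a symbol introduced only for this example), one finds for \(0<t\leqslant R\)

$$\begin{aligned} \psi _{\mathcal {B}}\left( t\right)&=v_{d}\left( R^{d}-\left( R-t\right) ^{d}\right) +v_{d}\left( \left( R+t\right) ^{d}-R^{d}\right) =v_{d}\left( \left( R+t\right) ^{d}-\left( R-t\right) ^{d}\right) ,\\ 2dv_{d}R^{d-1}t&\leqslant \psi _{\mathcal {B}}\left( t\right) \leqslant 2^{d}v_{d}R^{d-1}t. \end{aligned}$$

The two-sided bound follows from the binomial expansion, in which only odd powers of t survive, and it agrees with \(\psi _{\mathcal {B}}\left( t\right) \approx t\) for a set with regular boundary.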

Proposition 3.7

Let \(\mathcal {B}\) be an arbitrary subset of \(\mathcal {M}\). Then

$$\begin{aligned} \left\| \chi _{\mathcal {B}}\right\| _{\dot{\mathbb {F}}_{p,\infty }^{\varphi } \left( \mathcal {M}\right) }&\leqslant \left\{ \sum _{n=-n_{0}}^{+\infty } \varphi \left( 2^{-n}\right) ^{-p}\psi _{\mathcal {B}}\left( 2^{-n}\right) \right\} ^{1/p},\\ \left\| \chi _{\mathcal {B}}\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi } \left( \mathcal {M}\right) }&\leqslant \sup _{n\geqslant -n_{0}} \left\{ \varphi \left( 2^{-n}\right) ^{-1}\psi _{\mathcal {B}}\left( 2^{-n}\right) ^{1/p}\right\} . \end{aligned}$$

In particular, if \(\varphi \left( t\right) =t^{\alpha }\) and \(\psi _{\mathcal {B}}\left( t\right) \leqslant ct^{\beta }\), then \(\chi _{\mathcal {B}}\in \dot{\mathbb {F}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) \) for \(p\alpha <\beta \) and \(\chi _{\mathcal {B}}\in \dot{\mathbb {B}}_{p,\infty }^{\varphi } \left( \mathcal {M}\right) \) for \(p\alpha \leqslant \beta \).

Proof

It suffices to observe that a \(\varphi \)-gradient for the characteristic function \(\chi _{\mathcal {B}}\left( x\right) \) is given by

$$\begin{aligned} g_{n}(x)=\left\{ \begin{array} [c]{ll} \varphi \left( 2^{-n}\right) ^{-1} &{}\quad \mathrm{if}\, x\in \mathcal {B}\, \mathrm{and}\, {\text {dist}}\left\{ x,\mathcal {M}{\setminus }\mathcal {B}\right\} \leqslant 2^{-n} ,\\ 0 &{}\quad \mathrm{otherwise}. \end{array} \right. \end{aligned}$$

Of course, there are other possible choices for the \(\varphi \)-gradient of \(\chi _{\mathcal B}\left( x\right) \), for example,

$$\begin{aligned} g_{n}(x)=\left\{ \begin{array} [c]{ll} \varphi \left( 2^{-n}\right) ^{-1} &{}\quad \mathrm{if}\, x\in \mathcal {M}{\setminus }\mathcal {B}\, \mathrm{and}\, {\text {dist}}\left\{ x,\mathcal {B}\right\} \leqslant 2^{-n} ,\\ 0 &{}\quad \mathrm{otherwise}. \end{array} \right. \end{aligned}$$

\(\square \)
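The last statement of Proposition 3.7 can be checked directly. The following is a short verification, assuming \(0<p<+\infty \), \(\varphi \left( t\right) =t^{\alpha }\) and \(\psi _{\mathcal {B}}\left( t\right) \leqslant ct^{\beta }\):

$$\begin{aligned} \sum _{n=-n_{0}}^{+\infty }\varphi \left( 2^{-n}\right) ^{-p}\psi _{\mathcal {B}}\left( 2^{-n}\right)&\leqslant c\sum _{n=-n_{0}}^{+\infty }2^{-n\left( \beta -p\alpha \right) }=\frac{c\,2^{n_{0}\left( \beta -p\alpha \right) }}{1-2^{-\left( \beta -p\alpha \right) }}\quad \mathrm{if}\, p\alpha <\beta ,\\ \sup _{n\geqslant -n_{0}}\left\{ \varphi \left( 2^{-n}\right) ^{-1}\psi _{\mathcal {B}}\left( 2^{-n}\right) ^{1/p}\right\}&\leqslant c^{1/p}\sup _{n\geqslant -n_{0}}2^{-n\left( \beta -p\alpha \right) /p}=c^{1/p}2^{n_{0}\left( \beta -p\alpha \right) /p}\quad \mathrm{if}\, p\alpha \leqslant \beta . \end{aligned}$$

When \(p\alpha =\beta \) the bounding series no longer converges, whereas the supremum remains finite. This accounts for the strict inequality in the Triebel–Lizorkin case and the non-strict one in the Besov case.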

The following is an immediate consequence of the definitions.

Proposition 3.8

  1. (i)

    If \(q_{1}\leqslant q_{2}\) and \(\varphi _{1}\left( t\right) \leqslant \varphi _{2}\left( t\right) \), then

    $$\begin{aligned} \dot{\mathbb {B}}_{p,q_{1}}^{\varphi _{1}}\left( \mathcal {M}\right) \subseteq \dot{\mathbb {B}}_{p,q_{2}}^{\varphi _{2}}\left( \mathcal {M}\right) \quad \text{ and }\quad \dot{\mathbb {F}}_{p,q_{1}}^{\varphi _{1}}\left( \mathcal {M}\right) \subseteq \dot{\mathbb {F}}_{p,q_{2}}^{\varphi _{2}}\left( \mathcal {M}\right) . \end{aligned}$$
  2. (ii)

    For every \(\varphi \left( t\right) \) and \(0<p\leqslant +\infty \),

    $$\begin{aligned} \dot{\mathbb {B}}_{p,p}^{\varphi }\left( \mathcal {M}\right) ={ \dot{\mathbb {F}}_{p,p}^{\varphi }} \left( \mathcal {M}\right) \quad \text{ and }\quad \dot{\mathbb {F}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) \subseteq \dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) . \end{aligned}$$

    In particular, for fixed p and \(\varphi \left( t\right) \), the largest space in the scale of Hajłasz–Besov and Hajłasz–Triebel–Lizorkin spaces is \(\dot{\mathbb {B}} _{p,\infty }^{\varphi }\left( \mathcal {M}\right) \).

In Euclidean spaces it is well known that the homogeneous potential spaces \(\dot{\mathbb {H}}_{p}^{\alpha }\left( \mathbb {R}^{d}\right) \) defined via the Riesz potentials coincide with the homogeneous Triebel–Lizorkin spaces \(\dot{\mathbb {F}} _{p,2}^{\alpha }\left( \mathbb {R}^{d}\right) \) defined via the Littlewood–Paley decomposition [3, 18]. We do not know under which assumptions on \(\varPhi \left( x,y\right) \), \(\varphi \left( t\right) \) and \(\mathcal {M}\) the equality \(\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) =\dot{\mathbb {F}} _{p,2}^{\varphi }\left( \mathcal {M}\right) \) holds. In any case, the following proposition guarantees a weaker embedding.

Proposition 3.9

Assume that \(\psi \left( t\right) \) is an increasing function on \(0\leqslant t<+\infty \) with \(\psi \left( 0\right) =0\), and define for \(\varepsilon >0\)

$$\begin{aligned} \varphi \left( t\right) =\sum _{k=0}^{+\infty }\psi \left( 2^{2-k}t\right) +\sum _{k=0}^{+\infty }2^{-(k+1)\varepsilon }\psi \left( 2^{k+2}t\right) . \end{aligned}$$

Also assume that \(\varPhi \left( x,y\right) \) is a kernel on \(\mathcal {M}\times \mathcal {M}\) with the property that for some \( C>0\)

$$\begin{aligned} \left| \varPhi \left( x,y\right) \right|&\leqslant C\frac{\psi \left( \left| x-y\right| \right) }{ \left| B\left( x,6\left| x-y\right| \right) \right| } \quad \mathrm{for\;every}\, x,y\in \mathcal M,\\ \left| \varPhi \left( x,y\right) -\varPhi \left( z,y\right) \right|&\leqslant C\left( \frac{\left| x-z\right| }{ \left| x-y\right| }\right) ^{\varepsilon }\frac{\psi \left( \left| x-y\right| \right) }{ \left| B\left( x,6\left| x-y\right| \right) \right| }\quad \mathrm{when }|x-y|\geqslant 2|x-z|. \end{aligned}$$

Define the potential

$$\begin{aligned} f\left( x\right) =\int _{\mathcal {M}}\varPhi \left( x,y\right) g\left( y\right) \mathrm{d}y. \end{aligned}$$

Finally, define the Hardy–Littlewood maximal operator

$$\begin{aligned} Mg\left( x\right) =\sup _{r>0}\left\{ \left| B\left( x,3r\right) \right| ^{-1}\int _{\left\{ \left| x-y\right| \leqslant r\right\} }\left| g\left( y\right) \right| \mathrm{d}y\right\} . \end{aligned}$$

Then,

$$\begin{aligned} \left| f\left( x\right) -f\left( z\right) \right| \leqslant 2C\varphi \left( \left| x-z\right| \right) \left( Mg\left( x\right) +Mg\left( z\right) \right) . \end{aligned}$$

Proof

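By the hypotheses on the kernel, and since \(\left| x-y\right| <2\left| x-z\right| \) implies \(\left| z-y\right| \leqslant 4\left| x-z\right| \),

$$\begin{aligned} \left| f\left( x\right) -f\left( z\right) \right|&\leqslant \int _{\mathcal {M}}\left| \varPhi \left( x,y\right) -\varPhi \left( z,y\right) \right| \left| g\left( y\right) \right| \mathrm{d}y\\&\leqslant C\int _{\left\{ \left| x-y\right| \leqslant 4\left| x-z\right| \right\} }\psi \left( \left| x-y\right| \right) \left| B\left( x,6\left| x-y\right| \right) \right| ^{-1}\left| g\left( y\right) \right| \mathrm{d}y\\&\quad +C\int _{\left\{ \left| z-y\right| \leqslant 4\left| x-z\right| \right\} }\psi \left( \left| z-y\right| \right) \left| B\left( z,6\left| z-y\right| \right) \right| ^{-1}\left| g\left( y\right) \right| \mathrm{d}y\\&\quad +C\int _{\left\{ \left| x-y\right| \geqslant 2\left| x-z\right| \right\} }\left| x-z\right| ^{\varepsilon }\left| x-y\right| ^{-\varepsilon }\psi \left( \left| x-y\right| \right) \left| B\left( x,6\left| x-y\right| \right) \right| ^{-1}\left| g\left( y\right) \right| \mathrm{d}y. \end{aligned}$$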

The dyadic decomposition

$$\begin{aligned} \left\{ \left| x-y\right| \leqslant 4\left| x-z\right| \right\} =\bigcup _{k=0}^{+\infty }\left\{ 2^{1-k}\left| x-z\right| < \left| x-y\right| \leqslant 2^{2-k}\left| x-z\right| \right\} \end{aligned}$$

gives

$$\begin{aligned}&\int _{\left\{ \left| x-y\right| \leqslant 4\left| x-z\right| \right\} }\psi \left( \left| x-y\right| \right) \left| B\left( x,6\left| x-y\right| \right) \right| ^{-1}\left| g\left( y\right) \right| \mathrm{d}y \\&\quad \leqslant \sum _{k=0}^{+\infty }\psi \left( 2^{2-k}\left| x-z\right| \right) \left| B\left( x,6\cdot 2^{1-k}\left| x-z\right| \right) \right| ^{-1}\int _{\left\{ \left| x-y\right| \leqslant 2^{2-k}\left| x-z\right| \right\} }\left| g\left( y\right) \right| \mathrm{d}y \\&\quad \leqslant \sum _{k=0}^{+\infty }\psi \left( 2^{2-k}\left| x-z\right| \right) \sup _{r>0}\left\{ \left| B\left( x,3r\right) \right| ^{-1}\int _{\left\{ \left| x-y\right| \leqslant r\right\} }\left| g\left( y\right) \right| \mathrm{d}y\right\} \\&\quad \leqslant \varphi \left( \left| x-z\right| \right) Mg\left( x\right) . \end{aligned}$$

Similarly,

$$\begin{aligned} \int _{\left\{ \left| z-y\right| \leqslant 4\left| x-z\right| \right\} }\psi \left( \left| z-y\right| \right) \left| B\left( z,6\left| z-y\right| \right) \right| ^{-1}\left| g\left( y\right) \right| \mathrm{d}y\leqslant \varphi \left( \left| x-z\right| \right) Mg\left( z\right) . \end{aligned}$$

Finally, the dyadic decomposition

$$\begin{aligned} \left\{ \left| x-y\right| \geqslant 2\left| x-z\right| \right\} =\bigcup _{k=0}^{+\infty }\left\{ 2^{k+1}\left| x-z\right| \leqslant \left| x-y\right| < 2^{k+2}\left| x-z\right| \right\} \end{aligned}$$

gives

$$\begin{aligned}&\int _{\left\{ \left| x-y\right| \geqslant 2\left| x-z\right| \right\} }\left| x-z\right| ^{\varepsilon } \left| x-y\right| ^{-\varepsilon }\psi \left( \left| x-y\right| \right) \left| B\left( x,6\left| x-y\right| \right) \right| ^{-1}\left| g\left( y\right) \right| \mathrm{d}y \\&\quad \leqslant \sum _{k=0}^{+\infty }2^{-(k+1)\varepsilon }\psi \left( 2^{k+2}\left| x-z\right| \right) \left| B\left( x,6\cdot 2^{k+1}\left| x-z\right| \right) \right| ^{-1}\\&\quad \times \,\int _{\left\{ \left| x-y\right| \leqslant 2^{k+2}\left| x-z\right| \right\} }\left| g\left( y\right) \right| \mathrm{d}y\leqslant \varphi \left( \left| x-z\right| \right) Mg\left( x\right) . \end{aligned}$$

\(\square \)
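To see where the constant 2C in Proposition 3.9 comes from, one may collect the three estimates of the proof. Since, by the kernel hypotheses, \(\left| f\left( x\right) -f\left( z\right) \right| \) is at most C times the sum of the three integrals estimated above, one gets

$$\begin{aligned} \left| f\left( x\right) -f\left( z\right) \right|&\leqslant C\varphi \left( \left| x-z\right| \right) \left( Mg\left( x\right) +Mg\left( z\right) +Mg\left( x\right) \right) \\&\leqslant 2C\varphi \left( \left| x-z\right| \right) \left( Mg\left( x\right) +Mg\left( z\right) \right) . \end{aligned}$$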

Corollary 3.10

With the notation of the above proposition, if \(1<p\leqslant +\infty \), then the potential space \(\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) \) can be continuously embedded into \( \dot{\mathbb {F}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) \).

Proof

It suffices to recall that, due to the extra factor 3 in the radius of the ball in the definition of \(Mg\left( x\right) \), this maximal operator is bounded on \(\mathbb {L}^{p}\left( \mathcal {M}\right) \) for all \(1<p\leqslant +\infty \), even when the measure on the metric space is non-doubling. See [37]. \(\square \)

Example 3.11

If \(\psi \left( t\right) =t^{\alpha }\) with \(0<\alpha <\varepsilon \), then

$$\begin{aligned} \varphi \left( t\right)&=\sum _{k=0}^{+\infty }\psi \left( 2^{2-k}t\right) +\sum _{k=0}^{+\infty }2^{-(k+1)\varepsilon }\psi \left( 2^{k+2}t\right) \\&=\left( 2^{2\alpha }\sum _{k=0}^{+\infty }2^{-k\alpha }+2^{2\alpha -\varepsilon }\sum _{k=0}^{+\infty }2^{-k\left( \varepsilon -\alpha \right) }\right) t^{\alpha }=Ct^{\alpha }. \end{aligned}$$

Thus, if

$$\begin{aligned} \left| \varPhi \left( x,y\right) \right|&\leqslant C\frac{\left| x-y\right| ^\alpha }{ \left| B\left( x,6\left| x-y\right| \right) \right| } \quad \mathrm{for\;every}\, x,y\in \mathcal {M},\\ \left| \varPhi \left( x,y\right) -\varPhi \left( z,y\right) \right|&\leqslant C\left( \frac{\left| x-z\right| }{ \left| x-y\right| }\right) ^{\varepsilon }\frac{ \left| x-y\right| ^\alpha }{ \left| B\left( x,6\left| x-y\right| \right) \right| }\quad \mathrm{when }\,|x-y|\geqslant 2|x-z|, \end{aligned}$$

and if \(1<p\leqslant +\infty \) then the potential space \(\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) \) can be continuously embedded into \( \dot{\mathbb {F}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) \) with \(\varphi (t)=t^\alpha \).
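The constant produced by the two geometric sums in Example 3.11 can be made explicit. The following evaluation uses only \(0<\alpha <\varepsilon \):

$$\begin{aligned} C=2^{2\alpha }\sum _{k=0}^{+\infty }2^{-k\alpha }+2^{2\alpha -\varepsilon }\sum _{k=0}^{+\infty }2^{-k\left( \varepsilon -\alpha \right) }=\frac{2^{2\alpha }}{1-2^{-\alpha }}+\frac{2^{2\alpha -\varepsilon }}{1-2^{-\left( \varepsilon -\alpha \right) }}. \end{aligned}$$

The first denominator tends to 0 as \(\alpha \rightarrow 0+\) and the second as \(\alpha \rightarrow \varepsilon -\), which shows why both restrictions on \(\alpha \) are needed.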

4 The Marcinkiewicz–Zygmund Inequality

The main ingredient in what follows is the Marcinkiewicz–Zygmund inequality for sums of independent random variables.

As is well known, the variance of the sum of independent random variables is the sum of the variances. For every sequence of independent random variables \(f_{j}\),

$$\begin{aligned} \mathbb {E}\left( \left| \sum _{j}\left( f_{j} -\mathbb {E}\left( f_{j}\right) \right) \right| ^{2}\right) =\sum _{j}\mathbb {E}\left( \left| f_{j} -\mathbb {E}\left( f_{j}\right) \right| ^{2}\right) . \end{aligned}$$
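This identity follows by expanding the square. Here is a sketch, assuming real-valued \(f_{j}\) with finite second moments and writing \(h_{j}=f_{j}-\mathbb {E}\left( f_{j}\right) \), a notation used only in this remark:

$$\begin{aligned} \mathbb {E}\left( \left| \sum _{j}h_{j}\right| ^{2}\right) =\sum _{j}\mathbb {E}\left( h_{j}^{2}\right) +\sum _{j\ne k}\mathbb {E}\left( h_{j}h_{k}\right) =\sum _{j}\mathbb {E}\left( h_{j}^{2}\right) , \end{aligned}$$

since independence gives \(\mathbb {E}\left( h_{j}h_{k}\right) =\mathbb {E}\left( h_{j}\right) \mathbb {E}\left( h_{k}\right) =0\) for \(j\ne k\).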

In fact, there is a similar result with the second moment replaced by other moments and with the equality replaced by two inequalities.

Theorem 4.1

(Marcinkiewicz–Zygmund) For every \(1\leqslant p<+\infty \), there exist positive constants \(A\left( p\right) \) and \(B\left( p\right) \) such that for every sequence \(\{f_{j}\}\) of independent random variables,

$$\begin{aligned}&A\left( p\right) \left\{ \mathbb {E}\left( {\displaystyle \sum _{j=1}^{N}} \left| f_{j}-\mathbb {E}\left( f_{j}\right) \right| ^{2}\right) ^{p/2}\right\} ^{1/p} \\&\quad \leqslant \left\{ \mathbb {E}\left( \left| {\displaystyle \sum _{j=1}^{N}} \left( f_{j}-\mathbb {E}\left( f_{j}\right) \right) \right| ^{p}\right) \right\} ^{1/p}\\&\quad \leqslant B\left( p\right) \left\{ \mathbb {E}\left( {\displaystyle \sum _{j=1}^{N}} \left| f_{j}-\mathbb {E}\left( f_{j}\right) \right| ^{2}\right) ^{p/2}\right\} ^{1/p}. \end{aligned}$$

The Marcinkiewicz–Zygmund inequality is a generalization of the classical inequality of Khintchine for sums of random variables with Rademacher distribution, that is, random variables that take the values \(\pm 1\) with probability 1/2 each. For a proof, see [34] and [35], or [17].

In what follows, special attention will be paid to the constants, and \(A\left( p\right) \) and \(B\left( p\right) \) will denote the best constants in the Marcinkiewicz–Zygmund inequality. If \(\overline{A}\left( p\right) \) and \(\overline{B}\left( p\right) \) are the corresponding best constants for the Khintchine inequality, then it can be proved that

$$\begin{aligned} \tfrac{1}{2}\overline{A}\left( p\right) \leqslant A\left( p\right) \leqslant \overline{A}\left( p\right) \quad \mathrm{and}\quad \overline{B}\left( p\right) \leqslant B\left( p\right) \leqslant 2\overline{B}\left( p\right) ; \end{aligned}$$

see, for example, [17]. In particular there is a positive constant c such that \(c\leqslant \overline{A}(p)\leqslant 1\), while \(\overline{B}\left( p\right) =1\) for \(1\leqslant p\leqslant 2\) and \(\overline{B}(p)=\sqrt{2}\left( \varGamma \left( ({p+1})/{2}\right) / \sqrt{\pi }\right) ^{1/p}\) for \(2\leqslant p<+\infty \); see [14] and [21].
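As a sanity check of the formula for \(\overline{B}\left( p\right) \), here are two explicit values together with the growth rate. The asymptotic statement is a consequence of Stirling's formula and is included only as an illustration:

$$\begin{aligned} \overline{B}\left( 2\right)&=\sqrt{2}\left( \frac{\varGamma \left( 3/2\right) }{\sqrt{\pi }}\right) ^{1/2}=\sqrt{2}\left( \frac{1}{2}\right) ^{1/2}=1,\\ \overline{B}\left( 4\right)&=\sqrt{2}\left( \frac{\varGamma \left( 5/2\right) }{\sqrt{\pi }}\right) ^{1/4}=\sqrt{2}\left( \frac{3}{4}\right) ^{1/4}=3^{1/4},\\ \overline{B}\left( p\right)&\sim \sqrt{p/e}\quad \mathrm{as}\, p\rightarrow +\infty . \end{aligned}$$

In particular, the two expressions for \(\overline{B}\left( p\right) \) agree at \(p=2\), and \(\overline{B}\left( p\right) \), hence also \(B\left( p\right) \), is unbounded as \(p\rightarrow +\infty \).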

We remark that the Marcinkiewicz–Zygmund inequality can also be extended to infinite sums of independent random variables.

From now on we will assume that \(\mathcal {M}\) is a measure space of finite measure which can be expressed as a finite union \(\mathcal {X}_{1}\cup \cdots \cup \mathcal {X}_{N}\) of disjoint sets \(\mathcal {X}_{1},\ldots ,\mathcal {X}_{N}\) with measures \(0<\left| \mathcal {X}_{j}\right| =\omega _{j}<+\infty \).

As indicated earlier, we write \(\varvec{\omega }=\left( \omega _{1} ,\ldots ,\omega _{N}\right) \), \(\mathbf {x}=\left( x_{1},\ldots ,x_{N}\right) \), \(\mathbf {X}=\mathcal {X}_{1}\times \cdots \times \mathcal {X}_{N}\) and

$$\begin{aligned} \mathrm{d}\mathbf {x}=\frac{\mathrm{d}x_{1}}{\omega _{1}}\times \cdots \times \frac{\mathrm{d}x_{N}}{\omega _{N}}. \end{aligned}$$

The error incurred in a quadrature rule with sampling points \(\mathbf {x}=\left( x_{1},\ldots ,x_{N}\right) \) and weights \(\varvec{\omega }=\left( \omega _{1},\ldots ,\omega _{N}\right) \) is the functional

$$\begin{aligned} \mathcal {E}_{\mathbf {x,\omega }}\left( f\right) ={\sum _{j=1}^{N}}\omega _{j}f\left( x_{j}\right) -\int _{\mathcal {M}}f\left( x\right) \mathrm{d}x. \end{aligned}$$

The above Marcinkiewicz–Zygmund inequality has an immediate corollary that allows us to control the norm of this error.

Corollary 4.2

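Let \(1\leqslant p<+\infty \). For every measurable function \(f\left( x\right) \) on \(\mathcal {M}\),

$$\begin{aligned}&A\left( p\right) \left\{ \int _{\mathbf {X}}\left( {\displaystyle \sum _{j=1}^{N}} \left| \omega _{j}f\left( x_{j}\right) -\int _{\mathcal {X}_{j}}f\left( z_{j}\right) \mathrm{d}z_{j}\right| ^{2}\right) ^{p/2}\mathrm{d}\mathbf {x}\right\} ^{1/p} \\&\quad \leqslant \left\{ \int _{\mathbf {X}}\left| \mathcal {E}_{\mathbf {x,\omega }}\left( f\right) \right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p}\\&\quad \leqslant B\left( p\right) \left\{ \int _{\mathbf {X}}\left( {\displaystyle \sum _{j=1}^{N}} \left| \omega _{j}f\left( x_{j}\right) -\int _{\mathcal {X}_{j}}f\left( z_{j}\right) \mathrm{d}z_{j}\right| ^{2}\right) ^{p/2}\mathrm{d}\mathbf {x}\right\} ^{1/p}. \end{aligned}$$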

Proof

It suffices to apply the Marcinkiewicz–Zygmund inequality to the independent random variables

$$\begin{aligned} f_{j}\left( \mathbf {x}\right) =f_{j}\left( x_{1},x_{2},\ldots ,x_{N}\right) =\omega _{j}f\left( x_{j}\right) , \end{aligned}$$

and observe that

$$\begin{aligned} \mathbb {E}\left( f_{j}\right) =\int _{\mathcal {X}_{j}}f\left( x_{j}\right) \mathrm{d}x_{j} \end{aligned}$$

and

$$\begin{aligned} {\displaystyle \sum _{j=1}^{N}} \left( f_{j}-\mathbb {E}\left( f_{j}\right) \right)= & {} {\displaystyle \sum _{j=1}^{N}} \left( \omega _{j}f\left( x_{j}\right) -\int _{\mathcal {X}_{j}}f\left( x_{j}\right) \mathrm{d}x_{j}\right) \\= & {} {\displaystyle \sum _{j=1}^{N}} \omega _{j}f\left( x_{j}\right) -\int _{\mathcal {M}}f\left( x\right) \mathrm{d}x. \end{aligned}$$

\(\square \)
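For \(p=2\) the Marcinkiewicz–Zygmund inequality reduces to the variance identity, so \(A\left( 2\right) =B\left( 2\right) =1\) and the mean square error can be computed by hand in simple cases. As an illustration not taken from the text, let \(\mathcal {M}=[0,1)\) with the Lebesgue measure, \(\mathcal {X}_{j}=[(j-1)/N,j/N)\), \(\omega _{j}=1/N\) and \(f\left( x\right) =x\). Each \(x_{j}\) is uniformly distributed on an interval of length 1/N, so \(\omega _{j}f\left( x_{j}\right) \) has variance \(N^{-2}\cdot \left( 12N^{2}\right) ^{-1}\), and

$$\begin{aligned} \int _{\mathbf {X}}\left| \mathcal {E}_{\mathbf {x,\omega }}\left( f\right) \right| ^{2}\mathrm{d}\mathbf {x}=\sum _{j=1}^{N}\frac{1}{12N^{4}}=\frac{1}{12N^{3}}. \end{aligned}$$

Thus the root mean square error is \(12^{-1/2}N^{-3/2}\), to be compared with \(\left( 12N\right) ^{-1/2}\) for N independent points uniformly distributed in the whole interval.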

5 Diameter Bounded Equal Measure Partition of Metric Measure Spaces

In some of the results that follow we shall assume that a metric measure space \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| \approx N^{-1}\) and \(\delta _{j}={\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d}\). In fact, under appropriate assumptions, Gigante and Leopardi in [19] proved the following more precise result.

Theorem 5.1

Let \(\mathcal {M}\) be a connected metric measure space with finite measure with the property that there exist positive constants d, H and K such that for every \(y\in \mathcal {M}\) and \(0<r<{\text {diam}} (\mathcal {M})\),

$$\begin{aligned} Hr^{d} \leqslant \left| \left\{ x\in \mathcal {M}:\left| x-y\right| < r\right\} \right| \leqslant Kr^{d}. \end{aligned}$$

Then, there exist two constants \(c_{1}\) and \(c_{2}\), such that for every sufficiently large N there exists a partition \(\mathcal {M}=\mathcal {X} _{1}\cup \cdots \cup \mathcal {X}_{N}\) and points \(y_{j}\in \mathcal {X}_{j}\) with \(\left| \mathcal {X}_{j}\right| = \left| \mathcal {M} \right| /N\) and

$$\begin{aligned} \left\{ x\in \mathcal {M}:\left| x-y_{j}\right|< c_{1}N^{-1/d} \right\} \subset \mathcal {X}_{j}\subset \left\{ x\in \mathcal {M}:\left| x-y_{j}\right| < c_{2}N^{-1/d} \right\} . \end{aligned}$$

For example, the theorem applies to all compact Riemannian manifolds. An algorithmic construction in the particular case of the 2-dimensional sphere that pays attention to the size of the constant \(c_2\) is contained in [38] (see also [32] for the extension to higher dimensions).
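A concrete instance, given here only as an illustration, is the partition of the unit square described in the Introduction. Regard \([0,1)^{2}\) as a torus, so that \(d=2\), take \(N=M^{2}\), and let \(\mathcal {X}_{j}\) be the half-open squares of side \(N^{-1/2}\) with centres \(y_{j}\). Then \(\left| \mathcal {X}_{j}\right| =1/N\) and

$$\begin{aligned} \left\{ x\in \mathcal {M}:\left| x-y_{j}\right|< \tfrac{1}{2}N^{-1/2}\right\} \subset \mathcal {X}_{j}\subset \left\{ x\in \mathcal {M}:\left| x-y_{j}\right| < N^{-1/2}\right\} , \end{aligned}$$

so one may take \(c_{1}=1/2\) and \(c_{2}=1\). Any \(c_{2}>\sqrt{2}/2\) would do, since the farthest corner of \(\mathcal {X}_{j}\) lies at distance \(\left( \sqrt{2}/2\right) N^{-1/2}\) from \(y_{j}\).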

6 Numerical Integration in Potential Spaces

In this section we shall study the functional \(\mathcal {E}_{\mathbf {x,\omega } }\) on the potential space \(\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) \). Let

$$\begin{aligned} \left\| \mathcal {E}_{\mathbf {x,\omega }}\right\| _{\varPhi ,p} =\sup _{f\in \mathbb H^{\varPhi }_p(\mathcal M)}\left\{ \frac{\left| \mathcal {E}_{\mathbf {x,\omega }}\left( f\right) \right| }{\left\| f\right\| _{\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) }} \right\} \end{aligned}$$

be the norm of this functional, also termed the worst case error. The following lemma gives an explicit formula for it.

Lemma 6.1

Assume that a measure space \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with measure \(\left| \mathcal {X} _{j}\right| =\omega _{j}>0\). Assume also that \(1\leqslant p\leqslant +\infty \), \(1\leqslant q\leqslant +\infty \), \(1/p+1/q=1\), and that for every x,

$$\begin{aligned} {\left\{ \int _{\mathcal {M}}\left| \varPhi \left( x,y\right) \right| ^{q}\mathrm{d}y\right\} ^{1/q}<+\infty .} \end{aligned}$$

Finally, assume that

$$\begin{aligned} \left\{ \int _{\mathcal {M}}\left( \int _{\mathcal {M}}\left| \varPhi \left( x,y\right) \right| \mathrm{d}x\right) ^{q}\mathrm{d}y\right\} ^{1/q}<+\infty . \end{aligned}$$

Then the functional \(\mathcal {E}_{\mathbf {x,\omega }}\) is well defined and continuous on \(\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) \), and its norm is

$$\begin{aligned} \left\| \mathcal {E}_{\mathbf {x,\omega }}\right\| _{\varPhi ,p}=\left\{ {\displaystyle \int _{\mathcal {M}}}\left| {\displaystyle \sum _{j=1}^{N} }{\displaystyle \int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (x,y)\right) \mathrm{d}x\right| ^{q}\mathrm{d}y\right\} ^{1/q}. \end{aligned}$$

Proof

For simplicity assume that \(q<+\infty \), the case \(q=+\infty \) being similar. Let

$$\begin{aligned} f\left( x\right) = {\displaystyle \int _{\mathcal {M}}} \varPhi \left( x,y\right) g\left( y\right) \mathrm{d}y \end{aligned}$$

be the potential of a function g(x) in \(\mathbb {L}^{p}\left( \mathcal {M} \right) \). Since

$$\begin{aligned} {\displaystyle \int _{\mathcal {M}}} \left| \varPhi \left( x,y\right) \right| ^{q}\mathrm{d}y<+\infty , \end{aligned}$$

\(f\left( x\right) \) is pointwise well defined. Since

$$\begin{aligned} \int _{\mathcal {M}} \left( \int _{\mathcal {M}}\left| \varPhi \left( x,y\right) \right| \mathrm{d}x\right) ^{q}\mathrm{d}y <+\infty , \end{aligned}$$

it follows from Fubini’s theorem that \(f\left( x\right) \) is integrable, and

$$\begin{aligned} {\displaystyle \int _{\mathcal {M}}} f\left( x\right) \mathrm{d}x= {\displaystyle \int _{\mathcal {M}}} {\displaystyle \int _{\mathcal {M}}} \varPhi \left( x,y\right) g\left( y\right) \mathrm{d}y\,\mathrm{d}x= {\displaystyle \int _{\mathcal {M}}} g\left( y\right) {\displaystyle \int _{\mathcal {M}}} \varPhi \left( x,y\right) \mathrm{d}x\,\mathrm{d}y. \end{aligned}$$

This implies that \(\mathcal {E}_{\mathbf {x,\omega }}\left( f\right) \) is well defined. Moreover,

$$\begin{aligned} \left| \mathcal {E}_{\mathbf {x,\omega }}\left( f\right) \right|&=\left| {\displaystyle \sum _{j=1}^{N}} \omega _{j}f\left( x_{j}\right) - {\displaystyle \int _{\mathcal {M}}} f\left( x\right) \mathrm{d}x\right| \\&=\left| {\displaystyle \sum _{j=1}^{N}} \omega _{j} {\displaystyle \int _{\mathcal {M}}} \varPhi \left( x_{j},y\right) g\left( y\right) \mathrm{d}y- {\displaystyle \int _{\mathcal {M}}} g\left( y\right) {\displaystyle \int _{\mathcal {M}}} \varPhi \left( x,y\right) \mathrm{d}x\,\mathrm{d}y\right| \\&=\left| {\displaystyle \int _{\mathcal {M}}} \left( {\displaystyle \sum _{j=1}^{N}} \int _{\mathcal {X}_{j}}\left( \varPhi \left( x_{j},y\right) -\varPhi \left( x,y\right) \right) \mathrm{d}x\right) g\left( y\right) \mathrm{d}y\right| \\&\leqslant \left\{ {\displaystyle \int _{\mathcal {M}}} \left| g(y)\right| ^{p}\mathrm{d}y\right\} ^{1/p}\left\{ {\displaystyle \int _{\mathcal {M} }} \left| {\displaystyle \sum _{j=1}^{N}} \int _{\mathcal {X}_{j}}\left( \varPhi \left( x_{j},y\right) -\varPhi \left( x,y\right) \right) \mathrm{d}x\right| ^{q}\mathrm{d}y\right\} ^{1/q}. \end{aligned}$$

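Taking the infimum as \(g\left( y\right) \) varies among all possible functions in \(\mathbb {L}^{p}\left( \mathcal {M}\right) \) with potential \(f\left( x\right) \) replaces the first factor on the right by \(\left\| f\right\| _{\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) }\). Dividing by this norm and taking the supremum over \(f\), one obtains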

$$\begin{aligned} \left\| \mathcal {E}_{\mathbf {x,\omega }}\right\| _{\varPhi ,p}\leqslant \left\{ {\displaystyle \int _{\mathcal {M}}} \left| {\displaystyle \sum _{j=1}^{N}} \int _{\mathcal {X}_{j}}\left( \varPhi \left( x_{j},y\right) -\varPhi \left( x,y\right) \right) \mathrm{d}x\right| ^{q}\mathrm{d}y\right\} ^{1/q}. \end{aligned}$$

Conversely, using the standard argument for \(L^p-L^q\) duality, set

$$\begin{aligned} F\left( y\right)= & {} {\displaystyle \sum _{j=1}^{N}} \int _{\mathcal {X}_{j}}\left( \varPhi \left( x_{j},y\right) -\varPhi \left( x,y\right) \right) \mathrm{d}x,\\ g\left( y\right)= & {} \left\{ \begin{array} [c]{ll} \overline{F\left( y\right) }\left| F\left( y\right) \right| ^{q/p-1} &{}\quad \mathrm{if}\, F\left( y\right) \ne 0,\\ 0 &{} \quad \mathrm{if}\, F\left( y\right) =0, \end{array} \right. \end{aligned}$$

and

$$\begin{aligned} f\left( x\right) = {\displaystyle \int _{\mathcal {M}}} \varPhi \left( x,y\right) g\left( y\right) \mathrm{d}y. \end{aligned}$$

Then

$$\begin{aligned} \left| \mathcal {E}_{\mathbf {x,\omega }}\left( f\right) \right|&=\left| {\displaystyle \sum _{j=1}^{N}} \omega _{j}f\left( x_{j}\right) - {\displaystyle \int _{\mathcal {M}}} f(x)\mathrm{d}x\right| =\left| {\displaystyle \int _{\mathcal {M}}} F\left( y\right) g\left( y\right) \mathrm{d}y\right| \\&={\displaystyle \int _{\mathcal {M}}} \left| F\left( y\right) \right| ^{1+q/p}\mathrm{d}y =\left\{ {\displaystyle \int _{\mathcal {M}}} \left| F\left( y\right) \right| ^{q}\mathrm{d}y\right\} ^{1/q} \left\{ {\displaystyle \int _{\mathcal {M}}} \left| F\left( y\right) \right| ^{q}\mathrm{d}y\right\} ^{1/p}\\&=\left\{ {\displaystyle \int _{\mathcal {M}}} \left| F\left( y\right) \right| ^{q}\mathrm{d}y\right\} ^{1/q} \left\{ {\displaystyle \int _{\mathcal {M}}} \left| g\left( y\right) \right| ^{p}\mathrm{d}y\right\} ^{1/p} \geqslant \left\{ {\displaystyle \int _{\mathcal {M}}} \left| F\left( y\right) \right| ^{q}\mathrm{d}y\right\} ^{1/q} \left\| f\right\| _{\mathbb {H} _{p}^{\varPhi }}. \end{aligned}$$

This implies that

$$\begin{aligned} \left\| \mathcal {E}_{\mathbf {x,\omega }}\right\| _{\varPhi ,p} \geqslant \left\{ {\displaystyle \int _{\mathcal {M}}} \left| F\left( y\right) \right| ^{q}\mathrm{d}y\right\} ^{1/q}. \end{aligned}$$

\(\square \)
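The third equality in the first chain of the proof of Lemma 6.1 uses only that the sets \(\mathcal {X}_{j}\) are disjoint with union \(\mathcal {M}\) and that \(\omega _{j}=\left| \mathcal {X}_{j}\right| \). Spelled out for a fixed y:

$$\begin{aligned} \sum _{j=1}^{N}\omega _{j}\varPhi \left( x_{j},y\right) -\int _{\mathcal {M}}\varPhi \left( x,y\right) \mathrm{d}x&=\sum _{j=1}^{N}\int _{\mathcal {X}_{j}}\varPhi \left( x_{j},y\right) \mathrm{d}x-\sum _{j=1}^{N}\int _{\mathcal {X}_{j}}\varPhi \left( x,y\right) \mathrm{d}x\\&=\sum _{j=1}^{N}\int _{\mathcal {X}_{j}}\left( \varPhi \left( x_{j},y\right) -\varPhi \left( x,y\right) \right) \mathrm{d}x. \end{aligned}$$

This is the function denoted \(F\left( y\right) \) in the second half of the proof.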

Theorem 6.2

Assume that a measure space \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with measure \(\left| \mathcal {X} _{j}\right| =\omega _{j}>0\). Assume also that \(1\leqslant p\leqslant +\infty \), \(1\leqslant q\leqslant +\infty \), \(1/p+1/q=1\), and that for every x,

$$\begin{aligned} \left\{ \int _{\mathcal {M}}\left| \varPhi \left( x,y\right) \right| ^{q}\mathrm{d}y\right\} ^{1/q}<+\infty . \end{aligned}$$

Finally, assume that

$$\begin{aligned} \left\{ \int _{\mathcal {M}}\left( \int _{\mathcal {M}}\left| \varPhi \left( x,y\right) \right| \mathrm{d}x\right) ^{q}\mathrm{d}y\right\} ^{1/q}<+\infty . \end{aligned}$$

Define

$$\begin{aligned} \varGamma \left( \varPhi \right) ={\sum _{j=1}^{N}}\left\{ {\int _{\mathcal {X}_{j} }\int _{\mathcal {M}}}\left| {\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{q}\mathrm{d}y\frac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/q}, \end{aligned}$$

and

$$\begin{aligned} \varDelta \left( \varPhi \right) =\left\{ {\int _{\mathbf {X}}\int _{\mathcal {M}} }\left( {\sum _{j=1}^{N}}\left| {\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{2}\right) ^{q/2} \mathrm{d}y\,\mathrm{d}\mathbf {x}\right\} ^{1/q}. \end{aligned}$$

Then for every \(1\leqslant p\leqslant +\infty \),

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left\| \mathcal {E}_{\mathbf {x},{\omega }}\right\| _{\varPhi ,p}^{q}\mathrm{d}\mathbf {x}\right\} ^{1/q}\leqslant \varGamma \left( \varPhi \right) , \end{aligned}$$
(7)

and for every \(1<p\leqslant +\infty \),

$$\begin{aligned} A\left( q\right) \varDelta \left( \varPhi \right) \leqslant \left\{ {\int _{\mathbf {X}}}\left\| \mathcal {E}_{\mathbf {x},{\omega }}\right\| _{\varPhi ,p}^{q}\mathrm{d}\mathbf {x}\right\} ^{1/q}\leqslant B\left( q\right) \varDelta \left( \varPhi \right) . \end{aligned}$$
(8)

In particular, there exist choices of nodes \(\left\{ x_{j}\right\} \) with the property that for every function \(f\left( x\right) \) in the potential space \(\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) \),

$$\begin{aligned} \left| {\sum _{j=1}^{N}}\omega _{j}f\left( x_{j}\right) -{\int _{\mathcal {M}}}f(x)\mathrm{d}x\right| \leqslant \left\{ \begin{array} [c]{ll} \varGamma \left( \varPhi \right) \left\| f\right\| _{\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) } &{}\quad \mathrm{for\;every}\, 1\leqslant p\leqslant +\infty ,\\ B\left( q\right) \varDelta \left( \varPhi \right) \left\| f\right\| _{\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) } &{}\quad \mathrm{for\;every}\, 1<p\leqslant +\infty . \end{array} \right. \end{aligned}$$

The constants \(A\left( q\right) \) and \(B\left( q\right) \) are the best constants in the Marcinkiewicz–Zygmund inequality. The constants \(\varGamma \left( \varPhi \right) \) and \(\varDelta \left( \varPhi \right) \) are related to the smoothness of the kernel \(\varPhi \left( x,y\right) \). These last constants could be estimated in terms of Sobolev norms. However, in the applications, the estimates in terms of Sobolev norms are not always optimal, and it is more convenient to keep the above complicated expressions. Finally, since \(B\left( q\right) \rightarrow +\infty \) as \(p\rightarrow 1+\), the estimate (7) is of interest mainly when p is close to 1.

Proof

By Lemma 6.1 and the triangle inequality,

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left\| \mathcal {E}_{\mathbf {x},{\omega }}\right\| _{\varPhi ,p}^{q}\mathrm{d}\mathbf {x}\right\} ^{1/q}&=\left\{ {\int _{\mathbf {X}}\int _{\mathcal {M}}}\left| {\sum _{j=1}^{N}} {\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{q}\,\mathrm{d}y\,\mathrm{d}\mathbf {x}\right\} ^{1/q}\\&\leqslant {\sum _{j=1}^{N}}\left\{ {\int _{\mathcal {X}_{j}}\int _{\mathcal {M} }}\left| {\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j} ,y)\right) \mathrm{d}z_{j}\right| ^{q}\mathrm{d}y\,\dfrac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/q}, \end{aligned}$$

where we have used the fact that for a function depending only on \(x_j\), integration on \(\mathbf X\) coincides with integration on \(\mathcal X_j\). This gives the proof with \(\varGamma \left( \varPhi \right) \). The proof with \(\varDelta \left( \varPhi \right) \) is similar, with the crucial difference that we replace the triangle inequality with the Marcinkiewicz–Zygmund inequality. Indeed, by Corollary 4.2,

$$\begin{aligned}&\left\{ {\int _{\mathbf {X}}}\left\| \mathcal {E}_{\mathbf {x},{\omega }}\right\| _{\varPhi ,p}^{q}\mathrm{d}\mathbf {x}\right\} ^{1/q} \\&\quad =\left\{ {\int _{\mathcal {M}}\int _{\mathbf {X}}}\left| {\sum _{j=1}^{N}\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j} \right| ^{q}\mathrm{d}\mathbf {x}\,\mathrm{d}y\right\} ^{1/q}\\&\quad \leqslant B\left( q\right) \left\{ {\int _{\mathcal {M}}\int _{\mathbf {X}} }\left( {\sum _{j=1}^{N}}\left| {\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{2}\right) ^{q/2} \mathrm{d}\mathbf {x}\,\mathrm{d}y\right\} ^{1/q}. \end{aligned}$$

The proof for the lower bound is similar. \(\square \)
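When \(p=q=2\), the Marcinkiewicz–Zygmund inequality is the variance identity of Section 4, so \(A\left( 2\right) =B\left( 2\right) =1\) and (8) becomes an equality. Here is a sketch of what it says, obtained by exchanging the sum with the integrals and integrating out the variables other than \(x_{j}\):

$$\begin{aligned} \int _{\mathbf {X}}\left\| \mathcal {E}_{\mathbf {x},{\omega }}\right\| _{\varPhi ,2}^{2}\mathrm{d}\mathbf {x}=\varDelta \left( \varPhi \right) ^{2}=\sum _{j=1}^{N}\int _{\mathcal {X}_{j}}\int _{\mathcal {M}}\left| \int _{\mathcal {X}_{j}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{2}\mathrm{d}y\,\frac{\mathrm{d}x_{j}}{\omega _{j}}. \end{aligned}$$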

The following corollary is a slightly generalized version of Theorem I in the Introduction.

Corollary 6.3

Let \(\mathcal {M}\) be a metric measure space with the property that there exist d and c such that for every \(y\in \mathcal {M}\) and \(r>0\),

$$\begin{aligned} \left| \left\{ x\in \mathcal {M}:\left| x-y\right| \leqslant r\right\} \right| \leqslant cr^{d}. \end{aligned}$$

Assume also that \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| \approx N^{-1}\) and \(\delta _{j}={\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d}\). Assume that for some \(\varepsilon >0\) and \(0<\alpha <d\),

$$\begin{aligned} \left| \varPhi \left( x,y\right) \right| \leqslant c\left| x-y\right| ^{\alpha -d} \end{aligned}$$

for every x and y, and

$$\begin{aligned} \left| \varPhi \left( x,y\right) -\varPhi \left( z,y\right) \right| \leqslant c\left| x-z\right| ^{\varepsilon } \left| x-y\right| ^{\alpha -d-\varepsilon } \end{aligned}$$

if \(\left| x-y\right| \geqslant 2\left| x-z\right| \). Finally, assume that \(1<p\leqslant +\infty \), \(1/p+1/q=1\) and \(d/p<\alpha <d\). Then

$$\begin{aligned} \left\{ {\displaystyle \int _{\mathbf {X}}} \left\| \mathcal {E} _{\mathbf {x},{\omega }}\right\| _{\varPhi ,p} ^{q} \mathrm{d}\mathbf {x}\right\} ^{1/q} \leqslant \left\{ \begin{array} [c]{ll} cN^{-\alpha /d} &{}\quad \mathrm{if}\, \alpha <d/2+\varepsilon ,\\ cN^{-1/2-\varepsilon /d}\left( \log N\right) ^{1/2} &{}\quad \mathrm{if}\, \alpha =d/2+\varepsilon ,\\ cN^{-1/2-\varepsilon /d} &{}\quad \mathrm{if }\,\alpha >d/2+\varepsilon . \end{array} \right. \end{aligned}$$

Proof

The assumption \(\alpha >d/p\) ensures that the kernel \(\varPhi \left( x,y\right) \) is \(q\)-integrable and satisfies the hypotheses of Theorem 6.2. It then suffices to estimate

$$\begin{aligned} \varDelta \left( \varPhi \right) =\left\{ {\displaystyle \int _{\mathcal {M}}} {\displaystyle \int _{\mathbf {X}}} \left( {\displaystyle \sum _{j=1}^{N}} \left| {\displaystyle \int _{\mathcal {X}_{j}}} \left( \varPhi (x_{j} ,y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{2}\right) ^{q/2} \mathrm{d}\mathbf {x}\,\mathrm{d}y \right\} ^{1/q}. \end{aligned}$$

If \({\text {dist}}\left( y,\mathcal {X}_{j}\right) \leqslant 2\delta _{j}\), then for every \(x_{j}\) in \(\mathcal {X}_{j}\),

$$\begin{aligned} \left| {\displaystyle \int _{\mathcal {X}_{j}}} \left( \varPhi (x_{j} ,y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right|&\leqslant c {\displaystyle \int _{\mathcal {X}_{j}}} \left( \left| x_{j}-y\right| ^{\alpha -d} +\left| z_{j}-y\right| ^{\alpha -d}\right) \mathrm{d}z_{j}\\&\leqslant c\omega _{j}\left| x_{j}-y\right| ^{\alpha -d} +c\delta _{j}^{\alpha } \leqslant cN^{-1}\left| x_{j}-y\right| ^{\alpha -d}. \end{aligned}$$

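If \({\text {dist}}\left( y,\mathcal {X}_{j}\right) >2\delta _{j}\), then \(\left| x_{j}-y\right| \geqslant 2\left| x_{j}-z_{j}\right| \) for all \(x_{j}\) and \(z_{j}\) in \(\mathcal {X}_{j}\), so the second hypothesis on the kernel applies and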

$$\begin{aligned} \left| {\displaystyle \int _{\mathcal {X}_{j}}} \left( \varPhi (x_{j} ,y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right|&\leqslant c {\displaystyle \int _{\mathcal {X}_{j}}} \left| x_{j}-z_{j}\right| ^{\varepsilon } \left| x_{j}-y\right| ^{\alpha -d-\varepsilon }\mathrm{d}z_{j}\\&\leqslant c\delta _{j}^{\varepsilon }\omega _{j} \left| x_{j}-y\right| ^{\alpha -d-\varepsilon } \leqslant cN^{-1-\varepsilon /d} \left| x_{j}-y\right| ^{\alpha -d-\varepsilon }. \end{aligned}$$

Hence

$$\begin{aligned}&\left\{ {\displaystyle \int _{\mathcal {M}}} {\displaystyle \int _{\mathbf {X}}} \left( \sum _{j=1}^{N}\left| \int _{\mathcal {X}_{j}} \left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{2}\right) ^{q/2} \mathrm{d}\mathbf {x}\,\mathrm{d}y\right\} ^{1/q}\\&\quad \leqslant c\left\{ {\displaystyle \int _{\mathcal {M}}} {\displaystyle \int _{\mathbf {X}}} \left( N^{-2} {\displaystyle \sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) \leqslant 2\delta _{j} }} \left| x_{j}-y\right| ^{2\alpha -2d}\right) ^{q/2} \mathrm{d}\mathbf {x} \,\mathrm{d}y\right\} ^{1/q}\\&\quad \qquad +c\left\{ {\displaystyle \int _{\mathcal {M}}} {\displaystyle \int _{\mathbf {X}}} \left( N^{-2-2\varepsilon /d} {\displaystyle \sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) >2\delta _{j}}} \left| x_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\right) ^{q/2} \mathrm{d}\mathbf {x}\,\mathrm{d}y\right\} ^{1/q}. \end{aligned}$$

Under the assumption that \({\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d}\), there is only a bounded number of \(\mathcal {X}_{j}\) with \({\text {dist}}\left( y,\mathcal {X}_{j}\right) \leqslant 2{\text {diam}}\left( \mathcal {X}_{j}\right) \). Hence

$$\begin{aligned}&\left\{ {\displaystyle \int _{\mathcal {M}}} {\displaystyle \int _{\mathbf {X}}} \left( N^{-2} {\displaystyle \sum _{j:{\text {dist}}\left( y,\mathcal {X} _{j}\right) \leqslant 2\delta _{j}}} \left| x_{j}-y\right| ^{2\alpha -2d}\right) ^{q/2} \mathrm{d}\mathbf {x}\,\mathrm{d}y\right\} ^{1/q}\\&\quad \leqslant c\left\{ {\displaystyle \int _{\mathcal {M}}} {\displaystyle \int _{\mathbf {X}}} {\displaystyle \sum _{j=1}^{N}} \left( N^{-2}\chi _{\left\{ {\text {dist}}\left( y,\mathcal {X}_{j}\right) \leqslant 2\delta _{j}\right\} }\left( y\right) \left| x_{j}-y\right| ^{2\alpha -2d}\right) ^{q/2}\mathrm{d}\mathbf {x}\,\mathrm{d}y\right\} ^{1/q}\\&\quad \leqslant c\left\{ N^{-q} {\displaystyle \sum _{j=1}^{N}} {\displaystyle \int _{\mathcal {X}_{j}}} {\displaystyle \int _{\left\{ \left| y-x_{j}\right| \leqslant cN^{-1/d}\right\} }} \left| x_{j} -y\right| ^{\alpha q-dq} \mathrm{d}y\frac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/q} \leqslant cN^{-\alpha /d}. \end{aligned}$$
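
The exponent bookkeeping behind the last inequality can be sketched as follows. This is only a sketch, under two assumptions that are not restated in this proof: that balls satisfy \(\left| \left\{ y:\left| y-x\right| \leqslant r\right\} \right| \leqslant Kr^{d}\), and that \(\alpha <d\). Write \(\gamma =\alpha q-dq<0\) and \(\rho =cN^{-1/d}\) (shorthand introduced here), and note that \(\gamma +d>0\) is equivalent to \(\alpha >d/p\). Splitting the ball into the dyadic shells \(2^{-k-1}\rho <\left| x_{j}-y\right| \leqslant 2^{-k}\rho \) and using \(\int _{\mathcal {X}_{j}}\mathrm{d}x_{j}/\omega _{j}=1\) gives

$$\begin{aligned} \int _{\left\{ \left| y-x_{j}\right| \leqslant \rho \right\} }\left| x_{j}-y\right| ^{\gamma }\mathrm{d}y&\leqslant \sum _{k=0}^{+\infty }\left( 2^{-k-1}\rho \right) ^{\gamma }K\left( 2^{-k}\rho \right) ^{d}=2^{-\gamma }K\rho ^{\gamma +d}\sum _{k=0}^{+\infty }2^{-k\left( \gamma +d\right) }\leqslant cN^{-\left( \alpha q-dq+d\right) /d},\\ N^{-q}\sum _{j=1}^{N}cN^{-\left( \alpha q-dq+d\right) /d}&=cN^{-q+1-\alpha q/d+q-1}=cN^{-\alpha q/d}. \end{aligned}$$

Taking the q-th root yields the rate \(N^{-\alpha /d}\).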

Moreover,

$$\begin{aligned}&N^{-2-2\varepsilon /d} {\displaystyle \sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right)>2\delta _{j}}} \left| x_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\\&\quad \leqslant cN^{-1-2\varepsilon /d} {\displaystyle \sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right)>2\delta _{j}}} \omega _{j}\left| x_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\\&\quad \leqslant cN^{-1-2\varepsilon /d} {\displaystyle \sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right)>2\delta _{j}}} \int _{\mathcal {X}_{j}}\left| x-y\right| ^{2\alpha -2d-2\varepsilon }\mathrm{d}x\\&\quad \leqslant cN^{-1-2\varepsilon /d} {\displaystyle \int _{\left\{ \left| x-y\right|>cN^{-1/d}\right\} }} \left| x-y\right| ^{2\alpha -2d-2\varepsilon }\mathrm{d}x\\&\quad \leqslant \left\{ \begin{array} [c]{ll} cN^{-2\alpha /d} &{}\quad \mathrm{if}\, \alpha <\varepsilon +d/2,\\ cN^{-1-2\varepsilon /d}\log N &{}\quad \mathrm{if}\, \alpha =\varepsilon +d/2,\\ cN^{-1-2\varepsilon /d} &{}\quad \mathrm{if}\, \alpha >\varepsilon +d/2. \end{array} \right. \end{aligned}$$
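
The three regimes in the last step can be traced to a dyadic sum. As a sketch, assume (these are assumptions of the sketch, not part of the statement) that \(\mathcal {M}\) has finite diameter R, that balls satisfy \(\left| \left\{ x:\left| x-y\right| \leqslant r\right\} \right| \leqslant Kr^{d}\), and that \(\alpha <d\). Write \(\beta =2\alpha -d-2\varepsilon \) and \(\rho =cN^{-1/d}\), and for N large let L be the least integer with \(2^{L+1}\rho \geqslant R\), so that \(L\approx \log N\). Splitting into the shells \(2^{k}\rho <\left| x-y\right| \leqslant 2^{k+1}\rho \) gives

$$\begin{aligned} \int _{\left\{ \left| x-y\right| >\rho \right\} }\left| x-y\right| ^{\beta -d}\mathrm{d}x&\leqslant \sum _{k=0}^{L}\left( 2^{k}\rho \right) ^{\beta -d}K\left( 2^{k+1}\rho \right) ^{d}=2^{d}K\rho ^{\beta }\sum _{k=0}^{L}2^{k\beta }\\&\leqslant \left\{ \begin{array} [c]{ll} c\rho ^{\beta }\leqslant cN^{-\left( 2\alpha -d-2\varepsilon \right) /d} &{}\quad \mathrm{if}\, \beta <0,\\ c\log N &{}\quad \mathrm{if}\, \beta =0,\\ c\left( 2^{L}\rho \right) ^{\beta }\leqslant cR^{\beta } &{}\quad \mathrm{if}\, \beta >0. \end{array} \right. \end{aligned}$$

Multiplying by \(N^{-1-2\varepsilon /d}\), the first case becomes \(N^{-1-2\varepsilon /d}N^{-2\alpha /d+1+2\varepsilon /d}=N^{-2\alpha /d}\), while the other two cases keep the factor \(N^{-1-2\varepsilon /d}\).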

Hence

$$\begin{aligned}&\left\{ {\displaystyle \int _{\mathcal {M}}} {\displaystyle \int _{\mathbf {X}}} \left( N^{-2-2\varepsilon /d} {\displaystyle \sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right)>2\delta _{j}}} \left| x_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\right) ^{q/2} \mathrm{d}\mathbf {x}\,\mathrm{d}y\right\} ^{1/q}\\&\quad \leqslant \left\{ \begin{array} [c]{ll} cN^{-\alpha /d} &{} \quad \mathrm{if}\, \alpha <\varepsilon +d/2,\\ cN^{-1/2-\varepsilon /d}\left( \log N\right) ^{1/2} &{}\quad \mathrm{if}\, \alpha =\varepsilon +d/2,\\ cN^{-1/2-\varepsilon /d} &{}\quad \mathrm{if}\, \alpha >\varepsilon +d/2. \end{array} \right. \end{aligned}$$

\(\square \)

The following result, a slightly generalized version of Theorem II in the Introduction, shows that under some natural assumptions on the kernel the mean value estimate in the above corollary is essentially sharp.

Corollary 6.4

Let \(\mathcal {M}\) be a metric measure space with the property that there exist H, K and d such that for every \(y\in \mathcal {M} \) and \(0<r<r_{0}\),

$$\begin{aligned} Hr^{d}\leqslant \left| \left\{ x\in \mathcal {M}:\left| x-y\right| \leqslant r\right\} \right| \leqslant Kr^{d}. \end{aligned}$$

Assume also that \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| \approx N^{-1}\) and \(\delta _{j}={\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d} \). Suppose that there exist \(0<\alpha <d\) and \(\varepsilon >0\), such that for any \(j=1,\ldots ,N\) and any \(z\in \mathcal {X}_{j}\), and for any y such that \({\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}\),

$$\begin{aligned} \int _{\mathcal {X}_{j}}\left| \varPhi \left( x,y\right) -\varPhi \left( z,y\right) \right| \mathrm{d}x\geqslant cN^{-1-\varepsilon /d}\left( {\text {dist}}\left( y,\mathcal {X}_{j}\right) \right) ^{\alpha -d-\varepsilon }. \end{aligned}$$

Suppose also that for any \(y\in \mathcal {M}\), the function \(x\mapsto \varPhi \left( x,y\right) \) is continuous in \(x\ne y\). Finally, assume that \(1<p\leqslant +\infty \), \(1/p+1/q=1\) and \(d/p<\alpha <d\). Then

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left\| \mathcal {E}_{\mathbf {x},{\omega }}\right\| _{\varPhi ,p} ^{q}\mathrm{d}\mathbf {x}\right\} ^{1/q}\geqslant \left\{ \begin{array} [c]{ll} cN^{-\alpha /d} &{}\quad \mathrm{if}\, \alpha <d/2+\varepsilon ,\\ cN^{-1/2-\varepsilon /d}\left( \log N\right) ^{1/2} &{}\quad \mathrm{if}\, \alpha =d/2+\varepsilon ,\\ cN^{-1/2-\varepsilon /d} &{} \quad \mathrm{if}\, \alpha >d/2+\varepsilon . \end{array} \right. \end{aligned}$$

Proof

It follows from Lemma 6.1 that

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left\| \mathcal {E}_{\mathbf {x},{\omega }}\right\| _{\varPhi ,p}^{q}\mathrm{d}\mathbf {x}\right\} ^{1/q}&=\left\{ {\int _{\mathbf {X}} }{\int _{\mathcal {M}}}\left| {\sum _{j=1}^{N}}{\int _{\mathcal {X}_{j}} }\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{q} \mathrm{d}y\,\mathrm{d}\mathbf {x}\right\} ^{1/q}\\&\geqslant \left| \mathcal {M}\right| ^{-1/p}{\int _{\mathbf {X}}} {\int _{\mathcal {M}}}\left| {\sum _{j=1}^{N}}{\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| \mathrm{d}y\,\mathrm{d}\mathbf {x}. \end{aligned}$$

By Corollary 4.2, this is bounded from below by

$$\begin{aligned}&\left| \mathcal {M}\right| ^{-1/p}A\left( 1\right) {\int _{\mathcal {M}}}{\int _{\mathbf {X}}}\left( {\sum _{j}}\left| {\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j} \right| ^{2}\right) ^{1/2}\mathrm{d}\mathbf {x}\,\mathrm{d}y\\&\quad \geqslant \left| \mathcal {M}\right| ^{-1/p}A\left( 1\right) {\int _{\mathcal {M}}}{\int _{\mathbf {X}}}\left( {\sum _{j:{\text {dist}} \left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\left| {\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j}\right| ^{2}\right) ^{1/2}\mathrm{d}\mathbf {x}\,\mathrm{d}y. \end{aligned}$$

By the continuity of \(z_{j}\mapsto \varPhi (z_{j},y)\), there exists a point \(x_{j}^{*}\), depending on y, such that

$$\begin{aligned} {\int _{\mathcal {X}_{j}}}\varPhi (z_{j},y)\mathrm{d}z_{j}=\omega _{j}\varPhi (x_{j}^{*},y). \end{aligned}$$

Thus

$$\begin{aligned}&{\int _{\mathbf {X}}}\left( {\sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\left| {\int _{\mathcal {X}_{j}}}\left( \varPhi (x_{j},y)-\varPhi (z_{j},y)\right) \mathrm{d}z_{j} \right| ^{2}\right) ^{1/2}\mathrm{d}\mathbf {x}\\&\quad ={\int _{\mathbf {X}}}\left( {\sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| \varPhi (x_{j},y)-\varPhi (x_{j}^{*},y)\right| ^{2}\right) ^{1/2}\mathrm{d}\mathbf {x}. \end{aligned}$$

For any two positive sequences \(\left\{ \alpha _{j}\right\} \) and \(\left\{ \beta _{j}\right\} \) we clearly have

$$\begin{aligned} \sum _{j}\alpha _{j}^{2}\beta _{j}\leqslant \left( \sum _{j}\alpha _{j}^{2}\right) ^{1/2}\left( \sum _{j}\alpha _{j}^{2}\beta _{j}^{2}\right) ^{1/2}. \end{aligned}$$
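
This is the Cauchy–Schwarz inequality applied to the factors \(\alpha _{j}\) and \(\alpha _{j}\beta _{j}\). Here \(\left\{ \alpha _{j}\right\} \) is a sequence, unrelated to the exponent \(\alpha \) of the kernel. Written out,

$$\begin{aligned} \sum _{j}\alpha _{j}^{2}\beta _{j}=\sum _{j}\alpha _{j}\cdot \left( \alpha _{j}\beta _{j}\right) \leqslant \left( \sum _{j}\alpha _{j}^{2}\right) ^{1/2}\left( \sum _{j}\alpha _{j}^{2}\beta _{j}^{2}\right) ^{1/2},\quad \text{ that } \text{ is, }\quad \left( \sum _{j}\alpha _{j}^{2}\beta _{j}^{2}\right) ^{1/2}\geqslant \left( \sum _{j}\alpha _{j}^{2}\right) ^{-1/2}\sum _{j}\alpha _{j}^{2}\beta _{j}. \end{aligned}$$

The second form is the one applied in the next step, with \(\alpha _{j}^{2}=\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\) and \(\beta _{j}=\left| \varPhi (x_{j},y)-\varPhi (x_{j}^{*},y)\right| \left| v_{j}-y\right| ^{-\left( \alpha -d-\varepsilon \right) }\).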

Thus if \(v_{j}\) is a point of the closure of \(\mathcal {X}_{j}\) that minimizes the distance from y,

$$\begin{aligned}&{\int _{\mathbf {X}}}\left( {\sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\frac{\left| \varPhi (x_{j},y)-\varPhi (x_{j}^{*},y)\right| ^{2}}{\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }}\right) ^{1/2}\mathrm{d}\mathbf {x}\\&\quad \geqslant {\int _{\mathbf {X}}}\left( {\sum _{j:{\text {dist}} \left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\omega _{j} ^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\right) ^{-1/2}\\&\qquad \times {\sum _{j:{\text {dist}}\left( y,\mathcal {X} _{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\frac{\left| \varPhi (x_{j},y)-\varPhi (x_{j}^{*},y)\right| }{\left| v_{j}-y\right| ^{\alpha -d-\varepsilon } }\,\mathrm{d}\mathbf {x}\\&\quad =\left( {\sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\right) ^{-1/2}\\&\qquad \times {\sum _{j:{\text {dist}}\left( y,\mathcal {X} _{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{\alpha -d-\varepsilon }\left( {\int _{\mathbf {X}}}\left| \varPhi (x_{j},y)-\varPhi (x_{j}^{*},y)\right| \mathrm{d}\mathbf {x}\right) \\&\quad \geqslant c\left( {\sum _{j:{\text {dist}}\left( y,\mathcal {X} _{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\right) ^{-1/2}\\&\qquad \times \,{\sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }N^{-\varepsilon /d}\\&\quad =c\left( {\sum _{j:{\text {dist}}\left( y,\mathcal {X} _{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\right) ^{1/2}N^{-\varepsilon /d}. \end{aligned}$$

The desired result now follows from the estimates

$$\begin{aligned} {\sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\geqslant c\left\{ \begin{array} [c]{ll} N^{-2\alpha /d+2\varepsilon /d} &{}\quad \mathrm{if}\, \alpha <d/2+\varepsilon ,\\ N^{-1}\log N &{}\quad \mathrm{if}\, \alpha =d/2+\varepsilon ,\\ N^{-1} &{}\quad \mathrm{if}\, \alpha >d/2+\varepsilon . \end{array} \right. \end{aligned}$$

Indeed,

$$\begin{aligned} {\sum _{j:{\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}}}\omega _{j}^{2}\left| v_{j}-y\right| ^{2\alpha -2d-2\varepsilon }\geqslant cN^{-1}{\int _{\left\{ \left| x-y\right| >cN^{-1/d}\right\} }}\left| x-y\right| ^{2\alpha -2d-2\varepsilon }\mathrm{d}x. \end{aligned}$$

If \(\lambda \geqslant \left( 2K/H\right) ^{1/d}\), then for every center y and \(r<r_{0}/\lambda \),

$$\begin{aligned} \left| \left\{ x\in \mathcal {M}:r<\left| x-y\right| \leqslant \lambda r\right\} \right| \geqslant Kr^{d}. \end{aligned}$$
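
As a short verification, this follows by subtracting the upper bound for the ball of radius r from the lower bound for the ball of radius \(\lambda r<r_{0}\), and using \(\lambda ^{d}\geqslant 2K/H\):

$$\begin{aligned} \left| \left\{ x\in \mathcal {M}:r<\left| x-y\right| \leqslant \lambda r\right\} \right|&=\left| \left\{ x\in \mathcal {M}:\left| x-y\right| \leqslant \lambda r\right\} \right| -\left| \left\{ x\in \mathcal {M}:\left| x-y\right| \leqslant r\right\} \right| \\&\geqslant H\lambda ^{d}r^{d}-Kr^{d}\geqslant 2Kr^{d}-Kr^{d}=Kr^{d}. \end{aligned}$$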

This gives

$$\begin{aligned}&N^{-1}{\int _{\left\{ \left| x-y\right|>cN^{-1/d}\right\} } }\left| x-y\right| ^{2\alpha -2d-2\varepsilon }\mathrm{d}x\\&\quad \geqslant N^{-1}\sum _{k=0}^{\left[ \log _{\lambda }\left( r_{0} N^{1/d}\right) \right] -1}{\int _{\left\{ \lambda ^{k}N^{-1/d}\leqslant \left| x-y\right|<\lambda ^{k+1}N^{-1/d}\right\} }}\left| x-y\right| ^{2\alpha -2d-2\varepsilon }\mathrm{d}x\\&\quad \geqslant N^{-1}\sum _{k=0}^{\left[ \log _{\lambda }\left( r_{0} N^{1/d}\right) \right] -1}\left( \lambda ^{k+1}N^{-1/d}\right) ^{2\alpha -2d-2\varepsilon }K\left( \lambda ^{k}N^{-1/d}\right) ^{d}\\&\quad =K\lambda ^{2\alpha -2d-2\varepsilon }N^{-2\alpha /d+2\varepsilon /d}\sum _{k=0}^{\left[ \log _{\lambda }\left( r_{0}N^{1/d}\right) \right] -1}\lambda ^{k\left( 2\alpha -d-2\varepsilon \right) }\\&\quad \geqslant c\left\{ \begin{array} [c]{ll} N^{-2\alpha /d+2\varepsilon /d} &{}\quad \mathrm{if}\, \alpha <d/2+\varepsilon ,\\ N^{-1}\log N &{}\quad \mathrm{if}\, \alpha =d/2+\varepsilon ,\\ N^{-1} &{}\quad \mathrm{if}\, \alpha >d/2+\varepsilon . \end{array} \right. \end{aligned}$$

\(\square \)
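
The final case distinction in the proof above rests on the size of a geometric sum. As a sketch, write \(\beta =2\alpha -d-2\varepsilon \) and \(L=\left[ \log _{\lambda }\left( r_{0}N^{1/d}\right) \right] \) (shorthand introduced here), and assume that N is large enough that \(L\geqslant 1\). Then

$$\begin{aligned} \sum _{k=0}^{L-1}\lambda ^{k\beta }\geqslant \left\{ \begin{array} [c]{ll} 1 &{}\quad \mathrm{if}\, \beta <0,\\ L\geqslant c\log N &{}\quad \mathrm{if}\, \beta =0,\\ \lambda ^{\left( L-1\right) \beta }\geqslant \lambda ^{-2\beta }r_{0}^{\beta }N^{\beta /d} &{}\quad \mathrm{if}\, \beta >0, \end{array} \right. \end{aligned}$$

where the last bound uses \(L>\log _{\lambda }\left( r_{0}N^{1/d}\right) -1\). Multiplying by \(N^{-2\alpha /d+2\varepsilon /d}\) and observing that \(N^{-2\alpha /d+2\varepsilon /d}N^{\beta /d}=N^{-1}\), with \(N^{-2\alpha /d+2\varepsilon /d}=N^{-1}\) when \(\beta =0\), recovers the three stated lower bounds.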

Example 6.5

Let \(\mathcal {M}\) be a d-dimensional compact Riemannian manifold. Let \(\left\{ \lambda ^{2}\right\} \) and \(\left\{ \varphi _{\lambda }(x)\right\} \) be the eigenvalues and a complete orthonormal system of eigenfunctions of the Laplace–Beltrami operator \(\varDelta \), respectively. Every tempered distribution on \(\mathcal {M}\) has a Fourier transform and a Fourier series

$$\begin{aligned} \mathcal {F}f(\lambda )={\int _{\mathcal {M}}}f(y)\overline{\varphi _{\lambda } (y)}\mathrm{d}y\quad \text{ and }\quad f(x)={\sum \limits _{\lambda }}\mathcal {F} f(\lambda )\varphi _{\lambda }(x). \end{aligned}$$

The Bessel kernel \(B^{\alpha }(x,y)\), \(-\infty<\alpha <+\infty \), is a distribution defined by the expansion

$$\begin{aligned} B^{\alpha }(x,y)={\sum _{\lambda }}\left( 1+\lambda ^{2}\right) ^{-\alpha /2}\varphi _{\lambda }(x)\overline{\varphi _{\lambda }(y)}. \end{aligned}$$

A distribution f(x) is the Bessel potential of a distribution g(x) if

$$\begin{aligned} f(x)={\int _{\mathcal {M}}}B^{\alpha }(x,y)g(y)\mathrm{d}y={\sum _{\lambda }}\left( 1+\lambda ^{2}\right) ^{-\alpha /2}\mathcal {F}g(\lambda )\varphi _{\lambda }(x). \end{aligned}$$

Bessel potentials of functions in \(\mathbb {L}^{p}\left( \mathcal {M}\right) \) define the fractional Sobolev space \(\mathbb {H}_{p}^{\alpha }\left( \mathcal {M}\right) \). If \(0<\alpha <d\), then the Bessel kernel satisfies the estimates

$$\begin{aligned} \left| B^{\alpha }(x,y)\right| \leqslant c\left| x-y\right| ^{\alpha -d}\quad \text{ and }\quad \left| \nabla B^{\alpha }(x,y)\right| \leqslant c\left| x-y\right| ^{\alpha -d-1}. \end{aligned}$$

See [7, Lemmas 2.5 and 2.6]. In particular, Corollary 6.3 applies with \(\varPhi (x,y)=B^{\alpha }(x,y)\) and \(\varepsilon =1\). Indeed, using the Hadamard parametrix for the wave equation, see, for example, [6], one can prove a more precise result: there is a smooth positive function \(C\left( y\right) \) and positive constants \(\eta \) and c such that

$$\begin{aligned} B^{\alpha }\left( x,y\right) =C\left( y\right) \left| x-y\right| ^{\alpha -d}+E\left( x,y\right) , \end{aligned}$$

with

$$\begin{aligned} \left| E\left( x,y\right) \right| \leqslant c\left| x-y\right| ^{\alpha -d+\eta }\quad \text{ and }\quad \left| \nabla E\left( x,y\right) \right| \leqslant c\left| x-y\right| ^{\alpha -d-1+\eta }. \end{aligned}$$

It then follows that Corollary 6.4 applies also. See [13, Theorems 24 and 25] for the case of the sphere.
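
To see where the choice \(\varepsilon =1\) comes from, here is a sketch under the assumption that \(\left| x-y\right| \) denotes the geodesic distance and that x and z in \(\mathcal {X}_{j}\) are joined by a geodesic of length \(\left| x-z\right| \leqslant \delta _{j}\). If \({\text {dist}}\left( y,\mathcal {X}_{j}\right) \geqslant 2\delta _{j}\), then every point \(\xi \) on this geodesic satisfies \(\left| \xi -y\right| \geqslant \left| x-y\right| -\delta _{j}\geqslant \left| x-y\right| /2\), and the gradient estimate gives

$$\begin{aligned} \left| B^{\alpha }(x,y)-B^{\alpha }(z,y)\right| \leqslant \left| x-z\right| \sup _{\xi }\left| \nabla B^{\alpha }(\xi ,y)\right| \leqslant c\left| x-z\right| \sup _{\xi }\left| \xi -y\right| ^{\alpha -d-1}\leqslant c2^{d+1-\alpha }\left| x-z\right| \left| x-y\right| ^{\alpha -d-1}. \end{aligned}$$

This is the Hölder-type bound \(\left| \varPhi (x,y)-\varPhi (z,y)\right| \leqslant c\left| x-z\right| ^{\varepsilon }\left| x-y\right| ^{\alpha -d-\varepsilon }\) used in the proof of Corollary 6.3, with \(\varepsilon =1\).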

7 Numerical Integration in Besov Spaces

The techniques of the previous section can also be used to study the error in numerical integration from a different perspective. So far we have considered the worst case error

$$\begin{aligned} \left\{ {\displaystyle \int _{\mathbf {X}}} \sup _{\left\| f\right\| _{\mathbb {H}_{p}^{\varPhi }\left( \mathcal {M}\right) }\leqslant 1} \left| \mathcal {E}_{\mathbf {x},{\omega }} \left( f\right) \right| ^{q}\mathrm{d}\mathbf {x}\right\} ^{1/q}, \end{aligned}$$

whereas now we will estimate the error

$$\begin{aligned} \left\{ {\displaystyle \int _{\mathbf {X}}} \left| \mathcal {E}_{\mathbf {x} ,{\omega }}\left( f\right) \right| ^{p} \mathrm{d}\mathbf {x}\right\} ^{1/p} \end{aligned}$$

for a given \(f\in \dot{\mathbb {B}}_{p,\infty }^{\varphi } \left( \mathcal {M}\right) \). The following is a slightly generalized version of Theorem III in the Introduction.

Theorem 7.1

Assume that a metric measure space \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X} _{1}\cup \cdots \cup \mathcal {X}_{N}\), with measure \(0<\left| \mathcal {X} _{j}\right| =\omega _{j}<+\infty \) and \(0<{\text {diam}}\left( \mathcal {X}_{j}\right) =\delta _{j}<+\infty \). Also let \(\varphi \left( t\right) \) be a non-negative increasing function in \(t\geqslant 0\), and let \(\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) \) be the associated Besov space. Then for every \(1\leqslant p\leqslant +\infty \),

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left| \mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) \right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p}\leqslant 2\left| \mathcal {M}\right| ^{1-1/p}\varphi \left( 2\sup \left\{ \delta _{j}\right\} \right) \left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }. \end{aligned}$$
(9)

Furthermore, if \(1\leqslant p\leqslant 2\), then

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left| \mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) \right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p}\leqslant 2B\left( p\right) \sup \left\{ \omega _{j}^{1-1/p}\right\} \varphi \left( 2\sup \left\{ \delta _{j}\right\} \right) \left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }, \end{aligned}$$
(10)

and if \(2\leqslant p<+\infty \), then

$$\begin{aligned} \left\{ {\int _{\mathbf {X}}}\left| \mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) \right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p}\leqslant 2B\left( p\right) \left| \mathcal {M}\right| ^{1/2-1/p}\sup \left\{ \omega _{j}^{1/2}\right\} \varphi \left( 2\sup \left\{ \delta _{j}\right\} \right) \left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }. \end{aligned}$$
(11)

Observe that the estimate (9) is of some interest only for large p. Indeed, if \(1\leqslant p\leqslant 2\), then (10) is better than (9), and if \(2\leqslant p<+\infty \) and \(B\left( p\right) \leqslant \left| \mathcal {M}\right| ^{1/2} \left( \sup \left\{ \omega _{j}^{1/2}\right\} \right) ^{-1}\), then (11) is better than (9).
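
To compare the three estimates concretely, consider the regular situation in which \(\left| \mathcal {M}\right| \) is fixed, \(\omega _{j}\approx N^{-1}\), \(\delta _{j}\approx N^{-1/d}\) and \(\varphi \left( t\right) =t^{\alpha }\). This is an illustrative special case, not needed for the theorem. Up to constants depending on p, the right-hand sides of (9), (10) and (11) are then

$$\begin{aligned} (9):\ cN^{-\alpha /d}\left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) },\qquad (10):\ cN^{1/p-1-\alpha /d}\left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) },\qquad (11):\ cN^{-1/2-\alpha /d}\left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }. \end{aligned}$$

Thus, compared with (9), the estimate (10) gains the factor \(N^{-\left( 1-1/p\right) }\) for \(1\leqslant p\leqslant 2\), and (11) gains the factor \(N^{-1/2}\) for \(2\leqslant p<+\infty \).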

In order to prove Theorem 7.1, we need a Poincaré type inequality for functions in Hajłasz–Besov spaces.

Lemma 7.2

Let \(1\leqslant p\leqslant +\infty \), let \(\mathcal {M}\) be a metric measure space, and let \(\left\{ g_{n}\left( x\right) \right\} \) be a \(\varphi \)-gradient for an integrable function \(f\left( x\right) \). Let \(\mathcal {X}\) be a measurable subset of \(\mathcal {M}\) with \(\omega =\left| \mathcal {X}\right| >0\) and \({\text {diam}}\left( \mathcal {X} \right) \leqslant 2^{-n}\), and let

$$\begin{aligned} f_{\mathcal {X}}=\frac{1}{\omega }{\displaystyle \int _{\mathcal {X}}}f\left( y\right) \mathrm{d}y. \end{aligned}$$

Then

$$\begin{aligned} \left\{ {\int _{\mathcal {X}}}\left| f\left( x\right) -f_{\mathcal {X} }\right| ^{p}\frac{\mathrm{d}x}{\omega }\right\} ^{1/p}\leqslant 2\varphi \left( 2^{-n}\right) \left\{ {\int _{\mathcal {X}}}\left| g_{n}\left( x\right) \right| ^{p}\dfrac{\mathrm{d}x}{\omega }\right\} ^{1/p}. \end{aligned}$$

Proof

For almost every x and y with \(\left| x-y\right| \leqslant 2^{-n} \), we have

$$\begin{aligned} \left| f\left( x\right) -f\left( y\right) \right| \leqslant \varphi \left( 2^{-n}\right) \left( g_{n}\left( x\right) +g_{n}\left( y\right) \right) . \end{aligned}$$

Then, by Hölder’s inequality, we obtain

$$\begin{aligned}&\left\{ {\int _{\mathcal {X}}}\left| f\left( x\right) -f_{\mathcal {X} }\right| ^{p}\frac{\mathrm{d}x}{\omega }\right\} ^{1/p}\leqslant \left\{ {\int _{\mathcal {X}}}{\int _{\mathcal {X}}}\left| f\left( x\right) -f(y)\right| ^{p}\dfrac{\mathrm{d}x}{\omega }\frac{\mathrm{d}y}{\omega }\right\} ^{1/p}\\&\quad \leqslant \varphi \left( 2^{-n}\right) \left\{ {\int _{\mathcal {X}} }{\int _{\mathcal {X}}}\left| g_{n}\left( x\right) +g_{n}\left( y\right) \right| ^{p}\frac{\mathrm{d}x}{\omega }\frac{\mathrm{d}y}{\omega }\right\} ^{1/p} \leqslant 2\varphi \left( 2^{-n}\right) \left\{ {\int _{\mathcal {X}} }\left| g_{n}\left( x\right) \right| ^{p}\dfrac{\mathrm{d}x}{\omega }\right\} ^{1/p}. \end{aligned}$$

\(\square \)
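
The first and last steps of the chain above can be spelled out, here for finite p. Since \(f\left( x\right) -f_{\mathcal {X}}\) is the average of \(f\left( x\right) -f\left( y\right) \) with respect to the probability measure \(\mathrm{d}y/\omega \) on \(\mathcal {X}\), the first inequality is Jensen's inequality (equivalently, Hölder's inequality on a probability space), and the last one is Minkowski's inequality in \(\mathbb {L}^{p}\left( \mathcal {X}\times \mathcal {X}\right) \):

$$\begin{aligned} \left| f\left( x\right) -f_{\mathcal {X}}\right| ^{p}&=\left| \int _{\mathcal {X}}\left( f\left( x\right) -f\left( y\right) \right) \frac{\mathrm{d}y}{\omega }\right| ^{p}\leqslant \int _{\mathcal {X}}\left| f\left( x\right) -f\left( y\right) \right| ^{p}\frac{\mathrm{d}y}{\omega },\\ \left\{ \int _{\mathcal {X}}\int _{\mathcal {X}}\left| g_{n}\left( x\right) +g_{n}\left( y\right) \right| ^{p}\frac{\mathrm{d}x}{\omega }\frac{\mathrm{d}y}{\omega }\right\} ^{1/p}&\leqslant \left\{ \int _{\mathcal {X}}\left| g_{n}\left( x\right) \right| ^{p}\frac{\mathrm{d}x}{\omega }\right\} ^{1/p}+\left\{ \int _{\mathcal {X}}\left| g_{n}\left( y\right) \right| ^{p}\frac{\mathrm{d}y}{\omega }\right\} ^{1/p}. \end{aligned}$$

The two terms on the right of the second line coincide, which produces the factor 2.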

Proof of Theorem 7.1

Let \(f\in \dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) \), and let \(\left\{ g_{n}\left( x\right) \right\} \) be a \(\varphi \)-gradient for \(f\left( x\right) \). Choose n such that \(2^{-n-1}<\sup \delta _{j}\leqslant 2^{-n}\). Then by Lemma 7.2, we have

$$\begin{aligned}&\left\{ {\displaystyle \int _{\mathbf {X}}} \left| {\displaystyle \sum _{j}} \omega _{j}f\left( x_{j}\right) - {\displaystyle \int _{\mathcal {M}}} f\left( x\right) \mathrm{d}x\right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p} =\left\{ {\displaystyle \int _{\mathbf {X}}} \left| {\displaystyle \sum _{j}} \omega _{j}\left( f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right) \right| ^{p} \mathrm{d}\mathbf {x}\right\} ^{1/p} \\&\quad \leqslant {\displaystyle \sum _{j}} \omega _{j}\left\{ {\displaystyle \int _{\mathcal {X}_{j}}} \left| f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right| ^{p} \frac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/p} \leqslant 2\varphi \left( 2^{-n}\right) {\displaystyle \sum _{j}} \omega _{j}\left\{ {\displaystyle \int _{\mathcal {X}_{j}}} \left| g_{n}\left( x_{j}\right) \right| ^{p}\dfrac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/p} \\&\quad \leqslant 2\varphi \left( 2^{-n}\right) \left\{ {\displaystyle \sum _{j}} \omega _{j}\right\} ^{1-1/p}\left\{ {\displaystyle \sum _{j}} {\displaystyle \int _{\mathcal {X}_{j}}} \left| g_{n}\left( x_{j}\right) \right| ^{p}\mathrm{d}x_{j}\right\} ^{1/p}\\&\quad \leqslant 2\varphi \left( 2\sup \delta _{j}\right) \left| \mathcal {M}\right| ^{1-1/p}\left\{ {\displaystyle \int _{\mathcal {M}}} \left| g_{n}\left( x\right) \right| ^{p}\mathrm{d}x\right\} ^{1/p}. \end{aligned}$$

The proofs of (10) and (11) are similar. Indeed, by Corollary 4.2, we have

$$\begin{aligned}&\left\{ {\displaystyle \int _{\mathbf {X}}} \left| {\displaystyle \sum _{j=1}^{N}} \omega _{j}f\left( x_{j}\right) - {\displaystyle \int _{\mathcal {M}}} f(x)\mathrm{d}x\right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p} \\&\quad \leqslant B\left( p\right) \left\{ {\displaystyle \int _{\mathbf {X}}} \left( {\displaystyle \sum _{j=1}^{N}} \omega _{j}^{2} \left| f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right| ^{2}\right) ^{p/2} \mathrm{d}\mathbf {x}\right\} ^{1/p}. \end{aligned}$$

Choose n such that \(2^{-n-1}<\sup \delta _{j}\leqslant 2^{-n}\), and assume that \(1\leqslant p\leqslant 2\). By Lemma 7.2, we obtain

$$\begin{aligned}&\left\{ {\displaystyle \int _{\mathbf {X}}} \left( {\displaystyle \sum _{j=1}^{N}} \omega _{j}^{2}\left| f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right| ^{2}\right) ^{p/2}\mathrm{d}\mathbf {x}\right\} ^{1/p} \leqslant \left\{ {\displaystyle \int _{\mathbf {X}}} \left( {\displaystyle \sum _{j=1}^{N}} \omega _{j}^{p}\left| f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right| ^{p}\right) \mathrm{d}\mathbf {x}\right\} ^{1/p} \\&\quad =\left\{ {\displaystyle \sum _{j=1}^{N}} \omega _{j}^{p} {\displaystyle \int _{\mathcal {X}_{j}}} \left| f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right| ^{p} \frac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/p} \leqslant 2\varphi \left( 2^{-n}\right) \left\{ {\displaystyle \sum _{j=1}^{N}} \omega _{j}^{p} {\displaystyle \int _{\mathcal {X}_{j}}} \left| g_{n}\left( x_{j}\right) \right| ^{p} \frac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/p} \\&\quad \leqslant 2\sup \left( \omega _{j}^{1-1/p}\right) \varphi \left( 2\sup \left\{ \delta _{j}\right\} \right) \left\{ \int _{\mathcal {M}}\left| g_{n}\left( x\right) \right| ^{p}\mathrm{d}x\right\} ^{1/p}. \end{aligned}$$

Similarly, if \(2\leqslant p<+\infty \), then Hölder’s inequality with indices \(p/\left( p-2\right) \) and \(p/2\) yields

$$\begin{aligned}&\left\{ {\displaystyle \int _{\mathbf {X}}} \left( {\displaystyle \sum _{j=1}^{N}} \omega _{j}^{2}\left| f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right| ^{2}\right) ^{p/2}\mathrm{d}\mathbf {x}\right\} ^{1/p} \\&\quad \leqslant \left\{ {\displaystyle \sum _{j=1}^{N}} \omega _{j}^{\left( 2p-2\right) /\left( p-2\right) }\right\} ^{\left( p-2\right) /2p}\left\{ {\displaystyle \sum _{j=1}^{N}} \omega _{j} {\displaystyle \int _{\mathcal {X}_{j}}} \left| f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right| ^{p} \frac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/p} \\&\quad \leqslant \sup \left( \omega _{j}^{1/2}\right) \left\{ {\displaystyle \sum _{j=1}^{N}} \omega _{j}\right\} ^{\left( p-2\right) /2p}2\varphi \left( 2^{-n}\right) \left\{ {\displaystyle \sum _{j=1}^{N}} \omega _{j} {\displaystyle \int _{\mathcal {X}_{j}}} \left| g_{n}\left( x_{j}\right) \right| ^{p} \frac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/p} \\&\quad \leqslant 2\left| \mathcal {M}\right| ^{1/2-1/p} \sup \left( \omega _{j}^{1/2}\right) \varphi \left( 2\sup \delta _{j}\right) \left\{ {\displaystyle \int _{\mathcal {M}}} \left| g_{n}\left( x\right) \right| ^{p}\mathrm{d}x\right\} ^{1/p}. \end{aligned}$$

\(\square \)
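
The exponents in the Hölder step for \(p>2\) can be checked by splitting \(\omega _{j}^{2}=\omega _{j}^{\left( 2p-2\right) /p}\omega _{j}^{2/p}\). Writing \(F_{j}=\left| f\left( x_{j}\right) -f_{\mathcal {X}_{j}}\right| \) (shorthand introduced here),

$$\begin{aligned} \sum _{j=1}^{N}\omega _{j}^{2}F_{j}^{2}&=\sum _{j=1}^{N}\omega _{j}^{\left( 2p-2\right) /p}\left( \omega _{j}^{2/p}F_{j}^{2}\right) \leqslant \left\{ \sum _{j=1}^{N}\omega _{j}^{\left( 2p-2\right) /\left( p-2\right) }\right\} ^{\left( p-2\right) /p}\left\{ \sum _{j=1}^{N}\omega _{j}F_{j}^{p}\right\} ^{2/p},\\ \sum _{j=1}^{N}\omega _{j}^{\left( 2p-2\right) /\left( p-2\right) }&=\sum _{j=1}^{N}\omega _{j}^{p/\left( p-2\right) }\omega _{j}\leqslant \sup \left\{ \omega _{j}^{p/\left( p-2\right) }\right\} \sum _{j=1}^{N}\omega _{j}. \end{aligned}$$

Raising the first line to the power p / 2, integrating over \(\mathbf {X}\) and taking the p-th root gives the first inequality in the proof. Raising the second line to the power \(\left( p-2\right) /2p\) gives the factor \(\sup \left( \omega _{j}^{1/2}\right) \left\{ \sum _{j}\omega _{j}\right\} ^{\left( p-2\right) /2p}\).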

Corollary 7.3

Assume that a metric measure space \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| \approx N^{-1}\) and \({\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d}\) for a suitable positive constant d. Then for every \(1\leqslant p<+\infty \) and every \(0<\varepsilon <1\), there exists a constant c with the following property. For every function \(f\left( x\right) \) in the Besov space \(\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) \), with \(\varphi \left( t\right) =t^{\alpha }\) and \(\alpha >0\), a random choice of points \(\left\{ x_{j}\right\} \) in \(\left\{ \mathcal {X}_{j}\right\} \) gives, with probability greater than \(1-\varepsilon \),

$$\begin{aligned} \left| {\sum _{j=1}^{N}}\omega _{j}f\left( x_{j}\right) -{\int _{\mathcal {M}}}f(x)\mathrm{d}x\right| \leqslant \left\{ \begin{array} [c]{ll} c\left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }N^{1/p-1-\alpha /d} &{}\quad \mathrm{if}\, 1\leqslant p\leqslant 2 ,\\ c\left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }N^{-1/2-\alpha /d} &{}\quad \mathrm{if}\, 2\leqslant p<+\infty . \end{array} \right. \end{aligned}$$

Proof

This follows from Theorem 7.1 via Chebyshev’s inequality. \(\square \)
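
The Chebyshev step can be sketched as follows. Let C denote the right-hand side of (10) or (11), a shorthand introduced here, and read \(\mathrm{d}\mathbf {x}\) as a probability measure on \(\mathbf {X}\), in line with the normalization \(\mathrm{d}x_{j}/\omega _{j}\) used in the proofs above. For every \(t>0\),

$$\begin{aligned} \left| \left\{ \mathbf {x}\in \mathbf {X}:\left| \mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) \right| >t\right\} \right| \leqslant t^{-p}\int _{\mathbf {X}}\left| \mathcal {E}_{\mathbf {x},{\omega }}\left( f\right) \right| ^{p}\mathrm{d}\mathbf {x}\leqslant \left( \frac{C}{t}\right) ^{p},\quad \text{ and } \text{ the } \text{ choice }\quad t=\left( \frac{2}{\varepsilon }\right) ^{1/p}C\quad \text{ gives }\quad \left( \frac{C}{t}\right) ^{p}=\frac{\varepsilon }{2}<\varepsilon . \end{aligned}$$

With \(\omega _{j}\approx N^{-1}\), \(\delta _{j}\approx N^{-1/d}\) and \(\varphi \left( t\right) =t^{\alpha }\), the quantity C is at most \(c\left\| f\right\| N^{1/p-1-\alpha /d}\) if \(1\leqslant p\leqslant 2\) and at most \(c\left\| f\right\| N^{-1/2-\alpha /d}\) if \(2\leqslant p<+\infty \), so the constant in the corollary depends on \(\varepsilon \) through the factor \(\left( 2/\varepsilon \right) ^{1/p}\).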

The following example shows that Theorem 7.1 is essentially sharp.

Example 7.4

As in Theorem 5.1, let \(\mathcal {M}\) be a metric measure space of finite measure with the property that there exist positive constants H, K, d, such that for every \(y\in \mathcal {M}\) and \(0<r<{\text {diam}} \left( \mathcal {M}\right) \),

$$\begin{aligned} Hr^{d}\leqslant \left| \left\{ x\in \mathcal {M}:\left| x-y\right| <r\right\} \right| \leqslant Kr^{d}. \end{aligned}$$

For every N, the space \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M}=\mathcal {X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| \approx N^{-1}\) and \(\delta _{j}={\text {diam}}\left( \mathcal {X} _{j}\right) \approx N^{-1/d}\). Moreover every \(\mathcal {X}_{j}\) contains a ball \(B\left( w_{j},r_{j}\right) \) of center \(w_{j}\) and radius \(r_{j}\approx \delta _{j}\). It is possible to prove that each ball \(B\left( w_{j},r_{j}\right) \) contains two disjoint balls \(B\left( y_{j},\varepsilon r_{j}\right) \) and \(B\left( z_{j},\varepsilon r_{j}\right) \), with \(\varepsilon =6^{-1}\left( 2K/H\right) ^{-1/d}\). Fix \(1\leqslant j\leqslant N\) and define

$$\begin{aligned} f_{j}\left( x\right) =\left( 1-\frac{2}{\varepsilon r_{j}}\left| x-y_{j}\right| \right) _{+}-\vartheta _{j}\left( 1-\frac{2}{\varepsilon r_{j}}\left| x-z_{j}\right| \right) _{+}, \end{aligned}$$

with \(\vartheta _{j}\) such that \(f_{j}\) has mean 0. Observe that \(0<\vartheta _{j}<C\), with C independent of N. Also note that there exist constants A and B independent of N such that

$$\begin{aligned} \int \left| f_{j}\left( x\right) \right| ^{p}\frac{\mathrm{d}x}{\omega _{j} }\geqslant A, \end{aligned}$$

and

$$\begin{aligned} \left| f_{j}\left( x\right) -f_{j}\left( y\right) \right| \leqslant B\delta _{j}^{-1}\left| x-y\right| ~~~~\mathrm{for\;all }\, x,y\in \mathcal {M}\mathrm{.} \end{aligned}$$

If \(\varphi \left( t\right) =t^{\alpha }\) with \(d/p<\alpha \leqslant 1\), and if \(w_{j}\) is the above defined point in \(\mathcal {X}_{j}\), then the function

$$\begin{aligned} g_{j}\left( x\right) =c\min \left( \delta _{j}^{-\alpha },\left| x-w_{j}\right| ^{-\alpha }\right) \end{aligned}$$

is a \(\varphi \)-gradient of \(f_{j}\left( x\right) \). Indeed, if \(x\in \mathcal {X}_{j}\) and \(\left| x-y\right| \leqslant 2\delta _{j}\), then

$$\begin{aligned} \left| f_{j}\left( x\right) -f_{j}\left( y\right) \right| \leqslant B\delta _{j}^{-1}\left| x-y\right| \leqslant c\delta _{j}^{-\alpha }\left| x-y\right| ^{\alpha }, \end{aligned}$$

while for \(x\in \mathcal {X}_{j}\) and \(\left| x-y\right| >2\delta _{j} \),

$$\begin{aligned} \left| f_{j}\left( x\right) -f_{j}\left( y\right) \right|&=\left| f_{j}\left( x\right) \right| \leqslant \max \left\{ 1,\vartheta _{j}\right\} =\max \left\{ 1,\vartheta _{j}\right\} \left| x-y\right| ^{-\alpha }\left| x-y\right| ^{\alpha }\\&\leqslant c\left| w_{j}-y\right| ^{-\alpha }\left| x-y\right| ^{\alpha }. \end{aligned}$$

In particular,

$$\begin{aligned} \left\| f_{j}\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }\leqslant c\left\{ \int _{\mathcal {M}}\min \left( \delta _{j}^{-\alpha p},\left| x-w_{j}\right| ^{-\alpha p}\right) \mathrm{d}x\right\} ^{1/p}\leqslant cN^{\alpha /d-1/p}. \end{aligned}$$
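
The last inequality can be checked by a dyadic decomposition around \(w_{j}\), using only the upper bound \(\left| \left\{ x:\left| x-w_{j}\right| <r\right\} \right| \leqslant Kr^{d}\), the condition \(\alpha p>d\) and, as a simplification of this sketch, \(p<+\infty \):

$$\begin{aligned} \int _{\mathcal {M}}\min \left( \delta _{j}^{-\alpha p},\left| x-w_{j}\right| ^{-\alpha p}\right) \mathrm{d}x&\leqslant \delta _{j}^{-\alpha p}K\delta _{j}^{d}+\sum _{k=0}^{+\infty }\left( 2^{k}\delta _{j}\right) ^{-\alpha p}K\left( 2^{k+1}\delta _{j}\right) ^{d}\\&=K\delta _{j}^{d-\alpha p}\left( 1+2^{d}\sum _{k=0}^{+\infty }2^{-k\left( \alpha p-d\right) }\right) \leqslant c\delta _{j}^{d-\alpha p}. \end{aligned}$$

Here the first term accounts for \(\left| x-w_{j}\right| <\delta _{j}\) and the k-th term of the sum for \(2^{k}\delta _{j}\leqslant \left| x-w_{j}\right| <2^{k+1}\delta _{j}\). Since \(\delta _{j}\approx N^{-1/d}\), taking the p-th root gives \(c\delta _{j}^{d/p-\alpha }\leqslant cN^{\alpha /d-1/p}\).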

Moreover, since \(f_{j}\left( x\right) \) has mean zero and is supported in \(\mathcal {X}_{j}\),

$$\begin{aligned} \left\{ \int _{\mathbf {X}}\left| {\sum _{k=1}^{N}}\omega _{k}f_{j}\left( x_{k}\right) -{\int _{\mathcal {M}}}f_{j}\left( x\right) \mathrm{d}x\right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p}\!\!\!\!=\omega _{j}\left\{ \int _{\mathcal {X}_{j} }\left| f_{j}\left( x_{j}\right) \right| ^{p}\frac{\mathrm{d}x_{j}}{\omega _{j}}\right\} ^{1/p}\!\!\!\!\geqslant cN^{-1}. \end{aligned}$$

Finally,

$$\begin{aligned} N^{-1}=N^{1/p-1}N^{-\alpha /d}N^{\alpha /d-1/p}\geqslant c\sup \left\{ \omega _{j}^{1-1/p}\right\} \varphi \left( 2\sup \left\{ \delta _{j}\right\} \right) \left\| f_{j}\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }. \end{aligned}$$

This shows that Theorem 7.1 with \(1\leqslant p\leqslant 2\) is sharp.

In order to show that the theorem is essentially sharp also for \(2<p\leqslant +\infty \), let \(f_{j}\left( x\right) \) be as before and define

$$\begin{aligned} f\left( x\right) ={\sum _{j=1}^{N}}f_{j}\left( x\right) . \end{aligned}$$

If \(\varphi \left( t\right) =t^{\alpha }\), then a \(\varphi \)-gradient of \(f\left( x\right) \) is given by

$$\begin{aligned} g\left( x\right) =cN^{\alpha /d}. \end{aligned}$$

Indeed, if \(x,y\in \mathcal {X}_{j}\), then

$$\begin{aligned} \left| f\left( x\right) -f\left( y\right) \right| =\left| f_{j}\left( x\right) -f_{j}\left( y\right) \right| \leqslant B\delta _{j}^{-1}\left| x-y\right| \leqslant cN^{\alpha /d}\left| x-y\right| ^{\alpha }. \end{aligned}$$

If \(x\in \mathcal {X}_{i}\) and \(y\in \mathcal {X}_{j}\), with \(i\ne j\) and \(\left| x-y\right| \leqslant N^{-1/d}\), then

$$\begin{aligned}&\left| f\left( x\right) -f\left( y\right) \right| =\left| f_{i}\left( x\right) -f_{j}\left( y\right) \right| \leqslant \left| f_{i}\left( x\right) -f_{i}\left( y\right) \right| +\left| f_{j}\left( y\right) -f_{j}\left( x\right) \right| \\&\quad \leqslant cN^{\alpha /d}\left| x-y\right| ^{\alpha }. \end{aligned}$$

If \(\left| x-y\right| \geqslant N^{-1/d}\), then

$$\begin{aligned} \left| f\left( x\right) -f\left( y\right) \right| \leqslant c\leqslant cN^{\alpha /d}\left| x-y\right| ^{\alpha }. \end{aligned}$$

This gives

$$\begin{aligned} \left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }\leqslant cN^{\alpha /d}. \end{aligned}$$

Moreover, the Marcinkiewicz–Zygmund inequality gives for \(2<p<+\infty \),

$$\begin{aligned}&\left\{ \int _{\mathbf {X}}\left| {\sum _{j=1}^{N}}\omega _{j}f\left( x_{j}\right) -{\int _{\mathcal {M}}}f\left( x\right) \mathrm{d}x\right| ^{p}\mathrm{d}\mathbf {x}\right\} ^{1/p}\geqslant A\left( p\right) \left\{ \int _{\mathbf {X}}\left( {\sum _{j=1}^{N}}\left| \omega _{j}f_{j}\left( x_{j}\right) \right| ^{2}\right) ^{p/2}\mathrm{d}\mathbf {x}\right\} ^{1/p}\\&\quad \geqslant A\left( p\right) \left\{ \int _{\mathbf {X}}\left( {\sum _{j=1}^{N}}\left| \omega _{j}f_{j}\left( x_{j}\right) \right| ^{2}\right) \mathrm{d}\mathbf {x}\right\} ^{1/2}\geqslant A\left( p\right) \left\{ {\sum _{j=1}^{N}}\omega _{j}\int _{\mathcal {X}_{j}}\left| f_{j}\left( x_{j}\right) \right| ^{2}\mathrm{d}x_{j}\right\} ^{1/2}\\&\quad \geqslant A\left( p\right) \min \left\{ \omega _{j}^{1/2}\right\} \left\{ \int _{\mathcal {M}}\left| f\left( x\right) \right| ^{2}\mathrm{d}x\right\} ^{1/2}\geqslant cN^{-1/2}. \end{aligned}$$

Finally,

$$\begin{aligned} N^{-1/2}=N^{-1/2}N^{-\alpha /d}N^{\alpha /d}\geqslant c\sup \left\{ \omega _{j}^{1/2}\right\} \varphi \left( 2\sup \left\{ \delta _{j}\right\} \right) \left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }. \end{aligned}$$

In particular, this estimate shows that Theorem 7.1 with \(2<p<+\infty \) is sharp.

When \(p=+\infty \), let \(f\left( x\right) \) and \(\left\{ y_{j}\right\} \) be as before, so that \(f\left( y_{j}\right) =1\). Then

$$\begin{aligned} \left| {\sum _{j=1}^{N}}\omega _{j}f\left( y_{j}\right) -{\int _{\mathcal {M}}}f\left( x\right) \mathrm{d}x\right| =1\geqslant c\varphi \left( 2\sup \left\{ \delta _{j}\right\} \right) \left\| f\right\| _{\dot{\mathbb {B}}_{p,\infty }^{\varphi }\left( \mathcal {M}\right) }. \end{aligned}$$

In particular, this estimate shows that Theorem 7.1 with \(p=+\infty \) is sharp.

8 Discrepancy

The following result on the expected value of the pth power of the discrepancy of a random set of points with respect to a fixed given set in a measure space extends a result in [15, Lemma 5]. The result in the latter paper concerns the case of a compact convex set in the d-dimensional unit cube, and the proof is based on the combinatorial argument described in the Introduction, which in our case is replaced by the Marcinkiewicz–Zygmund inequality.

Theorem 8.1

Assume that a metric measure space \(\mathcal {M}\) is decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X} _{1}\cup \cdots \cup \mathcal {X}_{N}\), and call \(\omega _{j}=\left| \mathcal {X}_{j}\right| \) and \(\delta _{j}={\text {diam}}\left( \mathcal {X}_{j}\right) \). Let \(\mathcal {B}\) be a measurable subset of \(\mathcal {M},\) and let

$$\begin{aligned} \psi _{\mathcal {B}}\left( t\right) =\left| \left\{ x\in \mathcal {B} :{\text {dist}}\left\{ x,\mathcal {M}{\setminus }\mathcal {B}\right\} \leqslant t\right\} \right| +\left| \left\{ x\in \mathcal {M} {\setminus }\mathcal {B}:{\text {dist}}\left\{ x,\mathcal {B}\right\} \leqslant t\right\} \right| . \end{aligned}$$

If \(\mathcal {J}\) is the set of indices j such that \(\mathcal {X}_{j}\) intersects both \(\mathcal {B}\) and its complement, then the following hold:

  1. (i)

    For every choice of points \(\left\{ x_{j}\right\} \) in \(\left\{ \mathcal {X}_{j}\right\} ,\)

    $$\begin{aligned} \left| {\displaystyle \sum _{j=1}^{N}} \omega _{j}\chi _{\mathcal {B}}\left( x_{j}\right) -\left| \mathcal {B}\right| \right| \le \psi _{\mathcal {B}}\left( \sup _{j\in \mathcal {J}}\left\{ \delta _{j}\right\} \right) . \end{aligned}$$
  2. (ii)

    For every \(1\leqslant p<+\infty ,\)

    $$\begin{aligned} \left\{ {\displaystyle \int _{\mathbf {X}}} \left| {\displaystyle \sum _{j=1}^{N}} \omega _{j}\chi _{\mathcal {B}}\left( x_{j}\right) -\left| \mathcal {B}\right| \right| ^{p}{\mathrm{d}\mathbf {x}}\right\} ^{1/p} \le B\left( p\right) \sqrt{\sup _{j\in \mathcal {J}}\left\{ \omega _{j}\right\} \psi _{\mathcal {B}}\left( \sup _{j\in \mathcal {J}}\left\{ \delta _{j}\right\} \right) }. \end{aligned}$$

Observe that the right hand side in (ii) is better than the one in (i) when

$$\begin{aligned} B\left( p\right) \le \sqrt{\frac{\psi _{\mathcal {B}}\left( \sup _{j\in \mathcal {J} }\left\{ \delta _{j}\right\} \right) }{\sup _{j\in \mathcal {J}}\left\{ \omega _{j}\right\} }}. \end{aligned}$$

Also observe that

$$\begin{aligned} \sup _{j\in \mathcal {J}}\left\{ \omega _{j}\right\} \le \sum _{j\in \mathcal {J} }\omega _{j}\le \psi _{\mathcal {B}}\left( \sup _{j\in \mathcal {J}}\left\{ \delta _{j}\right\} \right) . \end{aligned}$$

For a sufficiently refined decomposition \(\{\mathcal X_j\}\) of the space, one should expect \(\sup _{j\in \mathcal {J}}\left\{ \omega _{j}\right\} \) to be much smaller than \(\psi _{\mathcal {B}}\left( \sup _{j\in \mathcal {J}}\left\{ \delta _{j}\right\} \right) \). Hence, for a fixed value of p, estimate (ii) is in general better than (i) as \(N\rightarrow +\infty \). On the other hand, recall that \(B\left( p\right) \rightarrow +\infty \) as \(p\rightarrow +\infty \), hence, for a fixed decomposition of \(\mathcal M\), estimate (i) wins for sufficiently large values of p.
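The comparison between (i) and (ii) can be made concrete with a small numerical sketch. It assumes, purely for illustration, a uniform decomposition with \(\omega _j=N^{-1}\) and \(\delta _j=N^{-1/d}\), and a boundary function of the form \(\psi _{\mathcal B}(t)=ct^{\beta }\); the function names and the placeholder value used for \(B(p)\) are hypothetical and not taken from the paper.

```python
def bound_i(N, d, beta, c=1.0):
    """Shape of estimate (i): psi(sup delta_j), assuming psi(t) = c * t**beta
    and delta_j = N**(-1/d) (illustrative assumptions)."""
    return c * N ** (-beta / d)


def bound_ii(N, d, beta, c=1.0, B_p=1.0):
    """Shape of estimate (ii): B(p) * sqrt(sup omega_j * psi(sup delta_j)),
    assuming omega_j = 1/N. B_p is a placeholder for the constant B(p)."""
    return B_p * (bound_i(N, d, beta, c) / N) ** 0.5


if __name__ == "__main__":
    for N in (10**2, 10**4, 10**6):
        print(N, bound_i(N, 2, 1), bound_ii(N, 2, 1), bound_ii(N, 2, 1, B_p=100.0))
```

For fixed `B_p` the second bound eventually becomes the smaller one as `N` grows, while for fixed `N` a large enough `B_p` makes the first bound smaller, in agreement with the discussion above.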

Proof

The proof of (i) is elementary. For every choice of \(x_{j} \in \mathcal {X}_{j}\) one has

$$\begin{aligned} {\displaystyle \sum _{j=1}^N} \omega _{j}\chi _{\mathcal {B}}\left( x_{j}\right) -\left| \mathcal {B}\right| = {\displaystyle \sum _{j=1}^N} \left( \omega _{j}\chi _{\mathcal {B}\cap \mathcal {X}_{j}}\left( x_{j}\right) -\left| \mathcal {B}\cap \mathcal {X}_{j}\right| \right) . \end{aligned}$$

If \(\mathcal {X}_{j}\subseteq \mathcal {B}\) or if \(\mathcal {B}\cap \mathcal {X} _{j}=\emptyset \) then \(\omega _{j}\chi _{\mathcal {B}\cap \mathcal {X}_{j}}\left( x_{j}\right) -\left| \mathcal {B}\cap \mathcal {X}_{j}\right| =0\). Moreover, for every j,

$$\begin{aligned} \left| \omega _{j}\chi _{\mathcal {B}\cap \mathcal {X}_{j}}\left( x_{j}\right) -\left| \mathcal {B}\cap \mathcal {X}_{j}\right| \right| \le \omega _{j}. \end{aligned}$$

Then, by the triangle inequality,

$$\begin{aligned} \left| {\displaystyle \sum _{j=1}^N} \left( \omega _{j}\chi _{\mathcal {B}\cap \mathcal {X}_{j}}\left( x_{j}\right) -\left| \mathcal {B}\cap \mathcal {X}_{j}\right| \right) \right| \le {\displaystyle \sum _{j\in \mathcal J}} \omega _{j} \le \psi _{\mathcal B}\left( \sup _{j\in \mathcal {J}}\left\{ \delta _{j}\right\} \right) . \end{aligned}$$

The proof of (ii) is similar, with the crucial difference that we replace the triangle inequality with the Marcinkiewicz–Zygmund inequality (Corollary 4.2),

$$\begin{aligned}&\left\{ {\displaystyle \int _{\mathbf {X}}} \left| {\displaystyle \sum _{j=1}^N} \left( \omega _{j}\chi _{\mathcal {B}\cap \mathcal {X}_{j}}\left( x_{j}\right) -\left| \mathcal {B}\cap \mathcal {X}_{j}\right| \right) \right| ^{p}{\mathrm{d}\mathbf {x}}\right\} ^{1/p}\\&\quad \leqslant B\left( p\right) \left\{ {\displaystyle \sum _{j\in \mathcal {J}}} \omega _{j}^{2}\right\} ^{1/2}\leqslant B\left( p\right) \sup _{j\in \mathcal {J} }\left\{ \omega _{j}^{1/2}\right\} \left\{ {\displaystyle \sum _{j\in \mathcal {J}}} \omega _{j}\right\} ^{1/2}\\&\quad \leqslant B\left( p\right) \sqrt{ \sup _{j\in \mathcal {J}}\left\{ \omega _{j} \right\} \psi _{\mathcal B}\left( \sup _{j\in \mathcal {J}}\left\{ \delta _{j}\right\} \right) }.\,\,\, \end{aligned}$$

\(\square \)
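As an illustration of part (i), the following sketch checks the intermediate inequality \(\left| \sum _{j}\omega _{j}\chi _{\mathcal B}(x_{j})-|\mathcal B|\right| \leqslant \sum _{j\in \mathcal J}\omega _{j}\) from the proof in the simplest setting: the unit square split into \(M\times M\) congruent cells, one random point per cell, and \(\mathcal B\) a disc contained in the square. The setting, the function names and the parameters are assumptions made for the example only.

```python
import math
import random


def jittered_points(M, rng):
    """One uniformly distributed point in each of the M*M grid cells of [0,1]^2."""
    h = 1.0 / M
    return [((a + rng.random()) * h, (b + rng.random()) * h)
            for a in range(M) for b in range(M)]


def boundary_cells(M, cx, cy, r):
    """Cells (a, b) that meet both the disc of centre (cx, cy), radius r,
    and its complement. This plays the role of the index set J."""
    h = 1.0 / M
    cells = []
    for a in range(M):
        for b in range(M):
            x0, x1, y0, y1 = a * h, (a + 1) * h, b * h, (b + 1) * h
            # Point of the closed cell nearest to the centre of the disc.
            nx = min(max(cx, x0), x1)
            ny = min(max(cy, y0), y1)
            dmin = math.hypot(nx - cx, ny - cy)
            # The farthest point of a rectangle from any point is a corner.
            dmax = max(math.hypot(x - cx, y - cy)
                       for x in (x0, x1) for y in (y0, y1))
            if dmin < r < dmax:
                cells.append((a, b))
    return cells


def discrepancy(points, cx, cy, r):
    """(1/N) * #{points in the disc} - area of the disc (disc inside the square)."""
    inside = sum(1 for (x, y) in points if math.hypot(x - cx, y - cy) <= r)
    return inside / len(points) - math.pi * r * r


if __name__ == "__main__":
    M = 32
    pts = jittered_points(M, random.Random(0))
    J = boundary_cells(M, 0.5, 0.5, 0.3)
    print(abs(discrepancy(pts, 0.5, 0.5, 0.3)), len(J) / M**2)
```

The deterministic bound `len(J) / M**2` holds for every sample; repeating the experiment over many samples shows typical values of the discrepancy that are considerably smaller, which is the content of part (ii).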

Theorems like the above are the main building block for the proof of the existence of point distributions with small \(L^p\) discrepancy with respect to given collections of subsets. A very general result of this type is the following.

Corollary 8.2

Assume that a metric measure space \(\mathcal {M}\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| \approx N^{-1}\) and \(\delta _j={\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d}\). Let \({\mathbb {G}}\) be a collection of measurable subsets of \(\mathcal {M}\) with the property that there exist positive constants c and \(\beta \) such that for all sets \(\mathcal {G}\in {\mathbb {G}}\)

$$\begin{aligned} \psi _{\mathcal {G}}\left( t\right) =\left| \left\{ x\in {\mathcal {G} }:{\text {dist}}\left\{ x,\mathcal {M}{\setminus }{\mathcal {G}}\right\} \leqslant t\right\} \right| +\left| \left\{ x\in \mathcal {M} {\setminus }{\mathcal {G}}:{\text {dist}}\left\{ x,{\mathcal {G}}\right\} \leqslant t\right\} \right| \leqslant ct^{\beta }. \end{aligned}$$

Then for any finite positive measure \(\mu \) on any sigma algebra on \({\mathbb {G}}\), and for every \(1\leqslant p<+\infty \) there exists a constant C and a choice of points \(\left\{ x_{j}\right\} \) in \(\left\{ \mathcal {X}_{j}\right\} \) such that

$$\begin{aligned} \left( \int _{\mathbb {G}}\left| {\sum _{j=1}^{N}}\omega _{j}\chi _{{\mathcal {G}}}\left( x_{j}\right) -\left| {\mathcal {G}}\right| \right| ^{p}\,\mathrm{d}\mu (\mathcal {G})\right) ^{1/p}\leqslant CN^{-1/2-\beta /2d}. \end{aligned}$$

Proof

By point (ii) of Theorem 8.1,

$$\begin{aligned}&{\displaystyle \int _\mathbb {G}\int _{\mathbf {X}}} \left| {\displaystyle \sum _{j=1}^N} \omega _{j}\chi _{\mathcal G}\left( x_{j}\right) -\left| {\mathcal G} \right| \right| ^{p}{\mathrm{d}\mathbf {x}}\mathrm{d}\mu (\mathcal G)\\&\quad \leqslant B\left( p\right) ^p {\displaystyle \int _\mathbb {G}}\left( {\sup _{j\in \mathcal {J}}\left\{ \omega _{j}\right\} \psi _{{\mathcal G}}\left( \sup _{j\in \mathcal {J}}\left\{ \delta _{j}\right\} \right) }\right) ^{p/2}\mathrm{d}\mu (\mathcal G)\\&\quad \leqslant B\left( p\right) ^p \mu (\mathbb {G})\left( {\sup _{j\in \mathcal {J}}\left\{ \omega _{j}\right\} c\left( \sup _{j\in \mathcal {J}}\left\{ \delta _{j}\right\} \right) ^\beta }\right) ^{p/2}. \end{aligned}$$

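Since \(\omega _{j}\approx N^{-1}\) and \(\delta _{j}\approx N^{-1/d}\), the last expression is bounded by \(C^{p}N^{-p/2-p\beta /(2d)}\). By Fubini's theorem the left-hand side is the mean value over \(\mathbf x\in \mathbf X\) of the integral over \({\mathbb {G}}\), and this implies that there exists an \(\mathbf x\in \mathbf X\) such that the conclusion of the corollary holds. \(\square \)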

We emphasize that under the hypotheses of Corollary 5.1 the required decomposition exists, and it is always possible to take all \(\omega _j\) equal to \(|\mathcal M|N^{-1}\). The corollary has several possible applications. We now examine a few particular cases, starting with the isotropic discrepancy (the discrepancy with respect to convex sets) in the unit cube \([0,1]^{d}\). For interesting phenomena concerning stratified sampling in the unit cube, see [16].

Corollary 8.3

Let \(1\leqslant p<+\infty \), and let \(\mu \) be a finite positive measure on a sigma algebra on the collection \({\mathcal {K}}^{d}_{u}\) of all convex sets of the unit cube \([0,1]^{d}.\) For any integer N there exists a distribution of points \(\{x_{j}\}_{j=1}^{N}\) in \([0,1]^{d}\) such that

$$\begin{aligned} \left( \int _{\mathcal {K}^{d}_{u}}\left| \frac{1}{N}{\sum _{j=1}^{N}}\chi _{{K} }\left( x_{j}\right) -\left| {K}\right| \right| ^{p} \,\mathrm{d}\mu (K)\right) ^{1/p} \leqslant CN^{-1/2-1/2d}. \end{aligned}$$

Proof

It suffices to show that Corollary 8.2 applies with \(\beta =1\). First of all, the unit cube can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| = N^{-1}\) and \(\delta _j={\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d}\). See, for example, Theorem 5.1. It then suffices to observe that for all convex sets K the uniform estimate \(\psi _{K}(t)\leqslant 4dt\) holds. Indeed, by the coarea formula

$$\begin{aligned} \psi _K(t)=|\{x\in [0,1]^d:{\mathrm {dist}}(x,\partial K)\leqslant t\}| =\int _{-t}^t |\partial K_u|_{d-1}\mathrm{d}u, \end{aligned}$$

where for \(u>0\) we define \(K_u=\{x\in K:{\mathrm {dist}}(x,\partial K)\geqslant u\}\) and for \(u<0\) we define \(K_u=\{x\in [0,1]^d:{\mathrm {dist}}(x, K)\leqslant |u|\}\), and \(|\cdot |_{d-1}\) is the \((d-1)\)-dimensional Hausdorff measure. It is a well known property of convex sets that all sets \(K_u\) are convex (see e.g. [42, Chap. 3]). By the Archimedean property of monotonicity with respect to inclusion of the measure of the boundary of convex sets (see [5, Property 5, page 52]), from \(K_u\subset [0,1]^d\) it follows that \(|\partial K_u|_{d-1}\leqslant |\partial [0,1]^d|_{d-1}=2d.\) \(\square \)

As an explicit example of measure, one could consider any finite measure supported on the translated, rotated and dilated copies of a fixed convex set. Hence, this result includes and extends well known results on the \(L^{p}\) discrepancy with respect to discs, or other collections of sets with “reasonable” shapes (see, for example, [15, Theorem 2D]). It is interesting to observe that if one replaces the above \(L^p\) norm with a supremum in \(K\in \mathcal {K}^{d}_{u}\), then the above result fails. Indeed, Schmidt (see [41]) proved that for any N point distribution in the unit cube there exists a convex set with discrepancy of order \(N^{-2/(d+1)}\).

It is perhaps worth mentioning some results about measures on the space of convex sets. It is well known that the set of non-empty convex compact subsets of \({\mathbb {R}}^{d},\) let us call it \(\mathcal {K}^{d},\) can be made into a metric space by introducing the Hausdorff distance

$$\begin{aligned} d_{H}(A,B)=\max \left\{ \sup _{a\in A}\,\inf _{b\in B}|a-b|,\sup _{b\in B}\,\inf _{a\in A}|a-b|\right\} . \end{aligned}$$
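The formula for \(d_H\) can be evaluated directly when A and B are finite sets of points. The sketch below does only that; it is meant to illustrate the definition, not to compute the distance between two convex bodies (the Hausdorff distance between the convex hulls of two finite sets may be smaller than the one between the sets themselves). The function name is hypothetical.

```python
import math


def hausdorff_distance(A, B):
    """Hausdorff distance between two finite, non-empty point sets in R^d,
    each given as a list of coordinate tuples."""
    def one_sided(P, Q):
        # sup over p in P of the distance from p to the set Q
        return max(min(math.dist(p, q) for q in Q) for p in P)

    return max(one_sided(A, B), one_sided(B, A))


if __name__ == "__main__":
    A = [(0.0, 0.0)]
    B = [(0.0, 0.0), (1.0, 0.0)]
    print(hausdorff_distance(A, B))
```

In this example every point of A is also a point of B, so the first one-sided term vanishes and the distance is carried by the point of B farthest from A.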

A large class of sigma finite Borel measures on \(\mathcal {K}^{d},\) which are positive on open sets of \(\mathcal {K}^{d}\) and are invariant under rigid motions, has been recently constructed by Hoffmann [28]. Just to give a rough idea, let \(\{K_{n}\}_{n=1} ^{+\infty }\) be an enumeration of all polytopes of \({\mathbb {R}}^{d}\) with vertices in rational points, and let \(\sum _{n=1}^{+\infty }\alpha _{n}<+\infty \) be a convergent series with positive terms. Then one can define the measure

$$\begin{aligned} \mu =\sum _{n=1}^{+\infty }\alpha _{n}\delta _{K_{n}}, \end{aligned}$$

where \(\delta _{K_{n}}\) is the Dirac delta centered at \(K_{n}\). This measure is supported on rational polytopes, but is positive on open sets since these rational polytopes are dense. A suitable clever modification can be made invariant under rigid motions. Nevertheless, it can be shown that there are more isometries of the space \(\mathcal {K}^{d}\) than those coming from rigid motions of \(\mathbb {R}^{d}\), and in fact it has been shown by Bandt and Baraki (see [2]) that for \(d>1\) there are no non-trivial sigma finite measures on \(\mathcal {K}^{d}\) that are invariant with respect to all isometries of the whole space \(\mathcal {K}^{d}\). Hence it seems that there is no “natural” measure on \(\mathcal {K}^{d}\).

Similar results hold in compact Riemannian manifolds. For the sake of simplicity, we state here a result on the \(L^p\) discrepancy associated to geodesic balls.

Corollary 8.4

Let \(\mathcal M\) be a compact Riemannian manifold with injectivity radius \(r_0\), and let \(0<r_1<r_0\). Denote by \(B(x,r)\) the geodesic ball centered at the point \(x\in \mathcal M\) with radius r. Then for any \(1\leqslant p<+\infty \), for any finite positive measure \(\mu \) on \(\mathcal M\times (0,r_1 )\), and for any integer N, there exists a distribution of points \(\{x_{j}\}_{j=1}^{N}\) in \(\mathcal M\) such that

$$\begin{aligned} \left( \iint _{\mathcal M\times (0,r_1)}\left| \frac{|\mathcal M|}{N}{\sum _{j=1}^{N}}\chi _{{B(x,r)} }\left( x_{j}\right) -\left| {B(x,r)}\right| \right| ^{p} \,\mathrm{d}\mu (x,r)\right) ^{1/p} \leqslant CN^{-1/2-1/2d}. \end{aligned}$$

Proof

As before, it suffices to show that Corollary 8.2 applies with \(\beta =1\). Indeed, \(\mathcal M\) can be decomposed into a finite disjoint union of sets in the form \(\mathcal {M=X}_{1}\cup \cdots \cup \mathcal {X}_{N}\), with \(\omega _{j}=\left| \mathcal {X}_{j}\right| =|\mathcal M| N^{-1}\) and \(\delta _j={\text {diam}}\left( \mathcal {X}_{j}\right) \approx N^{-1/d}\). See, for example, Theorem 5.1. Moreover, there exists a positive constant c such that for every geodesic ball \(B(x,r)\) with \(r<r_1\) and every \(t>0\) one has \( \psi _{B(x,r)}\left( t\right) \leqslant ct \). It clearly suffices to prove this for all \(0<t\leqslant (r_0-r_1)/2.\) Indeed, by the triangle inequality,

$$\begin{aligned}&\left\{ y\in {B(x,r) }:{\text {dist}}\left\{ y,\mathcal {M}{\setminus }{B(x,r)}\right\} \leqslant t\right\} \subset B(x,r){\setminus }B(x,r-t), \\&\quad \left\{ y\in \mathcal {M} {\setminus }{B(x,r)}:{\text {dist}}\left\{ y,{B(x,r)}\right\} \leqslant t\right\} \subset \overline{B}(x,r+t){\setminus }B(x,r). \end{aligned}$$

If \(s\leqslant 0\) then we set \(B(x,s)=\emptyset \). Finally, if \(r<r_1\) and \(t\leqslant (r_0-r_1)/2\), the exponential map diffeomorphically maps the annulus \(\overline{B}(x,r+t){\setminus }B(x,r-t)\) into the tangent space at x, and by a uniform bound on the Jacobian of the exponential map, its measure is bounded above by \(c((r+t)^d-\max \{0,r-t\}^d)\leqslant ct \). \(\square \)

We have already mentioned that the above results fail in general when one replaces the \(L^p\) norm with a supremum. Nevertheless, an extra hypothesis concerning the complexity of the collection of sets \(\mathbb G\) allows one to obtain the same upper bound in the supremum case too, up to a logarithmic transgression.

Theorem 8.5

Let \(\mathcal {M}\) be a metric measure space with finite measure with the property that there exist positive constants d and \(c_{1}\), such that for every sufficiently large N there exists a partition \(\mathcal {M}=\mathcal {X} _{1}\cup \cdots \cup \mathcal {X}_{N}\) with \(\left| \mathcal {X}_{j}\right| = \left| \mathcal {M} \right| N^{-1}\) and \( {\mathrm {diam}}\left( \mathcal {X}_{j}\right) \leqslant c_{1}N^{-1/d}. \) Let \({\mathbb {G}}\) be a collection of measurable subsets of \(\mathcal {M}\) with the following two properties:

  1. (i)

    There exist positive constants \(c_2\) and \(\beta \) such that for all sets \(\mathcal {G}\in {\mathbb {G}}\)

    $$\begin{aligned} \psi _{\mathcal {G}}\left( t\right) =\left| \left\{ x\in {\mathcal {G} }:{\text {dist}}\left\{ x,\mathcal {M}{\setminus }{\mathcal {G}}\right\} \leqslant t\right\} \right| +\left| \left\{ x\in \mathcal {M} {\setminus }{\mathcal {G}}:{\text {dist}}\left\{ x,{\mathcal {G}}\right\} \leqslant t\right\} \right| \leqslant c_2t^{\beta }. \end{aligned}$$
  2. (ii)

    There exist positive constants \(c_3\) and \(\gamma \) such that for all integers N and for all distributions \(\mathcal P\) of N points in \(\mathcal M\) there are at most \(c_3N^\gamma \) equivalence classes in \(\mathbb G\), where \(\mathcal G, \mathcal G'\) in \(\mathbb G\) are defined to be equivalent if \(\mathcal P\cap \mathcal G=\mathcal P\cap \mathcal G'\).

Then for any integer N there exists a distribution of points \(\{z_{j}\}_{j=1}^{N}\) such that

$$\begin{aligned} \sup _{\mathcal G\in \mathbb G}\left| \frac{|\mathcal M|}{N}{\sum _{j=1}^{N}}\chi _{{\mathcal G} }\left( z_{j}\right) -\left| {\mathcal G}\right| \right| \leqslant CN^{-1/2-\beta /2d}\sqrt{\log N}. \end{aligned}$$

A few words on the above condition (ii) are perhaps necessary. We say that \(\mathbb G\) shatters a finite subset \(\mathcal P\) of N points of \(\mathcal M\) if there are exactly \(2^N\) distinct intersections of sets of \(\mathbb G\) with \(\mathcal P\). The Vapnik–Chervonenkis dimension, or VC dimension, of \(\mathbb G\) is the supremum of the sizes of all finite subsets of \(\mathcal M\) that are shattered by \(\mathbb G\). By the Sauer lemma (see [40]), condition (ii) coincides with asking that the collection \(\mathbb G\) has finite VC dimension. For example, the collection of convex sets in \(\mathbb R^d\) has infinite VC dimension. Indeed, a set of N points on a circle can be easily shattered with convex sets. On the other hand, the collection of balls in \(\mathbb R^d\) has VC dimension \(d+1\) (see [36, Chap. 5] for an account of this subject).
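The claim that points on a circle are shattered by convex sets can be checked by brute force for small N. The sketch below places N equally spaced points on the unit circle and, for each subset S, takes the convex hull of S as the candidate convex set; it then counts the distinct intersections with the point set. The helper names and the choice of equally spaced points are assumptions made for this illustration.

```python
import math
from itertools import combinations


def cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive if b lies to the left)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_hull_of_circle_points(q, verts, eps=1e-9):
    """Decide whether q, a point of the circle, lies in the convex hull of verts,
    which are points of the same circle listed counterclockwise.
    With fewer than three vertices the hull is empty, a point or a chord,
    and a chord contains no point of the circle other than its endpoints."""
    k = len(verts)
    if k < 3:
        return any(math.dist(q, v) <= eps for v in verts)
    return all(cross(verts[i], verts[(i + 1) % k], q) >= -eps for i in range(k))


def traces_by_convex_hulls(n):
    """All distinct sets P ∩ conv(S), S ⊆ P, for n equally spaced points P on a circle.
    Each trace is returned as a frozenset of point indices."""
    pts = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
           for i in range(n)]
    traces = set()
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            hull = [pts[i] for i in subset]  # indices increase, so counterclockwise
            traces.add(frozenset(i for i in range(n)
                                 if in_hull_of_circle_points(pts[i], hull)))
    return traces


if __name__ == "__main__":
    n = 6
    print(len(traces_by_convex_hulls(n)), 2 ** n)
```

Every subset of the points is cut out by the convex hull of that same subset, so the number of traces is \(2^N\) and no polynomial bound of the type required in condition (ii) can hold for the collection of all convex sets.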

The proof that we present here follows closely the lines of the classic result for discs in the unit square, as one can find in Matoušek’s book [36].

Proof

Let \(M=N^{q}\), where q is a positive integer that will be fixed later. Consider two partitions of \(\mathcal M\) as in the hypothesis. The first is composed of the N sets \(\mathcal X_1,\ldots ,\mathcal X_N\), and the second is composed of the M sets \(\mathcal Y_1,\ldots ,\mathcal Y_M\). For any \(j=1,\ldots ,N\), define \(I_j=\{i=1,\ldots ,M:\mathcal Y_i\cap \mathcal X_j\ne \emptyset \}\) and, for all \(i\in I_j\), define

$$\begin{aligned} \mathcal {Y}_{j,i}=\mathcal X_j\cap \mathcal Y_i. \end{aligned}$$

Fix a point \(y_{j,i}\) in each of the sets \(\mathcal {Y}_{j,i}\). Clearly \(\{\mathcal {Y}_{j,i}\}_{i\in I_j}\) forms a partition of \(\mathcal X_j\) and \(\mathcal {Y}_{j,i}\subset \{x\in \mathcal M:|x-y_{j,i}|<c_1M^{-1/d}\}\).

For each \(j=1,\ldots , N\), let us pick one point \(q_j\) among all the points \(y_{j,i}\) with \(i\in I_j\). This point \(q_j\) is chosen randomly with probability \({\mathbb P}[q_j=y_{j,i}]=N|\mathcal Y_{j,i}|/|\mathcal M|\), the choices being independent for distinct values of j. The discrepancy of the point distribution \(\{q_j\}_{j=1}^N\) with respect to a given \(\mathcal G\in \mathbb G\) can be estimated as follows

$$\begin{aligned}&\left| \frac{|\mathcal M|}{N}{\sum _{j=1}^{N}}\chi _{{\mathcal G} }\left( q_{j}\right) - \left| {\mathcal G}\right| \right| \\&\quad \leqslant \left| \frac{|\mathcal M|}{N}\sum _{j=1}^{N}\left( \chi _{\mathcal G} \left( q_{j}\right) - \frac{N}{|\mathcal M|}\sum _{i\in I_j}|\mathcal Y_{j,i}|\chi _{{\mathcal G} }\left( y_{j,i}\right) \right) \right| +\left| {\sum _{j=1}^{N}}\sum _{i\in I_j}|\mathcal Y_{j,i}|\chi _{{\mathcal G} }\left( y_{j,i}\right) -\left| {\mathcal G}\right| \right| \end{aligned}$$

The second term in the above sum is deterministic and can be treated easily. By Theorem 8.1 (i) it is bounded above by

$$\begin{aligned} \psi _{\mathcal G}\left( c_1M^{-1/d}\right) \leqslant c_2\left( c_1M^{-1/d}\right) ^\beta = c_1^\beta c_2 N^{-\beta q/d}. \end{aligned}$$

It is therefore sufficient to take \(q\geqslant d/\beta \) to obtain an estimate better than what is needed.

The other term is of a probabilistic nature. We only need to consider the values of j for which \(\mathcal X_j\) intersects both \(\mathcal G\) and its complement \(\mathcal M{\setminus }\mathcal G\). Call this set \(\mathcal J\) and its cardinality m. Since

$$\begin{aligned} m\frac{|\mathcal M|}{N}\leqslant \psi _{\mathcal G}\left( c_1N^{-1/d}\right) \leqslant c_1^\beta c_2 N^{-\beta /d}, \end{aligned}$$

we have \(m\leqslant c_1^\beta c_2 N^{1-\beta /d}/|\mathcal M|\). Let us now set

$$\begin{aligned} k_j=\frac{N}{|\mathcal M|}\sum _{i\in I_j}|\mathcal Y_{j,i}|\chi _{{\mathcal G} }\left( y_{j,i}\right) , \end{aligned}$$

and call \(F_j\) the random variable \( \chi _{\mathcal G} \left( q_{j}\right) - k_j. \) Thus we have

$$\begin{aligned} \frac{|\mathcal M|}{N}\sum _{j=1}^{N}\left( \chi _{\mathcal G} \left( q_{j}\right) - \frac{N}{|\mathcal M|}\sum _{i\in I_j}|\mathcal Y_{j,i}|\chi _{{\mathcal G} }\left( y_{j,i}\right) \right) =\frac{|\mathcal M|}{N}\left( \sum _{j\in \mathcal J}F_j\right) . \end{aligned}$$

The variables \(F_j\) are mutually independent, and \(F_j\) takes the value \(1-k_j\) with probability \(k_j\), and \(-k_j\) with probability \(1-k_j\). Therefore for every \(\varDelta >0\) we have

$$\begin{aligned} {\mathbb P}\left[ \left| \sum _{j\in \mathcal J}F_j\right| \geqslant \varDelta \right] \leqslant 2\exp (-2\varDelta ^2/m) \end{aligned}$$

(see, for example, [1, Corollary A.1.7]). Let us fix a constant \(C>0\). Then we have shown that

$$\begin{aligned}&{\mathbb P}\left[ \frac{|\mathcal M|}{N}\left| \sum _{j\in \mathcal J}F_j\right| \geqslant CN^{-1/2-\beta /(2d)}\sqrt{\log N}\right] \\&\quad ={\mathbb P}\left[ \left| \sum _{j\in \mathcal J}F_j\right| \geqslant CN^{1/2-\beta /(2d)}\sqrt{\log N}/|\mathcal M|\right] \\&\quad \leqslant 2\exp \left( -2C^2N^{1-\beta /d}\log N/\left( |\mathcal M|^2m\right) \right) \\&\quad \leqslant 2N^{-C^2c_1^{-\beta } c_2^{-1}|\mathcal M|^{-1}}. \end{aligned}$$

Finally, if \(\mathbb F\subset \mathbb G\) contains one representative for each equivalence class, then

$$\begin{aligned}&{\mathbb P}\left[ \frac{|\mathcal M|}{N}\left| \sum _{j\in \mathcal J}F_j\right| \geqslant CN^{-1/2-\beta /(2d)}\sqrt{\log N}\;\mathrm{for\;some\;}\mathcal G\in \mathbb G\right] \\&\quad ={\mathbb P}\left[ \frac{|\mathcal M|}{N}\left| \sum _{j\in \mathcal J}F_j\right| \geqslant CN^{-1/2-\beta /(2d)}\sqrt{\log N}\;\mathrm{for\;some\;}\mathcal G\in \mathbb F\right] \\&\quad \leqslant \sum _{\mathcal G\in \mathbb F}{\mathbb P}\left[ \frac{|\mathcal M|}{N}\left| \sum _{j\in \mathcal J}F_j\right| \geqslant CN^{-1/2-\beta /(2d)}\sqrt{\log N}\right] \\&\quad \leqslant 2c_3N^{\gamma -C^2c_1^{-\beta } c_2^{-1}|\mathcal M|^{-1}}<1 \end{aligned}$$

if C is large enough. The theorem follows. \(\square \)
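The tail estimate \(2\exp (-2\varDelta ^2/m)\) used in the proof can be compared with an exact computation in the special case where all the \(k_j\) are equal to a common value k, so that \(\sum _{j\in \mathcal J}F_j\) is a centered binomial variable. This special case, the function names and the numerical values are assumptions made only for this sketch.

```python
import math


def exact_two_sided_tail(m, k, delta):
    """P[|S - m*k| >= delta] for S ~ Binomial(m, k), computed by direct summation.
    This is the law of |sum of F_j| when all the k_j equal k (illustrative case)."""
    total = 0.0
    for s in range(m + 1):
        if abs(s - m * k) >= delta:
            total += math.comb(m, s) * k ** s * (1 - k) ** (m - s)
    return total


def exponential_bound(m, delta):
    """The bound 2 * exp(-2 * delta**2 / m) appearing in the proof."""
    return 2.0 * math.exp(-2.0 * delta ** 2 / m)


if __name__ == "__main__":
    m, k = 50, 0.3
    for delta in (1, 3, 5, 8, 12):
        print(delta, exact_two_sided_tail(m, k, delta), exponential_bound(m, delta))
```

The exact tail stays below the exponential bound for every deviation, and the bound depends on the number m of boundary cells only, not on the individual values \(k_j\); this is what makes the union bound over the equivalence classes effective.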

The next Corollary shows one possible application of the above theorem.

Corollary 8.6

Let \(\mathcal M\) be a d-dimensional compact Riemannian manifold isometrically embedded in \(\mathbb R^D\), and call \(B^D(x,r)=\{y\in \mathbb R^D:\Vert y-x\Vert <r\}\) the Euclidean D-dimensional ball of center x and radius r. Then there exist positive constants \(r_0\) and C such that for any integer N there exists a distribution of points \(\{z_{j}\}_{j=1}^{N}\subset \mathcal M\) such that

$$\begin{aligned} \sup _{r\leqslant r_0, x\in \mathcal M}\left| \frac{|\mathcal M|}{N}{\sum _{j=1}^{N}}\chi _{{B^D(x,r)\cap \mathcal M} }\left( z_{j}\right) -\left| {B^D(x,r)\cap \mathcal M}\right| \right| \leqslant CN^{-1/2-1/2d}\sqrt{\log N}. \end{aligned}$$

Notice that, by the Nash embedding theorem, every Riemannian manifold can be isometrically embedded into some Euclidean space. In particular, if one takes as \(\mathcal M\) the d-dimensional unit sphere in \(\mathbb R^{d+1}\), then the sets \(B^{d+1}(x,r)\cap \mathcal M\) of the above corollary coincide with the usual spherical caps, and one recovers Beck’s estimate for the spherical cap discrepancy (see [4, Theorem 24D]).

Proof

It is enough to show that the collection of subsets of the form \(B^D(x,r)\cap \mathcal M\) satisfies the two hypotheses of Theorem 8.5. By compactness, there exists a positive number \(r_0\) such that for all \(0<r\leqslant r_0\) and for all \(x\in \mathcal M\), the set \(\mathcal N=\{y\in \mathcal M:\Vert x-y\Vert =r\}\) is a hypersurface of \(\mathcal M\) with uniformly bounded \((d-1)\)-dimensional volume. Furthermore, the measure of the set of points of \(\mathcal M\) with geodesic distance from \(\mathcal N\) less than or equal to t is bounded above by

$$\begin{aligned} \int _{\mathcal N}\int _{-t}^t|f(s,n)|\mathrm{d}s\,\mathrm{d}n, \end{aligned}$$

where \(\mathrm{d}n\) is the \((d-1)\)-dimensional volume form on \(\mathcal N\) and \(f(s,n)\) is the (uniformly bounded) Jacobian of the exponential map of the normal bundle on \(\mathcal N\) in \(\mathcal M\) (see [20] for the details). Thus

$$\begin{aligned} \psi _{\mathcal M\cap B^D(x,r)}(t)\leqslant ct, \end{aligned}$$

and the first hypothesis of Theorem 8.5 holds with \(\beta =1\). Finally, as we mentioned before, balls in \(\mathbb R^D\) satisfy the second hypothesis of the same theorem with \(\gamma =D+1\) ([36, Chap. 5]). \(\square \)