1 Introduction

Decomposition systems (bases or frames) consisting of band limited functions of nearly exponential space localization have had significant impact in theoretical and computational harmonic analysis, PDEs, statistics, approximation theory and their applications. Meyer’s wavelets [39] and the frames (the φ-transform) of Frazier and Jawerth [21–23] are the most striking examples of such decomposition systems playing a pivotal role in the solution of numerous theoretical and computational problems. The key to the success of wavelet type bases and frames is rooted in their ability to capture a great deal of smoothness and other norms in terms of respective coefficient sequence norms and provide sparse representation of natural function spaces (e.g. Besov spaces) on \({\mathbb{R}}^{d}\). Frames of a similar nature have been recently developed in non-standard settings such as on the sphere [42, 43] and more general homogeneous spaces [24], on the interval [35, 47] and ball [36, 48] with weights, and extensively used in statistical applications (see e.g. [31, 32]).

The primary goal of this paper is to extend and refine the construction of band limited frames with elements of nearly exponential space localization to the general setting of strictly local regular Dirichlet spaces with doubling measure and local scale-invariant Poincaré inequality, which lead to a Markovian heat kernel with small time Gaussian bounds and Hölder continuity. The key point of our approach is to be able to deal with (a) different geometries, (b) compact and noncompact spaces, and (c) spaces with nontrivial weights, and at the same time to allow for the development and frame decomposition of Besov and Triebel-Lizorkin spaces with complete range of indices. This will enable us to cover and shed new light on the existing frames and space decompositions and develop band limited localized frames in the context of Lie groups or homogeneous spaces with polynomial volume growth, complete Riemannian manifolds with Ricci curvature bounded from below and satisfying the volume doubling condition, and other new settings. To this end we shall make advances on several fronts: development of functional calculus of positive self-adjoint operators with associated heat kernel (in particular, localization of the kernels of related integral operators), development of lower bounds on kernel operators, development of a Shannon sampling theory, Littlewood-Paley analysis, and development of dual frames.

As a first application of our frames we shall develop rapidly and characterize the classical Besov spaces \(B^{s}_{pq}\) with positive smoothness and p≥1. Classical and nonclassical Besov and Triebel-Lizorkin spaces in the general framework of this paper with full range of indices and their frame and heat kernel decompositions are developed in the follow-up paper [33].

In this preamble we outline the main components and points of this undertaking, including the underlying setting, a general scenario for realization of the setting and examples, and a description of the main results.

1.1 The Setting

We now describe precisely all the ingredients we need to develop our theory.

I. We assume that (M,ρ,μ) is a metric measure space, which satisfies the conditions:

(a) (M,ρ) is a locally compact metric space with distance ρ(⋅,⋅) and μ is a positive Radon measure such that the following volume doubling condition is valid

$$ 0 < \mu\bigl(B(x,2r) \bigr) \leq2^d \mu\bigl(B(x,r) \bigr)<\infty\quad\hbox{for $x \in M$ and $r>0$}. $$
(1.1)

Here B(x,r) is the open ball centered at x of radius r and d>0 is a constant that plays the role of a dimension. Note that (M,ρ,μ) is also a homogeneous space in the sense of Coifman and Weiss [10, 11].

(b) The reverse doubling condition is assumed to be valid, that is, there exists a constant β>0 such that

$$ \mu\bigl(B(x,2r) \bigr) \ge2^\beta\mu \bigl(B(x,r)\bigr) \quad\hbox{for } x \in M\mbox{ and } 0< r \le\frac{\operatorname{diam}M}{3}. $$
(1.2)

It will be shown in Sect. 2 that this condition is a consequence of the above doubling condition if M is connected.

(c) The following non-collapsing condition will also be stipulated: There exists a constant c>0 such that

$$ \inf_{x\in M}\mu\bigl(B(x,1) \bigr)\ge c, \quad x \in M. $$
(1.3)

As will be shown in Sect. 2 in the case μ(M)<∞ the above inequality follows by (1.1). Therefore, it is an additional assumption only when μ(M)=∞.

Since we consider in this paper inhomogeneous function spaces only, it would be natural to make only purely local assumptions, and in particular to assume doubling only for balls with radius bounded by some constant, which would enlarge considerably our range of examples. This would require however more work and more space. On the other hand, our next assumptions on the heat kernel are local, in the sense that they are required for small time only. Clearly, by assuming global doubling and global heat kernel bounds, one can treat homogeneous spaces as well.
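As a hedged illustration of how the constant d in (1.1) behaves like a dimension, the following sketch evaluates the doubling ratio for a toy example that is not taken from this paper: the real line with the usual distance and the hypothetical measure dμ(t)=|t|^a dt, a>−1. The function names are illustrative only.

```python
def ball_measure(x: float, r: float, a: float) -> float:
    """mu(B(x, r)) for d(mu) = |t|^a dt on the real line, a > -1 (closed form)."""
    def antiderivative(t: float) -> float:
        # Odd antiderivative of |t|^a that vanishes at 0.
        return (1.0 if t >= 0 else -1.0) * abs(t) ** (a + 1) / (a + 1)
    return antiderivative(x + r) - antiderivative(x - r)


def worst_doubling_ratio(a: float, steps: int = 400) -> float:
    """Largest mu(B(x, 2r)) / mu(B(x, r)) over a grid of centers 0 <= x <= 5 with r = 1.

    The measure is homogeneous and even, so the ratio depends only on |x| / r
    and it is enough to fix r = 1 and take x >= 0.
    """
    centers = [5.0 * i / steps for i in range(steps + 1)]
    return max(ball_measure(x, 2.0, a) / ball_measure(x, 1.0, a) for x in centers)


ratio = worst_doubling_ratio(1.0)
print(ratio)  # largest ratio found on the grid for a = 1
```

For a=1 the largest ratio on this grid is attained at the center x=0 and equals 2^{a+1}=4, so on this grid (1.1) holds with d=2, while for the Lebesgue measure (a=0) the ratio is identically 2.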

II. Our main assumption is that the local geometry of the space (M,ρ,μ) is related to an essentially self-adjoint positive operator L on \(L^{2}(M,d\mu)\) such that the associated semigroup \(P_{t}=e^{-tL}\) consists of integral operators with (heat) kernel \(p_{t}(x,y)\) obeying the conditions:

(d) Small time Gaussian upper bound:

$$ p_t(x,y) \le\frac{ C\exp\{-\frac{c\rho^2(x,y)}{t}\}}{\sqrt{\mu (B(x,\sqrt{t}))\mu(B(y, \sqrt{t}))}} \quad\hbox{for } x,y\in M,\ 0<t\le1. $$
(1.4)

One can see that by combining the results in [9, 45] and [13], this estimate and the doubling condition (1.1) coupled with the fact that \(e^{-tL}\) is actually a holomorphic semigroup on \(L^{2}(M,d\mu)\), i.e. \(e^{-zL}\) exists for z∈ℂ, \(\operatorname{Re}z \ge0\), imply that \(e^{-zL}\) is an integral operator with kernel \(p_{z}(x,y)\) satisfying the following estimate: For any z=t+iu, 0<t≤1, u∈ℝ, x,y∈M,

$$ \big|p_z(x,y)\big| \le\frac{C\exp \{-c \operatorname{Re}\frac{\rho^2(x,y)}{z} \}}{ \sqrt{\mu(B(x,\sqrt{t}))\mu(B(y, \sqrt{t}))}}. $$
(1.5)

(e) Hölder continuity: There exists a constant α>0 such that

$$ \big| p_t(x,y) - p_t\bigl(x,y' \bigr) \big| \le C \biggl(\frac{\rho(y,y')}{\sqrt{t}} \biggr)^\alpha \frac{\exp\{-\frac{c\rho^2(x,y)}{t} \}}{\sqrt{\mu(B(x,\sqrt{t}))\mu (B(y, \sqrt{t}))}} $$
(1.6)

for x,y,y′∈M and 0<t≤1, whenever \(\rho(y,y')\le\sqrt{t}\).

(f) Markov property:

$$ \int_M p_t(x,y) d\mu(y) \equiv1 \quad\hbox{for $t >0$}, $$
(1.7)

which readily implies, by analytic continuation,

$$ \int_M p_z(x,y) d\mu(y) \equiv1 \quad\hbox{for $z=t+iu$,\ $t >0$}. $$
(1.8)

Above C,c>0 are structural constants that will affect almost all constants in what follows.
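A minimal sketch of conditions (1.4) and (1.7) in the simplest classical case may help fix ideas. It assumes M=ℝ with the Euclidean distance, μ the Lebesgue measure and L=−d²/dx², whose heat kernel is the Gauss-Weierstrass kernel; then |B(x,√t)|=2√t and (1.4) holds (with equality) for C=1/√π and c=1/4. The function names and the quadrature parameters are illustrative choices, not part of the paper.

```python
import math


def heat_kernel(t: float, x: float, y: float) -> float:
    """Kernel of exp(-tL) for L = -d^2/dx^2 on the real line."""
    return math.exp(-((x - y) ** 2) / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def gaussian_upper_bound(t: float, x: float, y: float, C: float, c: float) -> float:
    """Right-hand side of (1.4) for the Lebesgue measure, where |B(x, sqrt(t))| = 2 sqrt(t)."""
    vol = 2.0 * math.sqrt(t)
    return C * math.exp(-c * (x - y) ** 2 / t) / math.sqrt(vol * vol)


def total_mass(t: float, x: float, half_width: float = 12.0, n: int = 4000) -> float:
    """Midpoint-rule approximation of the integral in (1.7) over [x - half_width, x + half_width]."""
    h = 2.0 * half_width / n
    return h * sum(heat_kernel(t, x, x - half_width + (i + 0.5) * h) for i in range(n))


C, c = 1.0 / math.sqrt(math.pi), 0.25
print(heat_kernel(0.5, 0.3, -1.2), gaussian_upper_bound(0.5, 0.3, -1.2, C, c))
print(total_mass(0.5, 0.7))  # numerically close to 1, in line with (1.7)
```

The truncation to a window of half-width 12 is harmless for 0<t≤1 because the Gaussian tails are far below machine precision there.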

The main results in this article will be derived from the above conditions. However, it is perhaps suitable to exhibit a more tangible general scenario that guarantees the validity of these conditions.

1.2 Realization of the Setting in the Framework of Dirichlet Spaces

We would like to point out that in a general framework of Dirichlet spaces the needed Gaussian bound, Hölder continuity, and Markov property of the heat kernel follow from the local scale-invariant Poincaré inequality and the doubling condition on the measure, which in turn are equivalent to the parabolic Harnack inequality. The point is that situations where our theory is applicable are quite common and it just amounts to verifying the local scale-invariant Poincaré inequality and the doubling condition on the measure. We shall further illustrate this point on several examples and, in particular, on the “simple” example of [−1,1] with the heat kernel induced by the Jacobi operator, seemingly not covered in the literature.

We shall operate in the framework of strictly local regular Dirichlet spaces (see [2, 5, 6, 14, 20, 45, 55–57]). To be specific, we assume that M is a locally compact separable metric space equipped with a positive Radon measure μ such that every open and nonempty set has positive measure. Also, we assume that L is a positive symmetric operator on (the real) \(L^{2}(M,\mu)\) with domain D(L), dense in \(L^{2}(M,\mu)\). We shall denote briefly \({{\mathbb{L}}}^{p}:= L^{p}(M, \mu )\) in what follows. One can associate with L a symmetric non-negative form

$$\mathcal{E}(f,g) = \langle Lf, g \rangle= \mathcal{E}(g,f), \quad\quad \mathcal{E}(f,f) = \langle Lf, f \rangle\geq0, $$

with domain \(D(\mathcal{E})= D(L)\). We consider on \(D(\mathcal{E})\) the prehilbertian structure induced by

$$\|f\|_{\mathcal{E}}^2 = \|f \|_2^2 + \mathcal{E}(f,f) $$

which in general is not complete (not closed), but closable ([14]) in \({\mathbb{L}}^{2}\). Denote by \(\overline{\mathcal{E}}\) and \(D(\overline{\mathcal{E}})\) the closure of \(\mathcal{E}\) and its domain. It gives rise to a self-adjoint extension \(\overline{L}\) (the Friedrichs extension) of L with domain \(D(\overline{L})\) consisting of all \(f \in D(\overline{\mathcal{E}})\) for which there exists \(u \in{\mathbb{L}}^{2}\) such that \(\overline{\mathcal{E}}(f,g)= \langle u, g \rangle\) for all \(g \in D(\overline{\mathcal{E}})\) and \(\overline{L}f =u\). Then \(\overline{L}\) is positive and self-adjoint, and

$$D(\overline{\mathcal{E}})= D\bigl((\overline{L})^{1/2}\bigr), \quad\quad \overline{\mathcal{E}} (f,g) = \bigl\langle(\overline{L})^{1/2} f, ( \overline{L})^{1/2}g \bigr\rangle. $$

Using the classical spectral theory of positive self-adjoint operators, we can associate with \(\overline{L}\) a self-adjoint strongly continuous contraction semigroup \(P_{t} =e^{-t \overline{L}}\) on \({\mathbb{L}}^{2}(M,\mu)\). Then

$$e^{-t \overline{L}}= \int_0^\infty e^{-\lambda t} dE_\lambda, $$

where \(E_{\lambda}\) is the spectral resolution associated with \(\overline{L}\). Moreover this semigroup has a holomorphic extension to the complex half-plane \(\operatorname{Re}z >0\).

Our next assumption is that \(P_{t}\) is a submarkovian semigroup: 0≤f≤1 and \(f \in{\mathbb{L}}^{2}\) imply \(0\le P_{t}f\le1\). Then \(P_{t}\) can be extended as a contraction operator on \({\mathbb{L}}^{p}\), 1≤p≤∞, preserving positivity, satisfying \(P_{t}1\le1\), and hence yielding a strongly continuous contraction semigroup on \({\mathbb{L}}^{p}, 1\le p<\infty\). A sufficient condition for this [2, 20], which can be verified on D(L), is that for every ε>0 there exists \(\varPhi_{\varepsilon}:{\mathbb{R}}\to[-\varepsilon,1+\varepsilon]\) such that \(\varPhi_{\varepsilon}\) is non-decreasing, \(\varPhi_{\varepsilon}\in\operatorname{Lip} 1\), \(\varPhi_{\varepsilon}(t)=t\) for t∈[0,1] and

$$\varPhi_\varepsilon(f) \in D(\overline{\mathcal{E}})\quad \hbox{and}\quad \overline{\mathcal{E}}\bigl( \varPhi_\varepsilon(f), \varPhi_\varepsilon(f) \bigr) \leq \mathcal{E}(f,f), \quad\forall f \in D(L) $$

(in fact, this can be done easily only if \(\varPhi_{\varepsilon}(f)\in D(L)\)).

Under the above conditions, \((D(\overline{\mathcal{E}}), \overline{\mathcal{E}})\) is called a Dirichlet space and \(D(\overline{\mathcal{E}})\cap{\mathbb{L}}^{\infty}\) is an algebra.

We assume that the form \(\overline{\mathcal{E}}\) is strongly local, i.e. \(\overline{\mathcal{E}}(f,g)=0\) for \(f,g\in D(\overline {\mathcal{E}})\) whenever f is with compact support and g is constant on a neighbourhood of the support of f. We also assume that \(\overline{\mathcal{E}}\) is regular, meaning that the space \(\mathcal{C}_{c}(M)\) of continuous functions on M with compact support has the property that the algebra \(\mathcal{C}_{c}(M)\cap D(\overline{\mathcal{E}})\) is dense in \(\mathcal{C}_{c}(M)\) with respect to the sup norm, and dense in \(D(\overline{\mathcal{E}})\) in the norm \(\sqrt {\overline{\mathcal{E} }(f,f)+\|f\|_{2}^{2}}\).

We next give a sufficient condition for strong locality and regularity ([20], Chap. 3) which can be verified for D(L): \(\overline{\mathcal{E}}\) is strongly local and regular if (i) D(L) is a subalgebra of \(\mathcal{C}_{c}(M)\) verifying the strong local condition: \(0= \mathcal{E}(f,g) = \langle Lf,g \rangle\) if f,g∈D(L), f has compact support, and g is constant on a neighbourhood of the support of f, and (ii) for any compact K and open set U such that K⊂U there exists u∈D(L), u≥0, \(\operatorname{supp}u \subset U\), and u≡1 on K (thus D(L) is a dense subalgebra of \(\mathcal{C}_{c}(M)\) and dense in \(D(\overline{\mathcal{E}})\)).

Under the above assumptions, there exists a bilinear symmetric form \(d\varGamma\) defined on \(D (\overline{\mathcal{E}}) \times D(\overline{\mathcal{E}})\) with values in the signed Radon measures on M such that

$$\mathcal{E}( \phi f,g) +\mathcal{E}(f, \phi g) -\mathcal{E}( \phi, fg)= 2 \int _M \phi d\varGamma(f,g) \quad\hbox{for } f, g, \phi\in \mathcal{C}_c(M)\cap D(\overline {\mathcal{E}}), $$

which obviously verifies \(\overline{\mathcal{E}}(f,g)=\int_{M}d\varGamma(f,g)\) and \(d\varGamma(f,f)\ge0\).

In fact, if D(L) is a subalgebra of \(\mathcal{C}_{c}(M)\), then \(d\varGamma\) is absolutely continuous with respect to μ, and

In other words, \(\overline{\mathcal{E}}\) admits a “carré du champ” ([8], Chap. 1, Sect. 4): There exists a bilinear function \(D (\overline{\mathcal{E}} ) \times D (\overline{\mathcal{E}})\ni f, g \mapsto \varGamma(f,g) \in{{\mathbb{L}}}^{1} \) such that Γ(f,f)(u)≥0,

and \(\overline{\mathcal{E}}(f,g) = 2 \int_{M} \varGamma(f,g)(u) d\mu(u)\).

One can define an intrinsic distance on M by

$$ \rho(x,y)=\sup\bigl\{u(x)-u(y): u\in D(\overline{\mathcal{E}})\cap \mathcal{C}_c(M), \ d\varGamma(u,u)= \gamma(u) (x) d\mu(x), \ \gamma(u) (x) \leq1\bigr\}. $$

We assume that ρ:M×M→[0,∞] is actually a true metric that generates the original topology on M and that (M,ρ) is a complete metric space.

As a consequence of this assumption, the space M is connected, the closure of an open ball B(x,r) is the closed ball \(\overline{B(x,r)} := \{y \in M, \rho(x,y) \leq r\}\), and the closed balls are compact (see [55–57]).

We are now in a position to describe an optimal scenario when the needed Gaussian bound (1.4), Hölder continuity (1.6), and Markov property (1.7) on the heat kernel can be effectively realized. In the framework of strictly local regular Dirichlet spaces with a complete intrinsic metric, the following two properties are equivalent [28, 57]:

(i) The heat kernel satisfies

$$ \frac{ c_1'\exp\{-\frac{c_1\rho^2(x,y)}{t}\}}{\sqrt{\mu(B(x,\sqrt{t}))\mu(B(y, \sqrt{t}))}} \le p_t(x,y) \le \frac{ c_2'\exp\{-\frac{c_2\rho^2(x,y)}{t}\}}{\sqrt{\mu (B(x,\sqrt{t}))\mu(B(y, \sqrt{t}))}} $$
(1.9)

for x,y∈M and 0<t≤1.

(ii)(a) (M,ρ,μ) is a local doubling measure space: There exists d>0 such that \(\mu(B(x,2r))\le2^{d}\mu(B(x,r))\) for x∈M and 0<r<1.

(b) Local scale-invariant Poincaré inequality holds: There exists a constant C>0 such that for any ball B=B(x,r) with 0<r≤1, x∈M, and any function \(f\in D(\overline{\mathcal{E}})\),

$$\int_B|f-f_B|^2\le Cr^2 \int_Bd\varGamma(f,f). $$

Here \(f_{B}\) is the mean of f over B. Moreover, it is also well-known that the above property is equivalent to a local parabolic Harnack inequality, and, furthermore, any of these equivalent properties implies the validity of (1.6) and (1.7) (see [27, 28, 53, 57], and the references therein).

Consequently, given a situation which fits into the framework of strictly local regular Dirichlet spaces with a complete intrinsic metric it suffices to only verify the local Poincaré inequality and the global doubling condition on the measure and then our theory applies in full.

In a future work we shall further develop this theory under the more general assumption of the small time sub-Gaussian estimate:

$$ p_t(x,y) \le\frac{ C\exp \{-c (\frac{\rho^{m}(x,y)}{t} )^{\frac{1}{m-1}} \}}{\sqrt{\mu(B(x,t^{1/m}))\mu(B(y, t^{1/m}))}} \quad \hbox{for } x,y\in M,\ 0<t\le1, $$
(1.10)

where m≥2.

1.3 Examples

There is a great deal of set-ups which fit in the general framework of this article. We next briefly describe several benchmark examples which are indicative for the versatility and depth of our methods.

1.3.1 Uniformly Elliptic Divergence Form Operators on \({\mathbb{R}}^{d}\)

Given a uniformly elliptic symmetric matrix-valued function \(\{a_{i,j}(x)\}\) depending on \(x\in{\mathbb{R}}^{d}\), one can define an operator

$$L = -\sum_{i,j=1}^d\frac{\partial}{\partial x_i} \biggl(a_{i,j}\frac {\partial}{\partial x_j} \biggr) $$

on \(L^{2}({\mathbb{R}}^{d},dx)\) via the associated quadratic form. Thanks to the uniform ellipticity condition, the intrinsic metric associated with this operator is equivalent to the Euclidean distance. The Gaussian upper and lower estimates of the heat kernel in this setting hold for all time and are due to Aronson, the Hölder regularity of the solutions is due to Nash [44], and the Harnack inequality was obtained by Moser [40, 41].

1.3.2 Domains in \({\mathbb{R}}^{d}\)

One can define uniformly elliptic divergence form operators on domains in \({\mathbb{R}}^{d}\) by choosing boundary conditions. In this case the upper bounds of the heat kernels are well understood (see for instance [45]). The problem of establishing Gaussian lower bounds is much more complicated. One has to choose Neumann conditions and impose regularity assumptions on the domain. For the state of the art, we refer the reader to [27].

1.3.3 Riemannian Manifolds and Lie Groups

The conditions from Sect. 1.2 are verified for the Laplace-Beltrami operator of a Riemannian manifold with non-negative Ricci curvature [38], also for manifolds with Ricci curvature bounded from below if one assumes in addition that they satisfy the volume doubling property, also for manifolds that are quasi-isometric to such a manifold [25, 51, 52], also for co-compact covering manifolds whose deck transformation group has polynomial growth [51, 52], for sublaplacians on polynomial growth Lie groups [50, 61] and their homogeneous spaces [39]. We would like to point out that the case of the sphere endowed with the natural Laplace-Beltrami operator treated in [42, 43] and the case of more general compact homogeneous spaces endowed with the Casimir operator considered in [24] fall into the above category. One can also consider variable coefficients operators on Lie groups, see [54].

We refer the reader to [27, Sect. 2.1] for further details on the above examples. For more references on the heat kernel in various settings, see [14, 26, 53, 61].

1.3.4 Heat Kernel on [−1,1] Generated by the Jacobi Operator

To show the flexibility of our general approach to frames and spaces through heat kernels we consider in Sect. 7 the “simple” example of M=[−1,1] with \(d\mu(x)=w_{\alpha,\beta}(x)dx\), where \(w_{\alpha,\beta}(x)\) is the classical Jacobi weight:

$$w_{\alpha, \beta}(x)=w(x)= (1-x)^\alpha(1+x)^\beta, \quad\alpha , \beta>-1. $$

The Jacobi operator is defined by

$$Lf(x) = - \frac{ [w(x) a(x) f'(x) ]'}{w(x)} \quad\hbox{with } a(x):=1-x^2 $$

and \(D(L)=C^{2}[-1,1]\). As is well-known [58], \(LP_{k}=\lambda_{k}P_{k}\), where \(P_{k}\) (k≥0) is the kth degree (normalized) Jacobi polynomial and \(\lambda_{k}=k(k+\alpha+\beta+1)\). Integration by parts gives

$$\mathcal{E}(f,g) :=\langle Lf, g \rangle= \int_{-1}^1 a(x) f'(x) g'(x) w_{\alpha, \beta} (x) dx. $$

In Sect. 7 it will be shown that in this case the general theory applies, resulting in a complete strictly local Dirichlet space with an intrinsic metric defined by

$$\rho(x, y)= |\arccos x - \arccos y|,\quad x,y \in[-1, 1], $$

which is apparently compatible with the usual topology on [−1,1]. It will be also shown that in this setting the measure μ verifies the doubling condition and the respective scale-invariant Poincaré inequality is valid. Therefore, the example under consideration fits in the general setting described in Sect. 1.2 and our theory applies. In particular, the associated heat kernel with a representation

$$p_t(x, y) = \sum_{k\ge0} e^{-\lambda_k t}P_k(x)P_k(y) $$

has Gaussian bounds (see Sect. 7), which to the best of our knowledge appears first in the present article. Another consequence of this is that our theory covers completely the construction of frames and the development of Besov and Triebel-Lizorkin spaces on [−1,1] with Jacobi weights from [35, 47].
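The series representation above can be explored numerically. The sketch below treats only the special case α=β=0 (an assumption made here for simplicity), where w≡1, the \(P_{k}\) are the Legendre polynomials normalized in \(L^{2}([-1,1],dx)\), and \(\lambda_{k}=k(k+1)\). It truncates the series at a hypothetical cutoff `n_max`, evaluates the intrinsic distance, and approximates the mass of \(p_{t}(x,\cdot)\) with a composite Simpson rule; all names are illustrative.

```python
import math


def normalized_legendre(n_max: int, x: float) -> list[float]:
    """Values at x of the Legendre polynomials of degree 0..n_max, normalized in L^2([-1, 1], dx)."""
    values = [1.0, x]
    for k in range(1, n_max):
        # Three-term recurrence: (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}.
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return [math.sqrt((2 * k + 1) / 2.0) * values[k] for k in range(n_max + 1)]


def heat_kernel(t: float, x: float, y: float, n_max: int = 60) -> float:
    """Truncated series for p_t(x, y) with eigenvalues k (k + 1), i.e. alpha = beta = 0."""
    px = normalized_legendre(n_max, x)
    py = normalized_legendre(n_max, y)
    return sum(math.exp(-k * (k + 1) * t) * px[k] * py[k] for k in range(n_max + 1))


def intrinsic_distance(x: float, y: float) -> float:
    """rho(x, y) = |arccos x - arccos y| on [-1, 1]."""
    return abs(math.acos(x) - math.acos(y))


def total_mass(t: float, x: float, n: int = 2000) -> float:
    """Composite Simpson approximation of the integral of p_t(x, .) over [-1, 1] (weight w = 1)."""
    h = 2.0 / n
    total = heat_kernel(t, x, -1.0) + heat_kernel(t, x, 1.0)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * heat_kernel(t, x, -1.0 + i * h)
    return total * h / 3.0


print(intrinsic_distance(1.0, -1.0))  # the diameter of [-1, 1] in the intrinsic metric
print(total_mass(0.1, 0.3))           # numerically close to 1 (Markov property)
```

Only the k=0 term contributes to the mass, since the higher normalized polynomials are orthogonal to constants; the quadrature merely confirms this numerically for the truncated kernel.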

Finally, we would like to point out that there are other examples, e.g. the development of frames and weighted Besov and Triebel-Lizorkin spaces on the unit ball B in \({\mathbb{R}}^{d}\) in [36, 48], which perfectly fit in our general setting, but which we shall not pursue in this article.

1.4 Outline of the Paper

This paper is organized as follows: In Sect. 2 we give some auxiliary results which are instrumental in proving our main results. In particular, we collect all needed facts about doubling measures and related kernels, construction of maximal δ-nets, and integral operators.

In Sect. 3 we develop some components of a non-holomorphic functional calculus related to a positive self-adjoint operator L in the general set-up of the paper. In particular, we establish the nearly exponential localization of the kernels of operators of the form \(f(\sqrt{L})\) under suitable conditions on f. These localization results are crucial for the development of the Littlewood-Paley theory in our setting. They also enable us to explore the main properties of the spectral spaces and develop the linear approximation theory from spectral spaces through the machinery of Jackson-Bernstein inequalities and interpolation. In this section we also give the main properties of finite dimensional spectral spaces.

In Sect. 4 we establish a sampling theorem in the spirit of the Shannon theory and develop a cubature rule/formula in the compact and non-compact case, which is exact for spectral functions of a given order. This cubature rule is a critical component in the development of our frames.

Our main results are placed in Sect. 5, where we construct pairs of dual frames of the form: \(\{\psi_{j\xi}: \xi\in{\mathcal{X}}_{j}, j\ge0\}\), \(\{\widetilde{\psi}_{j\xi}: \xi\in{\mathcal{X}}_{j}, j\ge0\}\), where each \({\mathcal{X}}_{j}\) is a \(\delta_{j}\)-net on M for an appropriate \(\delta_{j}\). The frame elements \(\psi_{j\xi}\), \(\widetilde{\psi}_{j\xi}\) are band limited and well-localized functions, which allow for decomposition of functions and distributions from various spaces (in particular, Besov and Triebel-Lizorkin spaces) of the form

$$f= \sum_{j\ge0} \sum_{\xi\in{\mathcal{X}}_j} \langle f, \widetilde{\psi}_{j\xi }\rangle\psi_{j\xi}. $$

The most critical point in this paper is the construction of the dual frame \(\{\widetilde{\psi}_{j\xi}\}\). We develop it in two settings: (i) in the general case, and (ii) in the case when the spectral spaces have the polynomial property under multiplication (see Sect. 5.3). In the second case the construction is simple and elegant, however, the setting is somewhat restrictive, while in the first case the construction is much more involved, but the localization of \(\widetilde{\psi}_{j\xi}\) is inverse polynomial of an arbitrarily fixed order.

In Sect. 6 we develop the classical and most commonly used Besov spaces \(B^{s}_{pq}\) with indices s>0, 1≤p≤∞, and 0<q≤∞ in the setting of this paper. These spaces are defined through Littlewood-Paley decomposition and characterized as approximation spaces of linear approximation from spectral spaces. A frame decomposition of \(B^{s}_{pq}\) is also established. In full generality, classical and non-classical Besov and Triebel-Lizorkin spaces and their frame decomposition in the general setting of the paper are developed in [33].

Section 7 is an Appendix, where we place the proofs of the Poincaré inequality for the Jacobi operator and the doubling property of the respective measure. Gaussian bounds of the associated heat kernel are also established.

Notation

Throughout this article we shall use the notation |E|:=μ(E) for E⊂M, \({{\mathbb{L}}}^{p}:=L^{p}(M, \mu)\), \(\|\cdot\|_{p}:=\|\cdot\|_{{{\mathbb{L}}}^{p}}\), and \(\|T\|_{p\to q}\) will denote the norm of a bounded operator \(T: {{\mathbb{L}}}^{p} \to{{\mathbb{L}}}^{q}\). UCB will stand for the space of all uniformly continuous and bounded functions on M and \({{\mathbb{L}}}^{\infty}\) will be in most cases identified with UCB. D(T) will stand for the domain of a given operator T. We shall denote by \(C^{\infty}_{0}({\mathbb{R}}_{+})\) the set of all compactly supported \(C^{\infty}\) functions on \({\mathbb{R}}_{+}:=[0,\infty)\). In most cases “sup” will mean “\(\operatorname{ess\,sup}\)”. Positive constants will be denoted by c, C, \(c_{1}\), c′, … and they may vary at every occurrence; a∼b will stand for \(c_{1}\le a/b\le c_{2}\).

2 Doubling Metric Measure Spaces: Basic Facts

In this section we put together some simple facts related to metric measure spaces (M,ρ,μ) obeying the doubling, reverse doubling and non-collapsing conditions (1.1)–(1.3) and integral operators acting on functions defined on such spaces.

2.1 Consequences of Doubling and Clarifications

The doubling condition (1.1) readily implies

$$ \big|B(x,\lambda r)\big| \le(2\lambda)^d \big|B(x, r)\big|, \quad x\in M, \ \lambda> 1, \ r>0, $$
(2.1)

and, therefore, due to B(x,r)⊂B(y,ρ(y,x)+r)

$$ \big|B(x, r)\big| \le2^d \biggl(1+ \frac{\rho(x,y)}{r} \biggr)^d \big|B(y, r)\big|, \quad x, y\in M, \ r>0. $$
(2.2)

In turn, the reverse doubling condition yields

$$ \big|B(x,\lambda r)\big| \ge(\lambda/2)^\beta\big|B(x, r)\big|, \quad \lambda> 1, \ r>0,\ 0< \lambda r<\frac{\operatorname {diam}M}{3}. $$
(2.3)

Also, the non-collapsing condition (1.3) coupled with (2.1) implies

$$ \inf_{x\in M}\big|B(x, r)\big| \ge\widehat{c}r^d, \quad0< r \le1, $$
(2.4)

where \(\widehat{c}=c2^{-d}\) with c>0 the constant from (1.3).

Note that |B(x,r)| can be much larger than \(cr^{d}\) as is evidenced by the case of the Jacobi operator on [−1,1], considered in Sects. 1.3.4 and 7, see (7.1).

Several clarifying statements are in order. We begin with a claim which, in particular, shows that the non-collapsing condition is automatically obeyed when μ(M)<∞.

Proposition 2.1

Let (M,ρ,μ) be a metric measure space which obeys the doubling condition (1.1). Then

(a) μ(M)<∞ if and only if \(\operatorname{diam}M <\infty\). Moreover, if \(\operatorname{diam}M =D <\infty\), then

$$ \inf_{x\in M}\big|B(x, r)\big| \ge r^d|M|(2D)^{-d}, \quad0< r \leq D. $$
(2.5)

(b) μ({x})>0 for some x∈M if and only if {x}=B(x,r) for some r>0.

Proof

We first prove (a). Note that if \(\operatorname{diam}M =D <\infty\), then M=B(x,D) for any x∈M and hence |M|=|B(x,D)|<∞.

In the other direction, let |M|<∞. Assume on the contrary that \(\operatorname{diam}M= \infty\). Then inductively one can construct a sequence of points \(\{x_{0},x_{1},\dots\}\subset M\) such that if \(d_{j}:=\rho(x_{0},x_{j})\), then \(1\le d_{1}<d_{2}<\cdots\) and \(d_{j+1}>3d_{j}\), j≥0. One checks easily that \(B(x_{j}, \frac{d_{j}}{2}) \cap B(x_{k}, \frac {d_{k}}{2})=\emptyset\) if j≠k. On the other hand, using (1.1),

$$0<\big|B(x_0, 1)\big| \le\big|B(x_j, 2d_j)\big| \le4^d \big|B(x_j, d_j/2)\big|. $$

Therefore, we have a sequence of disjoint balls \(\{B(x_{j}, \frac {d_{j}}{2})\}_{j\ge1}\) in M such that \(|B(x_{j}, \frac{d_{j}}{2})| \ge4^{-d}|B(x_{0}, 1)| >0\) and hence |M|=∞. This is a contradiction that proves the claim.

Estimate (2.5) is immediate from (2.1).

To prove (b), we first note that if {x}=B(x,r) for some r>0, then (1.1) implies μ({x})>0. For the other implication, let μ({x})>0 and assume that \(\{x\} \not= B(x,r)\) for all r>0. Then we use this to construct inductively a sequence \(\{x_{1},x_{2},\dots\}\subset M\) such that if \(d_{j}:=\rho(x,x_{j})\), then \(d_{1}>d_{2}>\cdots>0\) and \(d_{j+1}<\frac{d_{j}}{3}\), j≥1. Clearly, the latter inequality yields \(B(x_{j}, \frac{d_{j}}{2}) \cap B(x_{k}, \frac{d_{k}}{2})=\emptyset\) if j≠k. On the other hand by our assumption, (1.1), and the fact that \(x\in B(x_{j},2d_{j})\) we infer

$$0<\mu\bigl(\{x\}\bigr) \le\big|B(x_j, 2d_j)\big| \le4^d\big|B(x_j, d_j/2)\big|. $$

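Since the disjoint balls \(B(x_{j}, \frac{d_{j}}{2})\), j≥1, are all contained in \(B(x,2d_{1})\), we conclude as above that \(|B(x,2d_{1})|=\infty\), which contradicts (1.1). □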

We next show that the reverse doubling condition (1.2) is not quite restrictive.

Proposition 2.2

If M is connected, then the reverse doubling condition holds, i.e. there exists β>0 such that

$$\big|B(x,2r) \big| \ge2^\beta \big|B(x,r) \big| \quad\hbox{\textit{for} $x\in M$ \textit{and} $0<r< \frac{\operatorname{diam}M}{3}$.} $$

Proof

Suppose \(0<r < \frac{\operatorname{diam}M}{3}\). Then there exists y∈M such that ρ(x,y)=3r/2, for otherwise \(B(x,3r/2)= \overline{B(x,3r/2)}\ne M \) is simultaneously open and closed, which contradicts the connectedness of M. Evidently, B(x,r)∩B(y,r/2)=∅ and B(y,r/2)⊂B(x,2r), which yields |B(x,2r)|≥|B(y,r/2)|+|B(x,r)|. On the other hand B(x,r)⊂B(y,5r/2) which along with (2.1) implies \(|B(x,r)|\le10^{d}|B(y,r/2)|\) and hence \(|B(x,2r)|\ge(10^{-d}+1)|B(x,r)|=2^{\beta}|B(x,r)|\). □
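To get a feeling for the size of the exponent produced by this argument, one may read the last step of the proof as \(|B(x,2r)|\ge(1+10^{-d})|B(x,r)|\), which is what \(|B(x,r)|\le10^{d}|B(y,r/2)|\) gives, so that \(2^{\beta}=1+10^{-d}\). The small sketch below, with an illustrative function name, evaluates this β; it is only arithmetic on the constant from the proof, not a sharp value.

```python
import math


def reverse_doubling_exponent(d: float) -> float:
    """beta with 2**beta = 1 + 10**(-d), the constant read off from the proof of Proposition 2.2."""
    # log1p keeps accuracy when 10**(-d) is tiny compared with 1.
    return math.log1p(10.0 ** (-d)) / math.log(2.0)


for d in (1, 2, 3):
    print(d, reverse_doubling_exponent(d))
```

The exponent is positive for every d>0 but decays quickly as d grows, which indicates that the β obtained this way is far from optimal in concrete examples.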

2.2 Useful Notation and Estimates

The localization of various operator kernels in what follows will be governed by symmetric functions of the form

$$ D_{\delta, \sigma }(x,y) := \bigl(\big|B(x, \delta)\big| \big|B(y, \delta)\big| \bigr)^{-1/2} \biggl(1+ \frac{\rho(x,y)}{\delta} \biggr)^{-\sigma }, \quad x,y \in M. $$
(2.6)

Here δ,σ>0 are parameters that will be specified in every particular case.

We next give several simple properties of D δ,σ (x,y) which will be instrumental in various proofs in the sequel. Note first that (2.1)–(2.2) readily yield

(2.7)
(2.8)
(2.9)

Furthermore, for 0<p<∞ and σ>d(1/2+1/p)

(2.10)

where \(c(p)= (\frac{2^{dp/2}}{2^{-d}-2^{-(\sigma-d/2)p}} )^{1/p}\) is decreasing as a function of p, and

$$ \int_M D_{\delta,\sigma }(x, u) D_{\delta,\sigma }(u,y) d\mu(u) \le cD_{\delta,\sigma }( x, y)\quad\hbox{if }\ \sigma > 2d, $$
(2.11)

with \(c = \frac{2^{\sigma+d+1}}{2^{-d}- 2^{d-\sigma}}\).

The above two estimates follow readily by the following lemma which will be needed as well.

Lemma 2.3

(a) If σ>d, then for δ>0

$$ \int_M\bigl(1+\delta^{-1}\rho(x, y)\bigr)^{-\sigma }d\mu(y) \le c_1\big|B(x, \delta)\big|, \quad x\in M \quad \bigl(c_1 = \bigl(2^{-d}- 2^{-\sigma}\bigr)^{-1} \bigr). $$
(2.12)

(b) If σ>d, then for x,yM and δ>0

(2.13)

(c) If σ>2d, then for x,yM and δ>0

(2.14)

with \(c_{2} = \frac{2^{\sigma+d+1}}{2^{-d}- 2^{d-\sigma}}\).

Proof

Denote briefly \(E_{0}:=\{y\in M:\rho(x,y)<\delta\}=B(x,\delta)\) and

Then using (1.1) we get

which gives (2.12).

For the proof of (2.13), we note that the triangle inequality implies

$$\frac{1+\delta^{-1}\rho(x, y)}{(1+\delta^{-1}\rho(x, u))(1+\delta^{-1}\rho(y, u))} \le\frac{1}{1+\delta^{-1}\rho(x, u)} +\frac{1}{1+\delta^{-1}\rho(y, u)} $$

and hence

(2.15)

We now integrate and use (2.12) to obtain (2.13).

For the proof of (2.14), we use the above inequality and (2.2) to obtain

(2.16)

and integrating and applying again (2.12) we arrive at (2.14). □
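Estimate (2.12) can be checked by hand in the simplest case. The sketch below assumes M=ℝ with the Lebesgue measure (so d=1 and |B(x,δ)|=2δ), where the integral in (2.12) equals 2δ/(σ−1) in closed form, and compares it with \(c_{1}|B(x,\delta)|\). The function names and the quadrature size are illustrative.

```python
def integral_closed_form(delta: float, sigma: float) -> float:
    """Integral of (1 + |x - y| / delta)^(-sigma) dy over the real line, for sigma > 1."""
    return 2.0 * delta / (sigma - 1.0)


def integral_numeric(delta: float, sigma: float, n: int = 20000) -> float:
    """Midpoint rule after the substitution |x - y| = delta * u / (1 - u), 0 < u < 1.

    The integrand becomes 2 * delta * (1 - u)^(sigma - 2); intended for sigma >= 2.
    """
    h = 1.0 / n
    return 2.0 * delta * h * sum((1.0 - (i + 0.5) * h) ** (sigma - 2.0) for i in range(n))


def c1_bound(delta: float, sigma: float, d: float = 1.0) -> float:
    """Right-hand side of (2.12): c_1 |B(x, delta)| with c_1 = (2^-d - 2^-sigma)^-1."""
    return 2.0 * delta / (2.0 ** (-d) - 2.0 ** (-sigma))


for sigma in (1.5, 2.0, 3.0, 10.0):
    print(sigma, integral_closed_form(1.0, sigma), c1_bound(1.0, sigma))
```

In this example the left-hand side stays below the bound for every σ>1, with a comfortable margin when σ is large.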

2.3 Maximal δ-Nets

For the construction of decomposition systems (frames) we shall need maximal δ-nets on M.

Definition 2.4

We say that \(\mathcal{X}\subset M\) is a δ-net on M (δ>0) if ρ(ξ,η)≥δ for all \(\xi, \eta\in\mathcal{X}\), ξ≠η, and \(\mathcal{X}\subset M\) is a maximal δ-net on M if \(\mathcal{X}\) is a δ-net on M that cannot be enlarged, i.e. there does not exist x∈M such that ρ(x,ξ)≥δ for all \(\xi\in\mathcal{X}\) and \(x\not\in\mathcal{X}\).

We collect some simple properties of maximal δ-nets in the following proposition.

Proposition 2.5

Suppose (M,ρ,μ) is a metric measure space obeying the doubling condition (1.1) and let δ>0.

(a) A maximal δ-net on M always exists.

(b) If \(\mathcal{X}\) is a maximal δ-net on M, then

$$ M=\bigcup_{\xi\in\mathcal{X}} B(\xi, \delta) \quad\hbox{and}\quad B( \xi, \delta/2)\cap B(\eta, \delta/2)=\emptyset \quad\hbox{if } \xi\ne\eta, \ \xi, \eta\in\mathcal{X}. $$
(2.17)

(c) Let \(\mathcal{X}\) be a maximal δ-net on M. Then \(\mathcal{X}\) is countable or finite and there exists a disjoint partition \(\{A_{\xi}\}_{\xi\in\mathcal {X}}\) of M consisting of measurable sets such that

$$ B(\xi, \delta/2) \subset A_\xi\subset B(\xi, \delta), \quad\xi \in\mathcal{X}. $$
(2.18)

Proof

For (a) observe that a maximal δ-net is a maximal set in the collection of all δ-nets on M with respect to the natural ordering of sets (by inclusion) and hence by Zorn’s lemma a maximal δ-net on M exists.

Part (b) is immediate from the definition of maximal δ-nets.

To prove (c) we first fix y∈M and observe that for any n>δ, n∈ℕ, by (2.1)–(2.2) it follows that |B(y,n)|≤c(n,δ)|B(ξ,δ/2)| for \(\xi\in\mathcal{X}\cap B(y, n)\), where c(n,δ) is a constant depending on n and δ. On the other hand, by (2.17)

$$\sum_{\xi\in\mathcal{X}\cap B(y, n)} \big|B(\xi, \delta/2)\big| \le \big|B(y, 2n)\big|\le 2^d\big|B(y, n)\big|. $$

Therefore, \(\# (\mathcal{X}\cap B(y, n)) \le2^{d}c(n, \delta)<\infty\), which readily implies that \(\mathcal{X}\) is countable or finite.

Let us order the elements of \(\mathcal{X}\) in a sequence: \(\mathcal {X}=\{\xi_{1}, \xi_{2}, \dots\}\). We now define the sets \(A_{\xi}\) of the claimed partition of M inductively. We set

$$A_{\xi_1}:= B(\xi_1, \delta)\setminus\bigcup_{\eta\in\mathcal{X}, \eta\ne \xi_1} B( \eta, \delta/2) $$

and if \(A_{\xi_{1}}, A_{\xi_{2}}, \dots, A_{\xi_{j-1}}\) have already been defined, we set

$$A_{\xi_{j}}:= B(\xi_{j}, \delta)\big\backslash \biggl[ \bigcup_{\nu\le j-1}A_{\xi_\nu}\ \cup \bigcup_{\eta\in\mathcal{X}, \eta\ne \xi_{j}} B(\eta, \delta/2) \biggr]. $$

It is easy to see that the sets \(A_{\xi_{1}}, A_{\xi_{2}}, \dots\) have the claimed properties. □
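For the reader's convenience, here is a sketch of why the sets have the claimed properties, using only (2.17) and the construction above. Disjointness: for ν<j the set \(A_{\xi_\nu}\) is removed in the definition of \(A_{\xi_j}\). Inclusions: \(A_{\xi_j}\subset B(\xi_j,\delta)\) by definition; moreover, \(B(\xi_j,\delta/2)\) is disjoint from every other half-ball by (2.17) and from every \(A_{\xi_\nu}\), ν≠j, because each such set excludes all half-balls other than its own, so

$$ B(\xi_j, \delta/2) \subset A_{\xi_j}\subset B(\xi_j, \delta). $$

Covering: let x∈M. If x lies in some half-ball B(η,δ/2), \(\eta\in\mathcal{X}\), then \(x\in A_\eta\) by the inclusion just shown. Otherwise let j be the least index with \(x\in B(\xi_j,\delta)\), which exists by (2.17); then \(x\notin A_{\xi_\nu}\) for ν<j (as \(A_{\xi_\nu}\subset B(\xi_\nu,\delta)\)) and x lies in no half-ball, so \(x\in A_{\xi_j}\). Measurability follows since each \(A_\xi\) is obtained from balls by countably many set operations.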

Discrete versions of estimates (2.11) and (2.12) will be needed. Suppose \(\mathcal{X}\) is a maximal δ-net on M and \(\{A_{\xi}\}_{\xi\in\mathcal{X}}\) is a companion disjoint partition of M as in Proposition 2.5. Then

$$ \sum_{\xi\in\mathcal{X}} |A_\xi| \bigl(1+\delta^{-1}\rho(x, \xi ) \bigr)^{-d-1} \le2^{2d+2}\big|B(x, \delta)\big| $$
(2.19)

and

$$ \sum_{\xi\in\mathcal{X}} \bigl(1+ \delta^{-1}\rho(x, \xi) \bigr)^{-2d-1} \le2^{3d+2}. $$
(2.20)

Furthermore, for any \(\delta_\star\ge\delta\)

$$ \sum_{\xi\in\mathcal{X}} \frac{|A_\xi|}{|B(\xi, {\delta_\star })|} \bigl(1+{\delta_\star}^{-1}\rho(x, \xi) \bigr)^{-2d-1} \le2^{3d+2}, $$
(2.21)

and if σ≥2d+1

$$ \sum_{\xi\in\mathcal{X}} |A_\xi| D_{{\delta_\star},\sigma }(x, \xi) D_{{\delta_\star},\sigma }(y,\xi) \le2^{\sigma+3d+3 }D_{{\delta_\star},\sigma }( x, y). $$
(2.22)

Also, for σ≥2d+1

$$ \sum_{\xi\in\mathcal{X}} \bigl(1+ \delta^{-1}\rho(x, \xi) \bigr)^{-\sigma } \bigl(1+ \delta^{-1}\rho(y, \xi) \bigr)^{-\sigma } \le2^{\sigma+2d+3} \bigl(1+\delta^{-1}\rho(x, y) \bigr)^{-\sigma }. $$
(2.23)

We next prove (2.21). The proofs of (2.19) and (2.20) are similar. Observe first that by (2.2) \(|B(x,\delta_\star)|\le 2^{d}(1+\delta_\star^{-1}\rho(x,\xi))^{d}|B(\xi,\delta_\star)|\). On the other hand, for \(u\in A_\xi\subset B(\xi,\delta)\)

$$1+{\delta_\star}^{-1}\rho(x, u) \le1+{\delta_\star}^{-1} \rho(x, \xi)+{\delta_\star}^{-1}\rho (\xi, u) \le2 \bigl(1+{ \delta_\star}^{-1}\rho(x, \xi) \bigr). $$

Therefore,

This leads to

where for the last inequality we used (2.12). Thus (2.21) is established.
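The two displays of this argument are not reproduced above; the following is a reconstruction sketch from the surrounding steps, not necessarily the authors' exact formulas. It assumes \(\delta_\star\ge\delta\), that \(\{A_\xi\}\) is the partition from Proposition 2.5, and that (2.12) has the form \(\int_M (1+\delta_\star^{-1}\rho(x,u))^{-d-1}d\mu(u)\le c_0|B(x,\delta_\star)|\) for some constant \(c_0\) (a name introduced here only for illustration). Combining the two preceding inequalities, for each \(\xi\in\mathcal{X}\)

$$ \frac{|A_\xi|}{|B(\xi, \delta_\star)|} \bigl(1+\delta_\star^{-1}\rho(x, \xi) \bigr)^{-2d-1} \le \frac{2^{d}|A_\xi|}{|B(x, \delta_\star)|} \bigl(1+\delta_\star^{-1}\rho(x, \xi) \bigr)^{-d-1} \le \frac{2^{2d+1}}{|B(x, \delta_\star)|}\int_{A_\xi} \bigl(1+\delta_\star^{-1}\rho(x, u) \bigr)^{-d-1}d\mu(u). $$

Summing over ξ and using that the sets \(A_\xi\) are disjoint with union M,

$$ \sum_{\xi\in\mathcal{X}} \frac{|A_\xi|}{|B(\xi, \delta_\star)|} \bigl(1+\delta_\star^{-1}\rho(x, \xi) \bigr)^{-2d-1} \le \frac{2^{2d+1}}{|B(x, \delta_\star)|}\int_{M} \bigl(1+\delta_\star^{-1}\rho(x, u) \bigr)^{-d-1}d\mu(u) \le 2^{2d+1}c_0. $$

With \(c_0=2^{d+1}\) this is exactly the constant \(2^{3d+2}\) in (2.21).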

For the proof of (2.22), we observe that using (2.15)

Now, summing up and applying (2.21) we arrive at (2.22).

Estimate (2.23) follows in a similar manner from (2.15) and (2.20).

2.4 Integral Operators

We shall mainly deal with integral (kernel) operators.

The kernels of many operators will be controlled by the quantities \(D_{\delta,\sigma}(x,y)\), introduced in (2.6). Our first order of business is to establish a Young-type inequality for such operators.

Proposition 2.6

Let H be an integral operator with kernel H(x,y), i.e.

$$Hf(x)=\int_M H(x,y) f(y) d\mu(y), \quad\hbox{and let}\ \big|H(x,y)\big|\le c'D_{\delta,\sigma }(x,y) $$

for some 0<δ≤1 and σ≥2d+1. If 1≤p≤q≤∞, then

$$ \|Hf\|_q \le c\delta^{d(\frac{1}{q} - \frac{1}{p})}\|f \|_p, \quad f\in{{\mathbb{L}}}^p, $$
(2.24)

where \(c=c'\widehat{c}^{d(1/r-1)}2^{2d+1}\) with \(\widehat{c}\) being the constant from (2.4) and r determined by 1/p−1/q=1−1/r.

This result is immediate from the following well-known lemma.

Lemma 2.7

Suppose \(\frac{1}{p}-\frac{1}{q} =1-\frac{1}{r}\), 1≤p,q,r≤∞, and let H(x,y) be a measurable kernel, verifying the conditions

$$ \big\|H(\cdot,y)\big\|_r \le K \quad\hbox{and} \quad \big\|H(x, \cdot)\big\|_r \le K. $$
(2.25)

If \(Hf(x)=\int_{M} H(x,y)f(y)\,d\mu(y)\), then

$$\|Hf\|_q \le K\|f\|_p \quad\hbox{for}\ f\in{{\mathbb{L}}}^p. $$

For the proof, see e.g. [19, Theorem 6.36].

Proof of Proposition 2.6

Pick 1≤r≤∞ so that 1/p−1/q=1−1/r. By (2.10) and (2.4) we obtain

$$\big\|H(\cdot, y)\big\|_r \le c' c(r) \big|B(y,\delta)\big|^{1/r-1} \le c' c(1) (\,\widehat{c}\,\delta)^{d(1/r-1)} $$

and a similar estimate holds for \(\|H(x,\cdot)\|_{r}\). These estimates and the above lemma imply (2.24). □
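As a consistency check on the exponent (a short computation, using only the relation that defines r): since 1/r=1−1/p+1/q,

$$ d\Bigl(\frac{1}{r}-1\Bigr)=d\Bigl(\frac{1}{q}-\frac{1}{p}\Bigr), $$

so the factor \(\delta^{d(1/r-1)}\) in the last display is exactly the power of δ appearing in (2.24). Because q≥p and 0<δ≤1, this exponent is nonpositive: the bound grows as δ→0 unless p=q, in which case r=1 and the estimate is uniform in δ.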

We shall frequently use the following well-known result ([16], Theorem 6, p. 503).

Proposition 2.8

An operator \(T: {{\mathbb{L}}}^{1} \to{{\mathbb{L}}}^{\infty}\) is bounded if and only if T is an integral operator with kernel \(K\in L^{\infty}(M\times M)\), i.e.

$$Tf(x)=\int_MK(x, y)f(y)d\mu(y)\quad \mbox{\textit{a.e. on} } M, $$

and if this is the case \(\|T\|_{1\to\infty} = \|K\|_{L^{\infty}}\). Moreover, the boundedness of T can be expressed in the bilinear form \(|\langle Tf, g \rangle| \le c\|f\|_{{{\mathbb{L}}}^{1}}\|g\|_{{{\mathbb{L}}}^{1}}\), \(\forall f, g\in{{\mathbb{L}}}^{1}\).

We next use this to derive a useful result for products of integral and non-integral operators.

Proposition 2.9

In the general setting of a doubling metric measure space (M,ρ,μ), let \(U, V: {{\mathbb{L}}}^{2} \to{{\mathbb{L}}}^{2}\) be integral operators and suppose that for some 0<δ≤1 and σ≥d+1 we have

$$ \big|U(x,y)\big| \leq c_1D_{\delta, \sigma }(x,y) \quad \hbox{and}\quad\big|V(x,y)\big| \le c_2D_{\delta,\sigma }(x,y). $$
(2.26)

Let \(R: {{\mathbb{L}}}^{2}\to{{\mathbb{L}}}^{2}\) be a bounded operator, not necessarily an integral operator. Then URV is an integral operator with the following upper bound on its kernel

$$ \big|U R V (x,y)\big| \le \big\|U(x,\cdot)\big\|_2 \| R \|_{2 \to2}\big\|V(\cdot,y)\big\|_2 \le\frac{c\|R \|_{2 \to2}}{\sqrt{|B(x, \delta)||B(y, \delta)|}} $$
(2.27)

with \(c:=c_{1}c_{2}2^{2d+1}\).

Proof

By Proposition 2.6 we get

and, therefore, URV is a kernel operator. Formally, we have

(2.28)

and hence the kernel of URV is given by

$$ H(x,y) = \int_M U(x,u) R\bigl[V(\cdot,y) \bigr](u)d\mu(u) = \bigl\langle U(x,\cdot), \overline{ R\bigl[V(\cdot,y)\bigr]} \bigr\rangle. $$
(2.29)

This along with (2.26) and (2.10) leads to

$$\big|H(x,y)\big| \le\big\|U(x,\cdot)\big\|_2 \big\|R\bigl[V(\cdot,y)\bigr] \big\|_2 \le\frac{c_1c_2 [c(2)]^2\|R\|_{2\to2}}{|B(x,\delta )|^{1/2}|B(y,\delta)|^{1/2}}, $$

which confirms (2.27), taking into account that \([c(2)]^{2}\le2^{2d+1}\) by (2.10) if σ≥d+1.

It remains to justify the manipulations in (2.28). Observe first that in order to prove (2.29) it suffices to establish identities (2.28) for all \(f\in{{\mathbb{L}}}^{2}\) such that \(\operatorname{supp}f\subset B(a, R)\), with B(a,R) an arbitrary ball on M. To this end we shall need Bochner’s integral. In particular, we shall use the following results (e.g. [62], pp. 131–133): Suppose B is a separable Banach space and F:(M,μ,Σ)→B is measurable in the following sense: for every \(\ell\in B^{*}\), the function \(x\mapsto\ell(F(x))\) is measurable. Then Bochner’s integral \(\int_{M}^{(B)} F(x) d\mu(x)\) is well defined and takes its value in B if and only if

$$\int_M \big\| F(x) \big\|_B d\mu(x) <\infty. $$

Furthermore, if \(\int_{M}^{(B)} F(x) d\mu(x)\) exists, then \(\ell (\int_{M}^{(B)} F(x) d\mu(x) )= \int_{M} \ell(F(x)) d\mu(x)\) for any \(\ell\in B^{*}\). Also, if T:B→B is a bounded linear operator, then

$$ T \biggl(\int^{(B)}_M F(x) d\mu(x) \biggr) = \int^{(B)}_M T\bigl(F(x) \bigr) d\mu(x). $$
(2.30)

We shall utilize Bochner’s integral in our setting with \(B={{\mathbb{L}}}^{2}\).

Suppose \(f\in{{\mathbb{L}}}^{2}\) and \(\operatorname{supp}f\subset B(a, R)\), a∈M, R>0. Then using (2.26), (2.10), and (2.2) we obtain

(2.31)

Therefore, \(\int_{M}^{(B)}V(\cdot, y)f(y) d\mu(y)\) exists and for any \(g\in {{\mathbb{L}}}^{2}\)

Here the shift of the order of integration is justified by Fubini’s theorem and the fact that

where we used (2.31). Therefore, \(Vf=\int^{(B)}_{M} V(\cdot, y) f(y)d\mu(y)\). We now use (2.30) to obtain

$$RVf = R \biggl[\int^{(B)}_M V(\cdot, y) f(y)d \mu(y) \biggr] = \int^{(B)}_M R\bigl[V(\cdot, y) \bigr]f(y)d\mu(y), $$

which implies

Consequently, H(x,y) is given by (2.29) and the proof is complete. □

3 Functional Calculus

The aim of this section is to develop the functional calculus of operators of the form \(f(\sqrt{L})\) associated with smooth and non-smooth functions f. The calculus of smooth operators is in the spirit of [17, 45] and will be needed in most of this article, including the construction of frames and the Littlewood-Paley theory, while the non-smooth calculus will be needed for estimation of the kernels of the spectral projectors and lower bound estimates.

3.1 Smooth Functional Calculus

We shall be operating in the setting described in Sect. 1.1. More precisely, we assume that (M,ρ,μ) is a metric measure space obeying conditions (1.1)–(1.3) and L is an essentially self-adjoint positive operator on \({{\mathbb{L}}}^{2}\) such that the semi-group \(e^{-tL}\), t>0, has a kernel \(p_{t}(x,y)\) verifying (1.4)–(1.8).

Theorem 3.1

Let g:ℝ→ℂ be a measurable function such that for some σ>2d

$$ \|g\|_*:=\int_{\mathbb{R}}\big|\widehat{g}(\xi)\big|\bigl(1+| \xi|\bigr)^{\sigma}d\xi <\infty, \quad\hbox{where } \widehat{g}(\xi):=\int _{\mathbb{R}}g(x)e^{-ix\xi} dx $$
(3.1)

is the Fourier transform of g. Then \(g(\delta^{2}L) e^{- \delta^{2}L}\), 0<δ≤1, is an integral operator with kernel \(g(\delta^{2}L) e^{- \delta ^{2}L}(x,y)\) satisfying

$$ \big|g\bigl(\delta^2L\bigr) e^{- \delta^2L}(x,y) \big| \le c_\sigma\| g\|_\star D_{\delta,\sigma }(x,y), \quad\forall x, y\in M, $$
(3.2)

and

$$ \big |g\bigl(\delta^2L\bigr) e^{- \delta^2L}(x,y) - g \bigl(\delta^2L\bigr) e^{- \delta ^2L}\bigl(x,y'\bigr) \big| \le c_\sigma\| g\|_\star \biggl(\frac{\rho(y,y')}{\delta} \biggr)^\alpha D_{\delta, \sigma }(x,y), $$
(3.3)

for all x,y,y′∈M, if ρ(y,y′)≤δ. Here α>0 is the constant from (1.6), \(D_{\delta,\sigma}(x,y)\) is defined in (2.6), and \(c_{\sigma}>0\) is a constant depending only on σ and the structural constants from (1.5)–(1.6). Moreover,

$$ \int_M g\bigl(\delta^2L \bigr) e^{- \delta^2L}(x,y) d\mu(y) = g(0)\quad \forall x\in M. $$
(3.4)

Proof

To prove (3.2) we first show that \(g(\delta^{2} L) e^{-\delta^{2}L}\) is a kernel operator. From (3.1) it follows that \(\|\widehat{g} \|_{1} < \infty\) which implies \(g(x) = \frac{1}{2\pi}\int_{\mathbb{R}}\widehat{g}(\xi) e^{ix\xi} d\xi\) and hence \(\|g\|_{\infty}\le\frac{1}{2\pi}\|\widehat{g} \|_{1}\). Then by the spectral theorem

$$\big\|g\bigl(\delta^2 L\bigr) e^{-\delta^2L} \big\|_{2\to2} = \big\|g \bigl(\delta^2 \cdot\bigr) e^{-\delta^2\cdot} \big\|_\infty \le(2 \pi)^{-1} \| \widehat{g}\, \|_1. $$

Therefore, invoking Proposition 2.8, in order to show that \(g(\delta^{2} L) e^{-\delta^{2}L}\) is a kernel operator it suffices to prove that

$$\big|\bigl\langle g\bigl(\delta^2 L\bigr) e^{-\delta^2L}\varphi , \psi \bigr\rangle \big| \le c\|\varphi \|_1\|\psi\|_1, \quad\forall \varphi , \psi\in{{\mathbb{L}}}^1\cap{{\mathbb{L}}}^2. $$

Let \(E_{\lambda}\), λ≥0, be the spectral resolution associated with the operator L; then \(L =\int_{0}^{\infty}\lambda dE_{\lambda}\). Writing the spectral decomposition of \(g(\delta^{2} L) e^{-\delta^{2}L}\) and using the Fourier inversion identity, we obtain for \(\varphi , \psi\in{{\mathbb{L}}}^{1}\cap{{\mathbb{L}}}^{2}\)

The above shift of the order of integration is justified by Fubini’s theorem and the fact that for any \(h\in{{\mathbb{L}}}^{2}\)

To go further, we use that \(e^{- \delta^{2}(1-i\xi)L}\) is an integral operator with kernel \(p_{z}(x,y)\), \(z=\delta^{2}(1-i\xi)\), and \(\|p_{z}\|_{\infty}\le c\) to obtain for \(\varphi , \psi\in{{\mathbb{L}}}^{1}\cap{{\mathbb{L}}}^{2}\)

(3.5)

To justify the above shift of order of integration we again use Fubini’s theorem and the fact that

This also implies \(| \langle g(\delta^{2} L) e^{-\delta^{2}L}\varphi , \overline{\psi} \rangle| \le c\|\widehat{g}\,\|_{1}\|\varphi \|_{1}\|\psi\|_{1} \) for all \(\varphi , \psi\in{{\mathbb{L}}}^{1}\cap{{\mathbb{L}}}^{2}\). Therefore, \(g(\delta^{2}L) e^{- \delta^{2}L}\) is a kernel operator and by (3.5)

$$ g\bigl(\delta^2L\bigr) e^{- \delta^2L}(x, y) = \frac{1}{2\pi}\int_{\mathbb{R}}\widehat{g}(u) p_{\delta^2(1-iu)}(x, y)du. $$
(3.6)

From this and (1.5) we infer

$$ \big|g\bigl(\delta^2L\bigr) e^{- \delta^2L}(x,y)\big| \le c' \bigl(\big|B(x, \delta)\big|\big|B(y, \delta)\big| \bigr)^{-1/2} \int _{\mathbb{R}}\big|\widehat{g}(u)\big|\exp \biggl\{-\frac{c\rho^2(x,y)}{\delta^2(1+u^2)} \biggr\}du. $$
(3.7)

Assume ρ(x,y)/δ≥1. Clearly, \(\sup_{x\geq0} x^{\beta}e^{-x}=( \frac{\beta}{e})^{\beta}\) for β>0. Using this with β=σ/2 we obtain

Therefore,

which confirms (3.2).

If ρ(x,y)/δ<1, then by (3.7)

This completes the proof of (3.2).
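The omitted displays in the two cases can be reconstructed along the following lines. This is a sketch, and it assumes that (2.6) defines \(D_{\delta,\sigma}(x,y)=(|B(x,\delta)||B(y,\delta)|)^{-1/2}(1+\delta^{-1}\rho(x,y))^{-\sigma}\), which is consistent with (3.24) but is not restated in this section. Here c and c′ are the constants in (3.7). If ρ(x,y)/δ≥1, applying \(e^{-t}\le(\frac{\sigma}{2e})^{\sigma/2}t^{-\sigma/2}\) with \(t=c\rho^2(x,y)/(\delta^2(1+u^2))\) gives

$$ \exp \biggl\{-\frac{c\rho^2(x,y)}{\delta^2(1+u^2)} \biggr\} \le \Bigl(\frac{\sigma}{2ec}\Bigr)^{\sigma/2}\bigl(1+u^2\bigr)^{\sigma/2}\Bigl(\frac{\rho(x,y)}{\delta}\Bigr)^{-\sigma} \le \Bigl(\frac{2\sigma}{ec}\Bigr)^{\sigma/2}\bigl(1+|u|\bigr)^{\sigma}\Bigl(1+\frac{\rho(x,y)}{\delta}\Bigr)^{-\sigma}, $$

where the last step uses \((1+u^2)^{1/2}\le1+|u|\) and \(\rho(x,y)/\delta\ge\frac{1}{2}(1+\rho(x,y)/\delta)\). Inserting this in (3.7) yields

$$ \big|g\bigl(\delta^2L\bigr) e^{- \delta^2L}(x,y)\big| \le c'\Bigl(\frac{2\sigma}{ec}\Bigr)^{\sigma/2} D_{\delta,\sigma }(x,y)\int_{\mathbb{R}}\big|\widehat{g}(u)\big|\bigl(1+|u|\bigr)^{\sigma}du = c'\Bigl(\frac{2\sigma}{ec}\Bigr)^{\sigma/2}\|g\|_* D_{\delta,\sigma }(x,y). $$

If ρ(x,y)/δ<1, the exponential in (3.7) is at most 1 and \((1+\rho(x,y)/\delta)^{-\sigma}\ge2^{-\sigma}\), so

$$ \big|g\bigl(\delta^2L\bigr) e^{- \delta^2L}(x,y)\big| \le c' \bigl(\big|B(x, \delta)\big|\big|B(y, \delta)\big| \bigr)^{-1/2}\|\widehat{g}\,\|_1 \le 2^{\sigma}c'\|g\|_* D_{\delta,\sigma }(x,y). $$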

We now take on (3.3). As \(g(\delta^{2}L) e^{- \delta^{2}L} = g(\delta^{2}L)e^{- \frac{1}{2}\delta ^{2}L}e^{- \frac{1}{2}\delta^{2}L}\), the kernels of these operators are related by

$$g\bigl(\delta^2L\bigr) e^{- \delta^2L}(x, y) = \int _M g\bigl(\delta^2L\bigr)e^{- \frac{1}{2}\delta^2L}(x, u)e^{- \frac{1}{2}\delta^2L}(u, y) d\mu(u), $$

which implies

We use (3.2) with δ replaced by \(\delta/\sqrt{2}\) and g(λ) by g(2λ) to estimate the first term under the integral and (1.6) for the second term, taking into account that \(\exp \{-\frac{c\rho^{2}(x, y)}{\delta^{2}} \} \le c_{\sigma}(1+\frac{\rho(x, y)}{\delta} )^{-\sigma }\). Thus we get

Here for the latter estimate we used (2.11) and that σ>2d.

It remains to prove (3.4). By (1.8), i.e. \(\int_{M} p_{\delta^{2}-iu} (x,y) dy \equiv1\), and (3.6) we get

Here the justification of the shift of order of integration is by straightforward application of Fubini’s theorem. □

Some remarks are in order. Condition (3.1) is apparently a smoothness condition on g. By Cauchy-Schwarz it follows that

and hence (3.1) holds if \(\|g\|_{H^{\sigma +1}}<\infty\). However, it will be more convenient for us to replace (3.1) by a condition in terms of derivatives of g that is easier to verify. From \(\xi^{k} \widehat{g}(\xi) = (-i)^{k}\widehat{g^{(k)}}(\xi) \) we get \(|\xi|^{k}|\widehat{g}(\xi)| \le\|g^{(k)}\|_{L^{1}}\). Also, \(|\widehat{g}(\xi)| \le\|g\|_{L^{1}}\). Pick k≥σ>2d. Then using the above we obtain

$$\bigl(1+|\xi|\bigr)^{k+2}\big|\widehat{g}(\xi)\big| \le2^{k+1} \bigl(\big|\widehat{g}(\xi)\big|+| \xi|^{k+2}\big|\widehat{g}(\xi)\big| \bigr) \le2^{k+1} \bigl(\|g \|_{L^1}+\big\|g^{(k+2)}\big\|_{L^1} \bigr) $$

that implies

Thus we arrive at the following

Remark 3.2

For the norm \(\|g\|_{*}\) from condition (3.1) we have \(\|g\|_{*} \le c\|g\|_{H^{\sigma +1}}\) and \(\|g\|_{*} \le c (\|g\|_{L^{1}}+\|g^{(k+2)}\|_{L^{1}} )\) if k≥σ>2d.
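Both bounds in the remark can be traced as follows. This is a sketch filling in the omitted displays; the Sobolev norm is taken to be \(\|g\|_{H^{s}}:=(\int_{\mathbb{R}}|\widehat{g}(\xi)|^2(1+|\xi|)^{2s}d\xi)^{1/2}\), which is an assumed normalization. By Cauchy-Schwarz,

$$ \|g\|_* = \int_{\mathbb{R}}\big|\widehat{g}(\xi)\big|\bigl(1+|\xi|\bigr)^{\sigma+1}\bigl(1+|\xi|\bigr)^{-1}d\xi \le \biggl(\int_{\mathbb{R}}\big|\widehat{g}(\xi)\big|^2\bigl(1+|\xi|\bigr)^{2\sigma+2}d\xi \biggr)^{1/2} \biggl(\int_{\mathbb{R}}\bigl(1+|\xi|\bigr)^{-2}d\xi \biggr)^{1/2} = \sqrt{2}\,\|g\|_{H^{\sigma+1}}. $$

For the second bound, since k≥σ,

$$ \|g\|_* \le \int_{\mathbb{R}}\bigl(1+|\xi|\bigr)^{k+2}\big|\widehat{g}(\xi)\big|\bigl(1+|\xi|\bigr)^{-2}d\xi \le 2^{k+1} \bigl(\|g\|_{L^1}+\big\|g^{(k+2)}\big\|_{L^1} \bigr)\int_{\mathbb{R}}\bigl(1+|\xi|\bigr)^{-2}d\xi = 2^{k+2} \bigl(\|g\|_{L^1}+\big\|g^{(k+2)}\big\|_{L^1} \bigr). $$

Both computations use \(\int_{\mathbb{R}}(1+|\xi|)^{-2}d\xi=2\).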

Corollary 3.3

For any m∈ℕ and σ>0 there exists a constant \(c_{\sigma,m}>0\) such that the kernel of the operator \(L^{m} e^{-\delta^{2}L}\), 0<δ≤1, satisfies

(3.8)
(3.9)

if ρ(y,y′)≤δ.

Proof

Set \(g(\lambda):=\lambda^{m}\theta(\lambda)e^{-\lambda}\) for λ≥0, where \(\theta\in C^{\infty}({\mathbb{R}})\), \(\operatorname{supp}\theta \subset[-1, \infty )\), and θ(λ)=1 for λ≥0. Since L≥0, we can write

$$L^m e^{-\delta^2 L} = 2^{m}\delta^{-2m}g \bigl( \delta_*^2L \bigr) e^{-\delta_*^2 L} \quad\hbox{with } \delta_*:=2^{-1/2}\delta $$

and the corollary follows by Theorem 3.1 and (2.8). □
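The operator identity used in this proof can be checked directly on the spectral side. This is a short verification that takes \(g(\lambda)=\lambda^{m}e^{-\lambda}\) on the spectrum λ≥0, where the cutoff θ equals 1. With \(\delta_*^2=\delta^2/2\),

$$ g\bigl(\delta_*^2\lambda\bigr)e^{-\delta_*^2\lambda} = \Bigl(\frac{\delta^2\lambda}{2}\Bigr)^{m}e^{-\delta^2\lambda/2}e^{-\delta^2\lambda/2} = 2^{-m}\delta^{2m}\lambda^{m}e^{-\delta^2\lambda}, \quad\lambda\ge0, $$

and multiplying by \(2^{m}\delta^{-2m}\) gives \(\lambda^{m}e^{-\delta^{2}\lambda}\). The cutoff only serves to make g a smooth function on all of ℝ, so that Theorem 3.1 applies; it does not change \(g(\delta_*^2L)\) because the spectrum of L lies in [0,∞).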

We next use Theorem 3.1 and Remark 3.2 to obtain some important kernel localization results. Our main interest is in operators of the form \(f(\delta\sqrt{L})\).

Theorem 3.4

Let \(f\in C^{2k+4}({\mathbb{R}}_{+})\), k>2d, \(\operatorname {supp}f\subset[0, R]\) for some R≥1, and \(f^{(2\nu+1)}(0)=0\) for ν=0,…,k+1. Then \(f(\delta\sqrt{L})\), 0<δ≤1, is an integral operator with kernel \(f(\delta\sqrt{L})(x, y)\) satisfying

(3.10)
(3.11)

where \(c_{k} = c_{k}(f)= \widetilde{c}_{k}R^{2k+d+4} (\|f\|_{L^{\infty}} + \| f^{(2k+4)}\|_{L^{\infty}} + \max_{\nu\le2k+4 } |f^{(\nu)}(0)| ) \) with \(\widetilde{c}_{k}>0\) a constant depending only on k,d, and the constants in (1.5)–(1.6), and \(c_{k}'=c_{k}R^{\alpha}\); as before α>0 is the constant from (1.6). Furthermore,

$$ \int_M f(\delta\sqrt{L}) (x, y) d \mu(y) = f(0)\quad \forall x\in M. $$
(3.12)

Proof

We first observe that it suffices to prove the theorem only when R=1; the general case then follows by rescaling. Indeed, assume that f satisfies the hypotheses of the theorem and set h(λ):=f(Rλ), λ∈ℝ+. Then h verifies the assumptions with R=1 and if the theorem holds for R=1 we obtain, using (2.8),

(3.13)

and similarly

For \(\frac{\delta}{R} <\rho(y, y') \le\delta\), the last estimate follows by (3.13). It remains to observe that

and hence the theorem holds in general.
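A sketch of the bookkeeping behind the rescaling, using only the chain rule: with h(λ):=f(Rλ) one has \(\operatorname{supp}h\subset[0,1]\), \(f(\delta\sqrt{L})=h(\frac{\delta}{R}\sqrt{L})\), and

$$ h^{(j)}(\lambda)=R^{j}f^{(j)}(R\lambda), \quad\hbox{so}\quad \big\|h^{(2k+4)}\big\|_{L^\infty}=R^{2k+4}\big\|f^{(2k+4)}\big\|_{L^\infty}, \quad h^{(2\nu+1)}(0)=R^{2\nu+1}f^{(2\nu+1)}(0)=0. $$

Since R≥1, this accounts for a factor of at most \(R^{2k+4}\) in \(c_k(f)\). The remaining power of R in the statement of the theorem would then come from comparing \(D_{\delta/R,k}\) with \(D_{\delta,k}\), presumably via (2.8), which is not restated in this section.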

We now prove the theorem in the case when R=1. Choose \(\theta\in C^{\infty}({\mathbb{R}})\) so that θ is even, \(\operatorname{supp}\theta\subset[-1, 1]\), θ(λ)=1 for λ∈[−1/2,1/2], and 0≤θ≤1. Denote \(P_{k}(\lambda):=\sum_{j=0}^{k+2} \frac{f^{(2j)}(0)}{(2j)!} \lambda^{2j}\) and let \(f_{1}(\lambda)\), \(g_{0}(\lambda)\), and \(g_{1}(\lambda)\) be defined for λ∈ℝ+ from

$$f(\lambda)= \theta(\lambda)P_k(\lambda) + f_1(\lambda), \quad \theta(\lambda)P_k(\lambda) = g_0\bigl( \lambda^2\bigr) e^{-\lambda^2}, \quad f_1(\lambda)= g_1\bigl(\lambda^2\bigr) e^{-\lambda^2}. $$

Thus \(g_{0}(\lambda)= P_{k}(\sqrt{|\lambda|})\theta(\sqrt{|\lambda|}) e^{\lambda}\) for λ∈ℝ+, and we use this to define \(g_{0}(\lambda)\) for λ<0. Clearly, \(g_{0}\in C^{\infty}({\mathbb{R}})\), \(\operatorname{supp}g_{0}\subset[-1, 1]\) and

$$\|g_0\|_{L^1}+\big\|g_0^{(k+2)} \big\|_{L^1} \le c(k)\sup_{\nu\le2k+4} \big|f^{(\nu)}(0)\big|. $$

Therefore, by Theorem 3.1 the kernel of the operator \(\theta(\delta\sqrt{L})P_{k}(\delta\sqrt{L})\) satisfies the desired inequalities (3.10)–(3.11) with R=1.

On the other hand, \(g_{1}(\lambda )= f_{1} (\sqrt{|\lambda |} )e^{\lambda}\) for λ∈ℝ+ and we use this to define \(g_{1}(\lambda)\) for λ<0. Observe that \(f_{1}(\delta\sqrt{L})=g_{1}(\delta^{2}L)e^{-\delta^{2}L}\) and \(\operatorname{supp}g_{1}\subset[-1, 1]\). Furthermore, \(f_{1}\in C^{2k+4}({\mathbb{R}}_{+})\), \(f_{1}^{(\nu)}(0)=0\), ν=0,…,2k+4, and

$$ \big\|f_1^{(j)}\big\|_{L^\infty} \le \big\|f^{(j)}\big\|_{L^\infty} + c\max_{\nu\le 2k+4} \big|f^{(\nu)}(0)\big|, \quad 0\le j\le2k+4. $$
(3.14)

We next show that \(g_{1}\in C^{k+2}({\mathbb{R}})\) and estimate the derivatives of \(g_{1}\). We have for 1≤m≤k+2 and λ>0

$$g_1^{(m)}(\lambda )=\sum_{\nu=0}^m \binom{m}{\nu}e^\lambda \biggl(\frac{d}{d\lambda } \biggr)^\nu \bigl[f_1(\sqrt{\lambda}) \bigr] $$

and a little calculus shows that for ν≥1 and λ>0

$$\biggl(\frac{d}{d\lambda } \biggr)^\nu \bigl[f_1(\sqrt{\lambda}) \bigr] =\sum_{j=1}^\nu c_j\lambda^{-\nu+j/2}f_1^{(j)}(\sqrt{\lambda}), \quad\hbox{where } |c_j|\le\nu!. $$

On the other hand, by Taylor’s theorem \(|f_{1}^{(j)}(\sqrt{\lambda})| \le|\lambda |^{(2m-j)/2}\|f_{1}^{(2m)}\|_{L^{\infty}}\) and hence

$$\bigg| \biggl(\frac{d}{d\lambda } \biggr)^\nu \bigl[f_1( \sqrt{|\lambda |}) \bigr] \bigg| \le c|\lambda |^{m-\nu} \big\|f_1^{(2m)}\big\|_{\infty}, \quad1\le\nu\le m. $$

Exactly in the same way we obtain the same estimate for λ<0. Denote briefly \(h(\lambda ):=f_{1}(\sqrt{|\lambda |})\). Observe that since \(f_{1}\in C^{2k+4}({\mathbb{R}}_{+})\) we have \(h^{(k+2)}(\lambda)=o(1)\) as λ→0. This and the above inequalities yield \(h^{(\nu)}(0)=0\), ν=0,…,k+2, and hence \(h\in C^{k+2}({\mathbb{R}})\), which implies \(g_{1}\in C^{k+2}({\mathbb{R}})\). From the above we also obtain

$$\big|g_1^{(m)}(\lambda )\big| \le c\sum _{\nu=0}^m e^\lambda |\lambda |^{m-\nu}\big\|f_1^{(2m)}\big\|_{L^\infty} \le c(m+1) \big\|f_1^{(2m)}\big\|_{L^\infty}, \quad\lambda \in{\mathbb{R}}. $$

This in turn (with m=k+2) implies \(\|g_{1}^{(k+2)}\|_{L^{1}} \le c(k+3)\|f_{1}^{(2k+4)}\|_{L^{\infty}}\) and, evidently, \(\|g_{1}\|_{L^{1}} \le e\|f_{1}\|_{L^{\infty}}\). We now apply Theorem 3.1 to conclude that \(f_{1}(\delta\sqrt{L})\) is an integral operator with kernel \(f_{1}(\delta\sqrt{L})(x, y)\) satisfying (3.10)–(3.11), where, in view of Remark 3.2 and (3.14), the constants \(c_{k}\), \(c_{k}'\) are of the claimed form.

Putting the above together we conclude that \(f(\delta\sqrt{L})\) is an integral operator with kernel \(f(\delta\sqrt{L})(x, y)\) satisfying (3.10)–(3.11) with R=1.

Identity (3.12) follows by (3.4). □

Corollary 3.5

Let f:ℝ+→ℂ be as in the hypothesis of Theorem 3.4. Then for any m∈ℕ and 0<δ≤1 the operator \(L^{m} f(\delta\sqrt{L})\) is an integral operator with kernel \(L^{m} f(\delta\sqrt{L})(x, y)\) such that

(3.15)
(3.16)

whenever ρ(y,y′)≤δ. Here the constants \(c_{k,m}\), \(c_{k, m}'\) are as the constants \(c_{k}\), \(c_{k}'\) in Theorem 3.4 with \(R^{2k+d+4}\) replaced by \(R^{2k+d+4+2m}\) and \(\widetilde{c}_{k}\) depending on m as well.

Proof

Let \(h(\lambda):=\lambda^{2m}f(\lambda)\). Then \(h(\delta\sqrt{L})= \delta^{2m}L^{m} f(\delta\sqrt{L})\) and observe that \(h^{(2\nu+1)}(0)=0\) for ν=0,…,k+1. Consequently, the corollary follows by Theorem 3.4 applied to h. □
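The two facts used in this proof can be verified as follows (a short check, with \(h(\lambda)=\lambda^{2m}f(\lambda)\)). On the spectral side \(h(\delta\sqrt{\lambda})=\delta^{2m}\lambda^{m}f(\delta\sqrt{\lambda})\), which is the stated operator identity. For the derivatives at 0, the Leibniz rule gives, for n≥2m,

$$ h^{(n)}(0)=\binom{n}{2m}(2m)!\,f^{(n-2m)}(0), $$

and \(h^{(n)}(0)=0\) for n<2m. If n=2ν+1 is odd with ν≤k+1, then n−2m is odd and at most 2k+3, so \(f^{(n-2m)}(0)=0\) by the hypothesis on f.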

Corollary 3.6

Let f:ℝ+→ℂ be as in the hypothesis of Theorem 3.4. Then there exists a constant c>0 such that for any 0<δ≤1

$$\big\| f(\delta\sqrt{L})\phi\big\|_q \le c\delta^{d(1/q-1/p)}\| \phi \|_p, \quad\forall\phi\in{\mathbb{L}}^p, \quad1\leq p\leq q \leq\infty, $$

and

$$\big|f(\delta\sqrt{L})\phi(x) - f(\delta\sqrt{L})\phi(y)\big| \le c\|\phi \|_\infty \biggl(\frac{\rho(x,y)}{\delta} \biggr)^\alpha, \quad \forall x,y \in M, \quad\forall\phi\in{\mathbb{L}}^\infty. $$

This corollary is an immediate consequence of Theorem 3.4 and Proposition 2.6.

3.2 Non-smooth Functional Calculus

We need to establish some properties of operators of the form \(f(\sqrt{L})\) and their kernels in the case of non-smooth compactly supported functions f. These are kernel operators with not necessarily well localized kernels.

Theorem 3.7

Let f be a bounded measurable function on ℝ+ with \(\operatorname{supp}f \subset[0,\tau]\) for some τ≥1. Then \(f(\sqrt{L})\) is an integral operator with kernel \(f(\sqrt{L})(x, y)\) satisfying

$$ \big| f\bigl(\sqrt{L}\bigr) (x,y) \big| \le\frac{c\| f \|_\infty}{ \sqrt{|B(x, \tau^{-1})|| B(y, \tau^{-1})|}},\quad x,y \in M, $$
(3.17)

and for x,y,y′∈M

$$ \big|f(\sqrt{L}) (x,y)- f(\sqrt{L}) \bigl(x,y'\bigr)\big| \le \frac{c[\tau\rho(y,y')]^\alpha\| f \|_\infty}{ \sqrt{\big|B\bigl(x, \tau^{-1}\bigr)\big|\big| B\bigl(y, \tau^{-1}\bigr)\big|}} \quad\hbox{if } \rho\bigl(y, y'\bigr)\le \tau^{-1}. $$
(3.18)

Furthermore, if 1≤p≤2≤q≤∞,

(3.19)
(3.20)
(3.21)

Above the constants depend only on d and the constants in (1.5) and (1.6); the constant in (3.19) depends in addition on p,q.

Proof

Pick a function \(\theta\in C^{\infty}({\mathbb{R}}_{+})\) so that \(\operatorname{supp}\theta\subset[0, 2]\), θ(x)=1 for x∈[0,1], and 0≤θ≤1. Then by Theorem 3.4

$$ \big |\theta\bigl(\tau^{-1} \sqrt{L}\bigr) (x,y) \big| \le c_\sigma D_{\tau^{-1},\sigma }(x,y) \quad\hbox{for any } \sigma >0. $$
(3.22)

Choose σ>3d/2. We have

(3.23)

Now, (3.17) follows by Proposition 2.9, using the above, (3.22), and the fact that \(\|f(\sqrt{L})\|_{2\rightarrow2} \leq\|f\|_{\infty}\).

From (3.22)–(3.23) and Proposition 2.9 we also obtain for 1≤p≤2≤q≤∞

which confirms (3.19).

For the proof of (3.18), we first observe that

where \(g(u) := f(u) e^{\tau^{-2}u^{2}}\), \(\|g\|_{\infty}\le e\|f\|_{\infty}\), and hence

We now use (3.17), applied to \(g(\sqrt{L})\), and (1.6) to obtain

Moreover, using (2.2) we have

$$\int_M \frac{e^{- (\tau\rho(u,y))^2}}{|B(u, \tau^{-1}) | } d\mu(u) \le\frac{2^d}{|B(y, \tau^{-1})|} \int_M\bigl(1+ \tau\rho(u,y)\bigr)^d e^{- (\tau\rho(u,y))^2} d\mu(u) \le c <\infty, $$

where for the latter inequality we used (2.12). This completes the proof of (3.18).
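Two elementary facts behind this argument can be recorded as a sketch. Since \(\operatorname{supp}f\subset[0,\tau]\), for 0≤u≤τ

$$ \big|g(u)\big| = \big|f(u)\big|e^{\tau^{-2}u^{2}} \le e\big|f(u)\big|, \quad\hbox{so}\quad \|g\|_\infty\le e\|f\|_\infty, $$

and, on the spectral side, \(f(u)=g(u)e^{-\tau^{-2}u^{2}}\), that is,

$$ f(\sqrt{L})=g(\sqrt{L})e^{-\tau^{-2}L}. $$

The second identity is presumably what allows the Hölder regularity (1.6) of the heat kernel at time \(\tau^{-2}\) to be transferred to the kernel of \(f(\sqrt{L})\).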

We now turn to the proof of (3.20). We have

which proves (3.20). Here for the latter estimate we used (3.17).

Finally, using the above we have

and hence \(\||f|^{2}(\sqrt{L}) \|_{1 \rightarrow\infty} = \sup_{x,y } | |f |^{2}(\sqrt{L})(x,y)| = \sup_{x} |f |^{2}(\sqrt{L})(x,x)\), which confirms (3.21). □

3.3 Approximation of the Identity and Littlewood-Paley Decomposition

We first give a convenient statement on approximation of the identity in \({{\mathbb{L}}}^{p}\).

Proposition 3.8

Let \(\varphi\in C^{\infty}({\mathbb{R}}_{+})\), \(\operatorname {supp}\varphi\subset[0, R]\), R>0, φ(0)=1, and \(\varphi^{(2\nu+1)}(0)=0\) for ν=0,1,…. Then for any \(f\in{{\mathbb{L}}}^{p}\), 1≤p≤∞, \(({{\mathbb{L}}}^{\infty}:=\mathrm{UCB})\) one has

$$f= \lim_{\delta\to0}\varphi(\delta\sqrt{L}) f \quad\hbox{in } {{\mathbb{L}}}^p. $$

Proof

By Theorem 3.4 it follows that \(\varphi (\delta\sqrt{L})\) is an integral operator with kernel \(\varphi(\delta\sqrt{L})(x, y)\) satisfying for any k>2d

$$ \big|\varphi(\delta\sqrt{L}) (x, y)\big| \le c_k D_{\delta, k}(x, y) \le c\big|B(x, \delta)\big|^{-1} \bigl(1+ \delta^{-1}\rho(x, y) \bigr)^{-k+d/2}, $$
(3.24)

where for the last inequality we used (2.2). Now, just as in the proof of (2.12) we obtain for k>3d/2 and r>0

$$\int_{M\setminus B(x, r)} \big|\varphi(\delta\sqrt{L}) (x, y)\big|d\mu(y) \le c( \delta/r)^{k-3d/2} \to0 \quad\hbox{as } \delta\to0. $$

Indeed, suppose \(2^{\ell-1}\delta\le r<2^{\ell}\delta\) and denote \(E_{j}:=B(x,2^{j}\delta)\setminus B(x,2^{j-1}\delta)\). Then using (3.24) and (2.1) we get

On the other hand, from (3.12) and φ(0)=1 we have \(\int_{M} \varphi(\delta\sqrt{L})(x, y) d\mu(y) =1\). Using the above and the fact that the vector lattice set of all boundedly supported uniformly continuous functions on M is dense in \({{\mathbb{L}}}^{p}\) (by the Stone-Daniell theorem) one proves as usual the claimed convergence. □

We next give precise meaning to what we call Littlewood-Paley decomposition of \({{\mathbb{L}}}^{p}\)-functions in this article.

Corollary 3.9

Let \(\varphi_{0},\varphi\in C^{\infty}({\mathbb{R}}_{+})\), \(\operatorname{supp}\varphi_{0} \subset[0, b]\) and \(\operatorname {supp}\varphi\subset[b^{-1}, b]\) for some b>1, \(\varphi_{0}(0)=1\), \(\varphi_{0}^{(2\nu+1)}(0)=0\) for ν≥0, and \(\varphi_{0}(\lambda)+\sum_{j\ge1}\varphi(b^{-j}\lambda)=1\) for λ∈ℝ+. Then for any \(f\in{{\mathbb{L}}}^{p}\), 1≤p≤∞, \(({{\mathbb{L}}}^{\infty}:=\mathrm{UCB})\)

$$ f=\varphi_0(\sqrt{L})f+\sum _{j\ge1}\varphi\bigl(b^{-j}\sqrt{L}\bigr)f \quad\mbox{\textit{in} } {{\mathbb{L}}}^p. $$
(3.25)

Proof

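Let \(\theta(\lambda):=\varphi_{0}(\lambda)+\varphi(b^{-1}\lambda)\) and write \(\varphi_{k}(\lambda):=\varphi(b^{-k}\lambda)\) for k≥1. Observe that \(\sum_{k=0}^{j} \varphi_{k}(\lambda )= \theta(b^{-j+1}\lambda )\) for j≥1. Then the result follows by Proposition 3.8 applied to θ. □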
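The partial sums can also be checked independently. This is a sketch, with \(\varphi_{k}(\lambda):=\varphi(b^{-k}\lambda)\) for k≥1 and assuming the partition of unity reads \(\varphi_{0}(\lambda)+\sum_{k\ge1}\varphi(b^{-k}\lambda)=1\) on ℝ+. Replacing λ by \(b^{-j}\lambda\) in this identity and re-indexing gives

$$ \varphi_0(\lambda)+\sum_{k=1}^{j}\varphi\bigl(b^{-k}\lambda\bigr) = 1-\sum_{k>j}\varphi\bigl(b^{-k}\lambda\bigr) = 1-\sum_{i\ge1}\varphi\bigl(b^{-i}b^{-j}\lambda\bigr) = \varphi_0\bigl(b^{-j}\lambda\bigr). $$

So the j-th partial sum of (3.25) equals \(\varphi_{0}(b^{-j}\sqrt{L})f\), with scale \(\delta=b^{-j}\to0\). Note also that evaluating the partition identity at λ=0 gives \(\varphi_{0}(0)=1\), because φ(0)=0 by the support condition on φ.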

3.4 Spectral Spaces

We adhere to the setting of this article, described in the introduction. As before \(E_{\lambda}\), λ≥0, is the spectral resolution associated with the self-adjoint positive operator L on \({{\mathbb{L}}}^{2}:= L^{2}(M, \mu)\). As elsewhere we shall be dealing with operators of the form \(f(\sqrt{L})\). We denote by \(F_{\lambda}\), λ≥0, the spectral resolution associated with \(\sqrt{L}\), that is, \(F_{\lambda}=E_{\lambda ^{2}}\). Then \(f(\sqrt{L})= \int_{0}^{\infty}f(\lambda) d F_{\lambda}\) and the spectral projectors are defined by and

(3.26)

We next list some properties of \(F_{\lambda}\) which follow readily from Theorem 3.7: The operator \(F_{\lambda}\) is a kernel operator whose kernel \(F_{\lambda}(x,y)\) is a real symmetric nonnegative function on M×M. Also,

$$ F_\lambda(x,y)\le c\big|B\bigl(x, \lambda^{-1} \bigr)\big|^{-1/2}\big|B\bigl(y, \lambda^{-1}\bigr)\big|^{-1/2} $$
(3.27)

and \(F_{\lambda}(x,y)\) is in Lip α for some α>0, see (3.18). The mapping property of \(F_{\lambda}\) on \({{\mathbb{L}}}^{p}\) spaces is given by

$$ \| F_\lambda f \|_q \le c \lambda^{d(1/p-1/q)} \| f \|_p, \quad1\leq p \leq2 \leq q \leq \infty. $$
(3.28)

We define the spectral spaces \(\varSigma^{p}_{\lambda}\) for 1≤p≤2 by

$$\varSigma^p_\lambda= \bigl\{ f \in{{\mathbb{L}}}^p: F_\lambda f = f\bigr\}. $$

Notice that \(F_{\lambda}\) is not necessarily a continuous operator on \({{\mathbb{L}}}^{p}\) if p>2 and, therefore, \(\varSigma^{p}_{\lambda}\) cannot be defined as above for 2<p≤∞. Instead, we shall use the following characterization of \(\varSigma^{p}_{\lambda}\): A function \(f \in\varSigma^{p}_{\lambda}\) for 1≤p≤2 if and only if \(\theta(\sqrt{L}) f =f\) for all \(\theta\in C^{\infty}_{0}({\mathbb{R}}_{+})\) such that θ≡1 on [0,λ]. This characterization follows by the fact that \(\varSigma^{p}_{\lambda}\subset\varSigma^{2}_{\lambda}\) for 1≤p≤2 and the boundedness of the operator \(\theta(\sqrt{L})\) with θ as above.

Definition 3.10

For 1≤p≤∞ we define

$$\varSigma^p_\lambda := \bigl\{f\in{{\mathbb{L}}}^p: \theta(\sqrt{L})f = f \hbox{ for all } \theta\in C^\infty_0({\mathbb{R}}_+), \ \theta\equiv1 \hbox{ on } [0, \lambda]\bigr\}. $$

Furthermore, for any compact K⊂[0,∞) we define

$$\varSigma^p_K := \bigl\{f\in{{\mathbb{L}}}^p: \theta(\sqrt{L})f = f \hbox{ for all } \theta\in C^\infty_0({\mathbb{R}}_+), \theta\equiv1 \hbox{ on } K\bigr\}. $$

Proposition 3.11

For any λ≥1 and 1≤p≤∞

$$ \varSigma_\lambda^p = \bigcap_{\varepsilon>0} \varSigma_{\lambda +\varepsilon}^p. $$
(3.29)

Proof

Suppose \(f \in\bigcap_{\epsilon>0} \varSigma_{\lambda+\epsilon}^{p}\) and let \(\theta\in C^{\infty}_{0}({\mathbb{R}}_{+})\), \(\operatorname{supp}\theta \subset[0,R]\), and θ≡1 on [0,λ]. By Definition 3.10 \(f= \theta(r^{-1}\sqrt{L})f\) for each r>1 and hence

$$ \big\|f- \theta(\sqrt{L})f\big\|_p =\big \|\theta \bigl(r^{-1}\sqrt{L}\bigr)f - \theta(\sqrt{L})f \big\|_p, \quad r>1. $$
(3.30)

Assuming that 1<r≤2, Theorem 3.4 implies

$$\big|\theta\bigl(r^{-1}\sqrt{L}\bigr) (x, y) - \theta(\sqrt{L}) (x, y)\big| \le C_r D_{1,k}(x,y), $$

where \(C_{r}=c_{k}R^{2k+d+4}\bigl(\|\theta(r^{-1}\cdot)-\theta(\cdot)\|_{\infty}+\|(d/d\lambda)^{2k+4}[\theta(r^{-1}\cdot)-\theta(\cdot)]\|_{\infty}\bigr)\). We now choose k≥2d+1 and apply Proposition 2.6 to obtain

$$ \big\|\theta\bigl(r^{-1}\sqrt{L}\bigr)- \theta(\sqrt{L}) \big\|_{p\rightarrow p} \le cC_r. $$
(3.31)

Clearly, for any ν≥0 we have \(\lim_{r\to1}\|(d/d\lambda)^{\nu}[\theta(r^{-1}\cdot)-\theta(\cdot)]\|_{\infty}=0\) and, therefore, \(\lim_{r\to1}C_{r}=0\). This along with (3.30)–(3.31) yields \(\|f- \theta(\sqrt{L})f\|_{p}=0\), which completes the proof. □

With the next claim we establish a Nikolskii-type inequality that relates different \({{\mathbb{L}}}^{p}\)-norms on spectral spaces.

Proposition 3.12

If 1≤p≤q≤∞, then \(\varSigma_{\lambda}^{p} \subset\varSigma_{\lambda}^{q}\), \(\varSigma_{\lambda}^{q} \cap {\mathbb{L}}^{p} =\varSigma_{\lambda}^{p}\), and there exists a constant c>0 such that

$$ \|g\|_q \le c\lambda^{d(1/p-1/q)}\|g \|_p, \quad g \in\varSigma_\lambda^p,\ \lambda \ge1. $$
(3.32)

Furthermore, for any \(g \in\varSigma_{\lambda}^{\infty}\), λ≥1,

$$ \big|g(x)- g(y)\big| \le c \bigl(\lambda\rho(x,y) \bigr)^\alpha\|g \|_\infty, \quad x,y \in M, $$
(3.33)

with α>0 the constant from (1.6).

Proof

Let \(g \in\varSigma_{\lambda}^{p}\), λ≥1, and set \(\delta:=\lambda^{-1}\). Choose \(\theta\in C^{\infty}_{0}({\mathbb{R}}_{+})\) so that θ≡1 on [0,1]. Then \(g=\theta(\delta\sqrt{L})g\) and (3.32)–(3.33) follow readily by Corollary 3.6. □

3.5 Linear Approximation from Spectral Spaces

The purpose of this subsection is to give a short account of linear approximation from \(\varSigma_{t}^{p}\) in \({{\mathbb{L}}}^{p}\), 1≤p≤∞. Let \(\mathcal{E}_{t}(f)_{p}\) denote the best approximation of \(f \in {{\mathbb{L}}}^{p}\) (\({{\mathbb{L}}}^{\infty}:=\mathrm{UCB}\)) from \(\varSigma_{t}^{p}\), that is,

$$ \mathcal{E}_t(f)_p:= \inf_{g\in\varSigma_t^p}\|f-g\|_p. $$
(3.34)

Our goal is to characterize the approximation space \(A_{pq}^{s}\), s>0, 0<q≤∞, defined as the set of all functions \(f\in{{\mathbb{L}}}^{p}\) such that

(3.35)
(3.36)

Due to the monotonicity of \(\mathcal{E}_{t}(f)_{p}\) we have

$$\|f\|_{A_{pq}^s}\sim\|f\|_p+ \biggl(\int_1^\infty\bigl(t^{s}\mathcal{E}_t(f)_p\bigr)^qdt/t \biggr)^{1/q}, $$

when q<∞, and

$$\|f\|_{A^s_{p \infty}} := \|f \|_p + \sup_{t\ge1} t^s \mathcal{E}_t(f)_p <\infty \quad\hbox{if } q=\infty.$$

To characterize \(A_{pq}^{s}\) we shall use the well-known machinery of Bernstein and Jackson estimates and interpolation. In Sect. 6.1 it will be shown that \(A_{pq}^{s}\) can be identified as a certain Besov space.

3.5.1 Bernstein and Jackson Estimates. Characterization of Spectral Spaces

We begin by proving a Bernstein estimate.

Theorem 3.13

Let 1≤p≤∞ and m∈ℕ. Then there exists a constant \(c^{\star}=c^{\star}(m)>0\), independent of p, such that for any \(g\in\varSigma_{\lambda}^{p}\), λ≥1,

$$ \big\|L^m g\big\|_p \le c^\star \lambda^{2m}\|g\|_p. $$
(3.37)

Proof

As in the proof of Proposition 3.12, pick \(\theta\in C^{\infty}_{0}({\mathbb{R}}_{+})\) so that θ≡1 on [0,1]. Then for any \(g \in\varSigma_{\lambda}^{p}\) we have \(g=\theta(\delta\sqrt{L})g\) with \(\delta:=\lambda^{-1}\) and, therefore, \(L^{m} g = L^{m}\theta(\delta\sqrt{L})g\). Then (3.37) follows by applying Corollary 3.5 and Proposition 2.6. □

Observe that from spectral theory it readily follows that when p=2 the Bernstein estimate (3.37) holds with constant \(c^{\star}=1\).
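For p=2 the computation is a one-line application of the spectral theorem. The sketch below uses that \(g\in\varSigma_{\lambda}^{2}\) means \(F_{\lambda}g=E_{\lambda^{2}}g=g\), so that the spectral measure \(d\langle E_{t}g,g\rangle\) is carried by \([0,\lambda^{2}]\):

$$ \big\|L^m g\big\|_2^2 = \int_0^{\lambda^2} t^{2m}\, d\langle E_t g, g\rangle \le \lambda^{4m}\int_0^{\lambda^2} d\langle E_t g, g\rangle = \lambda^{4m}\|g\|_2^2, $$

which is (3.37) with constant 1.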

Our next aim is to show that the spectral spaces \(\varSigma_{\lambda}^{p}\) can be characterized by means of Bernstein estimates, in the spirit of the previous theorem, but with a constant (\(c_{\nu}\) below) independent of m.

Theorem 3.14

Let 1≤p≤∞ and λ>0. Then the following assertions are equivalent:

(a) \(f \in\varSigma_{\lambda}^{p}\).

(b) \(f\in\bigcap_{m\in{\mathbb{N}}} D(L^{m})\) and for any ν>λ there exists a constant \(c_{\nu}>0\) such that

$$\big\|L^m f\big\|_p \le c_\nu\nu^{2m}\|f \|_p, \quad\forall m\ge1. $$

(c)

$$z\in{\mathbb{C}}\mapsto e^{-zL}f = \sum_{k \ge0} \frac{(-z)^k}{k!} L^k f $$

is an entire function of exponential type \(\lambda^{2}\).

Proof

Clearly, (b) ⟺ (c) using the Paley-Wiener theorem.
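One direction can be seen directly from the power series; the following is a sketch of (b) ⟹ (c) only, and the converse is where the Paley-Wiener theorem enters. For any ν>λ, assertion (b) gives

$$ \big\|e^{-zL}f\big\|_p \le \|f\|_p+\sum_{k\ge1}\frac{|z|^k}{k!}\big\|L^k f\big\|_p \le \max\{1, c_\nu\}\,\|f\|_p\sum_{k\ge0}\frac{(\nu^2|z|)^k}{k!} = \max\{1, c_\nu\}\,e^{\nu^2|z|}\|f\|_p, $$

so the series converges for every z∈ℂ and the resulting entire function has exponential type at most \(\nu^{2}\). Since ν>λ is arbitrary, the type is at most \(\lambda^{2}\).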

To prove that (a) ⟹ (b) we shall show that the constant \(c^{\star}\) in (3.37) can be specified as follows: For any 0<ε<1 there exists a constant c(ε,d)>0 such that

$$ c^\star=c({\varepsilon}, d) m^{4d+8}(1+{\varepsilon})^{2m}. $$
(3.38)

Indeed, let \(\theta\in C^{\infty}_{0}({\mathbb{R}})\) be so that θ≡1 on [−1,1], \(\operatorname{supp}\theta\subset[-1-{\varepsilon}, 1+{\varepsilon }]\), and also 0≤θ≤1. With \(\delta:=\lambda^{-1}\) we have \(f= \theta(\delta\sqrt{L})f\) for any \(f \in\varSigma_{\lambda}^{p}\) and we shall estimate \(\|L^{m}\theta(\delta\sqrt{L})f\|_{p}\). Denote briefly \(h(u):=u^{2m}\theta(u)\). Then \(h(\delta\sqrt{L})=\delta^{2m}L^{m}\theta(\delta\sqrt{L})\). To go further, set k:=⌊2d⌋+2, hence 2d+1<k≤2d+2. It is readily seen that

$$\|h\|_\infty\le(1+{\varepsilon})^{2m} \quad\hbox{and}\quad \big\|h^{(2k+4)}\big\|_\infty\le c_1({\varepsilon}, d)m^{4d+8}(1+{\varepsilon})^{2m}. $$

Now, by Theorem 3.4 we infer

$$\big|L^m\theta(\delta\sqrt{L}) (x, y)\big| = \delta^{-2m}\big|h(\delta \sqrt{L}) (x, y)\big| \le c_2({\varepsilon}, d) m^{4d+8}(1+{ \varepsilon})^{2m} \lambda^{2m}D_{\delta, k}(x, y) $$

and applying Proposition 2.6 (k>2d+1) we arrive at

$$\big\|L^m f\big\|_p = \big\|L^m \theta(\delta\sqrt{L})f \big\|_p \le c({\varepsilon}, d) m^{4d+8}(1+{\varepsilon} )^{2m} \lambda^{2m}\|f\|_p \quad\hbox{for } f\in\varSigma_\lambda^p, $$

which confirms (3.38).

Given ν>λ, choose 0<ε<1 so that \((1+{\varepsilon})^{2}\lambda\le\nu\). Then from above and the obvious fact that \(\sup_{m\ge1} m^{4d+8}(1+{\varepsilon})^{-2m}\le c'({\varepsilon},d)\) we get

$$\big\|L^m f\big\|_p \le c({\varepsilon}, d) m^{4d+8}(1+{ \varepsilon})^{-2m} \nu^{2m}\|f\|_p \le c''({\varepsilon}, d)\nu^{2m}\|f \|_p \quad\forall f\in\varSigma_\lambda^p. $$

Thus (a) ⟹ (b).
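
The supremum used in the last step can be bounded explicitly. As a sketch, write N:=4d+8 and a:=2ln(1+ε) (shorthand introduced only for this remark, not notation from the paper) and maximize over real x>0 instead of integers m:

$$ \sup_{m\ge1} m^{N}(1+{\varepsilon})^{-2m} \le \sup_{x>0} x^{N}e^{-ax} = \Bigl(\frac{N}{ea}\Bigr)^{N} = \biggl(\frac{4d+8}{2e\ln(1+{\varepsilon})}\biggr)^{4d+8}. $$

The maximum of \(x^{N}e^{-ax}\) is attained at x=N/a, so the right-hand side is one admissible choice of c′(ε,d). It blows up as ε→0, which is consistent with the constant \(c_{\nu}\) in (b) depending on ν.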

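Now, to prove that (b) ⟹ (a), suppose (b) holds for some function \(f \in{{\mathbb{L}}}^{p}\) and let \(\theta\in C^{\infty}_{0}({\mathbb{R}}_{+})\), θ≡1 on [0,λ], as in Definition 3.10. Assume \(\operatorname{supp}\theta\subset[0, R]\). Let ε>0. We shall show that \(\|f-\theta(\sqrt{L})f\|_{p} <{\varepsilon}\), which implies \(f\in\varSigma_{\lambda}^{p}\). Indeed, for 0<δ<r<1 we have

$$\big\|f-\theta(\sqrt{L})f\big\|_p \le \big\|f-\theta(\delta\sqrt{L})f\big\|_p + \big\|\theta(\delta\sqrt{L})f-\theta(r\sqrt{L})f\big\|_p + \big\|\theta(r\sqrt{L})f-\theta(\sqrt{L})f\big\|_p. $$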

By Proposition 3.8, \(\|f-\theta(\delta\sqrt{L})f\|_{p} \to0\) as δ→0 and hence there exists δ>0 such that \(\|f-\theta(\delta\sqrt{L})f\|_{p} < {\varepsilon}/2\). Clearly, \(\|\theta(r\sqrt{L})f-\theta(\sqrt{L})f\|_{p} \to0\) as r→1 and hence there exists r<1 such that \(\|\theta(r\sqrt{L})f-\theta(\sqrt{L})f\|_{p} <{\varepsilon}/2\).

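It remains to show that \(\|\theta(\delta\sqrt{L})f-\theta(r\sqrt{L})f\|_{p} =0\). Let λ<ν<λ/r and denote briefly \(h(u):=[\theta(\delta u)-\theta(ru)]u^{-2m}\). Note that \(\operatorname{supp}h \subset[\lambda /r, R/\delta]\). Then using our assumption we have

$$\big\|\theta(\delta\sqrt{L})f-\theta(r\sqrt{L})f\big\|_p = \big\|h(\sqrt{L})L^m f\big\|_p \le \big\|h(\sqrt{L})\big\|_{p\to p}\big\|L^m f\big\|_p \le c_\nu\nu^{2m}\big\|h(\sqrt{L})\big\|_{p\to p}\|f\|_p. $$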

As above, set k:=⌊2d⌋+2, then 2d+1<k≤2d+2. Now, applying Theorem 3.4 and Proposition 2.6 it follows that

$$\big\|h(\sqrt{L})\big\|_{p\to p} \le c m^{4d+8}(r/\lambda)^{2m} $$

and hence

$$\big\|\theta(\delta\sqrt{L})f-\theta(r\sqrt{L})f\big\|_p \le c m^{4d+8}(r\nu /\lambda )^{2m}\|f\|_p. $$

Here the constant c depends on δ,r,R,d,λ,ν, but is independent of m. Since 0<rν/λ<1, by letting m→∞ we obtain \(\|\theta(\delta\sqrt{L})f-\theta(r\sqrt{L})f\|_{p}=0\). Therefore, (b) ⟹ (a). □

We now establish a Jackson estimate for approximation from \(\varSigma_{t}^{p}\).

Theorem 3.15

Let 1≤p≤∞. Then for any m∈ℕ there exists a constant \(c_{m}>0\) such that for any t≥1

$$ \mathcal{E}_t(f)_p \le c_m t^{-2m}\big\|L^m f\big\|_p \quad \hbox{for } f \in D\bigl(L^m\bigr)\cap{{\mathbb{L}}}^p. $$
(3.39)

Proof

Let \(\theta\in C^{\infty}({\mathbb{R}})\), θ(u)=1 for u∈[0,1], 0≤θ≤1, and \(\operatorname{supp}\theta\subset[0, 2]\). Set φ(u):=θ(u/2)−θ(u). Then \(1-\theta(u)=\sum_{j\ge0}\varphi(2^{-j}u)\), u∈ℝ+. Given t>0, set δ:=2/t. Assume \(f \in D(L^{m})\cap{{\mathbb{L}}}^{p}\). Clearly, \(\theta(\delta\sqrt{L})f \in\varSigma_{t}^{p}\) and hence

$$\mathcal{E}_t(f)_p \le\big\|f-\theta(\delta\sqrt{L})f \big\|_p \le\sum_{j\ge0}\big\|\varphi \bigl(2^{-j}\delta\sqrt{L}\bigr)f\big\|_p. $$

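Denote briefly \(h(u):=\varphi(u)u^{-2m}\). Then \(\varphi(2^{-j}\delta\sqrt{L})L^{-m}= (2^{-j}\delta )^{2m}h(2^{-j}\delta\sqrt{L})\) and, therefore,

$$\big\|\varphi\bigl(2^{-j}\delta\sqrt{L}\bigr)f\big\|_p = \bigl(2^{-j}\delta\bigr)^{2m}\big\|h\bigl(2^{-j}\delta\sqrt{L}\bigr)L^m f\big\|_p \le \bigl(2^{-j}\delta\bigr)^{2m}\big\|h\bigl(2^{-j}\delta\sqrt{L}\bigr)\big\|_{p\to p}\big\|L^m f\big\|_p. $$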

By Theorem 3.4 and Proposition 2.6 it follows that \(\|h(2^{-j}\delta\sqrt{L})\|_{p\to p} \le c(d,m)\) and hence

$$\mathcal{E}_t(f)_p \le ct^{-2m} \big\|L^m f\big\|_p \sum_{j\ge0}2^{-2mj} \le c't^{-2m}\big\|L^m f\big\|_p, $$

which gives (3.39). □
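
For orientation, the two elementary facts behind this proof can be written out. The first is the telescoping identity for φ(u)=θ(u/2)−θ(u); the second is the geometric sum evaluated with δ=2/t. Both are a sketch under the assumptions of the proof (in particular m≥1), and the factor 4/3 is only one convenient bound:

$$ \sum_{j=0}^{J}\varphi\bigl(2^{-j}u\bigr) = \theta\bigl(2^{-J-1}u\bigr)-\theta(u) \longrightarrow 1-\theta(u) \quad (J\to\infty), $$

$$ \sum_{j\ge0}\bigl(2^{-j}\delta\bigr)^{2m} = \frac{4^{m}t^{-2m}}{1-4^{-m}} \le \frac{4}{3}\,4^{m}t^{-2m}. $$

In the first line the limit is attained for finite J, since \(\theta(2^{-J-1}u)=1\) as soon as \(2^{J+1}\ge u\).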

3.5.2 Characterization of Approximation Spaces

Once the Bernstein and Jackson estimates are established, the approximation spaces \(A^{s}_{pq}\), defined in (3.35)–(3.36), can be characterized by interpolation. In the following we shall denote by \((X_{0},X_{1})_{\theta,q}\) the real interpolation space between the normed spaces \(X_{0}\), \(X_{1}\), see e.g. [3, 4].

Theorem 3.16

Let s>0, 1≤p≤∞ and 0<q≤∞. Then for any r>s

$$ A^s_{pq} = \bigl({\mathbb{L}}^p, D (\sqrt{L} )^r \bigr)_{\theta, q}, \quad s= \theta r. $$
(3.40)

Proof

A classical argument (e.g. [15]) using the Jackson and Bernstein estimates from (3.39) and (3.37) implies the following characterization of the spaces \(A^{s}_{pq}\): If 2m>s, then

$$ A^s_{pq} = \bigl({\mathbb{L}}^p, D\bigl(L^m \bigr) \bigr)_{\theta, q} = \bigl({\mathbb{L}}^p, D(\sqrt{L})^{2m} \bigr)_{\theta, q}, \quad s= 2\theta m. $$
(3.41)

Thus (3.40) holds for r=2m. On the other hand, \(-\sqrt{L}\) is the infinitesimal generator of the subordinate semigroup \(Q_{t}f = \int_{0}^{\infty}\frac{t e^{-t^{2}/4s}}{2s \sqrt{\pi s}} e^{-sL}f\, ds \) on \({\mathbb{L}}^{p}\), and by a well-known result (e.g. [4]) if 1≤r<k, then

$$\bigl({{\mathbb{L}}}^p, D (\sqrt{L} )^k \bigr)_{\theta, 1} \subset D (\sqrt{L} )^r \subset \bigl({\mathbb{L}}^p, D (\sqrt{L} )^k \bigr)_{\theta, \infty}, \quad\theta= r/k. $$

Therefore, if 1≤r<2m and \(\theta_{0} =\frac{r}{2m}\), then

$$A^{r}_{p1} = \bigl({\mathbb{L}}^p, D ( \sqrt{L} )^{2m} \bigr)_{\theta_0, 1} \subset D ( \sqrt{L} )^r \subset \bigl({\mathbb{L}}^p, D (\sqrt{L} )^{2m} \bigr)_{\theta _0, \infty}=A^{r}_{p\infty} $$

This along with (3.41) implies

$$\bigl({{\mathbb{L}}}^p, \bigl({\mathbb{L}}^p, D ( \sqrt{L} )^{2m} \bigr)_{\theta_0, 1} \bigr)_{\theta, q} \subset \bigl({{\mathbb{L}}}^p, D ( \sqrt{L} )^{r} \bigr)_{\theta, q} \subset \bigl({{ \mathbb{L}}}^p, \bigl({{\mathbb{L}}}^p, D ( \sqrt{L} )^{2m} \bigr)_{\theta_0, \infty} \bigr)_{\theta, q} $$

and by the reiteration theorem (e.g. [3]) this leads to

$$\bigl({{\mathbb{L}}}^p, D (\sqrt{L} )^{r} \bigr)_{\theta, q} = \bigl({\mathbb{L}}^p, D ( \sqrt{L} )^{2m} \bigr)_{\theta \theta _0, q}= A^{s}_{pq}, \quad s=2\theta\theta_0 m= \theta r. $$

The proof is complete. □

Remark 3.17

From the above, \(A^{s}_{pq}= ({\mathbb{L}}^{p}, D(L^{m} ) )_{\theta, q}\), s=2θm, 0<s<2m, but then as is well-known (e.g. [4])

$$\| f \|_{A^s_{pq}} \sim\| f \|_p + \biggl(\int _0^1 \bigl(t^{-s/2}\big \| \bigl(e^{-tL} - \operatorname{Id}\bigr)^m f \big\|_p \bigr)^q \frac{dt}{t} \biggr)^{1/q} $$

with the usual modification for q=∞. Moreover, since \(e^{-tL}\) is a holomorphic semigroup, we also have

$$\|f\|_{A^s_{pq}} \sim\| f \|_p + \biggl(\int _0^1 \bigl(t^{-s/2} \big\| (tL)^m e^{-tL}f\big\|_p \bigr)^q \frac {dt}{t} \biggr)^{1/q} $$

with the usual modification for q=∞.

3.6 Kernel Norms

Here we derive bounds on the \({{\mathbb{L}}}^{p}\)-norms of the kernels of operators of the form \(\theta(\delta\sqrt{L})\), which will be important for the development of frames.

Theorem 3.18

Let \(\theta\in C^{\infty}({\mathbb{R}}_{+})\), θ≥0, \(\operatorname{supp}\theta\subset[0, R]\) for some R>1, and \(\theta^{(2\nu+1)}(0)=0\), ν=0,1,…. Suppose that either

(i) θ(u)≥1 for u∈[0,1], or

(ii) θ(u)≥1 for u∈[1,b], where b>1 is a sufficiently large constant.

Then for 0<p≤∞, \(0<\delta\le\min\{1, \frac {\operatorname{diam} M}{3}\}\), and x∈M we have

$$ c_1\big|B(x,\delta)\big|^{1/p-1} \le\big\| \theta( \delta\sqrt{L}) (x,\cdot)\big\|_p \le c_2\big|B(x,\delta)\big|^{1/p-1}, $$
(3.42)

where \(c_{1}>0\) depends only on p and the parameters of the space, and \(c_{2}>0\) depends on p and the smoothness and the support of θ similarly as in Theorem 3.4.

Proof

By Theorem 3.4 we have \(|\theta(\delta\sqrt{L}) (x,y)| \le c_{\sigma}D_{\delta,\sigma }(x,y)\) for any σ>0. Pick σ>d(1/2+1/p). Then the upper bound estimate in (3.42) follows readily by estimate (2.10).

It is not hard to see that to prove the lower bound estimate in (3.42) it suffices to have it for p=2 and p=∞ and use the already established upper bound. However, clearly

$$\big\|\theta(\delta\sqrt{L}) (x,\cdot)\big\|_2^2 = \theta^2(\delta\sqrt{L}) (x,x) \quad\hbox{and}\quad \big\|\theta(\delta \sqrt{L}) (x,\cdot)\big\|_\infty\ge\theta(\delta\sqrt{L}) (x,x), $$

and it boils down to establishing lower bounds on \(\theta^{2}(\delta\sqrt{L}) (x,x)\) and \(\theta(\delta\sqrt{L}) (x,x)\).

Further, let \(f, g\in{\mathbb{L}}^{\infty}({\mathbb{R}}_{+})\) be bounded, \(\operatorname{supp}f, g\subset[0, R]\), and 0≤g≤f. Then f=g+h for some h≥0, and hence \(f(\sqrt{L})(x,x)= g(\sqrt{L})(x,x) + h(\sqrt{L})(x,x)\). On the other hand, by (3.20) \(f(\sqrt{L})(x,x) = \int_{M} |\sqrt{f} (\sqrt{L})(x,y)|^{2} d\mu(y)\ge0\), and we have similar representations of \(g(\sqrt{L})(x,x)\) and \(h(\sqrt{L})(x,x)\). Therefore,

$$ 0\le g\le f \quad\Longrightarrow\quad 0 \le g(\sqrt{L}) (x,x) \le f(\sqrt{L}) (x,x). $$
(3.43)

This allows us to compare the kernels of different operators, and we naturally come to the next lemma, which is interesting in its own right.

Lemma 3.19

(a) There exist constants \(c_{3},c_{4}>0\) such that for any τ≥1

(3.44)

(b) There exists b>1 such that if τ≥1 and \(\tau^{-1}\le\frac {\operatorname{diam}M}{3}\), then

(3.45)

where \(c_{5},c_{6}>0\) depend only on the parameters of the space.

Proof

We first show that

(3.46)

Indeed, we have

and since is a kernel operator (Theorem 3.7), then is also a kernel operator and

(3.47)

On the other hand,

and by spectral theory . Therefore, applying Proposition 2.9 we arrive at

This and (3.47) imply (3.46).

We also need these bounds on the heat kernel:

$$ c'\big|B(x,\sqrt{t})\big|^{-1} \le p_t(x, x) \le c|B(x,\sqrt{t})|^{-1}, \quad0<t\le1. $$
(3.48)

The upper bound is immediate from (1.4). For the lower bound we have for ℓ>1, using (1.7),

However, by (1.4) \(p_{t/2}(x,y) \le c_{\sigma}D_{\sqrt{t}, \sigma}(x, y)\) for any σ>0, and hence, just as in the proof of Proposition 3.8,

$$\int_{M\setminus B(x, 2^\ell\sqrt{t})} e^{-t/2 L}(x,y) d\mu(y) \le c2^{-\ell}\le\frac{1}{2} $$

for a sufficiently large ℓ (the constant c is independent of ℓ). This completes the proof of the lower bound estimate in (3.48).

We now turn to the proof of (3.44). Since we obtain, using (3.43) and (3.48)

which gives the right-hand side estimate in (3.44).

For the proof of the left-hand side estimate in (3.44), we first note that for any t>0

From this, (3.43), (3.46), (3.48), and the right-hand side estimate in (3.44) we obtain

Here for the latter inequality we used (2.1). Given τ≥1 and r∈ℕ we choose t so that \(\tau \sqrt{t} = 2^{r}\). Then from above

Hence,

Taking r∈ℕ sufficiently large, this implies the left-hand side estimate in (3.44).

We now take on (3.45). The right-hand side estimate follows from the right-hand side estimate in (3.44). Using (3.44) and the reverse doubling condition (1.2) with \(\tau^{-1} \le\frac{\operatorname{diam}M}{3}\), we obtain for l∈ℕ

which leads to (3.45) with \(b=2^{l}\) for sufficiently large l. □

Completion of the proof of Theorem 3.18

We now focus on the left-hand side estimate in (3.42). Suppose θ obeys condition (ii) from the hypothesis of the theorem, i.e. θ(u)≥1 on [1,b], where b>1 is the same as in Lemma 3.19(b) (the proof in the other case is the same). Then by (3.43) and Lemma 3.19 we have for \(0<\delta\le\min\{1, \frac{\operatorname{diam}M}{3}\}\)

On the other hand

where for the last estimate we proceeded as above. Thus so far we have

(3.49)

Now, for 0<p<∞ the left-hand side estimate in (3.42) follows from the estimates in (3.49) in a standard manner. Indeed, set \(f:=\theta(\delta\sqrt{L}) (x,\cdot)\). If 0<p<2, then using (3.49) we get

$$c_5\big|B(x, \delta)\big|^{-1} \le\|f\|_2^2 \le\|f\|_p^p\|f\|_\infty^{2-p} \le c \|f\|_p^p\big|B(x, \delta)\big|^{-2+p}, $$

which implies \(\|f\|_{p}\ge c'|B(x,\delta)|^{1/p-1}\). If 2<p<∞, we use (3.49) and Hölder’s inequality to obtain

$$c_5\big|B(x, \delta)\big|^{-1} \le\|f\|_2^2 \le\|f\|_p\|f\|_{p'} \le c\|f\|_p\big|B(x, \delta)\big|^{1/p'-1} \quad\bigl(1/p+1/p'=1\bigr). $$

This leads again to \(\|f\|_{p}\ge c'|B(x,\delta)|^{1/p-1}\). □
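
The exponent bookkeeping in the last two steps can be checked as follows. This is a sketch that uses only the lower bound on \(\|f\|_{2}^{2}\) from (3.49) and the upper bounds \(\|f\|_{q}\le c|B(x,\delta)|^{1/q-1}\) from (3.42); the constants are not tracked. For 0<p<2,

$$ \|f\|_2^2 = \int_M |f|^{p}|f|^{2-p}\, d\mu \le \|f\|_\infty^{2-p}\|f\|_p^p \quad\Longrightarrow\quad \|f\|_p^p \ge c'\big|B(x,\delta)\big|^{-1+(2-p)} = c'\big|B(x,\delta)\big|^{1-p}, $$

and for 2<p<∞, with 1/p+1/p′=1,

$$ c_5\big|B(x,\delta)\big|^{-1} \le \|f\|_p\|f\|_{p'} \le c\|f\|_p\big|B(x,\delta)\big|^{1/p'-1} \quad\Longrightarrow\quad \|f\|_p \ge c'\big|B(x,\delta)\big|^{-1/p'} = c'\big|B(x,\delta)\big|^{1/p-1}. $$

Taking the p-th root in the first line gives the same exponent 1/p−1.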

3.7 Finite Dimensional Spectral Spaces

It is easy to see that in the case when μ(M)<∞ the spectrum of L is discrete and the respective eigenspaces are finite dimensional. This and some other related simple facts are collected in the following statement, where we adhere to the notation from the previous subsections.

Proposition 3.20

The following claims are equivalent:

(a) \(\operatorname{diam}M <\infty\).

(b) μ(M)<∞.

(c) There exists δ>0 such that \(\int_{M}\mu(B(x,\delta))^{-1}d\mu(x)<\infty\) and hence we have \(\int_{M}\mu(B(x,r))^{-1}d\mu(x)<\infty\) for all r>0.

(d) The spectrum of the operator L is discrete and of the form \(0\le\lambda_{1}<\lambda_{2}<\cdots\),

$${\mathbb{L}}^2 = \sum\bigoplus_{j} \mathcal{H}_{\lambda_j}, \quad\hbox{where } \mathcal{H}_{\lambda_j} = \operatorname{Ker}(L-\lambda_j \operatorname{Id}), \hbox{ and } \dim( \mathcal{H}_{\lambda_j}) <\infty. $$

(e) There exists t>0 such that

$$\big\|e^{-tL}\big\|_{HS}^2=\int_M \int_M \big|p_t(x, y)\big|^2 d\mu(x) d \mu(y) =\int_M p_{2t}(x, x) d\mu(x) <\infty, $$

and hence this is true for all t>0.

(f) There exists λ≥1 (and hence ∀λ≥1) such that \(\varSigma_{\lambda}^{\infty}= \varSigma_{\lambda}^{1}\) (\(= \varSigma_{\lambda}^{p}\) for all 1≤p≤∞).

Furthermore, if one of the above holds, then for λ≥1

$$ \dim(\varSigma_\lambda ) \sim\int _M \mu\bigl(B\bigl(x,\lambda^{-1}\bigr) \bigr)^{-1} d\mu(x) \quad\hbox{and}\quad \dim(\varSigma_{\sqrt{\lambda}} )\sim\big\| e^{-\lambda^{-1} L}\big\|^2_{HS}, $$
(3.50)

where \(\varSigma_{\lambda}= \sum\bigoplus_{ \sqrt{\lambda_{j}} \le \lambda } \mathcal{H}_{\lambda_{j}}\). In addition,

$$ p_t(x,y)= \sum_{j\ge1} e^{-\lambda_j t}P_{\mathcal{H}_j}(x,y) , \quad P_{\mathcal{H}_j}(x,y)= \sum _{l=1}^{\dim(\mathcal{H}_j)} e_j^l(x) \overline{e_j^l(y)}, $$
(3.51)

where \(\{e_{j}^{l}: l=1,\dots, \dim(\mathcal{H}_{j})\}\) is an orthonormal basis for \(\mathcal{H}_{j}\), \(Le_{j}^{l} = \lambda_{j} e_{j}^{l}\). The convergence is uniform and \(p_{t}(x,y)\) is a positive definite kernel.
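
As a consistency check between (e) and the eigenfunction expansion, the Hilbert-Schmidt norm can be computed term by term. The following sketch assumes the expansion of the heat kernel on the diagonal with the time-dependent factor \(e^{-2t\lambda_{j}}\) and uses \(\|e_{j}^{l}\|_{2}=1\); the interchange of sum and integral is justified by positivity of the terms:

$$ \big\|e^{-tL}\big\|_{HS}^2 = \int_M p_{2t}(x,x)\, d\mu(x) = \sum_{j\ge1} e^{-2t\lambda_j}\sum_{l=1}^{\dim(\mathcal{H}_j)}\int_M \big|e_j^l(x)\big|^2 d\mu(x) = \sum_{j\ge1} e^{-2t\lambda_j}\dim(\mathcal{H}_j). $$

In particular, finiteness of the left-hand side for some t>0 forces every \(\dim(\mathcal{H}_{j})\) to be finite.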

Proof

As already shown in Proposition 2.1, (a) and (b) are equivalent. Note that, since in our setting closed balls are compact, (a) or (b) is also equivalent to the compactness of M.

Clearly (b) implies (f) as \(\varSigma_{\lambda}^{1} \subset\varSigma_{\lambda}^{\infty}\subset{\mathbb{L}}^{\infty}\subset{\mathbb{L}}^{1} \) and \(\varSigma_{\lambda}^{\infty}\cap{\mathbb{L}}^{1} =\varSigma_{\lambda}^{1}\).

To show that (f) implies (b), assume \(\varSigma_{\lambda}^{\infty}= \varSigma_{\lambda}^{1}\). Then if \(\theta\in C_{0}^{\infty}({\mathbb{R}}_{+})\), θ≡1 in the neighborhood of 0 and \(\operatorname{supp}\theta\subset[0,\lambda]\) we have \(\theta(\sqrt{L})f \in\varSigma_{\lambda}^{\infty}= \varSigma_{\lambda}^{1}\) \(\forall f \in{\mathbb{L}}^{\infty}\). Hence \(1= \theta(\sqrt{L})(1) \in{\mathbb{L}}^{1}\), which implies μ(M)<∞.

Assume that (a)–(b) hold and fix \(x_{0}\in M\). Then using (2.1)–(2.2) we get

which readily implies

$$\int_M \big|B(x,\delta)\big|^{-1} d\mu(x) \le(4/ \delta)^d\big|B(x_0, 1)\big|(1+D)|M| < \infty. $$

Thus (a)–(b) imply (c).

For the other direction, assume that (c) holds and let \({\mathcal{X}}_{\delta}\) be a maximal δ-net on M with a companion disjoint partition \(\{A_{\xi}\}_{\xi\in{\mathcal{X}}_{\delta}}\) of M as in Proposition 2.5. Then we use (2.1)–(2.2) again to obtain

Hence \(\#{\mathcal{X}}_{\delta}<\infty\), which readily implies \(\operatorname {diam}(M) <\infty\). So, (c) implies (a).

Since \(\int_{M} p_{t}(x,y)^{2}d\mu(y)=p_{2t}(x,x)\), the equivalence of (c) and (e) is immediate from (1.4).

It remains to show that (c) and (d) are equivalent. Suppose (c) holds true. Since \(E_{\lambda}^{2}=E_{\lambda}\), we have

$$ \int_M\big |E_\lambda(x, y)\big|^2 d\mu(y) = \int_M E_\lambda(x, y)E_\lambda(y, x) d\mu(y) =E_\lambda^2(x, x)= E_\lambda (x, x) $$
(3.52)

and hence, using Lemma 3.19,

Therefore, \(E_{\lambda}\) (λ≥1) is a Hilbert-Schmidt operator on \({{\mathbb{L}}}^{2}\) and hence its spectrum is discrete. Suppose \(\{e_{j}\}_{j\in J}\) is an orthonormal family satisfying \(E_{\lambda}e_{j}=e_{j}\), and put

$$H(x,y) = \sum_{j \in J} e_j(x) \overline{e_j(y)}. $$

Evidently, \(H^{2}=H\) and as in (3.52) \(\int_{M} | H(x,y)|^{2} d\mu(y) = H(x,x) = \sum_{j \in J} |e_{j}(x)|^{2}\). On the other hand \(E_{\lambda}H=HE_{\lambda}=H\) and hence

Consequently, \(H(x,x)\le E_{\lambda}(x,x)\). Thus

Therefore, \(\dim(\varSigma_{\sqrt{\lambda}}) \le c\int_{M} |B(x,\lambda^{-1/2})|^{-1}d\mu (x)<\infty\), which shows that (c) implies (d).

Finally, assume that (d) holds true. Let \(\{e_{j}\}_{j\in J}\) be an orthonormal basis of \(\varSigma_{\lambda}\), λ≥1. Then \(E_{\lambda}(x, y)=\sum_{j\in J} e_{j}(x)\overline{e_{j}(y)}\), where \(\#J=\dim(\varSigma_{\lambda})\). Now, using Lemma 3.19 we infer

Thus (d) implies (c).

The estimates in (3.50) follow from above. The last assertion of the theorem is Mercer’s theorem (see [18]). □

4 Sampling Theorem and Cubature Formula

Basic tools for constructing decomposition systems (frames) for various spaces will be a sampling theorem for \(\varSigma_{\lambda}^{p}\) and a cubature formula for \(\varSigma_{\lambda}^{1}\). In turn these results will rely on the nearly exponential localization of operator kernels induced by smooth cut-off functions φ (Theorem 3.4): If \(\varphi\in C^{\infty}({\mathbb{R}}_{+})\), \(\operatorname{supp}\varphi\subset[0,b]\), b>1, 0≤φ≤1, and φ=1 on [0,1], then there exists a constant α>0 such that for any δ>0 and x,y,x′∈M

(4.1)
(4.2)

Here K(σ)>1 depends on φ, σ and the other parameters, but is independent of x,y,x′ and δ.

The main ingredient in our constructions will be the following Marcinkiewicz-Zygmund inequality for \(\varSigma_{\lambda}^{1}\), where maximal δ-nets (see Sect. 2.3) will be utilized.

Proposition 4.1

Given λ≥1, let \({\mathcal{X}}_{\delta}\) be a maximal δ-net on M with \({\delta}:=\frac{\gamma}{\lambda}\), where 0<γ≤1. Suppose \(\{A_{\xi}\}_{\xi\in{\mathcal{X}}_{\delta}}\) is a companion disjoint partition of M consisting of measurable sets such that \(B(\xi,\delta/2)\subset A_{\xi}\subset B(\xi,\delta)\), \(\xi \in{\mathcal{X}}_{\delta}\). Then for any \(f \in\varSigma_{\lambda}^{p}\), 1≤p<∞,

$$ \sum_{\xi\in{\mathcal{X}}_{\delta}} \int _{A_\xi}\big |f(x) - f(\xi )\big|^p dx \leq\bigl[{K( \sigma_*)}\gamma^{\alpha} {c^\diamond}\bigr]^p \| f \|_p^p, $$
(4.3)

and for any \(f \in\varSigma_{\lambda}^{\infty}\)

$$ \sup_{\xi\in{\mathcal{X}}_{\delta}} \sup_{x\in A_\xi} \big|f(x)-f(\xi)\big| \le{K( \sigma_*)}\gamma^{\alpha} {c^\diamond}\|f\|_\infty, $$
(4.4)

where \(K(\sigma_{*})\) is the constant from (4.1)–(4.2) with \(\sigma_{*}:=2d+1\) and \(c^{\diamond}=2^{2d+1}\).

Proof

Suppose φ is a cut-off function as in (4.1)–(4.2). Then we have \(f = \int_{M} \varphi(\lambda^{-1} \sqrt{L}) (\cdot,y) f(y)dy\) for \(f\in\varSigma_{\lambda}^{p}\), 1≤p≤∞, and using (4.2) with \(\delta=\lambda^{-1}\) we obtain for 1≤p<∞

where for the last inequality we used Proposition 2.6. The proof of (4.4) is similar. □

4.1 The Sampling Theorem

The following sampling theorem will play an important role in the sequel.

Theorem 4.2

Let 0<γ<1 and

$$ {K(\sigma_*)}\gamma^{\alpha}{c^\diamond}\le\frac{1}{2}, $$
(4.5)

where \(K(\sigma_{*})\) is the constant from (4.2) with \(\sigma_{*}:=2d+1\) and \(c^{\diamond}=2^{2d+1}\). For a given λ≥1 let \({\mathcal{X}}_{\delta}\) be a maximal δ-net on M with \({\delta}:=\frac{\gamma}{\lambda}\) and suppose \(\{A_{\xi}\}_{\xi\in{\mathcal{X}}_{\delta}}\) is a companion disjoint partition of M consisting of measurable sets such that \(B(\xi,\delta/2)\subset A_{\xi}\subset B(\xi,\delta)\), \(\xi \in{\mathcal{X}}_{\delta}\). Then for any \(f\in\varSigma_{\lambda}^{p}\), 1≤p<∞,

$$ \frac{1}{2} \| f \|_p \le \biggl(\sum _{\xi\in{\mathcal{X}}_{\delta}} |A_\xi|\big| f(\xi )\big|^p \biggr)^{1/p} \le2 \| f \|_p $$
(4.6)

and for \(f\in\varSigma_{\lambda}^{\infty}\)

$$ \frac{1}{2} \| f \|_\infty\le\sup_{\xi\in{\mathcal{X}}_{\delta}} \big|f(\xi)\big|\le \| f \|_\infty. $$
(4.7)

Furthermore, if 0<γ<1 is selected so that

$$ {K(\sigma_*)}\gamma^{\alpha}{c^\diamond}\le \frac {{\varepsilon}}{3}, $$
(4.8)

(instead of (4.5)) for a given 0<ε<1, then for any \(f\in\varSigma_{\lambda}^{p}\), 1≤p≤2,

$$ (1-{\varepsilon})\| f \|_p^p \le\sum _{\xi\in{\mathcal{X}}_{\delta }} |A_\xi|\big| f(\xi)\big|^p \le(1+{\varepsilon})\| f \|_p^p. $$
(4.9)

Proof

We first prove (4.9). It is easy to see that

$$ \frac{1}{(1+\delta)^{p-1}} |a |^p \leq \frac{1}{\delta^{p-1}} |a-b|^p + |b|^p \quad \hbox{if $0<\delta<1$, $a,b \in{\mathbb{C}}$ and $1\le p$.} $$
(4.10)

which implies

$$ (1-\delta) |a |^p \leq\frac{1}{\delta^{p-1}} |a-b|^p + |b|^p \quad\hbox{if $0<\delta<1$, $a,b \in{\mathbb{C}}$ and $1\le p\le2$.} $$
(4.11)
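
For completeness, here is one way to verify these two elementary inequalities; it is a sketch via Hölder's inequality for two-term sums with 1/p+1/p′=1 (for p=1 the claim is just the triangle inequality), not necessarily the argument the authors had in mind. Splitting \(|a|\le|a-b|+|b|\) with weights \(\delta^{1/p'}\) and 1 gives

$$ |a| \le \delta^{1/p'}\cdot\delta^{-1/p'}|a-b| + 1\cdot|b| \le (1+\delta)^{1/p'}\bigl(\delta^{-p/p'}|a-b|^p+|b|^p\bigr)^{1/p}, $$

and raising to the power p, with p/p′=p−1, yields (4.10). For 1≤p≤2 and 0<δ<1 one has

$$ \frac{1}{(1+\delta)^{p-1}} \ge \frac{1}{1+\delta} \ge 1-\delta, $$

so the left-hand side of (4.10) dominates \((1-\delta)|a|^{p}\), which is the inequality (4.11).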

This inequality with δ:=ε/3 implies

(4.12)
(4.13)

Summing up estimates (4.12) over \(\xi\in{\mathcal {X}}_{\delta}\), we get

which implies the left-hand side estimate in (4.9). Here for the second estimate we used (4.3).

Similarly, we sum up estimates (4.13) and use again (4.3) to obtain

which readily yields the right-hand side estimate in (4.9).

To establish (4.6) note that (using (4.10)) \(\frac{1}{2^{p-1}} |a|^{p} \leq|a-b|^{p} + |b|^{p}\) for a,b∈ℂ and 1≤p<∞, which leads to

Then one proceeds exactly as above and obtains (4.6). The proof of (4.7) is simpler and will be omitted. □
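
The omitted argument for (4.7) can be sketched as follows, assuming (4.4) together with (4.5) and that \(\|\cdot\|_{\infty}\) is the supremum norm of a continuous function (recall \({\mathbb{L}}^{\infty}:=\mathrm{UCB}\)). Since the sets \(A_{\xi}\) partition M, every x∈M lies in some \(A_{\xi}\), and

$$ \big|f(x)\big| \le \big|f(\xi)\big| + \big|f(x)-f(\xi)\big| \le \sup_{\xi\in{\mathcal{X}}_{\delta}}\big|f(\xi)\big| + \frac{1}{2}\|f\|_\infty . $$

Taking the supremum over x∈M and absorbing the last term gives \(\frac{1}{2}\|f\|_{\infty}\le\sup_{\xi}|f(\xi)|\); the upper bound \(\sup_{\xi}|f(\xi)|\le\|f\|_{\infty}\) is immediate.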

Remark 4.3

Observe that under the assumptions of Theorem 4.2 one has, using (1.1) and (2.1),

Then estimates (4.6) imply that for \(f\in\varSigma_{\lambda}^{p}\), 1≤p<∞,

$$ \frac{1}{2}(\gamma/4)^{d/p} \biggl(\sum _{\xi\in{\mathcal{X}}_{\delta}} \big|B\bigl(\xi, \lambda^{-1}\bigr)\big|\big| f(\xi )\big|^p \biggr)^{1/p} \le\| f \|_p \le2 \biggl(\sum _{\xi\in{\mathcal{X}}_{\delta}}\big|B\bigl(\xi, \lambda^{-1}\bigr)\big|\big| f( \xi )\big|^p \biggr)^{1/p}. $$
(4.14)

Also, note that estimates (4.9) are immediate for p=∞ with the usual modification and hold when 2<p<∞ with some modification of the constant in (4.8) (γ depends on p). We do not elaborate on this since we shall only need (4.9) for p=2.

4.2 Cubature Formula for \(\varSigma_{\lambda}^{1}\)

In this subsection we utilize the Marcinkiewicz-Zygmund inequality from Proposition 4.1 for the construction of a cubature formula on \(\varSigma_{\lambda}^{1}\).

Theorem 4.4

Let 0<γ<1 and

$$ {K(\sigma_*)}\gamma^{\alpha}{c^\diamond}= \frac{1}{4}. $$
(4.15)

Let λ≥1 and suppose \({\mathcal{X}}_{\delta}\) is a maximal δ-net on M with \(\delta:=\gamma\lambda^{-1}\). Then there exist positive constants (weights) \(\{w^{\lambda}_{\xi}\}_{\xi\in{\mathcal{X}}_{\delta}}\) such that

$$ \int_M f(x) d\mu(x) = \sum _{\xi\in{\mathcal{X}}_{\delta}} w^\lambda_\xi f(\xi), \quad f \in \varSigma^1_\lambda, $$
(4.16)

and

$$ \frac{2}{3}\big|B(\xi, {\delta}/2)\big| \le w^\lambda_\xi \le2\big|B(\xi, {\delta})\big|, \quad\xi\in{\mathcal{X}}_{\delta}. $$
(4.17)

We shall derive this theorem from Proposition 4.1 and a version of the Hahn-Banach theorem for ordered linear spaces. We next give a theorem of Bauer of this sort (adapted to the case of linear normed spaces) that best serves our purposes and refer the reader to [1] for its proof.

Theorem 4.5

(Bauer)

Suppose E is a linear normed space, F⊂E is a subspace of E, and C is a convex cone in E, which determines an order on E (f≤g if g−f∈C). Set V:={f∈E:∥f∥≤1}. Let Λ:F→ℝ be a linear functional on F. Then Λ can be extended to a linear functional \(\widetilde{\varLambda}\) on E which is (i) positive, i.e. \(\widetilde{\varLambda}(f)\ge0\) if f∈C, and (ii) \(|\widetilde{\varLambda}(f)|\le\|f\|\) for f∈E, if and only if

$$ \varLambda(f) \ge-1 \quad\hbox{for all $f\in F\cap(V+C)$}. $$
(4.18)

A simple rescaling shows that the theorem holds if the condition in (ii) above is replaced by \(\widetilde{\varLambda}(f)\le{c^{\star}}\|f\|\) and the condition in (4.18) by \(\varLambda(f)\ge-c^{\star}\), where \(c^{\star}>0\) is a constant.

We next show how the Marcinkiewicz-Zygmund inequality implies the existence of a quadrature rule in a general setting and then apply the result to our particular case.

Proposition 4.6

Suppose (X,μ) is a measure space and let \(\mathcal{H}\) be a space of μ-integrable functions defined everywhere on X. Suppose \(\{A_{i}\}_{i\in I}\) is a finite or countable disjoint partition of X, i.e. \(X=\bigcup_{i\in I}A_{i}\) and \(A_{i}\cap A_{j}=\emptyset\) if i≠j, consisting of measurable subsets of X of finite measure (\(0<\mu(A_{i})<\infty\)). Let \(\xi_{i}\in A_{i}\), i∈I. Also, assume that there exists a constant \(\alpha<\frac{1}{2}\) such that

$$ \sum_{i \in I} \int_{A_i} \big|f(x)- f(\xi_i)\big| d\mu(x) \leq\alpha\int_X \big|f(x)\big| d\mu(x), \quad f \in\mathcal{H}. $$
(4.19)

Then there exist positive constants \(\{\gamma_{i}\}_{i\in I}\) such that

$$ \int_X f(x) d\mu(x) = \sum _{i \in I} \gamma_i f(\xi_i) \quad\hbox{for } f \in\mathcal{H}, $$
(4.20)

and

$$ \frac{1-2\alpha}{1-\alpha} \mu(A_i) \le\gamma_i \le\frac {1+2\alpha}{1-\alpha}\mu(A_i), \quad i\in I. $$
(4.21)

Proof

Consider the discrete positive measure \(d\nu:= \sum_{i\in I} \mu(A_{i}) \delta_{\xi_{i}}\) on X, supported on the set \({\mathcal{X}}:=\{\xi_{i}: i\in I\}\), and let \({{\mathbb{L}}}^{1}(\nu)\) be the respective (weighted discrete) \(L^{1}\)-space. By (4.19) we obtain for \(f\in\mathcal{H}\)

$$\bigg|\int_X f d\mu- \int_X f d\nu \bigg| \leq\alpha\int_X |f| d\mu, $$

and

$$ (1- \alpha) \| f \|_{{{\mathbb{L}}}^1(\mu)} \leq\| f \|_{{{\mathbb{L}}}^1(\nu)} \leq (1 + \alpha) \| f \|_{{{\mathbb{L}}}^1(\mu)}. $$
(4.22)

Hence

$$\int_X f d\mu- \int_X f d\nu \geq- \alpha\int_X |f| d\mu \geq-\frac{\alpha}{1-\alpha} \int_X |f| d\nu, $$

which readily implies

$$ \int_X f d\mu- \frac{1-2\alpha}{1-\alpha} \int _X f d\nu \geq-\frac{\alpha}{1-\alpha} \int _X \bigl(|f|-f\bigr) d\nu. $$
(4.23)
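
The step from the preceding display to (4.23) is a one-line rearrangement. As a sketch, write κ:=α/(1−α) (shorthand used only here) and note that \(1-\frac{1-2\alpha}{1-\alpha}=\kappa\). Then

$$ \int_X f d\mu - \frac{1-2\alpha}{1-\alpha}\int_X f d\nu = \Bigl(\int_X f d\mu - \int_X f d\nu\Bigr) + \kappa\int_X f d\nu \ge -\kappa\int_X |f| d\nu + \kappa\int_X f d\nu = -\kappa\int_X \bigl(|f|-f\bigr) d\nu, $$

where the inequality is the bound \(\int_{X} f d\mu-\int_{X} f d\nu\ge-\kappa\int_{X}|f|d\nu\) obtained just before (4.23).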

On the other hand, (4.22) yields that the operator \(J : f \in\mathcal{H}\mapsto\{f(\xi_{i})\}_{i\in I} \in{{\mathbb{L}}}^{1}(\nu)\) is continuous and, moreover, if \(J(\mathcal{H}) = \widetilde{\mathcal {H}} \subset{{\mathbb{L}}}^{1}(\nu)\), then the operator

$$J^{-1}: g \in\widetilde{\mathcal{H}} \mapsto\int_X J^{-1}(g)d\mu $$

is well-defined and continuous, and by (4.23)

$$ \int_X J^{-1}(g) d\mu- \frac{1-2\alpha}{1-\alpha} \int_X g d\nu \geq- \frac{2\alpha}{1-\alpha} \int_X \bigl(|g|-g\bigr) d\nu. $$
(4.24)

Let the linear functional \(\varLambda: \widetilde{\mathcal{H}}\mapsto\mathbb {R}\) be defined by

$$ \varLambda: g \in\widetilde{\mathcal{H}} \mapsto\varLambda(g) := \int _X J^{-1}(g) d\mu- \frac{1-2\alpha}{1-\alpha} \int _X g d\nu. $$
(4.25)

We next apply Theorem 4.5 with \(E={{\mathbb{L}}}^{1}(\nu)\), \(F=\widetilde{\mathcal{H}}\),

$$C=\bigl\{f\in{{\mathbb{L}}}^1(\nu): f(\xi)\ge0, \xi\in{ \mathcal{X}}\bigr\}, \quad\quad V=\bigl\{f\in{{\mathbb{L}}}^1(\nu):\|f \|_{{{\mathbb{L}}}^1(\nu)}\le1\bigr\}, $$

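and the linear functional Λ from (4.25). Evidently, in this case, f∈F∩(V+C) if and only if \(f \in\widetilde{\mathcal{H}}\) and f can be represented in the form f=g+h, where \(\|g\|_{{{\mathbb{L}}}^{1}(\nu)} \leq1\) and h≥0. Then by (4.24) it follows that

$$\varLambda(f) \ge -\frac{2\alpha}{1-\alpha}\int_X \bigl(|f|-f\bigr) d\nu \ge -\frac{4\alpha}{1-\alpha}\int_X |g| d\nu \ge -\frac{4\alpha}{1-\alpha}. $$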

Applying now Theorem 4.5 we conclude that there exists a positive continuous extension \(\widetilde{\varLambda}\) of Λ to \({{\mathbb{L}}}^{1}(\nu)\) such that \(\|\widetilde{\varLambda}\|\le{c^{\star}}=\frac{4\alpha}{1-\alpha}\). However, as is well-known (see e.g. [16]) \(({{\mathbb{L}}}^{1}(\nu ))^{*}={{\mathbb{L}}}^{\infty}(\nu)\). Therefore, there exists a sequence \(\beta\in{{\mathbb{L}}}^{\infty}(\nu)\), \(\beta=\{\beta_{i}\}_{i\in I}\), such that \(\|\beta\|_{\infty} =\sup_{i\in I} \beta_{i} \le\frac{4\alpha }{1-\alpha}\) and

$$\widetilde{\varLambda}(f)=\sum_{i\in I} f( \xi_i)\beta_i\mu(A_i), \quad f\in{{ \mathbb{L}}}^1(\nu). $$

Since \(\widetilde{\varLambda}\) is positive, we have \(\beta_{i}\ge0\), i∈I. Consequently, for any \(f\in\mathcal{H}\)

$$\varLambda(f)=\int_X f d\mu- \frac{1-2\alpha}{1-\alpha} \sum _i \mu (A_i) f(\xi_i) = \sum_i \beta_i \mu(A_i) f(\xi_i), $$

where \(0 \le\beta_{i} \le\frac{4 \alpha}{1-\alpha}\), which leads to \(\int_{X} f(x)d\mu(x)=\sum_{i\in I}\gamma_{i}f(\xi_{i})\) for \(f \in\mathcal{H}\), where

$$\frac{1-2\alpha}{1-\alpha} \mu(A_i) \le\gamma_i \le \frac{1-2\alpha}{1-\alpha} \mu(A_i ) + \frac{4\alpha }{1-\alpha} \mu(A_i) = \frac{1+2\alpha}{1-\alpha} \mu(A_i). $$

The proof is complete. □

Proof of Theorem 4.4

Let \({\mathcal{X}}_{\delta}\) be a maximal δ-net on M with \({\delta} =\frac{\gamma}{\lambda}\). Then by Proposition 4.1 we have

$$\sum_{\xi\in{\mathcal{X}}_{\delta}} \int_{A_\xi}\big|f(x) - f( \xi)\big| d\mu(x) \le{K(\sigma_*)}\gamma^{\alpha}{c^\diamond}\|f \|_{{{\mathbb{L}}}^1}. $$

If γ>0 and \({K(\sigma_{*})}\gamma^{\alpha}{c^{\diamond}}\le \frac{1}{4}\), then Theorem 4.4 follows at once from Proposition 4.6. □
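
The constants in (4.17) can be read off from (4.21). In this sketch the constant of Proposition 4.6 is written \(\alpha_{0}\) (a label used only here, to avoid a clash with the exponent α in (4.15)) and is taken equal to 1/4:

$$ \alpha_0=\frac{1}{4}: \qquad \frac{1-2\alpha_0}{1-\alpha_0} = \frac{1/2}{3/4} = \frac{2}{3}, \qquad \frac{1+2\alpha_0}{1-\alpha_0} = \frac{3/2}{3/4} = 2 . $$

Combined with \(B(\xi,\delta/2)\subset A_{\xi}\subset B(\xi,\delta)\) this gives

$$ \frac{2}{3}\big|B(\xi,\delta/2)\big| \le \frac{2}{3}\mu(A_\xi) \le w_\xi^\lambda \le 2\mu(A_\xi) \le 2\big|B(\xi,\delta)\big| . $$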

5 Construction of Frames

An important part of our development in this article is the construction of well-localized decomposition systems for spaces of functions or distributions in the general setting of this article. The goal will be to construct a pair of dual frames, where the elements of both frames are band limited and have nearly exponential space localization.

5.1 A Natural (Littlewood-Paley Type) Frame for \({{\mathbb{L}}}^{2}\)

We begin with the construction of a well-localized frame based on the kernels of spectral operators considered in Sect. 3.1.

Let \(\varPhi\in C^{\infty}({\mathbb{R}}_{+})\), Φ(u)=1 for u∈[0,1], 0≤Φ≤1, and \(\operatorname{supp}\varPhi\subset[0, b]\), where b>1 is the constant from Theorem 3.18. Set Ψ(u):=Φ(u)−Φ(bu) and note that 0≤Ψ≤1 and \(\operatorname{supp}\varPsi\subset[b^{-1}, b]\). We shall also assume that Φ is selected so that Ψ(u)≥c>0 for \(u\in[b^{-3/4},b^{3/4}]\). We set

$$ \varPsi_0(u):=\varPhi(u) \quad\hbox{and}\quad \varPsi_j(u):=\varPsi \bigl(b^{-j}u\bigr),\quad j\ge1. $$
(5.1)

Clearly, \(\varPsi_{j}\in C^{\infty}({\mathbb{R}}_{+})\), \(0\le\varPsi_{j}\le1\), \(\operatorname{supp} \varPsi_{0} \subset[0, b]\), \(\operatorname{supp}\varPsi_{j} \subset[b^{j-1}, b^{j+1}]\), j≥1, and \(\sum_{j\ge0}\varPsi_{j}(u)=1\) for u∈ℝ+. By Corollary 3.9 we have the following Littlewood-Paley decomposition

$$ f= \sum_{j\ge0} \varPsi_j(\sqrt{L})f \quad\hbox{for }f\in{{\mathbb{L}}}^p,\quad 1\le p\le\infty\quad \bigl({{\mathbb{L}}}^\infty :=\mathrm{UCB} \bigr). $$
(5.2)

From above it follows that

$$ \frac{1}{2} \le\sum_{j\ge0} \varPsi_j^2(u) \leq1, \quad u\in\mathbb{R}_+, $$
(5.3)

and since \(\|\varPsi_{j}(\sqrt{L}) f\|_{2}^{2} = \langle\varPsi_{j}(\sqrt{L}) f, \varPsi_{j}(\sqrt{L}) f \rangle = \langle\varPsi_{j}^{2}(\sqrt{L}) f, f \rangle\), we get

$$\sum_{j \ge0} \big\|\varPsi_j(\sqrt{L}) f \big\|_2^2 = \int_0^\infty \sum_{j\ge0} \varPsi_j^2(u) d \langle F_u f,f\rangle, $$

and using (5.3) we arrive at

$$ \frac{1}{2} \| f\|_2^2 \le\sum _{j \ge0} \big\|\varPsi_j(\sqrt{L}) f \big\|_2^2 \le \| f\|_2^2, \quad f \in{{\mathbb{L}}}^2. $$
(5.4)
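
The two facts about the functions \(\varPsi_{j}\) used above, namely that they sum to 1 and that (5.3) holds, can be checked directly from (5.1). The following is a sketch. The partial sums telescope:

$$ \sum_{j=0}^{J}\varPsi_j(u) = \varPhi(u)+\sum_{j=1}^{J}\bigl[\varPhi\bigl(b^{-j}u\bigr)-\varPhi\bigl(b^{-j+1}u\bigr)\bigr] = \varPhi\bigl(b^{-J}u\bigr), $$

which equals 1 as soon as \(b^{J}\ge u\). Moreover, by the support properties, for each fixed u at most two consecutive terms, say \(\varPsi_{j}(u)\) and \(\varPsi_{j+1}(u)\), are nonzero, and they sum to 1. Since \(0\le\varPsi_{j}\le1\),

$$ \frac{1}{2} = \frac{1}{2}\bigl(\varPsi_j(u)+\varPsi_{j+1}(u)\bigr)^2 \le \varPsi_j^2(u)+\varPsi_{j+1}^2(u) \le \varPsi_j(u)+\varPsi_{j+1}(u) = 1 . $$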

Here we introduce a constant 0<ε<1 that is sufficiently small and will be specified later on in (5.16). Choose 0<γ<1 so that

$$ {K(\sigma_*)}\gamma^{\alpha}{c^\diamond}= { \varepsilon}/3, $$
(5.5)

where \(K(\sigma_{*})\) is the constant from (4.1)–(4.2) with \(\sigma_{*}:=2d+1\), and \(c^{\diamond}:=2^{2d+1}\). For any j≥0 let \({\mathcal{X}}_{j} \subset M\) be a maximal \(\delta_{j}\)-net on M (see Proposition 2.5) with \(\delta_{j}:=\gamma b^{-j-2}\) and suppose \(\{A_{\xi}^{j}\}_{\xi\in{\mathcal{X}}_{j}}\) is a companion disjoint partition of M consisting of measurable sets such that \(B(\xi, {{\delta}_{j}}/2) \subset A_{\xi}^{j} \subset B(\xi,{{\delta }_{j}})\), \(\xi\in {\mathcal{X}}_{j}\), as in Proposition 2.5. By Theorem 4.2 we have

$$ (1-{\varepsilon})\| f \|_2^2 \le\sum _{\xi\in{\mathcal{X}}_j} \big| A_\xi^j\big| \big| f(\xi)\big|^2 \le(1+{\varepsilon})\| f \|_2^2 \quad \hbox{for } f\in\varSigma_{b^{j+2}}^2. $$
(5.6)

By the definition of Ψ j it follows that \(\varPsi_{j}(\sqrt{L}) f \in\varSigma^{2}_{b^{j+1}}\) for \(f\in{{\mathbb{L}}}^{2}\), and hence (5.4) and (5.6) imply

$$ \frac{1}{4} \|f\|_2^2 \le \sum _{j \geq0} \sum_{\xi\in{\mathcal{X}}_j} \big|A^j_\xi\big| \big| \varPsi_j(\sqrt{L}) f (\xi) \big|^2 \leq2\| f\|_2^2, \quad f\in{{\mathbb{L}}}^2. $$
(5.7)

Note that

Consider the system \(\{\psi_{j\xi}\}\) defined by

$$ \psi_{j \xi} (x):= \big|A^{j}_\xi\big|^{1/2} \varPsi_j(\sqrt{L}) (x,\xi), \quad\xi\in{\mathcal{X}}_j, j \ge0. $$
(5.8)

From the above observation and (5.7) it follows that \(\{\psi_{j \xi}: \xi\in{\mathcal{X}}_{j}, j\ge0\}\) is a frame for \({{\mathbb{L}}}^{2}\).

We next record the main properties of this system.

Proposition 5.1

(a) Localization: For any σ>0 there exists a constant \(c_{\sigma}>0\) such that for any \(\xi\in{\mathcal{X}}_{j}\), j≥0, we have

$$ \big|\psi_{j \xi} (x)\big| \le c_\sigma \big|B\bigl(\xi, b^{-j}\bigr)\big|^{-1/2}\bigl(1+b^j \rho(x, \xi)\bigr)^{-\sigma } $$
(5.9)

and if \(\rho(x,y)\le b^{-j}\)

$$ \big|\psi_{j \xi} (x)- \psi_{j \xi} (y)\big| \le c_\sigma \big|B\bigl(\xi, b^{-j} \bigr)\big|^{-1/2}\bigl(b^j\rho(x, y)\bigr)^\alpha \bigl(1+b^j\rho(x, \xi)\bigr)^{-\sigma }, \quad\alpha>0. $$
(5.10)

(b) Norms:

$$ \|\psi_{j \xi}\|_p \sim\big|B\bigl(\xi, b^{-j}\bigr)\big|^{\frac{1}{p}-\frac{1}{2}}, \quad0< p \leq\infty. $$
(5.11)

The constants involved in the previous equivalence depend on p.

(c) Spectral localization: \(\psi_{0\xi}\in\varSigma_{b}^{p}\) if \(\xi\in{\mathcal{X}}_{0}\) and \(\psi_{j\xi}\in\varSigma_{[b^{j-1}, b^{j+1}]}^{p}\) if \(\xi\in{\mathcal{X}}_{j}\), j≥1, 0<p≤∞.

(d) The system \(\{\psi_{j\xi}\}\) is a frame for \({{\mathbb{L}}}^{2}\), namely,

$$ 4^{-1}\|f\|_2^2 \le \sum _{j \geq0} \sum_{\xi\in{\mathcal{X}}_j}\big | \langle f, \psi_{j \xi }\rangle\big|^2 \leq2\| f \|_2^2, \quad\forall f\in{{\mathbb{L}}}^2. $$
(5.12)

Proof

Estimates (5.9) and (5.10) follow by Theorem 3.4; (5.11) follows by Theorem 3.18. The spectral localization is obvious by the definition. Estimates (5.12) follow by (5.7). □

5.2 Dual Frame

Our next (nontrivial) step is to construct a dual frame \(\{\widetilde{\psi}_{j \xi}\}\) to \(\{\psi_{j\xi}\}\) with elements of similar space and spectral localization. We begin this construction by introducing two new cut-off functions by “stretching” \(\varPsi_{0}\) and \(\varPsi_{1}\) from Sect. 5.1:

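$$ \varGamma_0(u) := \varPhi\bigl(b^{-1}u\bigr) \quad\hbox{and}\quad \varGamma_1(u) := \varPhi\bigl(b^{-2}u\bigr) - \varPhi(bu). $$
(5.13)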

Note that \(\operatorname{supp}\varGamma_{0} \subset[0, b^{2}]\), \(\varGamma_{0}(u)=1\) for u∈[0,b], \(\operatorname{supp}\varGamma_{1} \subset[b^{-1}, b^{3}]\), \(\varGamma_{1}(u)=1\) for \(u\in[1,b^{2}]\), and \(0\le\varGamma_{0},\varGamma_{1}\le1\). Therefore,

$$ \varGamma_0(u)\varPsi_0(u) = \varPsi_0(u), \quad\quad \varGamma_1(u)\varPsi_1(u) = \varPsi_1(u). $$
(5.14)

We shall also use the cut-off function \(\varTheta(u):=\varPhi(b^{-3}u)\). Note that \(\operatorname{supp}\varTheta \subset[0, b^{4}]\), \(\varTheta(u)=1\) for \(u\in[0,b^{3}]\), and Θ≥0. Hence, \(\varTheta(u)\varGamma_{j}(u)=\varGamma_{j}(u)\), j=0,1.

Parameter σ

The dual frame under construction will depend on a parameter σ>2d+1 that can be selected as large as we wish. It will govern the localization properties of the dual frame elements.

With σ>2d+1 already selected we next record the localization properties of the operators generated by the above selected functions. Let \(f=\varGamma_{0}\) or \(f=\varGamma_{1}\) or f=Θ. Then by Corollary 3.5 there exists a constant \(c_{\sigma}>1\) such that for δ>0 and 0≤m≤σ we have

$$ \big|L^m f(\delta\sqrt{L}) (x, y)\big| \le c_\sigma \delta^{-2m}D_{\delta, 2\sigma }(x, y). $$
(5.15)

We now select the constant 0<ε<1 so that

$$ \frac{1}{2{\varepsilon}} = c_\sigma^3 2^{8\sigma +9d+10}. $$
(5.16)

Recall that the constant γ, which depends on ε, was defined in (5.5) so that K(σ )\(\gamma^{\alpha}\)c =ε/3.

The next lemma will be instrumental in the construction of the dual frame.

Lemma 5.2

Given λ≥1, let \({\mathcal{X}}_{\delta}\) be a maximal δ-net on M with \(\delta:=\gamma\lambda^{-1}b^{-3}\) and suppose \(\{A_{\xi}\}_{\xi\in{\mathcal{X}}_{\delta}}\) is a companion disjoint partition of M consisting of measurable sets such that \(B(\xi,\delta/2)\subset A_{\xi}\subset B(\xi,\delta)\), \(\xi \in{\mathcal{X}}_{\delta}\), just as in Proposition 2.5. Set \(\kappa_{\xi}:= \frac{1}{1+\varepsilon}|A_{\xi}|\sim|B(\xi, \delta)|\). Let \(\varGamma=\varGamma_{0}\) or \(\varGamma=\varGamma_{1}\). Then there exists an operator \(T_{\lambda}: {{\mathbb{L}}}^{2}\to{{\mathbb{L}}}^{2}\) of the form \(T_{\lambda}= \operatorname{Id}+ S_{\lambda}\) such that

(a)

$$\|f \|_2\le\|T_\lambda f\|_2 \le\frac{1}{1-2 \varepsilon}\|f \|_2 \quad\forall f \in{{\mathbb{L}}}^2. $$

(b) \(L^{m}S_{\lambda}\) with 0≤m≤σ is an integral operator with kernel \(L^{m}S_{\lambda}(x,y)\) verifying

$$\big|L^mS_\lambda(x,y)\big| \le c\lambda^{2m} D_{\lambda^{-1}, \sigma }(x,y),\quad x,y\in M. $$

(c) \(S_{\lambda}({{\mathbb{L}}}^{2})\subset\varSigma_{\lambda b^{2}}^{2}\) if Γ=Γ 0 and \(S_{\lambda}({{\mathbb{L}}}^{2})\subset\varSigma_{[\lambda b^{-1}, \lambda b^{3}]}^{2}\) if Γ=Γ 1.

(d) For any \(f\in{{\mathbb{L}}}^{2}\) such that \(\varGamma(\lambda^{-1} \sqrt{L})f =f\) we have

$$ f(x) = \sum_{\xi\in{\mathcal{X}}_{\delta}} \kappa_\xi f(\xi) T_\lambda \bigl[\varGamma_\lambda( \cdot, \xi)\bigr](x), \quad x\in M, $$
(5.17)

where \(\varGamma_{\lambda}(\cdot,\cdot)\) is the kernel of the operator \(\varGamma_{\lambda}:=\varGamma (\lambda^{-1} \sqrt{L})\).

Proof

By Theorem 4.2 we have

$$(1-{\varepsilon})\|f\|^2_2 \le\sum _{\xi\in{\mathcal{X}}_{\delta}} |A_\xi|\big|f(\xi)\big|^2 \le(1+{ \varepsilon})\| f \|^2_2 \quad\hbox{for}\ f\in \varSigma_{\lambda b^3}^2, $$

and setting \(\kappa_{\xi}:= \frac{1}{1+\varepsilon}|A_{\xi}|\) we get

$$ (1- 2{\varepsilon})\|f\|^2_2\le\sum _{\xi\in{\mathcal{X}}_{\delta }} \kappa_\xi\big|f(\xi )\big|^2 \le\| f \|^2_2 \quad\hbox{for}\ f\in\varSigma_{\lambda b^3}^2. $$
(5.18)
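For the reader's convenience, the passage from the constants \(1\pm\varepsilon\) to the constants in (5.18) can be spelled out. Dividing the preceding inequality by \(1+\varepsilon\) gives the upper bound 1 at once, while the lower bound uses the elementary estimate

$$ \frac{1-\varepsilon}{1+\varepsilon} \ge 1-2\varepsilon, \quad\hbox{which is equivalent to}\quad 1-\varepsilon \ge (1-2\varepsilon) (1+\varepsilon) = 1-\varepsilon-2\varepsilon^2. $$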

Denote briefly \(\varTheta_{\lambda}:=\varTheta(\lambda^{-1} \sqrt{L})\) and let \(\varTheta_{\lambda}(\cdot,\cdot)\) be the kernel of this operator. Consider now the positive self-adjoint operator \(U_{\lambda}\) with kernel

$$U_\lambda(x,y) = \sum_{\xi\in{\mathcal{X}}_{\delta}} \kappa_\xi\varTheta_\lambda (x,\xi)\varTheta_\lambda( \xi,y). $$

By (5.15) \(|\varTheta_{\lambda}(x,y)| \le c_{\sigma}D_{\lambda^{-1}, 2\sigma}(x,y)\) for x,y∈M. Therefore, taking into account that \(\delta=\gamma\lambda^{-1}b^{-3}<\lambda^{-1}\) and 2σ>2d+1 we can apply (2.22) to obtain

$$ \big|U_\lambda(x,y)\big| \le c_\sigma c_\sharp D_{\lambda^{-1}, 2\sigma}(x,y), \quad c_\sharp:=2^{2\sigma +3d+3}. $$
(5.19)

Also, if \(f\in\varSigma_{\lambda b^{3}}^{2}\), then \(\langle U_{\lambda}f, f \rangle= \sum_{\xi\in{\mathcal{X}}_{\delta }} \kappa_{\xi}\ | f(\xi)|^{2} \) and hence, using (5.18),

$$ (1-2\varepsilon) \| f \|_2^2 \leq\langle U_\lambda f, f \rangle\leq \|f \|_2^2 \quad \hbox{for } f\in\varSigma_{\lambda b^3}^2. $$
(5.20)

Denote briefly \(\varGamma_{\lambda}:= \varGamma (\lambda^{-1} \sqrt{L})\) and let \(\varGamma_{\lambda}(\cdot,\cdot)\) be the kernel of this operator (recall that \(\varGamma=\varGamma_{0}\) or \(\varGamma=\varGamma_{1}\)). We define yet another self-adjoint kernel operator by

$$R_\lambda:= \varGamma_\lambda(\operatorname{Id}- U_\lambda) \varGamma_\lambda = \varGamma_\lambda^2 - \varGamma_\lambda U_\lambda\varGamma_\lambda. $$

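Set \(V_{\lambda}:=\varGamma_{\lambda}U_{\lambda}\varGamma_{\lambda}\) and denote by \(V_{\lambda}(\cdot,\cdot)\) its kernel. Since Θ(u)Γ(u)=Γ(u), we have

$$ V_\lambda(x,y) = \sum_{\xi\in{\mathcal{X}}_{\delta}} \kappa_\xi\varGamma_\lambda(x,\xi)\varGamma_\lambda(\xi,y). $$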

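Now, by (5.15), (2.11), and (5.19) it follows that for 0≤m≤σ

$$ \big|L^mR_\lambda(x,y)\big| \le \big|L^m\varGamma_\lambda^2(x,y)\big| + \big|L^m\varGamma_\lambda U_\lambda\varGamma_\lambda(x,y)\big| \le \bigl(c_\sigma^2 c_\star + c_\sigma^3 c_\sharp c_\star^2\bigr) \lambda^{2m}D_{\lambda^{-1}, 2\sigma}(x,y). $$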

Here \(c_{\star}:=2^{2\sigma+2d+2}\) and as above \(c_{\sharp}:=2^{2\sigma+3d+3}\). To simplify our notation we set \(C_{\sigma}:=2c_{\sigma}^{3} c_{\sharp}c_{\star}^{2}= c_{\sigma}^{3}2^{6\sigma +7d+8}\). Thus we have

$$ \big|L^mR_\lambda(x,y)\big| \le C_\sigma \lambda^{2m}D_{\lambda^{-1}, 2\sigma}(x,y), \quad0\le m\le\sigma. $$
(5.21)

By the definition of \(R_{\lambda}\) we have

$$\langle R_\lambda f, f \rangle = \|\varGamma_\lambda f \|_2^2 - \langle U_\lambda\varGamma_\lambda f , \varGamma_\lambda f \rangle \quad\hbox{for $f\in{{\mathbb{L}}}^{2}$.} $$

Since \(\varGamma_{\lambda}({{\mathbb{L}}}^{2}) \subset\varSigma^{2}_{\lambda b^{3}}\), we have \(\varTheta_{\lambda}\varGamma_{\lambda}f=\varGamma_{\lambda}f\), and by (5.20)

$$(1-2\varepsilon) \| \varGamma_\lambda f \|^2_2 \le\langle U_\lambda\varGamma_\lambda f , \varGamma_\lambda f \rangle \le\| \varGamma_\lambda f \|^2_2, \quad f\in{{\mathbb{L}}}^2. $$

Therefore,

$$0 \leq\langle R_\lambda f, f \rangle \le2{\varepsilon}\| \varGamma_\lambda f \|^2_2 \le2{\varepsilon}\| f \|_2^2, \quad f\in{{\mathbb{L}}}^2, $$

where for the last inequality we used that \(\|\varGamma\|_{\infty}\le1\). Consequently,

$$\| R_\lambda\|_{2\rightarrow2} \leq2{\varepsilon}<1 \quad\hbox{and}\quad (1-2{\varepsilon})\|f\|_2 \le\big\|( \operatorname{Id}-R_\lambda) f\big\|_2 \le\| f \|_2, \quad f\in{{\mathbb{L}}}^2. $$

We now define \(T_{\lambda}:=(\operatorname{Id}- R_{\lambda})^{-1} = \operatorname{Id}+ \sum_{k\ge1} R_{\lambda}^{k} =: \operatorname{Id}+ S_{\lambda}\). Clearly

$$ \|f\|_2 \le\|T_\lambda f \|_2 \le \frac{1}{1-2{\varepsilon}}\|f \|_2 \quad\forall f \in{{\mathbb{L}}}^2. $$
(5.22)
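Inequality (5.22) is a direct restatement of the two-sided bound for \(\operatorname{Id}-R_{\lambda}\) obtained above. Indeed, writing \(g:=T_{\lambda}f\), so that \((\operatorname{Id}-R_{\lambda})g=f\), that bound applied to g reads

$$ (1-2{\varepsilon})\|T_\lambda f\|_2 \le \|f\|_2 \le \|T_\lambda f\|_2, \quad f\in{{\mathbb{L}}}^2. $$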

If \(\varGamma_{\lambda}f=f\), then

$$f= T_\lambda(f -R_\lambda f) = T_\lambda ( f- \varGamma_\lambda f + V_\lambda f ) = T_\lambda V_\lambda f. $$

On the other hand, if \(\varGamma_{\lambda}f=f\), then \((V_{\lambda}f)(x) = \sum_{\xi\in{\mathcal{X}}_{\delta}} \kappa_{\xi}f(\xi) \varGamma_{\lambda}(x,\xi) \) and hence

$$ f(x) = \sum_{\xi\in{\mathcal{X}}_{\delta}} \kappa_\xi f(\xi) T_\lambda \bigl[\varGamma_\lambda(\cdot,\xi)\bigr](x). $$
(5.23)

Note that by construction

$$ S_\lambda: {{\mathbb{L}}}^2 \mapsto \varSigma^2_{\lambda b^3} \quad\hbox{if } \varGamma = \varGamma_0\quad \mbox{and} \quad S_\lambda: {{\mathbb{L}}}^2 \mapsto\varSigma^2_{[\lambda b^{-1}, \lambda b^3]} \quad\hbox{if $\varGamma =\varGamma_{1}$.} $$
(5.24)

It remains to establish the space localization of the kernel \(L^{m}S_{\lambda}(x,y)\) of the operator \(L^{m}S_{\lambda}\). Our method borrows from [37]. Consider first the case m=0. Denoting by \(R_{\lambda}^{k}(x,y)\) the kernel of \(R_{\lambda}^{k}\), we have

$$\big|S_\lambda(x,y)\big | \le\sum_{k\ge1} \big|R_\lambda^k(x,y)\big|. $$

But since \(R_{\lambda}^{k} = \varTheta_{\lambda}R_{\lambda}^{k} \varTheta_{\lambda}\) we get by (5.15) with f=Θ and the fact that \(\|R_{\lambda}\|_{2\to2}\le2\varepsilon\), applying Proposition 2.9,

$$ \big |R_\lambda^k(x,y)\big| \le\frac{c_dc_\sigma^2 \|R_\lambda\|^k_{2\rightarrow2}}{ \sqrt{|B(x, \lambda^{-1})| |B(y, \lambda^{-1})|}} \le\frac{(2{\varepsilon})^k c_dc_\sigma^2}{\sqrt{|B(x, \lambda^{-1})| |B(y, \lambda^{-1})|}}, $$
(5.25)

where \(c_{d}:=2^{4d+4}\). On the other hand, applying repeatedly estimate (2.11) k times using (5.21) with m=0 we obtain

$$ \big|R_\lambda^k(x,y)\big|\le C_\sigma^k c_\star^{k-1} D_{\lambda^{-1}, 2\sigma}(x,y), \quad c_\star=2^{2\sigma +2d+2}. $$
(5.26)

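Therefore, for any K∈ℕ

$$ \big|S_\lambda(x,y)\big| \le \sum_{k=1}^{K-1} C_\sigma^k c_\star^{k-1} D_{\lambda^{-1}, 2\sigma}(x,y) + \sum_{k\ge K} \frac{(2{\varepsilon})^k c_dc_\sigma^2}{\sqrt{|B(x, \lambda^{-1})| |B(y, \lambda^{-1})|}}. $$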

Choose K≥1 so that \((\frac{1}{2{\varepsilon}})^{K-1} \le(1+ \lambda\rho(x,y))^{\sigma}<(\frac {1}{2{\varepsilon}})^{K}\) and note that \(\frac{1}{2{\varepsilon}} = c_{\star}C_{\sigma}\) by (5.16). Then from above we get

$$ \big |S_\lambda(x,y)\big| \le\frac{4C_\sigma }{\sqrt{|B(x, \lambda^{-1})| |B(y, \lambda^{-1})|}} \frac{1}{ \bigl(1+ \lambda\rho(x,y)\bigr)^\sigma} = 4C_\sigma D_{\lambda^{-1}, \sigma} (x,y). $$
(5.27)
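The identity \(\frac{1}{2\varepsilon}=c_{\star}C_{\sigma}\) used in this step can be checked directly from the values of the constants recorded in (5.16), (5.19), and (5.26):

$$ C_\sigma = 2c_\sigma^3\cdot 2^{2\sigma+3d+3}\cdot 2^{2(2\sigma+2d+2)} = c_\sigma^3 2^{6\sigma+7d+8}, \quad\quad c_\star C_\sigma = 2^{2\sigma+2d+2}\cdot c_\sigma^3 2^{6\sigma+7d+8} = c_\sigma^3 2^{8\sigma+9d+10} = \frac{1}{2{\varepsilon}}. $$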

Let 1≤m≤σ. Since \(L^{m} R_{\lambda}^{k} = L^{m}\varTheta_{\lambda}R_{\lambda}^{k} \varTheta_{\lambda}\), with slight modification of the argument above, (5.15) implies that (5.25) holds for the kernel \(L^{m}R_{\lambda}^{k}(\cdot,\cdot)\) with an additional factor \(\lambda^{2m}\) on the right. On the other hand (5.21) implies that estimate (5.26) also holds for \(L^{m}R_{\lambda}^{k}(\cdot,\cdot)\) with an additional factor \(\lambda^{2m}\) on the right. Then proceeding exactly as above it follows that estimate (5.27) holds for \(L^{m}S_{\lambda}(\cdot,\cdot)\) with an additional factor \(\lambda^{2m}\) on the right. This completes the proof of the lemma. □
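The mechanism of Lemma 5.2 (sampling inequality, the operator \(R_{\lambda}\), and the Neumann series for \(T_{\lambda}\)) can be tried out in a finite-dimensional toy model. The sketch below is only an illustration under assumptions that are not from this paper: the space is the circle, the spectral space is the span of the exponentials with |k|≤n (so that Γ and Θ act as the identity on it and the cut-offs are sharp rather than smooth), and the sample points are a jittered uniform grid. All variable names are hypothetical.

```python
import numpy as np

# Toy model (an assumption of this sketch): the "spectral space" is the span of
# e^{ikt}/sqrt(2 pi), |k| <= n, on the circle; Gamma and Theta act as the identity on it.
n, m = 4, 128
freqs = np.arange(-n, n + 1)
dim = freqs.size
rng = np.random.default_rng(1)
h = 2 * np.pi / m
xi = h * (np.arange(m) + rng.uniform(-0.25, 0.25, m))      # jittered, still ordered, points

# |A_xi|: length of the arc of points closest to xi (these arcs partition the circle).
cell = np.mod(np.roll(xi, -1) - np.roll(xi, 1), 2 * np.pi) / 2

E = np.exp(1j * np.outer(xi, freqs)) / np.sqrt(2 * np.pi)   # E @ c = samples f(xi)
U0 = E.conj().T @ (cell[:, None] * E)                       # <U0 c, c> = sum |A_xi| |f(xi)|^2
eps = np.linalg.norm(np.eye(dim) - U0, 2)                   # plays the role of epsilon

kappa = cell / (1 + eps)                                    # kappa_xi = |A_xi| / (1 + eps)
U = E.conj().T @ (kappa[:, None] * E)
R = np.eye(dim) - U                                         # R = Gamma (Id - U) Gamma here

# T = (Id - R)^{-1} = Id + sum_{k >= 1} R^k, computed by a truncated Neumann series.
T = np.eye(dim, dtype=complex)
P = np.eye(dim, dtype=complex)
for _ in range(200):
    P = P @ R
    T = T + P

c = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
samples = E @ c                                             # the values f(xi)
c_rec = T @ (E.conj().T @ (kappa * samples))                # sum kappa_xi f(xi) T[Gamma(., xi)]
err = np.linalg.norm(c_rec - c)
eig_R = np.linalg.eigvalsh(R)
```

In this model the eigenvalues of R lie in [0, 2ε] and the coefficient vector is recovered from the samples, which is the finite-dimensional counterpart of (5.17).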

Armed with this lemma we can now complete the construction of the dual frame. We shall utilize the functions and operators introduced in Sect. 5.1 and above.

Denote briefly \(\varGamma_{\lambda _{0}}:= \varGamma_{0}(\sqrt{L})\) and \(\varGamma_{\lambda _{j}}:= \varGamma_{1}(b^{-j+1}\sqrt{L})\) for j≥1, \(\lambda_{j}:=b^{j-1}\). Observe that since \(\varGamma_{0}(u)=1\) for u∈[0,b] and \(\varGamma_{1}(u)=1\) for \(u\in[1,b^{2}]\), then \(\varGamma_{\lambda _{0}}(\varSigma_{b}^{2})=\varSigma_{b}^{2}\) and \(\varGamma_{\lambda _{j}}(\varSigma_{[b^{j-1}, b^{j+1}]}^{2}) = \varSigma_{[b^{j-1}, b^{j+1}]}^{2}\), j≥1. On the other hand, it is readily seen that \(\varPsi_{0}(\cdot, y) \in\varSigma_{b}^{2}\) and \(\varPsi_{j}(\cdot, y) \in\varSigma_{[b^{j-1}, b^{j+1}]}^{2}\) if j≥1. Therefore, we can apply Lemma 5.2 with \({\mathcal{X}}_{j}\) and \(\{A_{\xi}^{j}\}_{\xi\in{\mathcal{X}}_{j}}\) from Sect. 5.1, and \(\lambda=\lambda_{j}=b^{j-1}\) to obtain

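$$ \varPsi_j(\sqrt{L}) (x, y) = \sum_{\xi\in{\mathcal{X}}_j} \frac{1}{1+{\varepsilon}}\big|A_\xi^j\big| \varPsi_j(\sqrt{L}) (\xi, y) T_{\lambda_j} \bigl[\varGamma_{\lambda_j}(\cdot, \xi)\bigr](x). $$
(5.28)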

By (5.8) we have \(\psi_{j\xi}(x)= |A_{\xi}^{j}|^{1/2}\varPsi_{j}(\xi, x)\) and we now set

$$ \widetilde{\psi}_{j\xi}(x):=c_{\varepsilon}\big|A_\xi^j\big|^{1/2}T_{\lambda_j} \bigl[\varGamma_{\lambda_j}(\cdot, \xi)\bigr](x), \quad\xi\in{ \mathcal{X}}_j, \quad c_{\varepsilon}:=(1+{\varepsilon})^{-1}. $$
(5.29)

Thus \(\{\widetilde{\psi}_{j\xi}: \xi\in{\mathcal{X}}_{j}, j\ge0\}\) is the desired dual frame. Observe immediately that (5.28) takes the form

$$ \varPsi_j(\sqrt{L}) (x, y)= \sum _{\xi\in{\mathcal{X}}_j} \psi_{j\xi}(y)\widetilde{\psi}_{j\xi}(x). $$
(5.30)

We next record the main properties of the dual frame \(\{\widetilde{\psi}_{j\xi}\}\).

Theorem 5.3

(a) Representation: For any \(f\in{{\mathbb{L}}}^{p}\), 1≤p≤∞, we have

$$ f = \sum_{j\ge0}\sum _{\xi\in{\mathcal{X}}_j} \langle f, \widetilde{\psi}_{j\xi }\rangle \psi_{j\xi} = \sum_{j\ge0}\sum _{\xi\in{\mathcal{X}}_j} \langle f, \psi_{j\xi }\rangle \widetilde{\psi}_{j\xi} \quad\hbox{in } {{\mathbb{L}}}^p. $$
(5.31)

(b) Frame: The system \(\{\widetilde{\psi}_{j \xi}\}\) as well as \(\{\psi_{j\xi}\}\) is a frame for \({{\mathbb{L}}}^{2}\), namely, there exists a constant c>0 such that

$$ c^{-1}\|f\|_2^2 \le \sum _{j \geq0} \sum_{\xi\in{\mathcal{X}}_j} \big| \langle f, \widetilde{\psi}_{j \xi }\rangle\big|^2 \leq c\| f \|_2^2, \quad\forall f\in{{\mathbb{L}}}^2. $$
(5.32)

(c) Space localization: For any \(\xi\in{\mathcal{X}}_{j}\), j≥0, and 0≤m≤σ

$$ \big|L^m\widetilde{\psi}_{j \xi} (x)\big| \le c_\sigma b^{2jm}\big|B\bigl(\xi, b^{-j} \bigr)\big|^{-1/2}\bigl(1+b^j\rho(x, \xi )\bigr)^{-\sigma }, $$
(5.33)

and if \(\rho(x,y)\le b^{-j}\)

$$ \big|\widetilde{\psi}_{j \xi} (x)- \widetilde{\psi}_{j \xi} (y)\big| \le c_\sigma \big|B\bigl(\xi, b^{-j} \bigr)\big|^{-1/2}\bigl(b^j\rho(x, y)\bigr)^\alpha \bigl(1+b^j\rho(x, \xi)\bigr)^{-\sigma }. $$
(5.34)

Here σ>2d+1 is the parameter of the dual frame selected in the beginning of Sect. 5.2.

(d) Spectral localization: \(\widetilde{\psi}_{0\xi}\in\varSigma_{b}^{p}\) if \(\xi\in{\mathcal{X}}_{0}\) and \(\widetilde{\psi}_{j\xi}\in\varSigma_{[b^{j-2}, b^{j+2}]}^{p}\) if \(\xi\in {\mathcal{X}}_{j}\), j≥1, d/σ<p≤∞.

(e) Norms:

$$ \|\widetilde{\psi}_{j \xi}\|_p \sim\big|B\bigl(\xi, b^{-j}\bigr)\big|^{\frac{1}{p}-\frac{1}{2}} \quad\hbox{for } d/\sigma < p \le \infty. $$
(5.35)

Proof

By the definition of \(\widetilde{\psi}_{j\xi}\) in (5.29) and Lemma 5.2 we have

$$\widetilde{\psi}_{j\xi}(x) :=c_{\varepsilon}\big|A_\xi^j\big|^{1/2}T_{\lambda_j} \bigl[\varGamma_{\lambda _j}(\cdot, \xi)\bigr](x) = c_{\varepsilon}\big|A_\xi^j\big|^{1/2} \bigl[\varGamma_{\lambda_j}(x, \xi) +S_{\lambda_j} \bigl[ \varGamma_{\lambda_j}(\cdot, \xi)\bigr](x) \bigr]. $$

Then estimate (5.33) follows from the localization of \(L^{m}\varGamma_{\lambda _{j}}(\cdot, \cdot)\) given by (5.15), Lemma 5.2(b), and (2.11). Estimate (5.34) follows by the fact that \(\varGamma_{\lambda _{j}}(\cdot, \cdot)\) is Lip α, given by Theorem 3.4, and the localization of \(S_{\lambda _{j}}(\cdot, \cdot)\), given in Lemma 5.2(b), exactly as in the proof of Theorem 3.1.

To establish representation (5.31) we note that (5.30), (5.33), and (2.22) readily imply \(\sum_{\xi\in{\mathcal{X}}_{j}} |\psi_{j\xi}(y)||\widetilde{\psi}_{j\xi}(x)| \le c D_{b^{-j}, \sigma -d}(x, y)\). Then (5.31) follows by (5.2) and (5.30).

The estimate

$$ \|\widetilde{\psi}_{j\xi}\|_p \le c\big|B \bigl(\xi, b^{-j}\bigr)\big|^{\frac{1}{p} - \frac{1}{2}} \quad\hbox{for } d/ \sigma <p\le\infty $$
(5.36)

follows by (5.33) and (2.12). On the other hand, Lemma 5.2(a) and Theorem 3.18 yield

$$\|\widetilde{\psi}_{j\xi}\|_2 \ge c \big|B\bigl(\xi, b^{-j}\bigr)\big|^{1/2} \big\|\varGamma_{\lambda_j}(\cdot, \xi)\big\|_2 \ge c'>0. $$

From this and (5.36) one easily derives \(\|\widetilde{ \psi}_{j\xi}\|_{p} \ge c|B(\xi, b^{-j})|^{\frac{1}{p} - \frac{1}{2}}\) for 0<p≤∞ (see the proof of Theorem 3.18).

For the proof of (5.32) we shall employ the following lemma which will be instrumental in the development of Besov spaces later on as well.

Lemma 5.4

(a) For any \(f\in{{\mathbb{L}}}^{p}\), 1≤p≤∞,

$$ \biggl(\sum_{\xi\in{\mathcal{X}}_j}\big\|\langle f, \widetilde{\psi}_{j\xi} \rangle \psi_{j\xi}\big\|_p^p \biggr)^{1/p} \le c\|f\|_p, \quad\forall j\ge0. $$
(5.37)

(b) For any sequence of complex numbers \(\{a_{\xi}\}_{\xi\in {\mathcal{X}} _{j}}\), j≥0, and 1≤p≤∞,

$$ \bigg \|\sum_{\xi\in{\mathcal{X}}_j} a_\xi \psi_{j\xi} \bigg\|_p \le c \biggl(\sum _{\xi\in{\mathcal{X}}_j} \|a_\xi\psi_{j\xi} \|_p^p \biggr)^{1/p}. $$
(5.38)

Above each of the p-norms is replaced by the sup-norm when p=∞. Also (a) and (b) hold with the roles of \(\{\psi_{j\xi}\}\) and \(\{ \widetilde{\psi}_{j\xi}\}\) interchanged. The constant c>0 is independent of f, \(\{a_{\xi}\}\), and j.

Proof

We shall need the following simple inequalities

$$ \sum_{\xi\in{\mathcal{X}}_j}\big|\widetilde{\psi}_{j\xi}(x)\big|\|\psi_{j\xi }\|_1 \le c \quad \hbox{and}\quad \sum_{\xi\in{\mathcal{X}}_j} \big|\psi_{j\xi}(x)\big| \|\psi_{j\xi}\|_1 \le c, \quad x\in M, $$
(5.39)

where the roles of \(\{\psi_{j\xi}\}\) and \(\{\widetilde{\psi}_{j\xi}\}\) can be switched. Using (5.33) with m=0 and (5.11) we obtain

$$\sum_{\xi\in{\mathcal{X}}_j}\big|\widetilde{\psi}_{j\xi}(x)\big|\| \psi_{j\xi }\|_1 \le c\sum_{\xi\in{\mathcal{X}}_j} \bigl(1+b^j\rho(x, \xi)\bigr)^{-\sigma} \le c<\infty, $$

where for the last inequality we used (2.20) and the fact that σ≥2d+1. This gives the left-hand side inequality in (5.39). The proof of the other inequality is the same.

Estimate (5.37) is immediate from (5.39) when p=1. In the case p=∞ (5.37) follows readily by the inequality \(\|\widetilde{\psi}_{j\xi}\|_{1}\|\psi_{j\xi}\|_{\infty}\le c<\infty\) which is a consequence of (5.11) and (5.35).

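To prove (5.37) in the case 1<p<∞ we just apply Hölder’s inequality (1/p+1/p′=1) and obtain

$$ \big\|\langle f, \widetilde{\psi}_{j\xi} \rangle\psi_{j\xi}\big\|_p^p \le \|\psi_{j\xi}\|_p^p \biggl(\int_M \big|f(x)\big| \big|\widetilde{\psi}_{j\xi}(x)\big| d\mu(x) \biggr)^p \le \|\widetilde{\psi}_{j\xi}\|_1^{p-1}\|\psi_{j\xi}\|_p^p \int_M \big|f(x)\big|^p \big|\widetilde{\psi}_{j\xi}(x)\big| d\mu(x). $$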

This coupled with the obvious inequality

$$\|\widetilde{\psi}_{j\xi}\|_1^{p-1}\| \psi_{j\xi}\|_p^p \le\bigl(\|\widetilde{\psi}_{j\xi}\|_1 \|\psi_{j\xi}\|_\infty\bigr)^{p-1} \|\psi_{j\xi}\|_1 \le c\|\psi_{j\xi} \|_1, $$

using \(\|\widetilde{\psi}_{j\xi}\|_{1}\|\psi_{j\xi}\|_{\infty}\le c<\infty\) as above, leads to

$$\sum_{\xi\in{\mathcal{X}}_j}\big\|\langle f, \widetilde{\psi}_{j\xi} \rangle\psi_{j\xi}\big\|_p^p \le c\int _M\big|f(x)\big|^p \sum_{\xi\in{\mathcal{X}}_j}\big| \widetilde{\psi}_{j\xi}(x)\big|\| \psi_{j\xi}\|_1 d\mu(x) \le c\|f\|_p^p. $$

Here we used (5.39). This confirms the validity of (5.37).

We now turn to the proof of (5.38). This inequality is obvious when p=1. In the case p=∞ inequality (5.38) follows easily from the right-hand side inequality in (5.39) and the fact that \(\|\psi_{j\xi}\|_{1}\|\psi_{j\xi}\|_{\infty}\le c<\infty\), see (5.11).

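To prove (5.38) in the case 1<p<∞ we apply the discrete Hölder inequality and the right-hand side inequality in (5.39) to obtain

$$ \bigg|\sum_{\xi\in{\mathcal{X}}_j} a_\xi\psi_{j\xi}(x) \bigg|^p \le \biggl(\sum_{\xi\in{\mathcal{X}}_j} |a_\xi|^p \big|\psi_{j\xi}(x)\big| \|\psi_{j\xi}\|_1^{1-p} \biggr) \biggl(\sum_{\xi\in{\mathcal{X}}_j} \big|\psi_{j\xi}(x)\big| \|\psi_{j\xi}\|_1 \biggr)^{p-1} \le c\sum_{\xi\in{\mathcal{X}}_j} |a_\xi|^p \big|\psi_{j\xi}(x)\big| \|\psi_{j\xi}\|_1^{1-p}. $$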

Integrating both sides we get

$$\bigg\|\sum_{\xi\in{\mathcal{X}}_j} a_\xi\psi_{j\xi} \bigg\|_p^p \le c\sum_{\xi\in{\mathcal{X}}_j} |a_\xi|^p\|\psi_{j\xi}\|_1^{2-p} \le c\sum_{\xi\in{\mathcal{X}}_j} |a_\xi|^p\| \psi_{j\xi}\|_p^p. $$

Here we used that \(\|\psi_{j\xi}\|_{1}^{2-p} \sim\|\psi_{j\xi}\|_{p}^{p}\), which follows by (5.11). The proof of Lemma 5.4 is complete. □

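We are now in a position to complete the proof of Theorem 5.3. From (5.4) applying (5.38) we get

$$ \|f\|_2^2 \le c\sum_{j\ge0} \big\|\varPsi_j(\sqrt{L})f\big\|_2^2 = c\sum_{j\ge0} \bigg\|\sum_{\xi\in{\mathcal{X}}_j} \langle f, \widetilde{\psi}_{j\xi}\rangle\psi_{j\xi} \bigg\|_2^2 \le c\sum_{j\ge0}\sum_{\xi\in{\mathcal{X}}_j} \big|\langle f, \widetilde{\psi}_{j\xi}\rangle\big|^2, $$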

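which confirms the left-hand side inequality in (5.32). For the other direction, we first note that since \(\operatorname {supp}\varPsi_{j} \subset[b^{j-1}, b^{j+1}]\) and \(\widetilde{\psi}_{j\xi} \in\varSigma_{[b^{j-2}, b^{j+2}]}\) we have by (5.2) \(\langle f, \widetilde{\psi}_{j\xi}\rangle = \sum_{\nu=j-2}^{j+2} \langle\varPsi_{\nu}(\sqrt{L})f, \widetilde{\psi}_{j\xi} \rangle \) (here \(\varPsi_{\nu}:=0\) if ν<0) and hence

$$ \sum_{\xi\in{\mathcal{X}}_j} \big|\langle f, \widetilde{\psi}_{j\xi}\rangle\big|^2 \le c\sum_{\nu=j-2}^{j+2}\sum_{\xi\in{\mathcal{X}}_j} \big|\big\langle\varPsi_\nu(\sqrt{L})f, \widetilde{\psi}_{j\xi}\big\rangle\big|^2 \le c\sum_{\nu=j-2}^{j+2} \big\|\varPsi_\nu(\sqrt{L})f\big\|_2^2. $$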

Here we used (5.37). Summing up the above inequalities and using (5.4) we obtain the right-hand side inequality in (5.32). This completes the proof of Theorem 5.3.  □

5.3 Frames in the Case when \(\{\varSigma_{\lambda}^{2}\}\) Possess the Polynomial Property

The construction of frames with the desired excellent space and spectral localization is simple and elegant in the case when the spectral spaces \(\varSigma_{\lambda}^{2}\) have the polynomial property in the sense of the following

Definition 5.5

Let \(\{F_{\lambda}, \lambda\ge0\}\) be the spectral resolution associated with the operator \(\sqrt{L}\); then \(\sqrt{L} = \int_{0}^{\infty}\lambda dF_{\lambda}\). We say that the associated spectral spaces

$$\varSigma_\lambda^2= \bigl\{ f \in{\mathbb{L}}^2: F_\lambda f =f\bigr\} $$

have the polynomial property if there exists a constant κ>1 such that

$$ \varSigma_\lambda^2\cdot\varSigma_\lambda^2 \subset\varSigma^1_{\kappa \lambda}, \quad\hbox{i.e. } f, g\in\varSigma_\lambda^2 \quad\Longrightarrow\quad fg\in \varSigma_{\kappa\lambda}^1. $$
(5.40)

The construction begins with two pairs of cut-off functions \(\varPsi_{0}, \varPsi, \widetilde{\varPsi}_{0}, \widetilde{\varPsi}\in C^{\infty}({\mathbb{R}}_{+})\) with the following properties:

As in Sect. 5.1, here b>1 is the constant from Theorem 3.18. The construction of functions with these properties is quite simple and well-known and will be omitted. It is worth pointing out that given Ψ 0,Ψ, then \(\widetilde{\varPsi}_{0}, \widetilde{\varPsi}\) can be easily constructed with the above properties (see e.g. [23], Lemma 6.9).

Denote \(\varPsi_{j}(u):=\varPsi(b^{-j}u)\) and \(\widetilde{\varPsi}_{j}(u):=\widetilde{\varPsi}(b^{-j}u)\) for j≥1. Then from above we have

$$ \sum_{j\ge0} \varPsi_j(u)\widetilde{\varPsi}_j(u)=1, \quad u\in{\mathbb{R}}_+. $$
(5.41)

This and Proposition 3.8 imply the following Calderón type decomposition

$$ f=\sum_{j\ge0} \varPsi_j( \sqrt{L})\widetilde{\varPsi}_j(\sqrt{L})f, \quad f\in{{\mathbb{L}}}^p, \quad 1\le p \le\infty. $$
(5.42)

The key idea is that the polynomial property (5.40) of the spectral spaces can be used to discretize the above expansion and as a result to obtain the desired frames. Indeed, observe first that \(\operatorname{supp}\varPsi_{0}, \widetilde{\varPsi}_{0}\subset[0, b]\) and \(\operatorname{supp}\varPsi_{j}, \widetilde{\varPsi}_{j}\subset[b^{j-1}, b^{j+1}]\), j≥1. From this and above it follows that \(\varPsi_{j}(\sqrt{L})\), \(\widetilde{\varPsi}_{j}(\sqrt{L})\) are kernel operators whose kernels have nearly exponential localization and \(\varPsi_{j}(\sqrt{L})(x, \cdot)\in\varSigma_{b^{j+1}}\) and \(\widetilde{\varPsi}_{j}(\sqrt{L})(\cdot, y)\in\varSigma_{b^{j+1}}\). We now invoke the cubature formula from Theorem 4.4. With 0<γ<1 the constant from (4.15) and κ>1 from (5.40), we select a maximal δ-net, say \({\mathcal{X}}_{j}\), on M with \(\delta:=\gamma\kappa^{-1}b^{-j-1}\sim b^{-j}\). Theorem 4.4 provides a cubature formula of the form

$$\int_M f(x) d\mu(x) = \sum_{\xi\in{\mathcal{X}}_j} w_{j\xi} f(\xi) \quad\hbox{for } f\in\varSigma_{\kappa b^{j+1}}^1, $$

where \(\frac{2}{3}|B(\xi, \delta/2)| \le w_{j\xi} \le2|B(\xi, \delta)|\). Since \(\varPsi_{j}(\sqrt{L})(x,\cdot)\widetilde{\varPsi}_{j}(\sqrt{L})(\cdot,y) \in\varSigma_{\kappa b^{j+1}}^{1}\) due to (5.40), we get

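$$ \varPsi_j(\sqrt{L})\widetilde{\varPsi}_j(\sqrt{L}) (x, y) = \int_M \varPsi_j(\sqrt{L}) (x, u)\widetilde{\varPsi}_j(\sqrt{L}) (u, y) d\mu(u) = \sum_{\xi\in{\mathcal{X}}_j} w_{j\xi} \varPsi_j(\sqrt{L}) (x, \xi)\widetilde{\varPsi}_j(\sqrt{L}) (\xi, y). $$
(5.43)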

We now define the frame elements by

$$ \psi_{j\xi}(x):=\sqrt{w_{j\xi}} \varPsi_j(\sqrt{L}) (x,\xi), \quad \widetilde{\psi}_{j\xi}(x):= \sqrt{w_{j\xi}}\widetilde{\varPsi}_j(\sqrt {L}) (x,\xi), \quad \xi \in{\mathcal{X}}_j, j\ge0. $$
(5.44)

We next present the main properties of the system \(\{\psi_{j\xi}\}\), \(\{\widetilde{\psi}_{j\xi}\}\).

Proposition 5.6

(a) Frame property: For any \(f\in{{\mathbb{L}}}^{p}\), 1≤p≤∞, \(({{\mathbb{L}}}^{\infty}:=\mathrm{UCB})\) we have

$$ f = \sum_{j\ge0}\sum _{\xi\in{\mathcal{X}}_j} \langle f, \widetilde{\psi}_{j\xi }\rangle \psi_{j\xi} = \sum_{j\ge0}\sum _{\xi\in{\mathcal{X}}_j} \langle f, \psi_{j\xi }\rangle \widetilde{\psi}_{j\xi} \quad\hbox{in } {{\mathbb{L}}}^p $$
(5.45)

and

$$ \|f\|_2^2 = \sum _{j \geq0} \sum_{\xi\in{\mathcal{X}}_j} \overline{ \langle f, \widetilde{\psi}_{j \xi}\rangle} \langle f, \psi_{j \xi} \rangle \quad\forall f\in{{\mathbb{L}}}^2. $$
(5.46)

(b) Space localization: For any σ>0 there exists a constant c σ >0 such that for any \(\xi\in{\mathcal{X}}_{j}\), j≥0,

$$ \big|\psi_{j \xi} (x)\big|,\ \big|\widetilde{\psi}_{j \xi} (x)\big| \le c_\sigma \big|B\bigl(\xi, b^{-j} \bigr)\big|^{-1/2}\bigl(1+b^j\rho(x, \xi)\bigr)^{-\sigma }, $$
(5.47)

and if \(\rho(x,y)\le b^{-j}\)

$$ \big|\psi_{j \xi} (x)- \psi_{j \xi} (y)\big| \le c_\sigma \big|B\bigl(\xi, b^{-j} \bigr)\big|^{-1/2}\bigl(b^j\rho(x, y)\bigr)^\alpha \bigl(1+b^j\rho(x, \xi)\bigr)^{-\sigma }. $$
(5.48)

Here α>0 is the global parameter from (1.6) and the same inequality holds for \(\widetilde{\psi}_{j \xi}\) in place of \(\psi_{j\xi}\).

(c) Spectral localization: \(\psi_{0\xi}, \widetilde{\psi}_{0\xi}\in\varSigma_{b}^{p}\) if \(\xi\in {\mathcal{X}}_{0}\) and \(\psi_{j\xi},\widetilde{\psi}_{j\xi}\in\varSigma_{[b^{j-1}, b^{j+1}]}^{p}\) if \(\xi\in{\mathcal{X}}_{j}\), j≥1, 0<p≤∞.

(d) Norms:

$$ \|\psi_{j \xi}\|_p \sim\|\widetilde{\psi}_{j \xi}\|_p \sim\big|B\bigl(\xi, b^{-j} \bigr)\big|^{\frac{1}{p}-\frac{1}{2}}, \quad0< p \le\infty. $$
(5.49)

Proof

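Identities (5.45) follow immediately from (5.42) and (5.43). For the proof of (5.46), denote \(S_{N}f = \sum_{j=0}^{N}\sum_{\xi\in{\mathcal{X}}_{j}} \langle f, \widetilde{\psi}_{j\xi}\rangle\psi_{j\xi} \) and observe that

$$ \|f\|_2^2 = \lim_{N\to\infty} \langle f, S_N f\rangle = \lim_{N\to\infty} \sum_{j=0}^{N}\sum_{\xi\in{\mathcal{X}}_j} \overline{ \langle f, \widetilde{\psi}_{j \xi}\rangle} \langle f, \psi_{j \xi} \rangle. $$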

The localization and Lipschitz property of the frame elements given in (5.47) and (5.48) follow by Theorem 3.4. The claimed spectral localization is obvious. The norm bounds in (5.49) follow by Theorem 3.18. □

An interesting special case of the above construction occurs when we choose \(\varPsi_{0} = \widetilde{\varPsi}_{0}\) and \(\varPsi=\widetilde{\varPsi}\). Then \(\psi_{j\xi}=\widetilde{\psi}_{j\xi}\) and \(\{\psi_{j\xi}\}\) is a tight frame for \({{\mathbb{L}}}^{2}\), i.e.

$$\|f\|_2^2 = \sum_{j\ge0}\sum _{\xi\in{\mathcal{X}}_j}\big |\langle f, \psi_{j\xi} \rangle\big|^2, \quad\forall f\in{{\mathbb{L}}}^2. $$

Remark

The polynomial property (5.40) of the spectral spaces apparently is valid when the spectral functions are polynomials. This simple fact has been utilized for construction of frames on the sphere [42], on the interval with Jacobi weights [47], on the ball [48], and in the context of Hermite [49] and Laguerre [34] expansions.
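The tight-frame case of the construction in this section can be checked numerically in the simplest polynomial setting. The sketch below is an illustration under assumptions that are not from this paper: M is the circle with \(L=-d^{2}/dx^{2}\), so the spectral spaces consist of trigonometric polynomials and (5.40) holds with κ=2, the nets are uniform grids (for which the equal-weight quadrature is exact on trigonometric polynomials of degree below the number of nodes), b=2, and the cut-offs are square roots of differences of a smooth cut-off, which need not be smooth themselves. The names `theta`, `Psi`, and `frame_coefficients` are hypothetical.

```python
import numpy as np

def g(t):
    """exp(-1/t) for t > 0 and 0 otherwise (building block of a smooth cut-off)."""
    t = np.asarray(t, dtype=float)
    safe = np.where(t > 0, t, 1.0)
    return np.where(t > 0, np.exp(-1.0 / safe), 0.0)

def theta(u):
    """Cut-off equal to 1 on [0, 1] and to 0 on [2, inf)."""
    u = np.asarray(u, dtype=float)
    return g(2.0 - u) / (g(2.0 - u) + g(u - 1.0))

def Psi(j, u):
    """Cut-offs with b = 2, sum_j Psi_j(u)^2 = 1, supp Psi_j in [2^(j-1), 2^(j+1)] for j >= 1."""
    if j == 0:
        return np.sqrt(theta(u))
    diff = theta(u / 2.0**j) - theta(u / 2.0**(j - 1))
    return np.sqrt(np.clip(diff, 0.0, None))     # clip guards against rounding below zero

def frame_coefficients(c, freqs, J):
    """Coefficients <f, psi_{j xi}> of f = sum_k c_k e^{ikx}/sqrt(2 pi), levels j = 0..J."""
    out = []
    for j in range(J + 1):
        m = 2**(j + 2) + 1                       # more nodes than twice the top frequency
        xi = 2 * np.pi * np.arange(m) / m        # uniform net X_j on the circle
        w = 2 * np.pi / m                        # cubature weights w_{j xi}
        a = Psi(j, np.abs(freqs)) * c            # coefficients of Psi_j(sqrt(L)) f
        values = np.exp(1j * np.outer(xi, freqs)) @ a / np.sqrt(2 * np.pi)
        out.append(np.sqrt(w) * values)
    return out

J = 4
freqs = np.arange(-2**J, 2**J + 1)
rng = np.random.default_rng(2)
c = rng.standard_normal(freqs.size) + 1j * rng.standard_normal(freqs.size)
coeffs = frame_coefficients(c, freqs, J)
energy = sum(np.sum(np.abs(v)**2) for v in coeffs)
```

For band limited f with frequencies up to \(2^{J}\), the quantity `energy` should agree with \(\|f\|_{2}^{2}=\sum_{k}|c_{k}|^{2}\) up to rounding, in line with the tight frame identity above.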

6 Besov Spaces

We shall follow the general idea of using spectral decompositions, e.g. [46, 59, 60], to introduce (inhomogeneous) Besov spaces in the general set-up of this paper. As explained in the introduction, we shall only consider Besov spaces \(B^{s}_{pq}\) with s>0 and 1≤p≤∞. The Besov spaces \(B^{s}_{pq}\) with full range of indices are treated in the follow-up paper [33]. For another approach to Besov spaces under heat kernel estimates, but a polynomial upper bound on the volume instead of the volume doubling condition, see [7].

To introduce Besov spaces we assume that there are given two (Littlewood-Paley) functions \(\varphi_{0},\varphi\in C^{\infty}({\mathbb{R}}_{+})\) such that

(6.1)
(6.2)

Then \(|\varphi_{0}(\lambda)|+\sum_{j\ge1}|\varphi(2^{-j}\lambda)|\ge c>0\) for λ∈[0,∞). Set \(\varphi_{j}(\lambda):=\varphi(2^{-j}\lambda)\) for j≥1.

Definition 6.1

Let s>0, 1≤p≤∞, and 0<q≤∞. The Besov space \(B_{pq}^{s}=B_{pq}^{s}(L)\) is defined as the set of all \(f \in{{\mathbb{L}}}^{p}\) such that

$$ \|f\|_{B_{pq}^{s}} := \biggl(\sum _{j\ge0} \bigl(2^{s j} \big\|\varphi_j( \sqrt{L}) f(\cdot)\big\|_{{{{\mathbb{L}}}^p}} \bigr)^q \biggr)^{1/q} <\infty. $$
(6.3)

Here the q-norm is replaced by the sup-norm if q=∞.

Note that by Proposition 6.2 below it follows that the definition of the Besov spaces \(B_{pq}^{s}\) is independent of the specific selection of φ 0, φ satisfying (6.1)–(6.2). Also, \(B_{pq}^{s}\) are (quasi-)Banach spaces, which are continuously embedded in \({{\mathbb{L}}}^{p}\) as will be seen below.

6.1 Characterization of Besov Spaces via Linear Approximation from \(\{\varSigma_{t}^{p}\}\)

Here we show that the Besov spaces \(B_{pq}^{s}\) with s>0 and p≥1 are in fact the approximation spaces of linear approximation from \(\varSigma_{t}^{p}\), t≥1. As in Sect. 3.5, we let \(\mathcal{E}_{t}(f)_{p}\) denote the best approximation of \(f \in{{\mathbb{L}}}^{p}\) from \(\varSigma_{t}^{p}\) and \(A_{pq}^{s}\) will denote the associated approximation spaces, defined in (3.35)–(3.36).

Proposition 6.2

Let s>0, 1≤p≤∞, and 0<q≤∞. Then \(f \in B_{pq}^{s}\) if and only if \(f\in A_{pq}^{s}\). Moreover,

$$ \|f\|_{B_{pq}^s} \sim\|f\|_{A_{pq}^s} := \|f \|_p + \biggl(\sum_{j\ge0} \bigl(2^{s j}\mathcal{E}_{2^j}(f)_p \bigr)^q \biggr)^{1/q}. $$
(6.4)

Proof

Let φ j be as in the definition of the Besov spaces with the additional property: ∑ j≥0 φ j (λ)=1 for λ∈[0,∞) (see Sect. 3.3). Suppose \(f\in{{\mathbb{L}}}^{p}\). Then by Corollary 3.9 we have \(f=\sum_{j\ge0}\varphi_{j}(\sqrt{L})f\) and since \(\varphi_{j}(\sqrt{L})f\in\varSigma_{[2^{j-1}, 2^{j+1}]}^{p}\) we obtain

$$\mathcal{E}_{2^m}(f)_p\le\sum _{j\ge m} \big\|\varphi_j(\sqrt{L})f\big\|_p $$

and the standard Hardy inequality

$$ \sum_{m\ge0} \biggl(2^{sm} \sum_{j\ge m}b_j \biggr)^q \le c \sum_{m\ge0} \bigl(2^{sm}b_m \bigr)^q, \quad b_j \ge0,\ s>0,\ 0<q\le\infty, $$
(6.5)

leads to the estimate \(\|f\|_{A_{pq}^{s}} \le c\|f\|_{B_{pq}^{s}}\).
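In more detail (a sketch of the routine steps, for q<∞), applying (6.5) with \(b_{j}:=\|\varphi_{j}(\sqrt{L})f\|_{p}\) gives

$$ \sum_{m\ge0} \bigl(2^{sm}\mathcal{E}_{2^m}(f)_p \bigr)^q \le \sum_{m\ge0} \biggl(2^{sm}\sum_{j\ge m} \big\|\varphi_j(\sqrt{L})f\big\|_p \biggr)^q \le c\sum_{m\ge0} \bigl(2^{sm} \big\|\varphi_m(\sqrt{L})f\big\|_p \bigr)^q = c\|f\|_{B_{pq}^s}^q, $$

while the term \(\|f\|_{p}\) in the norm of \(A_{pq}^{s}\) is controlled, since s>0, by

$$ \|f\|_p \le \sum_{j\ge0} 2^{-sj} \bigl(2^{sj}\big\|\varphi_j(\sqrt{L})f\big\|_p \bigr) \le c\sup_{j\ge0} 2^{sj}\big\|\varphi_j(\sqrt{L})f\big\|_p \le c\|f\|_{B_{pq}^s}. $$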

For the estimate in the other direction, we note that for any \(g\in\varSigma_{2^{j-1}}^{p}\) we have \(\varphi_{j}(\sqrt{L}) f = \varphi_{j}(\sqrt{L}) (f-g)\) and hence

$$\big\|\varphi_j(\sqrt{L}) f\big\|_p = \big\|\varphi_j( \sqrt{L}) (f-g)\big\|_p \le c\| f-g\|_p, $$

where we used the boundedness of the operator \(\varphi_{j}(\sqrt{L})\) on \({{\mathbb{L}}}^{p}\). This implies \(\|\varphi_{j}(\sqrt{L}) f\|_{p} \le c\mathcal{E}_{2^{j-1}}(f)_{p}\), j≥1, and obviously \(\|\varphi_{0}(\sqrt{L}) f\|_{p} \le c\|f\|_{p}\). We use these estimates in the definition of \(B_{pq}^{s}\) to obtain \(\|f\|_{B_{pq}^{s}} \le c\|f\|_{A_{pq}^{s}}\). □

We next record the heat kernel characterization of Besov spaces. Denote

$$ \|f\|_{B^s_{pq}(H)} := \| f \|_p + \biggl(\int _0^1 \bigl(t^{-s/2} \big\|(tL)^m e^{-tL}f\big\|_p \bigr)^q \frac {dt}{t} \biggr)^{1/q} $$
(6.6)

with the usual modification for q=∞.

Corollary 6.3

For admissible indices s,p,q a function \(f\in B^{s}_{pq}\) if and only if \(\|f\|_{B^{s}_{pq}(H)}<\infty\) and if \(f\in B^{s}_{pq}\), then \(\|f\|_{B^{s}_{pq}} \sim\|f\|_{B^{s}_{pq}(H)}\).

This corollary follows readily by Proposition 6.2 taking into account Remark 3.17.

6.2 Comparison of Lipschitz Spaces and \(B^{s}_{\infty\infty}\)

The Lipschitz space \(\operatorname{Lip}\gamma\), γ>0, is defined as the set of all \(f\in{{\mathbb{L}}}^{\infty}\) such that

$$ \|f\|_{\operatorname{Lip}\gamma} := \| f \|_\infty+ \sup_{x\neq y} \frac{| f(x)-f(y) |}{\rho^\gamma(x,y)} <\infty. $$
(6.7)

We would like to record next the fact that in the setting of this article the spaces \(\operatorname{Lip}s\) and \(B^{s}_{\infty\infty}\) coincide provided 0<s<α, where α is the structural constant from (1.6).

Proposition 6.4

The following continuous embeddings hold: (a) For any s>0

$$\operatorname{Lip}s \subset B^s_{\infty\infty}. $$

(b) For any 0<s<α

$$B^s_{\infty\infty} \subset\operatorname{Lip}s. $$

Proof

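(a) Let \(f\in\operatorname{Lip}s\) and choose \(\theta\in C^{\infty}[0,\infty)\) so that θ≥0, θ≡1 on [0,1], and \(\operatorname{supp}\theta\subset[0,2]\). Then using Theorem 3.4 and (2.10) we obtain for t≥1 and k>s+3d/2

$$ \big|f(x)-\theta\bigl(t^{-1}\sqrt{L}\bigr)f(x)\big| = \bigg|\int_M \theta\bigl(t^{-1}\sqrt{L}\bigr) (x,y) \bigl[f(x)-f(y)\bigr] d\mu(y) \bigg| \le c\|f\|_{\operatorname{Lip}s}\int_M D_{t^{-1}, k}(x,y)\rho(x,y)^s d\mu(y) \le ct^{-s}\|f\|_{\operatorname{Lip}s}. $$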

On the other hand \(\theta(t^{-1} \sqrt{L})f \in\varSigma_{2t}^{\infty}\) and hence \(\mathcal{E}_{2t}(f)_{\infty}\le\|\theta(t^{-1} \sqrt{L})f-f\|_{\infty}\). From this and above we infer \(\mathcal{E}_{2t}(f)_{\infty}\le ct^{-s}\|f\|_{\operatorname{Lip}s}\), which implies (a).

(b) Let \(\varphi_{0}:=\theta\) with θ the function from above. Set φ(λ):=θ(λ)−θ(2λ) and \(\varphi_{j}(\lambda):=\varphi(2^{-j}\lambda)\). Then \(\sum_{j\ge0}\varphi_{j}(\lambda)=1\) for λ≥0, \(\operatorname{supp}\varphi_{0}\subset[0, 2]\) and \(\operatorname {supp}\varphi_{j}\subset[2^{j-1}, 2^{j+1}]\), j≥1. Now, assuming that \(f \in B^{s}_{\infty\infty}\) we apparently have \(\varphi_{0}(\sqrt{L})f \in\varSigma^{\infty}_{2}\), \(\varphi_{j}(\sqrt{L})f \in\varSigma^{\infty}_{2^{j+1}}\), and by the Littlewood-Paley decomposition (Corollary 3.9) \(f = \sum_{j \ge0}\varphi_{j}(\sqrt{L})f\). Evidently, \(B^{s}_{\infty\infty}\) can be defined using the above constructed functions \(\{\varphi_{j}\}\) and hence \(\|\varphi_{j}(\sqrt{L})f \|_{\infty}\le c2^{-js}\|f\|_{B^{s}_{\infty\infty }}\), j≥0. Therefore, using (3.33) we have for 0<s<α and any J≥1

$$ \big|f(x)-f(y)\big| \le \sum_{j=0}^{J} \big|\varphi_j(\sqrt{L})f(x)-\varphi_j(\sqrt{L})f(y)\big| + 2\sum_{j>J} \big\|\varphi_j(\sqrt{L})f\big\|_\infty \le c\|f\|_{B^s_{\infty\infty}} \bigl(\rho(x,y)^\alpha 2^{J(\alpha-s)} + 2^{-Js} \bigr). $$

Assuming that 0<ρ(x,y)≤1 we choose J≥1 so that \(2^{-J}\sim\rho(x,y)\) and the above yields \(|f(x)-f(y)|\le c\|f\|_{B^{s}_{\infty\infty}}\rho(x, y)^{s}\). If ρ(x,y)>1 this estimate is immediate from \(\|f\|_{\infty}\le c\|f\|_{B^{s}_{\infty\infty}}\), which follows trivially using the decomposition of f from above. This completes the proof of (b). □

6.3 Frame Decomposition of Besov Spaces

Our aim here is to show that the Besov spaces introduced by Definition 6.1 can be characterized in terms of respective sequence norms of the frame coefficients of functions, using the frames constructed in Sect. 5. We shall utilize the pair of dual frames \(\{\psi_{j\xi}\}\), \(\{\widetilde{\psi}_{j\xi}\}\) constructed in Sects. 5.1–5.2 or in Sect. 5.3. To make the idea of frame decomposition of \(B^{s}_{pq}\) more transparent we first introduce the sequence B-spaces \(b^{s}_{pq}\).

Definition 6.5

For s>0, 1≤p≤∞, and 0<q≤∞ the sequence space \(b_{pq}^{s}\) is defined as the space of all complex-valued sequences \(a:=\{a_{j\xi}: j\ge0, \xi\in{\mathcal{X}}_{j}\}\) such that

$$ \|a\|_{b_{pq}^s} := \biggl(\sum _{j\ge0}b^{jsq} \biggl[\sum _{\xi\in{\mathcal{X}}_j} \bigl(\big|B\bigl(\xi, {b}^{-j} \bigr)\big|^{1/p-1/2}|a_{j\xi}| \bigr)^p \biggr]^{q/p} \biggr)^{1/q} <\infty. $$
(6.8)

Here b>1 is the constant from Sect. 5, and the p or q norm is replaced by the sup-norm if p=∞ or q=∞.
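To make the structure of the norm (6.8) concrete, here is a minimal sketch that evaluates it for finitely many levels. The function name `sequence_besov_norm` and the input layout (for each level j a pair consisting of the coefficients \(a_{j\xi}\) and the ball volumes \(|B(\xi,b^{-j})|\)) are assumptions made for this illustration; in practice the volumes come from the underlying space.

```python
import numpy as np

def sequence_besov_norm(levels, b, s, p, q):
    """Norm (6.8) of a finite sequence; levels[j] = (a_j, vol_j), vol_j[i] = |B(xi_i, b^{-j})|."""
    level_norms = []
    for j, (a, vol) in enumerate(levels):
        a = np.abs(np.asarray(a, dtype=complex))
        vol = np.asarray(vol, dtype=float)
        if np.isinf(p):
            inner = np.max(vol ** (-0.5) * a)            # 1/p = 0 and sup over xi
        else:
            inner = np.sum((vol ** (1.0 / p - 0.5) * a) ** p) ** (1.0 / p)
        level_norms.append(b ** (j * s) * inner)          # weight b^{js} on level j
    level_norms = np.array(level_norms)
    if np.isinf(q):
        return float(np.max(level_norms))                 # sup over j
    return float(np.sum(level_norms ** q) ** (1.0 / q))

# Two levels: one coefficient on level 0, two coefficients on level 1.
levels = [([1.0], [1.0]), ([3.0, 4.0], [4.0, 4.0])]
value = sequence_besov_norm(levels, b=2.0, s=1.0, p=2, q=2)
```

For p=2 the volume factor disappears, so in this example the level norms are 1 and 2·5=10.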

In our further analysis we shall use the “analysis” and “synthesis” operators defined by

$$ S_{\widetilde{\psi}}: f \rightarrow\bigl\{\langle f, \widetilde{\psi}_{j\xi }\rangle\bigr\} \quad \text{and}\quad T_{\psi}: \{a_{j\xi}\}\rightarrow\sum _{j\ge0}\sum_{\xi\in {\mathcal{X}} _j}a_{j\xi} \psi_{j\xi}. $$
(6.9)

Theorem 6.6

Let s>0, 1≤p≤∞, and 0<q≤∞. Then the operators \(S_{\widetilde{\psi}}: B_{pq}^{s} \rightarrow b_{pq}^{s}\) and \(T_{\psi}: b_{pq}^{s} \rightarrow B_{pq}^{s}\) are bounded and \(T_{\psi} S_{\widetilde{\psi}}=\operatorname{Id}\) on \(B_{pq}^{s}\). Consequently, \(f\in B_{pq}^{s}\) if and only if \(\{\langle f, \widetilde{\psi}_{j\xi}\rangle\}\in b_{pq}^{s}\). Moreover, if \(f\in B_{pq}^{s}\), then

$$ \|f\|_{B_{pq}^s} \sim\big\|\bigl\{\langle f,\widetilde{\psi}_{j\xi}\rangle\bigr\}\big\|_{b_{pq}^s} \sim \biggl(\sum _{j\ge0} b^{jsq} \biggl[\sum _{\xi\in{\mathcal{X}}_j} \big\|\langle f,\widetilde{\psi}_{j\xi}\rangle \psi_{j\xi}\big\|_p^p \biggr]^{q/p} \biggr)^{1/q} $$
(6.10)

with the usual modification when p=∞ or q=∞. Above the roles of \(\{\psi_{j\xi}\}\) and \(\{\widetilde{\psi}_{j\xi}\}\) can be interchanged.

Proof

Let \(\varPsi_{j}\in C^{\infty}_{0}\), j≥0, be the functions from the definition of the frames in Sect. 5.1. Recall that \(\operatorname{supp}\varPsi_{0}\subset[0, b]\) and \(\operatorname{supp}\varPsi_{j}\subset [b^{j-1}, b^{j+1}]\), j≥1. Also, \(\sum_{j\ge0}\varPsi_{j}(u)=1\), \(u\in\mathbb{R}_{+}\), and hence \(f=\sum_{j\ge0}\varPsi_{j}(\sqrt{L})f\) for \(f\in{{\mathbb{L}}}^{p}\). It is easy to see that Proposition 6.2 implies (with the obvious modification when q=∞)

(6.11)

Here the second equivalence follows by the monotonicity of \(\mathcal{E}_{t}(f)_{p}\) and the last equivalence follows exactly as in the proof of Proposition 6.2.

Let \(f\in B^{s}_{pq}\) and assume q<∞ (the case q=∞ is easier). By (5.11) and (6.8) it follows that

$$ \|S_{\widetilde{\psi}}f\|_{b_{pq}^s} = \big\|\bigl\{\langle f,\widetilde{\psi}_{j\xi}\rangle\bigr\}\big\|_{b_{pq}^s} \sim \biggl(\sum _{j\ge0} b^{jsq} \biggl[\sum _{\xi\in{\mathcal{X}}_j} \big\|\langle f,\widetilde{\psi}_{j\xi}\rangle \psi_{j\xi}\big\|_p^p \biggr]^{q/p} \biggr)^{1/q}. $$
(6.12)

Using that \(f=\sum_{j\ge0}\varPsi_{j}(\sqrt{L})f\), \(\varPsi_{j}(\sqrt{L})(\cdot, y) \in\varSigma^{2}_{[b^{j-1}, b^{j+1}]}\), and \(\widetilde{\psi}_{j\xi} \in\varSigma^{2}_{[b^{j-2}, b^{j+2}]}\) we obtain

$$\langle f,\widetilde{\psi}_{j\xi}\rangle\psi_{j\xi} = \sum _{\nu=j-2}^{j+2}\bigl\langle\varPsi_\nu( \sqrt{L})f,\widetilde{\psi}_{j\xi }\bigr\rangle\psi_{j\xi}, \quad\xi \in\mathcal{X}_j, $$

where \(\varPsi_{\nu}(\sqrt{L}):= 0\) if ν<0. This readily implies

$$\sum_{\xi\in{\mathcal{X}}_j}\big\|\langle f,\widetilde{\psi}_{j\xi } \rangle\psi_{j\xi }\big\|_p^p \le c\sum _{\nu=j-2}^{j+2}\sum_{\xi\in{\mathcal{X}}_j}\big\|\bigl\langle\varPsi_\nu( \sqrt{L})f,\widetilde{\psi}_{j\xi}\bigr\rangle\psi_{j\xi} \big\|_p^p \le c\sum_{\nu=j-2}^{j+2} \big\|\varPsi_\nu(\sqrt{L})f\big\|_p^p. $$

Here for the last inequality we used Lemma 5.4(a). We insert the above in (6.12) and use (6.11) to obtain \(\|S_{\widetilde{\psi}}f\|_{b^{s}_{pq}}\le c \|f\|_{B^{s}_{pq}}\). Hence the operator \(S_{\widetilde{\psi}}: B_{pq}^{s} \rightarrow b_{pq}^{s}\) is bounded.

To prove the boundedness of \(T_{\psi}: b_{pq}^{s} \rightarrow B_{pq}^{s}\), we assume that \(a=\{a_{j\xi}\}\in b^{s}_{pq}\) and denote briefly \(f=T_{\psi}a=\sum_{j\ge0}\sum_{\xi\in{\mathcal{X}}_{j}} a_{j\xi }\psi_{j\xi}\). Assume q<∞ (the case q=∞ is easier). Using (5.38), Hölder’s inequality if q>1, and (5.11) we obtain

(6.13)

Therefore, \(T_{\psi}a\) is well-defined. Further, since \(\psi_{j\xi}\in\varSigma_{b^{j+1}}^{p}\) and applying again (5.38) we get

$$\mathcal{E}_{b^j}(f)_p \le\bigg \|\sum _{m\ge j}\sum_{\xi\in {\mathcal{X}}_m} a_{m\xi }\psi_{m\xi}\bigg \|_p \le c\sum _{m\ge j} \biggl(\sum_{\xi\in{\mathcal{X}}_m} \|a_{m\xi }\psi_{m\xi }\|_p^p \biggr)^{1/p}. $$

This and the Hardy inequality (6.5) give \((\sum_{j\ge0} (b^{js}\mathcal{E}_{b^{j}}(f)_{p} )^{q} )^{1/q} \le c\|a\|_{b^{s}_{pq}}\). In turn, this and (6.11) yield \(\|f\|_{B^{s}_{pq}} \le c\|a\|_{b^{s}_{pq}}\). Thus the operator \(T_{\psi}: b_{pq}^{s} \rightarrow B_{pq}^{s}\) is also bounded.

The identity \(T_{\psi} S_{\widetilde{\psi}}=\operatorname{Id}\) on \(B_{pq}^{s}\) follows by (5.31). □

6.4 Embedding of Besov Spaces

Finally we show that the Besov spaces \(B^{s}_{pq}\) embed “correctly”.

Proposition 6.7

Let \(1\le p\le p_{1}<\infty\), \(0<q\le q_{1}\le\infty\), \(0<s_{1}\le s<\infty\). Then we have the continuous embedding

$$ B_{pq}^{s} \subset B_{p_1q_1}^{s_1} \quad\mbox{\textit{if} } s/d-1/p=s_1/d-1/p_1. $$
(6.14)

Proof

This assertion follows easily by Proposition 3.12. Indeed, let \(\{\varphi_{j}\}_{j\ge0}\) be the functions from the definition of Besov spaces (Definition 6.1). Given \(f\in B_{pq}^{s}\) we evidently have \(\varphi_{j}(\sqrt{L}) f\in\varSigma_{2^{j+1}}^{p}\) and using (3.32)

which readily implies \(\|f\|_{B_{p_{1}q_{1}}^{s_{1}}} \le c\|f\|_{B_{pq}^{s}}\). □

Compare the above result with [12], where embeddings between Besov spaces defined via the heat semigroup are proved under an assumption of polynomial decay of the heat kernel.

7 Heat Kernel on [−1,1] Induced by the Jacobi Operator

We consider the case when M=[−1,1], \(d\mu(x)=w_{\alpha,\beta}(x)\,dx\), where

$$w_{\alpha, \beta}(x)=w(x)= (1-x)^\alpha(1+x)^\beta, \quad \alpha, \beta> -1, $$

and

$$Lf(x) =- \frac{[w(x) a(x)f'(x)]'}{w(x)}, \quad a(x) =\bigl(1-x^2\bigr), \quad D(L) = C^2[-1,1]. $$

Integrating by parts we get \(\mathcal{E}(f,g) =\langle Lf, g \rangle= \int_{-1}^{1} a(x) f'(x) g'(x) w(x) dx\). Clearly, the domain \(D(\overline{\mathcal{E}})\) of the closure \(\overline{\mathcal{E} }\) of \(\mathcal{E}\) is given by the set of weakly differentiable functions f on ]−1,1[ such that

$$\|f\|^2_{\mathcal{E}} = \int_{-1}^1 \big|f(x)\big|^2 w(x) dx + \int_{-1}^1 a(x) \big|f'(x)\big|^2 w(x) dx <\infty. $$

Note that \(D(L)\supset\mathcal{P}\), the space of all polynomials, and \(L(\mathcal{P}_{k}) \subset\mathcal{P}_{k}\), k≥0, with \(\mathcal{P}_{k}\) being the space of all polynomials of degree at most k. As is well known [58] the (normalized) Jacobi polynomials \(P_{k}\), k=0,1,…, are eigenfunctions of L, i.e. \(LP_{k}=\lambda_{k}P_{k}\) with \(\lambda_{k}=k(k+\alpha+\beta+1)\). By the density of polynomials in \(L^{2}([-1,1],\mu)\) it follows that

$$e^{-t\overline{L}} (f) = \sum_{k\ge0} e^{-\lambda_k t} \langle f, P_k \rangle P_k, \quad t>0. $$
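The inclusion \(L(\mathcal{P}_{k}) \subset\mathcal{P}_{k}\) and the value of \(\lambda_{k}\) can be checked directly on monomials. Expanding the definition of L with \(w(x)a(x)=(1-x)^{\alpha+1}(1+x)^{\beta+1}\) gives \(Lf=-(1-x^{2})f''+[(\alpha-\beta)+(\alpha+\beta+2)x]f'\). The sketch below applies this form to a polynomial given by its coefficient list. The function names are illustrative assumptions and are not taken from [58].

```python
def apply_jacobi_operator(coeffs, alpha, beta):
    """Return the coefficients of Lf for f(x) = sum_i coeffs[i] * x**i.

    Uses Lf = -(1 - x^2) f'' + ((alpha - beta) + (alpha + beta + 2) x) f'.
    The output has the same length as the input, so the degree never grows.
    """
    out = [0.0] * len(coeffs)
    for i, c in enumerate(coeffs):
        if i >= 1:
            # First-order term applied to c * x^i (derivative i * c * x^(i-1)).
            out[i - 1] += (alpha - beta) * i * c
            out[i] += (alpha + beta + 2) * i * c
        if i >= 2:
            # -(1 - x^2) times the second derivative i (i-1) c x^(i-2).
            out[i - 2] -= i * (i - 1) * c
            out[i] += i * (i - 1) * c
    return out


def jacobi_eigenvalue(k, alpha, beta):
    """lambda_k = k (k + alpha + beta + 1)."""
    return k * (k + alpha + beta + 1)
```

Applied to \(x^{k}\), the coefficient of \(x^{k}\) in the output is \(k(k-1)+(\alpha+\beta+2)k=\lambda_{k}\), so L is upper triangular in the monomial basis with the \(\lambda_{k}\) on the diagonal. In the Legendre case α=β=0 the polynomial \((3x^{2}-1)/2\) is mapped to 6 times itself, in agreement with \(\lambda_{2}=6\).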

We next show that \(e^{-t\overline{L}}\) is submarkovian. Let \(\varPhi_{\varepsilon}\in C^{\infty}(\mathbb{R})\) and \(0 \leq\varPhi_{\varepsilon}' \leq1\). Then for any \(f\in C^{2}[-1,1]\) we have \((\varPhi_{\varepsilon}(f))' =\varPhi'_{\varepsilon}(f)f' \in C^{2}[-1,1]\) and

Hence \(e^{-t\overline{L}}\) is submarkovian (see Sect. 1.2).

Moreover, this Dirichlet space is evidently strongly local and regular and also Γ(f,g)(u)=a(u)f′(u)g′(u).

We now compute the intrinsic metric. We have for x,y∈[−1,1], x<y,

Evidently, the topology generated by this metric is the usual topology on [−1,1], and [−1,1] is complete.

It remains to verify the doubling property of the measure and the scale-invariant Poincaré inequality.

7.1 Doubling Property of the Measure

The doubling property of the measure \(d\mu(x)=w_{\alpha,\beta}(x)\,dx\) follows readily by the following estimates on |B(x,r)|: For any x∈[−1,1] and 0<r≤π

$$ c_1\big|B(x, r)\big| \le r\bigl(1-x+r^2 \bigr)^{\alpha+1/2}\bigl(1+x+r^2\bigr)^{\beta+1/2} \le c_2\big|B(x, r)\big|, $$
(7.1)

where \(c_{1}, c_{2}>0\) are constants depending only on α and β.

To prove these estimates, assume that x=cosθ, 0≤θ≤π. Then evidently \(|B(x, r)|=\int_{\cos[\pi\wedge(\theta+r)]}^{\cos[0\vee(\theta -r)]}w_{\alpha, \beta}(u)du\), where a∨b:=max{a,b} and a∧b:=min{a,b} as usual. Assume 0≤x≤1 and 0<r≤π/4. The following chain of similarities with constants depending only on α,β is quite obvious:

which implies (7.1). The case when −1≤x<0 and 0<r≤π/4 is similar and in the case π/4<r≤π we obviously have |B(x,r)|∼1, which again leads to (7.1).
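A numerical sanity check of (7.1) is easy to set up. The sketch below approximates |B(x,r)| by a midpoint rule after the substitution u=cosφ and compares it with the middle expression in (7.1). The function names and the number of quadrature nodes are assumptions made for the illustration, and the check is of course no substitute for the proof.

```python
import math


def ball_volume(x, r, alpha, beta, n=20000):
    """Approximate |B(x, r)|, the w_{alpha,beta}-measure of the rho-ball.

    With u = cos(phi) the integral becomes that of
    (1 - cos phi)^alpha (1 + cos phi)^beta sin(phi) over
    [max(theta - r, 0), min(theta + r, pi)], where theta = arccos x.
    A midpoint rule avoids the (integrable) endpoint singularities.
    """
    theta = math.acos(x)
    lo, hi = max(theta - r, 0.0), min(theta + r, math.pi)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        phi = lo + (i + 0.5) * h
        c = math.cos(phi)
        total += (1.0 - c) ** alpha * (1.0 + c) ** beta * math.sin(phi)
    return total * h


def ball_volume_model(x, r, alpha, beta):
    """Middle expression in (7.1)."""
    return r * (1 - x + r * r) ** (alpha + 0.5) * (1 + x + r * r) ** (beta + 0.5)
```

For α=β=0 the volume is available in closed form, for instance |B(0,r)|=2 sin r, and for α=β=−1/2 the measure becomes dφ, so that |B(1,r)|=r for 0<r≤π. In both cases the quotient of `ball_volume` and `ball_volume_model` stays between fixed positive constants as x and r vary, which is what (7.1) asserts.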

7.2 Poincaré Inequality

As was explained in Sect. 1.2 a critical ingredient in establishing Gaussian bounds for the heat kernel is the scale-invariant Poincaré inequality, which we establish next.

Theorem 7.1

For any \(f\in D(\overline{\mathcal{E}})\) and an interval I=[a,b]⊂[−1,1]

$$ \int_I \big|f(x)- f_I\big|^2 w(x) dx \le c \bigl(\operatorname{diam}_\rho(I)\bigr)^2 \int_I \big|f'(x)\big|^2 \bigl(1-x^2\bigr) w(x) dx $$
(7.2)

where \(\operatorname{diam}_{\rho}(I)= \arccos a - \arccos b\), \(f_{I} = \frac{1}{w(I)}\int_{I} f(x) w(x) dx\) with \(w(I)=\int_{I} w(x)dx\), and c>0 is a constant depending only on α,β.

Proof

Denote briefly \(w[c, d]:= \int_{c}^{d}w(u)du\). We have for I=[a,b]⊂[−1,1] and x∈I

where . It is easy to see that

Using the above we obtain

Therefore, the theorem will be proved if we show that

$$ (b-a) \frac{w[a,u ] w[u,b ]}{w(I)} \le cw(u) \bigl(1-u^2 \bigr) \biggl(\int_a^b \frac{dz}{\sqrt{1-z^2}} \biggr)^2 $$
(7.3)

for some constant c>0 depending only on α,β.

Suppose [a,b]⊂[−1/2,1]. Then it is readily seen that

$$ w(u) \bigl(1-u^2\bigr) \biggl(\int _a^b \frac{dz}{\sqrt{1-z^2}} \biggr)^2 \ge2^{-\beta}(1-u)^{\alpha+1}(\sqrt{1-a}- \sqrt{1-b})^2. $$
(7.4)

On the other hand, since \(w(x)\le2^{|\beta|}(1-x)^{\alpha}\), we have

$$\frac{w[a,u ]w[u,b ]}{w(I)} \le\frac{2^{3|\beta|}}{\alpha+1}\frac{[ (1-a)^{\alpha+1} - (1-u)^{\alpha+1} ] [ (1-u)^{\alpha+1} - (1-b)^{\alpha+1} ] }{ (1-a)^{\alpha+1} - (1-b)^{\alpha+1}}. $$

We need the following inequality whose proof is straightforward: If γ>0 and 0≤A≤X≤B, A<B, then

$$ \frac{ (X^\gamma-A^\gamma)(B^\gamma-X^\gamma)}{B^\gamma-A^\gamma} \le(\gamma\vee1) X^\gamma \frac{\sqrt{B} - \sqrt{A}}{\sqrt{B} + \sqrt{A}}. $$
(7.5)
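Inequality (7.5) can be probed numerically before it is used. The following sketch evaluates both sides on a grid of admissible triples A≤X≤B with A<B and returns the largest observed value of the difference between the left-hand and right-hand sides. The function names, the grid, and the selection of exponents γ are arbitrary choices made for the illustration.

```python
import math


def lhs_75(A, X, B, gamma):
    """Left-hand side of (7.5); requires 0 <= A <= X <= B and A < B."""
    return (X**gamma - A**gamma) * (B**gamma - X**gamma) / (B**gamma - A**gamma)


def rhs_75(A, X, B, gamma):
    """Right-hand side of (7.5)."""
    ratio = (math.sqrt(B) - math.sqrt(A)) / (math.sqrt(B) + math.sqrt(A))
    return max(gamma, 1.0) * X**gamma * ratio


def check_75(gammas=(0.25, 0.5, 1.0, 1.5, 3.0), steps=12):
    """Largest value of lhs - rhs over a grid in [0, 4]; (7.5) predicts <= 0."""
    worst = -math.inf
    grid = [4.0 * i / steps for i in range(steps + 1)]
    for gamma in gammas:
        for A in grid:
            for B in grid:
                if B <= A:
                    continue
                for k in range(steps + 1):
                    X = A + (B - A) * k / steps
                    diff = lhs_75(A, X, B, gamma) - rhs_75(A, X, B, gamma)
                    worst = max(worst, diff)
    return worst
```

For γ=1 the difference of the two sides equals \((X-\sqrt{AB})^{2}/(B-A)\), so equality holds exactly when \(X=\sqrt{AB}\), for example A=1, X=2, B=4. The value returned by `check_75` should therefore be zero up to rounding error.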

Applying this inequality we get

This coupled with (7.4) gives (7.3). The proof of (7.3) in the case when I=[a,b]⊂[−1,1/2] is the same.

Let now −1≤a<−1/2<1/2<b≤1. Suppose u∈[0,b] (the case when u∈[a,0) is similar). Then evidently w[a,u]∼1, w(I)∼1, \(\int_{a}^{b} \frac{dz}{\sqrt{1-z^{2}}} \sim1\) and (7.3) follows by

$$w[u, b]\le2^{|\beta|}\int_u^1(1-y)^\alpha dy \le\frac{2^{|\beta |}}{\alpha+1}(1-u)^{\alpha+1} \quad \hbox{and}\quad w(u) \bigl(1-u^2\bigr) \sim(1-u)^{\alpha+1}. $$

The proof of the theorem is complete. □

7.3 Gaussian Bounds on the Heat Kernel Associated with the Jacobi Operator

As a consequence of the Poincaré inequality and the doubling property of the measure, established above, we obtain (Sect. 1.2) Gaussian bounds for the heat kernel \(p_{t}(x,y)\) associated with the Jacobi operator:

Theorem 7.2

For any x,y∈[−1,1] and 0<t≤1,

$$ \frac{c_1'\exp\{-\frac{c_1\rho^2(x, y)}{t}\}}{\sqrt{|B(x,\sqrt{t})| |B(y,\sqrt{t})|}} \le p_t(x,y) \le \frac{c_2'\exp\{-\frac{c_2\rho^2(x, y)}{t}\}}{\sqrt{|B(x,\sqrt{t})| |B(y,\sqrt{t})|}}. $$
(7.6)

Here \(|B(x, \sqrt{t})| \sim\sqrt{t}(1-x+t)^{\alpha+1/2}(1+x+t)^{\beta+1/2}\), ρ(x,y)=|arccos x − arccos y| or ρ(x,y)=|θ−ϕ| if x=cosθ and y=cosϕ, 0≤θ,ϕ≤π, and \(c_{1}, c_{2}, c_{1}', c_{2}'>0\) are constants depending only on α and β.

Furthermore,

$$ p_t(x,y) = \sum_{k\ge0} e^{-\lambda_k t} P_k(x) P_k(y), \quad \lambda_k = k(k+\alpha+ \beta+1), $$
(7.7)

where the series converges uniformly.
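The series (7.7) is fully explicit in the Chebyshev case α=β=−1/2, which makes a convenient test case. There \(\lambda_{k}=k^{2}\), the normalized Jacobi polynomials are \(P_{0}=\pi^{-1/2}\) and \(P_{k}(\cos\theta)=(2/\pi)^{1/2}\cos k\theta\), k≥1, and \(d\mu\) becomes dϕ under y=cosϕ. The sketch below sums the series and integrates it in the second variable. The function names and the truncation levels are assumptions made for the illustration.

```python
import math


def chebyshev_heat_kernel(t, theta, phi, terms=200):
    """p_t(cos theta, cos phi) from (7.7) for alpha = beta = -1/2.

    Here lambda_k = k^2, P_0 = pi^{-1/2} and
    P_k(cos theta) = (2/pi)^{1/2} cos(k theta) for k >= 1.
    """
    total = 1.0 / math.pi
    for k in range(1, terms + 1):
        damping = math.exp(-k * k * t)
        total += (2.0 / math.pi) * damping * math.cos(k * theta) * math.cos(k * phi)
    return total


def total_mass(t, theta, n=400):
    """Midpoint approximation of the integral of p_t(x, .) against mu.

    For alpha = beta = -1/2 the measure is dphi on [0, pi].
    """
    h = math.pi / n
    return h * sum(chebyshev_heat_kernel(t, theta, (i + 0.5) * h) for i in range(n))
```

In this case the total mass equals 1 for every t>0, since only the k=0 term survives the integration. At x=y=1 the Poisson summation formula gives \(p_{t}(1,1)\approx(\pi t)^{-1/2}\) for small t, which matches the on-diagonal size \(|B(1,\sqrt{t})|^{-1}\sim t^{-1/2}\) predicted by (7.6) when α=β=−1/2.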

The above results and Theorem 3.4 yield the nearly exponential localization of kernels as in the following

Corollary 7.3

Let \(f\in C^{\infty}_{0}(\mathbb{R}_{+})\) and \(f^{(2\nu+1)}(0)=0\), ν≥0, and consider the kernel \(\varLambda_{\delta}(x, y) = \sum_{k\ge0} f(\delta\sqrt{\lambda_{k}}) P_{k}(x)P_{k}(y)\), 0<δ≤1. Then for any σ>0 there exists a constant \(c_{\sigma}>0\) such that

$$ \big|\varLambda_\delta(x, y)\big| \le c_\sigma \bigl(\big|B(x, \delta)\big|\big|B(y, \delta )\big| \bigr)^{-1/2} \biggl(1+ \frac{\rho(x, y)}{\delta} \biggr)^{-\sigma}, $$
(7.8)

where |B(⋅,δ)| and ρ(x,y) are as above.

This result is more complete than the similar estimate (2.14) in [47] (see also [29, 30]) which is proved under the restriction α,β>−1/2.