1 Introduction

The uncertainty principle in harmonic analysis is a fundamental principle roughly saying that a function and its Fourier transform cannot be simultaneously “well localized”. There are many non-equivalent ways to make this statement precise by imposing different types of “localization” conditions. One such form of the uncertainty principle is Benedicks’ theorem [9] which says that a function \(f\in L^2({{\mathbb {R}}}^d)\) and its Fourier transform \(\hat{f}\) cannot be both supported on sets of finite Lebesgue measure. The following is the quantitative form of this result as obtained by Amrein and Berthier [1]: If \(E, F\subseteq {{\mathbb {R}}}^d\) are sets of finite Lebesgue measure then there exists \(c>0\) such that

$$\begin{aligned} c\Vert f\Vert _{L^2({{\mathbb {R}}}^d)}\le \Vert f\Vert _{L^2(E^c)}+\Vert \hat{f}\Vert _{L^2(F^c)}, \end{aligned}$$

for all \(f\in L^2({{\mathbb {R}}}^d)\). Using the Kohn-Nirenberg quantization one can restate this result in the following form: Let P be the operator

$$\begin{aligned} Pf(x):= \int _{{{\mathbb {R}}}^d}1_{(E\times F)^c}(x,\xi ) \hat{f}(\xi ) e^{2 \pi ix\cdot \xi }\, d\xi . \end{aligned}$$

Then \(\langle Pf,f \rangle \ge c \Vert f\Vert ^2\). It was shown by the first author [22] that a similar result continues to hold for all subsets of \({{\mathbb {R}}}^{2d}\) with finite Lebesgue measure, including the ones which are not of the form \(E\times F\). A natural analog of this result can be obtained by replacing the Kohn-Nirenberg quantization with the anti-Wick quantization. In this case we have the following statement for the short-time Fourier transform: Let \(E \subset {\mathbb {R}}^{2d}\) have finite Lebesgue measure. There exists \(c>0\) such that

$$\begin{aligned} \int _{E^c} \left| V_{\phi } f(p,q)\right| ^2 \, dp \, dq \ge c \Vert f\Vert ^2 \end{aligned}$$
(1)

for all \(f \in L^2({{\mathbb {R}}}^d)\), where \(V_\phi f\) is the short-time Fourier transform defined by \(V_\phi f(p,q) = \int f(x) e^{2\pi i px}\phi (x-q) \, dx\). This result was proved independently and almost simultaneously by Jaming [27], Janssen [29], and Wilczok [40]. Using similar reasoning, Wilczok also showed the following analog for the wavelet transform: Let \(E \subset (0,\infty ) \times {{\mathbb {R}}}\) have finite affine measure (\(\int _E a^{-1} \, da \, db < \infty \)). There exists \(c>0\) such that

$$\begin{aligned} \int _{E^c} \left| W_\psi f(a,b)\right| ^2 \dfrac{da db}{a} \ge c \Vert f\Vert ^2 \end{aligned}$$
(2)

for all \(f \in L^2({{\mathbb {R}}}^d)\) where the wavelet transform \(W_\psi f\) is defined by \(W_\psi f(a,b)=a^{-1/2}\int _{{{\mathbb {R}}}} f(x)\psi (a^{-1}x-b) \, dx\) for some wavelet \(\psi \).

In the classical cases, when the window function \(\phi \) or the mother wavelet function \(\psi \) is specially chosen, \(V_\phi L^2({\mathbb {R}}^d)\) is a Fock space of analytic functions and \(W_\psi L^2({\mathbb {R}}^d)\) is a Bergman space on the upper half space. In both cases, these inequalities hold for the more general class of so-called relatively dense sets, which is known to be the optimal such class of sets. This has been known since the 1980s, and follows from the work of Luecking [33, 34] on the Bergman space. For the Fock space, one can consult the work of Janson–Peetre–Rochberg [30], Ortega-Cerdà [37], and the recent work of Ascensi [3]. However, it can be easily shown (see Sect. 5 below as well as [3, 28]) that relative density is too weak a condition for these results to continue to hold for general \(L^2\) window and wavelet functions.

Still, it is natural to ask whether there exists a larger class of sets E (of infinite measure) for which (1) and (2) continue to hold for a more general class of window and wavelet functions. For the short-time Fourier transform some sufficient conditions can be found in [3, 28] for windows with varying degrees of regularity. However, for general window functions \(\phi \in L^2({\mathbb {R}}^d)\), Fernández and Galbis [19] showed that (1) holds for all sets E satisfying a certain thinness condition. The main goal of this paper is to show that (2) also continues to hold for the appropriate analog of thin sets. Furthermore, we also extend the result of Fernández and Galbis in the context of short-time Fourier transforms on more general LCA groups. We actually prove an uncertainty principle of Amrein-Berthier type for more general Berezin-type quantizations which include both short-time Fourier and wavelet inequalities as special cases.

To state our result we now introduce the above mentioned Berezin-type quantization. Let \(\mathcal {H}\) be a Hilbert space and \((X, d, \mu )\) be a metric measure space with a metric d and a Borel measure \(\mu \). We assume that the metric d is proper, i.e., every ball with respect to this metric is precompact. We will call the continuous map \(k: X\rightarrow \mathcal {H}\) the Berezin quantization of X if \(\left\| k(x)\right\| =1\) for all \(x\in X\), and for each \(f \in \mathcal {H}\)

$$\begin{aligned} \left\| f\right\| ^2 = \int _X \left| \left\langle f,k(x)\right\rangle \right| ^2 d\mu (x). \end{aligned}$$

We will think of the image \(\{k(x): x\in X\}\) as a collection of unit vectors in \(\mathcal {H}\) indexed by X, and thus we will use the notation \(k_x=k(x)\). The map \(f\rightarrow \left\langle f,k_x\right\rangle \) defines an isometric embedding of the Hilbert space \(\mathcal {H}\) into \(L^2(X, \mu )\), and allows us to think of \(\mathcal {H}\) as a closed subspace of \(L^2(X, \mu )\). For each \(a\in L^2(X,\mu )\) define \(\int _X a(x)k_xd\mu (x)\) to be the unique vector \(h\in \mathcal {H}\) satisfying

$$\begin{aligned} \left\langle g,h\right\rangle =\int _X a(x)\left\langle g,k_x\right\rangle d\mu (x), \end{aligned}$$

for all \(g\in \mathcal {H}\). It is easy to see that the map

$$\begin{aligned} a\rightarrow \int _X a(x)k_xd\mu (x) \end{aligned}$$

is the orthogonal projection from \(L^2(X,\mu )\) onto \(\mathcal {H}\). In particular, for \(f\in \mathcal {H}\) we have the expansion formula

$$\begin{aligned} f= \int _X \left\langle f,k_x\right\rangle k_x \, d\mu (x). \end{aligned}$$
(3)

In other words, we have that \(\{k_x\}_{x\in X}\) is a normalized continuous Parseval frame in \(\mathcal {H}\).
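For orientation, the weak form of (3) can be recovered from the norm identity by polarization. The following is only a sketch of that standard step; the auxiliary vector \(g\in \mathcal {H}\) is introduced here for illustration, and the inner product is taken to be linear in its first argument, as the definition of the frame coefficients suggests.

```latex
% Polarization of \|f\|^2 = \int_X |\langle f,k_x\rangle|^2 d\mu(x), for f, g in \mathcal{H}:
\begin{aligned}
\langle f,g\rangle
  &= \frac14\sum_{m=0}^{3} i^m \,\|f+i^m g\|^2
   = \frac14\sum_{m=0}^{3} i^m \int_X \bigl|\langle f,k_x\rangle + i^m\langle g,k_x\rangle\bigr|^2\,d\mu(x) \\
  &= \int_X \langle f,k_x\rangle\,\overline{\langle g,k_x\rangle}\,d\mu(x)
   = \int_X \langle f,k_x\rangle\,\langle k_x,g\rangle\,d\mu(x).
\end{aligned}
```

Since \(g\) is arbitrary, this says precisely that \(f\) and \(\int _X \left\langle f,k_x\right\rangle k_x \, d\mu (x)\) have the same inner product against every vector of \(\mathcal {H}\).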

Further we will assume that X is a homogeneous space in the sense that some locally compact group G acts transitively on X in a way that both \(\mu \) and d are invariant under this group action (\(\mu (gE)=\mu (E)\) and \(d(gx,gy)=d(x,y)\)). The group action needs also to respect the inner product on \(\mathcal {H}\) in the following way \(\left| \langle k_{gx},k_{gy} \rangle \right| = \left| \langle k_x,k_y \rangle \right| \) for all \(x,y \in X\), \(g \in G\). Under all these assumptions, we will call the tuple \((\mathcal {H},X,G,k,\mu ,d)\) a Berezin quantization.

Very often, a metric measure space X and a quantization \(k: X\rightarrow \mathcal {H}\) of this type arise naturally whenever we are given a locally compact, second countable topological group G and one of its nontrivial irreducible, square-integrable, unitary representations (if one exists), \(\pi :G\rightarrow \mathcal {U}(\mathcal {H})\) [13]. Then G itself can be equipped with a left-invariant Haar measure \(\mu \) and a left-invariant proper metric d (inducing the topology on G), and for any unit vector \(k\in \mathcal {H}\), the collection \(\{k_x\}_{x\in G}\) defined by \(k(x)=\pi (x) k\) will form a normalized continuous Parseval frame for \(\mathcal {H}\) such that \(\left\langle k_{gx},k_{gy} \right\rangle = \left\langle k_x , k_y \right\rangle \) for all \(x,y, g \in G\). Actually we can obtain a quantization with essentially the same properties using the same procedure even when we only have a projective irreducible square-integrable unitary representation, \(\pi :G\rightarrow \mathcal {U}(\mathcal {H})/{\mathbb C}\).

The following inequality is the obvious analog of (1) and (2) for Berezin quantization.

$$\begin{aligned} \int _{X \backslash E} \left| \left\langle f,k_x\right\rangle \right| ^2 \, d\mu (x) \ge c \Vert f\Vert ^2 \end{aligned}$$
(4)

for all \(f \in \mathcal {H}\). By choosing the group G appropriately (see Sect. 5 for more details on this) one obtains both (1) and (2) as special cases.

The inequality (4) can be viewed as a statement about invertibility of a certain Toeplitz operator. For a non-negative, bounded, measurable function \(\sigma : X\rightarrow {{\mathbb {R}}}\), the Toeplitz operator \(T_\sigma : \mathcal {H}\rightarrow \mathcal {H}\) with symbol \(\sigma \) is defined by

$$\begin{aligned} T_{\sigma }f = \int _X \sigma (x) \langle f,k_x \rangle k_x \, d \mu (x). \end{aligned}$$

It is easy to see that each such operator is bounded, self-adjoint, and positive. We can easily recast (4) in the following way:

$$\begin{aligned} \langle T_{1_{X \backslash E}}f,f \rangle \ge c \Vert f\Vert ^2. \end{aligned}$$
(5)

This is a statement about boundedness from below, and hence invertibility, of the Toeplitz operator \(T_\sigma \) with an indicator symbol. It is clear that each Toeplitz operator whose symbol is bounded away from zero must be bounded from below (and hence invertible). However, to study (5), we must broaden this trivial result since an indicator function is only non-negative. So we would like to characterize the degree to which a non-negative symbol \(\sigma :X\rightarrow {{\mathbb {R}}}\) can vanish and still generate an invertible Toeplitz operator \(T_\sigma \).
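To make the passage from (4) to (5) explicit, one can pair \(T_\sigma f\) with \(f\) and use the weak definition of the integral. The computation below is a sketch; the constant \(\delta \) is a hypothetical lower bound on the symbol, introduced only to illustrate the trivial case.

```latex
\begin{aligned}
\langle T_\sigma f,f\rangle
  &= \int_X \sigma(x)\,\langle f,k_x\rangle\,\langle k_x,f\rangle\,d\mu(x)
   = \int_X \sigma(x)\,|\langle f,k_x\rangle|^2\,d\mu(x), \\
\langle T_{1_{X\setminus E}}f,f\rangle
  &= \int_{X\setminus E} |\langle f,k_x\rangle|^2\,d\mu(x), \\
\sigma \ge \delta > 0 \ \text{a.e.}
  &\ \Longrightarrow\ \langle T_\sigma f,f\rangle
     \ge \delta\int_X |\langle f,k_x\rangle|^2\,d\mu(x) = \delta\,\|f\|^2 .
\end{aligned}
```

The second line is the left-hand side of (4), and the third line is the trivial case in which the symbol is bounded away from zero.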

The organization of the paper closely follows the main steps in the proof. The proof outline that we use goes back at least to Havin and Jöricke [24]. Namely, to prove (5) it suffices to show that the operator \(T_{1_E}\) is compact and that its eigenspace for the eigenvalue 1 is trivial. The latter is proved by showing that from each nontrivial element of this eigenspace, using small translates, one can construct an infinite-dimensional eigenspace of a slightly “bigger” compact self-adjoint operator of the same form. In Sect. 2 we address the compactness problem for positive Toeplitz operators. We show that compactness can be characterized in terms of the vanishing property of its Berezin transform, a condition which in turn is closely connected to the thinness condition of Fernández and Galbis. In Sect. 3 we examine the linear independence problem for translations. We provide a fairly general condition (usually easy to check) on the group which guarantees linear independence of the translations. In Sect. 4 we combine the previous conclusions and state and prove our main result (Theorem 2). Finally, we show how our main result translates to more familiar settings, proving in particular (2) for thin sets and a fairly wide class of mother wavelet functions.
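The reduction described in the outline can be spelled out as follows. This is a sketch that uses only the standard spectral theory of compact self-adjoint operators; the shorthand \(T:=T_{1_E}\) is ours, and the identity \(T_{1}=I\) is the expansion formula (3).

```latex
\begin{aligned}
& T_{1_{X\setminus E}} = I - T, \qquad 0 \le \langle Tf,f\rangle \le \|f\|^2, \\
& \|T\| = \sup_{\|f\|=1}\langle Tf,f\rangle
   \quad\text{is an eigenvalue of } T \text{ whenever } T \text{ is compact and } T \neq 0, \\
& 1 \text{ is not an eigenvalue of } T \ \Longrightarrow\ \|T\| < 1
   \ \Longrightarrow\ \langle T_{1_{X\setminus E}}f,f\rangle \ge (1-\|T\|)\,\|f\|^2 .
\end{aligned}
```

Under these assumptions (5) holds with \(c = 1-\Vert T_{1_E}\Vert \).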

2 Compactness of Positive Toeplitz Operators

In this section we deal with the following question: Which non-negative symbols \(\sigma :X\rightarrow {{\mathbb {R}}}\) generate compact Toeplitz operators \(T_\sigma \)? We show that, under suitable assumptions on the continuous Parseval frame \(\{k_x\}_{x\in X}\), a necessary and sufficient condition for \(T_\sigma \) to be compact is that its “diagonal” \(\left\langle T_\sigma k_x,k_x\right\rangle \) vanishes at infinity. We now make this precise.

For a given bounded operator \(T:\mathcal {H}\rightarrow \mathcal {H}\) we define its Berezin transform \(\tilde{T}:X\rightarrow {\mathbb C}\) by \(\tilde{T}(x)=\left\langle T k_x,k_x\right\rangle \). In the case when T is a Toeplitz operator with symbol \(\sigma \) we denote its Berezin transform by \(\tilde{\sigma }\), i.e., \(\tilde{\sigma }(x)=\left\langle T_\sigma k_x,k_x\right\rangle \). In what follows, by \(y \rightarrow \infty \), we will mean that \(d(z,y) \rightarrow \infty \) for some (equivalently, all) \(z \in X\).

Definition 1

A measurable function \(\sigma :X\rightarrow \mathbb {C}\) is said to be thin if its Berezin transform vanishes at infinity, i.e.,

$$\begin{aligned} \tilde{\sigma }(y) := \int _X \sigma (x)\left| \langle k_x,k_y \rangle \right| ^2 \, d\mu (x) \rightarrow 0 \quad \text{ as } y \rightarrow \infty \end{aligned}$$
(6)

We will also say that a set \(E \subset X\) is thin if the indicator function of E satisfies (6).

The connection between compactness of a Toeplitz operator and the Berezin transform vanishing at infinity (called thinness here) is another quite old question [11, 36, 42]. The breakthrough results in the area of analytic function spaces were due to Axler and Zheng [4] for the Bergman space on the disk, \(A^p(\mathbb D)\), and to Suárez [39] for \(A^p(\mathbb B_n)\). Subsequently, these results were generalized to many different settings [7, 15, 38]. However, going beyond the realm of analytic function spaces, extra assumptions are made either on the Berezin transform or the decay rate of the symbol \(\sigma \) [5, 6, 8, 10, 12, 18, 41]. Herein, by restricting our study to nonnegative symbols, and assuming a Schur-type estimate on \(\{k_x\}\), we obtain a complete equivalence in Theorem 1 below.

The following proposition gives a somewhat more explicit alternative characterization of thinness.

Proposition 1

Let \(\sigma :X \rightarrow [0,1]\). The following are equivalent

  1. (i)

    \(\sigma \) is thin.

  2. (ii)

    \(\displaystyle \lim \limits _{y \rightarrow \infty } \int _{B(y,R)} \sigma (x) \, d\mu (x) = 0\) for some \(R>0\).

  3. (iii)

    \(\displaystyle \lim \limits _{y \rightarrow \infty } \int _{B(y,R)} \sigma (x) \, d\mu (x) = 0\) for all \(R>0\).

Proof

Fix an element \(e \in X\) which is arbitrary, but will be treated as the origin. Since X is homogeneous, for each \(x \in X\), there exists \(g_x \in G\) (not necessarily unique) such that \(g_xe=x\). For each x we pick one such \(g_x\) and fix it throughout the proof. By the invariance of \(\mu \) and d,

$$\begin{aligned} \int _{B(y,R)} \sigma (x) \, d\mu (x) = \int _{B(e,R)}\sigma (g_yx) \, d\mu (x). \end{aligned}$$

By our initial assumption, the metric d is proper, i.e., every ball is precompact. So, the ball \(B(e,R)\) can be covered by finitely many balls of a fixed radius. Therefore, (ii) implies (iii).

Next, consider the function \(h(x) = \langle k_e,k_x \rangle \). We have \(h(e)=\Vert k_e\Vert ^2 =1\) and h is continuous, so there exists \(\delta >0\) such that \(|h(x)| \ge \tfrac{1}{2} \) for \(x \in B(e,\delta )\). Thus,

$$\begin{aligned}{} & {} \dfrac{1}{4} \int _{B(e,\delta )}\sigma (g_yx) \, d\mu (x) \le \int _{B(e,\delta )}\sigma (g_yx)|h(x)|^2 \, d\mu (x) \\{} & {} \quad \le \int _X \sigma (x) \left| h(g_y^{-1}x)\right| ^2 \, d\mu (x) = \int _X \sigma (x)\left| \langle k_y,k_x \rangle \right| ^2 \, d\mu (x). \end{aligned}$$

Therefore (i) implies (ii). To show (iii) implies (i), let \(\varepsilon >0\). We can find \(R>0\) such that \(\int _{B(e,R)^c} \left| h(x)\right| ^2 \, d\mu (x) < \varepsilon /2\). For this R, by (iii), there exists \(N>0\) such that for \(d(e,y) \ge N\), \(\int _{B(e,R)}\sigma (g_yx) \, d\mu (x) \le \varepsilon /2\). Therefore,

$$\begin{aligned} \int _X \sigma (x)\left| \langle k_y,k_x \rangle \right| ^2 \, d\mu (x) = \left( \int _{B(e,R)}+\int _{B(e,R)^c}\right) \sigma (g_yx) \left| h(x)\right| ^2 \, d\mu (x) \le \varepsilon , \end{aligned}$$

for all \(y\in X\) with \(d(e,y) \ge N\). Here we used the fact that \(|h(x)| \le \Vert k_e\Vert \cdot \Vert k_x\Vert =1\).

\(\square \)
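As an illustration of criterion (iii), sets of finite measure are thin, and so are some sets of infinite measure. The sketch below takes \(\sigma =1_E\). The second example uses \(X={\mathbb {R}}\) with the Euclidean metric and Lebesgue measure; it is chosen purely for illustration and is not taken from the references.

```latex
% (a) \mu(E) < \infty. Given \varepsilon > 0, choose S with \mu(E \setminus B(e,S)) < \varepsilon;
%     this is possible because E \setminus B(e,S) decreases to the empty set as S \to \infty.
%     If d(e,y) \ge R + S, then B(y,R) \cap B(e,S) = \emptyset by the triangle inequality, so
\[
\int_{B(y,R)} 1_E \, d\mu \le \mu\bigl(E \setminus B(e,S)\bigr) < \varepsilon .
\]

% (b) X = \mathbb{R}, E = \bigcup_{n \ge 1} [n, n + 1/n], so |E| = \sum_n 1/n = \infty.
%     For y > R + 1, only intervals with y - R - 1 < n < y + R can meet (y-R, y+R),
%     and there are at most 2R + 2 such integers n, hence
\[
\bigl| E \cap (y-R, y+R) \bigr|
  \le \sum_{y-R-1 < n < y+R} \frac1n
  \le \frac{2R+2}{y-R-1} \xrightarrow[y\to\infty]{} 0 .
\]
```

For \(y \rightarrow -\infty \) the intersection in (b) is eventually empty, so this E satisfies (iii) although its measure is infinite.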

We are now ready to characterize compactness of Toeplitz operators when the Parseval frame satisfies some additional decay conditions.

Theorem 1

Let \((\mathcal {H},X,G,k,\mu ,d)\) be a Berezin quantization. Suppose that the continuous Parseval frame \(\{k_x\}_{x\in X}\) satisfies the following: there exists a weight \(w:X \rightarrow (0,\infty )\) and \(M>0\) such that

  1. (i)

    \(\displaystyle w(y)^{-1}\int _X \left| \left\langle k_x , k_y \right\rangle \right| w(x) \, d\mu (x) \le M\) for all \(y \in X\).

  2. (ii)

    \(\displaystyle \lim _{R \rightarrow \infty } \sup _{y \in X} w(y)^{-1} \int _{B(y,R)^c} \left| \left\langle k_x , k_y \right\rangle \right| w(x) \, d\mu (x) =0\).

  3. (iii)

    \(\left| \left\langle k_x , k_y \right\rangle \right| \rightarrow 0\) as \(d(x,y) \rightarrow \infty \).

Then \(\sigma \in L^\infty (X)\) is thin if and only if \(T_\sigma \) is compact.

Remark 1

In the case \(X=G\), it is often easier to check that \(x \mapsto \langle k_1,k_x\rangle w(x)\) is in \(L^1(X)\) for some submultiplicative weight w (\(w(xy) \le C w(x)w(y)\)). Then, (i) and (ii) follow from the group invariance of \(\left| \langle k_x,k_y \rangle \right| \). In this way, these conditions are related to the so-called analyzing vectors for the coorbit spaces of Gröchenig and Feichtinger [17].
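The claim in Remark 1 can be checked by a change of variables. The following is a sketch under the assumptions of the remark (\(X=G\), \(\mu \) left-invariant, w submultiplicative with constant C), where 1 denotes the identity of G and the substitution \(z=y^{-1}x\) is ours.

```latex
\begin{aligned}
w(y)^{-1}\int_G |\langle k_x,k_y\rangle|\,w(x)\,d\mu(x)
  &= w(y)^{-1}\int_G |\langle k_{y^{-1}x},k_1\rangle|\,w(y\cdot y^{-1}x)\,d\mu(x) \\
  &\le C\int_G |\langle k_{y^{-1}x},k_1\rangle|\,w(y^{-1}x)\,d\mu(x)
   = C\int_G |\langle k_1,k_z\rangle|\,w(z)\,d\mu(z).
\end{aligned}
```

This gives (i) with \(M = C\Vert \langle k_1,k_{\cdot }\rangle w\Vert _{L^1}\). Restricting the integral to \(x \notin B(y,R)\), which by the left invariance of d corresponds to \(z \notin B(1,R)\), produces the tail \(C\int _{B(1,R)^c} |\langle k_1,k_z\rangle | w(z)\, d\mu (z)\). It does not depend on y and tends to 0 as \(R\rightarrow \infty \) by dominated convergence, which is (ii).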

Proof

To prove necessity we only need to use (iii). First, we show that (iii) implies \(k_y {\overset{\textrm{w}}{\rightarrow }}0\) as \(y \rightarrow \infty \). We have \(\langle k_y,k_x \rangle \rightarrow 0\) for each \(x \in X\) as \(d(e,y) \rightarrow \infty \). Moreover, since \(\{k_x\}_{x\in X}\) is a continuous Parseval frame we have \(\left\| f\right\| ^2=\int _X\left| \left\langle f,k_x\right\rangle \right| ^2d\mu (x)\), and hence \(\overline{{{\,\textrm{span}\,}}}\{k_x\} = \mathcal {H}\). Let \( f\in \mathcal {H}\) and \(\varepsilon >0\). There exists \(f_\varepsilon \in {{\,\textrm{span}\,}}\{k_x\}\) such that \(\Vert f-f_\varepsilon \Vert \le \varepsilon /2\). Moreover, there exists M such that if \(d(e,y) > M\) then \(\left| \langle k_y,f_\varepsilon \rangle \right| \le \varepsilon /2\). Thus,

$$\begin{aligned} \left| \langle k_y,f \rangle \right| \le \Vert k_y\Vert \cdot \Vert f-f_\varepsilon \Vert + \left| \langle k_y,f_\varepsilon \rangle \right| \le \varepsilon \end{aligned}$$

whenever \(d(e,y) > M\). Thus, if \(T_\sigma \) is compact we have \(T_\sigma k_y \rightarrow 0\) as \(y\rightarrow \infty \), and hence \(\tilde{\sigma }(y)=\left\langle T_\sigma k_y,k_y\right\rangle \rightarrow 0\) as \(y\rightarrow \infty \).

We now prove sufficiency. For this we use (i) and (ii). Let \(\varepsilon >0\). There exists \(R>0\) such that

$$\begin{aligned} w(y)^{-1}\int _{B(y,R)^c} \left| \langle k_y,k_z \rangle \right| w(z) \, d\mu (z) \le \varepsilon \end{aligned}$$

for all \(y \in X\). First we estimate the “tails” using the Schur property of \(\{k_x\}\). For any \(f \in \mathcal {H}\),

$$\begin{aligned} \begin{aligned}&\int _X\left| \int _{B(x,R)^c} \sigma (y)\langle f, k_y \rangle \langle k_y,k_x \rangle \, d\mu (y)\right| ^2 \, d\mu (x) \\&\quad \le \Vert \sigma \Vert _\infty ^2 \int _X \left( \int _{B(x,R)^c} \left| \langle f,k_y \rangle \right| ^2 \left| \langle k_y,k_x \rangle \right| w(y)^{-1} \,d\mu (y)\right) \\&\quad \quad \times \left( \int _{B(x,R)^c} \left| \langle k_y,k_x \rangle \right| w(y) \, d\mu (y) \right) d\mu (x) \\&\quad \le \varepsilon \Vert \sigma \Vert _\infty ^2\int _X \left| \langle f,k_y \rangle \right| ^2 \int _X w(x)\left| \langle k_y,k_x \rangle \right| w(y)^{-1} \,d\mu (x) \, d\mu (y) \\&\quad \le M \Vert \sigma \Vert _\infty ^2 \varepsilon \Vert f\Vert ^2. \end{aligned} \end{aligned}$$

Now, since \(\sigma \) is thin and bounded, by Proposition 1, there exists \(S>0\) such that for \(d(e,y) \ge S\), \(\int _{B(y,R)}\left| \sigma \right| ^2 \, d\mu \le \varepsilon \). Thus,

$$\begin{aligned}{} & {} \int _{B(e,S)^c}\left| \int _{B(x,R)} \sigma (y)\left\langle f , k_y \right\rangle \left\langle k_y , k_x \right\rangle \, d\mu (y) \right| ^2 \, d\mu (x) \\{} & {} \quad \le \varepsilon \int _X\int _X \left| \left\langle f , k_y \right\rangle \right| ^2 \left| \left\langle k_x , k_y \right\rangle \right| ^2 \, d\mu (y)\, d\mu (x) = \varepsilon \Vert f\Vert ^2. \end{aligned}$$

Now, let \(f_n {\overset{\textrm{w}}{\rightarrow }}0\) be arbitrary. Then, \(\Vert f_n\Vert \) is uniformly bounded.

$$\begin{aligned} \Vert T_\sigma f_n\Vert ^2&=\int _X \left| \left\langle T_\sigma f_n , k_x \right\rangle \right| ^2 \, d\mu (x) \\&= \int _X\left| \int _{B(x,R)}+\int _{B(x,R)^c} \sigma (y)\left\langle f_n , k_y \right\rangle \left\langle k_y , k_x \right\rangle \, d\mu (y)\right| ^2 \, d\mu (x) \\&\le \int _{B(e,S)} + \int _{B(e,S)^c} 2\left| \int _{B(x,R)} \sigma (y)\left\langle f_n , k_y \right\rangle \left\langle k_y , k_x \right\rangle \, d\mu (y)\right| ^2\, d\mu (x) + C \varepsilon \\&\le 2\int _{B(e,S)}\left| \int _{B(x,R)} \sigma (y)\left\langle f_n , k_y \right\rangle \left\langle k_y , k_x \right\rangle \, d\mu (y)\right| ^2\, d\mu (x) + C \varepsilon . \end{aligned}$$

Since the integrand goes to zero pointwise, applying the dominated convergence theorem we obtain

$$\begin{aligned} \limsup _{n \rightarrow \infty } \left\| T_\sigma f_n\right\| ^2 \le C \varepsilon . \end{aligned}$$

But \(\varepsilon \) is arbitrary so \(\lim \limits _{n\rightarrow \infty } T_\sigma f_n =0\). Therefore \(T_\sigma \) is compact. \(\square \)

3 Independence of Translations

The action of G on X naturally induces the following translation action on \(L^2(X, d\mu )\). For a given \(h \in G\), the translation operator \(\tau _h: L^2(X, d\mu )\rightarrow L^2(X, d\mu )\) is defined by \(\tau _h F(x)=F(h^{-1} x)\) for \(F \in L^2(X, d\mu )\). We denote the translated function by \(F_h:=\tau _h F\). We can view \(\mathcal {H}\) as a subspace of \(L^2(X)\) by identifying \(f \in \mathcal {H}\) with \(x \mapsto \langle f,k_x \rangle \in L^2(X, d\mu )\). In this way, \(\tau _h\) induces the following shift operator \(S_h:\mathcal {H}\rightarrow \mathcal {H}\) defined by

$$\begin{aligned} S_h f = \int _X \langle f,k_{h^{-1}x} \rangle k_{x} \, d\mu (x) = \int _X \langle f,k_{x} \rangle k_{hx} \, d\mu (x). \end{aligned}$$

It is easy to check that \(S_h^*=S_{h^{-1}}\) and \(S_h\) is bounded. While it is not true in general that \(S_hk_x=k_{hx}\), this does hold if \(\langle k_{hx},k_{hy} \rangle = \langle k_x,k_y \rangle \), and then \(S_h\) is unitary so that \(S_hT_\sigma S_h^* = T_{\sigma _h}\). For this reason, we introduce the subgroup of G,

$$\begin{aligned} G_\tau =\{ h \in G : \langle k_{hx},k_{hy} \rangle = \langle k_x,k_y \rangle \text{ for } \text{ all } x,y \in X \}. \end{aligned}$$
(7)

In this way, \(h \mapsto S_h\) is a homomorphism from \(G_\tau \rightarrow \mathcal {U}(\mathcal {H})\).
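For \(h \in G_\tau \) both claims about \(S_h\) can be verified directly from its definition and (3). The computation is a sketch; it uses the translation notation \(\sigma _h(x)=\sigma (h^{-1}x)\), the invariance of \(\mu \), and the boundedness of \(S_h\) to move it inside the weak integral.

```latex
\begin{aligned}
S_h k_y &= \int_X \langle k_y,k_{h^{-1}x}\rangle\,k_x\,d\mu(x)
         = \int_X \langle k_{hy},k_x\rangle\,k_x\,d\mu(x) = k_{hy}, \\
S_h T_\sigma S_h^{*} f
        &= \int_X \sigma(x)\,\langle S_h^{*}f,k_x\rangle\,S_h k_x\,d\mu(x)
         = \int_X \sigma(x)\,\langle f,k_{hx}\rangle\,k_{hx}\,d\mu(x) \\
        &= \int_X \sigma(h^{-1}x)\,\langle f,k_x\rangle\,k_x\,d\mu(x) = T_{\sigma_h}f .
\end{aligned}
```

The first line uses \(\langle k_y,k_{h^{-1}x}\rangle =\langle k_{hy},k_x\rangle \), which is the defining property of \(G_\tau \). The second uses \(\langle S_h^{*}f,k_x\rangle =\langle f,S_hk_x\rangle =\langle f,k_{hx}\rangle \).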

We are interested in conditions on the action under which different shifts of any given non-zero \(f\in \mathcal {H}\) form a linearly independent set. Observe first that this problem can be reduced to the more studied problem of linear independence of translations. Indeed, for any nonzero \(f\in \mathcal {H}\) define \(F(x) = \langle f,k_x \rangle \). Then \(F \in L^2(X)\) and \( \langle S_h f,k_x \rangle = F(h^{-1}x)\) for \(h \in G_\tau \). The last relation clearly implies that the linear independence of \(\{S_{h_1}f, S_{h_2}f, \dots , S_{h_n}f\}\) is equivalent to the linear independence of the corresponding translations of F, \(\{F_{h_1}, F_{h_2}, \dots , F_{h_n}\}\).

Note, however, that the linear independence problem for translations is in general very difficult. Even for some fairly simple group actions, translations may fail to be linearly independent. Specifically, in the case of the affine group, \(\chi _{[0,1]}(x)=\chi _{[0,1]}(2x)+\chi _{[0,1]}(2x-1)\) is a simple example. Moreover, for the Heisenberg group, the linear independence problem for time-frequency shifts (known as the HRT conjecture [25]) is still open and is widely considered to be very difficult.
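The affine-group example can be checked directly. The short verification below is a sketch; the equality holds almost everywhere, which is all that matters in \(L^2({\mathbb {R}})\).

```latex
\begin{aligned}
\chi_{[0,1]}(2x) &= \chi_{[0,1/2]}(x), \qquad \chi_{[0,1]}(2x-1) = \chi_{[1/2,1]}(x), \\
\chi_{[0,1/2]}(x) + \chi_{[1/2,1]}(x) &= \chi_{[0,1]}(x) \quad \text{for } x \neq \tfrac12 .
\end{aligned}
```

Thus \(\chi _{[0,1]}\) is a nontrivial linear combination of two of its own dilated and translated copies.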

However, in the case of Abelian groups G, a theorem of Edgar and Rosenblatt [14] says that the translations of any non-zero \(F\in L^2(G)\) are linearly independent as long as G has no nontrivial compact subgroups. Our main goal will be to extend this result to the case of homogeneous spaces X.

Let \(\Gamma \subset G\). Denote

$$\begin{aligned} \mathbb {C}\Gamma = \left\{ \sum c_nh_n: h_n \in \Gamma , c_n \in \mathbb {C}\right\} . \end{aligned}$$

The elements \(\theta \in \mathbb {C}\Gamma \) act on \(F \in L^2(X)\) by

$$\begin{aligned} \theta F = \sum c_n F_{h_n}. \end{aligned}$$

Thus, a collection of translations \(\Gamma F = \{ F_h: h\in \Gamma \}\) is linearly independent if and only if \(\theta F \ne 0\) for all \(0 \ne \theta \in \mathbb {C}\Gamma \). This leads to the following definition.

Definition 2

We say that \(L^2(X)\) has linearly independent \(\Gamma \)-translations if for all \(F \in L^2(X)\), \(\theta \in \mathbb {C}\Gamma \),

$$\begin{aligned} \theta F =0 \text{ implies } F=0 \text{ or } \theta =0. \end{aligned}$$

We want to study the independence of \(\Gamma \)-translates of \(L^2(X)\) by reducing it to the case \(L^2(\Gamma )\) where we can use the result of Edgar and Rosenblatt. The following proposition establishes this reduction. It can be viewed as a generalization of a result from [32] to non-discrete \(\Gamma \).

Proposition 2

Let \((X,\mu )\) be a measure space. Let \(\Gamma \) be a unimodular locally compact group acting on X in such a way that \(\mu \) is invariant. Then \(L^2(X)\) has linearly independent \(\Gamma \)-translations if \(L^2(\Gamma )\) has linearly independent \(\Gamma \)-translations.

In proving this, we will make crucial use of the following identity.

Lemma 1

Let \(\Gamma \) and \((X,\mu )\) be as above. Then there exists a measure \(\nu \) on \(X / \Gamma \) such that

$$\begin{aligned} \int _{X / \Gamma } \int _\Gamma F(\gamma x) \, d\lambda (\gamma ) \, d\nu (\Gamma x) = \int _X F(x) \, d\mu (x) \end{aligned}$$
(8)

holds for all \(0 \le F \in L^1(X,\mu )\), where \(\lambda \) is the Haar measure on \(\Gamma \).

We are not aware of such a fact in the literature. However, in the case where X is a group and \(\Gamma \le X\), it is well-known [20, 35], so we show in the Appendix that a slight modification of these proofs yields (8) in our setting.

Proof of Proposition 2

Let \(F \in L^2(X)\) be nonzero. Let E denote the set of \(\Gamma x \in X /\Gamma \) with \(\int \left| F(\gamma x)\right| ^2\,d \lambda (\gamma ) \in \{0,\infty \}\). By formula (8), replacing F with \(\left| F\right| ^2\), it must be that \(\nu (E^c)>0\). So, there exists \(\Gamma x \in E^c\) such that \(g(\gamma ):= F(\gamma x)\) satisfies \(g \in L^2(\Gamma )\) and \(g \ne 0\). Thus, for each \(\theta \in {\mathbb C}\Gamma \), if \(\theta F=0\), then \(\theta g=0\) which implies \(\theta =0\). \(\square \)

Applying the known result for translations on Abelian groups from [14], we obtain the following corollary.

Corollary 1

Let \(\Gamma \) be an Abelian subgroup of \(G_\tau \) with no nontrivial compact subgroups. For any \(\{h_n\} \subset \Gamma \), and \(f \in \mathcal {H}\), \(\{S_{h_n}S_{h_{n-1}}\cdots S_{h_1}f\}_{n=1}^\infty \) is linearly independent.

Proof

Let \(f \in \mathcal {H}\) and define \(F(x) = \langle f,k_x \rangle \). Then \(F \in L^2(X)\) and \( \langle S_h f,k_x \rangle = F(h^{-1}x)\). Since \(L^2(\Gamma )\) has linearly independent \(\Gamma \)-translations by the result of Edgar and Rosenblatt [14], \(L^2(X)\) has linearly independent \(\Gamma \)-translations by Proposition 2.

4 Thin Toeplitz Operators

In this section, we show that from one compact Toeplitz operator \(T_\sigma \), one can create \(T_\rho \) with \(\rho \ge \sup _{n}\sigma _{h_n}\) for some infinite, but small, sequence of translates \(\{h_n\}\), while \(T_\rho \) remains compact.

The set of positive linear operators \(B(\mathcal {H})^+\) on a Hilbert space \(\mathcal {H}\) possesses a partial ordering. We say \(A \ge B\) if \(A-B\) is a positive operator, i.e., \(\langle (A-B)f,f\rangle \ge 0\) for all \(f \in \mathcal {H}\). Letting \((\sigma \vee \rho )(x) = \max \{\sigma (x),\rho (x)\}\), we have

  1. (i)

    \(T_\sigma ,T_\rho \le T_{\sigma \vee \rho }\).

  2. (ii)

    If \(T_\sigma \) and \(T_\rho \) are compact, so is \(T_{\sigma \vee \rho }\).

  3. (iii)

    \(\Vert T_{\sigma \vee \rho }\Vert \le \max \{ \Vert \sigma \Vert _\infty , \Vert \rho \Vert _\infty \}\)

Since \(\langle T_\sigma f,f \rangle = \int \sigma (x) \left| \langle f,k_x\rangle \right| ^2 \, d\mu (x)\), ordering of the Toeplitz operators follows from the ordering of their symbols. (ii) follows from the fact that \(T_{\sigma \vee \rho } \le T_\sigma + T_\rho \). (iii) is a consequence of the trivial estimate that \(\Vert T_\sigma \Vert \le \Vert \sigma \Vert _\infty \).

Lemma 2

Let \(\sigma \) be thin and \(\{h_k\} \subset G\) such that \(h_k \rightarrow 1\). Then, there exists \(\{m_k\}\) such that

$$\begin{aligned} \rho (x) = \sup _k \sigma _{h_{m_k}h_{m_{k-1}}\cdots h_{m_1}}(x) \end{aligned}$$

is thin.

Proof

Let \(K_y(x)= \left| \langle k_x,k_y \rangle \right| ^2 \in L^1(X)\). Then,

$$\begin{aligned} \tilde{\sigma }(y) = \int _X \sigma (x)K_y(x) \, d\mu (x). \end{aligned}$$

Therefore, by the boundedness of \(\sigma \) and the continuity of the translations on \(L^1(X)\),

$$\begin{aligned} \left| \tilde{\sigma }(y)-\tilde{\sigma }_h(y)\right| = \left| \int _X \sigma (x)[ K_y(x)-K_y(hx)] \, d\mu (x)\right| \rightarrow 0 \end{aligned}$$

as \(h \rightarrow 1\) for each y. Since \(h_m \rightarrow 1\) we can pick a subsequence such that \(\{\prod _{finite}h_m\} \subset B_1\). From this subsequence, we pick another one. Set \(\rho _0=\sigma \). Then, since \(\sigma \) is thin, there exists \(R_0\) such that \(\tilde{\rho }_0 \le 1\) outside \(B_{R_0}\). Then, pick \(m_0\) such that

$$\begin{aligned} \left| \tilde{\rho }_0 - \tilde{\rho }_{0,h_{m_0}}\right| \le 1 \end{aligned}$$

on \(B_{R_0+1}\). Then,

$$\begin{aligned} \left| \tilde{\rho }_0 - \tilde{\rho }_{0,h_{m_0}}\right| \le 3. \end{aligned}$$

Now, having \(\rho _j\) (thin) and \(h_{m_j}\) for \(j < k\), pick \(m_k\) in the same manner as above (finding \(R_k\) since \(\rho _j\) are all thin) so that

$$\begin{aligned} \left| \tilde{\rho }_j - \tilde{\rho }_{j,h_{m_k}}\right| \le 3\cdot 2^{-k} \end{aligned}$$

for all \(j < k\). Then set \(\rho _k = \max \{\rho _{k-1},\rho _{k-1,h_{m_k}}\}\).

Now we are ready to prove that \(\rho =\lim \rho _k\) is thin. Let \(\varepsilon >0\) and pick j such that \(3\sum _{k=j}^\infty 2^{-k} \le \tfrac{\varepsilon }{2}\). Let \(R>0\) such that \(\tilde{\sigma }(y) \le \tfrac{\varepsilon }{2^{j+1}}\) for \(d(1,y) \ge R\). Then, for \(d(1,y) \ge R+1\), \(\tilde{\sigma }_{h}(y) \le \tfrac{\varepsilon }{2^{j+1}}\) for all \(h \in B_1\). Then, by the fact that \(|P_j|=2^j\) (\(P_j\) is the power set of \(\{0,1,2,\ldots ,j\}\)), and noticing that \(\rho _k = \max _{I \in P_k} \sigma _{\prod _{i \in I}h_{m_i}}\),

$$\begin{aligned} \tilde{\rho }_j(y) \le \sum _{I \in P_j} \tilde{\sigma }_{\prod _{i \in I}h_{m_i}}(y) \le 2^j(\tfrac{\varepsilon }{2^{j+1}}) = \tfrac{\varepsilon }{2}. \end{aligned}$$

By construction of \(h_{m_k}\), \(\left| \tilde{\rho }_k -\tilde{\rho }_{k+1}\right| \le 3\cdot 2^{-k}\). Therefore,

$$\begin{aligned} \tilde{\rho }(y) \le \tilde{\rho }_j(y) + \sum _{k=j}^\infty \left| \tilde{\rho }_k(y)-\tilde{\rho }_{k+1}(y)\right| \le \tfrac{\varepsilon }{2} + 3\sum _{k=j}^\infty 2^{-k} \le \varepsilon . \end{aligned}$$

4.1 Main Results

We now assemble the pieces from the previous sections, specifically Theorem 1 and Corollary 1 to prove our main results.

Theorem 2

Let \((\mathcal {H},X,G,k,\mu ,d)\) be a Berezin quantization. Suppose there exists \(w:X \rightarrow (0,\infty )\) and \(M>0\) such that

  1. (i)

    \(\displaystyle w(y)^{-1}\int _X \left| \langle k_x,k_y\rangle \right| w(x) \, d\mu (x) \le M\) for all \(y \in X\).

  2. (ii)

    \(\displaystyle \lim _{R \rightarrow \infty } \sup _{y \in X} w(y)^{-1} \int _{B(y,R)^c} \left| \langle k_x,k_y \rangle \right| w(x) \, d\mu (x) =0\).

Suppose also that \(G_\tau \) contains a non-discrete Abelian subgroup which has no nontrivial compact subgroups. Then, if \(\sigma :X \rightarrow [0,1]\) is thin, there exists \(c>0\) such that

$$\begin{aligned} \langle T_{1-\sigma }f,f \rangle \ge c \Vert f\Vert ^2 \end{aligned}$$

for all \(f \in \mathcal {H}\).

Proof

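Let \(\Gamma \) denote the non-discrete Abelian subgroup of \(G_\tau \) given by the hypothesis. Suppose there exists \(f \ne 0\), normalized so that \(\Vert f\Vert =1\), such that \(T_\sigma f=f\). By Lemma 2, applied to a sequence in \(\Gamma \) converging to the identity, there exists a sequence \(\{h_k\}_{k=1}^\infty \subset \Gamma \) such that \(\rho (x)=\sup _{k} \sigma _{h_k \cdots h_1}(x)\) is thin. Define \(f_k = S_{h_{k}}\cdots S_{h_1}f\). Then,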

$$\begin{aligned} 1 \ge \langle T_\rho f_k,f_k \rangle \ge \langle S_{h_k}\cdots S_{h_1}T_\sigma (S_{h_k}\cdots S_{h_1})^*f_k,f_k \rangle = \langle T_\sigma f,f \rangle = 1. \end{aligned}$$

This implies \(T_\rho f_k=f_k\). However, the collection \(\{f_k\}\) is linearly independent by Corollary 1, since \(\Gamma \) is Abelian, has no nontrivial compact subgroups, and is contained in \(G_\tau \). Therefore the eigenspace of \(T_\rho \) corresponding to the eigenvalue 1 is infinite-dimensional, which contradicts the compactness of \(T_\rho \) given by Theorem 1. \(\square \)
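The passage from "1 is not an eigenvalue of \(T_\sigma \)" to the lower bound is left implicit in the proof. The following is a sketch of that step, under the assumptions (not restated in this section) that \(T_1\) is the identity on \(\mathcal {H}\) and that \(T_\sigma \) is compact by Theorem 1 when \(\sigma \) is thin. Since \(0 \le \sigma \le 1\), the operator \(T_\sigma \) is self-adjoint with \(0 \le T_\sigma \le I\), so its norm is either zero or its largest eigenvalue. If 1 is not an eigenvalue, then \(\Vert T_\sigma \Vert <1\) and

$$\begin{aligned} \langle T_{1-\sigma }f,f \rangle = \Vert f\Vert ^2 - \langle T_\sigma f,f \rangle \ge (1-\Vert T_\sigma \Vert )\Vert f\Vert ^2, \end{aligned}$$

so one may take \(c = 1-\Vert T_\sigma \Vert \). In the same spirit, \(\langle T_\rho f_k,f_k \rangle = \Vert f_k\Vert ^2\) together with \(0 \le I - T_\rho \) gives \((I-T_\rho )^{1/2}f_k=0\), which is the claim \(T_\rho f_k = f_k\).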

Properties (i) and (ii) often follow from the integrability of \(\langle k_x,k_e \rangle \), see Remark 1. In case the decay of \(\langle k_x,k_y\rangle \) is not sufficient for (i) and (ii), we can obtain something weaker.

Theorem 3

Let \((\mathcal {H},X,G,k,\mu ,d)\) be a Berezin quantization and suppose \(G_\tau \) contains a non-discrete Abelian subgroup which has no nontrivial compact subgroups. If \(\sigma :X \rightarrow [0,1]\) is in \(L^1(X)\), then there exists \(c>0\) such that

$$\begin{aligned} \langle T_{1-\sigma }f,f \rangle \ge c \Vert f\Vert ^2 \end{aligned}$$

for all \(f \in \mathcal {H}\).

Proof

First, let us see that \(\sigma \in L^1(X)\) implies \(T_\sigma \) is compact. Indeed, if \(f_n {\overset{\textrm{w}}{\rightarrow }}0\), then \(\langle f_n,k_x \rangle \rightarrow 0\) for each \(x \in X\). Then, by dominated convergence,

$$\begin{aligned} \langle T_\sigma f_n,f_n \rangle = \int \sigma (x) \left| \langle f_n,k_x\rangle \right| ^2 \, d\mu (x) \rightarrow 0. \end{aligned}$$

Secondly, since the translations are continuous on \(L^1(X)\), one can find \(h_k\) such that \(\Vert \sigma _{h_k}-\sigma \Vert _{L^1} \le 2^{-k}\), which implies that \(\rho (x) = \sup _{k}\sigma _{h_k\cdots h_1}(x)\) is still in \(L^1\). Hence \(T_\rho \) is compact by the first step, and the rest of the proof follows that of Theorem 2. \(\square \)
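The summability behind the claim \(\rho \in L^1\) can be made explicit. The following is a sketch, assuming (as the notation suggests, but not verified here) that \(\sigma _{hg} = (\sigma _h)_g\) and that \(\mu \) is invariant under the translations involved, so that \(\Vert u_g\Vert _{L^1} = \Vert u\Vert _{L^1}\). Writing \(g_k = h_k\cdots h_1\), which is notation introduced only for this sketch, and telescoping,

$$\begin{aligned} \rho \le \sigma _{g_1} + \sum _{k=1}^\infty \left| \sigma _{g_{k+1}}-\sigma _{g_k}\right| , \qquad \Vert \sigma _{g_{k+1}}-\sigma _{g_k}\Vert _{L^1} = \Vert \sigma _{h_{k+1}}-\sigma \Vert _{L^1} \le 2^{-(k+1)}, \end{aligned}$$

so that \(\Vert \rho \Vert _{L^1} \le \Vert \sigma \Vert _{L^1} + \tfrac{1}{2}\).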

5 Applications

5.1 Short-Time Fourier Transform on LCA Groups

Theorem 3 is similar to Benedicks' theorem (also called the Amrein-Berthier theorem or the qualitative uncertainty principle) for the Plancherel groups, proved in [2], which states that if \(f \in L^2(G, \mu )\) and \(\hat{f} \in L^2(\hat{G}, \hat{\mu })\) are supported on sets of finite \(\mu \) and \(\hat{\mu }\) measure, respectively, then \(f=0\). This uncertainty principle limits the joint time-frequency distribution of these functions. Namely, it states that the joint time-frequency support (a set in \(\hat{G} \times G\)) cannot be contained in a product set of finite \(\hat{\mu }\otimes \mu \) measure.

However, looking at other joint time-frequency distributions, such as the Wigner distribution, the ambiguity function, or the short-time Fourier transform (STFT), yields a stronger result. We focus only on the STFT, which is defined, for \(f,\phi \in L^2(G)\), by

$$\begin{aligned} V_\phi f(p,q) = \int _{G} f(t)\overline{p(q^{-1}t)\phi (q^{-1}t)} \, d\mu (t) \end{aligned}$$

for \((p,q) \in \hat{G} \times G\). We will only deal with locally compact groups G which are Abelian and second countable. The second countability is used to ensure that the invariant metric d is proper. In this way, by the Plancherel theorem for G, one has Moyal’s formula [23, Theorem 3.2.1]

$$\begin{aligned} \langle V_\phi f, V_\psi g \rangle = \langle f,g \rangle \cdot \langle \psi ,\phi \rangle , \end{aligned}$$

which shows that \(\phi _{p,q}(x) = p(q^{-1}x)\phi (q^{-1}x)\) is a Parseval frame for \(L^2(G)\) if \(\Vert \phi \Vert =1\). This implies \(\Vert \phi \Vert \cdot \Vert f\Vert = \Vert V_\phi f\Vert \). In order to apply Theorem 2, we take \({\hat{G} \times G}\) to be the homogeneous space with the measure \(\hat{\mu }\otimes \mu \) acted on by the “Heisenberg” group \(H(G):= {\hat{G} \times G}\times \mathbb {T}\) with the operation

$$\begin{aligned} (p',q',z')(p,q,z):= (pp',qq',zz' p(q')). \end{aligned}$$

H(G) acts on \({\hat{G} \times G}\) by \((p,q,z)(p',q') = (pp',qq')\). It can also be checked (cf. [23, Lemma 3.1.3]) that

$$\begin{aligned} \left| \langle \phi _{(p,q,z)(p',q')},\phi _{(p,q,z)(p'',q'')} \rangle \right| = \left| \langle \phi _{p',q'},\phi _{p'',q''} \rangle \right| \end{aligned}$$

and moreover \(\langle \phi _{(1,q,1)(p',q')},\phi _{(1,q,1)(p'',q'')} \rangle = \langle \phi _{p',q'},\phi _{p'',q''} \rangle \) so that \(\{1_{\hat{G}}\} \times G \times \{1\} \le H(G)_\tau \). If we had used the more conventional definition of \(V_\phi f(p,q)\) (replacing \(p(q^{-1}t)\) with p(t)), then \(\hat{G} \times \{1_G\} \times \{1\} \le H(G)_\tau \) instead.
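As a sanity check of Moyal's formula and of the two invariance relations above, one can work in a finite toy model. The sketch below takes \(G={\mathbb {Z}}_N\), identifies \(\hat{G}\) with \({\mathbb {Z}}_N\) via \(p(t)=e^{2\pi i pt/N}\), and gives \(\hat{G}\) the counting measure divided by \(N\). These choices, the value \(N=8\), the test vectors, and the helper names `chi`, `atom`, `inner`, and `stft_energy` are illustrative assumptions and do not come from the source. A finite group is compact, so it does not satisfy the hypotheses of the theorems in this section; only the algebraic identities are illustrated.

```python
import cmath

N = 8  # order of the toy group Z_N (illustrative choice)

def chi(p, t):
    """Character p evaluated at t: exp(2*pi*i*p*t/N)."""
    return cmath.exp(2j * cmath.pi * ((p * t) % N) / N)

def atom(phi, p, q):
    """phi_{p,q}(x) = p(x - q) * phi(x - q), with indices taken mod N."""
    return [chi(p, x - q) * phi[(x - q) % N] for x in range(N)]

def inner(u, v):
    """Inner product on l^2(Z_N), linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def stft_energy(f, phi):
    """Squared L^2 norm of V_phi f; the dual group carries counting measure / N."""
    total = 0.0
    for p in range(N):
        for q in range(N):
            total += abs(inner(f, atom(phi, p, q))) ** 2
    return total / N

# Arbitrary test vectors (hypothetical data, chosen only for the check).
phi = [1, 2, 0, -1, 0.5, 0, 0, 1j]
f = [1, -1, 2j, 0, 3, 0, 1, 0.5]

# Moyal's formula with g = f and psi = phi: ||V_phi f||^2 = ||f||^2 ||phi||^2.
lhs = stft_energy(f, phi)
rhs = inner(f, f).real * inner(phi, phi).real

# Translating both atoms by the same q preserves the inner product exactly;
# modulating both by the same p preserves only its modulus.
base = inner(atom(phi, 1, 2), atom(phi, 3, 5))
shifted = inner(atom(phi, 1, 2 + 4), atom(phi, 3, 5 + 4))
modulated = inner(atom(phi, 1 + 2, 2), atom(phi, 3 + 2, 5))

# All three discrepancies should be at the level of rounding error.
print(abs(lhs - rhs), abs(base - shifted), abs(abs(base) - abs(modulated)))
```

The exact invariance under translations in \(q\), as opposed to invariance only in modulus under modulations in \(p\), is what singles out \(\{1_{\hat{G}}\} \times G \times \{1\}\) in this convention.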

Theorem 4

Let G be a non-compact second countable LCA group. For any \(\phi \in L^2(G)\), \(\sigma :{\hat{G} \times G}\rightarrow [0,1]\) is thin if and only if

$$\begin{aligned} T_\sigma f = \int _{\hat{G} \times G}\sigma (p,q)\langle f,\phi _{p,q} \rangle \phi _{p,q} \, d({\hat{\mu }\otimes \mu })(p,q) \end{aligned}$$

is compact.

Proof

To apply our compactness characterization, Theorem 1, let us first show that for \(\phi \in L^2(G)\), \(\langle \phi _{p,q},\phi _{p',q'} \rangle \rightarrow 0\) as \(d((p,q),(p',q')) \rightarrow \infty \). By invariance of the frame, it is enough to check \(\langle \phi ,\phi _{p,q} \rangle \rightarrow 0\) as \((p,q) \rightarrow \infty \). First, if \(p \rightarrow \infty \) and \(q\) remains bounded, then for each \(q\), by the Riemann-Lebesgue Lemma, \(\lim _{p \rightarrow \infty }\langle \phi ,\phi _{p,q}\rangle = 0\). Therefore, since \(q\) remains in a compact set, \(\langle \phi ,\phi _{p,q} \rangle \rightarrow 0\). Otherwise, we consider \(q \rightarrow \infty \). Let \(\varepsilon >0\). Since \(\phi \in L^2(G)\), there exists \(R\) such that

$$\begin{aligned} \int _{B(1_G,R)^c} \left| \phi \right| ^2 \, d\mu \le \varepsilon . \end{aligned}$$

Moreover, there exists \(M\) such that for \(d(q,1_G) > M\), \(B(q^{-1},R) \subset B(1_G,R)^c\). Then, applying the Cauchy-Schwarz inequality to each integral and using the invariance of \(\mu \),

$$\begin{aligned} \left| \langle \phi ,\phi _{(p,q)} \rangle \right| \le \left( \int _{B(1_G,R)} + \int _{B(1_G,R)^c} \right) \left| \phi \right| \left| \phi (q^{-1}\cdot )\right| \, d\mu \le 2\Vert \phi \Vert \varepsilon ^{1/2}. \end{aligned}$$

Applying Theorem 1 gives the necessity. To show sufficiency, consider the Feichtinger algebra [16]

$$\begin{aligned} S_0(G):=\{ \phi \in L^2(G): V_\phi \phi \in L^1({\hat{G} \times G})\}. \end{aligned}$$

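It is known that \(S_0(G)\) is dense in \(L^2(G)\), see [26, Lemma 4.19]. First note that \(T_\sigma \) is compact for any \(\phi \in S_0(G)\) by Theorem 1. For windows \(\phi ,\psi \in L^2(G)\), write \(T_\sigma ^{\phi ,\psi }\) for the operator that analyzes with \(\phi \) and synthesizes with \(\psi \), so that \(T_\sigma = T_\sigma ^{\phi ,\phi }\). By the Cauchy-Schwarz inequality and Moyal's formula,

$$\begin{aligned} \left| \langle T_\sigma ^{\phi ,\psi }f,g \rangle \right|&= \left| \int \sigma (p,q)\langle f,\phi _{p,q} \rangle \langle \psi _{p,q},g \rangle \, d({\hat{\mu }\otimes \mu })(p,q) \right| \\&\le \Vert \sigma \Vert _\infty \Vert f\Vert _{L^2(G)} \Vert g\Vert _{L^2(G)} \Vert \psi \Vert _{L^2(G)} \Vert \phi \Vert _{L^2(G)}. \end{aligned}$$

Therefore, we have \(\Vert T_\sigma ^{\phi ,\psi }\Vert \le \Vert \sigma \Vert _\infty \Vert \phi \Vert _{L^2(G)} \Vert \psi \Vert _{L^2(G)}\). Moreover, \(T_\sigma ^{\phi ,\phi }-T_\sigma ^{\phi ',\phi '} = T_\sigma ^{\phi -\phi ',\phi } + T_\sigma ^{\phi ',\phi -\phi '}\), so that \(\Vert T_\sigma ^{\phi ,\phi }-T_\sigma ^{\phi ',\phi '}\Vert \le \Vert \sigma \Vert _\infty \Vert \phi -\phi '\Vert _{L^2(G)} (\Vert \phi \Vert _{L^2(G)}+\Vert \phi '\Vert _{L^2(G)})\). Given \(\phi \in L^2(G)\), choosing \(\phi ' \in S_0(G)\) close to \(\phi \) shows that \(T_\sigma ^{\phi ,\phi }\) is a norm limit of compact operators. This concludes the proof since the compact operators are closed in the operator norm topology. \(\square \)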

Putting this together with Theorem 2, we obtain the following uncertainty principle (see [19] for the case \(G={{\mathbb {R}}}^d\) and \(\sigma \) an indicator function).

Theorem 5

Let G be a second countable LCA group containing a non-discrete subgroup \(\Gamma \) such that \(\Gamma \) has no nontrivial compact subgroups. Let \(\Vert \phi \Vert _{L^2(G)}=1\). If \(\sigma :{\hat{G} \times G}\rightarrow [0,1]\) is thin, then there exists \(c>0\) such that

$$\begin{aligned} \langle T_{1-\sigma }f,f \rangle = \int _{{\hat{G} \times G}} (1-\sigma )\left| V_\phi f\right| ^2 \, d({\hat{\mu }\otimes \mu }) \ge c \Vert f\Vert ^2 \end{aligned}$$

for all \(f \in L^2(G)\).

Proof

The subgroup \(\{1_{\hat{G}}\} \times \Gamma \times \{1\} \le H(G)\) gives the independence of translations, and the equivalence in Theorem 4 (\(\sigma \) is thin if and only if \(T_\sigma \) is compact) concludes the proof using the argument of Theorem 2. \(\square \)

Corollary 2

Under the assumptions of Theorem 5, if \(E \subset {\hat{G} \times G}\) is thin, then there exists \(c>0\) such that

$$\begin{aligned} \int _{({\hat{G} \times G})\backslash E} \left| V_\phi f\right| ^2 \, d({\hat{\mu }\otimes \mu })\ge c \Vert f\Vert ^2 \end{aligned}$$

for all \(f \in L^2(G)\).

In particular, \({{\,\textrm{supp}\,}}V_\phi f\) cannot be thin unless \(f=0\).

Remark 2

If \(\phi \) and \(f\) are supported on compact sets \(K_1\) and \(K_2\), respectively, then \({{\,\textrm{supp}\,}}V_\phi f \subset \hat{G} \times K_2K_1^{-1}\). This shows that the thinness condition cannot be relaxed from vanishing at infinity to a mere smallness condition at infinity.
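The support inclusion can be read off from the definition of \(V_\phi f\). The integrand \(f(t)\overline{p(q^{-1}t)\phi (q^{-1}t)}\) vanishes unless \(t \in K_2\) and \(q^{-1}t \in K_1\), that is, unless

$$\begin{aligned} q \in tK_1^{-1} \subset K_2K_1^{-1}. \end{aligned}$$

Hence \(V_\phi f(p,q)=0\) whenever \(q \notin K_2K_1^{-1}\), for every \(p \in \hat{G}\).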

5.2 Wavelet Transform

Next, we apply these results to wavelets, which are a special case of the affine group acting on \(L^2({{\mathbb {R}}}^d)\). For \(\phi \in L^2({{\mathbb {R}}}^d)\), define the following unitary representation of the wavelet group \(G = (0,\infty ) \times {{\mathbb {R}}}^d\) on \(L^2({{\mathbb {R}}}^d)\) by

$$\begin{aligned} \phi _{a,b}(x) = a^{-d/2}\phi (a^{-1}x-b) \end{aligned}$$
(9)

for \(a \in (0,\infty )\) and \(b \in {{\mathbb {R}}}^d\). The group operation is then \((a_1,b_1)(a_2,b_2) = (a_1a_2,a_2^{-1}b_1+b_2)\) and the Haar measure is \(d\lambda (a,b) = a^{-1} \, da \, db\).
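The stated group operation can be checked directly from (9). Write \((\pi (a,b)\phi )(x) = c(a)\phi (a^{-1}x-b)\), where \(\pi \) is notation introduced here for the map \(\phi \mapsto \phi _{a,b}\) and \(c(a)\) denotes the normalizing power of \(a\) in (9), so that \(c(a_1)c(a_2)=c(a_1a_2)\). Then

$$\begin{aligned} (\pi (a_1,b_1)\pi (a_2,b_2)\phi )(x)&= c(a_1)c(a_2)\, \phi \big (a_2^{-1}(a_1^{-1}x-b_1)-b_2\big ) \\&= c(a_1a_2)\, \phi \big ((a_1a_2)^{-1}x-(a_2^{-1}b_1+b_2)\big ), \end{aligned}$$

which is \(\pi (a_1a_2,\, a_2^{-1}b_1+b_2)\phi \), in agreement with the product stated above.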

We say that a wavelet \(\phi \in L^2({{\mathbb {R}}}^d)\) is admissible if

$$\begin{aligned} \int _0^\infty \dfrac{\left| \hat{\phi }(x \xi )\right| ^2}{\xi } \, d\xi =1 \end{aligned}$$
(10)

for a.e. \(x \in {{\mathbb {R}}}^d\). Due to the invariance of the measure \(a^{-1} \, da\), it is enough to check this for almost every \(x \in \mathbb {S}^{d-1}\). We define the set \(\mathcal {A}\) to be the set of all admissible \(\phi \).
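For a concrete instance of (10), consider the radial function specified on the Fourier side by \(\hat{\phi }(\xi ) = c\left| \xi \right| ^2 e^{-\left| \xi \right| ^2/2}\). This example and the constant \(c\) are illustrative assumptions, not taken from the source. For \(x \ne 0\), the substitution \(s = \left| x\right| \xi \) leaves \(d\xi /\xi \) unchanged, so

$$\begin{aligned} \int _0^\infty \dfrac{\left| \hat{\phi }(x \xi )\right| ^2}{\xi } \, d\xi = c^2\int _0^\infty s^3 e^{-s^2} \, ds = \dfrac{c^2}{2}. \end{aligned}$$

Thus \(c=\sqrt{2}\) makes this \(\phi \) admissible, and the value does not depend on \(\left| x\right| \), which is the dilation invariance used above to reduce to \(x \in \mathbb {S}^{d-1}\).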

If \(\phi \) is admissible, then \(\{\phi _g\}_{g \in G}\) forms a generalized Parseval frame with respect to the measure \(\lambda \). To see this, define the wavelet transform \(W_\phi f(a,b):=\langle f,\phi _{a,b} \rangle \). Then, since \(\int e^{-ib \xi }W_\phi f(a,b) \, db = a^{-d/2}\hat{f}(a^{-1}\xi ) \hat{\phi }(\xi )\), by the Plancherel theorem on \({{\mathbb {R}}}^d\),

$$\begin{aligned} \int _{G} \left| W_\phi f\right| ^2 \, d\lambda&= \int _0^\infty \int _{{{\mathbb {R}}}^d} \left| \hat{f}(a^{-1}\xi ) \hat{\phi }(\xi )\right| ^2 a^{-d}\, d\xi \, a^{-1}\, da \\&= \int _{{{\mathbb {R}}}^d} \left| \hat{f}(\eta )\right| ^2 \int _0^\infty \left| \hat{\phi }(a \eta )\right| ^2 \, a^{-1} da \, d\eta = \Vert f\Vert ^2. \end{aligned}$$

We want to check the Schur conditions (i) and (ii) in Theorem 2, so we study the decay properties of \(W_\phi \phi \). Define

$$\begin{aligned} \mathcal {B}^1_w = \{ \phi \in L^2({{\mathbb {R}}}^d): W_\phi \phi \in L^1(w\, d\lambda ) \} \end{aligned}$$

for a weight \(w\). A large class of functions is contained in \(\mathcal {B}^1_w\). Define the translation operator \(\tau _hf(x) = f(x+h)\) for \(h \in {{\mathbb {R}}}^d\). For \(0<\alpha \le 1\), denote by \(\Lambda _\alpha \) the class of \(L^1\) functions \(f\) for which there exists \(C>0\) such that \(\Vert \tau _h f-f\Vert _{L^1} \le C\left| h\right| ^\alpha \) for all \(h \in {{\mathbb {R}}}^d\). The class \(\Lambda _1\) contains the Schwartz functions as well as less smooth functions such as indicator functions (thus including the Haar wavelet).
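To illustrate the claim that \(\Lambda _1\) contains the Haar wavelet, the following sketch estimates \(\Vert \tau _h\psi -\psi \Vert _{L^1}\) for \(\psi = 1_{[0,1/2)}-1_{[1/2,1)}\) in dimension \(d=1\) by a midpoint Riemann sum. The grid parameters and the function names `haar` and `l1_shift_diff` are choices made for this illustration only. A direct computation gives the value \(4h\) for \(0<h\le 1/2\) (the jumps at 0, 1/2, and 1 contribute \(h\), \(2h\), and \(h\)), so the Lipschitz-type bound holds with \(C=4\) in this range.

```python
def haar(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def l1_shift_diff(h, lo=-1.0, hi=2.0, n=3000):
    """Midpoint Riemann sum for the L^1 norm of haar(. + h) - haar(.).

    The window [lo, hi] contains the supports of both functions for 0 <= h <= 1.
    """
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx  # midpoint of the i-th cell
        total += abs(haar(x + h) - haar(x))
    return total * dx

# Compare the numerical norm with the linear bound 4 * h.
for h in (0.01, 0.05, 0.1, 0.25):
    print(h, l1_shift_diff(h), 4 * h)
```

The shifts are taken to be multiples of the grid spacing so that no midpoint falls on a discontinuity; for other values of \(h\) the Riemann sum is accurate only up to a few multiples of the spacing.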

Lemma 3

Let \(0< \varepsilon < \alpha \le 1\) and \(w_\varepsilon (a,b) = a^{d/2+\varepsilon }\). Then,

$$\begin{aligned} \Lambda _\alpha \cap L_0^1(\left| x\right| ^\alpha ) \subset \mathcal {B}^1_{w_\varepsilon } \end{aligned}$$

where \(L^1_0(\left| x\right| ^\alpha ) = \{ f \in L^1({{\mathbb {R}}}^d): \int f =0 \text{ and } \int \left| f(x)\right| \, \left| x\right| ^\alpha \, dx < \infty \}\).

In particular, this weight is multiplicative, so by Remark 1, \(\phi _{a,b}\) satisfies the Schur conditions (i) and (ii) in Theorem 2 for any \(\phi \in \Lambda _\alpha \cap L^1_0(|x|^\alpha )\).

Proof

We split \((0,\infty )=(0,1) \cup (1,\infty )\). On (0, 1),

$$\begin{aligned}&\int _{{{\mathbb {R}}}^d} \int _0^1 \left| \int \phi (x) \phi (a^{-1}x-b) \, dx \right| a^{-d/2} w(a) \, \dfrac{da}{a} \, db \\&\quad \le \Vert \phi \Vert _{L^1}^2 \int _0^1 a^{d/2+\varepsilon }\, \dfrac{da}{a^{d/2+1}} < \infty . \end{aligned}$$

On the other hand, using the mean zero property of \(\phi \), we estimate

$$\begin{aligned} \int \left| W_\phi \phi (a,b)\right| \, db&= a^{-d/2}\int \left| \int \phi (x) \left[ \phi (a^{-1}x-b) - \phi (-b) \right] \, dx\right| \, db\\&\le a^{-d/2}\int \left| \phi (x)\right| \Vert \tau _{a^{-1}x}\phi -\phi \Vert _{L^1} \, dx \le C a^{-d/2-\alpha } \int \left| \phi (x)\right| \cdot \left| x\right| ^\alpha \, dx. \end{aligned}$$

Therefore,

$$\begin{aligned} \int _{{{\mathbb {R}}}^d} \int _1^\infty \left| W_\phi \phi \right| w \, d\lambda \le C\int _1^\infty \dfrac{w(a)}{a^{d/2+1+\alpha }} \, da \Vert \phi \left| x\right| ^\alpha \Vert _{L^1}. \end{aligned}$$

Taking \(\varepsilon <\alpha \) ensures that the integral in \(a\) is finite. \(\square \)
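For reference, with \(w(a)=a^{d/2+\varepsilon }\) the two integrals in \(a\) appearing in the proof can be evaluated in closed form:

$$\begin{aligned} \int _0^1 a^{d/2+\varepsilon }\, \dfrac{da}{a^{d/2+1}} = \int _0^1 a^{\varepsilon -1}\, da = \dfrac{1}{\varepsilon }, \qquad \int _1^\infty \dfrac{a^{d/2+\varepsilon }}{a^{d/2+1+\alpha }} \, da = \int _1^\infty a^{\varepsilon -\alpha -1}\, da = \dfrac{1}{\alpha -\varepsilon }. \end{aligned}$$

The first requires \(\varepsilon >0\) and the second requires \(\varepsilon <\alpha \), which together give the range \(0<\varepsilon <\alpha \) in Lemma 3.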

This lemma gives us plenty of information about the space \(\mathcal {B}^1_w\). It can be verified that the admissibility condition (10) holds for any radial, normalized mean zero function in \(L^1 \cap L^1(|x|) \cap L^2\). From this discussion and the previous lemma, we have

$$\begin{aligned}&L^1 \cap L^1_0(|x|) \cap L^2 \cap (\cup _{0<\alpha \le 1}\Lambda _\alpha ) \cap \{ \phi \text{ radial } \} \\&\quad \subset (\cup _{0<\varepsilon <1}\mathcal {B}^1_{w_\varepsilon }) \cap \mathcal {A}=:\mathcal {A}_1. \end{aligned}$$

Therefore, by Theorem 1, the following compactness result holds for many wavelets \(\phi \), in particular for admissible Schwartz functions and the Haar function.

Theorem 6

Let \(\phi \in \mathcal {A}_1\) and \(\sigma \in L^\infty ((0,\infty ) \times {{\mathbb {R}}}^d)\) be thin. Then,

$$\begin{aligned} T_\sigma f(x) = \int \sigma (a,b) W_\phi f(a,b) \phi _{a,b}(x) \, \dfrac{da}{a} db \end{aligned}$$

is compact.

The converse also holds if \(\phi \) is admissible and Schwartz since \(\langle \phi _{a,b},\phi \rangle \rightarrow 0\) as \(d((a,b),(1,0)) \rightarrow \infty \), see for example [21, Appendix, Lemmas 2 and 4]. This yields the following positivity result, by taking \(\Gamma \) in Theorem 2 to be the subgroup \( \{1\}\times {{\mathbb {R}}}^d\).

Theorem 7

Let \(\phi \in \mathcal {A}_1\) and \(\sigma : (0,\infty ) \times {{\mathbb {R}}}^d \rightarrow [0,1]\) be thin. Then there exists \(c>0\) such that

$$\begin{aligned} \langle T_{1-\sigma }f,f \rangle = \int _{(0,\infty ) \times {{\mathbb {R}}}^d} (1-\sigma )\left| W_\phi f\right| ^2 \, d\lambda \ge c \Vert f\Vert ^2 \end{aligned}$$

for all \(f \in L^2({{\mathbb {R}}}^d)\).

As an immediate consequence, we obtain the following uncertainty principle.

Corollary 3

Let \(\phi \in \mathcal {A}_1\). If \(E \subset (0,\infty ) \times {{\mathbb {R}}}^d\) is thin, then there exists \(c>0\) such that

$$\begin{aligned} \int _{ ((0,\infty ) \times {{\mathbb {R}}}^d) \backslash E} \left| W_\phi f(a,b)\right| ^2 \dfrac{da \, db}{a} \ge c \Vert f\Vert ^2 \end{aligned}$$

for all \(f \in L^2({{\mathbb {R}}}^d)\).

As in the STFT case, this implies \({{\,\textrm{supp}\,}}W_\phi f\) can only be thin if \(f=0\), and this is sharp in the sense that we cannot improve from vanishing sets to small ones for general \(\phi \). If \(f\) and \(\phi \) are both supported in a ball \(B\), then \(W_\phi f\) is supported in the region \(\{ (a,b): b \in a^{-1}B-B \}\), which contains the strip \((0,\infty ) \times B\).

We also mention that these results continue to hold for higher-dimensional wavelet transforms such as the shearlet transform [31], but describing the classes \(\mathcal {B}^1_w\) and \(\mathcal {A}\) is more difficult. Nevertheless, Schwartz functions with Fourier support in a bounded set away from the \(y\)-axis are included in \(\mathcal {B}^1\) without any weight, as shown in [31].