Abstract
Let \(X_N\) be an N-dimensional subspace of \(L_2\) functions on a probability space \((\Omega , \mu )\) spanned by a uniformly bounded Riesz basis \(\Phi _N\). Given an integer \(1\le v\le N\) and an exponent \(1\le p\le 2\), we obtain universal discretization for the integral norms \(L_p(\Omega ,\mu )\) of functions from the collection of all subspaces of \(X_N\) spanned by v elements of \(\Phi _N\) with the number m of required points satisfying \(m\ll v(\log N)^2(\log v)^2\). This last bound on m is much better than previously known bounds which are quadratic in v. Our proof uses a conditional theorem on universal sampling discretization, and an inequality of entropy numbers in terms of greedy approximation with respect to dictionaries.
1 Introduction
A standard approach to solving a continuous problem numerically – the Galerkin method – suggests looking for an approximate solution from a given finite-dimensional subspace. A typical way to measure an error of approximation is an appropriate \(L_p\) norm, \(1\le p\le \infty \). Thus, the problem of discretization of the \(L_p\) norms of functions from a given finite-dimensional subspace arises in a very natural way. Approximation by elements from a linear subspace falls in the category of linear approximation.
It is well understood in numerical analysis and approximation theory that in many problems from signal/image processing it is beneficial to use an m-term approximant with respect to a given system of elements (dictionary) \({{\mathcal {D}}}_N:=\{g_i\}_{i=1}^N\). This means that for \(f\in X\) we look for an approximant of the form
$$\begin{aligned} a_m(f):=\sum _{k\in \Lambda (f)} c_k g_k, \end{aligned}$$
where \(\Lambda (f) \subset [1,N]\) is a set of m indices which is determined by f. The complexity of this approximant is characterized by the cardinality \(|\Lambda (f)|=m\) of \(\Lambda (f)\). Approximation of this type is referred to as nonlinear approximation because, for a fixed m, the approximant \(a_m(f)\) comes from different linear subspaces spanned by \(g_k\), \(k\in \Lambda (f)\), which depend on f. The cardinality \(|\Lambda (f)|\) is a fundamental characteristic of \(a_m(f)\) called sparsity of \(a_m(f)\) with respect to \({{\mathcal {D}}}_N\). It is now well understood that we need to study nonlinear sparse approximation in order to significantly increase our ability to process (compress, denoise, etc.) large data sets. Sparse approximations of a function are not only a powerful analytic tool but they are utilized in many applications in image/signal processing and numerical computation.
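The dependence of the index set \(\Lambda (f)\) on f can be illustrated by a minimal sketch. It assumes, purely for illustration, that the dictionary is an orthonormal basis and that f is given by its coefficient vector, in which case keeping the m largest coefficients in absolute value is a best m-term approximant in the \(L_2\) norm. The function name `m_term_approximant` and the sample vectors are hypothetical.

```python
def m_term_approximant(coeffs, m):
    """Keep the m largest-magnitude coefficients of f.

    Assumes an orthonormal dictionary, so f is identified with its
    coefficient vector. Returns (Lambda, coefficients of a_m(f)).
    """
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    support = sorted(order[:m])          # the index set Lambda(f)
    keep = set(support)
    approx = [c if k in keep else 0.0 for k, c in enumerate(coeffs)]
    return support, approx


# Two functions, same sparsity m = 2, different index sets Lambda(f).
f = [0.1, -3.0, 0.5, 2.0, 0.0]
g = [4.0, 0.2, 0.1, -0.3, 1.5]
support_f, a_f = m_term_approximant(f, 2)
support_g, a_g = m_term_approximant(g, 2)
```

The two approximants live in different two-dimensional coordinate subspaces, which is exactly why a discretization scheme has to work for all such subspaces at once.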
This leads to an important ingredient of the discretization problem that is desirable in practical applications. Suppose we have a finite dictionary \({{\mathcal {D}}}_N:=\{g_j\}_{j=1}^N\) of functions from \(L_p(\Omega ,\mu )\). Applying our strategy of sparse m-term approximation with respect to \({{\mathcal {D}}}_N\) we obtain a collection of all subspaces spanned by at most m elements of \({{\mathcal {D}}}_N\) as a possible source of approximating (representing) elements. Thus, we would like to build a discretization scheme that works well for all such subspaces. This kind of discretization falls in the category of universal discretization. The paper is devoted to the problem of universal sampling discretization.
Let \(\Omega \) be a nonempty set equipped with a probability measure \(\mu \). For \(1\le p\le \infty \), let \(L_p(\Omega ):=L_p(\Omega ,\mu )\) denote the real Lebesgue space \(L_p\) defined with respect to the measure \(\mu \) on \(\Omega \), and let \(\Vert \cdot \Vert _p\) be the norm of \(L_p(\Omega )\). By discretization of the \(L_p\) norm we understand a replacement of the measure \(\mu \) by a discrete measure \(\mu _m\) with support on a set \(\xi =\{\xi ^j\}_{j=1}^m \subset \Omega \). This means that integration with respect to measure \(\mu \) is replaced by an appropriate cubature formula. Thus, integration is replaced by evaluation of a function f at a finite set \(\xi \) of points. This is why this way of discretization is called sampling discretization. The problem of sampling discretization is a classical problem. The first results in this direction were obtained in the 1930s by Bernstein, by Marcinkiewicz, and by Marcinkiewicz and Zygmund for discretization of the \(L_p\) norms of the univariate trigonometric polynomials. Even though this problem is very important in applications, its systematic study has begun only recently (see the survey paper [5]). We now give explicit formulations of the sampling discretization problem (also known as the Marcinkiewicz discretization problem) and of the problem of universal discretization.
The sampling discretization problem. Let \((\Omega ,\mu )\) be a probability space and let \(X_N\subset L_p\) be an N-dimensional subspace of \(L_p(\Omega ,\mu )\) with \(1\le p \le \infty \) (the index N here, usually, stands for the dimension of \(X_N\)). We shall always assume that every function in \(X_N\) is defined everywhere on \(\Omega \), and
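We say that \(X_N\) admits the Marcinkiewicz-type discretization with parameters \(m\in {\mathbb {N}}\) and p and positive constants \(C_1\le C_2\) if there exists a set \(\xi := \{\xi ^j\}_{j=1}^m \subset \Omega \) such that for any \(f\in X_N\) we have in the case of \(1\le p <\infty \),
$$\begin{aligned} C_1\Vert f\Vert _p^p \le \frac{1}{m} \sum _{j=1}^m |f(\xi ^j)|^p \le C_2\Vert f\Vert _p^p, \end{aligned}$$
and in the case of \(p=\infty \),
$$\begin{aligned} C_1\Vert f\Vert _\infty \le \max _{1\le j\le m} |f(\xi ^j)| \le \Vert f\Vert _\infty . \end{aligned}$$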
The problem of universal discretization. Let \({\mathcal {X}}:= \{X(n)\}_{n=1}^k\) be a collection of finite-dimensional linear subspaces X(n) of the space \(L_p(\Omega )\) for a given \(1\le p \le \infty \). We say that a set \(\xi := \{\xi ^j\}_{j=1}^m \subset \Omega \) provides universal discretization for the collection \({\mathcal {X}}\) if there are two positive constants \(C_i\), \(i=1,2\), such that for each \(n\in \{1,\dots ,k\}\) and any \(f\in X(n)\) we have
$$\begin{aligned} C_1\Vert f\Vert _p^p \le \frac{1}{m} \sum _{j=1}^m |f(\xi ^j)|^p \le C_2\Vert f\Vert _p^p \end{aligned}$$
in the case of \(1\le p<\infty \), and
$$\begin{aligned} C_1\Vert f\Vert _\infty \le \max _{1\le j\le m} |f(\xi ^j)| \le \Vert f\Vert _\infty \end{aligned}$$
in the case of \(p=\infty \).
Note that the problem of universal discretization for the collection \({\mathcal {X}}:= \{X(n)\}_{n=1}^k\) is the sampling discretization problem for the set \(\cup _{n=1}^k X(n)\). Also, we point out that the concept of universality is well known in approximation theory. For instance, the reader can find a discussion of universal cubature formulas in [28], Section 6.8.
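As a numerical illustration of what universal discretization asks for, the following sketch fixes one random point set and checks the two-sided \(L_2\) inequality on a handful of randomly chosen sparse functions. Everything specific here is an assumption made for the example: the system \(\sqrt{2}\cos (2\pi kx)\), \(k=1,\dots ,N\), on [0, 1) with Lebesgue measure, the sizes N, v, m, and the target constants \(\frac{1}{2}\) and \(\frac{3}{2}\). Testing finitely many functions does not prove the inequality for all of them; it only shows how the quantities are computed.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def evaluate(coeffs, x):
    """Evaluate f(x) = sum_k c_k * sqrt(2) * cos(2*pi*k*x) for a sparse dict {k: c_k}."""
    return sum(c * SQRT2 * math.cos(2.0 * math.pi * k * x) for k, c in coeffs.items())


def discretization_ratio(coeffs, points):
    """Return (1/m) * sum_j f(xi_j)^2 divided by the exact ||f||_2^2 = sum_k c_k^2."""
    discrete = sum(evaluate(coeffs, x) ** 2 for x in points) / len(points)
    exact = sum(c * c for c in coeffs.values())
    return discrete / exact


rng = random.Random(0)                      # fixed seed for reproducibility
N, v, m = 50, 3, 2000                       # illustrative sizes, not from the paper
points = [rng.random() for _ in range(m)]   # ONE point set shared by all subspaces

ratios = []
for _ in range(20):
    support = rng.sample(range(1, N + 1), v)            # a random v-element subspace
    coeffs = {k: rng.gauss(0.0, 1.0) for k in support}  # a random function in it
    ratios.append(discretization_ratio(coeffs, points))
```

Each entry of `ratios` plays the role of the middle term of the discretization inequality after division by \(\Vert f\Vert _2^2\); the same `points` are reused for every support, which is the universality requirement.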
The problem of universal discretization for some special subspaces of the trigonometric polynomials was studied in [5, 29]. To describe the results in [5, 29] we need to introduce some necessary notations. First, for a given finite subset Q of \({\mathbb {Z}}^d\) we set
For \({\textbf{s}}=(s_1, \cdots , s_d)\in {\mathbb {Z}}^d_+\) we define
The following result, proved in [29], solves the universal discretization problem for the collection
of subspaces of trigonometric polynomials.
Theorem 1.1
[29] For every \(1\le p\le \infty \) there exists a large enough constant C(d, p), which depends only on d and p, such that for any \(n\in {\mathbb {N}}\) there is a set \(\xi :=\{\xi ^\nu \}_{\nu =1}^m\subset {\mathbb {T}}^d\), with \(m\le C(d,p)2^n\) that provides universal discretization in \(L_p\) for the collection \({{\mathcal {C}}}(n,d)\).
Second, for \(n\in {\mathbb {N}}\) let
For a positive integer \(v\le |\Pi _n|\) define
Then it is easily seen that
The following theorem provides universal discretization of \(L_1\) and \(L_2\) norms for the collection \(\{{\mathcal {T}}(Q):\ \ Q\in {\mathcal {S}}(v,n)\}\).
Theorem 1.2
[5, 27, Theorem 7.4] For positive integers n and \(1\le v\le |\Pi _n|\) let
Then there exist three positive constants \(C_i(d)\), \(i=1,2,3\), such that for any \(n,v\in {\mathbb {N}}\) with \(v\le |\Pi _n|\), and for \(p=1\) and \(p=2\) there is a set \(\xi =\{\xi ^\nu \}_{\nu =1}^m \subset {\mathbb {T}}^d\), with \(m\le C_1(d) M_p(n,v)\) such that for any \(f\in \cup _{Q\in {\mathcal {S}}(v,n)} {\mathcal {T}}(Q)\)
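Let us denote by \({{\mathcal {D}}}_N=\{g_i\}_{i=1}^N\) a system of functions from \(L_p\). Denote the set of all v-term approximants with respect to \({{\mathcal {D}}}_N\) as
$$\begin{aligned} \Sigma _v({{\mathcal {D}}}_N):=\Bigl \{f:\ f=\sum _{i\in G} c_i g_i,\ \ G\subset [1,N]\cap {\mathbb {N}},\ |G|=v\Bigr \}. \end{aligned}$$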
Theorem 1.2 provides universal discretization for the collection \(\{{\mathcal {T}}(Q):\, Q\in {\mathcal {S}}(v,n)\}\), which is equivalent to the sampling discretization of the \(L_p\) norm of elements from the set \(\Sigma _v({{\mathcal {D}}}_N)\) with \(N=|\Pi _n|\), \({{\mathcal {D}}}_N =\{e^{i({\textbf{k}},{\textbf{x}})}\}_{{\textbf{k}}\in \Pi _n}\). The proof of Theorem 1.2 in the case \(p=2\) is based on deep results on random matrices and in the case \(p=1\) is based on the chaining technique. We point out that in both cases \(p=2\) and \(p=1\) Theorem 1.2 provides universal discretization with the number of points growing as \(v^2\).
On the other hand, while Theorem 1.1 provides universal discretization for the subcollection \({{\mathcal {C}}}(n,d)\) of \(\{{\mathcal {T}}(Q):\ \ Q\in {\mathcal {S}}(v,n+1)\}\) (rather than the whole collection \(\{{\mathcal {T}}(Q):\ \ Q\in {\mathcal {S}}(v,n+1)\}\)) with \(v=2^n\) and
it gives a better estimate \(m\le C v\) on the number of points, which is linear in v, and applies to the full range of \(1\le p\le \infty \).
In this paper we prove the following estimate (see below for definitions and notations).
Theorem 1.3
Let \(1\le p\le 2\). Assume that \(\Phi _N\) is a uniformly bounded Riesz basis of \(X_N:={\text {span}}(\Phi _N)\) satisfying (2.8) for some constants \(0<R_1\le R_2\). Then for a large enough constant \(C=C(p,R_1,R_2)\) and any integer \(1\le v\le N\) there exist m points \(\xi ^1,\cdots , \xi ^m \in \Omega \) with
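such that for any \(f\in \Sigma _v(\Phi _N)\) we have
$$\begin{aligned} \frac{1}{2}\Vert f\Vert _p^p \le \frac{1}{m} \sum _{j=1}^m |f(\xi ^j)|^p \le \frac{3}{2}\Vert f\Vert _p^p. \end{aligned}$$ (1.4)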
In particular, Theorem 1.3 gives the order of bound
$$\begin{aligned} m\ll v(\log N)^2(\log v)^2, \end{aligned}$$
which is linear in v with extra logarithmic terms in N and v. This bound is much better than previously known bounds (see Theorem 1.2), which provided quadratic in v bounds. Note that even for each individual subspace from \(\{{\mathcal {T}}(Q):\ \ Q\in {\mathcal {S}}(v,n)\}\) we have the lower bound \(m\ge v\) for the sampling discretization.
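For a rough feel of the two growth rates, the sketch below compares \(v^2\) with \(v(\log N)^2(\log v)^2\). It is an order-of-magnitude illustration only: all constants are set to 1 and any additional factors in the earlier quadratic bounds are ignored, both of which are assumptions of the example rather than statements from the theorems. The function names are hypothetical.

```python
import math


def quadratic_order(v):
    """Growth in v of the earlier bounds, constants and other factors ignored."""
    return float(v) ** 2


def near_linear_order(v, N):
    """The order v * (log N)^2 * (log v)^2 with the constant set to 1."""
    return v * math.log(N) ** 2 * math.log(v) ** 2


N = 10 ** 6
comparison = {v: (near_linear_order(v, N), quadratic_order(v)) for v in (10 ** 2, 10 ** 5)}
```

With these normalizations the logarithmic factors dominate for small v, and the near-linear order wins once v is large relative to \((\log N)^2(\log v)^2\); the statement in the text is about this asymptotic behavior in v.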
Finally, we point out that very recent progress related to universal discretization has been made in our follow-up papers [8, 9]. More precisely, in [8] we prove that in the setting of Theorem 1.3 independent random points \(\xi _1,\cdots , \xi _m\in \Omega \) that are identically distributed according to a given probability measure \(\mu \) provide the universal discretization (1.4) with high probability under a slightly weaker condition than the condition in Theorem 1.3 on the number of points
Also, in [8] we relaxed the condition on the Riesz basis \(\Phi _N\). In [9] we show how universal discretization can be applied to deduce interesting results on sparse sampling recovery. In particular, we demonstrate that a simple greedy type algorithm based on good points for universal discretization provides good recovery in the square norm.
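The rest of this paper is organized as follows. Sections 2 and 3 are devoted to estimating the entropy numbers \(\varepsilon _k(\Sigma _v^p(\Phi _N),L_\infty )\) of the sets
$$\begin{aligned} \Sigma _v^p(\Phi _N):=\{f\in \Sigma _v(\Phi _N):\ \Vert f\Vert _p\le 1\} \end{aligned}$$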
in the \(L_\infty \)-norm, where \(\Phi _N\) is a uniformly bounded Riesz basis of \(X_N:=[\Phi _N]\subset L_2\) and \(1\le v\le N\) is an integer. Such estimates play an important role in the proof of Theorem 1.3. To be more precise, in Sect. 2 we prove under the additional condition (2.9) on the space \(X_N={\text {span}}(\Phi _N)\) that for \(p=2\),
The proof of (1.5) uses a known result from Greedy approximation in smooth Banach spaces and its connection with entropy numbers. In Sect. 3, we show how the estimate (1.5) can be extended to the case \(1\le p<2\) under condition (2.9). This extension step is based on a general inequality for the entropy, which is given in Lemma 3.1 and appears to be of independent interest. In Sect. 4 we prove Theorem 1.3, using the estimates on entropy numbers established in the previous two sections and a conditional theorem on sampling discretization. A main step in the proof is to show that the condition (2.9) that is assumed in our estimates of entropy numbers can be dropped in sampling discretization. The conditional Theorem 2.2 used in the proof of Theorem 1.3 is given in Sect. 2 without proof. In Sect. 5, we prove a refined conditional theorem for sampling discretization of all integral norms \(L_p\) of functions from a subset \(\mathcal {W}\subset L_\infty \) satisfying certain conditions, which allows us to estimate the number of points required for the sampling discretization in terms of an integral of the \(\varepsilon \)-entropy \(\mathcal {H}_\varepsilon (\mathcal {W}, L_\infty )\), \(\varepsilon >0\). This is an extension of the conditional result proved in [7, 27] for the unit ball of the space \(X_N\subset L_p\). In particular, it also allows us to prove a refined version of Theorem 1.3, where the constants \(\frac{1}{2}\) and \(\frac{3}{2}\) in (1.4) are replaced by \(1-\varepsilon \) and \(1+\varepsilon \) respectively for an arbitrarily given \(\varepsilon \in (0, 1)\). Finally, in Sect. 6 we give a few remarks on universal sampling discretization of \(L_p\) norms for \(p>2\).
Throughout this paper the letter C denotes a general positive constant depending only on the parameters indicated as arguments or subscripts. We use the notation |A| to denote the cardinality of a finite set A.
2 Some General Entropy Bounds and the Case \(p=2\)
It is well known that bounds of the entropy numbers of the unit ball of an N-dimensional subspace \(X_N\subset L_p\)
$$\begin{aligned} X_N^p:=\{f\in X_N:\ \Vert f\Vert _p\le 1\} \end{aligned}$$
play an important role in sampling discretization of the \(L_p\) norm of elements of \(X_N\) (see [6, 26, 27], and [7]).
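Recall the definition of entropy numbers in Banach spaces. Let X be a Banach space and \(B_X(g,r)\) denote the closed ball \(\{f\in X:\Vert f-g\Vert \le r\}\) with center \(g\in X\) and radius \(r>0\). Given a positive number \(\varepsilon \), the covering number \(N_\varepsilon (A, X)\) of a compact set \(A\subset X\) is defined as
$$\begin{aligned} N_\varepsilon (A,X):=\min \Bigl \{n\in {\mathbb {N}}:\ \exists \, g^1,\ldots ,g^n\in A,\ A\subset \bigcup _{j=1}^n B_X(g^j,\varepsilon )\Bigr \}. \end{aligned}$$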
We denote by \({\mathcal {N}}_\varepsilon (A,X)\) the corresponding minimal \(\varepsilon \)-net of the set A in X; namely, \({\mathcal {N}}_\varepsilon (A,X)\) is a finite subset of A such that \(A\subset \bigcup _{y\in {\mathcal {N}}_\varepsilon (A,X)}B_X(y,\varepsilon )\) and \(N_\varepsilon (A,X)= |{\mathcal {N}}_\varepsilon (A,X)|\). The \(\varepsilon \)-entropy \(\mathcal {H}_\varepsilon (A, X)\) of the compact set A in X is defined as \(\log _2 N_\varepsilon (A,X)\), and the entropy numbers \(\varepsilon _k(A,X)\) of the set A in X are defined as
$$\begin{aligned} \varepsilon _k(A,X):=\inf \{\varepsilon >0:\ \mathcal {H}_\varepsilon (A,X)\le k\}. \end{aligned}$$
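The notions of \(\varepsilon \)-net, covering number and \(\varepsilon \)-entropy can be made concrete on a finite point set. The sketch below builds a greedy \(\varepsilon \)-net in the sup norm; a greedy net is in general not minimal, so its cardinality is only an upper bound for the covering number \(N_\varepsilon \), and its base-2 logarithm an upper bound for the \(\varepsilon \)-entropy. The grid, the value of `eps` and the function names are assumptions of the example.

```python
import math


def sup_dist(p, q):
    """Distance between two points in the sup (l_infinity) norm."""
    return max(abs(a - b) for a, b in zip(p, q))


def greedy_net(points, eps):
    """Greedy eps-net: add a point as a new centre whenever no centre is within eps."""
    centres = []
    for p in points:
        if all(sup_dist(p, c) > eps for c in centres):
            centres.append(p)
    return centres


# The set A: an 11 x 11 grid in the unit square (illustrative choice).
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
eps = 0.25
net = greedy_net(grid, eps)

# Upper bound for the eps-entropy H_eps(A, l_infinity) = log2 N_eps.
entropy_upper_bound = math.log2(len(net))
```

By construction the centres belong to A, every point of A lies within `eps` of some centre, and distinct centres are more than `eps` apart.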
The following conditional result was proved in [27] for \(p=1\) and in [6] for the full range of \(1\le p<\infty \).
Theorem 2.1
[27, 6, Theorem 1.3] Let \(1\le p<\infty \). Suppose that a subspace \(X_N\subset L_p(\Omega ,\mu )\) satisfies the condition
where \(B\ge 1\). Then for a large enough constant C(p) there exist m points \(\xi ^1,\cdots , \xi ^m \in \Omega \) with
such that for any \(f\in X_N\) we have
As we explained above, the problem of universal discretization of the collection \(\{X(n)\}_{n=1}^k\) is equivalent to the sampling discretization of the union \(\cup _{n=1}^k X(n)\) of the corresponding subspaces. Therefore, instead of bounds of the entropy numbers of the unit ball \(X_N^p\) we are interested in the entropy bounds of the "unit ball"
$$\begin{aligned} \Sigma _v^p({{\mathcal {D}}}_N):=\{f\in \Sigma _v({{\mathcal {D}}}_N):\ \Vert f\Vert _p\le 1\}, \end{aligned}$$
which is the union of the corresponding unit balls.
The following version of Theorem 2.1 follows directly from its proof.
Theorem 2.2
Let \(1\le p<\infty \) and \(1\le v\le N\). Suppose that a dictionary \({{\mathcal {D}}}_N\) is such that the set \(\Sigma _v^p({{\mathcal {D}}}_N)\) satisfies the condition
where \(B_1\ge 1\). Assume in addition that there exists a constant \(B_2\ge 1\) such that
Then for a large enough constant C(p) there exist m points \(\xi ^1,\cdots , \xi ^m\in \Omega \) with
such that for any \(f\in \Sigma _v({{\mathcal {D}}}_N)\) we have
Theorem 2.2 also follows from a more general conditional theorem that will be proved in Sect. 5 (see Corollary 5.1).
Remark 2.1
We point out that (2.2) implies
Therefore, assumption (2.3) can be dropped with \(B_2\) replaced by \(3B_1\) in the bound on m. However, in applications the constant \(B_2\) in (2.3) may be significantly smaller than \(3B_1\). For example, if \({{\mathcal {D}}}_N\) is a uniformly bounded orthonormal system with \(\max _{f\in {{\mathcal {D}}}_N}\Vert f\Vert _\infty =1\), then we can take \(B_2=1\).
Proof of (2.4)
For \(\Lambda \subset [1,N]\cap {\mathbb {N}}\) denote \(X(\Lambda ):= {\text {span}}(g_i)_{i\in \Lambda }\) and \(X(\Lambda )^p:= \{f\in X(\Lambda ):\, \Vert f\Vert _p\le 1\}\). Clearly, (2.2) implies the same bound for each \(X(\Lambda )^p\) with \(|\Lambda |=v\). Thus, it is sufficient to prove (2.4) for a v-dimensional subspace \(X_v\). With a slightly worse constant \(4B_1\) instead of \(3B_1\) it was proved in [7, Remark 1.1]. We now show how to get a better constant. Setting \(\varepsilon _1:= \epsilon _1(X_v^p,L_\infty )\), we can find two functions \(f_1, f_2 \in X_v^p\) such that \(X_v^p \subset B_{L_\infty } (f_1, \varepsilon _1) \cup B_{L_\infty } (f_2, \varepsilon _1)\). Since \(0\in X_v^p\), 0 is contained in one of the two balls. Without loss of generality we may assume that \(0\in B_{L_\infty } (f_1, \varepsilon _1)\) so that \(\Vert f_1\Vert _\infty \le \varepsilon _1\). Since \(-f_2\in X_v^p\), we have either \(-f_2 \in B_{L_\infty } (f_1, \varepsilon _1)\) or \(-f_2 \in B_{L_\infty } (f_2, \varepsilon _1)\), which implies \(\Vert f_2\Vert _\infty \le 2\varepsilon _1\). It then follows that \(\Vert f\Vert _\infty \le 3 \varepsilon _1\) for all \(f\in X_v^p\). This together with (2.2) proves (2.4). \(\square \)
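Theorem 2.2 motivates us to estimate the characteristics \(\varepsilon _k(\Sigma _v^p({{\mathcal {D}}}_N),L_\infty )\). We now recall some known general results, which turn out to be useful for that purpose. Let \({{\mathcal {D}}}_N=\{g_j\}_{j=1}^N\) be a system of elements of cardinality \(|{{\mathcal {D}}}_N|=N\) in a Banach space X. Consider the best m-term approximations of f with respect to \({{\mathcal {D}}}_N\)
$$\begin{aligned} \sigma _m(f,{{\mathcal {D}}}_N)_X:=\inf _{\{c_j\};\,\Lambda :|\Lambda |=m}\Bigl \Vert f-\sum _{j\in \Lambda }c_jg_j\Bigr \Vert _X. \end{aligned}$$
For a set \(W\subset X\) we define
$$\begin{aligned} \sigma _m(W,{{\mathcal {D}}}_N)_X:=\sup _{f\in W}\sigma _m(f,{{\mathcal {D}}}_N)_X,\quad m\ge 1, \end{aligned}$$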
and \( \sigma _0(W,{{\mathcal {D}}}_N)_X=\sup _{f\in W}\Vert f\Vert _X\). The following Theorem 2.3 was proved in [24] (see also [28], p.331, Theorem 7.4.3).
Theorem 2.3
Let a compact \(W\subset X\) be such that there exist a system \({{\mathcal {D}}}_N\subset X\) with \(|{{\mathcal {D}}}_N|=N\), and a number \(r>0\) such that
Then for \(k\le N\)
For a given set \({{\mathcal {D}}}_N=\{g_j\}_{j=1}^N\) of elements we introduce the octahedron (generalized octahedron)
and the norm \(\Vert \cdot \Vert _A\) on \(X_N\)
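We now use a known general result for a smooth Banach space. For a Banach space X we define the modulus of smoothness
$$\begin{aligned} \rho (u):=\rho (X,u):=\sup _{\Vert x\Vert =\Vert y\Vert =1}\Bigl (\frac{1}{2}\bigl (\Vert x+uy\Vert +\Vert x-uy\Vert \bigr )-1\Bigr ). \end{aligned}$$
A uniformly smooth Banach space is one with the property
$$\begin{aligned} \lim _{u\rightarrow 0}\rho (u)/u=0. \end{aligned}$$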
In this paper we only consider uniformly smooth Banach spaces with power type moduli of smoothness \(\rho (u) \le \gamma u^s\), \(1< s\le 2\). The following bound is a corollary of greedy approximation results (see, for instance [28], p.455).
Theorem 2.4
Let X be s-smooth: \(\rho (X,u) \le \gamma u^s\), \(1<s\le 2\). Then for any normalized system \({{\mathcal {D}}}_N\) of cardinality \(|{{\mathcal {D}}}_N|=N\) we have
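Note that it is known that in the case \(X=L_p\) we have
$$\begin{aligned} \rho (L_p,u)\le {\left\{ \begin{array}{ll} u^p/p &{} \text {if}\ 1\le p\le 2,\\ (p-1)u^2/2 &{} \text {if}\ 2\le p<\infty . \end{array}\right. } \end{aligned}$$ (2.7)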
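As a sanity check on the power type \(s=2\) in the Hilbert case, recall that in a Hilbert space the modulus of smoothness equals \(\sqrt{1+u^2}-1\), which is at most \(u^2/2\). The sketch below approximates the supremum in the Euclidean plane; by rotation invariance x is fixed at (1, 0) and only the direction of y is scanned. The planar setting, the angular grid and the function name are assumptions of the example.

```python
import math


def hilbert_modulus(u, steps=3600):
    """Approximate sup over unit x, y of (||x+uy|| + ||x-uy||)/2 - 1 in the Euclidean plane.

    x is fixed at (1, 0); y runs over a grid of unit vectors.
    """
    best = 0.0
    for i in range(steps):
        theta = 2.0 * math.pi * i / steps
        y0, y1 = math.cos(theta), math.sin(theta)
        plus = math.hypot(1.0 + u * y0, u * y1)    # ||x + u y||
        minus = math.hypot(1.0 - u * y0, u * y1)   # ||x - u y||
        best = max(best, 0.5 * (plus + minus) - 1.0)
    return best


# Pairs (u, approximate rho(u)) for a few values of u.
checks = [(u, hilbert_modulus(u)) for u in (0.1, 0.5, 1.0)]
```

The maximum is attained when y is orthogonal to x, which the grid contains, so the computed values agree with \(\sqrt{1+u^2}-1\) up to rounding.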
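We now proceed to a special case when \(X=L_p\) and \({{\mathcal {D}}}_N=\Phi _N:=\{\varphi _j\}_{j=1}^N\) is a uniformly bounded Riesz basis of \(X_N:=[\Phi _N]:={\text {span}}(\varphi _1,\dots ,\varphi _N)\). Namely, we assume that \(\Vert \varphi _j\Vert _\infty \le 1\), \(1\le j\le N\) and for any \((a_1,\cdots , a_N) \in {{\mathbb {R}}}^N\)
$$\begin{aligned} R_1\Bigl (\sum _{j=1}^N |a_j|^2\Bigr )^{1/2}\le \Bigl \Vert \sum _{j=1}^N a_j\varphi _j\Bigr \Vert _2\le R_2\Bigl (\sum _{j=1}^N |a_j|^2\Bigr )^{1/2}, \end{aligned}$$ (2.8)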
where \(0< R_1 \le R_2 <\infty \). Assume in addition that for any \(f\in X_N\) we have
Theorem 2.5
Assume that \(\Phi _N\) is a uniformly bounded Riesz basis of \(X_N:=[\Phi _N]\) satisfying (2.9). Then we have
Proof
First of all, for any \(f=\sum _{j\in G }a_j\varphi _j\), \(|G|=v\) we get
Therefore,
where
By Theorem 2.4 with \(s=2\) and by (2.7) we have that for \(p\in [2,\infty )\)
Thus, Theorem 2.3 implies that for \(p\in [2,\infty )\)
Second, by (2.9) we obtain
Combining (2.13) and (2.14) we get
Finally, for \(k>N\) we use the inequalities
and
to obtain (2.15) for all k. This completes the proof. \(\square \)
3 A Step From \(p=2\) to \(1\le p<2\)
In this section we show how Theorem 2.5 proved in Sect. 2 for \(p=2\) can be extended to the case \(1\le p<2\). This extension step is based on a general inequality for the entropy. For convenience, we set \(\Sigma _v({{\mathcal {D}}}_N)=X_N:=[{{\mathcal {D}}}_N]\) for \(v> N\).
Lemma 3.1
For \(v=1,2,\dots , N\), \(1\le p< 2<q\le \infty \), and \(\theta :=(\frac{1}{2}-\frac{1}{q})/(\frac{1}{p}-\frac{1}{q})\) we have for \(\varepsilon >0\)
where \(a=a(\theta ) =2^{\frac{\theta }{1-\theta }}\).
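The parameters of Lemma 3.1 are easy to tabulate. The sketch below computes \(\theta \) and \(a(\theta )\) from p and q, and can be used to confirm the interpolation identity \(\frac{1}{2}=\frac{\theta }{p}+\frac{1-\theta }{q}\) as well as the value \(\theta =p/2\) for \(q=\infty \). The function names are hypothetical, and \(q=\infty \) is represented by `math.inf`.

```python
import math


def interpolation_theta(p, q):
    """theta = (1/2 - 1/q) / (1/p - 1/q) for 1 <= p < 2 < q <= infinity."""
    inv_q = 0.0 if math.isinf(q) else 1.0 / q
    return (0.5 - inv_q) / (1.0 / p - inv_q)


def scale_a(theta):
    """a(theta) = 2^(theta / (1 - theta))."""
    return 2.0 ** (theta / (1.0 - theta))


theta_inf = interpolation_theta(1.5, math.inf)   # expected to equal p/2
theta_mid = interpolation_theta(1.2, 4.0)
identity_gap = abs(0.5 - (theta_mid / 1.2 + (1.0 - theta_mid) / 4.0))
```

For \(p=1\) and \(q=\infty \) this gives \(\theta =\frac{1}{2}\) and \(a=2\); as p approaches 2 the exponent \(\theta /(1-\theta )\) grows without bound.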
Proof
In the case \(v=N\), Lemma 3.1 was proved in [7, Lemma 3.3]. A slight modification of the proof there works equally well in the general case \(1\le v\le N\). For completeness, we include the proof of this lemma here. First, we note that for any \(\varepsilon _1, \varepsilon _2>0\)
To see this, let \(x_1,\cdots , x_{N_1}\in \Sigma _v^p({{\mathcal {D}}}_N)\) and \(y_1,\cdots , y_{N_2}\in \Sigma _{2v}^2({{\mathcal {D}}}_N)\) be such that
where \(N_1=N_{\varepsilon _1}(\Sigma _v^p({{\mathcal {D}}}_N), L_2)\) and \(N_2=N_{\varepsilon _2}(\Sigma _{2v}^2({{\mathcal {D}}}_N), L_q)\). Since \(\Sigma _v({{\mathcal {D}}}_N) +\Sigma _v({{\mathcal {D}}}_N) \subset \Sigma _{2v}({{\mathcal {D}}}_N)\), we have
Inequality (3.2) then follows.
Next, setting \(\varepsilon _1:=\varepsilon ^{1-\theta }\) and \(\varepsilon _2=\varepsilon ^\theta \) in (3.2), we reduce the problem to showing that
It will be shown that for \(s=0,1,\ldots ,\)
from which (3.3) will follow by taking the sum over \(s=0,1,\ldots \)
To show (3.4), for each nonnegative integer s let \({{\mathcal {F}}}_s\subset \Sigma _v^p({{\mathcal {D}}}_N)\) be a maximal \(2^s \varepsilon _1\)-separated subset of \(\Sigma _v^p({{\mathcal {D}}}_N)\) in the metric \(L_2\); that is \(\Vert f-g\Vert _2\ge 2^s\varepsilon _1\) for any two distinct functions \(f, g\in {{\mathcal {F}}}_s\) and \(\Sigma _v^p({{\mathcal {D}}}_N)\subset \bigcup _{f\in {{\mathcal {F}}}_s} B_{L_2} (f, 2^s\varepsilon _1)\). Then
Let \(f_s\in {{\mathcal {F}}}_{s+2}\) be such that
Since
it follows that
Set
Clearly, for any \(g\in {{\mathcal {A}}}_s\)
On the one hand, using (3.5) and (3.6), we obtain that
On the other hand, since \(\frac{1}{2} =\frac{\theta }{p}+\frac{1-\theta }{q}\), using (3.7) and the fact that \({{\mathcal {F}}}_s\) is \(2^s\varepsilon _1\)-separated in the \(L_2\)-metric, we have that for any two distinct \(g', g\in {{\mathcal {A}}}_s\)
which implies that
This together with (3.7) means that \({{\mathcal {A}}}_s\) is a \(2^{-2} a^{s-1} \varepsilon ^\theta \)-separated subset of \(\Sigma _{2v}^2({{\mathcal {D}}}_N)\) in the metric \(L_{q}\). We obtain
Thus, combining (3.9) with (3.8), we prove inequality (3.4). \(\square \)
Lemma 3.1 with \(1\le p<2\), \(q=\infty \), \(\theta =p/2\) and Theorem 2.5 imply the following bound for the entropy numbers.
Theorem 3.1
Assume that \(\Phi _N\) is a uniformly bounded Riesz basis of \(X_N:=[\Phi _N]\) satisfying (2.9). Then for \(1\le p\le 2\) we have
4 Proof of Theorem 1.3
Theorem 3.1 provides bounds on the entropy numbers \( \varepsilon _k(\Sigma _v^p(\Phi _N),L_\infty )\) under the additional assumption (2.9). Thus, a combination of Theorem 3.1 with Theorem 2.2 implies the statement of Theorem 1.3 under the extra assumption (2.9). However, (2.9) is not assumed in Theorem 1.3. Below we give a proof of Theorem 1.3 that does not rely on (2.9).
We need the following lemma proved in [7].
Lemma 4.1
[7, Lemma 4.3] Let \(1\le p<\infty \) be a fixed number. Assume that \(X_N\) is an N-dimensional subspace of \(L_\infty (\Omega )\) satisfying the following condition: For some parameter \(\beta >0\) and constant \(K\ge 2\)
Let \(\{\xi _j\}_{j=1}^\infty \) be a sequence of independent random points distributed in accordance with \(\mu \). Then there exists a positive constant \(C_\beta \) depending only on \(\beta \) such that for any \(0< \varepsilon \le \frac{1}{2}\) and any integer
the inequality
holds with probability \( \ge 1-m^{-N/\log K}\).
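For a set \(\Omega _m:=\{x_1,\cdots , x_m\}\subset \Omega \) and a function \(f:\Omega _m\rightarrow {{\mathbb {R}}}\) we define \(\Vert f\Vert _{L_\infty (\Omega _m)}:=\max _{1\le j\le m} |f(x_j)|\) and
$$\begin{aligned} \Vert f\Vert _{L_p(\Omega _m)}:=\Bigl (\frac{1}{m}\sum _{j=1}^m |f(x_j)|^p\Bigr )^{1/p},\quad 1\le p<\infty . \end{aligned}$$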
Now we turn to the proof of Theorem 1.3. Recall that we do not assume (2.9). First, since the Riesz basis \(\Phi _N:=\{\varphi _j\}_{j=1}^{N}\) is uniformly bounded by 1 on \(\Omega \), we have by (2.8)
which in turn implies that
Thus, by Lemma 4.1 with \(\beta =1\) there exists a discrete set \(\Omega _{m_1}:=\{\xi ^1,\cdots , \xi ^{m_1}\}\subset \Omega \) with
such that for all \(f\in X_N\)
where \(C>1\) is an absolute constant.
Second, we consider the discrete norm \(\Vert \cdot \Vert _{L_p(\Omega _{m_1})}\) instead of the norm \(\Vert \cdot \Vert _{L_p(\Omega )}\). By (4.4) \(\Phi _N\) is a uniformly bounded Riesz basis of the space \((X_N, \Vert \cdot \Vert _{L_2(\Omega _{m_1})})\) and moreover
Since \(\log m_1 \sim \log N\), by the regular Nikolskii inequality for the norms \(\ell _p^{m_1}\), \(1\le p\le \infty \), we also have
where \(C>1\) is an absolute constant. Thus, by Theorem 2.2 and Theorem 3.1 applied to the discrete norm \( \Vert \cdot \Vert _{L_p(\Omega _{m_1})}\) we can find a subset \(\Omega _m\subset \Omega _{m_1}\) with
such that for any \(f\in \Sigma _v(\Phi _N)\)
Combining (4.4) with (4.5), we obtain the stated result of Theorem 1.3.
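The Nikolskii step for the discrete norms used in this proof rests on an elementary bound: for the normalized discrete norm on m points, \(\max _j |x_j|\le m^{1/p}\bigl (\frac{1}{m}\sum _j |x_j|^p\bigr )^{1/p}\), and for \(p=\ln m\) the factor \(m^{1/p}\) equals e. This is why an exponent of order \(\log N\) costs only an absolute constant when \(\log m_1\sim \log N\). The sketch below checks the bound on random data; the normalization by \(1/m\), the sample size and the function name are assumptions of the example.

```python
import math
import random


def discrete_lp(values, p):
    """Normalized discrete L_p norm: ((1/m) * sum |x_j|^p)^(1/p)."""
    m = len(values)
    return (sum(abs(x) ** p for x in values) / m) ** (1.0 / p)


rng = random.Random(1)                       # fixed seed for reproducibility
m = 500                                      # illustrative number of points
values = [rng.uniform(-1.0, 1.0) for _ in range(m)]
sup_norm = max(abs(x) for x in values)

p_log = math.log(m)                          # exponent of order log m
nikolskii_factor = m ** (1.0 / p_log)        # equals e up to rounding
```

The inequality holds for every exponent, but the factor \(m^{1/p}\) is large for small p and bounded by e once p is at least \(\ln m\).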
5 A Refined Version of the Conditional Theorem
Let us first recall some notations. Let \((\Omega , \mu )\) be a probability space. For \(1\le p\le \infty \) denote by \(L_p(\Omega )\) the usual Lebesgue space \(L_p\) defined with respect to the measure \(\mu \) on \(\Omega \) and by \(\Vert \cdot \Vert _p\) the norm of \(L_p(\Omega )\). We also set
In this section we prove a refined version of the conditional Theorem 2.2 for sampling discretization of all integral norms \(L_p\) of functions from a more general subset \(\mathcal {W}\subset L_\infty \), which allows us to estimate the number of points needed for the sampling discretization in terms of an integral of the \(\varepsilon \)-entropy \(\mathcal {H}_\varepsilon (\mathcal {W}, L_\infty )\), \(\varepsilon >0\).
Theorem 5.1
Let \(1\le p<\infty \) and let \(\mathcal {W}\) be a set of uniformly bounded functions on \(\Omega \) with
Assume that \({{\mathcal {H}}}_{ t}(\mathcal {W},L_\infty )<\infty \) for every \(t>0\), and
Then there exist positive constants \(C_p, c_p\) depending only on p such that for any \(\varepsilon \in (0, 1)\) and any integer
there exist m points \(x_1,\cdots , x_m\in \Omega \) such that for all \(f\in \mathcal {W}\),
In particular, Theorem 5.1 allows us to prove refined versions of Theorem 2.2 and Theorem 1.3, where the constants in the Marcinkiewicz type discretization are replaced by \(1-\varepsilon \) and \(1+\varepsilon \) for an arbitrarily given \(\varepsilon \in (0, 1)\).
First, we have the following refined version of Theorem 2.2.
Corollary 5.1
Let \(1\le p<\infty \) and \(1\le v\le N\). Suppose that a dictionary \({{\mathcal {D}}}_N\) is such that
where \(B_1\ge 1\). Assume in addition that there exists a constant \(B_2\ge 1\) such that
Then for a large enough constant C(p) and any \(\varepsilon \in (0, 1)\) there exist m points \(\xi ^1,\cdots , \xi ^m\in \Omega \) with
such that for any \(f\in \Sigma _v({{\mathcal {D}}}_N)\)
Proof of Corollary 5.1
We apply Theorem 5.1 to \(\mathcal {W}:=\Sigma _v^p({{\mathcal {D}}}_N)\) and \(R=B_2 v^{1/p}\). It is clear that \(\mathcal {W}\) satisfies (5.1). Furthermore, (5.4) implies that (see [7, Lemma 2.1])
Finally, a straightforward calculation using (5.6) then shows that
Corollary 5.1 then follows from Theorem 5.1. \(\square \)
Using Corollary 5.1 and following the proof in Sect. 4, we can also obtain the \(\varepsilon \)-version of Theorem 1.3.
Corollary 5.2
Let \(\Phi _N\) be a uniformly bounded Riesz basis of \(X_N:={\text {span}}(\Phi _N)\subset L_2(\Omega )\) satisfying (2.8) for some constants \(0<R_1\le R_2\). Let \(1\le p\le 2\) and let \(1\le v\le N\) be an integer. Then for a large enough constant \(C=C(p,R_1,R_2)\) and any \(\varepsilon \in (0, 1)\) there exist m points \(\xi ^1,\cdots , \xi ^m\in \Omega \) with
such that for any \(f\in \Sigma _v(\Phi _N)\) we have
The rest of this section is devoted to the proof of Theorem 5.1, which is close to the proof of Theorem 1.3 of [6]. We need the following lemma:
Lemma 5.1
[6, Lemma 2.4] Let \(\{{{\mathcal {F}}}_j\}_{j\in G}\) be a collection of finite sets of bounded functions from \(L_1(\Omega ,\mu )\). Assume that for each \(j\in G\) and all \(f\in {{\mathcal {F}}}_j\) we have
Suppose that positive numbers \(\eta _j\) \(\in (0,1)\) and a natural number m satisfy the condition
Then there exists a set \(\xi =\{\xi ^\nu \}_{\nu =1}^m \subset \Omega \) such that for each \(j\in G\) and for all \(f\in {{\mathcal {F}}}_j\) we have
Proof of Theorem 5.1
Let
Clearly, \(\mathcal {W}_1\subset \mathcal {W}\), and it suffices to prove (5.3) for all \(f\in \mathcal {W}_1\). Let \(c^*=c_p^*\in (0, \frac{1}{2})\) be a sufficiently small constant depending only on p. Let \(a:=c^*\varepsilon \). Let \(J, j_0\) be two integers such that \(j_0<0\le J\),
For \(j\in {\mathbb {Z}}\), let
denote the minimal \(2a(1+a)^j\)-net of \(\mathcal {W}_1\) in the norm of \(L_\infty \). For \(j\in {\mathbb {Z}}\) and \(f\in \mathcal {W}_1\) we define \(A_j(f)\) to be the function in \( {{{\mathcal {A}}}}_j\) that is closest to f in the \(L_\infty \) norm. Thus, \(\Vert A_j (f)-f\Vert _\infty \le 2a(1+a)^j\) for all \(f\in \mathcal {W}_1\) and \(j\in {\mathbb {Z}}\).
Next, for \(f\in \mathcal {W}_1\) and \(j> j_0\) define
and
We also set
Note that by (5.7) \(U_j(f)=\emptyset \) for \(j> J\). Thus, \(\{D_{j}(f):\ \ j=j_0,\cdots , J\}\) forms a partition of the domain \(\Omega \). Define
where \(\chi _E({\textbf{x}})\) is a characteristic function of a set E.
For \({\textbf{x}}\in D_{j_0} (f)\) we have
which in turn implies that
On the other hand, for \({\textbf{x}}\in D_j(f)\) and \(j_0<j\le J\) we have
which implies
Therefore, choosing \(c^*=c_p^*\) small enough, we have
and
In particular, this implies that for any probability measure \(\nu \) on \(\Omega \) and any \(f\in \mathcal {W}_p'\)
For \(j_0+1\le j\le J\) let
Our aim is to find m points \(\xi ^1, \cdots , \xi ^m\in \Omega \) for each m satisfying (5.2)
so that the following inequality holds for all \(f\in {{\mathcal {F}}}_j^p\) and \( j_0<j\le J\):
where \(\{\varepsilon _j\}_{j=j_0+1}^{J} \subset (0, 1)\) satisfies \(\sum _{j=j_0+1}^{J} \varepsilon _j \le \varepsilon /4\). Once (5.11) is proved, we obtain by (5.8) that
which, applying (5.10), will prove the desired inequality (5.3).
To see this, we apply Lemma 5.1 for the collection of the above sets \({{\mathcal {F}}}_j^p\) and notice that for \(j_0<j\le J\)
and
Thus, by Lemma 5.1 it suffices to show that for each integer m satisfying (5.2) one can find a sequence \(\{\varepsilon _j\}_{j_0<j\le J}\subset (0, 1)\) such that
To this end we need to estimate the cardinalities of the sets \({{\mathcal {F}}}_j^p\). By definition, for each \(j_0< j\le J\), the set \(D_j(f)\) is uniquely determined by the functions \(A_k(f)\in {{\mathcal {A}}}_k\), \(j\le k\le J\). As a result, we have
and
For each \(j_0<j\le J\) we choose \(\varepsilon _j>0\) so that
where \(\lambda >1\) is a large absolute constant to be specified later. Then
and hence (5.13) is ensured once
However, using (5.15), we have
This combined with (5.17) implies that (5.13) is ensured by (5.2).
Finally, we prove (5.14). Indeed, using (5.16), we have
where the last step uses the fact that
We claim that
To see this, let \(f^*\in \mathcal {W}\) be such that \(\Vert f^*\Vert _\infty =R\). Let \(k=[\frac{R}{t}]\). Define \(f_j=\frac{ 2jt}{R} f^*\) for \(0\le j\le k/2\). Then \(\{f_j\}_{0\le j\le \frac{k}{2}} \subset \mathcal {W}\) is 2t-separated in \(L_\infty \)-norm. It follows that
which shows (5.18).
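The separation property used above can be checked directly from the definition of \(f_j\). The following short verification uses only \(\Vert f^*\Vert _\infty =R\) and is included as a sketch: for \(0\le i<j\le k/2\),
$$\begin{aligned} \Vert f_j-f_i\Vert _\infty =\frac{2(j-i)t}{R}\Vert f^*\Vert _\infty =2(j-i)t\ge 2t. \end{aligned}$$
Moreover, each scaling factor satisfies \(0\le \frac{2jt}{R}\le \frac{kt}{R}\le 1\). This is consistent with the stated inclusion \(\{f_j\}\subset \mathcal {W}\), provided \(\mathcal {W}\) is closed under multiplication by scalars in [0, 1]. That closure property is an assumption of this sketch, since the definition of \(\mathcal {W}\) is given earlier in the paper. In particular, any ball of radius smaller than t in the \(L_\infty \)-norm contains at most one of the functions \(f_j\).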
Now using (5.18), we obtain
provided that \(\lambda >1\) is large enough. This proves (5.14). \(\square \)
6 Concluding Remarks on Sampling Discretization of \(L_p\) norms for \(2<p<\infty \)
In this section, we give a few remarks on sampling discretization of \(L_p\) norms for \(2<p<\infty \).
1. The following Nikolskii type inequality plays an important role in the proof of Theorem 1.3:
$$\begin{aligned} \Vert f\Vert _\infty \le C v^{\frac{1}{p}}\Vert f\Vert _p,\ \ \ \forall f\in \Sigma _v(\Phi _N), \end{aligned}\tag{6.1}$$
where the constant C is independent of f, v and N. This inequality holds for \(1\le p\le 2\) whenever \(\Phi _N\) is a uniformly bounded Riesz basis of \(X_N\). However, this is no longer true for \(p>2\). For example, take \(N=2^v\) and consider the system
$$\begin{aligned} \Phi _N =\{ e^{2\pi i j x}\}_{j=1}^N \end{aligned}$$
on the interval [0, 1] equipped with the usual Lebesgue measure. By the Littlewood-Paley inequality we have that for \(f(x)=\sum _{j=1}^v e^{2\pi i 2^j x}\in \Sigma _v(\Phi _N)\) and \(2<p<\infty \),
$$\begin{aligned} \Vert f\Vert _\infty =v > C v^{\frac{1}{p}}\Vert f\Vert _p\asymp v^{\frac{1}{2}+\frac{1}{p}}. \end{aligned}$$
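The lacunary example in Remark 1 can be explored numerically. The sketch below is an illustration, not part of the argument: it evaluates \(f(x)=\sum _{j=1}^v e^{2\pi i 2^j x}\) on a uniform grid and returns grid estimates of \(\Vert f\Vert _\infty \), \(\Vert f\Vert _2\) and \(\Vert f\Vert _p\). The function name `lacunary_norms`, the grid size \(2^{v+2}\) and the default \(p=4\) are choices made here for illustration only.

```python
import numpy as np


def lacunary_norms(v: int, p: float = 4.0):
    """Grid estimates of the sup, L_2 and L_p norms on [0, 1] of
    f(x) = sum_{j=1}^{v} exp(2*pi*i * 2^j * x)."""
    m = 2 ** (v + 2)                    # grid size (illustrative choice)
    x = np.arange(m) / m                # uniform grid on [0, 1)
    freqs = 2.0 ** np.arange(1, v + 1)  # lacunary frequencies 2, 4, ..., 2^v
    f = np.exp(2j * np.pi * np.outer(x, freqs)).sum(axis=1)
    abs_f = np.abs(f)
    sup_norm = abs_f.max()                      # attained at x = 0
    l2_norm = np.sqrt(np.mean(abs_f ** 2))      # grid mean approximates the integral
    lp_norm = np.mean(abs_f ** p) ** (1.0 / p)
    return sup_norm, l2_norm, lp_norm


if __name__ == "__main__":
    p = 4.0
    for v in (4, 8, 12):
        sup_norm, l2_norm, lp_norm = lacunary_norms(v, p)
        # Ratio of the two sides of (6.1) without the constant C.
        ratio = sup_norm / (v ** (1.0 / p) * lp_norm)
        print(v, sup_norm, l2_norm, lp_norm, ratio)
```

For \(p=4\), expanding \(f^2\) gives \(\Vert f\Vert _4^4=2v^2-v\), so \(\Vert f\Vert _4\asymp v^{1/2}\). The printed ratio then grows like \(v^{1/4}\), which is the failure of (6.1) described above.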
2. Let \(\Phi _N\) be a uniformly bounded Riesz basis of \(X_N\subset L_2\) satisfying (2.9). By monotonicity of the \(L_p\) norms, we have that for any integer \(1\le v\le N\),
$$\begin{aligned} \Sigma ^p_v(\Phi _N)\subset \Sigma ^2_v(\Phi _N),\ \ p>2, \end{aligned}$$
which in particular implies that
$$\begin{aligned} \sup _{f\in \Sigma _v^p(\Phi _N)} \Vert f\Vert _\infty \le \sup _{f\in \Sigma _v^2(\Phi _N)} \Vert f\Vert _\infty \le C v^{1/2}. \end{aligned}$$
Moreover, using Theorem 2.5, we have that for \(p>2\) and all integers \(k\ge 1\),
$$\begin{aligned} \varepsilon _k(\Sigma _v^p(\Phi _N),L_\infty ) \le \varepsilon _k(\Sigma _v^2(\Phi _N),L_\infty )\le C\cdot (\log N) \Bigl (\frac{v}{k}\Bigr )^{1/2}, \end{aligned}\tag{6.2}$$
which also yields
$$\begin{aligned} {{\mathcal {H}}}_{t} ( \Sigma _v^p(\Phi _N), L_\infty ) \le C(p) v\cdot \Bigl (\frac{\log N}{t}\Bigr )^2,\ \ \ \forall t>0. \end{aligned}\tag{6.3}$$
On the other hand, a straightforward calculation shows that for any \(\varepsilon \in (0, 1)\) and \(p>2\),
$$\begin{aligned}&\varepsilon ^{-5} \left( \int _{10^{-1}\varepsilon ^{1/p}} ^{C v^{1/2}} u^{\frac{p}{2}-1} \Bigl (\int _{ u}^{ C v^{1/2} }\frac{{{\mathcal {H}}}_{c_p \varepsilon t}( \Sigma _v^p(\Phi _N),L_\infty )}{t} \, dt\Bigr )^{\frac{1}{2}} du\right) ^2\\&\quad \le C (p) \varepsilon ^{-7} v^{p/2} (\log N)^2. \end{aligned}$$
Thus, an application of Theorem 5.1 leads to
Theorem 6.1
Assume that \(\Phi _N\) is a uniformly bounded Riesz basis of \(X_N:={\text {span}}(\Phi _N)\) satisfying (2.8) for some constants \(0<R_1\le R_2\). Let \(2<p<\infty \) and let \(1\le v\le N\) be an integer. Then for a large enough constant \(C=C(p,R_1,R_2)\) and any \(\varepsilon \in (0, 1)\) there exist m points \(\xi ^1,\cdots , \xi ^m\in \Omega \) with
such that for any \(f\in \Sigma _v(\Phi _N)\)
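The "straightforward calculation" in Remark 2 can be sketched as follows. This is a reconstruction from (6.3), in which \(C(p)\) denotes a constant that may change from line to line, and the lower limit of the outer integral is replaced by 0. Inserting (6.3) with t replaced by \(c_p\varepsilon t\) gives, for \(u>0\),
$$\begin{aligned} \int _u^{Cv^{1/2}}\frac{{{\mathcal {H}}}_{c_p\varepsilon t}(\Sigma _v^p(\Phi _N),L_\infty )}{t}\, dt \le C(p)\, v\Bigl (\frac{\log N}{\varepsilon }\Bigr )^2\int _u^\infty \frac{dt}{t^3} =\frac{C(p)}{2}\, v\Bigl (\frac{\log N}{\varepsilon u}\Bigr )^2. \end{aligned}$$
Taking square roots and using \(\frac{p}{2}-2>-1\) for \(p>2\), we get
$$\begin{aligned} \int _0^{Cv^{1/2}} u^{\frac{p}{2}-1}\cdot C(p)\, v^{1/2}\, \frac{\log N}{\varepsilon u}\, du =C(p)\, v^{1/2}\, \frac{\log N}{\varepsilon }\cdot \frac{(Cv^{1/2})^{\frac{p}{2}-1}}{\frac{p}{2}-1} \le C(p)\, \varepsilon ^{-1} v^{p/4}\log N. \end{aligned}$$
Squaring this estimate and multiplying by \(\varepsilon ^{-5}\) gives the bound \(C(p)\varepsilon ^{-7}v^{p/2}(\log N)^2\) stated in Remark 2.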
Acknowledgements
The authors would like to thank the referees for careful reading of the paper and for helpful suggestions and comments.
Communicated by Ronald A. DeVore.
The first named author’s research was partially supported by NSERC of Canada Discovery Grant RGPIN-2020-03909. The second named author’s research was supported by the Russian Federation Government Grant No. 14.W03.31.0031.
About this article
Cite this article
Dai, F., Temlyakov, V. Universal Sampling Discretization. Constr Approx 58, 589–613 (2023). https://doi.org/10.1007/s00365-023-09644-2
DOI: https://doi.org/10.1007/s00365-023-09644-2