Abstract
This paper provides a data-driven wavelet estimator for the deconvolution density model. We investigate fully adaptive estimation under \(L^p\) risk over Besov spaces \(B_{r,q}^s(\mathbb {R})\) for moderately ill-posed noise. In contrast to traditional adaptive wavelet estimators, the case \(0<s\le \frac{1}{r}\) is covered. Moreover, in the region \(1\le p\le \frac{2sr+(2\beta +1)r}{sr+2\beta +1}\), the convergence rate improves on that obtained for not necessarily compactly supported density estimation.
1 Introduction and Preliminaries
The density estimation for a statistical model with additive noise plays an important role in both statistics and econometrics [12]. More precisely, let \(Y_{1},Y_{2},\cdots ,Y_{n}\) be independent and identically distributed (i.i.d.) random data of the form
$$\begin{aligned} Y_{i}=X_{i}+\varepsilon _{i},\quad i=1,2,\cdots ,n, \end{aligned}$$
(1.1)
where X denotes a real-valued random variable with the unknown probability density function f and \(\varepsilon \) stands for an independent random noise (error) with the probability density \(f_{\varepsilon }\). The problem is to estimate f from \(Y_{1},Y_{2},\cdots ,Y_{n}\) in some sense. It is well known that the probability density g of Y equals the convolution of f and \(f_\varepsilon \), i.e., \(g=f*f_{\varepsilon }\); hence the name deconvolution problem. In particular, the model (1.1) reduces to the classical density model with no errors [5, 8] when \(f_\varepsilon \) degenerates to the Dirac functional \(\delta \).
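For intuition, the identity \(g=f*f_{\varepsilon }\) can be checked numerically. The sketch below (not from the paper) uses Gaussian \(f\) and \(f_{\varepsilon }\), whose convolution is again Gaussian, and compares a Riemann-sum convolution against the closed form:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    # Gaussian density N(mu, sigma^2)
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def convolve(f, f_eps, x, lo=-20.0, hi=20.0, n=4000):
    # midpoint-rule approximation of (f * f_eps)(x) = integral of f(u) f_eps(x - u) du
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) * f_eps(x - (lo + (i + 0.5) * h)) for i in range(n)) * h

# If X ~ N(0,1) and eps ~ N(0, 0.5^2), then Y = X + eps ~ N(0, 1 + 0.25).
f = lambda u: normal_pdf(u, 0.0, 1.0)
f_eps = lambda u: normal_pdf(u, 0.0, 0.5)
g_true = lambda y: normal_pdf(y, 0.0, math.sqrt(1.25))

for y in (-1.0, 0.0, 0.7, 2.0):
    assert abs(convolve(f, f_eps, y) - g_true(y)) < 1e-6
```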
In 1996, Delyon and Juditsky [4] investigated noise-free density estimation by using compactly supported wavelets. Pensky and Vidakovic [17] and Walter [19] studied deconvolution density estimation by using Meyer's wavelet over the Sobolev space \(W_{2}^{s}(\mathbb {R})\) in 1999. Three years later, Fan and Koo [6] explored the MISE performance of wavelet deconvolution estimators over Besov spaces \(B_{r,q}^{s}(\mathbb {R})\) (\(1\le r\le 2\)). Lounici and Nickl [14] investigated optimal \(L^\infty \) wavelet deconvolution estimation over \(B_{\infty ,\infty }^{s}(\mathbb {R})\). In 2014, Li and Liu [11] provided complete deconvolution estimations under \(L^{p}~(1\le p<\infty )\) risk on \(B_{r,q}^{s}(\mathbb {R})~(r,q\in [1,\infty ])\) for moderately ill-posed noises by using wavelet bases.
It should be pointed out that the constructions of the above wavelet estimators depend more or less on the unknown density f: the parameters of linear wavelet estimators are selected by using the unknown smoothness index s of f, and those of nonlinear ones require an upper bound on s.
Goldenshluger and Lepski [7] constructed a data-driven kernel density estimator for the noise-free model in 2014, where the selection of parameters depends only on the observed data. Moreover, they studied adaptive minimax estimation of non-compactly supported densities on \(\mathbb {R}^{d}\) under \(L^{p}\) risk over anisotropic Nikol’skii classes. For deconvolution density estimation, Comte and Lacour [2] considered \(L^{2}\) risk estimation by using a data-driven kernel deconvolution estimator over anisotropic Nikol’skii and Sobolev classes. Three years later, Rebelles [18] extended the above estimations to \(L^p\) risk over anisotropic Nikol’skii classes. In 2017 and 2019, Lepski and Willer [9, 10] established adaptive and optimal \(L^{p}\) risk estimations in the convolution structure density model via a data-driven kernel method, which contains both classical density estimation and deconvolution density estimation. Compared with kernel estimators, wavelet ones provide more local information and fast algorithms. Recently, Cao and Zeng [1] constructed a data-driven wavelet estimator and attained the optimal convergence rate (up to a logarithmic factor) for non-compactly supported density functions over Besov spaces. However, there are few references on estimating density functions with additive noise by data-driven wavelet methods.
This paper provides a data-driven wavelet estimator for compactly supported density functions in the deconvolution density model. It is fully adaptive because the selection of its parameters depends only on the observed data. By using this estimator, we investigate \(L^{p}~(1\le p<\infty )\) risk estimation for moderately ill-posed noises over Besov balls \(B^{s}_{r,q}(M,T)~(r,q\in [1,\infty ])\). In contrast to traditional wavelet results, ours includes the case \(0<s\le \frac{1}{r}\); moreover, the convergence rate in the region \(1\le p\le \frac{2sr+(2\beta +1)r}{sr+2\beta +1}\) improves on that for not necessarily compactly supported density estimations [9, 10], see Remark 4.2.
1.1 Preliminaries
We begin with the concept of Multiresolution Analysis (MRA, [8, 16]), which is a sequence of closed subspaces \(\{V_{j}\}_{j\in \mathbb {Z}}\) of the square integrable function space \(L^{2}(\mathbb {R})\) satisfying the following properties:
(i) \(V_{j}\subset V_{j+1}\), \(j\in \mathbb {Z}\);
(ii) \(\overline{\bigcup _{j\in \mathbb {Z}} V_{j}}=L^{2}(\mathbb {R})\) (the space \(\bigcup _{j\in \mathbb {Z}} V_{j}\) is dense in \(L^{2}(\mathbb {R})\));
(iii) \(f(2\cdot )\in V_{j+1}\) if and only if \(f(\cdot )\in V_{j}\) for each \(j\in \mathbb {Z}\);
(iv) there exists \(\varphi \in L^{2}(\mathbb {R})\) (a scaling function) such that \(\{\varphi (\cdot -k),~k\in \mathbb {Z}\}\) forms an orthonormal basis of \(V_{0}=\overline{\textrm{span}\{\varphi (\cdot -k),~k\in \mathbb {Z}\}}\).
With the standard notation \(h_{jk}(\cdot ):=2^{\frac{j}{2}}h\left( 2^{j}\cdot -k\right) \) in wavelet analysis, we can derive a wavelet function \(\psi \) from a scaling function \(\varphi \) in a simple way such that for a fixed \(j\in \mathbb {Z}\), \(\{\psi _{jk}\}_{k\in \mathbb {Z}}\) constitutes an orthonormal basis of the orthogonal complement \(W_j\) of \(V_j\) in \(V_{j+1}\). Then for fixed \(j_0\in \mathbb {N}\), both \(\{\varphi _{j_0k},\psi _{jk}\}_{j\ge j_0,k\in \mathbb {Z}}\) and \(\{\psi _{jk}\}_{j,k\in \mathbb {Z}}\) are orthonormal bases (wavelet bases) of \(L^{2}(\mathbb {R})\). Thus, each \(f\in L^2(\mathbb {R})\) has the following expansion in \(L^2\) sense,
$$\begin{aligned} f=\sum _{k\in \mathbb {Z}}\alpha _{j_{0}k}\varphi _{j_{0}k}+\sum _{j=j_{0}}^{\infty }\sum _{k\in \mathbb {Z}}\beta _{jk}\psi _{jk}, \end{aligned}$$
with \(\alpha _{jk}:=\langle f, \varphi _{jk}\rangle \) and \(\beta _{jk}:=\langle f,\psi _{jk}\rangle \).
As usual, let \(P_{j}\) be the orthogonal projective operator from \(L^{2}(\mathbb {R})\) onto the scaling space \(V_{j}\) with the orthonormal basis \(\{\varphi _{jk}\}_{k\in \mathbb {Z}}\). Then for each \(f\in L^{2}(\mathbb {R})\),
$$\begin{aligned} P_{j}f=\sum _{k\in \mathbb {Z}}\alpha _{jk}\varphi _{jk}, \end{aligned}$$
with \(\alpha _{jk}:=\langle f,\varphi _{jk}\rangle \).
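As a concrete illustration of the projection \(P_{j}f\) (a sketch only: it uses the Haar scaling function \(\varphi =\textbf{1}_{[0,1)}\), which is not the smooth m regular \(\varphi \) required later in the paper), \(P_{j}f\) averages f over dyadic intervals, and \(\Vert P_{j}f-f\Vert _{2}\) shrinks as j grows:

```python
import math

def haar_projection(f, j, x):
    # For Haar, P_j f(x) = 2^j * integral of f over [k/2^j, (k+1)/2^j), x in that interval
    k = math.floor(x * 2 ** j)
    lo, hi = k / 2 ** j, (k + 1) / 2 ** j
    n = 64  # midpoint rule on the dyadic interval
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h * 2 ** j

def l2_error(f, j, n=2048):
    # ||P_j f - f||_2 approximated on [0, 1]
    h = 1.0 / n
    s = sum((haar_projection(f, j, (i + 0.5) * h) - f((i + 0.5) * h)) ** 2 for i in range(n))
    return math.sqrt(s * h)

f = lambda x: math.sin(2 * math.pi * x)  # a smooth test function on [0, 1]
errs = [l2_error(f, j) for j in (1, 2, 3, 4)]
assert all(errs[i + 1] < errs[i] for i in range(3))  # error decreases as j grows
```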
One of the advantages of wavelet bases is that they can characterize Besov spaces, which contain Hölder and \(L^{2}\)-Sobolev spaces as special examples. To introduce Lemma 1.1, we need some notation: a scaling function \(\varphi \) is called m regular [3] if \(\varphi \in \mathcal {C}^{m}(\mathbb {R})\) and \(|\varphi ^{(r)}(x)|\le C(1+|x|^{2})^{-l}\) holds for each \(l\in \mathbb {Z}~(r=0,1,\cdots ,m)\); \(\Vert f\Vert _{r}\) denotes the \(L^{r}(\mathbb {R})\) norm of \(f\in L^{r}(\mathbb {R})\), and \(\Vert \tau \Vert _{l^{r}}\) the \(l^{r}(\mathbb {Z})\) norm, where
$$\begin{aligned} \Vert \tau \Vert _{l^{r}}:={\left\{ \begin{array}{ll} \left( \sum _{k\in \mathbb {Z}}|\tau _{k}|^{r}\right) ^{\frac{1}{r}},&{}\quad 1\le r<\infty ,\\ \sup _{k\in \mathbb {Z}}|\tau _{k}|,&{}\quad r=\infty . \end{array}\right. } \end{aligned}$$
Lemma 1.1
( [16]). Let \(\varphi \) be m regular with \(m>s>0\), \(\psi \) be the corresponding wavelet and \(f\in L^{r}(\mathbb {R})\). If \(\alpha _{jk}:=\langle f,\varphi _{jk}\rangle \), \(\beta _{jk}:=\langle f,\psi _{jk}\rangle \) and \(r,q\in [1,\infty ]\), then the following assertions are equivalent:
(i) \(f\in B^{s}_{r,q}(\mathbb {R})\);
(ii) \(\{2^{js}\Vert P_{j}f-f\Vert _{r}\}\in l^{q}(\mathbb {Z});\)
(iii) \(\{2^{j\left( s-\frac{1}{r}+\frac{1}{2}\right) }\Vert \{\beta _{j\cdot }\}\Vert _{l^r}\}\in l^{q}(\mathbb {Z}).\)
The Besov norm of f can be defined by
$$\begin{aligned} \Vert f\Vert _{B_{r,q}^{s}}:=\Vert \{\alpha _{j_{0}k}\}_{k}\Vert _{l^{r}}+\left\| \left\{ 2^{j\left( s-\frac{1}{r}+\frac{1}{2}\right) }\Vert \{\beta _{jk}\}_{k}\Vert _{l^{r}}\right\} _{j\ge j_{0}}\right\| _{l^{q}}. \end{aligned}$$
Moreover, Lemma 1.1 (i) and (ii) show that \(\Vert P_jf-f\Vert _r\lesssim 2^{-js}\) holds for \(f\in B^{s}_{r,q}(\mathbb {R})\). Here and after, the notation \(A\lesssim B\) denotes \(A\le cB\) with some fixed and independent constant \(c>0\); \(A\gtrsim B\) means \(B\lesssim A\); \(A\thicksim B\) stands for both \(A\lesssim B\) and \(A\gtrsim B\).
When \(r\le p\), Lemma 1.1 (i) and (iii) imply that with \(s'-\frac{1}{p}=s-\frac{1}{r}>0\),
$$\begin{aligned} B_{r,q}^{s}(\mathbb {R})\hookrightarrow B_{p,q}^{s'}(\mathbb {R}), \end{aligned}$$
where \(A\hookrightarrow B\) stands for a Banach space A continuously embedded in another Banach space B. All these claims can be found in Refs. [11, 20].
In this paper, the notation \(B_{r,q}^{s}(M)\) with \(M>0\) stands for a Besov ball, i.e.,
$$\begin{aligned} B_{r,q}^{s}(M):=\left\{ f\in B_{r,q}^{s}(\mathbb {R}),~\Vert f\Vert _{B_{r,q}^{s}}\le M\right\} \end{aligned}$$
and
$$\begin{aligned} B_{r,q}^{s}(M,T):=\left\{ f\in B_{r,q}^{s}(M),~\textrm{supp}~f\subseteq [-T,T]\right\} . \end{aligned}$$
Moreover, \(L^{\infty }(M):=\{f\in L^{\infty }(\mathbb {R}),~\Vert f\Vert _{\infty }\le M\}\) is defined in the same way.
To introduce the assumptions on a noise function \(f_{\varepsilon }\), we need the Fourier transform \(f^{ft}\) of \(f\in L^{1}(\mathbb {R})\),
$$\begin{aligned} f^{ft}(t):=\int _{\mathbb {R}}f(x)e^{-itx}dx. \end{aligned}$$
A standard method extends this definition to \(L^{2}(\mathbb {R})\) functions. Furthermore, the following conditions are imposed on the noise density function \(f_{\varepsilon }\). For \(\beta \ge 0\),
(T1) \(\left| f_{\varepsilon }^{ft}(t)\right| \gtrsim (1+|t|^{2})^{-\frac{\beta }{2}};\)
(T2) \(\left| (f_{\varepsilon }^{ft})^{(\ell )}(t)\right| \lesssim (1+|t|^{2})^{-\frac{\beta +\ell }{2}},~\ell =0,1,2.\)
Such a noise \(\varepsilon \) is said to be moderately ill-posed. Clearly, the Gamma distribution \(\Gamma (a,b)\) satisfies Conditions (T1)–(T2) with \(\beta =a\). In particular, the index \(\beta =0\) corresponds to \(\varepsilon \) being the Dirac functional \(\delta \), in which case the model (1.1) reduces to the classical noise-free density estimation model.
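The claim about the Gamma distribution can be verified numerically: with shape a and rate b, \(f_{\varepsilon }^{ft}(t)=(1+it/b)^{-a}\) under the convention \(f^{ft}(t)=\int f(x)e^{-itx}dx\), so \(|f_{\varepsilon }^{ft}(t)|=(1+t^{2}/b^{2})^{-\frac{a}{2}}\thicksim (1+t^{2})^{-\frac{a}{2}}\), i.e., (T1) holds with \(\beta =a\). A sketch with the illustrative values \(a=2\), \(b=1\):

```python
import cmath, math

def gamma_pdf(x, a=2.0, b=1.0):
    # Gamma(a, b) density with shape a and rate b, for x >= 0
    return b ** a * x ** (a - 1) * math.exp(-b * x) / math.gamma(a)

def fourier(t, a=2.0, b=1.0, hi=60.0, n=60000):
    # midpoint-rule approximation of the Fourier transform of the Gamma density
    h = hi / n
    return sum(gamma_pdf((i + 0.5) * h, a, b) * cmath.exp(-1j * t * (i + 0.5) * h)
               for i in range(n)) * h

a, b = 2.0, 1.0
for t in (0.0, 1.0, 5.0, 20.0):
    closed_form = (1.0 + t * t / (b * b)) ** (-a / 2.0)   # |(1 + it/b)^{-a}|
    assert abs(abs(fourier(t, a, b)) - closed_form) < 1e-3
```

The polynomial decay \((1+t^{2})^{-\frac{a}{2}}\) is exactly the moderately ill-posed regime; supersmooth noise (e.g. Gaussian), whose transform decays exponentially, is excluded here.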
1.2 Data-Driven Wavelet Estimator
This subsection is devoted to introducing the data-driven wavelet estimator for model (1.1). Under Condition (T1),
are well-defined, where \(\varphi \) is m regular with \(m>\beta +1\). Then the classical linear wavelet estimator for deconvolution density model is given by
Clearly, \(E\widehat{\alpha }_{jk}=\alpha _{jk}\) and \(E\widehat{f}_{j}=P_{j}f\); for details, see Refs. [6, 11, 13]. In general, the parameter j in (1.3) depends on the smoothness index s of the unknown density f, so the estimator in (1.3) is non-adaptive [6, 11, 13].
Next, we give a selection rule for the parameter j depending only on the observed data \(Y_{1},\cdots ,Y_{n}\), the so-called data-driven version. Let \(\mathcal {H}:=\left\{ 0,1,\cdots ,\left\lfloor \frac{1}{2\beta +1}\log _2{\frac{n}{\ln n}}\right\rfloor \right\} \) with \(\lfloor a\rfloor \) denoting the largest integer smaller than or equal to a, and
$$\begin{aligned} \xi _{n}(x,j):=\widehat{f}_{j}(x)-E\widehat{f}_{j}(x) \end{aligned}$$
(1.4)
be the stochastic error of \(\widehat{f}_{j}\). Moreover, for any \(x\in [-T,T]\),
Here and after, \(a\wedge b:=\min \{a,b\}\), \(a_{+}:=\max \{a,0\}\) and
where the constant \(\lambda >0\) will be determined later on.
Thus, the selection of \(j=j_{0}\) in (1.3) is obtained by
Obviously, it depends only on the observed data \(Y_1,\cdots ,Y_n\). Then the data-driven wavelet estimator is given by
$$\begin{aligned} \widehat{f}_{n}(x):=\widehat{f}_{j_{0}}(x),\quad x\in [-T,T], \end{aligned}$$
(1.9)
with \(j_0\in \mathcal {H}\) being given in (1.8).
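The rule (1.8) is of Goldenshluger–Lepski type: it balances a surrogate bias term, built from pairwise comparisons of \(\widehat{f}_{j\wedge j'}\) and \(\widehat{f}_{j'}\), against the penalty \(U_{n}(j)\). A schematic Python sketch (the inputs `estimates` and `penalty` are hypothetical stand-ins for \(\{\widehat{f}_{j}\}_{j\in \mathcal {H}}\) on a grid and \(U_{n}(j)\); the exact quantities of (1.5)–(1.7) are not reproduced here):

```python
def select_level(estimates, penalty, lam=1.0):
    # Goldenshluger-Lepski-type rule: for each level j, compare f_hat_{min(j,j')}
    # with f_hat_{j'} over all j', subtract the penalty, clamp at 0, and
    # minimize the sum of the surrogate bias and the penalized variance term.
    H = range(len(estimates))
    def sup_norm(u, v):
        return max(abs(a - b) for a, b in zip(u, v))
    def criterion(j):
        bias_part = max(
            max(sup_norm(estimates[min(j, jp)], estimates[jp]) - penalty[jp], 0.0)
            for jp in H)
        return bias_part + lam * penalty[j]
    return min(H, key=criterion)

# Toy input: 4 levels on a 5-point grid; penalties grow with j (variance term),
# while successive estimates stabilize (bias term shrinks).
estimates = [
    [0.0, 0.1, 0.2, 0.1, 0.0],
    [0.0, 0.3, 0.5, 0.3, 0.0],
    [0.0, 0.32, 0.52, 0.32, 0.0],
    [0.0, 0.33, 0.51, 0.33, 0.0],
]
penalty = [0.05, 0.1, 0.2, 0.4]
j0 = select_level(estimates, penalty)
assert j0 == 1  # level 1 already stabilizes the estimates at a small penalty
```

In the toy data, levels 1–3 differ only slightly, so the surrogate bias of level 1 vanishes while its penalty stays small, and the rule picks \(j_{0}=1\).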
2 Oracle Inequality
We shall state a point-wise oracle inequality in this section, which plays a key role in the proofs of Proposition 3.2 and Theorem 4.1.
Let \(B_j(x,f)\) be the bias of the estimator \(\widehat{f}_{j}(x)\), i.e.,
and define
where \(\xi _{n}(x,j)\) and \(U_{n}(j)\) are given by (1.4) and (1.7) respectively.
Theorem 2.1
For any \(x\in [-T,T]\), the estimator \(\widehat{f}_{n}(x)\) in (1.9) satisfies that
where \(U_{n}^{*}(j)\) is defined in (1.6) and \(B_{j}^{*}(x,f),~v(x)\) are defined in (2.2).
Proof
Obviously, it follows from (1.5) and (1.6) that
The same arguments as (2.3) show
Then combining (2.3) and (2.4), one obtains that
due to \(\widehat{f}_{j_{0}\wedge j}=\widehat{f}_{j\wedge j_{0}}\) and the selection of \(j_0\) in (1.8).
This with (2.1) and (2.2) implies that
On the other hand, according to (1.4) and (1.5),
This with \(\displaystyle \sup \limits _{j'\in \mathcal {H}}\left| E\widehat{f}_{j\wedge j'}(x)-E\widehat{f}_{j'}(x)\right| \le \sup \limits _{j'\in \mathcal {H},j'\ge j}\{B_{j\wedge j'}(x,f)+B_{j'}(x,f)\}\) and (2.2) leads to
Hence, it follows from (2.5)–(2.7) that
holds for any \(j\in \mathcal {H}\). Furthermore,
thanks to \(\widehat{f}_{n}(x)=\widehat{f}_{j_0}(x)\) in (1.9). The proof is done. \(\square \)
3 Two Propositions
This section provides two necessary propositions. Some lemmas and a classical inequality are needed in order to prove Proposition 3.1.
Note that the next condition
(C1) \(\varphi \in L^{1}(\mathbb {R})\) and \(|(\varphi ^{ft})^{(\ell )}(t)|\lesssim (1+|t|^{2})^{-\frac{m}{2}}\) with \(m>1\) and \(\ell =0,1,2\) (see Ref. [13])
follows from the m regularity of the scaling function \(\varphi \). Then the following lemma holds, according to the work of Liu and Zeng [13].
Lemma 3.1
([13]). Let \(\varphi \) be m regular and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then \(K_{j}\varphi \) in (1.2) satisfies that
where \(M_{0}>0\) is some constant.
To introduce Lemma 3.2, we define
where \(K_{j}\varphi \) is given by (1.2). Then the estimator \(\widehat{f}_{j}(x)\) in (1.3) can be rewritten as \(\displaystyle \widehat{f}_{j}(x)=\frac{1}{n}\sum \nolimits _{i=1}^{n}K^{*}_{j}(Y_{i},x)\). Furthermore, the following lemma holds.
Lemma 3.2
Let \(\varphi \) be m regular and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then \(K_{j}^{*}(t,x)\) in (3.1) satisfies that
where \(M_1>0\) is some constant.
Proof
According to the definition of \(K_{j}^{*}(t,x)\) in (3.1) and Lemma 3.1, one obtains
On the other hand, \(\Vert g\Vert _{\infty }=\Vert f*f_{\varepsilon }\Vert _{\infty } \le \Vert f\Vert _{\infty }\Vert f_{\varepsilon }\Vert _{1}=\Vert f\Vert _{\infty }\). This with (3.1) and Lemma 3.1 leads to
The desired conclusions follow from (3.2)–(3.3) by choosing \(M_1:=\max \{\Vert f\Vert _{\infty }M_{0}^{2}\int _{\mathbb {R}}(1+|t|^{2})^{-2}dt,~M_0\}\). \(\square \)
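Since \(\widehat{f}_{j}(x)=\frac{1}{n}\sum _{i=1}^{n}K_{j}^{*}(Y_{i},x)\) is an empirical mean, it can be simulated directly. The sketch below substitutes a hypothetical box kernel for the deconvolution kernel \(K_{j}^{*}\) of (3.1), so it illustrates only the empirical-mean structure, not the deconvolution step:

```python
import random

def box_kernel(j, t, x):
    # hypothetical stand-in for K_j^*: a box kernel at resolution level j
    return 2.0 ** j if abs(t - x) <= 2.0 ** (-j - 1) else 0.0

def f_hat(data, j, x):
    # empirical-mean form of the estimator: (1/n) * sum_i K(Y_i, x)
    return sum(box_kernel(j, y, x) for y in data) / len(data)

random.seed(0)
data = [random.uniform(0.0, 1.0) for _ in range(5000)]  # Y_i ~ U[0, 1]
est = f_hat(data, 3, 0.5)  # true density of U[0, 1] equals 1 at x = 0.5
assert 0.8 < est < 1.2
```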
To show Proposition 3.1, we need a well-known inequality.
Bernstein’s inequality ([15]). Let \(Y_{1},\cdots ,Y_{n}\) be i.i.d. random variables with \(EY_{i}^{2}\le \sigma ^{2}\) and \(|Y_{i}|\le M\) \((i=1,2,\cdots ,n)\). Then for any \(x>0\),
$$\begin{aligned} P\left\{ \left| \frac{1}{n}\sum _{i=1}^{n}\left( Y_{i}-EY_{i}\right) \right| \ge x\right\} \le 2\exp \left\{ -\frac{nx^{2}}{2\left( \sigma ^{2}+\frac{xM}{3}\right) }\right\} . \end{aligned}$$
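A quick Monte Carlo sanity check of this bound (a sketch with Rademacher variables, so \(\sigma ^{2}=M=1\); not part of the paper):

```python
import math, random

def bernstein_bound(n, x, sigma2=1.0, M=1.0):
    # right-hand side: 2 exp(-n x^2 / (2 (sigma^2 + x M / 3)))
    return 2.0 * math.exp(-n * x * x / (2.0 * (sigma2 + x * M / 3.0)))

random.seed(1)
n, x, trials = 100, 0.3, 10000
hits = sum(
    1 for _ in range(trials)
    if abs(sum(random.choice((-1.0, 1.0)) for _ in range(n))) / n >= x
)
assert hits / trials <= bernstein_bound(n, x)  # empirical tail below the bound
```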
Now, we state the first proposition, which is one of the main ingredients in the proof of the second one.
Proposition 3.1
Let \(\varphi \) be m regular and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then for each \(x\in [-T,T]\) and \(\gamma >0\), there exists \(\lambda >\max \{2M_1,~2M_1(\beta +2)\gamma \ln 2\}\) such that
where v(x) is defined in (2.2) and \(M_1\) is the positive constant in Lemma 3.2.
Proof
For any \(j\in \mathcal {H}\), one defines
where \(\lambda _{j}=\max \left\{ (\beta +2)\gamma j\ln 2,\frac{1}{4}\right\} \).
Note that \(\lambda \ln n\ge 2M_{1}\lambda _j\) for large n follows from \(\lambda >\max \{2M_1,2M_1(\beta +2)\gamma \ln 2\}\) and \(j\in \mathcal {H}\). Then \(\overline{U_{n}}(j)\le U_{n}(j)\) due to (1.7) and (3.4). Furthermore,
For each \(t\ge 0\),
Hence,
This with variable substitution \(t=v\omega \) and \(\omega :=\sqrt{\frac{2M_{1}2^{j(2\beta +1)}}{n}}+ \frac{4M_{1}2^{j(\beta +1)}}{3n}\) shows
because of \(v+\sqrt{\lambda _j}\ge \sqrt{v+\lambda _j}\) and \(\lambda _j\ge \frac{1}{4}\).
On the other hand, \(\left| K_{j}^{*}(Y_{i},x)\right| \le M_{1}2^{j(\beta +1)}\), \(E\left| K_{j}^{*}(Y_{i},x)\right| ^{2}\le M_{1}2^{j(2\beta +1)} \) and
by Lemma 3.2. Then
thanks to Bernstein’s inequality. This with (3.6) implies that
due to \(\omega :=\sqrt{\frac{2M_{1}2^{j(2\beta +1)}}{n}}+ \frac{4M_{1}2^{j(\beta +1)}}{3n}\). Hence, according to \(e^{-\lambda _{j}}\le 2^{-(\beta +2)\gamma j}\), one knows
Combining (2.2), (3.5) and (3.7), one obtains that
since \(\mathcal {H}\) is a discrete set, which completes the proof. \(\square \)
Before giving another proposition, we introduce the following notations:
where \(\delta _n=\left( \frac{C\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\) and \(C>1\) is some constant.
Note that \(U_{f}(x)\le c_0:=\sup _xU_{f}(x)\). Then there exists
such that \(\Omega _{m}=\emptyset \) for each \(m>m_{2}\). Clearly, \(m_{2}>0\) for large n.
Proposition 3.2
Let \(U_f(x),\Omega _m,\Omega _{0}^-\) be defined in (3.8)–(3.10) respectively, \(\varphi \) be m regular, and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then
satisfy that
(1) For each \(p\in [1,\infty )\),
(2) If \(f\in B_{r,q}^{s}(M)\cap L^{\infty }(M)\) and \(m\in \mathbb {Z}\) satisfy \(0\le m\le m_2\), then
Moreover, if \(s>\frac{1}{r}\) and \(r\le p\), then with \(s':=s-\frac{1}{r}+\frac{1}{p}\),
Proof
(1) According to Theorem 2.1, one finds that
holds for any \(x\in [-T,T]\), where v(x) and \(U_f(x)\) are given by (2.2) and (3.8) respectively. Then for each \(p\in [1,\infty )\), by using (3.10) and Proposition 3.1,
thanks to \(\delta _n\thicksim \left( \frac{\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\), which is the first desired conclusion.
(2) Take \(j_1\) satisfying \(c_12^{\frac{2m}{2\beta +1}}\delta _n^{-\frac{1}{s}}\le 2^{j_{1}}\le c_22^{\frac{2m}{2\beta +1}}\delta _n^{-\frac{1}{s}}\), where the two positive constants \(c_1,c_2\) satisfy
Then \(j_{1}\in \mathcal {H}\) and \(U_{n}^{*}(x,j_1)\le 2^{m-1}\delta _n\) with \(0<m\le m_2\) and large n. In fact, (3.11) tells us that \(2^{m_{2}}\le 2c_0\delta _n^{-1}\). Due to \(0<m\le m_{2}\), (3.12) and \(\delta _{n}= \left( \frac{C\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\), one concludes that
Hence, \(j_{1}\in \mathcal {H}\). This with \(2^{j_{1}}\le c_{2}2^{\frac{2m}{2\beta +1}}\delta _{n}^{-\frac{1}{s}}\) and (3.12) shows that
Clearly, by \(\Omega _m=\{x\in [-T,T],~2^{m}\delta _n <U_{f}(x)\le 2^{m+1}\delta _n\}\),
where \(|\Omega _m|\) stands for the Lebesgue measure of the set \(\Omega _m\). Moreover, (3.8) and (3.13) lead to
When \(1\le r<\infty \), by using Chebyshev’s inequality, (2.2), (3.15) and \(f\in B_{r,q}^s(M)\),
Substituting (3.16) into (3.14), one obtains that
due to \(2^{j_{1}}\thicksim 2^{\frac{2m}{2\beta +1}}\delta _{n}^{-\frac{1}{s}}\).
For the case \(r=\infty \), it follows from \(f\in B_{r,q}^{s}(M)\) and \(m>0\) that
thanks to the choice \(2^{j_{1}}\ge c_12^{\frac{2m}{2\beta +1}}\delta _{n}^{-\frac{1}{s}}\) with \(c_1>(2M)^{\frac{1}{s}}\). Thus, \(|\Omega _{m}|=0\) because of (3.15). Furthermore, \(J_m\le \left( 2^{m+1}\delta _n\right) ^p|\Omega _{m}|=0\) by (3.14).
Finally, one considers the case of \(s>\frac{1}{r}\) and \(r\le p\). Note that \(B_{r,q}^{s}\hookrightarrow B_{p,q}^{s'}\) with \(s'=s-\frac{1}{r}+\frac{1}{p}\). Similar to (3.16),
This with (3.14) and \(2^{j_{1}}\thicksim 2^{\frac{2m}{2\beta +1}}\delta _{n}^{-\frac{1}{s}}\) implies that
The proof is done. \(\square \)
4 Main Result
This section is devoted to state and prove our main theorem.
Theorem 4.1
Let \(\varphi \) be m regular and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m> \beta +1\). Then for \(0<s<m\), \(r,q\in [1,\infty ]\) and \(p\in [1,\infty )\), the estimator \(\widehat{f}_{n}\) in (1.9) satisfies
where
Remark 4.1
Note that \(\theta =\min \left\{ \frac{s}{2s+2\beta +1}, \frac{s-\frac{1}{r}+\frac{1}{p}}{2(s-\frac{1}{r})+2\beta +1}\right\} \) for \(s>\frac{1}{r}\) and \(\beta >1\), which coincides with Theorem 4 of Li and Liu [11]. On the other hand, the estimation in the case \(0<s\le \frac{1}{r}\) is investigated here, whereas no such result exists for traditional wavelet estimators.
Remark 4.2
When \(p\in \left[ 1,\frac{2sr+(2\beta +1)r}{sr+2\beta +1}\right] \subset \left[ 1,\frac{2sr}{2\beta +1}+r\right] \), the convergence rate \(\frac{s}{2s+2\beta +1}\) improves on the rate \(\frac{s(1-1/p)}{s-(2\beta +1)/r+(2\beta +1)}\) obtained for not necessarily compactly supported density estimation with \(\alpha =d=1\) in Refs. [9, 10]. This is reasonable, since the compact support condition is stricter.
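The comparison in this remark can be checked numerically: on the stated region our exponent dominates (a larger exponent means a faster rate), with equality exactly at the right endpoint \(p=\frac{2sr+(2\beta +1)r}{sr+2\beta +1}\). A sketch with the illustrative values \(s=1\), \(\beta =1.5\), \(r=2\):

```python
def ours(s, beta):
    # exponent s / (2s + 2*beta + 1) from Theorem 4.1
    return s / (2 * s + 2 * beta + 1)

def theirs(s, beta, r, p):
    # exponent s(1 - 1/p) / (s - (2*beta + 1)/r + (2*beta + 1)) of Refs. [9, 10]
    return s * (1 - 1 / p) / (s - (2 * beta + 1) / r + (2 * beta + 1))

s, beta, r = 1.0, 1.5, 2.0
p_max = (2 * s * r + (2 * beta + 1) * r) / (s * r + 2 * beta + 1)  # right endpoint (= 2 here)
for p in (1.0, 1.5, p_max):
    assert ours(s, beta) >= theirs(s, beta, r, p) - 1e-12  # our exponent dominates
assert abs(ours(s, beta) - theirs(s, beta, r, p_max)) < 1e-12  # equality at the endpoint
```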
Proof
It follows from Theorem 2.1 that
holds for any \(x\in [-T,T]\). This with Proposition 3.1 leads to
Here, \(J_{0}^{-}\) and \(J_{m}\) can be found in Proposition 3.2.
To complete the proof, one divides (4.1) into three regions. Recall that \(2^{m_{2}}\thicksim \delta _n^{-1}\) and \(\delta _n\thicksim \left( \frac{\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\) by (3.10)–(3.11). According to Proposition 3.2, the following estimations are established.
(i) For \(1\le p<\frac{2sr}{2\beta +1}+r\),
(ii) For \(p\ge \frac{2sr}{2\beta +1}+r\),
(iii) For the case \(p\ge \frac{2sr}{2\beta +1}+r\) and \(s>\frac{1}{r}\), take \(m_1\in \mathbb {Z}\) satisfying
Then it follows from \(r<p,~p\ge \frac{2sr}{2\beta +1}+r\) and \(s>\frac{1}{r}\) that \(0<m_1<m_2\). Hence,
Combining this with (4.4), \(\delta _n\thicksim \left( \frac{\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\) and \(s'=s-\frac{1}{r}+\frac{1}{p}\), the above inequality reduces to
This with (4.1)–(4.4) leads to the desired conclusion, which finishes the proof.
\(\square \)
\(\bullet \) Concluding remark
It is worth noting that we assume \(\beta >1\) in Theorem 4.1. For the case \(\beta \in (0,1]\), the same conclusion of Theorem 4.1 holds if the following Condition (T3) is additionally assumed:
where \(g_{\varepsilon }(x)=\mathcal {F}^{-1} \left\{ \frac{d}{dt}\frac{\left( f_{\varepsilon }^{ft}\right) '(t)}{\left[ f_{\varepsilon }^{ft}(t)\right] ^{2}}\right\} (x)\) and \(\mathcal {F}^{-1}\) means the inverse Fourier transform. Although Condition (T3) looks complicated and unnatural, the Gamma distribution provides an example, see Example 4.1 in Ref. [13].
If Condition (T3) is added, the next lemma follows easily.
Lemma 4.1
([13]). Let \(\varphi \) be m regular and Conditions (T1)–(T3) hold with \(0<\beta \le 1\) and \(m>\beta +1\). Then \(K_{j}\varphi \) in (1.2) satisfies that
where \(M_{0}>0\) is some constant.
Thus, the same conclusions of Lemma 3.2 hold for \(0<\beta \le 1\), which implies that Theorem 4.1 remains valid for \(0<\beta \le 1\) under Conditions (T1)–(T3).
Data Availability
Not applicable.
References
Cao, K.K., Zeng, X.C.: Adaptive wavelet density estimation under independence hypothesis. Results Math. 76(4), 196 (2021)
Comte, F., Lacour, C.: Anisotropic adaptive kernel deconvolution. Ann. Inst. H. Poincaré Probab. Statist. 49(2), 569–609 (2013)
Daubechies, I.: Ten Lectures on Wavelets. SIAM, Philadelphia (1992)
Delyon, B., Juditsky, A.: On minimax wavelet estimator. Appl. Comput. Harmon. Anal. 3(3), 215–228 (1996)
Donoho, D.L., Johnstone, I.M., Kerkyacharian, G., Picard, D.: Density estimation by wavelet thresholding. Ann. Statist. 24(2), 508–539 (1996)
Fan, J., Koo, J.-Y.: Wavelet deconvolution. IEEE Trans. Inform. Theory 48(3), 734–747 (2002)
Goldenshluger, A., Lepski, O.: On adaptive minimax density estimation on \(\mathbb{R} ^{d}\). Probab. Theory Relat. Fields 159(3–4), 479–543 (2014)
Härdle, W., Kerkyacharian, G., Picard, D., Tsybakov, A.: Wavelets. Approximation and Statistical Applications. Springer-Verlag, New York (1998)
Lepski, O., Willer, T.: Lower bounds in the deconvolution structure density model. Bernoulli 23(2), 884–926 (2017)
Lepski, O., Willer, T.: Oracle inequalities and adaptive estimation in the deconvolution structure density model. Ann. Statist. 47(1), 233–287 (2019)
Li, R., Liu, Y.M.: Wavelet optimal estimations for a density with some additive noises. Appl. Comput. Harmon. Anal. 36(3), 416–433 (2014)
Li, Q., Racine, J.S.: Nonparametric Econometrics: Theory and Practice. Princeton University Press, Princeton (2007)
Liu, Y.M., Zeng, X.C.: Asymptotic normality for wavelet deconvolution density estimators. Appl. Comput. Harmon. Anal. 48(1), 321–342 (2020)
Lounici, K., Nickl, R.: Global uniform risk bounds for wavelet deconvolution estimators. Ann. Statist. 39(1), 201–231 (2011)
Massart, P.: Concentration inequalities and model selection. In: Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour. Springer, Berlin (2007)
Meyer, Y.: Wavelets and Operators. Cambridge University Press, Cambridge (1992)
Pensky, M., Vidakovic, B.: Adaptive wavelet estimator for nonparametric density deconvolution. Ann. Statist. 27(6), 2033–2053 (1999)
Rebelles, G.: Structural adaptive deconvolution under \(L^{p}\) losses. Math. Methods Statist. 25(1), 26–53 (2016)
Walter, G.G.: Density estimation in the presence of noise. Statist. Probab. Lett. 41(3), 237–246 (1999)
Zeng, X.C.: A note on wavelet deconvolution density estimation. Int. J. Wavelets Multiresolut. Inf. Process. 15(6), 1750055 (2017)
Acknowledgements
The authors would like to thank the referees for their valuable suggestions, which greatly improve the readability of the article. The first author is supported by the National Natural Science Foundation of China (No. 12101459). The second author (corresponding author) is supported by the National Natural Science Foundation of China (Nos. 12171016, 11901019), and the Science and Technology Program of Beijing Municipal Commission of Education (No. KM202010005025).
Funding
The authors have not disclosed any funding.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interest to disclose.
Cao, K., Zeng, X. A Data-Driven Wavelet Estimator For Deconvolution Density Estimations. Results Math 78, 156 (2023). https://doi.org/10.1007/s00025-023-01928-0