1 Introduction and Preliminaries

Density estimation for statistical models with additive noise plays an important role in both statistics and econometrics [12]. More precisely, let \(Y_{1},Y_{2},\cdots ,Y_{n}\) be independent and identically distributed (i.i.d.) observations of the form

$$\begin{aligned} Y=X+\varepsilon , \end{aligned}$$
(1.1)

where X denotes a real-valued random variable with the unknown probability density function f, and \(\varepsilon \) stands for an independent random noise (error) with the probability density \(f_{\varepsilon }\). The problem is to estimate f from \(Y_{1},Y_{2},\cdots ,Y_{n}\) in some sense. It is well known that the probability density g of Y equals the convolution of f and \(f_\varepsilon \), which is why this is called a deconvolution problem. In particular, the model (1.1) reduces to the classical error-free density model [5, 8] when \(f_\varepsilon \) degenerates to the Dirac functional \(\delta \).
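
To fix ideas, here is a minimal simulation of model (1.1); the concrete choices of f (a rescaled Beta density) and \(f_\varepsilon \) (Laplace) are illustrative assumptions of this sketch, not part of the model. It checks numerically that the density g of Y is the convolution \(f*f_{\varepsilon }\).

```python
import numpy as np

# Model (1.1) in simulation: Y = X + eps, where only Y is observed.
# f (rescaled Beta) and f_eps (Laplace) are illustrative choices.
rng = np.random.default_rng(0)
n = 100_000
X = 2 * rng.beta(2, 2, n) - 1             # X ~ f, supported on [-1, 1]
eps = rng.laplace(scale=1.0, size=n)      # eps ~ f_eps, independent of X
Y = X + eps

# Check g = f * f_eps at y = 0: numerical convolution vs. a histogram of Y.
x = np.linspace(-1, 1, 2001)
f = 0.75 * (1 - x**2)                     # density of 2*Beta(2,2) - 1
g0 = np.sum(f * 0.5 * np.exp(-np.abs(0.0 - x))) * (x[1] - x[0])
print(g0, np.mean(np.abs(Y) < 0.05) / 0.1)   # both approximately 0.354
```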

In 1996, Delyon and Juditsky [4] investigated noise-free density estimation by using compactly supported wavelets. Pensky and Vidakovic [17] and Walter [19] studied deconvolution density estimation by using Meyer's wavelet over the Sobolev space \(W_{2}^{s}(\mathbb {R})\) in 1999; three years later, Fan and Koo [6] explored the MISE performance of wavelet deconvolution estimators over the Besov space \(B_{r,q}^{s}(\mathbb {R})\) (\(1\le r\le 2\)). Lounici and Nickl [14] investigated optimal \(L^\infty \) wavelet deconvolution estimation over \(B_{\infty ,\infty }^{s}(\mathbb {R})\). In 2014, Li and Liu [11] provided a complete treatment of deconvolution estimation under \(L^{p}~(1\le p<\infty )\) risk over \(B_{r,q}^{s}(\mathbb {R})~(r,q\in [1,\infty ])\) for moderately ill-posed noises by using wavelet bases.

It should be pointed out that the constructions of the above wavelet estimators depend more or less on the unknown density function f. For example, the choice of parameters depends on the unknown smoothness index s of f for linear wavelet estimators, and on an upper bound of s for non-linear ones.

Goldenshluger and Lepski [7] constructed a data-driven kernel density estimator for the noise-free model in 2014, in which the selection of parameters depends only on the observed data. Moreover, they studied adaptive minimax estimation of non-compactly supported densities on \(\mathbb {R}^{d}\) under \(L^{p}\) risk over anisotropic Nikol'skii classes. For deconvolution density estimation, Comte and Lacour [2] considered \(L^{2}\) risk estimation by a data-driven kernel deconvolution estimator over anisotropic Nikol'skii and Sobolev classes. Three years later, Rebelles [18] extended those results to \(L^p\) risk over anisotropic Nikol'skii classes. In 2017 and 2019, Lepski and Willer [9, 10] established adaptive and optimal \(L^{p}\) risk estimation in the convolution structure density model via a data-driven kernel method, which covers both classical density estimation and deconvolution density estimation. Compared with kernel estimators, wavelet ones provide more local information and fast algorithms. Recently, Cao and Zeng [1] constructed a data-driven wavelet estimator and attained the optimal rate (up to a logarithmic factor) for non-compactly supported density functions over Besov spaces. However, few references estimate density functions with additive noise by data-driven wavelet methods.

This paper provides a data-driven wavelet estimator for compactly supported density functions in the deconvolution density model. It is totally adaptive, because the selection of its parameters depends only on the observed data. Using this estimator, we investigate \(L^{p}~(1\le p<\infty )\) risk estimation for moderately ill-posed noises over Besov balls \(B^{s}_{r,q}(M,T)~(r,q\in [1,\infty ])\). In contrast to the traditional wavelet results, our result covers the case \(0<s\le \frac{1}{r}\); moreover, in the region \(1\le p\le \frac{2sr+(2\beta +1)r}{sr+2\beta +1}\), the convergence rate improves upon that for not necessarily compactly supported density estimation [9, 10]; see Remark 4.2.

1.1 Preliminaries

We begin with the concept of Multiresolution Analysis (MRA, [8, 16]), which is a sequence of closed subspaces \(\{V_{j}\}_{j\in \mathbb {Z}}\) of the square integrable function space \(L^{2}(\mathbb {R})\) satisfying the following properties:

  (i) \(V_{j}\subset V_{j+1}\), \(j\in \mathbb {Z}\);

  (ii) \(\overline{\bigcup _{j\in \mathbb {Z}} V_{j}}=L^{2}(\mathbb {R})\) (the space \(\bigcup _{j\in \mathbb {Z}} V_{j}\) is dense in \(L^{2}(\mathbb {R})\));

  (iii) \(f(2\cdot )\in V_{j+1}\) if and only if \(f(\cdot )\in V_{j}\) for each \(j\in \mathbb {Z}\);

  (iv) there exists \(\varphi \in L^{2}(\mathbb {R})\) (a scaling function) such that \(\{\varphi (\cdot -k),~k\in \mathbb {Z}\}\) forms an orthonormal basis of \(V_{0}=\overline{\textrm{span}\{\varphi (\cdot -k),~k\in \mathbb {Z}\}}\).

With the standard notation \(h_{jk}(\cdot ):=2^{\frac{j}{2}}h\left( 2^{j}\cdot -k\right) \) in wavelet analysis, we can derive a wavelet function \(\psi \) from a scaling function \(\varphi \) in a simple way such that, for a fixed \(j\in \mathbb {Z}\), \(\{\psi _{jk}\}_{k\in \mathbb {Z}}\) constitutes an orthonormal basis of the orthogonal complement \(W_j\) of \(V_j\) in \(V_{j+1}\). Then for fixed \(j_0\in \mathbb {N}\), both \(\{\varphi _{j_0k},\psi _{jk}\}_{j\ge j_0,k\in \mathbb {Z}}\) and \(\{\psi _{jk}\}_{j,k\in \mathbb {Z}}\) are orthonormal bases (wavelet bases) of \(L^{2}(\mathbb {R})\). Thus, each \(f\in L^2(\mathbb {R})\) has the following expansion in the \(L^2\) sense,

$$\begin{aligned} f=\sum _{k\in \mathbb {Z}}\alpha _{j_0k}\varphi _{j_0k}+\sum _{j\ge j_0}\sum _{k\in \mathbb {Z}}\beta _{jk}\psi _{jk} \end{aligned}$$

with \(\alpha _{jk}:=\langle f, \varphi _{jk}\rangle \) and \(\beta _{jk}:=\langle f,\psi _{jk}\rangle \).

As usual, let \(P_{j}\) be the orthogonal projection operator from \(L^{2}(\mathbb {R})\) onto the scaling space \(V_{j}\) with the orthonormal basis \(\{\varphi _{jk}\}_{k\in \mathbb {Z}}\). Then for each \(f\in L^{2}(\mathbb {R})\),

$$\begin{aligned} P_{j}f=\sum _{k\in \mathbb {Z}}\alpha _{jk}\varphi _{jk} \end{aligned}$$

with \(\alpha _{jk}:=\langle f,\varphi _{jk}\rangle \).
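
As a quick illustration of the projection \(P_{j}f\), consider the noise-free setting with the Haar scaling function \(\varphi =1_{[0,1)}\) (our simplifying assumption here; Haar is only 0 regular and is used purely for intuition). The empirical coefficients \(\widehat{\alpha }_{jk}=\frac{1}{n}\sum _{i}\varphi _{jk}(X_{i})\) then reduce the estimate of \(P_{j}f\) to a histogram with bins of width \(2^{-j}\):

```python
import numpy as np

# Noise-free sketch of P_j f with the Haar scaling function phi = 1_[0,1):
# alpha_hat_{jk} = (1/n) sum_i phi_{jk}(X_i), and phi_{jk}(x) is nonzero
# only for k = floor(2^j x), so P_j f becomes a 2^{-j}-bin histogram.
rng = np.random.default_rng(1)
X = rng.beta(2, 5, size=10_000)        # i.i.d. sample from an "unknown" f
j = 4                                   # resolution level
k = np.floor(2**j * X).astype(int)
alpha_hat = 2**(j/2) * np.bincount(k, minlength=2**j) / len(X)

def Pj_f_hat(x):
    """Estimated (P_j f)(x) = sum_k alpha_hat_{jk} phi_{jk}(x)."""
    kx = int(np.floor(2**j * x))
    return 2**(j/2) * alpha_hat[kx] if 0 <= kx < 2**j else 0.0

print(Pj_f_hat(0.25))  # compare with the Beta(2,5) density 30*x*(1-x)^4 at 0.25
```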

One of the advantages of wavelet bases is that they characterize Besov spaces, which contain Hölder and \(L^{2}\)-Sobolev spaces as special cases. To introduce Lemma 1.1, we need some notation: a scaling function \(\varphi \) is called m regular [3] if \(\varphi \in \mathcal {C}^{m}(\mathbb {R})\) and \(|\varphi ^{(r)}(x)|\le C(1+|x|^{2})^{-l}\) holds for each \(l\in \mathbb {Z}~(r=0,1,\cdots ,m)\); \(\Vert f\Vert _{r}\) denotes the \(L^{r}(\mathbb {R})\) norm of \(f\in L^{r}(\mathbb {R})\), and \(\Vert \tau \Vert _{l^{r}}\) the \(l^{r}(\mathbb {Z})\) norm, where

$$\begin{aligned} l^{r}(\mathbb {Z}):=\left\{ \begin{array}{ll} \left\{ \tau =\{\tau _{k}\},~\displaystyle \sum _{k\in \mathbb {Z}}|\tau _{k}|^{r}<\infty \right\} , &{} 1\le r<\infty ;\\ \left\{ \tau =\{\tau _{k}\},~\displaystyle \sup _{k\in \mathbb {Z}}|\tau _{k}|<\infty \right\} , &{} r=\infty . \end{array} \right. \end{aligned}$$

Lemma 1.1

([16]). Let \(\varphi \) be m regular with \(m>s>0\), let \(\psi \) be the corresponding wavelet, and let \(f\in L^{r}(\mathbb {R})\). If \(\alpha _{jk}:=\langle f,\varphi _{jk}\rangle \), \(\beta _{jk}:=\langle f,\psi _{jk}\rangle \) and \(r,q\in [1,\infty ]\), then the following assertions are equivalent:

  (i) \(f\in B^{s}_{r,q}(\mathbb {R})\);

  (ii) \(\{2^{js}\Vert P_{j}f-f\Vert _{r}\}\in l^{q}(\mathbb {Z});\)

  (iii) \(\{2^{j\left( s-\frac{1}{r}+\frac{1}{2}\right) }\Vert \{\beta _{j\cdot }\}\Vert _{l^r}\}\in l^{q}(\mathbb {Z}).\)

The Besov norm of f can be defined by

$$\begin{aligned} \Vert f\Vert _{B^{s}_{r,q}}:=\Vert \{\alpha _{j_{0}\cdot }\}\Vert _{l^r}+ \left\| \left\{ 2^{j\left( s-\frac{1}{r}+\frac{1}{2}\right) }\Vert \{\beta _{j\cdot }\}\Vert _{l^r}\right\} _{j\ge j_{0}}\right\| _{l^q}. \end{aligned}$$

Moreover, Lemma 1.1 (i) and (ii) show that \(\Vert P_jf-f\Vert _r\lesssim 2^{-js}\) holds for \(f\in B^{s}_{r,q}(\mathbb {R})\). Hereafter, the notation \(A\lesssim B\) means \(A\le cB\) for some fixed constant \(c>0\) independent of n; \(A\gtrsim B\) means \(B\lesssim A\); and \(A\thicksim B\) stands for both \(A\lesssim B\) and \(A\gtrsim B\).

When \(r\le p\), Lemma 1.1 (i) and (iii) imply that with \(s'-\frac{1}{p}=s-\frac{1}{r}>0\),

$$\begin{aligned} B_{r,q}^s(\mathbb {R})\hookrightarrow B_{p,q}^{s'}(\mathbb {R}), \end{aligned}$$

where \(A\hookrightarrow B\) stands for a Banach space A continuously embedded in another Banach space B. All these claims can be found in Refs. [11, 20].

In this paper, the notation \(B_{r,q}^{s}(M)\) with \(M>0\) stands for a Besov ball, i.e.,

$$\begin{aligned} B_{r,q}^{s}(M):=\{f\in B_{r,q}^{s}(\mathbb {R}),~f~\text{is a density function and}~\Vert f\Vert _{B_{r,q}^{s}}\le M\} \end{aligned}$$

and

$$\begin{aligned} B_{r,q}^{s}(M,T):=\{f\in B_{r,q}^{s}(M),~\textrm{supp}\,f\subseteq [-T,T]~\text{with}~T>0\}. \end{aligned}$$

Moreover, \(L^{\infty }(M)\) is defined in the same manner.

To introduce the assumptions on a noise function \(f_{\varepsilon }\), we need the Fourier transform \(f^{ft}\) of \(f\in L^{1}(\mathbb {R})\),

$$\begin{aligned} f^{ft}(t):=\int _{\mathbb {R}}f(x)e^{-itx}dx. \end{aligned}$$

A standard method extends this definition to \(L^{2}(\mathbb {R})\) functions. Furthermore, the following conditions are imposed on the noise density function \(f_{\varepsilon }\). For \(\beta \ge 0\),

(T1)  \(\left| f_{\varepsilon }^{ft}(t)\right| \gtrsim (1+|t|^{2})^{-\frac{\beta }{2}};\)

(T2)  \(\left| (f_{\varepsilon }^{ft})^{(\ell )}(t)\right| \lesssim (1+|t|^{2})^{-\frac{\beta +\ell }{2}},~\ell =0,1,2.\)

Such a noise \(\varepsilon \) is said to be moderately ill-posed. Clearly, the Gamma distribution \(\Gamma (a,b)\) satisfies Conditions (T1)–(T2) with \(\beta =a\). In particular, the index \(\beta =0\) corresponds to \(f_{\varepsilon }\) degenerating to the Dirac functional \(\delta \), in which case the model (1.1) reduces to the classical noise-free density estimation model.
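
For instance, \(\Gamma (a,1)\) has \(f_{\varepsilon }^{ft}(t)=(1+it)^{-a}\), so \(|f_{\varepsilon }^{ft}(t)|=(1+t^{2})^{-a/2}\) and (T1) holds with \(\beta =a\), even with equality. A short numerical check (the choice \(a=2\) is purely illustrative):

```python
import numpy as np

# Check (T1) for Gamma(a, 1) noise: f_eps^ft(t) = (1 + i t)^{-a}, hence
# |f_eps^ft(t)| = (1 + t^2)^{-a/2}, i.e. beta = a. (a = 2 is illustrative.)
a = 2
t = np.linspace(-100, 100, 20_001)
ratio = np.abs((1 + 1j * t) ** (-a)) * (1 + t**2) ** (a / 2)
print(ratio.min(), ratio.max())   # both equal 1 up to rounding: (T1) is sharp
```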

1.2 Data-Driven Wavelet Estimator

This subsection is devoted to introducing the data-driven wavelet estimator for model (1.1). Under Condition (T1),

$$\begin{aligned} \widehat{\alpha }_{jk}:=\frac{2^{j/2}}{n}\sum _{i=1}^{n}(K_{j}\varphi )\left( 2^{j}Y_{i}-k\right) ~~~~\text { and }~~~~ (K_{j}\varphi )(x):=\frac{1}{2\pi }\int _{\mathbb {R}} e^{itx}\frac{\varphi ^{ft}(t)}{f_{\varepsilon }^{ft}(-2^{j}t)}dt \end{aligned}$$
(1.2)

are well-defined, where \(\varphi \) is m regular with \(m>\beta +1\). Then the classical linear wavelet estimator for the deconvolution density model is given by

$$\begin{aligned} \widehat{f}_{j}(x)=\sum _{k\in \mathbb {Z}}\widehat{\alpha }_{jk}\varphi _{jk}(x). \end{aligned}$$
(1.3)

Clearly, \(E\widehat{\alpha }_{jk}=\alpha _{jk}\) and \(E\widehat{f}_{j}=P_{j}f\); for details, see Refs. [6, 11, 13]. In general, the parameter j in (1.3) depends on the smoothness index s of the unknown density function f, so the estimator in (1.3) is non-adaptive [6, 11, 13].

Next, we give a selection rule for the parameter j that depends only on the observed data \(Y_{1},\cdots ,Y_{n}\), the so-called data-driven version. Let \(\mathcal {H}:=\left\{ 0,1,\cdots ,\left\lfloor \frac{1}{2\beta +1}\log _2{\frac{n}{\ln n}}\right\rfloor \right\} \), where \(\lfloor a\rfloor \) denotes the largest integer less than or equal to a, and let

$$\begin{aligned} \xi _{n}(x,j):=\widehat{f}_{j}(x)-E\widehat{f}_{j}(x) \end{aligned}$$
(1.4)

be the stochastic error of \(\widehat{f}_{j}\). Moreover, for any \(x\in [-T,T]\),

$$\begin{aligned} \widehat{R}_{j}(x):=\sup _{j'\in \mathcal {H}}\left[ \left| \widehat{f}_{j\wedge j'}(x)-\widehat{f}_{j'}(x)\right| -U_{n}(j\wedge j') -U_{n}(j')\right] _{+}, \end{aligned}$$
(1.5)
$$\begin{aligned} U_{n}^{*}(j):=\sup _{j'\in \mathcal {H},~j'\le j}U_{n}(j'). \end{aligned}$$
(1.6)

Hereafter, \(a\wedge b:=\min \{a,b\}\), \(a_{+}:=\max \{a,0\}\) and

$$\begin{aligned} U_{n}(j):=\sqrt{\frac{\lambda 2^{j(2\beta +1)}\ln n}{n}}+\frac{\lambda 2^{j(\beta +1)}\ln n}{n}, \end{aligned}$$
(1.7)

where the constant \(\lambda >0\) will be determined later on.

Thus, the parameter \(j=j_{0}\) in (1.3) is selected by

$$\begin{aligned} j_{0}=j_{0}(x)=\mathop {\text {arginf}}_{j\in \mathcal {H}} \left[ \widehat{R}_{j}(x)+2U_{n}^{*}(j)\right] . \end{aligned}$$
(1.8)

Obviously, this choice depends only on the observed data \(Y_1,\cdots ,Y_n\). The data-driven wavelet estimator is then given by

$$\begin{aligned} \widehat{f}_{n}(x):=\widehat{f}_{j_{0}}(x)=\sum _{k\in \mathbb {Z}}\widehat{\alpha }_{j_{0}k} \varphi _{j_{0}k}(x) \end{aligned}$$
(1.9)

with \(j_0\in \mathcal {H}\) being given in (1.8).
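
To make the construction concrete, the following is a minimal numerical sketch of (1.2)–(1.9). It assumes Laplace noise, for which \(f_{\varepsilon }^{ft}(t)=(1+t^{2})^{-1}\) (so \(\beta =2\)) and (1.2) simplifies to the exact identity \((K_{j}\varphi )(x)=\varphi (x)-4^{j}\varphi ''(x)\), and it uses the Meyer scaling function. The noise model, the value \(\lambda =1\), the truncation in k, and all grid sizes are illustrative assumptions of this sketch, not prescriptions of the theory (in particular, Proposition 3.1 below requires \(\lambda \) larger than an explicit threshold).

```python
import numpy as np

TWO_PI = 2.0 * np.pi

# Meyer scaling function on the Fourier side.
def nu(x):
    x = np.clip(x, 0.0, 1.0)
    return x**4 * (35 - 84*x + 70*x**2 - 20*x**3)

def phi_ft(t):
    a = np.abs(t)
    return np.where(a <= TWO_PI/3, 1.0,
                    np.where(a <= 2*TWO_PI/3,
                             np.cos(np.pi/2 * nu(3*a/TWO_PI - 1)), 0.0))

# Tabulate phi and phi'' by numerically inverting phi^ft (phi is real, even).
t = np.linspace(-2*TWO_PI/3, 2*TWO_PI/3, 801)
dt = t[1] - t[0]
xg = np.linspace(-15.0, 15.0, 3001)
E = np.exp(1j * np.outer(xg, t))
phi_tab = np.real(E @ phi_ft(t)) * dt / TWO_PI
phidd_tab = np.real(E @ (-t**2 * phi_ft(t))) * dt / TWO_PI

def Kj_phi(x, j):
    # Laplace noise: f_eps^ft(t) = 1/(1+t^2), so by (1.2)
    # (K_j phi)(x) = phi(x) - 4^j phi''(x) exactly.
    p = np.interp(x, xg, phi_tab, left=0.0, right=0.0)
    pdd = np.interp(x, xg, phidd_tab, left=0.0, right=0.0)
    return p - 4.0**j * pdd

def f_hat_j(x, Y, j):
    # Linear estimator (1.3) at a point x; k is truncated to a window
    # around 2^j x, since phi decays rapidly away from its center.
    n = len(Y)
    ks = np.arange(np.floor(2**j * x) - 12, np.floor(2**j * x) + 13)
    alpha = np.array([(2**(j/2) / n) * Kj_phi(2**j * Y - k, j).sum() for k in ks])
    phis = 2**(j/2) * np.interp(2**j * x - ks, xg, phi_tab, left=0.0, right=0.0)
    return float(alpha @ phis)

def U_n(j, n, lam, beta=2):
    # Threshold (1.7).
    return (np.sqrt(lam * 2**(j*(2*beta+1)) * np.log(n) / n)
            + lam * 2**(j*(beta+1)) * np.log(n) / n)

def f_hat_data_driven(x, Y, lam=1.0, beta=2):
    # Selection rule (1.5)-(1.8) and the final estimator (1.9).
    n = len(Y)
    H = range(int(np.log2(n / np.log(n)) / (2*beta + 1)) + 1)
    fj = {j: f_hat_j(x, Y, j) for j in H}
    Ustar = {j: max(U_n(jp, n, lam, beta) for jp in H if jp <= j) for j in H}
    R_hat = {j: max(max(0.0, abs(fj[min(j, jp)] - fj[jp])
                        - U_n(min(j, jp), n, lam, beta)
                        - U_n(jp, n, lam, beta)) for jp in H) for j in H}
    j0 = min(H, key=lambda j: R_hat[j] + 2 * Ustar[j])
    return fj[j0]

rng = np.random.default_rng(0)
n = 2000
X = 2 * rng.beta(2, 2, n) - 1              # f supported on [-1, 1]
Y = X + rng.laplace(scale=1.0, size=n)     # observed data from (1.1)
print(f_hat_data_driven(0.0, Y))           # f(0) = 0.75 for this f
```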

2 Oracle Inequality

We shall state a point-wise oracle inequality in this section, which plays a key role in the proofs of Proposition 3.2 and Theorem 4.1.

Let \(B_j(x,f)\) be the bias of the estimator \(\widehat{f}_{j}(x)\), i.e.,

$$\begin{aligned} B_{j}(x,f):=\left| E\widehat{f}_{j}(x)-f(x)\right| =\left| P_jf(x)-f(x)\right| \end{aligned}$$
(2.1)

and define

$$\begin{aligned} B_{j}^{*}(x,f):= \sup _{j'\in \mathcal {H},~j'\ge j}B_{j'}(x,f)\quad \text { and }\quad v(x):= \sup _{j\in \mathcal {H}}\Big [\left| \xi _{n}(x,j)\right| -U_n(j)\Big ]_{+}, \end{aligned}$$
(2.2)

where \(\xi _{n}(x,j)\) and \(U_{n}(j)\) are given by (1.4) and (1.7) respectively.

Theorem 2.1

For any \(x\in [-T,T]\), the estimator \(\widehat{f}_{n}(x)\) in (1.9) satisfies that

$$\begin{aligned} \left| \widehat{f}_{n}(x)-f(x)\right| \le \inf _{j\in \mathcal {H}}\left\{ 5B_{j}^{*}(x,f)+5U_{n}^{*}(j)\right\} +5v(x), \end{aligned}$$

where \(U_{n}^{*}(j)\) is defined in (1.6) and \(B_{j}^{*}(x,f),~v(x)\) are defined in (2.2).

Proof

Obviously, it follows from (1.5) and (1.6) that

$$\begin{aligned} \left| \widehat{f}_{j\wedge j_0}(x)-\widehat{f}_{j_{0}}(x)\right| \le \widehat{R}_{j}(x)+U_{n}(j\wedge j_0) +U_{n}(j_{0}) \le \widehat{R}_{j}(x)+2U_{n}^{*}(j_{0}). \end{aligned}$$
(2.3)

The same arguments as (2.3) show

$$\begin{aligned} \left| \widehat{f}_{j_{0}\wedge j}(x)-\widehat{f}_{j}(x)\right| \le \widehat{R}_{j_{0}}(x)+2U_{n}^{*}(j). \end{aligned}$$
(2.4)

Then combining (2.3) and (2.4), one obtains that

$$\begin{aligned} \left| \widehat{f}_{j_0}(x)-f(x)\right|&\le \left| \widehat{f}_{j_{0}\wedge j}(x)-\widehat{f}_{j_{0}}(x)\right| +\left| \widehat{f}_{j_{0}\wedge j}(x)-\widehat{f}_{j}(x)\right| + \left| \widehat{f}_{j}(x)-f(x)\right| \nonumber \\&\le 2\widehat{R}_{j}(x)+4U_{n}^{*}(j)+\left| \widehat{f}_{j}(x)-f(x)\right| \end{aligned}$$
(2.5)

due to \(\widehat{f}_{j_{0}\wedge j}=\widehat{f}_{j\wedge j_{0}}\) and the selection of \(j_0\) in (1.8).

Clearly, by (1.6) and (2.2),

$$\begin{aligned} |\xi _n(x,j)|\le \Big [|\xi _n(x,j)|-U_n(j)\Big ]_++U_n(j)\le v(x)+U_{n}^{*}(j). \end{aligned}$$

This with (2.1) and (2.2) implies that

$$\begin{aligned} \left| \widehat{f}_{j}(x)-f(x)\right| \le B_{j}(x,f)+\left| \xi _{n}(x,j)\right| \le B_{j}^{*}(x,f)+v(x)+U_{n}^{*}(j). \end{aligned}$$
(2.6)

On the other hand, according to (1.4) and (1.5),

$$\begin{aligned} \widehat{R}_{j}(x)&=\sup _{j'\in \mathcal {H}}\Big [\left| \widehat{f}_{j\wedge j'}(x)-\widehat{f}_{j'}(x)\right| -U_{n}(j\wedge j')-U_{n}(j')\Big ]_{+}\\&\le \sup _{j'\in \mathcal {H}}\Big [\left| E\widehat{f}_{j\wedge j'}(x)-E\widehat{f}_{j'}(x)\right| +\left| \xi _{n}(x,j\wedge j')\right| -U_{n}(j\wedge j')+\left| \xi _{n}(x,j')\right| -U_{n}(j')\Big ]_{+}. \end{aligned}$$

This with \(\displaystyle \sup \limits _{j'\in \mathcal {H}}\left| E\widehat{f}_{j\wedge j'}(x)-E\widehat{f}_{j'}(x)\right| \le \sup \limits _{j'\in \mathcal {H},j'\ge j}\{B_{j\wedge j'}(x,f)+B_{j'}(x,f)\}\) and (2.2) leads to

$$\begin{aligned} \widehat{R}_{j}(x)\le 2B_{j}^{*}(x,f)+2v(x). \end{aligned}$$
(2.7)

Hence, it follows from (2.5)–(2.7) that

$$\begin{aligned} |\widehat{f}_{j_{0}}(x)-f(x)| \le 5B_{j}^{*}(x,f)+5v(x)+5U_n^{*}(j) \end{aligned}$$

holds for any \(j\in \mathcal {H}\). Furthermore,

$$\begin{aligned} |\widehat{f}_{n}(x)-f(x)|=|\widehat{f}_{j_0}(x)-f(x)|\le \inf _{j\in \mathcal {H}}\left\{ 5B_{j}^{*}(x,f)+5U_{n}^{*}(j)\right\} +5v(x) \end{aligned}$$

thanks to \(\widehat{f}_{n}(x)=\widehat{f}_{j_0}(x)\) in (1.9). The proof is done. \(\square \)

3 Two Propositions

This section provides two necessary propositions. Some lemmas and a classical inequality are needed in order to prove Proposition 3.1.

Note that the following condition

(C1)  \(\varphi \in L^{1}(\mathbb {R})\) and \(|(\varphi ^{ft})^{(\ell )}(t)|\lesssim (1+|t|^{2})^{-\frac{m}{2}}\) with \(m>1\) and \(\ell =0,1,2\)  (see Ref. [13])

follows from the m regularity of the scaling function \(\varphi \). Then the following lemma holds, according to the work of Liu and Zeng [13].

Lemma 3.1

([13]). Let \(\varphi \) be m regular and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then \(K_{j}\varphi \) in (1.2) satisfies that

$$\begin{aligned} \left| \sum _{k\in \mathbb {Z}}(K_{j}\varphi )(x-k)\varphi (y-k)\right| \le M_{0}2^{j\beta }\left( 1+|x-y|^{2}\right) ^{-1}, \end{aligned}$$

where \(M_{0}>0\) is some constant.

To introduce Lemma 3.2, we define

$$\begin{aligned} K^{*}_{j}(t,x):=2^{j}\sum _{k\in \mathbb {Z}}(K_{j}\varphi )\left( 2^{j}t-k\right) \varphi \left( 2^{j}x-k\right) , \end{aligned}$$
(3.1)

where \(K_{j}\varphi \) is given by (1.2). Then the estimator \(\widehat{f}_{j}(x)\) in (1.3) can be rewritten as \(\displaystyle \widehat{f}_{j}(x)=\frac{1}{n}\sum \nolimits _{i=1}^{n}K^{*}_{j}(Y_{i},x)\). Furthermore, the following lemma holds.

Lemma 3.2

Let \(\varphi \) be m regular and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then \(K_{j}^{*}(t,x)\) in (3.1) satisfies that

$$\begin{aligned} \left| K_{j}^{*}(t,x)\right| \le M_{1}2^{j(\beta +1)} ~~~~\text{ and }~~~~ E\left| K_{j}^{*}(Y_{1},x)\right| ^{2}\le M_{1}2^{j(2\beta +1)}, \end{aligned}$$

where \(M_1>0\) is some constant.

Proof

According to the definition of \(K_{j}^{*}(t,x)\) in (3.1) and Lemma 3.1, one obtains

$$\begin{aligned} \left| K^{*}_{j}(t,x)\right| =\left| 2^{j}\sum _{k\in \mathbb {Z}}(K_{j}\varphi )(2^{j}t-k)\varphi (2^{j}x-k)\right| \le M_02^{j(\beta +1)}. \end{aligned}$$
(3.2)

On the other hand, \(\Vert g\Vert _{\infty }=\Vert f*f_{\varepsilon }\Vert _{\infty } \le \Vert f\Vert _{\infty }\Vert f_{\varepsilon }\Vert _{1}=\Vert f\Vert _{\infty }\). This with (3.1) and Lemma 3.1 leads to

$$\begin{aligned} E|K_{j}^{*}(Y_{1},x)|^{2}&\le \int _{\mathbb {R}}|K^{*}_{j}(t,x)|^{2}g(t)dt \nonumber \\&\le \Vert f\Vert _{\infty }M_{0}^{2}2^{j(2\beta +2)}2^{-j}\int _{\mathbb {R}} (1+|2^{j}t-2^{j}x|^{2})^{-2} d(2^{j}t-2^{j}x)\nonumber \\&\le \Vert f\Vert _{\infty }M_{0}^{2}2^{j(2\beta +1)}\int _{\mathbb {R}}(1+|t|^{2})^{-2}dt. \end{aligned}$$
(3.3)

The desired conclusions follow from (3.2)–(3.3) by choosing \(M_1:=\max \left\{ \Vert f\Vert _{\infty }M_{0}^{2}\int _{\mathbb {R}}(1+|t|^{2})^{-2}dt,~M_0\right\} \). \(\square \)

To show Proposition 3.1, we need a well-known inequality.

Bernstein’s inequality ([15]). Let \(Y_{1},\cdots ,Y_{n}\) be i.i.d. random variables with \(EY_{i}^{2}\le \sigma ^{2}\) and \(|Y_{i}|\le M\) \((i=1,2,\cdots ,n)\). Then for any \(x>0\),

$$\begin{aligned} P\left\{ \left| \frac{1}{n}\sum _{i=1}^n(Y_i-EY_i)\right| \ge \sqrt{\frac{2\sigma ^2x}{n}} +\frac{4Mx}{3n}\right\} \le 2e^{-x}. \end{aligned}$$
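
As a quick sanity check of the inequality (with parameter values that are purely illustrative), one can verify it by simulation:

```python
import numpy as np

# Monte Carlo check of Bernstein's inequality for Y_i ~ Uniform[-1, 1]
# (EY_i = 0, EY_i^2 = 1/3 =: sigma^2, |Y_i| <= 1 =: M).
rng = np.random.default_rng(2)
n, reps, x = 200, 20_000, 3.0
sigma2, M = 1/3, 1.0
thr = np.sqrt(2 * sigma2 * x / n) + 4 * M * x / (3 * n)
means = rng.uniform(-1, 1, size=(reps, n)).mean(axis=1)
print((np.abs(means) >= thr).mean(), "<=", 2 * np.exp(-x))  # ~0.003 <= ~0.0996
```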

Now, we state the first proposition, which is one of the main ingredients in the proof of the second one.

Proposition 3.1

Let \(\varphi \) be m regular and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then for each \(x\in [-T,T]\), \(\gamma >0\) and \(\lambda >\max \{2M_1,~2M_1(\beta +2)\gamma \ln 2\}\),

$$\begin{aligned} E[v(x)]^{\gamma }\lesssim n^{-\frac{\gamma }{2}}, \end{aligned}$$

where v(x) is defined in (2.2) and \(M_1\) is the positive constant in Lemma 3.2.

Proof

For any \(j\in \mathcal {H}\), one defines

$$\begin{aligned} \overline{U_{n}}(j):=\sqrt{\frac{2M_{1}2^{j(2\beta +1)}}{n}\lambda _j} + \frac{4M_{1}2^{j(\beta +1)}}{3n}\lambda _j, \end{aligned}$$
(3.4)

where \(\lambda _{j}=\max \left\{ (\beta +2)\gamma j\ln 2,\frac{1}{4}\right\} \).

Note that \(\lambda \ln n\ge 2M_{1}\lambda _j\) for large n follows from \(\lambda >\max \{2M_1,2M_1(\beta +2)\gamma \ln 2\}\) and \(j\in \mathcal {H}\). Then \(\overline{U_{n}}(j)\le U_{n}(j)\) due to (1.7) and (3.4). Furthermore,

$$\begin{aligned} \Big [\left| \xi _{n}(x,j)\right| -U_{n}(j)\Big ]_+\le \Big [\left| \xi _{n}(x,j)\right| -\overline{U_{n}}(j)\Big ]_+. \end{aligned}$$
(3.5)

For each \(t\ge 0\),

$$\begin{aligned} P\left\{ \big [\left| \xi _{n}(x,j)\right| -\overline{U_{n}}(j)\big ]_+>t\right\} =P\left\{ \left| \xi _{n}(x,j)\right| -\overline{U_{n}}(j)>t\right\} . \end{aligned}$$

Hence,

$$\begin{aligned} E\Big [|\xi _{n}(x,j)|-\overline{U_{n}}(j)\Big ]_+^{\gamma } =\gamma \int _0^\infty t^{\gamma -1} P\left\{ \left| \xi _{n}(x,j)\right| -\overline{U_{n}}(j)>t\right\} dt. \end{aligned}$$

This with variable substitution \(t=v\omega \) and \(\omega :=\sqrt{\frac{2M_{1}2^{j(2\beta +1)}}{n}}+ \frac{4M_{1}2^{j(\beta +1)}}{3n}\) shows

$$\begin{aligned}&E\Big [|\xi _{n}(x,j)|-\overline{U_{n}}(j)\Big ]_+^{\gamma }\nonumber \\\le&\gamma \int _{0}^{\infty }(v\omega )^{\gamma -1}P\left\{ |\xi _{n}(x,j)|> \sqrt{\frac{2M_{1}2^{j(2\beta +1)}}{n}}(\sqrt{v+\lambda _j})+\frac{4M_{1}2^{j(\beta +1)}}{3n} (v+\lambda _j)\right\} \omega dv \end{aligned}$$
(3.6)

because of \(v+\sqrt{\lambda _j}\ge \sqrt{v+\lambda _j}\) and \(\lambda _j\ge \frac{1}{4}\).

On the other hand, \(\left| K_{j}^{*}(Y_{i},x)\right| \le M_{1}2^{j(\beta +1)}\), \(E\left| K_{j}^{*}(Y_{i},x)\right| ^{2}\le M_{1}2^{j(2\beta +1)} \) and

$$\begin{aligned} \xi _n(x,j) =\frac{1}{n}\sum _{i=1}^{n}\left[ K_{j}^{*}(Y_{i},x)-EK_{j}^{*}(Y_{i},x)\right] \end{aligned}$$

by Lemma 3.2. Then

$$\begin{aligned} P\Bigg \{\left| \xi _{n}(x,j)\right| > \sqrt{\frac{2M_{1}2^{j(2\beta +1)}}{n}}\left( \sqrt{v+\lambda _j}\right) + \frac{4M_{1}2^{j(\beta +1)}}{3n} (v+\lambda _j)\Bigg \}\le 2e^{-(v+\lambda _j)} \end{aligned}$$

thanks to Bernstein’s inequality. This with (3.6) implies that

$$\begin{aligned} E\Big [|\xi _{n}(x,j)|-\overline{U_{n}}(j)\Big ]_+^{\gamma }&\le 2\gamma \omega ^{\gamma }\int _{0}^{\infty }v^{\gamma -1}e^{-(v+\lambda _j)}dv\\&=2\gamma \omega ^{\gamma }e^{-\lambda _j}\int _{0}^{\infty }v^{\gamma -1}e^{-v}dv\\&=2\gamma \Gamma (\gamma )\omega ^{\gamma }e^{-\lambda _j}\\&=2\Gamma (\gamma +1) \left[ \sqrt{\frac{2M_{1}2^{j(2\beta +1)}}{n}}+ \frac{4M_{1}2^{j(\beta +1)}}{3n}\right] ^{\gamma } e^{-\lambda _j} \end{aligned}$$

due to \(\omega :=\sqrt{\frac{2M_{1}2^{j(2\beta +1)}}{n}}+ \frac{4M_{1}2^{j(\beta +1)}}{3n}\). Hence, according to \(e^{-\lambda _{j}}\le 2^{-(\beta +2)\gamma j}\), one obtains

$$\begin{aligned} \sum _{j\in \mathcal {H}}E\Big [\left| \xi _{n}(x,j)\right| -\overline{U_{n}}(j)\Big ]_+^{\gamma } \lesssim \sum _{j\in \mathcal {H}}\left( \frac{2^{j(\beta +1)}}{\sqrt{n}}\right) ^{\gamma } 2^{-(\beta +2)\gamma j} \lesssim n^{-\frac{\gamma }{2}}. \end{aligned}$$
(3.7)

Combining (2.2), (3.5) and (3.7), one obtains that

$$\begin{aligned} E[v(x)]^{\gamma } \lesssim E\sup _{j\in \mathcal {H}}\Big [\left| \xi _{n}(x,j)\right| -\overline{U_{n}}(j)\Big ]_+^{\gamma } \lesssim \sum _{j\in \mathcal {H}}E\Big [|\xi _{n}(x,j)|-\overline{U_{n}}(j)\Big ]_+^{\gamma } \lesssim n^{-\frac{\gamma }{2}}, \end{aligned}$$

since \(\mathcal {H}\) is a discrete set, which completes the proof. \(\square \)

Before giving another proposition, we introduce the following notations:

$$\begin{aligned} U_{f}(x):=\inf _{j\in \mathcal {H}}\left\{ B_{j}^{*}(x,f)+U_{n}^{*}(j)\right\} , \end{aligned}$$
(3.8)
$$\begin{aligned} \Omega _{m}:=\left\{ x\in [-T,T],~2^{m}\delta _n<U_{f}(x)\le 2^{m+1}\delta _n\right\} , \end{aligned}$$
(3.9)
$$\begin{aligned} \Omega _{0}^{-}:=\left\{ x\in [-T,T],~U_{f}(x)\le \delta _n\right\} , \end{aligned}$$
(3.10)

where \(\delta _n=\left( \frac{C\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\) and \(C>1\) is some constant.

Note that \(U_{f}(x)\le c_0:=\sup _xU_{f}(x)\). Then there exists

$$\begin{aligned} m_2:=\min \left\{ m\in \mathbb {Z},~2^{m}\delta _n\ge c_0\right\} \end{aligned}$$
(3.11)

such that \(\Omega _{m}=\emptyset \) for each \(m>m_{2}\). Clearly, \(m_{2}>0\) for large n.

Proposition 3.2

Let \(U_f(x),\Omega _m,\Omega _{0}^-\) be defined by (3.8)–(3.10) respectively, let \(\varphi \) be m regular, and let Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then

$$\begin{aligned} J_{0}^{-}:=E\int _{\Omega _{0}^{-}}|\widehat{f}_{n}(x)-f(x)|^pdx \quad \text{ and }\quad J_m:=E\int _{\Omega _{m}}[U_f(x)]^pdx \end{aligned}$$

satisfy that

(1) For each \(p\in [1,\infty )\),

$$\begin{aligned} J_{0}^{-}\lesssim \delta _n^{p}; \end{aligned}$$

(2) If \(f\in B_{r,q}^{s}(M)\cap L^{\infty }(M)\) and \(m\in \mathbb {Z}\) satisfy \(0\le m\le m_2\), then

$$\begin{aligned} J_{m}\lesssim 2^{m\left( p-r-\frac{2sr}{2\beta +1}\right) }\delta _n^{p}; \end{aligned}$$

Moreover, if \(s>\frac{1}{r}\) and \(r\le p\), then with \(s':=s-\frac{1}{r}+\frac{1}{p}\),

$$\begin{aligned} J_{m}\lesssim 2^{-\frac{2ms'p}{2\beta +1}}\delta _n^{\frac{s'}{s}p}. \end{aligned}$$

Proof

(1) According to Theorem 2.1, one finds that

$$\begin{aligned} |\widehat{f}_{n}(x)-f(x)|\lesssim U_{f}(x)+v(x) \end{aligned}$$

holds for any \(x\in [-T,T]\), where v(x) and \(U_f(x)\) are given by (2.2) and (3.8) respectively. Then for each \(p\in [1,\infty )\), by using (3.10) and Proposition 3.1,

$$\begin{aligned} J_{0}^-=E\int _{\Omega _{0}^-}|\widehat{f}_{n}(x)-f(x)|^pdx\lesssim E\int _{\Omega _{0}^-}[U_{f}(x)+v(x)]^{p}dx \lesssim \delta _n^{p}+n^{-\frac{p}{2}}\lesssim \delta _n^{p} \end{aligned}$$

thanks to \(\delta _n\thicksim \left( \frac{\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\), which is the first desired conclusion.

(2) Take \(j_1\) satisfying \(c_12^{\frac{2m}{2\beta +1}}\delta _n^{-\frac{1}{s}}\le 2^{j_{1}}\le c_22^{\frac{2m}{2\beta +1}}\delta _n^{-\frac{1}{s}}\), where the two positive constants \(c_1,c_2\) satisfy

$$\begin{aligned} (2M)^{\frac{1}{s}}I_{\{r=\infty \}}<c_1<c_2< \min \left\{ \frac{C}{4c_0^{2}},\frac{C}{4\left( \sqrt{\lambda }+\lambda \right) ^{2}} \right\} ^{\frac{1}{2\beta +1}}. \end{aligned}$$
(3.12)

Then \(j_{1}\in \mathcal {H}\) and \(U_{n}^{*}(j_1)\le 2^{m-1}\delta _n\) for \(0<m\le m_2\) and large n. In fact, (3.11) tells us that \(2^{m_{2}}\le 2c_0\delta _n^{-1}\). Due to \(0<m\le m_{2}\), (3.12) and \(\delta _{n}=\left( \frac{C\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\), one concludes that

$$\begin{aligned} 1<c_1\delta _n^{-\frac{1}{s}}\le 2^{j_{1}} \le c_22^{\frac{2m_2}{2\beta +1}}\delta _n^{-\frac{1}{s}}\le c_2(2c_0)^{\frac{2}{2\beta +1}}\delta _n^{-\left( \frac{1}{s}+\frac{2}{2\beta +1}\right) } <\Big (\frac{n}{\ln n}\Big )^{\frac{1}{2\beta +1}}. \end{aligned}$$

Hence, \(j_{1}\in \mathcal {H}\). This with \(2^{j_{1}}\le c_{2}2^{\frac{2m}{2\beta +1}}\delta _{n}^{-\frac{1}{s}}\) and (3.12) shows that

$$\begin{aligned} U_{n}^{*}(j_1)&\le \sup _{j'\in \mathcal {H},~j'\le j_1}\left\{ \sqrt{\frac{\lambda 2^{j'(2\beta +1)}\ln n}{n}}+\frac{\lambda 2^{j'(\beta +1)}\ln n}{n}\right\} \nonumber \\&\le (\sqrt{\lambda }+\lambda )\sqrt{\frac{2^{j_1(2\beta +1)}\ln n}{n}}\nonumber \\&\le (\sqrt{\lambda }+\lambda ) \sqrt{c_2^{2\beta +1}2^{2m}\delta _n^{-\frac{2\beta +1}{s}}\,\frac{\ln n}{n}} \nonumber \\&\le (\sqrt{\lambda }+\lambda ) \sqrt{c_2^{2\beta +1}/C}\,2^{m}\delta _n\le 2^{m-1}\delta _n. \end{aligned}$$
(3.13)

Clearly, by \(\Omega _m=\{x\in [-T,T],~2^{m}\delta _n <U_{f}(x)\le 2^{m+1}\delta _n\}\),

$$\begin{aligned} J_m=\int _{\Omega _m}[U_{f}(x)]^pdx\le (2^{m+1}\delta _n)^p|\Omega _m|, \end{aligned}$$
(3.14)

where \(|\Omega _m|\) stands for the Lebesgue measure of the set \(\Omega _m\). Moreover, (3.8) and (3.13) lead to

$$\begin{aligned} |\Omega _m|&\le \left| \left\{ x\in [-T,T],~U_{f}(x)>2^{m}\delta _n\right\} \right| \nonumber \\&\le \left| \left\{ x\in [-T,T],~B_{j_{1}}^{*}(x,f)+U_{n}^{*}(j_{1})>2^{m}\delta _n\right\} \right| \nonumber \\&\le \left| \left\{ x\in [-T,T],~B_{j_{1}}^{*}(x,f)>2^{m-1}\delta _n\right\} \right| . \end{aligned}$$
(3.15)

When \(1\le r<\infty \), by using Chebyshev’s inequality, (2.2), (3.15) and \(f\in B_{r,q}^s(M)\),

$$\begin{aligned} |\Omega _m|&\le \left| \left\{ x\in [-T,T],~B_{j_{1}}^{*}(x,f)>2^{m-1}\delta _n\right\} \right| \nonumber \\&\le \sum _{j\in \mathcal {H},j\ge j_{1}}\left| \left\{ x\in [-T,T],~B_{j}(x,f)>2^{m-1}\delta _n\right\} \right| \nonumber \\&\le \sum _{j\in \mathcal {H},j\ge j_{1}}\frac{\Vert B_{j}(\cdot ,f)\Vert _r^r}{\left( 2^{m-1}\delta _n\right) ^r} \lesssim 2^{-mr}\delta _n^{-r}2^{-j_{1}sr}. \end{aligned}$$
(3.16)

Substituting (3.16) into (3.14), one obtains that

$$\begin{aligned} J_m\lesssim (2^{m+1}\delta _n)^{p}2^{-mr}\delta _n^{-r}2^{-j_{1}sr}\lesssim 2^{m(p-r)}\delta _n^{p-r}2^{-{j_1}sr}\lesssim 2^{m\left( p-r-\frac{2sr}{2\beta +1}\right) }\delta _n^{p} \end{aligned}$$

due to \(2^{j_{1}}\thicksim 2^{\frac{2m}{2\beta +1}}\delta _{n}^{-\frac{1}{s}}\).

For the case \(r=\infty \), it follows from \(f\in B_{r,q}^{s}(M)\) and \(m>0\) that

$$\begin{aligned} B_{j_{1}}^{*}(x,f)= \sup _{j'\ge j_1}B_{j'}(x,f)\le M 2^{-j_{1}s}\le Mc_1^{-s}2^{-\frac{2ms}{2\beta +1}}\delta _n \le 2^{m-1}\delta _n \end{aligned}$$

thanks to the choice \(2^{j_{1}}\ge c_12^{\frac{2m}{2\beta +1}}\delta _{n}^{-\frac{1}{s}}\) with \(c_1>(2M)^{\frac{1}{s}}\). Thus, \(|\Omega _{m}|=0\) because of (3.15), and hence \(J_m\le \left( 2^{m+1}\delta _n\right) ^p|\Omega _{m}|=0\) by (3.14).

Finally, one considers the case of \(s>\frac{1}{r}\) and \(r\le p\). Note that \(B_{r,q}^{s}\hookrightarrow B_{p,q}^{s'}\) with \(s'=s-\frac{1}{r}+\frac{1}{p}\). Similar to (3.16),

$$\begin{aligned} |\Omega _m|&\le \sum _{j\in \mathcal {H},j\ge j_{1}}\left| \left\{ x\in [-T,T],~B_{j}(x,f)>2^{m-1}\delta _n\right\} \right| \\&\le \sum _{j\in \mathcal {H},j\ge j_{1}}\frac{\Vert B_{j}(\cdot ,f)\Vert _p^p}{(2^{m-1}\delta _n)^p} \lesssim 2^{-mp}\delta _n^{-p}2^{-j_{1}s'p}. \end{aligned}$$

This with (3.14) and \(2^{j_{1}}\thicksim 2^{\frac{2m}{2\beta +1}}\delta _{n}^{-\frac{1}{s}}\) implies that

$$\begin{aligned} J_m \le (2^{m+1}\delta _n)^{p}2^{-mp}\delta _n^{-p}2^{-j_{1}s'p} \lesssim 2^{-j_1s'p} \lesssim 2^{-\frac{2ms'p}{2\beta +1}}\delta _n^{\frac{s'}{s}p}. \end{aligned}$$

The proof is done. \(\square \)

4 Main Result

This section is devoted to state and prove our main theorem.

Theorem 4.1

Let \(\varphi \) be m regular and Conditions (T1)–(T2) hold with \(\beta >1\) and \(m>\beta +1\). Then for \(0<s<m\), \(r,q\in [1,\infty ]\) and \(p\in [1,\infty )\), the estimator \(\widehat{f}_{n}\) in (1.9) satisfies

$$\begin{aligned} \sup _{f\in B_{r,q}^{s}(M,T)\cap L^{\infty }(M)}E\Vert \widehat{f}_{n}I_{[-T,T]}-f\Vert _{p}^{p}\lesssim \Big (\frac{\ln n}{n}\Big )^{\theta p}, \end{aligned}$$

where

$$\begin{aligned} \theta :=\left\{ \begin{array}{ll} \frac{s}{2s+2\beta +1}, &{} 1\le p<\frac{2sr}{2\beta +1}+r;\\ \frac{sr}{(2\beta +1)p}, &{} p\ge \frac{2sr}{2\beta +1}+r,~s\le \frac{1}{r};\\ \frac{s-\frac{1}{r}+\frac{1}{p}}{2(s-\frac{1}{r})+2\beta +1}, &{} p\ge \frac{2sr}{2\beta +1}+r,~s>\frac{1}{r}. \end{array} \right. \end{aligned}$$
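
For the reader's convenience, the exponent \(\theta \) can be evaluated directly; the small helper below is a plain transcription of the three cases (the evaluated point is an arbitrary illustration):

```python
def theta(s, r, p, beta):
    # Rate exponent of Theorem 4.1, transcribing the three cases verbatim.
    if p < 2 * s * r / (2 * beta + 1) + r:
        return s / (2 * s + 2 * beta + 1)
    if s <= 1 / r:
        return s * r / ((2 * beta + 1) * p)
    return (s - 1/r + 1/p) / (2 * (s - 1/r) + 2 * beta + 1)

print(theta(s=1.5, r=2, p=4, beta=2))   # third case: 1.25/7 ~ 0.1786
```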

Remark 4.1

Note that \(\theta =\min \left\{ \frac{s}{2s+2\beta +1}, \frac{s-\frac{1}{r}+\frac{1}{p}}{2(s-\frac{1}{r})+2\beta +1}\right\} \) for \(s>\frac{1}{r}\) and \(\beta >1\), which coincides with Theorem 4 of Li and Liu [11]. On the other hand, the case \(0<s\le \frac{1}{r}\) is covered here, whereas no such result exists for the traditional wavelet estimators.

Remark 4.2

When \(p\in \left[ 1,\frac{2sr+(2\beta +1)r}{sr+2\beta +1}\right] \subset \left[ 1,\frac{2sr}{2\beta +1}+r\right] \), the convergence exponent \(\frac{s}{2s+2\beta +1}\) improves upon the exponent \(\frac{s(1-1/p)}{s-(2\beta +1)/r+(2\beta +1)}\) obtained for not necessarily compactly supported density estimation with \(\alpha =d=1\) in Refs. [9, 10]. This is reasonable, since the compact support condition is stricter.

Proof

It follows from Theorem 2.1 that

$$\begin{aligned} \left| \widehat{f}_{n}(x)-f(x)\right| \lesssim U_{f}(x)+v(x) \end{aligned}$$

holds for any \(x\in [-T,T]\). This with Proposition 3.1 leads to

$$\begin{aligned} E\left\| \widehat{f}_{n}I_{[-T,T]}-f\right\| _{p}^{p}&=E\int _{\Omega _{0}^{-}}\left| \widehat{f}_{n}(x)-f(x)\right| ^{p}dx + \sum _{m=0}^{\infty }E\int _{\Omega _{m}}\left| \widehat{f}_{n}(x)-f(x)\right| ^{p}dx\nonumber \\&\lesssim E\int _{\Omega _{0}^{-}}\left| \widehat{f}_{n}(x)-f(x)\right| ^{p}dx + \sum _{m=0}^{m_{2}}E\int _{\Omega _{m}}[U_{f}(x)]^{p}dx+n^{-\frac{p}{2}}\nonumber \\&=J_{0}^{-}+\sum _{m=0}^{m_{2}}J_{m}+n^{-\frac{p}{2}}. \end{aligned}$$
(4.1)

Here, \(J_{0}^{-}\) and \(J_{m}\) can be found in Proposition 3.2.

To complete the proof, one divides (4.1) into three regions. Recall that \(2^{m_{2}}\thicksim \delta _n^{-1}\) and \(\delta _n\thicksim \left( \frac{\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\) by (3.10)–(3.11). According to Proposition 3.2, the following estimates hold.

(i) For \(1\le p<\frac{2sr}{2\beta +1}+r\),

$$\begin{aligned} J_{0}^{-}+\sum _{m=0}^{m_{2}}J_{m}+n^{-\frac{p}{2}} \lesssim \delta _n^{p}+n^{-\frac{p}{2}} \lesssim \Big (\frac{\ln n}{n}\Big )^{\frac{sp}{2s+2\beta +1}}. \end{aligned}$$
(4.2)

(ii) For \(p\ge \frac{2sr}{2\beta +1}+r\),

$$\begin{aligned} J_{0}^{-}+\sum _{m=0}^{m_{2}}J_{m}+n^{-\frac{p}{2}} \lesssim \delta _n^{p}+2^{m_{2}\left( p-r-\frac{2sr}{2\beta +1}\right) }\delta _n^{p}+n^{-\frac{p}{2}} \lesssim \Big (\frac{\ln n}{n}\Big )^{\frac{sr}{2\beta +1}}. \end{aligned}$$
(4.3)

(iii) For the case \(p\ge \frac{2sr}{2\beta +1}+r\) and \(s>\frac{1}{r}\), take \(m_1\in \mathbb {Z}\) satisfying

$$\begin{aligned} 2^{m_{1}}\thicksim \delta _n^{\frac{s'p\left( \frac{1}{s}-\frac{1}{s'}\right) }{\left( \frac{2s'}{2\beta +1}+1\right) p-\frac{2sr}{2\beta +1}-r}}. \end{aligned}$$
(4.4)

Then it follows from \(r<p,~p\ge \frac{2sr}{2\beta +1}+r\) and \(s>\frac{1}{r}\) that \(0<m_1<m_2\). Hence,

$$\begin{aligned} J_{0}^{-}+\sum _{m=0}^{m_{2}}J_{m}+n^{-\frac{p}{2}}&\le J_{0}^{-}+\sum _{m=0}^{m_1}J_{m}+\sum _{m=m_1}^{m_{2}}J_{m}+n^{-\frac{p}{2}}\\&\lesssim \delta _n^{p}+2^{m_{1}\left( p-r-\frac{2sr}{2\beta +1}\right) }\delta _n^{p}+ 2^{-\frac{2m_{1}s'p}{2\beta +1}}\delta _n^{\frac{s'}{s}p}+n^{-\frac{p}{2}}. \end{aligned}$$

Combining this with (4.4), \(\delta _n\thicksim \left( \frac{\ln n}{n}\right) ^{\frac{s}{2s+2\beta +1}}\) and \(s'=s-\frac{1}{r}+\frac{1}{p}\), the above inequality reduces to

$$\begin{aligned} J_{0}^{-}+\sum _{m=0}^{m_{2}}J_{m}+n^{-\frac{p}{2}}\lesssim \Big (\frac{\ln n}{n}\Big )^{\frac{s'p}{2(s-\frac{1}{r})+2\beta +1}}. \end{aligned}$$

This with (4.1)–(4.4) leads to the desired conclusion, which finishes the proof.

\(\square \)

\(\bullet \) Concluding remark

It is worth noting that we assumed \(\beta >1\) in Theorem 4.1. For the case \(\beta \in (0,1]\), the same conclusion of Theorem 4.1 holds if the following additional Condition (T3) is imposed:

$$\begin{aligned} \mathrm{(T3)}~~\left\{ \begin{aligned}&\int _{\mathbb {R}}\frac{|g_{\varepsilon }(x)|}{1+|y+2^{j}x|^{2}}dx\lesssim 2^{j(\beta -2)}|y|^{1-\beta }~(|y|\ge 1)~~ \text {for}~~\beta \in (0,1);\\&\frac{d}{dt}\frac{(f_{\varepsilon }^{ft})'(t)}{[f_{\varepsilon }^{ft}(t)]^{2}}\in L^{1}(\mathbb {R})~~\text {for}~~\beta =1, \end{aligned} \right. \end{aligned}$$

where \(g_{\varepsilon }(x)=\mathcal {F}^{-1} \left\{ \frac{d}{dt}\frac{\left( f_{\varepsilon }^{ft}\right) '(t)}{\left[ f_{\varepsilon }^{ft}(t)\right] ^{2}}\right\} (x)\) and \(\mathcal {F}^{-1}\) denotes the inverse Fourier transform. Although Condition (T3) looks complicated and unnatural, the Gamma distribution provides an example; see Example 4.1 in Ref. [13].

If Condition (T3) is added, the next lemma follows easily.

Lemma 4.1

([13]). Let \(\varphi \) be m regular and Conditions (T1)–(T3) hold with \(0<\beta \le 1\) and \(m>\beta +1\). Then \(K_{j}\varphi \) in (1.2) satisfies that

$$\begin{aligned} \left| \sum _{k\in \mathbb {Z}}(K_{j}\varphi )(x-k)\varphi (y-k)\right| \le M_{0}2^{j\beta }\left( 1+|x-y|^{2}\right) ^{-\frac{\beta +1}{2}}, \end{aligned}$$

where \(M_{0}>0\) is some constant.

Thus, the conclusions of Lemma 3.2 remain valid for \(0<\beta \le 1\), which implies that the conclusion of Theorem 4.1 also holds for \(0<\beta \le 1\) under Conditions (T1)–(T3).