1 Introduction

Given a random sample of observations \(X_{1},\ldots ,X_{n}\) with unknown probability mass function (pmf) f supported on a discrete set \(\mathbb {T}\) (e.g., \(\mathbb {N}\), \(\mathbb {Z}\) or \(\mathbb {T}=\{0,1,\ldots ,N\}\)), the discrete kernel estimator \(\hat{f}_{h}(x)\) of \(f(x)=\Pr (X_{i}=x)\) is defined as follows [see, for example, Kokonendji and Senga Kiessé (2011)]:

$$\begin{aligned} \hat{f}_{h}(x)=\frac{1}{n}\sum \limits _{i=1}^{n}K_{x,h}(X_{i}), \end{aligned}$$

where \(h=h(n)>0\) is a bandwidth (or smoothing parameter) and \(K_{x,h}\) is a discrete kernel, assumed to be a suitable pmf with support \(\mathbb {S}_{x}\) not depending on h; see, e.g., Kokonendji et al. (2007) and Kokonendji and Senga Kiessé (2011). Naturally, discrete kernels are more appropriate than continuous kernels for estimating a discrete function; see again Kokonendji et al. (2007) and Kokonendji and Senga Kiessé (2011). See also Aitchison and Aitken (1976) for categorical data and finite discrete distributions, and Wang and Van Ryzin (1980) for ordered discrete variables.

In view of the fact that the bias of \(\hat{f}_{h}\) is O(h) as \(h\rightarrow 0\), this paper considers improvements in discrete kernel estimation that reduce the order of magnitude of the bias to \(O(h^{2})\), while maintaining the order of magnitude of the variance. In the case of symmetric kernels, this kind of rate improvement can typically be achieved by employing higher-order kernels; see Jones and Foster (1993) for methods of generating higher-order kernels from a given second-order kernel. To the best of our knowledge, equivalent techniques are yet to be proposed for discrete kernels. Instead, this paper applies two classes of multiplicative bias correction (MBC) techniques to attain the rate improvement. The MBC approaches have been proposed and extensively studied by several authors in symmetric and asymmetric kernel density estimation (the continuous case); see, e.g., Terrell and Scott (1980), Jones et al. (1995), Hirukawa (2010), Hirukawa and Sakudo (2014), Hirukawa and Sakudo (2015), Zougab and Adjabi (2015) and Funke and Kawka (2015). The first class of MBC methods considered here constructs a multiplicative combination of two density estimators using different smoothing parameters; this idea was originally proposed by Terrell and Scott (1980) as an additive bias correction to the logarithm of densities. The second class of MBC, in the spirit of Jones et al. (1995), is based on the idea of writing \(f(x)=\hat{f}(x)\left\{ f(x)/\hat{f}(x)\right\} \) and estimating the bias-correction term \(f(x)/\hat{f}(x)\) nonparametrically. When applied to discrete kernel estimation, both MBC techniques still yield estimators that are free of boundary bias. In addition, these estimators have a practically appealing property: like \(\hat{f}_{h}\), they always generate nonnegative density estimates everywhere by construction.

This paper is organized as follows. Section 2 briefly recalls discrete kernels for pmf estimation. In Sect. 3 we first introduce the MBC discrete kernel estimators; second, we develop asymptotic properties such as the bias and variance of the newly proposed estimators; third, we adapt the unbiased cross-validation (UCV) procedure for choosing the bandwidth. Section 4 conducts Monte Carlo simulations to compare the finite sample performance of the standard discrete kernel estimators and the proposed MBC discrete kernel estimators. Section 5 provides applications to real data. Section 6 concludes the paper. All proofs are given in the “Appendix”.

2 Discrete kernel estimator

Given a random sample \(X_{1},\ldots ,X_{n}\) with unknown probability mass function (pmf) f supported on a discrete set \(\mathbb {T}\) (\(\mathbb {N}\), \(\mathbb {Z}\) or \(\mathbb {T}=\{0,1,\ldots ,N\}\)), the discrete kernel estimator of \(f(x)=\Pr (X_{i}=x)\) using kernel \(L\in \{\mathrm{DT},\mathrm{DDU},\mathrm{WVR},\mathrm{LR}\}\) can be expressed as

$$\begin{aligned} \widehat{f}_{L}(x)=\frac{1}{n}\sum _{i=1}^{n}K_{L(x,h)}(X_{i}), \end{aligned}$$
(1)

where \(x\in \mathbb {T}\) is the target (the point where the pmf is estimated), \(h>0\) is a bandwidth (or smoothing parameter), and the explicit forms of the kernels are listed in Table 1. The asymptotic properties of the estimator (1) are studied in detail in Kokonendji and Senga Kiessé (2011). The asymptotic bias as \(h\rightarrow 0\) is given by

$$\begin{aligned} \mathrm{bias}(\widehat{f}_{L}(x))=q(x,f)h+o(h), \end{aligned}$$

where the explicit forms of q(x, f) for each specific kernel L are given in Table 2.

Table 1 Univariate discrete kernels
Table 2 Explicit forms of q(x, f)

Similarly, when \(n\rightarrow \infty \) and \(h\rightarrow 0\), the asymptotic variance is

$$\begin{aligned} \mathrm{Var}(\widehat{f}_{L}(x))=\frac{1}{n}f(x)\{1-f(x)\}K^{2}_{L(x,h)} (x)+o\left( \frac{1}{n}\right) . \end{aligned}$$

The mean integrated squared error (\(\mathrm{MISE}\)) is given in Kokonendji and Senga Kiessé (2011) and Kokonendji et al. (2007) and is expressed as

$$\begin{aligned} \mathrm{MISE}(\widehat{f}_{L})= & {} \sum \limits _{x\in \mathbb {T}}\mathrm{bias}^{2}(\widehat{f}_{L}(x))+ \sum \limits _{x\in \mathbb {T}}\mathrm{Var}(\widehat{f}_{L}(x)) \nonumber \\= & {} h^{2}\sum \limits _{x\in \mathbb {T}}q^{2}(x,f)+\frac{1}{n}\sum \limits _{x\in \mathbb {T}}f(x)\{1-f(x)\}K^{2}_{L(x,h)}(x)+ o\left( h^{2}+\frac{1}{n}\right) . \end{aligned}$$
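For concreteness, the following minimal sketch shows one way the classical estimator (1) could be implemented. It is an illustration under our own naming conventions: the DT kernel is written in its discrete triangular form with arm a and the DDU kernel in its Aitchison–Aitken (Dirac discrete uniform) form, as we read them from Kokonendji and Senga Kiessé (2011); this is not the authors' code.

```python
# Illustrative sketch of estimator (1); the kernel forms below are our
# reading of Kokonendji and Senga Kiesse (2011), and all names are ours.
import numpy as np

def dt_kernel(x, y, h, a=2):
    """Discrete triangular kernel K_{DT(x,h)}(y) with arm a."""
    k = np.abs(np.asarray(y, dtype=float) - x)
    # normalizing constant: sum of (a+1)^h - |u|^h over u = -a, ..., a
    norm = (2 * a + 1) * (a + 1) ** h - 2 * sum(j ** h for j in range(1, a + 1))
    return np.where(k <= a, ((a + 1) ** h - k ** h) / norm, 0.0)

def ddu_kernel(x, y, h, n_cat=4):
    """Dirac discrete uniform kernel: mass 1 - h at y = x, h/(n_cat - 1) elsewhere."""
    return np.where(np.asarray(y) == x, 1.0 - h, h / (n_cat - 1))

def f_hat(x, sample, h, kernel=dt_kernel):
    """Classical discrete kernel estimator (1) at the target x."""
    return float(np.mean(kernel(x, sample, h)))
```

The number of categories n_cat has to be supplied according to the data at hand; the default value here is purely illustrative.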

In analogy to kernel density estimation, the choice of a suitable bandwidth is also a crucial issue in the discrete kernel method, and several approaches have been proposed in the literature. The common methods for continuous kernel estimators that adopt the mean integrated squared error (MISE) as a criterion, as well as cross-validation (CV) techniques, have also been developed for discrete kernel estimation; see, e.g., Kokonendji et al. (2007), Kokonendji and Senga Kiessé (2011) and Chu et al. (2015).

3 MBC for discrete kernel estimators

In this section, we adapt the aforementioned MBC methods to the estimation of probability mass functions, where the special kernels used for pmf estimation are called discrete kernels and have support on some discrete set such as \(\mathbb {N}\), \(\mathbb {Z}\) or a finite set of integers. Our proposed approaches have the same intuition as in the continuous case: each of the MBC methods is shown to improve the bias convergence of univariate pmf estimators from O(h) to \(O(h^{2})\) while the order of their variance remains unchanged at \(O(n^{-1})\). Globally, the proof strategies of each MBC method in pmf estimation largely follow those of the corresponding method originally developed for kernel density estimation of scalar continuous random variables; see, e.g., Terrell and Scott (1980), Jones et al. (1995), Hirukawa (2010), Hirukawa and Sakudo (2014), Zougab and Adjabi (2015) and Funke and Kawka (2015).

3.1 Estimators

We follow the idea of the geometric estimator of Terrell and Scott (1980) and Hirukawa (2010), abbreviated as “TS”, which can be readily extended to the discrete kernel L in the context of probability mass function estimation. For a given kernel L, let \(\hat{f}_{L,h}(x)\) and \(\hat{f}_{L,h/c}(x)\) be the pmf estimators using smoothing parameters h and h/c, respectively, where \(c\in (0,1)\) is some predetermined constant that does not depend on the design point x. Then, the TS-MBC kernel pmf estimator can be adapted as follows:

$$\begin{aligned} \tilde{f}_{TS,L}(x)=\left\{ \hat{f}_{L,h}(x)\right\} ^{\frac{1}{1-c}} \left\{ \hat{f}_{L,h/c}(x)\right\} ^{-\frac{c}{1-c}}. \end{aligned}$$
(2)

The second MBC technique for symmetric kernel density estimators is attributed to Jones et al. (1995) [see also Hirukawa (2010), Hirukawa and Sakudo (2014), Zougab and Adjabi (2015) and Funke and Kawka (2015) for asymmetric kernels], is abbreviated as “JLN”, and utilizes a single smoothing parameter h. The JLN technique proposed by Jones et al. (1995) is based on the identity \(f(x)=\hat{f}_{L,h}(x)\left\{ f(x)/\hat{f}_{L,h}(x)\right\} \). In analogy to their estimators, using the discrete kernel L, we denote by \(\tilde{f}_{JLN,L}(x)\) the following estimator:

$$\begin{aligned} \tilde{f}_{JLN,L}(x)=\hat{f}_{L,h}(x)\left\{ \frac{1}{n}\sum _{i=1}^{n} \frac{K_{L(x,h)}(X_{i})}{\hat{f}_{L,h}(X_{i})}\right\} , \end{aligned}$$
(3)

where \(K_{L(x,h)}\) is the kernel L. Note that the term inside the braces is a natural estimator of the bias-correction term \(f(x)/\hat{f}_{L,h}(x)\). Also, by construction, both \(\tilde{f}_{TS,L}(x)\) and \(\tilde{f}_{JLN,L}(x)\) always generate nonnegative probability mass function estimates everywhere.
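To make the constructions (2) and (3) concrete, the sketch below implements both MBC estimators on top of a generic classical estimator f_hat(x, sample, h), such as the one sketched in Sect. 2. This is a hedged reading of the displayed formulas, not the authors' implementation.

```python
import numpy as np

def ts_mbc(x, sample, h, c, f_hat):
    """TS-MBC estimator (2): geometric combination of fits at bandwidths h and h/c."""
    return f_hat(x, sample, h) ** (1.0 / (1.0 - c)) \
        * f_hat(x, sample, h / c) ** (-c / (1.0 - c))

def jln_mbc(x, sample, h, kernel, f_hat):
    """JLN-MBC estimator (3): classical fit times the estimated correction f/f_hat."""
    correction = np.mean(
        kernel(x, sample, h)
        / np.array([f_hat(xi, sample, h) for xi in sample])
    )
    return f_hat(x, sample, h) * correction
```

Both functions return nonnegative values by construction, in line with the remark above.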

3.2 Asymptotic properties

The asymptotic bias and variance of the MBC estimators are presented in the following theorems. We assume that

A1.

    The derivatives of f at each point \(x\in \mathbb {N}\) are replaced by the finite differences [see Kokonendji and Senga Kiessé (2011); a small computational sketch follows the assumptions]:

    $$\begin{aligned} f^{(j)}(x)=\{f^{(j-1)}(x)\}^{(1)}, \end{aligned}$$

    where

    $$\begin{aligned} f^{(1)}(x)=\left\{ \begin{array}{ll} \{f(x+1)-f(x-1)\}/2 &{} \quad \hbox {if}\quad x \in \mathbb {N} \setminus \{0\}; \\ f(1)-f(0) &{}\quad \hbox {if}\quad x=0. \end{array} \right. \end{aligned}$$
A2.

    The smoothing parameter \(h=h(n)\) satisfies \(h\rightarrow 0\) as \(n\rightarrow \infty \).
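As a small illustration of A1, the finite differences can be evaluated recursively; the sketch below uses our own naming.

```python
def finite_diff(f, x, j=1):
    """j-th finite-difference derivative f^{(j)} on N, as defined in A1."""
    if j == 0:
        return f(x)
    g = lambda t: finite_diff(f, t, j - 1)  # g plays the role of f^{(j-1)}
    return g(1) - g(0) if x == 0 else (g(x + 1) - g(x - 1)) / 2.0
```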

Theorem 1

Let \(\tilde{f}_{TS,L}\) be the TS-MBC estimator using kernel L defined by (2). For a given target x and under assumptions A1 and A2, the following holds:

\((i)\) The bias of the TS-MBC discrete kernel estimator admits the following approximation:

$$\begin{aligned} \mathrm{bias}(\tilde{f}_{TS,L}(x))=\frac{1}{c}\left[ \frac{1}{2} \left\{ \frac{l_{1}^{2}(x,f)}{f(x)}-l_{2}(x,f)\right\} \right] h^{2}+ o(h^{2}), \end{aligned}$$

where the explicit forms of \(l_{1}(x,f)\) and \(l_{2}(x,f)\) are given in Tables 3 and 4, respectively. \((ii)\) The variance of the TS-MBC estimator with kernel L admits the following expansion:

$$\begin{aligned} \mathrm{Var}(\tilde{f}_{TS,L}(x))=\frac{f(x)(1-f(x))}{n(1-c)^{2}}\left( K_{L,h} (x)-cK_{L,h/c}(x)\right) ^{2}+o\left( \frac{1}{n}\right) . \end{aligned}$$
Table 3 Explicit forms of \(l_{1}(x,f)\)
Table 4 Explicit forms of \(l_{2}(x,f)\)

Proof

The proof is given in the “Appendix”.

Theorem 2

Let \(\tilde{f}_{JLN,L}\) be the JLN-MBC estimator with kernel L defined by (3). For a given target x and under assumptions A1 and A2, the following holds:

\((i)\) The bias of the JLN-MBC discrete kernel estimator admits the following approximation:

$$\begin{aligned} \mathrm{bias}(\tilde{f}_{JLN,L}(x))=-f(x)l_{1}(x,g)h^{2}+o(h^{2}), \end{aligned}$$

where \(l_{1}(x,g)\) is obtained by replacing f by g in \(l_{1}(x,f)\) with \(g=g(x,f)=l_{1}(x,f)/f(x)\).

\((ii)\) The variance of the JLN-MBC discrete kernel estimator admits the following expression:

$$\begin{aligned} \mathrm{Var}(\tilde{f}_{JLN,L}(x))=\frac{f(x)(1-f(x))}{n}K^{2}_{L,h}(x) +o\left( \frac{1}{n}\right) . \end{aligned}$$

Proof

The proof is given in the “Appendix”. \(\square \)

3.3 Global property

As a global criterion we use the mean integrated squared error (MISE), defined as

$$\begin{aligned} \mathrm{MISE}(\tilde{f}_{MBC,L})= & {} \sum \limits _{x\in \mathbb {T}}\mathrm{bias}^{2}(\tilde{f}_{MBC,L}(x))+ \sum \limits _{x\in \mathbb {T}}\mathrm{Var}(\tilde{f}_{MBC,L}(x)), \end{aligned}$$

where \(\tilde{f}_{MBC,L}\) denotes either the TS-L or the JLN-L kernel estimator. The MISE of the TS-L kernel estimator defined in (2) is given by

$$\begin{aligned} \mathrm{MISE}(\tilde{f}_{TS,L})= & {} \frac{h^{4}}{c^{2}}\sum \limits _{x\in \mathbb {T}}\left[ \frac{1}{2}\left\{ \frac{l_{1}^{2}(x,f)}{f(x)}-l_{2}(x,f)\right\} \right] ^{2}\\&+ \sum \limits _{x\in \mathbb {T}}\frac{f(x)(1-f(x))}{n(1-c)^{2}}\big (K_{L,h}(x)-cK_{L,h/c}(x)\big )^{2}+ o\left( h^{4}+\frac{1}{n}\right) . \end{aligned}$$

Similarly, the MISE of the JLN-L kernel estimator defined in (3) is expressed as

$$\begin{aligned} \mathrm{MISE}(\tilde{f}_{JLN,L})= & {} h^{4}\sum \limits _{x\in \mathbb {T}}f(x)^{2}l_{1}^{2}(x,g)+\frac{1}{n} \sum \limits _{x\in \mathbb {T}}f(x)\{1-f(x)\}K^{2}_{L(x,h)}(x)\\&+\, o\left( h^{4}+\frac{1}{n}\right) . \end{aligned}$$

Remark 1

These global results transfer directly to the pointwise bias and variance at a given point x. As we have seen, the bias is uniformly of order \(O(h^{2})\) over the whole support. Moreover, the variance exhibits the following order:

$$\begin{aligned} \mathrm{Var}(\tilde{f}_{TS,L}(x))=\mathrm{Var}(\tilde{f}_{JLN,L}(x))=\mathrm{Var}(\hat{f}_{L}(x))=O(n^{-1}). \end{aligned}$$

The two theorems demonstrate that both the TS and JLN estimators are free of boundary bias. More importantly, these two MBC estimators reduce the order of magnitude of the bias from O(h) to \(O(h^{2})\), while their variances remain of order \(O(n^{-1})\). The variance of the JLN estimator is first-order asymptotically equivalent to that of the corresponding classical estimator. In contrast, since the variance of the TS-MBC estimators depends on \(c\in (0,1)\), it tends to be larger than that of the classical estimator; see, e.g., Hirukawa (2010) and Hirukawa and Sakudo (2014) for more details in the continuous case.

3.4 Normalization

Neither \(\tilde{f}_{TS,L}(x)\) nor \(\tilde{f}_{JLN,L}(x)\) sums to one; in general, MBC leads to a lack of normalization. Hirukawa (2010), for example, argues that this issue can be resolved and proposes two renormalized beta MBC kernel density estimators. Taking the structures of \(\tilde{f}_{TS,L}(x)\) and \(\tilde{f}_{JLN,L}(x)\) into account and following Hirukawa (2010), we adopt his macro approach to obtain the renormalized versions of our MBC estimators (a small computational sketch is given at the end of this subsection):

$$\begin{aligned} \tilde{f}^{R}_{TS,L}(x)= & {} \frac{\tilde{f}_{TS,L}(x)}{\sum _{y \in \mathbb {T}}\tilde{f}_{TS,L}(y)},\\ \tilde{f}^{R}_{JLN,L}(x)= & {} \frac{\tilde{f}_{JLN,L}(x)}{\sum _{y \in \mathbb {T}}\tilde{f}_{JLN,L}(y)}. \end{aligned}$$

Since

$$\begin{aligned} \mathbb {E}\left( \sum _{x \in \mathbb {T}}\tilde{f}_{TS,L}(x)\right)= & {} \sum _{x \in \mathbb {T}}\left( \mathbb {E}(\tilde{f}_{TS,L}(x))\right) ,\\= & {} 1+\frac{1}{c}\sum _{x \in \mathbb {T}}\left[ \frac{1}{2}\left\{ \frac{l_{1}^{2}(x,f)}{f(x)}-l_{2}(x,f)\right\} \right] h^{2}+ o(h^{2}) \end{aligned}$$

and

$$\begin{aligned} \mathbb {E}\left( \sum _{x \in \mathbb {T}}\tilde{f}_{JLN,L}(x)\right)= & {} \sum _{x \in \mathbb {T}}\left( \mathbb {E}(\tilde{f}_{JLN,L}(x))\right) ,\\= & {} 1-\sum _{x \in \mathbb {T}}f(x)l_{1}(x,g)h^{2}+o(h^{2}), \end{aligned}$$

the biases of \(\tilde{f}^{R}_{TS,L}(x)\) and \(\tilde{f}^{R}_{JLN,L}(x)\) can be approximated by

$$\begin{aligned}&\mathrm{bias}\left( \tilde{f}^{R}_{TS,L}(x)\right) \sim \frac{1}{c}\left[ \frac{1}{2}\left\{ \frac{l_{1}^{2}(x,f)}{f(x)}-l_{2}(x,f)\right\} -\sum _{y \in \mathbb {T} }\frac{1}{2}\left\{ \frac{l_{1}^{2}(y,f)}{f(y)}-l_{2}(y,f)\right\} \right] h^{2},\\&\mathrm{bias}\left( \tilde{f}^{R}_{JLN,L}(x)\right) \sim \left[ -f(x)l_{1}(x,g)+\sum _{y \in \mathbb {T} }f(y)l_{1}(y,g)\right] h^{2}. \end{aligned}$$
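In practice the macro renormalization is a single division by the total estimated mass over the evaluation grid; a minimal sketch, assuming the support has been truncated to a finite grid when \(\mathbb {T}=\mathbb {N}\):

```python
import numpy as np

def renormalize(estimates):
    """Macro renormalization: divide pointwise MBC estimates by their total mass.

    For countable supports such as N, the estimates are assumed to be computed
    on a finite truncation that carries essentially all of the mass.
    """
    estimates = np.asarray(estimates, dtype=float)
    return estimates / estimates.sum()
```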

3.5 Choice of smoothing parameter for discrete MBC kernel estimators

In this section we adapt the popular unbiased cross-validation (UCV) method. First, we consider the TS kernel estimators based on the kernel L. The optimal smoothing parameter \(h^{opt}_{TS,L}\) is given by

$$\begin{aligned} h^{opt}_{TS,L}=\mathop {\arg \min }\limits _{h}~UCV_{TS,L}(h), \end{aligned}$$

where

$$\begin{aligned} UCV_{TS,L}(h)= & {} \sum _{x\in \mathbb {T}}\tilde{f}^{2}_{TS,L}(x)-\frac{2}{(n-1)}\sum _{i=1}^{n}\tilde{f}^{(-i)}_{TS,L}(X_{i})\\= & {} \sum _{x\in \mathbb {T}}\left\{ \hat{f}_{L,h}(x)\right\} ^{\frac{2}{1-c}}\left\{ \hat{f}_{L,h/c}(x)\right\} ^{-\frac{2c}{1-c}} -\frac{2}{n(n-1)}\\&\times \sum _{i=1}^{n}\left[ \left\{ \sum _{j\ne i}K_{L(X_{i},h)}(X_{j})\right\} ^{\frac{1}{1-c}}\left\{ \sum _{j\ne i}K_{L(X_{i},h/c)}(X_{j})\right\} ^{-\frac{c}{1-c}}\right] . \end{aligned}$$

In the case of the JLN kernel estimators, the UCV criterion takes the following form:

$$\begin{aligned} UCV_{JLN,L}(h)= & {} \sum _{x\in \mathbb {T}}\tilde{f}^{2}_{JLN,L}(x)-\frac{2}{(n-1)}\sum _{i=1}^{n}\tilde{f}^{(-i)}_{JLN,L}(X_{i})\\= & {} \frac{1}{n^{2}}\sum _{x\in \mathbb {T}}\hat{f}^{2}_{L,h}(x)\left\{ \sum _{i=1}^{n}\frac{K_{L(x,h)}(X_{i})}{\hat{f}_{L,h}(X_{i})}\right\} ^{2}\\&- \frac{2}{n(n-1)}\times \sum _{i=1}^{n}\sum _{j\ne i}K_{L(X_{i},h)}(X_{j})\frac{\hat{f}_{L,h}(X_{i})}{\hat{f}_{L,h}(X_{j})} \end{aligned}$$

and the bandwidth \(h^{opt}_{JLN,L}\) is defined as follows:

$$\begin{aligned} h^{opt}_{JLN,L}=\mathop {\arg \min }\limits _{h}~UCV_{JLN,L}(h). \end{aligned}$$
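Operationally, both UCV criteria can be minimized by a grid search. The sketch below follows the first line of each displayed criterion, obtaining the leave-one-out term by refitting the estimator without \(X_{i}\); it is illustrative only and assumes a generic estimator(x, data, h), such as ts_mbc or jln_mbc above with their extra arguments fixed.

```python
import numpy as np

def ucv(h, sample, support, estimator):
    """UCV criterion for a generic pmf estimator over a finite support grid."""
    n = len(sample)
    fit = sum(estimator(x, sample, h) ** 2 for x in support)
    loo = sum(estimator(sample[i], np.delete(sample, i), h) for i in range(n))
    return fit - 2.0 * loo / (n - 1)

def h_ucv(sample, support, estimator, h_grid):
    """Grid-search minimizer of the UCV criterion."""
    scores = [ucv(h, sample, support, estimator) for h in h_grid]
    return h_grid[int(np.argmin(scores))]
```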

4 Illustrations from simulated data

This section investigates the performance of the TS-DDU, TS-DT, JLN-DDU and JLN-DT kernel estimators considered in the previous section and compares it with that of the standard DDU and DT kernel estimators. Note that for the DT kernel we used the arm \(a=2\); see, for example, Kokonendji and Senga Kiessé (2011). We consider six probability mass functions defined as follows:

(a):

\(\mathbf F _{1}\) a Poisson distribution with parameter \(\lambda =8\):

$$\begin{aligned} f(x)=e^{-8} \frac{8^{x}}{x!},~~~x\in \mathbb {N}. \end{aligned}$$
(b):

\(\mathbf F _{2}\) a mixture of three Poisson distributions with parameters \(\mu _{1}=3\), \(\mu _{2}=12\) and \(\mu _{3}=24\):

$$\begin{aligned} f(x)=\frac{1}{3}e^{-3} \frac{3^{x}}{x!}+ \frac{1}{3}e^{-12} \frac{12^{x}}{x!}+\frac{1}{3}e^{-24} \frac{24^{x}}{x!},~~~x\in \mathbb {N} \end{aligned}$$
(c):

\(\mathbf F _{3}\) a Geometric distribution with parameter \(p=0.1\):

$$\begin{aligned} f(x)= 0.1\cdot (0.9)^{x},~~~x\in \mathbb {N}. \end{aligned}$$
(d):

\(\mathbf F _{4}\) a mixture of Poisson and Geometric distributions with parameters \(\mu =10\) and \(p=0.1\):

$$\begin{aligned} f(x)=\frac{2}{5}\cdot e^{-10} \frac{10^{x}}{x!}+ \frac{3}{5}\cdot 0.1\cdot (0.9)^{x},~~~x\in \mathbb {N}. \end{aligned}$$
(e):

\(\mathbf F _{5}\) a negative binomial distribution with parameters \(n_{1}=20\) and \(p=2/3\):

$$\begin{aligned} f(x)=\frac{(19+x)!}{x!\,19!}\left( \frac{2}{3}\right) ^{20}\left( \frac{1}{3}\right) ^{x},~~~x\in \mathbb {N}. \end{aligned}$$
(f):

\(\mathbf F _{6}\) a binomial distribution with parameters \(n_{1}=5\) and \(p=0.1\):

$$\begin{aligned} f(x)=\frac{5!}{x!(5-x)!}0.1^{x}\cdot 0.9^{5-x},~~~x\in \{0,1,\ldots ,5\}. \end{aligned}$$
Table 5 Empirical mean \(\mathrm{ISE}\) values based on 500 replications for the previously considered pmfs
Table 6 Empirical \(\mathrm{ISB}\) values based on 500 replications for the previously considered pmfs
Table 7 Empirical IV values based on 500 replications for the previously considered pmfs

For each of these pmfs, 500 replications of sizes \(n=20, 50, 100\) and 200 are generated. The MBC-DDU (TS-DDU and JLN-DDU) and MBC-DT (TS-DT and JLN-DT) discrete kernel estimators are applied to estimate the pmfs generated from the Poisson\((\lambda =8)\) distribution, the mixture of three Poisson distributions with \((\mu _{1}=3,\mu _{2}=12, \mu _{3}=24)\), the Geometric\((p=0.1)\) distribution, the mixture of Poisson\((\mu =10)\) and Geometric\((p=0.1)\) distributions, the negative binomial distribution BN\((n_{1}=20,p=2/3)\) and the binomial distribution B\((n_{1}=5,p=0.1)\). Note that, for our simulations, the value of c, which can be chosen in the sense of the mean integrated squared error (MISE) [see, e.g., Hirukawa (2010)], is fixed at \(c=0.5\). We use the standard DDU and DT kernel estimators as benchmarks for the MBC-DDU and MBC-DT kernel estimators. For the choice of the bandwidth, we use the UCV technique presented in the previous section. Finally, the performance of the different standard and MBC estimators is examined via the integrated squared error (ISE) and the integrated squared bias (ISB), given respectively as follows:

$$\begin{aligned} \mathrm{ISE}:=\sum _{x \in \mathbb {T} } \left[ \widehat{f}(x)-f(x)\right] ^{2} \end{aligned}$$

and

$$\begin{aligned} \mathrm{ISB}:=\sum _{x \in \mathbb {T}}\left[ \mathbb {E}\{\widehat{f}(x)\}-f(x)\right] ^{2}. \end{aligned}$$

We also compute the integrated variance IV given by

$$\begin{aligned} \mathrm{IV}:=\sum _{x \in \mathbb {T} } \left[ Var\{\widehat{f}(x)\}\right] . \end{aligned}$$
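For reproducibility, the sketch below outlines how the empirical ISE, ISB and IV reported in Tables 5, 6 and 7 can be approximated over replications; the truncated support, bandwidth and all function names are our illustrative choices, not the authors' settings.

```python
import numpy as np

def mc_criteria(estimator, true_pmf, sampler, support, n, n_rep=500, seed=0):
    """Empirical mean ISE, ISB and IV of a pmf estimator over n_rep replications."""
    rng = np.random.default_rng(seed)
    fits = np.empty((n_rep, len(support)))
    for r in range(n_rep):
        sample = sampler(n, rng)
        fits[r] = [estimator(x, sample) for x in support]
    truth = np.array([true_pmf(x) for x in support])
    ise = np.mean(np.sum((fits - truth) ** 2, axis=1))  # mean ISE
    isb = np.sum((fits.mean(axis=0) - truth) ** 2)      # ISB
    iv = np.sum(fits.var(axis=0))                       # IV
    return ise, isb, iv

# Example for F1 = Poisson(8) on a truncated support (illustrative settings):
# from scipy.stats import poisson
# mc_criteria(lambda x, s: f_hat(x, s, 0.3), lambda x: poisson.pmf(x, 8),
#             lambda n, rng: rng.poisson(8, n), support=range(31), n=100)
```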

Through simulation results (Tables 5, 6, 7), we can observe immediately that:

1.

    for all estimators, the means of ISE and ISB based on 500 replications decrease as the sample size n increases, which indicates that our estimators are consistent;

2.

    in terms of ISE and ISB, the relative performance of the JLN-DDU and TS-DDU kernel estimators is mixed, depending on the distribution. For example, in the case of the binomial distribution, the JLN-DDU kernel estimator generally works better than its competitors in the sense of ISB;

3.

    for all sample sizes, the TS-DDU, JLN-DDU, TS-DT and JLN-DT kernel estimators outperform the standard DDU and DT kernel estimators in terms of \(\mathrm{ISE}\) and \(\mathrm{ISB}\).

Note that the performances of the WVR and LR kernels are similar to those obtained with the DDU kernel; for this reason, and to avoid making the manuscript more cumbersome, we have considered the DDU kernel rather than the WVR or LR kernels in the simulations and empirical illustrations.

The comparison is also illustrated in Figs. 1 and 2, where we have plotted the estimates for sample size \(n = 200\) with the DDU and DT(\(a=2\)) kernels. The solid lines represent the true pmf, the dotted lines the classical (C) estimator with the DDU or DT kernel, the dashed lines the TS-DDU and TS-DT estimators, and the solid gray lines the JLN-DDU and JLN-DT estimators. The plots show that, in general, the MBC-DDU and MBC-DT estimators improve on the standard DDU and DT kernel estimators for all pmfs. The smoothing quality is satisfactory, and the best smoothing quality is obtained by using the MBC-DDU (TS-DDU and JLN-DDU) or the MBC-DT (TS-DT and JLN-DT) kernel estimators.

Fig. 1

The pmf estimation of Binomial data with \(n=200\) using the standard and MBC discrete associated kernel estimators. a DDU kernel. b DT(\(a=2\)) kernel

Fig. 2

The pmf estimation of mixture of Poisson and Geometric distributions data with \(n=200\) using the standard and MBC discrete associated kernel estimators. a DDU kernel. b DT(\(a=2\)) kernel

The TS-MBC estimator depends on two smoothing parameters, h and h/c, which also play a role in determining the boundary region; controlling both h and h/c is a cumbersome task. Because \(0<c<1\), the pmf estimator using h/c tends to be oversmoothed, which is potentially a source of large bias in every TS-MBC estimator. On the other hand, c should not be chosen too small, in order to keep h/c reasonably small.

5 Illustrations from real data

To complement our Monte Carlo simulations, we consider in this section two real data applications. First, we illustrate the performance of the MBC techniques for discrete kernel estimators based on the DDU and DT(\(a=2\)) kernels using the travel mode choice data (between Sydney and Melbourne, Australia) from Greene (2011). This data set consists of \(n=210\) observations and \(m=4\) categories (air, train, bus and car); the relative proportions of air, train, bus and car are 0.28, 0.30, 0.14 and 0.28, respectively. Table 8 provides the summary statistics of these observations. The second real application is related to the development of an insect pest called the spiraling whitefly, observed in the Republic of Congo; see Senga Kiessé and Mizère (2012). This insect damages plants by sucking the sap, decreasing photosynthetic activity and drying up the leaves. Congolese biologists are searching for a suitable model by studying count data characterizing the growth of the spiraling whitefly, such as the longevity of the adult insect (see Table 9).

Table 8 Summary statistics for the travel mode choice data
Table 9 Data of longevity of adult insects observed in days

Now we apply the MBC-DDU and MBC-DT kernel estimators to estimate the pmfs for the real data under consideration. The value of c is fixed at 0.5 for the TS-DDU and TS-DT kernel estimators. The standard DDU and DT kernel estimators are also used for comparison.

In order to measure the performance of all estimators, we simply use the practical integrated squared error given by [see Kokonendji and Senga Kiessé (2011)]:

$$\begin{aligned} \mathrm{ISE^{0}}:=\sum _{x \in \mathbb {N} } \left[ \widehat{f}(x)-f_{0}(x)\right] ^{2}, \end{aligned}$$

where \(f_{0}(x)\) is the empirical (naive) estimator. Categorical variables can be used in nonparametric pmf estimation, but they need to be coded; in our study we use the following coding: 1 = “air”; 2 = “train”; 3 = “bus”; 4 = “car”. The bandwidths for all estimators are chosen by the popular UCV technique. The obtained values of \(h_{ucv}\) and \(\mathrm{ISE^{0}}\) for both applications are given in Tables 10 and 11.

Table 10 Results from bandwidth and \(ISE^{0}\) by discrete and MBC kernels estimators of real data from the travel mode choice (between Sydney and Melbourne, Australia) of \(n = 210\)
Table 11 Results from bandwidth and \(ISE^{0}\) by discrete and MBC kernels estimators of real data from longevity of adult insects observed in days of \(n = 82\)
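The practical criterion \(\mathrm{ISE^{0}}\) compares each fitted pmf with the empirical frequencies; a minimal sketch, with the travel modes coded 1–4 as described above (all names are ours):

```python
import numpy as np

def ise0(support, estimates, sample):
    """Practical ISE of pointwise estimates against the empirical pmf f_0."""
    sample = np.asarray(sample)
    f0 = np.array([np.mean(sample == x) for x in support])
    return float(np.sum((np.asarray(estimates) - f0) ** 2))

# e.g. for the travel mode data coded 1="air", 2="train", 3="bus", 4="car":
# ise0([1, 2, 3, 4], [f_hat(x, modes, h) for x in [1, 2, 3, 4]], modes)
```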

We can see that, in terms of \(\mathrm{ISE^{0}}\), the MBC discrete kernel estimators with UCV bandwidths perform better than the standard kernel estimators in both applications. We have also plotted the estimates obtained by the classical (C) estimator and the MBC (TS and JLN) estimators with the Dirac discrete uniform and the discrete triangular kernels for the second data set on the longevity of adult insects observed in days. From Fig. 3, we observe that the smoothing quality is satisfactory, with the smoothing produced by the JLN estimator being the most suitable.

Fig. 3

The pmf estimation of real data of longevity of adult insects observed in days with \(n=82\) using the standard and MBC discrete associated kernel estimators. a DDU kernel. b DT(\(a=2\)) kernel

6 Conclusion

This paper has proposed two multiplicative bias correction (MBC) techniques for discrete kernels in the context of probability mass function (pmf) estimation. We have shown that these two classes of MBC techniques improve the order of magnitude of the bias from O(h) to \(O(h^{2})\). The performance of the MBC techniques for discrete kernel estimators (the TS-DDU, TS-DT, JLN-DT and JLN-DDU kernel estimators) with unbiased cross-validation (UCV) bandwidth selectors has been investigated through a simulation study and real data applications for count and categorical data. In general, the MBC discrete kernel estimators perform better than the standard discrete kernel estimators in the sense of the integrated squared error (ISE) and the integrated squared bias (ISB).

This paper deals only with the univariate case; an obvious extension is the estimation of multivariate pmfs. We are aware of two recent publications that deal with multivariate (discrete) kernels. Kokonendji and Somé (2015) investigated multivariate kernels for the estimation of the density of continuously distributed random vectors, while discrete multivariate kernels have been studied by Belaid et al. (2016), where a Bayesian bandwidth selection method for those kernels was also proposed.

Hence, let \(X=\{(X_{i1},\ldots ,X_{id}),~i=1,\ldots ,n\}\) be a sample of i.i.d. random vectors of dimension \(d\ge 1\). Following the approach of Belaid et al. (2016), we define the multivariate version of the discrete kernel estimator with diagonal bandwidth matrix \(H=diag(h_1,\ldots ,h_d)\) and kernel L according to

$$\begin{aligned} \hat{f}_L(\mathbf x ):=\frac{1}{n}\sum _{i=1}^n\prod _{j=1}^dK_{L(x_j,h_j)}^{[j]}(X_{ij}), \end{aligned}$$

where \(K_{L(x_j,h_j)}^{[j]}(X_{ij})\) denotes the univariate discrete kernel studied in this paper and

$$\begin{aligned} \mathbf x :=(x_{1},\ldots ,x_{d})\in \mathbb {T}^d:=\times _{j=1}^{d} \mathbb {T}^{[j]}\subseteq \mathbb {Z}^{d} \end{aligned}$$

denotes the target vector. Moreover, \(\mathbb {T}^d\) denotes the support of the underlying pmf f, which has to be estimated at \(\mathbf x \).

In view of our univariate findings and in analogy to the asymmetric-kernel-based approach of Funke and Kawka (2015), we define the multivariate version of the TS estimator as

$$\begin{aligned} \hat{f}_{TS,L}(\mathbf x ):=\left( \hat{f}_{L,H}(\mathbf x )\right) ^{\frac{1}{1-c}} \left( \hat{f}_{L,H/c}(\mathbf x )\right) ^{-\frac{c}{1-c}}, \end{aligned}$$

where \(\hat{f}_{L,H}\) and \(\hat{f}_{L,H/c}\) denote the estimator \(\hat{f}_L\) computed with bandwidth matrices H and H/c, respectively. In an analogous way, the multivariate JLN estimator is defined according to

$$\begin{aligned} \hat{f}_{JLN}(\mathbf x ):=\hat{f}_L(\mathbf x )\frac{1}{n}\sum _{i=1}^n \frac{\prod _{j=1}^dK_{L(x_j,h_j)}^{[j]}(X_{ij})}{\hat{f}_L(\mathbf X _i)}. \end{aligned}$$
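A brief sketch of the product-kernel construction and of the multivariate JLN estimator above, assuming a univariate kernel with signature kernel(x, y, h) as in the earlier sketches (all names are ours):

```python
import numpy as np

def f_hat_mv(x, sample, h, kernel):
    """Multivariate discrete kernel estimator with product kernel and diagonal H."""
    x, sample, h = np.asarray(x), np.asarray(sample), np.asarray(h)
    # weight of observation i: prod_j K^{[j]}_{L(x_j, h_j)}(X_ij)
    w = np.prod([kernel(x[j], sample[:, j], h[j]) for j in range(x.size)], axis=0)
    return float(w.mean())

def jln_mv(x, sample, h, kernel):
    """Multivariate JLN estimator: product-kernel fit times the correction term."""
    x, sample, h = np.asarray(x), np.asarray(sample), np.asarray(h)
    w = np.prod([kernel(x[j], sample[:, j], h[j]) for j in range(x.size)], axis=0)
    denom = np.array([f_hat_mv(xi, sample, h, kernel) for xi in sample])
    return float(w.mean() * np.mean(w / denom))
```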

Under appropriate assumptions, it can be shown that the mean squared errors of both estimators satisfy

$$\begin{aligned} MSE\left( \hat{f}_{TS,L}(\mathbf x )\right) =O\left( h^4+\frac{\prod _{j=1}^d \left( K_{L(x_j,h)}(x_j)-cK_{L(x_j,h/c)}(x_j)\right) ^{2}}{n}\right) \text { as }n\rightarrow \infty , \end{aligned}$$

as well as

$$\begin{aligned} MSE\left( \hat{f}_{JLN}(\mathbf x )\right) =O\left( h^4+\frac{1}{n}\prod _{j=1}^d K_{L(x_j,h)}^2(x_j)\right) \text { as }n\rightarrow \infty , \end{aligned}$$

where, for the sake of simplicity, the bandwidths are taken to be equal, \(h\equiv h_1=\cdots =h_d\). Exact analytical expressions of both bias terms are under investigation and will be covered in a follow-up paper.