Summary
This paper establishes asymptotic lower bounds which specify, in a variety of contexts, how well (in terms of relative rate of convergence) one may select the bandwidth of a kernel density estimator. These results provide important new insights concerning how the bandwidth selection problem should be considered. In particular, it is shown that if the error criterion is Integrated Squared Error (ISE) then, even under very strong assumptions on the underlying density, relative error of the selected bandwidth cannot be reduced below order n^{-1/10} (as the sample size n grows). This very large error indicates that any technique which aims specifically to minimize ISE will be subject to serious practical difficulties arising from sampling fluctuations. Cross-validation exhibits this very slow convergence rate, and does suffer from unacceptably large sampling variation. On the other hand, if the error criterion is Mean Integrated Squared Error (MISE) then relative error of bandwidth selection can be reduced to order n^{-1/2}, when enough smoothness is assumed. Therefore bandwidth selection techniques which aim to minimize MISE can be much more stable, and less sensitive to small sampling fluctuations, than those which try to minimize ISE. We feel this indicates that performance in minimizing MISE, rather than ISE, should become the benchmark for measuring performance of bandwidth selection methods.
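The contrast between the two criteria can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the construction used in the paper: it assumes standard normal data, a Gaussian kernel, a grid search over bandwidths, and the usual asymptotic (normal-reference) formula for the MISE-optimal bandwidth. The function names `kde`, `ise_optimal_bandwidth` and `simulate` are hypothetical. The point it shows is that the ISE-optimal bandwidth depends on the sample and so varies from one data set to the next, while the MISE-optimal bandwidth is a fixed number for a given n.

```python
import numpy as np


def kde(x_grid, sample, h):
    """Gaussian-kernel density estimate evaluated on x_grid."""
    u = (x_grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))


def std_normal_pdf(x):
    """True density assumed for this illustration."""
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)


def ise_optimal_bandwidth(sample, true_pdf, x_grid, h_grid):
    """Bandwidth on h_grid that minimizes ISE for this particular sample."""
    dx = x_grid[1] - x_grid[0]
    f = true_pdf(x_grid)
    # Riemann-sum approximation to the integral of (f_hat - f)^2.
    ise = [np.sum((kde(x_grid, sample, h) - f) ** 2) * dx for h in h_grid]
    return h_grid[int(np.argmin(ise))]


def simulate(n=100, reps=30, seed=0):
    """Return the fixed asymptotic MISE-optimal bandwidth and the
    sample-dependent ISE-optimal bandwidths over `reps` replications."""
    rng = np.random.default_rng(seed)
    x_grid = np.linspace(-5.0, 5.0, 201)
    # Asymptotic MISE-optimal bandwidth for N(0,1) data and a Gaussian kernel.
    h_amise = (4.0 / (3.0 * n)) ** 0.2
    h_grid = h_amise * np.linspace(0.3, 2.5, 45)
    h_ise = np.array([
        ise_optimal_bandwidth(rng.standard_normal(n), std_normal_pdf, x_grid, h_grid)
        for _ in range(reps)
    ])
    return h_amise, h_ise


if __name__ == "__main__":
    h_amise, h_ise = simulate()
    print("asymptotic MISE-optimal bandwidth:", h_amise)
    print("spread of ISE-optimal bandwidths relative to it:", h_ise.std() / h_amise)
```

The grid search keeps the sketch short; a finite simulation of this kind only illustrates the sampling variability of the ISE target and says nothing about the asymptotic rates themselves.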
Additional information
Research partially supported by National Science Foundation Grants DMS-8701201 and DMS-8902973.
Research of the first author was done while on leave from the Australian National University.
About this article
Cite this article
Hall, P., Marron, J.S. Lower bounds for bandwidth selection in density estimation. Probab. Th. Rel. Fields 90, 149–173 (1991). https://doi.org/10.1007/BF01192160
DOI: https://doi.org/10.1007/BF01192160