Abstract
Kernel density estimation is a commonly used approach to classification. However, most of the theoretical results for kernel methods apply to estimation per se and not necessarily to classification. In this paper we show that, when estimating the difference between two densities, the optimal smoothing parameters are increasing functions of the sample size of the complementary group, and we provide a small simulation study which examines the relative performance of kernel density methods when the final goal is classification.
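The classification rule discussed here can be sketched concretely: estimate each class density with a kernel estimator and assign the class whose prior-weighted density is larger, i.e. classify by the sign of the estimated density difference. The following is a minimal illustrative sketch (function names, bandwidths, and the simulated data are our own assumptions, not the paper's):

```python
import numpy as np

def gaussian_kde(x, data, h):
    """Gaussian kernel density estimate at the points x from the sample `data`."""
    u = (x[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def classify(x, sample0, sample1, h0, h1):
    """Assign class 1 where the prior-weighted density difference is positive."""
    n0, n1 = len(sample0), len(sample1)
    p0, p1 = n0 / (n0 + n1), n1 / (n0 + n1)
    diff = p1 * gaussian_kde(x, sample1, h1) - p0 * gaussian_kde(x, sample0, h0)
    return (diff > 0).astype(int)

# Two well-separated Gaussian classes as a toy example.
rng = np.random.default_rng(0)
sample0 = rng.normal(-1.0, 1.0, 200)
sample1 = rng.normal(+1.0, 1.0, 200)
labels = classify(np.array([-2.0, 2.0]), sample0, sample1, h0=0.4, h1=0.4)
```

Note that each class has its own bandwidth (`h0`, `h1`); the abstract's point is that the bandwidth optimal for classifying is not the one optimal for estimating each density in isolation, since it depends on the complementary group's sample size as well.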
A relative newcomer to the classification portfolio is “boosting”, and this paper proposes an algorithm for boosting kernel density classifiers. We note that boosting is closely linked to a previously proposed method of bias reduction in kernel density estimation and indicate how it will enjoy similar properties for classification. We show that boosting kernel classifiers reduces the bias whilst only slightly increasing the variance, with an overall reduction in error. Numerical examples and simulations are used to illustrate the findings, and we also suggest further areas of research.
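To make the boosting idea concrete, the loop below shows a generic AdaBoost-style scheme with a weighted kernel density difference rule as the weak learner. This is an illustrative stand-in under our own assumptions (uniform bandwidth, Gaussian kernel, three rounds), not the paper's exact algorithm:

```python
import numpy as np

def weighted_kde(x, data, w, h):
    """Weighted Gaussian KDE: each observation contributes in proportion to w."""
    u = (x[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u**2) / (h * np.sqrt(2 * np.pi))
    return (k * w[None, :]).sum(axis=1) / w.sum()

def boost_kde_classifier(X, y, h=0.5, rounds=3):
    """Boost a weighted-KDE difference rule by reweighting misclassified points."""
    n = len(X)
    w = np.full(n, 1.0 / n)
    learners = []
    for _ in range(rounds):
        def weak(x, w=w.copy()):
            d1 = weighted_kde(x, X[y == 1], w[y == 1], h)
            d0 = weighted_kde(x, X[y == 0], w[y == 0], h)
            return np.where(d1 > d0, 1, 0)
        pred = weak(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        learners.append((alpha, weak))
        # Up-weight misclassified points, down-weight the rest, then renormalise.
        w = w * np.exp(alpha * np.where(pred != y, 1.0, -1.0))
        w /= w.sum()
    def ensemble(x):
        score = sum(a * (2 * f(x) - 1) for a, f in learners)
        return (score > 0).astype(int)
    return ensemble
```

Each round concentrates kernel mass near points the current rule misclassifies, which is where the connection to multiplicative bias correction in density estimation arises: reweighting the kernels reshapes the estimate in exactly the regions where the leading bias term distorts the decision boundary.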
Di Marzio, M., Taylor, C.C. Kernel density classification and boosting: an L2 analysis. Stat Comput 15, 113–123 (2005). https://doi.org/10.1007/s11222-005-6203-8