Abstract
In combining classifiers, it is widely believed that diverse ensembles perform better than non-diverse ones. To test this hypothesis, we study the accuracy and diversity of ensembles obtained by bagging and boosting applied to the nearest mean classifier. In our simulation study we consider two diversity measures: the Q statistic and the disagreement measure. The experiments, carried out on four data sets, show that both the diversity and the accuracy of the ensembles depend on the training sample size. With the exception of very small training sample sizes, both bagging and boosting are more useful when the ensembles consist of diverse classifiers. However, in boosting the relationship between diversity and the efficiency of the ensembles is much stronger than in bagging.
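The abstract refers to two pairwise diversity measures, the Q statistic and the disagreement measure, and to bagging applied to the nearest mean classifier. As a rough illustration only (not the authors' code; all function names, defaults, and the guard for the degenerate Q case are our own assumptions), here is a minimal Python sketch of how these quantities are commonly computed:

```python
# Minimal sketch (not from the paper) of the two pairwise diversity
# measures named in the abstract, plus a bagged nearest mean classifier.
# Function names and defaults are illustrative assumptions.
import numpy as np


def pairwise_diversity(correct_a, correct_b):
    """Yule's Q statistic and the disagreement measure for two classifiers.

    correct_a, correct_b: boolean arrays, True where each classifier
    labels the corresponding test object correctly.
    """
    a = np.asarray(correct_a, dtype=bool)
    b = np.asarray(correct_b, dtype=bool)
    n11 = np.sum(a & b)      # both correct
    n00 = np.sum(~a & ~b)    # both wrong
    n10 = np.sum(a & ~b)     # only the first correct
    n01 = np.sum(~a & b)     # only the second correct
    n = n11 + n00 + n10 + n01

    # Q = 0 for statistically independent classifiers; negative values
    # mean the classifiers tend to err on different objects (more diverse).
    denom = n11 * n00 + n01 * n10
    q = (n11 * n00 - n01 * n10) / denom if denom else 0.0

    # Disagreement: fraction of objects on which exactly one classifier
    # is correct; larger values indicate more diversity.
    disagreement = (n01 + n10) / n
    return q, disagreement


def nearest_mean_fit(X, y):
    """Nearest mean classifier: one centroid per class."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, means


def nearest_mean_predict(model, X):
    classes, means = model
    # Squared Euclidean distance from every object to every class mean.
    d = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]


def bagged_nmc(X, y, n_classifiers=25, seed=0):
    """Bagging: train one nearest mean classifier per bootstrap replicate."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_classifiers):
        idx = rng.integers(0, n, size=n)  # sample n objects with replacement
        models.append(nearest_mean_fit(X[idx], y[idx]))
    return models
```

In this setting, the ensemble decision would typically be taken by majority vote over the `nearest_mean_predict` outputs, and the ensemble's diversity by averaging `pairwise_diversity` over all classifier pairs; the exact combining rules and the boosting variant used in the study should be taken from the paper itself.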
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Skurichina, M., Kuncheva, L.I., Duin, R.P.W. (2002). Bagging and Boosting for the Nearest Mean Classifier: Effects of Sample Size on Diversity and Accuracy. In: Roli, F., Kittler, J. (eds) Multiple Classifier Systems. MCS 2002. Lecture Notes in Computer Science, vol 2364. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45428-4_6
DOI: https://doi.org/10.1007/3-540-45428-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43818-2
Online ISBN: 978-3-540-45428-1