Abstract
The idea behind ensemble methodology is to build a predictive model by integrating multiple models. It is well known that ensemble methods can improve prediction performance. In this chapter we provide an overview of ensemble methods for classification tasks. We present the most important types of ensemble methods, including bagging and boosting, and discuss combining methods and modeling issues such as ensemble diversity and ensemble size.
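To make the idea concrete, the following is a minimal sketch of bagging (bootstrap aggregating) with majority-vote combining. It assumes scikit-learn decision trees as base learners and the iris dataset purely for illustration; the ensemble size of 25 is an arbitrary illustrative choice, not a recommendation from the chapter.

```python
# A minimal bagging sketch: train each base model on a bootstrap replicate
# of the training data, then combine predictions by plurality vote.
# (Illustrative only; not the chapter's own implementation.)
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

n_estimators = 25  # ensemble size is itself a modeling issue (see text)
ensemble = []
for _ in range(n_estimators):
    # Sampling the training set with replacement injects the diversity
    # among ensemble members that bagging relies on.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier(random_state=0).fit(X_train[idx], y_train[idx])
    ensemble.append(tree)

# Combine the members' class-label predictions by plurality vote.
votes = np.stack([m.predict(X_test) for m in ensemble])  # (n_estimators, n_test)
y_pred = np.array([np.bincount(col).argmax() for col in votes.T])
print("bagged-ensemble accuracy:", (y_pred == y_test).mean())
```

Boosting, by contrast, builds the members sequentially rather than independently, reweighting the training examples so that each new model concentrates on the cases its predecessors misclassified.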
Copyright information
© 2005 Springer Science+Business Media, Inc.
Cite this chapter
Rokach, L. (2005). Ensemble Methods for Classifiers. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/0-387-25465-X_45
DOI: https://doi.org/10.1007/0-387-25465-X_45
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-24435-8
Online ISBN: 978-0-387-25465-4