Abstract
This chapter discusses ensembles of classification and regression models, an important area of machine learning. Ensembles have become popular because they tend to achieve higher performance than single models, and they also play an essential role in data-streaming solutions. The chapter begins by introducing ensemble learning and then presents an overview of some of its best-known methods, including bagging, boosting, stacking, cascade generalization, cascading, delegating, arbitrating, and meta-decision trees.
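As a minimal illustration of two of these methods, the sketch below builds a bagging ensemble and a stacking ensemble with scikit-learn. The synthetic dataset, base learners, and hyperparameters are illustrative assumptions, not experiments taken from the chapter.

```python
# Minimal sketch of bagging and stacking (illustrative, not the chapter's setup).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: train each tree on a bootstrap resample of the training set
# and aggregate the individual predictions by voting.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25,
                            random_state=0).fit(X_train, y_train)

# Stacking: a meta-learner (here logistic regression) is trained on the
# base models' cross-validated predictions rather than on the raw inputs.
stacking = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(),
).fit(X_train, y_train)

print("bagging accuracy: ", bagging.score(X_test, y_test))
print("stacking accuracy:", stacking.score(X_test, y_test))
```

Note that the stacking meta-learner is fitted on held-out (cross-validated) base predictions, which avoids the label leakage that would occur if the base models scored the same data they were trained on.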
Rights and permissions
Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
Cite this chapter
Giraud-Carrier, C. (2022). Combining Base-Learners into Ensembles. In: Metalearning. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-030-67024-5_9
Print ISBN: 978-3-030-67023-8
Online ISBN: 978-3-030-67024-5