Abstract
The paper introduces meta decision trees (MDTs), a novel method for combining multiple models. Instead of giving a prediction, MDT leaves specify which model should be used to obtain a prediction. We present an algorithm for learning MDTs based on the C4.5 algorithm for learning ordinary decision trees (ODTs). An extensive experimental evaluation of the new algorithm is performed on twenty-one data sets, combining models generated by five learning algorithms: two algorithms for learning decision trees, a rule learning algorithm, a nearest neighbor algorithm and a naive Bayes algorithm. In terms of performance, MDTs combine models better than voting and stacking with ODTs. In addition, MDTs are much more concise than ODTs used for stacking and are thus a step towards comprehensible combination of multiple models.
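The abstract's key idea, that an MDT leaf names a base-level model rather than outputting a prediction, can be illustrated with a minimal sketch. The two toy base models, the features, and the hand-built two-leaf tree below are illustrative assumptions for exposition only; they are not the paper's learned trees or its C4.5-based induction algorithm.

```python
# Sketch of the meta decision tree (MDT) idea: the MDT's leaves do not
# output a class label; each leaf selects which base-level model should
# be consulted for the given example. All models and thresholds here are
# hypothetical, chosen only to make the dispatch mechanism concrete.

def model_a(x):
    # Toy base model: predicts class 1 when the first feature is large.
    return 1 if x[0] > 0.5 else 0

def model_b(x):
    # Toy base model: predicts class 1 when the second feature is large.
    return 1 if x[1] > 0.5 else 0

def meta_decision_tree(x):
    """A hand-built MDT: an internal node tests a feature, and each
    leaf returns a base model (a callable), not a prediction."""
    if x[0] > x[1]:
        return model_a  # leaf: trust model_a in this region
    return model_b      # leaf: trust model_b in this region

def mdt_predict(x):
    chosen = meta_decision_tree(x)  # the leaf picks a model...
    return chosen(x)                # ...which then makes the prediction

print(mdt_predict((0.9, 0.1)))  # routed to model_a -> 1
print(mdt_predict((0.1, 0.9)))  # routed to model_b -> 1
```

In contrast with stacking via an ordinary decision tree, whose leaves would directly emit class labels, this structure keeps the combination scheme compact and readable: the tree only partitions the example space into regions where each base model is deemed most reliable.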
© 2000 Springer-Verlag Berlin Heidelberg
Todorovski, L., Džeroski, S. (2000). Combining Multiple Models with Meta Decision Trees. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science(), vol 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7