Abstract
Several cost-sensitive boosting algorithms have been reported as effective methods for dealing with the class imbalance problem. Misclassification costs, which reflect the different levels of importance attached to identifying each class, are integrated into the weight update formula of the AdaBoost algorithm. However, it has been shown that the weight update parameter of AdaBoost is derived so that the training error is reduced most rapidly. This is the most crucial step of AdaBoost in converting a weak learning algorithm into a strong one, yet most reported cost-sensitive boosting algorithms ignore this property. In this paper, we propose three versions of cost-sensitive AdaBoost in which the parameters for sample weight updating are derived. Their ability to identify the small class is then tested, using the F-measure, on four “real world” medical data sets taken from the UCI Machine Learning Database. Our experimental results show that one of the proposed cost-sensitive AdaBoost algorithms achieves the best identification ability on the small class among all reported cost-sensitive boosting algorithms.
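To make the role of the weight update parameter concrete, here is a minimal sketch. It is not one of the three algorithms proposed in the paper, whose formulas the abstract does not give. It assumes a binary weak learner with outputs in {-1, +1}, and the function names and the particular way of placing costs in the update (multiplying each weight by its cost) are illustrative assumptions. In both cases the parameter is chosen to minimise the normaliser R·e^(-α) + W·e^(α), where R and W are the (cost-)weighted masses of correctly and wrongly classified samples, which gives α = ½ ln(R / W).

```python
import math

def step_size(weights, correct, costs=None):
    """Alpha minimising Z(alpha) = R*exp(-alpha) + W*exp(alpha).

    R and W are the cost-weighted masses of correctly and wrongly
    classified samples. With costs=None every cost is 1 and this is
    the standard AdaBoost parameter 0.5 * ln((1 - err) / err).
    The cost placement is an assumption made for illustration.
    """
    costs = costs or [1.0] * len(weights)
    right = sum(c * w for w, ok, c in zip(weights, correct, costs) if ok)
    wrong = sum(c * w for w, ok, c in zip(weights, correct, costs) if not ok)
    if right <= 0.0 or wrong <= 0.0:
        raise ValueError("weak learner must make some, but not all, errors")
    return 0.5 * math.log(right / wrong)

def update_weights(weights, correct, alpha, costs=None):
    """Reweight samples: shrink correct ones, grow wrong ones, renormalise."""
    costs = costs or [1.0] * len(weights)
    raw = [c * w * math.exp(-alpha if ok else alpha)
           for w, ok, c in zip(weights, correct, costs)]
    z = sum(raw)
    return [r / z for r in raw]

# Four samples, uniform weights; the last one (say, the small class) is misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]

alpha_plain = step_size(weights, correct)
new_plain = update_weights(weights, correct, alpha_plain)

# Hypothetical costs: an error on the small-class sample is twice as costly.
costs = [1.0, 1.0, 1.0, 2.0]
alpha_cost = step_size(weights, correct, costs)
new_cost = update_weights(weights, correct, alpha_cost, costs)
```

With the derived parameter, the misclassified samples carry exactly half of the total weight after the update, so the next weak learner cannot do well by repeating the same mistakes. Plugging costs into the update while keeping the plain AdaBoost parameter would lose this property, which is the gap the abstract points to.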
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sun, Y., Wong, A.K.C., Wang, Y. (2005). Parameter Inference of Cost-Sensitive Boosting Algorithms. In: Perner, P., Imiya, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2005. Lecture Notes in Computer Science, vol 3587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11510888_3
DOI: https://doi.org/10.1007/11510888_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26923-6
Online ISBN: 978-3-540-31891-0
eBook Packages: Computer Science (R0)