Abstract
Discriminant measures of classification performance play a critical role in guiding the design of classifiers. Assessment methods and evaluation measures are at least as important as the algorithm itself and are the first key stage of successful data mining. We systematically summarize the evaluation measures for classification on imbalanced data sets (IDS). Several types of measures, such as commonly used performance evaluation measures and measures for visualizing classifier performance, are analyzed and compared. The problems these measures have with IDS may lead to misunderstanding of classification results and even to wrong strategic decisions. In addition, a series of more complex numerical evaluation measures that can also be used to evaluate classification performance on IDS are investigated.
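The following sketch is not taken from the chapter. It is a minimal illustration, under assumed data, of how a common measure can mislead on an imbalanced data set. The 990-to-10 class split, the trivial "always negative" classifier, and the helper names `confusion_counts` and `summarize` are all hypothetical. The geometric mean of sensitivity and specificity (G-mean) and Youden's index (sensitivity + specificity - 1) are used here in their standard textbook forms as examples of measures that account for both classes.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive=1):
    """Return accuracy alongside measures that account for both classes."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity: fraction of the positive (minority) class that is detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of the negative (majority) class that is detected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    g_mean = (sensitivity * specificity) ** 0.5
    youden = sensitivity + specificity - 1.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
        "youden": youden,
    }


# Hypothetical imbalanced data: 990 negatives, 10 positives.
y_true = [0] * 990 + [1] * 10
# Trivial classifier that always predicts the majority class.
y_pred = [0] * 1000

scores = summarize(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this assumed setting the accuracy is high even though no minority-class example is detected, while G-mean and Youden's index both fall to zero.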
An Erratum for this chapter can be found at http://dx.doi.org/10.1007/978-3-642-04962-0_55
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gu, Q., Zhu, L., Cai, Z. (2009). Evaluation Measures of the Classification Performance of Imbalanced Data Sets. In: Cai, Z., Li, Z., Kang, Z., Liu, Y. (eds) Computational Intelligence and Intelligent Systems. ISICA 2009. Communications in Computer and Information Science, vol 51. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04962-0_53
DOI: https://doi.org/10.1007/978-3-642-04962-0_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04961-3
Online ISBN: 978-3-642-04962-0
eBook Packages: Computer Science, Computer Science (R0)