Abstract
Rules are commonly used for classification because they are modular, intelligible, and easy to learn. Existing work in classification rule learning assumes the goal is to produce categorical classifications that maximize classification accuracy. Recent work in machine learning has pointed out the limitations of classification accuracy: when class distributions are skewed, or error costs are unequal, an accuracy-maximizing classifier can perform poorly. This paper presents a method for learning rules directly from ROC space when the goal is to maximize the area under the ROC curve (AUC). Basic principles from rule learning and computational geometry are used to focus the search for promising rule combinations. The result is a system that can learn intelligible rulelists with good ROC performance.
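The computational-geometry component mentioned in the abstract can be illustrated with the standard ROC convex hull construction: each candidate rule (or rulelist) yields a point (FP rate, TP rate) in ROC space, and only points on the upper convex hull can be optimal under some class distribution and cost ratio, so dominated candidates can be pruned. The following is a minimal sketch of that idea in Python, not PRIE's actual algorithm; the rule points and function names are illustrative:

```python
def _cross(o, a, b):
    """Cross product of vectors o->a and o->b; > 0 indicates a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def roc_hull(points):
    """Upper convex hull of ROC points (fpr, tpr), always including the
    trivial classifiers (0,0) "always negative" and (1,1) "always positive"."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Pop points that would make the chain bend upward: such points lie
        # below an interpolation of their neighbours and are never optimal.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

def hull_auc(hull):
    """Area under the piecewise-linear hull, by the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(hull, hull[1:]))

# Example: four candidate rules as ROC points; (0.4, 0.5) is dominated
# by combinations of its neighbours and drops off the hull.
rules = [(0.1, 0.6), (0.3, 0.7), (0.4, 0.5), (0.5, 0.95)]
hull = roc_hull(rules)
```

For these example points the hull is (0, 0) → (0.1, 0.6) → (0.5, 0.95) → (1, 1), with an AUC of 0.8275; pruning to the hull is what lets a learner focus its search on rule combinations that can still improve the curve.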
Responsible editor: Bianca Zadrozny.
Fawcett, T. PRIE: a system for generating rulelists to maximize ROC performance. Data Min Knowl Disc 17, 207–224 (2008). https://doi.org/10.1007/s10618-008-0089-y