Abstract
Rules are commonly used for classification because they are modular, intelligible, and easy to learn. Existing work in classification rule learning assumes the goal is to produce categorical classifications that maximize classification accuracy. Recent work in machine learning has pointed out the limitations of classification accuracy: when class distributions are skewed, or error costs are unequal, an accuracy-maximizing classifier can perform poorly. This paper presents a method for learning rules directly from ROC space when the goal is to maximize the area under the ROC curve (AUC). Basic principles from rule learning and computational geometry are used to focus the search for promising rule combinations. The result is a system that can learn intelligible rulelists with good ROC performance.
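The computational-geometry component mentioned in the abstract can be illustrated with the standard ROC convex hull construction: each candidate rule (or rulelist) yields a point (FP rate, TP rate) in ROC space, and only points on the upper convex hull can be optimal under some class distribution and cost ratio, so dominated candidates can be pruned. The following is a minimal sketch of that idea in Python, not PRIE's actual algorithm; the rule points and function names are illustrative:

```python
def _cross(o, a, b):
    """Cross product of vectors o->a and o->b; > 0 indicates a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def roc_hull(points):
    """Upper convex hull of ROC points (fpr, tpr), always including the
    trivial classifiers (0,0) "always negative" and (1,1) "always positive"."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Pop points that would make the chain bend upward: such points lie
        # below an interpolation of their neighbours and are never optimal.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

def hull_auc(hull):
    """Area under the piecewise-linear hull, by the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(hull, hull[1:]))

# Example: four candidate rules as ROC points; (0.4, 0.5) is dominated
# by combinations of its neighbours and drops off the hull.
rules = [(0.1, 0.6), (0.3, 0.7), (0.4, 0.5), (0.5, 0.95)]
hull = roc_hull(rules)
```

For these example points the hull is (0, 0) → (0.1, 0.6) → (0.5, 0.95) → (1, 1), with an AUC of 0.8275; pruning to the hull is what lets a learner focus its search on rule combinations that can still improve the curve.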
Responsible editor: Bianca Zadrozny.
Fawcett, T. PRIE: a system for generating rulelists to maximize ROC performance. Data Min Knowl Disc 17, 207–224 (2008). https://doi.org/10.1007/s10618-008-0089-y