Abstract
This paper describes the use of taxonomic hierarchies of concept classes (dependent class values) for knowledge discovery. The approach allows evidence to accumulate for rules at different levels of generality and avoids the need for domain experts to predetermine which levels of concepts should be learned. In particular, higher-level rules can be learned automatically when the data does not support more specific learning, and higher-level rules can be used to predict a particular case when the data is not detailed enough for a more specific rule. The process introduces difficulties concerning how to heuristically select rules during the learning process, since accuracy alone is not adequate. This paper explains the algorithm for using concept-class taxonomies, as well as techniques for incorporating granularity (together with accuracy) in the heuristic selection process. Empirical results on three data sets are summarized to highlight the tradeoff between predictive accuracy and predictive granularity.
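The tradeoff between accuracy and granularity can be illustrated with a minimal sketch. This is not the paper's algorithm: the taxonomy, the class names, the linear scoring function, and the `weight` parameter are all hypothetical, chosen only to show why accuracy alone tends to favor the most general class and how a granularity term can counteract that.

```python
from typing import Dict, List, Optional

# Hypothetical concept-class taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "any": None,
    "mild": "any",
    "severe": "any",
    "nausea": "mild",
    "headache": "mild",
    "seizure": "severe",
    "coma": "severe",
}


def ancestors_or_self(label: Optional[str]) -> List[str]:
    """Return the label followed by all of its ancestors up to the root."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path


def depth(concept: str) -> int:
    """Depth of a concept in the taxonomy (root has depth 0)."""
    return len(ancestors_or_self(concept)) - 1


MAX_DEPTH = max(depth(c) for c in PARENT)


def accuracy(concept: str, labels: List[str]) -> float:
    """Fraction of covered examples whose class is the concept or a descendant."""
    if not labels:
        return 0.0
    hits = sum(concept in ancestors_or_self(y) for y in labels)
    return hits / len(labels)


def granularity(concept: str) -> float:
    """Assumed granularity measure: normalized depth (0 = root, 1 = leaf)."""
    return depth(concept) / MAX_DEPTH


def score(concept: str, labels: List[str], weight: float) -> float:
    """Assumed heuristic: weighted sum of accuracy and granularity."""
    return weight * accuracy(concept, labels) + (1 - weight) * granularity(concept)


def best_concept(labels: List[str], weight: float) -> str:
    """Pick the class to predict for the examples a rule condition covers."""
    return max(PARENT, key=lambda c: score(c, labels, weight))


# Class labels of the examples matched by some hypothetical rule condition.
covered = ["nausea", "headache", "nausea", "headache"]
print(best_concept(covered, weight=1.0))  # accuracy only
print(best_concept(covered, weight=0.7))  # accuracy plus granularity
```

With `weight=1.0` the root class scores perfectly because every example falls under it, which is why accuracy by itself is not an adequate selection heuristic. Lowering the weight lets a more specific class win whenever the covered examples support it.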
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kolluri, V., Provost, F., Buchanan, B., Metzler, D. (2004). Knowledge Discovery Using Concept-Class Taxonomies. In: Webb, G.I., Yu, X. (eds) AI 2004: Advances in Artificial Intelligence. AI 2004. Lecture Notes in Computer Science, vol 3339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30549-1_40
DOI: https://doi.org/10.1007/978-3-540-30549-1_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24059-4
Online ISBN: 978-3-540-30549-1
eBook Packages: Computer Science, Computer Science (R0)