Abstract
This paper proposes an optimal strategy for extracting probabilistic rules from databases. Two statistical measures from inductive learning, accuracy and coverage, are introduced together with their rough set-based definitions. The simplicity of a rule, which is emphasized in this paper, has previously been ignored in the discovery of probabilistic rules. To avoid the high computational complexity of the rough set approach, rough set terminology, rather than the approach itself, is used to represent the probabilistic rules. A genetic algorithm is used to find the optimal probabilistic rules, namely those with the highest accuracy and coverage and the shortest length. Some heuristic genetic operators are also used to make the global search and the evolution of rules more efficient. Experimental results show that the method runs more efficiently than traditional classification methods and generates probabilistic classification rules of the same integrity.
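The accuracy and coverage measures named in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used in the rough set literature: accuracy is |[x]_R ∩ D| / |[x]_R| and coverage is |[x]_R ∩ D| / |D|, where [x]_R is the set of records matching the rule condition R and D is the set of records in the target class. The abstract does not spell these formulas out, so they may differ in detail from the paper. The function names and the toy records are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
Rule = Dict[str, str]  # attribute -> required value, read as a conjunction


def matches(record: Record, rule: Rule) -> bool:
    """Return True if the record satisfies every condition of the rule."""
    return all(record.get(attr) == value for attr, value in rule.items())


def accuracy_and_coverage(
    records: List[Record], rule: Rule, target_attr: str, target_value: str
) -> Tuple[float, float]:
    """Compute (accuracy, coverage) of `rule` for the class target_attr == target_value.

    Assumed definitions (not confirmed by the abstract):
      accuracy = |matched AND in class| / |matched|
      coverage = |matched AND in class| / |in class|
    """
    matched = [r for r in records if matches(r, rule)]
    in_class = [r for r in records if r[target_attr] == target_value]
    both = [r for r in matched if r[target_attr] == target_value]
    accuracy = len(both) / len(matched) if matched else 0.0
    coverage = len(both) / len(in_class) if in_class else 0.0
    return accuracy, coverage


# Hypothetical toy data, for illustration only.
records = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no", "flu": "yes"},
    {"fever": "yes", "cough": "yes", "flu": "no"},
    {"fever": "no", "cough": "yes", "flu": "no"},
    {"fever": "no", "cough": "no", "flu": "yes"},
]

rule = {"fever": "yes", "cough": "yes"}
acc, cov = accuracy_and_coverage(records, rule, "flu", "yes")
rule_length = len(rule)  # number of conditions; shorter rules are simpler
print(acc, cov, rule_length)
```

Here the rule matches two records, one of which belongs to the three-record target class, so its accuracy is 1/2 and its coverage is 1/3. A search procedure such as the genetic algorithm described in the abstract would prefer rules that score high on both measures while keeping the number of conditions small.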
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hang, X., Dai, H. (2003). An Optimal Strategy for Extracting Probabilistic Rules by Combining Rough Sets and Genetic Algorithm. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_14
DOI: https://doi.org/10.1007/978-3-540-39644-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20293-6
Online ISBN: 978-3-540-39644-4
eBook Packages: Springer Book Archive