Abstract
This paper introduces a new approach to mining if-then rules from databases that contain uncertain and incomplete data. The approach combines the Generalization Distribution Table (GDT) with rough set methodology. A GDT is a table that represents the probabilistic relationships between concepts and instances over discrete domains. Using a GDT as the hypothesis search space and combining it with rough set methodology makes it possible to handle noise and unseen instances, select biases flexibly, use background knowledge to constrain rule generation, and acquire if-then rules with strengths from large, complex databases in an incremental, bottom-up mode. In this paper, we focus on the basic concepts and an implementation of our methodology.
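As a rough illustration of what such a table might look like, the following Python sketch builds a small GDT over two discrete attributes. The abstract does not give the construction, so several things here are assumptions made only for illustration: generalizations are written as tuples with a wildcard "*", each generalization spreads its probability uniformly over the instances it covers (an equiprobable prior), and the "strength" of a generalization is taken to be the probability mass of the observed instances it covers. The rough set step and any noise handling are left out, and the function names (build_gdt, strength) are hypothetical.

```python
from itertools import product

WILDCARD = "*"  # assumed notation for "any value of this attribute"


def generalizations(domains):
    """Yield every attribute tuple that contains at least one wildcard."""
    options = [list(values) + [WILDCARD] for values in domains]
    for combo in product(*options):
        if WILDCARD in combo:
            yield combo


def covers(generalization, instance):
    """True if the instance matches the generalization on all non-wildcard attributes."""
    return all(g == WILDCARD or g == v for g, v in zip(generalization, instance))


def build_gdt(domains):
    """Map each generalization to {instance: p(instance | generalization)}.

    Assumption: a uniform distribution over the covered instances.
    """
    instances = list(product(*domains))
    table = {}
    for g in generalizations(domains):
        covered = [inst for inst in instances if covers(g, inst)]
        p = 1.0 / len(covered)
        table[g] = {inst: p for inst in covered}
    return table


def strength(gdt, generalization, observed):
    """Assumed strength: probability mass of the observed instances in this row."""
    row = gdt[generalization]
    return sum(p for inst, p in row.items() if inst in observed)


# Hypothetical toy data: two attributes with 2 and 3 possible values.
domains = [("a0", "a1"), ("b0", "b1", "b2")]
gdt = build_gdt(domains)
observed = {("a0", "b0"), ("a0", "b1")}
print(strength(gdt, ("a0", WILDCARD), observed))
```

In this toy setting, the generalization ("a0", "*") covers three possible instances, two of which were observed, so under the uniform assumption its strength is two thirds. A generalization can therefore receive a nonzero strength even though some of the instances it covers are unseen.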
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Zhong, N., Dong, JZ., Ohsuga, S. (1998). Data Mining: A Probabilistic Rough Set Approach. In: Polkowski, L., Skowron, A. (eds) Rough Sets in Knowledge Discovery 2. Studies in Fuzziness and Soft Computing, vol 19. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1883-3_7
DOI: https://doi.org/10.1007/978-3-7908-1883-3_7
Publisher Name: Physica, Heidelberg
Print ISBN: 978-3-7908-2459-9
Online ISBN: 978-3-7908-1883-3
eBook Packages: Springer Book Archive