Abstract
Knowledge discovery is concerned with the extraction of useful information from databases ([21]). One of the basic tasks of knowledge discovery and data mining is to synthesize descriptions of subsets (concepts) of the entities contained in databases. Patterns and/or rules extracted from data serve as the basic tools for concept description. In this chapter we propose a framework for approximating concepts that emphasizes extracting regularities from data. The following problems are investigated: (1) the languages used to represent patterns; (2) the computational complexity of problems in approximating concepts; (3) methods of identifying optimal patterns. Data regularity is useful not only for concept description; it is also indispensable for applications such as classification and decomposition. We also present applications of data regularity to three basic problems of data mining: classification, data description and data decomposition.
References
Agrawal R., Imielinski T., Swami A., 1993. Mining Association Rules Between Sets of Items in Large Databases. ACM SIGMOD Conference on Management of Data, Washington, D.C., pp. 207–216.
Agrawal R., Mannila H., Srikant R., Toivonen H., Verkamo A.I., 1996. Fast Discovery of Association Rules. In: U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press, pp. 307–328.
Agrawal R., Piatetsky-Shapiro G., (Eds.), 1998. Proc. of the Fourth International Conference on Knowledge Discovery and Data Mining, August 27–31, AAAI Press, Menlo Park, California.
Anzai Y., (Ed.), 1992. Pattern Recognition and Machine Learning. Academic Press, Inc., Harcourt Brace Jovanovich, Publishers.
Bay S. D., 1998. Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets. In: Proc. of the International Conference on Machine Learning. Morgan Kaufmann Publishers. Madison.
Bazan J., 1998. A Comparison of Dynamic and Non-dynamic Rough Set Methods for Extracting Laws from Decision Tables. In: Polkowski and Skowron [67], pp. 321–365.
Bazan J. G., 1998. Metody Wnioskowań Aproksymacyjnych dla Syntezy Algorytmów Decyzyjnych. PhD Dissertation, Warsaw University.
Bezdek J., 1996. A Sampler of Non-Neural Fuzzy Models for Clustering and Classification. Manuscript of Tutorial at the Fourth European Congress on Intelligent Techniques and Soft Computing, September 2–5.
Breiman L., Friedman J. H., Olshen R. A., Stone C. J., 1984. Classification and Regression Trees. Belmont, CA: Wadsworth International Group.
Brown F. M., 1990. Boolean Reasoning. Kluwer Academic Publishers, Dordrecht.
Bouchon-Meunier B., Delgado M., Verdegay J.L., Vila M.A., Yager R.R., (Eds.), 1996. Proceedings of the Sixth International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU’96). July 1–5, Granada, Spain, vols. 1–3, pp. 1–1546.
Cattaneo G., 1997. Generalized Rough Sets. Preclusivity Fuzzy-Intuitionistic (BZ) Lattices, Studia Logica 58, pp. 47–77.
Cattaneo G., 1998. Abstract Approximation Spaces for Rough Theories. In: Polkowski and Skowron [67], pp. 59–98.
Chiu D.K.Y., Cheung B., Wong, A.K.C., 1990. Information Synthesis Based on Hierarchical Entropy Discretization. Journal of Experimental and Theoretical Artificial Intelligence 2, pp. 117–129.
Chlebus B. S., Nguyen S. Hoa, 1998. On Finding Optimal Discretizations for Two Attributes. In: Polkowski and Skowron [69], pp. 537–544.
Chmielewski, M. R., Grzymala-Busse, J. W., 1994. Global Discretization of Attributes as Preprocessing for Machine Learning. Proc. of the III International Workshop on RSSC94, November, pp. 294–301.
Clark P., Niblett T., 1989. The CN2 Induction Algorithm. Machine Learning 3, pp. 261–284.
Cormen T. H., Leiserson C. E., Rivest R. L. The Set Covering Problem. Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, London, England, pp. 974–978.
Everitt B. S., (Ed.), 1993. Cluster Analysis. Reprinted 1998 by Arnold, a member of the Hodder Headline Group.
Fayyad U.M., 1991. On the Induction of Decision Trees for Multiple Concept Learning. PhD Dissertation, the University of Michigan.
Fayyad U., Piatetsky-Shapiro G., (Eds.), 1996. Advances in Knowledge Discovery and Data Mining. MIT/AAAI Press.
Fayyad U. M., Irani K. B., 1993. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In: Proc. of the 13th International Joint Conference on Artificial Intelligence, Morgan Kaufmann, pp. 1022–1027.
Garey M.R., Johnson D.S., 1979. Computers and Intractability. A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, New York.
Greco S., Matarazzo B., Slowiński R., 1995. Rough Set Approach to Multi-Attribute Choice and Ranking Problems. Institute of Computer Science, Warsaw University of Technology, ICS Research Report 38/95; see also, G. Fandel, T. Gal (Eds.), Multiple Criteria Decision Making, Proc. of 12th International Conference in Hagen, Springer-Verlag, Berlin, pp. 318–329.
Greco S., Matarazzo B., Slowiński R., 1998. Rough Approximation of a Preference Relation in a Pairwise Comparison Table. In: Polkowski and Skowron [67], pp. 13–36.
Greco S., Matarazzo B., Slowiński R., 1998. A New Rough Set Approach to Multicriteria and Multiattribute Classification. In: Polkowski and Skowron [69], pp. 60–67.
Greco S., Matarazzo B., Slowiński R., 1999. On Joint Use of Indiscernibility, Similarity and Dominance in Rough Approximation of Decision Classes. Proc. of Fifth International Conference on Integrating Technology and Human Decisions: Global Bridges into the 21st Century. Athens, Greece, July 4–7.
Heath D., Kasif S., Salzberg S., 1993. Induction of Oblique Decision Trees. Proc. of 13th International Joint Conference on Artificial Intelligence. Chambery, France, pp. 1002–1007.
Hu X., Cercone N., 1995. Rough Set Similarity Based Learning from Databases. Proc. of the First International Conference on Knowledge Discovery and Data Mining. August 20–21, Montreal, Canada, pp. 162–167.
Komorowski J., Ras Z.W., (Eds.), 1993. Proceedings of the Seventh International Symposium on Methodologies for Intelligent Systems (ISMIS’93), Trondheim, Norway, June 15–18, Lecture Notes in Computer Science 689, Springer-Verlag, Berlin.
Krawiec K., Slowiński R., Vanderpooten D., 1996. Construction of Rough Classifiers Based on Application of a Similarity Relation. Proc. of the Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery. November 6–8, Tokyo, Japan, pp. 23–30.
Krętowski M., Stepaniuk J., 1996. Selection of Objects and Attributes: A Tolerance Rough Set Approach. Proc. of the Ninth International Symposium on Methodologies for Intelligent Systems. June 9–13, Zakopane, Poland, pp. 169–180.
Lenz M., Bartsch-Spörl B., Burkhard H.-D., Wess S., (Eds.), 1998. Case-Based Reasoning Technology: From Foundations to Applications. LNAI 1400, Springer-Verlag, Berlin.
Lin T.Y., 1989. Neighbourhood Systems and Approximation in Database and Knowledge Base Systems. Proc. of the Fourth International Symposium on Methodologies for Intelligent Systems.
Lin T.Y., 1998. Granular Computing on Binary Relations I. In: Polkowski and Skowron [67], pp. 107–121.
Lin T.Y., 1998. Granular Computing on Binary Relations II. In: Polkowski and Skowron [67], pp. 122–140.
Liu H., Motoda H., (Eds.), 1998. Feature Selection for Knowledge Discovery and Data Mining. Kluwer Academic Publishers.
Mannila H., Toivonen H., Verkamo A. I., 1994. Efficient Algorithms for Discovering Association Rules. In: U. Fayyad and R. Uthurusamy (Eds.): AAAI Workshop on Knowledge Discovery in Databases, Seattle, WA, pp. 181–192.
Marcus S., 1994. Tolerance Rough Sets, Čech Topologies, Learning Process, Bull. Polish Academy. Ser. Sci. Tech., Vol. 42, No. 3, pp. 471–487.
Maritz P., 1996. Pawlak and Topological Rough Sets in Terms of Multi-Functions. Glasnik Matematicki 31/51, pp. 159–178.
Michalski R. S., Mozetic I., Hong J., Lavrac N., 1986. The Multi-Purpose Incremental Learning System AQ15 and Its Testing Application to Three Medical Domains. Proc. of the Fifth National Conference on AI, Philadelphia, Morgan Kaufmann, pp. 1041–1045.
Michie D., Spiegelhalter D.J., Taylor C.C., (Eds.), 1994. Machine Learning, Neural and Statistical Classification. Great Britain: Ellis Horwood.
Mitchell T. M., (Ed.), 1997. Machine Learning. The McGraw-Hill Companies, Inc.
Mollestad T., Skowron A., 1996. A Rough Set Framework for Data Mining of Propositional Default Rules. Proc. ISMIS-96, Zakopane, Poland, June, pp. 448–457.
Murthy S., Aha D., 1996. UCI Repository of Machine Learning Data Tables. http://www.ics.uci.edu/mlearn.
Murthy S., Kasif S., Salzberg S., Beigel R., 1993. OC1: Randomized Induction of Oblique Decision Trees. Proc. of the Eleventh National Conference on AI, July, pp. 322–327.
Murthy S., Kasif S., Salzberg S., 1994. A System for Induction of Oblique Decision Trees. In: Proc. of Sixth International Machine Learning Workshop, Ithaca N.Y., Morgan Kaufmann.
Nguyen S. Hoa, Nguyen H. Son, 1996. Some Efficient Algorithms for Rough Set Methods. In: Bouchon-Meunier, Delgado, Verdegay, Vila, and Yager [11], pp. 1451–1456.
Nguyen S. Hoa, Nguyen T. Trung, Skowron A., Synak P., 1996. Knowledge Discovery by Rough Set Methods. Proc. of the International Conference on Information Systems Analysis and Synthesis, July 22–26, Orlando, USA, pp. 26–33.
Nguyen S. Hoa, Polkowski L., Skowron A., Synak P., Wróblewski J., 1996. Searching for Approximate Description of Decision Classes. Proc. of the Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery, November 6–8, Tokyo, Japan, pp. 153–161.
Nguyen S. Hoa, Skowron A., Synak P., 1996. Rough Sets in Data Mining: Approximate Description of Decision Classes. Proc. of the Fourth European Congress on Intelligent Techniques and Soft Computing, Aachen, Germany, September 2–5, pp. 149–153.
Nguyen S. Hoa, Skowron A., 1997. Searching for Relational Patterns in Data. Proc. of the First European Symposium on Principles of Data Mining and Knowledge Discovery, June 25–27, Trondheim, Norway, pp. 265–276.
Nguyen S. Hoa, Nguyen H. Son, 1998. Pattern Extraction from Data. Fundamenta Informaticae 34, pp. 129–144.
Nguyen S. Hoa, Nguyen H. Son, 1998. The Decomposition Problem in Multi-Agent Systems. In: J. Komorowski, A. Skowron, I. Duntsch (Eds.): Proc. of the ECAI’98 Workshop on Synthesis of Intelligent Agent Systems from Experimental Data, Brighton, UK.
Nguyen S. Hoa, Nguyen H. Son, 1998. Pattern Extraction from Data. Proc. of the Seventh International Conference on Information Processing and Management of Uncertainty in Knowledge-based Systems (IPMU’98), Paris, France, July 6–10, pp. 1346–1353.
Nguyen S. Hoa, Skowron A., Synak P., 1998. Discovery of Data Patterns with Applications to Decomposition and Classification Problems. In: Polkowski and Skowron [67], pp. 55–97.
Nguyen S. Hoa, 1999. Discovery of Generalized Patterns. Proc. of the Eleventh International Symposium on Methodologies for Intelligent Systems (ISMIS’99), Warsaw, Poland, June 8–12, Lecture Notes in Computer Science 1609, Springer-Verlag, Berlin, pp. 574–582.
Nguyen H. Son, Skowron A., 1995. Quantization of Real Value Attributes: Rough Set and Boolean Reasoning Approach. Proc. of the Second Annual Joint Conference on Information Sciences, Wrightsville Beach, NC, USA, September 28 — October 1, pp. 34–37.
Nguyen H. Son, Nguyen S. Hoa, Skowron A., 1996. Searching for Features Defined by Hyper-planes. In: Z. W. Ras, M. Michalewicz (Eds.), Proc. of the IX International Symposium on Methodologies for Intelligent Systems ISMIS’96, June 1996, Zakopane, Poland. Lecture Notes in AI 1079, Berlin, Springer-Verlag, pp. 366–375.
Nguyen H. Son, 1997. Discretization of Real Value Attributes: Boolean Reasoning Approach. PhD Dissertation, Warsaw University.
Nguyen H. Son, Nguyen S. Hoa, 1998. Discretization Methods in Data Mining. In: Polkowski and Skowron [67], pp. 451–482.
Pawlak Z., 1991. Rough Sets. Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, Dordrecht.
Pfahringer B., 1995. Compression-Based Discretization of Continuous Attributes. In A. Prieditis, S. Russell, (Eds.), Proc. of the Twelfth International Conference on Machine Learning, Morgan Kaufmann.
Piatetsky-Shapiro G., 1991. Discovery, Analysis and Presentation of Strong Rules. In: Piatetsky-Shapiro G. and Frawley W.J. (Eds.): Knowledge Discovery in Databases, AAAI/MIT, pp. 229–247.
Polkowski L., Skowron A., Żytkow J., 1995. Tolerance Based Rough Sets. In: Soft Computing, T.Y. Lin, A.M. Wildberger (Eds.), San Diego, Simulation Council, Inc., pp. 55–58.
Polkowski L., Skowron A., 1996. Rough Mereological Approach to Knowledge-based Distributed AI. In: J.K. Lee, J. Liebowitz, Y. M. Chae (Eds.): Critical Technology. Proc. of the Third World Congress on Expert Systems, pp. 774–781, Seoul, Cognisant Communication Corporation, New York.
Polkowski L., Skowron A., (Eds.), 1998. Rough Sets in Knowledge Discovery 1: Methodology and Applications. Physica-Verlag, Heidelberg.
Polkowski L., Skowron A., (Eds.), 1998. Rough Sets in Knowledge Discovery 2: Applications, Case Studies and Software Systems. Physica-Verlag, Heidelberg.
Polkowski L., Skowron A., (Eds.), 1998. Proceedings of the First International Conference on Rough Sets and Current Trends in Computing (RSCTC’98), Warszawa, Poland, June 22–27, Springer-Verlag, LNAI 1424.
Quinlan J. R., 1986. Induction of Decision Trees. In: Machine Learning 1, pp. 81–106.
Quinlan J. R., 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.
Stefanowski J., 1998. On Rough Set Based Approaches to Induction of Decision Rules. In: Polkowski and Skowron [67], pp. 500–529.
Stepaniuk J., 1996. Similarity Based Rough Sets and Learning. Proc. of the Fourth International Workshop on Rough Sets, Fuzzy Sets and Machine Discovery, November 6–8, Tokyo, Japan, pp. 18–22.
Skowron A., Polkowski L., Komorowski J., 1996. Learning Tolerance Relation by Boolean Descriptions: Automatic Feature Extraction from Data Tables. Proc. of the Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery, November 6–8, Tokyo, Japan, pp. 11–17.
Skowron A., Rauszer C., 1992. The Discernibility Matrices and Functions in Information Systems. In: R. Slowiński (Ed.): Intelligent Decision Support. Handbook of Applications and Advances of the Rough Sets Theory, Kluwer, Dordrecht, pp. 331–362.
Skowron A., Pal S. K., (Eds.), 1998. Rough-Fuzzy Hybridization: A New Trend in Decision Making. Springer-Verlag. Singapore, pp. 1–97.
Skowron A., Stepaniuk J., 1996. Tolerance Approximation Spaces. In: Fundamenta Informaticae, August, 27(2–3), pp. 245–253.
Slowiński R., (Ed.), 1992. Intelligent Decision Support: Handbook of Applications and Advances of the Rough Sets Theory. Kluwer Academic Publishers, Dordrecht.
Slowiński R., 1993. Rough Set Learning of Preferential Attitude in Multi-Criteria Decision Making. In: Komorowski and Ras [30], pp. 642–651.
Slowiński R., 1994. Handling Various Types of Uncertainty in the Rough Set Approach. In: Ziarko [95], pp. 366–376.
Slowiński R., 1994. Rough Set Analysis of Multi-Attribute Decision Problems. In: Ziarko [95], pp. 136–143.
Slowiński R., 1995. Rough Set Approach to Decision Analysis. AI Expert 10, pp. 18–25.
Slowiński R., Vanderpooten D., 1995. Similarity Relation as a Basis for Rough Approximations. In: P. Wang (Ed.): Advances in Machine Intelligence & Soft Computing, Bookwrights, Raleigh NC (1997) pp. 17–33.
Slowiński R., Vanderpooten D., 1997. A Generalized Definition of Rough Approximations Based on Similarity. IEEE Trans. on Data and Knowledge Engineering.
Stepaniuk J., 1996. Similarity Based Rough Sets and Learning. In: Tsumoto, Kobayashi, Yokomori, Tanaka, and Nakamura [89], pp. 18–22.
Stepaniuk J., 1998. Approximation Spaces, Reducts and Representatives. In: Polkowski and Skowron [67], pp. 109–126.
Tentush I., 1995. On Minimal Absorbent Sets for some Types of Tolerance Relations, Bulletin of the Polish Academy of Sciences, 43(1), pp. 79–88.
Toivonen H., Klemettinen M., Ronkainen P., Hatonen P., Mannila H., 1995. Pruning and Grouping Discovered Association Rules. In: MLnet Familiarisation Workshop on Statistics, Machine Learning and Knowledge Discovery in Databases, Heraklion, Crete, April, pp. 47–52.
Tsumoto S., Kobayashi S., Yokomori T., Tanaka H., Nakamura A., (Eds.), 1996. Proceedings of the Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery (RSFD’96). The University of Tokyo, November 6–8.
Uthurusamy R., Fayyad U.M., Spangler S., 1991. Learning Useful Rules from Inconclusive Data. In: Piatetsky-Shapiro G. and Frawley W.J. (Eds.): Knowledge Discovery in Databases, AAAI/MIT, pp. 141–157.
Yao Y.Y., 1998. Generalized Rough Set Models. In: Polkowski and Skowron [67], pp. 286–318.
Yao Y.Y., Wong S.K.M., Lin T.Y., 1997. A Review of Rough Set Models. In: T.Y. Lin, N. Cercone (Eds.): Rough Sets and Data Mining: Analysis of Imprecise Data, Kluwer Academic Publishers, pp. 47–75.
Westphal C., Blaxton T., (Eds.), 1998. Data Mining Solution. Wiley Computer Publishing.
Zadeh L.A., 1997. Toward a Theory of Fuzzy Information Granulation and Its Centrality in Human Reasoning and Fuzzy Logic. Fuzzy Sets and Systems 90, pp. 111–127.
Ziarko W., (Ed.), 1994. Rough Sets, Fuzzy Sets and Knowledge Discovery, Workshops in Computing, Springer Verlag & British Computer Society.
Ziarko W., 1998. Rough Sets as a Methodology for Data Mining. In: Polkowski and Skowron [67], pp. 554–576.
Żytkow J., Baker J., 1991. Interactive Mining of Regularities in Data Bases. In: Piatetsky-Shapiro G. and Frawley W.J. (Eds.): Knowledge Discovery in Databases, Menlo Park, CA: AAAI Press, pp. 31–53.
Copyright information
© 2000 Physica-Verlag Heidelberg
About this chapter
Cite this chapter
Nguyen, S.H. (2000). Regularity Analysis and its Applications in Data Mining. In: Polkowski, L., Tsumoto, S., Lin, T.Y. (eds) Rough Set Methods and Applications. Studies in Fuzziness and Soft Computing, vol 56. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1840-6_7
DOI: https://doi.org/10.1007/978-3-7908-1840-6_7
Publisher Name: Physica, Heidelberg
Print ISBN: 978-3-662-00376-3
Online ISBN: 978-3-7908-1840-6
eBook Packages: Springer Book Archive