Abstract
Agrawal et al. [2] have proposed a fast algorithm to explore very large transaction databases with association rules [1]. In many real-world applications, data are managed in relational databases, where missing values are often inevitable. We show that, in this case, association rule algorithms give poor results because they were developed for transaction databases, where the problem of missing values practically does not exist. In this paper, we propose a new approach to mining association rules in relational databases containing missing values. The main idea is to cut a database into several valid databases (vdb) for a rule; a vdb must not contain any missing values. We redefine the support and confidence of rules for vdb. These definitions are fully compatible with
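The abstract does not give the redefined measures, so the following is only a minimal sketch of one plausible reading of the valid-database idea: for a given rule, rows with a missing value on any attribute used by the rule are set aside, and support and confidence are computed on the remaining rows. The function names, the use of `None` as the missing-value marker, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]   # None marks a missing value (assumption)
Itemset = Dict[str, str]         # attribute -> required value


def valid_rows(rows: List[Row], attributes: set) -> List[Row]:
    """Keep only rows with no missing value on the given attributes."""
    return [r for r in rows if all(r.get(a) is not None for a in attributes)]


def matches(row: Row, itemset: Itemset) -> bool:
    """True if the row carries every attribute/value pair of the itemset."""
    return all(row.get(a) == v for a, v in itemset.items())


def rule_stats(rows: List[Row], antecedent: Itemset,
               consequent: Itemset) -> Tuple[float, float]:
    """Support and confidence of antecedent -> consequent, computed only
    on the rows that are complete for the attributes of this rule."""
    attrs = set(antecedent) | set(consequent)
    vdb = valid_rows(rows, attrs)
    if not vdb:
        return 0.0, 0.0
    n_x = sum(matches(r, antecedent) for r in vdb)
    n_xy = sum(matches(r, {**antecedent, **consequent}) for r in vdb)
    support = n_xy / len(vdb)
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


# Toy relational table with two missing cells (hypothetical data).
ROWS: List[Row] = [
    {"smoker": "yes", "cough": "yes"},
    {"smoker": "yes", "cough": None},
    {"smoker": "yes", "cough": "no"},
    {"smoker": "no", "cough": "no"},
    {"smoker": None, "cough": "yes"},
]

support, confidence = rule_stats(ROWS, {"smoker": "yes"}, {"cough": "yes"})
```

In this toy table, three of the five rows are complete for the rule `smoker=yes -> cough=yes`, so both measures are computed over those three rows rather than over the whole table. A different rule, using other attributes, would in general be evaluated on a different subset of rows.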
References
R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. of the ACM SIGMOD Conference on Management of Data, Washington, D.C., p 207–216, May 1993.
R. Agrawal, H. Mannila, R. Srikant, H. Toivonen and A. I. Verkamo. Fast Discovery of Association Rules. In Advances in Knowledge Discovery and Data Mining, Chapter 12, AAAI/MIT Press, 1996.
R. Agrawal, R. Srikant. Fast algorithms for mining association rules in large databases. In Proc. of the Twentieth Int'l Conference on Very Large Databases (VLDB'94), p. 487–499, September 1994.
K. Ali, S. Manganaris, R. Srikant: Partial Classification using Association Rules, in Proc. of the 3rd Int'l Conference on Knowledge Discovery in Databases and Data Mining, Newport Beach, California, August 1997.
L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone. Classification and Regression Trees, Wadsworth Int'l Group, Belmont, CA, The Wadsworth Statistics/Probability Series, 1984.
G. Celeux. Le traitement des données manquantes dans le logiciel SICLA. Technical reports number 102. INRIA, France, December 1988.
B. Crémilleux, C. Robert. A pruning method for decision trees in uncertain domains: applications in medicine. Int'l Workshop on Intelligent Data Analysis in Medicine and Pharmacology, ECAI 1996, p 15–20, Budapest 1996.
B. Crémilleux, C. Robert. A theoretical framework for decision trees in uncertain domains: application to medical data sets. 6th Conference in Artificial Intelligence in Medicine Europe (AIME 97), p 145–156, Grenoble 1997.
W.Z Liu, A.P White, S.G Thompson and M.A Bramer. Techniques for Dealing with Missing Values in Classification. In Second Int'l Symposium on Intelligent Data Analysis, Birkbeck College, University of London, 4th-5th August 1997.
H. Mannila, H. Toivonen and A. Inkeri Verkamo. Efficient algorithms for discovering association rules. In Knowledge Discovery in Databases, Papers from the 1994 AAAI Workshop (KDD'94), p. 181–192, Seattle, Washington, July 1994.
J.R Quinlan. Induction of decision trees. Machine learning, 1, p. 81–106, 1986.
J.R Quinlan. Unknown Attribute Values in Induction, in Segre A.M.(ed.), Proc. of the Sixth Int'l Workshop on Machine Learning, Morgan Kaufmann, Los Altos, CA, p. 164–168, 1989.
J.R Quinlan. C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993.
A. Ragel: Traitement des valeurs manquantes dans les arbres de décision. Technical reports, Les cahiers du GREYC. University of Caen, France, 1997.
A. Savasere, E. Omiecinski, S. Navathe. An efficient algorithm for mining association rules in large databases. Proc. of the 21st Int. Conference on Very Large Databases (VLDB'95), p. 432–444, Zurich, Switzerland, 1995.
H. Toivonen. Sampling large databases for association rules. In Proc. of the 22nd Int'l Conference on Very Large Databases (VLDB'96), p. 134–145, Bombay, India, 1996.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ragel, A., Crémilleux, B. (1998). Treatment of missing values for association rules. In: Wu, X., Kotagiri, R., Korb, K.B. (eds) Research and Development in Knowledge Discovery and Data Mining. PAKDD 1998. Lecture Notes in Computer Science, vol 1394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64383-4_22
DOI: https://doi.org/10.1007/3-540-64383-4_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64383-8
Online ISBN: 978-3-540-69768-8
eBook Packages: Springer Book Archive