Abstract
Mining patterns under constraints in large data (also called fat data) is an important task to benefit from the multiple uses of the patterns embedded in these data sets. It is a difficult task due to the exponential growth of the search space according to the number of attributes. From such contexts, closed patterns can be extracted by using the properties of the Galois connections. But, from the best of our knowledge, there is no approach to extract interesting patterns like δ-free patterns which are on the core of a lot of relevant rules. In this paper, we propose a new method based on an efficient way to compute the extension of a pattern and a pruning criterion to mine frequent δ-free patterns in large databases. We give an algorithm (FTminer) for the practical use of this method. We show the efficiency of this approach by means of experiments on benchmarks and on gene expression data.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
References
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H.: Fast discovery of associations rules. In: Advances in Knowledge Discovery and Data Mining (1996)
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Bocca, J.B., Jarke, M., Zaniolo, C. (eds.) Proc. 20th Int. Conf. Very Large Data Bases, VLDB, pp. 487–499. Morgan Kaufmann, San Francisco (1994)
Bayardo, R.J.: The hows, whys, and whens of constraints in itemset and rule discovery. In: Proceedings of the workshop on Inductive Databases and Constraint Based Mining (2005)
Boulicaut, J.-F., Bykowski, A.: Frequent closures as a concise representation for binary data mining. In: Terano, T., Chen, A.L.P. (eds.) PAKDD 2000. LNCS, vol. 1805, pp. 62–73. Springer, Heidelberg (2000)
Boulicaut, J.-F., Bykowski, A., Rigotti, C.: Approximation of frequency queries by means of free-sets. In: The Fourth European Conference on Principles and Practice of Knowledge Discovery in Databases, Lyon (2000)
Boulicaut, J.F., Bykowski, A., Rigotti, C.: Free-sets: a condensed representation of boolean data for the approximation of frequency queries. Data Mining and Knowledge Discovery journal 7(1), 5–22 (2003)
Crémilleux, B., Boulicaut, J.-F.: Simplest rules characterizing classes generated by delta-free sets. In: 22nd Int. Conf. on Knowledge Based Systems and Applied Artificial Intelligence (ES 2002), Cambridge, UK, December 2002, pp. 33–46 (2002)
Eiter, T., Gottlob, G.: Identifying the minimal transversals of a hypergraph and related problems. SIAM Journal on Computing archive 24(6), 1278–1304 (1995)
Jamison, R.E., Pfaltz, J.L.: Closure spaces that are not uniquely generated. In: Ordinal and Symbolic Data Analysis, Brussels (2000)
Jeudy, B., Rioult, F.: Database transposition for constrained closed pattern mining. In: Third International Workshop on Knowledge Discovery in Inductive Databases (2004)
Mannila, H., Toivonen, H.: Multiple uses of frequent sets and condensed representations. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD 1996), Portland, Oregon, pp. 189–194 (1996)
Mannila, H., Toivonen, H.: Levelwise search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery 1(3), 241–258 (1997)
Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Discovering frequent closed itemsets for association rules. Lecture Notes in Computer Science (1999)
Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient mining of association rules using closed itemset lattices. Data Mining and Knowledge Discovery journal 24(1), 25–46 (1999)
Pei, J., Han, J., Mao, R.: CLOSET: An efficient algorithm for mining frequent closed itemsets. In: ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 21–30 (2000)
Pfaltz, J.L., Taylor, C.: Closed set mining of biological data. In: Workshop on Data Mining and Bioinformatics (2002)
Rioult, F., Boulicaut, J.-F., Crémilleux, B., Besson, J.: Using transposition for pattern discovery from microarray data. In: Zaki, M.J., Aggarwal, C.C. (eds.) Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2003), San Diego, CA, pp. 73–79 (2003)
Zaki, M.J.: Generating non-redundant association rules. In: Proceedings of the 6th International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD 2000), Boston, MA, pp. 34–43 (2000)
Zaki, M.J., Hsiao, C.J.: CHARM: an efficient algorithm for closed itemset mining. In: Proceedings of the 2th SIAM international conference on Data Mining, Arlington, pp. 33–43 (2002)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hébert, C., Crémilleux, B. (2005). Mining Frequent δ-Free Patterns in Large Databases. In: Hoffmann, A., Motoda, H., Scheffer, T. (eds) Discovery Science. DS 2005. Lecture Notes in Computer Science(), vol 3735. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563983_12
Download citation
DOI: https://doi.org/10.1007/11563983_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29230-2
Online ISBN: 978-3-540-31698-5
eBook Packages: Computer ScienceComputer Science (R0)