Abstract
Mining frequent closed itemsets has been shown to be more efficient than mining all frequent itemsets for generating non-redundant association rules. The task is challenging in a data stream environment because the stream is unbounded and each transaction can be examined only once (the no-second-look characteristic).
In this paper, we propose an algorithm, CLICI, for mining all recent closed itemsets in the landmark window model of an online data stream. The algorithm consists of an online component, which processes the transactions arriving in the stream without candidate generation and updates the synopsis accordingly, and an offline component, which is invoked on demand to mine all frequent closed itemsets. The user can explore and experiment by specifying the support threshold dynamically.
The synopsis, CILattice, stores all recent closed itemsets in the stream. It is based on the Concept Lattice, a core structure of Formal Concept Analysis (FCA). Storing closed itemsets in the form of a lattice facilitates the generation of non-redundant association rules, which is the main motivation behind using a lattice-based synopsis.
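The split between an online update step and an on-demand query can be illustrated with a small sketch. This is not the CLICI algorithm or the CILattice structure: it is a generic incremental scheme that assumes a flat dictionary in place of a lattice and relies on the standard fact that, after a transaction arrives, the only closed itemsets that are new or gain support are the transaction itself and its non-empty intersections with existing closed itemsets. The names `ClosedItemsetSynopsis`, `add`, and `frequent` are hypothetical, and the threshold is taken as an absolute count.

```python
from typing import Dict, FrozenSet, Iterable

Itemset = FrozenSet[str]


class ClosedItemsetSynopsis:
    """Hypothetical flat stand-in for a lattice-based synopsis.

    Maps every non-empty closed itemset seen since the landmark
    to its support count. The empty itemset is not tracked.
    """

    def __init__(self) -> None:
        self.support: Dict[Itemset, int] = {}

    def add(self, transaction: Iterable[str]) -> None:
        """Online step: fold one transaction into the synopsis."""
        t = frozenset(transaction)
        if not t:
            return
        # Support each affected itemset had BEFORE this transaction.
        # The transaction itself starts at 0 in case it is unseen.
        old_support: Dict[Itemset, int] = {t: 0}
        for closed, count in self.support.items():
            common = closed & t
            if common:
                # The largest count among closed supersets that
                # intersect to `common` is its previous support.
                old_support[common] = max(old_support.get(common, 0), count)
        # Every affected itemset is contained in t, so it gains one.
        for itemset, count in old_support.items():
            self.support[itemset] = count + 1

    def frequent(self, min_count: int) -> Dict[Itemset, int]:
        """Offline step: closed itemsets meeting a chosen threshold."""
        return {s: c for s, c in self.support.items() if c >= min_count}


synopsis = ClosedItemsetSynopsis()
for tx in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]):
    synopsis.add(tx)

# The threshold is supplied at query time, so it can be varied freely.
for itemset, count in sorted(
    synopsis.frequent(2).items(), key=lambda kv: (-kv[1], sorted(kv[0]))
):
    print(sorted(itemset), count)
```

Because the synopsis keeps every closed itemset regardless of support, the same state can answer queries at different thresholds without reprocessing the stream. A lattice additionally records the subset order between closed itemsets, which this flat sketch omits.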
Experimental evaluation using synthetic and real-life datasets demonstrates the scalability of the algorithm.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gupta, A., Bhatnagar, V., Kumar, N. (2010). Mining Closed Itemsets in Data Stream Using Formal Concept Analysis. In: Bach Pedersen, T., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2010. Lecture Notes in Computer Science, vol 6263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15105-7_23
DOI: https://doi.org/10.1007/978-3-642-15105-7_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15104-0
Online ISBN: 978-3-642-15105-7
eBook Packages: Computer Science (R0)