Abstract
Frequent sets lie at the basis of many Data Mining algorithms. As a result, hundreds of algorithms have been proposed in order to solve the frequent set mining problem. In this chapter, we attempt to survey the most successful algorithms and techniques that try to solve this problem efficiently.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
References
Agrawal, R., Imielinski, T., and Swami, A. (1993). Mining association rules between sets of items in large databases. In Buneman, P. and Jajodia, S., editors, Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, volume 22(2) of SIGMOD Record, pages 207–216. ACM Press.
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., and Verkamo, A. (1996). Fast discovery of association rules. In Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., and Uthurusamy, R., editors, Advances in Knowledge Discovery and Data Mining, pages 307–328. MIT Press.
Agrawal, R. and Srikant, R. (1994). Fast algorithms for mining association rules. In Bocca, J., Jarke, M., and Zaniolo, C, editors, Proceedings 20th International Conference on Very Large Data Bases, pages 487–499. Morgan Kaufmann.
Amir, A., Feldman, R., and Kashi, R. (1997). A new and versatile method for association generation. Information Systems, 2:333–347.
Bayardo, Jr., R. (1998). Efficiently mining long patterns from databases. In (Haas and Tiwary, 1998), pages 85–93.
Bonchi, F., Giannotti, F, Mazzanti, A., and Pedreschi, D. (2003). Exante: Anticipated data reduction in constrained pattern mining. In (Lavrac et al., 2003).
Borgelt, C. and Kruse, R. (2002). Induction of association rules: Apriori implementation. In Hardle, W. and Ronz, B., editors, Proceedings of the 15th Conference on Computational Statistics, pages 395–400. Physica-Verlag.
Boulicaut, J.-E, Bykowski, A., and Rigotti, C. (2003). Free-sets: A condensed representation of boolean data for the approximation of frequency queries. Data Mining and Knowledge Discovery, 7(1):5–22.
Brin, S., Motwani, R., Ullman, J., and Tsur, S. (1997). Dynamic itemset counting and implication rules for market basket data. In Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, volume 26(2) of SIGMOD Record, pages 255–264. ACM Press.
Burdick, D., Calimlim, M., and Gehrke, J. (2001). MAFIA: A maximal frequent itemset algorithm for transactional databases. In Proceedings of the 17th International Conference on Data Engineering, pages 443–452. IEEE Computer Society.
Bykowski, A. and Rigotti, C. (2001). A condensed representation to find frequent patterns. In Proceedings of the Twentieth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 267–273. ACM Press.
Calders, T. (2004). Computational complexity of itemset frequency satisfiability. In Proceedings of the Twenty-third ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 143–154. ACM Press.
Calders, T. and Goethals, B. (2002). Mining all non-derivable frequent item-sets. In Elomaa, T., Mannila, H., and Toivonen, H., editors, Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery, volume 2431 of Lecture Notes in Computer Science, pages 74–85. Springer.
Calders, T. and Goethals, B. (2003). Minimal k-free representations of frequent sets. In (Lavrac et al., 2003), pages 71–82.
Cercone, N., Lin, T., and Wu, X., editors (2001). Proceedings of the 2001 IEEE International Conference on Data Mining. IEEE Computer Society.
Dayal, U., Gray, P., and Nishio, S., editors (1995). Proceedings 21th International Conference on Very Large Data Bases. Morgan Kaufmann.
G. Grahne, J. Z. (2003). Efficiently using prefix-trees in mining frequent item-set. In (Goethals and Zaki, 2003).
G. Ramesh, W. Maniatty, M. Z. (2003). Feasible itemset distributions in Data Mining: theory and application. In Proceedings of the Twenty-second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 284–295. ACM Press.
Geerts, E, Goethals, B., and den Bussche, J. V. (2001). A tight upper bound on the number of candidate patterns. In (Cercone et al., 2001), pages 155–162.
Getoor, L., Senator, T., Domingos, P., and Faloutsos, C, editors (2003). Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press.
Goethals, B. (2004). Memory issues in frequent itemset mining. In Haddad, H., Omicini, A., Wainwright, R., and Liebrock, L., editors, Proceedings of the 2004 ACM symposium on Applied computing, pages 530–534. ACM Press.
Goethals, B. and den Bussche, J. V. (2000). On supporting interactive association rule mining. In Kambayashi, Y., Mohania, M., and Tjoa, A., editors, Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery, volume 1874 of Lecture Notes in Computer Science, pages 307–316. Springer.
Goethals, B. and Zaki, M., editors (2003). Proceedings of the ICDM 2003 Workshop on Frequent Itemset Mining Implementations, volume 90 of CEUR Workshop Proceedings.
Gouda, K. and Zaki, M. (2001). Efficiently mining maximal frequent itemset. In (Cercone et al., 2001), pages 163–170.
Gunopulos, D., Khardon, R., Mannila, H., Saluja, S., Toivonen, H., and Sharma, R. (2003). Discovering all most specific sentences. ACM Transactions on Database Systems, 28(2): 140–174.
Haas, L. and Tiwary, A., editors (1998). Proceedings of the 1998 ACM SIG-MOD International Conference on Management of Data, volume 27(2) of SIGMOD Record. ACM Press.
Han, J., Pei, J., Yin, Y., and Mao, R. (2004). Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, 8(1):53–87.
Holsheimer, M., Kersten, M., Mannila, H., and Toivonen, H. (1995). A perspective on databases and Data Mining. In Fayyad, U. and Uthurusamy, R., editors, Proceedings of the First International Conference on Knowledge Discovery and Data Mining, pages 150–155. AAAI Press.
Lavrac, N., Gamberger, D., Blocked, H., and Todorovski, L., editors (2003). Proceedings of the 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, volume 2838 of Lecture Notes in Computer Science. Springer.
Liu, G., Lu, H., Yu, J., Wei, W., and Xiao, X. (2003). AFOPT: An efficient implementation of pattern growth approach. In (Goethals and Zaki, 2003).
Mannila, H. (1997). Inductive databases and condensed representations for Data Mining. In Maluszynski, J., editor, Proceedings of the 1997 International Symposium on Logic Programming, pages 21–30. MIT Press.
Mannila, H. (2002). Local and global methods in Data Mining: Basic techniques and open problems. In Widmayer, P., Ruiz, F., Morales, R., Hennessy, M., Eidenbenz, S., and Conejo, R., editors, Proceedings of the 29th International Colloquium on Automata, Languages and Programming, volume 2380 of Lecture Notes in Computer Science, pages 57–68. Springer.
Mannila, H., Toivonen, H., and Verkamo, A. (1994). Efficient algorithms for discovering association rules. In Fayyad, U. and Uthurusamy, R., editors, Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, pages 181–192. AAAI Press.
Mielikäinen, T. (2003). On inverse frequent set mining. In Du, W. and Clifton, C, editors, 2nd Workshop on Privacy Preserving Data Mining, pages 18–23.
Ng, R., Lakshmanan, L., Han, J., and Pang, A. (1998). Exploratory mining and pruning optimizations of constrained association rules. In (Haas and Tiwary, 1998), pages 13–24.
Orlando, S., Palmerini, P., Perego, R., and Silvestri, F. (2002). Adaptive and resource-aware mining of frequent sets. In Kumar, V., Tsumoto, S., Yu, P., and N. Zhong, editors, Proceedings of the 2002 IEEE International Conference on Data Mining. IEEE Computer Society. To appear.
Pan, F., Cong, G., and A.K.H. Tung, J. Yang, M. Z. (2003). Carpenter: finding closed patterns in long biological datasets. In (Getoor et al., 2003), pages 637–642.
Park, J., Chen, M.-S., and Yu, P. (1995). An effective hash based algorithm for mining association rules. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, volume 24(2) of SIGMOD Record, pages 175–186. ACM Press.
Pasquier, N., Bastide, Y, Taouil, R., and Lakhal, L. (1999). Discovering frequent closed itemsets for association rules. In Beeri, C. and Buneman, P., editors, Proceedings of the 7th International Conference on Database Theory, volume 1540 of Lecture Notes in Computer Science, pages 398–416. Springer.
Rioult, K, Boulicaut, J.-R, and B. Cremilleux, J. B. (2003). Using transposition for pattern discovery from microarray data. In Zaki, M. and Aggarwal, C, editors, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pages 73–79. ACM Press.
Savasere, A., Omiecinski, E., and Navathe, S. (1995). An efficient algorithm for mining association rules in large databases. In (Dayal et al., 1995), pages 432–444.
Srikant, R. (1996). Fast algorithms for mining association rules and sequential patterns. PhD thesis, University of Wisconsin, Madison.
Srikant, R. and Agrawal, R. (1995). Mining generalized association rules. In (Dayal et al., 1995), pages 407–419.
Srikant, R., Vu, Q., and Agrawal, R. (1997). Mining association rules with item constraints. In Heckerman, D., Mannila, H., and Pregibon, D., editors, Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, pages 66–73. AAAI Press.
Toivonen, H. (1996). Sampling large databases for association rules. In Vija-yaraman, T., Buchmann, A., Mohan, C, and Sarda, N., editors, Proceedings 22nd International Conference on Very Large Data Bases, pages 134–145. Morgan Kaufmann.
Uno, T. and Satoh, K. (2003). Detailed description of an algorithm for enumeration of maximal frequent sets with irredundant dualization. In (Goethals and Zaki, 2003).
Vaidya, J. and Clifton, C. (2002). Privacy preserving association rule mining in vertically partitioned data. In Hand, D., Keim, D., and Ng, R., editors, Proceedings of the Eight ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 639–644. ACM Press.
Wang, J., Han, J., and Pei, J. (2003). CLOSET+: searching for the best strategies for mining frequent closed itemsets. In (Getoor et al., 2003), pages 236–245.
Yang, G. (2004). The complexity of mining maximal frequent itemsets and maximal frequent patterns. In DuMouchel, W., Gehrke, J., Ghosh, J., and Kohavi, R., editors, Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM Press.
Zaki, M. (2000). Scalable algorithms for association mining. IEEE Transactions on Knowledge and Data Engineering, 12(3):372–390.
Zaki, M. and Gouda, K. (2003). Fast vertical mining using diffsets. In (Getoor et al., 2003), pages 326–335.
Zaki, M. and Hsiao, C.-J. (2002). CHARM: An efficient algorithm for closed itemset mining. In Grossman, R., Han, J., Kumar, V., Mannila, H., and Mot-wani, R., editors, Proceedings of the Second SIAM International Conference on Data Mining.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer Science+Business Media, Inc.
About this chapter
Cite this chapter
Goethals, B. (2005). Frequent Set Mining. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/0-387-25465-X_17
Download citation
DOI: https://doi.org/10.1007/0-387-25465-X_17
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-24435-8
Online ISBN: 978-0-387-25465-4
eBook Packages: Computer ScienceComputer Science (R0)