Abstract
High-Utility Rare Itemset (HURI) mining finds the itemsets in a database whose utility is no less than a given minimum utility threshold and whose support is less than a given frequency threshold. Identifying high-utility rare itemsets can support better business decisions by highlighting rare itemsets that yield high profit, so that they can be marketed more heavily. Several two-phase algorithms have been proposed to mine high-utility rare itemsets: the rare itemsets are generated in the first phase, and the high-utility rare itemsets are extracted from them in the second phase. However, a two-phase solution is inefficient because the number of rare itemsets is enormous and grows very quickly as the frequency threshold increases. In this paper, we propose an algorithm, UP-Rare Growth, which uses the UP-Tree data structure to find high-utility rare itemsets in a transaction database. Instead of generating the rare itemsets explicitly, the proposed algorithm considers the frequency and the utility of itemsets together. We also propose a couple of effective strategies to avoid searching unpromising branches of the tree. Extensive experiments show that the proposed algorithm outperforms state-of-the-art algorithms in terms of the number of candidates generated.
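To make the problem definition concrete, the following is a minimal brute-force sketch of what counts as a high-utility rare itemset. It is not UP-Rare Growth and does not use a UP-Tree; it simply enumerates every itemset and checks both conditions. The toy database, the profit table, the function name `mine_huri`, the use of an absolute transaction count as the frequency threshold, and the choice to skip itemsets that never occur are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 1},
    {"b": 3, "d": 1},
]

# Hypothetical unit profit of each item.
profit = {"a": 1, "b": 2, "c": 5, "d": 10}


def mine_huri(transactions, profit, min_util, freq_threshold):
    """Naive enumeration of high-utility rare itemsets.

    An itemset is reported when its utility >= min_util and its support
    (number of transactions containing it) is < freq_threshold.
    Itemsets with zero support are skipped (an assumption of this sketch).
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            # Transactions that contain every item of the candidate itemset.
            containing = [t for t in transactions if all(i in t for i in itemset)]
            support = len(containing)
            if support == 0 or support >= freq_threshold:
                continue  # never occurs, or not rare
            # Utility: quantity * unit profit, summed over supporting transactions.
            utility = sum(t[i] * profit[i] for t in containing for i in itemset)
            if utility >= min_util:
                result[itemset] = (support, utility)
    return result


print(mine_huri(transactions, profit, min_util=10, freq_threshold=2))
# → {('d',): (1, 10), ('b', 'd'): (1, 16)}
```

The enumeration examines every subset of the item universe, so its cost is exponential in the number of items; avoiding that kind of candidate explosion is what motivates a tree-based approach with pruning.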
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Goyal, V., Dawar, S., Sureka, A. (2015). High Utility Rare Itemset Mining over Transaction Databases. In: Chu, W., Kikuchi, S., Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2015. Lecture Notes in Computer Science, vol 8999. Springer, Cham. https://doi.org/10.1007/978-3-319-16313-0_3
DOI: https://doi.org/10.1007/978-3-319-16313-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16312-3
Online ISBN: 978-3-319-16313-0
eBook Packages: Computer Science, Computer Science (R0)