Abstract
Rare itemset mining is an emerging research area in data mining. Patterns with low support and high confidence are referred to as rare patterns. In certain application domains, such as the analysis of network logs, online customer purchase behavior, online banking transactions, sensor data, and stock market data, they are more interesting than frequent patterns. Many applications generate large volumes of continuous data streams. Analyzing such streams and identifying rare patterns requires efficient algorithms that can process data streams. Many research articles on rare pattern mining address static databases, but algorithms designed for static databases cannot be applied to data streams. Hence, algorithms designed specifically for data stream processing are needed to mine important rare patterns. Rare pattern mining from data streams is at an early stage, and only a few algorithms are available. To address this, we propose HEclat-RPStream, an Eclat-based method that mines rare patterns from a data stream using vertical mining with bitsets. The discovered patterns are maintained in a prefix-based rare pattern tree, and double hashing is used to maintain rare patterns in the data stream. The algorithm also uses breadth-first search (BFS) and depth-first search (DFS) to discover interesting large itemsets. To handle data streams, we use a time-sensitive sliding window approach that captures the most recent patterns. A pruning technique based on two items is used to optimize performance. Experimental results show that the proposed method performs well in terms of execution time and the total number of rare patterns generated.
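The vertical, bitset-based idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the HEclat-RPStream algorithm itself: it leaves out the rare pattern tree, the double hashing, the two-item pruning, and the confidence check. The function names and the two thresholds (`min_rare_sup` as a noise floor, `min_freq_sup` as the frequent-pattern cutoff) are assumptions made for illustration, and `window` stands for the transactions currently held in the sliding window.

```python
from typing import Dict, FrozenSet, Hashable, List, Set, Tuple


def build_vertical_bitsets(window: List[Set[Hashable]]) -> Dict[Hashable, int]:
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitsets: Dict[Hashable, int] = {}
    for tid, transaction in enumerate(window):
        for item in transaction:
            bitsets[item] = bitsets.get(item, 0) | (1 << tid)
    return bitsets


def support(bits: int) -> int:
    """Support count = number of set bits in the transaction bitset."""
    return bin(bits).count("1")


def mine_rare_itemsets(
    window: List[Set[Hashable]], min_rare_sup: int, min_freq_sup: int
) -> Dict[FrozenSet[Hashable], int]:
    """Return itemsets whose support lies in [min_rare_sup, min_freq_sup).

    Hypothetical thresholds: itemsets below min_rare_sup are treated as noise,
    itemsets at or above min_freq_sup are treated as frequent.
    """
    bitsets = build_vertical_bitsets(window)
    # Items below the noise floor cannot appear in any qualifying itemset.
    items = sorted(i for i in bitsets if support(bitsets[i]) >= min_rare_sup)
    rare: Dict[FrozenSet[Hashable], int] = {}

    def extend(prefix: Tuple[Hashable, ...], prefix_bits: int, candidates: list) -> None:
        # Depth-first, Eclat-style extension: intersect bitsets to get support.
        for idx, item in enumerate(candidates):
            bits = prefix_bits & bitsets[item]
            sup = support(bits)
            if sup < min_rare_sup:
                continue  # supersets can only have lower support, so prune
            itemset = prefix + (item,)
            if sup < min_freq_sup:
                rare[frozenset(itemset)] = sup
            # Frequent itemsets are still extended: their supersets may be rare.
            extend(itemset, bits, candidates[idx + 1:])

    all_bits = (1 << len(window)) - 1  # the empty prefix matches every transaction
    extend((), all_bits, items)
    return rare


window = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"a", "b", "c", "d"}]
result = mine_rare_itemsets(window, min_rare_sup=2, min_freq_sup=4)
```

In this toy window, `c`, `{a, c}`, `{b, c}`, and `{a, b, c}` fall between the two thresholds, while `d` is dropped as noise. Only the lower threshold is used for pruning, because support can only shrink as an itemset grows. A streaming version would rebuild or incrementally update the bitsets as transactions enter and leave the window.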
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Vanamala, S., Sree, L.P., Bhavani, S.D. (2022). Rare Pattern Mining from Data Stream Using Hash-Based Search and Vertical Mining. In: Reddy, V.S., Prasad, V.K., Mallikarjuna Rao, D.N., Satapathy, S.C. (eds) Intelligent Systems and Sustainable Computing. Smart Innovation, Systems and Technologies, vol 289. Springer, Singapore. https://doi.org/10.1007/978-981-19-0011-2_48
DOI: https://doi.org/10.1007/978-981-19-0011-2_48
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0010-5
Online ISBN: 978-981-19-0011-2
eBook Packages: Intelligent Technologies and Robotics (R0)