Rare Pattern Mining from Data Stream Using Hash-Based Search and Vertical Mining

  • Conference paper
Intelligent Systems and Sustainable Computing

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 289)

Abstract

Rare itemset mining is an emerging research domain in data mining. Patterns with low support and high confidence are referred to as rare patterns. In certain application domains, such as analysis of network logs, online customer purchase behavior, online banking transactions, sensor data, and stock market data, they are more interesting than frequent patterns. Many applications generate large volumes of continuous data streams, and identifying rare patterns in them requires efficient algorithms that can process data streams. Many research articles on rare pattern mining are available for static databases, but algorithms designed for static databases cannot be applied to data streams. Hence, algorithms designed specifically for data stream processing are needed to mine important rare patterns. Rare pattern mining from data streams is at a budding stage, and only a few algorithms are available. To address this, we propose HEclat-RPStream, an Eclat-based method that mines rare patterns from a data stream using vertical mining with bitsets. The discovered patterns are maintained in a prefix-based rare pattern tree, which uses double hashing to maintain rare patterns in the data stream. The algorithm also uses Breadth First Search (BFS) and Depth First Search (DFS) to discover interesting large itemsets. To handle data streams, we use a time-sensitive sliding window approach that captures the most recent patterns. A pruning technique based on two items is used to optimize performance. The experimental results show that the proposed method performs well with respect to execution time and the total number of rare patterns generated.
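
The vertical, bitset-based mining over a time-sensitive sliding window described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HEclat-RPStream implementation: the function names (`slide`, `build_bitsets`, `mine_rare`), the parameters `span`, `min_rare` and `min_freq`, and the definition of a rare itemset as one whose support satisfies `min_rare <= support < min_freq` are all assumptions made for illustration. The sketch uses a plain Eclat-style depth-first search and leaves out the prefix-based rare pattern tree, double hashing, BFS, and the two-item pruning.

```python
from collections import deque


def slide(window, timestamp, items, span):
    """Append a transaction and evict those older than `span` time units (assumed policy)."""
    window.append((timestamp, frozenset(items)))
    while window and window[0][0] <= timestamp - span:
        window.popleft()


def build_bitsets(window):
    """Vertical layout: one integer per item; bit i is set if transaction i contains the item."""
    bitsets = {}
    for tid, (_, items) in enumerate(window):
        for item in items:
            bitsets[item] = bitsets.get(item, 0) | (1 << tid)
    return bitsets


def mine_rare(window, min_rare, min_freq):
    """Return {itemset: support} for itemsets with min_rare <= support < min_freq."""
    bitsets = build_bitsets(window)
    # Items below min_rare cannot appear in any itemset that reaches min_rare.
    items = sorted(i for i, b in bitsets.items() if bin(b).count("1") >= min_rare)
    rare = {}

    def extend(prefix, prefix_bits, candidates):
        for idx, item in enumerate(candidates):
            bits = prefix_bits & bitsets[item]   # intersect transaction-id sets
            support = bin(bits).count("1")
            if support < min_rare:
                continue                         # supersets can only have lower support
            itemset = prefix + (item,)
            if support < min_freq:
                rare[itemset] = support
            extend(itemset, bits, candidates[idx + 1:])

    extend((), (1 << len(window)) - 1, items)
    return rare


# Hypothetical stream of (timestamp, items); each character is one item.
window = deque()
for ts, items in [(1, "ab"), (2, "abc"), (3, "ab"), (4, "acd"), (5, "ab"), (6, "bcd")]:
    slide(window, ts, items, span=5)

result = mine_rare(window, min_rare=2, min_freq=3)
print(result)
# {('a', 'c'): 2, ('b', 'c'): 2, ('c', 'd'): 2, ('d',): 2}
```

With `span=5`, the transaction at timestamp 1 has been evicted by the time timestamp 6 arrives, so only the five most recent transactions are mined. Intersecting bitsets with `&` replaces repeated scans of the window, which is the main appeal of the vertical layout.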

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Vanamala, S., Sree, L.P., Bhavani, S.D. (2022). Rare Pattern Mining from Data Stream Using Hash-Based Search and Vertical Mining. In: Reddy, V.S., Prasad, V.K., Mallikarjuna Rao, D.N., Satapathy, S.C. (eds) Intelligent Systems and Sustainable Computing. Smart Innovation, Systems and Technologies, vol 289. Springer, Singapore. https://doi.org/10.1007/978-981-19-0011-2_48
