A Comparative Analysis on Ensemble Classifiers for Concept Drifting Data Streams

Soft Computing and Medical Bioinformatics

Abstract

Data stream mining plays a vital role in Big Data analytics. Traffic management, sensor networks and monitoring, and weblog analysis are examples of dynamic environments that generate streaming data. In such environments, data arrive at high speed, and the algorithms that process them must satisfy constraints on memory, computation time, and a single scan of the incoming data. A significant challenge in data stream mining is that the data distribution changes over time, a phenomenon called concept drift. A learning model therefore needs to detect these changes and adapt to them. Ensemble classifiers by nature adapt well to change and handle concept drift well. Three ensemble-based approaches are used to handle concept drift: online, block-based, and hybrid. We survey various ensemble classifiers for learning from data streams. Finally, we compare their accuracy, memory use, and running time on synthetic and real datasets under different drift scenarios.
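
To make the block-based idea concrete, the following is a minimal Python sketch, not the chapter's implementation or any specific published algorithm. It assumes a toy one-dimensional stream whose labelling rule flips abruptly halfway through, a nearest-centroid base learner, and a simple scheme in which members are re-weighted by their accuracy on the newest block and the weakest member is dropped. All names (`CentroidLearner`, `BlockEnsemble`, `make_stream`, `run_prequential`) and all parameter values are hypothetical and chosen only for illustration. Evaluation is test-then-train, so each instance is scanned once.

```python
import random


class CentroidLearner:
    """Toy base learner for 1-D inputs: one mean per class, predict the nearest mean."""

    def fit(self, xs, ys):
        sums, counts = {}, {}
        for x, y in zip(xs, ys):
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        self.means = {y: sums[y] / counts[y] for y in sums}
        return self

    def predict(self, x):
        return min(self.means, key=lambda y: abs(x - self.means[y]))


def block_accuracy(learner, xs, ys):
    """Fraction of the block that the learner labels correctly."""
    hits = sum(learner.predict(x) == y for x, y in zip(xs, ys))
    return hits / len(xs)


class BlockEnsemble:
    """Illustrative block-based ensemble (assumed design, not a published method)."""

    def __init__(self, block_size=100, max_members=5):
        self.block_size = block_size
        self.max_members = max_members
        self.members = []  # each entry is [learner, weight]
        self.buf_x, self.buf_y = [], []

    def predict(self, x):
        if not self.members:
            return 0  # arbitrary default before the first block is complete
        votes = {}
        for learner, weight in self.members:
            label = learner.predict(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)

    def learn(self, x, y):
        self.buf_x.append(x)
        self.buf_y.append(y)
        if len(self.buf_x) < self.block_size:
            return
        xs, ys = self.buf_x, self.buf_y
        self.buf_x, self.buf_y = [], []
        # Re-weight existing members by their accuracy on the newest block.
        for member in self.members:
            member[1] = block_accuracy(member[0], xs, ys)
        # Train a new member on the newest block; assume it gets full weight.
        self.members.append([CentroidLearner().fit(xs, ys), 1.0])
        # Keep only the highest-weighted members (bounded memory).
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_members:]


def make_stream(n=2000, drift_at=1000, seed=7):
    """Synthetic stream: the labelling rule reverses abruptly at `drift_at`."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.random()
        y = int(x > 0.5) if t < drift_at else int(x <= 0.5)
        yield x, y


def run_prequential(stream, model):
    """Test-then-train: predict each instance first, then learn from it."""
    outcomes = []
    for x, y in stream:
        outcomes.append(model.predict(x) == y)
        model.learn(x, y)
    return outcomes


outcomes = run_prequential(make_stream(), BlockEnsemble())
for start, end in [(500, 1000), (1000, 1100), (1500, 2000)]:
    acc = sum(outcomes[start:end]) / (end - start)
    print(f"instances {start}-{end - 1}: accuracy {acc:.2f}")
```

In this sketch the ensemble can only react once a full block has arrived, so accuracy collapses during the first block after the drift and recovers afterwards. Reducing that block-sized delay is the usual motivation for online and hybrid approaches.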

Copyright information

© 2019 The Author(s)

Cite this chapter

Nagendran, N., Sultana, H.P., Sarkar, A. (2019). A Comparative Analysis on Ensemble Classifiers for Concept Drifting Data Streams. In: Soft Computing and Medical Bioinformatics. SpringerBriefs in Applied Sciences and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-13-0059-2_7
