Abstract
We address the problem of capturing recurring concepts in a data stream environment. Capturing recurrence allows previously learned classifiers to be reused without re-learning, and it improves accuracy during the interval in which a concept recurs. We capture concepts by applying the Discrete Fourier Transform (DFT) to decision tree classifiers at concept drift points in the stream, which yields highly compressed versions of the trees, and we store these compressed trees in a repository for future use. Our empirical results on real-world and synthetic data exhibiting varying degrees of recurrence show that the Fourier-compressed trees are more robust to noise and capture recurring concepts with higher precision than a meta-learning approach that reuses classifiers in their original form.
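To make the idea of a Fourier-compressed tree concrete, the following is a minimal sketch and not the authors' implementation. It assumes binary features and uses the Walsh-Hadamard form of the DFT on Boolean inputs; the toy tree and all function names (`tree_predict`, `fourier_spectrum`, `truncate`, `classify`) are hypothetical and chosen only for illustration. It computes the spectrum of a small tree, keeps only the low-order coefficients, and classifies from the truncated spectrum.

```python
from itertools import product

def tree_predict(x):
    # Hypothetical depth-2 decision tree over 3 binary features.
    if x[0] == 0:
        return x[1]
    return x[2]

def parity(j, x):
    # Fourier basis function on {0,1}^d: (-1)^(j . x)
    return -1 if sum(a & b for a, b in zip(j, x)) % 2 else 1

def fourier_spectrum(f, d):
    # Exhaustive transform: w_j = (1 / 2^d) * sum_x f(x) * (-1)^(j . x).
    # Exponential in d, so only suitable for a small illustration.
    points = list(product((0, 1), repeat=d))
    return {j: sum(f(x) * parity(j, x) for x in points) / len(points)
            for j in points}

def truncate(spectrum, max_order):
    # Keep nonzero coefficients whose order (number of 1s in j) is small.
    return {j: w for j, w in spectrum.items()
            if sum(j) <= max_order and w != 0}

def reconstruct(spectrum, x):
    # Inverse transform restricted to the stored coefficients.
    return sum(w * parity(j, x) for j, w in spectrum.items())

def classify(spectrum, x):
    # Threshold the reconstructed value to recover a class label.
    return 1 if reconstruct(spectrum, x) >= 0.5 else 0

spectrum = fourier_spectrum(tree_predict, 3)
compressed = truncate(spectrum, max_order=2)  # order bounded by tree depth
for x in product((0, 1), repeat=3):
    assert classify(compressed, x) == tree_predict(x)
```

In this toy case every coefficient above order 2 is zero, so the truncated spectrum reproduces the tree exactly. Lowering `max_order` trades accuracy for a smaller stored representation.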
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Sripirakas, S., Pears, R. (2014). Mining Recurrent Concepts in Data Streams Using the Discrete Fourier Transform. In: Bellatreche, L., Mohania, M.K. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2014. Lecture Notes in Computer Science, vol 8646. Springer, Cham. https://doi.org/10.1007/978-3-319-10160-6_39
DOI: https://doi.org/10.1007/978-3-319-10160-6_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10159-0
Online ISBN: 978-3-319-10160-6
eBook Packages: Computer Science, Computer Science (R0)