Abstract
Supporting continuous mining queries on data streams requires algorithms that (i) are fast, (ii) make light demands on memory resources, and (iii) adapt easily to concept drift. We propose a novel boosting ensemble method that achieves these objectives. The technique is based on a dynamic sample-weight assignment scheme that achieves the accuracy of traditional boosting without requiring multiple passes through the data. The technique ensures faster learning and competitive accuracy using simpler base models. The scheme is then extended to handle concept drift via change detection. The change detection approach targets significant data changes that could cause serious deterioration of the ensemble performance, and replaces the obsolete ensemble with one built from scratch. Experimental results confirm the advantages of our adaptive boosting scheme over previous approaches.
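The abstract gives neither the weighting formula nor the change test, so the following Python sketch only illustrates the general idea of single-pass, block-wise boosting with a rebuild on detected change. Everything specific in it is an assumption made for illustration: the names `Stump` and `StreamBooster`, the parameters `max_models` and `drift_margin`, the weight `(1 - e) / e` for samples the current ensemble misclassifies, the unweighted majority vote, and the fixed-margin drift check, which stands in for whatever statistical test the paper actually uses.

```python
class Stump:
    """One-feature threshold classifier fitted to weighted samples (labels 0/1)."""

    @staticmethod
    def _rule(value, threshold, sign):
        return int(value > threshold) if sign == 1 else int(value <= threshold)

    def fit(self, X, y, w):
        best = None
        for j in range(len(X[0])):
            for t in sorted({x[j] for x in X}):
                for sign in (1, -1):
                    # Weighted error of the candidate rule on this block.
                    err = sum(wi for x, yi, wi in zip(X, y, w)
                              if self._rule(x[j], t, sign) != yi)
                    if best is None or err < best[0]:
                        best = (err, j, t, sign)
        _, self.j, self.t, self.sign = best
        return self

    def predict(self, x):
        return self._rule(x[self.j], self.t, self.sign)


class StreamBooster:
    """Hypothetical block-wise boosting ensemble; each block is seen once."""

    def __init__(self, max_models=5, drift_margin=0.25):
        self.max_models = max_models      # memory bound: keep only recent models
        self.drift_margin = drift_margin  # assumed threshold, not from the paper
        self.models = []
        self.baseline = None              # lowest ensemble error seen so far

    def predict(self, x):
        if not self.models:
            return 0
        votes = sum(m.predict(x) for m in self.models)
        return int(2 * votes >= len(self.models))  # majority vote, ties -> 1

    def update(self, X, y):
        """Learn from one block; return the ensemble error measured beforehand."""
        err = None
        weights = [1.0] * len(X)
        if self.models:
            wrong = [self.predict(x) != yi for x, yi in zip(X, y)]
            err = sum(wrong) / len(X)
            if self.baseline is not None and err - self.baseline > self.drift_margin:
                # Error jumped: discard the ensemble and start from scratch.
                self.models, self.baseline = [], None
            else:
                self.baseline = err if self.baseline is None else min(self.baseline, err)
                if 0.0 < err < 0.5:
                    # Emphasise the samples the current ensemble gets wrong.
                    boost = (1.0 - err) / err
                    weights = [boost if wr else 1.0 for wr in wrong]
        self.models.append(Stump().fit(X, y, weights))
        del self.models[:-self.max_models]  # drop the oldest models beyond the cap
        return err
```

Calling `update` once per arriving block keeps memory bounded by `max_models` and never revisits earlier data; a sudden rise in the returned error followed by a one-model ensemble indicates that the sketch treated the block as a concept change.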
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chu, F., Zaniolo, C. (2004). Fast and Light Boosting for Adaptive Mining of Data Streams. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_36
DOI: https://doi.org/10.1007/978-3-540-24775-3_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22064-0
Online ISBN: 978-3-540-24775-3
eBook Packages: Springer Book Archive