Abstract
Concept drift is a common challenge for many real-world data mining and knowledge discovery applications. Most existing studies of concept drift assume centralized settings and are often hard to adapt to a distributed computing environment. In this paper, we investigate a new research problem, P2P concept drift detection, which aims to effectively classify drifting concepts in P2P networks. We propose a novel P2P learning framework for concept drift classification, which includes both reactive and proactive approaches to classify the drifting concepts in a distributed manner. Our empirical study shows that the proposed technique is able to effectively detect the drifting concepts and improve the classification performance.
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ang, H.H., Gopalkrishnan, V., Ng, W.K., Hoi, S. (2010). On Classifying Drifting Concepts in P2P Networks. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science, vol 6321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15880-3_8
DOI: https://doi.org/10.1007/978-3-642-15880-3_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15879-7
Online ISBN: 978-3-642-15880-3
eBook Packages: Computer Science (R0)