Abstract
Networked data is emerging in large volumes in fields such as social networks, biological networks, and research publication networks. Classifying networked data is therefore of critical importance in real-world applications, and link information has been observed to improve learning performance. However, classifying such data is challenging for three reasons: 1) the original links (also referred to as relations) in such networks are typically sparse, incomplete, and noisy; 2) it is not easy to characterize, select, and leverage effective link information from networks that involve multiple types of links with distinct semantics; 3) it is difficult to seamlessly integrate link information with attribute information in a network. To address these challenges, we develop a novel Seamlessly-integrated Link-Attribute Collective Matrix Factorization (SLA-CMF) framework, which mines highly effective link information from an arbitrary information network and combines it with attribute information in a unified framework. Algorithmically, SLA-CMF first mines highly effective link information via link path weighting and link strength learning. It then learns a low-dimensional joint link-attribute representation via graph Laplacian CMF. Finally, the joint representation is fed into a traditional classifier such as an SVM for classification. Extensive experiments on benchmark datasets demonstrate the effectiveness of our method.
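The pipeline outlined in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' SLA-CMF implementation: the abstract does not give the objective, so the loss below (a shared factor U fitted to both the attribute matrix X and a combined link matrix A, plus a graph Laplacian smoothness term) is an assumed, generic form of graph-regularized collective matrix factorization. The link-type names, the hand-set weights 0.3 and 0.7 (which the paper learns rather than fixes), the toy data, and every function name are hypothetical.

```python
import random

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix for an undirected edge list."""
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    return A

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def combine(*terms):
    """Elementwise weighted sum of equally shaped matrices, given as (weight, matrix) pairs."""
    first = terms[0][1]
    return [[sum(c * M[i][j] for c, M in terms) for j in range(len(first[0]))]
            for i in range(len(first))]

def sq_norm(A):
    return sum(v * v for row in A for v in row)

def laplacian(A):
    """Unnormalized graph Laplacian L = D - A."""
    n = len(A)
    return [[(sum(A[i]) if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]

def joint_factorize(X, A, k=2, alpha=0.1, beta=1.0, lam=0.01, lr=0.01, steps=300, seed=0):
    """Gradient descent on the assumed loss
    ||X - U V^T||^2 + beta ||A - U W^T||^2 + alpha tr(U^T L U) + lam (||U||^2 + ||V||^2 + ||W||^2).
    A must be symmetric. Returns the joint representation U and the loss history."""
    rng = random.Random(seed)

    def init(rows):
        return [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(rows)]

    U, V, W = init(len(X)), init(len(X[0])), init(len(A))
    L = laplacian(A)
    history = []
    for _ in range(steps):
        RX = combine((1.0, X), (-1.0, matmul(U, transpose(V))))  # attribute residual
        RA = combine((1.0, A), (-1.0, matmul(U, transpose(W))))  # link residual
        LU = matmul(L, U)
        smooth = sum(u * v for ur, lr_ in zip(U, LU) for u, v in zip(ur, lr_))  # tr(U^T L U)
        history.append(sq_norm(RX) + beta * sq_norm(RA) + alpha * smooth
                       + lam * (sq_norm(U) + sq_norm(V) + sq_norm(W)))
        gU = combine((-2.0, matmul(RX, V)), (-2.0 * beta, matmul(RA, W)),
                     (2.0 * alpha, LU), (2.0 * lam, U))
        gV = combine((-2.0, matmul(transpose(RX), U)), (2.0 * lam, V))
        gW = combine((-2.0 * beta, matmul(transpose(RA), U)), (2.0 * lam, W))
        U = combine((1.0, U), (-lr, gU))
        V = combine((1.0, V), (-lr, gV))
        W = combine((1.0, W), (-lr, gW))
    return U, history

# Toy network: 6 nodes, two hypothetical link types, 4 binary attributes.
cite = adjacency(6, [(0, 1), (1, 2), (3, 4), (2, 3)])
coauthor = adjacency(6, [(0, 2), (4, 5), (3, 5)])
A = combine((0.3, cite), (0.7, coauthor))  # hand-set link-type weights (assumption)
X = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0],
     [0, 0, 1, 1], [0, 0, 1, 0], [0, 0, 1, 1]]

U, history = joint_factorize(X, A)
print(len(U), "nodes embedded in", len(U[0]), "dimensions")
```

Each row of U is a low-dimensional representation of one node that reflects both its attributes and its weighted links; in the spirit of the final step described above, those rows could be passed to any off-the-shelf classifier such as an SVM. Pure-Python list arithmetic is used only to keep the sketch dependency-free; an array library would be the natural choice at realistic scale.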
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhao, Y., Sun, Z., Xu, C., Hao, H. (2015). Seamlessly Integrating Effective Links with Attributes for Networked Data Classification. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_37
DOI: https://doi.org/10.1007/978-3-319-18032-8_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18031-1
Online ISBN: 978-3-319-18032-8
eBook Packages: Computer Science (R0)