Abstract
Information networks provide a powerful representation of entities and the relationships between them. Information networks fusion is an information fusion technique that jointly reasons about entities, links, and relations across multiple sources. However, existing methods for information networks fusion tend to rely on a single task, which may not supply enough evidence for reasoning. To address this issue, we present a novel model called MC-INFM (information networks fusion model based on multi-task coordination). Unlike traditional models, MC-INFM casts fusion as a probabilistic inference problem and collectively performs multiple tasks (entity resolution, link prediction, and relation matching) to infer the final fusion result. First, we define intra-features and inter-features and model them as factor graphs, which provide abundant evidence for inference. Then, we use a conditional random field (CRF) to learn the weight of each feature and infer the results of all tasks simultaneously by performing maximum probabilistic inference. Experiments demonstrate the effectiveness of the proposed model.
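The abstract gives no formulas, so the following is only a toy sketch of the general idea of joint inference over intra-features and inter-features, not the MC-INFM model itself. It assumes three hypothetical binary decisions (one per task), hand-set weights rather than weights learned by a CRF, and brute-force enumeration in place of a real inference algorithm. All names and numbers are invented for illustration.

```python
from itertools import product

# Hypothetical binary decisions, one per task (names are illustrative only).
VARIABLES = ("entity_match", "link_exists", "relation_match")

# Intra-features: evidence local to a single task, scored for label 1.
# The values are made up; a real model would learn them from data.
INTRA = {"entity_match": 1.2, "link_exists": -0.3, "relation_match": 0.4}

# Inter-features: reward when two tasks are both labeled 1 (made-up weights).
INTER = {
    ("entity_match", "link_exists"): 0.8,
    ("entity_match", "relation_match"): 0.5,
}


def score(assignment, use_inter=True):
    """Unnormalised log-score: weighted sum of the features that fire."""
    total = sum(INTRA[v] * assignment[v] for v in VARIABLES)
    if use_inter:
        total += sum(
            w * assignment[a] * assignment[b] for (a, b), w in INTER.items()
        )
    return total


def map_assignment(use_inter=True):
    """Return the highest-scoring labeling by enumerating all 2^n options."""
    candidates = (
        dict(zip(VARIABLES, bits))
        for bits in product((0, 1), repeat=len(VARIABLES))
    )
    return max(candidates, key=lambda a: score(a, use_inter))


# Each task decided on its own evidence versus all tasks decided jointly.
independent = map_assignment(use_inter=False)
joint = map_assignment(use_inter=True)
print("independent:", independent)
print("joint:      ", joint)
```

With these invented weights, the link decision is rejected when scored on its own evidence but accepted once the entity-match evidence is allowed to support it, which is the kind of cross-task evidence sharing the abstract describes.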
Acknowledgements
This work was supported by the National Key R&D Program of China (2018YFB1003404) and the National Natural Science Foundation of China (Grant Nos. 61672142, U1435216, 61602103).
Author information
Dong Li is a PhD candidate. He received his Master's degree in Computer Technology from Northeastern University, China in 2008. His research interests include social network analysis and data mining.
Derong Shen received her PhD degree in Computer Software and Theory from Northeastern University, China in 2004. Currently, she is a professor in the School of Computer Science & Engineering, Northeastern University, China. Her research interests include social network analysis and data integration. She is a senior member of CCF and a member of IEEE and ACM.
Yue Kou received her PhD degree in Computer Software and Theory from Northeastern University, China in 2009. Currently, she is an associate professor in the School of Computer Science & Engineering, Northeastern University, China. Her research interests include social network analysis and data mining. She is a member of CCF, IEEE, and ACM.
Tiezheng Nie received his PhD degree in Computer Software and Theory from Northeastern University, China in 2009. Currently, he is an associate professor in the School of Computer Science & Engineering, Northeastern University, China. His research interests include data integration and data mining. He is a member of CCF, IEEE, and ACM.
Cite this article
Li, D., Shen, D., Kou, Y. et al. Information networks fusion based on multi-task coordination. Front. Comput. Sci. 15, 154608 (2021). https://doi.org/10.1007/s11704-020-9195-9
DOI: https://doi.org/10.1007/s11704-020-9195-9