Abstract
Network diffusion, such as spread of ideas, rumors, contagious disease, or a new type of behaviors, is one of the fundamental processes within networks. Designing effective strategies for influence spread maximization, rumor spread minimization, or epidemic immunization has attracted considerable research attention. However, a key challenge is that many times we can only observe the trace of contagion spreading across network, but the underlying network structure is unknown and the transmission rates between node pairs are unclear to us. In this paper, given the observed information cascades, we aim to address two problems: diffusion network structure inferring and information diffusion pathways tracking. We propose a novel probabilistic model called Network Inferring from Multidimensional Features of Cascades (NIMFC) which takes into account heterogeneous features, including temporal and topological features of cascades, node attributes, and information content, to infer the latent network structure and transmission rates of edges. Also, based on the inferred network structure, we may track diffusion pathways of a cascade in social networks. We use blocked coordinate descent method to learn a sparse estimation of the latent network. Our proposed model NIMFC is evaluated both on large synthetic and real-world data sets, and experimental results show that our method significantly outperforms state-of-the-art models both in terms of recovering the latent network structure and information pathway tracking.
抽象
创新点
近几年来, 随着互联网的快速发展, 在线社交网络中的信息传播越来越普遍, 包括热点话题传播, 观点传播, 谣言传播, 购买行为或流行时尚的传播等等. 与网络传播行为相关的研究话题, 包括影响力传播最大化, 意见领袖挖掘, 话题发现和跟踪, 事件爆发检测等, 正吸引着越来越多的研究兴趣.
这些研究所面临的一个共同挑战是, 很多时候我们只能观察到网络中的传播现象, 也即不同节点在不同时间 “感染” 了目标 “传染源”, 却不知道潜在的网络结构, 以及网络中结点之间的传播速率.
我们将信息在网络中传播时所留下的时空轨迹称为传播级联. 本文基于信息传播级联数据主要研究两方面的问题, 分别是传播网络结构推断和信息传播路径跟踪.
本文提出了一个基于信息级联多维特征融合的网络推断算法 (NIMFC), 该算法融合了多维异构特征, 包括信息级联的时序和拓扑特征, 节点属性, 信息内容, 来推断潜在网络结构, 以及节点之间的传播速率. 另外, 该算法还可以基于已推断的网络结构, 进一步推断信息在过去时间里的传播路径.
最后, 为了验证算法的效果和性能, 我们在人工合成数据集和新浪微博真实数据集上进行了实验, 结果表明在网络结构推断和信息传播路径跟踪准确率两个方面, NIMFC 算法均优于同类算法.
-
(1)
提出了一个基于信息级联多维特征融合的网络推断算法 (NIMFC), 该算法融合了多维异构特征包括信息级联的时序和拓扑特征, 节点属性, 信息内容, 来推断潜在网络结构, 以及节点之间的传播速率.
-
(2)
基于网络推断算法 (NIMFC), 提出信息传播路径跟踪算法. 该算法可以根据信息传播级联, 推测信息最大可能的历史传播路径.
-
(3)
将网络推断算法 (NIMFC) 的求解归约为一个带 lasso 正则项的最优化问题, 证明该问题具有全局最优解, 并基于坐标梯度下降算法求解该问题.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Xiong X B, Zhou G, Huang Y Z, et al. Dynamic evolution of collective emotions in social networks: a case study of Sinaweibo. Sci China Inf Sci, 2013, 56: 078101
Khondker H H. Role of the new media in the Arab Spring. Globalizations, 2011, 8: 675–679
Tumasjan A, Sprenger T O, Sandner P G, et al. Predicting elections with twitter: what 140 characters reveal about political sentiment. In: Proceedings of the 4th International AAAI Conference on Weblogs and Social Media. Menlo Park: AAAI, 2010. 178–185
Kempe D, Kleinberg J, Tardos E. Maximizing the spread of influence through a social network. In: Proceedings of the 9th ACMSIGKDD International Conference on Knowledge Discovery and DataMining. New York: ACM, 2003. 137–146
Wang Y, Huang W J, Zong L, et al. Influence maximization with limit cost in social network. Sci China Inf Sci, 2013, 56: 078102
Nguyen N P, Yan G, Thai M T. Analysis of misinformation spread containment in online social networks. Comput Netw, 2013, 57: 2133–2146
Leskovec J, Krause A, Guestrin C, et al. Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2007. 420–429
Gomez-Rodriguez M, Balduzzi D, Scholkopf B. Uncovering the temporal dynamics of diffusion networks. In: Proceedings of the 28th International Conference on Machine Learning. New York: ACM, 2011. 561–568
Gomez-Rodriguez M, Leskovec J, Krause A. Inferring networks of diffusion and influence. In: Proceedings of the 16th ACMSIGKDD International Conference on Knowledge Discovery in Data Mining. New York: ACM, 2010. 1019–1028
Myers S, Leskovec J. On the convexity of latent social network inference. In: Proceedings of the 24th Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2010. 1741–1749
Netrapalli P, Sanghavi S. Learning the graph of epidemic cascades. ACM SIGMETRICS Perform Eval Rev, 2012, 40: 211–222
Snowsill T, Fyson N, de Bie T, et al. Refining causality: who copied from whom? In: Proceedings of the 17th ACMSIGKDD International Conference on Knowledge Discovery in Data Mining. New York: ACM, 2011. 466–474
Wang L, Ermon S, Hopcroft J E. Feature-enhanced probabilistic models for diffusion network inference. Lect Notes Comput Sci, 2012, 7524: 499–514
Du N, Song L, Woo H, et al. Uncover topic-sensitive information diffusion networks. In: Proceedings of the 16th International Conference on Artificial Intelligence and Statistics. Cambridge: MIT Press, 2013. 229–237
Gomez-Rodriguez M, Leskovec J, Scholkopf B. Structure and dynamics of information pathways in online media. In: Proceedings of the 6th ACM International Conference on Web Search and Data Mining. New York: ACM, 2013. 23–32
Eagle N, Pentland A S, Lazer D. From the cover: inferring friendship network structure by using mobile phone data. In: Proceedings of the National Academy of Sciences. Washington DC: National Academies Press, 2009. 15274–15278
Kolar M, Song L, Ahmed A, et al. Estimating time-varying networks. Ann Appl Stat, 2010, 4: 94–123
Yang S, Zha H. Mixture of mutually exciting processes for viral diffusion. In: Proceedings of the 30th International Conference on Machine Learning. New York: ACM, 2013. 1–9
Aalen O O, Borgan O, Gjessing H K. Survival and event history analysis: a process point of view. Berlin/Heidelberg: Springer-Verlag, 2008. 5–36
Blei D M, Ng A Y, Jordan M I. Latent dirichlet allocation. J Mach Learn Res, 2003, 3: 993–1022
Crandall D, Cosley D, Huttenlocher D, et al. Feedback effects between similarity and social influence in online communities. In: Proceeding of the 14th ACMSIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2008. 160–168
McPherson M, Smith-Lovin L, Cook J M. Birds of a feather: homophily in social networks. Annl Rev Sociol, 2001, 27: 415–444
Wallinga J, Teunis P. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. Amer J Epidemiol, 2004, 160: 509–516
Tseng P, Mangasarian C O L. Convergence of a block coordinate descent method for nondifferentiable minimization. J Opt Theory Appl, 2001, 109: 475–494
Leskovec J, Chakrabarti D, Kleinberg J, et al. Kronecker graphs: an approach to modeling networks. J Mach Learn Res, 2010, 11: 985–1042
Myers S, Zhu C, Leskovec J. Information diffusion and external influence in networks. In: Proceeding of the 18th ACMSIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2012. 33–41
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Zhou, D., Han, W., Wang, Y. et al. Information diffusion network inferring and pathway tracking. Sci. China Inf. Sci. 58, 1–15 (2015). https://doi.org/10.1007/s11432-015-5288-8
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11432-015-5288-8
Keywords
- diffusion network
- network inferring
- information pathway tracking
- temporal feature
- node attributes
- topological feature
- information cascades