Abstract
Signed social networks have both positive and negative links which convey rich information such as trust or distrust, like or dislike. However, existing network embedding methods mostly focus on unsigned networks and ignore the negative interactions between users. In this paper, we investigate the problem of learning representations for signed networks and present a novel deep network structure to incorporate both the balance and status theory in signed networks. With the proposed framework, we can simultaneously learn the node embedding encoding the status of a node and the edge embedding denoting the sign of an edge. Furthermore, the learnt node and edge embeddings can be directly applied to the sign prediction and node ranking tasks. Experiments on real-world social networks demonstrate that our model significantly outperforms the state-of-the-art baselines.
Access provided by CONRICYT-eBooks. Download conference paper PDF
Similar content being viewed by others
Keywords
1 Introduction
In recent years, online signed networks are proliferating rapidly. For example, the consumer review sites like Epinions allow members decide whether to trust each other; news websites such as Slashdot allow users to tag each other as friends or foes. In the signed networks, the relations between entities convey rich information and are signed positively or negatively. Signed network are useful in many applications like recommendation, advertisement, and community detection. Among which, sign prediction and ranking nodes constitute the foundations for sign network analysis. The task of sign prediction [9, 18] is to infer signs of existing links. Links with positive signs may show trust and agreement, while negative links may represent the distrust and disagreement. It is essential to identify the positive or negative relation between two users. For example, when recommending friends for a user u in a social network, it is required not to list u’s foes in the candidates. Sign prediction can be viewed as a sub-task of link prediction [15] but has received very limited attentions [3, 9, 17, 18] by now.
Ranking nodes aims to find out how important a node is based on its reputation [2, 8]. The solution to this problem is traditionally developed in the area of webpage ranking, and the PageRank [2] and HITS [8] are two classic algorithms. More recently, ranking in signed network has aroused great research interests due to its wide applications like targeted advertisement or trust network construction. However, most of existing approaches [7, 12, 16] are simple modifications of PageRank and HITS and cannot deal with signed links directly. Hence new approaches are desired to include the impacts from both positive and negative links.
We follow this line of work and consider the problem of sign prediction and node ranking in signed networks. Our approach is motivated by balance [6] and status [5] theory. We build on the work of SiNE [18], which considers balance theory as “users should sit closer to their friends than their foes” and adopts a similarity based function to measure such a relation. We go beyond a single usage of balance theory and propose a novel framework which incorporates the balance and status theory into a unified deep learning framework.
2 The Proposed BASSI Model
2.1 Balance Theory and Status Theory
Balance and status are two fundamental theories in social science. Balance theory is originally defined for undirected networks to model the relations of likes and dislikes. It implies that “the friend of my friend is my friend” and “the enemy of my enemy is my friend” [6]. Status theory [5, 10] is proposed to represent the social status of the people in directed networks, where the status may denote the relative ranking, prestige, or skill level. For example, a positive/negative link from a to b denotes “b has higher/lower status than a”. The status relation should be transitive, which means that “a person respected me should be respected by my subordinate”. In order to illustrate the balance and status theory in the directed sign network, we adopt the triangle representations [10, 15] and list the possibilities of 12 triangles in Fig. 1.
For example, for \(T_{15}\) in Fig. 1, balance theory indicates that the sign of edge \(e_{ij}\) should be a “+” (j is i’s friend) given that k is an enemy of i and k is an enemy of j. Similarly, for \(T_{22}\) in Fig. 1, status theory suggests that the sign of edge \(e_{ij}\) should be a “−” (the status of i is higher than that of j) given that i has higher status than k and k has higher status than j.
In real-world signed networks, Leskovec et al. found that more than 0.9 of triangles satisfy balance [9] and status theory [10], respectively. Hence in the following, we investigate how to incorporate the balance and status theory in a unified framework.
2.2 Modeling Balance Theory
We now take the triangle \(T_{15}\) in Fig. 1 as an example to show the the detail of modeling balance theory. As we illustrated in the previous section, given that \(e_{jk} = -\) and \(e_{ik} = -\), we have \(e_{ij} = + \) if we follow the rule of “the enemy of my enemy is my friend”. Mathematically, taking the sign of three edges together, our goal is to maximize the following objective function.
where \(J^b_{T_{15}}\) is the overall probability for predicting the sign of three edges in \(T_{15}\) guided by balance theory. Taking all triangles in the network into consideration, we can define the objective function \(J_{bal}\) for triangles satisfying balance and use a loss function \(L_{bal}\) to measure the difference between the observation and the prediction as:
where \(L^b_{\triangle }\) denotes the loss for sampled triangles, \(T_{sam}\) is the set of triangles satisfying balance theory and \(\lambda ^b_{ij}, \lambda ^b_{ik}, \lambda ^b_{jk}\) are the indicator function denoting whether an edge is missing from the sampled triangle.
\(L^b_{ij}\) in Eq. 2 is used to measure the difference between the predicted value \(P(+|e_{ij})\) for the sign of the edge \(e_{ij}\) and the ground truth value \(y_{ij}\), and it can be defined using a cross-entropy loss function. Hence we have:
Similarly we can define the loss for other edges and get \(L^b_{ik}\), \(L^b_{jk}\) in Eq. 2.
2.3 Modeling Status Theory
We continue to use the triangle \(T_{15}\) in Fig. 1 to show the the detail. We denote the status ranking score of node \(v_i\), \(v_j\), \(v_k\) as \(R_i\), \(R_j\), \(R_k\) respectively. Given that \(e_{ij} = +\) and \(e_{ik} = -\) in \(T_{15}\), we have the status relationships as \(R_i < R_j\) and \(R_i > R_k\) if we follow the rule of “the person respected by me should have higher status than me”. Mathematically, taking the status relationships of two edges together, our goal is to minimize the following objective function.
where \(J^s_{T_{15}}\) is the overall distances from status relationships \(R_i<R_j\) and \(R_i>R_k\), and Q is the function to measure the distances between true status relationship (\(R_i<R_j\)) and predicted status relationship value (\(R_i-R_j\)). Taking all triangles in the network into consideration, we can define the objective function \(J_{sta}\) for the triangles satisfying status theory and use a loss function \(L_{sta}\) to measure the difference between the observation and the prediction as:
\(L^s_{ij}\) in Eq. 5 is used to measure the difference between the predicted status relationship value \(R_i-R_j\) from the edge \(e_{ij}\) and the “ground truth” value \(q_{ij}\), and it can be defined using a square loss function. Hence we have:
Similarly we can define the loss for other edges and get \(L^s_{ik}\), \(L^s_{ki}\) in Eq. 5. Since we do not have the ground truth value \(q_{ij}\) measuring status relationship between \(v_i\) and \(v_j\), we define the following boundary function to calculate the value of \(q_{ij}\).
where \(\gamma \) is a threshold of status relationship value.
2.4 BASSI Model
Based on the mathematically modeled balance and status theory, we now propose a novel BASSI model to combine two theories into a unified deep network. The whole objective function of BASSI can be written as:
where \(L_{reg}=||\mathbf {\Theta }||_2^2\) is the \(L_2\) regularizer for all parameters \(\mathbf {\Theta }\) in the neural network and \(\lambda _{reg}\) is the corresponding weighing factor (\(\lambda _{reg}=0.0001\) in our experiments). With the objective function \(L_{bal}\) and \(L_{sta}\) given above, our task now is to find a function f to measure the probability \(P(+|e_{ij})\) or \(P(-|e_{ij})\) of the sign of an edge \(e_{ij}\) and a function g to get the ranking score \(R_i\) of a node \(v_i\). Motivated by recent advances in deep learning which has been proven to be powerful in learning nonlinear representations, we propose a deep neural network to learn the embedding of nodes and edges as well as the function f and g. Figure 2 shows the architecture of BASSI.
The input to the framework is the set of triplets (\(\mathbf {x_i, x_j, x_k}\)) denoting the embedding of the nodes in triangles extracted from the signed network. The upper part is the embedding process based on balance theory, where we start from the node embedding layer to optimize the relationships in a triangle in next layers. More formally, we define the the probability \(P(+|e_{ij})\) of the sign of an edge \(e_{ij}\) as:
where \(\sigma \) is a sigmoid function, \(\mathbf {W_1}\) and \(\mathbf {b_1}\) is the the weight and bias in the second layer of the upper part, respectively, and \(\mathbf {e_{ij}}\) is the embedding of an edge \(e_{ij}\) and defined as:
where \(\mathbf {W_{eh}}\) and \(\mathbf {W_{et}}\) are the weights and \(\mathbf {b_e}\) the bias in the first layer of the upper part.
The lower part is the embedding process based on status theory, where we begin with the node embedding layer to model the relationships between nodes in the point of view of ranking scores. Formally, we define the following ranking score function g for a node \(v_i\) as:
where \(\mathbf {W_2}\), \(\mathbf {b_2}\), and \(\mathbf {W_3}\), \(\mathbf {b_3}\) are the weights and biases in the first and second layer of the lower part.
To train our BASSI model, we simply adopt the stochastic gradient descent approach [1] to update the parameters in the neural network and get the representations for nodes \(\mathbf {X}\) in the signed network.
3 Experimental Evaluation on Sign Prediction
We first conduct the sign prediction experiment to check whether our embeddings improve the performance of signed network analysis [18].
3.1 Settings
We conduct experiments on two well known and publicly available signed social network datasets. Slashdot [10] is a technology-related news website known for its specific user community. Users are allowed to tag each other as friends or foes. Epinions [10] is an online social network of a general consumer review site. Members of the site can decide whether to “trust” each other.
For this prediction experiment, we use four state-of-the-art baselines: DeepWalk [11], LINE [13], FExtra [9], and SiNE [18]. We randomly select 80% edges as training set and the remaining 20% as the test set. After we get node embeddings from DeepWalk, LINE or SiNE, we follow the way in [4] to get edge features through four operators (average, Hadamard, weighted-L1, weighted-L2) and then choose the best one to display. With the learnt edge features, we train a logistic regression classifier on training set and use it to predict the edge sign in test set. We run with different random seed for 5 times to get the average scores. We set all the embedding dimension 20 as SiNE [18] does. For the main hyper-parameters in BASSI, we set \(epoch=10\), status difference gap \(\gamma =0.5\), missing third edge learning rate \(\eta =0.01\). For baselines, we follow the default settings in their paper.
3.2 Results for Sign Prediction
We report the average auc, macro-F1 and micro-F1 as evaluation metrics as those in [9, 18]. Table 1 shows the results. Scores in bold denote the highest performance among all methods.
We can observe that on both datasets our BASSI model significantly outperforms all baselines in terms of three metrics. (1) DeepWalk and LINE are the worst, showing that it is not suitable to directly apply unsigned network embedding methods to this problem. (2) SiNE performs better than LINE and DeepWalk in most of cases while it is worse than FExtra and BASSI. This can be due to that SiNE ignores sign’s direction information as it is designed for undirected signed network. (3) FExtra is the second best, but it cannot compete with BASSI. For example, BASSI has the macro-F1 score of 0.8714, while that for FExtra is 0.8070, showing a 7.98% increase. The reason can be that the embedding process in BASSI can capture more complex latent relations than the feature-engineering approach FExtra.
4 Experimental Evaluation on Node Ranking
Based on the status theory in signed network, the ranking score of each node can be seen as its status. To demonstrate the property of such ranking scores in real-world signed social network, we design a status comparison experiment. Specifically, we simplify the comparison procedure used in the [12, 14] as follows: we take the test edges as the ground truth, and compare the status between two adjacent nodes by their ranking scores. For example, \(v_i \overset{+(-)}{\longrightarrow } v_j\) can be transformed into \(R_i <(>) R_j\) based on status theory. In this way, we can measure how the ranking scores generated by different methods are consistent with the ground truth.
We conduct the status comparison experiments on the Epinions and Slashdot datasets. We use the following six baselines: Prestige [20], PageRank [2], Exp [16], PageRank [2], MPR [12], MHITS [19] and Troll-Trust [19]. We use 80% edges as training set and 20% edges as test set and use accuracy an the evaluation metric as that in [12, 19]. For Troll-Trust, we test different combinations of \(\beta =[0.01\), 0.1, 0.2, 0.5, 0.9] and \(\lambda _1 = [0.1\), 0.5, 1.0, 5.0, 10.0, 100.0] and choose the best results on Slashdot and Epinions. For other baselines, we follow the setting of Troll-Trust [19]. For our BASSI model, we simply get the ranking scores from the model trained in the sign prediction task. The results are shown in Table 2.
It is clear that our BASSI model preserves the status property significantly better than any other ranking methods in both networks. This could be attributed to that BASSI unifies two theories in a framework, and it benefits from the balance theory to learn better status representations for the nodes.
5 Conclusion
In this paper, we propose a novel BASSI (BAlance and Status combined SIgned network embedding) model to learn the node and edge representations in social networks. In particular, we first define two new objective functions to mathematically modeling the balance theory and status theory. We then design a deep neural structure to combine two theories in a unified framework. Based on the deep network, we learn the node embedding and edge embedding denoting the status of a node and the sign of an edge, which can be directly used in node ranking and sign prediction tasks. We conduct extensive experiments on real world networks. The results demonstrate that our model significantly outperforms the state-of-the-art baselines.
In the future, we plan to investigate how the learnt representations can be used in other applications like community detection or recommendation. We are also interested in making connections between our model and the social properties of real world networks.
Change history
01 July 2018
The original version of this chapter titled “Free-Rider Episode Screening via Dual Partition Model” contained the following three mistakes:
- 1.
In table 1, row 2, column 3, the average occurrence per event on STK dataset was “1.037”. It should be “1,037”.
- 2.
The last model name in the legend of Figure 3 was “EIP”. It should be “EDP”.
- 3.
In the experiment part the stock symbols and their companies were confused.
In the updated version these mistakes were corrected.
In the originally published version of chapters titled “BASSI: Balance and Status Combined Signed Network Embedding” and “Sample Location Selection for Efficient Distance-Aware Influence Maximization in Geo-Social Networks” the funding information in the acknowledgement section was incomplete. This has now been corrected.
- 1.
References
Bottou, L.: Stochastic gradient learning in neural networks. Proc. Neuro-Nımes 91(8) (1991)
Brin, S., Page, L.: Reprint of: the anatomy of a large-scale hypertextual web search engine. Comput. Netw. 56(18), 3825–3833 (2012)
Chiang, K.Y., Natarajan, N., Tewari, A.: Exploiting longer cycles for link prediction in signed networks. In: Proceedings of CIKM, pp. 1157–1162 (2011)
Grover, A., Leskovec, J.: node2vec: Scalable feature learning for networks. In: Proceedings of SIGKDD, pp. 855–864 (2016)
Guha, R., Kumar, R., Raghavan, P., Tomkins, A.: Propagation of trust and distrust. In: Proceedings of WWW, pp. 403–412. ACM (2004)
Heider, F.: Attitudes and cognitive organization. J. Psychol. 21(1), 107–112 (1946)
de Kerchove, C., Van Dooren, P.: The PageTrust algorithm: how to rank web pages when negative links are allowed? In: Proceedings of SDM, pp. 346–352. SIAM (2008)
Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. JACM 46(5), 604–632 (1999)
Leskovec, J., Huttenlocher, D., Kleinberg, J.: Predicting positive and negative links in online social networks. In: Proceedings of WWW, pp. 641–650. ACM (2010)
Leskovec, J., Huttenlocher, D., Kleinberg, J.: Signed networks in social media. In: Proceedings of SIGCHI, pp. 1361–1370. ACM (2010)
Perozzi, B., Al-Rfou, R., Skiena, S.: Deepwalk: online learning of social representations. In: Proceedings of SIGKDD, pp. 701–710 (2014)
Shahriari, M., Jalili, M.: Ranking nodes in signed social networks. SNAM 4(1), 172 (2014)
Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., Mei, Q.: Line: large-scale information network embedding. In: Proceedings of WWW, pp. 1067–1077 (2015)
Tang, J., Chang, S., Aggarwal, C., Liu, H.: Negative link prediction in social media. In: Proceedings of ICDM, pp. 87–96. ACM (2015)
Tang, J., Chang, Y., Aggarwal, C., Liu, H.: A survey of signed network mining in social media. ACM Comput. Surv. (CSUR) 49(3), 42 (2016)
Traag, V.A., Nesterov, Y.E., Van Dooren, P.: Exponential ranking: taking into account negative links. In: Bolc, L., Makowski, M., Wierzbicki, A. (eds.) SocInfo 2010. LNCS, vol. 6430, pp. 192–202. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-16567-2_14
Wang, S., Aggarwal, C., Tang, J., Liu, H.: Attributed signed network embedding. In: Proceedings of CIKM, pp. 137–146. ACM (2017)
Wang, S., Tang, J., Aggarwal, C., Chang, Y., Liu, H.: Signed network embedding in social media. In: Proceedings of SDM, pp. 327–335. SIAM (2017)
Wu, Z., Aggarwal, C.C., Sun, J.: The troll-trust model for ranking in signed networks. In: Proceedings of ICDM, pp. 447–456. ACM (2016)
Zolfaghar, K., Aghaie, A.: Mining trust and distrust relationships in social web applications. In: Proceedings of ICCP, pp. 73–80. IEEE (2010)
Acknowledgement
The work described in this paper has been supported in part by the NSFC projects (61572376, 91646206), the 111 project (B07037), and Natural Science Foundation of Hubei Province under Grant No. 2018CFB616.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Chen, Y., Qian, T., Zhong, M., Li, X. (2018). BASSI: Balance and Status Combined Signed Network Embedding. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science(), vol 10827. Springer, Cham. https://doi.org/10.1007/978-3-319-91452-7_4
Download citation
DOI: https://doi.org/10.1007/978-3-319-91452-7_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91451-0
Online ISBN: 978-3-319-91452-7
eBook Packages: Computer ScienceComputer Science (R0)