1 Introduction

With the rapid emergence of social networking websites, e.g., Facebook, Twitter, LinkedIn, Epinions, etc., a considerable amount of attention has been devoted to investigating the underlying social mechanisms in order to enhance users’ experiences (Liben-Nowell and Kleinberg 2007; Jiang et al. 2010; Leskovec et al. 2010c; Kadushin 2012). Traditional social network analysis concerns itself primarily with unsigned social networks such as Facebook or Myspace which can be modeled as graphs, with nodes representing entities, and positively weighted edges representing the existence of relationships between pairs of entities. Recently, signed directed social networks, in which the relationships between users can be either positive (indicating relations such as trust) or negative (indicating relations such as distrust), are increasingly common. For instance, in Epinions (Guha et al. 2004), which is a product review website with an active user community, users can indicate whether they trust or distrust other users based upon their reviews; in Slashdot (Lampe et al. 2007; Brzozowski et al. 2008), which is a technology-related news website, users can tag each other as “friend” or “foe” based upon their comments. Such a signed directed network can be modeled as a graph expressed as an asymmetric adjacency matrix in which an entry is 1 (or \(-\)1) if the relationship is positive (or negative) and 0 if the relationship is absent.

One of the fundamental problems in signed social network analysis is edge sign inference (also named as link sign prediction in this work) (Guha et al. 2004; Leskovec et al. 2010a), i.e., inferring the unknown trust or distrust relationship given the existence of a particular edge. Another important problem is link ranking, i.e., given a user, we aim to rank people this user is interested in (i.e., positive links) on the top of the ranking list, people this user distrusts (or dislikes) (i.e., negative links) at the bottom. The success for both of these two tasks relies on how to explain the edge signs in signed social networks effectively. To address this issue, many approaches have been developed based upon two main social–psychological theories, i.e., structural balance theory (Heider 1946; Cartwright and Harary 1956) and social status theory (Leskovec et al. 2010b). Structural balance theory is more well-known and it states that people in signed networks tend to follow the rules that “the friend of my friend is my friend”, “the enemy of my friend is my enemy”, etc. Social status theory, which is implicit in Guha et al. (2004), further exploited by Leskovec et al. (2010b), and based upon a foundation in social psychology (Kadushin 2012), considers a positive directed edge to indicate that the initiator of the edge views the recipient as having higher status and a negative directed edge to indicate that the recipient is viewed as having lower status. The relative levels of status determine the allowed sign-direction pairs for an edge assuming that this edge exists.

Although both structural balance theory and social status theory have proved useful for explaining the signs of edges in signed networks, neither is suitable for explaining an observed edge when the two nodes connected by this edge share no common neighbor (e.g., common friend), and in fact, structural balance theory simply does not apply to this situation. Since many real-world social networking graphs tend to be very sparse, this is the case for a large fraction of their edges. To better explain the observed edge signs in general so as to facilitate the underlying link sign prediction and link ranking tasks, in this paper we develop a novel approach to address this issue by applying a new model for node types.

To summarize the contributions of this paper:

  • We explore the underlying local node structures in fully observed signed directed networks, recognizing that there are 16 different types of node and each type of node constrains both its incoming node types and its outgoing node types, i.e., the signs of their edges must be consistent with their types.

  • We show that node type features can be extended to incorporate structural balance theory or social status theory, to help make predictions for those edges whose endpoints have common neighbors.

  • For the purpose of practical applications, we derive Bayesian node features (including Bayesian node type and Bayesian node properties) based upon partially observed signed directed networks.

  • We present link sign prediction and link ranking algorithms based upon the proposed Bayesian node features and node features, respectively.

  • We conduct empirical studies based upon three real-world datasets and show that our proposed approaches can outperform state-of-the-art algorithms.

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 briefs the datasets we used for study. Section 4 presents node types in fully observed networks. Section 5 introduces Bayesian node features in partially observed networks. Section 6 presents the link sign prediction and link ranking approaches. Sections 7 and 8 show the experiment results for link sign prediction and link ranking, respectively. We conclude this work in Sect. 9.

2 Related work

In the past few years, many approaches have been developed to explore different aspects of signed social networks (Guha et al. 2004; Chiang et al. 2011; Hsieh et al. 2012; Li et al. 2015; Yang et al. 2011, 2012; Ye et al. 2013; Shahriari and Jalili 2014; Song and Meyer 2014, 2015), ranging from edge sign prediction (also named as link sign prediction in this work) (Leskovec et al. 2010a), link recommendation (Liben-Nowell and Kleinberg 2007) to community detection (Kunegis et al. 2010; Anchuri and Magdon-Ismail 2012). Most of these approaches are based upon structural balance theory or social status theory.

2.1 Structural balance theory

The investigation of signed networks (Heider 1946; Cartwright and Harary 1956; Frank and Harary 1979; Davis 1967) can be traced back to the 1920s. Heider (1946) first formulated structural balance theory within social psychology. After that, Cartwright and Harary (1956) formally provided the notion of structural balance with undirected triads (as shown in Fig. 1) and proved its necessity and sufficiency by utilizing the mathematical theory of graphs. Intuitively, their theory can be explained as: “the friend of my friend is my friend” (\(T_1\)), “the enemy of my friend is my enemy” (\(T_2\)), “the friend of my enemy is my enemy” (\(T_2\)), and “the enemy of my enemy is my friend” (\(T_2\)). Conceptually, their theory claims that \(T_1\) and \(T_2\) are balanced while \(T_3\) and \(T_4\) are unbalanced. Davis (1967) further generalized this theory to weak structural balance theory by allowing all the edges of triads to be negative, i.e., “the enemy of my enemy is my enemy” (\(T_4\) is also balanced). Note that these two balance theories were initially intended for modeling undirected networks, although they have been commonly applied to directed networks by disregarding the direction of edges (Leskovec et al. 2010b). We remark that structural balance theory is also popular for community detection in signed networks (Kunegis et al. 2010; Anchuri and Magdon-Ismail 2012).

Fig. 1
figure 1

Undirected signed triads. Structural balance theory states that \(T_1\) and \(T_2\) are balanced, while \(T_3\) and \(T_4\) are unbalanced. Weak structural balance theory states that \(T_1\), \(T_2\), and \(T_4\) are balanced, while only \(T_3\) is unbalanced

2.2 Social status theory

Guha et al. (2004) first considered the edge sign prediction problem by developing a trust propagation framework to predict the trust (or distrust) between pairs of nodes. In their framework, they calculate a combined matrix which is a linear combination of four different one-step propagations, i.e., direct propagation, co-citation, transpose trust, and trust coupling. Then the trust and distrust propagations are achieved by calculating a linear combination of powers of this combined matrix. A shortcoming of this approach is that it cannot be explained by structural balance theory (Cartwright and Harary 1956; Heider 1946).

Motivated by this trust propagation idea (Guha et al. 2004) and informed by social psychology (Kadushin 2012). Leskovec et al. (2010b) developed social status theory to explain signed directed networks. In this theory, they assume that if there is a positive edge from x to y, it represents the fact that x regards y as having higher status than himself (or herself), and if there is a negative edge from x to y, it represents the fact that x regards y as having lower status than himself (or herself). Assuming everyone in the system agrees on the same status ordering, we can infer signs easily as long as the existence and direction of edges are available. When prior status information for x and y is not available, we can still perform sign inference using the context provided by the rest of the network. For instance, in Fig. 2, the sign of x to y can be inferred by referring to the status of z, and is unambiguous in half the cases.

Fig. 2
figure 2

All contexts of (x, y; z). The red edge’s sign is not available; it can be determined based upon x’s and y’s interactions with z. Take \(t_1\) for example: since y gives z a positive evaluation and z gives x a positive evaluation, x tends to give y a negative evaluation because x has higher status (color figure online)

2.3 Approaches for link sign prediction

Based upon structural balance theory and social status theory, Leskovec et al. (2010a) selected degree features and directed triad features for edges in signed directed networks. Specifically, for the edge from node x to node y, they consider seven degree features, i.e., \(d_{\text {in}}^{+}(y)\) and \(d_{\text {in}}^{-}(y)\), the number of incoming positive and negative edges to y, respectively; \(d_{\text {out}}^{+}(x)\) and \(d_{\text {out}}^{-}(x)\), the number of outgoing positive and negative edges from x, respectively; C(xy), the number of common neighbors (i.e., embeddedness) of node x and node y; \(d_{\text {out}}^{+}(x)+d_{\text {out}}^{-}(x)\) and \(d_{\text {in}}^{+}(y)+d_{\text {in}}^{-}(y)\), the total out-degree of x and the total in-degree of y, respectively. Since each of the 16 triad types in Fig. 2 provides different evidence for the sign of the edge from node x to node y, directed triad features of this edge are encoded in a 16-dimensional vector counting the number of triads of each type in which this edge is involved. After computing the degree or directed triad features for the edge from x to y, a logistic regression classifier is used to combine the evidence from these individual features into an edge sign prediction.

Subsequently, Chiang et al. (2011) extended this approach by considering longer cycles (e.g., quadrilaterals, pentagons) while ignoring the directions of edges to reduce the computational complexity. Hsieh et al. (2012) formulated the sign inference problem as a low-rank matrix completion (approximation) problem based upon weak balance theory. Note that this approach was originally developed to explain a signed undirected network which is associated with a symmetric adjacency matrix and is different from our setting (signed directed networks) in this paper.

Although many approaches based upon structural balance theory or social status theory have been developed to perform edge sign prediction (link sign prediction) in signed networks, they cannot work well when few topological features, i.e., undirected (or directed) triads and long-range cycles, are available in the network. Since many real-world signed directed social networking graphs are very sparse, the efficacy of methods based upon these theories is limited. A more general approach for such networks is necessary.

2.4 Approaches for link ranking

How to conduct link ranking in signed social networks is an interesting problem to investigate. In the past few years, various approaches for link prediction and recommender systems have been developed and most of them can be adapted to perform link ranking.

Among these approaches, link prediction methods, including common neighbors (Liben-Nowell and Kleinberg 2007), Jaccard’s coefficient (Liben-Nowell and Kleinberg 2007), Adamic/Adar (Adamic and Adar 2003), and Katz (Katz 1953), utilize explicit topological features of the networks to perform link ranking. Recommender systems, e.g., singular value decomposition (SVD), matrix factorization (MF) (Salakhutdinov and Mnih 2007; Koren et al. 2009; Song et al. 2014), Bayesian probabilistic ranking with matrix factorization (BPR \(+\) MF) (Rendle et al. 2009), and Listwise learning to rank with matrix factorization (List \(+\) MF) (Shi et al. 2010), extract latent features from the networks so as to conduct link ranking.

These approaches, however, may not perform well on signed directed social networks since they are not specially designed for this purpose.

3 Datasets

In this paper, we consider three well-known signed directed social networks: Wikipedia (Burke and Kraul 2008), Slashdot (Kunegis et al. 2009; Lampe et al. 2007) and Epinions (Guha et al. 2004)Footnote 1:

  • The Wikipedia data comprise a voting network for promoting candidates to the role of admin. The voters, half coming from existing admins and another half coming from ordinary Wikipedia users, can indicate a positive (for supporting) or negative (for opposing) vote with respect to the promotion of a candidate (Leskovec et al. 2010a).

  • Slashdot is a social website focusing on technology-related news. In Slashdot Zoo, users can tag each other as friends (like) or foes (dislike) based upon comments on articles.

  • Epinions, which is a product review website, is a trust network in which users can indicate whether they trust or distrust each other based upon their reviews.

The detailed statistics of these datasets are provided in Table 1. Note that in all three datasets, the majority of the edges is positive. Due to this imbalance, simply predicting all edges to be positive would yield 78.78, 77.4 %, and 85.0 % accuracy across the three datasets. To show the effectiveness of any approach, it should achieve substantially better performance than this.

Table 1 Dataset statistics of three datasets, i.e., Wikipedia, Slashdot and Epinions

Figure 3 shows the fraction of edges versus edge embeddedness (Granovetter 1985) (the number of common neighbors of the two nodes connected by the edge) for three datasets. We observe that the edges with zero embeddedness comprise about 8.17, 47.90, 19.88 % of the edges for Wikipedia, Slashdot, and Epinions, respectively. Note that a large fraction of zero embeddedness edges means that triad features (Leskovec et al. 2010a) cannot work well for edge sign prediction. This is because the entries of triad feature vector will be zero and thus the triad features provide no evidence for edge sign prediction.

Fig. 3
figure 3

The fraction of edges versus edge embeddedness for three datasets

4 Node types in fully observed networks

In this paper, a fully observed signed directed network refers to a network in which there is no uncertainty about the existence of any directed edge and its associated sign.

We consider a fully observed signed directed network as a graph \(G=(V,E,W)\), where V is the vertex set of size n, E is the edge set of size m, and \(W\in \mathbb {R}^{n \times n}\) is the associated signed adjacency matrix. Because G is a directed network, W is an asymmetric matrix and can be represented as:

$$\begin{aligned} W_{ij}=\left\{ \begin{array}{ll} 1, &{} \quad \text{if}\,\, i\, \text {trusts}\, j \\ -1, &{}\quad \text {if}\,\, i \,\text {distrusts} \,j \\ 0, &{} \quad \hbox {otherwise} \end{array} \right. \end{aligned}$$
(1)

Note that \(W_{ij}=0\) represents no directed edge from node i to node j.

In this section, we first investigate local node structures within fully observed signed directed networks, recognize a set of node types, show that these node types can be used to explain real-world signed directed social networks, and show how to encode a specific node in such networks. Next, we explore how these node types interact with one another and how these interactions can explain the edge signs. Finally, we show our approach can be extended to incorporate structural balance theory or social status theory.

4.1 Node types

In our study, we focus on analyzing signed directed networks because they are more general and common than signed undirected networks in real-world applications. For instance, each of the three datasets we consider in this paper is a signed directed network. Generally, signed directed networks are sparse graphs in which nodes may be categorized into four groups based upon whether they have incoming edges and outgoing edges, i.e., nodes with neither incoming nor outgoing edges (e.g., \(N_{16}\) in Fig. 4), nodes with only incoming edges (e.g., \(N_{4}\), \(N_{5}\), and \(N_{6}\) in Fig. 4), nodes with only outgoing edges (e.g., \(N_{1}\), \(N_{2}\), and \(N_{3}\) in Fig. 4), and nodes having both incoming and outgoing edges (e.g., \(N_{9}\), \(N_{14}\), \(N_{15}\), etc.). Moreover, both the incoming edges and the outgoing edges of a given node can be categorized into three classes, one class with only positive edges, another class with only negative edges, and the third class with a mixture of positive and negative edges. Combining these two principles, the nodes in signed directed networks can be categorized into 16 types, shown in Fig. 4. Note that the edges in Fig. 4 only indicate the types of the incoming or outgoing, i.e., they do not represent the actual nonzero number of incoming (outgoing) positive (negative) edges. The fractions of each node type for the three real-world datasets are shown in Fig. 5.

Fig. 4
figure 4

16 node types in signed directed networks. For \(N_1\) (node type 1), all the outgoing edges are positive. For \(N_5\), all the incoming edges are negative. For \(N_{15}\), both the incoming edges and outgoing edges are mixtures of positive edges and negative edges

Fig. 5
figure 5

The fractions of different node types for the three datasets

4.1.1 The representation of node features

To represent each node effectively, in addition to its node type N, we should consider its associated node properties, i.e., the relative level of the number of positive (negative) incoming edges \(d_\mathrm{in}(+)\) \((d_\mathrm{in}(-))\), and the relative level of the number of positive (negative) outgoing edges \(d_\mathrm{out}(+)\) (\(d_\mathrm{out}(-)\)).

First, we can use a 16-dimensional binary vector, \([\mathbf 1 (N=N_{1})\),\(\mathbf 1 (N=N_{2})\),..., \(\mathbf 1 (N=N_{16})]\), to indicate the node type. The indicator \(\mathbf 1 (N=N_{i})\) is 1 if N is the same as \(N_{i}\). Next, we can use a vector \([P_\mathrm{in}(+), P_\mathrm{in}(-),\) \(P_\mathrm{out}(+), P_\mathrm{out}(-)]\) to denote the ratio of positive (negative) incoming edges and that of positive (negative) outgoing edges.

Intuitively, \(P_\mathrm{in}(+)\), \(P_\mathrm{in}(-)\), \(P_\mathrm{out}(+)\), and \(P_\mathrm{out}(-)\) represent the locally propagating properties of a node (node property) and they can be calculated with

$$\begin{aligned} P_\mathrm{in}(+)=\frac{d_\mathrm{in}(+)}{d_\mathrm{in}(+)+d_\mathrm{in}(-)+\varepsilon } \end{aligned}$$
(2)
$$\begin{aligned} P_\mathrm{in}(-)=\frac{d_\mathrm{in}(-)}{d_\mathrm{in}(+)+d_\mathrm{in}(-)+\varepsilon } \end{aligned}$$
(3)
$$\begin{aligned} P_\mathrm{out}(+)=\frac{d_\mathrm{out}(+)}{d_\mathrm{out}(+)+d_\mathrm{out}(-)+\varepsilon } \end{aligned}$$
(4)
$$\begin{aligned} P_\mathrm{out}(-)=\frac{d_\mathrm{out}(-)}{d_\mathrm{out}(+)+d_\mathrm{out}(-)+\varepsilon }, \end{aligned}$$
(5)

where we set \(\varepsilon =10^{-10}\) to avoid zero denominators. Figure 6 shows examples of these two parts of features for node type \(N_3\), \(N_{11}\), and \(N_{15}\). Notice that \(P_\mathrm{in}(+)+P_\mathrm{in}(-)=1\) if there is any input edge, but this sum is zero if there are none; so these features are not redundant. Also note that node property is essentially different from degree features because degree features aim to model a particular edge by considering the initiator’s outgoing edges, recipient’s incoming edges, and their common neighbors.

Note that although the node properties, i.e., \([P_\mathrm{in}(+), P_\mathrm{in}(-), P_\mathrm{out}(+), P_\mathrm{out}(-)]\) implicitly indicate the node type information, it is still useful to consider type indication, i.e., \([\mathbf 1 (N=N_{1})\),\(\mathbf 1 (N=N_{2})\),...,\(\mathbf 1 (N=N_{16})]\). Since the latter is not a linear combination of the former, it can provide non-redundant information in the logistic regression classifier we will describe in the Sect. 6.

Fig. 6
figure 6

Two parts of the features for node types \(N_3\), \(N_{11}\), and \(N_{15}\). Notice that the first (last) two numbers in the second part need not sum to 1

4.2 The interaction of node types

We have shown there are 16 possible node types in any signed directed network. Hence, theoretically there are \(16^{2}\) combinations of node types. Given a node of a certain type, however, it usually can only connect to (or be reached from) nodes in a subset of these types due to the compatibility of both directions and signs. In other words, there exists a logic to determine whether two nodes can be reached or not and whether the sign should be positive or negative. For instance, given a node of type \(N_5\), it can only be connected from a node of type \(N_2,\; N_3,\; N_7,\; N_9,\; N_{11},\; N_{13},\; N_{14},\) or \(N_{15}\) and the edge sign can only be negative. Similarly, given a node of type \(N_9\), it can only be connected from a node of type \(N_1,\; N_3,\; N_8,\; N_{10},\; N_{12},\; N_{13},\; N_{14},\) or \(N_{15}\) and connected to a node of type \(N_5,\; N_6,\; N_7,\; N_{10},\; N_{11},\; N_{12},\; N_{14}\), or \(N_{15}\). Moreover, the edge sign is determined as positive and negative, respectively.

Given an edge from x to y, based upon the combinations of node type x and node type y, this edge can be categorized into three classes. “\(+\)” denotes the edge sign is determined to be positive, “\(-\)” denotes the edge sign is determined to be negative, and “?” denotes the edge sign that cannot be determined by the interaction of current two node types, i.e., the edge sign can be either positive or negative. In our three datasets, there are 29,945, 250,487 and 500,309 determined edges (i.e., “\(+\)” and “\(-\)”) for Wikipedia, Slashdot and Epinions, respectively, each a large fraction of the total edges.

Although node types have shown their effectiveness for explaining the edge signs in fully observed signed social networks, there exists a fraction of the total edges for which signs cannot be explained simply by node types. In this case, we can incorporate structural balance or social status theory with node types to address this issue.

4.3 Incorporating structural balance or social status theory

As we described the node types and the interactions of these node types in the previous subsections, we did not need to consider whether there is any common neighbor for a pair of nodes. We should, however, be aware that when common neighbors exist for a pair of nodes, structural balance or social status theory may help to explain the sign of an edge between them.

In Fig. 7, for instance, since \(N_{13}\) has both positive and negative outgoing edges and \(N_{15}\) has both positive and negative incoming edges, the sign of the edge between them cannot be determined by the interaction of these two types. Since \(N_{13}\) and \(N_{15}\) have two common neighbors, however, the sign of the edge between them may be explained by either structural balance theory or social status theory. Within structural balance theory, we can disregard the directions of these two triads. From the red (dotted) triad, we can infer the sign of the edge between \(N_{13}\) and \(N_{15}\) to be positive based upon the rule that “my friend’s friend is my friend”. From the blue (dashed) triad, on the contrary, we can infer the sign of this edge to be negative based upon the rule that “my friend’s enemy is my enemy”. Within social status theory, since both the red (dotted) triad and the blue (dashed) triad indicate that \(N_{13}\) has lower status than \(N_{15}\), they consistently imply that the sign of the edge from \(N_{13}\) to \(N_{15}\) is positive.

To incorporate these directed triads as features, we use the same approach as Leskovec et al. (2010a, b). Given an edge from x to y, and a common neighbor z of x and y, the edge between x and z can have four possible configurations, i.e., \(x \mathrel {\overset{+}{\longrightarrow }}z\), \(x \mathrel {\overset{-}{\longrightarrow }}z\), \(x \mathrel {\overset{+}{\longleftarrow }}z\), and \(x \mathrel {\overset{-}{\longleftarrow }}z\). Similarly, there are four possible signed edges between z and y. Hence we can obtain 16 types of triads each of which may provide different evidence about the sign of the edge from x to y.

Fig. 7
figure 7

Examples for incorporation of structural balance theory or social status theory into the node types interaction

5 Bayesian node features in partially observed networks

In real-world applications, signed directed networks are often partially observed, i.e., several edges’ signs are unknown or hidden. For example, in the Wikipedia dataset, we probably know that someone has voted on a candidate, but we may not know this voter’s opinion. In this case, we would like to infer this voter’s opinion by learning some patterns based upon observed edges in the network. However, when these unobserved edges take different signs, both the node types and node properties may change. In this case, simple node features (including node types and node properties) may not be capable of capturing the range of possible unobserved signs and thus will be not reliable.

To address this issue, we extend node features to Bayesian node features by considering prior knowledge about unobserved signs in partially observed signed directed networks.

Similarly to a fully observed signed directed network, a partially observed signed directed network can also be represented as a graph \(G=(V,E,W)\), where V is the vertex set of size n, E is the edge set of size m, and \(W\in \mathbb {R}^{n \times n}\) is the associated signed adjacency matrix. Since G is a directed network, W is an asymmetric matrix and can be represented as:

$$\begin{aligned} W_{ij}=\left\{ \begin{array}{ll} 1, &{} \quad {\text {if}}\,\, i \,\,{\text {trusts}}\,\, j \\ -1, &{} \quad \text {if}\,\, i \,\,\text {distrusts}\,\, j \\ ?, &{} \quad {\hbox {if an edge from}}\,\, i\,\, {\text {to}}\,\, j\,\, {\hbox {exists, but sign is unknown}}\\ 0, &{} \quad \hbox {otherwise} \end{array} \right. \end{aligned}$$
(6)

\(W_{ij}=0\) represents no directed edge from node i to node j.

In this section, we first introduce Bayesian node type, show how to calculate it based upon both observed (training) edges and unobserved (test) edges, and present two ways to encode the interaction of Bayesian node types. Next, we show how to calculate and represent Bayesian node properties.

5.1 Bayesian node type

Given any node in a partially observed network, assuming u denotes the number of unobserved incoming edges and v denotes the number of unobserved outgoing edges, let \(P_{u}(+)\) [or \(P_{u}(-)\)] represent the prior probability of incoming edges being positive (negative) and \(P_{v}(+)\) [or \(P_{v}(-)\)] represent the prior probability of outgoing edges being positive. From these probabilities, we can calculate its probability distribution over node types.

Take a node of type \(N_{1}\) for example (as shown in Fig. 8), if \(u=0\) and \(v=0\), its node type does not change; if \(u=0\) and \(v>0\), it has \((P_{v}(+))^{v}\) probability to be \(N_{1}\) and \(1-(P_{v}(+))^{v}\) probability to be \(N_{6}\); if \(u>0\) and \(v=0\), it has \((P_{u}(+))^{u}\) probability to be \(N_{8}\), \((P_{u}(-))^{u}\) probability to be \(N_{10}\), and \(1-(P_{u}(+))^{u}-(P_{u}(-))^{u}\) probability to be \(N_{12}\); if \(u>0\) and \(v>0\), it has \((P_{u}(+))^{u}(P_{v}(+))^{v}\) probability to be \(N_{8}\), \((P_{u}(-))^{u}(P_{v}(+))^{v}\) probability to be \(N_{10}\), \((1-(P_{u}(+))^{u}-(P_{u}(-))^{u})(P_{v}(+))^{v}\) probability to be \(N_{12}\), \((P_{u}(+))^{u}(1-(P_{v}(+)^{v}))\) probability to be \(N_{13}\), \((P_{u}(-))^{u}(1-(P_{v}(+)^{v}))\) probability to be \(N_{14}\), and \((1-(P_{u}(+))^{u}-(P_{u}(-))^{u})\) \((1-(P_{v}(+)^{v}))\) probability to be \(N_{15}\). This calculation determines a 16-dimensional vector which encodes the distribution of possible node types. Note that similar vectors can be calculated for other types of nodes. We do not specify the calculation for each node type due to the space limit.

Fig. 8
figure 8

The possible node of type 1 with different u and v

To initialize \(P_{u}(+)\) and \(P_{v}(+)\), we can also use the Bayesian node properties \(P_{\text {in}}(+)\) and \(P_{\text {out}}(+)\), i.e., \(P_{u}(+)=P_{\text {in}}(+)\) and \(P_{v}(+)=P_{\text {out}}(+)\), to claim that each unobserved edge obeys Bayesian node properties (i.e., local priors).

Given an observed (training) edge connecting node x and node y, we can obtain two vectors \(V_{x}\in \mathbb {R}^{16}\) and \(V_{y}\in \mathbb {R}^{16}\) by calculating their Bayesian node types. To encode the interaction of Bayesian node types, we can (1) simply concatenate these two vectors to form a 32-dimensional vector; or (2) calculate the Kronecker product of these two vectors, i.e., \(V_{x}\bigotimes V_{y}\), and form a 256-dimensional vector. We should be aware that the vector formed by the Kronecker product encodes the probability distribution of different node type interactions.

Given an unobserved (test) edge connecting node x and node y, we should consider both possible signs as shown in Fig. 9. Specifically, we first decompose this kind of interaction into two separate cases, i.e., the edge sign being positive and negative. Next, we calculate the Bayesian node types, represent their interactions (either concatenation vector or Kronecker product vector) of both cases. Finally, we calculate the linear combination of these two cases with respect to the prior probability of the signs [i.e., \(P(+)\) and \(P(-)\)] over observed edges.

Fig. 9
figure 9

Example of calculating the Bayesian node type interaction when two nodes are connected with an unobserved edge. Note that the Bayesian node types of \(N_{10}\), \(N_{11}\), and \(N_{14}\) should be calculated similar to \(N_{1}\) in Fig. 8

5.2 Bayesian node properties

Given a node in a partially observed network, assuming u denotes the number of unobserved incoming edges and v denotes the number of unobserved outgoing edges, by assigning to these \(u+v\) edges different signs, the node properties also change.

To capture the range of possible signs of unobserved edges, we should consider Bayesian node properties, i.e., incorporating prior information, namely the expected number of incoming positive (negative) and outgoing positive (negative) edges, with the number of positive (negative) incoming edges \(d_\mathrm{in}(+)\) \((d_\mathrm{in}(-))\) or outgoing edges \(d_\mathrm{out}(+)\) \((d_\mathrm{out}(-))\).

Specifically, the Bayesian node properties are represented as following:

$$\begin{aligned} P_\mathrm{in}(+)=\frac{d_\mathrm{in}(+)+P(+)u}{d_\mathrm{in}(+)+d_\mathrm{in}(-)+u+\varepsilon } \end{aligned}$$
(7)
$$\begin{aligned} P_\mathrm{in}(-)=\frac{d_\mathrm{in}(-)+P(-)u}{d_\mathrm{in}(+)+d_\mathrm{in}(-)+u+\varepsilon } \end{aligned}$$
(8)
$$\begin{aligned} P_\mathrm{out}(+)=\frac{d_\mathrm{out}(+)+P(+)v}{d_\mathrm{out}(+)+d_\mathrm{out}(-)+v+\varepsilon } \end{aligned}$$
(9)
$$\begin{aligned} P_\mathrm{out}(-)=\frac{d_\mathrm{out}(-)+P(-)v}{d_\mathrm{out}(+)+d_\mathrm{out}(-)+v+\varepsilon }, \end{aligned}$$
(10)

where \(P(+)\) is the prior probability of positive edges and \(P(-)\) is the prior probability of negative edges.

To encode the interaction of Bayesian node properties, we simply concatenate two of the Bayesian node properties vectors to form an 8-dimensional vector.

As in the previous section, when common neighbors exist for a pair of nodes, structural balance or social status theory may be incorporated with Bayesian node features to help explain the sign of an edge between them.

6 Link sign prediction and link ranking with proposed features

6.1 Link sign prediction

Given a fully observed signed directed network, the node type interactions, extended to triads with structural balance or social status theory, are useful to explain the edge signs. Partially observed signed directed networks, however, are too complicated to fully conform to the rules of simple node type interactions. Also, as illustrated in Fig. 7, if there are multiple common neighbors for \(N_{13}\) and \(N_{15}\), structural balance theory (or social status theory) may conflict with itself. To address this issue, we can utilize a logistic regression to combine the evidence from the interaction of Bayesian node features and triad features.

We now consider the features collected for link sign prediction with logistic regression. The features we utilize can be divided into three classes. One class comes from Bayesian node type interaction (32 or 256-dimensional vector); another class is based upon Bayesian node properties interaction (8-dimensional vector); the last class is triads (16-dimensional vector).

Given a partially observed signed directed social network, we first use a logistic regression to fit the features of observed edges (training data) and then utilize the learned coefficients to linearly combine the evidence from each individual feature of unobserved edges (test data) so as to predict the sign. The logistic regression can be written in the following form:

$$\begin{aligned} P(\ell =1|{\mathbf {f}})=\frac{1}{1+\exp [-(\mathbf {w}^\mathrm{{T}}{\mathbf {f}}+w_{0})]}, \end{aligned}$$
(11)

where \(\ell \in \{0,1\}\) is the label, 1 represents positive edge while 0 represents negative edge. \({\mathbf {f}}\in{ \mathbb {R}}^{2d}\) is the feature vector, and \([{\mathbf {w}}; w_{0}]\in {\mathbb {R}}^{2d+1}\) are the coefficients we estimate from the features of observed edges (training data).

6.2 Link ranking

Given a user, link ranking aims to rank people this user is interested in (i.e., positive links) on the top of the ranking list, people this user distrusts (or dislikes) (i.e., negative links) at the bottom. By modeling the observed positive and negative links, we expect to rank the unknown status (i.e., unobserved) links such that positive links are on the top and negative links are at the bottom. Since the area under the receiver operating characteristic (ROC) curve (AUC) (Hanley and Mcneil 1982) is a classical metric for ranking, we first introduce this metric and then present a link ranking approach by optimizing AUC.

6.2.1 AUC

Given a binary classifier f and a training set \((x_{i}, y_{i})_{i=1}^{n}\) with \(x_{i}\in \mathbb {R}^{d}\) and \(y_{i}\in \{-1,1\}\), let \(\mathcal {P} = \{x_i\mid y_i = 1\}\) be the set of positive samples and \(\mathcal {N} = \{x_i\mid y_i = -1\}\) the set of negative samples. Then the AUC is defined by:

$$\begin{aligned} \text {AUC}=\frac{1}{|\mathcal {P}||\mathcal {N}|}\sum _{x_{i}\in \mathcal {P}}\sum _{x_{j}\in \mathcal {N}}I(f(x_{i})>f(x_{j})), \end{aligned}$$
(12)

where \(I(\cdot )\) is an indicator function which is 1 if the condition in the parenthesis is satisfied and 0 otherwise; \(|\mathcal {P}|\) and \(|\mathcal {N}|\) are the numbers of positive samples and negative samples, respectively. AUC is the value of the Wilcoxon–Mann–Whitney statistic (Hanley and Mcneil 1982) which is essentially the probability that a random element of one set \(f(x_{i})\) is larger than a random element of another \(f(x_{j})\). With an ideal ranking list, AUC should be 1 representing each positive sample is ranked higher than all the negative samples. For a random ranking, AUC will be 0.5.

6.2.2 Model

Since AUC is a reasonable way to quantify the ranking performance, a ideal link ranking model would be expected to optimize AUC directly. To achieve this purpose, we utilize the log-likelihood of sigmoid function as a convex surrogate for the indicator function in Eq. 12 as shown in Fig. 10. Let \(\mathbf {g}_{i}\in \mathbb {R}^{d}\) represent the feature for node i, the link ranking model can be given as:

$$\begin{aligned} \mathcal {O}(M)=-\sum _{(i,j)\in \mathcal {P}}\sum _{(i,k)\in \mathcal {N}}\log \Big (1+\exp (-(\mathbf {g}_{i}^{T}M\mathbf {g}_{j}-\mathbf {g}_{i}^{T}M\mathbf {g}_{k}))\Big )-\frac{\lambda }{2}\Vert M\Vert _{F}^{2}, \end{aligned}$$
(13)

where the second term is a regularization term to prevent over-fitting, \((i,j)\in \mathcal {P}\) denotes that the edge between node i and node j is positive, \((i,k)\in \mathcal {N}\) represents that the edge between node i and node k is negative, \(M\in \mathbb {R}^{d\times d}\) is the model parameter, and \(\lambda >0\) controls the trade-off between these two terms.

Fig. 10
figure 10

Indicator function (red), sigmoid function (green), and log-likelihood of sigmoid function (blue) (color figure online)

Since \(\mathcal {O}(M)\) is convex with respect to M, gradient ascent approach can be utilized to obtain a global optimal. Specifically, the partial derivative of \(\mathcal {O}(M)\) given M is written as:

$$\begin{aligned} \frac{\partial {\mathcal {O}}(M)}{\partial M}=\sum _{(i,j)\in \mathcal {P}}\sum _{(i,k)\in \mathcal {N}} \frac{\exp (-({\mathbf {g}}_{i}^{T} M {\mathbf {g}}_{j}-{\mathbf {g}}_{i}^{T} M {\mathbf {g}}_{k}))}{1+\exp (-({\mathbf {g}}_{i}^{T} M {\mathbf {g}}_{j}-{\mathbf {g}}_{i}^{T} M {\mathbf {g}}_{k}))}\cdot{\mathbf {g}}_{i}({\mathbf {g}}_{j}-{\mathbf {g}}_{k})^{T}-\lambda M, \end{aligned}$$
(14)

For practical applications, since the number of positive links \(|\mathcal {P}|\) and the number of negative links \(|\mathcal {N}|\) may be relatively large which will make the computation of the gradient be infeasible, stochastic gradient ascent can be used as shown in Algorithm 1.

figure a

7 Experiment: link sign prediction

In this section, we conduct link sign prediction (i.e., edge sign prediction) based upon Wikipedia (Burke and Kraul 2008), Slashdot (Kunegis et al. 2009; Lampe et al. 2007) and Epinions (Guha et al. 2004). We first construct three fully observed asymmetric adjacency matrices (as in [Eq. (1)] based upon these three datasets. Next, for each adjacency matrix, we randomly remove 10 % of edges’ signs and form a partially observed network [as in Eq. (6)]. Subsequently, we calculate Bayesian node features (including Bayesian node types and Bayesian node properties) and triad features for both observed (training) and unobserved (test) edges. Then, we estimate the parameters of logistic regression based upon the features of observed edges and make predictions based upon the features of unobserved edges. In our experiment, we repeat this procedure 5 times and report the average prediction accuracy and standard deviation for each approach. The baseline approaches are implemented with identical parameter settings as in the original works for fair comparisons.

7.1 Step by step justification

We examine the effectiveness of the proposed features by testing each component step by step. We use NT and NP to denote node types and node properties, respectively, utilize BNT and BNP to represent Bayesian node types and Bayesian node properties, respectively, use BNTC to represent encoding the interaction of Bayesian node types with concatenation, utilize BNTK to denote encoding the interaction of Bayesian node types with the Kronecker product, use BNPC to represent the interaction of Bayesian node properties, and denote triad features (Leskovec et al. 2010a, b) with Triad.

Table 2 shows the results of step by step justification for edge sign prediction on three datasets. We observe that the interaction of Bayesian node types (BNTC and BNTK) generally outperforms simply predicting all edges to be positive. This demonstrates that the interaction of Bayesian node types is useful to explain the edge signs in partially observed social networks. We also observe that encoding Bayesian node types with the Kronecker product achieves better performance than concatenation on Wikipedia and Slashdot, while concatenation performs slightly better on Epinions.

Table 2 Step by step justification results on Wikipedia, Slashdot, and Epinions [accuracy: % and standard deviation (std): %]

By concatenating BNTC and BNTK with BNPC, we observe that BNTC \(+\) BNPC and BNTK \(+\) BNPC consistently outperforms BNTC and BNTK. This is because BNPC provides more specific information about the incoming positive (negative) and outgoing positive (negative) edges of nodes.

Finally, we show that, by concatenating BNTC \(+\) BNPC and BNTK \(+\) BNPC with triad features to form BNTC \(+\) BNPC \(+\) Triad and BNTK \(+\) BNPC \(+\) Triad, the performances are consistently slightly improved. This is because triad features are useful in explaining the edge signs when common neighbors are available.

The step by step justification not only examines the effectiveness of each component of the proposed Bayesian node features, but also shows that the Bayesian node features can incorporate structural balance or social status theory in the form of triad features.

7.2 Link sign prediction

In this subsection, we compare Bayesian node features (including Bayesian node types and Bayesian node properties) plus triad features with state-of-the-art approaches, i.e., degree features (Leskovec et al. 2010a), triad features (Leskovec et al. 2010b), degree \(+\) triad features (Leskovec et al. 2010a, b), longer cycles features (Chiang et al. 2011), and low-rank modeling (Hsieh et al. 2012). Note that in our experiment we extract longer cycles features based upon the partially observed asymmetric adjacency matrix and report the best performance over order 3, 4, and 5 for comparison. Also notice that low-rank modeling (Hsieh et al. 2012) can only theoretically analyze the undirected signed networks; in our experiment, we adapt it and apply it to partially observed signed directed networks.

In Table 3, we compare BNTC+BNPC+Triad and BNTK \(+\) BNPC \(+\) Triad with the other five state-of-the-art approaches. We observe that BNTC \(+\) BNPC \(+\) Triad and BNTK \(+\) BNPC \(+\) Triad consistently outperform the other five algorithms. Note that these two variants achieve best accuracies of 87.37 \((\pm 0.22)\,\%\), 85.65 \((\pm 0.11)\,\%\), and 93.61 \((\pm 0.02)\,\%\) over Wikipedia, Slashdot and Epinions, respectively. This is because these two variants not only can explain the edge signs well when common neighbors are not available but also can effectively explain the edge signs when common neighbors exist.

Table 3 Edge sign prediction accuracies [accuracy: % standard deviation (std): %]

In Fig. 11, we compare BNTK \(+\) BNPC \(+\) Triad with degree features, triad features, and degree \(+\) triad features at different levels of embeddedness (the number of common neighbors). In general, we observe that when the minimum embeddedness increases, the performance of BNTK \(+\) BNPC \(+\) Triad increases. This is because structural balance theory and social status theory are incorporated into BNTK \(+\) BNPC \(+\) Triad in the form of triads (Triad) and are effective in explaining edge signs when common neighbors exist. Moreover, we notice that BNTK \(+\) BNPC \(+\) Triad generally outperforms other methods with different levels of embeddedness. This is because BNTK \(+\) BNPC \(+\) Triad leverages the power of node type interactions as well as the power of structural balance or social status theory in the form of triad features (Triad).

Fig. 11
figure 11

The edge sign prediction accuracies of different approaches versus minimum embeddedness

7.3 Cross-dataset evaluation

We conduct cross-dataset evaluation with degree features, triad features, degree + triad features, and Bayesian node features plus triad features in the form of BNTK \(+\) BNPC \(+\) Triad on these three datasets. The aim is to examine the generalization capability of each approach. In particular, given each type of features, we train them on one dataset (e.g., Wikipedia) and evaluate the edge sign prediction performance on another (e.g., Slashdot). For each pair of datasets, the test is conducted 5 times based upon the random selected test sets. We report the average accuracies of different approaches in Table 4.

Table 4 Cross-dataset evaluation results

We observe that Bayesian node features plus triad features in the form of BNTK \(+\) BNPC \(+\) Triad can achieve the best performance on each pair of the cross-dataset evaluation. This illustrates that Bayesian node features plus triad features not only are useful on intra-dataset evaluation, but also have good generalization capability. This is extremely helpful for edge sign prediction in signed networks with few training examples.

8 Experiment: link ranking

To conduct link ranking based upon Wikipedia (Burke and Kraul 2008), Slashdot (Kunegis et al. 2009; Lampe et al. 2007), and Epinions (Guha et al. 2004). Given a fully observed asymmetric adjacency matrix associated to one of the three datasets, we randomly remove 20 % of edges to form a test set (\(X_{\rm{test}}\)) and use the remaining edges as the training set (\(X_{\rm{train}}\)). Then, we calculate node features (including node types and node properties) based upon training set and test set. In the experiment, we aim to learn model parameters M in Eq. 13 with Algorithm 1 by ranking all observed positive links above all negative links in the training stage. In the test stage, we aim to conduct personalized link ranking, i.e., given each user, we expect to rank unobserved links (i.e., unknown status links indicating relationship of either trust or distrust) such that positive links are ranked on top and negative links are ranked at the bottom of a ranking list. Although our proposed model has the potential to rank all unobserved links for a given user, we only evaluate the effectiveness of the proposed model on \(X_{\rm{test}}\) since we only have limited number of ground truth labels (i.e., \(X_{\rm{test}}\)) for unobserved links.

We repeat this procedure 5 times and report the average AUC, mean average precision (MAP), and their standard deviations for the proposed approach as well as baseline approaches.

To study the effectiveness of node features [including node types (NT) and node properties (NP)] for link ranking, we compare it with four baseline approaches, i.e., common neighbors (CN) (Liben-Nowell and Kleinberg 2007) which is a representative explicit feature based approach, matrix factorization (MF) (Koren et al. 2009), Bayesian probabilistic ranking with matrix factorization (BPR \(+\) MF) (Rendle et al. 2009), and List \(+\) MF (Shi et al. 2010) which are three latent feature based approaches.

For MF, we first reconstruct the observed entries in \(X_{\rm{train}}\) with low-rank approximation (i.e., \(X_{\rm{train}}\approx U^{T}V\)) of rank r in the training stage. In the test stage, for each given user, we rank unobserved links based upon the values of low-rank approximation. For BPR \(+\) MF, we first treat the positive links in \(X_{\rm{train}}\) as 1s and treat both the unobserved links and negative links in \(X_{\rm{train}}\) as 0s in the training stage; then for each given user, we rank unobserved links based upon the values of low-rank approximation in the test stage. For List \(+\) MF (Shi et al. 2010), we treat positive links as 3s, unobserved links as 2s, and negative links as 1s in the training stage; then we rank unobserved links based upon the values of low-rank approximation in the test stage. Note that r is determined via grid search over \(\{10, 30, 50, 70, 90\}\) based upon validation set.

The results of various approaches for link ranking are shown in Tables 5 and 6. We observe that the proposed link ranking model plus node features (i.e., NT \(+\) NP) consistently outperforms other three baseline approaches. This is because (1) the link sign prediction model directly optimizes the ranking measure (i.e., AUC); (2) node features are effective to characterize the nodes in signed social networks.

Table 5 Link ranking results (AUC and standard deviation (std))
Table 6 Link ranking results (MAP and standard deviation (std))

9 Conclusions

In this paper, we explored the underlying local node structures in signed networks, recognizing that there are 16 different types of node and each type of node constrains both its incoming node types and its outgoing node types, i.e., the sign of an edge between two nodes must be consistent with their types. This is a highly structured alternative to the ordered scalar node types postulated by social status theory. We demonstrated that the interaction between these more complicated node types can explain edge signs well. We also showed that our approach can be extended to incorporate triad features whose inclusion is motivated by structural balance theory or social status theory. We derived Bayesian node features (including Bayesian node type and Bayesian node properties) based upon partially observed signed directed network.

We conducted empirical studies based upon three large scale datasets, i.e., Wikipedia, Slashdot, and Epinions:

  • For link sign prediction (i.e., edge sign prediction), we showed that the proposed Bayesian node features plus triad features outperform state-of-the-art algorithms for edge sign prediction. Moreover, we showed that Bayesian node features plus triad features are more effective and robust than baseline approaches for cross-dataset edge sign predictions.

  • To perform link ranking, we proposed a link ranking model by directly optimizing AUC and demonstrated that by using node features our proposed model outperforms common neighbors (CN), matrix factorization (MF), BPR \(+\) MF, and List \(+\) MF.

In the future, it will be interesting to study how to combine Bayesian node features, other explicit topological features, and latent features to perform link sign prediction and link ranking in signed social networks. It will also be interesting to study the underlying communities (or circles) (Yang et al. 2012) to improve link sign prediction and link ranking in signed social networks.