1 Introduction

Analyzing the emotional tone with which people express themselves helps to better understand their attitudes and to react accordingly. This process is known as sentiment analysis and deals with the computational treatment of opinion, sentiment and subjectivity in text (Pang and Lee 2008). Its uses are both broad and powerful. The field is increasingly being used to obtain public opinions and emotions about topics of interest from social network sites, since they are opinion-rich sources (Dragoni 2017). Indeed, the practice of extracting insights from social web data is widely adopted by enterprises around the world. It has been shown that tonal changes on social networks are correlated with changes in the stock market (Leitcha and Sherif 2017). Social web data can also explain the results of marketing strategies (Wu et al. 2018). In fact, the growing availability and popularity of these platforms, together with the huge amount of user-generated content, are making them the mainstream communication media. Hence, they represent an emerging and challenging sector in the field of sentiment analysis, which has to deal with polarity classification. The latter is a subtask aimed at extracting “positive” and “negative” sentiments, also called polarities, on a specific topic. How could social networks, which are complex and ubiquitous, affect sentiment polarities? Consider a company that uses social networks to sell its products. Decision makers know that customers who are friends in the network buy similar products. The question that arises is: what are the reasons behind this similarity? Do they have similar tastes and opinions because they are connected by a personal relationship of friendship? Or are they influenced by each other since they communicate frequently? It is on the basis of the answer to this question that decision makers will know how to interpret their business data and draw up strategic plans.

Most works in sentiment analysis focus mainly on classification based on the textual information expressed on social networks. Recent approaches consider the network structure of these platforms in addition to texts. Their aim is to infer the sentiments of users from their shared posts by integrating friendship relations. To deal with the connections between users, they exploit the principle of Homophily (McPherson et al. 2001). The latter illustrates the notion of proximity and states that “a contact between similar people occurs at a higher rate than among dissimilar people” (McPherson et al. 2001). Thus it suggests that connected users may tend to hold similar opinions. Considering Homophily, however, is a weak assumption for modeling the correlation between the sentiments of users. To overcome this limitation, we aim to improve sentiment polarity classification by leveraging the information about social influence made evident by online social networks. Social influence occurs when an actor adapts his behavior, attitude or belief to the behavior, attitude or belief of other actors in the social system (Leenders 2002). It is considered a dyadic process in which ego adapts his opinion to that of alter, leading them to behave similarly (Leenders 2002). We propose an approach for sentiment polarity classification that integrates the social influence principle, that is, the idea that similarity and influence tend to co-occur. We suggest that users who have a social influence relationship between them tend to hold similar opinions.

“What people think about X”, where X is the target topic of interest, is the end goal of our sentiment analysis approach. Assuming that the sentiment of a user is estimated by aggregating the sentiments of his posts, determining the sentiment expressed in individual texts is a subtask of that ultimate objective. In line with this goal, our proposed approach infers sentiment polarities at user-level by leveraging the sentiment polarities of the user's posts and his relationships. To infer polarities, we use a semi-supervised learning paradigm that predicts the sentiments of the unlabeled users, given a small proportion of data already labeled. It is based on a heterogeneous graph model incorporating the social network information.

The main contributions of this paper are three-fold:

  • We empirically confirm that the probability that two users share the same sentiment is indeed correlated with whether they have an influence relationship in the social network;

  • We propose a novel heterogeneous graph representation for sentiment analysis on social networks at user-level, which incorporates the implicit influence relationships information;

  • We show that competitive results for sentiment analysis on social networks can be achieved using influence relationships information, reporting best accuracies on a collected data set.

The outline of this paper is as follows. The next section introduces the related work. Section 3 describes the modeling of the social network in order to take into account the influence information. Section 4 is devoted to the proposed influence model for sentiment inference. In Sect. 5, we present the experimental results and evaluation. Finally, the paper concludes and introduces the future work.

Table 1 Methods of sentiment analysis on social networks

2 Related work

Sentiment analysis on social networks is a new research area that applies the techniques inherited from traditional sentiment analysis to social network analysis. Sentiment analysis generally assumes that user-generated posts are independent and identically distributed (Pozzi et al. 2013). Yet, given user-network information, social network analysis can enhance opinion mining. Several approaches have been proposed to determine the sentiments expressed on social platforms. Traditionally, efforts have mainly focused on classifying the sentiment polarities of individual textual posts without considering the overall sentiments of the users who posted them (Malouf and Mullen 2008; Gryc and Moilanen 2010; Agrawal et al. 2003; Wang and Manning 2012; Maas et al. 2011; Go et al. 2009; Barbosa and Feng 2010). At post-level, most of the proposed approaches fall into two categories: lexicon-based and machine learning-based. On the one hand, lexicon-based approaches generally compare the numbers of positive and negative words determined using predefined external resources and dictionaries, or apply label propagation on a graph of lengthening words. They use polarity dictionaries or lexicons as external resources in order to detect the sentiment polarities (Ding et al. 2008; Bollen et al. 2011; O'Connor et al. 2010; Taboada et al. 2011; Thelwall et al. 2012, 2010). On the other hand, machine learning-based approaches use algorithms such as Naive Bayes, maximum entropy and support vector machines (Go et al. 2009; Barbosa and Feng 2010; Bermingham and Smeaton 2010; Bifet and Frank 2010; Pang et al. 2002; Liu et al. 2012; Speriosu et al. 2011). Recently, some research works have combined these two types of approaches (Fang and Chen 2011; Kumar and Sebastian 2012; Mudinas et al. 2012; Saif et al. 2012). They achieved better results in terms of polarity prediction.
Most of these works are target-independent, that is, the post polarity is classified in general and not according to a specific target topic of interest. They utilize machine learning-based classifiers where all the features used are independent of the target. However, users may refer to multiple target topics in one post, thus it is not reasonable to use target-independent approaches. Jiang et al. (2011) were the first to propose target-dependent sentiment analysis on the Twitter social network. More recently, researchers have been working on texts mentioning multiple aspects, the so-called aspect-based sentiment analysis. It is the identification of the sentiment expressed about each specific aspect of a given target entity in a text (Dragoni and Petrucci 2017, 2018). Moreover, beyond the polarity of individual texts, it is important to recognize the sentiment of each user. This has been addressed in more recent studies. In (Kim et al. 2013), the authors proposed a user-sentiment prediction framework which relies on collaborative filtering techniques, where predicting a sentiment is based on content features of Twitter messages. The state-of-the-art approaches basically assume that the overall sentiments of users are estimated by aggregating the sentiments of the posts in their history corpus (Kaewpitakkun and Shirai 2016; Smith et al. 2013). However, aggregating post sentiments is likely to yield unsatisfactory results, because the ambiguity of natural language and the unstructured format of the data may induce errors and noise. In order to improve user-level sentiment analysis, several researchers have exploited social relationships and network structure (Speriosu et al. 2011; Tan et al. 2011; Pozzi et al. 2013; Hu et al. 2013). Their motivation, with reference to the phenomenon of Homophily, is that connected users may be more likely to hold similar opinions. In (Tan et al. 2011), a user's overall sentiment was determined by looking at his tweets and at who he is connected to. However, it has been demonstrated that considering friendship connections is a weak assumption for modeling Homophily. Pozzi et al. (2013) proposed a framework for user-level polarity classification which integrates post contents with approval relationships. More recent studies defined frameworks for inferring both post-level and user-level sentiments at the same time. The sentiments of the posts are influenced by those of the users and, in the same way, the sentiments of the posts can influence the sentiments of the users (Nozza et al. 2014). What is more, in social networks, where the nodes follow a power-law distribution, some actors play a more important role than others. These actors occupy a strategic position within the network and can be more influential. They are called opinion leaders. The methods for identifying them are classified into two categories: (1) link-based methods (Carson et al. 2007; Li et al. 2013); and (2) combination of link and semantic information (Freeman 1978; Kleinberg 1999; Page et al. 1999).

In Table 1, we describe some of the mentioned methods.

3 Modeling the influence network

When analyzing the communication between people that takes place through online social networks, we distinguish several types of relationships which govern their dynamics. One key relationship is social influence, which occurs when a user’s opinions or sentiments are affected by other people. Influence is defined in the Merriam-Webster dictionary as the power or capacity of a person or thing to cause an effect in indirect or intangible ways. Most studies of social influence assume communication through direct contact between actors to be the underlying process (Leenders 2002). “The more frequent and vivid the communication between ego and alter, the more likely it is that ego will adapt alter’s ideas and beliefs” (Leenders 2002). We assume that two users who have such a relationship will likely share the same sentiment about a given topic of interest. In fact, user behavior in social networks is closely related to cognitive biases: Lee (2010) and Vishwanath (2006) claim that the choices made by individuals are usually shaped by the opinions of others. Indeed, according to the logic of information propagation, a message transmitted to a handful of influencers will be relayed by them across their large networks. Thus people change their opinions due to the influence of their neighbors.

Therefore, we propose a graph-based model that exploits influence patterns to infer sentiment polarities at user-level. We integrate the social factors in a meaningful manner such that the model is capable of inferring polarities effectively. Before introducing the model, we formally define the key components that allow us to explicitly exploit the influence relationships. We first introduce some notations and definitions that we will use throughout the rest of the paper (see Table 2).

Table 2 Notations

Focusing on an individual’s potential to lead his “friend” to engage in a certain act about a certain topic of interest, we base the measurement of influence in social networks on two metrics:

  • Interpersonal activities: we use three interpersonal activities on social networks. Users communicate and interact with one another through likes, comments and shares of each other's posted contents.

  • In-degree: the number of followers of a user, which reflects his popularity. We use it later on to identify opinion leaders and the most influential users in the social network.

In order to quantitatively measure the interactions between users and leverage the metrics, we propose the following functions.

Definition 1

\({{\varvec{LCS}}}\) is a function that determines whether a user \(v_j \, \in \textit{V}\) has liked, commented or shared a particular post \(p_{i,k}\) about a topic q published by a user \(v_i\) in the social network (knowing that \(v_i\) and \(v_j\) are already connected).

$$\begin{aligned} \begin{aligned}&LCS(v_i,p_{i,k},v_j) \\&\quad = \left\{ \begin{array}{r c} 1 &{} if : v_j \in ( L(p_{i,k}) \cup C(p_{i,k}) \cup S(p_{i,k}) ) \\ 0 &{} otherwise \\ \end{array} \right. \end{aligned} \end{aligned}$$
(1)

Definition 2

The ratio of influence (ROI) value represents the measure of influence of a particular user \(v_i\) on another user \(v_j\) with reference to all the posts of \(v_i\) about the topic q (as before, \(v_i\) and \(v_j\) are connected in the network). This ratio is proportional to the number of interpersonal activities of \(v_j\) on the posts of \(v_i\) and to the follower/followed links between them.

$$\begin{aligned} ROI(v_i,v_j)= & {} \dfrac{\sum _{p\in P(v_i)}(LCS(v_i,p,v_j))}{|P(v_i)|} \nonumber \\&+ Fol(v_j,v_i) \end{aligned}$$
(2)

Where \({\textit{Fol}} (v_j,v_i)\) is set to 1 if \(v_j\) follows \(v_i\) (\(v_j \, \in \, F(v_i)\)) and 0 otherwise.
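As an illustration, Definitions 1 and 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `Post` container, the function names and the handling of a user without on-topic posts are introduced only for this example and are not part of the original model.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """A post p_{i,k} with the users who liked, commented on, or shared it."""
    likes: set = field(default_factory=set)      # L(p)
    comments: set = field(default_factory=set)   # C(p)
    shares: set = field(default_factory=set)     # S(p)


def lcs(post, v_j):
    """Eq. (1): 1 if v_j liked, commented on, or shared the post, else 0."""
    return 1 if v_j in (post.likes | post.comments | post.shares) else 0


def roi(posts_of_vi, followers_of_vi, v_j):
    """Eq. (2): share of v_i's posts that v_j interacted with, plus Fol(v_j, v_i)."""
    if posts_of_vi:
        interaction = sum(lcs(p, v_j) for p in posts_of_vi) / len(posts_of_vi)
    else:
        interaction = 0.0  # assumption: no on-topic posts gives a zero interaction term
    fol = 1 if v_j in followers_of_vi else 0
    return interaction + fol


# Hypothetical data: four posts by v_i, one follower ("bob").
posts = [Post(likes={"bob"}), Post(shares={"bob"}), Post(comments={"carol"}), Post()]
print(roi(posts, {"bob"}, "bob"))    # 2/4 + 1 = 1.5
print(roi(posts, {"bob"}, "carol"))  # 1/4 + 0 = 0.25
```

Since the interaction term lies in [0, 1] and the follow term is 0 or 1, the ROI value in this sketch lies in [0, 2].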

Definition 3

The magnitude of influence (MOI) is the root mean square value of ROI over all the friends of a user. The root mean square is generally used for measuring the magnitude of a varying quantity. In our case, it indicates the magnitude of the influence that a particular user exerts on his different friends in the social network. MOI thus reflects the total influence of a user on his network.

$$\begin{aligned} MOI(v_i)= \sqrt{\dfrac{\sum _{v_j\in N(v_i)} (ROI(v_i,v_j)^{2})}{|N(v_i) |}} \end{aligned}$$
(3)
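A possible reading of Definition 3 in code is given below. It is only a sketch: it takes the root mean square over the friends \(N(v_i)\), as the wording of the definition describes, and it assumes that a user without friends has a magnitude of zero. The function and variable names are hypothetical.

```python
import math


def moi(roi_to_friends):
    """Root mean square of ROI(v_i, v_j) over the friends v_j of v_i.

    roi_to_friends maps each friend v_j in N(v_i) to ROI(v_i, v_j).
    """
    if not roi_to_friends:
        return 0.0  # assumption: no friends means no measurable influence
    squares = sum(r * r for r in roi_to_friends.values())
    return math.sqrt(squares / len(roi_to_friends))


# Hypothetical ROI values of one user towards three friends.
roi_values = {"bob": 1.5, "carol": 0.25, "dave": 0.0}
print(moi(roi_values))
```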

In order to give a formal structure of the social network coupled with the influence information, we introduce and define the influence network. The latter is represented as a heterogeneous graph incorporating both posts and influence relationships where nodes and edges can be of different types.

Definition 4

Given a topic of interest q, the influence network is represented by a directed influence graph which is a quadruple \(DIG_q = \{V_q, E_q, X_{q,v}, X_{q,e}\}\), where \(V_q = \{v_1,\ldots ,v_n\}\) is the set of users interested in q; \(E_q =\{(v_i, v_j) \, |\, v_i, v_j \, \in \, V_q\}\) is the set of directed influence edges (meaning that \(v_i\) influences \(v_j\) about q, knowing that \(v_i\) and \(v_j\) are already connected in the network by a friendship relation); \(X_{q,v} = \{\hbox {MOI} (v_i) \, |\, v_i \, \in \, V_q\}\) is the set of weights assigned to the nodes; and \(X_{q,e} = \{\hbox {ROI}(v_i, v_j) \, |\, v_i, v_j \, \in V_q\}\) is the set of weights assigned to the arcs.

Definition 5

Given a directed influence graph \(DIG_q\), a normalized directed influence graph is derived as a triple \(NDIG_q = \{V_q, E_q, C_{q,e}\}\), where \(C_{q,e} = \{w(v_i,v_j) = \dfrac{ROI(v_i,v_j)}{MOI(v_i)} |v_i,v_j \in V_q\}\) is the set of normalized weights of the arcs of influence.
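The normalization of Definition 5 can be illustrated with a short sketch. The dictionary-based graph representation and the treatment of nodes with MOI equal to zero are assumptions made for this example only.

```python
def normalize_edges(roi_edges, moi_nodes):
    """Return w(v_i, v_j) = ROI(v_i, v_j) / MOI(v_i) for every influence arc.

    roi_edges maps arcs (v_i, v_j) to ROI(v_i, v_j); moi_nodes maps v_i to MOI(v_i).
    """
    weights = {}
    for (v_i, v_j), r in roi_edges.items():
        m = moi_nodes[v_i]
        # Assumption: arcs leaving a node with MOI = 0 get weight 0
        # instead of triggering a division by zero.
        weights[(v_i, v_j)] = r / m if m > 0 else 0.0
    return weights


# Hypothetical arcs and node weights.
roi_edges = {("alice", "bob"): 1.5, ("alice", "carol"): 0.5, ("dave", "bob"): 0.0}
moi_nodes = {"alice": 2.0, "dave": 0.0}
w = normalize_edges(roi_edges, moi_nodes)
print(w[("alice", "bob")])  # 1.5 / 2.0 = 0.75
```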

Definition 6

Given a \(NDIG_q\), a heterogeneous normalized directed influence graph is a quintuple \(HNDIG_q =\{V_q,E_q,C_{q,e}, P_q,X_{q,v}\}\), where \(P_q\) is the set of posts about q. An example of a heterogeneous normalized directed influence graph is presented in Fig. 1. In this graph, the user nodes are weighted by the MOI measure and the arcs of influence between two users are weighted by the w values introduced in Definition 5.

Fig. 1 An example of \(HNDIG_q\)

4 Sentiment polarity prediction

Given a fixed topic of interest q and a \(HNDIG_q\), let \(y_i \, \in \, \{-1,+1\}\) be the polarity label that defines each user and post sentiment as either “positive” \((+1)\) or “negative” \((-1)\) towards the topic q. Let \(Y_v\) be the vector of labels of all users and \(Y_p\) that of all posts. In particular, we distinguish two categories of users: labeled users, for whom the polarity labels are known, and unlabeled users, whose polarity labels are unknown. Given the difficulty of collecting labels and the scale of social networks, we work within a semi-supervised learning paradigm. We assume that only a small set of users is already labeled. Thus, our task is to predict the polarity labels of all the unlabeled users.

We define a model which obeys the Markov assumption, implying that the sentiment label of a user is determined by the sentiment labels of his posts (user–post factor) and by those of the adjacent connected neighbors who may influence him (user–user factor). Under that assumption, the probabilistic model is defined as follows.

$$\begin{aligned} \begin{aligned} \log P(Y_v)&= \left( \sum _{v_i\in V}^{} \left[ \sum _{p\in P(v_i),k,l}^{} \mu _{k,l}f_{k,l}(y_{v_i},y_p) \right. \right. \\&\quad \left. \left. + \sum _{v_j\in N(v_i),k,l}^{} \lambda _{k,l}h_{k,l}(y_{v_i},y_{v_j})\right] \right) \\&\quad - \log Z \end{aligned} \end{aligned}$$
(4)

Where:

  • The indices k, l range over the set of polarity labels {\(-1,+1\)} that defines each user or post node sentiment as either positive or negative;

  • \(\mu _{k,l}\) and \(\lambda _{k,l}\) are the impact parameters;

  • \(f_{k,l} (\cdot ,\cdot )\) is the feature function that evaluates the user–post factor;

  • \(h_{k,l} (\cdot ,\cdot )\) is the user–user feature function;

  • \(y_p\) is the sentiment label of the post p;

  • Z is a normalization factor.

User–post factor. A user’s posts are expected to provide information about his opinion. The user–post feature function evaluates the agreement between post polarity and user sentiment, weighted by a level of confidence that depends on whether the user is initially labeled or not. It is defined by the different configurations specified by the indices k and l.

The levels \(\tau _{labeled}\) and \(\tau _{unlabeled}\) are estimated based on the assumption that initial labels are the most trustworthy, so we fixed \(\tau _{labeled} = 1.0\) and \(\tau _{unlabeled} = 0.125\). Note that this feature function assumes that each post has to be classified.

$$\begin{aligned} f_{k,l}(y_{v_i},{\hat{y}}_p)= \left\{ \begin{array}{r c} \dfrac{\tau _{labeled}}{\mid P(v_i) \mid } &{} y_{v_i}= k, {\hat{y}}_p= l, v_i : labeled \\ \dfrac{\tau _{unlabeled}}{\mid P(v_i) \mid } &{} y_{v_i}= k, {\hat{y}}_p= l, v_i : unlabeled \\ 0 &{} otherwise \end{array} \right. \end{aligned}$$
(5)
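For illustration, Eq. (5) can be written as a small Python function. This is a sketch only: the function signature is hypothetical, and returning 0 for a user without posts is an assumption, since the equation is undefined in that case.

```python
TAU_LABELED = 1.0      # confidence level for initially labeled users
TAU_UNLABELED = 0.125  # confidence level for unlabeled users


def user_post_feature(k, l, y_user, y_post_hat, n_posts, user_is_labeled):
    """Eq. (5): tau / |P(v_i)| when (y_user, y_post_hat) matches (k, l), else 0."""
    if n_posts == 0 or y_user != k or y_post_hat != l:
        return 0.0  # assumption: users without posts contribute nothing
    tau = TAU_LABELED if user_is_labeled else TAU_UNLABELED
    return tau / n_posts


# A labeled positive user with four posts, evaluated on one positive post.
print(user_post_feature(1, 1, 1, 1, 4, True))   # 1.0 / 4 = 0.25
print(user_post_feature(1, 1, 1, 1, 4, False))  # 0.125 / 4 = 0.03125
```

Dividing by the number of posts keeps the total user–post contribution of a user bounded by his confidence level, regardless of how many posts he has published.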

Definition 7

The in-degree normalization function \(\delta (v_i)\) normalizes the number of followers (in-degree) of a user to the range [0,1].

$$\begin{aligned} \delta (v_i)= \dfrac{\ln (|F(v_i)|- \min _{v_j \in V} |F(v_j)|)}{\ln (\max _{v_j \in V} |F(v_j)|- \min _{v_j \in V} |F(v_j)|)} \end{aligned}$$
(6)
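A direct transcription of Eq. (6) is sketched below. The logarithm is undefined when a user has the minimum number of followers, and the denominator vanishes when the spread between the maximum and the minimum is 1; returning 0 in these cases is an assumption of this sketch, not something specified in the text.

```python
import math


def in_degree_normalization(n_followers, min_followers, max_followers):
    """Eq. (6): log-scaled position of |F(v_i)| between the minimum and maximum in-degree."""
    offset = n_followers - min_followers
    spread = max_followers - min_followers
    # Assumption: return 0 where Eq. (6) is undefined (log of zero)
    # or where its denominator would be zero or undefined (spread <= 1).
    if offset <= 0 or spread <= 1:
        return 0.0
    return math.log(offset) / math.log(spread)


# Hypothetical network where in-degrees range from 10 to 110 followers.
print(in_degree_normalization(110, 10, 110))  # most followed user
print(in_degree_normalization(20, 10, 110))
```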

Definition 8

The influence rank (IR) is used to rank influential users. Opinion leaders are those with high values of IR. This measure is proportional to the followers' magnitude of influence.

$$\begin{aligned} IR(v_i)= (1-\delta (v_i)) \dfrac{\sum _{v_j\in F(v_i)} IR(v_j)}{\mid F(v_i) \mid } + \delta (v_i) MOI(v_i) \end{aligned}$$
(7)

In order to calculate the IR values, we use the limited recursive algorithm (LRA) (Hajian and White 2011). It recursively explores the neighborhood of the node for which the influence rank is estimated. The estimate is obtained by traversing the neighborhood of each node up to a given maximum depth.

[Algorithm listing: limited recursive algorithm (LRA) for computing the IR values]
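The depth-limited recursion described above can be sketched as follows. This is not the published LRA listing: it reads Eq. (7) as a weighted combination of the mean follower rank (weight \(1-\delta\)) and the user's own MOI (weight \(\delta\)), and it assumes that the recursive term is dropped once the depth budget is spent or when a user has no followers. All names are hypothetical.

```python
def influence_rank(v, followers, delta, moi, max_depth):
    """Depth-limited evaluation of the influence rank recursion.

    followers[v] lists F(v); delta[v] and moi[v] are precomputed per user.
    """
    own_term = delta[v] * moi[v]
    fol = followers.get(v, [])
    if max_depth == 0 or not fol:
        # Assumption: stop recursing at the depth limit or at users without followers.
        return own_term
    mean_follower_rank = sum(
        influence_rank(u, followers, delta, moi, max_depth - 1) for u in fol
    ) / len(fol)
    return (1 - delta[v]) * mean_follower_rank + own_term


# Hypothetical three-user network; "a" and "b" follow each other (a cycle).
followers = {"a": ["b", "c"], "b": ["a"], "c": []}
delta = {"a": 0.5, "b": 0.5, "c": 1.0}
moi = {"a": 2.0, "b": 1.0, "c": 4.0}
print(influence_rank("a", followers, delta, moi, max_depth=0))  # 0.5 * 2.0 = 1.0
print(influence_rank("a", followers, delta, moi, max_depth=1))  # 0.5 * 2.25 + 1.0 = 2.125
```

The depth limit is what guarantees termination on cyclic follower graphs such as the one in the example.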

User–user factor. We recognize that the social influence relationships between users can correlate with agreement in sentiment. The user–user feature function evaluates the agreement of a user with his neighbor’s opinion by referring to their social relationships of friendship and influence. To the best of our knowledge, this is the first work aiming at introducing information about influence in the user–user factor. Similarly to the user–post function, it is defined by the different configurations specified by the indices k and l.

In our experiment settings, we set \(\tau _{relation}\) and \(\tau _{influence}\) to 0.7 and 0.6 in order to adjust the importance of friendship and influence information.

$$\begin{aligned} \begin{aligned}&h_{k,l}(y_{v_i},y_{v_j}) \\&\quad = \left\{ \begin{array}{r c} \dfrac{\tau _{relation}}{\mid N(v_i)\mid } + \dfrac{\tau _{influence}}{\mid N(v_i)\mid } \times \dfrac{1}{1-IR(v_j)} &{} y_{v_i}= k, y_{v_j}= l \\ 0 &{} otherwise \end{array} \right. \end{aligned} \end{aligned}$$
(8)
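Eq. (8) can be illustrated in the same way as the user–post function. The signature below is hypothetical, and rejecting \(IR(v_j) = 1\) is an assumption added to avoid a division by zero that the equation does not address.

```python
TAU_RELATION = 0.7   # importance of the friendship information
TAU_INFLUENCE = 0.6  # importance of the influence information


def user_user_feature(k, l, y_i, y_j, n_neighbors, ir_j):
    """Eq. (8): friendship term plus influence term scaled by 1 / (1 - IR(v_j))."""
    if n_neighbors == 0 or y_i != k or y_j != l:
        return 0.0
    if ir_j == 1.0:
        # Assumption: Eq. (8) is undefined for IR(v_j) = 1, so we reject it explicitly.
        raise ValueError("IR(v_j) must differ from 1")
    friendship = TAU_RELATION / n_neighbors
    influence = (TAU_INFLUENCE / n_neighbors) * (1.0 / (1.0 - ir_j))
    return friendship + influence


# Two positive neighbors; the neighbor v_j has influence rank 0.5.
print(user_user_feature(1, 1, 1, 1, n_neighbors=2, ir_j=0.5))
```

In this form, a neighbor whose influence rank approaches 1 from below contributes an increasingly large influence term.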

The user–post and user–user factors are estimated directly from simple statistics, using counts from the labeled data. It remains to estimate the optimal parameter values \(\mu _{k,l}\) and \(\lambda _{k,l}\) such that the assignment of polarity labels to users maximizes \(\log P(Y_v)\). To learn these parameters, we use the SampleRank algorithm (Wick et al. 2009).

4.1 SampleRank algorithm (learning)

For simplicity, we denote by \(\phi\) the vector of parameters \(\mu _{k,l}\) and \(\lambda _{k,l}\). We aim to learn these parameters by maximizing \(\log P(Y_v)\) with respect to \(\phi\). For this purpose, we employ the SampleRank algorithm. In this algorithm, a sampling function randomly chooses an element of \(Y_v\) and flips its polarity. The algorithm converges when the objective function and the accuracy difference between \(Y^{new}\) and Y do not increase for a given number of steps.

[Algorithm listing: SampleRank learning procedure]

Where:

  • The “Sampling” function is used to sample from a uniform distribution which reverts the polarity of an element of \(Y_v\) randomly chosen.

  • The “Initialize” function initializes the values of parameters by simply using counts from the subsets of labeled users and posts.

    $$\begin{aligned} \begin{aligned} \mu _{k,l} = \dfrac{\sum _{(v_i,p)\in E_{labeled}}^{ } I(Y_{v_i}=k,Y_p=l) }{X} \end{aligned} \end{aligned}$$
    (9)

    where:

    $$\begin{aligned} X= & {} \sum _{(v_i,p)\in E_{l}}^{ } I(Y_{v_i}=k,Y_{p}=1) \nonumber \\&+ I(Y_{v_i}=k,Y_p=-1) \end{aligned}$$
    (10)

    and :

    $$\begin{aligned} \lambda _{k,l} = \dfrac{\sum _{(v_i,v_j)\in E_{labeled}}^{ } I(Y_{v_i}=k,Y_{v_j}=l) }{Y} \end{aligned}$$
    (11)

    Where

    $$\begin{aligned} Y= & {} \sum _{(v_i,v_j)\in E_{l}}^{ } I(Y_{v_i}=k,Y_{v_j}=1)\nonumber \\&+ I(Y_{v_i}=k,Y_{v_j}=-1) \end{aligned}$$
    (12)

    \(I(\cdot )\) is the indicator function. Thus, \(\mu _{k,l}\) is set to 1 if k = l and 0 otherwise (assuming that negative users share only negative posts and, conversely, positive users share only positive posts). In our experiments, we set the learning rate \(\eta\) to 0.001.

  • The performance function “w” measures the accuracy difference between \(Y^{new}\) and Y on the labeled data only. Accuracy is defined in Sect. 5.3.

  • “Convergence”: the solution converges when the objective function does not increase for a given number of steps. We set the maximum number of steps to 10,000.
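The components listed above can be put together in the following sketch. It is not the authors' listing: the update rule is the usual perceptron-style SampleRank step (move the weights when the model ranking and the performance ranking of Y and \(Y^{new}\) disagree), the early-stopping test is replaced by a plain cap on the number of steps, and the feature function passed in by the caller is a hypothetical stand-in for the user–post and user–user factors.

```python
import random


def initialize(pairs, labels=(-1, 1)):
    """Count-based estimates in the spirit of Eqs. (9)-(12).

    pairs holds (k, l) label pairs observed on labeled edges; the result maps
    (k, l) to the share of such pairs among all pairs whose first label is k.
    """
    params = {}
    for k in labels:
        total = sum(1 for a, _ in pairs if a == k)
        for l in labels:
            hits = sum(1 for a, b in pairs if a == k and b == l)
            params[(k, l)] = hits / total if total else 0.0
    return params


def sample_rank(y, truth, features, phi, eta=0.001, max_steps=10_000, seed=0):
    """Perceptron-style SampleRank loop over single-label flips.

    y: labels {user: -1 or +1}; truth: gold labels of the labeled users;
    features(y): {parameter key: feature total}; phi: {parameter key: weight}.
    """
    rng = random.Random(seed)
    y, phi = dict(y), dict(phi)
    users = sorted(y)

    def accuracy(assign):
        return sum(assign[u] == t for u, t in truth.items()) / len(truth)

    for _ in range(max_steps):
        y_new = dict(y)
        flipped = rng.choice(users)            # "Sampling": flip one random label
        y_new[flipped] = -y_new[flipped]
        f_old, f_new = features(y), features(y_new)
        grad = {key: f_new.get(key, 0.0) - f_old.get(key, 0.0) for key in phi}
        model_gain = sum(phi[key] * grad[key] for key in phi)
        perf_gain = accuracy(y_new) - accuracy(y)  # performance on labeled data only
        if model_gain > 0 and perf_gain < 0:
            for key in phi:                    # model prefers the worse assignment
                phi[key] -= eta * grad[key]
        elif model_gain <= 0 and perf_gain > 0:
            for key in phi:                    # model misses the better assignment
                phi[key] += eta * grad[key]
        if model_gain > 0:
            y = y_new                          # keep the flip if the model score improved
    return phi, y


def toy_features(assign):
    """Hypothetical single feature: the number of users labeled +1."""
    return {"positive": float(sum(1 for v in assign.values() if v == 1))}


mu = initialize([(1, 1), (1, 1), (1, -1), (-1, -1)])
phi, y_final = sample_rank({"a": -1, "b": -1}, {"a": 1}, toy_features,
                           {"positive": 0.0}, max_steps=200)
```

In the toy run, the weight of the single feature is pushed up the first time flipping the labeled user "a" to +1 improves accuracy without being preferred by the model; afterwards the model itself accepts that flip.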

Table 3 Statistics of our main dataset

4.2 Document polarity classification

The main polarity classification methods are target-independent. However, we need to classify the sentiment of a post towards a certain topic. For this reason, we exploit the method proposed in (Vo and Zhang 2015). It aims at categorizing the sentiment in a tweet towards a specific target by extracting a rich set of automatic features. The model takes a textual post (tweet) as input and outputs its sentiment polarity about a topic of interest (target). The system represents a post using distributed word representations. From the matrix representation, the right and left contexts are extracted (consisting of all the words on the right and on the left of the target). The contexts are represented using standard distributed word embeddings. In addition, lexicon-based distributed contexts are generated by filtering out the words that are not in the sentiment lexicon. The new contexts are thus sentiment-bearing only. From the rich set of contexts, a set of rich automatic features is extracted using row-wise pooling functions. The set of real-valued features is used as input to the final sentiment classifier. For binary classification, a linear model, whose input is the set of rich real-valued features, is trained by optimizing the following objective function:

$$\begin{aligned} \min \frac{1}{2} w^{T}w + C\sum _{i=1}^{l} L(w;x_i;y_i) \end{aligned}$$
(13)

Where C \(\ge\) 0 is a penalty parameter and \(L(w;x_i;y_i)\) is a loss function.
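To make Eq. (13) concrete, the objective can be evaluated as in the sketch below. The text does not specify the loss function L; the hinge loss used here is an assumption chosen for illustration, and the function names are hypothetical.

```python
def hinge_loss(w, x, y):
    """Hinge loss max(0, 1 - y * w.x); an assumed choice for L in Eq. (13)."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return max(0.0, 1.0 - margin)


def objective(w, xs, ys, C, loss=hinge_loss):
    """Eq. (13): 0.5 * w^T w + C * sum of the losses over the training examples."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    return regularizer + C * sum(loss(w, x, y) for x, y in zip(xs, ys))


# Two hypothetical training posts with two real-valued features each.
w = [1.0, -1.0]
xs = [[2.0, 0.0], [0.0, 0.5]]
ys = [1, 1]
print(objective(w, xs, ys, C=2.0))  # 1.0 + 2.0 * (0.0 + 1.5) = 4.0
```

A larger C puts more weight on fitting the training posts, while a smaller C favors a smaller weight vector.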

5 Experimental results and evaluation

In this section, we carry out experiments over real-world social networks to present the effectiveness of the proposed model on user sentiment polarity inference. In particular, we focus our investigation on data from Twitter. We begin by describing the dataset and presenting the observations that validate our intuitions as to how the influence information helps sentiment classification. We finally analyze the performance results.

5.1 Data distribution

In order to conduct the experiments and evaluate the proposed model, we need a dataset composed of:

  • a set (V) of labeled users about a specific topic q,

  • sets of the friends and followers of all users \(\in\) V,

  • labeled tweets published by users \(\in\) V about the specific topic q,

  • likes, comments and shares of the tweets about q.

To the best of our knowledge, no available dataset contains all of the above information, so we use the Twitter API to crawl Twitter data. Table 3 shows the basic statistics of all the data collected on the selected topics from different domains: politics and music (Donald Trump, Hillary Clinton and Lady Gaga). “On-topic posts” means that the posts mention the topic by its assigned name. Our goal was to find a large set of users whose sentiment polarities are clear, so that the gold-standard labels could be reliable. We selected a set of high-profile celebrities of the political and musical worlds, and a set of users who are opposed to them. Two annotators manually tagged the sentiment labels of the users and their corresponding tweets. We tried to find topics with a more balanced class distribution using the profiles originally collected. The corresponding keywords are the highest-ranked words among all the words in the profiles according to the TF-IDF method. The resulting data are used to carry out our main experiments: we build graphs from the users with gold-standard sentiment polarity labels and the edges between them.

In Fig. 2, we describe the distribution of positive and negative posts about the different topics, given as the proportions of these posts among all the on-topic posts.

Fig. 2 Statistics of the corpus of on-topic tweets

5.2 Observational statistics

In this section we investigate the degree to which social influence relationships and user label polarities correlate, since the motivation of our work is that users having influence relationships tend to exhibit similar sentiment.

We have defined two types of statistics to study the interplay between user label polarity and influence relationships. These statistics are as follows:

  • The probability that two users are influenced by one another, conditioned on whether or not they have the same label: this statistic measures influence conditioned on labels. Figure 3 shows that shared sentiment tends to imply influence. In the resulting graphs, users are more likely to have an influence relation if they share an opinion than if they differ.

  • The probability that two users have the same label, conditioned on whether or not they are influenced by one another: this statistic measures shared sentiment conditioned on having an influence relationship. Figure 4 shows that the probability that two users who influence one another share the same sentiment on a topic is much higher than chance.

In sum, user pairs in which at least one influences the other are more likely to hold the same sentiment and two users with the same sentiment are more likely to be influenced one by another than two users with different sentiments. These observations validate our intuition that influence and shared sentiment are clearly correlated.
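The two statistics can be computed from a labeled graph as sketched below. The data structures are hypothetical, and the sketch treats influence as symmetric, in the sense that a pair counts as influenced when at least one user influences the other.

```python
from itertools import combinations


def ratio(numerator, denominator):
    """Safe division that returns 0 when the conditioning event never occurs."""
    return numerator / denominator if denominator else 0.0


def influence_label_statistics(labels, influence_pairs):
    """Conditional probabilities relating shared labels and influence.

    labels: {user: -1 or +1}; influence_pairs: set of frozenset({u, v}) pairs
    in which at least one user influences the other.
    """
    counts = {(same, inf): 0 for same in (True, False) for inf in (True, False)}
    for u, v in combinations(sorted(labels), 2):
        same = labels[u] == labels[v]
        influenced = frozenset((u, v)) in influence_pairs
        counts[(same, influenced)] += 1
    n_same = counts[(True, True)] + counts[(True, False)]
    n_diff = counts[(False, True)] + counts[(False, False)]
    n_inf = counts[(True, True)] + counts[(False, True)]
    n_no_inf = counts[(True, False)] + counts[(False, False)]
    return {
        "P(influence | same label)": ratio(counts[(True, True)], n_same),
        "P(influence | different label)": ratio(counts[(False, True)], n_diff),
        "P(same label | influence)": ratio(counts[(True, True)], n_inf),
        "P(same label | no influence)": ratio(counts[(True, False)], n_no_inf),
    }


# Hypothetical toy network with four labeled users and two influence pairs.
labels = {"a": 1, "b": 1, "c": -1, "d": -1}
influence_pairs = {frozenset(("a", "b")), frozenset(("a", "c"))}
stats = influence_label_statistics(labels, influence_pairs)
print(stats)
```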

Fig. 3 Probability of influence relationship conditioned on whether or not users have the same sentiment label

Fig. 4 Probability of two users having the same sentiment label, conditioned on having an influence relationship

5.3 Performance analysis

Because the SampleRank algorithm is randomized through its sampling function, we performed inference k times (k \(\in \{1, 3, 5, 11\}\)) to get k predictions. The idea is to retain, for each user, the majority vote among the k predicted labels.
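The voting step can be sketched as follows, assuming each inference run returns a dictionary of predicted labels per user (a hypothetical data layout). With an odd k and binary labels, ties cannot occur.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine k runs, each a dict {user: -1 or +1}, into one label per user."""
    users = predictions[0].keys()
    return {
        u: Counter(run[u] for run in predictions).most_common(1)[0][0]
        for u in users
    }


# Three hypothetical inference runs over two users.
runs = [
    {"alice": 1, "bob": -1},
    {"alice": 1, "bob": 1},
    {"alice": -1, "bob": -1},
]
print(majority_vote(runs))  # {'alice': 1, 'bob': -1}
```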

We ran the experiments 10 times. Each time, the data with ground-truth labels are partitioned into a training set and an evaluation set. The first set is composed of 50 positive users and 50 negative users, chosen randomly. The second set consists of the remaining labeled users. The ratio between the two sets differs across topics.

As part of the model, we need the tweet labels, which are obtained by running the method proposed in (Vo and Zhang 2015) and described in Sect. 4.2.

We compare two user classification methods in order to evaluate our results.

  • Heterogeneous graph model (Tan et al. 2011): it performs semi-supervised learning on the heterogeneous graph representing the users, their mutual connections and their posts. Then, it applies loopy belief propagation to obtain user-level sentiment labels.

  • Heterogeneous influence graph model (HIG): we perform our semi-supervised learning on the heterogeneous influence graph to obtain the user classification.

In order to evaluate the obtained results, we introduce the performance results for the different considered methods. We evaluate performance using precision (P), recall (R) and F1-score (F1) on each topic.

$$\begin{aligned} P= & {} \frac{TP}{TP+FP} \end{aligned}$$
(14)
$$\begin{aligned} R= & {} \frac{TP}{TP+FN} \end{aligned}$$
(15)
$$\begin{aligned} F1= & {} \frac{2 P R}{P + R} \end{aligned}$$
(16)

Accuracy is computed from the same prediction counts:

$$\begin{aligned} Accuracy = \frac{TP + TN}{TP + FP + FN + TN} \end{aligned}$$
(17)

where TP is the number of true positives, FP the number of false positives, FN the number of false negatives and TN the number of true negatives among the predictions.
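Eqs. (14) to (17) can be computed from the gold and predicted labels as in the sketch below. The function name is hypothetical, and returning 0 when a denominator is zero is an assumption, since the equations are undefined in that case.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1 and accuracy as in Eqs. (14)-(17)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    # Assumption: undefined ratios (zero denominators) are reported as 0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"P": precision, "R": recall, "F1": f1, "Accuracy": accuracy}


# Hypothetical gold and predicted labels for four users.
metrics = classification_metrics([1, 1, -1, -1], [1, -1, -1, -1])
print(metrics)  # TP = 1, FP = 0, FN = 1, TN = 2
```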

Fig. 5 Performance analysis of accuracy using the two methods: (Tan et al. 2011) and our proposed model HIG

The results confirm the performance improvement in sentiment prediction. Figure 5 shows these results for the different methods and that our model achieves the highest accuracies. Finally, the F1 measures are reported in Table 4.

Table 4 F1 results

5.4 Discussion

In this paper, we have made two key observations: the probability that two users are influenced by one another is higher when they have the same label than when they do not; and the probability that two users have the same label, given that they are influenced by one another, is much higher than chance. These observations confirm our hypothesis, and the results have demonstrated that our proposed model for sentiment classification at user level is promising and worth further investigation. The more two users influence each other, the more likely they are to have similar sentiment polarities. Introducing information about influence has thus improved the task of sentiment classification and shown the practical use of our proposed heterogeneous influence graph.

We have illustrated how to use our graph model to enhance polarity classification for new users. We also believe that our proposed model can easily be applied to other applications.

The influencers, generally, have a large set of following users expressed by high values of in-degree. The followers engage in discussions on the topics of the influencers. Not only that, they retweet these posts so the audience multiplies over time. Through them, other influential people, belonging to their network, can be identified. They can in turn influence the opinion of their own friends. This can be explained by the fact that one user follows an influencer because he considers him to be an opinion leader. So the factor that contributes to their popularity is credibility.

However, our model is applied only at user level. It would be interesting to add the prediction of the polarities of the posts too. Above all, we consider that the posts are independent and concern only one topic at a time, whereas many real posts relate to several topics and can also influence each other.

6 Conclusion and future work

The general idea of this paper is to support sentiment analysis by exploring social network structures. We demonstrated that sentiment analysis at user-level can be improved by incorporating influence information from a social network. We proposed a polarity classification approach which exploits influence relationships to represent the principle of influence, in addition to direct connections reflecting the principle of homophily. The computational results show that our model outperforms an approach which considers only information about homophily. These results show that the proposed model is promising and worth further investigation. In this work, we addressed the problem in the context of Twitter. A straightforward next step would be to build datasets from other online social networks and across more general topics. We also want to compare our approach with other user-level polarity classifiers.

As a perspective, it will be interesting to add the prediction of the polarities of the posts based on aspect-based sentiment analysis (ABSA). We also plan to validate our model on a larger dataset collected from different social network sites.