1 Introduction

From the perspective of information propagation, forwarding can be viewed as an atomic behavior: published messages are visible to followers, and a follower can quickly share information that he or she is interested in. Because it directly drives information diffusion, forward prediction has been applied in many fields [1, 2]. The relevant research can help us explore the direction of information dissemination [3] and has positive significance for public opinion control [4]. Although current research on forward prediction has achieved positive results, some challenges still remain.

On the one hand, user behavior in social networks arises from complex factors [5]. Users are the core media of online social network information dissemination and directly affect the breadth and depth of information diffusion. Although current research considers the impact of user attributes on information forwarding, it covers only basic attributes, such as the number of fans or friends [6], and ignores the intrinsic mechanisms [7] that affect forward behavior, such as interests and habits [8]. Moreover, user interests tend to be multidimensional, which may lead users to participate in different social actions and to be influenced by different users. However, few studies consider the multidimensional interests of users.

On the other hand, many methods for forward prediction in social networks consider only static features and attributes [9, 10], and few works take the time factor into consideration [11]. Not only do nodes and edges change over time, but forwarding behavior changes as well. Incorporating the time factor into forward prediction methods is therefore promising.

In order to analyze the complexity of user forward behavior, we map it into multiple driving mechanisms. Both internal driving mechanisms, such as interests and habits, and the external driving mechanism of network structure are considered. By introducing time information, we propose a model that dynamically monitors user forward behavior. To verify the proposed model, we evaluate it on real data from Sina Weibo. Experimental results indicate that the model not only mines latent user topics in multiple dimensions but also improves forward prediction performance.

Our contribution can be summarized as follows:

  • In order to analyze the complexity of user forward behavior, it is mapped into three driving mechanisms: interest-driven, habit-driven, and structure-driven. By analyzing and quantifying these driving mechanisms, we can effectively predict user forward behavior.

  • Because some user attributes are continuous, the traditional LDA text modeling method is extended with Gaussian distributions and applied to user interest, activity, and influence modeling. In this way, the user topic distribution for each dimension can be obtained regardless of whether the words are discrete or continuous.

  • Current models consider only static features, whereas pre-discretization helps LDA detect topic evolution automatically. Our model is therefore extended with a pre-discretizing method: by introducing time information, we can dynamically monitor user activity and mine hidden behavioral habits.

The remainder of this paper is organized as follows. Section 2 introduces related work. Section 3 formulates the problem and gives the necessary definitions. Section 4 explains the proposed model and describes the learning algorithm. Section 5 presents and analyzes the experimental results. Finally, Sect. 6 concludes the paper.

2 Related Work

In online social networks, information dissemination depends mainly on forward behavior. Forward prediction is achieved by learning user interests and behavior patterns. According to their underlying assumptions, we organize the related work into the two broad categories mentioned above: user behavior and interest modeling, and dynamic modeling.

Forward Prediction with User Behavior and Interest Models.

Some prior works focused on predicting forwarding based on similar interests [12, 13]. These approaches treat forwarding as the way people interact with messages, which is critical to understanding user behavior patterns and modeling user interest. Qiu et al. [14] proposed an LDA-based behavior-topic model that jointly models user topic interests and behavioral patterns. Bin et al. [15] proposed two novel Bayesian models that allow the prediction of future behavior and user interest in a wide range of practical applications. Comarela et al. [16] studied factors that influence users' responses and found that a user's previous behavior, the freshness of information, and the length of a message can all affect the response. However, most existing work fails to account for the complex drivers of user behavior and neglects intrinsic mechanisms such as habits.

Modeling and Predicting Forward Behavior Dynamically.

Many studies propose dynamic modeling of user behavior. Ahmed et al. [17] proposed a time-varying model, assuming that user actions are fully exchangeable and that users' interests are not fixed over time; they divided user actions into several epochs based on the timestamp of each action and modeled the actions within each epoch using LDA. Liu et al. [18] proposed a fully dynamic topic community model to capture the time-evolving latent structures in such social streams. Moreover, some studies [19,20,21] assumed that data in similar spaces are exchangeable and effectively captured the dynamics of topics in messages. Zhao et al. [22] proposed a dynamic user clustering topic model that adaptively tracks changes in each user's time-varying topic distribution, based both on the short texts the user posts during a given time period and on the previously estimated distribution. Most of these methods implement dynamic modeling by quantifying user actions or interests.

This paper models user interest and behavior using GLDA, which integrates a Gaussian distribution into LDA. The internal and external driving mechanisms are jointly incorporated into the model. Meanwhile, the time factor is introduced through a pre-discretizing method, which helps GLDA detect topic evolution dynamically. In this way, we can obtain user interests, mine hidden behavioral habits, and predict forward behavior dynamically.

3 Problem Definition

3.1 Related Definitions

We use \( G = (V,E) \) to denote the structure of a social network, where V is the set of all users and E is an \( N \times N \) matrix, with each element \( e_{m,n} = 0\,\,{\text{or}}\,\, 1 \) indicating whether user \( v_{m} \) has a link to user \( v_{n} \). The cardinality \( \left| V \right| = N \) denotes the total number of users in the network. For predicting forward behavior, some basic concepts and related definitions are introduced.

Definition 1.

Interest following vector \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{e}_{m}^{(a)} = \left[ {e_{m,1}^{(a)} ,e_{m,2}^{(a)} , \ldots ,e_{{m,N_{m} }}^{(a)} } \right] \).

We hold the view that users are more likely to follow users they are interested in. Therefore, following behavior is used to define the interest following vector. \( e_{m,n}^{(a)} (n = 1,2, \ldots ,N_{m} ) \) is a user followed by user \( v_{m} \), referred to here as a followed user, and \( N_{m} \) is the number of followed users.

Definition 2.

Interest interacting vector \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{e}_{m}^{(i)} = \left[ {e_{m,1}^{(i)} ,e_{m,2}^{(i)} , \ldots ,e_{{m,N_{m}^{{\prime }} }}^{(i)} } \right] \).

The following relationship only indicates the possibility of interaction between users and thus reflects static user interests. By analyzing historical interactions, we can mine active user interests. \( e_{m,n}^{(i)} \;(n = 1,2, \ldots ,N_{m}^{{\prime }} ) \) is a user who has interacted with user \( v_{m} \), referred to here as an interacted user, and \( N_{m}^{{\prime }} \) is the number of interacted users.

Definition 3.

Interest-driven vector \( I(v_{m} ) = \left[ {e_{m,1}^{(a)} , \ldots ,e_{{m,N_{m} }}^{(a)} ,e_{m,1}^{(i)} , \ldots ,e_{{m,N_{m}^{{\prime }} }}^{(i)} } \right] \).

The interest-driven vector is referred to as the user interest document; it is the concatenation of the followed users and the interacted users. Each followed or interacted user is referred to here as a behavioral user.
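A minimal sketch of Definitions 1-3: the interest-driven vector is simply the concatenation of a user's followed users and interacted users, each treated as a "word" in the user's interest document. The helper name and user IDs below are illustrative, not from the paper.

```python
def build_interest_document(followed, interacted):
    """Return the interest-driven vector I(v_m): the followed users
    (Definition 1) followed by the interacted users (Definition 2),
    each entry acting as one 'word' of the interest document."""
    return list(followed) + list(interacted)

followed = ["u7", "u12", "u3"]   # e_m^(a): users that v_m follows
interacted = ["u12", "u40"]      # e_m^(i): users v_m has interacted with
doc = build_interest_document(followed, interacted)
# len(doc) equals N_m + N'_m; repeated users are kept, since a user who is
# both followed and interacted with contributes two 'words'.
```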

Definition 4.

Habit-driven vector \( A(v_{m} ,t) = [x_{m,t,1} ,x_{m,t,2} ] \).

The activity is divided into post activity \( x_{m,t,1} \) and forward activity \( x_{m,t,2} \). Considering the characteristics of users' daily routines, we divide a day into four six-hour slices and map the activity-related attributes into these time slices, defined as:

$$ \left\{ {\begin{array}{*{20}l} {x_{m,t,1} = n_{m,t}^{pos} /n^{pos} } \hfill \\ {x_{m,t,2} = n_{m,t}^{ret} /n^{ret} } \hfill \\ \end{array} } \right. $$
(1)

where \( n_{m,t}^{pos} \) and \( n_{m,t}^{ret} \) denote the average number of posts or forwards of user \( v_{m} \) in time slice t, and \( n^{pos} \) and \( n^{ret} \) are the average number of posts or forwards per day.
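The mapping of Eq. (1) can be sketched as follows, assuming hourly timestamps and the four six-hour slices defined above; the function name and the sample data are illustrative, not from the paper.

```python
def activity_vector(post_hours, forward_hours, n_days):
    """Return A(v_m, t) = [x_{m,t,1}, x_{m,t,2}] for each of the 4 slices,
    per Eq. (1): per-slice daily average divided by the overall daily average."""
    slices = 4
    post_counts = [0] * slices
    fwd_counts = [0] * slices
    for h in post_hours:                 # hour of day, 0-23
        post_counts[h // 6] += 1
    for h in forward_hours:
        fwd_counts[h // 6] += 1
    n_pos = len(post_hours) / n_days     # n^pos: average posts per day
    n_ret = len(forward_hours) / n_days  # n^ret: average forwards per day
    return [[(post_counts[t] / n_days) / n_pos if n_pos else 0.0,
             (fwd_counts[t] / n_days) / n_ret if n_ret else 0.0]
            for t in range(slices)]
```

For one day of data with posts at hours 1 and 7 and a forward at hour 2, half of the posting activity falls in each of the first two slices, while all forwarding activity falls in the first slice.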

Definition 5.

Structural-driven vector \( S(v_{m} ) = [d_{m,1} ,d_{m,2} ,d_{m,3} ] \).

The network provides the substrate for information propagation; thus, forward behavior strongly depends on network structure. Based on the influence-related network attributes, we define the structural-driven vector, where \( d_{m,1} ,\,d_{m,2} ,\,d_{m,3} \) are the in-degree, out-degree, and degree centrality of the node, respectively.
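As a hedged illustration of Definition 5, the three components of \( S(v_{m}) \) can be read off the adjacency matrix E; the normalization of degree centrality by \( N - 1 \) is the common convention and an assumption here, since the paper does not specify it.

```python
def structural_vector(E, m):
    """Return S(v_m) = [d_{m,1}, d_{m,2}, d_{m,3}] from adjacency matrix E.
    d_{m,1}: in-degree, d_{m,2}: out-degree,
    d_{m,3}: degree centrality, assumed here to be (in + out) / (N - 1)."""
    N = len(E)
    out_deg = sum(E[m])                        # links v_m -> others
    in_deg = sum(E[n][m] for n in range(N))    # links others -> v_m
    centrality = (in_deg + out_deg) / (N - 1) if N > 1 else 0.0
    return [in_deg, out_deg, centrality]

# Tiny 3-user network: row m gives the links e_{m,n} of user m.
E = [[0, 1, 1],
     [0, 0, 1],
     [1, 0, 0]]
structural_vector(E, 2)   # user 2 is followed by users 0 and 1
```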

3.2 Problem Formulation

To formally state our research problem, let \( G = (V,E) \) be the whole network and \( B = \left\{ {\left. {(b,v,t)} \right|v \in V} \right\} \) the behavior information of all users. First, the causes of forward behavior are mapped into the multidimensional vectors I, A, and S. If a user publishes a message in time slice t, our method predicts the fans' forward behavior Y. Specifically, the problem is formulated as follows:

$$ \left. {\begin{array}{*{20}c} {G,B \to I,A,S} \\ t \\ \end{array} } \right\} \Rightarrow f:(I,A,S,t) \to Y $$
(2)

4 Proposed Model

To solve the above problem, we propose a novel prediction model, GLDA, based on user behavior and relationships. The model framework consists of three modules: driving-mechanism quantification; user interest, activity, and influence modeling; and forward prediction, as shown in Fig. 1. In the first module, the relevant attributes are used to quantify the driving mechanisms, represented by multiple driven vectors. In the second module, the user topic distribution for each dimension is obtained with the improved LDA. In the third module, Gibbs sampling is used to obtain the probability distribution of forward behavior, with which the model performs forward prediction.

Fig. 1.

Model framework.

4.1 Model Details

Given the three driven vectors defined in Sect. 3.1, the problem becomes how to incorporate those vectors into multiple prediction features and behavior modeling. This module presents the modeling process, which includes interest-driven, habit-driven, and structural-driven simulation analysis. Corresponding to the different driving mechanisms, the relevant distributions of the prediction features can be obtained.

Interest-Driven Simulation Analysis.

User interest is reflected primarily in user behavior; we focus on the analysis of following behavior and interacting behavior. Taking advantage of the LDA topic model's strength in dealing with polysemy and synonymy, the traditional text modeling method is used to model user interest. Each user can be understood as a composition of followed users and interacted users, i.e., as the interest-driven vector. Given the parameter Z as the number of interest topics, the simulated interest-driven vector generative process is:

  1. For each interest topic z, draw \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\xi }_{z} \sim Dir\,(\lambda ) \);

  2. Given the mth user \( v_{m} \) in the whole network G, draw \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{m} \sim Dir(\alpha ) \);

  3. For the nth behavioral user of the mth user, \( e_{m,n} \):

     a. Draw an interest topic \( z = z_{m,n} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{m} ) \);

     b. Draw a behavioral user \( e_{m,n} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\xi }_{{z_{m,n} }} ) \);

    Here, \( Dir(.) \) and \( Mult(.) \) denote the Dirichlet and Multinomial distributions, respectively. The graphical model is shown in Fig. 2 and the symbols are described in Table 1.

    Fig. 2.

    Graphic model.

    Table 1. Description of symbols in graphic model.

The aim of user interest modeling is to compute the Multinomial distributions \( \Phi = \left[ {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{1} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{2} , \ldots ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{N} } \right] \) and \( \Sigma = \left[ {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\xi }_{1} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\xi }_{2} , \ldots ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\xi }_{Z} } \right] \). Owing to the coupling of \( \Phi \) and \( \Sigma \), we cannot compute them directly, so Gibbs sampling [23] is applied to obtain them indirectly. The principle of Gibbs sampling for extracting the topic \( z_{i} \) of behavioral user \( e_{i} \) is as follows:

$$ p\left( {\left. {z_{i} = z} \right|\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{z}_{\neg i} ,E} \right) \propto p\left( {\left. {z_{i} = z,e_{i} = e} \right|\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{z}_{\neg i} ,E} \right) = \widehat{\varphi }_{m,z} \times \widehat{\xi }_{z,e} = \frac{{n_{m,\neg i}^{(z)} + \alpha }}{{\sum\nolimits_{z = 1}^{Z} {n_{m,\neg i}^{(z)} + \alpha } }} \times \frac{{n_{z,\neg i}^{(e)} + \beta }}{{\sum\nolimits_{e = 1}^{N} {n_{z,\neg i}^{(e)} + \beta } }} $$
(3)

where \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{z}_{\neg i} \) represents the topics of all behavioral users except the current one; \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{e}_{\neg i} \) represents all behavioral users except the current one; \( n_{z,\neg i}^{(e)} \) is the number of times behavioral user e is assigned to interest topic z, excluding the current behavioral user; and \( n_{m,\neg i}^{(z)} \) is the number of times interest topic z is assigned to user \( v_{m} \), excluding the current behavioral user. When the sampling converges, \( \Phi \) and \( \Sigma \) can be obtained.
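A generic collapsed Gibbs sampling pass implementing the conditional of Eq. (3) might look as follows; this is an illustrative sketch, not the authors' code. Note that the denominator of the \( \widehat{\varphi } \) term is constant in z and can therefore be dropped from the sampling weights.

```python
import random

def gibbs_pass(docs, z_assign, n_mz, n_ze, n_z, Z, N, alpha, beta):
    """One sweep of collapsed Gibbs sampling over all behavioral users.

    docs[m]     -- interest document of user m (list of behavioral-user ids)
    z_assign[m] -- current topic assignment of each token in docs[m]
    n_mz[m][z]  -- tokens in document m assigned to topic z
    n_ze[z][e]  -- times behavioral user e is assigned to topic z
    n_z[z]      -- total tokens assigned to topic z
    """
    for m, doc in enumerate(docs):
        for i, e in enumerate(doc):
            z = z_assign[m][i]
            # remove the current assignment (the "neg i" counts in Eq. 3)
            n_mz[m][z] -= 1; n_ze[z][e] -= 1; n_z[z] -= 1
            # unnormalized full conditional p(z_i = z | z_neg_i, E)
            weights = [(n_mz[m][k] + alpha) * (n_ze[k][e] + beta)
                       / (n_z[k] + N * beta) for k in range(Z)]
            z = random.choices(range(Z), weights=weights)[0]
            z_assign[m][i] = z
            n_mz[m][z] += 1; n_ze[z][e] += 1; n_z[z] += 1
```

The sweep mutates the count matrices in place; after the chain converges, \( \Phi \) and \( \Sigma \) are read off the counts in the usual smoothed-frequency way.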

Habit-Driven Simulation Analysis.

User behavioral habits can be analyzed from historical behavior; here, we focus on post behavior and forward behavior. Considering the dynamics of behavioral habits, the past behavioral data are pre-discretized to obtain the habit-driven vector. In other words, a user at each time slice is regarded as a document and the activity-related attributes as words. By introducing activities as topics, we can mine potential user behavioral habits.

Unlike the values of \( e_{m,n} \), the activity-related attributes are continuous. Since the Multinomial distribution in standard LDA cannot model continuous attributes, it is replaced with a Gaussian distribution. In the improved model, the activity-related attributes obey the following Gaussian distribution:

$$ f(x_{m,t,h} ;\mu_{q,h} ,\sigma_{q,h} ) = \frac{1}{{\sqrt {2\pi } \sigma_{q,h} }}\exp \left[ { - (x_{m,t,h} - \mu_{q,h} )^{2} /2\sigma_{q,h}^{2} } \right] $$
(4)

where \( x_{m,t,h} \) is the hth attribute of user \( v_{m} \) at time slice t, and \( \mu_{q,h} ,\sigma_{q,h} \) are the parameters of the Gaussian distribution that \( x_{m,t,h} \) obeys. The simulated habit-driven vector generative process can be described as follows:

  1. Given the mth user \( v_{m} \) at any time slice t, draw \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\theta }_{m,t} \sim Dir(\beta ) \);

  2. For the hth activity related attribute of the mth user, \( x_{m,t,h} \):

     a. Draw an activity topic \( q = q_{m,t,h} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\theta }_{m,t} ) \);

     b. Draw an attribute value \( x_{m,t,h} \sim N(\mu_{{q_{m,t,h} ,h}} ,\sigma_{{q_{m,t,h} ,h}} ) \);

where \( N(.) \) denotes the Gaussian distribution. The purpose of user activity modeling is to learn the distribution set \( \Pi = [(\mu_{1,h} ,\sigma_{1,h} ),(\mu_{2,h} ,\sigma_{2,h} ), \ldots ,(\mu_{Q,h} ,\sigma_{Q,h} )]\;(h \in [1,H]) \) and \( \Theta = [\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\theta }_{1,t} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\theta }_{2,t} , \ldots ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\theta }_{N,t} ]\;(t \in T) \). Owing to the existence of hidden variables, the EM algorithm [24] is used to estimate the model parameters. The E step computes the responsibility of each topic for each attribute according to the current model parameters:

$$ \chi_{m,t,h,q} = P\left( {\left. q \right|v_{m} ,t,x_{m,t,h} } \right) = P\left( {\left. {x_{m,t,h} } \right|q} \right) \times P\left( {\left. q \right|v_{m} ,t} \right) = \frac{{f(x_{m,t,h} ;\mu_{q,h} ,\sigma_{q,h} )\theta_{m,t,q} }}{{\sum\nolimits_{{q^{'} = 1}}^{Q} {f(x_{m,t,h} ;\mu_{{q^{{\prime }} ,h}} ,\sigma_{{q^{{\prime }} ,h}} )\theta_{{m,t,q^{{\prime }} }} } }} $$
(5)

The M step updates the model parameters for the next iteration:

$$ \mu_{q,h} = \frac{{\sum\nolimits_{m = 1}^{N} {\sum\nolimits_{t = 1}^{T} {\chi_{m,t,h,q} * x_{m,t,h} } } }}{{\sum\nolimits_{m = 1}^{N} {\sum\nolimits_{t = 1}^{T} {\chi_{m,t,h,q} } } }},\;\sigma_{q,h} = \sqrt {\frac{{\sum\nolimits_{m = 1}^{N} {\sum\nolimits_{t = 1}^{T} {\chi_{m,t,h,q} (x_{m,t,h} - \mu_{q,h} )^{2} } } }}{{\sum\nolimits_{m = 1}^{N} {\sum\nolimits_{t = 1}^{T} {\chi_{m,t,h,q} } } }}} $$
(6)
$$ \theta_{m,t,q} = \frac{1}{H}\sum\nolimits_{h = 1}^{H} {\chi_{m,t,h,q} } $$
(7)

where \( \chi_{m,t,h,q} \) is the responsibility of activity topic q for the attribute \( x_{m,t,h} \), and \( \theta_{m,t,q} \) denotes the probability that user \( v_{m} \) is assigned to activity topic q in time slice t. Repeating the above two steps until convergence yields \( \Theta \) and \( \Pi \).
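The E and M steps of Eqs. (5)-(7) can be sketched for a single attribute (H = 1); the code below is a plain Gaussian-mixture EM illustration under that simplifying assumption, not the authors' implementation.

```python
import math

def _gauss(x, mu, sigma):
    """Gaussian density of Eq. (4)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def em_step(X, mu, sigma, theta):
    """One EM iteration for a single attribute (H = 1).

    X[m][t]         -- attribute value of user m in time slice t
    mu[q], sigma[q] -- Gaussian parameters of activity topic q
    theta[m][t][q]  -- probability of topic q for user m at slice t
    """
    Q = len(mu)
    # E step (Eq. 5): responsibility chi[m][t][q] of topic q for x_{m,t}
    chi = [[None] * len(row) for row in X]
    for m, row in enumerate(X):
        for t, x in enumerate(row):
            w = [theta[m][t][q] * _gauss(x, mu[q], sigma[q]) for q in range(Q)]
            s = sum(w)
            chi[m][t] = [v / s for v in w]
    # M step (Eq. 6): re-estimate the Gaussian parameters
    for q in range(Q):
        tot = sum(chi[m][t][q] for m in range(len(X)) for t in range(len(X[m])))
        mu[q] = sum(chi[m][t][q] * X[m][t]
                    for m in range(len(X)) for t in range(len(X[m]))) / tot
        var = sum(chi[m][t][q] * (X[m][t] - mu[q]) ** 2
                  for m in range(len(X)) for t in range(len(X[m]))) / tot
        sigma[q] = math.sqrt(var) or 1e-6   # guard against degenerate topics
    # Eq. (7) with H = 1: theta is just the responsibility itself
    for m in range(len(X)):
        for t in range(len(X[m])):
            theta[m][t] = chi[m][t][:]
    return mu, sigma, theta
```

The structural-driven EM of Eqs. (9)-(11) has the same shape, with the time index dropped and influence roles in place of activity topics.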

Structural-Driven Simulation Analysis.

The network structure carries many user attributes, such as in-degree and out-degree, which can be expressed as the structural-driven vector. Based on it, we can classify users into clusters, each of which can be regarded as an influence role that users play. Each influence role has a set of distribution parameters that the influence-related attributes conform to. As in the previous section, we use a Gaussian distribution: if user \( v_{m} \) plays influence role r, its kth attribute \( d_{m,k} \) conforms to:

$$ f\left( {d_{m,k} ;\mu_{r,k}^{{\prime }} ,\sigma_{r,k}^{{\prime }} } \right) = \frac{1}{{\sqrt {2\pi } \sigma_{r,k}^{{\prime }} }}\exp \left[ { - \left( {d_{m,k} - \mu_{r,k}^{{\prime }} } \right)^{2} /2\sigma_{r,k}^{{{\prime }2}} } \right] $$
(8)

where \( \mu_{r,k}^{{\prime }} \) and \( \sigma_{r,k}^{{\prime }} \) are the parameters of the Gaussian distribution that \( d_{m,k} \) obeys. The simulated structural-driven vector generative process is as follows:

  1. Given the mth user \( v_{m} \) in the whole network, draw \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\psi }_{m} \sim Dir(\varepsilon ) \);

  2. For the kth influence related attribute of the mth user, \( d_{m,k} \):

     a. Draw an influence topic \( r = r_{m,k} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\psi }_{m} ) \);

     b. Draw an attribute value \( d_{m,k} \sim N(\mu_{{r_{m,k} ,k}}^{{\prime }} ,\sigma_{{r_{m,k} ,k}}^{{\prime }} ) \);

Our goal is to learn the distributions \( \Pi ^{{\prime }} = [(\mu_{1,k}^{{\prime }} ,\sigma_{1,k}^{{\prime }} ),(\mu_{2,k}^{{\prime }} ,\sigma_{2,k}^{{\prime }} ), \ldots ,(\mu_{R,k}^{{\prime }} ,\sigma_{R,k}^{{\prime }} )]\;(k \in [1,K]) \) and \( \Psi = [\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\psi }_{1} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\psi }_{2} , \ldots ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\psi }_{N} ] \). The EM algorithm is again used to estimate the model parameters. The E step computes the responsibility as:

$$ \chi_{m,k,r}^{{\prime }} = P\left( {\left. r \right|v_{m} ,d_{m,k} } \right) = P\left( {\left. {d_{m,k} } \right|r} \right) \times P\left( {\left. r \right|v_{m} } \right) = \frac{{f(d_{m,k} ;\mu_{r,k}^{{\prime }} ,\sigma_{r,k}^{{\prime }} )\psi_{m,r} }}{{\sum\nolimits_{{r^{{\prime }} = 1}}^{R} {f(d_{m,k} ;\mu_{{r^{{\prime }} ,k}}^{{\prime }} ,\sigma_{{r^{{\prime }} ,k}}^{{\prime }} )\psi_{{m,r^{{\prime }} }} } }} $$
(9)

The M step updates the model parameters for the next iteration:

$$ \mu_{r,k}^{{\prime }} = \frac{{\sum\nolimits_{m = 1}^{N} {\chi_{m,k,r}^{{\prime }} d_{m,k} } }}{{\sum\nolimits_{m = 1}^{N} {\chi_{m,k,r}^{{\prime }} } }},\;\sigma_{r,k}^{{\prime }} = \sqrt {\frac{{\sum\nolimits_{m = 1}^{N} {\chi_{m,k,r}^{{\prime }} (d_{m,k} - \mu_{r,k}^{{\prime }} )^{2} } }}{{\sum\nolimits_{m = 1}^{N} {\chi_{m,k,r}^{{\prime }} } }}} $$
(10)
$$ \psi_{m,r} = \frac{1}{K}\sum\nolimits_{k = 1}^{K} {\chi_{m,k,r}^{{\prime }} } $$
(11)

where \( \chi_{m,k,r}^{{\prime }} \) is the responsibility of influence role r for the attribute \( d_{m,k} \), and \( \psi_{m,r} \) denotes the probability that user \( v_{m} \) is assigned to influence role r. Repeating the two steps until convergence yields \( \Psi \) and \( \Pi ^{{\prime }} \).

4.2 Comprehensive Forward Behavior Modeling and Prediction

Based on the previous modeling process, the relevant probability distributions of the prediction features are obtained. By combining these distributions, the forward behavior distribution can be computed to predict the forward action a user may take. Assume that \( Y = \left\{ {y_{1} ,y_{2} , \ldots ,y_{M} } \right\} \) is the set of behaviors to be predicted; its generative process can be described as follows:

  1. Draw \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\rho } \sim Dir(\gamma ) \);

  2. For the jth behavior \( y_{j} \):

     a. Draw an interest topic \( z_{m} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{m} ) \) for post user \( v_{m} \);

     b. Draw an interest topic \( z_{n} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{n} ) \) for fan \( v_{n} \);

     c. Draw an activity topic \( q = q_{n,t} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\theta }_{n,t} ) \) for fan \( v_{n} \);

     d. Draw an influence role \( r = r_{m} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\psi }_{m} ) \) for post user \( v_{m} \);

     e. Draw the behavior \( y_{j} \sim Mult(\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\rho }_{\tau ,q,r} ) \);

where \( \tau \) is an indicator of shared interest: if \( z_{m} = z_{n} \), then \( \tau = 1 \); otherwise \( \tau = 0 \). Behavior \( y_{j} \) has only two cases (\( y_{j} = 1 \) indicates a forward action, \( y_{j} = 0 \) indicates no forward action), so we can use a Bernoulli distribution \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\rho }_{\tau ,q,r} \) to represent the probability distribution of the multiple features over forward actions, with parameters \( \Omega = \left[ {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\rho }_{0,q,r} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\rho }_{1,q,r} } \right]\;(q \in [1,Q],r \in [1,R]) \). Using Gibbs sampling, the principle for extracting the features \( \tau ,q,r \) of behavior \( y_{j} \) is as follows:

$$ \begin{aligned} p\left( {\tau_{j} = \tau ,q_{j} = q,r_{j} = \left. r \right|\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\tau }_{\neg j} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{q}_{\neg j} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{r}_{\neg j} ,Y} \right) & \propto p\left( {\tau_{j} = \tau ,q_{j} = q,r_{j} = r,y_{j} = \left. y \right|\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\tau }_{\neg j} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{q}_{\neg j} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{r}_{\neg j} ,Y_{\neg j} } \right) \\ & = p\left( {\left. \tau \right|\Phi } \right)p\left( {\left. q \right|\Theta } \right)p\left( {\left. r \right|\Psi } \right)\widehat{\rho }_{\tau ,q,r,y} \\ & = \left( {\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{m} \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\varphi }_{n}^{T} } \right)\theta_{n,t}^{(q)} \psi_{m}^{(r)} \times \frac{{n_{\tau ,q,r,\neg j}^{(y)} + \gamma }}{{\sum\nolimits_{y = 0}^{1} {n_{\tau ,q,r,\neg j}^{(y)} + \gamma } }} \\ \end{aligned} $$
(12)

where \( \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{\tau }_{\neg j} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{q}_{\neg j} ,\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\rightharpoonup}$}}{r}_{\neg j} \) represent the prediction features of all behaviors except the current one; \( Y_{\neg j} \) represents the behaviors to be predicted except the current one; and \( n_{\tau ,q,r,\neg j}^{(y)} \) is the number of behaviors y assigned to the prediction features \( \tau ,q,r \), excluding the current behavior. When the sampling converges, \( \Omega \) can be obtained.

Given a message, we can predict forward behavior with the trained model. First, the features \( \tau ,q,r \) for the model input are obtained by probability sampling; then the fan's forward probability \( \rho_{\tau ,q,r,1} \) and non-forward probability \( \rho_{\tau ,q,r,0} \) are computed from the parameters \( \Omega \). If \( \rho_{\tau ,q,r,1} > \rho_{\tau ,q,r,0} \), we predict that the fan will forward the message (\( y = 1 \)); otherwise, it will not (\( y = 0 \)). Formally:

$$ y = \text{argmax}_{y} \,p\left( {\left. y \right|\tau ,q,r;\Omega } \right) $$
(13)
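The decision rule of Eq. (13) reduces to a lookup in the learned parameter \( \Omega \); below is a minimal sketch, with \( \Omega \) represented as an illustrative dictionary keyed by the sampled features (not the authors' data structure).

```python
def predict_forward(omega, tau, q, r):
    """Eq. (13): predict y = 1 (forward) iff rho_{tau,q,r,1} > rho_{tau,q,r,0}.

    omega maps the sampled feature triple (tau, q, r) to the Bernoulli
    parameters (rho_0, rho_1) learned by Gibbs sampling."""
    p_no, p_yes = omega[(tau, q, r)]
    return 1 if p_yes > p_no else 0

# Illustrative parameters: shared interest with an active fan and an
# influential poster favors forwarding; without shared interest it does not.
omega = {(1, 0, 2): (0.3, 0.7),
         (0, 0, 2): (0.8, 0.2)}
predict_forward(omega, 1, 0, 2)   # -> 1
predict_forward(omega, 0, 0, 2)   # -> 0
```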

5 Experiments and Analysis

5.1 Experimental Data and Evaluation Metrics

The experimental data used in this paper were collected from Sina Weibo, a popular social networking platform in China. In the data collection process, we randomly selected a user (user ID: 2312704093) as the starting point. Users and their micro-blogs were then crawled via breadth-first search, forming a sub-network of 49,556 users and 61,880 user relationships covering 2011/08/21-2012/02/22. The statistics of the dataset are shown in Table 2.

Table 2. Statistics of the dataset

In this paper, Accuracy, Precision, Recall, F1-Measure, and the ROC curve are used to evaluate the prediction results. The forward behavior of a fan is treated as the positive class ("1") and non-forward behavior as the negative class ("0"). The dataset is partitioned into a training set and a test set at a ratio of 8:2. Better predictions have higher Accuracy, Precision, Recall, and F1-Measure, and their ROC curves lie closer to the upper left corner.
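The four scalar metrics can be computed directly from the binary labels; the sketch below uses plain Python (a library such as scikit-learn provides equivalent functions, including ROC curves).

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels,
    with forward behavior as the positive class '1'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
# accuracy 0.6, precision 2/3, recall 2/3
```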

5.2 Prediction Performance Analysis

In this section, the performance of our model is evaluated from three viewpoints. First, we show the latent user interest distributions and analyze the overall interest distribution of the network. Then, we examine the impact of the number of interest topics and of the training-set proportion on forward prediction. Finally, we evaluate performance by comparing our model with baseline methods. Together, these three viewpoints verify the superiority of our model.

First, the latent user interest distributions are analyzed. We select several representative users and show their latent interests in Fig. 3; the latent interest distributions over the whole network are shown in Fig. 4. In both figures, the x-axis represents the interest ID and the y-axis the probability value. The highest interest focus values are given in parentheses in the legend.

Fig. 3.

User latent interest distributions.

Fig. 4.

Latent interest distributions in network.

As shown in Fig. 3, each user's interest distribution is different. With the number of latent interests \( Z = 25 \), user U1's interest is concentrated, with a clear preference for interest \( ID = 17 \); user U2's interests span a relatively wide range; and user U3's interests are spread evenly. From Fig. 4, we observe that the network-level interest distribution is fairly uniform, although individual user preferences differ. We next verify that latent interest has a driving effect on forward prediction.

Second, given their strong classification performance, LR and SVM are applied to the forward prediction problem and compared with our model; the effect of the training-set proportion on forward prediction is shown in Fig. 5. In addition, by removing dimensions of the driving mechanism from our model, three sub-models are obtained: Sub-IA, Sub-IS, and Sub-I. Comparing our model with these sub-models, the effect of the number of interest topics on each model is shown in Fig. 6.

Fig. 5.

Comparison of prediction effects between proposed model and classifiers.

Fig. 6.

Comparison of prediction effects between proposed model and sub-models.

As shown in Fig. 5, GLDA outperforms LR and SVM, and its performance is the least affected by the training-set proportion. Overall, as the proportion of the training set increases, the prediction of each method improves. From Fig. 6, our model also outperforms its sub-models, indicating that extracting the multidimensional driven vectors improves forward prediction. As the number of interest topics increases, Precision rises gradually while Recall drops rapidly; the change in F1-Measure shows that the model performs well when Z is 10-20. In addition, the numbers of activity topics and influence topics follow the literature [25, 26], which classifies activity levels as inactive, generally active, and very active, and divides users into three types: ordinary user, opinion leader, and structural hole spanner.

Finally, the performance of our model is evaluated against several baseline methods: the probabilistic graphical models LDA [27] and CRM [6], and the classical forward prediction methods CF [28] and VSM [29]. Their performance is shown in Table 3, and the ROC curves are compared in Fig. 7. The results show that our model achieves the best Accuracy, Recall, and F1-Measure among the compared methods. Moreover, its ROC curve is closest to the upper left corner, indicating the best overall performance. Therefore, our model can effectively improve forward prediction performance.

Table 3. Comparison between our model and baseline methods.
Fig. 7.

Comparison of different methods in ROC.

6 Conclusion

In this study, a novel forward prediction model, GLDA, is proposed; it effectively predicts forward behavior by analyzing user behavior and relationships. First, we mapped the causes of forward behavior into three driving mechanisms: interest-driven, habit-driven, and structure-driven. Second, traditional LDA was extended with Gaussian distributions and applied to user interest, activity, and influence modeling. Finally, the model was extended with a pre-discretizing method, allowing us to dynamically monitor user activity and mine hidden behavioral habits.

The experimental results showed that our model improves forward prediction performance compared with the baseline methods. By studying forward prediction in social networks, we can acquire a better understanding of the information propagation mechanism, and the model can provide support for public opinion management and control. In future work, it would be intriguing to integrate nonparametric methods into our model so that parameter values are chosen from the data itself. It would also be interesting to explore ways to reduce the time complexity of the training algorithm.