1 Introduction

The advent of the Internet and the emergence of Web 2.0 technologies have made Online Social Networks (OSNs) an important medium for the propagation of news and views. Online social media include blogs, microblogs like Twitter, video sharing platforms like YouTube, email networks, photo sharing sites like Flickr, news rating sites like Digg, friendship networks like Facebook and professional networks like LinkedIn. The spread of misinformation has always existed in society, but the volume, speed, multimedia support and ease of access offered by OSNs have made the spread of information, and equally the abuse of the medium to spread misinformation, possible with critical implications. Recent examples of such incidents include the false identification of the ‘Boston bomber’ in social media [3], the spread of hate messages and videos targeting a particular community in OSNs, resulting in a mass exodus of thousands of people in India in 2012 [19], the spread of fear of swine flu in Twitter [15], and the political astroturfing witnessed in the US elections of 2010 [18]. Concerted efforts by a cluster of users, whom we call ‘malicious collectives’, can cause information cascades in the network in which a large proportion of the population gets infected.

OSNs are also effective social computing systems, and their users can be leveraged to detect the spread of false information. The differences in the flow of credible and false information have been studied in many OSNs, including Twitter. In [14], the authors confirmed that rumours and unconfirmed news were questioned more than credible information. In this paper, we propose a psychometric analysis of users and their responses to different types of information using Latent Trait Theory (LTT) to differentiate between true and false information. Latent trait theory, also called item response theory, is used extensively in education to measure the ability of students as a latent trait based on their responses to different items or questions, as well as to estimate how well question papers measure that ability. The process involves matching the ability of students to the difficulty levels of questions. In this paper, we use the latent trait of ability of OSN users to determine the credibility of information in the form of news items and messages.

The use of propagation-based and content-analysis methods to detect the possible spread of misinformation has been brought out by a number of researchers. The differential spread of misinformation, rumours and credible information has also been documented. However, these empirical results lacked the theoretical support needed to serve as a formal method of analysis. Our unique contribution in this paper is the use of the established theoretical framework of latent trait theory to establish the social computing properties of OSN users and to use them for classifying the quality of information propagating in these networks. Towards this, we use latent trait theory to measure the ability of users to determine the credibility of information. Further, we use this ability to segregate possible misinformation in the contents being propagated. We believe that speedy detection is the essence of limiting the spread of misinformation, and towards that end, using social network users as the first layer of filtering appears to be the best possible solution.

The rest of the paper is organized as follows. In Sect. 2 we give the background and related work, and in Sect. 3 we highlight the important aspects of latent trait theoretical models and outline our methodology for using them to classify information in OSNs. We describe LTT models for dichotomous responses in Sect. 4. Trust-based LTT models using polytomous responses are explained in Sect. 5. Experimental results using seven different data sets from Twitter are given in Sect. 6, and we conclude and give future directions of work in Sect. 7.

2 Background and related work

There are many psychometric tools used to analyse the abilities of individuals. Latent trait theory, also known as item response theory, is used in psychometric analysis for the design, analysis and scoring of tests, questionnaires and similar instruments that measure abilities, attitudes or other variables. LTT refers to a set of mathematical models that describe, in probabilistic terms, the relationship between a person’s response to a test item and his or her level of the latent variable being measured by a scale [2, 6]. This latent variable is a trait that is postulated to exist but cannot be observed through a single variable/item; it is measured using multiple items in a multi-item test. In order to apply LTT to the study of information propagation in OSNs, we need to understand the concepts of different types of information, their credibility and the usage of social trust. In this section we describe the social, psychological and computer science aspects of information, misinformation and disinformation, as well as related work on the use of trust to determine the credibility of information.

2.1 Information, misinformation, disinformation

True information, misinformation and disinformation are different types of information which develop in social, cultural and historical milieux [11]. The deliberate spread of false information is disinformation, which differs from misinformation only in the intent of the source: disinformation is defined as ‘deliberate falsehood’, whereas misinformation is ‘accidental falsehood’ [22]. The spread of information is a result of the intents and beliefs of diffusers and receivers. The intent of diffusers may not be obvious; receivers use cues to credibility and cues to deception to form judgements about the truthfulness of information. Information literacy is being able to identify, locate, evaluate and effectively use information [7]. Diffusion and sharing of information, and cues to its credibility, are areas in which misinformation and disinformation can be regarded within information literacy. Information literacy is the ability of people to make such judgements and distinguish between misinformation and disinformation.

Misinformation and disinformation may spread easily because cues to deception and cues to credibility can shift in their meaning, relevance and context [11]. Information literacy is an important factor in a person’s ability to distinguish between information, misinformation and disinformation: it is the ability to understand cues to deception and cues to credibility when forming judgements about them, and it is not the same for all people.

The literature on misinformation and disinformation has focused on individual deceivers or pairs of conversants. However, the core elements of misinformation and disinformation are relationships and context, which are better understood by concentrating on teams or communities [11]. Information literacy is an important attribute of social network users that influences the spread of misinformation and disinformation in such networks. Disinformation may even be literally true but misleading relative to the situation, through implication or revelation; deception is one of its prime applications. A social constructionist view of information highlights context, meaning and informativeness rather than truth or falsity. We use the information literacy of users, estimating their ability to distinguish between different types of information, to segregate possible misinformation and disinformation. The reasons for believing a piece of information are complex and often socially mediated [12].

Studies in computer-mediated communication have revealed that repropagation is an effective measure for analysing the credibility of information and its sources. The observed response to received information, in the form of posts, tweets, shares etc., is considered a manifestation of the inherent ability to distinguish credible and accurate information, and hence of information literacy. Evaluating the credibility of information in communities, with special reference to the information literacy of their members, would be an ideal way to segregate different types of information. Latent trait theory provides the necessary theoretical framework to study information literacy as an ability to distinguish information contents using cues to credibility and cues to deception in the process of information diffusion and sharing in OSNs. We intend to exploit this latent trait to separate true information from false information, which could be disinformation or misinformation.

2.2 Trust in social networks

A survey of trust in the World Wide Web is given in [8], in which the authors discuss trust as a concept that helps users make decisions about contents, services or people under conditions of uncertainty. An excellent review of trust from the social science and computer science perspectives is given in [21]. As per the authors, trust provides an ideal platform for the development of trust communities. The quality of information, along with the quantum of interactions in these communities, is an indicator of the social capital of these networks. Estimating social capital and analysing trust communities is therefore important for analysing the quality of information propagation in OSNs.

The computation of trust by users in social networks can be considered a cognitive process influenced by their experiences and the past behaviour of peers. Computation of trust should therefore take into account developments in the cognitive, sociological and computer sciences; the importance of such an integrated approach has also been highlighted in [21]. An excellent review of the psychological aspects of decision making by an individual is given in [13]. Four factors, the 4C’s, are involved in a person’s decision making: the internal factors of Consistency and Coherency of message contents with the beliefs and values held by the user, and the external factors of Credibility of the source of information, as determined by past experiences and the behaviour of the source, and perceived social Consensus about the information, as determined by the user’s social interactions with others.

In [1], the authors proposed a behaviour-based trust model for OSNs, providing algorithmically quantifiable measures of trust based on communication patterns between users. As per the authors, repropagation of information from a source indicates a degree of trust in both the information and the source. We use the repropagation decisions of OSN users as a measure of their affirmation of the credibility of messages.

3 Latent trait theory

Social networks can be viewed as social computing platforms for assessing the quality of information propagating in them. Rumours and false information are questioned more by users than credible information. The inherent ability of social network users to identify non-credible information is a function of their information literacy. The ability of a user and the parameters of news items (henceforth called items) determine the probability that the user considers the items credible. The manifestation of this credibility judgement takes the form of repropagation of such items.

Latent trait theory provides an ideal framework for studying an underlying latent trait and the manifestation of the corresponding response. Often called Item Response Theory (IRT) and used extensively for educational purposes, the theory provides an ideal framework for estimating item parameters and the ability of users in social networks [2]. Unlike in education, where the intention is to measure the ability of students accurately, we use the ability of social network users to estimate the credibility of items. Estimation of item parameters and population parameters is an iterative process, and we use different LTT models to accurately model the acceptance of information in OSNs.

The basic assumptions we make in developing the latent trait model are that each individual possesses some amount of the underlying trait of ability in the context of information literacy, and that at each ability level an individual uses his or her ability to assess the credibility of information and decide whether to repropagate it. We also assume that the ability of individuals increases monotonically as we move from one end of the scale to the other. The probability of repropagating a typical item depends on the difficulty of assessing the credibility of the item and the ability levels of individuals.

Let the ability of OSN users to assess the credibility of items be \(\theta \). Let P(\(\theta \)) be the probability that an individual with ability \(\theta \) repropagates an item. As ability increases, the probability of correctly assessing the credibility of items also increases. Let ‘b’ be the difficulty of assessing the credibility of an item. We assume that as the difficulty parameters of items increase, greater user ability is required to correctly assess their credibility. If \(\theta _1\) and \(\theta _2\) are two ability levels such that \(\theta _1 < \theta _2\), then users with ability \(\theta _2\) have a higher probability of correctly assessing the credibility of an item than users with ability \(\theta _1\). The relationship between the probability of a correct response to an item and the ability of users is described by an item characteristic curve; each item has its own characteristic curve.

Each item has certain parameters. Let ‘a’ be the discrimination parameter, which determines how well an item can differentiate between individuals whose ability lies above and below the item’s location on the ability/difficulty scale. The steeper the curve, the larger the discrimination parameter and the better the item discriminates. As the slope decreases, discrimination reduces, until the probability of a correct response is the same at higher and lower ability levels. Let ‘c’ be the guessing parameter, which accounts for the probability that a user could not evaluate the credibility of an item but repropagated it nevertheless. This is similar to examinees guessing the answer to multiple choice questions.

The difficulty of items and the abilities of users are plotted together on the same X-axis scale, called conjoint scaling; the probability of endorsing the item is plotted on the Y-axis. We assume a binary response of 1 for repropagating and 0 for not repropagating an item; such responses are called dichotomous responses in LTT. We assume a logit curve, rather than a linear one, to model the probability of endorsing an item, as shown in Fig. 1a. A sample plot of item characteristic curves (ICC) explaining the different item parameters is given in Fig. 1b.

Fig. 1 a Non-linear regression of probability of endorsing on the latent trait, ability; b item characteristic curves for two items with different item parameters

3.1 Item response matrix

Using the LTT model to evaluate information propagation in OSNs requires segregating items in terms of messages and the users who propagate them. Let G = (V, E) be a social network graph, where V is the set of user nodes and E is the set of edges between them. Let M be the set of messages propagating in G which are to be analysed. Trust is context specific, and to measure behavioural trust between users in terms of the credibility of messages propagating in the graph, we need to group messages by context. We assume that messages can be separated based on keywords provided by subject matter experts.

Analysing the messages in M using LTT requires us to separate out messages which have been repropagated at least once. In an item response matrix with items as columns and users as rows, we have to remove all rows that are all 1’s or all 0’s, as these would imply ability levels of positive and negative infinity respectively. Similarly, columns whose entries are all 1’s or all 0’s are removed, as they would indicate difficulty levels of positive or negative infinity. These are standard practices in computer-based LTT modeling. In real terms, we are removing items or users that would not contribute to the analysis of either the items or the users.

The propagation of items in the network can be visualised using a repropagation graph. Its construction is best understood as a bipartite graph consisting of user nodes and message nodes, as shown in Fig. 2. For each message, a directed edge is created from the message to its source. For each user repropagating a message, a directed edge is created from the user to the message.

Fig. 2 Repropagation graph as a bipartite graph of messages and users
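The bipartite construction above can be sketched in a few lines of Python. The function name `build_repropagation_graph` and the input encoding (a message-to-source map plus a list of (user, message) repropagation pairs) are our own assumptions for illustration, not the paper's exact representation:

```python
from collections import defaultdict

def build_repropagation_graph(messages, repropagations):
    """Build a directed bipartite graph of user and message nodes.

    messages:       dict mapping message id -> source user id
    repropagations: iterable of (user, message) pairs, one per repropagation
    Returns an adjacency map of sets: message -> {source}, user -> {messages}.
    """
    graph = defaultdict(set)
    # Directed edge from each message to its source user.
    for msg, source in messages.items():
        graph[msg].add(source)
    # Directed edge from each repropagating user to the message.
    for user, msg in repropagations:
        graph[user].add(msg)
    return dict(graph)
```

With messages `{'m1': 'A'}` and repropagations `[('B', 'm1')]`, the graph contains the edge m1 → A (authorship) and B → m1 (endorsement), mirroring Fig. 2.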

The accuracy of the model depends on the accuracy of the data in terms of users’ responses to the items they received. While repropagation can be regarded as confirmation of credibility, a user not repropagating an item could result either from uncertainty about the item’s credibility or from never having been exposed to it at all. To tackle the second scenario and remove users who probably never received the item, we apply community detection algorithms to the repropagation graph. Community detection algorithms based on modularity can isolate users in the social network graph who have more interaction edges within their community than outside it.

Detection of communities  We use community detection algorithms to group together users propagating similar items. The nodes in a community are connected, and the property of modularity ensures that nodes have greater similarity with other nodes inside the community than outside it. The probability that nodes inside a community have received all items propagating in the community is very high. Hence, the absence of an edge between a user node and a message node in the repropagation graph indicates the user’s lack of acceptance of the message. We used the community detection algorithm proposed by Blondel et al. [4].

Construction of item response matrix An item response matrix is constructed for each community in the repropagation graph. Since the cardinality of communities varies, we consider only communities of sufficiently large size. The item response matrix has items as columns and users as rows. A repropagation of an item by a user corresponds to an edge in the repropagation graph and is indicated by 1 in the matrix; the absence of an edge is indicated by 0. For each community, the messages of that community form the columns of the item response matrix, and all users who repropagated at least one of these messages form the rows. This item response matrix is then used to determine the appropriate LTT model. The outline of the proposed algorithm for constructing the item response matrix is given in Algorithm 3.1.

Algorithm 3.1 Construction of the item response matrix (rendered as a figure in the original)
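A rough sketch of the matrix construction and pruning described above (the function signature and input encoding are our assumptions, not the authors' exact Algorithm 3.1) might look like:

```python
def item_response_matrix(community_users, community_messages, repropagated):
    """Construct a pruned dichotomous item response matrix for one community.

    community_users:    list of user ids in the community
    community_messages: list of message (item) ids in the community
    repropagated:       set of (user, message) pairs from the repropagation graph
    Rows are users, columns are items; entry 1 means the user repropagated
    (endorsed) the item, 0 otherwise.
    """
    # Keep only users who repropagated at least one community message.
    users = [u for u in community_users
             if any((u, m) in repropagated for m in community_messages)]
    matrix = [[1 if (u, m) in repropagated else 0
               for m in community_messages] for u in users]

    # Drop all-0/all-1 rows (ability of negative/positive infinity) ...
    matrix = [row for row in matrix if 0 < sum(row) < len(row)]
    # ... and all-0/all-1 columns (unbounded difficulty). A single pass is
    # shown; pruning may need to be repeated until the matrix is stable.
    keep = [j for j in range(len(community_messages))
            if 0 < sum(row[j] for row in matrix) < len(matrix)]
    return [[row[j] for j in keep] for row in matrix]
```

A user who repropagated every community message contributes no information about relative difficulty, so their row is removed, exactly as the text prescribes.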

4 LTT models based on dichotomous responses

Repropagation of items by a user is considered a confirmation of the credibility of the items as well as of their sources. In [1], the authors established this fact and proposed metrics for quantifying behavioural trust based on the proportion of a source’s messages that are repropagated. In this section we use repropagation as a measure of acceptance of the credibility of news items in the form of binary responses, which yields a dichotomous item response matrix. In subsequent sections, we calculate the trust value of a source based on the metrics provided in [1]. This value is polytomous in nature, and there we use IRT models based on polytomous trust values to estimate the credibility of sources of information in OSNs.

The responses of users to various messages are functions of their latent trait of ability. As per LTT, four models can describe them: the Rasch model, the 1-parameter logistic model (1PL), the 2-parameter logistic model (2PL) and the 3-parameter logistic model (3PL) [2]. They differ in how item parameters are treated. Our aim is to identify the model that best fits the observed response data. This enables us to study the item parameters as a function of user ability and, more importantly, to estimate the quality of news items from the observed responses. This is akin to using IRT to study the quality of question papers based on the ability of students and their responses to the questions. Estimating the item parameters throws light on the quality of information diffusion in OSNs and can segregate likely misinformation and disinformation contents.

Table 1 Item parameters for all latent trait models

To model the responses of endorsing an item as a function of a user’s ability, we use the basic Rasch model as the initial null hypothesis. In the Rasch model, only the ability of the user and the difficulty of estimating the credibility of items are considered: the discrimination parameter a is fixed at 1 and the guessing parameter c at 0. The other models are the alternative hypotheses which we compare against the null hypothesis. A brief description of the different models is given below, and all models are summarised in Table 1.

  (a) Rasch model In this model, it is assumed that all items relate to the latent trait equally and only the difficulty varies between items; the discrimination parameter is fixed at 1. Let X be a random variable taking the value 1 or 0, as we deal with dichotomous responses. If X = 1, the user endorses the item for its credibility and repropagates it; if X = 0, the user does not endorse the credibility of the item, for whatever reason. P(X = 1) is the conditional probability of X taking the value 1, given the parameters \(\theta \) and b. The item response function (IRF), P(X), is given by Eq. 1.

    $$\begin{aligned} P(X=1|\theta ,b)=\frac{e^{(\theta -b)}}{1+e^{(\theta -b)}} \end{aligned}$$
    (1)
  (b) 1PL model  The 1-parameter logistic model is similar to the Rasch model except that the discrimination a is taken to be a constant k when estimating P(X); this value is the same for all items. The IRF is given by Eq. 2.

    $$\begin{aligned} P(X=1|\theta ,b)=\frac{e^{k(\theta -b)}}{1+e^{k(\theta -b)}} \end{aligned}$$
    (2)
  (c) 2PL model  In the 2-parameter logistic model, the IRFs vary in both their discrimination a and difficulty b parameters; the guessing parameter is set to 0. Equation 3 gives the IRF for the 2PL model.

    $$\begin{aligned} P(X=1|\theta ,a,b)=\frac{e^{a(\theta -b)}}{1+e^{a(\theta -b)}} \end{aligned}$$
    (3)
  (d) 3PL model  In the 3-parameter logistic model, the IRFs include a guessing parameter c, which denotes a non-zero probability of endorsing an item at low levels of the latent trait. The IRF for the 3PL model is given by Eq. 4.

    $$\begin{aligned} P(X=1|\theta ,a,b,c)= c + (1-c) \frac{e^{a(\theta -b)}}{1+e^{a(\theta -b)}} \end{aligned}$$
    (4)
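Since the four nested models differ only in which of a and c are fixed, a single function covers Eqs. 1–4. A minimal sketch (the function name `irf` is ours):

```python
import math

def irf(theta, b, a=1.0, c=0.0):
    """Item response function P(X=1 | theta) for the nested LTT models.

    Rasch: a=1, c=0;   1PL: a=k (same constant for all items), c=0;
    2PL:   a varies per item, c=0;   3PL: a and c both vary per item.
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (1.0 - c) * logistic
```

For example, `irf(0.0, 0.0)` is 0.5: a user whose ability equals the item's difficulty endorses it with even odds. A non-zero c raises the lower asymptote from 0 to c, capturing users who repropagate without evaluating credibility.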

4.1 Evaluation of models

Given a data set and a proposed mathematical model describing the distribution of parameters in it, we use the maximum likelihood approach to find the model that best explains the data, i.e. yields the largest likelihood. The technique of maximum likelihood is used to estimate the parameters of a model and to test hypotheses about those parameters. The latent trait, the ability \(\theta \) of users, was determined from the responses to items with parameters a, b and c in an iterative manner. We now give the methodology adopted to identify, for a set of items, the model which best fits the observed responses. The four models are nested, with the Rasch model having the most constrained parameters. We carried out a likelihood ratio test (LRT) for each pair of models, starting with the Rasch model as the null hypothesis. If L is the likelihood function and \(\hat{\varOmega _0}\) and \(\hat{\varOmega }\) are the likelihood estimators of \(\varOmega _0\) and \(\varOmega \), the likelihood ratio is given by Eq. 5. The value of L(\(\theta _i\)) is given by the item response function of each model given earlier, and the combined probability distribution is given by Eq. 6. Normally, log likelihoods are used. Let \(L(x)_f\) and \(L(x)_r\) be the log likelihoods of the ‘full’ (least constrained) and ‘reduced’ (more constrained) models, and let \(n_f\) and \(n_r\) be the numbers of parameters in the full and reduced models. The log likelihood ratio statistic is then given by Eq. 7 and is distributed as a \(\chi ^2\) statistic with (\(n_f - n_r\)) degrees of freedom.

Here, we test the null hypothesis \( H_0 :\varTheta \in \varOmega _0\) versus \(H_a:\varTheta \in \varOmega _a\) using \(\lambda \) as the test statistic. The value of \(\lambda \), where \(0 \le \lambda \le 1\) and \(\lambda \le k\), is chosen such that the Type-I error \(\alpha \) is very low, as determined by the p-values. The p-value of \(\chi ^2\) is the probability of randomly selecting, from the estimated \(\chi ^2\) distribution with (\(n_f - n_r\)) degrees of freedom, a value greater than the \(\chi ^2\) observed in the data; that is, we estimate the p-value as p(\(\chi ^2>\chi _{obs}^2\)). The p-value quantifies the strength of evidence against the null hypothesis in favour of the alternative: the smaller the p-value, the stronger the evidence. We use p-values \(<\) 0.01 to decide whether to reject the null hypothesis. We selected the model amongst the four which showed the best fit to the observed data. The best-fitting model can vary between communities in the graph, and it has a bearing on the analysis of the type of information propagating in the communities.

$$\begin{aligned} \lambda= & {} \frac{L(\hat{\varOmega _0})}{L(\hat{\varOmega })}= \frac{max_{\varTheta \in \varOmega _0}L(\varTheta )}{max_{\varTheta \in \varOmega }L(\varTheta )} \end{aligned}$$
(5)
$$\begin{aligned} L(\varTheta )= & {} \prod \limits _{i=1}^N{L(\theta _i)} \end{aligned}$$
(6)
$$\begin{aligned} L= & {} 2(L(x)_f -L(x)_r) \end{aligned}$$
(7)
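Equations 5–7 amount to the standard nested-model comparison. A sketch, assuming the per-model log likelihoods have already been computed (the critical values are standard \(\chi ^2\) table entries at \(\alpha = 0.01\); function names are ours):

```python
# Chi-square critical values at alpha = 0.01 for small df (standard tables).
CHI2_CRIT_01 = {1: 6.635, 2: 9.210, 3: 11.345}

def lrt_statistic(loglik_reduced, loglik_full):
    """Eq. 7: L = 2(L(x)_f - L(x)_r) for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)

def prefer_full_model(loglik_reduced, loglik_full, n_r, n_f):
    """Reject the reduced (more constrained) model at the 0.01 level when
    the statistic exceeds the chi-square critical value with
    (n_f - n_r) degrees of freedom."""
    return lrt_statistic(loglik_reduced, loglik_full) > CHI2_CRIT_01[n_f - n_r]
```

For example, a statistic of 40 with one degree of freedom comfortably exceeds 6.635, so the less constrained model would be preferred; a statistic of 2 would not justify the extra parameter.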

4.2 Analysis of data using selected model

Once the model was selected, we plotted the item characteristic curves (ICC), item information curves (IIC) and test information curves (TIC) for each of the item response matrices in the data set to study the types of information propagating in communities.

Item characteristic curve  The ICC defined in LTT is the cumulative form of the logistic functions defined in Eqs. 1, 2, 3 and 4. The curve is generally S-shaped, with a slope that changes as a function of ability level and reaches its maximum when the ability level equals the difficulty of the item [2]. From these curves we learn the ability levels of users who repropagated news items, the difficulty and discrimination of the news items, and the guessing parameters, if any, in their propagation. While positive discrimination of a news item augurs well for its quality, it is the items with negative discrimination that need to be separated out. Positive discrimination indicates that the probability of endorsing an item increases with the ability level of the users; negative discrimination indicates the reverse, i.e. the probability of endorsing an item decreases as the ability level of users increases. The occurrence of negative discrimination is unusual in LTT, and two reasons are cited in [2] for its occurrence.

  • Incorrect response An incorrect response to an item always has a negative discrimination index if the correct response has a positive one. This means that errors in modeling the correct (endorsing) response could result in a negative discrimination index.

  • Misinforming item There is something wrong with the item, possibly misinformation content, due to which higher ability users did not endorse it. Negative discrimination is a warning that the item needs further attention.

Based on the above explanation, along with the results obtained on multiple data sets, the discrimination parameter can be used to classify different types of information as given in Table 2. The possibility that the incorrect response was modeled can be verified by reversing the values in the item response matrix and observing the ICC. The nature of ICCs with positive and negative discrimination is shown in Fig. 3 [2].

Table 2 Interpretation of discrimination parameter, a
Fig. 3 Item characteristic curves showing positive and negative discrimination parameters

Table 3 Labels for Item discrimination parameter values

While the presence of a small negative discrimination index may not be alarming, a large number of items with large negative discrimination indices certainly needs attention. We retain items which display a negative discrimination parameter even after the response values are reversed. A set of labels for item discrimination parameter values was proposed in [2]; we extend it to include negative discrimination, so as to cover the types of items encountered in different real world data sets. The proposed numerical values for \(threshold_{low}\) and \(threshold_{high}\) are given in Table 3. Such occurrences in OSNs may not be surprising: while examination questions are set by qualified teachers, the authors of messages in OSNs are ordinary users, and the ubiquitous availability of social media enables them to express their thoughts without editorial filtering of any kind.
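Since Table 3's numeric cut-offs are not reproduced in this excerpt, the sketch below uses placeholder thresholds (0.35 and 1.35 are hypothetical values, not the paper's); it only illustrates how \(threshold_{low}\) and \(threshold_{high}\), extended to negative values, could drive the labelling:

```python
def label_discrimination(a, threshold_low=0.35, threshold_high=1.35):
    """Label an item by its discrimination parameter a.

    The numeric defaults are hypothetical placeholders; the actual
    cut-offs are the ones proposed in Table 3.
    """
    if a <= -threshold_high:
        return "strongly negative: possible misinforming item"
    if a < 0:
        return "negative: needs further attention"
    if a < threshold_low:
        return "low"
    if a <= threshold_high:
        return "moderate"
    return "high"
```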

Item information function The item information function (IIF) indicates how much an item contributes to estimating a user’s ability. The amount of information depends on how closely the difficulty of the item matches the ability of the user. The IIFs for the 1PL, 2PL and 3PL models are given in Table 4. For an item i, the probability of endorsing it at ability \(\theta \) is \(P_i(\theta )\), and \(Q_i(\theta )\) is defined as \(1-P_i(\theta )\).

Table 4 Information functions for all latent trait models

Information Information in statistics and psychometrics is defined as the reciprocal of the precision with which a parameter can be estimated [2]. If the variance of the estimate of a user’s ability is denoted by \(\sigma ^2\), the amount of information I is given by Eq. 8. Plotting the amount of information at each ability level against ability yields a continuous curve. The larger the information, the more accurately the ability of the user can be determined.

$$\begin{aligned} I = \frac{1}{\sigma ^2} \end{aligned}$$
(8)

Test information function The test information function (TIF) gives the complete distribution of information across all ability levels for all items. The TIF is the sum of the item information functions (IIFs) and is defined by Eq. 9 [17]. For both the 2PL and 3PL models, the IIF and TIF are directly related to the square of the discrimination parameter. The plot of the TIF gives a good idea of the variation of the discrimination parameters, which is important for making a decision about the quality of information spread in the network. The curve indicates whether information is spread evenly across all abilities or peaks at specific abilities, which can then be marked for further analysis.

$$\begin{aligned} I_j(\theta _j) = \varSigma _i{I_{ij}(\theta _j,b_i)} \end{aligned}$$
(9)
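The information functions above can be illustrated with a short Python sketch. It uses the standard closed forms for the 2PL and 3PL models (for the 3PL, \(I(\theta ) = a^2 (Q/P)((P-c)/(1-c))^2\), which reduces to \(a^2PQ\) when c = 0); the function names are our own, not taken from the paper.

```python
import math

def p_3pl(theta, a, b, c=0.0):
    """Probability of endorsement under the 3PL model.
    With c = 0 this reduces to 2PL; with a = 1 and c = 0, to 1PL."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def iif_3pl(theta, a, b, c=0.0):
    """Item information function. For 2PL (c = 0) this is a^2 * P * Q;
    the general 3PL form is a^2 * (Q / P) * ((P - c) / (1 - c))^2."""
    p = p_3pl(theta, a, b, c)
    q = 1.0 - p
    return a * a * (q / p) * ((p - c) / (1.0 - c)) ** 2

def tif(theta, items):
    """Test information function: the sum of item informations (Eq. 9).
    `items` is a list of (a, b, c) parameter triples."""
    return sum(iif_3pl(theta, a, b, c) for (a, b, c) in items)
```

Because the information is proportional to \(a^2\), items with strongly negative discrimination produce the same spikes in the plotted curves as strongly positive ones, which is why the ICC must be consulted to resolve the sign.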

Invariance of item parameters As per LTT, the parameters of the ICC, which include the discrimination parameter a, difficulty parameter b, and guessing parameter c, are invariant across different groups of users. The determination of the item parameters is independent of the distribution of ability levels of the users in the data set. This is called group invariance [2]. This means that the ICCs obtained do not depend on the ability levels of the users and would remain the same, subject to sampling errors. The item parameters are properties of the item and not of the users responding to the item.

5 LTT models based on polytomous responses

In this section, we explore modeling the latent trait of users' trust in sources of information in OSNs, as indicated by repropagation of their messages. We measure the behavioural trust of users in sources of information. The trust values are polytomous and could take a range of values from no trust to complete trust, as described in [9].

Quantifiable measures of behavioural trust in OSNs have been described in [1]. Repropagation of information is an indication of the credibility of the information and of trust in its source for the context specified by the information. Consider two users A and B (refer to Fig. 2). Let \(m_{AB}\) be the number of messages sent by A to B, \(prop_{AB}\) be the number of messages of A repropagated by B, and \(prop_{B}\) be the total number of messages repropagated by B from all sources. The direct trust from B to A could be calculated by one of the methods given in Eqs. 10 and 11.

$$\begin{aligned} T(B,A)= & {} \frac{prop_{AB}}{prop_{B}} \end{aligned}$$
(10)
$$\begin{aligned} T(B,A)= & {} \frac{prop_{AB}}{m_{AB}} \end{aligned}$$
(11)
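The two trust measures amount to simple ratios over the repropagation counts defined above. A minimal sketch (function names are our own, and we guard against division by zero, a case the equations leave undefined):

```python
def trust_share(prop_AB, prop_B):
    """Eq. 10: trust of B in A as the share of A's messages among
    all messages repropagated by B. Returns 0 when B repropagated nothing."""
    return prop_AB / prop_B if prop_B else 0.0

def trust_rate(prop_AB, m_AB):
    """Eq. 11: trust of B in A as the fraction of A's messages to B
    that B chose to repropagate. Returns 0 when A sent B nothing."""
    return prop_AB / m_AB if m_AB else 0.0
```

Note the difference in normalisation: Eq. 10 measures A's importance relative to all of B's repropagation activity, while Eq. 11 measures how receptive B is to A specifically.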

5.1 Item response matrix

The item response matrix is constructed from the repropagation graph given in Fig. 2. We consider sources of information as items, depicted in the columns of the matrix, and all users as rows. For the trust between two users, we use Eq. 10, which gives the trust value of user B for source A based on the proportion of messages of A among all messages repropagated by user B. The algorithm for construction of the item response matrix for polytomous trust values is given in Algorithm 5.1.

figure b
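Since Algorithm 5.1 is reproduced only as a figure, a hedged Python sketch of the construction described above may help; the edge representation and function name are our own assumptions, not the paper's notation.

```python
from collections import defaultdict

def build_item_response_matrix(reprop_edges):
    """Sketch of the Algorithm 5.1 idea: build a polytomous item response
    matrix from a repropagation graph. `reprop_edges` maps (user, source)
    -> count of the source's messages repropagated by that user. Sources
    become items (columns); users become rows; each cell holds the Eq. 10
    trust value, i.e. the source's share of the user's repropagations."""
    totals = defaultdict(int)                 # prop_B for each user B
    for (user, _src), cnt in reprop_edges.items():
        totals[user] += cnt
    users = sorted({u for (u, _s) in reprop_edges})
    sources = sorted({s for (_u, s) in reprop_edges})
    matrix = [[reprop_edges.get((u, s), 0) / totals[u] if totals[u] else 0.0
               for s in sources] for u in users]
    return users, sources, matrix
```

In practice the continuous ratios would be binned into a small number of ordered categories before fitting GRM or GPCM, since both models expect a finite set of response levels.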

5.2 LTT models based on polytomous matrix

The models for polytomous responses differ from the models for dichotomous responses, as the number of response categories is more than two. There are two main models considered for modeling trust between users and sources: the Graded Response Model and the Generalized Partial Credit Model. A brief description of each model is given below.

  (a)

    Graded response model (GRM) GRM is used when the outcome categories of responses are ordered [20]. The trust scores determined by Eqs. 10 and 11 enable ordering of responses; hence we use GRM as the first choice of model for polytomous item response matrices. GRM models the probability of responding in a given category or higher, so each category threshold submodel behaves like a 2PL model. Equation 12 gives \(P_{ik}(\theta )\), the probability of responding in item category k (k = 0, 1, 2, ..., m) of item i at trait level \(\theta \).

    $$\begin{aligned} P_{ik}(\theta )=\frac{e^{a(\theta -b_{ik})}}{1+e^{a(\theta -b_{ik})}} - \frac{e^{a(\theta -b_{ik+1})}}{1+e^{a(\theta -b_{ik+1})}} \end{aligned}$$
    (12)

    The equation is summarised as \(P_{ik} = P_{ik}^{*} - P_{ik+1}^{*}\), where \(P_{ik}^{*}\) represents the category threshold function for category k of item i.

  (b)

    Generalised partial credit model (GPCM) GPCM is used for both ordered and unordered polytomous responses [16]. We used it only when the polytomous item responses did not fit the GRM after repeated trials. GPCM models the probability of adjacent response categories. Equation 13 gives \(P_{ik}(\theta )\), the probability of responding in item category k (k = 0, 1, 2,..., m) of item i, where \(a_i\) is the item discrimination parameter, \(b_i\) is the item difficulty parameter and \(d_j\) is the category threshold parameter giving the offset of category j from the item location.

    $$\begin{aligned} P_{ik}(\theta )=\frac{e^{\sum \nolimits _{j=0}^k {a_i(\theta -b_i + d_j)}}}{\sum \nolimits _{r=0}^{m} {e^{\sum \nolimits _{j=0}^r {a_i(\theta -b_i + d_j)}}}} \end{aligned}$$
    (13)
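The GRM category probabilities of Eq. 12 can be sketched as follows. The function name and list-based interface are our own; the boundary conventions \(P^{*}_{i0} = 1\) and \(P^{*}_{i,m+1} = 0\) are the standard ones implied by the difference \(P_{ik} = P_{ik}^{*} - P_{ik+1}^{*}\).

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Graded Response Model (Eq. 12): probabilities of each response
    category at trait level theta. `thresholds` is the ordered list of
    category difficulties b_ik; with m thresholds there are m + 1
    categories. P*_k is the (2PL-like) probability of responding in
    category k or higher; category probabilities are differences of
    adjacent P* values and therefore always sum to 1."""
    def p_star(b):
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))
    stars = [1.0] + [p_star(b) for b in thresholds] + [0.0]
    return [stars[k] - stars[k + 1] for k in range(len(thresholds) + 1)]
```

The ordering requirement on the thresholds is what guarantees every category probability is non-negative; if estimated thresholds come out unordered, GRM is a poor fit and GPCM is the natural fallback, as in Sect. 5.3.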

5.3 Evaluation of models

We evaluated the different models using a methodology similar to that adopted for the dichotomous models, with the likelihood ratio test and p-values \(<\)0.01. We used the constrained and unconstrained versions of GRM to evaluate the item response matrices. The constrained model is similar to the Rasch model, where the discrimination parameters were kept the same; in the unconstrained model, they were allowed to vary between items. GPCM was used to evaluate only those matrices which did not converge to a stable solution with GRM.

5.4 Analysis of data using selected model

After selecting the correct model for each of the response matrices, we plotted item response category characteristic curves (ICC), item information curves (IIC) and test information curves (TIC) for each of the item response matrices in the data sets, to study trust in the sources of information in the communities.

Item information function The item information functions (IIF) for GRM and GPCM are given in Table 5. The meaning of the function is the same as for dichotomous responses. The presence of \(a^2\), the square of the discrimination parameter, in the equations is likewise similar, and hence plotting the function helps in evaluating the quality of information in the communities.

Table 5 Information functions for all polytomous latent trait models

Test information function The test information function (TIF) is again given by the sum of the item information functions (IIF). The TIFs for GRM and GPCM are given in Eq. 14.

$$\begin{aligned} I_j(\theta _j) = \varSigma _i{I_{ij}(\theta _j,b_i)} \end{aligned}$$
(14)

6 Experiment results

We used data obtained from the popular OSN ‘Twitter’ to validate our proposal. Twitter is a micro-blogging site which enables users to post messages called ‘tweets’, which then become available to their followers. Twitter offers a unique ‘retweet’ mechanism for users to repropagate tweets. We used the Twitter API to collect the tweets. The spreadsheet tool TAGS v5, used for collection of tweets through the Search API, was provided by Martin Hawksey [10]. The data sets were obtained for events related to happenings in India and the world during the period from Jul to Oct 2013. We collected tweets during this period to cover different types of events which were discussed by users of Twitter.

6.1 Data sets

Credibility and trustworthiness are context specific. We obtained data from Twitter for the different contexts defined in Table 6. These contexts enabled us to group together tweets pertaining to the same topic. Twitter also provides ‘hashtags’ which thread together tweets related to the same topic. The data sets are described below. The tweets were grouped together using results obtained with specific keywords.

  • Higgs  The data set of anonymized tweets pertains to messages in Twitter before, during and after the announcement of the discovery of a new particle with the features of Higgs boson on 4th July 2012. The messages posted in Twitter about this discovery pertain to the period between 1 July and 7 July 2012 [5]. The data set is publicly available at http://snap.stanford.edu/data/higgs-twitter.html.

  • Egypt  We investigated the spread of news related to the political unrest and massive protests in Egypt during the period from 13 Aug 2013 to 23 Sep 2013. The tweets were collected using the keyword ‘egypt’.

  • Syria  We tracked the events of use of chemical agents in Syria and all news related to it using the keyword ‘syria’. The tweets were collected over the period between 25 Aug 2013 and 21 Sep 2013.

  • Bodhgaya  The spread of information about terrorist attacks on 7 Jul 2013 at ‘Bodhgaya’ temple in India was tracked for a period of nineteen days from 07 Jul 2013 to 25 Jul 2013. The tweets were collected using the keyword ‘bodhgaya’.

  • MyJihad  We tracked a particular hashtag, ‘MyJihad’, which we observed had controversial contents and a quite high frequency of tweets. The tweets were collected over a period of eight days between 20 Jul 2013 and 27 Jul 2013.

  • Telangana  The spread of politically sensitive information in India over the demand for a separate state of Telangana was studied using the keyword ‘telangana’. The tweets were collected over a period of eight days between 23 Jul 2013 and 30 Jul 2013 prior to the government decision being announced.

  • Andhra  There was a widespread stir against the bifurcation of the state of Andhra Pradesh in India after the decision was announced. We tracked the movement using the keywords ‘andhra’ and ‘telangana’ for the period from 30 Sep 2013 to 09 Oct 2013.

Table 6 Details of data sets

6.2 Results

We constructed the retweet graph as the repropagation graph and the item response matrices as per the methodology given in Algorithms 3.1 and 5.1. The different communities were analysed using the appropriate models. A detailed view of ICC curves and their interpretation is given in Fig. 4, which shows the use of the discrimination parameter, a, to classify information. Positive values of a indicate credible information. When a is approximately 0, there is very little discrimination between ability levels of users; such information meets either complete acceptance or complete rejection by the users and hence need not be considered misinformation. A value of a less than 0 indicates possible misinformation. The figures highlight repropagated messages which fall in the different categories. The results and interpretations for some of the communities in the data sets are shown in Figs. 5 and 6.
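The triage by discrimination parameter described above can be sketched as follows; the width of the near-zero band is an illustrative assumption on our part, not the paper's calibrated thresholds (those are given in its tables).

```python
def classify_item(a, zero_band=0.2):
    """Triage of an item (message) by its estimated discrimination
    parameter a. The zero_band width is a hypothetical value chosen
    for illustration; in the paper the cut-offs come from the labelled
    threshold tables."""
    if a > zero_band:
        return "credible"            # higher-ability users endorse more
    if a < -zero_band:
        return "possible-misinformation"  # endorsement falls with ability
    return "uniform-response"        # near-zero a: blanket accept/reject
```

Only the "possible-misinformation" bucket needs human follow-up, which is what makes the scheme usable as a first-stage filter for a monitoring system.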

Fig. 4
figure 4

Sample item characteristic curves and their interpretation in 2PL and 3PL models

Fig. 5
figure 5

Sample item characteristic curves, item information curves and test information curves of dichotomous responses for different communities in data sets

Fig. 6
figure 6

Sample item response category characteristic curves, item information curves and test information curves for polytomous responses for different communities in data sets

Each row in Figs. 5 and 6 represents sample output of the analysis of information propagation in communities from a data set. Figure 5 is for dichotomous responses and Fig. 6 is for polytomous responses. We show the results for only four data sets, but similar results were obtained for the other data sets also. For each set of dichotomous responses, we show item characteristic curves (ICC), item information curves (IIC) and test information curves (TIC). For each set of polytomous responses, we show one of the item response category characteristic curves, an item information curve and a test information curve. For determining the credibility of information, the value of the discrimination parameter is used to make decisions as per the ranges given in Table 1. The TIC and IIC give a visual representation of the type of information in the community. Both figures represent information as proportional to the square of the discrimination parameter, a. The ability levels at which the information functions are maximum are also the levels at which the discrimination parameters are maximum, so we need to concentrate only on these levels. Positive and negative discrimination values could be further assessed from the ICC. The TIC and IIC enable us to carry out an initial estimation of the quality of information: a monotonically rising TIC indicates good information and requires no further analysis. As social media is mostly used by legitimate users, we expect most of the communities to be segregated in this manner. This enables us to concentrate on those communities in which there is a greater possibility of flow of misinformation.

Spikes in the IICs and TICs occur for items with both negative and positive discrimination parameters because of the \(a^2\) term in their functions. The ICCs of the first two rows of dichotomous responses in Fig. 5 indicate the presence of misinforming items: they show a sizeable presence of negative and near-zero discrimination levels among the items. The figures in the third row show a single item which could be classified as misinforming. The magnitude of the negative discrimination in the ICC of the fourth row is below the threshold required to classify the item as misinforming. Analysis of the TICs shows spikes at ability levels \(<\)0 only for the dichotomous responses in the first two rows. Though not improbable, negative discrimination ICCs were not observed at ability levels \(>\)0. This means that TICs and IICs with information spikes at positive ability levels would not correspond to negative discrimination, and hence such items would not be classified as misinforming. The absence of such spikes in the IIC and TIC of the fourth-row figures of the dichotomous matrices is clear. These figures identify the presence of misinforming items in the communities of the first two rows and the items contributing towards it. The absence of misinforming items in the community depicted in the fourth row, and the single misinforming item in the third row, are also evident from the figures.

The presence of spikes in the polytomous matrices in Fig. 6 is analysed in a similar manner. Spikes in the TIC are present in the polytomous matrices of the first, third and fourth rows. The monotonic increase in the TIC of the second row of polytomous responses is the ideal case: credible information in the community without any misinforming contents. Spikes in the TICs and IICs at negative ability levels could be matched with item response category curves (IRC) and the presence of negative discrimination in them. An IRC is plotted for each category of an item; for each row, we show only one IRC of the community, chosen because it shows a negative discrimination parameter. As with the dichotomous responses, there were no information spikes at positive ability levels for items with negative discrimination.

The accuracy of the classification was verified in the data sets using human annotation. The precision, i.e. the proportion of actually false/unverified contents among the segregated messages, was very high at around 95 %. The identification of possible sources of misinformation is very important, as further monitoring of OSNs would involve monitoring the activities of the suspected sources.

6.3 Time complexity of algorithm

The proposed methodology could provide an effective social media monitoring framework only if the time complexity of the algorithm is low. The running time of the algorithm depends on the size of the data sets and is proportional to the number of nodes. However, the algorithm enables parallelization of the computation: after identification of the different communities in the data sets, the analysis of each community for quality of information propagation can be taken up independently. While each of the communities can be processed in parallel, the number of communities which require analysis could also be reduced. We need to isolate communities with a sizeable amount of interactions and number of nodes, and an empirical measure could be used to limit the number of communities to be analysed. We use entropy as a measure of the information content of the bipartite graphs. The edges in the graph represent the quantum of interactions between user nodes; an increase in the number of nodes or edges indicates greater interaction and hence greater flow of information. We define entropy in Eq. 15, where H(d) refers to the entropy in the community, p is the proportion of user nodes and \(1-p\) is the proportion of weighted edges. The analysis could then be limited to communities with higher entropy. This filtering is essential, as it would be difficult to apply LTT accurately to very small data sets.

$$\begin{aligned} H(d) = -p\log p - (1-p)\log (1-p) \end{aligned}$$
(15)
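A sketch of the entropy-based filtering under Eq. 15. The natural logarithm and the default cut-off are our assumptions for illustration; the paper reports an empirical cut-off of about 1 without specifying the log base, and all function names are our own.

```python
import math

def community_entropy(n_nodes, edge_weight):
    """Eq. 15: binary entropy of the node/edge mix in a community's
    bipartite repropagation graph. p is the proportion of user nodes,
    1 - p the proportion of weighted edges. Natural log is assumed."""
    total = n_nodes + edge_weight
    if total == 0:
        return 0.0
    p = n_nodes / total
    if p in (0.0, 1.0):
        return 0.0                      # degenerate community carries no signal
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def communities_to_analyse(communities, threshold=0.5):
    """Rank communities by entropy and keep those above a threshold.
    `communities` maps community id -> (n_nodes, total_edge_weight).
    The default threshold is illustrative only."""
    scored = [(community_entropy(n, w), cid)
              for cid, (n, w) in communities.items()]
    return [cid for h, cid in sorted(scored, reverse=True) if h > threshold]
```

Communities dominated by either nodes or edge weight score low and are deferred, matching the observation that low-entropy communities show limited, narrowly accepted information flow.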

The distribution of entropy for the different data sets is shown in Fig. 7. The figure shows a heavy tailed distribution, and only a few communities possess entropy values greater than \(\approx \)1. These are the communities with a sufficient number of nodes and interaction edges to be accurately analysed. The figures indicate the order in which the communities need to be analysed. The limited flow of information in communities with low entropy is an indication of the lack of wide acceptance of the users and messages in them.

Fig. 7
figure 7

Distribution of Entropy in communities in six data sets

6.4 Visualisation of trusted communities

The use of LTT models to evaluate behavioural trust in terms of polytomous item response matrices of communities gives us a novel method of visualising trust in these communities. The TIF and IIF curves indicate the distribution of the credibility of information as well as of its sources. An even spread of information indicates greater trust, while an uneven distribution indicates the formation of clusters of users in the community with no universal acceptance of credibility. The measure of trust is also an indication of the social capital of these networks, and we feel our work enables quantifying the social capital of the networks and visualising it in an effective manner.

7 Conclusion and future work

We have proposed an integrated model based on psychometric analysis and computer science to accurately detect the quality of information propagation in OSNs. Using extensive data sets, we could demonstrate the effectiveness of OSNs as social computing platforms. In our proposed models, we used the ability of a user as the latent trait to measure the credibility of messages propagating in the network. Further, we used behavioural trust characteristics to detect trust communities and segregate possible sources of misinformation and disinformation.

We have demonstrated the usefulness of the social computing properties of users of OSNs for determining the quality of information. The responses of users, in terms of their decision to repropagate messages, have been used to measure their ability to distinguish between credible information and misinformation. We have then used this ability to decide on the credibility of information by selecting appropriate models in latent trait theory. The characteristic curves which denote item characteristics, together with the information functions, were used to detect the quality of information in communities. We have used repropagation as a measure of the existence of trust relationships between users to establish trust communities. The formation of trust communities would further help in accepting the credibility of information from users of such communities, and would be a big step toward building a scalable Reputation system for OSNs. A sequential process of segregating trusted communities and further analysis of user behaviour would help in determining the quality of information in them. Experiments have shown the development of trust communities in repropagation graphs. This could be effectively used as the first stage of any social media monitoring system.

The importance of social networks for disseminating information has risen phenomenally in the last decade, and the abuse of the media by malicious users to spread disinformation has also increased manyfold during this time. The menace of the spread of misinformation could be tackled only by an efficient social media monitoring system for detecting false information, together with vigilance by individual users while propagating information of any kind. Social network data is Big Data, and the speed at which information spreads requires an efficient, near real time monitoring system; hence the importance of using social networks themselves as social computing platforms to detect and segregate possible misinformation. Employing the users themselves to detect possible misinformation results in scalable algorithms. Prevention is better than cure, and measures to limit the spread of false information would require the development of trust relationships between users in the network, in the form of trusted contents and trusted people.

In future, we would further explore the feasibility of combining the principles of cognitive psychology, social sciences and trust relationships to predict the likely extent of the spread of information. Latent trait theory seems to have enough potential to be used effectively in other social computing systems, such as Recommender systems and Reputation systems, as well.