
1 Introduction

Social networks are congested with a large number of posts such as blogs, reviews, opinions, images and videos. Extracting the required information from such a congested network is a difficult and time-consuming task. An online recommender system helps to retrieve the desired information from this crowded network. For example, Amazon’s recommender system uses an item-to-item collaborative filtering approach for item recommendation. Similarly, Facebook, LinkedIn and other social networking sites examine the network of connections between a user and their friends to suggest a new group based on interest. The downside of such online recommender systems is that the recommendations are generated from anonymous people similar to the target user, so there is no guarantee that a recommendation comes from trusted people. Therefore, people tend to rely more on a trusted person’s recommendation than on an online recommendation [21]. A recommender system designed for a trust network is called a trust-based recommender system.

When the trust model becomes vulnerable, the transparency of the trust rating is lost [13]. Critical analysis of content or web resources makes the trust rating transparent, and this is possible only with provenance. Provenance provides meta-information about the creation and processing of content. Thus, in this model, the trust rating is computed using the provenance data derived from the W7 model. Also, the trust ratings derived in the models [1, 2, 8, 9, 10, 14] are single ratings or single preferences. For example, on a five-point rating scale, the trust value ‘4’ represents high trust while the trust value ‘1’ represents very low trust. A single rating or preference cannot express the multiple aspects of a user or item, which directly or indirectly reduces the recommendation quality. Therefore, if the trust rating is derived using multiple criteria or features such as ‘Originality of the content’, ‘Timeliness of the post’ and ‘Relevancy of the content’, for example as (4, 3, 2), the quality of recommendation is expected to improve.

The issues discussed above are handled by the proposed recommender system. First, trust is evaluated over multiple dimensions or attributes rather than a single dimension or attribute. If multiple aspects of users are analyzed for trust computation, the impact of the recommendation is stronger and positive. Next, the information gain of each attribute is used as a weight component and a weighted similarity measure is computed. The multiple dimensions are easily represented as a vector, and hence vector similarity measures such as Jaccard, Dice and Cosine are used. This similarity is then boosted by the user’s trust degree (trust level) and the recommendation is made. The contributions of the proposed recommender system are as follows.

  • Modeling the user

  • Formation of fuzzy vector space

  • Finding preference and recommending top-k users

The structure of the paper is as follows. Section 2 reviews the related research. The proposed recommender system is discussed in detail in Sect. 3. Performance evaluation is discussed in Sect. 4. Finally, the conclusion and future work are stated in Sect. 5.

2 Related Work

The trust-based recommendation techniques depend on two important components, namely recommendation techniques and representation of trust models.

2.1 Trust-Enhanced Recommendation Techniques

Trust-enhanced recommendation algorithms are generally enhancements of standard recommendation techniques such as simple mean, Pearson weighted mean and Pearson collaborative filtering. Trust-enhanced methods receive recommendations from trusted peers, whereas the standard methods receive recommendations from ordinary users. The most common trust-enhanced recommender strategy is to ask users to explicitly state their trust in other users. For instance, the Moleskiing recommender system [3] uses FOAF files that contain trust information on a scale ranging from 1 to 9. The trust model proposed by A. Abdul Rahman and S. Hailes [1] for virtual communities is grounded in real-world social trust characteristics and based on reputation or word-of-mouth. Falcone et al. proposed a fuzzy cognitive map model [8] to derive trust based on the belief values of an agent. This model shows how different components (beliefs) may change and how their impact can change depending on the specific situation and on the agent’s personality. The aim of Golbeck’s trust model [9] is to determine how much one person in the network should trust another person to whom they are not directly connected. This algorithm accurately analyses the opinions of the people in the system. The TidalTrust algorithm is based on a trust-based weighted mean, which uses the trust values of users as weights for the ratings of other users.
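As a rough illustration of the trust-based weighted mean mentioned above, the following sketch predicts a rating as the average of the raters' ratings weighted by the trust placed in each rater. The function name and the sample values are hypothetical and are not taken from [9]; trust propagation to users who are not directly connected is not modeled here.

```python
def trust_weighted_mean(ratings, trust):
    """Predict a rating as the mean of raters' ratings weighted by trust.

    ratings: dict mapping rater id -> rating given to the item
    trust:   dict mapping rater id -> trust the target user places in the rater
    Returns None when no rater has positive trust.
    """
    numerator = 0.0
    denominator = 0.0
    for rater, rating in ratings.items():
        t = trust.get(rater, 0.0)
        if t > 0:
            numerator += t * rating
            denominator += t
    return numerator / denominator if denominator > 0 else None


# Hypothetical values: two raters, trusted at 0.9 and 0.3.
predicted = trust_weighted_mean({"u1": 4.0, "u2": 2.0}, {"u1": 0.9, "u2": 0.3})
```

The more trusted rater dominates the prediction, which is the effect the weighting is meant to achieve.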

Hang et al. [10] used a graph-based approach to recommend a node in a social network using similarity in trust networks. Massa and Avesani [14] proposed a trust-based recommendation system in which it is possible to search for trustable users by exploiting trust propagation over the trust network. Andersen et al. [2] explored an axiomatic approach to trust-based recommendation and proposed several recommendation models, some of which are incentive compatible. In the MoleTrust method, the similarity weight is attributed to ratings by user. A trust-filtered collaborative filtering technique is used by O’Donovan and Smyth in [4]. Here the trust value is used as a filtering mechanism to choose only the item raters who are trusted above a certain threshold. The Ensemble trust technique proposed by Victor et al. [17] aims to take into account all possible ways to obtain a positive weight for a rater of an item while favoring trust over similarity.

2.2 Trust Model Representation

Trust representations can be classified from three different perspectives, namely (i) probabilistic vs. gradual trust, (ii) single vs. multi-dimensional trust and (iii) trust vs. distrust. Probabilistic representations use probabilities to indicate how much trust one user places in another [17]. Stronger trust corresponds to a higher probability. Gradual representations [17] use continuous values to represent trust. The values can be arbitrary, so they cannot be interpreted as probabilities; they directly indicate trust strength. Here, (u, v, t) denotes that the trust value from u to v is t. Trust is a complex concept with multiple dimensions, namely (i) multifaceted trust and (ii) trust evolution. Multi-dimensional trust representations are an extension of single trust representations [11]. Trust is context dependent: trusting someone on one topic does not necessarily mean that he will be trusted on others. The trust value is represented as <u, v, f, p>, where u trusts v with probability p in the facet f. The author also suggests that trust evolves as humans interact over time T. Josang’s subjective logic is a probabilistic model [12] that considers both trust and distrust simultaneously. A gradual trust model for both trust and distrust can be found in [5, 8, 16]. Guha et al. use a pair (t, d) [18] with trust degree t and distrust degree d, and the final suggested trust value is obtained by subtracting d from t, i.e. t − d.

3 Proposed Recommender System

The proposed recommender system is built to recommend the top-k reviewers in a book-based social network. For this, data about the reviewers and the reviews is collected from Goodreads, Google Books and Amazon using ad hoc APIs and by scraping HTML pages. The fields collected from the social network are given in Table 1. More than 61,000 reviews, together with the associated reviewers’ data available from 2007 to 2015, are collected, and details of the collected dataset are given in Table 2. In general, the number of reviews and the number of reviewers are not the same, but this dataset has a single review from each reviewer.

Table 1. Fields collected from the social network
Table 2. Dataset details

The collected data is preprocessed, and from it the trust score of each reviewer is computed using the W7 provenance model [22]. Then, using a DoT-pruned Fuzzy Decision Tree (FDT) classifier [7], the reviewers are classified and fuzzy rules are generated. Finally, the fuzzy rules are combined with a target user’s request to perform the recommendation. The major components of the proposed recommender system are as follows.

  1. Provenance Based Trust Assessment

  2. Fuzzy Decision Tree Based Classifier

  3. Trust-Boosted Recommender System

3.1 Provenance Based Trust Assessment

This section describes how trust assessment is made using the W7 provenance model. For the experiment, the data are collected from the Goodreads.com book-based social network. The provenance elements defined in the W7 model form a 7-tuple: (WHAT, WHEN, WHO, HOW, WHY, WHICH, WHERE). The description of the provenance elements in the context of trust is shown in Table 3.

Table 3. Description of provenance elements

Since the data for PWHERE and PWHICH are not provided by the domain, these two elements cannot be modeled. Therefore, the core provenance elements taken for trust quantification are PWHAT, PWHEN, PWHO, PHOW and PWHY. The trust assessment algorithm quantifies these five provenance elements. The resulting trust value is then given to the learning model to classify the users into various trust levels.

3.2 Fuzzy Decision Tree Based Classification

The learning model takes the quantified provenance value obtained using the W7 model as trust input. This is fuzzified using a Triangular Membership Function (TMF), and a rule base is constructed using Mamdani’s ‘If… Then’ interpretation. The Fuzzy Decision Tree (FDT) [19] takes the rule base and generates decision trees using a fuzzy ID3 [6] algorithm. To construct the FDT, two criteria need to be evaluated: a splitting criterion and a stopping criterion. The former helps to choose the root node and child nodes. The latter controls the growth of the tree.
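The fuzzification step can be sketched as follows. This is a minimal illustration of a triangular membership function, not the authors' implementation; the five evenly spaced fuzzy sets over [0, 1] in `TERMS` are an assumption made only for this example.

```python
def triangular_membership(x, a, b, c):
    """Membership degree of x in the triangular fuzzy set (a, b, c), a <= b <= c."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0  # peak; also covers shoulder sets where a == b or b == c
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed (hypothetical) partition of [0, 1] into five trust levels.
TERMS = {
    "VLWT": (0.0, 0.0, 0.25),
    "LT": (0.0, 0.25, 0.5),
    "MT": (0.25, 0.5, 0.75),
    "HT": (0.5, 0.75, 1.0),
    "VHGT": (0.75, 1.0, 1.0),
}


def fuzzify(x):
    """Return the membership degree of a crisp value x in every term."""
    return {term: triangular_membership(x, *abc) for term, abc in TERMS.items()}


degrees = fuzzify(0.6)  # partly 'MT', partly 'HT' under the assumed partition
```

A crisp trust value thus maps to graded membership in neighbouring linguistic terms rather than to a single class.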

In the FDT, the provenance element with the highest information gain is assigned as the root node, and each leaf node denotes a trust decision. Each distinct path from the root to a leaf produces a distinct rule. Each generated rule is assigned a Degree of Truth (DoT) [15] that states how much truth value it holds. If DoT = 1, the rule is absolutely true, and if DoT = 0, the rule is absolutely false. Sample fuzzy rules are shown in Fig. 1. Here, reviewers are classified into five trust levels: VHGT (Very High Trust), HT (High Trust), MT (Moderate Trust), LT (Low Trust) and VLWT (Very Low Trust). The abbreviations for the linguistic terms in Fig. 1 are as follows: MSM (Moderately Same), HD (Highly Deviated), HSM (Highly Same), MD (Moderately Deviated), MITM (Moderately Ineffective Time Spent), HITM (Highly Ineffective Time Spent), HUTR (Highly Untruthful), HR (Highly Relevant), NTR (Neutrally Truthful), MIR (Moderately Irrelevant) and MDSML (Moderately Dissimilar).
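To make the root-selection idea concrete, the sketch below computes the classical (crisp) ID3 information gain and picks the attribute with the highest gain. This is a simplification: the fuzzy ID3 algorithm of [6] works with membership degrees instead of crisp counts, and the four toy records are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(records, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in records])
    remainder = 0.0
    for value in {r[attribute] for r in records}:
        subset = [r[target] for r in records if r[attribute] == value]
        remainder += len(subset) / len(records) * entropy(subset)
    return base - remainder


# Hypothetical toy records using linguistic terms from the paper.
records = [
    {"PWHAT": "HR", "PWHEN": "HETM", "trust": "HT"},
    {"PWHAT": "HR", "PWHEN": "HITM", "trust": "HT"},
    {"PWHAT": "HIR", "PWHEN": "HETM", "trust": "LT"},
    {"PWHAT": "HIR", "PWHEN": "HITM", "trust": "LT"},
]
gains = {a: information_gain(records, a, "trust") for a in ("PWHAT", "PWHEN")}
root = max(gains, key=gains.get)  # attribute chosen as the root node
```

In this toy data PWHAT separates the trust classes perfectly while PWHEN does not, so PWHAT would be placed at the root.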

Fig. 1. Fuzzy rules

To obtain better accuracy with a minimum number of rules, the stopping criterion (β) is used. The values of β chosen are 0.9 and 1, and the lengths of the rules range from 2 to 5. Table 4 shows the number of rules generated. For example, rules #2 and #3 in Fig. 1 have length 2 and rule #5 has length 5. These fuzzy rules are taken as input to build a trust-boosted recommender system.

Table 4. Rules generated

3.3 Trust-Boosted Recommender System

The proposed trust-boosted recommender system, which recommends trustworthy users to the target user UT, is shown in Fig. 2. The major components of this recommender system model are:

  • User Profile Learning

  • Recommendation

Fig. 2. Proposed trust boosted recommender system

The target user or requester (UT) sends a query as a request (Rq) asking for recommendations from the trust network. This query is sent to the trust network, which checks whether UT is a new user. If UT is an existing user, the highly trusted users are recommended. Otherwise, the request is sent to the profile learner, where the profile data (Pdata) are updated based on the query and the existing profile information. The updated Pdata is then sent to the trust network. In the trust network, the reviewers are grouped based on the trust levels <VHGT, HT, MT, LT, VLWT>. From the set of fuzzy rules, the conditional attributes and the decision attributes are extracted. For each conditional attribute, a fuzzy vector space (FVSP) is generated. The FVSP consists of tuples <Attribute, Preference based Fuzzy Number>. Vector similarity measures such as Jaccard, Dice and Cosine are then used to find how similar the target user is to the other users in the trust network.

The gain value of each attribute (AG) is assigned as a weight component and applied to the three similarity measures mentioned above to calculate the weighted similarity value. This similarity is then boosted by the trust degree (trust level) of the corresponding decision attribute. Based on the boosted similarity value, the trusted users are ranked from highest to lowest. Finally, the top-k users are recommended to the target user (UT). After a recommendation, the target user’s feedback is collected and the profile learner updates the Pdata accordingly. To collect the feedback, a set of feedback queries (FDqry) is formulated based on the five attributes PWHAT, PWHEN, PWHO, PHOW and PWHY. For each FDqry, users are asked to provide a quantitative value on a scale from 0 to 1. This recommendation process is repeated for each user request with the updated profile.

User Profile Learning Phase

In the user modeling phase, the fuzzy rules are extracted from the rule database derived using the fuzzy decision tree (discussed in Sect. 3.2). Each user may have one or more rules, as shown in Fig. 1. Using a rule matching algorithm, each user is assigned the matched rule(s). The users are then grouped into the five trust levels mentioned above. For example, if users U109 and U169 are classified as Low Trust (LT), they are grouped under LT. A similar procedure is carried out for the rest of the users. If UT is an existing user, the profile of the user is already known and the user can directly access the trust network. If UT is a new user, the profile of the user needs to be learned before accessing the trust network. The profile learning is depicted in Fig. 3.

Fig. 3. Profile learning model

Initially, UT’s field of interest and training examples (already labeled items) are collected; these form the basic profile (Pb) of the user. The set of feedback (Fb) provided by UT for the items is also collected. Finally, (Pb, Fb, Rq) are combined and sent to the profile learner. The learned user profile is then given as input to the recommendation phase.

Recommendation Phase

The recommendation is carried out in two steps. One is creation of FVSP and the other is recommendation of the user.

Formation of FVSP

The rules extracted from the trust network are partitioned into conditional attributes set (CAS) and decision attributes set (DAS). The CAS consists of all the trust attributes <PWHAT, PWHEN, PWHO, PHOW, PWHY>. The DAS consists of trust decision <VLWT, LT, MT, HT, VHGT>.

Step 1: For each attribute in the conditional attribute set, assign an attribute grade. The grade is based on the position of the triangular fuzzy function and is given in Table 5. The linguistic space of each attribute is given below.

$$ \text{Linguistic Space} = \left( \begin{array}{c} P_{WHAT} = [HIR, MIR, NR, MR, HR] \\ P_{HOW} = [HSM, MSM, NSM, MD, HD] \\ P_{WHEN} = [HITM, MITM, NETM, METM, HETM] \\ P_{WHY} = [HTR, MTR, NTR, MUTR, HUTR] \\ P_{WHO} = [HDSML, MDSML, NDSML, MSML, HSML] \end{array} \right) $$
Table 5. Fuzzy number for the attribute PWHAT

For example, in the PWHAT attribute, the position of ‘HIR’ has the lowest grade, i.e. 1, and ‘HR’ has the highest grade, i.e. 5. Similarly, in the PWHY attribute, the position of ‘MTR’ (“Moderately Truthful”) has a medium grade, i.e. 3.

Step 2: Assign the fuzzy number for each linguistic term based on the grade. Since triangular fuzzy logic is followed, a triangular fuzzy number is assigned to each grade. For example, the fuzzy numbers for the linguistic terms of the attribute PWHAT are shown in Table 6. For the other attributes, the fuzzy number is the same as that shown in Table 5.

Table 6. Fuzzy number for the attribute PWHAT

Step 3: The fuzzy number for each attribute is now represented as a vector in the FVSP. The FVSP for each rule is represented as a set of pairs \( \{ <A_{K}, FN_{A_{K}}> \} \), where

  • K refers to the number of attributes, here K = 5,

  • AK represents the current attribute and

  • FNAK refers to the fuzzy number for the specified attribute AK. That is, FVSP = \( \{ <A_{1}, (a_{11}, a_{12}, a_{13})>, <A_{2}, (a_{21}, a_{22}, a_{23})>, \ldots, <A_{5}, (a_{51}, a_{52}, a_{53})> \} \). Here (a11, a12, a13) is the triplet used in the TMF to define the fuzzy number, where 0 ≤ a11 ≤ a12 ≤ a13 ≤ 1.

For example, the FVSP for rule 1 shown in Fig. 1 is \( \{ <P_{HOW}, (0.25, 0.50, 0.75)>, <P_{WHY}, (0.50, 0.75, 1.0)>, <P_{WHEN}, (0.50, 0.75, 1.0)>, <P_{WHAT}, (0.50, 0.75, 1.0)>, <P_{WHO}, (0.75, 1.0, 1.0)> \} \). This FVSP is taken as input to calculate the vector similarity and to suggest the top-k trustworthy users.
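One possible in-memory representation of an FVSP is a mapping from attribute name to its triangular fuzzy number. The sketch below encodes the rule 1 example quoted above and checks the ordering constraint on each triplet; the function name `is_valid_fvsp` and the dictionary layout are illustrative choices, not part of the original system.

```python
# FVSP for rule 1, using the triplets quoted in the text.
fvsp_rule1 = {
    "PHOW": (0.25, 0.50, 0.75),
    "PWHY": (0.50, 0.75, 1.0),
    "PWHEN": (0.50, 0.75, 1.0),
    "PWHAT": (0.50, 0.75, 1.0),
    "PWHO": (0.75, 1.0, 1.0),
}


def is_valid_fvsp(fvsp, expected_attributes=5):
    """Check that there are K attributes and each triplet satisfies 0 <= a1 <= a2 <= a3 <= 1."""
    if len(fvsp) != expected_attributes:
        return False
    return all(
        len(triplet) == 3 and 0.0 <= triplet[0] <= triplet[1] <= triplet[2] <= 1.0
        for triplet in fvsp.values()
    )
```

Keeping the triplets as plain tuples makes it straightforward to feed them to a vector similarity function.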

Recommendation of the Top-k Users

In a vector space, several similarity measures between two vectors have been successfully applied in fields such as pattern recognition, classification of complex objects and other decision making problems. The vector similarity measure chosen in the proposed recommendation system is the Cosine similarity. Using the FVSP, this vector based similarity measure is used to find how similar UT is to the other users in the trust network. The gain value of each attribute (AG) is taken as a weight component and applied to the measure to calculate the similarity value.

Let X = UT = (a1, a2, a3) and Y = UN = (b1, b2, b3) be the fuzzy numbers of the target user (UT) and another user (UN) from the trust network, respectively. The cosine similarity measure is then given in Eq. (1).

$$ S = Cosine(U_{T}, U_{N}) = \sum_{k=1}^{5} A_{G_{k}} \frac{\sum_{f=1}^{3} FN_{A_{Tkf}} \cdot FN_{A_{Nkf}}}{\sqrt{\sum_{f=1}^{3} FN_{A_{Tkf}}^{2}} \cdot \sqrt{\sum_{f=1}^{3} FN_{A_{Nkf}}^{2}}} $$
(1)

where,

  • AG represents the attribute gain,

  • f indexes the three values of each fuzzy number,

  • a1, a3, b1 and b3 are the endpoints of the fuzzy numbers,

  • a2 and b2 are the peak points of the fuzzy numbers
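Eq. (1) can be sketched in code as a gain-weighted sum of per-attribute cosine similarities. The gain values below are hypothetical (chosen to sum to 1 so that identical profiles score 1); the paper derives them from the decision tree. Skipping attributes whose fuzzy number is all zeros is also an assumption, made to avoid division by zero.

```python
import math


def weighted_cosine(fvsp_t, fvsp_n, gains):
    """Gain-weighted cosine similarity between two FVSPs, following Eq. (1)."""
    total = 0.0
    for attribute, weight in gains.items():
        x, y = fvsp_t[attribute], fvsp_n[attribute]
        dot = sum(a * b for a, b in zip(x, y))
        norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
        if norm > 0:  # assumption: an all-zero fuzzy number contributes nothing
            total += weight * dot / norm
    return total


# Hypothetical attribute gains (not taken from the paper).
gains = {"PWHAT": 0.30, "PWHEN": 0.15, "PWHO": 0.20, "PHOW": 0.10, "PWHY": 0.25}

target = {
    "PHOW": (0.25, 0.50, 0.75),
    "PWHY": (0.50, 0.75, 1.0),
    "PWHEN": (0.50, 0.75, 1.0),
    "PWHAT": (0.50, 0.75, 1.0),
    "PWHO": (0.75, 1.0, 1.0),
}
other = dict(target, PHOW=(0.75, 1.0, 1.0))  # differs from the target in one attribute

s_same = weighted_cosine(target, target, gains)
s_other = weighted_cosine(target, other, gains)
```

Because each per-attribute cosine is scaled by its gain, a mismatch on a low-gain attribute lowers the overall similarity only slightly.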

After finding the similarity (S), this value is boosted by the corresponding trust score of the user UN, as given in Eq. (2). Boosting is linear, since it is done with the associated trust level. Using this boosted similarity (Sb), the target user’s trust score is predicted. The prediction formula is given in Eq. (3).

$$ S_{b} = S \times S^{T_{wt}} $$
(2)
$$ Pred(U_{T}, I_{j}) = \begin{cases} tr_{U_{T}}, & \text{if } S_{b} = 0 \text{ or } tr_{U_{N},I_{j}} = \overline{tr_{U_{N}}} \\ tr_{U_{T}} + \dfrac{\sum_{U_{N} \in NB} S_{b}(U_{N}, U_{T}) \times \left( tr_{U_{N},I_{j}} - \overline{tr_{U_{N}}} \right)}{\sum_{U_{N} \in NB} \left| S_{b}(U_{N}, U_{T}) \right|}, & \text{else} \end{cases} $$
(3)

where,

  • Twt refers to the trust weight assigned based on the trust level of user UN (e.g., VHGT has a Twt of 1, MT has a Twt of 0.6 and VLWT has a Twt of 0.2)

  • \( tr_{U_{T}} \) represents the trust value of the target user UT, presented in fuzzy number format as shown in Table 6.

  • Ij represents items (books) which have not been given any review

  • NB represents the set of neighbors chosen
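The boosting, prediction and ranking steps can be sketched as follows, implementing Eqs. (2) and (3) as printed. Several details are assumptions made for illustration: the trust weights for HT and LT are interpolated (only VHGT, MT and VLWT are given in the text), trust values are treated as scalars rather than fuzzy numbers, and all function names are hypothetical.

```python
# Trust weights: VHGT, MT and VLWT are from the text; HT and LT are assumed.
TRUST_WEIGHT = {"VHGT": 1.0, "HT": 0.8, "MT": 0.6, "LT": 0.4, "VLWT": 0.2}


def boost(similarity, trust_level):
    """Eq. (2) as printed: S_b = S * S ** T_wt."""
    return similarity * similarity ** TRUST_WEIGHT[trust_level]


def predict(tr_target, neighbours):
    """Eq. (3): neighbours is a list of (s_b, tr_on_item, mean_tr) tuples."""
    numerator = sum(s_b * (tr - mean) for s_b, tr, mean in neighbours)
    denominator = sum(abs(s_b) for s_b, _, _ in neighbours)
    if denominator == 0:
        return tr_target  # no usable neighbour: fall back to the target's own value
    return tr_target + numerator / denominator


def rank_top_k(candidates, k):
    """candidates: list of (user_id, similarity, trust_level); returns top-k ids."""
    scored = sorted(candidates, key=lambda c: boost(c[1], c[2]), reverse=True)
    return [user_id for user_id, _, _ in scored[:k]]


# Hypothetical usage.
top2 = rank_top_k([("R1", 0.9, "HT"), ("R2", 0.95, "MT"), ("R3", 0.5, "VHGT")], 2)
estimate = predict(0.5, [(0.8, 0.9, 0.7), (0.4, 0.6, 0.7)])
```

Note that for S in [0, 1] the expression in Eq. (2), taken literally, yields a value no larger than S; the sketch follows the printed formula without reinterpreting it.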

Consider a randomly chosen reviewer, say reviewer 72 (R72), requesting the recommendation of k users (let k = 10). The similarity (S) between the requester and the rest of the users is calculated and then boosted using Eq. (2). Table 7 shows the similarity and boosted similarity (Sb) scores of the top-k reviewers. The reviewers are sorted by similarity from highest to lowest. Though both similarities show the highest score for the top reviewers, the trust levels differ. The trust level of the reviewer that best matches R72 is ‘HT’. Therefore, the top-10 reviewers are expected to have the trust level ‘HT’. Without boosting, however, the 4th, 6th and 10th reviewers have a different trust level (‘MT’) instead of ‘HT’. With boosting, only the 10th reviewer has a different trust level.

Table 7. Similarity and Boosted similarity score of top-k reviewer.

Therefore, the prediction error is higher without boosting and lower with the boosted method. Boosting the similarity ranks the top-k reviewers appropriately. In this way the proposed system obtains a reduced MAE and RMSE.

4 Experiments and Result Analysis

To evaluate the performance of the gain weighted trust boosted recommender system, experiments are conducted on a data set from the popular book based social network Goodreads. The aim of these experiments is to present a comparative study of the proposed recommendation strategy in the fuzzy trust concept. The proposed trust boosted model is also evaluated against other weight strategies. The performance of the proposed recommendation strategy is measured with respect to the quality of predictions and the quality of recommendations. The quality of prediction is measured by the Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). Similarly, the quality of recommendation is measured by precision@k, recall@k and Average Precision (AP). The leave-one-out method is used to evaluate recommendation systems [14]. This technique involves withholding one rating and trying to predict it from the remaining ratings. The predicted rating is then compared with the actual rating, and the difference is taken as the prediction error.
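The leave-one-out protocol and the two error measures can be sketched as below. The predictor used here (the mean of the remaining ratings) and the sample ratings are placeholders chosen for illustration; in the paper the prediction comes from Eq. (3).

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def leave_one_out(ratings, predictor):
    """Withhold each rating in turn and predict it from the remaining ones."""
    actual, predicted = [], []
    for i, held_out in enumerate(ratings):
        rest = ratings[:i] + ratings[i + 1:]
        actual.append(held_out)
        predicted.append(predictor(rest))
    return actual, predicted


def mean_predictor(rest):
    """Placeholder predictor (an assumption): the mean of the remaining ratings."""
    return sum(rest) / len(rest)


actual, predicted = leave_one_out([0.2, 0.4, 0.6], mean_predictor)
error_mae = mae(actual, predicted)
error_rmse = rmse(actual, predicted)
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.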

4.1 Evaluation of Different Weight Strategies

The weight strategies considered for evaluation are the expected weight method, the preference based method [20] and the proposed trust boosted method. The MAE, RMSE and AP measures are evaluated for these weight strategies. Figure 4 shows the MAE values obtained for the cosine similarity measure. From this figure, it is observed that the proposed trust boosted method shows a lower prediction error (MAE) than the other two methods. Similarly, Fig. 5 shows the RMSE values obtained for the cosine similarity measure. From this figure, it is observed that the proposed trust boosted method shows a lower prediction error (RMSE) than the expected weight method. The preference based method shows a higher error rate than the other two methods.

Fig. 4. MAE for cosine similarity measure

Fig. 5. RMSE for cosine similarity measure

The AP value for the above similarity measure is shown in Fig. 6. The precision value for the proposed method is higher than that of the other two methods. The AP is almost the same for the top-5 and top-10 users. Up to the top-20 users, the precision value is greater than or equal to 0.90. After that the precision value starts decreasing, and for the top-50 users the precision value is very low in the preference based method.
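For reference, the ranking measures used here can be computed as in the following sketch. The notion of a "relevant" reviewer (for instance, one with the expected trust level) and the sample ranking are assumptions for illustration; AP is normalized by the number of relevant items, which is one common convention among several.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended users that are relevant."""
    return sum(1 for user in ranked[:k] if user in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant users that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for user in ranked[:k] if user in relevant) / len(relevant)


def average_precision(ranked, relevant):
    """Mean of precision@i over the ranks i at which a relevant user appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for i, user in enumerate(ranked, start=1):
        if user in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant)


# Hypothetical ranking and relevance set.
ranked = ["R1", "R2", "R3", "R4"]
relevant = {"R1", "R3"}
p2 = precision_at_k(ranked, relevant, 2)
r2 = recall_at_k(ranked, relevant, 2)
ap = average_precision(ranked, relevant)
```

AP rewards rankings that place relevant reviewers early, so it complements precision@k at a fixed cut-off.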

Fig. 6. Average precision for cosine similarity measure

4.2 Comparison with Other Trust-Based Recommender System

The proposed recommender system is compared with other trust-based recommender systems. The evaluation is based on MAE and RMSE. First, the proposed method (boosted similarity) is compared against the same method without boosting the similarity. The MAE for this comparison is shown in Fig. 7. The experiment is carried out with the Jaccard, Dice and Cosine similarity measures. All three measures show a lower prediction error with boosting than without boosting. In the Jaccard measure, repetition of a word does not reduce the similarity, whereas in the Cosine measure it does.
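The paper gives only the cosine formula explicitly. For the other two measures, the commonly used vector forms are sketched below; whether the authors use exactly these forms is an assumption. Returning 0 for two all-zero vectors is likewise a convention chosen for this example.

```python
def jaccard(x, y):
    """Vector Jaccard similarity: x.y / (|x|^2 + |y|^2 - x.y)."""
    dot = sum(a * b for a, b in zip(x, y))
    denominator = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return dot / denominator if denominator > 0 else 0.0


def dice(x, y):
    """Vector Dice similarity: 2 x.y / (|x|^2 + |y|^2)."""
    dot = sum(a * b for a, b in zip(x, y))
    denominator = sum(a * a for a in x) + sum(b * b for b in y)
    return 2 * dot / denominator if denominator > 0 else 0.0


# Two triangular fuzzy numbers taken from the rule 1 example.
x = (0.25, 0.50, 0.75)
y = (0.50, 0.75, 1.0)
j_xy = jaccard(x, y)
d_xy = dice(x, y)
```

Both functions can replace the per-attribute cosine term in a gain-weighted sum, so the same weighting and boosting steps apply unchanged.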

Fig. 7. MAE for with and without boost

Figures 8 and 9 show the MAE and RMSE values, respectively, of the proposed approach and existing trust-based recommender systems. The reviewers are chosen through random sampling. The existing approaches considered are TidalTrust, MoleTrust, Fuzzy Trust Filtering, Ensemble and Hybrid. The MAE value is checked for a few randomly selected reviewers. The graph shows the reduced prediction error of the proposed method.

Fig. 8. MAE compared with existing trust-based recommender system

Fig. 9. RMSE compared with existing trust-based recommender system

Similar to MAE, the RMSE value is checked for a few randomly selected reviewers. The graph shows the reduced prediction error of the proposed method.

5 Conclusion and Future Work

In this work, a trust-boosted recommender system is designed to recommend the top-k reviewers of a book based social network. The use of provenance based trust computation from multiple aspects has improved the recommendation quality. The performance of the proposed trust boosted measure (with the gain as weight) is also compared with other weights, namely the expected value and the preference based method. The analysis shows that precision@k increases by 10.166% compared with the expected weight method and by 2.186% compared with the preference based weight method. The proposed approach is also compared with other trust based methods, and the results show that the prediction is achieved with minimum MAE and RMSE. Future work is to recommend the top-k reviewers to a group of users, that is, to develop a group recommender system.