1 Introduction

Owing to the unprecedented growth of information on the web, users find it very difficult to locate relevant information in the large collections of data available online. Web personalization tools are the most prevalent means of overcoming this information overload. A recommender system (RS), one such web personalization tool, provides relevant suggestions to users based on their preferences [7]. These suggestions aim to support the decision-making process of users in domains such as videos, music, movies (MovieLens, Netflix), restaurants (Entree), books (Amazon) and jokes (Jester). Many filtering techniques are used to construct RS, such as content-based filtering, collaborative filtering (CF) and demographic filtering [3, 12]. Among these techniques, CF is the most widely used [12]. CF recommends items to an active user based on the opinions of users who have shown similar tastes in the past. When a user has rated only a few items, a reliable recommendation is not possible for that user; this is termed the cold-start user problem. Furthermore, traditional CF also suffers from the sparsity problem [12].

The growing popularity of open social networks and the trend of integrating e-commerce applications with RS have generated increased interest in developing trust-aware RS, since in real life people rely more on recommendations from people they trust [7]. In a trust-aware RS, a trust network is usually used to find more likely neighbours by establishing relationships between users who do not share any co-rated items. Trust-aware CF approaches can be broadly classified into two categories, namely explicit trust models [1, 4, 5] and implicit trust models [2, 6, 7, 10]. Recently, much work has incorporated trust values into collaborative RS to improve prediction accuracy and to handle the sparsity and cold-start problems. In contrast to other trust-aware recommendation methods, our approach also exploits distrust links among users. The effect of distrust has not been analyzed much in the realm of RS owing to the absence of data sets representing both the trust and the distrust values for a particular person [8]. Our work in this paper is an attempt toward developing a trust-distrust enhanced recommendation model based on a novel similarity measure that combines user ratings and trust values to generate higher-quality recommendations. Our work makes the following main research contributions:

  • Designing a novel similarity measure for CF based on the computed trust values between users.

  • Handling the new user and sparsity problems by utilizing a propagation operator based on trust-distrust values.

  • Comparative analysis of the proposed recommendation strategies using trust-distrust models.

The rest of this paper is organized as follows: Sect. 2 covers related work. Section 3 describes the overall framework of our approach. Computational experiments and results are given in Sect. 4. Finally, we conclude our work in Sect. 5.

2 Related Work

This section describes collaborative filtering and explains the direct and indirect models of trust and distrust.

2.1 Collaborative Filtering

Collaborative filtering follows the principle of ‘word of mouth’, where similar users provide suggestions to a user. The following three steps are required to generate recommendations in a CF-based RS.

  • Step 1 (Similarity Computation): The similarity between the active user (\(u_a\)) and another user (u) is computed using a similarity measure such as cosine similarity, Pearson correlation or Jaccard similarity. The most widely used similarity measure in CF is the Pearson similarity measure, which is defined below:

    $$\begin{aligned} Sim(u_a,u)= \frac{\sum _{i \epsilon I} {( r_{u_a,i} - \overline{r}_{u_a} )} {( r_{u,i} - \overline{r}_u )}}{{\sqrt{\sum _{i \epsilon I} {( r_{u_a,i} - \overline{r}_{u_a} )}^2}}{\sqrt{\sum _{i \epsilon I} {( r_{u,i} - \overline{r}_u)}^2}}} \end{aligned}$$
    (1)

    where, \(r_{u_a,i}\) - Rating provided by user u\(_a\) on item i

    \({{\overline{r}}_u}\) - Mean rating of user u

    I - Set of co-rated items.

  • Step 2 (Neighbourhood set formation): Usually the top k most similar users are selected to form the neighbourhood set. Alternatively, the neighbourhood set can be generated by applying a predefined similarity threshold.

  • Step 3 (Prediction and Recommendation): The unknown rating of a target item for the active user is predicted from the neighbourhood set using the following formula:

    $$\begin{aligned} P_{u_a,m}=\overline{r}_{u_a} + \frac{\sum _{u \epsilon N(u_a)} Sim(u_a,u)(r_{u,m}-\overline{r}_{u})}{{\sum _{u \epsilon N(u_a)} Sim(u_a,u)}} \end{aligned}$$
    (2)

    where, N\((u_a)\) - Set of neighbours of user u\(_a\)

    \(P_{{u_a},m}\) - Predicted rating of active user \(u_a\) on item m

    r\(_{u,m}\) - Rating on item m by user u, who is a neighbour of user \(u_a\). Finally, the items with the highest predicted ratings are recommended to the active user.
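
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: ratings are held in a hypothetical dict of dicts (`{user: {item: rating}}`), the mean rating of a user is taken over all of that user's ratings, deviations in the prediction step are taken from each neighbour's own mean rating (the usual mean-centred form), only neighbours who rated the target item contribute, and undefined cases fall back to 0.0 or to the active user's mean. All function names are ours.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a non-empty collection of ratings."""
    values = list(values)
    return sum(values) / len(values)


def pearson_similarity(ratings_a, ratings_u):
    """Eq. 1: Pearson similarity over the items co-rated by two users.

    Assumption: returns 0.0 when there are no co-rated items or when
    either deviation vector is all zeros (Eq. 1 is undefined there).
    """
    corated = set(ratings_a) & set(ratings_u)
    if not corated:
        return 0.0
    mean_a = mean(ratings_a.values())
    mean_u = mean(ratings_u.values())
    num = sum((ratings_a[i] - mean_a) * (ratings_u[i] - mean_u) for i in corated)
    den_a = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in corated))
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in corated))
    if den_a == 0 or den_u == 0:
        return 0.0
    return num / (den_a * den_u)


def top_k_neighbours(active, ratings, k):
    """Step 2: the k users most similar to the active user."""
    sims = [
        (user, pearson_similarity(ratings[active], user_ratings))
        for user, user_ratings in ratings.items()
        if user != active
    ]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:k]


def predict(active, item, ratings, neighbours):
    """Eq. 2: mean-centred, similarity-weighted prediction.

    Assumption: falls back to the active user's mean rating when no
    neighbour rated the item or the similarity weights sum to zero.
    """
    mean_a = mean(ratings[active].values())
    num = den = 0.0
    for user, sim in neighbours:
        if item in ratings[user]:
            num += sim * (ratings[user][item] - mean(ratings[user].values()))
            den += sim
    return mean_a if den == 0 else mean_a + num / den


# Toy data, purely illustrative.
toy = {
    "a": {"m1": 5, "m2": 3, "m3": 4},
    "b": {"m1": 4, "m2": 2, "m3": 3, "m4": 5},
    "c": {"m1": 1, "m2": 5, "m4": 2},
}
neighbours = top_k_neighbours("a", toy, k=1)
print(neighbours, predict("a", "m4", toy, neighbours))
```

Note that the prediction is not clipped to the rating scale here; a deployed system would typically clamp it to the valid range.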

However, similarity-based CF suffers from several problems, such as cold start and sparsity, that can affect the precision of recommendations [3, 12]. To generate effective recommendations despite these concerns, many studies have incorporated trust values into collaborative recommender systems. In these studies, a trust network is built between users that may be helpful to the RS [4,5,6]. These studies also indicate that a user has much more confidence in a trusted user than in a stranger. A trusted user may in turn trust the opinions of his friends, so trust values can be propagated recursively. Guha et al. [2] were the first to exploit the transitivity of trust and developed a framework for trust propagation. More recently, distrust has also been investigated in the area of RS. Victor et al. [8] developed a trust assessment scheme for unconnected pairs in a trust and distrust network using trust and distrust propagation and aggregation operators, and explored various ways in which distrust information can be utilized in a fine-tuned network using the Epinions data set. Since this data set does not include the assignment of a (trust, distrust) pair to individuals, the propagation and aggregation operators have not been fully analyzed, especially in inconsistent situations [8, 9].

2.2 Trust Model

Trust models can be classified into two categories, namely explicit trust models and implicit trust models. An explicit trust model deals with direct links between users, where users specify their trust values for directly connected users [1, 4, 5]. An implicit trust model, in contrast, computes trust values among users either by propagating trust values or by deriving them from the available ratings on items [6, 7] (Table 1).

Table 1. Trust model

3 Trust Distrust Enhanced Recommendation Framework

In this section, we discuss our proposed trust-distrust enhanced recommendation framework. For an RS, let \( U = \{ {u_1,u_2,u_3,...u_n}\} \) be the set of n users and I = \(\{{i_1},{i_2},{i_3},...{i_m}\}\) be the set of m items in the system. Each user u\(_i\) has rated a set of items, and the rating of u\(_i\) on i\(_j\) is expressed as r\(_{u_{i},i_{j}}\). Our proposed system has the following three phases, which are depicted in Fig. 1 and detailed below:

Phase 1. (Effective similarity computation based on trust values): We compute the effective similarity in three steps, which are discussed below:

  • Step 1 (Similarity computation): We compute the similarity between the active user u\(_a\) and a user u by using Eq. 1.

  • Step 2 (Trust-Distrust Computation): We evaluate the trust and distrust values of the active user u\(_a\) in a user u by using the following equations:

    $$\begin{aligned} Trust_{u_{a}}(u)=\frac{2 * rec_{trust} * exp_{trust} (u_a,u)}{rec_{trust} + exp_{trust}(u_a,u)} \end{aligned}$$
    (3)

    where, rec\(_{trust}\) and exp\(_{trust}\) are computed using the computational models of [8, 11]

    $$\begin{aligned} Dis_{u_{a}}(u)=\frac{2 * rec_{dis} * exp_{dis} (u_a,u)}{rec_{dis} + exp_{dis} (u_a,u)} \end{aligned}$$
    (4)

    where, rec\(_{dis}\) and exp\(_{dis}\) are computed using the computational models of [11].

  • Step 3 (Effective Similarity): In real life, users place more confidence in users who are more trustworthy. Therefore, we combine similarity with the trust value to compute an effective similarity measure Sim\('\)(\(u_a\),u) between the active user u\(_a\) and a user u by using the following formula:

    $$\begin{aligned} Sim'(u_a,u)=\frac{(w_1 * Sim(u_a,u))+(w_2 * Trust_{u_{a}}(u))}{w_1+w_2} \end{aligned}$$
    (5)

    These two types of information are fused because similarity and social trust among users may not be highly correlated.

    Here, the weights \(w_1\) and \(w_2\) are decided experimentally and normalized to the range [0,1].
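
As an illustration of Steps 2 and 3, the sketch below evaluates Eqs. 3 to 5 for given inputs. It is a minimal sketch under assumptions: the rec and exp components are taken as ready-made inputs, since the models that produce them are described in the cited works and not reproduced here; the default weights of 0.5 are placeholders, not the experimentally chosen values; and the function names are ours.

```python
def harmonic_mean(x, y):
    """2xy / (x + y), the combination used in Eqs. 3 and 4.

    Assumption: returns 0.0 when both inputs are zero, where the
    formula is undefined.
    """
    return 0.0 if x + y == 0 else 2 * x * y / (x + y)


def trust(rec_trust, exp_trust):
    """Eq. 3: trust of the active user in user u."""
    return harmonic_mean(rec_trust, exp_trust)


def distrust(rec_dis, exp_dis):
    """Eq. 4: distrust of the active user in user u."""
    return harmonic_mean(rec_dis, exp_dis)


def effective_similarity(sim, trust_value, w1=0.5, w2=0.5):
    """Eq. 5: weighted combination of similarity and trust.

    w1 and w2 are placeholder defaults (an assumption); the paper
    chooses them experimentally.
    """
    if w1 + w2 == 0:
        raise ValueError("w1 + w2 must be non-zero")
    return (w1 * sim + w2 * trust_value) / (w1 + w2)


# Illustrative values only.
t = trust(rec_trust=0.8, exp_trust=0.4)
print(t, effective_similarity(sim=0.6, trust_value=t))
```

Because the harmonic mean is dominated by the smaller of its two inputs, a user receives a high trust value only when both components are high.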

Phase 2. (Neighbourhood set construction based on distrust as a filter): At this stage, distrust is used to filter out neighbours before recommendations are generated, so that only the most trusted neighbours participate in the recommendation process. Thus, the distrust filter is applied to the neighbourhood set to remove the most distrusted users from it.

Phase 3. (Prediction and Recommendations): The neighbourhood set selected in Phase 2 is used to predict the ratings of all unseen items for an active user using Eq. 2. Finally, the top predicted items are recommended to the active user.
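
The distrust filter of Phase 2 could look like the sketch below. The text does not state the exact filtering rule, so the cut-off used here (a hypothetical `threshold` on the distrust value of Eq. 4) is an assumption, as are the function name and the treatment of users without a distrust value as not distrusted.

```python
def filter_by_distrust(neighbours, distrust_of, threshold=0.5):
    """Remove distrusted users from a neighbourhood set.

    neighbours:  list of (user, effective_similarity) pairs.
    distrust_of: dict {user: distrust value, e.g. from Eq. 4}.
                 Users missing from the dict are treated as having
                 no distrust (an assumption).
    threshold:   hypothetical cut-off; a neighbour is dropped when
                 its distrust value is at or above this value.
    """
    return [
        (user, sim)
        for user, sim in neighbours
        if distrust_of.get(user, 0.0) < threshold
    ]


# Illustrative values only.
candidates = [("b", 0.9), ("c", 0.7), ("d", 0.6)]
print(filter_by_distrust(candidates, {"c": 0.8, "d": 0.1}))
```

The surviving pairs can then be passed to the prediction step of Phase 3 in place of the unfiltered neighbourhood set.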

Fig. 1. Three phases of our proposed recommendation framework

4 Experiment Setup

To show the effectiveness of our proposed approaches, we conducted several experiments on the MovieLens data set.

4.1 Design of Experiments

The MovieLens data set contains 100,000 ratings provided by 943 users on 1682 movies using a 5-point rating scale [11]. We divided the whole MovieLens data set into 5 splits, each containing 200 users. For each split, we randomly selected 50 active users and treated the remaining 150 users as training users. Further, we divided the ratings of each active user into two sets, namely training movies (60%) and test movies (40%). Training movies are used for neighbourhood generation and trust-distrust computation. We repeated all experiments on each split five times in order to reduce any inherent bias. In all experiments we kept the neighbourhood size (k) fixed at a value chosen by testing different values of k.
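
The per-split protocol can be sketched as follows. This is an illustrative sketch only: the function names, the use of a seeded `random.Random` for repeatability, and rounding the 60% cut to the nearest integer are our assumptions, and how the five 200-user splits themselves are drawn is not specified in the text and is not modelled here.

```python
import random


def split_users(user_ids, n_active=50, seed=0):
    """Randomly pick active users within one split; the rest are training users."""
    rng = random.Random(seed)
    users = list(user_ids)
    rng.shuffle(users)
    return users[:n_active], users[n_active:]


def split_ratings(user_ratings, train_fraction=0.6, seed=0):
    """Divide one active user's ratings into training and test movies.

    user_ratings: dict {movie: rating}. Keys are sorted before
    shuffling so the result depends only on the seed.
    """
    rng = random.Random(seed)
    movies = sorted(user_ratings)
    rng.shuffle(movies)
    cut = round(train_fraction * len(movies))
    train = {m: user_ratings[m] for m in movies[:cut]}
    test = {m: user_ratings[m] for m in movies[cut:]}
    return train, test


# One split of 200 hypothetical user ids.
active, training = split_users(range(200))
train_movies, test_movies = split_ratings({m: 3 for m in range(10)})
print(len(active), len(training), len(train_movies), len(test_movies))
```

Repeating the experiment five times per split would correspond to calling these functions with five different seeds.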

4.2 Performance Evaluation

We used the following performance measures to evaluate our proposed approaches:

  • Mean absolute error (MAE): MAE is the average absolute difference between actual ratings and predicted ratings.

    $$\begin{aligned} MAE=\frac{1}{n}\sum _{i=1}^n |a_i-p_i| \end{aligned}$$
    (6)

    where, \(a_i\) is the actual rating,

    \(p_i\) is the predicted rating, and

    n is the total number of predicted items.

  • Precision: Precision measures the correctness of recommendations and is defined as the ratio of the number of good items recommended to the total number of recommended items.

    $$\begin{aligned} precision = \frac{Number\ of\ good\ items\ recommended}{Total\ number\ of\ recommended\ items} \end{aligned}$$
    (7)

  • Recall: Recall is a measure of completeness. It is the ratio of good items retrieved to all good items. In other words, it computes the fraction of all good movies that are recommended.

    $$\begin{aligned} recall = \frac{\left| good\ movies\ recommended \right| }{\left| all\ good\ movies \right| } \end{aligned}$$
    (8)

  • F-measure: The f-measure is the harmonic mean of precision and recall.

    $$\begin{aligned} f\text {-}measure = 2\times \frac{precision \times recall}{precision + recall} \end{aligned}$$
    (9)

  • Percentage of correct predictions (PCP): PCP is defined as the ratio of the number of correctly predicted items to the total number of rated items, expressed as a percentage.

    $$\begin{aligned} PCP = {\frac{Correctly\ predicted\ items}{Total\ number\ of\ rated\ items}} * 100 \end{aligned}$$
    (10)
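
The measures in Eqs. 6 to 10 can be computed as in the sketch below. It is a minimal sketch under assumptions: precision counts the good items among those recommended; which items count as good is decided by the caller (for example, test movies rated at or above some threshold, which the text does not specify); and a prediction is treated as correct for PCP when it rounds to the actual rating, which is our reading since the text does not define correctness. Function names are ours.

```python
def mae(actual, predicted):
    """Eq. 6: mean absolute error over paired actual/predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def precision_recall_f(recommended, good):
    """Eqs. 7-9: precision, recall and f-measure.

    recommended: items recommended to the user.
    good:        items the user actually considers good (caller-defined).
    Ratios with an empty denominator are reported as 0.0 (an assumption).
    """
    recommended, good = set(recommended), set(good)
    hits = len(recommended & good)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(good) if good else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def pcp(actual, predicted):
    """Eq. 10: percentage of correct predictions.

    Assumption: a prediction is correct when round(prediction) equals
    the actual rating. Python rounds exact halves to the nearest even
    integer, so ties such as 2.5 round down to 2.
    """
    correct = sum(1 for a, p in zip(actual, predicted) if round(p) == a)
    return 100.0 * correct / len(actual)


# Illustrative values only.
actual, predicted = [4, 3, 5], [3.6, 3.2, 4.1]
print(mae(actual, predicted), pcp(actual, predicted))
print(precision_recall_f(["m1", "m2", "m3", "m4"], ["m1", "m3", "m5"]))
```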

4.3 Experiments

We compared our approaches, namely Trust Distrust Pearson Collaborative Filtering (TD_PCF), Trust Pearson Collaborative Filtering with Propagation (TPCF_PROP) and Trust Distrust Pearson Collaborative Filtering with Propagation (TD_PCF_PROP), with the following approaches:

  • Pearson Collaborative Filtering (PCF) [15]

  • Trust based Collaborative Filtering (TCF) [1]

  • Trust Distrust Collaborative Filtering (TDCF) [9]

  • Trust Collaborative Filtering with propagation (TCF_PROP)

  • Trust distrust Collaborative Filtering with Propagation (TD_CF_PROP) [2]

  • Trust Based Weight Collaborative Filtering (TBW) [4, 9]

  • Trust Based Filtering Collaborative Filtering (TBF) [9]

  • Ensemble Trust Collaborative Filtering (ETCF) [16].

Table 2. Performance comparison of various approaches on MAE
Table 3. Performance comparison of various approaches on PCP
Table 4. Performance comparison of various approaches on Precision
Table 5. Performance comparison of various approaches on F-Measure

4.4 Result

To demonstrate the effectiveness of the proposed approaches TD_PCF_PROP, TPCF_PROP and TD_PCF, we analyzed the results for MAE, PCP, precision and f-measure as shown in Tables 2, 3, 4 and 5. In these tables, the last row gives the average performance over the five splits. Lower values of MAE imply better performance, whereas higher values of PCP, precision and f-measure indicate better performance. Based on these tables, our proposed approaches, namely TD_PCF_PROP, TPCF_PROP and TD_PCF, outperform the other approaches on these performance measures.

5 Conclusion

Recommender systems are one of the recent inventions for dealing with the information overload problem by identifying more relevant items for users based on their preferences. Collaborative filtering is the most successful recommendation technique in the area of RS. However, the new user and sparsity problems are major concerns. In this work, we have proposed a trust-distrust enhanced recommendation framework in which an effective similarity measure combines trust and similarity in the construction of the neighbourhood set. To obtain more effective neighbours, we have filtered distrusted users out of the neighbourhood set. Further, we have investigated the use of a trust-distrust based propagation operator in resolving the new user and sparsity problems. Finally, experimental results demonstrated that our proposed strategies were superior to traditional collaborative filtering and other existing trust-aware recommendation strategies.