1 Introduction

As an important means of information filtering, collaborative recommender systems provide an effective way to alleviate the problem of “information overload” on the internet and have been widely used in e-commerce websites (Xu et al. 2009). However, because of their natural openness, malicious users motivated by commercial profit can inject large numbers of fake profiles into a system’s rating database in order to manipulate its recommendation results. Such behavior is termed shilling attacks (Gunes et al. 2014), also called profile injection attacks (Aghili et al. 2011) or recommendation attacks (Zhang et al. 2006), and the injected fake user profiles are called attack profiles. Common shilling attack types include the random attack, average attack, AoP attack and bandwagon attack (Burke et al. 2006). According to their purpose, shilling attacks can be divided into push attacks and nuke attacks (Hurley et al. 2007), which aim to increase or decrease the recommendation frequency of the target item, respectively. A number of studies have shown that collaborative recommender systems are vulnerable to shilling attacks and that recommendation quality suffers as a result.

To reduce the influence of shilling attacks on recommender systems, one approach is to perform shilling attack detection before making recommendations. However, existing shilling attack detection methods are based on binary classification and are prone to filtering out genuine user profiles, which degrades recommendation accuracy. An alternative is to construct robust recommendation algorithms, i.e., to enhance the anti-attack ability of the algorithms themselves. In the field of recommender systems, robustness refers to the ability of a recommender system to provide stable recommendations when its rating database is contaminated with some portion of noisy or attack profiles (O’Mahony et al. 2004a). In this paper, we focus on developing a robust recommendation algorithm that offers both better anti-attack ability and higher recommendation accuracy.

In recent years, research on the robustness of recommendation algorithms has been conducted in the context of shilling attacks. A neighbor selection method is proposed by O’Mahony (2004). First, the active user’s neighbors are divided into two groups by clustering according to their ratings of the target item. Then, the standard deviation of the target item’s ratings is calculated for each group, and the group with the smaller standard deviation is deemed to consist of attack users. O’Mahony et al. (2004b) propose a profile utility calculation method based on an item’s inverse popularity to adjust the similarity weight between users, and then an intelligent neighborhood formation method to reduce the influence of attacks on recommendation results. Trust is introduced into collaborative recommendation, and a trust measurement method is proposed by Weng et al. (2006); compared with conventional similarity-based recommendation algorithms, the trust-based algorithm improves recommendation accuracy, coverage and robustness. O’Donovan and Smyth (2005) propose profile-level and item-level computational models of trust, and their experimental results show that the latter yields better recommendation accuracy than the former. Pitsilis and Marshall (2004) propose a trust computation method based on uncertain probability theory, which analyzes the trust relation between users from the perspective of subjective logic. A multidimensional credibility model based on source credibility theory is proposed by Kwon et al. (2009). However, it only accounts for the heterogeneity of users’ ratings and remains vulnerable when shilling attacks are present in the recommender system. Based on a theoretical analysis of knowledge-based and deduction-based trustworthiness, a multidimensional trustworthiness model is proposed by Maida et al. (2012), but no concrete calculation method is given.

Mobasher et al. (2006) propose two recommendation algorithms, one based on k-means clustering and the other on probabilistic latent semantic analysis (PLSA). Compared with the standard k-nearest neighbor approach, both are more robust, and the PLSA-based algorithm also achieves comparable recommendation accuracy. A recommendation algorithm based on association rule mining is presented by Sandvig et al. (2007); it achieves better robustness, but at the cost of coverage. Mehta et al. (2007) propose a matrix factorization algorithm based on an M-estimator (MMF), which resists outliers to some extent in comparison with PLSA and k-nearest neighbor, but only under moderate attacks. The least trimmed squares estimator based matrix factorization (LTSMF) (Cheng and Hurley 2010) shows better robustness and accuracy than MMF. However, the LTS estimator trims a portion of the largest residuals, which may cause a loss of recommendation accuracy.

In this paper, we propose a robust recommendation method based on suspicious users measurement and multidimensional trust (RRM-SUMMT). The main contributions are summarized as follows:

  • According to the features of user profiles, we construct an RVM-based classifier to identify and measure suspicious users in the user rating database and obtain each user’s suspicion degree.

  • Based on the user-item rating data of the recommender system, we mine the implicit trust relationships between users and combine them with the user suspicion degree to construct a reliable multidimensional trust model.

  • We design a robust recommendation algorithm that incorporates the reliable multidimensional trust model, the neighbor model and the matrix factorization model, and we demonstrate its effectiveness.

2 Background

In this section, we first introduce the theory of the relevance vector machine and then present the theory of matrix factorization.

2.1 Relevance vector machine

Relevance vector machine (RVM) is a machine learning technique within a Bayesian framework, proposed by Tipping in 2000. Like the support vector machine (SVM), RVM converts a nonlinear problem in the low-dimensional input space into a linear problem in a high-dimensional feature space via kernel mapping (Tipping 2001a, b). Compared with SVM, RVM has the following advantages: a) the choice of kernel function is not restricted by the Mercer condition; b) its parameters can be obtained without cross-validation; c) it has good generalization ability; d) it uses fewer relevance vectors, so the model is sparser; e) it provides the probability of its prediction, which can be used to analyze the uncertainty of the problem.

For binary classification, a training set \(\{(\boldsymbol{x}_i, t_i), i = 1,2,\ldots,N\}\) is given, where \(N\) is the number of samples and the target output is \(t_i \in \{0,1\}\), with 0 and 1 denoting the labels of the two classes. The RVM prediction model maps the linear combination y(x) into the interval (0, 1) via the sigmoid function in order to decide the class:

$$ \sigma (y) = \frac{1}{1 + e^{-y}} $$
(1)

where y(x) is the output of RVM, which is the linear combination of weight vector and nonlinear kernel function:

$$ y(\boldsymbol x) = \sum\limits_{i = 1}^{N} {{w_{i}}} k(\boldsymbol {x},{\boldsymbol {x}_{i}}) $$
(2)

where \(k(\boldsymbol{x}, \boldsymbol{x}_i)\) is a kernel function and \(w_i\) is the corresponding weight.

Assume that the samples are independently and identically distributed and that p(t|⋅) follows a Bernoulli distribution; the likelihood function of the sample set can then be expressed as:

$$ p(\boldsymbol {t}|\boldsymbol {w}) = \prod\limits_{i = 1}^{N} {\sigma {{\{ y({\boldsymbol {x}_{i}};\boldsymbol {w})\} }^{{t_{i}}}}{{[1 - \sigma \{ y({\boldsymbol {x}_{i}};\boldsymbol {w})\} ]}^{1 - {t_{i}}}}} $$
(3)

If maximum likelihood estimation is applied directly to formula (3), it is likely to cause over-fitting. To avoid this, RVM assumes that each parameter \(w_i\) follows a Gaussian conditional probability distribution with mean 0 and variance \(\alpha _{i}^{- 1}\), therefore:

$$ p(\boldsymbol {w}|\boldsymbol{\alpha} ) = \prod\limits_{i = 1}^{N} {N({w_{i}}|0,\alpha_{i}^{- 1})} $$
(4)

where α is the hyper-parameter vector that determines the prior distribution of the weight vector w and controls how far each weight may deviate from its zero mean.

According to Bayesian theory, given the likelihood of the sample set and the prior probability distribution of the weights, the posterior probability of the model parameters is:

$$ p(\boldsymbol {w},\boldsymbol {\alpha} |\boldsymbol {t}) = p(\boldsymbol {w}|\boldsymbol {t},\boldsymbol {\alpha})p(\boldsymbol {\alpha} |\boldsymbol {t}) $$
(5)

For a new sample \(\boldsymbol{x}_{\ast}\) to be classified, the predictive distribution of the corresponding output \(t_{\ast}\) is:

$$ p({t_{\ast}}|\boldsymbol {t}) = \int {p({t_{\ast}}|\boldsymbol {w},\boldsymbol {\alpha} )p(\boldsymbol {w}|\boldsymbol {t},\boldsymbol {\alpha} )p(\boldsymbol {\alpha} |\boldsymbol {t})} d\boldsymbol {w} d \boldsymbol {\alpha} $$
(6)

In formula (6), the weight posterior probability p(w|t,α) and the marginal likelihood p(α|t) cannot be obtained analytically by integration, so we use the Laplace approximation proposed by Mackay (1992).
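As a minimal numerical illustration of (1) and (2), the Python sketch below computes the class-membership probability of a test point from a set of relevance vectors and learned weights. The Gaussian kernel, its width and the example weights are hypothetical choices for illustration, not values from this paper.

```python
import numpy as np

def rbf_kernel(x, xi, gamma=0.5):
    """Gaussian (RBF) kernel k(x, x_i); gamma is an assumed width parameter."""
    return np.exp(-gamma * np.sum((x - xi) ** 2))

def rvm_predict_proba(x, relevance_vectors, weights, gamma=0.5):
    """Eq. (2): y(x) = sum_i w_i k(x, x_i); Eq. (1): sigma(y) = 1 / (1 + exp(-y))."""
    y = sum(w * rbf_kernel(x, xi, gamma) for w, xi in zip(weights, relevance_vectors))
    return 1.0 / (1.0 + np.exp(-y))

# Toy usage with two hypothetical relevance vectors and weights.
rvs = [np.array([0.1, 0.2]), np.array([0.8, 0.9])]
ws = [1.5, -0.7]
print(rvm_predict_proba(np.array([0.2, 0.3]), rvs, ws))  # a probability in (0, 1)
```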

2.2 Matrix factorization

The matrix factorization model reveals the latent characteristics of users and items in the rating data, represented by a user feature matrix P and an item feature matrix Q. Let \(\hat {\boldsymbol {R}} \) be the matrix of predicted ratings; the matrix factorization model is then defined as follows:

$$ \hat {\boldsymbol {R}} = {\boldsymbol {Q}}^{T}\boldsymbol {P} $$
(7)

where \(\boldsymbol{P} = (\boldsymbol{p}_1, \boldsymbol{p}_2, \ldots, \boldsymbol{p}_m)\) is an \(f \times m\) (\(f < m\)) matrix whose column \(\boldsymbol{p}_u\) is the f-dimensional feature vector of user u, and \(\boldsymbol{Q} = (\boldsymbol{q}_1, \boldsymbol{q}_2, \ldots, \boldsymbol{q}_n)\) is an \(f \times n\) (\(f < n\)) matrix whose column \(\boldsymbol{q}_i\) is the f-dimensional feature vector of item i. Let \({\hat r_{u,i}}\) be the predicted rating, which is expressed as follows:

$$ \hat r_{u,i}= \boldsymbol{q}_{\boldsymbol{i}}^{T}\boldsymbol{p}_{\boldsymbol{u}} $$
(8)

To learn the vectors \(\boldsymbol{p}_u\) and \(\boldsymbol{q}_i\), the regularized least squares problem is defined as follows:

$$ \boldsymbol{P}^{\boldsymbol{\ast}},\boldsymbol {Q}^{\boldsymbol{\ast}} = \arg \min \sum\limits_{{r_{u,i}} \ne \phi } {{{({r_{u,i}} - \boldsymbol{q}_{\boldsymbol{i}}^{T}\boldsymbol{p}_{\boldsymbol{u}})}^{2}} + \lambda ({{\left\| {{\boldsymbol {q}_{i}}} \right\|}^{2}} + {{\left\| {{\boldsymbol {p}_{u}}} \right\|}^{2}})} $$
(9)

where \(r_{u,i}\) is the observed rating, \(\lambda(\|\boldsymbol{q}_i\|^2 + \|\boldsymbol{p}_u\|^2)\) is a regularization term that prevents overfitting, and λ is a constant.

We use stochastic gradient descent to solve the above optimization problem; the update rules are defined as follows:

$$ {\boldsymbol{q}_{\boldsymbol{i}}} \leftarrow {\boldsymbol{q}_{\boldsymbol{i}}} + \gamma ({\boldsymbol{p}_{\boldsymbol{u}}}{e_{u,i}} - \lambda {\boldsymbol{q}_{\boldsymbol{i}}}) $$
(10)
$$ {\boldsymbol{p}_{\boldsymbol{u}}} \leftarrow {\boldsymbol{p}_{\boldsymbol{u}}} + \gamma ({\boldsymbol{q}_{\boldsymbol{i}}}{e_{u,i}} - \lambda {\boldsymbol{p}_{\boldsymbol{u}}}) $$
(11)

where \(e_{u,i}\) is the residual between the observed rating and the predicted rating, computed as follows:

$$ {e_{u,i}} = {r_{u,i}} - {\hat r_{u,i}} $$
(12)
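The following Python sketch illustrates the updates (10)–(12) for the basic matrix factorization model; the learning rate, regularization constant, latent dimension and iteration count are hypothetical settings for illustration, not the paper’s parameters.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, f=10, gamma=0.01, lam=0.02, n_epochs=20, seed=0):
    """Basic SGD matrix factorization: learns P (f x m) and Q (f x n) from (u, i, r) triples."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((f, n_users))   # user feature matrix
    Q = 0.1 * rng.standard_normal((f, n_items))   # item feature matrix
    for _ in range(n_epochs):
        for u, i, r in ratings:
            e = r - Q[:, i] @ P[:, u]                          # Eq. (12): residual
            Q[:, i] += gamma * (e * P[:, u] - lam * Q[:, i])   # Eq. (10)
            P[:, u] += gamma * (e * Q[:, i] - lam * P[:, u])   # Eq. (11)
    return P, Q

# Toy usage with hypothetical (user, item, rating) triples.
triples = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0)]
P, Q = train_mf(triples, n_users=2, n_items=3)
print(Q[:, 2] @ P[:, 0])  # predicted rating of user 0 on item 2, Eq. (8)
```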

3 The proposed method

To ensure recommendation quality, we first use an RVM-based classifier to identify and measure suspicious users in the rating database and obtain the user suspicion degree. Then we mine the implicit trust relationships between users and incorporate the user suspicion degree to construct a reliable multidimensional trust model. Finally, we combine the trust model, the neighbor model and matrix factorization to design the robust recommendation algorithm RRM-SUMMT. The framework of RRM-SUMMT is shown in Fig. 2.

3.1 Suspicious users measurement based on RVM

In this section, we propose an RVM-based method for suspicious users measurement (SUM_RVM). As shown in Fig. 1, SUM_RVM consists of two stages: the training stage of the RVM-based classifier and the measurement stage of suspicious users.

Fig. 1 Framework of the suspicious users measurement based on RVM

Fig. 2 Framework of the robust recommendation algorithm RRM-SUMMT

During training, the training set includes genuine and attack profiles, labeled 0 and 1 respectively. To represent both the training set and the user profiles to be measured in feature space, we use the following features proposed by Williams (2007a, b):

  • 6 generic features: WDMA, RDMA, WDA, Length Variance, DegSim and DegSim’;

  • 3 Average attack model features (3 for push): FMV, FMD, PV;

  • 2 Random attack model features (2 for push): FAC, FMD;

  • 2 Bandwagon attack model features (2 for push): FAC, FMD.

The training set is thus expressed as a set of feature vectors, which is used to train the relevance vector machine and generate the RVM-based classifier.

During measurement, the user profiles to be measured are likewise expressed as feature vectors, and the RVM-based classifier outputs the prediction probability of each user belonging to the attack class, which we call the user suspicion degree. The larger the prediction probability, the higher the user suspicion degree.

Let Sus_degree_set denote the set of measurement results. The algorithm of suspicious users measurement based on RVM (SUM_RVM) is described as follows.

Algorithm 1 SUM_RVM
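A minimal sketch of the SUM_RVM measurement stage is given below, assuming an already trained probabilistic classifier (any RVM implementation exposing a scikit-learn-style predict_proba method would do); the feature extraction function and variable names are placeholders, not the paper’s code.

```python
import numpy as np

def sum_rvm(profiles, extract_features, classifier):
    """Measure suspicious users: return {user_id: suspicion degree}.
    sus_degree is the predicted probability of the attack class (label 1),
    obtained from a trained probabilistic classifier such as an RVM."""
    sus_degree_set = {}
    for user_id, profile in profiles.items():
        x = np.asarray(extract_features(profile)).reshape(1, -1)  # feature vector
        sus_degree_set[user_id] = float(classifier.predict_proba(x)[0, 1])
    return sus_degree_set
```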

3.2 Reliable multidimensional trust model

In this section, we introduce three trust attributes (i.e., rating authority, rating objectivity, and rating similarity) and combine the user suspicion degree with them to construct a reliable multidimensional trust model.

Definition 1 (Rating authority.)

The rating authority of user v ∈ U, denoted as A(v), is defined as follows:

$$ A(v) = \frac{{\sum\nolimits_{i \in {I_{v}}} {Trust\_item(v,i)} }}{{|{I_{v}}|}} \times (1 - sus\_degree(v)) $$
(13)
$$ Trust\_item(v,i) = \frac{{\sum\nolimits_{c \in {U_{i}}} {Correc{t_{v}}(c,i)} }}{{|{U_{i}}|}} $$
(14)
$$ Correct_{v}(c,i) = \left\{ \begin{array}{ll} 1, & |{p_{c,i}} - {r_{c,i}}| < \varepsilon \\ 0, & \text{otherwise} \end{array} \right. $$
(15)

where \(p_{c,i}\) is the predicted rating of user c on item i computed with user v as the sole recommendation partner of user c; \(r_{c,i}\) is the real rating of user c on item i; \(I_v\) is the set of items rated by user v; for item \(i \in I_v\), the set of users who have rated item i is \(U_i = \{c \mid c \ne v, c \in U, r_{c,i} \ne 0\}\); sus_degree(v) is the suspicion degree of user v; and ε = 1.8 (O’Donovan and Smyth 2005).
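A sketch of the rating authority computation (13)–(15) is shown below. Following the definition, a prediction from the single neighbor v is needed; here it is approximated with the usual mean-centered single-neighbor formula \(p_{c,i} = \bar r_c + (r_{v,i} - \bar r_v)\), which is an assumption rather than a detail stated in the text.

```python
def rating_authority(v, ratings, user_mean, sus_degree, eps=1.8):
    """Eqs. (13)-(15): fraction of 'correct' single-neighbor predictions made using v,
    averaged over the items rated by v and weighted by (1 - sus_degree(v)).
    ratings: dict user -> dict item -> rating; user_mean: dict user -> average rating."""
    items_v = ratings[v]
    trust_items = []
    for i, r_vi in items_v.items():
        raters = [c for c in ratings if c != v and i in ratings[c]]
        if not raters:
            continue
        correct = 0
        for c in raters:
            p_ci = user_mean[c] + (r_vi - user_mean[v])  # assumed single-neighbor prediction
            if abs(p_ci - ratings[c][i]) < eps:          # Eq. (15)
                correct += 1
        trust_items.append(correct / len(raters))        # Eq. (14)
    if not trust_items:
        return 0.0
    return (sum(trust_items) / len(items_v)) * (1.0 - sus_degree[v])  # Eq. (13)
```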

Definition 2 (Rating objectivity.)

The rating objectivity of user v ∈ U, denoted as O(v), is defined as follows:

$$ O(v) = \left( 1 - \frac{{\sum\nolimits_{i \in {I_{v}}} {|{r_{v,i}} - {{\bar r}_{i}}|} }}{{|{I_{v}}|}}\right) \times (1 - sus\_degree(v)) $$
(16)

where \({\bar r_{i}}\) is the average rating of item i, \(I_v\) is the set of items rated by user v, and sus_degree(v) is the suspicion degree of user v.

Definition 3 (Rating similarity.)

For users u ∈ U and v ∈ U, the rating similarity between user u and user v, denoted as S(u,v), is defined as follows:

$$ S(u,v) = \frac{{\sum\nolimits_{i \in I(u,v)} {({r_{u,i}} - {{\bar r}_{u}})({r_{v,i}} - {{\bar r}_{v}})} }}{{\sqrt {\sum\nolimits_{i \in I(u,v)} {{{({r_{u,i}} - {{\bar r}_{u}})}^{2}}} } \sqrt {\sum\nolimits_{i \in I(u,v)} {{{({r_{v,i}} - {{\bar r}_{v}})}^{2}}} } }} \times (1 - sus\_degree(v)) $$
(17)

where \({\bar r_{u}}\) and \({\bar r_{v}}\) are the average ratings of user u and user v respectively, I(u,v) is the set of items co-rated by user u and user v, and sus_degree(v) is the suspicion degree of user v.

Based on the above definitions, we compute the reliable multidimensional trust as the weighted sum of the trust attributes; the trust value of user u for user v is given by the following equation:

$$ Trus{t_{u,v}} = {w_{A}}A(v) + {w_{O}}O(v) + {w_{S}}S(u,v) $$
(18)

where \(w_A\), \(w_O\) and \(w_S\) are the importance weights of rating authority, rating objectivity and rating similarity, respectively. We use the analytic hierarchy process to compute the importance weight of each trust attribute.

The analytic hierarchy process (AHP) (Dyer 1992) is a hierarchical weighting method for decision analysis proposed by the American operations researcher T. L. Saaty. AHP formalizes the reasoning behind a decision using relatively little quantitative information, providing a simple procedure for complicated multi-objective, multi-criteria decision-making problems.

The main steps are as follows:

Step 1: We implement three recommendation strategies based on the above trust attributes (i.e., rating authority, rating objectivity, rating similarity). For each strategy, we calculate its recommendation accuracy, coverage and prediction shift.

Step 2: Take the recommendation accuracy, coverage and prediction shift as the assessment criteria for the importance of the trust attributes and construct the hierarchy model.

Step 3: Compare the recommendation accuracy, coverage and prediction shift of the three strategies and use the “1-9 scale method” to construct the pairwise comparison matrices.

Step 4: If the pairwise comparison matrices pass the consistency check, compute the weight vector of each pairwise comparison matrix using the geometric mean method, giving the weight vectors of the trust attributes with respect to each assessment criterion.

Step 5: Combine the weight vectors across the trust attributes to obtain the importance weight vector \(\boldsymbol{w} = [w_1\ w_2\ w_3]^T\), where \(w_1\), \(w_2\) and \(w_3\) denote the importance weights of rating authority, rating objectivity and rating similarity, respectively (a sketch of this computation follows the steps).
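As a hedged illustration of Steps 3–5, the sketch below derives weights from a 3×3 pairwise comparison matrix with the geometric mean method and checks consistency; the example matrix entries are purely hypothetical, and 0.58 is the standard random consistency index for a 3×3 matrix.

```python
import numpy as np

def ahp_weights(pairwise, ri=0.58):
    """Geometric-mean weights of a pairwise comparison matrix plus consistency ratio.
    ri: random consistency index (0.58 for a 3x3 matrix)."""
    A = np.asarray(pairwise, dtype=float)
    n = A.shape[0]
    gm = np.prod(A, axis=1) ** (1.0 / n)   # row geometric means
    w = gm / gm.sum()                      # normalized weights
    lam_max = (A @ w / w).mean()           # estimate of the principal eigenvalue
    ci = (lam_max - n) / (n - 1)           # consistency index
    cr = ci / ri                           # consistency ratio (accept if < 0.1)
    return w, cr

# Hypothetical comparison of the three trust attributes under one criterion.
M = [[1, 3, 2],
     [1/3, 1, 1/2],
     [1/2, 2, 1]]
w, cr = ahp_weights(M)
print(w, cr)
```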

Based on the above multidimensional trust model, we can calculate the degree of trust between users. The algorithm of user trust computation (UTC) is described as follows:

Algorithm 2 UTC

Algorithm 2 consists of two parts: the first part (lines 1 to 4) obtains the suspicion degree of user v and computes user v’s rating authority, rating objectivity and rating similarity; the second part (lines 5 to 6) computes the trust of user u in user v and returns the trust value.
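Below is a compact, hedged sketch of the UTC computation following Definitions 2–3 and Eq. (18); the data structures (dictionaries of ratings and precomputed means) and the parameter authority_v, taken from the rating authority sketch under Definition 1, are illustrative assumptions rather than the paper’s implementation.

```python
import numpy as np

def rating_objectivity(v, ratings, item_mean, sus_degree):
    """Eq. (16): one minus the mean absolute deviation of v's ratings from the item means,
    weighted by (1 - sus_degree(v)). ratings: dict user -> dict item -> rating."""
    items_v = ratings[v]
    dev = sum(abs(r - item_mean[i]) for i, r in items_v.items()) / len(items_v)
    return (1.0 - dev) * (1.0 - sus_degree[v])

def rating_similarity(u, v, ratings, user_mean, sus_degree):
    """Eq. (17): Pearson correlation over co-rated items, weighted by (1 - sus_degree(v))."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    du = np.array([ratings[u][i] - user_mean[u] for i in common])
    dv = np.array([ratings[v][i] - user_mean[v] for i in common])
    denom = np.sqrt((du ** 2).sum()) * np.sqrt((dv ** 2).sum())
    if denom == 0:
        return 0.0
    return float(du @ dv / denom) * (1.0 - sus_degree[v])

def user_trust(u, v, ratings, user_mean, item_mean, sus_degree, w, authority_v):
    """Eq. (18): Trust_{u,v} = w_A*A(v) + w_O*O(v) + w_S*S(u,v);
    authority_v is A(v), e.g. computed with the rating_authority sketch above."""
    O = rating_objectivity(v, ratings, item_mean, sus_degree)
    S = rating_similarity(u, v, ratings, user_mean, sus_degree)
    return w[0] * authority_v + w[1] * O + w[2] * S
```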

3.3 Robust recommendation algorithm

By combining the multidimensional trust model, the matrix factorization model and the neighbor model, we propose a robust recommendation algorithm based on suspicious users measurement and multidimensional trust (RRM-SUMMT). The prediction rating is computed as

$$ {\hat r_{u,i}} = \mu + {b_{u}} + {b_{i}} + \boldsymbol{q}_{\boldsymbol{i}}^{T}\boldsymbol{p}_{\boldsymbol{u}} + |R(u){|^{- \,\frac{1}{2}}}\sum\limits_{v \in R(u)} {({r_{v,i}} - {{\bar r}_{v}})Trus{t_{u,v}}} $$
(19)

where μ is the average of all ratings in the rating database; \(b_u\) and \(b_i\) are the observed deviations of user u and item i, respectively, from the average; R(u) is the set of similar users of the target user u; \(r_{v,i}\) is the rating of user v on item i; \({\bar r_{v}}\) is the average rating of user v; and \(Trust_{u,v}\) is the trust of user u in user v.

$$ {b_{u}} = \frac{1}{{\left| {{I_{u}}} \right|}}\sum\limits_{i \in {I_{u}}} {({r_{u,i}} - \mu )} $$
(20)
$$ {b_{i}} = \frac{1}{{\left| {{U_{i}}} \right|}}\sum\limits_{u \in {U_{i}}} {({r_{u,i}} - {b_{u}} - \mu )} $$
(21)

where \(I_u\) is the set of items rated by user u, and \(U_i\) is the set of users who have rated item i.

Formula (19) can be optimized by stochastic gradient descent, with the following update rules:

$$ {\boldsymbol{q}_{\boldsymbol{i}}} \leftarrow \boldsymbol{q}_{\boldsymbol{i}} + \gamma ({e_{u,i}}\boldsymbol{p}_{\boldsymbol{u}} - \lambda \boldsymbol{q}_{\boldsymbol{i}}) $$
(22)
$$ {\boldsymbol{p}_{\boldsymbol{u}}} \leftarrow \boldsymbol{p}_{\boldsymbol{u}} + \gamma ({e_{u,i}}\boldsymbol{q}_{\boldsymbol{i}} - \lambda \boldsymbol{p}_{\boldsymbol{u}}) $$
(23)
$$ {b_{u}} \leftarrow {b_{u}} + \gamma ({e_{u,i}} - \lambda {b_{u}}) $$
(24)
$$ {b_{i}} \leftarrow {b_{i}} + \gamma ({e_{u,i}} - \lambda {b_{i}}) $$
(25)

The core idea of RRM-SUMMT is summarized as follows:

Algorithm 3 RRM-SUMMT

Algorithm 3 consists of three parts: the first part (lines 1 to 7) initializes the user and item feature matrices and calculates \(b_u\) and \(b_i\); the second part (lines 8 to 24) performs the model training and obtains the optimal user and item feature matrices; the third part (lines 25 to 26) computes the predicted rating \({\hat r_{t,j}}\) for user t on item j.
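The sketch below illustrates the training loop of Eqs. (19)–(25), assuming the trust values and neighbor sets have been precomputed; the learning rate, regularization constant, latent dimension and epoch count are hypothetical settings, not the paper’s parameters.

```python
import numpy as np

def train_rrm_summt(ratings, trust, neighbors, user_mean, n_users, n_items,
                    f=10, gamma=0.005, lam=0.02, n_epochs=20, seed=0):
    """ratings: list of (u, i, r) triples; trust[(u, v)]: Trust_{u,v};
    neighbors[u]: set R(u) of similar users; user_mean[v]: average rating of v."""
    rng = np.random.default_rng(seed)
    rated = {(u, i): r for u, i, r in ratings}
    mu = np.mean([r for _, _, r in ratings])
    P = 0.1 * rng.standard_normal((f, n_users))
    Q = 0.1 * rng.standard_normal((f, n_items))
    bu = np.zeros(n_users)
    bi = np.zeros(n_items)
    # Initialize the biases following Eqs. (20)-(21).
    for u in range(n_users):
        ru = [r for (uu, _, r) in ratings if uu == u]
        bu[u] = np.mean([r - mu for r in ru]) if ru else 0.0
    for i in range(n_items):
        ri = [(u, r) for (u, ii, r) in ratings if ii == i]
        bi[i] = np.mean([r - bu[u] - mu for u, r in ri]) if ri else 0.0

    def neighbor_term(u, i):
        # |R(u)|^(-1/2) * sum_{v in R(u)} (r_{v,i} - mean_v) * Trust_{u,v}, last term of Eq. (19)
        R = [v for v in neighbors.get(u, ()) if (v, i) in rated]
        if not R:
            return 0.0
        s = sum((rated[(v, i)] - user_mean[v]) * trust.get((u, v), 0.0) for v in R)
        return s / np.sqrt(len(R))

    for _ in range(n_epochs):
        for u, i, r in ratings:
            pred = mu + bu[u] + bi[i] + Q[:, i] @ P[:, u] + neighbor_term(u, i)
            e = r - pred
            Q[:, i] += gamma * (e * P[:, u] - lam * Q[:, i])   # Eq. (22)
            P[:, u] += gamma * (e * Q[:, i] - lam * P[:, u])   # Eq. (23)
            bu[u] += gamma * (e - lam * bu[u])                 # Eq. (24)
            bi[i] += gamma * (e - lam * bi[i])                 # Eq. (25)
    return mu, bu, bi, P, Q
```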

4 Experimental evaluations

4.1 Dataset

We use the MovieLens 100K dataset as the experimental data (Miller et al. 2003). The dataset consists of 100,000 ratings of 1682 movies by 943 users. All ratings are integers between 1 and 5, where 1 is the lowest (disliked) and 5 is the highest (most liked), and each user has rated at least 20 items. The dataset is randomly split into training and test sets in an 80:20 ratio.
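For reference, a minimal loading and splitting sketch using pandas is shown below; the local file path is an assumption, and u.data is the tab-separated ratings file distributed with MovieLens 100K.

```python
import pandas as pd

# Path to the MovieLens 100K ratings file (assumed local location).
cols = ["user_id", "item_id", "rating", "timestamp"]
ratings = pd.read_csv("ml-100k/u.data", sep="\t", names=cols)

# Random 80:20 split into training and test sets.
train = ratings.sample(frac=0.8, random_state=42)
test = ratings.drop(train.index)
print(len(train), len(test))  # roughly 80,000 / 20,000
```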

4.2 Evaluation Metrics

We use mean absolute error (MAE) and prediction shift (PS) to evaluate the performance of RRM-SUMMT.

MAE is commonly used in recommender systems as the measurement of accuracy, and it is defined as follows (Kantor et al. 2011):

$$ MAE = \frac{{\sum\limits_{j = 1}^{N} {\left| {{r_{u,i}} - {{\hat r}_{u,i}}} \right|} }}{N} $$
(26)

where \(r_{u,i}\) is the real rating of user u on item i, \({\hat r_{u,i}}\) is the predicted rating of user u on item i, and N is the total number of predictions.

PS is commonly used in recommender systems as the measurement of robustness, and it is computed as follows (Kantor et al. 2011):

$$ PS = \frac{{\sum\limits_{u \in U} {\left| {{{\hat r^{\prime}}_{u,i}} - {{\hat r}_{u,i}}} \right|} }}{N} $$
(27)

where \({\hat r_{u,i}}\) and \({\hat r^{\prime}_{u,i}}\) are the predicted ratings of user u on item i before and after the target item i is attacked, respectively, and N is the total number of predictions.
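The two metrics can be computed as in the short sketch below, where predictions are given as parallel arrays; this is a straightforward reading of Eqs. (26)–(27), not the authors’ evaluation code.

```python
import numpy as np

def mae(true_ratings, predicted):
    """Eq. (26): mean absolute error over all N predictions."""
    return float(np.mean(np.abs(np.asarray(true_ratings) - np.asarray(predicted))))

def prediction_shift(pred_before, pred_after):
    """Eq. (27): mean absolute change in the predicted ratings of the target item
    before and after the attack profiles are injected."""
    return float(np.mean(np.abs(np.asarray(pred_after) - np.asarray(pred_before))))
```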

4.3 Experimental results and analysis

To evaluate its performance, we compare RRM-SUMMT with the following methods.

  (1) MMF: Matrix factorization algorithm based on an M-estimator, proposed by Mehta et al. (2007).

  (2) LTSMF: Matrix factorization algorithm based on the least trimmed squares estimator, proposed by Cheng and Hurley (2010).

  (3) CF: User-based collaborative filtering recommendation algorithm.

  (4) WItem: Recommendation algorithm based on trust, proposed by O’Donovan and Smyth (2005).

To evaluate the robustness of the algorithms, the average attack, random attack, bandwagon attack and AoP attack are injected into the training set, with attack sizes of 1 %, 2 %, 4 %, 6 %, 8 % and 10 % and filler sizes of 3 % and 5 %. The experimental comparison of the five algorithms is shown in Tables 1, 2, 3 and 4 and Figs. 3, 4, 5 and 6.

Table 1 Comparison of MAE for five algorithms with random attack
Table 2 Comparison of MAE for five algorithms with average attack
Table 3 Comparison of MAE for five algorithms with bandwagon attack
Table 4 Comparison of MAE for five algorithms with AoP attack
Fig. 3 Comparison of PS for five algorithms with random attack
Fig. 4 Comparison of PS for five algorithms with average attack
Fig. 5 Comparison of PS for five algorithms with bandwagon attack
Fig. 6 Comparison of PS for five algorithms with AoP attack

From Tables 1 to 4, we can see that under the same attack size and filler size, the MAE of RRM-SUMMT is the lowest. Take the MAE under the random attack with a filler size of 5 % as an example: compared with MMF, LTSMF, CF and WItem, the MAE of RRM-SUMMT is reduced by 1.3 %, 1.2 %, 2.9 % and 2.3 %, respectively. Overall, the MAE values of MMF and LTSMF lie between 0.7457 and 0.7516, those of WItem and CF between 0.7551 and 0.7633, and those of RRM-SUMMT between 0.7377 and 0.7422. Since a lower MAE means better recommendation accuracy, the accuracy of RRM-SUMMT is superior to that of MMF, LTSMF, CF and WItem. This shows that combining the neighbor model with matrix factorization improves the recommendation accuracy of the algorithm.

From Figs. 3 to 6, it can be seen that under the four attack types considered in this paper, for a fixed filler size, the PS values of all five algorithms rise as the attack size increases; that is, their anti-attack ability gradually drops as more attack users are injected. For the same filler size and attack size, the PS value of RRM-SUMMT is clearly smaller than those of MMF, LTSMF, CF and WItem. Take the PS under the random attack with a 3 % filler size as an example: compared with MMF, LTSMF, CF and WItem, the PS of RRM-SUMMT is reduced by 22.4 %, 17.4 %, 36.2 % and 18.3 %, respectively.

Under all four attack types, the PS values of CF and WItem increase markedly, and the PS values of MMF and LTSMF also increase markedly with attack size under the AoP attack, whereas the PS values of RRM-SUMMT change little in all four cases. Since a lower PS means better robustness, RRM-SUMMT has strong anti-attack ability, i.e., good robustness. The reason is that RRM-SUMMT incorporates the multidimensional trust model, which greatly improves the reliability of the target user's neighbors and thus reduces the influence of attack profiles on the recommendation results.

5 Conclusions and future work

Improving the robustness of the algorithm is an effective approach to increasing a recommender system’s anti-attack ability. In this paper, we propose a robust recommendation algorithm based on suspicious users measurement and multidimensional trust. First, to measure the likelihood that a user is an attacker, we propose an RVM-based suspicious user measurement method. Then, we mine users’ trust attributes from rating authority, rating objectivity and rating similarity, incorporating the suspicion degree as a weight factor of these attributes, and propose a reliable multidimensional trust model to measure user reliability. Finally, we combine the multidimensional trust model, the matrix factorization model and the neighbor model. The experiments demonstrate that the proposed algorithm not only increases robustness but also improves accuracy.

Future work can proceed in two directions. On the one hand, we will improve the reliable multidimensional trust model proposed in this paper; in particular, we will design more effective features to characterize attack profiles and use ensemble methods to detect shilling attacks in order to further improve classification performance. On the other hand, we will introduce items’ attribute information to address the data sparsity issue and further improve recommendation accuracy; in particular, we will explore more effective methods to complete the missing values of the user rating matrix.