
1 Problem Definition

The Service Recommendation Problem (SRP) can be described as the problem of recommending, from many candidate services with equivalent functionality, a service that both provides the required functionality and has optimal QoS performance for the target user [1], subject to side constraints. The SRP plays an important role in the fields of service computing and cloud computing. There exists a wide variety of SRPs and a broad range of recent work on this class of problems, for example, the surveys of Pranjali M. Patil and R.B. Wagh [2], Ruchita V. Tatiya et al. [3], M. Subha and M. Uthaya Banu [4], the recent survey by Ashwini Puri and Mansi Bhonsle [5], and other works [6, 7]. However, these works lack comparison with other classical works. Inspired by this, our paper surveys the main traditional and recent approaches developed for the SRP, taking as a starting point the Collaborative Filtering method proposed by Z. Zheng et al. [8, 9].

Z. Zheng et al. [8, 9] proposed an approach for predicting the QoS values of services by combining the traditional user-based and item-based collaborative filtering methods; based on the predicted QoS values, they then recommended the optimal service to an active service user.

In this section, we present the SRP definition and introduce the approach’s main steps for service recommendation based on Z. Zheng’s papers [8, 9].

Given a recommender system consisting of M service users and N service items, the relationship between service users and service items is denoted by an M × N matrix, called the user-item matrix. To simplify the description of our paper, we formalize the Service Recommendation Problem as follows:

\( U = \{ u_{1} ,u_{2} , \ldots ,u_{m} \} \) is a set of service users, where m is the total number of service users registered in the recommendation system.

\( S = \{ s_{1} ,s_{2} , \ldots ,s_{n} \} \) is a set of services, where n is the total number of services collected by the recommendation system.

\( M = \{ r_{i,j} \mid 1 \le i \le m,1 \le j \le n\} \) is the user-service matrix, where \( r_{i,j} \) is a vector of QoS attribute values (e.g., response time, failure rate, etc.) acquired from service user \( u_i \) invoking service item \( s_j \). If service user \( u_i \) did not invoke service item \( s_j \) before, then \( r_{i,j} = 0 \).

Next, we take the user-based approach as an example to introduce the main steps of the approach:

Step 1: Compute the similarity of different service users as well as service items.

$$ sim(a,u) = \frac{{\sum\limits_{i \in I} {(r_{a,i} - \bar{r}_{a} )(r_{u,i} - \bar{r}_{u} )} }}{{\sqrt {\sum\limits_{i \in I} {(r_{a,i} - \bar{r}_{a} )^{2} } } \sqrt {\sum\limits_{i \in I} {(r_{u,i} - \bar{r}_{u} )^{2} } } }} $$
(1)

where \( sim(a,u) \) denotes the degree of similarity between user a and user u, and I is the set of service items invoked by both user a and user u. \( \bar{r}_{a} \) and \( \bar{r}_{u} \) represent the average QoS values for user a and user u, respectively.

A significance weight is employed to reduce the influence of a small number of commonly invoked items:

$$ sim^{{\prime }} (a,u) = \frac{{2 \times \left| {I_{a} \cap I_{u} } \right|}}{{\left| {I_{a} } \right| + \left| {I_{u} } \right|}}sim(a,u) $$
(2)

where \( \left| {I_{a} \cap I_{u} } \right| \) is the number of service items invoked by both users, and \( \left| {I_{a} } \right| \) and \( \left| {I_{u} } \right| \) are the numbers of services invoked by user a and user u, respectively.
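As an illustration, Formulas (1) and (2) can be sketched in Python as follows (a minimal sketch, assuming each user's QoS records are stored as a dict from service id to a single scalar QoS value, and taking the averages of Formula (1) over the co-invoked services):

```python
import math

def pcc(r_a, r_u):
    """Formula (1): Pearson correlation over co-invoked services.

    r_a, r_u: dicts mapping service id -> QoS value (e.g. response time);
    a missing key means the user never invoked that service.
    """
    common = r_a.keys() & r_u.keys()
    if len(common) < 2:
        return 0.0
    mean_a = sum(r_a[s] for s in common) / len(common)
    mean_u = sum(r_u[s] for s in common) / len(common)
    num = sum((r_a[s] - mean_a) * (r_u[s] - mean_u) for s in common)
    den = math.sqrt(sum((r_a[s] - mean_a) ** 2 for s in common)) * \
          math.sqrt(sum((r_u[s] - mean_u) ** 2 for s in common))
    return num / den if den else 0.0

def weighted_sim(r_a, r_u):
    """Formula (2): damp the PCC when few services are co-invoked."""
    if not r_a or not r_u:
        return 0.0
    shared = len(r_a.keys() & r_u.keys())
    weight = 2 * shared / (len(r_a) + len(r_u))
    return weight * pcc(r_a, r_u)
```

With perfectly correlated records over all invoked services, `weighted_sim` returns the raw PCC; the weight shrinks it as the overlap between the two users' invocation histories shrinks.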

Step 2: Find the top k similar neighbors for the active user, excluding neighbors with similarities less than or equal to 0.

Predict the missing values as follows:

$$ P(r_{u,i} ) = \bar{r}_{u} + \frac{{\sum\nolimits_{{u_{a} \in S(u)}} {sim^{{\prime }} (u_{a} ,u)(r_{{u_{a} ,i}} - \bar{r}_{{u_{a} }} )} }}{{\sum\nolimits_{{u_{a} \in S(u)}} {sim^{{\prime }} (u_{a} ,u)} }} $$
(3)

where \( P(r_{u,i} ) \) is the predicted QoS value for the active user u on service item i, and \( S(u) \) is the set of similar users of service user u.

Two confidence weights are employed to balance the results of the two prediction methods:

$$ con_{u} = \sum\limits_{{u_{a} \in S(u)}} {\frac{{sim^{{\prime }} (u_{a} ,u)}}{{\sum\nolimits_{{u_{a} \in S(u)}} {sim^{{\prime }} (u_{a} ,u)} }}} \times sim^{{\prime }} (u_{a} ,u) $$
(4)

where \( con_{u} \) is the prediction confidence of the user-based method.

$$ w_{u} = \frac{{con_{u} \times \lambda }}{{con_{u} \times \lambda + con_{i} \times (1 - \lambda )}} $$
(5)
$$ w_{i} = \frac{{con_{i} \times (1 - \lambda )}}{{con_{u} \times \lambda + con_{i} \times (1 - \lambda )}} $$
(6)

where \( w_{u} \) and \( w_{i} \) combine the confidence weights (\( con_{u} \) and \( con_{i} \)) with the tunable parameter λ.

Step 3: Predict QoS for the active user. The final prediction is defined as

$$ P(r_{a,i} ) = w_{u} \times P_{u} (r_{a,i} ) + w_{i} \times P_{i} (r_{a,i} ) $$
(7)

where \( P_{u} (r_{a,i} ) \) is the user-based prediction of Formula (3) and \( P_{i} (r_{a,i} ) \) is its item-based counterpart.
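The prediction and blending steps can be sketched as follows (a minimal sketch under assumed data layouts: `neighbors` stands for the top-k set S(u) of Step 2, each entry carrying a neighbor's similarity, its QoS value on the target item, and its average QoS; blending the two predicted values in `combine` is an assumption of this sketch):

```python
def predict_user_based(u_mean, neighbors):
    """Formula (3): user-based prediction of a missing QoS value.

    u_mean: the active user's average QoS value;
    neighbors: list of (similarity, neighbor_qos_on_item, neighbor_mean)
    triples for the similar users S(u) who invoked the target item.
    """
    total = sum(sim for sim, _, _ in neighbors)
    if total == 0:
        return u_mean
    deviation = sum(sim * (q - m) for sim, q, m in neighbors)
    return u_mean + deviation / total

def confidence(neighbors):
    """Formula (4): similarity-weighted average of the similarities."""
    total = sum(sim for sim, _, _ in neighbors)
    if total == 0:
        return 0.0
    return sum((sim / total) * sim for sim, _, _ in neighbors)

def combine(pred_user, pred_item, con_u, con_i, lam=0.5):
    """Formulas (5)-(7): confidence-weighted blend of the user-based
    and item-based predictions."""
    denom = con_u * lam + con_i * (1 - lam)
    if denom == 0:
        return (pred_user + pred_item) / 2
    w_u = con_u * lam / denom           # Formula (5)
    w_i = con_i * (1 - lam) / denom     # Formula (6)
    return w_u * pred_user + w_i * pred_item
```

A larger λ shifts the blend toward the user-based prediction; λ = 0.5 weighs both methods purely by their confidences.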

Based on the service recommendation problem proposed by Z. Zheng et al. [8, 9], our paper surveys approaches for the SRP, which can be classified into traditional and recent approaches.

2 Traditional Approaches

Based on the formulation of the service recommendation problem by Z. Zheng et al. [8, 9], traditional approaches for the SRP can be classified into three categories: (1) personalized collaborative filtering; (2) time-aware approaches; (3) location-aware approaches, along with other traditional approaches such as trust-aware ones. Because the number of proposed approaches is very large, we provide one representative example for each category.

2.1 The Region-Sensitive Personalized Collaborative Filtering Approach

Xi Chen et al. [10] presented an effective region-sensitive collaborative filtering method for the SRP. The method recognizes the region-dependent characteristics of QoS and exploits them by building an efficient region model: users are clustered into several regions based on their physical locations and historical QoS similarities.

According to their observation, QoS highly relates to users' physical locations. For instance, as Fig. 1 [10] depicts, Bob and Alice consult the same public service registry located in America. From the recorded QoS values, they find that service A, provided by an American provider, outperforms the others. After trying it, however, Alice finds that the response time of service A is much higher than her expectation, while Bob thinks A is what he wants. Alice then recognizes that some QoS properties, like response time and availability, highly relate to the network environment of the region where she is located. She turns to her colleagues, who suggest service B, provided by a local company, based on their past experience.

Fig. 1.

Alice is working in India with a low bandwidth and Bob is in America with a higher one. Both of them need an email filtering service with low response time.

In this method, they took RTT (round-trip time) as an example to describe their approach, with the following notation: \( T_{u} = \{ R_{u} (s_{1} ),R_{u} (s_{2} ), \ldots ,R_{u} (s_{m} )\} \) is the RTT vector (the RTT profile) of user u, where \( R_{u} (s_{i} )\;(1 \le i \le m) \) denotes the RTT of service \( s_i \) observed by user u, and \( \overline{{R_{u} }} \) denotes the average RTT observed by user u.

They defined a region as a group of users who are closely located to each other and have similar RTT profiles. Firstly, they detected the region-sensitive services. The set of RTTs \( \{ R_{1} (s),R_{2} (s), \ldots ,R_{k} (s)\} \;(1 \le k \le n) \) collected from all users of service s is a sample from a population R. They used the median and the median absolute deviation (MAD) to estimate the mean μ and the standard deviation σ of R.

$$ MAD = median_{i} \left( {\left| {R_{i} (s) - median_{j} (R_{j} (s))} \right|} \right)(i = 1, \ldots ,k,j = 1, \ldots ,k) $$
(8)

The two estimators are

$$ \widehat{\mu } = median_{i} (R_{i} (s)) $$
(9)
$$ \widehat{\sigma } = 1.4826 \times MAD $$
(10)
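A minimal sketch of the robust estimators of Formulas (8)-(10), using Python's standard `statistics` module:

```python
import statistics

def robust_estimates(rtts):
    """Formulas (8)-(10): estimate the mean and standard deviation of a
    service's RTT population with the median and the median absolute
    deviation (MAD), which are robust to outlying observations."""
    mu_hat = statistics.median(rtts)                             # Formula (9)
    mad = statistics.median([abs(r - mu_hat) for r in rtts])     # Formula (8)
    # 1.4826 is the standard consistency constant that makes the MAD an
    # unbiased estimator of sigma for normally distributed data.
    sigma_hat = 1.4826 * mad                                     # Formula (10)
    return mu_hat, sigma_hat
```

Note how a single extreme RTT (e.g. a timeout) barely moves either estimate, which is exactly why the authors prefer these estimators over the sample mean and standard deviation.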

Then, the similarity of two regions M and N is measured by the similarity of their region centers, which are defined as the median vectors of all the RTT vectors in each region. The similarity of two region centers m and n is formally defined as:

$$ sim(m,n) = \frac{{\sum\limits_{s \in S(n) \cap S(m)} {(R_{m} (s) - \overline{{R_{m} }} )(R_{n} (s) - \overline{{R_{n} }} )} }}{{\sqrt {\sum\limits_{s \in S(n) \cap S(m)} {(R_{m} (s) - \overline{{R_{m} }} )^{2} } } \sqrt {\sum\limits_{s \in S(n) \cap S(m)} {(R_{n} (s) - \overline{{R_{n} }} )^{2} } } }} $$
(11)

where S(m) and S(n) are the sets of services invoked by users in regions M and N, respectively.

Based on the region similarity, they ran a region aggregation algorithm over the regions and then generated QoS predictions and recommendations. The method significantly improves the prediction accuracy regardless of the sparsity of the training matrix, and it has a scalability advantage over traditional approaches.

There are many other works considering personalization. For example, YeChun Jiang et al. [11] proposed a personalized QoS-aware service recommendation approach; Lingshuang Shao et al. [12] and Jinliang Xu et al. [13] also proposed similar methods.

2.2 The Time-Aware Approach

Yan Hu et al. [1] proposed a novel time-aware approach for SRPs, which integrates time information into both the similarity measurement and the final QoS prediction, together with a hybrid personalized random walk algorithm that infers more indirect user similarities and service similarities. They redefined the user-service matrix by adding time information for service users as follows:

\( M = \{ (r_{i,j} ,t_{ij} ) \mid 1 \le i \le m,1 \le j \le n\} \) is the user-service matrix; each entry of M is a 2-tuple \( (r_{i,j} ,t_{ij} ) \), where \( r_{i,j} \) is a vector of QoS attribute values acquired from service user \( u_i \) invoking service item \( s_j \), and \( t_{ij} \) is the timestamp of that invocation.

They found that a longer timespan indicates a higher probability that a QoS value has deviated from its original value. Based on this, two intuitive principles underlie the user similarity measurement: (1) QoS experiences of two users on the same service that are temporally closer to each other contribute more to the user similarity measurement; (2) more recent QoS experiences of two users on the same service contribute more to the user similarity measurement.

As Fig. 2 [1] depicts, if \( \Delta t_1 \) is long, then even though \( u_i \) and \( u_j \) have very similar QoS experience on \( s_k \), this does not really imply high similarity between \( u_i \) and \( u_j \), since \( u_i \)'s QoS experience on \( s_k \) may change violently over \( \Delta t_1 \). Therefore, a shorter \( \Delta t_1 \) generally indicates a greater contribution of \( s_k \) to the similarity measurement between \( u_i \) and \( u_j \). The contribution of \( s_k \) can thus be approximately weighted by an exponential decay function of \( \Delta t_1 \), defined as:

$$ f_{1} (t_{ik} ,t_{jk} ) = e^{{ - \alpha \left| {t_{ik} - t_{jk} } \right|}} $$
(12)

where α is a non-negative decay constant; a larger α makes the value of \( f_1 \) vanish more rapidly with the increase of the timespan \( \left| {t_{ik} - t_{jk} } \right| \).

Fig. 2.

Suppose that the service recommender system starts at \( t_{start} \). A service user \( u_i \) invoked service \( s_k \) at \( t_{ik} \), and another user \( u_j \) invoked the same service \( s_k \) at \( t_{jk} \). The timespan between \( t_{ik} \) and \( t_{jk} \) is denoted by \( \Delta t_1 \); \( \Delta t_2 \) is the timespan between \( u_j \)'s invocation of service \( s_k \) and the current moment, and \( \Delta t_3 \) is the timespan between \( u_i \)'s invocation of \( s_k \) and the current moment.

They utilized \( \Delta t_4 = (\Delta t_2 + \Delta t_3 )/2 \) to denote the second time factor. A shorter \( \Delta t_4 \) generally indicates a greater contribution of \( s_k \) to the service user similarity measurement, so the contribution of \( s_k \) decays exponentially with the increase of \( \Delta t_4 \). The exponential decay function is defined as:

$$ f_{2} (t_{ik} ,t_{jk} ) = e^{{ - \beta \left| {t_{current} - (t_{ik} + t_{jk} )/2} \right|}} $$
(13)

where β is a non-negative decay constant; a larger β makes \( f_2 \) vanish more rapidly with the increase of the timespan \( \left| {t_{current} - (t_{ik} + t_{jk} )/2} \right| \).

Based on the definitions of \( f_1 \) and \( f_2 \), the time-aware PCC-based similarity measurement between \( u_i \) and \( u_j \) can be defined as:

$$ sim(u_{i} ,u_{j} ) = \frac{{\sum\limits_{{s_{k} \in S}} {(r_{i,k} - \overline{{r_{i} }} )(r_{j,k} - \overline{{r_{j} }} )f_{1} (t_{ik} ,t_{jk} )f_{2} (t_{ik} ,t_{jk} )} }}{{\sqrt {\sum\limits_{{s_{k} \in S}} {(r_{i,k} - \overline{{r_{i} }} )^{2} } } \sqrt {\sum\limits_{{s_{k} \in S}} {(r_{j,k} - \overline{{r_{j} }} )^{2} } } }} $$
(14)
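The time-aware similarity of Formulas (12)-(14) can be sketched as follows (a sketch assuming each user's records are stored as a dict from service id to a (QoS value, timestamp) pair, and using the average invocation time in \( f_2 \)):

```python
import math

def time_aware_sim(qos_i, qos_j, t_now, alpha=0.5, beta=0.5):
    """Time-aware PCC sketch: weight each co-invoked service by the
    decay factors f1 (Formula (12)) and f2 (Formula (13)).

    qos_i, qos_j: dicts mapping service id -> (qos_value, timestamp).
    """
    common = qos_i.keys() & qos_j.keys()
    if len(common) < 2:
        return 0.0
    mean_i = sum(qos_i[s][0] for s in common) / len(common)
    mean_j = sum(qos_j[s][0] for s in common) / len(common)
    num = den_i = den_j = 0.0
    for s in common:
        (r_i, t_i), (r_j, t_j) = qos_i[s], qos_j[s]
        f1 = math.exp(-alpha * abs(t_i - t_j))                # Formula (12)
        f2 = math.exp(-beta * abs(t_now - (t_i + t_j) / 2))   # Formula (13)
        num += (r_i - mean_i) * (r_j - mean_j) * f1 * f2
        den_i += (r_i - mean_i) ** 2
        den_j += (r_j - mean_j) ** 2
    den = math.sqrt(den_i) * math.sqrt(den_j)
    return num / den if den else 0.0
```

With α = β = 0 the decay weights are all 1 and the measure reduces to the plain PCC; larger decay constants discount invocation pairs that are far apart in time or far in the past.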

2.3 The Random Walk Method

Mingdong Tang et al. [14] proposed a method that considers service users' locations to alleviate the data sparsity issue and improve the precision of service recommendation.

Typically, a user invokes only a small number of services, so the matrix M is likely very sparse, i.e., \( r_{u,i} \) is missing for most u and i. The random walk method is inspired by the service selection process in real life.

To imitate this process, they built a user network \( NU_{u} \) that connects user u with its nearest Top-K neighbors. Motivated by the observation that QoS information provided by near neighbors of the source user on the target or similar services is more reliable than QoS information provided by far neighbors, they combined the location-based and item-based methods to make QoS predictions, proposing a random walk model called WSWalker.

Firstly, they proposed a method to measure the distance between users in the user network. The location of a user is represented by its latitude and longitude, so the distance between two users is calculated as follows:

$$ d_{u,v} = \sqrt {(\varphi_{u} - \varphi_{v} )^{2} + (\lambda_{u} - \lambda_{v} )^{2} * \cos^{2} \left( {\frac{{\varphi_{u} + \varphi_{v} }}{2}} \right)} $$
(15)

where \( \varphi_{u} \) is the latitude of u, \( \lambda_{u} \) is the longitude of u, and \( \varphi_{v} \) and \( \lambda_{v} \) are the latitude and longitude of v, respectively. They also proposed a method to measure the weight of the edges in the user network:

$$ t_{u,v} = \frac{1}{{1 + d_{u,v} }} * sim^{{\prime \prime }} (u,v) $$
(16)

where \( t_{u,v} \) represents the weight of edge (u, v) in the user network, which can be viewed as a trust value between u and v, and \( sim^{{\prime \prime }} (u,v) \) represents the similarity between u and v based on a modified PCC. Starting from Formula (1), the modified PCC is measured as follows:

$$ sim^{{\prime }} (u,v) = \frac{1}{{1 + e^{{ - \frac{{\left| {IC_{u,v} } \right|}}{2}}} }}sim(u,v) $$
(17)

where \( \left| {IC_{u,v} } \right| \) is the number of services co-invoked by u and v. The values of both \( sim(u,v) \) and \( sim^{{\prime }} (u,v) \) lie in the range [−1, 1]. However, it is meaningless for \( t_{u,v} \) to take negative values, so they converted the similarity value into the range [0, 1] as follows:

$$ sim^{{\prime \prime }} (u,v) = \frac{{(1 + sim^{{\prime }} (u,v))}}{2} $$
(18)
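Formulas (15), (16) and (18) can be combined into a small helper (a sketch; `sim_uv` is assumed to be the damped PCC \( sim^{\prime}(u,v) \) of Formula (17), and coordinates are assumed to be in radians):

```python
import math

def trust_weight(lat_u, lon_u, lat_v, lon_v, sim_uv):
    """Edge weight t_{u,v} of Formula (16), built from the distance of
    Formula (15) and a similarity rescaled into [0, 1] by Formula (18).
    """
    # Formula (15): equirectangular approximation of geographic distance
    d = math.sqrt((lat_u - lat_v) ** 2 +
                  (lon_u - lon_v) ** 2 * math.cos((lat_u + lat_v) / 2) ** 2)
    sim01 = (1 + sim_uv) / 2   # Formula (18): map [-1, 1] into [0, 1]
    return sim01 / (1 + d)     # Formula (16)
```

Two co-located, perfectly similar users receive the maximum weight 1; the weight decays toward 0 as either distance grows or similarity drops.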
$$ P(S_{u} = v) = \frac{{t_{u,v} }}{{\sum\nolimits_{{w \in NU_{u} }} {t_{u,w} } }} $$
(19)

where \( P(S_{u} = v) \) denotes the probability that a random walk currently at node u selects user v as the next node to continue walking.

$$ P(X_{{u_{0} ,i,k + 1}} = v|X_{{u_{0} ,i,k}} = u,\widetilde{{R_{u,i} }}) = (1 - \phi_{u,i,k} ) \times \frac{{t_{u,v} }}{{\sum\nolimits_{{w \in NU_{u} }} {t_{u,w} } }} $$
(20)

where \( P(X_{{u_{0} ,i,k + 1}} = v|X_{{u_{0} ,i,k}} = u,\widetilde{{R_{u,i} }}) \) is the probability that a walk, started at source user \( u_0 \) for target service i and located at node u at step k, moves to user v at step k + 1, and \( \widetilde{{R_{u,i} }} \) denotes the condition that the user u reached at step k has no QoS experience of the target service i.

$$ \phi_{u,i,k} = \mathop {\hbox{max} }\limits_{{j \in I_{u} }} sim^{{\prime }} (i,j) \times \frac{1}{{1 + e^{{ - \frac{k}{2}}} }} $$
(21)
$$ sim^{{\prime }} (i,j) = \frac{1}{{1 + e^{{ - \frac{{\left| {UC_{i,j} } \right|}}{2}}} }}sim(i,j) $$
(22)

where \( \left| {UC_{i,j} } \right| \) is the number of users who have invoked both services i and j.

For each random walk, there are three cases to stop it:

Case 1. Arriving at a user who has QoS experience on the target service i.

Case 2. At some user node u, the walk decides to stay at the node, randomly selects one of the services invoked by u that is similar to i, and returns its QoS value as the result of the random walk.

Case 3. To prevent a single random walk from continuing forever, the maximum depth is limited (the walk stops when k > max-depth). Following the idea of "six degrees of separation" in social networks, they set max-depth to 6.
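The walk, its transition rule and its three stopping cases can be sketched as follows (a sketch with assumed data structures, not the authors' exact WSWalker implementation):

```python
import math
import random

def random_walk(u0, target, qos, trust, neighbors, sim_items,
                max_depth=6, rng=random):
    """One random walk of a WSWalker-style model.

    Assumed data structures:
      qos[u]       -> dict of service -> known QoS value of user u
      trust[u][v]  -> edge weight t_{u,v} of Formula (16)
      neighbors[u] -> the Top-K nearest neighbors NU_u of user u
      sim_items[i] -> list of (service j, sim'(i, j)) pairs
    Returns a QoS value, or None if the walk is abandoned.
    """
    u, k = u0, 0
    while k <= max_depth:                          # Case 3: bounded depth
        if target in qos.get(u, {}):               # Case 1: direct experience
            return qos[u][target]
        similar = [(j, s) for j, s in sim_items.get(target, [])
                   if j in qos.get(u, {})]
        if similar:
            # Formula (21): the stopping probability grows with depth k
            phi = max(s for _, s in similar) / (1 + math.exp(-k / 2))
            if rng.random() < phi:                 # Case 2: stay and return a
                j = rng.choice([j for j, _ in similar])   # similar service
                return qos[u][j]
        nbrs = neighbors.get(u, [])
        total = sum(trust[u][v] for v in nbrs)
        if total <= 0:
            return None
        # Formula (19): move to v with probability t_{u,v} / sum_w t_{u,w}
        threshold, acc = rng.random() * total, 0.0
        for v in nbrs:
            acc += trust[u][v]
            if threshold <= acc:
                u = v
                break
        k += 1
    return None
```

In the full model, many such walks are run and their returned QoS values are aggregated into the prediction for the source user.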

The proposed random walk method achieves significantly better tradeoffs between coverage and precision than the other CF methods, and it also shows that more confident recommendations are of greater quality.

3 Recent Approaches

Approaches for solving the SRP that differ from the method proposed by Z. Zheng et al. [8, 9] are classified in our paper into the category of recent approaches. The recent approaches fall into a large number of categories, including approaches for solving cold-start SRPs, multi-dimensional algorithms for SRPs, applications to the mobile internet, and so on. In this section, we describe three recent approaches specifically developed for the SRP.

3.1 The Multi-dimensional QoS Prediction Approach

In the SRP, most contemporary QoS prediction methods exploit the QoS characteristics of one specific dimension, e.g., time or location, and do not exploit the structural relationships among the multi-dimensional QoS data. Shangguang Wang et al. [7] proposed an integrated QoS prediction approach which unifies the modeling of multi-dimensional QoS data via the multi-linear-algebra concept of the tensor, and enables efficient Web service recommendation for mobile clients via tensor decomposition and reconstruction optimization algorithms. The main steps of the method are as follows.

Step 1. Adopt the multi-linear-algebra concept of the tensor to model multi-dimensional QoS data. The QoS data can be modeled as a tensor \( x \in {\mathbb{R}}^{m \times n \times k \times l \times p} \), which has five dimensions: m users, n services, k time periods, l locations and p QoS properties. \( x_{{i_{1} i_{2} i_{3} i_{4} i_{5} }} \) is an entry of \( x \), denoting the value of the \( i_5 \)-th QoS property for the \( i_1 \)-th user invoking the \( i_2 \)-th service, where the invocation occurs in the \( i_3 \)-th time period and at the \( i_4 \)-th location.

Step 2. Use the concept of rank one tensor to decompose a tensor. A tensor \( x \in {\mathbb{R}}^{{I_{1} \times I_{2} \times \ldots \times I_{N} }} \) can be decomposed as:

$$ x = \sum\nolimits_{r = 1}^{R} {a_{r}^{(1)} \circ a_{r}^{(2)} } \ldots \circ a_{r}^{(N)} $$
(23)

where \( a_{r}^{(j)} \) denotes a vector of length \( I_{j} \;(1 \le j \le N) \); the subscript r indicates that \( a_{r}^{(j)} \) belongs to the r-th rank-one tensor, and the superscript (j) indicates that it is the j-th vector of that rank-one tensor.

Let every \( a_{r}^{(j)} \) in Formula (23) be a column vector; then, for a given j, the R columns \( a_{1}^{(j)} ,a_{2}^{(j)} , \ldots ,a_{R}^{(j)} \) constitute an \( I_{j} \times R \) matrix denoted as \( A^{(j)} \). Formula (23) is then equivalent to:

$$ x = \sum\nolimits_{r = 1}^{R} {A_{r}^{(1)} \circ } A_{r}^{(2)} \circ \ldots \circ A_{r}^{(N)} $$
(24)

Finally, all the values of \( x \), including the unknown values, can be estimated as:

$$ x_{{i_{1} i_{2} \ldots i_{N} }} \approx \widehat{x}_{{i_{1} i_{2} \ldots i_{N} }} = \sum\nolimits_{r = 1}^{R} {A_{{i_{1} r}}^{(1)} } \cdot A_{{i_{2} r}}^{(2)} \cdot \ldots \cdot A_{{i_{N} r}}^{(N)} $$
(25)

Step 3. Use the QoS tensor established in Step 1, compute the component matrices, and then predict the unknown QoS values by reconstructing the QoS tensor via Formula (25).

This procedure can be executed in \( O(S \cdot C \cdot R) \) time, where C is the number of known values of \( x \) and S is the number of iterations required. The core of this approach is one of the fastest optimization mechanisms, and it can achieve much better accuracy in recommending services.
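The entry-wise reconstruction of Formula (25) can be sketched in plain Python (a sketch; `factors` holds the component matrices \( A^{(j)} \) as lists of rows):

```python
def cp_entry(factors, index):
    """Estimate one tensor entry via Formula (25).

    factors[j] is the component matrix A^(j), given as a list of rows of
    length R, and index = (i_1, ..., i_N) selects the entry to rebuild.
    """
    R = len(factors[0][0])
    total = 0.0
    for r in range(R):                # sum over the R rank-one components
        prod = 1.0
        for A, i in zip(factors, index):
            prod *= A[i][r]           # factor A^(j)[i_j, r]
        total += prod
    return total
```

Once the component matrices have been fitted to the known QoS entries, the same formula yields estimates for the unknown ones, which is exactly the prediction of Step 3.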

3.2 The MF and Decision Tree Integration Approach

This approach was proposed by Qi Yu [15] to solve cold-start SRPs; it integrates Matrix Factorization (MF) with decision tree learning for service recommendation systems.

Step 1. Matrix Factorization discovers the hidden user group structure from a set of incomplete QoS data. It computes two low-rank matrices \( F \in {\mathbb{R}}_{ + }^{m \times k} \) and \( G \in {\mathbb{R}}_{ + }^{n \times k} \) to approximate the original QoS matrix \( M_{m \times n} \), i.e., \( M \approx FG^{{\prime }} \).

Step 2. The tree learning algorithm constructs a decision tree to partition the users to fit the group structure discovered by MF. Firstly, discover user groups that contain users sharing similar QoS experience. Then, estimate the unobserved entries in the QoS matrix M.

Step 3. Cold-start service recommendation. The new user is expected to share similar QoS experience with other users in the same group. Therefore, the new user should not deviate much from the group mean, which makes the group mean a good estimate for the new user’s QoS experience.
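Step 3 can be sketched as follows (a sketch with an assumed data layout; the group assignment produced by the MF and decision-tree stages is taken as given):

```python
def group_mean_estimate(qos_rows, groups, new_user_group):
    """Cold-start estimate: predict a new user's QoS on each service as
    the mean of its group's observed values.

    Assumed layout: qos_rows[u] is user u's row of the QoS matrix M, with
    0 marking an unobserved entry, and groups[u] is the group assigned to
    user u by the MF / decision-tree stages.
    """
    members = [u for u, g in groups.items() if g == new_user_group]
    n_services = len(next(iter(qos_rows.values())))
    estimate = []
    for j in range(n_services):
        observed = [qos_rows[u][j] for u in members if qos_rows[u][j] != 0]
        estimate.append(sum(observed) / len(observed) if observed else 0.0)
    return estimate
```

A new user routed to a group by the decision tree immediately receives the group-mean QoS profile, with no invocation history of its own required.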

The cold-start performance of this approach even outperforms the warm-start performance of other approaches. Compared with the ternary tree approach, which is not suitable for the initial interview of service recommendation, this approach performs well on the cold-start issue.

3.3 The Context-Aware Role Mining Approach

Jian Wang et al. [16] proposed an approach that automatically groups users according to their interests and habits for mobile service recommendation in SRPs. Popular mobile services can then be recommended to other members of the same group in a context-dependent manner. The approach can be implemented in the following steps.

Step 1. Mine the minimal set of roles from the user-context-behavior matrix, and output the user-role assignment matrix and the role-context-behavior assignment matrix.

Step 2. Create a role tree based on the FCA (Formal Concept Analysis) approach [17], in which a formal context is represented as a matrix whose rows represent objects and whose columns represent attributes.

This approach was successfully applied to recommending services on mobile devices and was evaluated as efficient and scalable for mobile service recommendation.

4 Conclusion

The Service Recommendation Problem lies at the heart of commercial systems (e.g., Amazon, eBay, and so on), service computing [18, 19], cloud computing [20], mobile computing [21], and so on. There exist several versions of the problem, and a wide variety of traditional and recent approaches have been proposed for its solution. Traditional approaches can only solve relatively small problems, but a number of recent approaches have proved very satisfactory.