1 Introduction

In the last decade, agent and Multi-Agent Systems (MASs) technologies have been widely applied to develop web-based service-oriented systems such as Internet-based grid systems [4, 27], e-markets [2, 9, 11], pervasive computing systems [13, 15], and e-government [5, 24]. Web-based service-oriented environments are open environments where consumers and providers can enter and leave the environments freely. In such environments, how to select a trustworthy and suitable service provider to fulfill a service request for a consumer is a very challenging issue for most service-oriented applications.

‘Agent trust’ is one of the important research issues in MASs. A number of models have been developed to evaluate trust in MASs, for example, the probabilistic theory-based model [1, 21], the certified reputation model [7] and the evidential trust model [23]. In the last decade, some trust models have been developed in service-oriented environments to help consumers evaluate the trust values of potential service providers from different perspectives [3, 6, 8, 10, 16, 25]. Zacharia et al. proposed a reputation-based trust evaluation model, called SPORAS [26], to select a single service provider based on its historical performance. Huynh et al. introduced a well-known trust model, called the ‘Certified Reputation’ (CR) model, to evaluate a service provider’s trust through third party references [7]. Su et al. extended the CR model and developed a priority-based trust model to evaluate a full context trust value of a potential service provider based on third party references, the historical performance and the priority distribution of attributes [19]. Most current trust models focus on evaluating the trust value for a single provider; however, many complex service requests from consumers cannot be handled by an individual service from a single service provider, and a service group consisting of a number of services from different providers is needed to satisfy these service requests [20, 22]. In such situations, trust models focusing on the trust evaluation for single service providers cannot be directly employed for the trust evaluation for service groups. How to select a suitable service group based on a complex service request has become a new problem.

The trust evaluation for a service group is different from that of a single service, because there are some specific factors that affect the trust value of a service group. Such factors mainly include: (1) the functionality coverage of the service group on the service request, which determines whether the group can satisfy all the attributes of the service request, (2) the workflow and dependency relationships among individual services of the group, (3) the historical performance of individual services in the group, and (4) the extent to which the historical performance of the group reflects its likely performance on the service request, given the priority distribution of attributes.

In the literature, there are only a few models for group trust evaluation, and they do not take most of the aforementioned factors into account. The REGRET model [12] is one of these models; it generates the trust value of a service group by averaging the trust values of all individual services in the group.

In this paper, we introduce an innovative trust model, the GTrust model, for trust evaluation of a service group by considering the four aforementioned factors. The GTrust model has the following merits. (1) We use the ‘functionality coverage’ to measure the extent to which the functionalities of a service request are satisfied by a service group. (2) We introduce the ‘workflow description’ and the ‘degree of dependency’ to describe the workflow and dependency relationships among individual services of a service group. (3) We borrow the concept of ‘third party reference report’ from the PBTrust model [19] to record the historical performance of individual services in a service group. (4) We use the concept of ‘similarity’ to measure the similarity of priority distribution between the service request and historical services in reference reports.

The rest of this paper is organised as follows. In Section 2, the problem description and definitions are presented. The basic components of the GTrust model are introduced briefly in Section 3. The modules of the GTrust model are introduced in detail in Section 4. In Section 5, we demonstrate experimental results and analysis. The related work and comparison are given in Section 6. In Section 7, we conclude the paper and outline the future work.

2 Problem description and definitions

In general, a service in a service-oriented environment can be described by a number of attributes, such as price, time and quality, and their corresponding priority values. For different service requests, the priority distribution over the same attributes may differ. In order to describe a service request precisely, we formally define a service description.

Suppose that a service request includes n attributes and each attribute has a priority value that describes its importance in the request. A service can then be described by the n attributes and their priority values as follows.

Definition 1

A service description (SDes) is defined as a 2 × n matrix.

$$ SDes = \left( \begin{array}{ccccc} A_{1} & A_{2} & A_{3} & ... & A_{n} \\ P_{1} & P_{2} & P_{3} & ... & P_{n} \\ \end{array} \right), $$
(1)

where \(A_{j}\) is the j-th attribute of the service request, \(P_{j}\) is the priority value of \(A_{j}\) and \({\sum }_{j=1}^{n}P_{j}=1\).

Definition 2

A reference report (Rf) is defined as a two-tuple, \(Rf = \langle SDes, Ratings\rangle\), where \(SDes\) is the service description of the service request from a historical consumer (third party) and \(Ratings\) is further defined as a vector, \(Ratings = \langle R_{1}, R_{2}, ..., R_{n}\rangle\), where \(R_{j}\) represents the rating of the individual service on the j-th attribute of the historical service request and \(R_{j} \in [0, 1]\), where 0 and 1 represent the worst and best performance of the individual service, respectively. The expansion of a reference report is described as follows.

$$ Rf = \left\langle\left( \begin{array}{ccccc} A_{1} & A_{2} & A_{3} & ... & A_{n} \\ P_{1} & P_{2} & P_{3} & ... & P_{n} \\ \end{array}\right), <R_{1}, R_{2}, R_{3}, ..., R_{n}>\right\rangle $$
(2)

To deal with a complex service request, a number of individual services need to form a group with a certain workflow and dependency relationships among them. If two service groups have the same individual services but the workflows and dependency relationships among these individual services are different, the two service groups may have a different performance on a service request. For example, suppose that two service groups have the same individual services S 1, S 2, S 3, S 4 and S 5, but different workflows and dependency relationships among individual services, which are described as follows.

In Figure 1, Group 1 has a sequential workflow from \(S_{1}\) to \(S_{5}\), i.e., each individual service can begin to work only when its predecessor has finished its work. The workflow in Group 2 is different from the workflow in Group 1. In Group 2, \(S_{1}\), \(S_{2}\), \(S_{3}\) and \(S_{4}\) can work concurrently and \(S_{5}\) can only begin to work when they have finished their work. In addition, the dependency relationships among individual services can affect the performance of the service group. For example, in Group 1, if \(S_{4}\) performs badly and offers a low quality output to \(S_{5}\), then even if \(S_{5}\) itself performs very well, the bad input received from \(S_{4}\) can reduce the actual performance of \(S_{5}\).

Figure 1. The workflows and dependency relationships among individual services of the two service groups

In order to describe the extent of the dependency relationship between two individual services, we introduce the concept of degree of dependency.

Definition 3

A degree of dependency (λ) is defined as a value λ ∈ [0, 1], where 0 and 1 represent no dependency relationship and the strongest dependency relationship between two individual services, respectively.

The workflow and dependency relationships among individual services of a service group can be described by a directed graph, which can be further described by an adjacency matrix as follows.

Definition 4

A workflow description (WDes) of a service group that consists of m individual services can be represented by an m × m adjacency matrix as in (3).

$$ WDes = \left( \begin{array}{ccccc} \lambda_{11} & \lambda_{12} & \lambda_{13} & ... & \lambda_{1m}\\ \lambda_{21} & \lambda_{22} & \lambda_{23} & ... & \lambda_{2m}\\ ... & ... & ... & ... & ...\\ \lambda_{m1} & \lambda_{m2} & \lambda_{m3} & ... & \lambda_{mm}\\ \end{array} \right), $$
(3)

where \(\lambda _{ii^{\prime }}\in [0, 1]\) represents the degree to which the \(i^{\prime }\)-th service (i.e., \(S_{i^{\prime }}\)) depends on the i-th service (i.e., \(S_{i}\)). \(\lambda _{ii^{\prime }}=0\) represents that there is no dependency relationship between \(S_{i}\) and \(S_{i^{\prime }}\). \(0<\lambda _{ii^{\prime }}\leq 1\) represents that \(S_{i^{\prime }}\) depends on \(S_{i}\) and the value of \(\lambda _{ii^{\prime }}\) is the degree of dependency between \(S_{i}\) and \(S_{i^{\prime }}\).

In addition, cycles in a workflow (i.e., there are two services \(S_{i}\) and \(S_{i^{\prime }}\) in \(WDes\), where \(S_{i}\) directly or indirectly depends on \(S_{i^{\prime }}\) and \(S_{i^{\prime }}\) directly or indirectly depends on \(S_{i}\)) can cause infinite calculation loops in the final rating calculation and the individual rating calculation (see Sections 4.3.3 and 4.4). Hence, the GTrust model can only calculate the trust value for a group of services with an acyclic workflow.
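The acyclicity requirement can be checked before any rating is computed. The following is a minimal sketch, assuming the workflow description is held as a nested list `wdes` in which `wdes[i][k] > 0` means that service k depends on service i (Definition 4). The function name `topological_order` and the use of Kahn's algorithm are illustrative choices, not part of the GTrust model itself.

```python
from collections import deque


def topological_order(wdes):
    """Return service indices so that every service appears after the
    services it depends on; raise ValueError if the workflow has a cycle.

    wdes[i][k] > 0 means that service k depends on service i.
    """
    m = len(wdes)
    # For each service k, count the services it directly depends on.
    pending = [sum(1 for i in range(m) if wdes[i][k] > 0) for k in range(m)]
    ready = deque(k for k in range(m) if pending[k] == 0)
    order = []
    while ready:
        i = ready.popleft()
        order.append(i)
        # Service i is resolved, so release the services that depend on it.
        for k in range(m):
            if wdes[i][k] > 0:
                pending[k] -= 1
                if pending[k] == 0:
                    ready.append(k)
    if len(order) != m:
        raise ValueError("workflow description contains a cycle")
    return order


# Hypothetical three-service chain: S2 depends on S1, S3 depends on S2.
chain = [[0, 0.5, 0], [0, 0, 0.3], [0, 0, 0]]
chain_order = topological_order(chain)
```

The returned order is also a valid evaluation order for the recursive rating calculations in Sections 4.3.3 and 4.4, since each service is visited only after all services it depends on.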

3 Basic modules of the GTrust model

The GTrust model consists of 4 modules which are the request module, the reply module, the priority-based group trust calculation module and the evaluation module. The working procedure of the 4 modules is shown as follows.

As shown in Figure 2, when a consumer needs a complex service, (1) the request module of the consumer generates a service request and broadcasts it to providers in the network; (2) potential providers form service groups and the reply module of each service group generates a service reply to the consumer; (3) the priority-based group trust calculation module of the consumer calculates the trust value for each replied service group and selects the service group with the highest trust value; (4) after the selected service group finishes the service request, the evaluation module of the consumer generates reference reports for the individual services of the group to describe their performance on the service request, and these reports are recorded by the individual services.

Figure 2. The working procedure of the GTrust model

4 The principle of the GTrust model

In this section, the 4 modules of the GTrust model are introduced in detail.

4.1 The request module

The objective of the request module is to generate a service request (i.e., \(SReq\)) based on the consumer’s requirements. A service request is described as a two-tuple \(SReq = \langle SDes, RN\rangle\), where \(SDes\) is the service description and RN is the number of reference reports required from each individual service. In the GTrust model, the value of RN has a significant influence on the trust evaluation of service groups. If the value of RN is high, the consumer will receive more reference reports from each service group, which can help the consumer evaluate the potential performance of each service group accurately. However, a high value of RN can also significantly reduce the number of service groups that can reply to the service request: many suitable service groups cannot reply, since some individual services in these groups do not have enough reference reports to meet the requirement of the consumer. Conversely, if the value of RN is low, more service groups can reply to the service request, but the reference reports received from each service group might not be enough for the consumer to evaluate the potential performance of each service group accurately. In service-oriented networks, a suitable value of RN is related to many factors, such as the number of service providers in the network, the numbers of reference reports obtained by service providers, the functionality coverage of these reference reports on the current service request, the numbers of individual services in service groups and the workflows and dependency relationships among individual services of service groups. Therefore, the value of RN should be decided according to the circumstances of the network. We leave it to users to decide the value of RN based on the above factors and their applications.

Suppose that consumer C in an e-market environment needs a complex service including 5 attributes (i.e., cost, speed, quality, colour and warranty). The priority values of 5 attributes are 0.1, 0.4, 0.2, 0.1 and 0.2, respectively. In addition, C requires each individual service to provide 3 reference reports to indicate its historical performance. Based on the above requirements, the request module generates a service request S R e q as follows.

$$ SReq = \left\langle\left( \begin{array}{ccccc} Cost & Speed & Quality & Colour & Warranty \\ 0.1 & 0.4 & 0.2 & 0.1 & 0.2 \\ \end{array} \right), 3 \right\rangle $$
(4)

Then, \(SReq\) is broadcast to all providers through the network. The above example will be used to explain the following modules in this section.

4.2 The reply module

The purpose of the reply module is to generate a service reply to describe the workflow, dependency relationships and historical performance of individual services in a service group. A service reply is described as a two-tuple S R = < W D e s,R f S e t >, where W D e s is the workflow and dependency relationships among individual services of the service group and R f S e t is the set of reference reports of individual services in the service group. For example, if a service group (SG) intends to conduct the service request, the reply module of SG collects the following information: the workflows and dependency relationships among individual services of SG and reference reports of individual services of SG.

Suppose that SG consists of 5 individual services named S 1, S 2, S 3, S 4 and S 5, with the workflow and dependency relationships shown as follows.

In Figure 3, the circles represent the individual services of SG, the arrows represent the dependency relationships between two individual services and the value on each arrow indicates the degree of dependency (i.e., λ, see Definition 3) between two individual services.

Figure 3. The workflow and dependency relationships among individual services of SG

The workflow description (\(WDes\)) of SG can be represented in the format of Definition 4 as follows.

$$ WDes = \left( \begin{array}{ccccc} 0 & 0 & 0.5 & 0 & 0 \\ 0 & 0 & 0.2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0.3 \\ 0 & 0 & 0 & 0 & 0.6 \\ 0 & 0 & 0 & 0 & 0 \\ \end{array} \right) $$
(5)

In addition to the workflow description, the service provider of each individual service in SG also needs to provide RN reference reports (i.e., \(Rf\), see Definition 2) to reflect its historical performance on similar service requests. The service providers are assumed to be rational entities that are familiar with the trust evaluation of the GTrust model, so they will provide the consumer with their most suitable reference reports, i.e., those that maximise the trust value of SG. The suitability (\(RSuit\)) of a reference report on the service request is calculated based on two aspects: (1) the similarity of priority distribution in service descriptions between the service request (i.e., \(SDes_{1}\) in \(SReq\), see Section 4.1) and the reference report (i.e., \(SDes_{2}\) in \(Rf\), see Definition 2) and (2) the ratings of the reference report (i.e., \(Ratings\) in \(Rf\), see Definition 2) on the service request. It is calculated as follows.

$$ RSuit=RSim \times RRating, $$
(6)

where R S i m is the similarity of priority distribution in service descriptions between the service request and the reference report; and R R a t i n g is the ratings of the reference report on the service request.

To calculate the similarity (\(RSim\)) of priority distribution in service descriptions (i.e., \(SDes_{1}\) and \(SDes_{2}\)), the attributes in \(SDes_{2}\) are first arranged in the same order as those in \(SDes_{1}\). Then, \(RSim\) is calculated based on the distance between the two priority vectors, which is described as follows.

$$ RSim=e^{-\sqrt{{\sum}_{j=1}^{n}(P_{1j}-P_{2j})^{2}}}, $$
(7)

where R S i m is the similarity of priority distributions between S D e s 1 and S D e s 2 and R S i m ∈ [0, 1], where 0 and 1 represent the lowest and highest similarity of priority distributions, respectively; P 1j and P 2j are the priority values in S D e s 1 and S D e s 2 on the j th attribute of the service request, respectively. In addition, some attributes of the service request might not be covered by the reference report (i.e., \(P_{2j}=\varnothing \)). In this situation, the priority values of these uncovered attributes equal 0 (i.e., P 2j = 0) in this calculation.

To calculate the ratings (\(RRating\)) of the reference report on the service request, we first multiply each rating in the reference report by its corresponding priority value in the service description of the service request (i.e., \(SDes_{1}\)). Then, we sum the products, which is described as follows.

$$ RRating=\sum\limits_{j=1}^{n}P_{1j}\cdot R_{j}, $$
(8)

where R R a t i n g is the ratings of the reference report on the service request; P 1j is the priority value of the j th attribute of the service request; and R j is the rating of the reference report on the j th attribute of the service request. In addition, some attributes in the service request might not be covered by the reference report (i.e., \(R_{j}=\varnothing \)). In this situation, the ratings of these uncovered attributes equal 0 (i.e., R j = 0) in this calculation.

Finally, for each individual service, the RN reference reports with the highest suitability values (\(RSuit\)) are chosen as the suitable reference reports. After collecting all necessary information, the reply module creates a service reply \(SR = \langle WDes, RfSet\rangle\) for SG, where \(WDes\) is the workflow description of SG and \(RfSet\) consists of b reference reports with \(b = m \times RN = 15\), where m = 5 is the number of individual services in SG and RN = 3 is the number of reference reports required from each individual service.
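The selection of suitable reference reports by (6), (7) and (8) can be sketched as follows. This is an illustrative sketch under stated assumptions: a reference report is represented as a pair `(priorities, ratings)` whose entries are already aligned with the attribute order of the service request, an uncovered attribute is marked with `None`, and the function names `r_sim`, `r_rating`, `r_suit` and `select_reports` are hypothetical.

```python
import math


def _value(x):
    # Uncovered attributes (None) count as 0, as stated for (7) and (8).
    return 0.0 if x is None else x


def r_sim(request_priorities, report_priorities):
    """Similarity of priority distributions, Eq. (7)."""
    distance = math.sqrt(sum((p1 - _value(p2)) ** 2
                             for p1, p2 in zip(request_priorities,
                                               report_priorities)))
    return math.exp(-distance)


def r_rating(request_priorities, ratings):
    """Priority-weighted ratings of a reference report, Eq. (8)."""
    return sum(p1 * _value(r) for p1, r in zip(request_priorities, ratings))


def r_suit(request_priorities, report):
    """Suitability of a reference report, Eq. (6)."""
    report_priorities, ratings = report
    return (r_sim(request_priorities, report_priorities)
            * r_rating(request_priorities, ratings))


def select_reports(request_priorities, reports, rn):
    """Keep the rn reports of one individual service with the highest RSuit."""
    ranked = sorted(reports,
                    key=lambda rf: r_suit(request_priorities, rf),
                    reverse=True)
    return ranked[:rn]


# Priorities of the request in (4); the two reports below are made-up data.
request = [0.1, 0.4, 0.2, 0.1, 0.2]
same_priorities = (request, [1.0, 1.0, 1.0, 1.0, 1.0])
partial_cover = ([0.5, 0.5, None, None, None], [0.9, 0.9, None, None, None])
best = select_reports(request, [partial_cover, same_priorities], 1)
```

A report with the same priority distribution as the request and perfect ratings reaches a suitability of 1, while a report that covers only some attributes is penalised both through the distance in (7) and through the zero ratings in (8).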

4.3 The priority-based group trust calculation module

The purpose of this module is to evaluate the trust value of each service group based on its service reply SR. This module produces the trust value for a service group based on three factors: (1) the functionality coverage of a service group on the attributes of the service request, (2) the similarity of priority distribution in service descriptions between the service request and the reference reports of the group and (3) the ratings of the group on the attributes of the service request. Since a service group is composed of different individual services owned by different providers, the ability of the group to handle the service request depends on the abilities of the individual services in the group. We use a group service description to formally describe the abilities of a service group by extracting service descriptions from the reference reports of the group.

Definition 5

A group service description (GSDes) is represented by a b × n matrix, where b is the number of reference reports of a service group and n is the number of attributes in the service request. G S D e s is described as follows.

$$ GSDes= \left( \begin{array}{cccc} P_{11} & P_{12} &... & P_{1n}\\ P_{21} & P_{22} &... & P_{2n}\\ ... & ... &... & ...\\ P_{b1} & P_{b2} &... & P_{bn} \\ \end{array} \right) $$
(9)

In (9), the r th row represents the priority distribution of the r th reference report and P r j represents the priority value of the r th reference report on the j th attribute of the service request. If the reference report does not cover the j th attribute of the service request, \(P_{rj}=\varnothing \); otherwise P r j ∈ [0, 1], where 0 and 1 represent the lowest and highest priority values, respectively. By using (9), the abilities of a service group can be described comprehensively.

4.3.1 Functionality coverage calculation

The purpose of functionality coverage calculation is to measure whether the abilities provided by a service group can cover all the attributes of the service request. The functionality coverage of a service group on the service request is defined as follows.

Definition 6

A functionality coverage (FCov) is defined as an n-tuple F C o v = < A C o v 1, A C o v 2, A C o v 3,... A C o v n > , where A C o v j ∈ [0, 1] represents the functionality coverage value of the service group on the j th attribute of the service request, which can be calculated based on the group service description (i.e., G S D e s, see Definition 5) as follows.

$$ ACov_{j}=\frac{b-MS_{j}}{b}, $$
(10)

where A C o v j represents the functionality coverage value of a service group on the j th attribute of the service request, b represents the number of reference reports of the group and M S j represents the number of ‘\(\varnothing \)’ (i.e., how many reference reports cannot cover the j th attribute of the service request) in the j th column of G S D e s. If the functionality coverage value on the j th attribute equals 0, we can say that this service group is not able to conduct the service request.
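The coverage values in (10) can be computed directly from the columns of the group service description. The sketch below assumes that \(GSDes\) is stored as a list of b rows with `None` standing for an uncovered attribute; the function name and the sample matrix are hypothetical.

```python
def functionality_coverage(gsdes):
    """Return [ACov_1, ..., ACov_n] from a b x n group service description.

    gsdes[r][j] is the priority of the r-th reference report on attribute j,
    or None if that report does not cover attribute j.
    """
    b = len(gsdes)
    n = len(gsdes[0])
    coverage = []
    for j in range(n):
        # MS_j: number of reference reports that do not cover attribute j.
        missing = sum(1 for row in gsdes if row[j] is None)
        coverage.append((b - missing) / b)
    return coverage


# Made-up description with b = 4 reports and n = 3 attributes.
sample_gsdes = [
    [0.5, 0.5, None],
    [0.2, None, 0.8],
    [1.0, None, None],
    [0.3, 0.3, 0.4],
]
sample_fcov = functionality_coverage(sample_gsdes)
```

In this sample the first attribute is covered by every report, while the second and third are each covered by two of the four reports, giving coverage values of 1.0, 0.5 and 0.5.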

4.3.2 Group similarity calculation

The objective of the group similarity calculation is to measure the similarity of priority distributions between the service request and a service group. In the GTrust model, the priority distribution of the service request is a vector, which can be easily obtained by extracting the priority values \(P_{j}\) from \(SDes\) (see Definition 1). However, the priority distribution of a service group is a matrix (i.e., \(GSDes\), see Definition 5), which is obtained by extracting the priority values from the b reference reports of the group. In order to evaluate the similarity of priority distribution between the service request and a service group, we must transform the priority distribution of the service group (i.e., the matrix \(GSDes\)) into a vector \(GPV = \langle GP_{1}, GP_{2}, GP_{3}, ..., GP_{n}\rangle\), where \(GP_{j}\) is the priority value of the service group on the j-th attribute of the service request. \(GP_{j}\) in GPV is calculated as follows.

$$ GP_{j}=\frac{{\sum}_{r=1}^{b}P_{rj}}{b}, $$
(11)

where b is the number of reference reports of the service group and P r j is the priority value of the r th reference report on the j th attribute of the service request. In addition, some attributes of the service request might not be covered by some reference reports (i.e., \(P_{rj}=\varnothing \)). In this situation, the priority values of these attributes equal 0 (i.e., P r j = 0) in this calculation.

With (11), we can calculate each element of the vector GPV and thereby transform the priority distribution of the service group into a vector. After that, we can calculate the similarity (\(GSim\)) of priority distribution between the service request and the service group based on the distance between the two vectors (see (7)), which is described as follows.

$$ GSim=e^{-\sqrt{{\sum}_{j=1}^{n}(P_{j}-GP_{j})^{2}}}, $$
(12)

where G S i m is a value in [0, 1], where 0 and 1 represent the lowest and highest similarities of priority distributions, respectively; P j is the priority value of the j th attribute of the service request; and G P j is the priority value of the service group on the j th attribute of the service request.
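Equations (11) and (12) amount to a column-wise mean followed by an exponential of the negative Euclidean distance. A minimal sketch, assuming the same list-of-rows representation of \(GSDes\) with `None` for uncovered attributes (the function names and sample data are hypothetical):

```python
import math


def group_priority_vector(gsdes):
    """Column-wise mean of GSDes, Eq. (11); None entries count as 0."""
    b = len(gsdes)
    n = len(gsdes[0])
    return [sum(0.0 if row[j] is None else row[j] for row in gsdes) / b
            for j in range(n)]


def group_similarity(request_priorities, gsdes):
    """Similarity between the request and the group, Eq. (12)."""
    gpv = group_priority_vector(gsdes)
    distance = math.sqrt(sum((p - gp) ** 2
                             for p, gp in zip(request_priorities, gpv)))
    return math.exp(-distance)


# Made-up example: two reports over two attributes.
two_reports = [[0.2, 0.8], [0.4, 0.6]]
gpv_example = group_priority_vector(two_reports)
gsim_example = group_similarity([0.3, 0.7], two_reports)
```

Here the two reports average to the priority vector (0.3, 0.7), so a request with exactly these priorities has distance 0 to the group and a similarity of 1 up to floating-point rounding.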

4.3.3 Group rating calculation

The purpose of group rating calculation is to predict the performance of a service group on the attributes of the service request. The rating of the service group on the j th attribute of the service request is calculated as follows.

$$ GRating_{j}=\frac{{\sum}_{i=1}^{m}FRating_{ij} }{m}, $$
(13)

where m is the number of individual services in the service group and F R a t i n g i j is the final rating of the i th individual service on the j th attribute of the service request, after considering the degree of dependency with other individual services in the group. F R a t i n g i j is calculated as follows.

$$ FRating_{ij}=\frac{{\sum}_{r} R_{rj}}{RN} \times \left( 1-\frac{{\sum}_{k=1}^{d}\lambda_{ki}\cdot (1-FRating_{kj})}{d}\right), $$
(14)

where \(\frac {{\sum }_{r} R_{rj}}{RN}\) is the average rating of the i-th individual service on the j-th attribute of the service request, which is calculated from the RN reference reports provided by the i-th individual service; \(R_{rj}\) is the rating value of the r-th reference report on the j-th attribute of the service request; d represents the number of individual services on which the i-th individual service (i.e., \(S_{i}\)) depends; \(\lambda_{ki}\) is the degree to which \(S_{i}\) depends on the k-th of these individual services; and \(FRating_{kj}\) is the final rating of that k-th individual service on the j-th attribute of the service request. In addition, since cycles in a workflow (see Definition 4) can cause infinite calculation loops in the final rating calculation, the GTrust model can only calculate final ratings of a group of services with an acyclic workflow.

The final rating calculation begins from the individual services that do not depend on any other individual service and then moves to the individual services that depend on individual services whose final ratings are already known. In addition, if some ratings in reference reports on an attribute do not exist, the average value of the existing ratings on the same attribute is used to represent the missing ratings during the group rating calculation. The reason for this setting is that these ratings must be in [0, 1], where 0 and 1 represent the worst and best historical performance of individual services on the corresponding attributes, respectively. First of all, it is not reasonable to set these missing ratings to 1. In addition, we do not set these missing ratings to 0 for two reasons. First, for the missing ratings, the trust value of a service group has already been reduced in the functionality coverage calculation (i.e., \(ACov_{j}\), see Section 4.3.1). If we set the values of the missing ratings to 0, the trust value of the service group would be reduced for the missing ratings again in the group rating calculation, which would increase the influence of the missing ratings on the trust value of the service group. Secondly, missing ratings in the reference reports do not mean that individual services have the worst performance on the corresponding attributes. Therefore, we use the average value of the existing ratings on the same attribute, so that the missing ratings do not have any additional influence on the group rating calculation of the service group, since they have already been considered in the functionality coverage calculation (see Section 4.3.1).
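The recursion in (13) and (14) can be sketched as a memoised computation over the workflow description. The sketch makes several assumptions that go beyond the text: the per-service average ratings are already computed (with any missing ratings filled in as described above) and passed in as `avg_ratings[i][j]`; the dependency factor is taken to be 1 when a service depends on no other service (d = 0), mirroring the worked example in Section 4.4; and the workflow is acyclic, since a cyclic one would make the recursion non-terminating. The function name is hypothetical.

```python
def group_ratings(wdes, avg_ratings):
    """Return [GRating_1, ..., GRating_n] following Eqs. (13) and (14).

    wdes[k][i] > 0 means that service i depends on service k.
    avg_ratings[i][j] is the mean rating of service i on attribute j.
    """
    m = len(avg_ratings)
    n = len(avg_ratings[0])
    final = {}  # memo: (i, j) -> FRating_ij

    def f_rating(i, j):
        if (i, j) not in final:
            deps = [k for k in range(m) if wdes[k][i] > 0]
            if deps:
                penalty = sum(wdes[k][i] * (1 - f_rating(k, j))
                              for k in deps) / len(deps)
            else:
                penalty = 0.0  # assumption: no dependency, no reduction
            final[(i, j)] = avg_ratings[i][j] * (1 - penalty)
        return final[(i, j)]

    return [sum(f_rating(i, j) for i in range(m)) / m for j in range(n)]


# Made-up two-service group: S2 depends on S1 with degree 0.5.
small_wdes = [[0, 0.5], [0, 0]]
small_avg = [[0.8], [1.0]]
small_group_rating = group_ratings(small_wdes, small_avg)
```

In this small example the final rating of S1 stays at 0.8, the final rating of S2 is reduced to 1.0 × (1 − 0.5 × 0.2) = 0.9, and the group rating on the single attribute is their mean, 0.85.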

4.3.4 Final trust calculation

After the functionality coverage calculation, the group similarity calculation and the group rating calculation, we can calculate the final trust value Trust of a service group as follows.

$$ Trust=GSim\cdot\sum\limits_{j=1}^{n}P_{j}\cdot ACov_{j}\cdot GRating_{j}, $$
(15)

where G S i m is the group similarity value, P j is the priority value of the j th attribute in the service request, A C o v j is the functionality coverage of the service group on the j th attribute of the service request and G R a t i n g j represents the rating of the service group on the j th attribute of the service request after considering the workflow and dependency relationships among individual services of the group.
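Equation (15) combines the three intermediate results into a single value. A minimal sketch with made-up inputs follows; the function name `final_trust` and the numbers are illustrative only.

```python
def final_trust(g_sim, priorities, acov, g_rating):
    """Final trust value of a service group, Eq. (15)."""
    weighted = sum(p * a * g for p, a, g in zip(priorities, acov, g_rating))
    return g_sim * weighted


# Made-up two-attribute example.
example_trust = final_trust(
    g_sim=1.0,
    priorities=[0.5, 0.5],
    acov=[1.0, 0.5],
    g_rating=[0.8, 0.4],
)
```

With these inputs the weighted sum is 0.5 × 1.0 × 0.8 + 0.5 × 0.5 × 0.4 = 0.5, so the trust value is 0.5. Halving the similarity would halve the trust value, which shows how a mismatch in priority distribution scales down the whole score.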

4.4 The evaluation module

After completing the service request, the consumer uses the evaluation module to generate the group reference report based on the performance of the group on the service request in the format of Definition 2, which is described as follows.

$$ Rf = <SDes, <GR_{1}, GR_{2}, GR_{3}, ..., GR_{n}>>, $$
(16)

where S D e s is the service description of the service request, G R j ∈ [0, 1] represents the group rating on the j th attribute of the service request.

The group ratings represent the performance of the service group on the final product of the service request. Due to the dependency relationships among individual services, some individual services use the outputs of other individual services as their inputs, so the performance of these individual services might be reduced by low quality inputs from other individual services. After the group reference report is generated, the consumer generates an individual reference report for each individual service in the group, which aims to recover the actual performance of the individual service. The individual ratings are adjusted by taking the group ratings and the workflow and dependency relationships among individual services into account, and are calculated as follows.

$$ R_{ij}=GR_{j} \times \left( 1+\frac{{\sum}_{k=1}^{d}\lambda_{ki}\cdot (1-R_{kj})}{d}\right), $$
(17)

where R i j ∈ [0, 1] is the individual rating of the i th individual service on the j th attribute of the service request, G R j is the group rating on the j th attribute of the service request, d represents the number of the individual services on which the i th individual service (i.e., S i ) depends and λ k i is the degree of dependency that S i depends on its k th depending individual service and R k j is the individual rating of the k th depending individual service of S i on the j th attribute of the service request. In addition, since cycles in a workflow (i.e., see, Definition 4) can cause infinite calculation loops in the individual rating calculation, the GTrust model can only calculate individual ratings of a group of services with the acyclic workflow.

We use the same example in the request module and the reply module to demonstrate how to generate the group and individual reference reports in the evaluation module. After completing the service request, Consumer C generates the group reference report based on the performance of the service group SG on attributes of the service request, which is described as follows.

$$ \left \langle \left( \begin{array}{ccccc} Cost & Speed & Quality & Colour & Warranty \\ 0.1 & 0.4 & 0.2 & 0.1 & 0.2 \\ \end{array} \right), <0.6, 0.4, 0.9, 0.9, 0.3> \right \rangle $$
(18)

From the consumer’s group ratings, we can see that the consumer was satisfied with the ‘cost’ attribute of the service provided by SG (i.e., the first attribute of \(SDes\)), was not satisfied with the ‘speed’ and ‘warranty’ attributes of the service (i.e., the second and fifth attributes of \(SDes\)), and was very satisfied with the ‘quality’ and ‘colour’ attributes of the service (i.e., the third and fourth attributes of \(SDes\)).

Based on the group reference report, the individual reference reports can be generated as follows. First, from Figure 3, it can be seen that S 1, S 2 and S 4 in SG do not have other individual services to depend on (i.e., d = 0, see (17)) so that the individual ratings of S 1, S 2 and S 4 are the same as the group ratings in (18). Then, since S 3 depends on S 1 and S 2, the individual ratings of S 3 are calculated based on (17), which are described as follows.

$$ \left \langle \left( \begin{array}{ccccc} Cost & Speed & Quality & Colour & Warranty \\ 0.1 & 0.4 & 0.2 & 0.1 & 0.2 \\ \end{array} \right), <0.68, 0.48, 0.93, 0.93, 0.37> \right \rangle $$
(19)

Since S 5 depends on S 3 and S 4, the individual ratings of S 5 are calculated based on (17), which are described as follows.

$$ \left \langle \left( \begin{array}{ccccc} Cost & Speed & Quality & Colour & Warranty \\ 0.1 & 0.4 & 0.2 & 0.1 & 0.2 \\ \end{array} \right), <0.70, 0.50, 0.94, 0.94, 0.39> \right \rangle $$
(20)

5 Experiments and analysis

Two experiments are conducted to evaluate the performance of the GTrust model on service group selection. After examining the related research, we chose the REGRET model [12] as the baseline for comparison with the GTrust model. In the REGRET model, the average trust value of all individual services in a service group is used as the trust value of the group, so the workflow and dependency relationships among the individual services of the group are not considered.

5.1 Experimental settings of experiment 1

In Experiment 1, one consumer and 20 individual services participate. The 20 individual services form 4 service groups (i.e., G 1 to G 4), each of which contains 5 individual services.

The consumer sends a service request containing 5 attributes (i.e., cost, speed, quality, colour and warranty) with different priority values. The service description of the service request is as follows.

$$ SDes^{\prime} = \left( \begin{array}{ccccc} Cost & Speed & Quality & Colour & Warranty \\ 0.04 & 0.08 & 0.2 & 0.34 & 0.34 \\ \end{array} \right). $$
(21)

In order to comprehensively test the trust calculation of the GTrust model and the REGRET model, the reference reports of the individual services in the 4 service groups are generated so that each service group has a distinct advantage: G 1 has the highest functionality coverage on the attributes of the service request, G 3 has the highest similarity with the priority distribution of the service request, G 4 has the highest ratings on the attributes of the service request, and G 2 has the second highest functionality coverage, similarity and ratings among the 4 service groups. Each individual service is required to provide 3 reference reports (i.e., R N = 3), so a service group needs to provide 15 reference reports to the consumer. The reference reports of the 20 individual services are shown in Table 1.

Table 1 The reference reports of the twenty individual services

The 4 service groups are tested in three scenarios with different workflows and dependency relationships among individual services, which are shown in Figure 4.

Figure 4 The workflows and dependency relationships among individual services in three scenarios

In Figure 4, the workflows in the three scenarios represent three common workflows of 5 individual services in real-life applications. The workflow in Scenario 1 is a sequential workflow, in which each individual service depends on the output of the preceding individual service. The workflow in Scenario 2 is a hierarchical workflow, in which two individual services are at the low dependency level, two individual services are at the middle dependency level and one individual service is at the high dependency level. The workflow in Scenario 3 is a concurrent workflow, in which one individual service depends on the other 4 individual services. In each scenario, we calculate the trust values of the 4 service groups for 5 different degrees of dependency (i.e., 0, 0.1, 0.5, 0.7 and 1). This setting is used to comprehensively test the influence of the degree of dependency on the trust value of a service group.
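The three workflows can be written down as dependency maps to make their structure explicit. The sketch below is illustrative only. The dictionary encoding, the names `SCENARIOS`, `count_dependencies` and `longest_chain`, and the numbering S1 to S5 in the sequential workflow are assumptions. The edges of the hierarchical and concurrent workflows follow the description in Section 5.2.4 (S 3 depends on S 1 and S 2, S 5 depends on S 3 and S 4; in Scenario 3, S 5 depends on S 1 to S 4).

```python
# Each map sends a service to the list of services it depends on.
SCENARIOS = {
    "sequential": {"S1": [], "S2": ["S1"], "S3": ["S2"], "S4": ["S3"], "S5": ["S4"]},
    "hierarchical": {"S1": [], "S2": [], "S3": ["S1", "S2"], "S4": [], "S5": ["S3", "S4"]},
    "concurrent": {"S1": [], "S2": [], "S3": [], "S4": [], "S5": ["S1", "S2", "S3", "S4"]},
}


def count_dependencies(depends_on):
    """Number of dependency relationships (edges) in a workflow."""
    return sum(len(deps) for deps in depends_on.values())


def longest_chain(depends_on):
    """Number of edges on the longest dependency chain; assumes no cycles."""
    memo = {}

    def depth(service):
        if service not in memo:
            # A service without dependencies has depth 0.
            memo[service] = max(
                (1 + depth(dep) for dep in depends_on[service]), default=0
            )
        return memo[service]

    return max(depth(service) for service in depends_on)


summary = {
    name: (count_dependencies(wf), longest_chain(wf))
    for name, wf in SCENARIOS.items()
}
```

Under this encoding each workflow has 4 dependency relationships, while the longest chain is 4 edges in the sequential workflow, 2 in the hierarchical one and 1 in the concurrent one. The chain length is a structural observation offered here, not a quantity used by the GTrust model.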

5.1.1 Trust value transfer function

Since the GTrust model and the REGRET model use different representations of trust values, we define a trust value transfer function to convert trust values of the GTrust model into trust values of the REGRET model. The trust value transfer function is as follows.

$$ Trans(Rf) = \frac{{\sum}_{j=1}^{n}R_{j}}{n-{\sum}_{j=1}^{n} MS_{j}}, $$
(22)

where R f is a reference report of the GTrust model (see Definition 2), R j is the rating of the reference report on the j th attribute of the service request, n is the number of attributes in the service request, and M S j records whether the reference report is missing the j th attribute of the service request (i.e., it is 1 if the attribute is missing and 0 otherwise), so that the denominator is the number of requested attributes that the reference report covers.
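As an illustration of (22), the following minimal sketch computes the transferred trust value of one reference report and the average over several reports. It rests on assumptions that are not stated in the source: a missing attribute is encoded as `None`, a missing rating contributes nothing to the numerator, and the group value is taken as the plain mean over reports. The function names `trans` and `group_average` and the example ratings are hypothetical.

```python
def trans(ratings):
    """Equation (22): sum of the ratings divided by the number of covered attributes.

    `ratings` holds one entry per attribute of the service request;
    None marks an attribute that the reference report does not cover.
    """
    covered = [r for r in ratings if r is not None]
    if not covered:
        # n minus the number of missing attributes would be zero here.
        raise ValueError("reference report covers no requested attribute")
    return sum(covered) / len(covered)


def group_average(reports):
    """Plain mean of the transferred values of a list of reference reports."""
    return sum(trans(r) for r in reports) / len(reports)


# A hypothetical report that covers 3 of the 5 requested attributes.
single = trans([0.6, 0.4, None, 0.9, None])

# Two hypothetical reports of one service group.
group = group_average([[0.6, 0.4, None, 0.9, None], [0.8, None, None, None, None]])
```

For the single report, the denominator is 5 - 2 = 3, so the result is 1.9 / 3, roughly 0.63.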

5.2 Experimental results and analysis of experiment 1

This subsection gives the experimental results and analysis of Experiment 1 in detail.

5.2.1 Experimental results and analysis in scenario 1

Table 2 shows the functionality coverage, group similarities and ratings of the 4 service groups in the GTrust model for the 5 degrees of dependency in Scenario 1 (Figure 4a). Since the results of a service group on the 5 attributes of the service request are the same, we use a single value to represent them all. Tables 4 and 6 follow the same format as Table 2.

Table 2 The functionality coverage, similarities and ratings of 4 service groups

Table 3 shows the trust values of the 4 service groups calculated by the GTrust model and the REGRET model for the 5 degrees of dependency in Scenario 1. The trust values marked with (C) indicate the service groups selected by the GTrust model and the REGRET model, respectively.

Table 3 The trust values of 4 service groups

Figure 5 shows the trust values of G 2 and G 4 evaluated by the GTrust model and the REGRET model for the 5 degrees of dependency in Scenario 1. In Figure 5, the X-axis represents the degree of dependency, while the Y-axis represents the trust values of the service groups.

Figure 5 The trust values of service groups selected by the two trust models

From Table 3, we can see that every service group except G 2 has its own advantage with regard to the service request. G 1 has the highest functionality coverage on the attributes of the service request, G 3 has the highest similarity of priority distribution with the service request, and G 4 has the highest ratings on the service request. The REGRET model always selects the service group with the highest average rating in its reference reports. Therefore, in Scenario 1, the REGRET model always selects G 4, since G 4 has the highest average rating among the 4 service groups. The GTrust model considers 4 main factors, namely the functionality coverage, the dependency relationships, the similarity and the ratings, which affect the performance of a service group from different perspectives. After the trust value calculation, the GTrust model selects G 2 for the service request: although G 2 only has the second highest value on each main factor, it obtains the best trust value when the 4 main factors are considered together.

Now, we analyse the selection results of the two models from 4 perspectives to see which service group is more suitable for the service request:

  • (1) Workflow and dependency relationships. From Figure 4a, we can see that all 4 service groups have a sequential workflow in this scenario, i.e., each individual service depends on the preceding individual service. In other words, the performance of the preceding individual service affects the performance of the following individual service. From Figure 5, we can clearly see that the degree of dependency has an impact on the group trust value calculation in the GTrust model. The experimental results of the GTrust model in Scenario 1 show that when the degree of dependency (λ) among individual services in a group increases, the group trust value decreases. For example, we can see from Table 3 that when the degree of dependency increases from 0 to 1, the trust value of G 2 decreases from 0.34 to 0.16 and the trust value of G 4 decreases from 0.12 to 0.08. Figure 5 also indicates that the trust values of G 2 and G 4 calculated by the REGRET model remain constant, i.e., 0.6 and 0.8, respectively, for all 5 degrees of dependency. In real-life applications, a service group always has a workflow and dependency relationships among its individual services. With this consideration, the trust calculation process of the GTrust model is closer to realistic situations than that of the REGRET model.

  • (2) Similarity of priority distributions. The reference reports can partially reflect the potential performance of individual services on the service request. The more similar the priority distribution of the service request is to that of the reference reports of a service group, the better the reference reports reflect the potential performance of the service group on the service request. From the priority distribution of the 5 attributes of the service request (see Subsection 5.1), we can see that for the reference reports of a service group to be similar to the service request, they must have the highest priority values on the ‘colour’ and ‘warranty’ attributes, the second highest priority value on the ‘quality’ attribute, followed by the ‘speed’ and ‘cost’ attributes. Table 1 shows the priority distributions in the reference reports provided by the individual services of the 4 service groups. In G 4, only reference reports 51, 53 and 57 have the highest priority values on the ‘colour’ attribute and only reference reports 49, 55 and 59 have the highest priority values on the ‘warranty’ attribute. In contrast, in G 2, reference reports 19, 20, 21 and 25 have the highest priority values on the ‘colour’ attribute and reference reports 22, 23, 24, 26, 27, 29 and 30 have the highest priority values on the ‘warranty’ attribute. Comparing the priority distributions of the reference reports of G 2 and G 4, we can see that the reference reports of G 2 reflect the potential performance of the service group on the service request better than those of G 4.

  • (3) Functionality coverage. The functionality coverage value reflects whether a service group can satisfy the functionalities of the service request, in other words, whether the service group covers all attributes of the service request. Table 1 also shows the functionality coverage of the individual services in the 4 service groups. In G 4, the reference reports (reference reports 46 to 60) miss many priority values and ratings on the attributes of the service request, and the functionality coverage of G 4 on the service request is only 20 %. In G 2, although the reference reports (reference reports 16 to 30) also miss some priority values and ratings on the attributes of the service request, the functionality coverage of G 2 on the service request is 60 %, which is much higher than that of G 4. Therefore, the potential performance of the individual services in G 2 on the service request is more predictable than that of the individual services in G 4.

  • (4) Final decision making. The REGRET model selects G 4 based on its average rating (0.8). The GTrust model selects G 2 not only based on its average rating (0.6), but also based on its workflow, the similarity of priority distributions as well as its functionality coverage on the service request.

Therefore, the service group selected by the GTrust model is more reasonable than that selected by the REGRET model in Scenario 1.

5.2.2 Experimental result and analysis in scenario 2

Table 4 shows the results of the functionality coverage, group similarities and ratings of 4 service groups in the GTrust model with 5 degrees of dependency in Scenario 2 (Figure 4b).

Table 4 The functionality coverage, similarities and ratings of 4 service groups

Table 5 shows the trust values of the 4 service groups calculated by the GTrust model and the REGRET model for the 5 degrees of dependency in Scenario 2. The trust values marked with (C) indicate the service groups selected by the GTrust model and the REGRET model, respectively.

Table 5 The trust values of 4 service groups

Figure 6 shows the trust values of G 2 and G 4 evaluated by the GTrust model and the REGRET model for the 5 degrees of dependency in Scenario 2. In Figure 6, the X-axis represents the degree of dependency, while the Y-axis represents the trust values of the service groups.

Figure 6 The trust values of service groups selected by two trust models

In this scenario, the GTrust model again selects G 2, while the REGRET model again selects G 4. Using the same analysis as in Scenario 1, we can see that the service group selected by the GTrust model is more reasonable than that selected by the REGRET model in Scenario 2.

5.2.3 Experimental result and analysis in scenario 3

Table 6 shows the results of the functionality coverage, group similarities and ratings of 4 service groups in the GTrust model with 5 degrees of dependency in Scenario 3 (Figure 4c).

Table 6 The functionality coverage, similarities and ratings of 4 service groups

Table 7 shows the trust values of the 4 service groups calculated by the GTrust model and the REGRET model for the 5 degrees of dependency in Scenario 3. The trust values marked with (C) indicate the service groups selected by the GTrust model and the REGRET model, respectively.

Table 7 The trust values of 4 service groups

Figure 7 shows the trust values of G 2 and G 4 evaluated by the GTrust model and the REGRET model for the 5 degrees of dependency in Scenario 3. In Figure 7, the X-axis represents the degree of dependency, while the Y-axis represents the trust values of the service groups.

Figure 7 The trust values of service groups selected by two trust models

In this scenario, the GTrust model still selects G 2, while the REGRET model still selects G 4. Using the same analysis as in Scenario 1, we can see that the service group selected by the GTrust model is more reasonable than that selected by the REGRET model in Scenario 3.

5.2.4 The impact of workflows and dependency relationships on the service group selection

In the experiment, we used three scenarios with three different workflows and 5 different degrees of dependency. Now, we analyse the potential impact of different workflows and degrees of dependency on the group trust values. Figure 8 shows the changes in the trust values of G 2 and G 4 evaluated by the GTrust model and the REGRET model for the 5 degrees of dependency in the three scenarios, where the X-axis represents the degree of dependency and the Y-axis represents the trust values of the service groups.

Figure 8 The trust values of service groups evaluated by two trust models in three scenarios

From Figure 8, we can clearly see that although the degree of dependency changes in the three scenarios, the trust values of G 4 evaluated by the REGRET model remain constant, while the trust values of G 2 evaluated by the GTrust model change with the scenario. In Scenario 1, the 5 individual services of G 2 are in a sequential workflow (see Figure 4a) and there are 4 dependency relationships among the individual services. The trust value of a service group in Scenario 1 depends heavily on these 4 dependency relationships, so Figure 8 shows that the trust value of G 2 decreases significantly as the degree of dependency (λ) increases. In Scenario 2, there are also 4 dependency relationships among the individual services in G 2. However, the dependency relationships among the individual services differ from those in Scenario 1. For example, S 3 depends on both S 1 and S 2, which means that each of S 1 and S 2 only has a partial influence on S 3. The same relationship holds between S 5 and the services S 3 and S 4. Scenario 2 therefore has weaker dependency relationships among individual services, so when the degree of dependency (λ) increases, the trust value of G 2 decreases, but more slowly than in Scenario 1. Scenario 3 has the slowest decrease in trust values as the degree of dependency increases, because each of the individual services S 1 to S 4 only has a partial influence on S 5. From the above analysis, the trust value calculated by the GTrust model is more reasonable than that calculated by the REGRET model, because the GTrust model takes the workflow and dependency relationships of a service group into account.

In summary, the trust value of a service group is impacted not only by the ratings in reference reports, but also by the workflow and dependency relationships among individual services of the group.

5.3 Experimental settings of experiment 2

In Experiment 2, one consumer and 50 individual services participate. The 50 individual services form 10 service groups, each of which contains 5 individual services. The service in Experiment 2 also has 5 attributes (i.e., cost, speed, quality, colour and warranty).

In the experiment, the consumer sends 100 service requests in total, each of which contains only 3 attributes, randomly selected from the 5 attributes of the service. The priority values of the 3 attributes are randomly drawn from values in [0, 1] with one decimal place, and the priority values of the 3 attributes sum to 1. Each service request only requires 1 reference report from each individual service (i.e., R N = 1). In order to remove the impact of the workflow and dependency relationships on the service group selection, all 10 service groups use the same workflow and degree of dependency (see Definition 4), namely the workflow in Figure 4a with a degree of dependency of 0.5 (i.e., λ = 0.5).
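The request generation described above can be sketched as follows. This is one possible sampling scheme under stated assumptions, not the procedure used in the experiment: the source does not say how priority values are drawn so that they sum to 1, so the sketch splits ten tenths among the 3 attributes using random cut points, which allows a priority of 0 because the stated range is [0, 1]. The names `ATTRIBUTES` and `random_request` and the seed are hypothetical.

```python
import random

ATTRIBUTES = ["cost", "speed", "quality", "colour", "warranty"]


def random_request(rng, k=3):
    """Draw k distinct attributes with one-decimal priorities that sum to 1."""
    attrs = rng.sample(ATTRIBUTES, k)
    # Split 10 tenths into k non-negative parts with k - 1 sorted cut points.
    cuts = sorted(rng.randint(0, 10) for _ in range(k - 1))
    bounds = [0] + cuts + [10]
    tenths = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
    return {attr: t / 10 for attr, t in zip(attrs, tenths)}


rng = random.Random(0)  # fixed seed so that a run can be repeated
requests = [random_request(rng) for _ in range(100)]
```

Working in integer tenths and dividing at the end keeps every priority at exactly one decimal place and avoids rounding a floating-point sum back to 1.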

At the beginning of the experiment, each individual service has only 1 reference report, which also contains 3 random attributes. The priority values of these 3 attributes are created in the same way as those in the service requests. The ratings of the 3 random attributes in a reference report are randomly drawn from values in [0, 1] with one decimal place. For each service request, the consumer selects a service group with the GTrust model and with the REGRET model, respectively. In the REGRET model, the trust values of service groups are calculated from reference reports based on (22). After the selected service groups complete the service request, the consumer employs the evaluation module of the GTrust model to create new reference reports based on the performance of the service groups. These reports are recorded by the individual services of the service groups and used to reply to subsequent service requests.

The functionality coverage, group similarities and ratings of the service groups selected by the two models are the three indicators in this experiment. In addition, the number of reference reports in each service group after the 100 service requests is recorded to determine how many times each service group is selected by the two models in the experiment.

5.3.1 The performance estimation function

In this experiment, how to estimate the performance of the selected service groups on the service request is a problem, because the estimate influences the reference reports created by the consumer and therefore the subsequent service group selections. In order to estimate the performance, we first assume that all ratings in reference reports are evaluated by the consumer objectively, which means that the ratings in reference reports are only related to the performance of the individual services. Based on this assumption, we can directly estimate the performance of an individual service on the attributes of the service request from the ratings of its reference report. In addition, if an attribute is not covered by the reference report of an individual service, the performance of the individual service on that attribute is estimated as the average of the existing ratings on the same attribute in the other reference reports. The reason for this estimation is to eliminate the influence of missing ratings on the performance of the service group, which is the same as the setting in the group rating calculation (see Section 4.3.3). Based on the above assumption and setting, the performance estimation function is proposed as follows.

$$ AP_{j}= \frac{{\sum}_{r=1}^{b} R_{rj}}{b}, $$
(23)

where A P j is the estimated performance of a service group on the j th attribute of the service request, R r j is the rating of the r th reference report on the j th attribute of the service request, and b is the number of reference reports of the service group.
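The performance estimation in (23), together with the treatment of missing ratings described above, can be sketched as follows. The encoding is an assumption: each reference report is a dictionary from attribute name to rating, an attribute that is absent from the dictionary is treated as not covered, and "the other reference reports" is read as the other reports of the same service group. The function name `estimated_performance` and the example ratings are hypothetical.

```python
def estimated_performance(reports, attribute):
    """Equation (23) for one attribute, with missing ratings filled in.

    `reports` is a list of dicts mapping attribute name to rating.
    Returns None when no report covers the attribute, a case the
    source does not discuss.
    """
    known = [r[attribute] for r in reports if attribute in r]
    if not known:
        return None
    # Missing ratings are replaced by the mean of the existing ratings.
    fill = sum(known) / len(known)
    filled = [r.get(attribute, fill) for r in reports]
    # b in (23) is the number of reference reports of the service group.
    return sum(filled) / len(filled)


# Three hypothetical reference reports of one service group.
reports = [
    {"cost": 0.6, "speed": 0.4},
    {"cost": 0.8},
    {"speed": 0.2, "quality": 0.9},
]
cost_estimate = estimated_performance(reports, "cost")
```

In this example the missing 'cost' rating is filled with 0.7, so the estimate is (0.6 + 0.8 + 0.7) / 3 = 0.7. Filling with the mean of the existing ratings leaves the average unchanged, which is how the missing ratings are kept from distorting the estimate.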

5.4 Experimental results and analysis of Experiment 2

Figure 9 shows the average values of the functionality coverage, group similarities and ratings of the service groups selected by the GTrust model and the REGRET model, respectively. In Figure 9, the X-axis represents the three indicators of the service groups, while the Y-axis represents the average values of the three indicators of the service groups.

Figure 9 The average values of the three indicators of the service groups selected by two models

From Figure 9, it can be seen that the functionality coverage and group similarities of the service groups selected by the GTrust model are higher than those of the service groups selected by the REGRET model. This is because functionality coverage and group similarity are both factors in the trust calculation of the GTrust model, whereas the REGRET model does not take them into account. However, the ratings of the service groups selected by the GTrust model are slightly lower than those of the service groups selected by the REGRET model. This is because the GTrust model considers not only the ratings of service groups but also their functionality coverage and group similarities during the trust calculation, so the service groups it selects do not always have the highest ratings, but have the highest trust value when functionality coverage, group similarities and ratings are considered together. In contrast, the service groups selected by the REGRET model always have the highest ratings among all service groups, because the REGRET model only considers the ratings of service groups during the trust calculation. Since the GTrust model considers three factors rather than one during the trust calculation, the service groups it selects are more reasonable than those selected by the REGRET model.

Figure 10 shows the number of reference reports of each service group after the 100 service requests. In Figure 10, the X-axis represents the ten service groups, while the Y-axis represents the number of reference reports in each of the ten service groups after the experiment.

Figure 10 The numbers of reference reports in the ten service groups after the experiment

From Figure 10, it can be seen that under the GTrust model, every service group has more than 1 reference report, which means that every service group was selected for some of the service requests. Because the GTrust model considers functionality coverage, group similarities and ratings together during the trust calculation, it can find the service group that best matches the attributes and priority distribution of each service request. Therefore, the GTrust model selects different service groups for the 100 different service requests. However, under the REGRET model, only G 1 and G 2 have more than 1 reference report, which means that only G 1 and G 2 were selected for the 100 service requests, and G 1 (with 90 reference reports) was selected for about 90 % of them. This is because the REGRET model only considers the ratings of the service groups during the trust calculation. Even though the priority distributions of the 100 service requests differ, the ratings in the reference reports provided by the service groups remain the same. Therefore, G 1 and G 2, which have the highest ratings among all service groups, were always selected by the REGRET model. The GTrust model selects a suitable service group based on the attributes and priority distribution of each service request, so the service groups selected by the GTrust model are more reasonable than those selected by the REGRET model.

6 Related work and comparison

First, J. Sabater and C. Sierra proposed the REGRET trust model in 2001 [12], which includes a simple way to evaluate group trust by averaging the trust values of the individual services in the group. The REGRET model evaluates the trust value of a service provider from three perspectives: the individual dimension, the social dimension and the ontological dimension. The individual dimension is the direct reputation of the service provider, given by service consumers who have previously interacted with the provider. The social dimension is the reputation of the group to which a service provider belongs. The ontological dimension represents the reputations of different aspects of the services offered by the provider. Even though the REGRET model uses the social dimension to reflect the influence of a group's reputation on the trust evaluation of an individual provider, it does not consider the relationships among services in the group. In real-life applications, a complex service may consist of a number of services with different relationships, and these services can be owned by different providers. A good trust model for group trust evaluation needs to take these relationships into account. The REGRET model neglects these factors, so it cannot work properly for service group selection when services are provided by different providers. Our model overcomes these limitations by analysing the workflow and dependency relationships among the individual services of a service group and by introducing the workflow description and the degree of dependency into the trust calculation.

Another well-known model for trust calculation is the certified reputation (CR) model proposed by Huynh et al. [7]. In the CR model, an agent’s reputation is derived from the references of third parties that have previously interacted with the provider. A provider can collect such references and present them to consumers in order to be trusted by them. Since the CR model allows consumers to evaluate trust themselves without using a central controller, it can be adopted in a wide range of open and dynamic environments such as web-based service-oriented environments. However, the CR model still has some limitations. First, in the CR model, a service is represented by a single item and the evaluation of the service given by a referee is represented by a single value. In real-world applications, it is difficult or even impossible to use a single value to represent the complex contexts related to a service [14]. A service provider’s performance should be evaluated from different aspects such as speed, cost, quality and reliability. In addition, the evaluation result may also depend on the service request and the preferences of consumers. Second, the CR model only focuses on the trust evaluation of an individual service from a single provider, so it cannot handle the problem of group trust evaluation for multiple providers. In the GTrust model, we take the concept of third-party references from the CR model to evaluate the performance of a provider on rich-context services, and then evaluate the group trust value of a group of providers based not only on the performance of individual providers but also on the dependency relationships among the individual services in the group.

Another related work is the priority-based trust (PBTrust) model proposed by Su et al. in 2010 [18, 19]. In the PBTrust model, the reputation of a service provider is represented in a rich context format, including the attributes of the service, the priority distribution of the attributes and a rating value for each attribute from a third party. The concept of ‘similarity’ is also introduced to measure the difference in the priority distribution of attributes between the requested service and a refereed service, in order to predict the performance of a potential provider on the requested service more precisely. In addition, the concept of the previous performance of a service provider on a service is introduced to support the trust evaluation of a potential provider. Although the PBTrust model overcomes several limitations of the CR model, it still focuses on the trust evaluation of individual providers. The GTrust model is based on a partial extension of the PBTrust model and focuses on group trust evaluation based on the four factors introduced in Section 1.

Finally, Su et al. proposed an innovative trust model for service group selection in service-oriented environments in 2011 [17]. In their approach, the reputation of each individual service is obtained from its former service consumers and recorded in the form of reference reports, as in the CR model. In addition, the innovative trust model uses the rich context format and the similarity concept of the PBTrust model, rather than a single value, to represent the reputation of an individual service. In order to evaluate the trust value of a service group, the innovative trust model introduced the concepts of the workflow description and the degree of dependency to describe the relationships among individual services in a service group. However, in the innovative trust model, each individual service only provides its best historical reference report to the service consumer to indicate its potential performance on the service request. In addition, the innovative trust model does not describe how to create the individual reference reports based on the performance of the service group. In the GTrust model, the request module requires a certain number of reference reports from each individual service in the group, and the reply module of a service group should provide that number of reference reports satisfying the service request, so as to enable the consumer to comprehensively evaluate the potential performance of the service group. Moreover, the evaluation module of the GTrust model can generate individual reference reports for individual services based on the group reference report, the workflow and the dependency relationships among the individual services of the group.

7 Conclusion and future work

In this paper, we proposed the GTrust model for service group selection in web-based service-oriented environments. This model is innovative because: (1) the GTrust model uses the ‘functionality coverage’ value to represent the functionalities that a potential service group can provide in response to the request from the consumer; (2) the GTrust model introduces the concept of ‘degree of dependency’ to represent relationships among individual services in a service group; (3) the GTrust model uses the concept of ‘third party reference’ to represent the historical performance of individual services in a service group; and (4) the GTrust model uses the concept of ‘similarity’ to measure the similarity in priority distributions of attributes between the service request and reference reports. The experimental results indicate that the GTrust model performs well for service group selection in three scenarios. In this study, we used the group performance evaluated by the consumer as the reference report for each individual service of the group without considering the different roles of each individual service. In our future work, we will handle the trust calculation for groups of services with cyclic workflows and apply learning approaches to our trust model to analyse the role of each individual service in a group.