Introduction

Over the last couple of decades, we have seen a major shift from an industrial economy to that of an information economy. This leads to new technologies to help capitalize on the information economy. A virtual enterprise (VE) is an emerging business cooperation model which enables rapid response to the unpredictable market behavior and opportunity (Camarinha-Matos and Afsarmanesh 2006; Goranson 1999). Based on the advanced information network, it is a temporary alliance of enterprises that come together to share core competencies and resources in order to fulfill a specific business task.

Service composition is about how to build a new value-added service by using existing services and it is one of the core concepts in service oriented architecture (SOA) (Erl 2005). With the SOA concept being widely accepted and utilized in enterprise informatization and enterprise application integration, more and more enterprises encapsulate their computing resources as services and publish them online (Gao et al. 2011; Janssen 2008; Leitão et al. 2012; Villaseñor Herrera et al. 2012; Wang et al. 2012; Yang et al. 2012). Thus, for service oriented enterprises, establishing a virtual enterprise can be regarded as a process of service composition. According to certain business logic, a group of existing services can be integrated into a value-added service to complete a more complicated mission.

In the temporary organization of virtual enterprise, quality of service (QoS) is the key point to ensure all participating enterprises to have a good cooperation and thus is crucial for virtual enterprises to meet the quality requirements from consumers. QoS criteria include domain-independent ones such as response time, price, reliability, and domain-dependent ones like precision and refresh frequency for an exchange rate inquiry service. Consumers have diverse preferences for QoS criteria, e.g. one consumer can bear a composite service with a response time of 1 s at most while another consumer deems service price more important than response time. The composite service should be orchestrated with optimal overall QoS on the basis of consumers’ personalized requirement. This is the so-called optimal service selection problem (Wang et al. 2011).

With increasing numbers of available services providing similar functionalities but with different QoS values (Xu and Wang 2011), the search space for the global optimal solution of this problem is quite large. Moreover, there are potential business correlations among services, which make the search more difficult. For example, a flight booking service charges fewer fees if payment is done by credit cards of some specific banks, according to an agreement between them, or, a manufacturer may only choose to cooperate with a number of logistics companies s/he trusts, rather than all the candidate logistics companies. Therefore, this correlation-driven optimal service selection problem is not trivial.

A great deal of research has been conducted on the problem of optimal service selection for service composition by adopting different mathematic models and algorithms to optimize the overall QoS (Strunk 2010). However, most of these approaches assume services involved in the composition are independent from each other and hence fail to address aforementioned business correlations in reality. In response, we formally propose a business correlation model including quality correlations and selection correlations for two correlated abstract services in this paper, and then present an efficient approach for correlation-driven optimal service selection based on a genetic algorithm. The genetic algorithm is specially tailored with niching technology to avoid premature convergence, and with repair operator and penalty mechanism to address the infeasible solution problem. Empirical studies are conducted at last based on a real-world web service QoS dataset and the experiments show the performance of the proposed approach, in terms of optimality and computation time.

The remainder of this paper is organized as follows: “Literature Review” section discusses the related work. The third section formulates the business correlation model and the correlation-driven optimal service selection problem for service composition, while the fourth section presents a selection approach based on the genetic algorithm. Empirical studies are shown in the fifth section. Finally, the last section concludes our work and gives future work.

Literature review

The optimal service selection problem for service composition has gained much attention in recent years (Strunk 2010). Zeng et al. (2004) present an extensible QoS model and propose two approaches to address the problem: local optimization and global optimization. The former one is quite efficient, but it can not guarantee global optimality or support global QoS constraints. The latter one uses mixed integer programming to seek the globally-optimal components but suffers from poor scalability because of its exponential computational complexity and is thus impractical for real-time and dynamical applications.

Kinds of efficient heuristic algorithms for the problem are proposed to find a suboptimal solution. Canfora et al. (2005) propose an approach based on genetic algorithm, where the fitness is the aggregated QoS of the composite service. The fitness value increases from generation to generation and the fittest individual represents the optimal solution. The genetic algorithm for optimal service selection problem is also improved by some other authors (Gao et al. 2007; Ma and Zhang 2008; Syu et al. 2011; Wu et al. 2012). Yu et al. (2007) propose two models for this issue: a combinatorial model and a graph model, and then introduce two algorithms for each model. All the four algorithms are compared and their usage contexts are also suggested. Alrifai et al. (2010) leverage skyline technology as a preprocessing step to remove non-interesting candidates in the search space and therefore the required selection time is reduced. The authors also propose an approach by combining global optimization with local selection techniques to solve this problem (Alrifai et al. 2012). Mixed integer programming is first employed to seek the best decomposition of global QoS constraints into local QoS constraints and then local selection is performed to find the best services satisfying these local constraints for each abstract task. Klein et al. (2012) propose a network-aware approach to handle the QoS of services and the QoS of the network in the cloud independently. Selection algorithm then leverages this model to find compositions that will result in a low latency given an employed execution policy.

The above selection approaches neglect correlations in service composition and assume selection of services is independent from each other. Thus, they are not capable of handing the potential correlations among services.

Service correlations have different meanings in different contexts. Maamar et al. (2011) focus on statistical correlations that two or more services have been often banded to execute a task or have been replaced by each other in the past, and based on the statistical correlations, the social network of services is established. In context-based service selection (Lin et al. 2012; Maamar et al. 2007; Yu and Reiff-Marganiec 2009), service correlations are also studied. Yu and Reiff-Marganiec (2009) analyze correlations between context factors and QoS parameters, and then QoS values are predicted based on the context factors. Lin et al. (2012) considers the composition context and complex dependencies between services, and presents a novel service selection algorithm. Transactional correlations are studied in (Gabrel et al. 2012; Syu et al. 2011; Wu and Zhu 2013) and these correlations concern how to orchestrate services with different transactional properties in a transactional manner. The authors in (Guo et al. 2010) and (Lecue and Mehandjiev 2011) focus on compatibility correlations in services, i.e., whether their interfaces are compatible. Based on semantic similarity degree, an approach to calculate the correlation degree is given. Pedrinaci et al. (2010) describe a services publishing platform in order to better support service discovery, where linkage correlations between services are exploited. It exposes service descriptions as linked data expressed in terms of a simple vocabulary for describing services of different kinds with annotations in diverse formalisms. Gu et al. (2010) give a formalism of service data correlation to describe data mappings among the input and output attributes of services and this formalism is then applied in data-driven automatic service composition.

In this paper, we investigate how to establish virtual enterprises through service composition and therefore we mainly focus on business correlations among candidate services. To the best of our knowledge, only a few works take into account business correlations when investigating this optimal service selection problem. Ye et al. (2008) study business entity relation among composite services and present a systematic approach to construct correlation-driven QoS description to characterize the QoS dependence of an individual service on the other related services. Tao et al. (2010) present a QoS description mode supporting resource service correlation, and suggest a resource composition method based on the particle swarm optimization to address the multi-objective grid resource service composition and optimal selection problem. Barakat et al. (2012) model dependencies among services and these correlations are considered during composite service selection. Correlation-aware search space reduction techniques are introduced to prune out uninteresting service compositions, including correlation-aware domination pruning and inter-task pruning.

What specifically distinguishes our work from the above similar works in the literature are: (1) a comprehensive business correlation model including both quality correlations and selection correlations (dependencies), presented in a formal way; (2) a tailored genetic algorithm equipped with the niching technology, a repair operator and a penalty mechanism to solve the correlation-driven optimal service selection problem; (3) empirical studies which show the effectiveness and efficiency of the approach.

Correlation-driven service selection model

Motivating example

In order to describe the correlation-driven optimal service selection problem, a virtual enterprise for parts sale (VEPS) aiming at car manufactories is presented. Its core process is shown in Fig. 1, which consists of five key abstract component services, namely, OS, CBS, DS, IS, PS, and a number of abstract services for parts purchase (e.g. TPS, EPS, BPS). For each abstract service, there is a set of concrete service instances, which can fulfill the same functionality but have different levels of QoS.

Fig. 1
figure 1

A virtual enterprise for parts sale (VEPS)

The process is described in detail as follows.

  1. (1)

    Through the order service (OS), the car manufacturer submits purchase request, which manifests the list of required car parts, quantity, delivery date, etc. The request can be submitted through kinds of online transaction websites, or directly through phone.

  2. (2)

    The core business service (CBS) analyzes contents of the order and then invokes relevant parts purchase services. For each parts purchase service, there may be multiple concrete candidate services available supporting its functionality.

  3. (3)

    Then the manufacturer uses the delivery service (DS) to carry the purchased cargoes to the specific destination. The delivery service can be offered by logistics companies like COSCO, UPS, FedEx, EMS, TNT, ANA, and so on.

  4. (4)

    In order to share delivery risk, the manufacturer selects insurance agents to sign cargo insurance through insurance services (IS). Insurance companies such as PICC, Allianz and AXA can provide this service.

  5. (5)

    Finally, pay for all the car parts and insurance fee via payment service (PS). Its candidate services include VISA, MasterCard, JCB, UnionPay, PayPal, etc.

During the actual execution, each abstract service in the core process is dynamically configured to an instance service to fulfill VEPS’s functionality. Since there are numerous candidate services with different QoS values for each abstract service, it is significant to optimize the overall QoS of the instantiated composite service on the basis of consumers’ personalized preferences, through service selection. This optimal service selection problem is not trivial as its search space grows exponentially with the number of component services and the number of candidates. For example, given that the composite service consists of \(n\) abstract component services, and there are \(m\) candidate services for each one, the size of the search space is \(m^{n}\).

Furthermore, candidate services may be correlated with each other in a business way and thus the search for the global optimal solution becomes even harder. The business correlations in service composition are first elaborated in the next subsection, and then the correlation-driven service selection problem is formally described in “Problem Statement” section. To facilitate the statement, notations and explanations of key elements are shown in Table 1.

Table 1 Notations and explanations

Business correlation model

As have been discussed in “Literature review” section, what service correlations specifically refer to can be different in different contexts. Since we aim to establish virtual enterprises via service composition in this paper, we mainly focus on business correlations among candidate services. According to the target object and application scenario, business correlations are divided into two types: selection correlations (SC) and quality correlations (QC). For the sake of focus and clarity, the business correlation model in this paper is limited to two abstract services, i.e., only two abstract services are involved in a specific correlation.

Selection correlations represent correlations between selection results of different components in the composite service. That is, the selected instance service for one abstract service can determine whether or not to select a particular candidate service for another abstract service in the same composite service. This kind of correlations originates from the following three aspects.

  1. 1.

    Business strategy. It is common in reality for enterprises to decide to cooperate with only some enterprises, or not to cooperate with them in a particular area for a certain business purpose. For example, in Fig. 1, the DS provider FedEx may be only willing to cooperate with insurance companies PICC, Allianz and AXA.

  2. 2.

    Bundling sale. By the bundling sale model, the enterprise (or some cooperating enterprises) does not sell its services separately, and once the consumer chooses to buy one service in the bundling, s/he should also pay for the other bundled ones. For example, the provider of \(\text{ EPS}_{BMW}\) also offers a delivery service \(\text{ DS}_{BMW}\), and these two services are bundled for sale. Thus, in the case that \(\text{ EPS}_{BMW}\) is selected for composition, \(\text{ DS}_{BMW}\) should also be selected, and vice versa.

  3. 3.

    Constraints stemming from business security or standard issues. A typical security constraint is separation of duties, which is a well-established practice in business security applications (Basin et al. 2011; Botha and Eloff 2001). In the context of service composition, separation of duties dictates that two component services must be delegated to two different providers to avoid potential fraud (Hwang et al. 2008). On the other hand, an opposite constraint binding of duties may be needed under other circumstances.

The selection correlations are further divided into three sub-types: codependency (C1), dependency (C2), anti-dependency (C3). They are respectively explained as follows.

  1. C1:

    \(S_{i,j} \leftrightarrow S_{m,n}\) applies to two single services \(S_{i,j}\) and \(S_{m,n}\), and it represents that once one of them is selected, the other one should also be selected.

  2. C2:

    \(S_{i,j} \rightarrow \{S_{m,n} \ldots S_{m,l}\}\) applies to a single service \(S_{i,j}\) and a list of services \(\{S_{m,n} \ldots S_{m,l}\}\). This correlation represents that when \(S_{i,j}\) is selected for \(AS_{i}\), the candidates of \(AS_{m}\) will be limited to the list of \(\{S_{m,n} \ldots S_{m,l}\}\).

  3. C3:

    \(S_{i,j} \mapsto \{S_{m,n} \ldots S_{m,l}\}\) is the opposite correlation of C2. It indicates that in the case that \(S_{i,j}\) is assigned to \(AS_{i}, AS_{m}\) can not be instantiated to any one in the list of \(\{S_{m,n} \ldots S_{m,l}\}\).

For example, the selection correlations of the composite service VEPS are described in the form of rules as follows.

figure a1

Quality correlations (QC) represents that service’s quality value depends on selection result or QoS of another candidate service. Quality correlations mainly originate from the business cooperation strategy between enterprises and are also further divided into two sub-types: selection-dependent (C4) and quality-dependent (C5).

  1. C4:

    \(S_{i,j} \rightarrow q_{t}(S_{m,n})\) represents \(q_{t}(S_{m,n})\) is set to another specific value, which is different from the default value, when \(AS_{i}\) is instantiated to \(S_{i,j}\).

  2. C5:

    \(q_{r}(S_{i,j}) \rightarrow q_{t}(S_{m,n})\) represents that when \(S_{i,j}\) is selected for \(AS_{i}\) and \(q_{r}(S_{i,j})\) satisfies a specific condition, \(q_{t}(S_{m,n})\) is assigned to a different value.

Quality correlations can be incorporated into the QoS description. An example of QoS description supporting quality correlations is illustrated as follows.

figure a2

The above QoS model describes three QoS parameters of \(DS_{FedEx}\): response time, reliability and price. The values of response time and reliability are not influenced by other services. However, the value of price has three options: (1) in the default case, it is 96 USD; (2) it turns to 95 USD when JCB is selected for the payment service; (3) it is assigned to 93 USD if AXA is selected for the insurance service and its price is larger than 8 USD.

The conclusion of the defined business correlations is presented in Table 2.

Table 2 Business correlations in service composition

Problem statement

The set of QoS criteria \(Q\) can be classified into two categories: positive and negative (denoted as \(Q^{+}\) and \(Q^{-}\), respectively). For positive criteria, larger values indicate higher performance (e.g. reliability and availability) while for negative criteria, smaller values indicate higher performance (e.g. price and response time). Since the QoS value of a component instance service also relies on quality correlations, these correlations should be first checked to determine the specific QoS value in the composite service, prior to further QoS aggregation. The QoS parameter value \(q_{t} (CCS_{k})\) of a concrete composite service \(CCS_{k}\) is determined by QoS values of its component services and composition structures involved (in this paper, we suppose services are independent with no redundancy). The aggregation functions for QoS parameters response time, price, reliability and availability under the structures of sequence and parallel are shown in Table 3. Other QoS parameters or composition structures such as exclusive choice or loop can be added into the model straightforward by specifying their aggregation functions. More aggregation functions can be found in (Canfora et al. 2005; Wang et al. 2011).

Table 3 Examples of QoS aggregation functions

The target for service selection is to optimize overall QoS of the resulting composite service. In order to facilitate ranking of different composite services in terms of QoS, the simple additive weighting (Zeleny 1982) is adopted as the QoS utility function to map the QoS vector into a real value. SAW includes two steps: scaling and aggregating. The former normalizes the QoS parameters values into real values between 0 and 1, through comparison with the maximal and minimal values. The latter multiplies each normalized value with a preference weight \(w_{t}\) and then aggregates them. The aggregated QoS values of a concrete composite service \(CCS_{k}\) are compared with minimum and maximum possible aggregated values in the ACS it belongs to. The minimum (or maximum) possible aggregated values can be easily estimated by aggregating the minimum (or maximum) value of each abstract service in the ACS. For example, the maximum price can be computed by summing up the price of the most expensive service candidate for each abstract service. Therefore, the overall QoS utility of \(CCS_{k}\) can be calculated in Eq. 1, where \(q_{t,max}\) and \(q_{t,min}\) denote the minimum and maximum possible aggregated values of the \(t\)th QoS parameter, respectively.

$$\begin{aligned} U(CCS_k )&= \sum _{q_t \in Q^{-}} {\frac{q_{t,\max } -q_t (CCS_k )}{q_{t,\max } -q_{t,\min } }.w_t}\nonumber \\&\quad +\sum _{q_t \in Q^{+}} {\frac{q_t (CCS_k )-q_{t,\min } }{q_{t,\max } -q_{t,\min } }.w_t } \end{aligned}$$
(1)

Besides, consumers may impose global QoS constraints to QoS parameters of the resulting composite service. For example, it may be specified that the reliability of the composite service should be larger than 95 % and the price should be less than 100 USD. For a negative QoS parameter, the constraint refers to an upper bound constraint and for a positive QoS parameter, the constraint refers to a lower bound constraint. Hence, the problem of correlation-driven QoS-aware optimal service selection is to find a \(CCS_{k}\) for ACS, where,

  1. 1.

    The QoS utility of \(CCS_{k}\) is maximized;

  2. 2.

    The aggregated QoS values satisfy consumer’s end-to-end QoS constraints;

  3. 3.

    Selection correlations among candidate services are not violated.

Selection approach using genetic algorithm

A genetic algorithm (GA) is a search heuristic that mimics the process of natural evolution and it is widely employed as the optimization algorithm (Engelbrecht 2007). Some researchers have applied GA to solve the problem of optimal service selection, and it achieves good performance (Canfora et al. 2005; Ma and Zhang 2008; Syu et al. 2011; Wu et al. 2012). However, standard genetic algorithm is prone to premature convergence on local optimum and all individuals in the population tend to be similar, resulting in early termination of evolution. Furthermore, when taking into account business correlations in service composition, the search for the optimal solution will be even more difficult, as SC makes many solutions in the search space infeasible, and QC makes search landscape more rugged.

Due to violations of selection correlations and consumers’ global QoS constraints, new individuals (i.e. concrete composite services in this paper) generated by the population initialization procedure, the crossover operator and the mutation operator may not be feasible. Conventionally, there are two constraint handling mechanisms in GA to address the infeasible individual problem. One is the penalty mechanism, that is, to give penalty to infeasible individuals when evaluating their fitness values. The other is the repair mechanism, that is, to use domain-specific knowledge to fix up those infeasible individuals in the population such that all the individuals in the population are always feasible. As individuals not satisfying end-to-end QoS constraints may have some genes that are essential to build the optimal composite service, penalty mechanism is adopted to handle them and it will be explained in “fitness function and sharing fitness function” section. Meanwhile, since individuals violating selection correlations can be fixed up according to correlation rules, the repair mechanism is employed for these infeasible individuals and it will be elaborated in “Repair operator” section.

Genetic encoding

In GA, a genome is a genetic representation of the solution. For the optimal service selection problem, a concrete composite service is encoded as a genome, and the genome is represented by an integer array with its length equal to the number of involved component services. The \(i\)th entry in the array, in turn, refers to the selection result of the abstract service \(AS_{i}\). That is to say, given that the value of the \(i\)th entry is \(j\), it indicates that \(S_{i,j}\) is selected to execute \(AS_{i}\). Figure 2 illustrates this genetic encoding.

Fig. 2
figure 2

Encoding of genome

Fitness function and sharing fitness function

The fitness function is defined over the genome and measures the fitness of the represented solution. As clarified in “Problem statement” section, the fitness of a \(CCS_{k}\) relies on its QoS utility, and whether consumer’s end-to-end QoS constraints and correlation rules are satisfied. Since the repair operator in our GA can guarantee that all individuals in the population do not violate the correlations, the fitness value can be estimated based on QoS utility value and penalty for QoS constraint violations, and it is defined in Eq. 2, where pnl is a negative number, representing the penalty value for one violation, and \(x_{t}\) is a Boolean value defined in Eq. 3, denoting whether the \(t\)th QoS constraint is satisfied.

$$\begin{aligned}&F(CCS_k )= U(CCS_k )+\sum _{t=1}^{|Q|} {pnl\times x_t }\end{aligned}$$
(2)
$$\begin{aligned}&x_t = \left\{ {\begin{array}{ll} 1\;\;\;\;\;if\;q_t \;\in Q^{+}\;and\;q_t <cst_t \;or\;\;q_t \;\in Q^{-}\;and\;q_t >cst_t \\ 0\;\;\;\;else \\ \end{array}} \right. \nonumber \\ \end{aligned}$$
(3)

Quality correlations make the search landscape more rugged and thus the search for the optimal solution becomes much more difficult. The niching technology for GA can maintain the diversity of a population and enable GA to explore more search space so as to obviate premature convergence and it is especially useful for multimodal optimization (Goldberg and Richardson 1987). The rationale behind it is to improve selection strategy to form niche environments for evolution. The fitness sharing method is one of the best-known and widely-used niching techniques. This method rewards individuals that uniquely exploit areas of the search space, while discouraging highly similar individuals.

In order to utilize fitness sharing method, the genome distance between two individuals \(CCS_{h}\) and \(CCS_{k}\) is first defined. It is estimated as the number of different genes in the same positions of the two genomes. Let \(CCS_{h}[i]\) be the value in the \(i\)th position of \(CCS_{h}\) genome, and the detailed calculation formula of genome distance is given in Eq. 4.

$$\begin{aligned}&d_{h,k} =\;\sum _{i=1}^n {y_i } \;\;\;\;\;\;\;\;where,\;\;\; \nonumber \\&y_i =\left\{ {\begin{array}{lll} 1\;\;\;\;\;if\;\;CCS_h [i]\ne CCS_k [i] \\ 0\;\;\;\;if\;\;\;CCS_h [i]=CCS_k [i] \\ \end{array}} \right. \end{aligned}$$
(4)

The sharing value between two genomes depends on their genome distance and the sharing radius. Let \(\sigma _{share}\) be the sharing radius, and the sharing value between \(CCS_{h}\) and \(CCS_{k}\) is calculated as follows:

$$\begin{aligned} sh_{h,k} =\left\{ {\begin{array}{ll} (\sigma _{share} -d_{h,k} )/\sigma _{share} &{} if\;\;d_{h,k} <\sigma _{share} \\ 0 &{}if\;\;d_{h,k} \ge \sigma _{share} \\ \end{array}} \right. \end{aligned}$$
(5)

The sharing fitness of an individual is defined in Eq. 6, where the denominator represents the crowdedness degree of the individual \(CCS_{k}\) in its niche. By adopting sharing fitness for selection operator in GA, highly similar individuals are discouraged.

$$\begin{aligned} F_{niche} \left( {CCS_k } \right) =\frac{F\left( {CCS_k } \right) }{\sum _{h\in population} {sh_{h,k} } } \end{aligned}$$
(6)

For instance, given that a population consists of three individuals: \(CCS_{h}, CCS_{j}\) and \(CCS_{k}\), their genomes are \(\{5, 3, 7, 6, 11, 8\}, \{3, 3, 5, 9, 11, 8\}, \{3, 2, 5, 9, 11, 8\}\), respectively, and their fitness values are 0.6, 0.8, 0.9, respectively. It can be calculated by Eq. 4 that \(d_{h,j} = 3, d_{h,k} = 4\), and \(d_{j,k} = 1\). Supposing \(\sigma _{share}\) is assigned to 2, by using Eq. 5, the sharing values between two different genomes are calculated: \(sh_{h,j} = 0, sh_{h,k} = 0\), and \(sh_{j,k} = 0.5\). Then, \(F_{niche} (CCS_{k}) = F(CCS_{k}) / (sh_{h,k} + sh_{j,k} + sh_{k,k}) = 0.9/1.5 = 0.6\). Similarly, it can be calculated that \(F_{niche} (CCS_{j})\) is 0.533 and \(CCS_{h}\) is 0.6. Thus, the two highly similar individuals \(CCS_{j}\) and \(CCS_{k}\) are suppressed to some extent, and the individual \(CCS_{h}\) which uniquely exploits areas of the search space is encouraged for evolution.

Genetic operators

The GA adopts roulette wheel selection, in which the individuals with higher sharing fitness will be more likely to be selected. The selected individuals are arranged in couple and each couple generates two offspring using the standard two-point crossover operator. Then the mutation operator randomly selects a position (i.e., an abstract service) in the newly-generated genome and randomly replaces the selected candidate service with another one in the candidate list. The elitist operator preserves the best solutions in the population by maintaining a group of them in the next generation.

Repair operator

The repair operator is executed after the initial randomly-generated population, and at the end of each evolution generation to fix up violations of selection correlations in an infeasible individual.

Two repair methods are utilized, namely, the optimistic repair operator and the pessimistic repair operator. The optimistic one is conducted by a succession of atomic repair operations. It tries to fix the individual up by replacing the selected service with an alternative one, when a pair of genes (instance services) which violate SC rules are identified. To be more precise, when the C1 correlation (\(S_{i,j} \leftrightarrow S_{m,n})\) is violated, e.g. \(S_{i,j}\) is assigned to \(AS_{i}\), but \(S_{m,k}\) is assigned to \(AS_{m}\) instead of \(S_{m,n}\) (k \(\ne \) n), the repair is performed by replacing \(S_{i,j}\) with its alternative or replacing \(S_{m,k}\) with \(S_{m,n}\). When the C2 correlation (\(S_{i,j} \rightarrow \{S_{m,n} \ldots S_{m,l}\}\)) is violated, it is viable to replace \(S_{i,j}\) with another candidate service for \(AS_{i}\), or replace the service instance of \(AS_{m}\) with one of candidate services in the list \(\{S_{m,n} \ldots S_{m,l}\}\). The method to repair C3 correlation violation is almost the same as that of C2, and the only difference is that candidate services in the list \(\{S_{m,n} \ldots S_{m,l}\}\) are banned to be selected. The above three repair policies are denoted as RP1, RP2 and RP3, respectively. The infeasible individual is iteratively fixed until there is no correlation violation in it any more. Since the repair of one SC violation via RP1, RP2 and RP3 may bring about new SC violations, the optimistic method may not terminate in the case that the search space is constrained hard by correlations. To avoid infinite repair, the maximal repair time is utilized to control the termination of this iteration. The detailed algorithm of the optimistic repair method is shown in Algorithm 1.

figure a3

In the case that the optimistic method fails to repair (i.e. null is returned), the pessimistic method based on the backtracking algorithm is employed. Different from the optimistic one, it builds a new feasible individual incrementally (i.e. from a partial solution to a complete solution gene by gene) and the partial candidate \(c\) backtracks once it is found that \(c\) cannot be completed to a valid solution. The detailed repair algorithm is shown in Algorithm 2. Pessimistic_Repair is a recursive function, the parameter \(c\) denotes the new individual which is being constructed and the parameter \(i\) denotes that the \(i\)th gene (i.e. abstract service) is being constructed in the current recursion. Initially, \(c\) is an empty integer array and its element is assigned incrementally. The algorithm first checks whether \(c\) violates selection correlations (lines 1 to 3) and then checks whether \(i\) is equal to the length of ACS (lines 4 to 6). If \(c\) does not violate and \(i\) is equal to ACS’s length, it represents \(c\) has been a feasible complete solution and the algorithm terminates with \(c\) returned. When the algorithm continues, a loop is executed to find the next abstract service which is SC-correlated (lines 7–10) (An AS is called SC-correlated, if there is one of its candidates which is involved in selection correlations). If the \(i\)th AS is not SC-correlated, which instance service to select from its candidate list will not have an impact on the feasibility, and thus the selected service for the \(i\)th AS of the original infeasible individual is directly assigned to the new individual \(c\) (line 8). For the SC-correlated AS, its candidate services which will not violate selection correlations when joining into the partial solution \(c\), are put into the feasible candidate list (line 11). If the size of this list is 0, this iteration will be terminated (lines 12–13). Otherwise, the algorithm checks whether the selected service for the \(i\)th AS of the original infeasible individual is contained in this list and if that is the case, it is prioritized and moved to the head of the list so that the newly-built individual \(c\) will be as similar as possible to the original one (lines 15–17). Each candidate in the list is then enumerated to join into the partial solution \(c\) and a deeper iteration Pessimistic_Repair (\(c, i+1\)) is started for each case (lines 18–24).

figure a4

Although the pessimistic repair operator can always succeed to fix up the infeasible solution by building a new solution as similar as possible to the original one (if it can be repaired), it may require a large number of loops and iterations even in the case that the solution space is not hard constrained. Thus, the optimistic method is applied ahead of the pessimistic one.

Algorithm description

Having defined the genetic encoding, fitness function and genetic operators, our tailored genetic algorithm for the optimal service selection problem is now presented in Algorithm 3.

figure a5

Empirical studies

Simulation settings

To evaluate the performance of the proposed selection approach, simulation experiments are conducted by implementing the algorithm on a PC with Pentium 4 2.8 GHz, 2 GB RAM, Windows XP SP3, Java 2 Standard Edition V1.6.0.

An abstract composite service ACS with \(n\) abstract component services is simulated and for each component service, there are \(m\) candidate services. The publicly available QWS dataset (Al-Masri and Mahmoud 2008) is adopted as the QoS dataset of candidate services. It comprises measurements of 9 QoS parameters for 2,500 real-world web services, which are collected from public sources on the Web. For the sake of focus and clarity, the customer’s preference weight given to each QoS criterion is set to be the same. Business correlations among the candidate services are generated in a random manner and quality correlations improve the corresponding QoS values to some extent. Let \(\theta \) represent both the number of selection correlations and the number of quality correlations.

The niching genetic algorithm is denoted as NGA-X, where X represents the value of sharing radius \(\sigma _{share}\). The following two genetic algorithms are utilized as benchmarks: traditional genetic algorithm in (Canfora et al. 2005) for the optimal service selection problem, which is unaware of selection correlations (denoted as TGA), and its revised version which is armed with the repair operator presented in “repair operator” section to address selection correlations (denoted as RGA). For TGA, the best one of solutions satisfying selection correlations is returned as the final solution. The detailed characteristics for each genetic algorithm are as follows: the best individual kept alive across generations (i.e. elitists), a crossover probability of 0.7, a mutation probability of 0.3 and a population of 100 individuals. The termination condition of the genetic algorithms is ‘no improvement on the best solution for 100 consecutive generations’. Due to the stochastic nature of genetic algorithms, each experiment is repeated for 50 times and the average performance is shown.

Experiment I: the impact of correlations

In this experiment instance, the feasibility and optimality of the gained solutions by each method are learned with regards to different correlation numbers. The ACS is set with 15 component services (\(n=15\)), and 15 candidate services available for each component service (\(m =15\)). To measure the hardness of finding a solution satisfying selection correlations in the search space, the ratio of feasible solutions (\(\rho \)) in the search space is utilized. When the number of correlations (\(\theta \)) increases from 0 to 20 with the step of 2, the corresponding average values of \(\rho \) are calculated and shown in Table 4. When \(\theta \) is 10, there are almost half of solutions in the search space are feasible in terms of selection correlations, and when \(\theta \) grows to 20, there are only 23 % feasible solutions.

Table 4 Ratio of feasible solutions (\(\rho \)) w.r.t. number of correlations (\(\theta \))

RGA, NGA-1 and NGA-2 adopt the repair operator to address the selection correlations and Fig. 3a illustrates the average number of individuals in each generation which is infeasible and fixed up by the repair operator. Besides, Fig. 3b depicts the number of individuals which is fixed up by the pessimistic repair operator. It is observed from Fig. 3 that for the three methods, the repair number grows with the increase of \(\theta \), as it becomes more and more difficult to generate a feasible solution solely by the genetic operators. NGA-1 requires a bit less repairs than NGA-2, but a bit more than RGA. When \(\theta \) exceeds 15, the number of pessimistic repairs grows sharply.

Fig. 3
figure 3

Number of repairs w.r.t. number of correlations (\(\theta \))

The metric optimality is adopted to measure how close the approach approximates the optimal QoS utility value and it is calculated as \(u_{act}/u_{optimal}\), where \(u_{act}\) is the actual QoS utility gained by the corresponding approach, and \(u_{optimal}\) denotes the QoS utility achieved by the brute-force algorithmFootnote 1. Figure 4 shows the average optimality values gained by the four methods: TGA, RGA, NGA-1 and NGA-2. As can be learned from Fig. 4, with \(\theta \) increasing, it becomes more difficult for each method to approximate the optimal solution. For example, the optimality of RGA is greater than 98 % in the case of no correlations, while it falls to 91 % when \(\theta \) is 20. Among the four methods, TGA performs worst as it can not deal with service correlations. NGA-1 achieves higher optimality than RGA, as NGA-1 rewards individuals that uniquely exploit areas of the search space so as to prevent premature convergence. Although NGA-2 uses a larger sharing radius, its optimality is only a bit higher than NGA-1.

Fig. 4
figure 4

Optimality w.r.t. number of correlations (\(\theta \))

Meanwhile, the generations when the algorithm aborts in advance and the computation time each method consumes are depicted in Figs. 5 and 6, respectively. Figure 6 shows that the required runtime for TGA, RGA, NGA-1 and NGA-2 ascends with the rise of \(\theta \), and the increment is the least for TGA. This is because, the time required for fitness calculation is prolonged by more QC correlations when \(\theta \) grows. Besides, for genetic algorithms equipped with the repair operator (i.e. RGA, NGA-1 and NGA-2), it takes a longer time to validate whether an individual violates selection correlations and repair infeasible individuals with \(\theta \) increasing. NGA-1 and NGA-2 requires more time than RGA, because the niching technology itself requires extra computation time and as can be observed from Fig. 5, the niching technology postpones the convergence of the algorithms.

RGA consumes less computation time than NGA-1 and NGA-2, and it can be seen that the extra QoS utility gain of NGA-1 and NGA-2 is at the cost of computation time. Consequently, if the application desires real-time response more than optimality, RGA can perform well. If the application requires high optimality, NGA-2 can be utilized. Otherwise, NGA-1 can be a good tradeoff between them.

Fig. 5
figure 5

Generation w.r.t. number of correlations (\(\theta \))

Fig. 6
figure 6

Computation time w.r.t. number of correlations (\(\theta \))

Experiment II: the impact of problem size

The problem size depends on two parameters, the number of abstract services in the composite service (\(n\)) and the number of concrete services for each component service (\(m\)). In this experiment instance, the number of correlations is fixed to 10 and then the impacts of \(n\) and \(m\) are respectively investigated with the other parameter fixed to 15. Figure7a depicts the required computation time with regards to different values of \(n\) and Fig. 7b is for that of \(m\).

In Fig. 7a, b, we can see that runtime of the four algorithms rises smoothly with \(n\) or \(m\) increasing, in spite of that the search space of the problem is exponential on the basis of \(n\) and \(m\). Therefore, the algorithms scale well in this regard. Additionally, compared to \(m\), the rise of runtime is more sensitive to \(n\), as the search space grows faster with the increase of \(n\) than that of \(m\).

Fig. 7
figure 7

Computation time w.r.t. number of component services (n) & number of candidate services (m)

Conclusions and future work

Service composition is an effective method to build virtual enterprises for service oriented enterprises. Quality of Service (QoS) is the key point to ensure all participating enterprises to have a good cooperation and is also crucial for virtual enterprises to meet the requirements of consumers. In this paper, we formally propose a business correlation model including both QoS correlations and selection correlations, and then describe the correlation-driven service selection problem. It is not trivial to find the optimal solution, due to the exponential search space and potential business correlations. We present an efficient approach for this problem based on a genetic algorithm and it is specially tailored with niching technology, a repair operator and a penalty mechanism. Empirical studies are conducted at last and the experiments show the effectiveness and efficiency of the proposed approach.

The business correlation model presented in this paper is limited to two abstract services. In future work, we aim to extend it to more abstract services by introducing logical connectives such as conjunction and disjunction, and adapt the repair operators to selection correlations involving multiple abstract services. We also aim to analyze potential relations and conflicts among correlations, and investigate how to leverage them for optimal service selection. Besides, it is urgent to propose an automatic business correlation extraction method apart from correlations specified explicitly.