
1 Introduction

Web service technology is widely regarded as an ideal means of achieving interoperability and integration of distributed applications. To this end, the community has developed many standards, such as SOAP, UDDI, WSDL [7], and BPEL [20]. Due to the rapid growth in the number of services on the Internet, service matchmaking has been an active research area in recent years. One of the major concerns is to assess and rank services that fulfill (partially or fully) a given request. More specifically, we need an efficient and effective approach for retrieving the top-k services according to several matching criteria.

In what follows, we present an example that illustrates the difficulties involved in this task.

Let us consider a user request that consists of a set of input concepts Pin1, Pin2, … and output concepts Pout1, Pout2, … (for the sake of simplicity, we neglect other parameters such as preconditions, effects, and paths). To fulfill this request, we can use several matchmaking algorithms or similarity functions, termed f1 … fn; each function is applied to a set of parameters, such as the inputs or the outputs.

Let RP be the request's parameter set, i.e., RP = {Pin1, Pin2, …, Pout1, Pout2, …}. Each matchmaking function fj matches the advertised services against the request (i.e., the set of parameters RP). For each service S, if more than one service parameter matches a requested parameter, the most appropriate match is retained. More formally, we match the pair <RP, S> as follows:

$$ \forall p_i \in RP,\quad \mathrm{match}(p_i, S) = \max_{l=1}^{|S|} f_j(p_i, p'_l) $$
(1)

where p′l is an input/output parameter belonging to S.
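Equation (1) can be sketched in Python as follows. The names are illustrative, and `similarity` stands in for any one of the matching functions fj:

```python
def match_score(request_param, service_params, similarity):
    """Eq. (1): best match of one requested parameter against all
    parameters p'_l of a service S, under one matching function f_j."""
    return max(similarity(request_param, p) for p in service_params)

# Toy similarity (hypothetical): identical concept names score 1.0, else 0.3.
toy_sim = lambda a, b: 1.0 if a == b else 0.3

print(match_score("Price", ["Price", "Currency"], toy_sim))  # 1.0
print(match_score("Cost", ["Price", "Currency"], toy_sim))   # 0.3
```

In a real matchmaker, `toy_sim` would be replaced by a logic-based or similarity-based function over ontology concepts.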

Each cell of Table 1 indicates a matching score, denoted match(pi, S).

Table 1. Partial matching scores of services

For further simplicity, we also suppose that every service has a single input Pin and a single output Pout; the same assumption holds for the request (see Table 1). In order to design an effective discovery system, we have to select a consistent matchmaking algorithm.

In practice, there is no ideal approach that always gives the right degree of match. In fact, we can discern different types of semantic matchmakers: logic-based approaches, which leverage reasoning techniques such as the subsumption test; non-logic-based approaches, which use information retrieval and data mining techniques to assess the similarity between the request and the services; and hybrid approaches, which combine the previous categories in order to enhance retrieval performance. This last class may also apply machine learning to make the final decision.

To leverage the advantages of the proposed matching techniques, and thus boost performance, we suggest using a set of matching algorithms. We refer to this suggestion as Sug1. Hence, the proposed approach belongs to the hybrid class.

The second key component of the discovery system is the mechanism for aggregating the partial matching scores, where a partial matching score measures the closeness between a single parameter of the request and a single parameter of the advertised service. To compute the global score (which handles all the parameters), we adopt a majority-vote approach; more specifically, we compute the ranking according to the Condorcet principle.

To this end, we propose a fuzzy relationship to compare the services. The latter is a fuzzified version of the Pareto dominance relation. We recall that the Condorcet winner of a set of candidates is the one that beats or ties with every other candidate in a pairwise comparison. In general, we notice that several existing aggregation schemes are not suitable for service matchmaking, for two major reasons:

  1. First, these techniques allow compensation between parameters and thus entail an information loss. More specifically, averaging the partial scores (before comparison) may produce an erroneous ranking.

  2. Second, these techniques cannot capture the trade-offs offered by the advertised services. For example, if the user focuses on services having medium output scores and high input scores, the averaging process is not appropriate, because it smoothes the final scores.

Thus, in this paper we propose using all the partial matching scores during the comparison, without averaging them. After that, we sum the comparison results to reach a final decision. This suggestion is referred to as Sug2.

Our aim is to propose a hybrid matching algorithm that takes into account the aforementioned suggestions. More specifically, we adopt a fuzzy version of the dominance relationship to compare the services.

The example shown in Table 1 presents a user request and four advertised services. These services are evaluated according to m different matchmaking functions f1, f2, f3, …. We observe that the matching degrees between two particular parameters (for instance, the service output and the request output) are non-deterministic and appear stochastic. For instance, according to f1, W dominates Z; however, according to f3, Z is better than W. To deal with this kind of ambiguity, we introduce the fuzzy dominance score, which measures the extent to which a service S1 dominates another service S2. This fuzzy score is used for comparing and ranking the services.

The main ideas of our proposition are given below:

  1. We propose a Condorcet-based ranking of the advertised services that takes into account all the matching parameters (inputs, outputs, …) and all the matching algorithms (f1, f2, …).

  2. To implement the fuzzy version of the Condorcet approach, we propose a fuzzy relation for comparing the services, from which we can compute the Condorcet winner of an answer set.

Our choice is mainly motivated by May's theorem, which states that in a two-candidate election, "majority voting is the only method that is anonymous (equal consideration of all voters), neutral (equal consideration of the candidates), and monotonic (more support for a candidate does not jeopardize its election)" [19]. The rest of this paper is organized as follows: Sect. 2 presents the related work, Sect. 3 introduces the developed approach, Sect. 4 shows the results, and Sect. 5 presents our conclusions.

2 State of the Art

This section focuses on two areas: web service discovery and the data fusion problem. For each of them, we present the existing approaches.

Semantic Web Service Discovery:

We discern three types of semantic approaches. The first class is based on pure logic matchmaking; more specifically, these algorithms use subsumption or consistency tests to establish the relatedness between a request and a service.

In OWLMX [14], the authors propose four discrete scores for OWL-S web services. The major drawbacks of this class are the high rate of false positives and false negatives, and the exponential complexity [2].

For this reason, the research community has developed another kind of algorithm based on non-logic approaches such as graph matching, data mining, optimization, probabilistic matching, and information retrieval mechanisms [11, 21, 23].

In [5], the authors use Probabilistic Latent Semantic Analysis (PLSA) [12] and Latent Dirichlet Allocation (LDA) [4] to extract latent factors from semantic service descriptions and retrieve services in the latent factor space. Each service is represented as a probability distribution over latent factors; a latent factor represents a group of concepts and, more formally, is a probability distribution over semantic concepts. In URBE [22], the researchers use different service descriptors to assess the similarity between the request and the services. They leverage a path-based similarity for measuring the closeness between concepts. In addition, they use mixed integer programming to find the optimal matching.

The third category [13, 15] combines the previous types in order to improve performance. More specifically, these approaches aggregate the scores given by the individual matching algorithms so as to provide a global score. The aggregation scheme can be static, as in [15], or dynamic (adaptive), as in [16, 24]. In the static case, the individual scores have a fixed priority (the scores' ranking is static); for instance, in OWLMX [14] and OWLMX2 [15], the logic score "exact" is always ranked before the textual similarity scores. In contrast to the aforementioned systems, the adaptive approaches leverage machine learning to boost performance. For instance, OWLMX3 [16] uses an SVM to decide whether an advertised service is appropriate for the request or not. In [24], the authors propose three metrics for retrieving and sorting web services; each of them uses a set of matching algorithms (logic, textual similarities, etc.) to provide a combined ranking.

Data Fusion:

Rank aggregation (or data fusion) aims to build a global ranking from a set of ranked lists of objects (services or documents), typically produced by different search engines. Even if the problem seems simple, it is worth mentioning that retrieving the optimal combined ranking is NP-hard [8] under certain constraints.

Several aggregation techniques have sprung up [9, 10, 18]; in particular, we distinguish four categories of data fusion techniques [1]. The hierarchy is based on two criteria:

  1. the presence/absence of relevance scores;

  2. the presence/absence of machine learning.

If the scores are given by the individual ranking algorithms, the combined method leverages them to build the new order; for instance, the work presented in [1] follows this policy. Since almost no search engine provides ranking scores, it is possible to convert local ranks into relevance scores. In CombSUM [10], the combined relevance score of a document is the sum of the scores assigned by each input ranking. Likewise, the CombANZ (resp. CombMNZ) system [10] computes the final score of a document in a similar fashion to CombSUM, except that the score is divided (resp. multiplied) by the number of rankings in which the document appears. According to [17], the CombMNZ approach provides the highest search quality.
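The three Comb* variants differ only in how they treat the number of lists containing a document; a minimal sketch (names and example scores are illustrative):

```python
def comb_scores(rankings, doc):
    """rankings: list of dicts {doc: relevance score}, one per engine.
    Returns (CombSUM, CombANZ, CombMNZ) for one document."""
    scores = [r[doc] for r in rankings if doc in r]
    n = len(scores)                 # number of lists that contain doc
    comb_sum = sum(scores)         # CombSUM: plain sum of scores
    return comb_sum, comb_sum / n, comb_sum * n  # ANZ divides, MNZ multiplies

rankings = [{"d1": 0.5, "d2": 0.4}, {"d1": 0.25}, {"d2": 0.8}]
print(comb_scores(rankings, "d1"))  # (0.75, 0.375, 1.5)
```

CombMNZ thus rewards documents retrieved by many engines, which is consistent with the finding of [17].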

If the scores are not available for rank aggregation, we use only the order of the input rankings [1]. We call these approaches order-based or positional methods. The first positional method is the Borda-fuse model [1]: for each document, it assigns as a score the sum of the document's ranks (positions) across the lists.

The Condorcet-fuse approach [18] leverages a majoritarian voting model. More specifically, a document d1 is ranked before another document d2 in the fused list if d1 is ranked before d2 more times than d2 is ranked before d1. The outranking approach [9] extends the majoritarian model by introducing several kinds of thresholds.
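The two positional schemes can be contrasted in a few lines; this is a simplified sketch (document names and the end-of-list convention for absent documents are illustrative assumptions):

```python
def borda_fuse(lists):
    """Borda-fuse: score = sum of positions across lists (lower is better)."""
    docs = set().union(*lists)
    # A document absent from a list is conventionally placed at that list's end.
    score = {d: sum(l.index(d) if d in l else len(l) for l in lists) for d in docs}
    return sorted(docs, key=lambda d: score[d])

def condorcet_beats(d1, d2, lists):
    """Condorcet-fuse pairwise test: d1 before d2 in a strict majority of lists."""
    pairs = [l for l in lists if d1 in l and d2 in l]
    wins = sum(l.index(d1) < l.index(d2) for l in pairs)
    return wins > len(pairs) - wins

lists = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
print(borda_fuse(lists))                 # ['a', 'b', 'c']
print(condorcet_beats("a", "b", lists))  # True: a precedes b in 2 of 3 lists
```

The full Condorcet-fuse algorithm of [18] orders all documents from these pairwise tests; the sketch shows only the elementary comparison.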

3 Web Service Retrieval and Ranking

3.1 Problem Statement

Before introducing the developed contribution, we begin by formalizing the concept of top-K dominating web services under multiple matching functions.

First of all, we suppose that each service (or request) is represented as a vector containing its input/output parameters. The matching degree between a parameter pi of the request RP = {p1, …, pd} and a service S, under the matching function fj, is referred to as matchj(pi, S).

Since we have d requested parameters, we obtain d matching scores, each belonging to [0, 1]. Each function fj provides a partially ordered set of services, whose elements are labeled with a vector of matching degrees of dimension d. This vector is called a matching instance and is denoted Vqj, where q is the service identifier and j is the matching function identifier.

Simply speaking, the partially ranked list of fj is represented as follows:

PRLj = <(S1, V1j), (S2, V2j), …, (S|base|, V|base|j)>, where each Vqj ∈ [0,1] × [0,1] × … × [0,1].

For instance, if the user's request has a single input and a single output, then Vqj ∈ [0,1] × [0,1].

Our purpose is to construct a fused (combined) list FL from a set of partially ranked lists PRL1 … PRLm (given by m matching functions), such that the FL's Top-K elements have the best precision and recall rates. We also note that the implemented fusion technique must take into account the suggestions Sug1 and Sug2.

In order to satisfy requirement Sug2, we use a fuzzy version of the dominance relationship, adapting the function proposed by [3] to our context. We note that the (fuzzy) dominance function compares, simultaneously and in a pairwise manner, all the vector components of the matching instances; hence, the contribution largely fulfills the second suggestion, Sug2.

In what follows, we present the Pareto dominance and its fuzzified version.

3.2 Pareto Dominance

Let us consider two d-dimensional vectors u and v. We say that u dominates v, i.e., u > v, if and only if u is at least as good as v in all dimensions and (strictly) better than v in at least one dimension, i.e., ∀i ∈ [1, d], ui ≥ vi ∧ ∃j ∈ [1, d], uj > vj.

From the definition, we observe that the Pareto dominance is not always decisive in comparing objects. In fact, it does not differentiate between vectors with a large variance, i.e., vectors that are very good in some dimensions and mediocre in others (e.g., (0.95, 0.1) and (0.80, 0)), and good vectors, i.e., points that are adequate in all dimensions (e.g., (0.8, 1) and (1, 0.7)). To illustrate, let us consider u = (u1, u2) = (0.95, 0.2) and v = (v1, v2) = (0.85, 0.9). According to the Pareto principle, we have neither u > v nor v > u, i.e., the instances u and v are incomparable. However, we can say that v is more appropriate than u, since v2 = 0.9 is much bigger than u2 = 0.2, while v1 = 0.85 is very close to u1 = 0.95.
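The incomparability just described is easy to check programmatically; a minimal sketch of the classic Pareto test:

```python
def pareto_dominates(u, v):
    """u > v iff u >= v in every dimension and u > v in at least one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

u, v = (0.95, 0.2), (0.85, 0.9)
# Neither vector dominates the other: they are Pareto-incomparable.
print(pareto_dominates(u, v), pareto_dominates(v, u))  # False False
print(pareto_dominates((1, 0.7), (0.8, 0.7)))          # True
```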

Building on this observation, we propose a fuzzy version of the Pareto dominance relationship to express the dominance degree between two matching instances, i.e., we assess the extent to which one matching instance vector (more or less) dominates another.

3.3 Fuzzy Dominance

Given two d-dimensional vectors u and v, the fuzzy dominance function adapted from [3] computes the extent to which u dominates v:

$$ \mathrm{FD}(u, v) = \frac{1}{d}\sum_{i=1}^{d} \mathrm{EFD}(u_i, v_i) $$
(2)

where the elementary fuzzy dominance EFD(x, y) is defined as follows:

$$ \mathrm{EFD}(x, y) = \begin{cases} 0 & \text{if } x - y \le \varepsilon \\ x - y - \varepsilon & \text{otherwise} \end{cases} $$
(3)

Here, ε ∈ [0, 1] is a parameter chosen by an expert or empirically; it controls the dominance score between x and y. In this version, we neglect the additional parameter λ proposed by [3], because the experiments did not show a strong influence of λ on the result quality. EFD constitutes the membership function of the fuzzy relation "x dominates y"; it expresses the strength of the relation "x is more or less greater than y". The dominance function given in formula (2) is simply the average of the elementary fuzzy dominance degrees.
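Formulas (2) and (3) can be sketched directly; applied to the vectors u and v of Sect. 3.2, the fuzzy dominance resolves the incomparability that Pareto dominance could not:

```python
def efd(x, y, eps=0.05):
    """Elementary fuzzy dominance, Eq. (3): 0 if x - y <= eps, else x - y - eps."""
    return 0.0 if (x - y) <= eps else (x - y - eps)

def fd(u, v, eps=0.05):
    """Fuzzy dominance, Eq. (2): mean elementary dominance over the d dimensions."""
    return sum(efd(a, b, eps) for a, b in zip(u, v)) / len(u)

# u and v are Pareto-incomparable, yet v dominates u far more strongly:
print(fd((0.95, 0.2), (0.85, 0.9)))  # ~0.025
print(fd((0.85, 0.9), (0.95, 0.2)))  # ~0.325
```

The choice ε = 0.05 matches the value retained in the experimental study (Sect. 4).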

3.4 Service Comparison

Since we have various relevance scores for each service (i.e., the matching scores given by the different matching functions), we should design a comparison operator that takes all of them into account (thereby fulfilling the first requirement, Sug1). In what follows, we elucidate our idea through a simple scenario. Let us consider two services S1 and S2, each characterized by a set of matching instances derived from the m matching functions f1, f2, …, fm. First, we compute the extent to which S1 is better than S2, using the following formula:

$$ \text{Dominance-degree}(S_1, S_2) = \frac{1}{m}\sum_{j=1}^{m} \frac{1}{m}\sum_{k=1}^{m} \mathrm{FD}(V_{1j}, V_{2k}) $$
(4)

where V1j is the j-th matching instance of S1 and V2k is the k-th matching instance of S2.

Informally, this function computes the average dominance score between the instances of S1 and the instances of S2.

Second, we compute the extent to which S2 is better than S1, by adopting the same formula:

$$ \text{Dominance-degree}(S_2, S_1) = \frac{1}{m}\sum_{j=1}^{m} \frac{1}{m}\sum_{k=1}^{m} \mathrm{FD}(V_{2j}, V_{1k}) $$

Third, we rank the services according to the obtained scores:

If Dominance-degree(S1, S2) > Dominance-degree(S2, S1), then S1 is ranked before S2, meaning that S1 is better than S2.

If Dominance-degree(S1, S2) < Dominance-degree(S2, S1), then S2 is ranked before S1, meaning that S2 is better than S1.

If Dominance-degree(S1, S2) = Dominance-degree(S2, S1), then S1 and S2 have the same rank, meaning that they are equivalent (S1 ≡ S2).

To clarify the comparison mechanism, consider the following example. For the sake of simplicity, we handle only two parameters (one input and one output) and three matching functions f1, f2, f3 (i.e., three matching instances per service); we also set ε to 0.

S1: {(1, 0.6), (0.4, 0.5), (0.7, 0.3)}, S2: {(0.7, 0.7), (0.5, 0.5), (0.6, 0.7)}

Dominance-degree(S1, S2) =

1/3 · [1/3 · [FD((1,0.6),(0.7,0.7)) + FD((1,0.6),(0.5,0.5)) + FD((1,0.6),(0.6,0.7))] +

1/3 · [FD((0.4,0.5),(0.7,0.7)) + FD((0.4,0.5),(0.5,0.5)) + FD((0.4,0.5),(0.6,0.7))] +

1/3 · [FD((0.7,0.3),(0.7,0.7)) + FD((0.7,0.3),(0.5,0.5)) + FD((0.7,0.3),(0.6,0.7))]] = 0.0888.

In a similar manner, Dominance-degree(S2, S1) = 0.1222. Since the second score is bigger than the first, we infer that S2 is ranked before S1 in the aggregated list.
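The worked example can be reproduced with a few lines of Python implementing formulas (2), (3), and (4):

```python
def efd(x, y, eps=0.0):
    # Eq. (3): elementary fuzzy dominance.
    return 0.0 if (x - y) <= eps else (x - y - eps)

def fd(u, v, eps=0.0):
    # Eq. (2): mean elementary dominance over the dimensions.
    return sum(efd(a, b, eps) for a, b in zip(u, v)) / len(u)

def dominance_degree(inst1, inst2, eps=0.0):
    """Eq. (4): average fuzzy dominance of one service's m instances
    over another service's m instances."""
    m = len(inst1)
    return sum(fd(u, v, eps) for u in inst1 for v in inst2) / (m * m)

S1 = [(1, 0.6), (0.4, 0.5), (0.7, 0.3)]
S2 = [(0.7, 0.7), (0.5, 0.5), (0.6, 0.7)]
print(round(dominance_degree(S1, S2), 4))  # 0.0889 (0.0888 truncated, as in the text)
print(round(dominance_degree(S2, S1), 4))  # 0.1222
```

As in the text, the second score exceeds the first, so S2 is ranked before S1.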

3.5 Fuzzy Ranking Algorithm

Our proposition, referred to as FCAA (Fuzzy Condorcet Aggregating Algorithm), computes the top-k web services according to the Condorcet ranking principle. The main idea is to build a graph whose nodes represent the services and whose edges represent the relation "is better than". If two services are equivalent, then we put two edges, S → S′ and S′ → S.

The major problem of Condorcet-based ranking is the presence of cycles. Several solutions have been proposed for this issue; for instance, we may simply view cycles as ties. In [18], the local ordering of objects within a cycle is only of secondary importance, while their ordering with respect to the rest of the candidates is the prime priority. In this contribution, we rank the services within a cycle by using a simple heuristic that favors the services with the smallest variance. In other words, if we have two equivalent services S and S′, such that S has two matching instances (a1, b1), (a2, b2) and S′ has two matching instances (a′1, b′1), (a′2, b′2), then:

Mean(S) = ((a1 + a2)/2, (b1 + b2)/2); we denote this mean by (a*, b*)

Mean(S′) = ((a′1 + a′2)/2, (b′1 + b′2)/2); we denote this mean by (a′*, b′*)

Variance(S) = Variance1(S) + Variance2(S), where:

Variance1(S) = 1/2 · [(a1 − a*)² + (a2 − a*)²]

Variance2(S) = 1/2 · [(b1 − b*)² + (b2 − b*)²]

In a similar manner, we define Variance(S′); hence, S is better than S′ if and only if Variance(S) < Variance(S′). The idea behind this heuristic is to privilege the services that have consistent scores across all the matching algorithms; this policy gives reliable services a better chance of being retained. To infer the aggregated ranking, we simply search for a Hamiltonian traversal of the constructed graph, a task that can be done in polynomial time [18]. In what follows, we present the pseudocode of the proposed ranking algorithm:

The explanation of the algorithm is given below:

  1. (Lines 1-3) We remove all the services whose matching scores (for inputs and outputs) are 0 in all the matching algorithms, i.e., we discard every service whose matching instance equals (0, 0, …, 0) under every matching function. This phase alleviates the time complexity of the ordering.

  2. (Line 4) We fuse the pruned source lists into the set C.

  3. (Line 5) The graph is created by taking the elements of C as nodes; the edge set is initially empty, and edges are inserted according to the dominance degrees.

  4. (Lines 6-12) We build the Condorcet graph by leveraging the fuzzy dominance relationship.

  5. (Line 13) We construct the Hamiltonian path of the Condorcet graph (see Fig. 1). This task is done by counting the out-degree of each service (node): the root of the Hamiltonian path is the node with out-degree |C| − 1, and the other nodes of the path are computed in an almost similar manner.

     Fig. 1. Global ranking

  6. (Lines 14-15) We extract the Top-K dominating services from the Hamiltonian path; for instance, in Fig. 1, Top-2(Global-Ranking) = <S19, S4>.
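The steps above can be sketched end to end as follows. This is an illustrative reconstruction from the step-by-step explanation, not the authors' exact pseudocode: ties are resolved directly by the variance heuristic instead of inserting a two-way edge, and the out-degree-based ordering assumes no residual cycles.

```python
from statistics import pvariance

def efd(x, y, eps):
    return 0.0 if (x - y) <= eps else (x - y - eps)

def fd(u, v, eps):
    return sum(efd(a, b, eps) for a, b in zip(u, v)) / len(u)

def dominance_degree(i1, i2, eps):
    return sum(fd(u, v, eps) for u in i1 for v in i2) / (len(i1) * len(i2))

def variance(instances):
    # Sum of the per-dimension variances of a service's matching instances.
    return sum(pvariance(dim) for dim in zip(*instances))

def fcaa(services, k, eps=0.05):
    """services: {name: list of matching instances}. Returns the Top-K names."""
    # Step 1: prune services whose instances are all-zero in every function.
    C = {s: v for s, v in services.items()
         if any(any(x > 0 for x in inst) for inst in v)}
    out_degree = {s: 0 for s in C}
    names = list(C)
    # Steps 3-4: pairwise fuzzy-Condorcet comparisons build the graph.
    for i, s1 in enumerate(names):
        for s2 in names[i + 1:]:
            d12 = dominance_degree(C[s1], C[s2], eps)
            d21 = dominance_degree(C[s2], C[s1], eps)
            if d12 > d21:
                out_degree[s1] += 1
            elif d21 > d12:
                out_degree[s2] += 1
            else:  # equivalence: favor the lower-variance service
                out_degree[min((s1, s2), key=lambda s: variance(C[s]))] += 1
    # Steps 5-6: order by out-degree (the root has out-degree |C| - 1).
    return sorted(names, key=lambda s: -out_degree[s])[:k]

services = {"S1": [(1, 0.6), (0.4, 0.5), (0.7, 0.3)],
            "S2": [(0.7, 0.7), (0.5, 0.5), (0.6, 0.7)],
            "S3": [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]}
print(fcaa(services, k=2, eps=0.0))  # ['S2', 'S1'] (S3 is pruned)
```

On the worked example of Sect. 3.4, the sketch prunes S3 and ranks S2 before S1, consistently with the dominance degrees computed there.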

Note that Fig. 1 shows only the edges connecting two consecutive levels or nodes belonging to the same level. In our work, we jointly use five matching functions. Four of them are similarity measures: the loss-of-information measure (IL), the extended Jaccard similarity (EJ), the cosine similarity (Cosine), and the Jensen-Shannon information-divergence-based similarity (JS). The last matching function (Logic) is pure logic-based reasoning: it uses the subsumption test to make the decision and yields five scores: Exact, Plugin, Subsume, SubsumedBy, and Fail. All these algorithms are presented in detail in [14]. These similarity measures are preferred in this work because they represent the most promising techniques in information retrieval [14].

4 Experimental Study

We conducted a set of experiments to assess the effectiveness and the efficiency of the proposal. The test collection OWLTC V2.2 (http://www-ags.dfki.uni-sb.de/~klusch/owls-mx/) involves real-world web service descriptions, extracted principally from public IBM UDDI registries. The data set contains: (a) 1007 service descriptions, (b) 29 sample requests, and (c) a manually identified relevance set for each request. We use 19 requests for assessing retrieval quality (precision and recall).

In order to compare efficiency, we measure the mean execution time of the different matching algorithms. All the algorithms were implemented in Java, and the experiments were conducted on an Intel Core i3 machine with 4 GB of RAM, running Windows 7.

In this study, we implemented five matching functions and one fusion algorithm, respectively: the extended Jaccard similarity measure (EJ), the information-loss similarity measure (IL), the Jensen-Shannon similarity measure (JS), the cosine similarity measure (Cosine), logic matching (Logic), and the Fuzzy Condorcet Aggregating Algorithm (FCAA).

We also compare our aggregating algorithm with the results given by the Borda-fuse model; the Borda experiments use a set of source lists having 50 elements. To choose the best value of ε, we conducted several simulations of the Fuzzy Condorcet Aggregating Algorithm; the best values belong to [0, 0.09], and in what follows we take ε = 0.05.

As shown in Fig. 2, FCAA incurs a small overhead in comparison with the five matching algorithms. This cost is mainly due to the comparison task (see lines 8, 9, 10). The fastest algorithm is the logic approach, which relies on an ontological encoding technique [6] to reduce the execution time of the subsumption test. The remaining algorithms have almost similar execution times. We also observe that pruning the input ranked lists (see step 2) reduces the computational cost, because the entire data set is replaced with a small subset C.

Fig. 2. Average execution time

We also note that the time complexity of the dominance degree (formula 4) is O(d·m²), because we need to perform m² elementary comparisons, each requiring d operations. To rank the services, we must execute the dominance degree twice for each pair (S, S′) (neglecting for the moment the equivalence case), and since we have |C| services to rank, each service is compared with |C| − 1 others. Finally, to build the Condorcet graph and extract the Hamiltonian path, we need O(|C|²·d·m²) steps. This polynomial complexity keeps the approach tractable even for large data sets.

Tables 2 and 3 show the precision and the recall associated with the Top-10 up to Top-60 answers. In general, the results indicate that FCAA is more effective than the other algorithms in terms of both criteria. Furthermore, FCAA's superiority is independent of the list size (K).

Table 2. Average precision for 19 requests
Table 3. Average recall for 19 requests

Since the Borda approach is very sensitive to the service ranks in the source lists, its average recall is very weak in comparison with FCAA (especially for high values of K). Likewise, the precision superiority of the proposed approach is most apparent for K = 20 and K = 30. These experiments confirm the effectiveness of the Condorcet voting model.

Table 4 provides the details concerning the R-precision. This criterion represents a special execution scenario in which the recall and the precision are equal.

Table 4. Average R-precision for 19 requests

According to the results, FCAA outperforms the individual matching functions and the Borda-fuse model; we also notice that the information-loss measure achieves almost similar performance.

The only case where FCAA gives poor performance is the following:

  • When the majority of the matching functions make an error about a good service, denoted S*, and rank it beyond the first K elements, FCAA will not correct the mistake, because it is based on the majority-vote principle.

  • In this situation, FCAA is not able to outperform the good matching algorithms.

We note that such cases are very rare, because the majority of the similarity measures are very effective within the Top-30 services (see Tables 2 and 3).

5 Conclusion

In this paper, we have proposed an aggregation algorithm based on the Condorcet principle for retrieving and ranking web services. Our proposition takes into account several parameters as well as various matching functions during the ranking process. Our contribution leverages five matching functions: four similarity measures and one logic-based matching approach. The experimental results show that the proposed algorithm is quite effective and efficient.

There are several directions for future work. First, we could learn additional parameters, such as ε and the number of input lists participating in the election process. Second, we could add further matching functions, especially edge-based or kernel-based similarities. These directions may improve the chances of retrieving the most relevant services.