
1 Introduction

Web service matchmaking is the computation of the similarity between two Web services. It is an important task in Web service discovery and Web service composition. The matchmaking process is generally based on the Web service description; several standards, such as WSDL, OWL-S, and SAWSDL, define this description, the latter two including semantics. Many matchmakers that support Web semantics have been proposed in the literature, including [1, 3, 4, 7, 9, 10, 12].

Three types of Web service matchmakers can be distinguished [2, 11, 12]: (i) logic-based matchmakers, which employ the semantics of the Web services and set logic rules and constraints in order to perform the matching; (ii) non-logic-based matchmakers, which rely on syntactic and structural matching techniques; and (iii) hybrid matchmakers, which combine both of the above-mentioned matching approaches. Logic-based techniques have not received much attention, and the authors in [9] qualify them as imprecise. However, if logic-based techniques are considered as a component of a hybrid matchmaker, improving their performance would improve the overall performance of the matchmaker.

To the best of the authors' knowledge, the first logic-based matchmaker was proposed in [10]. The authors in [1] improve [10]'s proposal by using bipartite graphs. In this paper, two logic-based algorithms for computing the similarity measure between two Web services are proposed. These algorithms can be seen as extensions of the one proposed in [10]. Both algorithms have been evaluated using SME2, an open-source tool for testing different semantic matchmakers. The first algorithm slightly affects the precision of [1]'s algorithm but substantially reduces its computing time. The second algorithm enriches the semantic distance values used in [1] and considerably improves its precision.

The rest of this paper is organized as follows. Section 2 introduces basic definitions and shows how the similarity measure is computed. Sections 3 and 4 present the efficient and accurate algorithms, respectively. Section 5 studies the computational complexity of the algorithms and Sect. 6 evaluates and compares their performances. Section 7 concludes the paper.

2 Similarity Measure

2.1 Basic Definitions

A semantic match between two entities frequently involves a similarity measure. The similarity measure quantifies the semantic distance between the two entities participating in the match. A similarity measure is defined as follows.

Definition 1

(Similarity Measure). The similarity measure, \(\mu \), of two service attributes is a mapping that measures the semantic distance between the conceptual annotations associated with the service attributes. Mathematically, \(\mu : A\times A\rightarrow V\), where A is the set of all possible attributes and V the set of possible semantic distance measures.

There are different possible definitions of the set V. In [5], for instance, the mapping between two conceptual annotations may take one of the following values: Exact, Plugin, Subsumption, Container, Part-of, and Disjoint.

A preferential total order may now be established on the above-mentioned semantic distance values.

Definition 2

(Similarity Measure Preference). Let V be the set of all possible semantic distance values such that \(V=\{v_1,v_2,\cdots ,v_n\}\). Preference amongst the similarity measures is governed by the following strict order: \(v_1\succ v_2\succ \cdots \succ v_n\), where a \(\succ \) b means that a is preferred over b.

Definition 3

(Degree of Match). The degree of match is a function that defines a semantic distance value between two conceptual annotations. Mathematically, \(\delta : CA_1\times CA_2\rightarrow V\), where \(CA_1\) and \(CA_2\) denote the first and second conceptual annotations, respectively, and V is the set of all possible semantic distance values.

This generic definition of similarity measure extends the one proposed by [5]. Other matchmaking frameworks (e.g. [1, 10, 13]) utilize an idea similar to \(\mu \), but label it differently. The main difference between the above-cited works concerns the way the degree of match is computed. In Sects. 3 and 4, two algorithms for computing the similarity measure are provided. These algorithms improve the one proposed by [1], which is presented in the rest of this section.
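To make these definitions concrete, the following minimal Python sketch (an illustration, not part of any cited matchmaker) models a possible set V using the values of [5] together with the strict preference order of Definition 2; the type names and the relative order chosen between Container and Part-of are assumptions made for the example only.

```python
# Illustrative sketch: a possible encoding of the set V of semantic distance
# values from [5] with the preference order of Definition 2.
# The exact ordering (in particular between Container and Part-of) is an
# assumption made for this example.
from enum import IntEnum

class SemanticDistance(IntEnum):
    """Semantic distance values; a smaller number means a more preferred value."""
    EXACT = 0
    PLUGIN = 1
    SUBSUMPTION = 2
    CONTAINER = 3
    PART_OF = 4
    DISJOINT = 5

def preferred(v1: SemanticDistance, v2: SemanticDistance) -> SemanticDistance:
    """Return the more preferred of two values (v1 is preferred over v2 iff v1 < v2)."""
    return v1 if v1 < v2 else v2

# Example: Exact is preferred over Plugin.
assert preferred(SemanticDistance.EXACT, SemanticDistance.PLUGIN) is SemanticDistance.EXACT
```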

2.2 Computing the Similarity Measure

The computation of the similarity measure over two attributes is modeled in [1] as a bipartite graph matching problem. Let us first introduce some concepts.

Definition 4

(Bipartite Graph). A graph \(G=(V_0+V_1,E)\) with disjoint vertex sets \(V_0\) and \(V_1\) and edge set E is called bipartite if every edge connects a vertex of \(V_0\) with a vertex of \(V_1\); that is, no edge in E has both endpoints in \(V_0\) or both in \(V_1\).

Definition 5

(Matching in a Bipartite Graph). Let \(G=(V,E)\) be a bipartite graph. A matching M of G is a subgraph \(G'=(V,E')\), \(E'\subseteq E\), such that no two edges \(e_1,e_2\in E'\) share the same vertex. A vertex v is matched if it is incident to an edge in the matching M.

Let \(S^R\) be a service request and \(S^A\) be a service advertisement. The matching between \(S^R.A_i\) and \(S^A.A_j\) (i.e. \(\mu (S^R.A_i,S^A.A_j)\)) is computed as follows:

  1.

    Construction of the bipartite graph. Let \(CA_0\) and \(CA_1\) be the sets of concepts corresponding to the attributes \(S^R.A_i\) and \(S^A.A_j\), respectively. These two sets are the vertex sets of the bipartite graph G. In other words, \(G=(V_0+V_1,E)\) where \(V_0=CA_0\) and \(V_1=CA_1\). Consider two concepts \(a\in V_0\) and \(b\in V_1\). The edge between a and b is valued with the degree of match between a and b, i.e., \(\delta (a,b)\). Then, a numerical weight is assigned to every edge in the bipartite graph as given in Table 1. These weights verify the following constraints:

    $$\begin{aligned} w_1\le w_2\le w_3\le w_4. \end{aligned}$$
    (1)
  2.

    Selection of the optimal matching. A matching is selected only if it fulfills an injective mapping between \(V_0\) and \(V_1\). Figure 1 presents two matchings: the first one respects the injective mapping between \(V_0\) and \(V_1\) while the second does not. A bipartite graph G may contain several possible matchings. In this case, the identification of the optimal matching is required. According to [1], the optimal matching is the one that minimizes \(\max (w_k)\), where \(\max (w_k)\) is the maximum edge weight in the matching. The final returned degree corresponds to the label of the maximum-weight edge in the obtained matching.

Table 1. Weighting system
Fig. 1. Matching examples

To select the optimal matching, the authors in [1] propose the use of the Hungarian algorithm. However, applying the Hungarian algorithm directly with the weighting system given in Table 1 is not possible, because the Hungarian algorithm minimizes the sum of weights while the optimal matching is the one that minimizes the maximum weight. The authors in [1] proved that modifying the weights as in Table 2 allows the Hungarian algorithm to correctly identify the optimal matching.

Table 2. Improved weighting system

The computation of the similarity measure \(\mu (\cdot ,\cdot )\) as described above is formalized in Algorithm 1. The function ComputeWeights applies the weight changes as described in Table 2. The function HungarianAlg implements the Hungarian algorithm. The notation w(a, b) in Algorithm 1 denotes the weight associated with the edge (a, b).

The degree of match \(\delta (\cdot ,\cdot )\) between two conceptual annotations is established according to Algorithm 2, where: \(\equiv \): Equivalence relationship; \(\sqsupset _1\): Direct parent/child relationship; \(\sqsupset \): Indirect parent/child relationship; and \(\sqsubset \): Direct or indirect child/parent relationship.

The optimality criterion used in [1] is designed to minimize false positives and false negatives. In fact, minimizing the maximal weight minimizes the number of 'Fail'-labeled edges. However, choosing \(\max (w_i)\) as the final return value is restrictive and increases the risk of false negatives in the final result. To avoid this problem, considering both \(\max (w_i)\) and \(\min (w_i)\) as pertinent values in the matching is proposed.

Algorithm 1. Computation of the similarity measure \(\mu (\cdot ,\cdot )\)
Algorithm 2. Computation of the degree of match \(\delta (\cdot ,\cdot )\)
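To illustrate the flow of Algorithm 1, the following minimal Python sketch builds the weighted bipartite graph from the pairwise degrees of match, applies a weight mapping, and lets the Hungarian algorithm (delegated here to SciPy's linear_sum_assignment) select an injective matching. The functions delta and compute_weights are pluggable stand-ins for the degree of match (e.g., Algorithm 2) and for ComputeWeights; since the concrete weight values of Tables 1 and 2 are not listed here, this is a sketch of the approach, not the authors' implementation.

```python
# A minimal sketch, under the stated assumptions, of the similarity-measure
# computation in the spirit of Algorithm 1.
import numpy as np
from scipy.optimize import linear_sum_assignment

def similarity_measure(ca0, ca1, delta, compute_weights):
    """Return the semantic distance value mu for two attributes.

    ca0, ca1        : lists of concepts annotating the request and advertisement attributes
    delta           : degree-of-match function delta(a, b) -> semantic distance value
    compute_weights : maps a semantic distance value to a numerical weight
                      (stand-in for ComputeWeights / Table 2)
    """
    # Step 1: build the bipartite graph; every edge (a, b) is labeled delta(a, b)
    # and given a numerical weight.
    degrees = [[delta(a, b) for b in ca1] for a in ca0]
    weights = np.array([[compute_weights(d) for d in row] for row in degrees], dtype=float)

    # Step 2: the Hungarian algorithm returns an injective matching minimizing
    # the sum of weights (which, under the modified weights of Table 2, also
    # identifies the matching minimizing the maximum weight).
    rows, cols = linear_sum_assignment(weights)

    # The returned degree is the label of the maximum-weight edge in the matching.
    i, j = max(zip(rows, cols), key=lambda rc: weights[rc[0], rc[1]])
    return degrees[i][j]
```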

In the next two sections, two different algorithms for computing the similarity measure are introduced. These algorithms are based on [1]'s approach. The main difference concerns the computation of the degree of match \(\delta (\cdot ,\cdot )\).

3 Efficient Computation of the Similarity Measure

The semantic distance values are defined similarly to [1] (see Table 1). The improvement concerns the degree of match, which is now computed as in Algorithm 3, where: \(\equiv \): Equivalence relationship; \(\sqsubset _1\): Direct child/parent relationship; and \(\sqsupset _1\): Direct parent/child relationship.

In this version of the algorithm, only directly related concepts are considered for the Plugin and Subsume semantic distance values. This change affects the precision of the algorithm since it uses a smaller set of possible concepts. However, it considerably reduces the computing time. In fact, there is no need to use inference: only facts are parsed in the related ontology, which reduces the complexity of Algorithm 3 (as discussed in Sect. 5). This necessarily improves the query response time. The proposed algorithm provides a balance between response time and precision, which is valuable in critical situations.

Algorithm 3. Efficient computation of the degree of match \(\delta (\cdot ,\cdot )\)
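As an illustration only, the sketch below mirrors the behaviour described for Algorithm 3: equivalence and direct parent/child facts are looked up without any inference. The helpers equivalent and direct_parents are hypothetical accessors over the parsed ontology, and the assignment of the two direct relations to Plugin and Subsume is an assumption, since the mapping is not spelled out above.

```python
# Illustrative sketch of the efficient degree of match (in the spirit of
# Algorithm 3): only asserted facts are consulted, no inference is performed.
# `ontology.equivalent` and `ontology.direct_parents` are hypothetical helpers.
def degree_of_match_efficient(a, b, ontology):
    if ontology.equivalent(a, b):            # a equivalent to b
        return "Exact"
    if a in ontology.direct_parents(b):      # direct parent/child -- assumed to yield Plugin
        return "Plugin"
    if b in ontology.direct_parents(a):      # direct child/parent -- assumed to yield Subsume
        return "Subsume"
    return "Fail"
```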

The computation of the similarity measure is the same as in [1] (see Algorithm 1), but the semantic distances are interpreted as given at the beginning of this section.

4 Accurate Computation of the Similarity Measure

In this case, six semantic distance values are defined as given in Table 3. The basic idea of this second improvement is that the precision of the algorithm increases with the number of granular values. An extended weighting system is then defined as in Table 4. These weights verify the following constraints:

$$\begin{aligned} w_1\le w_2\le w_3\le w_4\le w_5\le w_6 \end{aligned}$$
(2)
Table 3. Elements of the semantic distance values set V
Table 4. Extended weighting system

In this version of the similarity measure, the improvement also concerns the computation of the degree of match \(\delta (\cdot ,\cdot )\), which is now computed according to Algorithm 4, where: \(\equiv \): Equivalence relationship (it does not require inference); \(\sqsubset _1\): Direct child/parent relationship; \(\sqsupset _1\): Direct parent/child relationship; \(\sqsubset \): Indirect child/parent relationship; and \(\sqsupset \): Indirect parent/child relationship.

Algorithm 4. Accurate computation of the degree of match \(\delta (\cdot ,\cdot )\)
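A sketch in the same style can be given for Algorithm 4: the six values of Table 3 are produced by checking equivalence first, then direct relations, then indirect (inferred) relations. As before, the ontology helpers are hypothetical, and the direction assigned to each value is an illustrative assumption rather than the authors' exact rule set.

```python
# Illustrative sketch of the accurate degree of match (in the spirit of
# Algorithm 4) using the six semantic distance values of Table 3.
# `equivalent`, `direct_parents` and `ancestors` are hypothetical helpers;
# `ancestors` stands for the inferred (indirect) parent/child relation.
def degree_of_match_accurate(a, b, ontology):
    if ontology.equivalent(a, b):            # equivalence (no inference needed)
        return "Exact"
    if a in ontology.direct_parents(b):      # direct parent/child
        return "Plugin"
    if b in ontology.direct_parents(a):      # direct child/parent
        return "Subsume"
    if a in ontology.ancestors(b):           # indirect parent/child (requires inference)
        return "Extended-Plugin"
    if b in ontology.ancestors(a):           # indirect child/parent (requires inference)
        return "Extended-Subsume"
    return "Fail"
```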

In this algorithm, indirectly related concepts are considered for both the Extended-Plugin and Extended-Subsume semantic distance values. Following Definition 2, the following preference order holds:

$$\begin{aligned} \text {Exact}\succ \text {Plugin}\succ \text {Subsume}\succ \text {Extended-Plugin}\succ \text {Extended-Subsume}\succ \text {Fail}. \end{aligned}$$

Direct relations (i.e. Plugin and Subsume) are preferred to indirect relations (i.e. Extended-Plugin and Extended-Subsume). To apply the Hungarian Algorithm, the weights are modified as in Table 5.

Table 5. Improved weighting system

The computation of the similarity measure is given in Algorithm 5. This algorithm extends the one proposed in [1] (see Algorithm 1). The function ComputeWeights applies the weight changes as described in Table 5. The function HungarianAlg implements the Hungarian algorithm. The notation w(a, b) in Algorithm 5 denotes the weight associated with the edge (a, b).

Algorithm 5. Accurate computation of the similarity measure \(\mu (\cdot ,\cdot )\)
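As a usage note, Algorithm 5 keeps the structure sketched earlier for Algorithm 1 and only swaps in the extended degree of match and the extended weighting system. The snippet below shows this wiring with placeholder weights that merely respect constraint (2); they are not the actual values of Tables 4 or 5.

```python
# Illustrative wiring of the accurate similarity measure: the Algorithm 1 sketch
# is reused with the extended degree of match and placeholder weights.
extended_weights = {"Exact": 1, "Plugin": 2, "Subsume": 3,
                    "Extended-Plugin": 4, "Extended-Subsume": 5, "Fail": 6}

# mu = similarity_measure(ca0, ca1,
#                         delta=lambda a, b: degree_of_match_accurate(a, b, ontology),
#                         compute_weights=extended_weights.get)
```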

5 Computational Complexity

The most expensive operation in Algorithms 1 and 5 is the computation of the degree of match. As shown in [5], inferring the degree of match by ontologically parsing pieces of information into facts, and then utilizing commercial rule-based engines that use the fast Rete [6] pattern-matching algorithm, leads to a complexity of O(|R||F||P|), where |R| is the number of rules, |F| is the number of facts, and |P| is the average number of patterns in each rule. Accordingly, the complexity of Algorithms 2 and 4 is O(|R||F||P|). However, computing the degree of match according to Algorithm 3 is only O(|F|) since no inference is involved.

Let us now discuss the algorithmic complexity of [1]'s approach. Let m be an approximation of the number of concepts for the attributes to be compared. Then: (i) the computation of weights is an operation of O(1) complexity; (ii) the construction of the graph involves the comparison of every pair of concepts and thus takes \(O(m^2)\) time; (iii) the Hungarian algorithm has a time complexity of \(O(m^3)\); and (iv) the degree of match in Algorithm 2 is O(|R||F||P|).

Generally, m is likely to take small values and can be considered as a constant. The overall time complexity of [1]'s algorithm is then \(O(1+m^2|F||R||P|+m^3)\simeq O(|F||R||P|)\). Based on the above discussion, the complexity of [1]'s algorithm becomes \(O(1+m^2 |F|+m^3)\simeq O(|F|)\) when the degree of similarity is computed as in Algorithm 3. The computation of the similarity measure according to Algorithm 5 has the same complexity as [1]'s algorithm, i.e., \(O(1+m^2|F||R||P|+m^3)\simeq O(|F||R||P|)\).

6 Evaluation and Comparison

6.1 Performance Analysis

The SME2 [8] tool has been used to evaluate the performance of the algorithms. SME2 uses OWLS-TC collections to provide the matchmakers with Web service descriptions and to compare their answers to the relevance sets of the various queries. It provides several metrics to evaluate the performance and effectiveness of a Web service matchmaker. The metrics considered in this paper are precision and recall, average precision, and query response time. The definitions of these metrics are given in [8].

A series of experiments has been conducted on a Dell Inspiron 15 3735 laptop with an Intel Core i5 processor (1.6 GHz) and 2 GB of memory. The test collection used is OWLS-TC4, which consists of 1083 Web service offers described in OWL-S 1.1 and 42 queries.

Figures 2 and 3 show the average precision and the recall/precision plots, respectively, of the accurate and efficient algorithms. It can be seen that the accurate algorithm outperforms the efficient algorithm with respect to these two metrics. This is due to the use of logical inference, which obviously enhances the precision of the accurate algorithm. In Fig. 4, however, the efficient algorithm is shown to be remarkably faster than the accurate algorithm. This is due to the inference process used in the accurate algorithm, which consumes considerable resources.

Fig. 2. Average precision

Fig. 3. Recall/Precision

Fig. 4. Response time

6.2 Comparative Study

Table 6 summarizes the main characteristics of the efficient and accurate algorithms and of some existing algorithms proposed in [1, 5, 10]. The following criteria have been considered in this comparison: (i) the definition of the semantic distance set V, (ii) the computation of the degree of match \(\delta (\cdot ,\cdot )\), (iii) the modelling technique of the matching problem, (iv) the level of precision, and (v) the level of complexity. The content of this table is self-explanatory.

Table 6. Comparison of similarity measure computing algorithms

7 Conclusion

Web service matchmaking is an important task in Web service discovery and Web service composition. Early non-logic-based matchmakers rely on syntactic and structural matching techniques, which suffer from several shortcomings, especially a high number of false positives and false negatives. Accordingly, different logic-based matchmakers have been proposed (e.g., [1, 5, 9, 10, 13]) to overcome these shortcomings. In particular, the authors in [1] propose a bipartite graph-based algorithm that considerably improves the algorithm of [10] by reducing false positives and false negatives in the final results.

This paper presents and compares two algorithms for computing a semantic logic-based similarity measure for Web service matchmaking. These algorithms improve the computation of the similarity measure proposed in [1]. Both algorithms have been evaluated using the SME2 tool. The results show that the first algorithm slightly affects the precision of [1]'s algorithm but substantially reduces its computing time, while the second algorithm enriches the semantic distance values used in [1] and considerably improves its precision. In future work, more advanced performance evaluations with larger datasets will be conducted.