1 Introduction

A common practice in the area of the Semantic Web is to reuse and extend existing ontologies for a specific task. Therefore, an approach to comparing multiple ontologies is often desired. In this paper, we propose the notion of projection module which characterizes the relative knowledge of an ontology, say, an ontology developed for a specific task (called task ontology), by taking another ontology as a reference (called reference ontology), e.g. an upper-level ontology. This can thus lead to (1) a method for comparing the entailment capacities of any two ontologies about a given vocabulary of interest, and (2) a fine-grained ontology comparison measurement between two ontologies.

As illustrated in Fig. 1, for a user interest expressed as a set \(\varSigma \) of concept and role names, the reference ontology \(\mathcal {T} _1\) (resp. task ontology \(\mathcal {T} _2\)) contains \(S_1\) (resp. \(S_2\)) as a sub-ontology, e.g., a minimal module [9, 10], that provides a minimal number of axioms entailing all \(\varSigma \)-consequences. A projection module, on the other hand, additionally takes a reference ontology into account as to preserve the relevant \(\varSigma \)-knowledge only, which can yield an even smaller module \(S_3\) of \(\mathcal {T} _2\). Figure 2 shows a concrete example with \(\mathcal {T} _1\) being the upper-level (reference) ontology modelling aspects of a university domain, and a task ontology \(\mathcal {T} _2\) as an extended and modified version of \(\mathcal {T} _1\). Consider the signature \(\varSigma = \{\textsf {Professor, Faculty, has, PhD, teach, Course} \}\) (consisting of the symbols marked in blue in Fig. 2). Then \(\mathcal {T} _1\) has merely one module \(S_1=\{\alpha _1\}\) that preserves \(\mathcal {T} _1\)’s knowledge about \(\varSigma \). When extracting modules of \(\mathcal {T} _2\) for \(\varSigma \), we obtain \(\mathcal {T} _2\) itself using existent module notions such as modules based on locality [14] or the module extracted by MEX [20]. Here we have two candidates for a minimal module \(S_2\) of \(\mathcal {T} _2\) that each preserve all inclusions over \(\varSigma \): \(\{\beta _1,\beta _2,\beta _3\}\) and \(\{\beta _1,\beta _2,\beta _4\}\) [9]. The projection modules of \(\mathcal {T} _2\), however, preserving the \(\varSigma \)-inclusions entailed by \(\mathcal {T} _1\) are even smaller with \(S_3=\{\beta _1,\beta _3\}\) or \(S_3 = \{\beta _1,\beta _4\}\). Every projection module \(S_3\) is a strict subset of a minimal module of \(\mathcal {T} _2\), which is in line with the fact that the task ontology \(\mathcal {T} _2\) has extended the reference ontology \(\mathcal {T} _1\) with new \(\varSigma \)-consequences, e.g., \(\textsf {Faculty}\sqsubseteq \exists \textsf {has.PhD}\).

Various approaches to comparing ontologies have been suggested, including ontology matching [13] and logical difference [22,23,24, 26]. Ontology matching is the process of determining correspondences, e.g., the subsumption relations between two concept or role names from different ontologies, for which a good concept similarity [1, 25] is often helpful. In contrast, logical difference focuses on the comparison of entailed logical consequences from each ontology and returns difference witnesses if differences are present. When an ontology has no logical difference compared to another one, our approach further extracts sub-ontologies of the first that contain the knowledge as represented by the second ontology.

Fig. 1.
figure 1

Projection module

Fig. 2.
figure 2

Example: minimal projection modules

Ontology modularity [9, 18, 22, 24, 27, 28] is about the extraction of sub-ontologies that preserve all logical consequences over a signature. The proposed projection module is different from modules of a single ontology, as illustrated by the example in Fig. 2. To compute projection modules, in this paper, we generalize the notion of justification to the notion of subsumption justification as a minimal set of axioms that maintains the entailment of a consequence. Our algorithm employs the classical notion of justification to compute subsumption justifications. Currently, the approaches for computing all justifications of an ontology for a consequence can be classified into two categories: “glass-box” [2, 5, 16, 17] and “black-box” [11, 16, 29].

We proceed as follows. After reviewing some preliminaries in Sect. 2, the notion of a minimal project module for subsumption, instance and conjunctive queries is introduced in Sect. 3. In Sect. 4, we introduce the algorithm for computing minimal projection modules. In Sect. 5, two applications of minimal projection modules are presented. Finally, we close the paper with a conclusion in Sect. 6.

2 Preliminaries

We start by reviewing the description logic \(\mathcal{EL}\) and several of its extensions.

Let \(\mathsf{N_C}\), \(\mathsf{N_R}\) and \(\mathsf{N_I}\) be mutually disjoint and countably infinite sets of concept names, role names and instance names. The signature \(\text {sig} (\xi )\) is the set of concept and role names occurring in \(\xi \), where \(\xi \) ranges over any syntactic object. The sets of \(\mathcal{EL}\) -concepts C, \(\mathcal{EL}^{\mathsf{ran}}\) -concepts D, \(\mathcal{EL}^{\sqcap }\) -concepts E, and \(\mathcal{EL}^{\sqcap ,u}\) -concepts F, and the sets of \(\mathcal{ELH}^{r} \)-inclusions \(\alpha \), \(\mathcal{EL}^{\mathsf{ran}} \)-inclusions \(\beta \) and \(\mathcal{EL}^{\mathsf{ran},\sqcap ,u} \)-inclusions \(\gamma \) are built according to the grammar rules:

$$\begin{aligned} \begin{array}{r@{\;\;}c@{\;\;}l} C &{} \,{::=}\, &{} A \mid C\sqcap C \mid \exists r.C \mid \mathsf{dom} (r)\\ D &{} \,{::=}\, &{} A \mid D\sqcap D \mid \exists r.D \mid \mathsf{dom} (r) \mid \mathsf{ran} (r)\\ E &{} \,{::=}\, &{} A \mid E\sqcap E \mid \exists R.E\\ F &{} \,{::=}\, &{} A \mid F\sqcap F \mid \exists R.F \mid \exists u.F\\ \alpha &{} \,{::=}\, &{} C\sqsubseteq C \mid \mathsf{ran} (r)\sqsubseteq C \mid \mathsf{ran} (r)\sqcap C\sqsubseteq C \mid C\equiv C \mid r\sqsubseteq s\\ \beta &{} \,{::=}\, &{} D\sqsubseteq C \mid r\sqsubseteq s\\ \gamma &{} \,{::=}\, &{} D\sqsubseteq F \mid r\sqsubseteq s \end{array} \end{aligned}$$

where \(A\in \mathsf{N_C} \), \(r,s\in \mathsf{N_R} \), u is a fresh logical symbol (the universal role) and \(R = r_1\sqcap \cdots \sqcap r_n\) with \(r_1,...,r_n\in \mathsf{N_R} \), for \(n\ge 1\). We refer to inclusions also as axioms. A \(\varGamma \)-TBox is a finite set of \(\varGamma \)-inclusions, where \(\varGamma \) ranges over the sets of \(\mathcal{ELH}^{r} \)- and \(\mathcal{EL}^{\mathsf{ran},\sqcap ,u} \)-inclusions. We use \(\text {lhs} (\alpha )\) (resp. \(\text {rhs} (\alpha )\)) to represent the left-hand side (resp. right-hand side) of an inclusion \(\alpha \).

The semantics is defined as usual in terms of interpretations interpreting concept/role names as unary/binary relations and are then inductively extended to complex concepts. The notions of satisfaction of a concept, axiom and TBox as well as the notions of a model and the logical consequence relation are defined as usual [4].

An \(\mathcal{ELH}^{r}\) -terminology \(\mathcal {T}\) is an \(\mathcal{ELH}^{r}\)-TBox consisting of axioms \(\alpha \) of the form \(A\sqsubseteq C\), \(A\equiv C\), \(r\sqsubseteq s\), \(\mathsf{ran} (r)\sqsubseteq C\) or \(\mathsf{dom} (r)\sqsubseteq C\), where A is a concept name, C an \(\mathcal{EL}\)-concept and no concept name occurs more than once on the left-hand side of an axiom. To simplify the presentation we assume that terminologies do not contain axioms of the form \(A \equiv B\) or \(A \equiv \top \) (after having removed multiple \(\top \)-conjuncts) for concept names A and B. For a terminology \(\mathcal {T}\), let \(\prec _\mathcal {T} \) be a binary relation over \(\mathsf{N_C}\) satisfying that \(A\prec _\mathcal {T} B\) iff there is an axiom of the form \(A\sqsubseteq C\) or \(A\equiv C\) in \(\mathcal {T}\) such that \(B\in \text {sig} (C)\). A terminology \(\mathcal {T}\) is acyclic if the transitive closure \(\prec _{\mathcal {T}}^{+}\) of \(\prec _\mathcal {T} \) is irreflexive; otherwise \(\mathcal {T}\) is cyclic. We say that a concept name A is conjunctive in \(\mathcal {T} \) iff there exist concept names \(B_1, \cdots , B_n\), \(n > 0\), such that \(A \equiv B_1 \sqcap \cdots \sqcap B_n \in \mathcal {T} \); otherwise A is said to be non-conjunctive in \(\mathcal {T} \). An \(\mathcal{ELH}^{r}\)-terminology \(\mathcal {T}\) is normalised iff it only contains axioms of the forms

  • \(r\sqsubseteq s\), \(\varphi \sqsubseteq B_1\sqcap \cdots \sqcap B_n\), \(A\sqsubseteq \exists r.B\), \(A\sqsubseteq \mathsf{dom} (r)\), and

  • \(A\equiv B_1\sqcap \cdots \sqcap B_m\), \(A\equiv \exists r.B\),

where \(\varphi \in \{A,\mathsf{dom} (s),\mathsf{ran} (s)\}\), \(n\ge 1\), \(m \ge 2\), \(A,B,B_i\in \mathsf{N_C} \), \(r,s\in \mathsf{N_R} \), and each conjunct \(B_i\) is non-conjunctive in \(\mathcal {T} \). Every \(\mathcal{ELH}^{r}\)-terminology \(\mathcal {T}\) can be normalised in polynomial time such that the resulting terminology is a conservative extension of \(\mathcal {T}\) [19]. A subset \(M \subseteq \mathcal {T} \) is called a justification for an \(\mathcal{ELH}\) -concept inclusion \(\alpha \) from \(\mathcal {T} \) iff \(M \models \alpha \) and \(M'\not \models \alpha \) for every \(M' \subsetneq M\).

We denote the set of all justifications for an \(\mathcal{ELH}\)-concept inclusion \(\alpha \) from an \(\mathcal{ELH}\)-terminology \(\mathcal {T} \) with \(\mathrm {Just} _{\mathcal {T}}(\alpha )\). The latter may contain exponentially many justifications in the number of axioms in \(\mathcal {T} \). An ABox contains assertions of the form \(\top (a)\), A(a) and r(ab), where \(a,b\in \mathsf{N_I} \) and \(r\in \mathsf{N_R} \). An ABox consists of finitely many ABox assertions.

Let \(\mathsf{N_I}\) and \(\mathsf{N_V}\) be disjoint sets of individual and variable names. A conjunctive query is a first-order formula built according to the following format: \(\exists y_1 \dots \exists y_n. \bigwedge _{i\in I_1} A_i(s_i) \wedge \bigwedge _{j\in I_2} r_j(t_j, t'_j)\), where \(y_1, \dots , y_n \in \mathsf{N_V} \) for \(n\ge 1\) are variable names, \(I_1, I_2\) are finite sets of indices, and for \(i\in I_1\) and \(j\in I_2\), \(A_i\) ranges over concept names in \(\mathsf{N_C} \), \(r_j\) ranges over role names in \(\mathsf{N_R} \), and \(s_i,t_j,t'_j\) range over individual and variable names in \(\mathsf{N_I} \cup \mathsf{N_V} \).

A signature \(\varSigma \) is a finite set of symbols from \(\mathsf{N_C}\) and \(\mathsf{N_R}\). The symbol \(\varSigma \) is used as a subscript to sets of concepts or inclusions to denote that the elements only use symbols from \(\varSigma \), e.g., \(\mathcal{ELH}^{r}_{\varSigma }\) and \(\mathcal{EL}^{\mathsf{ran},\sqcap ,u} _\varSigma \). For a signature \(\varSigma \), let \(\varSigma ^\mathsf{dom} = \{\,\mathsf{dom} (r) \mid r \in \mathsf{N_R} \cap \varSigma \,\}\) and \(\varSigma ^\mathsf{ran} = \{\,\mathsf{ran} (r) \mid r \in \mathsf{N_R} \cap \varSigma \,\}\) be the sets consisting of concepts of the form \(\mathsf{dom} (r)\) and \(\mathsf{ran} (r)\) for every role name r in \(\varSigma \), respectively. We recall the notion of logical difference for concept subsumption queries, instance queries and conjunctive queries from [19, 23]. For a more detailed introduction to description logics, we refer to [3, 4]. For latest results on logical inseparability see [7, 8, 15], and for a survey on query inseparability, see [6].

Definition 1

(Logical Difference). The \(\mathcal {L} \)-subsumption query difference, for some logic \(\mathcal {L} \), the instance and conjunctive query difference between \(\mathcal {T} _1\) and \(\mathcal {T} _2\) w.r.t. \(\varSigma \) are the sets \(\mathsf{cDiff}_{\varSigma }^\mathcal {L} (\mathcal {T} _1,\mathcal {T} _2)\), \(\mathsf{iDiff}_{\varSigma }(\mathcal {T} _1,\mathcal {T} _2)\), and \(\mathsf{qDiff}_{\varSigma }(\mathcal {T} _1,\mathcal {T} _2)\), respectively, where

  • \(\varphi \in \mathsf{cDiff}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _1,\mathcal {T} _2)\) iff \(\varphi \) is an \(\mathcal {L} \)-inclusion, \(\mathcal {T} _1\models \varphi \) and \(\mathcal {T} _2\not \models \varphi \);

  • \((\mathcal {A},\lambda ) \in \mathsf{iDiff}_{\varSigma }(\mathcal {T} _1,\mathcal {T} _2)\) iff \(\mathcal {A} \) is a \(\varSigma \)-ABox and \(\lambda \) a \(\varSigma \)-instance assertion such that \((\mathcal {T} _1,\mathcal {A}) \models \lambda \) and \((\mathcal {T} _2,\mathcal {A})\not \models \lambda \);

  • \((\mathcal {A},q(\mathbf {a})) \in \mathsf{qDiff}_{\varSigma }(\mathcal {T} _1,\mathcal {T} _2)\) iff \(\mathcal {A} \) is a \(\varSigma \)-ABox and \(q(\mathbf {a})\) a \(\varSigma \)-conjunctive query such that \((\mathcal {T} _1,\mathcal {A}) \models q(\mathbf {a})\) and \((\mathcal {T} _2,\mathcal {A})\not \models q(\mathbf {a})\).

According to [19], \(\mathcal {L} \)-subsumption queries for \(\mathcal {L} = \mathcal{EL}^{\mathsf{ran}} \) and \(\mathcal {L} = \mathcal{EL}^{\mathsf{ran},\sqcap ,u} \) are sufficient to detect the absence of any instance query and conjunctive query differences, respectively. Therefore, we only consider how to detect \(\mathcal {L} \)-subsumption queries for \(\mathcal {L} =\{\mathcal{ELH}^{r}, \mathcal{EL}^{\mathsf{ran}}, \mathcal{EL}^{\mathsf{ran},\sqcap ,u} \}\). Let \(\alpha ^\mathcal {L} _\varSigma \) be an \(\mathcal {L} \)-inclusion that only uses symbols in \(\varSigma \). We organise the \(\varSigma \)-symbols (and the domain and range concepts over role names from \(\varSigma \)) that occur as “witnesses” of a \(\mathcal {L} \)-subsumption query difference between \(\mathcal {T} _1\) and \(\mathcal {T} _2\) as follows:

figure a

where \(\mathsf{roleWtn}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _1,\mathcal {T} _2)=\{\,r\in \varSigma \cap \mathsf{N_R} \mid r \sqsubseteq s \text { or } s\sqsubseteq r\in \mathsf{cDiff}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _{1},\mathcal {T} _{2})\,\}\), \(\mathsf{lhsWtn}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _1,\mathcal {T} _2) = \{\varphi \in (\varSigma \cap \mathsf{N_C})\cup \varSigma ^\mathsf{dom} \cup \varSigma ^\mathsf{ran} \mid \varphi \sqsubseteq \text {rhs} (\alpha ^\mathcal {L} _\varSigma )\) and \(\alpha \) is a \(\mathcal {L} _\varSigma -\text {inclusion}\}\) and \(\mathsf{rhsWtn}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _1,\mathcal {T} _2) = \{\,A\in \varSigma \cap \mathsf{N_C} \mid \text {lhs} (\alpha ^\mathcal {L} _\varSigma ) \sqsubseteq A \in \mathsf{cDiff}_{\varSigma }^{Q}(\mathcal {T} _{1},\mathcal {T} _{2})\,\}\). The set \(\mathsf{Wtn}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _1,\mathcal {T} _2)\) can be seen as a finite representation of the set \(\mathsf{cDiff}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _{1},\mathcal {T} _{2})\), which is typically infinite when it is not empty. It follows from the “primitive witnesses” theorems in [19] that \(\mathsf{cDiff}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _{1},\mathcal {T} _{2}) = \emptyset \) iff \(\mathsf{Wtn}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _1,\mathcal {T} _2) = (\emptyset ,\emptyset ,\emptyset )\). Thus, deciding the existence of logical differences is equivalent to decide non-emptiness of the three witness sets.

3 Projection Modules

To understand the relations among different ontologies, we introduce the notion of projection module, as a way to explain how the knowledge that is encoded in a reference ontology is implemented in a task ontology. We are interested in computing all projection modules, since it provides a complete list of all implementations of an ontology regarding a reference, each of which may be necessary to be checked. To enable a manual validation by domain experts, we need to present only necessary information, so we focus on computing minimal projection modules.

A terminology \(\mathcal {T} _1\) together with a signature \(\varSigma \) and a type of query Q determine a set \(\varPhi \) of queries from Q formulated using only symbols from \(\varSigma \) that follow from \(\mathcal {T} _1\). A projection module of another terminology \(\mathcal {T} _2\) is a subset of \(\mathcal {T} _2\) that entails the queries in \(\varPhi \). For convenience, we bundle the parameters together in a tuple \(\rho =\langle \mathcal {T} _1,\varSigma , \mathcal {T} _2\rangle \), which we call a projection setting.

Definition 2

(Projection Module). Let \(\rho =\langle \mathcal {T} _1, \varSigma , \mathcal {T} _2\rangle \) be a projection setting, \(\mathcal {A} \) be a \(\varSigma \)-ABox. A subset \(\mathcal {M} \subseteq \mathcal {T} _2\) is a subsumption (resp. instance, conjunctive) query projection module under projection setting \(\rho \), denoted as \(\mathcal {M} _\rho ^c\) (resp. \(\mathcal {M} _\rho ^i\), \(\mathcal {M} _\rho ^q\)) iff:

  • \(\mathcal {M} _\rho ^c\): for each \(\mathcal{ELH}^{r} _\varSigma \)-inclusion \(\alpha \), if \(\mathcal {T} _1 \models \alpha \), then \(\mathcal {M} \models \alpha \);

  • \(\mathcal {M} _\rho ^i\): for each \(\varSigma \)-instance assertion \(\lambda \), if \((\mathcal {T} _1,\mathcal {A}) \models \lambda \), then \((\mathcal {M},\mathcal {A}) \models \lambda \);

  • \(\mathcal {M} _\rho ^q\): for each \(q(\mathbf {a})\), if \((\mathcal {T} _1,\mathcal {A}) \models q(\mathbf {a})\), then \((\mathcal {M},\mathcal {A}) \models q(\mathbf {a})\), where \(\mathbf {a}\) is a tuple of individual names from \(\mathcal {A} \) and \(q(\mathbf {a})\) is a \(\varSigma \)-conjunctive query.

A minimal subsumption (resp. instance, conjunctive) query projection module is a projection module \(\mathcal {M} _\rho ^c\) (resp. \(\mathcal {M} _\rho ^i,\mathcal {M} _\rho ^q\)) minimal w.r.t. \(\subsetneq \).

Note that there may exist several, even exponentially many minimal projection modules. It can readily be checked that \(\mathsf{cDiff}_{\varSigma }^\mathcal {L} (\mathcal {T} _1, \mathcal {M} _\rho ^c) = \emptyset \), for \(\mathcal {L} =\mathcal{ELH}^{r} \), \(\mathsf{iDiff}_{\varSigma }(\mathcal {T} _1, \mathcal {M} _\rho ^i) = \emptyset \) and \(\mathsf{qDiff}_{\varSigma }(\mathcal {T} _1, \mathcal {M} _\rho ^q) = \emptyset \) (cf. Definition 1).

Example 1

Suppose \(\mathcal {T} _1 =\{A_1\sqsubseteq A_2,A_2\sqsubseteq A_3\}\), \(\mathcal {T} _2=\{A_1\sqsubseteq A_3\sqcap B_1, B_1\sqsubseteq \exists r. A_3\}\), and the interested vocabulary \(\varSigma =\{A_1,A_3,r\}\). \(\mathcal {T} _2\) has no logical difference from \(\mathcal {T} _1\). However, the concept project module of \(\mathcal {T} _2\) with respect to \(\mathcal {T} _1\) and \(\varSigma \) is \(\{A_1\sqsubseteq A_3\sqcap B_1\}\subsetneq \mathcal {T} _2\). This means that a strict sub-ontology of \(\mathcal {T} _2\) is sufficient to capture all the information of \(\mathcal {T} _1\) about \(\varSigma \). Moreover, \(\mathcal {T} _2\) also entails a consequence \(A_1\sqsubseteq \exists r. A_3\), which is not the case for \(\mathcal {T} _1\).

The following example shows that the three notions of projection modules based on different query languages are distinct.

Example 2

Let \(\mathcal {T} =\{X\sqsubseteq Y, Y \sqsubseteq \exists t. Z, \mathsf{ran} (r) \sqsubseteq A_1, \mathsf{ran} (s) \sqsubseteq A_2, B \equiv A_1 \sqcap A_2 \}\), \(\varSigma =\{X,Y,Z,B,r,s\}\), and \(\rho =\langle \mathcal {T}, \sigma , \mathcal {T} \rangle \). We have that \(\mathcal {M} _\rho ^c=\{X\sqsubseteq Y \}\), \(\mathcal {M} _\rho ^i=\mathcal {M} _\rho ^c \cup \{\mathsf{ran} (r) \sqsubseteq A_1, \mathsf{ran} (s) \sqsubseteq A_2, B \equiv A_1 \sqcap A_2 \}\) and \(\mathcal {M} _\rho ^q =\mathcal {T} \).

Definition 3

Let \(Q\in \{c,i,q\}\). The relationship between \(\mathcal {T} _1\) and \(\mathcal {T} _2\) is a \(\langle \varSigma ,Q \rangle \)-implementation, denoted , iff there exists a projection module \(\mathcal {M} _\rho ^Q\) under the setting \(\rho =\langle \mathcal {T} _1,\varSigma ,\mathcal {T} _2\rangle \).

If , we also say that \(\mathcal {T} _2\) \(\langle \varSigma ,Q\rangle \)-implements \(\mathcal {T} _1\). In case \(\mathcal {T} _1\) and \(\mathcal {T} _2\) \(\langle \varSigma ,Q\rangle \)-implement each other, they cannot be separated using the query language Q.

Proposition 1

Let and . Then:

  • \(Q=c\): \(\mathsf{cDiff}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _1,\mathcal {T} _2) = \mathsf{cDiff}_{\varSigma }^{\mathcal {L}}(\mathcal {T} _2,\mathcal {T} _1) = \emptyset \), for \(\mathcal {L} = \mathcal{ELH}^{r} \);

  • \(Q=i\): \(\mathsf{iDiff}_{\varSigma }(\mathcal {T} _1,\mathcal {T} _2) = \mathsf{iDiff}_{\varSigma }(\mathcal {T} _2,\mathcal {T} _1) = \emptyset \); and

  • \(Q=q\): \(\mathsf{qDiff}_{\varSigma }(\mathcal {T} _1,\mathcal {T} _2) = \mathsf{qDiff}_{\varSigma }(\mathcal {T} _2,\mathcal {T} _1) = \emptyset \).

We obtain the following monotonicity properties of the \(\langle \varSigma ,Q\rangle \)-implementation relation.

Proposition 2

(Implementation Monotonicity).

  1. (i)

    If \(\mathcal {T} _2\subseteq \mathcal {T} _3\) and , then ;

  2. (ii)

    If \(\mathcal {T} _1 \subseteq \mathcal {T} _2\) and , then .

Property (i) states that if a terminology \(\mathcal {T} _3\) is obtained from \(\mathcal {T} _2\) by adding further axioms to it, then it \(\langle \varSigma ,Q\rangle \)-implements all the terminologies \(\mathcal {T} _1\) that \(\mathcal {T} _2\) \(\langle \varSigma ,Q\rangle \)-implements. Property (ii) states \(\mathcal {T} _3\) \(\langle \varSigma ,Q\rangle \)-implements all subsets \(\mathcal {T} _1\) of \(\mathcal {T} _2\) provided that \(\mathcal {T} _3\) \(\langle \varSigma ,Q\rangle \)-implements \(\mathcal {T} _2\). We leave investigating certain robustness properties of regarding signature extensions and varying query languages for future work; see, e.g., [21].

4 Computing Minimal Projection Modules

It is shown in [19] that detecting concept inclusion differences formulated in \(\mathcal{EL}^{\mathsf{ran}}\) and \(\mathcal{EL}^{\mathsf{ran},\sqcap ,u}\) is equivalent to detecting a difference with instance and conjunctive queries, respectively. We therefore consider subsumption queries from \(\mathcal{EL}^{\mathsf{ran}}\) and \(\mathcal{EL}^{\mathsf{ran},\sqcap ,u}\) to compute minimal projection justifications for instance and conjunctive queries, respectively.

4.1 Definition of Subsumption Projection Justifications

For computing minimal subsumption projection modules for subsumption queries, we introduce the notion of a subsumption projection justification between two terminologies. As the notion depends on several parameters, we organise them for better readability in a tuple \(\chi \) of the form \(\langle \mathcal {T} _1, X_1, \varSigma , \mathcal {T} _2, X_2, \mathcal {L} \rangle \), where \(\mathcal {T} _1\) and \(\mathcal {T} _2\) are normalised \(\mathcal{ELH}^{r} \)-terminologies, \(\varSigma \) is a signature, \(X_1, X_2 \in \mathsf{N_C} \cup \{\,\mathsf{dom} (r),\mathsf{ran} (r) \mid r \in \mathsf{N_R} \,\}\), and \(\mathcal {L} \in \{\mathcal{ELH}^{r},\mathcal{EL}^{\mathsf{ran}},\mathcal{EL}^{\mathsf{ran},\sqcap ,u} \}\).

To obtain subsumption modules, we use an operator ‘\(\otimes \)’ to combine sets of role, subsumee and subsumer projection justifications. Given a set \(\mathcal {S} \) and sets of sets \(\mathbb {S} _1,\mathbb {S} _2 \subseteq 2^\mathcal {S} \), we define . For instance, if \(\mathbb {S} _1=\{\{\alpha _1,\alpha _2\},\{\alpha _3\}\}\) and \(\mathbb {S} _2=\{\{\alpha _1,\alpha _3\},\{\alpha _4,\alpha _5\}\}\), then \(\mathbb {S} _1 \otimes \, \mathbb {S} _2 = \{ \{\alpha _1,\alpha _2,\alpha _3\},\) \(\{\alpha _1,\alpha _2,\alpha _4,\alpha _5\},\) \(\{\alpha _3,\alpha _4,\alpha _5\},\) \(\{\alpha _1,\alpha _3\} \}\). For a set \(\mathbb {M} \) of sets, we define a function \(\mathrm {Minimise}_{\subseteq }(\mathbb {M})\) as follows: \(\mathcal {M} \in \mathrm {Minimise}_{\subseteq }(\mathbb {M})\) iff \(\mathcal {M} \in \mathbb {M} \) and there does not exist a set \(\mathcal {M} '\in \mathbb {M} \) such that \(\mathcal {M} '\subsetneq \mathcal {M} \). Continuing with the example, \(\mathrm {Minimise}_{\subseteq }(\mathbb {S} _1 \otimes \mathbb {S} _2)=\{\{\alpha _1,\alpha _3\},\) \(\{\alpha _1,\alpha _2,\alpha _4,\alpha _5\},\) \(\{\alpha _3,\alpha _4,\alpha _5\}\}\).

Definition 4

(Subsumption Projection Justification). Let \(\chi =\langle \mathcal {T} _1, X_1, \varSigma \), \(\mathcal {T} _2, X_2, \mathcal {L} \rangle \). A set \(\mathcal {M} \) is a subsumee module under \(\chi \) iff \(\mathcal {M} \subseteq \mathcal {T} _2\) and for every \(\mathcal {L} _\varSigma \)-inclusion \(\alpha \): \(\mathcal {T} _1 \models \text {lhs} (\alpha ) \sqsubseteq X_1\) implies \(\mathcal {M} \models \text {lhs} (\alpha ) \sqsubseteq X_2\); and \(\mathcal {M} \) is a subsumer module under \(\chi \) iff \(\mathcal {M} \subseteq \mathcal {T} _2\) and for every \(\mathcal {L} _\varSigma \)-inclusion \(\alpha \): \(\mathcal {T} _1 \models X_1 \sqsubseteq \text {rhs} (\alpha )\) implies \(\mathcal {M} \models X_2 \sqsubseteq \text {rhs} (\alpha )\).

\(\mathcal {M} \) is called a subsumption projection module under \(\chi \) iff \(\mathcal {M} \) is a subsumee and a subsumer projection module under \(\chi \). A subsumee (subsumer, subsumption) projection justification under \(\chi \) is a subsumee (resp. subsumer, subsumption) projection module under \(\chi \) that is minimal w.r.t. \(\subsetneq \).

We denote the set of all subsumee (resp. subsumer, subsumption) justifications under \(\chi \) as \(\mathbb {J} ^\leftarrow _\chi \) (resp. \(\mathbb {J} ^\rightarrow _\chi \), \(\mathbb {J} _\chi \)), where \(\chi =\langle \mathcal {T} _1, \varphi _1, \varSigma , \mathcal {T} _2, \varphi _2, \mathcal {L} \rangle \), and \(\varphi _1,\varphi _2\in (\mathsf{N_C} \cap \varSigma ) \cup \varSigma ^\mathsf{dom} \cup \varSigma ^\mathsf{ran} \).

Definition 5

(Role Subsumption Projection Justification). Let \(\rho = \langle \mathcal {T} _1, \varSigma , \mathcal {T} _2 \rangle \) be a projection setting. A set \(\mathcal {M} \) is called a role subsumption module under \(\rho \) iff \(\mathcal {M} \subseteq \mathcal {T} _2\) and for every \(r, s \in \mathsf{N_R} \cap \varSigma \), \(\mathcal {T} _1 \models r \sqsubseteq s\) implies \(\mathcal {M} \models r \sqsubseteq s\). A minimal role subsumption projection justification under \(\rho \) is the role subsumption module under \(\rho \) that is minimal w.r.t. \(\subsetneq \).

We denote the set of all role subsumption projection justifications under \(\rho \) as \(\mathbb {J} ^R_\rho \). The following lemma states how role subsumption projection justifications can be computed. The lemma can be shown using Definition 5 and the notion of justification.

Lemma 1

Let \(\rho = \langle \mathcal {T} _1, \varSigma , \mathcal {T} _2 \rangle \) be a projection setting.

figure l

Using Definitions 1 and 4, we obtain the following lemma stating the absence of certain concept names, and domain and range concepts over role names as left-hand and right-hand difference witnesses between two terminologies \(\mathcal {T} _1\) and \(\mathcal {T} _2\).

Lemma 2

Let \(\varphi \in (\mathsf{N_C} \cap \varSigma ) \cup \varSigma ^\mathsf{dom} \cup \varSigma ^\mathsf{ran} \) and let \(A \in \varSigma \cap \mathsf{N_C} \). Additionally, let \(\chi =\langle \mathcal {T} _1, \varphi , \varSigma , \mathcal {T} _2, \varphi , \mathcal {L} \rangle \), \(\chi '=\langle \mathcal {T} _1, A, \varSigma , \mathcal {T} _2,A, \mathcal {L} \rangle \) and \(\mathcal {L} \in \{\mathcal{ELH}^{r},\mathcal{EL}^{\mathsf{ran}},\mathcal{EL}^{\mathsf{ran},\sqcap ,u} \}\). Then:

  • \(\varphi \not \in \mathsf{lhsWtn}_{\varSigma }^\mathcal {L} (\mathcal {T} _1,J_\varphi )\) for every \(J_\varphi \in \mathbb {J} ^\rightarrow _\chi \);

  • \(A \not \in \mathsf{rhsWtn}_{\varSigma }^\mathcal {L} (\mathcal {T} _1,J_A)\) for every \(J_A\in \mathbb {J} ^\leftarrow _{\chi '}\);

  • \(\mathsf{roleWtn}_{\varSigma }^\mathcal {L} (\mathcal {T} _1,J)=\emptyset \) for every \(J \in \mathbb {J} ^R_\rho \).

We need at least one subsumption justification, for every potential difference witness, to be contained in a projection module in order to prevent the witness; cf. Lemma 2. This is made precise in the following theorem.

Theorem 1

Let \(\rho = \langle \mathcal {T} _1, \varSigma , \mathcal {T} _2 \rangle \) be a projection setting and let \(Q\in \{c,i,q\}\). Additionally, let \(\mathbb {M} ^Q_\rho \) be the set of all minimal projection modules under \(\rho \) for query type Q. Finally, let

figure m

where \(\chi (\psi ,\mathcal {L}) = \langle \mathcal {T} _1, \psi , \varSigma , \mathcal {T} _2, \psi , \mathcal {L} \rangle \), and \(\mathcal {L} = \mathcal{ELH}^{r} \) if \(Q=c\), \(\mathcal {L} =\mathcal{EL}^{\mathsf{ran}} \) if \(Q=i\), and \(\mathcal {L} = \mathcal{EL}^{\mathsf{ran},\sqcap ,u} \) if \(Q = q\). Then it holds that \(\mathbb {M} ^Q_\rho =\mathbb {S} ^\mathcal {L} _\rho \).

Fig. 3.
figure 3

Algorithms for computing all subsumee justifications

In this paper, we present an algorithm for computing subsumee projection justifications for subsumption queries of \(\mathcal{EL}^{\mathsf{ran},\sqcap ,u} \). Recall that \(\mathcal{EL}^{\mathsf{ran},\sqcap ,u} \)-inclusions are sufficient to detect any difference that is detectable with conjunctive queries. The algorithm for computing subsumer projection justifications, and the algorithms for the other query languages are similar.

4.2 Computing Subsumee Projection Justifications

We now present the algorithm for computing subsumee projection justifications (Fig. 3). The basic idea of the algorithm is to collect as few axioms from a terminology \(\mathcal {T} _2\) as possible while ensuring the existence of a so-called subsumee simulation between another terminology \(\mathcal {T} _1\) and \(\mathcal {T} _2\) [12, 26].

Inclusions of the form \(\mathsf{ran} (r) \sqsubseteq X\) might cause non-trivial entailments. For example, let \(\mathcal {T} _1=\{X \equiv \exists r. Y, Y \equiv A_1 \sqcap A_2\}\) and \(\varSigma =\{X,A_1,A_2,r\}\). Then \(\mathcal {T} _1\) entails that X is subsumed by the \(\varSigma \)-concepts \(\exists r.\top \) and \(\exists r.(A_1 \sqcap A_2)\) (modulo equivalence). For \(\mathcal {T} _2 = \mathcal {T} _1 \cup \{\mathsf{ran} (r) \sqsubseteq A_1\}\), however, we additionally obtain \(\mathcal {T} _2 \models \exists r.A_2 \sqsubseteq X\). Hence, when formulating the algorithms for computing subsumee simulations, an additional parameter \(\zeta \in \{\epsilon \} \cup (\mathsf{N_R} \cap \varSigma )\) is needed which is used in range concepts of the form \(\mathsf{ran} (\zeta )\). We call this parameter context of a role. We treat \(\epsilon \) as a special role name and set \(\mathsf{ran} (\epsilon ) = \top \). The set of all role contexts, in symbols \(\mathcal {C} ^\varSigma \), is defined as \(\mathcal {C} ^\varSigma = \{\epsilon \} \cup (\mathsf{N_R} \cap \varSigma )\).

To identify concept and role names that are relevant for a subsumee simulation that we propose later, we first use the following notion of \(\varSigma \)-entailment:

  • \(A\in \mathsf{N_C} \) is \(\varSigma \)-entailed in \(\mathcal {T} \) iff there is an \(\mathcal{EL}^{\textsf {ran}}_{\varSigma } \)-concept C such that \(\mathcal {T} \models C \sqsubseteq A\);

  • \(s\in \mathsf{N_R} \) is \(\varSigma \)-entailed in \(\mathcal {T} \) iff there exists \(s' \in \mathsf{N_R} \cap \varSigma \) such that \(\mathcal {T} \models s' \sqsubseteq s\);

  • \(A\in \mathsf{N_C} \) is \((\varSigma ,s)\)-entailed in \(\mathcal {T} \) iff there is an \(\mathcal{EL}^{\textsf {ran}}_{\varSigma } \)-concept C such that \(\mathcal {T} \models C \sqcap \mathsf{ran} (s) \sqsubseteq A\).

Moreover, we say that \(X \in \mathsf{N_C} \) is complex \(\varSigma \)-entailed w.r.t. \(\mathcal {T} \) iff for every \(Y \in \text {{non-conj}} _\mathcal {T} (X)\) one of the following conditions holds:

  • there exists \(B \in \varSigma \) such that \(\mathcal {T} \models B \sqsubseteq Y\) and \(\mathcal {T} \not \models B \sqsubseteq X\);

  • there exists \(Y \equiv \exists r.Z \in \mathcal {T} \) and r, Z are \(\varSigma \)-entailed in \(\mathcal {T} \).

X is said to be simply \(\varSigma \)-entailed if X is \(\varSigma \)-entailed but not complex \(\varSigma \)-entailed. For example, let \(\mathcal {T} = \{X \equiv X_1 \sqcap X_2,\, B_1 \sqsubseteq X_1,\, X_2 \equiv \exists r.Z,\, B_2 \sqsubseteq Z, s \sqsubseteq r\}\). We have that \(\text {{non-conj}} _\mathcal {T} (X)=\{X_1, X_2\}\), then r is \(\varSigma \)-entailed w.r.t. \(\mathcal {T}\); X is complex \(\varSigma \)-entailed w.r.t. \(\mathcal {T}\) for \(\varSigma = \{B_1,B_2,s\}\); but X is not complex \(\varSigma '\)-entailed w.r.t. \(\mathcal {T}\), where \(\varSigma '\) ranges over \(\{B_1,B_2\}\), \(\{B_1,s\}\), \(\{B_2,s\}\). Additionally, X is not complex \(\varSigma \)-entailed w.r.t. \(\mathcal {T} \cup \{B_1 \sqsubseteq X\}\).

We now define the notion of a subsumee simulation from \(\mathcal {T} _1\) to \(\mathcal {T} _2\) as a subset of \((\mathsf{N_C} \cap \text {sig} (\mathcal {T} _1)) \times (\mathsf{N_C} \cap \text {sig} (\mathcal {T} _2)) \times \mathcal {C} ^\varSigma _{\mathcal {T} _1}\), where is the range of role contexts.

Definition 6

(Subsumee Simulation). A relation \(S \subseteq \text {sig} ^\mathsf{N_C} (\mathcal {T} _1) \times \text {sig} ^\mathsf{N_C} (\mathcal {T} _2) \times \mathcal {C} ^\varSigma _{\mathcal {T} _1}\) is a \(\varSigma \)-subsumee simulation from \(\mathcal {T} _1\) to \(\mathcal {T} _2\) iff the following conditions hold:

\((S^{\leftarrow }_\mathsf{N_C})\) :

if \((X_1,X_2,\zeta ) \in S\), then for every \(\varphi \in \varSigma ^{\zeta }\) and for every \(X'_2 \in \text {{non-conj}} _{\mathcal {T} _2}(X_2)\) with \(\mathcal {T} _2 \not \models \mathsf{ran} (\zeta ) \sqsubseteq X_2'\), \(\mathcal {T} _1 \models \varphi \sqsubseteq X_1\) implies \(\mathcal {T} _2 \models \varphi \sqsubseteq X_2'\);

\((S^{\leftarrow }_\exists )\) :

if \((X_1,X_2,\zeta ) \in S\) and \(X_1 \equiv \exists r.Y_1 \in \mathcal {T} _1\) such that \(\mathcal {T} _1 \models s \sqsubseteq r\) for \(s\in \varSigma \) and \(Y_1\) is \((\varSigma , s)\)-entailed in \(\mathcal {T} _1\), then for every \(X'_2 \in \text {{non-conj}} _{\mathcal {T} _2}(X_2)\) not entailed by \(\mathsf{dom} (s)\) or \(\mathsf{ran} (\zeta )\) w.r.t. \(\mathcal {T} _2\), there exists \(X'_2 \equiv \exists r'.Y_2 \in \mathcal {T} _2\) such that \(\mathcal {T} _2 \models s \sqsubseteq r'\) and \((Y_1,Y_2, s) \in S\);

\((S^{\leftarrow }_\sqcap )\) :

if \((X_1,X_2,\zeta ) \in S\) and \(X_1 \equiv Y_1 \sqcap Y_2 \sqcap \cdots \sqcap Y_n \in \mathcal {T} _1\), then for every \(Y_2 \in \text {{non-conj}} _{\mathcal {T} _2}(X_2)\) not entailed by \(\mathsf{ran} (\zeta )\) w.r.t. \(\mathcal {T} _2\), there exists \(Y_1 \in \text {{non-conj}} _{\mathcal {T} _1}(X_1)\) not entailed by \(\mathsf{ran} (\zeta )\) w.r.t. \(\mathcal {T} _2\) with \((Y_1,Y_2,\epsilon ) \in S\).

We write \(\mathcal {T} _1\sim ^{\leftarrow }_{\varSigma }\mathcal {T} _2\) iff there is a \(\varSigma \)-subsumee simulation S from \(\mathcal {T} _1\) to \(\mathcal {T} _2\) such that for every \(A \in \mathsf{N_C} \cap \varSigma \): \((A,A, \epsilon ) \in S\).

For \(\zeta \in \varSigma \cap \mathsf{N_R} \), we write \(\langle \mathcal {T} _1, X_1\rangle \sim ^{\leftarrow }_{\varSigma , \zeta } \langle \mathcal {T} _2, X_2 \rangle \) iff there is a \(\varSigma \)-subsumee simulation S from \(\mathcal {T} _1\) to \(\mathcal {T} _2\) with \((X_1, X_2,\zeta ) \in S\) for which \(\mathcal {T} _1\sim ^{\leftarrow }_{\varSigma }\mathcal {T} _2\).

A subsumee simulation captures the set of subsumees in the sense that \(\mathcal {T} _1\sim ^{\leftarrow }_{\varSigma }\mathcal {T} _2\) iff \(\mathsf{rhsWtn}_{\varSigma }(\mathcal {T} _1,\mathcal {T} _2)=\emptyset \). Moreover, if a concept name \(X_2\) in \(\mathcal {T} _2\) \(\varSigma \)-subsumee simulates a concept name \(X_1\) in \(\mathcal {T} _1\), then \(X_2\) subsumes all \(\varSigma \)-concepts w.r.t. \(\mathcal {T} _2\) that are subsumed by \(X_1\) w.r.t. \(\mathcal {T} _1\). Formally: \(\langle \mathcal {T} _1, X_1\rangle \sim ^{\leftarrow }_{\varSigma , \zeta } \langle \mathcal {T} _2, X_2 \rangle \) iff for every \(C \in \mathcal{EL} _\varSigma ^\mathsf{ran} \): \(\mathcal {T} _1 \models C \sqsubseteq X_1\) implies \(\mathcal {T} _2 \models C \sqsubseteq X_2\) [26].

Algorithm 1 provides the function \(\textsc {Cover}_\leftarrow \) for computing the set of all subsumee justifications. The algorithm recursively computes sets of axioms sufficient to construct a subsumee simulation. For better readability, the algorithm is structured into several parts, one for each condition of a subsumee simulation, cf. Definition 6. Algorithm 3 handles Case \((S^{\leftarrow }_\mathsf{N_C})\), Algorithm 2 takes care of Case \((S^{\leftarrow }_\exists )\) and Algorithm 4 is responsible for Case \((S^{\leftarrow }_\sqcap )\). Note that each of these algorithms requires a role context \(\zeta \) as an input parameter. The notion of complex \(\varSigma \)-entailment is employed in Algorithm 1. If X is not complex \(\varSigma \)-entailment, then neither the existential nor the conjunctive case need to be considered, and Algorithm 1 terminates in Line 6.

Compared with computing subsumer projection justifications, the challenge of computing subsumee projection justifications is to handle conjunctions on the left-hand side of subsumptions. Concept names defined as conjunctions in \(\mathcal {T} _2\) use conjuncts which in turn may also be defined as conjunctions. Such axioms form tree structures. When selecting axioms, all minimal subsets of \(\mathcal {T} _2\), i.e., all sub-trees, that maintain a subsumee simulation need to be considered. To this end, we define for each concept name X a so-called definitorial forest consisting of sets of axioms of the form \(Y \equiv Y_1 \sqcap \ldots \sqcap Y_n\) which can be thought of as forming trees. Any \(\langle X,\varSigma \rangle \)-subsumee projection justification contains the axioms of a selection of these trees, i.e., one tree for every conjunction formulated over \(\varSigma \) that entails X w.r.t. \(\mathcal {T} \). Formally, we define a set of a \(\text {DefForest}^{\sqcap } _\mathcal {T} (X) \subseteq 2^{\mathcal {T}}\) to be the smallest set closed under the following conditions:

  • \(\emptyset \in \text {DefForest}^{\sqcap } _\mathcal {T} (X)\);

  • \(\{\alpha \} \in \text {DefForest}^{\sqcap } _\mathcal {T} (X)\) for \(\alpha := X \equiv X_1 \sqcap \ldots \sqcap X_n \in \mathcal {T} \); and

  • \(\varGamma \cup \{\alpha \} \in \text {DefForest}^{\sqcap } _\mathcal {T} (X)\) for \(\varGamma \in \text {DefForest}^{\sqcap } _\mathcal {T} (X)\) with \(Z \equiv Z_1 \sqcap \ldots \sqcap Z_k \in \varGamma \) and \(\alpha := Z_i \equiv Z^1_i \sqcap \cdots \sqcap Z^n_i \in \mathcal {T} \).

Given a tree \(\varGamma \in \text {DefForest}^{\sqcap } _\mathcal {T} (X)\) rooted at X, we use \(\text {leaves} (\varGamma )\) to denote the set \(\text {sig} (\varGamma )\setminus \{\,X \in \text {sig} (C) \mid X \equiv C \in \varGamma \,\}\) of leaves if \(\varGamma \ne \emptyset \); and \(\{X\}\) otherwise. We denote with \(\text {max-tree}^{\sqcap } _\mathcal {T} (X)\) the set in \(\text {DefForest}^{\sqcap } _\mathcal {T} (X)\) that is maximal w.r.t. \(\subseteq \). Finally, we set to be the set of leaves of the maximal tree. For example, for \(\mathcal {T} = \{\alpha _1,\alpha _2,\alpha _3\}\) with \(\alpha _1 := X\equiv Y\sqcap Z\), \(\alpha _2 := Y \equiv Y_1\sqcap Y_2\), and \(\alpha _3 := Z\equiv Z_1\sqcap Z_2\), we obtain \(\text {DefForest}^{\sqcap } _\mathcal {T} (X) = \{\emptyset , \{\alpha _1\}, \{\alpha _1,\alpha _2\},\{\alpha _1,\alpha _3\}, \{\alpha _1,\alpha _2,\alpha _3\}\}\). Moreover, we have that \(\text {leaves} (\{\alpha _1,\alpha _3\}) = \{Y,Z_1,Z_2\}\), \(\text {max-tree}^{\sqcap } _\mathcal {T} (X) = \{\alpha _1,\alpha _2,\alpha _3\}\), and \(\text {{non-conj}} _\mathcal {T} (X) = \{Y_1,Y_2,Z_1,Z_2\}\).

The definitorial forest is used to enumerate and find all trees for which Case \((S^{\leftarrow }_\sqcap )\) holds, which is done in Algorithm 4. The set \(\text {{non-conj}} _\mathcal {T} (X)\), however, is also used in Algorithm 2, which we discuss next. The existence of axiom in Line 2 of Algorithm 2 is guaranteed by Line 7 of Algorithm 1. The axiom in Line 6 of Algorithm 2 exists as we assume that \(\mathcal {T} _2,X_2\) subsumee-simulates \(\mathcal {T} _1,X_1\) w.r.t. \(\varSigma \). Moreover, there is at most one axiom \(\alpha _{X_1}\in \mathcal {T} _1\) and at most one \(\alpha _{X'_2}\in \mathcal {T} _2\) as \(\mathcal {T} _1\) and \(\mathcal {T} _2\) are terminologies. The concept name \(X_2\) may be defined as a conjunction in \(\mathcal {T} _2\) whose conjuncts in turn may also be defined as a conjunction in \(\mathcal {T} _2\) and so forth. In Line 3 all axioms forming the maximal resulting definitorial conjunctive tree are collected.

For the next algorithm, we define to be the set of concept names that are conjunctively defined in \(\mathcal {T} \). For every \(X \in \text {def}^{\sqcap } _\mathcal {T} \), we set , where \(\alpha = X \equiv Y_1 \sqcap \cdots \sqcap Y_n \in \mathcal {T} \).

The axiom in Line 2 of Algorithm 4 is guaranteed to exist by Line 9 of Algorithm 4. In case \(X_2\) is defined as a conjunction in \(\mathcal {T} _2\), the pair consisting of \(\mathcal {T} _2\) containing only a partial conjunctive tree rooted at \(X_2\) and \(X_2\) needs to be considered to be sufficient to subsumee simulate \(\mathcal {T} _1,X_1\). Therefore Algorithm 4 considers every partial conjunctive tree \(\varGamma \) from \(\text {DefForest}^{\sqcap } _{\mathcal {T} _2}(X_2)\) in Line 4 and removes the axioms in \(\delta _\varGamma \) connecting the leaves of \(\varGamma \) with the remaining conjunctive tree from \(\mathcal {T} _2\) in lines 10 and 11.

The following theorem states that Algorithm 1 indeed computes the set of subsumee projection justifications. The proof establishes that Algorithm 1 computes all possible subsets of \(\mathcal {T} _2\) that are minimal w.r.t. \(\subsetneq \) while preserving one of the considered \(\varSigma \)-subsumee simulations from \(\mathcal {T} _1\) to \(\mathcal {T} _2\).

Theorem 2

Let \(\chi =\langle \mathcal {T} _1, \varphi _1, \varSigma , \mathcal {T} _2, \varphi _2, \mathcal{EL}^{\mathsf{ran},\sqcap ,u} \rangle \), and \(\varphi _1, \varphi _2 \in (\mathsf{N_C} \cap \varSigma ) \cup \{\,\mathsf{dom} (r), \mathsf{ran} (r) \mid r \in \mathsf{N_R} \cap \varSigma \,\}\). Additionally, let using Algorithm 1. Then \(\mathbb {M} \) is the set of all subsumee justifications under \(\chi \).

Algorithm 1 runs in exponential time in the number of axioms contained in the input terminologies, in the worst case. On the one hand, the algorithm uses justifications (see Line 6 of Alg. 2 and Line 5 of Alg. 3) whose number grows exponentially for role inclusions as well as concept name inclusions. The different justifications are each incorporated using the operator \(\otimes \) resulting in possibly different subsumption justifications. The majority of the running time will be spent on computing justifications. Another source of exponential blowup is contained in Line 4 of Algorithm 4. The number of elements in the set \(\text {DefForest}^{\sqcap } _\mathcal {T} (X)\) grows exponentially in \(|\mathcal {T} |\). According to our experience so far, however, it seems plausible to assume that definitorial forests in practical ontologies remain rather small and, thus, they do not cause a serious slowdown of the algorithm.

5 Application of Minimal Projection Modules

In this section, we discuss two applications of minimal projection modules.

5.1 Computing Minimal Query Modules

We first define the minimal query modules for different queries.

Definition 7

(Query Module). A set \(M \subseteq \mathcal {T} \) is a subsumption (resp. instance, conjunctive) query module of \(\mathcal {T} \), denoted as \(M^c_\varSigma \) (resp. \(M^i_\varSigma \), \(M^q_\varSigma \)).

  • \(M_\varSigma ^c\): for each \(\mathcal{ELH}^{r} _\varSigma \)-inclusion \(\alpha \), if \(\mathcal {T} \models \alpha \), then \(M \models \alpha \);

  • \(M_\varSigma ^i\): for each \(\varSigma \)-instance assertion \(\lambda \), if \((\mathcal {T},\mathcal {A}) \models \lambda \), then \((M,\mathcal {A}) \models \lambda \);

  • \(M_\varSigma ^q\): for each \(q(\mathbf {a})\), if \((\mathcal {T},\mathcal {A}) \models q(\mathbf {a})\), then \((M,\mathcal {A}) \models q(\mathbf {a})\), where \(\mathbf {a}\) is a tuple of individual names from \(\mathcal {A} \) and \(q(\mathbf {a})\) is a \(\varSigma \)-conjunctive query.

A subsumption (resp. instance, conjunctive) query module is called a minimal subsumption (resp. instance, conjunctive) query module iff it is minimal w.r.t.  \(\subsetneq \).

In general, the reference and the implementing/task ontologies do not coincide. Intuitively, the task ontology \(\mathcal {T} _2\) might contain more knowledge about \(\varSigma \) than the reference ontology \(\mathcal {T} _1\). The following lemma illustrates the relationship between minimal projection modules and minimal query modules.

Lemma 3

Let \(Q\in \{c,i,q\}\) and \(\rho =\{\mathcal {T} _1, \varSigma , \mathcal {T} _2\}\). Then: for every minimal projection module \(\mathcal {M} ^Q_\rho \) for a query type Q and under a projection setting \(\rho \), there exists a minimal Q-query module \(M^Q_\varSigma \) of \(\mathcal {T} _2\) such that \(\mathcal {M} ^Q_\rho \subseteq M^Q_\varSigma \).

Example 3

(Fig. 2 contd.). The minimal projection module of \(\mathcal {T} _2\) under \(\rho =\{\mathcal {T} _1, \varSigma , \mathcal {T} _2\}\) is \(\{\{\beta _1, \beta _3\},\{\beta _1, \beta _4\}\}\), for any query type \(Q\in \{c,i,q\}\). The minimal Q-query module of \(\mathcal {T} _2\) w.r.t. \(\varSigma \) is \(\{\{\beta _1, \beta _2, \beta _3\},\{\beta _1,\beta _2, \beta _4\}\}\). The minimal Q-query module of \(\mathcal {T} _1\) is \(\{\alpha _1\}\).

One solution for importing \(\varSigma \)-knowledge of a reference ontology to a task ontology is to import a minimal Q-query module of the reference ontology. However, one can see that if we include \(\alpha _1\) to \(\mathcal {T} _2\), then \(\alpha _1\) repeats the \(\varSigma \)-knowledge that is already represented by \(\beta _1\). Besides, the resulting ontology would not be a terminology anymore.

Consider a special projection setting of the form \(\langle \mathcal {T}, \varSigma , \mathcal {T} \rangle \), where the reference ontology \(\mathcal {T} \) is also the implementing ontology. We denote such reflexive projection settings with \(\rho ^\circlearrowleft \). A projection module \(\mathcal {M} \) of \(\mathcal {T} \) under \(\rho ^\circlearrowleft \) for subsumption (resp. instance, conjunctive) queries is a subset of \(\mathcal {T} \) that preserves the answers to \(\varSigma \)-concept subsumption (resp. instance, conjunctive) queries as given by \(\mathcal {T} \). It can readily be verified that a minimal projection module under a reflexive projection setting coincides with a minimal module for the type of queries considered.

5.2 Ontology Comparison Measure

In existent methods for measuring the entailment capacity of a terminology about a signature \(\varSigma \) for a query language, one can use logical difference. However, the following example shows that using logical difference can be not sufficient in some case.

Example 4

Let , , , . Let \(\mathcal {T} _1=\{\alpha _1\}\), \(\mathcal {T} _2=\{\alpha _1,\alpha _2\}\), \(\mathcal {T} _3=\{\alpha _1,\alpha _2,\alpha _3,\alpha _4\}\) and \(\varSigma =\{A, B_1, B_2, B_3, B_4\}\). We have that \(\mathsf{Wtn}_{\varSigma }^\mathcal {L} (\mathcal {T} _1,\mathcal {T} _2)=\mathsf{Wtn}_{\varSigma }^\mathcal {L} (\mathcal {T} _1,\mathcal {T} _3)=(\emptyset ,\emptyset ,\emptyset )\), for \(\mathcal {L} \in \{\mathcal{ELH}^{r}, \mathcal{EL}^{\mathsf{ran}}, \mathcal{EL}^{\mathsf{ran},\sqcap ,u} \}\).

In Example 4, the notion of logical difference cannot be used to distinguish between \(\mathcal {T} _2\) and \(\mathcal {T} _3\) w.r.t. \(\varSigma \) as \(\mathcal {T} _2\) and \(\mathcal {T} _3\) preserve the \(\varSigma \)-knowledge w.r.t. \(\mathcal {T} _1\). However, intuitively, \(\mathcal {T} _2\) and \(\mathcal {T} _3\) each contain more information about the \(\varSigma \)-concept names \(B_1, B_2\) and \(B_4\) than \(\mathcal {T} _1\). We therefore propose a new measure based on the notions of minimal projection module and query module for different query languages.

Definition 8

(Projection Rate). Let \(Q\in \{c,i,q\}\) and let \(\mathcal {M} _\rho ^Q\) range over minimal projection modules under \(\rho =\langle \mathcal {T} _1, \varSigma , \mathcal {T} _2 \rangle \) and the query type Q. Additionally, let \(M_\varSigma ^Q\) range over minimal modules of \(\mathcal {T} _2\) for the query type Q. The projection rate \(P^Q\) of \(\mathcal {T} _1\) over \(\mathcal {T} _2\) is defined as:

$$ P^Q=\frac{\mid \bigcup \mathcal {M} _\rho ^Q \mid }{\mid \bigcup M_\varSigma ^Q \mid } \qquad $$

Note that \(p \le 1\) always holds by Lemma 3. Intuitively, the lower the projection rate, the more \(\varSigma \)-knowledge is contained in \(\mathcal {T} _2\) compared with \(\mathcal {T} _1\).

Example 5

(Example 4 contd.). Considering when \(Q=c\), we have that \(\mathcal {M} _\rho ^c\)=\(\mathcal {M} _{\rho '}^c\)=\(\{\alpha _1\}\) under \(\rho =\langle \mathcal {T} _1, \varSigma , \mathcal {T} _2 \rangle \) and \(\rho '=\langle \mathcal {T} _1, \varSigma , \mathcal {T} _3 \rangle \). The minimal subsumption query module of \(\mathcal {T} _2\) w.r.t. \(\varSigma \) is \(\{\alpha _1,\alpha _2\}\). But there exists two minimal subsumption query module of \(\mathcal {T} _3\) w.r.t. \(\varSigma \), which are \(\{\alpha _1,\alpha _2,\alpha _3\}\) and \(\{\alpha _1,\alpha _2,\alpha _4\}\). So the union of minimal subsumption query module of \(\mathcal {T} _3\) w.r.t. \(\varSigma \) is \(\mathcal {T} _3\) that contains four axioms. Therefore, the projection rate \(P^c\) of \(\mathcal {T} _1\) over \(\mathcal {T} _2\) is \(P^c=1/2\) and the projection rate \(P^c\) of \(\mathcal {T} _1\) over \(\mathcal {T} _3\) is \(P^c=1/4\). So \(\mathcal {T} _3\) contains more \(\varSigma \)-knowledge compared with \(\mathcal {T} _2\) as \(1/4 < 1/2\).

6 Conclusion

We proposed a novel module notion called projection module that entails the queries that follow from a reference ontology. We presented an algorithm for computing all minimal projection modules of acyclic \(\mathcal{ELH}^{r}\)-terminologies and two applications of minimal projection modules. We expect that the algorithms can be extended to deal with cyclic terminologies and possibly general \(\mathcal{ELH}^{r}\)-TBoxes, and to yield a ranking between different projection modules, e.g., via weighted signatures.