1 Introduction

Ontologies provide structured representations of domain knowledge that are suitable for AI reasoning. They are used in various domains, including medicine, biology, and finance. In the domain of ontologies, one of the interesting topics is to provide explanations of reasoning conclusions. To this end, justifications have been proposed to offer users a brief explanation for a given conclusion. Computing justifications has been widely explored for different tasks, for instance for debugging ontologies [1, 9, 11] and computing ontology modules [6]. Extracting just one justification can be easy for tractable ontologies, such as \(\mathcal{EL}\mathcal{}^+\) [17]. For instance, we can find one justification by deleting unnecessary axioms one by one. However, there may exist more than one justification for a given conclusion. Computing all such justifications is computationally complex and reveals itself to be a challenging problem [18].

There are mainly two different approaches [17] to compute all justifications for a given conclusion, the black-box approach and the glass-box approach. The black-box approach [11] relies only on a reasoner and, as such, can be used for ontologies in any existing Description Logics. For example, a simple (naive) black-box approach would check all the subsets of the ontology using an existing reasoner and then filter the subset-minimal ones (i.e., justifications). Many advanced and optimized black-box algorithms have been proposed since 2007 [10]. Meanwhile, the glass-box approaches have achieved better performances over certain specific ontology languages (such as \(\mathcal{EL}\mathcal{}^+\)-ontology) by going deep into the reasoning process. Among them, the class of SAT-based methods [1,2,3, 14, 16] performs the best. The main idea developed by SAT-based methods is to trace, in a first step, a complete set of inferences (complete set for short) that contribute to the derivation of a given conclusion, and then, in a second step, to use SAT-tools or resolution to extract all justifications from these inferences. A detailed example is provided in Sect. 4.1.

In the real world, ontologies are always huge. For instance, the SnomedCT ontology contains more than 300,000 axioms. Thus, the traced complete set can be large, which could make it challenging to extract the justifications over them. Several techniques could be applied to reduce the size of the traced complete set, like the locality-based modules [8] and the goal-directed tracing algorithm [12]. One of their shared ideas is to identify, for a given conclusion, a particular part of the ontology relevant for the extraction of justifications. For example, the state-of-the-art algorithm, PULi [14], uses a goal-directed tracing algorithm. However, even for PULi, a simple ontology \(\mathcal {O} = \{A_i \sqsubseteq A_{i+1} \mid 1 \le i \le n - 1\}\) with the conclusion \(A_0 \sqsubseteq A_n\) leads to a complete set containing \(n - 1\) inferences. This set can not be reduced further even with the previously mentioned optimizations. From this observation, we decided to explore a new SAT-based glass-box method to handle such situations better.

Now, let us look carefully at the ontology \(\mathcal {O}\) above, and let us regard each \(A_i\) as a graph node \(N_{A_i}\). Then we are able to construct, for \(\mathcal {O}\), a directed graph whose edges are of the form \(N_{A_i} \rightarrow N_{A_{i+1}}\). It turns out that all the justifications for the conclusion \(A_0 \sqsubseteq A_n\) are extracted from all the paths from \(N_{A_0}\) to \(N_{A_n}\), and here we have only one such path. We can easily extend this idea on \(\mathcal{EL}\mathcal{}^+\)-ontology because most of the \(\mathcal{EL}\mathcal{}^+\)-axioms can be interpreted as direct edges except one case (i.e., \(A\equiv B_1\sqcap \cdots \sqcap B_n\)), for which we need a hyperedge (for more details see Definition 3). However, for more expressive ontologies, this translation becomes more complicated. For example, it is hard to map \(\mathcal {ALC}\)-axioms to edges as those axioms may contain negation or disjunction of concepts.

This example inspired us to explore a hypergraph representation of the ontology and reformulate inferences and justifications. Roughly, our inferences are built from elementary paths of the hypergraph and lead to particular paths called H-paths. Then, computing all the justifications for a given conclusion is made using such H-paths. For the previous ontology \(\mathcal {O}\) and the conclusion \(A_0 \sqsubseteq A_n\), our complete set is reduced to only two inferences (no matter the value of n) corresponding to the unique path from \(N_{A_0}\) to \(N_{A_n}\). The source of improvement provided by our method is twofold. On the one hand, it comes from the fact that elementary paths are pre-computed while extracting the inferences and that existing algorithms like depth-first search can efficiently compute such paths. On the other hand, yet as a consequence, decreasing the size of the complete sets of inferences leads to smaller inputs for the SAT-based algorithm extracting justifications from the complete set (recall here that our method is a SAT-based glass-box method).

The paper is organized as follows. Section 2 introduces preliminary definitions and notions. In Sect. 3, we associate a hypergraph representation to \(\mathcal{EL}\mathcal{}^+\)-ontology and introduce a new set of inference rules, called H-rules, that generate our inferences. In Sect. 4, we develop the algorithm minH, which compute justifications based on our inferences. Section 5 shows experimental results and Sect. 6 summarizes our work.

2 Preliminaries

2.1 \(\mathcal{EL}\mathcal{}^+\)-Ontology

Given sets of atomic concepts \(\textsf {N}_C = \{A,B,\cdots \}\) and atomic roles \(\textsf {N}_R = \{r,s,t, \cdots \}\), the set of \(\mathcal{EL}\mathcal{}^+\)concepts C and axioms \(\alpha \) are built by the following grammar rules:

$$C\,{:}{:}\!\!= \top ~| ~A~ |~ C\sqcap C~ | ~\exists r. C, \ \ a \,{:}{:}\!\!= C\sqsubseteq C~|~C\equiv C~|~r\sqsubseteq s~|~r_1\circ \cdots \circ r_n\sqsubseteq s.$$

A \(\mathcal{EL}\mathcal{}^+\)-ontology \(\mathcal {O}\) is a finite set of \(\mathcal{EL}\mathcal{}^+\)-axioms. An interpretation \(\mathcal {I} = (\triangle ^\mathcal {I},\cdot ^\mathcal {I})\) of \(\mathcal {O}\) consists of a non-empty set \(\triangle ^\mathcal {I}\) and a mapping from atomic concepts \(A \in \textsf {N}_C\) to a subset \(A^\mathcal {I} \subseteq \triangle ^\mathcal {I}\) and from roles \(r \in \textsf {N}_R\) to a subset \(r^\mathcal {I} \subseteq \triangle ^\mathcal {I} \times \triangle ^\mathcal {I}\). For a concept C built from the grammar rules, we define \(C^\mathcal {I}\) inductively by: \((\top )^\mathcal {I} = \triangle ^\mathcal {I}, ( C\sqcap D)^\mathcal {I} = C^\mathcal {I}\cap D^\mathcal {I}\), \( (\exists r. C)^\mathcal {I} = \{a \in \triangle ^\mathcal {I} \mid \exists b, (a,b) \in r^\mathcal {I},b \in C^\mathcal {I}\}\), \((r \circ s)^\mathcal {I} = \{(a,b) \in \triangle ^\mathcal {I} \times \triangle ^\mathcal {I} \mid \exists c, (a,c) \in r^\mathcal {I},(c,b) \in s^\mathcal {I}\}\). An interpretation is a model of \(\mathcal {O}\) if it is compatible with all axioms in \(\mathcal {O}\), i.e., for all \(C \sqsubseteq D,C \equiv D,r \sqsubseteq s,r_1 \circ \cdots \circ r_n \sqsubseteq s\in \mathcal {O}\), we have \(C^\mathcal {I} \subseteq D^\mathcal {I},C^\mathcal {I}= D^\mathcal {I},r^\mathcal {I} \subseteq s^\mathcal {I},(r_1 \circ \cdots \circ r_n)^\mathcal {I} \subseteq s^\mathcal {I}\), respectively. We say \(\mathcal {O} \models a\) where \(\alpha \) is an axiom iff each model of \(\mathcal {O}\) is compatible with \(\alpha \). A concept A is subsumed by B w.r.t. \(\mathcal {O}\) if \(\mathcal {O} \models A \sqsubseteq B\).

Next, we use \(A,B,\cdots ,G\) (possibly with subscripts) to denote atomic concepts and we use XYZ (possibly with subscripts) to denote atomic concepts \(A,\cdots ,G\), or complex concepts \(\exists r.A\), \(\cdots \), \(\exists r.G\).

We assume that ontologies are normalized. A \(\mathcal{EL}\mathcal{}^+\)-ontology \(\mathcal {O}\) is normalized if all its axioms are of the form \(A \equiv B_1 \sqcap \cdots \sqcap B_m, A \sqsubseteq B_1 \sqcap \cdots \sqcap B_m, A \equiv \exists r. B, A \sqsubseteq \exists r. B, r \sqsubseteq s\), or \(r \circ s \sqsubseteq t\), where \(A, B, B_i \in \textsf {N}_C\), and \( r,s,t \in \textsf {N}_R\). Every \(\mathcal{EL}\mathcal{}^+\)-ontology can be normalised in polynomial time by introducing new atomic concepts and atomic roles.

Example 1

The following set of axioms is a \(\mathcal{EL}\mathcal{}^+\)-ontology:

\(\mathcal {O}=\{\ a_1{:} A \sqsubseteq D,\ a_2{:} D \sqsubseteq \exists r. E,\ a_3{:} E \sqsubseteq F, a_4{:} B \equiv \exists t.F,\ a_5{:} r \sqsubseteq t,\ a_6{:} G \equiv C\sqcap B\ , a_7{:} C \sqsubseteq A\}.\)

It is clear that \(\mathcal {O} \models A \sqsubseteq \exists r. E\) as for all models \(\mathcal {I}\), we have \(A^\mathcal {I} \subseteq D^\mathcal {I}\) by the axiom \(a_1\) and \(D^\mathcal {I} \subseteq (\exists r. E)^\mathcal {I}\) by \(a_2\).

2.2 Inference, Support and Justification

Given a \(\mathcal{EL}\mathcal{}^+\)-ontology \(\mathcal {O}\), a major reasoning task over \(\mathcal {O}\) is classification, which aims at finding all subsumptions \(\mathcal {O} \models A \sqsubseteq B\) for atomic concepts AB occurring in \(\mathcal {O}\). Generally, it can be solved by applying inferences recursively over \(\mathcal {O}\) [5].

An inference \(\rho \) is a pair \({\langle } \rho _{pre}, \rho _{con}{\rangle }\) whose premise set \(\rho _{pre}\) consists of \(\mathcal{EL}\mathcal{}^+\)-axioms and conclusion \(\rho _{con}\) is a single \(\mathcal{EL}\mathcal{}^+\)-axiom. As usual, a sequence of inferences \(\rho ^1,\cdots , \rho ^n\) is a derivation of an axiom \(\alpha \) from \(\mathcal {O}\) if \(\rho ^n_{con} = \alpha \) and for any \(\beta \in \rho ^i_{pre}, 1\le i\le n \), we have \(\beta \in \mathcal {O}\) or \(\beta = \rho ^j_{con}\) for some \(j<i\).

As usual, inference rules are used to generate inferences. For instance, Table 1 [1, 5] shows a set of inference rules for \(\mathcal{EL}\mathcal{}^+\)-ontologies. Next, we use \(\mathcal {O} \vdash A \sqsubseteq B\) to denote that \(A \sqsubseteq B\) is derivable from \(\mathcal {O}\) using inferences generated by the rules in Table 1. The set of inference rules in Table 1 is sound and complete for classification [5], i.e., \(\mathcal {O} \models {A \sqsubseteq B}\) iff \(\mathcal {O} \vdash {A \sqsubseteq B}\) for any \(A, B\in \textsf {N}_C\).

Table 1. Inference rules over \(\mathcal{EL}\mathcal{}^+\)-ontology.

A support of \(A \sqsubseteq B\) over \(\mathcal {O}\) is a sub-ontology \(\mathcal {O}' \subseteq \mathcal {O}\) such that \(\mathcal {O}' \models {A \sqsubseteq B}\). The justifications for \(A \sqsubseteq B\) are subset-minimal supports of \(A \sqsubseteq B\). We denote the collection of all justifications for \(A \sqsubseteq B\) w.r.t. \(\mathcal {O}\) by \(J_{\mathcal {O}}(A \sqsubseteq B)\).

We say S is a complete set (of inferences) for \(A \sqsubseteq B\) if for any justifications \(\mathcal {O}'\) of \(A \sqsubseteq B\), we can derive \(A \sqsubseteq B\) from \(\mathcal {O}'\) using only the inferences in S.

Example 2

(Example 1 cont’d). Before applying inference rules, axioms in \(\mathcal {O}\) are preprocessed in order to be compatible with Table 1. For example, \(a_4\) is replaced by \(B \sqsubseteq \exists t. F\) and \( \exists t. F \sqsubseteq B\). Then, according to the inference rules of Table 1, we may produce the following inferences: \(\rho = {\langle }\{A \sqsubseteq D, D \sqsubseteq \exists r. E\}, A \sqsubseteq \exists r. E{\rangle }\), \(\rho ' = {\langle }\{A \sqsubseteq \exists r. E, r \sqsubseteq t\}, A \sqsubseteq \exists t. E{\rangle }\) and \(\rho '' = {\langle }\{A \sqsubseteq \exists t. E, E \sqsubseteq F, \exists t. F \sqsubseteq B\}, A \sqsubseteq B{\rangle }\) generated by rule \(\mathcal {R}_2\), \(\mathcal {R}_4\) and \(\mathcal {R}_3\) respectively. Then \(\mathcal {O} \vdash A \sqsubseteq B\) since \(A \sqsubseteq B\) is derivable from \(\mathcal {O}\) by the sequence \(\rho , \rho ', \rho ''\).

Notice that \(\mathcal {O}'=\{a_1,a_2,a_3,a_4, a_5\}\) is a support for \(A \sqsubseteq B\), and thus, any superset \(\mathcal {O}''\) of \(\mathcal {O}'\) is a support of \(A \sqsubseteq B\). \(\mathcal {O}'\) is also one of the justifications for \(A \sqsubseteq B\) as for any \(\mathcal {O}''' \subset \mathcal {O}' \), we have \(\mathcal {O}''' \not \models A \sqsubseteq B\). Moreover, here the three inferences \(\rho , \rho ', \rho ''\) provide a complete set for \(A \sqsubseteq B\).

3 Hypergraph-Based Inference Rules

3.1 H-Inferences

In general, a (directed) hypergraph \(\mathcal {G}=(\mathcal {V},\mathcal {E})\) is defined by a set of nodes \(\mathcal {V}\) and a set of hyperedges \(\mathcal {E}\) [4, 7]. A hyperedge is of the form \(e = (S_1,S_2), S_1,S_2 \subseteq \mathcal {V}\). In this paper, a hypergraph is associated to an ontology as follows:

Definition 3

For a given \(\mathcal{EL}\mathcal{}^+\)-ontology \(\mathcal {O}\), the associated hypergraph is \(\mathcal {G}_\mathcal {O} =(\mathcal {V}_\mathcal {O},\mathcal {E}_\mathcal {O})\) where (i) the set of nodes \(\mathcal {V}_\mathcal {O} = \{N_A, N_r, N_{\exists r.A} \mid A\in \textsf {N}_C, r\in \textsf {N}_R\}\) and (ii) the set of edges \(\mathcal {E}_{\mathcal {O}}\) is defined by \(f(\mathcal {O})\) where f is the multi-valued mapping shown in Fig. 1. Given a hyperedge e of \(\mathcal {E}_{\mathcal {O}}\), the inverse image of e, \(f^{-1}(e)\), is defined in the obvious manner. For a set E of hyperedges, \(f^{-1}(E)=\cup _{e\in E} f^{-1}(e).\)

Fig. 1.
figure 1

Definition of f (left) and graphical illustrations of \(f(\alpha )\) (right)

Notice that, the hyperedges associated with \(A \equiv B_1 \sqcap \cdots \sqcap B_m \) are (i) the hyperedge \((\{N_{B_1}, \cdots , N_{B_m}\}, \{N_A\})\) and (2) of course, the edges corresponding to \(A \sqsubseteq B_1 \sqcap \cdots \sqcap B_m\).

Example 4

(Example 1 cont’d). The hypergraph \(\mathcal {G}_\mathcal {O}\) for \(\mathcal {O}\) is shown in Fig. 2, where \(e_0 =(\{N_C\},\{N_A\})\), \(e_1 =(\{N_A\},\{N_D\})\), \(e_2 =(\{N_D\}, \{N_{\exists r. E}\})\), etc. Also, \(f^{-1}(e_0)= C \sqsubseteq A,\ f^{-1}(e_1) = A \sqsubseteq D\), and \(f^{-1}(e_2) = D \sqsubseteq \exists r. E\), etc.

Fig. 2.
figure 2

The hypergraph associated with the ontology \(\mathcal {O}\).

As for graphs, a path (next called regular path) from nodes \(N_1\) to \(N_2\) in a hypergraph is a sequence of edges:

$$\begin{aligned} e_0 = (S_1^0,S_2^0), e_1 = (S_1^1,S_2^1), \cdots , e_n = (S_1^n,S_2^n) \end{aligned}$$
(1)

where \(N_1 \in S_1^0,N_2 \in S_2^n\) and \(S_2^{i-1}\,{=}\,S_1^i, 1 \le i \le n.\) Next, the existence of a regular path from \(N_X\) to \(N_Y\) in a hypergraph \(\mathcal {G}_\mathcal {O}\) is denoted \( N_X \leadsto N_Y\). Now, we introduce hypergraph-based inferences which are based on the existence of regular paths as follows:

Definition 5

Given a hypergraph \(\mathcal {G}_\mathcal {O}\), Table 2 gives a set of inference rules called H-rules. Inferences based on H-rules are called H-inferences. Next, we denote by \(\mathcal {O}\vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_Y\) (or simply \(N_X {\mathop {\leadsto }\limits ^{h}}N_Y\)) the fact that \(N_X {\mathop {\leadsto }\limits ^{h}}N_Y\) can be derived from \(\mathcal {G}_\mathcal {O}\) using the H-inferences.

Table 2. H-rules over \(\mathcal {G}_\mathcal {O} = (\mathcal {V}_\mathcal {O},\mathcal {E}_\mathcal {O})\).

Example 6

(Example 4 cont’d). As shown in Fig. 2, we have \(N_A \leadsto N_{\exists r.E}\), \(N_E\leadsto N_F\), \(N_{\exists r.F} \leadsto N_B\) from the existence of regular paths. Then we can derive \(N_A {\mathop {\leadsto }\limits ^{h}}N_{B}\) from \(\mathcal {G}_\mathcal {O}\) by the H-rules \(\mathcal {H}_0\), \(\mathcal {H}_0\) and \(\mathcal {H}_2\) which generate the H-inferences \(\rho ^1, \rho ^2, \rho ^3\), where \(\rho ^1 = {\langle }\{N_A \leadsto N_{\exists r.E}\}, N_A {\mathop {\leadsto }\limits ^{h}}N_{\exists r.E}{\rangle }\), \(\rho ^2 = {\langle }\{N_E\leadsto N_F\}, N_E {\mathop {\leadsto }\limits ^{h}}N_F{\rangle }\) and \(\rho ^3 = {\langle }\{N_A {\mathop {\leadsto }\limits ^{h}}N_{\exists r.E}, N_E {\mathop {\leadsto }\limits ^{h}}N_F, N_{\exists r.F} \leadsto N_B \}, N_A{\mathop {\leadsto }\limits ^{h}}N_B{\rangle }\), respectively.

Note that the first rule \(\mathcal {H}_0\), the initialization rule, makes regular paths the elementary components of H-rules. Moreover, Proposition 7 formally states that, in our H-inference system, we do not need to add the transitive inference rule:

$$\frac{N_X {\mathop {\leadsto }\limits ^{h}}N_Z,N_Z {\mathop {\leadsto }\limits ^{h}}N_Y}{N_X {\mathop {\leadsto }\limits ^{h}}N_Y}.$$

Proposition 7

If \(\mathcal {O} \vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_Z\) and \(\mathcal {O} \vdash _h N_Z {\mathop {\leadsto }\limits ^{h}}N_Y\) then \(\mathcal {O} \vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_Y\).

3.2 Completeness and Soundness of H-Inferences

The following result is the main result of this section. It states the equivalence of \(N_X {\mathop {\leadsto }\limits ^{h}}N_Y\) derivation (by Table 2) and ontology entailment for \(X \sqsubseteq Y\), and thus states that our H-rules are sound and complete for \(\mathcal{EL}\mathcal{}^+\)-ontology.

Theorem 8

If \(\mathcal {O}\) is an \(\mathcal{EL}\mathcal{}^+\)-ontology, then \(\mathcal {O}\models X\sqsubseteq Y \text { iff } \mathcal {O} \vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_Y\), where XY are concepts of either form A or \(\exists r. B\).

Proof

\(\Leftarrow \)” is obvious by induction over Table 2 and the fact that \(N_X \leadsto N_Y\) implies \(\mathcal {O} \models X\sqsubseteq Y\), so we only need to prove the direction “\(\Rightarrow \)”.

Assume that \(\mathcal {O} \models X\sqsubseteq Y\). We consider two cases:

Case 1. We assume \(\mathcal {O} \vdash X \sqsubseteq Y\)Footnote 1. Let d(XY) be the length of one shortest derivation of \(X \sqsubseteq Y\) from \(\mathcal {O}\) using Table 1. We prove “\(\Rightarrow \)” by induction on d(XY).

  • Assume \(d(X,Y)=0\). In this case \(\mathcal {O}\) must contain axioms of the form \(X \equiv Y\sqcap \cdots \text { or } X \sqsubseteq Y\sqcap \cdots \). Clearly we have \(N_X \leadsto N_{Y}\) thus \(\mathcal {O} \vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_{Y}\).

  • Assuming\(\Rightarrow \)holds when \(d(X,Y) < k\), let us prove\(\Rightarrow \)holds for \(d(X,Y)=k\). Suppose \(\rho ^{last}\) is the last inference in one shortest derivation of \(X \sqsubseteq Y\) using Table 1. Two cases arise:

    1. 1.

      Assume \(\rho ^{last}\) is generated by \(\mathcal {R}_1 (n>1),\mathcal {R}_3 \text { or }\mathcal {R}_4(n=2)\). For example, assume \(\rho ^{last} = {\langle }\{X \sqsubseteq \exists r.B_1, B_1 \sqsubseteq B_2,\exists r.B_2 \sqsubseteq Y\}, X \sqsubseteq Y{\rangle }\) comes from \(\mathcal {R}_3\). We have \(d(X,\exists r. B_1),d(B_1,B_2),d(\exists r.B_2,Y)<k\) because their corresponding subsumptions can be derived without \(\rho ^{last}\). By the assumption \(\mathcal {O}\vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_{\exists r.B_1}, N_{B_1} {\mathop {\leadsto }\limits ^{h}}N_{B_2},N_{\exists r.B_2} {\mathop {\leadsto }\limits ^{h}}N_Y.\) Then we have \(\mathcal {O} \vdash _h N_X{\mathop {\leadsto }\limits ^{h}}N_{\exists r.B_2}\) by first deriving \(N_X {\mathop {\leadsto }\limits ^{h}}N_{\exists r.B_1}, N_{B_1} {\mathop {\leadsto }\limits ^{h}}N_{B_2}\), and then applying H-inference:

      $$\rho ^{new} = {\langle }\{N_X {\mathop {\leadsto }\limits ^{h}}N_{\exists r.B_1}, N_{B_1}{{\mathop {\leadsto }\limits ^{h}}} N_{B_2}, N_{\exists r.B_2} \leadsto N_{\exists r.B_2}\}, N_X {\mathop {\leadsto }\limits ^{h}}N_{\exists r.B_2}{\rangle }.$$

      Then \(\mathcal {O}\vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_Y\) by Proposition 7 since \(\mathcal {O}\vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_{\exists r.B_2}, N_{\exists r.B_2}{\mathop {\leadsto }\limits ^{h}}N_B\). The argument also holds for \(\mathcal {R}_1(n>1)\)(or \(\mathcal {R}_4(n=2)\)) by applying \(\mathcal {H}_1\) (or \(\mathcal {H}_3\)) instead of \(\mathcal {H}_2\).

    2. 2.

      Assume \(\rho ^{last}\) is generated by \(\mathcal {R}_1 (n=1),\mathcal {R}_2 \text { or }\mathcal {R}_4(n=1)\). Then, in each case, we have \(\rho ^{last}\) has the form \({\langle }\{X \sqsubseteq Z, Z \sqsubseteq Y\}, X \sqsubseteq Y{\rangle }.\) As in case 1, we have \(d(X,Z),d(Z,Y) < k\). By the assumption, \(\mathcal {O}\vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_Z, N_Z {\mathop {\leadsto }\limits ^{h}}N_Y\), then \(\mathcal {O}\vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_Y\) by Proposition 7.

Case 2. If \(\mathcal {O} \vdash X \sqsubseteq Y\) does not hold, then X or Y is not atomic. In this case, we introduce new axioms \(A\equiv X\), \(B\equiv Y\) with new atomic concepts AB and denote the extended ontology by \(\mathcal {O}'\). Clearly, \(\mathcal {O}' \models A \sqsubseteq B\) and thus \(\mathcal {O}' \vdash A \sqsubseteq B\) since Table 1 is sound and complete. Therefore, we have \(\mathcal {O}'\vdash _h N_A {\mathop {\leadsto }\limits ^{h}}N_B\) by the same arguments as above. Now, notice that \(\mathcal {G}_{\mathcal {O}'}\) is obtained from \(\mathcal {G}_{\mathcal {O}}\) by adding 4 edges: \((\{N_A\}, \{N_X\}), (\{N_X\}, \{N_A\}), (\{N_B\}, \{N_Y\})\) and \((\{N_Y\}, \{N_B\})\), thus we have \(\mathcal {O}'\vdash _h N_A{{\mathop {\leadsto }\limits ^{h}}} N_B\) iff \(\mathcal {O}\vdash _h N_X {\mathop {\leadsto }\limits ^{h}}N_Y\).

3.3 Extracting Justifications from \(\mathcal {G}_\mathcal {O}\)

Now, we formally define H-paths as a hypergraph representation of classical derivations based on H-rules. The reader should pay attention to the fact that H-paths are not classical hyperpaths [7]. Next, for the sake of homogeneity, we consider a regular path from \(N_X\) to \(N_Y\) as the set of its edges and denote it as \(P_{X, Y}\).

Definition 9

(H-paths). In the hypergraph \(\mathcal {G}_\mathcal {O}\), an H-path \(H_{X, Y}\) from \(N_X\) to \(N_Y\) is a set of edges recursively generated by the following composition rules:

  1. 0.

    A regular path \(P_{X, Y}\) is an H-path from \(N_X\) to \(N_Y\);

  2. 1.

    If \(e=(\{N_{B_1}, \cdots ,N_{B_m}\},\{N_{A}\})\in \mathcal {V}_\mathcal {O}\), if \(H_{X, B_i}\) are H-paths for \(i=1..m\), and if \(P_{A,Y}\) is a regular path, then \(H_{X, B_1} \cup \cdots \cup H_{X, B_m}\cup P_{A, Y} \cup \{e\}\) is an H-path from \(N_X\) to \(N_Y\);

  3. 2.

    If \(H_{X,\exists r.B_1}, H_{B_1,B_2}\) are H-paths and \(P_{\exists r. B_2, Y}\) is a regular path, then \(H_{X,\exists r.B_1}\cup H_{B_1,B_2}\cup P_{\exists r. B_2,Y}\) is an H-path from \(N_X\) to \(N_Y\);

  4. 3.

    If \(e = (\{N_{r},N_s\}, \{N_s,N_t\}) \in \mathcal {V}_\mathcal {O}\), if \(H_{X,\exists r.A_1}, H_{A_1, \exists s.A_2}\) are H-paths and if \(P_{\exists t.A_2, B}\) is a regular path, then \(H_{X,\exists r.A_1} \cup H_{A_1, \exists s.A_2}\cup P_{\exists t.A_2, B}\cup \{e\}\) is an H-path from \(N_X\) to \(N_Y\).

Fig. 3.
figure 3

Structure of H-paths from \(N_X\) to \(N_Y\)

Figure 3 gives an illustration of H-paths: the blue arrows \(\leadsto \) correspond to regular paths, and the red ones \({\mathop {\leadsto }\limits ^{h}}\) to H-paths. It is straightforward to compare composition rules building H-paths with H-rules building derivations in Table 2. One may also consider H-paths as deviation-trees with leaves corresponding to the edges in \(\mathcal {G}_\mathcal {O}\). However, our approach provides a more direct characterization of justifications as shown in Theorem 10.

We say that an H-path \(H_{X,Y}\) is minimal if there is no H-path \(H_{X,Y}'\) such that \(H_{X,Y}' \subset H_{X,Y}\).

Now, we are ready to explain how H-paths and justifications are related. We can compute justifications from minimal H-paths as stated below:

Theorem 10

Given XY of either form A or \(\exists r. B\). Let

$$\mathcal {S} = \{f^{-1}(H_{X,Y}) \mid H_{X,Y} \textit{ is a minimal H-path from } N_X \ \textit{to}\ N_Y \}.$$

Then \(\mathcal {J}_\mathcal {O}(X \sqsubseteq Y) = \{s \in \mathcal {S}\mid s'\not \subset s, \forall s'\in \mathcal {S}\}\). That is, all justifications for \(X \sqsubseteq Y\) are the minimal subsets in \(\mathcal {S}\).

Proof

For any justification \(\mathcal {O}'\) of \(X \sqsubseteq Y\), there exists a minimal H-path \(H_{X,Y}\) such that \(\mathcal {O}' = f^{-1}(H_{X,Y})\). The reason is that, since \(\mathcal {O}' \models X \sqsubseteq Y\), there exists an H-path \(H_{X,Y}\) from \(N_X \) to \(N_Y\) on \(\mathcal {G}_{\mathcal {O}'}\) by Theorem 8. Without loss of generality, we can assume \(H_{X,Y}\) is minimal on \(\mathcal {G}_{\mathcal {O}'}\), then it is also minimal on \(\mathcal {G}_{\mathcal {O}}\) since \(\mathcal {G}_{\mathcal {O}'}\) is a sub-graph of \(\mathcal {G}_{\mathcal {O}}\). We have \(\mathcal {O}' = f^{-1}(H_{X,Y})\) because otherwise there exists \(\mathcal {O}'' \subsetneq \mathcal {O}'\) such that \(\mathcal {O}'' = f^{-1}(H_{X,Y})\), and thus \(\mathcal {O}'' \models X \sqsubseteq Y\) by Theorem 8 again. Therefore, \(\mathcal {O}'\) is not a justification. Contradiction.

Now, we know \(\mathcal {S}\) contains all justifications for \(X\sqsubseteq Y\). Moreover, \(f^{-1}(H_{X,Y})\models X \sqsubseteq Y\) for any H-path \(H_{X,Y}\). Therefore, we have \(\mathcal {J}_\mathcal {O}(X \sqsubseteq Y) = \{s \in \mathcal {S}\mid s'\not \subset s, \forall s'\in \mathcal {S}\}\) by the definition of justifications.

Example 11

(Example 4 cont’d). The regular paths from \(N_A\) to \(N_{\exists r.E}\) and from \(N_E\) to \(N_F\) produce two H-paths \(H_{A,\exists r E} = \{e_1, e_2, e_3\}\) and \(H_{E, F} = \{e_4\}\). Then, applying the third composition rule with \(H_{A,\exists r E}, H_{E, F}\) and \(P_{\exists r.F, B} = \{e_6\}\), we get \(H_{A, B} = \{e_1,e_2, e_3, e_4, e_6\}\), which is the unique H-path from \(N_A\) to \(N_B\). Thus, by Theorem 10, we have \(\{\alpha _1, \alpha _2, \alpha _3, \alpha _4, \alpha _5\}\), the only justification for \(A \sqsubseteq B\).

4 Implementation: Computing Justifications

4.1 SAT-Based Method

In this section, we describe briefly how PULi [14], the state-of-the-art glass-box algorithm, proceeds. Given an ontology \(\mathcal {O}\), computing \(J_\mathcal {O}(X \sqsubseteq Y)\) is done through 2 steps: (1) tracing a complete set for \(X \sqsubseteq Y\), (2) using resolution to extract the justifications from the complete set. The following example illustrates both steps:

Example 12

(Example 1 cont’d). Let us compute \(J_\mathcal {O}(G \sqsubseteq D)\) using PULi’s method.

  1. 1.

    Using the goal-directed tracing algorithm in [12], the first step produces a complete set of inferencesFootnote 2 \(\{\rho _1,\rho _2\}\) for \(G \sqsubseteq D\), where \(\rho _1 = {\langle }\{G \sqsubseteq C, C \sqsubseteq A\}, G\sqsubseteq A{\rangle }, \rho _2 = {\langle }\{G \sqsubseteq A, A \sqsubseteq D \}, G \sqsubseteq D{\rangle }.\)

  2. 2.

    This step is again composed of two parts:

    1. (a)

      The first part proceeds to the translation of the inferences into clauses. Let us denote \(\overline{p}_1{:}G \sqsubseteq C\), \( \overline{p}_2{:} C \sqsubseteq A\), \(\overline{p}_3{:} A \sqsubseteq D\), \(p_4{:} G \sqsubseteq A\), \(p_5 {:} G \sqsubseteq D\). Here the literals \(\overline{p}_1, \overline{p}_2,\overline{p}_3\) (with a bar) are called answer literals as they correspond to the axioms \(a_6,a_7,a_1\) in \(\mathcal {O}\). Thus, we obtain \(\mathcal {C} = \{\lnot \overline{p}_1 \vee \lnot \overline{p}_2 \vee p_4, \lnot p_4 \vee \lnot \overline{p}_3 \vee p_5\}\) by rewriting the inferences \(\rho _1\), \(\rho _2\) as clauses.

    2. (b)

      Secondly, a new clause \(\lnot p_5\) is added to \(\mathcal {C}\), where \(p_5\) corresponds to the conclusion \(G \sqsubseteq D\), and resolution is applied over \(\mathcal {C}\). The set of all justifications \(J_\mathcal {O}(G \sqsubseteq D)\) is obtained by considering (i) the clauses formed of answer literals only and (ii) among them keeping the minimal onesFootnote 3. In this example, after the resolution phase, the only clause that consists of merely answer literals is \(\lnot \overline{p}_1 \vee \lnot \overline{p}_2 \vee \lnot \overline{p}_3\). Thus, the set of all justifications is \(J_\mathcal {O}(G \sqsubseteq D) = \{\{a_1,a_6,a_7\}\}\).

Our method for computing justifications follows the same steps as PULi although here the major difference is that the first step computes a complete set of H-inferences instead of a complete set of inferences wrt. Table 1.

4.2 Computing Justification by Minimal H-Paths

In this section, given an ontology \(\mathcal {O}\) and its associated hypergraph \(\mathcal {G}_\mathcal {O}\), we present minH (Algorithm 1) that computes all justifications for \(X_0 \sqsubseteq Y_0\) using the minimal H-paths from \(N_{X_0}\) to \(N_{Y_0}\) over \(\mathcal {G}_\mathcal {O}\). The algorithm minH proceeds in two steps described below.

figure a

Step 1. First, at Line 2, minH computes a complete set of inferences \(\mathcal {U}\) for \(N_{X_0} {\mathop {\leadsto }\limits ^{h}}N_{Y_0}\) using CompleteH (See Algorithm 2). Here, \(\mathcal {U}\) is complete in the sense that for any H-path \(H_{X, Y}\), we can derive \(N_{X} {\mathop {\leadsto }\limits ^{h}}N_{Y}\) using inferences in \(\mathcal {U}\) from the edge set \(H_{X, Y}\). CompleteH computes \(\mathcal {U}\) as follows:

  • Line 3–12 of Algorithm 2: The recursive application of trace_one_turn (See Algorithm 3) outputs the set of all H-inferences whose conclusion is the given input \(N_{X_1} {\mathop {\leadsto }\limits ^{h}}N_{Y_1}\);

  • Line 13–17 of Algorithm 2: Let path be the depth-first search algorithm that computes all regular paths from \(N_X\) to \(N_Y\) in \(\mathcal {G}_\mathcal {O}\) with input (\(N_X,N_Y\)). Intuitively, the purpose is to shift inferences from regular paths to edges.

figure b
figure c

Step 2. Then Algorithm minH computes all justifications for \(X_0 \sqsubseteq Y_0\) as follows:

  • Line 3 of Algorithm 1: It computes all minimal H-paths from \(N_{X_0}\) to \(N_{Y_0}\) using resolution, which is developed by PULiFootnote 4, over the clauses generated from \(\mathcal {U}\) as illustrated in Sect. 4.1. Here, a literal p is associated with each edge e, each \(N_X {\mathop {\leadsto }\limits ^{h}}N_{Y}\), and each \(N_X \leadsto N_{Y}\) in \(\mathcal {U}\). The answer literals are those associated with edges.

  • Line 4–8 of Algorithm 1: It computes justifications by mapping back all the minimal H-paths and select the subset-minimal sets as stated in Theorem 10.

Example 13

(Example 4 cont’d). Assume \(X_0 = G\) and \(Y_0 = D\) are the input of minH. Then at line 2 of minH, we have \(\mathcal {U} = \{\rho ^1, \rho ^2\}\), where \(\rho ^1 = {\langle }\{N_G \leadsto N_D\}, N_G{\mathop {\leadsto }\limits ^{h}}N_D{\rangle }\) is H-inference obtained by CompleteH (line 3–12) and \(\rho ^2 = {\langle }\{e_0, e_1, e_8\}, N_G\leadsto N_D{\rangle }\) is produced from regular paths obtained by CompleteH (line 13–17). Let us denote \(\overline{p}_0 {:} e_0,\ \overline{p}_1 {:} e_1,\ \overline{p}_2 {:} e_8\) as answer literals and \(p_3{:}N_G \leadsto N_D\), \(p_4{:} N_G {\mathop {\leadsto }\limits ^{h}}N_D\). Then clauses(\(\mathcal {U}\)) \(=\{\lnot p_3 \vee p_4,\ \lnot \overline{p}_0 \vee \lnot \overline{p}_1 \vee \lnot \overline{p}_2 \vee p_3\}\).

By resolution over clauses(\(\mathcal {U}\)), we obtain min_hpaths = \(\{\{e_0, e_1, e_8\}\}\) at line 3 of minH. Then the output of minH is J = \(\{\{a_1, a_6, a_7\}\}\), which is the set of all justifications for \(G \sqsubseteq D\).

4.3 Optimization

Below we present two optimizations that have been implemented in order to accelerate the computation of all justifications.

Fig. 4.
figure 4

Illustration of Optimization 1

  1. 1.

    In Algorithm 3, for the H-inference added at Line 5, we require that there exists at least one regular path from \(N_{A}\) to \(N_Y\) that does not contain an edge \(e_i=(\{N_{A}\}, \{N_{B_i}\})\) for some \(1\le i\le m\). Otherwise, as shown in Fig. 4, H-paths corresponding to this H-inference are not minimal, as they all contain one H-path from \(N_X\) to \(N_Y\) of the form \(H_{X, B_i} \cup (P_{A,Y} - \{e_i\})\). In the same spirit, we require that the H-path from \(N_X\) to \(N_{B_i}\) does not pass by \(N_{A}\).

  2. 2.

    If we have an H-path \(H_{A,B} = H_{A,\exists r.B_1} \cup H_{B_1,B_2} \cup P_{\exists r.B_2, B}\) where

    $$\begin{aligned} H_{A,\exists r.B_1} = H_{A,\exists r.C} \cup H_{C,B_1}. \end{aligned}$$
    (2)

    then \(H_{C,B_2}= H_{C,B_1} \cup H_{B_1,B_2}\) is also an H-path and \(H_{A, B} = H_{A,\exists r.C} \cup H_{C,B_2} \cup P_{\exists r.B_2, B}\). The two different ways to decompose \(H_{A,B}\) above are already considered in Line 8 when executing Algorithm 3 with the input \(N_A {\mathop {\leadsto }\limits ^{h}}N_B\). It means that the decomposition (2) is redundant. We can avoid such redundancy by requiring \(\exists r.B_2\ne Y\) at Line 11.

5 Experiments

To evaluate and validate our approach, we compare minHFootnote 5 with PULi [14], the state-of-the-art algorithm for computing justifications at this moment. Both methods compute all justifications based on resolution but with different inference rules generated in different ways. PULi uses a complete set (next denoted by elk) generated by the ELK reasoner [13], which uses inference rules slightly different from those in Table 1. Our method uses the complete set \(\mathcal {U}\) generated by Step 1 of minH, described in Sect. 4.2. To analyze the performance of our setting, we make the following two measures: (1) we compare the size of elk with that of \(\mathcal {U}\), (2) we compare the time cost of PULi with that of minH. All the experiments were conducted on a machine with an INTEL Xeon 2.6 GHz and 128 GiB of RAM.

The experiments were processed with four different ontologiesFootnote 6: go-plus, galen7, SnomedCT (version Jan. 2015 and Jan. 2021). All the non-\(\mathcal{EL}\mathcal{}^+\) axioms are deleted. Here, go-plus, galen7 are the same ontologies used in [14]. We denote the four ontologies above by go-plus, galen7, snt2015 and snt2021. The number of axioms, concepts, relations, and queries for each ontology are shown in Table 3.

Next a query refers to a direct subsumptionFootnote 7 \(A \sqsubseteq B\). In our experiments, for the four ontologies, the set of all justifications \(J_\mathcal {O}(A \sqsubseteq B)\) is computed for each query \(A \sqsubseteq B\). A query \(A \sqsubseteq B\) is called trivial iff all minimal H-paths from \(N_A\) to \(N_B\) are regular paths, otherwise, the query is non-trivial.

Table 3. Summary of sizes of the input ontologies.
Table 4. Summary of size of elk, \(\mathcal {U}\).

Comparing Complete Sets: \(\mathcal {U}\) vs. elk. We summarize our results in Table 4 and Fig. 5. Table 4 shows that on all four ontologies, \(\mathcal {U}\) is much smaller than elk on average. Especially on galen7, the difference between elk and \(\mathcal {U}\) is even up to 50 times. The gap is even more significant for the median value since a large part of the queries is trivial. However, the gap is much smaller for the maximal number. On snt2021, the largest \(\mathcal {U}\) in size is three times larger than that of elk.

In Fig. 5, for a given query, if the complete set elk contains fewer inference rules than \(\mathcal {U}\), the corresponding blue point is below the red line. The percentage of such cases are: 0.34% for go-plus, 0.066% for galen7, 0.79% for snt2015, and 1.01% for snt2021. It means that for most of the queries, the corresponding \(\mathcal {U}\) is smaller than elk.

As shown in Table 4 and in Fig. 5, sometimes minH generates bigger complete set \(\mathcal {U}\) than PULi. It may happen when, for example, there might be exponentially many different regular paths occurring in the computation process of minH. Therefore, minH could produce a huge complete set. Also, \(\mathcal {U}\) can be bigger than elk when all the regular paths involved are simple. For example, if all regular paths contain only one edge, then the complete set \(\mathcal {U}\) includes many clauses of the form \(\lnot p_e\vee p_{N_A \leadsto N_B}\), which happens because H-rules use regular paths. Indeed, the clause \(\lnot p_e\vee p_{N_A \leadsto N_B}\) is redundant since we can omit this clause by replacing \(p_{N_A \leadsto N_B}\) by \(p_e\). For elk, this does not happen.

Fig. 5.
figure 5

Each blue point has coordinate \((\log (\# |\mathcal {U}|),\log (\# |\textit{elk} |))\), where \(\mathcal {U}, \textit{elk}\) are generated from a non-trivial query, the red line is \(x = y\). (Color figure online)

Comparing Time Cost: minH  vs. PULi. In the following, we only compare the time cost on non-trivial queries. For trivial queries, all H-path are regular paths. Thus all the justifications have already been enumerated by path in minH. It is also easy to compute all the justifications for trivial queries for PULi.

We set a limit of 60 s for each query. The timed-out queries contribute of 60 s to the total time cost. To compare minH with PULi, we test all three different strategies, threshold, top down, bottom up of the resolution algorithm proposed in [14]. We summarize in Table 5 the total time cost (top) and the timed-out queries (bottom). Figure 6 gives the comparisons over queries that are successful for both minH and PULi.

Fig. 6.
figure 6

For each line, the left, middle and right charts correspond to threshold, top down, bottom up strategies respectively. The y-axis is the log value of time(s). The red (resp. black) curve presents the ascending ordered (log value of) time cost of PULi (resp. minH). For a green point (xy), \(e^y\) is the time cost of minH for the query corresponding to the red line point \((x,y')\). (Color figure online)

As shown in Table 5, when using the threshold strategy, minH is more time consuming in total (+5%) on snt2021, and minH has more timed-out queries than PULi on snt2015 and snt2021. This is in part due to the fact that \(\mathcal {U}\) is larger than elk for relatively many queries on snt2015 and snt2021 as shown in Fig. 5. For the remaining 11 cases, minH performs better than PULi in terms of total time cost and the number of timed-out queries. Especially on galen7, the gap between the two methods is even up to ten times for the total time cost. We can see from Table 5 that the threshold strategy performs the best for PULi on all four ontologies. This strategy is also the best strategy for minH except for galen7, for which the bottom up strategy is the best with minH.

For each strategy detailed in Fig. 6, the black curve (the ordered time costs of minH on successful queries) is always below the red curve (the ordered time costs of PULi on successful queries) for all the ontologies. This suggests that minH spends less time over successful queries. Also, most of the green points are below the red lines, which suggests that minH performs better than PULi most of the time for a given query. In some cases, we can see that PULi is more efficient than minH. One of the reasons might be as follows. Note that when computing justifications by resolution, we have to compare two different clauses and delete the redundant one (i.e., the non-minimal one). When regular paths are big, minH might be time consuming because of these comparisons.

Table 5. Total time cost and number of timed-out queries.

6 Conclusion

In this paper, we introduce and investigate a new set of sound and complete inference rules based on a hypergraph representation of ontologies. We design the algorithm minH that leverages these inference rules to compute all justifications for a given conclusion. The key of the performance of our method is that regular paths are used as elementary components of H-paths  and this leads to reducing the size of complete sets because (1) rules are more compact than standard ones, (2) redundant inferences are captured and eliminated by regular paths (see Sect. 4.3). The efficiency of the algorithm minH has been validated by our experiments showing that it outperforms PULi in most of the cases.

There are still many possible extensions and applications of the hypergraph approach. For instance, to get even more compact inference rules, we could extend the notion of regular path to a more general one that will encapsulate the inference rule \(\mathcal {H}_2\) in the same way as regular paths are encapsulated in H-rules. Moreover, we will try to apply our approach for other tasks like classification and to compute logical differences [15].