
1 Introduction

With the success of semantic technologies and their tool support, most notably the OWL language family and its status as a W3C standard, more and more people from various application domains create and use ontologies. Meanwhile, ontological modeling is not only well supported by established tools like Protégé; methodologies such as the use of ontology design patterns also help practitioners to design and deploy ontologies of high quality [9].

Despite these evolutionary improvements in ontology engineering, the resulting ontologies are not free of errors such as unintended entailments (including the case of inconsistency). To this end, research has produced several techniques for detecting the causes of unintended entailments, which have been studied for lightweight ontology languages such as \(\mathcal {EL}\) [22] as well as for very expressive description logics up to \(\mathcal {SROIQ}\) [13, 15], which in fact is the logical foundation of OWL 2 DL [10]. These techniques have already found their way into tools like Protégé as built-in functionality, or are available stand-alone. In any case, these methods have become an integral part of the semantic development chain.

When considering their purpose, ontologies are often divided into two groups: those intended as (highly axiomatized) expert systems focusing on automated reasoning as the main use (typically less data driven), and those rather used for data sharing, integration, and reuse with little or no reasoning intended (typically data driven) [9]. However, in our collaborations with practitioners, we found scenarios exhibiting characteristics of both usages, aiming at ontologies that (a) represent a detailed specification of some product (schema knowledge), (b) include all data, and (c) contain axioms that (non-deterministically) specify variable (configurable) parts of the product. In general, these ontologies resemble constraint-type problems, where the typical automated reasoning tasks are (i) checking satisfiability and (ii) asking for models – solutions of the encoded problem. For both tasks, the natural assumption in this setup is that the domain is explicitly given in the ontology, and thus is finite and fixed a priori.

To accommodate these requirements, the fixed-domain semantics has been introduced [6, 27], which allows for reasoning over an explicitly given finite domain. A reasoner, named \(\mathtt {Wolpertinger}\)Footnote 1, that supports standard reasoning as well as model enumeration under the fixed-domain semantics has been developed [28], based on a translation of DL into answer-set programming.

Our motivation in this paper is to elaborate on possible approaches to compute justifications for ontologies under the fixed-domain semantics. We focus on three approaches that evolved naturally during our investigation. First, it is possible to axiomatize a finite domain and conduct fixed-domain reasoning using standard tools, such that computing explanations can be done via standard tools as well. Second, the \(\mathtt {Wolpertinger}\) reasoner can be coupled with the off-the-shelf justification components of Protégé, and finally we introduce a dedicated encoding of the whole problem into answer-set programming. Our contributions in this paper are:

  1.

    A formal framework for justifications under the fixed-domain semantics.

  2.

    A novel translation of \(\mathcal {SROIQ}\) into answer-set programming that allows for standard reasoning and model enumeration.

  3.

    An extended version of the translation that enables computing justifications, with the problem encoded entirely in answer-set programming.

  4.

    A comparison of three different approaches: one using standard OWL technology employing an axiomatization of the fixed-domain semantics, one using our dedicated fixed-domain reasoner Wolpertinger in combination with standard justification computation technology, and one with our novel translation where the problem is encoded entirely into answer-set programming.

The paper is organized as follows. We briefly recall the description logic \(\mathcal {SROIQ}\) and sufficient background on answer-set programming in Sect. 2. In Sect. 3, we introduce the notion of justifications, especially under the fixed-domain semantics. Each approach to computing justifications is then described in detail in Sect. 4. Finally, we compare the introduced methodologies in Sect. 5.

2 Preliminaries

OWL 2 DL, the version of the Web Ontology Language we focus on, is defined based on description logics (DLs, [1, 26]). We briefly recap the description logic \(\mathcal {SROIQ}\) (for details see [14]). Let \(N_I\), \(N_C\), and \(N_R\) be finite, disjoint sets called individual names, concept names, and role names, respectively. These atomic entities can be used to form complex ones as displayed in Table 1.

A \(\mathcal {SROIQ}\) knowledge base \(\mathcal {K} \) is a tuple \((\mathcal {A},\mathcal {T},\mathcal {R})\) where \(\mathcal {A}\) is an ABox, \(\mathcal {T}\) is a TBox and \(\mathcal {R}\) is an RBox. Table 2 presents the respective axiom types available in the three parts. The definition of \(\mathcal {SROIQ}\) also contains so-called global restrictions which prevent certain axioms from occurring together, in order to retain decidability. They are not necessary for the fixed-domain reasoning considered here, hence we omit them for the sake of brevity. We use \(N_I(\mathcal {K})\), \(N_C(\mathcal {K})\), and \(N_R(\mathcal {K})\) to denote the sets of individual names, concept names, and role names occurring in \(\mathcal {K} \), respectively.

Table 1. Syntax and semantics of role and concept constructors in \(\mathcal {SROIQ}\), where \(a_1, \ldots, a_n\) denote individual names, \(s\) a role name, \(r\) a role expression, and \(C\) and \(D\) concept expressions.
Table 2. Syntax and semantics of \(\mathcal {SROIQ}\) axioms.

The semantics of \(\mathcal {SROIQ}\) is defined via interpretations \(\mathcal {I}=(\varDelta ^\mathcal {I},\cdot ^\mathcal {I})\) composed of a non-empty set \(\varDelta ^\mathcal {I}\) called the domain of \(\mathcal {I}\) and a function \(\cdot ^\mathcal {I}\) mapping individual names to elements of \(\varDelta ^\mathcal {I}\), concept names to subsets of \(\varDelta ^\mathcal {I}\), and role names to subsets of \(\varDelta ^\mathcal {I}\times \varDelta ^\mathcal {I}\). This mapping is extended to complex role and concept expressions (cf. Table 1) and finally used to define satisfaction of axioms (see Table 2). We say that \(\mathcal {I}\) satisfies a knowledge base \(\mathcal {K} =(\mathcal {A},\mathcal {T}, \mathcal {R})\) (or \(\mathcal {I}\) is a model of \(\mathcal {K} \), written: \(\mathcal {I} \ \models \ \mathcal {K} \)) if it satisfies all axioms of \(\mathcal {A}\), \(\mathcal {T}\), and \(\mathcal {R}\). We say that a knowledge base \(\mathcal {K} \) entails an axiom \(\alpha \) (written \(\mathcal {K}\ \models \ \alpha \)) if all models of \(\mathcal {K} \) are models of \(\alpha \).

Answer-Set Programming. We review the basic notions of answer-set programming [19] under the stable model semantics [8]; for further details we refer to [4, 7].

We fix a countable set \(\mathcal{U}\) of (domain) elements, also called constants; and suppose a total order < over the domain elements. An atom is an expression \(p(t_{1},\ldots ,t_{n})\), where p is a predicate of arity \(n\ge 0\) and each \(t_{i}\) is either a variable or an element from \(\mathcal{U}\). An atom is ground if it is free of variables. \({B_\mathcal{U}}\) denotes the set of all ground atoms over \(\mathcal{U}\). A (disjunctive) rule \(\rho \) is of the form

$$\begin{aligned} a_1, \ldots , a_n \leftarrow b_1,\ldots , b_k,\ { not}\,b_{k+1},\ldots ,\ { not}\,b_m. \end{aligned}$$

with \(m\ge k\ge 0\), where \(a_1, \ldots ,a_n,b_1,\ldots ,b_m\) are atoms, and “\({ not}\,\)” denotes default negation. The head of \(\rho \) is the set \(H(\rho )\) = \(\{a_1, \ldots ,a_n\}\) and the body of \(\rho \) is \(B(\rho )= \{b_1,\ldots , b_k,\, { not}\,b_{k+1},\ldots ,\) \({ not}\,b_m\}\). Furthermore, \(B^+(\rho )\) = \(\{b_{1},\ldots ,b_{k}\}\) and \(B^-(\rho )\) = \(\{b_{k+1},\ldots ,b_m\}\). A rule \(\rho \) is safe if each variable in \(\rho \) occurs in \(B^+(\rho )\). A rule \(\rho \) is ground if no variable occurs in \(\rho \). A fact is a ground rule with empty body. An (input) database is a set of facts. A (disjunctive) program is a finite set of disjunctive rules. For a program \(\mathrm {\varPi }\) and an input database D, we often write \(\mathrm {\varPi } (D)\) instead of \(D\cup \mathrm {\varPi } \). For any program \(\mathrm {\varPi }\), let \(U_\mathrm {\varPi } \) be the set of all constants appearing in \(\mathrm {\varPi }\). \( Gr (\mathrm {\varPi })\) is the set of rules \(\rho \sigma \) obtained by applying, to each rule \(\rho \in \mathrm {\varPi } \), all possible substitutions \(\sigma \) from the variables in \(\rho \) to elements of \(U_\mathrm {\varPi } \).

An interpretation \(I\subseteq {B_\mathcal{U}}\) satisfies a ground rule \(\rho \) iff \(H(\rho ) \cap I \ne \emptyset \) whenever \(B^+(\rho )\subseteq I\), \(B^-(\rho ) \cap I = \emptyset \). I satisfies a ground program \(\mathrm {\varPi } \), if each \(\rho \in \mathrm {\varPi } \) is satisfied by I. A non-ground rule \(\rho \) (resp., a program \(\mathrm {\varPi } \)) is satisfied by an interpretation I iff I satisfies all groundings of \(\rho \) (resp., \( Gr (\mathrm {\varPi })\)). \(I \subseteq {B_\mathcal{U}}\) is an answer set (also called stable model) of \(\mathrm {\varPi } \) iff it is a subset-minimal set satisfying the Gelfond-Lifschitz reduct \( \mathrm {\varPi } ^I=\{ H(\rho ) \leftarrow B^+(\rho ) \mid I\cap B^-(\rho ) = \emptyset , \rho \in Gr (\mathrm {\varPi })\} \). For a program \(\mathrm {\varPi } \), we denote the set of its answer sets by \(\mathcal {AS}(\mathrm {\varPi })\).
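To make the reduct-based definition concrete, the following minimal Python sketch (purely illustrative, not part of any system discussed in this paper) enumerates the stable models of a small ground program consisting of the cyclic pair p :- not q. / q :- not p. and the derivation rule r :- p.

```python
from itertools import combinations

# Ground program: each rule is (head, positive body, negative body).
rules = [
    (frozenset({"p"}), frozenset(), frozenset({"q"})),   # p :- not q.
    (frozenset({"q"}), frozenset(), frozenset({"p"})),   # q :- not p.
    (frozenset({"r"}), frozenset({"p"}), frozenset()),   # r :- p.
]
atoms = set().union(*(h | b | n for h, b, n in rules))

def satisfies(I, prog):
    # A rule is satisfied iff its head intersects I whenever its body applies.
    return all(h & I or not (b <= I and n.isdisjoint(I)) for h, b, n in prog)

def reduct(prog, I):
    # Gelfond-Lifschitz reduct: drop rules whose negative body intersects I,
    # and remove the "not" literals from the remaining rules.
    return [(h, b, frozenset()) for h, b, n in prog if n.isdisjoint(I)]

def is_stable(I, prog):
    red = reduct(prog, I)
    if not satisfies(I, red):
        return False
    # Subset-minimality: no proper subset of I may satisfy the reduct.
    return not any(satisfies(frozenset(J), red)
                   for k in range(len(I))
                   for J in combinations(I, k))

stable = [I for k in range(len(atoms) + 1)
          for I in map(frozenset, combinations(sorted(atoms), k))
          if is_stable(I, rules)]
print(stable)  # the two answer sets: {q} and {p, r}
```

The two answer sets reflect the two ways of resolving the cyclic default negation between p and q; whenever p is derived, r follows.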

For a program \(\mathrm {\varPi } \), we might use \(\mathcal {AS}(\mathrm {\varPi })_{| P}\) to denote the projection of its answer sets to the predicates \(P=\{p_1, \ldots , p_n\}\).

We make use of further syntactic extensions, namely integrity constraints and count expressions, both of which can be recast to ordinary normal rules as described in [7]. An integrity constraint is a rule \(\rho \) where \(H(\rho ) = \emptyset \), intuitively representing an undesirable situation; i.e., any situation in which \(B(\rho )\) evaluates positively has to be avoided. Count expressions are of the form \( \#count \{l:l_1,\ldots ,l_i\}\bowtie u\), where l is an atom and \(l_j = p_j\) or \(l_j={ not}\,p_j\), for \(p_j\) an atom, \(1 \le j \le i\), u a non-negative integer, and \(\bowtie \; \in \{\le ,<,=,>,\ge \}\). The expression \(\{l:l_1,\ldots ,l_i\}\) denotes the set of all ground instantiations of l, governed through \(\{l_1,\ldots ,l_i\}\). We restrict the occurrence of count expressions in a rule \(\rho \) to \(B^+(\rho )\) only. Intuitively, an interpretation satisfies a count expression if \(N \bowtie u\) holds, where \(N = |\{l \mid l_1,\ldots ,l_i\}|\) is the cardinality of the set of ground instantiations of l.

In order to handle (subset) preferences over answer sets w.r.t. ground instances of a specific atom, we make use of asprin [3], a framework designed to support and simplify the incorporation of preferences over answer sets.

3 Justifications Under Fixed-Domain Semantics

3.1 Fixed-Domain Semantics

The standard semantics of DLs is defined on arbitrary domains. While finite model reasoning (a natural assumption in database theory) has become the focus of studies in DLs [5, 18, 25], where one is interested in models over arbitrary but finite domains, we consider the case where the domain has an a-priori known cardinality and use the term fixed-domain. This restriction yields an advantage regarding computational complexity for expressive DLs, but it also seems to reflect the intuitive model-theoretic expectations of practitioners in the industrial use cases we were confronted with. Satisfiability checking in \( \mathcal {SROIQ} \) under the standard semantics is N2ExpTime-complete [16], while being NP-complete in the fixed-domain setting [6].

Definition 1

(Fixed-Domain Semantics [6]). Let \( \varDelta \) be a non-empty finite set called fixed domain. An interpretation \( \mathcal {I} = (\varDelta ^{\mathcal {I}}, \cdot ^{\mathcal {I}}) \) is said to be \(\varDelta \)-fixed, if \( \varDelta ^{\mathcal {I}} = \varDelta \), and \( a^{\mathcal {I}} = a \) for all \( a \in \varDelta \). For a DL knowledge base \( \mathcal {K} \), we call an interpretation \( \mathcal {I} \) a \(\varDelta \)-model of \( \mathcal {K} \) (and write \( \mathcal {I}\ \models _{\varDelta } \ \mathcal {K} \)), if \( \mathcal {I} \) is a \( \varDelta \)-fixed interpretation and \( \mathcal {I}\ \models \ \mathcal {K} \). A knowledge base \( \mathcal {K} \) is called \(\varDelta \)-satisfiable, if it has a \( \varDelta \)-model. A knowledge base \(\mathcal {K} \) is said to \(\varDelta \)-entail an axiom \( \alpha \) (\( \mathcal {K}\ \models _{\varDelta } \alpha \)), if \( \mathcal {I}\ \models \ \alpha \) for every \( \mathcal {I}\ \models _{\varDelta } \ \mathcal {K} \).
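Since a fixed domain makes the space of interpretations finite, \(\varDelta \)-satisfiability can in principle be decided by exhaustive enumeration. The following Python sketch (illustrative only; the knowledge base, with invented names A, B, and r, is a toy example) enumerates all \(\varDelta \)-fixed interpretations for \(\varDelta = \{a, b\}\), the ABox assertion \(A(a)\), and the TBox axioms \(A \sqsubseteq \exists r.B\) and \(A \sqcap B \sqsubseteq \bot \):

```python
from itertools import product

delta = ["a", "b"]
pairs = [(x, y) for x in delta for y in delta]

def models():
    # Enumerate every Δ-fixed interpretation: an extension for each of
    # the concept names A, B and for the role name r.
    for bitsA in product([False, True], repeat=len(delta)):
        A = {d for d, t in zip(delta, bitsA) if t}
        for bitsB in product([False, True], repeat=len(delta)):
            B = {d for d, t in zip(delta, bitsB) if t}
            for bitsR in product([False, True], repeat=len(pairs)):
                r = {p for p, t in zip(pairs, bitsR) if t}
                if "a" not in A:                      # ABox: A(a)
                    continue
                if any(not any((x, y) in r and y in B for y in delta)
                       for x in A):                   # TBox: A ⊑ ∃r.B
                    continue
                if A & B:                             # TBox: A ⊓ B ⊑ ⊥
                    continue
                yield (frozenset(A), frozenset(B), frozenset(r))

ms = list(models())
print(len(ms))  # number of Δ-models of the toy knowledge base
```

Here the knowledge base is \(\varDelta \)-satisfiable: every \(\varDelta \)-model interprets A as \(\{a\}\), B as \(\{b\}\), and r as some relation containing \((a,b)\). Actual reasoners such as Wolpertinger of course avoid this exhaustive enumeration.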

3.2 Justifications

Logical modeling is prone to error, and it is therefore important to provide debugging support. One of the most investigated methods is to determine explanations of certain entailments. These explanations are usually (minimal) subsets of the input knowledge base that suffice to entail the axiom in question. Several terms have been coined to refer to such (sub)sets. In the context of lightweight description logics, the term Minimal Axiom Set (MinA) is used, and the task of finding them is called Axiom Pinpointing [2, 22]. In propositional logic, the term Minimal Unsatisfiable Subformula (MUS) was introduced much earlier to explain unsatisfiability [21]. In this paper we use the notion of justification, introduced in the context of highly expressive description logics [15].

Definition 2

(Justification [15]). Let \( \mathcal {K} \) be a knowledge base such that \(\mathcal {K}\ \models \ \alpha \). \(\mathcal {J} \) is a justification for \(\alpha \) in \(\mathcal {K} \) if \(\mathcal {J} \subseteq \mathcal {K} \) and \(\mathcal {J}\ \models \ \alpha \), and for all \(\mathcal {J} ' \subset \mathcal {J}, \mathcal {J} ' \ \not \models \ \alpha \).

Obviously, there may be multiple justifications for an axiom \(\alpha \). Dually to justifications, one might be interested in minimal subsets that can be retracted in order to restore consistency, or to remove the unwanted entailment; such a set is commonly called a repair. These two notions are strongly related in the sense that any repair has a non-empty intersection with each justification. However, in this work we restrict ourselves to justifications only.

Regarding the fixed-domain semantics, any justification needs to adhere to the considered fixed domain. Note that fixed-domain reasoning is monotonic; otherwise, the subset-minimality criterion in the definition of justifications would not be reasonable.

Definition 3

(Fixed-Justification). Let \(\mathcal {K} \) be a knowledge base, and \(\varDelta \) a fixed-domain such that \(\mathcal {K}\ \models _{\varDelta } \ \alpha \). \(\mathcal {J} \) is a \(\varDelta \)-justification for \(\alpha \) in \(\mathcal {K} \) if \(\mathcal {J} \subseteq \mathcal {K} \) and \(\mathcal {J}\ \models _{\varDelta } \ \alpha \), and for all \(\mathcal {J} ' \subset \mathcal {J}, \mathcal {J} ' \ \not \models _{\varDelta } \ \alpha \).

It is the case that, if \(\mathcal {K}\ \models \ \alpha \), then \(\mathcal {K}\ \models _{\varDelta }\ \alpha \) for any fixed domain \(\varDelta \). However, it does not hold that, if \(\mathcal {J} \) is a justification for \(\mathcal {K}\ \models \ \alpha \), then \(\mathcal {J} \) is a \(\varDelta \)-justification for \(\mathcal {K} \models _{\varDelta }\ \alpha \) for any fixed domain \( \varDelta \). Due to the stronger restriction on models, there might exist \( \mathcal {J} ' \subset \mathcal {J} \) such that \( \mathcal {J} ' \ \not \models \ \alpha \) but \( \mathcal {J} ' \models _{\varDelta }\ \alpha \). Nonetheless, a justification \( \mathcal {J} \) under the standard semantics is helpful, since only subsets of \(\mathcal {J} \) need to be considered. Formally, if \( \mathcal {J} \) is a justification for \( \mathcal {K} \models \ \alpha \), then there exists no \( \varDelta \)-justification \( \mathcal {J} ' \supset \mathcal {J} \) for \( \mathcal {K}\ \models _{\varDelta }\ \alpha \), for any fixed domain \( \varDelta \). This holds for any restricted form of reasoning that maintains monotonicity (e.g. finite model reasoning).

We focus on finding justifications for inconsistency, since entailment checking in \( \mathcal {SROIQ} \) can be reduced to satisfiability checking. For example, \( \mathcal {K}\ \models _{\varDelta }\ A \sqsubseteq B \) iff \( \mathcal {K} \cup \{(A \;\sqcap \; \lnot B)(a)\} \) is \( \varDelta \)-inconsistent, where a is a fresh individual not occurring in \(\mathcal {K} \). In the same way, finding justifications for entailments can be reduced to finding justifications for inconsistency. The caveat is that the introduced axiom has to be fixed and must not be part of the candidate subset guessing.
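Definition 3 suggests a deliberately naïve enumeration scheme: test subsets of the knowledge base by increasing size against a \(\varDelta \)-inconsistency oracle, and keep every inconsistent subset that contains no smaller one already found. The Python sketch below abstracts the reasoner away behind a stand-in oracle; the "conflict cores" and the numbering of axioms are invented purely for illustration.

```python
from itertools import combinations

# Stand-in oracle (an assumption for illustration): a subset of the
# knowledge base is Δ-inconsistent iff it contains one of these axiom sets.
CORES = [frozenset({1, 2, 3}), frozenset({2, 4})]

def delta_inconsistent(axioms):
    return any(core <= axioms for core in CORES)

def justifications(kb):
    # Enumerate subsets by increasing size; an inconsistent subset is a
    # justification iff no previously found (smaller) justification is
    # properly contained in it.
    found = []
    for k in range(1, len(kb) + 1):
        for subset in map(frozenset, combinations(sorted(kb), k)):
            if delta_inconsistent(subset) and not any(j < subset for j in found):
                found.append(subset)
    return found

kb = frozenset({1, 2, 3, 4, 5})
print(justifications(kb))  # [{2, 4}, {1, 2, 3}]
```

This exhaustive scheme makes the minimality condition of Definition 3 explicit; the approaches of Sect. 4 replace it by far more efficient search strategies.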

Example 1

We consider a simple assignment problem, encoded in \( \mathcal {K} _{as} \). We let the domain be \( \varDelta = \{p_{1},p_{2},p_{3}, l_{1},l_{2}, l_{3}, t_{1}, t_{2},t_{3}\}\).


First, we introduce the core of the knowledge base. Axioms \( \alpha _{1-2} \) specify that a lecture must be taught by a professor, but one professor teaches at most one lecture. Axioms \( \alpha _{3-4} \) introduce special lectures that can only be taught by professor \( p_{2} \). Pairwise disjointness of the classes of lectures, professors and times is represented by axioms \( \alpha _{5-7} \). The domain and the range of heldAt are restricted by \( \alpha _{8-9} \). Finally, axiom \( \alpha _{10} \) defines that a professor is busy at a certain time if they teach a lecture at that time.

Fig. 1. \( \mathcal {K} _{as} \) ABox representation and axioms.

We specify the ABox for \( \mathcal {K} _{as} \) in Fig. 1. As shown by the graph, this knowledge base is designed to find a suitable teaching “configuration”. We then add the additional constraints \( \lnot busyAt(p_{1}, t_{2}) \) [\( \alpha _{25} \)] and \( \{p_{3}\} \sqsubseteq {\le 1\;} busyAt.Time \) [\( \alpha _{26} \)]. It is easy to see that these constraints force \( p_{1} \) and \( p_{3} \) to teach \( l_{1} \). However, \( l_{1} \) is a special lecture that can only be taught by \( p_{2} \). Consequently, \( \mathcal {K} _{as} \) is inconsistent. Then, for example, \( \mathcal {J} _{as} = \mathcal {K} _{as} \setminus \{\alpha _{11-13}, \alpha _{15}, \alpha _{17-19}, \alpha _{21-22}\} \) is a \( \varDelta \)-justification for the inconsistency of \( \mathcal {K} _{as} \).

Note that some assertions can be concluded implicitly; e.g., using the axioms in \( \mathcal {J} _{as} \), we can infer that \( p_{1}, p_{2}, p_{3} \) must be professors, since the other elements in the domain are lectures and time points. Thus, the corresponding assertions can be removed to obtain a minimal justification. Besides \(\mathcal {J} _{as}\), there are 52 justifications in total. Also note that \( \mathcal {K} _{as} \) is consistent under the standard semantics, since new professors can be introduced to teach the problematic lectures.

4 Computing Justifications

Algorithms for finding justifications can be categorized coarsely into black-box and glass-box approaches. Black-box approaches use a reasoner to conduct the reasoning tasks it was designed for, i.e. entailment checking. In contrast, in a glass-box approach the reasoner itself is modified, i.e. its internal reasoning algorithms are tweaked towards computing justifications. Generally, black-box approaches are more robust and easier to implement, whereas glass-box approaches provide more potential for optimization. Subsequently, we introduce two black-box approaches, followed by a dedicated glass-box approach.

4.1 Black-Box Approaches

The ontology editor Protégé has built-in functionality to compute justifications under the standard semantics, which is based on the OWL Explanation WorkbenchFootnote 2 [12]. The underlying algorithm is based on the Hitting-Set Tree (HST) algorithm originating from Reiter’s theory of diagnosis [24]. For the details of the implementation we refer to [11].
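A standard building block of such black-box algorithms is the extraction of a single justification by contraction: starting from the full (inconsistent) knowledge base, every axiom whose removal preserves inconsistency is discarded. A minimal Python sketch, with the reasoner again replaced by a stand-in oracle (the oracle and the axiom numbering are invented for illustration):

```python
def single_justification(kb, inconsistent):
    """Black-box contraction: drop each axiom that is not needed
    to preserve the inconsistency of the remaining set."""
    just = list(kb)
    for ax in list(just):
        rest = [a for a in just if a != ax]
        if inconsistent(rest):   # one reasoner call per axiom
            just = rest
    return set(just)

# Stand-in oracle (assumption): axioms 2 and 4 together are contradictory.
oracle = lambda axioms: {2, 4} <= set(axioms)
print(single_justification([1, 2, 3, 4, 5], oracle))  # {2, 4}
```

The HST algorithm then obtains further justifications systematically, by retracting axioms of already found justifications and repeating the extraction on the remainder.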

Axiomatization of \(\varDelta \)-models. Given a knowledge base \( \mathcal {K} \) and a fixed domain \( \varDelta = \{a_{1}, \ldots , a_{n}\} \), one can axiomatize the fixed-domain semantics, such that \( \mathcal {K}\ \models _{\varDelta } \ \alpha \) iff \( \mathcal {K} \cup \mathcal {FD}_{\varDelta } \ \models \ \alpha \), where \( \mathcal {FD}_{\varDelta } = \{\top \sqsubseteq \{a_{1}\} \sqcup \ldots \sqcup \{a_{n}\}\} \cup \{a_{i} \not \approx a_{j} \mid 1 \le i < j \le n\} \). It is easy to see that those axioms enforce reasoning over \(\varDelta \). A black-box algorithm for finding justifications merely exploits inconsistency or entailment checking, which are standard reasoning tasks; thus standard DL reasoners can be used for fixed-domain reasoning. In Sect. 5 we therefore use the explanation workbench with \(\mathtt {HermiT}\) as the black-box reasoner, and refer to this approach as \(\mathtt {H\text {-}black\text {-}box}\).

A Fixed-Domain Reasoner as a Black-box. \(\mathtt {Wolpertinger}\) has been introduced as a reasoner adhering to the fixed-domain semantics [28], and can easily be plugged into the explanation workbench. We evaluate the performance of this approach, expecting it to correlate with the performance of entailment checking. We refer to this approach as \(\mathtt {W\text {-}black\text {-}box}\) in the subsequent evaluation.

4.2 A Glass-Box Approach Using Answer-Set Programming

We now introduce a glass-box approach for computing justifications using an encoding into answer-set programming. The translation is based on the naïve translation [6], which has already been implemented in \(\mathtt {Wolpertinger}\), but some fundamental changes were needed in order to compute justifications. Since finding justifications amounts to finding the corresponding (minimal) subsets of a knowledge base, another “layer” is required on top of the model correspondence established in the naïve translation, which is not straightforward to encode in ASP. We will therefore avoid negation-as-failure, and hence refer to this new translation as naff (negation-as-failure free). Subsequently, the translation is described in detail.

Let \(\mathcal {K} = (\mathcal {A}, \mathcal {T}, \mathcal {R})\) be a normalized \(\mathcal {SROIQ}\) knowledge base, and \(\varDelta \) a fixed domain.Footnote 3 With \(\varPi (\mathcal {K},\varDelta ) = \varPi _{ gen }(\mathcal {K},\varDelta ) \cup \varPi _{ chk }(\mathcal {K}) \cup \varPi _{inc}(\mathcal {K})\), we denote the translation of \(\mathcal {K} \) into a logic program w.r.t. \(\varDelta \). Intuitively, \(\varPi _{ gen }(\mathcal {K},\varDelta ) \) generates candidate interpretations w.r.t. \(\varDelta \), and each axiom is translated into rules in \(\varPi _{ chk }(\mathcal {K})\) in such a way that any violation will cause a dedicated propositional atom to be true. If so, the Principle of Explosion (POE) is applied via appropriate rules. For every translated axiom, an additional dedicated propositional activator is added in the body of the resulting rule, allowing to activate or deactivate the rule, thus indicating whether the axiom is included in a candidate justification. The disjunctive rules in \(\varPi _{ gss }(\mathcal {K},\varDelta ) \) realize the generation of extensions for every concept and role name.

$$\begin{aligned} \varPi _{ gss }(\mathcal {K},\varDelta ) =&\ \{A(X), not\_A(X) \;{:-} \; \top (X) \mid A \in N_{C}(\mathcal {K})\} \; \cup \\&\ \{r(X,Y), not\_r(X,Y) \;{:-} \; \top (X), \top (Y) \mid r \in N_{R}(\mathcal {K})\} \; \cup \\&\ \{\top (a) \mid a \in \varDelta \}. \end{aligned}$$

Atomic clashes need to be detected explicitly, which is done via simple rules in \(\varPi _{ obv }(\mathcal {K})\). Note that clashes are not represented by constraints, but rules with the dedicated propositional variable inc as head.

$$\begin{aligned} \varPi _{ obv }(\mathcal {K}) =&\ \{inc \;{:-} \; A(X), not\_A(X) \mid A \in N_{C}(\mathcal {K})\} \; \cup \\&\ \{inc \;{:-} \; r(X,Y), not\_r(X,Y) \mid r \in N_{R}(\mathcal {K})\}. \end{aligned}$$

Based on the detection of atomic clashes, the rules in \(\varPi _{poe}(\mathcal {K}) \) encode the POE, that is, every concept and role assertion follows whenever inc holds.

$$\begin{aligned} \varPi _{poe}(\mathcal {K}) =&\ \{A(X) \;{:-} \; inc, \top (X) \mid A \in N_{C}(\mathcal {K})\} \; \cup \\&\ \{not\_A(X) \;{:-} \; inc, \top (X) \mid A \in N_{C}(\mathcal {K})\} \; \cup \\&\ \{r(X,Y) \;{:-} \; inc, \top (X), \top (Y) \mid r \in N_{R}(\mathcal {K})\} \; \cup \\&\ \{not\_r(X,Y) \;{:-} \; inc, \top (X), \top (Y) \mid r \in N_{R}(\mathcal {K})\}. \end{aligned}$$

Qualified Number Restriction Encoding. One problem that we encountered is the usage of the \(\#count \)-operator in the translation of at-least cardinality restrictions. Consider the concept \(\ge n\;r.C\), which restricts an individual to have at least n r-neighbors that are members of C. The intuitive translation is a constraint that counts how many outgoing r-connections exist that also satisfy the membership in C, failing if there are fewer than n such r-neighbors. However, this translation no longer works due to the rules in \(\varPi _{poe}(\mathcal {K})\).

We therefore introduce a different view of the semantics of cardinality restrictions in the fixed-domain setting. For simplicity, we define \( r.C (a) = \{x \in C^{\mathcal {I}} \mid (a,x) \in r^{\mathcal {I}}\} \). Hence, r.C(a) consists of all members of concept C that are reachable via r from a. The idea is to count the individuals that are not in this set. There are two possibilities for an individual b not to be in r.C(a): \( (a,b) \notin r^{\mathcal {I}} \) or \( b \notin C^{\mathcal {I}} \). Let \( n = |\varDelta ^{\mathcal {I}}| \) and \( m = |\{b \in \varDelta ^{\mathcal {I}} \mid b \notin \;r.C(a) \}|\). Then the number of individuals in \( \;r.C(a) \) is \(n - m\). This is only possible due to the given fixed domain.

Proposition 1

Let \(\mathcal {K}\) be a \(\mathcal {SROIQ}\) knowledge base, \(\varDelta \) be a fixed-domain, and \(\mathcal {I}\) a \(\varDelta \)-model of \(\mathcal {K}\). Then \( (\ge n\;r.C)^{\mathcal {I}} = \{x \in \varDelta \mid \#\{y \in \varDelta \mid y \notin C^{\mathcal {I}} \;or\; (x,y) \notin r^{\mathcal {I}}\} \le |\varDelta | - n\} \).
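Proposition 1 can be checked empirically on randomly generated fixed-domain interpretations. The following Python sketch (an illustrative test, not part of the translation) compares the direct reading of \(\ge n\;r.C\) with the complement-based counting over a small domain:

```python
import random

random.seed(0)
delta = list(range(5))
n = 2  # the cardinality bound in ≥n r.C

for _ in range(100):
    # Random Δ-fixed interpretation of a concept C and a role r.
    C = {d for d in delta if random.random() < 0.5}
    r = {(x, y) for x in delta for y in delta if random.random() < 0.5}
    # Direct reading: at least n r-neighbors that belong to C.
    direct = {x for x in delta
              if sum(1 for y in delta if (x, y) in r and y in C) >= n}
    # Proposition 1: count the complement instead.
    via_complement = {x for x in delta
                      if sum(1 for y in delta
                             if y not in C or (x, y) not in r) <= len(delta) - n}
    assert direct == via_complement
print("ok")
```

Both readings coincide on every sampled interpretation, since an individual has at least n qualified r-neighbors exactly when at most \(|\varDelta | - n\) domain elements fail the condition.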

Hence, we can compute such a relation between two individuals to be used later in the translation of axioms. A new auxiliary predicate is introduced for each pair of concept (and its negation) and role. We define:

$$\begin{aligned} \varPi _{nra}(\mathcal {K}) =&\ \{not\_r\_C(X,Y) \;{:-} \; not\_C(Y) \;|\; C \in N_{C}(\mathcal {K}), r \in N_{R}(\mathcal {K}) \} \;\cup \\&\ \{not\_r\_C(X,Y) \;{:-} \; not\_r(X,Y) \;|\; C \in N_{C}(\mathcal {K}), r \in N_{R}(\mathcal {K}) \} \;\cup \\&\ \{not\_r\_not\_C(X,Y) \;{:-} \; C(Y) \;|\; C \in N_{C}(\mathcal {K}), r \in N_{R}(\mathcal {K}) \} \;\cup \\&\ \{not\_r\_not\_C(X,Y) \;{:-} \; not\_r(X,Y) \;|\; C \in N_{C}(\mathcal {K}), r \in N_{R}(\mathcal {K}) \}. \end{aligned}$$

\( \varPi _{nra}(\mathcal {K}) \) does not change the candidate interpretations generated by \( \varPi _{ gss }(\mathcal {K},\varDelta ) \). It merely collects all those individuals satisfying the previously mentioned conditions. Additionally, we have to take care of inverse roles, for which the rules look similar, but with swapped variables. Finally, \(\varPi _{ gen }(\mathcal {K},\varDelta ) = \varPi _{ gss }(\mathcal {K},\varDelta ) \cup \varPi _{ obv }(\mathcal {K}) \cup \varPi _{poe}(\mathcal {K}) \cup \varPi _{nra}(\mathcal {K})\).

ABox Translation. The first pruning of the search space originates from ABox assertions. As the input is a normalized knowledge base, each assertion contains only a literal concept or a literal role, respectively. It is then straightforward to encode:

$$\begin{aligned} \varPi _{ chk }(\mathcal {A}) =&\ \{inc \;{:-} \; active(i), not\_A(a) \mid A(a) \in \mathcal {A} \} ~\cup \\&\ \{inc \;{:-} \; active(i), A(a) \mid \lnot A(a) \in \mathcal {A} \} ~\cup \\&\ \{inc \;{:-} \; active(i), not\_r(a,b) \mid r(a,b) \in \mathcal {A} \} ~\cup \\&\ \{inc \;{:-} \; active(i), r(a,b) \mid \lnot r(a,b) \in \mathcal {A} \}. \end{aligned}$$

TBox Translation. Each TBox axiom is normalized and of form \(\top \sqsubseteq \bigsqcup ^{n}_{i=1} C_{i}\), with each \(C_i\) being non-complex, i.e. one of the concept constructors depicted in Table 3. It is then easy to turn normalized axioms into appropriate rules to detect any violation.

Table 3. Translation of concept constructors. Note: \(O_a\) is a new concept name unique for a, and \(m = |\varDelta ^{\mathcal {I}}|\).

RBox Translation. Since \(\mathcal {R} \) is normalized, each of its axioms is either a (simplified) role chain axiom, a role disjointness axiom, or a role inclusion axiom. As for TBox axioms, each axiom in \(\mathcal {R} \) is translated into a rule that enforces the propositional variable inc to be true whenever the axiom is violated.

$$\begin{aligned} \varPi _{ chk }(\mathcal {R}) =&\ \{inc \;{:-} \; active(i), r(X,Y),\; s(X,Y) \mid \mathsf {Dis} (r,s) \in \mathcal {R} \} \;\cup \\&\ \{inc \;{:-} \; active(i), r(X,Y),\;not\_s(X,Y) \mid r \sqsubseteq s \in \mathcal {R} \} \;\cup \\&\ \{inc \;{:-} \; active(i), s_1(X,Y), s_2(Y,Z),not\_r(X,Z) \mid s_1 \circ s_2 \sqsubseteq r \in \mathcal {R} \}. \end{aligned}$$

For example, \( \alpha _{2} \) and \( \alpha _{10} \) in Example 1 are encoded as rules of this form.

Finally, let \(\varPi _{ chk }(\mathcal {K}) = \varPi _{ chk }(\mathcal {A}) \cup \varPi _{ chk }(\mathcal {T}) \cup \varPi _{ chk }(\mathcal {R})\) be the translation of all axioms of the knowledge base. It remains to eliminate any candidate answer set that does not include inc, and to guess the set of active rules. As a result, any answer set indicates a set of axioms that jointly cause the inconsistency. Preferring answer sets that are subset-minimal w.r.t. the set of ground instances of \( active \) then yields exactly the desired justifications. The following program captures these requirements and completes the translation \(\varPi (\mathcal {K},\varDelta ) = \varPi _{ gen }(\mathcal {K},\varDelta ) \cup \varPi _{ chk }(\mathcal {K}) \cup \varPi _{inc}(\mathcal {K})\).

$$\begin{aligned} \varPi _{inc}(\mathcal {K}) = \{&\ {:-} \; { not}\,inc. \\&\ \{active(X) : X = 1..n\}. \\&\ \# preference (p, subset ) \{ active(C) : C=1..n\}. \\&\ \#optimize(p).\ \} \end{aligned}$$

Theorem 1

Let \( ACT(\mathcal {K}) = \{active(1), \ldots , active(n)\} \), where \( n = |\mathcal {K} | \), and \( \mathcal {K} ^{X} = \{\alpha _{i} \in \mathcal {K} \mid active(i) \in X\} \). Then \( \mathcal {AS}(\varPi (\mathcal {K},\varDelta ))_{|\{active\}} = \{X \in 2^{ACT(\mathcal {K})} \mid \mathcal {K} ^{X} \text { is } \varDelta \text {-inconsistent} \} \).

Proof sketch

Using the well-known splitting theorem [17], we split \(\varPi (\mathcal {K},\varDelta ) \) into two parts: axiom (subset) guessing and inconsistency checking. First, we show that each X representing a potential subset can be used to reduce the program to \( \varPi (\mathcal {K} ^{X}, \varDelta ) \). For the second part, we show that if \( \mathcal {K} ^{X} \) is \( \varDelta \)-inconsistent, \( \mathcal {AS}(\varPi (\mathcal {K} ^{X},\varDelta )) \) consists of exactly one answer set. Combining both arguments via the splitting theorem, it can be concluded that each answer set of \(\varPi (\mathcal {K},\varDelta ) \) corresponds to a \( \varDelta \)-justification for the inconsistency of \( \mathcal {K} \).   \(\square \)

We implemented this glass-box approach in \(\mathtt {Wolpertinger}\); in the evaluation, we refer to it as \(\mathtt {W\text {-}glass\text {-}box}\). While the translated programs need to be evaluated by \(\mathtt {asprin}\) (which in turn requires \(\mathtt {Clingo}\)), the minimality preference could easily be removed, in which case each answer set corresponds to some inconsistent subset of the knowledge base. One could then also define other preferences, e.g. prioritizing certain axioms that must necessarily be included.

5 Evaluation

We introduce several simple constraint-type combinatorial problems that match the intended use case of our approach. We deliberately make them inconsistent, with a controlled number of justifications. The evaluations were performed on an HPC system with Haswell CPUs, using a 4 GB memory limit. Unless stated otherwise, the timeout for each evaluation was 30 min. We use the hyphen symbol (-) to denote a timeout.

We reused an unsatisfiable knowledge base described in [6], which represents a pigeonhole-type problem. The axioms demand a model depicting an r-chain of length \( n + 1 \), while the domain is fixed to n elements, so no such model can exist. For \(\mathcal {K} _n = (\mathcal {T} _n, \mathcal {A} _n)\), we have:

$$\begin{aligned} \mathcal {T} _n&= \{A_1 \sqsubseteq \exists r.A_{2},\dots ,\; A_n \sqsubseteq \exists r.A_{n+1} \}\; \cup \\&\quad \ \{A_i \sqcap A_j \sqsubseteq \bot \;|\;1 \le i < j \le n+1\}\; \\ \mathcal {A} _n&= \{A_1(a_1)\} \\ \varDelta _n&= \{a_{1}, ..., a_{n}\} \end{aligned}$$
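For intuition, the \(\varDelta \)-inconsistency of \(\mathcal {K} _n\) can be confirmed by exhaustive model search over the fixed domain. The sketch below (our own illustration with a hypothetical function name, not the translation used by \(\mathtt {Wolpertinger}\)) exploits the pairwise disjointness of the \(A_i\) by assigning each element at most one label:

```python
from itertools import product

def chain_kb_has_model(n, domain_size=None):
    """Exhaustive fixed-domain model search for K_n. Label i (1 <= i <= n+1)
    stands for membership in A_i; label 0 means 'in no A_i', which is sound
    because the A_i are pairwise disjoint. Feasible only for tiny n."""
    m = domain_size if domain_size is not None else n  # |Delta|
    domain = range(m)
    for labels in product(range(n + 2), repeat=m):
        if labels[0] != 1:                 # ABox: A_1(a_1)
            continue
        for edges in product([False, True], repeat=m * m):
            r = {(x, y) for x in domain for y in domain if edges[x * m + y]}
            # TBox: every A_i-element (i <= n) needs an r-successor in A_(i+1)
            if all(labels[x] == 0 or labels[x] == n + 1 or
                   any((x, y) in r and labels[y] == labels[x] + 1
                       for y in domain)
                   for x in domain):
                return True
    return False
```

With the intended domain of n elements no model exists, while granting one additional element immediately admits the r-chain.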
Table 4. Runtimes for checking unsatisfiability of \(\mathcal {K} _n\) (left), and runtimes of each approach for computing one justification (right).

First, we compared the naïve and the naff translation. We expected the naff translation to be somewhat slower due to the overhead of computing auxiliary atoms, which is confirmed by Table 4 (left). Afterwards, we evaluated the performance of each approach for computing (\( \varDelta \)-)justifications of inconsistency of this knowledge base. Since the only justification here is the whole knowledge base, there is no major difference between requesting one justification or all of them; this would differ if the only justification were a proper subset, because the algorithm then has to make sure there is no other justification. Table 4 (right) shows the results. Note that for smaller instances, the \(\mathtt {W\text {-}glass\text {-}box}\) approach performs best, followed by \(\mathtt {W\text {-}black\text {-}box}\). However, neither scales well to bigger instances, where \(\mathtt {H\text {-}black\text {-}box}\) outperforms both. For the latter experiment, the timeout was set to one hour.

The second knowledge base \(\mathcal {K} _{(m,n)}\) used for evaluation makes heavy use of cardinality restrictions. Individuals of the source concept C need to be connected to at least n individuals from each concept \( A_{i} \), where \( 1 \le i \le m \), while the domain is restricted to only \(n + 1 \) elements. Finally, we impose that all concepts are pairwise disjoint. Obviously, the presence of just two such cardinality axioms already causes an inconsistency. For \(\mathcal {K} _{(m,n)} = (\mathcal {T} _{(m,n)}, \mathcal {A})\) we have:

$$\begin{aligned} \mathcal {T} _{(m,n)} =&\ \{C \sqsubseteq \;\ge n\;r.A_{1} , \ldots , C \sqsubseteq \;\ge n\;r.A_{m}\}\; \cup \\&\ \{C \sqsubseteq \lnot A_{1}, \ldots , C \sqsubseteq \lnot A_{m}\}\; \cup \\&\ \{A_{1} \sqsubseteq \lnot A_{2}, A_{1} \sqsubseteq \lnot A_{3} , \ldots , A_{m-1} \sqsubseteq \lnot A_{m}\} \\ \mathcal {A} =&\ \{C(a)\} \\ \varDelta _{n} =&\ \{a, x_{1}, ..., x_{n}\} \end{aligned}$$
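Since only the named individual a must satisfy the cardinality restrictions, consistency of \(\mathcal {K} _{(m,n)}\) reduces to a counting argument over the n anonymous elements: taking r as the full relation, a model exists iff every \( A_{i} \) can collect at least n of them, with disjointness forcing at most one label per element. A brute-force sketch of this reduction (our own illustration with a hypothetical function name):

```python
from itertools import product

def cardinality_kb_has_model(m, n):
    """Search labelings of the anonymous elements x_1,...,x_n.
    Label i (1 <= i <= m) means membership in A_i; 0 means none.
    a itself is in C and hence, by disjointness, in no A_i.
    With r the full relation, C(a) holds iff every A_i contains
    at least n of the anonymous elements."""
    for labels in product(range(m + 1), repeat=n):
        counts = [sum(1 for l in labels if l == i) for i in range(1, m + 1)]
        if all(c >= n for c in counts):
            return True
    return False
```

Already m = 2 is \(\varDelta \)-inconsistent, matching the observation that two such cardinality axioms suffice.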

The results are shown in Table 5. The black-box approach with HermiT failed to compute the justifications for any instance within the time limit. This indicates that standard reasoners struggle with cardinality restrictions under the fixed-domain semantics. We suspect this is because \( \ge \)-cardinalities are handled easily under the standard semantics, where the reasoner may simply introduce fresh individuals satisfying the restriction. While \(\mathtt {W\text {-}black\text {-}box}\) is able to solve some of the instances, \(\mathtt {W\text {-}glass\text {-}box}\) computes all of them in reasonable time.

Table 5. Runtime for individual cardinality.

The third evaluation is based on the graph-coloring problem. We encode some instances of the Mycielskian graphs. Since the chromatic number of each instance is known, making them non-colorable is trivial: for a graph with chromatic number n, we simply provide only \( n - 1 \) colors. The results are shown in Table 6.
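For illustration, the Mycielski construction and the resulting non-colorability can be checked directly for the smallest instances (a brute-force sketch with made-up function names, feasible only for tiny graphs):

```python
from itertools import product

def mycielski(edges, n_vertices):
    """Mycielskian of a graph on vertices 0..n-1: adds a shadow vertex
    n+v for each vertex v (adjacent to v's neighbors) and an apex 2n
    adjacent to all shadows; the chromatic number grows by exactly one."""
    new = set(edges)
    for (x, y) in edges:
        new |= {(x, y + n_vertices), (y, x + n_vertices)}
    apex = 2 * n_vertices
    new |= {(x + n_vertices, apex) for x in range(n_vertices)}
    return new, 2 * n_vertices + 1

def colorable(edges, n_vertices, k):
    """Exhaustively test whether a proper k-coloring exists."""
    return any(all(c[x] != c[y] for (x, y) in edges)
               for c in product(range(k), repeat=n_vertices))

# M2 is K_2 (chromatic number 2); its Mycielskian M3 is the 5-cycle C_5.
m3, n3 = mycielski({(0, 1)}, 2)
```

Providing one color fewer than the chromatic number, as in our encoding, makes every such instance non-colorable, e.g. M3 admits a 3-coloring but no 2-coloring.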

Table 6. Runtime for n-coloring problems.

Each approach exceeded the timeout for the larger instances. Similar to the cardinality evaluation, the \(\mathtt {W\text {-}glass\text {-}box}\) approach performs best. For the small instance, \(\mathtt {H\text {-}black\text {-}box}\) performs better than \(\mathtt {W\text {-}black\text {-}box}\). For the second instance, \(\mathtt {H\text {-}black\text {-}box}\) provided only one justification before the timeout, while \(\mathtt {W\text {-}black\text {-}box}\) was able to compute at least five.

As shown in Table 4, \(\mathtt {H\text {-}black\text {-}box}\) performs better in some cases. While finding justifications is a hard problem, asking for several of them at once remains feasible, and the necessary adjustments can easily be made in each tool. Note also that we used only one thread for the evaluation, although the problem itself lends itself to parallelization.

6 Conclusion

We considered the task of computing justifications for entailments under the fixed-domain semantics, a task of high importance in the ontology-modeling phase. We proposed three different approaches to this problem and evaluated them comparatively: one using standard OWL technology together with an axiomatization of the fixed-domain semantics, one using our dedicated fixed-domain reasoner Wolpertinger in combination with Protégé's explanation workbench, and one where the problem is encoded entirely into answer-set programming. The evaluation suggests that each of the proposed approaches has its own difficulties as well as individual advantages. Hence, it remains imperative to conduct more experiments with different setups. Also, all tools were used in their standard configuration, which offers another angle for optimization.

Moreover, other approaches developed for debugging answer-set programs should be considered and compared. For example, Pontelli et al. suggest a method to obtain justifications for the truth value of an atom in an answer-set [23], which might be reused in our setting to obtain an explanation for inconsistency (represented by the propositional atom \( inc \)). A different approach is the stepwise debugging methodology proposed by Oetsch et al., which allows identifying rules that prohibit a certain (partial) answer-set [20]. However, this approach is designed to answer the question why some interpretation is not an answer-set of the program; we thus leave it as future work to determine how it can be adapted to our setting. Moreover, it would be a great feature for users if a tool actually recommended automatic repairs in addition to the justifications, which might be realized by building on these related approaches.