1 Introduction

Abduction is a type of reasoning that seeks an explanation for an observation which does not follow from the information that we already have, i.e., from our knowledge base (KB). The problem of abduction was originally introduced by Peirce [19] in 1878. However, studying abduction in the context of ontologies, and hence description logics (DL), is relatively new [7]. Applications of abduction in this area include, e.g., ontology debugging [26], system malfunction diagnosis [14], multimedia interpretation [20], and medical diagnosis [21].

A number of works have studied ABox abduction, focusing on algorithms for computing explanations [1, 3, 4, 6, 8, 9, 11, 12, 16, 17, 18, 22, 23].

Notably, multiple works adopted Reiter’s Minimal Hitting Set (MHS) algorithm [25], exploiting a DL reasoner for satisfiability checking. This approach was proposed by Halland and Britz [11, 12], and later extended by Pukancová and Homola [22, 23] and Mrózek et al. [18], who developed a black-box approach and its proof-of-concept implementation, the AAA solver (AAA stands for ABox Abduction Algorithm), which we describe in this paper.

2 ABox Abduction in Description Logics

Formally speaking, an abduction problem [7] is a pair \((\mathcal {K},\mathcal {O})\) where \(\mathcal {K}\) is an ontology (i.e., a DL knowledge base) and \(\mathcal {O}\) is a set of axioms called observations. A solution of an abduction problem is another set of axioms \(\mathcal {E}\) (also called an explanation) s.t. \(\mathcal {K}\cup \mathcal {E}\models \mathcal {O}\).
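To illustrate with a small hypothetical example (the axioms are ours, chosen in the spirit of the running example in Sect. 4, not taken verbatim from it): let \(\mathcal {K}= \{\mathtt {Woman}\sqcap \mathtt {Parent}\sqsubseteq \mathtt {Mother}\}\) and \(\mathcal {O}= \{\mathtt {jane}:\mathtt {Mother}\}\). Then \(\mathcal {E}= \{\mathtt {jane}:\mathtt {Woman},\,\mathtt {jane}:\mathtt {Parent}\}\) is a solution, since \(\mathcal {K}\cup \mathcal {E}\models \mathcal {O}\).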

If \(\mathcal {O}\) and \(\mathcal {E}\) consist of TBox axioms, we speak of TBox abduction; if \(\mathcal {O}\) and \(\mathcal {E}\) consist of ABox assertions, we deal with ABox abduction [7].

Our work focuses entirely on ABox abduction, and we distinguish ABox assertions of several types. Concept assertions are of the form \(\mathtt {a}:\mathtt {C}\), where \(\mathtt {a}\) is an individual and \(\mathtt {C}\) is a concept name or possibly a complex concept description. Role assertions are of the form \(\mathtt {a},\mathtt {b}:\mathtt {R}\), and negated role assertions of the form \(\mathtt {a},\mathtt {b}:\lnot \mathtt {R}\), where \(\mathtt {a},\,\mathtt {b}\) are individuals and \(\mathtt {R}\) is a role name. Reflexive role assertions of the forms \(\mathtt {a},\mathtt {a}:\mathtt {R}\) and \(\mathtt {a},\mathtt {a}:\lnot \mathtt {R}\) are called loops.

Due to the monotonicity of DLs, the number of possible explanations is generally too high: any superset of an explanation again entails the observation. To keep the task meaningful, most works therefore focus only on (syntactically) minimal explanations, i.e., such \(\mathcal {E}\) that there is no other explanation \(\mathcal {E}'\subsetneq \mathcal {E}\).

This may still yield some undesirable explanations, hence the desired explanations are often further constrained. The constraints that we assume in our work are that all explanations should be (for an illustration, see the example after the list):

1. Consistent: i.e., \(\mathcal {K}\cup \mathcal {E}\) is consistent;

2. Relevant: i.e., \(\mathcal {E}\not \models \mathcal {O}\);

3. Explanatory: i.e., \(\mathcal {K}\not \models \mathcal {O}\).
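Returning to the hypothetical example above: \(\mathcal {E}= \{\mathtt {jane}:\mathtt {Mother}\}\) is not relevant, since it entails the observation entirely on its own; \(\mathcal {E}= \{\mathtt {jane}:\mathtt {Woman},\,\mathtt {jane}:\lnot \mathtt {Woman}\}\) is not consistent; and if \(\mathcal {K}\) itself already entailed \(\mathtt {jane}:\mathtt {Mother}\), the problem would not be explanatory, as there would be nothing left to explain.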

3 The AAA Solver

Building on the ideas of Halland and Britz [11, 12], we have developed the ABox abduction solver AAA [22, 23], based on Reiter’s MHS algorithm [25].

Let us first assume an observation \(\mathcal {O}= \{O\}\) consisting of a single ABox assertion. Reiter showed that in order to find an explanation \(\mathcal {E}\) s.t. \(\mathcal {K}\cup \mathcal {E}\models \{O\}\), it suffices to find a hitting set of the set of all models of \(\mathcal {K}\cup \{\lnot O\}\), i.e., a set \(\mathcal {E}\) that contradicts each of these models. The MHS algorithm, which he also proposed, does this by constructing a so-called HS-tree in which explanations are found as paths from the root to the leaves. Since it is often desired to find all (meaningful) explanations, and shorter explanations are preferred, the HS-tree is constructed using breadth-first search.

The solver takes advantage of this and allows the search to be depth-bounded, thus finding all explanations up to a given cardinality; the rest of the HS-tree is cut off. Our algorithm is complete up to any given depth [23].

As we need to compute the models of \(\mathcal {K}\cup \{\lnot O\}\), auxiliary calls to a DL reasoner are used. The DL reasoner is called as a black box, on the fly, during the HS-tree construction. AAA also exploits optimization techniques suggested by Reiter, such as model reuse and pruning. A sketch of the overall procedure is given below.
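The following minimal sketch (in Java) shows how such a depth-bounded, breadth-first HS-tree construction with black-box model retrieval can be organized. It is our illustration of the technique, not the AAA source code; the names (Reasoner, getModel, the string encoding of assertions) are hypothetical.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical black-box interface to the DL reasoner: given the set of
    // negated edge labels collected along a path, return a model of
    // K ∪ {¬O} consistent with them, or null if no such model exists.
    interface Reasoner {
        Set<String> getModel(Set<String> path);
    }

    class MhsSketch {

        static Set<Set<String>> explanations(Reasoner reasoner, int maxDepth) {
            Set<Set<String>> found = new HashSet<>();
            Set<Set<String>> visited = new HashSet<>();
            Deque<Set<String>> queue = new ArrayDeque<>();
            queue.add(new HashSet<>());                  // root: the empty path
            while (!queue.isEmpty()) {
                Set<String> path = queue.poll();         // BFS: shorter paths first
                if (!visited.add(path)) continue;        // pruning: duplicate path
                boolean superset = false;                // pruning: a superset of a
                for (Set<String> e : found) {            // found explanation cannot
                    if (path.containsAll(e)) {           // be minimal
                        superset = true;
                        break;
                    }
                }
                if (superset) continue;
                Set<String> model = reasoner.getModel(path);
                if (model == null) {                     // path contradicts every model,
                    found.add(path);                     // i.e., it is a hitting set:
                    continue;                            // report it as an explanation
                }
                if (path.size() >= maxDepth) continue;   // depth bound: cut the tree off
                for (String assertion : model) {         // one edge per model element,
                    Set<String> child = new HashSet<>(path);
                    child.add(negate(assertion));        // labelled with its negation
                    queue.add(child);
                }
            }
            return found;
        }

        // Toy negation on the string representation of assertions.
        static String negate(String a) {
            return a.startsWith("-") ? a.substring(1) : "-" + a;
        }
    }

Model reuse, one of Reiter’s optimizations mentioned above, would additionally skip the reasoner call whenever some previously retrieved model is already consistent with the current path.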

AAA also handles observations \(\mathcal {O}= \{O_1, \ldots , O_n\}\) consisting of a set of ABox assertions. For this case we have developed two approaches. The first one, the so-called splitting approach, computes the explanations for each \(O_i \in \mathcal {O}\) separately and then combines them. The other one, the so-called reduction approach, reduces the set of ABox assertions \(\mathcal {O}\) into a single ABox assertion \(O'\) in such a way that all the explanations of \(\mathcal {O}\) are preserved.
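As an illustration of one simple instance of such a reduction (ours, not necessarily the exact encoding AAA uses): if all observations concern the same individual, the set \(\{\mathtt {a}:C_1, \ldots , \mathtt {a}:C_n\}\) can be replaced by the single assertion \(O' = \mathtt {a}:(C_1 \sqcap \cdots \sqcap C_n)\), since \(\mathcal {K}\cup \mathcal {E}\) entails the set iff it entails the conjunction. The general case, with assertions about multiple individuals, can be encoded using nominals, which \(\mathcal {SROIQ}\) provides.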

AAA thus supports observations in the form of any set of ABox assertions (including complex concept assertions and negated role assertions). It supports explanations in the form of sets of atomic and negated atomic concept and role assertions. The supported DL expressivity, for both the input ontology and the observations, is up to \(\mathcal {SROIQ}\) [13], i.e., OWL 2 [2].

Since version 0.9 the solver also allows the user to specify abducibles, i.e., individuals, concept names, and role names to which the search for explanations is limited. If the user knows beforehand in which part of the search space they are looking for explanations, this can greatly reduce the size of the HS-tree and thus the overall computation time.

4 Running AAA

Let us consider the following ontology (given in Manchester syntax), describing family relations, as our input knowledge base \(\mathcal {K}\):

[figure a: the input ontology in Manchester syntax]

The goal for AAA is to compute explanations for the observation “Jane is a mother”, i.e., for the observation \(\mathcal {O}= \{\mathtt {jane}:\mathtt {Mother}\}\). To run the solver, the input file (the ontology) and the output file must be specified (via the -i and -out switches), and the observation is given on the command line using the -obs switch. There are some additional optional switches, which will be explained below. The example command line is as follows:

[figure b: the example command line]
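Since the original listing survives only as figure b, here is a plausible reconstruction. The switches -i, -out, -obs, -d, and -l are documented in the text (Sects. 4.1 and 4.2 indicate that this first run used both -d and -l); the jar name, file names, depth value, and observation syntax are placeholders of ours:

    java -jar AAA.jar -i family.owl -out family-out.txt -obs "jane:Mother" -d 3 -l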

After running the command, AAA computes the explanations and writes them to the output file. Apart from some statistics, such as the computation times for the individual levels of the run, all the explanations are listed at the end of the file. An excerpt from the output file:

[figure c: an excerpt from the output file, listing the explanations and per-level time statistics]

We can see that the solver has found six explanations in this case.

4.1 Restricting MHS Depth

The switch -d (already used in the example above) allows the user to limit the search to a specific depth of the HS-tree, i.e., to search for explanations only up to a given cardinality.

The main reason to use this restriction is that, with greater depth, exploring each consecutive level of the HS-tree becomes exponentially more time-consuming (as apparent from the time statistics in the example above). Also, while our algorithm only returns minimal explanations, even among these, smaller explanations (in terms of cardinality) are often preferable.

For example, from the six explanations in the listing above, only the explanations of size 1 (containing a single assertion) may be desired. In such a case we modify the command as follows:

[figure d: the command line with the search depth limited to 1]
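A hedged reconstruction (same placeholder names as before; only the switch -d 1 is prescribed by the text):

    java -jar AAA.jar -i family.owl -out family-out.txt -obs "jane:Mother" -d 1 -l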

In this case only one explanation is found:

[figure e: the output listing the single explanation of size 1]

4.2 Avoiding Loops

There are some unintuitive explanations amongst the six found in the first example run above. Depending on the domain and the application, it may not be desirable to explain the observed phenomenon by reflexive relations of the form \(\mathtt {a},\mathtt {a}:\mathtt {R}\) or \(\mathtt {a},\mathtt {a}:\lnot \mathtt {R}\). For instance, in our example we do not want to explain the observation that Jane is a mother by the explanation that she is a child of herself.

The switch -l enables loops (i.e., reflexive role assertions), which are disallowed by default under the assumption that users would only rarely look for such explanations; the first example run above therefore included -l. The following example demonstrates a run of AAA without loops.

[figure f: the command line without the -l switch]
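A hedged reconstruction (placeholders as before, with -l omitted so that the default, loop-free behaviour applies):

    java -jar AAA.jar -i family.owl -out family-out.txt -obs "jane:Mother" -d 3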

AAA now disregards loops, hence out of the six original explanations of the observation \(\mathcal {O}= \{\mathtt {jane}:\mathtt {Mother}\}\), only four remain:

[figure g: the output listing the four remaining loop-free explanations]

If loops are undesired, they should always be omitted from the computation, as this also reduces the search space and hence decreases the computation time. This can also be observed in our example runs above.

4.3 Introducing Abducibles

Another meaningful way to narrow the search space is to restrict the individuals, concepts, and roles that may occur in the explanations to a certain set, called abducibles. Especially in bigger ontologies, the user may specifically be interested only in explanations involving certain entities.

Abducibles can be specified using the switch -abd as an enumeration of symbols allowed in the explanations. In the example below, the abducibles are restricted to the individual \(\mathtt {jane}\) and the concepts \(\mathtt {Woman}\), \(\mathtt {Man}\), \(\mathtt {Parent}\), and \(\mathtt {Grandmother}\):

[figure h: the command line with the -abd switch]
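A hedged reconstruction (the -abd switch and the listed symbols come from the text; the exact enumeration syntax is a guess of ours):

    java -jar AAA.jar -i family.owl -out family-out.txt -obs "jane:Mother" -d 3 -abd "jane,Woman,Man,Parent,Grandmother"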

When AAA is run on this input, it returns the following output:

[figure i: the corresponding output]

As apparent from the example, providing a reasonable set of abducibles greatly reduces the search space, which is now limited to all possible (atomic and negated) ABox assertions that can be constructed from the provided individuals, concept names, and role names. This can significantly reduce the computation time; however, the user must have some prior idea of where to look for the explanations.

4.4 Multiple Observations

The AAA solver notably allows for observations consisting of multiple assertions. For example, we may want to find explanations that explain both \(\mathtt {jane}:\mathtt {Mother}\) and \(\mathtt {tarzan}:\mathtt {Child}\). We may do so using the following command:

[figure j: the command line with two observations]
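A hedged reconstruction (the way multiple observations are passed to -obs is a guess of ours; loops are left disabled, i.e., no -l):

    java -jar AAA.jar -i family.owl -out family-out.txt -obs "jane:Mother;tarzan:Child" -d 2

Appending the -r switch to this command would select the reduction approach described below.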

Note that in order to avoid nonsensical explanations with loops (8 of them in this case), we left loops disabled. We obtain the following output:

[figure k: the output for the splitting approach]

As mentioned above, AAA supports two different approaches to handle multiple observations. This run was obtained by the so-called splitting approach (the default), in which the algorithm is run separately for each element of the observation set, and the results are then combined. We can see in TIME DETAILS that the solver has indeed constructed two HS-trees.

Using the switch -r we can enable the so-called reduction approach, which reduces the observation set into a single ABox assertion and runs the MHS algorithm only once, on the reduced input. The output for this approach is as follows:

[figure l: the output for the reduction approach]

As we can see, in this case the reduction approach is more efficient than splitting, but this is not always the case.

5 Implementation Details

The AAA solver is implemented in Java (version 1.8). It has two main components: the implementation of the MHS algorithm and a DL reasoner (Pellet [28]), which is called by the MHS component for consistency checks and for model retrieval. The Pellet reasoner calls are tightly integrated at the source-code level. Pellet is an open-source OWL DL reasoner implemented in Java; we have used Pellet 2.3.1, the latest open-source version.

The advantage of using the DL reasoner as a black box is that AAA consequently supports the same DL expressivity as the DL reasoner. Pellet features full OWL 2 support, which means that any OWL 2 ontology may be used with our algorithm.
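As an illustration of what such a black-box reasoner call looks like, here is a minimal sketch using the OWL API bindings shipped with Pellet 2.x. This is our hedged illustration of the black-box principle, not the AAA source code (which, as noted above, integrates Pellet at the source-code level); the file name is a placeholder.

    import java.io.File;

    import org.semanticweb.owlapi.apibinding.OWLManager;
    import org.semanticweb.owlapi.model.OWLOntology;
    import org.semanticweb.owlapi.model.OWLOntologyManager;
    import org.semanticweb.owlapi.reasoner.OWLReasoner;

    import com.clarkparsia.pellet.owlapiv3.PelletReasonerFactory;

    public class BlackBoxCheck {
        public static void main(String[] args) throws Exception {
            // Load the ontology K (file name is a placeholder).
            OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
            OWLOntology ontology =
                manager.loadOntologyFromOntologyDocument(new File("family.owl"));
            // Create a Pellet reasoner behind the generic OWLReasoner interface;
            // any OWL API-compliant reasoner could be plugged in here instead.
            OWLReasoner reasoner =
                PelletReasonerFactory.getInstance().createReasoner(ontology);
            // The kind of call the MHS component needs: is K (∪ E ∪ {¬O}) consistent?
            System.out.println("Consistent: " + reasoner.isConsistent());
        }
    }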

In the unrestricted case (when no abducibles are specified), the solver is able to process a single observation w.r.t. the LUBM ontology [10] within seconds (up to depth 2), within minutes (up to depth 3), and within hours (up to depth 4) [24].

AAA is available for download at: http://dai.fmph.uniba.sk/~pukancova/aaa/.

6 Conclusions and Future Work

We have provided an overview of AAA, an ABox abduction solver for description logics. AAA is sound and complete for DLs up to \(\mathcal {SROIQ}\) (i.e., OWL 2). It is implemented in Java, it is based on Reiter’s Minimal Hitting Set algorithm, and it uses an external DL reasoner (Pellet) which is called as a black box.

AAA integrates Pellet directly at the source-code level; however, experimental versions exploiting the OWL API were also explored [9, 18]. We would like to pursue this line in the future and to provide a stable version based on the OWL API that would enable the user to plug in different DL reasoners. We also plan to implement different abduction strategies that could serve as alternatives to MHS, including some that are incomplete (but very fast), such as MergeXplain [9, 27].