
1 Introduction

Existential rules are a versatile knowledge representation language with relevance in ontological reasoning [1, 5, 6, 10], databases [11, 13, 15], and declarative computing in general [3, 4, 9]. In various semantic web applications, existential rule engines have been used to process knowledge graphs and ontologies, often realising performance advantages on large data sets [2, 3, 7, 22].

Existential rules extend Datalog with the facility for value invention, expressed by existentially quantified variables in rule conclusions. This ability to refer to “unknown” values is an important similarity to description logics (DLs) and the DL-based ontology standard OWL, and many such ontologies can equivalently be expressed in existential rules, which makes the latter a practical approach for ontology-based query answering [8, 10]. For reasoning, many rule engines rely on materialisation, where the input data is expanded iteratively until all rules are satisfied (this type of computation is called the chase). With existentials, this can require adding new “anonymous” individuals – called nulls – and the process may not terminate. Several acyclicity conditions define cases where termination is ensured, and these were shown to apply to many practical ontologies [10].

Nulls correspond to blank nodes in RDF, and – like bnodes in RDF [20] – are not always desirable. Avoiding nulls entirely is not an option in chase-based reasoning, but one can still avoid some “semantically redundant” nulls. For example, given a fact \(\textsf {person}(\textsf {alice})\) and a rule \(\textsf {person}(x)\rightarrow \exists y.\, \textsf {parent}(x,y)\), the chase would derive \(\textsf {parent}(\textsf {alice},n)\) for a fresh null n. However, if we already know that \(\textsf {parent}(\textsf {alice},\textsf {bob})\), then this inference is redundant and can be omitted. In general, structures that are free of such redundancies are known as cores, and an RDF graph that is a core is called a lean graph [16]. Unfortunately, the computation of cores is expensive and cannot, in general, be afforded during the chase. However, when rules satisfy a condition known as core stratification, practical chase algorithms can produce a core directly [17].

Interestingly, both of the previously mentioned types of conditions – acyclicity and core stratification – are detected by analysing dependencies that indicate possible semantic interactions between rules. Early works focussed on cases where a rule \(\rho _2\) positively relies on a rule \(\rho _1\) in the sense that an application of rule \(\rho _1\) might trigger an application of rule \(\rho _2\). Such reliances are used to detect several forms of acyclicity [1, 11, 21]. When adding negation, a rule might also inhibit another, and such negative reliances are used to define semantically well-behaved fragments of nonmonotonic existential rules [17, 19]. A third kind of dependency is the restraint, which indicates that the application of one rule might render another one redundant: restraints were used to define core stratified rule sets [17], and recently also to define a semantics for queries with negation [12].

Surprisingly, given this breadth of applications, rule dependencies are hardly supported in practice. To our knowledge, positive reliances are only computed by the Graal toolkit [2], whereas negative reliances and restraints have no implementation at all. A possible reason is that such dependency checks are highly intractable, typically \(\varSigma _{\textsc {2}}^{\textsc {P}}\)-complete, and therefore not easy to implement efficiently. This is critical since their proposed uses are often related to the choice of a rule-processing strategy, so that their computation adds to overall reasoning time. Moreover, as opposed to many other static analyses, dependency computation is not mainly an application of algorithms that are already used in rule reasoning. Today’s use of dependencies in optimisation and analysis therefore falls short of expectations.

To address this problem, we design optimised algorithms for the computation of positive reliances and restraints. We propose global optimisations, reducing the number of relevant checks, and local optimisations, reducing the work needed to execute a specific check. The latter include an improved search strategy that often avoids the full exploration of exponentially many subsets of rule atoms, which may be necessary in the worst case. The underlying ideas can also be adapted to negative reliances and any of the modified definitions of positive reliances found in the literature.

We implement our methods and conduct extensive experiments with over 200 real-world ontologies of varying sizes. Considering the effectiveness of our optimisations, we find that local and global techniques both make important contributions to overall performance, enabling various practical uses:

  • We conduct the first analysis of the practical prevalence of core stratification [17] using our implementation of restraints. We find this desirable property in a significant share of ontologies from a curated repository and provide preliminary insights on why some rule sets are not core stratified.

  • Comparing the computation of all positive reliances to Graal, we see speed-ups of more than two orders of magnitude. Our stronger definition yields an acyclic graph of rule dependencies [1] in more cases.

  • The graph of positive reliances allows us to speed up the expensive rule analysis algorithm MFA [10]. Compared to the MFA implementation of VLog [7], we observe speed-ups of up to four orders of magnitude.

2 Preliminaries

We build expressions from countably infinite, mutually disjoint sets \(\textbf{V}\) of variables, \(\textbf{C}\) of constants, \(\textbf{N}\) of labelled nulls, and \(\textbf{P}\) of predicate names. Each predicate name \(p\in \textbf{P}\) has an arity \(\textsf{ar}(p)\ge 0\). Terms are elements of \(\textbf{V}\cup \textbf{N}\cup \textbf{C}\). We use \(\boldsymbol{t}\) to denote a list \(t_1,\ldots ,t_{|\boldsymbol{t}|}\) of terms, and similarly for lists of special types of terms. An atom is an expression \(p(\boldsymbol{t})\) with \(p\in \textbf{P}\), \(\boldsymbol{t}\) a list of terms, and \(\textsf{ar}(p)=|\boldsymbol{t}|\). Ground terms or atoms contain neither variables nor nulls. An interpretation \(\mathcal {I}\) is a set of atoms without variables. A database \(\mathcal {D}\) is a finite set of ground atoms.

Syntax. An existential rule (or just rule) \(\rho \) is a formula

$$\begin{aligned} \rho = \forall \boldsymbol{x}, \boldsymbol{y}.\, \varphi [\boldsymbol{x}, \boldsymbol{y}] \rightarrow \exists \boldsymbol{z}.\, \psi [\boldsymbol{y}, \boldsymbol{z}], \end{aligned}$$
(1)

where \(\varphi \) and \(\psi \) are conjunctions of atoms using only terms from \(\textbf{C}\) or from the mutually disjoint lists of variables \(\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{z}\subseteq \textbf{V}\). We call \(\varphi \) the body (denoted \(\textsf{body}(\rho )\)) and \(\psi \) the head (denoted \(\textsf{head}(\rho )\)). We may treat conjunctions of atoms as sets, and we omit universal quantifiers in rules. We require that all variables in \(\boldsymbol{y}\) actually occur in \(\varphi \) (safety). A rule is Datalog if it has no existential quantifiers.

Semantics. Given a set of atoms \(\mathcal {A}\) and an interpretation \(\mathcal {I}\), a homomorphism \(h:\mathcal {A}\rightarrow \mathcal {I}\) is a function that maps the terms occurring in \(\mathcal {A}\) to the (variable-free) terms occurring in \(\mathcal {I}\), such that: (i) for all \(c\in \textbf{C}\), \(h(c)=c\); (ii) for all \(p\in \textbf{P}\), \(p(\boldsymbol{t})\in \mathcal {A}\) implies \(p(h(\boldsymbol{t}))\in \mathcal {I}\), where \(h(\boldsymbol{t})\) is the list of h-images of the terms \(\boldsymbol{t}\). If (ii) can be strengthened to an “if, and only if”, then h is strong. We apply homomorphisms to a formula by applying them individually to all of its terms.
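To make this concrete, the following minimal C++ sketch (our own illustration, not part of the paper's implementation) decides the existence of a homomorphism between two small atom sets by backtracking. Terms are plain strings; names starting with '?' are treated as variables, and all other terms, including nulls, are kept fixed, which suffices for matching rule bodies, as bodies contain no nulls.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Atom {
    std::string pred;
    std::vector<std::string> args;
};

using Subst = std::map<std::string, std::string>;

bool isVar(const std::string& t) { return !t.empty() && t[0] == '?'; }

// Extends h so that every atom A[i], A[i+1], ... has an h-image in I,
// backtracking over candidate target atoms with matching predicate.
bool extend(const std::vector<Atom>& A, size_t i,
            const std::vector<Atom>& I, Subst& h) {
    if (i == A.size()) return true;
    for (const Atom& target : I) {
        if (target.pred != A[i].pred || target.args.size() != A[i].args.size())
            continue;
        Subst saved = h;  // remember bindings for backtracking
        bool ok = true;
        for (size_t j = 0; ok && j < A[i].args.size(); ++j) {
            const std::string& s = A[i].args[j];
            const std::string& t = target.args[j];
            if (!isVar(s)) { ok = (s == t); continue; }  // constants stay fixed
            auto it = h.find(s);
            if (it == h.end()) h[s] = t;   // bind a fresh variable
            else ok = (it->second == t);   // respect earlier bindings
        }
        if (ok && extend(A, i + 1, I, h)) return true;
        h = saved;  // undo and try the next target atom
    }
    return false;
}

int main() {
    // body = { person(?x) }, I = { person(alice) }: a match exists.
    std::vector<Atom> body = {{"person", {"?x"}}};
    std::vector<Atom> I = {{"person", {"alice"}}};
    Subst h;
    std::cout << (extend(body, 0, I, h) ? "match" : "no match") << "\n";
}
```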

A match of a rule \(\rho \) in an interpretation \(\mathcal {I}\) is a homomorphism \(\textsf{body}(\rho )\rightarrow \mathcal {I}\). A match h of \(\rho \) in \(\mathcal {I}\) is satisfied if there is a homomorphism \(h':\textsf{head}(\rho )\rightarrow \mathcal {I}\) that agrees with h on all variables that occur in body and head (i.e., variables \(\boldsymbol{y}\) in (1)). Rule \(\rho \) is satisfied by \(\mathcal {I}\), written \(\mathcal {I}\models \rho \), if every match of \(\rho \) in \(\mathcal {I}\) is satisfied. A set of rules \(\varSigma \) is satisfied by \(\mathcal {I}\), written \(\mathcal {I}\models \varSigma \), if \(\mathcal {I}\models \rho \) for all \(\rho \in \varSigma \). We write \(\mathcal {I}\models \mathcal {D},\varSigma \) to express that \(\mathcal {I}\models \varSigma \) and \(\mathcal {D}\subseteq \mathcal {I}\). In this case, \(\mathcal {I}\) is a model of \(\varSigma \) and \(\mathcal {D}\).

Applying Rules. A rule \(\rho \) of form (1) is applicable to an interpretation \(\mathcal {I}\) if there is an unsatisfied match h in \(\mathcal {I}\) (i.e., h cannot be extended to a homomorphism \(\psi \rightarrow \mathcal {I}\)). Applying \(\rho \) for h yields the interpretation \(\mathcal {I}\cup \psi [h'(\boldsymbol{y}),h'(\boldsymbol{z})]\), where \(h'\) is a mapping such that \(h'(y)=h(y)\) for all \(y\in \boldsymbol{y}\), and for all \(z\in \boldsymbol{z}\), \(h'(z)\in \textbf{N}\) is a distinct null not occurring in \(\mathcal {I}\). The (standard) chase is a reasoning algorithm obtained by applying rules to a given initial database, such that all applicable rules are eventually applied (fairness).
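For illustration, a single chase step can be sketched as follows (again our own simplified code: names like applyRule are hypothetical, terms are strings, and the unsatisfied body match is assumed to be given, e.g., found with the homomorphism sketch above). Existential variables are mapped to fresh nulls before the instantiated head atoms are added.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Atom { std::string pred; std::vector<std::string> args; };
using Subst = std::map<std::string, std::string>;

int nullCounter = 0;  // global counter for fresh null names

// Applies a rule for a given body match h: existential variables are
// instantiated with fresh nulls, then the head atoms are added to I.
void applyRule(const std::vector<Atom>& head,
               const std::vector<std::string>& existVars,
               Subst h, std::vector<Atom>& I) {
    for (const auto& z : existVars)
        h[z] = "_:n" + std::to_string(nullCounter++);  // fresh null
    for (Atom a : head) {
        for (auto& t : a.args)
            if (h.count(t)) t = h[t];  // instantiate variables
        I.push_back(a);
    }
}

int main() {
    // person(alice) with person(x) -> E y. parent(x,y), match {?x -> alice}.
    std::vector<Atom> I = {{"person", {"alice"}}};
    applyRule({{"parent", {"?x", "?y"}}}, {"?y"}, {{"?x", "alice"}}, I);
    for (const auto& a : I) {
        std::cout << a.pred << "(";
        for (const auto& t : a.args) std::cout << t << " ";
        std::cout << ")\n";  // prints person(alice) and parent(alice _:n0)
    }
}
```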

Core Models. A model \(\mathcal {I}\) is a core if every homomorphism \(h:\mathcal {I}\rightarrow \mathcal {I}\) is strong and injective. For finite models, this is equivalent to the requirement that every such homomorphism is an isomorphism, and this will be the only case we are interested in for this work. Intuitively, the condition states that the model does not contain a strictly smaller substructure that is semantically equivalent for conjunctive query answering.

Unification. For atom sets \(\mathcal A\) and \(\mathcal B\), a partial function \(m :\mathcal A\rightarrow \mathcal B\) is an atom mapping, where \(\textsf{dom}(m)\subseteq \mathcal A\) is the set of all atoms for which m is defined. A substitution is a function \(\theta :\textbf{C}\cup \textbf{V}\cup \textbf{N}\rightarrow \textbf{C}\cup \textbf{V}\cup \textbf{N}\) such that \(\theta (c)=c\) for all \(c\in \textbf{C}\cup \textbf{N}\). We denote the application of \(\theta \) to a term t by \(t\theta \), and extend this notation to atoms and atom sets by term-wise application. The concatenation of substitutions \(\sigma \) and \(\theta \) is the substitution \(\sigma \theta \) with \(t\sigma \theta = (t\sigma )\theta \). A substitution \(\theta \) is a unifier for an atom mapping m if \(\alpha \theta = (m(\alpha ))\theta \) for all \(\alpha \in \textsf{dom}(m)\). A unifier \(\mu \) for m is a most general unifier (mgu) for m if for every unifier \(\nu \) of m there is a substitution \(\sigma \) such that \(\mu \sigma =\nu \).
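Because the only terms are variables, constants, and nulls (no function symbols), an mgu can be computed with a simple union-find over terms, as in this illustrative sketch (our own, with hypothetical naming): each class of unified terms may contain at most one non-variable, which then acts as the representative; two distinct non-variables in one class mean the mapping is not unifiable.

```cpp
#include <iostream>
#include <map>
#include <optional>
#include <string>
#include <vector>

struct Atom { std::string pred; std::vector<std::string> args; };
bool isVar(const std::string& t) { return !t.empty() && t[0] == '?'; }

struct UnionFind {
    std::map<std::string, std::string> parent;
    std::string find(const std::string& t) {
        auto it = parent.find(t);
        if (it == parent.end() || it->second == t) return t;
        return parent[t] = find(it->second);  // path compression
    }
    // Returns false on a clash between two distinct non-variables.
    bool unite(const std::string& a, const std::string& b) {
        std::string ra = find(a), rb = find(b);
        if (ra == rb) return true;
        if (!isVar(ra) && !isVar(rb)) return false;  // constant/null clash
        if (isVar(ra)) parent[ra] = rb; else parent[rb] = ra;
        return true;
    }
};

// Computes an mgu for the pairs (alpha, m(alpha)), or nullopt.
std::optional<std::map<std::string, std::string>>
mgu(const std::vector<std::pair<Atom, Atom>>& m) {
    UnionFind uf;
    for (const auto& [a, b] : m) {
        if (a.pred != b.pred || a.args.size() != b.args.size())
            return std::nullopt;
        for (size_t i = 0; i < a.args.size(); ++i)
            if (!uf.unite(a.args[i], b.args[i])) return std::nullopt;
    }
    std::map<std::string, std::string> theta;
    for (const auto& entry : uf.parent)
        theta[entry.first] = uf.find(entry.first);
    return theta;
}

int main() {
    // Unify r(?y,?z1) with r(?x,?v): all four variables become unifiable.
    auto theta = mgu({{{"r", {"?y", "?z1"}}, {"r", {"?x", "?v"}}}});
    std::cout << (theta ? "unifiable" : "not unifiable") << "\n";
}
```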

3 Dependencies and Their Naive Computation

We first introduce the two kinds of rule dependencies that we consider: positive reliances and restraints. Our definitions largely agree with the literature, but there are some small differences that we comment on.

Definition 1

A rule \(\rho _2\) positively relies on a rule \(\rho _1\), written \(\rho _1\mathrel {\prec ^{+}}\rho _2\), if there are interpretations \(\mathcal {I}_a\subseteq \mathcal {I}_b\) and a function \(h_2\) such that

  (a) \(\mathcal {I}_b\) is obtained from \(\mathcal {I}_a\) by applying \(\rho _1\) for the match \(h_1\) extended to \(h_1'\),

  (b) \(h_2\) is an unsatisfied match for \(\rho _2\) on \(\mathcal {I}_b\), and

  (c) \(h_2\) is not a match for \(\rho _2\) on \(\mathcal {I}_a\).

Definition 1 describes a situation where an application of \(\rho _1\) immediately enables a new application of \(\rho _2\). Condition (b) takes into account that only unsatisfied matches can lead to rule applications in the standard chase. The same condition is used by Krötzsch [17], whereas Baget et al. [1, 2] – using what they call piece-unifier – only require \(h_2\) to be a match. In general, weaker definitions are not incorrect, but may lead to unnecessary dependencies.

Example 1

Consider the following ontology. We provide three axioms in DL syntax (left-hand side) and their translation into existential rules (right-hand side).

[Figure: three axioms in DL syntax (left) and their translations into existential rules \(\rho _1\), \(\rho _2\), \(\rho _3\) (right)]

For this rule set, we find \(\rho _1 \mathrel {\prec ^{+}}\rho _2\) by using \(\mathcal {I}_a = \{ a(c) \}\), \(\mathcal {I}_b = \{ a(c), r(c,n) \}\), and \(h_2 = \{ y\mapsto c, z_1\mapsto n, z_2\mapsto n \}\). Note that \(\rho _3\) does not positively rely on \(\rho _1\) although the application of \(\rho _1\) may lead to a new match for \(\rho _3\). However, this match is always satisfied, so condition (b) of Definition 1 is not fulfilled.

The definition of restraints considers situations where the nulls introduced by applying rule \(\rho _2\) are at least in part rendered obsolete by a later application of \(\rho _1\). This obsolescence is witnessed by an alternative match that specifies a different way of satisfying the rule match of \(\rho _2\).

Definition 2

Let \(\mathcal {I}_a\subseteq \mathcal {I}_b\) be interpretations such that \(\mathcal {I}_a\) was obtained by applying the rule \(\rho \) for match h which is extended to \(h'\). A homomorphism \(h^A:h'(\textsf{head}(\rho ))\rightarrow \mathcal {I}_b\) is an alternative match of \(h'\) and \(\rho \) on \(\mathcal {I}_{b}\) if

  (1) \(h^A(t)=t\) for all terms t in \(h(\textsf{body}(\rho ))\), and

  (2) there is a null n in \(h'(\textsf{head}(\rho ))\) that does not occur in \(h^A(h'(\textsf{head}(\rho )))\).

Now \(\rho _1\) restrains \(\rho _2\) if it creates an alternative match for it:

Definition 3

A rule \(\rho _1\) restrains a rule \(\rho _2\), written \(\rho _1\mathrel {\prec ^{\square }}\rho _2\), if there are interpretations \(\mathcal {I}_a\subseteq \mathcal {I}_b\) such that

  (a) \(\mathcal {I}_b\) is obtained by applying \(\rho _1\) for match \(h_1\) extended to \(h_1'\),

  (b) \(\mathcal {I}_a\) is obtained by applying \(\rho _2\) for match \(h_2\) extended to \(h_2'\),

  (c) there is an alternative match \(h^A\) of \(h_2'\) and \(\rho _{2}\) on \(\mathcal {I}_{b}\), and

  (d) \(h^A\) is no alternative match of \(h_2'\) and \(\rho _{2}\) on \(\mathcal {I}_b\setminus h_1'(\textsf{head}(\rho _1))\).

Our definition deviates slightly from the literature [17], where (d) made a stronger requirement:

  (d’) \(h_2\) has no alternative match \(h_2'(\textsf{head}(\rho _2))\rightarrow \mathcal {I}_b\setminus h_1'(\textsf{head}(\rho _1))\).

As we will see, our modification allows for a much more efficient implementation, but it also leads to more restraints. Since restraints overestimate potential interactions during the chase anyway, all formal results of prior works are preserved.

Example 2

For the rules \(\rho _1= r(y,y)\rightarrow \exists w.\, r(y,w)\wedge b(w)\) and \(\rho _2= a(x)\rightarrow \exists v.\, r(x,v)\), we find \(\rho _1\mathrel {\prec ^{\square }}\rho _2\) by Definition 3, where we set \(\mathcal {I}_a=\{a(c),r(c,n_1)\}\), \(\mathcal {I}_b=\mathcal {I}_a\cup \{r(c,c),r(c,n_2),b(n_2)\}\), and \(h^A=\{c\mapsto c, n_1\mapsto n_2\}\). However, these \(\mathcal {I}_a\) and \(\mathcal {I}_b\) do not satisfy the stricter condition (d’), since \(h^B=\{c\mapsto c, n_1\mapsto c\}\) is an alternative match, too. Indeed, when \(\rho _2\) is applicable in such a way as to produce an alternative match w.r.t. an application of \(\rho _1\), another one must have already existed.

Example 2 is representative of situations where (d) leads to different restraints than (d’): the body of the restraining rule \(\rho _1\) must contain a pattern that enforces an additional alternative match (here: \(r(y,y)\)), while not being satisfiable by the conclusion of \(\rho _2\) (here: \(r(y,n_1)\)). To satisfy the remaining conditions, \(\textsf{head}(\rho _1)\) must further produce a (distinct) alternative match. Such situations are very rare in practice, so the benefits of (d) outweigh the loss of generality.

Checking for positive reliances and restraints is \(\varSigma _{\textsc {2}}^{\textsc {P}}\)-complete. Indeed, we can assume \(\mathcal {I}_a\) and \(\mathcal {I}_b\) to contain at most as many elements as there are distinct terms in the rules, so that they can be guessed in polynomial time. The remaining conditions can be checked by an \(\textsc {NP} \)-oracle. Hardness follows from the \(\varSigma _{\textsc {2}}^{\textsc {P}}\)-hardness of deciding if a rule has an unsatisfied match [15].

The existence of alternative matches in a chase sequence indicates that the resulting model may contain redundant nulls. Ordering the application of rules during the chase in a way that obeys the restraint relationship (\(\mathrel {\prec ^{\square }}\)) ensures that the chase sequence does not contain any alternative matches and therefore results in a core model [17].

Example 3

Consider again the rule set from Example 1. For the interpretation \(\mathcal {I}_0 = \{a(c), r(c, d)\}\) all three rules are applicable. Disregarding \(\rho _3 \mathrel {\prec ^{\square }}\rho _1\) and applying \(\rho _1\) first results in \(\mathcal {I}_1 = \mathcal {I}_0 \cup \{r(c, n), b(n)\}\), which leads to the alternative match \(h^A = \{c \mapsto c, n \mapsto d\}\) after applying \(\rho _3\). If we, on the other hand, start with \(\rho _3\), we obtain \(\mathcal {I}_1' = \mathcal {I}_0 \cup \{b(d)\}\). Rule \(\rho _1\) is now satisfied and the computation finishes with a core model after applying \(\rho _2\).

The ontology from Example 1 is an example of a core stratified rule set. A set of rules is core stratified if the graph of all \(\mathrel {\prec ^{+}}\cup \mathrel {\prec ^{\square }}\) edges does not have a cycle that includes a \(\mathrel {\prec ^{\square }}\) edge. This property allows us to formulate a rule application strategy that respects the restraint relationship as follows: given \(\rho _1 \mathrel {\prec ^{\square }}\rho _2\), apply the restrained rule \(\rho _2\) only if neither \(\rho _1\) nor any rule that \(\rho _1\) directly or indirectly positively relies on is applicable.
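Once both relations have been computed, the cycle condition itself is cheap to test: a rule set is core stratified iff no \(\mathrel {\prec ^{\square }}\) edge connects two rules in the same strongly connected component of the combined graph. The following is a minimal sketch (our own, assuming rules are numbered \(0,\dots ,n-1\) and both relations are given as edge lists):

```cpp
#include <algorithm>
#include <functional>
#include <iostream>
#include <utility>
#include <vector>

using Edges = std::vector<std::pair<int, int>>;

// Assigns every node an SCC id via Tarjan's algorithm.
std::vector<int> sccIds(int n, const Edges& edges) {
    std::vector<std::vector<int>> adj(n);
    for (auto [u, v] : edges) adj[u].push_back(v);
    std::vector<int> idx(n, -1), low(n, 0), id(n, -1), stack;
    std::vector<bool> onStack(n, false);
    int counter = 0, sccCount = 0;
    std::function<void(int)> dfs = [&](int u) {
        idx[u] = low[u] = counter++;
        stack.push_back(u);
        onStack[u] = true;
        for (int v : adj[u]) {
            if (idx[v] == -1) { dfs(v); low[u] = std::min(low[u], low[v]); }
            else if (onStack[v]) low[u] = std::min(low[u], idx[v]);
        }
        if (low[u] == idx[u]) {  // u is the root of a component
            for (int v = -1; v != u; ) {
                v = stack.back(); stack.pop_back();
                onStack[v] = false;
                id[v] = sccCount;
            }
            ++sccCount;
        }
    };
    for (int u = 0; u < n; ++u)
        if (idx[u] == -1) dfs(u);
    return id;
}

// Core stratified iff no restraint edge lies within one SCC of the union
// of both relations (i.e., no cycle contains a restraint edge).
bool coreStratified(int n, const Edges& positive, const Edges& restraints) {
    Edges all = positive;
    all.insert(all.end(), restraints.begin(), restraints.end());
    std::vector<int> id = sccIds(n, all);
    for (auto [u, v] : restraints)
        if (id[u] == id[v]) return false;
    return true;
}

int main() {
    // Example 7 below (rho_1 = 0, rho_2 = 1, rho_3 = 2):
    // positive: rho_1 <+ rho_3, rho_2 <+ rho_3, rho_3 <+ rho_2;
    // restraint: rho_3 <[] rho_1. The set is not core stratified.
    Edges pos = {{0, 2}, {1, 2}, {2, 1}};
    Edges res = {{2, 0}};
    std::cout << (coreStratified(3, pos, res) ? "yes" : "no") << "\n";  // no
}
```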

4 Computing Positive Reliances

The observation that positive reliances can be decided in \(\varSigma _{\textsc {2}}^{\textsc {P}}\) is based on an algorithm that considers all possible sets \(\mathcal {I}_a\) and \(\mathcal {I}_b\) up to a certain size. This is not practical, in particular for uses where dependencies need to be computed as part of the (performance-critical) reasoning, and we therefore develop a more goal-oriented approach.

In the following, we consider two rules \(\rho _1\) and \(\rho _2\) of form \(\rho _i = \textsf{body}_i\rightarrow \exists \boldsymbol{z}_i.\, \textsf{head}_i\), with variables renamed so that no variable occurs in both rules. Let \(\textbf{V}_\forall \) and \(\textbf{V}_\exists \), respectively, denote the sets of universally and existentially quantified variables in \(\rho _1\) and \(\rho _2\). A first insight is that the sets \(\mathcal {I}_a\) and \(\mathcal {I}_b\) of Definition 1 can be assumed to contain only atoms that correspond to atoms in \(\rho _1\) and \(\rho _2\), with distinct universal or existential variables replaced by distinct constants or nulls, respectively. For this replacement, we fix a substitution \(\omega \) that maps each variable in \(\textbf{V}_\exists \) to a distinct null, and each variable in \(\textbf{V}_\forall \) to a distinct constant that does not occur in \(\rho _1\) or \(\rho _2\).
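The substitution \(\omega \) itself is straightforward to construct; the following small sketch (hypothetical naming) assumes that the generated constant names \(c_0, c_1, \dots \) do not occur in the two rules.

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>

// Builds omega: every universal variable is sent to a distinct fresh
// constant, every existential variable to a distinct fresh null.
std::map<std::string, std::string> makeOmega(
        const std::set<std::string>& universalVars,
        const std::set<std::string>& existentialVars) {
    std::map<std::string, std::string> omega;
    int c = 0, n = 0;
    for (const auto& x : universalVars)
        omega[x] = "c" + std::to_string(c++);    // assumed fresh constants
    for (const auto& z : existentialVars)
        omega[z] = "_:n" + std::to_string(n++);  // fresh labelled nulls
    return omega;
}

int main() {
    auto omega = makeOmega({"?x", "?y"}, {"?v"});
    for (const auto& [var, term] : omega)
        std::cout << var << " -> " << term << "\n";
}
```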

[Algorithm 1]

A second insight is that, by (c), \(\rho _1\) must produce some atoms that are relevant for a match of \(\rho _2\), so that our algorithm can specifically search for a mapped subset \(\textsf{body}^m_2\subseteq \textsf{body}_2\) and a substitution \(\eta \) such that \(\textsf{body}^m_2\eta \subseteq \textsf{head}_1\eta \). Note that \(\eta \) represents both matches \(h_1\) and \(h_2\) from Definition 1, which is possible since variables in \(\rho _1\) and \(\rho _2\) are disjoint. The corresponding set \(\mathcal {I}_a\) then is \((\textsf{body}_1\cup (\textsf{body}_2\setminus \textsf{body}^m_2))\eta \omega \). Unfortunately, it does not suffice to consider singleton sets for \(\textsf{body}^m_2\), as shown by Example 4:

Example 4

Consider the rules from Example 1. Trying to map either one of the atoms of \(\textsf{body}(\rho _2)\) to \(\textsf{head}(\rho _1)\) yields an \(\mathcal {I}_a=\{ a(c), r(c,c') \}\), to which \(\rho _1\) is not applicable. The correct \(\mathcal {I}_a=\{a(c)\}\) as given in Example 1 is found by unifying both atoms of \(\textsf{body}(\rho _2)\) with (an instance of) \(\textsf{head}(\rho _1)\).

Therefore, we have to analyse all subsets \(\textsf{body}^m_2\subseteq \textsf{body}_2\) for possible matches with \(\textsf{head}_1\). We start the search from singleton sets, which are successively extended by adding atoms. A final important insight is that this search can often be aborted early, since a candidate pair for \(\mathcal {I}_a\) and \(\mathcal {I}_b\) may fail Definition 1 for various reasons, and considering a larger \(\textsf{body}^m_2\) is not always promising. For example, if \(\eta \) is a satisfied match for \(\rho _2\) over \(\mathcal {I}_b\) (b), then adding more atoms to \(\textsf{body}^m_2\) will never succeed.
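Schematically, this search strategy has the following shape (an illustrative sketch, not the authors' actual code: the toy check stands in for unification and the conditions of Definition 1). Candidate sets are extended only with atoms of larger index, so each subset is visited at most once, and every check can succeed, abort the current branch, or request an extension.

```cpp
#include <iostream>
#include <vector>

enum class Check { Success, Abort, Extend };

// Hypothetical stand-in for the per-candidate test (mgu computation plus
// conditions (a)-(c) of Definition 1 in the real algorithm).
Check check(const std::vector<int>& candidate) {
    // Toy criterion: succeed on {0, 2}, allow extensions elsewhere.
    if (candidate == std::vector<int>{0, 2}) return Check::Success;
    return Check::Extend;
}

// Extends the candidate only with atom indices larger than maxIndex, so
// every subset of the numAtoms body atoms is considered at most once.
bool search(int numAtoms, std::vector<int>& candidate, int maxIndex) {
    for (int i = maxIndex + 1; i < numAtoms; ++i) {
        candidate.push_back(i);
        Check result = check(candidate);
        if (result == Check::Success) return true;
        if (result == Check::Extend && search(numAtoms, candidate, i))
            return true;  // recurse on the extended candidate
        candidate.pop_back();  // Abort or failed extension: prune branch
    }
    return false;
}

int main() {
    std::vector<int> candidate;
    std::cout << (search(3, candidate, -1) ? "found" : "none") << "\n";
}
```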

These ideas are implemented in Algorithms 1 and 2, explained next. For a substitution \(\theta \), we write \(\theta _\forall \) (\(\theta _\exists \), resp.) to denote the substitution that assigns existential variables (universal variables, resp.) to themselves and otherwise agrees with \(\theta \).

Algorithm 1 iterates over extensions of a given candidate set. To specify how atoms of \(\textsf{body}_2\) are mapped to \(\textsf{head}_1\), we maintain an atom mapping \(m:\textsf{body}_2\rightarrow \textsf{head}_1\) whose domain \(\textsf{dom}(m)\) corresponds to the chosen \(\textsf{body}^m_2\subseteq \textsf{body}_2\). To check for the positive reliance, we initially call Algorithm 1 with the empty atom mapping. Note that \(\rho _1\) and \(\rho _2\) can be based on the same rule (a rule can positively rely on itself); we still use two variants that ensure disjoint variable names.

[Algorithm 2]

We treat rule bodies and heads as lists of atoms, and write \(\varphi [i]\) for the ith atom in \(\varphi \). The largest index of an atom in \(\textsf{dom}(m)\) is taken to be 0 if \(\textsf{dom}(m)=\emptyset \). By extending m only with atoms of larger index (L1), we ensure that each \(\textsf{dom}(m)\) is only considered once. We then construct each possible extension of m (L3), where we replace existential variables by fresh nulls in \(\textsf{head}_1\). Line 4 computes the most general unifier \(\eta \) of \(m'\), which is undefined if \(m'\) cannot be unified. With variables, constants, and nulls as the only terms, unification is an easy polynomial algorithm.

Processing continues with Algorithm 2, called in Line 5 of Algorithm 1. We first partition \(\textsf{body}_2\) into the matched atoms \(\textsf{body}^m_2\), and the remaining atoms to the left \(\textsf{body}^\ell _2\) and right \(\textsf{body}^r_2\) of the maximal index of m. Only \(\textsf{body}^r_2\) can still be considered for extending m. Six if-blocks check all conditions of Definition 1, and true is returned if all checks succeed. When a check fails, the search is either stopped (L10, L11, and L17) or recursively continued with an extended mapping (L12, L14, and L15). The three checks in L10–L12 cover cases where \(\mathcal {I}_a\) (L13) would need to contain nulls that are freshly introduced by \(\rho _1\) only later. L10 applies, e.g., when checking \(\rho _2\mathrel {\prec ^{+}}\rho _1\) for \(\rho _1,\rho _2\) as in Example 2, where we would get \(a(n)\in \mathcal {I}_a\) (note the swap of rule names compared to our present algorithm). Further extensions of m are useless for L10, since they could only lead to more specific unifiers, and also for L11, where nulls occur in “earlier” atoms that are not considered in extensions of m. For case L12, however, moving further atoms from \(\textsf{body}^r_2\) to \(\textsf{body}^m_2\) might be promising, so we continue the recursive search there.

In L14, we check if the constructed match of \(\rho _1\) on \(\mathcal {I}_a\) is already satisfied. This might again be fixed by extending the mapping, since doing so makes \(\textsf{body}^r_2\) and hence \(\mathcal {I}_a\) smaller. If we reach L15, we have established condition (a) of Definition 1. L15 then ensures condition (c), which might again be repaired by extending the atom mapping so as to make \(\mathcal {I}_a\) smaller. Finally, L17 checks condition (b). If this fails, we can abort the search: unifying more atoms of \(\textsf{body}_2\) with \(\textsf{head}_1\) will only lead to a more specific \(\mathcal {I}_b\) and \(\eta \), for which the check would still fail.

[Algorithm 3]

Theorem 1

For rules \(\rho _1\) and \(\rho _2\) that (w.l.o.g.) do not share variables, \(\rho _1\mathrel {\prec ^{+}}\rho _2\) iff the initial call of Algorithm 1 returns true.

5 Computing Restraints

We now turn our attention to the efficient computation of restraints. In spite of the rather different definitions, many of the ideas from Sect. 4 can also be applied here. The main observation is that the search for an alternative match can be realised by unifying a part of \(\textsf{head}_2\) with \(\textsf{head}_1\) in a way that resembles our unification of \(\textsf{body}_2\) with \(\textsf{head}_1\) in Sect. 4.

To realise this, we define a variant of Algorithm 1 in which we simply replace \(\textsf{body}_2\) in L1 and L3 by \(\textsf{head}_2\), and the check in L5 by the check defined in Algorithm 3, explained next.

We use the notation \(\rho _1\), \(\rho _2\), \(\omega \), \(\textbf{V}_\exists \), and \(\textbf{V}_\forall \) as introduced in Sect. 4, and again use an atom mapping m to represent our current hypothesis for a possible match. What is new is that unified atoms in \(\textsf{dom}(m)\) can contain existentially quantified variables, though existential variables in the range of m (from \(\textsf{head}_1\)) are still replaced by nulls as in Algorithm 1, L5. An existential variable in \(\textsf{head}_2\) might therefore be unified with a constant, null, or universal variable of \(\textsf{head}_1\). In the last case, where we need a unifier \(\eta \) with \(z\eta = x\eta \) for \(z\in \textbf{V}_\exists \) and \(x\in \textbf{V}_\forall \), we require that \(x\eta =z\eta \in \textbf{V}_\forall \), so that \(\eta \) only maps to variables in \(\textbf{V}_\forall \). The unifier \(\eta \) simultaneously represents the matches \(h_1\), \(h_2\), and \(h^A\) from Definition 3.

Example 5

For rules \(\rho _1= r(x,y) \rightarrow s(x,x,y)\) and \(\rho _2= a(z) \rightarrow \exists v.\, s(z,v,v)\wedge b(v)\), and mapping \(m=\{s(z,v,v)\mapsto s(x,x,y)\}\), we obtain a unifier \(\eta \) that maps all variables to x (we could also use y, but not the existential v). Let \(x\omega =c\) be the constant that x is instantiated with. Then we can apply \(\rho _2\) to \(\tilde{\mathcal {I}}_a = \{ a(z)\eta \omega \} = \{ a(c) \}\) with match \(h_2=\{z\mapsto c, v\mapsto n \}\) to get \(\mathcal {I}_a=\tilde{\mathcal {I}}_a\cup \{ s(c,n,n),b(n)\}\), and \(\rho _1\) to \(\tilde{\mathcal {I}}_b=\mathcal {I}_a\cup \{r(c,c),b(c)\}\) with match \(h_1=\{x\mapsto c, y\mapsto c\}\) to get \(\mathcal {I}_b=\tilde{\mathcal {I}}_b\cup \{s(c,c,c)\}\). Note that we had to add b(c) to obtain the required alternative match \(h^A\), which maps n to \(v\eta \omega =c\) and c to itself.

As in the example, a most general unifier \(\eta \) yields a candidate \(h^A\) that maps every null of the form \(v\omega _\exists \) to \(v\eta _\exists \omega _\forall \). Likewise, for \(i\in \{1,2\}\), \(h_i=\eta _\forall \omega \) are the (extended) matches, while \(\eta _\forall \omega _\forall \) are the body matches. The image of the instantiated \(\textsf{head}_2\eta _\forall \omega \) under the alternative match \(h^A\) is given by \(\textsf{head}_2\eta \omega \). The corresponding interpretations are \(\mathcal {I}_a= \textsf{body}_2\eta _\forall \omega _\forall \cup \textsf{head}_2\eta _\forall \omega \) and \(\mathcal {I}_b= \mathcal {I}_a\cup \textsf{body}_1\eta _\forall \omega _\forall \cup \textsf{head}_1\eta _\forall \omega \cup (\textsf{head}_2\setminus \textsf{dom}(m))\eta \omega \), where \((\textsf{head}_2\setminus \textsf{dom}(m))\eta \omega \) provides additional atoms required for the alternative match but not contained in the mapped atoms of \(\textsf{head}_2\). With these intuitions, Algorithm 3 can already be understood.

It remains to explain the conditions that are checked before returning true. As before, we partition \(\textsf{dom}(m)\) into mapped atoms \(\textsf{head}^m_2\) and left and right remainder atoms. Checks in L22–L24 ensure that the only variables mapped by \(\eta \) to nulls (necessarily from \(\textsf{head}_1\omega _\exists \)) are existential variables in \(\textsf{head}^m_2\): such mappings are possible by \(h^A\). Extending m further is only promising if the nulls only stem from atoms in \(\textsf{head}^r_2\).

Check L26 continues the search when no atoms with existentials have been selected yet. Selecting other atoms first might be necessary by our order, but no alternative matches can exist for such mappings (yet). Lines L29 and L32 check that the matches \(h_1\) and \(h_2\) are indeed unsatisfied. Extending m might fix L29 by making \(\tilde{\mathcal {I}}_a\) smaller, whereas L32 cannot be fixed. Finally, L33 ensures condition (d) of Definition 3.

Example 6

Consider rules \(\rho _1 = b(x,y) \rightarrow r(x,y,x,y) \wedge q(x,y)\), \(\rho _2 = a(u,v) \rightarrow \exists w.\, r(u,v,w,w) \wedge r(v,u,w,w)\), and mapping \(m = \{ r(u,v,w,w)\mapsto r(x,y,x,y) \}\). We obtain unifier \(\eta \) mapping all variables to a single universally quantified variable, say x. We reach \(\tilde{\mathcal {I}}_b = \{ a(c,c), r(c,c,n,n), b(c,c), r(c,c,c,c) \}\), based on \(\tilde{\mathcal {I}}_a = \{ a(c,c) \}\) (\(x\omega = c\)), for which \(\rho _1\) is applicable but \(h^A = \{ n\mapsto c, c\mapsto c \}\) is already an alternative match on \(\tilde{\mathcal {I}}_b\), recognised by L33.

Theorem 2

For rules \(\rho _1\) and \(\rho _2\) that (w.l.o.g.) do not share variables, \(\rho _1\mathrel {\prec ^{\square }}\rho _2\) holds according to Definition 3 for some \(\mathcal {I}_a\ne \mathcal {I}_b\) iff the initial call of our modified algorithm returns true.

The case \(\mathcal {I}_a=\mathcal {I}_b\), which Theorem 2 leaves out, is possible [17, Example 5], but requires a slightly different algorithm. We can adapt Algorithm 3 by restricting to one rule, for which we map from atoms in \(\textsf{head}\) to atoms in \(\textsf{head}\omega _\exists \). The checks (for \(\textsf{head}_2\)) of Algorithm 3 remain as before, but we only need to compute a single \(\mathcal {I}\) that plays the role of \(\mathcal {I}_a\) and \(\mathcal {I}_b\). Check L33 is replaced by a new check ensuring that at least one null is mapped differently in the alternative match. With these modifications, we can show an analogous result to Theorem 2 for the case \(\mathcal {I}_a=\mathcal {I}_b\).

6 Implementation and Global Optimisations

We provide a C++ implementation of our algorithms, which also includes some additional optimisations and methods as described next. Our prototype is built on top of the free rule engine VLog (Release 1.3.5) [23], so that we can use its facilities for loading rules and checking MFA (see Sect. 7). Reasoning algorithms of VLog are not used in our code.

The algorithms of Sects. 4 and 5 use optimisations that are local to the task of computing dependencies for a single pair of rules. The quadratic number of potential rule pairs is often so large, however, that even the most optimised checks lead to significant overhead. We therefore build index structures that map predicates p to rules that use p in their body or head, respectively. For each rule \(\rho _1\), we then check \(\rho _1\mathrel {\prec ^{+}}\rho _2\) only for rules \(\rho _2\) that mention some predicate from \(\textsf{head}(\rho _1)\) in their body, and analogously for \(\rho _1\mathrel {\prec ^{\square }}\rho _2\).
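A minimal sketch of this indexing idea (the data layout is our own simplification):

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

struct Rule {
    std::set<std::string> bodyPreds;
    std::set<std::string> headPreds;
};

int main() {
    std::vector<Rule> rules = {
        {{"a"}, {"r", "b"}},  // rho_0: a(x) -> E v. r(x,v) & b(v)
        {{"r"}, {"s"}},       // rho_1: r(x,y) -> s(y,x)
        {{"s"}, {"r"}},       // rho_2: s(x,y) -> r(y,x)
    };
    // Index: predicate -> rules using it in their body.
    std::map<std::string, std::vector<int>> bodyIndex;
    for (int i = 0; i < (int)rules.size(); ++i)
        for (const auto& p : rules[i].bodyPreds) bodyIndex[p].push_back(i);

    // Candidates for rho_1 <+ rho_2: only rules whose body mentions a head
    // predicate of rho_1 undergo the expensive dependency check.
    for (int r1 = 0; r1 < (int)rules.size(); ++r1) {
        std::set<int> candidates;
        for (const auto& p : rules[r1].headPreds)
            for (int r2 : bodyIndex[p]) candidates.insert(r2);
        std::cout << "rule " << r1 << ": " << candidates.size()
                  << " candidate(s)\n";
    }
}
```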

Specifically for large rule sets, we further observed that many rules share the exact same structure up to some renaming of predicates and variables. For every rule pair considered, we therefore create an abstraction that captures the co-occurrence of predicates but not the concrete predicate names. This abstraction is used as a key to cache results of prior computations, which can be re-used when encountering rule pairs with the exact same abstraction.
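The following sketch illustrates one simple way to compute such an abstraction key (our own simplification: predicate and variable names are canonicalised by order of first occurrence, so rule pairs that are equal up to renaming receive the same key):

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Atom { std::string pred; std::vector<std::string> args; };
using Rule = std::pair<std::vector<Atom>, std::vector<Atom>>;  // (body, head)

// Builds a renaming-invariant key for the pair (r1, r2).
std::string abstraction(const Rule& r1, const Rule& r2) {
    std::map<std::string, int> preds, vars;
    std::string key;
    auto addAtoms = [&](const std::vector<Atom>& atoms) {
        for (const Atom& a : atoms) {
            if (!preds.count(a.pred)) { int id = preds.size(); preds[a.pred] = id; }
            key += "p" + std::to_string(preds[a.pred]) + "(";
            for (const auto& t : a.args) {
                if (!vars.count(t)) { int id = vars.size(); vars[t] = id; }
                key += "v" + std::to_string(vars[t]) + ",";
            }
            key += ")";
        }
        key += "|";
    };
    for (const Rule* r : {&r1, &r2}) { addAtoms(r->first); addAtoms(r->second); }
    return key;
}

int main() {
    Rule r1  = {{{"a", {"x"}}}, {{"r", {"x", "v"}}}};  // a(x) -> r(x,v)
    Rule r2  = {{{"r", {"y", "z"}}}, {{"s", {"z", "y"}}}};
    Rule r1b = {{{"b", {"x"}}}, {{"q", {"x", "v"}}}};  // r1, predicates renamed
    Rule r2b = {{{"q", {"y", "z"}}}, {{"t", {"z", "y"}}}};
    // The pairs (r1,r2) and (r1b,r2b) are equal up to renaming, so a cached
    // result for the first pair can be reused for the second.
    std::cout << (abstraction(r1, r2) == abstraction(r1b, r2b) ? "hit" : "miss")
              << "\n";
}
```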

Besides these optimisations, we also implemented unoptimised variants of the algorithms of Sects. 4 and 5 to be used as a baseline in experiments. Instead of our goal-directed check-and-extend strategy, we simply iterate over all possible mappings until a dependency is found or the search is completed.

7 Evaluation

We have evaluated our implementation regarding (1) the efficiency of our optimisations and (2) its utility for solving practical problems. The latter also led to the first study of core stratification in real-world rule sets. Our evaluation machine is a mid-range server (Debian Linux 9.13; Intel Xeon CPU E5-2637v4@3.50 GHz; 384 GB RAM DDR4; 960 GB SSD), but our implementation is single-threaded and did not use more than 2 GB of RAM per individual experiment.

Experimental Data. All experiments use the same corpus of rule sets, created from real-world OWL ontologies of the Oxford Ontology Repository (http://www.cs.ox.ac.uk/isg/ontologies/). OWL is based on a fragment of first-order logic that overlaps with existential rules. OWL axioms that involve datatypes were deleted; every other axiom was syntactically transformed to obtain a Horn clause that can be written as a rule. This may fail if axioms use unsupported features, especially those related to (positive) disjunctions and equality. We dropped ontologies that could not fully be translated or that required no existential quantifier in the translation. In this way, 201 of the overall 787 ontologies were converted to existential rules, corresponding largely to those ontologies in the logic Horn-\(\mathcal {SRI}\) [18]. The corpus contains 63 small (18–1,000 rules), 90 medium (1,000–10,000 rules), and 48 large (10,000–167,351 rules) sets. Our translation avoided normalisation and auxiliary predicates, which would profoundly affect dependencies. This also led to larger rule bodies and heads, both ranging up to 31 atoms.

Table 1. Number of rule sets achieving a given order of magnitude of speed-up for computing \(\mathrel {\prec ^{+}}\) (left) and \(\mathrel {\prec ^{\square }}\) (right) from one variant to another; t.o. gives the number of avoided timeouts

Optimisation Impact. We compare four software variants to evaluate the utility of our proposed optimisations. Our baseline N is the unoptimised version described in Sect. 6, while L uses the locally optimised algorithms of Sects. 4 and 5. Version G is obtained from N by enabling the global optimisations of Sect. 6, and A combines all optimisations of L and G. For each of the four cases, we measured the total time of determining all positive reliances and all restraints for each rule set. A timeout of 60 s was used. The number of timeouts for each experiment was as follows:

[Table: number of timeouts per experiment for the variants N, L, G, and A]

To present the remaining results, we focus on speed-up, i.e., the ratio of runtime of a less optimised variant over runtime of a more optimised one. Table 1 classifies the observed speed-ups in several scenarios by their order of magnitude. For example, in the left table, the number 14 in line \(\textsf {N}/\textsf {L} \) and column “\({<}10^2\)” means that for 14 of the 201 rule sets, L was between 10–\(10^2\) times faster than N. Note that \(\textsf {G}/\textsf {A} \) shows the effect of adding local optimisations to G. Column “\({=}1\)” shows cases where both variants agree, and column “t.o.” cases where the optimisation avoided a prior timeout (the speed-up cannot be computed since the timeout does not correspond to a time).

We conclude that both L and G can lead to significant performance gains across a range of ontologies. Strong effects are seen against the baseline (\(\textsf {N}/\textsf {L} \) and \(\textsf {N}/\textsf {G} \)), but also (to a slightly lesser extent) against variants with the other optimisations (\(\textsf {G}/\textsf {A} \) and \(\textsf {L}/\textsf {A} \)). Overall, \(\mathrel {\prec ^{\square }}\) turned out to be slower than \(\mathrel {\prec ^{+}}\), with the global optimisations being less effective.

[Fig. 1. Positive reliance computation in Graal (top) and our system (bottom)]

Acyclic Positive Reliances. For rule sets where the graph of positive reliances is acyclic, query answering is possible with many existing rule engines [1]. To evaluate how our work compares to the state of the art in computing this graph, we measure the time taken by Graal to find all positive reliances and compare it to our prototype A from above. The results are shown in Fig. 1.

Our approach consistently outperformed Graal by about one order of magnitude. Overall, we can classify 178 ontologies in under 1 s, making this analysis feasible at reasoning time. The difference in execution time is explained by our optimisations: given two rules \(\rho _1\) and \(\rho _2\), Graal computes all (exponentially many in the worst case) ways to unify \(\textsf{head}(\rho _1)\) with \(\textsf{body}(\rho _2)\), while our implementation (1) stops when a positive reliance is discovered, (2) discards atom mappings when a negative result is guaranteed, and (3) caches results of previous computations. Recall that Graal uses a slightly weaker notion of positive reliance (cf. Sect. 3), which leads to more cycles: we find 36 acyclic sets with Graal, but 70 such sets with our system.

Faster MFA. Model-faithful acyclicity (MFA) is an advanced analysis of rule sets that can discover decidability of query answering in many cases, but is \(\textsc {2ExpTime}\)-complete [10]. However, instead of performing this costly analysis on the whole rule set, an equivalent result can be obtained by analysing each strongly connected component of the \(\mathrel {\prec ^{+}}\)-graph individually. We measure the times for both approaches using the MFA implementation of VLog and our optimised variant A, with a timeout of 30 min per rule set. The two variants are denoted V (VLog MFA) and C (component-wise MFA).
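The decomposition itself is simple, as this sketch shows (checkMFA is a hypothetical stand-in for the reasoner's actual MFA test, and the SCC ids could be computed as in the sketch of Sect. 3):

```cpp
#include <iostream>
#include <map>
#include <vector>

// Hypothetical stub: the real test runs the MFA analysis on the subset.
bool checkMFA(const std::vector<int>& componentRules) {
    return true;
}

int main() {
    std::vector<int> sccId = {0, 0, 1, 2};  // rule -> component id
    std::map<int, std::vector<int>> components;
    for (int rule = 0; rule < (int)sccId.size(); ++rule)
        components[sccId[rule]].push_back(rule);
    bool allMFA = true;
    for (const auto& entry : components)
        allMFA = allMFA && checkMFA(entry.second);  // per-component MFA
    std::cout << (allMFA ? "MFA" : "not MFA") << "\n";
}
```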

Using C, 163 ontologies are classified as MFA, 33 fail MFA, and 5 cases time out. V times out in 10 cases, but agrees on all other outcomes. C is slower in three cases that still run in under 50 ms. The numbers of speed-ups, grouped by order of magnitude, are as follows:

[Table: speed-ups of C over V, grouped by order of magnitude]

We conclude that our optimised reliance computation is a feasible approach for speeding up MFA analysis.

Core Stratification. We can use our implementation to determine how common this favourable property (cf. Sect. 3) is among real-world ontologies. The analysis was feasible for 200 rule sets in our corpus, yielding 44 core stratified sets with up to 121,712 rules. One can improve this result by considering pieces: minimal subsets of rule heads in which any two atoms refer to a common existentially quantified variable [1]. Each rule can then equivalently be replaced by several rules, each combining the original body with one of the pieces of the original head, as sketched below. Applying this transformation to our rule sets leads to more fine-grained dependencies that have fewer cycles over \(\mathrel {\prec ^{\square }}\). With this modification, 75 rule sets are core stratified.
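A sketch of the piece computation (our own illustration): head atoms are grouped into connected components with respect to shared existential variables, and each group yields the head of one new rule.

```cpp
#include <iostream>
#include <set>
#include <string>
#include <vector>

struct Atom {
    std::string text;                 // printable form of the atom
    std::set<std::string> existVars;  // existential variables it uses
};

// Groups head atoms into pieces: atoms sharing an existential variable
// must end up in the same piece.
std::vector<std::vector<Atom>> pieces(std::vector<Atom> head) {
    std::vector<std::vector<Atom>> result;
    std::vector<bool> used(head.size(), false);
    for (size_t i = 0; i < head.size(); ++i) {
        if (used[i]) continue;
        std::vector<Atom> piece = {head[i]};
        std::set<std::string> vars = head[i].existVars;
        used[i] = true;
        bool grew = true;
        while (grew) {  // closure under shared existential variables
            grew = false;
            for (size_t j = 0; j < head.size(); ++j) {
                if (used[j]) continue;
                for (const auto& v : head[j].existVars)
                    if (vars.count(v)) {
                        piece.push_back(head[j]);
                        vars.insert(head[j].existVars.begin(),
                                    head[j].existVars.end());
                        used[j] = true;
                        grew = true;
                        break;
                    }
            }
        }
        result.push_back(piece);
    }
    return result;
}

int main() {
    // Head of a(x) -> E v,w. r(x,v) & b(v) & s(x,w): two pieces.
    std::vector<Atom> head = {{"r(x,v)", {"v"}}, {"b(v)", {"v"}},
                              {"s(x,w)", {"w"}}};
    for (const auto& p : pieces(head)) {
        for (const auto& a : p) std::cout << a.text << " ";
        std::cout << "\n";
    }
}
```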

Our implementation fails in one case (ontology ID 00477), containing 167,351 rules like \(A(x)\rightarrow \exists v.\, \textsf {located-in}(x,v)\wedge B(v)\), for various A and B. The required \(>28\times 10^9\) checks, though mostly cached, take a long time. In spite of many \(\mathrel {\prec ^{\square }}\)-relations, the set is core stratified as it describes a proper meronomy.

The remaining 125 rule sets are not core stratified. To validate the outcome, we have analysed these sets manually, and found several common reasons why ontologies were indeed not core stratified (and therefore correctly classified in our implementation). The following two examples explain two typical situations.

Example 7

In some cases, core stratification fails even though there is a natural rule application order that always leads to a core. Consider the rules \(\rho _1= a(x) \rightarrow \exists v.\, r(x, v) \wedge b(v)\), \(\rho _2= r(x, y) \rightarrow s(y, x)\), and \(\rho _3= s(x, y) \rightarrow r(y, x)\). This set is not core stratified since we have \(\rho _1 \mathrel {\prec ^{+}}\rho _3\), \(\rho _2 \mathrel {\prec ^{+}}\rho _3\), \(\rho _3 \mathrel {\prec ^{+}}\rho _2\), and \(\rho _3 \mathrel {\prec ^{\square }}\rho _1\). However, prioritising \(\rho _2\) and \(\rho _3\) over \(\rho _1\) (i.e., using a Datalog-first strategy [9]) always leads to a core. Indeed, the positive reliance \(\rho _1 \mathrel {\prec ^{+}}\rho _3\) overestimates relevant rule applications, since no new atom produced by \(\rho _1\) can (indirectly) lead to an application of \(\rho _3\).

Example 8

In other cases, there is indeed no data-independent strategy for rule applications that would always lead to a core. Consider the rules \(\rho _1= a(x) \rightarrow \exists v.\, r(x, v) \wedge b(v)\) and \(\rho _2= r(x, y) \wedge r(y, z) \rightarrow r(x, z)\). Both are common in OWL ontologies with existential axioms and transitive roles. The rule set is not core stratified since \(\rho _1\mathrel {\prec ^{+}}\rho _2\) and \(\rho _2\mathrel {\prec ^{\square }}\rho _1\).

Consider \(\mathcal {I}_a = \{a(1), a(2), r(1, 2)\}\). Applying \(\rho _1\) over \(\mathcal {I}_a\) to all matches yields \(\mathcal {I}_b = \mathcal {I}_a \cup \{r(1, n), b(n), r(2, m), b(m)\}\), which makes \(\rho _2\) applicable to obtain \(\mathcal {I}_c = \mathcal {I}_b \cup \{r(1, m)\}\). Here we have the alternative match \(h^A = \{1 \mapsto 1, 2 \mapsto 2, n \mapsto m\}\).

In contrast, applying \(\rho _1\) only for the match \(\{x\mapsto 2\}\) produces \(\mathcal {I}_b' = \mathcal {I}_a \cup \{r(2,n),b(n)\}\). A subsequent application of \(\rho _2\) yields \(\mathcal {I}_c' = \mathcal {I}_b' \cup \{r(1,n)\}\), which is a core model. Indeed, core models could often be achieved in such settings, but require fine-grained, data-dependent strategies that cannot be found by static analysis (concretely: we could consider r as a pre-order and apply \(\rho _1\) to the r-greatest elements first, followed by an exhaustive application of \(\rho _2\)).

Overall, our manual inspection supported the correctness of our computation and led to interesting first insights about core stratification in practical cases. Regarding the contribution of this work, our main conclusion of this evaluation is that our proposed algorithms are able to solve real-world tasks that require the computation of positive reliances and restraints over large ontologies.

8 Conclusions

We have shown that even the complex forms of dependencies that arise with existential rules can be implemented efficiently, and that doing so enables a number of uses of practical and theoretical interest. In particular, several previously proposed approaches can be made significantly faster or implemented for the first time. Our methods can also be adapted to cover further cases, especially negative reliances.

Our work opens up a path towards further uses of reliance-based analyses in practice. Already our experiments on core stratification – though primarily intended to evaluate the practical feasibility of our restraint algorithm – also showed that (a) core stratification does occur in many non-trivial real-world ontologies, whereas (b) there are also relevant cases where this criterion fails although a rule-based core computation seems to be within reach. This could be a starting point for refining this notion. It is also interesting to ask whether good ontology design should, in principle, lead to specifications that naturally produce cores, i.e., that robustly avoid redundancies. A different research path is to ask how knowledge of dependencies can be used to speed up reasoning. Indeed, dependencies embody characteristics of existential rule reasoning that are not found in other rule languages, and that therefore deserve further attention.

Supplemental Material Statement. We provide full proofs in the technical report published on arXiv [14]. Our source code, experimental data, instructions for repeating all experiments, and our own raw measurements are available on GitHub.