Keywords

1 Introduction

The formalism of existential rules has come to prominence as an effective approach for both specifying and querying knowledge. Within this context, a knowledge base takes the form \(\mathcal {K} = (\mathcal {D},\mathcal {R})\), where \(\mathcal {D}\) is a finite collection of atomic facts (called a database) and \(\mathcal {R}\) is a finite set of existential rules (called a rule set), which are first-order formulae of the form \(\forall \textbf{x}\textbf{y}(\varphi (\textbf{x},\textbf{y}) \rightarrow \exists \textbf{z}\psi (\textbf{y},\textbf{z}))\). Although existential rules are written in a relatively simple language, they are expressive enough to generalize many important languages used in knowledge representation, including rule-based formalisms as well as such based on description logics. Moreover, existential rules have meaningful applications within the domain of ontology-based query answering [2], data exchange and integration [9], and have proven beneficial in the study of general decidability criteria [10].

Fig. 1.
figure 1

A graphic depicting the containment relations between the classes of rule sets considered. The solid edges represent strict containment relations.

The Boolean conjunctive query entailment problem consists of taking a knowledge base \(\mathcal {K}\), a Boolean conjunctive query (BCQ) q, and determining if \(\mathcal {K}\models q\). As this problem is known to be undecidable for arbitrary rule sets [7], much work has gone into identifying existential rule fragments for which decidability can be reclaimed. Typically, such classes of rule sets are described in one of two ways: either, a rule set’s membership in said class can be established through easily verifiable syntactic properties (such classes are called concrete classes), or the property is more abstract (which is often defined on the basis of semantic notions) and may be hard or even impossible to algorithmically determine (such classes are called abstract classes). Examples of concrete classes include functional/inclusion dependencies [11], datalog, and guarded rules [6]. Examples of abstract classes include finite expansion sets [4], finite unification sets [3], and bounded-treewidth sets (\(\textbf{bts}\)) [6].

Yet, there is another means of establishing the decidability of query entailment: only limited work has gone into identifying classes of rule sets with decidable query entailment based on their proof-theoretic characteristics, in particular, based on specifics of the derivations such rules produce. To the best of our knowledge, only the class of greedy bounded treewidth sets (\(\textbf{gbts}\)) has been identified in such a manner (see [14]). A rule set qualifies as \(\textbf{gbts}\) when every derivation it produces is greedy, in a sense that it is possible to construct a tree decomposition of finite width in a “greedy” fashion alongside the derivation, ensuring the existence of a model with finite treewidth for the knowledge base under consideration, thus warranting the decidability of query entailment [6].

In this paper, we investigate the \(\textbf{gbts}\) class and three new classes of rule sets where decidability is determined proof-theoretically. First, we define a weakened version of \(\textbf{gbts}\), dubbed \(\textbf{wgbts}\), where the rule set need only produce at least one greedy derivation relative to any given database. Second, we investigate two new classes of rule sets, dubbed cycle-free derivation graph sets (\(\textbf{cdgs}\)) and weakly cycle-free derivation graph sets (\(\textbf{wcdgs}\)), which are defined relative to the notion of a derivation graph. Derivation graphs were introduced by Baget et al. [5] and are directed acyclic graphs encoding how certain facts are derived in the course of a derivation. Notably, via the application of reduction operations, a derivation graph may be reduced to a tree, which serves as a tree decomposition of a model of the considered knowledge base. Such objects helped establish that (weakly) frontier-guarded rule sets are \(\textbf{bts}\) [5]. In short, our key contributions are (Fig. 1):

  1. 1.

    We investigate how proof-theoretic structures gives rise to decidable query entailment and propose three new classes of rule sets.

  2. 2.

    We show that \(\textbf{gbts}= \textbf{cdgs}\) and \(\textbf{wgbts}= \textbf{wcdgs}\), establishing a correspondence between greedy derivations and reducible derivation graphs.

  3. 3.

    We show that \(\textbf{wgbts}\) properly subsumes \(\textbf{gbts}\) via a novel proof transformation argument. Therefore, by the former point, we also find that \(\textbf{wcdgs}\) properly subsumes \(\textbf{cdgs}\).

The paper is organized accordingly: In Section 2, we define preliminary notions. We study \(\textbf{gbts}\) and \(\textbf{wgbts}\) in Section 3, and show that the latter class properly subsumes the former via an intricate proof transformation argument. In Section 4, we define \(\textbf{cdgs}\) and \(\textbf{wcdgs}\) as well as show that \(\textbf{gbts}= \textbf{cdgs}\) and \(\textbf{wgbts}= \textbf{wcdgs}\). In Section 5, we conclude and discuss future work. Last, we note that full proofs can be found in the appended version [12].

2 Preliminaries

Syntax and formulae. We let \(\textbf{Ter}\) be a set of terms, which is the the union of three countably infinite, pairwise disjoint sets, namely, the set of constants \(\textbf{Con}\), the set of variables \(\textbf{Var}\), and the set of nulls \(\textbf{Nul}\). We use \(a\), \(b\), \(c\), \(\ldots \) (occasionally annotated) to denote constants, and \(x\), \(y\), \(z\), \(\ldots \) (occasionally annotated) to denote both variables and nulls. A signature \(\varSigma \) is a set of predicates p, q, r, \(\ldots \) (which may be annotated) such that for each \(p \in \varSigma \), \(ar(p) \in \mathbb {N}\) is the arity of p. For simplicity, we assume a fixed signature \(\varSigma \) throughout the paper.

An atom over \(\varSigma \) is defined to be a formula of the form \(p(t_{1}, \ldots , t_{n})\), where \(p \in \varSigma \), \(ar(p) = n\), and \(t_{i} \in \textbf{Ter}\) for each \(i \in \{1, \ldots , n\}\). A ground atom over \(\varSigma \) is an atom \(p(a_{1}, \ldots , a_{n})\) such that \(a_{i} \in \textbf{Con}\) for each \(i \in \{1, \ldots , n\}\). We will often use \(\textbf{t}\) to denote a tuple \((t_{1}, \ldots , t_{n})\) of terms and \(p(\textbf{t})\) to denote a (ground) atom \(p(t_{1}, \ldots , t_{n})\). An instance over \(\varSigma \) is defined to be a (potentially infinite) set \(\mathcal {I}\) of atoms over constants and nulls, and a database \(\mathcal {D}\) is a finite set of ground atoms. We let \(\mathcal {X}\), \(\mathcal {Y}\), \(\ldots \) (occasionally annotated) denote (potentially infinite) sets of atoms with \(\textbf{Ter}(\mathcal {X})\), \(\textbf{Con}(\mathcal {X})\), \(\textbf{Var}(\mathcal {X})\), and \(\textbf{Nul}(\mathcal {X})\) denoting the set of terms, constants, variables, and nulls occurring in the atoms of \(\mathcal {X}\), respectively.

Substitutions and homomorphisms. A substitution is a partial function over the set of terms \(\textbf{Ter}\). A homomorphism h from a set \(\mathcal {X}\) of atoms to a set \(\mathcal {Y}\) of atoms, is a substitution \(h : \textbf{Ter}(\mathcal {X}) \rightarrow \textbf{Ter}(\mathcal {Y})\) such that (i) \(p(h(t_{1}), \ldots , h(t_{n})) \in \mathcal {Y}\), if \(p(t_{1}, \ldots , t_{n}) \in \mathcal {X}\), and (ii) \(h(a) = a\) for each \(a \in \textbf{Con}\). If h is a homomorphism from \(\mathcal {X}\) to \(\mathcal {Y}\), we say that h homomorphically maps \(\mathcal {X}\) to \(\mathcal {Y}\). Atom sets \(\mathcal {X},\mathcal {Y}\) are homomorphically equivalent, written \(\mathcal {X}\equiv \mathcal {Y}\), iff \(\mathcal {X}\) homomorphically maps to \(\mathcal {Y}\), and vice versa. An isomorphism is a bijective homomorphism h where \(h^{-1}\) is a homomorphism.

Existential rules. Whereas databases encode assertional knowledge, ontologies consist in the current setting of existential rules, which we will frequently refer to as rules more simply. An existential rule is a first-order sentence of the form:

$$ \rho = \forall \textbf{x}\textbf{y}(\varphi (\textbf{x},\textbf{y}) \rightarrow \exists \textbf{z}\psi (\textbf{y},\textbf{z})) $$

where \(\textbf{x}\), \(\textbf{y}\), and \(\textbf{z}\) are pairwise disjoint collections of variables, \(\varphi (\textbf{x},\textbf{y})\) is a conjunction of atoms over constants and the variables \(\textbf{x},\textbf{y}\), and \(\psi (\textbf{y},\textbf{z})\) is a conjunction of atoms over constants and the variables \(\textbf{y},\textbf{z}\). We define \( body (\rho ) = \varphi (\textbf{x},\textbf{y})\) to be the body of \(\rho \), and \( head (\rho ) = \psi (\textbf{y},\textbf{z})\) to be the head of \(\rho \). For convenience, we will often interpret a conjunction \(p_{1}(\textbf{t}_{1}) \wedge \cdots \wedge p_{n}(\textbf{t}_{n})\) of atoms (such as the body or head of a rule) as a set \(\{p_{1}(\textbf{t}_{1}), \cdots , p_{n}(\textbf{t}_{n})\}\) of atoms; if h is a homomorphism, then \(h(p_{1}(\textbf{t}_{1}) \wedge \cdots \wedge p_{n}(\textbf{t}_{n})) := \{p_{1}(h(\textbf{t}_{1})), \cdots , p_{n}(h(\textbf{t}_{n}))\}\) with h applied componentwise to each tuple \(\textbf{t}_{i}\) of terms. The frontier of \(\rho \), written \( fr (\rho )\), is the set of variables \(\textbf{y}\) that the body and head of \(\rho \) have in common, that is, \( fr (\rho ) = \textbf{Var}( body (\rho )) \cap \textbf{Var}( head (\rho ))\). We define a frontier atom in a rule \(\rho \) to be an atom containing at least one frontier variable. We use \(\rho \) and annotated versions thereof to denote rules, as well as \(\mathcal {R}\) and annotated versions thereof to denote finite sets of rules (simply called rule sets).

Models. We note that sets of atoms (which include instances and databases) may be seen as first-order interpretations, and so, we may use \(\models \) to represent the satisfaction of formulae on such structures. A set of atoms \(\mathcal {X}\) satisfies a set of atoms \(\mathcal {Y}\) (or, equivalently, \(\mathcal {X}\) is a model of \(\mathcal {Y}\)), written \(\mathcal {X}\models \mathcal {Y}\), iff there exists a homomorphic mapping from \(\mathcal {Y}\) to \(\mathcal {X}\). A set of atoms \(\mathcal {X}\) satisfies a rule \(\rho \) (or, equivalently, \(\mathcal {X}\) is a model of \(\rho \)), written \(\mathcal {X}\models \rho \), iff for any homomorphism h, if h is a homomorphism from \( body (\rho )\) to \(\mathcal {X}\), then it can be extended to a homomorphism \(\overline{h}\) that also maps \( head (\rho )\) to \(\mathcal {X}\). A set of atoms \(\mathcal {X}\) satisfies a rule set \(\mathcal {R}\) (or, equivalently, \(\mathcal {X}\) is a model of \(\mathcal {R}\)), written \(\mathcal {X}\models \mathcal {R}\), iff \(\mathcal {X}\models \rho \) for every rule \(\rho \in \mathcal {R}\). If a model \(\mathcal {X}\) of a set of atoms, a rule, or a rule set homomorphically maps into every model of that very set of atoms, rule, or rule set, then we refer to \(\mathcal {X}\) as a universal model of that set of atoms, rule, or rule set [8].

Knowledge bases and querying. A knowledge base (KB) \(\mathcal {K}\) is defined to be a pair \((\mathcal {D},\mathcal {R})\), where \(\mathcal {D}\) is a database and \(\mathcal {R}\) is a rule set. An instance \(\mathcal {I}\) is a model of \(\mathcal {K}= (\mathcal {D},\mathcal {R})\) iff \(\mathcal {D}\subseteq \mathcal {I}\) and \(\mathcal {I}\models \mathcal {R}\). We consider querying knowledge bases with conjunctive queries (CQs), that is, with formulae of the form \(q(\textbf{y}) = \exists \textbf{x}\varphi (\textbf{x},\textbf{y})\), where \(\varphi (\textbf{x},\textbf{y})\) is a non-empty conjunction of atoms over the variables \(\textbf{x},\textbf{y}\) and constants. We refer to the variables \(\textbf{y}\) in \(q(\textbf{y})\) as free and define a Boolean conjunctive query (BCQ) to be a CQ without free variables, i.e. a BCQ is a CQ of the form \(q = \exists \textbf{x}\varphi (\textbf{x})\). A knowledge base \(\mathcal {K}= (\mathcal {D}, \mathcal {R})\) entails a CQ \(q(\textbf{y})= \exists \textbf{x}\varphi (\textbf{x},\textbf{y})\), written \(\mathcal {K}\models q(\textbf{y})\), iff \(\varphi (\textbf{x},\textbf{y})\) homomorphically maps into every model \(\mathcal {I}\) of \(\mathcal {K}\); we note that this is equivalent to \(\varphi (\textbf{x},\textbf{y})\) homomorphically mapping into a universal model of \(\mathcal {D}\) and \(\mathcal {R}\).

As we are interested in extracting implicit knowledge from the explicit knowledge presented in a knowledge base \(\mathcal {K}= (\mathcal {D},\mathcal {R})\), we are interested in deciding the BCQ entailment problem:Footnote 1

(BCQ Entailment) Given a KB \(\mathcal {K}\) and a BCQ q, is it the case that \(\mathcal {K}\models q\)?

While it is well-known that the BCQ entailment problem is undecidable in general [7], restricting oneself to certain classes of rule sets (e.g. datalog or finite unification sets [5]) may recover decidability. We refer to classes of rule sets for which BCQ entailment is decidable as query-decidable classes.

Derivations. One means by which we can extract implicit knowledge from a given KB is through the use of derivations, that is, sequences of instances obtained by sequentially applying rules to given data. We say that a rule \(\rho = \forall \textbf{x}\textbf{y}(\varphi (\textbf{x},\textbf{y}) \rightarrow \exists \textbf{z}\psi (\textbf{y},\textbf{z}))\) is triggered in an instance \(\mathcal {I}\) via a homomorphism h, written succinctly as \(\tau (\rho ,\mathcal {I},h)\), iff h homomorphically maps \(\varphi (\textbf{x},\textbf{y})\) to \(\mathcal {I}\). In this case, we define

$$\textbf{Ch}(\mathcal {I},\rho ,h) = \mathcal {I}\cup \overline{h}(\psi (\textbf{y},\textbf{z})),$$

where \(\overline{h}\) is an extension of h mapping every variable z in \(\textbf{z}\) to a fresh null. Consequently, we define an \(\mathcal {R}\)-derivation to be a sequence \(\mathcal {I}_{0}, (\rho _{1}, h_{1}, \mathcal {I}_{1}), \ldots , (\rho _{n}, h_{n}, \mathcal {I}_{n})\) such that (i) \(\rho _{i} \in \mathcal {R}\) for each \(i \in \{1, \ldots , n\}\), (ii) \(\tau (\rho _{i},\mathcal {I}_{i-1},h_{i})\) holds for \(i \in \{1, \ldots , n\}\), and (iii) \(\mathcal {I}_{i} = \textbf{Ch}(\mathcal {I}_{i-1},\rho ,h_{i})\) for \(i \in \{1, \ldots , n\}\). We will use \(\delta \) and annotations thereof to denote \(\mathcal {R}\)-derivations, and we define the length of an \(\mathcal {R}\)-derivation \(\delta = \mathcal {I}_{0}, (\rho _{1}, h_{1}, \mathcal {I}_{1}), \ldots , (\rho _{n}, h_{n}, \mathcal {I}_{n})\), denoted \(|\delta |\), to be n. Furthermore, for instances \(\mathcal {I}\) and \(\mathcal {I}'\), we write to mean that there exists an \(\mathcal {R}\)-derivation \(\delta \) of \(\mathcal {I}'\) from \(\mathcal {I}\). Also, if \(\mathcal {I}''\) can be derived from \(\mathcal {I}'\) by means of a rule \(\rho \in \mathcal {R}\) and homomorphism h, we abuse notation and write to indicate that and with \(\delta ' = \mathcal {I}', (\rho ,h,\mathcal {I}'')\). Derivations play a fundamental role in this paper as we aim to identify (and analyze the relationships between) query-decidable classes of rule sets based on how such rule sets derive information, i.e. we are interested in classes of rule sets that may be proof-theoretically characterized.

Chase. A tool that will prove useful in the current work is the chase, which in our setting is a procedure that (in essence) simultaneously constructs all \(\mathcal {K}\)-derivations in a breadth-first manner. Although many variants of the chase exist [5, 9, 13], we utilize the chase procedure (also called the k-Saturation) from Baget et al. [5]. We use the chase in the current work as a purely technical tool for obtaining universal models of knowledge bases, proving useful in separating certain query-decidable classes of rule sets.

We define the one-step application of all triggered rules from some \(\mathcal {R}\) in \(\mathcal {I}\) by

$$ \textbf{Ch}_{1}(\mathcal {I},\mathcal {R}) = {\bigcup }_{\rho \in \mathcal {R}, \tau (\rho ,\mathcal {I},h)} \textbf{Ch}(\mathcal {I},\rho ,h), $$

assuming all nulls introduced in the “parallel” applications of \(\textbf{Ch}\) to \(\mathcal {I}\) are distinct. We let \(\textbf{Ch}_{0}(\mathcal {I},\mathcal {R}) = \mathcal {I}\), as well as let \(\textbf{Ch}_{i+1}(\mathcal {I},\mathcal {R}) = \textbf{Ch}_{1}(\textbf{Ch}_{i}(\mathcal {I},\mathcal {R}),\mathcal {R})\), and define the chase to be

$$ \textbf{Ch}_{\infty }(\mathcal {I},\mathcal {R}) = {\bigcup }_{i \in \mathbb {N}} \textbf{Ch}_{i}(\mathcal {I},\mathcal {R}). $$

For any KB \(\mathcal {K}= (\mathcal {D},\mathcal {R})\), the chase \(\textbf{Ch}_{\infty }(\mathcal {D},\mathcal {R})\) is a universal model of \(\mathcal {K}\), that is, \(\mathcal {D}\subseteq \textbf{Ch}_{\infty }(\mathcal {D},\mathcal {R})\), \(\textbf{Ch}_{\infty }(\mathcal {D},\mathcal {R}) \models \mathcal {R}\), and \(\textbf{Ch}_{\infty }(\mathcal {D},\mathcal {R})\) homomorphically maps into every model of \(\mathcal {D}\) and \(\mathcal {R}\).

Rule dependence. Let \(\rho \) and \(\rho '\) be rules. We say that \(\rho '\) depends on \(\rho \) iff there exists an instance \(\mathcal {I}\) such that (i) \(\rho '\) is not triggered in \(\mathcal {I}\) via any homomorphism, (ii) \(\rho \) is triggered in \(\mathcal {I}\) via a homomorphism h, and (iii) \(\rho '\) is triggered in \(\textbf{Ch}(\mathcal {I},\rho ,h)\) via a homomorphism \(h'\). We define the graph of rule dependencies [1] of a set \(\mathcal {R}\) of rules to be \(G(\mathcal {R}) = (V,E)\) such that (i) \(V= \mathcal {R}\) and (ii) \((\rho ,\rho ') \in E\) iff \(\rho '\) depends on \(\rho \).

Treewidth. A tree decomposition of an instance \(\mathcal {I}\) is defined to be a tree \(T= (V,E)\) such that \(V \subseteq 2^{\textbf{Ter}(\mathcal {I})}\) (where each element of V is called a bag) and \(E \subseteq V \times V\), satisfying the following three conditions: (i) \(\bigcup _{X \in V} X = \textbf{Ter}(\mathcal {I})\), (ii) for each \(p(t_{1}, \ldots ,t_{n}) \in \mathcal {I}\), there is an \(X \in V\) such that \(\{ t_{1}, \ldots ,t_{n} \}\subseteq X\), and (iii) for each \(t \in \textbf{Ter}(\mathcal {I})\), the subgraph of \(T\) induced by the bags \(X \in V\) with \(t \in X\) is connected (this condition is referred to as the connectedness condition). We define the width of a tree decomposition \(T= (V,E)\) of an instance \(\mathcal {I}\) as follows:

$$ w(T) := \max \{|X| : X \in V\} - 1 $$

i.e. the width is equal to the cardinality of the largest node in \(T\) minus 1. We let \(w(T) = \infty \) iff for all \(n \in \mathbb {N}\), \(n \le \max \{|X| : X \in V\}\). We define the treewidth of an instance \(\mathcal {I}\), written \(tw(\mathcal {I})\), as follows:

$$ tw(\mathcal {I}) := \min \{w(T) : T\text { is a tree decomposition of }\mathcal {I}\} $$

i.e. the treewidth of an instance equals the minimal width among all its tree decompositions. If no tree decomposition of \(\mathcal {I}\) has finite width, we set \(tw(\mathcal {I})=\infty \).

3 Greediness

We now discuss a property of derivations referred to as greediness. In essence, a derivation is greedy when the image of the frontier of any applied rule consists solely of constants from a given KB and/or nulls introduced by a single previous rule application. Such derivations were defined by Thomazo et al. [14] and were used to identify the (query-decidable) class of greedy bounded-treewidth sets (\(\textbf{gbts}\)), that is, the class of rule sets that produce only greedy derivations (defined below) when applied to a database.

In this section, we also identify a new query-decidable class of rule sets, referred to as weakly greedy bounded-treewidth sets (\(\textbf{wgbts}\)). The \(\textbf{wgbts}\) class serves as a more liberal version of \(\textbf{gbts}\), and contains rule sets that admit at least one greedy derivation of any derivable instance. It is straightforward to confirm that \(\textbf{wgbts}\) generalizes \(\textbf{gbts}\) since if a rule set is \(\textbf{gbts}\), then every derivation of a derivable instance is greedy, implying that every derivable instance has some greedy derivation. Yet, what is non-trivial to show is that \(\textbf{wgbts}\) properly subsumes \(\textbf{gbts}\). We are going to prove this fact by means of a proof-theoretic argument and counter-example along the following lines: first, we show under what conditions we can permute rule applications in a given derivation (see Lemma 1 below), and second, we provide a rule set which exhibits non-greedy derivations (witnessing that the rule set is not \(\textbf{gbts}\)), but for which every derivation can be transformed into a greedy derivation by means of rule permutations and replacements (witnessing \(\textbf{wgbts}\) membership).

Let us now formally define greedy derivations and provide examples to demonstrate the concept of (non-)greediness. Based on this, we then proceed to define the \(\textbf{gbts}\) and \(\textbf{wgbts}\) classes.

Definition 1

(Greedy Derivation [14]). We define an \(\mathcal {R}\)-derivation

$$ \delta = \mathcal {I}_{0},(\rho _{1},h_{1},\mathcal {I}_{1}), \ldots , (\rho _{n},h_{n},\mathcal {I}_{n}) $$

to be greedy iff for each i such that \(0 < i \le n\), there exists a \(j < i\) such that \(h_{i}( fr (\rho _{i})) \subseteq \textbf{Nul}(\overline{h}_{j}( head (\rho _{j}))) \cup \textbf{Con}(\mathcal {I}_{0},\mathcal {R}) \cup \textbf{Nul}(\mathcal {I}_{0})\).

To give examples of non-greedy and greedy derivations, let us define the database \(\mathcal {D}_{\dag }:= \{p(a),r(b)\}\) and the rule set \(\mathcal {R}_{2}:= \{\rho _{1}, \rho _{2}, \rho _{3}, \rho _{4}\}\), with

$$ \begin{array}{rlrl} \rho _{1} = &{} \, p(x) \rightarrow \exists yz. q(x,y,z) &{} \rho _{3} = &{} \, p(x) \wedge r(y) \rightarrow \exists zwuv. q(x,z,w) \wedge s(y,u,v) \\ \rho _{2} = &{} \, r(x) \rightarrow \exists yz. s(x,y,z) &{} \rho _{4} = &{} \, q(x,y,z) \wedge s(w,u,v) \rightarrow \exists o. t(x,y,w,u,o) \\ \end{array} $$

An example of a non-greedy derivation is the following:

$$ \delta _{1} = \mathcal {D}_{\dag }, (\rho _{1},h_{1},\mathcal {I}_{1}), (\rho _{1},h_{2},\mathcal {I}_{2}), (\rho _{2},h_{3},\mathcal {I}_{3}), (\rho _{4},h_{4},\mathcal {I}_{4}),\ \ \text { with } $$
$$ \mathcal {I}_4 = \{ \underbrace{p(a),r(b)}_{\smash {\mathcal {D}_{\dag }}} , \underbrace{q(a,y_{0},z_{0}) }_{\mathcal {I}_{1}\setminus \mathcal {D}_{\dag }} , \underbrace{q(a,y_{1},z_{1})}_{\mathcal {I}_{2}\setminus \mathcal {I}_{1}} , \underbrace{s(b,y_{2},z_{2})}_{\mathcal {I}_{3}\setminus \mathcal {I}_{2}} , \underbrace{t(a,y_{0},b,y_{2},o)}_{\mathcal {I}_{4}\setminus \mathcal {I}_{3}} \} \ \ \ \text { and }$$

\(h_1= h_2= \{x {\mapsto } a\}\), \(h_3= \{x {\mapsto } b\}\), \(h_4= \{x {\mapsto } a, y {\mapsto } y_0, z{\mapsto } z_0, w{\mapsto } b, u{\mapsto } y_2, v {\mapsto } z_2\}\).

Note that this derivation is not greedy because

figure e

That is to say, the image of the frontier from the last rule application (i.e. the application of \(\rho _{4}\)) contains nulls introduced by two previous rule applications (as opposed to containing nulls from just a single previous rule application), namely, the first application of \(\rho _{1}\) and the application of \(\rho _{2}\). In contrast, the following is an example of a greedy derivation

$$ \delta _{2} = \mathcal {D}_{\dag }, (\rho _{3},h_{1}',\mathcal {I}_{1}'), (\rho _{1},h_{2}',\mathcal {I}_{2}'), (\rho _{4},h_{3}',\mathcal {I}_{3}'),\text { with } $$
$$ \mathcal {I}_3' = \{ \underbrace{p(a),r(b)}_{\smash {\mathcal {D}_{\dag }}} , \underbrace{q(a,y_{0},z_{0}),s(b,y_{2},z_{2})}_{\mathcal {I}_{1}'\setminus \mathcal {D}_{\dag }} , \underbrace{q(a,y_{1},z_{1})}_{\mathcal {I}_{2}'\setminus \mathcal {I}_{1}'} , \underbrace{t(a,y_{0},b,y_{2},o)}_{\mathcal {I}_{3}'\setminus \mathcal {I}_{2}'}\} \ \ \ \text { and }$$

\(h'_1= \{x {\mapsto } a,y {\mapsto } b\}\), \(h'_2= \{x {\mapsto } a\}\), \(h'_3= \{x {\mapsto } a, y {\mapsto } y_0, z{\mapsto } z_0, w{\mapsto } b, u{\mapsto } y_2, v {\mapsto } z_2\}\).

Greediness of \(\delta _{2}\) follows from the frontier of any applied rule being mapped to nothing but constants and/or nulls introduced by a sole previous rule application.

Definition 2

((Weakly) Greedy Bounded-Treewidth Set). A rule set \(\mathcal {R}\) is a greedy bounded-treewidth set (\(\textbf{gbts}\)) iff if , then \(\delta \) is greedy. \(\mathcal {R}\) is a weakly greedy bounded-treewidth set (\(\textbf{wgbts}\)) iff if , then there exists some greedy \(\mathcal {R}\)-derivation \(\delta '\) such that .

Remark 1

Observe that \(\textbf{gbts}\) and \(\textbf{wgbts}\) are characterized on the basis of derivations starting from given databases only, that is, derivations of the form \(\mathcal {I}_{0},(\rho _{1},h_{1},\mathcal {I}_{1}), \ldots , (\rho _{n},h_{n},\mathcal {I}_{n})\) where \(\mathcal {I}_{0} = \mathcal {D}\) is a database. In such a case, a derivation of the above form is greedy iff for each i with \(0 < i \le n\), there exists a \(j < i\) such that \(h_{i}( fr (\rho _{i})) \subseteq \textbf{Nul}(\overline{h}_{j}( head (\rho _{j}))) \cup \textbf{Con}(\mathcal {D},\mathcal {R})\) as databases only contain constants (and not nulls) by definition.

As noted above, it is straightforward to show that \(\textbf{wgbts}\) subsumes \(\textbf{gbts}\). Still, establishing that \(\textbf{wgbts}\) strictly subsumes \(\textbf{gbts}\), i.e. there are rule sets within \(\textbf{wgbts}\) that are outside \(\textbf{gbts}\), requires more effort. As it so happens, the rule set \(\mathcal {R}_{2}\) (defined above) serves as such a rule set, admitting non-greedy \(\mathcal {R}_{2}\)-derivations, but where it can be shown that every instance derivable using the rule set admits a greedy \(\mathcal {R}_{2}\)-derivation. As a case in point, observe that the \(\mathcal {R}_{2}\)-derivations \(\delta _{1}\) and \(\delta _{2}\) both derive the same instance \(\mathcal {I}_{4} = \mathcal {I}_{3}'\), however, \(\delta _{1}\) is a non-greedy \(\mathcal {R}_{2}\)-derivation of the instance and \(\delta _{2}\) is a greedy \(\mathcal {R}_{2}\)-derivation of the instance. Clearly, the existence of the non-greedy \(\mathcal {R}_{2}\)-derivation \(\delta _{1}\) witnesses that \(\mathcal {R}_{2}\) is not \(\textbf{gbts}\). To establish that \(\mathcal {R}_{2}\) still falls within the \(\textbf{wgbts}\) class, we show that every non-greedy \(\mathcal {R}_{2}\)-derivation can be transformed into a greedy \(\mathcal {R}_{2}\)-derivation using two operations: (i) rule permutations and (ii) rule replacements.

Regarding rule permutations, we consider under what conditions we may swap consecutive applications of rules in a derivation to yield a new derivation of the same instance. For example, in the \(\mathcal {R}_{2}\)-derivation \(\delta _{1}\) above, we may swap the consecutive applications of \(\rho _{1}\) and \(\rho _{2}\) to obtain the following derivation:

$$\delta _{1}' = \mathcal {D}_{\dag }, (\rho _{1},h_{1},\mathcal {I}_{1}), (\rho _{2},h_{3},\mathcal {I}_{1} \cup (\mathcal {I}_{3} \setminus \mathcal {I}_{2})),(\rho _{1},h_{2},\mathcal {I}_{3}), (\rho _{4},h_{4},\mathcal {I}_{4}).$$

\(\mathcal {I}_{1} \cup (\mathcal {I}_{3} \setminus \mathcal {I}_{2}) = \{p(a),r(b),q(a,y_{0},z_{0}),s(b,y_{2},z_{2})\}\) is derived by applying \(\rho _{2}\) and the subsequent application of \(\rho _{1}\) reclaims the instance \(\mathcal {I}_{3}\). Therefore, the same instance \(\mathcal {I}_{4}\) remains the conclusion. Although one can confirm that \(\delta _{1}'\) is indeed an \(\mathcal {R}_{2}\)-derivation, thus serving as a successful example of a rule permutation (meaning, the rule permutation yields another \(\mathcal {R}_{2}\)-derivation), the following question still remains: for a rule set \(\mathcal {R}\), under what conditions will permuting rules within a given \(\mathcal {R}\)-derivation always yield another \(\mathcal {R}\)-derivation?

We pose an answer to this question, formulated as the permutation lemma below, which states that an application of a rule \(\rho \) may be permuted before an application of a rule \(\rho '\) so long as the former rule does not depend on the latter (in the sense formally defined in Section 2 based on the work of Baget [1]). Furthermore, it should be noted that such rule permutations preserve the greediness of derivations. In the context of the above example, \(\rho _{2}\) may be permuted before \(\rho _{1}\) in \(\delta _{1}\) because the former does not depend on the latter.

Lemma 1 (Permutation Lemma)

Let \(\mathcal {R}\) be a rule set with \(\mathcal {I}_{0}\) an instance. Suppose we have a (greedy) \(\mathcal {R}\)-derivation of the following form:

$$ \mathcal {I}_{0}, \ldots , (\rho _{i},h_{i},\mathcal {I}_{i}), (\rho _{i+1},h_{i+1},\mathcal {I}_{i+1}), \ldots , (\rho _{n},h_{n},\mathcal {I}_{n}) $$

If \(\rho _{i+1}\) does not depend on \(\rho _{i}\), then the following is a (greedy) \(\mathcal {R}\)-derivation too:

$$ \mathcal {I}_{0}, \ldots , (\rho _{i+1},h_{i+1},\mathcal {I}_{i-1} \cup (\mathcal {I}_{i+1} \setminus \mathcal {I}_{i})), (\rho _{i},h_{i},\mathcal {I}_{i+1}), \ldots , (\rho _{n},h_{n},\mathcal {I}_{n}). $$

As a consequence of the above lemma, rules may always be permuted in a given \(\mathcal {R}\)-derivation so that its structure mirrors the graph of rule dependencies \(G(\mathcal {R})\) (defined in Section 2). That is, given a rule set \(\mathcal {R}\) and an \(\mathcal {R}\)-derivation \(\delta \), we may permute all applications of rules serving as sources in \(G(\mathcal {R})\) (which do not depend on any rules in \(\mathcal {R}\)) to the beginning of \(\delta \), followed by all rule applications that depend only on sources, and so forth, with any applications of rules serving as sinks in \(G(\mathcal {R})\) concluding the derivation. For example, in the graph of rule dependencies of \(\mathcal {R}_{2}\), the rules \(\rho _{1}\), \(\rho _{2}\), and \(\rho _{3}\) serve as source nodes (they do not depend on any rules in \(\mathcal {R}_{2}\)) and the rule \(\rho _{4}\) is a sink node depending on each of the aforementioned three rules, i.e. \(G(\mathcal {R}_{2}) = (V,E)\) with \(V= \{\rho _{1}, \rho _{2}, \rho _{3}, \rho _{4}\}\) and \(E= \{(\rho _{i},\rho _{4}) \ | \ 1 \le i \le 3\}\). Hence, in any given \(\mathcal {R}_{2}\)-derivation \(\delta \), any application of \(\rho _{1}\), \(\rho _{2}\), or \(\rho _{3}\) can be permuted backward (toward the beginning of \(\delta \)) and any application of \(\rho _{4}\) can be permuted forward (toward the end of \(\delta \)).

Beyond the use of rule permutations, we also transform \(\mathcal {R}_{2}\)-derivations by making use of rule replacements. In particular, observe that \( head (\rho _{3})\) and \( body (\rho _{3})\) correspond to conjunctions of \( head (\rho _{1})\) and \( head (\rho _{2})\), and \( body (\rho _{1})\) and \( body (\rho _{2})\), respectively. Thus, we can replace the first application of \(\rho _{1}\) and the succeeding application of \(\rho _{2}\) in \(\delta _{1}'\) above by a single application of \(\rho _{3}\), thus yielding the \(\mathcal {R}_{2}\)-derivation \( \delta _{1}'' = \mathcal {D}_{\dag }, (\rho _{3},h,\mathcal {I}_{1} \cup (\mathcal {I}_{3} \setminus \mathcal {I}_{2})), (\rho _{1},h_{2},\mathcal {I}_{3}), (\rho _{4},h_{4},\mathcal {I}_{4}), \) where \(h(x) = a\) and \(h(y) = b\). Interestingly, inspecting the above \(\mathcal {R}_{2}\)-derivation, one will find that it is identical to the greedy \(\mathcal {R}_{2}\)-derivation \(\delta _{2}\) defined earlier in the section, and so, we have shown how to take a non-greedy \(\mathcal {R}_{2}\)-derivation (viz. \(\delta _{1}\)) and transform it into a greedy \(\mathcal {R}_{2}\)-derivation (viz. \(\delta _{2}\)) by means of rule permutations and replacements. In the same way, one can prove in general that any non-greedy \(\mathcal {R}_{2}\)-derivation can be transformed into a greedy \(\mathcal {R}_{2}\)-derivation, thus giving rise to the following theorem, and demonstrating that \(\mathcal {R}_{2}\) is indeed \(\textbf{wgbts}\).

Theorem 1

\(\mathcal {R}_{2}\) is \(\textbf{wgbts}\), but not \(\textbf{gbts}\). Thus, \(\textbf{wgbts}\) properly subsumes \(\textbf{gbts}\).

4 Derivation Graphs

We now discuss derivation graphs – a concept introduced by Baget et al. [5] and used to establish that certain classes of rule sets (e.g. weakly frontier guarded rule sets [6]) exhibit universal models of bounded treewidth. A derivation graph has the structure of a directed acyclic graph and encodes how atoms are derived throughout the course of an \(\mathcal {R}\)-derivation. By applying so-called reduction operations, a derivation graph may (under certain conditions) be transformed into a treelike graph that serves as a tree decomposition of an \(\mathcal {R}\)-derivable instance.

Below, we define derivation graphs and discuss how such graphs are transformed into tree decompositions by means of reduction operations. To increase comprehensibility, we provide an example of a derivation graph (shown in Figure 2) and give an example of applying each reduction operation (shown in Figure 3). After, we identify two (query-decidable) classes of rule sets on the basis of derivation graphs, namely, cycle-free derivation graph sets (\(\textbf{cdgs}\)) and weakly cycle-free derivation graph sets (\(\textbf{wcdgs}\)). Despite their prima facie distinctness, the \(\textbf{cdgs}\) and \(\textbf{wcdgs}\) classes coincide with \(\textbf{gbts}\) and \(\textbf{wgbts}\) classes, respectively, thus showing how the latter classes can be characterized in terms of derivation graphs. Let us now formally define derivation graphs, and after, we will demonstrate the concept by means of an example.

Definition 3

(Derivation Graph). Let \(\mathcal {D}\) be a database, \(\mathcal {R}\) be a rule set, \(C= \textbf{Con}(\mathcal {D},\mathcal {R})\), and \(\delta \) be some \(\mathcal {R}\)-derivation \(\mathcal {D}, (\rho _{1}, h_{1}, \mathcal {I}_{1}), \ldots , (\rho _{n}, h_{n}, \mathcal {I}_{n})\). The derivation graph of \(\delta \) is the tuple \(G_{\delta }:= (\textrm{V},\textrm{E},\textrm{At},\textrm{L})\), where \(\textrm{V}:= \{X_{0}, \ldots ,X_{n}\}\) is a finite set of nodes, \(\textrm{E}\subseteq \textrm{V}\times \textrm{V}\) is a set of arcs, and the functions \(\textrm{At}: \textrm{V}\rightarrow 2^{\mathcal {I}_{n}}\) and \(\textrm{L}: \textrm{E}\rightarrow 2^{\textbf{Ter}(\mathcal {I}_{n})}\) decorate nodes and arcs, respectively, such that:

  1. 1.

    \(\textrm{At}(X_{0}) := \mathcal {D}\) and \(\textrm{At}(X_{i}) = \mathcal {I}_{i} \setminus \mathcal {I}_{i-1}\);

  2. 2.

    \((X_{i},X_{j}) \in \textrm{E}\) iff there is a \(p(\textbf{t}) \in \textrm{At}(X_{i})\) and a frontier atom \(p(\textbf{t}')\) in \(\rho _{j}\) such that \(h_{j}(p(\textbf{t}')) = p(\textbf{t})\). We then set \(\textrm{L}(X_{i},X_{j})= \Big (h_{j}\big (\textbf{Var}(p(\textbf{t}')) \cap fr (\rho _{j})\big )\Big ) \setminus C\).

We refer to \(X_{0}\) as the initial node and define the set of non-constant terms associated with a node to be \(\overline{C}(X) = \textbf{Ter}(X) \setminus C\) where \(\textbf{Ter}(X_{i}) := \textbf{Ter}(\textrm{At}(X_{i})) \cup C\).

Toward an example, assume \(\mathcal {D}_{\ddag }= \{p(a,b)\}\) and \(\mathcal {R}_{3}= \{\rho _{1},\rho _{2},\rho _{3},\rho _{4}\}\) where

$$ \begin{array}{rlrl} \rho _{1} = &{} \, p(x,y) \rightarrow \exists z . q(y,z) &{} \rho _{3} = &{} \, r(x,y) \wedge q(z,x) \rightarrow s(x,y) \\ \rho _{2} = &{} \, q(x,y) \rightarrow \exists z . (r(x,y) \wedge r(y,z)) &{} \rho _{4} = &{} \, r(x,y) \wedge s(z,w) \rightarrow t(y,w) \\ \end{array} $$

Let us consider the following derivation:

$$ \delta = \mathcal {D}_{\ddag }, (\rho _{1},h_{1},\mathcal {I}_{1}), (\rho _{2},h_{2},\mathcal {I}_{2}), (\rho _{3},h_{3},\mathcal {I}_{3}), (\rho _{4},h_{4},\mathcal {I}_{4}) \ \ \text { with} $$
$$ \mathcal {I}_4 = \{ \underbrace{p(a,b)}_{\smash {\mathcal {D}_{\ddag }}} , \underbrace{q(b,z_{0}) }_{\mathcal {I}_{1}\setminus \mathcal {D}_{\ddag }} , \underbrace{r(b,z_{0}),r(z_0,z_{1})}_{\mathcal {I}_{2}\setminus \mathcal {I}_{1}} , \underbrace{s(z_{0},z_{1})}_{\mathcal {I}_{3}\setminus \mathcal {I}_{2}} , \underbrace{t(z_{0},z_{1})}_{\mathcal {I}_{4}\setminus \mathcal {I}_{3}} \} \ \ \ \text { and }$$
Fig. 2.
figure 2

The derivation graph \(G_{\delta }\).

\(h_1= \{x {\mapsto } a,y {\mapsto } b\}\), \(h_2= \{x {\mapsto } b,y {\mapsto } z_0\}\), \(h_3= \{x {\mapsto } z_0, y {\mapsto } z_1, z{\mapsto } b\}\), as well as \(h_4= \{x {\mapsto } b, y {\mapsto } z_0, z{\mapsto } z_0, w{\mapsto } z_1\}\). The derivation graph \(G_{\delta }= (\textrm{V},\textrm{E},\textrm{At},\textrm{L})\) corresponding to \(\delta \) is shown in Figure 2 and has fives nodes, \(\textrm{V}= \{X_{0},X_{1},X_{2},X_{3},X_{4}\}\). Each node \(X_{i} \in \textrm{V}\) is associated with a set \(\textrm{At}(X_{i})\) of atoms depicted in the associated circle (e.g. \(\textrm{At}(X_{2}) = \{r(b,z_{0}),r(z_{0},z_{1})\}\)), and each arc \((X_{i},X_{j}) \in \textrm{E}\) is represented as a directed arrow with \(\textrm{L}(X_{i},X_{j})\) shown as the associated set of terms (e.g. \(\textrm{L}(X_{3},X_{4}) = \{z_{1}\}\)). For each node \(X_{i} \in \textrm{V}\), the set \(\textbf{Ter}(X_{i})\) of terms associated with the node is equal to \(\textbf{Ter}(\textrm{At}(X_{i})) \cup \{a,b\}\) (e.g. \(\textbf{Ter}(X_{3}) = \{z_{0},z_{1},a,b\}\)) since \(C = \textbf{Con}(\mathcal {D}_{\ddag },\mathcal {R}_{3}) = \{a,b\}\).

As can be witnessed via the above example, derivation graphs satisfy a set of properties akin to those characterizing tree decompositions [5, Proposition 12].

Lemma 2 (Decomposition Properties)

Let \(\mathcal {D}\) be a database, \(\mathcal {R}\) be a rule set, and \(C = \textbf{Con}(\mathcal {D},\mathcal {R})\). If , then \(G_{\delta }\) satisfies the following properties:

  1. 1.

    \(\bigcup _{X_{n} \in \textrm{V}} \textbf{Ter}(X_{n}) = \textbf{Ter}(\mathcal {I})\);

  2. 2.

    For each \(p(\textbf{t}) \in \mathcal {I}\), there is an \(X_{n} \in \textrm{V}\) such that \(p(\textbf{t}) \in \textrm{At}(X_{n})\);

  3. 3.

    For each term \(x \in \overline{C}(\mathcal {I})\), the subgraph of \(G_{\delta }\) induced by the nodes \(X_{n}\) such that \(x \in \overline{C}(X_{n})\) is connected;

  4. 4.

    For each \(X_{n} \in \textrm{V}\) the size of \(\textbf{Ter}(X_{n})\) is bounded by an integer that only depends on the size of \((\mathcal {D},\mathcal {R})\), viz. \(\max \{|\textbf{Ter}(\mathcal {D})|,|\textbf{Ter}( head (\rho _{i}))|_{\rho _{i} \in \mathcal {R}}\} + |C|\).

Let us now introduce our set of reduction operations. As remarked above, in certain circumstances such operations can be used to transform derivation graphs into tree decompositions of an instance.

We make use of three reduction operations, namely, (i) arc removal, denoted \((\textsf{ar})^{[i,j]}\), (ii) term removal, denoted \((\textsf{tr})^{[i,j,k,t]}\), and (iii) cycle removal, denoted \((\textsf{cr})^{[i,j,k,\ell ]}\). The first two reduction operations were already proposed by Baget et al. [5] (they presented \((\textsf{tr})\) and \((\textsf{ar})\) as a single operation called redundant arc removal), whereas cycle removal is introduced by us as a new operation that will assist us in characterizing \(\textbf{gbts}\) and \(\textbf{wgbts}\) in terms of derivation graphs.Footnote 2

Definition 4

(Reduction Operations). Let \(\mathcal {D}\) be a database, \(\mathcal {R}\) be a rule set, , and \(G_{\delta }\) be the derivation graph of \(\delta \). We define the set \(\textsf{RO}\) of reduction operations as \( \{(\textsf{ar})^{[i,j]}\!, (\textsf{tr})^{[i,j,k,t]}\!, (\textsf{cr})^{[i,j,k,\ell ]} \,|\, i,j,k,\ell {\,\le \,} n,\, t{\,\in \,} \textbf{Ter}(\mathcal {I}_{n})\}, \) whose effect is further specified below. We let \((\textsf{r})\Upsigma (G_{\delta })\) denote the output of applying the operation \((\textsf{r})\) to the (potentially reduced) derivation graph \(\Upsigma (G_{\delta }) = (\textrm{V},\textrm{E},\textrm{At},\textrm{L})\), where \(\Upsigma \in \textsf{RO}^{*}\) is a reduction sequence, that is, \(\Upsigma \) is a (potentially empty) sequence of reduction operations.

  1. 1.

    Arc Removal \((\textsf{ar})^{[i,j]}\): Whenever \((X_{i},X_{j}) \in \textrm{E}\) and \(\textrm{L}(X_{i},X_{j}) = \emptyset \), then \((\textsf{ar})^{[i,j]}\Upsigma (G_{\delta }) := (\textrm{V},\textrm{E}',\textrm{At},\textrm{L}')\) where \(\textrm{E}' := \textrm{E}\setminus \{(X_{i},X_{j})\}\) and \(\textrm{L}' = \textrm{L}\restriction \textrm{E}'\).

  2. 2.

    Term Removal \((\textsf{tr})^{[i,j,k,t]}\): If \((X_{i},X_{k}),(X_{j},X_{k}) \in \textrm{E}\) with \(X_{i} \ne X_{j}\) and \(t \in \textrm{L}(X_{i},X_{k}) \cap \textrm{L}(X_{j},X_{k})\), then \((\textsf{tr})^{[i,j,k,t]}\Upsigma (G_{\delta }) := (\textrm{V},\textrm{E},\textrm{At},\textrm{L}')\) where \(\textrm{L}'\) is obtained from \(\textrm{L}\) by removing t from \(\textrm{L}(X_{j},X_{k})\).

  3. 3.

    Cycle Removal \((\textsf{cr})^{[i,j,k,\ell ]}\): If \((X_{i},X_{k}),(X_{j},X_{k}) \in \textrm{E}\) and there exists a node \(X_{\ell } \in \textrm{V}\) with \(\ell <k\) such that \( \textrm{L}(X_{i},X_{k}) \cup \textrm{L}(X_{j},X_{k}) \subseteq \textbf{Ter}(X_{\ell }) \) then, \((\textsf{cr})^{[i,j,k,\ell ]}\Upsigma (G_{\delta }) := (\textrm{V},\textrm{E}',\textrm{At},\textrm{L}')\) where

    $$ \textrm{E}' := \big (\textrm{E}\setminus \{(X_{i},X_{k}),(X_{j},X_{k}) \}\big ) \cup \{(X_{\ell },X_{k})\} $$

    and \(\textrm{L}'\) is obtained from \(\textrm{L}\restriction \textrm{E}'\) by setting \(L(X_{\ell },X_{k})\) to \(\textrm{L}(X_{i},X_{k}) \cup \textrm{L}(X_{j},X_{k})\).

Last, we say that a reduction sequence \(\Upsigma \in \textsf{RO}^{*}\) is a complete reduction sequence relative to a derivation graph \(G_{\delta }\) iff \(\Upsigma (G_{\delta })\) is cycle-free.

Remark 2

When there is no danger of confusion, we will take the liberty to write \((\textsf{tr})\), \((\textsf{ar})\), and \((\textsf{cr})\) without superscript parameters. That is, given a derivation graph \(G_{\delta }\), the (reduced) derivation graph \((\textsf{cr})(\textsf{tr})(G_{\delta })\) is obtained by applying an instance of \((\textsf{tr})\) followed by an instance of \((\textsf{cr})\) to \(G_{\delta }\). When applying a reduction operation we always explain how it is applied, so the exact operation is known.

Fig. 3.
figure 3

Left to right: reduced derivation graphs \((\textsf{tr})\!(G_{\delta }\!)\), \((\textsf{ar})\!(\textsf{tr})\!(G_{\delta }\!)\), and \((\textsf{cr})\!(\textsf{ar})\!(\textsf{tr})\!(G_{\delta }\!)\).

We now describe the functionality of each reduction operation and illustrate each by means of an example. We will apply each to transform the derivation graph \(G_{\delta }\) (shown in Figure 2) into a tree decomposition of \(\mathcal {I}_{4}\) (which was defined above). The \((\textsf{tr})\) operation deletes a term t within the intersection of the sets labeling two converging arcs. For example, we may apply \((\textsf{tr})\) to the derivation graph \(G_{\delta }\) from Figure 2, deleting the term \(z_{0}\) from the label of the arc \((X_{1},X_{3})\), and yielding the reduced derivation graph \((\textsf{tr})(G_{\delta })\), which is shown first in Figure 3. We may then apply \((\textsf{ar})\) to \((\textsf{tr})(G_{\delta })\), deleting the arc \((X_{1},X_{3})\), which is labeled with the empty set, to obtain the reduced derivation graph \((\textsf{ar})(\textsf{tr})(G_{\delta })\) shown middle in Figure 3.

The \((\textsf{cr})\) operation is more complex and works by considering two converging arcs \((X_{i},X_{k})\) and \((X_{j},X_{k})\) in a (reduced) derivation graph. If there exists a node \(X_{\ell }\) whose index \(\ell \) is less than the index k of the child node \(X_{k}\) and \(\textrm{L}(X_{i},X_{k}) \cup \textrm{L}(X_{j},X_{k}) \subseteq \textbf{Ter}(X_{\ell })\), then the converging arcs \((X_{i},X_{k})\) and \((X_{j},X_{k})\) may be deleted and the arc \((X_{\ell },X_{k})\) introduced and labeled with \(\textrm{L}(X_{i},X_{k}) \cup \textrm{L}(X_{j},X_{k})\). As an example, the reduced derivation graph \((\textsf{cr})(\textsf{ar})(\textsf{tr})(G_{\delta })\) (shown third in Figure 3) is obtained from \((\textsf{ar})(\textsf{tr})(G_{\delta })\) (shown middle in Figure 3) by applying \((\textsf{cr})\) in the following manner to the convergent arcs \((X_{2},X_{4})\) and \((X_{3},X_{4})\): since for \(X_{2}\) (whose index 2 is less than the index 4 of \(X_{4}\)) \(\textrm{L}(X_{2},X_{4}) \cup \textrm{L}(X_{3},X_{4}) \subseteq \textbf{Ter}(X_{2})\), we may delete the arcs \((X_{2},X_{4})\) and \((X_{3},X_{4})\) and introduce the arc \((X_{2},X_{4})\) labeled with \(\textrm{L}(X_{2},X_{4}) \cup \textrm{L}(X_{3},X_{4}) = \{z_{0}\} \cup \{z_{1}\} = \{z_{0},z_{1}\}\). Observe that the reduced derivation graph \((\textsf{cr})(\textsf{ar})(\textsf{tr})(G_{\delta })\) is free of cycles, witnessing that \(\Upsigma = (\textsf{cr})(\textsf{ar})(\textsf{tr})\) is a complete reduction sequence relative to \(G_{\delta }\). Moreover, if we replace each node by the set of its terms and disregard the labels on arcs, then \(\Upsigma (G_{\delta })\) can be read as a tree decomposition of \(\mathcal {I}_{4}\). In fact, one can show that every reduced derivation graph satisfies the decomposition properties mentioned in Lemma 2 above.

Lemma 3

Let \(\mathcal {D}\) be a database and \(\mathcal {R}\) be a rule set. If , then for any reduction sequence \(\Upsigma \), \(\Upsigma (G_{\delta }) = (\textrm{V},\textrm{E},\textrm{At},\textrm{L})\) satisfies the decomposition properties 1-4 in Lemma 2.

As illustrated above, derivation graphs can be used to derive tree decompositions of \(\mathcal {R}\)-derivable instances. By the fourth decomposition property (see Lemma 2 above), the width of such a tree decomposition is bounded by a constant that depends only on the given knowledge base. Thus, if a rule set \(\mathcal {R}\) always yields derivation graphs that are reducible to cycle-free graphs – meaning that (un)directed cycles do not occur within the graph – then all \(\mathcal {R}\)-derivable instances have tree decompositions that are uniformly bounded by a constant. This establishes that the rule set \(\mathcal {R}\) falls within the \(\textbf{bts}\) class, confirming that query entailment is decidable with \(\mathcal {R}\). We define two classes of rule sets by means of reducible derivation graphs:

Definition 5

((Weakly) Cycle-free Derivation Graph Set). A rule set \(\mathcal {R}\) is a cycle-free derivation graph set (\(\textbf{cdgs}\)) iff if , then \(G_{\delta }\) can be reduced to a cycle-free graph by the reduction operations. \(\mathcal {R}\) is a weakly cycle-free derivation graph set (\(\textbf{wcdgs}\)) iff if , then there is a derivation \(\delta '\!\) where and \(G_{\!\delta '}\!\) can be reduced to a cycle-free graph by the reduction operations.

It is straightforward to confirm that \(\textbf{wcdgs}\) subsumes \(\textbf{cdgs}\), and that both classes are subsumed by \(\textbf{bts}\).

Proposition 1

Every \(\textbf{cdgs}\) rule set is \(\textbf{wcdgs}\) and every \(\textbf{wcdgs}\) rule set is \(\textbf{bts}\).

Furthermore, as mentioned above, \(\textbf{gbts}\) and \(\textbf{wgbts}\) coincide with \(\textbf{cdgs}\) and \(\textbf{wcdgs}\), respectively. By making use of the \((\textsf{cr})\) operation, one can show that the derivation graph of any greedy derivation is reducible to a cycle-free graph, thus establishing that \(\textbf{gbts}\subseteq \textbf{cdgs}\) and \(\textbf{wgbts}\subseteq \textbf{wcdgs}\). To show the converse (i.e. that \(\textbf{cdgs}\subseteq \textbf{gbts}\) and \(\textbf{wcdgs}\subseteq \textbf{wgbts}\)) however, requires more work. In essence, one shows that for every (non-source) node \(X_{i}\) in a cycle-free (reduced) derivation graph there exists another node \(X_{j}\) such that \(j < i\) and the frontier of the atoms in \(\textrm{At}(X_{i})\) only consist of constants and/or nulls introduced by the atoms in \(\textrm{At}(X_{j})\). This property is preserved under reverse applications of the reduction operations, and thus, one can show that if a derivation graph is reducible to a cycle-free graph, then the above property holds for the original derivation graph, implying that the derivation graph encodes a greedy derivation. Based on such arguments, one can prove the following:

Theorem 2

\(\textbf{gbts}\) coincides with \(\textbf{cdgs}\) and \(\textbf{wgbts}\) coincides with \(\textbf{wcdgs}\). Membership in \(\textbf{cdgs}\), \(\textbf{gbts}\), \(\textbf{wcdgs}\), or \(\textbf{wgbts}\) warrants decidable BCQ entailment.

Note that by Theorem 1, this also implies that \(\textbf{wcdgs}\) properly contains \(\textbf{cdgs}\).

An interesting consequence of the above theorem concerns the redundancy of \((\textsf{ar})\) and \((\textsf{tr})\) in the presence of \((\textsf{cr})\). In particular, since we know that (i) if a derivation graph can be reduced to a cycle-free graph, then the derivation graph encodes a greedy derivation, and (ii) the derivation graph of any greedy derivation can be reduced to an cycle-free graph by means of applying the \((\textsf{cr})\) operation only, it follows that if a derivation graph can be reduced to a cycle-free graph, then it can be reduced by only applying the \((\textsf{cr})\) operation. We refer to this phenomenon as reduction-admissibility, which is defined below.

Definition 6

(Reduction-admissible). Suppose \(S_{1} = \{(\textsf{r}_{i}) \ | \ 1 \le i \le n\}\) and \(S_{2} = \{(\textsf{r}_{j}) \ | \ n+1 \le j \le k\}\) are two sets of reduction operations. We say that \(S_{1}\) is reduction-admissible relative to \(S_{2}\) iff for any rule set \(\mathcal {R}\) and \(\mathcal {R}\)-derivation \(\delta \), if \(G_{\delta }\) is reducible to a cycle-free graph with \(S_{1} \cup S_{2}\), then \(G_{\delta }\) is reducible to a cycle-free graph with just \(S_{2}\).

Corollary 1

\(\{(\textsf{tr}),(\textsf{ar})\}\) is reduction-admissible relative to \((\textsf{cr})\).

5 Conclusion

In this paper, we revisited the concept of a greedy derivation, which immediately gives rise to a bounded-width tree decomposition of the constructed instance. This well-established notion allows us to categorize rule sets as being (weakly) greedy bounded treewidth sets (\(\mathbf {(w)gbts}\)), if all (some) derivations of a derivable instance are guaranteed to be greedy, irrespective of the underlying database. By virtue of being subsumed by \(\textbf{bts}\), these classes warrant decidability of BCQ entailment, while at the same time subsuming various popular rule languages, in particular from the guarded family.

By means of an example together with a proof-theoretic argument, we exposed that \(\textbf{wgbts}\) strictly generalizes \(\textbf{gbts}\). In pursuit of a better understanding and more workable methods to detect and analyze \(\mathbf {(w)gbts}\) rule sets, we resorted to the previously proposed notion of derivation graphs. Through a refinement of the set of reduction methods for derivation graphs, we were able to make more advanced use of this tool, leading to the definition of (weakly) cycle-free derivation graph sets (\(\mathbf {(w)cdgs}\)) of rules, of which we were then able to show the respective coincidences with \(\mathbf {(w)gbts}\). This way, we were able to establish alternative characterizations of \(\textbf{gbts}\) and \(\textbf{wgbts}\) by means of derivation graphs. En passant, we found that the newly introduced cycle removal reduction operation over derivation graphs is sufficient by itself and makes the other operations redundant.

For future work, we plan to put our newly found characterizations to use. In particular, we aim to investigate if a rule set’s membership in \(\textbf{gbts}\) or \(\textbf{wgbts}\) is decidable. For \(\textbf{gbts}\), this has been widely conjectured, but never formally established. In the positive case, derivation graphs might also be leveraged to pinpoint the precise complexity of the membership problem. We are also confident that the tools and insights in this paper – partially revived, partially upgraded, partially newly developed – will prove useful in the greater area of static analysis of existential rule sets. On a general note, we feel that the field of proof theory has a lot to offer for knowledge representation, whereas the cross-fertilization between these disciplines still appears to be underdeveloped.